Real Time Recognition Of Speakers From Internet Audio Stream

Radoslaw Weychan; Tomasz Marciniak; Agnieszka Stankiewicz; Adam Dabrowski

doi:10.1515/fcds-2015-0014

Foundations of Computing and Decision Sciences

Volume 40 (2015): Issue 3 (September 2015)

Open Access

DOI: https://doi.org/10.1515/fcds-2015-0014 | Journal eISSN: 2300-3405 | Journal ISSN: 0867-6356

Language: English

Page range: 223 - 233

Published on: Sep 30, 2015

Published by: Poznan University of Technology

In partnership with: Paradigm Publishing Services

Keywords: GMM

© 2015 Radoslaw Weychan, Tomasz Marciniak, Agnieszka Stankiewicz, Adam Dabrowski, published by Poznan University of Technology
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.