Acoustic analysis assessment in speech pathology detection

Daria Panek; Andrzej Skalski; Janusz Gajda; Ryszard Tadeusiewicz

doi:10.1515/amcs-2015-0046

International Journal of Applied Mathematics and Computer Science

Volume 25 (2015): Issue 3 (September 2015)

Open Access

Sep 2015


DOI: https://doi.org/10.1515/amcs-2015-0046 | Journal eISSN: 2083-8492 | Journal ISSN: 1641-876X

Language: English

Page range: 631 - 643

Submitted on: Jun 13, 2014

Published on: Sep 30, 2015

Published by: University of Zielona Góra

In partnership with: Paradigm Publishing Services

Publication frequency: 4 issues per year

Keywords: linear PCA, non-linear PCA, auto-associative neural network, validation, voice pathology detection

Related subjects: Mathematics, Applied mathematics

© 2015 Daria Panek, Andrzej Skalski, Janusz Gajda, Ryszard Tadeusiewicz, published by University of Zielona Góra
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
