References
- Dhillon, V. K. (2022). Vocal Cord Disorders. https://www.hopkinsmedicine.org/health/conditions-and-diseases/vocal-cord-disorders (Accessed September 2025).
- Verdolini, K., Ramig, L. O. (2001). Review: Occupational risks for voice problems. Logopedics, Phoniatrics, Vocology, 26 (1), 37–46.
- Parsa, V., Jamieson, D. G. (2000). Identification of pathological voices using glottal noise measures. Journal of Speech, Language, and Hearing Research, 43 (2), 469–485. https://doi.org/10.1044/jslhr.4302.469
- Wang, J., Xu, H., Peng, X., Liu, J., He, C. (2023). Pathological voice detection based on multi-domain features and deep hierarchical extreme learning machine. The Journal of the Acoustical Society of America, 153 (1), 423–435. https://doi.org/10.1121/10.0016869
- AL-Dhief, F. T., Latiff, N. M. A. A., Malik, N. N. N. A., Sabri, N., Albadr, M. A. A., Abbas, A. F., Hussein, Y. M., Mohammed, M. A. (2020). Voice pathology detection using machine learning technique. In 2020 IEEE 5th International Symposium on Telecommunication Technologies (ISTT). IEEE, 99–104. https://doi.org/10.1109/ISTT50966.2020.9279346
- Al-Nasheri, A., Muhammad, G., Alsulaiman, M., Ali, Z., Mesallam, T. A., Farahat, M., Malki, K. H., Bencherif, M. A. (2017a). An investigation of multidimensional voice program parameters in three different databases for voice pathology detection and classification. Journal of Voice, 31 (1), 113.e9–113.e18. https://doi.org/10.1016/j.jvoice.2016.03.019
- Al-Nasheri, A., Muhammad, G., Alsulaiman, M., Ali, Z. (2017b). Investigation of voice pathology detection and classification on different frequency regions using correlation functions. Journal of Voice, 31 (1), 3–15. https://doi.org/10.1016/j.jvoice.2016.01.014
- Godino-Llorente, J. I., Gomez-Vilda, P. (2004). Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors. IEEE Transactions on Biomedical Engineering, 51 (2), 380–384. https://doi.org/10.1109/TBME.2003.820386
- Mittal, V., Sharma, R. K. (2021). Deep learning approach for voice pathology detection and classification. International Journal of Healthcare Information Systems and Informatics, 16 (4), 1–30. https://doi.org/10.4018/IJHISI.20211001.oa28
- Roohum, J., Jayagowri, R. (2020). Voice disorder detection and classification - a review. In Proceedings of the 2nd International Conference on IoT, Social, Mobile, Analytics and Cloud in Computational Vision and Bio-Engineering (ISMAC-CVB 2020). https://doi.org/10.2139/ssrn.3734762
- Altayeb, M., Al-Ghraibah, A. (2022). Classification of three pathological voices based on specific features groups using support vector machine. International Journal of Electrical and Computer Engineering (IJECE), 12 (1), 946–956. https://doi.org/10.11591/ijece.v12i1.pp946-956
- Hammami, I. (2019). Classification of psychogenic and laryngeal voice diseases based on wavelet transform analysis and Teager energy operator. International Journal of Applied Mathematics, Electronics and Computers, 7 (3), 49–55. https://doi.org/10.18100/ijamec.458230
- Wu, H., Soraghan, J., Lowit, A., Di Caterina, G. (2018). Convolutional neural networks for pathological voice detection. In 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, 1–4. https://doi.org/10.1109/EMBC.2018.8513222
- Dibazar, A. A., Narayanan, S., Berger, T. W. (2002). Feature analysis for automatic detection of pathological speech. In Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, Vol. 1, 182–183. https://doi.org/10.1109/IEMBS.2002.1134447
- Arjmandi, M. K., Pooyan, M. (2012). An optimum algorithm in pathological voice quality assessment using wavelet-packet-based features, linear discriminant analysis and support vector machine. Biomedical Signal Processing and Control, 7 (1), 3–19.
- Hariharan, M., Polat, K., Yaacob, S. (2014). A new feature constituting approach to detection of vocal fold pathology. International Journal of Systems Science, 45 (8), 1622–1634. https://doi.org/10.1080/00207721.2013.794905
- Saldanha, J. C., Ananthakrishna, T., Pinto, R. (2014). Vocal fold pathology assessment using mel-frequency cepstral coefficients and linear predictive cepstral coefficients features. Journal of Medical Imaging and Health Informatics, 4 (2), 168–173. https://doi.org/10.1166/jmihi.2014.1253
- Arias-Londoño, J. D., Godino-Llorente, J. I., Sáenz-Lechón, N., Osma-Ruiz, V., Castellanos-Domínguez, G. (2011). Automatic detection of pathological voices using complexity measures, noise parameters, and mel-cepstral coefficients. IEEE Transactions on Biomedical Engineering, 58 (2), 370–379. https://doi.org/10.1109/TBME.2010.2089052
- Godino-Llorente, J. I., Aguilera-Navarro, S., Gómez-Vilda, P. (2000). LPC, LPCC and MFCC parameterisation applied to the detection of voice impairments. In 6th International Conference on Spoken Language Processing (ICSLP 2000). ISCA, Vol. 3, 965–968. https://doi.org/10.21437/ICSLP.2000-695
- Watts, C. R., Awan, S. N. (2011). Use of spectral/cepstral analyses for differentiating normal from hypofunctional voices in sustained vowel and continuous speech contexts. Journal of Speech, Language, and Hearing Research, 54 (6), 1525–1537. https://doi.org/10.1044/1092-4388(2011/10-0209)
- Farazi, S., Shekofteh, Y. (2024). Voice pathology detection on spontaneous speech data using deep learning models. International Journal of Speech Technology, 27, 739–751. https://doi.org/10.1007/s10772-024-10134-4
- Mohammed, M. A., Abdulkareem, K. H., Mostafa, S. A., Ghani, M. K. A., Maashi, M. S., Garcia-Zapirain, B., Oleagordia, I., Alhakami, H., AL-Dhief, F. T. (2020). Voice pathology detection and classification using convolutional neural network model. Applied Sciences, 10 (11), 3723. https://doi.org/10.3390/app10113723
- Ankışhan, H., İnam, S. Ç. (2021). Voice pathology detection by using the deep network architecture. Applied Soft Computing, 106, 107310. https://doi.org/10.1016/j.asoc.2021.107310
- Barry, W. J. (2000). Saarbrücken Voice Database, Version 2.0. Institute of Phonetics, Saarland University, Germany. https://stimmdb.coli.uni-saarland.de/
- Krizhevsky, A., Sutskever, I., Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60 (6), 84–90. https://doi.org/10.1145/3065386
- Huang, N. E., Shen, Z., Long, S. R., Wu, M. C., Shih, H. H., Zheng, Q., Yen, N.-C., Tung, C. C., Liu, H. H. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 454 (1971), 903–995. https://doi.org/10.1098/rspa.1998.0193
- Huang, N. E., Wu, M.-L. C., Long, S. R., Shen, S. S. P., Qu, W., Gloersen, P., Fan, K. L. (2003). A confidence limit for the empirical mode decomposition and Hilbert spectral analysis. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 459, 2317–2345. https://doi.org/10.1098/rspa.2003.1123
- Wu, Z., Huang, N. E. (2009). Ensemble empirical mode decomposition: A noise-assisted data analysis method. Advances in Adaptive Data Analysis, 1 (1), 1–41. https://doi.org/10.1142/S1793536909000047
- Eskidere, O., Gürhanlı, A. (2015). Voice disorder classification based on multitaper mel frequency cepstral coefficients features. Computational and Mathematical Methods in Medicine, 1–12. https://doi.org/10.1155/2015/956249
- Kadiri, S. R., Alku, P. (2020). Analysis and detection of pathological voice using glottal source features. IEEE Journal of Selected Topics in Signal Processing, 14 (2), 367–379. https://doi.org/10.1109/JSTSP.2019.2957988
- Tirronen, S., Kadiri, S. R., Alku, P. (2022). The effect of the MFCC frame length in automatic voice pathology detection. Journal of Voice, 38 (5), 975–982. https://doi.org/10.1016/j.jvoice.2022.03.021
- Tirronen, S., Javanmardi, F., Kodali, M., Kadiri, S. R., Alku, P. (2023). Utilizing Wav2vec in database-independent voice disorder detection. In 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 1–5. https://doi.org/10.1109/ICASSP49357.2023.10094798
- Javanmardi, F., Kadiri, S. R., Alku, P. (2023). A comparison of data augmentation methods in voice pathology detection. Computer Speech and Language, 83, 101552. https://doi.org/10.1016/j.csl.2023.101552
- Cai, J., Song, Y., Wu, J. (2024). Voice disorder classification using Wav2vec 2.0 feature extraction. Journal of Voice. https://doi.org/10.1016/j.jvoice.2024.09.002
- Javanmardi, F., Tirronen, S., Kodali, M., Kadiri, S. R., Alku, P. (2023). Wav2vec-based detection and severity level classification of dysarthria from speech. In 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://doi.org/10.1109/ICASSP49357.2023.100948577