Comparison of speaker dependent and speaker independent emotion recognition

By: Jan Rybka and Artur Janicki
Open Access | Dec 2013

References

  1. Ayadi, M.E., Kamel, M.S. and Karray, F. (2007). Speech emotion recognition using Gaussian mixture vector autoregressive models, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Honolulu, HI, USA, Vol. 4, pp. IV-957-IV-960.
  2. Ayadi, M.E., Kamel, M.S. and Karray, F. (2011). Survey on speech emotion recognition: Features, classification schemes, and databases, Pattern Recognition 44(3): 572-587, DOI: 10.1016/j.patcog.2010.09.020.
  3. Batliner, A., Steidl, S., Hacker, C., Noth, E. and Niemann, H. (2005). Tales of tuning-prototyping for automatic classification of emotional user states, Interspeech 2005, Lisbon, Portugal, pp. 489-492.
  4. Brooks, M. (2012). Voicebox: Speech processing toolbox for Matlab, http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html.
  5. Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W. and Weiss, B. (2005). A database of German emotional speech, Interspeech 2005, Lisbon, Portugal, pp. 1517-1520.
  6. Camacho, A. and Harris, J.G. (2008). A sawtooth waveform inspired pitch estimator for speech and music, Journal of the Acoustical Society of America 124: 1638-1652, DOI: 10.1121/1.2951592, PMID: 19045655.
  7. Cichosz, J. and Slot, K. (2007). Emotion recognition in speech signal using emotion-extracting binary decision trees, ACII 2007, Lisbon, Portugal.
  8. Clavel, C., Devillers, L., Richard, G., Vasilescu, I. and Ehrette, T. (2007). Detection and analysis of abnormal situations through fear-type acoustic manifestations, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Honolulu, HI, USA, Vol. 4, pp. IV-21-IV-24.
  9. Devillers, L. and Vidrascu, L. (2006). Real-life emotions detection with lexical and paralinguistic cues on human-human call center dialogs, Interspeech 2006, Pittsburgh, PA, USA, pp. 801-804.
  10. Ekman, P. (1972). Universals and cultural differences in facial expressions of emotions, in J. Cole (Ed.), Nebraska Symposium on Motivation, Vol. 19, University of Nebraska Press, Lincoln, NE, pp. 207-282.
  11. Engberg, I.S., Hansen, A.V., Andersen, O. and Dalsgaard, P. (1997). Design, recording and verification of a Danish emotional speech database, Eurospeech 1997, Rhodes, Greece, DOI: 10.21437/Eurospeech.1997-482.
  12. Erden, M. and Arslan, L.M. (2011). Automatic detection of anger in human-human call center dialogs, Interspeech 2011, Florence, Italy, pp. 81-84.
  13. Gajsek, R., Mihelic, F. and Dobrisek, S. (2013). Speaker state recognition using an HMM-based feature extraction method, Computer Speech and Language 27(1): 135-150, DOI: 10.1016/j.csl.2012.01.007.
  14. Gorska, Z. and Janicki, A. (2012). Recognition of extraversion level based on handwriting and support vector machines, Perceptual and Motor Skills 114(3)(0031-5125): 857-869, DOI: 10.2466/03.09.28.PMS.114.3.857-869, PMID: 22913026.
  15. Grimm, M., Kroschel, K. and Narayanan, S. (2007). Support vector regression for automatic recognition of spontaneous emotions in speech, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Honolulu, HI, USA, Vol. 4, pp. IV-1085-IV-1088.
  16. Hassan, A. and Damper, R.I. (2010). Multi-class and hierarchical SVMs for emotion recognition, Interspeech 2010, Makuhari, Japan, pp. 2354-2357.
  17. He, L., Lech, M., Memon, S. and Allen, N. (2008). Recognition of stress in speech using wavelet analysis and Teager energy operator, Interspeech 2008, Brisbane, Australia, pp. 605-608.
  18. Hirschberg, J., Benus, S., Brenier, J.M., Enos, F., Friedman, S., Gilman, S., Gir, C., Graciarena, M., Kathol, A. and Michaelis, L. (2005). Distinguishing deceptive from non-deceptive speech, Interspeech 2005, Lisbon, Portugal, pp. 1833-1836.
  19. Iliou, T. and Anagnostopoulos, C.-N. (2010). Classification on speech emotion recognition-a comparative study, International Journal on Advances in Life Sciences 2(1-2): 18-28.
  20. Janicki, A. (2012). On the Impact of Non-speech Sounds on Speaker Recognition, Text, Speech and Dialogue, Vol. 7499, Springer, Berlin/Heidelberg, pp. 566-572.
  21. Janicki, A. and Turkot, M. (2008). Speaker emotion recognition with the use of support vector machines, Telecommunication Review and Telecommunication News (8-9): 994-1005, (in Polish).
  22. Jeleń, Ł., Fevens, T. and Krzyżak, A. (2008). Classification of breast cancer malignancy using cytological images of fine needle aspiration biopsies, International Journal of Applied Mathematics and Computer Science 18(1): 75-83, DOI: 10.2478/v10006-008-0007-x.
  23. Kaminska, D. and Pelikant, A. (2012). Recognition of human emotion from a speech signal based on Plutchik’s model, International Journal of Electronics and Telecommunications 58(2): 165-170, DOI: 10.2478/v10177-012-0024-4.
  24. Kang, B.S., Han, C.H., Lee, S.T., Youn, D.H. and Lee, C. (2000). Speaker dependent emotion recognition using speech signals, ICSLP 2000, Beijing, China, DOI: 10.21437/ICSLP.2000-288.
  25. Kowalczuk, Z. and Czubenko, M. (2011). Intelligent decision-making system for autonomous robots, International Journal of Applied Mathematics and Computer Science 21(4): 671-684, DOI: 10.2478/v10006-011-0053-7.
  26. Liberman, M., Davis, K., Grossman, M., Martey, N. and Bell, J. (2002). Emotional Prosody Speech and Transcripts, Linguistic Data Consortium, Philadelphia, PA.
  27. Liscombe, J., Hirschberg, J. and Venditti, J.J. (2005). Detecting certainness in spoken tutorial dialogues, Interspeech 2005, Lisbon, Portugal, DOI: 10.21437/Interspeech.2005-581.
  28. Liu, G., Lei, Y. and Hansen, J.H.L. (2010). A novel feature extraction strategy for multi-stream robust emotion identification, Interspeech 2010, Makuhari, Japan, pp. 482-485.
  29. Lugger, M. and Yang, B. (2007). The relevance of voice quality features in speaker independent emotion recognition, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Honolulu, HI, USA, Vol. 4, pp. IV-17-IV-20.
  30. Lugger, M., Yang, B. and Wokurek, W. (2006). Robust estimation of voice quality parameters under realworld disturbances, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006), Toulouse, France, Vol. 1, p. I.
  31. Mehrabian, A. and Wiener, M. (1967). Decoding of inconsistent communications, Journal of Personality and Social Psychology 6(1): 109-114, DOI: 10.1037/h0024532, PMID: 6032751.
  32. Neiberg, D., Laukka, P. and Ananthakrishnan, G. (2010). Classification of affective speech using normalized time-frequency cepstra, 5th International Conference on Speech Prosody (Speech Prosody 2010), Chicago, IL, USA, pp. 1-4.
  33. Patan, K. and Korbicz, J. (2012). Nonlinear model predictive control of a boiler unit: A fault tolerant control study, International Journal of Applied Mathematics and Computer Science 22(1): 225-237, DOI: 10.2478/v10006-012-0017-6.
  34. Scherer, K.R. (2003). Vocal communication of emotion: A review of research paradigms, Speech Communication 40(1-2): 227-256, DOI: 10.1016/S0167-6393(02)00084-5.
  35. Schuller, B., Koehler, N., Moeller, R. and Rigoll, G. (2006). Recognition of interest in human conversational speech, Interspeech 2006, Pittsburgh, PA, USA, pp. 793-796.
  36. Schuller, B., Vlasenko, B., Eyben, F., Rigoll, G. and Wendemuth, A. (2009). Acoustic emotion recognition: A benchmark comparison of performances, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2009), Merano, Italy, pp. 552-557.
  37. Seppi, D., Batliner, A., Schuller, B., Steidl, S., Vogt, T., Wagner, J., Devillers, L., Vidrascu, L., Amir, N. and Aharonson, V. (2008). Patterns, prototypes, performance: Classifying emotional user states, Interspeech 2008, Brisbane, Australia, pp. 601-604.
  38. Vapnik, V.N. (1982). Estimation of Dependences Based on Empirical Data, Springer-Verlag, New York, NY, (translation of Vosstanovlenie zavisimostei po empiricheskim dannym by Samuel Kotz).
  39. Xiao, Z., Dellandrea, E., Dou, W. and Chen, L. (2006). Two-stage classification of emotional speech, International Conference on Digital Telecommunications (ICDT’06), Cap Esterel, Côte d’Azur, France, p. 32.
  40. Yacoub, S., Simske, S., Lin, X. and Burns, J. (2003). Recognition of emotions in interactive voice response systems, Eurospeech 2003, Geneva, Switzerland, pp. 1-4.
  41. Yu, C., Aoki, P. M. and Woodruff, A. (2004). Detecting user engagement in everyday conversations, 8th International Conference on Spoken Language Processing (ICSLP 2004), Jeju, Korea, pp. 1-6.
DOI: https://doi.org/10.2478/amcs-2013-0060 | Journal eISSN: 2083-8492 | Journal ISSN: 1641-876X
Language: English
Page range: 797 - 808
Published on: Dec 31, 2013
Published by: University of Zielona Góra
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2013 Jan Rybka, Artur Janicki, published by University of Zielona Góra
This work is licensed under the Creative Commons License.

Volume 23 (2013): Issue 4 (December 2013)