Dynamic Voice Parameter Modifications in Speech Signals
Open Access | Dec 2021

References

  [1] M. A. Karjalainen, V. T. Pulkki, “Communication Acoustics: An Introduction to Speech, Audio and Psychoacoustics”, John Wiley & Sons Ltd., 2015.
  [2] G. Fant, “Acoustic Theory of Speech Production”, Walter de Gruyter GmbH, 1970. DOI: 10.1515/9783110873429
  [3] J. L. Flanagan, “Speech Analysis, Synthesis and Perception”, Springer, 1972. DOI: 10.1007/978-3-662-01562-9
  [4] M. R. Schroeder, “A brief history of synthetic speech”, Speech Communication, vol. 13, Elsevier, 1993. DOI: 10.1016/0167-6393(93)90074-U
  [5] https://www.ee.columbia.edu/%7Edpwe/resources/matlab/pvoc/, Accessed on 8th November 2020.
  [6] U. Zölzer, “DAFx Digital Audio Effects 2nd Edition”, John Wiley & Sons Ltd., 2011. DOI: 10.1002/9781119991298
  [7] https://www.itu.int/rec/T-REC-P.862-200511-I!Amd2/en, Accessed on 15th May 2021.
  [8] V. Panayotov, G. Chen, D. Povey, S. Khudanpur, “Librispeech: An ASR corpus based on public domain audio books”, IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 5206-5210, 2015.
  [9] R. Wang, J. Lu, “Investigation of golden speakers for second language learners from imitation preference perspective by voice modification”, Speech Communication, vol. 53 (2), pp. 175-184, 2011. DOI: 10.1016/j.specom.2010.08.015
  [10] H. Kai, S. Takamichi, S. Shiota, H. Kiya, “Lightweight Voice Anonymization Based on Data-Driven Optimization of Cascaded Voice Modification Modules”, IEEE Spoken Language Technology Workshop (SLT), pp. 560-566, 2021. DOI: 10.1109/SLT48900.2021.9383535
  [11] R. González Hautamäki, “Human-induced voice modification and speaker recognition: Automatic, perceptual and acoustic perspectives”, Publications of the University of Eastern Finland, Dissertations in Forestry and Natural Sciences, 2017.
  [12] E. S. Ottosen, M. Dörfler, “A Phase Vocoder Based on Nonstationary Gabor Frames”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25 (11), pp. 2199-2208, 2017.
  [13] M. Kaniewska, “Human Voice Modification Using Instantaneous Complex Frequency”, J. of the Audio Engineering Society (JAES), Paper 8136, 2010.
  [14] V. V. Nar, A. N. Cheeran, S. Banerjee, “Verification of TD-PSOLA for Implementing Voice Modification”, Int. J. of Engineering Research and Applications (IJERA), vol. 3 (3), pp. 461-465, 2013.
  [15] S. Kannan, P. R. Raju, R. S. S. Madhav, S. Tripathi, “Voice Conversion Using Spectral Mapping and TD-PSOLA”, Advances in Computing and Network Communications, Lecture Notes in Electrical Engineering, vol. 736, Springer, Singapore, 2021. DOI: 10.1007/978-981-33-6987-0_17
  [16] Y. Y. Zhang, F. F. Wang, W. T. Du, “The DSP Implementation of Algorithm for Voice Speed Changing and Pitch Shifting Based on TD-PSOLA”, Applied Mechanics and Materials, vol. 543-547, pp. 2833-2837, 2014.
  [17] B. Akanksh, S. Vekkot, S. Tripathi, “Interconversion of Emotions in Speech Using TD-PSOLA”, Advances in Signal Processing and Intelligent Recognition Systems, Advances in Intelligent Systems and Computing, vol. 425, Springer, Cham, 2016. DOI: 10.1007/978-3-319-28658-7_32
  [18] A. Moinet, T. Dutoit, “PVSOLA: A phase vocoder with synchronized overlap-add”, Proc. of the 14th Int. Conference on Digital Audio Effects (DAFx-11), Paris, France, 2011.
  [19] S. Kraft, M. Holters, A. v. d. Knesebeck, U. Zölzer, “Improved PVSOLA time-stretching and pitch-shifting for polyphonic audio”, Proc. Int. Conf. on Digital Audio Effects (DAFx), York, UK, pp. 17-21, 2012.
  [20] J. Laroche, “Frequency-Domain Techniques for High-Quality Voice Modification”, Proc. of the 6th Int. Conference on Digital Audio Effects (DAFx-03), London, UK, 2003.
  [21] M. Liuni, A. Röbel, “Phase vocoder and beyond”, Musica/Tecnologia, vol. 7, pp. 73-89, 2013.
  [22] Z. Průša, N. Holighaus, “Phase vocoder done right”, 25th European Signal Processing Conference (EUSIPCO), pp. 976-980, 2017. DOI: 10.23919/EUSIPCO.2017.8081353
  [23] M. Morise, F. Yokomori, K. Ozawa, “WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications”, IEICE Transactions on Information and Systems, 2016. DOI: 10.1587/transinf.2015EDP7457
  [24] A. Roebel, “A Shape-Invariant Phase Vocoder for Speech Transformation”, Digital Audio Effects (DAFx), Graz, Austria, pp. 1-1, 2010.
  [25] A. Roebel, “Shape-invariant speech transformation with the phase vocoder”, InterSpeech, Makuhari, Japan, pp. 2146-2149, 2010.
  [26] A. Sorin, S. Shechtman, A. Rendel, “Semi Parametric Concatenative TTS with Instant Voice Modification Capabilities”, Interspeech, pp. 1373-1377, 2017.
  [27] O. Perrotin, I. Mcloughlin, “GFM-Voc: A real-time voice quality modification system”, Interspeech, 20th Annual Conf. of the Int. Speech Communication Association, Graz, Austria, pp. 3685-3686, 2019.
  [28] Y. Stylianou, “Voice Transformation: A survey”, IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp. 3585-3588, 2009.
  [29] M. Dolson, “The Phase Vocoder: A Tutorial”, Computer Music Journal, vol. 10, 1986. DOI: 10.2307/3680093
  [30] J. L. Flanagan, R. M. Golden, “Phase Vocoder”, Bell System Technical Journal, 1966. DOI: 10.1002/j.1538-7305.1966.tb01706.x
  [31] https://www.dsprelated.com/freebooks/sasp/Choice_Hop_Size.html, Accessed on 22nd April 2021.
  [32] https://www.mathworks.com/help/matlab/matlab_prog/what-are-system-objects.html, Accessed on 13th April 2021.
  [33] https://www.itu.int/rec/T-REC-P.862-200102-I/en, Accessed on 15th May 2021.
  [34] https://www.itu.int/rec/T-REC-P.800.1/en, Accessed on 15th May 2021.
  [35] https://www.itu.int/rec/T-REC-P.862.1-200311-I/en, Accessed on 15th May 2021.
  [36] https://www.itu.int/rec/T-REC-P.863, Accessed on 17th May 2021.
  [37] https://qxlab.ucd.ie/index.php/speech-quality-metrics/, Accessed on 17th May 2021.
DOI: https://doi.org/10.2478/aucts-2021-0001 | Journal eISSN: 2668-6449 | Journal ISSN: 1583-7149
Language: English
Page range: 1 - 10
Published on: Dec 30, 2021
Published by: Lucian Blaga University of Sibiu
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Filip Cristian George, Neghină Mihai, published by Lucian Blaga University of Sibiu
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.