Dynamic Voice Parameter Modifications in Speech Signals

Filip Cristian George; Neghină Mihai

doi:10.2478/aucts-2021-0001

Acta Universitatis Cibiniensis. Technical Series

Volume 73 (2021): Issue 1 (December 2021)

Open Access | Dec 2021


DOI: https://doi.org/10.2478/aucts-2021-0001 | Journal eISSN: 2668-6449 | Journal ISSN: 1583-7149

Language: English

Page range: 1 - 10

Published on: Dec 30, 2021

Published by: Lucian Blaga University of Sibiu

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: Speech, Voice, Volume, Duration, Pitch, Timbre

Related subjects: Electrical engineering, Automation, Mechanical engineering, Production technology, Process engineering and industrial engineering

© 2021 Filip Cristian George, Neghină Mihai, published by Lucian Blaga University of Sibiu
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
