References
- Shin, K. G. and Ramanathan, P. (1994). Real-Time Computing: A New Discipline of Computer Science and Engineering. Proceedings of the IEEE, vol. 82, no. 1, pp. 6–24.
- Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., and Sutskever, I. (2022). Robust Speech Recognition via Large-Scale Weak Supervision. arXiv:2212.04356.
- Likhomanenko, T., Xu, Q., Pratap, V., Tomasello, P., Kahn, J., Avidov, G., Collobert, R., and Synnaeve, G. (2020). Rethinking evaluation in ASR: Are our models robust enough? arXiv:2010.11745.
- Baevski, A., Zhou, H., Mohamed, A., and Auli, M. (2020). wav2vec 2.0: A framework for self-supervised learning of speech representations. arXiv:2006.11477.
- Chan, W., Park, D., Lee, C., Zhang, Y., Le, Q., and Norouzi, M. (2021). SpeechStew: Simply mix all available speech recognition data to train one large neural network. arXiv:2104.02133.
- Zhang, Y., Park, D. S., Han, W., Qin, J., Gulati, A., Shor, J., Jansen, A., Xu, Y., Huang, Y., Wang, S., et al. (2021). BigSSL: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition. arXiv:2109.13226.
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems, pp. 5998–6008.
- Valk, J. and Alumäe, T. (2021). VoxLingua107: a dataset for spoken language recognition. In 2021 IEEE Spoken Language Technology Workshop (SLT), pp. 652–658. IEEE.
- Gandhi, S., von Platen, P., and Rush, A. M. (2023). Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling. arXiv:2311.00430.
- Patry, N. (2022). Making automatic speech recognition work on large files with Wav2Vec2 in Transformers. https://huggingface.co/blog/asr-chunking. Accessed: 25 Nov. 2024.
- Nanjo, H. and Kawahara, T. (2005). A new ASR evaluation measure and minimum Bayes-risk decoding for open-domain speech understanding.
- Hendrycks, D., Liu, X., Wallace, E., Dziedzic, A., Krishnan, R., and Song, D. (2020). Pretrained transformers improve out-of-distribution robustness. arXiv:2004.06100.
- Research Institute for Artificial Intelligence "Mihai Drăgănescu", Romanian Academy. Romanian datasets. http://www.racai.ro