
A multi-threaded approach for improved and faster accent transcription of chemical terms

Open Access | Apr 2025

References

  1. Hinsvark, Arthur, et al. Accented Speech Recognition: A Survey. arXiv:2104.10747, arXiv, 2 June 2021. arXiv.org, DOI: 10.48550/arXiv.2104.10747.
  2. Droua-Hamdani, G., Selouani, S. A., & Boudraa, M. (2012, June 6). Speaker-independent ASR for Modern Standard Arabic: effect of regional accents. International Journal of Speech Technology, 15(4), 487–493. DOI: 10.1007/s10772-012-9146-4
  3. Vergyri, Dimitra, Lamel, Lori, & Gauvain, Jean-Luc. (2010). Automatic speech recognition of multiple accented English data. Proc. Interspeech 2010, 1652–1655. DOI: 10.21437/Interspeech.2010-477
  4. Lin, Zhaofeng, Tanvina Patel, and Odette Scharenborg. “Improving Whispered Speech Recognition Performance Using Pseudo-Whispered Based Data Augmentation.” 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2023.
  5. Chang, Jungwon, and Hosung Nam. “Exploring the feasibility of fine-tuning large-scale speech recognition models for domain-specific applications: A case study on Whisper model and KsponSpeech dataset.” Phonetics and Speech Sciences 15.3 (2023): 83–88.
  6. Hock, Kia Siang, and Li Lingxia. “Automated processing of massive audio/video content using FFmpeg.” Code4Lib Journal 23 (2014).
  7. Swain, M. C., & Cole, J. M. (2016, October 6). ChemDataExtractor: A Toolkit for Automated Extraction of Chemical Information from the Scientific Literature. Journal of Chemical Information and Modeling, 56(10), 1894–1904. DOI: 10.1021/acs.jcim.6b00207
  8. Kothari, S., Chiwhane, S., Satya, R., Ansari, M. A., Mehta, S., Naranatt, P., & Karthikeyan, M. (2023). Fine-tuning ASR Model Performance on Indian Regional Accents for Accurate Chemical Term Prediction in Audio. International Journal of Intelligent Systems and Applications in Engineering, 11(4), 485–494. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3583
  9. Phillips, S., & Rogers, A. (1999). International Journal of Parallel Programming, 27(4), 257–288. DOI: 10.1023/a:1018741730355
  10. Dong, Qianqian, et al. “Learning When to Translate for Streaming Speech.” arXiv, 15 Sept. 2021. DOI: 10.48550/arXiv.2109.07368. Accessed 4 Feb. 2024.
  11. Krallinger, M., Rabal, O., Leitner, F. et al. The CHEMDNER corpus of chemicals and drugs and its annotation principles. J Cheminform 7 (Suppl 1), S2 (2015). DOI: 10.1186/1758-2946-7-S1-S2
  12. Chong, Jike & Friedland, Gerald & Janin, Adam & Morgan, Nelson & Oei, Chris. (2010). Opportunities and challenges of parallelizing speech recognition. 2-2.
  13. Saito, Takashi. “A framework of human-based speech transcription with a speech chunking front-end.” 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA). IEEE, 2015.
  14. Jorge, Javier, et al. “Live streaming speech recognition using deep bidirectional LSTM acoustic models and interpolated language models.” IEEE/ACM Transactions on Audio, Speech, and Language Processing 30 (2021): 148–161.
  15. Perero-Codosero, Juan M., et al. “Exploring Open-Source Deep Learning ASR for Speech-to-Text TV program transcription.” IberSPEECH. 2018.
  16. A. Radford et al., “Robust Speech Recognition via Large-Scale Weak Supervision,” arXiv preprint arXiv:2212.04356, 2022. [Online]. Available: https://arxiv.org/abs/2212.04356
  17. A. Baevski, H. Zhou, A. Mohamed, and M. Auli, “wav2vec 2.0: A framework for self-supervised learning of speech representations,” in Advances in Neural Information Processing Systems, vol. 33, pp. 12449–12460, 2020. [Online]. Available: https://arxiv.org/abs/2006.11477
  18. Google Cloud, “Speech-to-Text API,” 2023. [Online]. Available: https://cloud.google.com/speech-to-text
  19. P. Jyothi and M. Hasegawa-Johnson, “Acoustic Model Adaptation for Indian English Speech Recognition,” in Proc. Interspeech, 2015, pp. 1565–1569.
  20. A. Gupta, P. K. Ghosh, and H. A. Murthy, “Automatic Speech Recognition for Indian Accents: A Survey,” in IEEE Access, vol. 10, pp. 59347–59365, 2022. [Online]. DOI: 10.1109/ACCESS.2022.3179123
  21. S. Manjunath and K. R. Ramakrishnan, “Domain-Specific Speech Recognition: Challenges and Solutions,” in IEEE Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 431–445, 2021.
  22. A. Rao, R. Patel, and M. S. Deshpande, “Performance Evaluation of ASR Systems for Indian Accents Using Deep Learning Techniques,” in 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), pp. 678–683, IEEE, 2021.
  23. S. Setty, S. R. Patil, and N. Gupta, “Speech Recognition for Chemistry Terminology Using Deep Learning,” in IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 7, pp. 3456–3465, 2022.
Language: English
Submitted on: Feb 5, 2025
Published on: Apr 25, 2025
Published by: Professor Subhas Chandra Mukhopadhyay
In partnership with: Paradigm Publishing Services
Publication frequency: once per year

© 2025 Sonali Kothari, Shwetambari Chiwhane, Shreeja Mehta, Pranav Naranatt, Md. Asad Ansari, Rithwik Satya, published by Professor Subhas Chandra Mukhopadhyay
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.