Enhanced LSTM network with semi-supervised learning and data augmentation for low-resource ASR

Open Access | Mar 2025

Abstract

Automatic speech recognition (ASR) is essential for developing intelligent systems capable of accurately processing human speech, particularly in low-resource languages. This study addresses the challenges faced by ASR systems for Indian languages, where data and resources are limited. The authors propose a novel three-step methodology that combines data augmentation and semi-supervised learning to enhance ASR performance. First, an enhanced long short-term memory (LSTM) network is trained on the limited labeled data to produce a baseline model. Next, synthetic data is generated and combined with the original recordings to refine the ASR model. Finally, semi-supervised training further boosts accuracy. Evaluations demonstrate significant improvements over existing models for Hindi, Marathi, and Odia.
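
The data flow of steps two and three can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say which augmentation or which semi-supervised variant the authors use, so additive Gaussian noise and confidence-thresholded pseudo-labeling stand in as common choices. The names `Utterance`, `augment`, `pseudo_label`, and the `recognize` callable (a placeholder for the trained baseline model from step one) are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Waveform = List[float]


@dataclass
class Utterance:
    audio: Waveform
    transcript: Optional[str] = None  # None means the utterance is unlabeled


def add_noise(audio: Waveform, noise_level: float, rng: random.Random) -> Waveform:
    """Return a copy of the waveform with Gaussian noise added (assumed augmentation)."""
    return [x + rng.gauss(0.0, noise_level) for x in audio]


def augment(labeled: List[Utterance], copies: int,
            noise_level: float, seed: int = 0) -> List[Utterance]:
    """Step 2 (sketch): originals plus `copies` noisy variants of each, same transcripts."""
    rng = random.Random(seed)
    out = list(labeled)
    for utt in labeled:
        for _ in range(copies):
            out.append(Utterance(add_noise(utt.audio, noise_level, rng), utt.transcript))
    return out


def pseudo_label(unlabeled: List[Utterance],
                 recognize: Callable[[Waveform], Tuple[str, float]],
                 threshold: float) -> List[Utterance]:
    """Step 3 (sketch): keep only hypotheses whose confidence meets the threshold."""
    selected = []
    for utt in unlabeled:
        text, confidence = recognize(utt.audio)  # placeholder for the baseline model
        if confidence >= threshold:
            selected.append(Utterance(utt.audio, text))
    return selected
```

In such a setup, the output of `augment` would be used to retrain the baseline, and the utterances returned by `pseudo_label` would be added to the training set for a further round. The threshold trades the amount of extra data against the risk of training on wrong transcripts.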

Language: English
Submitted on: Nov 20, 2024
Published on: Mar 4, 2025
Published by: Professor Subhas Chandra Mukhopadhyay
In partnership with: Paradigm Publishing Services
Publication frequency: 1 time per year

© 2025 Tripti Choudhary, Vishal Goyal, Atul Bansal, published by Professor Subhas Chandra Mukhopadhyay
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.