
Performance Evaluation of Deep Neural Networks Applied to Speech Recognition: RNN, LSTM and GRU

Open Access | Aug 2019

Abstract

Deep Neural Networks (DNN) are neural networks with many hidden layers. DNNs are becoming popular in automatic speech recognition tasks, which combine a good acoustic model with a language model. Standard feedforward neural networks cannot handle speech data well since they have no way to feed information from a later layer back to an earlier layer. Thus, Recurrent Neural Networks (RNNs) have been introduced to take temporal dependencies into account. However, the shortcoming of RNNs is that they cannot handle long-term dependencies due to the vanishing/exploding gradient problem. Therefore, Long Short-Term Memory (LSTM) networks were introduced, which are a special case of RNNs that takes long-term dependencies in speech, in addition to short-term dependencies, into account. Similarly, GRU (Gated Recurrent Unit) networks are an improvement of LSTM networks that also takes long-term dependencies into consideration. Thus, in this paper, we evaluate RNN, LSTM, and GRU to compare their performance on a reduced TED-LIUM speech data set. The results show that LSTM achieves the best word error rates; however, GRU optimization is faster while achieving word error rates close to those of LSTM.
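To make the comparison concrete, here is a minimal sketch of the three recurrent layer types using PyTorch's built-in modules. The feature dimension (40, e.g. filterbank frames) and hidden size (128) are illustrative placeholders, not the paper's configuration. The printed parameter counts hint at why GRU training tends to be cheaper than LSTM: a GRU cell has three gating/candidate weight sets where an LSTM cell has four.

```python
import torch
import torch.nn as nn

# Toy batch of "speech" features: (batch, time, feature_dim),
# e.g. 40-dimensional filterbank frames per time step.
x = torch.randn(8, 100, 40)

# The three recurrent layers compared in the paper, as single-layer
# PyTorch modules with an assumed hidden size of 128.
rnn = nn.RNN(input_size=40, hidden_size=128, batch_first=True)
lstm = nn.LSTM(input_size=40, hidden_size=128, batch_first=True)
gru = nn.GRU(input_size=40, hidden_size=128, batch_first=True)

for name, layer in [("RNN", rnn), ("LSTM", lstm), ("GRU", gru)]:
    out, _ = layer(x)  # out: (batch, time, hidden_size)
    n_params = sum(p.numel() for p in layer.parameters())
    print(f"{name}: output {tuple(out.shape)}, {n_params} parameters")
```

Running this shows the LSTM carrying roughly a third more parameters than the GRU at the same hidden size, consistent with the abstract's observation that GRU optimization is faster while its accuracy stays close to LSTM.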

Language: English
Page range: 235 - 245
Submitted on: Sep 29, 2018
Accepted on: Mar 10, 2019
Published on: Aug 30, 2019
Published by: SAN University
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2019 Apeksha Shewalkar, Deepika Nyavanandi, Simone A. Ludwig, published by SAN University
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.