LTSD and GDMD features for Telephone Speech Endpoint Detection

By: Atanas Ouzounov  
Open Access | Nov 2017

Abstract

This paper proposes a new contour-based speech endpoint detector that combines the log-Group Delay Mean-Delta (log-GDMD) feature, an adaptive two-threshold scheme, and an eight-state automaton. The adaptive threshold scheme uses two pairs of thresholds, one for the starting point and one for the ending point. Each pair is calculated from the contour characteristics in the corresponding region of the utterance. Experimental results show that the proposed detector achieves better endpoint accuracy than a detector based on Long-Term Spectral Divergence (LTSD). Additional fixed-text speaker verification tests on short phrases of telephone speech, using the Dynamic Time Warping (DTW) and left-to-right Hidden Markov Model (HMM) frameworks, confirm that the better endpoint accuracy improves the verification rate.
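
The idea of using a separate pair of thresholds for each end of the utterance can be illustrated with a minimal sketch. This is not the paper's detector: it does not compute the log-GDMD feature and it replaces the eight-state automaton with a simple hysteresis search. The function names, the `n_edge` parameter, and the mean-plus-scaled-deviation threshold rule are assumptions made only for illustration; the abstract does not state how the thresholds are derived from the contour.

```python
from statistics import mean, pstdev


def threshold_pair(region, k_low=2.0, k_high=4.0):
    """Return (low, high) thresholds for one edge region of the contour.

    Assumed rule (not from the paper): mean plus a multiple of the
    standard deviation of frames presumed to contain only noise.
    """
    m, s = mean(region), pstdev(region)
    return m + k_low * s, m + k_high * s


def detect_endpoints(contour, n_edge=5):
    """Return (start, end) frame indices, or None if no speech is found.

    The first n_edge frames set the starting-point thresholds and the
    last n_edge frames set the ending-point thresholds.
    """
    start_low, start_high = threshold_pair(contour[:n_edge])
    end_low, end_high = threshold_pair(contour[-n_edge:])

    # Starting point: find the first frame above the high threshold,
    # then move back while the preceding frame is still above the low one.
    first_high = next(
        (i for i, v in enumerate(contour) if v > start_high), None
    )
    if first_high is None:
        return None
    start = first_high
    while start > 0 and contour[start - 1] > start_low:
        start -= 1

    # Ending point: find the last frame above the high threshold,
    # then move forward while the following frame is still above the low one.
    last_high = next(
        (i for i in range(len(contour) - 1, -1, -1) if contour[i] > end_high),
        None,
    )
    if last_high is None:
        return None
    end = last_high
    while end < len(contour) - 1 and contour[end + 1] > end_low:
        end += 1

    return start, end


# Toy contour: quiet frames, a burst of activity, quiet frames again.
contour = [1.0, 1.2, 0.9, 1.1, 1.0, 1.5, 3.0, 6.0, 7.0,
           6.5, 4.0, 2.0, 1.1, 0.9, 1.0, 1.2, 1.0]
print(detect_endpoints(contour))  # (5, 11)
```

Using the high threshold to confirm activity and the low threshold to place the boundary keeps short noise spikes from triggering a detection while still capturing weak onsets and decays. Estimating each pair from its own edge of the utterance lets the detector tolerate background noise that differs before and after the speech.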

DOI: https://doi.org/10.1515/cait-2017-0045 | Journal eISSN: 1314-4081 | Journal ISSN: 1311-9702
Language: English
Page range: 114 - 133
Published on: Nov 30, 2017
Published by: Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2017 Atanas Ouzounov, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.