Identification of Spontaneous Spoken Texts in Slovak

Róbert Sabo; Peter Krammer; Ján Mojžiš; Marcel Kvassay

doi:10.2478/jazcas-2019-0076

Journal of Linguistics/Jazykovedný časopis

Volume 70 (2019): Issue 2 (December 2019)

By: Róbert Sabo, Peter Krammer, Ján Mojžiš and Marcel Kvassay

Open Access

Dec 2019

Abstract

We propose a text classification method for creating a language model for automatic recognition of spontaneous speech. Transcripts from our departmental speech database served as the spontaneous spoken texts. Using supervised machine learning methods, we created multiple classification models (including neural networks) that distinguished these transcripts from written texts with high accuracy. We then verified the accuracy of the trained models on a database of texts containing direct speech extracted from newspaper articles.
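As a rough illustration of the kind of supervised spoken-versus-written classification the abstract describes, the sketch below trains a small multinomial Naive Bayes model on word counts. It is not the authors' method: the abstract does not specify their features or models beyond mentioning neural networks, so the classifier, the `NaiveBayesTextClassifier` name, and the toy English sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would need
    # language-aware tokenization (assumption for illustration).
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Shared vocabulary across all classes.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(self.doc_counts[c] / total_docs)
            total_words = sum(self.word_counts[c].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                score += math.log(
                    (self.word_counts[c][tok] + 1)
                    / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Invented toy data, not taken from the paper's corpora.
texts = [
    "uh well i think you know it was like yeah",
    "hmm so yeah i mean we uh went there",
    "the committee approved the annual budget proposal",
    "the study presents results of the experiment",
]
labels = ["spoken", "spoken", "written", "written"]

model = NaiveBayesTextClassifier().fit(texts, labels)
print(model.predict("yeah uh i mean you know"))
print(model.predict("the committee presents the budget"))
```

The verification step mentioned in the abstract would correspond to calling `predict` on a separate held-out collection, such as direct speech quoted in newspaper articles, and comparing the predictions with the known labels.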

DOI: https://doi.org/10.2478/jazcas-2019-0076 | Journal eISSN: 1338-4287 | Journal ISSN: 0021-5597

Language: English

Page range: 481 - 490

Published on: Dec 21, 2019

Published by: Slovak Academy of Sciences, Ľudovít Štúr Institute of Linguistics

In partnership with: Paradigm Publishing Services

Publication frequency: 3 issues per year

Keywords: spontaneous speech, text classification, supervised machine learning, neural networks, Slovak language

Related subjects: Linguistics and semiotics, Theoretical frameworks and disciplines, Linguistics, other

© 2019 Róbert Sabo, Peter Krammer, Ján Mojžiš, Marcel Kvassay, published by Slovak Academy of Sciences, Ľudovít Štúr Institute of Linguistics
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
