Comma Distribution in Czech Texts: Variation by Genre and Author, and Error Analysis

Open Access | Nov 2025

Abstract

This article investigates the distribution and typology of commas in Czech texts, combining genre-differentiated samples with an annotated error corpus to offer a comprehensive view of punctuation usage and misuse. Building on previous work, we expand the analysis from a small newspaper sample to a broader set of texts, encompassing fiction, blogs, translations, and school dictations. Using a consistent typology of comma usage, we classify 1,000 manually selected instances and identify trends in different textual genres. Furthermore, we examine over 1,000 missing comma errors and more than 200 redundant ones from the self-built error corpus. The results reveal genre-dependent tendencies in comma types, especially in the use of commas preceding connectives and within asyndetic structures. The study offers insights for improving automatic comma insertion systems and deepens our understanding of punctuation norms and deviations in Czech.

DOI: https://doi.org/10.2478/jazcas-2025-0004 | Journal eISSN: 1338-4287 | Journal ISSN: 0021-5597
Language: English
Page range: 41 - 51
Published on: Nov 27, 2025
Published by: Slovak Academy of Sciences, Mathematical Institute
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2025 Jakub Machura, Hana Žižková, Vojtěch Kovář, published by Slovak Academy of Sciences, Mathematical Institute
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.