Era- and Genre-Specific Stop Word Lists for Low-Resource Computational Research: A Classical Latin Exemplum

Open Access | Nov 2024

Abstract

In this data paper, we argue that computational researchers, particularly those working in low-resource contexts, should consult with linguistic specialists to create stop lists targeted at specific eras, genres, authors, or contexts. We offer an exemplum: stop lists targeted at Augustan Latin poetry. Our open-access stop lists, available as standalone files alongside a command-line Python script, can serve as a starting point for other eras or genres of Latin literature. More broadly, the transdisciplinary and collaborative process by which these stop lists were created is of significant benefit to low-resource computational linguistics research teams.

DOI: https://doi.org/10.5334/johd.246 | Journal eISSN: 2059-481X
Language: English
Submitted on: Sep 28, 2024
Accepted on: Nov 12, 2024
Published on: Nov 25, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Rachel E. Dubit, Annie K. Lamar, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.