The Browser-Based GLAUx Treebank Infrastructure: Framework, Functionality, and Future
Abstract
This paper presents the browser-based treebank infrastructure of GLAUx (the Greek Language AUtomated). This linguistic annotation project now has its own integrated, user-friendly platform for exploring its data. After discussing the size and types of texts included in the GLAUx corpus, the contribution briefly surveys the types of linguistic annotation covered by the project (morphology, lemmatization, and syntax). The emphasis is on describing the underlying SQL database structure and the search architecture. Infrastructure-related challenges faced by the GLAUx project are also discussed. The paper concludes with a discussion of future steps for the project, including additional functionality and expansion of the corpus.
© 2024 Alek Keersmaekers, Frédéric Pietowski, Toon Van Hal, Mark Depauw, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
