Supporting the corpus-based study of Shakespeare’s language: Enhancing a corpus of the First Folio

Open Access | Jun 2021

Abstract

This article explores challenges in the corpus linguistic analysis of Shakespeare’s language, and of Early Modern English more generally, with particular focus on possible solutions and the benefits they bring. It gives an account of work carried out within the Encyclopedia of Shakespeare’s Language Project (2016–2019), discussing the development of the project’s data resources, specifically the Enhanced Shakespearean Corpus. Topics covered include the composition of the corpus and its subcomponents; the structure of the XML markup; the design of the extensive character metadata; and the word-level corpus annotation, including spelling regularisation, part-of-speech tagging, lemmatisation and semantic tagging. The challenges that arise from each of these undertakings are not exclusive to a corpus-based treatment of Shakespeare’s plays, but it is in the context of Shakespeare’s language that they are so severe as to seem almost insurmountable. The solutions developed for the Enhanced Shakespearean Corpus, which often combine automated manipulation with manual intervention and are always principled, offer a way through.

DOI: https://doi.org/10.2478/icame-2021-0002 | Journal eISSN: 1502-5462 | Journal ISSN: 0801-5775
Language: English
Page range: 37–86
Published on: Jun 12, 2021
Published by: Uppsala University, Department of English
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Jonathan Culpeper, Andrew Hardie, Jane Demmen, Jennifer Hughes, Matt Timperley, published by Uppsala University, Department of English
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.