The Corpus for Idiolectal Research (CIDRE)

Open Access | Jul 2021

Abstract

The Corpus for Idiolectal Research (CIDRE) is a collection of works of fiction by 11 prolific 19th-century French authors (4 women, 7 men; 22–62 works per author; 37 million words in total). Every work is dated with the year in which it was written. The works were gathered with scripts from open-source platforms, for example La Bibliothèque électronique du Québec, and stripped of paratext (text that is not part of the novel itself, e.g. prefaces). We distribute the text files, the dating, other metadata and the scripts under an open-source license. CIDRE is the first French-language resource for the diachronic study of style and idiolect (i.e. stylochronometry) on a larger scale.
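
As an illustration of how a corpus of dated works might be prepared for stylochronometric analysis, the following minimal Python sketch groups works by author and orders them by year of writing. It is not taken from the CIDRE scripts: the `Work` record, the functions `word_count` and `timeline_by_author`, and the sample entries are all hypothetical, and counting words by splitting on whitespace is a simplifying assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Work:
    """One dated work; the field names are hypothetical, not CIDRE's schema."""
    author: str
    title: str
    year: int   # year in which the work was written
    text: str   # body text, assumed already stripped of paratext


def word_count(text: str) -> int:
    """Rough word count: whitespace-separated tokens (an assumption)."""
    return len(text.split())


def timeline_by_author(works):
    """Group works by author and sort each group chronologically.

    Returns a dict mapping author -> list of (year, title, word count).
    """
    grouped = defaultdict(list)
    for work in works:
        grouped[work.author].append(work)
    return {
        author: [
            (w.year, w.title, word_count(w.text))
            for w in sorted(items, key=lambda w: w.year)
        ]
        for author, items in grouped.items()
    }


# Placeholder data for illustration only; not actual corpus content.
sample = [
    Work("Author A", "Later novel", 1850, "un deux trois quatre"),
    Work("Author A", "Early novel", 1835, "un deux trois"),
    Work("Author B", "Only novel", 1842, "un deux"),
]

timeline = timeline_by_author(sample)
for author, entries in timeline.items():
    print(author, entries)
```

Ordering each author's works by year is the step that makes a diachronic comparison possible: any stylistic measure computed per work can then be read as a time series for that author.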

DOI: https://doi.org/10.5334/johd.42 | Journal eISSN: 2059-481X
Language: English
Published on: Jul 15, 2021
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Olga Seminck, Philippe Gambette, Dominique Legallois, Thierry Poibeau, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.