GRAF – Gendered Reference Analysis in French

Figures & Tables

Table 1

An example of a sentence coded in the database. The following columns and their values are omitted from the table for reasons of space: Source = written, Sentence = ‘il s’est souvent battu devant les tribunaux contre ceux qui l’accusaient d’avoir été un tortionnaire’, Sentence ID = 19, Document ID = 1_M_C_040602_bothtranslated.txt. The abbreviations read as follows: Gen = gender, Num = number, Prs = person, Art = article.

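| ID | HEAD | TOKEN | LEMMA | UPOS | FEATURES | GENDER | REFERENT | GENERIC |
|---|---|---|---|---|---|---|---|---|
| 1 | 6 | il | il | PRON | Gen=Masc, Num=Sing, Prs=3, Type=Prs | Masc | Humain | |
| 2 | 1 | s | s | X | | | | |
| 3 | 2 | | | PUNCT | | | | |
| 4 | 6 | est | être | AUX | Mood=Ind, Num=Sing, Prs=3, Tense=Pres | | | |
| 5 | 6 | souvent | souvent | ADV | | | | |
| 6 | 0 | battu | battre | VERB | Gen=Masc, Num=Sing, Tense=Past | Masc | Humain | |
| 7 | 9 | devant | devant | ADP | | | | |
| 8 | 9 | les | le | DET | Definite=Def, Gen=Masc, Num=Plur, Type=Art | Masc | | |
| 9 | 6 | tribunaux | tribunal | NOUN | Gen=Masc, Num=Plur | Masc | | |
| 10 | 11 | contre | contre | ADP | | | | |
| 11 | 9 | ceux | celui | PRON | Gen=Masc, Num=Plur, Type=Dem | Masc | Humain | TRUE |
| 12 | 15 | qui | qui | PRON | PronType=Rel | | | |
| 13 | 15 | l | le | DET | Definite=Def, Gen=Masc, Num=Sing, Type=Art | Masc | Humain | |
| 14 | 13 | | | PUNCT | | | | |
| 15 | 11 | accusaient | accus | VERB | Mood=Ind, Num=Plur, Prs=3, Tense=Imp | | | |
| 16 | 21 | d | de | ADP | | | | |
| 17 | 21 | | | PUNCT | | | | |
| 18 | 21 | avoir | avoir | AUX | VerbForm=Inf | | | |
| 19 | 21 | été | être | AUX | Gen=Masc, Num=Sing, Tense=Past | Masc | | |
| 20 | 21 | un | un | DET | Definite=Ind, Gen=Masc, Num=Sing, Type=Art | Masc | Humain | |
| 21 | 15 | tortionnaire | tortionnaire | NOUN | Gen=Masc, Num=Sing | Masc | | |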
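
The FEATURES column in Table 1 packs several key=value pairs into one comma-separated string. As a minimal sketch of how such a cell might be unpacked for analysis, the Python snippet below splits it into a dictionary. The function name `parse_features` is hypothetical and is not part of the GRAF database or any library; the comma separator is assumed from the table as displayed.

```python
def parse_features(features: str) -> dict[str, str]:
    """Split a 'Key=Value, Key=Value' string into a dictionary.

    Hypothetical helper: assumes comma-separated pairs, as shown in Table 1.
    """
    pairs = (item.strip() for item in features.split(","))
    # Skip empty fragments so that an empty FEATURES cell yields {}.
    return dict(pair.split("=", 1) for pair in pairs if pair)


# FEATURES value of token 1 ("il") in Table 1.
il = parse_features("Gen=Masc, Num=Sing, Prs=3, Type=Prs")
print(il["Gen"], il["Num"])
```

Values are kept as strings (for example `"3"` for person), so any numeric interpretation is left to the caller.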
Table 2

Corpus size and diversity ratio (TTR).

| SOURCE | TOKENS | UNIQUE LEMMAS | TYPE-TOKEN RATIO (TTR) |
|---|---|---|---|
| Spoken | 79113 | 4727 | .0597 |
| Written | 22323 | 4405 | .1970 |
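
As an illustration of how the ratio in Table 2 can be read, the sketch below assumes that TTR is the number of unique lemmas divided by the number of tokens, which is consistent with the reported values to about three decimal places. The function name `type_token_ratio` is hypothetical; the counts are those given in the table.

```python
def type_token_ratio(unique_lemmas: int, tokens: int) -> float:
    """Return unique lemmas divided by tokens (assumed definition of TTR)."""
    if tokens <= 0:
        raise ValueError("tokens must be a positive integer")
    return unique_lemmas / tokens


# (tokens, unique lemmas) per sub-corpus, as reported in Table 2.
corpus = {"Spoken": (79113, 4727), "Written": (22323, 4405)}

ttr = {
    source: type_token_ratio(lemmas, tokens)
    for source, (tokens, lemmas) in corpus.items()
}
for source, value in ttr.items():
    print(f"{source}: {value:.4f}")
```

Because TTR tends to fall as a corpus grows, the two ratios describe sub-corpora of very different sizes and are best compared with that in mind.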
Figure 1

Log–log plot of lemma frequency as a function of rank in the full corpus.
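
A frequency-rank distribution of this kind can be derived from a list of lemmas by counting occurrences, sorting by descending frequency, and taking logarithms of both rank and frequency. The sketch below shows one way to compute the points; the function names and the toy lemma list are illustrative assumptions, not the authors' code or data.

```python
import math
from collections import Counter


def rank_frequency(lemmas):
    """Return (rank, frequency) pairs, with rank 1 for the most frequent lemma."""
    counts = Counter(lemmas)
    return [
        (rank, freq)
        for rank, (_, freq) in enumerate(counts.most_common(), start=1)
    ]


def log_log_points(pairs):
    """Map (rank, frequency) pairs to (log10 rank, log10 frequency)."""
    return [(math.log10(rank), math.log10(freq)) for rank, freq in pairs]


# Toy input for illustration only; a real run would use the corpus lemmas.
toy_lemmas = ["le", "être", "le", "un", "être", "le"]
pairs = rank_frequency(toy_lemmas)
points = log_log_points(pairs)
print(pairs)
```

The resulting points can be passed to any plotting tool; a roughly straight line on these axes is the usual signature of a Zipf-like distribution.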

Figure 2

Log–log frequency–rank distributions of lemmas in the spoken and written sub-corpora.

Figure 3

Gender distribution across parts-of-speech.

Table 3

Number of masculine and feminine tokens per UPOS category.

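| UPOS | MASC TOKENS | FEM TOKENS | TOTAL | FEM PROP. |
|---|---|---|---|---|
| ADJ | 4633 | 1698 | 6331 | .268 |
| DET | 6313 | 3787 | 10100 | .375 |
| NOUN | 10882 | 5971 | 16853 | .354 |
| PRON | 4606 | 404 | 5010 | .080 |
| VERB | 2249 | 369 | 2618 | .141 |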
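
The TOTAL and FEM PROP. columns of Table 3 follow from the two count columns, assuming the proportion is feminine tokens divided by the sum of masculine and feminine tokens. The short sketch below recomputes them from the reported counts; the function name `feminine_proportion` is hypothetical.

```python
def feminine_proportion(masc: int, fem: int) -> float:
    """Return fem / (masc + fem), the assumed definition of FEM PROP."""
    total = masc + fem
    if total == 0:
        raise ValueError("no gendered tokens in this category")
    return fem / total


# (masculine tokens, feminine tokens) per UPOS category, from Table 3.
counts = {
    "ADJ": (4633, 1698),
    "DET": (6313, 3787),
    "NOUN": (10882, 5971),
    "PRON": (4606, 404),
    "VERB": (2249, 369),
}

totals = {upos: masc + fem for upos, (masc, fem) in counts.items()}
props = {upos: feminine_proportion(*pair) for upos, pair in counts.items()}
for upos in counts:
    print(f"{upos}: total={totals[upos]}, fem_prop={props[upos]:.3f}")
```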
DOI: https://doi.org/10.5334/johd.510 | Journal eISSN: 2059-481X
Language: English
Submitted on: Jan 8, 2026
Accepted on: Mar 5, 2026
Published on: Apr 14, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Magdalena Lemus-Serrano, Marine Cozzolino, Tessa Vermeir, Mathilde Josserand, Marc Allassonnière-Tang, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.