Table 1
Overview of the texts sampled in STAF.
| AUTHOR – TITLE | YEAR | SENTENCES | TOKENS |
|---|---|---|---|
| Ismail Kadare – Gjenerali i Ushtrisë së Vdekur | 1963 | 42 | 593 |
| Dritëro Agolli – Njerëz të krisur | 1995 | 11 | 144 |
| Fatos Kongoli – Lëkura e qenit | 2003 | 12 | 317 |
| Rexhep Qosja – Një dashuri dhe shtatë faje | 2003 | 36 | 529 |
| Flutura Açka – Kryqi i harresës | 2004 | 4 | 64 |
| Fatos Kongoli – Ëndrra e Damokleut | 2004 | 50 | 1207 |
| Enkelejd Lamaj – Libri i bardhë | 2011 | 16 | 90 |
| Enkelejd Lamaj – Vendi diku midis | 2014 | 10 | 229 |
| Ridvan Dibra – Gjumi mbi borë | 2016 | 19 | 152 |
| Total | | 200 | 3325 |
Table 2
Differences in the annotation of the TSA, STAF, and SALT treebanks. UPOS = Universal Parts of Speech; deprel = Dependency Relation; dephead = Dependency Head; features = Universal Features.
| | TSA | STAF | SALT |
|---|---|---|---|
| Multi-word tokens | no | yes | yes |
| UPOS tags | 14 | 15 | 17 |
| UPOS for kam ‘to have’ and jam ‘to be’ as copula | AUX | AUX | VERB |
| Deprels | 33 | 37 | 32 |
| Dephead for adjectival/nominal predications | adj/noun | adj/noun | copula |
| Deprel for për/të + verb | mark | mark | fixed |
| Deprel for oblique temporal modifiers | obl | obl:tmod | advmod |
| Deprel for possessive pronouns | det | det:poss | amod:poss |
| Deprel for articles of prearticulated adjectives | det:adj | det:adj | det |
| Deprel for pronominal clitics in clitic doubling | expl | obj/iobj | obj/iobj |
| Features | 36 | 41 | ? |
| Features for adjectives | Gender, Number | Case, Degree, Gender, Number | Case, Degree, Gender, Number |
| Features for adpositions | – | – | Case |
| Features for adverbs | Degree | Degree, AdvType | AdvType |
| Features for articles | Gender | Case, Definite, Gender, Number, PronType | Case, Gender, Number, PronType |
| Features for possessive markers (i/e/së/të + possessor) | Gender | Gender, Number | Case, Gender, Number, PronType |
| Features for personal pronouns | Gender, Number, PronType | Case, Gender, Number, PronType | Case, Gender, Number, PronType |
