Table 1
Number of texts per language and subject level (HL = Home Language, FAL = First Additional Language), for the full examination texts and for the extracted reading comprehension and summarization texts.

| Language | Examination texts (HL) | Examination texts (FAL) | Examination texts (Total) | Extracted texts (HL) | Extracted texts (FAL) | Extracted texts (Total) |
|---|---|---|---|---|---|---|
| Afrikaans | 21 | 22 | 43 | 53 | 58 | 111 |
| English | 24 | 25 | 49 | 56 | 57 | 113 |
| IsiNdebele | 20 | 16 | 36 | 43 | 34 | 77 |
| IsiXhosa | 19 | 21 | 40 | 39 | 42 | 81 |
| IsiZulu | 18 | 18 | 36 | 36 | 39 | 75 |
| Sepedi | 22 | 20 | 42 | 48 | 42 | 90 |
| Sesotho | 22 | 19 | 41 | 49 | 39 | 88 |
| Setswana | 17 | 14 | 31 | 34 | 29 | 63 |
| Siswati | 21 | 18 | 39 | 43 | 38 | 81 |
| Tshivenḓa | 19 | 17 | 36 | 39 | 37 | 76 |
| Xitsonga | 20 | 16 | 36 | 41 | 33 | 74 |

Table 2
Token and type count per language and subject level for the full examination texts.

| Language | Tokens (HL) | Tokens (FAL) | Tokens (Total) | Types (HL) | Types (FAL) | Types (Total) |
|---|---|---|---|---|---|---|
| Afrikaans | 77,787 | 90,731 | 168,518 | 8,943 | 6,946 | 12,829 |
| English | 80,252 | 86,113 | 166,365 | 8,497 | 7,489 | 12,325 |
| IsiNdebele | 48,931 | 37,519 | 86,450 | 12,430 | 9,719 | 18,903 |
| IsiXhosa | 50,480 | 53,518 | 103,998 | 13,529 | 13,488 | 23,136 |
| IsiZulu | 43,456 | 44,082 | 87,538 | 11,738 | 11,076 | 19,726 |
| Sepedi | 66,846 | 56,594 | 123,440 | 5,709 | 5,253 | 8,374 |
| Sesotho | 80,592 | 66,934 | 147,526 | 6,900 | 5,811 | 9,738 |
| Setswana | 52,836 | 42,026 | 94,862 | 5,587 | 4,580 | 8,106 |
| Siswati | 54,597 | 43,845 | 98,442 | 14,608 | 10,792 | 21,868 |
| Tshivenḓa | 62,726 | 52,881 | 115,607 | 5,694 | 4,636 | 7,877 |
| Xitsonga | 71,227 | 50,579 | 121,806 | 5,831 | 4,446 | 7,933 |
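
In the table above, the token totals are the sum of the HL and FAL columns, whereas the type totals are smaller than that sum, because a word form that occurs at both subject levels is counted only once. A minimal sketch of this behaviour follows. It assumes lowercasing and a simple regular-expression tokenizer, which may differ from the tokenization actually used to produce these counts, and the helper names and sample sentences are purely illustrative.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and extract runs of word characters or apostrophes.

    This tokenization scheme is an assumption made for illustration only.
    """
    return re.findall(r"[\w']+", text.lower())


def token_type_counts(texts: list[str]) -> tuple[int, int]:
    """Return (number of tokens, number of types) over a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    # Tokens are all word occurrences; types are the distinct word forms.
    return sum(counts.values()), len(counts)


# Hypothetical stand-ins for the HL and FAL texts of one language.
hl_texts = ["The cat sat on the mat."]
fal_texts = ["The dog sat on the log."]

hl = token_type_counts(hl_texts)
fal = token_type_counts(fal_texts)
total = token_type_counts(hl_texts + fal_texts)

print("HL   :", hl)
print("FAL  :", fal)
print("TOTAL:", total)
```

Here the token counts add up across the two levels, but the combined type count is lower than the sum of the per-level type counts, because "the", "sat", and "on" are shared. The same pattern holds in every row of the table.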

Table 3
Token and type count per language and subject level for the extracted reading comprehension and summarization texts.

| Language | Tokens (HL) | Tokens (FAL) | Tokens (Total) | Types (HL) | Types (FAL) | Types (Total) |
|---|---|---|---|---|---|---|
| Afrikaans | 29,298 | 24,804 | 54,102 | 5,761 | 3,955 | 8,019 |
| English | 33,625 | 27,131 | 60,756 | 6,064 | 4,900 | 8,599 |
| IsiNdebele | 18,346 | 12,637 | 30,983 | 7,890 | 6,146 | 12,268 |
| IsiXhosa | 19,980 | 18,601 | 38,581 | 8,920 | 8,611 | 15,347 |
| IsiZulu | 18,812 | 15,540 | 34,352 | 8,211 | 6,732 | 13,177 |
| Sepedi | 27,439 | 19,235 | 46,674 | 3,995 | 3,444 | 5,845 |
| Sesotho | 29,754 | 21,182 | 50,936 | 4,456 | 3,413 | 6,235 |
| Setswana | 21,303 | 14,269 | 35,572 | 3,683 | 2,741 | 5,289 |
| Siswati | 20,885 | 13,607 | 34,492 | 8,902 | 6,300 | 13,394 |
| Tshivenḓa | 25,473 | 18,693 | 44,166 | 3,889 | 2,900 | 5,273 |
| Xitsonga | 25,932 | 17,778 | 43,710 | 3,899 | 2,904 | 5,405 |
