
Figure 1
Percentage of books missing a fiction tag for the 20 most frequent languages.
Table 1
List of attributes included in our dataset.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| HTID | The HathiTrust ID by which the work is accessible. |
| Access Restrictions | Whether the work is made public by HathiTrust. |
| HathiTrust Bibliography Key | The bibliography key for the work, used to retrieve its MARC record. |
| Title | The title of the volume in question. |
| Year Published | The year in which the work was published. |
| Language | The language in which the work was published. |
| Author | The author of the work in question. |
| Fictionality | Whether the work is intended to be fictional (1) or not (0). |
| Length | The length of the work. |
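
For readers who want to load the dataset programmatically, the attributes in Table 1 could be modelled as a single typed record. The following is a minimal sketch only: the class name `VolumeRecord`, the field names, and the Python types are assumptions made for illustration, not the dataset's actual column names or storage format. In particular, the encoding of the access restriction and the unit of the length attribute are not specified in Table 1, so they are left as a plain string and a plain integer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VolumeRecord:
    """One row of the dataset, following Table 1 (names and types are assumed)."""

    htid: str                      # HathiTrust ID by which the work is accessible
    access_restrictions: str       # encoding not specified in Table 1; kept as text
    bibliography_key: str          # HathiTrust bibliography key (MARC lookup)
    title: str
    year_published: Optional[int]  # None if the year is unknown (assumption)
    language: str
    author: str
    fictionality: int              # 1 = fiction, 0 = not fiction
    length: int                    # unit not specified in Table 1

    def __post_init__(self) -> None:
        # Table 1 defines fictionality as a binary flag.
        if self.fictionality not in (0, 1):
            raise ValueError("fictionality must be 0 or 1")

    @property
    def is_fiction(self) -> bool:
        return self.fictionality == 1


# Placeholder values for illustration only; not a real HathiTrust volume.
example = VolumeRecord(
    htid="placeholder.0001",
    access_restrictions="public",
    bibliography_key="000000000",
    title="An Example Title",
    year_published=1900,
    language="eng",
    author="Example, Author",
    fictionality=1,
    length=1000,
)
```

Validating the fictionality flag at construction time keeps malformed rows from silently entering counts such as those shown in Figure 2.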

Figure 2
Number of books tagged as fiction for the 18 most frequent languages, before and after classification.

Figure 3
Relative number of non-English books by decade before and after classification.
