Crosthwaite, P., and Baisa, V. (2023). Generative AI and the end of corpus-assisted data-driven learning? Not so fast! Applied Corpus Linguistics, 3(3). Accessible at: https://doi.org/10.1016/j.acorp.2023.100066.
Jurafsky, D., and Martin, J. (2023). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (Third edition e-book: draft of January 7, 2023). Accessible at: https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf (accessed on 19 July 2023).
Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., and Suchomel, V. (2014). The Sketch Engine: ten years on. Lexicography, 1(1), pages 7–36.
Navarro, D. (2015). Learning Statistics with R: A tutorial for psychology students and other beginners. (Version 0.6), 599 p. Sydney. University of New South Wales. Accessible at: http://compcogscisydney.org/learning-statistics-with-r/.
NCES. (n.d.). Volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2020, with forecasts from 2021 to 2025 (in zettabytes). In Statista - The Statistics Portal. Accessible at: https://www.statista.com/statistics/871513/worldwide-data-created/.
Woźniak, M., Wołos, A., Modrzyk, U., Górski, R. L., Winkowski, J., Bajczyk, M., Szymkuć, S., Grzybowski, B., and Eder, M. (2018). Linguistic measures of chemical diversity and the ‘keywords’ of molecular collections. Scientific Reports, 8(1), page 7598.