Statistician, Programmer, Data Scientist? Who is, or Should Be, a Corpus Linguist in the 2020s?

Open Access | Dec 2023

References

  1. Anthony, L. (2022). AntConc (Version 4.2.0) [Computer Software]. Tokyo, Japan. Waseda University. Accessible at: https://www.laurenceanthony.net/software.
  2. Brezina, V. (2018). Statistics for Corpus Linguistics. Cambridge: Cambridge University Press, 314 p.
  3. Cantos Gomez, P. (2013). Statistical Methods in Language and Linguistic Research. London: Equinox, 256 p.
  4. Crosthwaite, P., and Baisa, V. (2023). Generative AI and the end of corpus-assisted data-driven learning? Not so fast! Applied Corpus Linguistics, 3(3). Accessible at: https://doi.org/10.1016/j.acorp.2023.100066.
  5. Desagulier, G. (2017). Corpus Linguistics and Statistics with R. Introduction to Quantitative Methods in Linguistics. Berlin: Springer, 366 p.
  6. Dunn, J. (2022). Natural Language Processing for Corpus Linguistics (Elements in Corpus Linguistics). Cambridge: Cambridge University Press, 96 p.
  7. Gries, S. (2013). Statistics for Linguistics with R. Berlin: De Gruyter, 374 p.
  8. Hirschberg, J., and Manning, Ch. (2015). Advances in natural language processing. Science, 349(6245), pages 261–266.
  9. Hyland, K. (2023). Academic publishing and the attention economy. Journal of English for Academic Purposes, 64. Accessible at: https://doi.org/10.1016/j.jeap.2023.101253.
  10. Jurafsky, D., and Martin, J. (2023). Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. (Third edition e-book: draft of January 7, 2023). Accessible at: https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf. (accessed on 19 July 2023).
  11. Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., and Suchomel, V. (2014). The Sketch Engine: ten years on. Lexicography, 1(1), pages 7–36.
  12. Levshina, N. (2015). How to do Linguistics with R: Data exploration and statistical analysis. Amsterdam: John Benjamins, 454 p.
  13. Lew, R. (2023, June 12). ChatGPT as a COBUILD lexicographer. Accessible at: https://doi.org/10.31219/osf.io/t9mbu.
  14. McEnery, T., and Wilson, A. (1996). Corpus Linguistics. Edinburgh: Edinburgh University Press, 256 p.
  15. Navarro, D. (2015). Learning Statistics with R: A tutorial for psychology students and other beginners. (Version 0.6), 599 p. Sydney. University of New South Wales. Accessible at: http://compcogscisydney.org/learning-statistics-with-r/.
  16. NCES. (n.d.). Volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2020, with forecasts from 2021 to 2025 (in zettabytes). In Statista - The Statistics Portal. Accessible at: https://www.statista.com/statistics/871513/worldwide-data-created/.
  17. Ooi, V. (1998). Computer Corpus Lexicography. Edinburgh: Edinburgh University Press, 224 p.
  18. Scott, M. (2022). WordSmith Tools version 8 (64 bit version) Stroud: Lexical Analysis Software.
  19. Winter, B. (2019). Statistics for Linguists: An Introduction Using R. London: Routledge, 310 p.
  20. Woźniak, M., Wołos, A., Modrzyk, U., Górski, R. L., Winkowski, J., Bajczyk, M., Szymkuć, S., Grzybowski, B., and Eder, M. (2018). Linguistic measures of chemical diversity and the ‘keywords’ of molecular collections. Scientific Reports, 8(1), page 7598.
DOI: https://doi.org/10.2478/jazcas-2023-0023 | Journal eISSN: 1338-4287 | Journal ISSN: 0021-5597
Language: English
Page range: 52 - 59
Published on: Dec 25, 2023
Published by: Slovak Academy of Sciences, Mathematical Institute
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2023 Łukasz Grabowski, published by Slovak Academy of Sciences, Mathematical Institute
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.