Digital Narratives of COVID-19: A Twitter Dataset for Text Analysis in Spanish

Open Access | Jun 2021

Abstract

Digital Narratives of COVID-19 (DHCovid) offers a curated Twitter corpus of digital conversations about the Coronavirus pandemic. The dataset has been collected since April 24th, 2020, by a script that queries Twitter’s Application Programming Interface (API), and it is stored on GitHub as an open access repository of tweet identifiers that can be consulted, downloaded, and reused by scholars interested in Natural Language Processing (NLP), topic modelling, and other quantitative methods. A stable version of the dataset has also been released through Zenodo. The Twitter datasets are structured in three main collections: tweets in Spanish worldwide; geolocated tweets in six Spanish-speaking areas spanning North America (Mexico), South America (Argentina, Colombia, Ecuador, Peru), and Europe (Spain); and geolocated tweets in English and Spanish from the greater Miami area in South Florida.
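
Because the repository distributes tweet identifiers rather than full tweets, reuse typically begins by reading the identifiers and grouping them into batches for retrieval. The following Python sketch illustrates that first step only. It assumes a plain-text layout with one identifier per line and a batch size of 100; the file layout, the function names `read_tweet_ids` and `batch`, and the sample identifiers are illustrative assumptions and are not taken from the DHCovid repository.

```python
from typing import Iterable, Iterator, List


def read_tweet_ids(lines: Iterable[str]) -> List[str]:
    """Collect unique tweet IDs in order of first appearance.

    Assumes one ID per line (hypothetical layout); blank lines and
    lines that are not purely numeric are skipped.
    """
    seen = set()
    ids: List[str] = []
    for line in lines:
        candidate = line.strip()
        if candidate.isdecimal() and candidate not in seen:
            seen.add(candidate)
            ids.append(candidate)
    return ids


def batch(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs (size is an assumption)."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


if __name__ == "__main__":
    # Made-up sample lines standing in for a downloaded identifier file.
    sample = ["1253816433452785665\n", "\n", "1253816433452785665\n", "not-an-id\n"]
    ids = read_tweet_ids(sample)
    for group in batch(ids, size=100):
        print(len(group), "identifier(s) ready for retrieval")
```

With a real file, the same functions could be applied to an open file handle, for example `read_tweet_ids(open(path, encoding="utf-8"))`, where `path` points to a downloaded identifier file.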

DOI: https://doi.org/10.5334/johd.28 | Journal eISSN: 2059-481X
Language: English
Published on: Jun 10, 2021
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Susanna Allés-Torrent, Gimena del Rio Riande, Jerry Bonnell, Dieyun Song, Nidia Hernández, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.