A Mixed Method Twitter Methodology and Anonymous Corpus

By: Tim Ribaric  
Open Access | Jun 2023

Figures & Tables

Table 1

All columns found in the dataset.

Column                  Description
id                      global ID of tweet as string
favorite_count          integer count of how many times tweet was favourited
retweet_count           integer count of how many times tweet was retweeted
created_at              timestamp of when tweet was made
full_text               full text of tweet
entities.hashtags       list of hashtags found in tweet
entities.symbols        list of symbols found in tweet
entities.user_mentions  list of users mentioned in the tweet
entities.urls           list of URLs found in the tweet
possibly_sensitive      autogenerated flag indicating possibly sensitive content
entities.media          Python list of media found in tweet, e.g. images
full_text_norm          normalized full text of tweet
vscore_pos              VADER positive dimension score for the tweet full text, 0 to 1 inclusive
vscore_neg              VADER negative dimension score for the tweet full text, 0 to 1 inclusive
vscore_neu              VADER neutral dimension score for the tweet full text, 0 to 1 inclusive
vscore_compound         VADER composite score for the tweet full text, -1 to 1 inclusive
swears                  flag if tweet contains a swear word
engaged                 flag if tweet was either favourited or retweeted
total_engagement        engagement score, i.e. combined count of retweets and favourites
hashtags                flag if tweet contains a hashtag
questions               flag if tweet contains a question (full text includes a question mark)
media                   flag if tweet contains an image
fav_quant               quantile the tweet falls in based on favourite count, if applicable
g_index                 grief index value for the month in which the tweet was made
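Several of the flag columns described in Table 1 follow directly from the exported fields. The sketch below is one possible reading of those descriptions, not the author's code: the function name `derive_flags` is hypothetical, the list-valued fields are assumed to be already parsed into Python lists, and any entry in `entities.media` is assumed to count as an image.

```python
def derive_flags(tweet: dict) -> dict:
    """Compute a few derived columns from Table 1 for a single tweet.

    Assumption: `tweet` is keyed by the export column names, and the
    entities.* fields are real lists rather than strings such as "[]".
    """
    favourites = int(tweet.get("favorite_count", 0))
    retweets = int(tweet.get("retweet_count", 0))
    text = tweet.get("full_text", "")
    return {
        # engaged: tweet was either favourited or retweeted
        "engaged": favourites > 0 or retweets > 0,
        # total_engagement: combined count of retweets and favourites
        "total_engagement": favourites + retweets,
        # hashtags: tweet contains at least one hashtag
        "hashtags": bool(tweet.get("entities.hashtags")),
        # questions: full text includes a question mark
        "questions": "?" in text,
        # media: assumed here to mean any media entry is present
        "media": bool(tweet.get("entities.media")),
    }


example = {
    "favorite_count": 2,
    "retweet_count": 1,
    "full_text": "Is anyone else at the conference? #libraries",
    "entities.hashtags": ["libraries"],
    "entities.media": [],
}
print(derive_flags(example))
```

The VADER scores, `swears`, `fav_quant` and `g_index` are left out because the table alone does not specify how they were computed.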
Table 2

Description of data origin, either direct from Twitter export or result of analysis.

Provenance      Columns
Twitter Export  favorite_count
                retweet_count
                created_at
                full_text
                entities.hashtags
                entities.symbols
                entities.user_mentions
                entities.urls
                possibly_sensitive
                entities.media
Derived         full_text_norm
                vscore_pos
                vscore_neg
                vscore_neu
                vscore_compound
                swears
                engaged
                total_engagement
                hashtags
                questions
                media
                fav_quant
                g_index
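When working with the dataset, it can be handy to look up whether a column came straight from the Twitter export or was produced by later analysis. The following is a minimal sketch of such a lookup built from Table 2. The names `PROVENANCE` and `provenance_of` are hypothetical, and the column spellings follow Table 1 (`favorite_count`, `entities.hashtags`).

```python
# Provenance groups as listed in Table 2 (column spellings follow Table 1).
PROVENANCE = {
    "Twitter Export": [
        "favorite_count", "retweet_count", "created_at", "full_text",
        "entities.hashtags", "entities.symbols", "entities.user_mentions",
        "entities.urls", "possibly_sensitive", "entities.media",
    ],
    "Derived": [
        "full_text_norm", "vscore_pos", "vscore_neg", "vscore_neu",
        "vscore_compound", "swears", "engaged", "total_engagement",
        "hashtags", "questions", "media", "fav_quant", "g_index",
    ],
}


def provenance_of(column: str) -> str:
    """Return the provenance group for a column, or raise KeyError."""
    for source, columns in PROVENANCE.items():
        if column in columns:
            return source
    raise KeyError(f"column not listed in Table 2: {column}")


print(provenance_of("full_text"))
print(provenance_of("g_index"))
```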
Figure 1

Engagement score box plots for tweets with different characteristics.

Figure 2

Engagement score distribution of all tweets.

Figure 3

VADER sentiment score breakdown of all tweets in the archive.

Figure 4

Word cloud of all the tweets in the archive.

DOI: https://doi.org/10.5334/johd.109 | Journal eISSN: 2059-481X
Language: English
Submitted on: Apr 27, 2023
Accepted on: Jun 14, 2023
Published on: Jun 29, 2023
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2023 Tim Ribaric, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.