A Mixed Method Twitter Methodology and Anonymous Corpus

Figures & Tables

All columns found in the dataset.

id	global ID of tweet as string
favorite_count	integer count of how many times tweet was favourited
retweet_count	integer count of how many times tweet was retweeted
created_at	timestamp of when tweet was made
full_text	full text of tweet
entities.hashtags	list of hashtags found in tweet
entities.symbols	list of symbols found in tweet
entities.user_mentions	list of users mentioned in the tweet
entities.urls	list of URLs found in the tweet
possibly_sensitive	flag autogenerated to indicate possibility of sensitive content
entities.media	Python list of media found in tweet, e.g. images
full_text_norm	normalized full text of tweet
vscore_pos	VADER positive dimension score for the tweet full-text, 0 to 1 inclusive
vscore_neg	VADER negative dimension score for the tweet full-text, 0 to 1 inclusive
vscore_neu	VADER neutral dimension score for the tweet full-text, 0 to 1 inclusive
vscore_compound	VADER composite score for the tweet full-text, –1 to 1 inclusive
swears	flag if tweet contains a swear word
engaged	flag if tweet was either favourited or retweeted
total_engagement	engagement score, i.e. the combined count of retweets and favourites
hashtags	flag if tweet contains a hashtag
questions	flag if tweet contains a question (full text includes a question mark)
media	flag if tweet contains image
fav_quant	what quantile tweet is in based on favourite count, if applicable
g_index	the grief index value for the month that the tweet was made

Description of data origin, either direct from Twitter export or result of analysis.

PROVENANCE	COLUMNS
Twitter Export	favorite_count retweet_count created_at full_text entities.hashtags entities.symbols entities.user_mentions entities.urls possibly_sensitive entities.media
Derived	full_text_norm vscore_pos vscore_neg vscore_neu vscore_compound swears engaged total_engagement hashtags questions media fav_quant g_index
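
Several of the Derived columns follow directly from the definitions in the first table. The sketch below illustrates how engaged, total_engagement, hashtags and questions could be computed for one tweet. It is not the pipeline used to build the corpus. It assumes each tweet is a flat dictionary keyed by the exported column names, that flags are stored as booleans, and that the hashtags flag is taken from the entities.hashtags list. The function name derive_flags is hypothetical.

```python
def derive_flags(tweet):
    """Compute the derived columns whose rules are stated in the column table.

    Assumption: `tweet` is a flat dict keyed by the exported column names.
    Assumption: flags are represented as booleans.
    """
    favs = tweet.get("favorite_count", 0)
    rts = tweet.get("retweet_count", 0)
    text = tweet.get("full_text", "")
    tags = tweet.get("entities.hashtags") or []

    return {
        # flag if tweet was either favourited or retweeted
        "engaged": favs > 0 or rts > 0,
        # combined count of retweets and favourites
        "total_engagement": favs + rts,
        # flag if tweet contains a hashtag (assumed: non-empty hashtag list)
        "hashtags": len(tags) > 0,
        # flag if the full text includes a question mark
        "questions": "?" in text,
    }


# Hypothetical example record, not taken from the corpus.
example = {
    "favorite_count": 2,
    "retweet_count": 0,
    "full_text": "Is anyone else awake? #insomnia",
    "entities.hashtags": ["insomnia"],
}
derived = derive_flags(example)
print(derived)
```

The remaining Derived columns (the VADER scores, swears, media, fav_quant and g_index) depend on external resources or definitions not given in these tables, so they are left out of the sketch.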