Table 1
All columns found in the dataset.
| COLUMN | DESCRIPTION |
|---|---|
| id | global ID of tweet as string |
| favorite_count | integer count of how many times tweet was favourited |
| retweet_count | integer count of how many times tweet was retweeted |
| created_at | timestamp of when tweet was made |
| full_text | full text of tweet |
| entities.hashtags | list of hashtags found in tweet |
| entities.symbols | list of symbols found in tweet |
| entities.user_mentions | list of users mentioned in the tweet |
| entities.urls | list of URLs found in the tweet |
| possibly_sensitive | flag autogenerated to indicate possibility of sensitive content |
| entities.media | list of media found in the tweet, e.g. images |
| full_text_norm | Normalized full text of tweet |
| vscore_pos | VADER positive dimension score for the tweet full-text, 0 to 1 inclusive |
| vscore_neg | VADER negative dimension score for the tweet full-text, 0 to 1 inclusive |
| vscore_neu | VADER neutral dimension score for the tweet full-text, 0 to 1 inclusive |
| vscore_compound | VADER composite score for the tweet full-text, –1 to 1 inclusive |
| swears | flag if tweet contains a swear word |
| engaged | flag if tweet was either favourited or retweeted |
| total_engagement | engagement score, i.e. combined count of retweets and favourites |
| hashtags | flag if tweet contains a hashtag |
| questions | flag if tweet contains a question (full text includes a question mark) |
| media | flag if tweet contains image |
| fav_quant | quantile the tweet falls in based on favourite count, if applicable |
| g_index | the grief index value for the month that the tweet was made |
Table 2
Description of data origin, either direct from Twitter export or result of analysis.
| PROVENANCE | COLUMNS |
|---|---|
| Twitter Export | favorite_count, retweet_count, created_at, full_text, entities.hashtags, entities.symbols, entities.user_mentions, entities.urls, possibly_sensitive, entities.media |
| Derived | full_text_norm, vscore_pos, vscore_neg, vscore_neu, vscore_compound, swears, engaged, total_engagement, hashtags, questions, media, fav_quant, g_index |
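
Several of the derived columns follow directly from the exported fields as described in Table 1. The sketch below is one plausible way to compute them for a single tweet record, not the procedure actually used for the dataset. The function name `derive_flags` and the nested-dictionary layout of the record are assumptions made for illustration, and the `media` flag here treats any media item as an image, which may be broader than the original definition.

```python
from typing import Any


def derive_flags(tweet: dict[str, Any]) -> dict[str, Any]:
    """Compute the simple derived columns for one tweet record.

    Hypothetical helper: assumes the record is a dict shaped like the
    Twitter export, with an optional nested "entities" dict.
    """
    entities = tweet.get("entities", {})
    favorites = int(tweet.get("favorite_count", 0))
    retweets = int(tweet.get("retweet_count", 0))
    total = favorites + retweets
    return {
        # engaged: favourited or retweeted at least once
        "engaged": total > 0,
        # total_engagement: combined count of retweets and favourites
        "total_engagement": total,
        # hashtags: the hashtag list is present and non-empty
        "hashtags": bool(entities.get("hashtags")),
        # questions: the full text includes a question mark
        "questions": "?" in tweet.get("full_text", ""),
        # media: assumption, any attached media item counts
        "media": bool(entities.get("media")),
    }


example = {
    "favorite_count": 3,
    "retweet_count": 1,
    "full_text": "Is anyone else reading this? #grief",
    "entities": {"hashtags": [{"text": "grief"}], "media": []},
}
flags = derive_flags(example)
```

For the example record, `flags` marks the tweet as engaged with a total engagement of 4, and sets the hashtag and question flags but not the media flag. The remaining derived columns (the VADER scores, `swears`, `fav_quant`, and `g_index`) depend on external lexicons or dataset-wide statistics and are not covered by this sketch.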

Figure 1
Engagement score box plots for tweets with different characteristics.

Figure 2
Engagement score distribution of all tweets.

Figure 3
VADER sentiment score breakdown of all tweets in the archive.

Figure 4
Word cloud of all the tweets in the archive.
