A Comparison of Topic Modeling Approaches Using Networked Discussion Forum Posts From the City-data.com Corpus

By: Ryan M. Omizo  
Open Access | Feb 2024

Abstract

The City-Data.com Corpus provides over 15,000 discussion forum posts scraped from city-data.com, a website that hosts information about cities across the United States. Like the 20 Newsgroups dataset, the City-Data.com Corpus is weakly labeled by forum topics and thread titles. It can be used to trial natural language processing techniques or to stage lessons in digital textual analysis for digital humanities pedagogy.

DOI: https://doi.org/10.5334/johd.182 | Journal eISSN: 2059-481X
Language: English
Submitted on: Nov 10, 2023
Accepted on: Jan 10, 2024
Published on: Feb 7, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Ryan M. Omizo, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.