Automatic Detection of Four-Panel Cartoon in Large-Scale Korean Digitized Newspapers using Deep Learning

Open Access | Jun 2024

Abstract

In cultural and historical studies, collecting image-based content from big data is a fundamental part of data analysis, yet the process is as intricate as extracting resources from vast terrains. Scholars increasingly value the “Four-panel Cartoon” (FPC) as an image content source in large-scale digitized newspapers in the Republic of Korea. Identifying FPCs in such archives is difficult, however, because the pages are stored as unstructured image data, which makes the task both time-intensive and costly. To address this issue, this paper presents a novel computational FPC detection mechanism: the YOLOv5_FPC model, developed by fine-tuning the You Only Look Once Version 5 (YOLOv5) deep learning model specifically for FPC image detection. We applied the YOLOv5_FPC model to the Chosun Ilbo News Library archive (1920–1940) for automatic FPC data mining across 47,777 JPG image files and identified 1,040 FPC objects within 1,035 files, including FPCs that earlier researchers had not identified. We describe our methodology in detail, covering the collection, labeling, training, detection, and distribution of the data we discovered in the newspaper archives. Our findings, now available as an open-access dataset in the Journal of Open Humanities Data (JOHD) Dataverse, invite discussion among humanities researchers focusing on the culture and history of Korea between 1920 and 1940.
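The abstract reports two different counts (FPC objects and files containing them), which implies that some pages hold more than one FPC. As a minimal sketch of how such a tally could be produced, the Python below assumes the detections were exported as YOLO-format text files, one per image, with each line reading `class x_center y_center width height` and an optional trailing confidence. The function name `summarize_detections`, the directory layout, the class id `0` for FPC, and the `min_conf` threshold are all hypothetical and are not taken from the paper.

```python
from pathlib import Path


def summarize_detections(label_dir, fpc_class_id=0, min_conf=0.0):
    """Return (total FPC boxes, number of files with at least one FPC box).

    Assumes one YOLO-format .txt file per image in label_dir; the class id
    and confidence threshold are illustrative assumptions.
    """
    total_objects = 0
    files_with_fpc = 0
    for label_file in sorted(Path(label_dir).glob("*.txt")):
        count = 0
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if len(parts) < 5:
                continue  # skip blank or malformed lines
            class_id = int(parts[0])
            # A sixth field, if present, is treated as the confidence score.
            conf = float(parts[5]) if len(parts) > 5 else 1.0
            if class_id == fpc_class_id and conf >= min_conf:
                count += 1
        if count:
            files_with_fpc += 1
            total_objects += count
    return total_objects, files_with_fpc
```

Calling `summarize_detections("path/to/labels", min_conf=0.5)` would return the two counts as a tuple; a page with two detected cartoons contributes two objects but only one file.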

DOI: https://doi.org/10.5334/johd.205 | Journal eISSN: 2059-481X
Language: English
Submitted on: Mar 6, 2024
Accepted on: Apr 19, 2024
Published on: Jun 6, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Seojoon Lee, Byungjun Kim, Bong Gwan Jun, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.