Automatic Detection of Four-Panel Cartoon in Large-Scale Korean Digitized Newspapers using Deep Learning

Open Access
|Jun 2024

1 Context and motivation

In the era of digital transformation, rapid advancements in science and technology influence various spheres of human activity, raising questions about their instrumental values for humanity. As technology advances, digital humanities researchers play an important role in this era across various intersections, including the domains of humanities and Artificial Intelligence (A.I.). This is driven by the public’s desire to harness A.I. to fulfill aspirations and address needs. In this context, digital humanities experts, with an understanding of both humanities and A.I., are vital. Their competence is important for curating data, particularly in humanities research, where not all data is suitable for machine learning aimed at humanistic goals in the real world. When data synergizes with algorithms and computational resources, it can amplify its potential. Understanding the processes of data collection, algorithms, curation, analysis, and distribution of their datasets ensures the data’s continued relevance and value for humanities purposes. Furthermore, sharing their datasets in Internet databases fosters global interdisciplinary research from diverse backgrounds. In this research paper, we provide a comprehensive account of our data strategy: collecting, labeling, training, employing You Only Look Once (YOLO) algorithm for data mining from big data, and disseminating the dataset we discovered from the Chosun Ilbo newspaper as an Excel file. Also, we demonstrate how open-access digital platforms can foster global research by developing scripts for automatic object detection, with the potential for our computational methodology to be expanded in future research to discovering other types of image sources in big data.

The Chosun Ilbo commenced publication in 1920 and ceased operations on August 10, 1940, under pressure from the Governor-General of Chosun (Kim, 2005). Its digitized archive, a big data newspaper collection, offers valuable insights for the analysis of culture and history. A key element in these archives is the Four-Panel Cartoon (FPC), a common feature in historical Korean newspapers and magazines (Lee, 2005). These FPCs, particularly significant during the Japanese colonial era (1910–1945), are more than just cartoons; they are satirical commentaries, as seen in publications like the Chosun Ilbo (ChosunIlboNewsLibrary, 2024). FPCs, or “Naecutmanwha” in Korean, offer a window into the culture and politics of the era (Park, 2010). With their concise yet effective narrative development and sequential structure, they were serialized in various newspapers for decades, capturing significant public attention. These newspaper FPCs hold not only artistic value in the realm of art history but also historical value, as their serialization gained momentum during the Japanese colonial period in Korea. An example is the “Meongteongguri” series, which goes beyond mere entertainment to reflect the complexities of life in colonial-era Seoul (Chung, 2016). This period saw restricted expression, making satire an important form of commentary (Sa, 2009). However, identifying FPCs in vast digital archives spanning decades is challenging because of the sheer volume of digital files and the unstructured nature of the image objects in the archives. Above all, FPCs appear in various sizes and shapes and are scattered irregularly across pages. An efficient object detection algorithm is therefore needed. To address this issue, our research employs the YOLOv5 deep learning model, specifically adapted for FPC detection. This model, known as YOLOv5_FPC, identified 1040 FPC objects in 1035 of the 47,777 image files from the Chosun Ilbo. The authors created an Excel file containing the URLs of the YOLOv5_FPC-detected FPC images, along with publication dates, which is now available on the JOHD Dataverse (Lee et al., 2024a). Any researcher can inspect the FPCs, including ones not discovered by previous researchers, by clicking on the corresponding URLs in the Excel file. The data we found encourages novel discourse among humanities scholars interested in Korean culture and history from 1920 to 1940, and sheds light on how society changed as seen through FPCs. Additionally, we developed scripts for automatic FPC detection that run our YOLOv5_FPC weights on the Google Colab platform. This tool simplifies FPC detection for the public, enabling them to detect FPCs in files from their local computers. Our interdisciplinary methodology advances culture studies, history studies, digital humanities, and deep learning, enriching academic discourse and achieving humanities objectives.

2 Dataset description

Repository location https://doi.org/10.7910/DVN/DFVZWE

Repository name Journal of Open Humanities Data Dataverse

Object name “Metadata for YOLOv5_FPC-Detected Images”

Format names and versions Excel and CSV files (each file contains metadata for the YOLOv5_FPC-detected four-panel cartoons, including the respective URLs for the 1035 image files which contain 1040 FPC objects).

Creation dates 2023-02-28–2024-04-15.

Dataset creators Seojoon Lee (training data collection, YOLOv5 fine-tuning, big data collection, data mining, database curation, data analysis, and data conceptualization).

Byungjun Kim (big data collection, database curation, data conceptualization, and data advice).

Bong Gwan Jun (project administration, data conceptualization, and data advice).

Language The Excel and CSV files: Written in English. Discovered FPC images: Mostly Korean, partially Japanese four-panel cartoons.

License CC BY-NC-SA 4.0 DEED (Attribution-NonCommercial-ShareAlike 4.0 International).

Publication date 2024-04-15

3 Method

3.1 Related works

To select an appropriate deep learning model for our research, we began by conducting a comprehensive literature review. Our objective was to identify a model that meets two essential criteria for our research goal: high accuracy in object detection and fast detection speed. Object detection has long been an essential subdomain of computer vision, contributing to a wide range of academic and commercial use cases. Deep learning techniques have proven highly effective for image categorization and object detection tasks within multimedia visuals, consistently yielding significant results on established benchmark datasets (Girshick et al., 2013; Krizhevsky et al., 2017; Redmon et al., 2016). Examples of such applications include face detection (Taigman et al., 2014), medical image detection (Litjens et al., 2017), vehicle detection (Chen et al., 2013), and logo detection (Hoi et al., 2015). Object detection has also been employed in digital humanities. Siddiqui (2024) employed the Google Vision API to identify objects within 105,000 scenes from 15 films directed by Alfred Hitchcock. This approach facilitates the study of films and digital culture by analyzing movie scenes and their representation in online culture. Fiala et al. (2022) developed an algorithm that automatically converts sixteenth- and seventeenth-century mensural sheet music into modern notation software formats, achieving a 99% recognition rate in translating musical material into MusicXML format using a four-step optical music recognition approach. Smirnov and Eguizabal (2018) proposed using deep learning and neural networks to perform automatic object detection on digital images of fine-art paintings. To achieve this, they employed a Convolutional Neural Network (CNN) based on the VGG-19 architecture, optimized using Bayesian techniques and transfer learning. Their approach achieved an automatic object detection accuracy of 64%. Hodel et al. (2021) developed advanced models for handwritten text recognition, enabling the accurate interpretation of historical documents across a variety of handwriting styles in the German Kurrent script spanning over a century. This work benefits the public by providing access to historical texts, thus unlocking new possibilities for research and preservation within digital humanities and archival fields.

In recent years, the adoption of the You Only Look Once (YOLO) object detection algorithm has seen rapid growth across diverse academic and professional domains (Cao et al., 2020). Rajeshwari et al. contend that YOLO provides exceptionally high levels of accuracy and speed, running nearly 1000 times faster than R-CNN and 100 times faster than Fast R-CNN. YOLO utilizes a fully convolutional neural network to generate an S × S grid across an image, along with bounding boxes and associated class probabilities for each grid cell (Rajeshwari et al., 2019). After comparing existing deep learning models in the literature review, we decided to employ the You Only Look Once Version 5 (YOLOv5) model (Jocher, 2020), given its capacity for multi-class object detection with high accuracy and fast detection speed, the two fundamental requirements of our research.

3.2 Methodological approach

In this methodology section, we outline detailed procedures spanning phases 1 through 9 as presented in Table 1. These phases include training data collection, labeling, fine-tuning, evaluating the YOLOv5_FPC model, big data collection, data mining of the FPCs, database curation, and developing the YOLOv5_FPC-Detector scripts utilizing the Google Colab platform for wider applications.

Table 1

Research phases of this paper.

| Number | Research phase |
| --- | --- |
| 1 | Training data collection of Four-Panel Cartoons (FPCs). |
| 2 | Labelling process of the training dataset. |
| 3 | Fine-tuning of the YOLOv5 Model. |
| 4 | YOLOv5_FPC model evaluation: F1-score. |
| 5 | Image file collection from the Chosun Ilbo News Library (1920–1940), totaling 47,777 JPG files. |
| 6 | Data mining: Deployment of the YOLOv5_FPC model to the 47,777 JPG files from the Chosun Ilbo News Library, to detect FPC image objects. |
| 7 | Database curation: Uploading to the JOHD Dataverse (Lee et al., 2024a) the Excel and CSV files containing metadata and URLs for the 1035 YOLOv5_FPC-detected image files (1040 FPC objects), which include previously undiscovered FPCs. |
| 8 | Data analysis of the detected FPC objects. |
| 9 | Development of the YOLOv5_FPC-Detector script, leveraging the Google Colab platform for enhanced computational efficiency and wider application for the public. |

Detecting objects within images, commonly referred to as object detection, is an essential component of computer vision. We applied the You Only Look Once (YOLO) algorithm to identify FPCs in large-scale digital documents. Specifically, we utilized the You Only Look Once Version 5 (YOLOv5) model (Jocher, 2020), a neural network-driven regression approach that performs single-stage object detection (Xiao et al., 2020) with high accuracy and rapid detection. YOLO employs a fully Convolutional Neural Network to produce an S × S grid across an image, bounding boxes for each grid cell, and class probabilities associated with the bounding boxes (Rajeshwari et al., 2019). It detects and classifies objects simultaneously. Our research required a model that satisfies two critical criteria, high accuracy and fast speed, when detecting FPCs in large-scale digital documents, and our literature review indicated that YOLOv5 meets both. Moreover, YOLO is reported to be proficient at detecting small and occluded objects (Rajeshwari et al., 2019), which matters because scanned and digitized newspapers may contain small or occluded FPC objects.
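The grid idea can be illustrated with a minimal sketch. It assumes the original YOLO formulation, in which the cell containing a box center is the one responsible for predicting that box; YOLOv5 itself predicts at several scales, so this is an illustration of the S × S grid described above rather than the model's internals. The function name `responsible_cell` and the grid size of 7 are hypothetical choices for this example.

```python
from typing import Tuple


def responsible_cell(x_center: float, y_center: float, s: int) -> Tuple[int, int]:
    """Return (row, col) of the S x S grid cell that contains a box center.

    Coordinates are normalized to [0, 1] relative to the image size.
    """
    if not (0.0 <= x_center <= 1.0 and 0.0 <= y_center <= 1.0):
        raise ValueError("centers must be normalized to [0, 1]")
    # Clamp so that a center exactly on the right or bottom edge stays in the grid.
    col = min(int(x_center * s), s - 1)
    row = min(int(y_center * s), s - 1)
    return row, col


if __name__ == "__main__":
    # A tall cartoon centred near the right edge of a page, on a 7 x 7 grid.
    print(responsible_cell(0.85, 0.40, s=7))
```

In this example the center falls in the third row and sixth column of the grid (zero-based indices 2 and 5).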

As Figure 1 demonstrates, the initial YOLOv5 model could not detect the FPC image object in the testing dataset. We attributed this to the fact that the initial YOLOv5 model had not been trained on FPC image data; it was trained solely on the Microsoft COCO dataset, which includes only 80 object categories such as cars, bicycles, and wine glasses. To overcome this limitation, we followed the fine-tuning method from the Ultralytics GitHub repository (Jocher, 2020) and trained the YOLOv5 model on our own dataset to detect FPCs in digital documents. To prepare the training data, we created bounding boxes for each shape within the 161 pages of FPC images using the DarkLabel 2.4 software (Darkpgmr, 2021). These bounding boxes were then converted into the YOLOv5 format, along with the corresponding coordinates. We used the Roboflow software (Dwyer et al., 2022) to randomly divide the images into training, validation, and testing sets. The training set was used to optimize the YOLOv5 model for precise detection of FPCs, while the validation set was used for model development and hyperparameter tuning during training. The testing set was used to evaluate the model’s overall performance and to check for over-fitting. We divided the 161 images into 113, 24, and 24 files for the training, validation, and testing sets, respectively, and saved them in separate directories. We used a relatively small dataset because of the inherent difficulty of locating FPCs across the vast expanse of the Internet. Our aim was to investigate the feasibility of detecting FPC objects with a limited amount of training data. Additionally, we created a label directory containing, for each image, a text file with the coordinates of every bounding box, which was used for YOLOv5 training.
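The label conversion and the 113/24/24 split can be sketched as follows. This is an illustration under stated assumptions, not the tooling used in this study: the boxes are assumed to be given as pixel corner coordinates, the class ids (0 for 4 × 1, 1 for 2 × 2) are hypothetical, and the seeded shuffle merely stands in for the random split performed with Roboflow. The only fixed element is the YOLO label line itself, which stores the class followed by the box center, width, and height normalized by the image size.

```python
import random
from typing import List, Sequence, Tuple


def to_yolo_line(class_id: int, x_min: float, y_min: float,
                 x_max: float, y_max: float, img_w: int, img_h: int) -> str:
    """Convert a pixel-space corner box into one YOLO label line.

    Output layout: class x_center y_center width height, with the four
    geometry values normalized to [0, 1] by the image width and height.
    """
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def split_dataset(items: Sequence[str], n_train: int, n_val: int, n_test: int,
                  seed: int = 0) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle reproducibly and cut into train / validation / test lists."""
    if n_train + n_val + n_test != len(items):
        raise ValueError("split sizes must add up to the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


if __name__ == "__main__":
    # Hypothetical 4 x 1 cartoon on a 1000 x 2000 pixel page scan.
    print(to_yolo_line(0, 100, 200, 300, 1000, 1000, 2000))
    files = [f"fpc_{i:03d}.jpg" for i in range(161)]  # placeholder file names
    train, val, test = split_dataset(files, 113, 24, 24)
    print(len(train), len(val), len(test))
```

Each image then gets one text file in the label directory, with one such line per bounding box.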

Figure 1

The initial YOLOv5 Model could not detect an FPC.

Creating high-quality datasets is an important aspect of training and testing deep learning models, and is especially crucial when dealing with complex objects such as FPCs. In this paper, we introduce the “Four-panel Cartoon Image Dataset” (Lee et al., 2023b), a mix of grayscale and color images from 1920 to 2022, which we used for fine-tuning. The dataset includes two categories of images: a 4 × 1 matrix and a 2 × 2 matrix. A 4 × 1 matrix FPC is an illustration arranged in a rectangular configuration of four rows and a single column, whereas a 2 × 2 matrix FPC is an illustration with two rows and two columns. These configurations offer distinct layouts for visually presenting information, facilitating varied academic and artistic applications. We excluded 1 × 4 matrix FPC images because such FPCs are not commonly used in Korean cartoons, so including them would not provide a representative sample of prevalent FPC usage. Our dataset comprises 161 JPG image files, randomly divided into training (113 files), validation (24 files), and testing (24 files) sets using the Roboflow software (Dwyer et al., 2022). Table 2 shows the division of the dataset into training, validation, and testing sets by matrix type and era.

Table 2

Matrix and era of the “Four-panel Cartoon Image Dataset”.

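| Set | FPC matrix | Colonial era | Post-colonial era |
| --- | --- | --- | --- |
| Training | 4 × 1 | 31 | 37 |
| Training | 2 × 2 | 26 | 28 |
| Validation | 4 × 1 | 12 | 3 |
| Validation | 2 × 2 | 4 | 7 |
| Testing | 4 × 1 | 8 | 6 |
| Testing | 2 × 2 | 6 | 7 |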

3.3 Fine-Tuning for FPC Detection Optimization

We fine-tuned the initial YOLOv5 model by using the training dataset we collected to optimize the FPC object detection.

In Figure 2, the displayed scores of box_loss, object_loss (obj_loss), and class_loss (cls_loss) provide insights into the model’s performance during the training and validation processes. In Arie’s explanation, the YOLO loss function comprises three components: box_loss, obj_loss, and cls_loss. Box_loss represents the loss incurred in bounding box regression, calculated using Mean Squared Error. Obj_loss, the objectness loss, quantifies the confidence of object presence. Cls_loss accounts for the classification loss, computed using Cross Entropy. Arie also explains precision and recall. Precision assesses how many of the model’s positive predictions are correct, while recall evaluates how many of the actual positive instances the model detects; these metrics are in a trade-off relationship. The metric “mAP_0.5” refers to the mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.5, while “mAP_0.5:0.95” denotes the average mAP across IoU thresholds ranging from 0.5 to 0.95 (Arie, 2022). In Figure 2, several key points warrant attention. In the train/box_loss and val/box_loss graphs, as the number of epochs increases from 0 to 200 on the x-axis, the box loss on the y-axis tends to decrease. This decline signifies an improvement in accuracy when predicting bounding box coordinates. In the train/obj_loss and val/obj_loss graphs, the loss values likewise decrease as the number of epochs increases from 0 to 200, which indicates an improvement in the accuracy of object detection. In the train/cls_loss and val/cls_loss graphs, the classification loss on the y-axis decreases as the epoch count on the x-axis increases, demonstrating improved accuracy in classifying objects. In the metrics/precision graph, as the epoch count increases from 0 to 200, the precision score tends to increase as well.
In each of the ten graphs depicted in Figure 2, the blue lines labeled as “results” represent performance metrics and outcomes, while the orange dots labeled as “smooth” indicate a smoothed version of the blue curves, helping to reduce noise and fluctuation for easier interpretation.
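Because the mAP metrics depend on an IoU threshold, a short sketch of the IoU computation may help. It assumes boxes are given as `(x_min, y_min, x_max, y_max)` corner tuples; the function name `iou` and the example boxes are illustrative only and are not taken from the YOLOv5 code base.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 10.0, 10.0)
    ground_truth = (5.0, 0.0, 15.0, 10.0)
    score = iou(predicted, ground_truth)
    # A prediction counts as a match for mAP_0.5 only if IoU >= 0.5.
    print(round(score, 3), score >= 0.5)
```

Here the two boxes overlap by half of their width, giving an IoU of one third, so this prediction would not count as a correct detection at the 0.5 threshold.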

Figure 2

Model performance while fine-tuning.

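$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}}$$

$$\text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}$$

$$\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$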

The F1-score unifies precision and recall as the harmonic mean of these two evaluation measures, both of which are important for evaluating object detection models. Our fine-tuned model, depicted in Figure 3, achieved a high F1-score of 0.97 for all classes at a confidence threshold of 0.708, signifying accurate and balanced precision and recall across both the 4 × 1 and 2 × 2 FPC categories. The weights of this YOLOv5_FPC model are used for data mining on the big data newspaper archive in this research.
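As a worked illustration of the three formulas, the sketch below computes precision, recall, and F1-score from raw counts. The counts in the example (30 true positives, 2 false positives, 4 false negatives) are hypothetical and are not the confusion counts of our test set; the function name is likewise only illustrative.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1-score from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f1 = precision_recall_f1(tp=30, fp=2, fn=4)  # hypothetical counts
    print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

With these counts, precision is 30/32, recall is 30/34, and the F1-score lies between them at 60/66, which shows how the harmonic mean penalizes an imbalance between the two.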

Figure 3

F1-score of our YOLOv5_FPC model.

3.4 Data Mining: FPC Object Detection in the Chosun Ilbo News Library

The Excel file depicted in Figure 4 contains columns including “image”, “source_xml_file”, “publication_day”, “id”, “type”, and “body”, representing the metadata of the 47,777 JPG files from 1920 to 1940 in the Chosun Ilbo. Table 3 lists the definitions of the metadata columns in this Excel file. The “image” column, which contains the URLs of the scanned and digitized image files in the Chosun Ilbo newspaper database (ChosunIlboNewsLibrary, 2024), was used to collect the 47,777 JPG image files automatically (Figure 5). We then applied the YOLOv5_FPC model to these image files to identify FPCs. The original YOLOv5 model, trained on the 80 conventional object categories of the Microsoft COCO dataset, could not detect unfamiliar objects such as FPCs. After fine-tuning on the FPC training data gathered by the authors, however, the modified YOLOv5_FPC model successfully identified FPCs. Across the 47,777 image files, the model flagged a total of 4508 image files. The authors manually verified these detections and confirmed that 1035 of the files contained genuine FPCs, 1040 FPC objects in total. Notably, the YOLOv5_FPC model detected multi-class FPCs even when both 4 × 1 and 2 × 2 matrix structures coexisted within a single file. Alongside the correctly identified FPC objects, the 4508 detected image files also contained false positives, including the vertical Chosun Ilbo title, which is written in Chinese characters, and vertical advertisements. In deep learning object detection, a “false positive” is a detection in which the model reports an object of the target class where none is actually present. These findings highlight areas for improvement in future research. The FPC images are accessible through the Excel and CSV files we published on the JOHD Dataverse website (Lee et al., 2024a) (Figure 6). These files contain metadata including the URLs of the YOLOv5_FPC-detected images, their corresponding publication dates, and the file names. Researchers can access the FPCs on the official Chosun Ilbo website by simply clicking on the URLs they want to investigate. This enables a thorough examination of the 1040 FPC objects, whose frequencies are analyzed in Tables 4 and 5. The detected image files include FPCs not discovered by previous researchers (Figure 7).
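The automated collection step can be sketched as a small routine that reads the metadata and prepares a list of image URLs to fetch. This is a sketch under assumptions, not the script used in this study: it assumes the metadata has been exported to CSV with an “image” column as in Table 3, that several article rows may point to the same page image, and that the local file name can be taken from the URL path. The `example.org` URLs and the function name `plan_downloads` are placeholders. No network access is performed here; each planned entry could afterwards be fetched with a standard HTTP client.

```python
import csv
import io
import posixpath
from typing import List, Tuple
from urllib.parse import urlparse


def plan_downloads(csv_text: str, url_column: str = "image") -> List[Tuple[str, str]]:
    """Return unique (url, local_file_name) pairs from a metadata CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    plan: List[Tuple[str, str]] = []
    for row in reader:
        url = (row.get(url_column) or "").strip()
        if not url or url in seen:
            continue  # skip empty cells and page images already planned
        seen.add(url)
        file_name = posixpath.basename(urlparse(url).path)
        plan.append((url, file_name))
    return plan


# Placeholder rows; the real metadata has many more columns (see Table 3).
SAMPLE_CSV = """id,publication_date,image
a1,1924-10-13,https://example.org/scans/19241013_p3.jpg
a2,1924-10-13,https://example.org/scans/19241013_p3.jpg
a3,1924-10-14,https://example.org/scans/19241014_p3.jpg
"""

if __name__ == "__main__":
    for url, file_name in plan_downloads(SAMPLE_CSV):
        print(file_name, "<-", url)
```

Deduplicating by URL keeps each page image from being downloaded more than once when the metadata is organized per article rather than per page.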

Figure 4

Chosun Ilbo News Library newspaper metadata (1920–1940) (ChosunIlboNewsLibrary, 2024).

Figure 5

47,777 image files collected from the Chosun Ilbo News Library (1920–1940) (ChosunIlboNewsLibrary, 2024).

Figure 6

Our Dataset: “Metadata for the YOLOv5_FPC Detected Images” (Lee et al., 2024a) containing the URLs (YOLOv5_FPC-detected 1035 image files; 1040 FPC objects in total), and their publication dates sourced from the Chosun Ilbo News Library (1920–1940).

Figure 7

Previously undiscovered FPC image data from the Chosun Ilbo News Library digital archive (ChosunIlbo, 2024).

Table 3

Metadata definitions of Figure 4 Excel file columns.

| Column name | Definition |
| --- | --- |
| id | The unique identifier for each article of Chosun Ilbo. |
| page_no | The page number of the article. |
| title | The title of the newspaper article. |
| regdate | The registration date of the article. |
| type | The type of the article. |
| publication_day | The day of the week when the article was published. |
| section | The section of the newspaper where the article is placed. |
| publication_date | The date when the article was published (Year-Month-Day). |
| completeness | The completeness of the article (“Y” indicates Yes). |
| body | The main text of the article. |
| publication_no | The publication number. |
| node_id | Numeric identifier associated with the article. |
| source_image_file | The name of the image file. |
| @timestamp | The timestamp indicating when the newspaper textual data was collected to the Excel file (Figure 4). |
| isn | The International Standard Number (ISN) associated with the article. |
| source_xml_file | The XML file name. |
| page_section | The section of the newspaper (society, general section, advertisement, politics, culture). |
| url | The URL to the article. |
| image | The URLs containing the website links to the scanned and digitized image files of the Chosun Ilbo newspaper database (used for collecting 47,777 JPG image files in this research). |
| sub_title | The subtitle of the article. |
| authors | The author(s) who wrote the newspaper article. |

Table 4

Frequency analysis of the 1040 YOLOv5_FPC-detected FPCs discovered from the Chosun Ilbo News Library (1920–1940) (ChosunIlboNewsLibrary, 2024).

| Index | Name of FPC detected using the YOLOv5_FPC model (Chosun Ilbo News Library, 1920–1940) | Frequency (per FPC) |
| --- | --- | --- |
| 1 | Meongteongguri | 726 |
| 2 | Byeokchangho | 126 |
| 3 | Japanese Language-Written Cartoon | 104 |
| 4 | Baekgongsan | 13 |
| 5 | Dolbo and Mikki | 12 |
| 6 | Ttukdugiui Seollori | 11 |
| 7 | Chador’s Adventure | 9 |
| 8 | Arctic Exploration | 8 |
| 9 | Rubber Balloon | 7 |
| 10 | Football Player | 8 |
| 11 | Makdongi and Goose | 2 |
| 12 | Biography of a Fool | 2 |
| 13 | Paengkenggun’s Monkey Catching | 1 |
| 14 | Buffalo and Fish | 1 |
| 15 | Then Yes | 1 |
| 16 | Hobang Bridge | 1 |
| 17 | Ice Snack | 1 |
| 18 | The Evil of Alcohol | 1 |
| 19 | Put Your Hands Up | 1 |
| 20 | Better Radio | 1 |
| 21 | Tiger Den | 1 |
| 22 | Hide and Seek | 1 |
| 23 | Samyeong’s Caramel Cartoon | 1 |
| 24 | Love Trees | 1 |

Table 5

Frequency comparison of the “Meongteongguri” series.

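| Series of “Meongteongguri” FPC | Chung’s research findings of FPCs (Chung, 2016) | Our YOLOv5_FPC findings of FPCs | Differences (per FPC) |
| --- | --- | --- | --- |
| Reporter Life Part 1 | None | 35 | +35 |
| Modern Life | None | 4 | +4 |
| Social Work | 50 | 62 | +12 |
| Heonmulkyeoji | 48 | 55 | +7 |
| Ssutdeokdaegi | 18 | 19 | +1 |
| Student Life | 12 | 12 | 0 |
| Ssonawatso | 9 | 9 | 0 |
| Self-sufficiency | 87 | 86 | -1 |
| Round the World | 148 | 147 | -1 |
| Hunger Life | 50 | 19 | -31 |
| Dating Life | 181 | 178 | -3 |
| Family Life | 102 | 100 | -2 |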
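The differences in Table 5 can be re-derived with a short check. The counts below are copied from the table; treating a “None” entry as zero previously reported FPCs is an assumption made here for the arithmetic, consistent with the differences the table reports. The variable and function names are illustrative.

```python
from typing import List, Optional, Tuple

# (series, count reported by Chung (2016) or None, count found by YOLOv5_FPC)
TABLE_5: List[Tuple[str, Optional[int], int]] = [
    ("Reporter Life Part 1", None, 35),
    ("Modern Life", None, 4),
    ("Social Work", 50, 62),
    ("Heonmulkyeoji", 48, 55),
    ("Ssutdeokdaegi", 18, 19),
    ("Student Life", 12, 12),
    ("Ssonawatso", 9, 9),
    ("Self-sufficiency", 87, 86),
    ("Round the World", 148, 147),
    ("Hunger Life", 50, 19),
    ("Dating Life", 181, 178),
    ("Family Life", 102, 100),
]


def difference(previous: Optional[int], ours: int) -> int:
    """Difference per series, counting a missing earlier figure as zero."""
    return ours - (previous or 0)


total_ours = sum(ours for _, _, ours in TABLE_5)
total_previous = sum(prev or 0 for _, prev, _ in TABLE_5)
net_difference = sum(difference(prev, ours) for _, prev, ours in TABLE_5)

if __name__ == "__main__":
    for series, prev, ours in TABLE_5:
        print(f"{series}: {difference(prev, ours):+d}")
    print(total_ours, total_previous, net_difference)
```

The per-series counts found by YOLOv5_FPC sum to 726, which matches the “Meongteongguri” frequency in Table 4, and the net difference against the earlier figures is +21.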

Upon examining each newspaper image file, the authors were able to discern the date indicators located in the upper right corner of each published newspaper: “日” for day, “月” for month, and “年” for year. These date indicators are inscribed in Japanese Kanji, which originated from Chinese characters that in turn derived from archaic pictograms of ancient China (Jones and Aoki, 1988). They reflect the historical dynamics and linguistic interactions among the three East Asian countries, including Korea.

3.5 Development and Implementation of the YOLOv5_FPC-Detector using the Google Colab Platform

Furthermore, in this research, we developed the YOLOv5_FPC-Detector script on the Google Colab platform (Lee et al., 2023a), leveraging our optimized deep learning weights of the YOLOv5_FPC model; Google Colab is a globally used open-access cloud-based computational platform, available for a broader audience. This provides advantages in terms of both time and cost efficiency for the public who want to detect FPCs in their local computers. The entire script, depicted in Figures 8, 9, and 10, accompanied by detailed usage guidelines, is now accessible on the website (Lee et al., 2023a), and our GitHub repository (Lee et al., 2024b).

Figure 8

Automatic FPC detection using the weights of the YOLOv5_FPC model on Google Colab. This script imports the weights and downloads dependencies required for the automatic detection process.

Figure 9

Users can simply upload their files to detect FPCs on their local computers.

Figure 10

The detected FPCs are saved on their local computers.

The script in Figure 8 installs the libraries and dependencies required to run the YOLOv5_FPC model. It also downloads the YOLOv5_FPC weights, which have been uploaded to a Google folder, for use in automatic FPC detection. Figure 9 shows how users can upload image files in ZIP format to Google Colab for FPC detection. By selecting the “Select File” option, users can simply upload their ZIP file. Once the file is uploaded, the script automatically detects FPCs within the uploaded image files by employing the YOLOv5_FPC model. The script in Figure 10 aggregates the text files containing the names of the detected files and cross-references them with the folder containing all image files. Subsequently, only those image files identified by YOLOv5_FPC as containing FPCs are extracted, using the shutil.copy() function. The detected files are saved in this directory: “/content/Output_folder/detected_true_images.” To enhance user accessibility, the detected files are automatically archived into a compressed file named “FPC_model_detected.images.zip,” which is then saved to the user’s local machine. By combining the YOLOv5_FPC model with the Google Colab scripts, researchers can effectively detect FPCs across various historical Korean newspapers.
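The post-processing step described above can be sketched as follows. This is not the published Colab script: it is a minimal sketch that assumes the detector writes one non-empty text file per image in which something was detected, sharing the image's base name, and it uses temporary directories and a placeholder archive name instead of the Colab paths. The function names `collect_detected`, `archive`, and `run_demo` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Tuple

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_detected(labels_dir: Path, images_dir: Path, output_dir: Path) -> List[str]:
    """Copy only the images that have a non-empty detection text file."""
    output_dir.mkdir(parents=True, exist_ok=True)
    detected = {p.stem for p in labels_dir.glob("*.txt") if p.stat().st_size > 0}
    copied: List[str] = []
    for image in sorted(images_dir.iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES and image.stem in detected:
            shutil.copy(image, output_dir / image.name)
            copied.append(image.name)
    return copied


def archive(output_dir: Path, archive_base: Path) -> Path:
    """Zip the output folder; make_archive appends the .zip extension."""
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=str(output_dir)))


def run_demo() -> Tuple[List[str], str]:
    """Build a tiny throwaway folder tree and run both steps on it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        images, labels = root / "images", root / "labels"
        images.mkdir()
        labels.mkdir()
        for name in ("page_001.jpg", "page_002.jpg", "page_003.jpg"):
            (images / name).write_bytes(b"")  # empty stand-ins for page scans
        (labels / "page_002.txt").write_text("0 0.5 0.5 0.2 0.8\n")
        copied = collect_detected(labels, images, root / "detected_true_images")
        zip_path = archive(root / "detected_true_images", root / "FPC_detected_images")
        return copied, zip_path.name


if __name__ == "__main__":
    print(run_demo())
```

Writing the archive next to the output folder, rather than inside it, avoids the archive trying to include itself.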

4 Results and discussion

4.1 Challenges and Limitations

During our research, we encountered an issue with false positives. Occasionally, the model erroneously identified objects such as the vertically displayed Chosun Ilbo title, written in Chinese characters, and vertical advertisements as 4 × 1 FPCs. We believe this issue arises because these vertically arranged elements visually resemble 4 × 1 panel FPCs. Further research is needed to reduce the rate of these false positive detections. One potential solution is to collect more training data, specifically including 4 × 1 panel FPCs. This was challenging in our study because the vastness of the Internet made locating and collecting relevant training data difficult. Despite this obstacle, our efficient computational methodology uncovered historical FPC images in the big data digital archive that were, to the best of our knowledge, previously undiscovered. The resulting dataset is now available via the JOHD Dataverse (Lee et al., 2024a), facilitating further discourse.
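The scale of the false-positive issue can be estimated from the file counts reported in Section 3.4. The short calculation below is a rough, file-level reading of those numbers: it assumes that every flagged file outside the 1035 verified ones contained only false positives, and it says nothing about recall, since FPCs the model missed were not counted. The variable names are illustrative.

```python
TOTAL_FILES = 47_777         # image files collected from the archive
FLAGGED_FILES = 4_508        # files in which the model reported a detection
VERIFIED_FPC_FILES = 1_035   # flagged files confirmed by manual verification

# Files flagged by the model that did not survive manual verification.
false_positive_files = FLAGGED_FILES - VERIFIED_FPC_FILES

# Share of flagged files that really contained an FPC (file-level precision).
file_level_precision = VERIFIED_FPC_FILES / FLAGGED_FILES

# Share of the archive that still had to be inspected by hand.
review_fraction = FLAGGED_FILES / TOTAL_FILES

if __name__ == "__main__":
    print(false_positive_files)
    print(f"{file_level_precision:.3f}")
    print(f"{review_fraction:.3f}")
```

Under these assumptions, roughly one in four flagged files contained a genuine FPC, while manual review was needed for less than a tenth of the archive. The model therefore works as an effective pre-filter, although its precision on the archive is far lower than the test-set F1-score suggests.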

Our interdisciplinary research has advanced digital humanities through the application of deep learning for cultural and historical content analysis in the big data digital newspaper archive. We have elucidated how humanities researchers can enhance the achievement of humanistic goals by understanding data, algorithms, and the computational platform.

Key contributions are:

  1. Collection of training data for FPCs.

  2. Development of YOLOv5_FPC by fine-tuning the initial YOLOv5 model.

  3. Big data collection from the Chosun Ilbo News Library (1920–1940), totaling 47,777 JPG files.

  4. Data mining: Deployment of the YOLOv5_FPC model on the big data Chosun Ilbo News Library to detect FPC image objects.

  5. Database curation: Uploading the Excel and CSV files containing metadata regarding YOLOv5_FPC-detected FPC image files, including previously undiscovered ones, to foster further global discourse on FPCs.

  6. Data analysis of the detected FPC objects.

  7. Development of the YOLOv5_FPC-Detector script, utilizing the Google Colab platform, to enhance its reuse potential.

4.2 Applications

Future research should focus on the development of comprehensive databases containing FPCs from diverse Korean newspapers with extensive archival records using our YOLOv5_FPC model. By utilizing the YOLOv5_FPC model and the Google Colab scripts, researchers can optimally detect their FPCs from various historical Korean newspapers. Prominent among these are the Chosun Ilbo (ChosunIlboNewsLibrary, 2024), the Donga Ilbo (DongaIlbo, 2024), and the JoongAng Ilbo (JoongAngIlbo, 2024) which are well-known newspaper publication companies in the Republic of Korea and contain digital archives spanning decades. This approach will enable researchers to compare and analyze historical FPCs from various publication companies. Furthermore, expanding this research methodology to include the detection of 1 × 4 matrix FPCs — which are prevalent in Western countries but not covered in this study — could enable researchers in Western countries to deepen their understanding of societal implications by investigating FPCs in future research. Researchers would be able to identify which characteristics of FPCs were prevalent during specific eras and among particular audiences, as well as how society was portrayed in the FPCs and, conversely, how the FPCs influenced society. Additionally, the adoption of Linked Research on Open Data (LOD) and Relational Database (RDB) will be essential, as they will streamline data management and improve access and connectivity of information for scholars. Such a database will be a crucial resource in cultural and historical studies, offering new insights into humanities events and under-researched areas. Moreover, the strategic application of these databases in content services could extend their utility beyond their current status quo, making them valuable tools for broader engagements. 
Our image detection methodology could be expanded to effectively extract diverse image content, including but not limited to, advertisements, clothing, photos, and cultural and historical image sources from large-scale databases.

Funding Information

This work was supported by the Korea Foundation for the Advancement of Science and Creativity (KOFAC), the Ministry of Science & ICT (Project Number: D23030005; Funding ID: N01230055), and The Chosun Ilbo Media Institute (G01230588).

Competing Interests

The authors have no competing interests to declare.

Author Contributions

Seojoon Lee: Writing – original draft, Methodology, Data curation, Conceptualization, Software, Formal Analysis, Visualization, Writing – review & editing.

Byungjun Kim: Data curation, Validation, Conceptualization, Writing – review & editing.

Bong Gwan Jun: Project administration, Conceptualization, Writing – review & editing, Funding acquisition.

DOI: https://doi.org/10.5334/johd.205 | Journal eISSN: 2059-481X
Language: English
Submitted on: Mar 6, 2024
Accepted on: Apr 19, 2024
Published on: Jun 6, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Seojoon Lee, Byungjun Kim, Bong Gwan Jun, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.