Hierarchical Text Classification: Fine-tuned GPT-2 vs BERT-BiLSTM

Bouchiha, Djelloul; Bouziane, Abdelghani; Doumi, Noureddine; Hamzaoui, Benamar; Boukli-Hacene, Sofiane

Applied Computer Systems

Volume 30 (2025): Issue 1 (January 2025)

By: Djelloul Bouchiha, Abdelghani Bouziane, Noureddine Doumi, Benamar Hamzaoui and Sofiane Boukli-Hacene

Open Access | Mar 2025


DOI: https://doi.org/10.2478/acss-2025-0005 | Journal eISSN: 2255-8691 | Journal ISSN: 2255-8683

Language: English

Page range: 40–46

Submitted on: Nov 7, 2024

Accepted on: May 26, 2025

Published on: Mar 15, 2025

Published by: Riga Technical University

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: BERT, BiLSTM, fine-tuned GPT-2, generative pre-trained transformer (GPT), hierarchical text classification (HTC), large language model (LLM)

Related subjects: Computer sciences, Artificial intelligence, Information technology, Project management, Software development

© 2025 Djelloul Bouchiha, Abdelghani Bouziane, Noureddine Doumi, Benamar Hamzaoui, Sofiane Boukli-Hacene, published by Riga Technical University
This work is licensed under the Creative Commons Attribution 4.0 License.
