Hierarchical Text Classification: Fine-tuned GPT-2 vs BERT-BiLSTM

Open Access | Mar 2025

References

  1. J. Eisenstein, Introduction to Natural Language Processing. The MIT Press, 2019.
  2. K. Kowsari, K. Jafari Meimandi, M. Heidarysafa, S. Mendu, L. Barnes, and D. Brown, “Text classification algorithms: A survey,” Information, vol. 10, no. 4, Apr. 2019, Art. no. 150. https://doi.org/10.3390/info10040150
  3. A. Palanivinayagam, C. Z. El-Bayeh, and R. Damaševičius, “Twenty years of machine-learning-based text classification: A systematic review,” Algorithms, vol. 16, no. 5, Apr. 2023, Art. no. 236. https://doi.org/10.3390/a16050236
  4. J. M. Patel, “Deep learning: Concepts, architectures, workflow, applications and future directions,” International Journal for Multidisciplinary Research, vol. 5, no. 6, pp. 1–7, Nov.–Dec. 2023. https://doi.org/10.36948/ijfmr.2023.v05i06.11497
  5. D. H. Hagos, R. Battle, and D. B. Rawat, “Recent advances in generative AI and large language models: Current status, challenges, and perspectives,” arXiv:2407.14962, Aug. 2024. https://doi.org/10.48550/arXiv.2407.14962
  6. A. Zangari, M. Marcuzzo, M. Schiavinato, M. Rizzo, A. Gasparetto, and A. Albarelli, “Hierarchical text classification: A review of current research,” Electronics, vol. 13, no. 7, Mar. 2024, Art. no. 1199. https://doi.org/10.3390/electronics13071199
  7. A. Gasparetto, M. Marcuzzo, A. Zangari, and A. Albarelli, “A survey on text classification algorithms: From text to predictions,” Information, vol. 13, no. 2, Feb. 2022, Art. no. 83. https://doi.org/10.3390/info13020083
  8. Q. Li, H. Peng, J. Li, C. Xia, R. Yang, L. Sun, P. S. Yu, and L. He, “A survey on text classification: From traditional to deep learning,” ACM Trans. Intell. Syst. Technol., vol. 13, no. 2, Apr. 2022, Art. no. 31. https://doi.org/10.1145/3495162
  9. S. Minaee, N. Kalchbrenner, E. Cambria, N. Nikzad, M. Chenaghlu, and J. Gao, “Deep learning-based text classification: A comprehensive review,” ACM Comput. Surv., vol. 54, no. 3, Apr. 2021, Art. no. 62. https://doi.org/10.1145/3439726
  10. A. Gasparetto, A. Zangari, M. Marcuzzo, and A. Albarelli, “A survey on text classification: Practical perspectives on the Italian language,” PLoS ONE, vol. 17, no. 7, Jul. 2022, Art. no. e0270904. https://doi.org/10.1371/journal.pone.0270904
  11. J. Fields, K. Chovanec, and P. Madiraju, “A survey of text classification with transformers: How wide? How large? How long? How accurate? How expensive? How safe?,” IEEE Access, vol. 12, pp. 6518–6531, Jan. 2024. https://doi.org/10.1109/ACCESS.2024.3349952
  12. S. Bhawsar, S. Dubey, S. Kushwaha, and S. Sharma, “Text classification using deep learning: A survey,” in Proceedings of International Conference on Computational Intelligence, Singapore, 2023, pp. 205–216. https://doi.org/10.1007/978-981-19-2126-1_16
  13. A. Zangari, M. Marcuzzo, M. Rizzo, L. Giudice, A. Albarelli, and A. Gasparetto, “Hierarchical text classification and its foundations: A review of current research,” Electronics, vol. 13, no. 7, Mar. 2024, Art. no. 1199. https://doi.org/10.3390/electronics13071199
  14. Z. Yang and G. Liu, “Hierarchical sequence-to-sequence model for multi-label text classification,” IEEE Access, vol. 7, pp. 153012–153020, Oct. 2019. https://doi.org/10.1109/ACCESS.2019.2948855
  15. W. Zhao, H. Gao, S. Chen, and N. Wang, “Generative multi-task learning for text classification,” IEEE Access, vol. 8, pp. 86380–86387, May 2020. https://doi.org/10.1109/ACCESS.2020.2991337
  16. K. Rivas Rojas, G. Bustamante, A. Oncevay, and M. A. Sobrevilla Cabezudo, “Efficient strategies for hierarchical text classification: External knowledge and auxiliary tasks,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Jul. 2020, pp. 2252–2257. https://doi.org/10.18653/v1/2020.acl-main.205
  17. J. Risch, S. Garda, and R. Krestel, “Hierarchical document classification as a sequence generation task,” in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, China, Aug. 2020, pp. 147–155. https://doi.org/10.1145/3383583.3398538
  18. J. Yan, P. Li, H. Chen, J. Zheng, and Q. Ma, “Does the order matter? A random generative way to learn label hierarchy for hierarchical text classification,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 276–285, Nov. 2024. https://doi.org/10.1109/TASLP.2023.3329374
  19. J. Kwon, H. Kamigaito, Y.-I. Song, and M. Okumura, “Hierarchical label generation for text classification,” in Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, Dubrovnik, Croatia, May 2023, pp. 625–632. https://doi.org/10.18653/v1/2023.findings-eacl.46
  20. F. Torba, C. Gravier, C. Laclau, A. Kammoun, and J. Subercaze, “A study on hierarchical text classification as a Seq2seq task,” in Advances in Information Retrieval. ECIR 2024, Cham, Mar. 2024, pp. 287–296. https://doi.org/10.1007/978-3-031-56063-7_20
  21. J. Zhang, Y. Li, F. Shen, Y. He, H. Tan, and Y. He, “Hierarchical text classification with multi-label contrastive learning and KNN,” Neurocomputing, vol. 577, Apr. 2024, Art. no. 127323. https://doi.org/10.1016/j.neucom.2024.127323
  22. J. Zhou, C. Ma, D. Long, G. Xu, N. Ding, H. Zhang, P. Xie, and G. Liu, “Hierarchy-aware global model for hierarchical text classification,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Jul. 2020, pp. 1106–1117. https://doi.org/10.18653/v1/2020.acl-main.104
  23. H. Liu, X. Huang, and X. Liu, “Improve label embedding quality through global sensitive GAT for hierarchical text classification,” Expert Systems with Applications, vol. 238, Mar. 2024, Art. no. 122267. https://doi.org/10.1016/j.eswa.2023.122267
  24. H. Chen, Q. Ma, Z. Lin, and J. Yan, “Hierarchy-aware label semantics matching network for hierarchical text classification,” in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL/IJCNLP 2021), Aug. 2021, pp. 4370–4379. https://doi.org/10.18653/v1/2021.acl-long.337
  25. J. Zhang, Y. Li, F. Shen, C. Xia, H. Tan, and Y. He, “Hierarchy-aware and label balanced model for hierarchical text classification,” Knowledge-Based Systems, vol. 300, Sep. 2024, Art. no. 112153. https://doi.org/10.1016/j.knosys.2024.112153
  26. A. Pal, M. Selvakumar, and M. Sankarasubbu, “MAGNET: Multi-label text classification using attention-based graph neural network,” in Proceedings of the 12th International Conference on Agents and Artificial Intelligence (ICAART 2020), Valletta, Malta, 2020, pp. 494–505. https://doi.org/10.5220/0008940304940505
  27. R. Zhao, X. Wei, C. Ding, and Y. Chen, “Hierarchical multi-label text classification: Self-adaption semantic awareness network integrating text topic and label level information,” in Proceedings of the Knowledge Science, Engineering and Management, Hangzhou, China, Aug. 2021, pp. 406–418. https://doi.org/10.1007/978-3-030-82147-0_33
  28. J. Chen, S. Zhao, F. Lu, F. Liu, and Y. Zhang, “Research on patent classification based on hierarchical label semantics,” in 2022 3rd International Conference on Education, Knowledge and Information Management (ICEKIM), Harbin, China, Aug. 2022, pp. 1025–1032. https://doi.org/10.1109/ICEKIM55072.2022.00223
  29. Z. Yao, H. Chai, J. Cui, S. Tang, and Q. Liao, “HITSZQ at SemEval-2023 Task 10: Category-aware sexism detection model with self-training strategy,” in Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, Canada, Jul. 2023, pp. 934–940. https://doi.org/10.18653/v1/2023.semeval-1.129
  30. B. Ning, D. Zhao, X. Zhang, C. Wang, and S. Song, “UMP-MG: A unidirected message-passing multi-label generation model for hierarchical text classification,” Data Science and Engineering, vol. 8, pp. 112–123, Jun. 2023. https://doi.org/10.1007/s41019-023-00210-1
  31. J. Song, F. Wang, and Y. Yang, “Peer-label assisted hierarchical text classification,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, Canada, Jul. 2023, pp. 3747–3758. https://doi.org/10.18653/v1/2023.acl-long.207
  32. Z. Wang, P. Wang, L. Huang, X. Sun, and H. Wang, “Incorporating hierarchy into text encoder: a contrastive learning approach for hierarchical text classification,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, May 2022, pp. 7109–7119. https://doi.org/10.18653/v1/2022.acl-long.491
  33. Y. Liu, K. Zhang, Z. Huang, K. Wang, Y. Zhang, Q. Liu, and E. Chen, “Enhancing hierarchical text classification through knowledge graph integration,” in Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, Jul. 2023, pp. 5797–5810. https://doi.org/10.18653/v1/2023.findings-acl.358
  34. Z. Wang, P. Wang, T. Liu, Y. Cao, Z. Sui, and H. Wang, “HPT: Hierarchy-aware prompt tuning for hierarchical text classification,” in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, Dec. 2022, pp. 3740–3751. https://doi.org/10.18653/v1/2022.emnlp-main.246
  35. K. Ji, Y. Lian, J. Gao, and B. Wang, “Hierarchical verbalizer for few-shot hierarchical text classification,” in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, Jul. 2023, pp. 2918–2933. https://doi.org/10.18653/v1/2023.acl-long.164
  36. Y. Zhang, R. Yang, X. Xu, R. Li, J. Xiao, J. Shen, and J. Han, “TELEClass: Taxonomy enrichment and LLM-enhanced hierarchical text classification with minimal supervision,” arXiv:2403.00165, 2024. https://arxiv.org/pdf/2403.00165
  37. L. Chen, H. Chou, and X. Zhu, “Developing prefix-tuning models for hierarchical text classification,” in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track, Abu Dhabi, UAE, Dec. 2022, pp. 390–397. https://doi.org/10.18653/v1/2022.emnlp-industry.39
  38. A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, “Language models are unsupervised multitask learners,” OpenAI blog, vol. 1, 2019, Art. no. 9.
  39. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017. https://arxiv.org/pdf/1706.03762
  40. J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” in The 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1 (Long and Short Papers), Minneapolis, Minnesota, 2019, pp. 4171–4186.
  41. S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, Nov. 1997. https://doi.org/10.1162/neco.1997.9.8.1735
  42. M. Schuster and K. K. Paliwal, “Bidirectional recurrent neural networks,” IEEE Transactions on Signal Processing, vol. 45, no. 11, pp. 2673–2681, Nov. 1997. https://doi.org/10.1109/78.650093
  43. H. Bichri, A. Chergui, and M. Hain, “Investigating the impact of train/test split ratio on the performance of pre-trained models with custom datasets,” International Journal of Advanced Computer Science & Applications, vol. 15, no. 2, 2024. https://doi.org/10.14569/IJACSA.2024.0150235
  44. B. Vrigazova, “The proportion for splitting data into training and test set for the bootstrap in classification problems,” Business Systems Research Journal, vol. 12, no. 1, pp. 228–242, May 2021. https://doi.org/10.2478/bsrj-2021-0015
  45. D. M. W. Powers, “Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation,” International Journal of Machine Learning Technology, vol. 2, no. 1, pp. 37–63, Jan. 2011. https://doi.org/10.9735/2229-3981
  46. P. Christen, D. J. Hand, and N. Kirielle, “A review of the F-measure: Its history, properties, criticism, and alternatives,” ACM Comput. Surv., vol. 56, no. 3, Oct. 2023, Art. no. 73. https://doi.org/10.1145/3606367
  47. A. Kosmopoulos, I. Partalas, E. Gaussier, G. Paliouras, and I. Androutsopoulos, “Evaluation measures for hierarchical classification: a unified view and novel approaches,” Data Mining and Knowledge Discovery, vol. 29, pp. 820–865, May 2015. https://doi.org/10.1007/s10618-014-0382-x
DOI: https://doi.org/10.2478/acss-2025-0005 | Journal eISSN: 2255-8691 | Journal ISSN: 2255-8683
Language: English
Page range: 40–46
Submitted on: Nov 7, 2024
Accepted on: May 26, 2025
Published on: Mar 15, 2025
Published by: Riga Technical University
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Djelloul Bouchiha, Abdelghani Bouziane, Noureddine Doumi, Benamar Hamzaoui, Sofiane Boukli-Hacene, published by Riga Technical University
This work is licensed under the Creative Commons Attribution 4.0 License.