Performance evaluation of seven multi-label classification methods on real-world patent and publication datasets

By: Shuo Xu, Yuefu Zhang, Xin An and Sainan Pi
Open Access | May 2024

References

  1. Blei, D. M., Ng, A. Y. & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.
  2. Boutell, M. R., Luo, J. B., Shen, X. P. & Brown, C. M. (2004). Learning multi-label scene classification. Pattern Recognition, 37(9), 1757-1771. https://doi.org/10.1016/j.patcog.2004.03.009
  3. Chen, L., Xu, S., Zhu, L. J., Zhang, J., Lei, X. P. & Yang, G. C. (2020). A deep learning based method for extracting semantic information from patent documents. Scientometrics, 125(1), 289-312.
  4. Chen, L., Xu, S., Zhu, L. J., Zhang, J., Yang, G. C., & Xu, H. Y. (2022). A deep learning based method benefiting from characteristics of patents for semantic relation classification. Journal of Informetrics, 16(3), 101312.
  5. Chen, Q. Y., Allot, A., Leaman, R., Islamaj, R., Du, J. C., Fang, L., … & Lu, Z. Y. (2022). Multilabel classification for biomedical literature: an overview of the BioCreative VII LitCovid track for COVID-19 literature topic annotation. Database, 2022, baac069.
  6. Clare, A. & King, R. D. (2001). Knowledge discovery in multi-label phenotype data. In: Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery (pp. 42-53). Springer, Berlin, Heidelberg.
  7. Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K. & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391-407.
  8. Dekel, O. & Shamir, O. (2010). Multiclass-multilabel classification with more classes than examples. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (pp. 137-144).
  9. Du, J. C., Chen, Q. Y., Peng, Y. F., Xiang, Y., Tao, C., & Lu, Z. Y. (2019). ML-Net: multi-label classification of biomedical texts with deep neural networks. Journal of the American Medical Informatics Association, 26(11), 1279-1285.
  10. Elisseeff, A. & Weston, J. (2001). A kernel method for multi-labelled classification. In: Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic (pp. 681-687).
  11. Freitas Rocha, V., Varejão, F. M., & Segatto, M. E. V. (2022). Ensemble of classifier chains and decision templates for multi-label classification. Knowledge and Information Systems, 1-21.
  12. Fürnkranz, J. & Hüllermeier, E. (2003). Pairwise preference learning and ranking. In: European Conference on Machine Learning (pp. 145-156). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39857-8_15
  13. Fürnkranz, J., Hüllermeier, E., Loza Mencía, E. & Brinker, K. (2008). Multilabel classification via calibrated label ranking. Machine Learning, 73(2), 133-153.
  14. Ghamrawi, N. & McCallum, A. (2005). Collective multi-label classification. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management (pp. 195-200).
  15. Godbole, S. & Sarawagi, S. (2004). Discriminative methods for multi-labeled classification. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 22-30). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_5
  16. Haghighian Roudsari, A., Afshar, J., Lee, W., & Lee, S. (2022). PatentNet: multi-label classification of patent documents using deep learning based language understanding. Scientometrics, 1-25.
  17. Katakis, I., Tsoumakas, G. & Vlahavas, I. (2008). Multilabel text classification for automated tag suggestion. In: Proceedings of the ECML/PKDD 2008 Discovery Challenge (p. 5).
  18. Kim, Y. (2014). Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (pp. 1746-1751).
  19. Lai, S. W., Xu, L. H., Liu, K. & Zhao, J. (2015). Recurrent convolutional neural networks for text classification. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence (pp. 2267-2273).
  20. Lewis, D. D., Yang, Y. M., Russell-Rose, T. & Li, F. (2004). RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5(Apr), 361-397.
  21. Li, T. & Ogihara, M. (2003). Detecting emotion in music. In: Proceedings of the 4th International Conference on Music Information Retrieval.
  22. Liu, L. Q., Mu, F. N., Li, P. Y., Mu, X., Tang, J., Ai, X. S., … & Zhou, X. (2019). NeuralClassifier: an open-source neural hierarchical multi-label text classification toolkit. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 87-92). https://doi.org/10.18653/v1/P19-3015
  23. Liu, P. F., Qiu, X. P. & Huang, X. J. (2016). Recurrent neural network for text classification with multi-task learning. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence (pp. 2873-2879). https://doi.org/10.48550/arXiv.1605.05101
  24. Liu, T. Y., Yang, Y. M., Wan, H., Zeng, H. J., Chen, Z. & Ma, W. Y. (2005). Support vector machines classification with a very large-scale taxonomy. ACM SIGKDD Explorations Newsletter, 7(1), 36-43.
  25. Madjarov, G., Kocev, D., Gjorgjevikj, D. & Džeroski, S. (2012). An extensive experimental comparison of methods for multi-label learning. Pattern Recognition, 45(9), 3084-3104. https://doi.org/10.1016/j.patcog.2012.03.004
  26. Pang, B. & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1-135. http://dx.doi.org/10.1561/1500000011
  27. Pestian, J., Brew, C., Matykiewicz, P., Hovermale, D. J., Johnson, N., Cohen, K. B. & Duch, W. (2007). A shared task involving multi-label classification of clinical free text. In: Biological, Translational, and Clinical Language Processing (pp. 97-104).
  28. Read, J., Martino, L., Olmos, P. M. & Luengo, D. (2015). Scalable multi-output label prediction: From classifier chains to classifier trellises. Pattern Recognition, 48(6), 2096-2109. https://doi.org/10.1016/j.patcog.2015.01.004
  29. Read, J., Pfahringer, B. & Holmes, G. (2008). Multi-label classification using ensembles of pruned sets. In: Proceedings of the 8th IEEE International Conference on Data Mining (pp. 995-1000). https://doi.org/10.1109/ICDM.2008.74
  30. Read, J., Pfahringer, B., Holmes, G. & Frank, E. (2011). Classifier chains for multi-label classification. Machine Learning, 85(3), 333-359.
  31. Read, J., Pfahringer, B., Holmes, G., & Frank, E. (2011). Classifier chains for multi-label classification. Machine Learning, 85(3), 333-359.
  32. Roudsari, A. H., Afshar, J., Lee, W. & Lee, S. (2022). PatentNet: multi-label classification of patent documents using deep learning based language understanding. Scientometrics, 127(1), 207-231. https://doi.org/10.1007/s11192-021-04179-4
  33. Rubin, T. N., Chambers, A., Smyth, P. & Steyvers, M. (2012). Statistical topic models for multilabel document classification. Machine Learning, 88(1), 157-208. https://doi.org/10.1007/s10994-011-5272-5
  34. Schapire, R. E. (1999). A brief introduction to boosting. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (pp. 1401-1406).
  35. Sechidis, K., Tsoumakas, G. & Vlahavas, I. (2011). On the stratification of multi-label data. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (pp. 145-158).
  36. Szymański, P. & Kajdanowicz, T. (2017). A scikit-based Python environment for performing multilabel classification. https://doi.org/10.48550/arXiv.1702.01460
  37. Szymański, P., Kajdanowicz, T. & Kersting, K. (2016). How is a data-driven approach better than random choice in label space division for multi-label classification? Entropy, 18(8), 282. https://doi.org/10.3390/e18080282
  38. Trohidis, K., Tsoumakas, G., Kalliris, G. & Vlahavas, I. (2011). Multi-label classification of music by emotion. EURASIP Journal on Audio, Speech, and Music Processing, 2011(1), 1-9. https://doi.org/10.1186/1687-4722-2011-426793
  39. Tsoumakas, G. & Vlahavas, I. (2007). Random k-labelsets: An ensemble method for multilabel classification. In: Proceedings of the 18th European Conference on Machine Learning (pp. 406-417). https://doi.org/10.1007/978-3-540-74958-5_38
  40. Tsoumakas, G., Katakis, I. & Vlahavas, I. (2010). Random k-labelsets for multilabel classification. IEEE Transactions on Knowledge and Data Engineering, 23(7), 1079-1089. https://doi.org/10.1109/TKDE.2010.164
  41. Ueda, N. & Saito, K. (2002). Parametric mixture models for multi-labeled text. In: Proceedings of the 15th International Conference on Neural Information Processing Systems (pp. 737-744).
  42. Xu, S. & An, X. (2019). ML2S-SVM: multi-label least-squares support vector machine classifiers. The Electronic Library, 37(6), 1040-1058. https://doi.org/10.1108/EL-09-2019-0207
  43. Xu, S. (2018). Bayesian naïve Bayes classifiers to text classification. Journal of Information Science, 44(1), 48-59. https://doi.org/10.1177/0165551516677946
  44. Yang, Y. M., Zhang, J. & Kisiel, B. (2003). A scalability analysis of classifiers in text categorization. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 96-103).
  45. Yu, Z. L., Wang, Q., Fan, Y., Dai, H. J. & Qiu, M. K. (2015). An improved classifier chain algorithm for multi-label classification of big data analysis. In: Proceedings of the IEEE 12th International Conference on Embedded Software and Systems (pp. 1298-1301). https://doi.org/10.1109/HPCC-CSS-ICESS.2015.240
  46. Zhang, M. L. & Zhou, Z. H. (2006). Multilabel neural networks with applications to functional genomics and text categorization. IEEE Transactions on Knowledge and Data Engineering, 18(10), 1338-1351. https://doi.org/10.1109/TKDE.2006.162
  47. Zhang, M. L. & Zhou, Z. H. (2007). ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognition, 40(7), 2038-2048. https://doi.org/10.1016/j.patcog.2006.12.019
DOI: https://doi.org/10.2478/jdis-2024-0014 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Page range: 81 - 103
Submitted on: Nov 5, 2023
Accepted on: Feb 26, 2024
Published on: May 27, 2024
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2024 Shuo Xu, Yuefu Zhang, Xin An, Sainan Pi, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.