Performance evaluation of seven multi-label classification methods on real-world patent and publication datasets

Xu, Shuo; Zhang, Yuefu; An, Xin; Pi, Sainan

doi:10.2478/jdis-2024-0014

References

Blei, D. M., Ng, A. Y. & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.
Boutell, M. R., Luo, J. B., Shen, X. P. & Brown, C. M. (2004). Learning multi-label scene classification. Pattern Recognition, 37(9), 1757-1771. https://doi.org/10.1016/j.patcog.2004.03.009
Chen, L., Xu, S., Zhu, L. J., Zhang, J., Lei, X. P. & Yang, G. C. (2020). A deep learning based method for extracting semantic information from patent documents. Scientometrics, 125(1), 289-312.
Chen, L., Xu, S., Zhu, L. J., Zhang, J., Yang, G. C., & Xu, H. Y. (2022). A deep learning based method benefiting from characteristics of patents for semantic relation classification. Journal of Informetrics, 16(3), 101312.
Chen, Q. Y., Allot, A., Leaman, R., Islamaj, R., Du, J. C., Fang, L., … & Lu, Z. Y. (2022). Multilabel classification for biomedical literature: an overview of the BioCreative VII LitCovid track for COVID-19 literature topic annotation. Database, 2022, baac069.
Clare, A. & King, R. D. (2001). Knowledge discovery in multi-label phenotype data. In: Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery (pp. 42-53). Springer, Berlin, Heidelberg.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K. & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391-407.
Dekel, O. & Shamir, O. (2010). Multiclass-multilabel classification with more classes than examples. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (pp. 137-144).
Du, J. C., Chen, Q. Y., Peng, Y. F., Xiang, Y., Tao, C., & Lu, Z. Y. (2019). ML-Net: multi-label classification of biomedical texts with deep neural networks. Journal of the American Medical Informatics Association, 26(11), 1279-1285.
Elisseeff, A. & Weston, J. (2001). A kernel method for multi-labelled classification. In: Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic (pp. 681–687).
Freitas Rocha, V., Varejão, F. M., & Segatto, M. E. V. (2022). Ensemble of classifier chains and decision templates for multi-label classification. Knowledge and Information Systems, 1-21.
Fürnkranz, J. & Hüllermeier, E. (2003). Pairwise preference learning and ranking. In: European Conference on Machine Learning (pp. 145-156). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39857-8_15
Fürnkranz, J., Hüllermeier, E., Loza Mencía, E. & Brinker, K. (2008). Multilabel classification via calibrated label ranking. Machine Learning, 73(2), 133-153.
Ghamrawi, N. & McCallum, A. (2005). Collective multi-label classification. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management (pp. 195-200).
Godbole, S. & Sarawagi, S. (2004). Discriminative methods for multi-labeled classification. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 22-30). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_5
Haghighian Roudsari, A., Afshar, J., Lee, W., & Lee, S. (2022). PatentNet: multi-label classification of patent documents using deep learning based language understanding. Scientometrics, 1-25.
Katakis, I., Tsoumakas, G. & Vlahavas, I. (2008). Multilabel text classification for automated tag suggestion. In: Proceedings of the ECML/PKDD 2008 Discovery Challenge (p. 5).
Kim, Y. (2014). Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (pp. 1746–1751).
Lai, S. W., Xu, L. H., Liu, K. & Zhao, J. (2015). Recurrent convolutional neural networks for text classification. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence (pp. 2267–2273).
Lewis, D. D., Yang, Y. M., Russell-Rose, T. & Li, F. (2004). RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5(Apr), 361-397.
Li, T. & Ogihara, M. (2003). Detecting emotion in music. In: Proceedings of the 4th International Conference on Music Information Retrieval.
Liu, L. Q., Mu, F. N., Li, P. Y., Mu, X., Tang, J., Ai, X. S., … & Zhou, X. (2019). NeuralClassifier: an open-source neural hierarchical multi-label text classification toolkit. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations (pp. 87-92). https://doi.org/10.18653/v1/P19-3015
Liu, P. F., Qiu, X. P. & Huang, X. J. (2016). Recurrent neural network for text classification with multi-task learning. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence (pp. 2873–2879). https://doi.org/10.48550/arXiv.1605.05101
Liu, T. Y., Yang, Y. M., Wan, H., Zeng, H. J., Chen, Z. & Ma, W. Y. (2005). Support vector machines classification with a very large-scale taxonomy. ACM SIGKDD Explorations Newsletter, 7(1), 36-43.
Madjarov, G., Kocev, D., Gjorgjevikj, D. & Džeroski, S. (2012). An extensive experimental comparison of methods for multi-label learning. Pattern Recognition, 45(9), 3084-3104. https://doi.org/10.1016/j.patcog.2012.03.004
Pang, B. & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1-135. http://dx.doi.org/10.1561/1500000011
Pestian, J., Brew, C., Matykiewicz, P., Hovermale, D. J., Johnson, N., Cohen, K. B. & Duch, W. (2007). A shared task involving multi-label classification of clinical free text. In: Biological, Translational, and Clinical Language Processing (pp. 97-104).
Read, J., Martino, L., Olmos, P. M. & Luengo, D. (2015). Scalable multi-output label prediction: From classifier chains to classifier trellises. Pattern Recognition, 48(6), 2096-2109. https://doi.org/10.1016/j.patcog.2015.01.004
Read, J., Pfahringer, B. & Holmes, G. (2008). Multi-label classification using ensembles of pruned sets. In: Proceedings of the 8th IEEE International Conference on Data Mining (pp. 995-1000). https://doi.org/10.1109/ICDM.2008.74
Read, J., Pfahringer, B., Holmes, G. & Frank, E. (2011). Classifier chains for multi-label classification. Machine Learning, 85(3), 333-359.
Roudsari, A. H., Afshar, J., Lee, W. & Lee, S. (2022). PatentNet: multi-label classification of patent documents using deep learning based language understanding. Scientometrics, 127(1), 207-231. https://doi.org/10.1007/s11192-021-04179-4
Rubin, T. N., Chambers, A., Smyth, P. & Steyvers, M. (2012). Statistical topic models for multilabel document classification. Machine Learning, 88(1), 157-208. https://doi.org/10.1007/s10994-011-5272-5
Schapire, R. E. (1999). A brief introduction to boosting. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (pp. 1401-1406).
Sechidis, K., Tsoumakas, G. & Vlahavas, I. (2011). On the stratification of multi-label data. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (pp. 145-158).
Szymański, P. & Kajdanowicz, T. (2017). A scikit-based Python environment for performing multilabel classification. https://doi.org/10.48550/arXiv.1702.01460
Szymański, P., Kajdanowicz, T. & Kersting, K. (2016). How is a data-driven approach better than random choice in label space division for multi-label classification? Entropy, 18(8), 282. https://doi.org/10.3390/e18080282
Trohidis, K., Tsoumakas, G., Kalliris, G. & Vlahavas, I. (2011). Multi-label classification of music by emotion. EURASIP Journal on Audio, Speech, and Music Processing, 2011(1), 1-9. https://doi.org/10.1186/1687-4722-2011-426793
Tsoumakas, G. & Vlahavas, I. (2007). Random k-labelsets: An ensemble method for multilabel classification. In: Proceedings of the 18th European Conference on Machine Learning (pp. 406-417). https://doi.org/10.1007/978-3-540-74958-5_38
Tsoumakas, G., Katakis, I. & Vlahavas, I. (2010). Random k-labelsets for multilabel classification. IEEE Transactions on Knowledge and Data Engineering, 23(7), 1079-1089. https://doi.org/10.1109/TKDE.2010.164
Ueda, N. & Saito, K. (2002). Parametric mixture models for multi-labeled text. In: Proceedings of the 15th International Conference on Neural Information Processing Systems (pp. 737-744).
Xu, S. & An, X. (2019). ML²S-SVM: multi-label least-squares support vector machine classifiers. The Electronic Library, 37(6), 1040-1058. https://doi.org/10.1108/EL-09-2019-0207
Xu, S. (2018). Bayesian naïve Bayes classifiers to text classification. Journal of Information Science, 44(1), 48-59. https://doi.org/10.1177/0165551516677946
Yang, Y. M., Zhang, J. & Kisiel, B. (2003). A scalability analysis of classifiers in text categorization. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 96-103).
Yu, Z. L., Wang, Q., Fan, Y., Dai, H. J. & Qiu, M. K. (2015). An improved classifier chain algorithm for multi-label classification of big data analysis. In: Proceedings of the IEEE 12th International Conference on Embedded Software and Systems (pp. 1298-1301). https://doi.org/10.1109/HPCC-CSS-ICESS.2015.240
Zhang, M. L. & Zhou, Z. H. (2006). Multilabel neural networks with applications to functional genomics and text categorization. IEEE Transactions on Knowledge and Data Engineering, 18(10), 1338-1351. https://doi.org/10.1109/TKDE.2006.162
Zhang, M. L. & Zhou, Z. H. (2007). ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognition, 40(7), 2038-2048. https://doi.org/10.1016/j.patcog.2006.12.019
