Leveraging Unseen Features along with their PLM-based Representation to Handle Negative Covariate Shift Problem in Text Classification

Nesar Ahmad Wasi; Muhammad Abulaish

doi:10.2478/fcds-2024-0020


Foundations of Computing and Decision Sciences

Volume 49 (2024): Issue 4 (December 2024)

By: Nesar Ahmad Wasi and Muhammad Abulaish

Open Access

Nov 2024

Abstract

This paper presents a novel approach that uses unseen features to address the problem of negative covariate shift. Covariate shift occurs when the data observed during the testing phase of a machine learning model drifts from the data observed during its training phase. It typically occurs in the negative class because the topics discussed in that class evolve rapidly, driven by the characteristics of online social media. A shift in the data means that the data is changing and that it contains features the model did not see during training. We refer to such features as unseen features. To the best of our knowledge, we are the first to use unseen features to address the negative covariate shift problem. The proposed approach is compared with three baselines and one state-of-the-art method. Experimental results on a multi-domain sentiment dataset show that the proposed approach outperforms the baselines and the state-of-the-art method by a significant margin across several performance evaluation metrics.
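The definition of unseen features can be illustrated with a minimal sketch. It assumes that features are simple lowercase word tokens, which the abstract does not specify, and it does not reproduce the proposed approach or its PLM-based representation. The function names `tokenize` and `unseen_features` and the example sentences are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a document and split it into word tokens (an assumed feature type)."""
    return re.findall(r"[a-z']+", text.lower())


def unseen_features(train_docs, test_docs):
    """Return the tokens that occur in test_docs but never in train_docs."""
    train_vocab = {tok for doc in train_docs for tok in tokenize(doc)}
    test_vocab = {tok for doc in test_docs for tok in tokenize(doc)}
    return test_vocab - train_vocab


# Hypothetical toy data: the test documents mention topics absent from training.
train_docs = ["the battery life is great", "the screen is too dim"]
test_docs = ["the battery drains fast", "the new foldable hinge is great"]

unseen = unseen_features(train_docs, test_docs)
print(sorted(unseen))
```

In this toy example the test documents introduce vocabulary that never appeared in training, which is the situation the abstract describes for a shifting negative class.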


DOI: https://doi.org/10.2478/fcds-2024-0020 | Journal eISSN: 2300-3405 | Journal ISSN: 0867-6356

Language: English

Page range: 409 - 430

Submitted on: Sep 17, 2023

Accepted on: Jun 17, 2024

Published on: Nov 30, 2024

Published by: Poznan University of Technology

In partnership with: Paradigm Publishing Services

© 2024 Nesar Ahmad Wasi, Muhammad Abulaish, published by Poznan University of Technology
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.