2024 | Vol. 49, No. 4 | 409-430
Article title

Leveraging Unseen Features along with their PLM-based Representation to Handle Negative Covariate Shift Problem in Text Classification

Title variants
Publication languages
EN
Abstracts
EN
This paper presents a novel approach that uses unseen features to address the problem of negative covariate shift. Covariate shift occurs when the data distribution drifts between the training and testing phases of a machine learning model. It typically arises in the negative class as a consequence of the rapid evolution of the topics discussed there, driven by the characteristics of online social media. Such a shift means the data are changing and now contain features that the trained model never encountered during the training phase; we refer to these as unseen features. To the best of our knowledge, we are the first to use unseen features to address the negative covariate shift problem. The proposed approach is compared with three baselines and one state-of-the-art method. Experimental results on a multi-domain sentiment dataset show that it outperforms the baseline and state-of-the-art approaches by a significant margin across various performance evaluation metrics.
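
The core idea of the abstract, isolating test-time features absent from the training vocabulary and representing them with a pre-trained language model (PLM), can be sketched in a few lines of Python. The snippet below is a minimal illustration, not the authors' implementation; the bert-base-uncased checkpoint, whitespace tokenization, and mean pooling are assumptions chosen for brevity.

    import torch
    from transformers import AutoModel, AutoTokenizer

    def unseen_features(train_docs, test_docs):
        """Return test-set tokens that never occur in the training corpus."""
        seen = {tok for doc in train_docs for tok in doc.lower().split()}
        test_vocab = {tok for doc in test_docs for tok in doc.lower().split()}
        return sorted(test_vocab - seen)

    def plm_token_vectors(tokens, model_name="bert-base-uncased"):
        """Embed each unseen token with a PLM, mean-pooling the last hidden states."""
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModel.from_pretrained(model_name)
        model.eval()
        with torch.no_grad():
            enc = tokenizer(tokens, padding=True, return_tensors="pt")
            hidden = model(**enc).last_hidden_state        # (batch, seq_len, dim)
            mask = enc["attention_mask"].unsqueeze(-1)     # zero out padding positions
            pooled = (hidden * mask).sum(1) / mask.sum(1)  # mean over real tokens only
        return dict(zip(tokens, pooled))

    if __name__ == "__main__":
        train = ["the battery life of this phone is great"]
        test = ["the latest firmware update bricked my device"]
        unseen = unseen_features(train, test)  # tokens the trained model never saw
        vectors = plm_token_vectors(unseen)    # one PLM vector per unseen feature
        print(unseen)

Such PLM vectors place unseen test-time tokens in the same embedding space as the training vocabulary, which is what makes them usable by a classifier trained before those tokens existed.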
Publisher

Year
Pages
409-430
Physical description
Bibliography: 31 items, figures, tables
Authors
  • Department of Computer Science, South Asian University, New Delhi, India, abulaish@sau.ac.in
Bibliography
  • [1] Bickel S., Brückner M., and Scheffer T. Discriminative learning under covariate shift. Journal of Machine Learning Research, 10(9), 2009.
  • [2] Blitzer J., Dredze M., and Pereira F. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the ACL, pages 440-447. ACL, 2007.
  • [3] Chang C.-C. and Lin C.-J. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1-27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
  • [4] Colas F. and Brazdil P. Comparison of SVM and some older classification algorithms in text classification tasks. In Artificial Intelligence in Theory and Practice, pages 169-178. Springer US, 2006.
  • [5] Devlin J., Chang M., Lee K., and Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, Minneapolis, MN, USA, June 2-7, pages 4171-4186. ACL, 2019.
  • [6] Fang T., Lu N., Niu G., and Sugiyama M. Rethinking importance weighting for deep learning under distribution shift. In Advances in Neural Information Processing Systems, volume 33, pages 11996-12007. Curran Associates, Inc., 2020.
  • [7] Fei G. and Liu B. Social media text classification under negative covariate shift. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 2347-2356, Lisbon, Portugal, September 2015. Association for Computational Linguistics.
  • [8] Hammoudeh Z. and Lowd D. Learning from positive and unlabeled data with arbitrary positive shift. In Proceedings of the 34th International Conference on Neural Information Processing Systems, pages 13088-13099. Curran Associates Inc., 2020.
  • [9] Heckman J. J. Sample selection bias as a specification error. Econometrica, 47(1):153-161, 1979.
  • [10] Joachims T. Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the 10th European Conference on Machine Learning, ECML'98, pages 137-142. Springer-Verlag, 1998.
  • [11] Joulin A., Grave E., Bojanowski P., Douze M., Jégou H., and Mikolov T. FastText.zip: Compressing text classification models. CoRR, abs/1612.03651, 2016.
  • [12] Khan S. S. and Madden M. G. One-class classification: taxonomy of study and review of techniques. The Knowledge Engineering Review, 29(3):345-374, 2014.
  • [13] Liu B., Dai Y., Li X., Lee W., and Yu P. Building text classifiers using positive and unlabeled examples. In Proceedings of the 3rd IEEE International Conference on Data Mining, pages 179-186, 2003.
  • [14] Mikolov T., Chen K., Corrado G., and Dean J. Efficient estimation of word representations in vector space. In Proceedings of the 1st International Conference on Learning Representations, ICLR, Scottsdale, Arizona, USA, May 2-4, Workshop Track Proceedings, 2013.
  • [15] Minter T. Single-class classification. In Symposium on Machine Processing of Remotely Sensed Data, page 54, 1975.
  • [16] Nguyen T., Lyu B., Ishwar P., Scheutz M., and Aeron S. Joint covariate-alignment and concept-alignment: A framework for domain generalization. In Proceedings of the 32nd IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pages 1-6, 2022.
  • [17] Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., Vanderplas J., Passos A., Cournapeau D., Brucher M., Perrot M., and Duchesnay E. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825-2830, 2011.
  • [18] Pennington J., Socher R., and Manning C. D. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP, October 25-29, 2014, Doha, Qatar, pages 1532-1543. ACL, 2014.
  • [19] Peters M. E., Neumann M., Iyyer M., Gardner M., Clark C., Lee K., and Zettlemoyer L. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227-2237, New Orleans, Louisiana, June 2018. Association for Computational Linguistics.
  • [20] Radford A., Narasimhan K., Salimans T., and Sutskever I. Improving language understanding by generative pre-training. OpenAI, 2018.
  • [21] Sakai T. and Shimizu N. Covariate shift adaptation on learning from positive and unlabeled data. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, the 31st Innovative Applications of Artificial Intelligence Conference, and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence. AAAI Press, 2019.
  • [22] Shimodaira H. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90(2):227-244, 2000.
  • [23] Sugiyama M. and Müller K.-R. Input-dependent estimation of generalization error under covariate shift. Statistics & Risk Modeling with Applications in Finance and Insurance, 23(4):249-279, 2005.
  • [24] Sugiyama M., Nakajima S., Kashima H., Buenau P., and Kawanabe M. Direct importance estimation with model selection and its application to covariate shift adaptation. In Proceedings of the 21st Annual Conference on Advances in Neural Information Processing Systems, volume 20, pages 1433-1440. Curran Associates, Inc., 2007.
  • [25] Tax D. M. and Duin R. P. Support vector domain description. Pattern Recognition Letters, 20(11):1191-1199, 1999.
  • [26] Tian J., Hsu Y.-C., Shen Y., Jin H., and Kira Z. Exploring covariate and concept shift for out-of-distribution detection. In Proceedings of the NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and Applications, 2021.
  • [27] Wasi N. A. and Abulaish M. An unseen features enhanced text classification approach. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), page 8, Queensland, Australia, June 2023.
  • [28] Wasi N. A. and Abulaish M. An unseen features-enriched lifelong machine learning framework. In Proceedings of the International Conference on Computational Science and Its Applications (ICCSA), pages 471-481, Athens, Greece, 2023. Springer Nature Switzerland.
  • [29] Yang Z., Dai Z., Yang Y., Carbonell J., Salakhutdinov R. R., and Le Q. V. XLNet: Generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
  • [30] Yu H., Han J., and Chang K. C.-C. PEBL: Positive example based learning for web page classification using SVM. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '02, pages 239-248, New York, NY, USA, 2002. Association for Computing Machinery.
  • [31] Zhou A. and Levine S. Bayesian adaptation for covariate shift. In Advances in Neural Information Processing Systems, volume 34, pages 914-927, 2021.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-bdbafb0f-54ad-4459-bdf1-9d9f3d45b528