Article title

Joint feature selection and classification for positive unlabelled multi-label data using weighted penalized empirical risk minimization

Publication language
EN
Abstract
EN
We consider the positive-unlabelled multi-label scenario, in which multiple target variables are not observed directly. Instead, we observe surrogate variables indicating whether or not the target variables are labelled. The presence of a label means that the corresponding variable is positive; the absence of a label means that the variable can be either positive or negative. We analyze embedded feature selection methods based on two weighted penalized empirical risk minimization frameworks. In the first approach, we introduce observation weights: the idea is to assign larger weights to observations for which the value of the true target variable is consistent with the corresponding surrogate variable. In the second approach, we consider a weighted empirical risk function that corresponds to the risk function for the true, unobserved target variables. In both methods the weights depend on the unknown propensity score functions, whose estimation is a challenging problem. We propose to use very simple bounds for the propensity score, which leads to relatively simple forms of the weights. In the experiments, we analyze the predictive power of the considered methods under different labelling schemes.
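For orientation, here is a minimal sketch of the risk-reweighting idea behind the second approach, written for a single label in the standard positive-unlabelled form surveyed by Bekker and Davis [3]; the symbols (propensity score e, loss ℓ, penalty Ω, regularization parameter λ) are illustrative and need not match the paper's notation. A labelled example (s_i = 1) is certainly positive, so it enters the risk as a positive with weight 1/e(x_i) and as a negative with the complementary weight 1 - 1/e(x_i), while an unlabelled example (s_i = 0) enters as a negative:

\hat{R}(f) = \frac{1}{n} \sum_{i:\, s_i = 1} \left[ \frac{1}{e(x_i)}\, \ell\big(f(x_i), 1\big) + \left(1 - \frac{1}{e(x_i)}\right) \ell\big(f(x_i), -1\big) \right] + \frac{1}{n} \sum_{i:\, s_i = 0} \ell\big(f(x_i), -1\big) + \lambda\, \Omega(f),

where e(x) = P(s = 1 | y = 1, x) is the propensity score. Taking the expectation over the labelling mechanism recovers the risk for the true, unobserved targets, which is why such weighted estimators are used; since e is unknown, the abstract's proposal is to replace it with simple bounds, yielding the relatively simple weight forms mentioned above.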
Pages
311–322
Physical description
Bibliography: 46 items, tables, charts
Authors
  • Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warsaw, Poland
  • Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland
Bibliography
  • [1] Argyriou, A., Evgeniou, T. and Pontil, M. (2008). Convex multi-task feature learning, Machine Learning 73(3): 243–272.
  • [2] Bekker, J. and Davis, J. (2018). Estimating the class prior in positive and unlabeled data through decision tree induction, Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, USA, pp. 1–8.
  • [3] Bekker, J. and Davis, J. (2020). Learning from positive and unlabeled data: A survey, Machine Learning 109(4): 719–760.
  • [4] Bekker, J., Robberechts, P. and Davis, J. (2019). Beyond the selected completely at random assumption for learning from positive and unlabeled data, in U. Brefeld et al. (Eds), Proceedings of the 2019 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Springer, Cham, pp. 71–85.
  • [5] Biecek, P. (2018). DALEX: Explainers for complex predictive models in R, Journal of Machine Learning Research 19(1): 3245–3249.
  • [6] Bucak, S.S., Jin, R. and Jain, A.K. (2011). Multi-label learning with incomplete class assignments, Proceedings of the Conference on Computer Vision and Pattern Recognition, Colorado Springs, USA, pp. 2801–2808.
  • [7] Couso, I., Dubois, D. and Hüllermeier, E. (2017). Maximum likelihood estimation and coarse data, Proceedings of the International Conference on Scalable Uncertainty Management, Granada, Spain, pp. 3–16.
  • [8] Dembczyński, K., Waegeman, W., Cheng, W. and Hüllermeier, E. (2012). On label dependence and loss minimization in multi-label classification, Machine Learning 88(1): 5–45.
  • [9] Elkan, C. and Noto, K. (2008). Learning classifiers from only positive and unlabeled data, Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’08, Las Vegas, USA, pp. 213–220.
  • [10] Frénay, B. and Verleysen, M. (2014). Classification in the presence of label noise: A survey, IEEE Transactions on Neural Networks and Learning Systems 25(5): 845–869.
  • [11] Gibaja, E. and Ventura, S. (2015). A tutorial on multilabel learning, ACM Computing Surveys 47(3): 1–38.
  • [12] Guyon, I. and Elisseeff, A. (2003). An introduction to variable and feature selection, Journal of Machine Learning Research 3(1): 1157–1182.
  • [13] Hall, E.J. and Brenner, D.J. (2008). Cancer risks from diagnostic radiology, British Journal of Radiology 81(965): 362–378.
  • [14] Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction, Springer, New York.
  • [15] Hazell, L. and Shakir, S. (2006). Under-reporting of adverse drug reactions: A systematic review, Drug Safety 29(5): 385–396.
  • [16] He, Z.-F., Yang, M., Gao, Y., Liu, H.-D. and Yin, Y. (2019). Joint multi-label classification and label correlations with missing labels and feature selection, Knowledge-Based Systems 163(1): 145–158.
  • [17] Heitjan, D.F. and Rubin, D.B. (1991). Ignorability and coarse data, Annals of Statistics 19(4): 2244–2253.
  • [18] Jain, S., White, M. and Radivojac, P. (2016). Estimating the class prior and posterior from noisy positives and unlabeled data, Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 2693–2701.
  • [19] Jaskie, K., Elkan, C. and Spanias, A. (2020). A modified logistic regression for positive and unlabeled learning, 53rd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, USA, pp. 2007–2011.
  • [20] Ji, S., Tang, L., Yu, S. and Ye, J. (2010). A shared-subspace learning framework for multi-label classification, ACM Transactions on Knowledge Discovery from Data 4(2): 1–29.
  • [21] Kakade, S.M., Shalev-Shwartz, S. and Tewari, A. (2012). Regularization techniques for learning with matrices, Journal of Machine Learning Research 13(1): 1865–1890.
  • [22] Kanehira, A. and Harada, T. (2016). Multi-label ranking from positive and unlabeled data, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, pp. 5138–5146.
  • [23] Kashef, S., Nezamabadi-pour, H. and Nikpour, B. (2018). Multilabel feature selection: A comprehensive review and guiding experiments, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 8(2): 1–29.
  • [24] Lagasse, R.S. (2002). Anesthesia safety: Model or myth?: A review of the published literature and analysis of current original data, Anesthesiology: The Journal of the American Society of Anesthesiologists 97(6): 1609–1617.
  • [25] Łazęcka, M., Mielniczuk, J. and Teisseyre, P. (2021). Estimating the class prior for positive and unlabelled data via logistic regression, Advances in Data Analysis and Classification 15(4): 1039–1068.
  • [26] Lee, J. and Kim, D.-W. (2017). SCLS: Multi-label feature selection based on scalable criterion for large label set, Pattern Recognition 66(1): 342–352.
  • [27] Natarajan, N., Dhillon, I.S., Ravikumar, P. and Tewari, A. (2013). Learning with noisy labels, Proceedings of the 26th International Conference on Neural Information Processing Systems, NIPS’13, Lake Tahoe, USA, pp. 1196–1204.
  • [28] Naula, P., Airola, A., Salakoski, T. and Pahikkala, T. (2014). Multi-label learning under feature extraction budgets, Pattern Recognition Letters 40(1): 56–65.
  • [29] Pereira, R.B., Plastino, A., Zadrozny, B. and Merschmann, L. H. C. (2018). Categorizing feature selection methods for multi-label classification, Artificial Intelligence Review 49(1): 1–22.
  • [30] Plessis, M.C., Niu, G. and Sugiyama, M. (2017). Class-prior estimation for learning from positive and unlabeled data, Machine Learning 106(4): 463–492.
  • [31] Ramaswamy, H., Scott, C. and Tewari, A. (2016). Mixture proportion estimation via kernel embeddings of distributions, Proceedings of the 33rd International Conference on Machine Learning, New York, USA, pp. 2052–2060.
  • [32] Sechidis, K. and Brown, G. (2018). Simple strategies for semi-supervised feature selection, Machine Learning 107(2): 357–395.
  • [33] Sechidis, K., Calvo, B. and Brown, G. (2014). Statistical hypothesis testing in positive unlabelled data, Machine Learning and Knowledge Discovery in Databases, Nancy, France, pp. 66–81.
  • [34] Sechidis, K., Sperrin, M., Petherick, E.S., Lujan, M. and Brown, G. (2017). Dealing with under-reported variables: An information theoretic solution, International Journal of Approximate Reasoning 85(1): 159–177.
  • [35] Shalev-Shwartz, S. and Ben-David, S. (2013). Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, Cambridge.
  • [36] Sun, Y.-Y., Zhang, Y. and Zhou, Z.-H. (2010). Multi-label learning with weak label, Proceedings of the 24th AAAI Conference on Artificial Intelligence, AAAI’10, Atlanta, USA, pp. 593–598.
  • [37] Teisseyre, P. (2020). Learning classifier chains using matrix regularization: Application to multimorbidity prediction, Proceedings of the European Conference on Artificial Intelligence, ECAI 2020, Santiago de Compostela, Spain, pp. 1–8.
  • [38] Teisseyre, P. (2021). Classifier chains for positive unlabelled multi-label learning, Knowledge-Based Systems 213(1): 1–16.
  • [39] Teisseyre, P., Mielniczuk, J. and Łazęcka, M. (2020). Different strategies of fitting logistic regression for positive and unlabelled data, Proceedings of the International Conference on Computational Science, ICCS 2020, Amsterdam, The Netherlands, pp. 3–17.
  • [40] Tsoumakas, G., Spyromitros-Xioufis, E., Vilcek, J. and Vlahavas, I. (2011). Mulan: A Java library for multi-label learning, Journal of Machine Learning Research 12(1): 2411–2414.
  • [41] Walley, N.M. et al. (2018). Characteristics of undiagnosed diseases network applicants: Implications for referring providers, BMC Health Services Research 18(1): 1–8.
  • [42] Wei, T., Guo, L.-Z., Li, Y.-F. and Gao, W. (2018). Learning safe multi-label prediction for weakly labeled data, Machine Learning 107(4): 703–725.
  • [43] Wu, L., Jin, R. and Jain, A.K. (2013). Tag completion for image retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(3): 716–727.
  • [44] Zhang, M. and Zhou, Z. (2013). A review on multi-label learning algorithms, IEEE Transactions on Knowledge and Data Engineering 26(8): 1819–1837.
  • [45] Zhu, P., Xu, Q., Hu, Q., Zhang, C. and Zhao, H. (2018). Multi-label feature selection with missing labels, Pattern Recognition 74(1): 488–502.
  • [46] Zufferey, D., Hofer, T., Hennebert, J., Schumacher, M., Ingold, R. and Bromuri, S. (2015). Performance comparison of multi-label learning algorithms on clinical data for chronic diseases, Computers in Biology and Medicine 65(1): 34–43.
YADDA identifier
bwmeta1.element.baztech-691cf764-72ec-4edd-888a-f805810a17cc