Article title

Using reinforcement learning to select an optimal feature set

Publication languages
EN
Abstracts
EN
Feature Selection (FS) is an essential research topic in the area of machine learning. FS, the process of identifying the relevant features and removing the irrelevant and redundant ones, is meant to deal with the high-dimensionality problem by selecting the best-performing feature subset. In the literature, many feature selection techniques approach the task as a search problem, where each state in the search space is a possible feature subset. In this paper, we introduce a new feature selection method based on reinforcement learning. First, decision tree branches are used to traverse the search space. Second, a transition similarity measure is proposed to ensure the exploration-exploitation trade-off. Finally, the informative features are those most involved in constructing the best branches. The performance of the proposed approach is evaluated on nine standard benchmark datasets; the results, measured by the AUC score, show the effectiveness of the proposed system.
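The sketch below is a minimal illustration of the general idea only, not the authors' algorithm (which traverses the search space along decision tree branches and uses a transition similarity measure): an epsilon-greedy search over feature subsets in which each candidate subset is scored by the AUC of a decision tree, mirroring the paper's evaluation metric. It assumes scikit-learn; the dataset, function names, and parameters are hypothetical choices for the example.

import random

from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def auc_of_subset(X_tr, X_te, y_tr, y_te, subset):
    # Score one candidate feature subset by the AUC of a decision tree.
    clf = DecisionTreeClassifier(random_state=0)
    clf.fit(X_tr[:, subset], y_tr)
    return roc_auc_score(y_te, clf.predict_proba(X_te[:, subset])[:, 1])

def epsilon_greedy_fs(X, y, episodes=50, epsilon=0.3, seed=0):
    # Hypothetical epsilon-greedy subset search, not the paper's method.
    rng = random.Random(seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    n = X.shape[1]
    best_subset, best_auc = list(range(n)), 0.0
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: draw a random non-empty subset.
            subset = rng.sample(range(n), rng.randint(1, n))
        else:
            # Exploit: toggle one feature in or out of the best subset.
            subset = set(best_subset)
            f = rng.randrange(n)
            subset.symmetric_difference_update({f})
            subset = sorted(subset) or [f]
        auc = auc_of_subset(X_tr, X_te, y_tr, y_te, subset)
        if auc > best_auc:
            best_subset, best_auc = subset, auc
    return best_subset, best_auc

if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    subset, auc = epsilon_greedy_fs(X, y)
    print(f"selected {len(subset)} features, AUC = {auc:.3f}")

In this sketch, the fixed epsilon crudely plays the role that the paper's transition similarity measure plays in balancing exploration against exploitation.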
Authors
  • Department of Informatics, Faculty of Sciences Dhar El Mahraz, USMBA, Fez, Morocco
  • Department of Informatics, Faculty of Sciences Dhar El Mahraz, USMBA, Fez, Morocco
  • Department of Informatics, Faculty of Sciences, UAE, Tetouan, Morocco
Bibliography
  • [1] R. Roelofs, S. Fridovich-Keil, J. Miller, V. Shankar, M. Hardt, B. Recht, and L. Schmidt, “A meta-analysis of overfitting in machine learning,” in Proceedings of the 33rd International Conference on Neural Information Processing Systems, 2019, pp. 9179-9189.
  • [2] X. Ying, “An overview of overfitting and its solutions,” in Journal of Physics: Conference Series, vol. 1168, no. 2. IOP Publishing, 2019, p. 022022.
  • [3] M. Li, H. Wang, L. Yang, Y. Liang, Z. Shang, and H. Wan, “Fast hybrid dimensionality reduction method for classification based on feature selection and grouped feature extraction,” Expert Systems with Applications, vol. 150, p. 113277, 2020.
  • [4] H. Liu, H. Motoda, and L. Yu, “A selective sampling approach to active feature selection,” Artificial Intelligence, vol. 159, no. 1-2, pp. 49-74, 2004.
  • [5] Y. Akhiat, Y. Asnaoui, M. Chahhou, and A. Zinedine, “A new graph feature selection approach,” in 2020 6th IEEE Congress on Information Science and Technology (CiSt). IEEE, 2021, pp. 156-161.
  • [6] D. M. Atallah, M. Badawy, and A. El-Sayed, “Intelligent feature selection with modified k-nearest neighbor for kidney transplantation prediction,” SN Applied Sciences, vol. 1, no. 10, pp. 1-17, 2019.
  • [7] I. Guyon, S. Gunn, M. Nikravesh, and L. A. Zadeh, Feature extraction: foundations and applications. Springer, 2008, vol. 207.
  • [8] I. Guyon and A. Elisseeff, “An introduction to feature extraction,” in Feature extraction. Springer, 2006, pp. 1-25.
  • [9] A. Yassine, “Feature selection methods for high dimensional data,” 2021.
  • [10] Y. Manzali, Y. Akhiat, M. Chahhou, M. Elmohajir, and A. Zinedine, “Reducing the number of trees in a forest using noisy features,” Evolving Systems, pp. 1-18, 2022.
  • [11] Y. Akhiat, Y. Manzali, M. Chahhou, and A. Zinedine, “A new noisy random forest based method for feature selection,” Cybernetics and Information Technologies, vol. 21, no. 2, 2021.
  • [12] S. Abe, “Feature selection and extraction,” in Support vector machines for pattern classification. Springer, 2010, pp. 331-341.
  • [13] J. Cai, J. Luo, S. Wang, and S. Yang, “Feature selection in machine learning: A new perspective,” Neurocomputing, vol. 300, pp. 70-79, 2018.
  • [14] Y. Akhiat, M. Chahhou, and A. Zinedine, “Feature selection based on graph representation,” in 2018 IEEE 5th International Congress on Information Science and Technology (CiSt). IEEE, 2018, pp. 232-237.
  • [15] J. C. Ang, A. Mirzal, H. Haron, and H. N. A. Hamed, “Supervised, unsupervised, and semi-supervised feature selection: a review on gene selection,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 13, no. 5, pp. 971-989, 2015.
  • [16] L. A. Belanche and F. F. González, “Review and evaluation of feature selection algorithms in synthetic problems,” arXiv preprint arXiv:1101.2320, 2011.
  • [17] G. Chandrashekar and F. Sahin, “A survey on feature selection methods,” Computers & Electrical Engineering, vol. 40, no. 1, pp. 16-28, 2014.
  • [18] B. Nithya and V. Ilango, “Evaluation of machine learning based optimized feature selection approaches and classification methods for cervical cancer prediction,” SN Applied Sciences, vol. 1, no. 6, pp. 1-16, 2019.
  • [19] A. Bommert, X. Sun, B. Bischl, J. Rahnenführer, and M. Lang, “Benchmark for filter methods for feature selection in high-dimensional classification data,” Computational Statistics & Data Analysis, vol. 143, p. 106839, 2020.
  • [20] Y. Akhiat, M. Chahhou, and A. Zinedine, “Ensemble feature selection algorithm,” International Journal of Intelligent Systems and Applications, vol. 11, no. 1, p. 24, 2019.
  • [21] L. Čehovin and Z. Bosnić, “Empirical evaluation of feature selection methods in classification,” Intelligent Data Analysis, vol. 14, no. 3, pp. 265-281, 2010.
  • [22] Y. Asnaoui, Y. Akhiat, and A. Zinedine, “Feature selection based on attributes clustering,” in 2021 Fifth International Conference On Intelligent Computing in Data Sciences (ICDS). IEEE, 2021, pp. 1-5.
  • [23] Y. Bouchlaghem, Y. Akhiat, and S. Amjad, “Feature selection: A review and comparative study,” in E3S Web of Conferences, vol. 351. EDP Sciences, 2022, p. 01046.
  • [24] A. Destrero, S. Mosci, C. D. Mol, A. Verri, and F. Odone, “Feature selection for high-dimensional data,” Computational Management Science, vol. 6, pp. 25-40, 2009.
  • [25] V. Fonti and E. Belitser, “Feature selection using lasso,” VU Amsterdam Research Paper in Business Analytics, vol. 30, pp. 1-25, 2017.
  • [26] I. Guyon and A. Elisseeff, “An introduction to variable and feature selection,” Journal of Machine Learning Research, vol. 3, no. Mar, pp. 1157-1182, 2003.
  • [27] R. Zebari, A. Abdulazeez, D. Zeebaree, D. Zebari, and J. Saeed, “A comprehensive review of dimensionality reduction techniques for feature selection and feature extraction,” Journal of Applied Science and Technology Trends, vol. 1, no. 2, pp. 56-70, 2020.
  • [28] J. Miao and L. Niu, “A survey on feature selection,” Procedia Computer Science, vol. 91, pp. 919-926, 2016.
  • [29] L. C. Molina, L. Belanche, and À. Nebot, “Feature selection algorithms: A survey and experimental evaluation,” in 2002 IEEE International Conference on Data Mining, 2002. Proceedings. IEEE, 2002, pp. 306-313.
  • [30] R. Caruana, A. Niculescu-Mizil, G. Crew, and A. Ksikes, “Ensemble selection from libraries of models,” in Proceedings of the Twenty-First International Conference on Machine Learning, 2004, p. 18.
  • [31] A. Yassine, C. Mohamed, and A. Zinedine, “Feature selection based on pairwise evaluation,” in 2017 Intelligent Systems and Computer Vision (ISCV). IEEE, 2017, pp. 1-6.
  • [32] B. Gregorutti, B. Michel, and P. Saint-Pierre, “Correlation and variable importance in random forests,” Statistics and Computing, vol. 27, no. 3, pp. 659-678, 2017.
  • [33] J. Kacprzyk, J. W. Owsinski, and D. A. Viattchenin, “A new heuristic possibilistic clustering algorithm for feature selection,” Journal of Automation Mobile Robotics and Intelligent Systems, vol. 8, 2014.
  • [34] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
  • [35] H. Han, X. Guo, and H. Yu, “Variable selection using mean decrease accuracy and mean decrease gini based on random forest,” in 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS). IEEE, 2016, pp. 219-224.
  • [36] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. MIT Press, 2018.
  • [37] Y. Fenjiro and H. Benbrahim, “Deep reinforcement learning overview of the state of the art,” Journal of Automation, Mobile Robotics and Intelligent Systems, pp. 20-39, 2018.
  • [38] S. M. H. Fard, A. Hamzeh, and S. Hashemi, “Using reinforcement learning to find an optimal set of features,” Computers & Mathematics with Applications, vol. 66, no. 10, pp. 1892-1904, 2013.
  • [39] M. Lichman, “UCI Machine Learning Repository,” Irvine, CA: University of California, School of Information and Computer Science, 2013. [Online]. Available: http://archive.ics.uci.edu/ml
  • [40] F. Provost, T. Fawcett, and R. Kohavi, “The case against accuracy estimation for comparing induction algorithms,” in Proceedings of the Fifteenth International Conference on Machine Learning, 1998.
YADDA identifier
bwmeta1.element.baztech-d03c2c8e-22d9-4cd0-9805-a51031d6dcca