Abstract
This paper provides a comprehensive assessment of basic feature selection (FS) methods that originate from nature-inspired (NI) meta-heuristics; two well-known filter-based FS methods are included for comparison. The methods are compared on four balanced, high-dimensional, real-world text data sets in terms of accuracy, number of selected features, and computation time. The study differs from existing work in the extent of its experimental analyses, which vary the classifier, the feature model, and the term-weighting scheme. The extensive experiments indicate that basic NI algorithms produce slightly different results than filter-based methods for the text FS problem. However, filter-based methods often provide better results while using fewer features and less computation time.
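For readers unfamiliar with the term, a filter-based FS method scores each feature independently of any classifier and keeps the top-ranked ones. The sketch below is illustrative only: the abstract does not name the two filter methods used in the paper, so the chi-square statistic, the toy corpus, and the function names `chi_square_scores` and `select_top_k` are all assumptions made for this example.

```python
from collections import Counter

def chi_square_scores(docs, labels):
    """Score each term by its best one-vs-rest chi-square value (assumed filter)."""
    n = len(docs)
    doc_sets = [set(d.split()) for d in docs]   # binary term presence per document
    vocab = sorted(set().union(*doc_sets))
    class_counts = Counter(labels)
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in sorted(class_counts):
            # 2x2 contingency table: term present/absent vs. class cls/other
            a = sum(1 for s, y in zip(doc_sets, labels) if term in s and y == cls)
            b = sum(1 for s, y in zip(doc_sets, labels) if term in s and y != cls)
            c = class_counts[cls] - a
            d = n - a - b - c
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
            best = max(best, chi2)
        scores[term] = best
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Toy, hypothetical corpus (not one of the paper's data sets).
docs = ["good great film", "great acting good", "bad boring film", "boring bad plot"]
labels = ["pos", "pos", "neg", "neg"]
scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 4)
print(selected)
```

Because each term is scored in a single pass with no classifier in the loop, a filter of this kind is cheap to compute, which is consistent with the shorter computation times the abstract reports for filter-based methods. An NI wrapper, by contrast, would repeatedly train a classifier on candidate feature subsets.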
Pages
179–204
Physical description
Bibliography: 43 items, figures, tables.
Authors
author
- Adıyaman University, Department of Computer Engineering, Adıyaman, Turkey
Notes
PL
Record prepared with funding from MEiN, agreement no. SONP/SP/546092/2022, under the "Social Responsibility of Science" programme, module: Popularisation of Science and Promotion of Sport (2022-2023).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-162f60c7-dc72-45d8-9e40-7447d73dc41d