Article title

Feature selection for the low industrial yield of cane sugar production based on rule learning algorithms

Publication languages
EN
Abstracts
EN
This article presents a machine learning model for selecting the characteristics that most influence the low industrial yield of cane sugar production in Cuba. The data set covers ten years of sugar harvests, from 2010 to 2019. The work includes business understanding, data understanding, and data preparation. The accuracy of six rule learning algorithms is evaluated: CONJUNCTIVERULE, DECISIONTABLE, RIDOR, FURIA, PART and JRIP. The results identify R417, R379, R378, R419a, R410, R613, R1427 and R380 as the indicators that most influence low industrial performance.
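The general idea in the abstract, scoring indicators by how accurately rules built on them predict low yield, can be illustrated with a minimal sketch. This is not the authors' method and not one of the six algorithms named above: it swaps in a toy single-condition threshold rule per indicator (similar in spirit to a one-rule learner). The indicator names `ind_a` and `ind_b`, the data values, and the helper functions are all hypothetical, and accuracy is measured on the training rows only, whereas a real evaluation would normally use held-out data.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]


def best_threshold_rule(rows: List[Row], labels: List[int],
                        feature: str) -> Tuple[float, str, float]:
    """Return (threshold, direction, accuracy) of the best one-condition rule.

    A rule reads "IF feature <= t THEN low yield" (direction "<=") or
    "IF feature > t THEN low yield" (direction ">"). Label 1 means low yield.
    Accuracy is computed on the same rows used to pick the rule.
    """
    best = (0.0, "<=", -1.0)
    for t in sorted({r[feature] for r in rows}):
        for direction in ("<=", ">"):
            correct = 0
            for r, y in zip(rows, labels):
                fires = r[feature] <= t if direction == "<=" else r[feature] > t
                correct += int(int(fires) == y)
            acc = correct / len(rows)
            if acc > best[2]:
                best = (t, direction, acc)
    return best


def rank_features(rows: List[Row], labels: List[int]) -> List[Tuple[str, float]]:
    """Rank features by the accuracy of their best single-condition rule."""
    scores = {f: best_threshold_rule(rows, labels, f)[2] for f in rows[0]}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: two made-up indicators, four harvest records.
rows = [
    {"ind_a": 1.0, "ind_b": 5.0},
    {"ind_a": 2.0, "ind_b": 1.0},
    {"ind_a": 3.0, "ind_b": 4.0},
    {"ind_a": 4.0, "ind_b": 2.0},
]
labels = [1, 1, 0, 0]  # 1 = low industrial yield (assumed encoding)

for name, acc in rank_features(rows, labels):
    print(f"{name}: best rule accuracy {acc:.2f}")
```

On this toy data `ind_a` separates the two classes perfectly and therefore ranks first; with real harvest data the ranking would depend on the learner and on the validation scheme.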
Authors
  • ESI DATAZUCAR, AZCUBA, Avenida 23 No.171/ N y O, Vedado, Plaza de la Revolución, La Habana, Cuba
  • CUJAE, Calle 114 No.11901/ Ciclovía y Rotonda, Marianao, La Habana, Cuba
  • CUJAE, Calle 114 No.11901/ Ciclovía y Rotonda, Marianao, La Habana, Cuba
  • CUJAE, Calle 114 No.11901/ Ciclovía y Rotonda, Marianao, La Habana, Cuba
Bibliography
  • [1] F. Beck, and J. Fürnkranz. “An Empirical Investigation Into Deep and Shallow Rule Learning”, Frontiers in Artificial Intelligence, vol. 4, 2021.
  • [2] J. Casillas, O. Cordón, M. J. Del Jesus, and F. Herrera. “Genetic feature selection in a fuzzy rule‐based classification system learning process for high‐dimensional problems”, Information Sciences, vol. 136, no. 1, 2001, 135–157, doi: 10.1016/S0020‐0255(01)00147‐5.
  • [3] J. Coto Palacio, Y. Jiménez Martínez, and A. Nowé. “Aplicación de sistemas neuroborrosos en la clasificación de reportes en problemas de secuenciación”, Revista Cubana de Ciencias Informáticas, vol. 14, no. 4, 2020, 34–47, Publisher: Universidad de las Ciencias Informáticas.
  • [4] B. Dębska, and B. Guzowska‐Świder. “Decision trees in selection of featured determined food quality”, Analytica Chimica Acta, vol. 705, no. 1, 2011, 261–271, doi: 10.1016/j.aca.2011.06.030.
  • [5] Y. Everingham, J. Sexton, D. Skocaj, and G. Inman‐Bamber. “Accurate prediction of sugarcane yield using a random forest algorithm”, Agronomy for Sustainable Development, vol. 36, no. 2, 2016, 27, doi: 10.1007/s13593‐016‐0364‐z.
  • [6] M. A. Ferraciolli, F. F. Bocca, and L. H. A. Rodrigues. “Neglecting spatial autocorrelation causes underestimation of the error of sugar‐cane yield models”, Computers and Electronics in Agriculture, vol. 161, 2019, 233–240, doi: 10.1016/j.compag.2018.09.003.
  • [7] Y. Filiberto, R. Bello, Y. Caballero, and M. Frías. “Algoritmo para el aprendizaje de reglas de clasificación basado en la teoría de los conjuntos aproximados extendida”, DYNA, vol. 78, no. 169, 2011, 62–70, Publisher: 2006, Revista DYNA.
  • [8] J. Fürnkranz. “Rule Learning”. In: C. Sammut and G. I. Webb, eds., Encyclopedia of Machine Learning, 875–879. Springer US, Boston, MA, 2010.
  • [9] S. García, J. Luengo, and F. Herrera, Data Preprocessing in Data Mining, volume 72 of Intelligent Systems Reference Library, Springer International Publishing: Cham, 2015, doi: 10.1007/978‐3‐319‐10247‐4.
  • [10] P. K. Giri, S. S. De, S. Dehuri, and S. Cho. “Biogeography based optimization for mining rules to assess credit risk”, Intelligent Systems in Accounting, Finance and Management, vol. 28, no. 1, 2021, 35–51, doi: 10.1002/isaf.1486.
  • [11] S. Gnanambal, M. Thangaraj, V. T. Meenatchi, and V. Gayathri. “Classification Algorithms with Attribute Selection: an evaluation study using WEKA”, International Journal of Advanced Networking and Applications, vol. 9, no. 6, 2018, 3640–3644.
  • [12] E. A. d. M. Gomes Soares, L. C. Leite Damascena, L. M. Mendes de Lima, and R. Marcos de Moraes. “Analysis of the Fuzzy Unordered Rule Induction Algorithm as a Method for Classification”, 2018.
  • [13] J. J. T. Gordillo, and V. H. P. Rodríguez. “Cálculo de la fiabilidad y concordancia entre codificadores de un sistema de categorías para el estudio del foro online en e‐learning”, vol. 27, 2009, 17.
  • [14] A. Gupta, A. Mohammad, A. Syed, and M. N.. “A Comparative Study of Classification Algorithms using Data Mining: Crime and Accidents in Denver City the USA”, International Journal of Advanced Computer Science and Applications, vol. 7, no. 7, 2016, doi: 10.14569/IJACSA.2016.070753.
  • [15] I. Guyon, and A. Elisseeff. “An introduction to variable and feature selection”, The Journal of Machine Learning Research, vol. 3, 2003, 1157–1182.
  • [16] R. G. Hammer, P. C. Sentelhas, and J. C. Q. Mariano. “Sugarcane Yield Prediction Through Data Mining and Crop Simulation Models”, Sugar Tech, vol. 22, no. 2, 2020, 216–225, doi: 10.1007/s12355‐019‐00776‐z.
  • [17] J. Hernández Orallo, M. J. Ramírez Quintana, and C. Ferri Ramírez. Introducción a la Minería de Datos, Pearson Educación S.A.: España, 2004, OCLC: 933368678.
  • [18] C. Huang, X. Huang, Y. Fang, J. Xu, Y. Qu, P. Zhai, L. Fan, H. Yin, Y. Xu, and J. Li. “Sample imbalance disease classification model based on association rule feature selection”, Pattern Recognition Letters, vol. 133, 2020, 280–286, doi: 10.1016/j.patrec.2020.03.016.
  • [19] Y. Li and Z.‐F. Wu. “Fuzzy feature selection based on min–max learning rule and extension matrix”, Pattern Recognition, vol. 41, no. 1, 2008, 217–226, doi: 10.1016/j.patcog.2007.06.007.
  • [20] J. Martinez Heras. “Precision, Recall, F1, Accuracy en clasificación”, October 2020. Section: machine learning.
  • [21] V. B. Núñez, R. Velandia, F. Hernández, J. Meléndez, and H. Vargas. “Atributos Relevantes para el Diagnóstico Automático de Eventos de Tensión en Redes de Distribución de Energía Eléctrica”, Revista Iberoamericana de Automática e Informática Industrial RIAI, vol. 10, no. 1, 2013, 73–84, doi: 10.1016/j.riai.2012.11.007.
  • [22] R. A. V. Ortega, and F. L. H. Suárez. “Evaluación de algoritmos de extracción de reglas de decisión para el diagnóstico de huecos de tensión”, 2010, 127.
  • [23] M. S. Pathan, A. Nag, M. M. Pathan, and S. Dev. “Analyzing the impact of feature selection on the accuracy of heart disease prediction”, Healthcare Analytics, vol. 2, 2022, 100060, doi: 10.1016/j.health.2022.100060.
  • [24] B. T. Pham, C. Luu, T. V. Phong, H. D. Nguyen, H. V. Le, T. Q. Tran, H. T. Ta, and I. Prakash. “Flood risk assessment using hybrid artificial intelligence models integrated with multi‐criteria decision analysis in Quang Nam Province, Vietnam”, Journal of Hydrology, vol. 592, 2021, 125815, doi: 10.1016/j.jhydrol.2020.125815.
  • [25] F. M. Pérez. “Estudio y análisis del funcionamiento de técnicas de minería de datos en conjuntos de datos relacionados con la Biología”, 35.
  • [26] H. Rao, X. Shi, A. K. Rodrigue, J. Feng, Y. Xia, M. Elhoseny, X. Yuan, and L. Gu. “Feature selection based on artificial bee colony and gradient boosting decision tree”, Applied Soft Computing, vol. 74, 2019, 634–642, doi: 10.1016/j.asoc.2018.10.036.
  • [27] M. Ribas García, R. Consuegra del Rey, and M. Alfonso Alfonso. “Análisis de los factores que más inciden sobre el rendimiento industrial azucarero”, vol. 43, no. 1, 2016, 10.
  • [28] A. Rivas Méndez. “Estudio experimental sobre algoritmos de clasificación supervisada basados en reglas en conjuntos de datos de alta dimensión”, 2014, Accepted: 2019‐07‐09T15:50:17Z Publisher: Universidad de Holguín, Facultad Informática Matemática, Departamento de Informática.
  • [29] M. Schiezaro, and H. Pedrini. “Data feature selection based on Artificial Bee Colony algorithm”, EURASIP Journal on Image and Video Processing, vol. 2013, no. 1, 2013, 47, doi: 10.1186/1687‐5281‐2013‐47.
  • [30] K. Topouzelis, and A. Psyllos. “Oil spill feature selection and classification using decision tree forest on SAR image data”, ISPRS Journal of Photogrammetry and Remote Sensing, vol. 68, 2012, 135–143, doi: 10.1016/j.isprsjprs.2012.01.005.
  • [31] S. Veenadhari, B. Misra, and C. Singh. “Machine learning approach for forecasting crop yield based on climatic parameters”. In: 2014 International Conference on Computer Communication and Informatics, Coimbatore, India, 2014, 1–5, doi: 10.1109/ICCCI.2014.6921718.
  • [32] M. Widmann. “From Modeling to Scoring: Confusion Matrix and Class Statistics”, May 2019.
  • [33] H. Zhou, J. Zhang, Y. Zhou, X. Guo, and Y. Ma. “A feature selection algorithm of decision tree based on feature weight”, Expert Systems with Applications, vol. 164, 2021, 113842, doi: 10.1016/j.eswa.2020.113842.
  • [34] L. Zhou, Y.‐W. Si, and H. Fujita. “Predicting the listing statuses of Chinese‐listed companies using decision trees combined with an improved filter feature selection method”, Knowledge-Based Systems, vol. 128, 2017, 93–101, doi: 10.1016/j.knosys.2017.05.003.
Notes
Record prepared with funds from MEiN, agreement no. SONP/SP/546092/2022, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularisation of science and promotion of sport (2024).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-23b10157-5df5-4508-834b-037e1aeebd6b