

Article title

Experiments on software error prediction using Decision Tree and Random Forest algorithms

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Machine learning algorithms are widely used to assess error-proneness in software. We conducted several error-prediction experiments on the public PROMISE repository using the Decision Tree and Random Forest algorithms. We also examined techniques aimed at improving the performance and accuracy of the model, such as oversampling, hyperparameter optimization, and threshold adjustment. The outcome of our experiments suggests that the Random Forest algorithm, with 100-1000 trees, can achieve high values of evaluation metrics such as accuracy and balanced accuracy. However, it has to be combined with techniques countering the imbalance of the datasets used in order to also ensure the high precision and recall that correspond to correct detection of erroneous software. Additionally, oversampling and hyperparameter optimization could be reliably applied to the algorithm, while the threshold adjustment technique was not found to be consistent.
Year
Volume
Pages
865--869
Physical description
Bibliography: 33 items, illustrations, tables, charts
Authors
  • Warsaw University of Technology, Institute of Computer Science, Nowowiejska 15/19 00-665 Warsaw, Poland
  • Warsaw University of Technology, Institute of Computer Science, Nowowiejska 15/19 00-665 Warsaw, Poland
Bibliography
  • 1. F. Elberzhager, A. Rosbach, R. Eschbach, J. Münch, “Reducing Test Effort: A Systematic Mapping Study on Existing Approaches”, Information and Software Technology, vol. 54, no. 10, pp. 1092-1106, 2012.
  • 2. K. Bareja, A. Singhal, “A Review of Estimation Techniques to Reduce Testing Efforts in Software Development”, http://dx.doi.org/10.1109/ACCT.2015.110, 2015.
  • 3. J. Hryszko, L. Madeyski, “Cost Effectiveness of Software Defect Prediction in an Industrial Project”, http://dx.doi.org/10.1515/fcds-2018-0002, 2018.
  • 4. Y. Z. Bala, P. A. Samat, K. Y. Sharif, N. Manshor, “Current Software Defect Prediction: A Systematic Review”, http://dx.doi.org/10.1109/AiIC54368.2022.99114586, 2022.
  • 5. F. Matloob et al., “Software Defect Prediction Using Ensemble Learning: A Systematic Literature Review”, http://dx.doi.org/10.1109/ACCESS.2021.3095559, 2021.
  • 6. Y. Zhao, K. Damevski, H. Chen, “A Systematic Survey of Just-in-Time Software Defect Prediction”, http://dx.doi.org/10.1145/3567550, 2023.
  • 7. T. Menzies, J. DiStefano, A. Orrego, R. Chapman, “Assessing predictors of software defects”, in Proc. Predictive Software Models Workshop, pp. 1-5, 2004.
  • 8. G. Boetticher, T. Menzies, T. Ostrand, PROMISE Repository of Empirical Software Engineering Data, West Virginia University, Department of Computer Science, 2007.
  • 9. C. Catal, B. Diri, B. Ozumut, “An artificial immune system approach for fault prediction in object oriented software”, pp. 238-245, http://dx.doi.org/10.1109/DEPCOS-RELCOMEX, 2007.
  • 10. C. Catal, B. Diri, “Investigating the effect of dataset size, metrics sets, and feature selection techniques on software fault prediction problem”, http://dx.doi.org/10.1016/j.ins.2008.12.001, 2009.
  • 11. J. Brownlee, “Clonal selection theory & CLONALG. The clonal selection classification algorithm”, Technical Report 2-02, Swinburne University of Technology, 2005.
  • 12. J. H. Carter, “The immune system as a model for pattern recognition and classification”, http://dx.doi.org/10.1136/jamia.2000.0070028, 2001.
  • 13. L. Breiman, “Bagging predictors”, Mach. Learn. 24, pp. 123-140, https://doi.org/10.1007/BF00058655, 1996.
  • 14. D. Mundada, A. Murade, O. Vaidya, J. N. Swathi, “Software Fault Prediction Using Artificial Neural Network And Resilient Back Propagation”, Int. J. Comput. Sci. Eng., vol. 5, no. 03, pp. 173-179, 2016.
  • 15. Z. Xiang, L. Zhang, “Research on an Optimized C4.5 Algorithm Based on Rough Set Theory”, http://dx.doi.org/10.1109/ICMeCG.2012.74, 2012.
  • 16. P. Bishnu, V. Bhattacherjee, “Software Fault Prediction Using Quad Tree-Based K-Means Clustering Algorithm”, pp. 1146-1150, http://dx.doi.org/10.1109/TKDE.2011.163, 2012.
  • 17. P. Bishnu, V. Bhattacherjee, “Outlier Detection Technique Using Quad Tree”, in Proc. Int'l Conf. Computer Comm. Control and Information Technology, pp. 143-148, 2009.
  • 18. A. Okutan, O. Taner, “Software defect prediction using Bayesian networks”, http://dx.doi.org/10.1007/s10664-012-9218-8, 2014.
  • 19. P. Kumudha, R. Venkatesan, “Cost-Sensitive Radial Basis Function Neural Network Classifier for Software Defect Prediction”, http://dx.doi.org/10.1155/2016/2401496, 2016.
  • 20. S. Gupta, D. Gupta, “Fault Prediction using Metric Threshold Value of Object Oriented Systems”, International Journal of Engineering Science and Computing, vol. 7, no. 6, pp. 13629-13643, 2017.
  • 21. E. Erturk, E. Akcapinar, “Iterative software fault prediction with a hybrid approach”, http://dx.doi.org/10.1016/j.asoc.2016.08.025, 2016.
  • 22. J. S. R. Jang, “ANFIS: adaptive-network-based fuzzy inference system”, http://dx.doi.org/10.1109/21.256541, 1993.
  • 23. F. Alighardashi, M. A. Z. Chahooki, “The Effectiveness of the Fused Weighted Filter Feature Selection Method to Improve Software Fault Prediction”, http://dx.doi.org/10.22385/jctecs.v8i0.96, 2016.
  • 24. C. Lakshmi Prabha, N. Shivakumar, “Software Defect Prediction Using Machine Learning Techniques”, in Proc. of the Fourth International Conference on Trends in Electronics and Informatics, IEEE Xplore Part Number: CFP20J32-ART, ISBN: 978-1-7281-5518-0, 2020.
  • 25. Y. Shen, S. Hu, S. Cai, M. Chen, “Software Defect Prediction based on Bayesian Optimization Random Forest”, http://dx.doi.org/10.1109/DSA56465.2022.00149, 2022.
  • 26. T. F. Husin, M. R. Pribadi, Yohannes, “Implementation of LSSVM in Classification of Software Defect Prediction Data with Feature Selection”, in Proc. 9th Int. Conf. on Electrical Engineering, Computer Science and Informatics (EECSI 2022), pp. 126-131, 2022.
  • 27. MD. A. Jahangir, MD. A. Tajwar, W. Marma, “Intelligent Software Bug Prediction: An Empirical Approach”, http://dx.doi.org/10.1109/ICREST57604.2023.10070026, 2023.
  • 28. Python Core Team, “Python: A dynamic, open source programming language”, Python Software Foundation, accessed 28.04.2022, <https://www.python.org/>
  • 29. C. R. Harris, K. J. Millman, S. J. van der Walt et al., “Array programming with NumPy”, Nature 585, pp. 357-362, http://dx.doi.org/10.1038/s41586-020-2649-2, 2020.
  • 30. W. McKinney, “Data structures for statistical computing in python”, in Proc. of the 9th Python in Science Conference, vol. 445, pp. 56-61, http://dx.doi.org/10.25080/Majora-92bf1922-00a, 2010.
  • 31. F. Pedregosa et al., “Scikit-learn: Machine Learning in Python”, Journal of Machine Learning Research 12, pp. 2825-2830, 2011.
  • 32. G. Lemaître, F. Nogueira, C. K. Aridas, “Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning”, Journal of Machine Learning Research 17, pp. 1-5, http://dx.doi.org/10.48550/arXiv.1609.06570, 2017.
  • 33. N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique”, Journal of Artificial Intelligence Research, pp. 321-357, 2002.
Notes
1. Thematic Tracks Short Papers
2. Record compiled with funds from MEiN (Ministry of Education and Science), agreement no. SONP/SP/546092/2022, under the "Social Responsibility of Science" programme - module: Popularization of science and promotion of sport (2024).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-8eee7233-98ef-40ed-8a01-993835efa49e