Article title

Predicting Aggregated User Satisfaction in Software Projects

Authors
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
User satisfaction is an important aspect of software quality, yet it has rarely been studied in the software engineering literature. Building on earlier research, this paper focuses on predicting user satisfaction with machine learning techniques, using software development data from an extended ISBSG dataset. The study involved building, evaluating and comparing a total of 15,600 prediction schemes. Each scheme consists of a different combination of components: manual feature preselection, handling of missing values, outlier elimination, value normalization, automated feature selection, and a classifier. The research procedure involved 10-fold cross-validation and separate testing, both repeated 10 times, to train and evaluate each prediction scheme. The level of accuracy achieved by the best-performing schemes, expressed by the Matthews correlation coefficient, was about 0.5 in the cross-validation stage and about 0.5–0.6 in the testing stage. The study identified the most accurate settings for the components of prediction schemes.
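For context, the Matthews correlation coefficient (MCC) ranges from -1 to 1, with 0 corresponding to chance-level prediction, so values around 0.5–0.6 indicate moderate predictive accuracy. The sketch below illustrates what one such prediction scheme could look like: it is a minimal example assuming scikit-learn and purely hypothetical file and column names (isbsg_extended.csv, user_satisfaction) with numeric features, not the authors' actual configuration, of which the study compared 15,600.

import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.metrics import make_scorer, matthews_corrcoef

# Hypothetical export of the extended ISBSG dataset; column names are illustrative.
data = pd.read_csv("isbsg_extended.csv")
X = data.drop(columns=["user_satisfaction"])
y = data["user_satisfaction"]

# One prediction scheme: missing-value handling, value normalization,
# automated feature selection, and a classifier (here a random forest,
# standing in for whichever classifier a given scheme uses).
scheme = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=10)),
    ("clf", RandomForestClassifier(random_state=0)),
])

# 10-fold cross-validation repeated 10 times, scored with the
# Matthews correlation coefficient, mirroring the evaluation procedure.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=10, random_state=0)
scores = cross_val_score(scheme, X, y, cv=cv, scoring=make_scorer(matthews_corrcoef))
print(f"Mean MCC over 10x10-fold CV: {scores.mean():.2f}")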
Year
Pages
335–357
Physical description
Bibliography: 39 items, tables, figures
Contributors
  • Faculty of Computer Science and Information Technology, West Pomeranian University of Technology in Szczecin, ul. Żołnierska 49, 71-210 Szczecin, Poland
Bibliography
  • [1] Atkeson C.G., Moore A.W., Schaal S., Locally Weighted Learning, Artificial Intelligence Review, 11, 1-5, 1997, 11-73.
  • [2] Cerpa, N., Bardeen, M., Astudillo, C. A., Verner, J., Evaluating different families of prediction methods for estimating software project outcomes, Journal of Systems and Software, 112, 2016, 48–64.
  • [3] Cleary J.G., Trigg L.E., K*: an instance-based learner using an entropic distance measure, in: Proceedings of the Twelfth International Conference on Machine Learning (ICML’95), Armand Prieditis and Stuart J. Russell (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1995, 108-114.
  • [4] Ding, C.H.Q., Peng, H., Minimum redundancy feature selection from microarray gene expression data, in: Proc. the 2nd IEEE Comp. Society Bioinformatics Conf., Stanford, CA, IEEE Comp. Society, Los Alamitos, 2003, 523–529.
  • [5] Fenton, N., Marsh, W., Neil, M., Cates, P., Forey, S., Tailor, M., Making Resource Decisions for Software Projects, in: Proceedings of the 26th International Conference on Software Engineering, IEEE Computer Society, Washington, DC, 2004, 397–406.
  • [6] Frank E., Hall M.A., Witten I.H., The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”, Morgan Kaufmann, Fourth Edition, 2016, http://www.cs.waikato.ac.nz/ml/weka/Witten_et_al_2016_appendix.pdf, last accessed 2018/05/22.
  • [7] Frank E., Witten I.H., Generating Accurate Rule Sets Without Global Optimization. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML ‘98), Jude W. Shavlik (Ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1998, 144-151.
  • [8] Friedman J., Hastie T., Tibshirani R., Special Invited Paper. Additive Logistic Regression: A Statistical View of Boosting, The Annals of Statistics, 28, 2, 2000, 337-374.
  • [9] Garcés, L., Ampatzoglou, A., Avgeriou, P., Nakagawa, E.Y., Quality attributes and quality models for ambient assisted living software systems: A systematic mapping, Information and Software Technology, 82, 2017, 121-138.
  • [10] Hall M., Frank E., Combining Naive Bayes and Decision Tables, in: D.L. Wilson & H. Chad (Eds), Proceedings of Twenty-First International Florida Artificial Intelligence Research Society Conference, AAAI Press, Coconut Grove, Florida, USA, 2008, 318-319.
  • [11] Holmes G., Pfahringer B., Kirkby R., Frank E., Hall M., Multiclass Alternating Decision Trees, in: Proceedings of the 13th European Conference on Machine Learning (ECML ‘02), Tapio Elomaa, Heikki Mannila, and Hannu Toivonen (Eds.). Springer-Verlag, London, UK, 2002, 161-172.
  • [12] Holte R.C., Very simple classification rules perform well on most commonly used datasets. Machine Learning. 11, 1993, 63-91.
  • [13] ISBSG Repository Data Release 11. International Software Benchmarking Standards Group, 2009.
  • [14] Idri, A., Bachiri, M., Fernández-Alemán, J.L., A Framework for Evaluating the Software Product Quality of Pregnancy Monitoring Mobile Personal Health Records, Journal of Medical Systems, 40, 3, 2016, art. no. 50, 1-17.
  • [15] ISO/IEC: Software engineering – Software product Quality Requirements and Evaluation (SQuaRE) – System and software quality models, ISO/IEC 25010:2011(E), 2011.
  • [16] Jin W., Tung A.K.H., Han J., Wang W., Ranking Outliers Using Symmetric Neighborhood Relationship, in: Ng WK., Kitsuregawa M., Li J., Chang K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg, 2006.
  • [17] Jones C., Applied Software Measurement: Global Analysis of Productivity and Quality, McGraw-Hill Education, 3rd edition, 2008.
  • [18] Kitchenham B.A., Madeyski L., Budgen D., Keung J., Brereton P., Charters S., Gibbs S., Pohthong A., Robust Statistical Methods for Empirical Software Engineering, Empirical Software Engineering, 22, 2, 2017, 579-630.
  • [19] Kocaguneli E., Menzies T., Bener A., Keung J. W., Exploiting the Essential Assumptions of Analogy-Based Effort Estimation, IEEE Transactions on Software Engineering, 38, 2, 2012, 425–438.
  • [20] Kohavi R., The power of decision tables, in: Proceedings of the 8th European Conference on Machine Learning (ECML’95), Nada Lavrač and Stefan Wrobel (Eds.). Springer-Verlag, Berlin, Heidelberg, 1995, 174-189.
  • [21] Kohavi R., Scaling up the accuracy of Naive-Bayes classifiers: a decision-tree hybrid, in: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD’96), Evangelos Simoudis, Jiawei Han, and Usama Fayyad (Eds.). AAAI Press, 1996, 202-207.
  • [22] Landwehr N., Hall M., Frank E., Logistic Model Trees. Machine Learning, 59, 1-2, 2005, 161-205.
  • [23] Le Cessie S., Van Houwelingen J., Ridge Estimators in Logistic Regression, Journal of the Royal Statistical Society. Series C (Applied Statistics), 41, 1, 1992, 191-201.
  • [24] Menzies T., Jalali O., Hihn J., Baker D., Lum K., Stable rankings for different effort models, Automated Software Engineering, 17, 4, 2010, 409–437.
  • [25] Olsina, L., Lew, P., Dieser, A., Rivera, B., Updating quality models for evaluating new generation web applications, Journal of Web Engineering, 11, 3, 2012, 209-246.
  • [26] Pearl J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Representation and Reasoning Series (2nd printing ed.). San Francisco, California: Morgan Kaufmann, 1988.
  • [27] Quinlan R., C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA, 1993.
  • [28] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2017.
  • [29] Radliński Ł., How software development factors influence user satisfaction in meeting business objectives and requirements?, in: Madeyski, L., Ochodek, M. (eds.), Software Engineering from Research and Practice Perspectives, chapter 6, Nakom, Poznan-Warszawa, 2014, 101–119.
  • [30] Radliński Ł., Preliminary evaluation of schemes for predicting user satisfaction with the ability of system to meet stated objectives, Journal of Theoretical and Applied Computer Science, 9, 2, 2015, 32–50.
  • [31] Radliński Ł., Towards expert-based modeling of integrated software quality, Journal of Theoretical and Applied Computer Science, 6, 2, 2012, 13–26.
  • [32] RapidMiner Studio, https://rapidminer.com/products/studio/, last accessed 2018/05/22.
  • [33] Schowe B., Morik K., Fast-Ensembles of Minimum Redundancy Feature Selection, in: Okun O., Valentini G., Re M. (eds) Ensembles in Machine Learning Applications. Studies in Computational Intelligence, vol 373. Springer, Berlin, Heidelberg, 2011.
  • [34] Shepperd M., Bowes D., Hall T., Researcher Bias: The Use of Machine Learning in Software Defect Prediction. IEEE Transactions on Software Engineering, 40, 2014, 603–616.
  • [35] Shi H., Best-first Decision Tree Learning, Thesis, Master of Science. The University of Waikato, Hamilton, New Zealand, 2007.
  • [36] Song Q., Jia Z., Shepperd M., Ying S., Liu J., A General Software Defect-Proneness Prediction Framework, IEEE Transactions on Software Engineering, 37, 3, 2011, 356-370.
  • [37] Sumner M., Frank E., Hall M., Speeding up logistic model tree induction, in: Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases (PKDD’05), Alípio Mário Jorge, Luís Torgo, Pavel Brazdil, Rui Camacho, and João Gama (Eds.). Springer-Verlag, Berlin, Heidelberg, 2005, 675-683.
  • [38] Tang J., Chen Z., Fu A. W. C., Cheung, D. W., Enhancing Effectiveness of Outlier Detections for Low Density Patterns, in: Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD). Taipei, 2002, 535-548.
  • [39] Vargas J.A., García-Mundo L., Genero M., Piattini M., A systematic mapping study on serious game quality, in: Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering (EASE ‘14), ACM, New York, 2014, Article no. 15.
Notes
Record compiled under agreement 509/P-DUN/2018 from the funds of the Ministry of Science and Higher Education (MNiSW) allocated to science-dissemination activities (2018).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-e293bc4f-9b4b-4e6d-802b-764999d00c54