Identifiers
Title variants
Publication languages
Abstracts
The aim of the paper is to show how selected data mining tasks can be solved using the R programming language. Full R scripts are provided for preparing the data sets, solving the tasks, and analyzing the results.
Keywords
Year
Volume
Pages
27--49
Physical description
Bibliography: 18 items, tables, charts
Authors
Bibliography
- [1] D.T. Larose, Ch.D. Larose, Discovering Knowledge in Data. An Introduction to Data Mining, Hoboken: Wiley, 2014.
- [2] D.T. Larose, Ch.D. Larose, Data Mining and Predictive Analytics, Hoboken: Wiley, 2015.
- [3] Data sets distributed with R: airquality data set. [On-line]. https://forge.scilab.org/index.php/p/rdataset/source/tree/master/csv/datasets/airquality.csv. [18.11.2020].
- [4] I. Ben-Gal, Outlier Detection. [in:] Data Mining and Knowledge Discovery Handbook. (Eds.) O. Maimon and L. Rokach, Boston, MA: Springer, 117-130, 2005. https://doi.org/10.1007/0-387-25465-X_7.
- [5] An Introduction to Outliers – What are Outliers – Types of Outliers. [On-line]. https://www.anblicks.com/resources/insights-blogs/an-introduction-to-outliers/. [18.11.2020].
- [6] R. Rakotomalala, Tanagra: Body Mass Index Data Set. [On-line]. eric.univ-lyon2.fr/~ricco/tanagra/fichiers/body_mass_index.xls. [18.11.2020].
- [7] G. Seif, The 5 Clustering Algorithms Data Scientists Need to Know. [On-line]. https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68. [18.11.2020].
- [8] P. Fränti, S. Sieranoja, K-means properties on six clustering benchmark datasets, “Applied Intelligence” 48 (12), 4743-4759, 2018. https://doi.org/10.1007/s10489-018-1238-7.
- [9] P. Fränti, S. Sieranoja, Clustering basic benchmark: Aggregation, [On-line]. http://cs.joensuu.fi/sipu/datasets/. [18.11.2020].
- [10] H. You, G. Rumbe, Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data, “International Journal of Interactive Multimedia and Artificial Intelligence” 1(3): 5-12, 2010.
- [11] C. Elkan, Predictive analytics and data mining, [On-line]. https://www.researchgate.net/publication/228780185_Predictive_analytics_and_data_mining. [20.11.2020].
- [12] UCI Machine Learning Repository: Seeds Data Set, [On-line]. https://archive.ics.uci.edu/ml/datasets/seeds. [20.11.2020].
- [13] UCI Machine Learning Repository: Heart Disease Data Set. [On-line]. https://archive.ics.uci.edu/ml/datasets/heart+disease. [20.11.2020].
- [14] RDocumentation: C5.0 Control. [On-line]. https://www.rdocumentation.org/packages/C50/versions/0.1.3.1/topics/C5.0Control. [20.11.2020].
- [15] UCI Machine Learning Repository: Auto MPG Data Set, [On-line]. https://archive.ics.uci.edu/ml/datasets/auto+mpg. [20.11.2020].
- [16] K. Vougas, M. Krochmal, T. Jackson, A. Polyzos, A. Aggelopoulos, I.S. Pateras, M. Liontos, A. Varvarigou, E.O. Johnson, V. Georgoulias, A. Vlahou, P. Townsend, D. Thanos, J. Bartek, V.G. Gorgoulis, Deep Learning and Association Rule Mining for Predicting Drug Response in Cancer. A Personalised Medicine Approach, bioRxiv. [On-line]. https://www.biorxiv.org/content/10.1101/070490v3.full. [24.11.2020].
- [17] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York: Springer, 2009.
- [18] UCI Machine Learning Repository: Congressional Voting Records Data Set. [On-line]. https://archive.ics.uci.edu/ml/datasets/congressional+voting+records. [20.11.2020].
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-f1aa3597-9eaf-4795-9e78-19227d44aca2