Comparative evaluation of the different data mining techniques used for the medical database

Kasperczuk, A.; Dardzińska, A.

doi:10.1515/ama-2016-0036

Article - details

Article title

Comparative evaluation of the different data mining techniques used for the medical database

Authors

Kasperczuk A., Dardzińska A.

Identifiers

DOI

10.1515/ama-2016-0036

Abstracts

Data mining is an emerging research area applied to a wide range of problems. Classification and association rule discovery are two of its main tasks. In this paper, we use three classification algorithms as implemented in Weka: J48 (an open source Java implementation of the C4.5 algorithm), Multilayer Perceptron (MLP, a modification of the standard linear perceptron) and Naïve Bayes (based on Bayes' rule and a set of conditional independence assumptions). We compare these classifiers to choose the algorithm best suited to the conditions of a voice disorders database. To find association rules in the transactional medical database, we first use the Apriori algorithm for frequent itemset mining. These two initial steps of the analysis help to create a medical knowledge base. The ultimate goal is to build a model that improves the way existing and future data in the medical database are read and interpreted.
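
The frequent itemset step mentioned in the abstract can be illustrated with a minimal pure-Python sketch of the Apriori idea. This is not the authors' Weka workflow: the function names `apriori` and `association_rules`, the thresholds, and the toy records in `toy_transactions` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
    return frequent


def association_rules(frequent, min_confidence):
    """Derive (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is itself frequent,
                # so the lookup below always succeeds.
                conf = supp / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, supp, conf))
    return rules


# Hypothetical toy records, NOT data from the voice disorders database.
toy_transactions = [
    {"hoarseness", "teacher", "voice_disorder"},
    {"hoarseness", "voice_disorder"},
    {"teacher", "smoker"},
    {"hoarseness", "teacher", "voice_disorder"},
    {"smoker"},
]

frequent_itemsets = apriori(toy_transactions, min_support=0.4)
rules = association_rules(frequent_itemsets, min_confidence=0.9)
for antecedent, consequent, supp, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(supp, 2), round(conf, 2))
```

The sketch recounts support by scanning all transactions for each candidate, which is acceptable for a toy example but not for a large database; a production tool such as Weka's Apriori implementation would be the practical choice.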

Keywords

data mining, classification, WEKA, J48, MLP, apriori, association rules

medical knowledge base, data mining, data exploration, classification algorithm

Publisher

Oficyna Wydawnicza Politechniki Białostockiej

Journal

Acta Mechanica et Automatica

Year

2016

Volume

Vol. 10, no. 3

Pages

233-238

Physical description

Bibliography: 19 items, tables, charts.

Contributors

author

Kasperczuk A.

a.kasperczuk@doktoranci.pb.edu.pl

Department of Mechanics and Computer Science, Bialystok University of Technology, ul. Wiejska 45c, 15-351 Bialystok, Poland

author

Dardzińska A.

a.dardzinska@pb.edu.pl

Department of Mechanics and Computer Science, Bialystok University of Technology, ul. Wiejska 45c, 15-351 Bialystok, Poland

Notes

Prepared with funds from the Ministry of Science and Higher Education (MNiSW) under agreement 812/P-DUN/2016 for science dissemination activities.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-32c0af7d-56d9-4b71-ba4b-9596f83ecb15