Finding outliers for large medical datasets

Duraj, A.; Krawczyk, A.

Article - details

Article title

Finding outliers for large medical datasets

Authors

Duraj A., Krawczyk A.

Selected full texts from this journal

http://pe.org.pl/

Identifiers

Title variants

Wykrywanie wyjątków w dużych zbiorach danych medycznych

Publication languages

Abstracts

The paper deals with data mining, the process of extracting valid, previously unknown, and ultimately comprehensible information from large datasets. One very interesting problem arising in scientific investigations is the detection of mistakes in data files, or outlier detection. Finding rare instances, or outliers, is important in many disciplines and in KDD (Knowledge Discovery and Data Mining) applications.

The article concerns methods for detecting outliers in datasets, which appear as various kinds of anomalies arising, for example, from mechanical damage, a change in system behavior, or simple natural human error. The research problem formulated above appears to be very important and still relevant, particularly for medical datasets. Detecting outliers can identify defects, remove data contamination and, above all, provide a basis for decision-making processes.

Keywords

outliers detection; k-means algorithm; DBSCAN algorithm

wykrywanie wyjątków; algorytm k-średnich; algorytm DBSCAN

Publisher

Wydawnictwo SIGMA-NOT

Journal

Przegląd Elektrotechniczny

Year

2010

Volume

Vol. 86, no. 12

Pages

188-191

Physical description

Bibliography: 19 items, tables

Contributors

author

Duraj A.

author

Krawczyk A.

Technical University of Łódź, Institute of Computer Science, apyc@ics.p.lodz.pl

Bibliography

[1] Aggarwal C.C., Yu P.S., Outlier Detection for High Dimensional Data, In: Proceedings of the ACM SIGMOD Conference, 2001
[2] Barnett V., Lewis T., Outliers in Statistical Data, John Wiley & Sons, 3rd edition, 1994
[3] Breunig M.M., Kriegel H.P., Ng R.T., Sander J., LOF: Identifying Density-Based Local Outliers, ACM SIGMOD Record, Volume 29, Issue 2, June 2000, pp. 93-104
[4] Cichosz P., Systemy uczące się, Wydawnictwa Naukowo-Techniczne, Warszawa, 2000
[5] Ester M., Kriegel H.-P., Sander J., Xu X., A density-based algorithm for discovering clusters in large spatial databases with noise, In: E. Simoudis, J. Han, U.M. Fayyad (Eds.), Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, AAAI, Menlo Park, CA, 1996, pp. 226-231
[6] Han J., Kamber M., Data Mining: Concepts and Techniques, Academic Press, 2001
[7] He Z., Xu X., Deng S., Discovering Cluster-Based Local Outliers, Pattern Recognition Letters, Volume 24, Issues 9-10, June 2003, pp. 1641-1650
[8] Hodge V.J., Austin J., A Survey of Outlier Detection Methodologies, Artificial Intelligence Review, Volume 22, 2004, pp. 85-126
[9] Inmon W.H., What is Data Warehouse?, PRISM, vol. 1, No. 1, 1995
[10] Jarke M., Lenzerini M., Vassiliou Y., Vassiliadis P., Fundamentals of Data Warehouses, Springer-Verlag Berlin Heidelberg, 2002
[11] Jiang M.F., Tseng S.S., Su C.M., Two-phase clustering process for outliers detection, Pattern Recognition Letters, Volume 22, Issues 6-7, May 2001, pp. 691-700
[12] Jin W., Tung A.K., Han J., Mining Top-n Local Outliers in Large Data Bases, Proceedings of International Conference on Knowledge Discovery in Data Bases, 2002, pp. 293-298
[13] John G.H., Robust Decision Trees: Removing Outliers from Databases, In: Proceedings of the First International Conference on Knowledge Discovery and Data Mining, Menlo Park, CA, AAAI Press, 1995, pp. 174-179
[14] Knorr E.M., Ng R.T., Tucakov V., Distance-Based Outliers: Algorithms and Applications, The International Journal on Very Large Data Bases, Volume 8, 2000, pp. 237-253
[15] Ramaswamy S., Rastogi R., Shim K., Efficient Algorithms for Mining Outliers from Large Data Sets, Proceedings of ACM SIGMOD International Conference on Management of Data, 2000, pp. 427-438
[16] Rousseeuw P., Leroy A., Robust Regression and Outlier Detection, John Wiley & Sons, 3rd edition, 1996
[17] Tang J., Chen Z., Fu A.W., Cheung D., A Robust Detection Scheme for Large Data Sets, In: 6th Pacific-Asia Conf. on Knowledge Discovery and Data Mining, 2001
[18] Wheeler R., Aitken S., Multiple algorithms for fraud detection, Knowledge-Based Systems, Volume 13, Issues 2-3, April 2000, pp. 93-99
[19] Yu D., Sheikholeslami G., Zhang A., FindOut: Finding Outliers in Very Large Datasets, Knowledge and Information Systems, Volume 4, 2002, pp. 387-412

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BPOK-0032-0052