Article title

Handling class label noise in medical pattern classification systems

Publication languages
EN
Abstracts
EN
Pattern classification systems play an important role in medical decision support. They make it possible to automate and speed up data analysis while handling complex, massive amounts of information and discovering new knowledge. However, their quality depends on the classification models built, which in turn require a training set. In supervised classification, a class label must be supplied for each training sample, which is usually done by domain experts or by automatic systems. Since neither approach is flawless, the dataset may be corrupted by class noise. In that case, class labels are wrongly assigned to objects, which may negatively affect classifier training and impair classification performance. In this contribution, we analyze the usefulness of existing tools for dealing with class noise, known as noise filtering methods, in the context of medical pattern classification. Experiments carried out on several real-world medical datasets demonstrate the importance of noise filtering as a pre-processing step and its beneficial influence on the obtained classification accuracy.
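To make the idea of noise filtering as a pre-processing step concrete, here is a minimal sketch of one classic filter, the edited nearest neighbour rule associated with Wilson's 1972 paper. It is only an illustration and is not claimed to be the procedure evaluated in the article. The function name `edited_nearest_neighbours`, the parameter `k`, and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def edited_nearest_neighbours(X, y, k=3):
    """Return indices of samples whose label matches the majority label
    of their k nearest neighbours (Euclidean distance, sample excluded)."""
    kept = []
    for i, xi in enumerate(X):
        # Distances from sample i to every other sample, nearest first.
        dists = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)
        neighbour_labels = [y[j] for _, j in dists[:k]]
        majority_label, _ = Counter(neighbour_labels).most_common(1)[0]
        # Keep the sample only if its label agrees with its neighbourhood.
        if majority_label == y[i]:
            kept.append(i)
    return kept


# Hypothetical toy data: two well-separated clusters, with the sample at
# index 3 deliberately mislabelled (it lies in the class-0 cluster).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
     (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.15, 5.15)]
y = [0, 0, 0, 1, 1, 1, 1, 1]

kept = edited_nearest_neighbours(X, y, k=3)
X_clean = [X[i] for i in kept]
y_clean = [y[i] for i in kept]
print(kept)  # [0, 1, 2, 4, 5, 6, 7]
```

A classifier would then be trained on `X_clean` and `y_clean` instead of the raw data. The filter removes suspicious samples and does not relabel them, which is one possible design choice; on real data it can also discard correctly labelled samples near class boundaries.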
Year
Volume
Pages
123–130
Physical description
Bibliography: 26 items, figures, tables.
Authors
author
  • ENGINE Centre, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
author
  • Department of Systems and Computer Networks, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
author
  • Department of Systems and Computer Networks, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland
Bibliography
  • [1] AZAR A. T., HASSANIEN A. E. Dimensionality reduction of medical big data using neural-fuzzy classifier. Soft Comput., 2015, Vol. 19. pp. 1115–1127.
  • [2] BATISTA G. E. A. P. A., MONARD M. C. An analysis of four missing data treatment methods for supervised learning. Applied Artificial Intelligence, 2003, Vol. 17. pp. 519–533.
  • [3] BRODLEY C. E., FRIEDL M. A. Identifying Mislabeled Training Data. Journal of Artificial Intelligence Research, 1999, Vol. 11. pp. 131–167.
  • [4] CZARNECKI W. M. Weighted tanimoto extreme learning machine with case study in drug discovery. IEEE Comp. Int. Mag., 2015, Vol. 10. pp. 19–29.
  • [5] DEVIJVER P. On the editing rate of the MULTIEDIT algorithm. Pattern Recognition Letters, 1986, Vol. 4. pp. 9–12.
  • [6] GARCIA L. P. F., DE CARVALHO A. C. P. L. F., LORENA A. C. Effect of label noise in the complexity of classification problems. Neurocomputing, 2015, Vol. 160. pp. 108–119.
  • [7] HUANG G., ZHANG Y., CAO J., STEYN M., TARAPOREWALLA K. On line mining abnormal period patterns from multiple medical sensor data streams. World Wide Web, 2014, Vol. 17. pp. 569–587.
  • [8] KHOSHGOFTAAR T. M., REBOURS P. Improving software quality prediction by noise filtering techniques. Journal of Computer Science and Technology, 2007, Vol. 22. pp. 387–396.
  • [9] KONONENKO I. Machine learning for medical diagnosis: history, state of the art and perspective. Artificial Intelligence in Medicine, 2001, Vol. 23. pp. 89–109.
  • [10] KRAWCZYK B., FILIPCZUK P. Cytological image analysis with firefly nuclei detection and hybrid one-class classification decomposition. Engineering Applications of Artificial Intelligence, 2014, Vol. 31. pp. 126–135.
  • [11] KRAWCZYK B., SCHAEFER G. A hybrid classifier committee for analysing asymmetry features in breast thermograms. Appl. Soft Comput., 2014, Vol. 20. pp. 112–118.
  • [12] KRAWCZYK B., WOŹNIAK M. Hypertension type classification using hierarchical ensemble of one-class classifiers for imbalanced data. ICT Innovations 2014, 2015, Vol. 311 of Advances in Intelligent Systems and Computing. pp. 341–349.
  • [13] LE CESSIE S., VAN HOUWELINGEN J. Ridge estimators in logistic regression. Applied Statistics, 1992, Vol. 41. pp. 191–201.
  • [14] MCLACHLAN G. J. Discriminant Analysis and Statistical Pattern Recognition (Wiley Series in Probability and Statistics). 2004. Wiley-Interscience.
  • [15] POMBO N., ARAÚJO P., VIANA J. Knowledge discovery in clinical decision support systems for pain management: A systematic review. Artificial Intelligence in Medicine, 2014, Vol. 60. pp. 1–11.
  • [16] QUINLAN J. R. C4.5: programs for machine learning. 1993. Morgan Kaufmann Publishers, San Francisco, CA, USA.
  • [17] SÁEZ J. A., GALAR M., LUENGO J., HERRERA F. Analyzing the presence of noise in multi-class problems: alleviating its influence with the one-vs-one decomposition. Knowl. Inf. Syst., 2014, Vol. 38. pp. 179–206.
  • [18] SÁEZ J. A., GALAR M., LUENGO J., HERRERA F. INFFC: an iterative class noise filter based on the fusion of classifiers with noise sensitivity control. Information Fusion, 2016, Vol. 27. pp. 19–32.
  • [19] SÁEZ J. A., LUENGO J., HERRERA F. Predicting noise filtering efficacy with data complexity measures for nearest neighbor classification. Pattern Recognition, 2013, Vol. 46. pp. 355–364.
  • [20] SÁNCHEZ J., BARANDELA R., MÁRQUES A., ALEJO R., BADENAS J. Analysis of new techniques to obtain quality training sets. Pattern Recognition Letters, 2003, Vol. 24. pp. 1015–1022.
  • [21] SÁNCHEZ J., PLA F., FERRI F. Prototype selection for the nearest neighbor rule through proximity graphs. Pattern Recognition Letters, 1997, Vol. 18. pp. 507–513.
  • [22] SANZ J., GALAR M., JURIO A., BRUGOS A., PAGOLA M., BUSTINCE H. Medical diagnosis of cardiovascular diseases using an interval-valued fuzzy rule-based classification system. Appl. Soft Comput., 2014, Vol. 20. pp. 103–111.
  • [23] TENG C.-M. Correcting Noisy Data. Proceedings of the Sixteenth International Conference on Machine Learning, 1999. Morgan Kaufmann Publishers, San Francisco, CA, USA, pp. 239–248.
  • [24] WILSON D. Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems and Man and Cybernetics, 1972, Vol. 2. pp. 408–421.
  • [25] WILSON D. R., MARTINEZ T. R. Improved heterogeneous distance functions. Journal of Artificial Intelligence Research, 1997, Vol. 6. pp. 1–34.
  • [26] WOLPERT D. The supervised learning no-free-lunch theorems. In Proc. 6th Online World Conference on Soft Computing in Industrial Applications, 2001. pp. 25–42.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-336e14fd-ee7c-4f15-9577-dfe10508d336