2013 | Vol. 127, no. 1-4 | 273-288
Article title

Random Reducts: A Monte Carlo Rough Set-based Method for Feature Selection in Large Datasets

Publication languages
EN
Abstracts
EN
An important step prior to constructing a classifier for a very large data set is feature selection. For many problems it is possible to find a subset of attributes that has the same discriminative power as the full data set. There are many feature selection methods, but in none of them are Rough Set models combined with statistical argumentation. Moreover, known methods of feature selection usually discard shadowed features, i.e. those carrying the same or partially the same information as the selected features. In this study we present Random Reducts (RR), a feature selection method which precedes classification per se. The method is based on the Monte Carlo Feature Selection (MCFS) layout and uses Rough Set Theory in the feature selection process. On synthetic data, we demonstrate that the method is able to select otherwise shadowed features, of which the user should be made aware, and to find interactions in the data set.
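The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea it describes: draw random feature subsets (the Monte Carlo part), compute a reduct on each subset (the Rough Set part), and score each feature by how often it survives into a reduct. Everything here is an assumption made for illustration and not the authors' implementation: the function names, the greedy backward elimination that preserves the size of the positive region, the relative-frequency score, and the tiny synthetic table in which feature 2 duplicates (shadows) feature 0.

```python
import random
from collections import defaultdict


def positive_region_size(rows, decisions, attrs):
    """Count objects whose indiscernibility class under `attrs` has one decision."""
    classes = defaultdict(list)
    for row, dec in zip(rows, decisions):
        classes[tuple(row[a] for a in attrs)].append(dec)
    return sum(len(decs) for decs in classes.values() if len(set(decs)) == 1)


def greedy_reduct(rows, decisions, attrs, rng):
    """Hypothetical helper: drop attributes in random order while the
    positive region keeps the size it has for the whole subset `attrs`."""
    target = positive_region_size(rows, decisions, attrs)
    kept = list(attrs)
    order = list(attrs)
    rng.shuffle(order)
    for a in order:
        trial = [b for b in kept if b != a]
        if positive_region_size(rows, decisions, trial) == target:
            kept = trial  # attribute `a` is dispensable within this subset
    return set(kept)


def random_reduct_scores(rows, decisions, n_features, subset_size, n_iter, seed=0):
    """Assumed scoring: fraction of draws containing a feature in which
    that feature also ended up in the reduct of the drawn subset."""
    rng = random.Random(seed)
    sampled = [0] * n_features
    in_reduct = [0] * n_features
    for _ in range(n_iter):
        subset = rng.sample(range(n_features), subset_size)
        reduct = greedy_reduct(rows, decisions, subset, rng)
        for a in subset:
            sampled[a] += 1
            if a in reduct:
                in_reduct[a] += 1
    return [r / s if s else 0.0 for r, s in zip(in_reduct, sampled)]


# Synthetic table: decision = f0 XOR f1, f2 is a copy of f0, f3 is constant.
rows = [(0, 0, 0, 0), (0, 1, 0, 0), (1, 0, 1, 0), (1, 1, 1, 0)]
decisions = [r[0] ^ r[1] for r in rows]
scores = random_reduct_scores(rows, decisions, n_features=4,
                              subset_size=3, n_iter=200, seed=1)
print(scores)
```

In this toy setting the duplicated feature 2 receives a non-zero score alongside feature 0, because each random ordering may keep either of the two; a single deterministic reduct would report only one of them. The constant feature 3 is never needed and scores zero.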
Pages
273-288
Physical description
Bibliography: 15 items, tables, charts.
Authors
author
author
  • Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland, jan.komorowski@lcb.uu.se
Bibliography
  • [1] Bazan, J. G., Skowron, A., Synak, P.: Dynamic reducts as a tool for extracting laws from decisions tables, in: Methodologies for Intelligent Systems, Springer, 1994, 346-355.
  • [2] Draminski, M.: dmlab: a data mining library, 2009.
  • [3] Draminski, M., Kierczak, M., Koronacki, J., Komorowski, J.: Monte Carlo feature selection and interdependency discovery in supervised classification, in: Advances in Machine Learning II, Springer, 2010, 371-385.
  • [4] Draminski, M., Koronacki, J., Komorowski, J.: A study on Monte Carlo Gene Screening, in: Intelligent Information Processing and Web Mining, Springer, 2005, 349-356.
  • [5] Draminski, M., Rada-Iglesias, A., Enroth, S., Wadelius, C., Koronacki, J., Komorowski, J.: Monte Carlo feature selection for supervised classification, Bioinformatics, 24(1), 2008, 110-117.
  • [6] Komorowski, J., Øhrn, A., Skowron, A.: The ROSETTA rough set software system, Handbook of data mining and knowledge discovery, 2002, 2-3.
  • [7] Kowalczyk, W.: Rough data modelling: A new technique for analyzing data, Rough sets in knowledge discovery, 1, 1998, 400-421.
  • [8] Metropolis, N., Ulam, S.: The Monte Carlo method, Journal of the American statistical association, 44(247), 1949, 335-341.
  • [9] Øhrn, A.: Rosetta: A collection of classes and routines for empirical modelling and data mining, 1996.
  • [10] Øhrn, A., Komorowski, J., et al.: ROSETTA-A Rough Set Toolkit for Analysis of Data, Proc. Third International Joint Conference on Information Sciences, Citeseer, 1997.
  • [11] Pawlak, Z.: Rough sets, International Journal of Computer & Information Sciences, 11(5), 1982, 341-356.
  • [12] Pawlak, Z.: Rough sets: theoretical aspects of reasoning about data, system theory, Knowledge Engineering and Problem Solving, vol. 9, 1991.
  • [13] Skowron, A.: The rough sets theory and evidence theory, Fundamenta Informaticae, 13(3), 1990, 245-262.
  • [14] Skowron, A., Rauszer, C.: The discernibility matrices and functions in information systems, in: Intelligent Decision Support, Springer, 1992, 331-362.
  • [15] Swiniarski, R. W.: Rough sets methods in feature reduction and classification, Int. J. Appl. Math. Comput., 11(3), 2001, 565-582.
Document type
Bibliography
Identifiers
YADDA identifier
bwmeta1.element.baztech-87a7ee05-2331-458a-954e-5430b90f4d90