

Article title

An improved comparison of three rough set approaches to missing attribute values

Publication languages
EN
Abstracts
EN
In a previous paper, three types of missing attribute values (lost values, attribute-concept values and "do not care" conditions) were compared using six data sets. Because the previous experimental results were affected by large variances caused by running experiments on different versions of a given data set, we conducted new experiments using the same pattern of missing attribute values for all three types of missing attribute values and for both certain and possible rules. Additionally, in our new experiments, the process of incrementally replacing specified values with missing attribute values was terminated when entire rows of the data sets were full of missing attribute values. Finally, we created new, incomplete data sets by replacing specified values starting from 5% of all attribute values, instead of 10% as in the previous experiments, with an increment of 5% instead of the previous increment of 10%. As a result, it is becoming clearer that the best approach to missing attribute values is based on lost values, with a small difference between certain and possible rules, and that the worst approach is based on "do not care" conditions with certain rules. With our improved experimental setup it is also clearer that, for a given data set, the type of missing attribute values should be selected individually.
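The setup described in the abstract (one shared pattern of missing positions, grown in 5% steps and reused for all three interpretations) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the random selection of cells, the markers `?`, `-` and `*`, and the reading of the stopping rule as "stop before any row would become entirely missing" are all assumptions made for illustration.

```python
import random

# Assumed markers for the three interpretations (illustrative notation only).
MARKERS = {"lost": "?", "attribute_concept": "-", "do_not_care": "*"}


def missing_patterns(n_rows, n_attrs, step_percent=5, seed=0):
    """Return (percent, positions) pairs with nested sets of missing cells.

    Every level extends the previous one, so all levels share one pattern.
    Generation stops before the first level at which some row would consist
    only of missing values (one possible reading of the stopping rule).
    """
    rng = random.Random(seed)
    # Only attribute columns are listed here; a decision column is left out.
    cells = [(r, c) for r in range(n_rows) for c in range(n_attrs)]
    rng.shuffle(cells)
    patterns = []
    for percent in range(step_percent, 101, step_percent):
        count = percent * len(cells) // 100
        chosen = set(cells[:count])
        missing_per_row = [0] * n_rows
        for r, _ in chosen:
            missing_per_row[r] += 1
        if n_attrs in missing_per_row:
            break  # some row would be entirely missing
        patterns.append((percent, chosen))
    return patterns


def apply_pattern(table, positions, marker):
    """Return a copy of table with the given cells replaced by marker."""
    return [
        [marker if (r, c) in positions else value for c, value in enumerate(row)]
        for r, row in enumerate(table)
    ]


# Hypothetical complete table: 10 cases, 4 attributes.
table = [["a", "b", "c", "d"] for _ in range(10)]
patterns = missing_patterns(len(table), len(table[0]))
first_percent, first_positions = patterns[0]
# The same positions are reused for each of the three interpretations.
variants = {
    name: apply_pattern(table, first_positions, marker)
    for name, marker in MARKERS.items()
}
```

Because the three variants differ only in the marker written into the same cells, any difference in results between them cannot come from the placement of the missing values, which is the source of variance the abstract says the new experiments were designed to remove.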
Year
Pages
469--486
Physical description
Bibliography: 28 items, charts.
Authors
author
author
  • Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA and Institute of Computer Science, Polish Academy of Sciences, 01-237 Warsaw, Poland
Bibliography
  • CHAN, C.C. and GRZYMALA-BUSSE, J.W. (1991) On the attribute redundancy and the learning programs IDS, PRISM, and LEM2. Technical report. Department of Computer Science, University of Kansas.
  • CHMIELEWSKI, M.R. and GRZYMALA-BUSSE, J.W. (1996) Global discretization of continuous attributes as preprocessing for machine learning. International Journal of Approximate Reasoning 15, 319-331.
  • DARDZINSKA, A. and RAS, Z.W. (2005) CHASE-2: Rule based chase algorithm for information systems of type lambda. In: Proceedings of the Second International Workshop on Active Mining (AM’2003), 258-270.
  • GRZYMALA-BUSSE, J.W. (1992) LERS - A system for learning from examples based on rough sets. In: R. Slowinski, ed., Intelligent Decision Support. Handbook of Applications and Advances of the Rough Set Theory. Kluwer Academic Publishers, Dordrecht, Boston, London, 3-18.
  • GRZYMALA-BUSSE, J.W. (2002) MLEM2: A new algorithm for rule induction from imperfect data. In: Proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, (IPMU 2002), 243-250.
  • GRZYMALA-BUSSE, J.W. (2003) Rough set strategies to data with missing attribute values. In: Workshop Notes, Foundations and New Directions of Data Mining, in conjunction with the 3-rd International Conference on Data Mining, 56-63.
  • GRZYMALA-BUSSE, J.W. (2004a) Characteristic relations for incomplete data: A generalization of the indiscernibility relation. In: Proceedings of the Fourth International Conference on Rough Sets and Current Trends in Computing, 244-253.
  • GRZYMALA-BUSSE, J.W. (2004b) Data with missing attribute values: Generalization of indiscernibility relation and rule induction. Transactions on Rough Sets, 1, 78-95.
  • GRZYMALA-BUSSE, J.W. (2004c) Three approaches to missing attribute values - A rough set perspective. In: Proceedings of the Workshop on Foundation of Data Mining, associated with the Fourth IEEE International Conference on Data Mining, 55-62.
  • GRZYMALA-BUSSE, J.W. and GRZYMALA-BUSSE, W.J. (2007) An experimental comparison of three rough set approaches to missing attribute values. In: J.F. Peters and A. Skowron, eds. Springer-Verlag, Berlin, Heidelberg, 31-50.
  • GRZYMALA-BUSSE, J.W., GRZYMALA-BUSSE, W.J., HIPPE, Z.S. and RZĄSA, W. (2008) An improved comparison of three rough set approaches to missing attribute values. In: Proceedings of the 16-th International Conference on Intelligent Information Systems, 141-150.
  • GRZYMALA-BUSSE, J.W. and HU, M. (2000) A comparison of several approaches to missing attribute values in data mining. In: Proceedings of the Second International Conference on Rough Sets and Current Trends in Computing, 340-347.
  • GRZYMALA-BUSSE, J.W. and WANG, A.Y. (1997) Modified algorithms LEM1 and LEM2 for rule induction from data with missing attribute values. In: Proceedings of the Fifth International Workshop on Rough Sets and Soft Computing (RSSC’97) at the Third Joint Conference on Information Sciences (JCIS’97), 69-72.
  • GRZYMALA-BUSSE, J.W. (1991) On the unknown attribute values in learning from examples. In: Proceedings of the ISMIS-91, 6th International Symposium on Methodologies for Intelligent Systems, 368-377.
  • HONG, T.P., TSENG, L.H. and CHIEN, B.C. (2004) Learning coverage rules from incomplete data based on rough sets. In: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, 3226-3231.
  • KRYSZKIEWICZ, M. (1995) Rough set approach to incomplete information systems. In: Proceedings of the Second Annual Joint Conference on Information Sciences, 194-197.
  • KRYSZKIEWICZ, M. (1999) Rules in incomplete information systems. Information Sciences 113, 271-292.
  • LIN, T.Y. (1992) Topological and fuzzy rough sets. In: R. Slowinski, ed. Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory. Kluwer Academic Publishers, Dordrecht, Boston, London, 287-304.
  • LITTLE, R. J. A. and RUBIN, D.B. (2002) Statistical Analysis with Missing Data. Second Edition. John Wiley & Sons, Hoboken, N.J.
  • NAKATA, M. and SAKAI, H. (2005) Rough sets handling missing values probabilistically interpreted. In: Proceedings of the 10-th International Conference RSFDGrC’2005 on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, Springer-Verlag, Berlin, Heidelberg, 325-334.
  • PAWLAK, Z. (1982) Rough sets. International Journal of Computer and Information Sciences 11, 341-356.
  • PAWLAK, Z. (1991) Rough Sets. Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht, Boston, London.
  • SLOWINSKI, R. and VANDERPOOTEN, D. (2000) A generalized definition of rough approximations based on similarity. IEEE Transactions on Knowledge and Data Engineering 12, 331-336.
  • STEFANOWSKI, J. and TSOUKIAS, A. (1999) On the extension of rough sets under incomplete information. In: Proceedings of the RSFDGrC’1999, 7th International Workshop on New Directions in Rough Sets, Data Mining, and Granular-Soft Computing, 73-81.
  • STEFANOWSKI, J. and TSOUKIAS, A. (2001) Incomplete information tables and rough classification. Computational Intelligence 17, 545-566.
  • WANG, G. (2002) Extension of rough set under incomplete information systems. In: Proceedings of the IEEE International Conference on Fuzzy Systems, 1098-1103.
  • YAO, Y.Y. (1998) Relational interpretations of neighborhood operators and rough set approximation operators. Information Sciences 111, 239-259.
  • YAO, Y.Y. and LIN, T.Y. (1996) Generalization of rough sets using modal logics. Intelligent Automation and Soft Computing 2, 103-119.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BAT5-0055-0013