An Analysis of Probabilistic Approximations for Rule Induction from Incomplete Data Sets

Clark, P. G.; Grzymala-Busse, J. W.; Hippe, Z. S.

doi:10.3233/FI-2014-1049

Article - details

Article title

An Analysis of Probabilistic Approximations for Rule Induction from Incomplete Data Sets

Authors

Clark P. G., Grzymala-Busse J. W., Hippe Z. S.

Selected full texts from this journal

https://fi.episciences.org/

Identifiers

DOI

10.3233/FI-2014-1049

Abstract

The main objective of our research was to test whether probabilistic approximations should be used in rule induction from incomplete data. We designed experiments using six standard data sets. Four of the data sets were incomplete to begin with, and the other two had missing attribute values inserted at random. In all six data sets, we used two interpretations of missing attribute values: lost values and “do not care” conditions. In addition, we used three definitions of approximations: singleton, subset and concept. Among the 36 combinations of data set, type of missing attribute values and type of approximation, for five combinations the error rate (the result of ten-fold cross validation) with probabilistic approximations was smaller than for ordinary (lower and upper) approximations; for four other combinations, the error rate was larger than for ordinary approximations. For the remaining 27 combinations, the difference between these error rates was not statistically significant.
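The abstract names the three approximation types and the two interpretations of missing values without defining them. The sketch below illustrates one common formulation from the rough-set literature, which is assumed here rather than taken from this record: each case x gets a characteristic set K(x) built from attribute-value blocks, and the singleton, subset and concept probabilistic approximations keep cases or characteristic sets for which Pr(X | K(x)) = |X ∩ K(x)| / |K(x)| reaches a threshold alpha. The decision table, the attribute names and all function names are hypothetical, and "?" and "*" are assumed markers for lost values and "do not care" conditions. Under these assumptions, alpha = 1 gives the ordinary lower approximation, and a small positive alpha gives the ordinary upper approximation.

```python
from fractions import Fraction

LOST = "?"        # lost value: the original value is unavailable
DONT_CARE = "*"   # "do not care" condition: any value of the attribute fits

# Hypothetical toy decision table (not from the paper): case -> attribute values.
CASES = {
    1: {"Temperature": "high", "Headache": LOST},
    2: {"Temperature": "very_high", "Headache": "yes"},
    3: {"Temperature": LOST, "Headache": "no"},
    4: {"Temperature": "high", "Headache": "yes"},
    5: {"Temperature": "high", "Headache": LOST},
    6: {"Temperature": "normal", "Headache": "yes"},
    7: {"Temperature": "normal", "Headache": "no"},
    8: {"Temperature": DONT_CARE, "Headache": "yes"},
}
ATTRIBUTES = ["Temperature", "Headache"]
FLU = {1, 2, 4, 8}  # the concept X: cases assumed to have decision "flu = yes"


def attribute_value_blocks(cases, attributes):
    """Block [(a, v)]: cases with value v for a. A lost value joins no block;
    a "do not care" condition joins every block of that attribute."""
    blocks = {}
    for a in attributes:
        specified = {row[a] for row in cases.values()
                     if row[a] not in (LOST, DONT_CARE)}
        for v in specified:
            blocks[(a, v)] = {x for x, row in cases.items()
                              if row[a] == v or row[a] == DONT_CARE}
    return blocks


def characteristic_sets(cases, attributes):
    """K(x): intersection of the blocks [(a, a(x))] over the attributes whose
    value is specified for x; a missing value leaves K(x) unrestricted."""
    blocks = attribute_value_blocks(cases, attributes)
    result = {}
    for x, row in cases.items():
        k = set(cases)
        for a in attributes:
            if row[a] not in (LOST, DONT_CARE):
                k &= blocks[(a, row[a])]
        result[x] = k
    return result


def probabilistic_approximations(cases, attributes, concept, alpha):
    """Singleton, subset and concept approximations of `concept` at `alpha`."""
    k_sets = characteristic_sets(cases, attributes)
    # Cases whose characteristic set is sufficiently contained in the concept.
    qualifying = {x for x, k in k_sets.items()
                  if Fraction(len(concept & k), len(k)) >= alpha}
    return {
        "singleton": set(qualifying),
        "subset": set().union(*(k_sets[x] for x in qualifying)),
        "concept": set().union(*(k_sets[x] for x in qualifying & concept)),
    }


for alpha in (Fraction(1), Fraction(3, 4), Fraction(1, 2)):
    print(alpha, probabilistic_approximations(CASES, ATTRIBUTES, FLU, alpha))
```

Exact fractions are used for the conditional probability so that comparisons against thresholds such as 3/4 are not affected by floating-point rounding. In an experiment like the one described above, each resulting approximation would then be passed to a rule induction algorithm; that step is not sketched here.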

Keywords

probability theory, approximation theory, missing data, data analysis, set theory

Publisher

IOS Press

Journal

Fundamenta Informaticae

Year

2014

Volume

Vol. 132, no. 3

Pages

365–379

Physical description

Bibliography: 29 items; tables, charts.

Contributors

author

Clark P. G.

patrick.g.clark@gmail.com

Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA

author

Grzymala-Busse J. W.

jerzy@ku.edu

Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA

author

Hippe Z. S.

zhippe@wsiz.rzeszow.pl

Department of Expert Systems and Artificial Intelligence, University of Information Technology and Management, 35-225 Rzeszow, Poland

Bibliography

[1] Clark, P. G., Grzymala-Busse, J. W.: Experiments on probabilistic approximations, Proceedings of the 2011 IEEE International Conference on Granular Computing, 2011.
[2] Clark, P. G., Grzymala-Busse, J. W., Hippe, Z. S.: How good are probabilistic approximations for rule induction from data with missing attribute values?, Proceedings of the RSCTC 2012, the 8-th International Conference on Rough Sets and Current Trends in Computing, 2012.
[3] Grzymala-Busse, J. W.: On the unknown attribute values in learning from examples, Proceedings of the ISMIS-91, 6th International Symposium on Methodologies for Intelligent Systems, 1991.
[4] Grzymala-Busse, J. W.: LERS—A system for learning from examples based on rough sets, in: Intelligent Decision Support. Handbook of Applications and Advances of the Rough Set Theory (R. Slowinski, Ed.), Kluwer Academic Publishers, Dordrecht, Boston, London, 1992, 3–18.
[5] Grzymala-Busse, J. W.: MLEM2: A new algorithm for rule induction from imperfect data, Proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, 2002.
[6] Grzymala-Busse, J. W.: Rough set strategies to data with missing attribute values, Workshop Notes, Foundations and New Directions of Data Mining, in conjunction with the 3-rd International Conference on Data Mining, 2003.
[7] Grzymala-Busse, J. W.: Data with missing attribute values: Generalization of indiscernibility relation and rule induction, Transactions on Rough Sets, 1, 2004, 78–95.
[8] Grzymala-Busse, J. W.: Generalized parameterized approximations, Proceedings of the RSKT 2011, the 6-th International Conference on Rough Sets and Knowledge Technology, 2011.
[9] Grzymala-Busse, J. W., Wang, A. Y.: Modified algorithms LEM1 and LEM2 for rule induction from data with missing attribute values, Proceedings of the Fifth International Workshop on Rough Sets and Soft Computing (RSSC’97) at the Third Joint Conference on Information Sciences (JCIS’97), 1997.
[10] Grzymala-Busse, J. W., Ziarko, W.: Data mining based on rough sets, in: Data Mining: Opportunities and Challenges (J. Wang, Ed.), Idea Group Publ., Hershey, PA, 2003, 142–173.
[11] Kryszkiewicz, M.: Rough set approach to incomplete information systems, Proceedings of the Second Annual Joint Conference on Information Sciences, 1995.
[12] Kryszkiewicz, M.: Rules in incomplete information systems, Information Sciences, 113(3-4), 1999, 271–292.
[13] Lin, T. Y.: Neighborhood systems and approximation in database and knowledge base systems, Proceedings of the ISMIS-89, the Fourth International Symposium on Methodologies of Intelligent Systems, 1989.
[14] Lin, T. Y.: Topological and fuzzy rough sets, in: Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory (R. Slowinski, Ed.), Kluwer Academic Publishers, Dordrecht, Boston, London, 1992, 287–304.
[15] Pawlak, Z.: Rough sets, International Journal of Computer and Information Sciences, 11, 1982, 341–356.
[16] Pawlak, Z.: Rough Sets. Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, Dordrecht, Boston, London, 1991.
[17] Pawlak, Z., Skowron, A.: Rough sets: Some extensions, Information Sciences, 177, 2007, 28–40.
[18] Pawlak, Z., Wong, S. K. M., Ziarko, W.: Rough sets: probabilistic versus deterministic approach, International Journal of Man-Machine Studies, 29, 1988, 81–95.
[19] Ślęzak, D., Ziarko, W.: The investigation of the Bayesian rough set model, International Journal of Approximate Reasoning, 40, 2005, 81–91.
[20] Slowinski, R., Vanderpooten, D.: A generalized definition of rough approximations based on similarity, IEEE Transactions on Knowledge and Data Engineering, 12, 2000, 331–336.
[21] Stefanowski, J., Tsoukias, A.: On the extension of rough sets under incomplete information, Proceedings of the RSFDGrC’1999, 7th International Workshop on New Directions in Rough Sets, Data Mining, and Granular-Soft Computing, 1999.
[22] Stefanowski, J., Tsoukias, A.: Incomplete information tables and rough classification, Computational Intelligence, 17(3), 2001, 545–566.
[23] Wong, S. K. M., Ziarko, W.: INFER—An adaptive decision support system based on the probabilistic approximate classification, Proceedings of the 6-th International Workshop on Expert Systems and their Applications, 1986.
[24] Yao, Y. Y.: Relational interpretations of neighborhood operators and rough set approximation operators, Information Sciences, 111, 1998, 239–259.
[25] Yao, Y. Y.: Probabilistic rough set approximations, International Journal of Approximate Reasoning, 49, 2008, 255–271.
[26] Yao, Y. Y., Wong, S. K. M.: A decision theoretic framework for approximate concepts, International Journal of Man-Machine Studies, 37, 1992, 793–809.
[27] Yao, Y. Y., Wong, S. K. M., Lingras, P.: A decision-theoretic rough set model, Proceedings of the 5th International Symposium on Methodologies for Intelligent Systems, 1990.
[28] Ziarko, W.: Variable precision rough set model, Journal of Computer and System Sciences, 46(1), 1993, 39–59.
[29] Ziarko, W.: Probabilistic approach to rough sets, International Journal of Approximate Reasoning, 49, 2008, 272–284.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-b1d8788f-a562-4612-a864-adecbf0b90cf