Article title

Rough Sets for Handling Imbalanced Data: Combining Filtering and Rule-based Classifiers

Publication languages
EN
Abstracts
EN
The paper addresses the problem of improving the performance of rule-based classifiers constructed from imbalanced data sets, i.e., data sets where the minority class, which is of primary importance, is under-represented in comparison to the majority classes. We introduce two techniques that detect and process inconsistent examples from the majority classes located in the boundary between the minority and majority classes. The two techniques differ in how they process these inconsistent boundary examples: the first removes them, while the second relabels them as belonging to the minority class. The experiments showed that the best results were obtained with the filtering technique that reassigns inconsistent majority class examples to the minority class, combined with a classifier composed of decision rules generated by the MODLEM algorithm.
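The two filtering variants described in the abstract can be illustrated with a minimal sketch. It assumes the usual rough-set reading of "inconsistent": examples with identical condition attribute values but different class labels. The function name `filter_boundary`, the `mode` parameter, and the data layout are hypothetical choices for illustration, not the authors' implementation, and the rule induction step (MODLEM) is not shown.

```python
from collections import defaultdict


def filter_boundary(examples, minority, mode="relabel"):
    """Filter inconsistent majority-class examples (illustrative sketch).

    examples: list of (attributes, label), where attributes is a hashable tuple.
    minority: label of the minority class.
    mode: "remove" drops the inconsistent majority examples,
          "relabel" reassigns them to the minority class.
    """
    if mode not in ("remove", "relabel"):
        raise ValueError("mode must be 'remove' or 'relabel'")

    # Assumption: examples are indiscernible when all attribute values match.
    labels_by_description = defaultdict(set)
    for attrs, label in examples:
        labels_by_description[attrs].add(label)

    filtered = []
    for attrs, label in examples:
        # A majority example lies in the boundary if some minority example
        # shares exactly the same attribute values.
        in_boundary = (
            label != minority and minority in labels_by_description[attrs]
        )
        if not in_boundary:
            filtered.append((attrs, label))
        elif mode == "relabel":
            filtered.append((attrs, minority))
        # mode == "remove": the example is simply skipped.
    return filtered


data = [
    ((1, "a"), "min"),
    ((1, "a"), "maj"),  # inconsistent with the minority example above
    ((2, "b"), "maj"),
    ((3, "c"), "min"),
]
removed = filter_boundary(data, minority="min", mode="remove")
relabeled = filter_boundary(data, minority="min", mode="relabel")
```

In this toy data, removal leaves three examples, while relabeling keeps all four and moves the inconsistent majority example to the minority class. The filtered set would then be passed to a rule induction algorithm.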
Pages
379--391
Physical description
tables, bibliography of 27 items
Bibliography
  • [1] Bairagi, R., Suchindran, C. M.: An estimator of the cutoff point maximizing sum of sensitivity and specificity. Sankhya, Series B, Indian Journal of Statistics 51, 1989, 263-269.
  • [2] Batista G., Prati R., Monard M.: A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6 (1), 2004, 20-29.
  • [3] Blake C., Keogh E., Merz C.J.: Repository of Machine Learning, University of California at Irvine 1999 [URL: http://www.ics.uci.edu/~mlearn/MLRepository.html].
  • [4] Chawla N., Bowyer K., Hall L., Kegelmeyer W.: SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 2002, 321-357.
  • [5] Devijver P., Kittler J.: Pattern Recognition: A Statistical Approach, Prentice Hall, 1982.
  • [6] Grzymala-Busse J.W.: LERS - a system for learning from examples based on rough sets. In: Slowinski R. (ed.), Intelligent decision support. Handbook of applications and advances of the rough sets theory, Kluwer Academic Publishers, 1992, 3-18.
  • [7] Grzymala-Busse J.W.: Managing uncertainty in machine learning from examples. In: Proceedings of the 3rd International Symposium in Intelligent Systems, IPI PAN Press, 1994, 70-84.
  • [8] Grzymala-Busse J.W., Goodwin L.K., Zhang X.: Increasing sensitivity of preterm birth by changing rule strengths. In: Proceedings of the Eighth Workshop on Intelligent Information Systems (IIS'99), 1999, 127-136.
  • [9] Grzymala-Busse J.W., Goodwin L.K., Grzymala-Busse W.J., Zheng X.: An approach to imbalanced data sets based on changing rule strength. In: Proceedings. Learning from Imbalanced Data Sets, AAAI Workshop at the 17th Conference on AI, AAAI-2000, 2000, 69-74.
  • [10] Grzymala-Busse J.W., Stefanowski J., Wilk Sz.: A comparison of two approaches to data mining from imbalanced data. In: Proceedings of the KES 2004, 8th International Conference on Knowledge-based Intelligent Information & Engineering Systems, Springer LNCS vol. 3213, 2004, 757-763. An extended version of this study has been published in Journal of Intelligent Manufacturing, 16 (6), December 2005, 565-575.
  • [11] Holte R.C., Acker L.E., Porter B.: Concept learning and the problem of small disjuncts. In: Proceedings of the 11th International Joint Conference on Artificial Intelligence, 1989, 813-819.
  • [12] Japkowicz, N.: Learning from imbalanced data sets: a comparison of various strategies. In: Proceedings. Learning from Imbalanced Data Sets, AAAI Workshop at the 17th Conference on AI, AAAI-2000, 2000, 10-17.
  • [13] Komorowski J., Pawlak Z., Polkowski L., Skowron A.: Rough Sets: tutorial. In: Pal S.K., Skowron A. (eds.), Rough fuzzy hybridization. A new trend in decision-making, Springer-Verlag, 1999, 3-98.
  • [14] Kubat M., Matwin S.: Addressing the curse of imbalanced training sets: one-sided selection. In: Proceedings of 14th International Conference on Machine Learning ICML 97, 1997, 179-186.
  • [15] Laurikkala J.: Improving identification of difficult small classes by balancing class distribution, Technical Report A-2001-2, University of Tampere, 2001.
  • [16] Lewis D., Catlett J.: Heterogeneous uncertainty sampling for supervised learning. In: Proceedings of the 11th International Conference on Machine Learning, 1994, 148-156.
  • [17] Pawlak Z.: Rough sets. Theoretical aspects of reasoning about Data, Kluwer Academic Publishers, 1991.
  • [18] Riddle P., Segal R., Etzioni O.: Representation design and brute-force induction in a Boeing manufacturing domain. Applied Artificial Intelligence Journal, 8, 1994, 125-147.
  • [19] Slowinski K., Stefanowski J., Siwinski D.: Application of rule induction and rough sets to verification of magnetic resonance diagnosis, Fundamenta Informaticae, 53 (3/4), 2002, 345-363.
  • [20] Stefanowski J.: Classification support based on the rough sets. Foundations of Computing and Decision Sciences, 18 (3-4), 1993, 371-380.
  • [21] Stefanowski J.: The rough set based rule induction technique for classification problems. In: Proceedings of 6th European Conference on Intelligent Techniques and Soft Computing EUFIT'98, 1998, 109-113.
  • [22] Stefanowski J., Borkiewicz R.: Interactive rule discovery of decision rules, In: Proceedings of VIIIth Intelligent Information Systems, Ustron 14-18 June 1999, IPI PAN Press, Warszawa, 112-116.
  • [23] Stefanowski J., Wilk Sz.: Evaluating business credit risk by means of approach integrating decision rules and case based learning. International Journal of Intelligent Systems in Accounting, Finance and Management, 10, 2001, 97-114.
  • [24] Stefanowski J., Vanderpooten D.: Induction of decision rules in classification and discovery-oriented perspectives. International Journal of Intelligent Systems, 16, 2001, 13-28.
  • [25] Tomek I.: Two modifications of CNN. IEEE Transactions on Systems, Man, and Cybernetics, SMC-6, 1976, 769-772.
  • [26] Weiss G.M.: Mining with rarity: a unifying framework. ACM SIGKDD Explorations Newsletter, 6 (1), 2004, 7-19.
  • [27] Wilk S., Slowinski R., Michalowski W., Greco S.: Supporting triage of children with abdominal pain in the emergency room. European Journal of Operational Research, 160 (3), 2004, 696-709.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BUS2-0010-0076