Investigating Accuracies of Classifications for Randomized Imbalanced Class Distributions

Abe, H.; Tsumoto, S.

Article details

Article title

Investigating Accuracies of Classifications for Randomized Imbalanced Class Distributions

Authors

Abe H., Tsumoto S.

Selected full texts from this journal

https://fi.episciences.org/

Abstract

In data mining post-processing, rule selection with objective rule evaluation indices is one of the useful methods for extracting valuable knowledge from mined patterns. However, the relationship between an index value and experts’ criteria has never been clarified. To determine this relationship, we have developed a method that obtains learning models from a dataset consisting of objective rule evaluation indices and evaluation labels for rules. In this study, we compared the accuracies of classification learning algorithms on datasets with randomized class labels. The results show that, without any criterion from a human expert, the accuracies of classification learning algorithms cannot outperform the percentage of the majority class on either balanced or imbalanced class distribution datasets. Based on this result, we can determine whether or not a labeled rule set contains some criteria by using the dataset consisting of the objective rule evaluation indices.
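The comparison described in the abstract can be illustrated with a small sketch. It is not the authors' experimental setup: the index values are random numbers, the 1-nearest-neighbour classifier and the single holdout split are stand-ins chosen only to keep the example dependency-free, and all function names are hypothetical.

```python
import random
from collections import Counter


def majority_share(labels):
    """Fraction of examples that belong to the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def randomized_labels(n, positive_share, rng):
    """Assign labels at random, independent of any feature values."""
    n_pos = round(n * positive_share)
    labels = ["interesting"] * n_pos + ["not_interesting"] * (n - n_pos)
    rng.shuffle(labels)
    return labels


def one_nn_predict(train_x, train_y, x):
    """Predict the label of the closest training example (squared Euclidean)."""
    nearest = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[nearest]


def holdout_accuracy(features, labels, train_fraction=0.7):
    """Train on the first part of the data and score on the remainder."""
    split = int(len(features) * train_fraction)
    train_x, test_x = features[:split], features[split:]
    train_y, test_y = labels[:split], labels[split:]
    hits = sum(
        one_nn_predict(train_x, train_y, x) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


rng = random.Random(0)
# Assumption: five made-up "index values" per rule, drawn uniformly at random.
features = [[rng.random() for _ in range(5)] for _ in range(400)]

for share in (0.5, 0.3, 0.1):  # balanced to strongly imbalanced
    labels = randomized_labels(len(features), share, rng)
    print(
        f"minority share {share:.1f}: "
        f"majority baseline {majority_share(labels):.3f}, "
        f"1-NN holdout accuracy {holdout_accuracy(features, labels):.3f}"
    )
```

Because the labels here carry no criterion at all, a classifier has nothing to learn from the features, so its accuracy is expected to stay at or below the majority-class percentage apart from sampling noise. A labeled rule set on which learned models clearly exceed that baseline would, under the abstract's reasoning, suggest that the labels do reflect some criterion.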

Keywords

data mining postprocessing rule evaluation learning model

Publisher

IOS Press

Journal

Fundamenta Informaticae

Year

2009

Volume

Vol. 90, no. 4

Pages

369-378

Physical description

30 bibliographic references, tables, charts

Contributors

author

Abe H.

author

Tsumoto S.

Shimane University, 89-1 Enya-cho, Izumo, Shimane, 6938501, Japan, abe@med.shimane-u.ac.jp

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BUS8-0004-0024