Article title

Feature Extraction for Dynamic Integration of Classifiers

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Recent research has shown the integration of multiple classifiers to be one of the most important directions in machine learning and data mining. In this paper, we present an algorithm for the dynamic integration of classifiers in the space of extracted features (FEDIC). It is based on the technique of dynamic integration, in which local accuracy estimates are calculated for each base classifier of an ensemble in the neighborhood of a new instance to be processed. Generally, the whole space of original features is used to find the neighborhood of a new instance for local accuracy estimation in dynamic integration. However, when dynamic integration takes place in high dimensions, the search for the neighborhood of a new instance is problematic, since the majority of the space is empty and neighbors can in fact be located far from each other. Furthermore, when noisy or irrelevant features are present, it is likely that irrelevant neighbors will also be associated with a test instance. In this paper, we propose to use feature extraction in order to cope with the curse of dimensionality in the dynamic integration of classifiers. We consider classical principal component analysis and two eigenvector-based class-conditional feature extraction methods that take class information into account. Experimental results show that, on some data sets, the use of FEDIC leads to significantly higher ensemble accuracies than the use of plain dynamic integration in the space of original features.
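The abstract describes the approach only at a high level. The following is a minimal, illustrative sketch (not the authors' implementation) of one dynamic-integration variant, dynamic selection, with the neighborhood of a new instance searched for in a PCA-extracted space. It assumes scikit-learn; the function and parameter names (fedic_fit, fedic_predict, k_neighbors, the choice of base classifiers, cross-validated correctness estimates) are assumptions made for illustration, not taken from the paper.

```python
# Illustrative sketch of dynamic classifier selection with the neighborhood found
# in a PCA-extracted feature space (in the spirit of FEDIC, not the authors' code).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier


def fedic_fit(X_train, y_train, base_classifiers, n_components=5, k_neighbors=7):
    """Fit base classifiers and build the meta-level data for dynamic integration."""
    # Feature extraction: the neighborhood of a new instance is later searched for
    # in this reduced space (plain PCA here; class-conditional extraction would differ).
    pca = PCA(n_components=n_components).fit(X_train)
    nn = NearestNeighbors(n_neighbors=k_neighbors).fit(pca.transform(X_train))

    # Per-instance correctness of each base classifier, estimated from out-of-fold
    # predictions so the local accuracy estimates are not optimistically biased.
    correctness = np.zeros((len(base_classifiers), len(y_train)))
    for i, clf in enumerate(base_classifiers):
        oof_pred = cross_val_predict(clf, X_train, y_train, cv=5)
        correctness[i] = (oof_pred == y_train).astype(float)
        clf.fit(X_train, y_train)  # final fit on the full training set
    return pca, nn, correctness


def fedic_predict(X_new, base_classifiers, pca, nn, correctness):
    """Dynamic selection: per instance, use the classifier that is most accurate locally."""
    _, neighbor_idx = nn.kneighbors(pca.transform(X_new))
    all_preds = np.array([clf.predict(X_new) for clf in base_classifiers])
    y_pred = np.empty(len(X_new), dtype=all_preds.dtype)
    for j in range(len(X_new)):
        # Local accuracy of each classifier over the neighbors found in PCA space.
        local_acc = correctness[:, neighbor_idx[j]].mean(axis=1)
        y_pred[j] = all_preds[np.argmax(local_acc), j]
    return y_pred


if __name__ == "__main__":
    X, y = make_classification(n_samples=600, n_features=30, n_informative=8,
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    base = [DecisionTreeClassifier(random_state=0), GaussianNB(),
            LogisticRegression(max_iter=1000)]
    pca, nn, corr = fedic_fit(X_tr, y_tr, base)
    y_hat = fedic_predict(X_te, base, pca, nn, corr)
    print("test accuracy:", (y_hat == y_te).mean())
```

In the sketch, the class-conditional eigenvector-based extraction methods mentioned in the abstract would replace the plain PCA step, and other dynamic-integration schemes (for example, weighting the base classifiers' votes by their local accuracies instead of taking the arg-max) would replace the per-instance selection.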
Publisher
Year
Pages
243-275
Physical description
Bibliography: 51 items, tables, charts
Authors
author
author
author
  • Information Systems Group, Department of Computing Science, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands, m.pechenizky@tue.nl
Bibliography
  • [1] Aivazyan, S.: Applied statistics: classification and dimension reduction, Finance and Statistics, Moscow, 1989.
  • [2] Aladjem, M.: Parametric and nonparametric linear mappings of multidimensional data, Pattern Recognition, 24(6), 1991, 543-553, ISSN 0031-3203.
  • [3] Bauer, E., Kohavi, R.: An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants, Machine Learning, 36(1-2), 1999, 105-139, ISSN 0885-6125.
  • [4] Bellman, R.: Adaptive Control Processes: A Guided Tour, Princeton University Press, 1961.
  • [5] Blake, C., Keogh, E., Merz, C.: UCI repository of machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html], 1998.
  • [6] Brodley, C., Lane, T.: Creating and exploiting coverage and diversity, AAAI-96 Workshop on Integrating Multiple Learned Models for Improving and Scaling Machine Learning Algorithms, Portland, OR, 1996, 8-14.
  • [7] Chan, P.: An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning, Ph.D. Thesis, Department of Computer Science, Columbia University, New York, 1996.
  • [8] Chan, P. K., Stolfo, S. J.: Sharing Learned Models among Remote Database Partitions by Local Meta-Learning., KDD, 1996, 2-7.
  • [9] Chan, P. K., Stolfo, S. J.: On the Accuracy of Meta-Learning for Scalable Data Mining, Journal Intelligent Information Systems, 8(1), 1997, 5-28.
  • [10] Cordella, L. P., Foggia, P., Sansone, C., Vento, M.: Reliability Parameters to Improve Combination Strategies in Multi-Expert Systems, Pattern Analysis and Applications, 2(3), 1999, 205-214.
  • [11] Cost, R. S., Salzberg, S.: A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features, Machine Learning, 10, 1993, 57-78.
  • [12] Dietterich, T. G.: Machine-Learning Research: four current directions, AI Magazine, 18(4), 1997, 97-136.
  • [13] Dietterich, T. G., Bakiri, G.: Solving Multiclass Learning Problems via Error-Correcting Output Codes, Journal of Artificial Intelligence Research (JAIR), 2, 1995, 263-286.
  • [14] Fayyad, U. M.: Data Mining and Knowledge Discovery: Making Sense Out of Data, IEEE Expert: Intelligent Systems and Their Applications, 11(5), 1996, 20-25, ISSN 0885-9000.
  • [15] Fukunaga, K.: Introduction to Statistical Pattern Recognition, Academic Press, London, 1990.
  • [16] Gama, J.: Combining classification algorithms, Ph.D. Thesis, Dept. of Computer Science, University of Porto, 1999.
  • [17] Giacinto, G., Roli, F.: Methods for dynamic classifier selection, ICIAP '99, 10th International Conference on Image Analysis and Processing, IEEE CS Press, 1999, 659-664.
  • [18] Hao, H., Liu, C.-L., Sako, H.: Comparison of Genetic Algorithm and Sequential Search Methods for Classifier Subset Selection, ICDAR, 2003, 765-769.
  • [19] Hastie, T., Tibshirani, R.: Discriminant Adaptive Nearest Neighbor Classification, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(6), 1996, 607-616.
  • [20] Heath, D., Kasif, S., Salzberg, S.: Cognitive Technology: in Search of a Humane Interface, chapter Committees of decision trees, Elsevier Science, 1996, 305-317.
  • [21] Jolliffe, I.: Principal Component Analysis, New York: Springer, 1986.
  • [22] Kambhatla, N., Leen, T. K.: Dimension Reduction by Local Principal Component Analysis, Neural Computation, 9(7), 1997, 1493-1516.
  • [23] Kohavi, R.: Wrappers for performance enhancement and oblivious decision graphs, Ph.D. thesis, Dept. of Computer Science, Stanford University, Stanford, USA, 1995.
  • [24] Kohavi, R., Sommerfield, D., Dougherty, J.: Data Mining Using MLC++, a Machine Learning Library in C++, Tools with Artificial Intelligence, 1996, 234-245.
  • [25] Koppel, M., Engelson, S.: Integrating multiple classifiers by finding their areas of expertise, AAAI-96 Workshop On Integrating Multiple Learning Models for Improving and Scaling Machine Learning Algorithms, Portland, OR, 1996, 53-58.
  • [26] Liu, H.: Feature Extraction, Construction and Selection: A Data Mining Perspective, Kluwer Academic Publishers, 1998.
  • [27] Merz, C.: Learning from data, artificial intelligence and statistics, chapter Dynamical selection of learning algorithms, New York: Springer, 1996.
  • [28] Merz, C.: Classification and regression by combining models, Ph.D. Thesis, Dept. of Information and Computer Science, University of California, Irvine, USA, 1998.
  • [29] Merz, C. J.: Using Correspondence Analysis to Combine Classifiers, Machine Learning, 36(1-2), 1999, 33-58.
  • [30] Opitz, D. W.: Feature Selection for Ensembles, AAAI/IAAI, 1999, 379-384.
  • [31] Opitz, D. W., Shavlik, J. W.: Generating Accurate and Diverse Members of a Neural-Network Ensemble, NIPS, 1995, 535-541.
  • [32] Oza, N., Tumer, K.: Dimensionality reduction through classifier ensembles, Technical Report NASA-ARC-IC-1999-124, Computational Sciences Division, NASA Ames Research Center, Moffett Field, CA, 1999.
  • [33] Pechenizkiy, M., Tsymbal, A., Puuronen, S.: Supervised Learning and Local Dimensionality Reduction within Natural Clusters: Biomedical Data Analysis, IEEE Transactions on Information Technology in Biomedicine, Special Post-conference Issue "Mining Biomedical Data", 10(3), 2006, 533-539.
  • [34] Puuronen, S., Terziyan, V. Y., Tsymbal, A.: A Dynamic Integration Algorithm for an Ensemble of Classifiers, ISMIS, 1999, 592-600.
  • [35] Puuronen, S., Tsymbal, A.: Local Feature Selection with Dynamic Integration of Classifiers, Fundamenta Informaticae, Special Issue "Intelligent Information Systems", 47(1-2), 2001, 91-117.
  • [36] Quinlan, J. R.: C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
  • [37] Quinlan, J. R.: Bagging, Boosting, and C4.5, AAAI/IAAI, Vol. 1, 1996, 725-730.
  • [38] Schaffer, C.: Selecting a Classification Method by Cross-Validation, Machine Learning, 13, 1993, 135-143.
  • [39] Skalak, D.: Combining Nearest Neighbor Classifiers, Ph.D. Thesis, Dept. of Computer Science, University of Massachusetts, Amherst MA, 1997.
  • [40] Todorovski, L., Dzeroski, S.: Combining Multiple Models with Meta Decision Trees, PKDD, 2000, 54-64.
  • [41] Tsymbal, A.: Dynamic Integration of Data Mining Methods in Knowledge Discovery Systems, Ph.D. thesis, University of Jyväskylä, Finland, 2002.
  • [42] Tsymbal, A., Pechenizkiy, M., Cunningham, P.: Diversity in search strategies for ensemble feature selection, Information Fusion, 6(1), 2005, 83-98.
  • [43] Tsymbal, A., Puuronen, S., Patterson, D. W.: Ensemble feature selection with the simple Bayesian classification, Information Fusion, 4(2), 2003, 87-100.
  • [44] Tsymbal, A., Puuronen, S., Pechenizkiy, M., Baumgarten, M., Patterson, D. W.: Eigenvector-Based Feature Extraction for Classification, FLAIRS Conference, 2002, 354-358.
  • [45] Tsymbal, A., Puuronen, S., Skrypnyk, I.: Ensemble feature selection with dynamic integration of classifiers, Int. Congress on Comp. Intelligence Methods and Applications CIMA2001, 2001.
  • [46] Vijayakumar, S., Schaal, S.: Local Dimensionality Reduction for Locally Weighted Learning, IEEE International Symposium on Computational Intelligence in Robotics and Automation, (CIRA'97), Monterey, California, 1997, 220-225.
  • [47] Webb, G. I.: MultiBoosting: A Technique for Combining Boosting and Wagging, Machine Learning, 40(2), 2000, 159-196.
  • [48] Dillon, W. R., Goldstein, M.: Multivariate Analysis: Methods and Applications, John Wiley & Sons, 1984.
  • [49] Wilson, D. R., Martinez, T. R.: Improved Heterogeneous Distance Functions, Journal of Artificial Intelligence Research (JAIR), 6, 1997, 1-34.
  • [50] Woods, K., Kegelmeyer, W. P., Bowyer, K. W.: Combination of Multiple Classifiers Using Local Accuracy Estimates, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(4), 1997, 405-410.
  • [51] Wroblewski, J.: Ensembles of Classifiers Based on Approximate Reducts, Fundamenta Informaticae, 47(3-4), 2001, 351-360.
Document type
YADDA identifier
bwmeta1.element.baztech-article-BUS5-0010-0010