Identifiers
Title variants
Multiple classifier error probability for multi-class problems
Publication languages
Abstracts
In this paper we consider majority voting in multiple classifier systems in the case of two-valued decision support for the multi-class problem. Using an explicit representation of the classification error probability of binomial ensemble voting for the two-class problem, we obtain a general equation for the classification error probability in the case under consideration. We thus extend the theoretical analysis of this subject, initially performed for the two-class problem by Hansen and Salamon and still used by Kuncheva and other researchers. This allows us to observe an important dependence of the maximal posterior error probability of a base classifier allowable for building multiple classifiers on the number of considered classes. This indicates the possibility of improving the performance of multiple classifiers for multi-class problems, which may have important implications for their future applications in many fields of science and industry, including the problems of machine diagnostics and system reliability testing.
(Abstract in Polish, translated:) In this article we consider majority-voting multiple classifier systems for multi-class problems that use multi-valued base classifiers. Using a direct representation of the misclassification probability of analogous systems for two-class problems, we obtain a general formula for the classification error probability in the multi-class case. We thereby extend the theoretical analyses of this subject, originally carried out for two-class problems by Hansen and Salamon and still used by Kuncheva and other researchers. This allows us to observe an important dependence of the maximal allowable level of base-classifier error probability on the number of classes under consideration. This indicates the possibility of improving the parameters of multiple classifiers for multi-class problems, which may be of considerable importance for their further applications in numerous fields of science and industry, including the problems of machine diagnostics and system reliability testing.
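The two-class binomial voting result that the abstract builds on (due to Hansen and Salamon, ref. 15) can be sketched as follows. This is a minimal illustration, not code from the paper; the function name and the example ensemble sizes are assumptions made here. It assumes independent base classifiers that each err with probability p, with the ensemble wrong exactly when a majority of members is wrong.

```python
from math import comb

def majority_vote_error(p: float, n: int) -> float:
    """Probability that the majority vote of n independent two-class base
    classifiers is wrong, when each base classifier errs with probability p.

    The ensemble errs when more than n/2 members err, so the error is the
    upper tail of a Binomial(n, p) distribution (n odd avoids ties)."""
    assert n % 2 == 1, "use an odd ensemble size so a strict majority exists"
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))
```

For p < 0.5 the ensemble error falls below the base error and shrinks as n grows (e.g. `majority_vote_error(0.3, 11)` is well under 0.3), while for p > 0.5 voting makes things worse; the paper's contribution is generalizing this kind of bound to the multi-class case, where the allowable base-classifier error depends on the number of classes.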
Journal
Year
Volume
Pages
12--16
Physical description
Bibliography: 36 items
Authors
author
author
- Instytut Informatyki, Politechnika Wrocławska, ul. Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Polska, Maciej.Huk@pwr.wroc.pl
Bibliography
- 1. Ali K, Pazzani M. Error reduction through learning multiple descriptions. Machine Learning 1996; 24(3): 173-206.
- 2. Bian S, Wang W. On diversity and accuracy of homogeneous and heterogeneous ensembles. IOS Press Amsterdam: 2007, 4(2): 103-128.
- 3. Brown G, Wyatt J, Harris R, Yao X. Diversity creation methods: A survey and categorization. Journal of Information Fusion 2005; 6(1).
- 4. Bruzzone L, Cossu R, Vernazza G. Detection of land-cover transitions by combining multidate classifiers. IOS Press Amsterdam: 2007, 25(13): 1491-1500.
- 5. Buhlmann P, Hothorn T. Boosting algorithms: Regularization, Prediction and Model Fitting. Statistical Science 2007; 22(4): 477-505.
- 6. Claeskens G, Hjort N. Model Selection and Model Averaging. Volume 27 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press 2008.
- 7. Dietterich T. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning 2000; 40(2): 139-157.
- 8. Dietterich T. Ensemble learning. In: Arbib M (Ed.), The Handbook of Brain Theory and Neural Networks, 2nd ed. Cambridge: 2002, 405-408.
- 9. Elovici Y, Shapira B, Kantor P. A decision theoretic approach to combining information filters: Analytical and empirical evaluation. Journal of the American Society for Information Science and Technology 2006; 57(3): 306-320.
- 10. Evgeniou T, Pontil M, Elisseef A. Leave one out error, stability, and generalization of voting combinations of classifiers. Machine Learning 2004; 55(1): 71-97.
- 11. Freund Y, Lu J, Schapire R. Boosting: Models, Applications and Extensions. Chapman and Hall/CRC 2010.
- 12. Freund Y, Schapire R. Experiments with a new boosting algorithm. Machine Learning: Proceedings of the Thirteenth International Conference (ICML 96). San Francisco: 1996, 148-156.
- 13. Fumera G, Roli F. A theoretical and experimental analysis of linear combiners for multiple classifier systems. IEEE Transactions on Pattern Analysis and Machine Intelligence 2005; 27(6): 942-956.
- 14. Halbiniak Z, Jóźwiak I. Deterministic chaos in the processor load. Chaos, Solitons and Fractals 2007; 31(2): 409-416.
- 15. Hansen L, Salamon P. Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence 1990; 12(10): 993-1001.
- 16. Jacak J, Jóźwiak I, Jacak L. Composite fermions in braid group terms. Open Systems and Information Dynamics 2010; 17(1): 53-71.
- 17. Jahrer M, Tscher A, Legenstein R. Combining predictions for accurate recommender systems. Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA: 2010, 693-702.
- 18. Kuncheva L. Combining Pattern Classifiers: Methods and Algorithms. Wiley 2004.
- 19. Kuncheva L, Bezdek J, Sutton M. On combining multiple classifiers by fuzzy templates. Proceedings of the conference, Pensacola, Florida, USA: 1990, 193-197.
- 20. Kuncheva L, Whitaker C, Shipp C, Duin R. Limits on the majority vote accuracy in classifier fusion. Pattern Analysis and Applications 2003; 6: 22-31.
- 21. Leapa N, Clemansa P, Bauer K, Oxley M. An investigation of the effects of correlation and autocorrelation on classifier fusion and optimal classifier ensembles, International Journal of General Systems 2008; 37(4): 475-498.
- 22. Leigh W, Purvis R, Ragusa J. Forecasting the nyse composite index with technical analysis, pattern recognizer, neural networks, and genetic algorithm: a case study in romantic decision support. Decision Support Systems 2002; 32(4): 361-377.
- 23. Liu Y, Yao X, Higuchi T. Evolutionary ensembles with negative correlation learning. IEEE Transactions on Evolutionary Computation 2000; 4(4): 380-387.
- 24. Mangiameli P, West D, Rampal R. Model selection for medical diagnosis decision support systems, Decision Support Systems 2004; 36(3): 247-259.
- 25. Menahem E, Shabtai A, Rokach L, Elovici Y. Improving malware detection by applying multi-inducer ensemble. Computational Statistics and Data Analysis 2009; 53(4): 1483-1494.
- 26. Niewczas A, Pieniak D, Bachanek T, Surowska B, Bieniaś J, Pałka K. Prognosing of functional degradation of bio-mechanical systems exemplified by the tooth-composite filling system. Eksploatacja i Niezawodnosc - Maintenance and Reliability 2010; 45(1): 23-34.
- 27. Opitz D, Shavlik J. Generating accurate and diverse members of a neural-network ensemble, Advances in Neural Information Processing Systems. MIT Press, Denver: 1996, 535-543.
- 28. Rokach L. Mining manufacturing data using genetic algorithm-based feature set decomposition. International Journal of Intelligent Systems Technologies and Applications 2008; 4(1): 57-78.
- 29. Rokach L. Taxonomy for characterizing ensemble methods in classification tasks: a review and annotated bibliography. Computational Statistics and Data Analysis 2009; 53(12): 4046-4072.
- 30. Santhanam P, Bassin K. Managing the maintenance of ported, outsourced, and legacy software via orthogonal defect classification. In Proc. IEEE International Conference on Software Maintenance 2001; 726-734.
- 31. Shahrtash S, Jamehbozorg A. A Decision-Tree-Based Method for Fault Classification in Single-Circuit Transmission Lines. IEEE Transactions on Power Delivery 2010; 25(4): 2190-2196.
- 32. Tan A, Gilbert D, Deville Y. Multi-class protein fold classification using a new ensemble machine learning approach. Genome Informatics 2003; 14: 206-217.
- 33. Tao J, Zhang Y, Chen X, Ming Z. Bayesian reliability growth model based on dynamic distribution parameters. Eksploatacja i Niezawodnosc - Maintenance and Reliability 2010; 46(2): 13-16.
- 34. Valentini G, Masulli F. Ensembles of learning machines. In: M. M. and T. R. (Eds), Neural Nets: 13th Italian Workshop on Neural Nets. Vol. 2486 of Lecture Notes in Computer Science, Springer, Berlin: 2002, 3-19.
- 35. Xu D, Wu M, An J. Design of an expert system based on neural network ensembles for missile fault diagnosis. In Proc. IEEE International Conference on Robotics, Intelligent Systems and Signal Processing 2003, 2: 903-908.
- 36. Yu T, Cui W, Song B, Wang S. Reliability growth estimation for unmanned aerial vehicle during flight-testing phases. Eksploatacja i Niezawodnosc - Maintenance and Reliability 2010; 46(2): 43-47.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BAT1-0039-0045