Article title

Rough Set Based Ensemble Classifier for Web Page Classification

Publication languages
EN
Abstracts
EN
Combining the results of several individually trained classification systems to obtain a more accurate classifier is a widely used technique in pattern recognition. In this article, we introduce a rough set based meta classifier for classifying web pages. The proposed method consists of two parts. In the first part, the outputs of the individual classifiers are used to construct a decision table. In the second part, rough set attribute reduction and rule generation are applied to the decision table to construct the meta classifier. We show that (1) the meta classifier performs better than every constituent classifier and (2) the meta classifier is optimal with respect to a quality measure defined in the article. Experimental studies show that the meta classifier improves classification accuracy uniformly over several benchmark corpora and outperforms other ensemble approaches in accuracy by a decisive margin, which corroborates the theoretical results. In addition, it reduces the CPU load compared with other ensemble classification techniques by removing redundant classifiers from the combination.
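The two-part construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the exhaustive reduct search, the majority-label rule generation, the plurality-vote fallback, all function names and the toy data are assumptions made purely for illustration, and the quality measure defined in the article is not reproduced here.

```python
from collections import Counter, defaultdict
from itertools import combinations


def positive_region_size(rows, labels, attrs):
    """Count rows whose values on `attrs` determine the label unambiguously."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[tuple(row[a] for a in attrs)].append(label)
    return sum(len(g) for g in groups.values() if len(set(g)) == 1)


def find_reduct(rows, labels):
    """Return a smallest attribute subset that preserves the positive region.

    Assumption: exhaustive search, which is only practical for a handful
    of base classifiers.
    """
    all_attrs = tuple(range(len(rows[0])))
    target = positive_region_size(rows, labels, all_attrs)
    for size in range(1, len(all_attrs) + 1):
        for subset in combinations(all_attrs, size):
            if positive_region_size(rows, labels, subset) == target:
                return subset
    return all_attrs


def generate_rules(rows, labels, reduct):
    """Map each observed value combination on the reduct to its majority label."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row[a] for a in reduct)][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in groups.items()}


def classify(rules, reduct, outputs):
    """Apply the rules; fall back to a plurality vote for unseen combinations."""
    key = tuple(outputs[a] for a in reduct)
    if key in rules:
        return rules[key]
    return Counter(key).most_common(1)[0][0]


# Part 1: decision table. Each row holds the outputs of three hypothetical
# base classifiers for one training page; `labels` holds the true classes.
rows = [
    ("sports", "sports", "news"),
    ("sports", "news", "news"),
    ("news", "sports", "sports"),
    ("news", "news", "sports"),
    ("sports", "sports", "sports"),
    ("news", "sports", "news"),
]
labels = ["sports", "news", "news", "news", "sports", "news"]

# Part 2: attribute reduction and rule generation.
reduct = find_reduct(rows, labels)
rules = generate_rules(rows, labels, reduct)
prediction = classify(rules, reduct, ("sports", "sports", "news"))
```

In this toy table the third classifier is dropped from the reduct, which mirrors the abstract's point that redundant classifiers can be removed from the combination so that fewer base classifiers need to be run at prediction time.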
Keywords
Publisher
Year
Pages
171-187
Physical description
Bibliography: 24 items, tables.
Authors
author
author
author
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BUS5-0009-0040