Article title

Multiaspect Text Categorization Problem Solving: a Nearest Neighbours Classifier Based Approaches and Beyond

Publication languages
EN
Abstracts
EN
We deal with the problem of multiaspect text categorization, which calls for the classification of documents with respect to two, in a sense orthogonal, sets of categories. We briefly define the problem, mainly referring to our previous work, and study the application of the k-nearest neighbours algorithm. We propose a new technique meant to enhance the effectiveness of this algorithm when applied to the problem in question. We show some experimental results confirming the usefulness of the proposed approach.
Authors
author
  • Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland
author
  • Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland
author
  • Systems Research Institute, Polish Academy of Sciences, 01-447 Warszawa, ul. Newelska 6, Poland
Bibliography
  • [1] J. Allan, ed., Topic Detection and Tracking: Event-based Information, Kluwer Academic Publishers, 2002.
  • [2] R. Baeza-Yates and B. Ribeiro-Neto, Modern information retrieval, ACM Press and Addison Wesley, 1999.
  • [3] A. Beygelzimer, S. Kakadet, J. Langford, S. Arya, D. Mount, and S. Li. FNN: Fast Nearest Neighbor Search Algorithms and Applications, 2013. R package version 1.1.
  • [4] S. Bird, R. Dale, B. Dorr, B. Gibson, M. Joseph, M.-Y. Kan, D. Lee, B. Powley, D. Radev, and Y. Tan, “The ACL anthology reference corpus: A reference dataset for bibliographic research in computational linguistics”. In: Proc. of Language Resources and Evaluation Conference (LREC 08), Marrakesh, Morocco, 1755–1759.
  • [5] M. Delgado, M. D. Ruiz, D. Sánchez, and M. A. Vila, “Fuzzy quantification: a state of the art”, Fuzzy Sets and Systems, vol. 242, 2014, 1–30, http://dx.doi.org/10.1016/j.fss.2013.10.012.
  • [6] S. A. Dudani, “The distance-weighted k-nearest-neighbor rule”, IEEE Transactions on Systems, Man, and Cybernetics, vol. 6, no. 4, 1976, 325–327, http://dx.doi.org/10.1109/TSMC.1976.5408784.
  • [7] I. Feinerer, K. Hornik, and D. Meyer, “Text mining infrastructure in R”, Journal of Statistical Software, vol. 25, no. 5, 2008, 1–54, http://dx.doi.org/10.18637/jss.v025.i05.
  • [8] A. Feng and J. Allan, “Hierarchical topic detection in tdt-2004”.
  • [9] M. Gajewski, J. Kacprzyk, and S. Zadrożny, “Topic detection and tracking: a focused survey and a new variant”, Informatyka Stosowana, to appear.
  • [10] E. Han, G. Karypis, and V. Kumar, “Text categorization using weight adjusted k-nearest neighbor classification”. In: D. W. Cheung, G. J. Williams, and Q. Li, eds., Knowledge Discovery and Data Mining - PAKDD 2001, 5th Pacific-Asia Conference, Hong Kong, China, April 16-18, 2001, Proceedings, vol. 2035, 2001, 53–65.
  • [11] J. Kacprzyk, J. W. Owsiński, and D. A. Viattchenin, “A new heuristic possibilistic clustering algorithm for feature selection”, Journal of Automation, Mobile Robotics & Intelligent Systems, vol. 8, no. 2, 2014, http://dx.doi.org/10.14313/JAMRIS_2-2014/18.
  • [12] J. Kacprzyk and S. Zadrożny. “Power of linguistic data summaries and their protoforms”. In: C. Kahraman, ed., Computational Intelligence Systems in Industrial Engineering, volume 6 of Atlantis Computational Intelligence Systems, 71–90. Atlantis Press, 2012. http://dx.doi.org/10.2991/978-94-91216-77-0_4.
  • [13] D. Olszewski, J. Kacprzyk, and S. Zadrożny. “Time series visualization using asymmetric self-organizing map”. In: M. Tomassini, A. Antonioni, F. Daolio, and P. Buesser, eds., Adaptive and Natural Computing Algorithms, volume 7824 of Lecture Notes in Computer Science, 40–49. Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37213-1_5.
  • [14] D. Olszewski, J. Kacprzyk, and S. Zadrożny. “Asymmetric k-means clustering of the asymmetric self-organizing map”. In: L. Rutkowski, M. Korytkowski, R. Scherer, R. Tadeusiewicz, L. Zadeh, and J. Zurada, eds., Artificial Intelligence and Soft Computing, volume 8468 of Lecture Notes in Computer Science, 772–783. Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-07176-3_67.
  • [15] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2014.
  • [16] F. Sebastiani, “Machine learning in automated text categorization”, ACM Computing Surveys, vol. 34, no. 1, 2002, 1–47, http://dx.doi.org/10.1145/505282.505283.
  • [17] M. Szymczak, S. Zadrożny, A. Bronselaer, and G. D. Tré, “Coreference detection in an XML schema”, Information Sciences, vol. 296, 2015, 237–262, http://dx.doi.org/10.1016/j.ins.2014.11.002.
  • [18] R. Yager, “Quantifier guided aggregation using OWA operators”, International Journal of Intelligent Systems, vol. 11, 1996, 49–73, http://dx.doi.org/10.1002/(SICI)1098-111X(199601)11:1%3C49::AID-INT3%3E3.0.CO;2-Z.
  • [19] Y. Yang, “An evaluation of statistical approaches to text categorization”, Information Retrieval, vol. 1, no. 1-2, 1999, 69–90, http://dx.doi.org/10.1023/A:1009982220290.
  • [20] Y. Yang, T. Ault, T. Pierce, and C. W. Lattimer, “Improving text categorization methods for event tracking”. In: SIGIR, 2000, 65–72, http://dx.doi.org/10.1145/345508.345550.
  • [21] L. Zadeh, “A computational approach to fuzzy quantifiers in natural languages”, Computers and Mathematics with Applications, vol. 9, 1983, 149–184, http://dx.doi.org/10.1016/0898-1221(83)90013-5.
  • [22] S. Zadrożny, J. Kacprzyk, M. Gajewski, and M. Wysocki, “A novel text classification problem and its solution”, Technical Transaction. Automatic Control, vol. 4-AC, 2013, 7–16.
  • [23] S. Zadrożny, J. Kacprzyk, and M. Gajewski, “A novel approach to sequence-of-documents focused text categorization using the concept of a degree of fuzzy set subsethood”. In: Proceedings of the Annual Conference of the North American Fuzzy Information Processing Society NAFIPS’2015 and 5th World Conference on Soft Computing 2015, Redmond, WA, USA, August 17-19, 2015, 2015.
  • [24] S. Zadrożny, J. Kacprzyk, and M. Gajewski. “A new approach to the multiaspect text categorization by using the support vector machines”. In: G. De Tré, P. Grzegorzewski, J. Kacprzyk, J. W. Owsiński, W. Penczek, and S. Zadrożny, eds., Challenging problems and solutions in intelligent systems, to appear. Springer, Heidelberg New York, 2016.
  • [25] S. Zadrożny, J. Kacprzyk, and M. Gajewski, “A new two-stage approach to the multiaspect text categorization”. In: 2015 IEEE Symposium on Computational Intelligence for Human-like Intelligence, CIHLI 2015, Cape Town, South Africa, December 8-10, 2015, to appear, 2015.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-80c3755d-936b-4188-b288-81c4a01bce20