PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Powiadomienia systemowe
  • Sesja wygasła!
  • Sesja wygasła!
Tytuł artykułu

Efficiently Mining Frequent Embedded Unordered Trees

Autorzy
Wybrane pełne teksty z tego czasopisma
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
Mining frequent trees is very useful in domains like bioinformatics, web mining, mining semi-structured data, and so on. In this paper we introduce SLEUTH, an efficient algorithm for mining frequent, unordered, embedded subtrees in a database of labeled trees. The key contributions of our work are as follows: We give the first algorithm that enumerates all embedded, unordered trees. We propose a new equivalence class extension scheme to generate all candidate trees. We extend the notion of scope-list joins to compute frequency of unordered trees. We conduct performance evaluation on several synthetic and real datasets to show that SLEUTH is an efficient algorithm, which has performance comparable to TreeMiner, that mines only ordered trees.
Słowa kluczowe
Wydawca
Rocznik
Strony
33--52
Opis fizyczny
Bliogr. 26 poz., wykr.
Twórcy
autor
  • Computer Science Department Rensselaer Polytechnic Institute Troy NY 12180, USA, zaki@cs.rpi.edu
Bibliografia
  • [1] Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A. I.: Fast Discovery of Association Rules, Advances in Knowledge Discovery and Data Mining (U. Fayyad, et al, Eds.), AAAI Press, Menlo Park, CA, 1996.
  • [2] Agrawal, R., Srikant, R.: Mining Sequential Patterns, 11th Intl. Conf. on Data Engg., 1995.
  • [3] Asai, T., Abe, K., Kawasoe, S., Arimura, H., Satamoto, H., Arikawa, S.: Efficient Substructure Discovery from Large Semi-structured Data, 2nd SIAM Int’l Conference on Data Mining, April 2002.
  • [4] Asai, T., Arimura, H., Uno, T., Nakano, S.: Discovering Frequent Substructures in Large Unordered Trees, 6th Int’l Conf. on Discovery Science, October 2003.
  • [5] Chi, Y., Yang, Y., Muntz, R. R.: Indexing and Mining Free Trees, 3rd IEEE International Conference on Data Mining, 2003.
  • [6] Chi, Y., Yang, Y., Muntz, R. R.: Hybrid Tree Miner: An Efficient Algorihtm for Mining Frequent Rooted Trees and Free Trees Using Canonical Forms, 16th International Conference on Scientific and Statistical Database Management, 2004.
  • [7] Chi, Y., Yang, Y., Xia, Y.,Muntz, R. R.: CMTreeMiner: Mining Both Closed and Maximal Frequent Subtrees, 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2004.
  • [8] Cole, R., Hariharan, R., Indyk, P.: Tree pattern matching and subset matching in deterministic O(nlog3n)_-time, 10th Symposium on Discrete Algorithms, 1999.
  • [9] Cook, D., Holder, L.: Substructure discovery using minimal description length and background knowledge, Journal of Artificial Intelligence Research, 1, 1994, 231–255.
  • [10] Dehaspe, L., Toivonen, H., King, R.: Finding frequent substructures in chemical compounds, 4th Intl. Conf. Knowledge Discovery and Data Mining, August 1998.
  • [11] Huan, J.,Wang,W., Prins, J.: EfficientMining of Frequent Subgraphs in the Presence of Isomorphism, IEEE Int’l Conf. on Data Mining, 2003.
  • [12] Inokuchi, A., Washio, T., Motoda, H.: An apriori-based algorithm for mining frequent substructures from graph data, 4th European Conference on Principles of Knowledge Discovery and Data Mining, September 2000.
  • [13] Kilpelainen, P., Mannila, H.: Ordered and unordered tree inclusion, SIAM J. of Computing, 24(2), 1995, 340–356.
  • [14] Kramer, S., Raedt, L. D., Helma, C.: Molecular Feature Mining in HIV data, Int’l Conf. on Knowledge Discovery and Data Mining, 2001.
  • [15] Kuramochi,M., Karypis, G.: Frequent Subgraph Discovery, 1st IEEE Int’l Conf. on Data Mining, November 2001.
  • [16] Morell, V.: Web-Crawling up the Tree of Life, Science, 273(5275), aug 1996, 568–570.
  • [17] Nijssen, S., Kok, J. N.: Efficient Discovery of Frequent Unordered Trees, 1st Int’l Workshop on Mining Graphs, Trees and Sequences, 2003.
  • [18] Shamir, R., Tsur, D.: Faster Subtree Isomorphism, Journal of Algorithms, 33, 1999, 267–280.
  • [19] Termier, A., Rousset, M.-C., Sebag, M.: TreeFinder: a First Step towards XML Data Mining, IEEE Int’l Conf. on Data Mining, 2002.
  • [20] Wang, K., Liu, H.: Discovering Typical Structures of Documents: A Road Map Approach, ACM SIGIR Conference on Information Retrieval, 1998.
  • [21] Xiao, Y., Yao, J.-F., Li, Z., Dunham, M. H.: Efficient Data Mining for Maximal Frequent Subtrees, International Conference on Data Mining, 2003.
  • [22] Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining, IEEE Int’l Conf. on Data Mining, 2002.
  • [23] Yan, X., Han, J.: CloseGraph: Mining Closed Frequent Graph Patterns, ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, August 2003.
  • [24] Yoshida, K., Motoda, H.: CLIP: Concept Learning from Inference Patterns, Artificial Intelligence, 75(1), 1995, 63–92.
  • [25] Zaki, M. J.: Efficiently Mining Frequent Trees in a Forest, 8th ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining, July 2002.
  • [26] Zaki, M. J., Aggarwal, C.: Xrules: An effective structural classifier for XML data, 9th ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining, August 2003.
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-article-BUS2-0007-0020
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.