A General Framework for Mining Frequent Subgraphs from Labeled Graphs

Inokuchi, A.; Washio, T.; Motoda, H.

Artykuł - szczegóły

Tytuł artykułu

A General Framework for Mining Frequent Subgraphs from Labeled Graphs

Autorzy

Inokuchi A. , Washio T. , Motoda H.

Wybrane pełne teksty z tego czasopisma

https://fi.episciences.org/

Identyfikatory

Warianty tytułu

Języki publikacji

Abstrakty

The derivation of frequent subgraphs from a dataset of labeled graphs has high computational complexity because the hard problems of isomorphism and subgraph isomorphism have to be solved as part of this derivation. To deal with this computational complexity, all previous approaches have focused on one particular kind of graph. In this paper, we propose an approach to conduct a complete search for various classes of frequent subgraphs in a massive dataset of labeled graphs within a practical time. The power of our approach comes from the algebraic representation of graphs, its associated operations and well-organized bias constraints to limit the search space efficiently. The performance has been evaluated using real world datasets, and the high scalability and flexibility of our approach have been confirmed with respect to the amount of data and the computation time.

Słowa kluczowe

data mining graph mining frequent subgraph bias canonical form subgraph isomorphism

Wydawca

IOS Press

Czasopismo

Fundamenta Informaticae

Rocznik

2005

Tom

Vol. 66, nr 1,2

Strony

53--82

Opis fizyczny

Bibliogr. 27 poz., wykr.

Twórcy

autor

Inokuchi A.

inokuchi@jp.ibm.com

Tokyo Research Laboratory IBM Japan 1623-14, Shimotsuruma, Yamato, Kanagawa, 242-8502, Japan

autor

Washio T.

washio@ar.sanken.osaka-u.ac.jp

The Institute of Scientific and Industrial Research Osaka University 8-1, Mihogaoka, Ibaraki, Osaka, 567-0047, Japan

autor

Motoda H.

motoda@ar.sanken.osaka-u.ac.jp

The Institute of Scientific and Industrial Research Osaka University 8-1, Mihogaoka, Ibaraki, Osaka, 567-0047, Japan

Bibliografia

[1] Alberts, B., Bray, D., Johnson, A., Lewis, J., Raff, M., Roberts, M., and Walter, P.: Essential Cell Biology: An Introduction to the Molecular Biology of the Cell, Garland Science Publishing, 1998.
[2] Asai, T., Abe, K., Kawasoe, S., Arimura, H., Sakamoto, H., and Arikawa, S.: Efficient Substructure Discovery from Large Semi-Structured Data, Proc. of the 2nd SIAM International Conference on Data Mining, pp. 158–174, 2002.
[3] Asai, T., Arimura, H., Uno, U., and Nakano, S.: Discovering Frequent Substructures in Large Unordered Trees, Proc. of the 6th International Conference on Discovery Science, pp. 47–61. 2003.
[4] Brin, S., Motwani, R., and Silverstein, C: Beyond market baskets: Generalizing association rules to correlations, Proc. of the SIGMOD International Conference on Management of Data, pp. 265–276. 1997.
[5] Cook, D. and Holder, L.: Substructure Discovery Using Minimum Description Length and Background Knowledge, Journal of Artificial Intelligence Research, Vol. 1, pp. 231–255, 1994.
[6] Dehaspe, L., Toivonen, H., and King, R.: Finding Frequent Substructures in Chemical Compounds, Proc. of the 4th International Conference on Knowledge Discovery and Data Mining, pp. 30–36, 1998.
[7] De Raedt, L. and Kramer, S.: The Levelwise Version Space Algorithm and its Application to Molecular Fragment Finding, Proc. of the 17th Joint Conference on Artificial Intelligence, pp. 853–859, 2001.
[8] Garey,M. and Johnson, D.: Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman, 1979.
[9] AIDS Antiviral Screen, http://dtp.nci.nih.gov/docs/aids/aids data.html
[10] Inokuchi, A., Washio, T., and Motoda, H.: An Apriori-based Algorithm for Mining Frequent Substructures from Graph Data, Proc. of the 4th European Conference on Principles of Data Mining and Knowledge Discovery, pp. 13–23, 2000.
[11] Inokuchi, A., Washio, T., Nishimura, Y., and Motoda, H.: A Fast Algorithm for Mining Frequent Connected Graph, IBM Research Report, RT0448, 2002.
[12] Inokuchi, A., Washio, T., and Motoda, H.: Complete Mining of Frequent Patterns from Graphs: Mining Graph Data, Machine Learning, Vol.50, pp. 321-354, 2003.
[13] Inokuchi, A.,Washio, T., and Motoda, H.: A General Framework for Mining Frequent Patterns in Structures, IBM Research Report, RT0513, 2003.
[14] Kilpelainen, P. and Mannila, H.: Ordered and Unordered Tree Inclusion, SIAM Journal on Computing, Vol.24, pp. 340–356, 1995.
[15] Kramer, S., and De Raedt, L.: Feature Construction with Version Space for Biochemical Applications, Proc. of the 18th International Conference on Machine Learning, pp. 258–265, 2001.
[16] Kramer, S., De Raedt, L., and Helma, C.: Molecular Feature Mining in HIV data, Proc. of the 7th International Conference on Knowledge Discovery and Data Mining, pp. 136–143, 2001.
[17] Kuramochi, M. and Karypis, G.: Frequent Subgraph Discovery, Proc. of the 1st International Conference on Data Mining, pp. 313–320, 2001.
[18] Kuramochi, M. and Karypis, G.: An Efficient Algorithm for Discovering Frequent Subgraphs, Technical Report, 02-026, 2002.
[19] Matsuda, T., Horiuchi, T., Motoda, H., and Washio, T.: Extension of Graph-Based Induction for General Graph Structured Data, Proc. of the 4th Pacific-Asia Conf. on Knowledge Discovery and Data Mining, pp. 420–431, 2000.
[20] Motoda, H. and Yoshida, K.: Machine Learning Techniques to Make Computers Easier to Use, Proc. of the 15th International Joint Conference on Artificial Intelligence, Vol. 2, pp. 1622–1631, 1997.
[21] Nijiissen, S. and Kok, J.: Faster Association Rules for Multiple Relations, Proc. of the 17th International Joint Conference on Artificial Intelligence, pp. 891–896, 2001.
[22] Nijiissen, S. and Kok, J.: Efficient Discovery of Frequent Unordered Trees Proc. of the 1st First International Workshop on Mining Graphs, Trees and Sequences , 2003.
[23] http://oldwww.comlab.ox.ac.uk/oucl/groups/machlearn/PTE
[24] Yan, X. and Han, J.: gSpan: Graph-Based Substructure Pattern Mining, Proc. of the 2nd International Conference on Data Mining, pp. 721–724, 2002.
[25] Yan, X. and Han, J.: CloseGraph: Mining Closed Frequent Graph Patterns, Proc. of the 9th International Conference on Knowledge Discovery and Data Mining, pp. 527–532, 2003.
[26] Yoshida, K. and Motoda, H.: CLIP: Concept Learning from Inference Patterns, Artificial Intelligence, Vol. 75, No. 1 pp. 63–92, 1995.
[27] Zaki, M.: Efficiently Mining Frequent Trees in a Forest, Proc. of the 8th International Conference on Knowledge Discovery and Data Mining, pp. 71–80, 2002.

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-article-BUS2-0007-0021