PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Tytuł artykułu

A Contribution to the Use of Decision Diagrams for Loading and Mining Transaction Databases

Wybrane pełne teksty z tego czasopisma
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
In this paper, we mainly address the problem of loading transaction datasets into main memory and estimating the density of such datasets. We propose BoolLoader, an algorithm dedicated to these tasks; it relies on a compressed representation of all the transactions of the dataset. For sake of efficiency, we have chosen Decision Diagrams as the main data structure to the representation of datasets into memory. We give an experimental evaluation of our algorithm on both dense and sparse datasets. Experiments have shown that BoolLoader is efficient for loading some dense datasets and gives a partial answer about the nature of the dataset before time-consuming pattern extraction tasks. We further investigate the use of Algebraic Decision Diagrams by studying the feasibility of current Data Mining operations, as for instance computing the support of an itemset and even mining frequent itemsets.
Wydawca
Rocznik
Strony
575--594
Opis fizyczny
bibliogr. 25 poz., tab., wykr.
Twórcy
autor
  • CCLS-Center for Computational Learning Systems, Columbia University, 475 Riverside Dr., New York, NY 10115, U.S.A., ansaf@ccls.columbia.edu
Bibliografia
  • [1] Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules, Proc. 20th Int. Conf. Very Large Data Bases, VLDB, Morgan Kaufmann, 1994, ISBN 1-55860-153-8.
  • [2] Akers, S.: Binary Decision Diagrams, IEEE Transactions on Computers, 27, 1978, 509-516.
  • [3] Bayardo, R. J.: Efficiently mining long patterns from databases, Proc. 1998 ACM-SIGMOD Int. Conf. Management of Data, Seattle, Washington, June 1998.
  • [4] Bryant, R.: Graph-Based Algorithms for Boolean Functions Manipulation, IEEE Trans. on Computers, C-35(8), August 1986, 677-691.
  • [5] Burdick, D., Calimlim, M., Gehrke, J.: MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases, Proceedings of the 17th International Conference on Data Engineering, Heidelberg, Germany, April 2001.
  • [6] Clarke, E., Fujita,M.,McGeer, P.,McMillan, K., Yang, J., Zhao, X.: Multi terminal binary decision diagrams: An efficient data structure for matrix representation, Int. Workshop on Logic Synth., 1993.
  • [7] Gouda, K., Zaki, M. J.: Efficiently Mining Maximal Frequent Itemsets, Proceedings of 1st IEEE International Conference on Data Mining, San Jose, November 2001.
  • [8] Gouda, K., Zaki, M. J.: GenMax: An Efficient Algorithm for Mining Maximal Frequent Itemsets., Data Min. Knowl. Discov., 11(3), 2005, 223-242.
  • [9] Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation, Proc. of Int. Conf. On Management of Data, May 2000, 1-12.
  • [10] Lee, C.: Representation of Switching Circuits by Binary-Decision Programs, Bell Systems Technical Journal, 38, July 1959, 985-999.
  • [11] Minato, S.: Shared Binary Decision Diagram With Attributed Edges for Efficient Boolean Function Manipulation, Proc. 27th Design Automation Conference, June 1990.
  • [12] Minato, S.: Binary Decision Diagrams and Applications for VLSI CAD, Kluwer Academic Publishers, 1996.
  • [13] Murphy, P. M., Aha, D. W.: UCI Repository of Machine Learning Databases, Machine-readable collection, Dept of Information and Computer Science, University of California, Irvine, 1995, [Available by anonymous ftp from ics.uci.edu in directory pub/machine-learning-databases].
  • [14] R.I. Bahar, E.A. Frohm, C.M. Gaona, G.D. Hachtel, E. Macii, A. Pardo, F. Somenzi: Algebraic Decision Diagrams and Their Applications, IEEE /ACM International Conference on CAD, IEEE Computer Society Press, Santa Clara, California, 1993.
  • [15] Rymon, R.: Search through Systematic Set Enumeration., Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning (KR'92)., 1992.
  • [16] Salleb, A.,Maazouzi, Z., Vrain, C.: MiningMaximal Frequent Itemsets by a Boolean Approach, Proceedings of the 15th European Conference on Artificial Intelligence ECAI, Lyon, France, 2002.
  • [17] Salleb, A., Vrain, C.: Estimation of the Density of Datasets with Decision Diagrams., Proceedings of the 15th International Symposium, ISMIS 2005, Saratoga Springs, NY, USA, May 25-28, 2005,, 2005.
  • [18] Savasere, A., Omiecinsky, E., Navathe, S.: An efficient algorithm for mining association rules in large databases, In 21st Int'l Conf. on Very Large Databases (VLDB), 1995.
  • [19] Shenoy, P., Haritsa, J., Sudarshan, S., Bhalotia, G., Bawa, M., , Shah, D.: Turbo-charging vertical mining of large databases, Proc. of Int.. Conf. on Management of Data, 2000.
  • [20] Somenzi, F.: Efficient manipulation of decision diagrams, International Journal on Software Tools for Technology Transfer, 3(2), 2001, 171-181.
  • [21] Stone,M. H.: Boolean algebras and their relation to topology, Proceedings of National Academy of Sciences, 20, 1934, 37-111.
  • [22] Tani, S., Hamaguchi, K., Yajima, S.: The Complexity of the Optimal Variable Ordering Problems of Shared Binary Decision Diagrams, ISAAC: 4th International Symposium on Algorithms and Computation Algorithms), 1993.
  • [23] Zaki, M. J., Gouda, K.: Fast Vertical Mining Using Diffsets, 9th International Conference on Knowledge Discovery and Data Mining, Washington, DC, August 2003.
  • [24] Zaki,M. J., Parthasarathy, S., Ogihara,M., Li,W.: New Algorithms for Fast Discovery of Association Rules, 3rd Intl. Conf. on Knowledge Discovery and Data Mining, AAAI Press, 1997, ISBN 1-57735-027-8.
  • [25] Zheng, Z., Kohavi, R., Mason, L.: Real World Performance of Association Rule Algorithms, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001.
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-article-BUS5-0010-0045
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.