Article title

Encrypted prefix tree for pattern mining

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Large volumes of incoming data improve the quality of knowledge discovery, but they raise concerns about the scalability of mining algorithms. We introduce three measures for scalable mining: bit-vector coding, data partitioning, and the Transaction Prefix (TP)-tree. After encryption with bit-vector coding, transaction records are partitioned according to their common prefixes. The TP-tree structure arranges the resulting data parts so that multiple records share common storage. The advantage is two-fold: storage is reduced beyond what bit-vector coding alone achieves, and common prefixes are mined together. Together, these measures improve the space and time requirements of frequent pattern mining. Experiments on dense datasets show significant improvements in the performance and scalability of both candidate-generation and pattern-growth algorithms.
Year
Pages
3-22
Physical description
Bibliography: 22 items.
Authors
author
  • Microsoft India (R&D) Pvt. Ltd., Gachibovli, Hyderabad - 500032, India, rk_ju@yahoo.com
Bibliography
  • [1]Savasere A., Omiecinski E., Navathe S., An efficient algorithm for mining association rules in large datasets. Proceedings of 21st VLDB Conference, Zurich, Switzerland, 432-444, 1995.
  • [2]Goethals B., Survey on Frequent Pattern Mining. Helsinki, 2003, http://www.cs.helsinki.fi/u/-goethals/publications/survey.ps.
  • [3]Goethals B., Zaki M., FIMI’03: Workshop on Frequent Itemset Mining Implementations. Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations (FIMI’03), Melbourne, Florida, USA, 2003, http://fimi.cs.helsinki.fi/.
  • [4]Borgelt C., Efficient Implementations of Apriori and Eclat. Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations (FIMI’03), Melbourne, Florida, USA, 2003, http://www.cs.rpi.edu/research/pdf/03-14.pdf, http://fimi.cs.helsinki.fi/.
  • [5]FIMI repository: http://fimi.cs.helsinki.fi/.
  • [6]Gardarin G., Pucheral P., Wu F., Bitmap Based Algorithms For Mining Association Rules. BDA, 1998, Ed.: Mokrane Bouzeghoub.
  • [7]Liu G., Lu H., Lou W., Xu Y., Yu J. X., Efficient Mining of Frequent Patterns Using Ascending Frequency Ordered Prefix-Tree. Data Mining and Knowledge Discovery, 9, 249-274, 2004.
  • [8]Toivonen H., Sampling large database for association rules. Proceedings of 22nd VLDB Conference, Bombay, India, 134-145, 1996.
  • [9]Han J., Cheng H., Xin D., Yan X., Frequent pattern mining: current status and future directions. Data Mining and Knowledge Discovery. 15, 55-86, 2007.
  • [10]Han J., Pei J., Yin Y., Mining Frequent Patterns without Candidate Generation. Proceedings of the ACM-SIGMOD International Conference on Management of Data (SIGMOD’00), Dallas, Texas, USA, 1-12, 2000.
  • [11]Lin J. L., Dunham M. H., Mining Association Rules: Anti-Skew Algorithms. Proceedings of Fourth International Conference on Data Engineering (ICDE’98), Florida, USA, 486-493, 1998.
  • [12]Holsheimer M., Kersten M., Mannila H., Toivonen H., A perspective on databases and data mining. Proceedings of 1st Int'l Conf. on Knowledge Discovery and Data Mining (KDD’95), Montreal, Canada, 1995.
  • [13]Zaki M. J., Scalable Algorithms for Association Rule Mining. IEEE Trans. on Knowledge and Data Engg. 12, 372-390, 2000.
  • [14]Zaki M. J., Gouda K., Fast vertical mining using diffsets. Proceedings of the Ninth ACM SIGKDD Int’l Conf. on Knowledge Discovery and Data Mining, Washington DC, 326-335, 2003.
  • [15]Zaki M. J., Parthasarathy S., Li W., Ogihara M., Evaluation of Sampling for Data mining of Association Rules. Seventh IEEE International Workshop on Research Issues in Data Engineering (RIDE'97), 42-50, 1997.
  • [16]Shenoy P., Haritsa J. R., Sudarshan S., Bhalotia G., Bawa M., Shah D., Turbocharging vertical mining of large databases. Proceedings of the 2000 ACM SIGKDD Conference, 22-33, 2000.
  • [17]Agrawal R., Srikant R., Fast Algorithm for Mining Association Rules. Proceedings of 20th Very Large Databases (VLDB) Conference, 487-499, 1994.
  • [18]Agrawal R., Imielinski T., Swami A., Mining association rules between sets of items in large databases. ACM SIGMOD International Conference on the Management of Data, Eds. Buneman, P. and Jajodia, S. ACM Press, Washington DC, USA, 207-216, 1993.
  • [19]Bhattacharyya R., Prefix Tree with Encryption of Data and Itemsets. Proceedings of Int’l Conference on Management of Data (COMAD), New Delhi, India, 2006.
  • [20]Bhattacharyya R., Bhattacharyya B., Chaudhuri A., BSPT: A Compact Data Structure for Scalable Mining. Proceedings of Int’l Conf on Information Technology, Haldia, India, 250-260, 2007.
  • [21]Morzy T., Zakrzewicz M., Group Bitmap Index: A Structure for Association Rules Retrieval. Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), August 27-31, 1998, New York City, New York, USA. AAAI Press, 284-288, 1998. Eds.: Rakesh Agrawal, Paul E. Stolorz, Gregory Piatetsky-Shapiro.
  • [22]Kosters W. A., Pijls W., Apriori, A Depth First Implementation. Proc. of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations (FIMI’03), Melbourne, Florida, USA, 2003. FIMI repository: http://www.fimi.cs.helsinki.fi/, http://www.cs.rpi.edu/research/pdf/03-14.pdf.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BPP1-0093-0013