Identifiers
Title variants
Publication languages
Abstracts
Sequential pattern mining is an extensively studied data mining method. One of the newer and less documented approaches estimates the statistical characteristics of a sequence in order to create model sequences, which can then be used to speed up sequence mining. This paper proposes extensive modifications to one such algorithm, ProMFS (a probabilistic algorithm for mining frequent sequences), which notably increase the algorithm's processing speed by significantly reducing its computational complexity. The new version of the algorithm is evaluated on real-life and artificial data sets and shown to be useful in real-time applications and problems.
Keywords
Year
Volume
Pages
323-326
Physical description
Bibliography: 12 items.
Authors
author
- Institute of Control and Industrial Electronics, Warsaw University of Technology, Koszykowa 75, 00-662, Warsaw, Poland
author
- Institute of Control and Industrial Electronics, Warsaw University of Technology, Koszykowa 75, 00-662, Warsaw, Poland
Bibliography
- [1] R. Agrawal and R. Srikant, "Mining sequential patterns", in Proceedings of the Eleventh International Conference on Data Engineering, 1995, pp. 3-14.
- [2] R. Tumasonis and G. Dzemyda, "A probabilistic algorithm for mining frequent sequences", in ABDIS, 2004.
- [3] R. Agrawal and R. Srikant, "Mining sequential patterns: Generalizations and performance improvements", in International Conference on Extending Database Technology, 1996, pp. 3-17.
- [4] C. Antunes and A. Oliveira, "Sequential Pattern Mining Algorithms: Tradeoffs between Speed and Memory", in 2nd Workshop on Mining Graphs, Trees and Seq, 2004.
- [5] J. Ayres, J. Gehrke, T. Yiu, and J. Flannick, "Sequential Pattern Mining using a Bitmap Representation", in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 429-435.
- [6] Z. Yang, Y. Wang, and M. Kitsuregawa, "LAPIN-SPAM: An improved algorithm for mining sequential pattern", in Proceedings of the 21st International Conference on Data Engineering Workshops, 2005, p. 1222.
- [7] R. Dass, "An Efficient Algorithm for Frequent Pattern Mining for Real-Time Business Intelligence Analytics in Dense Datasets", in HICSS 06 Proceedings of the 39th Annual Hawaii International Conference on System Sciences, 2006, p. 170b.
- [8] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules in large databases", in Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, 1994, pp. 487-499.
- [9] K. Hryniów, "Probabilistic sequence mining: evaluation and extension of ProMFS algorithm", in II PhDW 2009, Szklarska Poręba, Poland, 2009.
- [10] D. W. Cheung, J. Han, V. Ng, and C. Wong, "Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique", in Proceedings of the Twelfth International Conference on Data Engineering, 1996, pp. 106-114.
- [11] "King James Bible", (1611 Authorized Version, 1769 Revised Edition), http://printkjv.ifbweb.com/.
- [12] K. Hryniów, "Parallel pattern mining: application of GSP algorithm for Graphics Processing Units", in 13th International Carpathian Control Conference, Slovakia, 2012, pp. 233-236.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BWAD-0032-0004