An experimental evaluation of two approaches to mining context based sequential patterns

Stefanowski, J.; Ziembiński, R.

Content

Full texts:

http://matwbn.icm.edu.pl/ksiazki/cc/cc38/cc3813.pdf [remote]

Abstract

The paper discusses the results of experiments with a new context extension of the sequential pattern mining problem. This extension introduces two kinds of context attributes: one describes the source of a sequence, the other describes each element within the sequence. Such context based sequential patterns may be discovered by a new algorithm, called Context Mapping Improved, designed specifically to handle attributes with similarity functions. For numerical attributes, an alternative approach is to pre-discretize them, transform the resulting discrete values into artificial items and then apply an adaptation of an algorithm for mining sequential patterns from nominal items. The aim of this paper is to compare these two approaches experimentally on artificially generated sequence databases with numerical context attributes, in which several reference patterns are hidden. The results show that the Context Mapping Improved algorithm led to better re-discovery of the reference patterns. Moreover, a new measure for comparing two sets of context based patterns is introduced.
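
The pre-discretization route mentioned in the abstract can be illustrated with a minimal sketch. It assumes equal-width binning, which the abstract does not specify, and the attribute names, value ranges and the `name#bin` item format are hypothetical; it is not the authors' procedure, only one way numerical context values could be turned into nominal "artificial items" for a conventional sequential pattern miner.

```python
from typing import Dict, List, Tuple


def equal_width_bin(value: float, low: float, high: float, bins: int) -> int:
    """Return the 0-based index of the equal-width bin that contains value."""
    if high <= low or bins < 1:
        raise ValueError("need high > low and bins >= 1")
    width = (high - low) / bins
    index = int((value - low) / width)
    # Clamp so that value == high (or out-of-range input) lands in an edge bin.
    return max(0, min(bins - 1, index))


def to_artificial_items(context: Dict[str, float],
                        ranges: Dict[str, Tuple[float, float]],
                        bins: int) -> List[str]:
    """Map each numerical context attribute to a nominal item such as 'age#1'."""
    items = []
    for name in sorted(context):
        low, high = ranges[name]
        items.append(f"{name}#{equal_width_bin(context[name], low, high, bins)}")
    return items


# Hypothetical context of one sequence element with two numerical attributes.
ranges = {"age": (0.0, 100.0), "temperature": (-20.0, 40.0)}
items = to_artificial_items({"age": 37.0, "temperature": 21.5}, ranges, bins=4)
print(items)  # ['age#1', 'temperature#2']
```

Once every context value is replaced by such an item, two values are either in the same bin or not, so the graded similarity that a similarity function provides is lost; this is the trade-off the paper's comparison examines.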

Keywords

knowledge discovery; sequential patterns mining; context patterns; similarity of patterns

Publisher

Systems Research Institute, Polish Academy of Sciences

Journal

Control and Cybernetics

Year

2009

Volume

Vol. 38, no. 1

Pages

27-45

Physical description

Bibliography: 15 items.

Authors

author

Stefanowski J.

author

Ziembiński R.

Institute of Computing Science, Poznań University of Technology, ul. Piotrowo 2, 60-965 Poznań, jerzy.stefanowski@cs.put.poznan.pl

Bibliography

AGRAWAL, R. and SRIKANT, R. (1994) Fast algorithms for mining association rules. Proc. of 20th International Conference on Very Large Data Bases. Morgan Kaufmann, 487-499.
AGRAWAL, R., GEHRKE, J., GUNOPULOS, D. and RAGHAVAN, P. (1998) Automatic subspace clustering of high dimensional data for data mining applications. Proc. of the 1998 ACM SIGMOD international conference on Management of data. ACM Press, 94-105.
DONG, G. and PEI, J. (2007) Sequence Data Mining. Springer-Verlag, 1-12.
GRZYMALA-BUSSE, J. (2002) Discretization of Numerical Attributes. In: W. Klosgen and J. Zytkow, eds., Handbook of Data Mining and Knowledge Discovery. Oxford University Press, 218-225.
GURALNIK, V. and KARYPIS, G. (2001) A Scalable Algorithm for Clustering Sequential Data. Proc. of the 2001 IEEE International Conference on Data Mining, San Jose, California. IEEE Computer Society Press, 179-186.
HAN, J., PEI, J., MORTAZAVI-ASL, B., CHEN, Q. et al. (2001) PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth. Proceedings of the 17th International Conference on Data Engineering. IEEE Computer Society, 215-224.
KIM, C., LIM, J., NG, R.T. and SHIM, K. (2004) SQUIRE: Sequential Pattern Mining with Quantities. Proc. of 20th International Conference on Data Engineering (ICDE'04). IEEE Computer Society, 215-224.
MORZY, T. (2004) Discovery associations: algorithms and data structures. PAN Press, OWN Poznan.
MORZY, T., WOJCIECHOWSKI, M. and ZAKRZEWICZ, M. (1999) Pattern-Oriented Hierarchical Clustering. Proc. of the Third East European Conference on Advances in Databases and Information Systems, ADBIS'99, Maribor, Slovenia. LNCS 1691. Springer, 179-190.
PINTO, H., HAN, J., PEI, J., WANG, K., CHEN, Q. and DAYAL, U. (2001) Multi-dimensional sequential pattern mining. Proceedings of the 10th International Conference on Information and Knowledge Management, Atlanta, Georgia. ACM Press, 81-88.
SRIKANT, R. and AGRAWAL, R. (1996) Mining Sequential Patterns: Generalizations and Performance Improvements. Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology. LNCS 1057, Springer-Verlag, 3-17.
SRIKANT, R. and AGRAWAL, R. (1996) Mining Quantitative Association Rules in Large Relational Tables. Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data. ACM Press, 1-12.
STEFANOWSKI, J. and ZIEMBIŃSKI, R. (2005) Mining Context Based Sequential Patterns. Proceedings of the Third International Atlantic Web Intelligence Conference: Advances in Web Intelligence. LNCS 3528, Springer-Verlag, 401-407.
YANG, Y., WEBB, G. and WU, X. (2005) Discretization Methods. In: O. Maimon and L. Rokach, eds., Data Mining and Knowledge Discovery Handbook. Springer, 113-128.
ZIEMBIŃSKI, R. (2007) Algorithms for Context Based Sequential Pattern Mining. Fundamenta Informaticae, 76 (4), 495-510.

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BAT5-0036-0024