PL EN


Preferencje help
Widoczny [Schowaj] Abstrakt
Liczba wyników
Tytuł artykułu

How to improve efficiency of analysis of sequential data?

Treść / Zawartość
Identyfikatory
Warianty tytułu
Języki publikacji
EN
Abstrakty
EN
Many of todays database applications, including market basket analysis, web log analysis, DNA and protein sequence analysis utilize databases to store and retrieve sequential data. Commercial database management systems allow to store sequential data, but they do not support efficient querying of such data. To increase the efficiency of analysis of sequential data new index structures need to be developed. In this paper we propose an indexing scheme for non-timestamped sequences of sets, which supports set subsequence queries. Our contribution is threefold. First, we describe the index logical and physical structure, second, we provide algorithms for set subsequence queries utilizing this structure, and finally we perform experimental evaluation of the index, which proves its feasibility and advantages in set subsequence query processing.
Słowa kluczowe
Rocznik
Strony
107--126
Opis fizyczny
Bibliogr. 24 poz.
Twórcy
autor
Bibliografia
  • AGRAWAL, R., FALOUTSOS, C. and SWAMI, A.N. (1993) Efficient similarity search in sequence databases. Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms, Chicago. Springer Verlag, 69-84.
  • ANDRZEJEWSKI, W., GAERTIG, P., RADOM, M. and ANTONIEWICZ, M. (2003) Opracowanie i analiza wydajnościowa indeksu dla przybliżonego wyszukiwania podzbiorów danych (Development and efficiency analysis of an index for the approximate retrieval of data subsets; in Polish). Diploma Thesis. Poznan University of Technology.
  • ANDRZEJEWSKI, W. and MORZY, T. (2006a) AISS: An index for non times-tamped set subsequence queries. Proceedings of the 8th International Conference on Data Warehousing and Knowledge Discovery, Cracow. Springer Verlag, 503-512.
  • ANDRZEJEWSKI, W. and MORZY, T. (2006b) SeqTrie: An index for data mining applications. Proceedings of the 2nd ADBIS Workshop on Data Mining and Knowledge Discovery, 13-25.
  • ANDRZEJEWSKI, W., MORZY, T. and MORZY, M. (2005) Indexing of sequences of sets for efficient exact and similar subsequence matching. Proceedings of the 20th International Symposium on Computer and Information Sciences, Istanbul. Springer Verlag, 864-873.
  • DEPPISCH, U. (1986) S-Tree: a dynamic balanced signature index for office retrieval. Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval, Pisa. ACM Press, 77-87.
  • FALOUTSOS, C. and CHRISTODOULAKIS, S. (1984) Signature files: an access method for documents and its analytical performance evaluation. ACM Transactions on Information Systems (TOIS) 2 (4), 267-288.
  • FALOUTSOS, C., RANGANATHAN, M. and MANOLOPOULOS, Y. (1994) Fast subsequence matching in time-series databases. Proceedings of the 1994 ACM SIGMOD international conference on Management of data, Minneapolis. ACM Press, 419-429.
  • GOCZYŁA, K. (1997) The Partial-Order Tree: A New Structure for Indexing on Complex Attributes in Object-Oriented Databases. Proceedings of the 23rd Euromicro Conference, Budapest. IEEE, 47-54.
  • HELLERSTEIN, J. M. and PFEFFER, A. (1994) The RD-Tree: an index structure for sets. Technical Report 1252. University of Wisconsin at Madison.
  • HELMER, S. and MOERKOTTE, G. (1999) A study of four index structures for set-valued attributes of low cardinality. The VLDB Journal - The International Journal on Very Large Data Bases 12 (3), 244-261.
  • KEOGH, E., CHAKRABARTI, K., PAZZANI, M. and MEHROTRA, S. (2001) Locally adaptive dimensionality reduction for indexing large time series databases. Proceedings of the 2001 ACM SIGMOD international conference on Management of data, Santa Barbara. ACM Press, 151-162.
  • KEOGH, E., LONARDI, S. and RATANAMAHATANA, C.A. (2004) Towards parameter free data mining. Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining, Seattle. ACM Press, 206-215.
  • LEVENSHTEIN, V.I. (1965) Binary codes capable of correcting deletions, insertions and reversals. Doklady Akademii Nauk SSSR 163 (1), 845-848.
  • MAMOULIS, N. and YIU, M.L. (2004) Non-contiguous sequence pattern queries. Proceedings of the 9th International Conference on Extending Database Technology. LNCS 2992, Springer Verlag, 783-800.
  • MANBER, U. and MYERS, G. (1990) Suffix arrays: a new method for on-line string searches. Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms, Philadelphia. Society for Industrial and Applied Mathematics, 319-327.
  • MCCREIGHT, E.M. (1976) A space-economical suffix tree construction algorithm. J. ACM 23 (2), 262-272.
  • NANOPOULOS, A., MANOLOPOULOS, Y., ZAKRZEWICZ, M. and MORZY, T. (2002) Indexing web access-logs for pattern queries. Proceedings of the 4th international workshop on Web information and data management, Virginia. ACM Press, 63-68.
  • UKKONEN, E. (1992) Constructing suffix trees on-line in linear time. Information Processing 92, Proceedings of IFIP 12th World Computer Congress, volume 1. Elsevier Sci. Publ., 484-492.
  • UKKONEN, E. (1995) On-line construction of suffix trees. Algorithmica 14 (3), 249-260.
  • VLACHOS, M., HADJIELEFTHERIOU, M., GUNOPULOS, D. and KEOGH, E. (2003) Indexing multidimensional time-series with support for multiple distance measures. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, Washington. ACM Press, 216-225.
  • WANG, H., PERNG, C.-S., FAN, W., PARK, S. and Yu, P.S. (2003) Indexing weighted-sequences in large databases. Proceedings of International Conference on Data Engineering, Bangalore. IEEE Computer Society, 63-74.
  • WEINER, P. (1973) Linear pattern matching algorithms. Proceedings of the 14th IEEE Annual Symposium on Switching and Automata Theory, Iowa. IEEE, 1-11.
  • YI. B.-K. and FALOUTSOS, C. (2000) Fast time sequence indexing for arbitrary Lp norms. Proceedings of the 26th International Conference on Very Large Data Bases. Morgan Kaufmann Publishers Inc., 385-394.
Typ dokumentu
Bibliografia
Identyfikator YADDA
bwmeta1.element.baztech-article-BAT5-0036-0028
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.