Search results - BazTech

Results found: 2

Searched for:
in keywords: sequence database

Efficiently Mining Sequential Generator Patterns Using Prefix Trees

Pham T.-T.

Fundamenta Informaticae

2015

Vol. 138, No. 3

373-386

Mining long frequent sequences that contain a combinatorial number of frequent subsequences, or using very low support thresholds to mine sequential patterns, is usually both time- and memory-consuming. The mining of closed sequential patterns, sequential generator patterns, and maximum sequences has been proposed to overcome this problem. Sequential generator patterns, when used together with closed sequential patterns, can provide additional information that closed sequential patterns alone cannot provide. Mining sequential generator patterns is thus an important task in data mining as well. This paper proposes an algorithm called MSGP-PreTree for mining all sequential generator patterns based on the prefix-tree structure. The algorithm uses the characteristics of sequential generator patterns and sequence extensions to efficiently perform a depth-first search on a prefix tree. It also uses a vertical approach to list and count the supports of sequences, based on the prime block encoding approach for representing candidate sequences and determining the frequencies of candidates. In addition, the search space of the MSGP-PreTree algorithm is much smaller than those of other algorithms because two pruning strategies are applied. Experiments conducted on synthetic and real databases show that the proposed algorithm is more effective than a previous one.
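To make the notion of a sequential generator pattern concrete, the sketch below checks it by brute force. It is not MSGP-PreTree: it uses no prefix tree, no prime block encoding, and no pruning. It assumes the commonly used definition, which the abstract does not spell out, that a pattern is a generator when no non-empty proper subsequence has the same support. It also simplifies sequences to strings of single items. The names `is_subsequence`, `support`, and `is_generator` and the toy database are illustrative only.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    # `item in it` consumes the iterator up to the match, preserving order.
    return all(item in it for item in pattern)


def support(database, pattern):
    """Number of database sequences that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in database)


def is_generator(database, pattern):
    """Assumed definition: no non-empty proper subsequence has equal support."""
    target = support(database, pattern)
    for size in range(1, len(pattern)):
        for idx in combinations(range(len(pattern)), size):
            sub = [pattern[i] for i in idx]
            if support(database, sub) == target:
                return False
    return True


# Toy database: each string is a sequence of single-item elements.
database = ["abc", "abc", "a", "b", "ba"]

print(support(database, "ab"))        # support of <a, b>
print(is_generator(database, "ab"))   # <a> and <b> are more frequent than <a, b>
print(is_generator(database, "abc"))  # <a, b> has the same support as <a, b, c>
```

The check enumerates every subsequence, so its cost is exponential in the pattern length. That cost is exactly what the pruning strategies mentioned in the abstract are meant to avoid.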

An empirical study of context based sequential pattern mining algorithms efficiency

Ziembiński R.

Foundations of Computing and Decision Sciences

2007

Vol. 32, No. 1

63-84

Methods for detecting patterns in data sets are useful and sought-after tools in the knowledge discovery process. The problem of searching for patterns in a set of sequences is called Sequential Pattern Mining. It can be defined as finding frequent subsequences in a sequence database. The pattern selection procedure is simple to state: to become a pattern, a subsequence must be contained in at least the required number of sequences from the database. The number of sequences that contain a pattern is called the pattern's support. Finding patterns may look trivial, but doing so efficiently is not. Efficiency becomes crucial when the required support is lowered, because the number of mined patterns may grow exponentially. Moreover, the situation changes when the Sequential Pattern Mining problem is extended further. In the classic definition, a sequence is an ordered list of elements, each of which is a non-empty set of items. Context Based Sequential Pattern Mining adds uniform, multi-attribute contexts (vectors) to the elements of the sequence and to the sequence itself. Introducing contexts significantly enlarges the search space of the problem. However, it also brings additional opportunities to constrain the mining process. This enhancement requires new algorithms, because traditional ones cannot cope with non-nominal data directly, and algorithms derived directly from traditional ones were found to be inefficient. This study evaluates the efficiency of the novel ContextMapping and ContextMappingHeuristic algorithms, which are designed to solve the problem of Context Based Sequential Pattern Mining. It examines to what extent the parameterization of the algorithms affects mining cost and accuracy. It also relates the modified problem to the traditional one, pointing out their common and differing properties and outlining perspectives for further research.
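The classic definition of support given in the abstract can be sketched in a few lines. This is a minimal illustration only, not the ContextMapping or ContextMappingHeuristic algorithms, and it ignores contexts entirely. The helper names `seq`, `contains`, and `support`, the toy database, and the threshold `min_support` are all hypothetical.

```python
from typing import FrozenSet, List

Element = FrozenSet[str]      # a non-empty set of items
Sequence = List[Element]      # an ordered list of elements


def seq(*elements: str) -> Sequence:
    """Build a sequence; each string is one element, each character one item."""
    return [frozenset(e) for e in elements]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """True if each pattern element is a subset of a distinct sequence element,
    with the order of elements preserved (greedy left-to-right matching)."""
    pos = 0
    for element in sequence:
        if pos < len(pattern) and pattern[pos] <= element:
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Number of database sequences that contain the pattern."""
    return sum(1 for s in database if contains(s, pattern))


database = [
    seq("ab", "c", "d"),   # <{a,b}, {c}, {d}>
    seq("a", "cd"),        # <{a}, {c,d}>
    seq("b", "c"),         # <{b}, {c}>
]

min_support = 2            # hypothetical threshold
pattern = seq("a", "c")    # <{a}, {c}>

# The subsequence is a pattern if enough sequences contain it.
print(support(database, pattern) >= min_support)
```

Lowering `min_support` admits more subsequences as patterns, which is the source of the exponential growth the abstract mentions.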