Discovery of sequential patterns is an important data mining problem with numerous applications. Sequential patterns are subsequences that occur frequently in a database of sequences of sets of items. In the basic scenario, the goal of sequential pattern mining is to discover all patterns whose frequency exceeds a user-specified threshold. The drawback of this approach is that a huge number of sequential patterns is likely to be returned for reasonable frequency thresholds. One possible solution is to exclude patterns that do not provide significantly more information than other patterns in the result set. Two approaches of this kind have been studied in the context of sequential patterns: discovery of maximal patterns and discovery of closed patterns. Unfortunately, the set of maximal patterns may omit many important high-frequency patterns, and discovery of closed patterns may not reduce the number of resulting patterns for sparse datasets. Therefore, in this paper we propose and experimentally evaluate the minimum improvement criterion, applied in the post-processing phase, to reduce the number of sequential patterns returned to the user. Our method adapts one of the methods previously proposed for association rules.
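To make the notions of frequency, maximal patterns, and closed patterns concrete, the following is a minimal Python sketch on a toy database. It is an illustration under stated assumptions, not the method evaluated in the paper: the database, the candidate list, the absolute threshold `min_count`, and all function names are hypothetical, and the minimum improvement criterion itself is not implemented because its definition is not given here. Maximality and closedness are computed relative to the listed candidates only.

```python
def seq(*itemsets):
    """Build a sequence (or pattern) as a tuple of frozensets; "ab" means the itemset {a, b}."""
    return tuple(frozenset(s) for s in itemsets)

def contains(sequence, pattern):
    """True if the pattern's itemsets can be matched, in order, to supersets in the sequence."""
    pos = 0
    for itemset in sequence:
        # Greedy matching: take the earliest itemset that includes the next pattern element.
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)

def support(database, pattern):
    """Number of database sequences that contain the pattern."""
    return sum(contains(s, pattern) for s in database)

def is_proper_super(q, p):
    """True if q is a different pattern that contains p."""
    return q != p and contains(q, p)

# Hypothetical toy database of four sequences of itemsets.
database = [seq("a", "b", "c"), seq("a", "b"), seq("a", "c"), seq("ab", "c")]

# Hypothetical candidate patterns; a real miner would generate these itself.
candidates = [seq("a"), seq("b"), seq("c"), seq("a", "b"),
              seq("a", "c"), seq("b", "c"), seq("a", "b", "c")]

min_count = 2  # assumed absolute frequency threshold
supports = {p: support(database, p) for p in candidates}
frequent = {p: n for p, n in supports.items() if n >= min_count}

# Maximal: no frequent proper super-pattern exists.
maximal = {p for p in frequent
           if not any(is_proper_super(q, p) for q in frequent)}

# Closed: no frequent proper super-pattern has the same support.
closed = {p for p in frequent
          if not any(is_proper_super(q, p) and frequent[q] == frequent[p]
                     for q in frequent)}
```

In this toy case the single-item pattern for `a` has the highest support yet is absent from `maximal`, while `closed` drops only one of the six frequent patterns, which mirrors the two limitations described above on a very small scale.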