Large volumes of incoming data improve the quality of knowledge discovery, but they also strain the scalability of mining algorithms. We introduce three measures for scalable mining: bit-vector coding, data partitioning, and the Transaction Prefix (TP)-tree. After transaction records are encoded as bit vectors, they are partitioned by their common prefixes. The TP-tree arranges the resulting parts so that multiple records share common storage. The advantage is two-fold: storage is reduced beyond what bit-vector coding alone achieves, and records with a common prefix are mined together. Together, these measures reduce the space and time requirements of frequent pattern mining. Experiments on dense datasets show significant improvements in the performance and scalability of both candidate-generation and pattern-growth algorithms.
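The abstract does not give implementation details, so the following is only a minimal sketch of the first two ideas (bit-vector coding and grouping by common prefix), not the authors' TP-tree. The function names encode and partition_by_prefix, the fixed item order, and the single-level prefix length are all assumptions made for illustration.

```python
from collections import defaultdict


def encode(transaction, item_order):
    """Return an int whose bit i is set when item_order[i] occurs in the transaction."""
    bits = 0
    for i, item in enumerate(item_order):
        if item in transaction:
            bits |= 1 << i
    return bits


def partition_by_prefix(vectors, prefix_len):
    """Group bit vectors by their first prefix_len items (the lowest bits).

    Hypothetical helper: each group stores the shared prefix once as its key
    and keeps only the remaining suffix bits for every record.
    """
    mask = (1 << prefix_len) - 1
    groups = defaultdict(list)
    for v in vectors:
        groups[v & mask].append(v >> prefix_len)
    return dict(groups)


# Toy data (assumed, not from the paper): four transactions over items a-d.
item_order = ["a", "b", "c", "d"]
transactions = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c"}, {"a", "b"}]

vectors = [encode(t, item_order) for t in transactions]
groups = partition_by_prefix(vectors, prefix_len=2)
```

Here the three records that start with items a and b fall into one group and store that prefix only once. A tree would presumably apply the same sharing recursively at several depths, but that structure is not specified in the abstract.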
How to select a structuring element for a given task is one of the most frequently asked questions in morphology. The present work offers a solution for a restricted class of problems, namely shape classification. It proposes an algorithm that extracts the distinctive structure of each object in a given set. The proposed algorithm is based on a new approach to computing the distance transform.
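The abstract does not describe its new approach, so the sketch below shows only the classical two-pass city-block distance transform as background on what a distance transform computes. It is not the proposed algorithm. The function name distance_transform and the convention that 0 marks background are assumptions made for illustration.

```python
def distance_transform(grid):
    """City-block distance from every cell to the nearest background (0) cell."""
    rows, cols = len(grid), len(grid[0])
    inf = rows + cols  # larger than any city-block distance inside the grid
    dist = [[0 if grid[r][c] == 0 else inf for c in range(cols)]
            for r in range(rows)]

    # Forward pass: propagate distances from the top and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)

    # Backward pass: propagate distances from the bottom and right neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


# Toy shape (assumed): a 3x3 square of object pixels inside a 5x5 image.
shape = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
result = distance_transform(shape)
```

The largest values mark the cells deepest inside the object, which is the kind of structural information a shape-specific structuring element could be derived from.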