The main task in decision tree construction algorithms is to find the "best partition" of the set of objects. In this paper, we investigate the problem of optimal binary partition of a continuous attribute domain for large data sets stored in relational databases. The critical factor for the time complexity of algorithms solving this problem is the number of simple SQL queries necessary to construct such partitions. With the straightforward approach to optimal partition selection, the number of necessary queries is of order O(N), where N is the number of preassumed partitions of the search space. We show some properties of optimization measures related to discernibility between objects that allow the size of the search space to be reduced. We prove that, using only O(log N) simple queries, one can construct a partition very close to the optimal one.
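To make the O(N) baseline concrete, the following is a minimal sketch of the straightforward approach only, not of the O(log N) method proved in the paper. It rests on several assumptions that do not come from the text: a hypothetical table `objects(a, d)` holding a continuous attribute `a` and a decision class `d`, a "simple query" taken to be a per-class `COUNT(*)` on one side of a cut, and a discernibility measure taken to be the number of object pairs from different classes that the cut separates. All function names are illustrative.

```python
import sqlite3


def build_demo():
    """Create a hypothetical in-memory table objects(a, d) for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (a REAL, d INTEGER)")
    rows = [(float(i), 0 if i <= 4 else 1) for i in range(1, 9)]
    conn.executemany("INSERT INTO objects VALUES (?, ?)", rows)
    return conn


def class_counts(conn, cut, side):
    """One simple SQL query: per-class object counts on one side of the cut."""
    op = "<" if side == "left" else ">="
    rows = conn.execute(
        f"SELECT d, COUNT(*) FROM objects WHERE a {op} ? GROUP BY d", (cut,)
    ).fetchall()
    return dict(rows)


def discernibility(left, right):
    """Assumed measure: pairs of objects from different classes split by the cut."""
    total_left = sum(left.values())
    total_right = sum(right.values())
    same_class_pairs = sum(n * right.get(d, 0) for d, n in left.items())
    return total_left * total_right - same_class_pairs


def best_cut_naive(conn, cuts):
    """Evaluate every candidate cut; the query count grows linearly with len(cuts)."""
    queries = 0
    best_cut, best_score = None, -1
    for c in cuts:
        left = class_counts(conn, c, "left")
        right = class_counts(conn, c, "right")
        queries += 2
        score = discernibility(left, right)
        if score > best_score:
            best_cut, best_score = c, score
    return best_cut, best_score, queries
```

In this sketch every candidate cut costs two queries, so the total grows linearly with the number of candidates; that linear cost is what a reduced search space would avoid.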