On Efficient Handling of Continuous Attributes in Large Data Bases

Son, N.H.

Article details

Article title

On Efficient Handling of Continuous Attributes in Large Data Bases

Authors

Son N.H.

Selected full texts from this journal

https://fi.episciences.org/

Abstract

Some data mining techniques, such as discretization of continuous attributes or decision tree induction, are based on searching for an optimal partition of data with respect to some optimization criterion. We investigate the problem of searching for an optimal binary partition of a continuous attribute domain in the case of large data sets stored in relational databases (RDB). The critical factor for the time complexity of algorithms solving this problem is the number of database I/O operations necessary to construct such partitions. In our approach the basic operators are defined by queries on the number of objects characterized by real-value intervals of continuous attributes. We assume that the answer time for such queries does not depend on the interval length. The straightforward approach to optimal partition selection (with respect to a given measure) requires O(N) basic queries, where N is the number of preassumed partitions in the search space. We show properties of the basic optimization measures that make it possible to reduce the size of the search space. Moreover, we prove that, using only O(log N) simple queries, one can construct a partition very close to optimal.
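
The straightforward O(N) approach mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the `CountOracle` class is a hypothetical in-memory stand-in for the database, the weighted class entropy is only one assumed choice of optimization measure, and the O(log N) method is not reproduced here.

```python
from bisect import bisect_left
from math import log2


class CountOracle:
    """Hypothetical stand-in for an RDB answering interval count queries."""

    def __init__(self, values, labels):
        self.by_class = {}
        for v, c in zip(values, labels):
            self.by_class.setdefault(c, []).append(v)
        for vs in self.by_class.values():
            vs.sort()
        self.queries = 0  # number of basic queries issued so far

    def count(self, lo, hi):
        """One basic query: objects per class with lo <= value < hi."""
        self.queries += 1
        return {c: bisect_left(vs, hi) - bisect_left(vs, lo)
                for c, vs in self.by_class.items()}


def entropy(counts):
    """Class entropy of a count distribution (assumed measure)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) * log2(total / n) for n in counts.values() if n > 0)


def best_cut_linear(oracle, cuts, lo, hi):
    """Scan every candidate cut: one basic query per cut, O(N) in total."""
    total = oracle.count(lo, hi)
    n = sum(total.values())
    best = None
    for cut in cuts:
        left = oracle.count(lo, cut)
        right = {c: total[c] - left[c] for c in total}
        n_left = sum(left.values())
        # Weighted entropy of the binary partition; lower is better.
        score = (n_left * entropy(left) + (n - n_left) * entropy(right)) / n
        if best is None or score < best[1]:
            best = (cut, score)
    return best


values = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
cuts = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]
oracle = CountOracle(values, labels)
cut, score = best_cut_linear(oracle, cuts, float("-inf"), float("inf"))
```

In this sketch the number of basic queries grows linearly with the number of candidate cuts (one query for the whole domain plus one per cut), which is the cost the paper aims to reduce.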

Keywords

basic notions; algorithm acceleration methods; local and global search

Publisher

IOS Press

Journal

Fundamenta Informaticae

Year

2001

Volume

Vol. 48, no. 1

Pages

61-81

Physical description

Bibliography: 21 items.

Contributors

author

Son N.H.

Institute of Informatics, Warsaw University, ul. Banacha 2, 02-097 Warsaw, Poland

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-article-BUS2-0003-0080