Article title

Query-condition-aware V-optimal histogram in range query selectivity estimation

Authors
Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Finding the optimal query execution plan requires selectivity estimation. The selectivity value makes it possible to predict the size of a query result, which in turn lets the optimizer choose the best method of query execution. Many methods of obtaining selectivity are based on different types of estimators of the attribute value distribution, most commonly histograms. The adaptive method proposed in this paper uses either the distribution of attribute values or the distribution of range query condition boundaries. A new type of histogram, the Query-Condition-Aware V-optimal histogram (QCA-V-optimal), is proposed as a non-parametric estimator of the probability density function of the attribute value distribution. This histogram also takes into account information about already processed queries. That information is represented by the 1-dimensional Query Condition Distribution histogram (HQCD), which is an estimator of the include function PI, also introduced in this paper. PI describes so-called regions of user interest, i.e. it shows how often regions of the attribute value domain were used by processed queries. The advantages of the proposed method based on QCA-V-optimal are presented. The experiments show that the mean relative selectivity estimation error is small compared with the errors obtained by methods based on the corresponding classical V-optimal and Equi-height histograms.
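The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the QCA-V-optimal construction from the paper, whose details this record does not give. It only shows (a) how a range query selectivity can be read off any bucket histogram under the common assumption that values are uniform inside a bucket, and (b) one simple way to record how often each region of the domain was covered by past range conditions. The function names `range_selectivity` and `query_usage_histogram`, the shared bucket edges, and the averaging rule are assumptions made for illustration.

```python
def _covered_fraction(left, right, lo, hi):
    """Fraction of the bucket [left, right) that lies inside [lo, hi]."""
    overlap = min(hi, right) - max(lo, left)
    return overlap / (right - left) if overlap > 0 else 0.0


def range_selectivity(edges, freqs, lo, hi):
    """Estimate the fraction of rows with lo <= value <= hi.

    edges: sorted bucket boundaries (len(freqs) + 1 values)
    freqs: number of rows in each bucket
    Assumption: values are spread uniformly inside each bucket.
    """
    total = sum(freqs)
    if total == 0:
        return 0.0
    matched = sum(
        count * _covered_fraction(edges[i], edges[i + 1], lo, hi)
        for i, count in enumerate(freqs)
    )
    return matched / total


def query_usage_histogram(edges, queries):
    """Average share of each bucket covered by past range conditions.

    A hypothetical stand-in for a query condition distribution
    histogram: values near 1 mark regions that queries hit often.
    """
    n = len(edges) - 1
    usage = [0.0] * n
    for lo, hi in queries:
        for i in range(n):
            usage[i] += _covered_fraction(edges[i], edges[i + 1], lo, hi)
    return [u / len(queries) for u in usage] if queries else usage


edges = [0, 10, 20, 30]
freqs = [10, 30, 60]
print(range_selectivity(edges, freqs, 5, 25))
print(query_usage_histogram(edges, [(5, 15), (10, 20)]))
```

In this toy setting the middle bucket is the one most often covered by the recorded conditions, so a query-aware histogram would plausibly spend more of its bucket budget there. How the paper actually combines the two distributions is not stated in the abstract.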
Year
Pages
287–303
Physical description
Bibliography: 21 items, tables, charts, figures, photographs.
Creators
  • Institute of Computer Science, Silesian University of Technology, 16 Akademicka St., 44-100 Gliwice, Poland
Bibliography
  • [1] Y.E. Ioannidis, “The history of histograms (abridged)”, Proc. VLDB Conf. 1, CD-ROM (2003).
  • [2] V. Poosala and Y.E. Ioannidis, “Selectivity estimation without the attribute value independence assumption”, Proc. 23rd Int. Conf. on Very Large Databases 1, 486–495 (1997).
  • [3] D.W. Scott and S.R. Sain, “Multi-dimensional density estimator”, Handbook of Statistics 24, 229–263 (2004).
  • [4] D. Gunopulos, G. Kollios, V.J. Tsotras, and C. Domeniconi, “Selectivity estimator for multidimensional range queries over real attributes”, VLDB J. 14 (2), 137–154 (2005).
  • [5] F. Korn, T. Johnson, and H. V. Jagadish, “Range selectivity estimation for continuous attributes”, Proc. Int. Conf. on Scientific and Statistical Database Management 1, 244–253 (1999).
  • [6] L. Getoor, B. Taskar, and D. Koller, “Selectivity estimation using probabilistic models”, Proc. ACM SIGMOD Int. Conf. on Management of Data 30 (2), 461–472 (2001).
  • [7] J.-H. Lee, D.-H. Kim, and C.-W. Chung, “Multi-dimensional selectivity estimation using compressed histogram estimation information”, Proc. 1999 ACM SIGMOD Int. Conf. on Management of Data 28 (2), 205–214 (1999).
  • [8] F. Yan, W.C. Hou, Z. Jiang, C. Luo, and Q. Zhu, “Selectivity estimation of range queries based on data density approximation via cosine series”, Data & Knowledge Engineering 63 (3), 855–878 (2007).
  • [9] K. Chakrabarti, M. Garofalakis, R. Rastogi, and K. Shim, “Approximate query processing using wavelets”, VLDB J. 10 (2–3), 199–223 (2001).
  • [10] N. Bruno, S. Chaudhuri, and L. Gravano, “STHoles: a multidimensional workload-aware histogram”, Proc. 2001 ACM SIGMOD Int. Conf. on Management of Data 30 (2), 211–222 (2001).
  • [11] D. Fuchs, Z. He, and B.S. Lee, “Compressed histograms with arbitrary bucket layouts for selectivity estimation”, Information Sciences 177 (3), 680–702 (2007).
  • [12] A. Khachatryan, E. Müller, Ch. Stier, and K. Böhm, “Sensitivity of self-tuning histograms: query order affecting accuracy and robustness”, Proc. 24th Int. Conf. on Scientific and Statistical Database Management 1, 334–342 (2012).
  • [13] D.R. Augustyn, “Query-condition-aware histograms in selectivity estimation method”, Proc. Man-machine interactions 2. Advances in Intelligent and Soft Computing 103, 437–446 (2011).
  • [14] H.V. Jagadish, V. Poosala, N. Koudas, K. Sevcik, S. Muthukrishnan, and T. Suel, “Optimal histograms with quality guarantees”, Proc. 24th Int. Conf. on VLDB 1, 275–286 (1998).
  • [15] “Oracle 10g documentation, Using extensible optimizer page”, http://download.oracle.com/docs/cd/B14117_01/appdev.101/b10800/dciextopt.htm (2005).
  • [16] D.R. Augustyn, “Applying advanced methods of query selectivity estimation in Oracle DBMS”, Proc. Man-Machine Interactions, Advances in Intelligent and Soft Computing 59, 585–593 (2009).
  • [17] J. Klamka, K. Grochla, and T. Czachórski, “Modelling TCP connection in WIMAX network using fluid flow approximation”, Proc. 2011 IEEE/IPSJ Int. Symp. on Applications and the Internet. Future Internet Engineering 1, 502–507 (2011).
  • [18] D.R. Augustyn, “Application of prediction of attribute value distribution in order to improve accuracy of estimation of question selectivity”, Studia Informatica 34 2A(111), 23–42 (2013), (in Polish).
  • [19] D.R. Augustyn, “Using the model of continuous dynamical system with viscous resistance forces for improving distribution prediction based on evolution of quantiles”, Proc. 10th Int. Conf. Beyond Databases, Architectures, and Structures. Communications in Computer and Information Science 424, 1–9 (2014).
  • [20] J. Klamka and J. Tańcula, “Examination of robust stability of computer networks”, Proc. 6th Working Int. Conf. HET-NETs 2010 1, 75–88 (2010).
  • [21] M. Luzar, Ł. Sobolewski, W. Miczulski, and J. Korbicz, “Prediction of corrections for the Polish time scale UTC(PL) using artificial neural networks”, Bull. Pol. Ac.: Tech. 61 (3), 589–594 (2013).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-309345ed-a848-42b4-be6b-cc51ab9cdf34