Article title

A Decomposition Framework for Computing and Querying Multidimensional OLAP Data Cubes over Probabilistic Relational Data

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Focusing on novel database application scenarios in which data sets increasingly arise in uncertain and imprecise formats, this paper proposes a novel decomposition framework for efficiently computing and querying multidimensional OLAP data cubes over probabilistic data, which capture this kind of data well. Several models and algorithms supported by the proposed framework are formally presented and described in detail, based on well-understood theoretical statistical and probabilistic tools. Together they lead to the definition of so-called probabilistic OLAP data cubes, the most prominent result of our research. Finally, we complete our analytical contribution by introducing an innovative Probability Distribution Function (PDF)-based approach, which makes use of well-known probabilistic estimator theory, for efficiently querying probabilistic OLAP data cubes, along with a comprehensive experimental assessment and analysis over synthetic probabilistic databases.
Publisher
Year
Pages
239–266
Physical description
56 references, tables, charts
Authors
author
  • ICAR-CNR and University of Calabria, 87036 Rende, Cosenza, Italy
author
  • Department of Informatics and Telecommunications, University of Athens, 15784 Ilisia, Greece
References
  • [1] Agarwal, S., Agrawal, R., Deshpande, P., Gupta, A., Naughton, J. F., Ramakrishnan, R., Sarawagi, S.: On the Computation of Multidimensional Aggregates, Proceedings of VLDB, 1996.
  • [2] Agrawal, P., Benjelloun, O., Sarma, A. D., Hayworth, C., Nabar, S. U., Sugihara, T., Widom, J.: Trio: A System for Data, Uncertainty, and Lineage, Proceedings of VLDB, 2006.
  • [3] Cuzzocrea, A., Furfaro, F., Masciari, E., Saccà, D., Sirangelo, C.: Approximate query answering on sensor network data streams, GeoSensor Networks, 2004, 49.
  • [4] Barbará, D., Garcia-Molina, H., Porter, D.: The Management of Probabilistic Data, IEEE Transactions on Knowledge and Data Engineering, 4(5), 1992, 487–502.
  • [5] Benjelloun, O., Sarma, A. D., Halevy, A. Y., Theobald, M., Widom, J.: Databases with uncertainty and lineage, VLDB Journal, 17(2), 2008, 243–264.
  • [6] Bonifati, A., Cuzzocrea, A.: Efficient Fragmentation of Large XML Documents, Proceedings of DEXA, 2007.
  • [7] Bonnet, P., Gehrke, J., Seshadri, P.: Towards Sensor Database Systems, Proceedings of MDM, 2001.
  • [8] Burdick, D., Deshpande, P. M., Jayram, T. S., Ramakrishnan, R., Vaithyanathan, S.: Efficient Allocation Algorithms for OLAP Over Imprecise Data, Proceedings of VLDB, 2006.
  • [9] Burdick, D., Deshpande, P. M., Jayram, T. S., Ramakrishnan, R., Vaithyanathan, S.: OLAP over uncertain and imprecise data, VLDB Journal, 16(1), 2007, 123–144.
  • [10] Burdick, D., Doan, A., Ramakrishnan, R., Vaithyanathan, S.: OLAP over Imprecise Data with Domain Constraints, Proceedings of VLDB, 2007.
  • [11] Chen, A. L. P., Chiu, J.-S., Tseng, F. S.-C.: Evaluating Aggregate Operations Over Imprecise Data, IEEE Transactions on Knowledge and Data Engineering, 8(2), 1996, 273–284.
  • [12] Cheng, R., Kalashnikov, D. V., Prabhakar, S.: Evaluating Probabilistic Queries over Imprecise Data, Proceedings of SIGMOD, 2003.
  • [13] Cheng, R., Singh, S., Prabhakar, S., Shah, R., Vitter, J. S., Xia, Y.: Efficient join processing over uncertain data, Proceedings of CIKM, 2006.
  • [14] Colliat, G.: OLAP, Relational, and Multidimensional Database Systems, SIGMOD Record, 25(3), 1996, 64–69.
  • [15] Cormode, G., Garofalakis, M. N.: Sketching probabilistic data streams, Proceedings of SIGMOD, 2007.
  • [16] Cuzzocrea, A.: Overcoming Limitations of Approximate Query Answering in OLAP, Proceedings of IDEAS, 2005.
  • [17] Cuzzocrea, A.: Providing probabilistically-bounded approximate answers to non-holistic aggregate range queries in OLAP, Proceedings of DOLAP, 2005.
  • [18] Cuzzocrea, A.: Improving range-sum query evaluation on data cubes via polynomial approximation, Data and Knowledge Engineering, 56(2), 2006, 85–121.
  • [19] Cuzzocrea, A.: Retrieving Accurate Estimates to OLAP Queries over Uncertain and Imprecise Multidimensional Data Streams, Proceedings of SSDBM, 2011.
  • [20] Cuzzocrea, A.: Approximate OLAP Query Processing over Uncertain and Imprecise Multidimensional Data Streams, Proceedings of DEXA, 2013.
  • [21] Cuzzocrea, A., Furfaro, F., Greco, S., Masciari, E., Mazzeo, G. M., Saccà, D.: A Distributed System for Answering Range Queries on Sensor Network Data, Proceedings of PerCom Workshops, 2005.
  • [22] Cuzzocrea, A., Mansmann, S.: OLAP Visualization, in: Encyclopedia of Data Warehousing and Mining, 2009, 1439–1446.
  • [23] Cuzzocrea, A., Russo, V., Saccà, D.: A Robust Sampling-Based Framework for Privacy Preserving OLAP, Proceedings of DaWaK, 2008.
  • [24] Cuzzocrea, A., Saccà, D., Serafino, P.: A Hierarchy-Driven Compression Technique for Advanced OLAP Visualization of Multidimensional Data Cubes, Proceedings of DaWaK, 2006.
  • [25] Cuzzocrea, A., Serafino, P.: LCS-Hist: taming massive high-dimensional data cube compression, Proceedings of EDBT, 2009.
  • [26] Cuzzocrea, A., Wang, W.: Approximate range-sum query answering on data cubes with probabilistic guarantees, Journal of Intelligent Information Systems, 28(2), 2007, 161–197.
  • [27] Dalvi, N. N., Suciu, D.: Efficient query evaluation on probabilistic databases, VLDB Journal, 16(4), 2007, 523–544.
  • [28] Dalvi, N. N., Suciu, D.: Management of probabilistic data: foundations and challenges, Proceedings of PODS, 2007.
  • [29] Davey, B. A., Priestley, H. A.: Introduction to Lattices and Order (2. ed.), Cambridge University Press, 2002, ISBN 978-0-521-78451-1.
  • [30] Deligiannakis, A., Garofalakis, M. N., Roussopoulos, N.: Extended wavelets for multiple measures, ACM Transactions on Database Systems, 32(2), 2007, 10.
  • [31] Fink, R., Han, L., Olteanu, D.: Aggregation in Probabilistic Databases via Knowledge Compilation, PVLDB, 5(5), 2012, 490–501.
  • [32] Ganti, V., Lee, M.-L., Ramakrishnan, R.: ICICLES: Self-Tuning Samples for Approximate Query Answering, Proceedings of VLDB, 2000.
  • [33] Gibbons, P. B., Matias, Y.: New Sampling-Based Summary Statistics for Improving Approximate Query Answers, Proceedings of SIGMOD, 1998.
  • [34] Golub, G. H., Van Loan, C. F.: Matrix Computations, Johns Hopkins University Press, 1989.
  • [35] Gray, J., Chaudhuri, S., Bosworth, A., Layman, A., Reichart, D., Venkatrao, M., Pellow, F., Pirahesh, H.: Data Cube: A Relational Aggregation Operator Generalizing Group-by, Cross-Tab, and Sub Totals, Data Mining and Knowledge Discovery, 1(1), 1997, 29–53.
  • [36] Han, J., Kamber, M.: Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000, ISBN 1-55860-489-8.
  • [37] Harinarayan, V., Rajaraman, A., Ullman, J. D.: Implementing Data Cubes Efficiently, Proceedings of SIGMOD, 1996.
  • [38] Hellerstein, J. M., Haas, P. J., Wang, H. J.: Online Aggregation, Proceedings of SIGMOD (J. Peckham, Ed.), 1997.
  • [39] Ho, C.-T., Agrawal, R., Megiddo, N., Srikant, R.: Range Queries in OLAP Data Cubes, Proceedings of SIGMOD (J. Peckham, Ed.), 1997.
  • [40] Hua, M., Pei, J., Zhang, W., Lin, X.: Ranking queries on uncertain data: a probabilistic threshold approach, Proceedings of SIGMOD, 2008.
  • [41] Ioannidis, Y. E., Poosala, V.: Histogram-Based Approximation of Set-Valued Query-Answers, Proceedings of VLDB, 1999.
  • [42] Jayram, T. S., McGregor, A., Muthukrishnan, S., Vee, E.: Estimating statistical aggregates on probabilistic data streams, ACM Transactions on Database Systems, 33(4), 2008.
  • [43] Jin, R., Glimcher, L., Jermaine, C., Agrawal, G.: New Sampling-Based Estimators for OLAP Queries, Proceedings of ICDE, 2006.
  • [44] Kimelfeld, B., Sagiv, Y.: Maximally joining probabilistic data, Proceedings of PODS, 2007.
  • [45] Lian, X., Chen, L.: Probabilistic ranked queries in uncertain databases, Proceedings of EDBT, 2008.
  • [46] McClean, S. I., Scotney, B. W., Shapcott, M.: Aggregation of Imprecise and Uncertain Information in Databases, IEEE Transactions on Knowledge and Data Engineering, 13(6), 2001, 902–912.
  • [47] Papoulis, A., Ed.: Probability, Random Variables, and Stochastic Processes, McGraw-Hill, 1984.
  • [48] Pei, J., Yuan, Y., Lin, X., Jin, W., Ester, M., Liu, Q., Wang, W., Tao, Y., Yu, J. X., Zhang, Q.: Towards multidimensional subspace skyline analysis, ACM Transactions on Database Systems, 31(4), 2006, 1335–1381.
  • [49] Poosala, V., Ganti, V.: Fast Approximate Query Answering Using Precomputed Statistics, Proceedings of ICDE, 1999.
  • [50] Ré, C., Suciu, D.: Approximate lineage for probabilistic databases, PVLDB, 1(1), 2008, 797–808.
  • [51] Ross, R. B., Subrahmanian, V. S., Grant, J.: Aggregate operators in probabilistic databases, Journal of the ACM, 52(1), 2005, 54–101.
  • [52] Sarma, A. D., Theobald, M., Widom, J.: Exploiting Lineage for Confidence Computation in Uncertain and Probabilistic Databases, Proceedings of ICDE, 2008.
  • [53] Soliman, M. A., Ilyas, I. F., Chang, K. C.-C.: Probabilistic top-k and ranking-aggregate queries, ACM Transactions on Database Systems, 33(3), 2008.
  • [54] Timko, I., Dyreson, C. E., Pedersen, T. B.: Pre-aggregation with probability distributions, Proceedings of DOLAP, 2006.
  • [55] Vassiliadis, P., Sellis, T. K.: A Survey of Logical Models for OLAP Databases, SIGMOD Record, 28(4), 1999, 64–69.
  • [56] Yi, K., Li, F., Kollios, G., Srivastava, D.: Efficient Processing of Top-k Queries in Uncertain Databases with x-Relations, IEEE Transactions on Knowledge and Data Engineering, 20(12), 2008, 1669–1682.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-54591e61-cbeb-400c-a425-3feb9437f4c9