Article title
Identifiers
Title variants
Publication languages
Abstracts
In this paper, we study various ways of representing and querying fact data that are time-stamped with a time period in a data warehouse. The main focus is on how to represent the time periods that are associated with the facts in order to support convenient and efficient aggregations over time. We propose three distinct logical models that represent time periods as sets of all time points in a period (instant model), as pairs of start and end time points of a period (period model), and as atomic units that are explicitly stored in a new period dimension (period∗ model). The period dimension is enriched with information about the days of each period, thereby combining the former two models. We use four different classes of aggregation queries to analyze query formulation, query execution, and query performance over the three models. An extensive empirical evaluation on synthetic and real-world datasets and the analysis of the query execution plans reveal that the period model is the best choice in terms of runtime and space for all four query classes.
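The three representations described in the abstract can be illustrated with a small sketch. This is not the authors' schema: the sample facts, the day granularity, the closed-interval convention, and all names (`FACTS`, `period_dim`, `total_period`, and so on) are assumptions made for illustration only. The sketch stores the same facts in each model and answers one simple aggregation query (total measure on a given day) against all three.

```python
from datetime import date, timedelta

# Hypothetical facts: (room, start_day, end_day, guests).
# Assumption: periods are closed intervals at day granularity.
FACTS = [
    ("A", date(2019, 1, 1), date(2019, 1, 3), 2),
    ("B", date(2019, 1, 2), date(2019, 1, 5), 1),
    ("C", date(2019, 1, 2), date(2019, 1, 5), 4),
]


def days(start, end):
    """Return every day of the closed period [start, end]."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


# Instant model: one fact row per time point (day) of the period.
instant_rows = [(room, d, m) for room, s, e, m in FACTS for d in days(s, e)]

# Period model: one fact row per fact, carrying start and end points.
period_rows = list(FACTS)

# Period* model: each distinct period is an atomic member of a period
# dimension; the dimension also records which days belong to the period.
period_dim = {}  # (start, end) -> surrogate key
for _, s, e, _ in FACTS:
    period_dim.setdefault((s, e), len(period_dim) + 1)
period_days = {key: set(days(s, e)) for (s, e), key in period_dim.items()}
period_star_rows = [(room, period_dim[(s, e)], m) for room, s, e, m in FACTS]


def total_instant(day):
    """Sum the measure over instant-model rows stamped with `day`."""
    return sum(m for _, d, m in instant_rows if d == day)


def total_period(day):
    """Sum the measure over period-model rows whose period contains `day`."""
    return sum(m for _, s, e, m in period_rows if s <= day <= e)


def total_period_star(day):
    """Sum the measure by looking up `day` in the period dimension."""
    return sum(m for _, k, m in period_star_rows if day in period_days[k])
```

In this toy data the instant model needs one row per fact per day, while the period and period* models keep one row per fact; the period* model additionally shares one dimension entry between facts B and C because they have identical periods. All three functions return the same total for any given day.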
Year
Volume
Pages
31--49
Physical description
Bibliography: 43 items, figures, tables, charts.
Authors
author
- Faculty of Computer Science, Free University of Bozen-Bolzano, Dominikanerplatz 3, 39100 Bozen, Italy
author
- Faculty of Computer Science, Free University of Bozen-Bolzano, Dominikanerplatz 3, 39100 Bozen, Italy
author
- Faculty of Computing, University of Latvia, Raiņa bulvāris 19, Riga, LV-1586, Latvia
Notes
PL
Record prepared under agreement 509/P-DUN/2018, financed by the Ministry of Science and Higher Education (MNiSW) from funds allocated to science dissemination activities (2019).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-9a2579be-a868-4102-8525-e5dafd2d0ea7