Article title

Research Problems of the ETL Technology

Identifiers
Title variants
Publication languages
EN
Abstracts
EN
This paper surveys research developments in ETL technology in four research fields: modeling of ETL processes, data cleaning, optimization of ETL processes, and evolution of ETL processes. For each of these fields we outline the most influential solutions and illustrate them with examples. The paper also gives an overview of a project currently being carried out at the Institute of Computing Science at the Poznań University of Technology. The project focuses on developing a method and a framework to support the evolution of ETL processes.
Year
Pages
283-305
Physical description
Bibliography: 37 items
Authors
author
Bibliography
  • [1] A. Bouguettaya, B. Benatallah, and A. Elmagarmid, Interconnecting Heterogeneous Information Systems. Kluwer Academic Publishers, 1998.
  • [2] A. Elmagarmid, M. Rusinkiewicz, and A. Sheth, Management of Heterogeneous and Autonomous Database Systems. Morgan Kaufmann Publishers, 1999.
  • [3] H. Garcia-Molina, J. D. Ullman, and J. Widom, Database Systems: The Complete Book. Prentice Hall, 2001.
  • [4] S. Chaudhuri and U. Dayal, "An overview of data warehousing and OLAP technology," SIGMOD Record, vol. 26, no. 1, pp. 65-74, 1997.
  • [5] W. H. Inmon, Building the Data Warehouse (4th edition). John Wiley & Sons Inc., 2005.
  • [6] R. Kimball and M. Ross, The Data Warehouse Toolkit. John Wiley & Sons Inc., 2002.
  • [7] J. Widom, "Research problems in data warehousing," in Proc. of ACM Conf. on Information and Knowledge Management (CIKM), pp. 25-30, 1995.
  • [8] M. C. Wu and A. P. Buchmann, "Research issues in data warehousing," in Datenbanksysteme in Büro, Technik und Wissenschaft, pp. 61-82, 1997.
  • [9] J. Adzic, V. Fiore, and L. Sisto, "Extraction, transformation, and loading processes," in Data Warehouses and OLAP: Concepts, Architectures and Solutions (R. Wrembel and C. Koncilia, eds.), pp. 88-110, Idea Group Inc., 2007. ISBN 1-59904-364-5.
  • [10] R. Kimball and J. Caserta, The Data Warehouse ETL Toolkit. John Wiley & Sons Inc., 2004.
  • [11] A. Simitsis, P. Vassiliadis, M. Terrovitis, and S. Skiadopoulos, Graph-Based Modeling of ETL Activities with Multi-level Transformations and Updates, vol. 3589 of Lecture Notes in Computer Science, pp. 43-52. Springer-Verlag, 2005.
  • [12] A. Simitsis, P. Vassiliadis, S. Skiadopoulos, and T. Sellis, "Data warehouse refreshment," in Data Warehouses and OLAP: Concepts, Architectures and Solutions (R. Wrembel and C. Koncilia, eds.), pp. 111-134, Idea Group Inc., 2007. ISBN 1-59904-364-5.
  • [13] D. Sjøberg, "Quantifying schema evolution," Information and Software Technology, vol. 35, no. 1, pp. 35-54, 1993.
  • [14] E. Rundensteiner, A. Koeller, and X. Zhang, "Maintaining data warehouses over changing information sources," Communications of the ACM, vol. 43, no. 6, pp. 57-62, 2000.
  • [15] J. E. Olson, ed., Data Quality: the Accuracy Dimension. Morgan Kaufmann Publishers, 2003.
  • [16] H. J. Moon, C. A. Curino, A. Deutsch, C.-Y. Hou, and C. Zaniolo, "Managing and querying transaction-time databases under schema evolution," Proc. VLDB Endow., vol. 1, pp. 882-895, 2008.
  • [17] C. Thomsen and T. Bach Pedersen, "pygrametl: a powerful programming framework for extract-transform-load programmers," in Proc. of ACM Int. Workshop on Data Warehousing and OLAP (DOLAP), pp. 49-56, ACM, 2009.
  • [18] P. Vassiliadis, A. Simitsis, and S. Skiadopoulos, "Modeling ETL activities as graphs," in Proc. of Int. Workshop on Design and Management of Data Warehouses (DMDW), pp. 52-61, CEUR-WS.org, 2002.
  • [19] D. Skoutas, A. Simitsis, and T. Sellis, "Ontology-driven conceptual design of ETL processes using graph transformations," Journal on Data Semantics XIII, pp. 120-146, 2009.
  • [20] D. Skoutas and A. Simitsis, "Ontology-Based Conceptual Design of ETL Processes for Both Structured and Semi-Structured Data," International Journal on Semantic Web and Information Systems, pp. 1-24, IGI Publishing, Hershey, PA, USA, 2007.
  • [21] M. Niinimäki and T. Niemi, "An ETL process for OLAP using RDF/OWL ontologies," Journal on Data Semantics XIII, pp. 97-119, 2009.
  • [22] Z. E. Akkaoui and E. Zimanyi, "Defining ETL Workflows using BPMN and BPEL," in Proc. of ACM Int. Workshop on Data Warehousing and OLAP (DOLAP), pp. 41-48, ACM, 2009.
  • [23] L. Muñoz, J.-N. Mazón, and J. Trujillo, "Automatic generation of ETL processes from conceptual models," in Proc. of ACM Int. Workshop on Data Warehousing and OLAP (DOLAP), p. 33, ACM Press, 2009.
  • [24] H. Galhardas, D. Florescu, D. Shasha, and E. Simon, "Declaratively cleaning your data using AJAX," in Journées Bases de Données, 2000.
  • [25] V. Raman and J. Hellerstein, "Potter's wheel: An interactive data cleaning system," in Proc. of Int. Conf. on Very Large Data Bases (VLDB), pp. 381-390, 2001.
  • [26] J. Rodic and M. Baranovic, "Generating data quality rules and integration into ETL process," in Proc. of ACM Int. Workshop on Data Warehousing and OLAP (DOLAP), p. 65, ACM Press, 2009.
  • [27] A. Simitsis, P. Vassiliadis, and T. Sellis, "State-space optimization of ETL workflows," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 10, pp. 1404-1419, 2005.
  • [28] A. Simitsis, P. Vassiliadis, and T. Sellis, "Logical optimization of ETL Workflows," IEEE Trans on Knowledge and Data Engineering, vol. 17, no. 10, pp. 150-161, 2006.
  • [29] P. Vassiliadis, A. Simitsis, and E. Baikousi, "A taxonomy of ETL activities," in Proc. of ACM Int. Workshop on Data Warehousing and OLAP (DOLAP), p. 25, ACM Press, 2009.
  • [30] P. Vassiliadis, A. Simitsis, M. Terrovitis, and S. Skiadopoulos, Blueprints and measures for ETL workflows, pp. 385-400. Springer, 2005.
  • [31] M. Thiele, T. Kiefer, and W. Lehner, "Cardinality estimation in ETL processes," in Proc. of ACM Int. Workshop on Data Warehousing and OLAP (DOLAP), p. 57, ACM Press, 2009.
  • [32] E. A. Rundensteiner, A. Koeller, X. Zhang, A. J. Lee, A. Nica, A. Van Wyk, and Y. Lee, "Evolvable View Environment (EVE): Non-Equivalent View Maintenance under Schema Changes," in Proc. of ACM SIGMOD Int. Conference on Management of Data, pp. 553-555, ACM Press, 1999.
  • [33] G. Papastefanatos, P. Vassiliadis, A. Simitsis, T. Sellis, and Y. Vassiliou, "Rule-based Management of Schema Changes at ETL sources," in Proc. of East European Conf. Advances in Databases and Information Systems (ADBIS), p. 55, Springer, 2009.
  • [34] G. Papastefanatos, P. Vassiliadis, A. Simitsis, K. Aggistalis, F. Pechlivani, and Y. Vassiliou, "Language Extensions for the Automation of Database Schema Evolution," in Proc. of Int. Conf. on Enterprise Information Systems (ICEIS), pp. 74-81, 2008.
  • [35] G. Papastefanatos, P. Vassiliadis, A. Simitsis, and Y. Vassiliou, Design Metrics for Data Warehouse Evolution, vol. 5231 of Lecture Notes in Computer Science, pp. 440-454. Springer Berlin Heidelberg, 2008.
  • [36] R. Wrembel and B. Bębel, "The Framework for Detecting and Propagating Changes from Data Sources Structure into a Data Warehouse," Foundations of Computing and Decision Sciences Journal, vol. 30, no. 4, pp. 361-372, 2005.
  • [37] R. Wrembel, "A survey on managing the evolution of data warehouses," International Journal of Data Warehousing & Mining, vol. 5, no. 2, pp. 24-56, 2009.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-article-BPP2-0019-0050