How well a multi-model database performs against its single-model variants: Benchmarking OrientDB with Neo4j and MongoDB

Macak, Martin; Stovcik, Matus; Buhnova, Barbora; Merjavy, Michal

doi:10.15439/2020F76

Article details

Article title

How well a multi-model database performs against its single-model variants: Benchmarking OrientDB with Neo4j and MongoDB

Authors

Macak Martin, Stovcik Matus, Buhnova Barbora, Merjavy Michal

Selected full texts from this journal

http://annals-csis.org

Identifiers

DOI

10.15439/2020F76

Conference

15th Federated Conference on Computer Science and Information Systems, 6-9 September 2020, Sofia, Bulgaria

Abstract

Digitalization is currently the key factor for progress, with a rising need for collecting, storing, and processing large amounts of data. In this context, NoSQL databases have become a popular storage solution, each specialized in a specific type of data. In addition, the multi-model approach is designed to combine the benefits of different types of databases by supporting several data models. Despite its versatility, a multi-model database might not always be the best option, due to the risk of worse performance compared to the single-model variants. It is hence crucial for software engineers to have access to benchmarks comparing the performance of multi-model and single-model variants. Moreover, in the current Big Data era, it is important that benchmarks take cluster infrastructure into account. In this paper, we examine how the multi-model approach performs compared to its single-model variants. To this end, we compare the OrientDB multi-model database with the Neo4j graph database and the MongoDB document store. We do so in a cluster setup to advance the state of the art in database benchmarks, which does not yet give much insight into the performance of databases operating in a cluster.

Keywords

Big Data; graph theory; NoSQL database

Publisher

Polskie Towarzystwo Informatyczne

Journal

Annals of Computer Science and Information Systems

Year

2020

Volume

Vol. 21

Pages

463–470

Physical description

24 references, figures, tables, charts.

Contributors

author

Macak Martin

macak@mail.muni.cz

Institute of Computer Science, Masaryk University, Brno, Czech Republic
Faculty of Informatics, Masaryk University, Brno, Czech Republic

author

Stovcik Matus

mstovcik@mail.muni.cz

Institute of Computer Science, Masaryk University, Brno, Czech Republic
Faculty of Informatics, Masaryk University, Brno, Czech Republic

author

Buhnova Barbora

buhnova@mail.muni.cz

Institute of Computer Science, Masaryk University, Brno, Czech Republic
Faculty of Informatics, Masaryk University, Brno, Czech Republic

author

Merjavy Michal

merjavy@mail.muni.cz

Faculty of Informatics, Masaryk University, Brno, Czech Republic

References

1. M. Macak, H. Bangui, B. Buhnova, A. J. Molnár, and C. I. Sidló, “Big data processing tools navigation diagram,” in IoTBDS, 2020, pp. 304–312.
2. F. Gessert, W. Wingerath, S. Friedrich, and N. Ritter, “Nosql database systems: a survey and decision guidance,” Computer Science-Research and Development, vol. 32, no. 3-4, pp. 353–365, 2017.
3. S. Kaisler, F. Armour, J. Espinosa, and W. Money, “Big data: Issues and challenges moving forward,” 01 2013. http://dx.doi.org/10.1109/HICSS.2013.645. ISBN 978-1-4673-5933-7 pp. 995–1004.
4. P. J. Sadalage and M. Fowler, NoSQL distilled: a brief guide to the emerging world of polyglot persistence. Pearson Education, 2013.
5. E. Raguseo, “Big data technologies: An empirical investigation on their adoption, benefits and risks for companies,” International Journal of Information Management, vol. 38, no. 1, pp. 187–195, 2018. http://dx.doi.org/10.1016/j.ijinfomgt.2017.07.008. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0268401217300063
6. M. Macak, M. Stovcik, and B. Buhnova, “The suitability of graph databases for big data analysis: A benchmark,” in IoTBDS, 2020, pp. 213–220.
7. A. Messina, P. Storniolo, and A. Urso, “Keep it simple, fast and scalable: A multi-model nosql dbms as an (eb) xml-over-soap service,” in 2016 30th International Conference on Advanced Information Networking and Applications Workshops (WAINA), 2016, pp. 220–225.
8. T. P. Hong and P. Do, “Combining apache spark orientdb to find the influence of a scientific paper in a citation network,” in 2018 10th International Conference on Knowledge and Systems Engineering (KSE), 2018, pp. 113–117.
9. W. Schultz, T. Avitabile, and A. Cabral, “Tunable consistency in mongodb,” Proc. VLDB Endow., vol. 12, no. 12, pp. 2071–2081, Aug. 2019. http://dx.doi.org/10.14778/3352063.3352125.
10. T. T. Aung and T. T. S. Nyunt, “Community detection in scientific co-authorship networks using neo4j,” in 2020 IEEE Conference on Computer Applications (ICCA), 2020, pp. 1–6.
11. S. Ataky T. M, L. Ferreira, M. Ribeiro, and M. Prado Santos, “Evaluation of graph databases performance through indexing techniques,” International Journal of Artificial Intelligence & Applications (IJAIA), vol. 06, pp. 87–98, 09 2015. http://dx.doi.org/10.5121/ijaia.2015.6506
12. C. Messaoudi, M. Amrou, R. Fissoune, and B. Hassan, “A performance study of nosql stores for biomedical data,” 11 2017.
13. D. Jayathilake, C. Sooriaarachchi, T. Gunawardena, B. Kulasuriya, and T. Dayaratne, “A study into the capabilities of nosql databases in handling a highly heterogeneous tree,” in 2012 IEEE 6th International Conference on Information and Automation for Sustainability, 2012, pp. 106–111.
14. C. Messaoudi, R. Fissoune, and B. Hassan, “A performance evaluation of nosql databases to manage proteomics data,” International Journal of Data Mining and Bioinformatics, vol. 21, pp. 70–89, 09 2018. http://dx.doi.org/10.1504/IJDMB.2018.10016724
15. F. R. Oliveira and L. del Val Cura, “Performance evaluation of nosql multi-model data stores in polyglot persistence applications,” in Proceedings of the 20th International Database Engineering & Applications Symposium, ser. IDEAS ’16. New York, NY, USA: Association for Computing Machinery, 2016. http://dx.doi.org/10.1145/2938503.2938518. ISBN 9781450341189 pp. 230–235. [Online]. Available: https://doi.org/10.1145/2938503.2938518
16. D. Fernandes and J. Bernardino, “Graph databases comparison: Allegrograph, arangodb, infinitegraph, neo4j, and orientdb,” in Proceedings of the 7th International Conference on Data Science, Technology and Applications - Volume 1: DATA, INSTICC. SciTePress, 2018. http://dx.doi.org/10.5220/0006910203730380. ISBN 978-989-758-318-6 pp. 373–380.
17. G. Bathla, R. Rani, and H. Aggarwal, “Comparative study of nosql databases for big data storage,” International Journal of Engineering & Technology, vol. 7, no. 2.6, pp. 83–87, 2018. http://dx.doi.org/10.14419/ijet.v7i2.6.10072. [Online]. Available: https://www.sciencepubco.com/index.php/ijet/article/view/10072
18. S. Mazumdar, D. Seybold, K. Kritikos, and Y. Verginadis, “A survey on data storage and placement methodologies for cloud-big data ecosystem,” Journal of Big Data, vol. 6, no. 1, p. 15, Feb 2019. http://dx.doi.org/10.1186/s40537-019-0178-3. [Online]. Available: https://doi.org/10.1186/s40537-019-0178-3
19. F. Holzschuher and R. Peinl, “Performance of graph query languages: comparison of cypher, gremlin and native access in neo4j,” in Proceedings of the Joint EDBT/ICDT 2013 Workshops. ACM, 2013, pp. 195–204.
20. D. Dominguez-Sal, P. Urbón-Bayes, A. Giménez-Vañó, S. Gómez-Villamor, N. Martínez-Bazán, and J. L. Larriba-Pey, “Survey of graph database performance on the hpc scalable graph analysis benchmark,” in Web-Age Information Management, H. T. Shen, J. Pei, M. T. Özsu, L. Zou, J. Lu, T.-W. Ling, G. Yu, Y. Zhuang, and J. Shao, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. ISBN 978-3-642-16720-1 pp. 37–48.
21. S. Jouili and V. Vansteenberghe, “An empirical comparison of graph databases,” in 2013 International Conference on Social Computing, Sep. 2013. http://dx.doi.org/10.1109/SocialCom.2013.106 pp. 708–715.
22. M. Ciglan, A. Averbuch, and L. Hluchy, “Benchmarking traversal operations over graph databases,” in 2012 IEEE 28th International Conference on Data Engineering Workshops, April 2012. http://dx.doi.org/10.1109/ICDEW.2012.47 pp. 186–189.
23. A. S. Mondal, M. Sanyal, S. Chattopadhyay, and K. C. Mondal, “Comparative analysis of structured and un-structured databases,” in Computational Intelligence, Communications, and Business Analytics, J. K. Mandal, P. Dutta, and S. Mukhopadhyay, Eds. Singapore: Springer Singapore, 2017. ISBN 978-981-10-6430-2 pp. 226–241.
24. R. A. Rossi and N. K. Ahmed, “The network data repository with interactive graph analytics and visualization,” in AAAI, 2015. [Online]. Available: http://networkrepository.com

Notes

1. The work was supported by the European Regional Development Fund Project CERIT Scientific Cloud (No. CZ.02.1.01/0.0/0.0/16_013/0001802). Access to the CERITSC computing and storage facilities provided by the CERITSC Center, under the "Projects of Large Research, Development, and Innovations Infrastructures" programme (CERIT Scientific Cloud LM2015085), is greatly appreciated.

2. Track 2: Computer Science & Systems

3. Technical Session: 11th Workshop on Scalable Computing

4. Record prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education), agreement No. 461252, under the programme "Social Responsibility of Science", module: Popularization of science and promotion of sport (2021).

Document type

Bibliography

YADDA identifier

bwmeta1.element.baztech-b7ec130e-78c4-44ba-8371-df7a4e9dcffa