Search results - BazTech

1

Optimization of Retrieval Algorithms on Large Scale Knowledge Graphs

Dörpinghaus Jens, Stefan Andreas

Annals of Computer Science and Information Systems | 2020 | Vol. 21 | 227-236

EN

Knowledge graphs have been shown to play an important role in recent knowledge mining and discovery, for example in the field of life sciences or bioinformatics. Although a lot of research has been done in the fields of query optimization, query transformation and, of course, storing and retrieving large scale knowledge graphs, algorithmic optimization is still a major challenge and a vital factor in using graph databases. Few researchers have addressed the problem of optimizing algorithms on large scale labeled property graphs. Here, we present two optimization approaches and compare them with a naive approach of directly querying the graph database. The aim of our work is to determine limiting factors of graph databases like Neo4j, and we describe a novel solution to tackle these challenges. For this, we suggest a classification schema to differentiate between the complexity of problems on a graph database. We evaluate our optimization approaches on a test system containing a knowledge graph derived from biomedical publication data enriched with text mining data. This dense graph has more than 71M nodes and 850M relationships. The results are very encouraging and, depending on the problem, we were able to show a speedup by a factor of between 44 and 3839.

2

Knowledge Detection and Discovery using Semantic Graph Embeddings on Large Knowledge Graphs generated on Text Mining Results

Dörpinghaus Jens, Jacobs Marc

Annals of Computer Science and Information Systems | 2020 | Vol. 21 | 169-178

EN

Knowledge graphs play a central role in big data integration, especially for connecting data from different domains. Bringing unstructured texts, e.g. from scientific literature, into a structured, comparable format is one of the key assets. Here, we use knowledge graphs in the biomedical domain working together with text mining based document data for knowledge extraction and retrieval from text and natural language structures. For example, cause-and-effect models can potentially facilitate clinical decision making or help to drive research towards precision medicine. However, the power of knowledge graphs critically depends on context information. Here we provide a novel semantic approach towards a context enriched biomedical knowledge graph utilizing data integration with linked data applied to language technologies and text mining. This graph concept can be used for graph embedding applied in different approaches, e.g. with a focus on topic detection, document clustering and knowledge discovery. We discuss algorithmic approaches to tackle these challenges and show results for several applications like search query finding and knowledge discovery. The presented approaches lead to valuable results on large knowledge graphs.

3

A Locality Sensitive Hashing Filter for Encrypted Vector Databases

Kawamoto J.

Fundamenta Informaticae | 2015 | Vol. 137, nr 2 | 291-304

EN

We introduce a filtering methodology based on locality-sensitive hashing (LSH) and whitening transformation to reduce candidate tuples between which encrypted vector databases (EVDBs) must compute similarity for query processing. The LSH hashing methodology is efficient for estimating similarities between two vectors. It hashes a vector space using randomly chosen vectors. We can filter vectors that are less similar to the querying vectors by recording which hashed space each vector belongs to. However, if vectors in EVDBs are found locally, then most vectors are in the same hashed space, so the filter will not work. Because we can treat those cases using whitening transformation to distribute the vectors broadly, our proposed filtering methodology will work effectively on any vector space. We also show that our filter reduces the server’s query processing cost.
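
The filtering step described in this abstract can be illustrated with random-hyperplane LSH. The following is only a minimal sketch under stated assumptions, not the paper's method: it operates on plain (unencrypted) vectors, it omits the whitening transformation, and the names `make_hyperplanes`, `signature`, and `lsh_filter` are hypothetical.

```python
import random

def make_hyperplanes(num_planes, dim, seed=0):
    """Draw one random Gaussian normal vector per hyperplane (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]

def signature(vec, planes):
    """Record on which side of each hyperplane the vector lies (1 or 0 per plane)."""
    return tuple(
        1 if sum(p_i * v_i for p_i, v_i in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def lsh_filter(query, vectors, planes, max_mismatches=0):
    """Return indices of vectors whose signature differs from the query's
    in at most max_mismatches positions; the rest are filtered out."""
    q_sig = signature(query, planes)
    kept = []
    for idx, vec in enumerate(vectors):
        mismatches = sum(a != b for a, b in zip(q_sig, signature(vec, planes)))
        if mismatches <= max_mismatches:
            kept.append(idx)
    return kept
```

Only the vectors that survive `lsh_filter` would need an exact similarity computation. If all stored vectors point in nearly the same direction, they share one signature and nothing is filtered, which is the case the abstract proposes to handle with whitening.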

4

A Bi-objective Optimization Framework for Heterogeneous CPU/GPU Query Plans

Przymus P., Kaczmarski K., Stencel K.

Fundamenta Informaticae | 2014 | Vol. 135, nr 4 | 483-501

EN

Graphics Processing Units (GPU) have significantly more applications than just rendering images. They are also used in general-purpose computing to solve problems that can benefit from massive parallel processing. However, there are tasks that either hardly suit GPU or fit GPU only partially. The latter class is the focus of this paper. We elaborate on hybrid CPU/GPU computation and build optimization methods that seek the equilibrium between these two computation platforms. The method is based on heuristic search for bi-objective Pareto optimal execution plans in presence of multiple concurrent queries. The underlying model mimics the commodity market where devices are producers and queries are consumers. The value of resources of computing devices is controlled by supply-and-demand laws. Our model of the optimization criteria allows finding solutions of problems not yet addressed in heterogeneous query processing. Furthermore, it also offers lower time complexity and higher accuracy than other methods.

5

Extraction, representation and analysis of geographical spatial information from topographic paper maps

Dhar D. B., Chanda B.

Image Processing & Communications | 2010 | Vol. 15, no 1 | 5-22

EN

The emergence of Geographical Information Systems (GIS) has facilitated map acquisition and utilization to a great extent. This paper presents a novel methodology for the extraction, representation and analysis of geographically important objects and symbols from topographic sheets, as well as their spatial interrelationships. The method exploits various image processing tools suitable for specific objects and symbols on the basis of their geometrical and morphological attributes. The output is presented as a spatial database for further query processing and hence is suitable for GIS applications. The methodology is found to perform quite satisfactorily.

6

Efficient Parallel Query Processing by Graph Ranking

Dereniowski D., Kubale M.

Fundamenta Informaticae | 2006 | Vol. 69, nr 3 | 273-285

EN

In this paper we deal with the problem of finding an optimal query execution plan in database systems. We improve the analysis of a polynomial-time approximation algorithm due to Makino et al. for designing query execution plans with almost optimal number of parallel steps. This algorithm is based on the concept of edge ranking of graphs. We use a new upper bound for the edge ranking number of a tree to derive a better worst-case performance guarantee for this algorithm. We also present some experimental results obtained during the tests of the algorithm on random graphs in order to compare the quality of both approximation ratios on average. Both theoretical analysis and experimental results indicate the superiority of our approach.

7

Quick text retrieval algorithm supporting synonyms based on fuzzy logic

Saxena P. Ch., Gupta N.

Computing, Multimedia and Intelligent Techniques | 2006 | Vol. 2, nr 1 | 7-24

EN

Traditional information retrieval techniques have become inadequate for the increasingly vast amounts of text data. Here we show a method of query processing which retrieves not only the documents containing the query terms but also documents containing their synonyms. The method performs the query processing by retrieving and scanning the inverted index document list. We show that query response time for conjunctive Boolean queries can be dramatically reduced, at a cost in terms of secondary storage, by applying the range partitioning feature of Oracle to reduce the primary memory storage space required for looking up the inverted list. The proposed method is based on fuzzy relations and fuzzy reasoning to retrieve only top-ranking documents from the database, and on grouping of the retrieved documents through suffix tree clustering.
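
The core of a synonym-aware conjunctive Boolean query over an inverted index can be sketched as follows. This is a minimal sketch, assuming an in-memory index and a hand-written synonym table; it leaves out the fuzzy ranking, the Oracle range partitioning, and the suffix tree clustering mentioned in the abstract, and the names `build_inverted_index` and `conjunctive_query` are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lower-cased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def conjunctive_query(index, terms, synonyms):
    """AND the query terms together; each term also matches its synonyms.

    synonyms is an assumed dict mapping a term to an iterable of synonyms.
    """
    result = None
    for term in terms:
        variants = {term} | set(synonyms.get(term, ()))
        postings = set()
        for variant in variants:
            postings |= index.get(variant, set())  # union over term and synonyms
        result = postings if result is None else result & postings
    return result or set()
```

Each query term contributes the union of the posting lists of the term and its synonyms, and the per-term unions are then intersected, so a document qualifies when it contains every query concept in some form.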

8

Processing Indefinite Deductive Databases under the Possible Model Semantics

Johnson C.A.

Fundamenta Informaticae | 2002 | Vol. 49, nr 4 | 325-347

EN

The relationship between possible and supported models of unstratified indefinite deductive databases is studied, when disjunction is interpreted inclusively. Possible and supported models are shown to coincide under a suitable definition of supportedness, and the concept of a supported cover is introduced and shown to characterise possible models and facilitate top-down query processing and compilation under the possible model semantics. The properties and query processing of deductive databases under the possible model semantics are compared and contrasted with the perfect model semantics.

9

Przetwarzanie zapytań w rozproszonych relacyjnych bazach danych [Query processing in distributed relational databases]

Stasiecka A., Stemposz E.

Prace Instytutu Podstaw Informatyki Polskiej Akademii Nauk | 1999 | nr 890 | 1-29

PL

Opracowanie zawiera przegląd metod przetwarzania zapytań w rozproszonych relacyjnych systemach zarządzania bazami danych. Omówione zostały podstawowe założenia, pojęcia i metody. Intencją pracy jest ustosunkowanie się do rozwiązań, które pojawiły się w związku z przetwarzaniem zapytań w rozproszonych relacyjnych bazach danych (RRBD), po przeanalizowaniu ich pod kątem użyteczności dla przetwarzania zapytań w rozproszonych obiektowych bazach danych. Opracowanie omawia także koncepcję przetwarzania zapytań w rozproszonych obiektowych BD, będącą uogólnieniem techniki optymalizacji zapytań w rozproszonych relacyjnych BD, określanej jako pół-złączenia.

EN

The report is an overview of query processing methods in distributed relational database management systems. It presents basic assumptions, concepts and methods related to the subject. The main objective is to work out the view on solutions concerning query processing, which were developed in distributed RDBMS. The aim is to analyse them in order to recognize their capabilities for query processing in distributed object-oriented DBMS. Finally, the report presents a new idea of query processing in distributed object-oriented DBMS, which is a generalization of a query optimization technique in distributed RDBMS, known as semijoins.
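
The semijoin technique named in this abstract reduces the data shipped between sites: only the join-column values travel to the remote site, which then returns just the rows that have a join partner. The following is a minimal single-process sketch of that idea, assuming relations are lists of dicts; the names `semijoin`, `join`, and `distributed_join` are hypothetical, and no real network transfer takes place.

```python
def semijoin(left_rows, right_rows, key):
    """Return the rows of left_rows that have a join partner in right_rows."""
    right_keys = {row[key] for row in right_rows}  # only these values would be shipped
    return [row for row in left_rows if row[key] in right_keys]

def join(left_rows, right_rows, key):
    """Plain nested-loop equi-join that merges matching rows."""
    return [{**l, **r} for l in left_rows for r in right_rows if l[key] == r[key]]

def distributed_join(site_a_rows, site_b_rows, key):
    """Join via semijoin reduction: shrink site B's relation before it is
    shipped to site A. Returns the join result and the number of B rows shipped."""
    reduced_b = semijoin(site_b_rows, site_a_rows, key)  # evaluated at site B
    return join(site_a_rows, reduced_b, key), len(reduced_b)
```

The result equals the ordinary join, but only the reduced relation crosses the (simulated) network. The saving depends on how selective the join is: if almost every row of site B has a partner, shipping the keys first only adds cost.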

10

Optimization of object-oriented queries by factoring out independent subqueries

Płodzień J., Subieta K.

Prace Instytutu Podstaw Informatyki Polskiej Akademii Nauk | 1999 | nr 889 | 1-16

EN

We generalize query optimization methods based on rewriting for a general object-oriented model and a formalized OQL-like query language. Our approach makes it possible to detect and factor out independent subqueries in queries built upon traditional or new query operators, including dependent joins of OQL, quantifiers, generalized path expressions and method invocations. In contrast to well-known methods relying on specific patterns of algebraic or calculus expressions, our method is based on a formal analysis of scoping and binding rules for names occurring in a query and its subqueries. It depends neither on the complexity of an independent subquery nor on the operator connecting this subquery to its parent query. Being very general, the method is simple to understand and analyze. We follow the stack-based approach to object-oriented query languages (having roots in the semantics of programming languages), rather than object algebras or calculi.

PL

W pracy przedstawiamy uogólnienie metod optymalizacji zapytań opartych na przepisywaniu dla ogólnego modelu obiektowego i sformalizowanego języka zapytań w duchu OQL. Nasze podejście umożliwia wykrywanie i "wyciąganie przed nawias" niezależnych podzapytań występujących w zapytaniach konstruowanych przy pomocy tradycyjnych i nowych operatorów, m. in. zależnych złączeń z języka OQL, kwantyfikatorów, uogólnionych wyrażeń ścieżkowych i wywołań metod. W przeciwieństwie do dobrze znanych metod operujących na specyficznych wzorach wyrażeń algebraicznych lub wyrażeń pewnego rachunku, nasza metoda jest oparta na formalnej analizie zakresów i reguł wiązania dla nazw występujących w danym zapytaniu i jego podzapytaniach. Nie zależy ona ani od stopnia złożoności niezależnego podzapytania, ani od rodzaju operatora łączącego go z pytaniem nadrzędnym. Metoda ta jest jednocześnie ogólna oraz łatwa do zrozumienia i wykorzystania do analizy zapytań. W naszej metodzie używamy podejścia stosowego do obiektowych zapytań (które ma korzenie w semantyce języków programowania) zamiast obiektowych algebr i rachunków.
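
The effect of factoring out an independent subquery, as described in this entry's abstract, can be shown with a small Python analogy. This is only a sketch under stated assumptions: it uses plain Python loops rather than the stack-based query language of the paper, it does not implement the scoping and binding analysis that detects independence, it assumes a non-empty employee list, and the names `naive_query` and `factored_query` are hypothetical.

```python
def naive_query(employees):
    """Names of employees earning above average; the average-salary
    subquery is re-evaluated on every iteration."""
    evaluations = 0
    result = []
    for emp in employees:
        # This subquery never refers to emp, so it is independent of the loop.
        avg = sum(e["salary"] for e in employees) / len(employees)
        evaluations += 1
        if emp["salary"] > avg:
            result.append(emp["name"])
    return result, evaluations

def factored_query(employees):
    """Same query with the independent subquery evaluated once, before the loop."""
    avg = sum(e["salary"] for e in employees) / len(employees)
    evaluations = 1
    result = [emp["name"] for emp in employees if emp["salary"] > avg]
    return result, evaluations
```

Both functions return the same names, but the second evaluates the subquery once instead of once per employee. The rewrite is safe only because no name inside the subquery is bound by the enclosing iteration.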

11

Processing deductive databases under the disjunctive stable model semantics

Johnson C.A.

Fundamenta Informaticae | 1999 | Vol. 40, nr 1 | 31-51

EN

Cyclic covers are shown to characterise disjunctive stable models of unstratified deductive databases, and to facilitate top-down query processing, query compilation and view updating under the disjunctive stable model semantics. Such processing is shown to be more complex than comparable processing of stratified databases.

12

Rozproszone obiektowe bazy danych [Distributed object-oriented databases]

Stasiecka A., Stemposz E., Subieta K.

Prace Instytutu Podstaw Informatyki Polskiej Akademii Nauk | 1998 | nr 857 | 3-59

PL

Opracowanie zawiera przegląd zagadnień związanych z tematyką rozproszonych obiektowych baz danych. Omówione są podstawowe cele, założenia, pojęcia i metody wiążące się z tą dziedziną. Przedstawiono typowe architektury systemów rozproszonych baz danych (architektury: klient-serwer i klient-broker-serwer), a także problemy związane z rozproszeniem zasobów (m.in. replikacje, migracje obiektów, przetwarzanie transakcji, przetwarzanie zapytań, mediatory, osłony, perspektywy). Jakkolwiek główny nacisk został położony na pojęcia i metody związane z obiektowymi bazami danych, opracowanie zawiera również krótkie omówienie technologii komponentowych, takich jak CORBA.

EN

The report contains an overview of distributed object-oriented database systems. The basic goals, assumptions, concepts and methods related to this subject are outlined. Typical architectures of distributed databases (client-server and client-broker-server architectures) are discussed. Problems connected with resource distribution (among others replication, object migration, transaction processing, query optimization, mediators, wrappers, views) are presented. Although the main emphasis is put on notions and methods related to object-oriented database systems, the report also contains a short description of component technologies, such as CORBA.

13

Kompilacja i optymalizacja obiektowych zapytań [Compilation and optimization of object-oriented queries]

Płodzień J., Subieta K.

Prace Instytutu Podstaw Informatyki Polskiej Akademii Nauk | 1998 | nr 869 | 3-48

PL

W pracy przedstawiamy wstępną propozycję sposobu kompilacji i optymalizacji zapytań w obiektowych bazach danych. Z istniejących podejść obiektowych wybraliśmy podejście stosowe zaimplementowane w systemie LOQIS. Jako ogólną metodę optymalizacji zapytań proponujemy wykorzystanie reguł przepisywania. Za ich pomocą przekształcamy tekstową postać zapytania do takiej semantycznie równoważnej postaci, której czas ewaluacji jest krótszy. Na koniec krótko dyskutujemy możliwość użycia do optymalizacji zapytań indeksów, przy pomocy zbioru specjalnych funkcji.

EN

We address the problem of query compilation and optimization in object-oriented databases. From the existing object-oriented approaches we have chosen the stack-based approach implemented in the LOQIS system. As a general method of query optimization we use rewriting rules. They transform the textual form of a query into a semantically equivalent form whose evaluation time is shorter. Finally, we briefly discuss the possibility of using indexes for query optimization, which can be accomplished by involving a set of special functions.