Difference between revisions of "Querying Distributed RDF Data Sources with SPARQL"

From Openresearch
Jump to: navigation, search
(Created page with "{{Paper |Title=Querying Distributed RDF Data Sources with SPARQL |Subject=Querying Distributed RDF Data Sources |Authors=Bastian Quilitz, Ulf Leser, |Series=ESWC |Abstract=DAR...")
 
Line 4: Line 4:
 
|Authors=Bastian Quilitz, Ulf Leser,
 
|Authors=Bastian Quilitz, Ulf Leser,
 
|Series=ESWC
 
|Series=ESWC
 +
|Year=2008
 
|Abstract=DARQ provides transparent query access to multiple SPARQL services, i.e., it gives the user the impression to query one single RDF graph despite the real data being distributed on the web. A service description language enables the query engine to decompose a query into sub-queries, each of which can be answered by an individual service. DARQ also uses query rewriting and cost-based query optimization to speed-up query execution.
 
|Abstract=DARQ provides transparent query access to multiple SPARQL services, i.e., it gives the user the impression to query one single RDF graph despite the real data being distributed on the web. A service description language enables the query engine to decompose a query into sub-queries, each of which can be answered by an individual service. DARQ also uses query rewriting and cost-based query optimization to speed-up query execution.
 +
|Conclusion=DARQ offers a single interface for querying multiple, distributed SPARQL end-points and makes query federation transparent to the client. One key feature of DARQ is that it solely relies on the SPARQL standard and therefore is compatible to any SPARQL endpoint implementing this standard. Using service descriptions provides a powerful way to dynamically add and remove endpoints to the query engine in a manner that is completely transparent to the user. To reduce execution costs we introduced basic query optimization for SPARQL queries. Our experiments show that the optimization algorithm can drastically improve query performance and allow distributed answering of SPARQL queries over distributed sources in reasonable time. Because the algorithm only relies on a very small amount of statistical information we expect that further improvements are possible using techniques. An important issue when dealing with data from multiple data sources are differences in the used vocabularies and the representation of information. In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. Also, we will investigate generalizing the query patterns that can be handled and blank nodes and identity relationships across graphs.
 
|Future work=In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. Also, we will investigate generalizing the query patterns that can be handled and blank nodes and identity relationships across graphs.
 
|Future work=In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. Also, we will investigate generalizing the query patterns that can be handled and blank nodes and identity relationships across graphs.
|Year=2008
+
|Problem=SPARQL Query Federation
|Conclusion=DARQ offers a single interface for querying multiple, distributed SPARQL end-points and makes query federation transparent to the client. One key feature of DARQ is that it solely relies on the SPARQL standard and therefore is compatible to any SPARQL endpoint implementing this standard. Using service descriptions provides a powerful way to dynamically add and remove endpoints to the query engine in a manner that is completely transparent to the user. To reduce execution costs we introduced basic query optimization for SPARQL queries. Our experiments show that the optimization algorithm can drastically improve query performance and allow distributed answering of SPARQL queries over distributed sources in reasonable time. Because the algorithm only relies on a very small amount of statistical information we expect that further improvements are possible using techniques. An important issue when dealing with data from multiple data sources are differences in the used vocabularies and the representation of information. In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. Also, we will investigate generalizing the query patterns that can be handled and blank nodes and identity relationships across graphs.
+
|Approach=decompose a query into sub-queries, each of which can be answered by an individual service.
 
|Implementation=DARQ
 
|Implementation=DARQ
 +
|Evaluation=Evaluate the performance of the DARQ query engine.
 +
|PositiveAspects=Query rewriting and cost-based query optimization to speed-up query execution.
 
|GUI=No
 
|GUI=No
 +
|RelatedProblem=transparent query federation
 
}}
 
}}

Revision as of 15:18, 3 July 2018

Querying Distributed RDF Data Sources with SPARQL
Querying Distributed RDF Data Sources with SPARQL
Bibliographical Metadata
Subject: Querying Distributed RDF Data Sources
Year: 2008
Authors: Bastian Quilitz, Ulf Leser
Venue ESWC
Content Metadata
Problem: SPARQL Query Federation
Approach: decompose a query into sub-queries, each of which can be answered by an individual service.
Implementation: DARQ
Evaluation: Evaluate the performance of the DARQ query engine.

Abstract

DARQ provides transparent query access to multiple SPARQL services, i.e., it gives the user the impression to query one single RDF graph despite the real data being distributed on the web. A service description language enables the query engine to decompose a query into sub-queries, each of which can be answered by an individual service. DARQ also uses query rewriting and cost-based query optimization to speed-up query execution.

Conclusion

DARQ offers a single interface for querying multiple, distributed SPARQL end-points and makes query federation transparent to the client. One key feature of DARQ is that it solely relies on the SPARQL standard and therefore is compatible to any SPARQL endpoint implementing this standard. Using service descriptions provides a powerful way to dynamically add and remove endpoints to the query engine in a manner that is completely transparent to the user. To reduce execution costs we introduced basic query optimization for SPARQL queries. Our experiments show that the optimization algorithm can drastically improve query performance and allow distributed answering of SPARQL queries over distributed sources in reasonable time. Because the algorithm only relies on a very small amount of statistical information we expect that further improvements are possible using techniques. An important issue when dealing with data from multiple data sources are differences in the used vocabularies and the representation of information. In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. Also, we will investigate generalizing the query patterns that can be handled and blank nodes and identity relationships across graphs.

Future work

In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. Also, we will investigate generalizing the query patterns that can be handled and blank nodes and identity relationships across graphs.

Approach

Positive Aspects: Query rewriting and cost-based query optimization to speed-up query execution.

Negative Aspects: {{{NegativeAspects}}}

Limitations: {{{Limitations}}}

Challenges: {{{Challenges}}}

Proposes Algorithm: {{{ProposesAlgorithm}}}

Methodology: {{{Methodology}}}

Requirements: {{{Requirements}}}

Limitations: {{{Limitations}}}

Implementations

Download-page: {{{Download-page}}}

Access API: {{{API}}}

Information Representation: {{{InfoRepresentation}}}

Data Catalogue: {{{Catalogue}}}

Runs on OS: {{{OS}}}

Property "RunsOn OS" (as page type) with input value "{{{OS}}}" contains invalid characters or is incomplete and therefore can cause unexpected results during a query or annotation process.

Vendor: {{{vendor}}}

Uses Framework: {{{Framework}}}

Has Documentation URL: {{{DocumentationURL}}}

Programming Language: {{{ProgLang}}}

Property "ImplementedIn ProgLang" (as page type) with input value "{{{ProgLang}}}" contains invalid characters or is incomplete and therefore can cause unexpected results during a query or annotation process.

Version: {{{Version}}}

Platform: {{{Platform}}}

Toolbox: {{{Toolbox}}}

GUI: No

Research Problem

Subproblem of: {{{Subproblem}}}

Property "Has Subproblem" (as page type) with input value "{{{Subproblem}}}" contains invalid characters or is incomplete and therefore can cause unexpected results during a query or annotation process.

RelatedProblem: transparent query federation

Motivation: {{{Motivation}}}

Evaluation

Experiment Setup: {{{ExperimentSetup}}}

Evaluation Method : {{{EvaluationMethod}}}

Hypothesis: {{{Hypothesis}}}

Description: {{{Description}}}

Dimensions: {{{Dimensions}}}

Benchmark used: {{{Benchmark}}}

Property "Has Benchmark" (as page type) with input value "{{{Benchmark}}}" contains invalid characters or is incomplete and therefore can cause unexpected results during a query or annotation process.

Results: {{{Results}}}

Access API{{{API}}} +
Event in seriesESWC +
Has Challenges{{{Challenges}}} +
Has DataCatalouge{{{Catalogue}}} +
Has Description{{{Description}}} +
Has Dimensions{{{Dimensions}}} +
Has DocumentationURLhttp://{{{DocumentationURL}}} +
Has Downloadpagehttp://{{{Download-page}}} +
Has EvaluationEvaluate the performance of the DARQ query engine. +
Has EvaluationMethod{{{EvaluationMethod}}} +
Has ExperimentSetup{{{ExperimentSetup}}} +
Has GUINo +
Has Hypothesis{{{Hypothesis}}} +
Has ImplementationDARQ +
Has InfoRepresentation{{{InfoRepresentation}}} +
Has Limitations{{{Limitations}}} +
Has NegativeAspects{{{NegativeAspects}}} +
Has PositiveAspectsQuery rewriting and cost-based query optimization to speed-up query execution. +
Has Requirements{{{Requirements}}} +
Has Results{{{Results}}} +
Has Version{{{Version}}} +
Has abstractDARQ provides transparent query access to
DARQ provides transparent query access to multiple SPARQL services, i.e., it gives the user the impression to query one single RDF graph despite the real data being distributed on the web. A service description language enables the query engine to decompose a query into sub-queries, each of which can be answered by an individual service. DARQ also uses query rewriting and cost-based query optimization to speed-up query execution.
optimization to speed-up query execution. +
Has approachdecompose a query into sub-queries, each of which can be answered by an individual service. +
Has authorsBastian Quilitz + and Ulf Leser +
Has conclusionDARQ offers a single interface for queryin
DARQ offers a single interface for querying multiple, distributed SPARQL end-points and makes query federation transparent to the client. One key feature of DARQ is that it solely relies on the SPARQL standard and therefore is compatible to any SPARQL endpoint implementing this standard. Using service descriptions provides a powerful way to dynamically add and remove endpoints to the query engine in a manner that is completely transparent to the user. To reduce execution costs we introduced basic query optimization for SPARQL queries. Our experiments show that the optimization algorithm can drastically improve query performance and allow distributed answering of SPARQL queries over distributed sources in reasonable time. Because the algorithm only relies on a very small amount of statistical information we expect that further improvements are possible using techniques. An important issue when dealing with data from multiple data sources are differences in the used vocabularies and the representation of information. In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. Also, we will investigate generalizing the query patterns that can be handled and blank nodes and identity relationships across graphs.
and identity relationships across graphs. +
Has future workIn further work, we plan to work on mappin
In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. Also, we will investigate generalizing the query patterns that can be handled and blank nodes and identity relationships across graphs.
and identity relationships across graphs. +
Has motivation{{{Motivation}}} +
Has platform{{{Platform}}} +
Has problemSPARQL Query Federation +
Has relatedProblemTransparent query federation +
Has subjectQuerying Distributed RDF Data Sources +
Has vendor{{{vendor}}} +
Has year2008 +
Proposes Algorithm{{{ProposesAlgorithm}}} +
TitleQuerying Distributed RDF Data Sources with SPARQL +
Uses Framework{{{Framework}}} +
Uses Methodology{{{Methodology}}} +
Uses Toolbox{{{Toolbox}}} +