Querying Distributed RDF Data Sources with SPARQL

Event in seriesESWC +
Has BenchmarkSubset of DBpedia. +
Has DataCatalougeService Description +
Has DescriptionIn this section we evaluate the performance of the DARQ query engine. The prototype was implemented in Java as an extension to ARQ. We used a subset of DBpedia, which contains RDF information extracted from Wikipedia. The dataset is offered in different parts. +
Has DimensionsPerformance +
Has DocumentationURLhttp://darq.sf.net/ +
Has Downloadpagehttp://darq.sf.net/ +
Has EvaluationEvaluate the performance of the DARQ query engine. +
Has EvaluationMethodEvaluate the performance of the DARQ query engine. +
Has ExperimentSetupWe split all data over two Sun-Fire-880 machines (8x sparcv9 CPU, 1050 MHz, 16 GB RAM) running SunOS 5.10. The SPARQL endpoints were provided using Virtuoso Server 5.0.3 with an allowed memory usage of 8 GB. Note that, although we used only two physical servers, there were five logical SPARQL endpoints. DARQ was running on Sun Java 1.6.0 on a Linux system with Intel Core Duo CPUs (2.13 GHz) and 4 GB RAM. The machines were connected over a standard 100 Mbit network connection. +
Has GUINo +
Has Hypothesis- +
Has ImplementationDARQ +
Has InfoRepresentationRDF +
Has PositiveAspectsQuery rewriting and cost-based query optimization to speed-up query execution. +
Has ResultsThe experiments show that our optimizations significantly improve query evaluation performance. For query Q1, the execution times of optimized and unoptimized execution are almost the same. This is because the query plans for both cases are the same, and using bind joins for all sub-queries in order of appearance is exactly the right strategy. For queries Q2 and Q4, the unoptimized queries took longer than 10 min to answer and timed out, whereas the execution time of the optimized queries is quite reasonable. The optimized execution of Q1 and Q2 takes almost the same time because Q2 is rewritten into Q1. +
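The bind-join strategy mentioned in the results can be illustrated with a minimal sketch. This is not DARQ's implementation: the in-memory triple lists stand in for remote SPARQL endpoints, and the names `match`, `evaluate`, and `bind_join` as well as the sample data are hypothetical. The idea shown is only that each sub-query is evaluated in order of appearance, with the bindings found so far substituted into the next sub-query.

```python
def match(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; None on conflict."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check that it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant term that does not match the triple.
            return None
    return result


def evaluate(pattern, triples, binding):
    """Stand-in (assumption) for sending one sub-query, with `binding` applied, to an endpoint."""
    return [b for t in triples if (b := match(pattern, t, binding)) is not None]


def bind_join(sub_queries):
    """Evaluate (pattern, triples) pairs in order, feeding bindings forward."""
    bindings = [{}]
    for pattern, triples in sub_queries:
        bindings = [b2 for b in bindings for b2 in evaluate(pattern, triples, b)]
    return bindings


# Hypothetical data held by two separate endpoints.
persons = [("ex:alice", "foaf:name", "Alice"), ("ex:bob", "foaf:name", "Bob")]
pubs = [("ex:p1", "dc:creator", "ex:alice")]

results = bind_join([
    (("?paper", "dc:creator", "?person"), pubs),
    (("?person", "foaf:name", "?name"), persons),
])
```

Because the second sub-query is only evaluated with `?person` already bound, the second endpoint never has to return names for people who authored nothing, which is why the order of sub-queries matters for cost.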
Has SubproblemQuerying Distributed RDF Data Sources +
Has Version1.0 +
Has abstractDARQ provides transparent query access to multiple SPARQL services, i.e., it gives the user the impression of querying one single RDF graph even though the real data is distributed on the web. A service description language enables the query engine to decompose a query into sub-queries, each of which can be answered by an individual service. DARQ also uses query rewriting and cost-based query optimization to speed up query execution. +
Has approachdecompose a query into sub-queries, each of which can be answered by an individual service. +
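The decomposition step described above can be sketched roughly as follows. This is an illustrative simplification, not DARQ's service description language: it assumes each service description is reduced to the set of predicates an endpoint can answer, and the endpoint URLs, the `SERVICE_DESCRIPTIONS` mapping, and the `decompose` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical service descriptions: endpoint URL -> predicates it can answer.
SERVICE_DESCRIPTIONS = {
    "http://example.org/sparql/persons": {"foaf:name", "foaf:mbox"},
    "http://example.org/sparql/pubs": {"dc:creator", "dc:title"},
}


def decompose(triple_patterns, descriptions):
    """Group triple patterns into one sub-query per endpoint able to answer them."""
    sub_queries = defaultdict(list)
    for pattern in triple_patterns:
        _, predicate, _ = pattern
        # Select every endpoint whose description lists this predicate.
        matches = [url for url, preds in descriptions.items() if predicate in preds]
        if not matches:
            raise ValueError(f"no service can answer predicate {predicate}")
        for url in matches:
            sub_queries[url].append(pattern)
    return dict(sub_queries)


query = [
    ("?person", "foaf:name", "?name"),
    ("?paper", "dc:creator", "?person"),
    ("?paper", "dc:title", "?title"),
]
plan = decompose(query, SERVICE_DESCRIPTIONS)
```

Here `plan` maps each endpoint to the triple patterns it should receive; the client still issues a single query and never needs to know which endpoint holds which data.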
Has authorsBastian Quilitz + and Ulf Leser +
Has conclusionDARQ offers a single interface for querying multiple, distributed SPARQL endpoints and makes query federation transparent to the client. One key feature of DARQ is that it relies solely on the SPARQL standard and is therefore compatible with any SPARQL endpoint implementing this standard. Using service descriptions provides a powerful way to dynamically add endpoints to and remove endpoints from the query engine in a manner that is completely transparent to the user. To reduce execution costs, we introduced basic query optimization for SPARQL queries. Our experiments show that the optimization algorithm can drastically improve query performance and allows SPARQL queries over distributed sources to be answered in reasonable time. Because the algorithm relies on only a very small amount of statistical information, we expect that further improvements are possible using techniques. An important issue when dealing with data from multiple data sources is the difference in the vocabularies used and in the representation of information. In further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. We will also investigate generalizing the query patterns that can be handled, as well as blank nodes and identity relationships across graphs. +
Has future workIn further work, we plan to work on mapping and translation rules between the vocabularies used by different SPARQL endpoints. We will also investigate generalizing the query patterns that can be handled, as well as blank nodes and identity relationships across graphs. +
Has platformJena +
Has problemSPARQL Query Federation +
Has relatedProblemTransparent query federation +
Has subjectQuerying Distributed RDF Data Sources +
Has vendorOpen Source +
Has year2008 +
ImplementedIn ProgLangJava +
RunsOn OSLinux, SunOS 5.10 +
TitleQuerying Distributed RDF Data Sources with SPARQL +
Uses FrameworkARQ +
Uses ToolboxNo data available now. +