Adaptive Integration of Distributed Semantic Web Data

Bibliographical Metadata
Subject: Querying Distributed RDF Data Sources
Year: 2010
Authors: Steven Lynden, Isao Kojima, Akiyoshi Matono, Yusuke Tanimura
Venue: DNIS
Content Metadata
Problem: SPARQL Query Federation
Approach: Distributed Query Processing
Implementation: ADERIS
Evaluation: Performance Analysis

Abstract

The use of RDF (Resource Description Framework) data is a cornerstone of the Semantic Web. RDF data embedded in Web pages may be indexed using semantic search engines; however, RDF data is often stored in databases accessible via Web Services using the SPARQL query language for RDF. These databases form part of the Deep Web, which is not accessible using search engines. This paper addresses the problem of effectively integrating RDF data stored in separate Web-accessible databases. An approach based on distributed query processing is described, where data from multiple repositories are used to construct partitioned tables that are integrated using an adaptive query processing technique supporting join reordering. This limits any reliance on statistics and metadata about SPARQL endpoints; such information is often inaccurate or unavailable, yet is required by existing systems supporting federated SPARQL queries. The approach presented extends existing approaches in this area by allowing tables to be added to the query plan while it is executing, and shows how an approach currently used within relational query processing can be applied to distributed SPARQL query processing. The approach is evaluated using a prototype implementation and potential applications are discussed.
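
The idea of partitioned predicate tables joined in an order chosen at run time can be pictured with a small in-memory sketch. Everything in it is an assumption made for illustration: the endpoint names, the `ex:` triples, the helper functions, and above all the greedy "smallest observed table first" rule, which is a simple stand-in and not the join reordering algorithm used in the paper. Real endpoints would be queried over HTTP with SPARQL rather than read from Python lists.

```python
from collections import defaultdict

# Hypothetical stand-ins for SPARQL endpoints: lists of (subject, predicate, object).
ENDPOINTS = {
    "endpoint_a": [
        ("ex:city1", "ex:locatedIn", "ex:country1"),
        ("ex:city2", "ex:locatedIn", "ex:country1"),
        ("ex:city3", "ex:locatedIn", "ex:country2"),
    ],
    "endpoint_b": [
        ("ex:city4", "ex:locatedIn", "ex:country2"),
        ("ex:country1", "ex:capital", "ex:city1"),
        ("ex:country2", "ex:capital", "ex:city3"),
    ],
    "endpoint_c": [
        ("ex:city1", "ex:mayor", "ex:person1"),
    ],
}


def build_predicate_tables(endpoints):
    """Build one table per predicate, with one partition per contributing endpoint."""
    tables = defaultdict(dict)
    for name, triples in endpoints.items():
        for s, p, o in triples:
            tables[p].setdefault(name, []).append((s, o))
    return tables


def rows(table):
    """Flatten all partitions of a predicate table into one list of (subject, object)."""
    return [row for partition in table.values() for row in partition]


def join(bindings, pattern, table_rows):
    """Extend each binding with every table row that agrees on already-bound variables."""
    s_var, _, o_var = pattern
    out = []
    for b in bindings:
        for s, o in table_rows:
            if b.get(s_var, s) == s and b.get(o_var, o) == o:
                extended = dict(b)
                extended[s_var] = s
                extended[o_var] = o
                out.append(extended)
    return out


def adaptive_join(patterns, tables):
    """Pick the next join at run time from observed table sizes (illustrative heuristic)."""
    remaining = list(patterns)
    bindings, bound, order = [{}], set(), []
    while remaining:
        # Prefer patterns that share a variable with the results so far,
        # which avoids cross products; fall back to all remaining patterns.
        connected = [p for p in remaining
                     if not bound or p[0] in bound or p[2] in bound] or remaining
        nxt = min(connected, key=lambda p: len(rows(tables.get(p[1], {}))))
        bindings = join(bindings, nxt, rows(tables.get(nxt[1], {})))
        bound.update((nxt[0], nxt[2]))
        order.append(nxt[1])
        remaining.remove(nxt)
    return bindings, order


QUERY = [
    ("?city", "ex:locatedIn", "?country"),
    ("?country", "ex:capital", "?cap"),
    ("?city", "ex:mayor", "?m"),
]

tables = build_predicate_tables(ENDPOINTS)
bindings, order = adaptive_join(QUERY, tables)
print(order)
print(bindings)
```

In this sketch the `ex:locatedIn` table is split into two partitions, one from each endpoint that holds such triples, and the join order is decided only from the sizes of the tables actually retrieved, so no endpoint statistics are needed up front.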

Conclusion

An adaptive framework has been presented for executing queries over multiple SPARQL endpoints; it differs from existing approaches, which use static query optimisation techniques. Many SPARQL web services are currently available and their number is growing. The work presented in this paper is a framework for executing queries over federations of such services. The proposed framework allows adaptive query processing over dynamically constructed predicate tables to be performed in conjunction with the construction of those tables, and was shown to perform relatively well in unpredictable environments where source query failures may occur. The prototype was evaluated using real data (a subset of DBPedia), showing some advantage in response times for adaptive over non-adaptive methods.
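
As a rough illustration of why source query failures need not stop a query, the sketch below builds a predicate table from whichever sources respond and records the ones that fail. The function names, the simulated sources, and the choice of `ConnectionError` are all hypothetical; the paper's own failure handling is not described on this page, so this should be read as one possible arrangement rather than ADERIS behaviour.

```python
def fetch_partitions(predicate, sources):
    """Query every source for one predicate and keep the partitions that succeed."""
    partitions, failed = {}, []
    for name, query in sources.items():
        try:
            partitions[name] = query(predicate)
        except ConnectionError:
            # A failed source query costs one partition, not the whole table.
            failed.append(name)
    return partitions, failed


def healthy_source(predicate):
    """Simulated endpoint that returns (subject, object) pairs for the predicate."""
    return [("ex:s1", "ex:o1"), ("ex:s2", "ex:o2")]


def failing_source(predicate):
    """Simulated endpoint that is unavailable."""
    raise ConnectionError("endpoint unavailable")


partitions, failed = fetch_partitions(
    "ex:p",
    {"endpoint_a": healthy_source, "endpoint_b": failing_source},
)
print(partitions)
print(failed)
```

Because the table is assembled partition by partition, joins can start on the partitions that have arrived while a failed source is retried or skipped.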

Future work

Future work will aim to investigate other data sets with different characteristics and larger data sets. As the approach presented in this paper focuses on efficiently executing a specific kind of query, that of adaptively ordering multiple joins, further work will focus on optimising other kinds of queries and implementing support for more SPARQL query language features. Future work will also concentrate on investigating how the work can be applied in various domains.

Approach

Positive Aspects: No data available now.

Negative Aspects: No data available now.

Limitations: No data available now.

Challenges: No data available now.

Proposes Algorithm: No data available now.

Methodology: No data available now.

Requirements: No data available now.

Implementations

Download-page: No data available now.

Access API: No data available now.

Information Representation: RDF

Data Catalogue: Predicate List during setup phase

Runs on OS: OS independent

Vendor: No data available now.

Uses Framework: No data available now.

Has Documentation URL: No data available now.

Programming Language: Java

Version: No data available now.

Platform: -

Toolbox: No data available now.

GUI: Yes

Research Problem

Subproblem of: No data available now.

RelatedProblem: No data available now.

Motivation: No data available now.

Evaluation

Experiment Setup: Endpoint machines are connected to the machine on which the mediator is deployed (2 GHz AMD Athlon X2, 2 GB RAM) via a 100 Mb/s Ethernet LAN.

Evaluation Method: No data available now.

Hypothesis: No data available now.

Description: No data available now.

Dimensions: Performance

Benchmark used: DBPedia

Results: No data available now.
