Difference between revisions of "ANAPSID: An Adaptive Query Processing Engine for SPARQL Endpoints"

Revision as of 10:28, 7 May 2018

ANAPSID: An Adaptive Query Processing Engine for SPARQL Endpoints
Bibliographical Metadata
Keywords:	Adaptive Query Processing, ANAPSID, Linked Data
Year:	2011
Authors:	Maribel Acosta, Maria-Esther Vidal, Tomas Lampo, Julio Castillo
Venue:	ISWC
Content Metadata
Problem:	SPARQL Query Federation, Query Execution, Source Selection
Approach:	Querying Distributed RDF Data Sources
Implementation:	ANAPSID

Abstract

Following the design rules of Linked Data, the number of available SPARQL endpoints that support remote query processing is quickly growing; however, because of the lack of adaptivity, query executions may frequently be unsuccessful. First, fixed plans identified following the traditional optimize-then-execute paradigm may time out as a consequence of endpoint availability. Second, because blocking operators are usually implemented, endpoint query engines are not able to incrementally produce results, and may become blocked if data sources stop sending data. We present ANAPSID, an adaptive query engine for SPARQL endpoints that adapts query execution schedulers to data availability and run-time conditions. ANAPSID provides physical SPARQL operators that detect when a source becomes blocked or data traffic is bursty, and opportunistically, the operators produce results as quickly as data arrives from the sources. Additionally, ANAPSID operators implement main memory replacement policies to move previously computed matches to secondary memory avoiding duplicates. We compared ANAPSID performance with respect to RDF stores and endpoints, and observed that ANAPSID speeds up execution time, in some cases, by more than one order of magnitude.

Conclusion

We have defined ANAPSID, an adaptive query processing engine for RDF Linked Data accessible through SPARQL endpoints. ANAPSID provides a set of physical operators and an execution engine able to adapt the query execution to the availability of the endpoints and to hide delays from users. Reported experimental results suggest that our proposed techniques reduce execution times and are able to produce answers when other engines fail. Also, depending on the selectivity of the join operator and the data transfer delays, ANAPSID operators may outperform state-of-the-art Symmetric Hash Join operators. In the future, we plan to extend ANAPSID with more powerful and lightweight operators like Eddy and MJoin, which are able to route received responses through different operators and adapt the execution to unpredictable delays by changing the order in which each data item is routed.
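
For readers unfamiliar with the Symmetric Hash Join baseline named above, the following is a minimal sketch of the general technique, not ANAPSID's own operators and not code from the ANAPSID repository. It keeps one hash table per input and emits join results as soon as a matching tuple arrives from either side, which is what makes the operator non-blocking. The class name `SymmetricHashJoin`, the `insert` method, and the dictionary-based row format are assumptions made for illustration only.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Illustrative non-blocking equi-join over two inputs that arrive row by row."""

    def __init__(self, key):
        # Name of the shared join variable (hypothetical; chosen by the caller).
        self.key = key
        # One hash table per input, mapping join-key value -> rows seen so far.
        self.tables = (defaultdict(list), defaultdict(list))

    def insert(self, side, row):
        """Register `row` (a dict) arriving from input `side` (0 or 1).

        Returns the list of join results that this single arrival produces.
        """
        value = row[self.key]
        # Store the row so that later arrivals from the other input can match it.
        self.tables[side][value].append(row)
        # Probe the opposite table for rows with the same join-key value.
        matches = self.tables[1 - side].get(value, [])
        # Merge so that fields of input 0 always come first in the result.
        if side == 0:
            return [{**row, **other} for other in matches]
        return [{**other, **row} for other in matches]
```

A caller would create one instance per join, for example `SymmetricHashJoin("drug")`, and call `insert(0, row)` or `insert(1, row)` each time a row arrives from the corresponding endpoint. Because every arrival is probed immediately, results appear incrementally and a delay on one input does not hold back matches that are already possible. This sketch keeps all rows in main memory; the replacement policies that move matches to secondary memory, as described in the abstract, are not modelled here.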

Future work

In the future we plan to extend ANAPSID with more powerful and lightweight operators like Eddy and MJoin, which are able to route received responses through different operators, and adapt the execution to unpredictable delays by changing the order in which each data item is routed.

Approach

Positive Aspects: {{{PositiveAspects}}}

Negative Aspects: {{{NegativeAspects}}}

Limitations: {{{Limitations}}}

Challenges: {{{Challenges}}}

Proposes Algorithm: {{{ProposesAlgorithm}}}

Methodology: {{{Methodology}}}

Requirements: {{{Requirements}}}

Implementations

Download-page: https://github.com/anapsid/anapsid

Access API: -

Information Representation: RDF

Data Catalogue: {{{Catalogue}}}

Runs on OS: Linux CentOS

Vendor: -

Uses Framework: Twisted Network framework

Has Documentation URL: https://github.com/anapsid/anapsid

Programming Language: Python 2.6.5

Version: 1

Platform: -

Toolbox: -

GUI: No

Research Problem

Subproblem of: {{{Subproblem}}}

RelatedProblem: {{{RelatedProblem}}}

Motivation: {{{Motivation}}}

Evaluation

Experiment Setup: {{{ExperimentSetup}}}

Evaluation Method: {{{EvaluationMethod}}}

Hypothesis: {{{Hypothesis}}}

Description: {{{Description}}}

Dimensions: {{{Dimensions}}}

Benchmark used: FedBench

Results: {{{Results}}}

@@ Line 11: / Line 11: @@
 |Problem=SPARQL Query FederationQuery ExecutionSource Selection,
 |Implementation=ANAPSID
+|Download-page=https://github.com/anapsid/anapsid
+|API=-
+|InfoRepresentation=RDF
+|OS=Linux CentOS
+|vendor=-
+|Framework=Twisted Network framework
+|DocumentationURL=https://github.com/anapsid/anapsid
+|ProgLang=Python 2.6.5
+|Version=1
+|Platform=-
+|Toolbox=-
 |GUI=No
-|Field=Federated Question Answering
+|Benchmark=FedBench
-|Publication venues=ISWC,
 }}
 [[Category:Paper]]
