October 2017 – Olaf Hartig

The Linked Data Fragments (LDF) conceptual framework is an attempt to address the realization in the Semantic Web community that SPARQL endpoints (i.e., Web services that execute any given SPARQL query over an internal RDF dataset) are perhaps not the ultimate solution for providing public query access to datasets on the Web. The trouble with these endpoints is that they are easily overloaded when trying to answer arbitrary SPARQL queries, in particular when the queries are complex or when multiple clients issue queries concurrently. The main idea of the LDF framework is to consider other types of query-based data access interfaces that are more restricted in the types of queries they support and that thereby shift the effort of executing more complex queries to the clients. The initial example of such an interface is the Triple Pattern Fragment (TPF) interface, which limits the server-side effort to the evaluation of triple patterns only (i.e., the simplest type of pattern that SPARQL queries are built from); any other operation needed for a given query has to be performed by a client-side query execution algorithm that is based on obtaining triple pattern results from a TPF server. Several such algorithms have been proposed in the literature, and so have a number of other types of LDF interfaces. Each such proposal aims to hit a sweet spot of trade-offs along multiple dimensions such as server-side load, query throughput, and network load.
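To make the division of labor concrete, here is a minimal sketch in Python of the TPF idea. It is an illustration under simplifying assumptions, not an actual TPF implementation: the "server" is an in-memory function rather than a Web service, there is no paging or metadata, and all names (`DATA`, `server_match`, `client_evaluate`, and the toy triples) are made up for this example. The server only matches a single triple pattern; the client evaluates a conjunction of patterns by substituting the bindings found so far into the next pattern and asking the server again.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str, str]  # terms starting with "?" are variables
Binding = Dict[str, str]

# Toy dataset standing in for the server's internal RDF data (hypothetical).
DATA: List[Triple] = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "LiU"),
    ("carol", "worksAt", "LiU"),
]


def is_var(term: str) -> bool:
    return term.startswith("?")


def server_match(pattern: Pattern) -> List[Triple]:
    """The only operation the 'server' offers: match one triple pattern."""
    return [
        triple
        for triple in DATA
        if all(is_var(p) or p == t for p, t in zip(pattern, triple))
    ]


def substitute(pattern: Pattern, binding: Binding) -> Pattern:
    """Replace variables that are already bound by their values."""
    s, p, o = (binding.get(term, term) for term in pattern)
    return (s, p, o)


def extend(binding: Binding, pattern: Pattern, triple: Triple) -> Optional[Binding]:
    """Add the variable values from a matching triple, or return None on conflict."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p) and new.setdefault(p, t) != t:
            return None
    return new


def client_evaluate(patterns: List[Pattern]) -> List[Binding]:
    """Client-side join: one server request per pattern per intermediate binding."""
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        next_solutions: List[Binding] = []
        for binding in solutions:
            for triple in server_match(substitute(pattern, binding)):
                extended = extend(binding, pattern, triple)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


if __name__ == "__main__":
    # Whom does ?x know, and where does that person work?
    print(client_evaluate([("?x", "knows", "?y"), ("?y", "worksAt", "?org")]))
```

Even in this toy version the trade-off is visible: the server does very little work per request, but the client has to issue one request for every intermediate binding, so the join costs more requests and more network traffic than it would at a SPARQL endpoint.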

While the experimental evaluations in the various LDF-related research papers have provided us with a solid basic understanding of the existing proposals and their respective trade-offs, I strongly believe there is much more interesting work to be done regarding LDFs.

However, you know what I always thought would be great to have in this context? Since the beginning of the LDF work, I have been looking for a way to achieve a more fundamental understanding of possible LDF interfaces, including interfaces that have not yet been implemented! In particular, I was after a formal framework that allows us to organize LDF interfaces into some kind of lattice, or perhaps multiple lattices, based on the fundamental trade-offs that the interfaces entail. Such lattices would not only provide us with a more complete picture of how different interfaces compare to each other, they would also be a basis for making more informed decisions about whether it is worth spending the time to implement a possible interface and study it experimentally.

As you have likely guessed by now, such a formal framework is not just an idea anymore. Together with Jorge Pérez and Ian Letter at the Universidad de Chile, I have developed an abstract machine model, and we have shown that it is a suitable foundation for the type of formal framework described above. From a computer science point of view, the most exciting part of this work is that our abstract machine model provides a basis for defining new complexity measures that capture many more aspects of computation in a client-server setting than the classical measure of computational complexity does. We will present this work next week at the 16th International Semantic Web Conference (ISWC). If you are interested in reading about our machine model and how we applied it to study various existing types of LDF interfaces, refer to our research paper about it (and, yes, we have actual lattices in that paper 😉).

I have started a blog that I will use to write about events, ideas, and results related to my research. For this first post, I am copying below the report on a Dagstuhl seminar that I recently wrote for the blog of our Semantic Web research group at LiU. –Olaf

During the last week of June, I co-organized a Dagstuhl seminar on Federated Semantic Data Management together with Maria-Esther Vidal and Johann-Christoph Freytag. It was a very intense week with a packed schedule and almost no time to catch our breath (exactly how a Dagstuhl seminar should be, I guess 😉).

To start with, we had scheduled a few short, survey-style talks on a number of topics related to the seminar. In particular, these talks covered:

Graph data models and graph databases, by me (slides),
RDF and semantics, by Claudio Gutierrez (slides),
Policies and access control, by Sabrina Kirrane and Piero Andrea Bonatti,
Database privacy, by Johann-Christoph Freytag (slides),
Distributed database systems, by Katja Hose (slides), and
Federated query processing, by Maria-Esther Vidal (slides).

While these talks were meant to establish a common understanding of key concepts and terminology, the major focus of the seminar was on discussions and working groups. To this end, we had invited a good mix of participants from the Semantic Web field, from Databases, and from application areas. Due to this mix, we ended up, on several occasions and in different constellations, discussing and reflecting in depth on the fundamental assumptions and the core ideas of federated semantic data management. These general discussions and reflections kept re-emerging not only during the sessions, but also during the meals, the coffee breaks, and the evenings in Dagstuhl’s wine cellar. In my opinion, clearly articulating and repeatedly arguing about these assumptions and ideas was a discussion that the community had long needed to have. After this week, I would guess that many of the participants have a much clearer understanding of what federated semantic data management can and should be, and I am certain that this understanding will be reflected in the reports that the working groups are preparing.

Speaking of working groups, the seminar was structured around four topics, each addressed by a separate working group. The groups came together occasionally to report on their progress and obtain feedback from the other groups. The topics were:

RDF and graph data models
Federated query processing
Access control and privacy
Use cases and applications

Each of the working groups is currently preparing a summary of their discussions and results. These summaries will become part of our Dagstuhl report (to be published some time in August if all goes well). In addition to this report, we are planning to document the discussions and the results of the seminar in a collection of more detailed publications.

What’s next? We have some ideas to keep the momentum going and to advance the discussions around the seminar topics in a more continuous community process. Stay tuned.


A Foundation for Comparing Linked Data Fragment Interfaces

Dagstuhl seminar on Federated Semantic Data Management