This workshop targets a fundamental problem in the Semantic Web, specifically the preservation of linked datasets. It is of particular relevance to ESWC since it raises awareness of how linked datasets should be used to achieve their full potential. Fostering active usage of such datasets requires addressing challenges such as synchronisation and appraisal.

Apart from researchers and practitioners, the target audience comprises data publishers and consumers. Publishers will benefit from attending this workshop by learning about further ways in which they can maximise the use of their published data. Consumers will benefit by being able to discuss their expectations of good, appropriately maintained linked datasets with the workshop participants. We foresee that a number of participants from industry will attend this workshop, including the six industry partners participating in the DIACHRON EU FP7 project.

Motivation

There is a vast and rapidly increasing quantity of scientific, corporate, government and crowd-sourced data published on the emerging Data Web. Open Data are expected to act as a catalyst for the way structured information is exploited at large scale. This offers great potential for building innovative products and services that create new value from already collected data. It is expected to foster active citizenship (e.g., around the topics of journalism, greenhouse gas emissions, food supply chains and smart mobility) and world-wide research according to the "fourth paradigm" of science. The most noteworthy advantage of the Data Web is that facts, rather than documents, are recorded; these facts become the basis for discovering new knowledge that is not contained in any individual source and for solving problems that were not originally anticipated. In particular, Open Data published according to the Linked Data paradigm are essentially transforming the Web into a vibrant information ecosystem.

Published datasets are openly available on the Web. The traditional view of digital preservation, in which data are pickled and locked away for future use like groceries, conflicts with their continual evolution. A number of approaches and frameworks, such as the LOD2 stack, manage the full life-cycle of the Data Web. More specifically, these approaches are expected to tackle major issues such as the synchronisation problem (how can we monitor changes), the curation problem (how can data imperfections be repaired), the appraisal problem (how can we assess the quality of a dataset), the citation problem (how can we cite a particular version of a linked dataset), the archiving problem (how can we retrieve the most recent or a particular version of a dataset), and the sustainability problem (how can we spread the preservation effort while ensuring long-term access).

Preserving linked open datasets poses a number of challenges, mainly related to the nature of the LOD principles and the RDF data model. In LOD, datasets representing real-world entities are structured; when managing and representing facts, we therefore need to take into consideration any constraints that may hold. Since resources might be interlinked, effective citation measures must be in place to enable, for example, the ranking of datasets according to their measured quality. Another challenge is to determine the consequences that changes to one LOD dataset may have for other datasets linked to it. Furthermore, the distributed nature of LOD datasets makes archiving difficult.

Goals

The first DIACHRON workshop aims to address the above-mentioned challenges and issues by providing a forum for researchers and practitioners who apply linked data technologies to discuss, exchange and disseminate their work. More broadly, this forum will enable communities interested in data, knowledge and ontology dynamics to network and cross-fertilise. The workshop will also serve as a platform for disseminating results of the DIACHRON EU FP7 project (managing the evolution and preservation of the Data Web).

Workshop Format

We are planning a full-day workshop. The morning session will consist of short presentations of accepted submissions. We will start the afternoon with an invited talk by François Bancilhon on how DIACHRON and similar services help enterprises such as Data Publica, and continue with a hands-on session focusing on how to solve practical data evolution and preservation problems using the services of the DIACHRON Platform and other tools that have been accepted in the review process.