The volume of scientific, corporate, government and crowd-sourced data published on the World Wide Web is enormous and is increasing rapidly. Open Data (OD) are considered one of the key factors that will reshape the way structured information is used at large scale, and they form a domain in which innovative services and products will be developed that create new value from existing data sources.
The impact of the data-intensive economic sector is predicted to exceed that of the contemporary software industry. Nevertheless, little attention has so far been paid to the long-term accessibility and usability of the rapidly increasing volume of data on which this emerging sector is based.
The Web has not only caused a revolution in communication; it has also completely changed the way we gather and use data. Open data (data that is available to everyone) is growing exponentially, and it has transformed the way we conduct research and scholarship of every kind; it has changed the scientific method. The recent development of Linked Open Data has further increased the possibilities for exploiting public data.
Given the value of open data, how do we preserve it for future use? Currently, much of the data we use (e.g. demographic records, clinical statistics, personal and enterprise data, and many scientific measurements) cannot be reproduced.
There is, however, overwhelming evidence that we should keep such data wherever it is technically and economically feasible to do so. Until now, this problem has been approached by keeping the information in fixed data sets and by extending the standard methods of disseminating and archiving traditional (paper) artifacts. Given the complexity, the interlinking and the dynamic nature of current data, especially Linked Open Data, radically new methods are needed.
DIACHRON tackles this problem with a fundamental assumption: that the processes of publishing and preserving data are one and the same. Data are archived at the point of creation, and archiving and dissemination are synonymous.
DIACHRON takes on the challenges of evolution, archiving, provenance, annotation, citation, and data quality in the context of Linked Open Data and modern database systems. DIACHRON intends to automate the collection of metadata, provenance and all forms of contextual information so that data are accessible and usable at the point of creation and remain so indefinitely.
The results of DIACHRON are evaluated in three large-scale use cases: open governmental data life cycles, large enterprise data intranets and scientific data ecosystems in the life sciences.