To the Editor

A recent Commentary1 in Nature Geoscience points out a gap between the principles and actions of data sharing in geoscience, and urges professional support for archiving data as they are created. We agree with these points, but would add that attention should also be paid to the interoperability of data. Data should be discoverable, accessible, decodable, understandable and usable, and data sharing should be legal and ethical for all participants2. Without interoperability in this sense, local geoscience data archives run the risk of becoming separate islands in a data deluge3.

Tackling issues in Earth and environmental sciences (such as climate change, earthquakes, flooding and pollution) on regional and global levels often requires geoscience data to be collected and integrated from diverse sources. Interoperability of these data is essential. It is now possible to download thematic spatial data from any number of websites in seconds. However, it can be difficult to understand these data on schematic and semantic levels, to align them with each other, or even to use them appropriately. Rich data assets, such as those held by geological surveys all over the world, are not easy for those outside the originating institution or nation to understand or use4.

Two projects, OneGeology5 and OneGeology-Europe4, address these difficulties. They provide examples of schematic and semantic interoperability, and suggest a strategy of collaborative modelling and encoding to enable interoperability of geoscience data on the Internet6. A common data model, GeoSciML, is used to mediate the distributed geological maps and information made available by participating nations.

Various data users need to be able to use diverse languages and different definitions in local geoscience data archives. Interoperability does not mean that all data should be mediated or standardized. However, it is important that data archives are accompanied by detailed documentation that clarifies data provenance, the data model, the vocabularies used, and so on. Such documentation will facilitate interoperability among heterogeneous data sources.