Data are at the heart of scientific research. Therefore, all data and metadata should be stored — forever, and accessibly. But it would be naïve to think that such a 'gold standard' of preservation could be achieved. In one spectacular example of the failure of science to save its treasures, some of NASA's early satellite data were erased from the high-resolution master tapes in the 1980s (Science 327, 1322–1323; 2010). The lost data could now help extend truly global climate observations back to the 1960s — had they not been taped over.

At the time, the storage capacity of the tapes seemed more valuable than the data they contained. The story involved the preservation of analogue tapes with whale oil and the need for tape players the size of a large fridge. It vividly illustrates just how much the technology of data storage has changed since the 1960s.

But even in the past 20 years, standards of data documentation and preservation have been revolutionized. This shift has been largely ignored in the attacks on Phil Jones, of the Climatic Research Unit at the University of East Anglia, over the loss of metadata on the locations of the Chinese stations used in his 1990 study.

When Jones and colleagues assessed what influence the 'urban heat island' effect had on the global warming signal (Nature 347, 169–172; 1990), Nature was publishing hardly any colour figures and content was not fully available online. More importantly, the option of adding supplementary information to a paper — in hardcopy — had only just been introduced as “a scheme for assisting with the publication of data that would otherwise be buried in people's desk drawers” (Nature 346, 215; 1990). At the time, the long-term vision of Nature was clear, but distant: “Eventually, of course, supplementary information will be distributed electronically, through an electronic database. But that is light-years away.”

Until the introduction of full-scale supplementary information, responsibility for keeping accessible records lay with the authors. Of course, the loss of important information, such as the exact station locations used in the Jones et al. paper, is unacceptable (as Phil Jones himself put it; Nature 463, 860; 2010) from a scientific point of view. But such loss is hardly surprising, and it is probably widespread: scientists are not well placed to guarantee continuity of data storage, especially while they are still in their vagabond years of PhD and post-doc work.

Nature Geoscience requires that authors make their data available on publication. The easiest way of ensuring that all the relevant information is accessible, and will remain so in the long term, is to use professionally run databases, which are now available for all sorts of Earth science data.

The creative push in science will always be for the production of better-resolved, more complicated data sets. Ingenious ways of storing and releasing these data are invariably developed with considerable lag. But this is not an excuse to neglect the issue. The preservation of valuable data sets and their distribution on demand is of utmost importance for the progress of science. The continuous attention of dedicated professionals — and substantial funds — is needed for database development to keep up with the science.