Handling (mis?)appropriated data

Introducing a policy to ensure due credit for unpublished data.

The practice of posting unpublished data in publicly accessible databases is widespread in the genome sequencing community. Given the years it can take to produce a finished sequence, such openness makes a significant difference to the rate at which science and its applications can develop. But there is a downside. Researchers who post data in this way may lose the opportunity to exploit them as others promptly seize the data and run with them. That is a (sometimes reluctantly) accepted consequence of openness. What is much more controversial is a refusal by the appropriators of posted data to give credit to the originators of those data.

Previously (see Nature 405, 719; 2000), we stated some elementary principles in our approach to this issue. Briefly, such posting of unpublished data does not count as prior publication, but neither is it protected from appropriation and publication by others in any way, unless a licensing agreement is explicitly required. The latter approach has been adopted, for example, by The Institute for Genomic Research (TIGR) in Rockville, Maryland, whose licensing agreement requires users to agree not to use TIGR's unpublished data for global genomic analysis before their publication of a complete genome paper.

We also urged more from the community by way of sensitivity to the interests of originators. Some sequencers have urged that Nature simply refuse to consider papers where inadequate credit is given. And it has recently been suggested that appropriators who do not obtain written consent from originators before making use of these data in publications are by definition misappropriating the data and committing fraud (see Hyman, R. W. Science 291, 827; 2001).

While we agree that seeking consent is important, we do not accept that using data if consent is refused necessarily constitutes fraud, as consent can sometimes be withheld for questionable reasons. However, there is also a need to ensure that appropriated data are being used in full awareness of their technical limitations, given that they are sometimes preliminary or may be subject to qualifications that only their originators are fully aware of. And we do believe that practical steps can be taken to ensure that credit is given where, as far as it is possible for us to judge, we believe it is due.

Accordingly, we have decided to adopt the following practice. Appropriation of uncredited data will not prevent us from sending a paper out for prompt review. But we will require written assurance that authors are not violating any originators' data-licensing agreement. We will encourage our referees to be alert to the use of appropriated unpublished data from databases. Where there are concerns over credit, we will usually seek advice from an originator of the data in addition to the usual refereeing process. We would not be giving originators a veto: where disagreements arise, we will use our judgement, having consulted referees over technical considerations if necessary, and will usually insist on an acknowledgement as a condition of publication. As with all policies, we shall keep this under review, and welcome the opinions of readers.

Handling (mis?)appropriated data. Nature 409, 649 (2001).

