CORRESPONDENCE

Sequence data: expand comprehensive access

Scientific data must not be ‘balkanized’ into multiple databases, each with its own rules and restrictions.

Almost 40 years ago, GenBank and the EMBL databank started independently. They soon joined forces and, with the DNA Data Bank of Japan, formed a repository now called the International Nucleotide Sequence Database Collaboration (INSDC). China is now set to join. The INSDC has been one of the world’s most successful initiatives to collect and share scientific data (see Nature 590, 183–184; 2021). As DNA sequence data accumulate at ever-greater rates, the need for the INSDC to continue and expand has never been more urgent.

The COVID-19 pandemic is an excellent example of data sharing leading to effective science (see Nature 590, 195–196; 2021). The first sequence of the SARS-CoV-2 virus was shared by Yong-Zhen Zhang on 11 January 2020 and was released completely openly that same day in the INSDC databases (accession #MN908947). This enabled the development of rapid PCR-based tests for the viral RNA and jump-started vaccine development.

As international advisers to the INSDC, we call on the scientific community to help ensure that this openness and sharing grows to include many more types of data, so that scientists can use the INSDC to catalyse ever more biological discoveries.

Nature 591, 202 (2021)

A full list of co-signatories to this letter appears in Supplementary Information.

Supplementary Information

  1. List of co-signatories
