Hundreds of scientists are urging that SARS-CoV-2 genome data should be shared more openly. Plus, the long road to long-read assembly, and an algorithm that creates tough new maths problems for humans to solve.

A visualization of 56 SARS-CoV-2 genomes. Credit: Martin Krzywinski/SPL

Scientists call for fully open sharing of coronavirus genome data

Hundreds of scientists are urging that SARS-CoV-2 genome data be shared more openly to help researchers analyse how viral variants are spreading around the world. The most popular data-sharing platform, called GISAID, now hosts more than 450,000 SARS-CoV-2 genomes. But it doesn’t allow sequences to be reshared publicly. In an open letter, researchers have called on their colleagues to post their genome data in one of a triad of databases that don’t place any restrictions on data redistribution: the US GenBank, the European Bioinformatics Institute’s European Nucleotide Archive and the DNA Data Bank of Japan, which are collectively known as the International Nucleotide Sequence Database Collaboration.

India pledges billions to virus research

Virology research and biosafety are some of the winners in India’s latest budget. A 5.6-billion Indian rupee (US$77 million) funding hike for the country’s Department of Health Research includes cash for 4 new virology institutions, 9 new laboratories for studying highly infectious pathogens and a national institution to coordinate research and surveillance on animal and human infections. Currently, India has just one national institute that specializes in virology, a bottleneck that in the past has caused delays in confirming positive cases of SARS-CoV-2. Scientists say the investments will improve the country’s response to future outbreaks.

Maths bot calculates constants

An artificial intelligence, named after dream-inspired mathematician Srinivasa Ramanujan, can come up with new formulae to calculate digits of constants, such as the never-ending π. The Ramanujan Machine generates conjectured formulae on the basis of existing calculation methods, which humans then attempt to prove or disprove. Some of these conjectures have so far stumped mathematicians (something that excites them). Other formulae have helped to prove that Catalan’s constant, a useful number in several areas of mathematics, is harder to approximate using fractions than has been previously demonstrated. Automatically creating conjectures could point mathematicians towards connections between branches of maths that people didn’t know existed, says Ido Kaminer, who leads the project.
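
As a rough illustration of the kind of numerical check that can come before any human proof, the Python sketch below evaluates a candidate formula exactly and compares it with a known constant. It assumes the candidate takes the form of a continued fraction. The formula used is a classical continued fraction for e, chosen purely for illustration and not taken from the Ramanujan Machine's output, and the helper names `continued_fraction`, `a` and `b` are hypothetical.

```python
import math
from fractions import Fraction


def continued_fraction(a, b, depth):
    """Evaluate a(0) + b(1)/(a(1) + b(2)/(a(2) + ...)), truncated at `depth`.

    Works from the innermost term outwards using exact rational arithmetic.
    """
    value = Fraction(a(depth))
    for n in range(depth, 0, -1):
        value = a(n - 1) + Fraction(b(n)) / value
    return value


# Illustrative candidate (a classical identity, assumed here as the "conjecture"):
# e = 2 + 1/(1 + 1/(2 + 2/(3 + 3/(4 + ...))))
def a(n):
    return 2 if n == 0 else n


def b(n):
    return 1 if n == 1 else n - 1


approx = continued_fraction(a, b, depth=20)
error = abs(float(approx) - math.e)
print(f"approximation: {float(approx):.15f}")
print(f"difference from math.e: {error:.2e}")
```

Agreement to many digits is only evidence, not proof: a match like this suggests a formula is worth a mathematician's attention, and the proof or disproof still has to be done by hand.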

Features & opinion

You complete me

Now that genomic data generation and analysis are faster, cheaper and more accurate, researchers are looking to close the gaps in genome sequences. Researchers in the Telomere-to-Telomere consortium, funded by the US National Institutes of Health, have taken on the human genome’s gnarly bits to build a completely contiguous reference. Closing gaps is sometimes seen as “a sport for nerds”, says computational biologist Pavel Pevzner. But the resulting sequences can increasingly reflect human variation and diversity on a global scale.

How to learn to love the command line

Earlier this week, I recommended Nature’s feature on five ways the text interface can ease your computational research — plus the pitfalls to avoid. Now, we want to know how you use the command line in your work — please take the poll about a quarter of the way into the feature.

Quote of the day

“Science is cumulative. It builds steadily toward progress, and that’s been my answer to despair during this last year. I can look back over my life and see a degree of advancement that’s staggering.”

Leading vaccinologist Stanley Plotkin is inspired by the accomplishments of science during the pandemic — though the 88-year-old has struggled to get a COVID vaccine himself. (The Washington Post | 6 min read)

Flora Graham, senior editor, Nature Briefing

With contributions by Elizabeth Gibney

