Kim, B.Y. et al. eLife https://doi.org/10.7554/eLife.66405 (2021)

Fruit flies of the Drosophila melanogaster variety have long been the dominant drosophilid used in basic research, but they are hardly the only species out there. There are over 1,600 species in the family, with many more likely undescribed. That’s a lot of potential genetic diversity to tap, and an effort led by principal investigators Dmitri Petrov at Stanford and Daniel Matute at UNC Chapel Hill is underway to sequence as many of those flies as possible – starting with 101 drosophilid genomes from across 93 species, published recently in eLife.

The technology to prepare those genomes has been improving fast. “We’re kind of at the point where we can just crank out genomes,” says Bernard Kim, a post-doc in Petrov’s lab who is interested in what high-quality genome assemblies can reveal about evolution. The current work was facilitated by recent advances in long-read Nanopore sequencing that have made the assembly process both cheaper and faster: Kim says they can get a draft together for a given fly in just a week or two, and samples can be run in parallel. The project also followed requests for samples from labs around the world that work with different flies. “We wanted to make it feel like a community effort,” he says. “We wanted to be sequencing species that people had an active interest in.”

From here, there are plans to sequence more flies – near-term, another 100 to 200 species. These will be prioritized in collaboration with the National Drosophila Species Stock Center at Cornell. Future sequences will also include – where and when possible – rare flies. “To truly capture the biodiversity in the group, you have to collect wild species or dip into people’s collections,” Kim says.

He and his colleagues hope to create a more integrated genomic resource, for example, by working with fly stock centers to upload the genomes of the flies they hold. “One of the primary things we’re trying to do is maintain this as a very public resource that’s open to anyone who wants to use it,” says Kim. In the meantime, the genomes completed so far are available in GenBank under NCBI BioProject PRJNA675888. There will also be updates along the way – Kim says they are currently re-doing all 101 of these genomes with updated software and genome tools to improve the assemblies.

He hopes the data will help expand the minds of the fly field beyond their main model. “As we sequence more and more, there are going to be instances to not just use melanogaster as a model species. As useful as it is, you’re still just looking at one species,” Kim says. “Having the opportunity to understand whether what you’re seeing is generalizable across multiple species, for example, is going to provide powerful insights.”