Paez-Espino, D. et al. Nature 536, 425–430 (2016).

More than 2,000 viral genomes have been sequenced, but the vast majority of viruses remain unknown. Paez-Espino et al. provide a glimpse of the viral universe beyond the small snapshot of previously known viruses. The researchers mined thousands of publicly available metagenomic data sets acquired by untargeted sequencing of samples from habitats all over the world. Using their analysis pipeline, they expanded the number of known viral groups or singletons by two orders of magnitude. For many of the newly found viruses, they were also able to assign host species, for example, by matching viral sequences to CRISPR spacers, the short fragments of viral DNA that hosts retain from earlier CRISPR–Cas-based immune responses. The newly identified viral genomes promise to be a useful resource for studies of viral ecology as well as a source of previously unknown gene families.