Current knowledge of RNA virus biodiversity is both biased and fragmentary, reflecting a focus on culturable or disease-causing agents. Here we profile the transcriptomes of over 220 invertebrate species sampled across nine animal phyla and report the discovery of 1,445 RNA viruses, including some that are sufficiently divergent to comprise new families. The identified viruses fill major gaps in the RNA virus phylogeny and reveal an evolutionary history that is characterized by both host switching and co-divergence. The invertebrate virome also reveals remarkable genomic flexibility that includes frequent recombination, lateral gene transfer among viruses and hosts, gene gain and loss, and complex genomic rearrangements. Together, these data present a view of the RNA virosphere that is more phylogenetically and genomically diverse than that depicted in current classification schemes and provide a more solid foundation for studies in virus ecology and evolution.
This study was supported by the National Natural Science Foundation of China (Grants 81290343, 81273014, 81672057), the Special National Project on Research and Development of Key Biosafety Technologies (Grants 2016YFC1201900, 2016YFC1200101), the 12th Five-Year Major National Science and Technology Projects of China (2014ZX10004001-005), and an NHMRC Australia Fellowship (GNT1037231).
The authors declare no competing financial interests.
Reviewer Information Nature thanks E. Ghedin, D. Obbard and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Extended data figures and tables
Extended Data Figure 1 The contribution of major viral clades to the total virome of each host phylum/order.
a, b, These analyses are based on viruses at all frequency levels (a), and viruses in which the frequency exceeds 0.1% of the total number of non-rRNA reads (b).
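The 0.1% cut-off used for panel b can be read as a simple abundance filter. The following is a minimal sketch only: the function name, the virus labels and all read counts are hypothetical illustrations and are not taken from the study.

```python
def filter_by_frequency(virus_reads, total_non_rrna_reads, threshold=0.001):
    """Keep viruses whose share of non-rRNA reads exceeds `threshold`.

    virus_reads: dict mapping a virus name to its mapped read count.
    total_non_rrna_reads: total number of non-rRNA reads in the library.
    threshold: fraction of reads; 0.001 corresponds to 0.1%.
    """
    if total_non_rrna_reads <= 0:
        raise ValueError("total_non_rrna_reads must be positive")
    frequencies = {
        name: count / total_non_rrna_reads
        for name, count in virus_reads.items()
    }
    # Strict inequality: the caption says the frequency "exceeds" 0.1%.
    return {name: f for name, f in frequencies.items() if f > threshold}


# Hypothetical library (made-up numbers, for illustration only).
example_counts = {"virus_A": 5000, "virus_B": 40, "virus_C": 1200}
example_total = 1_000_000
abundant = filter_by_frequency(example_counts, example_total)
```

Here `abundant` keeps only the entries whose read fraction is above the threshold, which mirrors the difference between the two panels.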
a, Match between the phylogenies of the RdRp and coat proteins (S-domain like) for non-segmented members of the Tombus–Noda clade. The relationship between the two phylogenies is displayed to maximize topological congruence. b, The degree of phylogenetic incongruence for different pairs of structural and non-structural phylogenies. The comparisons were based on patristic distance matrices derived from the phylogenies.
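A patristic distance is the sum of branch lengths along the path joining two tips of a tree, so two phylogenies over the same taxa can be compared through their pairwise distance matrices. The sketch below illustrates that idea on toy trees. It is an assumption-laden illustration: the tree encoding, the function names and the use of a Pearson correlation as the congruence score are choices made here, not the statistic reported by the authors.

```python
from itertools import combinations
from math import sqrt

# A tree is encoded (hypothetically) as {node: (parent, branch_length)};
# the root maps to (None, 0.0).


def path_to_root(tree, node):
    """Map each ancestor of `node` (and `node` itself) to its distance from `node`."""
    distances = {node: 0.0}
    total = 0.0
    while tree[node][0] is not None:
        parent, length = tree[node]
        total += length
        distances[parent] = total
        node = parent
    return distances


def patristic_distance(tree, a, b):
    """Sum of branch lengths on the path between tips `a` and `b`."""
    up_a = path_to_root(tree, a)
    up_b = path_to_root(tree, b)
    # Among shared ancestors, the most recent one gives the shortest path.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def patristic_vector(tree, tips):
    """Upper triangle of the patristic distance matrix, in a fixed tip order."""
    return [patristic_distance(tree, a, b) for a, b in combinations(sorted(tips), 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def congruence(tree_a, tree_b, tips):
    """Correlation between the patristic distances of two trees (1.0 = identical)."""
    return pearson(patristic_vector(tree_a, tips), patristic_vector(tree_b, tips))


# Toy trees over the same four tips, with unit branch lengths.
tips = ["A", "B", "C", "D"]
tree_1 = {  # ((A,B),(C,D))
    "root": (None, 0.0), "n1": ("root", 1.0), "n2": ("root", 1.0),
    "A": ("n1", 1.0), "B": ("n1", 1.0), "C": ("n2", 1.0), "D": ("n2", 1.0),
}
tree_2 = {  # ((A,C),(B,D))
    "root": (None, 0.0), "n1": ("root", 1.0), "n2": ("root", 1.0),
    "A": ("n1", 1.0), "C": ("n1", 1.0), "B": ("n2", 1.0), "D": ("n2", 1.0),
}
score = congruence(tree_1, tree_2, tips)
```

A lower score indicates that the two trees place the same taxa at more discordant distances, which is one simple way to quantify the incongruence shown for structural versus non-structural proteins.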
a, The parallel acquisition of multiple copies of structural proteins by viruses within the Hepe–Virga clade. Left panel shows an outline of the structural part of their genomes, with homologous structural genes marked in yellow and multiple copies of these proteins within the same genome labelled as ‘I’, ‘II’, and ‘III’. Right panel shows a maximum-likelihood phylogeny depicting the evolutionary history of the corresponding structural proteins of these viruses. b, Acquisition of a glycoprotein in the genome of Hubei Lepidoptera virus 2 from the Mono–Chu clade. Its genome is compared against that of a closely related virus (Hubei dimarhabdovirus-like virus 2). Homologous proteins are connected with dotted lines, and the target glycoprotein is shown in red. c, Three examples of glycoprotein loss in the Mono–Chu clade. Homologous proteins are connected with dotted lines, and the target glycoproteins are shown in blue.
a, Evolutionary origin of two exoribonucleases (cd06133) in two sea-slater-associated viruses (Beihai hepe-like virus 2 and Beihai sea slater virus 4). Top, alignment of viral and (human) cellular exoribonucleases. The solid triangles indicate the key catalytic sites. Lower left panel shows the phylogenetic positions of the two viruses (marked with solid red circles) whose genomes contain these exoribonucleases. The host information for each virus is shown in parentheses. Lower right panel shows the phylogenetic position of the virus exoribonucleases (solid red circle) in the context of cellular exoribonucleases. b, Evolutionary origin of viral serine proteases (cd00190). The phylogeny contains serine proteases from RNA viruses (solid red circles), DNA viruses (solid blue circles) and cellular organisms. Serine proteases from RNA viruses are either highly divergent or group within the diversity of cellular proteins. c, Relative positions of different protein domains in the replicase of selected Hepe–Virga viruses. The domains are shown as ovals and marked with different colours, and comprise: RdRp (cd01699), Helicase (pfam01443), FstJ (pfam01728), OTU (OTU-like cysteine protease, pfam02338), Macro (cl00019), NADAR (cd15457), and viral methyltransferase (pfam01660). More detailed depictions of lateral gene transfer can be found in Supplementary Data 22–36.
This file contains Supplementary Data 1-36, phylogenies and genome structures of each major virus clade. The phylogenies (SI data 1-21) contain detailed information on evolutionary relationships, the name of the viruses, the frequency of viral RNA, and the presence and location of endogenous virus elements (EVEs). The genome structures (SI data 22-36) contain information on the genome organization and the structural domains of representative viruses. (PDF 1870 kb)
This table contains the detailed information of each pool/library. (PDF 217 kb)
This table contains the detailed information on each virus discovered in this study. (XLSX 231 kb)
Cite this article
Shi, M., Lin, XD., Tian, JH. et al. Redefining the invertebrate RNA virosphere. Nature 540, 539–543 (2016). https://doi.org/10.1038/nature20167