The great virus comeback

Enumeration of viral particles in environmental samples by fluorescence microscopy and transmission electron microscopy has suggested that viruses represent the most abundant biological entities on our planet. In addition, metagenomic analyses focusing on viruses (viromes) have shown that viral genomes are a large reservoir of novel genetic diversity (Kristensen et al., 2010; Mokili et al., 2012). These observations have convinced most microbiologists that viruses, ‘the dark matter of the biosphere’, have a major role in structuring cellular populations and controlling geochemical cycles (Rohwer and Youle, 2012). Environmental microbiologists should therefore be interested in recent debates about the nature of viruses (Forterre, 2010). These debates have been triggered by the discovery of giant viruses, such as mimiviruses, and also by new proposals on the origin, nature and evolution of viruses. They have implications for major questions raised by recent discoveries in microbial ecology, such as: are we really counting viruses when we enumerate virions? Why do most genes in viromes have no homologues in cellular genomes? How can viral genes be distinguished from cellular genes in metagenomic analyses? Understanding the new concepts that have been proposed to explain the nature of viruses might also have practical implications for developing promising new experimental approaches.

From virions to virocells

Historically, viruses have been equated with their virions, which are the biological objects that pass through Chamberland filters and can sometimes be crystallized, as shown with Tobacco mosaic virus (Forterre, 2012). Jacob and Wollman wrote in their famous 1961 review ‘Viruses and Genes’, ‘viruses may exist in three states: the extracellular infectious state, the vegetative state of autonomous replication and finally the proviral state’. However, a few paragraphs later, they define the virus as: ‘a genetic element enclosed in a protein coat’. This conflation of viruses with virions explains why viral ecologists consider that counting viral particles is equivalent to counting viruses. However, this might not be the case. Fluorescent dots observed in stained environmental samples are not always infectious viral particles but can instead represent inactivated virions, gene transfer agents (that is, fragments of cellular genome packaged in Caudovirales capsids) or membrane vesicles containing DNA (Soler et al., 2008). Furthermore, viral particles reveal their viral nature only if they encounter a host. The living form of the virus is the metabolically active ‘vegetative state of autonomous replication’, that is, its intracellular form. I have recently introduced a new concept, the virocell, to emphasize this point (Forterre, 2011, 2012). Viral infection indeed transforms the cell (a bacterium, an archaeon or a eukaryote) into a virocell, whose function is no longer to produce two cells but to produce virions to propagate viral genes. According to this nomenclature, regular cells (archaea, bacteria or eukarya) are called ribocells to fit with the definition of viruses and cells proposed by Didier Raoult and myself a few years ago (Raoult and Forterre, 2008). In that paper, we defined viruses and cells by their hallmark features, as capsid- and ribosome-encoding organisms, respectively. Ribocells can harbour a cryptic silent viral genome (a virocell waiting to be awoken by some stressful condition). I coined the term ribovirocell to name a cell that can continue to divide while producing virions (Forterre, 2012).

The virocell (or ribovirocell), being a cellular organism, corresponds to the ‘living form’ of the virus, whereas virions are in fact the equivalent of seeds or spores for multicellular organisms. Accordingly, counting virions to enumerate viruses is more or less equivalent to counting fish eggs to enumerate fish. In doing so, we would conclude that oceanic fish stocks are enormous and fishing regulation is not necessary! The virocell concept thus suggests that we are wrong when we claim that ‘viruses’ are ten times more abundant than cells in most environments. Virocells cannot be more abundant than visible cells, being a subpopulation of cells, that is, the infected ones. This does not reduce the ecological and evolutionary importance of viruses, as the proportion of infected cells can be very high (Suttle, 2007). It has been reported in some studies that up to 40% of bacteria present in bacterioplankton are infected by viruses. In such a situation, one should consider that only 60% of visible bacteria are bona fide bacteria, whereas the other 40% are virocells. It is of paramount importance to have a clear idea of this proportion (and its fluctuation) in various ecosystems (Figure 1). Indeed, the virocell concept reminds us that a large proportion of the cells within microbial communities are infected with viruses and, as a consequence, behave differently from non-infected ones.

Figure 1

(a) An environmental sample with different microbial species, one of them (grey) being infected by a virulent virus. One can observe an abundant production of virions even if the infected species corresponds to a minor fraction of the population. The sample contains many viral particles but a single virocell. (b) A sample infected by a virus that propagates in a carrier state in a species (grey) corresponding to the major fraction of the population with limited virion production. The sample contains many active virocells (grey) but few virions. These virocells can still divide and are better named ribovirocells (Forterre, 2012).
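The contrast between the two panels of Figure 1 can be put into numbers. The following Python sketch is only an illustration: the counts are invented for the example (they are not measurements from any study), and the names `SampleCounts`, `virion_to_cell_ratio` and `virocell_fraction` are hypothetical. It shows that the virion-to-cell ratio and the fraction of cells that are virocells are separate quantities that can point in opposite directions.

```python
from dataclasses import dataclass


@dataclass
class SampleCounts:
    """Invented counts for one environmental sample (illustrative only)."""
    cells: int           # all visible cells in the sample
    infected_cells: int  # cells currently infected, i.e. virocells or ribovirocells
    virions: int         # viral particles counted by microscopy


def virion_to_cell_ratio(sample: SampleCounts) -> float:
    """Number of virions per visible cell (what particle counts report)."""
    return sample.virions / sample.cells


def virocell_fraction(sample: SampleCounts) -> float:
    """Fraction of visible cells that are infected (what the virocell concept asks for)."""
    return sample.infected_cells / sample.cells


# Panel (a): a minor species is hit by a virulent virus; many virions, a single virocell.
virulent = SampleCounts(cells=1000, infected_cells=1, virions=10000)

# Panel (b): a dominant species carries the virus; many ribovirocells, few virions.
carrier = SampleCounts(cells=1000, infected_cells=400, virions=500)

for label, sample in (("virulent", virulent), ("carrier", carrier)):
    print(
        f"{label}: virions per cell = {virion_to_cell_ratio(sample):.1f}, "
        f"virocell fraction = {virocell_fraction(sample):.1%}"
    )
```

With these assumed numbers, the sample with the higher particle count is the one with the lower proportion of infected cells, which is the situation the fish-egg analogy warns about.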

Virocells, cradles of new genes

Another aim of the virocell concept is to emphasize the importance of viruses as the cradles of new genetic information. New genes are continuously created during replication or recombination of viral genomes in virocells by all molecular mechanisms known to generate new genes in cellular genomes (Forterre, 2011; Jalasvuori, 2012). This could seem trivial, especially to virologists who are well aware of viral creativity. However, this is not obvious for all evolutionary biologists. Many of them still consider that viral genomes are formed by the progressive accretion of genes captured from their hosts, viruses being considered as pickpockets of cellular genes (Moreira and Lopez-Garcia, 2009). In the past, molecular biologists have indeed greatly benefited from the fact that viruses can sometimes capture cellular genes and transfer them to recipient cells (transduction). As a consequence, viruses are often considered merely as passive vehicles of cellular genes, and all viral genes are then supposed to derive in fine from bacterial, archaeal or eukaryotic genomes. However, this is not supported by genomic data, since most genes in viral genomes have no cellular homologues and only a small percentage can be traced to cellular ancestors. In fact, viral integration in cellular genomes is probably more frequent than the reverse process. Whereas genes with closely related cellular homologues are rare in viral genomes, integrated viruses and related elements (see below) represent a large proportion of most archaeal and bacterial genomes (Cortez et al., 2009), and eukaryotic genomes frequently contain many more (retro)viral genes and retrotransposons than eukaryotic genes (Feschotte and Clement, 2012). One can therefore conclude that cells are giant pickpockets of viral genes.

As viral genomes greatly outnumber cellular genomes in the biosphere, the continuous creation of new genes in virocells well explains the huge amount of biological information specifically stored in viral genomes. Most viral genes in viromes are ORFans with no homologues in current databases or virus-specific genes (that is, genes with only viral homologues) (Mokili et al., 2012) simply because these genes originated in viral genomes and never integrated into cellular ones. Most of them have no detectable function because they encode proteins involved in specific host–virus interactions (see the virodome example below). If I am correct, one can predict that the number of viral ORFans in viromes will continue to increase with the sequencing of more viromes, even if more cellular genomes are sequenced in parallel.

Overlapping gene spaces in metagenomes

When analysing metagenomic data for the presence of viruses or cells, one should be aware of the underlying complexity of organismal relationships. For instance, viral genes should also be abundant in ‘cellulomes’, corresponding to genes encoded by giant viruses (those eliminated by the 0.2 μm filtration steps performed in preparing viromes), genes present in virocells and ribovirocells and, most importantly, viral genes integrated in cellular genomes. This should be taken into account in phylogenomic studies and metagenomic analyses. For instance, an environmental gene with a homologue present in a bacterial genome does not necessarily testify to the presence of a bacterium, because this homologue could sometimes correspond to a viral gene integrated into a bacterial genome. It would therefore be very important to distinguish gene spaces associated with either cellular domains or viral lineages in phylogenomic studies and metagenomic analyses. This is a challenging task that would require, as a first step, the exhaustive identification, via in-depth in silico analyses, of all viruses and related elements integrated into cellular genomes to produce complementary databases containing either cellular or viral proteins (for a preliminary work in that direction, see Cortez et al., 2009).
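The idea of assigning a hit to a viral or a cellular gene space can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an existing tool or the method of Cortez et al.: it assumes that integrated viral elements have already been mapped as coordinate intervals on each cellular genome, and every name, genome identifier and coordinate below is hypothetical.

```python
from typing import NamedTuple, Sequence


class Region(NamedTuple):
    """An integrated viral element annotated on a cellular genome (assumed input)."""
    genome: str
    start: int
    end: int  # inclusive coordinate


class Hit(NamedTuple):
    """A homologue of an environmental gene found in a cellular genome."""
    query: str
    genome: str
    start: int
    end: int  # inclusive coordinate


def gene_space(hit: Hit, integrated: Sequence[Region]) -> str:
    """Return 'viral' if the homologue lies entirely inside an integrated element."""
    for region in integrated:
        same_genome = region.genome == hit.genome
        if same_genome and region.start <= hit.start and hit.end <= region.end:
            return "viral"
    return "cellular"


# Hypothetical annotation: one provirus on a bacterial chromosome.
integrated_elements = [Region("bacterium_A", 120_000, 165_000)]

# Two hypothetical environmental genes, both with a homologue in 'bacterium_A'.
hits = [
    Hit("env_gene_1", "bacterium_A", 130_500, 131_700),  # inside the provirus
    Hit("env_gene_2", "bacterium_A", 400_000, 401_200),  # elsewhere on the chromosome
]

for hit in hits:
    print(hit.query, "->", gene_space(hit, integrated_elements))
```

Both genes have a homologue in the same bacterial genome, yet only the second would count as evidence for the bacterium itself. The classification is only as good as the annotation of integrated elements, which is why their exhaustive identification comes first.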

In my opinion, plasmids and other virus-related elements, such as transposons or pathogenicity islands, should be considered as part of the viral world at large. Similarly to viruses, these mobile elements use cells as vehicles (Jalasvuori, 2012). They are probably evolutionarily related to viruses, as indicated by a similar abundance of ORFans and the existence of homologous genes specific to plasmids, transposons and viruses, such as those encoding replication proteins, integrases and recombinases. This evolutionary connection is easy to understand, as a single mutational step can transform a viral genome into a plasmid, the only difference between plasmids and viruses being the presence in viral genomes of gene(s) encoding capsid protein(s) (Krupovic and Bamford, 2010). However, most biologists still consider plasmids as an extension of cellular genomes (extrachromosomal elements), whereas they are instead independent virus-related biological entities. For instance, our anthropocentric and cell-centric views of the world lead us to declare that conjugative pili are ‘bacterial penises’ connecting male and female bacteria, whereas they are actually ‘plasmid penises’ used by plasmids to propagate themselves in new species! It is now time to think twice about old nomenclatures and prejudices and to consider the living world objectively, beyond the historical traditions that shape our present limited vision.

Perspective

The virocell concept should encourage more scientists to go back to the bench and complement shotgun metagenomics and ecological studies with wet-lab studies focusing on the interaction between viruses and cells (not only the entry and exit steps of virions, but also the intracellular stage of the viral life cycle). Only bench work will make sense of the ecological data by progressively revealing the ‘unknown’ of viral information (Mokili et al., 2012). Besides architectural and replication proteins, viromes are full of genes encoding proteins whose function is probably to regulate the interactions between viruses and their hosts (victims). The recent discovery of amazing pyramids used by some archaeal viruses to egress from the cell illustrates this point (Prangishvili and Quax, 2011). A single protein is sufficient to produce this unique viral device. How many other amazing viral machineries (virodome, sensu Prangishvili and Quax) are hidden in the jungle of viral ORFans? Answering this question will require the identification and isolation in the coming years of many more viral and cellular systems from the three domains of life.