We welcome the comments on our Review article (Ten reasons to exclude viruses from the tree of life. Nature Rev. Microbiol. 7, 306–311 (2009))1 by Bayry and colleagues (Reasons to include viruses in the tree of life. Nature Rev. Microbiol. 29 June 2009 (doi: 10.1038/nrmicro2108-c1))2; Navas-Castillo (Six comments on the ten reasons for the demotion of viruses. Nature Rev. Microbiol. 29 June 2009 (doi: 10.1038/nrmicro2108-c2))3; Claverie and Ogata (Ten good reasons not to exclude giruses from the evolutionary picture. Nature Rev. Microbiol. 29 June 2009 (doi: 10.1038/nrmicro2108-c3))4; Ludmir and Enquist (Viral genomes are part of the phylogenetic tree of life. Nature Rev. Microbiol. 29 June 2009 (doi: 10.1038/nrmicro2108-c4))5; Koonin and colleagues (Compelling reasons why viruses are relevant for the origin of cells. Nature Rev. Microbiol. 29 June 2009 (doi: 10.1038/nrmicro2108-c5))6; and Raoult (There is no such thing as a tree of life (and of course viruses are out!) Nature Rev. Microbiol. 29 June 2009 (doi: 10.1038/nrmicro2108-c6))7.

They offer us an opportunity to clarify our ideas further and to dissipate some confusion owing to the mixing of very different concepts and levels of interpretation regarding the actual possibility of including viruses in a tree of life (TOL). We realize that much of this confusion comes from the fact that many virologists and other biologists are not familiar with the theory and practice of molecular phylogeny. Their reaction is therefore more sentimental than rational, as if denying the possibility of including viruses in the TOL would imply dismissing their importance in evolution. First, we emphasize that we agree that viruses have been, and still are, important for the evolution of cellular organisms. They are vehicles of horizontal gene transfer (HGT) between cells, contribute to the acceleration of gene evolutionary rates and exert fundamental selection pressures on their host populations, thereby generating and regulating biodiversity (for example, through 'kill-the-winner' strategies8). However, their importance in biological evolution does not necessarily authorize their inclusion in the TOL. Let us make clear the essential points of our opinion article in a more systematic way. We stated that viruses do not belong in the TOL based on two types of arguments: epistemological arguments, which refer to what 'should not be done', and methodological arguments, which refer to what 'cannot be done' and are more practically conclusive.

Our first argument relates to the definition of life and whether viruses are alive. It is epistemological in nature and deals with the way in which humans conceptualize their surrounding natural world. It is neither metaphysical, as Koonin et al.6 state, nor religious (even much less so) as unfoundedly claimed7. As biological scientists, we deal exclusively with empirical data from the material world, from which we extract information that allows hypotheses to be constructed and tested through the hypothetico–deductive methodology that is the common practice in our discipline. Any form of spirituality, including religion7 or intelligent design4, is outside the scope of our scientific activities. We disagree that defining life should be left to philosophers alone6, as it is biologists who actually study life. Defining life, like defining species, is a problematic issue owing to the difficulty of delimiting barriers to some kind of natural continuum. However, once a definition is established it should be applied logically. When some biologists say that viruses are alive, they are accepting some kind of definition. A logical syllogism would ensue: if viruses were alive they 'should be' placed in the TOL (epistemologically, an ideal construction in which all living entities would have a place) whereas if viruses were not alive they 'should not'. The problem is that current definitions of life based on self-replication and evolution do not accommodate viruses because viruses cannot self-replicate. This is not because of the occurrence or absence of ribosomes, as is often thought5,9, but because of the fact that viruses, unlike cells (including cellular parasites), are unable to transform energy and matter (that is, to actively generate order from disorder). In this sense, we agree with van Regenmortel10 that viruses, like genes or transposons, are biological elements but not living entities.
By contrast, a self-replicating molecule, for example a self-replicating ribozyme, would be alive according to the above definition. However, viruses are not self-replicating ribozymes or DNAzymes and consequently cannot be qualified as living. Nevertheless, one can avoid defining life or use a more inclusive definition to accommodate viruses. Let us accept for the benefit of this discussion that anything that can be replicated (that is, does not necessarily self-replicate) and evolves is alive. Viruses and, by the same token, languages, popular legends, cooking recipes and human technology, would be alive. We contend that it would still be impossible to place viruses in the TOL for purely methodological reasons (nine reasons out of the ten that we proposed1).

Regarding the methodological arguments to exclude viruses from the TOL, we must differentiate between two types of hypotheses and we must also explain what a TOL is in this context: a molecular phylogenetic tree based on universally conserved (core) genes that are supposed to reflect organismal evolution. Whether this molecular phylogenetic tree truly reflects the most important lines of organismal evolution is a matter of discussion but, in practice, it is helpful and is the basis of natural systematics (for an example, see the Tree of Life Web Project). Paradoxically, those who propose that a TOL does not exist because it is blurred by HGT use molecular phylogenies to show HGT11 and recognize the existence of archaea, primarily defined as an independent domain of life by their segregation in a conserved gene-based TOL.

The first type of hypotheses, sustained by Koonin et al.12 and others before them13, propose that viruses pre-date cells in biological evolution. Koonin et al.6,12 recognize that “viruses do not have universal genes” but claim that there is a core of ancient (precellular) virus hallmark genes that have persisted to date, dismissing convergence and HGT as explanations for their widespread distribution. We acknowledge that many viral genes are shared within viral families, but we still claim that dating them back to a common precellular origin is impossible because viruses are polyphyletic and because structural convergence and HGT are well-attested phenomena that can account for that distribution. Although simple structures are more prone to convergence, structures that are more complex can also be affected by it (for example, the presence of complex eyes in some unicellular dinoflagellates is an amazing case of convergence with metazoan eyes14). Koonin et al. maintain that a “virus-like stage of precellular evolution appears inevitable”. There is some ambiguity about what 'virus-like' implies (not all genetic elements are viruses; the distinction is important in this debate). If virus-like implies self-replicating elements, we might agree, but if it implies genetic elements that are dependent for replication on other systems (what viruses actually are), we disagree. It would seem more logical that life derives from those systems that were already able to replicate than from their molecular parasites (these might, however, foster the evolution of such self-replicating systems, just as viruses contribute to cell evolution today). Nonetheless, the problem is that this hypothesis cannot be tested. It cannot be proved or disproved by phylogenetic analysis with contemporary empirical data. 
Consequently, hypotheses proposing that viruses antedate cells, however unlikely they might be (it is difficult to imagine that parasites pre-date their hosts), cannot be falsified and remain valid. However, proponents of such models would agree with us that viruses cannot be placed in a phylogenetic TOL. This is methodologically impossible as they share no universally conserved genes with cells. If such genes ever existed, their sequences have evolved beyond homology recognition, losing all phylogenetic signal from putative pre-cellular times.

The second type of hypotheses on virus–cell relationships, including those of Raoult et al.15 and Claverie and Ogata4, claim that viruses (particularly large DNA viruses such as the Mimivirus) can indeed be placed in a TOL on the basis of phylogenetic analysis of their conserved cell-like genes (those that are used to construct the cellular, however imperfect, TOL by molecular phylogeny). On the basis of these analyses, viruses would define a fourth domain of life4,15. Ironically, Raoult, who was the first to propose that the “Mimivirus appears to define a new branch distinct from the three other domains” (Ref. 15), partly motivating our article1, now contradicts himself saying that there is no TOL7. Claverie and Ogata, on the contrary, maintain that position4 and present an additional tree of the clamp loader protein from Mimivirus (MIMI_R395) and from Ectocarpus siliculosus virus-1 (ESV-1) with their cellular homologues. The viruses appear at the base of eukaryotes, which is taken as “evidence of deep Mimivirus gene ancestry” (Ref. 4). This kind of assertion can be tested. Using proper phylogenetic analyses, the vast majority of genes shared by Mimivirus and cells can be shown to have been acquired by recent (by contrast to pre-cellular or pre-domain diversification) HGT from their cellular hosts (and associated bacteria) and not the other way around16. The apparent basal positions of Mimivirus genes are easily explained both by long-branch attraction artefacts owing to the higher evolutionary rate of these genes in viruses and by poor taxon sampling. The case of the clamp loader is illustrative, as Claverie and Ogata's tree includes a poor representation of taxa4. Homologues from the lineages to which Mimivirus and ESV-1 hosts belong (amoebae and stramenopiles) are excluded. A taxon-rich, detailed phylogenetic analysis shows a more complicated history for this gene, which has three paralogues in eukaryotes. 
Interestingly, Mimivirus also has three copies of the gene. However, instead of being at the base of the three eukaryotic branches (which would be expected if the Mimivirus genes were ancestral), each Mimivirus gene appears nested within its respective paralogue group close to amoebae genes (Fig. 1). Similarly, ESV-1 appears nested within the eukaryotic groups (the slowest evolving ESV-1 gene — paralogue 2 — even branches with a stramenopile sequence), far from any of the Mimivirus copies. This demonstrates that the viral clamp loader genes were recently, and independently, acquired by HGT from eukaryotic hosts, and highlights the importance of adequate taxonomic sampling17. By showing their tree of the clamp loader, Claverie and Ogata provide a further example of what 'cannot be done' from a molecular phylogenetic point of view: concluding that large DNA viruses form an independent domain in the TOL from a poor phylogenetic analysis using genes that can be shown to have been acquired by HGT from cells. This and previous analyses argue for massive cell-to-virus HGT1,16. Consequently, these hypotheses proposing that viruses define a fourth domain of life that can be placed in a TOL can be, and in our opinion have been, experimentally refuted.
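The distinction between a viral sequence branching at the base of a group and one nested within it can be made concrete with a minimal sketch. The code below is only an illustration under stated assumptions: the two toy topologies, the leaf names and the helper functions (`leaves`, `sister_leaves`, `classify`) are hypothetical and are not the trees or the software used for Fig. 1. It reads a rooted tree written as nested tuples and asks whether the sister clade of the query is the whole reference group (basal placement) or only part of it (nested placement).

```python
# Toy illustration only: topologies and leaf names are hypothetical,
# not the clamp loader data shown in Fig. 1.

def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def sister_leaves(tree, query):
    """Return the leaves of the clade(s) sister to `query`, or None if absent."""
    if isinstance(tree, str):
        return None
    for child in tree:
        if child == query:
            others = [c for c in tree if c is not child]
            return set().union(*(leaves(c) for c in others))
    for child in tree:
        found = sister_leaves(child, query)
        if found is not None:
            return found
    return None


def classify(tree, query, group):
    """Classify the position of `query` relative to the leaves in `group`."""
    sister = sister_leaves(tree, query)
    if sister is None:
        raise ValueError(f"{query!r} is not a leaf of the tree")
    group = set(group)
    if sister == group:
        return "basal"    # query is sister to the entire group
    if sister and sister < group:
        return "nested"   # query is sister to only part of the group
    return "other"        # sister clade includes sequences outside the group


eukaryotes = {"amoeba", "stramenopile", "animal", "fungus"}

# Sparse-sampling style result: the virus branches before all eukaryotes.
basal_tree = (("virus", (("amoeba", "stramenopile"), ("animal", "fungus"))), "archaeon")

# Taxon-rich style result: the virus groups with a host-lineage sequence.
nested_tree = (((("virus", "amoeba"), "stramenopile"), ("animal", "fungus")), "archaeon")

print(classify(basal_tree, "virus", eukaryotes))
print(classify(nested_tree, "virus", eukaryotes))
```

A check of this kind looks at topology alone. It ignores node support and cannot by itself detect long-branch attraction, so it complements, rather than replaces, a taxon-rich analysis with an appropriate substitution model.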

Figure 1: Taxon-rich phylogenetic tree of clamp loader proteins.

This 106-taxon tree was constructed on 174 conserved sites with the Bayesian approach implemented in PhyloBayes, using a mixture model (CAT) that is less sensitive to compositional bias and evolutionary rate heterogeneity between species18. Note that, in contrast to Claverie and Ogata's 20-taxon tree4, there is not a single eukaryotic group but three distinct paralogues and that, as a consequence of the richer taxonomic sampling, viral sequences emerge within (and not at the base of) the eukaryotic groups, suggesting that they were acquired by horizontal gene transfer from their eukaryotic hosts. Numbers at nodes are posterior probabilities. Sequence accession numbers are given in parentheses.

In conclusion, even if viruses were considered to be alive and to pre-date primitive cells, they could not be placed in a universal TOL, for purely methodological reasons: the absence of shared genes and/or the loss of phylogenetic signal over billions of years of evolution. The claim that viruses can be placed in a TOL using cell-like genes is based on artefactual results and can be shown to be wrong.