The discovery of giant viruses of amoebae changed the field of virology1,2. Since the nineteenth century, viruses have been defined by their submicroscopic size, and this dogma probably prevented researchers, for decades, from searching for and discovering giant protozoan viruses, which can be visualized by light microscopy using the usual dyes3 (Box 1; Fig. 1). Indeed, for years the Acanthamoeba polyphaga mimivirus (APMV) was considered to be an intracellular bacterium of amoebae1 (Box 2). In a similar manner, pandoraviruses were thought to be atypical parasites of amoebae4. However, the absence of ribosomal DNA from an isolate that was presumed to be a bacterium eventually led to the discovery of APMV1,2, and, interestingly, the absence of a ribosome essentially separates giant viruses from the three defined domains of life. The genomic and proteomic analyses of giant viruses revealed that they are more complex than other viruses. Furthermore, giant viruses have several novel biological features: mimiviruses can be infected by parasitic viruses (termed virophages), they can contain mobile DNA elements (termed transpovirons), and they have a defence mechanism against virophages termed the mimivirus virophage resistance element (MIMIVIRE)5,6,7. Thus, how should giant viruses, which do not have a ribosome but have a level of complexity that approaches that of numerous bacteria and even of eukaryotes that grow intracellularly, be defined8 (Box 1; Supplementary information S1 (table))? Furthermore, there is disagreement as to whether the large genomes of giant viruses are a result of smaller viruses acquiring genes or of a genome with cellular ancestry adapting to escape the cell nucleus2,9,10,11,12,13.
In this Review, we reflect on 13 years of research into giant viruses and detail the advances that have been made in characterizing their genomes, structures and mechanisms of replication, as well as the virophages and mobile elements with which they are associated. We place these findings in the context of the ongoing debate on the evolutionary origin of giant viruses and on the extent of similarity between giant viruses and bacteria and eukaryotic cells.

Figure 1: Particle and genome size of giant viruses of amoebae.

Families (namely, the Mimiviridae and Marseilleviridae; represented by dark blue circles) or putative families of giant viruses of amoebae (namely, those that include pandoraviruses, pithoviruses, faustoviruses and mollivirus; dark blue circles) are shown, along with other families in the proposed order Megavirales (namely, Poxviridae, Asfarviridae, Phycodnaviridae, Iridoviridae and Ascoviridae; light blue circles). Some families or genera of smaller viruses (grey circles) are shown in the inset, which magnifies a section of the larger graph and shows viruses that have a genome size ≤400 kb and a particle size ≤400 nm. Circle sizes are proportional to virus particle sizes. For each family or genus, the size of the largest member is shown. Viruses are referred to by their family or genus name unless there is no family or genus name.

Structure and genomes of giant viruses

APMV and other giant viruses of amoebae have several structural and genomic features that had not been described in viruses before their discovery.

Understanding APMV. APMV has remarkable features compared with other viruses. The APMV capsid is 500 nm in size and is covered by fibrils that are 120–140 nm long and 1.4 nm thick1. These fibrils are morphologically unique among viruses and, although their structures have not been fully elucidated1,14 (Supplementary information S2 (figure)), they form a dense layer, are extensively glycosylated, and enable the attachment of APMV to amoebae, bacteria, arthropods and fungi through glycans15. The capsid comprises proteins that have a double jelly-roll fold and is icosahedral except at one vertex, which is covered by a unique five-branch starfish-shaped structure (termed 'stargate') that is devoid of fibres16. Beneath the capsid, and surrounded by an inner lipid membrane and fibres, is a spherical lipid bilayer compartment that is 340 nm in size and contains the genome (with an estimated packing density of 0.06 bp per nm3) and proteins17,18. This nucleocapsid has a large depression that faces the 'stargate', which creates a cavity and indicates that the nucleocapsid has a fixed position relative to the external capsid. APMV particles contain 114 proteins, which is only 12% of the number of predicted genes (see below), among which 12 are involved in transcription, 5 are involved in DNA topology and repair, 2 are involved in RNA modification, 5 are involved in particle structure and 7 are involved in protein or lipid modifications19.

APMV has a double-stranded DNA (dsDNA) genome that is 1.2 Mb in length and contains 979 genes that putatively encode proteins with a coding density of 89%2,20 (Fig. 2; Supplementary information S3 (table)). The genome is also AT rich, comprising 72% AT nucleobases. Several APMV genes are not found in viruses other than giant viruses of amoebae, including those that encode translation factors and aminoacyl tRNA synthetases; some of the genes encoding aminoacyl tRNA synthetases are expressed21,22 and encode functional proteins23. In addition, the APMV genome encodes four different tRNAs2. Other genes that are unique to APMV and other giant viruses of amoebae encode proteins that are involved in nucleotide synthesis, amino acid metabolism, protein modification, lipid or polysaccharide metabolism, DNA repair or protein folding2. In addition, the APMV genome contains four major groups of genes, including core genes that are shared with poxviruses, ascoviruses, iridoviruses, asfarviruses, phycodnaviruses and other giant viruses of amoebae, as well as large sets of horizontally transferred genes, paralogous genes (in addition to genes that were involved in large genome duplication events) and orphan genes (also known as ORFan genes)2,9,24,25,26. Orphan genes are estimated to represent 48% of the predicted gene content2,27, and the proportion of orphan-encoded proteins is also very high (40%) in APMV particles19. The APMV genome also contains early and late gene promoters, and mRNAs are expressed as polyadenylated transcripts that most often end with short palindromic sequences that form hairpin-like structures2,22,28. Recoding events, including a frameshift and a readthrough, were described for a gene that encodes a translation termination factor in the APMV genome29. 
In addition, introns and inteins were detected in a few conserved genes, including those that encode the major capsid protein and the DNA polymerase30,31, and a 'mobilome' that is unique to APMV was identified6 (see below). Finally, a few mRNAs were detected in mimivirus capsids2 (Fig. 2).
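
The proportions quoted for the APMV particle can be checked with a few lines of arithmetic. The sketch below is only a consistency check, not a published calculation: it assumes (purely for illustration) that the 340 nm genome-containing compartment is a perfect sphere whose whole volume is available to the DNA, and it uses the rounded 1.2 Mb genome length, the 979 predicted genes and the 114 particle proteins given in the text. The variable and function names are hypothetical.

```python
import math

# Figures quoted in the text for APMV (rounded values).
COMPARTMENT_DIAMETER_NM = 340   # genome-containing compartment
GENOME_LENGTH_BP = 1.2e6        # 1.2 Mb dsDNA genome
PREDICTED_GENES = 979
PARTICLE_PROTEINS = 114


def sphere_volume_nm3(diameter_nm):
    """Volume of a sphere, in nm^3, from its diameter in nm."""
    radius = diameter_nm / 2
    return 4 / 3 * math.pi * radius ** 3


# Assumption: the compartment is an ideal sphere entirely filled by DNA.
volume = sphere_volume_nm3(COMPARTMENT_DIAMETER_NM)
bp_per_nm3 = GENOME_LENGTH_BP / volume
nm3_per_bp = volume / GENOME_LENGTH_BP

# Share of the predicted genes whose products are found in the particle.
particle_fraction = PARTICLE_PROTEINS / PREDICTED_GENES

print(f"compartment volume: {volume:.3g} nm^3")
print(f"DNA density: {bp_per_nm3:.3f} bp per nm^3 ({nm3_per_bp:.1f} nm^3 per bp)")
print(f"particle proteins / predicted genes: {particle_fraction:.1%}")
```

Under these assumptions the DNA density comes out at roughly 0.06 bp per nm3 (about 17 nm3 per bp), and the particle proteome corresponds to just under 12% of the predicted genes.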

Figure 2: Major genomic and structural features of APMV.

Major structural features of the Acanthamoeba polyphaga mimivirus (APMV) particle are shown in the middle of the figure. Black arrows link APMV to the major features of the gene repertoire and nucleic acid content of APMV, including its mobilome and mRNAs, which are shown in grey boxes. The red arrow indicates that virophages can infect mimivirus factories. The minus sign indicates that mimivirus virophage resistance element (MIMIVIRE) can protect against virophages. *Gene definitions and putative functions are given, as well as their relationship to other viral genes. Corresponds to mimivirus genes, the closest match to which in the NCBI GenBank protein sequence database is from a virus that does not belong in the family Mimiviridae (except distant mimiviruses). || Maximum and minimum proportions of genes that are inferred to be involved in lateral transfer are given, as assessed in three different studies9,25,26. NCLDV, nucleocytoplasmic large DNA viruses; VLTF2, viral late transcription factor 2.

Other giant viruses of amoebae. In 2005, APMV became the founding member of the family Mimiviridae32. Since then, about 100 new mimivirus strains have been isolated by culturing on amoebae from water, soil, insect and human samples that were collected worldwide (Supplementary information S3,S4 (table, figure)), most recently using high-throughput strategies33,34,35 (Box 3); the second mimivirus to be cultured was named mamavirus5. The sizes, morphologies and genomes of the other mimivirus isolates are similar to those of APMV30,33,36. They have capsids 370–600 nm in diameter and their genomes are 1.02–1.26 Mb in length, AT rich (72–75% AT nucleobases) and encode 930–1,120 putative proteins. Phylogenomics has enabled mimiviruses that infect amoebae to be divided into three lineages that were named lineage A (in which APMV is the prototype), lineage B (in which moumouvirus is the prototype)30 and lineage C (in which Megavirus chiliensis is the prototype)36. In 2010, a distant mimivirus relative named Cafeteria roenbergensis virus, the capsid and genome of which are smaller than those of the mimiviruses of Acanthamoeba, was isolated from an abundant marine dinoflagellate37. Subsequently, a few other viruses that infect marine unicellular eukaryotes, including Phaeocystis globosa virus, were linked, albeit distantly, to mimiviruses38,39,40.

In addition to mimiviruses, other giant viruses of amoebae were discovered using amoebal co-culture methods. The first of these was marseillevirus, which was described in 2009 (Ref. 41). Since 2013, the number and diversity of giant viruses of amoebae have expanded considerably, and, as of 2016, two virus families, the Mimiviridae and Marseilleviridae, have been described32,42. However, other giant viruses, including pandoraviruses, pithoviruses, faustoviruses and Mollivirus sibericum, represent putative new giant virus families5,43,44,45,46 (Supplementary information S3,S5,S6 (table, table, figure)). Since the APMV genome was described, giant viruses of amoebae have been linked, through phylogenomic analyses, to nucleocytoplasmic large DNA viruses (NCLDV), which is a group of dsDNA viruses that was described in 2001 and comprises poxviruses, ascoviruses, iridoviruses, asfarviruses and phycodnaviruses; these viruses infect a wide range of eukaryotic cells, from algae to insects and mammals2,47,48. In 2009, it was noted that giant viruses of amoebae and NCLDVs share a small subset of nine core genes, five of which are found in all of their genomes (these encode a major capsid protein, a D5 helicase, a family B DNA polymerase, an A32-like packaging ATPase and a very late transcription factor), and a larger subset of 200 genes are shared by at least two NCLDV families48,49. Moreover, giant viruses of amoebae and NCLDVs were described to have a common ancestor, the genome of which is thought to contain 50 conserved genes that are likely, based on phylogenomic analyses, to have an early origin that is possibly concomitant with eukaryogenesis48,49,50,51,52. 
In 2012, it was proposed that giant viruses of amoebae and NCLDV families should be classified into a new viral order, the Megavirales, on the basis of their common origin, which was suggested by the fact that they share a large set of ancestral genes that encode key viral functions, a common virus particle architecture and major biological features, including replication that occurs inside cytoplasmic factories53. Nevertheless, the architecture of pandoraviruses, Pithovirus sibericum and M. sibericum (see below) differs considerably from that of other giant viruses of amoebae, and no capsid-like structure (or, in the case of pandoraviruses, even a capsid gene) was detected in these viruses. Together with the fact that poxviruses and ascoviruses have brick-shaped virus particles and allantoid capsids, respectively, this challenges the criteria that are used to classify viruses in the proposed order Megavirales43,44,46,53.

Members of the proposed order Megavirales that were isolated as a result of them being cultured on various amoebae and described during the past 13 years have a wide range of sizes, shapes, structures, genome lengths, percentage of GC nucleobases, gene repertoires and replicative sites (Supplementary information S3,S5,S6,S7 (table, table, figure, figure)). Nonetheless, they still comprise a monophyletic clade that is based on a limited set of core genes and informational genes48,49,53,54,55. In addition, all of these giant viruses enter amoebae through phagocytosis, after which fusion occurs between the vacuole membrane and the internal membrane of the virus; this leads to the release of the genome into the cytoplasm of the amoeba56,57. Finally, the virus factory has a cytoplasmic location, with the exception of M. sibericum46.

Giant viruses of amoebae were isolated from various environmental samples, ecosystems and geographical locations, and from hosts, including amoebae, invertebrates and mammals33,57,58. They were also detected in several metagenomes that were generated from environmental, animal and human samples59,60,61,62,63,64, and in plant genomes65. Finally, sequences from new putative giant viruses were detected in marine environmental metagenomes by approaches that revealed that previously overlooked sequences are related to members of the Megavirales54,66,67. These facts suggest that giant viruses of amoebae are common in our biosphere.

Giant viruses of amoebae have various morphological features (Supplementary information S3,S5,S6 (table, table, figure)). Marseilleviruses and faustoviruses have icosahedral capsids that are 250 nm and 200 nm in size, respectively, and have no 'stargate' structure41,45. The surface of the marseillevirus prototype isolate is covered by fibres that are 12 nm long, whereas the faustovirus prototype isolate has no fibres. Unlike all other known DNA viruses, the faustovirus prototype isolate encapsidates its genome in a double protein shell: the outer shell is composed of a double jelly-roll protein and is similar to that of many double-stranded DNA viruses (including giant viruses of amoebae), whereas the inner shell is composed of a repeated hexameric unit, the structure of which differs from that of known capsid proteins. In addition, the major capsid protein of the outer shell is encoded by a 17,000 bp genomic fragment that contains several introns and exons68. Pandoraviruses, pithoviruses and M. sibericum differ in morphology from other giant viruses of amoebae; their virus particles are ovoid or, in the case of M. sibericum, spherical43,44,46. In addition, pandoraviruses and pithoviruses are larger than mimiviruses and all other giant viruses of amoebae. Pandoravirus and pithovirus particles have a wall that is 60–70 nm thick and an apical pore at one of their extremities, which enables the internal virus components to be delivered into the amoebal cytoplasm and has the appearance of a honeycomb grid in pithoviruses43,44. The M. sibericum tegument is covered by two layers of fibres that are 10–14 nm long and has a funnel-shaped aperture at its apex46.

The genomes of giant viruses of amoebae other than mimiviruses have different lengths, GC content, gene numbers, functions and origins2,40,42,43,44,45 (Supplementary information S6,S7 (figures)). They all include numerous orphan genes, similarly to APMV, which indicates that many of their structural and functional characteristics are novel and remain to be deciphered2,41,43,44,45,46. In addition, these genomes all have a substantial level of mosaicism; this is well exemplified by marseillevirus41. This suggests that many genes were transferred laterally between these viruses, other viruses and cellular organisms. Among the particular features of these genomes, the genomes of marseilleviruses encode histone-like proteins41,69; the genes in pandoraviruses frequently include spliceosomal introns43 and the genome of Pandoravirus salinus contains particular transposons that were termed miniature inverted-repeat transposable elements70; and the genomes of faustoviruses encode membrane occupation and recognition nexus (MORN)-repeat-containing proteins, similar to marseilleviruses and pandoraviruses44. In the faustovirus prototype isolate, the capsid protein was found to be encoded by a 17,000 bp genomic region with several exons and introns, which was most unexpected45,68. Of note, it has been hypothesized that giant virus genomes may evolve through a complex accordion-like process, with successive steps of genome expansion through duplications and gene transfers followed by genome reduction71. Hence, the genomes of giant viruses of amoebae exhibit both features that are shared between some or all giant viruses of amoebae and features that are particular to each family or putative family.

Giant viruses of amoebae that are living fossils. P. sibericum and M. sibericum were isolated from a Siberian permafrost sample that was more than 30,000 years old, which dates back to the Pleistocene epoch, and were therefore considered to be 'living fossils' (Ref. 44). P. sibericum has an AT-rich (64%) dsDNA genome that is 610,033 bp in length, which is unexpectedly small considering that it has the largest particle size (1.5 μm in length and 0.5 μm in diameter) among giant viruses of amoebae44. Moreover, its coding density is only 68%, owing to a considerable number of 150-bp-long regularly interspersed palindromic sequences that are distributed in tandem repeats within arrays that are 2,000 bp long. Although P. sibericum morphologically resembles pandoraviruses, it is most closely related to marseilleviruses and iridoviruses based on phylogenetic analyses. A total of 159 proteins were identified in purified virus particles, of which two-thirds, including the four most abundant, have unknown functions44. The genome of M. sibericum is 651,523 bp in size, with inverted repeats approximately 10,000 bp long at its extremities, and, similar to pandoraviruses, is GC rich (60%)46. The majority of the sequences that are most similar to genes in M. sibericum correspond to pandoravirus genes, albeit with a low level of homology (about 40% on average), and this result is congruent with phylogenetic analyses. Purified virus particles contain 136 proteins, of which more than half, including the three most abundant, are orphans. In addition, a homologue of the major capsid protein in members of the proposed order Megavirales is translated, although it is apparently not involved in the structure of the virus particle46.

Another pithovirus, named Pithovirus massiliensis, was recently isolated from sewage in southern France and was compared with its fossil counterpart72. This analysis indicated that the genomic content of pithoviruses evolves slowly, as it was conserved after thousands of years, with selective pressure on the conserved genes. This suggests that the mechanisms of evolution are comparable in giant viruses and bacteria, and include selection, gene fixation and then selective constraints. First, the comparison enabled a molecular clock to be estimated for giant viruses for the first time; the mutation rate that was estimated based on the dN/dS ratio was 2 × 10−6 substitutions per site per year, which is lower than that of RNA viruses, very similar to that estimated for poxviruses (0.5–7 × 10−6 substitutions per site per year)73 and in the same order of magnitude as those found in bacteria and archaea72. Second, it showed that genes that were thought to be horizontally acquired were selected and highly conserved, which indicates that they are essential genes. The GC content and codon usage of these genes tend to adapt progressively to those of the recipient genome72. In light of these results, pithoviruses can select for genes that are acquired by horizontal gene transfer, followed by their long-term fixation and adaptation to viral codon usage. Third, orphans that are highly abundant in pithoviruses44,72 were also constrained by strong selective pressure, which indicates that their accumulation is not random and is biologically relevant. These findings do not support the hypothesis that giant viruses are 'bags' of genes and pseudogenes that have been randomly taken from cellular organisms and not used, but rather indicate that horizontally acquired genes and orphan genes are functional and biologically active. Similar findings were reported previously for two marseilleviruses that were isolated in south-eastern France and Australia74.
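
To give a feel for what the estimated rate implies, the following back-of-the-envelope sketch converts it into an expected divergence between the permafrost isolate and a modern relative. It is not the method used in the cited study. It assumes a strict molecular clock with a constant rate, takes 30,000 years as a lower bound for the age of the sample, supposes that only the modern lineage accumulated substitutions while the ancient isolate was frozen, and ignores multiple substitutions at the same site. All names in the code are hypothetical.

```python
# Values quoted in the text.
RATE = 2e-6                  # substitutions per site per year (pithoviruses)
POXVIRUS_RANGE = (0.5e-6, 7e-6)
YEARS = 30_000               # lower bound for the age of the permafrost sample
GENOME_LENGTH_BP = 610_033   # P. sibericum genome


def expected_divergence(rate, years, evolving_lineages=1):
    """Expected substitutions per site under a strict molecular clock.

    evolving_lineages=1 models a frozen ancient isolate compared with a
    modern descendant; use 2 for two contemporary lineages that diverged
    `years` ago.
    """
    return rate * years * evolving_lineages


per_site = expected_divergence(RATE, YEARS)
genome_wide = per_site * GENOME_LENGTH_BP

print(f"expected divergence: {per_site:.2%} of sites")
print(f"about {genome_wide:,.0f} substitutions across the genome")
print("within poxvirus range:", POXVIRUS_RANGE[0] <= RATE <= POXVIRUS_RANGE[1])
```

Under these simplifying assumptions, roughly 6% of sites would be expected to differ after 30,000 years, which is in keeping with the slow evolution described above.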

Infection cycle of giant viruses

The replicative cycles of giant viruses in amoebal cells show several similarities, including phagocytic entry, DNA release and replication in 'viral factories'. However, differences exist between giant viruses in the duration of their replication cycles, the involvement of the amoebal nucleus in virus replication, and the assembly and release of virus particles.

The replicative cycle of APMV, which lasts about 12 h and occurs in the amoebal cytoplasm1,56,75 (Fig. 3), has several features that have not been observed before in viruses. APMV is internalized through phagocytosis76 before its genome-containing internal core is released through the 'stargate' into the cytoplasm, through the fusion of viral and phagosome membranes; transcription may be initiated in these cores77. Subsequently, the genome is released from the core and is replicated at high levels, which generates an early cytoplasmic replicative centre that is thought to be engulfed by a membrane layer of the endoplasmic reticulum. Replicative centres then merge into a single virus factory that contains, from the inside to the outside, zones that are involved in replication, membrane biogenesis, capsid assembly and DNA packaging, and fibre acquisition56,77. The virus internal layer seems to originate from multivesicular membrane structures that bud from the endoplasmic reticulum and become open membrane sheets; the major capsid protein is assembled around these sheets while acting as a scaffold78. Genome packaging occurs through a transient aperture that is distinct from the 'stargate' (Ref. 78). Finally, the layer of fibres is assembled. The involvement of the amoebal nucleus in the life cycle of APMV has been debated56,57,75. Indeed, the size of the amoebal nucleus decreases approximately twofold during APMV replication, even though this process occurs in a viral factory in the cytoplasm; amoebal nuclear factors may be involved.

Figure 3: The mimivirus replication cycle and key replicative features of other giant viruses of amoebae.

a | Schematic of the replication cycle of Acanthamoeba polyphaga mimivirus (APMV). Virus particles can be seen at the surface of an Acanthamoeba sp. amoeba at the first stage of the cycle. Then, after virus entry through phagocytosis, the eclipse phase begins and the release of the APMV genome into the amoebal cytoplasm seeds the virus factory, which appears at a different location to the cell nucleus. Starting from 8 h post-infection, the assembly of virus particles can be seen at the periphery of the virus factory. Amoebal lysis occurs 12 h post-infection. b–e | Electron microscopy images of APMV particles and Acanthamoeba sp. cells that are infected with APMV. b | A mimivirus particle is shown. c | A mimivirus factory in the cytoplasm of an Acanthamoeba sp. amoeba 8 h post-infection. d | This image shows the edge and periphery of a virus factory, in which internal membrane biogenesis and assembly, capsid assembly and DNA packaging, and fibre acquisition of the APMV particles occur 8 h post-infection. e | This image shows an Acanthamoeba sp. amoeba 12 h post-infection with mimivirus, with the amoebal cytoplasm filled with mimivirus particles. f | The key replicative features of giant viruses of amoebae are shown in this table. Dashes indicate where there is no noticeable feature for a virus. A. castellanii, Acanthamoeba castellanii; V. vermiformis, Vermamoeba vermiformis. Microscopy images in parts b–e courtesy of I. Pagnier, Institut Hospitalo-Universitaire (IHU) Méditerranée Infection, Aix Marseille University, France.

In marseilleviruses, the replication cycle lasts about 6–16 h (Refs 41,69) (Fig. 3). The cell nucleus undergoes transient morphological changes during early stages of the replication cycle, and marseillevirus viral factories tend to have a more extensive distribution in the cytoplasm than APMV viral factories. Marseilleviruses can form giant vesicles, comprising dozens of virus particles, that are wrapped by membranes derived from the amoebal endoplasmic reticulum79. The viruses can be released outside of the amoeba within these giant vesicles, which may in turn trigger phagocytosis by amoebae through an acidification-independent process79. In addition, single marseillevirus particles may enter Acanthamoeba spp. through an endosome-stimulated pathway or may group together to trigger their own phagocytic uptake79.

The replication cycle of faustoviruses occurs in Vermamoeba vermiformis (not in Acanthamoeba spp.) and lasts 18–20 h (Ref. 45). It occurs within the amoebal cytoplasm independently of the cell nucleus, and the viral factory occupies almost the entire amoebal cell, although it should be noted that some virus particles that are produced by the factory may only correspond to empty capsids.

The life cycles of pandoraviruses, P. sibericum and M. sibericum last 10–18 h, 10–20 h and 6 h, respectively43,44,46. The wall and interior of these virus particles seem to assemble simultaneously. The replication of P. sibericum does not seem to affect the amoebal nucleus44. By contrast, infection by pandoraviruses and M. sibericum results in a disorganized amoebal nucleus that exhibits numerous membrane invaginations43,46. The pandoravirus factory seems to replace the cell nucleus, whereas the nucleus persists during infection with M. sibericum, with new virus particles emerging at its periphery. These observations are consistent with the fact that transcription-associated proteins are absent from pandoravirus and mollivirus particles, which suggests that they require the amoebal nucleus, but are present in pithovirus particles, which suggests that they can initiate transcription independently from the nucleus57.

Overall, giant viruses of amoebae enter cells through a phagocytic-like process and release hundreds of new virus particles through amoebal lysis; the exception is M. sibericum, for which the release of virus particles seems to occur through exocytosis without lysis46. Acanthamoeba spp. or V. vermiformis can protect giant viruses against physical and chemical threats, as they are highly stable organisms that can become encysted80. In addition, APMV particles were stable for 9 months in environmental freshwater, saline water and hospital ventilator devices81, and resisted antibiotics, 48 h of desiccation, and several chemical biocides, including alcohols82. Finally, the phagocytosis-like entry process of giant viruses contrasts with the need of specific interactions for other viruses to enter their host and could enable giant viruses to infect a broader range of hosts. Although amoebae are the primary hosts of giant viruses, mimiviruses and marseilleviruses have been detected in oysters, insects, monkeys, cattle and humans83,84,85. Moreover, APMV can enter various human myeloid cells and replicate in peripheral blood mononuclear cells76,86, and marseillevirus can cause a productive infection in human T lymphocyte cells87. In addition, the virus factories in which these giant viruses replicate were described for other viruses, including members of the proposed order Megavirales that are not discussed here, other DNA viruses (including herpesviruses) and some RNA viruses (such as flaviviruses or coronaviruses)88. These factories are replication organelles that recruit viral and cellular components for virus assembly and maturation. Viral factories were considered as the nuclei of 'virocells' (that is, cells infected by a virus, the aim of which is to produce virions), which would themselves correspond to the cellular forms of viruses when not strictly assimilated into a virus particle (or virion)26. 
In the case of APMV and other giant viruses of amoebae, if the virocell is the infected amoeba, the giant viral factory can be considered as its nucleus. This view is connected to two evolutionary scenarios: viral eukaryogenesis13, in which the eukaryotic nucleus originated from the viral factory of an ancient giant virus26, and nuclear viriogenesis, in which some giant viruses originated from the nucleus of ancient proto-eukaryotic cells89.

Virophages and other mobile elements

The discovery and study of mimiviruses led to the identification of a 'mobilome' that is specific to these giant viruses and comprises viruses that can infect their viral factories and integrate into their genomes.

Virophage discovery and diversity. Virophages, which were discovered in 2008 (Ref. 5), are viruses that infect mimivirus factories. They have small virus particles that are 35–74 nm in diameter and have an icosahedral capsid and a dsDNA genome of 17–29 kb. Their genomes are predicted to contain 16–34 genes, most of which are orphans or genes of unknown function, and six core genes90,91. Virophages cannot replicate alone in their amoebal host but, instead, replicate only in the presence of mimiviruses5. A co-culture procedure that uses a culturable helper virus was established to explore virophage diversity, and it enabled the isolation of new virophages and the analysis of their ability to infect mimivirus lineages92. The first virophage, named Sputnik, was isolated with mamavirus and was shown to impair its replication cycle and morphogenesis5. Specifically, the ability of mamavirus to lyse amoebae was decreased and mamavirus particles with abnormal morphologies were observed. So far, three Sputnik-like virophages, namely Sputnik 2, Sputnik 3 and Rio Negro virophage, have been isolated, and they all infect lineage A, lineage B and lineage C of the Mimiviridae92,93,94. A divergent virophage that is associated with mimiviruses of Acanthamoeba spp., named Zamilon, was also isolated95. Another virophage, named Mavirus, depends on Cafeteria roenbergensis virus for its replication and was the second virophage to be discovered after Sputnik96. The study of virophage diversity is still in its infancy, and several studies suggest that virophages are common and especially abundant in the oceans. For example, virophages were identified in the genome of the marine alga Bigelowiella natans97, and virophage genomes were assembled from metagenomic datasets98. Virophages that infect mimiviruses have been recently classified in the family Lavidaviridae91.
This classification highlights that, as suggested by their particle and genome size, virophages are similar in complexity to genuine dsDNA viruses, and they share a set of six core genes that strongly suggests their monophyletic origin. In addition, the promoter and transcription termination signals of mimivirus virophages are similar to those of mimiviruses, which suggests that these virophages rely on their associated giant virus for mRNA synthesis99.

Although practical for isolating new virophages, the use of a culture procedure that is based on A. polyphaga and mimiviruses as a reporter system92 was hindered by the unexpected discovery that mimiviruses of lineage A, but not of lineage B and lineage C, were resistant to the Zamilon virophage95. The search for a system that would explain this resistance was inspired by an analogy to the CRISPR–Cas system, which is widely present in bacteria and archaea and relies on the integration of short DNA sequences from invaders7,100. The resistant lineage A isolates indeed have an insertion of four 15-nucleotide-long repeated Zamilon virophage sequences within an operon, named the MIMIVIRE, the resistance mechanism of which was suggested to rely on the sequence-specific recognition of a nucleic acid sequence7. The probability of finding such a sequence in this single mimivirus lineage by chance was very low (<1 × 10−9). MIMIVIRE-associated genes encode a helicase and nuclease that are involved in the degradation of foreign nucleic acids101 and the functions of which were experimentally validated7; MIMIVIRE also includes a gene that contains the repeated insert. Silencing each of these three genes by RNA interference restored the susceptibility of mimivirus to the Zamilon virophage. However, the exact molecular mechanism of MIMIVIRE is unclear. The MIMIVIRE system is thought to differ from CRISPR–Cas systems owing to the absence of a Cas1 homologue, of protospacer-adjacent motifs and of a well-conserved organization that includes bona fide CRISPR-like repeats100,102. Furthermore, the helicase and nuclease that are described in this MIMIVIRE system are only distantly related to proteins that are classified in the same superfamilies as Cas3 and Cas4 proteins, and they do not seem to be related to them100. 
An alternative explanation for the resistance of mimivirus to virophages is a mechanism that relies on interactions between proteins, rather than on nucleic acid-based recognition, and that involves a restriction factor protein of the virophage replication machinery102.
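
The sequence observation that motivated the MIMIVIRE hypothesis, namely short virophage-derived sequences present in several copies in the resistant virus, can be illustrated with a minimal k-mer search. The sketch below is not the pipeline used in the cited studies: the function names are hypothetical, the sequences are invented toy data rather than real Zamilon or mimivirus sequences, and only exact matches on one strand are considered. The helper `match_probability` gives the chance that one fixed 15-mer matches one fixed position under an assumed model of independent, equally frequent bases; it is shown only for orientation and is not necessarily how the published estimate was derived.

```python
from collections import defaultdict


def kmer_positions(sequence, k):
    """Map every k-mer in `sequence` to the 0-based positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    return positions


def shared_repeated_kmers(host, invader, k=15, min_copies=2):
    """Return invader k-mers that occur at least `min_copies` times in `host`.

    Only exact, forward-strand matches are reported (a simplifying assumption).
    """
    invader_kmers = {invader[i:i + k] for i in range(len(invader) - k + 1)}
    host_index = kmer_positions(host, k)
    return {
        kmer: starts
        for kmer, starts in host_index.items()
        if kmer in invader_kmers and len(starts) >= min_copies
    }


def match_probability(k):
    """Chance that a fixed k-mer matches a fixed position, assuming
    independent bases with equal frequencies (an idealised model)."""
    return 0.25 ** k


# Toy data (hypothetical): a 15-nt motif present once in the "virophage"
# and inserted four times, separated by short linkers, in the "host".
motif = "ACGTTGCAAGCTTAG"
invader = "GGGCCC" + motif + "TTTAAA"
host = "AT" + motif + "CCGG" + motif + "TTGA" + motif + "GGCA" + motif + "AT"

hits = shared_repeated_kmers(host, invader)
for kmer, starts in hits.items():
    print(kmer, "copies:", len(starts), "start positions:", starts)
print("per-position match probability for k=15:", match_probability(15))
```

On real genomes, a search of this kind would normally also scan the reverse complement and tolerate mismatches, which dedicated alignment tools handle more robustly than an exact dictionary lookup.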

Provirophages, transpovirons and mobilome. Mobile genetic elements are common features of microorganisms, and mimiviruses have a complex mobilome (Fig. 3). Indeed, the DNA of virophages can integrate into the mimiviral host genome as 'provirophages' (that is, proviruses of virophages)6. In addition, a novel category of mobile elements, named transpovirons, has been reported. Transpovirons are linear DNA elements of 7 kb that encompass 6–8 protein-coding genes and have terminal inverted repeats of 530 bp (Ref. 6). Transpovirons integrate randomly into giant virus genomes and are strictly dependent on giant viruses for their replication and spread; they replicate in the mimivirus factory and accumulate inside mimivirus particles, virophage particles and the amoebal cytoplasm6. Distinct transpovirons associate with different mimiviruses, for example, the mamavirus and courdo7 virus isolates, which are classified in mimivirus lineage A and lineage C, respectively.

In evolutionary terms, the giant virus mobilome represents a vehicle and route for genetic novelty and adaptation. Thus, a substantial proportion of genes in the genomes of mimiviruses and other giant viruses are predicted to have been exchanged with bacteria, archaea, eukaryotes (including their amoebal hosts) and other viruses, although the direction of putative transfers is often unclear26,103. This high level of genome mosaicism might be related to the co-infection of amoebae by giant viruses and microorganisms, such as bacteria or fungi41,103. Indeed, giant viruses can multiply within amoebae that are infected with other microorganisms, which provides opportunities for horizontal gene transfer. Acanthamoeba spp. are well-known hosts for several bacteria and some fungi, in addition to viruses104. Acanthamoeba spp. that were isolated from a contact lens cleaning liquid were found to be co-infected with a mimivirus and two bacteria93, and Acanthamoeba spp. have been observed experimentally to be co-infected with marseillevirus and two bacteria41. In addition, the fact that, in their natural environment, giant viruses are sympatric with other microorganisms that infect Acanthamoeba helps to explain the large size of their genomes, which accumulate genes that increase their fitness against other microorganisms that replicate in amoebae103,105. Interestingly, a 16% decrease in genome size was observed when APMV was co-cultured with amoebae in the absence of other microorganisms105; the APMV genes that were inactivated were primarily those with the lowest expression, which suggests the loss of obsolete genes106. Moreover, these gene losses were associated with the loss of the external fibres that cover the APMV capsid and with a decreased rate of replication105.

Conclusions and future directions

The replication cycle, structure, genomic make-up and plasticity of giant viruses differ from those of traditional viruses. They have virus particles that are as large as some microorganisms and have a stunning level of complexity. Their genomes are mosaic and contain large repertoires of genes, some of which are hallmarks of cellular organisms, although the majority have unknown functions. These giant viruses enter amoebae through phagocytosis and replicate inside viral factories. In addition, as shown for mimiviruses, they are associated with a specific mobilome and are parasitized by viruses that they can defend against. Giant viruses not only challenge the classification of viruses but also raise intriguing questions about their origin. They extend the definition of viruses into a broader range of biological entities, some of which are very simple and others of which have a complexity comparable to that of other microorganisms.

More giant viruses of amoebae must be identified to gain a better knowledge of their prevalence and diversity. This diversity remains largely unexplored but has expanded over the past three years, and it is likely to expand further now that we have learnt not to neglect these viruses and now that high-throughput detection and isolation strategies are available (Box 3). The role that giant viruses of amoebae have in evolution also warrants further attention. Several hypotheses on their ancient origin and their evolutionary relationship with cellular organisms have been proposed, and these topics should continue to be debated. In these viruses, it will be important to search for a translation system that does not involve ribosomes, and/or that predates the ribosome and of which remnants might be found. Finally, the detection of giant viruses of amoebae in humans and the study of their potential pathogenicity are emerging fields107. To date, the giant viruses of amoebae that have been linked to humans were discovered before 2013 (Box 4). Mimiviruses were associated with pneumonia108,109, whereas marseillevirus was detected in the blood and the lymphoid tissue, and was associated with adenitis and lymphoma110,111. More systematic investigations of human samples for giant viruses of amoebae must be conducted; such research is more accessible to the next generation of virologists, who entered the field after the description of APMV, and it will surely reveal more giant viruses and refine their definition for years to come.