Main

All bona fide RNA viruses encode at least two proteins, a capsid protein and an RNA-dependent RNA polymerase (RdRP). In this article, we do not consider 'abnormal' RNA viruses such as hepatitis delta virus, which does not possess its own replication machinery and borrows a capsid from another virus1, or the capsid-less narnaviruses, which encode only an RdRP2; hepatitis delta virus and narnaviruses are closer to viroids and plasmids, respectively, than to fully fledged RNA viruses. RNA viruses encoding only a capsid and an RdRP are known to infect protists such as Giardia spp.3. Some RNA phages and fungal RNA viruses express only three proteins. However, most RNA viruses do not rely on such a scant protein repertoire. The proteomes of the majority of RNA viruses comprise up to 12 different proteins, with the genomes of some coronaviruses encoding nearly 30 proteins.

To infect, viruses must reach an appropriate intracellular environment and, if necessary, adapt this environment to their own requirements. Thus, viral infection requires not only replication but also interactions with host defences. To carry out these tasks, RNA viruses have only a few proteins at their disposal, with the available protein arsenal being limited by the genome size. As the proteomes of these viruses are strikingly small and specific viral functions generally require more than one protein, most proteins encoded by RNA viruses are multifunctional. At the same time, however, there is a certain division of labour between viral proteins. We consider some aspects of this problem using, as an example, the picornaviruses, a large family of small animal viruses with a medium-sized, single-stranded, positive-sense RNA genome (Box 1). Although they share a common genome organization, these viruses exhibit sufficient genetic variation to be separated into at least 12 distinct genera (Box 2). The family includes important human and animal pathogens that cause a range of disorders, including poliomyelitis, foot-and-mouth disease, the common cold, gastroenteritis, hepatitis, meningitis, myocarditis and uveitis. In addition to acute diseases, these viruses can also cause chronic persistent disorders.

Of the 12 'mature' (fully processed) picornaviral proteins (Box 1), the most ancient group includes the capsid proteins and a set of conserved proteins considered to be picornavirus signature proteins4, comprising the RdRP (3Dpol), the primer for RNA synthesis (VPg), a protease (3Cpro) and an ATPase (2CATPase). The capsid and signature proteins are indispensable for viral viability. Two less conserved proteins, 2B and 3A, are also essential for viability but are less directly involved in viral reproduction. The main functions of these two proteins are to target replicative proteins to the correct destinations and aid in the creation of a suitable replicative niche.

Finally, two other non-structural proteins that flank the capsid precursor in the polyprotein molecule, the leader (L) and 2A proteins, constitute a distinct group, although they have no common structural or biochemical features. They are the most variable among the picornavirus proteins, and some viruses even lack L altogether (Fig. 1; Tables 1, 2). This group also includes the so-called L* protein, which is encoded in a different reading frame of the RNA of certain cardioviruses. Functionally, these proteins (L, L* and 2A) have been characterized in some detail for only a few picornaviruses, and in these cases their major function has been shown to be counteracting host defences, thereby ensuring optimal conditions for viral reproduction. In this Review we argue that picornaviral L and 2A proteins that have yet to be characterized are likely to fulfil the same biological role.

Figure 1: Leader and 2A proteins of picornaviruses.

The organization of an 'idealized' picornaviral polyprotein is shown, with specific viral leader (L) and 2A proteins given below; protein sizes are not to scale but give approximate relative lengths. There is great variability in L and 2A proteins. Multiple L and 2A proteins are known for some viral genera. Several picornaviruses do not possess L but have large 2A proteins. Cosaviruses contain no L and only a very short 2A. Other viruses (for example, some sapeloviruses) possess an unusually long L (we propose that in this case it might correspond to at least two separate polypeptides; see Supplementary information S1 (figure)) and a very short 2A (if there is a 2A peptide at all). Remarkably, other sapeloviruses possess a long 2A and a short L. Notable differences in the organization of the L and 2A proteins can occur among representatives of the same genus, such as in cardioviruses and parechoviruses. Well-defined amino acid motifs are indicated. See main text for details about the H-NC and AIG1 domains. L*, alternative leader protein encoded by an alternative reading frame beginning in the L-encoding sequence; P↓, the NPG(P) motif, which interrupts translation at the proline residue; Pro, protease; VPg, primer for RNA synthesis; Zn, zinc finger.

Table 1 Properties of the leader proteins of picornaviruses
Table 2 Properties of the 2A proteins and peptides of picornaviruses

We have proposed that this group of proteins should be called security proteins5. They are sometimes also referred to as virulence factors6; such a designation, although it may be more familiar, is less preferable in our view, for the reasons explained below. Viruses have no special 'desire' to be virulent, that is, to harm, even less to kill, their hosts. The pathogenic properties of a virus are not a prerequisite for viral fitness. In fact, the most severe harm in a viral infection comes not from viral reproduction but from the (sometimes miscalculated) host defence response. Host-damaging or even suicidal defensive reactions include RNA degradation, inhibition of translation, and the induction of endoplasmic reticulum stress, apoptosis, autophagy and inflammation, and these reactions are aimed at limiting viral reproduction and spread. As a natural response, viruses have evolved tools directed not only at overcoming the specific innate and adaptive immune responses but also at inhibiting the general host metabolic functions on which these specific defences are based, that is, processes such as transcription and translation, cell signalling and intracellular trafficking. Inhibiting these processes is the major function of the security proteins. Accordingly, the capacity to withstand host defences, rather than virulence (the ability to harm the host) per se, is the property that is selected for during viral evolution. The long-term co-evolution of a virus and its host will probably result in their mutual adaptation accompanied by a decrease in viral virulence. Thus, in our opinion, the term 'security protein' is a better reflection of the evolutionary origin of the relevant proteins than the term 'virulence factor'. Moreover, the possession of efficient security proteins does not necessarily make a virus particularly virulent.

The aim of this Review is to consider the properties and biological significance of picornavirus security proteins and, on the basis of this knowledge, to put forward a general view of security proteins as dedicated counter-defensive proteins that evolved primarily to overcome various mechanisms of host resistance.

Properties of picornavirus security proteins

Both L and 2A are extremely heterogeneous with respect to size, sequence and biochemical properties (Fig. 1; Tables 1, 2). The length of L varies from 70 amino acid residues in cardioviruses to 450 residues in some sapeloviruses, and the variation is striking even in a single genus; for example, in sapeloviruses, the length varies from 80 residues (one L protein) to 450 residues (two L proteins). Although certain sapeloviruses seem to express two L proteins (Supplementary information S1 (figure)), approximately half the picornavirus genera lack L altogether. The length of 2A varies 30-fold (from 9 residues in senecaviruses to 300 residues in avihepatoviruses and some sapeloviruses), and it can comprise one to three separate polypeptides. The L proteins of cardioviruses and klasseviruses are strongly acidic and markedly basic, respectively, whereas the 2A proteins of these viruses show the reverse pattern (basic in cardioviruses and acidic in klasseviruses). The overall charge of the proteins will influence the choice of their interaction partners. The primary structures of L and 2A are, in most cases, totally unrelated when compared across different genera, and even intra-genus conservation among seemingly homologous proteins can be low, as is the case, for example, with the L proteins of kobuviruses and 2A proteins of cardioviruses and sapeloviruses.

With regard to their biochemical activities, only a few security proteins have been characterized in any detail. 2A of enteroviruses (2Apro) is a protease7 with a chymotrypsin-like fold in which the catalytic serine residue has been replaced by cysteine8. 2Apro performs an essential function in viral reproduction through its involvement in the so-called 'primary' co-translational cis-cleavage of the polyprotein between the amino-terminal amino acid of 2Apro and the carboxy-terminal residue of the capsid protein VP1. Aphthovirus L is also a protease (Lpro) with a papain-like fold9. Translation of aphthovirus RNA can start at two in-frame AUG codons, generating distinct L proteins, Lab (200 residues) and Lb (170 residues)10, with Lb predominating in vivo. Lpro cleaves the bond at the Lpro–VP4 boundary11,12. Erbovirus Lpro exhibits similar enzymatic activity13. Owing to the presence of a characteristic motif, the 2A proteins of some sapeloviruses have also been proposed to be proteases14.

2A of aphthoviruses is a short peptide that promotes co-translational interruption of polyprotein synthesis just downstream of its own coding sequence (and before the 2B moiety); cleavage at the aphthovirus VP1–2A boundary is accomplished by 3Cpro. Interruption of polyprotein synthesis (also known as ribosome skipping) is proposed to be caused by an interaction between the 2A peptide and the exit tunnel of the ribosome, preventing the interaction of the ribosome with Pro-tRNA (proline is the amino-terminal residue of aphthovirus 2B)15,16. To achieve this, a short peptide harbouring an NPG(P) motif (the parentheses mark the interruption site) is sufficient. Short 2A peptides from erboviruses17, teschoviruses18, senecaviruses19 and cosaviruses20 also possess NPG(P) motifs and seem to be involved in polyprotein processing at the 2A–2B border. Separation of these peptides from the VP1 capsid protein has been assumed but not yet clearly demonstrated. As these peptides possess no other known activities, their assignment as security proteins is not warranted; we refer to them here as 2Asp (2A short peptide).

The ribosome-skipping NPG(P) motif is also present in some 'composite' 2A proteins, such as those of Ljungan virus21 (a parechovirus), avihepatoviruses22 and seal picornavirus type 1 (Ref. 23), in which they seem to ensure self-separation from the downstream 2A moieties. In cardiovirus 2A, the NPG(P) motif is located at the C terminus and interrupts polyprotein synthesis at the 2A–2B boundary24, which corresponds to the polyprotein 'primary' cleavage site in this genus25.

Apart from the NPG(P)-containing peptides, 2A proteins of cardioviruses, avihepatoviruses, Ljungan virus and seal picornavirus type 1 contain distinct but poorly characterized sequences. The cardiovirus encephalomyocarditis virus (EMCV) encodes a 2A protein with RNA-binding affinity26. The bipartite 2A of Ljungan virus, the tripartite 2A of avihepatoviruses and the NPG(P)-lacking 2A proteins of kobuviruses, tremoviruses and some parechoviruses all have a 140–150-residue H-NC domain, which contains a histidine with a downstream asparagine-cysteine dipeptide and a putative transmembrane domain21,27. The observation that a similar H-NC domain is present in certain cellular tumour suppressors21 suggests that these viral proteins are also involved in the control of host activities. Between the NPG(P) and H-NC motifs, 2A of the avihepatovirus duck hepatitis A virus possesses an additional moiety that contains the so-called AIG1 domain22, which is also found in representatives of the Ras-like GTPase superfamily.

Cardiovirus L proteins28, which are devoid of any enzymatic activity and exhibit noticeable intra-genus variability (Table 1), contain a non-classical but functional zinc finger (Cys-His-Cys-Cys) motif29,30 and a downstream acidic motif31 that, in some of these L proteins, contains potential phosphorylation sites31,32. Certain strains of Theiler's murine encephalomyelitis virus (TMEV) are unique among picornaviruses in expressing a functional protein, L*, that is encoded in an alternative translation frame33,34 that starts within the L coding sequence, goes through the VP4 coding sequences and terminates in the VP2 coding sequence of the main reading frame. The 2A and L proteins of other picornaviruses neither contain easily identifiable amino acid motifs nor have known specific biochemical activities.
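The idea of a protein such as L* being encoded in an alternative reading frame that overlaps the main one can be illustrated with a minimal Python sketch. It translates a nucleotide sequence in the three forward frames and reports every open reading frame that runs from an AUG to a stop codon. The function name `find_orfs`, the toy sequence and the assumption of AUG initiation with the standard genetic code are ours, for illustration only; the sequence is not a real TMEV sequence.

```python
from typing import List, Tuple

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def find_orfs(sequence: str) -> List[Tuple[int, int, int, str]]:
    """Return (frame, start, end, protein) for each AUG-to-stop ORF.

    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    ORFs that run off the end of the sequence without a stop are ignored.
    """
    seq = sequence.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and CODON_TABLE[codon] == "*":
                protein = "".join(
                    CODON_TABLE[seq[j:j + 3]] for j in range(start, i, 3)
                )
                orfs.append((frame, start, i + 3, protein))
                start = None
    return orfs


# Hypothetical toy sequence: a main ORF in frame 0 and a short overlapping
# ORF in frame 2 that starts inside the main coding sequence.
TOY = "ATGGCATGCCTGAATGACTAA"
for orf in find_orfs(TOY):
    print(orf)
```

In this toy case the frame-2 ORF lies entirely within the frame-0 ORF, which mirrors, in a very simplified way, how an overlapping frame constrains mutations in both encoded products at once.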

Effects on general host metabolism

Security proteins contribute substantially to the shut-off of host macromolecular synthesis that occurs in response to infection with many picornaviruses. One could perhaps argue that such effects of security proteins contradict the key proposal of this Review that these proteins are dedicated to counter-defensive functions. However, as already mentioned, inflicting harm on their hosts does not bring viruses any benefits per se. The only reason (or at least, the main reason) why viruses evolved the ability to damage infected cells is their need to incapacitate the cellular defensive machinery. This machinery includes several specific mechanisms (such as innate and adaptive immunity), the implementation of which requires general cellular functions such as translation, transcription and controllable nucleocytoplasmic trafficking. Therefore, virus-induced impairment of these all-purpose metabolic functions can be regarded as a component of the viral counter-defensive strategy.

The effect of security proteins on cap-dependent translation of cellular mRNA is particularly important. 2Apro from diverse enteroviruses35,36,37,38 and Lpro from aphthoviruses12,39 cleave eukaryotic translation initiation factor eIF4G. Interestingly, erbovirus Lpro does not seem to cleave eIF4G and does not trigger translational shut-off13. Poly(A)-binding protein 1 (PABP1; also known as cytoplasmic PABP), a host protein that is also involved in translational control, is another target of enterovirus 2Apro (Refs 40, 41).

Security proteins can inhibit host translation by mechanisms other than proteolysis. Mutations of cardiovirus 2A (which is not a protease) alleviate virus-induced translational shut-off42,43. It is thought that association of cardiovirus 2A with ribosomes44,45, which seems to take place in the nucleolus46,47 and possibly occurs through the RNA-binding activity of 2A26, might contribute to preferential use of the internal ribosome entry site (IRES)-dependent viral templates. Cardiovirus L was also reported to mediate translational shut-off48. However, this effect could largely be due to L-mediated inhibition of nuclear export of mRNA rather than to inhibition of translation per se49,50. On the basis of ectopic expression experiments, hepatitis A virus 2A has also been implicated in inhibition of cap-dependent translation51, but the relevance of this observation is uncertain, as hepatitis A virus does not exert translational shut-off.

The effects of security proteins on cellular transcription are less well studied, although synthesis of the mRNA for cytokines and chemokines is inhibited in certain cases (see below). Individually expressed poliovirus 2Apro was reported to cleave the general transcription factors TATA-box-binding protein52 and cyclic AMP-responsive element-binding protein 1 (Ref. 38), but the biological significance of these effects is debatable52,53. Poliovirus 2Apro cleaves GEMIN3 (also known as DDX20), a protein that is involved in the formation of spliceosomes54. Cardiovirus 2A has also been implicated in virus-triggered inhibition of host transcription47, but this effect was not investigated further. Foot-and-mouth disease virus (FMDV) Lpro (Refs 55, 56) and human parechovirus 2A57 accumulate in the nucleus during the course of infection and as a result of ectopic expression, respectively, but their nuclear effects are unknown. Enterovirus 2Apro cleaves some cytoskeletal proteins, such as cytokeratin 8 (Ref. 58) and dystrophin, a protein that connects the cytoskeleton to the plasma membrane59.

The security proteins of several picornaviruses profoundly affect nucleocytoplasmic transport in infected cells. The targets of enterovirus 2Apro include nucleoporins, which are nuclear pore components that control nucleocytoplasmic exchange60,61. As a result, bidirectional passive diffusion through the nuclear envelope is facilitated. Another manifestation of the perturbation of intracellular trafficking is the suppression of nuclear export of mRNAs, ribosomal RNAs and U spliceosomal small nuclear RNAs62. Passive nucleocytoplasmic diffusion of proteins is also facilitated by the cardiovirus L protein63,64, and this effect seems to be caused by L-triggered phosphorylation of nucleoporins50,65,66.

Effects on innate immunity and viral pathogenicity

One major consequence of the effects of security proteins on host metabolism is the downregulation of innate immunity. FMDV Lpro suppresses interferon production56,67 and action68,69, largely through the inhibition of nuclear factor-κB-dependent transcription that is caused by the degradation of the p65 subunit of this transcription factor55,56. The transcription of genes encoding various cytokines and chemokines (including tumour necrosis factor (TNF; also known as TNFα), T cell-specific protein RANTES (also known as CCL5), myxovirus resistance protein 1 and interferon regulatory factor 7) is also suppressed by FMDV Lpro (Ref. 56).

Poliovirus reproduction, which is largely resistant to the effects of interferon, becomes interferon sensitive in 2Apro mutants. Introduction of the poliovirus 2Apro gene into interferon-sensitive EMCV facilitated its replication in cells pretreated with interferon70. The ability of rhinovirus 2Apro to cleave mitochondrial antiviral-signalling protein (MAVS; also known as VISA, CARDIF and IPS1), an intermediate in the interferon generation pathway, might contribute to the insensitivity of these viruses to interferons71. 2Apro also cleaves the catalytic subunit of DNA-dependent protein kinase, which, among other activities, is involved in the induction of pro-inflammatory cytokines72. Cardiovirus L also suppresses interferon production by affecting the activation of interferon regulatory factor 3, but the exact mechanism of this interference has not yet been identified32,50,73,74. TMEV L* was proposed to suppress the antiviral cytotoxic T cell response in TMEV-infected mice75,76.

In line with these observations, the functions of security proteins are less crucial for viral 'well-being' in hosts with innate immunity defects. Mutations in cardiovirus L48,77,78 or 2A79 or in FMDV Lpro (Ref. 56) decrease viral reproductive potential, but such mutations are less detrimental for viral growth in BHK-21 cells, which are deficient in interferon production, than in immune-competent cells. Similarly, cardiovirus mutants lacking L grow better in mice that have a deficient interferon system than in wild-type mice50,74. For the L*-expressing TMEV strains, functional L* is important for the ability of the virus to infect macrophages or microglia80,81.

One of the components of the innate immune response is apoptosis, which can potentially limit viral reproduction and spread, although certain viruses can subvert the apoptotic machinery to their benefit. Conflicting results have been reported on the relationship between the security proteins and the apoptotic machinery. Ectopic expression of enterovirus 2Apro triggers an apoptotic response38,82,83 and enhances the sensitivity of cells to the apoptogenic activity of tumour necrosis factor84. However, in the context of the whole genome, 2Apro possesses anti-apoptotic activity85,86. Similarly, expression of TMEV L in macrophage-like cells induces apoptosis, as occurs during TMEV infection of these cells87, whereas EMCV and mengovirus L are anti-apoptotic in the context of the whole virus (at least in HeLa cells, in which the virus itself elicits necrotic death)5. Whether this apparent discrepancy is a result of the use of different assays or hosts or of the intrinsic peculiarities of cardiovirus L proteins remains unknown. TMEV L* has also been reported to exhibit anti-apoptotic activity88.

Several illuminating examples strongly suggest that security proteins can markedly modulate viral pathogenicity. FMDV lacking L is highly attenuated89, and both L73,90 and L* (Refs 33, 81, 88) of TMEV strains BeAn 8386 and DA have been implicated in persistence in the central nervous system and in demyelinating disease. 2Apro of human coxsackievirus B4, which can cleave dystrophin in the cardiac muscle, seems to be involved in the pathogenesis of human acquired dilated cardiomyopathy59. Moreover, a major virulence determinant of swine vesicular disease virus, an enterovirus, has been mapped to 2Apro (Ref. 91).

Exchangeability and dispensability

Notwithstanding the low level of conservation of L proteins between EMCV and TMEV (Table 1), these proteins can be functionally exchanged with respect to their ability to inhibit interferon formation and to 'open' nuclear pores90. The replacement of L in the full-length cardiovirus genome with FMDV Lpro generates a virus that can overcome host defences more efficiently than its leaderless counterpart, although it has lower fitness92,93. Such interchangeability of structurally and biochemically distinct proteins attests to the similarity of their biological functions.

It is hardly by chance that viruses that encode long or multiple Ls tend to have a short 2A or lack 2A altogether (assuming that 2Asp is not a security protein), and vice versa (Fig. 1). For example, simian and porcine sapeloviruses harbour a long 2A and a short L, whereas avian sapelovirus encodes the longest L identified to date and the predicted 2A ORF is very short, if it encodes a protein at all. This tendency is consistent with the notion that L and 2A have similar roles in virus–host interactions. As already mentioned, some picornaviruses, for example, cosaviruses, encode no security proteins. Even viruses that do encode security proteins can, under certain circumstances, survive and replicate after these proteins have been inactivated or eliminated. Notably, cardiovirus L31,48,77,78 and L* (Refs 34, 81, 94) and FMDV Lpro (Ref. 95) are not essential for viral viability, and extended deletions in 2A of cardioviruses42,79 and hepatitis A virus96,97 do not kill these viruses. Furthermore, although some data suggest that poliovirus 2Apro has an essential replicative function98, recent experiments have demonstrated its dispensability for viral viability86.

Division of labour in picornavirus proteins

Although this Review focuses on the counter-defensive functions of security proteins, it should be kept in mind that other picornavirus proteins are also often engaged in similar functions. The targets of 3Cpro (or its proteolytically active precursors) might include proteins that are involved in innate immunity99,100,101,102. 2B and 3A are also important players in the virus–host struggle, being involved in the rearrangement of cytoplasmic membranes and in suppression of trafficking, secretion and antigen presentation on cellular plasma membranes84,103,104,105. 2C can also participate in some of these activities103. Even specialized proteins such as VPg106 and capsid proteins107 can sometimes assist in overcoming host defences. In all these cases, however, the 'security' functions are neither the main nor the conserved roles of these proteins.

Conversely, as well as representing a dedicated counter-defensive system, security proteins can be directly involved in viral reproduction. The products of the cleavage of eIF4G, and possibly of some other host proteins, by FMDV Lpro and enterovirus 2Apro can stimulate IRES-dependent cell-free translation108,109; the effect of 2Apro is partly the result of stabilization of the viral RNA110. The physiological relevance of these phenomena is unclear, however. The poliovirus IRES dysfunction that is caused by some mutations can be compensated for by mutations in 2A in a host-dependent manner111, suggesting the participation of host proteins. Cardiovirus L was also implicated in control of IRES activity, because EMCV RNA with deletions in the L-coding sequence exhibited a decrease in cell-free translatability31.

Poliovirus 2Apro was reported to stimulate initiation of negative-strand RNA synthesis, and this was seemingly independent of its effects on RNA stability and translation110. Deletion of a carboxy-terminal sequence from this protein does not substantially affect its protease activity but does inhibit RNA replication112. Deletion of the amino-terminal region notably, but incompletely, suppresses replication of the relevant replicon113. Similarly, deletions in 2A of Aichi virus (a kobuvirus) inhibit viral RNA replication114. However, the mechanisms responsible for these effects have not been elucidated, so it remains unclear whether these 2A proteins affect viral replication directly or by modulating host cell activities.

A role for hepatitis A virus 2A in virion assembly and maturation has been established97,115. The primary co-translational cleavage of the viral polyprotein takes place between 2A and 2B and is accomplished by the viral proteinase 3Cpro (or its precursor), leaving the 2A sequence fused to the carboxyl terminus of capsid protein VP1 (Refs 116, 117). In immature virions, VP1 retains this 2A extension, which is eventually cleaved off by an unidentified enzyme118. The involvement of 2A in the maturation of some other picornaviruses therefore cannot be excluded. Notably, the primary co-translational scission of the cardiovirus polyprotein generates a fusion between 2A and the capsid protein precursor25.

Origins and evolution of security proteins

Obviously, it is not possible to construct a genealogical tree of either L or 2A, because they are unrelated. These proteins are not considered at all in the hypothesis on the origin of picorna-like viruses4, and only the short, translation-interrupting 2A peptides (that is, the 2Asp peptides) are briefly mentioned in the proposal on picornavirus classification119. There is no correlation between the nature (or even the presence) of security proteins and either the type of IRES that lies upstream of the L protein or the RdRP lineage (which is generally accepted as the most reliable indicator of viral relatedness) (Fig. 2).

Figure 2: Relationships between the presence of distinct security proteins and other evolutionary hallmarks of picornaviruses.

The distribution of security proteins among different viruses is not congruent with either the type of internal ribosome entry site (IRES; the key cis-acting element responsible for cap-independent translation of picornavirus RNAs) or the topology of the RNA-dependent RNA polymerase (RdRP) tree. For example, viruses harbouring type II IRESs can possess different L proteins (aphthoviruses and cardioviruses) or be devoid of this protein (cosaviruses). A similar situation occurs with viruses that use type IV IRESs. Conversely, different kobuviruses can possess unrelated IRESs. 2A proteins with the H-NC motif are present in viruses of distant RdRP lineages (kobuviruses on the one hand and avihepatoviruses, parechoviruses and tremoviruses on the other), but this motif is not shared by more closely related viruses (for example, it is present in tremoviruses but absent in hepatoviruses). The same pattern is characteristic of the 2A proteins that contain the NPG(P) motif (which interrupts translation at the indicated proline residue; shown by P↓). The closely related seal picornavirus type 1 (SePV-1), duck hepatitis A virus (DHV), Ljungan virus (LV) and human parechovirus type 2 (HPeV-2) each harbour a distinct 2A protein. Well-defined amino acid motifs are indicated. Viral RdRP protein sequences were taken from GenBank. Multiple alignments of protein sequences were constructed using CLUSTAL-X2. The RdRP tree was constructed by using MrBayes with default parameters. Solenopsis invicta virus 2 (SolV-2) was used as the outgroup. AEV, avian encephalomyelitis virus; AiV, Aichi virus; ASV, avian sapelovirus; EMCV, encephalomyocarditis virus; ERBV, equine rhinitis B virus; FMDV, foot-and-mouth disease virus; HAV, hepatitis A virus; HCoSV-A, human cosavirus A; HPV-1, human poliovirus type 1 str. Mahoney; HRV, human rhinovirus A101; PKV, porcine kobuvirus; Pro, protease; PSV, porcine sapelovirus; PTV, porcine teschovirus; SalV, Salivirus NG-J1; SSV, simian sapelovirus; SVV, Seneca Valley virus; TMEV, Theiler's murine encephalomyelitis virus; Zn, zinc finger.

The diversity of the security proteins and their absence from many picornaviruses suggest that they are independent and late evolutionary acquisitions5,120. It can be speculated that the most ancient of the 2A molecules are the 2Asp peptides, as hinted by their presence in most picornavirus genera, including those belonging to different lineages (Fig. 2). Interruption of polyprotein synthesis after translation of the capsid proteins might be advantageous for viral reproduction. It might, for example, facilitate proper protein folding, ensure optimal kinetics of protein synthesis or control the ratios of structural to non-structural proteins, which could theoretically be achieved by incomplete translational re-initiation at the second proline residue of the NPG(P) motif. It is worth noting that there is a difference in the translation factor requirements for translation of the EMCV polyprotein upstream and downstream of the 2A–2B boundary121.

Strikingly, 2Asp peptides — or more accurately, the DXEXNPG(P) motifs (where X is any amino acid) that are characteristic of these peptides — are found in some other picorna-like and unrelated viruses, where they seem to serve the same function122,123,124. Assuming that the acquisition of NPG(P) was an early event, this motif might have been lost by some parechoviruses (and other picornaviruses) at a certain step of evolution21. In contrast to picornaviruses, the acquisition of 2Asp by members of some other families of RNA viruses has been proposed to have occurred at a late stage122. The idea that these peptides might have originated in picornaviruses is an attractive hypothesis. The NPG(P) motif can also be found in several cellular proteins, but the current data do not allow researchers to determine whether the recoding ability of this motif was a viral or cellular invention.
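As a rough illustration of how the DXEXNPG(P) motif described above can be located in a polyprotein sequence, the following Python sketch scans a one-letter amino acid string with a regular expression and splits it at each predicted interruption site, that is, between the glycine and the final proline. The function names and the toy sequence are hypothetical and are not taken from any of the cited studies. The sketch also assumes that every occurrence of the motif interrupts translation, which is a simplification.

```python
import re
from typing import List

# D-x-E-x-N-P-G followed by P, where x is any residue. The lookahead keeps
# the final proline out of the match, so match.end() marks the break point.
SKIP_MOTIF = re.compile(r"D.E.NPG(?=P)")


def find_skip_sites(protein: str) -> List[int]:
    """Return the 0-based index of the proline after each DxExNPG match."""
    return [m.end() for m in SKIP_MOTIF.finditer(protein.upper())]


def split_at_skip_sites(protein: str) -> List[str]:
    """Split a sequence into the products expected if every site is used."""
    bounds = [0] + find_skip_sites(protein) + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy polyprotein fragment containing one DxExNPG(P) motif.
TOY = "MKTAYDVESNPGPLLQR"
print(find_skip_sites(TOY))
print(split_at_skip_sites(TOY))
```

With the toy fragment, the upstream product ends in ...NPG and the downstream product begins with proline, matching the arrangement described for the aphthovirus 2A–2B boundary.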

The non-NPG(P)-containing regions of the 2A proteins are unrelated acquisitions. Cardiovirus 2A possesses these additional moieties in its amino-terminal region, whereas other viruses of this subset (parechoviruses, avihepatoviruses and seal picornavirus type 1) contain these moieties in their carboxy-terminal regions (Fig. 1). This may suggest that these moieties were acquired after the NPG(P) motif. In avihepatoviruses and some parechoviruses, these newly acquired sequences harbour the H-NC motif, which is also present in the NPG(P)-lacking 2A proteins of other parechoviruses, kobuviruses and tremoviruses (that is, viruses of different RdRP lineages) (Fig. 2). A plausible hypothesis is that H-NC-containing proteins from cellular organisms were hijacked by certain picornaviruses on several independent occasions, but the possibility that 2Asp peptides were formed by deletions from larger 2A proteins120 cannot be ruled out.

Taking into account the chymotrypsin-like fold that is found in enterovirus 2Apro, it is reasonable to assume that this protein was derived from picornavirus 3Cpro or a cellular protease125. A similar assumption could perhaps be made for the putative sapelovirus 2A protease. There are no obvious clues to the possible origins of the 2A proteins of hepatoviruses and klasseviruses or of the non-NPG(P) moieties of the 2A proteins of cardioviruses and seal picornavirus type 1.

With regard to the picornavirus L proteins, one hypothesis is that the papain-like Lpro proteases of aphthoviruses and erboviruses originated from cellular enzymes. No obvious relatives of other L proteins can currently be identified among cellular or viral proteins. Only the origin of cardiovirus L* seems clear: the first L* probably came into being accidentally, by translation of an alternative reading frame, and was then shaped by mutations preserving the functional integrity of L, VP4 and VP2. It is worth noting that recently identified human TMEV-like cardioviruses lack this alternative reading frame126.

Theoretically, several mechanisms could underlie the acquisition of security proteins. For example, one possibility is that viral genes were duplicated and subsequently substantially modified (such a scenario can be imagined for the origin of enterovirus 2Apro (Ref. 125)). Other scenarios are: recombination with viral or cellular RNAs encoding related proteins such as proteases; conversion of non-coding RNA sequences into coding sequences (which could occur through different mechanisms, such as interspecies recombination127 (V.I.A., A.P.G., E. V. Khitrina and W. J. Melchers, unpublished observations), or introduction or activation of an upstream in-frame AUG codon); frame shifting120 (or double frame shifting, in the case of 2A proteins); and using an alternative reading frame. Unfortunately, we can only pinpoint a definite mechanism for the case of L*, for which the last mechanism has obviously been operative.

The acquisition of security proteins, by whatever mechanism, required that the new 'additions' did not interfere with the function of the adjacent viral proteins: VP4 (or VP0) in the case of L, and VP1 and 2B in the case of 2A. The new proteins should either be separated from these neighbours (by self-proteolysis, the action of other viral proteases or translation interruption) or, if they remain fused, should not impair the functions of these neighbours (as is the case with the hepatitis A virus VP1–2A fusion).

A separate issue is the evolution of the security proteins themselves. It was proposed that the L proteins of TMEV and EMCV diverged during evolution to adapt to the different replication fitnesses of these viruses90. The data are too scarce, however, for any generalizations at this point.

Conclusions

L and 2A constitute a distinct and remarkable set of picornavirus proteins. They exhibit striking structural and biochemical diversity but (with the exception of the 2Asp peptides) accomplish similar biological functions by counteracting host defensive reactions. However, their abilities to solve similar problems might involve fundamentally different molecular mechanisms. This is illustrated in Fig. 3, which summarizes the properties of the three best studied security proteins. The biological functions of enterovirus 2Apro and cardiovirus L are strikingly similar: inhibition of host macromolecular synthesis, permeabilization of the nuclear envelope and inhibition of active nucleocytoplasmic transport, suppression of specific innate immunity mechanisms and control of the apoptotic machinery of the host cells. No less striking is the difference in the mechanisms by which the two proteins achieve these goals. By contrast, aphthovirus Lpro can solve only a subset of these problems. An intriguing question is how, and whether, the diverse security functions exhibited by a given protein (Fig. 3) are related to each other. The fact that mutations inactivating one function usually also impair other functions suggests the existence of a common upstream target.

Figure 3: Major biological functions of the best studied but unrelated security proteins.

There is a striking similarity between the functional activities of the enterovirus 2A protease (2Apro) and the cardiovirus leader protein (L), but the underlying mechanisms by which these functions are carried out are fundamentally different. By contrast, the known functional activities of aphthovirus L, which is a protease (Lpro), seem to be more limited. IRF3, interferon regulatory factor 3; NF-κB, nuclear factor-κB.

Most of the picornavirus L and 2A proteins still await researchers' attention. Nevertheless, the data discussed here allow us to provisionally assign security functions even to those L and 2A proteins that have not yet been characterized. Indeed, L and 2A definitely do not belong to the set of essential reproductive proteins that ensure translation and replication of the viral genome (although 2A of hepatitis A virus assists virion maturation).

We propose that the concept of security proteins is of general relevance and can be applied to viruses other than picornaviruses. The hallmarks of these proteins are as follows: structural and biochemical unrelatedness or even absence in related viruses; the dispensability of the entire protein or its functional domains for viral viability; and, for mutated versions of the proteins, fewer detrimental effects on viral reproduction in immune-compromised hosts than in immune-competent hosts. Possessing one of these features would make a viral protein a good candidate security protein, whereas a combination of these features would probably confirm this designation.

Viruses with large DNA genomes possess impressive arsenals of security proteins. The complement of security proteins is much more limited in RNA viruses but is sufficient for their evolutionary success. There are also tentative examples of security proteins in non-picornavirus RNA viruses. Coronaviruses have several so-called accessory proteins, which are neither conserved nor essential and exhibit the capacity to suppress host innate immunity by a range of mechanisms128. Some, but not all, flaviviruses use ribosome frame shifting to express a non-essential non-structural protein, NS1′, mutational inactivation of which results in viral attenuation129. The NSs protein of the Rift Valley fever phlebovirus suppresses the interferon system, and its inactivation does not kill the virus but attenuates its pathogenicity130. This list can readily be extended.

The spectrum of picornavirus-induced diseases is extraordinarily broad. The reasons for this variability are poorly understood, although receptor compatibility and effects on viral protein and RNA synthesis that are caused by differences in the availability of host factors are likely to be important contributors. However, the interaction between host defences and viral counter-defence is certainly one of the key factors underlying the pathogenicity of picornaviruses and other viruses. This interaction cannot be fully understood without elucidation of the roles of the security proteins. Treatment and prevention of viral diseases may also markedly benefit from such elucidation. The study of security proteins is therefore an underdeveloped but highly promising research area.