Main

The induction of type I interferon (IFN) is a hallmark of immune sensing of nucleic acids by the innate immune system. Type I IFN was identified on the basis of its functional ability to restrict viral replication. Soon after its discovery, specific types of nucleic acids were found to potently induce this key antiviral cytokine and thus mimic the presence of virus1. The identification of the corresponding immune sensing receptors was a prerequisite to any real understanding of how self and non-self nucleic acids can be distinguished.

The basic molecular structure of DNA and RNA is universal throughout biology. Thus, the recognition of foreign nucleic acids among an abundance of self nucleic acids in an organism is a substantial biochemical challenge. There is now a broad consensus within the field that most of the receptors involved in the immune sensing of nucleic acids have been identified. Given the substantial progress over the last few years, it seems timely to review the current literature and to convey an updated concept of nucleic acid sensing. First, we provide a brief overview of the general principles of nucleic acid sensing. Then, the current literature on RNA and DNA sensing is reviewed with a focus on self versus non-self recognition. We conclude with open questions, future perspectives and implications for human disease. For more detailed information about the molecular structure of the receptors, their specific immune functions and therapeutic development, we refer to other in-depth reviews2,3,4,5,6,7,8,9,10.

Principles of nucleic acid sensing

Most non-vertebrates, such as insects, plants and nematodes, rely mainly on RNA interference (RNAi) for antiviral defence. However, in vertebrates, although there are indications that RNAi is involved in the degradation of genomic RNA of some viruses in some cell types (reviewed in Ref. 11), other reports suggest that there is competition rather than cooperation between small interfering RNA (siRNA)-mediated RNAi and type I IFN-dominated antiviral pathways12 (see below). Overall, RNAi does not seem to have an essential function in vertebrate antiviral defence. This raises the question: if not the sequence, which biochemical and cell biological features of nucleic acids are used for the reliable discrimination of foreign from self?

Availability, localization and structure. In vertebrates, the principle of self versus non-self nucleic acid sensing is based on three central criteria (Fig. 1): first, the availability of nucleic acid ligands as determined by their local concentration, by the rate of degradation by endogenous nucleases and by the level of shielding (for example, by nucleocapsid proteins); second, the localization of nucleic acid ligands, such as outside the cell membrane, in the endolysosomal compartment or in the cytosol; and third, the structure of nucleic acid ligands as characterized by sequence motifs, conformation (for example, base-paired versus unpaired) and chemical modification. In many cases, a combination of all three aspects contributes to the reliable recognition of potentially dangerous foreign nucleic acids leading to the induction of appropriate innate immune responses.

Figure 1: Principles of self versus non-self recognition of nucleic acids.

The detection of foreign nucleic acids is based on their availability, localization and structure. Nucleases rapidly degrade most self nucleic acids before they can be sensed by nucleic acid receptors, and the localization of nucleic acids determines if the potentially immunoactive nucleic acids are accessible for their detectors. The nucleic acid receptors can be divided into two main categories: immune sensing receptors, which include Toll-like receptor 3 (TLR3), TLR7, TLR8, TLR9, retinoic acid inducible gene I (RIG-I), melanoma differentiation associated gene 5 (MDA5), absent in melanoma 2 (AIM2) and cyclic GMP–AMP synthetase (cGAS); and nucleic acid receptors with direct antiviral activity, including double-stranded RNA (dsRNA)-activated protein kinase R (PKR), IFN-induced protein with tetratricopeptide repeats 1 (IFIT1), 2′-5′-oligoadenylate synthetase 1 (OAS1) and ribonuclease L (RNase L), and adenosine deaminase acting on RNA 1 (ADAR1). Immune sensing receptors can detect structural features such as dsRNA or 5′-triphosphate RNA, which indicate non-self, and indirectly or directly induce transcription factors that upregulate the expression of antiviral effector proteins, chemokines and cytokines, including type I interferon (IFN), to promote an antiviral immune response. In addition, these immune sensing receptors induce the expression of nucleic acid receptors with direct antiviral activity through the induction of IFN-stimulated genes. Those effector proteins primarily do not induce transcription factors but act directly on the target RNA. Of note, in addition to its primary role as a sensor, RIG-I has direct antiviral activity. ssRNA, single-stranded RNA.

Two categories of nucleic acid receptors. The first category of nucleic acid receptors comprises pattern recognition receptors (PRRs) of the Toll-like receptor (TLR) family (specifically, TLR3, TLR7, TLR8 and TLR9), the RIG-I-like receptor (RLR) family of RNA sensors (such as retinoic acid inducible gene I (RIG-I; also known as DDX58) and melanoma differentiation associated gene 5 (MDA5; also known as IFIH1)), and the DNA sensors absent in melanoma 2 (AIM2) and cyclic GMP–AMP synthetase (cGAS). These PRRs directly or indirectly induce transcription factors, including nuclear factor-κB (NF-κB) and IFN-regulatory factor 3 (IRF3), that upregulate the expression of antiviral effector proteins, chemokines and cytokines (Fig. 1). This antiviral response is dominated by type I IFN and IFN-stimulated genes (ISGs).

Although the expression of some PRRs (for example, AIM2, RIG-I and MDA5) is enhanced by type I IFN in a positive feedback loop, the initial expression of most of the second category of receptors requires type I IFN or PRR signalling. Second category receptors comprise nucleic acid receptors with direct antiviral activity, for example, double-stranded RNA (dsRNA)-activated protein kinase R (PKR; also known as eIF2AK2), 2′-5′-oligoadenylate synthetase 1 (OAS1) and adenosine deaminase acting on RNA 1 (ADAR1). Like PRRs, they recognize the biochemical features of foreign nucleic acids. However, unlike PRRs, their major function is not to induce immune responses via transcription factors and cytokines but rather to directly act on viral RNA (for example, by inhibiting translation or by chemical modification or degradation of their target RNA). Recent findings have shown that these categories can overlap: as detailed below, RIG-I can also act as an effector molecule on viral RNA13,14, translational inhibition by PKR was reported to enhance NF-κB signalling15, and RNA modifications or cleavage can inactivate or generate target RNA structures that are recognized by PRRs.

RNA sensing

RNA-sensing receptors. The transmembrane TLRs TLR3, TLR7, and TLR8 recognize RNA in the endosome16,17,18,19 and, in the case of TLR3, at the surface of some cell types, such as fibroblasts and tumour cell lines20 (Fig. 2). In the cytosol, the RLRs RIG-I and MDA5, which are DExD/H box RNA helicases with caspase recruitment domains (CARDs)21, signal via mitochondrial antiviral signalling protein (MAVS; also known as CARDIF, IPS1 and VISA) and IRF3–IRF7 to induce type I IFNs following RNA recognition4 (Fig. 2). In addition to its signalling function, RIG-I can act as a direct antiviral effector protein that binds viral genomic RNA and interferes with viral polymerases13,14. A third cytosolic RIG-I-like helicase is LGP2 (also known as DHX58), which lacks a CARD and appears to contribute to the fine tuning of immune responses by inhibiting RIG-I and supporting MDA5 signalling22. Although LGP2 can recognize RNA, it is unclear whether this is relevant for its function23.

Figure 2: RNA-sensing receptors.

Double-stranded RNA (dsRNA) in the endosome is detected by Toll-like receptor 3 (TLR3) and TLR7, whereas single-stranded RNA (ssRNA) is sensed by TLR7 and TLR8. These receptors signal via myeloid differentiation primary response protein 88 (MYD88) and TIR domain-containing adaptor protein inducing IFNβ (TRIF) to induce interferon (IFN)-regulatory factor 3 (IRF3)–IRF7 and type I IFN production and via nuclear factor-κB (NF-κB) to induce pro-interleukin-1β (pro-IL-1β) and the inflammasome component NLRP3. The helicases retinoic acid inducible gene I (RIG-I) and melanoma differentiation associated gene 5 (MDA5) detect dsRNA in the cytosol and signal through mitochondrial antiviral signalling protein (MAVS) to induce type I IFN production and pro-apoptotic signalling via IRF3–IRF7, and to activate the NOD-, LRR- and pyrin domain-containing 3 (NLRP3) inflammasome resulting in IL-1β production. The RNA receptor 2′-5′-oligoadenylate synthetase 1 (OAS1) has direct antiviral activity and degrades RNA via the generation of 2′-5′-oligoadenylates that activate ribonuclease L (RNase L). RNase L generates 5′ OH- and 3′ phosphate-containing RNA fragments that can stimulate RIG-I. The crucial RIG-I-stimulating structure has not been determined to date. The RNA receptors IFN-induced protein with tetratricopeptide repeats 1 (IFIT1) and double-stranded RNA (dsRNA)-activated protein kinase R (PKR) inhibit cap-dependent translation. BAX, BCL-2-associated X protein; BCL-2, B cell lymphoma 2; BIRC3, baculoviral IAP repeat-containing protein 3; IFNAR, interferon-α/β receptor; PRKCE, protein kinase Cε; PUMA, p53-upregulated modulator of apoptosis; TRAIL, TNF-related apoptosis-inducing ligand.

In addition to IFN production, MAVS activation promotes apoptosis through several pathways: recruitment and activation of caspase 8 (Ref. 24); IRF3–IRF7-dependent upregulation of TRAIL and the pro-apoptotic proteins p53-upregulated modulator of apoptosis (PUMA; also known as BBC3) and NOXA (also known as PMAIP1); downregulation of the anti-apoptotic genes BCL2, BIRC3 and PRKCE; and direct activation of the pro-apoptotic BCL-2-associated X protein (BAX)25,26,27,28 (Fig. 2). Furthermore, MAVS was implicated in the activation of the NOD-, LRR- and pyrin domain-containing 3 (NLRP3) inflammasome, thereby contributing to the induction of an interleukin-1β (IL-1β)-dependent pro-inflammatory cytokine response29,30,31,32,33 (Fig. 2).

RNA-sensing receptors with direct antiviral activity. In addition to immune receptors that induce antiviral immune responses, there are several receptors that, upon RNA recognition, directly inhibit viral replication and propagation. Since viral replication crucially relies on the host's translation machinery, most antiviral mechanisms specifically or nonspecifically target the mRNA translation process (Fig. 2).

One such receptor is PKR, which inhibits viral and host cap-dependent translation by phosphorylating eukaryotic translation initiation factor 2A (eIF2A)34. Further antiviral functions of PKR have been postulated but are controversial. IFN-induced protein with tetratricopeptide repeats 1 (IFIT1) is an antiviral effector protein35,36 that binds to the 5′ cap of mRNA lacking a 2′-O-methyl modification at the N1 position and blocks its translation37,38,39. IFIT1 has been proposed to compete with eIF4E (a 5′ cap binding and translation factor)38, which is supported by the observation that IFIT1 interferes with the formation of the 48S translation initiation complex39. To date, a crucial role for IFIT1 in the antiviral immune response was reported for negative sense single-stranded RNA ((−)ssRNA) viruses (vesicular stomatitis virus and influenza virus) and positive sense ssRNA ((+)ssRNA) viruses (West Nile virus and mouse hepatitis virus) but not picornaviruses which do not have a 5′ cap35,36,38.

RNA recognition by OAS1 provides another cell autonomous antiviral mechanism. Upon binding to double-stranded RNA (dsRNA), OAS1 synthesizes 2′-5′ oligomers of adenosine (2′-5′-oligoadenylates) as a second messenger40,41. 2′-5′-oligoadenylates in turn activate ribonuclease L (RNase L)42, which degrades viral and cellular RNA molecules (Fig. 2). The antiviral molecule ADAR1 catalyses the C6 deamination of adenosine in base-paired RNA, which results in A-to-I conversions. A-to-I conversions change the coding of RNA, as inosine (I) is read as guanosine (G) instead of adenosine (A) by ribosomes, leading to the translation of altered, potentially non-functional, proteins43. Furthermore, conversion of A-U to I-U destabilizes potential secondary structures important for viral RNA regulation43. However, as detailed below, the destabilization of secondary structures can also impair recognition by PRRs, thereby supporting immune escape of the virus. In conclusion, the role of ADAR1 in the context of viral infections remains unclear and might depend on the type of virus44. Notably, the expression and functions of PKR, IFIT1, OAS1 and ADAR1 are all downstream of the type I IFN-inducing nucleic acid immune receptors (such as RLRs).
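The recoding effect of A-to-I editing described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and simply treats inosine as guanosine during decoding; the function names `edit_a_to_i` and `translate` and the nine-nucleotide example RNA are hypothetical and chosen only for illustration, not taken from the cited studies.

```python
# Standard genetic code, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def edit_a_to_i(rna, positions):
    """Return a copy of `rna` with adenosine replaced by inosine ('I')
    at the given 0-based positions (hypothetical helper)."""
    chars = list(rna)
    for pos in positions:
        if chars[pos] != "A":
            raise ValueError(f"position {pos} is not an adenosine")
        chars[pos] = "I"
    return "".join(chars)


def translate(rna):
    """Translate an RNA string, reading inosine as guanosine."""
    read = rna.replace("I", "G")
    usable = len(read) - len(read) % 3  # ignore an incomplete final codon
    return "".join(CODON_TABLE[read[i:i + 3]] for i in range(0, usable, 3))


unedited = "AUGAAAUAA"                # Met-Lys-stop
edited = edit_a_to_i(unedited, [4])   # AAA becomes AIA, read as AGA
print(translate(unedited), translate(edited))
```

In this toy example a single edit in the second codon changes lysine (AAA) to arginine (AIA, read as AGA), which shows how even one conversion can alter the encoded protein.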

RNA recognition motifs

In the following section, we discuss structural motifs of RNA that are recognized by sensing receptors, such as secondary structure, RNA modification and RNA sequence (Fig. 3). Interestingly, the same structural feature or modification is often recognized by evolutionarily unrelated receptors, pointing to a convergent evolution of PRRs.

Figure 3: Immune sensing of double-stranded RNA.

Double-stranded RNA (dsRNA), which is the prototypic non-self nucleic acid stimulus, is detected by three signalling receptors: Toll-like receptor 3 (TLR3), which is located on the cell membrane and in the endosomal membrane, and retinoic acid inducible gene I (RIG-I) and melanoma differentiation associated gene 5 (MDA5), which are located in the cytosol. Long forms of dsRNA are recognized independently of the structure at the ends (TLR3 recognizes dsRNA >35 bp, and MDA5 recognizes dsRNA >300 bp). A short stretch of dsRNA (>19 bp) is sufficient for recognition by RIG-I if a triphosphate or a diphosphate is present at the 5′ end, and if the end is blunt with no overhangs. A 2′-O-methyl group at the first nucleotide (N1) of the 5′ end is part of the cap 1 structure of self RNA that labels it as self, and thus prevents recognition by RIG-I.

RNA secondary structures recognized as non-self RNA. Long dsRNA in the cytosol is a hallmark of DNA and RNA virus replication and is absent from an uninfected host cell45. Polyinosinic–polycytidylic acid (poly(I:C)), which mimics dsRNA, binds to and triggers the activation of the RNA sensors PKR, OAS1, TLR3, MDA5 and RIG-I, and thus poly(I:C) or its derivatives have been used as a tool to identify and characterize dsRNA recognition receptors and ligand requirements16,21,46,47,48,49. TLR3-mediated NF-κB activation requires dsRNA of 35–39 bp in length50 (Table 1). Of note, mouse TLR3 responds to shorter dsRNA, including siRNAs that are not recognized by human TLR3 (Refs 51,52). This observation calls into question the use of mouse models for testing siRNA-based therapeutic approaches. In addition to long dsRNA, ssRNA segments harbouring stem structures with bulge loops have been proposed as possible ligand structures that are recognized by TLR3 (Ref. 53).

Table 1 RNA ligand structures that indicate non-self

The cytosolic helicase MDA5 is potently activated by very long dsRNA (>300 bp), for example poly(I:C), but the minimal length requirement of a biologically relevant dsRNA remains unclear46,47,48. The current model for the recognition of continuous dsRNA by MDA5 is that MDA5 uses the long dsRNA as a signalling platform to cooperatively assemble a filament of MDA5 molecules in a head-to-tail arrangement with exposed CARDs along the long dsRNA; the exposed CARDs in turn recruit and activate the downstream adaptor MAVS54. It has recently been shown that the RNA-modifying activity of ADAR1 (that is, the conversion of A to I) is required to prevent endogenous dsRNA from activating MDA5 (Refs 55,56); given that there is no evidence for endogenous dsRNA of that length, these data suggest that MDA5 senses ligand motifs in addition to very long dsRNA. Several studies demonstrate that MDA5 activation coincides with the formation of dsRNA in the course of viral replication23. Recent data suggest that MDA5 contributes to the detection of (−)ssRNA viruses despite the absence of long dsRNA45. MDA5 was found to bind to and be activated by the mRNA of paramyxovirus (a (−)ssRNA virus)57,58, with paramyxovirus V protein binding to and inhibiting MDA5 but not RIG-I59. Furthermore, an in vivo short hairpin RNA screen identified MDA5 as one of the key antiviral proteins involved in the antiviral defence response against influenza A virus60. These data indicate that (−)ssRNA viruses, which do not produce dsRNA, are sensed by MDA5 and can actively interfere with MDA5 function. It is therefore tempting to speculate that, in addition to long dsRNA, ssRNA with short base-paired secondary structures is recognized by MDA5.

'Short' fragments of poly(I:C) (<300 bp) and long random dsRNA molecules (>200 bp) have both been reported to activate RIG-I in a manner independent of 5′-modifications48,61. However, these findings are controversial as the recognition by RIG-I of long genomic dsRNA of reovirus was found to depend on the presence of 5′-diphosphates49 (see below).

An examination of poly(I:C) molecules of different molecular size revealed that the minimum length of dsRNA for the activation of PKR and OAS1 was above 30 bp62. For PKR, this was confirmed with dsRNA fragments of heterogeneous sequence and defined length that were generated in vitro by phage RNA polymerases63. Owing to A-to-I conversions, ADAR1 activity can also impair dsRNA recognition by PKR and OAS1 (Refs 55,56), indicating a broad immunomodulatory role of ADAR1 as an inhibitor of the dsRNA sensors MDA5, PKR and OAS1 and explaining its role in preventing autoinflammatory diseases (reviewed in Ref. 2).

Altogether, dsRNA, which is a hallmark of viral replication, and long RNA with base-paired structures appear to be strong recognition motifs for several evolutionarily unrelated PRRs.

Sequence-dependent recognition of foreign RNA. Some RNA-sensing receptors have sequence preferences (Table 1). TLR7 and TLR8 are preferentially activated by poly-uridine (polyU) and by guanosine and uridine rich (GU-rich) sequences18,19,64. Since polyU or GU-rich sequences are not more frequent in microbial RNA compared to vertebrate RNA, it is currently unclear how the preference for these motifs may help distinguish self from non-self RNA. Furthermore, although TLR8 exclusively detects ssRNA, TLR7 primarily detects dsRNA but can also accommodate certain ssRNA oligonucleotides65. However, the biological rationale behind these sequence preferences is still unknown. Of note, in a recent study a binding site for a single ssRNA U nucleotide and another binding site for short RNA degradation products were found in a TLR8–ligand co-crystal, suggesting that TLR8 is activated by degradation products of ssRNA66. Similarly, TLR7 was proposed to sense G derivatives of ssRNA67.

Studies have shown that RIG-I and MDA5 have a preference for AU-rich sequences in long viral transcripts58,68. ADAR1 has binding domains for A-form dsRNA and for Z-form dsRNA, which guide ADAR1 to mediate highly site-selective adenosine deamination of only a few, specific adenosine residues, although non-selective editing of multiple sites in endogenous and viral RNAs has been described43. However, non-selective editing of multiple sites also has the potential to destroy secondary structures, which are potential recognition sites for MDA5, PKR and OAS1.

Another sequence-dependent antiviral mechanism is triggered by high frequencies of CpG and UpA (but not GpC or ApU) dinucleotides in coding RNA, and these dinucleotide motifs are indeed avoided in the RNA sequences of many RNA viruses (for example, picornaviruses). So far, the mechanism behind the antiviral effects of CpG and UpA motifs is unclear; however, it does not involve ADAR1, RIG-I, MDA5 or PKR, nor does it depend on codon usage effects on translation69,70.
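Dinucleotide avoidance of this kind is commonly quantified as an observed/expected ratio, in which the observed count of a dinucleotide is compared with the count expected from the single-base frequencies alone. The following Python sketch illustrates one such calculation; the function name `dinucleotide_oe` and the short test sequence are hypothetical, and this simple estimator is an assumption made for illustration, not necessarily the measure used in the cited studies.

```python
from collections import Counter


def dinucleotide_oe(seq, dinuc):
    """Observed/expected ratio of a dinucleotide in an RNA (or DNA) sequence.

    Expected count = f(X) * f(Y) * (N - 1), where f is the single-base
    frequency and N the sequence length. Ratios well below 1 indicate
    that the dinucleotide is under-represented.
    """
    seq = seq.upper().replace("T", "U")
    dinuc = dinuc.upper().replace("T", "U")
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of >= 2 nt and a 2-nt motif")
    base_counts = Counter(seq)
    # Count overlapping occurrences of the dinucleotide.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n) * (n - 1)
    if expected == 0:
        return float("nan")  # one of the two bases is absent
    return observed / expected


toy = "ACGUACGU"  # hypothetical sequence, for illustration only
for motif in ("CG", "UA", "GC", "AU"):
    print(motif, dinucleotide_oe(toy, motif))
```

Applied to a viral coding sequence, ratios for CpG and UpA that lie clearly below those for GpC and ApU would be consistent with the avoidance described above.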

Although many PRRs recognize RNA in a sequence-dependent manner, the underlying biological significance is unknown.

Recognition of 5′-modified RNA. RIG-I is activated by blunt end 5′-triphosphate dsRNA that lacks overhangs71,72,73,74 (Table 1). Experimental approaches using synthetic or highly purified enzymatically generated 5′-triphosphate dsRNA revealed a minimum length of 18–19 bp of dsRNA for RIG-I recognition73,75. Structural studies confirmed the molecular requirement of a base-paired RNA region with a 5′-triphosphate end for RIG-I recognition76,77,78,79,80. By contrast, one group reported that 5′-triphosphate RNA that forms a hairpin structure can activate RIG-I with a base-paired stretch as short as 10 bp81. However, it has to be considered that those hairpin RNAs could also form a 20mer heteroduplex dsRNA from two hairpins and in this way meet the minimum length of 18–19 bp (Table 1).

Nascent 5′-triphosphate RNA in the cytosol is indicative of RNA that has formed outside the nucleus and thus denotes viral infection72 or intracellular bacteria82,83,84. Enzymatic RNA biosynthesis is based on the 3′ to 5′ linkage of nucleotide triphosphates and consequently leaves a free triphosphate at the 5′ end in nascent RNA. Endogenous self RNA is generated in the nucleus and further processed (such as backbone or base modifications, cleavage and 5′ capping), resulting in the removal or modification of accessible 5′-triphosphate ends before the RNA is translocated to the cytosol. By contrast, the perfect replication of an RNA strand is a hallmark of many viruses and is always linked to the formation of a blunt end dsRNA structure containing 5′-triphosphate ends, which is recognized by cytosolic RNA sensors. Furthermore, some (−)ssRNA viruses (for example, influenza virus) that minimize the occurrence of dsRNA by binding their ssRNA to nucleocapsid proteins possess genomes that form panhandle structures with a blunt end, due to partially complementary terminal sequences23. These partially complementary terminal sequences are a result of two ssRNA replication origins at the ends of the viral genomic RNA that are conserved, as both are recognized by the same viral RNA polymerase. Panhandle structures are recognized by RIG-I, a cytosolic RNA sensor.

Certain (−)ssRNA viruses (for example, Sendai virus and vesicular stomatitis virus) separate nascent RNA strands with nucleocapsid proteins45 and are thus primarily devoid of dsRNA intermediates and panhandle structures. However, they still can be recognized by RIG-I via perfectly matched base-paired abortive RNA structures that are generated during erroneous replication (known as 'snap back' defective interfering RNA genomes)85.

In addition to 5′-triphosphate ends, RIG-I recognizes 5′-diphosphate ends of dsRNA, albeit with lower affinity than 5′-triphosphate. 5′-diphosphate ends are a characteristic molecular structure formed by certain viruses (such as reovirus)49. Interestingly, the enzymatic synthesis of poly(I:C) can leave free 5′-diphosphate ends and therefore contribute to the recognition of poly(I:C) by RIG-I49. Thus, poly(I:C) should not be regarded as a simple synthetic mimic of dsRNA, as other forms of dsRNA do not generally bear 5′-diphosphate moieties. The detection of di- or tri-phosphorylated 5′-ends of RNA allows RIG-I to reliably identify much shorter dsRNA molecules compared with TLR3 and MDA5 (Refs 72,73).

Besides RIG-I, three other nucleic acid receptors — IFIT1 and IFIT5 (Refs 35,36,37,39) and PKR86 — have been shown to recognize 5′-triphosphorylated RNA, but, in these cases, the RNA was single-stranded. IFIT1 and IFIT5 were found to recognize either 5′-triphosphate ssRNA or a triphosphorylated ssRNA overhang of at least 5 nt in length in the case of IFIT1, or 3 nt in length in the case of IFIT5 (Ref. 37). Conversely, base pairing within the first 5 or 3 nt abrogates recognition37. The presence of a 5′ 7-methylguanosine cap (m7G cap) enhances IFIT1 binding, confirming that IFIT1 targets mRNA for translational inhibition and does not target viral genomic RNA38,39. Activation of PKR by 5′-triphosphorylated 'single-stranded' RNA with secondary structures starts at a length of about 47 nt, and an m7G cap abrogates PKR binding86. In addition to 5′-triphosphate-dependent binding to ssRNA, PKR also binds long dsRNA independently of 5′-triphosphate (discussed above).

In conclusion, many evolutionarily unrelated PRRs target polyphosphorylated RNA, albeit in the context of different RNA structures.
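The length and 5′-end requirements quoted in this section can be summarized as a simple rule set. The Python sketch below is a deliberately reduced illustration, not a predictive model: the class `RnaLigand`, the function `candidate_sensors` and all field names are hypothetical, the cut-offs are the approximate values given in the text (TLR3 >35 bp, MDA5 >300 bp, PKR and OAS1 >30 bp, RIG-I about 19 bp with a blunt 5′-triphosphate or 5′-diphosphate end), and sequence preferences, ssRNA ligands, cell surface TLR3 and other RNA modifications are ignored.

```python
from dataclasses import dataclass


@dataclass
class RnaLigand:
    """Hypothetical, simplified description of a base-paired RNA ligand."""
    paired_bp: int                 # length of the continuous base-paired stretch
    five_prime: str                # "ppp", "pp", "p", "OH" or "cap"
    blunt_end: bool                # 5' end base-paired, no overhang
    n1_2o_methyl: bool = False     # 2'-O-methylation at N1 (label of self)
    compartment: str = "cytosol"   # "cytosol" or "endosome"


def candidate_sensors(rna):
    """List receptors whose stated structural requirements the ligand meets."""
    sensors = []
    if rna.compartment == "endosome":
        if rna.paired_bp > 35:
            sensors.append("TLR3")
        return sensors
    # Cytosolic sensors
    if (rna.paired_bp >= 19                      # assumed cut-off, ~18-19 bp in the text
            and rna.five_prime in ("ppp", "pp")
            and rna.blunt_end
            and not rna.n1_2o_methyl):
        sensors.append("RIG-I")
    if rna.paired_bp > 300:
        sensors.append("MDA5")
    if rna.paired_bp > 30:
        sensors.extend(["PKR", "OAS1"])
    return sensors


print(candidate_sensors(RnaLigand(24, "ppp", True)))
print(candidate_sensors(RnaLigand(24, "ppp", True, n1_2o_methyl=True)))
print(candidate_sensors(RnaLigand(500, "OH", False)))
```

The second call shows the effect of the N1 2′-O-methyl label of self, which removes RIG-I from the list for an otherwise stimulatory short 5′-triphosphate duplex.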

Labels of self and viral escape mechanisms. In contrast to bacterial RNA, eukaryotic RNA commonly carries 2′-O-methylation, a modification in which a methyl group is added to the 2′ hydroxyl group of the ribose87. As a eukaryotic marker of self, it prevents the recognition of endogenous RNA by TLR7 and TLR8, thus adding to the repertoire of discriminatory factors of self versus non-self64,88,89 (Fig. 3; Table 2). Notably, 2′-O-methylation at the N1 position of mRNA is an essential component of the universal cap 1 structure of the mRNA of higher eukaryotes. It is now evident that N1 methylation in cap 1 has a key function in nucleic acid sensing as a marker of self, as it abolishes the recognition of self RNA by RIG-I90,91 and by IFIT1 (Ref. 38) (Table 2).

Table 2 Structure of endogenous RNA and viral escape mechanisms

Certain viruses have evolved mechanisms to mimic this self label of RNA by encoding 2′-O-methyltransferases, which mediate 2′-O-methylation of their RNA to escape detection by RIG-I, IFIT1, TLR7 and MDA5 (Refs 35,90,92). Important examples of viruses that encode their own N1 2′-O-methyltransferase and replicate in the cytosol include the (+)ssRNA flaviviruses and coronaviruses, the (−)ssRNA paramyxoviruses and rhabdoviruses, the dsRNA reoviruses and the dsDNA poxviruses (for example, vaccinia virus)93. Furthermore, the (−)ssRNA orthomyxoviruses, bunyaviruses and arenaviruses cut the 5′-terminal 10–14 nt of host mRNA and use those capped oligonucleotides as a primer for mRNA generation (termed the cap-snatching mechanism)93.

Concealment and modification of 5′-terminal RNA is an important viral strategy to avoid immune recognition. RNAs of pathogenic alphaviruses that lack N1 2′-O-methylation have 5′-terminal structural motifs that interfere with IFIT1 binding and function94 (Table 2). Members of the Picornaviridae family prime RNA transcription using a non-canonical cap-like structure termed VPg that is covalently linked to a uridine dimer (VPg–pUpU), thereby avoiding the presence of a 5′-triphosphate and escaping recognition by both RIG-I and IFIT1 (Ref. 93). Furthermore, Crimean–Congo haemorrhagic fever virus, Hantaan virus and Borna disease virus have been reported to prevent RIG-I-mediated detection of their genomes by an RNA prime-and-realign mechanism that shifts the panhandle structure to generate a 5′ overhang and by cleavage of the 5′ terminal base of their genomic RNA, leaving monophosphorylated 5′ ends95. Arenaviruses use a prime-and-realign mechanism to generate 5′-triphosphate overhangs of 1 nt in their panhandle structures, which are not recognized by RIG-I75 and, in principle, do not allow binding by IFIT1.

Other RNA modifications present within eukaryotic transfer RNA (tRNA) and ribosomal RNA (rRNA) were also reported to inhibit immune sensing. Incorporation of pseudouridine (Ψ), 5-methylcytidine (m5C), 2-thio-uridine (s2U) or N6-methyladenosine (m6A) was found to potently inhibit TLR7 and TLR8 and to reduce the activity of TLR3, which is the receptor least sensitive to RNA modifications88. Incorporation of Ψ, 5-methyluridine (m5U) or s2U was also found to abolish RIG-I activation72. As mentioned above, ADAR1 prevents recognition of dsRNA by A-to-I conversion in base-paired structures43,55,56, which results in the destabilization of the double strand due to disrupted A–U base pairs. Other approaches used by RNA viruses, such as (−)ssRNA viruses, to escape immune recognition by sensing receptors involve the use of nucleocapsid protein to avoid the formation of dsRNA45 or the introduction of mismatches in their genomic panhandles to minimize recognition by RIG-I96 (Table 2).

Overall, viruses have evolved sophisticated mechanisms to mimic endogenous RNA modifications, especially cap 1 structures, and to generate unique viral structures that prevent 5′ end recognition by PRRs.

Recognition of potentially pathogenic, endogenous RNAs. Some endogenous RNAs that are potentially 'pathogenic', such as RNAs from endogenous retroviruses (ERVs), appear to be continuously recognized by TLRs, thereby limiting their replication97. It has been reported that the loss of TLR7 function causes retroviral viraemia and additional loss of TLR3 and TLR9 causes acute T cell lymphoblastic leukaemia in mice97. This indicates that endogenous RNAs transcribed from RNA polymerase II promoters are not generally excluded from TLR-mediated recognition. Furthermore, it has been demonstrated that OAS1-activated RNase L98 and IRE1α99, which is the detector RNase for the unfolded protein response (UPR), cleave endogenous RNA and that the cleavage products can then serve as endogenous ligands for RLRs.

To counteract this process, the SKIV2L RNA exosome degrades endogenous RNA, similar to the role of 3′ repair exonuclease 1 (TREX1; also known as DNase III) in the regulation of endogenous DNA (see below). The RNA exosome is a cellular RNA degradation machine in eukaryotes involved in the regulation of RNA turnover. Depletion of the SKIV2L RNA exosome resulted in the accumulation of endogenous RNA, which triggered the RIG-I pathway in an IRE1α-dependent manner100. Thus, the RNA degradation activity of SKIV2L RNA exosome appears to prevent accumulation of aberrant RNAs with immunostimulatory capacity. To date, the structural features of the 5′ end of IRE1α or RNase L cleavage products that lead to RIG-I activation have not been systematically analysed. Of note, both RNase L and IRE1α generate RNA molecules with 5′-hydroxyl and 3′-phosphate or 2′-3′-cyclic phosphate ends98,100. As viral infections trigger the IRE1α UPR, it makes sense that the IRE1α-dependent UPR pathway triggers the RIG-I pathway to achieve a full antiviral response.

Overall, RNA-sensing PRRs have evolved to sense RNA structures and modifications that typically occur during viral replication. Several unrelated PRRs share similar recognition motifs (for example base-paired or 5′-triphosphorylated RNA), which is suggestive of a convergent evolution towards efficient virus detection. Although there are hints that endogenous RNA cannot entirely avoid recognition by PRRs, additional tolerance mechanisms — including RNA modifications (especially 2′-O-methylation and A-to-I conversions) and RNA degradation of aberrant RNA structures — further prevent recognition of endogenous RNA. Viruses have established sophisticated mechanisms to exploit the endogenous tolerance mechanisms or to circumvent the generation of the typical structures recognized by PRRs.

DNA sensing

DNA-sensing receptors. TLR9 is responsible for the detection of DNA in the endolysosomal compartment101 and triggers type I IFN production in plasmacytoid dendritic cells and polyclonal activation of B cells via the myeloid differentiation primary response protein 88 (MYD88) and IRF7 signalling pathway. In contrast to the endolysosomal compartment, DNA is usually absent in the cytosol, and the mere presence of cytosolic DNA is indicative of non-self or 'dangerous' DNA. A number of DNA-sensing receptors and mechanisms have been proposed to promote inflammasome activation and the induction of type I IFN (Fig. 4).

Figure 4: Immune sensing receptors of DNA.

Toll-like receptor 9 (TLR9) in the endolysosomal compartment detects CpG motif-containing DNA and RNA–DNA hybrids that have not been degraded by nucleases such as DNase I and DNase II. TLR9 signals via myeloid differentiation primary response protein 88 (MYD88), interferon (IFN)-regulatory factor 7 (IRF7) and nuclear factor-κB (NF-κB) to induce type I IFN and the inflammasome-related factors pro-interleukin-1β (pro-IL-1β) and NOD-, LRR- and pyrin domain-containing 3 (NLRP3). In the cytosol, cyclic GMP–AMP synthetase (cGAS), IFNγ-inducible protein 16 (IFI16) and absent in melanoma 2 (AIM2) detect DNA that has not been degraded by nucleases such as 3′ repair exonuclease 1 (TREX1). Knockout models reveal that cGAS signals via stimulator of IFN genes (STING) to stimulate type I IFN production. IFI16, DEAD box protein 41 (DDX41) and cGAS also induce apoptosis via STING, IRF3 and BCL-2-associated X protein (BAX) in response to double-stranded DNA (dsDNA). Polyglutamine binding protein 1 (PQBP1) is a co-receptor of cGAS that recognizes HIV reverse transcripts. AIM2 activation results in the formation of the AIM2 inflammasome, which induces IL-1β maturation and pyroptosis via ASC and caspase 1 (not shown). Recognition of dsDNA in the nucleus by RAD50 activates caspase recruitment domain 9 (CARD9)–BCL-10 to induce NF-κB, which upregulates pro-IL-1β transcription. DNA-sensing receptors implicated in the DNA damage response and in the nuclear recognition of viral DNA include breast cancer type 1 susceptibility protein (BRCA1), IFI16, DNA-dependent serine/threonine protein kinase (DNA-PK) and MRE11, and these molecules also signal via STING. Ribonuclease H (RNase H) degrades DNA–RNA hybrids. ssDNA, single-stranded DNA.

The PYHIN (pyrin and HIN domain-containing protein) family member AIM2 is the major cytosolic dsDNA-sensing receptor responsible for inflammasome activation102,103,104,105. The AIM2-related PYHIN domain-containing IFNγ-inducible protein 16 (IFI16) has also been implicated in inflammasome activation in response to dsDNA and in the death of HIV-infected CD4+ T cells106,107,108. In addition, the DNA-damage sensor RAD50 was reported to activate the CARD9–BCL-10 pathway, leading to pro-IL-1β mRNA induction after stimulation with long dsDNA (such as poly(dAdT), poly(dGdC) and calf thymus DNA)109 (Fig. 4).

Similar to poly(I:C), a self-complementary polymer of alternating AT (poly(dAdT)) that forms dsDNA, also termed 'B-DNA', served initially as a gold standard for investigating dsDNA-mediated immunostimulation110. However, it turned out that AT-rich DNA in the cytosol can also indirectly induce type I IFN via RNA-sensing receptors: cytosolic poly(dAdT) DNA is transcribed by RNA polymerase III into 5′-triphosphorylated RNA, which folds into 5′-triphosphate dsRNA that strongly activates RIG-I111,112. The biological relevance of this pathway is unclear, however, as AT-rich DNA is rather uncommon and self-complementarity is a prerequisite for RIG-I stimulation. Numerous candidate cytosolic dsDNA receptors, including ZBP1, DEAD box protein 41 (DDX41), IFI16, the DNA-PK complex (DNA-dependent serine/threonine protein kinase complex), MRE11, polyglutamine binding protein 1 (PQBP1) and cGAS, have been proposed to activate the IRF3 pathway (but not the RIG-I-dependent DNA sensing pathway) for type I IFN production (also known as the interferon stimulatory DNA pathway (ISD pathway))113,114.

Currently, cGAS is the most widely accepted dsDNA sensor; however, it cannot be excluded that the other candidates may have a role in influencing the cGAS pathway or in sensing dsDNA within specific tissues. It is broadly accepted that the endoplasmic reticulum adaptor protein stimulator of IFN genes (STING; also known as MITA, ERIS and MPYS) is downstream of the DNA receptor and is essential for the activation of the IRF3-dependent pathway115,116 (Fig. 4). STING was found to sense both cyclic dinucleotides secreted by intracellular bacteria117,118 and the endogenous cyclic dinucleotide cGAMP (cyclic GMP–AMP), which contains one 2′-5′-phosphodiester linkage and a canonical 3′-5′ linkage (c[G(2′-5′)pA(3′-5′)p]). cGAMP functions as a second messenger and is generated by cGAS following binding of cytosolic DNA119,120,121,122. Crystal structures of cGAS and in vitro activation assays demonstrate that dsDNA directly activates the catalytic activity of cGAS123,124,125,126,127,128. In vivo, the DNA receptor cGAS is essential for the detection of dsDNA genome-based viruses such as herpes simplex virus (HSV), vaccinia virus and adenovirus119,129,130, as well as for retroviruses126,131,132,133.

In addition to cGAS, IFI16 was reported to colocalize with STING upon DNA stimulation134 and to contribute to IFN responses during infection with HSV, human cytomegalovirus, HIV and Listeria monocytogenes114,135. However, to date, a substantial contribution for IFI16 in the DNA-induced type I IFN response has not been confirmed in in vivo studies.

Using knockout and reconstituted cell lines, the DNA damage sensor protein BRCA1 (breast cancer type 1 susceptibility protein) was shown to be crucial for IL-1β and type I IFN induction following Kaposi's sarcoma-associated herpesvirus, Epstein–Barr virus and HSV1 infection; BRCA1 bound to nuclear IFI16 complexes and enabled IFI16 nuclear export, which resulted in the formation of an IFI16-containing inflammasome complex and the IFI16–STING complex136.

Furthermore, genetic deficiency in single DNA-PK components (DNA-PKcs, Ku70 and Ku80) abrogated dsDNA-induced type I IFN response in mouse embryonic fibroblasts (MEFs) in vitro137 but not in mouse bone marrow-derived macrophages113, suggesting tissue- or maturation-specific differences in the role of the DNA-PK complex in dsDNA sensing, which are not fully understood. Although cGAS expression was reported to be essential for the type I IFN response induced by genomic or plasmid DNA or by dsDNA viruses in lung fibroblasts, macrophages and dendritic cells, cGAS expression in MEFs is low, and the response to dsDNA in cGAS-deficient MEFs has not been analysed119. Therefore, the possibility remains that the recognition of dsDNA in MEFs differs from other cell types and might depend on receptors other than cGAS.

Altogether, it is possible that some of the proposed dsDNA receptors function upstream of or as enhancers for the cGAS–STING–IRF3 pathway. Indeed, one cGAS co-receptor, PQBP1, has recently been shown to be a specific enhancer of cGAS-dependent detection of HIV1 cDNA but not detection of plasmid or genomic DNA138.

Structural requirements for DNA recognition. Functional analysis of the cGAS–STING–IRF3 pathway and the AIM2 inflammasome pathway revealed length-dependent but not sequence-dependent requirements for DNA recognition, which is consistent with structural data showing that the DNA phosphate backbone binds to the HIN domain of AIM2 and the carboxyl terminus of cGAS121,123,125,139 (Table 3). For the cGAS–STING–IRF3 pathway, a minimum DNA length of 25 bp was found in mouse myeloid cells and of 40–60 bp in human myeloid cells111,133,134. By contrast, AIM2 was not activated by the so-called ISD sequence (which is a 45mer dsDNA strand) in mouse macrophages, and the observed minimum DNA length for AIM2 activation in human myeloid cells was 50–80 bp102,139. Unexpectedly, stem-loop ssDNA structures, which are characteristic of retroviruses and retroviral elements such as ssDNA reverse transcripts from HIV1, potently activate cGAS, although the stems in such stem-loop structures are shorter than 40 bp108,133. Further evidence showed that the presence of guanosines in the loops or in short single-stranded 3′- and 5′-flanking sequences was required for cGAS activation and that the sequence of the dsDNA stem regions had no impact on activity133. The minimal length of such Y-form DNA with G-containing overhangs that potently activates cGAS was 12 bp (Table 3). G-ended Y-form DNA with a 20mer dsDNA stem was as potent as plasmid or genomic DNA in activating cGAS133.
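The length rules described above can be summarized as a small decision function. The following Python sketch is only an illustration: the names `DnaLigand`, `may_activate_cgas`, `CGAS_MIN_BP` and `Y_FORM_MIN_STEM_BP` are hypothetical, the human threshold uses the lower bound of the reported 40–60 bp range, and applying the 12 bp Y-form rule to both species is an assumption. Real receptor activation is graded, not a yes/no outcome.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; treated here as hard cut-offs for illustration only.
CGAS_MIN_BP = {"mouse": 25, "human": 40}  # human: lower bound of the 40-60 bp range
Y_FORM_MIN_STEM_BP = 12  # Y-form DNA with G-containing overhangs (assumed for both species)


@dataclass
class DnaLigand:
    paired_bp: int                # length of the base-paired region in bp
    g_in_overhangs: bool = False  # guanosines in unpaired flanks or loops


def may_activate_cgas(ligand: DnaLigand, species: str) -> bool:
    """Return True if the ligand meets one of the two length rules in the text."""
    # Rule 1: sufficiently long dsDNA, independent of sequence.
    if ligand.paired_bp >= CGAS_MIN_BP[species]:
        return True
    # Rule 2: short stems are tolerated when unpaired guanosines flank them.
    return ligand.g_in_overhangs and ligand.paired_bp >= Y_FORM_MIN_STEM_BP
```

For example, a 20 bp stem returns `False` for human cells on its own but `True` once `g_in_overhangs` is set, which mirrors the reported behaviour of G-ended Y-form DNA.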

Table 3 DNA ligand structures that indicate non-self

During lentiviral infection, the primary reverse transcription of the RNA genome to ssDNA occurs in the cytosol, whereas the second strand is typically synthesized in the nucleus. Most of the template RNA in the cytosol is rapidly degraded by RNase H. Using a mutated HIV1 reverse transcriptase, which impaired dsDNA but not ssDNA synthesis, the induction of type I IFN by lentiviral particles correlated with the presence of cytosolic ssDNA rather than the presence of dsDNA133. This finding is corroborated by the observation that elite controllers of HIV1 infection produce more type I IFN because of the cytosolic accumulation of HIV1 ssDNA reverse transcripts140. Here, the observed recognition of guanosines in unpaired DNA by cGAS may be essential for ssDNA recognition in the cytosol. In conclusion, primary ssDNA reverse transcripts appear to be the crucial DNA species recognized by cGAS during HIV1 infection and a special mechanism that allows for sensing of guanosine-rich Y-form DNA structures enables the sensitive detection of ssDNA.

TLR9 in the endolysosomal compartment preferentially detects DNA containing unmethylated CpG dinucleotides. Unmethylated CpG dinucleotides are less frequent in eukaryotic self DNA compared with bacterial DNA101,141,142,143. Notably, the specificity of TLR9 for unmethylated CpG motifs is reduced when the CpG motif is placed in the context of phosphorothioate-stabilized ssDNA142.
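The difference in CpG frequency between genomes can be made concrete with the commonly used observed/expected CpG ratio, (number of CpG × sequence length) / (number of C × number of G). The sketch below is a minimal illustration; the function name `cpg_observed_expected` is hypothetical, and the ratio says nothing about methylation status, which cannot be read from the sequence alone.

```python
def cpg_observed_expected(seq: str) -> float:
    """Observed/expected CpG ratio of a DNA sequence (0.0 if C or G is absent)."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count gives the exact CpG count.
    n_cpg = seq.count("CG")
    return n_cpg * len(seq) / (n_c * n_g)
```

A ratio well below 1 indicates CpG depletion relative to base composition, whereas a ratio near 1 indicates that CpG occurs about as often as expected by chance.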

TLR9 and cGAS can also recognize DNA–RNA hybrids144,145 (Table 3). The hybridization of a 30mer GU-repetitive RNA strand with a 30mer AC-repetitive DNA strand yields a potent TLR9 stimulus, yet abrogates the TLR7-stimulating activity of the GU-repetitive RNA sequence144. Of note, the AC-repetitive DNA strand that lacks CpG has no TLR9 activity as a single strand, underlining the RNA–DNA hybrid as a real recognition motif. Based on structural models, the mechanism through which RNA–DNA hybrids and dsDNA bind to cGAS is similar145. Indeed, RNA–DNA hybrids of the homopolymers poly(rA) and poly(dT) were found to activate cGAS, but less robustly than dsDNA145. However, whether the accumulation of high levels of DNA–RNA hybrids in the cytosol is possible in the presence of RNase H activity is unclear. Altogether, the main dsDNA receptors (cGAS and TLR9) can also sense DNA–RNA hybrids. However, the biological relevance of DNA–RNA sensing during infections needs to be investigated.

The impact of endogenous DNA modifications. Two fundamental base modifications have been implicated in DNA sensing. One is endogenous methylation of the C5 carbon of cytosine within CpG motifs, which inhibits recognition by and activation of TLR9 (Refs 141,143). Another form of base modification occurs in the context of oxidative stress: the presence of reactive oxygen species as a result of UV radiation or cell stress leads to the incorporation of oxidative adducts, the most common of which is 8-hydroxyguanosine (8-OHG) (Fig. 5). Thus, the 8-OHG modification is indicative of oxidative damage. 8-OHG modifications have important consequences for the immune response: 8-OHG stabilizes DNA against degradation by the cytosolic DNA exonuclease TREX1, leading to the accumulation of cytosolic DNA and increased cGAS activation146 (Fig. 5).

Figure 5: Immune sensing of cytoplasmic double-stranded DNA as non-self nucleic acid.

The cytosolic immune receptor cyclic GMP–AMP synthetase (cGAS) detects long double-stranded DNA (dsDNA) or short dsDNA with unpaired open ends containing guanosines, which is present in highly structured single-stranded DNA (ssDNA) of certain viruses, such as retroviruses. Cytosolic DNA is efficiently degraded by 3′ repair exonuclease 1 (TREX1) located in the cytosol. Oxidative stress caused by UV radiation or cell stress leads to oxidative modification of DNA, the most common form of which is 8-hydroxyguanosine (8-OHG). This modification stabilizes DNA against degradation by TREX1, resulting in an accumulation of DNA in the cytosol. Oxidized DNA is recognized by cGAS, resulting in the formation of the cyclic dinucleotide cGAMP, which activates STING to induce type I interferon (IFN) production.

Endogenous DNases and autoinflammation. Cell death-derived DNA, DNA damage and endogenous retroviral elements are sources of self DNA that may accumulate in the circulation, in endolysosomes and in the cytosol of phagocytic cells. Three DNases continuously limit immune recognition of such DNA through degradation (Fig. 4). DNase I is secreted and degrades DNA in the circulation. The loss of DNase I causes a phenotype similar to systemic lupus erythematosus147. Lysosomal DNase II not only degrades phagocytosed DNA but is also crucial for the deletion of nuclear DNA that enters the autophagy pathway148. DNase II deficiency is fatal owing to the expression of type I IFN during embryogenesis; mice can be partially rescued by concurrent type I IFN receptor deficiency but then develop a polyarthritis-like disease149,150. The tissue-specific activation of AIM2 and the TLR pathway by self DNA in DNase II-deficient mice has been associated with polyarthritis151,152. TREX1 degrades DNA in the cytosol (Fig. 4), and TREX1 deficiency in humans causes the autoinflammatory Aicardi–Goutières syndrome153. The fatal phenotypes of TREX1-deficient and DNase II-deficient mice, as well as the associated polyarthritis, can be reversed by knockout of the gene encoding cGAS148,154,155,156. In contrast to its suppressive role in cytoplasmic DNA receptor activation, DNase II was reported to support the stimulation of endosomal TLR9 by enhancing the accessibility of ssDNA ligands from bacterial DNA in the endosome157.

Many of the cytosolic pathways proposed to trigger type I IFN production by dsDNA recognition are still controversial. cGAS, which activates STING via the second messenger cGAMP, is the most accepted DNA sensor. Recent research has revealed that the DNA sensing receptors cGAS and TLR9 can also be activated by DNA–RNA hybrids. In addition to length-dependent activation of cGAS by dsDNA, cGAS possesses an alternative mechanism for DNA recognition that detects G-rich Y-form DNA and appears to be critical for sensing ssDNA that is derived from exogenous or endogenous retroviruses.

Concluding remarks

Innate immune cells have established a powerful system to specifically detect the presence of foreign, potentially harmful nucleic acids. Several virus-associated nucleic acid structures have been identified, such as long dsRNA, 5′-triphosphate or 5′-diphosphate dsRNA, CpG motifs in ssDNA, and short dsDNA with guanosine-containing overhangs that resemble highly structured viral ssDNA. In addition, cells actively label their self nucleic acids in order to distinguish them from non-self molecules. The 2′-O-methylation at the N1 position of capped RNA completely eliminates recognition by RIG-I and IFIT1, the incorporation of 2′-O-methyl groups or other modifications (such as pseudouridine) at internal positions of RNA impairs recognition by TLR7, TLR8 and MDA5, and C5 methylation of CpG motifs in DNA abolishes TLR9 recognition.

However, pattern sensing of nucleic acids extends beyond the detection of specific molecular structures at the receptor–ligand level. To achieve sensitivity and specificity for viral nucleic acids among an abundance of self nucleic acids, this system further integrates the enzymatic function of extracellular and intracellular nucleases, allowing the detectability of nucleic acids to be determined by their stability against nucleases. Thus, potentially dangerous nucleic acids, such as those associated with viral particles or released during oxidative stress, are particularly potent immune activators.

The nucleic acid-sensing system further integrates information about the localization of detected nucleic acids in order to select the most appropriate response: TLRs localized in the endosomal membrane induce different signalling pathways to those induced by receptors localized in the cytosol. The distinct function of nucleic acid-sensing receptors is also regulated by their differential expression in functionally distinct immune cell subsets and even non-immune cells. An important example is the restricted expression of TLR7 and TLR9 in plasmacytoid dendritic cells, which, upon stimulation, respond with very high levels of type I IFN production. In contrast, TLR8-mediated stimulation of myeloid cells leads to the release of IL-12.

Overall, the main responses induced by nucleic acid-sensing receptors comprise: production of cytokines (for example, type I IFNs) and chemokines to activate neighbouring cells and to recruit immune cells to the site of infection; and direct antiviral responses, such as cell-autonomous antiviral mechanisms, that target viral replication, translation of viral proteins and virus assembly, and that induce cell death, including apoptosis, necroptosis and pyroptosis (reviewed in Ref. 158).

It is evident that the sensitivity of the nucleic acid-sensing system can differ according to the immunological context. In the presence of type I IFN or nucleic acid ligands, the sensitivity of this system is heightened, and, consequently, endogenous ligands can become more immunogenic. In addition, certain stressors can increase the recognition of endogenous ligands; for example, exposure of DNA to reactive oxygen species results in DNA oxidation, which enhances its stability against TREX1-mediated degradation and thereby increases its recognition by cGAS.

Future perspectives

Although the major nucleic acid-sensing receptor pathways seem to have been identified, numerous urgent questions remain regarding the functional interactions of these receptors within and beyond the immune system. We still do not understand the nature of immune sensing of foreign extrachromosomal nucleic acids in the nucleus. Another pressing area of research is the link between nucleic acid sensing and the DNA damage response.

Within well-described nucleic acid-sensing pathways, we are still just beginning to understand the implications of genetic alterations for human disease. Genetic alterations can lead to enhanced or reduced expression or function of nucleic acid-sensing receptors, DNases, RNases and nucleic acid-modifying enzymes. Perturbations of the balance between self and non-self nucleic acid recognition are involved in the pathogenesis of several rare autoimmune diseases such as Aicardi–Goutières syndrome (which is associated with genetic defects in TREX1, RNase H components and SAMHD1 (Ref. 159)), Singleton–Merten syndrome (associated with mutations in the ATPase domain of RIG-I, leading to constitutive RIG-I activation by self RNA)160 and STING-associated vasculopathy with onset in infancy (SAVI)161. These perturbations can also contribute to the pathogenesis of type 1 diabetes (owing to genetic variants of RIG-I and MDA5 (Refs 2,162,163)). Furthermore, recent work indicates that disrupted nucleic acid sensing is involved in systemic lupus erythematosus, rheumatoid arthritis and primary Sjögren syndrome2,164. Several studies also implicate the RIG-I–MAVS signalling axis in gut homeostasis165,166,167. A better understanding of nucleic acid sensing and its role in disease may also pave the way for new therapeutic interventions involving therapeutic oligonucleotides10 or other molecules that specifically block certain nucleic acid receptor pathways165,166,167.

Finally, it will be important to better understand the relationship between RNAi and the innate immune sensing of nucleic acids. From an evolutionary perspective, Dicer, the central endoribonuclease of the RNAi pathway, and RIG-I helicases are closely related. Moreover, several recent studies suggest that many pathogenic viruses disable Dicer11. Whether viruses disable Dicer to affect the function of microRNAs or to escape antiviral RNAi or both merits further investigation.