Main

Viruses are obligate intracellular pathogens that depend on host cellular components for replication. They bind to cell surface receptors to enter cells, and they co-opt cellular functions and organelles to replicate. Host cells can counteract infections by sensing pathogen-associated molecular patterns (PAMPs), such as viral nucleic acids, and by subsequently triggering the expression of antiviral genes. The identification and characterization of host factors that promote and restrict viral replication can provide important insights into basic aspects of cellular biology and virus–host relationships, and can lead to the identification of new targets for antiviral therapeutics.

The use of forward genetic screens has provided an unbiased and comprehensive strategy to uncover host factors that promote or restrict virus replication. Originally, the use of these genetic screens was limited to genetically tractable model organisms, such as yeasts, fruit flies, roundworms and zebrafish, and relied on the use of X-rays or chemical mutagens to introduce mutations. These forward genetic screens have markedly contributed to our understanding of many fundamental biological processes1,2,3,4, but their application to cultured mammalian cells was challenging. With technological advances such as RNAi and insertional mutagenesis in human haploid cells, it became possible to disrupt gene expression on a genome scale in mammalian cell culture5,6,7. Recently, the prokaryotic CRISPR–Cas adaptive immune system has been engineered to efficiently induce knockout mutations in almost any cell type, which has revolutionized biological research8,9,10 (Box 1). In contrast to gene knockdown approaches, such as RNAi, the knockout of alleles by CRISPR–Cas often results in more marked phenotypes, a greater signal-to-noise ratio and the identification of fewer false-positives11,12,13,14. Knockout alleles are generated by the endonuclease Cas9, which is directed to a specific genomic region by a single-guide RNA (sgRNA) through Watson–Crick base pairing. Cas9 creates a double-strand break (DSB) at the target site, which is then repaired by non-homologous end joining (NHEJ). This often results in a frameshift mutation and the expression of truncated or non-functional proteins. The ease of Cas9 targeting to specific loci, combined with the design of multiplexed pools of sgRNAs that span the entire human genome14,15,16,17, has enabled the genome-scale identification of host factors that are crucial for virus replication.

In this Review, we describe how genetic screens have contributed to our understanding of virus–host biology and how CRISPR–Cas screens have been used to expand our toolkit to identify host factors that are important for virus replication. We provide practical advice on how to set up CRISPR–Cas screens and give examples of recent discoveries that have been made using CRISPR–Cas technology for viruses that cause important human diseases, including dengue virus (DENV), Zika virus (ZIKV), West Nile virus (WNV), hepatitis C virus (HCV) and noroviruses. We also discuss the potential for CRISPR–Cas technology beyond genetic screening applications, and how it could advance our understanding of viral pathogenesis and the development of antiviral therapeutics.

The power of unbiased genetic screens

Historically, loss-of-function screens have lagged behind gain-of-function approaches in mammalian cells owing to the lack of efficient tools that can mutate both alleles in diploid genomes in a high-throughput manner.

Gain-of-function approaches. Gain-of-function approaches rely on the ectopic overexpression of genes and have been successful in identifying both cell surface receptors that are required for viral entry and host restriction factors. To identify entry receptors, a cell line that is refractory to infection is typically transduced with a complementary DNA library (cDNA library) derived from a cell type that is permissive to infection. For example, claudin 1 (CLDN1)18 and occludin (OCLN)19 were identified as entry receptors for HCV by transducing a non-permissive cell line with a cDNA library derived from hepatocellular carcinoma cells. Beyond receptors, an unbiased expression screen revealed that SEC14-like protein 2 (SEC14L2), which is a cytosolic lipid-binding protein, enhances the replication of clinical strains of HCV20. Furthermore, a library of 380 interferon-stimulated genes (ISGs) was used to identify key proteins that are important for innate immune defences against several DNA and RNA viruses21,22. In a continuing effort, comprehensive cDNA libraries that contain all annotated human ORFs have been cloned into lentiviral expression vectors23; the resulting expression library is likely to improve the utility of gain-of-function screens in the study of host–pathogen interactions.

Loss-of-function genetic screens. Loss-of-function screens are based on the stable knockdown or knockout of genes. Initial approaches that used RNAi have provided valuable insights into virus–host relationships24. In contrast to RNAi, which leads to the partial depletion of expression for a specific gene, recent technological advances have made it possible to completely disrupt gene expression (Table 1). One approach, termed haploid genetic screening, relies on insertional mutagenesis of genes in cultured haploid cell lines. For example, retroviral gene traps that contain a splice acceptor site can integrate into the host genome, leading to the expression of truncated mRNA transcripts7. The complete ablation of gene expression can have marked effects on virus replication and enables the identification of the most crucial host factors for virus infection. Insertional mutagenesis in haploid cells has been used to discover essential receptors for several viruses, including Ebola virus and Lassa virus25,26. Both viruses use abundant lysosomal proteins as receptors. The interaction between the Ebola virus glycoprotein and its receptor Niemann–Pick C1 protein (NPC1) is triggered by cathepsin cleavage27,28, whereas the Lassa virus glycoprotein interacts with its receptor, lysosome-associated membrane glycoprotein 1 (LAMP1), following acidification of the endosome. Subsequent structural studies defined the binding interface between the viral glycoprotein and NPC1 (Refs 29,30,31). Interestingly, several mutations arose in the host-binding site of the viral glycoprotein during the 2013–2016 Ebola virus epidemic. These mutations increased the infectivity of the virus in primate cells but not in rodent cells, which suggests that these mutations contributed to adaptation and spread in humans32,33.

Table 1 Comparison of mammalian loss-of-function screening methods

Haploid genetic screens were important for the discovery of a cellular phospholipase that enables viral evasion of an antiviral restriction mechanism that is broadly active against many picornaviruses34. Recently, a haploid screen identified a proteinaceous receptor that enables virus entry for multiple distinct serotypes of adeno-associated virus (AAV)35, potentially affecting the use of AAV as a gene therapy vector. These and other studies have established loss-of-function screens as a reliable strategy to uncover host factors that are crucial for virus replication (Table 2).

Table 2 Genome-wide knockout screens to identify virus–host interactions

Practical considerations for screens

Genetic screens enable the identification of virus–host interactions without prior knowledge of the interaction and on a genomic scale. In this section, we describe the different technologies that are currently available to carry out genetic screens, and we highlight important considerations at different stages of the screen, including the generation of the library of mutant cells, the virus infection assay, phenotypic selection, next-generation sequencing and bioinformatic analyses (Fig. 1). We also consider the degree of 'saturation' in genetic screening; that is, the fraction of target genes it is possible to identify in a specific screen.

Figure 1: Genome-wide screening strategies to investigate host factors that are involved in virus infection.

Genetic screens can identify host factors that promote virus replication, as well as antiviral restriction factors. For example, in a loss-of-function screen, knockout of a viral receptor in a permissive cell line will make the cell resistant to the virus infection. By contrast, in a gain-of-function screen, overexpression of the viral receptor in a non-permissive cell line will enable virus infection. Various technologies are available for genome-wide screening. Loss-of-function screens can be carried out using haploid mutagenesis (in a 1n cell type) or CRISPR–Cas knockout approaches, whereas gain-of-function screens use CRISPR activation (CRISPRa) or ectopic overexpression. Target cells are mutated by delivering retroviral gene trap or lentiviral expression constructs, which can either disrupt gene expression or drive it. The pooled mutagenized cell population is then subjected to a virus infection assay, in which either the cells are infected by the virus of interest, or a subgenomic viral reporter is introduced by transduction or transfection. Virus-resistant cells are selected either by surviving virus-induced cell death or by fluorescence-activated cell sorting (FACS). Next-generation sequencing and bioinformatic analyses enable the enrichment of CRISPR single-guide RNAs (sgRNAs), gene traps or complementary DNA (cDNA) insertions to be determined. CMV, cytomegalovirus; dCas9, catalytically inactive Cas9; DSB, double-strand break; LTR, long terminal repeat; NHEJ, non-homologous end joining; pA, poly(A) tail; SA, splice acceptor; VP64, herpes simplex virus VP16 activation domain.


Choice of cell line and screen. Viruses differ in their host range and tissue tropism. Whether a cell is permissive or non-permissive to virus infection is determined by the expression of genes that facilitate virus replication and genes that restrict virus infection. Genetic screens can uncover genes that promote and restrict virus replication depending on the choice of host cell type (permissive or non-permissive) and the type of screen (loss-of-function or gain-of-function; Fig. 1). CRISPR–Cas genome editing has been reported for a wide range of cell lines that can be infected with many viruses. However, large-scale genetic screens in which the sgRNAs are introduced into the cells in a pooled manner have some limitations. To ensure the appropriate representation of each of the sgRNAs in the pool, numerous cells are transduced and undergo phenotypic selection. In practice, many transformed cell lines will be suitable for generating a mutagenized cell library; however, primary cells have a limited proliferative capacity, and it is therefore more challenging to transduce and expand these cells in large numbers. Pre-arrayed sgRNA formats, in which wells contain individual synthetic sgRNA constructs for reverse transfection36, may therefore be more suitable for primary cells.

Haploid screens are limited to cell types that have a haploid or near-haploid karyotype, so that insertional mutagenesis of a single allele is sufficient to disrupt the gene. Commonly used cell lines include the chronic myeloid leukaemia-derived cell line KBM7 (Ref. 7) and its derivative, HAP1 (Refs 25,37), and human38 and mouse embryonic stem cells39,40. Despite this limited choice of cell types, haploid genetic screens have been useful for studying many different virus–host interactions41,42 (Table 2).

Overall, both CRISPR–Cas and haploid screens are well suited for the identification of host factors if the cell line is permissive to the virus. The two types of screen may even be carried out in parallel for additional validation and comprehensive screening of candidate genes.

CRISPR libraries and mutagenesis. Several CRISPR sgRNA libraries are available as plasmid repositories (see the Addgene website). The libraries vary in the number of sgRNAs they contain, their target genes (genome wide or a subpool of genes only), the targeted position within the gene (for example, the ORF or the promoter), the targeted species, and their availability as a one-plasmid or two-plasmid system (such that Cas9 is encoded on the same plasmid as the sgRNA or on a second plasmid, respectively). In addition, custom libraries can be constructed for a specific class of gene (for example, kinases) or for validation screens.

The initial genome-scale CRISPR knockout (GeCKO) libraries contained 4–6 sgRNAs per gene and were designed to minimize off-target effects14,43. More recently constructed CRISPR–Cas libraries (for example, the Broad Brunello44, Toronto KnockOut13 or Sabatini–Lander libraries45) contain more sgRNAs per gene (up to 12), which increases the likelihood of statistically significant enrichment of candidate genes. However, larger sgRNA libraries also require larger-scale screening, which could be challenging to achieve, especially in cells that have a limited capacity to divide. Notably, a small sgRNA library that contains a subset of sgRNAs from a larger CRISPR–Cas library was still able to identify the majority of the same hits, albeit with less statistical significance44. This suggests that when scaling up is unfeasible owing to cost or cell number, sgRNA libraries that contain fewer sgRNAs per gene can be used in an initial screen, which can then be followed by a secondary screen and/or careful validation.

The more recent CRISPR–Cas sgRNA libraries were constructed to have greater on-target cleavage efficiency than previous sgRNA libraries, in addition to minimal off-target activity. They have shown consistent on-target cleavage efficiency, therefore reducing the chance of false-negative identifications. For example, in a genome-scale screen selecting for resistance to the toxic effect of thymidine, 11 of the 12 sgRNAs against thymidine kinase 1 (TK1) scored as hits, which indicates that the majority of these sgRNAs were active because TK1 is crucial for mediating thymidine toxicity13. In a more systematic study, 85% of all sgRNA constructs that target essential genes were accurately recalled without false-positive identifications11. Owing to the high efficiency of CRISPR–Cas knockouts, the marked phenotypes that are generated in knockout cells and the reproducibility of CRISPR–Cas screens, these screens have outperformed RNAi library screens for the identification of drug resistance genes14, modulators of protein stability12 and essential genes11,13. In addition to creating gene knockouts, CRISPR–Cas technology has been used to modulate the transcription levels of target genes (Box 2). This approach can be advantageous when studying essential genes because it can decrease gene expression without eliminating it completely. It also enables the role of long non-coding RNAs to be assessed, as small insertions or deletions (indels) do not typically disrupt their biological activity46.

To ensure that the sgRNA library is of sufficient quality, it is important to maintain the complexity of the sgRNA pool when expanding the sgRNA plasmid pool in Escherichia coli, during transfection or transduction of the target cells and during the extraction of genomic DNA from cells for downstream analyses. For example, we consistently found a good sgRNA representation (>99%) when the number of transduced cells was 500-fold higher than the total number of sgRNAs in the library. Furthermore, a low multiplicity of infection (MOI; 0.3) during transduction is advised to ensure that only one integration event takes place per cell47.
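The coverage and MOI guidance above can be turned into a rough estimate of how many cells a screen requires. The following is a minimal sketch rather than a protocol: it assumes that the number of integrations per cell follows a Poisson distribution with a mean equal to the MOI, and the function names and the library size of 80,000 sgRNAs are hypothetical values chosen for illustration only.

```python
import math


def single_integration_fraction(moi: float) -> float:
    """Fraction of transduced cells that carry exactly one integration.

    Assumption: integrations per cell are Poisson-distributed with mean `moi`.
    """
    p_one = moi * math.exp(-moi)   # P(exactly one integration)
    p_any = 1.0 - math.exp(-moi)   # P(at least one integration)
    return p_one / p_any


def cells_to_expose(n_sgrnas: int, coverage: int = 500, moi: float = 0.3) -> int:
    """Cells to expose to the library so that the expected number of
    transduced cells equals `coverage` times the number of sgRNAs."""
    transduced_needed = n_sgrnas * coverage
    p_any = 1.0 - math.exp(-moi)   # expected fraction of cells transduced
    return math.ceil(transduced_needed / p_any)


if __name__ == "__main__":
    library_size = 80_000  # hypothetical library size, for illustration only
    print(f"single-integration fraction: {single_integration_fraction(0.3):.2f}")
    print(f"cells to expose: {cells_to_expose(library_size):,}")
```

Under these assumptions, an MOI of 0.3 leaves roughly 86% of transduced cells with a single integration, and a hypothetical library of 80,000 sgRNAs at 500-fold coverage would require on the order of 1.5 × 10^8 cells at the time of transduction. This illustrates why scaling up is difficult in cells that have a limited capacity to divide.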

Phenotypic selection. Many viruses such as poliovirus or DENV are cytolytic, which enables a straightforward selection of virus-resistant cells in cell viability-based screens. This selection recovers mutant cells that do not support viral entry, translation of the viral genome, replication of the viral genome or virus-induced cell death, but typically not mutant cells that do not support virion assembly and egress. In a pooled screen, in which mutant cells are cultured together, the selection can be extremely stringent because of the requirement that resistant cells survive multiple rounds of infection. Therefore, this screening method identifies genes for which disruption causes marked phenotypes. Strong selection conditions in which >99% of cells die from infection are preferred. Although this high stringency increases the confidence in the candidate genes identified, other genes that have subtler effects on virus infection may be missed. Decreasing the stringency could help to identify these genes. Strategies to achieve this include the use of naturally attenuated virus strains or the use of antiviral compounds during selection. However, fine-tuning of the stringency is not always possible in pooled screens, and arrayed screens may be a valuable alternative.

If the virus is not efficient at inducing cell death, then a longer selection period, multiple rounds of virus challenge and larger sgRNA libraries may help to increase the signal-to-noise ratio. As an alternative strategy, fluorescence-activated cell sorting (FACS)-based selection can be used to study persistent or non-cytolytic viruses (for example, hepatitis B virus (HBV), HIV and AAV). This approach relies on genetically engineered viruses that express a fluorescent reporter or on antibody staining35. FACS-based selection enables the isolation of cells that have low or high levels of virus gene expression, making it possible to simultaneously identify factors that enhance virus infection and factors that inhibit virus infection.

It is also possible to identify host factors that are required at specific stages of the viral life cycle; for example, pseudotyped viruses48,49, viral replicons50,51 and internal ribosome entry site reporters (IRES reporters)52,53 can be used in the virus infection assay to identify host factors that are required for virus entry, genome replication and translation, respectively.

Next-generation sequencing and bioinformatics. After phenotypic selection, genomic DNA is isolated. Uninfected, mutagenized cells are used as control samples (the starting population either collected at day 0 or grown and harvested in parallel with the virus-selected population). At this step, the total amount of DNA template should be sufficiently high to maintain the complexity of the library. The sgRNA integrations are PCR-amplified and sequenced by next-generation sequencing to quantify their relative abundances. The level of sgRNA enrichment in phenotypically selected cells compared with that in unselected cells is determined by comparing the number of reads that map to specific sgRNAs in the different cell populations. To normalize for differences in sequencing depth between populations, the number of reads that map to each specific sgRNA is divided by the total number of reads. Bioinformatic tools can help to determine whether a gene is significantly enriched over background by assessing the level of enrichment of multiple sgRNAs against the same gene. Analysis tools that were developed for RNAi screens, such as RNAi gene enrichment ranking54 (RIGER) and redundant siRNA activity55 (RSA), can be repurposed for this task. More recently, scoring algorithms, such as model-based analysis of genome-wide CRISPR–Cas9 knockout (MAGeCK)56 and STARS44, have been developed to improve the bioinformatic analyses of CRISPR–Cas screen data sets, taking into account the increasing number of sgRNAs that are used per gene.
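The normalization and enrichment steps described above can be sketched in a few lines. This is an illustrative simplification, not the method used by RIGER, RSA, MAGeCK or STARS: the pseudocount, the use of the median log2 fold change as a gene-level score, and all function and sgRNA names are assumptions introduced here.

```python
import math
from collections import defaultdict
from statistics import median


def normalize(counts):
    """Divide each sgRNA read count by the total number of reads."""
    total = sum(counts.values())
    return {sgrna: c / total for sgrna, c in counts.items()}


def log2_enrichment(selected, control, pseudocount=1.0):
    """Log2 ratio of normalized sgRNA abundance, selected versus control.

    A pseudocount (an assumption of this sketch) is added to every raw
    count so that sgRNAs with zero reads do not cause division by zero.
    """
    sgrnas = set(selected) | set(control)
    sel = normalize({s: selected.get(s, 0) + pseudocount for s in sgrnas})
    ctl = normalize({s: control.get(s, 0) + pseudocount for s in sgrnas})
    return {s: math.log2(sel[s] / ctl[s]) for s in sgrnas}


def gene_scores(lfc, sgrna_to_gene):
    """Summarize each gene by the median log2 enrichment of its sgRNAs."""
    per_gene = defaultdict(list)
    for sgrna, value in lfc.items():
        per_gene[sgrna_to_gene[sgrna]].append(value)
    return {gene: median(values) for gene, values in per_gene.items()}


if __name__ == "__main__":
    # Toy read counts, invented for illustration only.
    control = {"A_1": 100, "A_2": 100, "B_1": 100, "B_2": 100}
    selected = {"A_1": 900, "A_2": 700, "B_1": 10, "B_2": 5}
    mapping = {"A_1": "GENE_A", "A_2": "GENE_A", "B_1": "GENE_B", "B_2": "GENE_B"}
    scores = gene_scores(log2_enrichment(selected, control), mapping)
    for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(gene, round(score, 2))
```

In this toy example, both sgRNAs against the hypothetical GENE_A are enriched after selection, so the gene receives a positive score, whereas GENE_B is depleted. Dedicated tools additionally model count variance and assign statistical significance, which this sketch does not attempt.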

Validation of candidate genes and off-target effects. An important step after any genetic screen is the validation of the candidate genes and the consideration of off-target effects. Gene editing at off-target loci has been reported57,58,59, and if these off-target sites are within exons, they have the potential to cause false-positive results. The use of multiple sgRNAs per gene combined with the implementation of sgRNA sequence design rules can help to reduce this risk. In side-by-side comparisons with RNAi-based approaches, CRISPR–Cas screens typically have fewer false-positive identifications11,12,13,44. Nevertheless, a thorough validation of candidate genes is still essential. Individual knockout cell lines should be generated using CRISPR–Cas methods, with each line derived from a single-cell clone. After confirming by genotyping and immunostaining that the gene has been knocked out, the effect of the knockout on virus replication can be measured, and genetic complementation experiments can confirm that the effect was due to the knockout.

Essential genes and genome coverage. It is challenging to identify all of the genes that affect virus replication because a proportion of them are essential for cell growth and viability and will therefore be excluded from downstream analyses. CRISPR–Cas screens and haploid screens have enabled the systematic and comprehensive identification of a core set of 2,000 human genes that are essential for optimal cellular growth and viability13,45,60, corresponding to 10% of human genes. Owing to the essential roles that these genes have in cell physiology, it is challenging to determine whether they directly influence virus replication. The remaining 90% of genes can be tested in genetic screens because they do not affect cell growth or viability; however, this figure is likely to be an underestimate because the list of 'essential genes' includes genes that only moderately affect cell growth and could therefore be included in the screens. For example, in haploid and CRISPR–Cas screens for host factors that are crucial for DENV replication, multiple subunits of the oligosaccharyltransferase complex (OST complex) were identified61 despite the genes that encode these subunits being classified as essential genes60.

Another important consideration is the coverage of the genome. Notably, sgRNA libraries are designed according to the presence of annotated genes in reference genomes (20,000 genes in humans). The increasing number of independent sgRNAs (now at 4–12 per gene) and improved sequence rules for cleavage efficiency make current sgRNA libraries more reliable than early libraries for probing the entire human genome, thus minimizing false-negative results. By contrast, haploid genetic screens rely on retroviral insertional mutagenesis and do not require genome annotation. Retroviral integration is not random and occurs more frequently in actively transcribed chromatin62, which biases the insertions towards genes. Indeed, mapping of insertion sites in gene trap screens revealed insertions in 70% of all annotated genes and 98% of expressed genes63. Recently, more extensive mapping efforts in HAP1 cells showed that >90% of all annotated genes contained insertions, with a median of 525 independent gene trap insertion events per annotated gene60. This high number of knockout events increases the power of identifying signal over noise.

Despite the fundamental differences between CRISPR–Cas and haploid genetic screens, both approaches have been equally powerful in identifying core essential human genes45,60, endoplasmic reticulum-associated protein degradation (ERAD) components64 and host factors that are required for DENV replication61. The high concordance in identified genes between the two different technologies underscores the power and reliability of knockout screens.

Insights from CRISPR–Cas screens

The potential for CRISPR–Cas screens to discover host factors that are crucial for viral pathogenesis is great and may lead to the development of new antivirals65. Several viruses have been studied using CRISPR–Cas screens.

Mosquito-borne flaviviruses. The mosquito-borne flaviviruses include important pathogens such as DENV66. More recently, ZIKV has emerged in Brazil and is spreading at a rapid pace throughout South America67, causing severe congenital abnormalities in the unborn children of mothers who are infected during pregnancy68. The biogenesis and membrane topology of mature flavivirus proteins are complex and involve the translation of a polyprotein at the ER membrane, the co-translational and post-translational insertion of several membrane-spanning hydrophobic helices, and polyprotein cleavage by a viral protease and several host proteases into the mature viral proteins. Although these processes are known, a detailed understanding of the host proteins that are involved is lacking.

CRISPR–Cas screens that were carried out independently using DENV61, WNV69,70 and ZIKV71 have each identified a number of ER proteins that are required for virus replication (Fig. 2). Many of these proteins are involved in the biosynthesis of membrane and secretory proteins, a core function of the ER. In particular, the proteins that were identified have described roles in N-linked glycosylation, ERAD, and signal peptide insertion and processing. Notably, the identification of these proteins was reproduced in replica screens in the same laboratory and in independent screens in different laboratories using different cell lines and different virus strains. There was also a substantial overlap with results from haploid genetic screens. This reproducibility is remarkable and a major advantage of this technology.

Figure 2: Host factors that have been identified by CRISPR–Cas screens as important for infection and replication of viruses in the family Flaviviridae.

The flaviviruses Zika virus (ZIKV), dengue virus (DENV) and West Nile virus (WNV) enter the cell by attachment to cell surface molecules, including heparan sulfate proteoglycans (HSPG) and potentially other protein receptors61,71,77. After uncoating, viral (+)RNA is translated by host ribosomes. The ribosomal subunit 40S ribosomal protein S25 (RPS25) is important for DENV infection and for translation of hepatitis C virus (HCV) RNA, but is dispensable for host mRNA translation61,158. The flavivirus polyprotein is inserted into the endoplasmic reticulum (ER) membrane and cleaved by viral and host proteases, including the host signal peptidase complex70. The viral proteins assemble a replication complex in close association with several ER-resident host protein complexes: the oligosaccharyltransferase (OST) complex, the translocon-associated protein (TRAP) complex and components of the ER-associated protein degradation (ERAD) pathway61,69,70,71. Notably, different flaviviruses have different dependencies on the two distinct OST multiprotein complexes, which contain either an STT3A or an STT3B catalytic subunit. The ERAD-related host factors belong to the classical ERAD complex and the ER membrane protein complex (EMC). HCV enters hepatocytes through the receptors CD81, occludin (OCLN) and claudin 1 (CLDN1)61,159, and the host microRNA miR-122 binds to and stabilizes the 5′ UTR of the HCV RNA61,160. FAD biosynthesis, catalysed by riboflavin kinase (RFK) and FAD synthase (FLAD1), is important for HCV RNA synthesis61. ELAVL1 binds to the 3′ UTR of HCV to circularize the viral genome by interacting with La protein (also known as SSB) and displacing polypyrimidine-tract-binding protein 1 (PTB) to stimulate virus replication61,161. Cyclophilin A (CYPA) is required for HCV replication through its interaction with NS5A61,83. UBE2J1, ubiquitin-conjugating enzyme E2 J1; YFV, yellow fever virus.


CRISPR–Cas technology also provides a reliable way to validate candidate genes and measure the effects of knockouts on virus replication. In contrast to knockdown approaches, such as RNAi, gene knockouts are absolute and do not result in the variable levels of depletion seen with RNAi. This enables a faithful comparison between genes when quantitative assays for virus replication are used, such as quantitative PCR, immunostaining or plaque assays. Remarkably, flavivirus replication was decreased 100–10,000-fold when the most significantly enriched host factors from the screens were knocked out61. This demonstrates that pooled sgRNA screens have the potential to identify host factors that are essential for virus replication.

Moreover, CRISPR–Cas knockout cells can be used to understand the molecular basis of knockout phenotypes and to help identify the stage of the virus life cycle in which the host factor is involved. For example, the OST complex was found to be required for viral RNA synthesis, but not for viral entry and translation61. The OST complex catalyses the N-linked glycosylation of newly synthesized proteins. In mammalian cells, two distinct OST multiprotein complexes are formed, each composed of a catalytic subunit (one of two paralogues, STT3A or STT3B) and accessory subunits72. Both isoforms are individually required for the replication of DENV, as knockout of either STT3A or STT3B resulted in complete abrogation of DENV replication. Other mosquito-borne flaviviruses, including ZIKV, are exclusively dependent on the STT3A isoform for viral RNA replication, which indicates a specific but divergent virus–host interaction (Fig. 2). Surprisingly, the catalytic activity of STT3A and STT3B was dispensable for virus replication, because catalytically inactive mutant proteins were able to restore DENV replication in the knockout cells, which indicates that the OST complex has an unconventional role in DENV replication. The OST complex was found to bind to multiple non-structural viral proteins that form the RNA synthesis complex at the ER61, which suggests that the OST complex acts as a scaffold to coordinate the assembly of a functional DENV RNA replication complex.

Other host factors that were found to be required for flavivirus replication include SEC61A1 and SEC63, which form the translocon channel in the ER membrane; the translocon-associated protein (TRAP) complex, which stimulates co-translational translocation of polypeptides into the ER73; and the signal peptidase complex that cleaves signal peptides in the ER lumen. Knockout of a subset of signal peptidase complex subunits (SPCSs) revealed severe defects in the polyprotein cleavage of multiple flaviviruses. In particular, cleavage of the structural proteins prM and E from the polyprotein was affected, leading to marked defects in the release of virus particles70.

Components of the ERAD pathway were also found to be important for flavivirus replication. This protein quality control mechanism targets incorrectly folded proteins in the ER lumen for retrotranslocation through the ER membrane to the cytosol, in which proteasomal degradation occurs74. Two categories of ERAD components were found in the CRISPR–Cas screens: first, components of the classical ERAD machinery, including SEL1L, derlin 2 (DERL2) and ubiquitin-conjugating enzyme E2 J1 (UBE2J1), which are part of the retrotranslocation complex75; and second, components of the ER membrane protein complex (EMC), an evolutionarily conserved complex that has less-well-understood roles in ERAD76. Knockout of ERAD components led to substantial decreases in viral-RNA accumulation, particle formation and virus-induced cell death for DENV, ZIKV, Japanese encephalitis virus and WNV61,69,70. However, how the ERAD pathway promotes flavivirus replication remains to be fully understood.

It is important to note that in contrast to genetic screens with several other viruses (for example, Ebola virus), screens with WNV and DENV have not been able to identify a specific receptor that is required for viral entry into host cells. This is most probably due to redundancy in entry routes, such that knockout of one virus receptor still leaves cells susceptible through a different route. Indeed, several receptors have been reported for DENV77. Nevertheless, CRISPR–Cas screens have contributed to our understanding of flavivirus biology, revealing a central role for several ER complexes in promoting flavivirus infection.

HCV. Another important pathogen that has been investigated using CRISPR–Cas screens is HCV, which causes chronic liver disease in 160 million infected individuals worldwide78. Whereas mosquito-borne flaviviruses have a dependence on ER proteins, screening with HCV, which is a more distantly related member of the family Flaviviridae, revealed non-overlapping hits, including entry receptors CD81, OCLN and CLDN1, the liver-specific microRNA miR-122 and several RNA-binding proteins and metabolic enzymes61 (Fig. 2). One of the most significant hits was ELAVL1, an RNA-binding protein that is involved in mRNA stabilization79. HCV RNA replication was markedly reduced in ELAVL1-knockout cells, whereas RNA replication for other RNA viruses (for example, DENV and poliovirus) was unaffected. The HCV screens also uncovered an unexpected link between intracellular FAD levels and HCV RNA replication. The enzymes riboflavin kinase (RFK) and FAD synthase (FLAD1), which are involved in the conversion of riboflavin (vitamin B2) to FAD, were found to be crucial for the replication of HCV. Lumiflavin, an inhibitor of cellular uptake of riboflavin, potently inhibited viral RNA replication, which indicates that the modulation of intracellular FAD levels could be explored as an antiviral treatment. Host-targeted antiviral therapeutics may become an effective strategy to control virus replication because they may present a higher genetic barrier for resistant mutants to evolve than virus-targeting antivirals, and they have the potential to inhibit a broader range of viruses65. For example, cyclophilin A (CYPA) is a host factor that is required for HCV replication and also promotes HIV infection80,81,82. CYPA inhibitors have advanced to phase II/III clinical trials for the treatment of HCV infection, and their use is also being explored to treat other viral infections83.

Noroviruses. Human noroviruses are a leading cause of gastroenteritis globally. Although their mechanism of entry and cellular receptor remain unknown, carbohydrates — in particular, the histo-blood group antigens (HBGAs) — have been shown to have a role in human norovirus entry84. Unbiased genetic CRISPR–Cas screens led to the discovery of CD300LF (also known as CLM1) as a proteinaceous receptor for murine norovirus85,86. CD300LF knockout abolished murine norovirus infection in mouse cell lines and in a mouse model of murine norovirus infection. Moreover, the expression of mouse CD300LF in human cells made them susceptible to murine norovirus infection. The discovery of a key host receptor that is necessary and sufficient for the binding of murine norovirus raises the possibility that human noroviruses also require a specific proteinaceous receptor or receptors. The recent development of a more reliable in vitro infection model for human noroviruses87 combined with CRISPR–Cas technology could lead to a better understanding of the entry pathway that is used by human noroviruses and to the development of entry inhibitors.

HIV. To identify host factors that are required for HIV replication, a CRISPR–Cas screen was carried out in a physiologically relevant CD4+ T cell line88. In addition to the T cell surface glycoprotein CD4 and the co-receptor CC-chemokine receptor type 5 (CCR5) that are required for entry of CCR5-tropic viruses, a cell adhesion molecule named CD166 antigen (ALCAM) and two proteins, protein tyrosine sulfotransferase 2 (TPST2) and adenosine 3′-phospho 5′-phosphosulfate transporter 1 (SLC35B2), that are involved in tyrosine sulfation were found to be important for HIV infection. To validate these findings, electroporation was used to introduce Cas9–sgRNA ribonucleoprotein complexes into CD4+ T cells that were isolated from the blood of healthy human donors, and these cells were then challenged with CCR5-tropic HIV. This demonstrates that CRISPR–Cas technology can be used to study host factors in primary cells.

Bacteria, parasites and immune signalling. Genome-scale knockout screens have also been used to uncover immune-regulatory networks89, the pyroptosis pathway90 and host requirements for bacterial pathogenesis91,92,93. CRISPR–Cas screens in Toxoplasma gondii have also identified genes that are essential for the fitness of apicomplexan parasites94. In bacteria, partial knockdowns using CRISPR interference enabled the systematic phenotypic identification of essential bacterial genes in Bacillus subtilis95.

Emerging CRISPR–Cas tools

CRISPR–Cas technology has broad applications in the study of viruses, extending beyond host factor screens. CRISPR–Cas methods are being used to generate both in vitro and in vivo models to study viral pathogenesis, to edit and image viral genomes, in the development of gene drive systems that have the potential to eradicate viral disease vectors, and to advance the development of antiviral therapeutics (Fig. 3).

Figure 3: CRISPR–Cas applications beyond genetic screening.

CRISPR–Cas genome editing enables the generation of in vitro and in vivo models to study viral pathogenesis. The technology is not limited to engineering model organisms, such as mice, fruit flies and roundworms, but can also be applied to non-model organisms, such as pigs, macaques, ferrets, chickens, ticks, bats and mosquitoes. CRISPR–Cas technology is also useful for engineering the genomes of large DNA viruses, such as poxviruses. Catalytically inactive Cas9 (dCas9) proteins that are fused to fluorophores may be useful to track viral nucleic acids in cells162. CRISPR–Cas technology could lead to the development of new approaches to treat virus infections and prevent transmission, including the development of gene drive systems to eradicate viral disease vectors, the direct targeting of viruses to inactivate viral gene expression, the identification of druggable host proteins that are required for virus replication, and the elucidation of the mechanisms of action of antivirals. cccDNA, covalently closed circular DNA; HBV, hepatitis B virus; sgRNA, single-guide RNA.

Generation of in vitro and in vivo models to study viral disease. Traditionally, in vitro systems using cell lines have been invaluable tools to study virus infections. However, these systems have limitations in providing comprehensive insights into host physiology, immunity, pathology and transmission during infection. CRISPR–Cas technology has been used to generate advanced in vitro and in vivo knockout models to study viral pathogenesis, such as primary cells, organoids, induced pluripotent stem cells (iPSCs)96,97,98 and animal models99. CRISPR–Cas methods have expedited the process of generating knockout animal models. In addition to genome engineering of laboratory animals, such as roundworms100,101, fruit flies102,103 and mice104,105, CRISPR–Cas approaches can be applied to non-model organisms, such as mosquitoes106, ticks, bats, pigs107,108, macaques109, ferrets110 and chickens111, which are important vectors or reservoirs of viruses. For example, bats are reservoirs for rabies virus, Nipah virus, Ebola virus and severe acute respiratory syndrome-related coronavirus, whereas mosquitoes transmit DENV, ZIKV, WNV and chikungunya virus104. Ferrets are a suitable animal model to study influenza viruses112. Previously, it was challenging to genetically engineer ferrets, but ferrets that have been genetically engineered using CRISPR–Cas technology have recently been reported, which may substantially broaden the application of the ferret model in the study of influenza virus pathogenesis and transmission110.

CRISPR–Cas tools for studying large DNA viruses. Efficient genetic modification of large viral genomes has been limited by conventional molecular cloning techniques, especially for DNA viruses that belong to the proposed order Megavirales. For example, it has been challenging to edit the genomes of human poxviruses, which range from 130 kb to 375 kb in size. However, CRISPR–Cas technology has been used to efficiently edit the genomes of large DNA viruses, such as vaccinia virus, Epstein–Barr virus and adenoviral vectors113,114,115.

CRISPR–Cas antiviral strategies. There is also potential for the application of CRISPR–Cas technology in the prevention and treatment of diseases by targeting viruses and their vectors. Vector control has been used as a strategy to limit the transmission of vector-borne viruses, including ZIKV, DENV and yellow fever virus. For example, genetically modified, sterile mosquitoes have been introduced into the environment in several attempts to eradicate wild-type mosquito populations that transmit viral diseases116,117,118,119. CRISPR–Cas tools have been used to generate gene drives that have the potential to diminish mosquito populations120,121. Furthermore, CRISPR–Cas technology could be used to treat persistent virus infections, such as infections with HIV, HBV, HCV and herpes simplex virus122,123,124,125. Recently, HBV covalently closed circular DNA (cccDNA), the hallmark of persistent HBV infection, has been successfully targeted in cell culture and in animal models126,127,128,129. In addition, CRISPR–Cas screens can be used to understand the mode of action of antivirals. For example, CRISPR–Cas and short hairpin RNA (shRNA) screens carried out in parallel uncovered the mechanism of action of GSK983, an antiviral drug that may prove effective in the treatment of a wide range of RNA and DNA viruses130,131. GSK983 was found to block virus replication by inhibiting the cellular pyrimidine biosynthesis enzyme dihydroorotate dehydrogenase, thus reducing intracellular levels of nucleotides, which are needed for viral nucleic acid synthesis.
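
To illustrate why gene drives are considered capable of spreading through a mosquito population, the following minimal sketch compares ordinary Mendelian inheritance with a homing-type drive. It is a textbook-style toy model and is not taken from the studies cited above. It assumes random mating, non-overlapping generations and no fitness cost of the drive allele. The function name, the starting frequency and the homing efficiency are hypothetical values chosen only for illustration. Under these assumptions, a heterozygote transmits the drive allele with probability (1 + c)/2 rather than 1/2, so the allele frequency q changes as q' = q + c·q·(1 − q).

```python
def drive_allele_frequency(q0, homing_efficiency, generations):
    """Track the frequency of a drive allele over discrete generations.

    Toy model (illustrative assumptions): random mating, no fitness cost,
    and a homing efficiency c between 0 (Mendelian) and 1 (perfect drive).
    """
    freqs = [q0]
    q = q0
    for _ in range(generations):
        # Heterozygotes pass on the drive allele with probability (1 + c) / 2
        # instead of 1/2, which yields q' = q + c * q * (1 - q).
        q = q + homing_efficiency * q * (1.0 - q)
        freqs.append(q)
    return freqs


if __name__ == "__main__":
    # Hypothetical release: 1% of alleles carry the drive construct.
    mendelian = drive_allele_frequency(0.01, 0.0, 12)
    drive = drive_allele_frequency(0.01, 0.9, 12)
    for gen, (m, d) in enumerate(zip(mendelian, drive)):
        print(f"generation {gen:2d}: Mendelian {m:.3f}  drive {d:.3f}")
```

In this simplified setting, an allele without drive stays at its starting frequency, whereas a drive allele with high homing efficiency approaches fixation within roughly a dozen generations. Real populations would deviate from this because of fitness costs, resistance alleles and population structure, none of which are modelled here.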

Conclusions and perspectives

The repurposing of the CRISPR–Cas system as a genome-engineering tool is starting to transform biomedical research in several areas, including infectious diseases, cancer and gene therapy. This new approach is also being used to gain a better understanding of how viruses exploit their host and to develop new antiviral therapeutics. Since its discovery, CRISPR–Cas technology has already advanced our understanding of the life cycles of noroviruses and flaviviruses. Future screens will undoubtedly shed light on commonalities and differences in how viruses have evolved to exploit and subvert host functions, and may provide potential targets for antiviral therapy. Together with advances in the genetic engineering of animal models using CRISPR–Cas and improvements in field applications such as gene drive systems, these new CRISPR–Cas technologies will help us to tackle current and future viral epidemics.

Continued efforts to develop and enhance CRISPR–Cas systems will expand the toolbox that enables us to gain a greater understanding of complex biological and disease processes. Engineering of Cas nucleases will make DNA and RNA targeting more versatile. For example, Staphylococcus aureus Cas9 is smaller than most Cas9 nucleases that have been used to date, making in vivo delivery of Cas9–sgRNA complexes more feasible132,133; and the CRISPR-associated endoribonuclease C2c2 could lead to the development of new RNA-targeting tools134,135. Moreover, CRISPR–Cas systems can be combined with other technologies to develop more sophisticated screening approaches. Combining CRISPR–Cas technology with advances in single-cell profiling could lead to better measurements of virus replication dynamics136,137,138,139,140. Furthermore, a screening strategy that investigates epistatic relationships (for example, by combining haploid and CRISPR–Cas mutagenesis60) will enable the systematic analysis of functional interdependencies between the host genes that are most crucial for virus infection. We expect that refining and expanding the genetic toolbox for manipulating host cells will lead to novel insights into the 'arms race' between viruses and their hosts.