Introduction

Adaptive immunity was once thought to be an exclusive feature of vertebrates1. However, the discovery that prokaryotes also possess a form of targeted immunity has led to the development of technologies that could radically change the way that human diseases are treated. The breakthrough began in the early years of bacterial genome sequencing when researchers noticed ‘an unusual structure’ that contained short, repetitive DNA sequences in the Escherichia coli chromosome2. Subsequent studies identified more of these structural motifs, termed clustered regularly interspaced short palindromic repeats (CRISPRs), in other prokaryotes3, and in 2005 the sequences between the repeats were analysed and found to be exact matches to phage genomes4. Further analyses of the regions upstream and downstream of these repeat-spacer loci identified a group of coding genes that often co-localized with the CRISPR arrays. The proteins encoded by these genes were named CRISPR-associated (Cas) proteins5,6. A 2007 study reported that yogurt-fermenting bacteria (Streptococcus thermophilus) expressing Cas proteins and a CRISPR array containing spacers that matched a phage genome were protected from infection by the phage7. Notably, a single protein, CRISPR-associated protein 9 (Cas9), was identified as being solely responsible for RNA-mediated DNA cleavage in certain bacteria7.

The mechanism of this CRISPR-mediated phage protection was characterized through detailed biochemical work8. The CRISPR array is transcribed as a single RNA and then processed at the repeats into shorter CRISPR RNAs (crRNAs) that each contain a single spacer. The crRNAs hybridize with a small trans-activating CRISPR RNA (tracrRNA) and can then be recognized and bound by Cas9 to create a ribonucleoprotein (RNP) complex. The RNP complex associates with a phage genome, searching for sequences that match the spacer encoded on the crRNA. Once homology is found, Cas9 acts as a nuclease, creating a double-strand break (DSB) by cutting the DNA and thereby inhibiting the phage life cycle.
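The processing and search steps described above can be illustrated with a toy model. The sketch below is only an illustration under simplifying assumptions: sequences are written in the DNA alphabet, the array is assumed to begin and end with a repeat, only the forward strand is searched, and the tracrRNA and Cas9 binding steps are not modelled. The function names and example sequences are hypothetical.

```python
def extract_spacers(array: str, repeat: str) -> list:
    """Split a CRISPR array at its repeats and return the spacers in order."""
    return [piece for piece in array.upper().split(repeat.upper()) if piece]


def find_matches(genome: str, spacers: list) -> dict:
    """Map each spacer to the index of its first exact match in the genome.

    A value of -1 means the spacer has no exact match.
    """
    genome = genome.upper()
    return {spacer: genome.find(spacer) for spacer in spacers}


# Hypothetical toy sequences, chosen only for illustration.
REPEAT = "GTTTTAGAGC"
ARRAY = REPEAT + "AAACCCGGGT" + REPEAT + "TTGACCATGA" + REPEAT
PHAGE = "CCCC" + "TTGACCATGA" + "GGGG"

spacers = extract_spacers(ARRAY, REPEAT)
matches = find_matches(PHAGE, spacers)
print(spacers)   # the two spacers, in array order
print(matches)   # only the second spacer matches the toy phage genome
```

In this toy case, each processed crRNA carries one spacer, and only the spacer with an exact match in the phage sequence would direct cleavage.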

This simple mechanism was immediately recognized as a promising tool for editing DNA and curing disease. To simplify CRISPR–Cas9 and make it more amenable to gene editing, the crRNA and tracrRNA were fused into a single guide RNA (sgRNA) to create a two-component system: the Cas9 protein creates the DSB and the sgRNA guides the nuclease to a user-defined genomic site9 (Fig. 1). In mammalian cells, the system was first used to harness natural DNA repair mechanisms to perform gene editing via the more efficient non-homologous end joining (NHEJ) and less efficient homology-directed repair (HDR) processes10. NHEJ leads to error-prone indel (insertion and deletion) formation, whereas HDR is often a more desired therapeutic outcome owing to its precise manner of editing. Because NHEJ is the predominant pathway for repairing CRISPR-induced DSBs, most early efforts focused on knocking out mutant genes that have harmful effects in monogenic Mendelian diseases. However, many diseases cannot be treated with a simple gene knockout and require more nuanced genome engineering11,12,13,14.

Fig. 1: The CRISPR–Cas9 system.

a | CRISPR–Cas9 evolved as a prokaryotic adaptive immune system to protect against phages and other mobile genetic elements. The prokaryotic genome encodes a CRISPR array that contains spacers — short pieces of DNA that have exact homology to the genome of the invading pathogen — separated by repeats. Once transcribed, the array is processed into short CRISPR RNAs (crRNAs), each containing one spacer. The crRNAs duplex with trans-activating CRISPR RNAs (tracrRNAs) to create the secondary structure needed to interact with Cas9 and form a ribonucleoprotein (RNP) complex. In prokaryotes, the Cas9 RNP surveys the cell and binds to the phage genome. Cas9 cuts the phage DNA, creating a double-strand break (DSB) and disrupting the pathogen’s life cycle. b | To import CRISPR–Cas9 into other organisms or cells, the crRNA and tracrRNA are fused into a single guide RNA (sgRNA) that encodes a spacer targeting the genome at a defined site. The sgRNA together with Cas9 can be delivered as DNA via a viral vector or as RNA or protein via a lipid nanoparticle. In mammalian cells, Cas9 RNP creates a DSB and induces DNA repair pathways to generate nucleotide insertions and deletions (indels), leading to a gene edit that can potentially be used to treat disease. AAV, adeno-associated virus.

In this Review, we focus on the applications of CRISPR to potentially treat diseases that cannot be overcome by inducing frameshifts or premature stops in coding genes. We provide an overview of Cas protein engineering and CRISPR systems beyond Cas9 that create a toolbox to engineer the human genome. We then detail how each of these tools might be uniquely leveraged to create new therapies for diseases that have yet to be cured by other forms of medicine.

CRISPR gene editing

Mammalian cells have evolved a pathway to repair DSBs by ligating damaged strands predominantly through NHEJ15,16. During this DNA repair process, nucleotides are inserted or deleted (indels), leading to nearly random mutations in the ligated DNA sequence and creating a permanent edit to the genome. Decades of research developing zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) demonstrated the ability to harness the NHEJ mechanism to overcome genetic diseases in mammalian cells17,18. A disease-driving locus is targeted by engineered nucleases to create indels that result in gene knockout via frameshift mutations or gene repair via a random indel correcting the mutation. Generating ZFNs and TALENs to target a precise locus requires laborious design, build and test cycles to identify amino acid substitutions that selectively bind to a desired genomic sequence. As DNA binding is difficult to accurately predict from amino acid changes, targeting a precise genomic location can be challenging19.
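The link between indels and gene knockout comes down to simple arithmetic on the reading frame. As a minimal sketch (the function names are hypothetical, and the model ignores effects such as disrupted splice sites or in-frame loss of essential residues), an indel shifts the reading frame whenever the net change in length is not a multiple of three:

```python
def net_length_change(inserted: str = "", deleted: str = "") -> int:
    """Net change in sequence length caused by an indel."""
    return len(inserted) - len(deleted)


def is_frameshift(inserted: str = "", deleted: str = "") -> bool:
    """True if the indel shifts the downstream reading frame."""
    return net_length_change(inserted, deleted) % 3 != 0


print(is_frameshift(deleted="A"))       # 1 bp deletion
print(is_frameshift(deleted="ACG"))     # 3 bp deletion keeps the frame
print(is_frameshift(inserted="GG"))     # 2 bp insertion
```

Because NHEJ outcomes are nearly random, a fraction of edited alleles carry in-frame indels, which may leave the protein partly functional.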

The simplicity and predictability of the CRISPR targeting mechanism transformed gene editing from a complicated protein engineering problem to an RNA coding problem, instantly making CRISPR an attractive tool for basic research and clinical application. This advance resulted in an explosion of CRISPR-related publications and quickly led to CRISPR-based therapies being used in preclinical studies and early clinical trials directed towards a multitude of well-defined Mendelian disorders11,12,13,14.

A few CRISPR-based therapies directed towards monogenic disorders have reached clinical trials. For example, hereditary transthyretin amyloidosis (hATTR), a rare, fatal neuropathy that affects 50,000 people worldwide, is characterized by a point mutation in the coding sequence of the transthyretin (TTR) gene that leads to destruction of the peripheral nervous system20. This mutation induces protein misfolding, leading to oligomerization of transthyretin into fibrils that accumulate in the extracellular matrix and disrupt normal cell functions. A nanoparticle-based therapy, NTLA-2001, that encapsulates mRNA that encodes Cas9 and an sgRNA that targets TTR was developed to treat this disease21. When delivered to patients, the Cas9–sgRNA RNP creates a DSB within the coding sequence of TTR, resulting in a frameshift mutation that silences the mutant gene. The early results of a phase I clinical trial suggest that NTLA-2001 can dramatically reduce the expression of TTR, which could be highly beneficial for reducing symptom progression in patients with hATTR22.

In preclinical models, CRISPR-based therapies have been used to leverage gene knockout for many indications including the treatment of cancer, metabolic disorders and neurological disorders11,12,13,14. However, most diseases are more complex than monogenic disorders and cannot be fixed by simply editing a mutated allele. To treat these diseases, point mutations need to be precisely corrected, transcription must be carefully tuned to rescue gene dosage or more nuanced editing of non-coding regions must be considered. This next generation of therapies will use novel CRISPR tools and methods, moving the field from gene editing to a broader concept of genome engineering.

The Cas toolbox

Since the initial discovery of Cas9 as a mammalian gene editor, two key advances have enabled expansion of CRISPR technology to diseases with complex drivers: importing CRISPR systems that use other Cas proteins into mammalian cells and engineering Cas molecules to enhance their functionality. Together, the available Cas molecules comprise a set of genome engineering tools that create opportunities to cure diseases beyond the limitations of wild-type Cas9. With this large toolbox, a suitable CRISPR tool can be chosen to meet the needs of the specific disease, instead of limiting potential therapies to conform to the existing capabilities of Cas9.

Naturally occurring Cas proteins

Two distinct classes of CRISPR system exist, class I and class II (Fig. 2). Class II Cas proteins, including Cas9, have their targeting and nuclease functions encoded in a single protein, spanning a large group of RNA-guided nucleases that have evolved in numerous prokaryotic species. The two most widely used Cas9 proteins were discovered in Streptococcus pyogenes (SpCas9) and Staphylococcus aureus (SaCas9). Evolution in these different environments endowed these proteins with unique traits that need to be considered when applying them as a therapy8. For example, all known DNA-targeting Cas nucleases require a targeted locus to be flanked by a specific sequence called a protospacer adjacent motif (PAM). PAMs are short segments of DNA not encoded on the crRNA that a Cas protein must recognize to begin the process of DNA melting and target binding. SpCas9 requires a flanking NGG PAM whereas SaCas9 recognizes an NNGRRT PAM. In addition to the efficiency and specificity of DNA cutting, the PAM sequence is an important factor because it determines the genomic space that can be targeted by a particular Cas protein. For example, the PAM of SpCas9 is shorter and occurs more frequently in the genome than the PAM of SaCas9, making it easier to find a target site for gene editing. The size of the Cas protein is another crucial factor that must be considered. The coding sequence of SaCas9 (3.2 kb) is significantly smaller than that of SpCas9 (4.1 kb), making SaCas9 more amenable for packaging in gene delivery vectors such as adeno-associated viruses (AAVs) that have an ~4.7 kb packaging limit.
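The effect of the PAM requirement on targetable sites can be sketched as a simple sequence scan. The example below is a minimal illustration, not a guide design tool: it assumes a PAM located immediately 3′ of the protospacer, scans only the forward strand, and uses a default protospacer length of 20 nucleotides, which is an assumption not stated in the text. The function names and test sequences are hypothetical. IUPAC codes are expanded as N = any base and R = A or G.

```python
import re

# IUPAC nucleotide codes needed for the PAMs discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "V": "[ACG]"}


def pam_regex(pam: str) -> str:
    """Translate a PAM written with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_targets(seq: str, pam: str, spacer_len: int = 20) -> list:
    """Return (start, protospacer, pam) for forward-strand sites with a 3' PAM.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_regex(pam)}))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


# Hypothetical test sequences: one protospacer followed by different PAMs.
PROTOSPACER = "ACGTACGTACGTACGTACGT"
print(find_targets(PROTOSPACER + "TGG", "NGG"))        # SpCas9-style PAM
print(find_targets(PROTOSPACER + "TTGAAT", "NNGRRT"))  # SaCas9-style PAM
print(find_targets(PROTOSPACER + "TTGAAT", "NGG"))     # no NGG site here
```

Running such a scan over a region of interest with each PAM gives a rough sense of why a shorter, less constrained PAM yields more candidate sites.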

Fig. 2: Natural Cas systems.

Mining the prokaryotic metagenome has uncovered numerous Cas systems, each with their own unique capabilities that can be leveraged for therapeutic genome engineering. These systems are categorized into two classes: class I systems, which perform their nucleic acid-targeting and nuclease activity as multiple proteins, and class II systems, which have both functions encoded on one protein. a | Type I-E, also known as Cascade, is a class I double-stranded DNA (dsDNA) nuclease. The ability of type I-E to handle longer guide RNAs (gRNAs) means that it can target genomes with higher specificity than other Cas systems35. b | Type I-F can interact with TniQ, part of a transposase complex, and is capable of targeted transposition, inserting entirely new DNA sequences into the genome37. c | Type II-A, also known as Cas9, is the first CRISPR–Cas system to be used therapeutically in humans. Cas9 can use single guide RNA (sgRNA) and typically has high GC protospacer adjacent motifs (PAMs)10. d | Type V-A, also known as Cas12a, typically has high AT PAMs and therefore has access to different genomic regions than Cas9. Cas12a can process multiple CRISPR RNAs (crRNAs) from a single transcript, enabling facile multiplexed DNA targeting25. e | Type V-F, also known as Cas12f, is an extremely small nuclease (1.4–1.6 kb), which makes it more amenable to viral packaging than other Cas systems. Cas12f has high AT PAMs29. f | Type VI-D, also known as Cas13d, targets single-stranded RNA (ssRNA), enabling transcriptome modification, and has no PAM requirement34. ssDNA, single-stranded DNA.

The observation that class II systems can possess unique features prompted researchers to search the metagenome space to find new CRISPR proteins. The repetitive, palindromic feature of natural prokaryotic CRISPR loci was used to identify Cas genes from sequencing data of prokaryotes and establish a landscape of potential gene-editing proteins23,24. This work led to the discovery of Cas12a (originally called Cpf1), which can generate DSBs in the human genome25. The PAM of Cas12a (TTTV) differs from those of SpCas9 and SaCas9, enabling targeting of new genomic locations, and the coding sequence of Cas12a is smaller than that of SpCas9 (Lachnospiraceae bacterium Cas12a is ~3.7 kb). Most importantly, Cas12a can process a CRISPR array into individual crRNAs, enabling facile multiplexed targeting through expression of multiple crRNAs from a single transcript. By contrast, Cas9 multiplexing requires each sgRNA to have its own promoter, making expression of multiple sgRNAs difficult. This finding catalysed the discovery and characterization of new Cas12 proteins including Cas12b26, Cas12c27, Cas12d (previously CasY)28, Cas12e (previously CasX)28, the hypercompact Cas12f (previously Cas14, approximately 1.4–1.6 kb)29, Cas12g27, Cas12h27, Cas12i27 and Cas12j (previously CasΦ)30. Some of these systems (Cas12b31, Cas12e32, Cas12f33 and Cas12j30) have shown promise as gene editors in human cells.
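The size comparisons quoted in this section can be turned into a quick packaging calculation. The sketch below uses only the approximate coding sequence sizes and the ~4.7 kb AAV limit given in the text (with the upper end of the Cas12f range), and simply reports how much capacity would remain for everything else in the vector. The names are hypothetical, and the space needed for promoters, guide RNA cassettes and other elements is not modelled.

```python
AAV_LIMIT_BP = 4700  # approximate AAV packaging limit quoted in the text

# Approximate coding sequence sizes quoted in the text, in base pairs.
CODING_SIZE_BP = {
    "SpCas9": 4100,
    "SaCas9": 3200,
    "LbCas12a": 3700,
    "Cas12f": 1600,  # upper end of the 1.4-1.6 kb range
}


def remaining_capacity(nuclease: str, limit_bp: int = AAV_LIMIT_BP) -> int:
    """Base pairs left in the vector after the nuclease coding sequence."""
    return limit_bp - CODING_SIZE_BP[nuclease]


for name in CODING_SIZE_BP:
    print(f"{name}: {remaining_capacity(name)} bp remaining")
```

On these figures, SpCas9 leaves about 600 bp for all other components, whereas a hypercompact Cas12f leaves roughly 3.1 kb, which is the practical reason smaller effectors are attractive for viral delivery.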

In 2018, a group of class II proteins known as Cas13, which can bind and cleave single-stranded RNA, was discovered34. The Cas13 mechanism results in transcript knockdown akin to RNA interference (RNAi), destroying specific mRNAs and thereby enabling targeted changes to the transcriptome. Furthermore, many Cas13 proteins can flexibly target the transcript without being restricted by the need for a specific flanking sequence. This feature makes Cas13 a versatile tool for changing the phenotype of a cell without creating heritable changes to the genome.

Class I CRISPR–Cas systems segregate their targeting and nuclease functions into multiple proteins35. For example, the Cascade complex contains multiple subunits of Cas5, Cas6, Cas7, Cas8 and Cas11 that bind crRNA and direct the complex to target DNA. The complex further recruits Cas3 to perform the nuclease function. The large, multi-component nature of class I systems imposes a challenge for delivery and expression that reduces their utility in human cells. However, the longer crRNA (and therefore potential for increased specificity) and the vast diversity of class I systems (more than 80% of known CRISPR systems belong to class I) make them an attractive option for gene editing. For example, Cascade has been used to create long-range genomic deletions in human embryonic stem cells36.

Multiple Cas proteins that are capable of targeted genomic insertions have also been discovered. For example, a class I system (type I-F) and a class II system (Cas12k) can knock in DNA fragments to a specific site by CRISPR-mediated recruitment of transposition machinery. This ability to create targeted insertions has been demonstrated both in vitro and in prokaryotic hosts37,38. Although Cas-mediated transposons have yet to be imported into human cells, these and the other Cas systems described demonstrate the vast diversity and biological functions of CRISPR systems.

Engineered Cas proteins

As CRISPR evolved in prokaryotes24, most Cas systems do not perform optimally when expressed in a more complex genomic environment in human cells, resulting in poor editing efficiency or specificity. Advances in protein engineering enabled the development of enhanced Cas proteins using techniques such as structure-guided mutations, directed evolution and phage-assisted evolution39,40. For example, structure-guided engineering of the DNA-binding pocket of Cas12a and Cas12f increased indel frequency, resulting in more efficient human genome editors33,41,42. Notably, Cas12f has been engineered to generate a hypercompact class of Cas effectors (~1.4–1.6 kb) that are more amenable than other Cas proteins for in vivo delivery and expression33.

To expand Cas applications beyond genome editing, we removed the catalytic activity of SpCas9 to generate a nuclease-dead version of the protein termed dCas9 (ref.43). This engineering converts Cas9 from an RNA-guided nuclease into an RNA-guided binding protein. In E. coli, targeting dCas9 to a coding sequence or its promoter did not cut the gene but inhibited transcription by blocking RNA polymerase. This approach, termed CRISPR interference (CRISPRi), has profoundly changed the way that CRISPR is used and catalysed a series of CRISPR technologies for gene editing and gene regulation43.

Translating CRISPRi into mammalian cells by targeting dCas9 to a coding sequence is typically not sufficient to block transcription. To achieve gene knockdown in mammalian cells, dCas9 is fused with a repressor domain, for example, the Krüppel-associated box (KRAB), to induce local gene repression when brought in proximity to a specific locus44 (Fig. 3a). Here, dCas9 targets a precise genomic location, localizing the fused KRAB domain to silence gene expression only where the dCas9–KRAB fusion protein is bound. Since this discovery, numerous fusions have been generated, creating an expanded toolbox for epigenome engineering. Nuclease-dead Cas proteins have been fused to transcriptional activators to upregulate specific genes in a method termed CRISPR activation (CRISPRa). The first example of CRISPRa fused VP64, four tandem repeats of the herpes simplex virus VP16 domain that induces transcription, to dCas9 and targeted regions proximal to promoters to upregulate specific genes in mammalian cells44. Subsequently, dCas9 was fused with a myriad of transcriptional activator domains including RTA, VP64, HSF1 and p65, to enable highly specific gene upregulation in multiple cell types44,45,46,47 (Fig. 3b). Epigenetic DNA-modifying domains including the DNA methylation domains DNMT3A and DNMT3L, as well as the DNA demethylation domain TET can also be fused to dCas9. In addition, histone modifiers that write H3K27 acetylation or methylation, H3K4 methylation, H3K9 methylation or H3K79 methylation, can be fused individually or in combination to write or erase changes in the histone epigenome. These CRISPR-mediated epigenetic modifications can be used to reprogramme the transcriptome and achieve novel functions such as prolonged targeted gene silencing or activation compared with traditional CRISPRi or CRISPRa48 (Fig. 3c–e).

Fig. 3: Genome engineering using engineered Cas proteins.

Catalytic residues in Cas proteins can be mutated to render the protein nuclease dead (dCas). Fusing dCas to other protein domains endows novel functionality that can be targeted to precise locations on the human genome. a | dCas fused to transcriptional repressors such as the Krüppel-associated box (KRAB) generates CRISPR interference (CRISPRi), which is capable of targeted gene downregulation lasting for as long as the fusion is present44. b | Fusion of dCas to transcriptional activators such as VP64 generates CRISPR activation (CRISPRa), which enables precise gene upregulation while the fusion protein is bound45. c | Targeted DNA methylation enabled by fusion of DNA methyltransferases such as DNMT3 to dCas results in long-term and even heritable gene repression that is independent of the fusion protein being bound to the genome48. d | Natural or engineered DNA methylation can be removed by fusing methylcytosine dioxygenases such as TET to dCas48. e | Other targeted epigenetic marks can be generated by fusing histone methyl or acetyl transferases such as p300 to dCas to modify histone residues leading to long-term, stable upregulation or downregulation of targeted genes48. f | Fusing dCas or nickase Cas (nCas), which creates single-stranded DNA breaks, to the cytosine deaminase APOBEC1 converts local cytosines into uracil, which is later converted into thymine52,53,54,55,56. g | Fusion of adenosine deaminases such as TadA to dCas or nCas leads to the local conversion of adenosine into inosine, which is resolved as guanine57,58,59,60. h | Prime editing uses a long prime editing guide RNA (pegRNA) that binds to a nicked DNA strand. This creates the starting conditions for the reverse transcriptase (RT) fused to nCas9 to write the genetic information encoded on the pegRNA directly on the genome. The pegRNA can be designed to enable use of prime editing to create large insertions or deletions63.

Other CRISPR–Cas systems can also be mutated into dCas systems to take advantage of their unique properties. For example, the crRNA processing feature of dCas12a enables highly multiplexed CRISPRa or inducible and logic-gated gene regulation49,50. dCas13 fused to RNA-modifying domains such as the ADAR deaminase domain that converts A into I enables targeted coding or epitranscriptome changes to study their effects on cellular phenotype51.

Beyond transcriptional regulation or epigenetic engineering, Cas proteins can be fused to nucleotide modifiers to enable precise gene editing. Although DSB-mediated indel formation is useful for gene knockout, the random nature of indels makes this mechanism difficult to harness for precise mutation correction. To overcome this limitation, DNA base-editing enzymes have been fused to dCas and to a mutated nickase version of Cas (nCas) that generates a single-stranded break. For example, when fused to dCas9, the cytidine deaminase enzyme APOBEC1 makes a targeted C-to-U conversion on the DNA strand that is not bound by sgRNA. This U is read as a T upon DNA replication, creating a precise C-to-T mutation52 (Fig. 3f). Cytosine base editor (CBE) systems have been greatly improved by switching dCas9 for nCas9, fusing uracil DNA glycosylase inhibitor (UGI), mutating or homologue swapping APOBEC1 and linker optimization52,53,54,55,56.

Cas9 fused to an E. coli TadA that was optimized via protein evolution can deaminate A into I to create an A-to-G conversion57 (Fig. 3g). Adenosine base editors (ABEs) have been improved through subsequent rounds of TadA-directed and phage-assisted evolution, addition of extra TadA domains, and improved codon usage and nuclear localization57,58,59,60. To generate simultaneous C-to-T and A-to-G conversions at a single site, both cytosine and adenosine deaminases can be fused to nCas9 (refs.61,62).
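The outcome of base editing can be sketched as a substitution applied to every eligible base inside an editing window. The example below is a simplified illustration only: the window of protospacer positions 4 to 8 is an assumed default that is not taken from the text, real editing windows depend on the specific editor, and editing is treated as fully efficient. The function name and the example sequence are hypothetical.

```python
# Net sequence change produced by each editor class, as described in the text:
# CBE converts C to T (via U), ABE converts A to G (via I).
CONVERSIONS = {"CBE": ("C", "T"), "ABE": ("A", "G")}


def base_edit(protospacer: str, editor: str = "CBE", window: tuple = (4, 8)) -> str:
    """Apply the editor's conversion to every eligible base in the window.

    Positions are 1-based and inclusive. The default window is an assumption
    made for illustration, not a property of any particular editor.
    """
    source, product = CONVERSIONS[editor]
    start, end = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if start <= position <= end and base == source:
            edited.append(product)
        else:
            edited.append(base)
    return "".join(edited)


site = "GACCATCAGTACGTTAGCAT"  # hypothetical protospacer
print(base_edit(site, "CBE"))  # both Cs in the window are converted
print(base_edit(site, "ABE"))  # both As in the window are converted
```

In this toy site, two eligible bases fall inside the window for either editor, so both are converted even if only one of them was the intended target.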

Base editors can create specific mutations at regions close to the Cas binding region but are not sufficient to make multiple base pair insertions, deletions or mutations beyond A-to-G or C-to-T. Furthermore, if many As or Cs are present around the target site, they might inadvertently become mutated, creating off-target effects. To overcome this limitation, a method to insert longer stretches of DNA has been developed, termed prime editing63 (Fig. 3h). Here, nCas9 is fused to a reverse transcriptase and the sgRNA is elongated at the 3′ end to encode both the desired insertion or deletion sequence and a priming region complementary to the nicked DNA (collectively called prime editing guide RNA (pegRNA)). The pegRNA primes the nicked strand and the reverse transcriptase converts the desired edit into DNA directly on the genome, creating a targeted insertion or deletion in place. Prime editing was originally used to make all 12 base-to-base conversions, insertions of up to 44 bp and deletions from 1 to 80 bp. Using two pegRNAs, larger gene replacement or excision strategies can be created, enabling targeted insertion of sites such as Bxb1 recombinase sites for large genomic insertions (up to several kilobase pairs)64.

Targeting the non-coding genome

In most cases, the ablation of genes through NHEJ-induced indels to cause a frameshift or to introduce a premature stop codon does not sufficiently overcome drivers of diseases. Many diseases are driven by mutations in the non-coding regions of the genome, such as promoters or enhancers that regulate transcription or introns that affect mRNA splicing and protein translation. These non-coding sites present new therapeutic opportunities to manipulate gene expression instead of changing the primary sequence of the gene. Most of the human genome is non-coding, and CRISPR can make genetic and epigenetic changes in this vast space to affect gene regulation.

Introns

mRNAs contain regulatory elements that can modulate translation but do not themselves code for protein. Mutations in intronic regions can result in improper splicing of pre-mRNA, leading to mistranslated proteins that can lead to disease. For example, a rare form of childhood blindness known as Leber congenital amaurosis type 10 (LCA10) is characterized by an intronic point mutation in CEP290 that results in dysfunctional photoreceptors and ultimately retinal degeneration65. This mutation leads to aberrant splicing in the CEP290 pre-mRNA that introduces a premature stop site, resulting in a truncated protein and loss of function. The addition of CEP290 complementary DNA (cDNA) as a gene therapy could correct gene dosage and in theory cure the disease; however, the large size of the CEP290 coding sequence (the protein has 2,479 amino acids, corresponding to roughly 7.4 kb of coding DNA) makes it impossible to package into viral vectors such as AAVs and therefore prevents its use as a gene therapy. To overcome this problem, an AAV5-based therapy known as EDIT-101 that encapsulates SaCas9 and two sgRNAs targeting genomic locations upstream and downstream of the intronic CEP290 point mutation is being developed66. The two sgRNAs enable cutting around the mutation to induce its removal or inversion and thereby restore normal splicing of CEP290 pre-mRNA (Fig. 4a).

Fig. 4: Editing non-coding regions of the genome.

CRISPR technologies can be used to edit non-coding regions of the human genome. a | Proper intron splicing can be restored by creating targeted cuts around an intronic mutation (Mut). These cuts lead to deletion or inversion of the mutated region and thereby correct the mRNA66,68,69,70. b | Entire exons carrying deleterious mutations can be skipped by modifying the flanking introns72,73,74. c | The expression of a gene can be indirectly increased by knocking out the expression of its inhibitory transcription factor, for example, by indel formation in the enhancer region of the transcription factor82,83,84. d | Single or dual guide RNA (gRNA) strategies that target Cas proteins around a transcriptional start site cause the region to be deleted, leading to a reduction in gene expression86,87,88. e | Editing long non-coding RNA (lncRNA) can modulate gene dosage. For example, in settings in which a lncRNA silences a paternal allele, leaving only a mutant maternal allele, the generation of indels in the lncRNA can disrupt its function, leading to the therapeutic expression of the wild-type paternal allele91,92. f | To increase the concentration of a therapeutically relevant gene, Cas proteins can be used to form indels in the microRNA (miRNA) or the 3′ untranslated region, leading to a reduction in miRNA binding and amelioration of the resulting RNA interference101,102,103,104,105,106,107,108.

Intron targeting has also been used in preclinical studies to correct the genetic blood disorder β-thalassaemia, which is caused by a myriad of mutations in HBB. One of the most common disease-causing HBB mutations, particularly in Southeast Asian populations, is a point mutation in intron 2 (IVS2-654) that alters splicing67. Cas9 has been targeted to the aberrant intron to restore HBB gene expression in induced pluripotent stem cells (iPSCs) in vitro, creating a potential avenue for cell therapy through haematopoietic stem cell replacement68. Similarly, CRISPR–Cas9 targeted to intron 16 of LZTR1 can overcome the disease phenotype associated with Noonan syndrome-associated cardiomyopathy in iPSC-derived cardiomyocytes in vitro69. This type of intronic targeting has also been used in vitro to correct a rare mutation in CFTR (affecting about 2,000 patients worldwide) that leads to cystic fibrosis70.

CRISPR–Cas can be used to delete entire exons by taking advantage of a process known as exon skipping. Here, Cas9 is targeted to introns flanking a mutated exon to alter pre-mRNA processing and cause the aberrant exon to be spliced out, thus maintaining an intact open reading frame but removing the mutation (Fig. 4b). This approach has been applied in vitro to Duchenne muscular dystrophy, a muscle-wasting disease characterized by a mutation in the dystrophin gene (DMD) that results in the deletion of exon 50 (ref.71). To correct this deletion, exon skipping can be harnessed by targeting the splice acceptor site at exon 51, the introns flanking exon 51 or the introns surrounding exons 45–55 of DMD in human myoblasts72,73. In all cases, the loss of exon 51 restores the DMD reading frame and overcomes the disease pathology. Exon skipping by dual targeting CRISPR–Cas9 can also be used to target fusion oncogenes. This approach was used to create a cancer-specific therapy that controlled tumour burden in a mouse xenograft model74.

Untranslated regions

mRNAs contain regulatory elements in the untranslated regions (UTRs) that flank their translational start and stop sites. These regions perform a myriad of regulatory functions such as initiating and terminating translation, altering RNA trafficking and stability, interacting with RNA-binding proteins or microRNAs (miRNAs), and controlling post-transcriptional modifications. Targeting of UTRs to overcome disease pathologies has been demonstrated in vitro. For example, expansion of CTG repeats within the 3′ UTR of DM1 protein kinase (DMPK) from 5–38 repeats in healthy cells to more than 50 repeats in mutated cells causes a neuromuscular disorder known as myotonic dystrophy type 1 (DM1)75. Targeted deletion of the CTG repeats by CRISPR–Cas led to loss of the aberrant mRNA transcripts in DM1 neural stem cells76.

In iPSCs from patients with Duchenne muscular dystrophy, upregulation of the dystrophin-related gene utrophin (UTRN) can circumvent the effects of DMD loss of function, but miRNAs that bind to UTRN destroy the transcript. Cas9 targeting of the miRNA binding sites led to upregulation of the utrophin protein and overcame the disease phenotype77 (Fig. 4f).

Expression of the huntingtin gene (HTT) leads to the neural degeneration that is associated with Huntington disease. To knock out this gene in a mutation-independent manner, Cas9 can be targeted to the HTT 5′ UTR, leading to improper maturation of the transcript and reducing the expression of the disease-causing allele78. Similarly, disruption of a 52 bp regulatory element in the 3′ UTR of amyloid precursor protein led to a substantial reduction in the disease-inducing amyloid-β peptide (Aβ) in a mouse model of Alzheimer disease79.

Cis-regulatory elements

Cis-regulatory elements, including promoters, enhancers and silencers, are important regulatory regions that modulate coding genes to control and alter their expression. Creating indels in these regions disrupts their function and can be used to correct gene dosages that drive disease. For example, sickle cell disease (SCD) and transfusion-dependent β-thalassaemia (TDT) are monogenic diseases caused by mutations in HBB80,81 that result in malformation and loss of function of HBB protein. The γ-globin genes (HBG1 and HBG2) that encode fetal haemoglobin (HbF) have the same function as HBB but are silenced by the transcription factor BCL11A during maturation into adulthood. An ex vivo gene-editing technique, CTX001, has been developed that reduces expression of BCL11A to upregulate HBG in autologous haematopoietic stem and progenitor cells82. This technique uses CRISPR editing of the BCL11A enhancer to reduce gene expression rather than complete ablation of BCL11A, which would lead to other pathologies83,84 (Fig. 4c). This ‘one size fits all’ therapeutic strategy has the potential to benefit more patients than specific strategies that each correct one of the myriad individual mutations in HBB.

The therapeutic potential of editing cis-regulatory elements is also being investigated in other disorders. For example, the mutated transcription factor FOXA1 is an oncogene with a role in the onset and progression of prostate cancer85. Targeting transcription factor binding elements in the FOXA1 promoter modulates the function of these elements, reducing expression of the gene and inhibiting prostate cancer cell growth in vitro86. In addition, a dual sgRNA approach has been used in vitro to excise a 44 kb promoter region upstream of a mutant HTT gene to silence its expression and thereby ablate expression of the Huntington disease-causing variant87 (Fig. 4d). Similarly, a dual targeting approach that binds once in the promoter and once in the first intron of HTT removes the transcriptional start site and first exon, inhibiting gene expression88.

Non-coding RNAs

Some non-coding RNAs affect gene expression by binding to mRNA through Watson–Crick base pairing, which creates another avenue to alter gene expression through gene editing89. For example, miRNAs can bind to the UTRs of mRNAs and target the transcripts for destruction. Long non-coding RNAs (lncRNAs) can act through several mechanisms, including activating or inhibiting transcription or translation, altering splicing or remodelling chromatin. Altering the primary sequences of these non-coding elements ablates their function and downstream effects on gene expression.

Angelman syndrome is a neurodevelopmental disorder caused by a maternally inherited mutant UBE3A gene that could be rescued by expression of the paternal allele90. However, the paternal allele is silenced by the lncRNA UBE3A-ATS. CRISPR–Cas9 targeting of UBE3A-ATS ablated its function, leading to expression of the paternal UBE3A gene and rescuing the disease phenotype in cultured human neurons and in a mouse model of the disease91,92 (Fig. 4e).

CRISPR-mediated disruption of lncRNA through indel formation has been widely investigated in vitro in cancer to reduce cell growth93,94,95,96,97,98 and overcome metastasis96,99. CRISPR-mediated RNA knockdown using the Cas13 system has also been used to cleave lncRNA and inhibit bladder cancer proliferation in vitro and in vivo100.

Muscular atrophy is in part controlled by expression of miR-29b101. Delivery of Cas9 that targets miR-29b in multiple mouse models of muscular atrophy knocked out the miRNA and prevented muscle loss102 (Fig. 4f). In macrophage cell lines, CRISPR-mediated indel formation in miR-155 reduced pro-inflammatory cytokine expression in vitro, creating an avenue for treating the autoimmune disease rheumatoid arthritis103. miRNA targeting has also been extensively studied in cancer104; various approaches have been shown to lead to reductions in cancer cell proliferation105,106,107, inhibition of metastasis105,106 and death of cancer cells108 in preclinical studies.

Transcriptional and epigenetic modulation

Epigenetic changes and dysregulated expression that alters gene dosage drive many diseases, resulting in phenotypes that cannot be rescued simply by indel formation or microdeletion. Dysregulation can also be caused by an epigenetic change that is inaccessible to wild-type CRISPR systems. To fill this gap, dCas molecules can be creatively coupled to transcriptional or epigenetic modulators, to precisely target relevant therapeutic regulatory domains to specific regions of the genome without creating DNA damage or a DNA edit. This approach also mitigates the risks associated with DNA damage, p53-induced apoptosis, permanent off-target editing and abnormal chromosomal rearrangements.

CRISPR interference

dCas9–KRAB fusions can be targeted to protein-coding sequences to downregulate transcription, repressing the gene for as long as the CRISPR fusion protein is present without permanently editing DNA44,109. This feature is attractive for reducing expression of the voltage-gated sodium ion channel NaV1.7 (encoded by SCN9A) in the peripheral nervous system, which could reduce pain and thereby overcome the current reliance on opioids110. As developing small-molecule drugs is challenging and complete gene ablation would result in permanent undesirable pain insensitivity, CRISPRi is an attractive option to treat chronic pain111. CRISPRi targeted to NaV1.7 and delivered intrathecally reduced pain sensitivity and reversed chronic pain in mouse models of carrageenan-induced inflammatory pain, paclitaxel-induced neuropathic pain and BzATP-induced pain, demonstrating the therapeutic advantage of CRISPRi over traditional CRISPR editing in these settings111.

The regulatory effects of CRISPRi can be used in many other diseases in which complete CRISPR-mediated gene knockout is not therapeutically useful. In one form of long QT syndrome (LQTS) that can be caused by a myriad of mutations in CALM2, dCas9–KRAB was used to reduce expression of the mutant gene in vitro112. This intervention overcame the disease phenotype in iPSC-derived cardiomyocytes and creates a generalizable therapeutic approach that is independent of the location of the causative mutation. In a mouse model of retinitis pigmentosa, dCas9–KRAB targeted to Nrl rescued retinal function when delivered to postmitotic cells that normally have reduced capacity for the DNA repair mechanisms that are essential for indel formation113. Overexpression of DUX4 in myocytes leads to facioscapulohumeral muscular dystrophy (FSHD)114,115. DUX4 has many genomic copies that could lead to toxicity if numerous DSBs were created, and gene editing at such large repetitive regions can lead to unpredictable outcomes. CRISPRi has been leveraged in vitro and in vivo to reduce DUX4 expression without the risk of inducing apoptosis owing to DNA damage. In contrast to CRISPR gene editing, CRISPRi can be inducible and reversible, which further alleviates safety concerns when testing in the clinic.

CRISPR activation

Fusion of dCas proteins to activators provides a method for targeted gene upregulation that can overcome various types of disease including those caused by haploinsufficiencies44. Unlike ectopic transgene expression, CRISPRa can be used to precisely tune the magnitude of gene upregulation. In addition, this system can be packaged into viral vectors more easily than larger transgenes. For example, nuclease-dead SaCas9 (SadCas9) was fused to the VP64 domain and delivered to mouse models of obesity that had a haploinsufficiency of either Sim1 or Mc4r116. Targeting CRISPRa to the promoter region increased transcription of both genes, rescuing the obesity phenotype and demonstrating cell specificity by precisely targeting tissue-specific cis-regulatory elements.

CRISPRa can upregulate genes independently of mutations. Multiple LAMA2 mutations lead to congenital muscular dystrophy type 1A (MDC1A), which can be rescued by ectopic expression of LAMA1. AAV-based delivery of SadCas9 fused to VP64 was used to upregulate Lama1 in a mouse model of MDC1A, improving muscle fibrosis and preventing disease progression117. The ability to overcome muscle wasting in a mutation-independent manner has also been used to overcome the Duchenne muscular dystrophy phenotype in vitro through upregulation of Lama1 (ref.118) or a utrophin gene (UTRN)119. Cas9 expressed in mice with a sgRNA containing an aptamer that recruits p65 and HSF1 domains47 was able to upregulate genes to treat Duchenne muscular dystrophy (Klotho or Utrn), acute kidney injury (Il10 or Klotho) and type 1 diabetes (Pdx1)120. Importantly, the sgRNA had a spacer of 14 bp instead of 20 bp, which allowed Cas9 to bind to DNA but not to create DSBs, resulting in a nuclease-deficient system.

Use of CRISPRa to upregulate therapeutically useful coding genes has been demonstrated in vitro for autoimmune diseases121, neurodegenerative diseases122,123 and cancer124,125,126,127. However, the usefulness of this approach extends beyond upregulation of single proteins to endogenous non-coding RNA. For example, dCas9 fused to VP64, p65 and RTA (collectively known as VPR) has been used in vivo to increase expression of DANCR, a lncRNA that increases bone regeneration through chondrogenic differentiation128. CRISPRa can also be used to upregulate multiple gene targets by the addition of multiple sgRNAs. For example, CRISPRa was demonstrated to simultaneously upregulate Bdnf, Gdnf and Ngf in adipose-derived stem cells ex vivo to promote peripheral nerve regeneration in a rat model of nerve injury129.

Traditional CRISPRi and CRISPRa constructs are large and challenging to package into single AAVs; however, the development of hypercompact Cas molecules can overcome this issue. For example, structurally guided engineering of a natural Cas12f system reduced the size of Cas by almost 60% (2.6 kb) to produce a miniature Cas system (CasMINI, ~1.55 kb)33. CasMINI can be fused to many commonly used activating or repressive modulators to create proteins that are much smaller than the 4.7 kb packaging limit of AAV vectors for in vivo delivery. These hypercompact systems can also be encoded on mRNA for more efficient delivery and expression in human tissues and in vivo.
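The packaging argument above is simple arithmetic, which the following minimal Python sketch illustrates. It assumes the 4.7 kb AAV limit and the ~1.55 kb CasMINI size quoted in the text; the function name and the sizes given for the effector domain, promoter and sgRNA cassette are hypothetical placeholders chosen for illustration, not measured values.

```python
AAV_LIMIT_KB = 4.7  # packaging limit quoted in the text


def fits_in_single_aav(component_sizes_kb, limit_kb=AAV_LIMIT_KB):
    """Return (fits, total_kb) for a dict mapping component name -> size in kb."""
    total_kb = sum(component_sizes_kb.values())
    return total_kb <= limit_kb, total_kb


# Hypothetical construct: only the CasMINI size comes from the text.
casmini_construct = {
    "CasMINI": 1.55,             # size quoted in the text
    "effector_domain": 0.4,      # placeholder value (assumption)
    "promoter_and_polyA": 0.9,   # placeholder value (assumption)
    "sgRNA_cassette": 0.4,       # placeholder value (assumption)
}

fits, total = fits_in_single_aav(casmini_construct)
print(f"total = {total:.2f} kb, fits in one AAV: {fits}")
```

Swapping in real component sizes for a given design would show how much headroom remains for regulatory elements or additional sgRNAs.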

CRISPR epigenetic modification

CRISPRa and CRISPRi gene regulation methods result in transient gene modulation. In postmitotic cells or disease indications in which transient gene expression results in a therapeutic benefit, this transience does not present a challenge. However, some diseases require long-lasting and heritable changes to gene regulation. Epigenetic modifications via targeted addition of methyl groups to DNA or addition of acetyl or methyl groups to histone residues locally modulate gene expression130. These modifications are often persistent and can be inherited by daughter cells, creating an opportunity for long-lasting gene expression modulation. Many epigenetic modifiers have been fused to CRISPR proteins to make chemical modifications at the DNA or chromatin level48. For example, CRISPRoff131 and CRISPR-KAL109 can lead to long-term (for example, several months) gene silencing by modifying H3K9me3 and DNA methylation. These approaches are potentially suitable for treating diseases that require persistent gene perturbation.

DNA methylation domains from the DNMT3 family have been fused with dCas9 to achieve long-term gene silencing132,133,134. For example, targeting the SNCA intron 1 with a dCas–DNMT3 fusion protein generated targeted DNA hypermethylation in human iPSC-derived dopaminergic neurons carrying a SNCA triplication and rescued the Parkinson disease-related phenotype in vitro135. To reverse the silencing effects of natural DNA methylation, ten-eleven translocation methylcytosine dioxygenase 1 (TET1) catalytic domain was fused with dCas9 to selectively remove DNA methyl groups and upregulate gene expression136,137. This approach has been investigated as a potential therapy for fragile X syndrome, which is an intellectual disability caused by a CGG expansion in FMR1 that results in extensive methylation and therefore reduces gene expression138. Targeting of dCas9–TET1 to FMR1 demethylated the CGG repeats, reactivated sustained gene expression and rescued the disease phenotype in iPSC-derived neurons in vivo139. Fusion proteins comprising dCas9 and TET enzyme catalytic domains have also been used to treat cancer in vitro (targeting BRCA1 (ref.140)) and in vivo (targeting SARI141) and attenuate renal fibrosis in vivo (targeting Rasal1 or Klotho)142. Importantly, the resulting DNA methylation changes are long-lasting, heritable and reversible143.

To site-specifically modify histones, the catalytic core domain of p300 was fused to dCas9. When directed to enhancer regions on DNA, this fusion adds an acetyl group to lysine 27 of histone H3 (H3K27ac), resulting in activation of gene expression144. In mice, expression of dCas9–p300 was able to upregulate Foxp3 expression in T cells, converting them into regulatory T (Treg) cells with the potential to treat autoimmunity145,146. The H3K27ac mark can be removed using dCas9 fused to histone deacetylase 1 (HDAC1). This approach has been targeted to KRAS to inhibit cancer growth147. Additional suppressive CRISPR histone modifiers act by decreasing H3K4 methylation, increasing H3K9 methylation or enhancing HP1α binding; when targeted to GRN, these modifiers can reduce cell proliferation and invasion in hepatoma cells148.

Base and prime editing

The random process of indel formation is difficult to harness to correct precise mutations, as the number or identity of the added nucleotides cannot be controlled. With the exception of cell therapies in which engineered cells are clonally expanded, checked for proper mutation and then reintroduced into the body, wild-type CRISPR systems are often poor choices for precise mutation correction. To fill this gap, CRISPR fusions that make precise genetic changes have been generated and deployed in a myriad of diseases.

Base editing

Given the rapid improvement in the technology and ability to correct deleterious point mutations with unparalleled precision, base editors have been quickly adopted as potential approaches to treat well-understood diseases with known missense mutations. CBEs that create C-to-T mutations have been used in a wide variety of in vivo models. Both Cas9 and Cas12a CBEs have been used to correct a missense mutation in the Pah gene in a mouse model of the human autosomal recessive liver disease phenylketonuria (PKU)149. The ability to make a C-to-T conversion enables the generation of stop codons, which always begin with a thymine. In a mouse model of amyotrophic lateral sclerosis (ALS), SpCas9 CBE was used to create a premature stop codon in SOD1, reducing muscle atrophy and improving neuromuscular function150. Owing to the large size of the CBE CRISPR constructs, the protein had to be split across two AAV vectors and fused post-translationally in the cell using inteins149,150.
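The stop-codon strategy described above can be sketched computationally. The following Python example scans a coding sequence for codons that a single coding-strand C-to-T change would convert into a stop codon. The function name and the toy sequence are hypothetical, and the sketch deliberately ignores practical constraints such as PAM availability, the editing window, bystander edits and edits made on the opposite strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def cbe_stop_candidates(cds):
    """List (codon_index, codon, edited_codon) for every in-frame codon that a
    single coding-strand C-to-T change would turn into a stop codon."""
    cds = cds.upper()
    hits = []
    # Walk the sequence codon by codon, ignoring any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon, nothing to create
        for pos, base in enumerate(codon):
            if base != "C":
                continue
            edited = codon[:pos] + "T" + codon[pos + 1:]
            if edited in STOP_CODONS:
                hits.append((i // 3, codon, edited))
    return hits


# Toy sequence (not a real gene): codons ATG CAG GGA CGA TTT
print(cbe_stop_candidates("ATGCAGGGACGATTT"))
```

Under these simplifying assumptions, only CAA, CAG and CGA codons qualify, because they differ from TAA, TAG and TGA solely at the first position.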

ABEs are highly relevant for therapeutics, as C•G to T•A transitions, which ABEs can revert by converting A•T to G•C, account for approximately half of all known pathogenic point mutations57. In a Duchenne muscular dystrophy mouse model, ABEs that were delivered to the muscles as two AAVs were able to correct a single mutation in Dmd and improve the disease phenotype151. ABEs have also been used to correct the LMNA mutation in a mouse model of Hutchinson–Gilford progeria syndrome, extending the median lifespan from 215 to 510 days152. Notably, use of an ABE to correct a missense mutation ex vivo in a mouse model of sickle cell disease led to approval of the BEACON-101 phase I/II trial of this therapy153,154. Other therapeutic uses of base editors have been reviewed elsewhere155,156,157.

Prime editing

Prime editing has the potential to create a wide array of therapeutic genome edits but has not yet been as widely investigated as other CRISPR systems. In the study that first described the tool, researchers corrected mutations in HBB that cause sickle cell disease, HEXA that cause Tay–Sachs disease and PRNP to protect against prion diseases63. Prime editing can be used to make precise mutations that are currently not possible using base editors. For example, in a mouse model of α1-antitrypsin deficiency (AATD), prime editors were effectively delivered to mouse livers to remove a pathogenic E342K mutation in SERPINA1 by creating an A-to-G edit158. Prime editors have also been used to correct a mutation in Dnmt1 in mouse retinas by creating a G-to-T transversion, demonstrating the potential to correct eye disease. These precise edits could not be achieved using other CRISPR tools159.

In addition to base editing, prime editing can be used to insert oligonucleotides. In human iPSCs, prime editing was used to insert two nucleotides (AC) into exon 52 of DMD. This approach enabled exon reframing to rescue expression of DMD and the contractile function of iPSC-derived cardiomyocytes modelling Duchenne muscular dystrophy160. Although the therapeutic use of prime editors is still in its infancy, the flexibility of genomic edits that this methodology creates potentially enables correction of a myriad of diseases.
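The exon reframing described above rests on modular arithmetic: the reading frame is preserved only when the net change in coding length is a multiple of three. The short Python sketch below computes the smallest insertion that restores the frame. The function name and the example values are illustrative assumptions and are not taken from the cited study.

```python
def insertion_to_restore_frame(net_length_change):
    """Smallest number of nucleotides (0, 1 or 2) that must be inserted so that
    the total change in coding length becomes a multiple of 3.

    net_length_change is negative for deletions and positive for insertions.
    """
    return (-net_length_change) % 3


# Hypothetical example: an upstream deletion of 200 nt shifts the frame;
# inserting 2 nt makes the net change -198, which is a multiple of 3.
print(insertion_to_restore_frame(-200))
```

A two-nucleotide insertion such as the one described above is therefore the minimal insertion whenever the length of the disrupting deletion leaves a remainder of two on division by three.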

Infection prevention and treatment

In addition to modifying the human genome, CRISPR–Cas therapies can be used to target latent and chronic viral infections in human cells. For example, intrastromal injection of Cas9 as a non-integrating lentivirus prevented herpes simplex virus type 1 (HSV-1) infection and disease pathology and destroyed the viral reservoir in mouse models161. This system can also be used for other herpes viruses, such as Epstein–Barr virus (EBV), by targeting Cas9 to essential promoters162 or coding sequences163 in the viral genome. Cas9 and Cas12a have both been used to target the long terminal repeats and Gag–Pol polyprotein of HIV-1 (refs.164,165,166,167,168). Cas9 has also been used to target coding sequences and the covalently closed circular DNA in hepatitis B virus (HBV)169,170,171,172,173,174 and to cut the DNA genome of human papillomavirus (HPV)175,176,177. In addition, Cas9 with a modified sgRNA was used to destroy the RNA genome of hepatitis C virus (HCV)178. Notably, a CRISPR-based strategy to clear HIV infections (EBT-101) has entered a phase I/II clinical trial179. Furthermore, our laboratory has developed a strategy using Cas13d to target viral RNA genomes and demonstrated the utility of this approach as a prophylactic for both influenza A virus (IAV) and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection180. Notably, this strategy works on a broad spectrum of coronaviruses and variants of SARS-CoV-2 owing to the ability to target highly evolutionarily conserved regions in the viral genome181.
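The idea of targeting conserved regions can be illustrated with a simple set intersection. The Python sketch below returns every k-mer that is present in all of a set of genome sequences, which could serve as a starting pool of candidate target sites. This is a simplified illustration and not the method used in the cited studies: the function name, the default window length of 22 nt and the toy sequences are assumptions, and a real design pipeline would also need to consider mismatch tolerance, RNA structure and off-targets in the host transcriptome.

```python
def conserved_kmers(genomes, k=22):
    """Return the sorted list of k-mers found in every sequence in `genomes`.

    The default k is an illustrative window length (assumption).
    """
    if not genomes:
        return []
    shared = None
    for seq in genomes:
        seq = seq.upper()
        # All overlapping windows of length k in this sequence.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        shared = kmers if shared is None else shared & kmers
    return sorted(shared)


# Toy variants (not real viral sequences) that share one 8-nt window.
variants = ["AACCGGTTAGC", "TTCCGGTTAGA", "GGCCGGTTAGT"]
print(conserved_kmers(variants, k=8))
```

Exact matching is the strictest possible definition of conservation; relaxing it to allow a small number of mismatches would enlarge the candidate pool at the cost of extra computation.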

CRISPR has also been used to target bacterial infections. Cas9 can be packaged into bacteriophages and delivered to antibiotic-resistant S. aureus to target bacterial resistance genes and re-sensitize the bacteria to treatment182. As DSBs in bacterial genomes result in cell death, plasmid or phage-delivered Cas9 is an effective antimicrobial strategy in both E. coli and S. aureus183,184,185. Cas3 has been harnessed to target and shred (that is, create long-range deletions) the genome of Clostridioides difficile (also known as Clostridium difficile), which is one of the most harmful and antibiotic-resistant bacterial species in existence186. The vast therapeutic CRISPR toolbox is rapidly expanding beyond human genome engineering to treat a wide variety of infectious diseases.

Challenges of delivering CRISPR tools

CRISPRs are multi-component systems that require packaging of the large protein, the gRNA and all the elements that control their expression. Many established approaches exist for in vitro and ex vivo delivery of these components as DNA, RNA or RNP complexes187,188. Both integrating and non-integrating lentiviruses can be used to deliver CRISPR components but are limited owing to the potential for insertional mutagenesis and low efficiency, respectively. DNA, RNA or RNP can also be delivered using methods that physically introduce the components to cells, such as electroporation or microinjection. These approaches benefit from controllable dosing and efficient delivery but can be technically difficult and create viability issues.

Many diseases that could benefit from a CRISPR therapy cannot be treated ex vivo, which rules out delivery of the CRISPR components using lentiviruses, microinjection or electroporation. Delivery of CRISPR molecules in vivo poses a major challenge that has limited their potential as therapeutics187,188,189,190. AAV is commonly used to deliver CRISPR components as DNA both in vivo and ex vivo. This approach can be used to deliver small CRISPR systems in a single vector or larger components split between multiple vectors. However, the limited packaging capacity and tropism of AAVs prevent them from being universally used. Lipid nanoparticles (LNPs) can deliver CRISPR tools as RNA, resulting in more transient effects than those obtained with AAV delivery and therefore reducing the risk of off-target editing. However, many LNPs almost exclusively traffic to the liver and cannot reach other therapeutically relevant tissues. Virus-like particles (VLPs) are exciting vehicles for the delivery of CRISPR components. An RNA-binding protein or CRISPR RNP is fused to a retroviral Gag–Pol, enabling CRISPR RNA or RNP to be encapsulated in a viral vector. Although the therapeutic use of VLPs is still in its infancy, this approach has been demonstrated to have low levels of off-target effects and flexible tropism191,192,193.

Current CRISPR therapeutics are limited by the small packaging capacities and tissue trafficking properties of the available delivery vectors, which restrict the use of these CRISPR tools and reduce their potential disease indications. Various approaches, such as directed evolution of AAV capsids, functionalization of LNPs and molecular engineering of CRISPR components, are being investigated with the aim of improving the efficacy, safety and specificity of in vivo delivery vehicles. To realize the full potential of CRISPR therapies, further efforts are required to get these tools to the relevant tissues with high efficiency, high specificity and minimal toxicity.

Conclusions

The ease with which CRISPR can create targeted DSBs in the human genome enabled quick adoption as a broad tool to overcome genetic disorders. As a first step, CRISPR was used to perform targeted gene knockouts, as Cas9 can be targeted anywhere on the coding sequence to induce a frameshift to silence a deleterious protein. However, most diseases are complex and cannot be cured by this simple coding sequence-targeting strategy. The use of CRISPR to target diseases with complex drivers has been catalysed by developing more nuanced strategies that target the non-coding genome and fix gene expression more indirectly (for example, by exon skipping or intron corrections). Beyond these approaches, the rapid discovery of natural CRISPR molecules with beneficial properties and further engineering of these proteins to create molecules that alter transcription, change the epigenome, make precise mutations or enable writing directly on the genome have dramatically increased the range of indications that can potentially be treated using CRISPR–Cas systems. However, further advances are needed to fully leverage these proteins.

As discussed above, current CRISPR tools are limited by the challenges of in vivo delivery to the relevant tissues. In addition, off-target events caused by CRISPR systems must be precisely controlled to create highly targeted therapies. Bioinformatic strategies to improve the specificity of gRNA, altering the chemical composition and length of gRNA, the discovery and engineering of new Cas variants, temporal restriction of CRISPR systems using transient delivery methods or anti-CRISPRs, and moving from DSBs to more targeted systems such as prime editors, base editors or epigenetic modulators could greatly reduce off-target effects194,195,196. However, more research is required to develop a maximally safe and effective CRISPR therapy.

The use of CRISPR tools to target more nuanced disease drivers requires a better understanding of how non-coding DNA and epigenetic states affect a disease pathology. Point mutations in coding sequences are much easier to link to a disease phenotype than mutations in non-coding sequences owing to a deep understanding of how a genetic change results in an amino acid change by looking at sequencing information. Using the right CRISPR tool that can link sequence, epigenome, transcriptome and phenotype information to the root cause of a pathology that is not driven by simple polymorphisms will be helpful to define new cures. However, the rapid advances in CRISPR tools, multi-omic methods and delivery mechanisms suggest that genome engineering techniques will soon be developed for a multitude of diseases, potentially resulting in curative therapies for many underserved patient populations.