Trio deep-sequencing does not reveal unexpected off-target and on-target mutations in Cas9-edited rhesus monkeys

Luo, Xin; He, Yaoxi; Zhang, Chao; He, Xiechao; Yan, Lanzhen; Li, Min; Hu, Ting; Hu, Yan; Jiang, Jin; Meng, Xiaoyu; Ji, Weizhi; Zhao, Xudong; Zheng, Ping; Xu, Shuhua; Su, Bing

doi:10.1038/s41467-019-13481-y

Published: 04 December 2019

Xin Luo^1,2,3^na1,
Yaoxi He ORCID: orcid.org/0000-0003-3324-3239^1,2,3^na1,
Chao Zhang⁴^na1,
Xiechao He^1,3^na1,
Lanzhen Yan^3,5^na1,
Min Li^1,2,
Ting Hu^1,2,
Yan Hu^1,2,
Jin Jiang^1,2,
Xiaoyu Meng ORCID: orcid.org/0000-0002-1025-3314^1,2,
Weizhi Ji⁶,
Xudong Zhao^3,5,
Ping Zheng^1,7,
Shuhua Xu ORCID: orcid.org/0000-0002-1975-1002^4,7,8,9 &
…
Bing Su^1,3,7

Nature Communications volume 10, Article number: 5525 (2019)

Abstract

CRISPR-Cas9 is a widely-used genome editing tool, but its off-target effect and on-target complex mutations remain a concern, especially in view of future clinical applications. Non-human primates (NHPs) share close genetic and physiological similarities with humans, making them an ideal preclinical model for developing Cas9-based therapies. However, to our knowledge no comprehensive in vivo off-target and on-target assessment has been conducted in NHPs. Here, we perform whole genome trio sequencing of Cas9-treated rhesus monkeys. We only find a small number of de novo mutations that can be explained by expected spontaneous mutations, and no unexpected off-target mutations (OTMs) were detected. Furthermore, the long-read sequencing data does not detect large structural variants in the target region.

Introduction

CRISPR-Cas9 has been widely used to facilitate efficient genome editing in model and nonmodel animals¹. It also serves as a promising tool for correcting deleterious mutations causing human genetic diseases. However, the assessment of off-target and on-target effects is insufficient, making it unsafe to implement in clinical therapeutic settings.

The specificity of CRISPR-Cas9 relies on the designed 20 bp guide RNA (sgRNA) and PAM^2,3. In the genome, there are often many sgRNA-like sequences. Consequently, CRISPR-Cas9 may generate nonspecific editing, leading to OTMs. Previously, most studies of off-target effects were carried out in rodents or human cells. Rodents serve as important animal models in preclinical studies. However, rodents fail to reproduce many features of human disorders. For example, clinical symptoms involving high-level cognitive functions cannot be reproduced faithfully in rodent models⁴. By contrast, nonhuman primates (NHPs) are genetically and physiologically similar to humans. Macaque monkeys have been used in biomedical research and are among the highest primates that can be genetically manipulated (without serious ethical concerns) to construct preclinical models for Cas9-based therapies^5,6,7. Hence, exploring the off-target activity in Cas9-edited monkeys is crucial for future clinical applications.

The off-target effect has been investigated using whole-genome sequencing (WGS) of Cas9-edited cells or animals^8,9,10,11,12. However, previous studies mainly focused on the potential off-target sites predicted by sgRNA binding, not on a genome-wide evaluation of de novo mutations (DNMs). Recently, Schaefer et al. reported many unexpected mutations using WGS of Cas9-edited mice¹³, though the claim was challenged by several groups^14,15,16,17,18,19, and the latest trio sequencing of Cas9-edited mice did not find unexpected off-target activity¹⁴.

Besides the off-target effect, the on-target complex mutations induced by the CRISPR-Cas9 system may also be a concern. A recent study reported large fragment DNA mutations (LFDMs), including large deletions, insertions, and complex rearrangements at the targeted sites in CRISPR-Cas9-edited cell lines²⁰.

To evaluate the situation in monkeys, we perform trio WGS and long-read sequencing of the target regions of Cas9-edited rhesus monkeys (Macaca mulatta). We also analyze the published trio WGS data of Cas9-edited cynomolgus monkeys (Macaca fascicularis)²¹. We did not observe any unexpected OTMs in the Cas9-edited monkeys.

Results

MCPH1 gene knockout rhesus macaques using CRISPR-Cas9

We designed two sgRNAs to target exon2 and exon4 of MCPH1, a human autosomal recessive primary microcephaly gene that plays a key role in primate brain development and evolution^22,23,24 (Supplementary Fig. 1). First, using zygotic injection of Cas9 mRNA and two sgRNAs, we achieved a high knockout efficiency for MCPH1 at the embryo level. Among the 15 rhesus monkey embryos tested, 13 were knockout positive (86.7%), including 3 (20%) knockout homozygotes (Supplementary Fig. 2 and 3). To generate MCPH1 knockout monkeys, we microinjected 30 zygotes, among which 24 zygotes developed normally and were transferred into 6 surrogate females, resulting in two pregnancies of twins and triplets, respectively. The surrogate female with twins had a premature delivery at 138 days of gestation, leading to a live male monkey and a dead female monkey. We performed a C-section at 160 days of gestation for the other surrogate female with triplets, and all three monkeys (one male and two females) were alive (Supplementary Fig. 1 and Supplementary Table 1). We first used PCR-clone sequencing to evaluate the Cas9-editing status, and the result showed that all offspring monkeys were successfully modified by Cas9 except for rmO4^ko* from the triplets. The dead female monkey (rmO2^ko) was a homozygous knockout (Supplementary Fig. 1). Interestingly, all Cas9-induced mutations were located in exon2 of MCPH1. No exon4 mutations were detected although sgRNA2 was designed to target exon4 and showed a high efficiency in the embryo test (Supplementary Fig. 3). We speculate this might be due to technical problems such as errors during microinjection.

Whole-genome deep sequencing and variant calling (VC)

The five Cas9-treated monkeys and their three wildtype parents were subjected to WGS on the Illumina X10 platform (“Methods” section) (Fig. 1a and Supplementary Table 1). Blood samples were taken from the four live monkeys for DNA extraction, and for the dead monkey, multiple tissues (brain, liver, and muscle) were sampled. We achieved a median 46× depth of genome coverage (Table 1). The WGS data exhibited high read quality with Q30 > 85%, a mean duplicate percentage of 12.03%, and properly paired reads >96% (Supplementary Table 2).

Table 1 Summary of trio WGS and identification of DNMs.

Using WGS data, we first reassessed the Cas9-editing efficiency at the MCPH1 locus (Table 1 and Supplementary Fig. 4). Consistent with the PCR-clone sequencing, we observed mosaic patterns of MCPH1 knockout for all Cas9-edited monkeys. The knockout efficiency ranges from 12.2 to 95.3% (Table 1) in the four Cas9-edited monkeys and no sequence change was detected in the knockout-negative monkey (rmO4^ko*).

We then performed genome-wide VC using four different tools: GATK²⁵, Platypus²⁶, Freebayes²⁷, and Samtools²⁸ (Fig. 1b). VC, quality control (QC), site filtering (SF), genotype filtering (GF), and universal mask (UM) filtering were performed to obtain high-confidence variants (“Methods” section) (Fig. 1b). The variants shared by all four calling tools were taken as the high-confidence variants (Supplementary Table 3).

Evaluation of off-target mutations

Using SpeedSeq⁸, we predicted the potential off-target locations for the two MCPH1 sgRNAs. Among the 4807 predicted off-target sites, no mutation was observed in the four Cas9-edited monkeys, including the monkey (rmO2^ko) with multiple tissue samples (Supplementary Table 4). We also used another off-target predictor (Cas-OFFinder)²⁹, and no mutation was observed in the four Cas9-edited monkeys when up to seven mismatches between the on-target and off-target sites were allowed (Supplementary Table 4). Hence, no off-target effects were detected at the predicted sgRNA-binding sites.

Detection of DNMs using trio WGS data

To evaluate the genome-wide off-target effect, we explored the pattern of DNMs in the trio WGS data using the TrioDeNovo software³⁰ (Fig. 1b). We first validated the genetic relationship between the Cas9-treated monkeys and their parents³¹ (see “Methods” section). The IBD (identity by descent) results agreed well with the known kinship (Supplementary Fig. 5). Initially, we obtained on average 1,365 candidate DNMs for each Cas9-treated monkey by running TrioDeNovo and overlapping the DNMs called by different tools (Table 1). We then performed multiple filtrations (see “Methods” section) on the overlapped DNM sets, and obtained on average 34 high-confidence DNMs for each monkey (32 substitutions and 2 indels) (Fig. 1c, d; Table 1; Supplementary Fig. 6, 7; Supplementary Data 1). We then performed Sanger sequencing in one Cas9-edited monkey (rmO1^ko), and 36 of the 39 identified high-confidence DNMs (92.3%) were validated (Supplementary Fig. 8; Supplementary Data 1), suggesting that the employed DNM calling pipeline was reliable. These high-confidence DNMs can be explained by the known spontaneous mutation rates of primates (0.98–2.17 × 10⁻⁸ per nucleotide per generation) with 22–78 expected DNMs per generation^32,33. Consistently, we saw no correlation between the number of high-confidence DNMs and the Cas9-editing efficiency (R² = −0.140, P = 0.765, Pearson’s correlation test). In other words, the Cas9-editing efficiency does not affect the occurrence of DNMs in the Cas9-edited monkeys.

In addition, none of the detected DNMs are located at or near (within a 40 bp region around⁸) the predicted genome-wide off-target sites, ruling out the possibility that the DNMs were caused by the Cas9 off-target effect. To evaluate the statistical power of detecting DNMs, we adopted a previously described method¹⁴. Given a median sequencing depth of 46.22× (Table 1) and a minimum allele frequency of 10%, the power to detect one DNM that occurred at the single-cell or the two-cell stage of the zygote is at least 99.5% (see “Methods” section for more details). We also evaluated whether there were de novo structural variants (SVs) (≥50 bp) using Delly³⁴ and Pindel³⁵ with multiple filters (refer to “Methods” section), and we did not see any high-confidence de novo SVs in the Cas9-treated monkeys (Supplementary Data 2).

To further confirm the DNM pattern seen in rhesus monkeys, we used the same pipeline to analyze the published trio WGS data of a CRISPR-Cas9 gene-knockin model in cynomolgus monkeys²¹. The trio included an Oct4-hrGFP knockin cynomolgus monkey and his two wildtype parents (Fig. 1a, b). Only 35 DNMs (32 substitutions and 3 indels) were detected (Fig. 1d; Table 1), concordant with the pattern seen in rhesus monkeys. For the sgRNA-predicted off-target sites, only one mutation (a 2-bp deletion) was seen in the knockin cynomolgus monkey, as reported in Cui et al.²¹.

Examination of LFDMs

Furthermore, to detect whether CRISPR-Cas9 induced LFDMs at the target regions, we PCR-amplified two ~6.0 kb regions covering exon2 and exon4 of MCPH1, respectively (Fig. 2a). The PCR products were sequenced on the PacBio platform (see details in “Methods” section). No unexpected LFDMs were detected at the target regions in the Cas9-edited monkeys (Fig. 2a and Supplementary Data 2). It should be noted that a ~300 bp insertion was detected around the target region (exon4) in two Cas9-edited monkeys. This insertion is also carried by their wildtype mother (without Cas9 treatment), suggesting that it was inherited from the mother rather than induced by Cas9 (Fig. 2b and Supplementary Data 1).

Discussion

In this study, we used multiple tools to call variants and took their overlap as the high-confidence variant set. It is known that the performance of different tools varies in genome-wide VC. Hence, a combination of different calling tools is necessary to identify high-confidence variants from WGS data. Notably, the four different tools exhibited high consistency (>80%) in VC (Supplementary Table 3).

For DNM identification, we initially identified ~1365 candidate DNMs for each monkey. We found that >90% of them violated Mendelian inheritance because of unexpected allele combinations in the offspring monkeys, although the alleles were in fact present in the parents. For example, the genotypes of the parents are AA and GG, respectively, and we saw AA or GG in the offspring instead of the expected genotype of AG. These candidate DNMs are most likely noise, not true DNMs, and may be caused by the technical bias of next-generation sequencing. The same scenario was also seen in the reported mouse data¹⁴. We filtered out these candidate DNMs by the DNM filtration procedure (“allele filtering”, “Methods” section).
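The distinction drawn above, between a candidate that carries an allele absent from both parents and one that merely combines parental alleles in a non-Mendelian way, can be illustrated with a short Python sketch. This is not the TrioDeNovo implementation or the pipeline used in this study; the function name `classify_candidate` and the three category labels are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# A diploid genotype as an unordered pair of alleles, e.g. ("A", "G").
Genotype = Tuple[str, str]


def classify_candidate(child: Genotype, father: Genotype, mother: Genotype) -> str:
    """Classify a candidate DNM site from the three trio genotypes.

    Returns one of three hypothetical labels:
      "novel_allele"         - the child carries an allele seen in neither parent
      "mendelian_consistent" - one child allele can come from each parent
      "mendelian_violation"  - all child alleles exist in the parents, but the
                               combination cannot arise by Mendelian transmission
    """
    parental_alleles = set(father) | set(mother)
    if any(allele not in parental_alleles for allele in child):
        return "novel_allele"
    first, second = child
    # Try both assignments of the child's alleles to the two parents.
    if (first in father and second in mother) or (first in mother and second in father):
        return "mendelian_consistent"
    return "mendelian_violation"


if __name__ == "__main__":
    # Example from the text: parents AA and GG, offspring AA instead of AG.
    print(classify_candidate(("A", "A"), ("A", "A"), ("G", "G")))
```

Under this scheme only the "novel_allele" class would survive allele filtering; the "mendelian_violation" class corresponds to the noise described in the paragraph above.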

In addition, gene knockout and knockin are formed by different repair procedures in the cell. We analyzed trio sequencing data from both knockout (rhesus monkeys) and knockin (cynomolgus monkeys) monkey models, and we did not observe unexpected mutations in either model, suggesting that the homologous repair template does not induce OTMs. However, it should be noted that although WGS data are powerful in detecting DNMs, considering the minimum allele frequency for VC of 10% in this study, we cannot fully exclude the possibility of low-frequency (<10%) DNMs.

Furthermore, we did not find any LFDMs in the Cas9-edited monkeys, in contrast to the previous report using cell lines²⁰. We speculate that this discrepancy may result from different DNA repair mechanisms between in vivo and in vitro systems. In addition, Kosicki et al. used ES cells and other cell lines, so their findings may not apply to in vivo systems. Also, Kosicki et al. introduced Cas9 and gRNA constructs targeting intronic and exonic sites of PigA using the PiggyBac transposon system, and the observed LFDMs were possibly induced by the transposase itself, not by the Cas9 system. Finally, natural mutations of some genes (often seen in cell lines) can also induce LFDMs. For example, Yu et al. reported that Dna2 nuclease deficiency could lead to large and complex DNA insertions at chromosomal breaks³⁶.

Due to the limited monkey resource, we did not acquire non-Cas9-treated littermate controls. Instead, we used the knockout-negative monkey (rmO4^ko*) as a “proxy”, and no difference in DNM frequency was detected between the knockout-negative and the Cas9-edited monkeys. With this experimental design, though unlikely, we cannot formally rule out the possibility that the DNMs are induced by Cas9. In addition, MCPH1 is essential for mitotic and meiotic recombination DNA repair and for maintaining genomic stability³⁷. Targeting this gene might result in some confounding effect. A previous study reported that only the homozygous Mcph1-del mice showed a defect in DNA damage repair³⁸. Most of our Cas9-edited monkeys are mosaics (heterozygous) with on-target ratios ranging from 12.2 to 95.3%. If MCPH1 affected DNA damage repair in the Cas9-treated monkeys, we would have detected differences in DNM frequency among the monkeys having different mosaic ratios. Our results showed that the numbers of detected DNMs were similar among the Cas9-treated monkeys (29–38 DNMs, Table 1), ruling out the potential influence of MCPH1 as a DNA damage regulator.

In conclusion, based on our systematic evaluation of off-target and on-target effects in the Cas9-edited monkeys, we did not detect unexpected mutations (OTMs and LFDMs). Given that the presented data come from a single set of genome-editing experiments in monkeys, more tests are necessary to fully evaluate the safety of gene editing in primates.

Methods

Animals

All animals were housed at the AAALAC (Association for Assessment and Accreditation of Laboratory Animal Care) accredited facility of Primate Research Center of Kunming Institute of Zoology. All animal protocols were approved in advance by the Institutional Animal Care and Use Committee of Kunming Institute of Zoology (Approval No: SYDW-2010002).

SgRNA design and in vitro transcription

Based on the rhesus monkey reference genome (Mmul_8.0.1), two sgRNAs were designed to target the MCPH1 gene with sgRNA1 targeting exon2 and sgRNA2 targeting exon4. The sequences of the two sgRNAs are (PAM in bold): sgRNA1: CCTATGTTGAAGTGTGGTCATCC; sgRNA2: TTACACAGATGCAGGACAGCTGG. The sgRNAs were cloned into PUC57-sgRNA vector (Addgene No. 51132) (Supplementary Table 5). The sgRNAs were transcribed by the MEGAshortscript Kit (Ambion, AM1354) after the vectors were linearized by DraI (NEB, R0129S). SgRNAs were purified by the MEGAclear Kit (Ambion, AM1908). Cas9 mRNAs were transcribed by the T7 Ultra Kit (Ambion, AM1345) after the pST1374-Cas9-NNLS-flag-linker vector (Addgene No. 44758) was linearized with AgeI (NEB, R0552S). Cas9 mRNAs were purified by the RNeasy Mini Kit (Qiagen, 74104).

Zygote injection and embryo transfer

Briefly, healthy female monkeys with regular menstrual cycles were used as oocyte donors for superovulation by intramuscular injection with rhFSH (Recombinant Human Follitropin Alfa, GONAL-F®, Merck Serono) for 8 consecutive days, then rhCG (Recombinant Human Chorionic Gonadotropin Alfa, OVIDREL®, Merck Serono) on day 9. The oocytes were collected by laparoscopic follicular aspiration 36 h after rhCG treatment. The MII (first polar body present) oocytes were selected for in vitro fertilization, and fertilization was confirmed by the presence of two pronuclei. A mixture of Cas9 mRNA (20 ng/μl), sgRNA1 (10 ng/μl), and sgRNA2 (10 ng/μl) was injected into the cytoplasm of fertilized eggs using a Nikon microinjection system. The injected zygotes were cultured in the chemically defined, protein-free hamster embryo culture medium-9 (HECM-9) containing 10% fetal calf serum (Hyclone Laboratories, SH30088.02) at 37 °C in 5% CO₂. High-quality, normally developed embryos at the 2-cell to 8-cell stage were transferred into the oviduct of the matched recipients. A total of six monkeys were used as surrogate recipients, and typically, four embryos were transferred into each recipient female. The earliest pregnancy was diagnosed by ultrasonography about 30 days after transfer. Both pregnancy and number of fetuses were confirmed by fetal cardiac activity and presence of a yolk sac as detected by ultrasonography.

DNA extraction

Genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, 69506) according to the manufacturer’s instructions. The tissue samples included muscle, liver, and brain from rmO2^ko and blood from the other monkeys. DNA samples were quantified using a NanoDrop spectrophotometer.

Genotyping of MCPH1 gene knockout rhesus monkeys

PCR primers were designed to amplify the sgRNA targeting region (Supplementary Table 6). Targeted fragments were amplified from genomic DNA with Taq DNA polymerase. PCR products were subcloned into the pMD19 vector (Takara, 3271). Colonies were picked randomly and sequenced with the M13-F primer.

Whole-genome sequencing (WGS)

WGS libraries were prepared using standard protocols for the Illumina X10 platform. Briefly, 100 ng DNA was fragmented using a Covaris LE220 (Covaris), size selected (300–550 bp), end-repaired, A-tailed, and adapter ligated. Libraries were sequenced on the Hiseq X10 platform (Illumina) as paired-end 150-base reads. We generated on average 133 Gb of raw sequence data per monkey. We performed QC with FastQC and GATK, and the mean Q30 of read pairs was higher than 88%. Each sample had a raw read depth >46×. After masking the duplicates with Picard, we calculated the effective read depth (eDP) of the entire genome, and the average eDP was >40. The mean percentage of PCR duplicates was lower than 13%, and the average mapping rate was >99.1%. Properly paired reads were >96% (Supplementary Table 2).

Alignment and postalignment processing

We used the BWA MEM algorithm³⁹ to perform alignment. Short reads of rhesus monkeys (M. mulatta) were mapped to their reference genome (genome-build Mmul_8.0.1, rheMac8), and short reads of cynomolgus monkeys (M. fascicularis) were mapped to their reference genome (genome-build Macaca_fascicularis_5.0). The detailed command lines can be found in Supplementary Table 7. After the initial alignment, we ran Picard’s MarkDuplicates to remove duplicates in both datasets.

Variant calling

We called single-nucleotide variants (SNVs) and small insertions/deletions (INDELs) from de-duplicated bam files with GATK HaplotypeCaller²⁵, Platypus²⁶, Freebayes²⁷, and Samtools²⁸. The command lines can be found in Supplementary Table 7. For GATK, variants were called and a GVCF file was generated for each sample, and then joint calling was performed on the GVCF files of each trio separately. For Platypus, Freebayes, and Samtools, we directly called the variants for each trio separately.

Variant filtering

An overview of the variant filtering strategies can be found in Fig. 1b. We used SF, GF, and UM to filter out low-quality variants in each VCF set called by the different callers. The SF strategy filters variants at the site level and takes QD (variant confidence/quality by depth) (QD > 2.0), mapping quality (MQ > 30), allele bias (AB > 0.1, at least Pval < 0.05), and strand bias (at least Pval < 0.05) into consideration. GF filters variants at the genotype level and takes depth (15 < DP < 100 for SNVs) and genotype quality (GQ > 30) of each genotype into consideration. Since the information varies in the VCF sets generated by different VC tools, the corresponding SF and GF filtering variables and parameters are different. The details are summarized in Supplementary Tables 8 and 9. The UM is a sample-independent mask that identifies complex regions in the reference genome where VC can be challenging⁴⁰. In our analysis, the UM included three components: (1) mappability mask; (2) low-complexity regions; and (3) repeat regions. The command lines that generated (1) and (2) are provided in Supplementary Table 10. We merged the three sets of regions. SNVs and INDELs in the UM regions were filtered out; the detailed commands can be found in Supplementary Tables 8 and 9. The number of variants remaining after each filter is listed in Supplementary Data 2.
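To make the filter thresholds concrete, the following Python sketch applies the numeric cutoffs stated above (QD > 2.0, MQ > 30, 15 < DP < 100, GQ > 30) plus a mask flag to a single SNV genotype call. It is a simplified illustration only: the allele-bias and strand-bias tests are omitted, the `Call` record and `passes_filters` function are hypothetical, and the caller-specific parameters actually used are those in Supplementary Tables 8 and 9.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical record holding the annotations needed by the sketch."""
    qd: float        # variant confidence normalized by depth
    mq: float        # mapping quality at the site
    dp: int          # read depth of this genotype
    gq: float        # genotype quality
    in_mask: bool    # True if the site lies in a universal-mask region


def passes_filters(call: Call) -> bool:
    """Apply the site-level, genotype-level, and mask filters to one SNV call."""
    site_ok = call.qd > 2.0 and call.mq > 30           # site filtering (SF)
    genotype_ok = 15 < call.dp < 100 and call.gq > 30  # genotype filtering (GF)
    return site_ok and genotype_ok and not call.in_mask  # universal mask (UM)


if __name__ == "__main__":
    example = Call(qd=12.5, mq=60.0, dp=42, gq=99.0, in_mask=False)
    print(passes_filters(example))
```

Keeping the three filter layers as separate boolean terms mirrors the SF, GF, and UM stages of Fig. 1b and makes it easy to count how many calls each stage removes.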

Target region analysis

We extracted the reads that aligned to the sgRNA-binding regions, as well as to the regions 100 bp upstream and downstream, using Samtools tview²⁸. We examined the Cas9 on-target effect in these reads, and the number of reads with deletions near the PAM was counted to calculate the on-target rate (Supplementary Fig. 4).
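The on-target rate described above is a ratio of deletion-carrying reads to all reads covering the target. A minimal Python sketch of that counting, assuming each read is reduced to its leftmost reference position and CIGAR string, is shown below. The functions `has_deletion_in_window` and `on_target_rate`, the window coordinates, and the rule that any deletion overlapping the window counts are assumptions made for illustration; the study itself inspected reads with Samtools tview.

```python
import re
from typing import List, Tuple

# One CIGAR operation: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def has_deletion_in_window(read_start: int, cigar: str,
                           win_start: int, win_end: int) -> bool:
    """Return True if the read has a deletion overlapping [win_start, win_end)."""
    ref_pos = read_start
    for length_text, op in CIGAR_RE.findall(cigar):
        length = int(length_text)
        if op == "D" and ref_pos < win_end and ref_pos + length > win_start:
            return True
        if op in "MDN=X":  # operations that consume reference bases
            ref_pos += length
    return False


def on_target_rate(reads: List[Tuple[int, str]],
                   win_start: int, win_end: int) -> float:
    """Fraction of reads carrying a deletion that overlaps the window."""
    if not reads:
        return 0.0
    edited = sum(has_deletion_in_window(start, cigar, win_start, win_end)
                 for start, cigar in reads)
    return edited / len(reads)


if __name__ == "__main__":
    toy_reads = [(100, "50M"), (100, "20M5D30M"), (100, "10M2D40M")]
    print(on_target_rate(toy_reads, 115, 130))
```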

Prediction of off-target sites of the MCPH1 sgRNAs

SpeedSeq⁸ and Cas-OFFinder²⁹ were used to predict the potential off-target sites of the sgRNAs. For SpeedSeq, genomic sites with “NGG” or “NAG” PAM motifs and an ungapped alignment with up to five mismatches to sgRNA1 or sgRNA2 were defined as potential off-target sites. For Cas-OFFinder, genomic sites with “NGG” or “NAG” PAM motifs and an ungapped alignment with up to seven mismatches to sgRNA1 or sgRNA2 were defined as potential off-target sites. All prediction results are listed in Supplementary Table 4.
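The site definition above (an ungapped match within a mismatch limit, followed by an NGG or NAG PAM) can be sketched in a few lines of Python. This toy scanner is not SpeedSeq or Cas-OFFinder: it searches only the forward strand of a short sequence, and `find_candidate_sites` is a hypothetical name. The protospacer used in the example is the first 20 bases of sgRNA2 as listed in the Methods, under the assumption that the trailing TGG is its PAM.

```python
from typing import List, Tuple


def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(sequence: str, protospacer: str,
                         max_mismatches: int) -> List[Tuple[int, int, str]]:
    """Scan the forward strand for ungapped near-matches followed by NGG or NAG.

    Returns (start, mismatches, pam) tuples with 0-based start positions.
    """
    sequence = sequence.upper()
    protospacer = protospacer.upper()
    n = len(protospacer)
    hits = []
    # The last valid start leaves room for the protospacer plus a 3-base PAM.
    for start in range(len(sequence) - n - 2):
        pam = sequence[start + n:start + n + 3]
        if pam[1] in "GA" and pam[2] == "G":  # NGG or NAG
            mismatches = count_mismatches(sequence[start:start + n], protospacer)
            if mismatches <= max_mismatches:
                hits.append((start, mismatches, pam))
    return hits


if __name__ == "__main__":
    guide = "TTACACAGATGCAGGACAGC"  # assumed sgRNA2 protospacer (PAM removed)
    toy_genome = "AAAA" + guide + "TGG" + "CC"
    print(find_candidate_sites(toy_genome, guide, max_mismatches=5))
```

A genome-wide predictor would also scan the reverse complement and use an index rather than a linear scan; the sketch only shows the matching rule itself.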

Kinship validation

To reduce computational complexity and linkage disequilibrium effect, we only included the independent variants of whole-genome to create the IBS (identical by state) matrix by PLINK 1.07³¹ with argument: indep-pairwise 50 5 0.2. A total of 866,199 independent variants were included in calculating IBD by PLINK 1.07.

DNM calling

We used the TrioDeNovo software to identify DNMs, with default settings, running it separately on each parent/offspring trio. After obtaining the results from TrioDeNovo (Raw_DNMs), we took the DNMs detected in the VCF sets of all callers as the set for downstream analysis (noStrict_DNMs). Then, we performed multiple filtrations to remove false-positive DNMs: (1) allele filtering: we kept only the DNM candidates in which at least one allele was absent from the parental genotypes (AF_DNMs); (2) dbSNP filtering: the DNM candidates must be absent from the public SNV database (dbSNPBuildID = 150) (dbSNPF_DNMs); (3) cross filtering: the DNM candidates shared between offspring were removed (CF_DNMs). To compare the consistency of the different callers, we intersected the DNMs called by the four tools after allele filtering; the Venn plots are provided in Fig. 1c and Supplementary Fig. 7. The number of DNMs remaining after each filtering step is presented in Supplementary Data 2.
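The order of operations above (caller overlap, then allele, dbSNP, and cross filtering) maps naturally onto set operations. The Python sketch below illustrates this with sites encoded as (chromosome, position, alternate allele) tuples. It is an assumption-laden outline, not the code used in this study: the function names, the site encoding, and the idea that the allele-filter result is supplied as a precomputed set are all hypothetical.

```python
from typing import Dict, Set, Tuple

# A candidate site: (chromosome, 1-based position, alternate allele).
Site = Tuple[str, int, str]


def consensus_dnms(raw_by_caller: Dict[str, Set[Site]]) -> Set[Site]:
    """Keep only candidates reported from the call sets of every caller."""
    call_sets = list(raw_by_caller.values())
    return set.intersection(*call_sets) if call_sets else set()


def filter_dnms(candidates: Set[Site],
                novel_allele_sites: Set[Site],
                dbsnp_sites: Set[Site],
                other_offspring_sites: Set[Site]) -> Set[Site]:
    """Apply allele, dbSNP, and cross filtering in that order."""
    after_allele = candidates & novel_allele_sites    # allele filtering
    after_dbsnp = after_allele - dbsnp_sites          # dbSNP filtering
    return after_dbsnp - other_offspring_sites        # cross filtering


if __name__ == "__main__":
    raw = {
        "gatk": {("chr1", 100, "T"), ("chr2", 200, "G"), ("chr3", 300, "A")},
        "platypus": {("chr1", 100, "T"), ("chr2", 200, "G")},
        "freebayes": {("chr1", 100, "T"), ("chr2", 200, "G"), ("chr4", 400, "C")},
        "samtools": {("chr1", 100, "T"), ("chr2", 200, "G")},
    }
    shared = consensus_dnms(raw)
    kept = filter_dnms(shared,
                       novel_allele_sites={("chr1", 100, "T"), ("chr2", 200, "G")},
                       dbsnp_sites={("chr2", 200, "G")},
                       other_offspring_sites=set())
    print(sorted(kept))
```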

Power evaluation in detection of DNMs

To resolve mosaic DNMs in the genome, we employed the method from the mouse study¹⁴. Considering our median WGS depth (46.22×) (Table 1), we set the minimum required de novo allele frequency to 10% (we must observe at least 4 mutant allele reads out of 40 to call a DNM)¹⁴. The probability of missing a mutation because it is seen in 3 or fewer reads is 0.005 in this case. In other words, the power to detect a mutation that occurred in a two-cell embryo is 99.5%.
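The figures above follow from a binomial model of read sampling, which the short Python sketch below reproduces. The depth of 40 and the threshold of 4 mutant reads come from the text; the expected mutant allele fraction of 0.25 is an assumption made here for a heterozygous mutation arising in one cell of a two-cell embryo (one of four allele copies), and `detection_power` is a hypothetical helper name.

```python
from math import comb


def detection_power(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """Probability of observing at least `min_alt_reads` mutant reads.

    Reads are modeled as independent draws, each mutant with probability
    `allele_fraction`, so the mutant read count is Binomial(depth, allele_fraction).
    """
    p_miss = sum(
        comb(depth, k) * allele_fraction ** k * (1.0 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)  # k = 0 .. min_alt_reads - 1 mutant reads
    )
    return 1.0 - p_miss


if __name__ == "__main__":
    power = detection_power(depth=40, allele_fraction=0.25, min_alt_reads=4)
    print(f"P(miss) = {1.0 - power:.4f}, power = {power:.4f}")
```

With these inputs the miss probability comes out just under 0.005, in line with the stated power of at least 99.5%; a mutation arising at the single-cell stage (allele fraction 0.5) would be detected with even higher probability.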

De novo SV detection using WGS short-read data

We conducted genome-wide SV calling using Delly³⁴ and Pindel³⁵, and called de novo SVs for each trio. SV filtering was performed with the following criteria: (1) SV quality was evaluated as “PASS”; (2) SVs had precise breakpoints; (3) SVs had >10 supporting reads; (4) SV length was between 50 bp and 2 Mb (SVs longer than 2 Mb were to be checked manually, and we did not see any); (5) SVs were present in both the Delly and Pindel sets (two SVs were considered the same when their reciprocal overlap was more than 50%). De novo SV filtering included the following criteria: (1) the SVs shared between parents and offspring were excluded (parent-inherit); (2) the SVs included in the public SV database or reported as common SVs in populations⁴¹ were excluded (known SV filtering); (3) the SVs shared among the offspring were removed (cross-filtering); (4) the IGV tool was used to manually check the candidate SVs, which were filtered according to their alignment and coverage. The SV numbers after each filtering step are presented in Supplementary Data 2.
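The reciprocal-overlap rule used to match SVs between callers can be written down compactly. The Python sketch below is one common way to compute it, assuming half-open reference intervals on the same chromosome and the same SV type; the function names and the interval convention are assumptions for illustration, not details taken from the Delly or Pindel output formats.

```python
def reciprocal_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> float:
    """Smaller of the two overlap fractions between intervals [start, end)."""
    if a_end <= a_start or b_end <= b_start:
        return 0.0  # empty or malformed interval
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def is_same_sv(a_start: int, a_end: int, b_start: int, b_end: int,
               threshold: float = 0.5) -> bool:
    """Treat two SVs as the same call when reciprocal overlap exceeds the threshold."""
    return reciprocal_overlap(a_start, a_end, b_start, b_end) > threshold


if __name__ == "__main__":
    print(reciprocal_overlap(100, 200, 120, 210))
    print(is_same_sv(100, 200, 120, 210))
```

Taking the minimum of the two fractions prevents a very large SV from being matched to a small one that it merely contains.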

Library preparation and PacBio sequencing

The MCPH1 gene targeted regions were amplified using the following primers: MCPH1-E2-5939bp-f: 5′-GGCGGGGGGATAACGGTGCCGAAAG-3′; MCPH1-E2-5939bp-r: 5′-GACAGGCATTAGGGAGGTCAAACAAGGCTCTTAGGGTA-3′; MCPH1-E4-5713bp-f: 5′-GTTTTCAAGGTTCATCATGTTGTCATCTGTATT-3′; MCPH1-E4-5713bp-r: 5′-ATTGTTTATGATTAGTGAGACGAAGGATTTGC-3′. The PCR was performed using LA Taq DNA Polymerase (ClonTech). We prepared libraries following the PacBio protocol: every five PCR products (each with a distinct barcode) were pooled together, followed by DNA damage repair, Exo III and VII digestion, two AMPure PB bead washes, annealing, binding, and sequencing.

Long-read data analysis

For long-read PacBio data, we applied the NGMLR software to map PacBio subreads to the rhesus macaque reference genome (version: rheMac8), and then used Sniffles to call SVs from the bam file; variants supported by at least ten high-quality reads were included in the downstream analysis⁴². SV filtering was performed with the following criteria: (1) Sniffles calls were evaluated as “PASS”; (2) SV-supporting reads were >5% of the whole-region read depth; (3) SVs had precise breakpoints. De novo SV filtering included the following criteria: (1) the SVs shared between parents and offspring were filtered out (parent-inherit), where two SVs were considered the same when their reciprocal overlap reached at least 50%; (2) the SVs shared between offspring were removed (cross-filtering). We used the IGV tool to visualize the alignment and coverage of long reads using the bam file from NGMLR. The SV numbers after each filtering step are listed in Supplementary Data 2.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The sequencing data generated in this study was deposited in NCBI BioProject PRJNA588331. The source data underlying Figs. 1–2, Supplementary Figs. 1–8, and Supplementary Table 4 are provided as a Source Data file. All other data are available from the authors upon reasonable request.

References

Hsu, P. D., Lander, E. S. & Zhang, F. Development and applications of CRISPR-Cas9 for genome engineering. Cell 157, 1262–1278 (2014).
Article CAS PubMed PubMed Central Google Scholar
Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C. & Doudna, J. A. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67 (2014).
Article CAS PubMed PubMed Central ADS Google Scholar
Anders, C., Niewoehner, O., Duerst, A. & Jinek, M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569–573 (2014).
Article CAS PubMed PubMed Central ADS Google Scholar
Jennings, C. G. et al. Opportunities and challenges in modeling human brain disorders in transgenic primates. Nat. Neurosci. 19, 1123–1130 (2016).
Article PubMed CAS Google Scholar
Izpisua Belmonte, J. C. et al. Brains, genes, and primates. Neuron 86, 617–631 (2015).
Article PubMed CAS Google Scholar
Chan, A. W. Progress and prospects for genetic modification of nonhuman primate models in biomedical research. ILAR J. 54, 211–223 (2013).
Article CAS PubMed PubMed Central Google Scholar
Shi, L. & Su, B. A transgenic monkey model for the study of human brain evolution. Zool. Res. 40, 236–238 (2019).
Article PubMed PubMed Central Google Scholar
Iyer, V. et al. Off-target mutations are rare in Cas9-modified mice. Nat. Methods 12, 479 (2015).
Article CAS PubMed Google Scholar
Cho, S. W. et al. Analysis of off-target effects of CRISPR/Cas-derived RNA-guided endonucleases and nickases. Genome Res. 24, 132–141 (2014).
Article CAS PubMed PubMed Central Google Scholar
Smith, C. et al. Whole-genome sequencing analysis reveals high specificity of CRISPR/Cas9 and TALEN-based genome editing in human iPSCs. Cell. Stem Cell. 15, 12–13 (2014).
Article CAS PubMed PubMed Central Google Scholar
Veres, A. et al. Low incidence of off-target mutations in individual CRISPR-Cas9 and TALEN targeted human stem cell clones detected by whole-genome sequencing. Cell. Stem Cell. 15, 27–30 (2014).
Article CAS PubMed PubMed Central Google Scholar
Anderson, K. R. et al. CRISPR off-target analysis in genetically engineered rats and mice. Nat. Methods 15, 512–514 (2018).
Article CAS PubMed PubMed Central Google Scholar
Schaefer, K. A. et al. Unexpected mutations after CRISPR-Cas9 editing in vivo. Nat. Methods 14, 547–548 (2017).
Article CAS PubMed PubMed Central Google Scholar
Iyer, V. et al. No unexpected CRISPR-Cas9 off-target activity revealed by trio sequencing of gene-edited mice. PLoS Genet. 14, e1007503 (2018).
Article PubMed PubMed Central CAS Google Scholar
Kim, S. T. et al. Response to “Unexpected mutations after CRISPR-Cas9 editing in vivo”. Nat. Methods 15, 239–240 (2018).
Article CAS PubMed Google Scholar
Wilson, C. J. et al. Response to “Unexpected mutations after CRISPR-Cas9 editing in vivo”. Nat. Methods 15, 236–237 (2018).
Article CAS PubMed Google Scholar
Lescarbeau, R. M., Murray, B., Barnes, T. M. & Bermingham, N. Response to “Unexpected mutations after CRISPR-Cas9 editing in vivo”. Nat. Methods 15, 237 (2018).
Lareau, C. A. et al. Response to “Unexpected mutations after CRISPR-Cas9 editing in vivo”. Nat. Methods 15, 238–239 (2018).
Nutter, L. M. J. et al. Response to “Unexpected mutations after CRISPR-Cas9 editing in vivo”. Nat. Methods 15, 235–236 (2018).
Kosicki, M., Tomberg, K. & Bradley, A. Repair of double-strand breaks induced by CRISPR-Cas9 leads to large deletions and complex rearrangements. Nat. Biotechnol. 36, 765–771 (2018).
Cui, Y. et al. Generation of a precise Oct4-hrGFP knockin cynomolgus monkey model via CRISPR/Cas9-assisted homologous recombination. Cell Res. 28, 383–386 (2018).
Jackson, A. P. et al. Primary autosomal recessive microcephaly (MCPH1) maps to chromosome 8p22-pter. Am. J. Hum. Genet. 63, 541–546 (1998).
Wang, Y. & Su, B. Molecular evolution of microcephalin, a gene determining human brain size. Hum. Mol. Genet. 13, 1131–1137 (2004).
Shi, L. & Su, B. Transgenic rhesus monkeys carrying the human MCPH1 gene copies show human-like neoteny of brain development. Natl Sci. Rev. 6, 480–493 (2019).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Rimmer, A. et al. Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nat. Genet. 46, 912–918 (2014).
Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. Preprint at https://arxiv.org/abs/1207.3907 (2012).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Bae, S., Park, J. & Kim, J.-S. Cas-OFFinder: a fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics 30, 1473–1475 (2014).
Wei, Q. et al. A Bayesian framework for de novo mutation calling in parents-offspring trios. Bioinformatics 31, 1375–1381 (2015).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Kong, A. et al. Rate of de novo mutations and the importance of father’s age to disease risk. Nature 488, 471–475 (2012).
Besenbacher, S., Hvilsom, C., Marques-Bonet, T., Mailund, T. & Schierup, M. H. Direct estimation of mutations in great apes reconciles phylogenetic dating. Nat. Ecol. Evol. 3, 286–292 (2019).
Rausch, T. et al. DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28, i333–i339 (2012).
Ye, K. et al. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25, 2865–2871 (2009).
Yu, Y. et al. Dna2 nuclease deficiency results in large and complex DNA insertions at chromosomal breaks. Nature 564, 287–290 (2018).
Liang, Y. et al. BRIT1/MCPH1 is essential for mitotic and meiotic recombination DNA repair and maintaining genomic stability in mice. PLoS Genet. 6, e1000826 (2010).
Gruber, R. et al. MCPH1 regulates the neuroprogenitor division mode by coupling the centrosomal cycle with mitotic entry through the Chk1-Cdc25 pathway. Nat. Cell Biol. 13, 1325–1334 (2011).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).
He, Y. et al. Long-read assembly of the Chinese rhesus macaque genome and identification of ape-specific structural variants. Nat. Commun. 10, 4233 (2019).
Sedlazeck, F. J. et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 15, 461–468 (2018).


Acknowledgements

We thank Yan Guo for her technical assistance in this study. We thank Xingxu Huang for providing plasmids. This study was supported by grants from the National Natural Science Foundation of China (31621062, 31730088, 91731303, 31525014, 31771388, and 31711530221), the Strategic Priority Research Program (XDB13010000), the Key Research Program of Frontier Sciences (QYZDJ-SSW-SYS009) of the Chinese Academy of Sciences, the Program of Shanghai Academic Research Leader (16XD1404700), the National Key Research and Development Program (2016YFC0906403), and the Shanghai Municipal Science and Technology Major Project (2017SHZDZX01).

Author information

These authors contributed equally: Xin Luo, Yaoxi He, Chao Zhang, Xiechao He, Lanzhen Yan.

Authors and Affiliations

State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, 650223, Kunming, China
Xin Luo, Yaoxi He, Xiechao He, Min Li, Ting Hu, Yan Hu, Jin Jiang, Xiaoyu Meng, Ping Zheng & Bing Su
Kunming College of Life Science, University of Chinese Academy of Sciences, 100101, Beijing, China
Xin Luo, Yaoxi He, Min Li, Ting Hu, Yan Hu, Jin Jiang & Xiaoyu Meng
Kunming Primate Research Center, Chinese Academy of Sciences, 650223, Kunming, China
Xin Luo, Yaoxi He, Xiechao He, Lanzhen Yan, Xudong Zhao & Bing Su
Chinese Academy of Sciences (CAS) Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology (PICB), Shanghai Institutes for Biological Sciences, CAS, 200031, Shanghai, China
Chao Zhang & Shuhua Xu
Key Laboratory of Animal Models and Human Disease Mechanisms of Chinese Academy of Sciences, Kunming Institute of Zoology, Chinese Academy of Sciences, 650203, Kunming, China
Lanzhen Yan & Xudong Zhao
Yunnan Key Laboratory of Primate Biomedicine Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, 650500, Kunming, China
Weizhi Ji
Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, 650223, Kunming, China
Ping Zheng, Shuhua Xu & Bing Su
School of Life Science and Technology, ShanghaiTech University, 201210, Shanghai, China
Shuhua Xu
Collaborative Innovation Centre of Genetics and Development, 200438, Shanghai, China
Shuhua Xu

Contributions

X.L. and B.S. designed the study; X.L., M.L., J.J., T.H., Y.H. and X.M. performed experiments; X.H., L.Y., P.Z. and X.Z. assisted with reproductive techniques, microinjection, and animal care; Y.H., C.Z., X.L. and S.X. performed data analysis; W.J. provided the WGS data of the knockin monkeys; X.L., Y.H. and B.S. wrote the paper.

Corresponding authors

Correspondence to Ping Zheng, Shuhua Xu or Bing Su.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Data 1

Supplementary Data 2

Peer Review File

Reporting Summary

Description of Additional Supplementary Files

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Luo, X., He, Y., Zhang, C. et al. Trio deep-sequencing does not reveal unexpected off-target and on-target mutations in Cas9-edited rhesus monkeys. Nat Commun 10, 5525 (2019). https://doi.org/10.1038/s41467-019-13481-y

Received: 03 March 2019
Accepted: 08 November 2019
Published: 04 December 2019
DOI: https://doi.org/10.1038/s41467-019-13481-y
