Despite their fundamental biological and clinical importance, the molecular mechanisms that regulate the first cell fate decisions in the human embryo are not well understood. Here we use CRISPR–Cas9-mediated genome editing to investigate the function of the pluripotency transcription factor OCT4 during human embryogenesis. We identified an efficient OCT4-targeting guide RNA using an inducible human embryonic stem cell-based system and microinjection of mouse zygotes. Using these refined methods, we efficiently and specifically targeted the gene encoding OCT4 (POU5F1) in diploid human zygotes and found that blastocyst development was compromised. Transcriptomics analysis revealed that, in POU5F1-null cells, gene expression was downregulated not only for extra-embryonic trophectoderm genes, such as CDX2, but also for regulators of the pluripotent epiblast, including NANOG. By contrast, Pou5f1-null mouse embryos maintained the expression of orthologous genes, and blastocyst development was established, but maintenance was compromised. We conclude that CRISPR–Cas9-mediated genome editing is a powerful method for investigating gene function in the context of human development.
Early mammalian embryogenesis is controlled by mechanisms that govern the balance between pluripotency and differentiation. Expression of early lineage-specific genes varies substantially between species1,2,3, with implications for developmental control and stem cell derivation. However, the mechanisms that pattern the human embryo are unclear, because of a lack of methods to efficiently perturb gene expression of early lineage specifiers in this species.
Recent advances in genome editing using the CRISPR (clustered regularly interspaced, short palindromic repeat)–Cas (CRISPR-associated) system have greatly increased the efficiency of genetic modification. The Streptococcus pyogenes Cas9 endonuclease is guided to homologous DNA sequences via a single-guide RNA (sgRNA) whereby it induces double strand breaks (DSBs) at the target site4. Endogenous DNA repair mechanisms function to resolve the DSBs, including error-prone non-homologous or micro-homology-mediated end joining, which can lead to insertions or deletions (indels) of nucleotides that can result in the null mutation of the target gene. CRISPR–Cas9-mediated editing has been attempted in abnormally fertilized tripronuclear human zygotes and a limited number of normally fertilized human zygotes, with variable success5,6,7,8. To determine whether CRISPR–Cas9 can be used to understand gene function in human preimplantation development, we chose to target POU5F1, a gene encoding the developmental regulator OCT4, as a proof-of-principle. Zygotic POU5F1 is thought to be first transcribed at the four- to eight-cell stage coincident with embryo genome activation (EGA), and OCT4 protein is not detectable until approximately the eight-cell stage2,3. OCT4 perturbation would be predicted to cause a clear developmental phenotype based on studies in the mouse9,10 and human embryonic stem (ES) cells11.
By using an inducible human ES cell-based CRISPR–Cas9 system and optimizing mouse zygote microinjection techniques, we have identified conditions that allowed us to target POU5F1 efficiently and precisely in human zygotes. Live embryo imaging revealed that while OCT4-targeted human embryos initiate blastocyst formation, the inner cell mass (ICM) forms poorly, and embryos subsequently collapse. Mutations affecting POU5F1 in human blastocysts are associated with the downregulation of genes associated with all three preimplantation lineages, including NANOG (epiblast), GATA2 (trophectoderm) and GATA4 (primitive endoderm). By contrast, in OCT4-null mouse blastocysts, genes such as Nanog continue to be expressed in the ICM. The insights gained from these investigations advance our understanding of human development and suggest that OCT4 has an earlier role in the progression of the human blastocyst compared to the mouse, and therefore that there are distinct mechanisms of lineage specification between these species.
Selection of an sgRNA targeting POU5F1
To target POU5F1, we selected four sgRNAs using a standard in silico prediction tool12: two targeting the exon encoding the N-terminal domain of OCT4 (sgRNA1-1 and sgRNA1-2), one targeting the exon encoding the conserved DNA-binding POU homeodomain13,14 (sgRNA2b) and one targeting the end of the POU domain and the start of the C-terminal domain (sgRNA4) (Extended Data Fig. 1a). To screen candidate sgRNAs, we took advantage of human ES cells as an unlimited resource that reflects the cellular context of the human preimplantation embryo. We engineered isogenic human ES cells constitutively expressing the Cas9 gene, together with a tetracycline-inducible sgRNA11 (Fig. 1a), thereby allowing comparative assessment of sgRNA activities.
Cells were collected every day for five days for flow cytometry analysis, which revealed that induction of each of the sgRNAs in human ES cells imposed remarkably different temporal effects on OCT4 protein expression (Extended Data Fig. 1b). sgRNA2b was the most efficient at rapidly causing loss of OCT4 protein expression, with only 15.6% of cells retaining detectable OCT4 by day 5 of induction. Immunofluorescence analysis following sgRNA2b induction confirmed the efficient loss of OCT4 expression (Fig. 1b, Extended Data Fig. 2a). Conversely, in human ES cells induced to express sgRNAs 1-1, 1-2 or 4, 43.7%, 70.5% and 51.7% of cells retained OCT4 expression at the equivalent time, respectively (Extended Data Fig. 1b). To identify the transcriptional consequences of OCT4 depletion, we performed quantitative PCR with reverse transcription (qRT–PCR) and RNA sequencing (RNA-seq) analysis on induced and non-induced sgRNA2b-expressing human ES cells (Extended Data Figs 1c, d and 2b). Induction of sgRNA2b resulted in downregulation of pluripotency genes such as NANOG, ETS1 and DPPA3, consistent with OCT4 depletion causing exit from self-renewal. Furthermore, the differentiation-associated genes PAX6, SOX17, SIX3, GATA2 and SOX9 were upregulated after induction of sgRNA2b, suggesting that OCT4 normally restrains differentiation (Extended Data Figs 1c, d and 2a, b).
Analysing POU5F1 targeting specificity
To compare the on-target editing efficiencies and mutation spectrums induced by candidate sgRNAs, we performed a time-course genotypic analysis on cells collected across four days after sgRNA induction. Targeted deep sequencing of the on-target site revealed indels from as early as 24 h after induction of sgRNA2b, but not until 48 h after induction of sgRNAs1-1, 1-2 or 4 (Fig. 1c). sgRNA2b-induced indels most commonly comprised a 2-bp deletion upstream of the protospacer-adjacent motif (PAM) site leading to a frameshift mutation and a premature stop codon (Extended Data Fig. 3), consistent with the loss of OCT4 protein expression.
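The relationship between indel size and reading frame described above can be illustrated with a short sketch. This is not the authors' analysis pipeline; the function name `classify_indel` is hypothetical, and the sketch considers only the net length change within a coding exon, ignoring splice sites and the position of any resulting stop codon.

```python
def classify_indel(length_change: int) -> str:
    """Classify an indel by its net change in coding-sequence length (bp).

    Negative values are deletions, positive values are insertions.
    """
    if length_change == 0:
        return "no length change"
    # A net change that is not a multiple of 3 shifts the reading frame.
    return "in-frame" if length_change % 3 == 0 else "frameshift"


# A 2-bp deletion shifts the frame; a 3-bp deletion removes one codon in frame.
for change in (-2, -3, 1):
    print(change, classify_indel(change))
```

A frameshift of this kind usually, but not necessarily, introduces a premature stop codon downstream; confirming that requires translating the edited sequence.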
We evaluated putative off-target sites identified by their sequence similarity to the seed region of sgRNA2b (Extended Data Fig. 4a, b). We did not observe off-target indels in sgRNA2b-induced human ES cells, nor any sequence alterations above background PCR error rates observed in control human ES cell lines. In parallel, we performed a genome-wide unbiased evaluation of off-target events using Digenome-seq (Extended Data Fig. 4c). Targeted deep sequencing across the experimentally determined putative off-target sites revealed that indels had occurred only at the on-target site (Extended Data Fig. 4d). Furthermore, we used the WebLogo program to determine the most frequent sequences associated with putative sites identified from Digenome-seq15,16 (Extended Data Fig. 4e). Deep sequencing at these sites also confirmed that no off-target events had occurred (Extended Data Fig. 4f). In all, owing to both its efficient mutagenicity and its high on-target specificity, sgRNA2b appeared the most promising.
sgRNA activity in mouse embryos
We used published sgRNA and Cas9 mRNA zygote microinjection conditions17 to further assess sgRNA activity and optimize microinjection methodologies in mouse zygotes. As it has been shown that OCT4-null mouse blastocysts lack expression of the primitive endoderm marker SOX17 owing to a cell-autonomous requirement for FGF4 and MAPK signalling9,18, we used the absence of both OCT4 and SOX17 immunostaining to identify OCT4-deficient embryos (Fig. 1d). This OCT4-null phenotype was observed in 54% of embryos injected with Cas9 mRNA and sgRNA2b, and in 0%, 10% or 3% of embryos injected with Cas9 mRNA and sgRNA1-1, sgRNA1-2 or sgRNA4, respectively (Fig. 1e). These data confirm that sgRNA2b is superior to the other tested sgRNAs at inducing null mutations in both mouse embryos and human ES cells. We next tested a greater range of Cas9 mRNA and sgRNA concentrations to identify conditions that could enhance rates of mutagenesis (Extended Data Fig. 5a). We confirmed that the previously reported concentrations of 100 ng μl−1 Cas9 mRNA and 50 ng μl−1 sgRNA17 were optimal for inducing an OCT4-null phenotype.
It has been suggested that microinjection of sgRNA and Cas9 ribonucleoprotein complexes may reduce mosaicism and allelic complexity by bypassing the requirement for Cas9 translation and sgRNA–Cas9 complex formation in embryos19,20. To test this, we microinjected mouse pronuclear zygotes with preassembled ribonucleoprotein complexes containing varying concentrations of Cas9 protein (20–200 ng μl−1) and sgRNA2b (20–100 ng μl−1; Fig. 1f and Extended Data Fig. 5b). Immunofluorescence analysis revealed that the sgRNA–Cas9 complex was superior to Cas9 mRNA in causing loss of both OCT4 and SOX17, and that the optimal concentration comprised 50 ng μl−1 Cas9 protein and 25 ng μl−1 sgRNA (Fig. 1f). Notably, MiSeq analysis demonstrated that 83% of blastocysts derived from sgRNA2b–Cas9 complex microinjections had four or fewer different types of indels (Fig. 1g), suggesting that editing occurred before or at the two-cell stage. By contrast, only 53% of embryos microinjected with sgRNA2b and Cas9 mRNA exhibited this range of indels. Furthermore, a greater proportion of blastocysts that formed after sgRNA2b and Cas9 mRNA microinjection had six or more different types of detectable indels (42%) compared to those that formed after microinjection of the sgRNA2b–Cas9 complex (8%). This increased mutational spectrum suggests that, following Cas9 mRNA injection, DNA editing occurred between the three- and four-cell stages. Consistent with previous reports21, we observed a stereotypic pattern in the types of indels detected in independently targeted embryos, including the representative 28-bp deletion (Extended Data Fig. 5c), which was distinct from those induced in human ES cells.
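The reasoning that links the number of distinct indels to the timing of editing can be sketched as follows. The sketch assumes a diploid embryo in which each allele in each cell is edited independently and each editing event produces at most one indel type, so the number of distinct indels is bounded by twice the number of cells present when editing occurred. The function names are hypothetical and the result is only a lower bound on cell number.

```python
import math

ALLELES_PER_CELL = 2  # assumption: diploid embryo, two target alleles per cell


def max_indel_types(cells_at_editing: int) -> int:
    """Upper bound on distinct indel types if editing occurs at this cell number."""
    return ALLELES_PER_CELL * cells_at_editing


def min_cells_at_editing(distinct_indels: int) -> int:
    """Smallest cell number that could account for the observed indel types."""
    return max(1, math.ceil(distinct_indels / ALLELES_PER_CELL))


# Four or fewer indel types are compatible with editing at or before the
# two-cell stage; six types require at least three cells.
for observed in (4, 6):
    print(observed, "indel types -> at least", min_cells_at_editing(observed), "cells")
```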
As well as lacking SOX17 and OCT4 expression, mouse embryos microinjected with the sgRNA2b–Cas9 complex recapitulated other reported OCT4-null phenotypes, such as downregulation of PDGFRA, SOX7, GATA6 and GATA4 in the primitive endoderm (Extended Data Fig. 5d). Consistent with the role of OCT4 in repressing trophectoderm genes9, the few ICM cells that could be detected in sgRNA2b–Cas9 microinjected embryos expressed CDX2 ectopically (Extended Data Fig. 5d). When plated in mouse ES cell derivation conditions, these embryos failed to generate ICM outgrowths, and instead differentiated into trophoblast-like cells (Extended Data Fig. 5e). By contrast, blastocysts derived from non-injected embryos formed ICM outgrowths in most instances, as did blastocysts from embryos microinjected with Cas9 protein alone or an sgRNA–Cas9 complex targeting Dmc1 (a gene not essential for preimplantation development). Having thus determined sgRNA2b to be an efficient and specific guide capable of generating a null mutation of POU5F1 or Pou5f1 in human ES cells and mouse preimplantation embryos, respectively, we next used this guide together with our optimized microinjection technique to target POU5F1 in human preimplantation embryos.
Targeting POU5F1 in human preimplantation embryos
To test whether OCT4 is required in human embryos, we performed CRISPR–Cas9 editing on thawed in vitro fertilized (IVF) zygotes that were donated as surplus to infertility treatment. We microinjected 37 zygotes with the sgRNA2b–Cas9 ribonucleoprotein complex (Supplementary Video 1), and 17 zygotes with Cas9 protein alone to control for the microinjection technique. Of the zygotes that were microinjected with sgRNA2b–Cas9, 30 embryos retained both pronuclei during microinjection, with pronuclear fading observed approximately 6 h later and cytokinesis on average 5 h later (Supplementary Video 2). These timings are similar to those previously published22,23 and indicate that microinjection was performed when the embryos were in S phase of the cell cycle (Fig. 2a). Genome editing by the ribonucleoprotein complex has been estimated24 to start after approximately 3 h in vitro and to persist for 12–24 h, so CRISPR–Cas9-induced DSBs are likely to be formed during late S phase or subsequently at G2 phase. In seven of the zygotes that were microinjected with sgRNA2b–Cas9, the pronuclei had already faded after thawing, showing that they had exited S phase and were undergoing syngamy. These embryos consequently underwent cell division approximately 3 h after microinjection. In these embryos, editing is likely to have occurred during the G1 phase of the next cell cycle, at the two-cell stage (Fig. 2a), which would promote mosaicism.
Time-lapse microscopy of the embryos showed that the timings of cleavage divisions following pronuclear fading were similar between embryos microinjected with Cas9 protein or sgRNA2b–Cas9 (Fig. 2b, c). By the eight-cell stage, cleavage arrest was observed in 62% (23 out of 37) of sgRNA2b–Cas9-microinjected embryos compared to 53% (9 out of 17) of Cas9-microinjected control embryos (Fig. 2d). As developmental arrest at the onset of EGA at the eight-cell stage correlates strongly with aneuploidy in IVF embryos25, we also sought to determine embryo karyotypes. We performed low-pass whole-genome sequencing, which has been shown to accurately estimate gross chromosome anomalies26. We collected blastomeres from sgRNA2b–Cas9-microinjected embryos arrested up to the eight-cell stage and detected chromosomal loss or gain in 83% (five out of six) of these embryos (Extended Data Fig. 6a), which is consistent with rates reported by preimplantation genetic screening26,27. Trophectoderm biopsies of a subset of blastocysts that developed following sgRNA2b–Cas9 microinjection showed that 60% (three out of five) were euploid (Fig. 2e, Extended Data Fig. 6a). The other two blastocysts exhibited karyotypic abnormalities, including the loss of chromosome 16 (Extended Data Fig. 6b), an abnormality frequently observed in human preimplantation embryos and thus likely to be unrelated to targeting25. In the Cas9-microinjected control group, 57% (four out of seven) of blastocysts were euploid, and aneuploidies were observed in the remaining three blastocysts, including the loss of chromosome 14 in two sibling-matched control embryos, and the gain of chromosomes 15 and 18 (Fig. 2e, Extended Data Fig. 6a, b). Altogether, these data suggest that CRISPR–Cas9 targeting does not increase the rate of karyotypic anomalies in human embryos.
Forty-seven per cent (8 out of 17) of Cas9-microinjected control embryos developed to the blastocyst stage, a rate equivalent to those of uninjected controls28, suggesting that the microinjection technique did not affect embryo viability (Fig. 2d). However, significantly fewer of the sgRNA2b–Cas9-microinjected embryos—only 19% (7 out of 37)—developed to the blastocyst stage (Fig. 2d, P = 0.03). The blastocysts that formed following sgRNA2b–Cas9 protein microinjection were of variable quality (Extended Data Fig. 6c). Although all blastocysts had a discernible blastocoel cavity, only some possessed a small compact ICM (Extended Data Fig. 6c), and all retained a thick zona pellucida, in contrast to Cas9-microinjected controls. Embryos arising from zygotes microinjected with sgRNA2b–Cas9 also went through iterative cycles of expanding and initiating blastocyst formation and then collapsing, until some embryos ultimately degenerated (Supplementary Videos 2 and 3). These findings suggest that targeting OCT4 in human embryos reduces both viability and quality of blastocysts.
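The comparison of blastocyst rates (7 out of 37 versus 8 out of 17) can be reproduced approximately with a short calculation. This section does not state which statistical test produced the reported P value, so the use of Pearson's chi-square test without continuity correction here is an assumption, and the function name is hypothetical. Under that assumption the calculation gives a P value of roughly 0.03.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 d.f.: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: sgRNA2b-Cas9 (7 blastocysts, 30 not) and Cas9 control (8 blastocysts, 9 not).
chi2, p_value = chi_square_2x2(7, 30, 8, 9)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
```

With counts this small, an exact test may give a somewhat different value, so the sketch should be read as a plausibility check rather than a reconstruction of the original analysis.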
To measure on-target editing efficiency, we performed targeted deep and/or Sanger sequencing of separate individual cells microdissected from sgRNA2b–Cas9-microinjected embryos arrested before the eight-cell stage, and found indels at the POU5F1 on-target site in 71% (five out of seven) of embryos (Fig. 3a, purple line). The most frequently observed indels in sgRNA2b–Cas9-microinjected embryos were the 2-bp and 3-bp deletions that were observed in the sgRNA2b-induced human ES cells (Fig. 3b, Extended Data Fig. 7a, b). This finding indicates that human ES cells can be used not only to screen sgRNA efficiency, but also to predict the in vivo mutation spectrum induced by CRISPR–Cas9-mediated genome editing. We also detected larger POU5F1 deletions in the human embryos than in human ES cells, similar to our observations in mouse embryos (Fig. 3b, Extended Data Fig. 7a, b). Furthermore, targeted deep and/or Sanger sequencing in edited cells demonstrated that off-target mutations were undetectable above background PCR error rates, confirming the specificity of the sgRNA (Extended Data Fig. 7c, d).
We next assessed mutational signatures in more developmentally advanced embryos, after EGA. Notably, we confirmed that on-target editing had occurred in eight out of eight sgRNA2b–Cas9-microinjected embryos analysed from the eight-cell to the blastocyst stage (Fig. 3a, green line). However, these embryos invariably retained wild-type copies of the POU5F1 allele in at least one cell (Fig. 3a). In sgRNA2b–Cas9-microinjected human embryos, OCT4 protein expression was downregulated in most cleavage-stage cells and undetectable above background in others, confirming the high efficiency of editing (Fig. 3c, Extended Data Fig. 8a). However, we were able to identify at least one cell that had nuclear OCT4 staining above background levels in all cases (Fig. 3c, Extended Data Fig. 8a). Moreover, despite a significant reduction in cell number (P = 0.001), blastocyst-stage embryos also retained OCT4 expression in a subset of cells (Fig. 3d, e, Extended Data Fig. 8b, c). These findings suggest that POU5F1 targeting efficiency is high, and that only embryos with partial OCT4 expression are able to progress to the blastocyst stage.
To determine whether there is a high degree of editing in embryos before the onset of OCT4 expression, we microinjected four additional human embryos with the sgRNA2b–Cas9 complex and stopped their development before the eight-cell stage. One-hundred per cent (four out of four) of these embryos had detectable indels, with two embryos lacking wild-type POU5F1 alleles (Fig. 3a, black line). In one embryo, editing occurred in all blastomeres, although one blastomere retained one copy of the wild-type allele. In another embryo, although four out of five blastomeres had been edited, one blastomere retained both copies of the wild-type allele. Together with the cleavage-arrested embryos above, these data show that in 45% (five out of eleven) of cleavage stage embryos (either stopped or developmentally arrested), all of the cells analysed from each embryo had no detectable POU5F1 wild-type alleles, indicating high rates of editing. In addition, these data suggest that OCT4 has an unexpectedly earlier function in humans than in mice, before blastocyst formation.
Loss of OCT4 associated with gene mis-expression
To identify globally which genes might be affected by the loss of OCT4, we microdissected single cells from microinjected embryos at the blastocyst stage. We adapted a method to isolate both RNA and DNA from single cells29 in order to perform RNA-seq and targeted deep or Sanger sequencing of on-target and putative off-target sites. Principal component analysis showed that cells from sgRNA2b–Cas9-microinjected human blastocysts clustered distinctly from those derived from Cas9-microinjected controls (Fig. 4a). Notably, the cluster from sgRNA2b–Cas9-microinjected embryos contained not only cells that were homozygous null mutant for POU5F1, but also those that were wild-type or heterozygous. This finding suggests that loss of POU5F1 may impose non-cell autonomous effects on gene expression in neighbouring wild-type or heterozygous cells.
Differential gene expression analysis indicated that the genes that were most highly mis-expressed in the sgRNA2b–Cas9-targeted human blastocysts (compared to the Cas9 controls) included those that we previously identified as highly enriched in the epiblast, including NANOG, KLF17, DPPA5, ETV4, TDGF1 and VENTX (Extended Data Fig. 9a, Supplementary Table 1). Immunofluorescence analysis confirmed that even in cells that retained OCT4, the expression of NANOG was absent (Fig. 4b, Extended Data Fig. 8c). In striking contrast, OCT4-null mouse blastocysts maintained Nanog expression in the ICM (Fig. 4b, Extended Data Fig. 8d, e), as previously reported9,18.
In OCT4-null cells, several trophectoderm-associated genes were also downregulated, including CDX2, HAND1, DLX3, TEAD3, PLAC8 and GATA2 (Extended Data Fig. 9a, Supplementary Table 1). We confirmed loss of GATA2 protein expression in human sgRNA2b–Cas9-injected embryos (Fig. 4c, Extended Data Fig. 8f). Coupled with the failure to maintain a fully expanded blastocyst, this finding suggests that the integrity of the trophectoderm may be compromised in OCT4-targeted embryos. To investigate this further, we performed immunofluorescence analysis for ZO-1, which incorporates into tight junctions during trophectoderm formation. In sgRNA2b–Cas9-targeted human blastocysts, ZO-1 expression was interrupted, patchy and diffuse compared to the uniform network-like distribution in uninjected control embryos (Fig. 4d). By contrast, in mouse Oct4-null embryos, expression of trophectoderm markers such as Cdx2, Hand1 and Gata3 is upregulated9.
In addition, primitive endoderm markers such as GATA4 were downregulated in sgRNA2b–Cas9-microinjected embryos compared to Cas9 controls. Immunofluorescence analysis suggested that SOX17 protein expression was also downregulated (Fig. 3d, Extended Data Fig. 8b). Moreover, we were surprised to observe ectopic expression of PAX6 in some cells from sgRNA2b–Cas9-edited human blastocysts (Extended Data Fig. 9a, Supplementary Table 1). The lack of expression of genes associated with all three lineages in the blastocysts suggests that OCT4-targeted embryos either failed to initiate the expression of these genes or downregulated their expression as development progressed. To determine whether the gene expression patterns in OCT4-targeted cells more closely resemble those of cells from earlier stages of human development, we integrated our data with a previously published dataset comprising all stages of human preimplantation development3,30 (Fig. 4e, Extended Data Fig. 9b). This revealed that while cells from OCT4-targeted embryos were progressing towards the transcriptional state of the blastocyst, they were more dispersed and heterogeneous in their gene expression. Together, our data suggest that the integrity of the human blastocyst is compromised as a consequence of OCT4 downregulation. As a result, all lineages are negatively affected, pointing to a functional role for OCT4 in early human development.
CRISPR–Cas9-mediated genome editing represents a transformative method to evaluate the function of putative regulators of human preimplantation development. We have demonstrated the importance of initially screening sgRNA efficiencies and mutagenic patterns before targeting in human embryos, as sgRNAs were not equivalently efficient in inducing POU5F1-null mutations despite scoring highly by in silico predictions. We have shown that OCT4 loss has different consequences in human and mouse embryos, consistent with other differences reported between these species. For example, pharmacological inhibition of FGF and downstream ERK signalling leads to ectopic expression of pluripotency factors in the mouse, but not the human at equivalent stages31,32.
Unexpectedly, our data suggest that OCT4 may be required earlier in human development than in mice, for instance during the cleavage or morula stages, when OCT4 expression is initiated (Fig. 4f). As the mouse maternal–zygotic Pou5f1-null mutation phenocopies the zygotic-null mutation9, it is unlikely that persistence of maternal transcripts or proteins compensates for the loss of OCT4 expression, and any additional compensatory mechanisms that may be present in the mouse do not appear to be conserved in the regulation of human development. The mis-expression of genes associated with all three blastocyst lineages in OCT4-targeted human blastocysts further suggests that OCT4 may have an essential function before this stage. In the future, it would be informative to determine whether OCT4 mutation leads to changes in gene expression before the blastocyst stage, which may explain the failure of blastocyst development. Alternatively, inducing POU5F1-null mutations in human embryos slightly later in development, following the onset of EGA, may bypass its earlier critical role and thereby delineate its function in the fully formed blastocyst.
Notably, CRISPR–Cas9-mediated genome editing does not appear to increase genomic instability or developmental arrest before EGA, suggesting that this method could be used to understand the function of other putative lineage specifiers. In future, a number of adaptations may provide further advantages. Co-injection of the CRISPR–Cas9 components with sperm during intracytoplasmic sperm injection33 might allow more time for targeting before the first cell division, further increasing editing efficiency. Indeed, this approach has been used recently in human embryos8. Introducing multiple sgRNAs might increase targeting efficiency, but may also increase the risk of off-target mutations. Alternatively, introducing the CRISPR–Cas9 components alongside a donor oligonucleotide complementary to the target locus and harbouring a premature stop codon should favour the generation of null mutations via homology-directed repair. This approach may not be straightforward, given that recent attempts to correct an abnormal paternal gene variant were suggested to use the maternal allele for HDR rather than an introduced template8, although this requires further validation34. Targeting genes that are not essential for, or have a later or more specific role in, preimplantation development will also inform our interpretation of the OCT4 phenotype. At present, we cannot be certain that the early developmental arrest is associated with the loss of OCT4 rather than some non-specific effect of injecting both Cas9 and the sgRNA, as opposed to Cas9 alone. However, a previous study showed that human embryos in which a non-essential gene was targeted exhibited rates of blastocyst formation similar to controls8. This suggests that the effects we see here are due to loss of OCT4. In summary, we have developed an optimized approach to target OCT4 in human embryos, thus suggesting that OCT4 has a different function in humans than in mice. This proof of principle lays out a framework for future investigations that could transform our understanding of human biology, thereby leading to improvements in the establishment and therapeutic use of stem cells and in IVF treatments.
This study was approved by the UK Human Fertilisation and Embryology Authority (HFEA): research licence number 0162, and the Health Research Authority’s Research Ethics Committee (Cambridge Central reference number 16/EE/0067).
The process of licence approval entailed independent peer review along with consideration by the HFEA Licence Committee. Our research is compliant with the HFEA Code of Practice and has undergone inspections by the HFEA since the licence was granted. Research donors were recruited from patients at Bourn Hall clinic.
Informed consent was obtained from all couples that donated spare embryos following IVF treatment. Before giving consent, people donating embryos were provided with all of the necessary information about the research project, an opportunity to receive counselling and the conditions that apply within the licence and the HFEA Code of Practice. Specifically, patients signed a consent form authorizing the use of genome editing techniques including CRISPR–Cas9 on donated embryos. Donors were informed that after the embryos had been genetically modified their development would be stopped before 14 days post-fertilization and that subsequent biochemical and genetic studies would be performed. Informed consent was also obtained from donors for all the results of these studies to be published in scientific journals. No financial inducements were offered for donation. Consent was not obtained to perform genetic tests on patients and no such tests were performed. The patient information sheets and consent document provided to patients are publicly available (https://www.crick.ac.uk/research/a-z-researchers/researchers-k-o/kathy-niakan/hfea-licence/). Embryos surplus to the patient’s IVF treatment were donated cryopreserved and were transferred to the Francis Crick Institute where they were thawed and used in the research project.
Power analysis and data acquisition
The R statistical package pwr was used to estimate the number of human embryos required to assess the function of OCT4 compared to microinjected controls. The calculation was based on a two-sample t-test at a significance level of P < 0.05 and an effect size of 0.8, which assumes an observable difference between the CRISPR-injected and control embryos. The sample size was estimated to be 25 CRISPR-targeted embryos.
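A sample-size estimate of this kind can be approximated in a few lines. The sketch below is not the pwr calculation itself: it uses the normal approximation to the two-sample t-test, and the target power of 0.8 is an assumption, since the text specifies only the significance level and the effect size. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Effect size 0.8 at a significance level of 0.05 (power of 0.8 assumed).
print(n_per_group(0.8))
```

An exact calculation based on the t-distribution gives a slightly larger value than the normal approximation, so small discrepancies between tools are expected.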
Unless stated otherwise, the experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.
sgRNA design to target POU5F1
To avoid compromising targeting efficiency, we checked whether the sgRNAs targeted polymorphic regions of the human genome. Most sgRNAs had a single nucleotide polymorphism (SNP) frequency of less than 0.1% in the human population, with the exception of the sgRNA targeting exon 4, which had an SNP frequency of 32% within the sgRNA target sequence as determined by the 1000 Genomes project35. We retained this sgRNA as it had the highest in silico score and overlapped with a site that has been previously shown in complementarity studies to be functionally required for pluripotency, suggesting that even an in-frame deletion would render a loss of function in the gene13. We also favoured the use of sgRNAs with sequence conservation of the PAM and sgRNA seed sequence (the approximately 12-bp region proximal to the PAM sequence) that would allow us to determine efficiency in mouse embryos. In the case of high-scoring sgRNAs targeting exon 2d, there is no mouse equivalent sgRNA sequence that we could evaluate, and for exon 3, we could not design sgRNAs where the predicted cut site would be within the exon; these options were therefore excluded.
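The idea of checking PAM and seed conservation between species can be sketched as follows. This is an illustration only and not the in silico tool used in the study: the sequences are invented toy examples rather than POU5F1 or Pou5f1 sequence, the function names are hypothetical, a 20-nt protospacer is assumed, and only the given strand is scanned for the S. pyogenes Cas9 NGG PAM.

```python
import re

GUIDE_LENGTH = 20  # assumed protospacer length
SEED_LENGTH = 12   # PAM-proximal bases treated as the seed


def find_target_sites(sequence: str) -> list[tuple[str, str, str]]:
    """Return (protospacer, PAM, seed) for each site followed by NGG on this strand."""
    pattern = "(?=([ACGT]{%d})([ACGT]GG))" % GUIDE_LENGTH
    sites = []
    # The lookahead lets overlapping candidate sites all be reported.
    for match in re.finditer(pattern, sequence.upper()):
        protospacer, pam = match.group(1), match.group(2)
        sites.append((protospacer, pam, protospacer[-SEED_LENGTH:]))
    return sites


def shared_seeds(sequence_a: str, sequence_b: str) -> set[str]:
    """Seeds that sit next to an NGG PAM in both sequences."""
    seeds_a = {seed for _, _, seed in find_target_sites(sequence_a)}
    seeds_b = {seed for _, _, seed in find_target_sites(sequence_b)}
    return seeds_a & seeds_b


# Toy sequences that differ in the PAM-distal region but share seed and PAM.
toy_a = "CCACGTACGTGACCTGAAGCTATGGAT"
toy_b = "GATTGAACGTGACCTGAAGCTAAGGTC"
print(shared_seeds(toy_a, toy_b))
```

A guide whose seed and PAM appear in both species could in principle be tested in mouse embryos before use in human embryos; a full design tool would also scan the reverse strand and score off-target risk.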
sgRNA production and ribonucleoprotein preparation
sgRNAs were prepared as previously described36. The sgRNA was cloned into the bicistronic expression vector px330 (Addgene; 42230)37 using the BbsI restriction site. The sgRNA sequence from the correctly targeted px330 vector was amplified using the Q5 hot start high fidelity DNA polymerase (NEB; M0493) and the PCR product was in vitro transcribed using the MEGAshortscript T7 kit (ThermoFisher Scientific; AM1354) and purified using the Zymo RNA Clean & Concentrator columns (Zymo Research; R1017). The sgRNA and Cas9 mRNA (TriLink Biotechnologies; L61256) and recombinant Cas9 protein (Toolgen; TGEN CP1) were individually re-suspended in RNase-free water, aliquoted and stored at −80 °C until use. Prior to microinjection, the ribonucleoprotein complex was prepared by centrifuging the Cas9 protein for 1 min at 14,000 r.p.m. at 4 °C and transferring the supernatant to a fresh tube containing the sgRNA. This was incubated at 37 °C for 15 min, pulse spun and transferred to a fresh tube for microinjection.
Mouse zygote collection
Four- to eight-week-old (C57BL6 × CBA) F1 female mice were super-ovulated using injection of 5 IU of pregnant mare serum gonadotrophin (PMSG; Sigma-Aldrich). Forty-eight hours after PMSG injection, 5 IU of human chorionic gonadotrophin (HCG; Sigma-Aldrich) was administered. Superovulated females were set up for mating with eight-week-old or older (C57BL6 × CBA) F1 males. Mice were maintained on a 12 h light–dark cycle. Mouse zygotes were isolated in Global total with HEPES (LifeGlobal; LGTH-100) under mineral oil (Origio; ART-4008-5P) and cumulus cells were removed with hyaluronidase (Sigma-Aldrich; H4272). All animal research was performed in compliance with the UK Home Office Licence Number 70/8560.
Human embryo thaw
Human zygotes were thawed using Quinn’s Advantage thaw kit (Origio; ART-8016). Briefly, upon thawing the embryos were transferred to 3 ml of 0.5% sucrose thawing medium and incubated for 5 min at 37 °C, followed by 3 ml of 0.2% sucrose thawing medium for 10 min at 37 °C. The embryos were then washed through seven drops of diluent solution before culture. Human blastocysts were thawed using the Blast thaw kit (Origio; 10542010) following the manufacturer’s instructions.
Human and mouse microinjection and culture
Human and mouse embryo microinjections were performed in Global Total medium with HEPES under mineral oil on a heated stage with a holding pipette (Research Instruments) and a Femtojet 4i microinjection manipulator (Eppendorf) set at approximately 40 injection pressure and 20 constant pressure. Embryos were microinjected with a mixture of Cas9 mRNA and sgRNA or the ribonucleoprotein complex back-filled into microfilament glass capillary injection needles (World Precision Instruments; TW100F-6) pulled using a pipette puller (Sutter; P-97 micropipette puller). The microinjection procedure took ~15 min to complete.
Human or mouse embryos were cultured in drops of pre-equilibrated Global medium (LifeGlobal; LGGG-20) supplemented with 5 mg ml−1 protein supplement (LifeGlobal; LGPS-605) and overlaid with mineral oil (Origio; ART-4008-5P). Pre-implantation embryos were incubated at 37 °C and 5.5% CO2 in an EmbryoScope+ time-lapse incubator (Vitrolife) for either 3–4 d (mouse) or 5–6 d (human).
Genomic DNA extraction and genotyping
Human ES cells were lysed using proteinase K digestion (10 μg ml−1 in lysis buffer (100 mM Tris buffer pH 8.5, 5 mM EDTA, 0.2% SDS, 200 mM NaCl)) overnight at 37 °C. gDNA was extracted from the lysed cells using phenol:chloroform extraction followed by ethanol precipitation. For the time-course genotypic analysis, bulk cells were collected every 24 h and PCR products were amplified from the extracted genomic DNA. These products were used to generate multiplexed libraries for targeted amplicon sequencing by MiSeq according to the manufacturer’s instructions (Illumina).
Genomic DNA from fixed embryos (human and mouse) was isolated using the alkaline lysis method; 25 μl of 50 mM NaOH was added to the sample and incubated at 95 °C for 5 min. Samples were neutralized by adding 2.5 μl of 1 M Tris-HCl pH 8.0.
The Illustra Single Cell GenomiPhi DNA Amplification Kit (GE Healthcare Life Sciences; 29108039) was used according to manufacturer’s instructions to amplify gDNA from unfixed mouse blastocysts. DNA was purified by adding 30 μl of 20 mM EDTA, 5 μl of 3 M sodium acetate and 137 μl ice cold ethanol. Tubes were mixed by inverting and centrifuged at 16,000g for 20 min. Supernatant was removed and DNA was washed in 100 μl ice cold 70% ethanol by mixing and centrifuging for 5 min. DNA was resuspended by adding 20 μl H2O and incubating for 20 min at 4 °C before mixing by gentle pipetting. These products were used to generate multiplexed libraries for targeted amplicon sequencing by MiSeq according to the manufacturer’s instructions (Illumina).
To genotype cells from unfixed Cas9 control or OCT4-targeted human embryos, genomic DNA was isolated from either an individual single cell (1-cell embryos) or following microdissection of multiple individual single-cell samples from each embryo or approximately five cells from trophectoderm biopsies. The samples were genotyped following whole genome amplification (WGA) using one of the following protocols:
(1) For the single cell samples used in either the modified G&T-seq protocol29 or isolated solely for genotyping, genomic DNA was amplified using the REPLI-g Single Cell Kit (Qiagen; 150343) according to the manufacturer’s guidelines. The DNA samples were quantified using high-sensitivity Qubit assay. In preparation for Sanger sequencing and MiSeq analysis, the WGA DNA product was diluted 1:100 in nuclease-free water, and 2 μl of this product was used as the template in a PCR reaction containing 25 μl Phusion High Fidelity PCR Master Mix (New England Biolabs), 2.5 μl 5 μM forward primer, 2.5 μl 5 μM reverse primer and 18 μl nuclease-free water. Thermocycling settings used were as follows: 98 °C 30 s, 35 cycles of 98 °C 10 s, 58 °C 30 s, 72 °C 30 s, and a final extension of 72 °C for 5 min. Gel electrophoresis confirmed that the size of the PCR product corresponded to the expected amplicon size. PCR amplicons were analysed by Sanger sequencing and indels were quantified by TIDE webtool38. Results of the TIDE analysis were also verified by manual visual inspection of the Sanger chromatograms. For MiSeq library preparation, quantification, pooling and denaturation were performed according to the manufacturer’s instructions (Illumina). PCR amplicons were cleaned using an equal volume of AMPure XP beads according to the manufacturer’s instructions (Beckman Coulter). Index PCR was performed using 10 μl of cleaned amplicon, 12.5 μl Q5 high fidelity 2X Master Mix (NEB; M0492S), 1.25 μl Nextera XT Index 1 primer and 1.25 μl Nextera XT Index 2 primer (Nextera XT Index kit; FC‐131‐1001). The thermocycling parameters used were: 98 °C for 30 s, 35 cycles of 98 °C for 10 s, optimized annealing temperature for 30 s, 72 °C for 30 s, and a final extension of 72 °C for 2 min. Index PCR was cleaned using equal volume of AMPure XP beads as described previously. Beads were rehydrated with 20 μl nuclease-free water.
Five microlitres of the index PCR product was run on a gel to identify any samples with over-abundance of primer dimers, which were subsequently subjected to gel size selection and extraction using QIAquick gel extraction kit (Qiagen; 28704). Index PCR products were quantified using QuantiFluor dsDNA system (Promega; E2670). The concentration was used to determine the dilution required to obtain a 5 μM solution of each sample. Five microlitres of each sample was pooled and the library was spiked with 20% PhiX genomic control (Illumina; FC‐110‐3001). Sequencing generated paired-end (2 × 250-bp) dual indexed reads. After sequencing, reads were demultiplexed and stored as FASTQ files for downstream processing and analysis. The CRISPR Genome Analyser39 or CRISPR Cas Analyser40 tools were used to align the reads and to determine the percentage of non-wild-type reads resulting from editing, as well as assessing the position and size of each indel for all of the PCR amplicons evaluated. Seven single cell samples processed solely for genotyping failed to amplify using any of the sgRNA2b on-target site primers. Eight samples from single cells processed using the modified G&T-seq protocol failed to amplify using any of the sgRNA2b on-target site primers. We further tested these DNA samples using primers up- and down-stream of the sgRNA2b on-target site. We also performed PCR analysis using primers targeting GAPDH as a positive control. These samples failed to generate amplicons using any of these primer pairs and were subsequently excluded from the analysis on the basis that they were likely to be of poor quality.
(2) For the samples used in the cytogenetic analysis described below, the cells were subjected to WGA (SurePlex, Rubicon). Out of the 22 OCT4-targeted human embryo samples, three failed WGA using this protocol and were excluded from further analysis and three showed suboptimal amplification. The samples showing some evidence of amplification were processed along with three control samples for genotype analysis. The resulting PCR amplicons were quantified using high sensitivity Qubit assay to establish whether concentrations were in an acceptable range (approximately 3–5 ng μl−1). Gel electrophoresis confirmed that the size of the PCR product corresponded to the expected amplicon size. Of the 19 WGA products examined from the OCT4-targeted embryos, six failed the targeted PCR amplification and were excluded from further analysis. The rest were processed for genotype analysis using MiSeq targeted deep sequencing. The sequences were analysed using the CRISPR Cas Analyser tool by uploading the FASTQ files and defining the target DNA sequence and unedited sequence as a reference. The genotypes were further confirmed using the IGV software (Broad Institute).
The samples from the protocols above were used for genotyping of on- and putative off-target sites. The samples were amplified using the primers listed in Extended Data Table 1a. Primers were designed to generate amplicons of approximately 250 bp centred around the predicted cut site so as to maximize the detection of a variety of mutations and ensure that each amplicon was sequenced continuously from the forward and reverse barcode. We excluded PCR primers targeting highly polymorphic regions of the genome35.
PCR amplification of the sgRNA2b on-target site was initially performed on all samples using a primer pair generating an amplicon size of 244 bp, which is also suitable for MiSeq analysis. Any samples that failed amplification three times using this primer pair were subjected to amplification using alternative primer pairs listed in Extended Data Table 1a. Where only the original reference genome sequence was identified, the genotype was classified as wild-type. When only edited sequences were detected, the genotype was defined as knockout. Whenever an original reference sequence and an edited sequence were identified in the same cell the corresponding cell was characterised as heterozygous. Where possible we assessed multiple single cells from the same embryo. Putative off-target sites were evaluated using the primer pairs listed in Extended Data Table 1a.
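The per-cell classification rule described above can be sketched as follows. This is a minimal illustration, assuming that the distinct allele sequences detected in a cell have already been extracted from the aligned reads and that sequencing-error filtering has been done upstream. The function name, the "no call" label and the toy sequences are hypothetical.

```python
from typing import Iterable


def classify_genotype(alleles: Iterable[str], reference: str) -> str:
    """Classify one cell from the distinct allele sequences detected in it."""
    observed = set(alleles)
    if not observed:
        return "no call"  # hypothetical label for samples without amplicons
    has_reference = reference in observed
    has_edited = any(allele != reference for allele in observed)
    if has_reference and has_edited:
        return "heterozygous"  # reference and edited sequence in the same cell
    if has_edited:
        return "knockout"      # only edited sequences detected
    return "wild-type"         # only the reference sequence detected


# Toy sequences for illustration only.
reference = "ACGTACGT"
print(classify_genotype(["ACGTACGT"], reference))
print(classify_genotype(["ACGACGT", "ACGTTACGT"], reference))
print(classify_genotype(["ACGTACGT", "ACGACGT"], reference))
```

Note that a cell carrying two different edited alleles is still classified as knockout under this rule, because no reference sequence is present.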
Evaluating potential off-target sites
Putative off-targets were determined using the MIT CRISPR Design tool (http://crispr.mit.edu/), which indicated top scoring off-target sites. We evaluated sequences that had mismatches of three nucleotides or fewer compared to the sgRNA2b sequence. As described previously17, potential off-target sites were also identified using the following parameters: 12 base pairs of the sgRNA seed sequence plus an NGG PAM sequence (where N was varied to include all possible nucleotides) were searched against the reference human genome (hg19).
Digenome-seq was performed as described previously15,16. In brief, 20 μg genomic DNA was incubated with pre-incubated 100 nM recombinant Cas9 protein and 300 nM sgRNA in a reaction volume of 1 ml (100 mM NaCl, 50 mM Tris-HCl, 10 mM MgCl2, 100 μg ml−1 BSA, pH 7.9) at 37 °C for 8 h. Digested DNA was mixed with 50 μg ml−1 RNase A (Qiagen) at 37 °C for 30 min, and purified again with a DNeasy Tissue Kit (Qiagen). One microgram of digested DNA was fragmented using the Covaris system and ligated with adaptors using TruSeq DNA libraries. DNA libraries were subjected to whole genome sequencing performed at Macrogen using an Illumina HiSeq X Ten at a sequencing depth of 30–40×. In vitro DNA cleavage scores were calculated using a previously described scoring system16.
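The seed-plus-PAM search described above can be sketched with a simple regular-expression scan. This is an illustrative sketch only, not the tool used in the study: it assumes the genome is available as a plain string, searches both strands for an exact 12-nt seed followed by NGG, and uses hypothetical function names and a made-up seed sequence.

```python
import re
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_seed_pam_sites(genome: str, seed: str) -> List[Tuple[int, str]]:
    """Return (0-based start, strand) for every exact seed+NGG match."""
    genome, seed = genome.upper(), seed.upper()
    hits = []
    # Forward strand: seed immediately followed by N-G-G.
    # The lookahead lets overlapping matches be reported.
    forward = re.compile("(?=" + re.escape(seed) + "[ACGT]GG)")
    hits += [(m.start(), "+") for m in forward.finditer(genome)]
    # Reverse strand: C-C-N followed by the reverse complement of the seed.
    reverse = re.compile("(?=CC[ACGT]" + re.escape(reverse_complement(seed)) + ")")
    hits += [(m.start(), "-") for m in reverse.finditer(genome)]
    return sorted(hits)


# Made-up 12-nt seed and toy "genome" for illustration only.
seed = "GATTACAGATTA"
genome = "TTT" + seed + "AGG" + "TTT" + "CCT" + reverse_complement(seed) + "AAA"
print(find_seed_pam_sites(genome, seed))
```

For a whole reference genome such as hg19, the same scan would be run per chromosome, and a dedicated aligner would normally be preferred for speed.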
Embryos and cells were fixed with 4% paraformaldehyde in PBS for 1 h and overnight, respectively, at 4 °C and immunofluorescently analysed as described previously2. The primary antibodies used are listed in Extended Data Table 1b. Embryos were placed on coverslip dishes (MatTek) for confocal imaging.
To determine the chromosome copy number, single or multiple blastomeres were biopsied from embryos at the cleavage stage and clumps of approximately five cells were microdissected from the trophectoderm of blastocysts. The cells were washed through three drops of a wash buffer (PBS/0.1% polyvinyl alcohol), which had previously been tested to confirm absence of contaminating DNA (Reprogenetics UK). The cells were transferred to 0.2-ml PCR tubes in a volume of 1.5 μl, lysed and subjected to whole-genome amplification (SurePlex, Rubicon) followed by low-pass next generation sequencing (coverage depth <0.1×) (VeriSeq PGS kit, Illumina). Libraries were prepared according to the manufacturer’s instructions and sequenced using the MiSeq sequencing platform. Typically, ~1 million reads were generated per sample, of which 60–70% successfully mapped to unique genomic sites. Mapped reads were interpreted using BlueFuse Multi software (Illumina) in order to generate chromosome copy number profiles. This strategy has been extensively validated and is widely used for the detection of whole chromosome losses and gains, as well as segmental aneuploidy, in human embryos undergoing preimplantation genetic diagnosis26. Analysis of single blastomeres allowed each chromosomal region of at least 5 Mb to be assigned a copy number of 0, 1, 2, 3 or 4 (corresponding to nullisomy, monosomy, disomy, trisomy or tetrasomy). In trophectoderm samples, composed of several cells, it was also possible to detect the presence of chromosomal mosaicism, indicated when copy number values for a given chromosome had an intermediate value, between the thresholds for assigning 1 and 2 or 2 and 3 chromosome copies41.
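The copy number logic described above can be illustrated with a small sketch. The thresholds used by the BlueFuse Multi software are not given in the text, so the tolerance of 0.3 around each integer copy number is purely an assumption for illustration, as are the function name and the labels "possible mosaic" and "no call".

```python
# Hypothetical sketch: the 0.3 tolerance is an assumed value, not the
# threshold used by the software cited in the text.
LABELS = {0: "nullisomy", 1: "monosomy", 2: "disomy", 3: "trisomy", 4: "tetrasomy"}


def call_copy_number(value: float, tolerance: float = 0.3) -> str:
    """Assign a copy number label to a normalized copy number estimate."""
    nearest = min(4, max(0, round(value)))
    if abs(value - nearest) <= tolerance:
        return LABELS[nearest]
    # Intermediate values between 1 and 2 or between 2 and 3 copies are
    # treated as evidence of mosaicism in multi-cell samples.
    if 1 < value < 3:
        return "possible mosaic"
    return "no call"


for estimate in (2.05, 1.5, 2.6, 3.0):
    print(estimate, call_copy_number(estimate))
```

In this sketch the mosaic label is only meaningful for trophectoderm samples made up of several cells; a single blastomere would be expected to give integer copy numbers.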
Confocal immunofluorescence pictures were taken with a Leica SP5 confocal microscope and 3–5-μm-thick optical sections were collected. Quantification was performed manually using Fiji (ImageJ) or automated using MINS 1.3 software42.
Epifluorescence images were obtained on an Olympus IX73 using Cell^F software (Olympus Corporation) or on an EVOS FL cell imaging system (AMF4300). Phase contrast images and videos were collected on an Olympus IX73 using Cell^F software and RI Viewer software (Research Instruments), respectively.
Time-lapse imaging was performed using an EmbryoScope+ time-lapse incubator (Vitrolife) and annotated using the EmbryoViewer software.
Generation of optimized inducible knockout (OPTiKO) human ES cell lines
The sgRNA sequences were cloned into the pAAV-Puro_siKO-TO vector as previously described11. In brief, complementary single-stranded oligonucleotides (Extended Data Table 1a) were annealed and scarlessly ligated to AarI-digested plasmids between the H1-TO tetracycline-inducible promoters and the scaffold sgRNA sequence. The Cas9 and inducible sgRNA targeting vectors were each inserted into one of the two alleles of the AAVS1 locus by homologous-directed recombination facilitated by two obligate heterodimer ZFNs11. Cells were cultured in the presence of 10 μM ROCK inhibitor Y-27632 (Sigma-Aldrich; Y0503) in medium without antibiotics 24 h before nucleofection. Cells were washed with PBS (Life Technologies; 14190-094) and dissociated with Accutase (Life Technologies; A11105-01) for 5 min at 37 °C. Colonies were mechanically triturated into clumps of 2–3 cells and counted. 2 × 10^6 cells were nucleofected in 100 μl with a total of 12 μg DNA (4 μg each for the two ZFN plasmids, and 2 μg each for the two targeting vectors) using the Lonza P3 Primary Cell 4D-Nucleofector X Kit and the cycle CA-137 on a Lonza 4D-Nucleofector System. Cells were incubated for 5 min at room temperature, after which antibiotic-free KSR containing 10 μM ROCK inhibitor was added. After another 5 min the cell suspension was distributed on pre-plated DR4 (Applied Stem Cell; ASF-1013) drug resistant MEF feeders in antibiotic-free KSR medium. Four days after nucleofection, cells underwent double antibiotic selection with 0.5 μg ml−1 Puromycin (Sigma-Aldrich) and 25 μg ml−1 Geneticin (G418 Sulphate (Gibco)) for 7 days. Targeted colonies appeared after 4–8 d and were mechanically picked and clonally expanded at 10–14 d after transfection.
Extensive genotyping was carried out on the targeted clones to check for correct AAVS1 gene targeting and to exclude the presence of randomly integrated plasmids, as previously described11. Briefly, genomic DNA was extracted using the Wizard Genomic DNA Purification Kit (Promega; A1120). Site-specific integration was checked for both 5′ and 3′ ends of each of the two targeting vectors (Cas9 and inducible sgRNA). Clones were also screened for the absence of the wild-type locus (indicating homozygous targeting) and for the absence of amplicons for both the 5′ and 3′ ends of the targeting vector backbones (to ensure there was no random integration of the plasmid).
Culture conditions for human ES cells and engineering inducible cell lines
Clonal H9 human ES cells (WiCell) (n = 2 or 3 per sgRNA) were cultured in feeder- and serum-free conditions either in mTeSR1 (Stem Cell Technologies) on growth factor-reduced Matrigel-coated dishes (BD Biosciences) or as previously described43 as indicated in the figure legends. Tetracycline hydrochloride (Sigma-Aldrich; T7660) was used at 1 μg ml−1 to induce guide expression. Human ES cells underwent routine mycoplasma screening and karyotyping.
Cells were collected every day for 5 d alongside matched control cells. Cells were dissociated into single-cell suspension using TrypLE Select 1X (Gibco; 12563011) for 5 min at 37 °C. The cell suspension was pelleted, washed with PBS (Life Technologies; 14190-094) then fixed and permeabilized using BD Cytofix/Cytoperm (554714) for 20 min at 4 °C. A 1× permeabilization/wash buffer (BD; 554723) containing fetal bovine serum (FBS) and saponin was used for all subsequent wash steps and during antibody incubation unless indicated otherwise. After fixation, cells were washed once then stored at 4 °C until the day 5 sample had been collected, at which point all samples underwent intracellular staining. Cells were blocked for 30 min at room temperature with 1× permeabilization/wash buffer containing 10% donkey serum (Bio-rad; C06SB) and 0.1% Triton X-100 (ThermoFisher Scientific; 85111). Cells were stained with primary antibodies by incubating at room temperature for 1 h and cells were washed three times after each incubation. Secondary-only and unstained negative controls were included for each batch of cells on a given day. Flow cytometry was performed using a Cyan ADP flow cytometer and the Summit software (Beckman Coulter), and 10,000–50,000 events were recorded. FlowJo was used to analyse flow cytometry results. Cells were first gated on the basis of forward and side scatter properties, after which singlets were isolated on the basis of the relationship between side scatter peak area and width. A secondary-only negative control was used to determine the background and OCT4-positive cells were quantified relative to cells that were OCT4-negative in the total bulk population of cells analysed.
RNA isolation from human ES cells for RNA-seq and qRT–PCR
qRT–PCR data presented in Extended Data Fig. 1c were generated as follows: RNA was isolated using TRI reagent (Sigma) and DNase I-treated (Ambion). cDNA was synthesized using a Maxima first strand cDNA synthesis kit (Fermentas). qRT–PCR was performed using SensiMix SYBR low-ROX kit (Bioline) on a QuantStudio 5 machine (ThermoFisher Scientific). Primer pairs used are listed in Extended Data Table 1a. Each sample was run in triplicate and samples were normalized using GAPDH as the housekeeping gene, and the results were analysed using the ΔΔCt method.
In preparation for RNA-seq of the human ES cells induced to express sgRNA2b, samples were further cleaned using ethanol precipitation. Libraries were prepared using KAPA mRNA HyperPrep kit for Illumina platforms (Roche Sequencing Solutions Inc.).
The qRT–PCR data presented in Extended Data Fig. 2b were generated as follows: RNA was extracted using the GenElute Mammalian Total RNA Miniprep Kit (Sigma-Aldrich; RTN350-1KT) and the On-Column DNase I Digestion kit (Sigma-Aldrich; DNASE70-1SET). Five-hundred nanograms of RNA was reverse-transcribed with SuperScript II (Invitrogen; 18064071). qRT-PCR was performed using 5 ng cDNA and SensiMix SYBR low-ROX (Bioline; QT625-20). qRT–PCR was performed on a Stratagene Mx-3005P (Agilent Technologies) and the results were analysed using the ΔΔCt method. Each sample was run in duplicate and samples were normalized using PBGD as the housekeeping gene.
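The ΔΔCt calculation referred to above can be written out as a short sketch. It assumes the standard 2^(−ΔΔCt) formulation with roughly 100% amplification efficiency, and it averages technical replicates before taking differences. The function name and the Ct values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    target_ct_sample: Sequence[float],
    reference_ct_sample: Sequence[float],
    target_ct_control: Sequence[float],
    reference_ct_control: Sequence[float],
) -> float:
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument holds the Ct values of the technical replicates; the
    reference gene is the housekeeping gene (for example GAPDH or PBGD).
    """
    delta_sample = mean(target_ct_sample) - mean(reference_ct_sample)
    delta_control = mean(target_ct_control) - mean(reference_ct_control)
    delta_delta = delta_sample - delta_control
    return 2 ** -delta_delta


# Made-up Ct values: the target amplifies two cycles later in the sample
# than in the control, relative to the housekeeping gene.
fold = relative_expression(
    target_ct_sample=[26.0, 26.0, 26.0],
    reference_ct_sample=[18.0, 18.0, 18.0],
    target_ct_control=[24.0, 24.0, 24.0],
    reference_ct_control=[18.0, 18.0, 18.0],
)
print(fold)  # a ddCt of 2 corresponds to a fourfold reduction
```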
Samples were processed using a previously published protocol that was adapted where indicated29. Single cells from microdissected human embryos were picked using 100 μm inner diameter Stripper pipette (Origio) and transferred to individual low bind RNase-free tubes containing 2.5 μl RLT Plus buffer (Qiagen; 79216).
To separate RNA and genomic DNA (gDNA), 50 μl of Dynabeads were washed and incubated with 100 μM biotinylated poly-dT oligonucleotide (IDT). Ten microlitres of oligo-dT beads were added to each tube containing the single cell. Samples were incubated in a thermomixer for 20 min at room temperature at 2,000 r.p.m. Tubes were put on a magnet until the beads collected into a pellet and the supernatant went clear. The supernatant containing the genomic DNA was transferred to a new collection tube. Beads were washed three times to collect any residual genomic DNA, which was amplified as described above.
cDNA was generated from the RNA captured on the bead using the SMARTer v4 Ultra Low Input kit (Clontech; 634891) as previously described3. Reverse transcription was performed on the thermomixer using the settings 2 min at 42 °C at 2,000 r.p.m., 60 min at 42 °C at 1,500 r.p.m., 30 min at 50 °C at 1,500 r.p.m. and 10 min at 60 °C at 1,500 r.p.m. cDNA was amplified by adding 12.5 μl 2X SeqAmp PCR buffer, 0.5 μl PCR Primer II A (12 μM), 0.5 μl SeqAmp DNA polymerase, 1.5 μl nuclease-free water. Beads were mixed on a thermomixer for 60 s at room temperature at 2,000 r.p.m. and then were incubated on a PCR machine using the following settings: 95 °C for 1 min, 24 cycles of 98 °C for 10 s, 65 °C for 30 s and 68 °C for 3 min, before a final extension for 10 min at 72 °C. Amplified cDNA was purified by adding 25 μl Ampure XP beads according to the manufacturer’s instructions. Twelve microlitres of purification buffer was added to rehydrate the pellet and incubated for 2 min at room temperature. cDNA was eluted by pipetting up and down 10 times before returning the tube to the magnet. The clear supernatant containing the cDNA was removed from the immobilised beads and transferred to a new low-bind tube. cDNA was stored at −80 °C until library preparation. cDNA quality was assessed by High Sensitivity DNA assay on an Agilent 2100 Bioanalyser with good quality cDNA showing a broad peak from 300 to 9,000 bp. cDNA concentration was measured using QuBit dsDNA HS kit (Life Technologies).
In preparation for library generation, cDNA was sheared using an E220 focused-ultrasonicator (Covaris) to achieve cDNA in the 200–500 bp range. Ten microlitres of cDNA sample and 32 μl purification buffer was added to a Covaris AFA Fibre Pre-Slit Snap Cap microTUBE. cDNA was sheared using the following settings: Peak Incident power 175 W, Duty Factor 10%, 200 cycles per burst, water level 5.
Libraries were prepared using Low Input Library Prep Kit v2 (Clontech; 634899) according to manufacturer’s instructions. Dual indexing was performed by substituting the manufacturer’s provided indexing adaptors with NEBNext Multiplex Oligos for Illumina Dual Index primers set 1 (NEB; E7600S). Library quality was assessed by Bioanalyser and the concentration was measured by high sensitivity QuBit assay.
Twenty-five microlitres of AMPure beads was added to each collection tube containing the genomic DNA. Tubes were mixed well and incubated at room temperature for 20 min so that the DNA could be bound to the beads. Tubes were put on the magnet until the supernatant ran clear so that it could be removed and discarded. The beads were washed twice with 100 μl 80% ethanol. Any remaining ethanol was removed and beads allowed to dry, and resuspended in nuclease-free water.
Single-cell RNA-seq data analysis
RNA-seq data for single cells were obtained as paired-end reads and analysis was performed blinded to the identity of the samples. The RNA-seq data flow was managed by a GNU make pipeline. Transcript reads were aligned to the Ensembl GRCh37 genome using TopHat2 (version 2.1.1 with the --no-coverage-search option)44; alignment rates were typically between 60 and 80%. Transcript counts were computed using the featureCounts program (version 1.5.1)45. A quality filter was applied to the matrix, ensuring >50,000 total transcript reads per cell and >5 reads in at least 5 samples. The raw transcript counts were corrected for read-count depth effects using the SCnorm package46 with a single-group design matrix. The RUVSeq package47 (version 1.10.0) was used for between-sample normalization by applying the ‘betweenLaneNormalization’ function with ‘full’ quantile regression. For PCA analysis, transcript counts were transformed using an asinh(x/2) transformation with per-gene centring to obtain near-Gaussian and zero-centred count distributions. The prcomp function of the stats package in R (version 3.4.1) was applied to the count matrix and single cells were projected into the plane of the first two eigenvectors.
Independently, sequenced reads from all single cell samples were also aligned to the human reference genome sequence GRCh38 using TopHat2 (version 2.1.1)44 and parameters were optimized for 100-bp paired-end reads. Read counts per gene were calculated using the Python package HTSeq (version 0.6.1)48 and differential gene expression analysis was carried out using DESeq2 (version 1.10.1)49. Read counts were normalized using the RPKM method50 and hierarchical clustering of samples was performed to generate a heat map using the R package pheatmap (version 1.0.8). A previously published reference control dataset3 was integrated in the heat map and hierarchical clustering. Principal components analysis was performed using the stats (version 3.2.2) R package on a previously published single cell RNA-seq dataset covering different stages of preimplantation development30 together with our own OCT4-targeted samples and controls.
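The quality filter and the asinh(x/2) transformation described above can be sketched as follows. This is a plain-Python illustration rather than the R pipeline used in the study: it omits the SCnorm and RUVSeq normalization steps, assumes the cell filter is applied before the gene filter, and uses a hypothetical function name and toy counts.

```python
import math
from typing import Dict, List, Tuple


def filter_and_transform(
    counts: Dict[str, List[float]],
    min_cell_total: float = 50_000,
    min_reads: float = 5,
    min_cells: int = 5,
) -> Tuple[List[int], Dict[str, List[float]]]:
    """Filter a gene-by-cell count table, then apply asinh(x/2) and centre.

    `counts` maps each gene to its per-cell counts (same cell order for all).
    Returns the indices of the retained cells and the transformed table.
    """
    n_cells = len(next(iter(counts.values())))
    totals = [sum(row[j] for row in counts.values()) for j in range(n_cells)]
    # Keep cells with more than `min_cell_total` reads in total.
    keep_cells = [j for j in range(n_cells) if totals[j] > min_cell_total]
    transformed = {}
    for gene, row in counts.items():
        kept = [row[j] for j in keep_cells]
        # Keep genes with more than `min_reads` reads in at least `min_cells` cells.
        if sum(1 for x in kept if x > min_reads) < min_cells:
            continue
        values = [math.asinh(x / 2) for x in kept]
        centre = sum(values) / len(values)
        transformed[gene] = [v - centre for v in values]  # per-gene centring
    return keep_cells, transformed


# Toy table: six cells, the last one with too few reads to pass the filter.
toy = {
    "geneA": [60000, 60000, 60000, 60000, 60000, 10],
    "geneB": [3, 3, 3, 3, 3, 3],
    "geneC": [10, 20, 30, 40, 50, 0],
}
cells, table = filter_and_transform(toy)
print(cells, sorted(table))
```

The centred table could then be passed to any PCA routine; for small x, asinh(x/2) is approximately x/2, and for large x it approaches ln(x), which is what makes the transformed counts closer to Gaussian.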
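For reference, the RPKM normalization mentioned above (reads per kilobase of transcript per million mapped reads) reduces to a one-line formula. The sketch below is illustrative; the function name and the example numbers are hypothetical, and the gene length is assumed to be the summed exon length in base pairs.

```python
def rpkm(read_count: float, gene_length_bp: float, total_mapped_reads: float) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million).
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up example: 500 reads on a 2-kb gene in a library of 10 million reads.
example = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
print(example)
```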
The scripts used to generate the figures have been deposited in GitHub and can be accessed using the following link: https://github.com/Genalico/RNAseq-BlaCy_pub. The read-depths for each sample are provided in Supplementary Table 2 and via the above GitHub link.
Source Data are provided for figures. MiSeq and RNA-seq data have been deposited into Gene Expression Omnibus (GEO) under accession numbers GSE100119 and GSE100120, respectively. Scripts used for bioinformatics analysis can be found on the following GitHub page: https://github.com/Genalico/RNAseq-BlaCy_pub. Any additional information is available upon request from the corresponding author.
We thank the generous donors whose contributions have enabled this research; M. Macnamee, P. Snell and L. Christie at Bourn Hall Clinic for their support and assistance with the donation of embryos; T. Hiroda, P. Singh and J. Schimenti for the DMC1 sgRNA sequence and product; R. Lovell-Badge, I. Henderson, J. Haber, J. Rossant and A. Handyside for discussions and advice; the Wellcome Trust policy advisers, especially K. Littler and S. Rappaport, as well as J. Lawford-Davies and M. Chatfield for advice and support; and the Francis Crick Institute’s Biological Resources, Advanced Light Microscopy, High Throughput Sequencing, Research Illustration (Fig. 2a) and Bioinformatics facilities. D.W. was supported by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre Programme. N.K. was supported by the University of Oxford Clarendon Fund and Brasenose College Joint Scholarship. A.B. was supported by a British Heart Foundation PhD Studentship (FS/11/77/39327). K.E.S. was supported by the NIHR Cambridge BRC. L.V. was supported by core grant funding from the Wellcome Trust and Medical Research Council (PSAG028). J.-S.K. was supported by the Institute for Basic Science (IBS-R021-D1). Work in the K.K.N. and J.M.A.T. labs was supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK, the UK Medical Research Council, and the Wellcome Trust (FC001120 and FC001193). Work in the K.K.N. laboratory was also supported by the Rosa Beddington Fund.
Video of human pronuclear stage zygote microinjected with sgRNA2b/Cas9 ribonucleoprotein complex.
Development of a human embryo following microinjection of the sgRNA2b/Cas9 ribonucleoprotein complex. AVI format
Development of a human embryo following microinjection of the sgRNA2b/Cas9 ribonucleoprotein complex.
Development of a human embryo following microinjection of Cas9 protein.