The RNA-guided nuclease Cas9 can be reengineered as a programmable transcription factor. However, modest levels of gene activation have limited potential applications. We describe an improved transcriptional regulator obtained through the rational design of a tripartite activator, VP64-p65-Rta (VPR), fused to nuclease-null Cas9. We demonstrate its utility in activating endogenous coding and noncoding genes, targeting several genes simultaneously and stimulating neuronal differentiation of human induced pluripotent stem cells (iPSCs).
Cas9 is an RNA-guided endonuclease that is directed to a specific DNA sequence through complementarity between the associated guide RNA (gRNA) and its target locus1,2. Cas9 can be directed to nearly any arbitrary sequence with a gRNA, requiring only a short protospacer-adjacent motif (PAM) site proximal to the target3,4,5. Through mutational analysis, variants of Cas9 have been generated that lack endonucleolytic activity but retain the capacity to interact with DNA2,6,7. These nuclease-null (dCas9) variants have been subsequently functionalized with effector domains such as transcriptional activation domains (ADs), enabling Cas9 to serve as a tool for cellular programming at the transcriptional level6,8,9,10. The ability to program the robust induction of expression at a specific target within its native chromosomal context would provide a transformative tool for myriad applications, including the development of therapeutic interventions, genetic screening, activation of endogenous and synthetic genetic circuits, and the induction of cellular differentiation11,12,13.
In natural systems, transcriptional initiation occurs through the coordinated recruitment of the necessary machinery by a number of locally concentrated transcription-factor ADs. As a result, we hypothesized that the tandem fusion of multiple ADs would increase transcriptional activation by mimicking the natural cooperative recruitment process. Toward this goal a series of more than 20 candidate effectors with known transcriptional roles were fused to the C terminus of Streptococcus pyogenes (SP)-dCas9, and their potency was assessed by a fluorescent reporter assay in human embryonic kidney (HEK) 293T cells (Supplementary Figs. 1 and 2)14.
Of the hybrid proteins tested, dCas9-VP64, dCas9-p65 and dCas9-Rta showed the most meaningful reporter induction. Nonetheless, neither the p65 nor the Rta hybrids were stronger activators than the commonly used dCas9-VP64 protein. Taking dCas9-VP64 as a starting scaffold, we subsequently extended the C-terminal fusion with the addition of either p65 or Rta. As predicted, these bipartite fusions had increased transcriptional activity. Further improvement was observed when both p65 and Rta were fused in tandem to VP64, generating a hybrid VP64-p65-Rta tripartite activator (hereafter referred to as VPR) (Supplementary Fig. 3).
To begin characterizing VPR, we verified the importance of each of its constituent domains (VP64, p65 and Rta) by replacing the respective domain with mCherry and measuring the resulting protein's activity by reporter assay. All fusions containing mCherry had lower activity, demonstrating the essentiality of all three domains (Supplementary Fig. 4). We further validated the importance of domain order by shuffling the positions of the three domains, generating all possible nonrepeating dCas9 fusion proteins. Evaluation of the VPR permutations confirmed that the original ordering was indeed optimal (Supplementary Fig. 5).
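The set of constructs in the domain-order experiment can be enumerated with a short sketch. It assumes that "all possible nonrepeating" fusions means the six orderings in which each of the three domains appears exactly once; the construct naming scheme is illustrative only.

```python
from itertools import permutations

# Activation domains named in the text.
DOMAINS = ("VP64", "p65", "Rta")


def domain_orderings(domains=DOMAINS):
    """Return a name for every ordering that uses each domain exactly once."""
    return ["dCas9-" + "-".join(order) for order in permutations(domains)]


if __name__ == "__main__":
    for name in domain_orderings():
        print(name)
```

Under this assumption the screen covers six constructs, one of which is the original VP64-p65-Rta ordering.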
Given the potency of our SP-dCas9-VPR fusion, we investigated whether the VPR construct would show similar potency when fused to other DNA-binding scaffolds. Fusion of VPR to a nuclease-null Streptococcus thermophilus (ST1)-dCas9, a designer transcription activator–like effector (TALE) or a zinc-finger protein led to an increase in activation relative to VP64 (Supplementary Fig. 6)15.
Having performed initial characterization of our SP-dCas9-VPR fusion, we sought to assess its ability to activate endogenous coding and noncoding targets relative to VP64. To this end, we selected a set of genes related to cellular reprogramming, development and gene therapy, and then independently activated each target with three to four gRNAs delivered in concert. When compared to the dCas9-VP64 activator, dCas9-VPR showed significantly (22- to 320-fold) greater activation of endogenous targets (Fig. 1a). While VPR was able to induce each of our target genes to a much greater extent than VP64, we observed a marked difference in the relative levels of gene induction between targets. Furthermore, in accordance with previous studies16, we noted an inverse correlation between basal expression level and relative expression gain induced by dCas9 activators (genes with high basal expression were less potently activated) (Supplementary Fig. 7).
To place our observed levels of activation within a biologically relevant context, we compared dCas9-VPR-mediated activation for a subset of genes in HEK 293T cells with the respective gene's expression in native human tissue. Absolute comparisons in gene expression between in vitro cell lines and native tissues are difficult, but our preliminary analysis suggests that we were able to activate a number of our target genes to levels similar to those normally seen in their native tissues (Supplementary Fig. 8).
Cas9 enables multiplexed activation through the simple introduction of a collection of guide RNAs against a desired set of genes. To determine the efficiency of multigene targeting, we performed a pooled activation experiment simultaneously inducing four of our initially characterized genes: MIAT, NEUROD1, ASCL1 and RHOXF2. VPR allowed for robust multilocus activation, showing significantly (severalfold) higher expression levels than VP64 across the panel of genes (Fig. 1b).
After demonstrating dCas9-VPR's ability to robustly activate gene expression in human cells, we sought to further explore its versatility as a general tool for gene induction within alternate model systems. Expression of dCas9-VPR in Saccharomyces cerevisiae, Drosophila melanogaster S2R+ cells and Mus musculus Neuro-2A cells led to a range of improved activation from 5- to 300-fold over VP64-based activators (Supplementary Fig. 9).
The ability to selectively upregulate gene expression provides a powerful means to reprogram cellular identity for regenerative medicine and basic research purposes. Previous work has shown that the ectopic expression of several cDNAs promotes the differentiation of stem cells into multiple cell types. Although such artificial induction often requires multiple factors, it was recently shown that exogenous expression of single transcription factors, neurogenin 2 (NGN2, also known as NEUROG2) or neurogenic differentiation factor 1 (NEUROD1), is sufficient to promote differentiation of human iPSCs into induced neurons (iNeurons)17,18. While our previous attempts to generate iNeurons from iPSCs using dCas9-VP64-based activators were unsuccessful (data not shown), we were optimistic that the increased potency of VPR might induce sufficient expression of NGN2 and/or NEUROD1 protein to trigger differentiation.
Stable PGP1 iPSC, doxycycline-inducible, dCas9-VP64 and dCas9-VPR cell lines were generated and transduced with lentiviral vectors containing a mixed pool of 30 gRNAs directed against either NGN2 (NEUROG2) or NEUROD1. To determine differentiation efficiency, gRNA-containing dCas9-AD iPSC lines were cultured in the presence of doxycycline and monitored for phenotypic changes (Supplementary Figs. 10 and 11). We observed that VPR, in contrast to VP64, enabled rapid and robust differentiation of iPSCs into a neuronal phenotype. Additionally, these cells stained positively for the neuronal markers beta III tubulin and neurofilament 200 (Fig. 2a and Supplementary Fig. 12a, respectively). Subsequent quantification of the staining revealed that dCas9-VPR cell lines showed a 10- to 37-fold increase in the amount of iNeurons observed through upregulation of either NGN2 or NEUROD1 (Fig. 2b and Supplementary Fig. 12b). Analysis by qRT-PCR revealed 10-fold and 18-fold increases in NGN2 and NEUROD1 mRNA expression levels, respectively, within dCas9-VPR cells over their dCas9-VP64 counterparts (Fig. 2c).
Over the past year there have been a number of exciting advances in the field of Cas9-derived transcriptional activators. Two-component systems that rely on innovative gRNA modifications (e.g., synergistic activation mediator (SAM) and scaffold RNA (scRNA)) and epitope-based attachment systems (e.g., SUperNova (SunTag)) continue to push the limits of activator potency16,19,20. Notably, it was shown that the multimeric recruitment of even modestly effective activation domains (i.e., VP64 and p65) can lead to abundant increases in transcriptional output16,19,20. We believe that the rational selection and ordered fusion of individual activator domains provides an approach that is highly effective while eliminating the delivery and design complications generated by a two-component activator. In addition, as even modestly potent activation domains have shown marked improvement in activity when repeatedly recruited to a single dCas9 protein, we envision that our more potent VPR activator should lead to drastically improved activation if multiply recruited to a single dCas9 protein through technologies such as SAM, scRNA or SunTag.
Beyond the utility of VPR as a technological catalyst, we believe that our design process brings to light several important generalizations for future synthetic effectors, most notably the importance of screening large numbers of putative candidates and the critical role of domain order in the emergent synergy of multicomponent fusions.
Vectors used and designed.
Activation domains were cloned using a combination of Gibson and Gateway assembly or Golden Gate assembly methods. For experiments involving multiple activation domains, ADs were separated by short glycine-serine linkers. Activator sequences are listed in the Supplementary Data (vectors to be deposited in Addgene). All SP-dCas9 plasmids were based on Cas9m4-VP64 (Addgene #47319)6, and ST1-dCas9 plasmids were based on M-ST1n-VP64 (Addgene #48675)15. Sequences for gRNAs are listed in the Supplementary Data. gRNAs for endogenous human gene activation were selected to bind between 1 and 1,000 bp upstream of the transcriptional start site (TSS). gRNAs for iPSC differentiation to iNeurons, targeting NGN2 and NEUROD1, were selected to bind between 1 and 2,000 bp upstream of the transcriptional start site. All human gRNAs were either expressed from cloned plasmids (Addgene #41817)5 or integrated into the genome through lentiviral delivery (plasmid SB700). Reporter-targeting gRNAs were previously described (Addgene #48671 and #48672)6.
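The gRNA placement windows can be expressed as a simple strand-aware filter. This is a minimal sketch and not the selection procedure used in the study: the function names, the single-coordinate representation of a gRNA site and the example coordinates are all hypothetical.

```python
def upstream_distance(grna_pos, tss, strand):
    """Distance in bp of a gRNA site upstream of the TSS (negative if downstream).

    Coordinates are positions on the reference strand; 'strand' is the gene's strand.
    """
    if strand == "+":
        return tss - grna_pos
    if strand == "-":
        return grna_pos - tss
    raise ValueError("strand must be '+' or '-'")


def in_window(grna_pos, tss, strand, min_bp=1, max_bp=1000):
    """True if the site lies between min_bp and max_bp upstream of the TSS."""
    return min_bp <= upstream_distance(grna_pos, tss, strand) <= max_bp


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    print(in_window(9_500, 10_000, "+"))               # 500 bp upstream
    print(in_window(8_500, 10_000, "+", max_bp=2000))  # iNeuron window
```

The default window matches the 1 to 1,000 bp range stated for endogenous human targets; passing max_bp=2000 gives the wider window stated for the NGN2 and NEUROD1 guides.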
Mammalian cell culture and transfections.
HEK 293T cells (gift from P. Mali, University of California, San Diego) and Neuro-2A cells (ATCC CCL-131) were maintained in high glucose Dulbecco's Modified Eagle's Medium (Invitrogen) supplemented with 10% FBS (Invitrogen) and penicillin/streptomycin (Invitrogen). Cells were maintained at 37 °C and 5% CO2 in a humidified incubator and tested for mycoplasma yearly. Cells were transfected in 24-well plates seeded with 50,000 cells per well. 200 ng of dCas9 activator, 10 ng of gRNA and 60 ng of reporter plasmid (when required) were delivered to each well with Lipofectamine 2000 (HEK 293T) or Lipofectamine 3000 (Neuro-2A), according to manufacturer's instructions. For multiplex activation, a 40-ng mix of gRNAs was used, with a 10-ng total amount of guide per each of the four gene targets. For example, if four guide RNAs were used against an individual target, 2.5 ng of each guide RNA were combined, to obtain a 10-ng mix for that target; then the four 10-ng mixes were combined to prepare 40 ng total for transfection. Cells were grown 36–48 h after transfection before being assayed using fluorescence microscopy or flow cytometry or lysed for RNA purification and quantification.
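The arithmetic behind the multiplex gRNA mix can be sketched as follows. The 10 ng per target and the four targets come from the text; the number of guides assigned to each gene below is hypothetical, since the text states only that three to four gRNAs were used per target.

```python
def per_guide_ng(guides_per_target, ng_per_target=10.0):
    """Mass of each guide plasmid (ng) so that every target receives ng_per_target."""
    for target, n in guides_per_target.items():
        if n <= 0:
            raise ValueError(f"{target}: need at least one guide")
    return {target: ng_per_target / n for target, n in guides_per_target.items()}


def total_mix_ng(guides_per_target, ng_per_target=10.0):
    """Total gRNA mass (ng) delivered per well across all targets."""
    return ng_per_target * len(guides_per_target)


if __name__ == "__main__":
    # Guide counts per gene are illustrative, not taken from the paper.
    panel = {"MIAT": 4, "NEUROD1": 4, "ASCL1": 4, "RHOXF2": 3}
    print(per_guide_ng(panel))
    print(total_mix_ng(panel))
```

With four guides against a target, each guide contributes 2.5 ng, and four targets sum to the 40 ng total described above.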
S. cerevisiae manipulation.
Yeast strain W303 was used for all experiments. dCas9 activator constructs were cloned into vector pAG414GPD-ccdB (Addgene #14144)21. gRNAs (located between 100 and 200 bp upstream of the TSS) were expressed from the SNR52 promoter and cloned into the 2μ-based pAG60-2u vector22. Cells were grown up overnight at 30 °C in synthetic complete media lacking tryptophan and uracil. The following day cells were diluted 1:100 into 5 ml of fresh media and grown for an additional 7 h at 30 °C. 2 ml of culture was then spun down for RNA extraction.
Drosophila S2R+ cells were grown in Schneider's medium (Invitrogen) supplemented with 10% heat-inactivated FBS (JRH Biosciences) and penicillin/streptomycin (Sigma) at 25 °C without CO2. Cells were transfected using Effectene Transfection Reagent (Qiagen) according to manufacturer's instructions. Transfections were performed in 24-well plates and cells were seeded at 30,000 cells per well. 150 ng of dCas9 activator and 50 ng gRNAs were transfected and incubated for 3 d at 25 °C before extraction of total RNA. Five gRNAs were transfected against each of the indicated target genes.
Fluorescence reporter assay.
SP-dCas9 reporter assays were performed by targeting all dCas9-ADs with a single guide to a minimal CMV promoter, driving expression of a fluorescent reporter. Addgene plasmid #47320 (ref. 6) was used to screen for novel ADs (Supplementary Figs. 1 and 2) or was altered to contain a sfGFP reporter gene instead of tdTomato (Supplementary Figs. 3b, 4 and 5). In addition, to control for transfection efficiency (Supplementary Figs. 3b, 4 and 5), an EBFP2-expressing control plasmid was co-transfected at 25 ng per well (EBFP2 plasmid was not co-transfected in Supplementary Fig. 2). To remove untransfected cells from the analysis, sfGFP fluorescence was only analyzed in cells with >10^3 EBFP2 expression (as determined by flow cytometry). For fusion of VPR to other programmable transcription factors (ST1-dCas9, TALE, and zinc-finger protein), no EBFP2 plasmid was transfected. ST1-dCas9 reporter assays were performed using the previously described tdTomato reporter with an appropriate PAM inserted upstream of the tdTomato coding region (Addgene #48678)15. The binding sequences for the zinc finger and TALE are TAATTANGGGNG and TACCTCATCAGGAACATGTT, respectively.
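The transfection gate described above amounts to a threshold filter on per-cell fluorescence. The sketch below reads the threshold as 10^3 arbitrary fluorescence units; the event layout, the example values and the use of the median as a summary statistic are assumptions, as the text does not specify them.

```python
from statistics import median

EBFP2_GATE = 1e3  # threshold from the text, in arbitrary fluorescence units


def gated_sfgfp(events, gate=EBFP2_GATE):
    """Return sfGFP values for events whose EBFP2 signal exceeds the gate.

    events: iterable of (ebfp2, sfgfp) pairs, one per cell (hypothetical layout).
    """
    return [sfgfp for ebfp2, sfgfp in events if ebfp2 > gate]


if __name__ == "__main__":
    # Made-up events for illustration only.
    events = [(500, 10), (2000, 300), (1000, 50), (5000, 700)]
    kept = gated_sfgfp(events)
    print(kept)          # sfGFP values of the gated cells
    print(median(kept))  # median is an assumed summary statistic
```

The comparison is strict, so an event sitting exactly at the threshold is excluded, which matches the ">" in the text.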
Yeast RNA was extracted using the YeaStar kit (Zymogen), RNA from Drosophila S2R+ cells was extracted using Trizol (Life Technologies), and RNA from human cells was extracted using the RNeasy PLUS mini kit (Qiagen). Human tissue RNA was obtained from Life Technologies (Human Brain Total RNA (AM7962), Human Heart Total RNA (AM7966) and Human Testes Total RNA (AM7972)). 500 ng of RNA was used with the iScript cDNA synthesis Kit (Bio-Rad), and 0.5 μl of cDNA was used for each qPCR reaction, using the KAPA SYBR FAST Universal 2X qPCR Master Mix. The Drosophila qPCR reaction used iQ SYBR. qPCR primers are listed in Supplementary Table 1. qRT-PCR was run and analyzed on the CFX96 Real-Time PCR Detection System (Bio-Rad), with all target gene expression levels normalized to β-actin mRNA levels (human and M. musculus), FBA1 mRNA levels (S. cerevisiae) or RpL32 mRNA levels (D. melanogaster).
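The text states that target expression was normalized to a reference gene but does not give the formula. One common way to compute such a normalized fold change is the 2^-ΔΔCt method, sketched below on the assumption of roughly 100% amplification efficiency for both primer pairs. The Ct values are hypothetical, and this may not be the exact calculation used in the study.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus a control sample (2^-ddCt, assumed method).

    ct_target, ct_ref: Ct of the target and reference gene in the test sample.
    ct_target_ctrl, ct_ref_ctrl: the same two Ct values in the control sample.
    """
    delta_sample = ct_target - ct_ref             # normalize test sample
    delta_control = ct_target_ctrl - ct_ref_ctrl  # normalize control sample
    return 2.0 ** -(delta_sample - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values: the target amplifies 5 cycles earlier in the test sample.
    print(relative_expression(22.0, 16.0, 27.0, 16.0))
```

Here the reference gene would be β-actin, FBA1 or RpL32 depending on the organism, as listed above.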
Lentiviral particles were generated by transfecting 293T cells with the pSB700 sgRNA expression plasmid (with cerulean reporter) and the psPAX2 (Addgene #12260) and pMD2.G (Addgene #12259) packaging vectors at a ratio of 4:3:1, respectively. Viral supernatants were collected 48–72 h following transfection and concentrated using the PEG Virus Precipitation Kit (BioVision) according to the manufacturer's protocol.
iPSC culture and dCas9-AD cell line generation.
PGP1 iPSC cells were obtained from the Coriell Institute Biorepository (GM23338) and maintained on Matrigel (Corning) coated tissue culture plates in mTeSR1 Basal medium (Stemcell Technologies). To generate stable iPS dCas9-AD expressing cell lines, approximately 5 × 10^5 cells were nucleofected with 1.5 μg of dCas9-AD PiggyBac expression vector and 340 ng of transposase vector (System Biosciences) using the Amaxa P3 Primary Cell 4D-Nucleofector X Kit (Lonza), program CB-150. Following electroporation, cells were seeded onto 24-well Matrigel-coated plates in the presence of 10 μM ROCK inhibitor (R&D Systems) and allowed to recover for 2 d before expanding to 6-well plates in the presence of 20 μg/ml hygromycin to select for a mixed population of dCas9-AD integrant–containing cells.
iPSC transduction and neural induction.
iPS dCas9-AD cell lines were transduced with lentiviral preparations containing 30 gRNAs, targeted against either NEUROD1 or NGN2, 1 d after seeding onto Matrigel-coated plates. Transduced cells were expanded and then sorted for the top 15% of cerulean-positive cells (pSB700 gRNA expression). Sorted gRNA-containing dCas9-AD iPS cell lines were seeded in triplicate onto Matrigel-coated 24-well plates with mTeSR + 10 μM ROCK inhibitor in the presence of 1 μg/ml of doxycycline. Fresh mTeSR medium with or without doxycycline was added every day for 4 d, at which point cells were analyzed by light microscopy and immunofluorescence and harvested for qRT-PCR analysis.
Immunostaining of Cas9 iNeurons.
All steps for staining were performed at room temperature. Samples were washed once with PBS, fixed with 10% formalin (Electron Microscopy Sciences) for 20 min and then permeabilized with 0.2% Triton X-100/PBS for 15 min. Samples were next blocked with 8% BSA for 30 min and then stained with primary antibodies diluted into 4% BSA. Staining was performed for either 3 h with anti-beta III tubulin eFluor 660 conjugate (eBioscience, catalog no. 5045-10, clone 2G10-TB3) or 1 h with anti-neurofilament 200 (Sigma, catalog no. N4142), both at a 1:500 dilution. Samples were then washed 3 times, 5 min each, with 0.1% Tween/PBS, and then washed once with PBS. For neurofilament 200 staining, a secondary donkey anti-rabbit Alexa Fluor 647 (Life Technologies, cat. no. A-31573) antibody was added at a 1:1,000 dilution in 4% BSA for 1 h. Samples were again washed as previously mentioned and then stained with NucBlue (Hoechst 33342) (Life Technologies) for 5 min.
Image acquisition and analysis of Cas9 iNeurons.
24-well plates stained for NucBlue and neuronal markers were imaged with a 10× objective on a Zeiss Axio Observer Z1 microscope. Zen Blue software (Zeiss) was used to program acquisition of 24 images per well. Total cell (NucBlue) and iNeuron (beta III tubulin or neurofilament 200) counts were quantified for each image using custom Fiji and Matlab scripts and used to determine the percentage of iNeurons per well by the formula: (number of beta III tubulin–positive cells/number of NucBlue-positive cells) × 100. In preparation for publication, individual channels were composited and pseudocolored, with equal adjustments across samples and controls, in Fiji.
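The per-well percentage can be computed from per-image counts as in the sketch below. It is not the authors' Fiji or Matlab code; pooling counts across the images of a well before dividing, rather than averaging per-image percentages, is an assumption, and the example counts are made up.

```python
def percent_ineurons(image_counts):
    """Percentage of marker-positive cells in one well.

    image_counts: iterable of (marker_positive, nucblue_total) pairs, one per image.
    Counts are pooled across images before dividing (an assumed convention).
    """
    counts = list(image_counts)
    positive = sum(p for p, _ in counts)
    total = sum(t for _, t in counts)
    if total == 0:
        raise ValueError("no NucBlue-positive cells counted in this well")
    return 100.0 * positive / total


if __name__ == "__main__":
    # Made-up counts for three of the images acquired from one well.
    print(percent_ineurons([(12, 150), (20, 180), (8, 170)]))
```

Pooling weights each image by its cell count, so sparsely populated fields do not dominate the well-level estimate.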
All statistical comparisons are two-tailed t-tests calculated using the GraphPad Prism software package (version 6.0 for Windows; GraphPad Software, San Diego, CA). All sample numbers listed indicate the number of biological replicates employed in each experiment.
Custom Fiji and Matlab scripts used in analyzing iPS cell differentiation are available upon request.
Throughout our study, we employ a sample size which is frequently used for similar kinds of experiments; no statistical method was used to predetermine sample size. No data were excluded from any of our analyses. No randomization was employed, and blinding was not used except in iNeuron image analysis, where the scientist quantifying each of the conditions was blind to the sample type.
We thank K. Esvelt, P. Mali and the rest of the members of the Church and Collins labs for helpful discussions. We thank T. Ferrante, S. Byrne and M. Farrell for technical assistance, A. Keung (Massachusetts Institute of Technology) for providing zinc-finger reporter constructs, and J. Schulak for graphic design. This work was supported by US National Institutes of Health National Human Genome Research Institute grant P50 HG005550, US Department of Energy grant DE-FG02-02ER63445 and the Wyss Institute for Biologically Inspired Engineering. A.C. acknowledges funding by the National Cancer Institute grant 5T32CA009216-34. S.V. acknowledges funding by the National Science Foundation Graduate Research Fellowship Program, the Department of Biological Engineering at the Massachusetts Institute of Technology and the Department of Genetics at Harvard Medical School.
Integrated supplementary information
Supplementary Figures 1–12, Supplementary Table 1 and Supplementary Note