Abstract

The transcriptional programs that establish neuronal identity evolved to produce the rich diversity of neuronal cell types that arise sequentially during development. Remarkably, transient expression of certain transcription factors can also endow non-neural cells with neuronal properties. The relationship between reprogramming factors and the transcriptional networks that produce neuronal identity and diversity remains largely unknown. Here, from a screen of 598 pairs of transcription factors, we identify 76 pairs of transcription factors that induce mouse fibroblasts to differentiate into cells with neuronal features. By comparing the transcriptomes of these induced neuronal cells (iN cells) with those of endogenous neurons, we define a ‘core’ cell-autonomous neuronal signature. The iN cells also exhibit diversity; each transcription factor pair produces iN cells with unique transcriptional patterns that can predict their pharmacological responses. By linking distinct transcription factor input ‘codes’ to defined transcriptional outputs, this study delineates cell-autonomous features of neuronal identity and diversity and expands the reprogramming toolbox to facilitate engineering of induced neurons with desired patterns of gene expression and related functional properties.


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Davis, R. L., Weintraub, H. & Lassar, A. B. Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell 51, 987–1000 (1987).

  2. Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676 (2006).

  3. Vierbuchen, T. et al. Direct conversion of fibroblasts to functional neurons by defined factors. Nature 463, 1035–1041 (2010).

  4. Caiazzo, M. et al. Direct generation of functional dopaminergic neurons from mouse and human fibroblasts. Nature 476, 224–227 (2011).

  5. Chanda, S., Marro, S., Wernig, M. & Südhof, T. C. Neurons generated by direct conversion of fibroblasts reproduce synaptic phenotype caused by autism-associated neuroligin-3 mutation. Proc. Natl Acad. Sci. USA 110, 16622–16627 (2013).

  6. Kim, J. et al. Functional integration of dopaminergic neurons directly converted from mouse fibroblasts. Cell Stem Cell 9, 413–419 (2011).

  7. Pang, Z. P. et al. Induction of human neuronal cells by defined transcription factors. Nature 476, 220–223 (2011).

  8. Pfisterer, U. et al. Direct conversion of human fibroblasts to dopaminergic neurons. Proc. Natl Acad. Sci. USA 108, 10343–10348 (2011).

  9. Son, E. Y. et al. Conversion of mouse and human fibroblasts into functional spinal motor neurons. Cell Stem Cell 9, 205–218 (2011).

  10. Vadodaria, K. C. et al. Generation of functional human serotonergic neurons from fibroblasts. Mol. Psychiatry 21, 49–61 (2016).

  11. Wapinski, O. L. et al. Hierarchical mechanisms for direct reprogramming of fibroblasts to neurons. Cell 155, 621–635 (2013).

  12. Blanchard, J. W. et al. Selective conversion of fibroblasts into peripheral sensory neurons. Nat. Neurosci. 18, 25–35 (2015).

  13. Hockemeyer, D. et al. A drug-inducible system for direct reprogramming of human somatic cells to pluripotency. Cell Stem Cell 3, 346–353 (2008).

  14. Xu, C. et al. Immortalized fibroblast-like cells derived from human embryonic stem cells support undifferentiated cell growth. Stem Cells 22, 972–980 (2004).

  15. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  16. Dennis, G. Jr et al. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 4, 3 (2003).

  17. Schoenherr, C. J., Paquette, A. J. & Anderson, D. J. Identification of potential target genes for the neuron-restrictive silencer factor. Proc. Natl Acad. Sci. USA 93, 9881–9886 (1996).

  18. Drouin-Ouellet, J. et al. REST suppression mediates neural conversion of adult human fibroblasts via microRNA-dependent and -independent pathways. EMBO Mol. Med. 9, 1117–1131 (2017).

  19. Treutlein, B. et al. Dissecting direct reprogramming from fibroblast to neuron using single-cell RNA-seq. Nature 534, 391–395 (2016).

  20. Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).

  21. Zhang, B. & Horvath, S. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 4, https://doi.org/10.2202/1544-6115.1128 (2005).

  22. Mi, H., Poudel, S., Muruganujan, A., Casagrande, J. T. & Thomas, P. D. PANTHER version 10: expanded protein families and functions, and analysis tools. Nucleic Acids Res. 44 (D1), D336–D342 (2016).

  23. Mi, H., Muruganujan, A., Casagrande, J. T. & Thomas, P. D. Large-scale gene function analysis with the PANTHER classification system. Nat. Protoc. 8, 1551–1566 (2013).

  24. Oh, S. I. et al. Efficient reprogramming of mouse fibroblasts to neuronal cells including dopaminergic neurons. The Scientific World Journal 2014, 957548 (2014).

  25. Addis, R. C. et al. Efficient conversion of astrocytes to functional midbrain dopaminergic neurons using a single polycistronic vector. PLoS ONE 6, e28719 (2011).

  26. Dell’Anno, M. T. et al. Remote control of induced dopaminergic neurons in parkinsonian rats. J. Clin. Invest. 124, 3215–3229 (2014).

  27. Albuquerque, E. X., Pereira, E. F., Alkondon, M. & Rogers, S. W. Mammalian nicotinic acetylcholine receptors: from structure to function. Physiol. Rev. 89, 73–120 (2009).

  28. Thorgeirsson, T. E. et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature 452, 638–642 (2008).

  29. Dougherty, J. D., Schmidt, E. F., Nakajima, M. & Heintz, N. Analytical approaches to RNA profiling data for the identification of genes enriched in specific cells. Nucleic Acids Res. 38, 4218–4230 (2010).

  30. Xu, X., Wells, A. B., O’Brien, D. R., Nehorai, A. & Dougherty, J. D. Cell type-specific expression analysis to identify putative cellular mechanisms for neurogenetic disorders. J. Neurosci. 34, 1420–1431 (2014).

  31. Chin, M. T. Reprogramming cell fate: a changing story. Front. Cell Dev. Biol. 2, 46 (2014).

  32. Poulin, J. F., Tasic, B., Hjerling-Leffler, J., Trimarchi, J. M. & Awatramani, R. Disentangling neural cell diversity using single-cell transcriptomics. Nat. Neurosci. 19, 1131–1141 (2016).

  33. Usoskin, D. et al. Unbiased classification of sensory neuron types by large-scale single-cell RNA sequencing. Nat. Neurosci. 18, 145–153 (2015).

  34. Tasic, B. et al. Adult mouse cortical cell taxonomy revealed by single cell transcriptomics. Nat. Neurosci. 19, 335–346 (2016).

  35. Lo Sardo, V. et al. Influence of donor age on induced pluripotent stem cells. Nat. Biotechnol. 35, 69–74 (2017).

  36. Brewer, G. J. & Torricelli, J. R. Isolation and culture of adult neurons and neurospheres. Nat. Protoc. 2, 1490–1498 (2007).

  37. Hazen, J. L. et al. The complete genome sequences, unique mutational spectra, and developmental potency of adult neurons revealed by cloning. Neuron 89, 1223–1236 (2016).

  38. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  39. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  40. Anders, S., Pyl, P. T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).

  41. R Core Team. R: A Language and Environment for Statistical Computing http://www.R-project.org/ (R Foundation for Statistical Computing, Vienna, Austria, 2013).

  42. Gentleman, R. C. et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 5, R80 (2004).

  43. Warnes, G. R. gplots: Various R Programming Tools for Plotting Data. https://rdrr.io/cran/gplots/ (2015).

  44. Adler, D. et al. rgl: 3D Visualization Using OpenGL. https://rdrr.io/rforge/rgl/ (2016).

  45. pca3d: Three Dimensional PCA Plots. https://cran.r-project.org/web/packages/pca3d/index.html (2015).

  46. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).

  47. Matys, V. et al. TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 34, D108–D110 (2006).

  48. Mathelier, A. et al. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 44 (D1), D110–D115 (2016).

  49. Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).

  50. Hawrylycz, M. et al. Canonical genetic signatures of the adult human brain. Nat. Neurosci. 18, 1832–1844 (2015).

Acknowledgements

We thank M. Haynes, B. Seeger and A. Saluk for cell sorting, S. Head, J. Shimashita and J. Lesdesma for next-generation sequencing, K. Spencer for microscopy, V. Lo Sardo, W. Ferguson, M. Duran, J. Hazen, A. Adler and the Topol laboratory for technical assistance, R. Vega Perez for cell counting, and A. Su and J. Fouquier for assistance with BioGPS. BioGPS work is funded by R01 GM083924 to A. Su. This research was supported by the National Brain Research Program of Hungary (KTIA_NAP_13-2014-0018 to A.S.), by the NIH (NIDA, DA031566 to P.P.S.), by The Scripps Translational Science (A.T.), (CTSA; 5 UL1 TR001114 to A.T.), (U54GM114833 to A.T.), (NIDCD, DC012592 to K.K.B.), (NIMH, MH102698 to K.K.B.), (NIA, DP1 AG055944), and the Dorris Neuroscience Center (K.K.B.), a pre-doctoral fellowship from CIRM (J.W.B., R.T. and S.L.), an NSF Predoctoral Fellowship (R.T.) and the Andrea Elizabeth Vogt Memorial Award (J.W.B.).

Author information

Author notes

  1. These authors contributed equally: Rachel Tsunemoto, Sohyon Lee.

Affiliations

  1. Department of Molecular and Cellular Neuroscience, Dorris Neuroscience Center, The Scripps Research Institute, La Jolla, CA, USA

    Rachel Tsunemoto, Sohyon Lee, Pavel Chubukov, Joel W. Blanchard, Kevin T. Eade & Kristin K. Baldwin

  2. Neuroscience Graduate Program, University of California San Diego, La Jolla, CA, USA

    Rachel Tsunemoto & Kristin K. Baldwin

  3. BioCircuits Institute, University of California San Diego, La Jolla, CA, USA

    Attila Szűcs

  4. MTA-ELTE-NAP B Neuronal Cell Biology Research Group, Eotvos Lorand University, Budapest, Hungary

    Attila Szűcs

  5. Molecular and Integrative Neurosciences Department, The Scripps Research Institute, La Jolla, CA, USA

    Irina Sokolova & Pietro Paolo Sanna

  6. Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, USA

    Jacob Bruggemann & Chunlei Wu

  7. Scripps Translational Science Institute, Scripps Health and The Scripps Research Institute, La Jolla, CA, USA

    Ali Torkamani

Contributions

K.K.B., R.T. and S.L. designed and conceived the experiments and wrote and revised the manuscript; all of the authors edited the final drafts. K.T.E., J.W.B. and R.T. performed and analysed the screen experiments. A.S., I.S. and P.P.S. performed electrophysiology. R.T., S.L. and P.C. prepared cDNA libraries. R.T., S.L., P.C. and A.T. performed RNA-seq analysis. J.B. performed HOMER motif enrichment analysis. C.W. made data available through BioGPS.org. R.T. and S.L. performed all remaining experiments.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Kristin K. Baldwin.

Extended data figures and tables

  1. Extended Data Fig. 1 TUJ1 immunostaining of MEF- and TTF-derived iN cells and the p75-depletion experiment.

    a, TUJ1 immunofluorescence labelling on day 14–16 post-induction of 35 of the 76 hits that were selected for whole-transcriptome analysis. n = 3 independent experiments. b, TUJ1 immunofluorescence labelling of conditions with individual bHLH factors Ascl1, Ascl2, Neurog1 and Neurog3. n = 3 independent experiments. c, TUJ1 immunofluorescence labelling of MEFs treated with only reverse tetracycline-controlled transactivator (rtTA), without reprogramming factors. n = 3 independent experiments. d, TUJ1 immunofluorescence labelling of TTFs derived from three-day-old mice and transduced with selected reprogramming combinations following the same reprogramming methods used with MEFs. Fixed and stained on day 16 post-induction. n = 1 independent experiment. e, TUJ1 immunofluorescence of TTFs treated with only rtTA, without reprogramming factors, and fixed and stained on day 16 post-induction. n = 1 independent experiment. f, Representative FACS gates of MEFs (~180,000 cells shown). MEFs were depleted of p75+ neural crest cells by first gating for DAPI− cells (not shown) and collecting only those that were p75− (~93% of the DAPI− population). g, Quantification of immunostaining for p75+ cells in source and p75-depleted MEF populations after expansion for four days after FACS, on the day of transduction for reprogramming. Data are mean ± s.d., n = 3 biologically independent samples. h, Percentage of TUJ1+ cells derived from source and p75-depleted MEF populations 16 days after induction. A2, Ascl2; N3, Neurog3; ND2, NeuroD2; B3c, Pou4f3; P1, Pou1f1. Data are presented as the mean ± s.d., n = 3 biologically independent samples. Percentages of TUJ1+ cells were not significantly different between source and p75-depleted conditions (two-way ANOVA, Sidak’s multiple comparison test. A2.B3c, P = 0.895; N3.P1, P = 0.985; ND2.B3c, P > 0.999). Scale bars, 100 μm. Source Data

  2. Extended Data Fig. 2 Additional electrophysiological recordings of iN cells from five transcription factor combinations.

    a–e, Example voltage responses of representative iN cells from five transcription factor combinations: Neurog3/Pou1f1 (a; n = 3 cells), Neurog3/Pou5f1 (b; n = 2 cells), Ascl2/Pou4f3 (c; n = 3 cells), Neurod2/Pou4f3 (d; n = 2 cells) and Atoh1/Pou4f3 (e; n = 3 cells). Cells were stimulated using incremental levels of intracellular current starting at −100 to −50 pA and reaching levels where intense firing of action potentials was observed. f–h, Quantification of resting membrane potential (f), rheobase (g) and membrane input resistance (h) for cells that exhibited current-induced action potentials. Neurog3/Pou1f1 (N3.P1, n = 15 cells), Neurog3/Pou5f1 (N3.O4; n = 10 cells), Ascl2/Pou4f3 (A2.B3c; n = 15 cells), Neurod2/Pou4f3 (ND2.B3c; n = 10 cells) and Atoh1/Pou4f3 (Atoh1.B3c; n = 8 cells). Data are mean ± s.d.; ***P = 0.0006, *P = 0.0228; ns, not significant. One-way ANOVA, Tukey’s multiple comparison test. i–m, Physiological properties of the cells. i, Current–voltage relationship obtained by plotting the observed membrane potential as a function of the injected current of both maximal voltage deflections (black) and the membrane potential at the end of the current step (grey). Data from the third Neurog3/Pou1f1 cell in a. j, Selected action potential of the second Neurog3/Pou5f1 cell in b. The dual spike after-hyperpolarization is indicative of Ca-dependent K+ currents in this neuron. k, Input−output curve of the number of spikes as a function of the injected current. This cell starts firing at +100 pA (rheobase). l, Plot of the voltage sag (red) and after depolarization (dark yellow) as a function of the current. The Neurod2/Pou4f3 cells in d exhibit characteristic voltage sags under negative currents. The second Neurod2/Pou4f3 cell also produces post-inhibitory rebound spikes. m, Plot of membrane resistance versus current. Blue symbols are resistance values calculated from maximal voltage deflections and green symbols were obtained from voltage levels just before the termination of the current step of the third Atoh1/Pou4f3 cell in e. The decrease of membrane resistance as a function of current indicates the action of potent outward-rectifying K+ currents. n, Representative current traces from four cells showing EPSCs from tau–eGFP+, synapsin+ cells generated with Neurog3/Pou5f1 and Neurod2/Pou4f3. Source Data

  3. Extended Data Fig. 3 Transcription factor pairs generate functional human iN cells from HEFs.

    a, Representative TUJ1 immunofluorescence labelling of human iN cells reprogrammed from HEFs using mouse Neurog3/Pou1f1 or human NEUROG3/POU1F1. Scale bar, 100 μm. b, Quantification of TUJ1+ DAPI+ cells for mouse and human iN cells derived from mouse (m) or human (h) Neurog3 and Pou1f1 or rtTA only. Data from n = 2 biologically independent samples. c, Representative images of human iN cells reprogrammed from HEFs using pairs of mouse transcription factors. TUJ1 and MAP2 immunofluorescence labelling of 15 of the 76 positive pairwise combinations derived from the unbiased mouse screen. Fixed and stained on day 16–18 post-induction. Scale bar, 100 μm. Repeated with n = 2 independent experiments. d, Representative images of human iN cells reprogrammed from HEFs in an independent experiment from c. TUJ1 and MAP2 immunofluorescence labelling of four pairwise mouse transcription factor combinations. Fixed and stained on day 18 post-induction. Scale bar, 100 μm. e, Percentage fraction of MAP2+TUJ1+ cells from the four transcription factor combinations represented in d. Imaging from n = 2 biologically independent samples, 100 fields of view each. Number of TUJ1+ cells is as follows: Neurog1/Pou4f1 (n = 166 cells); Neurog3/Pou3f4 (n = 343 cells); Neurog3/Pou1f1 (n = 235 cells); Neurog3/Pou5f1 (n = 146 cells). Data are mean ± s.d. f, Representative synapsin (SYN1) and TUJ1 immunofluorescence labelling of human iN cells reprogrammed with Neurog3/Pou1f1 (91.5% positive for both). Scale bar, 100 μm. Repeated with n = 3 biologically independent samples. g–j, Electrophysiological recordings were performed on human iN cells generated with mouse Neurog3/Pou1f1 between 26 and 31 days post-induction. g, Representative voltage responses from a Syn1–TdTomato+ cell with neuronal morphology; 21 of 27 fluorescent cells tested (77%) generated action potentials upon current injection. h, Representative whole-cell currents evoked by hyperpolarizing and depolarizing voltage steps delivered from a holding potential of −65 mV. i, Passive membrane properties of human iN cells. Quantification of resting membrane potential (left), capacitance (middle) and membrane resistance (right) is shown as mean ± s.d. (n = 15 cells). j, Steady-state currents versus voltage in individual cells reflect the expression of depolarization-induced voltage-gated outward currents (n = 9 cells). Source Data

  4. Extended Data Fig. 4 FACS, RNA-seq library preparation and characterization of iN cell and endogenous neuron populations.

    a, Representative immunofluorescence labelling of tau–eGFP+ iN cell population (Ascl2/Pou4f2) on day 12 post-induction using neuronal antibodies TUJ1 and MAP2. Scale bars, 100 μm. Pou4f2 is also known as Brn3b. b, Quantification of co-labelling of tau–eGFP and MAP2 in TUJ1+ cells on day 12 post-induction calculated from various reprogramming transcription factor pairs. Data are presented as mean ± s.d. from n = 4 independent experiments and n = 574 cells. c, d, Representative FACS gates of an Ascl2/Pou4f2 iN cell population (500,000 cells shown) (c) and a negative rtTA-only control (40,000 cells shown) (d) sorted on day 16 post-induction. Live tau–eGFP+ cells were enriched by first gating DRAQ5+ DAPI− cells, then collecting only those that were GFP+. For Ascl2/Pou4f2, n = 2 independent experiments showed similar results, while for rtTA only, n = 40 independent experiments showed similar results. For all other iN cell populations, at least n = 2 independent experiments were performed to obtain biological replicates. e, Per cent of tau–eGFP+ cells out of total number of cells collected post-FACS, presented as mean ± s.d. (n = 4 sorts, > 100 cells per sort). f, g, Correlation plots between aligned counts from single sequenced libraries of a Neurog3/Pou3f2-iN cell population generated from 10 ng versus 5 ng input RNA (f) and 10 ng versus 1 ng input RNA (g). Pou3f2 is also known as Brn2. r, Pearson correlation coefficient. h, Correlation plots between aligned counts from single sequenced libraries of a Neurog3/Pou3f2 (10 ng input RNA) population and an Ascl1/Pou3f2 (10 ng input RNA) population. i–n, Representative images taken while dissecting tissue from various brain regions of appropriate mouse reporter strains used to isolate specific endogenous cell-type populations used for RNA-seq: cerebellum (CER) (i), DRG (j), cortex (CTX) (k), olfactory bulb mitral and tufted cells (OB-MT) and olfactory bulb granule cells (OB-GC) (l), hippocampus (HIP) (m), and dorsal-medial habenula (MHb-d) and ventral-medial habenula (MHb-v) (n). n = 2 independent RNA-seq experiments. o, Characteristics of the endogenous neuron populations used for RNA-seq. Source Data

  5. Extended Data Fig. 5 Upstream regulator analysis for core neuronal genes.

    a, Complete volcano plot of log2(fold change) versus –log(adjusted P value per gene) for MEFs (black) versus the pooled endogenous neuron and brain (endogenous neuron/brain) RNA-seq data. Genes enriched in MEFs and endogenous neuron/brain are plotted as negative and positive log2(fold change), respectively. Plotted are enriched core genes shared between iN cells and endogenous neuron/brain (orange, 75.5% of the significantly enriched endogenous neuron/brain genes), genes enriched in endogenous neuron/brain (purple, endo enriched), and genes enriched in iN cells (green, iN cell enriched). Red line, −log(0.05 P-adjusted value). Selected neural genes are labelled. b, Number of shared enriched genes between endogenous neurons and MEFs, individual endogenous neurons (purple) or iN cell (green) populations. Core genes (orange) are those shared collectively among iN cells and endogenous neurons. c, Heat map of expression of significant transcriptional regulators identified by HOMER only. Expression levels are defined as DESeq2 vsd-normalized RNA-seq counts with replicates averaged and scaled by row. d, Heat map of expression of significant class I–IV transcriptional regulators identified by IPA only. Class I, putative uniform neuronal repressor; Class II, putative non-uniform neuronal repressor; Class III, putative neuronal activator in iN cells; Class IV, putative neuronal activator in endogenous neurons. Expression levels are defined as DESeq2 vsd-normalized RNA-seq counts with groups averaged and scaled by row. Source Data

  6. Extended Data Fig. 6 scRNA-seq analysis of iN cell populations.

    a, t-SNE projection of single cells collected from four iN cell populations, Neurog3/Pou5f1 (N3.O4, n = 415 cells), Neurog3/Pou3f4 (N3.B4, n = 313 cells), Neurog1/Pou4f1 (N1.B3a, n = 134 cells) and Ascl2/Nr4a2 (A2.NR1, n = 90 cells), coloured by log of UMI counts per cell. Arrows point to subpopulations of cells with low UMI, which includes a cluster composed of cells from each iN cell population. b, t-SNE projection of the same single cells shown in a, coloured by the log of UMI counts for the myogenic genes Acta1, Tnnc2 and Myl1. Inset areas are magnified to highlight the small fraction of cells positive for the myogenic genes (3 out of 90, threshold set at log(UMI counts) > 1) in the Ascl2/Nr4a2 (A2.NR1) population. The three myogenic genes plotted were those identified previously (ref. 19) that were not highly expressed in any of our endogenous neuron populations. c, t-SNE projection of single cells collected from MEFs and five iN cell populations: Neurog3/Pou5f1 (N3.O4), Neurog3/Pou3f4 (N3.B4), Neurog1/Pou4f1 (N1.B3a), Ascl2/Nr4a2 (A2.NR1) and Neurog3/Pou1f1 (N3.P1). Cells are coloured by the log of UMI counts for genes Col1a2 and Lox, which represent MEF genes (10 out of 15 genes) that are highly expressed in the majority of the MEF population and in a small fraction of cells in the iN cell populations. Fifteen MEF genes were selected, based on the top genes enriched in MEFs compared to endogenous neuron/brain according to population RNA-seq that were not also expressed in endogenous neuronal single cells (data not shown). d, t-SNE projection of the same single cells as shown in c, coloured by the log of UMI counts for the genes Postn and Mmp2, which represent MEF genes (5 out of 15 genes) that are highly expressed in the majority of the MEF population and in a large fraction of cells in the iN cell populations. e, t-SNE projections of single cells coloured by log of UMI counts per cell for each of the individual iN cell populations sequenced: Neurog3/Pou3f4 (N3.B4), Neurog3/Pou5f1 (N3.O4), Neurog1/Pou4f1 (N1.B3a) and Ascl2/Nr4a2 (A2.NR1). The number of cells for each transcription factor combination is the same as in a. f, Expression of receptors and transmembrane proteins among the top 20 differentially expressed genes in each transcription factor pair relative to all other combinations, plotted as a simplified violin plot. One representative gene shown for each transcription factor pair.

  7. Extended Data Fig. 7 Diversity across iN cell populations.

    a, WGCNA module eigengene expression of the 35 iN cell populations (in duplicate) shown as bar plots of average module eigengene expression for module 09 (M09, n = 477 genes) correlated with bHLH subclasses. Colours highlight iN cell populations generated with the Ascl family of bHLH factors or an iN cell combination generated with the bHLH factor, Neurod2. b, Heat map of expression of myogenic genes reflects higher levels of expression in iN cell populations derived with the Ascl family of reprogramming factors compared to the Neurog family. The myogenic gene list is as described (ref. 19). Expression levels are defined as DESeq2 vsd-normalized RNA-seq counts with replicates averaged and scaled by row. The dendrogram represents hierarchical clustering based on correlation distance. c, Heat map of expression of select neurotransmitter-associated genes. Expression levels in iN cell (green), endogenous neuron/brain (purple) and MEF populations (grey) are defined as DESeq2 vsd-normalized RNA-seq counts with replicates averaged. Dendrogram represents hierarchical clustering based on correlation distance. d, Schematic of dopamine and noradrenaline biosynthesis pathway. e, Heat map of expression of genes involved in dopamine and noradrenaline biosynthesis and re-uptake across all iN cell (green), endogenous neurons (purple) and MEF (grey) populations. Expression patterns for populations generated with Ascl1/Nr4a2, Ascl2/Nr4a2, Ascl5/Pou4f3 and Neurod2/Pou4f3 are outlined with a black frame. Expression levels are defined as DESeq2 vsd-normalized RNA-seq counts with replicates averaged. Dendrogram represents hierarchical clustering based on correlation distance.

  8. Extended Data Fig. 8 Diverse glutamate and nicotine responses of iN cell populations.

    a, Heat map of expression of glutamate and nicotinic acetylcholine receptor subunit genes across all iN cell populations. Expression levels are defined as DESeq2 vsd-normalized RNA-seq counts with replicates averaged. Dendrogram represents hierarchical clustering based on correlation distance. b, Percentages of glutamate- and nicotine-responsive cells out of total KCl-responsive cells in each individual iN cell population (n = 218 total cells). Group 1 (n = 6 independent experiments) and group 2 (n = 4) comprise iN cell populations with the lowest and highest overall expression of nicotinic acetylcholine receptors, respectively. ***P = 0.0004; ns, not significant (unpaired Student’s t-test). Data are mean ± s.d. Source Data

  9. Extended Data Fig. 9 Comparison of iN cell and endogenous neuron populations.

    a, Heat map of expression of uniquely enriched genes in individual iN cell populations as defined by genes significantly enriched (P-adjusted value < 0.05) in each iN cell population versus all other iN cell populations and MEFs determined by DESeq2. Expression levels are defined as DESeq2 vsd-normalized RNA-seq counts with replicates averaged and scaled by row. Dendrogram represents hierarchical clustering based on Euclidean distance. b–g, Overlap of gene lists with a particular cell type or region for which data are currently available was identified by Fisher’s exact test (two-sided) with Benjamini–Hochberg correction using CSEA. Concentric, hexagonal plots represent each cell type or region. The sizes of the hexagons are scaled to the number of specifically enriched transcripts at set stringency thresholds with the innermost hexagon representing the most unique genes. Hexagons are colour coded by the P values of the Fisher’s exact test. RET, retina; HYP, hypothalamus; STR, striatum; HAB, habenula; BF, basal forebrain; BS, brainstem. b–d, CSEA of the core genes (enriched genes shared between iN cell and endogenous neuron/brain populations, n = 2,239 genes) (b) and uniquely enriched genes of iN cell populations Ascl1/Nr4a2 (A1.Nurr1, n = 282 genes) (c) and Neurog1/Pou4f1 and Neurog2/Pou4f1 (N1/N2.B3a, combined n = 93 genes total) (d). Uniquely enriched genes were defined in the same manner as in a. e–g, Modified CSEA visualization of uniquely enriched genes of individual iN cell populations: Ascl5/Pou4f3 (A5.B3c, n = 46 genes) (e), Neurog3/Pou5f1 (N3.O4, n = 51 genes) (f) and Ascl2/Nr4a2 (A2.NR1, n = 101 genes) (g). Uniquely enriched genes were defined in the same manner as in a.

  10. Extended Data Fig. 10 Comparison of single-cell iN cell and endogenous neuron populations.

    a, Pearson correlation values between individual single cells and bulk DRG plotted as kernel density distributions for each transcription factor pair, and colour-coded accordingly. To generate Pearson correlation values between endogenous populations and single cells, unique genes for each endogenous population (n = 1 population in duplicate biological samples) were defined using DESeq2 as the top 100 significant genes that were ranked by highest fold change when compared to all other endogenous populations (n = 5 in duplicate, n = 2 in triplicate biological samples). The expression level of these unique genes in their respective endogenous population was correlated with each single cell for genes that were found in filtered gene-barcode matrices. Pearson correlation values were plotted as kernel density estimations to represent the distribution of single cells for each iN cell population: Neurog1/Pou4f1 (N1.B3a, n = 134 cells, green), Neurog3/Pou5f1 (N3.O4, n = 415 cells, pink), Neurog3/Pou3f4 (N3.B4, n = 313 cells, blue) and Ascl2/Nr4a2 (A2.NR1, n = 90 cells, orange). b, t-SNE projections of 952 single cells coloured by their correlation with bulk DRG. The Neurog1/Pou4f1 pair exhibits enrichment of highly correlated cells. c, Pearson correlation values between individual single cells (n = 952 cells) and bulk HIP plotted as kernel density distributions for each combination and colour coded accordingly. d, t-SNE projections of 952 single cells coloured by their correlation with bulk HIP. The Ascl2/Nr4a2 pair exhibits enrichment of highly correlated cells. e–g, Pearson correlation values between individual single cells and bulk CTX (e), MHb-v (f) and CER (g) plotted as kernel density distributions for each combination, and colour-coded accordingly.
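
The single-cell-to-bulk comparison described in the legend of Extended Data Fig. 10a can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`pearson`, `correlate_cells_with_bulk`, `gaussian_kde`), the dictionary-based data layout, the toy gene names and expression values, and the kernel bandwidth are all hypothetical. In the legend, the marker genes are the top 100 DESeq2-ranked unique genes of an endogenous population; here a four-gene toy list stands in for them.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        return float("nan")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one profile is constant
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)


def correlate_cells_with_bulk(cells, bulk, marker_genes):
    """Correlate each single cell with a bulk profile over marker genes.

    cells: {cell_id: {gene: expression}}; bulk: {gene: expression}.
    Only marker genes present in both the cell and the bulk profile are
    used, mirroring the restriction to genes found in the filtered matrix.
    """
    result = {}
    for cell_id, profile in cells.items():
        shared = [g for g in marker_genes if g in profile and g in bulk]
        result[cell_id] = pearson(
            [profile[g] for g in shared], [bulk[g] for g in shared]
        )
    return result


def gaussian_kde(values, grid, bandwidth):
    """Gaussian kernel density estimate of `values` evaluated on `grid`.

    The bandwidth is an assumption; the legend does not state one.
    """
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values)
        for g in grid
    ]


# Toy data: hypothetical genes and expression values, for illustration only.
demo_bulk = {"g1": 10.0, "g2": 5.0, "g3": 1.0, "g4": 8.0}
demo_cells = {
    "cell_hi": {"g1": 9.0, "g2": 4.0, "g3": 0.0, "g4": 7.0},
    "cell_lo": {"g1": 0.0, "g2": 3.0, "g3": 9.0, "g4": 1.0},
}
demo_r = correlate_cells_with_bulk(demo_cells, demo_bulk, ["g1", "g2", "g3", "g4"])

# Density of the per-cell correlations on a grid spanning [-1, 1].
demo_grid = [i / 50.0 - 1.0 for i in range(101)]
demo_density = gaussian_kde(list(demo_r.values()), demo_grid, bandwidth=0.2)
```

Plotting one density curve per transcription factor pair on a shared axis would give a figure of the same kind as panels a, c and e–g. The bandwidth controls how smooth each curve looks, so it should be fixed across populations before the distributions are compared.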

Supplementary information

  1. Supplementary Information

    This file contains full legends for Supplementary Tables 1-4

  2. Reporting Summary

  3. Supplementary Table 1

    This file contains Supplementary Table 1 - see the Supplementary Information document for full legend

  4. Supplementary Table 2

    This file contains Supplementary Table 2 - see the Supplementary Information document for full legend

  5. Supplementary Table 3

    This file contains Supplementary Table 3 - see the Supplementary Information document for full legend

  6. Supplementary Table 4

    This file contains Supplementary Table 4 - see the Supplementary Information document for full legend

  7. Source Data Figure 1

  8. Source Data Figure 2

  9. Source Data Figure 5

  10. Source Data Extended Data Figure 1

  11. Source Data Extended Data Figure 2

  12. Source Data Extended Data Figure 3

  13. Source Data Extended Data Figure 4

  14. Source Data Extended Data Figure 5

  15. Source Data Extended Data Figure 8

About this article

DOI

https://doi.org/10.1038/s41586-018-0103-5
