The equivalence of human induced pluripotent stem cells (hiPSCs) and human embryonic stem cells (hESCs) remains controversial. Here we use genetically matched hESC and hiPSC lines to assess the contribution of cellular origin (hESC vs. hiPSC), the Sendai virus (SeV) reprogramming method and genetic background to transcriptional and DNA methylation patterns while controlling for cell line clonality and sex. We find that transcriptional and epigenetic variation originating from genetic background dominates over variation due to cellular origin or SeV infection. Moreover, the 49 differentially expressed genes we detect between genetically matched hESCs and hiPSCs neither predict functional outcome nor distinguish an independently derived, larger set of unmatched hESC and hiPSC lines. We conclude that hESCs and hiPSCs are molecularly and functionally equivalent and cannot be distinguished by a consistent gene expression signature. Our data further imply that genetic background variation is a major confounding factor for transcriptional and epigenetic comparisons of pluripotent cell lines, explaining some of the previously observed differences between genetically unmatched hESCs and hiPSCs.
Gene Expression Omnibus
Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676 (2006).
Takahashi, K. et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 131, 861–872 (2007).
Park, I.-H. et al. Disease-specific induced pluripotent stem cells. Cell 134, 877–886 (2008).
Yu, J. et al. Induced pluripotent stem cell lines derived from human somatic cells. Science 318, 1917–1920 (2007).
Chin, M.H. et al. Induced pluripotent stem cells and embryonic stem cells are distinguished by gene expression signatures. Cell Stem Cell 5, 111–123 (2009).
Bock, C. et al. Reference maps of human ES and iPS cell variation enable high-throughput characterization of pluripotent cell lines. Cell 144, 439–452 (2011).
Chin, M.H., Pellegrini, M., Plath, K. & Lowry, W.E. Molecular analyses of human induced pluripotent stem cells and embryonic stem cells. Cell Stem Cell 7, 263–269 (2010).
Ruiz, S. et al. Identification of a specific reprogramming-associated epigenetic signature in human induced pluripotent stem cells. Proc. Natl. Acad. Sci. USA 109, 16196–16201 (2012).
Teichroeb, J.H., Betts, D.H. & Vaziri, H. Suppression of the imprinted gene NNAT and X-chromosome gene activation in isogenic human iPS cells. PLoS One 6, e23436 (2011).
Phanstiel, D.H. et al. Proteomic and phosphoproteomic comparison of human ES and iPS cells. Nat. Methods 8, 821–827 (2011).
Soldner, F. et al. Parkinson's disease patient-derived induced pluripotent stem cells free of viral reprogramming factors. Cell 136, 964–977 (2009).
Stadtfeld, M. et al. Ascorbic acid prevents loss of Dlk1-Dio3 imprinting and facilitates generation of all-iPS cell mice from terminally differentiated B cells. Nat. Genet. 44, 398–405, S1–S2 (2012).
Newman, A.M. & Cooper, J.B. Lab-specific gene expression signatures in pluripotent stem cells. Cell Stem Cell 7, 258–262 (2010).
Rouhani, F. et al. Genetic background drives transcriptional variation in human induced pluripotent stem cells. PLoS Genet. 10, e1004432 (2014).
Humpherys, D. et al. Epigenetic instability in ES cells and cloned mice. Science 293, 95–97 (2001).
Tchieu, J. et al. Female human iPSCs retain an inactive X chromosome. Cell Stem Cell 7, 329–342 (2010).
Anguera, M.C. et al. Molecular signatures of human induced pluripotent stem cells highlight sex differences and cancer genes. Cell Stem Cell 11, 75–90 (2012).
Stadtfeld, M. et al. Aberrant silencing of imprinted genes on chromosome 12qF1 in mouse induced pluripotent stem cells. Nature 465, 175–181 (2010).
Fusaki, N., Ban, H., Nishiyama, A., Saeki, K. & Hasegawa, M. Efficient induction of transgene-free human pluripotent stem cells using a vector based on Sendai virus, an RNA virus that does not integrate into the host genome. Proc. Jpn. Acad., Ser. B. Phys. Biol. Sci. 85, 348–362 (2009).
Cowan, C.A. et al. Derivation of embryonic stem-cell lines from human blastocysts. N. Engl. J. Med. 350, 1353–1356 (2004).
Mallon, B.S. et al. Comparison of the molecular profiles of human embryonic and induced pluripotent stem cells of genetically matched origin. Stem Cell Res. (Amst.) 12, 376–386 (2014).
Guenther, M.G. et al. Chromatin structure and gene expression programs of human embryonic and induced pluripotent stem cells. Cell Stem Cell 7, 249–257 (2010).
Maherali, N. et al. A high-efficiency system for the generation and study of human induced pluripotent stem cells. Cell Stem Cell 3, 340–345 (2008).
Everse, J. & Kaplan, N.O. Lactate dehydrogenases: structure and function. Adv. Enzymol. 37, 61–133 (1973).
Fantin, V.R., St-Pierre, J. & Leder, P. Attenuation of LDH-A expression uncovers a link between glycolysis, mitochondrial physiology, and tumor maintenance. Cancer Cell 9, 425–434 (2006).
Mueckler, M. et al. Sequence and structure of a human glucose transporter. Science 229, 941–945 (1985).
Young, C.D. et al. Modulation of glucose transporter 1 (GLUT1) expression levels alters mouse mammary tumor cell growth in vitro and in vivo. PLoS One 6, e23205 (2011).
Zhou, W. et al. HIF1α induced switch from bivalent to exclusively glycolytic metabolism during ESC-to-EpiSC/hESC transition. EMBO J. 31, 2103–2113 (2012).
Cohen, D.R., Cheng, C.W., Cheng, S.H. & Hui, C.C. Expression of two novel mouse Iroquois homeobox genes during neurogenesis. Mech. Dev. 91, 317–321 (2000).
Matsumoto, K. et al. The prepattern transcription factor Irx2, a target of the FGF8/MAP kinase cascade, is involved in cerebellum formation. Nat. Neurosci. 7, 605–612 (2004).
Girirajan, S. et al. Refinement and discovery of new hotspots of copy-number variation associated with autism spectrum disorder. Am. J. Hum. Genet. 92, 221–237 (2013).
Marshall, C.R. et al. Structural variation of chromosomes in autism spectrum disorder. Am. J. Hum. Genet. 82, 477–488 (2008).
Zhang, Y. et al. Functional genomic screen of human stem cell differentiation reveals pathways involved in neurodevelopment and neurodegeneration. Proc. Natl. Acad. Sci. USA 110, 12361–12366 (2013).
Chambers, S.M. et al. Highly efficient neural conversion of human ES and iPS cells by dual inhibition of SMAD signaling. Nat. Biotechnol. 27, 275–280 (2009).
Zhang, X. et al. Pax6 is a human neuroectoderm cell fate determinant. Cell Stem Cell 7, 90–100 (2010).
Tsankov, A.M. et al. A qPCR ScoreCard quantifies the differentiation potential of human pluripotent stem cells. Nat. Biotechnol. doi:10.1038/nbt.3387 (2015).
Wang, C. et al. The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat. Biotechnol. 32, 926–932 (2014).
Zhao, S., Fung-Leung, W.-P., Bittner, A., Ngo, K. & Liu, X. Comparison of RNA-Seq and microarray in transcriptome profiling of activated T cells. PLoS One 9, e78644 (2014).
Loewer, S. et al. Large intergenic non-coding RNA-RoR modulates reprogramming of human induced pluripotent stem cells. Nat. Genet. 42, 1113–1117 (2010).
Abyzov, A. et al. Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells. Nature 492, 438–442 (2012).
Koyanagi-Aoi, M. et al. Differentiation-defective phenotypes revealed by large-scale analyses of human pluripotent stem cells. Proc. Natl. Acad. Sci. USA 110, 20569–20574 (2013).
Soldner, F. et al. Generation of isogenic pluripotent stem cells differing exclusively at two early onset Parkinson point mutations. Cell 146, 318–331 (2011).
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Flicek, P. et al. Ensembl 2014. Nucleic Acids Res. 42, D749–D755 (2014).
Lee, S., Seo, C.H., Alver, B.H., Lee, S. & Park, P.J. EMSAR: estimation of transcript abundance from RNA-seq data by mappability-based segmentation and reclustering. BMC Bioinformatics 16, 278 (2015).
Robinson, M.D., McCarthy, D.J. & Smyth, G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Anders, S. & Huber, W. Differential expression analysis for sequence count data. Genome Biol. 11, R106 (2010).
Thorvaldsdóttir, H., Robinson, J.T. & Mesirov, J.P. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform. 14, 178–192 (2013).
Huang, W., Sherman, B.T. & Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
Huang, W., Sherman, B.T. & Lempicki, R.A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37, 1–13 (2009).
Dunn, J.C. Well-separated clusters and optimal fuzzy partitions. J. Cybern. 4, 95–104 (1974).
Rousseeuw, P.J. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987).
Boyle, P. et al. Gel-free multiplexed reduced representation bisulfite sequencing for large-scale DNA methylation profiling. Genome Biol. 13, R92 (2012).
Zhong, L. et al. The histone deacetylase Sirt6 regulates glucose homeostasis via Hif1α. Cell 140, 280–293 (2010).
Sebastián, C. et al. The histone deacetylase SIRT6 is a tumor suppressor that controls cancer metabolism. Cell 151, 1185–1199 (2012).
We thank members of the Hochedlinger and Park laboratories for productive discussions and a critical reading of the manuscript. We also thank M. Stadtfeld for his helpful discussions and D. Melton for his generous donation of HUES2 and HUES3 lines. We are grateful to K. Folz-Donahue, M. Weglarz and L. Prickett at the Massachusetts General Hospital (MGH)/Harvard Stem Cell Institute (HSCI) flow cytometry core for their constant assistance and support. We are also thankful to the members of the Tufts Genomics Core for performing RNA-seq. Work in the Lee laboratory was supported by grants from the Robertson Investigator Award of the New York Stem Cell Foundation and from the Maryland Stem Cell Research Fund (TEDCO). A.M. and J.L.R. are supported by US National Institutes of Health (NIH) grant P01GM099117. A.M. is a New York Stem Cell Foundation Robertson Investigator. Parts of this work were supported by the Howard Hughes Medical Institute (HHMI), MGH startup funds, the Gerald and Darlene Jordan Endowed Chair for Regenerative Medicine (to K.H.) and a pilot grant from the NIH (P01GM099117 to K.H.). J.C. was supported by the Vranos Family Graduate Research Fellowship in Developmental & Regenerative Biology.
The authors declare no competing financial interests.
Integrated supplementary information
(A) Representative AP staining for parental fibroblasts of each genetic background (top panels) and control hESC GFP lines from the corresponding background (bottom panels) that were cultured in hESC media. Parental fibroblasts failed to form any pluripotent colonies, whereas hESC lines formed multiple pluripotent colonies. Insets show magnification of AP staining. (B) Principal component analysis (PCA) of isogenic cell lines based on global gene expression levels. (C) The hiPSC1 line was stained with DAPI and α-SeV antibody at passage 7 (top panels) and 15 (bottom panels). Representative images are shown. (D) Transgene-specific primers were used to detect the expression of Oct4 and Klf4 at passage 15. hiPSC1 at passage 7 was used as a positive control. (E) Heatmap and dendrogram separating all genetically matched hESC lines based on the 63 DEGs from Fig. 2B. HUES lines, dark blue; hESC GFP lines, light blue. (F) Gene ontology enrichment analysis for the 63 DEGs from Fig. 2B. Gene count for each term is shown.
Supplementary Figure 2 Effects of genetic background on global gene expression and DNA methylation patterns.
(A) Heatmap and dendrogram for isogenic hESC subclones, in vitro-differentiated fibroblasts, derivative hiPSCs and dermal fibroblasts based on pairwise Pearson correlation (r) of global gene expression levels. hiPSC lines, red; hESC lines, blue; fibroblasts, black. (B) Dendrogram for hESCs, in vitro-differentiated fibroblasts and derivative hiPSCs based on global DNA methylation levels as determined by RRBS analysis. (C) Bar plots showing mean absolute deviation (MAD) in global gene expression levels of hiPSC lines relative to matched hESC GFP lines (i.e., HUES2-derived hiPSCs vs. HUES2-derived hESC GFPs and HUES3-derived hiPSCs vs. HUES3-derived hESC GFPs; top bar) or unmatched hESC GFP lines (i.e., HUES2-derived hiPSCs vs. HUES3-derived hESC GFPs and HUES3-derived hiPSCs vs. HUES2-derived hESC GFPs; bottom bar) (also see Methods). Blue rectangles and error bars represent mean values and s.d. of six hiPSC lines. (D) Bar plots showing MAD in global gene expression levels of hiPSC lines relative to either hESC GFP lines (1st bar) or hiPSC lines (2nd bar) and of hESC GFP lines relative to either hiPSC lines (3rd bar) or hESC GFP lines (4th bar), independent of genetic background. Blue rectangles and error bars represent mean values and s.d. of either six hiPSC lines or six hESC GFP lines. (E) Total number of mapped reads for individual RNA-seq samples with technical replicates merged. Red dotted line indicates the average.
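The matched versus unmatched MAD comparison in panel C can be illustrated with a minimal sketch. The exact MAD definition is given in the Methods and is not reproduced here, so the code assumes one plausible reading: for each hiPSC line, the mean over genes of the absolute difference from the per-gene mean of the reference hESC lines. The function names and the toy expression values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev

def mad_to_reference(sample, reference_lines):
    """Mean over genes of |sample - per-gene mean of the reference lines|."""
    n_genes = len(sample)
    ref = [mean(line[g] for line in reference_lines) for g in range(n_genes)]
    return mean(abs(sample[g] - ref[g]) for g in range(n_genes))

# Toy log-scale expression values (3 genes per line); not data from the study.
hesc = {
    "HUES2": [[5.0, 2.0, 7.0], [5.2, 2.1, 6.9]],
    "HUES3": [[6.0, 3.0, 5.0], [6.1, 2.9, 5.2]],
}
hipsc = {
    "HUES2": [[5.1, 2.0, 7.1], [4.9, 2.2, 6.8]],
    "HUES3": [[6.0, 3.1, 5.1], [5.9, 3.0, 5.0]],
}
other = {"HUES2": "HUES3", "HUES3": "HUES2"}

def summarize(matched):
    """Mean and s.d. of MAD across all hiPSC lines for one comparison."""
    values = [
        mad_to_reference(line, hesc[bg if matched else other[bg]])
        for bg, lines in hipsc.items()
        for line in lines
    ]
    return mean(values), stdev(values)

matched_mean, matched_sd = summarize(matched=True)
unmatched_mean, unmatched_sd = summarize(matched=False)
print(f"matched:   {matched_mean:.3f} +/- {matched_sd:.3f}")
print(f"unmatched: {unmatched_mean:.3f} +/- {unmatched_sd:.3f}")
```

In this toy setup the two backgrounds differ far more than the lines within a background, so the unmatched comparison gives the larger deviation, which is the pattern the bar plots are designed to expose.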
(A) Full blot for Fig. 3g.
Supplementary Figure 4 Comparison of differentiated cells derived from isogenic hESC GFP and hiPSC lines.
(A) Venn diagram showing the number of up- and down-regulated genes in 3 biological replicate hiPSC fibroblasts relative to 3 biological replicate hESC GFP fibroblasts within each genetic background. (B) Box plot of 12 in vitro-differentiated fibroblast-like cell lines and primary dermal fibroblasts (cross) based on the 2 DEGs (identified in A) between hiPSC fibroblasts (red) and hESC GFP fibroblasts (blue). (C) Schematic for identifying “inconsistently differentially expressed genes (iDEGs)” that were dysregulated in only a subset of hiPSC lines when compared to hESC GFP lines. Red and green boxes represent the 6 discrete grouping patterns of samples used for differential expression analysis, in which one or two hiPSC lines are treated as replicates of the hESC lines of the same genetic background. Differentially expressed genes for each pattern were identified and merged within each genetic background, and the intersection was taken across the two genetic backgrounds (Venn diagram; grey, HUES2; green, HUES3). The 8 iDEGs common to the two backgrounds are indicated. (D) hESC GFP and hiPSC lines were differentiated into neuroectodermal cells, and Western blot analysis was used to detect neural differentiation by PAX6 expression at day 6 in each cell line. GAPDH was used as a control.
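The iDEG scheme in panel C can be sketched as follows, assuming three hiPSC lines per genetic background (as in panel A). Moving one or two of the three lines into the hESC group gives 3 + 3 = 6 grouping patterns. The function `call_degs` stands in for whatever differential expression test is used and is a hypothetical placeholder, as are the line names and the toy gene table; only the enumerate, merge and intersect logic follows the legend.

```python
from itertools import combinations

def grouping_patterns(hesc_lines, hipsc_lines):
    """Move one or two hiPSC lines into the hESC group; return all regroupings."""
    patterns = []
    for k in (1, 2):
        for moved in combinations(hipsc_lines, k):
            group_a = list(hesc_lines) + list(moved)
            group_b = [line for line in hipsc_lines if line not in moved]
            patterns.append((group_a, group_b))
    return patterns

def inconsistent_degs(lines_by_background, call_degs):
    """Union of DEGs over patterns per background, then intersect backgrounds."""
    merged = []
    for hesc_lines, hipsc_lines in lines_by_background.values():
        degs = set()
        for group_a, group_b in grouping_patterns(hesc_lines, hipsc_lines):
            degs |= set(call_degs(group_a, group_b))
        merged.append(degs)
    return set.intersection(*merged)

# Toy stand-in for a DE test: a gene is "called" when every line left in the
# hiPSC group is listed as aberrant for that gene. Purely illustrative.
aberrant = {"GENE_A": {"H2_iPS1", "H3_iPS2"}, "GENE_B": {"H2_iPS3"}}

def toy_call_degs(group_a, group_b):
    return {gene for gene, lines in aberrant.items() if set(group_b) <= lines}

lines_by_background = {
    "HUES2": (["H2_ES1", "H2_ES2", "H2_ES3"], ["H2_iPS1", "H2_iPS2", "H2_iPS3"]),
    "HUES3": (["H3_ES1", "H3_ES2", "H3_ES3"], ["H3_iPS1", "H3_iPS2", "H3_iPS3"]),
}

patterns = grouping_patterns(*lines_by_background["HUES2"])
idegs = inconsistent_degs(lines_by_background, toy_call_degs)
print(len(patterns), sorted(idegs))
```

Here GENE_A is aberrant in a single line of each background, so it is recovered in both and survives the intersection, whereas GENE_B appears in only one background and is dropped.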
Supplementary Figure 5 Analyses of differentially expressed genes between hESC and hiPSC lines using independent reprogramming data sets.
(A) Sixteen genes were identified as differentially expressed between unmatched hESC and hiPSC lines described in this study (FDR < 0.15 and fold change > 2 or < 0.5; see Methods for details). (B) Definition and number of significantly differentially expressed genes (DEGs) between hESCs and hiPSCs. (C) Dendrogram for all isogenic hESC (blue) and hiPSC (red) lines from Choi et al. based on expression levels of the 16 DEGs identified in A. (D–F) Dendrograms for non-isogenic hESC (blue) and hiPSC (red) lines (Choi et al.) based on DEGs defined in other studies (see Supplementary Fig. 5B). (G–I) Dendrograms for non-isogenic hESC (blue) and hiPSC (red) lines (Phanstiel et al., ref. 10) based on DEGs defined in other studies (see Supplementary Fig. 5B). (J) Genes were ranked by the sum of –log10(p-value) of differential expression between HUES2 and HUES3 lines (see Methods for details), and the frequency (top) and placement (bottom) of Phanstiel et al.’s DEGs (ref. 10; red circles) within the ranking were determined. (K) Left panel: distribution of Dunn-index-based scores of random gene sets; the score measures how well a gene set separates our isogenic samples by genetic background. Larger values indicate better separation. Zero indicates that the samples are not separated by genetic background. Each of the 10,000 random gene sets was size- and expression-matched to Phanstiel’s DEGs (ref. 10). Red vertical line indicates the value for Phanstiel’s DEGs, which separate the samples by genetic background significantly better than random gene sets (p-value = 0.0236). Right panel: distribution of expression levels of the random gene sets, computed as the sum of log(TPM+1). Gene sets were stratified according to whether they separate our isogenic samples by genetic background (red) or not (green), showing that the separation is not affected by expression levels. Dotted line indicates Phanstiel’s DEGs.
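The random gene set comparison in panel K can be sketched as an empirical test. The legend describes a "Dunn-index-based" score whose exact form is in the Methods; the code below instead uses the plain Dunn index (smallest between-group distance divided by largest within-group distance) as a stand-in, and matches random sets on size only, not on expression level. All names and values are hypothetical toy data.

```python
import math
import random

def dunn_index(samples, labels):
    """Smallest between-group distance over largest within-group distance."""
    between, within = [], []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            d = math.dist(samples[i], samples[j])
            (within if labels[i] == labels[j] else between).append(d)
    return min(between) / max(within)

def empirical_p(expr, labels, gene_set, n_random=1000, seed=0):
    """Compare a gene set's score with size-matched random gene sets."""
    rng = random.Random(seed)
    genes = list(expr)

    def score(gs):
        # Build one expression vector per sample, restricted to the gene set.
        samples = [[expr[g][s] for g in gs] for s in range(len(labels))]
        return dunn_index(samples, labels)

    observed = score(gene_set)
    hits = sum(
        score(rng.sample(genes, len(gene_set))) >= observed
        for _ in range(n_random)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_random + 1)

# Toy data: two samples per background; two genes track background, rest is noise.
labels = ["HUES2", "HUES2", "HUES3", "HUES3"]
noise = random.Random(1)
expr = {f"NOISE{i}": [noise.gauss(5, 1) for _ in labels] for i in range(50)}
expr["BG1"] = [1.0, 1.1, 5.0, 5.2]
expr["BG2"] = [8.0, 7.9, 3.0, 3.1]

observed, p_value = empirical_p(expr, labels, ["BG1", "BG2"])
print(f"Dunn index = {observed:.2f}, empirical p = {p_value:.4f}")
```

A score above 1 means every between-background distance exceeds every within-background distance; the empirical p-value is the fraction of random sets that separate the backgrounds at least as well as the tested set.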
About this article
Cite this article
Choi, J., Lee, S., Mallard, W. et al. A comparison of genetically matched cell lines reveals the equivalence of human iPSCs and ESCs. Nat Biotechnol 33, 1173–1181 (2015). https://doi.org/10.1038/nbt.3388