Human pluripotent stem cells recurrently acquire and expand dominant negative P53 mutations

Human pluripotent stem cells (hPS cells) can self-renew indefinitely, making them an attractive source for regenerative therapies. This expansion potential has been linked with the acquisition of large copy number variants that provide mutated cells with a growth advantage in culture1,2,3. The nature, extent and functional effects of other acquired genome sequence mutations in cultured hPS cells are not known. Here we sequence the protein-coding genes (exomes) of 140 independent human embryonic stem cell (hES cell) lines, including 26 lines prepared for potential clinical use4. We then apply computational strategies for identifying mutations present in a subset of cells in each hES cell line5. Although such mosaic mutations were generally rare, we identified five unrelated hES cell lines that carried six mutations in the TP53 gene that encodes the tumour suppressor P53. The TP53 mutations we observed are dominant negative and are the mutations most commonly seen in human cancers. We found that the TP53 mutant allelic fraction increased with passage number under standard culture conditions, suggesting that the P53 mutations confer selective advantage. We then mined published RNA sequencing data from 117 hPS cell lines, and observed another nine TP53 mutations, all resulting in coding changes in the DNA-binding domain of P53. In three lines, the allelic fraction exceeded 50%, suggesting additional selective advantage resulting from the loss of heterozygosity at the TP53 locus. As the acquisition and expansion of cancer-associated mutations in hPS cells may go unnoticed during most applications, we suggest that careful genetic characterization of hPS cells and their differentiated derivatives be carried out before clinical use.
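As an illustration of how a rising mutant allelic fraction across passages can be expressed as a per-passage advantage, the sketch below fits a straight line to the log-odds of the mutant cell fraction. It is not the calculation used in the study, and the function names are hypothetical. It assumes a heterozygous mutation, so that the mutant cell fraction is twice the allelic fraction (which no longer holds once the allelic fraction reaches 50%), and a constant advantage at every passage.

```python
import math


def mutant_cell_fraction(allelic_fraction: float) -> float:
    """Fraction of cells carrying the mutation, ASSUMING heterozygosity."""
    if not 0.0 < allelic_fraction < 0.5:
        raise ValueError("heterozygous model needs 0 < allelic fraction < 0.5")
    return 2.0 * allelic_fraction


def per_passage_advantage(passages, allelic_fractions) -> float:
    """Estimate the relative advantage s per passage.

    ASSUMED model: the odds of a cell being mutant, f / (1 - f), are
    multiplied by (1 + s) at each passage, so log-odds are linear in
    passage number. The slope is fitted by ordinary least squares.
    """
    if len(passages) != len(allelic_fractions) or len(passages) < 2:
        raise ValueError("need at least two matched measurements")
    fractions = [mutant_cell_fraction(af) for af in allelic_fractions]
    log_odds = [math.log(f / (1.0 - f)) for f in fractions]
    n = len(passages)
    mean_x = sum(passages) / n
    mean_y = sum(log_odds) / n
    sxx = sum((x - mean_x) ** 2 for x in passages)
    if sxx == 0:
        raise ValueError("passage numbers must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(passages, log_odds))
    slope = sxy / sxx
    return math.exp(slope) - 1.0
```

Under these assumptions a positive return value means the mutant cells gain ground at each passage, and a value near zero means no detectable advantage.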


Change history

  • Corrected online 04 May 2017

    New references 29 and 30 were added, and subsequent citations were renumbered accordingly.


  1. ISCI. Screening ethnically diverse human embryonic stem cells identifies a chromosome 20 minimal amplicon conferring growth advantage. Nat. Biotechnol. 29, 1132–1144 (2011)
  2. et al. BCL-XL mediates the strong selective advantage of a 20q11.21 amplification commonly found in human embryonic stem cell cultures. Stem Cell Rep. 1, 379–386 (2013)
  3. et al. Gain of 20q11.21 in human embryonic stem cells improves cell survival by increased expression of Bcl-xL. Mol. Hum. Reprod. 20, 168–177 (2014)
  4. Good manufacturing practice and clinical-grade human embryonic stem cell lines. Hum. Mol. Genet. 17, R48–R53 (2008)
  5. et al. Clonal hematopoiesis and blood-cancer risk inferred from blood DNA sequence. N. Engl. J. Med. 371, 2477–2487 (2014)
  6. Somatic mutation in cancer and normal cells. Science 349, 1483–1489 (2015)
  7. et al. Age-related clonal hematopoiesis associated with adverse outcomes. N. Engl. J. Med. 371, 2488–2498 (2014)
  8. et al. Characterization of human embryonic stem cell lines by the International Stem Cell Initiative. Nat. Biotechnol. 25, 803–816 (2007)
  9. et al. Detecting genetic mosaicism in cultures of human pluripotent stem cells. Stem Cell Rep. 7, 998–1012 (2016)
  10. et al. Human embryonic stem cell-derived retinal pigment epithelium in patients with age-related macular degeneration and Stargardt’s macular dystrophy: follow-up of two open-label phase 1/2 studies. Lancet 385, 509–516 (2015)
  11. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016)
  12. et al. COSMIC: exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res. 43, D805–D811 (2015)
  13. et al. International Cancer Genome Consortium data portal—a one-stop shop for cancer genomics data. Database 2011, bar026 (2011)
  14. et al. TP53 variations in human cancers: new lessons from the IARC TP53 database and genomics data. Hum. Mutat. 37, 865–876 (2016)
  15. Surfing the p53 network. Nature 408, 307–310 (2000)
  16. 5-methylcytosine as an endogenous mutagen in the human LDL receptor and p53 genes. Science 249, 1288–1290 (1990)
  17. Crystal structure of a p53 tumor suppressor-DNA complex: understanding tumorigenic mutations. Science 265, 346–355 (1994)
  18. Mutant p53 exerts a dominant negative effect by preventing wild-type p53 from binding to the promoter of its target genes. Oncogene 23, 2330–2338 (2004)
  19. Li–Fraumeni syndrome. Genes Cancer 2, 475–484 (2011)
  20. et al. Heterogeneity of Li–Fraumeni syndrome links to unequal gain-of-function effects of p53 mutations. Sci. Rep. 4, 4223 (2014)
  21. et al. High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal. Chem. 83, 8604–8610 (2011)
  22. et al. A p53-mediated DNA damage response limits reprogramming to ensure iPS cell genomic integrity. Nature 460, 1149–1153 (2009)
  23. et al. Two supporting factors greatly improve the efficiency of human iPSC generation. Cell Stem Cell 3, 475–479 (2008)
  24. et al. Spontaneous single-copy loss of TP53 in human embryonic stem cells markedly increases cell proliferation and survival. Stem Cells (2016)
  25. et al. Human intestinal tissue with adult stem cell properties derived from pluripotent stem cells. Stem Cell Rep. 2, 838–852 (2014)
  26. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011)
  27. et al. Dynamic chromatin remodeling mediated by polycomb proteins orchestrates pancreatic differentiation of human embryonic stem cells. Cell Stem Cell 12, 224–237 (2013)
  28. RIKEN suspends first clinical trial involving induced pluripotent stem cells. Nat. Biotechnol. 33, 890–891 (2015)
  29. Dynamic changes in the copy number of pluripotency and cell proliferation genes in human ESCs and iPSCs during reprogramming and time in culture. Cell Stem Cell 8, 106–118 (2011)
  30. Culturing human pluripotent stem cells from diverse culture histories. Protoc. Exch. (2017)
  31. et al. Derivation of human embryonic stem cells in defined conditions. Nat. Biotechnol. 24, 185–187 (2006)
  32. et al. Chemically defined conditions for human iPSC derivation and culture. Nat. Methods 8, 424–429 (2011)
  33. et al. Efficient CRISPR–Cas9-mediated generation of knockin human pluripotent stem cells lacking undesired mutations at the targeted locus. Cell Reports 11, 875–883 (2015)
  34. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011)
  35. et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91, 839–848 (2012)
  36. et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 17, 1665–1674 (2007)
  37. et al. Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. 19, 1563–1565 (2016)
  38. et al. AMBER 2016
  39. et al. The complete genome of an individual by massively parallel DNA sequencing. Nature 452, 872–876 (2008)
  40. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013)
  41. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001)
  42. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011)
  43. BigWig and BigBed: enabling browsing of large distributed datasets. 26, 2204–2207 (2010)



We thank the many institutions and investigators world-wide that provided their cell lines and supported the publication of the results. We are indebted to D. Santos, M. Smith, K. Elwell, M. A. Yram, S. Ellender, L. Bevilacqua, and D. Gage for their assistance with the regulatory and logistical efforts required to acquire and sequence hES cell lines. We also thank K. Lilliehook for her comments, I. Yildirim for his assistance with the molecular modelling of P53 mutations, and C. Usher for help with figure schematics. We regret the omission of any relevant references or discussion due to space limitations. The Genomics Platform at the Broad Institute performed sample preparation, sequencing, and data storage. Y.A. is a Clore Fellow. N.B. is the Herbert Cohn Chair in Cancer Research and was partially supported by The Rosetrees Trust and The Azrieli Foundation. Costs associated with acquiring and sequencing hES cell lines were supported by HHMI and the Stanley Center for Psychiatric Research. F.T.M., S.A.M., and K.E. were supported by grants from the NIH (5P01GM099117, 5K99NS08371). K.E. was supported by the Miller consortium of the HSCI, and F.T.M. is currently supported by funds from the Wellcome Trust, the Medical Research Council (MR/P501967/1), and the Academy of Medical Sciences (SBF001\1016).

Author information

Author notes

    • Florian T. Merkle & Shila Mekhoubad

    Present addresses: Metabolic Research Laboratories and Medical Research Council Metabolic Diseases Unit, Wellcome Trust - Medical Research Council Institute of Metabolic Science, and Wellcome Trust Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0QQ, UK (F.T.M.); Stem Cell Research, Biogen, 115 Broadway, Cambridge, Massachusetts 02142, USA (S.M.).

    • Florian T. Merkle & Sulagna Ghosh

    These authors contributed equally to this work.


  1. Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, Massachusetts 02138, USA

    • Florian T. Merkle, Sulagna Ghosh, Jana Mitchell, Shila Mekhoubad, Maura Charlton, Genevieve Saphier & Kevin Eggan
  2. Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA

    • Florian T. Merkle, Sulagna Ghosh, Jana Mitchell, Shila Mekhoubad, Maura Charlton & Kevin Eggan
  3. Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA

    • Florian T. Merkle, Sulagna Ghosh, Nolan Kamitaki, Jana Mitchell, Curtis Mello, Seva Kashin, Maura Charlton, Genevieve Saphier, Robert E. Handsaker, Giulio Genovese, Steven A. McCarroll & Kevin Eggan
  4. Harvard Stem Cell Institute, Cambridge, Massachusetts 02138, USA

    • Florian T. Merkle, Sulagna Ghosh, Jana Mitchell, Shila Mekhoubad, Maura Charlton, Genevieve Saphier & Kevin Eggan
  5. Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Nolan Kamitaki, Curtis Mello, Seva Kashin, Robert E. Handsaker, Giulio Genovese & Steven A. McCarroll
  6. Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA

    • Nolan Kamitaki, Curtis Mello, Seva Kashin, Robert E. Handsaker, Giulio Genovese & Steven A. McCarroll
  7. The Azrieli Center for Stem Cells and Genetic Research, Institute of Life Sciences, Hebrew University of Jerusalem, Givat-Ram, Jerusalem 91904, Israel

    • Yishai Avior, Shiran Bar & Nissim Benvenisty
  8. Stem Cell Laboratories, Guy’s Assisted Conception Unit, Division of Women’s Health, Faculty of Life Sciences and Medicine, King’s College London, London, UK

    • Dusko Ilic




F.T.M., S.G., S.A.M., and K.E. conceived the project. F.T.M. and K.E. acquired hES cell lines with the assistance of M.C. and G.S., who also assisted with regulatory issues pertaining to sequencing and data distribution. F.T.M. cultured and banked hES cell lines, prepared them for sequencing, and coordinated efforts to interpret and visualize sequencing data with the assistance of S.G. S.G. performed computational data analysis and visualization with the help of G.G., R.E.H., and S.K. Y.A. performed the analysis of TP53 mutations in the RNA-seq database with the assistance of S.B. and N.B. Data were interpreted by F.T.M., S.G., N.K., G.G., Y.A., S.B., N.B., S.A.M., and K.E. N.K., J.M., and C.M. designed, performed and analysed experiments to measure the mosaic nature and competitive expansion of TP53 mutations. S.M. derived HUES 68, 69, 70, 74, 75, and D.I. provided the KCL lines. F.T.M., S.G., S.A.M., and K.E. prepared drafts of the manuscript text and figures with contributions and comments from all authors.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Steven A. McCarroll or Kevin Eggan.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Supplementary information

Excel files

  1.

    Supplementary Table 1

    Considered and whole exome sequenced hESC lines. Tab 1. We considered hESC lines for WES if they were listed on the NIH Human Embryonic Stem Cell Registry (http://grants.nih.gov/stem_cells/registry/current.htm) or if they were prepared under GMP conditions. Cell lines were typically excluded from consideration if they were unavailable for distribution or contained known karyotypic abnormalities in more than 10% of analyzed cells or disease-causing mutations identified by PGD. Cell lines with MTAs that restricted our ability to work with the cell lines, that could not be recovered upon thawing, or proved to be unavailable upon request were also excluded. Passage number at the time of request, the number of passages and time in culture from thaw to passaging, and the passaging method, media, and substrate, are provided, as are mean sequencing coverage and percentage of cross-sample contamination per cell line. GMP, good manufacturing practice; MTA, material transfer agreement; PGD, pre-implantation genetic diagnosis; WES, whole exome sequencing. Tab 2. Summary of number of cells considered and sequenced, including reasons for exclusion. These data are presented graphically in Figure 1b-e.

  2.

    Supplementary Table 2

    Identification of candidate mosaic variants present in sequenced hESCs. Tab 1. Filters used to identify likely mosaic variants among all heterozygous variants present among the sequenced exomes of 140 hESCs. Tab 2. List of 263 candidate mosaic variants passing quality control filters and present no more than two times among the 140 sequenced hESC lines. Variants are arranged by chromosome position and annotated by likely functional impact and frequency in the general population (ExAC AC). Tab 3. Variants from the list in Tab 2 predicted to have either a high or damaging impact on gene function based on a consensus of 7 bioinformatic algorithms. See Materials and Methods for further details. Tab 4. In addition to mosaic variants identified using these stringent filters, we provide an inclusive list of all high confidence somatic variants (n=36,396) that pass the binomial test with a P value of <0.01. SNP, single nucleotide polymorphism; CHROM, chromosome number; POS, genomic position (hg19); REF, reference allele; ALT, alternate allele; HESC, human embryonic stem cell line; REFC, count of reference alleles; ALTC, count of alternate alleles; FILTER, high confidence variant score; EXACAC, allele count in the Exome Aggregation Consortium (ExAC) database; IMPACT, predicted effect of mutation; HESCAC, allele count in hESCs; TOTALC, REFC+ALTC; AF, allelic fraction (ALTC/TOTALC); P, P value for binomial test on allelic fraction.

  3.

    Supplementary Table 3

    Characteristics of TP53 mutations identified in hESCs by WES and RNAseq. Tab 1. Summary of all 15 instances of TP53 mutations observed by WES and RNAseq with details of read depth, allelic fraction, P value, reference, and culture method. Note that all observed mutations are frequently seen in human cancer, and that most mutations have evidence of mosaicism, indicating that they were likely culture-derived. bFGF, basic fibroblast growth factor (FGF2); COSMIC, Catalogue of Somatic Mutations in Cancer (http://cancer.sanger.ac.uk/cosmic); ExAC, Exome Aggregation Consortium (http://exac.broadinstitute.org/); Freq., frequency; GMP, good manufacturing practice; IARC, International Agency for Research on Cancer (http://p53.iarc.fr/); ICGC, International Cancer Genome Consortium (http://icgc.org/); MEF, mouse embryonic fibroblast; Seq., sequencing; SNL, SNL mouse fibroblast feeder cell line; WES, whole exome sequencing. Errors denote SEM. Tab 2. Breakdown of the incidence of P53 mutations by culture media, substrate, and passaging method.

  4.

    Supplementary Table 4

    Primer and probe sequences used for ddPCR-based determination of P53 variant allele frequency.

  5.

    Supplementary Table 5

    Calculation of selective advantage conferred by three distinct TP53 variants. The allelic fraction of TP53 variants was measured at several passages by ddPCR in hESCs cultured under standard conditions. Replicate experiments per passage are shown in grey, and average values are shown in black. The observed increase in allelic frequency of each of the variants across time in culture is consistent with a substantial growth or survival advantage in all but one instance. See Materials and Methods for details on ddPCR and the calculation of the effect per passage.

  6.

    Supplementary Table 6

    Large copy number variants in hESCs identified by the human Psych Array. Tab 1. Summary of hESC lines with large copy number variants (>500kb) as ascertained by SNP array analysis. Two of the five cell lines with acquired TP53 mutations harbored large structural alterations (HUES71 and MShef10). Five separate cell lines (CSES25, ESI051, MShef3, UM78-1 and WA21) had an amplification at the pericentromeric region of chromosome 20 (Chr20q11.21). Tab 2. Complete list of large deletions or duplications (>500kb) identified across the 140 hESC lines.

  7.

    Supplementary Table 7

    Identification of TP53 mutations in hPSCs by RNA sequencing and WES. Tab 1. List of all RNA sequenced samples from hPSCs. Five of these samples (cell2-7) were removed since they were from single stem cells rather than cell lines. Tab 2. Summary of the number of samples and studies generated from each cell line. Tab 3. List of all samples harboring TP53 mutations, their chromosomal location, and the relevant study. Tab 4. Summary of all affected cell lines and studies. Tab 5. Summary of affected samples, cell lines, and number of mutations seen in hESCs and hiPSCs by WES and RNAseq.
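
The AF and P columns defined for Supplementary Table 2 above can be illustrated with a short sketch. It computes the allelic fraction as ALTC/TOTALC and a binomial P value for the alternate-allele count. The exact test configuration used in the study is given in its Materials and Methods and is not reproduced here, so the sketch assumes a one-sided test against the 0.5 fraction expected for a variant present in every cell as a heterozygote. The function names are hypothetical; the 0.01 threshold is the one quoted for Tab 4.

```python
import math


def allelic_fraction(refc: int, altc: int) -> float:
    """AF = ALTC / TOTALC, where TOTALC = REFC + ALTC."""
    totalc = refc + altc
    if totalc <= 0:
        raise ValueError("at least one read is required")
    return altc / totalc


def binomial_lower_tail(altc: int, totalc: int, p: float = 0.5) -> float:
    """P(X <= altc) for X ~ Binomial(totalc, p).

    ASSUMPTION: p = 0.5 is the null expectation for a non-mosaic
    heterozygous variant. Terms are computed in log space so that
    deep coverage does not overflow.
    """
    if not 0 <= altc <= totalc:
        raise ValueError("altc must lie between 0 and totalc")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    log_p, log_q = math.log(p), math.log1p(-p)
    terms = []
    for k in range(altc + 1):
        log_pmf = (math.lgamma(totalc + 1) - math.lgamma(k + 1)
                   - math.lgamma(totalc - k + 1)
                   + k * log_p + (totalc - k) * log_q)
        terms.append(math.exp(log_pmf))
    return min(1.0, math.fsum(terms))


def is_candidate_mosaic(refc: int, altc: int, alpha: float = 0.01) -> bool:
    """Flag a variant whose alternate allele is significantly under-represented."""
    return binomial_lower_tail(altc, refc + altc) < alpha
```

Under these assumptions a variant with few alternate reads at high coverage is flagged, while one with roughly balanced read counts is not. A P value alone does not separate true mosaicism from mapping or capture bias, which is presumably why additional filters are listed in Tab 1.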

