Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells


Reprogramming somatic cells into induced pluripotent stem cells (iPSCs) has been suspected of causing de novo copy number variation [1–4]. To explore this issue, here we perform a whole-genome and transcriptome analysis of 20 human iPSC lines derived from the primary skin fibroblasts of seven individuals using next-generation sequencing. We find that, on average, an iPSC line manifests two copy number variants (CNVs) not apparent in the fibroblasts from which the iPSC was derived. Using PCR and digital droplet PCR, we show that at least 50% of those CNVs are present as low-frequency somatic genomic variants in parental fibroblasts (that is, the fibroblasts from which each corresponding human iPSC line is derived), and are manifested in iPSC lines owing to their clonal origin. Hence, reprogramming does not necessarily lead to de novo CNVs in iPSCs, because most of the line-manifested CNVs reflect somatic mosaicism in the human skin. Moreover, our findings demonstrate that clonal expansion, and iPSC lines in particular, can be used as a discovery tool to reliably detect low-frequency CNVs in the tissue of origin. Overall, we estimate that approximately 30% of the fibroblast cells have somatic CNVs in their genomes, suggesting widespread somatic mosaicism in the human body. Our study paves the way to understanding the fundamental question of the extent to which cells of the human body normally acquire structural alterations in their DNA post-zygotically.


Figure 1: Characterization of candidate LM-CNVs with respect to passage number and total CNVs.
Figure 2: Validation and estimation of cell frequency of representative somatic CNVs in fibroblasts.
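
As a rough illustration of how droplet counts can be turned into a cell-frequency estimate, the sketch below applies the standard Poisson correction for digital PCR and then converts a target/reference copy ratio into a carrier-cell fraction. It is not the authors' protocol: the function names are hypothetical, and it assumes a heterozygous single-copy CNV, a diploid reference locus, and the same number of accepted droplets for both assays.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of template copies per droplet."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    # Fraction of negative droplets equals exp(-lambda) under Poisson loading.
    return -math.log(1.0 - positive / total)


def carrier_cell_fraction(target_pos: int, ref_pos: int, total: int,
                          kind: str = "deletion") -> float:
    """Estimate the fraction of cells carrying a one-copy CNV (assumed model).

    For a heterozygous deletion in a fraction f of cells, the expected
    target/reference ratio is 1 - f/2; for a one-copy gain it is 1 + f/2.
    """
    ratio = copies_per_droplet(target_pos, total) / copies_per_droplet(ref_pos, total)
    if kind == "deletion":
        fraction = 2.0 * (1.0 - ratio)
    elif kind == "duplication":
        fraction = 2.0 * (ratio - 1.0)
    else:
        raise ValueError("kind must be 'deletion' or 'duplication'")
    # Sampling noise can push the estimate slightly outside [0, 1]; clamp it.
    return min(1.0, max(0.0, fraction))
```

With equal positive counts for target and reference the estimate is zero; a target signal below the reference indicates a deletion present in some cells. Real measurements would also need confidence intervals, since at low frequencies the estimate is sensitive to droplet sampling noise.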

Accession codes

Primary accessions

Gene Expression Omnibus

Data deposits

The CNV array and sequencing data are available from Gene Expression Omnibus under accessions GSE41716 and GSE41563, and from


  1. Laurent, L. C. et al. Dynamic changes in the copy number of pluripotency and cell proliferation genes in human ESCs and iPSCs during reprogramming and time in culture. Cell Stem Cell 8, 106–118 (2011)

  2. Quinlan, A. R. et al. Genome sequencing of mouse induced pluripotent stem cells reveals retroelement stability and infrequent DNA rearrangement during reprogramming. Cell Stem Cell 9, 366–373 (2011)

  3. Hussein, S. M. et al. Copy number variation and selection during reprogramming to pluripotency. Nature 471, 58–62 (2011)

  4. Mayshar, Y. et al. Identification and classification of chromosomal aberrations in human induced pluripotent stem cells. Cell Stem Cell 7, 521–531 (2010)

  5. Takahashi, K. et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 131, 861–872 (2007)

  6. Yu, J. et al. Induced pluripotent stem cell lines derived from human somatic cells. Science 318, 1917–1920 (2007)

  7. Wernig, M. et al. In vitro reprogramming of fibroblasts into a pluripotent ES-cell-like state. Nature 448, 318–324 (2007)

  8. Lowry, W. E. et al. Generation of human induced pluripotent stem cells from dermal fibroblasts. Proc. Natl Acad. Sci. USA 105, 2883–2888 (2008)

  9. Vaccarino, F. M. et al. Annual Research Review: the promise of stem cell research for neuropsychiatric disorders. J. Child Psychol. Psychiatry 52, 504–516 (2011)

  10. Park, I. H. et al. Disease-specific induced pluripotent stem cells. Cell 134, 877–886 (2008)

  11. Lee, G. et al. Modelling pathogenesis and treatment of familial dysautonomia using patient-specific iPSCs. Nature 461, 402–406 (2009)

  12. Hargus, G. et al. Differentiated Parkinson patient-derived induced pluripotent stem cells grow in the adult rodent brain and reduce motor asymmetry in Parkinsonian rats. Proc. Natl Acad. Sci. USA 107, 15921–15926 (2010)

  13. Brennand, K. J. & Gage, F. H. Concise review: the promise of human induced pluripotent stem cell-based studies of schizophrenia. Stem Cells 29, 1915–1922 (2011)

  14. Liang, Q., Conte, N., Skarnes, W. C. & Bradley, A. Extensive genomic copy number variation in embryonic stem cells. Proc. Natl Acad. Sci. USA 105, 17453–17456 (2008)

  15. Wu, H. et al. Copy number variant analysis of human embryonic stem cells. Stem Cells 26, 1484–1489 (2008)

  16. Elliott, A. M., Elliott, K. A. & Kammesheidt, A. High resolution array-CGH characterization of human stem cells using a stem cell focused microarray. Mol. Biotechnol. 46, 234–242 (2010)

  17. O’Huallachain, M., Karczewski, K. J., Weissman, S. M., Urban, A. E. & Snyder, M. P. Extensive genetic variation in somatic human tissues. Proc. Natl Acad. Sci. USA (5 October 2012)

  18. De, S. Somatic mosaicism in healthy human tissues. Trends Genet. 27, 217–223 (2011)

  19. Baillie, J. K. et al. Somatic retrotransposition alters the genetic landscape of the human brain. Nature 479, 534–537 (2011)

  20. Coufal, N. G. et al. L1 retrotransposition in human neural progenitor cells. Nature 460, 1127–1131 (2009)

  21. Rehen, S. K. et al. Constitutional aneuploidy in the normal human brain. J. Neurosci. 25, 2176–2180 (2005)

  22. Youssoufian, H. & Pyeritz, R. E. Mechanisms and consequences of somatic mosaicism in humans. Nature Rev. Genet. 3, 748–758 (2002)

  23. Piotrowski, A. et al. Somatic mosaicism for copy number variation in differentiated human tissues. Hum. Mutat. 29, 1118–1124 (2008)

  24. Mkrtchyan, H. et al. Early embryonic chromosome instability results in stable mosaic pattern in human tissues. PLoS ONE 5, e9591 (2010)

  25. Poduri, A. et al. Somatic activation of AKT3 causes hemispheric developmental brain malformations. Neuron 74, 41–48 (2012)

  26. Abyzov, A., Urban, A. E., Snyder, M. & Gerstein, M. CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res. 21, 974–984 (2011)

  27. Mills, R. E. et al. Mapping copy number variation by population-scale genome sequencing. Nature 470, 59–65 (2011)

  28. Cheng, L. et al. Low incidence of DNA sequence variation in human induced pluripotent stem cells generated by nonintegrating plasmid expression. Cell Stem Cell 10, 337–344 (2012)

  29. Arlt, M. F., Ozdemir, A. C., Birkeland, S. R., Wilson, T. E. & Glover, T. W. Hydroxyurea induces de novo copy number variants in human cells. Proc. Natl Acad. Sci. USA 108, 17360–17365 (2011)

  30. Eichler, E. E. et al. Missing heritability and strategies for finding the underlying causes of complex disease. Nature Rev. Genet. 11, 446–450 (2010)

  31. Chan, E. M. et al. Live cell imaging distinguishes bona fide human iPS cells from partially reprogrammed cells. Nature Biotechnol. 27, 1033–1037 (2009)

  32. Deb-Rinker, P., Ly, D., Jezierski, A., Sikorska, M. & Walker, P. R. Sequential DNA methylation of the Nanog and Oct-4 upstream regions in human NT2 cells during neuronal differentiation. J. Biol. Chem. 280, 6257–6260 (2005)

  33. Freberg, C. T., Dahl, J. A., Timoskainen, S. & Collas, P. Epigenetic reprogramming of OCT4 and NANOG regulatory regions by embryonal carcinoma cell extract. Mol. Biol. Cell 18, 1543–1553 (2007)

  34. Kim, J. E. et al. Investigating synapse formation and function using human pluripotent stem cell-derived neurons. Proc. Natl Acad. Sci. USA 108, 3005–3010 (2011)

  35. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010)

  36. Wang, L. Y., Abyzov, A., Korbel, J. O., Snyder, M. & Gerstein, M. MSB: a mean-shift-based approach for the analysis of structural variation in the genome. Genome Res. 19, 106–117 (2009)

  37. Zhang, J., Feuk, L., Duggan, G. E., Khaja, R. & Scherer, S. W. Development of bioinformatics resources for display and analysis of copy number and other structural variants in the human genome. Cytogenet. Genome Res. 115, 205–214 (2006)

  38. Korbel, J. O. et al. PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. Genome Biol. 10, R23 (2009)

  39. Lam, H. Y. et al. Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library. Nature Biotechnol. 28, 47–55 (2010)

  40. Sanders, S. J. et al. Multiple recurrent de novo CNVs, including duplications of the 7q11.23 Williams syndrome region, are strongly associated with autism. Neuron 70, 863–885 (2011)

  41. Qin, J., Jones, R. C. & Ramakrishnan, R. Studying copy number variations using a nanofluidic platform. Nucleic Acids Res. 36, e116 (2008)

  42. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009)

  43. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009)

  44. Habegger, L. et al. RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries. Bioinformatics 27, 281–283 (2011)

  45. The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012)

  46. Abyzov, A. & Gerstein, M. AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision. Bioinformatics 27, 595–603 (2011)

  47. Hindson, B. J. et al. High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal. Chem. 83, 8604–8610 (2011)

  48. Haraksingh, R. R., Abyzov, A., Gerstein, M., Urban, A. E. & Snyder, M. Genome-wide mapping of copy number variation in humans: comparative analysis of high resolution array platforms. PLoS ONE 6, e27859 (2011)


We acknowledge support from the National Institutes of Health (NIH) and from the AL Williams Professorship fund and the Harris Professorship fund. We also acknowledge the Yale University Biomedical High Performance Computing Center and its support team (in particular, R. Bjornson and N. Carriero). We thank A. Klin for help with family recruitment. We thank M. V. Simonini for technical help, I.-H. Park for advice in the characterization of iPSC lines and the gift of the iPSC PGP1-1, and S. A. Duncan for the gift of the K3 iPSC line. We acknowledge the following grant support: NIMH MH089176 and MH087879, the Simons Foundation (SFARI 137055 F.V.) and the State of Connecticut, which funded the hiPSC generation and characterization; and NIH grant RR19895, which funded the instrumentation. We acknowledge the Yale Center for Clinical Investigation for clinical support in obtaining the biopsy specimens. We thank J. Overton for advice in carrying out DNA and RNA sequencing. Finally, we thank M. O’Huallachain and J. Li-Pook-Than for their advice on planning, carrying out and analysing the ddPCR experiments.

Author information




The authors contributed to this study at different levels, as described in the following. Study conception and design: F.M.V., A.A. and A.E.U. Family selection: E.L.G. Skin biopsy: A.S. Fibroblast culture: A.H. hiPSC generation and characterization: L.A.R.B., J.M. and L.T. Virus production: A.K. Microarrays data analysis: L.T. Neuronal differentiation: L.A.R.B., N.E.C. and J.M. Sequencing library preparation: L.A.R.B., J.M., L.T. and Y.Z. Processing and analysis of RNA-seq data: D.P. and A.A. Processing and analysis of DNAseq data: A.A. and M.W. qPCR validation: A.F.F. PCR validation: Y.Z. and A.A. aCGH hybridization and analysis: M.S.H. ddPCR experiments and analysis: M.S.H. and A.A. Human subjects: K.C. Coordination of analyses: F.M.V., S.W., A.E.U. and M.G. Display item preparation: A.A., F.M.V., L.T., D.P., J.M., N.E.C., Y.Z. and M.S.H. Manuscript writing: A.A., F.M.V. and A.E.U. The following authors contributed equally to the study: J.M., D.P., Y.Z., M.S.H. and L.T. All authors participated in discussion of results and manuscript editing.

Corresponding authors

Correspondence to Alexander Eckehart Urban or Mark Gerstein or Flora M. Vaccarino.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

This file contains Supplementary Text, Supplementary References, Supplementary Figures 1-57, Supplementary Tables 2, 5 and 6 and full legends for Supplementary Tables 1, 3 and 4 (see contents for details). (PDF 22730 kb)

Supplementary Table 1

This file contains the Gene expression microarray dataset - see Supplementary Information for full legend. (XLSX 2185 kb)

Supplementary Table 3

This file contains comprehensive information about all LM-CNV candidates - see Supplementary Information for full legend. (XLS 75 kb)

Supplementary Table 4

This file contains comprehensive information about LM-CNV candidates from aCGH - see Supplementary Information for full legend. (XLSX 12 kb)

Supplementary Data

This file shows the alignment of sequenced PCR bands to genomic regions with LM-CNVs. (TXT 197 kb)

About this article

Cite this article

Abyzov, A., Mariani, J., Palejev, D. et al. Somatic copy number mosaicism in human skin revealed by induced pluripotent stem cells. Nature 492, 438–442 (2012).
