Proteomic and phosphoproteomic comparison of human ES and iPS cells

Journal name: Nature Methods


Combining high-mass-accuracy mass spectrometry, isobaric tagging and software for multiplexed, large-scale protein quantification, we report deep proteomic coverage of four human embryonic stem cell and four induced pluripotent stem cell lines in biological triplicate. This 24-sample comparison resulted in a very large set of identified proteins and phosphorylation sites in pluripotent cells. The statistical analysis afforded by our approach revealed subtle but reproducible differences in protein expression and protein phosphorylation between embryonic stem cells and induced pluripotent cells. Merging these results with RNA-seq analysis data, we found functionally related differences across each tier of regulation. We also introduce the Stem Cell–Omics Repository (SCOR), a resource to collate and display quantitative information across multiple planes of measurement, including mRNA, protein and post-translational modifications.

At a glance


  1. Figure 1: Figures of merit for peptide identification and quantification.

    (a) Peptide identifications as a function of precursor and product mass tolerance. Using proteins isolated from human ESC whole-cell lysate, we performed liquid chromatography-tandem mass spectrometry for each combination of dissociation method and mass analyzer. IT, ion-trap detection; FT, orbitrap detection. We searched data using fragment-ion tolerances of 0.01–5.0 Da, filtered results by precursor mass tolerances of 0.5–1,000 p.p.m. and filtered identifications to achieve 1% FDR. We performed experiments in triplicate and averaged the results. The number of peptide spectrum matches (PSMs) is proportional to circle size; the number of unique peptides is represented by circle color as indicated. (b) R² values for all peptides in each protein (H1 versus NFF comparison; fourplex experiment) were calculated as a metric for quality of quantification. (c) Characterization of quantification. Data points represent reporter ion intensities for a single protein mixed in the indicated ratios. Lines represent the theoretical value for the mixtures presented.

  2. Figure 2: A transcriptomic, proteomic and phosphoproteomic comparison of ESC lines H1 and H9, iPSC line DF19.7 and NFF line.

    (a) Heat maps depict all quantified transcripts, proteins and phosphorylation sites. Values were median-normalized. (b) Overlap between transcripts and proteins identified in the fourplex experiment. We considered transcripts 'present' if the reads per kilobase of exon per million mapped reads (RPKM) value was greater than 1 for all four cell types, and we determined protein identification via P-value filtering (1% FDR). (c) Cytoscape schematic of mRNA, protein and phosphorylation quantification from the fourplex experiment for genes known to have an interaction with NANOG, SOX2 or POU5F1 (Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, confidence score > 0.90). Data are identified by protein name.

  3. Figure 3: Kinase substrate analysis between ESCs and NFFs (adapted from ref. 24 with permission from the American Association for the Advancement of Science).

    Highlighted are kinase substrates for sets of phosphorylation sites that were enriched (changed by more than twofold) in ESCs (red; P < 0.05) and in NFFs (blue; P < 0.05).

  4. Figure 4: Comparison of four ESC and four iPSC lines.

    (a) Differentially regulated transcripts, proteins and phosphorylation sites are shown as a function of the number of comparisons (n). We performed differential expression analysis using subsets of data. For example, the n = 2 value reflects the number of differences detected from comparing just two ESC lines and two iPSC lines without biological replicates, whereas n = 12 represents the differences detected from comparing all four ESC lines and all four iPSC lines in biological triplicate. The number of differentially regulated elements for a given fold difference is indicated by different colors. The lines connect data points for ease of interpretation. (b) Heat maps depicting differentially regulated transcripts, proteins and phosphorylation sites (P < 0.05, Student's t-test, with Benjamini-Hochberg correction). Only transcripts exhibiting at least a 1.5-fold difference and proteins and phosphorylation sites exhibiting at least a 1.2-fold difference are shown. (c) Randomly selected examples of differentially regulated transcripts, proteins and phosphorylation sites. Bar heights represent relative reporter ion intensity (arbitrary units). *P < 0.05 (Student's t-test; ESCs compared to iPSCs). (d) Differentially regulated transcripts detected based on either a comparison between biological triplicates of H1 and DF4.7 cell lines or a comparison of biological triplicates of all four ESC and all four iPSC lines. (e) Overlap between differentially regulated proteins and transcripts (left; only genes with both a quantified protein and transcript were included) and differentially regulated proteins and phosphorylation sites (right; only genes with both a quantified protein and phosphorylation site were included).
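
The selection rule described for Figure 4b (P < 0.05 after Benjamini-Hochberg correction, combined with a minimum fold difference) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' software: the function names, the record layout and the example values are hypothetical, the raw P values are assumed to come from a Student's t-test computed elsewhere, and the group means are assumed to be positive intensities on a linear scale.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differential(records, min_fold, alpha=0.05):
    """Keep features passing both the adjusted-P and fold-difference cutoffs.

    records: list of (name, esc_mean, ipsc_mean, raw_p) tuples (hypothetical
    layout); means must be positive so the fold difference is defined.
    """
    adjusted = benjamini_hochberg([r[3] for r in records])
    hits = []
    for (name, esc_mean, ipsc_mean, _), q in zip(records, adjusted):
        # Fold difference regardless of direction (ESC > iPSC or the reverse).
        fold = max(esc_mean, ipsc_mean) / min(esc_mean, ipsc_mean)
        if q < alpha and fold >= min_fold:
            hits.append(name)
    return hits


# Made-up example values, for illustration only.
example_records = [
    ("feature_A", 1.5, 1.0, 0.005),
    ("feature_B", 1.1, 1.0, 0.01),
    ("feature_C", 1.0, 1.3, 0.03),
    ("feature_D", 2.0, 1.0, 0.2),
]

# A 1.2-fold cutoff mirrors the protein/phosphorylation threshold in the caption;
# a 1.5-fold cutoff mirrors the transcript threshold.
protein_like_hits = select_differential(example_records, min_fold=1.2)
transcript_like_hits = select_differential(example_records, min_fold=1.5)
```

With the made-up values above, feature_B fails the fold cutoff and feature_D fails the adjusted-P cutoff, so only the remaining features are retained at the 1.2-fold threshold.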


  1. Yu, J. et al. Human induced pluripotent stem cells free of vector and transgene sequences. Science 324, 797–801 (2009).
  2. Takahashi, K. et al. Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell 131, 861–872 (2007).
  3. Yu, J. et al. Induced pluripotent stem cell lines derived from human somatic cells. Science 318, 1917–1920 (2007).
  4. Chin, M.H. et al. Induced pluripotent stem cells and embryonic stem cells are distinguished by gene expression signatures. Cell Stem Cell 5, 111–123 (2009).
  5. Guenther, M.G. et al. Chromatin structure and gene expression programs of human embryonic and induced pluripotent stem cells. Cell Stem Cell 7, 249–257 (2010).
  6. Chin, M.H., Pellegrini, M., Plath, K. & Lowry, W.E. Molecular analyses of human induced pluripotent stem cells and embryonic stem cells. Cell Stem Cell 7, 263–269 (2010).
  7. Bock, C. et al. Reference maps of human ES and iPS cell variation enable high-throughput characterization of pluripotent cell lines. Cell 144, 439–452 (2011).
  8. Stadtfeld, M. et al. Aberrant silencing of imprinted genes on chromosome 12qF1 in mouse induced pluripotent stem cells. Nature 465, 175–181 (2010).
  9. Doi, A. et al. Differential methylation of tissue- and cancer-specific CpG island shores distinguishes human induced pluripotent stem cells, embryonic stem cells and fibroblasts. Nat. Genet. 41, 1350–1353 (2009).
  10. Lister, R. et al. Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature 471, 68–73 (2011).
  11. Olsen, J.V. et al. Higher-energy C-trap dissociation for peptide modification analysis. Nat. Methods 4, 709–712 (2007).
  12. McAlister, G.C., Phanstiel, D., Wenger, C.D., Lee, M.V. & Coon, J.J. Analysis of tandem mass spectra by FTMS for improved large-scale proteomics with superior protein quantification. Anal. Chem. 82, 316–322 (2010).
  13. Olsen, J.V. et al. A dual pressure linear ion trap Orbitrap instrument with very high sequencing speed. Mol. Cell. Proteomics 8, 2759–2769 (2009).
  14. Nagaraj, N., D'Souza, R.C.J., Cox, J., Olsen, J.V. & Mann, M. Feasibility of large-scale phosphoproteomics with higher energy collisional dissociation fragmentation. J. Proteome Res. 9, 6786–6794 (2010).
  15. Thompson, A. et al. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 75, 1895–1904 (2003).
  16. Ross, P.L. et al. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 3, 1154–1169 (2004).
  17. Choe, L. et al. 8-plex quantitation of changes in cerebrospinal fluid protein expression in subjects undergoing intravenous immunoglobulin treatment for Alzheimer's disease. Proteomics 7, 3651–3660 (2007).
  18. Ow, S.Y. et al. iTRAQ underestimation in simple and complex mixtures: the good, the bad and the ugly. J. Proteome Res. 8, 5347–5355 (2009).
  19. Wenger, C.D., Phanstiel, D.H., Lee, M.V., Bailey, D.J. & Coon, J.J. COMPASS: a suite of pre- and post-search proteomics software tools for OMSSA. Proteomics 6, 1064–1074 (2011).
  20. Shadforth, I.P., Dunkley, T.P.J., Lilley, K.S. & Bessant, C. i-Tracker: for quantitative proteomics using iTRAQ (TM). BMC Genomics 6, 145 (2005).
  21. Griffin, T.J. et al. iTRAQ reagent-based quantitative proteomic analysis on a linear ion trap mass spectrometer. J. Proteome Res. 6, 4200–4209 (2007).
  22. Becker, K.A., Stein, J.L., Lian, J.B., van Wijnen, A.J. & Stein, G.S. Establishment of histone gene regulation and cell cycle checkpoint control in human embryonic stem cells. J. Cell. Physiol. 210, 517–526 (2007).
  23. Xue, Y. et al. GPS 2.0, a tool to predict kinase-specific phosphorylation sites in hierarchy. Mol. Cell. Proteomics 7, 1598–1608 (2008).
  24. Manning, G., Whyte, D.B., Martinez, R., Hunter, T. & Sudarsanam, S. The protein kinase complement of the human genome. Science 298, 1912–1934 (2002).
  25. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc., B 57, 289–300 (1995).
  26. Ashburner, M. et al. Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
  27. Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000).
  28. Frye, M. & Watt, F.M. The RNA methyltransferase Misu (NSun2) mediates Myc-induced proliferation and is upregulated in tumors. Curr. Biol. 16, 971–981 (2006).
  29. Singh, M.K. et al. The T-box transcription factor Tbx15 is required for skeletal development. Mech. Dev. 122, 131–144 (2005).
  30. Dong, F. et al. Pitx2 promotes development of splanchnic mesoderm-derived branchiomeric muscle. Development 133, 4891–4899 (2006).
  31. Kim, K. et al. Epigenetic memory in induced pluripotent stem cells. Nature 467, 285–290 (2010).
  32. Polo, J.M. et al. Cell type of origin influences the molecular and functional properties of mouse induced pluripotent stem cells. Nat. Biotechnol. 28, 848–855 (2010).
  33. Hu, B.Y. et al. Neural differentiation of human induced pluripotent stem cells follows developmental principles but with variable potency. Proc. Natl. Acad. Sci. USA 107, 4335–4340 (2010).
  34. Siu, I.M. et al. Coexpression of neuronatin splice forms promotes medulloblastoma growth. Neuro-oncol. 10, 716–724 (2008).
  35. Hargrave, M. et al. Expression of the Sox11 gene in mouse embryos suggests roles in neuronal maturation and epithelio-mesenchymal induction. Dev. Dyn. 210, 79–86 (1997).
  36. Kawano, Y. et al. CRMP-2 is involved in kinesin-1-dependent transport of the Sra-1/WAVE1 complex and axon formation. Mol. Cell. Biol. 25, 9920–9935 (2005).
  37. Zhu, H., Coppinger, J.A., Jang, C.Y., Yates, J.R. III & Fang, G. FAM29A promotes microtubule amplification via recruitment of the NEDD1-gamma-tubulin complex to the mitotic spindle. J. Cell Biol. 183, 835–848 (2008).
  38. Bourke, E., Brown, J.A.L., Takeda, S., Hochegger, H. & Morrison, C.G. DNA damage induces Chk1-dependent threonine-160 phosphorylation and activation of Cdk2. Oncogene 29, 616–624 (2010).
  39. Ludwig, T.E. et al. Derivation of human embryonic stem cells in defined conditions. Nat. Biotechnol. 24, 185–187 (2006).
  40. Sengupta, S. et al. Highly consistent, fully representative mRNA-Seq libraries from ten nanograms of total RNA. Biotechniques 49, 898–904 (2010).
  41. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
  42. Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).
  43. Good, D.M. et al. Post-acquisition ETD spectral processing for increased peptide identifications. J. Am. Soc. Mass Spectrom. 20, 1435–1440 (2009).
  44. Geer, L.Y. et al. Open mass spectrometry search algorithm. J. Proteome Res. 3, 958–964 (2004).
  45. Kersey, P.J. et al. The International Protein Index: an integrated database for proteomics experiments. Proteomics 4, 1985–1988 (2004).
  46. Nesvizhskii, A.I. & Aebersold, R. Interpretation of shotgun proteomic data: the protein inference problem. Mol. Cell. Proteomics 4, 1419–1440 (2005).
  47. Swaney, D.L., Wenger, C.D., Thomson, J.A. & Coon, J.J. Human embryonic stem cell phosphoproteome revealed by electron transfer dissociation tandem mass spectrometry. Proc. Natl. Acad. Sci. USA 106, 995–1000 (2009).

Author information

  1. These authors contributed equally to this work.

    • Douglas H Phanstiel &
    • Justin Brumbaugh


  1. Department of Chemistry, University of Wisconsin, Madison, Wisconsin, USA.

    • Douglas H Phanstiel,
    • Craig D Wenger,
    • Derek J Bailey,
    • Danielle L Swaney,
    • Mark A Tervo &
    • Joshua J Coon
  2. Genome Center of Wisconsin, University of Wisconsin, Madison, Wisconsin, USA.

    • Douglas H Phanstiel,
    • Justin Brumbaugh,
    • Craig D Wenger,
    • Derek J Bailey,
    • Danielle L Swaney,
    • Mark A Tervo &
    • Joshua J Coon
  3. Department of Biomolecular Chemistry, University of Wisconsin, Madison, Wisconsin, USA.

    • Justin Brumbaugh &
    • Joshua J Coon
  4. Morgridge Institute for Research, Madison, Wisconsin, USA.

    • Justin Brumbaugh,
    • Shulan Tian,
    • Mitchell D Probasco,
    • Jennifer M Bolin,
    • Victor Ruotti,
    • Ron Stewart &
    • James A Thomson
  5. Department of Cell and Regenerative Biology, University of Wisconsin, Madison, Wisconsin, USA.

    • James A Thomson
  6. Department of Molecular, Cellular and Developmental Biology, University of California, Santa Barbara, California, USA.

    • James A Thomson


D.H.P. designed research, prepared samples, performed mass spectrometry, wrote software, analyzed data and wrote the manuscript. J.B. designed research, grew cells, prepared samples, analyzed data and wrote the manuscript. C.D.W. wrote software. S.T. and V.R. analyzed data. M.D.P. grew cells. D.J.B. designed websites. D.L.S. helped with phosphorylation analysis. M.A.T. optimized the labeling procedure. J.M.B. performed RNA sequencing. R.S. designed research and analyzed data. J.A.T. and J.J.C. designed research and wrote the manuscript.

Competing financial interests

J.A.T. is a founder, stockowner, consultant and board member of Cellular Dynamics International (CDI), and serves as scientific advisor to and has financial interests in Tactics II Stem Cell Ventures. J.J.C. is a consultant for Thermo Fisher Scientific.

Supplementary information

PDF files

  1. Supplementary Text and Figures (10M)

    Supplementary Figures 1–5, Supplementary Tables 4, 8 and 9

Excel files

  1. Supplementary Table 1 (6M)

    Proteomic identification and quantification.

  2. Supplementary Table 2 (10M)

    Phosphoproteomic identification and quantification.

  3. Supplementary Table 3 (717K)

    Enrichment analysis from fourplex experiment.

  4. Supplementary Table 5 (5M)

    Transcriptomic identification and quantification.

  5. Supplementary Table 6 (1M)

    Transcripts, proteins, and phosphorylation sites that differ between ESCs and iPSCs.

  6. Supplementary Table 7 (238K)

    Enrichment analysis from eightplex experiment.
