Multi-omic measurements of heterogeneity in HeLa cells across laboratories


Reproducibility in research can be compromised by both biological and technical variation, but most of the focus is on removing the latter. Here we investigate the effects of biological variation in HeLa cell lines using a systems-wide approach. We determine the degree of molecular and phenotypic variability across 14 stock HeLa samples from 13 international laboratories. We cultured cells in uniform conditions and profiled genome-wide copy numbers, mRNAs, proteins and protein turnover rates in each cell line. We discovered substantial heterogeneity between HeLa variants, especially between lines of the CCL2 and Kyoto varieties, and observed progressive divergence within a specific cell line over 50 successive passages. Genomic variability has a complex, nonlinear effect on transcriptome, proteome and protein turnover profiles, and proteotype patterns explain the varying phenotypic response of different cell lines to Salmonella infection. These findings have implications for the interpretation and reproducibility of research results obtained from human cultured cells.

Fig. 1: HeLa cell lines from different laboratories showed varied and evolving genotypes.
Fig. 2: Heterogeneous transcriptome, proteome and protein turnover profiles between HeLa cell lines across laboratories.
Fig. 3: Gene expression comparison between HeLa 12 and 14 representing 3-month passaging.
Fig. 4: Global processes affecting HeLa proteotypes.
Fig. 5: Proteotypes of HeLa cells tightly link to phenotypes.

Data availability

RNA-seq data are available on GEO (GSE111485). The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (ref. 70) partner repository with the dataset identifier PXD009273. The full dataset is available at


References

  1. Capes-Davis, A. et al. Check your cultures! A list of cross-contaminated or misidentified cell lines. Int. J. Cancer 127, 1–8 (2010).
  2. Zhao, M. et al. Assembly and initial characterization of a panel of 85 genomically validated cell lines from diverse head and neck tumor sites. Clin. Cancer Res. 17, 7248–7264 (2011).
  3. Lorsch, J. R., Collins, F. S. & Lippincott-Schwartz, J. Fixing problems with cell lines. Science 346, 1452–1453 (2014).
  4. Yu, M. et al. A resource for cell line authentication, annotation and quality control. Nature 520, 307–311 (2015).
  5. Almeida, J. L., Cole, K. D. & Plant, A. L. Standards for cell line authentication and beyond. PLoS Biol. 14, e1002476 (2016).
  6. Muff, R. et al. Genomic instability of osteosarcoma cell lines in culture: impact on the prediction of metastasis relevant genes. PLoS One 10, e0125611 (2015).
  7. Frattini, A. et al. High variability of genomic instability and gene expression profiling in different HeLa clones. Sci. Rep. 5, 15377 (2015).
  8. Ben-David, U. et al. Genetic and transcriptional evolution alters cancer cell line drug response. Nature 560, 325–330 (2018).
  9. Bottomley, R. H., Trainer, A. L. & Griffin, M. J. Enzymatic and chromosomal characterization of HeLa variants. J. Cell Biol. 41, 806–815 (1969).
  10. Nelson-Rees, W. A., Hunter, L., Darlington, G. J. & O’Brien, S. J. Characteristics of HeLa strains: permanent vs. variable features. Cytogenet. Cell Genet. 27, 216–231 (1980).
  11. Macville, M. et al. Comprehensive and definitive molecular cytogenetic characterization of HeLa cells by spectral karyotyping. Cancer Res. 59, 141–150 (1999).
  12. Rutledge, S. What HeLa cells are you using? The Winnower (2014).
  13. Landry, J. J. et al. The genomic and transcriptomic landscape of a HeLa cell line. G3 (Bethesda) 3, 1213–1224 (2013).
  14. Adey, A. et al. The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line. Nature 500, 207–211 (2013).
  15. Williams, E. G. et al. Systems proteomics of liver mitochondria function. Science 352, aad0189 (2016).
  16. Gillet, L. C. et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteomics 11, O111.016717 (2012).
  17. Rosenberger, G. et al. Statistical control of peptide and protein error rates in large-scale targeted data-independent acquisition analyses. Nat. Methods 14, 921–927 (2017).
  18. Rosenberger, G. et al. A repository of assays to quantify 10,000 human proteins by SWATH-MS. Sci. Data 1, 140031 (2014).
  19. Röst, H. L. et al. OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data. Nat. Biotechnol. 32, 219–223 (2014).
  20. Röst, H. L. et al. TRIC: an automated alignment strategy for reproducible protein quantification in targeted proteomics. Nat. Methods 13, 777–783 (2016).
  21. Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).
  22. Jovanovic, M. et al. Dynamic profiling of the protein life cycle in response to pathogens. Science 347, 1259038 (2015).
  23. Liu, Y. et al. Systematic proteome and proteostasis profiling in human trisomy 21 fibroblast cells. Nat. Commun. 8, 1212 (2017).
  24. Fasterius, E. et al. A novel RNA sequencing data analysis method for cell line authentication. PLoS One 12, e0171435 (2017).
  25. Forbes, S. A. et al. COSMIC: exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res. 43, D805–D811 (2015).
  26. Iorio, F. et al. A landscape of pharmacogenomic interactions in cancer. Cell 166, 740–754 (2016).
  27. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
  28. Liu, Y., Beyer, A. & Aebersold, R. On the dependency of cellular protein levels on mRNA abundance. Cell 165, 535–550 (2016).
  29. Fortelny, N., Overall, C. M., Pavlidis, P. & Freue, G. V. C. Can we predict protein from mRNA levels? Nature 547, E19–E20 (2017).
  30. Lundberg, E. et al. Defining the transcriptome and proteome in three functionally different human cell lines. Mol. Syst. Biol. 6, 450 (2010).
  31. Claydon, A. J. & Beynon, R. Proteome dynamics: revisiting turnover with a global perspective. Mol. Cell. Proteomics 11, 1551–1565 (2012).
  32. Ruepp, A. et al. CORUM: the comprehensive resource of mammalian protein complexes—2009. Nucleic Acids Res. 38, D497–D501 (2010).
  33. Stingele, S. et al. Global analysis of genome, transcriptome and proteome reveals the response to aneuploidy in human cells. Mol. Syst. Biol. 8, 608 (2012).
  34. Dephoure, N. et al. Quantitative proteomic analysis reveals posttranslational responses to aneuploidy in yeast. eLife 3, e03023 (2014).
  35. Thul, P. J. et al. A subcellular map of the human proteome. Science 356, eaal3321 (2017).
  36. Ambros, V. The functions of animal microRNAs. Nature 431, 350–355 (2004).
  37. Roush, S. & Slack, F. J. The let-7 family of microRNAs. Trends Cell Biol. 18, 505–516 (2008).
  38. Schulte, L. N., Eulalio, A., Mollenkopf, H. J., Reinhardt, R. & Vogel, J. Analysis of the host microRNA response to Salmonella uncovers the control of major cytokines by the let-7 family. EMBO J. 30, 1977–1989 (2011).
  39. Agarwal, V., Bell, G. W., Nam, J. W. & Bartel, D. P. Predicting effective microRNA target sites in mammalian mRNAs. eLife 4, 05005 (2015).
  40. Misselwitz, B. et al. RNAi screen of Salmonella invasion shows role of COPI in membrane targeting of cholesterol and Cdc42. Mol. Syst. Biol. 7, 474 (2011).
  41. Kreibich, S. et al. Autophagy proteins promote repair of endosomal membranes damaged by the Salmonella type three secretion system 1. Cell Host Microbe 18, 527–537 (2015).
  42. Criss, A. K. & Casanova, J. E. Coordinate regulation of Salmonella enterica serovar Typhimurium invasion of epithelial cells by the Arp2/3 complex and Rho GTPases. Infect. Immun. 71, 2885–2891 (2003).
  43. Cossart, P. & Helenius, A. Endocytosis of viruses and bacteria. Cold Spring Harb. Perspect. Biol. 6, a016972 (2014).
  44. Misselwitz, B. et al. Near surface swimming of Salmonella Typhimurium explains target-site selection and cooperative invasion. PLoS Pathog. 8, e1002810 (2012).
  45. Kleensang, A. et al. Genetic variability in a frozen batch of MCF-7 cells invisible in routine authentication affecting cell function. Sci. Rep. 6, 28994 (2016).
  46. Leung, E., Kim, J. E., Askarian-Amiri, M., Finlay, G. J. & Baguley, B. C. Evidence for the existence of triple-negative variants in the MCF-7 breast cancer cell population. Biomed. Res. Int. 2014, 836769 (2014).
  47. Lin, Y. C. et al. Genome dynamics of the human embryonic kidney 293 lineage in response to cell biology manipulations. Nat. Commun. 5, 4767 (2014).
  48. Geraghty, R. J. et al. Guidelines for the use of cell lines in biomedical research. Br. J. Cancer 111, 1021–1046 (2014).
  49. Pamies, D. & Hartung, T. 21st century cell culture for 21st century toxicology. Chem. Res. Toxicol. 30, 43–52 (2017).
  50. Lancaster, M. A. & Knoblich, J. A. Organogenesis in a dish: modeling development and disease using organoid technologies. Science 345, 1247125 (2014).
  51. Drubin, D. G. & Hyman, A. A. Stem cells: the new “model organism”. Mol. Biol. Cell 28, 1409–1411 (2017).
  52. Venkatraman, E. S. & Olshen, A. B. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23, 657–663 (2007).
  53. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-seq. Bioinformatics 25, 1105–1111 (2009).
  54. Leinonen, R., Sugawara, H. & Shumway, M. The sequence read archive. Nucleic Acids Res. 39, D19–D21 (2011).
  55. Andrews, S. FastQC: a quality control tool for high throughput sequence data. (2018).
  56. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
  57. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  58. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43, 11.10.1–11.10.33 (2013).
  59. Cirulli, E. T. et al. Screening the human exome: a comparison of whole genome and whole transcriptome sequencing. Genome Biol. 11, R57 (2010).
  60. Liu, Y. et al. Quantitative variability of 342 plasma proteins in a human twin population. Mol. Syst. Biol. 11, 786 (2015).
  61. Collins, B. C. et al. Quantifying protein interaction dynamics by SWATH mass spectrometry: application to the 14-3-3 system. Nat. Methods 10, 1246–1253 (2013).
  62. Ludwig, C., Claassen, M., Schmidt, A. & Aebersold, R. Estimation of absolute protein quantities of unlabeled samples by selected reaction monitoring mass spectrometry. Mol. Cell. Proteomics 11, M111.013987 (2012).
  63. Kunszt, P. et al. iPortal: the Swiss grid proteomics portal: requirements and new features based on experience and usability considerations. Concurr. Comput. 27, 433–445 (2015).
  64. Shteynberg, D. et al. iProphet: multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates. Mol. Cell. Proteomics 10, M111.007690 (2011).
  65. Lam, H. et al. Development and validation of a spectral library searching method for peptide identification from MS/MS. Proteomics 7, 655–667 (2007).
  66. Pratt, J. M. et al. Dynamics of protein turnover, a missing dimension in proteomics. Mol. Cell. Proteomics 1, 579–591 (2002).
  67. Boisvert, F. M. et al. A quantitative spatial proteomics analysis of proteome turnover in human cells. Mol. Cell. Proteomics 11, M111.011429 (2012).
  68. Zeiler, M., Straube, W. L., Lundberg, E., Uhlen, M. & Mann, M. A protein epitope signature tag (PrEST) library allows SILAC-based absolute quantification and multiplexed determination of protein copy numbers in cell lines. Mol. Cell. Proteomics 11, O111.009613 (2012).
  69. Szklarczyk, D. et al. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 45, D362–D368 (2017).
  70. Vizcaíno, J. A. et al. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 44, D447–D456 (2016).


Acknowledgements

We thank G. Rosenberger, A. Beyer, B. Collins and S. Nikolaev for discussions. We thank L. Reiter, R. Bruderer and O. Rinner from Biognosys AG for sharing their thoughts about cell line proteome analysis from a commercial perspective. We thank H. Zhang and J. Chen from Johns Hopkins University, D. Pflieger and O. Filhol-Cochet from CEA Grenoble, M. Riwanto from University Hospital Zurich, U. Greber and M. Suomalainen from the University of Zurich, C. Arrieumerlou from the University of Basel (through InfectX), M. Beck and M.-T. Mackmull from the European Molecular Biology Laboratory, C. Jorgensen and J. Worboys from the Cancer Research UK Manchester Institute, M. Peter and C. Barnes from ETH Zurich, and A. Venkitaraman and C. Williams from the University of Cambridge for providing us their HeLa cells.

The work was supported by the project PhosphoNetX PPM (to R.A.), TargetInfectX (to C.D.), the Swiss National Science Foundation (grant 3100A0-688 107679 to R.A.), the European Research Council (ERC-20140AdG 670821 to R.A.), the JRC for Computational Biomedicine (which was partially funded by Bayer AG, to J.S.-R.), the Swiss National Science Foundation (grant 163180 to S.E.A.), the European Research Council (grants AdG 249968 to S.E.A. and 616441-DISEASEAVATARS to G.T.), the Umberto Veronesi Foundation (fellowship to P.-L.G.), the ERA-NET Neuron Program (P.-L.G.), Regione Lombardia (Ricerca Indipendente 2012 to G.T.) and the Italian Ministry of Health (Ricerca Corrente to G.T.). E.G.W. was supported by an NIH F32 Ruth Kirschstein Fellowship (F32GM119190).

Author information




Y.L. and R.A. designed and supervised the whole project. Y.L., Y.M., E.G.W., P.-L.G., M.F., I.B., M.S., M.E. and F.B. analyzed the data and performed the bioinformatics analysis. Y.M. developed the HeLa Proteome website. T.M. performed the pSILAC experiment. S.K. and Y.L. performed the Let7 experiment. S.K. performed the S.Tm infection experiment. A.V.D., C.B., I.S., C.D. and H.Z. established and cultured the cell lines. Y.L. and M.M. performed the mass spectrometry experiments. I.B. performed pyProphet analysis. F.S.B. generated CNV data. M.S. processed the CNV data. C.B. generated RNA-seq data. M.F. performed sequence variation analysis. F.B. and P.-L.G. analyzed RNA-seq data. M.E. analyzed the microscopy phenotypic data. G.T. and J.S.-R. supervised data interpretation. S.E.A. supervised the genomics data generation. W.-D.H. supervised all the microbiology experiments and provided critical input. Y.L., E.G.W. and R.A. wrote the paper.

Corresponding authors

Correspondence to Yansheng Liu or Ruedi Aebersold.

Ethics declarations

Competing interests

R.A. holds shares of Biognosys AG, which operates in the field covered by the article.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–19, Supplementary Table 1 and Supplementary Notes 1–6

Reporting Summary

About this article

Cite this article

Liu, Y., Mi, Y., Mueller, T. et al. Multi-omic measurements of heterogeneity in HeLa cells across laboratories. Nat Biotechnol 37, 314–322 (2019).
