The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line

Abstract

The HeLa cell line was established in 1951 from cervical cancer cells taken from a patient, Henrietta Lacks. This was the first successful attempt to immortalize human-derived cells in vitro¹. The robust growth and unrestricted distribution of HeLa cells resulted in their broad adoption, both intentionally and through widespread cross-contamination², and for the past 60 years the line has served a role analogous to that of a model organism³. The cumulative impact of the HeLa cell line on research is demonstrated by its occurrence in more than 74,000 PubMed abstracts (approximately 0.3%). The genomic architecture of HeLa remains largely unexplored beyond its karyotype⁴, partly because, as in many cancers, its extensive aneuploidy renders such analyses challenging. We carried out haplotype-resolved whole-genome sequencing⁵ of the HeLa CCL-2 strain, examined point- and indel-mutation variations, mapped copy-number variations and loss-of-heterozygosity regions, and phased variants across full chromosome arms. We also investigated variation and copy-number profiles for HeLa S3 and eight additional strains. We find that HeLa is relatively stable in terms of point variation, with few new mutations accumulating after early passaging. Haplotype resolution facilitated reconstruction of an amplified, highly rearranged region of chromosome 8q24.21 at which the human papilloma virus type 18 (HPV-18) genome integrated, an event that is likely to have initiated tumorigenesis. We combined these maps with RNA-seq⁶ and ENCODE Project⁷ data sets to phase the HeLa epigenome. This revealed strong, haplotype-specific activation of the proto-oncogene MYC by the integrated HPV-18 genome approximately 500 kilobases upstream, and enabled global analyses of the relationship between gene dosage and expression. These data provide an extensively phased, high-quality reference genome for past and future experiments relying on HeLa, and demonstrate the value of haplotype resolution for characterizing cancer genomes and epigenomes.

Figure 1: Haplotype-resolved copy number of the HeLa cancer cell line genome.
Figure 2: HeLa HPV integration locus.
Figure 3: Gene expression by copy number and haplotype in HeLa S3.
Figure 4: Haplotype-specific regulation near the HPV integration site.

Accession codes

Data deposits

The Whole Genome Shotgun projects have been deposited in the Third Party Assembly Section of GenBank under the accessions DAAG00000000 and DAAH00000000. The versions described in this paper are versions DAAG01000000 and DAAH01000000. The sequences, variant calls, phase annotation and haplotype-specific reference sequences are available in the NIH database of Genotypes and Phenotypes (dbGaP; http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap) under accession phs000642.v1.p1.

References

  1. Gey, G. O., Coffman, W. D. & Kubicek, M. T. Tissue culture studies of the proliferative capacity of cervical carcinoma and normal epithelium. Cancer Res. 12, 264–265 (1952)

  2. Gartler, S. M. Apparent HeLa cell contamination of human heteroploid cell lines. Nature 217, 750–751 (1968)

  3. Skloot, R. The Immortal Life of Henrietta Lacks. (Crown Publishers, 2010)

  4. Macville, M. et al. Comprehensive and definitive molecular cytogenetic characterization of HeLa cells by spectral karyotyping. Cancer Res. 59, 141–150 (1999)

  5. Kitzman, J. O. et al. Haplotype-resolved genome sequencing of a Gujarati Indian individual. Nature Biotechnol. 29, 59–63 (2011)

  6. Nagaraj, N. et al. Deep proteome and transcriptome mapping of a human cancer cell line. Mol. Syst. Biol. 7, 548 (2011)

  7. Dunham, I. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012)

  8. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012)

  9. The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012)

  10. Exome Variant Server. http://evs.gs.washington.edu/EVS/ (NHLBI GO Exome Sequencing Project (ESP), January 2012)

  11. Morin, R. et al. Profiling the HeLa S3 transcriptome using randomly primed cDNA and massively parallel short-read sequencing. Biotechniques 45, 81–94 (2008)

  12. The Cancer Genome Project. http://www.sanger.ac.uk/genetics/CGP/ (Wellcome Trust Sanger Institute, January 2013)

  13. Goodwin, E. C. et al. Rapid induction of senescence in human cervical carcinoma cells. Proc. Natl Acad. Sci. USA 97, 10978–10983 (2000)

  14. Rosty, C. et al. Clinical and biological characteristics of cervical neoplasias with FGFR3 mutation. Mol. Cancer 4, 15 (2005)

  15. Talora, C., Sgroi, D. C., Crum, C. P. & Dotto, G. P. Specific down-modulation of Notch1 signaling in cervical cancer cells is required for sustained HPV-E6/E7 expression and late steps of malignant transformation. Genes Dev. 16, 2252–2263 (2002)

  16. White, E. A. et al. Comprehensive analysis of host cellular interactions with human papillomavirus E6 proteins identifies new E6 binding partners and reflects viral diversity. J. Virol. 86, 13174–13186 (2012)

  17. Corver, W. E. et al. Genome-wide allelic state analysis on flow-sorted tumor fractions provides an accurate measure of chromosomal aberrations. Cancer Res. 68, 10333–10340 (2008)

  18. Wingo, S. N. et al. Somatic LKB1 mutations promote cervical cancer progression. PLoS ONE 4, e5137 (2009)

  19. Wistuba, I. I. et al. Deletions of chromosome 3p are frequent and early events in the pathogenesis of uterine cervical carcinoma. Cancer Res. 57, 3154–3158 (1997)

  20. Nik-Zainal, S. et al. The life history of 21 breast cancers. Cell 149, 994–1007 (2012)

  21. Fan, H. C., Wang, J., Potanina, A. & Quake, S. R. Whole-genome molecular haplotyping of single cells. Nature Biotechnol. 29, 51–57 (2011)

  22. The Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature 489, 519–525 (2012); corrigendum. 491, 288 (2012)

  23. Puck, T. T. & Marcus, P. I. A rapid method for viable cell titration and clone production with HeLa cells in tissue culture: the use of X-irradiated cells to supply conditioning factors. Proc. Natl Acad. Sci. USA 41, 432–437 (1955)

  24. Nelson-Rees, W. A., Daniels, D. W. & Flandermeyer, R. R. Cross-contamination of cells in culture. Science 212, 446–452 (1981)

  25. Wentzensen, N., Vinokurova, S. & von Knebel Doeberitz, M. Systematic review of genomic integration sites of human papillomavirus genomes in epithelial dysplasia and invasive cancer of the female lower genital tract. Cancer Res. 64, 3878–3884 (2004)

  26. Lazo, P. A., DiPaolo, J. A. & Popescu, N. C. Amplification of the integrated viral transforming genes of human papillomavirus 18 and its 5′-flanking cellular sequence located near the myc protooncogene in HeLa cells. Cancer Res. 49, 4305–4310 (1989)

  27. Bouallaga, I., Massicard, S., Yaniv, M. & Thierry, F. An enhanceosome containing the Jun B/Fra-2 heterodimer and the HMG-I(Y) architectural protein controls HPV 18 transcription. EMBO Rep. 1, 422–427 (2000)

  28. Li, G. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012)

  29. Peter, M. et al. MYC activation associated with the integration of HPV DNA at the MYC locus in genital tumors. Oncogene 25, 5985–5993 (2006)

  30. Ahmadiyeh, N. et al. 8q24 prostate, breast, and colon cancer risk loci show tissue-specific long-range interaction with MYC. Proc. Natl Acad. Sci. USA 107, 9742–9746 (2010)

  31. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009)

  32. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010)

  33. Li, H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27, 2987–2993 (2011)

  34. Gymrek, M., Golan, D., Rosset, S. & Erlich, Y. lobSTR: a short tandem repeat profiler for personal genomes. Genome Res. 22, 1154–1162 (2012)

  35. Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols 4, 44–57 (2009)

  36. Hach, F. et al. mrsFAST: a cache-oblivious algorithm for short-read mapping. Nature Methods 7, 576–577 (2010)

  37. Sudmant, P. H. et al. Diversity of human copy number variation and multicopy genes. Science 330, 641–646 (2010)

  38. Gnerre, S. et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc. Natl Acad. Sci. USA 108, 1513–1518 (2011)

  39. Talkowski, M. E. et al. Next-generation sequencing strategies enable routine detection of balanced chromosome rearrangements for clinical diagnostics and genetic research. Am. J. Hum. Genet. 88, 469–481 (2011)

  40. Adey, A. et al. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11, R119 (2010)

  41. Duitama, J. et al. Fosmid-based whole genome haplotyping of a HapMap trio child: evaluation of single individual haplotyping techniques. Nucleic Acids Res. 40, 2041–2053 (2012)

  42. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009)

  43. Roberts, A., Pimentel, H., Trapnell, C. & Pachter, L. Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics 27, 2325–2329 (2011)

Acknowledgements

The genome sequence described in this paper was derived from a HeLa cell line. Henrietta Lacks, and the HeLa cell line that was established from her tumour cells in 1951, have made significant contributions to scientific progress and advances in human health. We are grateful to Henrietta Lacks, now deceased, and to her surviving family members for their contributions to biomedical research. We also thank M. Kircher, M. Snyder, A. Kumar and R. Patwardhan as well as other members of the Shendure laboratory for advice and suggestions. We thank the Stamatoyannopoulos and Malik laboratories for cell aliquots. Our work was supported by a gift from the Washington Research Foundation; grant HG006283 from the National Human Genome Research Institute (NHGRI, to J.S.); grant CA160080 from the National Cancer Institute (to J.S.); a graduate research fellowship DGE-0718124 from the National Science Foundation (to A.A. and J.K.); grant T32HG000035 from the NHGRI (to J.N.B.); and grant AG039173 from the National Institute on Aging (to J.B.H.). J.S. is the Lowell Milken Prostate Cancer Foundation Young Investigator. J.S. is a member of the scientific advisory board or serves as a consultant for Ariosa Diagnostics, Stratos Genomics, Good Start Genetics, and Adaptive Biotechnologies.

Author information

Contributions

A.A., J.N.B., J.O.K. and J.S. devised experiments, carried out analyses and wrote the manuscript. A.A., J.B.H., A.P.L., B.K.M., R.Q. and C.L. maintained cell cultures, constructed libraries and performed DNA sequencing. J.S. supervised all aspects of the study.

Corresponding authors

Correspondence to Andrew Adey or Jay Shendure.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

This file contains Supplementary Notes 1-23, Supplementary Tables 1-3, 6, 8-9, 12, 14-15 (see separate Excel file for Supplementary Tables 4-5, 7, 10-11 and 13) and Supplementary Figures 1-48 (see Contents for more details). (PDF 12344 kb)

Supplementary Tables

This spreadsheet contains Supplementary Tables 7, 10-11, 13 and links to Supplementary Tables 4-5. (XLSX 217 kb)

About this article

Cite this article

Adey, A., Burton, J., Kitzman, J. et al. The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line. Nature 500, 207–211 (2013). https://doi.org/10.1038/nature12064

  • DOI: https://doi.org/10.1038/nature12064
