
Large-scale analysis of acquired chromosomal alterations in non-tumor samples from patients with cancer


Mosaicism, the presence of subpopulations of cells bearing somatic mutations, is associated with disease and aging and has been detected in diverse tissues, including apparently normal cells adjacent to tumors. To analyze mosaicism on a large scale, we surveyed haplotype-specific somatic copy number alterations (sCNAs) in 1,708 normal-appearing adjacent-to-tumor (NAT) tissue samples from 27 cancer sites and in 7,149 blood samples from The Cancer Genome Atlas. We find substantial variation across tissues in the rate, burden and types of sCNAs, including those spanning entire chromosome arms. We document matching sCNAs in the NAT tissue and the adjacent tumor, suggesting a shared clonal origin, as well as instances in which both NAT tissue and tumor tissue harbor a gain of the same oncogene arising in parallel from distinct parental haplotypes. These results shed light on pan-tissue mutations characteristic of field cancerization, the presence of oncogenic processes adjacent to cancer cells.


Fig. 1: Chromosomal alterations, allelic imbalance and mosaicism.
Fig. 2: Summary of results.
Fig. 3: Landscape of sCNAs.
Fig. 4: Arm-level sCNAs in NAT tissues.

Data availability

The results shown are based on data generated by the TCGA Research Network. All datasets used in this work are available in public repositories. A list of TCGA disease sites (Supplementary Table 1) and lists of the blood and NAT samples used for the analyses, including case IDs (Supplementary Tables 4 and 5, respectively), are included. Reported sCNAs with case IDs are available in Supplementary Tables 6, 7, 10, 11 and 19–21.




We thank D. Swartzlander for help with the graphics and reviewers for their helpful comments. We acknowledge the High Performance Research Computing Center at the University of Texas, MD Anderson Cancer Center. This work was supported by National Institutes of Health grants R25CA057730 (to Y.A.J.), R01HG005855 (to P.S.), R01HG005859 (to P.S.), R01CA181244 (to P.S. and C.D.H.) and P30CA016672 (to MD Anderson) and by the following awards from the Cancer Prevention Research Institute of Texas: RP150079 (to H.K.) and RP160668 (to P.S.).

Author information




P.S. and Y.A.J. conceptualized and directed the study. J.F., K.C., M.R.G., P.S., S.S., Y.A.J. and Y.Y. performed data analyses. C.D.H., E.V., H.K., P.S. and Y.A.J. interpreted results. P.S. and Y.A.J. wrote the manuscript.

Corresponding author

Correspondence to Y. A. Jakubek.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Materials

Supplementary Figs. 1–16 and Supplementary Notes 1–8

Reporting Summary

Supplementary Tables

Supplementary Tables 1–25


About this article


Cite this article

Jakubek, Y.A., Chang, K., Sivakumar, S. et al. Large-scale analysis of acquired chromosomal alterations in non-tumor samples from patients with cancer. Nat Biotechnol 38, 90–96 (2020).


