Large-scale analysis of acquired chromosomal alterations in non-tumor samples from patients with cancer


Mosaicism, the presence of subpopulations of cells bearing somatic mutations, is associated with disease and aging and has been detected in diverse tissues, including apparently normal cells adjacent to tumors. To analyze mosaicism on a large scale, we surveyed haplotype-specific somatic copy number alterations (sCNAs) in 1,708 normal-appearing adjacent-to-tumor (NAT) tissue samples from 27 cancer sites and in 7,149 blood samples from The Cancer Genome Atlas. We find substantial variation across tissues in the rate, burden and types of sCNAs, including those spanning entire chromosome arms. We document matching sCNAs in the NAT tissue and the adjacent tumor, suggesting a shared clonal origin, as well as instances in which both NAT tissue and tumor tissue harbor a gain of the same oncogene arising in parallel from distinct parental haplotypes. These results shed light on pan-tissue mutations characteristic of field cancerization, the presence of oncogenic processes adjacent to cancer cells.


Fig. 1: Chromosomal alterations, allelic imbalance and mosaicism.
Fig. 2: Summary of results.
Fig. 3: Landscape of sCNAs.
Fig. 4: Arm-level sCNAs in NAT tissues.

Data availability

The results shown are based on data generated by the TCGA Research Network. All datasets used in this work are available in public repositories. A list of TCGA disease sites (Supplementary Table 1) and lists of the blood and NAT samples used for the analyses, including case IDs (Supplementary Tables 4 and 5, respectively), are included. Reported sCNAs with case IDs are available in Supplementary Tables 6, 7, 10, 11 and 19–21.




We thank D. Swartzlander for help with the graphics and reviewers for their helpful comments. We acknowledge the High Performance Research Computing Center at the University of Texas, MD Anderson Cancer Center. This work was supported by National Institutes of Health grants R25CA057730 (to Y.A.J.), R01HG005855 (to P.S.), R01HG005859 (to P.S.), R01CA181244 (to P.S. and C.D.H.) and P30CA016672 (to MD Anderson) and by the following awards from the Cancer Prevention Research Institute of Texas: RP150079 (to H.K.) and RP160668 (to P.S.).

Author information




P.S. and Y.A.J. conceptualized and directed the study. J.F., K.C., M.R.G., P.S., S.S., Y.A.J. and Y.Y. performed data analyses. C.D.H., E.V., H.K., P.S. and Y.A.J. interpreted results. P.S. and Y.A.J. wrote the manuscript.

Corresponding author

Correspondence to Y. A. Jakubek.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Materials

Supplementary Figs. 1–16 and Supplementary Notes 1–8

Reporting Summary

Supplementary Tables

Supplementary Tables 1–25


Cite this article

Jakubek, Y.A., Chang, K., Sivakumar, S. et al. Large-scale analysis of acquired chromosomal alterations in non-tumor samples from patients with cancer. Nat Biotechnol 38, 90–96 (2020).
