Landscape of somatic mutations in 560 breast cancer whole-genome sequences

Journal name: Nature
Volume: 534
Pages: 47–54
DOI: 10.1038/nature17676

Abstract

We analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function and another with deficient BRCA1 or BRCA2 function; the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer.
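
The signature extraction mentioned in the abstract relies on non-negative matrix factorization (NMF), as the figure legends state. The block below is a minimal sketch of that general idea only, assuming a toy matrix of mutation counts (96 substitution channels by samples) and plain multiplicative-update NMF. It is not the authors' pipeline, and the function name `nmf`, its parameters and the synthetic data are all hypothetical.

```python
import numpy as np


def nmf(V, k, n_iter=500, seed=0, eps=1e-12):
    """Approximate V (channels x samples) as W @ H with non-negative factors.

    W (channels x k) holds k signature profiles, each column summing to 1.
    H (k x samples) holds the per-sample contribution of each signature.
    Uses the classic multiplicative updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    n_channels, n_samples = V.shape
    W = rng.random((n_channels, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # Update exposures, then signatures; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each signature is a probability profile; W @ H is unchanged.
    scale = W.sum(axis=0)
    W /= scale
    H *= scale[:, None]
    return W, H


if __name__ == "__main__":
    # Synthetic example (an assumption for illustration): 3 made-up signatures, 40 samples.
    rng = np.random.default_rng(1)
    true_signatures = rng.dirichlet(np.ones(96), size=3).T   # 96 x 3
    true_exposures = rng.uniform(0, 1000, size=(3, 40))      # 3 x 40
    counts = true_signatures @ true_exposures
    W, H = nmf(counts, k=3)
    rel_err = np.linalg.norm(counts - W @ H) / np.linalg.norm(counts)
    print(f"relative reconstruction error: {rel_err:.4f}")
```

In practice the number of signatures is not known in advance and has to be chosen, for example by comparing stability and reconstruction error across several values of `k`; the sketch fixes it for simplicity.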

Figures

  1. Figure 1: Cohort and catalogue of somatic mutations in 560 breast cancers.

    a, Catalogue of base substitutions, insertions/deletions, rearrangements and driver mutations in 560 breast cancers (sorted by total substitution burden). Indel axis limited to 5,000(*). b, Complete list of curated driver genes sorted by frequency (descending). Fraction of ER-positive (left, total 366) and ER-negative (right, total 194) samples carrying a mutation in the relevant driver gene presented in grey. log10 P value of enrichment of each driver gene towards the ER-positive or ER-negative cohort is provided in black. Highlighted in green are genes for which there is new or further evidence supporting these as novel breast cancer genes.

  2. Figure 2: Non-coding analyses of breast cancer genomes.

    a, Distributions of substitution (purple dots) and indel (blue dots) mutations within the footprint of five regulatory regions identified as being more significantly mutated than expected are provided on the left. The proportion of base substitution mutation signatures associated with corresponding samples carrying mutations in each of these non-coding regions is displayed on the right. b, Mutability of TGAACA/TGTTCA motifs within inverted repeats of varying flanking palindromic sequence length compared to motifs not within an inverted repeat. c, Variation in mutability between loci of TGAACA/TGTTCA inverted repeats with 9 bp palindromes.

  3. Figure 3: Extraction and contributions of base substitution signatures in 560 breast cancers.

    a, Twelve mutation signatures extracted using non-negative matrix factorization. Each signature is ordered by mutation class (C>A/G>T, C>G/G>C, C>T/G>A, T>A/A>T, T>C/A>G, T>G/A>C), taking immediate flanking sequence into account. For each class, mutations are ordered by 5′ base (A, C, G, T) first before 3′ base (A, C, G, T). b, The spectrum of base substitution signatures within 560 breast cancers. Mutation signatures are ordered (and coloured) according to broad biological groups: signatures 1 and 5 are correlated with age of diagnosis; signatures 2 and 13 are putatively APOBEC-related; signatures 6, 20 and 26 are associated with mismatch-repair deficiency; signatures 3 and 8 are associated with homologous-recombination deficiency; signatures 18, 17 and 30 have unknown aetiology. For ease of reading, this arrangement is adopted for the rest of the manuscript. Samples are ordered according to hierarchical clustering performed on mutation signatures. Top, absolute numbers of mutations of each signature in each sample. Bottom, proportion of each signature in each sample. c, Distribution of mutation counts for each signature in relevant breast cancer samples. Percentage of samples carrying each signature provided above each signature.

  4. Figure 4: Additional characteristics of base substitution signatures and novel rearrangement signatures in 560 breast cancers.

    a, Contrasting transcriptional strand asymmetry and replication strand asymmetry between twelve base substitution signatures. b, Six rearrangement signatures extracted using non-negative matrix factorization. Probability of rearrangement element on y axis. Rearrangement size on x axis. del, deletion; tds, tandem duplication; inv, inversion; trans, translocation.

  5. Figure 5: Integrative analysis of rearrangement signatures.

    Heatmap of rearrangement signatures following unsupervised hierarchical clustering based on proportions of rearrangement signatures in each cancer. Seven cluster groups (A–G) noted and relationships with expression (AIMS) subtype (basal, red; luminal B, light blue; luminal A, dark blue), immunohistopathology status (ER, progesterone receptor (PR), HER2 status; black, positive), abrogation of BRCA1 (purple) and BRCA2 (orange) (whether germline, somatic or through promoter hypermethylation), presence of 3 or more foci of kataegis (black, positive), HRD index (top 25% or lowest 25%; black, positive), GISTIC cluster group (black, positive) and driver mutations in cancer genes. miRNA cluster groups: 0, red; 1, purple; 2, blue; 3, light blue; 4, green; 5, orange. Contribution of base-substitution signatures in these seven cluster groups is provided in the bottom panel.

  6. Extended Data Fig. 1: Landscape of driver mutations.

    a, Summary of subtypes of cohort of 560 breast cancers. b, Driver mutations by mutation type. c, Distribution of rearrangements throughout the genome. Black line represents background rearrangement density (calculation based on rearrangement breakpoints in intergenic regions only). Red lines represent frequency of rearrangement within breast cancer genes.

  7. Extended Data Fig. 2: Rearrangements in oncogenes.

    a, Variation in rearrangement and copy number events affecting ESR1. Clear amplification in top panel, transection of ESR1 in middle panel and focused tandem duplication events in bottom panel. b, Predicted outcomes of some rearrangements affecting ETV6. Red crosses indicate exons deleted as a result of rearrangements within the ETV6 gene; black dotted lines indicate rearrangement breakpoints resulting in fusions between ETV6 and ERC, WNK1, ATP2B1 or LRP6. ETV6 domains indicated are: N-terminal (NT) pointed domain and E26 transformation-specific DNA binding domain (ETS).

  8. Extended Data Fig. 3: Recurrent non-coding events in breast cancers.

    a, Manhattan plot demonstrating sites with most significant P values as identified by binning analysis. Purple highlighted sites were also detected by the method seeking recurrence when partitioned by genomic features. b, Locus at chr11 65 Mb, which was identified by independent analyses as being more mutated than expected by chance. Bottom, a rearrangement hotspot analysis identified this region as a tandem duplication hotspot, with nested tandem duplications noted at this site. Partitioning the genome into different regulatory elements, an analysis of substitutions and indels identified lncRNAs MALAT1 and NEAT1 (topmost panels) with significant P values.

  9. Extended Data Fig. 4: Copy number analyses.

    a, Frequency of copy number aberrations across the cohort. Chromosome position on x axis; frequency of copy number gains (red) and losses (green) on y axis. b, Identification of focal recurrent copy number gains by the GISTIC method (Supplementary Methods). c, Identification of focal recurrent copy number losses by the GISTIC method. d, Heatmap of GISTIC regions following unsupervised hierarchical clustering. Five cluster groups are noted and relationships with expression subtype (basal, red; luminal B, light blue; luminal A, dark blue), immunohistopathology status (ER, PR, HER2 status; black, positive), abrogation of BRCA1 (red) and BRCA2 (blue) (whether germline, somatic or through promoter hypermethylation), driver mutations (black, positive), HRD index (top 25% or lowest 25%; black, positive).

  10. Extended Data Fig. 5: miRNA analyses.

    Hierarchical clustering of the most variant miRNAs using complete linkage and Euclidean distance. miRNA clusters were assigned using the partitioning algorithm using recursive thresholding (PART) method. Five main patient clusters were revealed. The horizontal annotation bars show (from top to bottom): PART cluster group, PAM50 mRNA expression subtype, GISTIC cluster, rearrangement cluster, lymphocyte infiltration score and histological grade. The heatmap shows clustered and centred miRNA expression data (log2 transformed). Details on colour coding of the annotation bars are presented below the heatmap.

  11. Extended Data Fig. 6: Rearrangement cluster groups and associated features.

    a, Overall survival (OS) by rearrangement cluster group. b, Age of diagnosis. c, Tumour grade. d, Menopausal status. e, ER status. f, Immune response metagene panel. g, Lymphocytic infiltration score.

  12. Extended Data Fig. 7: Contrasting tandem duplication phenotypes.

    Contrasting tandem duplication phenotypes of two breast cancers using chromosome X. Copy number (y axis) depicted as black dots. Lines represent rearrangement breakpoints (green, tandem duplications; pink, deletions; blue, inversions; black, translocations with partner breakpoint provided). Top, PD4841a has numerous large tandem duplications (>100 kb, rearrangement signature 1), whereas PD4833a has many short tandem duplications (<10 kb, rearrangement signature 3) appearing as ‘single’ lines in its plot.

  13. Extended Data Fig. 8: Hotspots of tandem duplications.

    A tandem duplication hotspot occurring in six different patients.

  14. Extended Data Fig. 9: Rearrangement breakpoint junctions.

    a, Breakpoint features of rearrangements in 560 breast cancers by rearrangement signature. b, Breakpoint features in BRCA and non-BRCA cancers.

  15. Extended Data Fig. 10: Signatures of focal hypermutation.

    a, Kataegis and alternative kataegis occurring at the same locus (ERBB2 amplicon in PD13164a). Copy number (y axis) depicted as black dots. Lines represent rearrangement breakpoints (green, tandem duplications; pink, deletions; blue, inversions). Top, an ~10 Mb region including the ERBB2 locus. Middle, zoomed-in tenfold to an ~1 Mb window highlighting co-occurrence of rearrangement breakpoints with copy number changes and three different kataegis loci. Bottom, kataegis loci in more detail, with log10 intermutation distance on y axis. Black arrow, kataegis; blue arrows, alternative kataegis. b, Sequence context of kataegis and alternative kataegis identified in this data set.
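
The log10 intermutation distance plotted in the focal hypermutation legend above is the distance from each mutation to the previous one on the same chromosome. The sketch below shows how such distances, and runs of closely spaced mutations, could be computed. It is illustrative only: the function names and the thresholds `max_gap=1000` and `min_mutations=6` are assumptions, not the criteria used in the paper.

```python
import math


def intermutation_distances(positions):
    """Return log10 distances (bp) between consecutive mutations on one chromosome."""
    ordered = sorted(set(positions))
    return [math.log10(b - a) for a, b in zip(ordered, ordered[1:])]


def clustered_runs(positions, max_gap=1000, min_mutations=6):
    """Return (start, end) coordinates of runs of closely spaced mutations.

    A run is a maximal stretch in which every consecutive gap is <= max_gap bp;
    it is reported only if it contains at least min_mutations mutations.
    Both thresholds are illustrative assumptions.
    """
    ordered = sorted(set(positions))
    runs, start = [], 0
    for i in range(1, len(ordered) + 1):
        # Close the current run at the end of the list or at a large gap.
        if i == len(ordered) or ordered[i] - ordered[i - 1] > max_gap:
            if i - start >= min_mutations:
                runs.append((ordered[start], ordered[i - 1]))
            start = i
    return runs


if __name__ == "__main__":
    # Made-up positions on a single chromosome, for demonstration only.
    toy_positions = [100, 50_000, 50_200, 50_350, 50_400,
                     50_900, 51_000, 51_050, 900_000]
    print(intermutation_distances(toy_positions))
    print(clustered_runs(toy_positions))
```

Plotting the values from `intermutation_distances` against mutation order gives a rainfall-style plot in which clustered mutations appear as a dip towards small distances.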

Author information

Affiliations

  1. Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK

    • Serena Nik-Zainal,
    • Helen Davies,
    • Manasa Ramakrishna,
    • Dominik Glodzik,
    • Xueqing Zou,
    • Inigo Martincorena,
    • Ludmil B. Alexandrov,
    • Sancha Martin,
    • David C. Wedge,
    • Peter Van Loo,
    • Young Seok Ju,
    • Adam Butler,
    • Serge Dronov,
    • Moritz Gerstung,
    • David R. Jones,
    • Yilong Li,
    • Stuart McLaren,
    • Andrew Menzies,
    • Ville Mustonen,
    • Sarah O’Meara,
    • Keiran Raine,
    • Kamna Ramakrishnan,
    • Rebecca Shepherd,
    • Lucy Stebbings,
    • Jon Teague,
    • Lucy Yates,
    • P. Andrew Futreal,
    • Peter J. Campbell &
    • Michael R. Stratton
  2. East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, Cambridge CB2 9NB, UK

    • Serena Nik-Zainal
  3. Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, Lund SE-223 81, Sweden

    • Johan Staaf,
    • Markus Ringnér &
    • Åke Borg
  4. Theoretical Biology and Biophysics (T-6), Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA

    • Ludmil B. Alexandrov
  5. Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA

    • Ludmil B. Alexandrov
  6. Department of Human Genetics, University of Leuven, B-3000 Leuven, Belgium

    • Peter Van Loo
  7. Department of Medical Oncology, Erasmus MC Cancer Institute and Cancer Genomics Netherlands, Erasmus University Medical Center, Rotterdam 3015CN, The Netherlands

    • Marcel Smid,
    • John A. Foekens,
    • F. Germán Rodríguez-González,
    • Anieta M. Sieuwerts &
    • John W. M. Martens
  8. Radboud University, Department of Molecular Biology, Faculty of Science, 6525GA Nijmegen, The Netherlands

    • Arie B. Brinkman &
    • Hendrik G. Stunnenberg
  9. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

    • Sandro Morganella &
    • Ewan Birney
  10. Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radium Hospital, Oslo 0310, Norway

    • Miriam R. Aure,
    • Anita Langerød &
    • Anne-Lise Børresen-Dale
  11. K. G. Jebsen Centre for Breast Cancer Research, Institute for Clinical Medicine, University of Oslo, Oslo 0310, Norway

    • Miriam R. Aure,
    • Ole Christian Lingjærde,
    • Anita Langerød &
    • Anne-Lise Børresen-Dale
  12. Department of Computer Science, University of Oslo, Oslo, Norway

    • Ole Christian Lingjærde
  13. Gachon Institute of Genome Medicine and Science, Gachon University Gil Medical Center, Incheon, South Korea

    • Sung-Min Ahn
  14. Translational Research Lab, Centre Léon Bérard, 28, rue Laënnec, 69373 Lyon Cedex 08, France

    • Sandrine Boyault
  15. Department of Pathology, Brigham and Women’s Hospital, Boston, Massachusetts 02115, USA

    • Jane E. Brock &
    • Andrea L. Richardson
  16. The Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands

    • Annegien Broeks,
    • Laura van’t Veer &
    • Jos Jonkers
  17. Breast Cancer Translational Research Laboratory, Université Libre de Bruxelles, Institut Jules Bordet, Bd de Waterloo 121, B-1000 Brussels, Belgium

    • Christine Desmedt &
    • Christos Sotiriou
  18. Translational Cancer Research Unit, Center for Oncological Research, Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium

    • Luc Dirix,
    • Gert G. Van den Eynden,
    • Peter Vermeulen &
    • Steven Van Laere
  19. Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA

    • Aquila Fatima &
    • Andrea L. Richardson
  20. Department of Pathology, Academic Medical Center, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands

    • Gerrit K. J. Hooijer &
    • Marc J. van de Vijver
  21. Department of Pathology, Asan Medical Center, College of Medicine, Ulsan University, Ulsan, South Korea

    • Se Jin Jang &
    • Hee Jin Lee
  22. Department of Pathology, College of Medicine, Hanyang University, Seoul 133-791, South Korea

    • Hyung-Yong Kim &
    • Gu Kong
  23. Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, New York 10065, USA

    • Tari A. King
  24. Morgan Welch Inflammatory Breast Cancer Research Program and Clinic, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, Texas 77030, USA

    • Savitri Krishnamurthy &
    • Naoto T. Ueno
  25. Institute for Bioengineering and Biopharmaceutical Research (IBBR), Hanyang University, Seoul, South Korea

    • Jeong-Yeon Lee
  26. Institut National du Cancer, Research Division, Clinical Research Department, 52 avenue Morizet, 92513 Boulogne-Billancourt, France

    • Iris Pauporté
  27. University Hospital of Minjoz, INSERM UMR 1098, Bd Fleming, Besançon 25000, France

    • Xavier Pivot
  28. Pathology Department, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK

    • Colin A. Purdie &
    • Alastair M. Thompson
  29. Oncologie Sénologie, ICM Institut Régional du Cancer, Montpellier, France

    • Gilles Romieu
  30. The University of Queensland, UQ Centre for Clinical Research and School of Medicine, Brisbane, Queensland 4029, Australia

    • Peter T. Simpson &
    • Sunil R. Lakhani
  31. Cancer Research Laboratory, Faculty of Medicine, University of Iceland, 101 Reykjavik, Iceland

    • Olafur A. Stefansson &
    • Jorunn E. Eyfjord
  32. IRCCS Istituto Tumori “Giovanni Paolo II”, Bari, Italy

    • Stefania Tommasi
  33. Department of Pathology, Centre Léon Bérard, 28 rue Laënnec, 69373 Lyon Cédex 08, France

    • Isabelle Treilleux
  34. Department of Pathology, GZA Hospitals Sint-Augustinus, Antwerp, Belgium

    • Gert G. Van den Eynden &
    • Peter Vermeulen
  35. Institut Curie, Paris Sciences Lettres University, Department of Pathology and INSERM U934, 26 rue d’Ulm, 75248 Paris Cedex 05, France

    • Anne Vincent-Salomon
  36. Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK

    • Carlos Caldas
  37. Breast Cancer Now Research Unit, King’s College London, London SE1 9RT, UK

    • Andrew Tutt
  38. Breast Cancer Now Toby Robins Research Centre, Institute of Cancer Research, London SW3 6JB, UK

    • Andrew Tutt
  39. Department of Clinical Science, University of Bergen, 5020 Bergen, Norway

    • Stian Knappskog
  40. Department of Oncology, Haukeland University Hospital, 5021 Bergen, Norway

    • Stian Knappskog
  41. National Cancer Centre Singapore, 11 Hospital Drive, 169610, Singapore

    • Benita Kiat Tee Tan
  42. Singapore General Hospital, Outram Road, 169608, Singapore

    • Benita Kiat Tee Tan
  43. Equipe Erable, INRIA Grenoble-Rhône-Alpes, 655, Avenue de l’Europe, 38330 Montbonnot-Saint Martin, France

    • Alain Viari
  44. Synergie Lyon Cancer, Centre Léon Bérard, 28 rue Laënnec, Lyon Cedex 08, France

    • Alain Viari &
    • Gilles Thomas
  45. Department of Genomic Medicine, UT MD Anderson Cancer Center, Houston, Texas 77230, USA

    • P. Andrew Futreal
  46. Department of Radiation Oncology, Department of Laboratory Medicine, Radboud University Medical Center, Nijmegen 6525GA, The Netherlands

    • Paul N. Span
  47. Pathology Queensland, The Royal Brisbane and Women’s Hospital, Brisbane, Queensland 4029, Australia

    • Sunil R. Lakhani
  48. Department of Breast Surgical Oncology, University of Texas MD Anderson Cancer Center, 1400 Pressler Street, Houston, Texas 77030, USA

    • Alastair M. Thompson

Contributions

S.N.-Z. and M.R.S. designed the study, analysed data and wrote the manuscript. H.D., J.S., M. Ramakrishna, D.G. and X.Z. performed curation of data and contributed towards genomic and copy number analyses. M.S., A.B.B., M.R.A., O.C.L., A.L. and M. Ringnér contributed towards curation and analysis of non-genomic data (transcriptomic, miRNA, methylation). I.M., L.B.A., D.C.W., P.V.L., S. Morganella and Y.S.J. contributed towards specialist analyses. G.T., G.K., A.L.R., A.-L.B.-D., J.W.M.M., M.J.v.d.V., H.G.S., E.B., A. Borg, A.V., P.A.F. and P.J.C. designed the study, drove the consortium and provided samples. S. Martin was the project coordinator. S. McL., S.O.M. and K.R. contributed operationally. S.-M.A., S.B., J.E.B., A. Broeks, C.D., L.D., A.F., J.A.F., G.K.J.H., S.J.J., H.-Y.K., T.A.K., S.K., H.J.L., J.-Y.L., I.P., X.P., C.A.P., F.G.R.-G., G.R., A.M.S., P.T.S., O.A.S., S.T., I.T., G.G.V.d.E., P.V., A.V.-S., L.Y., C.C., L.v.V., A.T., S.K., B.K.T.T., J.J., N.T.U., C.S., P.N.S., S.V.L., S.R.L., J.E.E. and A.M.T. contributed pathology assessment and/or samples. A. Butler, S.D., M.G., D.R.J., Y.L., A.M., V.M., K.R., R.S., L.S. and J.T. contributed IT processing and management expertise. All authors discussed the results and commented on the manuscript.

Competing financial interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to:

Raw data have been submitted to the European Genome-phenome Archive under the overarching accession number EGAS00001001178 (please see Supplementary Notes for breakdown by data type). Somatic variants have been deposited at the International Cancer Genome Consortium Data Portal (https://dcc.icgc.org/).

Author details

Extended data figures and tables

Extended Data Figures

  1. Extended Data Figure 1: Landscape of driver mutations. (261 KB)

    a, Summary of subtypes of cohort of 560 breast cancers. b, Driver mutations by mutation type. c, Distribution of rearrangements throughout the genome. Black line represents background rearrangement density (calculation based on rearrangement breakpoints in intergenic regions only). Red lines represent frequency of rearrangement within breast cancer genes.

  2. Extended Data Figure 2: Rearrangements in oncogenes. (209 KB)

    a, Variation in rearrangement and copy number events affecting ESR1. Clear amplification in top panel, transection of ESR1 in middle panel and focused tandem duplication events in bottom panel. b, Predicted outcomes of some rearrangements affecting ETV6. Red crosses indicate exons deleted as a result of rearrangements within the ETV6 genes, black dotted lines indicate rearrangement break points resulting in fusions between ETV6 and ERC, WNK1, ATP2B1 or LRP6. ETV6 domains indicated are: N-terminal (NT) pointed domain and E26 transformation-specific DNA binding domain (ETS).

  3. Extended Data Figure 3: Recurrent non-coding events in breast cancers. (650 KB)

    a, Manhattan plot demonstrating sites with most significant P values as identified by binning analysis. Purple highlighted sites were also detected by the method seeking recurrence when partitioned by genomic features. b, Locus at chr11 65 Mb, which was identified by independent analyses as being more mutated than expected by chance. Bottom, a rearrangement hotspot analysis identified this region as a tandem duplication hotspot, with nested tandem duplications noted at this site. Partitioning the genome into different regulatory elements, an analysis of substitutions and indels identified lncRNAs MALAT1 and NEAT1 (topmost panels) with significant P values.

  4. Extended Data Figure 4: Copy number analyses. (919 KB)

    a, Frequency of copy number aberrations across the cohort. Chromosome position along x axis, frequency of copy number gains (red) and losses (green) y axis. b, Identification of focal recurrent copy number gains by the GISTIC method (Supplementary Methods). c, Identification of focal recurrent copy number losses by the GISTIC method. d, Heatmap of GISTIC regions following unsupervised hierarchical clustering. Five cluster groups are noted and relationships with expression subtype (basal, red; luminal B, light blue; luminal A, dark blue), immunohistopathology status (ER, PR, HER2 status; black, positive), abrogation of BRCA1 (red) and BRCA2 (blue) (whether germline, somatic or through promoter hypermethylation), driver mutations (black, positive), HRD index (top 25% or lowest 25%; black, positive).

  5. Extended Data Figure 5: miRNA analyses. (664 KB)

    Hierarchical clustering of the most variant miRNAs using complete linkage and Euclidean distance. miRNA clusters were assigned using the partitioning algorithm using recursive thresholding (PART) method. Five main patient clusters were revealed. The horizontal annotation bars show (from top to bottom): PART cluster group, PAM50 mRNA expression subtype, GISTIC cluster, rearrangement cluster, lymphocyte infiltration score and histological grade. The heatmap shows clustered and centred miRNA expression data (log2 transformed). Details on colour coding of the annotation bars are presented below the heatmap.

  6. Extended Data Figure 6: Rearrangement cluster groups and associated features. (264 KB)

    a, Overall survival (OS) by rearrangement cluster group. b, Age of diagnosis. c, Tumour grade. d, Menopausal status. e, ER status. f, Immune response metagene panel. g, Lymphocytic infiltration score.

  7. Extended Data Figure 7: Contrasting tandem duplication phenotypes. (171 KB)

    Contrasting tandem duplication phenotypes of two breast cancers using chromosome X. Copy number (y axis) depicted as black dots. Lines represent rearrangement breakpoints (green, tandem duplications; pink, deletions; blue, inversions; black, translocations with partner breakpoint provided). Top, PD4841a has numerous large tandem duplications (>100 kb, rearrangement signature 1), whereas PD4833a has many short tandem duplications (<10 kb, rearrangement signature 3) appearing as ‘single’ lines in its plot.

  8. Extended Data Figure 8: Hotspots of tandem duplications. (445 KB)

    A tandem duplication hotspot occurring in six different patients.

  9. Extended Data Figure 9: Rearrangement breakpoint junctions. (201 KB)

    a, Breakpoint features of rearrangements in 560 breast cancers by rearrangement signature. b, Breakpoint features in BRCA and non-BRCA cancers.

  10. Extended Data Figure 10: Signatures of focal hypermutation. (331 KB)

    a, Kataegis and alternative kataegis occurring at the same locus (ERBB2 amplicon in PD13164a). Copy number (y axis) depicted as black dots. Lines represent rearrangement breakpoints (green, tandem duplications; pink, deletions; blue, inversions). Top, an ~10 Mb region including the ERBB2 locus. Middle, zoomed-in tenfold to an ~1 Mb window highlighting co-occurrence of rearrangement breakpoints with copy number changes and three different kataegis loci. Bottom, the kataegis loci in more detail, with log10 intermutation distance on the y axis. Black arrow, kataegis; blue arrows, alternative kataegis. b, Sequence context of kataegis and alternative kataegis identified in this data set.
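
    The log10 intermutation distance plotted in the bottom panel of Extended Data Figure 10a can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the 1,000 bp gap threshold and the six-mutation minimum are assumptions chosen for the example, and the study's own criteria for calling kataegis are described in its Supplementary Methods.

```python
import math


def intermutation_distances(positions):
    """Return log10 of the distance (bp) between each mutation and the
    previous one on the same chromosome. Positions sharing a coordinate
    are skipped because log10(0) is undefined."""
    ordered = sorted(positions)
    return [math.log10(b - a) for a, b in zip(ordered, ordered[1:]) if b > a]


def find_dense_runs(positions, max_gap=1000, min_mutations=6):
    """Return (start, end) coordinates of runs of consecutive mutations in
    which every neighbouring pair is at most max_gap bp apart.
    max_gap and min_mutations are illustrative assumptions only."""
    ordered = sorted(set(positions))
    runs = []
    start = 0
    for i in range(1, len(ordered) + 1):
        # Close the current run at the end of the list or at a large gap.
        if i == len(ordered) or ordered[i] - ordered[i - 1] > max_gap:
            if i - start >= min_mutations:
                runs.append((ordered[start], ordered[i - 1]))
            start = i
    return runs


if __name__ == "__main__":
    # Hypothetical coordinates on a single chromosome.
    example = [10_000, 10_200, 10_350, 10_900, 11_500, 12_000, 500_000, 900_000]
    print(intermutation_distances(example))
    print(find_dense_runs(example))
```

    Plotting the first function's output against mutation order gives a rainfall-style view in which clusters of closely spaced mutations appear as sharp dips in intermutation distance.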

Supplementary information

PDF files

  1. Supplementary Information (2.2 MB)

    This file contains Supplementary Methods and Data and additional references.

  2. Supplementary Information (187 KB)

    This file contains some acknowledgements and the EGA accession numbers.

Zip files

  1. Supplementary Tables (41.2 MB)

    This zipped file contains Supplementary Tables 1-21.

Comments

  1. Comment #68115

    Majid Ali said:

    Re: Somatic Mutations in Breast cancer. Hyperinsulinism In Women With Breast Cancer

    The work of Nik-Zainal et al. enlightens physicians who treat cancer with an integrative model in two ways: (1) it vastly expands the knowledge of the breast cancer genome, which is expected to yield significant clinical benefits in coming years; and (2) it underscores the importance of addressing non-genetic elements in treating breast cancer. The finding of 93 protein-coding breast cancer genes carrying probable driver mutations notably extends the range beyond the limits of BRCA1 and BRCA2 mutations. The repertoire of breast cancer genes and mutational processes revealed in the study further underscores the need for integrating new treatments based on advances in tumor genomics with those made with dietary, metabolic, environmental, and self-regulatory approaches to counter tumor-related immunosuppression. Specific concerns here are: (1) diagnosis-related anger and fear; (2) cumulative burdens of cancer biology and comorbidities; (3) imagined and real concerns about serious adverse effects of treatment options; and (4) perceived and actual probabilities of treatment failure.

    To address cancer-related immunosuppression from diverse causes, the author focused on oxygen signaling (ref. 1-4) and insulin homeostasis (ref. 5, 6). In 1995, a holistic-integrative model of cancer control was proposed, based on accelerated oxidative injury as the common denominator in elements that sustain and promote proliferation of tumor cells. In 2001, the oxidative hypothesis was expanded to the dysox model of cancer with the following words: cancer is the destructive behavior of cells incited and perpetuated by many factors that cumulatively lead to anomalous oxygen signaling (ref. 2). In 2009, a focus on issues of insulin homeostasis led to the proposal of an oxygen model of hyperinsulinism (ref. 5) and the description of a hyperinsulinism modification plan (ref. 6).

    Breast cancer cells carry a larger complement of insulin receptors than non-neoplastic mammary gland cells (ref. 7). This may be expected to add to the disruption of insulin homeostasis caused by the dietary, metabolic, and other factors mentioned earlier, and indeed might become a clinically important metabolic concern. The author investigated this risk by measuring insulin responses to a 75-gram glucose challenge, with blood samples drawn at fasting and at 1 hour, 2 hours, and 3 hours, in patients with a variety of malignant neoplasms. The following sets of blood insulin and glucose data from three women with unimpaired glucose tolerance and breast cancer treated with chemotherapy and surgery illustrate two points: (1) how insulin can rise with treatment (cases 1 and 2); and (2) how insulin levels can fall with an integrative insulin modification plan following treatment (case 3). The insulin profile of a healthy subject with unimpaired glucose tolerance is included as a control. Insulin and glucose concentrations in the 3-hour profiles given below are expressed in µIU/mL and mg/dL, respectively.

    1. Healthy control subject: insulin levels: <2, 18, 4, <2; glucose levels: 77, 168, 109, 74.
    2. Breast cancer case 1: insulin levels: 4, 29, 30, 14; insulin levels three years after chemotherapy: 6.2, 79.0, 60.6, 8.4.
    3. Breast cancer case 2: insulin levels: 6.5, 21.9, 19.9, 14.8; insulin levels one year after surgery and chemotherapy: 3, 52, 11, 7.
    4. Breast cancer case 3: insulin levels: 3, 62, 96, 19; insulin levels three years after beginning an insulin modification plan: 6.8, 36.0, 23.0, 8.8.

    The author is one of the growing community of physicians who treat cancer and recognize the deep commitment of scientists to advancing science for controlling cancer. They also acknowledge that the scientists' diligence, persistence, and endurance are not being matched by those of clinicians. Patients often pay a heavy price when relevant complementary immunity-bolstering nutritional, metabolic, environmental, and self-regulatory measures are withheld from them.

    References
    1. Ali M. Respiratory-to-Fermentative (RTF) Shift in ATP Production in Chronic Energy Deficit States. Townsend Letter for Doctors and Patients. 2004;253:64-65.
    2. Ali M. Cancer, Oxygen, and Pantotropha. Part I. Townsend Letter for Doctors and Patients. 2004;256:98-102.
    3. Ali M. Oxygen and Aging. (1st ed.) New York, Canary 21 Press. Aging Healthfully Book 2000.
    4. Ali M. The Principles and Practice of Integrative Medicine, Volume XI: Darwin, Dysox, and Disease. 2000; 3rd ed. 2008. New York, Institute of Integrative Medicine Press (2009).
    5. Ali M. Beyond Insulin Resistance and Syndrome X: The Oxidative-Dysoxygenative Insulin Dysfunction (ODID) Model – Part 1. Townsend Letter for Doctors & Patients. August 2002.
    6. Ali M. Dr. Ali's Plan for Reversing Diabetes. New York, Canary Press. Aging Healthfully Books. 2011.
    7. Papa V, Pezzino V, Constantino A, et al. Elevated insulin receptor content in human breast cancer. J Clin Invest. 1990;86:1503-1510.

Additional data