Landscape of somatic mutations in 560 breast cancer whole-genome sequences

An Author Correction to this article was published on 18 January 2019

This article has been updated


We analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function and another with deficient BRCA1 or BRCA2 function; the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer.

Figure 1: Cohort and catalogue of somatic mutations in 560 breast cancers.
Figure 2: Non-coding analyses of breast cancer genomes.
Figure 3: Extraction and contributions of base substitution signatures in 560 breast cancers.
Figure 4: Additional characteristics of base substitution signatures and novel rearrangement signatures in 560 breast cancers.
Figure 5: Integrative analysis of rearrangement signatures.

Accession codes

Data deposits

Raw data have been submitted to the European-Genome Phenome Archive under the overarching accession number EGAS00001001178 (please see Supplementary Notes for breakdown by data type). Somatic variants have been deposited at the International Cancer Genome Consortium Data Portal (

Change history

  • 18 January 2019

    In the Methods section of this Article, 'greater than' should have been 'less than' in the sentence 'Putative regions of clustered rearrangements were identified as having an average inter-rearrangement distance that was at least 10 times greater than the whole-genome average for the individual sample.'. The Article has not been corrected.
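The corrected criterion (an average inter-rearrangement distance at least 10 times less than the sample's whole-genome average) can be illustrated with a minimal sketch. It assumes breakpoint positions on a single chromosome and a precomputed whole-genome average gap for the sample. The function name, the `min_breakpoints` parameter and the greedy grouping are illustrative assumptions, not the segmentation procedure used in the Article, which is not described in this section.

```python
def clustered_regions(positions, genome_avg_gap, fold=10.0, min_breakpoints=3):
    """Return (start, end) spans of consecutive breakpoints whose gaps are all
    at most genome_avg_gap / fold.

    If every gap in a span is below the threshold, the span's average gap is
    too, so this is a stricter (sufficient) version of the stated criterion.
    """
    threshold = genome_avg_gap / fold
    pts = sorted(positions)
    regions = []
    run = [pts[0]] if pts else []
    for prev, cur in zip(pts, pts[1:]):
        if cur - prev <= threshold:
            run.append(cur)          # still inside a dense run
        else:
            if len(run) >= min_breakpoints:
                regions.append((run[0], run[-1]))
            run = [cur]              # start a new candidate run
    if len(run) >= min_breakpoints:
        regions.append((run[0], run[-1]))
    return regions


# Hypothetical breakpoints (bp) with a sample-wide average gap of 1 Mb.
example = [1_000, 1_500, 2_100, 2_600, 5_000_000, 9_000_000, 9_000_400, 9_000_900]
print(clustered_regions(example, genome_avg_gap=1_000_000))
# [(1000, 2600), (9000000, 9000900)]
```

Note that the direction of the comparison matters: with "greater than", the same data would flag sparse rather than dense regions, which is why the correction was issued.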



This work has been funded through the ICGC Breast Cancer Working group by the Breast Cancer Somatic Genetics Study (BASIS), a European research project funded by the European Community’s Seventh Framework Programme (FP7/2010-2014) under the grant agreement number 242006; the Triple Negative project funded by the Wellcome Trust (grant reference 077012/Z/05/Z) and the HER2+ project funded by Institut National du Cancer (INCa) in France (grant numbers 226-2009, 02-2011, 41-2012, 144-2008, 06-2012). The ICGC Asian Breast Cancer Project was funded through a grant of the Korean Health Technology R&D Project, Ministry of Health and Welfare, Republic of Korea (A111218-SC01). Personally funded by the grants above: F.G.R.-G., S.M., K.R. and S.M. were funded by BASIS. Recruitment was performed under the auspices of the ICGC breast cancer projects run by the UK, France and Korea.

For contributions towards instruments, specimens and collections: Tayside Tissue Bank (funded by CRUK, University of Dundee, Chief Scientist Office & Breast Cancer Campaign), Asan Bio-Resource Center of the Korea Biobank Network, Seoul, South Korea, OSBREAC consortium, The Icelandic Centre for Research (RANNIS), The Swedish Cancer Society and the Swedish Research Council, and Fondation Jean Dausset-Centre d’Etudes du polymorphisme humain. Icelandic Cancer Registry, The Brisbane Breast Bank (The University of Queensland, The Royal Brisbane and Women’s Hospital and QIMR Berghofer), Breast Cancer Tissue and Data Bank at KCL and NIHR Biomedical Research Centre at Guy’s and St Thomas’s Hospitals. Breakthrough Breast Cancer and Cancer Research UK Experimental Cancer Medicine Centre at KCL.

For pathology review: The Mouse Genome Project and Department of Pathology, Cambridge University Hospitals NHS Foundation Trust for microscopes. A. Richardson, A. Ehinger, A. Vincent-Salomon, C. Van Deurzen, C. Purdie, D. Larsimont, D. Giri, D. Grabau, E. Provenzano, G. MacGrogan, G. Van den Eynden, I. Treilleux, J. E. Brock, J. Jacquemier, J. Reis-Filho, L. Arnould, L. Jones, M. van de Vijver, Ø. Garred, R. Salgado, S. Pinder, S. R. Lakhani, T. Sauer, V. Barbashina. Illumina UK Ltd for input on optimization of sequencing throughout this project. Wellcome Trust Sanger Institute Sequencing Core Facility, Core IT Facility and Cancer Genome Project Core IT team and Cancer Genome Project Core Laboratory team for general support.

Personal funding: S.N.-Z. is a Wellcome Beit Fellow and personally funded by a Wellcome Trust Intermediate Fellowship (WT100183MA). L.B.A. is supported through a J. Robert Oppenheimer Fellowship at Los Alamos National Laboratory. A.L.R. is partially supported by the Dana-Farber/Harvard Cancer Center SPORE in Breast Cancer (NIH/NCI 5 P50 CA168504-02). D.G. was supported by the EU-FP7-SUPPRESSTEM project. A.S. was supported by Cancer Genomics Netherlands through a grant from the Netherlands Organisation for Scientific Research (NWO). M.S. was supported by the EU-FP7-DDR response project. C.S. and C.D. are supported by a grant from the Breast Cancer Research Foundation. E.B. was funded by EMBL. C.S. is funded by FNRS (Fonds National de la Recherche Scientifique). S.J.J. is supported by the Leading Foreign Research Institute Recruitment Program through the National Research Foundation of the Republic of Korea (NRF 2011-0030105). G.K. is supported by National Research Foundation of Korea (NRF) grants funded by the Korean government (NRF 2015R1A2A1A10052578). J.F. received funding from an ERC Advanced grant (no. 322737).

For general contribution and administrative support: Fondation Synergie Lyon Cancer in France. J. G. Jonasson, Department of Pathology, University Hospital & Faculty of Medicine, University of Iceland. K. Ferguson, Tissue Bank Manager, Brisbane Breast Bank and The Breast Unit, The Royal Brisbane and Women’s Hospital, Brisbane, Australia. The Oslo Breast Cancer Consortium of Norway (OSBREAC). Angelo Paradiso, IRCCS Istituto Tumori “Giovanni Paolo II”, Bari, Italy. A. Vines for administrative support in identifying the samples, organizing the bank, and sending out the samples. M. Schlooz-Vries, J. Tol, H. van Laarhoven, F. Sweep, P. Bult for contributions in Nijmegen.

This research used resources provided by the Los Alamos National Laboratory Institutional Computing Program, which is supported by the US Department of Energy National Nuclear Security Administration under contract no. DE-AC52-06NA25396. Research performed at Los Alamos National Laboratory was carried out under the auspices of the National Nuclear Security Administration of the United States Department of Energy. N. Miller (in memoriam) for her contribution in setting up the clinical database. Finally, we would like to acknowledge all members of the ICGC Breast Cancer Working Group and ICGC Asian Breast Cancer Project.

Author information


S.N.-Z. and M.R.S. designed the study, analysed data and wrote the manuscript. H.D., J.S., M. Ramakrishna, D.G. and X.Z. performed curation of data and contributed towards genomic and copy number analyses. M.S., A.B.B., M.R.A., O.C.L., A.L. and M. Ringner contributed towards curation and analysis of non-genomic data (transcriptomic, miRNA, methylation). I.M., L.B.A., D.C.W., P.V.L., S. Morganella and Y.S.J. contributed towards specialist analyses. G.T., G.K., A.L.R., A-L.B.-D., J.W.M.M., M.J.v.d.V., H.G.S., E.B., A. Borg, A.V., P.A.F. and P.J.C. designed the study, drove the consortium and provided samples. S. Martin was the project coordinator. S.McL., S.O.M. and K.R. contributed operationally. S.-M.A., S.B., J.E.B., A. Brooks, C.D., L.D., A.F., J.A.F., G.K.J.H., S.J.J., H.-Y.K., T.A.K., S.K., H.J.L., J.-Y.L., I.P., X.P., C.A.P., F.G.R.-G., G.R., A.M.S., P.T.S., O.A.S., S.T., I.T., G.G.V.d.E., P.V., A.V.-S., L.Y., C.C., L.v.V., A.T., S.K., B.K.T.T., J.J., N.t.U., C.S., P.N.S., S.V.L., S.R.L., J.E.E. and A.M.T. contributed pathology assessment and/or samples. A. Butler, S.D., M.G., D.R.J., Y.L., A.M., V.M., K.R., R.S., L.S. and J.T. contributed IT processing and management expertise. All authors discussed the results and commented on the manuscript.

Corresponding authors

Correspondence to Serena Nik-Zainal, Alain Viari, Gu Kong or Michael R. Stratton.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Landscape of driver mutations.

a, Summary of subtypes of cohort of 560 breast cancers. b, Driver mutations by mutation type. c, Distribution of rearrangements throughout the genome. Black line represents background rearrangement density (calculation based on rearrangement breakpoints in intergenic regions only). Red lines represent frequency of rearrangement within breast cancer genes.

Extended Data Figure 2 Rearrangements in oncogenes.

a, Variation in rearrangement and copy number events affecting ESR1. Clear amplification in top panel, transection of ESR1 in middle panel and focused tandem duplication events in bottom panel. b, Predicted outcomes of some rearrangements affecting ETV6. Red crosses indicate exons deleted as a result of rearrangements within the ETV6 gene; black dotted lines indicate rearrangement break points resulting in fusions between ETV6 and ERC, WNK1, ATP2B1 or LRP6. ETV6 domains indicated are: N-terminal (NT) pointed domain and E26 transformation-specific DNA binding domain (ETS).

Extended Data Figure 3 Recurrent non-coding events in breast cancers.

a, Manhattan plot demonstrating sites with most significant P values as identified by binning analysis. Purple highlighted sites were also detected by the method seeking recurrence when partitioned by genomic features. b, Locus at chr11 65 Mb, which was identified by independent analyses as being more mutated than expected by chance. Bottom, a rearrangement hotspot analysis identified this region as a tandem duplication hotspot, with nested tandem duplications noted at this site. Partitioning the genome into different regulatory elements, an analysis of substitutions and indels identified lncRNAs MALAT1 and NEAT1 (topmost panels) with significant P values.

Extended Data Figure 4 Copy number analyses.

a, Frequency of copy number aberrations across the cohort. Chromosome position along x axis, frequency of copy number gains (red) and losses (green) on y axis. b, Identification of focal recurrent copy number gains by the GISTIC method (Supplementary Methods). c, Identification of focal recurrent copy number losses by the GISTIC method. d, Heatmap of GISTIC regions following unsupervised hierarchical clustering. Five cluster groups are noted and relationships with expression subtype (basal, red; luminal B, light blue; luminal A, dark blue), immunohistopathology status (ER, PR, HER2 status; black, positive), abrogation of BRCA1 (red) and BRCA2 (blue) (whether germline, somatic or through promoter hypermethylation), driver mutations (black, positive), HRD index (top 25% or lowest 25%; black, positive).

Extended Data Figure 5 miRNA analyses.

Hierarchical clustering of the most variant miRNAs using complete linkage and Euclidean distance. miRNA clusters were assigned using the partitioning algorithm using recursive thresholding (PART) method. Five main patient clusters were revealed. The horizontal annotation bars show (from top to bottom): PART cluster group, PAM50 mRNA expression subtype, GISTIC cluster, rearrangement cluster, lymphocyte infiltration score and histological grade. The heatmap shows clustered and centred miRNA expression data (log2 transformed). Details on colour coding of the annotation bars are presented below the heatmap.
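As a rough illustration of what complete linkage with Euclidean distance means, the sketch below implements a naive agglomerative procedure in plain Python. It is not the pipeline used for this figure: the toy expression rows, the function names and the fixed `n_clusters` cut are assumptions made for the example, whereas the caption states that clusters were assigned with the PART method.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(rows, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Under complete linkage, the distance between two clusters is the
    largest pairwise distance between their members.
    """
    n_clusters = max(1, n_clusters)
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for (i, ci), (j, cj) in combinations(enumerate(clusters), 2):
            d = max(euclidean(rows[a], rows[b]) for a in ci for b in cj)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical centred log2 expression values: 5 samples x 2 miRNAs.
toy = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [9.9, 0.0]]
print(complete_linkage(toy, n_clusters=3))
# [[0, 1], [2, 3], [4]]
```

This naive loop recomputes all pairwise distances at every merge, so it is only suitable for small examples. For real data, a library routine such as `scipy.cluster.hierarchy.linkage` with `method="complete"` and `metric="euclidean"` would be the more practical choice.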

Extended Data Figure 6 Rearrangement cluster groups and associated features.

a, Overall survival (OS) by rearrangement cluster group. b, Age of diagnosis. c, Tumour grade. d, Menopausal status. e, ER status. f, Immune response metagene panel. g, Lymphocytic infiltration score.

Extended Data Figure 7 Contrasting tandem duplication phenotypes.

Contrasting tandem duplication phenotypes of two breast cancers using chromosome X. Copy number (y axis) depicted as black dots. Lines represent rearrangement breakpoints (green, tandem duplications; pink, deletions; blue, inversions; black, translocations with partner breakpoint provided). Top, PD4841a has numerous large tandem duplications (>100 kb, rearrangement signature 1), whereas PD4833a has many short tandem duplications (<10 kb, rearrangement signature 3) appearing as ‘single’ lines in its plot.

Extended Data Figure 8 Hotspots of tandem duplications.

A tandem duplication hotspot occurring in six different patients.

Extended Data Figure 9 Rearrangement breakpoint junctions.

a, Breakpoint features of rearrangements in 560 breast cancers by rearrangement signature. b, Breakpoint features in BRCA and non-BRCA cancers.

Extended Data Figure 10 Signatures of focal hypermutation.

a, Kataegis and alternative kataegis occurring at the same locus (ERBB2 amplicon in PD13164a). Copy number (y axis) depicted as black dots. Lines represent rearrangement breakpoints (green, tandem duplications; pink, deletions; blue, inversions). Top, an ~10 Mb region including the ERBB2 locus. Middle, zoomed-in tenfold to an ~1 Mb window highlighting co-occurrence of rearrangement breakpoints, with copy number changes and three different kataegis loci. Bottom, demonstrates kataegis loci in more detail. log10 intermutation distance on y axis. Black arrow, kataegis; blue arrows, alternative kataegis. b, Sequence context of kataegis and alternative kataegis identified in this data set.
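The y axis of the bottom panel, log10 intermutation distance, is the log of the gap between each mutation and the one before it along the chromosome. A minimal sketch of that calculation follows, assuming a list of mutation coordinates from a single chromosome; the function name and the example positions are hypothetical. Localized hypermutation shows up as a run of unusually small values.

```python
import math


def log10_intermutation_distances(positions):
    """Return log10 of the gap (bp) between each mutation and its predecessor.

    Positions are sorted first; duplicate coordinates (gap of 0) are skipped
    because log10(0) is undefined.
    """
    pts = sorted(positions)
    return [math.log10(b - a) for a, b in zip(pts, pts[1:]) if b > a]


# Hypothetical coordinates: gaps of 100 bp, 1,000 bp and 100,000 bp.
print(log10_intermutation_distances([100, 200, 1_200, 101_200]))
```

Plotting these values against mutation order or genomic position gives the familiar "rainfall" view, in which dense clusters appear as points dropping towards the bottom of the plot.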

Supplementary information

Supplementary Information

This file contains Supplementary Methods and Data and additional references. (PDF 2344 kb)

Supplementary Information

This file contains some acknowledgements and the EGA accession numbers. (PDF 181 kb)

Supplementary Tables

This zipped file contains Supplementary Tables 1-21. (ZIP 42202 kb)

Cite this article

Nik-Zainal, S., Davies, H., Staaf, J. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 534, 47–54 (2016).
