Polygenic risk scores (PRS) for breast cancer have the potential to improve risk prediction, but there is limited information on their utility in various clinical situations. Here we show that among 122,978 women in the FinnGen study with 8401 breast cancer cases, the PRS modifies the breast cancer risk of two high-impact frameshift risk variants. Similarly, we show that after the breast cancer diagnosis, individuals with elevated PRS have an elevated risk of developing contralateral breast cancer, and that the PRS can considerably improve risk assessment among their female first-degree relatives. In more detail, women with the c.1592delT variant in PALB2 (242-fold enrichment in Finland, 336 carriers) and an average PRS (10–90th percentile) have a lifetime risk of breast cancer of 55% (95% CI 49–61%), which increases to 84% (71–97%) with a high PRS (>90th percentile), and decreases to 49% (30–68%) with a low PRS (<10th percentile). Similarly, for c.1100delC in CHEK2 (3.7-fold enrichment; 1648 carriers), the respective lifetime risks are 29% (27–32%), 59% (52–66%), and 9% (5–14%). The PRS also refines the risk assessment of women with first-degree relatives diagnosed with breast cancer, particularly among women with positive family history of early-onset breast cancer. Here we demonstrate the opportunities for a comprehensive way of assessing genetic risk in the general population, in breast cancer patients, and in unaffected family members.
In women, breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related deaths1. Approximately 5–10% of all breast cancers are estimated to develop due to high-impact germline mutations in breast cancer susceptibility genes, with up to 30% due to pathogenic mutations in BRCA1 and BRCA2 and with a smaller proportion carrying mutations in other susceptibility genes, such as PTEN, TP53, CHEK2, PALB2 and STK112. While pathogenic mutations in BRCA1 and BRCA2 are less common in Finns3, two frameshift mutations, c.1592delT (rs180177102) in PALB2 and c.1100delC (rs555607708) in CHEK2 have an unusually high allele frequency in Finland, which provides a unique opportunity to explore the impact of these mutations in the population. PALB2 (Partner and Localizer of BRCA2) encodes a key tumour suppressor protein that functions through affecting BRCA2 nuclear localisation and DNA damage response functions, and through interacting with BRCA14. The second gene, CHEK2 (Checkpoint kinase 2), is a tumour suppressor gene encoding a serine/threonine-protein kinase involved in DNA repair, cell cycle arrest and apoptosis5.
Beyond genetic predisposition caused by high-risk mutations in breast cancer susceptibility genes, breast cancer has a highly polygenic mode of inheritance. Large-scale genetic screens have to date identified over a hundred loci associated with risk of breast cancer6. These variants, and many more yet to be discovered, represent common genetic variation acting through a wide range of molecular pathways, in contrast to the rare, high-risk pathogenic variants in high-risk breast cancer susceptibility genes that often disrupt a specific pathway involved in maintaining integrity of DNA repair processes. Individually, the common variants have very small effect sizes with odds ratios usually ranging from 0.85 to 1.20, but their cumulative impact in breast cancer risk has been shown to be considerably larger7. This cumulative effect can be captured in a single measure by a polygenic risk score (PRS), the summed contribution of many common risk variants, which is able to identify women at over 3-fold risk of breast cancer, compared to women with an average risk7,8. By improving identification of these women at high risk of breast cancer, it could serve as a new tool for personalised, risk-based breast cancer screening7,9,10.
Here, we comprehensively assess the impact of germline genetic variation on risk of breast cancer and show (1) how a high breast cancer PRS compares to high-risk mutations in breast cancer susceptibility genes, (2) how the PRS modifies the risk of breast cancer in women carrying pathogenic mutations in the PALB2 and CHEK2 genes and (3) that the PRS has utility for informing about risk of contralateral breast cancer, and about the risk in first-degree relatives. We use data from the FinnGen study, which combines nationwide health registries with genomic information for 122,978 women from across the country, representing 5% of the Finnish adult female population.
We studied 122,978 women in FinnGen, with a mean age at the end of follow-up of 58.5 years (interquartile range, IQR 45.1–72.2, range 16.0–106.0). In FinnGen, 8401 (6.8%) women have been diagnosed with breast cancer, with a mean age at disease onset of 58.6 years (IQR 50.4–66.3, range 21.3–98.3 years). We first tested the association of three polygenic risk scores with breast cancer risk: a 313 SNP score7, a genome-wide score by Mars et al.10 derived using LDpred software and a new genome-wide score derived using PRS-CS11. In our data, the genome-wide scores outperformed the 313 SNP score, with hazard ratio (HR) estimates per standard deviation of 1.55 (95% confidence interval, CI 1.52–1.58), 1.63 (CI 1.60–1.67) and 1.71 (CI 1.68–1.75) for the 313 SNP score, the LDpred score and the PRS-CS score, respectively, each scaled separately to mean zero and unit variance. We therefore chose the PRS built with PRS-CS for subsequent analyses (Table 1 and Supplementary Table 1).
We then studied the allele frequencies, geographic variation and risk estimates for the two Finnish-enriched, high-impact breast cancer mutations. The allele frequency for rs180177102 (PALB2) was 0.0014 (242-fold enrichment compared to non-Finnish non-Estonian Europeans, NFEE12), with 336 heterozygote mutation carriers included in the analyses. The allele frequency for rs555607708 (CHEK2) was 0.0064 (3.7 times enriched in Finns compared to NFEE), with 1641 heterozygotes and 7 homozygote individuals.
Geographic variation of genetic risk
Considering that Finns have passed through internal genetic bottlenecks, we first aimed to characterise the geographic distribution of both the PALB2 and CHEK2 mutations, and of the PRS. Both the PALB2 and CHEK2 mutations had more carriers in Eastern Finland, with the proportion of carriers ranging from close to 0 in Western Finland, to 2.8% for CHEK2 in South Karelia and to 0.8% for PALB2 in North Karelia (Fig. 1). In contrast, we observed slightly higher proportions of individuals with high PRS in Western and Southern Finland, in line with breast cancer incidence.
The effect of frameshift mutations in PALB2 and CHEK2
Both the PALB2 and CHEK2 variants conferred considerably elevated risk for breast cancer (Table 2). The PALB2 variant conferred a risk increase for breast cancer with a HR of 4.99 (95% CI 4.02–6.20, p = 6.76 × 10−48), corresponding to a lifetime risk by age 80 of 56.1% (95% CI 50.8–61.4%). The CHEK2 variant conferred a risk increase for breast cancer with a HR of 2.19 (95% CI 1.91–2.51, p = 3.90 × 10−29), corresponding to a lifetime risk of 31.7% (95% CI 29.5–33.9%). Compared with women with a PRS between the 10th and 90th percentiles (lifetime risk 15.5%, 95% CI 15.3–15.7%), women with a PRS above the 90th percentile had a similar effect size to CHEK2 mutation carriers (HR 2.38, 95% CI 2.26–2.50, p = 1.98 × 10−230) and a similar lifetime risk (32.5%, 95% CI 31.6–33.4%). However, a high PRS affected a nearly 7-fold larger group of women (Table 1; results excluding first-degree relatives in Supplementary Table 2). Estimating these while accounting for competing risks (non-breast cancer related death) yielded 4.6%, 4.9% and 3.2% lower estimates for lifetime risks in carriers of the PALB2 and CHEK2 mutations, and women with high PRS, respectively (Supplementary Figs. 1–3 and Supplementary Table 3).
PRS modifies the risk in PALB2 and CHEK2 mutation carriers
Next, we estimated how the PRS modifies breast cancer risk in the mutation carriers. For both PALB2 and CHEK2, a high PRS further increased the breast cancer risk. In terms of lifetime risk for breast cancer by age 80, women with the PALB2 mutation and average PRS (10–90th percentile) had a lifetime risk of 55.3% (95% CI 49.4–61.2%), which increased to 83.9% (71.2–96.6%) among women with a high PRS (>90th percentile), and decreased to 49.1% (30.6–67.6%) in women with a low PRS (<10th percentile; Fig. 2 and Tables 3 and 4). Women with CHEK2 and an average PRS had a lifetime risk of 29.3% (95% CI 26.8–31.8%) which doubled to 59.2% (52.1–66.3%) in women with a high PRS and decreased to 9.3% (4.5–14.1%) in women with low PRS.
To test for possible interaction between mutation carrier status and the PRS, we first compared the PRS effect size in pooled mutation carriers (PALB2 and CHEK2) and in non-carriers. In both carriers and non-carriers, hazard ratios for the top and bottom decile of the PRS were very similar (reference group PRS 10–90%; Table 5). This was also observed in PALB2 and CHEK2 mutation carriers separately. For PALB2, the HR per SD in carriers was 1.81 (95% CI 1.34–2.44, p = 1.05 × 10−4), in carriers of CHEK2, 1.86 (1.60–2.16, p = 6.58 × 10−16), and in carriers of neither the PALB2 nor the CHEK2 mutation, the HR was 1.71 (1.67–1.74, p < 1.00 × 10−300). Similarly, in a formal test for interaction by introducing an interaction term in the regression model, we found no evidence of an interaction between the PRS and the mutations for either the PALB2 variant (p = 0.18) or the CHEK2 variant (p = 0.45).
PRS refines risk assessment in first-degree relatives
Next, we evaluated how the PRS modifies the risk conferred by a positive first-degree family history. Family history was assessed in 7715 mother–daughter pairs and 12,086 pairs of sisters, separately for family history of early-onset (age < 45) and late-onset (age ≥ 45) breast cancer. For both, PRS stratified women for breast cancer risk, but the stratification was more pronounced in family history of early-onset disease (Fig. 3 and Supplementary Table 4). Women with an average PRS (between the 10th and 90th percentiles) and positive family history of early-onset breast cancer had a lifetime risk at 32.5% (95% CI 24.0–41.0%) – a risk similar to women with a high PRS (>90th percentile) in the full dataset (32.5%, 31.6–33.4%). A combination of family history of early-onset breast cancer and a high PRS further increased the risk to 49.0% (30.1–67.9%), but with only one breast cancer case in the bottom decile we were unable to estimate the impact of a low PRS. We then tested whether family history adds to risk assessment if we know the woman’s PRS. When adjusting with a continuous PRS, the effect size for family history of early-onset breast cancer was attenuated, from HR 2.80 (95% CI 1.81–4.33, p = 4.08 × 10−6), to HR 2.32 (1.50–3.60, p = 1.72 × 10−4). Also for late-onset, the association was attenuated, from HR 1.30 (1.07–1.57, p = 0.01), to HR 1.09 (0.90–1.33, p = 0.37).
High PRS increases risk for contralateral breast cancer
Lastly, we tested the association between the PRS and contralateral breast cancer among breast cancer patients. With PRS between the 10th and 90th percentile as reference, a high PRS (>90th percentile) was associated with risk of contralateral breast cancer with HR 1.60 (95% CI 1.25–2.04, p = 0.0002), with 97 individuals out of 1604 cases with a high PRS being diagnosed with contralateral breast cancer.
Using large-scale biobank data combining longitudinal nationwide health registry data with genomic information, we show that over the life course, the breast cancer PRS strongly alters the breast cancer incidence in high-impact mutation carriers. After breast cancer diagnosis, individuals with an elevated PRS have an increased likelihood of developing contralateral breast cancer, and the PRS can considerably improve risk assessment among their female first-degree relatives.
The breast cancer PRS strongly altered the risk of breast cancer in PALB2 and CHEK2 mutation carriers, substantially increasing the risk of breast cancer in women with a high PRS, and lowering the risk in women with a low PRS. Deciding on appropriate surveillance and risk-reduction strategies is a clinical challenge particularly for moderate-risk mutations such as those in CHEK213, and our results show that additional information provided by the PRS could guide these decisions. A combination of a breast cancer PRS in the top decile and the CHEK2 mutation increased the lifetime risk to 59% – a risk comparable to that seen in PALB2 mutation carriers – whereas those with a PRS in the bottom decile had a risk similar to the population level.
That the PRS modifies the risk in PALB2 and CHEK2 mutation carriers supports previous findings suggesting that common genetic variation at least partly explains the widely observed incomplete penetrance of mutations in breast cancer susceptibility genes14,15,16,17. This variation is now measurable on an individual level with the breast cancer PRS, which captures a wide range of molecular pathways. Our results are in line with previous studies on BRCA1, BRCA2, PALB2 and CHEK2 mutation carriers, but these studies have used a case–control setting or PRSs consisting of <100 variants16,17,18,19. We conducted the study in a large longitudinal dataset with 120,000 women, using a more predictive, genome-wide PRS and leveraging the considerable enrichment of the PALB2 and CHEK2 variants in an isolated population. With the longitudinal setting, we were also able to quantify the lifetime risk in PALB2 and CHEK2 mutation carriers based on observed events over the life course, instead of calculating them using baseline risks from published studies17,18.
Harbouring pathogenic mutations in high-risk breast cancer susceptibility genes often prompts intensified medical surveillance and consideration of preventative procedures such as risk-reducing surgery. The lifetime risk estimate for individuals in the top decile of the PRS was comparable to that of CHEK2 mutation carriers – both had a risk of 32% by age 80. Considering this, our results also argue for the need for studies on the impact of targeted actions in women with a high PRS only, who currently go undetected. After the diagnosis, patients with elevated PRS had a 1.6-fold elevated risk for contralateral cancer, providing additional evidence of increased breast cancer susceptibility, a finding that might warrant intensified or prolonged surveillance in breast cancer cases with elevated PRS. This finding is in line with earlier studies showing that familial factors contribute to the risk of contralateral breast cancer20,21,22.
The proportion of mutation carriers and the elevated PRS showed differing geographical distributions. While the elevated PRS distribution followed the breast cancer incidence distribution with highest rates in the early-settlement region in South-Western Finland, the allele frequencies for the PALB2 and CHEK2 mutations were highest in the late-settlement region in Eastern Finland. It is likely that the PALB2 and CHEK2 mutations have survived both the founder bottleneck in Finland and the internal bottleneck in Eastern Finland, and are therefore heavily enriched in the Finnish population. These regional differences in both PRS and mutation frequency distributions may have an impact on regional screening strategies.
Finally, the PRS improved risk assessment of first-degree relatives of women with breast cancer, with pronounced stratification particularly for family history of early-onset disease. Family history is an essential factor guiding screening strategies of family members of breast cancer patients23, and our results show that PRS could improve the precision of this assessment.
Our study has several limitations. Our findings are limited to individuals of European ancestry and it is important to study the applicability of the results in individuals of admixed and non-European ancestry24. The FinnGen study is a mixture of population-based cohorts and samples from hospital biobanks. It is possible that the sampling may introduce biases in some of the estimates. We observed a slightly higher baseline risk compared to the NORDCAN database25. However, our key PRS estimates were similar when estimated in a FinnGen subset of population-based cohorts only. Moreover, accounting for the competing risk of mortality from other causes yielded slightly lower estimates for lifetime risks.
In conclusion, we show that a high breast cancer PRS comes with a comparable risk profile to frameshift mutations in breast cancer susceptibility genes PALB2 and CHEK2, and that the PRS strongly modifies breast cancer risk in the mutation carriers. Even after the breast cancer diagnosis, the PRS was associated with breast cancer susceptibility by increasing the risk of contralateral breast cancer, and it considerably improved risk assessment among the patient’s first-degree relatives. These results demonstrate opportunities for a more comprehensive way of assessing genetic risk in the general population, in breast cancer patients and in unaffected family members of breast cancer patients. Optimisation of these strategies in the clinical setting warrants further study.
Participants and endpoints
The data comprised 122,978 Finnish women in FinnGen Data Freeze 5. FinnGen comprises prospective epidemiological cohorts (initiated as far back as 1992), disease-based cohorts, and hospital biobank samples (Supplementary Table 5). The unique national personal identification number links the genotypes to the Finnish Cancer Registry (available from 1953, with nationwide completeness of solid tumours at 96%26), as well as to the national hospital discharge registry (1968-), the national death registry (1969-) and the medication reimbursement registry (1964-). These registries cover the whole population.
Breast cancer cases were identified through the Finnish Cancer Registry with diagnosis C50 (International Classification of Diseases for Oncology, 3rd Edition; ICD-O-3), from the drug reimbursement registry by selecting individuals with a reimbursement code for breast cancer, and from the death registry with ICD-10 C50. Contralateral breast cancer was defined as breast cancer in the opposite breast diagnosed over 6 months after the date of the primary breast cancer diagnosis, obtained from the Cancer Registry.
Genotyping and imputation
FinnGen samples were genotyped with Illumina and Affymetrix arrays (Illumina Inc., San Diego, and Thermo Fisher Scientific, Santa Clara, CA, USA), and genotype calls were made with the GenCall or zCall algorithms (for Illumina) and the AxiomGT1 algorithm (for Affymetrix). Individuals with ambiguous gender, high genotype missingness (>5%), excess heterozygosity (±4 SD) and non-Finnish ancestry were excluded, as were all variants with high missingness (>2%), low Hardy–Weinberg equilibrium p-value (<1e-6) and low minor allele count (MAC < 3). Array data pre-phasing was carried out with Eagle 2.3.527 with the number of conditioning haplotypes set to 20,000. Genotype imputation was done with Beagle 4.128 (as described in https://doi.org/10.17504/protocols.io.xbgfijw) by using the SISu v3 population-specific reference panel developed from high-quality data for 3,775 high-coverage (25–30x) whole-genome sequences in Finns.
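As an illustration of the variant-level exclusion criteria above, the following minimal sketch applies the three stated thresholds to per-variant summary statistics. The `VariantStats` container and the function name are hypothetical and do not come from the FinnGen pipeline; the sample-level filters (gender, heterozygosity, ancestry) are not covered here.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Hypothetical per-variant summary used only for this illustration."""
    variant_id: str
    missingness: float        # fraction of samples without a genotype call
    hwe_p: float              # Hardy-Weinberg equilibrium test p-value
    minor_allele_count: int   # MAC


def passes_variant_qc(v, max_missingness=0.02, min_hwe_p=1e-6, min_mac=3):
    """Keep a variant unless it meets any exclusion criterion from the text:
    missingness > 2%, HWE p-value < 1e-6, or MAC < 3."""
    return (
        v.missingness <= max_missingness
        and v.hwe_p >= min_hwe_p
        and v.minor_allele_count >= min_mac
    )


variants = [
    VariantStats("var_ok", 0.010, 0.50, 120),
    VariantStats("var_missing", 0.030, 0.50, 120),
    VariantStats("var_hwe", 0.010, 1e-8, 120),
    VariantStats("var_rare", 0.010, 0.50, 2),
]
kept = [v.variant_id for v in variants if passes_variant_qc(v)]
print(kept)
```

In this toy input only the first variant survives, since each of the other three violates exactly one threshold.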
We chose two previously reported Finnish-enriched frameshift variants for our main analyses, rs180177102 (c.1592delT) in PALB2 and rs555607708 (c.1100delC) in CHEK2. Genotype data batches with an imputation INFO score <0.8 were excluded. This excluded 13,607 women from analyses involving the PALB2 variant (mainly older disease-based cohorts), but no exclusions were needed for CHEK2. PALB2 mutation carrier status was ignored in analyses involving the CHEK2 variant, and vice versa. Women homozygous for the CHEK2 variant were analysed jointly with the heterozygotes.
Polygenic risk score
To choose our breast cancer PRS, we compared three scores: (1) a previously published PRS with 313 SNPs7, (2) another previously published, genome-wide PRS10 built with the software LDpred29 and (3) a genome-wide PRS we built with the software PRS-CS (PRS-CS-auto, with 1000 Genomes Project European sample, N = 503, as the external LD reference panel) using HapMap3 variants11. For the LDpred and PRS-CS PRSs, the input weights came from a large independent genome-wide association study (GWAS)6. To have a PRS independent of the PALB2 and CHEK2 variants, we excluded the variants within the CHEK2 gene ±3 Mb, and variants within the PALB2 gene ±2 Mb (Supplementary Fig. 4). Out of these three, the PRS-CS score showed the strongest association for breast cancer and was therefore chosen for subsequent analyses (Table 1 and Supplementary Table 1). All three PRSs showed acceptable goodness-of-fit (Supplementary Fig. 5). The final variant count for the PRS-CS PRS with PALB2 and CHEK2 excluded was 1,074,667.
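To make the scoring step concrete, the sketch below shows the general form of a PRS as a weighted sum of effect-allele dosages, followed by scaling to zero mean and unit variance. It is a simplified illustration rather than the scoring pipeline used in the study (which relied on PLINK 2.0); the variant identifiers and weights are invented, and the choice of the population standard deviation for scaling is an assumption.

```python
import statistics


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0-2) for one individual.
    Variants without a weight are ignored."""
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)


def standardise(scores):
    """Scale raw scores to mean zero and unit variance across the dataset."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)  # population SD; an assumption of this sketch
    return [(s - mean) / sd for s in scores]


# Hypothetical posterior effect sizes (log odds ratios) for three variants.
weights = {"rs_a": 0.10, "rs_b": -0.05, "rs_c": 0.02}

# Hypothetical imputed dosages for three women.
cohort = [
    {"rs_a": 2.0, "rs_b": 1.0, "rs_c": 0.0},
    {"rs_a": 0.0, "rs_b": 2.0, "rs_c": 1.0},
    {"rs_a": 1.0, "rs_b": 0.0, "rs_c": 2.0},
]

raw = [polygenic_score(person, weights) for person in cohort]
scaled = standardise(raw)
print(raw, scaled)
```

With scores scaled in this way, a hazard ratio "per SD" refers to a one-unit change in the scaled score.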
A high PRS was defined as a PRS above the 90th percentile, as it corresponds to a lifetime risk of ≥30%, which guidelines consider as the threshold for high risk23. Correspondingly, we defined a PRS below the 10th percentile as a low PRS. Individuals between the 10th to 90th percentiles served as the reference category.
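The three PRS categories can be sketched as follows. The cut points are taken from the empirical distribution of the scores; the helper names are hypothetical, and the percentile interpolation rule (the default of Python's `statistics.quantiles`) is an assumption, as the text does not state which rule was used.

```python
import statistics


def decile_cutoffs(scores):
    """Return the 10th and 90th percentile cut points of the score distribution."""
    cuts = statistics.quantiles(scores, n=10)  # nine cut points between deciles
    return cuts[0], cuts[-1]


def prs_category(score, p10, p90):
    """Classify a score as low (<10th), high (>90th) or average (reference)."""
    if score > p90:
        return "high"
    if score < p10:
        return "low"
    return "average"


# Toy scores 1..100 standing in for a standardised PRS.
scores = list(range(1, 101))
p10, p90 = decile_cutoffs(scores)
labels = [prs_category(s, p10, p90) for s in scores]
print(labels.count("low"), labels.count("average"), labels.count("high"))
```

By construction, roughly 10% of individuals fall into each of the low and high groups, and the remaining 80% form the reference category.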
Geographic variation is reported by region of birth (obtained from Statistics Finland) as the proportion of individuals with (1) the PALB2 or CHEK2 frameshift mutations, and (2) a high PRS (>90th percentile). The benchmark for these analyses was age-standardised (age in 2014) breast cancer incidence for the whole Finnish population, calculated as the mean of 5-year incidences for each hospital district over 1998–2007. The incidence data were obtained from the Finnish Cancer Registry (publicly available at https://cancerregistry.fi/statistics/). Polygon data for the Finnish map were obtained from GADM (https://gadm.org/data.html).
Population structure-related bias analysis
A population structure-related bias analysis was performed by following the approach described in detail in Kerminen et al.30. In brief, the method measures the accumulation of PRS differences between the Western and Eastern subpopulations of Finland using a “random PRS”, made from a randomly chosen set of independent (r2 < 0.1) variants with minor allele frequency >0.05 that are not associated with breast cancer (breast cancer GWAS6 p-value >0.5). If such random PRS accumulated differences between the subpopulations, that could indicate a population genetic bias in effect estimates of the GWAS, rather than a real difference in genetic susceptibility of breast cancer between the subpopulations. We found no evidence of such bias (Supplementary Fig. 6), which indicates that any detected geographic variation in the PRS is unlikely to result from a population genetic bias.
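The variant selection for such a random PRS can be sketched as below, using the three criteria named in the text (minor allele frequency >0.05, GWAS p-value >0.5, pairwise r2 < 0.1). This is a simplified illustration and not the implementation of Kerminen et al.; the greedy pruning strategy, the input format and the `r2` lookup function are all assumptions.

```python
import random


def select_null_variants(variants, r2, n_select, seed=0):
    """Pick up to n_select mutually independent, non-associated common variants.

    variants: list of dicts with keys 'id', 'maf' and 'gwas_p'
    r2:       function (id_a, id_b) -> squared correlation between two variants
    """
    # Common variants with no evidence of association with breast cancer.
    candidates = [
        v["id"] for v in variants if v["maf"] > 0.05 and v["gwas_p"] > 0.5
    ]
    # Visit candidates in random order, keeping a variant only if it is in
    # low linkage disequilibrium with everything kept so far (quadratic cost).
    rng = random.Random(seed)
    rng.shuffle(candidates)
    chosen = []
    for vid in candidates:
        if len(chosen) == n_select:
            break
        if all(r2(vid, kept) < 0.1 for kept in chosen):
            chosen.append(vid)
    return chosen


# Toy data: 'a' and 'b' are in strong LD; 'c' is too rare; 'd' is associated.
toy = [
    {"id": "a", "maf": 0.20, "gwas_p": 0.9},
    {"id": "b", "maf": 0.30, "gwas_p": 0.8},
    {"id": "c", "maf": 0.01, "gwas_p": 0.9},
    {"id": "d", "maf": 0.20, "gwas_p": 0.01},
]


def toy_r2(x, y):
    return 0.5 if {x, y} == {"a", "b"} else 0.0


picked = select_null_variants(toy, toy_r2, n_select=5)
print(picked)
```

In the toy data exactly one of the two linked variants is retained, and the rare and the associated variants are never considered.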
Risk assessment in first-degree relatives
The pairs of first-degree relatives were inferred with KING v2.2.431 by a kinship coefficient ranging between 0.177 and 0.354 (inference based on 57 K unlinked variants). To analyse the impact of family history in first-degree relatives, we randomly sampled one female relative for each woman who had at least one first-degree relative in the dataset. For mother–daughter pairs, the mother was assigned as the index relative. For sisters, we randomly assigned one to be the index relative, irrespective of age. If both women in the pair were breast cancer cases, we used the year of diagnosis to assign the woman diagnosed earlier as the index. Some individuals appeared several times as non-index individuals, which may occur when, for instance, a woman is the daughter of one index individual and the sister of another – we therefore randomly sampled the data to contain each non-index individual only once. We then inferred the risk of breast cancer in these unique non-index individuals. We analysed separately family history of early-onset (age < 45) and late-onset (age ≥ 45) breast cancer.
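The kinship-based inference can be illustrated with a small classifier. Only the first-degree range (0.177 to 0.354) comes from the text above; the remaining cut-offs follow the ranges given in the KING documentation, and the function name is hypothetical. Note that the kinship coefficient alone does not separate mother–daughter pairs from sisters, so this sketch stops at the degree of relatedness.

```python
def king_relationship(kinship):
    """Map a KING kinship coefficient to a degree of relatedness.

    The first-degree range is the one used in the text; the other
    thresholds are taken from the KING documentation.
    """
    if kinship > 0.354:
        return "duplicate or monozygotic twin"
    if kinship >= 0.177:
        return "first-degree"
    if kinship >= 0.0884:
        return "second-degree"
    if kinship >= 0.0442:
        return "third-degree"
    return "unrelated or more distant"


for k in (0.50, 0.25, 0.125, 0.06, 0.01):
    print(k, king_relationship(k))
```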
We estimated HRs and 95% CIs with the Cox proportional hazards model, and used Schoenfeld residuals and log–log inspection for assessing the proportional hazards assumption. Start of follow-up was set at birth, and follow-up ended at the first record of the endpoint of interest, death or at the end of follow-up on 31 December 2018, whichever came first. All tests were two-tailed. In all survival analyses, we used age as the time scale, with 63 batches and the first 10 principal components as covariates. The only exception was the analysis on contralateral breast cancer, where follow-up started from the diagnosis, and age was included as a covariate.
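The follow-up definition for the main analyses (start at birth, age as the time scale, exit at the first of endpoint, death or 31 December 2018) can be sketched as follows. The function name and the conversion of days to years with 365.25 are assumptions of this illustration; the contralateral analysis, which starts follow-up at diagnosis, is not covered.

```python
from datetime import date

END_OF_FOLLOW_UP = date(2018, 12, 31)


def follow_up(birth, event=None, death=None, end=END_OF_FOLLOW_UP):
    """Return (age at exit in years, whether the endpoint was observed).

    Exit is the earliest of endpoint date, death date and end of follow-up.
    Women exiting for any reason other than the endpoint are censored.
    """
    exit_date = min(d for d in (event, death, end) if d is not None)
    observed = event is not None and event == exit_date
    age_at_exit = (exit_date - birth).days / 365.25
    return age_at_exit, observed


# Diagnosed at age 60: endpoint observed.
print(follow_up(date(1950, 1, 1), event=date(2010, 1, 1)))
# No diagnosis, alive at end of follow-up: censored at 31 December 2018.
print(follow_up(date(1960, 1, 1)))
# Died before any diagnosis: censored at death.
print(follow_up(date(1940, 6, 1), death=date(2000, 6, 1)))
```

The resulting pair of exit age and event indicator is the input format expected by standard Cox regression software when age is used as the time scale.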
Goodness-of-fit for the PRS was assessed with a method proposed by May & Hosmer for a Cox proportional hazards model32. In line with previous studies on breast cancer susceptibility genes, we assessed lifetime risk (cumulative incidence without competing risks) by age 8014,33. Lifetime risk was estimated from the adjusted survival curves, with 95% CIs obtained by normal approximation. The adjusted survival curves were plotted with the R package survminer. This presents the expected survival curves separately for subgroups, based on the Cox model. To estimate the covariate-adjusted cumulative incidence functions in the presence of competing risks, we used the Stata module stcompadj34. The competing event was non-breast cancer causes of death, and covariates were assumed to have similar effects on the main and competing events.
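The lifetime risk and its interval can be illustrated with a short calculation: the risk by age 80 is one minus the adjusted survival probability at age 80, and the 95% CI is the estimate plus or minus 1.96 standard errors. The function names are hypothetical, and the standard error of 0.027 used below is not reported in the article; it is back-calculated from the published interval for PALB2 carriers purely for illustration.

```python
def lifetime_risk(survival_at_80):
    """Cumulative incidence by age 80, ignoring competing risks."""
    return 1.0 - survival_at_80


def normal_ci(estimate, se, z=1.96):
    """Two-sided 95% confidence interval by normal approximation."""
    return estimate - z * se, estimate + z * se


# Illustrative values: survival 0.439 corresponds to the reported 56.1%
# lifetime risk for PALB2 carriers; the SE is an assumption (see lead-in).
risk = lifetime_risk(0.439)
low, high = normal_ci(risk, 0.027)
print(f"{risk:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Because the normal approximation is symmetric and unbounded, intervals for risks near 0% or 100% can be wide at the extremes, which is worth keeping in mind for small subgroups.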
Interactions between the PRS and the pathogenic mutations were assessed (1) by comparing the PRS effect sizes in pooled and non-pooled mutation carriers and non-carriers (with the PRS scaled to zero mean and unit variance within the whole dataset), and (2) formally by introducing an interaction term for the mutation and the continuous PRS. For data and variant handling and PRS calculation, we used BCFtools versions 1.7 and 1.9, and PLINK 2.0. For statistical analyses, we used R 3.6.3 and Stata 16.0 (College Station, TX, USA). Cromwell and WOMtool were used for workflow handling.
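The formal interaction test in (2) can be written out as below. The notation is ours and is only a sketch of the model form: M is the carrier indicator (0 or 1), PRS is the standardised score, z collects the covariates (genotyping batch and principal components), and t is age.

```latex
h(t \mid \mathrm{PRS}, M, \mathbf{z})
  = h_0(t)\,\exp\!\bigl(\beta_1\,\mathrm{PRS} + \beta_2\,M
      + \beta_3\,(\mathrm{PRS}\times M) + \boldsymbol{\gamma}^{\top}\mathbf{z}\bigr),
\qquad H_0:\ \beta_3 = 0 .
```

Under this parameterisation the HR per SD of the PRS is exp(β1) in non-carriers and exp(β1 + β3) in carriers. If β3 = 0, the PRS and the mutation act multiplicatively on the hazard, so the PRS shifts risk by the same relative amount in carriers and non-carriers.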
The FinnGen project is approved by the Finnish Institute for Health and Welfare (THL; approval number THL/2031/6.02.00/2017, amendments THL/1101/5.05.00/2017, THL/341/6.02.00/2018, THL/2222/6.02.00/2018, THL/283/6.02.00/2019), Digital and population data service agency VRK43431/2017-3, VRK/6909/2018-3, the Social Insurance Institution (KELA) KELA 58/522/2017, KELA 131/522/2018, KELA 70/522/2019 and Statistics Finland TK-53-1041-17.
Patients and control subjects in FinnGen provided informed consent for biobank research, based on the Finnish Biobank Act. Alternatively, older research cohorts, collected prior to the start of FinnGen (in August 2017), were collected based on study-specific consents and later transferred to the Finnish biobanks after approval by Valvira, the National Supervisory Authority for Welfare and Health. Recruitment protocols followed the biobank protocols approved by Valvira. The Ethics Review Board of the Hospital District of Helsinki and Uusimaa approved the FinnGen study protocol Nr HUS/990/2017.
The Biobank Access Decisions for FinnGen samples and data utilised in FinnGen Data Freeze 5 include: THL Biobank BB2017_55, BB2017_111, BB2018_19, BB_2018_34, BB_2018_67, BB2018_71, BB2019_7, Finnish Red Cross Blood Service Biobank 7.12.2017, Helsinki Biobank HUS/359/2017, Auria Biobank AB17-5154, Biobank Borealis of Northern Finland_2017_1013, Biobank of Eastern Finland 1186/2018, Finnish Clinical Biobank Tampere MH0004, Central Finland Biobank 1-2017 and Terveystalo Biobank STB 2018001. Analyses of potential geographic bias of PRS were done with THL biobank permission BB2019_44.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
The FinnGen data may be accessed through Finnish Biobanks’ FinBB portal (web link: www.finbb.fi). The GWAS summary statistics used for constructing our main PRS are available at http://bcac.ccge.medschl.cam.ac.uk/bcacdata/oncoarray/oncoarray-and-combined-summary-result/gwas-summary-results-breast-cancer-risk-2017/, with contact information at http://bcac.ccge.medschl.cam.ac.uk/contact/. The weights for our main PRS are available at PGS Catalog at https://www.PGSCatalog.org/score/PGS000335/, and the previously published PRSs at https://www.PGSCatalog.org/score/PGS000004/ and https://www.PGSCatalog.org/score/PGS000332/. The remaining data are available within the Article, Supplementary Information or available from the authors upon request.
The full genotyping and imputation protocol for FinnGen is described at https://doi.org/10.17504/protocols.io.xbgfijw.
Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68, 394–424 (2018).
Economopoulou, P., Dimitriadis, G. & Psyrri, A. Beyond BRCA: new hereditary breast cancer susceptibility genes. Cancer Treat. Rev. 41, 1–8 (2015).
Vehmanen, P. et al. Low proportion of BRCA1 and BRCA2 mutations in Finnish breast cancer families: evidence for additional susceptibility genes. Hum. Mol. Genet. 6, 2309–2315 (1997).
Ducy, M. et al. The tumor suppressor PALB2: Inside out. Trends Biochem. Sci. 44, 226–240 (2019).
Nevanlinna, H. & Bartek, J. The CHEK2 gene and inherited breast cancer susceptibility. Oncogene 25, 5912–5919 (2006).
Michailidou, K. et al. Association analysis identifies 65 new breast cancer risk loci. Nature 551, 92–94 (2017).
Mavaddat, N. et al. Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am. J. Hum. Genet. 104, 21–34 (2019).
Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
Lee, A. et al. BOADICEA: A comprehensive breast cancer risk prediction model incorporating genetic and nongenetic risk factors. Genet. Med. 21, 1708–1718 (2019).
Mars, N. et al. Polygenic and clinical risk scores and their impact on age at onset and prediction of cardiometabolic diseases and common cancers. Nat. Med. 26, 549–557 (2020).
Ge, T., Chen, C. Y., Ni, Y., Feng, Y. A. & Smoller, J. W. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Tung, N. et al. Counselling framework for moderate-penetrance cancer-susceptibility mutations. Nat. Rev. Clin. Oncol. 13, 581–588 (2016).
Antoniou, A. C. et al. Breast-cancer risk in families with mutations in PALB2. New Engl. J. Med. 371, 497–506 (2014).
Antoniou, A. C. et al. A comprehensive model for familial breast cancer incorporating BRCA1, BRCA2 and other genes. Br. J. Cancer 86, 76–83 (2002).
Kuchenbaecker, K. B. et al. Evaluation of polygenic risk scores for breast and ovarian cancer risk prediction in BRCA1 and BRCA2 mutation carriers. J. Natl. Cancer Inst. 109, djw302 (2017).
Muranen, T. A. et al. Genetic modifiers of CHEK2*1100delC-associated breast cancer risk. Genet. Med. 19, 599–603 (2017).
Gallagher, S. et al. Association of a polygenic risk score with breast cancer among women carriers of high- and moderate-risk breast cancer genes. JAMA Netw. Open 3, e208501 (2020).
Fahed, A. C. et al. Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. Nat. Commun. 11, 3635 (2020).
Reiner, A. S. et al. Breast cancer family history and contralateral breast cancer risk in young women: An update from the women’s environmental cancer and radiation epidemiology study. J. Clin. Oncol. 36, 1513–1520 (2018).
Narod, S. A., Kharazmi, E., Fallah, M., Sundquist, K. & Hemminki, K. The risk of contralateral breast cancer in daughters of women with and without breast cancer. Clin. Genet. 89, 332–335 (2016).
Robson, M. E. et al. Association of common genetic variants with contralateral breast cancer risk in the WECARE study. J. Natl. Cancer Inst. 109, djx051 (2017).
National Collaborating Centre for Cancer. NICE clinical guidelines, no. 164. Familial breast cancer: classification and care of people at risk of familial breast cancer and management of breast cancer and related risks in people with a family history of breast cancer (2013).
Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
Danckert, B. et al. NORDCAN: cancer incidence, mortality, prevalence and survival in the Nordic countries, version 8.2 (26.03.2019). http://www.dep.iarc.fr/nordcan/. Accessed on 13 July 2020.
Leinonen, M. K., Miettinen, J., Heikkinen, S., Pitkaniemi, J. & Malila, N. Quality measures of the population-based Finnish cancer registry indicate sound data quality for solid malignant tumours. Eur. J. Cancer 77, 31–39 (2017).
Loh, P. R. et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 48, 1443–1448 (2016).
Browning, B. L. & Browning, S. R. Genotype imputation with millions of reference samples. Am. J. Hum. Genet. 98, 116–126 (2016).
Vilhjalmsson, B. J. et al. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am. J. Hum. Genet. 97, 576–592 (2015).
Kerminen, S. et al. Geographic variation and bias in the polygenic scores of complex diseases and traits in Finland. Am. J. Hum. Genet. 104, 1169–1181 (2019).
Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
May, S. & Hosmer, D. W. A simplified method of calculating an overall goodness-of-fit test for the Cox proportional hazards model. Lifetime Data Anal. 4, 109–120 (1998).
Yang, X. et al. Ovarian and breast cancer risks associated with pathogenic variants in RAD51C and RAD51D. J. Natl. Cancer Inst. 112, djaa030 (2020).
Coviello, E. stcompadj: Stata module to estimate the covariate-adjusted cumulative incidence function in the presence of competing risks. Statistical Software Components S457063 (Department of Economics, Boston College, 2009).
We would like to thank Sari Kivikko, Huei-Yi Shen and Ulla Tuomainen for management assistance. The following biobanks are acknowledged for collecting the FinnGen project samples: Auria Biobank (https://www.auria.fi/biopankki), THL Biobank (https://thl.fi/fi/web/thl-biopankki), Helsinki Biobank (https://www.terveyskyla.fi/helsinginbiopankki), Biobank Borealis of Northern Finland (https://www.oulu.fi/university/node/38474), Finnish Clinical Biobank Tampere (https://www.tays.fi/en-US/Research_and_development/Finnish_Clinical_Biobank_Tampere), Biobank of Eastern Finland (https://ita-suomenbiopankki.fi), Central Finland Biobank (https://www.ksshp.fi/fi-FI/Potilaalle/Biopankki), Finnish Red Cross Blood Service Biobank (https://www.veripalvelu.fi/verenluovutus/biopankkitoiminta) and Terveystalo Biobank (https://www.terveystalo.com/fi/Yritystietoa/Terveystalo-Biopankki/Biopankki/). All Finnish Biobanks are members of the BBMRI.fi infrastructure (www.bbmri.fi). We also thank the study participants for their generous participation at THL Biobank and the National FINRISK study. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The FinnGen project is funded by two grants from Business Finland (HUS 4685/31/2016 and UH 4386/31/2016) and by twelve industry partners (AbbVie Inc, AstraZeneca UK Ltd, Biogen MA Inc, Celgene Corporation, Celgene International II Sàrl, Genentech Inc, Merck Sharp & Dohme Corp, Pfizer Inc., GlaxoSmithKline Intellectual Property Development Ltd., Sanofi US Services Inc., Maze Therapeutics Inc., Janssen Biotech Inc and Novartis AG).
This work was supported by the Sigrid Jusélius Foundation (to S.R., A.P., M.P., and H.J.); University of Helsinki HiLIFE Fellow grants 2017–2020 (to S.R.); Academy of Finland Center of Excellence in Complex Disease Genetics (grant number 312062 to S.R., 312074 to A.P., 312075 to M.D., 312073 to J.K., 312076 to M.P.); Academy of Finland (grant number 331671 to N.M., 285380 to S.R., 128650 to A.P., 308248 to J.K., 288509 to M.P., 218068 and 131449 to H.J.); The Finnish Innovation Fund Tekes (grant number 2273/31/2017 to E.W.); Foundation and the Horizon 2020 Research and Innovation Programme (grant number 667301 (COSYN) to A.P.); Cancer Foundation Finland sr (to T.M.); Cancer Society of Finland (to H.J.); and Jane and Aatos Erkko Foundation (to H.J.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
A.P. is a member of the Pfizer Genetics Scientific Advisory Panel. H.J. has a co-appointment at Orion Pharma, has received fees from Neutron Therapeutics, and owns stocks of Orion Pharma and Sartar Therapeutics. The remaining authors declare no competing interests.
Peer review information: Nature Communications thanks Peter Kraft and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Mars, N., Widén, E., Kerminen, S. et al. The role of polygenic risk and susceptibility genes in breast cancer over the course of life. Nat Commun 11, 6383 (2020). https://doi.org/10.1038/s41467-020-19966-5