The predictive ability of the 313 variant–based polygenic risk score for contralateral breast cancer risk prediction in women of European ancestry with a heterozygous BRCA1 or BRCA2 pathogenic variant

Purpose: To evaluate the association between a previously published 313 variant–based breast cancer (BC) polygenic risk score (PRS313) and contralateral breast cancer (CBC) risk in BRCA1 and BRCA2 pathogenic variant heterozygotes.
Methods: We included women of European ancestry with a prevalent first primary invasive BC (BRCA1 = 6,591 with 1,402 prevalent CBC cases; BRCA2 = 4,208 with 647 prevalent CBC cases) from the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA), a large international retrospective series. Cox regression analysis was performed to assess the association between overall and estrogen receptor (ER)-specific PRS313 and CBC risk.
Results: For BRCA1 heterozygotes, the ER-negative PRS313 showed the largest association with CBC risk, hazard ratio (HR) per SD = 1.12, 95% confidence interval (CI) (1.06–1.18), C-index = 0.53; for BRCA2 heterozygotes, this was the ER-positive PRS313, HR = 1.15, 95% CI (1.07–1.25), C-index = 0.57. Adjusting for family history, age at diagnosis, treatment, or pathological characteristics of the first BC did not change the association effect sizes. For women developing a first BC before age 40 years, the cumulative 10-year CBC risks at the 5th and 95th percentiles of the PRS313 were 22% and 32% for BRCA1 heterozygotes and 13% and 23% for BRCA2 heterozygotes, respectively.
Conclusion: The PRS313 can be used to refine individual CBC risks for BRCA1/2 heterozygotes of European ancestry; however, the PRS313 needs to be considered in the context of a multifactorial risk model to evaluate whether it might influence clinical decision-making.


Genotyping and Polygenic Risk Score calculation
For most of the participants, genotyping was performed with the Illumina OncoArray [1], comprising 533,631 SNPs. The remaining participants were genotyped with the Illumina iCOGS array, containing 211,155 SNPs [2]. Details about the quality control procedures and correlation between the arrays have been described previously [3][4][5][6][7][8]. European ancestry was determined using genetic data and multidimensional scaling. As previously published: "We excluded individuals of non-European ancestry using multi-dimensional scaling. For this purpose we selected 30,733 uncorrelated autosomal SNPs (pair-wise r² < 0.10) to compute the genomic kinship between all pairs of BRCA1 and BRCA2 carriers, along with 267 HapMap samples (CHB, JPT, YRI and CEU). These were converted to distances and subjected to multidimensional scaling. Using the first two components, we calculated the proportion of European ancestry for each individual and excluded samples with >27% non-European ancestry to ensure that samples of Ashkenazi Jewish ancestry were included in the final sample" [6]. Imputation of variants not on the genotyping arrays was performed with IMPUTE2 [9], after prephasing with SHAPEIT [10], using 1000 Genomes phase 3 as a reference panel. Imputation quality scores for the variants used in this study are shown in Table S2.
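The multidimensional scaling step in the quoted passage can be illustrated with a minimal sketch. It is not the published pipeline: the function name `classical_mds`, the toy coordinates, and the pure-Python power iteration are assumptions made for illustration, and a real analysis would use a linear algebra library on kinship-derived distances. How the proportion of European ancestry was derived from the first two components is not described here, so that step is not reproduced.

```python
import math


def classical_mds(dist, n_components=2, n_iter=1000):
    """Classical (Torgerson) MDS on a symmetric distance matrix (list of lists).

    Assumes the distances are approximately Euclidean, so the leading
    eigenvalues of the double-centred matrix are positive.
    Returns one coordinate list of length n_components per sample.
    """
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]

    coords = [[0.0] * n_components for _ in range(n)]
    for k in range(n_components):
        # Power iteration for the leading eigenvector of the deflated matrix.
        v = [float(i + 1) for i in range(n)]  # arbitrary deterministic start
        for _ in range(n_iter):
            w = matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        eigval = sum(vi * wi for vi, wi in zip(v, matvec(b, v)))
        if eigval <= 1e-12:
            break  # remaining components stay at zero
        scale = math.sqrt(eigval)
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next component.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Toy example (hypothetical data): four samples at the corners of a rectangle.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
coords = classical_mds(dist)
```

The recovered coordinates are defined only up to rotation and reflection, so they should be compared through pairwise distances rather than coordinate by coordinate.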
We used the 313-variant-based PRS for breast cancer developed in an independent study using data from the general population, as described previously [11]; correlation between the PRS based on the two genotyping arrays was high [8]. The PRS for overall breast cancer (PRS313) and two ER-specific PRS, the ER-positive PRS313 and the ER-negative PRS313, were calculated. For all three PRS, the same 313 variants were used for calculation with the following formula:

$$\mathrm{PRS}_j = \sum_{i=1}^{313} w_i x_{ij}$$

in which $x_{ij}$ is the number of risk alleles (0, 1 or 2) for variant $i$ carried by individual $j$ and $w_i$ is the weight associated with variant $i$. All weights were derived from the analysis of data from the Breast Cancer Association Consortium (BCAC) [11]; for the ER-positive and ER-negative PRS313, ER-specific weights were used for the subset of 116 variants with a significant difference in the effect size by subtype. The variants and their corresponding weights used in the PRS are listed in Table S2, as published previously [11]. The three PRS were standardized to the mean from all CIMBA participants, including both unaffected and affected women, and to the SD in the BCAC population controls that were included in the validation dataset [11]. The SDs used were 0.61, 0.65 and 0.59 for the PRS313, ER-positive PRS313 and ER-negative PRS313, respectively. Using these SDs, the HR estimates for the associations of the standardized PRS313 in our study are directly comparable with the OR estimates reported in the BCAC population-based study [11] and the HR estimates reported for primary breast cancer in BRCA1 and BRCA2 heterozygotes [7].
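As a rough illustration of the calculation and standardization described above, the sketch below computes a weighted sum of risk-allele counts, centres it on the cohort mean, and scales it by an external SD. The function names, the three toy variants, and their weights are hypothetical; only the SD of 0.61 for the overall PRS313 is taken from the text. The published weights are those in Table S2.

```python
def raw_prs(allele_counts, weights):
    """Weighted sum of risk-allele counts for one individual."""
    if len(allele_counts) != len(weights):
        raise ValueError("one weight is needed per variant")
    return sum(w * x for w, x in zip(weights, allele_counts))


def standardize(prs_values, reference_sd):
    """Centre on the mean of the supplied cohort and scale by an external SD."""
    mean = sum(prs_values) / len(prs_values)
    return [(p - mean) / reference_sd for p in prs_values]


# Hypothetical weights for three variants (the real score uses 313).
weights = [0.10, -0.05, 0.20]

# Risk-allele counts (0, 1 or 2) per variant for three hypothetical individuals.
cohort = [
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 0],
]

raw = [raw_prs(person, weights) for person in cohort]

# 0.61 is the SD reported in the text for the overall PRS313.
z = standardize(raw, reference_sd=0.61)
```

Because the scaling SD comes from an external control population rather than from the cohort itself, the standardized values in `z` have mean zero but need not have unit variance within the cohort.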

Figure S1: Flow chart of the inclusion of CIMBA participants
Flow chart of the inclusion and exclusion of CIMBA participants for this study.
Abbreviation: N, Number

Figure S2: Time at risk in the association analyses
The time at risk was assumed to start one year after the first breast cancer. Participants were censored at (i) age at baseline, (ii) bilateral risk-reducing mastectomy or (iii) death, whichever was earlier. Baseline age was defined as the age at local ascertainment (97%), or when this was not known, age at genetic testing (2%) or age at last follow-up (1%). Incidence of a metachronous contralateral breast cancer, invasive or in situ, before baseline was considered as an event in the main analyses.
c Cox regression model for the association between the PRS and contralateral breast cancer, stratified by country, clustered on family membership, and adjusted for birth cohort (quartiles of the observed distribution).
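The time-at-risk definition in the Figure S2 caption can be sketched as a small function. This is an illustration under stated assumptions, not the study's code: the function and argument names are hypothetical, ages are assumed to be in years, and a contralateral breast cancer diagnosed within the first year after the first breast cancer is assumed to contribute no time at risk.

```python
def time_at_risk(age_first_bc, age_baseline, age_cbc=None,
                 age_brrm=None, age_death=None):
    """Return (entry_age, exit_age, event) or None if there is no time at risk.

    entry_age: one year after the first breast cancer.
    exit_age:  contralateral breast cancer (event), or the earliest of
               baseline, bilateral risk-reducing mastectomy and death.
    """
    entry = age_first_bc + 1.0
    censor = min(a for a in (age_baseline, age_brrm, age_death)
                 if a is not None)

    # Assumption: a contralateral cancer within the first year is not
    # metachronous, so the participant contributes no time at risk.
    if age_cbc is not None and age_cbc <= entry:
        return None

    # Contralateral cancer before any censoring age counts as an event.
    if age_cbc is not None and age_cbc <= censor:
        return entry, age_cbc, True

    # Censored before follow-up could start.
    if censor <= entry:
        return None

    return entry, censor, False


# Hypothetical participants.
with_event = time_at_risk(40, 55, age_cbc=47)
censored_at_brrm = time_at_risk(40, 55, age_cbc=50, age_brrm=45)
no_follow_up = time_at_risk(40, 40.5)
```

The resulting entry and exit ages, together with the event indicator, are the inputs that a Cox model with delayed entry would take.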