Limitations in next-generation sequencing-based genotyping of breast cancer polygenic risk score loci

The consideration of polygenic risk scores (PRSs) in individual risk prediction is increasingly implemented in genetic testing for hereditary breast cancer (BC) based on next-generation sequencing (NGS). To calculate individual BC risks, the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA) with the inclusion of the BCAC 313 or the BRIDGES 306 BC PRS is commonly used. The PRS calculation depends on accurately reproducing the variant allele frequencies (AFs) and, consequently, the distribution of PRS values anticipated by the algorithm. Here, the 324 loci of the BCAC 313 and the BRIDGES 306 BC PRS were examined in the population-specific database gnomAD and in real-world data sets of five centers of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC), to determine whether these expected AFs can be reproduced by NGS-based genotyping. Four PRS loci were non-existent in gnomAD v3.1.2 non-Finnish Europeans, and a further 24 loci showed noticeably deviating AFs. In real-world data, between 11 and 23 loci were reported with noticeably deviating AFs, and were shown to have effects on final risk prediction. Deviations depended on the sequencing approach, variant caller and calling mode (forced versus unforced) employed. Therefore, this study demonstrates the need to apply quality assurance not only in terms of sequencing coverage but also in terms of observed AFs in a sufficiently large cohort, when implementing PRSs in a routine diagnostic setting. Furthermore, future PRS design should be guided by the technical reproducibility of expected AFs across commonly used genotyping methods, especially NGS, in addition to the observed effect sizes.


INTRODUCTION
The German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC) is a consortium of interdisciplinary university centers specialized in providing counseling, genetic testing, and healthcare for individuals at risk for familial breast and ovarian cancer (BC/OC). Clinical management of women found to be at increased risk for BC/OC, due to inherited pathogenic variants in established BC/OC risk genes or a strong family history of cancer, demands accurate and age-dependent risk estimates. Numerous studies demonstrated that the effects of BC susceptibility loci, i.e., common single nucleotide variants (SNVs) and short indels, which individually contribute only slightly to individual BC risks, but whose effects can be summed up to polygenic risk scores (PRSs), can achieve a clinically relevant degree of BC risk discrimination [1][2][3]. As the contribution of the PRS to BC risks has also been confirmed for carriers of a pathogenic variant in moderate- to high-penetrant BC risk genes [4][5][6][7], the inclusion of PRSs in individual BC risk prediction is increasingly implemented in GC-HBOC centers [8].
The Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA), which is implemented in the CE-marked CanRisk web interface, provides (since v5) the straightforward inclusion of germline genetic test results, cancer family history, non-genetic risk factors and (if available) PRSs in a comprehensive model [9][10][11]. It is, therefore, widely applied for individual BC risk prediction in routine diagnostics of the GC-HBOC centers. The CanRisk web interface allows the specification of individual PRSs either as manual input (including specification of the square root of the proportion of the overall polygenic variance explained) or, for a given set of PRSs, via upload of a VCF file with the genotype or dosage information per locus to consider. Whichever method is chosen, genotyping is the responsibility of the user. For PRSs for which VCF upload is supported, CanRisk provides specifications for incorporated loci, each including the variant (chromosome, genomic position for hg19, reference and effect allele), log odds ratio (i.e., effect size), and expected AF [12]. The given alleles and AFs arise from high-throughput genotyping using one of two arrays, iCOGS13 or OncoArray [2]. The AFs are not directly included in the calculation of an individual, raw PRS, which is defined as the sum over the product of the number of observed effect alleles and corresponding effect size per PRS locus. However, observing AFs similar to the expected AFs in a sufficiently large cohort can be considered a quality criterion for PRS genotyping. The expected AFs are one of the core assumptions of the algorithm, as they determine the distribution of raw PRS values.
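The definition of the raw PRS given above can be illustrated with a minimal sketch. The function name, the locus labels, and all numeric values are hypothetical and are not taken from any CanRisk specification file; the sketch only shows the sum over the product of effect allele count and effect size.

```python
from typing import Mapping


def raw_prs(dosages: Mapping[str, float],
            log_odds_ratios: Mapping[str, float]) -> float:
    """Sum over loci of (number of effect alleles) x (log odds ratio)."""
    missing = set(log_odds_ratios) - set(dosages)
    if missing:
        # Genotyping is the user's responsibility, so fail loudly on gaps.
        raise ValueError(f"no genotype for loci: {sorted(missing)}")
    return sum(dosages[locus] * beta
               for locus, beta in log_odds_ratios.items())


# Hypothetical toy values for illustration only.
betas = {"locus_a": 0.05, "locus_b": -0.03, "locus_c": 0.10}
genotype = {"locus_a": 2, "locus_b": 1, "locus_c": 0}
score = raw_prs(genotype, betas)
```

Note that the expected AFs do not appear in this sum; they matter only for the distribution against which a raw score is interpreted.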
In the GC-HBOC centers, the BCAC 313 BC PRS, and its modified version, the BRIDGES 306 BC PRS [13], are the preferred PRS variant sets used for BC risk prediction. The germline genetic testing and genotyping of PRS loci are based on next-generation sequencing (NGS), e.g., using the TruRisk® or further specifically adapted multi-gene panels, whole-exome or whole-genome sequencing (WGS). The BRIDGES 306 BC PRS excludes loci of the original BCAC 313 BC PRS that were found not appropriately designable using NGS, some of which were replaced by corresponding loci in linkage disequilibrium [13]. The assessment of designability was mainly based on sufficient read coverage for diagnostic purposes when using a multi-gene panel approach and mapping to human reference hg19. With the implementation of BC PRS analysis in routine diagnostics and the establishment of corresponding bioinformatic workflows, further technical challenges besides insufficient coverage were identified, e.g., missing variant calls or variant calling resulting in deviating alleles. Studies systematically assessing and comparing the quality and pitfalls of germline genotyping using either arrays or NGS approaches are rare and mainly date from the early days of the establishment of NGS in clinical diagnostics [14][15][16][17]. Hence, it cannot be excluded that the conclusions drawn (which were also contradictory with regard to NGS or array being the more reliable and preferable approach) were based on now predominantly outdated technologies. Nevertheless, it is well-known that the accuracy of NGS tends to be hampered in genomic regions of low complexity, i.e., homopolymer runs, tandem repeats and strongly biased GC contents, among others [18][19][20]. In the Genome Aggregation Database (gnomAD), the largest and most widely used population-specific variant database, variants located in so-called low-complexity regions are flagged, to indicate that reported AFs may be erroneous [21,22].
In this study, the Bioinformatics Working Group of the GC-HBOC conducted a systematic evaluation across GC-HBOC centers to develop a detailed, locus-wise assessment of technical pitfalls and possible sources of error in NGS-based PRS genotyping. A three-stage approach was followed. First, the AF of PRS variants was compared to the gnomAD AF for the European general population and it was checked whether the variants can be converted to the hg38 reference genome. Second, PRS variant AFs in real-world data sets provided by participating GC-HBOC centers were compared to the AFs expected by CanRisk. Third, possible workarounds for use in clinical diagnostics, i.e., usage of alternative alleles and proxies, were identified. The presented results are of relevance beyond diagnostics for BC risk prediction, as they demonstrate fundamental difficulties in NGS-based PRS computation, especially for PRSs developed based on array data. Furthermore, the results underline the necessity of a comprehensive technical evaluation of PRS variant genotyping in clinical use, as the predictive ability of an individual PRS crucially depends on the assumptions made about the underlying AFs.

MATERIALS AND METHODS
Variant annotation
For denoting variants, dbSNP identifiers and gnomAD-like annotations were used throughout the manuscript. The corresponding HGVS annotations are listed in Supplementary Table 1.

Evaluation of expected allele frequencies & convertibility to hg38
Two BC PRS variant sets were considered, namely, the BCAC 313 and the BRIDGES 306 BC PRS. Of the two sets, 295 loci are identical, 18 loci are unique to the BCAC 313 BC PRS, and a further 11 loci are unique to the BRIDGES 306 BC PRS, resulting in a total number of N = 324 variants to be considered. Expected AFs were extracted from the corresponding PRS specification files at the CanRisk knowledge base [12]. Additionally, AFs in the non-Finnish European (NFE) general population were obtained from the gnomAD v3.1.2 database, which is based on more than 33,000 WGS samples mapped to the hg38 reference sequence. For conversion of the hg19-based PRS variants from CanRisk to hg38, the gnomAD liftover feature was used. Besides AFs, gnomAD flags and warnings indicating possible technical artifacts were retrieved and recorded. These included localization within low-complexity regions, low-quality sites (i.e., sites that are covered in <50% of considered samples [21]), and sites not passing the allele-specific GATK Variant Quality Score Recalibration (VQSR) filter.

Determination of deviating allele frequencies
To determine PRS variants with considerably deviating AFs, thresholds had to be defined dependent on sample sizes and variances observed. Therefore, individual thresholds per data set were determined, using an elbow-of-the-curve method. The absolute differences between observed and expected AFs were sorted in descending order, and the absolute difference referring to the point with the largest Euclidean distance to the imaginary line between the points (0, 1) and (N + 1, 0) was chosen as threshold, i.e., all observed absolute differences greater than this threshold were determined as noticeably deviating. Corresponding curves are shown in Supplementary Figs. 1-6. If the same set of samples was processed with two different variant callers, the smaller threshold was applied in each case, to facilitate comparing variant caller performance.
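The threshold selection described above can be sketched as follows. This is a minimal illustration, assuming that the sorted differences are plotted at ranks 1 to N on the x-axis with unscaled absolute AF differences on the y-axis; the function name and the toy values are hypothetical and do not reproduce any of the study data sets.

```python
import math
from typing import Sequence


def elbow_threshold(abs_diffs: Sequence[float]) -> float:
    """Return the |observed AF - expected AF| value at the elbow point.

    Differences are sorted in descending order and placed at ranks 1..N.
    The elbow is the point farthest from the line through (0, 1) and
    (N + 1, 0), i.e., the line x + (N + 1) * y - (N + 1) = 0.
    """
    ordered = sorted(abs_diffs, reverse=True)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one difference is required")
    norm = math.hypot(1.0, n + 1.0)

    def distance(rank: int) -> float:
        diff = ordered[rank - 1]
        return abs(rank + (n + 1) * diff - (n + 1)) / norm

    best_rank = max(range(1, n + 1), key=distance)
    return ordered[best_rank - 1]


# Hypothetical absolute AF differences for nine loci.
diffs = [0.0, 0.12, 0.01, 0.40, 0.005, 0.25, 0.03, 0.01, 0.0]
threshold = elbow_threshold(diffs)
# Only differences strictly greater than the threshold count as deviating.
deviating = [d for d in diffs if d > threshold]
```

Because only values strictly greater than the threshold are flagged, the locus at the elbow point itself is not counted as noticeably deviating in this sketch.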

Real-world data collection
Genotyping results for either BCAC 313 or BRIDGES 306 BC PRS loci in a cohort of at least 100 individuals of European ancestry were requested from GC-HBOC centers. Family IDs were checked for uniqueness to prevent samples from related individuals. Participating centers submitted observed AFs per locus as well as fractions of samples that did not meet the required quality criteria (e.g., with regard to minimum read depth). Furthermore, details on sequencing approaches and bioinformatic analysis workflows for PRS genotyping were systematically recorded. In total, five GC-HBOC centers provided data, namely the Institute of Medical Genetics and Applied Genomics (IMGAG), University Hospital Tübingen, the Institute for Clinical Genetics (ICG), University Hospital Carl Gustav Carus Dresden, the Department of Medical Genetics (DMG) at University Hospital Münster, the Center for Familial Breast and Ovarian Cancer (CFBOC), University Hospital Cologne, and the Institute of Human Genetics (IHG) at the University of Regensburg. Each center provided two NGS-based data sets. An overview of data characteristics is given in Table 1. A more detailed description of sample compositions, sequencing approaches and bioinformatic analyses can be found in Supplementary Methods.

Assessment of effects of deviating allele frequencies on estimated breast cancer risks
Effects of noticeably deviating AFs of PRS loci on CanRisk-based estimated BC risks depend on the number and combination of affected loci, as well as a multitude of additional risk factors such as results of germline testing of established BC/OC risk genes, BC/OC family history, non-genetic risk factors, and current age. In principle, the proportional contribution of the PRS to overall BC risk decreases with increasing age, and also decreases for carriers of a germline pathogenic variant in a BC risk gene with moderate to high penetrance [10]. In order to get an estimate of expected biases in predicted BC risks due to potentially erroneous PRS genotyping, estimates of 10-year and remaining lifetime risks, i.e., cumulative risks of primary BC until age of 80 years, were calculated for imaginary, cancer-unaffected women of three different ages, namely 20, 40, and 60 years, without any further information than (artificial) PRS.
To simulate different scenarios, artificial VCF files were constructed with an average PRS (50th percentile) by setting dosage to two times the expected CanRisk AF using the DS tag. For each data set, for loci showing noticeably deviating AFs, DS was set to two times the observed AF in the data set. Dates of birth were set to January 1 in 2004, 1984, and 1964, to simulate 20, 40, and 60 years of age at the time of risk computation, which was performed in March 2024, using the web interface of CanRisk v2.3.5, and under specification of the default UK incidence rates.
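The construction of such artificial dosage files can be sketched as below. This is a minimal illustration only: the `PrsLocus` class, the function name, the header lines, and the toy coordinates are assumptions made for this example, and the exact file layout accepted by CanRisk should be taken from its own documentation. The only elements taken from the text are the DS tag and the rule DS = 2 x AF, with the observed AF substituted at deviating loci.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class PrsLocus:
    chrom: str
    pos: int            # hg19 position, as in the PRS specification
    ref: str
    effect_allele: str
    expected_af: float


def artificial_vcf(loci: Iterable[PrsLocus],
                   observed_af: Optional[Dict[Tuple[str, int], float]] = None,
                   sample: str = "artificial") -> str:
    """Build VCF text with DS = 2 * AF per locus.

    Loci listed in observed_af (keyed by chromosome and position) use the
    observed AF; all other loci use the expected AF.
    """
    observed_af = observed_af or {}
    lines = [
        "##fileformat=VCFv4.2",  # assumed header, not a CanRisk requirement
        '##FORMAT=<ID=DS,Number=1,Type=Float,Description="Effect allele dosage">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t" + sample,
    ]
    for locus in loci:
        af = observed_af.get((locus.chrom, locus.pos), locus.expected_af)
        lines.append("\t".join([
            locus.chrom, str(locus.pos), ".", locus.ref, locus.effect_allele,
            ".", ".", ".", "DS", f"{2 * af:.4f}",
        ]))
    return "\n".join(lines) + "\n"


# Hypothetical loci; the second one is treated as noticeably deviating.
toy_loci = [PrsLocus("1", 1000, "A", "G", 0.25),
            PrsLocus("2", 2000, "C", "T", 0.40)]
example_vcf = artificial_vcf(toy_loci, observed_af={("2", 2000): 0.10})
```

Calling the function without `observed_af` yields the baseline file in which every locus carries the expected dosage.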

Elaboration of workarounds
Potential solutions for improving genotyping performance with respect to expected AFs could be (besides improving the calling itself) the consideration of alternative alleles or proxies. Details on the identification of potential variants to substitute for this purpose are given in Supplementary Methods. Alternative variants in gnomAD v3.1.2 with an AF matching the expected CanRisk AF were further evaluated using the IMGAG freebayes data, as this (i) was the largest data set in the study (n = 1410), and (ii) the only WGS-based data set, which allowed genotyping of the entire set of putative proxies.

Missing loci & convertibility to hg38
For four BC PRS loci, no variants were listed at the specified genomic position in gnomAD v2.1.1, namely rs572022984, rs113778879, rs73754909, and rs79461387. gnomAD v3.1.2 also reported no variants for three of these four loci for corresponding loci in hg38 as defined by dbSNP [23] (Supplementary Table 2). Locus rs572022984 was listed but with an overall allele count of zero in NFE samples (Table 2).

Allele frequencies & technical artifacts reported in gnomAD v3.1.2
For 39 of the 320 PRS loci listed with AF > 0 in gnomAD v3.1.2, at least one observation of technical artifacts was reported: 38 loci were flagged as being located in low-complexity regions, 3 as being localized at a low-quality site, and 1 failed the allele-specific VQSR filter (Supplementary Table 2).

Evaluation of real-world next-generation sequencing outcome
All 49 PRS loci for which a noticeably deviating AF was observed in at least one of the data sets provided by the five participating GC-HBOC centers are listed in Table 3.
For the IMGAG DRAGEN data, 0.052 was calculated as threshold to determine noticeably deviating AFs (Supplementary Fig. 2), resulting in 18 loci affected (Table 3, Fig. 2). Of these, 16 were previously also identified as missing or showing noticeably deviating AFs in gnomAD v3.1.2. The exceptions were rs62485509 and rs9931038. For IMGAG freebayes data, 0.036 was calculated as threshold (Supplementary Fig. 2), resulting in 16 loci from the BCAC 313 BC PRS determined as showing a noticeably deviating AF. Of these, 11 loci were also identified as showing deviating AF in IMGAG DRAGEN data, and all but rs12406858 and rs11268668 were previously identified as missing or showing deviating AFs in gnomAD v3.1.2. Considering genotyping data provided by the ICG based on 585 samples, 23 of the overall 324 PRS loci did not meet the minimum quality criteria (read depth ≥ 20) in more than 25% of samples and were discarded (Supplementary Table 3). Additionally, GATK reported read depth <20 for >25% of samples for rs56097627 and rs143384623. For 260 of the remaining 299 PRS loci (86.96%), forced genotyping with GATK and freebayes resulted in the observation of identical AFs. For both ICG GATK and freebayes data, 0.063 was calculated as threshold to determine noticeably deviating AFs (Supplementary Fig. 3). Using this threshold, 11 loci showed noticeably deviating AFs in the GATK data set (including two loci exclusive for BCAC 313 BC PRS) and 14 loci in the freebayes data set (including three loci exclusive for BCAC 313 BC PRS), respectively, with an overlap of 7 (Table 3, Fig. 2).
The DMG provided GATK- and DRAGEN-based BRIDGES 306 BC PRS genotyping data of 545 samples. Locus rs138179519 did not meet the quality criteria, and additionally rs774021038 using DRAGEN. Of the remaining 304 loci, 252 (82.89%) showed identical AFs (Supplementary Table 3). Using a threshold of 0.052 (Supplementary Fig. 4) resulted in 20 loci showing deviating AFs in GATK data and 14 loci in DRAGEN data, respectively, with an overlap of 9 loci.
For the CFBOC data based on 412 samples, a threshold of 0.047 was calculated (Supplementary Fig. 5). The loci of the BRIDGES 306 BC PRS were considered, 243 (79.41%) of which showed identical AFs for both callers applied (Supplementary Table 3). Overall 25 loci (all of which are included also in the BCAC 313 BC PRS) showed deviating AFs: 16 loci in GATK and 19 loci in freebayes data, with an overlap of 10 loci.
The IHG provided GATK- and CLC-based BRIDGES 306 BC PRS genotyping data of 251 samples (Supplementary Methods). Four loci did not meet the quality criteria in both settings, and an additional four in the CLC setting. Of the remaining 298 loci, 228 (76.51%) showed identical AFs (Supplementary Table 3). Using a threshold of 0.063 (Supplementary Fig. 6) resulted in 23 loci showing noticeably deviating AFs in GATK data and 19 loci in CLC data, with an overlap of 10 loci.
In summary, for four loci, deviating AFs were reported in all GC-HBOC real-world settings examined, namely for rs56097627, rs113778879, rs57589542, and rs3988353. A further four loci, namely rs574103382, rs73754909, rs3057314, and rs57920543, were reported with deviating AFs in all settings except for one (Table 3).
Considering the loci non-existent in gnomAD v3.1.2, rs113778879 was not observed with expected AF in any GC-HBOC center, and rs73754909 only with forced DRAGEN calling in DMG data. For rs79461387, expected AFs were reported consistently when using freebayes, but not by unforced DRAGEN calling and in two settings using forced GATK. Of note, rs572022984 with zero allele count in gnomAD v3.1.2 NFEs and an expected AF of 0.0364 in CanRisk, was consistently not observed at all or with a maximum AF of 0.0037 (Supplementary Table 3).
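The per-center comparison of two variant callers reported above (share of loci with identical AFs, deviating loci per caller, and their overlap) can be sketched as follows. The function names, locus labels, and AF values are hypothetical; the tolerance used to call two AFs identical is an assumption of this sketch.

```python
import math
from typing import Dict, Mapping, Set


def deviating_loci(observed: Mapping[str, float],
                   expected: Mapping[str, float],
                   threshold: float) -> Set[str]:
    """Loci whose |observed AF - expected AF| exceeds the threshold."""
    return {locus for locus, af in observed.items()
            if abs(af - expected[locus]) > threshold}


def caller_summary(af_a: Mapping[str, float],
                   af_b: Mapping[str, float],
                   expected: Mapping[str, float],
                   threshold: float) -> Dict[str, float]:
    """Compare two callers on the loci that passed QC in both."""
    shared = sorted(set(af_a) & set(af_b))
    if not shared:
        raise ValueError("no loci shared between the two callers")
    identical = sum(
        math.isclose(af_a[l], af_b[l], rel_tol=0.0, abs_tol=1e-9)
        for l in shared)
    dev_a = deviating_loci({l: af_a[l] for l in shared}, expected, threshold)
    dev_b = deviating_loci({l: af_b[l] for l in shared}, expected, threshold)
    return {
        "identical_pct": 100.0 * identical / len(shared),
        "deviating_a": len(dev_a),
        "deviating_b": len(dev_b),
        "overlap": len(dev_a & dev_b),
        "union": len(dev_a | dev_b),
    }


# Hypothetical AFs for four loci and two callers.
expected_af = {"l1": 0.30, "l2": 0.10, "l3": 0.50, "l4": 0.20}
caller_a = {"l1": 0.30, "l2": 0.25, "l3": 0.50, "l4": 0.35}
caller_b = {"l1": 0.30, "l2": 0.10, "l3": 0.51, "l4": 0.35}
summary = caller_summary(caller_a, caller_b, expected_af, threshold=0.05)
```

The union follows from inclusion-exclusion, which is consistent with the reported CFBOC figures: 16 + 19 - 10 = 25 loci.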

Implications on risk prediction
Without further information and assuming a standardized PRS at the 50th percentile, the estimated 10-year risks of developing primary BC of cancer-unaffected women of 20, 40, and 60 years of age were 0.1%, 1.5%, and 3.4% according to CanRisk (Supplementary Table 4). Percentiles of PRSs from artificial VCF files with aberrant dosages (see "Materials and Methods") ranged from 47.5% (IHG CLC, BRIDGES 306) up to 55.7% (ICG freebayes, BCAC 313). The risk of 0.1% for a 20-year-old woman was concordantly unchanged in all scenarios including artificial PRSs. For a 40-year-old woman, estimated 10-year risks were increased by 0.1% in seven scenarios, and for a 60-year-old woman by up to 0.2% in eight scenarios.
Estimated remaining lifetime risks of developing primary BC assuming an average PRS (50th percentile) of cancer-unaffected women aged 20, 40, and 60 years are 11.3%, 10.9%, and 7.1% according to CanRisk (Supplementary Table 4). When using PRSs from artificial VCF files with aberrant dosages, estimated lifetime risks ranged from 11.1% up to 11.9% for a 20-year-old woman, from 10.6% up to 11.4% for a 40-year-old woman, and from 7.0% up to 7.4% for a 60-year-old woman. The lowest estimates were obtained with the BRIDGES 306 BC PRS based on IHG CLC data with 19 artificial dosages imputed, and the highest with the BCAC 313 BC PRS based on ICG freebayes data with 14 artificial dosages imputed.

Consideration of alternative alleles and loci in linkage disequilibrium
For 20 PRS loci showing noticeably deviating AFs in at least one real-world NGS data set, alternative alleles or overlapping variants with minimum AF 0.01 in NFEs were reported in gnomAD v3.1.2 (Supplementary Table 5). For rs73754909 and rs79461387, both SNVs and non-existent in gnomAD v3.1.2, deletions were reported with comparable AFs to the ones expected by CanRisk. For both deletions, the adjacent downstream nucleotide of the reference sequence was identical to the substituted nucleotide of the expected effect allele (Fig. 3). For rs113778879, which is also an SNV not contained in gnomAD v3.1.2, a similar observation could be made (Supplementary Fig. 7), but the reported AF exceeds the expected one by more than 0.1 (0.5762 versus 0.6818).
For 28 out of the 49 loci showing noticeably deviating AFs in at least one real-world data set, proxies in 1000G GRCh37 microarray data, 1000G GRCh38 High Coverage WGS data, or TOPMED European data could be identified (Supplementary Table 6). For rs113778879, rs73754909, and rs79461387, LDpair based on GRCh38 reported the same alternative alleles as gnomAD v3.1.2 (Supplementary Table 5), where the original PRS loci are non-existent.
Proxies and alternative alleles showing AFs in gnomAD v3.1.2 comparable to expected CanRisk AFs, i.e., an absolute deviation <0.016, were considered as possible workarounds for improved PRS genotyping, and further evaluated with respect to observed AFs in IMGAG freebayes data (Table 4). For 19 of these 21 PRS loci, absolute differences between expected and observed AFs in IMGAG freebayes data remained below the previously defined IMGAG freebayes-specific threshold of 0.036. The exceptions were the substitutions of rs12406858 and rs79461387. The latter is noteworthy because the original PRS locus, which is an SNV, was correctly called by freebayes in forced and unforced mode (Table 3), whereas GATK HaplotypeCaller seemed to call an overlapping deletion of sequence GAG in DMG and CFBOC data. Also noteworthy are the potential replacements of rs73754909 and rs111833376, as both variants were called with noticeably deviating AFs in most real-world data sets.

DISCUSSION
This study describes the systematic evaluation of NGS-based PRS genotyping in real-world data sets of five GC-HBOC centers. The observed AFs of PRS loci in individuals of European descent were used as quality criterion, as the reproducibility of expected AFs of the PRS loci, and hence, the assumptions made about the overall PRS distribution, are an essential prerequisite for a correct risk calculation. In each setting under consideration, at least 11 out of 313 BCAC BC PRS loci or 306 BRIDGES BC PRS loci showed noticeably deviating AFs. These deviations were dependent on sequencing technology, variant caller, and calling mode and can be expected to affect the final BC risk calculations of the BOADICEA model implemented in CanRisk. Therefore, this study demonstrates the necessity to apply quality assurance not only in terms of sequencing coverage but also in terms of observed AFs in a sufficiently large cohort, when implementing PRSs in a routine diagnostic setting.
The presented results also point to potential solutions for improving genotyping performance with respect to the replication of expected AFs for several loci; these primarily include the use of alternative variant callers or consideration of proxy variants. The use of certain variant callers resulted consistently in noticeably deviating AFs, which were not observed for other callers. This concerned, e.g., rs62485509 when using DRAGEN, and rs11268668 when using freebayes (Table 3). In each setting under investigation considering identical samples, the number of loci whose AFs match the expected AFs could be increased by variant-specific selection of the variant caller.
Comparison to large-scale population-specific data, such as gnomAD and 1000G High Coverage WGS, indicates that several PRS loci do not appear or appear with different alleles in NGS than in array-based genotyping. Here, four loci have been identified for which the use of alternative alleles could lead to the achievement of the intended, originally array-based determined AF, if NGS-based genotyping does not do so (Table 4). Two of these loci were absent in gnomAD v3.1.2 NFEs, which was also true for rs113778879 and rs572022984. As a potential workaround for rs113778879, which is an SNV, an overlapping 5 bp deletion was identified, but the observed AF exceeds the expected one by more than 0.1 (Supplementary Table 5). gnomAD SV v2.1 [24] reports a 1370 bp deletion starting at the same genomic position as rs572022984, namely DEL_2_27095, with an AF of 0.0417 in Europeans. However, genotyping of structural variants requires adapted variant calling approaches and therefore might be unfeasible within the scope of PRS genotyping in a routine diagnostic setting.
If no workarounds are available for loci showing noticeably deviating AFs, only imputation of the expected dosage according to CanRisk remains. This leads to smaller errors than omitting the locus from PRS calculation or setting the genotype to 0/0. However, each imputation causes a shift toward the mean PRS, and therefore imputations are applicable only up to a certain extent.
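Why imputing the expected dosage gives smaller errors than a 0/0 genotype can be illustrated with a small calculation. The sketch assumes Hardy-Weinberg genotype proportions at the locus, which is an assumption of this example and not a statement from the study; the AF and effect size are hypothetical. In a plain raw sum, omitting a locus contributes the same as a dosage of zero.

```python
from typing import Dict


def genotype_probs(af: float) -> Dict[int, float]:
    """Hardy-Weinberg probabilities for 0, 1, or 2 effect alleles (assumed)."""
    return {0: (1 - af) ** 2, 1: 2 * af * (1 - af), 2: af ** 2}


def expected_squared_error(af: float, beta: float, substitute: float) -> float:
    """Mean squared error of the locus contribution to the raw PRS when the
    true dosage is replaced by a fixed substitute dosage."""
    return sum(prob * (beta * (dosage - substitute)) ** 2
               for dosage, prob in genotype_probs(af).items())


# Hypothetical locus: expected AF 0.3, log odds ratio 0.1.
af, beta = 0.3, 0.1
error_imputed = expected_squared_error(af, beta, substitute=2 * af)
error_zero = expected_squared_error(af, beta, substitute=0.0)
```

Under these assumptions the imputed dosage 2 x AF is the mean dosage, so it minimizes the mean squared error among fixed substitutes. It is also identical for every individual, which is why each imputed locus adds no between-individual variation and pulls scores toward the mean.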
PRSs for calculating individual BC risks will continue to evolve. For example, the Confluence Project currently aims to develop multi-ancestry PRSs. In addition, PRSs are becoming more and more relevant for the diagnostics of other diseases with a genetic component [25,26]. The presented results underline that it would facilitate the implementation in clinical routine and thus also increase the reliability of genetic diagnostics if the design of future PRSs were guided by the reproducibility of the expected AFs in addition to the observed effect sizes. A straightforward strategy to achieve this could be to ensure comparability of AFs in large-scale population databases, preferably based on different genotyping approaches, prior to including a locus in a PRS.
This study has limitations. Larger sample sizes may have resulted in more accurate estimators of AFs. Furthermore, there was a strong enrichment for samples derived from individuals with familial BC/OC, which may have resulted in deviating AFs due to genetic load rather than technical artifacts. The genetic background could explain, e.g., the aberrant (but concordant) AFs of rs55941023 in IHG data and of rs35054928 in CFBOC data. Despite checking family IDs, related individuals within a data set cannot be entirely excluded. Finally, no statement can be made about whether the described AF deviations would persist when using arrays for genotyping, since corresponding analyses are not (yet) performed in any of the GC-HBOC centers.

Fig. 1 Comparison of variant effect allele frequencies (AFs) specified by CanRisk and observed in gnomAD v3.1.2 non-Finnish European samples for 320 variants incorporated in BCAC 313 or BRIDGES 306 breast cancer polygenic risk scores. Extremely deviating AFs with an absolute difference > 0.016 are indicated by red markers.

Fig. 2 Comparison of effect allele frequencies (AFs) specified by CanRisk and observed in ten real-world data sets for 320 loci incorporated in BCAC 313 or BRIDGES 306 breast cancer polygenic risk scores. Data were provided by the Institute of Medical Genetics and Applied Genomics (IMGAG) at University Hospital Tübingen, by the Institute for Clinical Genetics (ICG) at University Hospital Carl Gustav Carus Dresden, by the Department of Medical Genetics (DMG) at University Hospital Münster, by the Center for Familial Breast and Ovarian Cancer (CFBOC) at University Hospital Cologne, and by the Institute of Human Genetics (IHG) at the University of Regensburg.

Fig. 3 Sequences of reference, expected effect allele and potential alternative allele of polygenic risk score loci rs73754909 and rs79461387 (hg19-based). Both alternative alleles are deletions with the adjacent downstream nucleotide identical to the expected substituted one.

Table 1. Characteristics of data sets provided by participating centers of the German Consortium for Hereditary Breast & Ovarian Cancer (GC-HBOC), namely the Institute of Medical Genetics and Applied Genomics (IMGAG), University Hospital Tübingen, the Institute for Clinical Genetics (ICG), University Hospital Carl Gustav Carus Dresden, the Department of Medical Genetics (DMG), University Hospital Münster, the Center for Familial Breast and Ovarian Cancer (CFBOC), University Hospital Cologne, and the Institute of Human Genetics (IHG) at the University of Regensburg.

Table 2. Characteristics of loci incorporated in the BCAC 313 or BRIDGES 306 breast cancer PRSs that were either not included in the gnomAD v3.1.2 database or reported with extremely deviating allele frequency compared to CanRisk.
Log odds ratios (ORs) are identical for BCAC 313 and BRIDGES 306, but missing values indicate loci not included in the corresponding PRS. Entries in the Comment column refer to technical artifacts reported in gnomAD. LCR low-complexity region, LQS low-quality site (in <50% of samples covered), VQSR failed allele-specific GATK Variant Quality Score Recalibration (VQSR) filter.

Table 4. Potential solutions for improving polygenic risk score (PRS) genotyping performance with respect to the achievement of allele frequencies (AFs) expected by CanRisk, using alternative alleles or proxies. Resulting AFs were investigated based on gnomAD v3.1.2 non-Finnish European data and genotyping results of 1410 European whole-genome sequencing (WGS) samples using (unforced) freebayes (FB), provided by the Institute of Medical Genetics and Applied Genomics (IMGAG) at University Hospital Tübingen.