INTRODUCTION

Approximately 6% of unselected breast cancers and 15% of ovarian cancers can be attributed to germline pathogenic variants in BRCA1 and BRCA2.1,2 BRCA1 pathogenic variant heterozygotes have a cumulative (to age 80 years) breast cancer risk of 72% (95% confidence interval [CI] 65–79%) and ovarian cancer risk of 44% (95% CI 36–53%).3 This represents roughly a 6-fold and 35-fold increase compared with the general population for breast and ovarian cancer, respectively.

The explosion in clinical germline genetic panel testing has led to an increase in findings of variants of uncertain clinical significance (VUS).4 VUS heterozygotes remain in the dark as to whether they are at increased risk or not. For those with a strong family history of cancer, risk assessment can be based on clinical factors, but those without a family history, representing approximately half of all heterozygotes, have no such alternative. Thus, the inability to determine pathogenicity associated with a VUS poses a significant barrier to counseling and clinical management of VUS heterozygotes.5,6,7 Currently, the Evidence-based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium utilizes three distinct classification frameworks for the purposes of clinical management of patients with variants in BRCA1.

A rule-based framework classifies variants based on the predicted effect inferred by changes in genetic code. Variants whose effects on the protein can be unambiguously inferred to cause premature protein termination (e.g., frameshift and nonsense variants) or the production of a noncanonical protein lacking functional domains (e.g., disruption of splicing donor or acceptor sites) are deemed pathogenic. Variants whose effects cannot be inferred from the genetic code cannot be classified by the rule-based framework. This is the case for missense variants, small in-frame insertions and deletions, and intronic variants outside the canonical splice sites.

To classify these problematic variants, a statistical multifactorial model framework was developed to take into account tumor pathology, family history, cosegregation, and co-occurrence data.8,9 However, very rare alleles remain as VUS because these data may not be sufficient to achieve classification. For these rare missense variants, functional data have emerged as a powerful way to determine whether a variant leads to loss of function.10 However, functional data are not yet integrated in multifactorial statistical models.

Finally, the framework recommended by the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) establishes categorical criteria for the strength of evidence and weighting of data sources, which allows for the introduction of functional data and provides an opportunity to assess a large number of VUS.11

All three frameworks can be used in a complementary fashion and are based on the following common assumptions: (1) that loss-of-function variants constitute risk-associated BRCA1 variants and (2) that pathogenic missense variants confer the same risk increase as pathogenic truncating variants. Commercial and academic genetic testing laboratories may apply a variation or a combination of these frameworks to generate a five-tier system of pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, or benign.12

We started systematically collecting and curating BRCA1 functional data in 2014 and have made those data available through a publicly accessible web tool, the BRCA1 Circos (https://research.nhgri.nih.gov/bic/circos/).13 Here we provide an updated and comprehensive resource of functional data for missense variants in BRCA1. Further, we follow recent recommendations for the use of evidence from functional assays to rigorously assess their validity, harmonize data across studies, and integrate data to generate evidence for hundreds of variants.10,14

MATERIALS AND METHODS

Literature curation, data extraction, and definitions

We annotated all published functional data for BRCA1 (OMIM 113705) missense variants (as of August 2019). Missense variants are defined as nucleotide changes that lead to the substitution of an amino acid residue using GenBank accession U14680 and LRG_292 as reference sequences.

For the purposes of this study, specific instances of an assay performed in a study were considered “tracks”. For example, a publication that reports cell viability, protein expression, and subcellular localization assays is represented as three separate tracks. The same underlying assay reported in different publications is represented as separate tracks. The one exception is track 102, a single track that collapses data from several independent publications with multiple biological and technical replicate runs using a joint analysis that takes into account batch effects.15,16

Universe of variants

The published biological and biochemical assays included in this study focus on the effects of missense variants on protein function. There are 16,776 possible single-nucleotide substitutions in the BRCA1 coding region. We excluded nonsense (n = 803), stop codon reversions (n = 9), and synonymous (n = 3492) changes, resulting in 12,472 variants. As multiple nucleotide changes can generate the same amino acid substitution, we considered our starting “missense universe” all unique missense variants resulting from single-base substitutions (n = 11,009) in the reference BRCA1 (RefSeq NM_007294.3). For example, although single-nucleotide substitutions at codon 1775 in BRCA1 (ATG; Met) can generate nine mutant codons (CTG, GTG, TTG, ACG, AGG, AAG, ATC, ATA, and ATT), only six unique mutant amino acid residues result from these changes (Leu, Val, Thr, Arg, Lys, and Ile).
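The codon 1775 example can be reproduced with a short sketch. It is illustrative only and is not the code used in this study; the function names `single_base_mutants` and `classify_substitutions` are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_mutants(codon):
    """Return the nine codons that differ from `codon` at exactly one position."""
    return [
        codon[:i] + base + codon[i + 1:]
        for i in range(3)
        for base in BASES
        if base != codon[i]
    ]


def classify_substitutions(codon):
    """Group the single-base mutants of a codon by their predicted effect."""
    ref_aa = CODON_TABLE[codon]
    result = {"missense": set(), "nonsense": 0, "synonymous": 0, "stop_reversion": 0}
    for mutant in single_base_mutants(codon):
        alt_aa = CODON_TABLE[mutant]
        if alt_aa == ref_aa:
            result["synonymous"] += 1
        elif alt_aa == "*":
            result["nonsense"] += 1
        elif ref_aa == "*":
            result["stop_reversion"] += 1
        else:
            # Several mutant codons may encode the same residue, so a set
            # keeps only the unique amino acid substitutions.
            result["missense"].add(alt_aa)
    return result


met_1775 = classify_substitutions("ATG")
print(sorted(met_1775["missense"]))  # one-letter codes of the unique residues
```

Applying the same grouping to every codon of the reference coding sequence would yield counts of the kind reported above, provided the same reference transcript is used.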

Next, we defined a set of “documented missense variants” (n = 2484) recorded in BRCA Exchange (https://brcaexchange.org/) as of August 2019, which represents all reported variants known to have been found in at least one individual. These variants include those observed in the context of clinical testing (e.g., ClinVar) or observed in exome and genome sequencing of large cohorts (e.g., gnomAD; https://gnomad.broadinstitute.org/). Documented variants are denoted as “1” if observed and “0” if not (track 9; Supplementary Table 1).

Harmonization

We retained each author’s determination of whether a given variant had a significant impact on the function being tested. A wide variety of methods and cutoff values is used across studies; details of how each assay’s data were transformed into a binary categorical result (impact versus no impact) can be found in Supplementary Table 2.

To harmonize scores we assigned “0” to a variant that did not lead to a significant impact on the function being tested (functionally normal) and “1” to those that did (functionally abnormal). Scores “0” and “1” are presumably associated with (likely) nonpathogenic and (likely) pathogenic variants, respectively. Variants for which the results were inconclusive or intermediate did not receive a score. Some variants affect both splicing and amino acid composition and, in these cases, the determination of functional impact refers exclusively to its effect on protein function.

For track 102 we previously developed a “functional class” (fClass) scoring scheme17 using the posterior probability calculation of a variant being pathogenic in the transcriptional activation assays (PrDel) output by VarCall to generate functional classifications (fClass): PrDel ≤ 0.001 as fClass 1 (nonpathogenic), 0.001 < PrDel ≤ 0.05 as fClass 2 (likely not pathogenic), 0.05 < PrDel ≤ 0.95 as fClass 3 (uncertain), 0.95 < PrDel ≤ 0.99 as fClass 4 (likely pathogenic), and PrDel > 0.99 as fClass 5 (pathogenic). We collapsed fClasses 1 with 2 and 4 with 5 and transformed them into binary categories: we assigned “0” to a variant that was benign (fClass 1) or likely benign (fClass 2); and “1” to a variant that was pathogenic (fClass 5) or likely pathogenic (fClass 4).
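The thresholds above translate directly into a small lookup. The following is a minimal sketch of that mapping, not the VarCall implementation; the function names `prdel_to_class` and `class_to_binary` are hypothetical.

```python
def prdel_to_class(prdel):
    """Map a posterior probability of pathogenicity (PrDel) to a class from 1 to 5."""
    if not 0.0 <= prdel <= 1.0:
        raise ValueError("PrDel must be a probability between 0 and 1")
    if prdel <= 0.001:
        return 1  # nonpathogenic
    if prdel <= 0.05:
        return 2  # likely not pathogenic
    if prdel <= 0.95:
        return 3  # uncertain
    if prdel <= 0.99:
        return 4  # likely pathogenic
    return 5      # pathogenic


def class_to_binary(five_tier_class):
    """Collapse classes 1-2 to 0 and 4-5 to 1; class 3 receives no score (None)."""
    if five_tier_class in (1, 2):
        return 0
    if five_tier_class in (4, 5):
        return 1
    return None


# Example: a variant with PrDel = 0.97 is class 4 and harmonizes to "1".
print(class_to_binary(prdel_to_class(0.97)))
```

Returning `None` for class 3 mirrors the harmonization rule that inconclusive or intermediate results do not receive a score.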

Reference panel of variants

To assess the accuracy of each track we used a highly stringent reference panel combining data from the ENIGMA consortium18 and ClinVar. The ENIGMA reference panel is composed of 298 missense variants assigned to International Agency for Research on Cancer (IARC) classes by the multifactorial model.8,9 While the ENIGMA panel has the advantage of being a set of variants systematically classified using a rigorous multifactorial statistical model,8,9 it has a limited number of reference variants.

Thus, we assessed accuracy using an expanded panel which combines the ENIGMA panel with additional variants from ClinVar. The ClinVar data set was obtained by downloading data for all BRCA1 germline variants (8197 entries) on 24 May 2020. Missense-only entries with reference sequence NM_007294.3 and NM_007294.4 and review status of 1-star or better were retained. Variants reported as “conflicting interpretations of pathogenicity” and “uncertain significance” were removed, resulting in a data set composed of 295 variants. The ENIGMA reference panel was then merged with the ClinVar data set. Because the tracks assessed exclusively examined protein function and were not expected to identify variants with splicing effect, we excluded six missense variants with known effects on splicing for the calculation of specificity and sensitivity.11,18,19,20,21,22,23 We also removed the data on the reference variant C1787S because its classification of pathogenic was achieved in the context of a C1787S/G1788D haplotype.9 Track 102 is the only track that tested this haplotype context.15 To assess sensitivity and specificity we used the nonredundant ENIGMA + ClinVar [E + C] reference panel composed of 389 reference variants (Supplementary Table 3).

Sensitivity and specificity calculations

To assess the accuracy of different assays we calculated sensitivity and specificity for all tracks that had tested more than ten variants (reference and VUS) in parallel and included at least four pathogenic and four benign missense reference controls.

Reference variants are classified by the multifactorial model in a five-tier scale according to IARC recommendations:24 variants with PrDel (probability of being pathogenic) ≤ 0.001 are assigned class 1 (nonpathogenic; benign); 0.001 < PrDel ≤ 0.05 are assigned class 2 (likely not pathogenic; likely benign); 0.05 < PrDel ≤ 0.95 are assigned class 3 (uncertain); 0.95 < PrDel ≤ 0.99 are assigned class 4 (likely pathogenic); and PrDel > 0.99 are assigned class 5 (pathogenic). We collapsed classes 1 with 2 and 4 with 5 and transformed them into binary categories: we assigned “0” to a variant that was benign (IARC class 1) or likely benign (IARC class 2); and “1” to a variant that was pathogenic (IARC class 5) or likely pathogenic (IARC class 4). The multifactorial model does not include functional data, thus avoiding circularity.
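As an illustration of how a single track could be scored against the binary reference panel, the sketch below counts agreements between harmonized track scores and reference labels. It is not the code used in this study: the function name `track_accuracy`, the dictionary layout, and the variant identifiers are hypothetical, and the minimum of ten tested variants is read as "at least ten", which is an assumption.

```python
def track_accuracy(track_scores, reference_labels, min_variants=10, min_controls=4):
    """Return sensitivity and specificity for one track, or None if ineligible.

    track_scores: dict mapping variant -> 0 (normal), 1 (abnormal), or None (no score)
    reference_labels: dict mapping reference variant -> 0 (benign) or 1 (pathogenic)
    """
    n_tested = sum(1 for score in track_scores.values() if score is not None)
    tp = fn = tn = fp = 0
    for variant, truth in reference_labels.items():
        score = track_scores.get(variant)
        if score is None:
            continue  # reference variant not tested, or result inconclusive
        if truth == 1:
            tp += score == 1
            fn += score == 0
        else:
            tn += score == 0
            fp += score == 1
    n_pathogenic, n_benign = tp + fn, tn + fp
    # Eligibility: enough variants tested and enough controls of each kind.
    if n_tested < min_variants or n_pathogenic < min_controls or n_benign < min_controls:
        return None
    return {"sensitivity": tp / n_pathogenic, "specificity": tn / n_benign}


# Hypothetical track: five pathogenic controls, five benign controls, two VUS.
reference = {f"path{i}": 1 for i in range(5)} | {f"ben{i}": 0 for i in range(5)}
track = {"path0": 1, "path1": 1, "path2": 1, "path3": 1, "path4": 0,
         "ben0": 0, "ben1": 0, "ben2": 0, "ben3": 0, "ben4": 0,
         "vus0": 1, "vus1": 0}
print(track_accuracy(track, reference))
```

Confidence intervals are omitted because the interval method is not stated in this section.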

Odds of pathogenicity calculations

To determine the strength of evidence associated with each track we estimated the odds of pathogenicity (OddsPath) for a theoretical assay that previously evaluated classified controls following the recommendations from the Clinical Genome (ClinGen) Resource Sequence Variant Interpretation Working Group.14

Because we excluded variants with intermediate results during data harmonization, we estimated an OddsPath that could be achieved by a perfect binary classifier.14 For each track, the OddsPath was calculated according to the formula: \(\mathrm{OddsPath} = \frac{P2 \times (1 - P1)}{(1 - P2) \times P1}\) where P1 represents the proportion of pathogenic variants in the overall modeled data (the prior probability), and P2 the proportion of pathogenic variants among those with functionally abnormal or functionally normal readouts (the posterior probability).14 Results from these calculations (OddsPath for functionally abnormal variants, and OddsPath for functionally normal variants) were used to obtain a corresponding level of evidence strength (BS3 supporting, BS3 moderate, BS3, indeterminate, PS3 supporting, PS3 moderate, PS3) according to the Bayesian adaptation of the ACMG/AMP variant interpretation guidelines (Supplementary Table 4).25 In this framework, each assay track receives PS and BS criteria that are applied to every variant scoring as loss of function (PS criterion) or with no functional impact (BS criterion), respectively. For example, all variants scoring as loss of function in a track with OddsPath for functionally abnormal variants >18.7 will receive evidence criterion PS3.
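The OddsPath formula and its translation into evidence strength can be sketched as follows. Only the >18.7 cutoff for PS3 is stated in the text; the remaining cutoffs (4.3, 2.1, 0.48, 0.23, 0.053) are assumptions taken from the published ClinGen SVI recommendations as we understand them and should be checked against Supplementary Table 4. The function names are hypothetical.

```python
def odds_path(p1, p2):
    """OddsPath = [P2 x (1 - P1)] / [(1 - P2) x P1].

    p1: prior probability (proportion of pathogenic variants among all controls)
    p2: posterior probability (proportion of pathogenic variants among controls
        sharing a given readout, abnormal or normal)
    """
    if not (0.0 < p1 < 1.0 and 0.0 < p2 < 1.0):
        raise ValueError("p1 and p2 must lie strictly between 0 and 1")
    return (p2 * (1.0 - p1)) / ((1.0 - p2) * p1)


def evidence_strength(odds):
    """Translate an OddsPath value into an evidence criterion (assumed cutoffs)."""
    if odds > 18.7:
        return "PS3"
    if odds > 4.3:
        return "PS3_moderate"
    if odds > 2.1:
        return "PS3_supporting"
    if odds < 0.053:
        return "BS3"
    if odds < 0.23:
        return "BS3_moderate"
    if odds < 0.48:
        return "BS3_supporting"
    return "indeterminate"


# Hypothetical track: half of the controls are pathogenic (P1 = 0.5) and 95% of
# the controls with an abnormal readout are pathogenic (P2 = 0.95).
odds_abnormal = odds_path(0.5, 0.95)
print(round(odds_abnormal, 2), evidence_strength(odds_abnormal))
```

The strict bounds on `p1` and `p2` avoid division by zero; a track with no misclassified controls therefore needs an adjusted P2, which this sketch leaves to the caller.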

Data and code availability

All data sets used are available as supplementary materials associated with this article (Supplementary Tables 1–3). These data sets and codes for queries to generate summary statistics and variant callings (in Supplementary Tables 5–11) are available through GitHub (github.com/FunctionalAssaYIntegration/FYI_BRCA1). Updated data sets are also available for download and can be visualized at FYI-HBOC (http://iscva.moffitt.org/fyi-hboc/build/).

RESULTS

The landscape of BRCA1 functional assays

The Functional AssaY Integration for BRCA1 (FYI-BRCA1) data set contains 140 tracks, including 131 tracks representing individual instances of functional assays from 37 publications (Fig. 1) (Supplementary Table 2). There was functional information for 2701 missense variants, of which 2465 are currently VUS (22% of all possible single-nucleotide missense changes and 40% of all reported missense variants) (Supplementary Table 5) according to the ENIGMA + ClinVar reference panel. Approximately 47% (62/131) of all tracks have tested ten or more variants in parallel and ~33.5% (44/131) have also tested at least four pathogenic and four benign controls (Supplementary Table 6). Controls are from a reference panel composed of 389 known missense variants assigned to IARC classes by the multifactorial model8,9 or classified by ClinVar (Supplementary Table 3). Taken together, these data indicate that there is a wealth of untapped functional data to aid in classification of a significant fraction of BRCA1 missense VUS.

Fig. 1: Overview of functional track assessment.

Subway chart illustrating the processes and stages of data collection, curation, and harmonization of functional data (green line); harmonization of reference variant classification (red line); and data analysis and evidence criteria assignment (yellow line). VUS variant of uncertain significance.

Assessing the accuracy of functional assays

The ACMG/AMP rules state that validated functional assays can be used as a source of evidence to classify a VUS and specific recommendations have recently been published.11,14 We followed the recommendations to define the disease mechanism and evaluate the applicability of general classes of assay (Supplementary Text) (Supplementary Fig. 1a). Most functional assays developed to date fall into the following 12 applicable classes: binding; focus formation; protein expression, stability, and folding; transcription activation; sensitivity; recombination; localization; proliferation; chromosome/mitotic apparatus; cell viability; catalytic activity; and cell cycle checkpoint (Supplementary Table 7).

The assay class “protein expression, stability, and folding” had one of the lowest fractions of tracks that met the criteria to be evaluated (i.e., that had tested at least ten variants and four benign and four pathogenic controls) and the lowest fraction of validated tracks. In contrast, 75% of recombination class tracks that met criteria were validated (Supplementary Table 7). We also evaluated tracks grouped by host (cell type or in vitro system used) of the assay. Although S. cerevisiae and E. coli showed <50% tracks validated, the data suggest that tracks using a range of different host categories can generate useful information (Supplementary Table 7).

Next, we evaluated the validity of individual tracks. After data harmonization, we used a reference panel of 389 variants (Supplementary Table 3) to calculate specificity and sensitivity for all tracks that had tested at least ten variants and four benign and four pathogenic controls (n = 44). Seven tracks achieved 100% sensitivity and 100% specificity. Twenty-two tracks with sensitivity and specificity ≥80% were considered appropriately validated for the purposes of variant interpretation (Fig. 2a) (Supplementary Table 8).

Fig. 2: Track validation.

(a) Specificity and sensitivity of 44 tracks. Side color bars indicate different classes of assays. Blue, green, and red bars represent sensitivity/specificity, lower bound, and upper bound of the 95% confidence interval (CI), respectively. Tracks in blue font were validated (specificity/sensitivity ≥80%) and tracks in black font were not. (b) Fraction of variants receiving different American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) evidence criteria.

Applying the evidence to individual variant interpretation

We consider the 22 validated tracks specified above to be well-established for use in variant interpretation (they belong to an applicable assay class, include basic and variant controls, are broadly accepted historically, and are validated), and we refer to them as the “Hi Set”.15,26,27,28,29,30,31,32,33,34,35,36,37,38

We determined the evidence strength (i.e., BS3 supporting, BS3 moderate, BS3, indeterminate, PS3 supporting, PS3 moderate, and PS3) according to the Clinical Genome Resource (ClinGen) Sequence Variant Interpretation (SVI) Working Group recommended equivalence14 (Supplementary Table 8). Importantly, requiring ≥80% specificity and sensitivity eliminated all assays with evidence strength equivalent to “indeterminate” (Fig. 2b).

From a set of 2449 VUS tested by assays in the Hi Set we identified variants that had been tested once and assigned BS3_supporting, BS3_moderate, or BS3 to 1481 VUS and PS3_moderate or PS3 to 188 variants (Fig. 3) (Supplementary Table 9). Next we identified variants that had been tested more than once and for which all results were concordant. We assigned as final evidence the strongest assignment for each variant. For example, a variant tested three times and receiving BS3_supporting, BS3_moderate, and BS3_moderate would be assigned BS3_moderate as the final assignment (Supplementary Table 9). Finally, for variants tested across multiple assays, 117 variants had discordant results ranging from variants with 9:1 (benign:pathogenic) to variants with 1:13 (Supplementary Table 10). Based on the distribution of ratios for discordant results (Supplementary Fig. 1b), we propose that ratios of 3:1 or greater and 1:3 or smaller (log2 [Ratio Benign/Path] ≥ 1.5 or ≤ −1.5) constitute preponderance of evidence. Ratios of 2:1 and 1:2 were considered too weak, while ratios of 4:1 and 1:4 would significantly reduce the number of discordant results resolved. In these 23 cases, we assigned the strongest evidence criteria among the tracks providing the results. In total we assigned evidence criteria to 2355 VUS (Fig. 3).
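The assignment rules described above can be sketched as a single function that takes the per-track criteria for one variant. This is an illustration under stated assumptions, not the code used in this study: the function names are hypothetical, and for discordant variants the "strongest evidence criteria" is interpreted as the strongest criterion among tracks on the preponderant side.

```python
import math

# Relative strength of the evidence levels; a bare "BS3"/"PS3" counts as strong.
STRENGTH = {"supporting": 1, "moderate": 2, "strong": 3}


def strength_of(criterion):
    """Return the numeric strength of a criterion such as 'BS3_moderate'."""
    _, _, level = criterion.partition("_")
    return STRENGTH[level or "strong"]


def strongest(criteria):
    """Return the strongest criterion in a non-empty list."""
    return max(criteria, key=strength_of)


def integrate_hi_set(criteria, threshold=1.5):
    """Combine per-track criteria for one variant into a final assignment."""
    benign = [c for c in criteria if c.startswith("BS3")]
    pathogenic = [c for c in criteria if c.startswith("PS3")]
    if not benign and not pathogenic:
        return None
    if not pathogenic:          # tested once, or all results concordant (benign)
        return strongest(benign)
    if not benign:              # tested once, or all results concordant (pathogenic)
        return strongest(pathogenic)
    # Discordant: require preponderance of evidence, |log2(benign/path)| >= 1.5.
    log_ratio = math.log2(len(benign) / len(pathogenic))
    if log_ratio >= threshold:
        return strongest(benign)
    if log_ratio <= -threshold:
        return strongest(pathogenic)
    return None                 # remains unassigned


print(integrate_hi_set(["BS3_supporting", "BS3_moderate", "BS3_moderate"]))
```

With a threshold of 1.5, a 3:1 ratio (log2 of about 1.58) is resolved while a 2:1 ratio (log2 of 1) is not, which matches the cutoffs discussed in the text.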

Fig. 3: Overview of individual variant assessment using Hi Set.

Subway chart illustrating the stages of evidence criteria assignment to variants of uncertain significance (VUS) (yellow line) and estimation of sensitivity and specificity of the integrated analysis (red line). Dotted red line represents the estimation of sensitivity and specificity with prior knowledge of which variants affected splicing.

Estimating sensitivity and specificity of the Hi Set integrated approach

Reference variants (which were defined as reference pathogenic and benign variants without considering functional data) tested in the Hi Set were assigned evidence criteria using the same rules applied for the VUS, allowing us to estimate the sensitivity and specificity of the integrated framework based on the 22 Hi Set functional tracks (Fig. 3). As expected, since these reference variants were used to identify the tracks that comprise the Hi Set, the integrated approach had a low error rate (3.5%) and high sensitivity (0.92; 95% CI 0.84–0.96) and specificity (1.00; 95% CI 0.96–1). Most reference missense variants misclassified by the functional tracks also affect splicing, and thus prior knowledge of variants that might affect splicing improves sensitivity and specificity (Fig. 3).

VUS and reference variants with discordant results

There were 154 variants (117 VUS) with discordant results (Supplementary Table 10). We consider results discordant when a variant tested by multiple tracks scores both as loss of function and no functional impact. We looked into this set of variants to identify limitations of our approach.

First, we examined all 37 reference variants that initially scored as discordant (Supplementary Table 11). Variant p.C1787S (IARC class 5) scored four times as benign and once as pathogenic. This variant was classified as pathogenic (IARC class 5) but only in the context of a haplotype with G1788D.8 Track 102 has tested C1787S, G1788D, and the double C1787S/G1788D. Both variants in conjunction contribute to loss of function, while each in isolation does not significantly impact the function.17 All other tracks have only tested one or the other variant but not the haplotype. Variant p.R71G (IARC class 5), which also affects splicing, scored as benign three times and once as pathogenic (from an assay that identifies defects in RNA abundance). These cases provide cautionary notes, as variants with a preponderance of evidence in either direction may, in some cases, be assigned incorrectly, considering the reference panel as the “correct” assignment.

Five variants show conflicts between two related tracks, phosphopeptide binding activity (track 27) and phosphopeptide binding specificity (track 28). It was observed that several variants, while not affecting binding of phosphopeptide, would bind phosphorylated and unphosphorylated peptides indiscriminately, a loss of function associated with cancer risk.27 A similar case is encountered for conflicts in which a variant does not affect protein expression but the protein expressed has compromised function.

Some substitutions are more prone to generate discordant results

To determine whether there were classes of amino acid substitution (e.g., Arg → Glu) more prone to generate discordant results, we determined the frequency of 39 classes of substitutions represented in the set of discordant variants, and calculated their fold enrichment in the discordant set when compared with the tested set (Supplementary Fig. 2a). A small number of classes of substitution, such as Ile → Lys and Arg → Trp, were enriched in the discordant variant set, featuring prominently changes to/from hydrophobic and positively charged or polar amino acid residues (Supplementary Fig. 2b), suggesting that these substitutions are more affected by differences in experimental conditions or may lead to intermediate levels of activity.
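The fold enrichment described above can be computed as the frequency of a substitution class in the discordant set divided by its frequency in the tested set. The sketch below is illustrative only; the function name `fold_enrichment` and the toy counts are hypothetical and do not reproduce Supplementary Fig. 2.

```python
from collections import Counter


def fold_enrichment(discordant, tested):
    """Fold enrichment of each substitution class in the discordant set.

    discordant, tested: lists of (reference_residue, mutant_residue) tuples,
    one entry per variant; the discordant set is a subset of the tested set.
    """
    discordant_counts = Counter(discordant)
    tested_counts = Counter(tested)
    enrichment = {}
    for substitution, count in discordant_counts.items():
        if tested_counts[substitution] == 0:
            continue  # class absent from the tested set; ratio is undefined
        freq_discordant = count / len(discordant)
        freq_tested = tested_counts[substitution] / len(tested)
        enrichment[substitution] = freq_discordant / freq_tested
    return enrichment


# Toy data: ten tested variants, two of which gave discordant results.
tested_set = [("Ile", "Lys")] * 2 + [("Arg", "Trp")] * 2 + [("Ala", "Val")] * 6
discordant_set = [("Ile", "Lys"), ("Ala", "Val")]
for substitution, fold in fold_enrichment(discordant_set, tested_set).items():
    print(substitution, round(fold, 2))
```

Values above 1 indicate classes that are overrepresented among discordant variants relative to how often they were tested.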

Alternative approach

The process of assigning evidence criteria used in Fig. 3 is a relatively stringent approach. The harmonized data set, however, can filter tracks using different characteristics, and different assignment rules can be developed and compared. To illustrate this we have assigned evidence criteria using an alternative approach, although the relative contribution of the tracks with large number of variants does not change significantly (Supplementary Fig. 1c).

In the alternative approach, instead of preselecting tracks according to a specific specificity and sensitivity threshold, we used all 131 functional tracks. From a set of 2465 VUS tested we identified variants that had been tested once by any track and assigned BS3_moderate or BS3 to 1448 VUS and PS3 to 179 variants (Supplementary Fig. 3). Next we identified variants that had been tested more than once and all results were concordant. We assigned as final evidence the strongest assignment for each variant (Supplementary Table 11).

Finally, for VUS tested multiple times, 180 variants had discordant results ranging from 17:1 (benign:pathogenic) to 1:19. We used the same cutoff (log2 [Ratio Benign/Path] ≥ 1.5 or ≤ −1.5) to define preponderance of evidence. For those VUS we assigned BS3 and PS3 criteria. The remaining VUS were assigned by simple majority voting but, to reflect the discordance, were only assigned BS3_supporting and PS3_supporting criteria. Forty-two variants scored as benign in as many assays as they scored as pathogenic and remain unassigned until further testing. In total we assigned evidence criteria to 2421 VUS (Supplementary Fig. 3).
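The treatment of discordant VUS in the majority voting approach can be sketched from the counts of benign and pathogenic calls alone. This is a minimal illustration of the rules stated above rather than the code used in this study; the function name `resolve_discordant` is hypothetical.

```python
import math


def resolve_discordant(n_benign, n_pathogenic, threshold=1.5):
    """Assign a criterion to a discordant VUS from its benign and pathogenic counts."""
    if n_benign < 1 or n_pathogenic < 1:
        raise ValueError("a discordant variant has at least one call in each direction")
    log_ratio = math.log2(n_benign / n_pathogenic)
    if log_ratio >= threshold:
        return "BS3"             # preponderance of benign evidence
    if log_ratio <= -threshold:
        return "PS3"             # preponderance of pathogenic evidence
    if log_ratio > 0:
        return "BS3_supporting"  # simple majority, downgraded to reflect discordance
    if log_ratio < 0:
        return "PS3_supporting"
    return None                  # tie: unassigned until further testing


for counts in [(17, 1), (1, 19), (2, 1), (3, 3)]:
    print(counts, resolve_discordant(*counts))
```

The only variants left unassigned are ties, which is why this approach assigns criteria to more VUS than the Hi Set approach.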

The alternative approach (majority voting) led to a small additional decrease in unassigned variants, mostly assigned to benign evidence criteria (BS3_supporting, BS3_moderate, and BS3) (Fig. 4). However, approximately 10% of variants achieving PS3 using the Hi Set were downgraded to PS3_moderate or PS3_supporting in the majority voting approach (Fig. 4).

Fig. 4: Comparison of evidence criteria assignment for Hi Set and majority voting approaches.

Graphs show the fraction of variants assigned to each evidence criterion under the two alternative scenarios, the percentage of VUS assigned, and the estimated error rate.

Protein modular domains and structural motifs

Most variants receiving PS3 criteria were part of a functional domain of BRCA1, either the RING finger or the BRCT domains (Fig. 5). Within these domains, some structures seem to be more sensitive to variation and therefore changes are likely to lead to loss of function. In contrast, other secondary structures, such as β2 in the first BRCT repeat, have not yet recorded a variant with impact on function (Fig. 5b). While this could be due to the limited number of variants tested in these regions, the fact that β’2, the corresponding β sheet in the second BRCT domain, is also tolerant to changes suggests that these β sheets are not critical to function.

Fig. 5: Location of pathogenic variants.

(a) Fraction of different evidence criteria assignments by 100–amino acid windows. The RING finger and the BRCT domains are indicated by red lines. The number of measurements (number of reported results from assays for all variants) in the 100–amino acid windows is shown in red font. (b) Fraction of different evidence criteria assignments by secondary structures and connecting loops in the BRCT domains.

DISCUSSION

BRCA1 pathogenic variant heterozygotes are at substantially increased risk for breast and ovarian cancer. Pathogenic variant heterozygotes affected with cancer can also benefit from the use of poly (ADP-ribose) polymerase (PARP) inhibitors for treatment. Therefore, accurate determination of a variant’s pathogenicity is critical to improve risk stratification and treatment outcomes in breast and ovarian cancer. However, due to lack of information, hundreds of BRCA1 variants remain as VUS, constituting a significant unmet clinical need.

Missense variants constitute the largest class of unclassified variants in BRCA1 and functional assays have emerged as an important source of evidence to aid in classification.10 Remarkably, a large number of experiments probing into the mechanisms of cancer predisposition due to defective BRCA1 have been conducted since its cloning, many using missense variants.

Here we systematically reviewed the biological basis of all functional assays testing missense variants in BRCA1 published in the last 23 years and conclude that all assay classes are applicable as they directly or indirectly measure a demonstrated function of BRCA1, although not all molecular functions have clearly established connections to the cancer phenotype. Of note, none of the tracks in this study is the result of a CLIA or European Communities Confederation of Laboratory Medicine (EC4) laboratory-developed test and this is a limitation of our approach. However, as these assays are unlikely to ever be conducted commercially, the published data are likely to remain the sole source of functional information for variants.

We reasoned that this wealth of experimental data could be used to mitigate the challenges of VUS for hereditary breast and ovarian cancer. We then harmonized the results for 2701 missense variants in BRCA1. Another limitation of our study was the harmonization of qualitative, semiquantitative, or quantitative data as binary categorical data and the fact that cutoffs to distinguish a normal from abnormal function are variable. Presumably, this approximation results in a loss of information. However, a general treatment to harmonize quantitative data across studies has several obstacles, including the lack of access of raw data from individual studies and the need to generate quantitative models capable of integrating all data sets. We believe this will be possible in the near future as there are several quantitative models for individual assays that could be adapted to integration.16,33,36

We illustrate the utility of this data set by assigning ACMG/AMP evidence criteria using two scenarios. In the first scenario (Hi Set), only 22 tracks that (1) tested more than ten variants, (2) tested at least four benign and four pathogenic controls, and (3) achieved a specificity and sensitivity ≥80% using the [ENIGMA + ClinVar] reference, were considered. Using this approach, we assigned evidence criteria to 2355 VUS, which corresponded to 96.2% of all tested VUS.

The second scenario (majority voting) is less stringent: data from all 131 tracks were considered. Although the two data sets were not significantly different (the largest contributors to both scenarios are tracks with large numbers of variants tested, e.g., tracks 102, 131, 133, and 134), there was a small increase in the number of VUS assigned evidence criteria (2421), mostly due to increases in benign evidence criteria (BS3_supporting, BS3_moderate, BS3). We recommend the Hi Set approach for systematic assignment of ACMG/AMP evidence criteria to functional data. The availability of the data set will allow investigators to model the data using a variety of reference panels and criteria for choice of assay.

For variants with multiple tests, we also explored discordant results. Discordant results can be due to random variation, clerical errors (e.g., typographical errors in labels), experimental errors in one or multiple assays (e.g., sample swapping, incorrect pipetting), variants with intermediate effects that may be detected by the most sensitive but not all assays, and variant impacts on risk that are independent of the function being tested. Our evidence criteria assignment can be improved by examining and adjudicating individual cases of discordant results.

Large data sets can also reveal more granular information about the role of specific protein segments and amino acid residues. For example, our analysis has shown that β2 and β’2 are relatively tolerant to changes, perhaps due to the fact that there is little contribution of these β sheets in the interrepeat BRCT interface.37

In summary, we use a large body of experimental evidence to assign evidence criteria to an overwhelming majority of tested missense VUS in BRCA1 in a large-scale application of ACMG/AMP evidence criteria. It is important to stress that, according to recommendations of the SVI Working Group, the functional evidence criteria are currently not meant to be standalone evidence for either a benign or pathogenic classification. At least one other evidence type would be required to reach a final classification. Future developments should take into consideration the impact of a variant on different phenotypes and devise ways to consider variants with intermediate effects.