Cystic fibrosis (CF) is a relatively common autosomal recessive genetic disease that results from the presence of two mutations in the CF transmembrane conductance regulator (CFTR) gene on chromosome 7q31 (OMIM number *602421). The CFTR protein forms a chloride channel and is primarily expressed in the apical membrane of exocrine epithelial cells. Accordingly, the clinical presentations of CF include meconium ileus, obstructive jaundice, overall failure to thrive, pancreatic insufficiency, diabetes mellitus, recurrent bacterial endobronchitis, and progressive deterioration of lung function. However, CF-related clinical manifestations form a continuum that can range from mild to very severe or rapidly lethal.1,2 Differences in the combination of underlying mutations, the influence of modifying genes and environmental factors, and the management of symptoms and complications all influence the phenotypic expression of the disease.

More than 1,500 CFTR sequence changes have been reported to the CF mutation database, including point mutations, small deletions and insertions, frameshifts, splice-site mutations, and exon deletions and duplications. The incidence of CF, which is ~1 in 3,000 individuals in the United States, is highest in Caucasians and individuals of Ashkenazi Jewish descent, which are the most thoroughly studied populations.3 The incidence in individuals of other ethnicities is lower, and although knowledge of their mutations is still accumulating, it is evident that both the disease prevalence and the distribution of mutations vary by ethnic group.4 In 2001, the American College of Medical Genetics and Genomics (ACMG) and the American College of Obstetricians and Gynecologists (ACOG) recommended that an offer of CF carrier screening become part of the standard of care for all expecting couples and those contemplating pregnancy.5 For this universal population screen, the mutations selected were those associated with severe early-onset disease and with a pan-ethnic allele frequency of at least 0.1% among the US population affected with CF. This panel was later modified and currently stands at 23 mutations.6

The College of American Pathologists (CAP) offers proficiency testing (PT) challenges for molecular genetic disorders twice per year. For each of two CF PT surveys, participating molecular diagnostic laboratories receive either two (survey MGL5) or three (survey MGL2) DNA sample challenges. Laboratories can subscribe to either CF survey or both. The MGL5 survey, which began in 2008, provides challenges limited to CF testing, whereas the MGL2 survey also provides challenges for other disorders (e.g., Duchenne/Becker muscular dystrophy, Friedreich ataxia, HbS/HbC, Huntington disease, myotonic dystrophy, RhD, spinal muscular atrophy, and spinocerebellar ataxia). Both surveys are designed to offer challenges for detection of the 23 mutations that are currently on the ACMG/ACOG recommended panel for CF carrier screening.

Laboratories use their own clinically validated laboratory-developed test, or a commercially available method cleared by the US Food and Drug Administration that they have verified before implementation. The results of the PT challenges are submitted to the CAP, where they are analyzed, reviewed, and summarized and then forwarded to the CAP/ACMG Biochemical and Molecular Genetics Committee for grading and final interpretation. The assessment of a laboratory’s performance is based on both the accuracy of the identified genotype and the associated interpretation of its clinical meaning. As such, PT is an important component of quality assessment for an individual laboratory. If a laboratory fails the CF PT challenge, it must immediately address relevant issues. When a laboratory underperforms in more than one CF PT survey, a warning letter is issued and the laboratory may potentially lose CAP accreditation. The current analyses cover a period of 11 years (2003–2013), with the goal of updating the assessment of laboratory performance of CF testing previously published.7

Materials and Methods

The CAP/ACMG sample challenges consist of purified DNA from established cell lines provided by Coriell Repositories (Camden, NJ). Samples were chosen from cell lines that had been validated as part of the GeT-RM program8 or sent to participating CAP/ACMG committee member laboratories to verify the mutations before distribution. Results of the MGL2 and MGL5 surveys were evaluated together and, after 2008, separately. Because the survey questions and intended responses have changed through the years, efforts were made to standardize responses between the surveys. Only laboratories that identified themselves as performing clinical testing were included (a small number of laboratories reported doing research testing only). If a laboratory categorized itself as a clinical laboratory over several surveys but did not answer that question for a given survey or year, it was presumed to be a clinical laboratory and all responses were included. Analytical results (sensitivities and specificities) for international laboratories (those not in the United States) were analyzed separately from those for laboratories with mailing addresses in the United States.

The PT surveys provided codes for each of the 23 mutations recommended by ACMG/ACOG,6 as well as for a few additional mutations that are commonly included in testing platforms. One sample with a “non-ACMG” mutation, V520F, was sent as a challenge in three surveys: MGL5-2009A (which also included a sample with the non-ACMG mutation 3905insT), MGL2-2010B, and MGL5-2010B. Two other surveys included samples with non-ACMG mutations: MGL2-2011B contained a sample with E60X (in combination with F508del), and MGL5-2011B included a sample with R347H. Laboratories that did not test for these mutations were graded based on the mutations they tested. In 2006, a mutation recommended by ACMG/ACOG was inadvertently omitted from the mutation code list, and that mutation was included as a challenge: therefore laboratories entered a code for “other” and wrote the identified mutation in a comments field. This challenge, as well as challenges that included mutations not recommended by the ACMG, were excluded from the overall analysis (see Discussion). Also excluded were challenges for the IVS8 polyT polymorphism for samples that did not contain the R117H mutation. The ACOG/ACMG recommendations suggest testing for IVS8 5T only when the R117H mutation is discovered because the polyT modifies the clinical severity of this mutation.5,6,9 Three sample challenges included the R117H mutation and the associated IVS8 polyT polymorphism. Those results are discussed separately.

Survey result fields included entries for allele 1 and allele 2, as well as for a clinical interpretation. Although laboratories were instructed to enter a result for all three fields, occasionally some fields were left blank. Only responses in which the laboratory provided an answer for both allele 1 and allele 2 and/or an interpretation were included in the analyses. If a laboratory provided an interpretation but did not record results for both allele fields, the blank entry(s) was assumed to be “none of the listed mutations.” This assumption may not always be correct. For example, if the sample was homozygous for a given mutation, the laboratory may have entered the mutation in the result field for allele 1 but failed to record that same mutation in the result field for allele 2. If an allele result field was not recorded, yet the intended response was a mutation, the result was considered incorrect. If a mutation was present, but the laboratory reported a different mutation, the result was considered incorrect (and categorized as a false positive).

The CAP/ACMG surveys list of mutations includes a code for “other.” This code was intended to be used for a detected pathogenic mutation that was not included on the mutation list. The laboratory should use the “other” code and list the detected mutation in a comments field. If no mutations were detected, the intended response code was “None of the ACMG mutations detected.” This intended use was apparently unclear to some survey participants. Based on the laboratories’ comments and interpretations, some of them used “other” to mean that no mutations were detected on their own panel that includes more than the 23 ACMG mutations. However, such a response could have indicated that another mutation may have been detected incorrectly. For this reason, laboratory responses with “other” entered for either allele 1 or 2 were considered unreliable and excluded from all analyses. This problem was addressed in 2013, when the CAP/ACMG survey revised the response option “other” to “other mutation.” If our assumption that laboratories misused this term is correct, excluding the responses of “other” most likely represents a conservative estimate of the performance.
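The inclusion and exclusion rules described above can be sketched as a small filter. This is only an illustrative sketch: the function name `prepare_response`, the string codes `"none"` and `"other"`, and the use of empty strings for blank fields are assumptions made for the example, not the survey's actual data format.

```python
from typing import Optional, Tuple

NO_MUTATION = "none"   # hypothetical code for "none of the listed mutations"
OTHER = "other"        # hypothetical code for the ambiguous "other" response


def prepare_response(allele1: str, allele2: str,
                     interpretation: str) -> Optional[Tuple[str, str]]:
    """Return the (allele1, allele2) pair to analyze, or None if excluded."""
    a1 = (allele1 or "").strip()
    a2 = (allele2 or "").strip()
    interp = (interpretation or "").strip()

    # Include only responses with both alleles and/or an interpretation.
    has_both_alleles = bool(a1) and bool(a2)
    if not (has_both_alleles or interp):
        return None

    # Responses using "other" for either allele were considered unreliable.
    if OTHER in (a1.lower(), a2.lower()):
        return None

    # A blank allele field is assumed to mean "none of the listed mutations".
    return (a1 or NO_MUTATION, a2 or NO_MUTATION)


print(prepare_response("F508del", "", "supportive of CF"))
print(prepare_response("other", "F508del", "supportive of CF"))
```

As the text notes, the blank-means-none assumption can misclassify a homozygous sample whose second allele field was simply left empty.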

For statistical analysis, each allele was treated separately. Therefore, the total potential number of alleles was twice the number of sample challenges. Analytical sensitivities and specificities were computed and compared separately for US and international laboratories, as well as by survey participation (MGL2 or MGL5). The results of analytical performance were also stratified by the laboratories’ reported numbers of tests performed and analytic methodology.
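As a rough illustration of the per-allele scoring, the sketch below classifies each (intended, reported) allele pair and derives sensitivity and specificity from the counts. The function names, the use of `None` for an allele with no mutation, and the example data are hypothetical. Following the rule stated earlier, a wrong mutation reported for a mutated allele is counted as a false positive; how such calls enter each denominator is an assumption of this sketch.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Allele = Optional[str]  # None stands for "no mutation detected / intended"


def classify_allele(intended: Allele, reported: Allele) -> str:
    """Label one allele result as TP, FN, TN, or FP."""
    if intended is None:
        return "TN" if reported is None else "FP"
    if reported == intended:
        return "TP"
    if reported is None:
        return "FN"
    return "FP"  # a different mutation was reported (treated as false positive)


def sensitivity_specificity(pairs: Iterable[Tuple[Allele, Allele]]) -> Tuple[float, float]:
    """Compute (sensitivity, specificity); NaN when a denominator is zero."""
    counts = Counter(classify_allele(i, r) for i, r in pairs)
    pos = counts["TP"] + counts["FN"]
    neg = counts["TN"] + counts["FP"]
    sens = counts["TP"] / pos if pos else float("nan")
    spec = counts["TN"] / neg if neg else float("nan")
    return sens, spec


# Hypothetical example: two sample challenges, i.e., four alleles.
example = [("F508del", "F508del"), ("F508del", None), (None, None), (None, "G542X")]
print(sensitivity_specificity(example))
```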

Laboratories’ clinical interpretations of their reported genotypes were also analyzed. Each PT survey contained a clinical scenario associated with the samples (e.g., from children or infants with failure to thrive). Although the wording for the interpretation choices has changed over the years, the general intended responses are two mutations “confirm a diagnosis of CF”, whereas one mutation is “supportive of CF, but not diagnostic” (an answer of “inconclusive” was also acceptable), and no mutations make “a diagnosis of CF less likely, but do not exclude it.” For laboratories that did not provide the intended response, the analytical results were examined. In some cases, a sample swap may have occurred and the intended clinical response actually matched the genotype reported. Thus, incorrect responses for genotype may or may not result in an incorrect interpretive response, depending on the type of error. To evaluate a laboratory’s ability to correctly interpret the clinical significance of specific genotypes, only interpretations from correctly genotyped samples were used to determine the percentage of laboratories giving the intended response. This approach focused on the interpretation as a separate measure. Including interpretation of incorrect results would not be meaningful, given the issue with the response of “other,” as described above.

Results


A total of 179 laboratories from the United States participating in the MGL2 and/or MGL5 surveys in 2013 reported both the number of samples tested each month and their corresponding turnaround time (TAT) in days (Figure 1). Overall, 119,814 samples were reportedly tested per month, with five participants testing >5,000 samples per month (overall median of 60 tested per month). This represents an estimated annual number tested in the United States of about 1,440,000. The reported TATs ranged from 2 to 56 days, with a median of 14 days. Of the 179 participants reporting, 165 (92%) had a TAT of 14 days or fewer. Those performing Sanger sequencing of the CFTR gene tended to test fewer samples (median, 1.5 per month) and have a longer TAT (median, 31 days). One laboratory using next-generation sequencing reported the longest TAT (>42 days). There were no significant differences reported in TATs between other analytical methods.
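The annual estimate and the TAT percentage follow directly from the reported counts. A minimal check of that arithmetic, with variable names chosen only for this example:

```python
# Figures reported for US laboratories in 2013.
samples_per_month = 119_814
labs_reporting = 179
labs_tat_14_days_or_fewer = 165

# Annualize the monthly volume (12 months).
annual_samples = samples_per_month * 12

# Share of laboratories with a turnaround time of 14 days or fewer.
share_fast = labs_tat_14_days_or_fewer / labs_reporting

print(f"Estimated annual tests: {annual_samples:,}")
print(f"Laboratories with TAT <= 14 days: {share_fast:.0%}")
```

The annualized figure of 1,437,768 rounds to the "about 1,440,000" quoted in the text.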

Figure 1. Numbers of samples tested versus turnaround times (TATs) for US laboratories participating in external proficiency testing for cystic fibrosis mutation detection in 2013. The logarithmic horizontal axis shows the numbers of samples tested per month, with a cap at 5,000 samples or more. The vertical axis shows the TAT in days (one laboratory reported a TAT longer than 42 days, shown by the vertical arrow). The symbols indicate the associated analytic methodology (Sanger sequencing = filled diamonds; next-generation sequencing = filled squares).

From 2003 through 2013, 15 of the 23 recommended ACMG/ACOG mutations were distributed as sample challenges. These included 621+1, F508del, A455E, 1717-1, R117H, I507del, 3659delC, G85E, G542X, G551D, R553X, R347P, W1282X, N1303K, and R560T. During these years, a total of 322 US clinical laboratories and 35 international clinical laboratories responded to at least one survey. Among the US laboratories, 40 (12%) had at least one error, six had errors in two surveys (2%), and one each had errors in three and four surveys. Fifteen of the 19 surveys (79%) in which an error occurred were distributed before 2008. Only one laboratory has had errors in two surveys since 2008. The types of errors identified are shown in Table 1.

Table 1 Results from external proficiency testing for CFTR mutations: common types of genotyping errors

Of the 10,952 total alleles included for analysis, the overall analytical sensitivity and specificity estimates for US laboratories were 98.8 and 99.6%, respectively (Table 2). Figure 2 illustrates that same data set by year. After removing data from three outlying challenges (discussed later), there was a significant trend toward higher analytical sensitivity between 2003 and 2008 (first-order coefficient; P < 0.05), approaching a maximum of about 99.5% by 2009 (second-order coefficient; P ≤ 0.05). No linear trend was found for analytical specificity (P = 0.63). Beginning in the second survey for 2008, the analytical sensitivity/specificity estimates were separately computed for participants in the MGL2 and MGL5 surveys (Table 2). The difference in analytical sensitivity for the MGL2 and MGL5 surveys was not statistically significant (P = 0.22), but the specificity was significantly higher for the MGL2 survey (P = 0.037).

Table 2 Results from external proficiency testing for CFTR mutations: analytic sensitivity and specificity for US and international clinical laboratories

Figure 2. Analytical sensitivity and specificity estimates for US laboratories participating in external proficiency testing for cystic fibrosis mutation detection from 2003 to 2013. (a) The analytical sensitivity estimates for detecting CFTR mutations. Three samples had estimates below 97% (labeled) and were excluded from the regression. A second-order polynomial (solid curve) was fitted to the remaining data, and it showed significantly improving rates from 2003 through 2008, reaching a plateau that continues through 2013 (sensitivity = 0.98704 + 0.003453 × distribution − 0.0001138 × distribution², where “distribution” represents numbers from 1 [2003A] to 21). (b) The corresponding analytical specificity estimates. The linear regression (solid line) shows no significant changes over time.

The corresponding estimates for international clinical laboratories (those not in the United States) were also computed (Table 2, last row). The analytical sensitivity and specificity for the combined international laboratories since 2008 were 96.0% (95% confidence interval: 92.9–97.8%) and 100% (95% confidence interval: 99.9–100%), respectively. Compared with US laboratories over the same time period, the international laboratories had a lower sensitivity (P < 0.001), but the higher specificity was not statistically significant (P = 0.40).
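For readers who want to see how a 95% confidence interval around a proportion such as an analytical sensitivity can be obtained, the sketch below uses the Wilson score interval. The text does not state which interval method was used, so the choice of Wilson is an assumption, and the counts in the example (240 correct of 250 alleles) are hypothetical values chosen only to give a proportion of 96.0%.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical counts, not taken from the surveys.
low, high = wilson_interval(240, 250)
print(f"96.0% (95% CI: {low:.1%} to {high:.1%})")
```

Unlike the simple normal approximation, this interval stays within 0 to 100% even when the observed proportion is 100%, which matters for estimates such as the specificity reported above.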

The lowest analytical sensitivity (91.1%) for a single challenge was recorded for a 2003 compound heterozygous sample (621+1G>T/A455E). Subsequent challenges using compound heterozygous samples did not show such low values. Two other genotypes were also associated with relatively low analytical sensitivities (Figure 2): a heterozygous I507del challenge (2004) and a homozygous G542X sample (2009). Incorrect responses for the I507del challenge included suspected sample switches, incomplete entries (missing second allele), and incorrect I507del/F508del compound heterozygosity. The most common incorrect responses for the homozygous G542X challenge included reporting heterozygosity for G542X and incomplete entries (not reporting the second allele, rather than entering G542X in both allele fields). Given the previously discussed potential problems with missing data and the known difficulty some methods have in distinguishing the rare I507del mutation from the common F508del mutation, these challenges were not included in the analyses.

The 5T variant in intron 8 was evaluated only for the three sample challenges that contained the R117H mutation. In three different surveys, a compound heterozygous sample (R117H/F508del) was distributed. All participant responses correctly identified a 5T allele. The F508del is typically on the same chromosome as a 9T variant, and most laboratories also identified the 9T variant, although laboratories rarely reported a 5T/7T genotype.

For the US laboratories, no trend in performance for either analytical sensitivity or specificity was seen by numbers of tests performed monthly (data not shown). The most commonly reported methods were the oligonucleotide ligation assay (Abbott Laboratories, Abbott Park, IL), Invader chemistry, reverse allele-specific oligonucleotide, and allele-specific polymerase chain reaction/amplification refractory mutation system. All methods performed well analytically; miscalls appeared to be laboratory specific because most laboratories using these methods reported the correct genotype. This suggests that the errors were in laboratory processes or reporting (Table 2). Laboratory-developed tests and Food and Drug Administration–cleared assays performed equally well.

Clinical interpretations for reported genotypes (Table 3) varied more than the analytical sensitivity/specificity estimates. Laboratories were instructed to interpret the clinical significance of results assuming the clinical scenario was an infant with “failure to thrive.” The intended responses were based on consensus genotype and this scenario. Most laboratories correctly interpreted the lack of any mutations detected as making “a diagnosis of cystic fibrosis less likely” (98.8%). Similarly, most laboratories interpreted the presence of two mutations as “confirming a diagnosis of cystic fibrosis” (91.0%), although some laboratories preferred the interpretation “supportive of cystic fibrosis but not diagnostic” (6.9%). The greatest variability in the interpretation occurred in samples with a single mutation identified (heterozygous samples). The intended response was “supportive of cystic fibrosis but not diagnostic.” Although 86.0% of the laboratories gave this response, 10.8% of laboratories interpreted this genotype as making “a diagnosis of cystic fibrosis less likely.” This response was considered incorrect for mutation panels because the detection rate of the assay differs by ethnicity, which was not specified in the surveys. This could be considered a correct response if a laboratory sequenced the entire gene.

Table 3 Results from external proficiency testing for CFTR mutations: clinical interpretation of the associated genotype

The homozygous G542X challenge in the 2009 survey was closely examined to determine whether the interpretation matched the given result. In all cases for which G542X was not entered in both allele fields, the interpretation was consistent with a result of one mutation or no mutations detected.

Discussion


Among US clinical laboratories participating in an external PT program for a subset of recommended CFTR mutations, the overall estimates for analytical sensitivity (98.8%) and analytical specificity (99.6%) indicate excellent performance. The plateau in analytical sensitivity also suggests that current performance is near the best currently achievable. Analytical specificity has not changed significantly over the 11 years examined. All analytical methods, regardless of whether they are laboratory-developed tests or Food and Drug Administration–cleared tests, accurately detect the presence/absence of CFTR mutations. The few incorrect genotype assignments were most often related to process (preanalytic) errors, reporting (postanalytic) errors, or misidentification of the specific mutation (rather than of its presence versus absence). The improvement in analytical sensitivity from 2003 through 2008 may be because laboratories are complying with the 2004 ACMG/ACOG recommendations for a standard set of 23 mutations, each with an allele frequency of at least 0.1% among affected individuals in the pan-ethnic US population. The decreased sensitivity observed in international laboratories may reflect the different mutation spectrum in populations outside the United States. This analysis did not review the mutation spectrums of international laboratories.

The IVS8 5T variant modifies the severity of the R117H mutation when on the same chromosome (in cis), but by itself it is considered a mild mutation not associated with classic CF.9 The R117H mutation and reflex testing for the IVS8 polyT were challenged in three surveys. All responding laboratories correctly reported the presence or absence of the 5T variant, although the chromosomal phase was not confirmed. The second mutation in this sample (in all three surveys) was F508del, which is typically on the same chromosome as the 9T, yet not all laboratories correctly reported the 9T allele (reporting 7T instead). Among all participants, about 1 in 20 reported that testing for the 5T variant was not performed. Given the ACMG/ACOG recommendations to reflex R117H samples to test for the 5T variant, laboratories should perform this test for challenges in which the R117H mutation is detected. One possible explanation for not reporting intron 8 polyT status is that laboratories may send samples with the R117H mutation to a referral laboratory to detect the 5T variant; reporting of outside laboratory results is prohibited in PT. This demonstrates a scenario in which laboratories are not able to treat PT samples in a manner identical to that used for their clinical samples.

Laboratory performance on CFTR mutation testing has improved in the past decade. Estimates of the analytical sensitivity and specificity for CFTR mutation analysis from the same PT program for the years 1996 through 2001 were 97.9 and 98.4%, respectively.7 These are considerably lower than the measured sensitivity and specificity of 98.8 and 99.6%, respectively, from the more current data from 2003 through 2013 summarized in this report. Possible explanations for the observed improvement in analytic performance include ongoing participation in an external PT program, improvement in testing technologies, and/or the standardization of mutations tested.5 In 2003, one laboratory tested for F508del only, whereas several laboratories included relatively few mutations in their panels (even though the ACMG/ACOG recommendations were published in 2002). This affected the results for two laboratories. Although these laboratories were included in the calculations for sensitivity, their responses were considered acceptable for the mutations tested. By 2005, all but one participating laboratory tested at least the 23 recommended mutations. The one laboratory testing only for F508del indicated that its intended use was for follow-up of newborn screening.

Postanalytical interpretation of the CF genotype result represented the largest source of incorrect survey responses. The clinical scenario provided in these surveys was always an infant with failure to thrive. The associated indication for testing was “to diagnose or rule out cystic fibrosis.” This indication for testing is different from the most common clinical scenario for CFTR mutation testing in some laboratories: population-based carrier screening in couples for preconception/prenatal purposes. Although in this setting laboratories usually do not need to be able to identify homozygosity or compound heterozygosity, they should still be able to identify individuals with CF and be able to interpret results appropriately in the context of the clinical scenario presented. The clinical history provided did not include the ethnicity of the infant, which may have contributed to the differences in interpretive responses for heterozygous samples. Although the majority of laboratories gave the intended response of “supportive of CF but not diagnostic,” a number of laboratories interpreted the presence of one mutation as “making a diagnosis of CF less likely.” This is not the best interpretation in an infant with symptoms consistent with CF because a significant proportion of patients with CF will have only one mutation identified by a test panel of selected mutations. The percentages of individuals with one detectable panel mutation who are affected with CF differ by ethnicity, so a response of “a diagnosis of CF is less likely” could be considered acceptable for Ashkenazi Jewish or Caucasian populations with high mutation detection rates but may not accurately describe residual risk for individuals of Asian, Hispanic, or African-American ethnicity. Even in a northern European Caucasian population in which the ACMG-recommended panel can detect 90% of CF-causing mutations, both mutations will be detected for only 81% (90% × 90%) of patients with CF. 
Only one mutation will be identified by the test panel for another 18% (90% × 10% × 2) of patients with CF, and neither mutation will be identified for 1% (10% × 10%) of patients. In non-Caucasian patients, among whom the detection rate of the ACMG-recommended test panel is lower, the proportion of patients with CF who will have only one mutation identified is even higher. Thus, in an infant with symptoms consistent with CF, the finding of one CF-causing mutation is supportive of a diagnosis of CF. If a laboratory identifies one panel mutation in an infant with failure to thrive, more extensive mutation testing or sequence analysis of the entire CFTR gene coding regions and intron/exon boundaries, which detects more than 98% of mutations for all ethnicities, should be considered.
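The 81%/18%/1% split above follows from treating the two disease alleles of an affected patient as detected independently, each with probability equal to the panel's detection rate. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the 70% rate in the second call is an illustrative value only, not a figure from this report.

```python
from typing import Tuple


def panel_detection_split(detection_rate: float) -> Tuple[float, float, float]:
    """Fractions of affected patients with two, one, or zero panel mutations found.

    Assumes each of the patient's two CF-causing alleles is detected
    independently with probability `detection_rate`.
    """
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    d = detection_rate
    both = d * d                # e.g., 90% x 90%
    one = 2 * d * (1 - d)       # e.g., 90% x 10% x 2
    neither = (1 - d) ** 2      # e.g., 10% x 10%
    return both, one, neither


for rate in (0.90, 0.70):  # 0.70 is a hypothetical lower detection rate
    both, one, neither = panel_detection_split(rate)
    print(f"rate={rate:.0%}: both={both:.0%}, one={one:.0%}, neither={neither:.0%}")
```

With a lower detection rate, the share of affected patients showing only one panel mutation grows, which is why a single detected mutation in a symptomatic infant should not be read as making CF less likely.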

The existing MGL surveys do not capture several important points. The sample challenges do not evaluate the pre- and postanalytical processes. In addition, the challenges overrepresent difficult samples, rather than the genotypes commonly encountered, as part of the educational nature of the surveys. For example, the G542X homozygous genotype and the I507del heterozygous genotype were difficult challenges (Figure 2), but both of these genotypes are rare, especially in the context of population-based screening. Last, information about training of laboratory directors or supervisors is not captured in these surveys but would be interesting to compare against measures of analytical performance.


Overall, the performance of laboratories participating in the CAP PT surveys has improved since 2003. This may be because of the standardization of recommended mutations tested but also may be because of the proficiency program itself, technical improvement in platforms and reagents, or other unknown factors. In 2013, >1.4 million CFTR mutation tests were performed in the United States. No differences in performance attributable to the number of samples a laboratory tested or the test methodology used were seen. Differences in performance were observed between the MGL2 and MGL5 CF surveys; the explanation for this is unclear but may represent differences in the types of laboratories that subscribed to these two surveys. Laboratories’ interpretative performance showed greater variation than the analytical performance, with the greatest variation in how laboratories interpreted the clinical significance of heterozygous samples.


E.L. is a consultant and a member of the advisory board for Complete Genomics. A.F.-G. owns stock in Luminex Corp. The other authors declare no conflict of interest.