Main

Since the publication of the recommendations by an expert panel assembled by the American College of Medical Genetics (ACMG),1 substantial progress has been made in the adoption of the uniform newborn screening panel by public health programs, culminating in their recent ratification by the US Secretary of Health and Human Services as a national standard.2 A major contribution to the expansion process has come from the Regional Genetics and Newborn Screening Collaboratives funded by the Health Resources and Service Administration of the Maternal and Child Health Bureau.3 The main goal of these projects has been to enhance and support the genetics and newborn screening capacity across the nation by undertaking a regional approach toward addressing the maldistribution of genetic resources. Notably, to be eligible for funding a regional proposal had to include at least four participating states. The initial application for Region 4 (principal investigator: Cynthia A. Cameron, PhD) involved all seven states in the region (Illinois, Indiana, Kentucky, Michigan, Minnesota, Ohio, and Wisconsin) and included a project to facilitate the universal implementation of newborn screening by tandem mass spectrometry (MS/MS) and confirmatory testing of newborns for inborn errors of amino acid, organic acid, and fatty acid metabolism. The specific objectives of this project were (a) to achieve uniformity of testing panels by MS/MS to maximize detection of affected newborns within the region; (b) to improve overall analytical performance; and (c) to set and sustain the lowest achievable rates of false positive and false negative results. 
The last two objectives were chosen as part of an effort to call more attention to “how well” conditions are screened for,4 as an alternative to merely counting “how many” of them are included in the panel of a given program.5 The rationale for this work also came from the need to address the considerable confusion, and, at times, vigorous controversy about the scientific basis of the uniform panel,6–9 and speculations of severe consequences of poor performance.10 When dealing with rare disorders, even long-term experiences of single sites11,12 are unlikely to generate adequate evidence for many if not most conditions, so it became apparent that only an unprecedented level of cooperation and collaboration among providers of screening services could lead to the creation of a body of evidence adequate for the clinical validation of cutoff values for most if not all markers measured by MS/MS.

Since its 2005 launch on a regional basis, this project has grown nationally and also internationally with the active participation of 48 US states and territories, plus 80 programs in 45 other countries. A milestone of this project took place in November 2008 when the Region 4 Stork (R4S) website went live (http://www.region4genetics.org/msms_data_project), ending the cumbersome use of offline spreadsheets. The website allows users to submit data independently and to have on demand access to up-to-date tools and reports based on the entire body of collective experience.

For the first time, we describe in this study the disorder ranges for amino acids, acylcarnitines, and related ratios in a total of 64 conditions. This group includes the 20 primary conditions in the ACMG uniform panel (detected by MS/MS), 21 of the 22 secondary targets, eight maternal conditions (leading to secondary abnormalities in the screening profile of the newborn), and 15 other conditions that manifest with biochemical phenotypes mimicking those of primary and/or secondary targets. The disorder ranges are then linked to cumulative population percentile data to define high and, when applicable, low cutoff target ranges. These ranges are automatically updated after any new submission and are available on demand to participants through a web-based interface.

MATERIALS AND METHODS

Participating sites

The collaborative project started in June 2005. As of December 1, 2010, the status of the United States and international participation is shown in Figure 1A and B. Forty-seven US states and Puerto Rico are active participants. International participation includes 80 programs in 45 countries. Most sites have one primary contact, the user with read/write access to the project website who is responsible for data submission. The professional backgrounds of other users with read-only access span a large variety of roles, including program directors, laboratory supervisors and technologists, follow-up coordinators, genetic counselors, dieticians, residents, fellows training in genetics, and a growing number of metabolic specialists who are providers of patient care. The total number of users with an active password is 602 (range: 1–53 per US site, average 8; 1–14 per international site, average 4), double the earliest available count (N = 300) that was recorded in June 2009.

Fig. 1

Status of the R4S collaborative project as of December 1, 2010. Gray color indicates active participation. A, Participants in the United States. B, Worldwide participants. In countries outside of the United States, multiple sites may be involved.

R4S website

R4S is a custom-designed and -coded application for the collection and reporting of possibly any type of newborn screening data based on numerical results. The system is a web-based application that implements a three-tier client-server architectural model. The client or presentation tier is the user interface, which can be accessed using popular web browsers from any computer with internet access. The R4S software is compatible with Internet Explorer 6+ and Firefox 2+ (as well as other browsers such as Safari, Opera, and Chrome). The logic and business tier applications are located on a web server running Microsoft Internet Information Server version 6. The application code is written in ASP and C# for Microsoft .NET version 3.5. Winnovative HTML to PDF Converter version 4.0 for .NET and dotnetCharting version 5.3 are used by the web software for PDF and chart generation, respectively. The data tier, located on a database server, uses Microsoft SQL Server 2008 with custom-written T-SQL stored procedures.

Over the last 5 years, this project has evolved into an organized, web-based data collection system with a computer program tailored to meet the needs of a diverse population of users. National and international participants are provided with a user ID and password to gain access to a secure section of the R4S website. Once logged in, users have access to profiles unique to their screening program for data submission and to comparison tools, as well as to common folders that include more than 30 project tools and reports. Access to the MS/MS application can be personalized for individual users, including read/write (or read only) privileges and administrative oversight. The database is set to automatically perform basic calculation of descriptive statistics, particularly the calculation of predefined percentiles. These values (and the number of data points used to calculate them) are automatically linked to descriptive and comparative tools, which will be described in complete detail elsewhere.

Data submission

Participants submit five types of data: (a) five selected percentiles of individual markers and ratios in the normal population; (b) all cutoff values used in routine screening practice; (c) the complete set of available amino acid and acylcarnitine results in true positive cases (according to case definition as established by local protocols and/or professional guidelines; for example, the ACMG act sheets13); (d) performance metrics (detection rate, false positive rate, and positive predictive value4); and (e) answers to a series of multiple choice questions to define a participant profile (e.g., source of reagents, use of derivatization, date of collection, and punch size). On average, 25–30 users log in daily (40% are international), ranging from 0 to 145 on a given day.
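As an illustration of item (d), the following minimal sketch derives the three performance metrics from the four outcome counts of a program. The container name, the function name, and the counts are invented for this example, and expressing the detection rate as confirmed cases per newborn screened is an assumption; the project's own definitions are those of reference 4.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Outcome counts for one program (hypothetical container, not an R4S type)."""
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int


def performance_metrics(c: ScreeningCounts) -> dict:
    """Return detection rate, false positive rate, and positive predictive value."""
    screened = c.true_pos + c.false_pos + c.false_neg + c.true_neg
    return {
        # Assumption: detection rate = confirmed cases per newborn screened.
        "detection_rate": c.true_pos / screened,
        # Share of unaffected newborns who were flagged as abnormal.
        "false_positive_rate": c.false_pos / (c.false_pos + c.true_neg),
        # Share of abnormal results that were confirmed as true positives.
        "positive_predictive_value": c.true_pos / (c.true_pos + c.false_pos),
    }


# Invented counts, for illustration only.
example = performance_metrics(
    ScreeningCounts(true_pos=50, false_pos=150, false_neg=1, true_neg=99_799)
)
```

With these invented counts, one in four abnormal results is a confirmed case, which is the kind of figure that allows programs of different sizes to be compared on a common scale.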

Percentiles

Five percentile values (1%, 10%, 50%, 90%, and 99%) of each marker are calculated by standard statistical methods. When requested, assistance with data processing has been provided to 26 participating sites, who send the project coordinator anonymized raw data so that percentile values can be calculated and submitted on their behalf. Data can be entered manually or, preferably, by a semiautomated procedure based on the uploading of a comma separated value file (.csv) suitable for data transmission using LOINC codes.14 As of December 1, 2010, a total of 25,108 percentile values have been submitted by 102 participants; 70% of them have updated their profile after January 1, 2010. The age of specimen collection was 24–48 hours (57% of participating sites), 3 days (34%), or 5 days (9%). Each value is based on a variable number of cases, ranging from a few hundred to more than 1 million. As an example, the current percentile values of the Minnesota program alone are derived from 517,283 newborns tested by MS/MS15 between July 1, 2004, and August 31, 2010. Although the total number of subjects included in the calculation of percentiles at each site is not consistently available, an extrapolation of the data collected for calculation of performance metrics (total numbers of true positives, false positives, false negatives, and true negatives; data have been submitted by 59% of participants) leads to an estimate of approximately 25–30 million individuals.
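For sites that process their own raw data, the five percentile values could be obtained along the following lines. This is only a sketch: the text does not specify which interpolation method is used, so the "inclusive" method of the Python standard library is an assumption, and the function name and input values are invented.

```python
import statistics

PERCENTILES = (1, 10, 50, 90, 99)


def five_percentiles(values):
    """Return the 1st, 10th, 50th, 90th, and 99th percentiles of raw results."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in PERCENTILES}


# Invented marker concentrations in micromol/L: 101 evenly spaced values, 30..130.
raw = [float(v) for v in range(30, 131)]
summary = five_percentiles(raw)
```

Only the five summary values, not the raw results, would then be entered or uploaded, which is consistent with the anonymized nature of the submission.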

Cutoff values

The website is currently configured to upload 24 low and 90 high cutoff values. The standard unit is μmol/L. Data entry is manual only, divided into five categories: amino acids, amino acid ratios, acylcarnitines, acylcarnitine ratios, and second tier tests. The specific data of the latter group will not be discussed further in this study because they are not part of the primary screening by MS/MS. As of December 1, 2010, 5341 cutoff values (638 low and 4703 high) have been submitted by 113 participants; 69% of them have posted and/or updated their profile since January 1, 2010. The number of active cutoffs varies considerably from site to site (range: 2–114, average 44), but there has been a steady trend to include additional cutoffs once the clinical utility of a given marker has been highlighted by the collaborative project. For example, 43 participants have added a low cutoff for the amino acid methionine in response to the emerging evidence of its clinical utility for the primary identification of asymptomatic newborns affected with remethylation disorders.16

True positive cases

The website is configured to upload newborn screening data of 25 amino acid disorders (6 primary targets of the ACMG uniform panel, 8 secondary targets, and 11 other conditions [6/8/11]), 18 fatty acid oxidation disorders (5/8/5), counting as separate conditions maternal cases and carriers for very long-chain acyl-CoA dehydrogenase deficiency (OMIM number 201475) and medium-chain acyl-CoA dehydrogenase deficiency (MCAD, 607008), and 22 organic acid disorders (9/6/7). Figure 2 shows a summary of the number of cases entered into the R4S database (updated as of December 1, 2010). A total of 75 cases are annotated as false negatives (i.e., results that were reported as normal, but a diagnosis was made later based on clinical presentation); it is more than likely that a larger number of false negative cases were submitted but not disclosed. Nonketotic hyperglycinemia (OMIM# 605899), tyrosinemia type I (OMIM# 276700), ornithine transcarbamylase deficiency (OMIM# 300461), and different types of methylmalonic acidemia (Cbl A,B; OMIM# 251000, 251100, 251110) are the conditions with the greatest representation in the false negative group. The database also includes 670 cases (6.3%) extracted from the literature. These cases are not duplicate entries because they were published by laboratories that have declined invitations to be active participants of the collaborative project. Conditions are sorted in descending order by group (left: amino acid disorders; center: fatty acid oxidation disorders; and right: organic acid disorders) and by status in the ACMG uniform panel (top: primary targets; center: secondary targets; and bottom: other conditions1). The darker bar color and the higher section of the Y-axis scale reflect the project goal to collect at least 50 cases of each condition.
A case is considered eligible for submission if the following conditions have been met: (a) the diagnosis was confirmed by biochemical and in vitro testing according to local protocols; (b) results were from the first specimen only (no repeat samples); and (c) age at collection was between 1 and 7 days of life. A separate application in the newborn screening website domain has been activated to process the same types of data derived from routine second samples (project lead: Marzia Pasquali, PhD, University of Utah). Each case is assigned a unique code separate from any other traceable identifier, and no demographic information is collected except for the calendar year of birth. Accordingly, this project has been reviewed and approved as a minimum risk protocol by the Mayo Clinic Institutional Review Board (no. PR09-001709-01).

Fig. 2

Number of available cases per condition as of December 1, 2010, sorted in descending order. The split scale on the Y-axis and the darker color are used to highlight those conditions with at least 50 cases, the initial goal of the collaborative project. Panels A–C: primary targets of the ACMG uniform panel; panels D–F: secondary targets; and panels G–I: other conditions, including carriers and maternal conditions. Left column: amino acid disorders; middle column: fatty acid oxidation disorders; and right column: organic acid disorders. Abbreviations (in alphabetical order): 2M3HBA, 2-methyl 3-hydroxybutyryl-CoA dehydrogenase deficiency (OMIM number 300438); 2MBG, 2-short/branched chain acyl-CoA dehydrogenase deficiency (610006); 3MCC, 3-methylcrotonyl-CoA carboxylase deficiency (210200, 210210); 3MGA, 3-methylglutaconyl-CoA hydratase deficiency (250950); ARG, argininemia (207800); ASA, argininosuccinic acidemia (207900); B12 def, vitamin B12 deficiency; BIOPT (Reg), disorders of biopterin regeneration (261630); BIOPT (BS), disorders of biopterin biosynthesis (261640); BKT, β-ketothiolase deficiency (203750); CACT, carnitine:acylcarnitine translocase deficiency (212138); Cbl, cobalamin (complementation group); CIT-I, citrullinemia type I (215700); CIT-II, citrullinemia type II (605814, 603471); CPT-I, carnitine palmitoyltransferase Ia deficiency (255120); CPT-II, carnitine palmitoyltransferase II deficiency (255110); CPS, carbamylphosphate synthase deficiency (237300); CUD, carnitine uptake defect (212140); EE, ethylmalonic encephalopathy (602473); FIGLU, formiminoglutamic acidemia (229100); GA-II, glutaric acidemia type II (608053, 130410, 231675); GA-I, glutaric acidemia type I (231670); H-PHE, hyperphenylalaninemia (261600); HCY, homocystinuria (236200); het, heterozygote (carrier status); HMG, 3-hydroxy-3-methylglutaryl-CoA lyase deficiency (300438); IBG, isobutyryl-CoA dehydrogenase deficiency (611283); IVA, isovaleryl-CoA dehydrogenase deficiency (243500); LCHAD, long-chain L-3-hydroxy dehydrogenase deficiency (609016); M/SCHAD, medium/short-chain L-3-hydroxy acyl-CoA dehydrogenase def. (601609); MAL, malonyl-CoA decarboxylase deficiency (248360); (mat), maternal; MCAD, medium-chain acyl-CoA dehydrogenase deficiency (607008); MCD, holocarboxylase synthetase deficiency (253270); MCKAT, medium-chain ketoacyl-CoA thiolase deficiency (602199); MET, hypermethioninemias (250850); MSUD, maple syrup urine disease (248600); MTHFR, methylene tetrahydrofolate reductase deficiency (607093); MUT, methylmalonic acidemia (251000, 251100, 251110); NKHG, nonketotic hyperglycinemia (605899); OTC, ornithine transcarbamylase deficiency (300461); PA, propionic acidemia (606054); PC, pyruvate carboxylase deficiency (266150); PKU, phenylketonuria (261600); RED, 2,4-dienoyl-CoA reductase deficiency (222745); SCAD, short-chain acyl-CoA dehydrogenase deficiency (201470); TFP, trifunctional protein deficiency (609015); TYR-I, tyrosinemia type I (276700); TYR-II, tyrosinemia type II (276600); TYR-III, tyrosinemia type III (276710); TYR (trans), transient tyrosinemia; VLCAD, very long-chain acyl-CoA dehydrogenase deficiency (201475). Condition codes are according to Watson et al.1

The average rate of submission of true positive cases between December 1, 2008, and December 1, 2010, was 5.1 cases/day (3,651 cases). The total number of true positive cases is approaching 11,000, and the range of number of cases per condition is 1–2,057, with a median of 47. However, if only the primary targets of the ACMG panel were to be considered (N = 7,288), the range and median are 18–2,057 and 210, respectively. All primary conditions have exceeded 50 cases except 3-hydroxy 3-methylglutaryl-CoA lyase deficiency (OMIM number 300438; N = 35), β-ketothiolase deficiency (BKT, 203750; N = 26), and multiple carboxylase deficiency (MCD, 253270; N = 18). As of December 1, 2010, a total of 562,609 results (analyte values and calculated ratios) have been submitted by 113 sites; 85% of the active participants have posted at least one new case since January 1, 2010. Not surprisingly, the number of cases submitted by individual sites varies considerably, ranging from 1 to 870, with an average of 130.

RESULTS

The main deliverable of the R4S collaborative project is the definition of evidence-based cutoff target ranges for all analytes detected by MS/MS and related ratios. The cutoff target range could be either above (high) or below (low) the normal population: the high target range is defined as the interval between the cumulative 99th percentile of the normal population and the lowest 5th percentile of all disorder ranges of the same marker (if the analyte is informative for multiple conditions). On the other hand, the low target range is defined as the interval between the highest 99th percentile of disorder ranges and the 1st percentile of the normal population.
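The two definitions above can be written out as a short sketch. The function names and all numeric values are invented for illustration; only the choice of limits follows the text.

```python
def high_target_range(normal_p99, disorder_p5_values):
    """Interval from the normal 99th percentile to the lowest disorder 5th percentile."""
    return (normal_p99, min(disorder_p5_values))


def low_target_range(disorder_p99_values, normal_p1):
    """Interval from the highest disorder 99th percentile to the normal 1st percentile."""
    return (max(disorder_p99_values), normal_p1)


def has_gap(target_range):
    """True when the limits are in order, i.e. normal and disorder ranges do not overlap."""
    lower, upper = target_range
    return lower < upper


# Invented values: one marker that is informative for two conditions.
high = high_target_range(normal_p99=100.0, disorder_p5_values=[160.0, 240.0])
low = low_target_range(disorder_p99_values=[6.0, 8.5], normal_p1=11.0)
# Invented overlap: the disorder range starts below the normal 99th percentile.
overlapping = high_target_range(normal_p99=0.5, disorder_p5_values=[0.3])
```

When a marker is informative for several conditions, taking the minimum (or maximum) across conditions makes the target range protective of the condition whose disorder range lies closest to the normal population. A range whose limits come out in the wrong order, as in the last invented example, signals overlap between the two populations; how such a limit is then adjusted is not modeled in this sketch.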

Table 1 presents the 1st percentile, 50th percentile (median), and 99th percentile cumulative values of amino acid and acylcarnitine species. In response to the recent introduction of a modified commercial kit17 that does not include derivatization to butyl esters18 (used by 31% of R4S participants), the different overlaps of isobaric acylcarnitine species with and without derivatization (shown as [D] and [U], respectively; the two analytes are combined by the symbol “&”) are shown separately. For each percentile value, the coefficient of variation (calculated as standard deviation/mean) is also shown. Despite existing differences in preanalytical and analytical variables, including collection age,19 and the inconsistent use and reporting of decimal digits at the submicromolar level, overall variability of the median values was on average 23% for amino acids (Fig. 3, analyte comparison tool for the amino acid phenylalanine; see figure legend for details) and 27% for acylcarnitines. Similar results were observed for all calculated amino acid and acylcarnitine ratios (data not shown; available on request). Notable exceptions that showed greater variability were argininosuccinic acid (Asa) and succinylacetone (Suac). The former showed significant differences between sites, with approximately half of the participants reporting normal values at a level much higher than seen in plasma, where Asa is usually undetectable.20 The remaining sites reported normal percentiles comparable with concentrations seen in plasma of normal newborns. Such a difference is unlikely to be explained solely by the analysis of a different specimen, neonatal dried blood spots, hence there seems to be some analytical factor behind these observations other than reagents and derivatization mode. 
There were no obvious differences based on the R4S participant profile comparison tool (data not shown), which is not fully informative because of the relatively small number of participants who are actively monitoring this analyte (N = 26). Suac is a relatively recent addition, also with limited participation (N = 22), and is measured using a variety of methods.21–24 It has been suggested that the observed variability could be improved by standardization of preanalytical variables.25,26 Despite these issues, both Asa and Suac are absolutely required for the reliable detection of two primary targets of the uniform panel, argininosuccinic acidemia (OMIM number 207900), and tyrosinemia type I (276700). Every effort should be made to expand their utilization and to mitigate existing differences among laboratories.
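The coefficient of variation reported alongside each cumulative percentile is the standard deviation divided by the mean of the values submitted by individual sites. A minimal sketch follows; the use of the sample (n − 1) standard deviation is an assumption, because the text does not state which form is used, and the site medians are invented.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, returned as a fraction."""
    # Assumption: sample standard deviation (n - 1 denominator).
    return statistics.stdev(values) / statistics.mean(values)


# Invented median values of one marker as submitted by three sites (micromol/L).
site_medians = [40.0, 50.0, 60.0]
cv = coefficient_of_variation(site_medians)
```

In this invented example the result is 0.2, or 20%, a value of the same order as the averages reported above for amino acids and acylcarnitines.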

Table 1 Amino acid and acylcarnitine cumulative percentiles in neonatal dried blood spots analyzed by tandem mass spectrometry by participants of the Region 4 Stork collaborative project (as of December 1, 2010)
Fig. 3

R4S analyte comparison tool for the amino acid phenylalanine in neonatal dried blood spots. Each box represents the interval between the 10%ile and 90%ile; the upper and lower lines extend to the 99%ile and 1%ile, respectively. The median is shown as a white circle in the body of the box. Color coding: dark green: cumulative percentiles; light green: percentiles of individual participants, sorted in descending order of the 99%ile value; orange: cutoff target range (see text for details); light blue diamonds: actual cutoff values of participants (the marker size is proportional to the number of laboratories using the same value); and bright red bars: disorder ranges (partially hidden by Y-axis reduction to allow the normal percentiles to be visible). For the number of cases included in each disorder range, see Table 2. Abbreviations are listed in the legend of Figure 2.

The second defining element of the target range is calculated from the disorder ranges of all conditions related to a specific marker or ratio. The attribution of a marker to a condition stems from an objective process applied to establish a threshold of clinical utility. In the R4S project, clinical significance is attributed to a marker-to-condition association when at least half of the true positive cases with a given condition have values outside the normal population, defined as the interval between the average 1st and 99th percentiles calculated from the data submitted by all participating sites. An example of the above process is shown in Figure 4, the plot by condition for BKT. This R4S plot shows, on a log scale, a comparison between normal and disorder ranges after conversion of all quantitative values to the corresponding multiple of the average median. BKT was chosen as an example in this study because the R4S plots were instrumental for the initial recognition that hydroxy butyrylcarnitine (C4-OH) is a highly informative marker of this condition, one that moving forward should not be overlooked in the complex differential diagnosis of an elevated concentration of hydroxy isovalerylcarnitine (C5-OH).27
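The attribution rule and the conversion to multiples of the median can be sketched as follows. The function names and the numeric values are invented for illustration, and treating an empty list of cases as uninformative is an assumption of this sketch.

```python
def multiple_of_median(value, normal_median):
    """Express a result as a multiple of the normal population median (MoM)."""
    return value / normal_median


def is_informative(case_values, normal_p1, normal_p99):
    """True when at least half of the true positive cases fall outside
    the interval between the normal 1st and 99th percentiles."""
    if not case_values:
        return False  # assumption: no cases, no attribution
    outside = sum(1 for v in case_values if v < normal_p1 or v > normal_p99)
    return 2 * outside >= len(case_values)


# Invented results (micromol/L) for one marker in four true positive cases.
cases = [0.2, 1.5, 2.0, 3.1]
informative = is_informative(cases, normal_p1=0.05, normal_p99=1.0)
mom = multiple_of_median(1.5, normal_median=0.5)
```

Three of the four invented cases lie above the normal 99th percentile, so the marker would be attributed to the condition. Expressing results as multiples of the median puts markers with very different concentration scales on a common axis, which is what the log-scale plot by condition relies on.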

Fig. 4

R4S plot by condition for β-ketothiolase (BKT) deficiency. This plot converts each case value to the corresponding multiple of the cumulative median (MoM). Each box represents the interval between the 10%ile and 90%ile; the upper and lower lines extend to the 99%ile and 1%ile, respectively. The median is shown as a white circle in the body of the box. Color coding: red: disorder ranges of informative markers; gray: disorder range of uninformative markers; and green: range of normal population. For the number of cases for informative analytes, see Table 4. Abbreviations are listed in the legend of Figure 2.

Based on the results of 10,679 true positive cases, the disorder ranges of amino acids, amino acid ratios, acylcarnitines, and acylcarnitine ratios are listed in Tables 2–5, respectively. The suffix “(low)” attached to a marker indicates clinical significance below the normal population, triggering the selection of a low cutoff value. A need for a low threshold was documented for all types of analytes: amino acids (3), amino acid ratios (7), acylcarnitines (7), and acylcarnitine ratios (6), underscoring how underused they currently are. The disorder ranges for a given analyte are condition-specific and listed together to facilitate comparative analysis. Rows are sorted in descending order based on the median value; the number of cases for each condition is also provided. Differences between analyte counts related to the same condition reflect the variability of past and current testing panels of the participants. For example, 107 participants have a cutoff for octanoylcarnitine (C8), but only 87 of them also monitor decenoylcarnitine (C10:1). However, these differences have declined substantially since the beginning of the collaborative project.

Table 2 Amino acid disorder ranges in neonatal dried blood spots analyzed by tandem mass spectrometry by participants of the Region 4 Stork collaborative project (as of December 1, 2010)
Table 5 Disorder ranges of acylcarnitine ratios in neonatal dried blood spots analyzed by tandem mass spectrometry by participants of the Region 4 Stork collaborative project (as of December 1, 2010)

All data shown earlier in the text are combined to achieve the primary objective of this project, which is the definition of clinically relevant cutoff target ranges. Table 6 presents all markers with a low cutoff target range. In addition to the number of cases, and how many conditions could be detected, it introduces the key concept of “override.” One or both limits of a target range may need to be adjusted in response to the degree of overlap between normal population and disorder range. The ideal situation (no override at either end) occurs in 35% of all markers combined (40/114), amino acids 32%, and acylcarnitines 38%. The opposite scenario (need to override at both ends because of pervasive overlap) was encountered in 25% of the markers. The intermediate situation (partial overlap at either limit) is frequent (40%) and reflects the variability of the biochemical phenotype of these disorders in asymptomatic newborns and underscores the importance of using an evidence-based rather than statistical approach to the selection of a cutoff value. Tables 7 and 8 present the high cutoff target ranges for amino acids and acylcarnitines, respectively. The only difference is found in the reliance on the lowest 5th percentile of all disorder ranges for a given analyte. The choice of a slightly higher limit is driven by the recognition that false negative cases have been encountered in virtually all conditions.28–30 Once the possibility of a cutoff value set too high has been considered,31 it must be recognized that a small number of cases could just be undetectable on the sole basis of their biochemical phenotype. Although this is unfortunate, the quest for perfect sensitivity should not be a reason to artificially set cutoff values so close to the normal population that they trigger very large numbers of false positive events.

Table 6 Low cutoff target ranges of amino acids, acylcarnitines, and ratios
Table 7 High cutoff target ranges of amino acids and amino acid ratios
Table 8 High cutoff target ranges of acylcarnitines and acylcarnitine ratios

In addition, Tables 6–8 present the distribution of cutoff values below, within, and above the respective target range. Overall, 42% (2269/5341) of all submitted values are within the target range, 15% (788) are positioned to have low specificity but high sensitivity, and the rest (43%, 2282/5341) are set at a level where false negative outcomes are likely to occur. The most striking observation is that 42% of these cutoffs with potentially poor sensitivity are applied to 37 markers with no overlap between normal population and disorder range. This group should be scrutinized closely to identify adjustments, which could be relatively easy to implement in the pursuit of performance improvement.
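The tally of cutoff values relative to a target range can be sketched as follows. The function names and the list of cutoff values are invented, and counting a cutoff equal to either limit as "within" is an assumption; the target range of 0.21–0.70 μmol/L is the one quoted for octanoylcarnitine (C8) in the Discussion. For a high cutoff, "below" corresponds to a value inside the normal population (high sensitivity, low specificity) and "above" to a value at risk of false negative results; for a low cutoff the interpretation is mirrored.

```python
from collections import Counter


def classify_cutoff(cutoff, target_range):
    """Position of a cutoff value relative to a (lower, upper) target range."""
    lower, upper = target_range
    if cutoff < lower:
        return "below"
    if cutoff > upper:
        return "above"
    return "within"  # assumption: limits are inclusive


def summarize(cutoffs, target_range):
    """Count how many cutoffs fall below, within, and above the target range."""
    return Counter(classify_cutoff(c, target_range) for c in cutoffs)


# Target range quoted for C8 (micromol/L); the cutoff values are invented.
c8_target = (0.21, 0.70)
counts = summarize([0.15, 0.30, 0.50, 0.70, 0.80, 1.00], c8_target)
```

For these invented values, one cutoff falls below the range, three fall within it, and two fall above it.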

DISCUSSION

We have reported the status of a worldwide collaborative project aimed at laboratory quality improvement of newborn screening by MS/MS. The central strategy of this effort is to assemble enough evidence to establish clinical utility using a more effective method for the selection of cutoff values. Traditionally, this is done by statistical elaboration, either as a given percentile of the normal population or by adding multiples of the standard deviation to the mean value. Hence, the criteria to define abnormality are almost exclusively based on normal results. Once cutoffs selected in this manner are implemented, negative feedback from the follow-up system (too many false positives) or the dreaded occurrence of a false negative case may lead to abrupt changes, often resulting in the opposite problem. This situation is compounded by the reality that most programs have actually never encountered a case affected with 30–80% of the conditions that they are testing for. A large repository of true positive cases exists and could have been helpful to advance this project further, but it is under the control of a commercial entity, and the information is treated as proprietary.32 However, four programs who outsource testing to the same company have nevertheless joined the project and have submitted limited sets of data (true positive cases only). In the interest of time, and particularly of the vulnerable population we serve, the lack of available information had to be addressed because of the anecdotal nature of single site experiences and of the inherent risk of making uninformed choices. These are often caused by limited familiarity with the complexity of the biochemical phenotype of metabolic disorders and to some extent with the technology being used.

Tolerance of some degree of analytical variability and an unprecedented willingness to share data have resulted in a vast body of evidence, which has been used for clinical validation of amino acid and acylcarnitine cutoff values. Rather than the conventional statistical approach, we have sought the definition of gaps in analyte concentrations between the normal population and the disorder range of rare disorders. These gaps were either naturally occurring or carefully selected by consensus expert opinion. This approach leads to a substantial expansion of the number of markers that are potentially informative for a condition, both at the high and low end. In some cases, new associations between a marker and a condition are documented. For example, our database has confirmed a previously reported, and somewhat unexpected, association between disorders of propionate metabolism and an elevated concentration of hydroxy hexadecenoylcarnitine (C16:1-OH).33 Despite causing some initial consternation among new users, broadening the definition of clinical significance is critical to explain that cutoff values should not be used as a boundary between normal and abnormal, and to place them instead in the role of review flags calling attention to cases that require an assessment in terms of pattern recognition and profile interpretation.34 Clearly, there remain conditions with only a very small number of cases (<5), and the disorder ranges are at this stage merely a preliminary indication of the magnitude of results a laboratory may expect to encounter in an affected newborn. On the other hand, the utility of disorder ranges is not limited to the definition of a cutoff target range. For example, the median value of a disorder range could be used as a more objective alternative to “panic values,” which are used inconsistently by many programs on the basis of mostly anecdotal and/or arbitrary information.
After verification by a repeat analysis of the same specimen, a value that exceeds the median of the disorder range (e.g., a C8 concentration of 7 μmol/L or a C14:1 concentration of 1.8 μmol/L in the presence of a characteristic profile for MCAD and VLCAD deficiency, respectively) could become a valid reason to question the wisdom of collecting a repeat sample (dried blood spot) instead of proceeding directly to confirmatory testing by biochemical, enzymatic, and molecular means. Several days could be saved in the process, increasing the probability of preventing a first symptomatic event in the undiagnosed newborn, an episode that may have severe consequences.

The process described in this study is not complete and will continue until the goal of collecting 50 or more cases of each possible condition has been met. Notably, the number of conditions detectable by the analysis of amino acids and acylcarnitines will continue to grow, too.35,36 Furthermore, there are several improvements planned for future implementation. The highest priorities are to statistically validate the override process in response to overlaps and the exclusion of extreme outliers from the calculation of ranges. This work is already in progress (Ryu et al., unpublished results) and will greatly improve the strength and clinical validity of the target ranges. On the other hand, this analysis is likely to result in much tighter ranges, and consequently, a greater proportion of cutoff values will fall outside the suggested interval. For example, a preliminary analysis of the first 1300 cases with MCAD deficiency suggests that the target range for octanoylcarnitine (C8) would change from 0.21–0.70 μmol/L to just 0.33–0.38 μmol/L (95% confidence interval). When applicable, this statistical revision will take into consideration as a determining factor the growing number of available second-tier tests.37–42 These tests are performed on the same specimen submitted for the primary screening, with no additional patient contact, and target informative analytes that are not included in the primary screening.

Traditionally, conditions with essentially identical biochemical phenotypes have been lumped together on the assumption of almost indistinguishable profiles. Although that is more than likely the reality, participants will be asked to assign their cases, whenever possible, to one condition of a pair, for example, either long-chain L-3-hydroxyacyl-CoA dehydrogenase deficiency or trifunctional protein deficiency,43 or methylmalonic acidemia due to either mutase deficiency or belonging to the complementation groups Cbl A and Cbl B.44 Another improvement will be the gradual introduction of condition subtypes based on clinical or molecular criteria, similar to the arrangement already in place where carnitine palmitoyltransferase Ia deficiency patients with the common P479L mutation45,46 are shown separately from the other patients with the same condition but different genotypes. MCAD deficiency, for example, will be split into three categories: homozygosity for the common mutation (A985G/A985G), compound heterozygosity with one A985G allele (A985G/other), and homozygosity or compound heterozygosity with no A985G allele (other/other).47 Similarly, sorting of cases with isovaleric acidemia will be based on the presence of the A282V allele, which is found frequently in patients detected by newborn screening.48 Other conditions to be split into multiple subtypes are SCAD deficiency (based on profiles of pathogenic mutations and common polymorphisms)49,50 and homocystinuria (based on pyridoxine responsiveness).51
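
As an illustration of the planned three-way split of MCAD cases, the assignment can be expressed as a count of A985G alleles. The sketch below is a minimal example, assuming each case is recorded as a pair of allele labels; the function name and the placeholder label for a non-A985G allele are hypothetical.

```python
COMMON_ALLELE = "A985G"


def mcad_genotype_category(allele_1, allele_2):
    """Assign a case to one of the three categories described in the text,
    based on how many of its two alleles are the common A985G mutation."""
    common_count = (allele_1 == COMMON_ALLELE) + (allele_2 == COMMON_ALLELE)
    categories = {
        2: "A985G/A985G",  # homozygosity for the common mutation
        1: "A985G/other",  # compound heterozygosity with one A985G allele
        0: "other/other",  # no A985G allele
    }
    return categories[common_count]


# "variant_x" is a placeholder label for any allele other than A985G.
print(mcad_genotype_category("variant_x", "A985G"))
```

Counting alleles rather than comparing ordered pairs keeps the result independent of the order in which the two alleles are recorded.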

The data in this publication are deliberately kept at a global level, with no possibility to attribute any of them to a single participant. This is far from the reality of the tools accessible to the users on the R4S website, where up-to-date personalized reports, called comparison tools, are available to analyze the behavior of every percentile and cutoff value in the context of the collective experience. Each participant can generate a report where cutoff values are flagged as “clinically validated” when they meet two conditions: (a) they are within the target range and (b) they fall within the 25th–75th percentile range of all cutoff values. At the same time, special emphasis is placed on cutoff values standing at either the highest or lowest ranking among all sites. The rationale of highlighting such outliers is that corrective action could ensue, leading to an adjustment to a more realistic level. As another laboratory is consequently placed in the same outlier position, this process has been very effective in reducing extreme anomalies and narrowing the distribution curve.
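
The two conditions for the “clinically validated” flag, together with the highlighting of the highest and lowest cutoff values, can be sketched as follows. This is not the code behind the R4S comparison tools; the function names are hypothetical, the site cutoff values are made up, and the choice of the "inclusive" quartile method is an assumption, since the text does not state how the percentiles are computed.

```python
from statistics import quantiles


def is_clinically_validated(cutoff, target_range, all_site_cutoffs):
    """Condition (a): cutoff within the target range.
    Condition (b): cutoff within the 25th-75th percentile of all cutoffs."""
    low, high = target_range
    q1, _, q3 = quantiles(all_site_cutoffs, n=4, method="inclusive")
    return low <= cutoff <= high and q1 <= cutoff <= q3


def is_extreme_rank(cutoff, all_site_cutoffs):
    """True if the cutoff is the highest or lowest among all sites."""
    return cutoff in (min(all_site_cutoffs), max(all_site_cutoffs))


# Made-up C8 cutoff values (umol/L) from seven hypothetical sites.
site_cutoffs = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
c8_target_range = (0.21, 0.70)  # target range for C8 quoted in the text

for cutoff in site_cutoffs:
    print(
        cutoff,
        is_clinically_validated(cutoff, c8_target_range, site_cutoffs),
        is_extreme_rank(cutoff, site_cutoffs),
    )
```

In this made-up example a cutoff of 0.7 μmol/L sits inside the target range but outside the interquartile range of the site cutoffs, so it would not be flagged as clinically validated, which shows that the two conditions are independent.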

On the basis of an encouraging trend of increased interest in recent months, it is worthwhile noting that the R4S project is hardly limited to laboratory personnel and could be beneficial to the practice of all professionals involved at different stages of the newborn screening system. It could be used to assess and monitor performance, investigate challenging cases by means of postanalytical interpretive tools, and access educational material. For example, since 2007, 139 individuals have attended a week-long training course, which is open to all active users (offered with no registration fee) as an opportunity to improve postanalytical skills, acquire familiarity with the tools of the R4S website, and network with other users. Many US users have received funding for travel and lodging from other Regional Collaboratives. Moreover, MS/MS is just one of eight live applications on the newborn screening domain (Fig. 5). Two of them (lysosomal storage diseases and severe combined immunodeficiency) are supported in part by a contract from the Newborn Screening Translational Research Network (www.nbstrn.org). The vision behind this expansion is to create an infrastructure of identical applications for each of the current and future metabolite-based newborn screening tests, i.e., not based on molecular methods. Access to these applications is stratified from a curator role, with complete access to all data and profiles, to the same read/write and read-only roles found in the MS/MS application. Based on our experience to date, it is critical to identify curators who are content experts, willing and capable of monitoring the quality of the submissions coming in, and to provide feedback to less experienced users.

Fig. 5

Live applications on the newborn screening domain on the R4S website. Abbreviations are as follows (in alphabetical order): ALD, X-linked adrenoleukodystrophy (administrative oversight of this application is provided by Silvia Tortorelli, MD, PhD, Mayo Clinic College of Medicine); BIOT, biotinidase deficiency (Tina Cowan, PhD, Stanford University; Robert Grier, PhD, and Barry Wolf, MD, PhD, Wayne State University Medical School); CAH, congenital adrenal hyperplasia (Piero Rinaldo, MD, PhD, Mayo Clinic College of Medicine; Kyriakie Sarafoglu, MD, University of Minnesota); CH, congenital hypothyroidism (unassigned); LSD, lysosomal storage diseases (Dietrich Matern, MD, Mayo Clinic College of Medicine); MS/MS [2], routine second specimen of newborn screening by tandem mass spectrometry (Marzia Pasquali, PhD, University of Utah); NBS, newborn screening; SCID, severe combined immunodeficiency (Roshini Abraham, PhD, Mayo Clinic College of Medicine; Mei Baker, MD, Wisconsin State Laboratory of Hygiene; Amy Brower, PhD, American College of Medical Genetics; Michele Caggana, PhD, New York State Department of Health; Anne Comeau, PhD, University of Massachusetts; and Fred Lorey, PhD, California Department of Public Health).

In conclusion, the R4S collaborative project has paved the way to a collegial and transparent process for clinical validation of newborn screening by MS/MS and potentially of any other laboratory tests for rare disorders if a comparable level of cooperation could be reproduced. The critical factors behind the unanticipated expansion of the collaborative project into a worldwide initiative have been the gain of mutual trust among participants, the belief in the equal standing of all sites regardless of the magnitude of their contributions, and the vision to create tools that motivate users to be actively involved. Indeed, users of the collaborative project have contributed data because they believed that tangible value was being added to their professional practice. As the project continues, even greater participation is needed, and every effort will be made to welcome new sites and users.

Table 3 Disorder ranges of amino acid ratios in neonatal dried blood spots analyzed by tandem mass spectrometry by participants of the Region 4 Stork collaborative project (as of December 1, 2010)
Table 4 Acylcarnitine disorder ranges in neonatal dried blood spots analyzed by tandem mass spectrometry by participants of the Region 4 Stork collaborative project (as of December 1, 2010)