Main

Newborn screening has been a successful public health program for nearly half a century, since the introduction of newborn screening for phenylketonuria (PKU) in 1963.1,2 Universal newborn screening for PKU was initiated after presymptomatic diagnosis proved to be essential to a successful long-term outcome. Since that time, additional disorders have been added to newborn screening programs once it became evident that early diagnosis, before symptom onset, improved the health and development of affected individuals. Many factors contribute to long-term outcome, including the natural history and genetic heterogeneity of a condition, timing of diagnosis, implementation and management of treatment, adherence to medical recommendations, socioeconomic factors, access to treatment and resources required for optimal care, and the organization of the health care delivery system (Fig. 1). These factors have been researched most systematically in PKU. In 2000, a national conference on optimal management of PKU identified multiple measures required for a good outcome, including timely diagnosis, prolonged dietary restriction of phenylalanine, availability of varied nutritional dietary products, guidance by experienced care providers including specialized nutritionists, and economic factors such as coverage of special formulas and low-protein dietary products.3

Fig. 1 Factors affecting long-term outcome of newborn screening.

The application of tandem mass spectrometry (MS/MS) to newborn screening, first described by Millington et al.4 in 1990, allowed for a drastic increase in the number of metabolic disorders screened for in the neonatal period. In 2005, the American College of Medical Genetics Newborn Screening Expert Group established a universal panel of metabolic disorders that would benefit from newborn screening in the United States.5 This universal panel has been gradually mandated and implemented throughout the United States.6 However, with evidence-based data being scarce, many important questions must still be answered to ensure that the implementation of a newborn screening program is ultimately followed by a good outcome for the affected child. An important question is whether children identified through newborn screening have a better outcome, in both overall health and development, than children diagnosed clinically.7 The benefit of early diagnostic recognition has been documented for some conditions, such as medium-chain acyl-CoA dehydrogenase deficiency, but not for the majority of the disorders on the universal panel.8 Large-scale longitudinal studies with long-term data are needed to understand the clinical spectrum and to assess the impact of newborn screening.9,10 Defined datasets are a necessity to collect similar data regionally or even nationally.

In 2006, only about half of the state and territorial newborn screening programs in the United States conducted long-term follow-up.11 Of those states, only about half had a standard protocol in place, and there was great variation in long-term follow-up. Nonetheless, it was realized that the full benefits of newborn screening require a framework for long-term follow-up, particularly as the program is expanded to more disorders. The Association of Maternal and Child Health Programs' Newborn Screening and Genetics Advisory Group outlined the components of long-term follow-up to include assessing patient progress through defined outcome indicators, collecting and analyzing state and national long-term follow-up data, and engaging in continuous quality improvement.12

Such longitudinal studies are challenging, and to date, very few reports exist on long-term follow-up of the disorders detected by newborn screening. A German study tracked 106 affected newborns identified through MS/MS for a period of 42 months.13 Of the 70 babies deemed to require treatment, six developed symptoms and three of those children died.13 Sixty-one remained asymptomatic with normal psychomotor development, no major disabilities, and no metabolic crises, thus demonstrating the benefit of newborn screening and early treatment.13 However, this study lacked quantitative measurements of intelligence and motor development. A long-term follow-up study of 50 infants identified by newborn screening in Massachusetts, Maine, and Pennsylvania used quantitative outcome measurements such as number of hospitalizations, utilization of services, and quantitative measures of development (Bayley Scales of Infant Development and the Stanford-Binet Intelligence Scale).14 On average, patients identified by newborn screening received treatment 4 months sooner than those identified clinically.14 The newborn screening cohort experienced fewer developmental and health problems and functioned better in aspects of daily living.14 They had 60% fewer medical problems and scored higher on developmental tests.14

There are several challenges in collecting long-term data for these rare disorders. First, the rarity of the conditions makes single-center studies impractical because it would take many years for a center to collect enough cases for statistical significance in outcome studies. Therefore, collaborative studies are a necessity but often introduce additional variables. Second, the natural history of many disorders is not completely understood. New late complications, which have yet to be described, may develop in older patients. Newborn screening also identifies a new subset of patients with possibly milder, more benign forms of some of the disorders, which previously went undetected, indicating a need for genotype-phenotype correlations.15 Finally, controversy exists as to whether some disorders, such as short-chain acyl-CoA dehydrogenase deficiency and 3-methylcrotonyl CoA carboxylase (3-MCC) deficiency, cause clinical disease.

Thus, the metabolic providers in the Mountain States region set out to establish a consortium and develop a comprehensive program for long-term follow-up to study the factors involved in maximizing the outcome of children identified through newborn screening. This article describes the process and goals of the Mountain States Genetics Regional Collaborative Center's (MSGRCC) Metabolic Newborn Screening Long-term Follow-up Study.

OBJECTIVES

The primary objective of the MSGRCC Metabolic Newborn Screening Long-term Follow-up Study was to develop a long-term follow-up program over a sufficiently large population in a suitably homogenous manner that allowed the systematic analysis of factors that affect long-term outcome of all inborn errors of metabolism identified by newborn screening. To provide for a sufficient number of patients, a multistate collaborative consortium was organized. Homogeneity of data was achieved through the development of minimal disease-specific care plans and shared datasets that focused on disease characteristics including genotype, treatment criteria, and common measurable long-term outcome parameters. The datasets also included objective measures of neurocognitive and functional outcome. It is planned that all these parameters will be included in a single database. Using these data, parameters that critically affect long-term outcome will be evaluated. A primary question to be analyzed is whether presymptomatic identification through newborn screening has measurable objective benefits on long-term outcome, when compared with symptomatic clinical detection. Hence, the database will also include patients identified through clinical ascertainment, for whom the same parameters are tracked. In addition, minimum disease-specific care parameters in the database will provide a baseline from which to analyze the impact of clinic-specific variations in treatment, thus allowing systematic studies of improvements in treatment strategies. The minimum parameters will also serve as a baseline to track adherence to medical recommendations as judged against the outcome. Finally, the heterogeneity of the study region over a large geographic area, combined with similar homogenous treatment parameters, will allow for studies on the impact of socioeconomic and organizational variables on the long-term outcome of the conditions involved.

MSGRCC OVERVIEW

The MSGRCC is one of seven regional collaborative centers in the nation and includes the states of Arizona, Colorado, Montana, New Mexico, Nevada, Texas, Utah, and Wyoming. The center is funded by the US Department of Health and Human Services, Health Resources and Services Administration, Genetics Services Branch. The MSGRCC in turn funds the MSGRCC Metabolic Consortium.

The Mountain States region is unique in its vast geographical size and cultural diversity. There are more than 38 million people in this eight-state region with a land area of 1,081,813 square miles.16 Although there are some major metropolitan areas in the region, much of this area is rural with an overall population density of approximately 37 people per square mile, about one half of the national population density.16 The region is culturally diverse and includes large populations of Native Americans, Hispanic Americans, African Americans, and groups from Southeast Asia and Eastern Europe.16

There are more than 600,000 births annually in this region.16 With expanded newborn screening detecting approximately 1 in 4,000 babies with a metabolic disease, approximately 150 babies who require specialty care by a metabolic center are born every year in the Mountain States region. The metabolic centers in the Mountain States region are located in large cities. These clinics cover a vast geographic region that includes urban and rural areas and often spans more than one state. Currently, there are 12 major metabolic centers in the region, with only four of those centers outside of the state of Texas. Before the start of the MSGRCC Metabolic Newborn Screening Long-term Follow-up Study, there were well-established collegial relationships among many of the metabolic providers in the Mountain States region.
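The caseload estimate in this paragraph is simple arithmetic. A minimal Python sketch, using only the rounded figures quoted in the text (600,000 births per year and an incidence of about 1 in 4,000), makes the calculation explicit. The function and constant names are illustrative and are not part of the study.

```python
def expected_annual_cases(annual_births: int, births_per_case: int) -> float:
    """Expected affected newborns per year for an incidence of 1 in `births_per_case`."""
    if births_per_case <= 0:
        raise ValueError("births_per_case must be positive")
    return annual_births / births_per_case


# Rounded figures quoted in the text; actual counts vary from year to year.
REGION_BIRTHS = 600_000
BIRTHS_PER_CASE = 4_000

if __name__ == "__main__":
    print(expected_annual_cases(REGION_BIRTHS, BIRTHS_PER_CASE))  # 150.0
```

Because the text reports "more than 600,000" births, the result is best read as a lower bound of roughly 150 new patients per year.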

METHODS

The MSGRCC Metabolic Newborn Screening Long-term Follow-up Study is an ongoing process including establishment of a consortium, development of disease-specific care plans and outcome measures (shared datasets), development of neuropsychological measures, review and implementation of care plans, database development, data collection of performance indicators and outcome measures, analysis of datasets, and continued review and collaboration by the consortium (Fig. 2).

Fig. 2 Development and review of disease-specific care plans and shared datasets.

A number of factors were considered when developing the MSGRCC long-term follow-up study. The first was unbiased enrollment, meaning all patients diagnosed by newborn screening with an inborn error of metabolism would be included. Second, common shared data elements needed to be determined, so that similar data would be collected from all metabolic clinics involved. Third, data would be collected prospectively and obtained longitudinally. Fourth, a consensus on a minimally homogenous approach to diagnosis and treatment would be developed over multiple centers. Fifth, clinic-to-clinic variability would be respected and its impact captured. Finally, these data collection measures should be possible within the very limited resources available to this project in this region.

The first step in this project was to establish a consortium of clinical care providers from the metabolic clinics throughout the region including biochemical geneticists, genetic counselors, metabolic dietitians, nurses, and consumers (Table 1). This group included representation from all the states located in the Mountain States region. After the establishment of the consortium, the group began development of disease-specific care plans for diagnosis and treatment of all the inborn errors of metabolism that are currently screened for in the region. The process included review of current literature and current clinical practice of the metabolic clinics, development of the disease-specific care plans with specific outcomes for each, preliminary review by the metabolic team in Colorado, review by the consortium of all disease-specific care plans, revision of the care plans with appropriate outcome measures, and distribution and implementation of the care plans. The consortium continues to meet yearly to review and update the care plans and outcome measures.

Table 1 MSGRCC metabolic consortium

The disease-specific care plans are for long-term treatment and follow-up and are not to be confused with the ACTion sheets, which were developed by the American College of Medical Genetics for short-term follow-up.17 These datasets identify minimal treatment criteria as a tool to measure compliance with clinical recommendations with subsequent comparison to outcome. The datasets are not to be considered standards of care but rather a framework to achieve homogenous data collection, a necessity for comparison of data across multiple clinics. A consensus by the consortium was reached about what current treatment is minimally appropriate and necessary. Differences in treatment beyond the minimally agreed-on care will be documented to allow later comparison of outcomes. In rare metabolic diseases, where randomized controlled trials are often not feasible, such analyses can provide insight into the impact of variations in treatment strategy.

Each care plan included common components, such as growth and cognitive outcome, and disease-specific considerations. For instance, in the care plan for very long-chain acyl-CoA dehydrogenase (VLCAD) deficiency, special considerations consisted of cardiomyopathy, arrhythmia, hepatic dysfunction, rhabdomyolysis, maternal complications during pregnancy such as acute fatty liver of pregnancy and HELLP syndrome, and possible normalization of acylcarnitine profile after an abnormal newborn screening.18 The VLCAD protocol included dietary treatment considerations of low-fat diet with limited long-chain triglycerides and supplementation with medium-chain triglyceride oil. For follow-up, the care plans outlined minimal frequency of clinic visits and laboratory studies including baseline studies, interim monitoring laboratories for dietary follow-up, and laboratory studies to be done at clinic visits or on a yearly basis. The care plans addressed emergency management including evaluations and laboratory studies obtained during illness. Subspecialty evaluations were included, such as cardiology evaluation in VLCAD deficiency (Appendix).

For all disorders, developmental components will be tracked using neuropsychological testing to evaluate the cognitive and developmental sequelae. A detailed neuropsychological profile has been described in only a few metabolic disorders, such as executive functioning deficits in PKU.19 Thus, a standard developmental protocol was established to capture subtle developmental concerns for all disorders. The protocol includes frequent developmental monitoring through yearly assessments at clinic visits by parent questionnaires, namely the Alpern-Boll Developmental Profile-II and the Child Behavior Checklist. The latter is also available for Spanish-speaking families, an important factor in the diverse population of the Mountain States region. At ages 3 and 6 years, more thorough developmental information will be gathered with the Wechsler Preschool and Primary Scale of Intelligence-III. A full neuropsychological evaluation at age 9 years will be done by neuropsychologists and will include particular disease-specific developmental concerns, such as executive function in PKU and speech dyspraxia in galactosemia. Psychiatric concerns in certain disorders, such as PKU and maple syrup urine disease, will be addressed at age 18 years.
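The age-based schedule described in this paragraph can be sketched as a small lookup function. This is an illustration only, not the consortium's protocol or software: the function name and the assessment labels are hypothetical, and the sketch assumes (the text does not say) that the yearly parent questionnaires run from age 1 through age 18.

```python
from typing import List

# Disorders the text names as examples with psychiatric concerns addressed at age 18.
PSYCHIATRIC_FOLLOW_UP = {"PKU", "maple syrup urine disease"}


def assessments_due(age_years: int, disorder: str) -> List[str]:
    """Return the developmental assessments scheduled at a given age.

    Assumption (not stated in the text): yearly parent questionnaires
    start at age 1 and continue through age 18.
    """
    due: List[str] = []
    if 1 <= age_years <= 18:
        due.append("parent questionnaires")
    if age_years in (3, 6):
        due.append("WPPSI-III")
    if age_years == 9:
        due.append("full neuropsychological evaluation")
    if age_years == 18 and disorder in PSYCHIATRIC_FOLLOW_UP:
        due.append("psychiatric concerns review")
    return due


if __name__ == "__main__":
    for age in (2, 3, 9, 18):
        print(age, assessments_due(age, "PKU"))
```

Keeping the schedule in one function means a change agreed at a consortium review would need to be made in only one place.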

Based on the disease-specific care plans, performance indicators and outcome measures were established for each disorder. Performance indicators are benchmark data used to track and compare disease parameters such as laboratory values and number of hospitalizations. Outcome measures are designed to measure the result of the interventions. For example, in PKU, the number of phenylalanine levels obtained and the yearly average of phenylalanine values constitute performance indicators, whereas the neuropsychological evaluation including intelligence quotient and overall level of functioning is an outcome measure. The performance indicators and outcome measures are shared datasets that the region will use as the framework for the analysis of long-term follow-up.
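The distinction between performance indicators and outcome measures can be illustrated with the PKU example. The following Python sketch is a hypothetical record layout, not the consortium's actual dataset or database schema: the class name, field names, and sample values are all invented for illustration, and the phenylalanine values carry no particular unit.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class PkuYearlyRecord:
    """One patient-year of PKU follow-up data (hypothetical layout)."""

    patient_id: str
    year: int
    # Blood phenylalanine values recorded during the year.
    phe_levels: List[float] = field(default_factory=list)
    # Intelligence quotient, present only in years when testing was done.
    iq: Optional[int] = None

    def performance_indicators(self) -> Dict[str, Optional[float]]:
        """Number of phenylalanine levels obtained and their yearly average."""
        count = len(self.phe_levels)
        return {
            "phe_level_count": count,
            "phe_yearly_mean": mean(self.phe_levels) if count else None,
        }

    def outcome_measures(self) -> Dict[str, Optional[int]]:
        """Result of the intervention, here reduced to the intelligence quotient."""
        return {"intelligence_quotient": self.iq}


if __name__ == "__main__":
    # Made-up values, for illustration only.
    record = PkuYearlyRecord("example-001", 2009, [4.0, 6.0, 5.0])
    print(record.performance_indicators())
    print(record.outcome_measures())
```

Separating the two kinds of data in the record mirrors the text: indicators describe how closely care was monitored, while outcome measures describe what that care achieved.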

After the identification of this data framework, the next challenge was the development of tools for data collection and for data storage. The data collection tools must be easily integrated into the current clinic routine without burdening the metabolic providers. The final data will be entered into a shared web-based database. The Clinical Health Information Records of Patients database, developed by the Colorado Department of Public Health and Environment for use in newborn screening, is being adapted to store the data. Patients will receive a unique patient identifier, and consent will be obtained by each separate metabolic center per a protocol approved by the local institutional review board. A direct benefit is the portability of the data when patients migrate between the states within the consortium. Another benefit to patients is that, when clinical trials become available for specific disorders, those included in the database could be alerted to the opportunity to participate.

RESULTS

Disease-specific care plans along with disease-specific performance indicators and outcome measures for 28 inborn errors of metabolism were developed and completed by the MSGRCC Metabolic Consortium (Table 2). After the completion of these components, the care plans were distributed to and trialed in multiple metabolic clinics throughout the Mountain States region. Care plans are used as tools in the clinic, and each care plan and associated dataset will soon be available online. In the Mountain States region, expanded newborn screening began in most states within the last 3 years. Because the disease-specific care plans were initiated very shortly after that, almost all data from patients identified by newborn screening will be collected longitudinally and prospectively.

Table 2 Disease-specific care plans and shared datasets

Data collection tools are being piloted by three separate metabolic centers, and data are being entered in the Clinical Health Information Records of Patients database. Pilot data on socioeconomic and organizational indicators are being collected for PKU using the dataset from the past 3 years in Colorado before expanding the dataset to other states. Additional issues still being addressed are the ownership of the data and the possibility of a shared institutional review board protocol to be used by multiple metabolic clinics.

The MSGRCC Metabolic Consortium meets annually to review the project, discuss new findings reported in the literature, and adapt the care plans, performance indicators, and outcome measures. Data collection and database maintenance are also reviewed. Finally, the consortium will review new research proposals using the shared datasets.

DISCUSSION

The individual rarity of many inborn errors of metabolism detected by newborn screening using MS/MS has resulted in limited scientific literature addressing treatment. Only for the most common disorders, such as medium-chain acyl-CoA dehydrogenase deficiency, do single-center studies comprise a sufficient number of patients to derive reliable conclusions on issues such as the medical impact of the introduction of newborn screening.8 Multicenter studies such as the Australian collaboration, similar to the Mountain States, can provide more detailed answers.20 Care providers usually have to rely on limited publications and on individual experience, sometimes strengthened by discussions with colleagues around the world on such forums as the Metab-L listserve or informal discussions at national or international conferences. As a result, the treatment of these rare disorders differs from metabolic clinic to metabolic clinic or even within the same clinic from physician to physician. This lack of uniformity further impedes the grouping of cases into large empirical studies. Consequently, it remains unclear what the best course of treatment is for many of the disorders, and long-term data are required to develop evidence-based treatment strategies. Recent progress has been made on glutaric acidemia type I by collecting information from around the world, but for most disorders, many questions remain unanswered.21

Similar to other consortiums, the MSGRCC pools data from a group of metabolic treatment centers. It is unique in that it starts by defining relevant outcome measures such as neuropsychological outcomes and quality of life measures, then will systematically and prospectively study a full complement of determinants of these outcomes. Some factors to be analyzed will include the genetic variation within the disorders, the medical approaches to diagnosis and treatment, and the socioeconomic and health care organizational factors. The uniformity allows pooling of data across the region to develop data-based studies within a reasonable time frame for all but the rarest conditions. This will provide the basis for a future data-driven, rather than expert-driven, approach to the management of these patients.

Many challenges remain in the care of patients with these disorders, and addressing them would benefit greatly from such data collection. These challenges include late complications in older survivors, milder genotypes that may require a different treatment approach, the appropriateness of screening for certain disorders where new clinical information is becoming available, controlled trials of dietary approaches, introduction of new medications and treatments, the role of government support for treatment too expensive for most individuals but a necessity for good outcome, and the optimal organization of health care provisions within the medical home model considering the specialized nature of the disorders. Overall, the primary question is whether patients detected by newborn screening have a better overall outcome than those patients diagnosed clinically.

The ultimate goal of newborn screening is to prevent death and serious medical complications. When patients live longer, new medical complications may be recognized, as in the case of speech dyspraxia and ovarian failure for patients with galactosemia surviving the neonatal period, and maternal PKU syndrome.15,22 Systematic and prospective follow-up is needed to identify these issues. Follow-up of patients identified by newborn screening has resulted in new insights into the symptom spectrum of patients identified without bias. Newborn screening now identifies a new subset of patients with possibly milder, more benign forms of some of the disorders. Previously, such patients may have gone undetected as their recognition was difficult, and the course of disease was not serious enough to dictate evaluation and diagnosis.15 Phenotype and genotype correlations are beginning to emerge to separate severe presentation from milder cases in VLCAD deficiency and isovaleric acidemia.23-25 Additional data are necessary to further strengthen this division and, thus, prevent overtreatment. At the extreme end of this question, the generally benign nature of certain conditions when identified without bias by newborn screening, such as short-chain acyl-CoA dehydrogenase deficiency and 3-MCC deficiency, raises the question of whether patients will benefit sufficiently from newborn screening to warrant the inclusion of these diseases.26-28 In recent years, certain programs have removed conditions such as 3-MCC deficiency from the newborn screening panel due to the low ratio of benefit to harm.29

The development of disease-specific care plans and shared datasets resulted in extensive roundtable discussions in which information and experience were shared. Each member of the MSGRCC Metabolic Consortium found great value in the debate at the workgroup sessions. No one member of the consortium was an expert in all the disorders screened. The consortium helped foster collegial relationships with the opportunity to share experiences, ask questions, and offer treatment advice. Through the pooling of the experts' opinions into a consensus, a greater quality of evidence-based care and greater uniformity of the approach to patient care were immediately achieved, resulting in improvement in the consistency of care. More formal approaches to the development of multiple expert opinions such as the Delphi method, recently developed for the diagnosis and care for 3-MCC deficiency and for VLCAD deficiency, can easily be integrated into these care plans.30,31 Overall, the experience benefited all those involved. Practical resources such as emergency protocols and parent information were pooled and disseminated.

In the economic reality of limited resources, it is important to guide policymakers with data on future decisions regarding the most effective application of financial support and the optimal organization of the care for disorders identified by newborn screening to reap its full potential benefit.32 Long-term follow-up studies might aid in securing continued financial support from both federal and state legislatures and government-funded programs. In particular, current funding for newborn screening could be validated by evidence showing the importance of newborn screening. The National Institutes of Health consensus development conference on PKU published in 2001 revealed that inconsistent policies for funding of medical foods and low-protein products have created a barrier to access, even though such products were deemed essential for maintenance of metabolic control throughout life.3 The panel recommended that these medical foods and products be reimbursed by third-party payers. However, the implications of inconsistent coverage for long-term outcome have not been systematically studied, and little change has occurred in the approach by third-party payers or by individual states. Finally, the rarity of the individual conditions and the limited expertise of primary care providers in these disorders pose special problems in the grounding of care in the medical home while delivering the most effective service. A centralized approach in an expert center delivered more effective care than a decentralized approach, but the impact of costly travel clinics on a "hub and spoke model" for large geographical areas has not been reviewed.33 Our large regional model will allow us to address such issues with factual data about the impact of various care models on long-term outcome.