Main

Technological advances in screening have enabled newborn screening (NBS) programs to detect many disorders, most of which are inherited. This leads to the early identification of babies with conditions that may cause disability or death if not detected and treated within days of birth. Most of these conditions present few signs or symptoms in the neonatal period or in the first few years of life; routine NBS is the only way of identifying children at risk for these disorders before the onset of symptoms, by which time treatment may be too late.

Follow-up of children identified by NBS is critical and necessary to achieve the program's full promise and public health benefits. Early access to services and treatment and ongoing, high-quality medical management and care coordination should be provided to the children and families affected by the NBS conditions.

State Departments of Health in the United States have established individual statewide NBS programs for laboratory screening and short-term follow-up of all newborns with screen-positive results. This ensures that all infants are screened, that abnormal results are appropriately and expediently handled, that affected infants are promptly identified and appropriately referred, and that treatment is initiated in time for early intervention.1

To assure the best possible outcomes for individuals with disorders identified through NBS, it is necessary to conduct long-term follow-up (LTFU), which begins with the initiation of treatment after a diagnosis is obtained through short-term follow-up of those children with screen-positive results. Other LTFU goals include care coordination through a medical home, evidence-based treatment, continuous quality improvement, and new knowledge discovery.2 LTFU is an important process of data collection and analysis for advancing the public health understanding of the impact of these disorders on affected children and of their health outcomes. Recent surveys of state NBS programs in the United States have described and analyzed the types of policies and practices related to LTFU activities for children with confirmed NBS disorders.3–6 The authors found that many state NBS programs lack an LTFU component. The NBS programs that do conduct LTFU face various challenges and barriers that may impact their ability to perform it effectively. These include a lack of comprehensive quality assurance practices, scope of mandate, outsourcing, financial constraints, and communication problems with providers who treat these patients.

Currently, there are very few data on long-term outcomes of children diagnosed after a positive newborn screen. The Centers for Disease Control and Prevention has conducted several studies using short- and long-term measures to evaluate the effectiveness of NBS for specific disorders and to assess long-term developmental outcomes of children identified through the Georgia state NBS program.7,8 The findings of cases of developmental disabilities of varying severity attributable to a metabolic or endocrine disorder by Van Naarden Braun et al.8 suggest a need for ongoing population-based monitoring of the long-term developmental outcomes of children identified through NBS. A study on LTFU of patients with phenylketonuria identified by the NBS program in Japan showed the long-term outcome data to be valuable for the improvement of blood phenylalanine levels for patients with phenylketonuria and for the evaluation of the effectiveness of the initiatives and treatment guidelines issued during the follow-up period.9

In September 2008, the New York State Department of Health (NYSDOH) was awarded funds by the Centers for Disease Control and Prevention to develop and implement a population-based surveillance and tracking system in New York State (NYS) for children with confirmed NBS conditions through collaboration between the NBS and birth defects surveillance programs. Establishing a population-based surveillance and tracking system in NYS will allow the NYSDOH to enhance the collection of high-quality data on confirmed NBS cases, advance the public health understanding of the impact and short- and long-term outcomes of children with confirmed NBS conditions, and ensure that all affected children have early access to services and treatment. This article describes a model for LTFU of children with confirmed NBS conditions using the population-based health surveillance and administrative data sets that are routinely collected and maintained by the NYSDOH. Preliminary results from these data linkages are presented.

METHODS

Data sources

NBS program

The NBS Program housed in the Wadsworth Center of the NYSDOH performs more than 11 million tests annually for more than 40 heritable diseases, congenital hypothyroidism (CH), and human immunodeficiency virus (HIV) exposure on approximately 250,000 babies born in NYS. Approximately 4,300 babies per year are referred for additional diagnostic testing, and approximately 14% of that group (600 babies) is expected to be diagnosed with one or more of the screened disorders.10 Currently, the NBS program consists of the laboratory and short-term follow-up components. Testing is performed using dried blood spots collected from all babies born in NYS. If the laboratory results warrant additional testing, a referral is made to the infant's healthcare provider, the birth hospital, and a specialist in that particular disease area. The NBS program then follows up with healthcare providers to track the information on the confirmatory testing and closes the case after a diagnosis is made. The NBS database contains demographic information from the mother, the newborn, and the birth hospital/healthcare provider. It also contains the screening results and short-term follow-up information including the confirmatory test results for those children with screen-positive results and information on the treatment center and the primary healthcare provider.

Congenital Malformations Registry

The NYS Congenital Malformations Registry (CMR) in the Center for Environmental Health was established as part of the Environmental Disease Surveillance Program in 1982. Hospitals and physicians are required to report to the CMR all children 2 years of age or younger who were born or reside in NYS and were diagnosed with any structural, functional, or biochemical abnormality determined genetically or induced during gestation and not due to birthing events. A list of reportable major birth defects can be obtained elsewhere.11 Excluding HIV, approximately 90% of the conditions currently screened by the NBS program are reportable to the CMR. The CMR is one of the largest statewide population-based birth defects registries in the nation, receiving birth defects reports on more than 11,000 children annually from among approximately 250,000 live births.11 The CMR data contain demographic information on the children and parents, birth defect diagnoses, and information on case reporting hospitals and physicians.

Vital records

Birth and death certificate files of NYS residents are maintained in the Bureau of Biometrics and Health Statistics of the NYSDOH. Birth certificates include demographic information, maternal medical conditions, and other risk factors. Death certificates have demographic and cause of death information.

Hospital discharge files

Hospital discharge files containing inpatient and outpatient data are routinely collected and maintained by the Statewide Planning and Research Cooperative System (SPARCS) of NYSDOH. SPARCS, implemented by the NYSDOH in 1979, is a comprehensive, integrated information system which receives, processes, stores, and analyzes the inpatient hospitalization data from all facilities in NYS and ambulatory surgery data from hospital-based ambulatory surgery services and all other facilities providing ambulatory surgery services. As mandated, healthcare providers submit data electronically to SPARCS containing patients' date of birth, medical record number, admission and discharge information, diagnoses and procedures, and health insurance information.

Early intervention program

The Early Intervention (EI) program in the Center for Community Health of the NYSDOH is a statewide program offering therapeutic and support services for children up to 3 years of age with special needs and their families. EI is designed to enhance the development of infants and toddlers with disabilities and/or developmental delays. The EI database contains demographic information on the child, initial diagnosis and recommended services, and information about services actually received.

Study cohort

Children who were born in 2006–2007 and had confirmed NBS conditions including endocrine disorders, hemoglobinopathies, metabolic disorders, and cystic fibrosis were identified from the NBS database. The data set with identified cases contains child and mother's demographic information and other identifying information and the confirmatory diagnosis. The study protocol for accessing records containing identifying information was approved by the Institutional Review Board of the NYSDOH. Data use agreements were obtained from the project staff before conducting this study.

Long-term follow-up

Data acquisition and linking NBS data to other available administrative databases were the main components of the surveillance and tracking system established for this project. The study cohort was followed up to 2 years after birth by matching the NBS cases to the birth certificate files to obtain birth variables and selected parental risk factors; matching to the CMR database to identify other congenital disorders; and matching to SPARCS hospital discharges, the EI database, and death certificates to monitor the healthcare and service utilization, comorbidities, and mortality status of this cohort.

Data linkage

The Statistical Analysis System (SAS, NC) was used to develop programs for data linkage. Deterministic data linkage methods were used with multiple criteria for establishing matches between records. A successful deterministic linkage relies not only on the completeness of data but also on choosing an appropriate combination of common identifiers. As there was no unique identifier available among the data sources, personal identifiers such as last name, sex, and date of birth were used as common identifiers in matching databases deterministically. Other potential indicators such as residential address, medical record number, and birth weight were also used to further improve the accuracy of record linkage. Several combinations of the matching variables were used to identify all possible matches. Points were assigned to each criterion. Assigning different points to different identifiers provided a way to recognize variations in quality or reliability of different data elements. Records were compared on the selected matching variables until (1) a match was found, (2) a possible match was found, or (3) the list was exhausted without finding a match. For the possible matches, staff manually reviewed each record to decide whether to consider it a match.
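The point-based comparison described above can be illustrated with a minimal Python sketch. The field names, point values, and sample records below are hypothetical and chosen only for illustration; the project's linkage programs were written in SAS, and the actual points assigned to each identifier are not reported in this article.

```python
# Hypothetical points per matching variable: identifiers assumed to be more
# reliable earn more points. These values are illustrative, not the study's.
POINTS = {
    "last_name": 3,
    "date_of_birth": 3,
    "sex": 1,
    "residential_address": 2,
    "medical_record_number": 4,
    "birth_weight": 1,
}


def normalize(value):
    """Trim and upper-case a value so trivial formatting differences
    (case, surrounding spaces) do not prevent agreement."""
    if value is None:
        return None
    text = str(value).strip().upper()
    return text or None


def match_score(record_a, record_b, points=POINTS):
    """Sum the points of every variable that is present in both records
    and agrees after normalization. Missing values earn no points."""
    score = 0
    for field, weight in points.items():
        a = normalize(record_a.get(field))
        b = normalize(record_b.get(field))
        if a is not None and a == b:
            score += weight
    return score


# Two made-up records from different sources describing the same child.
nbs_record = {
    "last_name": "Smith ",
    "date_of_birth": "2006-03-14",
    "sex": "F",
    "residential_address": "12 Main St",
    "medical_record_number": None,
    "birth_weight": 3200,
}
birth_record = {
    "last_name": "SMITH",
    "date_of_birth": "2006-03-14",
    "sex": "f",
    "residential_address": "12 Main Street",
    "medical_record_number": None,
    "birth_weight": 3200,
}

score = match_score(nbs_record, birth_record)
print(score)
```

In this sketch the address disagreement and the missing medical record number simply contribute no points, so a pair can still reach a high score on the remaining identifiers. This is one way to reflect differing reliability among data elements.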

The accuracy of linkage was examined by cross-validating personal information obtained from the matching data sources. Questionable links identified by the computerized process were checked manually to remove false matches (false positives) and identify false nonmatches (false negatives). The data linkage rate was calculated as the number of true matches divided by the total number of records available for matching.
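As a small worked illustration of the linkage rate defined above, the following Python sketch assumes that the number of true matches equals the automated links minus the false positives removed on manual review plus the false negatives recovered on manual review. That bookkeeping is our reading of the procedure, and the counts used are hypothetical rather than taken from the study.

```python
def true_match_count(automated_links, false_positives, recovered_false_negatives):
    """True matches after manual review (assumed bookkeeping):
    drop false matches, add back links the program missed."""
    return automated_links - false_positives + recovered_false_negatives


def linkage_rate(true_matches, records_available):
    """Number of true matches divided by the records available for matching."""
    if records_available <= 0:
        raise ValueError("records_available must be positive")
    if not 0 <= true_matches <= records_available:
        raise ValueError("true_matches must lie between 0 and records_available")
    return true_matches / records_available


# Hypothetical counts for illustration only.
true_matches = true_match_count(
    automated_links=190, false_positives=4, recovered_false_negatives=6
)
rate = linkage_rate(true_matches, records_available=200)
print(f"{rate:.1%}")
```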

Integrated database system

Microsoft Access was used to create an integrated database system for collecting, managing, and analyzing the records of affected children containing linked information from multiple sources. Structured Query Language was used in developing applications for users to view, update, or query records and to check data quality by searching for missing or inconsistent records. The integrated database is password protected, maintained on a secure, dedicated project server of NYSDOH, backed up on a daily basis, and accessible only to key project personnel.
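The kind of data quality query described above can be sketched as follows. This example uses Python's built-in sqlite3 module rather than Microsoft Access, and the table name, column names, and sample rows are hypothetical; it only shows how Structured Query Language can flag missing or inconsistent linked records.

```python
import sqlite3

# In-memory database standing in for the integrated database (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE linked_child (
        nbs_id        TEXT PRIMARY KEY,  -- identifier from the NBS source
        birth_cert_id TEXT,              -- identifier from the birth certificate source
        date_of_birth TEXT,              -- ISO 8601 date string
        date_of_death TEXT               -- ISO 8601 date string, NULL if none recorded
    )
    """
)
conn.executemany(
    "INSERT INTO linked_child VALUES (?, ?, ?, ?)",
    [
        ("N001", "B100", "2006-05-01", None),
        ("N002", None, "2007-01-15", None),          # missing birth certificate link
        ("N003", "B102", "2007-03-10", "2007-02-01"),  # death recorded before birth
    ],
)

# Records with missing key fields.
missing = [
    row[0]
    for row in conn.execute(
        "SELECT nbs_id FROM linked_child "
        "WHERE birth_cert_id IS NULL OR date_of_birth IS NULL "
        "ORDER BY nbs_id"
    )
]

# Records with inconsistent dates; ISO 8601 strings compare in date order.
inconsistent = [
    row[0]
    for row in conn.execute(
        "SELECT nbs_id FROM linked_child "
        "WHERE date_of_death IS NOT NULL AND date_of_death < date_of_birth "
        "ORDER BY nbs_id"
    )
]

print(missing, inconsistent)
conn.close()
```

Keeping each source's own identifier in the linked table, as in the two hypothetical ID columns here, is what allows a flagged record to be traced back to the contributing source for correction.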

RESULTS

Figure 1 shows the flowchart of the data linkage process that includes the following steps: (1) selecting records and matching variables from the data sources, (2) standardizing the matching variables, and (3) performing record matching. By reviewing the records and the matching scores, point thresholds for defining a match, possible match, or nonmatch were established. For instance, if the score was greater than a threshold, P1, the record was considered a match; if the score was within a defined range between P2 and P1, where P2 < P1, the record was a possible match; and if the score was below P2, there was no match. The project staff manually reviewed each possible match and its matching variables to identify correct matches.

Fig. 1

A flowchart of the data linkage process used for the long-term follow-up of children with confirmed newborn screening disorders in New York State. Note: P1 and P2 are defined point thresholds and P2 < P1.
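The threshold rule summarized in Figure 1 can be written as a short Python sketch. The threshold values used in the example (P1 = 8, P2 = 5) are hypothetical, since the actual thresholds are not reported, and scores exactly equal to P1 or P2 are treated here as possible matches, following the wording "greater than P1" and "below P2".

```python
def classify(score, p1, p2):
    """Classify a candidate pair by its matching score.

    score > p1        -> "match"
    p2 <= score <= p1 -> "possible match" (sent for manual review)
    score < p2        -> "nonmatch"
    """
    if p2 >= p1:
        raise ValueError("p2 must be smaller than p1")
    if score > p1:
        return "match"
    if score >= p2:
        return "possible match"
    return "nonmatch"


# Hypothetical thresholds and scores for illustration only.
P1, P2 = 8, 5
results = {score: classify(score, P1, P2) for score in (10, 8, 6, 3)}

# Only the possible matches would go to staff for manual review.
needs_review = [s for s, label in results.items() if label == "possible match"]
print(results, needs_review)
```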

As shown in Figure 2, the data elements obtained from these multiple sources by record linkages included birth variables, parental demographics, and risk factors and mortality status from vital records; comorbidity and hospitalization data from SPARCS hospital discharge files; birth defects from the CMR; and the comorbidity evaluations and services the children and families received from the EI program. An integrated relational database was developed for the surveillance and tracking system to incorporate the information obtained from these multiple sources by record linkages. Unique identification fields from each data source were included in the database, so that additional source information for records in the integrated system was always accessible from the contributing sources when needed.

Fig. 2

Data collection and integration for long-term follow-up of children with confirmed newborn screening disorders in New York State; data from record linkages using existing data sources that are routinely collected and maintained by the New York State Department of Health.

Using the automated data matching program, a matching rate of 97% was achieved between the NBS cohort and live birth certificates for children born in NYS excluding New York City (the New York City live birth data were not available for this project). Additional matches were found by extensive manual searching against the birth certificates using the information obtained from the CMR and other data sources. The efforts resulted in 98% of the children being matched to their birth certificates.

The first and second year morbidity and mortality experience and healthcare and service utilization were assessed by data linkages. A total of 1215 children born in 2006–2007 were identified with confirmed NBS conditions including endocrine and metabolic disorders, hemoglobinopathies, and cystic fibrosis. Figure 3 shows the results from the data linkage of NBS children to data sources including SPARCS hospital discharges, CMR, EI, and death certificates by NBS disorder categories examined. The percentage of the children matched to each data source varied by condition. A majority of the children (>76%) used hospital inpatient or outpatient services during the follow-up period (up to 2 years after birth). Forty-four percent of NBS children with hemoglobin disorders were enrolled in the CMR, whereas only 14.5% of children with metabolic disorders were enrolled in the CMR. The percentage of children matched to the EI database was 32.0% among children with endocrine disorders but only 10.2% among children with hemoglobin disorders and 10.8% among children with cystic fibrosis. In addition, 5.5% of the children with endocrine disorders were matched to their death certificates.

Fig. 3

Results from data linkage of children with confirmed newborn screening disorders (N = 1215) to data sources including hospital discharges, Congenital Malformations Registry (CMR), Early Intervention, and death certificates: percent of matched records by NBS disorder categories (Birth cohort: 2006 and 2007 New York State live births; follow-up period: up to 2 years after birth).

Table 1 presents the results of the data linkage of the cohort to data sources by individual disorder within each condition category. Overall, 86.1% of the children were matched to hospital discharge files, 36.1% to the CMR, 19.9% to the EI, and 2.1% to death certificates. The percent of the children matched to each data source varied by disorder and by data source. The percentage of children matched to SPARCS hospital discharge files was >80% among children with congenital adrenal hyperplasia, CH, sickle cell disease, thalassemia, other hemoglobin disorders, fatty acid oxidation disorders, and organic acid disorders. More than 50% of the children with congenital adrenal hyperplasia and sickle cell disease were matched to the CMR. Among 25 deceased children identified through matching to the death certificates, 23 had CH, one had sickle cell disease, and one had a fatty acid oxidation disorder.

Table 1 Results from data linkage of children with confirmed newborn screening (NBS) disorders (N = 1215) to data sources including hospital discharges (Hosp), Congenital Malformations Registry (CMR), Early Intervention (EI), and death certificates (DCs) by disorder category (Birth cohort: 2006 and 2007 New York State live births; 2 years after birth)

For the 439 children who were matched to the CMR, we examined the birth defect conditions recorded in CMR and compared them with the confirmatory diagnoses from NBS. As presented in Table 2, 82.5% of the children had the same diagnoses in both data sources. The percentage of agreement in diagnoses was >90% for children with congenital adrenal hyperplasia, sickle cell disease, and cystic fibrosis. Overall, 24.1% of the children had multiple congenital malformations (comorbidities) recorded in the CMR. The percentage of children with comorbidities found in the CMR varied by individual NBS condition.

Table 2 Results from data linkage: Diagnosis information for children who had confirmed newborn screening (NBS) disorders and were matched to the Congenital Malformations Registry (CMR) (N = 439), by disorder category (Birth cohort: 2006 and 2007 New York State live births; follow-up: 2 years after birth)

DISCUSSION

LTFU of the NYS children with confirmed NBS disorders using existing administrative data routinely collected and maintained by the NYSDOH is a very useful approach for population-based surveillance and tracking. This systematic data collection approach leverages existing resources and has proved to be inexpensive and efficient. It allows periodic assessment of healthcare services utilization and health outcomes for NBS children with confirmed disorders. The data linkage programs allow efficient batch matching of large data sets.

Through collaboration and partnerships among established statewide public health surveillance programs and healthcare service agencies, we were able to conduct a pilot study of long-term outcomes for this cohort. This was accomplished by use of data linkage to collect information on health status and healthcare utilization. This integrated child health information database system can be used to evaluate the health of individuals and assess their utilization of services. This information can also be used by public health professionals, advocates, and policymakers to assess health outcomes of these children and improve and strengthen the services and support systems for the children throughout their lifespan. Linking the NBS children to other administrative data sources such as immunization and early hearing detection and intervention would also be helpful in developing integrated child health information systems to meet medical care and public health needs, which result in improved child health. However, the project staff did not have access to these data sources for the current project.

This study found that of 1215 children born in 2006–2007 and confirmed with this group of conditions, 25 deaths (2.1%) occurred, 86.1% used hospital inpatient or outpatient services, 36.1% were enrolled in the CMR, and 19.9% used the services provided by the EI program during the 2-year follow-up period. The EI uptake rate among the study cohort is significantly higher than that among the New York newborn population (12%) (EI program, unpublished data). Studies are underway to identify children who had confirmed NBS disorders such as CH and who were enrolled in the EI Program. Detailed analyses will be performed to describe and evaluate the care and services received by the children and their families.

We found that the majority of deaths among the cohort occurred in children with a diagnosis of CH. Among the deceased children with confirmed CH, 65% had very low birth weight (VLBW) (450–940 g) and 30% had low birth weight (1000–2415 g). A recent study reported that the incidence of early, transient hypothyroidism in VLBW infants defined by thyrotropin concentration was eight times that in term infants.12 Hypothyroidism (primary or transient) is prevalent in VLBW babies. Because of prematurity and/or the presence of congenital malformations or complications, VLBW and low birth weight babies experience high mortality rates.

Approximately 90% of the conditions screened by the NBS program are reportable to the CMR. However, only approximately 36% of the cohort (children with confirmed NBS disorders) was found in the CMR through data linkage. This could be because not all children with confirmed NBS disorders required hospitalization. They may have been cared for and treated by physicians as outpatients. Although it is mandatory for hospitals and physicians to report birth defect cases to the CMR, not all physicians who treat children with birth defects in office settings comply. The project staff has been working on identifying the children with confirmed NBS conditions who were missed by the CMR. A separate manuscript will be prepared to summarize the findings of using the NBS database as an additional data source to improve the completeness of case ascertainment of the CMR.

The completeness and accuracy of record linkage rely on the availability and quality of matching variables in data sources. For the NBS and birth certificate data sources, nearly 100% of the records have nonmissing values for the matching variables including child's date of birth, last name, mother's last name, and residential address. Ideally, all NBS children born in NYS should be matched to their birth certificates. We successfully matched 98% of children with confirmed NBS disorders to their birth certificates for the 2006–2007 births in NYS, excluding infants from New York City. The 2% nonmatches (false negatives) could have resulted from incomplete or incorrect information on the matching variables recorded in one or both data sources or from out-of-state births to NYS residents, whose physicians send in an initial screen when the baby comes to their office. We were not able to assess the matching rate between the NBS children and SPARCS hospital discharge files, CMR, or EI data. Not all children with confirmed NBS disorders require hospitalization, have the major congenital malformations that are reportable to the CMR, or require early intervention services. However, we were able to modify the matching programs and algorithms to minimize the number of false positives (records that have high matching scores but are not true matches) and false negatives (records that have relatively low matching scores but are true matches). Staff manually checked for both false positives and false negatives to ensure the accuracy and completeness of data linkage. In the current project, matching NBS children to the SPARCS hospital discharge files was very challenging because there are no patient names in the SPARCS data set. We were able to improve the matching by using patient medical record numbers and other available demographic variables such as date of birth, gender, race, birth weight, and residential address. In addition, multiple admissions of the same patient to the same hospital or to different hospitals helped to improve the chance of matching.

A recent study ascertaining the extent of patients being lost to follow-up in an observational HIV cohort using a data linkage method found that this approach helped to identify patients returning for care at different centers and thus reduced the number of patients lost to follow-up in the cohort study.13 In the current project, 72 (20%) of the 356 children, who were born in 2006–2007 and were lost to follow-up (had positive screening results but no confirmatory diagnoses), were found through data linkage to the CMR and EI database, and 280 (79%) were found through data linkage to SPARCS hospital discharge (data not shown). The information obtained from the CMR, EI, and SPARCS hospital discharge databases will help to evaluate the health outcomes of these children.

The availability of administrative data provides opportunities for monitoring access to needed healthcare and identifying health disparities in services utilization in a large, targeted population. Because of these potential advantages, there has been an increasing use of the existing administrative data in healthcare research and disease surveillance.14,15 A similar model for using state resources to develop sickle cell disease surveillance across the lifespan has been proposed recently.16 The most commonly used administrative data include federal- and state-specific vital records, hospital discharge data sets, and public and private health insurance claims databases.17

Some limitations of administrative data in general have been identified17,18 including potential misclassification, underreporting, lack of sociodemographic information, inaccuracy of some data fields, and reporting bias of some indicators. To improve the accuracy of the follow-up information obtained by data linkage, the project staff is conducting a review of clinical records for children seen and treated at the Inherited Metabolic Specialty Treatment Centers across New York. Data collected at our treatment centers will be compared with those obtained by data linkage to verify the quality and accuracy of the data from the administrative data sources such as the SPARCS hospital discharge files.