Neonatal sepsis, broadly defined as a systemic infection of predominantly bacterial origin in neonates1,2,3, is a leading cause of infant mortality and morbidity around the world1,4,5,6,7,8,9,10. Even in high-resource countries where the incidence of confirmed neonatal sepsis is relatively low (1–5 cases per 1,000 live births3,5,9,11,12) and all such infants would receive treatment, approximately 40% will either die or suffer major developmental disabilities13. Although reported incidence rates of neonatal sepsis from low-resource countries are generally higher14,15,16, cases are likely underascertained owing to many factors, including a lack of access to care, poor quality care, and a lack of adequate laboratory services16,17. In one review of community-based studies from low- and middle-income countries, incidence rates ranged from 49 to 170 cases per 1,000 live births across 11 studies16.

The successful clinical management of neonatal sepsis is predicated on early detection and prompt initiation of antibiotic therapy; however, this is difficult to achieve in practice owing to the non-specific signs and symptoms of sepsis, low sensitivity of the gold standard blood culture test (particularly following intrapartum antibiotic prophylaxis), and delayed availability of culture results (approximately 48–72 hours after blood collection)7,8,9,12,18,19,20,21. To prevent negative health outcomes in the face of these diagnostic challenges, antibiotic therapy is generally initiated in all clinically-suspected cases of neonatal sepsis22. However, concerns are being increasingly raised about this recommended approach22,23, as it contributes to unnecessary or prolonged antibiotic therapy, antimicrobial resistance, and heightened risk of nosocomial infection due to increased length of hospital stay while the infant is monitored10,12,24.

Given current challenges with diagnosis, and the high mortality and morbidity associated with neonatal sepsis, particularly in low-income countries, there is a need to develop novel approaches for identifying neonates at greatest risk. Numerous biomarkers have been studied to assess their ability to identify neonatal sepsis, but thus far no individual biomarkers have demonstrated adequate sensitivity and specificity3,8,12,23 and many have important limitations that preclude their utility for neonatal sepsis diagnosis25. High-dimensional biology, including the study of the metabolome, is increasingly being used to help provide insights on complex biological processes and disorders26. Newborn screening measures a diverse panel of analytes, including amino acids, acyl-carnitines, hemoglobin variants, and enzymatic and endocrine markers to identify infants with rare, treatable conditions. We hypothesized that newborn screening analytes — either individually or in combination12,25,27 — could be utilized beyond their traditional application to identify neonates at high risk for sepsis. For example, differences in these analytes could reflect underlying susceptibility to neonatal sepsis through suboptimal metabolic responses to the physiologic stress associated with infection. Alternatively, any differences could also represent the sequelae of early subclinical infection as a manifestation of associated catabolic stress. Our primary objective, therefore, was to assess whether there was any association between newborn screening analyte profiles and neonatal sepsis in a large, population-based cohort of infants.


Characteristics of study population

Overall, there were 793,128 infants in the full study cohort following exclusions (Fig. 1), 4,794 of whom received a diagnosis of sepsis during the neonatal period (6.04 per 1,000 screened infants; Table 1). Rates of sepsis were highest among infants born at early preterm gestation (86.6 per 1,000) and decreased with increasing gestational age at birth. Almost 40% of the total number of cases of neonatal sepsis occurred among infants born prior to 37 weeks’ gestation (Table 1). Rates of sepsis were also higher among neonates with a birth weight under 2,500 grams and among those born from a multifetal gestation. The timing of sample collection was later among infants diagnosed with sepsis during the neonatal period (median of 44 hours) compared to the total population (median of 28 hours). After excluding 4,713 infants with missing gestational age information, there were 731,841 infants born at term gestation (92.8%), 44,754 late preterm infants (5.7%), and 11,820 early preterm infants (1.5%). Corresponding rates of neonatal sepsis per 1,000 screened infants within each of these gestational age groups were: 4.03, 18.0, and 86.6, respectively. The distribution of baseline characteristics in the term cohort subset was similar to the full group of term births (Table 2).

Figure 1
figure 1

Study flow diagram.

Table 1 Characteristics of the study population and frequency of neonatal sepsis.
Table 2 Characteristics of the study population by gestational age group.

Model performance by gestational age

Term subset (≥37 weeks)

The results from the predictive modelling are summarized in Table 3. In the term birth subset of 32,406 infants, the addition of the relative fetal-to-adult Hb level in Model 2 to the baseline model containing only clinical characteristics (Model 1) did not improve the fit of the model, as measured by the c-statistic. However, the fit improved substantially (adjusted c-statistic increased from 0.577 in Model 1 to 0.704 in Model 3) when select non-mass spectrometry derived newborn screening analytes (17-OHP and TSH) were added, and improved further still (adjusted c-statistic of 0.848) when remaining analytes were added in the final model (Model 4). The addition of screening analytes in Models 3 and 4 also significantly improved the predicted probabilities between events and non-events compared with the preceding model (IDI values shown in Table 3). Higher levels of 17-OHP were associated with an increase in the likelihood of neonatal sepsis while higher TSH level was associated with a decreased likelihood even after other screening analytes were added in the final model.

Table 3 Model performance comparing baseline clinical model (Model 1) and clinical model plus newborn screening analytes for prediction of neonatal sepsis.

Late preterm (34–36 weeks)

We observed similar patterns in model performance among infants born at late preterm gestational ages. Compared to the baseline model containing only clinical factors, the addition of the relative fetal-to-adult Hb level (Model 2) did not improve the model fit; however, the optimism-adjusted c-statistic increased when 17-OHP and TSH were added (from 0.685 in Model 2 to 0.725 in Model 3) and again when the additional screening analytes/analyte ratios were added to the final model (adjusted c-statistic increased to 0.782). The IDI also increased significantly in Models 2, 3, and 4 (comparing with Models 1, 2, and 3, respectively); however, to a lesser magnitude than the incremental improvements observed among the term infants (Table 3). Again, in this subgroup, 17-OHP was associated with an increase in the risk of neonatal sepsis; however, there was no association between TSH level and sepsis in the fully-adjusted final model.

Early preterm (<34 weeks)

Compared with the other gestational age subgroups, the models did not perform well in early preterm infants. The addition of the relative fetal-to-adult Hb level at birth did not improve the fit of the model beyond the predictive ability of the nested model containing clinical variables only, nor did the addition of other non-mass spectrometry derived newborn screening analytes in Model 3 (TSH or 17-OHP). The c-statistic adjusted for optimism in the final model (Model 4) was 0.667, only minimally different from the baseline model (optimism-adjusted c-statistic 0.65) and of a magnitude indicating poor predictive ability. When we reanalyzed these models including early preterm infants who had received a transfusion, the adjusted c-statistic showed little change across the models (0.74 in Model 1 and 0.75 in Model 4).

Heat map analysis

Supplementary Figs S1 through S6 present correlational heat maps for infants with and without a diagnosis of neonatal sepsis. Among term infants, with or without sepsis, the strongest positive correlation was between carnitine and acetylcarnitine (ρ = 0.8, Supplementary Figs S1 and S2), while the strongest negative correlation was between methylmalonylcarnitine and relative fetal-to-adult Hb level (ρ = −0.3, Supplementary Figs S1 and S2). The strongest positive and negative correlations among non-septic infants born at late preterm gestation were also among acyl-carnitines and of similar magnitude (Supplementary Fig. S3); however, among septic infants born at late preterm gestation, carnitine and acetylcarnitine (ρ = 0.9) and leucine and TSH (ρ = −0.3) were the strongest positive and negative correlations (Supplementary Fig. S4). Among very preterm infants, the majority of correlations between pairs of analytes were negative, irrespective of sepsis (Supplementary Figs S5 and S6).


Reducing the burden and harms associated with neonatal sepsis is an international priority warranting further exploration of the potential for early detection of this condition28. In this large, population-based study we assessed whether analyte values from newborn screening samples taken close to the time of birth demonstrated the potential to identify infants at higher likelihood of developing sepsis during the neonatal period. Our results indicate that the newborn screening profiles of infants with neonatal sepsis are different from those of non-septic infants. In particular, a combination of baseline clinical variables and newborn screening analytes performed well in identifying neonatal sepsis among late preterm and term infants. In these two gestational age groups, clinical variables alone or in combination with hemoglobin values were not strongly predictive of developing neonatal sepsis. However, the model fit improved considerably after adding markers of thyroid and adrenal dysfunction, along with acyl-carnitines, enzyme makers, and amino acids, with adjusted c-statistic values indicating a strong model for identifying cases of neonatal sepsis. However, among infants born at early preterm gestational ages (i.e., <34 weeks), clinical variables alone were markedly less predictive of neonatal sepsis and the addition of newborn screening analytes minimally altered the predictive ability of the model. These results provide insights into the pathophysiology of neonatal sepsis and may provide guidance for the identification of early biomarkers.

Different analyte patterns associated with neonatal sepsis in infants could reflect differences in underlying physiology that may predispose infants towards the condition. For example,various studies have suggested that elevated fetal hemoglobin levels may predispose infants to adverse health outcomes, including neonatal sepsis29,30,31. Tighter oxygen binding capacity and thus inferior oxygen delivery to tissues may play a role in susceptibility to sepsis. However, in our models, the addition of fetal hemoglobin to baseline clinical characteristics had no impact on the model’s ability to identify infants with neonatal sepsis in any of the gestational age groups. We also hypothesized a priori that measures of thyroid and adrenal function (TSH and 17-OHP, respectively) could be associated with neonatal sepsis. A low level of cortisol is a marker of inadequate adrenal response to illness32,33, and links have previously been identified between levels of TSH and neonatal sepsis34. We found the addition of markers of thyroid and adrenal function did improve the model fit, but not as substantially as the addition of the acyl-carnitines and amino acids.

Alternatively, different analyte patterns associated with neonatal sepsis in infants could reflect the stress response to sepsis. Our analyses demonstrate that in late preterm and term infants, acyl-carnitines, enzyme markers, and amino acids are the newborn screening analytes most strongly associated with neonatal sepsis. Associations between acyl-carnitines and sepsis have been previously documented, with heightened analyte levels occurring among septic infants35,36 and positive correlations being reported between certain acyl-carnitine levels and 28-day mortality37. High acyl-carnitine levels have also been reported in other cases of injury35 and catabolic stress, which is likely the mechanism of their association with neonatal sepsis38,39,40. In cases of stress, fatty acid oxidation is disrupted, resulting in higher levels of circulating acyl-carnitines36. Abnormal carnitine metabolism can then contribute to the development of septic shock and multi-system organ damage36. Our hypothesis that catabolic stress mediated the associations we observed also explains the model performing less well among early preterm infants, many of whom experience catabolic stress even in the absence of sepsis41. However, the changes in acyl-carnitines and amino acids we observed may also reflect differences in metabolic processes that could predispose infants towards neonatal sepsis. In several of the conditions screened for with newborn screening, such as medium-chain acyl-CoA dehydrogenase deficiency, an infectious stressor can result in metabolic decompensation42. The subtle alterations we observed may reflect milder changes in the same pathways that are altered in these conditions.

A strength of our study was the size of the cohort — after exclusions, we were able to analyze data from nearly 800,000 newborns, 4,800 of whom received a diagnosis of sepsis during the neonatal period. This large, population-based cohort provided substantial statistical power to perform the complex analyses needed to develop our models. Nevertheless, our study also had several important limitations. The exact timing of onset of clinical signs of neonatal sepsis was not documented in our data, and could have occurred prior to blood spot collection, which typically occurs between 24 and 72 hours after birth. In the hospitalization database, we were also unable to reliably distinguish between early- and late-onset sepsis. Since early- and late-onset disease may differ in their clinical presentation and etiology3, it is possible that newborn screening analytes could also display distinct patterns according to the timing of onset. In addition, we did not capture diagnoses of late-onset sepsis after 28 days of age. Finally, the use of diagnostic codes in administrative databases to identify cases of neonatal sepsis is expected to generate some measurement error due to low sensitivity of the codes28. Although it is likely that many of the infants who received an ICD-10-CA code for sepsis (P36) in the hospitalization database had a confirmed diagnosis of sepsis based on blood culture, we could not distinguish these confirmed diagnoses from infants classified on the basis of possible sepsis. False-negative classification of neonatal sepsis owing to low sensitivity would be expected to dilute the strength of our findings. The low prevalence of neonatal sepsis could also result in more false-positive results despite the high specificity; however, we expect any false-positive classification of sepsis is likely identifying other serious neonatal illness also causing catabolic stress. To our knowledge, the diagnostic codes for sepsis have not been validated against medical record data, and this would be informative for future studies43.

Our findings have value in identifying future directions for the early detection of neonatal sepsis. Future studies should examine whether measurement of markers of catabolic stress soon after birth can help identify neonatal sepsis in infants who would otherwise not be considered at risk for the condition. Our findings complement those recently published by Sarafidis and colleagues44, whose study found substantial differences in the metabolic profiles of urine samples from septic and non-septic infants. As with our study, they also noted significantly heightened levels of several amino acids among late-onset neonatal sepsis cases, which the authors hypothesized may suggest disturbances in protein metabolism44.


In this study, we sought to assess whether there was any association between newborn screening analyte profiles and neonatal sepsis. Our models suggest potential utility in using a combination of clinical variables and newborn screening analytes to identify neonatal sepsis among infants born at term or late preterm gestational ages, among whom illness may not otherwise be suspected. Indications of catabolic stress in newborn screening analyte values can serve to prompt clinicians to further explore the possibility of sub-clinical or impending illness, potentially resulting in earlier diagnosis and treatment, and ultimately improved health outcomes.


Ethical approval and data access

This study was conducted at the Institute for Clinical Evaluative Sciences (ICES), a non-profit research organization with Prescribed Entity status under provincial privacy legislation. ICES acts as a secure repository for Ontario’s newborn screening database and provincial health administrative databases, with the ability to deterministically link individual patient health information across databases using unique encoded identifiers to protect privacy and confidentiality. This study received approval from the institutional review board at Sunnybrook Health Sciences Centre, the ICES Privacy Office, the Ottawa Health Science Network Research Ethics Board, and the Children’s Hospital of Eastern Ontario Research Ethics Board. All procedures were performed in accordance with these institutions’ relevant guidelines and regulations. Given the Prescribed Entity status of ICES under Ontario’s privacy legislation, express consent was not required for the use of the anonymized health administrative data used in this study. The dataset from this study is held securely in coded form at ICES. While data sharing agreements prohibit ICES from making the dataset publicly available, access may be granted to those who meet pre-specified criteria for confidential access, available at The full dataset creation plan is available from the authors upon request.

Study design, population and data sources

We conducted a population-based retrospective cohort study of all live births in the province of Ontario between January 1, 2010 and December 30th, 2015 with completed newborn screening. Infants were identified using electronic screening records from Newborn Screening Ontario (NSO), a province-wide program that screens all infants for rare conditions using a panel of screening analytes obtained via heel prick blood spot, typically between 24 and 72 hours following birth. Each infant’s electronic screening record contained information on sex, gestational age (completed weeks), mode of feeding (i.e., breast, formula, total parenteral nutrition), timing of blood spot collection, transfusion of blood or blood products, and specific analyte values.

We excluded infants who screened positive for one or more disorders on the newborn screening panel, those with unsatisfactory sample quality, and those who were transfused prior to blood spot collection, as donor blood interferes with hemoglobin and galactosemia-related analytes. We additionally excluded those without continuous eligibility for provincial health care benefits between birth and the end of the neonatal period (i.e., up to 28 days of age). To ascertain the study outcome, we linked the study cohort to hospitalization records from the Canadian Institute for Health Information’s Discharge Abstract Database (DAD) using unique encrypted identifiers at ICES. For each admission of an infant to any Ontario hospital (including the birth hospitalization, any neonatal intensive care unit transfers/admissions or any other hospital admission), the DAD contains information on up to 25 medical diagnoses (coded using the Canadian implementation of the International Classification of Diseases, 10th Revision [ICD-10-CA]45), medical interventions received, length of hospital stay and other data elements. Only hospitalizations with a date of admission within the neonatal period were included.

Variable measurement

All newborn screening samples in Ontario are analyzed in one central laboratory using standardized biochemical assays and reported in specific measurement units for each analyte. The 49 individual analytes used in our study (Table 4) included acyl-carnitines (n = 31), amino acids (n = 12), relative fetal-to-adult hemoglobin level (n = 1), endocrine markers (n = 2), and enzymes (n = 3). To minimize the impact of extreme analyte values in this screen-negative study population, we replaced the outliers (i.e., those below the 0.001st percentile or above the 99.999th percentile) with the analyte value at those percentile levels. We computed analyte ratio combinations (n = 1,176) to capture the relative balance/imbalance between individual analytes after replacing any analyte value of zero with the next lowest positive value in the study population in order to have a mathematically valid denominator. We standardized all analytes to the standard normal distribution to render a unit change in each analyte comparable regardless of different measurement scales. This standardization was conducted by calendar week of birth of the screened infant to minimize variation due to any external factors, including changes in assays, seasonal effects, or other non-biological factors over the study period.

Table 4 Newborn screening analytes used in model development.

We defined neonatal sepsis cases as those infants with a diagnostic code for newborn sepsis (ICD-10-CA code: P36) recorded during any hospitalization initiated within the neonatal period (i.e., prior to 28 days of age), but we were unable to determine the exact time of onset or distinguish between early- or late-onset cases. A Canadian validation study of diagnostic codes for neonatal sepsis using administrative health data found sensitivity and specificity values of 38.4% and 99.7%, respectively, for the P36 code when validated against a clinical perinatal database28.

Statistical analyses

We described baseline characteristics of the study population using frequency distributions and descriptive statistics, and calculated crude rates of neonatal sepsis per 1,000 screened infants. As gestational age is strongly correlated with both incidence of neonatal sepsis and with newborn screening analyte values46,47,48,49, we made an a priori decision to stratify all remaining analyses into subgroups defined by gestational age: <34 weeks (early preterm), 34–36 weeks (late preterm), and ≥37 weeks (term). To improve the statistical efficiency of predictive modelling within the large group of term infants, we subset the data by randomly selecting 10 non-septic infants for every infant diagnosed with sepsis.

To avoid over-fitting and conserve degrees of freedom in our gestational age-specific models, a priori, we determined the maximum number of model parameters that could be accommodated in the analysis for each gestational age group, based on a requirement of 10 sepsis cases per parameter. We then used several steps to reduce the number of analytes and analyte ratios prior to modeling. First, we computed crude Spearman’s rank correlation coefficients between sepsis and each individual standardized analyte value and standardized analyte ratio. We then computed partial Spearman’s correlations between sepsis and all individual standardized analytes, as well as the top 100 ranked standardized analyte ratios based on crude Spearman’s correlation (top 50 ranked standardized analyte ratios for the late preterm birth group owing to the low number of sepsis cases in that gestational age stratum), adjusted for infant sex, gestational age, birth weight, plurality, and mode of feeding. We then prioritized variables for inclusion in multivariable modelling in order of magnitude of partial Spearman’s correlation, and the constraint requiring at least 10 sepsis cases for every parameter included in the model. Variables modelled categorically, or using restricted cubic splines required additional degrees of freedom and thus represented more than one regression parameter.

The models were developed using multivariable logistic regression with sepsis as the binary dependent variable. We chose models based on both pathophysiological parameters as well as mechanism of measurement. We began with a baseline model (Model 1) containing only the following clinical factors: infant sex (male vs. female), continuous gestational age (weeks), continuous birth weight (grams), plurality (singleton vs. twin or higher-order multiple), and mode of feeding (total parenteral nutrition vs. other). To this baseline clinical model, we then added the relative fetal-to-adult hemoglobin (Hb) level (Model 2), which represents a marker of oxygen delivery to tissues and is measured using high performance liquid chromatography. Next, markers of endocrine function and analytes measurable using non-mass spectrometry methods were added in Model 3 (thyroid stimulating hormone [TSH] and 17-hydroxyprogesterone [17-OHP]). In the final model (Model 4), we added the remaining analytes/analyte ratios (acyl-carnitines, enzyme markers, and amino acids) according to rank order based on Spearmans’s correlation until the maximum number of parameters was reached. We used restricted cubic spline terms to model the top five ranked analytes/analyte ratios to capture non-linear associations. The decision of how many analytes to model using splines was made post-hoc by inspection of the ranked partial Spearman’s correlations, and using splines to model a small number of the top analytes if there was a sharp drop-off in correlation after the first few ranked analytes. In a sensitivity analysis of the early preterm subgroup, we re-ran the models while including infants with a documented transfusion of blood or blood products, and adjusted for transfusion in the first nested model containing only clinical variables (Model 1) as well as all subsequent models.

We assessed model discrimination using the area under the receiver operating characteristic curve (AUC) and used the Integrated Discrimination Improvement (IDI)50 to quantify the impact of additional groups of analytes on the average sensitivity of the preceding nested model. We also computed the Net Reclassification Improvement (NRI) to quantify the net increase/decrease in predicted values for the outcome compared to the preceding nested model50. Since we were unable to partition our dataset into random subsamples for validation purposes owing to the low prevalence of sepsis, we internally validated our model performance using a bootstrap approach as described by Harrell et al.51. We drew repeated samples with replacement to produce 200 bootstrap samples, each the same size as the original dataset. The average difference in c-statistic values between the original model and each bootstrap sample was used to calculate an optimism-adjusted c-statistic51 and we additionally report the Akaike Information Criterion (AIC) for each model, which is a metric that rewards model goodness of fit, but includes a penalty against a large number of parameters to reduce overfitting. Lower AIC values are indicative of improved model fit, relative to preceding nested models. We generated correlation heat maps to visualize the relationships between analytes and differences in correlation patterns between infants with and without neonatal sepsis. Analyses were conducted with SAS version 9.4 (SAS Institute, Inc., Cary, North Carolina) and with R/R Studio using the RMS and HMISC packages.