Preterm infants are born with an immature hypothalamic−pituitary−thyroid axis, placing them at risk for hypothyroidism from a delayed rise in thyroid-stimulating hormone (TSH).1,2 Furthermore, prior studies have shown that very low birth weight infants have a higher prevalence of congenital hypothyroidism than term infants and a 14-fold higher rate of transient hypothyroidism than infants weighing >1500 g at birth.2,3,4,5,6 These risks for a hypothyroid state are particularly notable in preterm infants given the crucial role thyroid hormones play in normal human brain development.7

Multiple challenges exist regarding the evaluation of thyroid function in preterm infants. Currently in the United States, the initial evaluation of thyroid function in all infants is a component of the state-mandated metabolic newborn screen, ideally collected in the first 1–3 days of life. The state newborn screening programs have wide variation in the TSH cutoff values used as the initial screen for congenital hypothyroidism.8,9 The 2014 European Society for Pediatric Endocrinology Consensus Guidelines on Screening, Diagnosis, and Management of Congenital Hypothyroidism indicate that follow-up screening may be warranted for preterm infants, infants born <1500 g birth weight, or infants admitted to the neonatal intensive care unit (NICU).10 However, there are limited data establishing reference interval values for preterm infants at various gestational ages,5,11,12,13,14 and the timeline for when to rescreen is debated.2,7,15,16 To further complicate thyroid studies in preterm infants, their thyroid results can be altered by congenital hypothyroidism, illness, exposure to frequently used medications (dopamine, glucocorticoids, metoclopramide, caffeine, heparin, etc.), and an immature ability to maintain appropriate iodine levels.2 In preterm infants, the particular free thyroxine (FT4) serum levels needed for optimal brain maturation remain unclear, and there is no agreed-upon FT4 level at which hypothyroid treatment is indicated to avoid neurodevelopmental delays.17,18,19


To aid clinicians in the evaluation of preterm infants at risk for hypothyroidism, our objective was to establish postmenstrual age (PMA, gestational age at birth + postnatal chronological age)-specific reference intervals for TSH and FT4 in preterm infants. The PMA, being reflective of both the birth gestational age and the postnatal chronological age, is the recommended perinatal age classification terminology by the American Academy of Pediatrics Committee on Fetus and Newborn.20 Given the immature hypothalamic−pituitary−thyroid axis seen in preterm infants, further data elaborating reference intervals for TSH and FT4 across different postmenstrual ages of preterm infants would be of particular utility.
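The PMA definition above (gestational age at birth plus postnatal chronological age) reduces to simple day arithmetic. The following is a minimal sketch of that calculation; the function name and its arguments are hypothetical and are not taken from the study's data pipeline.

```python
from datetime import date
from typing import Tuple


def postmenstrual_age(ga_weeks: int, ga_days: int,
                      birth: date, sample: date) -> Tuple[int, int]:
    """Return PMA as (completed weeks, additional days).

    PMA = gestational age at birth + postnatal chronological age.
    All names here are illustrative assumptions.
    """
    if not 0 <= ga_days <= 6:
        raise ValueError("ga_days must be between 0 and 6")
    postnatal_days = (sample - birth).days
    if postnatal_days < 0:
        raise ValueError("sample date precedes birth date")
    total_days = ga_weeks * 7 + ga_days + postnatal_days
    return divmod(total_days, 7)


# An infant born at 26 3/7 weeks and sampled on postnatal day 48
weeks, days = postmenstrual_age(26, 3, date(2018, 1, 1), date(2018, 2, 18))
print(f"PMA: {weeks} {days}/7 weeks")
```

For an infant born at 26 3/7 weeks and sampled at the cohort's median collection age of 48 days, this gives a PMA of 33 2/7 weeks.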



For this study, all data were gathered retrospectively from our local Epic Systems Corporation (Epic) (Verona, WI) electronic medical record. The initial selection of infants included those born at <36 completed weeks of gestation who were admitted to the Nationwide Children’s Hospital Level IV NICU in Columbus, Ohio from October 2011 to August 2018 and who were included in the Children’s Hospitals Neonatal Database, which records clinical data on all infants cared for in our NICU. TSH and FT4 laboratory results as well as medication exposures during the NICU admission were then extracted from Epic. The TSH and FT4 laboratory results used in this study are those that were previously collected during routine clinical care while the infants were admitted to the NICU. The Nationwide Children’s Hospital Institutional Review Board approved this project as no more than minimal risk and granted a waiver of informed consent. The Ohio Department of Health collects a single newborn screen on all infants at birth and recommends that clinicians obtain a thyroid panel on all infants hospitalized in neonatal intensive care units prior to discharge, regardless of the initial thyroid screening result, with the specific timing of panel collection left to clinician discretion.21 Following initial evaluation on the state newborn screen, repeat TSH and FT4 are typically collected in preterm infants at the Nationwide Children’s Hospital NICU at 4 weeks of age per local standard of care, although exact timing of laboratory sample collection may vary between specific providers. Additional patient-level data included diagnoses and demographics.
All infants who had a listed diagnosis (identified by ICD-9 and ICD-10 codes) indicating a persistent alteration of the hypothalamic−pituitary−thyroid axis (congenital hypothyroidism, hypothyroid, thyroid agenesis, congenital anomaly of the thyroid gland, neonatal Graves’ disease, neonatal thyrotoxicosis, septo-optic dysplasia, pituitary hypoplasia) were excluded. Additionally, with the goal to establish specific reference intervals for TSH and FT4 in preterm infants without thyroid disease, infants who received levothyroxine treatment anytime during their hospital admission were also excluded from our analyses. These infants were excluded for having a disease state that would lead to TSH or FT4 values that are non-representative of an infant without ongoing thyroid axis dysfunction or for having medication exposure that deliberately alters the laboratory values in question. For these reasons, these infants were not included out of concern that any TSH or FT4 laboratory values collected from them would introduce bias during the generation of reference intervals.


Both TSH and FT4 laboratory samples were run in the clinical chemistry laboratory at Nationwide Children’s Hospital on the Abbott ARCHITECT i2000SR (Abbott Core Laboratory Systems, Lake Forest, Illinois). The TSH (B7K620) assay had a linear range of 0.015–100.000 µIU/mL, an average bias of −7.0%, and a maximum coefficient of variation of 5.3%. The FT4 (B7K6F0) assay had a linear range of 0.40–5.0 ng/dL, an average bias of 4.2%, and a maximum coefficient of variation of 6.6%. All TSH results are displayed in units of µIU/mL and all FT4 results are displayed in units of ng/dL as these were the units utilized in the local electronic medical record.

Data analysis

The central 95% (2.5–97.5%) interval was determined using bootstrapped maximum likelihood methods for TSH and FT4. Bootstrapping was performed using the resample function in the mosaic package.22 Due to the limited number of samples, reference intervals were determined without respect to age or sex. The maximum likelihood methods employed by this study were included in the mixtools package within the R statistical environment.23,24 The bestNormalize package was used to perform Yeo-Johnson transformations for the TSH results.25 A normalizing transformation was not required for FT4. Tukey’s fence method was used to exclude outliers from the dataset.26 In silico experiments have shown that the maximum likelihood methods employed by mixtools recapitulate reference interval cut-points better than traditional methods.27
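To make the general workflow concrete (outlier exclusion followed by a bootstrapped central 95% interval), the sketch below uses only the Python standard library. It is not the study's method: it substitutes a simple nonparametric percentile bootstrap for the mixtools maximum likelihood approach and omits the Yeo-Johnson transformation. The fence multiplier `k=1.5`, the number of resamples, and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def tukey_fence(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is an assumed default."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p in 0..100) of pre-sorted data."""
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def bootstrap_reference_interval(values, n_boot=1000, seed=0):
    """Average the 2.5th and 97.5th percentiles over bootstrap resamples."""
    rng = random.Random(seed)
    lows, highs = [], []
    for _ in range(n_boot):
        # Resample with replacement, same size as the original sample
        resample = sorted(rng.choices(values, k=len(values)))
        lows.append(percentile(resample, 2.5))
        highs.append(percentile(resample, 97.5))
    return statistics.mean(lows), statistics.mean(highs)


# Synthetic, purely illustrative values (not study data)
demo = [float(v) for v in range(1, 101)] + [500.0]
cleaned = tukey_fence(demo)
print(bootstrap_reference_interval(cleaned, n_boot=200))
```

A percentile bootstrap makes no distributional assumption, which keeps the sketch short; the maximum likelihood approach described above instead fits a distribution to the (transformed) data.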

Initially, all TSH and FT4 laboratory values were displayed at 1-week PMA intervals. Best-fit grouping then identified 3-week intervals as the optimal subgroups (data not shown), so the PMA was divided into 3-week subgroupings when displaying TSH and FT4 values relative to PMA.
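The 3-week grouping can be expressed as a small binning function. This sketch assumes the subgroup boundaries reported in the Results (25–27 6/7 weeks through 40 weeks and greater) and takes PMA in completed weeks; the function name and label strings are hypothetical.

```python
def pma_subgroup(pma_completed_weeks: int) -> str:
    """Map PMA in completed weeks to a 3-week subgroup label.

    Boundaries follow the subgroups reported in the Results;
    everything at or beyond 40 weeks is pooled into one group.
    """
    if pma_completed_weeks < 25:
        raise ValueError("PMA below the youngest reported subgroup (25 weeks)")
    if pma_completed_weeks >= 40:
        return "40 weeks and greater"
    # Integer division snaps the week to the start of its 3-week bin
    start = 25 + 3 * ((pma_completed_weeks - 25) // 3)
    return f"{start}-{start + 2} 6/7 weeks"


for week in (25, 27, 28, 33, 39, 41):
    print(week, "->", pma_subgroup(week))
```

Because the label depends only on completed weeks, a sample collected at 27 6/7 weeks still falls in the 25–27 6/7 subgroup.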


From October 2011 to August 2018, 2750 infants born at <36 weeks gestation were admitted to the Nationwide Children’s Hospital Level IV NICU. After excluding infants with the previously specified diagnostic criteria or with exposure to levothyroxine, 2592 preterm infants remained. Of these 2592 preterm infants, 1087 had a TSH or FT4 laboratory result at some point during their NICU admission. Inclusive of infants with multiple evaluations of thyroid function, a total of 1584 TSH results and 1576 FT4 results were identified. The data were limited to those collected from 7 to 300 days postnatal in order to account for the TSH surge that abates in the first week of life and because the sample size of thyroid measurements at the >300 day timepoint was small in the subset of infants who required a longer NICU stay.9,28,29 In doing so, the data sample was reduced to 1399 TSH and 1390 FT4 individual results. Excluding outlying data points reduced the sample to 1353 TSH and 1347 FT4 individual results (Fig. 1). The median time for collection of both TSH and FT4 laboratory results was 48 days of life with a range of 8–135 days. The demographic profile of included infants is listed in Table 1.

Fig. 1: Flowchart of Study Participants.

Identification and inclusion of infants and laboratory results for analysis.

Table 1 Demographic profile of infants from whom TSH and FT4 data were evaluated, ordered by birth gestational age.

TSH values were measured in preterm infants from 25 to 43 weeks PMA. Figure 2 plots all TSH laboratory values against the PMA at which time the laboratory sample was collected. There was a higher density of TSH values ranging from 0 to 7.5 µIU/mL across all PMAs. Scattered TSH values are seen across all PMAs but without any apparent increasing or decreasing trend. In Fig. 2, while laboratory values are distributed from 25 to 43 weeks PMA, there was an increase in the gross number of laboratory samples collected from approximately 31–40 weeks PMA.

Fig. 2: TSH Values According to Postmenstrual Age.

All TSH laboratory values (in µIU/mL) displayed with respect to postmenstrual age.

To further analyze the change in TSH, Fig. 3 shows the laboratory results organized into subgroups by PMA: 25–27 6/7 weeks, 28–30 6/7 weeks, 31–33 6/7 weeks, 34–36 6/7 weeks, 37–39 6/7 weeks, and 40 weeks and greater. When Kruskal−Wallis rank sum testing was performed, there was a statistically significant difference between the PMA groups (p value = 0.00019); however, when a slope analysis was done for the trend across the groups, the value was close to zero. From the <28 weeks’ PMA subgroup to the ≥40 weeks’ PMA subgroup, median TSH values increased from 2.992 to 3.677 µIU/mL. Across all PMA subgroups, the range of TSH median values was 2.806–3.677 µIU/mL (Table 2).
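For readers unfamiliar with the Kruskal−Wallis rank sum test, the sketch below computes its H statistic (with the standard tie correction) using only the Python standard library. The source does not say which implementation was used, so this is an illustration of the test itself rather than the study's code; the function name and the example values are made up.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    groups: list of lists of numeric values, one list per subgroup.
    The p value comes from comparing H with a chi-square distribution
    on len(groups) - 1 degrees of freedom (not computed here).
    """
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    if n < 2 or tie_term == n ** 3 - n:
        raise ValueError("H is undefined when all values are identical")
    h = 12.0 / (n * (n + 1)) * sum(
        rs ** 2 / len(g) for rs, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))


# Three fully separated toy groups (not study data)
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Because the test is rank based, a very large sample can yield a small p value even when the medians differ only slightly, which is consistent with the near-zero slope reported for TSH.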

Fig. 3: TSH median laboratory values with reference intervals organized by postmenstrual age groupings.

The solid circles indicate boxplot outliers that were between 1.5 and 2× the interquartile range. Note that the identified outliers are those remaining after the Tukey’s fence method removed more extreme values.

Table 2 Reference intervals and median values of TSH and FT4 distributed by PMA subgroupings.

Figure 4 shows all FT4 values that were measured for the same infant population, plotted against PMA. Infants who were closer to term-corrected age at the time of evaluation had higher FT4 levels overall. As with the TSH data, the median FT4 values were calculated for each PMA subgroup of preterm infants from <28 weeks to post term and are shown in Fig. 5. When Kruskal−Wallis rank sum testing was performed, there was a statistically significant difference between the PMA subgroups (p value = 2.2e-16). As the PMA subgroup increases, there is an overall increase in FT4 values. Median FT4 values increased from 0.70 to 1.15 ng/dL from the <28 weeks to the ≥40 weeks PMA subgroup (Table 2).

Fig. 4: FT4 Values According to Postmenstrual Age.

All FT4 laboratory values (in ng/dL) displayed with respect to postmenstrual age.

Fig. 5: Median FT4 According to Postmenstrual Age Subgroups.

FT4 median laboratory values with reference intervals organized by postmenstrual age groupings.

Table 2 also presents the reference intervals of TSH and FT4 values for each PMA subgroup. A total of 1353 TSH and 1347 FT4 samples are included. The largest PMA subgroup included preterm infants ranging from 34–36 6/7 weeks. Overall, the TSH reference interval shifts from 0.340–9.681 µIU/mL at 25–27 6/7 weeks’ PMA to 1.090–7.627 µIU/mL at ≥40 weeks’ PMA. The FT4 reference interval increases gradually with increasing PMA, from 0.42–0.91 ng/dL at 25–27 6/7 weeks’ PMA to 0.87–1.32 ng/dL at ≥40 weeks’ PMA.
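As an illustration of how PMA-specific intervals change the reading of a single result, the sketch below classifies a value against the interval for its subgroup. Only the two subgroups whose intervals are quoted in the text above are included; the intermediate subgroups appear in Table 2 and are deliberately left out rather than guessed. The dictionary layout, label strings, and function name are hypothetical, and the sketch is not a clinical decision tool.

```python
# Reference intervals quoted in the text (TSH in µIU/mL, FT4 in ng/dL).
# Intermediate PMA subgroups are in Table 2 and are intentionally omitted.
REFERENCE_INTERVALS = {
    "25-27 6/7 weeks": {"TSH": (0.340, 9.681), "FT4": (0.42, 0.91)},
    "40 weeks and greater": {"TSH": (1.090, 7.627), "FT4": (0.87, 1.32)},
}


def classify(analyte: str, value: float, subgroup: str) -> str:
    """Return 'below', 'within', or 'above' for a result in a PMA subgroup."""
    try:
        low, high = REFERENCE_INTERVALS[subgroup][analyte]
    except KeyError:
        raise KeyError(f"no interval available for {analyte} in {subgroup!r}")
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


# The same TSH result reads differently at different postmenstrual ages
print(classify("TSH", 8.0, "25-27 6/7 weeks"))
print(classify("TSH", 8.0, "40 weeks and greater"))
```

A TSH of 8.0 µIU/mL falls within the interval at 25–27 6/7 weeks’ PMA but above it at ≥40 weeks’ PMA, which is the practical consequence of reporting intervals by PMA.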


To the authors’ knowledge, this study’s dataset provides one of the largest samples to date of TSH and FT4 laboratory values in hospitalized extremely preterm to term infants for a window of evaluation outside of the TSH surge anticipated at delivery. The TSH surge occurs at approximately 30 min of age, decreases within the first week of life, and then remains steady afterwards.9,11,13,28,29,30 The anticipation in this study is that utilizing TSH and FT4 samples collected after the first week of life and at a median age of 48 days of life will yield values outside of the initial TSH surge to generate reference intervals that will help guide clinicians in their interpretation of follow-up thyroid studies after the initial newborn screening. The principal findings include largely stable median TSH values across PMA, while FT4 values show an overall small and gradual increase with increasing PMA (Figs. 2–5). The small variability seen in median TSH values across PMA groupings in Fig. 3 is likely a reflection of the scattered TSH values seen in Fig. 2. With regard to the TSH reference intervals, there are small changes in the intervals as PMA increases, and the variable range likely reflects the statistical power afforded by the sample size of each PMA subgroup (Table 2). Our TSH upper limit reference interval values are similar to previously published data. For 4-week-old 22–27-week and 28–31-week infants, Kaluarachchi et al.13 identified the 95th percentile TSH value to be 11.0 and 8.2 µIU/mL, respectively. Other studies had similar although slightly higher TSH reference intervals and evaluated preterm infant populations with older PMAs or birth gestational ages.11,30

When evaluating FT4 median values as well as upper and lower limit reference intervals from 25–27 6/7 weeks to ≥40 weeks PMA, there is an overall upward pattern that is likely reflective of an increasingly mature hypothalamic−pituitary−thyroid axis with increasing PMA (Figs. 4 and 5).1,2 Very limited published data exist regarding FT4 reference intervals in preterm infants. After unit conversion, the study by Wang et al.11 reported an FT4 reference interval for 29–38 weeks corrected gestational age of 0.87–1.94 ng/dL. The values depicted in Table 2 are similar, although slightly lower overall and with a narrower reference interval. Additionally, we identified reference interval values for a more preterm population down to 25–27 6/7 weeks PMA.
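The unit conversion mentioned above can be sketched as follows. The text does not state the units in which the comparison study originally reported FT4; this sketch assumes pmol/L, the usual SI unit, and uses the molar mass of thyroxine (776.87 g/mol). The function names are hypothetical.

```python
T4_MOLAR_MASS_G_PER_MOL = 776.87  # molar mass of thyroxine (T4)


def ft4_pmol_per_l_to_ng_per_dl(pmol_per_l: float) -> float:
    """Convert FT4 from pmol/L to ng/dL.

    pmol/L * g/mol gives pg/L; dividing by 1000 gives ng/L,
    and dividing by 10 gives ng/dL.
    """
    return pmol_per_l * T4_MOLAR_MASS_G_PER_MOL / 10000.0


def ft4_ng_per_dl_to_pmol_per_l(ng_per_dl: float) -> float:
    """Convert FT4 from ng/dL to pmol/L (inverse of the function above)."""
    return ng_per_dl * 10000.0 / T4_MOLAR_MASS_G_PER_MOL


# 1 ng/dL corresponds to roughly 12.87 pmol/L
print(round(ft4_ng_per_dl_to_pmol_per_l(1.0), 2))
```

Keeping both directions makes it easy to compare intervals reported in either unit without rounding twice.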

Providing clinicians with TSH and FT4 reference intervals that vary by increasing PMA is more representative of the underlying endocrine physiology and a maturing hypothalamic−pituitary−thyroid axis1,2 and perhaps more clinically useful than uniform values representing all preterm infants, as can be seen in electronic medical record reference values or in state newborn screening programs, which typically establish their normal ranges from screening results collected within the first several postnatal days. The reference intervals identified in this study for infants <31 weeks are higher for TSH and lower for FT4 than those provided for preterm infants (TSH: 0.7–6.6 µIU/mL, FT4: 0.7–1.8 ng/dL) within the electronic medical record at the primary author’s academic institution. Identifying a lower FT4 and higher TSH as the norm for these <31-week infants suggests that these preterm infants may be experiencing a physiologic hypothyroxinemia of prematurity rather than a primary hypothyroid process with an abnormally elevated TSH in the setting of lower than normal FT4. Ideally, a reference interval that is more characteristic of the underlying and changing preterm infant thyroid physiology will allow clinicians to reduce the repeat laboratory sampling performed to follow up TSH and FT4 values that would previously have been flagged as outside of the current reference intervals.

Of note, for the subgroups ≥37 weeks PMA, the upper limit of the TSH reference interval is identified as a value higher than what is often accepted as within normal limits for term infants. In the primary author’s academic institution, TSH values >6.6 µIU/mL are identified as abnormal and warrant consideration of follow-up depending on clinical circumstances. While there is ongoing research evaluating TSH reference intervals in term infants that include upper limits similar to and higher than those identified in this study, this remains a field of active research, and TSH values falling within the identified reference intervals of this study may not exclude thyroid disease in the appropriate clinical context.30,31


The primary limitation for this study is the patient sample utilized to establish TSH and FT4 reference intervals. Infants who were diagnosed with a process suggesting ongoing dysfunction of the hypothalamic−pituitary−thyroid axis rather than a transient process or who were receiving levothyroxine were excluded. Other factors can affect thyroid hormone levels in preterm infants including but not limited to medication exposure (dopamine, glucocorticoids, heparin, etc.) and postnatal illness.2 In establishing reference intervals, the ideal study population is one that is as “normal” as possible. There is not an established profile for a “normal” preterm infant who is concurrently requiring treatment within an intensive care unit. In our study, we excluded infants who were diagnosed with a condition suggesting a persistent alteration of their thyroid physiology. However, we did not seek to eliminate infants who may have experienced a temporary alteration of their thyroid physiology. It is not typical practice to evaluate thyroid hormone function during periods of illness, and in gathering a large sample of laboratory values, the influence of any single data point that may be affected by patient-specific factors (e.g., medications, severe acute illness) is limited. One additional limitation was in the <28- and ≥40-weeks PMA subgroups. For these two subgroups, we had a suboptimal number of samples for establishing reference intervals based on Clinical and Laboratory Standards Institute guidelines.32 With TSH and FT4 samples collected at a median age of 48 days, the number of infants born at an early enough gestational age to have laboratory values collected at <28 weeks PMA is exceedingly small given that only an estimated 0.73% of all births occur at <28 weeks gestation.33 However, robust statistical tools such as the mixtools package used in this study are able to partially correct for the limited sample number and, without major outlying data points, can still provide a useful portrayal of reference intervals for these subgroups.


Currently, there are limited TSH and FT4 reference interval data in preterm infants to aid clinicians in interpreting these thyroid test results across varied postmenstrual ages. This study identifies reference intervals from extremely preterm to term infants that are reflective of a maturing hypothalamic−pituitary−thyroid axis and suggests that infants <31 weeks PMA may be experiencing a physiologic hypothyroxinemia of prematurity rather than a primary hypothyroid process.