Introduction

New infant growth charts were adopted in the UK in May 2009, based on a large international study by the World Health Organisation, for monitoring growth of children aged 0–4 years1. However, for preterm infants, old UK1990 reference curves2,3 continued to be used as recommended by the Scientific Advisory Committee on Nutrition (SACN) in 2007. The Royal College of Paediatrics and Child Health in the UK (RCPCH) combined these two datasets to produce new Neonatal and Infant Close Monitoring Charts (NICM) for boys and girls4, the current standard for growth monitoring in preterm infants in the UK. The UK1990 reference curves plot cross-sectional data at birth, an approach that has been used elsewhere for reference standards5. Growth charts constructed using cross-sectional reference data of weight at birth reflect expected or optimum in utero growth rather than actual growth after preterm birth. Ideally, growth chart references should be based on longitudinal data from a reference population of ‘healthy’ infants. However, the majority of the preterm population born at <32 weeks of gestation is challenged by multiple additional morbidities and it is unlikely that any of these infants can be classed as “healthy”. Growth of these infants is profoundly affected by the direct or indirect cumulative impact of multiple morbidities, not all exclusively linked to nutritional limitations6. More likely, preterm infants represent a different population compared to fetuses of corresponding gestations7 and it is unrealistic to expect that very preterm newborns will follow the intrauterine growth trajectory using current methods of nutrition delivery. A review of the current literature reveals that no longitudinal reference centile charts exist to monitor the most vulnerable preterm infants born at extremes of gestation, who are at the highest risk of having growth problems8.
This is the specific gap we have aimed to fill: a well-designed longitudinal growth chart, derived from a large data set that represents the actual preterm population and contemporary clinical practice, which is likely to better represent the true growth potential of these infants. Variations in nutrition delivery may be expected to be smoothed out in such a large population data set.

We aimed to achieve this objective by collecting retrospective, longitudinal measurements of weight from preterm infants born at <32 weeks of gestation in Wales, starting from birth, and by modelling gender- and gestation-specific predictive centile charts until discharge from the neonatal unit.

Methods

Data collection

Retrospective anonymised data was collected for all infants born at <32 weeks of gestation in the calendar years 2011–2014 from the ‘Badgernet’ electronic neonatal database (Clevermed Systems, Edinburgh, UK) for Wales. The database maintains clinical records for all preterm infants admitted to the neonatal units in Wales from 2011. All infants born during the study period were included in the data cleaning process. Reasons for necessary exclusions are detailed in the data cleaning section. The postnatal weight of these infants was recorded in the database according to practices in different neonatal units, and the frequency of recording varied from daily to once every 2 weeks. Along with weight, data was also collected on duration of mechanical ventilation (days), respiratory support (days) and parenteral nutrition (days). Data was collected from all infants from birth until discharge from the neonatal unit. Infants who died before discharge contributed data for the duration of their stay on the unit (Table 1 and Supplementary Fig. 2). Stay duration was variable due to the differences in morbidities among infants but tended to be longer for the earlier gestation groups (Table 2).

Table 1 Demographic details of the whole cohort, stratified by gestation bands.
Table 2 Demographic details, stratified by model- and test-groups, of the whole cohort which was analysed for the study.

Ethics statement

All data was collected as part of routine data collection on neonatal units for which individual parental consent is not sought. The Wales Neonatal Network has permissions in place to access anonymised data from neonatal units. Routinely collected anonymised data of clinical care was acquired by the authors from the Wales Neonatal Network for analysis; no identifiable data was available to the authors. Using the NHS Health Research Authority decision tool (http://www.hra-decisiontools.org.uk/ethics/), this type of research was exempt from specific ethical consent.

Data cleaning

Data cleaning was undertaken using various strategies as summarised below, with detailed examples in the supplementary information.

1. Repeated values on successive days. If the weight – accurate to 4 significant figures – did not vary over at least two successive days, then the first value on the first day was taken as correct and the subsequent days with the same value were deleted as, from clinical experience, they were very likely to be re-entries without re-weighing in critically unwell and unstable infants.

2. Weights that were very likely to be 1/10th of what they should be were corrected by multiplying by 10 and removing duplicates.

3. Weights that were very likely to be 10 times what they should be were corrected by dividing by 10 and removing duplicates.

4. Weights that were very likely to be too large or small by comparison with the weights either side of them and not correctable as in 2 and 3 above were removed.

5. Patterns of weights that were repeated over successive groups of days were deleted beyond the first.

6. Any infant with only one distinct weight was removed e.g. either just birth weight or birth weight repeated for one or more subsequent days. These are mostly infants who died before a second weight could be recorded.
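The cleaning rules above can be sketched in code. The following is a minimal Python illustration only (the study's own analysis was done in R, and its cleaning scripts are not given here). The function names, the 30% plausibility threshold and the choice to compare each weight with the previous accepted weight are all assumptions made for this sketch; the study compared values with the weights on either side. Rule 5 (repeated multi-day patterns) is not implemented.

```python
from typing import List, Tuple

Record = Tuple[int, float]  # (day of life, weight in grams)

# Hypothetical threshold, not from the study: a weight is treated as plausible
# if it lies within 30% of the previous accepted weight.
TOLERANCE = 0.30


def drop_successive_repeats(records: List[Record]) -> List[Record]:
    """Rule 1: keep only the first of a run of identical successive weights."""
    kept: List[Record] = []
    for day, weight in records:
        if kept and kept[-1][1] == weight:
            continue  # probably a re-entry without re-weighing
        kept.append((day, weight))
    return kept


def fix_or_drop_outliers(records: List[Record]) -> List[Record]:
    """Rules 2-4: rescale factor-of-10 errors, drop other implausible values.

    Assumes the first record (birth weight) is correct, which is a
    simplification of the neighbour comparison described in the text.
    """
    if not records:
        return []
    cleaned: List[Record] = [records[0]]
    for day, weight in records[1:]:
        reference = cleaned[-1][1]
        for candidate in (weight, weight * 10, weight / 10):
            if abs(candidate - reference) <= TOLERANCE * reference:
                cleaned.append((day, candidate))
                break
        # If no candidate is plausible, the record is left out (rule 4).
    return cleaned


def clean_infant(records: List[Record]) -> List[Record]:
    """Apply rules 1-4 and 6; returns [] if the infant should be excluded."""
    cleaned = drop_successive_repeats(records)
    cleaned = fix_or_drop_outliers(cleaned)
    cleaned = drop_successive_repeats(cleaned)  # duplicates created by rescaling
    distinct_weights = {weight for _, weight in cleaned}
    return cleaned if len(distinct_weights) >= 2 else []  # rule 6
```

In this sketch an infant whose record reduces to a single distinct weight yields an empty list, mirroring the exclusion in rule 6.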

Statistical analysis

For the first part of the analysis – comparison with UK1990 data – the cohort was divided into three pragmatic gestation bands: 23+0 to 25+6 weeks, 26+0 to 28+6 weeks and 29+0 to 31+6 weeks. Demographic data is reported as means and proportions and was compared (model data vs test data) by 2-sided independent samples t-test or Fisher’s test as appropriate. Date-wise weight data from all infants were initially converted to day-wise data until discharge and cross-checked. Infants were first sorted by gender and then by gestation (weeks and days). For calculation of standard deviation scores (SDS) and centiles, weights were first converted from g to kg. Birth weight (day 0) and weights for each week (day 7, 14, 21 etc.) ±3 days were recorded. SDS were calculated using the LMS Growth Excel add-in (Pan H, Cole TJ, LMS growth, a Microsoft Excel add-in to access growth references based on the LMS method Version 2.76. http://www.healthforallchildren.co.uk/; 2011). The LMS method was used to create NICM growth charts from the UK1990 data3,4. Gender-specific mean ± 95% confidence interval of weekly SDS and weights were plotted from birth to discharge.
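For readers unfamiliar with the LMS method, the standard LMS transformation from a measurement to an SDS (and back) can be sketched as follows. This is a minimal Python illustration, not the Excel add-in used in the study; the function names are hypothetical, and the L, M and S values in any call must come from the relevant reference (the UK1990 tables are not reproduced here).

```python
import math


def lms_sds(measurement: float, L: float, M: float, S: float) -> float:
    """Standard deviation score from the LMS parameters at a given age.

    L is the Box-Cox power, M the median and S the coefficient of variation.
    """
    if abs(L) < 1e-12:
        # Limiting case of the formula as L tends to zero.
        return math.log(measurement / M) / S
    return ((measurement / M) ** L - 1.0) / (L * S)


def lms_measurement(sds: float, L: float, M: float, S: float) -> float:
    """Inverse transformation: the measurement lying at a given SDS."""
    if abs(L) < 1e-12:
        return M * math.exp(S * sds)
    return M * (1.0 + L * S * sds) ** (1.0 / L)
```

A weight equal to the reference median M gives an SDS of zero for any L and S, and the two functions are inverses of each other.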

The complete data was then randomly divided: two-thirds of the data was used for modelling (training cohort) and the remaining third was used as test data to validate the model (test cohort).
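A random two-thirds/one-third division of this kind might look like the sketch below. It is a Python illustration under stated assumptions: the split is made at the level of the infant (so that all weights from one infant fall in the same cohort), it is not stratified, and the identifiers, seed and function name are hypothetical. The text does not state which of these choices the study made.

```python
import random
from typing import List, Sequence, Tuple


def split_infants(
    infant_ids: Sequence[str],
    train_fraction: float = 2 / 3,
    seed: int = 0,  # hypothetical seed, fixed only for reproducibility
) -> Tuple[List[str], List[str]]:
    """Randomly assign infants to a training cohort and a test cohort."""
    ids = sorted(set(infant_ids))  # one entry per infant, stable order
    rng = random.Random(seed)
    rng.shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]
```

Splitting by infant rather than by individual weight keeps repeated measurements from one infant out of both cohorts at once, which matters when the test cohort is used to validate a longitudinal model.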

Extensive investigation identified an appropriate mixed effects regression model where infant weight was explained by gender, time since birth and time since conception (calculated as the sum of time since birth and gestation minus two weeks) to create predictive weight-gain centiles from birth until discharge (at a variable period of time for different gestations) from the neonatal units. The use of these two distinct timescales is worth highlighting, as one of the main purposes of our work was to distinguish between pre- and post-birth weight gain. The relationship between weight and time since birth was captured using splines with boundary knots at 0 and 14 weeks and other knots at 1, 2, 3 and 4 weeks to properly accommodate the more varied behaviour in the first 4 weeks of life. Time since birth and time since conception were also included as quadratic polynomials. Random effects for each child allowed for correlation between an individual’s observations. Interestingly, no random intercept terms were needed because fetal weight is known to be (effectively) zero when time since conception is zero, and any random intercept at birth is captured by the random slope on the time since conception scale. To reduce the complexity of our modelling, the two random slopes (one on each timescale) were assumed to be uncorrelated. By using parametric models to describe the relationship between weight, time since birth and time since conception, we were able to borrow strength across different gestation groups and ages.
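The two timescales and the knot placement described above can be made concrete with a short sketch. This is a Python illustration, not the R model code used in the study. The function names are hypothetical, and the piecewise-linear (hinge) basis is a simple stand-in chosen for clarity: the text does not specify the type of spline that was fitted.

```python
from typing import List

INTERIOR_KNOTS_WEEKS = [1.0, 2.0, 3.0, 4.0]  # knots stated in the text
BOUNDARY_KNOTS_WEEKS = (0.0, 14.0)           # boundary knots stated in the text


def time_since_conception_weeks(
    gestation_weeks: int, gestation_days: int, days_since_birth: float
) -> float:
    """Time since birth plus gestation at birth, minus two weeks."""
    gestation = gestation_weeks + gestation_days / 7
    return days_since_birth / 7 + gestation - 2


def hinge_basis(weeks_since_birth: float) -> List[float]:
    """Piecewise-linear spline basis with a change of slope at each knot.

    Stand-in basis only; times are clamped to the boundary knots.
    """
    low, high = BOUNDARY_KNOTS_WEEKS
    t = min(max(weeks_since_birth, low), high)
    return [t] + [max(t - knot, 0.0) for knot in INTERIOR_KNOTS_WEEKS]
```

For example, an infant born at 26+0 weeks and observed 14 days after birth has a time since birth of 2 weeks and a time since conception of 26 weeks under this definition.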

Predicted quantiles were calculated at 0, 2, 4, 6, 8, 10 and 12 weeks since birth and applied for the following 14 days; thus, the model variance was effectively refreshed every two weeks with the additional data available. Growth trajectories were determined from the model at the following fractiles used in growth charts9: 0.004, 0.02, 0.09, 0.25, 0.50, 0.75, 0.91, 0.98 and 0.996. These fractiles can equivalently be interpreted as centiles: for example, 0.02 multiplied by 100% gives the 2nd centile.
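The correspondence between fractiles, centiles and standard deviation scores can be illustrated as follows. This is a minimal Python sketch; the conversion to z-scores assumes a standard normal distribution on the SDS scale, which is an assumption of this illustration rather than a statement about the fitted model, and the function names are hypothetical.

```python
from statistics import NormalDist

FRACTILES = [0.004, 0.02, 0.09, 0.25, 0.50, 0.75, 0.91, 0.98, 0.996]


def centile_label(fractile: float) -> str:
    """Express a fractile as a centile, e.g. 0.02 -> '2%'."""
    return f"{fractile * 100:g}%"


def z_score(fractile: float) -> float:
    """Standard normal quantile for a fractile (normality is assumed here)."""
    return NormalDist().inv_cdf(fractile)


if __name__ == "__main__":
    for fractile in FRACTILES:
        print(f"{centile_label(fractile):>6}  z = {z_score(fractile):+.2f}")
```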

All models and figures were produced using R for Windows (R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/). Each model was tested in three different ways:

1. Plotting model data on new centile curves to check fit.

2. Plotting test data on new centile curves to confirm fit.

3. Plotting test data on UK1990 curves to check fit.

All the test data (all gestations and both genders) was used to compare numerically its fit to both the model we produced and the UK1990 curves, by counting the number of weights falling in each quartile of each reference. Each weight was compared to the quartile levels at the same time point to determine the quartile to which it belonged. The results for each gender and gestation combination were combined and are given in the tables below, where the predicted values are those that should be seen in a completely representative model (25% of weights in each quartile).
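The quartile-counting check can be sketched as below. This is a Python illustration with hypothetical names and made-up numbers; each observation is a weight together with the 25th, 50th and 75th centile values of a reference at the same time point, and how the study treated weights lying exactly on a quartile boundary is not stated, so the convention used here is an assumption.

```python
from bisect import bisect_right
from collections import Counter
from typing import Dict, Iterable, Tuple

# (observed weight, 25th centile, 50th centile, 75th centile) at the same time
Observation = Tuple[float, float, float, float]


def quartile_of(weight: float, q25: float, q50: float, q75: float) -> int:
    """Return 1-4 for the quartile in which a weight falls.

    A weight exactly on a boundary is counted in the upper quartile
    (an assumption of this sketch).
    """
    return bisect_right([q25, q50, q75], weight) + 1


def quartile_proportions(observations: Iterable[Observation]) -> Dict[int, float]:
    """Proportion of weights in each quartile; 0.25 each is a perfect fit."""
    counts = Counter(quartile_of(*obs) for obs in observations)
    total = sum(counts.values())
    return {q: counts.get(q, 0) / total for q in (1, 2, 3, 4)}
```

Proportions close to 0.25 in every quartile suggest that a reference represents the test data well, whereas an excess in the lowest quartile suggests that the data sit systematically below the reference.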

Results

Between 2011 and 2014, there were 1606 infants born at <32 weeks of gestation and admitted to a neonatal unit in Wales. Table 2 summarises the demographics for the whole cohort of infants. While the birth SDS were comparable with the reference data (UK1990 or LMS), SDS were significantly lower for all three gestation bands from soon after birth until discharge (Table 3), suggesting apparent growth failure during stay (supplementary Fig. 1). Five-weekly SDS data starting from birth is presented in Table 3; full weekly data is presented in Supplementary Tables 12.

Table 3 Gestation-band specific weekly mean standard deviation score (SDS) of weight with 95% confidence interval (CI) of the mean, from birth up to the 20th week of life.

Data from 1067 infants were used as the training cohort for the model, while data from the remaining third of the infants (539 infants) were used to test the model. Table 1 gives details of the gender- and gestation-specific demographics of the two groups of infants used as modelling data and test data. All clinical characteristics in the three gestation groups were well balanced between the training and test cohorts for both boys and girls, except for a small but statistically significant difference in the mean gestation of males compared to females in the 29–31-week group, and a significantly higher discharge weight in females in the 29–31-week group in the test cohort.

Using the methods described above, and given a hypothetical observation lying on one of the fractiles used in growth charts (i.e. 0.004, 0.02, 0.09, 0.25, 0.50, 0.75, 0.91, 0.98 and 0.996) at a particular point in time (one of 0, 2, 4, 6, 8, 10 and 12 weeks since birth), we were able to calculate the expected values of the two random effects and determine a predicted two-week growth trajectory. Figure 1 represents an example of the model output for girls born at 26 weeks of gestation, with test data plotted on the model centiles and also on the UK1990 curves for comparison. Effectively, the predictive model was recalculated every 2 weeks, leading to small changes in the centile lines at the extremes of the range. However, the fit for the test data was good on the model centiles, whereas the same data plotted uniformly on lower centiles of the UK1990 curves, as expected (Table 4). Detailed figures for all gestations are presented in the supplementary files (supplementary information 2 for boys and girls born at 23–31 weeks of gestation) and show good fit of the test data on the model centiles compared to the UK1990 data. From our model, it is possible to create predictive weight centiles for any infant born at any gestation, a fraction of any gestation, or a gestation band, between 23 and 31 weeks (supplementary information 2).

Figure 1

Results from the final model for females born at 26 weeks of gestation. (A) Model centile curves, (B) test data plotted on model centiles, and (C) test data plotted on UK1990 reference centiles. Days since birth are plotted on the x-axis of each panel, and weight in grams is plotted on the y-axis. Each point plotted in parts B and C is an individual data point from a test infant. All infants were included in this analysis.

Table 4 Proportions of plotted points in each quartile for the UK1990 data and for our new model including all infants.

In our cohort, there were 77 deaths in total before discharge from the neonatal unit (Table 1). As infants who died could have affected the model, a sub-group analysis for the growth curves was undertaken excluding all of the infants who died. Representative data for male infants born at 25 weeks of gestation with and without infants who died is presented in Figs. 2 and 3 respectively. Model fit for the subgroup analysis is presented in Table 5. Proportions of plots in each quartile were almost identical for both models (all infants and surviving infants).

Figure 2

Results from the final model for males born at 25 weeks of gestation. (A) Model centile curves, (B) test data plotted on model centiles, and (C) test data plotted on UK1990 reference centiles. Days since birth are plotted on the x-axis of each panel, and weight in grams is plotted on the y-axis. Each point plotted in parts B and C is an individual data point from a test infant. All infants were included in this analysis. Infants who died are represented as a bold red line in the figure.

Figure 3

Results from the final model for males born at 25 weeks of gestation. (A) Model centile curves, (B) test data plotted on model centiles, and (C) test data plotted on UK1990 reference centiles. Days since birth are plotted on the x-axis of each panel, and weight in grams is plotted on the y-axis. Each point plotted in parts B and C is an individual data point from a test infant. Only surviving infants were included in this analysis.

Table 5 Proportions of plotted points in each quartile for the UK1990 data and for our new model including surviving infants only.

Discussion

Although it has been shown that cross-sectional birthweight centiles differ in shape from postnatal growth curves in infants born at <32 weeks of gestation, and that mean growth curves vary by gestation10, this is the first attempt to analyse longitudinal weight data from extreme preterm infants and appropriately model them to produce predictive weight centile references until discharge. Our results show good fit for preterm infants at all gestations tested, and poor fit of the same data on the UK1990 curves.

Our proposed model has several features that distinguish it from the LMS approach, which is currently used to construct Neonatal and Infant Close Monitoring (NICM) Growth Charts11. Most importantly, it is based around two different timescales: one measuring elapsed time since conception, and the other elapsed time since birth. It seems to us that both these timescales are of immediate and obvious relevance in assessing the growth of preterm infants. A convenient aspect of the two-timescale approach is that all random effects may be thought of as growth rates (slopes), the intercept terms being (in the case of time since conception) known to be zero, and (in the case of time since birth) determined by growth since conception.

The discontinuities in Fig. 1 (and other similar figures in supplementary information) are intentional. In essence, such figures are composed of numerous individual two-week longitudinal predictions. While our underlying model can produce predictions of arbitrary duration and based on arbitrarily timed (since conception, and since birth) observed percentiles, such estimates would only really be tractable and practicable in dynamic, digital media. For static, printed formats, Fig. 1 represents a visual compromise: predictions may be based on observed percentiles at most one week distant from the truth and will then last between one and two weeks. The discontinuities also serve as a visual reminder of the difference between cross-sectional and longitudinal percentiles. For example, even among infants observed at the 90th percentile, there will be variation in subsequent growth; we plot the mean predicted trajectory, but the steps in the curves at the end of each two-week segment reflect real variation around this central estimate.

Preterm infants are probably the only population group where “growth” assessment is still undertaken by using cross-sectional reference charts plotting centiles at birth (or in-utero growth) as opposed to longitudinal measurements12. This was based on the principle that postnatal weight gain should reflect in-utero weight gain13. However, this standard seems to have never been achieved in practice6,10,14. It is also evident in almost all growth data presented in this high-risk population where preterm infants suffer a period of initial growth failure (defined variably) when plotted on the cross-sectional growth charts6,10,15,16. Aggressive nutritional interventions to achieve growth rates similar to the intrauterine growth trajectories have had variable success rates17 but more importantly serious long-term concerns have emerged18. Thus, the practical utility of traditional growth charts based on cross-sectional growth data of in-utero fetuses to monitor growth after preterm birth is questionable. In a recent detailed review, Villar and colleagues19 elegantly argued against this principle, as preterm infants are often unable to achieve the recommendations from these charts, even after receiving the best possible care17, or with advanced neonatal care10,20. In addition, studies have shown that preterm infants often have preserved head growth in preference to body weight during postnatal life, accounting for apparent “growth failure”21. More importantly, the authors argued that attempting to reflect in-utero weight gain by accelerated postnatal growth may not be desirable due to adverse long-term metabolic effects18,22,23.

In a recent systematic review of longitudinal studies attempting to create postnatal growth charts for preterm infants, Villar and colleagues8 noted that overall methodologic quality of the studies was fair to low by a scoring system. Several studies included in the review were from historical cohorts of infants, and their management would not be comparable with current practice. However, there were nine studies that looked at cohorts of infants born after the year 200010,24,25,26,27,28,29,30,31, which are more representative of current clinical care. Only three of these studies had sufficient sample size for analysis10,24,27. Of the nine studies, six studies did not publish any centile lines10,24,26,28,29,30 and one study produced only birth-weight specific data with no attempt at constructing charts25. The study by Bocca-Tjeertes and colleagues27 published reference centiles for growth of preterm infants up to 15 months of age. However, the early postnatal period from birth up to discharge from the neonatal unit was presented in a compressed format in the chart and would be challenging to use clinically on the neonatal unit. The study by Villar and colleagues31, while methodologically rigorous, had a small sample size (201 infants), with only 28 infants born below 33 weeks of gestation (centile charts were presented from 27 weeks of gestation). Thus, no longitudinal reference centile charts exist to monitor the most vulnerable preterm infants born at extremes of gestation, who are at the highest risk of having growth problems. This is a specific gap we have aimed to fill.

Our study has several strengths including a large sample size and the use of gestation (and not birth weight) as the correct representation of prematurity. We have used rigorous and detailed analysis followed by easily replicable mixed modelling methods allowing for variability (a constant in this population) at every stage, resulting in predictive weight-gain centiles for extreme preterm infants. This contemporary data reflects current clinical practice and any variations in clinical management are expected to be reflected in the large data set analysed. Borrowing strength across gestation groups has allowed us to make meaningful predictions for even the early gestation groups, where smaller numbers of infants were available to provide data. Parametric modelling allowed borrowing of strength across the whole dataset, so that (in principle) even a full-term infant could contribute something to our estimation of growth curves for extremely preterm infants, although in practice such contributions are small. Although we have presented reference centiles for only a few chosen gestations, this is neither due to data paucity nor model inadequacy but is instead a limitation of static graphical presentation. The model is in principle infinitely fine-grained and interpolates seamlessly between different ages and gestations. Any choice of static plots will necessarily represent a compromise between precision and practicality: it would be impractical to suggest neonatal units employ books of growth charts hundreds of pages long corresponding to all possible combinations of age and gestation. Nevertheless, we consider dynamic plots an attractive and viable option in the near future, where growth percentiles tailored exactly to a child’s age and gestation are instantly available electronically. Our data and model can already produce such dynamic predictions; our static representations provide an example of these possibilities.

There are some limitations that need pointing out. Although most infants in our cohort come from a population where gestation is assessed early, at around 12–14 weeks, by ultrasound, this data was not available to us to cross-check. Weight measurements were taken using similar machines at the different hospitals, but these were not standardised or quality controlled. Morbidity data in our retrospective cohort was of insufficient quality to include in the model predictions for individualised charts. All of these limitations could be addressed by designing a prospective national study to collect representative data.

In conclusion, we have published the first gestation- and gender-specific predictive longitudinal weight gain charts for extreme preterm infants. This model can be used to produce similar reference charts for other anthropometric measurements from preterm infants, by collecting prospective data in a national study.