Modelling predictive gender- and gestation-specific weight reference centiles for preterm infants using a population-based cohort study

We aimed to model longitudinal data to create predictive growth charts for weight in preterm infants from birth until discharge that took into account the differing growth rates after birth compared with in-utero growth and were therefore more representative of the data than the UK1990 reference charts. Data from birth until discharge (or death) was collected and rigorously cleaned for all infants born at <32 weeks of gestation over a 4-year period. Means and standard deviations from the UK1990 reference charts were used to compute standard deviation scores (SDS) for our cohort. Two-thirds of the data was randomly selected and used to create gestation- and gender-specific predictive weight centile lines through a novel application of mixed modelling methods. The remaining third of the data was used to test model fit by comparing expected vs actual weights for the new model with those predicted by the UK1990 model. Data from 1,510 preterm infants was analysed. Of these, 1067 were used to produce the predictive model. Weekly SDS were significantly lower than predicted throughout hospital stay for all gestation groups when compared with UK1990 data. The test data (n = 539) fitted the new centile lines substantially better than the UK1990 centile lines. Mixed modelling of longitudinal data produced new predictive references for weight centiles of preterm infants. A large population-based prospective study is needed to produce representative longitudinal reference growth charts using these methods.

New infant growth charts were adopted in the UK in May 2009, based on a large international study by the World Health Organisation, for monitoring growth of children between 0-4 years of age 1 . However, for preterm infants, the older UK1990 reference curves 2,3 continued to be used, as recommended by the Scientific Advisory Committee on Nutrition (SACN) in 2007. The Royal College of Paediatrics and Child Health in the UK (RCPCH) combined these two datasets to produce new Neonatal Infant Close Monitoring Charts (NICM) for boys and girls 4 , the current standard for growth monitoring in preterm infants in the UK. The UK1990 reference curves plot cross-sectional data at birth, an approach that has been used elsewhere for reference standards 5 . Growth charts constructed using cross-sectional reference data of weight at birth reflect expected or optimum in-utero growth rather than actual growth after preterm birth. Ideally, growth chart references should be based on longitudinal data from a reference population of 'healthy' infants. However, the majority of the preterm population born at <32 weeks of gestation is challenged by multiple additional morbidities, and it is unlikely that any of these infants can be classed as 'healthy'. Growth of these infants is profoundly affected by the direct or indirect cumulative impact of multiple morbidities, not all exclusively linked to nutritional limitations 6 . More likely, preterm infants represent a different population compared to fetuses of corresponding gestations 7 , and it is unrealistic to expect very preterm newborns to follow the intrauterine growth trajectory using current methods of nutrition delivery. A review of the current literature reveals that no longitudinal reference centile charts exist to monitor the most vulnerable preterm infants born at extremes of gestation, who are at the highest risk of having growth problems 8 .
This is the specific gap we aimed to fill: a well-designed longitudinal growth chart, derived from a large data set that represents the actual preterm population and contemporary clinical practice, that is likely to better represent the true growth potential of these infants. Variations in nutrition delivery may be expected to be smoothed out in such a large population data set.
We aimed to achieve this objective by collecting retrospective, longitudinal measurements of weight from preterm infants born at <32 weeks of gestation in Wales, starting from birth, and modelling gender- and gestation-specific predictive centile charts until discharge from the neonatal unit.

Methods
Data collection. Retrospective anonymised data was collected for all infants born at <32 weeks of gestation in the calendar years 2011-2014 from the 'Badgernet' electronic neonatal database (Clevermed Systems, Edinburgh, UK) for Wales. The database maintains clinical records for all preterm infants admitted to the neonatal units in Wales from 2011. All infants born during the study period were included in the data cleaning process. Reasons for necessary exclusions are detailed in the data cleaning section. The postnatal weight of these infants was recorded in the database according to practices in different neonatal units and varied from daily to once in 2 weeks. Along with weight, data was also collected on duration of mechanical ventilation (days), respiratory support (days) and parenteral nutrition (days). Data was collected from all infants from birth till discharge from the neonatal unit. Infants who died before discharge contributed data up to the point of their stay on the unit (Table 1 and Supplementary Fig. 2). Stay duration was variable due to the differences in morbidities among infants but tended to be longer for the earlier gestation groups.

Data cleaning.
1. Repeated values on successive days. If the weight (accurate to 4 significant figures) did not vary over at least two successive days, then the first value on the first day was taken as correct and the subsequent days with the same value were deleted as, from clinical experience, they were very likely to be re-entries without re-weighing in critically unwell and unstable infants.
2. Weights that were very likely to be 1/10th of what they should be were corrected by multiplying by 10 and removing duplicates.
3. Weights that were very likely to be 10 times what they should be were corrected by dividing by 10 and removing duplicates.
4. Weights that were very likely to be too large or small by comparison with the weights either side of them and not correctable as in 2 and 3 above were removed.
5. Patterns of weights that were repeated over successive groups of days were deleted beyond the first.
6. Any infant with only one distinct weight was removed e.g. either just birth weight or birth weight repeated for one or more subsequent days. These are mostly infants who died before a second weight could be recorded.
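The cleaning rules above can be illustrated with a minimal Python sketch for a single infant's series of weights. It is not the procedure used in the study: the function name `clean_weights`, the 30% tolerance and the use of the previously accepted weight as the comparison value are all assumptions made for illustration (the study compared each weight with those on either side of it), the first weight is assumed to be correct, and rule 5 (repeated multi-day patterns) is not implemented.

```python
def clean_weights(weights_g, tolerance=0.3):
    """Return a cleaned list of weights (grams) for one infant, or [] if unusable.

    Hypothetical simplification: each weight is judged against the last
    accepted weight, and the first weight is assumed to be correct.
    """
    if not weights_g:
        return []

    fixed = [weights_g[0]]
    for w in weights_g[1:]:
        ref = fixed[-1]

        def plausible(value):
            # Within +/- tolerance (as a fraction) of the last accepted weight
            return abs(value - ref) <= tolerance * ref

        if plausible(w):
            fixed.append(w)
        elif plausible(w * 10):   # rule 2: likely entered as one tenth
            fixed.append(w * 10)
        elif plausible(w / 10):   # rule 3: likely entered as ten times
            fixed.append(w / 10)
        # rule 4: otherwise the value is dropped as uncorrectable

    # rule 1 (and the "removing duplicates" part of rules 2-3):
    # keep only the first of any run of identical successive values
    deduped = [w for i, w in enumerate(fixed) if i == 0 or w != fixed[i - 1]]

    # rule 6: an infant with only one distinct weight contributes nothing
    return deduped if len(set(deduped)) >= 2 else []


example = [800, 800, 810, 81, 820, 8400, 850, 3000, 860]
cleaned = clean_weights(example)
```

In this example the entries 81 and 8400 are rescaled, 3000 is dropped, and the repeated 800 and the rescaled duplicate 810 are collapsed. A real implementation would need thresholds chosen with clinical input.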

Statistical analysis.
For the first part of the analysis (comparison with UK1990 data), the cohort was divided into three pragmatic gestation bands: 23+0 to 25+6 weeks, 26+0 to 28+6 weeks and 29+0 to 31+6 weeks. Demographic data is reported as means and proportions and was compared (model data vs test data) by 2-sided independent samples t-test or Fisher's test as appropriate. Date-wise weight data from all infants was initially converted to day-wise data till discharge and cross-checked. Infants were first sorted by gender and then by gestation (weeks and days). For calculation of standard deviation scores (SDS) and centiles, weights were first converted from g to kg. Birth weight (day 0) and weights for each week (day 7, 14, 21 etc.) ±3 days were recorded. SDS was calculated using the LMS Growth Excel add-in (Pan H, Cole TJ, LMS growth, a Microsoft Excel add-in to access growth references based on the LMS method Version 2.76. http://www.healthforallchildren.co.uk/; 2011). The LMS method was used to create NICM growth charts from the UK1990 data 3,4 . Gender-specific mean ± 95% confidence interval of weekly SDS and weights were plotted from birth to discharge. The complete data set was then randomly divided: two-thirds of the data was used for modelling (training cohort) and the remaining third was used as test data to validate the model (test cohort).
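For readers unfamiliar with the LMS method, the standard LMS formula converts a measurement into an SDS using three reference values for the relevant age and sex: L (Box-Cox power), M (median) and S (coefficient of variation). The sketch below shows that formula only. The function name `lms_sds` and the example L, M and S values are hypothetical and are not taken from the UK1990 reference, and the Excel add-in additionally looks up and interpolates the reference values, which is not reproduced here.

```python
import math


def lms_sds(weight_kg, L, M, S):
    """Standard deviation score for a measurement given LMS reference values.

    L: Box-Cox power, M: median, S: coefficient of variation, all for the
    infant's age and sex (to be supplied from the reference in use).
    """
    if L == 0:
        # Limiting case of the Box-Cox transform as L tends to 0
        return math.log(weight_kg / M) / S
    return ((weight_kg / M) ** L - 1) / (L * S)


# Hypothetical reference values, for illustration only
z_at_median = lms_sds(1.0, L=0.5, M=1.0, S=0.15)
z_above = lms_sds(1.2, L=1.0, M=1.0, S=0.1)
```

A weight equal to the reference median gives an SDS of zero, and with L = 1 the formula reduces to the familiar (weight - median) / (median × S).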
Extensive investigation identified an appropriate mixed effects regression model in which infant weight was explained by gender, time since birth and time since conception (calculated as the sum of time since birth and gestation minus two weeks) to create predictive weight-gain centiles from birth till discharge (at a variable period of time for different gestations) from the neonatal units. The use of these two distinct timescales is worth highlighting, as one of the main purposes of our work was to distinguish between pre- and post-birth weight gain. The relationship between weight and time since birth was captured using splines with boundary knots at 0 and 14 weeks and other knots at 1, 2, 3 and 4 weeks to properly accommodate the more varied behaviour in the first 4 weeks of life. Time since birth and time since conception were also included as quadratic polynomials. Random effects for each child allowed for correlation between an individual's observations. Interestingly, no random intercept terms were needed because foetal weight is known to be (effectively) zero when time since conception is zero, and any random intercept at birth is captured by the random slope on the time since conception scale. To reduce the complexity of our modelling, the two random slopes (one on each timescale) were assumed to be uncorrelated. By using parametric models to describe the relationship between weight, time since birth and time since conception, we were able to borrow strength between different gestation groups and ages.
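One way to write down the structure described in this paragraph is sketched below. The notation is ours and is an assumption rather than the fitted model: $w_{ij}$ is the $j$-th weight of infant $i$, $t_{ij}$ is time since birth, $g_i$ is gestation at birth and $c_{ij}$ is time since conception (all in weeks), $x_i$ is gender, and $\mu$ stands for the fixed-effects part (the spline in $t$ and the quadratic polynomials in $t$ and $c$). Normality of the random slopes and residuals is the usual mixed-model assumption and is not stated explicitly in the text.

```latex
\begin{align}
c_{ij} &= t_{ij} + g_i - 2, \\
w_{ij} &= \mu(x_i, t_{ij}, c_{ij}) + u_i\, c_{ij} + v_i\, t_{ij} + \varepsilon_{ij}, \\
u_i &\sim N(0, \sigma_u^2), \quad
v_i \sim N(0, \sigma_v^2), \quad
\operatorname{Cov}(u_i, v_i) = 0, \quad
\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2).
\end{align}
```

There is no random intercept: at $c_{ij} = 0$ the random part vanishes, and at birth ($t_{ij} = 0$) the between-infant variation in weight is carried entirely by $u_i\, c_{ij}$.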
Predicted quantiles were calculated at 0, 2, 4, 6, 8, 10 and 12 weeks since birth and applied for the following 14 days; thus, the model variance was effectively refreshed every two weeks with the additional data available. Growth trajectories were determined from the model at the following fractiles used in growth charts: 0.004, 0.02, 0.09, 0.25, 0.50, 0.75, 0.91, 0.98 and 0.996 9 . These fractiles can equivalently be interpreted as centiles when (e.g.) 0.02 is multiplied by 100% to give 2%.
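As a small illustration of how these fractiles relate to centiles and to standard deviation scores, the following Python sketch converts each fractile into a percentage and into the corresponding standard normal quantile. The function name `fractile_table` is ours, and the conversion to z-scores assumes a normal reference distribution.

```python
from statistics import NormalDist

# The nine fractiles used in growth charts, as listed in the text
FRACTILES = [0.004, 0.02, 0.09, 0.25, 0.50, 0.75, 0.91, 0.98, 0.996]


def fractile_table(fractiles=FRACTILES):
    """Return (fractile, centile in %, standard normal z-score) for each fractile."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [(p, round(100 * p, 1), std_normal.inv_cdf(p)) for p in fractiles]


table = fractile_table()
z_scores = [z for _, _, z in table]
```

The resulting z-scores are symmetric about zero and lie roughly two-thirds of a standard deviation apart. For a normally distributed quantity with a given mean and standard deviation, the value at each fractile would be the mean plus z times the standard deviation.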
All models and figures were produced using R for Windows (R Core Team, 2013).

www.nature.com/scientificreports

All the test data (all gestations and both genders) was used to numerically compare their fit to both the model we produced and the UK1990 curves by counting the number of weights in each quartile for each. Each weight was compared to the quartile levels at the same time point to determine the quartile to which it belonged. The results for each gender and gestation combination were combined and are given in the tables below, where the predicted values are those that should be seen in a completely representative model.
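The quartile comparison described above can be sketched in Python as follows. This illustrates the counting step only; the function names and the example cut-offs are hypothetical, and the quartile levels themselves would come from whichever reference (the new model or the UK1990 curves) is being assessed.

```python
from bisect import bisect_right
from collections import Counter


def quartile_of(weight, q25, q50, q75):
    """Return 1-4: the quartile band of a weight given the three cut-offs.

    A weight exactly equal to a cut-off is placed in the band above it.
    """
    return bisect_right([q25, q50, q75], weight) + 1


def quartile_proportions(observations):
    """observations: iterable of (weight, q25, q50, q75), one per test weight,
    with cut-offs taken from the reference at the same time point."""
    counts = Counter(quartile_of(*obs) for obs in observations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations supplied")
    return {q: counts.get(q, 0) / total for q in (1, 2, 3, 4)}


# Hypothetical weights (kg) and cut-offs, for illustration only
cutoffs = (1.0, 1.2, 1.4)
proportions = quartile_proportions([(w, *cutoffs) for w in (0.9, 1.1, 1.3, 1.5)])
```

For a completely representative reference, each band would be expected to hold about 25% of the test weights. A reference that is too high for the data would place more than 25% in the lowest band.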

Results
Between 2011 and 2014, 1606 infants were born at <32 weeks of gestation and admitted to a neonatal unit in Wales. Table 2 summarises the demographics for the whole cohort of infants. While the birth SDS were comparable with the reference data (UK1990 or LMS), SDS were significantly lower for all three gestation bands from soon after birth until discharge (Table 3), suggesting apparent growth failure during the stay (Supplementary Fig. 1). SDS data at 5-weekly intervals starting from birth is presented in Table 3; full weekly data is presented in Supplementary Tables 1-2.
Data from 1067 infants was used as the training cohort for the model, while data from the remaining third of the infants (539 infants) was used to test the model. Table 1 gives details of the gender- and gestation-specific demographics of the two groups of infants used as modelling data and test data. All clinical characteristics in the three gestation groups were well balanced between the training and test cohorts for both boys and girls, except a small but statistically significant difference in the mean gestation of males compared to females in the 29-31-week group, and a significantly higher discharge weight in females in the 29-31-week group in the test cohort.
Using the methods described above, and given a hypothetical observation lying on one of the fractiles used in growth charts (i.e. 0.004, 0.02, 0.09, 0.25, 0.50, 0.75, 0.91, 0.98 and 0.996) at a particular point in time (one of 0, 2, 4, 6, 8, 10 and 12 weeks since birth), we were able to calculate the expected values of the two random effects and determine a predicted two-week growth trajectory. Figure 1 represents an example of the model output for girls born at 26 weeks of gestation, with test data plotted on the model centiles and also on the UK1990 curves for comparison. Effectively, the predictive model was recalculated every 2 weeks, leading to small changes in the centile lines at the extremes of the range. However, the fit for the test data was good on the model centiles, whereas on the UK1990 curves the data was uniformly plotted on lower centiles, as expected.

Table 3. Gestation-band specific weekly mean standard deviation score (SDS) of weight with 95% confidence interval (CI) of the mean, from birth up to the 20th week of life. SDS was calculated by comparing with the UK1990 birth centiles data at each gestation. Numbers of infants in this table are different from the whole group as presented in Table 2 due to missing data at or within 3 days of the time-points considered, including at birth.

Using the model, it is possible to create predictive weight centiles for any infant born at any gestation, a fraction of any gestation, or a gestation band, between 23 and 31 weeks (Supplementary Information 2).
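The step of calculating the expected values of the two random effects for an observation on a given fractile can be illustrated with standard normal-theory conditioning. The Python sketch below is our own illustration and not the code used in the study. It assumes two independent, normally distributed random slopes (one on time since conception, one on time since birth) plus an independent normal residual, and every variance value shown is hypothetical.

```python
from statistics import NormalDist


def conditional_offsets(p, c, t, horizon, var_conc, var_birth, var_resid):
    """Predicted deviation from the population mean curve over `horizon` weeks.

    p: fractile of the observed weight; c, t: weeks since conception and
    since birth at the observation; var_*: assumed variances of the two
    random slopes and of the residual error.
    """
    # Marginal variance of an observation at (c, t) under the assumed model
    total_var = c ** 2 * var_conc + t ** 2 * var_birth + var_resid
    # Deviation from the mean for an observation lying on fractile p
    d = NormalDist().inv_cdf(p) * total_var ** 0.5
    # Expected random slopes given that deviation (bivariate normal conditioning)
    slope_conc = c * var_conc * d / total_var
    slope_birth = t * var_birth * d / total_var
    # Deviation of the predicted trajectory at 0, 1, ..., horizon weeks ahead
    return [slope_conc * (c + s) + slope_birth * (t + s) for s in range(horizon + 1)]


# Hypothetical variance components; infant observed at birth (t = 0), 24 weeks since conception
on_median = conditional_offsets(0.50, 24, 0, 2, 0.001, 0.002, 0.01)
on_91st = conditional_offsets(0.91, 24, 0, 2, 0.001, 0.002, 0.01)
```

Adding these offsets to the fixed-effects mean curve would give a two-week predicted trajectory under these assumptions. Because part of the observed deviation is attributed to residual error, the predicted trajectory starts slightly closer to the mean than the observation itself, which is one source of the small steps between successive two-week segments.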
In our cohort, there were 77 deaths in total before discharge from the neonatal unit (Table 1). As infants who died could have affected the model, a sub-group analysis for the growth curves was undertaken excluding all of the infants who died. Representative data for male infants born at 25 weeks of gestation with and without infants who died is presented in Figs. 2 and 3 respectively. Model fit for the subgroup analysis is presented in Table 5. Proportions of plots in each quartile were almost identical for both models (all infants and surviving infants).

Discussion
Although it has been shown that cross-sectional birthweight centiles differ in shape from postnatal growth curves in infants <32 weeks gestation and how the mean growth curves vary by gestation 10 , this is the first attempt to analyse longitudinal weight data from extreme preterm infants and appropriately model them to produce predictive weight centile references until discharge. Our results show good fit for preterm infants at all gestations tested, while we also demonstrate poor fit of the data on the UK1990 curves.
Our proposed model has several features that distinguish it from the LMS approach, which is currently used to construct Newborn and Infant Close Monitoring (NICM) Growth Charts 11 . Most importantly, it is based around two different timescales: one measuring elapsed time since conception, and the other elapsed time since birth. It seems to us that both these timescales are of immediate and obvious relevance in assessing the growth of preterm infants. A convenient aspect of the two-timescale approach is that all random effects may be thought of as growth rates (slopes), the intercept terms being (in the case of time since conception) known to be zero, and (in the case of time since birth) determined by growth since conception.
The discontinuities in Fig. 1 (and other similar figures in supplementary information) are intentional. In essence, such figures are composed of numerous individual two-week longitudinal predictions. While our underlying model can produce predictions of arbitrary duration and based on arbitrarily timed (since conception, and since birth) observed percentiles, such estimates would only really be tractable and practicable in dynamic, digital media. For static, printed formats, Fig. 1 represents a visual compromise: predictions may be based on observed percentiles at most one week distant from the truth and will then last between one and two weeks. The discontinuities also serve as a visual reminder of the difference between cross-sectional and longitudinal percentiles. For example, even among infants observed at the 90th percentile, there will be variation in subsequent growth; we plot the mean predicted trajectory, but the steps in the curves at the end of each two-week segment reflect real variation around this central estimate.
Preterm infants are probably the only population group where "growth" assessment is still undertaken by using cross-sectional reference charts plotting centiles at birth (or in-utero growth) as opposed to longitudinal measurements 12 . This was based on the principle that postnatal weight gain should reflect in-utero weight gain 13 . However, this standard seems to have never been achieved in practice 6,10,14 . It is also evident in almost all growth data presented in this high-risk population where preterm infants suffer a period of initial growth failure (defined variably) when plotted on the cross-sectional growth charts 6,10,15,16 . Aggressive nutritional interventions to achieve growth rates similar to the intrauterine growth trajectories have had variable success rates 17 but more importantly serious long-term concerns have emerged 18 . Thus, the practical utility of traditional growth charts based on cross sectional growth data of in-utero fetuses to monitor growth after preterm birth is questionable. In a recent detailed review, Villar and colleagues 19 elegantly argued against this principle, as preterm infants are often unable to achieve the recommendations from these charts, even after receiving the best possible care 17 , or with advanced neonatal care 10,20 . In addition, studies have shown that preterm infants often have preserved head growth in preference to body weight during postnatal life, accounting for apparent "growth failure" 21 . More importantly, the authors argued that attempting to reflect in-utero weight gain by accelerated postnatal growth may not be desirable due to long-term metabolic adverse effects 18,22,23 .
In a recent systematic review of longitudinal studies attempting to create postnatal growth charts for preterm infants, Villar and colleagues 8 noted that overall methodologic quality of the studies was fair to low by a scoring system. Several studies included in the review were from historical cohorts of infants, and their management would not be comparable with current practice. However, there were nine studies that looked at cohorts of infants born after the year 2000 10,24-31 , which are more representative of current clinical care. Only three of these studies had sufficient sample size for analysis 10,24,27 . Of the nine studies, six studies did not publish any centile lines 10,24,26,28-30 and one study produced only birth-weight specific data with no attempt at constructing charts 25 . The study by Bocca-Tjeertes and colleagues 27 published reference centiles for growth of preterm infants up to 15 months of age. However, the early postnatal period from birth up to discharge from the neonatal unit was presented in a compressed format in the chart and would be challenging to use clinically on the neonatal unit. The study by Villar and colleagues 31 , while methodologically rigorous, had a small sample size (201 infants), with only 28 infants born below 33 weeks of gestation (centile charts were presented from 27 weeks of gestation). Thus, no longitudinal reference centile charts exist to monitor the most vulnerable preterm infants born at extremes of gestation, who are at the highest risk of having growth problems. This is a specific gap we have aimed to fill.
Our study has several strengths, including a large sample size and the use of gestation (and not birth weight) as the correct representation of prematurity. We have used rigorous and detailed analysis followed by easily replicable mixed modelling methods allowing for variability (a constant in this population) at every stage, resulting in predictive weight-gain centiles of extreme preterm infants. This contemporary data reflects current clinical practice and any variations in clinical management are expected to be reflected in the large data set analysed. Borrowing strength across gestation groups has allowed us to make meaningful predictions for even the early gestation groups, where smaller numbers of infants were available to provide data. Parametric modelling allowed borrowing of strength across the whole dataset, so that (in principle) even a full-term infant could contribute something to our estimation of growth curves for extremely preterm infants, although in practice such contributions are small. Although we have presented reference centiles for only a few chosen gestations, this is due neither to data paucity nor to model inadequacy but is instead a limitation of static graphical presentation. The model is in principle infinitely fine-grained and interpolates seamlessly between different ages and gestations. Any choice of static plots will necessarily represent a compromise between precision and practicality: it would be impractical to suggest neonatal units employ books of growth charts hundreds of pages long corresponding to all possible combinations of age and gestation. Nevertheless, we consider dynamic plots an attractive and viable option in the near future, where growth percentiles tailored exactly to a child's age and gestation are instantly available electronically. Our data and model can already produce such dynamic predictions; our static representations provide an example of these possibilities.
There are some limitations that need pointing out. Although most infants in our cohort come from a population where gestation is assessed early at around 12-14 weeks by ultrasound, this data was not available to us to cross-check. Weight measurements were taken using similar machines at the different hospitals, but these were not standardised or quality controlled. Morbidity data in our retrospective cohort was of an insufficient quality to include in the model predictions for individualised charts. All of these limitations can be corrected by designing a prospective national study to collect representative data.
In conclusion, we have published the first gestation- and gender-specific predictive longitudinal weight gain charts in extreme preterm infants. This model can be used to produce similar reference charts for other anthropometric measurements from preterm infants, by collecting prospective data in a national study.