Preterm birth can lead to impaired language development. This study aimed to predict language outcomes at 2 years corrected gestational age (CGA) for children born preterm.
We analysed data from 89 preterm neonates (median GA 29 weeks) who underwent diffusion MRI (dMRI) at term-equivalent age and language assessment at 2 years CGA using the Bayley-III. Feature selection and a random forests classifier were used to differentiate typical versus delayed (Bayley-III language composite score <85) language development.
The model achieved balanced accuracy: 91%, sensitivity: 86%, and specificity: 96%. The probability of language delay at 2 years CGA is increased with: increasing values of peak width of skeletonized fractional anisotropy (PSFA), radial diffusivity (PSRD), and axial diffusivity (PSAD) derived from dMRI; among twins; and after an incomplete course of, or no exposure to, antenatal corticosteroids. Female sex and breastfeeding during the neonatal period reduced the risk of language delay.
The combination of perinatal clinical information and MRI features leads to accurate prediction of preterm infants who are likely to develop language deficits in early childhood. This model could potentially enable stratification of preterm children at risk of language dysfunction who may benefit from targeted early interventions.
A combination of clinical perinatal factors and neonatal DTI measures of white matter microstructure leads to accurate prediction of language outcome at 2 years corrected gestational age following preterm birth.
A model that comprises clinical and MRI features that has potential to be scalable across centres. It offers a basis for enhancing the power and generalizability of diagnostic and prognostic studies of neurodevelopmental disorders associated with language impairment.
Early identification of infants who are at risk of language delay, facilitating targeted early interventions and support services, which could improve the quality of life for children born preterm.
An estimated 15 million infants are born preterm (before 37 weeks of gestation) annually worldwide.1 Although advances in neonatal intensive care have led to a decrease in infant mortality rates over time, survivors of preterm birth are at increased risk of long-term neurocognitive impairment.2 Preterm birth may lead to language deficits that persist into school age3 and are associated with a range of negative sequelae across the life span, including poor academic performance, poor social, emotional and behavioural functioning, and unemployment.4,5 Neurodevelopmental trajectories are amenable to early intervention, which presents a window of opportunity to have a profound, long-lasting effect on later life.6 Therefore, there is a clear unmet clinical need for early identification of those children who are at high risk of poor language development.
Multiple outcome studies have demonstrated associations between prenatal, neonatal, and postnatal factors and early neurodevelopmental outcomes for preterm infants.7,8 In addition, preterm birth is closely associated with generalized microstructural changes in cerebral white matter, inferred from diffusion tensor imaging (DTI) (fractional anisotropy [FA], mean, axial, and radial diffusivities [MD, AD, RD]), and alterations in these have been linked to language delay.9 However, it is rare for research to combine data from different modalities for the development of prediction models for neurodevelopmental outcomes.
Nonetheless, a few studies have built and validated tools for prediction of the composite outcome of neurodevelopmental impairment at 2 years corrected gestational age (CGA) for children born preterm. Tyson et al.10 investigated the clinical and demographic characteristics of a cohort of infants born before 26 weeks of gestation and found that the risk of adverse neurodevelopmental outcome at 18–22 months CGA was predicted using gestational age (GA), sex, exposure to antenatal corticosteroids, multiple birth, and birth weight. Ambalavanan et al.11 reported that neurodevelopmental impairment at 18–22 months CGA was predicted by combining sex, respiratory illness severity, and enlarged ventricular size, periventricular leukomalacia, or porencephalic cyst on cranial ultrasound. Vesoulis et al.12 developed a tool for prediction of risk of neurodevelopmental impairment at 18–24 months CGA. This tool comprised of ventilator days, mode of delivery, exposure to antenatal corticosteroids, retinopathy of prematurity (ROP) requiring surgery, and magnetic resonance imaging (MRI) findings (cerebellar haemorrhage size, cerebellar haemorrhage laterality, intraventricular haemorrhage grade, white matter injury).
However, deficits in different developmental domains require different therapies and targeted support strategies. Thus, tools for stratification of children at high risk of impairment in specific developmental domains would be valuable. Recently, Vassar et al.13 evaluated the predictive value of structural MRI and DTI variables for classification of very preterm infants at high versus low risk of language delay. They developed a model for prediction of language delay that included DTI variables in three brain regions and achieved 89% sensitivity and 86% specificity. Ball et al.14 revealed that distinct patterns of brain structure and microstructure following preterm birth are linked to specific clinical and environmental factors, and these patterns correlate with neurodevelopmental outcome at 18–24 months CGA. Language outcome was associated with specific neuroanatomic variation, which was linked to age at scan, need for continuous positive airway pressure, birth weight, GA at birth, parenteral nutrition, surfactant administration, and mechanical ventilation.
In view of this evidence, we hypothesized that a combination of clinical, environmental, and imaging factors derived from DTI that capture generalized white matter dysmaturation would potentially enhance the prediction of language outcomes at 2 years CGA following preterm birth. Blesa et al.15 demonstrated that histogram-based variables derived from DTI (peak width of skeletonized [PS] FA, MD, RD, and AD), which represent generalized water content and myelination, can be used as biomarkers of microstructural white matter alterations associated with preterm birth. The advantage of the histogram-based framework is that it is fully automated, captures generalized white matter dysmaturation that characterizes the encephalopathy of prematurity, is computationally inexpensive compared with tract-specific approaches, and has high inter-scanner reproducibility.16
A prediction tool that combines clinical data and imaging biomarkers for early language development is lacking, and yet timely identification of future language deficits has clinical and research implications, because it could stratify infants at most need for early interventions. Here we aimed to develop a machine learning model that accurately predicts typical versus delayed language outcomes at 2 years CGA using a parsimonious feature set derived from clinical, demographic, and histogram-based variables computed from neonatal brain DTI.
Participants were selected from a longitudinal cohort of preterm neonates born at ≤33 weeks of gestation at the Royal Infirmary of Edinburgh between February 2012 and August 2015.17 Selection from the larger cohort was based on availability of diffusion MRI (dMRI) scans at term-equivalent age and 2-year language outcome. Ethical approval was obtained from the UK National Research Ethics Service (NRES), South East Scotland Research Ethics Committee (NRES numbers 11/55/0061 and 13/SS/0143). Written informed consent from parents/carers was obtained for all neonates. Exclusion criteria for the study were congenital anomalies, chromosomal abnormalities, congenital infections or major overt parenchymal lesions (cystic periventricular leukomalacia, haemorrhagic parenchymal infarction), and post-haemorrhagic ventricular dilatation. Infants with a contraindication to MRI at 3 Tesla were also excluded.
Clinical and demographic features
The selection of clinical and demographic features included in models was guided by extant literature linking biological and environmental exposures with neurocognitive development in preterm infants. Specifically, we studied the contribution towards prediction of language outcome at 2 years CGA of the following features: sex,10,11,18,19 GA (based on first trimester ultrasound),10,18 birth weight,10,20 maternal age,21 primiparity,19 twin status,10,20 maternal body mass index (BMI),22 medical history of maternal depression,23 administration of a complete course of antenatal corticosteroids for foetal lung maturation (defined as two doses 24 h apart), any antenatal corticosteroid exposure,10,12,19,20 administration of antenatal magnesium sulfate (MgSO4) for neuroprotection,24 mode of delivery (spontaneous vaginal delivery or caesarean section),19 total days requiring intubation while in the neonatal intensive care unit (NICU),11,12,18 bronchopulmonary dysplasia (defined as oxygen requirement at ≥36 weeks CGA),19,20,25,26 late-onset sepsis (defined as blood stream infection occurring ≥72 h postnatally with (a) bacterial pathogen isolated from blood culture or (b) blood culture growing coagulase-negative staphylococcus, along with one or more signs of generalized infection, and treatment with intravenous antibiotics for ≥5 days),20 necrotizing enterocolitis (NEC, defined as stages two or three according to the modified Bell’s staging for NEC27),25,28 ROP treated with laser therapy,12,29 and type of infant feeding at discharge from the neonatal unit (dichotomized as exclusive maternal breast milk versus exclusive formula or mixed feeding).30 All infants had placental histopathology performed and histological chorioamnionitis was defined using an established system.31 Maternal level of education (dichotomized as secondary school or below versus college, university or postgraduate studies)18,19,20 and socioeconomic status of the family, operationalized as Scottish Index of Multiple Deprivation 2016 (SIMD16) quintile, where 1 indicates the most deprived and 5 indicates the least deprived (https://www2.gov.scot/Topics/Statistics/SIMD), were also included.
Infants underwent a brain MRI scan at term-equivalent age (38–42 weeks GA) without sedation, during natural sleep after having been fed and swaddled. Vital signs were monitored throughout the scan, and hearing protection was provided for all neonates (MiniMuffs, Natus). All scans were supervised by a physician and a paediatric nurse trained in neonatal resuscitation.
A Siemens MAGNETOM Verio 3-Tesla MRI clinical scanner (Siemens Healthcare Gmbh, Erlangen, Germany) and 12-channel phased-array head coil were used to acquire dMRI data consisting of 11 T2-weighted and 64 diffusion-weighted (b = 750 s/mm2) single-shot, spin-echo, echo planar imaging volumes collected in the axial plane with 2 mm isotropic voxels (repetition time = 7300 ms, echo time = 06 ms, field of view = 256 mm, acquired matrix = 128 × 128, 50 contiguous interleaved slices with 2 mm thickness, acquisition time=9 min 29 s).
For each participant, the dMRI was denoised using a Marchenko-Pastur-PCA-based algorithm;32,33 eddy current and head movement were corrected using outlier replacement34,35,36 and bias field inhomogeneity correction was performed by calculating the bias field of the mean b0 volume and applying the correction to all the volumes.37 For each participant, PSFA, PSMD, PSRD, and PSAD were calculated using age-optimized methods described by Blesa et al.15 In summary, image data were registered to the Edinburgh Neonatal Atlas5015 using a tensor registration,38 and their DTI maps were calculated. Subsequently, the individual FA maps were projected into the template skeleton and multiplied by the atlas custom mask. Finally, the peak width of the histogram values within the skeletonized maps was calculated as the difference between the 95th and 5th percentiles.16 Figure 1 illustrates a summary of the process described. The code necessary to calculate histogram-based metrics can be found at https://git.ecdf.ed.ac.uk/jbrl/psmd. Figure 2 shows scatterplots of the values of the PS DTI metrics for all participants.
All children took part in a developmental assessment with a trained clinician at 2 years CGA (median age 24.13, range 23.1–28.27 months) using the Bayley Scales of Infant and Toddler Development, Third Edition (Bayley-III).39 We used the Bayley-III language composite score (mean 100, SD 15) as the response variable. The clinical cut-off of 85 (i.e. 1 SD below the mean) was used in order to assign children into two distinct groups, thus creating a binary outcome; children whose score was <85 were considered to have moderate-to-severe language impairment, while scores ≥85 were considered as normal range or higher.40
We compared three feature selection algorithms: (a) Boruta,41 (b) ReliefF expRank,42,43 and (c) random forests (RF) variable importance.44 The Boruta algorithm is a wrapper feature selection technique built around the RF learner, which uses Z score as the importance measure. In other words, it measures the importance of each feature by dividing the average loss of accuracy among all trees by the standard deviation of the accuracy loss. The basic idea of the ReliefF algorithm is to assign a ‘weight’ value to all features of a data set based on how well their values distinguish between the instances that are near to each other and thus how useful they are in predicting the response variable. The important features will have a large weight, while the redundant ones will have a low weight. In RF variable importance, variable importance is computed using the mean decrease in Gini index. We can measure the total amount that the Gini index is decreased by splits over a given feature, averaged over all trees. A large value indicates an important feature. In all cases, we obtain a feature ranking indicating in descending order their contribution towards prediction of the response variable. The final feature subset for each feature selection algorithm was selected using leave-one-out cross-validation (LOOCV), using only the training data set in each cross-validation iteration and following the process described by Tsanas et al.45 Subsequently, the selected feature subset was presented into a RF classifier46 in order to predict the binarized language composite score. Partial dependence plots (PDP)47 were constructed in order to assess how the selected features influence the prediction of the RF classifier. To quantify the strength of the association between the selected features, we used correlation analysis (the Spearman’s rank correlation coefficient was used to quantify the strength of the association between two continuous features, the phi coefficient was used to quantify the association between two binary features, and the point-biserial correlation coefficient was used to quantify the strength of the association between a continuous and a binary feature).
The data set is imbalanced since only 16% of the study group had a language composite score <85. To overcome the class imbalance problem in the data set, we explored different data balancing techniques: under-sampling of the majority class, over-sampling of the minority class, and the synthetic minority over-sampling technique (SMOTE),48 which has been previously used in similar unbalanced applications in the healthcare domain.49,50,51,52,53 We found that SMOTE yields the best results, which are presented in the paper. SMOTE is a training data enrichment method, where the minority class is over-sampled by creating new synthetic samples, to create a balanced data set. For each minority class sample, the k minority class nearest neighbours were identified (using the suggestion of Chawla et al. with k = 5) and synthetic samples were introduced along the line segments joining any or all of the k minority class nearest neighbours. Model validation was implemented using LOOCV. LOOCV involves holding out a single observation to be used as the test set, while the learner is trained using the remaining n − 1 observations (n is the total number of observations). The process is repeated n times and each time a different observation from the original data set is used as the test set. The result is n estimates of the test error. The final test error rate is the average of these n test error estimates. The accuracy of the model was assessed by constructing a confusion matrix, which is a contingency table of the observed and predicted classes. Missing data for both numeric and categorical features were imputed using multiple imputation by chained equations (five imputed data sets were created in each LOOCV iteration),54,55 based only on the information in the training set independently within each LOOCV iteration. Data analysis was conducted in R. The R packages used were: tidyverse, dplyr, caret, randomForest, CORElearn, Boruta, mice, ggplot2, DMwR, Hmisc, RGraphics, grid, gridExtra, and gridGraphics.
Two-year language data and dMRI of the brain at term-equivalent age were available from 89 children; demographic and clinical characteristics of the study population are presented in Table 1. At median age 24.13 months (range 23.24–28.27 months), 14 children had a language composite score <85. The percentage of missing values in the data set was 0.2% (1 participant had missing histological chorioamnionitis data, 2 participants had missing SIMD16, and 3 participants had missing maternal BMI).
Figure 3 illustrates the out-of-sample performance of the RF classifier (trained on approximately 150 samples in each LOOCV iteration) as a function of the number of features selected by the different feature selection algorithms. These data show that feeding a subset of eight features selected by the Boruta feature selection algorithm (a wrapper feature selection technique built around the RF learner) to the RF classifier gives the highest balanced accuracy. The selected feature subset comprises PSFA, twin status (yes or no), antenatal steroid exposure (complete or incomplete course), any antenatal steroid exposure (yes or no), sex (male or female), PSRD, PSAD, and feeding at discharge from the NICU (exclusive maternal breast milk versus exclusive formula or mixed feeding). Figure 4 shows the importance attributed to each feature by each of the feature selection algorithms. PSFA, twin status, the course of antenatal steroid exposure, any antenatal steroid exposure, sex, PSRD, PSAD, and feeding are the jointly most predictive features towards the prediction of the binarized language outcome. PDP were used to visualize relationships between the selected features and the response based on our model (see Fig. 5). The PDP provide insight into the effect of changing one or two features in terms of the model’s prediction (binary response variable, indicating whether language composite score <85). Regarding the histogram-based variables derived from DTI, the PDP show that the predicted language impairment probability rises with increasing PSFA, PSRD, and PSAD values. PSRD and PSAD are presented in the same plot because they are highly correlated as illustrated in the correlogram and correlation matrix in Fig. 6. Language composite score <85 at 2 years CGA is more likely following a twin pregnancy, an incomplete course of antenatal corticosteroids, or no exposure to antenatal steroids. Female sex and feeding with exclusive breast milk reduce the risk of future language delay.
Table 2 shows the confusion matrix of the out-of-sample classification performance of the RF classifier when mapping the selected feature subset (i.e., PSFA, twin status, antenatal corticosteroid exposure, sex, PSRD, PSAD, and feeding at discharge) to the binarized language composite score. Our model achieved balanced accuracy: 91%, sensitivity: 86%, and specificity: 96%.
Finally, we repeated the analysis to investigate separately the performance of the model when presented only with either clinical or MRI features, which led to reduced model performance. As shown in Table 3, the model that comprises clinical and MRI features outperformed the models using only clinical or MRI features. The combination of clinical and DTI features enhances the prediction of language outcomes at 2 years CGA following preterm birth.
We developed a parsimonious machine learning model that accurately identifies preterm infants who are likely to develop language impairment in early childhood. We explored the predictive value of 24 clinical, demographic, and brain imaging features and found that a robust subset of eight clinical characteristics and imaging biomarkers best predicts a language composite score <85 on the Bayley-III: PSFA, PSRD, PSAD, twin status, administration of an incomplete course of antenatal corticosteroids, no exposure to antenatal corticosteroids, male sex, and feeding with exclusive formula milk or mixed formula and breast milk. Overall, we demonstrated out-of-sample balanced accuracy: 91%, sensitivity: 86%, and specificity: 96%.
Feature selection was conducted by comparing three feature selection algorithms: (a) Boruta, (b) ReliefF expRank, and (c) RF variable importance. Feature selection methods can be broadly considered into three main categories: filter, wrapper, embedded methods. Filter feature selection methods work independently of a statistical learner relying on the general statistical properties of the data and thus select a feature subset that is not tuned or optimized towards a specific learning algorithm. Wrapper methods take a particular machine learning method into account in order to choose the best subset of the original features. They evaluate multiple models by training and testing in the feature space, thus optimizing the performance of the particular machine learning model that was used. Embedded methods choose the subset of features while the learning model is being constructed. This means that the resulting feature subset is specific to a particular learning algorithm. We chose to use a feature selection algorithm from each main category for our exploration; ReliefF is a filter technique, Boruta is wrapper feature selection technique built around the RF learner, and RF variable importance is an embedded method. The use of ReliefF and the RF importance have been extensively used and validated in many different applications and we have previously conducted a thorough empirical study56 where they performed very competitively against many established feature selection approaches. In general, we would expect a wrapper or embedded method to perform better for a particular choice of a classifier, although it might not necessarily generalize very well with the choice of different classifiers.
Our findings suggest that PSFA, PSRD, and PSAD, which detect generalized white matter microstructural alterations in preterm infants compared to infants born at term,15 are predictive of impaired language development at 2 years CGA. We explored the predictive value of whole-brain measures of PS DTI metrics, instead of tract-specific segmentations, because preterm brain dysmaturation is a substantially generalized process,57 and language development draws on broad cognitive capacities. We have found that the probability of language delay is higher with increased PSFA, PSRD, and PSAD. These features are consistent with delayed myelination, less coherent white matter organization, and altered axonal integrity in the preterm brain.15,58 Previous research has also shown that abnormalities in brain structure following preterm birth are correlated with long-term neurodevelopmental outcome.59
The data show that twin status is associated with increased risk of impaired language development. This finding is consistent with studies in the extant literature which have found that multiple pregnancy is associated with neurodevelopmental impairment10,20,60 and language delay61 at 2 years CGA. Language delay in twins can be attributed to postnatal environmental factors;62,63 twins receive a less focussed and less elaborated communicative interchange with their parents than do singletons. Thorpe et al.62 compared families with twins to families with pairs of closely spaced singletons. This study found that language delay in twins compared to singletons may be explained by patterns of parent–child interaction and communication. Antenatal corticosteroid administration is associated with lower risk of language deficits, which has been previously proved by research.10,12 Our findings suggest that male sex is a risk factor for language impairment in early childhood, consistent with previous studies that have associated male sex with poorer neurodevelopmental outcome following preterm birth.10,11,18,19
Moreover, previous work has shown that exclusive breast milk feeding in the weeks following preterm birth can enhance brain development,30 and in the general population breast milk intake in infancy is associated with improved performance on intelligence tests.64 In line with this, we found that exclusive breastfeeding is associated with improved language outcomes compared to formula feeding or mixed breast and formula feeding. It is surprising that GA at birth was not included in the final feature set. However, its influence on long-term outcome may be captured by PSRD and PSAD, which are strongly correlated with GA at birth.15
This study is the first to investigate the use of PS DTI metrics as predictors for language development in the preterm population. The advantage of using these image biomarkers is that their calculation is fully automated, computationally inexpensive, and has high inter-scanner reproducibility,16 meaning that they can be easily obtained for preterm neonates who undergo a dMRI scan at term-equivalent age and can be used for multi-centre studies. Thus, our model comprises features that can be easily obtained for future clinical application.
Hitherto, few studies have focussed on developing and validating prediction models for early neurodevelopmental outcomes for children born preterm. Most tools predict the composite outcome of neurodevelopmental impairment.10,11,12 However, deficits in different developmental domains require different interventions. Therefore, tools for timely identification of children at risk of impairment in specific developmental domains are valuable. The developed model predicts language deficits at 2 years CGA. Recently, a model was developed for classification of very preterm infants at high versus low risk for language delay, which achieved 89% sensitivity and 86% specificity.13 That model included DTI variables in three brain regions: MD of right sagittal stratum and right inferior occipital gyrus and AD of right lingual gyrus. However, whole-brain calculation of DTI variables is computationally expensive; hence, we investigated the predictive value of histogram-based variables derived from DTI. We have shown that combining DTI metrics with perinatal factors, along with the use of advanced machine learning techniques, can further improve identification of children at risk of language impairment.
The main strength of our study is that we had a longitudinal cohort of preterm infants that is deeply phenotyped with brain imaging and biological information that enabled us to investigate a large number of clinical, demographic, social, and DTI variables. We acknowledge some limitations in our study. The sample size is relatively small, and this is a single-centre study, so despite our best efforts with standard model validation techniques to assess model generalization we would need to further validate findings in a different cohort. Nonetheless, the study population was fairly representative of NICU populations in terms of comorbidities that have been associated with long-term neurodevelopmental outcomes. In addition, cortical grey matter was not assessed in this study. We focussed on alterations in white matter microstructure, since it is the most consistently abnormal finding in preterm infants, by measuring a functionally tractable property using a tool that is readily applied to clinical image data. Future studies could aim to validate our model in additional external cohorts and also apply machine learning techniques for prediction of motor, cognitive, and social–emotional outcomes for children born preterm.
A combination of clinical perinatal factors and neonatal DTI measures of white matter microstructure best predict language impairment at 2 years after preterm birth. This model has the potential to enable clinicians identify infants who are at risk of language delay, thus facilitating targeted early intervention and support services. The model comprises clinical and MRI features that have potential to be scalable across centres, so it offers a basis for enhancing the power and generalizability of diagnostic and prognostic studies of neurodevelopmental disorders associated with language impairment.
Chawanpaiboon, S. et al. Global, regional, and national estimates of levels of preterm birth in 2014: a systematic review and modelling analysis. Lancet Glob. Health 7, e37–e46 (2019).
Pierrat, V. et al. Neurodevelopmental outcome at 2 years for preterm children born at 22 to 34 weeks’ gestation in France in 2011: EPIPAGE-2 cohort study. BMJ 358, j3448 (2017).
van Noort-van der Spek, I. L., Franken, M.-C. J. P. & Weisglas-Kuperus, N. Language functions in preterm-born children: a systematic review and meta-analysis. Pediatrics 129, 745–754 (2012).
Law, J., Rush, R., Schoon, I. & Parsons, S. Modeling developmental language difficulties from school entry into adulthood: literacy, mental health, and employment outcomes. J. Speech Lang. Hear. Res. 52, 1401–1416 (2009).
Conti-Ramsden, G., Mok, P. L. H., Pickles, A. & Durkin, K. Adolescents with a history of specific language impairment (SLI): strengths and difficulties in social, emotional and behavioral functioning. Res. Dev. Disabil. 34, 4161–4169 (2013).
Spittle, A., Orton, J., Anderson, P. J., Boyd, R. & Doyle, L. W. Early developmental intervention programmes provided post hospital discharge to prevent motor and cognitive impairment in preterm infants. Cochrane Database Syst. Rev. CD005495 (2015).
Linsell, L., Malouf, R., Morris, J., Kurinczuk, J. J. & Marlow, N. Prognostic factors for poor cognitive development in children born very preterm or with very low birth weight: a systematic review. JAMA Pediatr. 169, 1162–1172 (2015).
Linsell, L., Malouf, R., Morris, J., Kurinczuk, J. J. & Marlow, N. Prognostic factors for cerebral palsy and motor impairment in children born very preterm or very low birthweight: a systematic review. Dev. Med. Child Neurol. 58, 554–569 (2016).
Feldman, H. M., Lee, E. S., Yeatman, J. D. & Yeom, K. W. Language and reading skills in school-aged children and adolescents born preterm are associated with white matter properties on diffusion tensor imaging. Neuropsychologia 50, 3348–3362 (2012).
Tyson, J. E. et al. Intensive care for extreme prematurity–moving beyond gestational age. N. Engl. J. Med. 358, 1672–1681 (2008).
Ambalavanan, N. et al. Outcome trajectories in extremely preterm infants. Pediatrics 130, e115–e125 (2012).
Vesoulis, Z. A., El Ters, N. M., Herco, M., Whitehead, H. V & Mathur, A. M. A web-based calculator for the prediction of severe neurodevelopmental impairment in preterm infants using clinical and imaging characteristics. Children 5, 151 (2018).
Vassar, R. et al. Neonatal brain microstructure and machine-learning-based prediction of early language development in children born very preterm. Pediatr. Neurol. 108, 86–92 (2020).
Ball, G. et al. Multimodal image analysis of clinical influences on preterm brain development. Ann. Neurol. 82, 233–246 (2017).
Blesa, M. et al. Peak width of skeletonized water diffusion MRI in the neonatal brain. Front. Neurol. 11, 235 (2020).
Baykara, E. et al. A novel imaging marker for small vessel disease based on skeletonization of white matter tracts and diffusion histograms. Ann. Neurol. 80, 581–592 (2016).
Boardman, J. P. et al. Impact of preterm birth on brain development and long-term outcome: protocol for a cohort study in Scotland. BMJ Open 10, e035854 (2020).
Charkaluk, M. L. et al. Neurodevelopment of children born very preterm and free of severe disabilities: the Nord-Pas de Calais Epipage cohort study. Acta Paediatr. 99, 684–689 (2010).
Wood, N. S. et al. The EPICure study: associations and antecedents of neurological and developmental disability at 30 months of age following extremely preterm birth. Arch. Dis. Child. Fetal Neonatal Ed. 90, F134–F140 (2005).
Vohr, B. R., Wright, L. L., Poole, W. K. & McDonald, S. A. Neurodevelopmental outcomes of extremely low birth weight infants <32 weeks’ gestation between 1993 and 1998. Pediatrics 116, 635–643 (2005).
Tseng, K.-T. et al. The impact of advanced maternal age on the outcomes of very low birth weight preterm infants. Medicine 98, e14336 (2019).
Reynolds, L. C., Inder, T. E., Neil, J. J., Pineda, R. G. & Rogers, C. E. Maternal obesity and increased risk for autism and developmental delay among very preterm infants. J. Perinatol. 34, 688–692 (2014).
Bozkurt, O. et al. Does maternal psychological distress affect neurodevelopmental outcomes of preterm infants at a gestational age of ≤32weeks. Early Hum. Dev. 104, 27–31 (2017).
Marret, S. et al. [Effect of magnesium sulphate on mortality and neurologic morbidity of the very-preterm newborn (of less than 33 weeks) with two-year neurological outcome: results of the prospective PREMAG trial]. Gynecol. Obstet. Fertil. 36, 278–288 (2008).
Synnes, A. et al. Determinants of developmental outcomes in a very preterm Canadian cohort. Arch. Dis. Child. Fetal Neonatal Ed. 102, F235–F234 (2017).
Twilhaar, E. S. et al. Cognitive outcomes of children born extremely or very preterm since the 1990s and associated risk factors: a meta-analysis and meta-regression. JAMA Pediatr. 172, 361–367 (2018).
Bell, M. J. et al. Neonatal necrotizing enterocolitis. Therapeutic decisions based upon clinical staging. Ann. Surg. 187, 1–7 (1978).
van Vliet, E. O. G., de Kieviet, J. F., Oosterlaan, J. & van Elburg, R. M. Perinatal infections and neurodevelopmental outcome in very preterm and very low-birth-weight infants: a meta-analysis. JAMA Pediatr. 167, 662–668 (2013).
Schmidt, B., Davis, P. G., Asztalos, E. V., Solimano, A. & Roberts, R. S. Association between severe retinopathy of prematurity and nonvisual disabilities at age 5 years. JAMA 311, 523 (2014).
Blesa, M. et al. Early breast milk exposure modifies brain connectivity in preterm infants. Neuroimage 184, 431–439 (2019).
Anblagan, D. et al. Association between preterm brain injury and exposure to chorioamnionitis during fetal life. Sci. Rep. 6, 37932 (2016).
Veraart, J., Fieremans, E. & Novikov, D. S. Diffusion MRI noise mapping using random matrix theory. Magn. Reson. Med. 76, 1582–1593 (2016).
Tournier, J.-D. et al. MRtrix3: a fast, flexible and open software framework for medical image processing and visualisation. Neuroimage 202, 116137 (2019).
Smith, S. M. et al. Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage 23, S208–S219 (2004).
Andersson, J. L. R. & Sotiropoulos, S. N. An integrated approach to correction for off-resonance effects and subject movement in diffusion MR imaging. Neuroimage 125, 1063–1078 (2016).
Andersson, J. L. R., Graham, M. S., Zsoldos, E. & Sotiropoulos, S. N. Incorporating outlier detection and replacement into a non-parametric framework for movement and distortion correction of diffusion MR images. Neuroimage 141, 556–572 (2016).
Tustison, N. J. et al. N4ITK: improved N3 bias correction. IEEE Trans. Med. Imaging 29, 1310–1320 (2010).
Zhang, H., Yushkevich, P. A., Alexander, D. C. & Gee, J. C. Deformable registration of diffusion tensor MR images with explicit orientation optimization. Med. Image Anal. 10, 764–785 (2006).
Albers, C. A. & Grieve, A. J. Test Review: Bayley, N. (2006). Bayley Scales of Infant and Toddler Development–Third Edition. San Antonio, TX: Harcourt Assessment. J. Psychoeduc. Assess. 25, 180–190 (2007).
Johnson, S., Moore, T. & Marlow, N. Using the Bayley-III to assess neurodevelopmental delay: which cut-off should be used? Pediatr. Res. 75, 670–674 (2014).
Kursa, M. B. & Rudnicki, W. R. Feature selection with the Boruta package. J. Stat. Softw. 36, 1–13 (2010).
Kira, K. & Rendell, L. A. The Feature Selection Problem: Traditional Methods and a New Algorithm (AAAI Press, 1992).
Kononenko, I. In Machine Learning: ECML-94. ECML 1994. Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence), Vol. 784 (eds Bergadano, F. & De Raedt, L.) 171–182 (Springer, 1994).
Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning Data Mining, Inference, and Prediction (Springer, 2009).
Tsanas, A., Little, M. A., McSharry, P. E., Spielman, J. & Ramig, L. O. Novel speech signal processing algorithms for high-accuracy classification of Parkinson’s disease. IEEE Trans. Biomed. Eng. 59, 1264–1271 (2012).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Friedman, J. H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001).
Chawla, N. V., Bowyer, K. W., Hall, L. O. & Kegelmeyer, W. P. SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002).
Dessie, E. Y., Tsai, J. J. P., Chang, J.-G. & Ng, K.-L. A novel miRNA-based classification model of risks and stages for clear cell renal cell carcinoma patients. BMC Bioinformatics 22, 270 (2021).
Park, K. H., Batbaatar, E., Piao, Y., Theera-Umpon, N. & Ryu, K. H. Deep learning feature extraction approach for hematopoietic cancer subtype classification. Int. J. Environ. Res. Public Health 18, 1–24 (2021).
Lee, Y. W., Choi, J. W. & Shin, E. H. Machine learning model for predicting malaria using clinical information. Comput. Biol. Med. 129, 104151 (2021).
Ivanović, M. D. et al. Predicting defibrillation success in out-of-hospital cardiac arrested patients: Moving beyond feature design. Artif. Intell. Med. 110, 101963 (2020).
Nguyen, Q. D. N., Liu, A. B. & Lin, C. W. Development of a neurodegenerative disease gait classification algorithm using multiscale sample entropy and machine learning classifiers. Entropy 22, 1340 (2020).
Raghunathan, T. E., Lepkowski, J., Hoewyk, J., Van & Solenberger, P. A multivariate technique for multiply imputing missing values using a sequence of regression models. Surv. Methodol. 27, 85–95 (2001).
Van Buuren, S. Multiple imputation of discrete and continuous data by fully conditional specification. Stat. Methods Med. Res. 16, 219–242 (2007).
Tsanas, A. Accurate Telemonitoring of Parkinson’s Disease Symptom Severity using Nonlinear Speech Signal Processing and Statistical Machine Learning. PhD thesis, Oxford Univ. (2012).
Telford, E. J. et al. A latent measure explains substantial variance in white matter microstructure across the newborn human brain. Brain Struct. Funct. 222, 4023–4033 (2017).
Boardman, J. P. & Counsell, S. J. Invited Review: Factors associated with atypical brain development in preterm infants: insights from magnetic resonance imaging. Neuropathol. Appl. Neurobiol. 46, 413–421 (2020).
Batalle, D., Edwards, A. D. & O’Muircheartaigh, J. Annual Research Review: Not just a small adult brain: understanding later neurodevelopment through imaging the neonatal brain. J. Child Psychol. Psychiatry Allied Discip. 59, 350–371 (2018).
Wadhawan, R. et al. Twin gestation and neurodevelopmental outcome in extremely low birth weight infants. Pediatrics 123, e220–e227 (2009).
Adams-Chapman, I., Bann, C. M., Vaucher, Y. E., Stoll, B. J. & Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network. Association between feeding difficulties and language delay in preterm infants using Bayley Scales of Infant Development-Third Edition. J. Pediatr. 163, 680.e3–685.e3 (2013).
Thorpe, K., Rutter, M. & Greenwood, R. Twins as a natural experiment to study the causes of mild language delay: II: Family interaction risk factors. J. Child Psychol. Psychiatry 44, 342–355 (2003).
Thorpe, K. Twin children’s language development. Early Hum. Dev. 82, 387–395 (2006).
Horta, B. L., Loret De Mola, C. & Victora, C. G. Breastfeeding and intelligence: a systematic review and meta-analysis. Acta Paediatr. 104, 14–19 (2015).
This work was supported by Theirworld (www.theirworld.org). The work was undertaken in the MRC Centre for Reproductive Health, which was funded by MRC Centre Grant (MRC G1002033). The study was also supported by Health Data Research UK, which receives its funding from HDR UK Ltd (HDR-5012) funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation, and the Welcome Trust. The funders had no role in the study and the decision to submit this work to be considered for publication.
The authors declare no competing interests.
Ethics approval and consent to participate
Ethical approval was obtained from the UK National Research Ethics Service (NRES), South East Scotland Research Ethics Committee (NRES numbers 11/55/0061 and 13/SS/0143). Written informed consent from parents/carers was obtained for all neonates.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Valavani, E., Blesa, M., Galdi, P. et al. Language function following preterm birth: prediction using machine learning. Pediatr Res 92, 480–489 (2022). https://doi.org/10.1038/s41390-021-01779-x
This article is cited by
Machine learning for understanding and predicting neurodevelopmental outcomes in premature infants: a systematic review
Pediatric Research (2023)
Intelligent wearable allows out-of-the-lab tracking of developing motor abilities in infants
Communications Medicine (2022)
Früherkennung primärer Sprachentwicklungsstörungen – zunehmende Relevanz durch Änderung der Diagnosekriterien?
Bundesgesundheitsblatt - Gesundheitsforschung - Gesundheitsschutz (2022)