Introduction

Post-hemorrhagic ventricular dilatation (PHVD) remains a clinically significant complication of germinal matrix–intraventricular hemorrhage (GMH-IVH) in very preterm infants, affecting up to 50% of infants with a large IVH. PHVD usually develops within the first 10–14 days of the onset of a GMH-IVH and is caused by impaired drainage of cerebrospinal fluid (CSF) from the ventricular system.1,2 Particularly in the early stage, it is mostly asymptomatic, being related to the compliant skull, large extracerebral spaces, and high water content of the white matter in preterm infants. Clinical signs of increased intracranial pressure (ICP) often only develop in a late stage, with severe ventricular dilatation.3

PHVD can be associated with adverse neurodevelopmental outcomes in survivors, which is probably related to injury and alterations to the immature and vulnerable preterm brain due to prolonged increased pressure on the brain tissue, inflammation and restricted cerebral perfusion.4,5,6,7 This stresses the importance of interventions to relieve pressure, even before signs of increased ICP develop. Although the optimal timing of intervention remains controversial, an increasing number of studies show benefits for long-term outcome with an early stepwise intervention approach, based on ventricular size measurements rather than signs of increased ICP.4,8,9

Cranial ultrasound (cUS) is the gold standard for diagnosing and following the progress of PHVD as it is an easily accessible, non-invasive bedside tool. In 60% of cases, PHVD is progressive over days to weeks,10 stressing the importance of serial imaging of the preterm brain, particularly when an IVH has been diagnosed. Following the evolution of lateral ventricular size with measurements from serial cUS has been shown to be superior to visual evaluations and clinical signs in the assessment of significant PHVD.11,12,13,14 The most commonly used measurements are the ventricular index (VI), anterior horn width (AHW), and fronto-temporal or fronto-occipital horn ratio (FTHR/FOHR).12,15,16,17 Concerns, however, remain about the reproducibility of these indices between observers, especially between observers with differing levels of expertise in cUS assessments. It is also unclear which index, measured at which time point during the neonatal period, is the most predictive of the severity of PHVD and need for intervention.

The primary objective of this retrospective observational cohort study in preterm infants with IVH was to assess the reliability of ventricular size indices from cUS between observers with differing levels of expertise in cUS assessments. A secondary objective was to assess the predictive value of ventricular size indices throughout the neonatal period for the severity of PHVD as reflected by the receipt of surgical intervention in the same population.

Patients and methods

Patients

This retrospective observational cohort study involved preterm infants with a gestational age (GA) of <29 weeks admitted to Neonatal Intensive Care Units (NICUs) in Calgary between January 2013 and December 2018 and diagnosed with GMH-IVH of any grade1 as assessed by 3 reviewers by consensus according to our standard practice. Exclusion criteria included congenital brain malformations; (suspected) meningitis; and metabolic, chromosomal, or genetic disorders.

For all the included infants, relevant perinatal and neonatal parameters were collected by chart review, including GA at birth, birth weight, and postnatal age and postmenstrual age (PMA) at the time of cUS; receipt and timing of neurosurgical intervention for PHVD (see definition below); and death during the neonatal period. The decision for surgical intervention was based on the presence of severe ventricular dilatation (as defined below) and any one clinical sign of increased ICP (including bulging fontanel, mid-sagittal suture split of >2 mm, and increase in head circumference with crossing of percentiles). Temporary ventricular drainage devices (i.e., ventricular reservoirs) or permanent drainage devices (i.e., ventriculo-peritoneal shunts) were used at the discretion of the attending Pediatric Neurosurgeon. Approval from the Conjoint Human Research Ethics Board at the University of Calgary was obtained; requirement for informed consent from participants for this retrospective study with anonymized data was waived.

Cranial ultrasound

A database search for all available cUS scans of included preterm infants obtained throughout the neonatal period was conducted. As per routine clinical care in the NICUs in Calgary during the study period, serial cUS were performed on all extremely preterm infants on day 4–7 of birth (time point 1), on day 14 of birth (time point 2), and around 36 weeks PMA (time point 3). Scans were obtained using a General Electric LOGIQ E9 ultrasound machine (GE Healthcare, United States). Scans were performed according to the local standard cUS protocol, which included at least six coronal and five sagittal planes via the anterior fontanel. Scanning through the mastoid window was not standard practice during the first years of the study period. The protocol remained unchanged throughout the study period.

cUS assessments

For each infant, digitally stored cUS scans acquired at the 3 standard time points were assessed independently by three observers with differing levels of expertise in neonatal cUS assessments: (1) neuroradiologist with ample training (J.N.S. with >18 years of experience), (2) neonatologist with ample experience (L.M.L. with 15 years of experience), (3) trainee with minimal training and experience (neonatal fellow; S.R.); the last observer had not previously performed ventricular size measurements. All observers were blinded to the infants’ identity, clinical course, and previous cUS findings. cUS assessments included the presence, grade, and side of GMH-IVH; presence of PHVD; and measurements of size of both the left and right lateral ventricle, including the indices VI, AHW, and FTHR, as previously described12,15,16 and depicted in Fig. 1. The VI was measured in the coronal plane at the level of the foramen of Monro as the distance between the falx and the lateral wall of the anterior horn and the AHW in the same plane as the diagonal width between the walls of the anterior horn at its widest point. The FTHR was obtained by measuring the widest distance of the frontal and temporal horns and dividing the average of these measurements by the largest bi-parietal distance. These distances were taken from different coronal planes to obtain the maximum width for each measurement. In case of a porencephalic cyst following periventricular hemorrhagic infarction (PVHI), the cystic area was not included in the measurements. Measurements of the VI and AHW in millimeters and ratios for the FTHR at the three time points were recorded for each infant by the three observers separately.
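The FTHR calculation described above can be written out as a minimal sketch. The function name `fthr`, its parameter names, and the example distances are hypothetical and are not taken from the study; A, B, and C follow the labels in Fig. 1.

```python
def fthr(frontal_mm: float, temporal_mm: float, biparietal_mm: float) -> float:
    """Fronto-temporal horn ratio: mean of distances A and B divided by C.

    frontal_mm    -- widest frontal horn distance (A)
    temporal_mm   -- widest temporal horn distance (B)
    biparietal_mm -- largest bi-parietal distance (C)
    """
    if biparietal_mm <= 0:
        raise ValueError("bi-parietal distance must be positive")
    return ((frontal_mm + temporal_mm) / 2.0) / biparietal_mm


# Hypothetical distances in mm, for illustration only.
example_ratio = fthr(frontal_mm=30.0, temporal_mm=26.0, biparietal_mm=70.0)
print(round(example_ratio, 2))
```

Because the ratio is normalized by the bi-parietal distance, any disagreement between observers on distance C affects the FTHR directly, even when A and B are measured identically.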

Fig. 1: Cranial ultrasound images showing measurement of ventricular size indices.

Coronal cranial ultrasound views with the arrows showing how the ventricular size indices, including ventricular index (a), anterior horn width (a), and fronto-temporal horn ratio (b), are obtained. A represents the widest frontal horn distance, B the widest temporal horn distance, and C the largest bi-parietal diameter. AHW anterior horn width, FTHR fronto-temporal horn ratio, VI ventricular index.

Outcomes

Measures used as outcomes consisted of the presence and severity of PHVD and the receipt of neurosurgical intervention. PHVD was defined according to previously described cut-off values for ventricular dilatation12,15,16,17 based on the most severe side and our local protocol as:

Moderate PHVD: VI >97th percentile and AHW >6 mm or FTHR 0.45–0.49;

Severe PHVD: VI >97th percentile + 4 mm and AHW >10 mm or FTHR >0.5.
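As a rough sketch, these cut-offs could be encoded as follows. Two assumptions are made here that the definitions do not state explicitly: the VI/AHW criterion and the FTHR criterion are treated as two alternative routes to a diagnosis (in line with how incidences are reported separately for each), and FTHR values between 0.49 and 0.5 are treated as moderate. The function names are hypothetical, and the 97th-percentile VI for the infant's PMA has to be supplied by the caller from a reference chart, which is not reproduced here.

```python
def phvd_by_vi_ahw(vi_mm: float, ahw_mm: float, vi_p97_mm: float) -> str:
    """Grade PHVD from the VI and AHW of the most severely affected side.

    vi_p97_mm is the 97th-percentile VI for the infant's postmenstrual age,
    taken from a reference chart by the caller (an input, not computed here).
    """
    if vi_mm > vi_p97_mm + 4.0 and ahw_mm > 10.0:
        return "severe"
    if vi_mm > vi_p97_mm and ahw_mm > 6.0:
        return "moderate"
    return "none"


def phvd_by_fthr(ratio: float) -> str:
    """Grade PHVD from the FTHR alone."""
    if ratio > 0.5:
        return "severe"
    # Assumption: the stated moderate range (0.45-0.49) is extended up to 0.5,
    # because the definitions leave values between 0.49 and 0.5 unassigned.
    if ratio >= 0.45:
        return "moderate"
    return "none"


# Hypothetical values, for illustration only.
print(phvd_by_vi_ahw(vi_mm=12.0, ahw_mm=7.0, vi_p97_mm=11.0))
print(phvd_by_fthr(0.47))
```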

Receipt of neurosurgical intervention was defined as insertion of a ventricular drainage device, being either a temporary or a permanent device. The date at the first neurosurgical intervention was used for determining the timing of intervention.

Statistical analyses

Data analyses were performed using IBM SPSS Statistics version 25 (IBM Corp., Armonk, NY). Continuous clinical parameters were presented as mean and standard deviation (SD), and categorical parameters were presented as number and percentage (%). Analyses for the primary and secondary objective were performed in a two-step approach. For the primary objective, all measurements of ventricular size by all three observers were included, while for the secondary objective the average measurements from the two observers with the highest inter-observer reliability were used.

Inter-observer (all 3 observers) and inter-expertise (i.e., ample training versus minimal training/experience, ample experience versus minimal training/experience, and ample training versus ample experience) reliability for the measurements of ventricular size indices at all 3 time points were calculated as intra-class correlation coefficient (ICC) and 95% confidence interval using a two-way random-effects model testing for absolute agreement.18 To assess intra-observer agreement (as ICC) for the measurements, the observers repeated the measurements for a random sample of 15 infants with PHVD using 3 cUS scans each. The effectiveness of instruction and practice was assessed by comparing the agreement between repeated measurements by the observer with minimal training/experience for a random sample of 15 infants before and after instruction. ICC was classified as poor–moderate for ICC < 0.8, good for 0.8 < ICC < 0.9, and as excellent for ICC > 0.9. Receiver operating characteristic analysis was used to assess the predictive value of the measurements of ventricular size indices at all 3 time points for receipt of surgical intervention for severe PHVD; for infants in whom surgical intervention was performed after time point 3, all 3 time points were included, while for the infants in whom surgical intervention was performed between time points 2 and 3, only the first 2 time points were used. Predictive values were depicted as area under the receiver operating characteristic curve (AUC) and 95% confidence interval. AUC was classified as fail–poor for AUC < 0.7, fair for 0.7 < AUC < 0.8, good for 0.8 < AUC < 0.9, and as excellent for AUC > 0.9.
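To illustrate the reliability measure, the following is a minimal sketch of a two-way random-effects ICC for absolute agreement. It is not the SPSS procedure used in the study. The single-measure form, often written ICC(2,1), is an assumption, as the text does not state whether single or average measures were reported; confidence intervals are omitted. The function names and example measurements are hypothetical, and assigning values of exactly 0.8 or 0.9 to the higher class is also an assumption.

```python
import numpy as np


def icc_2_1(ratings) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    ratings: array-like of shape (n_subjects, k_observers), no missing values.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)    # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)    # between observers
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    return float((ms_rows - ms_err) / denom)


def classify_icc(icc: float) -> str:
    """Apply the ICC classes from the text; boundary handling is an assumption."""
    if icc >= 0.9:
        return "excellent"
    if icc >= 0.8:
        return "good"
    return "poor-moderate"


# Hypothetical VI measurements (mm): rows are infants, columns are observers.
vi_mm = [
    [10.1, 10.4, 11.0],
    [12.3, 12.0, 13.1],
    [9.8, 9.9, 10.6],
    [14.2, 14.5, 15.0],
    [11.0, 11.2, 12.1],
]
icc = icc_2_1(vi_mm)
print(f"ICC(2,1) = {icc:.2f} ({classify_icc(icc)})")
```

Because absolute agreement is tested, a systematic offset of one observer (the between-observer mean square) lowers the coefficient, which a consistency-type ICC would ignore.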

Results

Patient characteristics

A total of 671 extremely preterm infants were admitted to the Calgary NICUs during the study period, of whom 139 (21%) met the inclusion criteria for the study, including diagnosis of GMH-IVH of any grade. Mean GA and birth weight of the included infants were, respectively, 25.9 weeks and 830 g; 75 out of the 139 (53%) included infants were male. Forty-six (7% of total admissions) infants with GMH-IVH were excluded because of the additional diagnosis of congenital brain malformations (n = 10); (suspected) meningitis (n = 27); or metabolic, chromosomal, or genetic disorders (n = 9), and 486 (72%) infants did not have a GMH-IVH. Basic clinical parameters of the included infants are shown in Table 1. All infants had a cUS at time point 1, 129 (93%) at time point 2, and 121 (87%) at time point 3, adding up to a total of 389 cUS scans. The decrease in number of available cUS per time point was related to infants passing away in the neonatal period. Twelve infants (9%) received neurosurgical intervention, at a mean PMA of 39.5 weeks. In five of these 12 infants, a ventricular reservoir was placed as first ventricular drainage device, while in the remaining seven infants a ventriculo-peritoneal shunt was placed. In three of the infants with initial reservoir insertion, a subsequent ventriculo-peritoneal shunt was needed. Twenty infants (14%) died during the neonatal period, 18 before cUS time point 3 and two after cUS time point 3. All infants died from a combination of clinical conditions in view of extreme prematurity; none of the infants died due to severe IVH, PHVD, or surgical intervention as such.

Table 1 Clinical characteristics of the included extremely preterm infants.

cUS assessments

Visual assessment and ventricular size indices

Of the 139 included infants with IVH, the IVH was unilateral in 61 (44%) and bilateral in 78 (56%). Most severe IVH was graded as grade 1 in 43 (31%), grade 2 in 35 (25%), and grade 3 in 9 (7%) infants, and 52 (37%) infants showed a PVHI. In 32 of the 38 (84%) infants with a PVHI who survived until 36 weeks PMA, the PVHI evolved into a porencephalic cyst by time point 3 (18 left-sided, 14 right-sided). At all 3 cUS time points, the overall incidence of PHVD, according to the cut-off values for moderate PHVD, was markedly lower when defined based on VI and AHW as compared to when defined based on the FTHR (all observers combined); in 27 (19%) and 84 (60%) infants, PHVD persisted over the course of time points 2 to 3 when defined based on, respectively, VI and AHW or FTHR. Details are shown in Table 1.

Intra-observer, inter-observer, and inter-expertise reliability

Intra-observer reliability

Intra-observer agreement on ventricular size indices was excellent for the observers with ample training and ample experience (ICC > 0.9 for both). For the observer with minimal training/experience, the agreement markedly increased from moderate (ICC 0.78) prior to instruction to good (ICC 0.88) after instruction for the VI and AHW; this correlation remained moderate for the FTHR (ICC 0.79).

Inter-observer reliability

Inter-observer reliability for all 3 observers combined varied from poor to excellent. Overall, at all 3 time points, reliability was slightly higher for the VI and AHW (ICC range 0.49–0.84 and 0.51–0.91, respectively) than for the FTHR (ICC range 0.41–0.82), mostly attributable to differences in measurements of the bi-parietal distance (distance C). Also, inter-observer reliability was higher for indices taken from cUS performed at time points 2 and 3 than for those taken at time point 1; ICC was mostly poor for the indices at time point 1, becoming moderate for VI and FTHR at time points 2 and 3 and good for AHW at time point 3.

The inter-observer reliability for all 3 indices in the 32 infants who developed a porencephalic cyst by time point 3 (18 left-sided, 14 right-sided) was comparable to that for the total cohort of infants. ICC for these groups were, respectively, 0.75 and 0.82 for the right VI, 0.57 and 0.69 for the left VI, 0.94 and 0.91 for the right AHW, 0.76 and 0.79 for the left AHW, and 0.78 and 0.82 for the FTHR.

Inter-expertise reliability

For all indices, the highest inter-expertise reliability was found between the observers with ample experience and ample training, almost exclusively showing good to excellent correlation (ICC range 0.65–0.99); the reliability between observers with ample training or experience and with minimal training/experience showed mostly poor to moderate correlation (ICC range 0.28–0.88). The inter-expertise reliability for the observers with ample experience and ample training was higher for the VI (ICC range 0.87–0.99) and AHW (ICC range 0.88–0.99) as compared to FTHR (ICC range 0.65–0.91), being most pronounced for time point 1 and less pronounced for time point 2 and particularly for time point 3.

Details on inter-observer and inter-expertise reliability for all ventricular size indices at all three time points are depicted in Table 2. The inter-observer and inter-expertise reliability for the VI and AHW were more often higher for the right lateral ventricle than for the left lateral ventricle, with the difference between sides being most pronounced when including the measurements by the observer with minimal training/experience.

Table 2 Inter-observer and inter-expertise reliability for all ventricular size indices, including the components of the FTHR, from all available cUS of the 139 extremely preterm infants at all 3 time points.

Predictive values for receipt of neurosurgical intervention

As the highest reliability for all ventricular size indices was found between the observers with ample training and with ample experience, the average of their measurements was used to calculate predictive values for the severity of PHVD as reflected by receipt of surgical intervention. Predictive values of all ventricular size indices at all 3 time points are depicted in Table 3. For all ventricular size indices, a good to excellent predictive value for receipt of surgical intervention was found, with slightly higher predictive values at time points 2 and 3 as compared to time point 1. Overall, a slightly higher predictive value for receipt of intervention was found for the AHW (AUC range 0.86–0.96) compared to VI (AUC range 0.80–0.97) and FTHR (AUC range 0.80–0.95) and for the left lateral ventricle compared to the right lateral ventricle, particularly at time point 1.
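The AUC values reported here can be read as the probability that a randomly chosen infant who received intervention had a larger measurement than a randomly chosen infant who did not. A minimal sketch of that rank-based (Mann-Whitney) calculation follows. It is not the SPSS procedure used in the study, confidence intervals are omitted, the function names and example values are hypothetical, and assigning values exactly on a class boundary to the higher class is an assumption.

```python
import numpy as np


def auc_mann_whitney(values, had_intervention) -> float:
    """AUC as the share of (intervention, no-intervention) pairs in which the
    infant with intervention has the larger measurement; ties count one half."""
    v = np.asarray(values, dtype=float)
    y = np.asarray(had_intervention, dtype=bool)
    pos, neg = v[y], v[~y]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both outcome groups must be present")
    diff = pos[:, None] - neg[None, :]  # all pairwise differences
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))


def classify_auc(auc: float) -> str:
    """Apply the AUC classes from the methods; boundary handling is an assumption."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    return "fail-poor"


# Hypothetical AHW values (mm) and intervention status, for illustration only.
ahw_mm = [3.0, 4.5, 5.0, 7.5, 6.0, 11.0, 12.5]
intervention = [0, 0, 0, 0, 1, 1, 1]
auc = auc_mann_whitney(ahw_mm, intervention)
print(f"AUC = {auc:.2f} ({classify_auc(auc)})")
```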

Table 3 Predictive values of all ventricular size indices for receipt of surgical intervention for PHVD in all 139 included extremely preterm infants.

Discussion

PHVD is a clinically important complication of prematurity; however, the assessment of the severity of PHVD and the associated risk of brain tissue damage and need for intervention remain a matter of debate. With a trend toward earlier intervention based on ventricular measurements, it is of critical importance that ventricular size indices are easy to perform and repeat, reproducible with low intra-observer and inter-observer variability, and strongly reflective of severity and progress of PHVD. This study is the first to simultaneously explore the reproducibility of several ventricular size indices between observers with differing levels of expertise in cUS assessment and the indices’ association with short-term outcome of receipt of surgical intervention for PHVD in a large cohort of extremely preterm infants. Our findings confirmed that the VI and AHW are highly reproducible with good to excellent intra-observer and inter-observer agreement when performed by experienced clinicians. The measurement of the AHW, especially from the second week of birth, was the strongest predictor of PHVD severity, as reflected by receipt of surgical intervention. This suggests that the AHW, preferably in combination with the VI, from the second week of birth may aid in early recognition of severe and progressive PHVD and decision-making with regard to the need for and timing of intervention. Consistent use of these indices has the potential to improve the management of PHVD in preterm infants and therewith the long-term outcomes in this vulnerable population.

PHVD is caused by impaired CSF drainage related to reduced reabsorption by subarachnoid villi and/or outflow obstruction by clot formation or scarring within the ventricular system.1,2 Previous studies using various neuro-monitoring tools, such as amplitude-integrated electroencephalogram, near-infrared spectroscopy, Doppler ultrasound, and magnetic resonance imaging, have shown PHVD to be associated with abnormal brain activity and perfusion and altered brain growth and maturation, even in the absence of signs of increased ICP.19,20,21,22,23,24,25,26,27,28,29,30 Reassuringly, brain activity and perfusion have also been shown to recover upon reduction of intraventricular pressure following CSF drainage.22,23 This highlights the potential benefits of timely intervention based on easily applicable, objective indices for following ventricular size in case of PHVD.

The VI and AHW were highly reproducible when performed by observers with experience in cUS assessment. Instruction and practice strongly improved the reproducibility of these indices in inexperienced hands, indicating that teaching of a single, linear index is feasible, fast and effective. Our findings are consistent with previous studies that found a high intra-observer and inter-observer agreement for the VI and AHW,12,16,17,31 indices that are most commonly used to assess PHVD by neonatologists and radiologists. In contrast with a recent study,15 a lower inter-observer agreement, particularly at time point 1, was found for the FTHR, an index more commonly used by neurosurgeons. In the study by Radhakrishnan et al.,15 the inter-observer reliability for the FTHR was assessed from cUS of preterm and term-born infants diagnosed with ventriculomegaly or hydrocephalus at a mean postnatal age of 35.5 days. The inconsistency between studies may therefore be related to the fact that the lateral ventricles were dilated in all infants in their study at the time of assessment, while there was a wide range in ventricular size in our study, from absent dilatation to severe dilatation, at all time points. From our personal experience, the landmarks for the FTHR are easier to identify when the ventricles are enlarged, which is strengthened by the increase in inter-observer reliability for the FTHR from time point 1 to time points 2 and 3 for the infants with PHVD.

Of interest, in accordance with previous studies,16,17 the inter-observer reliability for the VI and AHW was more often higher for the right than for the left lateral ventricle. This may, as previously hypothesized, be related to changes in the infant’s head position during scanning, resulting in asymmetry of the ventricles and CSF shifts and therewith contributing to variable indices on the right versus the left lateral ventricle. However, owing to the retrospective nature of this study, and as head position and changes therein during scanning were not reported, this study cannot confirm this hypothesis. Also, in contrast, the predictive value of the indices for receipt of surgical intervention was generally higher for the left than for the right lateral ventricle.

Although both VI and AHW had good predictive value, overall the highest predictive value for severity of PHVD, as reflected by receipt of surgical intervention, was found for the AHW. Previous studies have also suggested that the AHW may be a more sensitive marker of early ventricular dilatation and increasing ICP than other ventricular size indices.32,33,34 When CSF accumulates in the lateral ventricles, the ventricles tend to change in shape from oval to round, also referred to as “ballooning”. It is therefore understandable that this effect is reflected earlier and more sensitively by an increase in AHW than in VI and FTHR.35 The VI and FTHR generally only start to increase in a later stage of ventricular dilatation and may thus mask early signs of PHVD.32,33

The majority of IVHs occur within the first 72 h of birth, with PHVD usually developing within the first 10–14 days of the onset of an IVH. Not surprisingly, the predictive value of all ventricular size indices for receipt of surgical intervention for PHVD was highest from the second week of birth onwards as compared to around days 5–7 of birth. In 40% of infants with PHVD, slow progressive dilatation is followed by spontaneous arrest and even reduction, while in the remaining 60% of infants, ventricular size increases slowly or rapidly over the course of days to weeks.10 The timing of PHVD and its natural course over time explain why the indices measured from day 14 of birth onwards showed the best predictive value. In addition, as also mentioned above, from our experience, the landmarks for the ventricular size indices, particularly the FTHR, are easier and more reliably identified when the ventricles become enlarged. This is strengthened by the results of this study showing that for all indices inter-observer reliability was higher at time points 2 and 3 than for time point 1, being most pronounced for the FTHR. Caution is, however, required regarding the interpretation of measurements taken around 36 weeks PMA. Preterm infants generally have larger ventricular and extracerebral spaces at this age compared to term-born infants, probably related to cerebral atrophy and/or altered growth.1,36,37,38 In addition, at this time, a PVHI may have evolved into a porencephalic cyst, which was the case in the majority of our infants with PVHI. These ex vacuo phenomena should be differentiated from ventricular dilatation related to increased CSF pressure.

A range of other ventricular size indices for assessing PHVD from cUS have previously been described, including depth of the occipital horn of the lateral ventricle (i.e., thalamo-occipital distance [TOD]), ventricular height, and width of the third and fourth ventricle. We decided to only include the VI, AHW, and FTHR in our study as these indices are standardly used in our centers and we have gained proficiency with these indices. The included indices are the most commonly used worldwide, are relatively easy to perform owing to their single, linear nature with reliable and easily identified landmarks, have previously been described to have good reproducibility, and have reference values that have been described and validated in extremely preterm infants.12,13,16,17,31,34,39 Also, it has previously been described that in preterm infants the occipital horns of the lateral ventricles are more prominently dilated in case of PHVD than the frontal horns and may even be the only part of the ventricle to dilate.34 However, with the increase in extremely preterm infants, isolated and often transient dilatation of the occipital horns is more often seen, both with and without IVH. The clinical significance of this finding and thus the TOD still needs to be explored.

Of note, in our study substantially more infants were diagnosed with PHVD around 14 days of birth when PHVD was defined based on FTHR than when it was defined based on VI and AHW (88% versus 28%). PHVD incidences have been described in the literature in up to 50% of preterm infants with high-grade IVH, while our cohort included infants with a range of low to high IVH grades, for which incidences of around 30% have been described.1,2,4 It is possible that the higher incidence of PHVD when using the current cut-off values to define PHVD based on FTHR is an overestimate. In the study by Radhakrishnan et al., describing a strong linear correlation between the FTHR and FOHR, a ratio of >0.55 was used as cut-off for ventricular dilatation.15 As our study is, to our knowledge, the first to compare VI, AHW, and FTHR, this suggests that the cut-off values for the studied indices require reconsideration. Long-term neurodevelopmental outcome studies are required to reassess the long-term impact of the current cut-off values for PHVD for the different indices. This would aid ongoing discussions regarding the optimal timing of interventions in PHVD guided by ventricular size measurements.

The main strengths of our study are that we simultaneously assessed the reliability of three ventricular size indices in a large cohort of extremely preterm infants from three cUS scans throughout the neonatal period for each infant and that we assessed the variance in measurements between observers with differing levels of expertise in cUS assessment.

Also, several limitations to our retrospective study need to be acknowledged. As per the clinical protocol in our centers during the study period, we only had three cUS per infant that, although with a range in postnatal ages, were acquired around the same time for the whole cohort. Given the natural course of PHVD and the relatively long timespan between the second and third cUS, i.e., 14 days to 36 weeks PMA, we may have missed the most severe extent of ventricular dilatation in some infants. A prospective study including more frequent cUS scans throughout the neonatal period, particularly in the first weeks of the onset of PHVD and at standardized postnatal ages, would overcome these limitations. Also, notation of head position of the infant during scanning would enable studying whether head position, and potential resultant asymmetry of the lateral ventricles and CSF shift, has an influence on the reproducibility of the ventricular size indices. We only have short-term outcomes in this cohort, and decisions on surgical intervention for PHVD were based on a combination of ventricular size (using previously described cut-off values) and clinical parameters, limiting the ability to assess the impact of the indices on the threshold for intervention or long-term clinical outcomes. To confirm whether AHW is indeed the most reliable indicator of severity of PHVD, to define appropriate cut-off values of the ventricular size indices for PHVD, and to guide optimal timing of intervention, further studies using ventricular size measurements throughout the neonatal period along with long-term neurodevelopmental outcomes are required. Studies exploring easier tools to measure lateral ventricular area or even volumes from two-dimensional US, without the need for complex algorithms or calculations, or three-dimensional ultrasonography are ongoing.

In conclusion, our findings of excellent intra-observer and inter-observer agreement with high predictive value for severity of PHVD, along with ease of measurement (i.e., single linear measurements in 1 plane) and a consistent cut-off of 6 mm for ventricular dilatation regardless of PMA, support AHW as an easy and reliable indicator of early and true ventricular dilatation, particularly after the second week of birth. Further studies in a large cohort of preterm infants with PHVD, relating longitudinal ventricular size measurements over the neonatal period with long-term neurodevelopmental outcomes, are needed to assess the predictive value of indices for outcomes and therewith optimize the cut-off values to initiate intervention.