Introduction

High-grade intraventricular hemorrhage (IVH) is often associated with prematurity and can lead to post-hemorrhagic ventricular dilatation (PHVD), permanent ventriculoperitoneal shunting, and impaired neurodevelopmental outcomes.1 While IVH is diagnosed within the first week of life in 95% of patients,2 intervention in North America is often delayed until there is evidence of progressive ventricular dilatation and onset of clinical signs and symptoms of increased intracranial pressure (ICP). However, there is increasing evidence that earlier intervention prior to the onset of symptoms3,4,5,6 can lead to improved neurodevelopmental outcomes.4,7,8 Various ventriculomegaly measurement thresholds defined a priori through expert consensus have been adopted and compared to guide decision making for cerebrospinal fluid (CSF) diversion.3,4,5,9,10,11,12,13 However, since it is not feasible to experimentally test more than two or three thresholds, data-driven methods can be employed to calculate thresholds with maximum predictive value for specific outcomes. Our recent publication on the same patient population calculated a frontal-occipital horn ratio (FOHR) threshold of 0.61 for prediction of moderate to severe functional impairment at school age.14 Still, many neurosurgeons may be hesitant, given the risks associated with surgery, to place a reservoir in very young preterm infants who may not eventually demonstrate the need for CSF diversion. Thus, the aim of this study is to use a data-driven approach similar to that of our previous study investigating functional outcomes to calculate and compare ventriculomegaly thresholds that best predict the eventual need for CSF diversion for symptomatic increases in ICP.

Common linear measurements of ventriculomegaly are ventricular index (VI), anterior horn width (AHW), frontal-temporal horn ratio (FTHR), and FOHR (see15,16 for editorial). VI relies on a comparison against a validated reference range by gestational age (GA) at the time of measurement17,18 while AHW, FOHR, and FTHR are relatively constant independent of GA.17,18,19,20,21 All measurements have high inter-rater reliability,17,22,23 although a recent study comparing all four measurements showed that AHW was most reliable.20 We will delineate the natural history of ventricular dilatation with and without intervention using FOHR, FTHR, AHW, and VI in patients with grade III and IV IVH and calculate thresholds at early timepoints to best predict progression of PHVD necessitating neurosurgical intervention. While there is wide variability in the timing of surgical intervention between surgical practices,24 our institution has historically followed a very late-intervention practice where children are monitored with serial head ultrasound (US) until CSF diversion is initiated upon clear demonstration of persistent, progressive ventriculomegaly with clinical signs and symptoms of increased ICP. This practice contrasts with the current trend for early intervention but provides an opportunity to characterize the natural history of PHVD for some time prior to intervention to identify early thresholds that would predict eventual symptomatic ventriculomegaly.

Methods

Patients

This study was approved by the Ann & Robert H. Lurie Children’s Hospital of Chicago institutional review board. Consent was waived for retrospective analysis of deidentified data. School-aged outcomes and detailed descriptions of patient characteristics were presented in a separate publication.14 Inclusion criteria were GA <37 weeks, admission to the neonatal intensive care unit (NICU) at our children’s hospital from the beginning of our electronic medical record in 2007 through 2020, grade III or IV IVH, and a minimum of two head US studies available in our system. Diagnosis of grade III or IV IVH was confirmed by the first author (G.Y.L.) using the Papile definition.25 Exclusion criteria were intracranial hemorrhage from causes other than IVH, congenital anomaly, culture-proven meningitis prior to CSF diversion, death within the first 3 months of life prior to the decision on neurosurgical intervention, and missing data. No children with grade I or II hemorrhages that met inclusion criteria required neurosurgical intervention.

Information extracted from the medical record included GA, birthweight (BW), sex, neuroimaging reports, day of life (DOL) at each imaging examination, DOL at first neurosurgical intervention (ventriculoperitoneal shunt, lumbar puncture, ventricular reservoir, or ventriculosubgaleal shunt placement), and death during initial hospitalization.

Surgical intervention

During the study period, there were no standardized care pathways for the management of PHVD and decision making was based on the clinical judgment of the treating physicians. The timing and frequency of US after diagnosis of IVH were decided by the neonatology team prior to consult, and then by the neurosurgery team after consult, with more clinically concerning patients receiving more frequent scans. Typically, neurosurgical intervention was initiated after persistent progression of ventriculomegaly, head circumference crossing percentiles, and signs and symptoms of increased ICP (splayed sutures, full fontanelle, frequent apnea and bradycardia events), with preference to defer surgical treatment until infants reached 1.5–2 kg. Intervention primarily consisted of ventricular access device placement followed by serial taps, the frequency of which was based on clinical judgment, and conversion to ventriculoperitoneal shunt when the baby was closer to 2 kg. Lumbar and/or ventricular punctures were performed rarely in emergent cases.

Image acquisition

Head US was acquired by trained sonographers through the anterior fontanelle in the coronal and sagittal planes. Some patients also had MRI examinations that were performed on a 1.5 T Siemens scanner and included axial and coronal T2-weighted sequences. Examinations acquired at outside hospitals were included if the relevant anatomy was visible and measurement could be reliably performed.

Ventricular measurements

Measurements were performed on available head US and MRI studies acquired during the first 16 weeks of life using our hospital’s picture archiving and communications system (PACS), as described previously for FOHR/FTHR21 and VI/AHW.26 For VI, we used the electronic spreadsheet available through El-Dib et al. (https://tinyurl.com/PHVD-Measures-1)13 to determine the value of the 97th percentile (p97) line for each timepoint. Estimations outside the 24–42 week GA range were calculated using the formula derived from the electronic spreadsheet. AHW, VI, biparietal distance, frontal, and temporal horn widths were measured in coronal projection near the level of the foramina of Monro where the widest ventricular distance was appreciated. Biparietal distance was measured at the widest point above the Sylvian fissure from inner table to inner table. Occipital horn width was measured in the coronal plane at the widest distance (Fig. 1). In cases of porencephalic cysts contiguous with the ventricle, the cyst was not included in the measurement. VI/AHW measurements from both ventricles were averaged for a global estimate of ventriculomegaly. All measurements were made on the coronal plane.

Fig. 1: Ventriculomegaly measurements.

a Anterior horn width (AHW) and ventricular index (VI). b Maximum frontal horn width, biparietal distance, and temporal horn width used for calculation of frontal-temporal horn ratio (FTHR): (frontal + temporal)/(biparietal × 2). c Maximum occipital horn width, along with frontal horn width and biparietal distance, was used for calculation of frontal-occipital horn ratio (FOHR): (frontal + occipital)/(biparietal × 2).
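The two ratio definitions in the caption above can be written out as a short Python sketch. This is illustrative only and is not the authors' measurement code: the function names, the `bilateral_mean` helper (mirroring the averaging of left and right VI/AHW described in the text), and the example values in millimetres are all hypothetical.

```python
def fohr(frontal_mm, occipital_mm, biparietal_mm):
    """Frontal-occipital horn ratio: (frontal + occipital) / (biparietal x 2)."""
    if biparietal_mm <= 0:
        raise ValueError("biparietal distance must be positive")
    return (frontal_mm + occipital_mm) / (2.0 * biparietal_mm)


def fthr(frontal_mm, temporal_mm, biparietal_mm):
    """Frontal-temporal horn ratio: (frontal + temporal) / (biparietal x 2)."""
    if biparietal_mm <= 0:
        raise ValueError("biparietal distance must be positive")
    return (frontal_mm + temporal_mm) / (2.0 * biparietal_mm)


def bilateral_mean(left_mm, right_mm):
    """Average a left and right measurement (as done for VI and AHW in the text)."""
    return (left_mm + right_mm) / 2.0


# Hypothetical widths in mm, chosen only to make the arithmetic easy to follow.
example_fohr = fohr(frontal_mm=40.0, occipital_mm=50.0, biparietal_mm=75.0)
example_fthr = fthr(frontal_mm=40.0, temporal_mm=35.0, biparietal_mm=75.0)
example_ahw = bilateral_mean(left_mm=12.0, right_mm=16.0)
```

Both ratios are dimensionless because each divides a sum of widths by twice the biparietal distance, which is why they can be compared across gestational ages without a reference chart.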

All imaging studies were measured by G.Y.L. (neurosurgery resident with 6 years of experience). Scans from 27 randomly selected patients (253 scans) were independently measured by P.A. (pediatric neuroradiology fellow, 9 years of experience). Raters were blinded to radiology reports and studies were measured in chronological order such that subsequent shunt status was not known when measurements on earlier studies were performed. Measurements made by a single rater (G.Y.L.) were used in subsequent analyses.

Statistical analysis

Statistical analyses were performed using RStudio version 4.0.4 (Boston, MA) by G.Y.L. Descriptive analyses included mean ± standard error for continuous variables and median and interquartile range (IQR) for categorical variables. Patients were grouped into those who underwent neurosurgical intervention, those for whom neurosurgery was consulted but did not intervene, and those who did not have a neurosurgery consult. Group-wide differences were tested using analysis of variance for normally distributed continuous variables (normality verified using the Shapiro–Wilk test), the Kruskal–Wallis test for non-normally distributed continuous variables, and the Fisher exact test for categorical variables. Post-hoc pairwise comparisons were performed using two-sample t-tests with Bonferroni correction. Time courses were compared using general linear regression with generalized estimating equations with an unstructured covariance matrix.

Inter-rater reliability was assessed using two-way random-effects intraclass correlation coefficient (ICC) with absolute agreement. Since not all patients had imaging on the same DOL and at the same frequency, images were binned into weeks of life and measurements were averaged across each 7-day period.
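The weekly binning and the reliability statistic described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' R code: it assumes that week 1 covers DOL 1–7, and it computes the single-measure form of the two-way random-effects, absolute-agreement ICC (often written ICC(2,1)), although the text does not state whether single or average measures were used. The function names and example values are hypothetical.

```python
from collections import defaultdict


def weekly_means(scans):
    """Average measurements within 7-day bins.

    scans: iterable of (day_of_life, value) pairs with day_of_life >= 1.
    Returns {week_of_life: mean value}; week 1 is assumed to be DOL 1-7.
    """
    bins = defaultdict(list)
    for dol, value in scans:
        week = (dol - 1) // 7 + 1
        bins[week].append(value)
    return {week: sum(vals) / len(vals) for week, vals in sorted(bins.items())}


def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-measure ICC.

    ratings: list of rows (one per scan), each row holding one value per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-scan mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical FOHR values for one patient: (day of life, measurement).
binned = weekly_means([(1, 0.5), (7, 0.7), (8, 0.6), (15, 0.8)])

# Hypothetical two-rater data in which rater 2 reads exactly 1 unit higher.
icc_offset = icc_2_1([[1, 2], [2, 3], [3, 4]])
```

Because absolute agreement is requested, a constant offset between raters lowers the ICC even when the two raters rank every scan identically, which is the behavior shown by the second example.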

AUC and 95% CI for prediction of neurosurgical intervention were calculated for measurements on the first diagnostic scan demonstrating IVH, scans during the first 4 weeks of life, the scan immediately prior to intervention when applicable, and the maximum ventricular measurement. For two patients who underwent intervention within the first 4 weeks of life, only studies prior to intervention were included for prediction. The threshold at each timepoint that maximized sensitivity and specificity was calculated using Youden’s J statistic. The DeLong method was used to test for statistical pairwise differences in AUC between indices.
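As an illustration of the threshold selection described above, the following Python sketch computes an empirical AUC and the cut-off that maximizes Youden's J statistic (J = sensitivity + specificity − 1). It is not the authors' R analysis: the function names, the convention that a value at or above the threshold counts as a positive prediction, the tie-breaking rule (lowest threshold wins), and the example data are all assumptions made for this sketch. Confidence intervals and the DeLong comparison are not covered.

```python
def auc(values, labels):
    """Empirical AUC: probability that a randomly chosen positive case has a
    larger value than a randomly chosen negative case (ties count as 0.5)."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(values, labels):
    """Return (threshold, sensitivity, specificity, J) maximizing Youden's J.

    A value >= threshold is treated as predicting intervention (assumption).
    If several thresholds tie, the lowest one is returned.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y and v >= t)
        tn = sum(1 for v, y in zip(values, labels) if not y and v < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[3]:
            best = (t, sens, spec, j)
    return best


# Hypothetical FOHR values; label 1 = later required intervention, 0 = did not.
fohr_values = [0.45, 0.50, 0.52, 0.58, 0.60, 0.63, 0.70, 0.75]
intervention = [0, 0, 0, 1, 0, 1, 1, 1]

example_auc = auc(fohr_values, intervention)
threshold, sensitivity, specificity, j_stat = youden_threshold(fohr_values, intervention)
```

In this toy data set two cut-offs reach the same J, one favoring sensitivity and one favoring specificity, which illustrates that a Youden-optimal threshold need not be unique and that the tie-breaking rule is a design choice.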

Results

Of 1583 infants with GA <37 weeks admitted to our NICU between 2007 and 2020, 137 patients had grade III or grade IV IVH after excluding patients with missing records, congenital and/or chromosomal anomaly, or misdiagnosis of IVH. Neurosurgery was consulted on 65 patients: 3 were further excluded for fewer than 2 scans available on our PACS, 2 for culture-proven meningitis prior to first intervention, and 4 who died within 3 months of life prior to the decision on whether intervention was needed. Of 72 patients for whom neurosurgery was not consulted, 22 died within 3 months of life. After exclusions, the study population consisted of 50 patients with grade III/IV IVH who did not receive a neurosurgery consult, 19 who had neurosurgery evaluation but did not undergo intervention, and 37 who received neurosurgical intervention (Fig. 2). One patient in the intervention group was scheduled for Ommaya placement, but care was withdrawn immediately prior to surgery. Demographics for each group are summarized in Table 1. Patients who had neurosurgical consult but no intervention had higher GA and BW compared to the other groups (p < 0.001). Pairwise comparisons showed no difference in GA and BW between patients who underwent intervention and those for whom neurosurgery was not consulted (p = 0.99).

Fig. 2: Flowchart of patients included.

Number of patients excluded and reasons for exclusion are documented. Patients with Grade III and IV hemorrhages were ultimately divided into those who had neurosurgery consult and received neurosurgical intervention, those who had neurosurgery consult but did not receive intervention, and those that did not have neurosurgery consult.

Table 1 Demographic information for patients with grade III or IV germinal matrix hemorrhage who did not have neurosurgery consult, who had neurosurgery consult but did not have an intervention, and those who underwent intervention.

Progression of ventriculomegaly

Measurements were made on 1254 imaging studies with a median of 16 studies (IQR 13–21) per patient for those who received the intervention, 9 (IQR 6.5–10.5) for patients who had neurosurgical consult but no intervention, and 7 (IQR 6–9) for those who did not have neurosurgical consult. ICC between the raters was excellent: FOHR 0.95 (95% CI 0.90–0.97); FTHR 0.95 (95% CI 0.93–0.96); AHW 0.96 (95% CI 0.94–0.97); VI 0.95 (95% CI 0.92–0.97). The median DOL at diagnosis of IVH for the entire cohort was 4 days (IQR 3–8, range 1–33) and did not differ between groups (p = 0.12). The median DOL at neurosurgery consult was 19 days (IQR 14–34, range 2–129) for those who had an intervention and 21 days (IQR 15–99, range 1–559) for those who did not. The median DOL at first intervention was 46 days (IQR 32–112, range 11–195). Measurements on the study immediately preceding intervention were as follows: FOHR 0.77 ± 0.095; FTHR 0.71 ± 0.074; AHW 27.63 ± 13.15 mm; VI 14.18 ± 7.01 mm above p97.

The time course of ventricular measurements over the first 16 weeks of life for each group is shown in Fig. 3. Across time, all measurements were higher in patients who received intervention than in those who did not (p < 0.001). Among those who did not undergo intervention, the time course of measurements differed between those who had neurosurgical consultation and those who did not for FTHR (p = 0.014) and AHW (p < 0.001) but not for FOHR (p = 0.13) or VI (p = 0.06) (Table 2). Mean measurements on the first diagnostic study, over the course of the first four weeks, and at maximum measurement are listed in Table 3, right. There was an effect of group for all measurement indices and across all timepoints (p < 0.001). Compared to patients who did not have neurosurgery consult, those who had neurosurgery consult but did not undergo intervention had higher AHW at weeks 3 and 4 (p < 0.05), but maximum measurements did not differ (p = 0.32) (Table 3, left). VI at week 1 did not differ between those with neurosurgery consult but no intervention and those who did have neurosurgery consult (p = 0.09) but did differ at all other timepoints (p < 0.05–p < 0.001).

Fig. 3: Time course of ventriculomegaly indices.

Time course of average a FOHR, b FTHR, c AHW, and d VI (in mm above the 97th percentile) over the first 16 weeks of life for patients who required intervention, patients who had neurosurgery consult but no intervention, and patients who did not have neurosurgery consult. The grid below graph (a) shows the number of patients in each group with scans at each timepoint. The horizontal gray dashed line on each graph represents the upper limit of published thresholds for initiation of intervention: FOHR > 0.55,9 FTHR > 0.55,9 AHW > 10 mm,13 and VI > p97 + 4 mm.13

Table 2 Generalized estimating equations for comparison of differences between time course of each measurement method across groups.
Table 3 Ventricular measurements at diagnosis, week 1 through week 4 of life, and maximum measurement within the first 16 weeks of life or prior to intervention.

Prediction of symptomatic PHVD requiring intervention

Table 4 lists the AUC and 95% CI for prediction of intervention and optimal threshold values with associated sensitivity/specificity for each measurement modality at various timepoints. AUC increased sequentially across the first 4 weeks and was highest for maximum measurement prior to intervention (if applicable) for all measurement indices, ranging from 76–82% at the time of diagnosis to 97–98% at maximum measurement (Fig. 4). AUC was lower for VI relative to all other measures at week 1 (p < 0.01) but did not differ between indices at any other timepoint. Thresholds for maximum measurements were as follows: FOHR 0.66 (sensitivity 97.3%, specificity 90.3%); FTHR 0.62 (sensitivity 100%, specificity 90.3%); AHW 15.5 mm (sensitivity 89.2%, specificity 90.1%); and VI p97 + 8.4 mm (sensitivity 86.8%, specificity 95.8%).

Table 4 Area under the receiver operating curve (AUC) and calculated threshold to maximize sensitivity and specificity for prediction of symptomatic ventriculomegaly requiring neurosurgical intervention.
Fig. 4: Receiver operating characteristic curves.

Area under the receiver operating curve using a FOHR, b FTHR, c AHW, and d VI measurements to predict severe post-hemorrhagic ventricular dilatation requiring neurosurgical intervention on the diagnostic scan, the maximum measurement prior to intervention, and measurements over the first 4 weeks.

The numbers and percentages of patients who did not receive intervention but had a maximum measurement that met the criteria for diagnosis of severe PHVD according to the current upper limit of published thresholds (FOHR/FTHR > 0.55,9 AHW > 10 mm,13 VI > p97 + 4 mm13) were as follows: FOHR 31/71 (44%), FTHR 29/71 (41%), AHW 20/71 (28%), and VI 22/71 (31%) (Table 3, bottom).
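The proportions above follow directly from the reported counts, as the short Python sketch below shows. The counts and the denominator of 71 non-intervention patients are taken from the text; the variable names are hypothetical, and rounding to the nearest whole percent is an assumption about how the percentages were reported.

```python
# Patients without intervention whose maximum measurement exceeded the
# published upper-limit threshold, per index (counts as reported in the text).
exceeding_counts = {"FOHR": 31, "FTHR": 29, "AHW": 20, "VI": 22}
n_no_intervention = 71

# Percentage of non-intervention patients above each published threshold,
# rounded to the nearest whole percent (assumed rounding convention).
percent_exceeding = {
    index: round(100 * count / n_no_intervention)
    for index, count in exceeding_counts.items()
}

for index, pct in percent_exceeding.items():
    print(f"{index}: {exceeding_counts[index]}/{n_no_intervention} ({pct}%)")
```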

Discussion

In this study, we compared the early evolution of ventricular dilatation between patients who developed symptomatic and persistent progression of PHVD and those who did not. Thresholds based on the maximum ventriculomegaly measurements that best predicted progressive symptomatic ventriculomegaly in our cohort were FOHR 0.66, FTHR 0.62, AHW 15.5 mm, and VI 8.4 mm above p97, with 86–100% sensitivity and 90–96% specificity. These thresholds were lower than measurements immediately prior to intervention (FOHR 0.77, FTHR 0.71, AHW 28 mm, and VI 14 mm >p97) and represent a maximum limit to when intervention should be initiated, although it is now recommended that intervention be considered prior to the onset of symptoms.13

The Hydrocephalus Research Network (HCRN) adopted a FOHR of >0.50 for neurosurgical consultation/evaluation and FOHR >0.55 and onset of signs and symptoms of elevated ICP for neurosurgical intervention,9 although actual intervention occurred across HCRN institutions at FOHR 0.65–0.68.2,9 More recent guidelines propose CSF diversion at AHW > 6–10 mm and VI > p97 to p97 + 4 mm12 regardless of symptomatology. The European Early versus Late Ventricular Intervention Study compared initiation of CSF diversion at VI > p97 and AHW > 6 mm with VI > p97 + 4 mm and AHW > 10 mm,5 where intervention was initiated at an average FOHR of 0.49 in patients with adverse outcomes and 0.43 in those without.6 In the current study, we found that as early as the time of initial diagnosis of IVH, calculated thresholds have a sensitivity of 57–75% and specificity of 84–91% for prediction of symptomatic ventriculomegaly. This suggests that early ventriculomegaly may be valuable for the prediction of progressive and sustained ventriculomegaly, consistent with our group’s animal work.27

A recent study reported excellent predictability of subsequent CSF diversion using VI, AHW, and FOHR from the day 14 US20 with AUC values of >90% in contrast to values in the range of 70–80% in our study. Another study reported 100% sensitivity and 77% specificity for predicting the need for intervention using the currently recommended thresholds of FOHR and FTHR >0.55.22 However, these studies included all patients with grade I–IV hemorrhages, whereas we included only grade III/IV. Since according to the Papile criteria, grade I/II hemorrhages exhibit minimal or no ventricular distension25 with a low risk of subsequent PHVD,28 inclusion of all grades may inflate the sensitivity and specificity of prediction based on ventricular measurements. Furthermore, thresholds computed based on grade III/IV hemorrhages only may be more directly relevant to surgical decision making and represent a relative strength of the present study.

The ideal threshold should balance surgical risks and neurodevelopmental benefits of limiting the degree of ventriculomegaly. Between 28 and 44% of patients in our population who did not receive treatment would have qualified for treatment according to currently accepted thresholds for severe PHVD. On the other hand, there is growing evidence that early intervention limiting the degree of ventriculomegaly is associated with improved neurodevelopmental outcomes,4,6,29,30 while increased ventriculomegaly is associated with secondary brain injury.12,31 One recent study calculated a FTHR threshold of 0.51 for the prediction of white matter injury on diffusion tensor imaging at term equivalent age.19 In a parallel study by our group in the same cohort of patients, we calculated a FOHR threshold of 0.61 for prediction of moderate/severe functional impairment at school age,14 which is lower than the 0.66 calculated in the present study for prediction of symptoms of increased ICP. Indeed, elevated ICP is a late-stage phenomenon of PHVD due to the accommodation of the infant skull. It has been shown that FOHR/FTHR does not correlate with ICP,32 while evidence of white matter damage correlates with increased ventricular size and worse neurodevelopmental outcomes.19 In addition, earlier intervention may reduce detrimental effects of blood break-down products as suggested by improved long-term neurodevelopmental outcomes in patients who received drainage, irrigation, and fibrinolytic therapy.13,33,34

Comparison between measurement indices in our study showed similar inter-rater reliability and predictability of intervention across FOHR, FTHR, AHW, and VI (ICC 0.95, 0.95, 0.96, and 0.95; AUC 97.1, 97.7, 96.6, and 96.8, respectively), consistent with previous studies reporting similarly high reliability >0.9 for FOHR/FTHR,22,23 although there is more variability in ICC, ranging from 0.7 to >0.9, for VI/AHW.17,18,22,26 Previous studies have reported inter-rater reliability to be slightly higher for AHW relative to VI and FOHR.20 This is not surprising given that AHW and VI require one measurement per side while FOHR and FTHR require three, leading to less variability for AHW/VI. Furthermore, as noted previously,15 the lateral boundaries of the parietal skull are sometimes obscured on head US. However, VI has its own limitations: it is dependent on GA, and the percentile charts are restricted to 24–42 weeks GA and were derived from single-institution data,17,35 which limits its ease of use and comparison across patients and over time. Thus, we had to estimate VI measurements at postmenstrual ages outside of the age range based on the assumption that the relationship remains linear for some period of time. Furthermore, we chose to average the AHW/VI measurements across both ventricles rather than reporting each independently. We felt it was important to ensure all measurements compared provided a global representation of ventriculomegaly. However, this approach limits the comparison of our findings to other studies that report left and right AHW/VI measurements separately. Finally, regardless of index, measurements at earlier timepoints when the clot is still present or in cases of periventricular hemorrhagic infarction may be less accurate due to less defined ventricular borders.

AUC analyses in the present study showed no statistical difference between indices except at week 1, where VI performed worse. AUC across measurement indices in a study comparing VI, AHW, and FTHR4 were also very close (0.8–0.86 at week 1 and 0.92–0.96 at week 2) with overlapping 95% confidence intervals. Presently, FOHR and FTHR had higher sensitivities than AHW and VI (97 and 100% vs 89 and 87%) but VI had the highest specificity (95% vs 90% for all other indices). Based on these results, the independence of postmenstrual age for FOHR and FTHR, and consideration of dimensions from aspects of the lateral ventricles beyond only the frontal horns, we would favor the use of FOHR and FTHR in addition to VI and AHW.

There are notable limitations to our single-center retrospective study. The lack of a universal management strategy limits the generalizability of any single-institution study using the need for CSF diversion as an outcome variable. Treatment decisions were made without standardized protocols in the epoch of this study. Timing and frequency of US studies were variable across patients, which required data to be binned into 7-day time periods and limits the precision of our timing estimates. It is possible that changes in approach over the long study period may have influenced outcomes, but our neurosurgical team remained constant throughout the study period and our previous study did not find any correlation between birth year and functional outcome.14 Our institutional practice has since been revised in accordance with the most recent guidelines.13 However, while the data support lower thresholds, we still have not established a minimum where the risks of intervention may outweigh developmental gains. Future directions for research may include multi-institutional collaborations to provide a larger data set of institutions that employ varying thresholds for intervention to validate and improve on the predictive power of data-derived thresholds that may include other clinical and physiologic factors to optimize outcomes.

In summary, we calculated data-derived thresholds of ventriculomegaly measures for the prediction of persistent, progressive, and symptomatic PHVD requiring neurosurgical intervention based on our retrospective single-institution cohort. Due to our “high threshold” clinical practice, these thresholds represent upper limits for the initiation of CSF diversion. Ideal thresholds would initiate neurosurgical intervention with the goal of optimizing long-term neurodevelopmental outcomes rather than solely for mitigation of late-onset symptoms of elevated ICP.