Introduction

Microcephaly is a sign of fetal brain abnormality in which there is a significantly small head size for gestational age and sex. Infants born with microcephaly are likely to present with variable clinical features ranging from subtle impairment in neurological development to serious intellectual disabilities in the long term. It is a rare condition occurring in 5.8 to 18.7 per 100,000 pregnancies and often arising from a wide variety of conditions that can cause abnormal brain growth1.

In 2015, a 20-fold increase in neonatal microcephaly was observed in association with Zika virus (ZIKV) infections in pregnant women in Latin America2. This observation prompted the World Health Organization (WHO) to declare the ZIKV outbreak in the Americas a Public Health Emergency of International Concern on 1st February 20163.

As part of its strategic framework, WHO provides normative guidance to affected countries on conditions presumably associated with prenatal ZIKV infection, to improve surveillance and clinical outcomes in at risk populations. The WHO interim guidance recommends that pregnant women residing in areas of ongoing ZIKV transmission should have fetal ultrasound scans to exclude microcephaly or other brain abnormalities that have been reported in fetuses of women with prenatal ZIKV infection4.

Prenatal assessment of microcephaly has conventionally relied on ultrasound measurements of fetal biometric parameters such as the head circumference, biparietal diameter and occipitofrontal diameter5,6,7. Measurements of these parameters below a given threshold at a specific gestational age have been used to diagnose fetal microcephaly6, 8. However, at the time of this review, there was no international consensus on the fetal biometric parameters or the threshold for in-utero diagnosis of microcephaly. Moreover, because the condition is rare and different parameters and limits are applied, the risk of a wrong or missed diagnosis is high9,10,11,12.

In the context of ZIKV infection, an accurate prenatal diagnosis of microcephaly is critical for fetal prognosis and decision-making by health providers and families of women suspected or confirmed to have ZIKV infection. We conducted a systematic review to assess the diagnostic accuracy of ultrasound measurement of fetal biometric parameters compared to reference assessments at birth for prenatal diagnosis of microcephaly in the context of ZIKV infection. This review served as part of the evidence base for the revised WHO interim guidance on the prenatal assessment of microcephaly in the context of ZIKV infection.

Methodology

Protocol registration

We registered this review in PROSPERO, the international prospective register of systematic reviews of the University of York and the National Institute for Health Research, under the number CRD42016039365.

Search strategies

We searched MEDLINE, EMBASE, CENTRAL, Cochrane Database for DTA studies, LILACS, and WHO Global Health Library for studies published until 3rd March 2016. Search terms related to the index tests, reference tests and target condition were employed in the search strategies as shown in Appendix 1 (Supplementary Information).

The results of the search strategies were complemented by searches of the grey literature and of the bibliographies of existing systematic reviews on ultrasound in pregnancy. No restrictions were placed on search dates or language. Two review authors independently screened the titles and abstracts of studies identified by the search strategies. Full texts of potentially eligible studies were independently assessed for eligibility by two review authors.

Any disagreements were resolved through discussion or consultation with a third review author.

Inclusion and exclusion criteria

Index tests, reference standard, and diagnosis of interest

We considered studies that compared prenatal ultrasound measurements (index test) with direct postnatal measurements of head size (reference test). We included studies which used any of the following biometric parameters as index tests: head circumference (HC), occipitofrontal diameter (OFD), biparietal diameter (BPD), or ratios of any of these with either abdominal circumference (AC) or femur length (FL).

Microcephaly (diagnosis) was the condition of interest, reported either as the only condition or separately in addition to other fetal brain abnormalities.

Types of studies

We considered for inclusion studies of any design (randomized controlled trials, prospective or retrospective cohort studies, cross-sectional studies and case-control studies) comparing prenatal assessment of fetal biometric parameters with standard postnatal head size measurements for diagnosing microcephaly.

Case series and conference proceedings reporting original data and with adequate information were also considered for inclusion.

Types of participants

We included pregnant women who had ultrasound measurements of fetal biometric parameters for diagnosis of microcephaly (irrespective of the indications for ultrasound). We planned to separately assess pregnant women suspected of being at risk of or confirmed with ZIKV infection.

Data extraction and synthesis

Two review authors independently extracted data on participants’ characteristics (ZIKV infection status, gestational age at the time of ultrasound assessment). We obtained data on the number of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) to determine the sensitivity and specificity (with 95% confidence intervals [CI]) of the index tests for each fetal biometric parameter.
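As an illustration of how sensitivity and specificity follow from the four extracted counts, the sketch below computes both with approximate 95% confidence intervals. The method used to compute the intervals is not stated in this review; the Wilson score interval is used here as an assumption, and the 2x2 counts are hypothetical rather than taken from any included study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%).

    Assumption: the review does not state which CI method was used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(tp, fp, tn, fn):
    """Return point estimates and Wilson intervals from 2x2 counts."""
    sens = tp / (tp + fn)  # proportion of affected fetuses flagged by the scan
    spec = tn / (tn + fp)  # proportion of unaffected fetuses not flagged
    return {
        "sensitivity": (sens, wilson_interval(tp, tp + fn)),
        "specificity": (spec, wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts, for illustration only
result = sensitivity_specificity(tp=8, fp=3, tn=27, fn=2)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.1%} (95% CI {low:.1%} to {high:.1%})")
```

With small samples such as those in the included studies, intervals of this kind are wide, which is one reason the pooled estimates should be read cautiously.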

Results from studies that presented insufficient data for meta-analysis were described qualitatively. In one study where ultrasound and magnetic resonance imaging (MRI) were employed13, only sonographic data were extracted for the review.

For studies considered similar in terms of the research questions, study design and execution, we performed a meta-analysis using a random effects model on pooled data to estimate group sensitivity and specificity (with 95% CI).

We generated hierarchical summary receiver operating characteristic (HSROC) curves using an HSROC model. To gauge overall test accuracy, we calculated a diagnostic odds ratio (DOR) and an area under the curve (AUC) using DerSimonian-Laird random-effects modelling and Holling’s proportional hazards model14.
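The following is a minimal sketch of how a diagnostic odds ratio can be pooled across studies with a DerSimonian-Laird random-effects estimator. It does not reproduce the HSROC model or Holling’s approach, and the continuity correction, function names and counts are assumptions made for illustration only.

```python
import math


def log_dor(tp, fp, tn, fn, correction=0.5):
    """Log diagnostic odds ratio and its approximate variance.

    Assumption: 0.5 is added to every cell when any cell is zero.
    """
    cells = (tp, fp, tn, fn)
    if any(c == 0 for c in cells):
        tp, fp, tn, fn = (c + correction for c in cells)
    value = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / tn + 1 / fn
    return value, variance


def dersimonian_laird(effects, variances):
    """Pool effects with the DerSimonian-Laird random-effects estimator."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2


# Hypothetical 2x2 counts (tp, fp, tn, fn) for two studies
studies = [(8, 3, 27, 2), (5, 1, 14, 4)]
effects, variances = zip(*(log_dor(*s) for s in studies))
pooled, se, tau2 = dersimonian_laird(effects, variances)
print(f"pooled DOR: {math.exp(pooled):.1f}")
print(f"95% CI: {math.exp(pooled - 1.96 * se):.1f} to {math.exp(pooled + 1.96 * se):.1f}")
```

Pooling is done on the log scale because the log DOR is closer to normally distributed than the DOR itself; the estimate is then exponentiated for reporting.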

Data on TP, TN, FP and FN at cut-off values ranging from 3 SD to 5 SD below the mean were used to estimate the diagnostic test accuracy of fetal ultrasound. Pre-test probabilities based on the incidence of microcephaly in unspecified pregnancies (0.0285%)1 and ZIKV-infected pregnancies (0.95%)1 were applied to estimate positive and negative predictive values.
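The dependence of predictive values on the pre-test probability can be sketched with Bayes' theorem. The example below uses the two pre-test probabilities stated above together with the pooled sensitivity (84%) and specificity (70%) reported in the Results for HC at 3 SD below the mean; the function name is illustrative, and the calculation is a simplified point estimate rather than the exact procedure used in this review.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' theorem."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    ppv = tp / (tp + fp)  # probability of microcephaly given a positive scan
    npv = tn / (tn + fn)  # probability of no microcephaly given a negative scan
    return ppv, npv


# Pre-test probabilities stated in the Methodology
scenarios = {
    "unspecified pregnancies": 0.000285,
    "ZIKV-infected pregnancies": 0.0095,
}

values = {
    label: predictive_values(0.84, 0.70, prevalence)
    for label, prevalence in scenarios.items()
}
for label, (ppv, npv) in values.items():
    print(f"{label}: PPV {ppv:.2%}, NPV {npv:.2%}")
```

Under these inputs the positive predictive value stays below 3% in both scenarios while the negative predictive value exceeds 99%, which illustrates why a rare target condition yields low positive predictive values even when sensitivity is fairly high.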

Risk of bias assessment

We assessed the risk of bias using version 2 of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool15, 16 in Review Manager (RevMan Version 5.3). We provided a rating for risk of bias and applicability concerns based on the presence or absence of indicators for index and reference standards, flow and timing of prenatal and postnatal tests.

We rated studies as high risk where serious deficiencies in the criteria were detected, as unclear risk where descriptions were inadequate, and as low risk where descriptions were appropriate. For the meta-analysed data, we assessed heterogeneity using the I² statistic (percentage of inter-study variation due to heterogeneity).
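For reference, the I² statistic is conventionally derived from Cochran's Q as I² = max(0, (Q − df)/Q) × 100. The sketch below applies that definition to inverse-variance weighted effect estimates; the effect sizes and variances are hypothetical, and the function name is illustrative rather than taken from RevMan.

```python
def i_squared(effects, variances):
    """I² (%) from Cochran's Q using inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0 or df <= 0:
        return 0.0  # no observable between-study variation
    return max(0.0, (q - df) / q) * 100


# Hypothetical log effect sizes and variances for two studies
print(i_squared([0.0, 2.0], [0.1, 0.1]))
```

With only two studies, as in this meta-analysis, I² is estimated very imprecisely and should be interpreted with care.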

Results

Search results

The search strategies yielded 2,258 citations from the databases. One hundred and eleven potentially eligible studies were identified after screening of titles and abstracts and removing duplicates (Fig. 1).

Figure 1

Flow diagram. Search results and study selection (see appendices for details).

Full texts of all potentially eligible studies were assessed, and studies were excluded for the reasons shown in Appendix 2 (Supplementary Information). Nine studies met our inclusion criteria. Two of these studies reported sufficient data for meta-analysis, while the other seven presented incomplete data and were described qualitatively.

Characteristics of included studies

All included studies were based on hospital records in the USA (5), Israel (2), France (1) and Canada (1). The study designs were either prospective cohort17,18,19,20 or retrospective cohort13, 21,22,23,24 with enrollment periods spanning between 1979 and 2014 (Table 1).

Table 1 Characteristics of included studies.

The thresholds for prenatal and postnatal diagnoses of microcephaly were pre-specified in some studies. It was defined as head measurements of >2 SD below the mean17, <3 SD below the mean18, 19, 21, 3 SD below the mean24, >3 SD below the mean20, below the 5th percentile13 and the 10th percentile22 threshold. The threshold applied was unstated in one study23.
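Thresholds expressed in SDs below the mean correspond to z-scores against a reference distribution for gestational age. The sketch below shows the arithmetic only: the reference mean and SD are hypothetical placeholders, not values from any growth chart, and treating a measurement exactly at the cut-off as positive is an assumption, since the included studies differ on this point.

```python
def z_score(measured_mm, reference_mean_mm, reference_sd_mm):
    """Number of SDs by which a measurement deviates from the reference mean."""
    return (measured_mm - reference_mean_mm) / reference_sd_mm


def meets_threshold(z, sd_below_mean):
    """True when the measurement is at or below the given number of SDs.

    Assumption: the cut-off itself counts as positive.
    """
    return z <= -sd_below_mean


# Hypothetical head circumference and reference values (mm)
z = z_score(measured_mm=240.0, reference_mean_mm=280.0, reference_sd_mm=12.0)
flags = {k: meets_threshold(z, k) for k in (3, 4, 5)}
print(f"z = {z:.2f}", flags)
```

In this hypothetical case the fetus would be classified as microcephalic at the 3 SD threshold but not at 4 or 5 SD, which is how the choice of cut-off trades sensitivity against specificity.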

Only two studies contained data appropriate for meta-analysis, as they assessed similar parameters at the same thresholds18, 19. The HSROC curves are shown in Fig. 2A–I.

Figure 2

Hierarchical summary receiver operating characteristic (HSROC) curves for A–C) BPD, D–F) OFD and G–I) HC at 3, 4 and 5 SD below the mean. The size of each circle reflects weight, not confidence region. (Open arrow: two circles had exactly the same accuracy and weight. Filled arrow: three circles had exactly the same accuracy and weight.)

Fetal microcephaly was secondary to cytomegalovirus (CMV)-infection13 and phenylketonuria (PKU)20 in 2 studies and congenital or primary in the seven other studies17,18,19, 21,22,23,24.

In three out of nine studies13, 20, 21, the ultrasound device used for prenatal detection of fetal parameters was reported. These included Acuson 128 XP 10 (Siemens)20, GE Voluson 73013 and a range of ultrasound machines in the third study21: GE Voluson E8, 730 Expert and Voluson 730 Pro (all GE Healthcare).

Accuracy of ultrasound measurements of BPD (3 studies)

Meta-analysis of two studies18, 19, which included 51 fetuses, showed a high sensitivity (94%) at 3 SD below the mean but lower sensitivities at 4 and 5 SDs. The specificity at 3 SD was very low but improved with lower cut-offs. The positive likelihood ratio for 3 SD suggests a slight increase in the likelihood of microcephaly, but the confidence interval includes 1 (suggesting no change in the likelihood of microcephaly) (Table 2, Fig. 2A–C).

Table 2 Diagnostic accuracy of ultrasound measurements of BPD for prenatal assessment of microcephaly.

The positive likelihood ratios for 4 and 5 SDs exceeded 1 and indicate a large and often conclusive increase in the likelihood of microcephaly. The positive predictive values (PPV) for unspecified and ZIKV-infected pregnancies were even lower than those for OFD measurements across the three thresholds.

One study20 provided descriptive data. This study noted a low true positive and a high false negative frequency for the second (3.2%; 29%) and third trimester (42.9%; 57.1%) at a threshold of 3 SD below the mean.

Accuracy of ultrasound measurement of OFD (2 studies)

Pooled data from two studies18, 19 (45 fetuses) reported sensitivities of 76%, 58% and 58% and specificity of 84%, 97% and 97% at 3, 4, and 5 SDs below the mean for GA, respectively. Higher thresholds were more sensitive while lower thresholds were more specific (Table 3, Fig. 2D–F).

Table 3 Diagnostic accuracy of ultrasound measurements of OFD for prenatal assessment of microcephaly.

OFD measurement at a threshold of 3 SD below the mean for GA was more sensitive, and measurements at 4 and 5 SDs were more specific. Given the extremely low incidence of microcephaly applied, the proportion of fetuses diagnosed with microcephaly at the 3, 4 and 5 SD thresholds that were correctly diagnosed (PPV) was extremely low. Deriving PPVs from the 0.95% incidence of microcephaly among ZIKV-infected women improved the PPV (Table 3, Fig. 2D–F).

However, the proportion of fetuses without microcephaly who were correctly diagnosed was close to 100% for the three thresholds, for both unspecified and ZIKV-infected pregnancies.

Accuracy of ultrasound measurements of HC (8 studies)

Eight studies reported on the diagnostic accuracies of HC. Synthesis of two studies18, 19 (45 fetuses) with meta-analyzable data showed sensitivities of 84%, 68% and 58% and specificity of 70%, 91% and 97% at thresholds of 3, 4 and 5 SD below the mean for GA, respectively (Table 4, Fig. 2G–I).

Table 4 Diagnostic accuracy of ultrasound measurements of head circumference for prenatal assessment of microcephaly.

Based on these two studies, HC measurements using 3 SD below the mean had relatively high sensitivity (84%), specificity (70%), positive likelihood ratio (2.6), and negative predictive values for unspecified (99%) and ZIKV-infected pregnant populations (99%) (Table 4, Fig. 2G–I). As the SD below the mean for GA increased from 3 to 5, the sensitivity decreased while the specificity increased substantially.

Descriptive data were provided in the other six studies13, 17, 21,22,23,24. Among 42 fetuses prenatally diagnosed with microcephaly, Leibovitz et al.21 reported 24 true positives and 18 false positives, and a positive predictive value (PPV) of 57.1% at an HC of 3 SD below the mean for GA.

In a study of 20 suspected cases of fetal microcephaly, Stoler-Poria et al.17 confirmed five cases to be true positives and 15 to be false positives. The true positive cases had an HC of between 2 and 4.8 SDs below the mean for gestational age.
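When only prenatally suspected cases are followed up, the counts available are true and false positives, so the quantity that can be computed directly is the positive predictive value, TP / (TP + FP). The short sketch below applies this to the counts reported in the two studies above; the function name is illustrative.

```python
def positive_predictive_value(tp, fp):
    """Share of prenatally positive cases confirmed at birth."""
    return tp / (tp + fp)


# Counts as reported in the text: (true positives, false positives)
reported = {
    "Leibovitz et al.": (24, 18),
    "Stoler-Poria et al.": (5, 15),
}

ppv_percent = {
    study: round(positive_predictive_value(tp, fp) * 100, 1)
    for study, (tp, fp) in reported.items()
}
print(ppv_percent)
```

Sensitivity and specificity cannot be recovered from these counts alone, because they also require the numbers of false negatives and true negatives.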

Wong et al.22 reported comparable z-scores for prenatal and postnatal correlations in 455 fetuses. A z-score threshold of ≤1.3 below the mean (44.6% sensitivity, 35.1% specificity, 44.9% FP rate, 45.9% FN rate) was more sensitive and specific than a z-score of ≤1.7 below the mean (28.8% sensitivity, 21% specificity, 62.6% FP rate, 28.2% FN rate). Additionally, an area under the ROC curve of 0.6 suggested that prenatal ultrasound diagnosis of microcephaly was inaccurate.

One study13 reported a sensitivity of 85.7% and specificity of 85.3% for microcephaly detection at an HC below the 5th percentile for gestational age. In this study, prenatal and postnatal findings were more consistent in the absence of coexisting brain abnormality.

In another study24, 11 of 16 cases of microcephaly diagnosed prenatally at a threshold of 3 SD below the mean for GA were false positives when examined at birth, giving a positive predictive value of 31% (5 of 16). Campbell et al.23 reported that all ten cases of microcephaly suspected before 24 weeks gestation were confirmed at the postnatal examination. There were no false positives or false negatives.

Accuracy of ultrasound measurements of the HC to AC ratio (3 studies)

We could not perform a meta-analysis for this parameter. Descriptive information on the accuracy of ratios of the head circumference to abdominal circumference for fetal biometry assessment was provided in only three studies18, 19, 21.

In one study18, ultrasound detection of microcephaly with the HC:AC ratio was consistently specific at all thresholds (3, 4 and 5 SDs) below the mean. Sensitivity was lower at 5 SD (20%) and higher at 3 SD (80%) below the mean.

Another study19 accurately detected the absence of microcephaly at thresholds of 3, 4 and 5 SD below the mean (specificity of 100%), with sensitivity greatest at 3 SD (80%) below the mean. The third study21 identified a low sensitivity for the HC:AC ratio at the <5th percentile, for both suspected (33.3%) and confirmed microcephaly (37.5%).

Accuracy of ultrasound measurements of BPD to FL ratio (2 studies)

A meta-analysis was not possible for this parameter. In one study18, the sensitivity and specificity of BPD:FL ultrasound measurements in detecting microcephaly were generally low across the thresholds measured (33–78%), although the specificity was high (87%) at 5 SD below the mean.

Another study24 noted the limitations of using the BPD:FL ratio for defining cases with or without microcephaly and reported five true positives and 11 false positives.

Accuracy of ultrasound measurements of FL to HC ratio (3 studies)

Available studies could not be meta-analysed. In one study18, ultrasound measurement of FL:HC had a high sensitivity of 75–100% at 3–5 SDs and a specificity of 87–100% at ≤3 SDs, all below the mean.

Another study19 reported low sensitivity (50–75%) at all thresholds, highest at 1 SD (75%), and 85–100% specificity at ≤2 SD, all below the mean, for the FL:HC parameter. Leibovitz et al.21 showed that, at the <5th percentile, HC:FL ultrasound measurement had a low sensitivity for both suspected (52.4%) and confirmed microcephaly (50%).

Risk of bias assessment and applicability concerns (QUADAS-2)

The two studies included in the meta-analysis were at a high risk of bias due to lack of pre-specified prenatal thresholds18, 19 and inappropriate exclusions19. Only one of these studies19 had concerns regarding applicability due to the limitation of the study population to a short interval of <2 weeks between a prenatal index scan and postnatal reference test.

In five of seven descriptive studies13, 17, 20, 23, 24, a high risk of bias rating was assigned. These studies limited the study population of pregnant women to the following: CMV-infection and the availability of MRI and US diagnosis13, Hebrew native language ability17, mothers who presented with phenylketonuria20, before 26 weeks gestation23 or late trimester measurements (28 to 43 weeks)24. The two other studies had a low risk of bias21, 22.

Concerns regarding applicability were noted in two13, 20 of the seven studies that provided only descriptive data. These studies included only high-risk mothers, either infected with CMV13 or with phenylketonuria20. The other five studies17, 21,22,23,24 had low concerns regarding applicability (Fig. 3).

Figure 3

Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2). Summary of risk of bias and applicability concerns of included studies.

Discussion

This review provides a thorough overview of available information on the prenatal application of ultrasound for diagnosis of microcephaly. HC and OFD measurements at 4 and 5 SD below the mean had a high DOR (25.3 to 48.0) and positive likelihood ratios (7.6 to 19.3) with wide 95% confidence intervals.

Negative predictive values for unspecified- and extrapolated-ZIKV-infected pregnancies at these standard deviations were consistently high, close to 100%, although these values were derived from a relatively small number of fetuses. Thresholds of 4 and 5 SDs below the mean for OFD and HC showed a tendency to consistently “rule in” the diagnosis of fetal microcephaly with a reasonable level of confidence.

Our study indicates that the overall diagnostic test accuracy of ultrasound for predicting microcephaly at birth is limited as it varied with the applied cut-offs. Large differences were not observed among the different biometric parameters used to make a prenatal detection of microcephaly.

Ultrasound measurements of all three parameters should be recommended for cases with a high likelihood of microcephaly. Given the low incidence of microcephaly1, a fetal ultrasound seems not to have a large effect on the probability of identifying true cases of microcephaly.
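The limited effect of a positive scan when incidence is low can be illustrated by converting a pre-test probability into a post-test probability through the likelihood ratio. The sketch below uses the 0.95% pre-test probability applied to ZIKV-infected pregnancies and the range of positive likelihood ratios (7.6 to 19.3) reported above; it is a simplified point calculation that ignores the wide confidence intervals around those ratios.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


pre_test = 0.0095  # pre-test probability used for ZIKV-infected pregnancies
post_test = {
    lr: post_test_probability(pre_test, lr) for lr in (7.6, 19.3)
}
for lr, probability in post_test.items():
    print(f"LR+ {lr}: post-test probability {probability:.1%}")
```

Even at the upper end of the reported likelihood ratios, the post-test probability under these assumptions remains below 20%, so most fetuses with a positive scan would still be expected to be unaffected.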

To detect fetal microcephaly and/or brain abnormalities, the WHO currently recommends an early fetal anomaly scan between 18 and 20 weeks gestation or at the earliest possible time if after 20 weeks. A repeat ultrasound in the late second or early third trimester, usually around 28 to 30 weeks gestation4, is further encouraged to exclude false positives.

The inclusion of coexisting abnormalities such as intrauterine growth restriction, intracranial deformities and a detailed family history has been shown to improve the predictive value of ultrasound diagnosis21. Thus, setting an SD threshold to increase the accuracy of microcephaly detection in ZIKV-infected and any pregnancies should be informed by a balance of expert opinion, detailed history and analysis of other associated fetal anomalies25.

Variation in sensitivity and specificity for all fetal head biometric measurements (BPD, HC, OFD) observed in all studies may have been due to trimester-specific changes in fetal growth and to differences in ultrasound device, techniques and patient characteristics (congenital or acquired microcephaly, the presence of other anomalies)11, 26. Growth appreciably slows in the third trimester in a fetus affected with microcephaly, and autosomal recessive inheritance patterns may play an important role.

Fetuses with microcephaly are often miscarried, terminated or stillborn, which may explain the absence of comparative studies in ZIKV-infected pregnant women. Comparisons with postmortem or pathological samples derived from such scenarios introduce some form of bias27. In such cases, the estimated accuracy should be interpreted with caution.

The prenatal diagnostic accuracy of structural abnormalities affords informed maternal and health provider decisions, on whether to continue, terminate or institute fetal therapies. Potential misdiagnosis can be a source of emotional trauma during pregnancy. Hence, a review of growth standards employed and agreement with postnatal measurements can help eliminate or decrease the incidence of misdiagnosis.

To the best of our knowledge, no study at the time of conducting this systematic review had examined the variations in head measurements in the context of microcephaly for fetuses of ZIKV-infected pregnant women. There is also an evident lack of longitudinal or other studies indicating the best time-point for head measurements of fetuses from ZIKV-infected pregnant women. Our comprehensive search strategy and the lack of date or language restrictions make it likely that we identified all relevant studies.

Our study had limitations. Primary data was from a limited number of fetuses and reported by two studies with unclear or high risk of bias. The nature of studies included in the quantitative synthesis and an overall high risk of bias rating limit the confidence in extrapolated results for ZIKV-infected pregnant women.

Trimester-specific variation in fetal morphology visible on ultrasound measurements also restricts the use of fetal biometric parameters in isolation28. This suggests a need to incorporate presenting features and a detailed history of the pregnant woman. Variation in thresholds, ultrasound device and timing of assessment during pregnancy adds potential flow and timing bias.

With the influx of research on ZIKV infections in pregnancy, we acknowledge the rapid evolution of knowledge on the subject. Further studies of ultrasound accuracy based on fetal biometric parameters, performed with modern ultrasound machines and compared with reference measures at birth, will be helpful.

It is reasonable to assume that the technical improvement of ultrasound machines in the last 20 years should contribute to improved diagnostic accuracy, which was lacking in the published studies. Research on diagnostic test accuracy based on present-day ultrasound devices is needed to improve confidence in fetal microcephaly diagnosis.

In conclusion, we provide evidence for the diagnostic accuracy of ultrasound in the detection of fetal microcephaly. Ultrasound diagnostic accuracy of HC and OFD parameters at 4 and 5 SD below the mean was better at ruling in fetal microcephaly with high DOR, sensitivity, specificity and positive likelihood ratio. The relative improvement in ultrasound technology and technical skills suggests the need for new studies on the subject.