Diagnostic Accuracy of Ultrasound Scanning for Prenatal Microcephaly in the context of Zika Virus Infection: A Systematic Review and Meta-analysis

To assess the accuracy of ultrasound measurements of fetal biometric parameters for prenatal diagnosis of microcephaly in the context of Zika virus (ZIKV) infection, we searched bibliographic databases for studies published until March 3rd, 2016. We extracted the numbers of true positives, false positives, true negatives, and false negatives and performed a meta-analysis to estimate group sensitivity and specificity. Predictive values for ZIKV-infected pregnancies were extrapolated from those obtained for pregnancies unrelated to ZIKV. Of 111 eligible full texts, nine studies met our inclusion criteria. Pooled estimates from two studies showed that at 3, 4 and 5 standard deviations (SDs) below the mean, sensitivities were 84%, 68% and 58% for head circumference (HC); 76%, 58% and 58% for occipitofrontal diameter (OFD); and 94%, 85% and 59% for biparietal diameter (BPD). Specificities at 3, 4 and 5 SDs below the mean were 70%, 91% and 97% for HC; 84%, 97% and 97% for OFD; and 16%, 46% and 80% for BPD. No study including ZIKV-infected pregnant women was identified. OFD and HC were more consistent in specificity and sensitivity at lower thresholds compared to higher thresholds. Therefore, prenatal ultrasound appears more accurate in detecting the absence of microcephaly than its presence.

populations. The WHO interim guidance recommends that pregnant women residing in areas of ongoing ZIKV transmission should have fetal ultrasound scans to exclude microcephaly or other brain abnormalities that have been reported in fetuses of women with prenatal ZIKV infection 4 .
Prenatal assessment of microcephaly has conventionally relied on ultrasound measurements of fetal biometric parameters such as the head circumference, biparietal diameter and occipitofrontal diameter [5][6][7] . Measurements of these parameters below a given threshold at a specific gestational age of assessment have been applied to diagnose fetal microcephaly 6,8 . However, at the time of this review, no international consensus on fetal biometric parameters or the threshold for in-utero microcephaly diagnosis exists. Also, because the condition is rare and different parameters and limits are applied, the risk of wrong or missed diagnosis is high 9-12 . In the context of ZIKV infection, an accurate prenatal diagnosis of microcephaly is critical for fetal prognosis and decision-making by health providers and families of women suspected or confirmed to have ZIKV infection. We conducted a systematic review to assess the diagnostic accuracy of ultrasound measurement of fetal biometric parameters compared to reference assessments at birth for prenatal diagnosis of microcephaly in the context of ZIKV infection. This review served as part of the evidence base for the revised WHO interim guidance on the prenatal assessment of microcephaly in the context of ZIKV infection.

Methodology
Protocol registration. We registered this review in PROSPERO, the international prospective register of systematic reviews of the University of York and the National Institute for Health Research, under the number CRD42016039365.
Search strategies. We  Searches for grey literature and bibliographies of existing systematic reviews on ultrasound in pregnancy were complemented with results of the search strategies. No restrictions were placed on search dates or language. Two review authors independently screened the titles and abstracts of studies identified by the search strategies. Full texts of potentially eligible studies were independently assessed by two review authors for relevant studies.
Any disagreements were resolved through discussion or consultation with a third review author.
Inclusion and exclusion criteria. Index tests, reference standard, and diagnosis of interest. We considered studies that compared prenatal ultrasound measurements (index test) with direct postnatal measurements of head size (reference test). We included studies which used any of the following biometric parameters as index tests: head circumference (HC), occipitofrontal diameter (OFD), biparietal diameter (BPD), or ratios of any of these with either abdominal circumference (AC) or femur length (FL). Microcephaly (diagnosis) was the condition of interest, reported either as the only condition or separately in addition to other fetal brain abnormalities.
Types of studies. We considered for inclusion studies of any design (randomized controlled trials, prospective or retrospective cohort studies, cross-sectional studies and case-control studies) comparing prenatal assessment of fetal biometric parameters with standard postnatal head size measurements for diagnosing microcephaly.
Case series and conference proceedings reporting original data and with adequate information were also considered for inclusion.
Types of participants. Pregnant women who had ultrasound measurements of fetal biometric parameters for diagnosis of microcephaly (irrespective of the indications for ultrasound). We planned to separately assess pregnant women suspected of being at risk of or confirmed with ZIKV infection.
Data extraction and synthesis. Two review authors independently extracted data on participants' characteristics (ZIKV infection status, gestational age at the time of ultrasound assessment). We obtained data on the number of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) to determine the sensitivity and specificity (with 95% confidence intervals [CI]) of the index tests for each fetal biometric parameter.
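As an illustration of how sensitivity and specificity follow from the extracted 2x2 counts, the sketch below computes both with approximate 95% confidence intervals. The counts are hypothetical, and the Wilson score interval is an assumption made for the example; the interval method actually used in the review is not stated here.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (about 95% coverage for z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def sensitivity(tp, fn):
    """Proportion of fetuses with microcephaly at birth that the scan flagged."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of fetuses without microcephaly at birth that the scan cleared."""
    return tn / (tn + fp)


# Hypothetical counts for one biometric parameter at one threshold.
tp, fp, tn, fn = 8, 6, 14, 2

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"sensitivity {sens:.2f} (95% CI {sens_ci[0]:.2f} to {sens_ci[1]:.2f})")
print(f"specificity {spec:.2f} (95% CI {spec_ci[0]:.2f} to {spec_ci[1]:.2f})")
```

With counts this small the intervals are wide, which is consistent with the caution the review attaches to estimates drawn from few fetuses.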
Results from studies that presented insufficient data for meta-analysis were described qualitatively. In one study where both ultrasound and magnetic resonance imaging (MRI) were employed 13 , only sonographic data were extracted for the review.
For studies considered similar in terms of the research questions, study design and execution, we performed a meta-analysis using a random effects model on pooled data to estimate group sensitivity and specificity (with 95% CI).
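A random-effects pooled estimate can be sketched as follows. This is a simplified univariate illustration on the logit scale using the DerSimonian-Laird estimator of between-study variance; it is not the hierarchical model fitted in the review, and the per-study counts are hypothetical. The function names are introduced only for this example.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau^2."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, tau2


def logit_sensitivity(tp, fn):
    """Logit of sensitivity and its variance; 0.5 is added to both cells
    unconditionally (a simplifying continuity correction for this sketch)."""
    tp, fn = tp + 0.5, fn + 0.5
    return math.log(tp / fn), 1 / tp + 1 / fn


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


# Hypothetical (TP, FN) counts from two studies.
studies = [(9, 1), (7, 3)]
effects, variances = zip(*(logit_sensitivity(tp, fn) for tp, fn in studies))
pooled, se, tau2 = dersimonian_laird(effects, variances)

pooled_sens = inv_logit(pooled)
ci = (inv_logit(pooled - 1.96 * se), inv_logit(pooled + 1.96 * se))
print(f"pooled sensitivity {pooled_sens:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```

Pooling on the logit scale keeps the back-transformed estimate and its interval inside the 0 to 1 range, which matters when only two small studies contribute.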
We generated hierarchical receiver operating characteristic (HROC) curves using a hierarchical summary receiver operating characteristic (HSROC) model. To gauge overall test accuracy, we calculated a diagnostic odds ratio (DOR) and an area under the curve (AUC) using DerSimonian-Laird random-effects modelling and Holling's proportional hazards model 14 .
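For orientation, the likelihood ratios and the DOR are simple functions of sensitivity and specificity. The sketch below applies the standard formulas to the rounded pooled HC estimates quoted in the abstract (68% and 91% at 4 SD, 58% and 97% at 5 SD). Because the inputs are rounded and the review pooled its estimates within a model, the outputs are indicative only and need not match the published tables exactly.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = LR+ / LR-, i.e. (TP/FN) / (FP/TN) in terms of cell counts."""
    lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
    return lr_pos / lr_neg


# Rounded pooled HC estimates from the abstract: threshold -> (sens, spec).
hc_estimates = {"4 SD": (0.68, 0.91), "5 SD": (0.58, 0.97)}

for threshold, (sens, spec) in hc_estimates.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    dor = diagnostic_odds_ratio(sens, spec)
    print(f"HC at {threshold}: LR+ {lr_pos:.1f}, LR- {lr_neg:.2f}, DOR {dor:.1f}")
```

An LR+ well above 1 means a positive scan raises the odds of microcephaly, while an LR- close to 1 means a negative scan changes those odds very little.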
Data on TP, TN, FP and FN using cut-off values ranging from 3 SD to 5 SD below the mean were applied to estimate diagnostic test accuracy of fetal ultrasound. Pre-test probabilities based on the incidence of microcephaly in unclassified pregnancy (0.0285%) 1 and ZIKV-infected pregnancies (0.95%) 1 were applied to estimate positive and negative predictive values.
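The extrapolation from pre-test probability to predictive values can be illustrated with Bayes' theorem. The sketch below uses the two incidence figures given above (0.0285% and 0.95%) and, as an example input, the rounded pooled HC estimates at 3 SD quoted in the abstract (sensitivity 84%, specificity 70%). It is a worked example under those assumptions, not a reproduction of the review's tables.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given pre-test probability (prevalence)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pre-test probabilities stated in the text, expressed as proportions.
PRETEST = {"unclassified pregnancies": 0.000285, "ZIKV-infected pregnancies": 0.0095}

# Rounded pooled HC estimates at 3 SD below the mean (from the abstract).
HC_SENS, HC_SPEC = 0.84, 0.70

results = {}
for label, prevalence in PRETEST.items():
    results[label] = predictive_values(HC_SENS, HC_SPEC, prevalence)
    ppv, npv = results[label]
    print(f"{label}: PPV {100 * ppv:.2f}%, NPV {100 * npv:.2f}%")
```

Because the condition is rare in both populations, the PPV stays very low even when the pre-test probability rises to 0.95%, while the NPV remains close to 100%.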

Risk of bias assessment.
We assessed the risk of bias using version 2 of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool 15,16 in Review Manager (RevMan, version 5.3). We provided a rating for risk of bias and applicability concerns based on the presence or absence of indicators for index and reference standards, flow and timing of prenatal and postnatal tests.
We rated the risk of bias as high where serious deficiencies in the criteria were detected, as unclear where descriptions were inadequate, and as low where descriptions were appropriate. For the meta-analysed data, we assessed heterogeneity using the I² statistic (percentage of inter-study variation due to heterogeneity).
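The heterogeneity statistic can be sketched from Cochran's Q as below. The effect estimates and variances are hypothetical and stand for any per-study quantity on a suitable scale (for example, logit sensitivity); the function names are introduced only for this example.

```python
def cochran_q(effects, variances):
    """Cochran's Q: inverse-variance weighted squared deviations from the pooled mean."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(effects, variances):
    """I^2 in percent: share of total variation attributed to heterogeneity."""
    q = cochran_q(effects, variances)
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical per-study estimates and variances.
example_effects = [0.0, 2.0]
example_variances = [0.5, 0.5]
print(f"I^2 = {i_squared(example_effects, example_variances):.0f}%")
```

With only two studies, Q has a single degree of freedom, so I² is estimated very imprecisely and should be read with caution.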

Results
Search results. The search strategies yielded 2,258 citations from the databases. One hundred and eleven potentially eligible studies were identified after screening of titles and abstracts and removing duplicates (Fig. 1).
Full texts of all potentially eligible studies were assessed, and studies were excluded for the reasons shown (Appendix 2, Supplementary Information). Nine studies met our inclusion criteria. Two of these studies reported sufficient data for meta-analysis, while the other seven presented incomplete data and are described narratively.
The thresholds for prenatal and postnatal diagnoses of microcephaly were pre-specified in some studies. Microcephaly was defined as head measurements >2 SD below the mean 17 , <3 SD below the mean 18, 19, 21 , 3 SD below the mean 24 , >3 SD below the mean 20 , below the 5th percentile 13 or below the 10th percentile 22 . The threshold applied was not stated in one study 23 .
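To make the SD-based thresholds concrete, the sketch below converts a head measurement into a z-score against a gestational-age reference and checks it against cut-offs of 3, 4 and 5 SD below the mean. The reference mean and SD are hypothetical placeholders, not values from any growth standard, and treating a measurement exactly at the cut-off as positive is an assumption, since the included studies differed on this point.

```python
def z_score(measured_mm, reference_mean_mm, reference_sd_mm):
    """Number of SDs by which a measurement deviates from the reference mean."""
    if reference_sd_mm <= 0:
        raise ValueError("reference SD must be positive")
    return (measured_mm - reference_mean_mm) / reference_sd_mm


def flag_by_threshold(z, cutoffs=(3, 4, 5)):
    """Map each cut-off k to True when the measurement is at least k SD below the mean."""
    return {k: z <= -k for k in cutoffs}


# Hypothetical reference values for one gestational age, and a measured HC.
REFERENCE_MEAN_MM = 250.0
REFERENCE_SD_MM = 10.0
measured_hc_mm = 212.0

z = z_score(measured_hc_mm, REFERENCE_MEAN_MM, REFERENCE_SD_MM)
flags = flag_by_threshold(z)
print(f"z = {z:.1f}; flagged at 3/4/5 SD: {flags}")
```

The same fetus can therefore be test-positive under one study's definition and test-negative under another's, which is one reason results across studies are hard to compare.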
In three out of nine studies 13,20,21 , the ultrasound device used for prenatal detection of fetal parameters was reported. These included Acuson 128 XP 10 (Siemens) 20 , GE Voluson 730 13 and a range of ultrasound machines in the third study 21 : GE Voluson E8, 730 Expert and Voluson 730 Pro (all GE Healthcare).

Accuracy of ultrasound measurements of BPD (3 studies).
Meta-analysis of two studies 18,19 , which included 51 fetuses, showed a high sensitivity (94%) at 3 SD below the mean but lower sensitivities at 4 and 5 SDs. The specificity at 3 SD was very low but improved at the lower cut-offs of 4 and 5 SDs below the mean. The positive likelihood ratio for 3 SD suggests a slight increase in the likelihood of microcephaly, but its confidence interval includes 1 (suggesting no change in the likelihood of microcephaly) (Table 2, Fig. 2A-C). The positive likelihood ratios for 4 and 5 SDs indicate a large and often conclusive increase in the likelihood of microcephaly with the ratios exceeding 1. The positive predictive values (PPV) for unspecified and ZIKV-infected pregnancies were much lower than for OFD measurements across the three thresholds.
One study 20 provided descriptive data. This study noted a low true positive and a high false negative frequency for the second (3.2%; 29%) and third trimester (42.9%; 57.1%) at a threshold of 3 SD below the mean.
OFD measurement at a threshold of 3 SD below the mean for gestational age (GA) was more sensitive, and measurements at 4 and 5 SDs were more specific. Given the extremely low incidence of microcephaly applied, the proportion of fetuses diagnosed with microcephaly at the 3, 4 and 5 SD thresholds that were correctly diagnosed (PPV) was extremely low. Deriving PPVs from the 0.95% incidence of microcephaly among ZIKV-infected women improved the PPV (Table 3, Fig. 2D-F).
However, the proportion of fetuses without microcephaly who were correctly diagnosed was close to 100% for the three thresholds, for both unspecified and ZIKV-infected pregnancies.

Accuracy of ultrasound measurements of HC (8 studies).
Eight studies reported on the diagnostic accuracies of HC. Synthesis of two studies 18,19 (45 fetuses) with meta-analyzable data showed sensitivities of 84%, 68% and 58% and specificity of 70%, 91% and 97% at thresholds of 3, 4 and 5 SD below the mean for GA, respectively (Table 4, Fig. 2G-I).
Based on these two studies, HC measurements using 3 SD below the mean had relatively high sensitivity (84%), specificity (70%), positive likelihood ratio (2.6), and negative predictive values for unspecified (99%) and ZIKV-infected pregnant populations (99%) (Table 4, Fig. 2G-I). As the SD below the mean for GA increased from 3 to 5, the sensitivity decreased while the specificity increased substantially.
Descriptive data were provided in the other six studies 13, 17, 21-24 . Among 42 fetuses prenatally diagnosed with microcephaly, Leibovitz et al. 21 reported 24 true positives and 18 false positives, and a positive predictive value (PPV) of 57.1% at an HC of 3 SD below the mean for GA.
In a study of 20 suspected cases of fetal microcephaly, Stoler-Poria et al. 17 confirmed five cases to be true positives and 15 false positives. The true positive cases had a HC of between 2 and 4.8 SDs below the mean for gestational age.
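When a study reports only the fetuses that tested positive prenatally, the one accuracy measure that can be derived directly is the PPV, TP / (TP + FP). The short sketch below applies this to the counts quoted above for Leibovitz et al. and Stoler-Poria et al.; sensitivity and specificity cannot be obtained from these counts because false negatives and true negatives were not reported.

```python
def ppv_from_counts(true_positives, false_positives):
    """Positive predictive value among prenatally test-positive fetuses."""
    return true_positives / (true_positives + false_positives)


# (TP, FP) counts as quoted in the text.
reported_counts = {
    "Leibovitz et al.": (24, 18),
    "Stoler-Poria et al.": (5, 15),
}

ppv_by_study = {
    study: ppv_from_counts(tp, fp) for study, (tp, fp) in reported_counts.items()
}
for study, ppv in ppv_by_study.items():
    print(f"{study}: PPV {100 * ppv:.1f}%")
```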
Wong et al. 22 reported comparable z-scores for prenatal and postnatal correlations in 455 fetuses. A z-score threshold of ≤1.3 below the mean (44.6% sensitivity, 35.1% specificity, 44.9% FP rate, 45.9% FN rate) was more sensitive and specific relative to a z-score of ≤1.7 below the mean (28.8% sensitivity, 21% specificity, 62.6% FP rate, 28.2% FN rate). Additionally, an area under the ROC curve of 0.6 suggested inaccuracy of prenatal ultrasound diagnosis of microcephaly.
Table 2. Diagnostic accuracy of ultrasound measurements of BPD for prenatal assessment of microcephaly. Parentheses indicate 95% CI. Pre-test probabilities, i.e. incidence of microcephaly among general pregnancies and ZIKV-infected pregnancies, were estimated as 0.0285% and 0.95%, respectively.
Table 3. Diagnostic accuracy of ultrasound measurements of OFD for prenatal assessment of microcephaly. Parentheses indicate 95% CI. Pre-test probabilities, i.e. incidence of microcephaly among general pregnancies and ZIKV-infected pregnancies, were estimated as 0.0285% and 0.95%, respectively.
One study 13 reported a sensitivity of 85.7% and specificity of 85.3% for microcephaly detection at a HC of <5th percentile for gestational age. In this study, prenatal and postnatal findings were more consistent in the absence of coexisting brain abnormality.
In another study 24 , 11 of 16 cases of prenatally diagnosed microcephaly at a threshold of 3 SD below the mean for GA were false positives when examined at birth, giving a positive predictive value of 31%. Campbell et al. 23 reported the accurate identification of all ten cases of microcephaly suspected before 24 weeks gestation at the postnatal examination. There were no false positives or false negatives.

Accuracy of ultrasound measurements of the HC to AC ratio (3 studies).
We could not perform a meta-analysis for this parameter. Descriptive information on the accuracy of ratios of the head circumference to abdominal circumference for fetal biometry assessment was provided in only three studies 18,19,21 .
In one study 18 , ultrasound detection of microcephaly with the HC:AC ratio was consistently specific at all thresholds (3, 4 and 5 SDs below the mean). Sensitivity was lowest at 5 SD (20%) and highest at 3 SD (80%) below the mean.
Another study 19 accurately detected the absence of microcephaly at thresholds of 3, 4 and 5 SD below the mean (specificity of 100%), with sensitivity greatest at 3 SD below the mean (80%). The third study 21 identified a low sensitivity for the HC:AC ratio at <5th percentile, for both suspected (33.3%) and confirmed microcephaly (37.5%).

Accuracy of ultrasound measurements of BPD to FL ratio (2 studies).
A meta-analysis was not possible for this parameter. In one study 18 , the sensitivity and specificity of the BPD:FL ratio in detecting microcephaly were generally low across the thresholds measured (33-78%), although the specificity was high at 5 SD below the mean (87%).
Another study 24 noted the limitations of using the BPD: FL ratio for defining cases with or without microcephaly and reported five true positives and 11 false positives.

Accuracy of ultrasound measurements of FL to HC ratio (3 studies).
Available studies could not be meta-analysed. In one study 18 , ultrasound measurement of the FL:HC ratio had a high sensitivity (75-100%) at 3-5 SDs below the mean and a specificity of 87-100% at ≤3 SDs below the mean.
Another study 19 reported low sensitivity (50-75%) at all thresholds, highest at 1 SD (75%), and 85-100% specificity at ≤2 SD, all below the mean, for the FL:HC parameter. Leibovitz et al. 21 showed that, at <5th percentile, the HC:FL measurement had a low sensitivity for both suspected (52.4%) and confirmed microcephaly (50%).

Risk of bias assessment and applicability concerns (QUADAS-2)
The two studies included in the meta-analysis were at a high risk of bias due to lack of pre-specified prenatal thresholds 18,19 and inappropriate exclusions 19 . Only one of these studies 19 raised concerns regarding applicability, because its study population was limited to cases with a short interval of <2 weeks between the prenatal index scan and the postnatal reference test.
In five of seven descriptive studies 13,17,20,23,24 , a high risk of bias rating was assigned. These studies limited the study population of pregnant women to the following: CMV-infection and the availability of MRI and US diagnosis 13 , Hebrew native language ability 17 , mothers who presented with phenylketonuria 20 , before 26 weeks gestation 23 or late trimester measurements (28 to 43 weeks) 24 . The two other studies had a low risk of bias 21,22 .
Concerns regarding applicability were noted in two 13,20 of the seven studies that provided only descriptive data. These studies included only high-risk mothers, either infected with CMV 13 or having phenylketonuria 20 . The other five studies 17, 21-24 had low concerns regarding applicability (Fig. 3).

Discussion
This review provides a thorough overview of available information on the prenatal application of ultrasound for diagnosis of microcephaly. HC and OFD measurements at 4 and 5 SD below the mean had a high DOR (25.3 to 48.0) and positive likelihood ratios (7.6 to 19.3) with wide 95% confidence intervals.
Negative predictive values for unspecified pregnancies and, by extrapolation, ZIKV-infected pregnancies at these standard deviations were consistently high, close to 100%, although these values were derived from a relatively small number of fetuses. Thresholds of 4 and 5 SDs below the mean for OFD and HC showed a tendency to consistently "rule in" the diagnosis of fetal microcephaly with a reasonable level of confidence.
Our study indicates that the overall diagnostic test accuracy of ultrasound for predicting microcephaly at birth is limited as it varied with the applied cut-offs. Large differences were not observed among the different biometric parameters used to make a prenatal detection of microcephaly.
Ultrasound measurements of all three parameters should be recommended for cases with a high likelihood of microcephaly. Given the low incidence of microcephaly 1 , a fetal ultrasound seems not to have a large effect on the probability of identifying true cases of microcephaly.
To detect fetal microcephaly and/or brain abnormalities, the WHO currently recommends an early fetal anomaly scan between 18 and 20 weeks gestation or at the earliest possible time if after 20 weeks. A repeat ultrasound in the late second or early third trimester, usually around 28 to 30 weeks gestation 4 , is further encouraged to exclude false positives.
The inclusion of coexisting abnormalities such as intrauterine growth restriction, intracranial deformities and a detailed family history has been shown to improve the predictive value of ultrasound diagnosis 21 . Thus, setting an SD threshold to increase the accuracy of microcephaly detection in ZIKV-infected and any pregnancies should be informed by a balance of expert opinion, detailed history and analysis of other associated fetal anomalies 25 .
Variation in sensitivity and specificity for all fetal head biometric measurements (BPD, HC, OFD) observed in all studies may have been due to trimester-specific changes in fetal growth and differences in ultrasound devices, techniques and patient characteristics (congenital or acquired microcephaly, the presence of other anomalies) 11,26 . Growth appreciably slows in the third trimester in a fetus affected with microcephaly, and autosomal recessive inheritance patterns may play an important role. Pregnancies with fetal microcephaly often end in miscarriage, termination or stillbirth, which may explain the absence of comparative studies in ZIKV-infected pregnant women. Comparisons with postmortem or pathological samples derived from such scenarios introduce some form of bias 27 . In such cases, the estimated accuracy should be interpreted with caution.
Accurate prenatal diagnosis of structural abnormalities allows mothers and health providers to make informed decisions on whether to continue or terminate the pregnancy or to institute fetal therapies. Potential misdiagnosis can be a source of emotional trauma during pregnancy. Hence, a review of the growth standards employed and their agreement with postnatal measurements can help eliminate or decrease the incidence of misdiagnosis.
To the best of our knowledge, no study at the time of conducting this systematic review had examined the variations in head measurements, in the context of microcephaly, for fetuses of ZIKV-infected pregnant women. There is also an evident lack of longitudinal or other studies indicating the best time-point for head measurements of fetuses from ZIKV-infected pregnant women. Our comprehensive search strategy, with no date or language restrictions, likely identified all relevant studies.
Our study had limitations. Primary data was from a limited number of fetuses and reported by two studies with unclear or high risk of bias. The nature of studies included in the quantitative synthesis and an overall high risk of bias rating limit the confidence in extrapolated results for ZIKV-infected pregnant women.
Trimester-specific variation in fetal morphology visible on ultrasound measurements also restricts the use of fetal biometric parameters in isolation 28 . This suggests a need for incorporating presenting features and a detailed history of the pregnant woman. Variation in thresholds, ultrasound device and timing of assessment during pregnancy adds potential flow and timing bias.
With the influx of research on ZIKV infections in pregnancy, we acknowledge the rapid evolution of knowledge on the subject. Further studies that assess the accuracy of ultrasound measurements of fetal biometric parameters against reference measures at birth, using modern ultrasound machines, will be helpful.
It is reasonable to assume that the technical improvement of ultrasound machines in the last 20 years should contribute to improved diagnostic accuracy, which was lacking in the published studies. Research on diagnostic test accuracy based on present-day ultrasound devices is needed to improve confidence in fetal microcephaly diagnosis.
In conclusion, we provide evidence for the diagnostic accuracy of ultrasound in the detection of fetal microcephaly. Ultrasound diagnostic accuracy of HC and OFD parameters at 4 and 5 SD below the mean was better at ruling in fetal microcephaly with high DOR, sensitivity, specificity and positive likelihood ratio. The relative improvement in ultrasound technology and technical skills suggests the need for new studies on the subject.