## Introduction

Despite prevention guidelines and public health advisories1,2,3, alcohol use continues to be common during pregnancy. A recent meta-analysis determined the global prevalence of alcohol use during pregnancy of 9.8%4. In the US, a 2013 report found that ~ 18% of women consumed alcohol during pregnancy, and 6.6% reported binge-drinking episodes5, which can be particularly damaging to fetal development6. In a 2015–2016 Texas state-wide assessment, we reported a third trimester rate of alcohol exposure of 8.4%, with rates as high as 17.7% in some geographical regions7. Global prevalence estimates for fetal alcohol syndrome (FAS) range from 0.15 to 0.3%4, 8 but are much higher in endemic populations, such as, 5.9–9.1% in the Cape Coloured (mixed ancestry) community in South Africa9. The diagnosis of FAS, the most severe of the fetal alcohol spectrum disorders (FASD), consists of a specific pattern of facial dysmorphology, microcephaly, and growth retardation, and applies to only a fraction of the estimated population of fetal alcohol-affected children10, 11. Estimates of the prevalence of FASD range from ~ 1 to 5% of the school age population in the US12 to 13.6–20.9% in the Western Cape Province of South Africa9, and the public health and economic burdens attributable to FASD are substantial13.

Early identification of alcohol-affected children can facilitate early intervention, which can mitigate some of the adverse secondary effects of prenatal alcohol exposure (PAE), which may emerge later in life14. However, identification of alcohol-affected children is difficult, particularly in children who lack the characteristic dysmorphic facial features that characterize FAS and partial FAS (PFAS)15. PAE-related dysmorphology is especially difficult to identify in infancy and early childhood, when interventions may be particularly effective16. A documented history of drinking during pregnancy is often difficult to obtain9, requiring detailed interviewing by trained and skilled interviewers17, 18. To date, several promising biomarkers of exposure to alcohol in utero have been identified, including fatty acid ethyl esters, which are metabolites of alcohol that are found in neonatal meconium19,20,21, ethyl glucuronide in placenta22, and phosphatidylethanol in newborn blood23. However, because effects of PAE on cognition and behavior vary considerably depending on factors, including timing of exposure, dose per drinking occasion, and genetic vulnerability, biomarkers of exposure cannot identify which exposed children will be adversely affected and require intervention.

A recent study on children from our original Cape Town Longitudinal Cohort24, using facial imaging involving dense surface modeling and shape signature analysis of 3 dimensional (3D) facial photographs, provided one of the first biomarkers of effect for FASD25. The 3D methodology showed that about half of the heavily exposed (HE) nonsyndromal children who appeared to lack the distinctive facial anomalies when examined by dysmorphologists, actually had subtle, difficult-to-detect facial features, resembling those seen in FAS and PFAS, and that these children had deficits in verbal IQ and learning and memory comparable to those in FAS and PFAS. These sophisticated facial imaging procedures are not yet available and cost-efficient for routine clinical practice. There is, therefore, a need for alternative, less expensive biomarkers of effect that can identify which exposed nonsyndromal children may be developmentally compromised by PAE26.

Data from a preclinical study27 and two previous human studies28, 29 suggest that microRNAs (miRNAs) secreted by cells and tissues into biofluids, such as blood, can provide biochemically stable, cost-effective biomarkers of PAE. miRNAs are a class of small non-protein-coding RNAs that, among several functions, serve as intracellular repressors of protein translation30. In preclinical models, our research was the first to show that ethanol influences miRNA expression31 via receptor mediated pathways31, 32 and epigenetic mechanisms33, 34. We and others have also found that alterations in miRNAs can mediate effects of alcohol on fetal neural stem cells31, 35, cranial development33, and behavior36. A growing body of evidence has shown that miRNAs play an important role in alcohol addiction, toxicity, and teratology37. The potential of miRNAs as biomarkers first became evident in 2008 when it was reported that they are secreted into human plasma38 and can identify patients with prostate cancer39. Impressively, plasma miRNAs are remarkably stable with proper sample collection and storage, even through multiple sample freeze–thaw cycles39, a factor that enhances their biomarker potential.

Since health care providers are often presented with newborns and infants who lack the characteristic facial dysmorphic features and/or growth deficits but may be alcohol-affected40, 41, it is of interest to determine whether biomarkers, such as extracellular, circulating miRNAs (exmiRNAs) in the presenting infant, can be used to identify alcohol-affected infants; that is, those who will exhibit PAE-related developmental impairment. To address this question, we first identified miRNAs in plasma samples from infants at 2 weeks and 6.5 months that discriminate between alcohol-exposed and non-exposed infants. This study is the first to examine whether exmiRNAs obtained from the infant can serve as biomarkers both of PAE and of adverse effects that emerge during development following PAE. This study is also the first to test directly whether exmiRNAs that discriminate between exposed and non-exposed infants mediate effects of PAE on specific outcome domains, namely, postnatal growth restriction and cognition.

## Results

### Participant characteristics

Demographic and background characteristics are summarized in Table 1. The mothers were poorly educated (i.e., the majority did not complete high school; none received any post-high school education) and economically disadvantaged (mean = low end of Hollingshead Inventory42 Level IV, semi-skilled laborers). There were no between-group differences in maternal education, socioeconomic or marital status, or parity, nor were there differences in gestational weight gain or daily caloric intake, indicating comparable between-group nutritional status during pregnancy. Women in the control group were on average 3.3 years younger and had their first antenatal visit on average 3.2 weeks earlier than alcohol-exposed mothers.

The mothers of infants in the alcohol-exposed group reported consuming an average of 4.5 oz absolute alcohol (AA)/occasion (128 g; 133 ml; ≈ 9 standard drinks/occasion) across pregnancy on an average of 1–2 days/week. 26 of the 31 controls (83.9%) abstained from drinking during the pregnancy. Four controls (12.9%) reported no alcohol use when recruited but subsequently drank 1–2.5 drinks on 1–2 occasions later in pregnancy; the fifth non-abstaining control consumed 3 drinks on 2 occasions around time of conception and then reported abstaining when she learned she was pregnant. While there were no group differences in number of cigarettes smoked/day, 31 women (78.4%) reported smoking in the PAE group compared to 17 controls (54.8%) (Χ2(1) = 4.27, p = 0.039). The number of cigarettes smoked/day was generally low with most women smoking < 0.5 pack/day. Only 4 women (10.8%) reported using marijuana in the PAE group and 2 among the controls (6.5%) (Χ2(1) = 0.40, p = 0.528); the number of days/week marijuana was used did not differ between the two groups. None of the women reported using methamphetamine, methaqualone, cocaine, or opiates during pregnancy.

There were no between-group differences in infant sex, gestational age (GA) at birth, birth weight, length, head circumference, or age at T2wk or T6.5mo, but the alcohol-exposed infants had smaller weight-, length-, and head circumference-for-age by 6.5 months of age. Five of the 37 infants (13.5%) born to the heavy drinking mothers met the Revised Institute of Medicine criteria10 for full FAS; an additional two (5.9%), for PFAS.

### Plasma and RNA characteristics

#### Sample purity characteristics

There were no significant differences in free hemoglobin levels (absorbance at 414 nm) attributable to exposure group or infant sex (both p-values > 0.20). Free hemoglobin at 2 weeks was 5.4-fold higher than at 6.5 months (F(1,118) = 38.93, p = 7.11 × 10–09; see Supplementary Fig. S1). However, there were no significant differences due to age, exposure group, or sex in the difference in cycle threshold (ΔCT) for miR-23a-3p (miRbase accession number MIMAT0000078) and miR-451a (MIMAT0001631) (ΔCT(miR23a-miR451a)), an independent marker for hemolysis43. Moreover, in contrast to previous reports in adult populations44, there was no significant relation (r = 0.13, p > 0.10) between absorbance at 414 nm and ΔCT(miR23a-miR451a), and none of the samples expressed the erythrocyte-specific transcript, SLC4A1 (see Supplementary Fig. S1). These data suggest that the elevated free hemoglobin observed was not due to acute hemolysis during sample collection but rather to the normal physiologic elimination of extra red blood cells that occurs during the first ~ 6 weeks of life.

Sex and age effects. Female infant-derived plasma samples contained ~ 12% more total isolated RNA, which included carrier MS2 RNA, per microliter (Fig. 1a) compared to male samples (F(1,118) = 5.6, p = 0.02). The number of expressed miRNAs decreased with age (F(1,110) = 12.9, p = 0.001; Fig. 1b); infants expressed ~ 14% fewer unique miRNAs at 6.5 months than at 2 weeks. However, whereas the control infants exhibited an ~ 18% decline in total RNA content between T2wk and T6.5mo, infants with PAE continued to exhibit elevated plasma RNA levels at T6.5mo (age by exposure group interaction: F(1,118) = 4.99, p = 0.027; Fig. 1c). Notwithstanding, there were no significant effects of PAE or infant sex on number of unique miRNAs expressed (both p-values > 0.20) and no effects of PAE, age, or sex on average expression level of all expressed miRNAs (average CT; all p-values > 0.15).

### PAE influences exmiRNA expression in infancy

We next examined the 148 miRNAs that were detected in at least 80% of the samples in either the alcohol-exposed or control group at T2wk or T6.5mo (Table 2, Supplementary Table S2). To isolate the effects of PAE in this population, effect sizes were determined based on ANCOVA adjusted means, with cigarettes/day as the covariate. The expression of two miRNAs at T2wk and 13 miRNAs at T6.5mo was significantly altered by PAE, i.e., the 95% confidence interval of the effect size did not contain zero (ANCOVA F > 4.03; p < 0.049). To create a broader set of candidate miRNAs that might be present in miRNA profiles with mechanistic significance for FASD in this sample, we next identified miRNAs that differed between the exposed and control groups with at least a medium effect size (Cohen’s d ≥ 0.40). At T2wk, expression levels of 18 miRNAs differed between the exposed and control groups, with effect sizes ranging from 0.40 (p = 0.179) to 0.68 (p = 0.014). At T6.5mo, expression of 26 miRNAs differed between the groups with effect sizes ranging from 0.40 (p = 0.105) to 0.88 (p = 0.001). At T2wk, 72% of these PAE-responsive miRNAs were upregulated by PAE, while at T6.5mo, 92% of the miRNAs were upregulated by PAE (Fig. 2).

Previous research suggests that there are sex differences in the early presentation of FASD45 and some of the cognitive impairments associated with PAE46. Moreover, sex differences have been found in the expression of other pediatric biomarkers47, 48 and in exmiRNA profile49. Although this study was under-powered to evaluate sex differences in exmiRNA profiles, some large-effect-size differences did emerge when exmiRNA expression data were disaggregated by sex (Table 2, Supplementary Table S2). For instance, at T6.5mo, MIMAT0000752 (hsa-miR-328-3p) was significantly elevated in female PAE infant plasma samples (Hedges’s g = 1.15, p = 0.002), but the effect was much smaller in male PAE infants (g = − 0.34; p = 0.324). Bootstrap resampling analysis50 of ANCOVA-adjusted means indicated that the expression of this exmiRNA was significantly different in the PAE compared to the control group in 83.1% of the resampling iterations which contained both male and female infants (Fig. 3). When resampled separately, female PAE infants were also significantly different from female controls in 93.3% of iterations, whereas male PAE infants were significantly different from male controls in only 17.1% of the iterations. In contrast, MIMAT0000443 (hsa-miR-125a-5p) was significantly elevated in T6.5mo male PAE samples (g = 0.89, p = 0.013) but not female PAE infant plasma samples (g = 0.15; p = 0.66). Resampling analysis indicated that the expression of this exmiRNA was different in PAE compared to the control group in 57.0% of the iterations which contained both male and female infants. However, when resampled separately, male PAE infants were different from controls in 80.7% of the iterations, while female PAE infants were significantly different from controls in only 5.5% of the iterations. Sex differences in PAE-regulated exmiRNA expression were more pronounced at T6.5mo than at T2wk; the number of exmiRNAs that were significant in more than half of the iterations and more frequently significant than the population as a whole when disaggregated by sex was 6 at T2wk and 21 at T6.5mo.

Hierarchically-clustered correlation matrices were computed to assess the extent to which PAE resulted in coordinated expression of exmiRNAs (Fig. 4). At T2wk, the alcohol-exposed infants exhibited 1.6-fold greater significantly (p < 0.05) correlated exmiRNAs compared with controls; at T6.5mo, the infants with PAE exhibited 1.2-fold more significant correlations as controls. Bootstrap resampling with replacement was used to assess the statistical stability of the number of significant correlations. At both T2wk and T6.5mo, PAE was associated with higher numbers of stable correlations among miRNAs compared with controls (99% confidence interval (CI): T2wk Cont [2273.9, 2319.4], PAE [3826.9, 3918.2]; T6.5mo Cont [2708.4, 2746.4], PAE [2999.3, 3043.3]). At T2wk the standard deviation of the control and PAE distributions of significant correlations did not overlap, whereas there was overlap of the distributions at T6.5mo. These data indicate that the correlated expression of exmiRNAs seen at 2 weeks in PAE infants was diminished over development. Re-computation of correlation matrices for the subset of miRNAs that differed between groups with effect size ≥ 0.40 also showed that alcohol-exposed infants exhibited a greater number of positively intercorrelated miRNAs at both ages (T2wk: 153 total possible correlations: control = 23 (15%), PAE = 37 (24%); T6.5mo: 325 total possible correlations: control = 45 (14%), PAE = 72 (22%)).

### PAE Is associated with increased coordinated expression of exmiRNAs across chromosomes at 2 weeks

Transcription is thought to be localized to relatively few spatially distinct nuclear regions, termed transcription factories, where strands from different chromosomes are functionally linked, resulting in co-transcription that can lead to correlated gene expression51. To assess the extent to which PAE may reorganize co-transcription from miRNA-encoding chromatin, we tabulated the number of significant correlations between expressed exmiRNAs by chromosomal location. We then assessed the fold-change associated with PAE in number of significant cross-chromosome correlations between exmiRNAs. There was a larger number of inter-correlations between exmiRNAs encoded on different chromosomes in the plasma of exposed infants compared with controls (T2wk: 217 total possible cross-chromosomal correlations: control = 32 (15%), PAE = 43 (20%); T6.5mo: 560 total possible cross-chromosomal correlations: control = 69 (12%), PAE = 141 (25%)) (Fig. 5; Supplementary Fig. S3). These correlations are not due to co-expression of miRNAs within chromosomal clusters, i.e., miRNAs within 10 kb of additional miRNAs which are commonly co-expressed52, as few PAE-sensitive exmiRNAs were located within chromosomal clusters (Supplementary Table S4). Of note, miRNAs from the development-associated miR 17–92 genomic cluster on chromosome 1353, are expressed at T2wk and T6.5mo and have higher correlated expression in the PAE group at both timepoints (percentage of significant correlations T2wk: Cont 1.7%, PAE 8.7%; T6.5mo: Cont 2.7%, PAE 6.1%). Increased correlation between specific chromosomes was observed (Fig. 5b). For example, at T2wk the number of significant inter-correlations between exmiRNAs on chromosome 16 and those on chromosome 1 was 23-fold greater in the PAE group than among the controls. Similarly, the number of significant correlations between miRNAs on chromosome 6 and those on chromosomes 2, 13, and 16 was more than tenfold greater in the PAE group than among the controls. Neither of these patterns was seen at T6.5mo.

### Confirmatory factor analyses at 2 weeks and 6.5 months of age

An iterative procedure was used to identify clusters of PAE-responsive (Cohen’s d ≥ 0.40, adjusted for smoking) exmiRNAs that shared common variance. Following several iterations in which exmiRNAs were linked to each other using conceptual (exploratory Ingenuity pathway analysis (IPA)), empirical (chromosomal location), and statistical criteria (exploratory factor analysis), simple factor structures were tested for minimal amounts of measurement error using confirmatory factor analysis (CFA). The biological pathways associated with each of the factors were identified based on IPA. As shown in Table 3, at T2wk, a bifactor model provided the best model for clusters of exmiRNAs that distinguished the exposed and control infants using inferential statistics but not using information criteria (the Bayesian information criterion (BIC) was smaller in the 3-factor correlated solution). Furthermore, the bifactor solution was not interpretable as neither the general factor nor the specific factors were supported. Consequently, the 3-factor correlated solution was the preferred simple structure at both ages.

At T2wk, IPA showed that the first factor included miRNAs related to cell maturation, cell cycle inhibition, and somatic growth, the second factor included miRNAs related to cell survival, apoptosis, cardiac development, and metabolism, and the third factor included miRNAs related to cell proliferation, skeletal development, hematopoiesis, and inflammation (Fig. 6a; Supplementary Fig. S5). This model provided an excellent fit to the data despite having a significant omnibus chi-square test, suggesting minimal discrepancies between the current model and “perfect” model fit (i.e., with zero residuals) [χ2(115) = 149.170, p = 0.018]. Fit indices were also excellent (Root Mean Square Error of Approximation (RMSEA) = 0.073, 95% CI = 0.032–0.104; Bentler’s Comparative Fit Index (CFI) = 0.920, Tucker-Lewis Index (TLI) = 0.906) prior to Bartlett’s correction. After applying the correction to the omnibus chi-square statistic, it became non-significant (χ2(115) = 127.434, p = 0.202), and two of the fit indices also improved after application of the correction (RMSEA = 0.43, 95% CI = 0.001–0.087; CFI = 0.971, TLI = 0.966) (see Supplementary Fig. S6a,b). No alterations from the hypothesized model were implemented, except for a residual covariation between miRNA MIMAT0001341 (miR-424-5p) and MIMAT0000759 (miR-148b-3p), and all items loaded significantly on their respective factors. The between-factors correlations ranged between 0.238 and 0.617 and were significant only between F1 and F2 and F1 and F3 with p < 0.001, suggesting correlated but distinct dimensions. This 3-factor structure at the neonatal period was compared to a unidimensional structure and to a bifactor model positing generalized and specific effects of the miRNAs in that both domain specific factors and a generalized factor would explain the relations between miRNAs (for a discussion on bifactor models see54). Results as mentioned earlier favored the 3-factor correlated solution using the information criteria (i.e., the BIC) and interpretational quality of the solution (Table 3). Internal consistency estimates using omega and maximal reliability H were 0.867 and 0.971 for the first factor, 0.553 and 0.804 for the second factor, and 0.354 and 0.711 for the third factor, respectively, indicating minimal amounts of measurement error and accurate measurement of the latent variables.

At T6.5mo, a 3-factor correlated solution provided the best model (Table 3). IPA showed that the first factor included miRNAs related to neurodevelopment, neural crest/mesoderm-derivative development and growth; the second, miRNAs related to immune system and inflammation; the third, miRNAs related to somatic growth and cardiovascular development (Fig. 6b, Supplementary Fig. S7). The measurement model showed good model fit, as all factor loadings were significantly different from zero. Prior to Bartlett’s correction, unstandardized residuals were 7.3%, fit indices were: CFI = 0.880, TLI = 0.864, and the chi-square test was significant [χ2(203) = 274.617, p = 0.001]. Post correction, model fit improved substantially with a point estimate of the residuals of 4.7% and a 95% CI ranging between 0.1 and 7.3%. The chi-square test was no longer significantly different from zero [χ2(203) = 233.253, p = 0.072] (Supplementary Fig. 6c,d) supporting an inference that the tested model does not deviate from a “perfect” model. The between factor correlations ranged from 0.649 and 0.756, indicating the absence of collinearity but the presence of three different yet highly correlated constellations of miRNAs. All between factor correlations were significantly different from zero and indicative of large effect sizes (rs > 0.5055). This 3-factor solution had three residual covariations between miRNAs, namely between miRNA MIMAT0000092 (hsa-miR-92a-3p) on Factor 1 and both MIMAT0002177 (hsa-miR-486-5p, Factor 1) and MIMAT0000460 (hsa-miR-194-5p, Factor 2) and between miRNAs MIMAT0000074 (hsa-miR-19b-3p) on Factor 1 and MIMAT0000431 (hsa-miR-140-5p) on Factor 3, suggesting the presence of a third variable that likely accounts for part of the variance not attributable to the latent constructs measured by Factors 1 and 2. Internal consistency reliability indicated low levels of measurement error. The omega coefficient estimates for the 3 factors were 0.807, 0.802, and 0.625, respectively. The estimates using maximal reliability were 0.918, 0.838, and 0.638, all within acceptable standards, although Factor 3 was lower than the others.

### Mediation of effects of PAE on somatic growth by exmiRNAs

Table 4 presents the results of the analyses examining the degree to which the effects of PAE (alcohol exposure (in AA/day), saturated linear regression) on the anthropometric measures are mediated by the expression levels of the exmiRNAs loading on each of the factors. The negative direct paths indicate that higher levels of PAE were associated with shorter infant length, lower weight, and smaller head circumference, after accounting for (a) the mediating role of the miRNA factors, and (b) the relation between alcohol exposure and smoking. The rows labeled “Indirect Paths” present the data from the mediation models for each of the factors. The indirect path rows assess the effect of the pathways from PAE to exmiRNA expression level and from exmiRNA expression level to the outcome. The rows labeled “Specific Indirect” show the degree to which the effect of PAE was mediated by the expression levels of the exmiRNAs in each factor.

At T2wk, all direct effects were negative and significant, showing that higher levels of exposure were associated with decrements in all three growth parameters. Because higher exmiRNA values (increased CT) indicate lower levels of expression, among the indirect paths PAE was associated with decreased expression of exmiRNAs associated with (F1) cell maturation, cell cycle inhibition, and somatic growth and (F3) cell proliferation, skeletal development, hematopoiesis, and inflammation and with elevated miRNA expression of exmiRNAs associated with (F2) cell proliferation, skeletal development, hematopoiesis, and inflammation in relation to all three anthropometric measures. Partial mediation, in which both the direct path and specific indirect path were significant, was observed for weight and head circumference by exmiRNAs associated with (F3) cell proliferation, skeletal development, hematopoiesis, and inflammation.

As at T2wk, all direct effects at T6.5mo were negative and significant, showing that higher levels of exposure were associated with greater decrements in all three growth parameters. Among the indirect paths, higher PAE was associated with elevated miRNA exposure for all three domains, namely, (F1) neurodevelopment, neural crest/mesoderm-derivative development and growth, (F2) immune system and inflammation, and (F3) somatic growth and cardiovascular development. (F2) immune system and inflammation exmiRNAs partially mediated the effect of PAE on length.

### exmiRNAs mediate the effect of PAE on cognitive functioning

Mediation of the effect of alcohol exposure on the Fagan Test of Infant Intelligence (FTII;56) visual recognition memory was significant for one exmiRNA factor: T2wk (F3) cell proliferation, skeletal development, hematopoiesis, and inflammation (Fig. 7a). The indirect effect of alcohol exposure on cognitive functioning through F3 was significant (b = − 0.169, p = 0.032), and the model fit the data well [χ2(4) = 3.355, p = 0.500; CFI = 1.00; TLI = 1.00]. A similar model also fit well at T6.5mo, [χ2(4) = 9.001, p = 0.061; CFI = 0.965; TLI = 0.869]. As shown in Fig. 7b, the data suggest partial mediation at T6.5mo of the effect of PAE on cognitive functioning by (F2), immune system and inflammation exmiRNAs (b = 0.108, p = 0.109).

## Discussion

PAE is associated with a broad range of cognitive deficits, including lower IQ57, 58, poor attention and executive function59,60,61,62, deficits in eyeblink conditioning24, 63,64,65,66 and number processing67, 68, and slower cognitive processing speed57, 67, 69,70,71,72. Although prenatal alcohol effects are irreversible, early interventions and remediation may mitigate severity of intellectual and behavioral impairment73. However, identification of affected individuals in need of early intervention is challenging both because FASD diagnostic clinical assessment is not universally available and because most PAE-affected individuals may be nonsyndromal and, therefore, lack the characteristic pattern of craniofacial dysmorphic features and growth deficits seen in FAS. There is thus a clear and pressing need for supplemental diagnostic approaches that can be more widely deployed as a screening tool for FASD in the general population. We and others recently found that plasma miRNAs in the pregnant mother could be a potential index not only for exposure29 but also for predicting which infants were more likely to meet criteria for a diagnosis of FAS or PFAS28. In the current study, based on our preclinical data in an ovine model27, we hypothesized firstly that PAE would influence the pattern of plasma miRNAs in the newborn infant, and secondly, that these infant exmiRNAs would predict infant growth and cognitive outcomes, thus enabling better early identification of PAE-affected infants. Findings from this study lend support to these hypotheses.

PAE was associated with alterations in patterns of exmiRNA expression at both 2 weeks and 6.5 months, demonstrating the utility of exmiRNAs as biomarkers of exposure at different ages. PAE-related exmiRNAs were different at the two ages, though there was a clear bias towards increased miRNAs in infant plasma due to PAE such that by 6.5 months almost all of the altered exmiRNAs were elevated in PAE infants relative to controls. Likewise, the number of likely sex-specific exmiRNAs also increased over development, with 3.5-fold more at T6.5mo than T2wk. Four of the 18 exmiRNAs altered to a clinically-relevant level at T2wk were also altered at T6.5mo (Supplementary Table S8), and three of these four were altered in the same direction and with a greater effect size at T6.5mo (d increased by 0.07–0.12). In addition, some of the miRNAs identified in this study have also been found in our previous work identifying miRNA biomarkers in pregnant women28 and in an ovine model27. T2wk exmiRNAs had the best concordance with the infant-outcome predictive model from our previously published maternal data28, while at T6.5mo the best concordance is from the lamb neonate27. These findings suggest a transition away from maternally-programmed miRNA alterations at 2 months to an adaptive growth and development driven profile at 6.5 months.

At T2wk, changes in exmiRNAs in the PAE group may be attributable to the direct effect of PAE on fetal organ and tissue development, whereas the differences in exmiRNA expression seen at T6.5mo may be due to the downstream effects of these in utero insults later in development and/or facilitate compensatory adaptations to the effects of PAE. For example, secreted miR-126-3p (MIMAT0000445) has been shown to promote angiogenesis74, which has, in turn, been shown, in pre-clinical models, to be inhibited by PAE75, whereas secreted miR-146a (MIMAT0000449) has been shown to suppress inflammation76, 77, which is induced in PAE models78. Further research using preclinical models will be necessary to determine the biological roles of these identified PAE-sensitive exmiRNAs, particularly as exmiRNAs expressed in concert can have effects that are different than a summation of the individual effects of each miRNA79. For instance, we previously identified exmiRNAs significantly elevated in maternal plasma from heavy alcohol exposed pregnancies28 and preclinically and found that these exmiRNAs together, but not individually, inhibited placental growth and maturation79.

In the newborn period, altered levels of exmiRNAs in the PAE group showed higher intercorrelated expression than was seen in the control group. The greater intercorrelated expression of exmiRNAs transcribed from specific chromosomes at T2wk suggests substantial chromatin reorganization in contributory cells and the emergence of new nuclear transcription factories as a consequence of PAE. Such factories, containing loops of chromatin from different chromosomes, are thought to be the functional unit within cellular nuclei for co-regulated transcription, including the regulation of multiple gene loci by a single transcription factor51, to adapt and reorganize to influence cellular differentiation states80. Recently, pro-inflammatory cytokines, such as tumor necrosis factor (TNF)-α, which are reportedly elevated in PAE mice following nerve damage81, have also been shown to induce the reorganization of transcription factories to promote transcription at miRNA-encoding genes82. It is also possible that a reactive network of cytokines coordinates exmiRNA secretion from multiple tissues, following PAE. Co-secretion of exmiRNAs from muscle, endothelial cells, and other tissues has been hypothesized to occur following muscle damage83 and may coordinate damage repair mechanisms that rely upon a number of tissues84. The much smaller number of PAE-related chromosomal intercorrelations observed at T6.5mo suggests that some direct effects of PAE on chromatin reorganization in these regions may be transient or altered by the effects of the early postnatal environment.

Mediation models indicated that at T2wk exmiRNAs associated with cell proliferation, skeletal development, hematopoiesis, and inflammation (F3) mediated the effects of PAE on weight and head circumference, after controlling for maternal smoking during pregnancy. At T6.5mo, exmiRNAs associated with immune system and inflammation (F2) mediated the effects of PAE on length, after controlling for maternal smoking during pregnancy. The majority (> 70%) of exmiRNAs within these factors did not exhibit sex differences (Supplementary Table S9), indicating that the mediation of the effects of PAE by these exmiRNAs likely occur in a sex-independent manner.

The apparent relevance of hematopoiesis-related exmiRNAs to PAE-related growth restriction is consistent with our prior work demonstrating that PAE was related to a 70% increase in the prevalence of iron deficiency anemia at age 6.5 months and that fetal alcohol-related growth restriction was markedly more severe among infants with iron deficiency anemia85, 86. In addition, in a longitudinal study we recently reported that effects of PAE on IQ, learning and memory, and cognitive flexibility in childhood were stronger in children with PAE-related postnatal growth restriction87. Studies in preclinical animal models have shown similar interaction effects between iron status, anemia, and PAE-induced growth and neurobehavioral deficits88. Intriguingly, this hematopoeisis related factor (T2wk F3) also partially mediated the effects of PAE on FTII infant visual recognition memory. FTII performance, which assesses basic information-processing skills involving encoding and retrieval, has been repeatedly shown to predict intellectual function in childhood89, whereas the more frequently used global Bayley Scales of Infant Development rely heavily on sensorimotor manipulation of objects and have poor predictive validity90. Thus, the observed mediation of effects on growth suggests that exmiRNAs during the newborn period may provide early biomarkers of vulnerability to long-term PAE-related intellectual impairment.

This study thus provides the first evidence that levels of exmiRNAs during early infancy can provide a biomarker of the effect of PAE on a measure of cognitive processing abilities in infancy that is predictive of school-age IQ. PAE-altered exmiRNAs found to play functional mediating roles in FASD may be targets for therapeutic intervention, using strategies such as antagomiRs and synthetic miRNAs as have been seen in clinical trials for therapies for cancer and hepatitis C91.

The differences in the exmiRNA patterns attributable to PAE at T2wk and T6.5mo suggest that these patterns evolve with postnatal development and may be modified by adverse postnatal environmental influences, such as maternal stress92, 93 or the transgenerational effects of early life stress93,94,95. A more detailed study of the stability of infant and child exmiRNA expression patterns will be needed to tease apart the relative contributions of PAE, development, and the postnatal environment. The generalizability of these exmiRNAs as biomarkers of PAE requires further validation studies to assess the robustness of these identified exmiRNAs, particularly as the best practices in exmiRNA analysis continue to evolve96, the duration of their expression, and preclinical studies to assess whether mediating exmiRNAs directly contribute to the PAE growth and neurodevelopmental deficits, as has seen with maternal exmiRNAs79, or are byproducts of the biological mechanisms underlying these deficits. The biological function of these exmiRNAs in the factors were determined based on our current understanding of gene function and miRNA targeting, and as our understanding evolves, additional biological functions may attributable to these factors. In addition, although our study was not specifically powered to address the impact of infant sex on exmiRNA expression profiles, bootstrap resampling analyses indicated more pronounced sex-segregated PAE-related differences in exmiRNA expression levels at T6.5mo than at T2wk. (Fig. 3). Follow-up studies specifically powered to ascertain the effect of infant sex on exmiRNAs following PAE are, therefore, warranted.

This study has additional limitations common to other longitudinal studies of PAE. Misclassification of some cases with regard to PAE may obscure some associations, but we have previously validated the timeline follow-back alcohol ascertainment protocol used in this study in relation to levels of fatty acid ethyl esters metabolites in meconium samples in this community21 and to infant and child behavior17, 24, 97, somatic growth87, and brain structure98,99,100 and function101. Some unmeasured confounders may have played a role in these findings, such as differences in unmeasured environmental exposures, such as household smoking, or lead or pesticide exposure. However, both the exposed and control groups were recruited from the same community, and socioeconomic status42 did not differ between groups, nor did weight gain during pregnancy or nutritional status across most micronutrients102.

## Conclusions

In summary, our findings suggest that alterations in exmiRNAs in infant plasma have the potential to serve as biomarkers of both PAE and of the effects PAE on growth and cognition. Biomarkers of the developmental effects of alcohol may facilitate detection of affected infants who do not display overt physical characteristics of PAE and referral of infants with FASD for interventions during early sensitive periods of development. These findings thus respond to the consensus statement issued by the American Academy of Pediatrics40 and the Interagency Coordinating Committee on Fetal Alcohol Spectrum Disorders highlighting the urgent need for new tools to aid in the identification and ultimately, the diagnosis of FASD. These findings build on our previous report that maternal exmiRNAs can predict FAS and PFAS to now show that infant exmiRNAs can provide direct indications of fetal damage. This study is the first to test directly whether exmiRNAs that discriminate between alcohol-exposed and non-exposed infants mediate effects of PAE on specific outcome domains, namely, growth and cognition. In summary, this study indicates the potential of miRNAs as biomarkers for predicting PAE-related developmental impairment and perhaps other developmental origins of health and disease.

Further validation studies in different infant populations will be needed to determine if exmiRNAs are a robust biomarker of effect.

## Methods

### Participants

The sample consisted of 68 Cape Coloured (mixed ancestry) mothers and their infants (37 heavy alcohol exposed, 31 healthy controls) from a larger prospective cohort87, 103. The Cape Coloured population historically comprised the large majority of workers in the wine-producing region of the Western Cape. The high prevalence of FAS in this community is a consequence of very heavy maternal drinking during pregnancy, due to poor psychosocial circumstances and that farm laborers were historically paid, in part, with wine104. Drinking, which is concentrated primarily on 2–3 days on the weekends24, continues to be a major source of recreation for many in urban and rural Cape Coloured communities9, despite numerous public health interventions to reduce pregnancy drinking.

The infants in this study were born to women who were recruited between 2013 and 2015, at their first antenatal visit, at one of two midwife obstetrical clinics that serve economically disadvantaged areas of Cape Town. Each mother was interviewed antenatally regarding her alcohol use using the ‘gold standard’ 2-week timeline follow-back interview17, adapted to reflect how pregnant women in this community drink. The interview included information about type and amount of each beverage consumed at time of conception and across pregnancy, as well as about container size (including pictures of different containers, bottles, cans, glass size) and sharing (by how many individuals), for use in calculation of amount of alcohol consumed. At recruitment, the mother was interviewed regarding incidence and amount of drinking on a day-by-day basis during a typical 2-week period at time of conception. Volume was recorded for each type of beverage consumed each day and converted to oz absolute alcohol (AA), using weights that reflect potency of AA in Cape Town (liquor0.4, beer0.05, wine0.12, cider0.06). The woman was then asked whether her drinking had changed since conception; if so, when the change occurred and how much she drank on a day-by-day basis during the past 2 weeks. Maternal exclusionary criteria were age < 18 years, HIV infection, multiple gestation pregnancy, and pharmacologic treatment for chronic medical conditions, including diabetes, hypertension, epilepsy, or cardiac problems.

Two groups of women were recruited: heavy drinkers, who consumed 14 or more standard drinks/week (1.0 oz AA/day, equivalent to 28 g or 30 ml AA/day) and/or engaged in binge drinking (4 or more drinks/occasion), and controls, who abstained or drank only minimally during pregnancy (83.9% abstained, 12.9% drank 1–2.5 drinks on 1–2 occasions, one consumed 3 drinks on 2 occasions around conception). The 2-week timeline follow-back interview was repeated at 4 and 12 weeks after recruitment. Data from the three alcohol consumption interviews were averaged to provide continuous measures of drinking across pregnancy: average oz AA consumed/day, AA/drinking day (dose/occasion), and frequency (days/week). Mothers were also asked about their drug use during pregnancy. Marijuana (“dagga”), cocaine, heroin, methaqualone (“mandrax”), and methamphetamine (“tik”) were measured in days/month; smoking, as cigarettes/day. All women who reported drinking during pregnancy were advised to stop or reduce their intake and were offered referrals for treatment, if requested. Infant exclusionary criteria were major chromosomal anomalies, neural tube defects, multiple births, very low birth weight (< 1500 g), GA < 30 weeks, and seizures.

Human subjects’ approval was obtained from the Wayne State University, the University of Cape Town Faculty of Health Sciences, and Texas A&M University Institutional Review Boards. All mothers provided written informed consent. All procedures were followed according to the relevant guidelines.

### MiRNA expression analysis

#### Sample collection

Venous EDTA-stabilized plasma samples were obtained from the infants at 2 weeks (T2wk) and 6.5 months (T6.5mo) postpartum by a trained phlebotomist, placed on ice, and centrifuged and decanted immediately to avoid contamination with intracellular miRNA. Approximately 1 ml of whole blood was drawn into an EDTA tube to ensure at least 500 µL plasma. The samples were stored at − 80 °C and subsequently shipped on dry ice to RCM’s laboratory for exmiRNA analyses (see below). Samples were obtained from 68 infants in total; 58 infants were sampled at both T2wk and T6.5mo and an additional 10 infants at T6.5mo only.

#### Sample preparation and quality control analysis

Total plasma RNA, including lipoprotein and protein bound and extracellular vesicle packaged, was isolated from 80 to 200 μl plasma using the miRNeasy Mini RNA isolation kit (Qiagen, Gaithersburg, MD) with 1.2 µg MS2 carrier RNA added per 200 µL of plasma (Roche Diagnostics, Indianapolis, IN). RNA concentration was determined using a NanoDrop ND-1000 spectrophotometer, and RNA samples were stored at − 80 °C prior to use.

Plasma miRNA content can be contaminated by miRNAs from lysed erythrocytes105. We controlled for possible erythrocyte contamination in three steps, as described elsewhere27, 28 (see Supplementary Fig. S1). First, the presence of free hemoglobin was assessed in the plasma samples, using absorbance at 414 nm which has been shown to be an indicator of hemolysis105 and has been used previously by our laboratory to assess plasma purity. Each sample was then assessed both for the presence of mRNA for the erythrocyte-specific band-3 membrane protein (SLC4A1) and for levels of erythrocyte-enriched miR-451a (MIMAT0001631) relative to miR-23a-3p (MIMAT0000078). SLC4A1 content was assessed with quantitative RT-PCR after qScript cDNA Synthesis Kit (Quantabio, Beverly, MA) using PerfeCTa SYBR Green FastMix (Quantabio) with gene-specific primers (forward: 5′-aacgagtgggaacgtagctg-3′; reverse: 5′-cttcatattcctcctgctccag-3′). RNA isolated from sheep red blood cells was used as a positive control. Three samples had 414 nm absorbance and miR-451a enrichment (ΔCT(miR-23a-3p -miR-451a)) over hemolysis-indicator thresholds (> 0.3 and > 7, respectively)43 and were excluded from subsequent miRNA expression analysis (one sample each for Control T2wk, PAE T2wk, and PAE T6.5mo). Therefore, miRNA data is derived from 56 samples T2wk and 67 samples at T6.5mo.

To examine the possibility that the RNA collection method contained inhibitors for cDNA synthesis, cDNA was synthesized from two independant RNA samples that were spiked with cel-miR-39-3p and UniSp6 RNA, according to the miRCURY LNA RNA Spike-in Kit. UniSP6 was added at 100 × higher concentration than cel-miR-39-3p. The RNA was then added into 20 μL cDNA synthesis reactions at increasing amounts (5 ng, 10 ng, 20 ng, 40 ng). These cDNA were then diluted 1:100 into the qPCR mix and the quantity of the spike-ins was assessed using appropriate primers (Exiqon/Qiagen). These analyses showed no evidence for the presence of endogenous inhibitors of cDNA synthesis and PCR amplification (see Supplementary Fig S1).

#### miRNA analysis

Using 25 ng of RNA input, cDNA was synthesized using the miRCURY LNA Universal RT cDNA synthesis kit (Exiqon/Qiagen). A standardized input of cDNA (40 µl) was diluted 110 × and then combined 1:1 with the ExiLENT SYBR green master mix (Exiqon/Qiagen). miRNA content was assessed using Human miRCURY LNA miRNA miRNome PCR Panels (V4; Exiqon/Qiagen). These arrays are contained in 2 X 384 well plates and assess 752 unique miRNAs. Quantitative PCR was then performed using Applied Biosystems 7900HT Fast Real-Time PCR System (Applied Biosystems/Thermo Fisher Scientific, Waltham, MA). Amplification data were visually inspected to ensure proper amplification of target miRNA. Interplate controls were assessed to determine equal performance of each panel. At T2wk, 8 panel I plates did not reach amplification criteria and the miRNAs from panel I, but not panel II, were excluded for these samples. CTs for each amplicon were determined using SDS2.4 software (Applied Biosystems/Thermo Fisher Scientific).

### Infant assessments

Mothers and infants were transported in a research van by our research driver and nurse to the Cape Universities Brain Imaging Center (CUBIC) for scanning and to our University of Cape Town Mother–Child Research Development Laboratory at 6.5 months postpartum for cognitive assessment. Weight, length, and head circumference were measured by our research assistants, who were trained by RCC, using standard WHO protocols106 at 2 weeks and 6.5 months (see87). Length-for-age, weight-for-age, and head circumference-for-age z-scores were calculated from measurements obtained at 2 weeks, 6.5 months, and at a diagnostic clinic in 2016 (see below). GA was based on early gestation ultrasound, which was available for 92.6% of the study participants, or date of last menstrual period.

#### Fagan test of infant intelligence

Visual recognition memory was assessed on the FTII107, which was administered by Master’s-level psychologists at the 6.5-month visit. The infant, seated on the mother’s lap, is shown identical photos of a human face and then a novel photo of another face paired with the familiar one. The normative response, preference for the novel stimulus, indicates the ability to recall the familiar stimulus and discriminate it from the novel one. The infant is administered 10 visual comparison problems consisting of pairs of faces. Infant fixation is recorded on a computer, and preference for novelty is computed by dividing duration of time looking at the novel stimulus by total time looking at the paired familiar and novel stimuli for each of the 10 problems. We have previously reported that the FTII is sensitive to PAE and that the specific effects of PAE on FTII are not seen in relation to other exposures, including smoking, cocaine, or marijuana72, 108.

#### FASD diagnosis

In 2016, we organized a clinic in which the infants, including the 68 in the present study (mean age = 1.7 yr, SD = 0.6; range = 0.9 to 3.1 yr), were examined for growth and FAS anomalies using a standard protocol10. Each child was independently examined by HE Hoyme (HEH), MD, an expert FASD dysmorphologist, and a second examiner trained by HEH (G De Jong, MD; H Bezuidenhout, MD; E Krzesinski, MD; RCC, MD). HEH, the other dysmorphologists, and SWJ, JLJ, RCC, and CDM, subsequently conducted case conferences to reach consensus regarding which infants met criteria for diagnoses of FAS or PFAS. Diagnosis was based on the 2005 Revised Institute of Medicine Guidelines10. FAS is characterized by microcephaly, growth retardation, and at least two of the distinctive craniofacial anomalies linked to fetal alcohol exposure: short palpebral fissures, flat philtrum, thin vermilion (upper lip). PFAS was diagnosed when at least two of these facial characteristics were present in conjunction with microcephaly or growth retardation and there was confirmed evidence of maternal drinking during pregnancy. Heavily exposed infants who did not meet criteria for FAS or PFAS were classified as nonsyndromal HE.

### Statistical analyses

#### Group difference and effect size estimates

CTs were computed for each expressed miRNA (expressed_miRNACT). The average CT of all expressed miRNAs in each sample was computed (averageCT), and the levels of each miRNA in each sample were expressed as ΔCT (expressed_miRNACT—averageCT). ΔΔCT, ΔCTPAE-ΔCTControl, is reported for comparison between groups, with negative values indicating higher expression in the PAE group and positive values indicating higher expression in the control group. For the 148 miRNAs whose expression was detected in at least 80% of the samples in at least one group at one age, miRNAs that were below the level of detection for the assay were assigned a CT value of 1 greater than the highest value of the given miRNA that was detected, in accordance with research that shows including these non-detected reactions as data points reduces bias109 and that best practices to replace these non-detected values with an experimentally relevant value96, 110. A low ΔCT value indicates higher level of miRNA expression, and as with CTs, a unit change in ΔCT, or ΔΔCT, of 1.0 indicates a twofold difference in miRNA expression.

Given the prevalence of smoking within this population (Table 1), we used an ANCOVA model to test for exposure group differences in mean expression of each miRNA, adjusted for average number of cigarettes smoked during pregnancy per day. These analyses were performed on the 148 miRNAs whose expression was detected in at least 80% of the samples in at least one group at one age. Given the relatively small sample size in relation to the number of tests run, Cohen’s d55, 111 effect size estimates of ANCOVA-adjusted mean expression were used to identify meaningful between-group differences. In Cohen’s d, which compares standard deviation differences between two groups, differences of 0.20, 0.50, and 0.80 are considered small, medium, and large effects, respectively. In this study we used a small-to-medium effect size cut-off of 0.40 to identify miRNAs on which the exposed and control groups differed. This criterion was selected because (a) 0.40 is considered the lower bound for a clinically meaningful effect size112; (b) an effect size of 0.40 corresponds to an odds ratio of 2.0, indicating twice the odds that the outcome will be observed in the exposed than in the control group113; (c) with an effect size of 0.40, based on Cohen’s111 U3 statistic, 66% of the participants in the exposed group will be above the mean for the participants in the control group; (d) using Rosenthal and Rubin’s114 binomial effect size display (BESD), an effect size of 0.40 indicates a 20% between-group difference on the outcome of interest. A large proportion of the effect sizes for the differences in miRNA expression identified using this approach was substantially larger than 0.40 (see Table 2, Supplementary Table S2). While Hedge’s g has been shown to have decreased bias for small sample sizes 115, our sample is sufficiently large that Cohen’s d and Hedges’s g vary by < 0.1% (Supplementary Table S2). When examining the population segregated by sex, in which the sample size for each group is smaller, the more conservative Hedges’s g effect size estimate is reported (Table 2).

#### Within-group correlation analysis

To assess the extent to which PAE resulted in coordinated expression of exmiRNAs, Pearson’s correlations between the expression levels of ΔCT values in all possible pairs of the 148 miRNAs were generated. Correlation plots were created for the control and alcohol exposed groups separately for each time point using the “corrplot” package116 implemented in R117 (see Supplementary Methods S10). We calculated p-values for each correlation, and correlations with p ≥ 0.05 were masked. Correlation matrices were ordered using two approaches—first, by a complete hierarchical clustering method that aims to identify highly correlated clusters in the measured variables; then, by using the chromosomal location of the miRNAs (ftp://mirbase.org/pub/mirbase/20/genomes/hsa.gff3). The number of significant correlations was reported for each of the infant groups at each time point. To test the stability of the pairwise correlations with p-values less than 0.05, we bootstrapped the estimates of correlation coefficients using sampling with replacement (see Supplementary Methods S10). Additionally, we reported the correlations between miRNAs with effect sizes greater than 0.40.

#### Confirmatory factor analysis

The PAE-responsive miRNAs (i.e., those with a group difference of Cohen’s d ≥ 0.40) were examined using CFA. CFA uses a simultaneous equation approach to estimate discrepancies between a population Σ(Θ) and a sample’s variance–covariance matrix S(θ), using an iterative procedure. Model fit is assessed using an omnibus chi-square test and three descriptive fit indices, among which the most preferred are Bentler’s CFI118, the TLI119, and Steiger and Lind’s120 RMSEA. Values of good model fit are indicated by fit-index values greater than 0.95121 and residual values less than 0.10122 and 0.08 for the other two indices, respectively123, 124. In light of the relatively small sample size in the present study, we adjusted the omnibus chi-square test value, and consequently the values of all other indices, using Bartlett’s125 correction using an R-function we developed for that purpose, which accounts for sample size and model complexity as shown below:

$${\text{Bartlett}'\text{s}}\;{\text{Corrective}}\;{\text{Factor}} = 1 - \frac{{4{\text{k}} + 2{\text{p}} + 5}}{{6{\text{n}}}}$$
(1)

with k being the number of latent variables, p the number of observed variables and n the sample size + 1. CFAs were run separately to examine the miRNA expression levels seen at each of the two ages. The expression levels of the miRNAs loading on each of the factors at each age were averaged within groups to provide the mean expression level for that factor.

#### Power for the CFA models

A Monte Carlo simulation of the CFA model at the first time-point, T2wk, was run by generating population data to test the sample size required to find significant factor loadings of specified minimum values and overall model fit indices. The specified parameters were: (a) factor loadings = 0.50, (b) factor means and variances at 0 and 1, respectively, for identification, (c) factor covariance at 0.50, standardized value, and, (d) item residual variances = 0.75. A model with n = 59 was run using 10,000 replicated datasets, as recommended126. Results indicated that coverage ranged between 92.9 and 94.0%. Power for the significance of the factor loadings ranged between 89.5 and 92.9%. Power for the standardized covariance estimate was 90.7%. Out of the 10,000 samples, the analysis converged with 9,989. Percentage bias of the estimated parameters was set to a maximum of 10% and should ideally be less than 5%126. Bias was calculated by subtracting the population value from the average parameter estimate, dividing by the population value, and then multiplying by 100. Results indicated that the largest bias was at 1.8%, which is well below the lowest acceptable standard (i.e., 5%). The results for the 3-factor model at 6.5 months (T6.5mo) are not reported here because with the exact same configuration, the sample size of n = 67 would be associated with even higher power and more stable parameter estimates in the estimation of the factor loadings and the between factor correlations. These simulation findings generally agree with an earlier simulation study127, in which sample sizes with between 50 and 70 participants were associated with nominal Type-I errors (of the chi-square test), proper coverage (> 90%), and stable parameter estimates (e.g., of factor loadings, item variances, etc.).

#### Internal consistency reliability

In light of the limitations in the estimation of Cronbach’s alpha and its highly restrictive assumptions (e.g., requiring essential tau equivalence), two indices directly estimable from the CFA model were utilized; namely, the omega coefficient128 and maximal reliability H129. The omega index128, 130, 131 is similar to alpha but has the advantage of allowing for heterogeneous item-latent variable correlations, thus enabling accurate estimation of measurement errors at the item level with congeneric measures (i.e., non-tau-equivalent). It is estimated as follows:

$$Omega = \frac{{\left( {\sum \lambda_{i} } \right)^{2} }}{{\left( {\sum \lambda_{i} } \right)^{2} + \sum Var(\varepsilon_{i} )}}$$
(2)

with λi being the factor loadings of item i and Σvar, the respective error variances of item i. Omega was adjusted for collinearity in the residuals as recommended by Wang and Wang132 following a procedure detailed in the Supplementary Methods S10. Maximal reliability H was estimated using an optimally weighted composite comprised of standardized factor loadings, which provided the advantage of allowing negative factor loadings to contribute meaningful variance to the latent construct133. It was estimated as follows134, 135:

$$H = \frac{{\sum \frac{{l_{i}^{2} }}{{1 - l_{i}^{2} }}}}{{1 + \sum \frac{{l_{i}^{2} }}{{1 - l_{i}^{2} }}}}$$
(3)

#### Ingenuity pathway analysis

Before examining the relation of each of the factors to the infant outcomes, we conducted Ingenuity Aathway Analysis (IPA, Qiagen) to identify core physiological functions of the miRNAs within each factor28. Briefly, for the exmiRNAs within each factor, the IPA miRNA Target Filter was used to identify target mRNAs with (a) high predicted confidence of interaction with PAE-sensitive exmiRNAs (context score < − 0.4, as defined previously137) or (b) experimentally validated miRNA/mRNA interactions. These targets were then subjected to IPA's Core Analysis workflow to identify known gene regulatory networks that are overrepresented amongst the predicted miRNA targets.

The target mRNAs for the miRNAs in each factor were then subjected to pathway enrichment analysis. In this analysis target mRNAs are assessed for overrepresentation in biological pathways against a reference data set list of genes to determine pathways that were significantly associated with the target mRNAs of the CFA-linked miRNAs compared to the universe of all possible pathways. This procedure allowed us to control for both experimental and curation-based biases in pathway overrepresentation. Pathways were considered overrepresented if they had a significantly higher proportion of the target mRNAs within the pathway than outside the pathway, determined using a right-tailed Fisher's Exact Test false discovery rate adjusted p-value of < 0.05. IPA’s comparison analysis tool was used to visualize and hierarchically cluster both canonical pathways and overarching disease and biofunction pathways for each timepoint.

#### Mediation analysis

Mediation analysis was performed to examine the degree to which effects of PAE on growth and cognitive outcomes were mediated by the exmiRNA expression levels of each of the factors generated by the CFAs at T2wk and T6.5mo138. Mediation by miRNAs of the effects of PAE on the outcomes (i.e., the indirect effect) was estimated as the product of two regression slopes, one predicting the miRNA expression levels in each factor from PAE and one predicting the outcome from the expression levels. The significance of each indirect effect was tested by use of a z-test and also bootstrapping as described below. All models tested for linear relations using a saturated regression analysis model. However, because the distribution of indirect effects is likely non-normal, particularly with modest sample sizes139, bias-corrected non-symmetric 95% confidence intervals were created using bootstrap standard errors from 10,000 replicated samples140. Those results agreed with the results from the z-test testing the significance of the indirect effect across all instances. Mediation of the effect of PAE on each growth and cognitive outcome by each of the factors generated by the CFAs at T2wk and T6.5mo was examined in separate models.

Power for each mediation model was estimated using a Monte Carlo simulation with paths being set to a standardized value of 0.40, the Cohen’s d effect size used for the initial group comparisons and by fixing item means and variances to 0 and 1, respectively. Based on the sample size at T2wk (n = 58) and 10,000 replications, coverage of the direct and indirect paths ranged from 93.7 to 94.5%. Power ranged from 80.4 to 85.3%. The largest parameter bias estimate was 0.005, which was well below 1% and far below 5–10%, the recommended cutoff values. Power was slightly greater for the larger sample at 6.5 months. All analyses were conducted using Mplus 8.1, except Bartlett’s correction for which we utilized R version 3.4.4.

### Ethics approval and consent to participate

Human subjects’ approval was obtained from the ethics committees at Wayne State University, the University of Cape Town Faculty of Health Sciences, and Texas A&M University. All mothers provided written informed consent.