Introduction

Childhood malnutrition is strongly associated with the risk of death from diarrhea, pneumonia, and other infectious diseases and is associated with growth failure, cognitive delay, and loss of productivity1. The main bacterial enteric pathogen in less-developed countries, particularly among children aged 2 to 24 months, is Escherichia coli2, 3. Among the major categories of diarrhoeagenic E. coli, enteroaggregative E. coli (EAEC) constitutes a significant health risk and remains an important cause of infant mortality in developing countries4.

Evidence suggests that gastrointestinal infections with several enteropathogens, including EAEC, are linked with childhood malnutrition5,6,7. EAEC has been increasingly recognized as an important enteropathogen since its initial discovery through its pattern of adherence to HEp-2 (human epithelial type 2) cells in E. coli isolates from Chilean children with diarrhea8. The virulence of EAEC has been reported to be mediated by a complex array of interacting traits that reside on both the chromosome and the plasmid9. As presently defined, EAEC is heterogeneous with regard to its genetic content10.

The pAA plasmid, harbored by most EAEC strains, encodes a transcriptional activator of the AraC/XylS class known as AggR11, which in turn regulates the expression of virulence proteins involved in adherence to mucosal secretions, thereby promoting intestinal colonization12, 13. AggR also activates the expression of the genes encoding dispersin (aap), the dispersin translocator Aat, the Aai chromosomal type VI secretion system (aaiA-Y), as well as the plasmid-borne aatA gene, which encodes an ABC (ATP-binding cassette) transporter14. In addition, the aar gene of EAEC encodes a small protein named Aar (AggR-activated regulator), whose expression is activated by AggR and which serves as a negative regulator of AggR15. Hence, there is a complex interplay between AggR and Aar in relation to the virulence of EAEC and its consequent pathogenesis16.

Environmental Enteric Dysfunction (EED) is a subclinical intestinal disorder that is highly prevalent in low- and middle-income countries (LMICs)17 and is attributable to infection by many environmental enteropathogens of bacterial, viral, and parasitic nature18, 19. Mechanisms contributing to growth failure in EED include intestinal leakiness and elevated gut permeability, gut inflammation, bacterial translocation, nutrient malabsorption, and systemic inflammation17. However, the definite contribution of different EAEC strains to linear growth faltering and EED remains poorly understood20.

The main goal of our study was to estimate the site-specific incidence rates of potential strain carrying genes of EAEC and their possible associations with the composite EED score and the consequent growth failure among children at 24 months of age.

Results

General characteristics of the study population and incidence rate of strain carrying genes of EAEC

A total of 34,622 monthly stool samples were collected from 1715 participants who completed the follow-up to 24 months. All the stool samples collected over this time from all the participants at the different study sites were assessed for the presence of strain carrying genes associated with EAEC using TaqMan Array Cards (TAC). The general characteristics of the study children are presented in Table 1.

Table 1 General characteristics of the MAL-ED study children (n = 1715) from Bangladesh, India, Nepal, Pakistan, South Africa, Tanzania, Brazil, and Peru from November 2009 to February 2012.

The incidence rates of the strain carrying genes of EAEC in the stool samples collected across all 8 study sites over the 24-month study period are shown in Table 2. The overall incidence rate was highest for aggR (43.3%). The incidence of the strain carrying gene Aar was highest in Tanzania (58.1%). The overall incidence of both aaiC and aatA strain carrying genes was the lowest across all the sites.

Table 2 The incidence rate of strain carrying genes of EAEC infection across each of the eight study sites (Bangladesh, India, Nepal, Pakistan, South Africa, Tanzania, Brazil, and Peru) from November 2009 to February 2012.

Factors associated with strain carrying genes of EAEC

Factors associated with strain carrying genes of EAEC across all study sites were identified using Poisson regression (Table 3). The incidence rate of EAEC infection in female children was comparable with that in male children. The following factors were statistically significantly associated with infection: household ownership of cattle, for the genomic strains of aaiC [IRR: 0.90 (95% CI: 0.83, 0.97); p = 0.010], Aar [IRR: 0.94 (95% CI: 0.88, 1.0); p = 0.042], and aggR [IRR: 0.94 (95% CI: 0.88, 1.0); p = 0.048]; improved floor, for the aaiC strain [IRR: 0.93 (95% CI: 0.87, 0.99); p = 0.029]; monthly income less than 150 USD, for the genomic strains of Aar and aggR; and maternal education in years, for the aaiC [IRR: 0.99 (95% CI: 0.99, 1.0); p = 0.030] and Aar [IRR: 0.99 (95% CI: 0.99, 1.0); p = 0.006] genomic strains.
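As a reading aid for the incidence rate ratios (IRRs) above: in a Poisson model with a single binary covariate and a log follow-up offset, the IRR reduces to the ratio of two crude rates, and a Wald interval follows from the event counts. The sketch below illustrates this with hypothetical counts, not study data. The function name and inputs are assumptions made for illustration, and the adjusted multivariable estimates in Table 3 cannot be reproduced this way.

```python
import math


def incidence_rate_ratio(events_exposed, time_exposed,
                         events_unexposed, time_unexposed, z_crit=1.96):
    """Crude IRR with a Wald 95% CI computed on the log scale.

    Equivalent to a Poisson regression with one binary covariate and
    log(follow-up) as the offset. Inputs are hypothetical counts.
    """
    rate_exposed = events_exposed / time_exposed
    rate_unexposed = events_unexposed / time_unexposed
    irr = rate_exposed / rate_unexposed
    # Standard error of log(IRR) for two independent Poisson counts
    se_log_irr = math.sqrt(1 / events_exposed + 1 / events_unexposed)
    lower = math.exp(math.log(irr) - z_crit * se_log_irr)
    upper = math.exp(math.log(irr) + z_crit * se_log_irr)
    return irr, lower, upper


# Hypothetical example: 90 detections over 1000 child-months in one group
# versus 100 detections over 1000 child-months in the reference group.
irr, lower, upper = incidence_rate_ratio(90, 1000, 100, 1000)
print(f"IRR = {irr:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

An IRR below 1 with an upper confidence limit below 1 corresponds to a statistically significantly lower incidence in the first group under this simplified model.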

Table 3 Factors associated with strain carrying genes of EAEC detection in monthly surveillance stool samples across each of the eight study sites (Bangladesh, India, Nepal, Pakistan, South Africa, Tanzania, Brazil, and Peru) from November 2009 to February 2012.

The incidence rate of aaiC and aatA positive strains was higher for the sites of India, Nepal, Peru, and Tanzania, while that of the genomic strain of Aar was the greatest in Tanzania. In contrast, the incidence rate of the aatA genomic strain was the lowest in Brazil and that of aggR was the lowest in Nepal, South Africa, Brazil, and Pakistan. For concomitant infection by the positive strains of aaiC and aatA, the incidence rate was greater in India, Nepal, Peru, and Tanzania compared to the Bangladesh site (Table 3).

Association between the strain carrying genes of EAEC and child growth

Infections with the strain carrying genes of EAEC, namely Aar, aatA, aggR, and the concomitant presence of aaiC and aatA strains, were associated with poor linear growth (difference in 24 months LAZ: length-for-age z score), with a stronger association being observed for all the study sites (Table 4). In Bangladesh, aaiC [−1.32 difference in 24 months LAZ (95% CI: −2.48, −0.16); p = 0.027] and both aaiC and aatA carrying genes [−1.36 difference in 24 months LAZ (95% CI: −1.73, −0.99); p < 0.001] had a negative association with LAZ. In Peru, the Aar strain [−0.96 difference in 24 months LAZ (95% CI: −1.84, −0.09); p = 0.030] was negatively associated with LAZ. In South Africa, aatA [−1.02 difference in 24 months LAZ (95% CI: −2.00, −0.05); p = 0.040] and, in Tanzania, aatA [−2.04 difference in 24 months LAZ (95% CI: −3.55, −0.53); p = 0.009] were negatively associated with LAZ. For the other four countries (South Africa, Brazil, Nepal and India), strain carrying genes associated with EAEC were negatively associated with LAZ but were not statistically significant.

Table 4 Strain carrying genes of EAEC and burden on child growth at 24 months across each of the study sites (except Pakistan) from November 2009 to February 2012.

Association between strain carrying genes associated with EAEC and enteric inflammation

After adjusting for potential covariates in the GEE model, namely age; sex; WAMI index (water/sanitation, assets, maternal education, and income); enrollment length-for-age z score; maternal BMI; the number of children in the household; presence of poultry/cattle in the household; seasonality; serum zinc level; AGP (alpha-1-acid glycoprotein); presence of co-pathogens (Campylobacter, LT-ETEC, ST-ETEC, Shigella/EIEC, and Giardia); site (for the overall estimate); and age as the time variable, the strain carrying genes of EAEC were clearly and consistently associated with increased EED score (Table 5) across all sites except Pakistan and Brazil. In the overall estimate, the presence of every strain carrying gene except aaiC was associated with the EED score. The same pattern was observed individually in India, Nepal, and Peru, where all strain carrying genes except aaiC were associated with the EED score. In Pakistan and Brazil, no statistically significant relationship was found for any of the strain carrying genes. In Tanzania, only the aaiC gene was not significantly associated with the EED score. The Aar and aggR strain carrying genes had a positive relationship with EED score among the children from Bangladesh. In South Africa, all genes except Aar and aaiC were significantly associated with EED score (Table 5). In Nepal and Peru, the concomitant presence of aaiC and aatA strain carrying genes had a statistically significant positive relationship with EED score.

Table 5 Association between strain carrying genes of EAEC and EED score across each of the eight study sites (Bangladesh, India, Nepal, Pakistan, South Africa, Tanzania, Brazil, and Peru) from November 2009 to February 2012.

Discussion

To our knowledge, this is the first study investigating the association of different strain carrying genes of EAEC with linear growth and enteric inflammation in children from birth until 2 years of age. Several EAEC strain carrying genes have been used in case–control and epidemiological studies in recent years, with aatA, aap, aggR, astA, and aafA being among the most common for EAEC diagnosis21,22,23.

We documented an overall high incidence rate of aggR gene carrying strains of EAEC across all eight study sites. The clinical and nutritional outcomes associated with EAEC infection, including poor child growth and development during the study period and changes in the status of intestinal inflammation, pose serious challenges for understanding pathobiology and proposing potential therapeutic approaches for EAEC8. In this scenario, EAEC strain heterogeneity is a key contributor to these effects, but very little is known about the individual genomic strains responsible8.

In our study, the overall incidence rate for the genomic strain of aaiC was the lowest across all the study sites. Previously, an incidence rate of 38% for the aaiC gene among children under 5 years of age was reported in the Global Enteric Multicenter Study (GEMS)10. The incidence of aaiC, aggR, aatA, Aar, and the concomitant presence of aaiC and aatA strain carrying genes was higher among the study participants in Tanzania. Similar findings were observed in the case of traveler’s diarrhea in Guatemala and Mexico24. Many plasmid-encoded (AAFs, AggR itself, Aap, and Aat) and chromosomal-encoded (pheU pathogenicity island) virulence determinants are regulated by aggR15. Reduced viability of infected G. mellonella larvae with an aggR mutant strain and elevated virulence of atypical EAEC strains were discovered in a study, suggesting that EAEC virulence is linked to the AggR regulon25.

The low incidence of the aaiC gene in our study compared to the other strain carrying genes of EAEC may be additional support for the plasmid-mediated gene transfer of the pAA. In several studies, the authors suggested that aggR was not a feasible virulence marker for the diagnosis of EAEC infections since this gene alone was not a considerably sensitive target in comparison to the aatA gene26, 27. Our study findings regarding aaiC being an incongruent marker for EAEC identification were found to be consistent with a study conducted in southern Mozambique23.

The association between strain carrying genes associated with EAEC and malnutrition is not entirely clear, although plausible models for pathogenesis can be proposed. The Aar gene, which has been hypothesized to act directly or indirectly as a virulence suppressor, was present among the children in Tanzania28. Similar findings have also been reported in Mali and Brazil, and these findings may strongly support a significant role for the Aar gene in the epidemiology of EAEC infection23. We also observed poor linear growth among the children from Peru who were infected with the genomic strain of Aar, which may point towards increased pathogenicity of the Aar gene.

The aggR gene mediates the expression of a large number of other EAEC genes responsible for virulence. The first genes found to be regulated by aggR were those encoding the aggregative adherence fimbriae (AAF)11. Our findings demonstrate an overall high incidence of the genomic strain of aggR across all the sites, and the linear growth of children was negatively associated with the presence of the aggR gene. However, the role of aggR of EAEC in child growth remains unknown.

Our study findings also illustrate that the concomitant presence of aaiC and aatA strain carrying genes of EAEC was negatively associated with childhood linear growth, except in Nepal and Tanzania. Other studies suggested that the presence of different virulence genes from within and outside the plasmid AA is necessary for complete EAEC virulence. This finding is in line with the findings of another concurrent study, where the diagnosis of EAEC without overt diarrhea was associated with length/height-for-age z score decrements and chronic inflammation in children from Brazil29, 30.

Our data show that the presence of improved floors in the house has a protective effect against EAEC infection by the aaiC genomic strain. We found that higher maternal education was protective against the aaiC and Aar genomic strains of EAEC. Several other studies conducted in the MAL-ED settings also found an association of these factors with EAEC and Campylobacter infection31, 32.

The presence of cattle in the household was associated with aaiC, Aar, and aggR strain carrying genes. Still, there is no evidence regarding the association of the presence of an animal in the household with EAEC infection. The transfer of Giardia lamblia genes, any E. coli virulence gene, and unique E. coli virulence genes from animal feces to the hands of the mothers upon animal handling were reported in a study conducted in rural Bangladesh. As a result, domestic animals played a significant role in the spread of enteric pathogens in households33.

Previous findings have reported that EAEC detection was associated with higher levels of MPO (a marker for intestinal inflammation), NEO (intestinal inflammation), and AAT (permeability) among all 8 sites of MAL-ED32. In our study, the overall concomitant presence of aaiC and aatA strain carrying genes was more strongly associated with increased EED score, implying higher intestinal inflammation. The relevance of elevated intestinal inflammation associated with virulence-related genes associated with EAEC is not yet clear and there is no evidence that virulence-related genes associated with EAEC were associated with elevated intestinal inflammatory biomarkers and subsequently an increased EED score. Hence, our study is the first attempt to generate an evidence-based understanding of the contribution of the different virulent genetic strains of EAEC to enteric inflammation and poor child growth in LMIC settings.

Despite the potentially promising nature of our findings, our analysis has several possible limitations. As this was an observational cohort study, the causality of the associations between infection with various genomic strains of EAEC and both intestinal inflammation and linear growth cannot be proven but can be hypothesized based on several factors, including the appropriate adjustment of the models for possible confounders, the strength and consistency of the associations, and the biological plausibility. We were unable to establish a temporal relationship between infections and the outcomes, which would require structured longitudinal models.

In conclusion, poor maternal education, lack of improved floor, and ownership of cattle in the household are possible risk factors for EAEC infections by different genomic strains, thereby leading to compromised linear growth in childhood. The burden of strain carrying genes associated with EAEC was associated with increased enteric inflammation among children in the first 2 years of life. EAEC virulence-related genomic strains of aaiC, Aar, aatA, and the concomitant presence of aaiC and aatA genomic strains had a stronger association with growth failure among children of Bangladesh, whereas the association with inflammation was strongest for aggR strain carrying genes.

Method

Study design and participants

MAL-ED (Etiology, Risk Factors, and Interactions of Enteric Infections and Malnutrition and the Consequences for Child Health) was a birth cohort study performed across eight sites in South America, sub-Saharan Africa, and Asia. The MAL-ED study design and methodology have been described elsewhere34. Briefly, 1715 children were enrolled from November 2009 to February 2012 from the community within 17 days of birth at all eight sites, namely: Bangladesh, India, Nepal, Pakistan, South Africa, Tanzania, Brazil, and Peru. In our current analysis, data from all 1715 participants were available from enrolment soon after birth up to 24 months of age.

Data collection

Anthropometric measurements were taken at monthly intervals up to the age of 24 months using standard scales (seca GmbH & Co. KG, Hamburg, Germany). Length-for-age z score (LAZ), weight-for-age z score (WAZ), and weight-for-length z score (WLZ) were calculated using the 2006 WHO standards for children35. Details of illness and child feeding practices were collected during twice-weekly household visits36. Additionally, household demographics, presence of siblings, maternal characteristics and other data on the child’s birth and anthropometry were obtained at enrollment34. Beginning at 6 months of age, socioeconomic data were collected every 6 months. The WAMI score (Water, sanitation, hygiene, Asset, Maternal education, and Income index, ranging from 0 to 1) is a socioeconomic status index that includes access to improved water and sanitation, eight selected assets, maternal education, and household income as a representative of the socioeconomic status of the households37. A better socioeconomic status is indicated by a higher WAMI score38. Improved water and sanitation were defined following World Health Organization guidelines39. Treatment of drinking water was defined as filtering, boiling, or adding bleach40.

Collection of stool and blood samples

Non-diarrheal stool samples were collected monthly (at least 3 days before or after a diarrhea episode) from birth to age 2 years, and venous/peripheral blood was collected at 7, 15, and 24 months of age41. Raw stool aliquots and blood samples were processed at all sites using harmonized protocols and stored in −80 °C freezers before subsequent laboratory analyses42.

In this study, plasma zinc was assessed as the measure of zinc status at the age of 7, 15, and 24 months. Plasma zinc concentration is a proxy marker and recommended for the assessment of population zinc status, especially for children in low-income countries43. Plasma alpha-1-acid glycoprotein (AGP) level was considered as a biomarker for systemic inflammation and was also assessed at 7, 15, and 24 months44.

Assessment of enteropathogens by TaqMan array cards (TAC)

Total nucleic acid (both DNA and RNA) was extracted from the fecal samples using the QIAamp Fast DNA Stool Mini kit (Qiagen), following the manufacturer’s guidelines. Two external controls, namely: MS2 bacteriophage and Phocine herpesvirus (PhHV) were added to the samples for the confirmation of nucleic acid extraction and amplification efficiency45.

For the detection of enteropathogens, quantitative polymerase chain reaction (qPCR) on a customized TaqMan Array Card (TAC), which uses compartmentalized probe-based real-time PCR assays, was used to detect up to 29 pathogens in each sample41. A Ct (quantification cycle) value of 35 was set as the threshold for analysis, whereby a Ct > 35 was considered negative, as described elsewhere45. The MAL-ED study investigated the occurrence of putative virulence-related genes (VRG) of EAEC, namely: aatA, aggR, Aar, and aaiC. To diagnose EAEC, the genes aatA (dispersin transporter protein) and aaiC (secreted protein) were targeted by PCR. Samples were considered positive for EAEC if either one or both of the two diagnostic genes were detected. Only EAEC-positive samples were further analyzed by multiplex PCRs to identify 20 EAEC VRG46. Moreover, cases positive for the concomitant presence of both aaiC and aatA genotypes were further analyzed for co-infection with the pathogens listed in this study, as described elsewhere42. At the MAL-ED Brazil site, the combinations of EAEC VRGs were compared among positive samples presenting both diagnostic genes (aaiC and aatA). This choice was based on the fact that only the samples presenting both genes were statistically associated with malnourished children46.
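The detection rule described above can be sketched in a few lines. This is a minimal illustration, not the study's analysis code: the function names are hypothetical, and representing a well with no amplification as `None` is an assumption made here for convenience.

```python
CT_CUTOFF = 35.0  # a Ct above this value is treated as negative


def is_detected(ct):
    """Return True when a target amplified with a Ct at or below the cutoff.

    `None` stands for no amplification (an assumption of this sketch).
    """
    return ct is not None and ct <= CT_CUTOFF


def classify_eaec(ct_aaic, ct_aata):
    """Classify one stool sample from the Ct values of the two diagnostic genes."""
    aaic = is_detected(ct_aaic)
    aata = is_detected(ct_aata)
    return {
        "aaiC": aaic,
        "aatA": aata,
        "eaec_positive": aaic or aata,  # either diagnostic gene is enough
        "concomitant": aaic and aata,   # both genes present in the same sample
    }


# Hypothetical Ct values for three samples
print(classify_eaec(28.4, None))
print(classify_eaec(30.0, 33.1))
print(classify_eaec(36.2, 40.0))
```

Keeping the per-gene flags alongside the combined flags makes it straightforward to tabulate single-gene and concomitant detections separately, as the analysis does.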

Assessment of biomarkers of intestinal inflammation

Intestinal inflammation was evaluated by measuring the levels of the biomarkers alpha-1-antitrypsin (Biovendor, Chandler, NC), neopterin (GenWay Biotech, San Diego, CA), and myeloperoxidase (Alpco, Salem, NH) in the stool samples collected from the study participants at 3, 6, 9, 15, and 24 months of age by quantitative ELISA, following the manufacturers’ guidelines34.

Statistical analysis

All statistical analyses were performed in STATA 15 (Stata Corporation, College Station, TX). Descriptive statistics such as proportion, mean and standard deviation (SD) for symmetric data, and median with inter‐quartile range (IQR) for asymmetric quantitative variables were used to summarize the data. Incidence rates were calculated using Poisson regression, where the outcome variable was the number of EAEC infections (for each strain carrying gene) and the offset variable was the log of the number of follow-ups. The factors associated with strain carrying genes of EAEC in the monthly stool samples were identified using Poisson regression models. In the final multiple Poisson regression model, the following variables were considered for inclusion using stepwise forward selection: child sex, birth weight, duration of exclusive breastfeeding in months, enrollment weight-for-age z score, length-for-age z score, maternal age in years, maternal education, mother having fewer than 3 living children, maternal BMI, routine treatment of drinking water, improved sanitation, household ownership of cattle/poultry, and fewer than 2 people living per room. We excluded children from the Pakistan site from the growth analysis, owing to bias noted in a subset of this cohort within the study period. Myeloperoxidase (MPO), neopterin (NEO), and alpha-1-antitrypsin (AAT) values were log‐transformed before the analysis. At each time point, the composite EED score, ranging from 0 to 10, was calculated from the three fecal markers, as described in the previous literature by MAL-ED researchers20, 47. For each marker, categories were assigned values of 0 (low), 1 (medium), or 2 (high). The formula for the composite EED score is as follows48:

$$\text{EED score} = 2 \times \text{AAT category} + 2 \times \text{MPO category} + 1 \times \text{NEO category}$$
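The formula can be illustrated with a short sketch. It assumes that each marker has already been assigned a category of 0, 1, or 2. The cut-offs used to form those categories are not restated here, so the function takes the categories as input, and the names used are hypothetical.

```python
# Weights from the composite EED score formula
WEIGHTS = {"AAT": 2, "MPO": 2, "NEO": 1}


def eed_score(categories):
    """Composite EED score from per-marker categories (0 = low, 1 = medium, 2 = high).

    `categories` maps each marker name to its category, e.g.
    {"AAT": 1, "MPO": 2, "NEO": 0}.
    """
    for marker in WEIGHTS:
        if categories[marker] not in (0, 1, 2):
            raise ValueError(f"{marker} category must be 0, 1, or 2")
    return sum(WEIGHTS[marker] * categories[marker] for marker in WEIGHTS)


# Hypothetical sample: medium AAT, high MPO, low NEO
print(eed_score({"AAT": 1, "MPO": 2, "NEO": 0}))
```

With every marker in the high category the score is 2 × 2 + 2 × 2 + 1 × 2 = 10, and with every marker in the low category it is 0, which matches the stated range of 0 to 10.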

Associations between strain carrying genes of EAEC and the composite EED score were estimated using generalized estimating equations (GEE) to fit regression models after adjusting for sex; age; the WAMI (water/sanitation, assets, maternal education, and income) index; maternal BMI; enrollment length-for-age and weight-for-age z scores; maternal height; poultry/cattle in the house; serum zinc level; the inflammatory biomarker AGP (alpha-1-acid glycoprotein); presence of co-pathogens (Campylobacter, LT-ETEC, ST-ETEC, Shigella/EIEC, and Giardia); seasonality; and site (for the overall estimate), with age in months as the time variable49. To assess and compare the associations of the burden of strain carrying genes of EAEC with growth at 24 months of age, we used multivariable linear regression after adjusting for the site and the necessary covariates. To detect multicollinearity, the variance inflation factor (VIF) was calculated, and no variable producing a VIF value > 5 was found in the final model. We calculated the strength of association by estimating the coefficient and its 95% CI (confidence interval). A p-value of < 0.05 was considered statistically significant during the multivariable analysis.
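As a rough illustration of how a coefficient, its 95% CI, and the significance threshold relate, the sketch below computes a Wald interval and a two-sided p-value from a coefficient and its standard error. It uses a normal approximation, which is an assumption of this sketch and may differ slightly from the software output. The input values are hypothetical and the function name is illustrative.

```python
import math


def wald_summary(coef, se, z_crit=1.96):
    """Wald 95% CI and two-sided p-value under a normal approximation."""
    lower = coef - z_crit * se
    upper = coef + z_crit * se
    z = coef / se
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return lower, upper, p_value


# Hypothetical coefficient (difference in LAZ) and standard error
lower, upper, p_value = wald_summary(-1.0, 0.5)
print(f"95% CI: ({lower:.2f}, {upper:.2f}), p = {p_value:.3f}")
print("significant at 0.05:", p_value < 0.05)
```

Under this approximation, a 95% CI that excludes zero corresponds to p < 0.05, which is why the intervals and p-values in the tables lead to the same conclusions.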

Ethical consideration

The study was approved by the ethical committees at each of the participating institutes across each of the eight study sites34. The study was approved by the Research Review Committee and the Ethical Review Committee, icddr,b (BGD); Committee for Ethics in Research, Universidade Federal do Ceara; National Ethical Research Committee, Health Ministry, Council of National Health (BRF); Institutional Review Board, Christian Medical College, Vellore; Health Ministry Screening Committee, Indian Council of Medical Research (INV); Institutional Review Board, Institute of Medicine, Tribhuvan University; Ethical Review Board, Nepal Health Research Council; Institutional Review Board, Walter Reed Army Institute of Research (NEB); Institutional Review Board, Johns Hopkins University; PRISMA Ethics Committee; Health Ministry, Loreto (PEL); Ethical Review Committee, Aga Khan University (PKN); Health, Safety and Research Ethics Committee, University of Venda; Department of Health and Social Development, Limpopo Provincial Government (SAV); Medical Research Coordinating Committee, National Institute for Medical Research; Chief Medical Officer, Ministry of Health and Social Welfare (TZH)32. Written informed consent was obtained from the parents or legal guardians of every child.