Maternal breast milk microbiota and immune markers in relation to subsequent development of celiac disease in offspring

The potential impact of the composition of maternal breast milk is poorly known in children who develop celiac disease (CD). The aim of our study was to compare the microbiota composition and the concentrations of immune markers in breast milk from mothers whose offspring carried the genetic predisposition to CD, and whether they did or did not develop CD during follow-up for the first 3 years of life. Maternal breast milk samples [CD children (n = 6) and healthy children (n = 18)] were collected 3 months after delivery. Enzyme-linked immunosorbent assays were used to measure TGF-β1, TGF-β2, sIgA, MFG-E8 and sCD14. For microbiota analysis, next generation (Illumina) sequencing, real-time PCR and denaturing gradient gel electrophoresis were used. Phylotype abundance and the Shannon ‘H’ diversity index were significantly higher in breast milk samples in the CD group. There was higher prevalence of the phyla Bacteroidetes and Fusobacteria, the classes Clostridia and Fusobacteriia, and the genera Leptotrichia, Anaerococcus, Sphingomonas, Actynomyces and Akkermansia in the CD group. The immunological markers were differently associated with some Gram-negative bacterial genera and species (Chryseobacterium, Sphingobium) as well as Gram-positive species (Lactobacillus reuteri, Bifidobacterium animalis). In conclusion, the microbiota in breast milk from mothers of genetically predisposed offspring who presented CD showed a higher bacterial phylotype abundance and diversity, as well as a different bacterial composition, as compared with the mothers of unaffected offspring. These immune markers showed some associations with bacterial composition and may influence the risk for development of CD beyond early childhood.

genetic risk for the development of CD 6 . De Palma et al. demonstrated that the type of milk feeding in relation to HLA-genotype played a role in establishing infant gut microbiota 14 . The HLA-DQ genotype may specifically influence the colonization process of Bacteroides species 3,15 . In addition, it has been observed that breast milk from CD mothers have lower levels of TGF-β1, sIgA and a reduced abundance of Bifidobacterium sp. and B. fragilis compared with healthy women 5 .
As the frequency of CD has increased we hypothesized that the immunological composition and microbiota of breast milk might differ between the mothers whose offspring carry a genetic susceptibility to CD, and developed CD relative to those who remain unaffected. The current study set out to characterize breast milk microbiota composition and its possible relationship with immunological markers (TGF-β1, TGF-β2, sIgA, MFG-E8, and sCD14) in mothers whose offspring presented or did not present CD during the first 3 years of life.

Results
Immune markers in breast milk from mothers of CD and control offspring. There were no significant differences between TGF-β1, TGF-β2, sIgA, MFG-E8, and sCD14 levels in the breast milk from mothers in the CD group and the breast milk from mothers in the control group ( Table 1).
The microbiome of the maternal breast milk. We applied Illumina sequencing of the 16S rRNA V4 region to reveal the full microbiome of these investigated samples. A total of 6,475,320 high quality reads were obtained, or 269,805 ± 239,377 reads per milk sample. The OTUs were classified into known taxa (7 phyla, 17 classes, 44 genera, and more than 90 species) and unclassified groups.
Phylotype abundance and the Shannon 'H' diversity index were significantly higher in milk samples from the mothers in the CD group than in the maternal samples in the control group (p = 0.016; p = 0.008, respectively) ( Table 2).
A principal coordinate analysis (PCoA) plot based on different taxonomic levels (phylum, class and genus) was generated to assess the relationships between the community structures of these samples. Phylum and class abundance data indicated significant inter-individual variability (Fig. S1A,B), while the PCoA plot of relative genus abundance demonstrated weighted clustering in the CD group (Fig. S1C).
Phyla. In both study groups, the phylum Firmicutes displayed the highest relative abundance (medians of 48.9% and 60.2% for the CD group and the control group, respectively) (Fig. 1A, Supplementary Table S2). In addition, Proteobacteria and Actinobacteria were also quite abundant (median 29.0% and 9.8% for the CD group vs. 23.6% and 8.1% for the control group). A borderline statistical significance was detected for the relative abundance of Bacteroidetes and Fusobacterium phyla (p = 0.056 and p = 0.048; Fig. 1A, Supplementary Table S2).
A principal coordinate analysis (PCoA) plot of relative abundance data of Lactobacillus and Bifidobacterium species indicated quite different results, with weighted clustering of Lactobacillus in the CD group ( Fig. 2A) and Bifidobacterium in the control group (Fig. 3A). No differences were found between the groups in quantitative counts for Lactobacillus or Bifidobacterium (Figs. 2D, 3D; Table S9). Milk samples in the CD group showed borderline significance for the lower relative abundance of L. fermentum when compared to the control group (p = 0.058, Supplementary Table S5).

Discussion
In this case-control study, we wanted to identify whether there were differences in the immunological composition and microbiota of breast milk between mothers whose genetically predisposed offspring developed CD during the first 3 years of life or remained unaffected. To the best of our knowledge, the present study was the first where maternal breast milk samples were analyzed for both immunological markers and for microbiota composition.
We demonstrated that, compared to milk samples from mothers of unaffected offspring, the human milk microbiota in the mothers in the CD group had a higher phylotype abundance and diversity, with a higher abundance of the phyla Bacteroidetes and Fusobacteria, the classes Clostridia and Fusobacteria, as well as the genera Leptotrichia, Anaerococcus, Sphingomonas, Actinomyces, and Akkermansia. Increased relative abundance for species such as A. odontolyticus, A. hydrogenalis, A. octavius, and F. prausnitzii, and decreased abundance for L. fermentum were also observed in the CD group. We observed an association between certain immune markers (TGF-β1, TGF-β2, MFG-E8, and sCD14) and some Gram-negative (Chryseobacterium, Sphingobium, and A. muciniphila) and Gram-positive bacteria (L. reuteri and B. animalis).
Breast milk does not contain abundant culturable microbiota. However, some next generation sequencing studies have revealed Streptococcus, Staphylococcus, Gemella, Veillonella, Rothia, Lactobacillus, Propionibacterium, Corynebacterium, and Pseudomonas in milk samples from healthy woman 8,16 . In the current study, the most abundant genus in all milk samples was Lactobacillus, followed by Streptococcus, Staphylococcus, Moraxella, Acinetobacter, Enterobacter, and Corynebacterium. Some studies have indicated that the primary bacterial taxa in milk may vary across population, suggesting that geographic, genetic, and dietary factors could be influencing microbial diversity in breast milk. In Finland, Leuconostoc, Weisella, and Lactococcus were the most predominant genera in milk samples 17 , while in Mexican-American mothers the predominance of Streptococcus, Staphylococcus, Xanthomonadaceae, and Sediminibacterium was observed 18 .
Breast milk is a major source of bacterial diversity for the neonatal gut, including gut-associated obligate anaerobes, and therefore significantly influences gut colonization and the maturation of the immune system 6,19 . Different genetic and epigenetic factors may influence host microbiome interactions and shift the immune balance in subjects with CD 20 . In case of the intestinal microbiome, higher microbial richness and diversity values are considered to be more health-supporting 21 . However, this may not be true in case of some other biotopes (for example, the reproductive tract) 22 . Quite unexpectedly, we found that samples taken from mothers whose children developed CD had a higher phylotype abundance and Shannon diversity index than healthy samples. It is interesting that a recent study by Huang et al. also demonstrated a significantly higher alfa diversity in fecal samples from CD progressors compared to healthy controls at an age of one year 23 . This indicates that the www.nature.com/scientificreports/ increase of a few dominant bacteria would be expected in cases of infection and suggests that milk microbiota is not activating an immune response in the host. Unfortunately, in the current study inflammatory markers were not measured. In contrast, it has been shown that in case of allergy, feeding breast milk with lower bacterial diversity during the first month of life was related to an increased risk of allergy in childhood 24 . We cannot rule out the influence of anticommensal antibodies on microbiota composition, which may differ even between healthy individuals and may have an impact on microbiome composition 25,26 .
Olivares et al. 6 showed that the breast milk of mothers with CD demonstrated significant reductions in the counts of Bifidobacterium spp. and Bacteroides fragilis in comparison to healthy mothers. In the present study we observed a significantly increased relative abundance of class Clostridia in the milk samples from the CD group. We also found differences in the relative abundance of the genus Anaerococcus (incl. A. hydrogenalis, A. octavius) and F. prausnitzii. The presence of A. octavius was reported previously in breast milk samples from healthy Spanish women. In the same study, an association was shown between the amounts of proteins in milk and Anaerococcus 27 . F. prausnitzii is the major butyrate producer in the gastrointestinal tract. It has antiinflammatory properties important for gastrointestinal microbiota homeostasis and is associated with a range of metabolic processes in the human mucosa 28 . A Spanish study showed that the adherence to a gluten-free  www.nature.com/scientificreports/ diet resulted in slight microbiota shifts in adults, including a decreased abundance of F. prausnitzii 29 . As F. prausnitzii produces the anti-inflammatory short-chain fatty acid butyrate 30,31 , this species may have a key role in maintaining the integrity and function of the mucus barrier, representing an important aberrant mechanism in the development of CD. We also found that the prevalence of the genus Akkermansia was higher in the CD group samples. Interestingly, unclassified Akkermansia species were found only in that group. A study by Huang et al. observed the Akkermansia genus only in the control group and not in CD progressors 23 . However, they analyzed fecal samples and not breast milk samples. In contrast, some experiments in mice have demonstrated that a gluten-free diet can induce changes in the intestinal microbiota by increasing the number of Akkermansia 32,33 . It can be speculated that this species may translocate from its normal habitat (intestine) to an unusual habitat (breast) in cases of CD.
An increased relative abundance for the genus Actinomyces and for a species of this genus, A. odontolyticus, were observed in the CD group. Mucosal membranes of the oropharynx, gastrointestinal tract and female genital tract represent the normal habitat of Actinomyces 34 . They have also been observed to associate with breast infection 35,36 . Although being low in virulence, they may rely on the presence of co-pathogens, such as anaerobic bacteria, to enhance pathogenicity 37 . Previously, A. graevenitzii was identified in biopsy specimens from the proximal small intestine of children with CD, and an unidentified Actinomyces sp. was shown to be attached to the epithelial lining 38 .
For detection of Lactobacillus and Bifidobacterium, we applied two methods, next generation sequencing and PCR-DGGE, since Smidt et al., demonstrated the partial concordance of different molecular methods for the detection of lactic acid bacteria 39 . In our study, PCR-DGGE appeared to be more sensitive for the identification of Bifidobacterium sp., while more than 80% of Bifidobacterium sp. detected by Illumina sequencing were unclassified.
In line with our findings, other studies have also shown the presence of Lactobacillus sp. and Bifidobacterium sp. in breast milk 6,40 . A low relative abundance of L. fermentum was observed in breast milk in the CD group. The expected pathway by which lactobacilli may get into milk is entero-mammary transport through dendritic cells 41 . Production of short-chain fatty acids by lactobacilli strains may play a role in the pathogenesis of CD. L. fermentum produces both lactic and succinic acids, which in turn modulate the key functions of many main players in the innate immune response 42 . They mediate macrophage dependent anti-inflammatory effects 43 and possess antioxidative capacities 44 . Additionally, it was demonstrated that L. fermentum isolated from breast milk may have an immunostimulatory effect by inducing proinflammatory cytokines in rodent bone marrow derived macrophages 40 . In vivo assays in mice revealed that the ingestion of L. fermentum enhanced the production of Th1 cytokines by spleen cells and increased the IgA concentration in feces 45 . Likewise, Lactobacillus strains have high amylase trypsin inhibitor-metabolizing capacity to ameliorate many adverse effects induced by wheat immunogenic proteins 46 . Accordingly, one may assume that L. fermentum might protect against the development of CD.
There were no significant differences in the levels of the immunological markers analyzed between the CD and the control group, but some significant associations with milk microbiota were observed. The presence of Chryseobacterium and Sphingobacterium (both being Gram negative environmental bacilli) correlated inversely with breast milk TGF-β2 levels, while A. muciniphila (anaerobic mucin-degrading bacterium) showed a positive correlation with both TGF-β1 and TGF-β2. TGF-β belongs to a growth factor family that is responsible for the development and maturation of the mucosal immune system 47 . In our study, we also observed that higher counts of L. reuteri and lower counts of B. animalis in breast milk were associated with higher sCD14 concentrations. The latter bacterial species were also associated with higher concentrations of MFG-E8. Vidal et al. demonstrated that milk sCD14 acts as a "sentinel" molecule and immune modulator in homeostasis and in the defense of the neonatal intestine 48 . Supplementation of maternal breast milk with the probiotic L. reuteri increased the TGF-β2 concentration level in breast milk and decreased sensitization in breast-fed infants 49 . In addition, the Bacilli and Flavobacteriia classes correlated positively with the concentration of MFG-E8, which has been shown to have antiviral effects and to protect against rotavirus both in a cultured cell model as well as in infants [50][51][52] . Furthermore, it prevented the tissue damage caused by prolonged inflammation by clearing apoptotic cells and thereby facilitating immune resolution 53 .
Our study had many strengths. All study subjects in both countries were monitored and diagnosed with identical protocols. All CD diagnoses were confirmed by a biopsy of the small intestine. Although patients came from two different countries, all the analyses were carried out in one laboratory. Two different methods in parallel were applied for investigating Lactobacillus sp. and Bifidobacteria sp. in breast milk microbiota.
Our study also had some limitations. The number of cases was quite small, limiting the statistical power of this study. The mothers in the CD group were younger than the mothers in the control group. When interpreting the results this may be important because it is known that maternal age may influence milk composition and microbiota 54 . The children were observed for only 3 years and we lack information about possible new CD cases beyond that age. As we analyzed only a small number of breast milk samples taken at the age of 3 months, larger longitudinal studies are needed for a better understanding of possible differences between these two groups.
In conclusion, this study provides new insights into the complex breast milk microbiota composition in mothers whose offspring carry a genetic susceptibility to CD but have different disease outcomes. The breast-milk microbiota of mothers of children who developed CD differed in terms of higher bacterial phylotype abundance and diversity, as well as in relation to bacterial composition when compared to the mothers of unaffected children. The immunological markers were differently associated with bacterial composition and could also influence the risk of development CD in later life. Future studies are needed to reveal whether differences in breast milk composition ultimately influence the development of CD.  55 . For the current analysis, samples and data from Estonia and Finland were used. This study was conducted in accordance with the Declaration of Helsinki and local ethical committees respectively approved this study (Ethics Review Committee on Human Research of the University of Tartu in Estonia and Ethical Committee for Psychiatric Diseases and Diseases in Children and Adolescents, Helsinki and Uusimaa Hospital District in Finland). Written informed consent was obtained from all parents for the participation of their child in this study.
Children who were followed from birth to the age of 3 years and those who later developed CD (n = 6) were included in our analysis. The diagnosis of CD was confirmed according to the European Society for Pediatric Gastroenterology, Hepatology and Nutrition 2 , including the abnormal morphology of small intestinal biopsy in accordance with the Marsh classification modified by Oberhuber 56 . From the same DIABIMMUNE cohort we selected 18 controls based on the CD-specific HLA DR/DQ genotype, country of birth, time of birth and sex of each patient.
Five children in the CD group were exclusively breastfed, while one child was mostly bottle-fed during the time of breast milk collection. The corresponding numbers in the control group were 15 and 3. One CD patient and his controls were from Estonia, while all the others were from Finland. The proportions of boys and girls were equal in the CD and control groups. All offspring were HLA DQ 2.5 positive (two CD patients were HLA DQ 2.5 homozygotes). Each infant was full-term (gestational age 37-42 weeks) and born as a singleton. None of the mothers or fathers of the offspring had CD. The mean age was 2.4 years (range 1.5-3.0 years) at seroconversion to positivity for IgA-class tissue transglutaminase antibodies in the CD group. General characteristics of the CD and the control group children are summarized in Table 4.
Collection and processing of samples. Breast milk samples were collected from mothers in both groups 3 months after delivery. The milk samples were obtained by manual expression into sterile bags. Samples were immediately put into a −20 ºC freezer and further stored at −80 ºC until processing. Breast milk samples were thawed and centrifuged for 15 min at 1000g at 4 ºC and after that the aqueous fraction was collected. The latter was used for further analyses. Centrifugation was repeated twice.
All breast milk analyses were carried out in the Department of Immunology and Department of Microbiology at Institute of Biomedicine and Translational Medicine, University of Tartu, Estonia.
Molecular analyses. DNA extraction. Bacterial DNA from human milk was extracted using the Pure-Link™ Microbiome DNA Purification Kit [Invitrogen, using an ELMI SkyLine instrument (ELMI Ltd., Riga, Latvia)]. Human milk samples were centrifuged at 4000g for 30 min, after which the supernatant was removed. Cell pellets were washed with phosphate-buffered saline, centrifuged at 13,000g × 3 min at room temperature and the pellets were resuspended in a kit-specific lysis buffer. The protocol was then continued as described by the manufacturer (Invitrogen, USA). Table 4. General characteristics of the celiac disease (CD) group and the control group (*-data missing for one child; **p = 0.002). www.nature.com/scientificreports/ Illumina sequencing. 16S library preparation was performed using an in-house sequencing protocol with the V4 (F515/R806) primer pair 57 . Sequencing was performed using an iSeq 100 (SN FS10000643, Illumina) and an iSeq 100 i1 Reagent kit (Illumina) in the 2 × 150 bp mode and using the dual index setup with two-read sequencing protocol. 16S rDNA sequence data were analyzed using BION-meta (https:// www. box. com/ bion), according to the author's instructions and in-house scripts. First, sequences were cleaned at both ends using a 99.5% minimum quality threshold for at least 18 of 20 bases for the 5′-end and 28 of 30 bases for 3′-end. The reads were then joined, followed by the removal of read pairs shorter than 150 bp. Then, sequences were cleaned from chimeras and clustered using a 95% oligonucleotide similarity (k-mer length of 8 bp and a step size of 2 bp).
Lastly, consensus reads were aligned to the SILVA reference 16S rDNA database (v123) using a word length of eight and similarity cut-off of 90%. The bacterial designation was analyzed at different taxonomic levels down to the species when applicable.
Primers and probes. Denaturing gradient gel electrophoresis, sequencing of DGGE amplicons and real-time PCR were performed with primers and probes listed in supplementary Table S1 58-61 .
Real-time PCR. In order to establish a quantitative assay, plasmid standards were generated using the method described in Bartosch et al. 62 . The amplified 16S rRNA gene region (amplified with specific primers) from B. longum DSM20219 and L. acidophilus ATCC4356 was cloned into chemically component E. coli JM109 cells using the pGEM-T vector system (Promega, Madison, WI, USA). Plasmids were purified with NucleoSpin Plasmid QuickPure kit according to the manufacturer's instructions (Macherey-Nagel, Düren, Germany). Multiple dilutions of purified plasmids were quantified by spectrophotometry (NanoDrop ND-1000, Thermo Fisher Scientific, Waltham, MA, USA). Quantification of target DNA was achieved by using serial tenfold dilution from 10 5 to 10 1 plasmid copies of previously quantified plasmid standards. Multiplex TaqMan assay PCR reactions were performed in a total volume of 25 µl using TaqMan ® Universal PCR Master Mix (Applied Biosystems, Foster City, CA, USA). Each reaction included 5 µl of template DNA, 12.5 µl of TaqMan ® Universal PCR Master Mix (Applied Biosystems), 400 nM of (F_allbif_IS and R_alllact_IS primers), 600 nM of (F_alllact_IS and R_alllact_IS primers), 900 nM of (F_Eub and R_Eub primers) and 100 nM both (P_alllact_IS, P_alllbif_IS) and 100 nM (P_Eub) of probes. The real-time PCR conditions consisted of an initial denaturation step at 50 °C for 2 min and then 95 °C for 10 min, continued with an amplification step followed by 40 cycles including denaturation at 95 °C for 15 s, and an annealing-elongation step at 60 °C for 1 min. Amplification and detection of DNA samples by real-time PCR was performed with a 7500 Fast Real-Time PCR System (Applied Biosystems Europe BV, Zug, Switzerland) using optical-grade 96-well plates. Data from triplicate samples were analyzed using Sequence Detection Software version 1.6.3 (Applied Biosystems). The DGGE analysis of PCR amplicons was performed using a Dcode™ System apparatus (Bio-Rad, Hercules, CA, USA). Polyacrylamide gels 8% [wt/vol] involving acrylamide-bisacrylamide [37.5:1] in 0.5× Tris-acetic acid-EDTA buffers with a denaturing gradient was prepared with a gradient mixer and Econopump (Bio-Rad). Gradients from 30 to 60% were employed for the separation of the products amplified with specific primers for Lactobacillus spp. and from 45 to 60% for the products amplified with primers specific for Bifidobacterium spp.
Sequencing of DGGE amplicons. PCR amplicons (Bif164-r and Bif662-f) and (Lac-1 and Lac-2) were purified and concentrated with a QIAquick PCR purification kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Purified amplicons were then cloned into the E. coli JM 109 strain using the pGEM-T vector system (Promega, Madison, WI, USA). Colonies of ampicillin-resistant transformants were randomly picked from each sample and were subjected to PCR with the pGEM-T specific primers T7 and SP6 (Supplementary  Table S1) from lyzed cells to check the size of inserts. Plasmid DNA from selected transformants was isolated using a QIAprep Spin Miniprep kit (Qiagen).
Sequencing reactions were performed using the BigDye Terminator CA v3.1 Cycle Sequencing kit (Applied Biosystem) according to the manufacturer's instructions. The sequences obtained were analyzed using an automatic LI-COR DNA Sequencer 4000L (Licor, Lincoln, NE, USA) and were corrected manually. All of the sequences were thereafter identified using BLASTN and the NCBI nucleotide database.
Statistical analysis. Statistical analysis of clinical and molecular data was performed with the R software for Windows 2018 (R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria). The Shapiro-Wilk normality test was used to determine whether the data had a normal distribution. Comparisons between the CD group and the control group were performed using a t-test for normally distributed data and with the Mann-Whitney-Wilcoxon test for data with skewed distribution. To compare the number of positive cases between groups, Fisher's exact test was used. Correlations were analyzed with a non-parametric Spearman's rank-order correlation test. Alpha diversity based on the Shannon index of the OTU level. Analysis of beta-diversity of CD and control groups including PCoA of weighted UniFrac distances were performed and visualized using PAST 4.0 version. Statistical significance was accepted as p < 0.05, adjusted for ties.

Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.