Introduction

Breastfeeding is the first form of infant feeding, as it allows newborns to assimilate all the nutrients needed in the first 6 months of life (González et al., 2013). Essential for initial development (Fernández et al., 2013), human milk is able to influence the development of the immune system and helps establish gut microbiota by its components, such as oligosaccharides, acting naturally as prebiotics (Fernández et al., 2013).

The composition of human milk differs according to the timing of lactation. Colostrum is generated immediately after delivery until the fifth/sixth day of life and is very rich in nutrients and bioactive factors, containing proteins, mineral salts, oligosaccharides, antibodies, cytokines, lysozyme and complement factors (Ballard and Morrow, 2013). From 5 days to 2 weeks postpartum, the transitional milk occurs (Ballard and Morrow, 2013). Its function is to support the nutritional and developmental needs of growing infants, and for this reason, it is rich in lactose, calcium, lipids and glucids (Ballard and Morrow, 2013). One month after childbirth, the human milk achieves a standard composition, known as ‘mature milk’. This is characterized by a lower percentage of proteins and minerals and a greater richness in lipids and carbohydrates compared with colostrum (Ballard and Morrow, 2013).

In the past decade, there has been an increased interest in the study of microbiota in human milk. More than 200 different species belonging to 50 different genera have been described in human milk samples (Hunt et al., 2011). Today it is recognized as a potential source of probiotic bacteria, such as streptococci, lactobacilli and bifidobacteria (Martín et al., 2003), which have been observed to have a pivotal role in the first stage of initial neonatal gut colonization (Fernández et al., 2013), allowing babies to ingest between 1 × 10^5 and 1 × 10^7 bacteria daily (Heikkilä and Saris, 2003).

The mature milk microbiota appears around 1 month after delivery, being influenced by hormonal signaling, lifestyle and diet. This results in significant changes to the gut microbiota of pregnant women (Chapman and Nommsen-Rivers, 2012) in different geographic areas.

The human milk microbiota comes from several body sites, probably including the maternal gut microbiota (Fernández et al., 2013), through an entero-mammary pathway, as well as a certain degree of retrograde flow back into the mammary ducts that can occur during nursing (Hunt et al., 2011). It has been hypothesized that intestinal bacteria could translocate to the mother’s bloodstream, reaching the mammary ducts. However, to date it is not clear how and when intestinal microorganisms colonize the mammary epithelium. Interestingly, mature milk bacteria have been observed to reduce respiratory and diarrheal infections in early infancy, and also to have a protective role toward breastfeeding women (Fernández et al., 2013; Bergmann et al., 2014).

The aim of this study was to evaluate the microbiota network of colostrum and mature milk, and the differences between them, in mothers living in two completely different sites and environments, namely Italy and Burundi. In addition, we performed a data-mining evaluation with advanced mathematical systems to assess the pattern of commensal clustering using the auto-contractive map (AutoCM).

Subjects and methods

Study population

This study involved mother–newborn pairs recruited from hospital postnatal wards in the Republic of Burundi (Hospital of Ngozi) and in Verona, Italy (Policlinico GB Rossi).

Participants were selected if they met the following inclusion criteria: (a) healthy infants, (b) willingness to comply with the study protocol, and (c) postpartum colostrum collection within 3 days of birth. Exclusion criteria included: (a) significant maternal or infant illness or major birth defects, (b) mothers taking immunosuppressive agents, and (c) mothers taking antibiotics during lactation.

Mothers in Verona represent a western population living in an environment typical of European countries. Their diet was rich in calories, with a high proportion of animal proteins, sugar and fat and a low proportion of fibers, vegetables and fruits. In contrast, the mothers from Burundi lived in small villages and came to the hospital only to give birth. The diet of the participating Burundian mothers living in these small rural villages consisted mainly of cereals, legumes and vegetables, being rich in fibers and poor in animal proteins, animal fats and sugar.

The sample collection and investigation were conducted following ethical approval by separate committees in the two participating hospitals in accordance with Italian standards (Ethical Committee of the Azienda Ospedaliera di Verona, Italy, Approval No. 1288).

Informed consent was obtained from all subjects.

Collection and processing of milk samples

One sample of colostrum and one sample of mature milk were collected from each mother.

Colostrum was collected within 3 days postpartum, and mature milk at 1 month of life. All mothers were given two sterile plastic tubes in which they collected colostrum and mature milk. Samples were collected after the mother’s hands and areola area were cleaned using a preservative-free soap to allow a deep bacterial decontamination. Samples were initially stored in a fridge and frozen to −20 °C within 2 h of expressing. Samples from Burundi were transported to Verona under controlled conditions. Within the laboratories in Verona, the samples were thawed and transferred into plastic microcentrifuge tubes (Eppendorfs, Milan, Italy) and centrifuged at 1500 g for 15 min at 4 °C to separate the fat and the aqueous phase.

Bacterial DNA extraction and 16S gene sequencing

Total DNA was extracted from colostrum and mature milk samples using a Milk DNA Extraction Kit (Norgen, Thorold, Ontario, Canada) following the manufacturer’s instructions. The protocol included the specific binding of DNA to the QIAamp silica-gel membrane while contaminants pass through (Salonen et al., 2010). DNA amplification and gene sequencing were performed as previously described (Drago et al., 2016).

The auto-contractive map

System biology inherent to human–microbe mutualism is related to the collection of large amounts of data per single subject, and complex mathematical networks can help us in establishing the hierarchy of variables within a specific set. We have adopted the AutoCM to illustrate this.

The AutoCM system is a fourth-generation unsupervised artificial neural network (ANN), which has already been demonstrated to outperform several other unsupervised algorithms in a heterogeneous class of tasks (Buscema and Sacco, 2016). AutoCM is able to highlight the natural links among variables with a graph based on minimum spanning tree theory, where distances among variables reflect the weights of the ANN after a successful training phase (Buscema and Grossi, 2008a; Buscema et al., 2008b; Buscema and Sacco, 2010). The AutoCM system finds, by a specific learning algorithm, a square matrix of ‘similarities’ (weights, mathematically speaking) among the variables (in this case, microbial abundances) of the data set. Once the AutoCM weight matrix is obtained, it is filtered by a Minimum Spanning Tree (MST) algorithm (Kruskal, 1956; Fredman and Willard, 1990). Among the huge number of possible ways to connect the variables in a tree, the MST shows the shortest possible combination. In the MST, in fact, every link able to generate a cycle in the graph is eliminated, irrespective of its strength of association, and this results in a simplified graph. A graphical description of the MST concept is provided in Supplementary Appendix S1. The assumption is that, as all biological systems tend naturally toward minimal energetic states, this graph expresses the fundamental biological information of the system. The ultimate goal of this data-mining model is to discover hidden trends and associations among variables, as this algorithm is able to create a semantic connectivity map in which non-linear associations are preserved and explicit connection schemes are described. This approach shows the map of relevant connections between and among variables and the principal hubs of the system. Hubs can be defined as variables with the maximum number of connections in the map.
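The MST filtering step described above can be illustrated with a minimal Python sketch of Kruskal's algorithm. This is not the AutoCM implementation used in the study: the genus labels and similarity weights are invented for illustration, and converting a similarity w into a distance as 1 - w is an assumption made here only so that stronger associations give shorter links.

```python
from itertools import combinations


def minimum_spanning_tree(names, dist):
    """Kruskal's algorithm on a symmetric distance matrix (list of lists).

    Returns a list of (name_a, name_b, distance) links forming the tree.
    """
    parent = list(range(len(names)))

    def find(i):
        # Follow parent pointers to the representative of i's component.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Consider every pair of variables, shortest distance first.
    edges = sorted(
        (dist[i][j], i, j) for i, j in combinations(range(len(names)), 2)
    )
    tree = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # a link inside one component would close a cycle
            parent[root_i] = root_j
            tree.append((names[i], names[j], d))
    return tree


# Hypothetical similarity weights among four placeholder genera.
genera = ["genus_A", "genus_B", "genus_C", "genus_D"]
weights = [
    [0.0, 0.9, 0.2, 0.4],
    [0.9, 0.0, 0.7, 0.1],
    [0.2, 0.7, 0.0, 0.6],
    [0.4, 0.1, 0.6, 0.0],
]
# Assumption: distance = 1 - similarity.
distances = [[1.0 - w for w in row] for row in weights]
mst = minimum_spanning_tree(genera, distances)
for a, b, d in mst:
    print(f"{a} - {b}: {d:.2f}")
```

With n variables the tree always keeps n - 1 links, so a moderately strong association can be dropped if it would close a cycle, which is the simplification described in the text.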

The learning algorithm of CM may be summarized in four ordered steps: (a) signal transfer from the input into the hidden layer; (b) adaptation of the connection values between the input layer and the hidden layer; (c) signal transfer from the hidden layer into the output layer; and (d) adaptation of the connection values between the hidden layer and the output layer.

The AutoCM neural network does not have randomly assigned initial weights. The weights always start from the same value. Therefore, the resulting graph is perfectly reproducible across repeated runs.

A detailed description of the theory, mathematics and functioning of this analytical technique is provided in Supplementary Appendix 2.

In simple words, AutoCM ‘spatializes’ the correlation among different variables (‘closeness’) and converts it into a compelling graph that identifies only the relevant associations and organizes them into a coherent picture, building a complex global picture of the whole pattern of variation. We arbitrarily chose to consider as relevant hubs those microorganisms that showed at least five connections with other microorganisms in the network. Moreover, we defined as the ‘central node’ the inner node that is the last one remaining after recursively pruning away, from the bottom up, the ‘leaf’ nodes (that is, the isolated ends of the graph).
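The two definitions above (hubs as nodes with at least five connections, and the central node as the last node left after repeatedly pruning leaves) can be sketched in Python as follows. This is one reading of the definitions, not the authors' code; the tree and node labels are invented, and the function names `find_hubs` and `central_nodes` are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn a list of (a, b) links into an undirected adjacency map."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def find_hubs(edges, min_links=5):
    """Nodes with at least `min_links` connections (five, as chosen in the text)."""
    adj = build_adjacency(edges)
    return sorted(n for n, nbrs in adj.items() if len(nbrs) >= min_links)


def central_nodes(edges):
    """Prune the leaves of a tree round by round until one or two nodes remain.

    Assumes `edges` describes a tree (connected, no cycles), as an MST does.
    A tree can have two central nodes, so a list is returned.
    """
    adj = build_adjacency(edges)
    remaining = set(adj)
    while len(remaining) > 2:
        leaves = [n for n in remaining if len(adj[n]) <= 1]
        for leaf in leaves:
            for nbr in adj[leaf]:
                adj[nbr].discard(leaf)  # detach the leaf from its neighbour
            remaining.discard(leaf)
    return sorted(remaining)


# Invented tree: H has five links; a chain H - e - f - g hangs off it.
tree_edges = [
    ("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"), ("H", "e"),
    ("e", "f"), ("f", "g"),
]
hubs = find_hubs(tree_edges)
centre = central_nodes(tree_edges)
print("hubs:", hubs)
print("central node(s):", centre)
```

In this toy tree the hub (`H`) and the central node (`e`) are different nodes, which shows that the two notions need not coincide.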

Results

Study population

The study was proposed to 40 mothers in Italy and to 40 in Burundi. A total of 50 mothers (20 from Italy and 30 from Burundi) were included in this study, having provided consent to participate and met the inclusion criteria. Colostrum and mature milk samples were obtained from all Italian subjects, whereas only 12 of the Burundian mothers provided mature milk samples, owing to local habits. Demographic details of the two populations are shown in Table 1, which also reports the characteristics of the 12 Burundian pairs who provided both colostrum and mature milk.

Table 1 Demographic characteristics of mother–newborn pairs from the two different sites

Bacterial abundance in colostrum and mature milk samples

Colostrum and mature milk of both populations showed a high bacterial abundance, as >200 bacterial genera were detected in all samples (Figure 1). The main bacterial genera found in each group are summarized in Table 2 (each bacterial genus listed represents at least 2% of the total bacteria contained in the samples).
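The 2% cut-off used to select the main genera can be illustrated with a short Python sketch. The read counts and genus labels below are invented, and the function name `main_genera` is hypothetical; the sketch only shows how a relative-abundance threshold of this kind could be applied to per-genus counts.

```python
def main_genera(counts, threshold=0.02):
    """Return genera whose share of the total count is at least `threshold`.

    `counts` maps genus name -> number of reads (or any non-negative count).
    The result maps genus name -> relative abundance, most abundant first.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    shares = {genus: n / total for genus, n in counts.items()}
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    return {genus: share for genus, share in ranked if share >= threshold}


# Hypothetical read counts for one group of samples.
example_counts = {
    "genus_A": 500,
    "genus_B": 300,
    "genus_C": 185,
    "genus_D": 15,  # 1.5% of the total, below the 2% cut-off
}
selected = main_genera(example_counts)
for genus, share in selected.items():
    print(f"{genus}: {share:.1%}")
```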

Figure 1

Bacterial distribution in each type of milk sample: (a) Italian colostrum; (b) Italian mature milk; (c) Burundian colostrum; (d) Burundian mature milk.

Table 2 Main bacterial genera detected in colostrum and mature milk of Italian and Burundian populations

The auto-contractive map

We used the AutoCM to represent the main connections between all bacterial genera found in colostrum and mature milk samples of both populations. The resulting graph links all the variables considered in such a way that the energy structure of the system is minimized or, if the weights are correlation measures, in such a way that these are maximized over all the connections in the graph.

The MST algorithm does not map all the correlations present in the data but only the strongest ones among the nodes of the system, so that only a single path is available between any two of them (no loops). Several bacterial hubs were observed in all samples and, interestingly, these hubs differed among all the groups analyzed.

In the Italian colostrum, the main bacterial hubs were represented by Abiotrophia spp, Actinomycetospora spp, Aerococcus spp, Alloiococcus spp, Amaricoccus spp, Bergeyella spp, Citrobacter spp, Desulfovibrio spp, Dolosigranulum spp, Faecalibacterium spp, Parasutterella spp, Rhodanobacter spp and Rubellimicrobium spp. Abiotrophia spp, in particular, represented the biggest hub of the entire network as it had 59 connections to other microorganisms (Figure 2).

Figure 2

Microbiota network of Italian colostrum. The main hubs of the bacterial network are underlined with a blue line; red circle shows the central node of the network.

Figure 2 is obtained from a data set composed of 20 rows (subjects) and 269 columns (microbial abundances). Therefore, the graph expresses the overall schema of microbial associations in the observed sample. The same concept applies to the other network figures.

Moreover, Aciditerrimonas spp represented the central node of the bacterial network.

The Italian mature milk, instead, shared only Abiotrophia spp and Aerococcus spp with the colostrum samples and presented other bacterial hubs, such as Acetanaerobacterium spp, Aciditerrimonas spp, Acidocella spp, Aminobacter spp, Bacillus spp, Caryophanon spp, Delftia spp, Microvirga spp, Parabacteroides spp and Phascolarctobacterium spp (Figure 3). Alistipes spp was the central node of the network.

Figure 3

Microbiota network of Italian mature milk. The main hubs of the bacterial network are underlined with a blue line; red circle shows the central node of the network.

Furthermore, bacterial hubs observed in Burundian colostrum were Aeribacillus spp, Agaricola spp, Alterythrobacter spp, Amaricoccus spp, Aquabacterium spp, Aquimonas spp, Brachybacterium spp, Dolosigranulum spp, Micrococcus spp, Peptostreptococcus spp, Propionibacterium spp and Serratia spp (Figure 4), while in Burundian mature milk Achromobacter spp, Aeromicrobium spp, Aggregatibacter spp, Albidovolum spp, Aquipuribacter spp, Aurantimonas spp, Bergeyella spp, Buttiauxella spp, Dolosigranulum spp, Parasutterella spp, Tepidiphilus spp and Weissella spp represented the main bacterial hubs (Figure 5). Sphingomonas spp and Rhizobium spp represented the central nodes of Burundian colostrum and mature milk, respectively. Interestingly, in all groups bacterial hubs did not coincide with the main bacterial genera found in samples, except for Achromobacter and Rhizobium in Burundian mature milk. A detailed description of the theory, mathematics and functioning of this analytical technique has been provided in Supplementary Appendix.

Figure 4

Microbiota network of Burundian colostrum. The main hubs of the bacterial network are underlined with a blue line; red circle shows the central node of the network.

Figure 5

Microbiota network of Burundian mature milk. The main hubs of the bacterial network are underlined with a blue line; red circle shows the central node of the network.

Furthermore, in colostrum and mature milk of both populations we found a high prevalence of anaerobic bacteria (Figure 6a) and lactic acid bacteria (Figure 6b).

Figure 6

(a) Relative abundance of anaerobic bacteria in both Italian and Burundian colostrum and mature milk; (b) Relative abundance of lactic acid bacteria in both Italian and Burundian colostrum and mature milk.

Discussion

This is the first study in which the microbiota of colostrum and mature milk of Italian and African populations has been compared.

The development of cultivation-independent techniques for studying the bacterial populations of different biological samples allows a deeper analysis of the bacterial diversity present in such samples. Dietary habits are considered one of the main factors influencing the human microbiota composition, as the intake of meat, vegetables, proteins and fibers leads to significant changes in human-associated bacterial diversity (De Filippo et al., 2010). Western and African lifestyles and diets are very different, and for this reason, bacteria specialized in human-associated niches underwent deep modifications during social and demographic changes (De Filippo et al., 2010).

Even though we did not investigate the dietary habits of our two populations with a food-frequency questionnaire, we assumed that mothers in the two groups represented two completely different environments. Indeed, the mothers in Verona represent a western population living in an environment typical of the developed world, following a diet rich in animal proteins, sugar and fat and low in fibers. Mothers from Burundi, instead, followed a diet very different from the western one, as it was characterized by high proportions of cereals, legumes and vegetables, being rich in fibers and poor in animal proteins and sugar.

All samples of colostrum and mature milk from Italy and Burundi showed different bacterial distributions in the microbiota network. Synergy is a positive interaction between bacterial species or strains that leads to several benefits and advantages to microorganisms. Consequently, bacteria are not randomly distributed throughout the body but their specific localization and, above all, their reciprocal interactions, are essential for their survival in the human body (Stacy et al., 2016).

Abiotrophia spp represent a main hub found only in colostrum and mature milk from Italian mothers. This bacterial genus comprises nutritionally variant streptococci originally isolated from patients with endocarditis and otitis media (Roggenkamp et al., 1998). It is a common microorganism belonging to the microbiota of the oral cavity, genitourinary tract and gastrointestinal tract, which has never been associated with human bacteremia; consequently, its presence in human milk may be the direct consequence of a retrograde flow back into the mammary ducts that can occur during nursing (Hunt et al., 2011) or of the entero-mammary pathway (Fernández et al., 2013).

Several findings have highlighted the ability of maternal intestinal bacteria to reach the mammary glands by means of dendritic cells and CD18+ cells, which would bind to non-pathogenic microorganisms from the gut lumen, carrying them to the lactating mammary gland (Rodriguez, 2014). The high percentage of anaerobic intestinal microorganisms found in colostrum and mature milk of both Italian and Burundian populations may support the hypothesis of an entero-mammary pathway.

Colostrum from Italian mothers was also rich in lactic acid bacteria, such as Alloiococcus spp, which represent one of the main hubs found in this biological sample. This bacterial genus is a common member of the vaginal microbiota and contributes to the balance between beneficial and pathogenic bacteria in the vaginal ecosystem, providing protection against harmful microorganisms (Martín et al., 2003; Ling et al., 2010). Consequently, it may act as a protective microorganism in newborns, as breastfeeding protects babies against several diseases, not only owing to the immunological contents of human milk, such as immunoglobulins and immunocompetent cells, but also by means of probiotic bacteria that reach a high concentration in this biological sample (Gilliland, 1990). Lactic acid bacteria also have numerous beneficial properties in the human organism, as they are able to control intestinal infections, improve the nutritional value of food and stimulate the immune system (Gilliland, 1990).

In contrast, in the mature milk from Italian mothers Parabacteroides seems to have a pivotal role in the bacterial network. Parabacteroides spp are intestinal microorganisms, and their presence in human milk could be explained by the aforementioned entero-mammary pathway. Interestingly, Parabacteroides is able to produce bacteriocins that inhibit RNA synthesis, without effect on protein, DNA or ATP synthesis (Nakano et al., 2006). These particular peptides exert a broad spectrum of antimicrobial activity, and their activity is not linked to the development of resistance in target bacteria (Nakano et al., 2006). As a consequence, the ability of Parabacteroides to produce bacteriocins allows not only the establishment of this microorganism in the mammary gland ecosystem but also the control of excessive growth of potentially pathogenic microorganisms.

Furthermore, an in vivo study showed that the administration of Parabacteroides distasonis antigens was able to reduce the impact of intestinal inflammation in animal models of colitis, highlighting the potential protective role of Parabacteroides spp in the host’s organism (Kverka et al., 2011). Phascolarctobacterium also seems to have a pivotal role in Italian mature milk. This microbial genus is a common member of the Firmicutes phylum and produces high amounts of the short-chain fatty acids acetate and propionate, which stimulate colonic blood flow and electrolyte uptake and act as an energy source for muscles (Topping and Clifton, 2001).

Conversely, in Burundian colostrum the majority of bacterial hubs are represented by potential pathogens, such as Serratia spp and Peptostreptococcus spp, or by poorly characterized microorganisms whose biological role in the human organism is still under study. The bacterial genus Aquabacterium, which seems to have a central role in Burundian colostrum, has been described as a colonizer of the dominant gut microbiota of very premature infants (Aujoulat et al., 2014). Not surprisingly, this specific genus does not have a further role in the mature milk from Burundian mothers. In our study, we observed that Aquabacterium was connected with 18 different bacterial genera, belonging above all to the intestinal microbiota, but to date the real biological meaning of these interactions is unknown. It is interesting to underline that the central hub of the microbiota network in mature milk from mothers in Burundi is Rhizobium. As this is a soil bacterium that is a symbiont of legumes, and not a documented gut microbe, it is tempting to speculate that it enters milk through the entero-mammary pathway from the large amounts of legumes that these mothers are known to consume. Unfortunately, mothers enrolled in our study were not given a food-frequency questionnaire, and consequently, we do not have detailed information about their diet to advance more specific hypotheses about the relationship between the detection of Rhizobium in milk samples and food intake.

Interestingly, Salter et al. (2014) demonstrated that numerous laboratory reagents and DNA extraction kits are contaminated with bacterial DNA, which can negatively influence the results of metagenomic studies, above all when analyzing samples containing low microbial mass. As Rhizobium is among the potential DNA kit contaminants found by Salter et al. (2014), it is reasonable to think that its presence in milk samples may be the result of DNA extraction kit contamination. However, in the present study a negative control (no DNA sample) was added during bacterial DNA amplification to verify the absence of any contamination. No amplification products were observed: the agarose gel showed no amplification bands, and DNA quantification with Qubit found no detectable DNA in the negative control, leading us to hypothesize that the presence of Rhizobium in our samples did not derive from sample contamination during processing. Moreover, the detection of Rhizobium as the central node only in African mature milk and not in the Italian one strengthens our hypothesis that its detection is closely related to the plant- and legume-rich diet generally followed by the Burundian population.

Moreover, Dolosigranulum spp, which is the only bacterial hub that Burundian colostrum shared with Burundian mature milk, has been shown to have a protective role in the host’s organism. Dolosigranulum and Streptococcus pneumoniae seem to be involved in a competitive interaction that would inhibit the pathogenesis of otitis media caused by S. pneumoniae (Fusco et al., 2015).

Finally, in the mature milk collected from Burundian mothers Dolosigranulum and Weissella may have a protective role against infections and pathogenic bacteria in both mothers and newborns. Weissella is a lactic acid bacterium often isolated from human skin, feces, saliva and milk and from African traditional fermented foods (Fusco et al., 2015). Some Weissella strains may have probiotic activity, as they are able to inhibit in vitro biofilm formation and the proliferation of Streptococcus mutans, which is often involved in dental caries (Fusco et al., 2015). The ability of Weissella to produce bacteriocins and to exert in vitro anti-inflammatory activity in human mouth epithelial cells stimulated by Fusobacterium nucleatum strengthens the hypothesis that this microorganism may have a high probiotic potential (Kang et al., 2006; Papagianni and Papamichael, 2012; Papagianni and Sergelidis, 2013).

In conclusion, in our study we observed several differences in the microbiota network of colostrum and mature milk from Italian and Burundian mothers. Bacterial relations changed within the same population, underlining that colostrum and mature milk differ not only in protein and fat content but also in microbiota composition. We believe some bacterial genera are essential in the first phase of lactation, and for this reason, they have a pivotal role in colostrum, while other microorganisms are fundamental in the long-term nutrition of newborns and consequently have a major role in mature milk.

Our study highlighted the impact that lifestyle and dietary habits, which differ so much between Italy and Burundi, may have on the microbiota composition of human milk; diet foremost might explain the major differences in microbiome composition and network. Nevertheless, we must consider that, besides lifestyle and dietary habits, there are a number of other differences between these two populations that may influence the findings. In particular, in Burundi the mothers were younger, their babies were born earlier, and they were more likely to have had previous deliveries, less likely to be exposed to second-hand smoke and more likely to have received antenatal antibiotics. Consequently, it is necessary to consider that different factors can contribute to modulating the human milk microbiota. Among them, specific foods or food supplements probably represent a more direct and sustainable strategy to protect newborns from the onset of several infections and diseases, promoting the growth of probiotic and beneficial bacteria with a protective role for the host.

To date, the real biological meaning of many bacterial hubs found in the microbiota network is not clear and is still under study, even though the mathematical model we applied suggested that they are probably fundamental for maintaining microbiota homeostasis and that their breakdown could be responsible for a probable ecosystem imbalance.

The technique used to illustrate the associations between the bacteria is novel; its properties and implications are therefore not yet entirely understood, and further research is called for to explore them. In any case, the disappointing results obtained with two alternative data-mining statistical methods, hierarchical clustering and principal component analysis, strengthen the idea that AutoCM, thanks to its new sophisticated mathematics, could become, in the future, a reference approach to better understand the complexity of human–microbe mutualism.

Similarly, further studies are needed to better characterize all interactions between the bacterial hubs and the branches of microbiota network observed in the present study, in order to define the specific biological meaning of bacterial distribution in human milk samples.