The composition of the perinatal intestinal microbiota in cattle

Recent research suggests that the microbial colonization of the mammalian intestine may begin before birth, but the observations are controversial due to challenges in the reliable sampling and analysis of low-abundance microbiota. We studied the perinatal microbiota of calves by sampling them immediately at birth and during the first postnatal week. The large size of the bovine newborns allows sampling directly from the rectum using contamination-shielded swabs. Our 16S rDNA data, purged of potential contaminant sequences shared with negative controls, indicate the existence of a diverse low-abundance microbiota in the newborn rectal meconium and mucosa. The newborn rectal microbiota was composed of Firmicutes, Proteobacteria, Actinobacteria and Bacteroidetes. The microbial profile resembled dam oral rather than fecal or vaginal vestibular microbiota, but included typical intestinal taxa. During the first postnatal day, the rectum was invaded by Escherichia/Shigella and Clostridia, and the diversity collapsed. By 7 days, diversity was again increasing. In terms of relative abundance, Proteobacteria were replaced by Firmicutes, Bacteroidetes and Actinobacteria, including Faecalibacterium, Bacteroides, Lactobacillus, Butyricicoccus and Bifidobacterium. Our observations suggest that mammals are seeded before birth with a diverse microbiota, but the microbiota changes rapidly in early postnatal life.


Supplementary results and discussion
Full taxon

Evaluation of the sequencing data decontamination protocol
The impact of data decontamination is shown at genus level in Supplementary Table 2 for newborn data. The protocol completely removed obvious reagent contaminants such as the thermophilic Hydrogenophilus, despite its high abundance in the raw data. The data decontamination protocol did not significantly change the phylum-level compositions in adult samples and in the 24 h calf rectal samples, compared to uncleaned data (Supplementary Fig. 1). In newborn meconium samples, the proportion of Proteobacteria decreased while Firmicutes, Actinobacteria and Bacteroidetes increased, and in 7-day samples most Proteobacteria were removed. In negative controls, almost all reads were deleted.

Supplementary Table 2. Effect of data decontamination on taxonomic composition of newborn calf meconium samples (raw data vs. decontaminated data). Medians per sample.
In newborn and 7-day samples, a larger number of sequences were deleted than suggested by qPCR quantification of microbial DNA in samples and controls, suggesting that the stringent data decontamination deleted some sequences genuinely originating from the samples.

Microbial diversity in calves as Hill numbers
Alpha diversity in newborn, 24 h and 7 d calves as Hill numbers (effective numbers of species) is presented in Supplementary Figure 2A. This measure of diversity facilitates straightforward interpretation and comparison due to its doubling property 1; however, it has rarely been used in microbiota studies. Most of the common diversity indices can be converted to the effective number of species. This is the number of equally abundant species that would produce the given value of the diversity measure used to calculate it 2. In our case, OTUs represent species.
The various alpha diversity indices can also be represented as a continuous plot, as a function of q [3][4][5][6], which gives a more complete understanding of the diversity (Fig. S2B):

$$ {}^{q}D = \left( \sum_{i=1}^{S} p_i^{q} \right)^{1/(1-q)} $$

where S = the number of species (in our case, OTUs), and $p_i$ = the relative abundance of each species (OTU); $p_i > 0$ and $\sum_{i=1}^{S} p_i = 1$. The parameter q determines the sensitivity of an index to the common and rare species [5][6][7]. At q=0, the Hill number is simply the observed species richness. Higher q values emphasize the abundant species 1,7,8. At q=1, the function is undefined, and therefore its limit as q approaches 1, $\exp\left( -\sum_{i=1}^{S} p_i \ln p_i \right)$, is applied; this is the exponential of the Shannon diversity index. When q approaches infinity, the Hill number approaches the inverse of the maximum of $p_i$, thus representing only the most abundant species.
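The doubling property mentioned above can be checked with a short worked example. This is a sketch that assumes the standard Hill number form ${}^{q}D = (\sum_i p_i^{q})^{1/(1-q)}$ for q ≠ 1; the two hypothetical communities below are illustrative and are not taken from the calf data.

```latex
% A community of S equally abundant species: p_i = 1/S for every i.
{}^{q}D
  = \left( \sum_{i=1}^{S} S^{-q} \right)^{1/(1-q)}
  = \left( S^{\,1-q} \right)^{1/(1-q)}
  = S

% Pooling two such communities that share no species, in equal proportion,
% gives 2S species with p_i = 1/(2S) each.
{}^{q}D_{\mathrm{pooled}}
  = \left( (2S)^{\,1-q} \right)^{1/(1-q)}
  = 2S
```

The effective number of species therefore doubles for every order q, whereas the raw Shannon index, for example, would only increase by ln 2.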
In our data, all the diversity measures showed that the alpha diversity collapsed after birth, then started to increase by 7 days of age in most calves. The dominance of abundant species (OTUs) was greatest in the 24 h samples, as richness was proportionally higher than the other indices.
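For readers who want to reproduce a diversity profile of this kind, the following is a minimal Python sketch of the Hill number calculation. It assumes the standard definition of the Hill number of order q and treats each OTU count as one species; the function names `hill_number` and `hill_profile` are hypothetical and are not part of the pipeline used in this study.

```python
import math


def hill_number(abundances, q):
    """Effective number of species of order q for OTU counts or proportions."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    # Keep only OTUs that are present; this avoids 0**0 and log(0).
    p = [a / total for a in abundances if a > 0]
    if math.isinf(q):
        # Limit for q -> infinity: inverse of the largest relative abundance.
        return 1.0 / max(p)
    if math.isclose(q, 1.0):
        # Limit for q -> 1: exponential of the Shannon index.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


def hill_profile(abundances, qs):
    """Return (q, Hill number) pairs, e.g. for plotting diversity against q."""
    return [(q, hill_number(abundances, q)) for q in qs]


if __name__ == "__main__":
    # Illustrative counts only: one dominant OTU and two rare ones.
    for q, d in hill_profile([90, 5, 5], [0, 1, 2, math.inf]):
        print(q, round(d, 3))
```

In a strongly dominated community such as the illustrative one above, richness (q=0) stays at the number of observed OTUs while the higher-order values fall towards 1, which is the pattern described for the 24 h samples.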

Genus-level compositions of dominant phyla
Genus-level compositions and alpha diversities within major phyla are shown in Supplementary Figs. 3-6. The relatively high diversity of the newborn microbiota and its similarity to the adult oral microbiota are also apparent here.

Reoccurrence of newborn core taxa in 24 h and 7 day calves
Supplementary Table 3 shows the occurrence of newborn core taxa in older calves. Most Actinobacteria and Proteobacteria were no longer detectable in most of the older calves, while Firmicutes and Bacteroides largely remained.

Characterization of the adult cow microbiota
The core genera in adult cows are shown in Supplementary Tables 4-6.
The adult fecal microbiota was composed primarily of Firmicutes and Bacteroidetes. 41% of all sequence reads represented unclassified genera of the Ruminococcaceae family (Supplementary Table 4). Unclassified Lachnospiraceae (8%), unclassified Clostridiales (7%), unclassified Bacteroidetes (5%), Romboutsia (5%), unclassified Bacteroidales (4%), Clostridium cluster XI (4%), Bacteroides (3%) and Alistipes (2%) were also abundant. All the core taxa (at >0.1% median abundance) were shared by all 10 cows. The inter-individual variation was very small in terms of alpha and beta diversity (see figures included in the article), probably due to the similarity in housing and feeding. The phylum-level composition of bovine fecal microbiota has been shown to be greatly affected by feeding, especially by the starch content of the diet 9.
The cows in our study were fed grass silage, supplemented with a small amount of commercial concentrate (mainly barley, wheat and rape seed meal) for about 1-2 weeks before sampling.
The cow oral microbiota was mostly composed of Firmicutes (62%) and Proteobacteria (33%), with smaller contributions of Actinobacteria and Bacteroidetes. The most abundant taxa included Streptococcus (12%), Lactobacillus (7%), unclassified Pasteurellaceae (7%), Alysiella (3%), Moraxella (3%), and Staphylococcus (2%), also shared by all the cows (Supplementary Table 5). The oral microbiota observed in our study was more similar to human oral microbiota than to the recently reported bovine oral microbiota sampled at the time of ingesta regurgitation, possibly because we did not collect the samples at regurgitation 10,11.
The dam reproductive tract microbiota was sampled at the cranial vaginal vestibule, close to the opening of the urethra, rather than deep in the vagina, to avoid risk to the pregnancy. The microbiota of the vaginal vestibule was mostly composed of Firmicutes, with <8% Bacteroidetes. Unclassified Ruminococcaceae (21%), Streptococcus (16%), unclassified Lachnospiraceae (10%), unclassified Clostridiales (4%), Romboutsia (3%), Bacteroides (2%) and Clostridium XI were the most abundant taxa (Supplementary Table 6). All the core taxa were shared by all 10 cows. The vestibular microbiota was very similar to the fecal microbiota (see figures included in the article). However, Streptococcus was much more abundant in the vestibule than in feces, and Corynebacterium, unclassified members of Candidatus Saccharibacteria and Aerococcus were present in all vaginal samples but not detected in the feces of any cow. Lactobacillus was found in the vestibule of all cows but at <0.1% median abundance, as described previously for cow vaginal microbiota 12. The phylum-level composition in our data differs from the few previously published studies of cow vaginal microbiota, probably because of differences in sampling (mucosal swabs versus lavage) and because the vaginal microbiota changes during pregnancy 13.

MiSeq amplicon sequencing of 16S rDNA
The hypervariable regions V3 and V4 of the 16S rRNA gene were sequenced using the Illumina MiSeq platform in the DNA core facility of the University of Helsinki. The library preparation and sequencing were carried out essentially as described previously 16.

Detailed description of the bioinformatics pipeline
Removing primers and spacers. A subsample of the sequence data was first visualized and checked in Geneious 17. Primers and spacers were removed using CutAdapt 18. The results were again confirmed in Geneious.
Initial pre-processing in mothur. The group file needed for mothur was generated from the USEARCH outputs using the Unix commands awk and sed. Then, pre-processing was continued using mothur v. 1.39.5, according to the mothur MiSeq standard operating procedure (SOP) 21,22, up to de-noising (preclustering) and removal of chimeras and non-bacterial sequences. Silva v. 128 23 was used for alignment and Ribosomal Database Project (RDP) v. 16 24 for classification.
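A mothur group file is a two-column, tab-separated table that maps each sequence name to its sample (group). The exact awk and sed commands and the read label format of the USEARCH outputs are not given here, so the following Python sketch is only an illustration under a stated assumption: read labels are taken to have the hypothetical form `<sample>.<read number>` (for example `nb1.17`). The function name `make_group_file` is likewise hypothetical.

```python
def make_group_file(fasta_lines, sep="."):
    """Build mothur group-file text (sequence name <TAB> group) from FASTA lines.

    Assumption (not from the source): each header looks like '>sample.N',
    so the group is everything before the last separator.
    """
    rows = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        # The sequence name is the first whitespace-delimited token of the header.
        name = line[1:].split()[0]
        group = name.rsplit(sep, 1)[0]
        rows.append(f"{name}\t{group}")
    return "\n".join(rows) + "\n" if rows else ""


if __name__ == "__main__":
    # Illustrative input only.
    example = [">nb1.1", "ACGT", ">nb1.2 merged", "ACGG", ">neg1.1", "AC"]
    print(make_group_file(example), end="")
```

If the real labels use a different layout, only the line that derives `group` from `name` would need to change.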
Filtering 16S genotypes shared between samples and negative controls. Genotypes (16S rDNA sequences) which were abundant in negative controls were removed as potential contaminants, after de-noising and removal of chimeras and non-bacterial sequences. This was done for each sample type separately (newborns, 24 h, 7 d, adult feces, adult oral, adult vulvar vestibule).
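As a rough illustration of this kind of control-based filtering, the sketch below first drops singletons and then keeps a genotype only if it is absent from the negative controls or if its total relative abundance in the samples exceeds a chosen multiple of its total relative abundance in the controls (fourfold by default, the threshold applied to the newborn samples). The study itself carried out this step in Excel. The function name `filter_contaminants`, the data layout, and the choice to compute relative abundances from pooled read counts after singleton removal are all assumptions made for this example.

```python
def filter_contaminants(counts, sample_ids, control_ids, min_ratio=4.0):
    """Return the genotypes that pass the control-based filter.

    counts: dict mapping genotype -> dict of library id -> read count
            (hypothetical layout chosen for this sketch).
    """
    sample_ids, control_ids = list(sample_ids), list(control_ids)
    libraries = sample_ids + control_ids

    # Step 1: remove singletons (a single read in the extracted dataset).
    kept = {g: row for g, row in counts.items()
            if sum(row.get(lib, 0) for lib in libraries) > 1}

    # Pooled read totals, computed after singleton removal (an assumption).
    sample_total = sum(row.get(s, 0) for row in kept.values() for s in sample_ids)
    control_total = sum(row.get(c, 0) for row in kept.values() for c in control_ids)

    approved = []
    for genotype, row in kept.items():
        in_samples = sum(row.get(s, 0) for s in sample_ids)
        in_controls = sum(row.get(c, 0) for c in control_ids)
        if in_controls == 0:
            approved.append(genotype)  # never seen in the controls
            continue
        rel_samples = in_samples / sample_total if sample_total else 0.0
        rel_controls = in_controls / control_total  # control_total > 0 here
        # Step 2: keep only genotypes clearly enriched in the samples.
        if rel_samples > min_ratio * rel_controls:
            approved.append(genotype)
    return approved


if __name__ == "__main__":
    # Illustrative counts only; not data from the study.
    toy = {
        "g1": {"nb1": 50, "nb2": 30},
        "g2": {"nb1": 10, "neg1": 40},
        "g3": {"nb1": 1},
        "g4": {"nb2": 100, "neg1": 2},
    }
    print(filter_contaminants(toy, ["nb1", "nb2"], ["neg1"]))
```

A ratio threshold of this kind is deliberately conservative: a genotype that is genuinely present in the samples but also common in the controls will be discarded.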
First, sequence count tables for each sample type (+ negative controls) were extracted using mothur. For example, for newborn samples:

### Here, the .accnosgroups file specifies the samples to be output
get.groups(fasta=calf.nb.fasta, count=calf.nb.count_table, accnos=nb_ns.accnosgroups)

These subsets were then processed individually in Excel, to identify potential contaminant genotypes in each sample type. Singletons (only 1 hit in the extracted dataset) were first removed. Then, potential contaminant genotypes were identified based on their relative abundances in negative controls and actual samples, as actual contaminants are expected to be relatively more abundant in the controls 25. Here, we approved genotypes which were not present in the controls, or whose total relative abundance in all newborn samples was more than fourfold as high as their total relative abundance in empty collecting swab negative controls. The sequences which failed to fulfill these requirements were removed from the data. For example, for the newborns: