Introduction

Host glycans have been shown to participate directly in specific host–microbe interactions (Bishop and Gagneux, 2007), and signatures of selection observed at the loci of carbohydrate blood group-related genes in humans (Calafell et al., 2008; Ferrer-Admetlla et al., 2009) are most likely the result of host–pathogen interactions (Anstee, 2010). The Sd(a)/Cad carbohydrate determinant is a polymorphic blood group antigen of unknown function expressed on the red cells of 90% of human populations (Conte and Serafini-Cessi, 1991) and detectable in several other human tissues, including the intestinal mucosa and kidney in 98% of humans (Morton et al., 1988). The glycosyltransferase β-1,4-N-acetylgalactosaminyltransferase 2 (B4galnt2) is responsible for catalyzing the last step in the biosynthesis of the Sd(a)/Cad antigen by the addition of an N-acetylgalactosamine (GalNAc) residue via a β-1,4 linkage to a subterminal galactose residue substituted with an α-2,3-linked sialic acid (Lo et al., 2003; Montiel et al., 2003). In the intestine, there is evidence of a gradient in GalNAc residues conferred by B4galnt2 from the ileum to the colon (Robbe et al., 2003). The character of intestinal mucus itself also changes with location, both in variation in the types of mucins present (Corfield et al., 2001) and in the balance of glycolipids and glycoproteins that may be capable of acting as B4galnt2 substrates (Dohi et al., 1996; Kawamura et al., 2005). Thus, distinct carbohydrate mucosal phenotypes are likely to result from B4galnt2 expression at different intestinal anatomic sites.

Intestinal expression of B4galnt2 is observed throughout vertebrates from fish (Stuckenholz et al., 2009) to humans, including mice. Although conservation across species is evidence for a functional constraint preventing the loss of intestinal expression, mice genetically deficient in B4galnt2 (B4galnt2−/−) display no obvious phenotype, gastrointestinal (GI) or otherwise, under specified pathogen-free (SPF) conditions in the laboratory (Mohlke et al., 1999; Johnsen et al., 2008).

The absence of intestinal B4galnt2 expression is also observed in mice homozygous for a spontaneous cis-regulatory mutation termed Modifier of von Willebrand Factor-1 (Mvwf1). Mvwf1 specifically turns off B4galnt2 expression in the intestinal epithelium and instead directs B4galnt2 expression in vascular endothelium. Endothelial cell expression of B4galnt2 results in aberrant posttranslational modification of the blood-clotting glycoprotein, von Willebrand factor (VWF), resulting in accelerated VWF clearance and low circulating VWF levels (Mohlke et al., 1999), similar to the common human bleeding disorder, von Willebrand disease (Sweeney et al., 1990). As in the B4galnt2 knockout mouse, loss of intestinal B4galnt2 expression due to Mvwf1 does not result in an obvious GI phenotype in the laboratory. Mvwf1 is common among laboratory mouse strains (Johnsen et al., 2008), and is present at intermediate frequencies in wild Mus musculus domesticus (house mouse) populations, where there is strong evidence of both recent and long-term selection at the B4galnt2 locus (Johnsen et al., 2009).

These observations seem paradoxical: Mvwf1 confers what would be expected to be a detrimental phenotype in M. musculus (a mild bleeding diathesis), yet the Mvwf1 allele is common in wild mice. The prevalence of natural murine B4galnt2 variants is further illustrated in our recent survey of multiple wild-mouse populations, in which we found loss of B4galnt2 intestinal glycans conferred by the Mvwf1 allele class to be frequent, but not always accompanied by the gain of B4galnt2 vascular expression (Linnenbrink et al., 2011).

We hypothesized that loss of intestinal B4galnt2 expression is at least in part responsible for the striking signatures of selection at the B4galnt2 locus in wild house mice. In this study, we sought to determine the influence of B4galnt2 expression on the resident microbiota throughout the GI tract as evidence of an intestinal B4galnt2 phenotype, as shifts in the intestinal microbiota are known to affect host susceptibility to pathogens and disease (Bishop and Gagneux, 2007; Sekirov et al., 2010; Stecher and Hardt, 2010). We performed high-throughput 16S rRNA gene profiling at multiple intestinal sites in mouse sibling pairs that differed only in the presence or absence of B4galnt2 expression. In addition, we analyzed B4galnt2 expression patterns in germ-free versus conventional mice. Here we describe this detailed characterization of mouse GI bacterial communities in seven distinct locations from the duodenum to the colon and provide evidence for a significant effect of B4galnt2 expression on intestinal bacterial populations.

Materials and methods

Animal material and tissue sampling

All animal protocols (except for the germ-free protocols, see below) were approved by the University of Michigan University Committee on the Use and Care of Animals (UCUCA). C57BL/6J animals were purchased from the Jackson Laboratory (Bar Harbor, ME, USA). B4galnt2 knockout animals first engineered by Dr John Lowe were provided with permission and courtesy of Dr David Ginsburg. Genetic background, maternal effect, housing conditions, gender and diet were accounted for in the following mating scheme: B4galnt2−/− animals bred for more than 20 generations to a C57BL/6J background were mated to C57BL/6J animals to generate heterozygous B4galnt2+/− parents. B4galnt2+/− parents were then intercrossed to generate B4galnt2+/+ and B4galnt2−/− male sibling offspring. Four ‘sibpair’ cages containing at least one B4galnt2+/+ and one B4galnt2−/− sibling were raised and housed together with standard mouse chow in the same room in a specified pathogen-free (SPF) animal facility at the same time.

Studies were performed at 10 weeks of age to ensure adequate time for the colonization and development of mature, stable microbiota (Rehman et al., 2011). Tissues were harvested by fresh dissection of 1 cm of bowel at each of the following defined anatomic sites: mid-duodenum (D2), mid-jejunum, terminal ileum (ending at the cecal valve), cecum and mid-descending colon. With the exception of the cecum, luminal contents were separated from mucosal specimens by flaying open the bowel, washing twice with 0.7 ml ice-cold RNALater (Ambion, Carlsbad, CA, USA) and the 1.4-ml effluent containing stool pooled and marked as ‘luminal contents’. The bowel was then placed in 1.4 ml ice-cold RNALater and marked as ‘mucosa’. For the cecal samples, the cecal pouch was dissected and flayed open, the entire specimen placed in a 15-ml tube with 4 ml ice-cold RNALater, and the specimen gently shaken to remove the cecal contents. The mucosa was retrieved with forceps and rinsed with an additional 0.7 ml of ice-cold RNALater. This rinse was added to the 4-ml stool solution and marked ‘luminal contents’. The rinsed cecal tissue was then placed in 1.4 ml ice-cold RNALater and marked ‘mucosa’. Owing to variability in the presence of luminal contents in the sections of the small intestine, only the contents of the cecum and descending colon were analyzed. Instruments were cleaned between each anatomic site to avoid cross-contamination.

Germ-free animals

C57BL/6J animals were raised under conventional or germ-free conditions at the University of Gothenburg. All animal protocols were approved by the Research Animal Ethics Committee in Gothenburg. At 6 weeks of age, small bowel was harvested for Dolichos biflorus agglutinin (DBA) lectin histochemistry as described below.

DBA lectin histochemistry

DBA lectin is reactive with terminal N-acetyl-D-galactosamine (GalNAc) residues and specifically detects the Mvwf1 switch in B4galnt2 expression from the intestine to the blood vessel (Mohlke et al., 1999). Fresh small bowel harvested from C57BL/6J and B4galnt2−/− animals was fixed in Z-fix (Anatech Ltd, Battle Creek, MI, USA) at room temperature overnight, then paraffin embedded. DBA lectin histochemical staining was performed on embedded tissues using horseradish peroxidase-conjugated DBA (EY Laboratories, Inc., San Mateo, CA, USA) as previously described (Mohlke et al., 1999).

DNA extraction

Approximately 100 mg of tissue or luminal contents was transferred to a 2-ml screwcap tube containing glass beads 0.1, 0.5 and 1 mm in size, each 50 mg (BioSpec Products, Bartlesville, OK, USA), and 1.4 ml lysis buffer ASL from the QIAamp DNA stool mini kit (Qiagen, Hilden, Germany). After bead beating in a Precellys (Peqlab, Erlangen, Germany) bead beater (3 × 15 s at 6500 r.p.m.), samples were heated to 95 °C for 10 min and constantly shaken in a thermomixer (Eppendorf). Bacterial DNA was then extracted using the QIAamp DNA stool mini kit (Qiagen) following the manufacturer's instructions.

PCR and pyrosequencing

Universal bacterial primers for the V3 16S rRNA variable region were used to amplify bacterial 16S rRNA genes as described by Dethlefsen et al. (2008). PCR amplicon primers were fused to molecular identifier (MID) tags and 454 sequencing adapters according to the manufacturer's instructions (Roche, Basel, Switzerland). All PCR reactions were performed in 50-μl duplicates and combined after PCR. The products were extracted with the Qiagen MinElute Gel Extraction Kit and quantified with the Quant-iT (Invitrogen, Karlsruhe, Germany) dsDNA BR Assay Kit on a Nanodrop 3300 fluorometer. Equimolar amounts of purified PCR product from seven knockout and six wild-type individuals were pooled to generate libraries for each anatomic location. Pooled amplicon libraries were further purified using Ampure Beads (Agencourt, Bernried, Germany). A sample of each library was run on an Agilent (Waldbronn, Germany) bioanalyzer as described in the Roche Titanium Amplicon Sequencing protocol before entering emulsion PCR and sequencing. Each library was sequenced on one-eighth of a picotiter plate on a 454 GS-FLX (Roche).

Sequence analysis

Raw reads obtained from the sequencer were filtered using a Perl script according to the following criteria: average quality 25, no ambiguous bases, length between 110 and 200 bp, excluding MID tags and bacterial primers, and a perfect match to the MID and bacterial primer. Primer and tag sequences were removed using alignment with the Needleman–Wunsch algorithm (Needleman and Wunsch, 1970). Sequences were sorted by individuals to produce group files for analysis in MOTHUR v.1.12.3 (Schloss, 2009). Classification into bacterial phyla was performed with the Ribosomal Database Project classifier tool (Wang et al., 2007) and the RDP taxonomy database as implemented in MOTHUR. Reads were aligned with the kmer algorithm available under the align.seqs command (Schloss, 2009) in MOTHUR to the SILVA reference database (Pruesse et al., 2007). Sequences that did not match the reference alignment in the expected positions were removed.
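The per-read filtering criteria listed above can be illustrated with a minimal Python sketch. This is not the authors' Perl script: the function name `passes_filter`, its parameters, and the reading of "average quality 25" as a minimum mean Phred score are assumptions made for illustration. The check for a perfect match to the MID and primer is omitted, and the read is assumed to have been trimmed already.

```python
def passes_filter(seq, quals, min_avg_quality=25, min_len=110, max_len=200):
    """Return True if a trimmed read meets the length, ambiguity and quality criteria.

    seq   -- read sequence with MID tag and primer already removed (assumption)
    quals -- per-base Phred quality scores, one per base
    """
    if len(quals) != len(seq):
        raise ValueError("sequence and quality lengths differ")
    # Length window of 110 to 200 bp, excluding tag and primer.
    if not (min_len <= len(seq) <= max_len):
        return False
    # Any character other than A, C, G or T is treated as ambiguous.
    if any(base not in "ACGT" for base in seq.upper()):
        return False
    # Mean quality across the read; ">=" is an assumed reading of the threshold.
    return sum(quals) / len(quals) >= min_avg_quality
```

For example, a 120-bp read of unambiguous bases with a uniform quality of 30 would be kept, whereas the same read with a single N or a mean quality of 20 would be discarded.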

Statistical analysis

Aligned sequences were used to build a distance matrix and group sequences into operational taxonomic units (OTUs) in MOTHUR. Species richness estimates and collector's curves were generated based on these OTUs and drawn using the R statistics package v.2.11.1 (R Development Core Team, 2010). Phylogenetic trees were built based on the MOTHUR-derived distance matrices with FastTree v. 2.1.3 (Price et al., 2009, 2010) and submitted to the Fast Unifrac online tool for principal coordinate analysis (PCoA) of Unifrac distances (Hamady et al., 2009). Linear models were fitted to the data using the ‘lm’ function implemented in R. Two mixed effect models were made with the ‘lme’ function contained in the nlme package v. 3.1-96 (Pinheiro et al., 2009) to account for potential maternal (cage of origin) effects (models 6 and 7 in Table 1). To compare species richness between different sections of the gut, we subsampled the same number of reads per individual and section (in silico capping), calculated Chao's richness estimator (Chao, 1984) and compared the number of OTUs observed in the subsamples.
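As a rough illustration of the richness estimate, the classic Chao (1984) estimator adds to the observed OTU count a term based on the number of OTUs seen exactly once (singletons) and exactly twice (doubletons). The Python sketch below is an assumption-laden stand-in for the MOTHUR calculation, which may differ in detail; the function name `chao1` and the fallback used when no doubletons are present are illustrative choices.

```python
def chao1(otu_counts):
    """Chao's richness estimate from a list of per-OTU read counts."""
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)                        # observed OTUs
    f1 = sum(1 for c in counts if c == 1)      # singletons
    f2 = sum(1 for c in counts if c == 2)      # doubletons
    if f2 == 0:
        # Bias-corrected form, used here to avoid division by zero.
        return s_obs + f1 * (f1 - 1) / 2
    # Classic form: S_obs + F1^2 / (2 * F2)
    return s_obs + f1 ** 2 / (2 * f2)
```

With counts `[5, 3, 1, 1, 2]` there are five observed OTUs, two singletons and one doubleton, giving an estimate of 5 + 4/2 = 7.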

Table 1 Linear model comparisons for the effect of genotype on Unifrac PCos

Candidate bacterial species distinguishing the genotypes were determined by calculating the point biserial correlation between species-level (97% sequence identity) OTUs and genotype with the ‘multipatt’ function as part of the ‘indicspecies’ R package v. 1.5.1 (De Caceres and Legendre, 2009). This approach considers not only the difference in read number assigned to an OTU between genotypes, but also the number of individuals of each genotype in which the OTU is observed, thus drawing statistical power from biological replicates.
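The point biserial correlation itself is simply the Pearson correlation between a continuous variable (here, OTU abundance per mouse) and a binary one (genotype). The following Python sketch shows the statistic only; it does not reproduce the ‘indicspecies’ implementation or any significance testing that package performs, and the function and argument names are hypothetical.

```python
import math

def point_biserial(abundances, is_wildtype):
    """Point biserial correlation between OTU abundance and a binary genotype label."""
    n = len(abundances)
    group1 = [a for a, g in zip(abundances, is_wildtype) if g]
    group0 = [a for a, g in zip(abundances, is_wildtype) if not g]
    n1, n0 = len(group1), len(group0)
    if n1 == 0 or n0 == 0:
        raise ValueError("both genotypes must be represented")
    mean_all = sum(abundances) / n
    # Population standard deviation (divide by n) of all abundances.
    sd = math.sqrt(sum((a - mean_all) ** 2 for a in abundances) / n)
    if sd == 0:
        return 0.0  # OTU abundance is constant, so it carries no genotype signal
    m1 = sum(group1) / n1
    m0 = sum(group0) / n0
    return (m1 - m0) / sd * math.sqrt(n1 * n0) / n
```

An OTU present at similar abundance in every individual of one genotype and absent from the other yields a value near +1 or −1, whereas an OTU driven by a single outlying individual yields a weaker correlation. This is the sense in which the measure draws on biological replicates.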

The number of sequence reads per sample was capped in silico by random subsampling to obtain equal sample sizes for species richness, Unifrac and for detection of candidate species. To provide an initial overview of the sequence reads associated with each region of the GI tract, we classified sequences by aligning them to the SILVA reference database (Pruesse et al., 2007) and obtained taxonomic information from RDP Classifier (Wang et al., 2007).
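In silico capping amounts to drawing the same number of reads at random, without replacement, from every sample. A minimal Python sketch follows, assuming each sample is held as a mapping from OTU label to read count; the function name, the `seed` parameter and the data layout are illustrative, not taken from the original pipeline.

```python
import random
from collections import Counter

def subsample_counts(otu_counts, depth, seed=0):
    """Randomly draw `depth` reads without replacement from an OTU -> count mapping."""
    # Expand the counts into one label per read.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return dict(Counter(rng.sample(reads, depth)))
```

Applying the same `depth` to every individual and section ensures that differences in richness or UniFrac distance are not driven by unequal sequencing effort.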

Results

DBA lectin staining

DBA lectin staining of the small bowel from C57BL/6J animals demonstrates strong DBA reactivity in intestinal epithelial cells, the mucus material contained in goblet cells, and the lining of the intestinal luminal surface. No DBA reactivity is seen in the small bowel of B4galnt2−/− animals (Figure 1). These data show that DBA lectin histochemistry is a specific method for detection of B4galnt2-carbohydrates in the bowel, which are evident on the intestinal epithelium and mucus of B4galnt2-sufficient animals but absent in B4galnt2−/− animals.

Figure 1

Small-bowel DBA lectin staining. (a) Wild-type (C57BL/6J) intestine exhibits robust intestinal epithelial DBA lectin staining (brown). Mucus (arrows indicate mucus-containing goblet cells) is strongly DBA lectin positive. (b) B4galnt2−/− bowel demonstrates complete loss of DBA lectin staining of the epithelium and mucus.

High-throughput sequencing of the B4galnt2+/+ and B4galnt2−/− intestinal microbiota

To determine the influence of B4galnt2 expression on the resident microbiota throughout the GI tract, we performed high-throughput pyrosequencing (454) of the bacterial 16S rRNA gene. Our sibpair breeding scheme (see Materials and methods) yielded a total of six B4galnt2+/+ and seven B4galnt2−/− individuals. We surveyed the bacterial communities associated with the mucus layer of five major functional compartments of the murine intestine (duodenum, jejunum, ileum, cecum and descending colon) and the luminal contents of the cecum and colon. The 16S rRNA V3 region was successfully amplified from all individuals at all sites, with the exception of the duodenum of one B4galnt2−/− individual (see Supplementary Table 1 for distribution of reads per GI tract location). A total of 337 449 reads of the hypervariable region V3 were analyzed after quality filtering. A single B4galnt2−/− individual displayed a strongly deviant pattern with respect to the relative abundance of the major bacterial phyla compared with all 12 other mice in the small intestine (Supplementary Figure 1), despite no notable phenotype at the time of dissection. This individual was housed with two other siblings, one B4galnt2+/+ and one B4galnt2−/−, which did not exhibit this pattern. Although we cannot exclude that this aberration is a consequence of loss of B4galnt2 expression, we speculate that this animal was subclinically ill or otherwise compromised, resulting in a shift distinct from other genotypes. Thus, we removed this individual from further analyses.

Bacterial phyla of the GI tract by anatomic site

The communities within the distinct sections of the GI tract differ largely in their composition and proportion of the major bacterial phyla (Figure 2). The mucosa of the small intestine (duodenum, jejunum and ileum) is dominated by Firmicutes (77%, 78% and 91%, respectively). As previously reported by Ley et al. (2005), we also detected sequences of cyanobacterial origin in the small intestine (1.5% in duodenum and 2.1% in the jejunum). The mucosa of the cecum and colon contained a comparatively lower proportion of Firmicutes (40% and 25%, respectively) and higher proportion of Bacteroidetes (19% and 15%, respectively) and Proteobacteria (37% and 59%, respectively). The cecum samples also contained Deferribacteres (2.5%).

Figure 2

Abundance of major bacterial phyla by anatomic site. Average abundance of sequencing reads from major bacteria phyla in +/+ (B4galnt2+/+) and −/− (B4galnt2−/−) mice along the intestine in percent of total bacterial sequences obtained. cec.cont, cecum content; col.cont, colon content.

The differences between the mucosa and the adjacent luminal contents of the cecum and colon are particularly striking. Cecum and colon mucosal tissues contain 19% and 15% Bacteroidetes, compared with 32% and 64%, respectively, in the lumen. The opposite pattern is found for Proteobacteria, with 37% and 59%, respectively, in the cecum and colon mucosal tissues, compared with 8% and 3% in the lumen. Furthermore, the Tenericutes are more abundant in the lumen. This remarkable distinction between the mucosal and luminal communities might be expected in light of previous findings that bacterial populations occupying human colonic mucosa are distinct from those present in fecal samples (Eckburg et al., 2005; Willing et al., 2010). Our results provide direct support for the presence of distinct communities inhabiting the niches found along the GI tract, including those associated with the tissue versus lumen at a given location.

Bacterial species richness differs between sections of the GI tract

To characterize the diversity of these communities at lower taxonomic levels, we tested for differences in bacterial species richness in the B4galnt2+/+ and B4galnt2−/− mice at the five mucosal anatomic locations of the GI tract and in the luminal contents of the cecum and colon (Supplementary Figures 2 and 3). For this analysis, sequences were grouped into species-level (97% similarity) OTUs and Chao's species richness estimator was calculated.

Although we detected no significant difference in species richness with respect to B4galnt2 genotype in any of the sections, several interesting patterns are apparent with regard to the individual GI tract locations. Species richness is highest in the luminal samples and on average lower in the mucus layer (335 versus 187 species, Chao's species richness estimator), consistent with the protective function of the mucus layer and a more controlled interaction between host and microbes closer to the intestinal epithelium. The high richness observed in the cecum may relate to its role as a biofermenter offering stable conditions for the growth of diverse species (Savage, 1977). Interestingly, we also observe a progressive decline of bacterial species richness along the small intestine (201, 168 and 82 for duodenum, jejunum and ileum, respectively, Chao's species richness estimator), mainly due to a much lower diversity in the ileum.

Species composition differs between B4galnt2+/+ and B4galnt2−/− mice

To test whether the composition of bacterial communities is influenced by B4galnt2 expression, we compared bacterial communities between B4galnt2+/+ and B4galnt2−/− mice using the phylogeny-based beta-diversity measure UniFrac. This metric represents the distance between bacterial communities by comparing the shared branch length between phylogenetic trees underlying two communities. A matrix of UniFrac distances between the individual mice was analyzed for principal coordinates (PCo) to partition variation among samples into the most important independent components (Figure 3a). The resulting variance along the principal coordinates was analyzed in a linear model framework with respect to the experimental setup (Figure 3b, Table 1).
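To make the metric concrete, unweighted UniFrac can be computed as the fraction of total branch length in the tree that leads to OTUs found in only one of the two communities. The Python sketch below uses a toy representation in which each branch is a pair of its length and the set of OTUs beneath it. This data structure and the function name are assumptions for illustration and do not reflect the Fast UniFrac implementation.

```python
def unweighted_unifrac(branches, community_a, community_b):
    """Fraction of branch length unique to one community.

    branches    -- iterable of (length, set_of_descendant_otus)
    community_a -- set of OTUs present in the first sample
    community_b -- set of OTUs present in the second sample
    """
    unique = 0.0
    total = 0.0
    for length, leaves in branches:
        in_a = bool(leaves & community_a)
        in_b = bool(leaves & community_b)
        if in_a or in_b:
            total += length          # branch is used by at least one community
            if in_a != in_b:
                unique += length     # branch is used by exactly one community
    if total == 0:
        raise ValueError("neither community has OTUs in the tree")
    return unique / total


# Toy tree: ((otu1:1, otu2:1):0.5, (otu3:1, otu4:1):0.5)
toy_branches = [
    (1.0, {"otu1"}), (1.0, {"otu2"}), (0.5, {"otu1", "otu2"}),
    (1.0, {"otu3"}), (1.0, {"otu4"}), (0.5, {"otu3", "otu4"}),
]
```

On this toy tree, two identical communities have a distance of 0, two communities drawn from opposite clades have a distance of 1, and the communities {otu1, otu2} and {otu2, otu3} have a distance of 2.5/4.0 = 0.625.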

Figure 3

Principal coordinate analysis (PCoA) of Unifrac distances across anatomic sites and genotypes. (a) PCoA of unweighted Unifrac distances for the whole dataset (all anatomic sites sampled). Colors by section, filled symbols=+/+; open symbols=−/−; symbol shape by cage of origin. (b) Box plot of PCo2 from a, bold horizontal lines represent the median, the box edges mark the upper and lower quartiles, whiskers extend towards the maximum and minimum; color code as in a.

Our model identifies B4galnt2 genotype as a significant determinant of PCo2, and hence bacterial community composition. Both a linear model including section and genotype as additive effects (model 5, Table 1) and the application of mixed-effects models to account for maternal (cage of origin) effects (models 6 and 7) assign genotype as a significant explanatory effect of community composition. The effect is visualized in Figure 3b: The value along PCo2 assigned to B4galnt2+/+ mice is higher for all portions of the GI tract except the colon, indicating a systematic effect of B4galnt2 expression on bacterial communities along the GI tract.

This analysis also reveals the striking effect of anatomic location on bacterial composition along the GI tract. The individual locations of the gut analyzed in our study differ strongly in the composition of their associated bacteria, accounting for 88% and 73%, respectively, of the variance observed along the first two PCos.

Differences in the bacterial community of B4galnt2+/+ and B4galnt2−/− mice by location

Because distinct locations along the GI tract differ largely in their bacterial composition, genotype-dependent differences between the bacterial communities of B4galnt2+/+ and B4galnt2−/− mice are likely to be site-specific. Thus, to analyze the potential influence of B4galnt2 on bacterial communities independent of variation between sections, we applied UniFrac to each anatomic location separately. We find the largest genotype effect on the bacterial communities associated with the ileum (Figure 4, P=7.49 × 10⁻⁵, R²=0.81, P=0.001 after Bonferroni correction for testing all sections and the first two PCos, analysis of variance).
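The corrected P-value can be checked with simple arithmetic. A Bonferroni correction multiplies the raw P-value by the number of tests performed. The count of 14 used below (seven sampled locations times two PCos) is an assumption inferred from the text, which does not state the number of tests explicitly.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Assumed number of tests: 7 sampled locations x 2 principal coordinates.
adjusted = bonferroni(7.49e-5, 7 * 2)
print(round(adjusted, 3))
```

Under this assumption the adjusted value is about 1.05 × 10⁻³, which rounds to the reported P=0.001.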

Figure 4

Unweighted Unifrac PCoA of the ileal mucosal bacterial community. Solid symbols=B4galnt2+/+, empty symbols=B4galnt2−/−. Symbol shape by cage of origin.

Individual bacterial species associated with B4galnt2+/+ and B4galnt2−/− genotypes

To shed light on the identity of bacterial species that make up some of the differences between the intestinal bacterial communities associated with B4galnt2+/+ and B4galnt2−/− mice, we applied a common ecological measure of species habitat association (Dufréne and Legendre, 1997) to OTU clusters (97% sequence identity). OTUs identified as indicators of B4galnt2 expression ‘habitat’ (that is, those consistently present or more abundant in one genotype compared with the other) were classified using the Ribosomal Database Project tool (Wang et al., 2007). Numerous interesting candidate taxa belonging to the three major phyla Bacteroidetes, Firmicutes and Proteobacteria were identified by this method (Table 2). Members of the Firmicutes appear to be the most widely influenced. A total of 11 indicator OTUs belonging to the orders Clostridiales or Lactobacillales were identified from this group, with at least one member differentiating B4galnt2 expression habitats in all seven sampled locations of the GI tract. The Bacteroidetes contained eight OTUs distributed across the jejunum, cecum and the luminal contents of the cecum and colon, five of which belonged to the genus Barnesiella. Four OTUs belonging to the Proteobacteria were identified in the duodenum, colon and luminal contents of the colon, three of which belonged to the genus Helicobacter.

Table 2 Candidate bacterial species as determined by OTU genotype correlation

Intestinal expression of B4galnt2 does not require the presence of bacteria

Expression of intestinal glycosyltransferases, such as the Fut2 glycosyltransferase, has been shown to be influenced by the microbiota present (Bry et al., 1996; Meng et al., 2007). This regulatory mechanism likely has an important role in the host's ability to alter the mucosal surface in response to the environment. To determine whether B4galnt2 expression similarly requires the presence of intestinal bacteria, DBA lectin staining was performed on the small bowels of mice housed under conventional and germ-free conditions. Intestinal DBA lectin staining was present and appeared similar under both conventional and germ-free conditions (data not shown), indicating that B4galnt2 expression occurs in the absence of intestinal bacteria.

Discussion

At birth the intestinal tract is sterile, but is rapidly colonized by a diverse spectrum of bacteria, in addition to archaea and eukaryotes. These organisms are provided a nutrient-rich and largely stable environment by the host. In turn, the host relies on the microbiota for a variety of metabolic processes and their presence is required for normal host intestinal development, mucosal integrity and maintenance of immunologic balance (Backhed et al., 2005; O’Hara and Shanahan, 2006; Artis, 2008; Fraser et al., 2009). Furthermore, these complex communities protect the host from pathogenic organisms in several ways: by occupying microbial niches resulting in displacement, by production of antimicrobial factors and by competing for nutrients and receptors (O’Hara and Shanahan, 2006). By systematically profiling multiple locations throughout the murine GI tract, we have characterized distinct microbial communities at discrete anatomic sites throughout the intestine and found striking differences between sites and between the mucosa and adjacent luminal contents. Although this is seemingly in contrast to a previous report that found no difference between the populations of the mucosa and lumen in a humanized mouse model (Turnbaugh et al., 2009), this apparent discrepancy may be attributable to the current study being based on sampling of unperturbed native bacterial communities.

We find that variation in the expression of a single glycosyltransferase, B4galnt2, is associated with significant shifts in the composition of the intestinal microbiota. These differences are consistent with host carbohydrate-specific selection on colonizing microbial populations, as B4galnt2−/− animals and their control B4galnt2+/+ littermates shared the same microbe exposures over their lifetime. The fact that the ileum displayed the clearest separation in overall composition as measured by the unweighted UniFrac metric also suggests B4galnt2-dependent immune activity, as strong antibacterial activity is present in this tissue (Petnicki-Ocwieja et al., 2009).

In addition to overall changes in composition, we identify a number of specific bacterial lineages influenced by host B4galnt2 expression. Although it could be argued that many bacterial orders found in the bowel have at least one member that has been shown to be important for gut health, nearly all the candidate lineages we identified as being influenced by B4galnt2 expression have been previously identified as significant to intestinal communities. For example, we discovered differences in several OTUs belonging to the order Clostridiales in the jejunum, ileum, cecum and cecum content, similar to two recent reports that found distinct Clostridiales OTUs in inflammatory bowel disease (IBD) samples compared with the controls (Willing et al., 2010; Frank et al., 2011). Furthermore, we observe differences in the family Lachnospiraceae (Order: Clostridiales), which was identified to be decreased in patients with Crohn's disease localized to the ileum compared with controls (Willing et al., 2010) and also reported to be disparate between healthy and diseased mice in an IL10 knockout model of IBD (Ye et al., 2008). Interestingly, Lachnospiraceae are major determinants of the recently reported ‘enterotypes’ described in a large study of human fecal metagenomes (Arumugam et al., 2011). Ye et al. (2008) also identified differences in their murine IBD model in the genus Barnesiella, which we too found to be influenced by B4galnt2 genotype. Others have associated Barnesiella with CD8+ T-cell function in mice (Presley et al., 2010), suggesting more than a bystander role for this genus in intestinal inflammation.

Intriguingly, we found that the indicator OTUs for B4galnt2+/+ and B4galnt2−/− mice at the same location of the gut often were members of the same bacterial family or genus (Figure 5). Streptococcus and Lactococcus were indicative of B4galnt2+/+ and B4galnt2−/− mucosal communities in the duodenum, as were two different Moryella OTUs in the cecum. Likewise, closely related indicator OTUs distinguished by B4galnt2 genotype were found in the luminal content of cecum and colon (Lachnospiraceae and Barnesiella, respectively). Thus, similar species appear to substitute for each other depending on B4galnt2 genotype, indicating that many closely related species have the potential to occupy distinct B4galnt2 glycan-defined niches in the mucosa.

Figure 5

Individual bacterial species associated with B4galnt2 genotype. OTUs of closely related species appear to substitute by B4galnt2 genotype (B4galnt2+/+=wt, B4galnt2−/−=ko). (a) Duodenal mucosa: OTU 134.1M (Lactococcus sp.), OTU 37.1M (Streptococcus sp.), (b) cecal mucosa: OTU 595.5M (Moryella sp.), OTU 37.5M (Moryella sp.), (c) colon contents: OTU 23.6S (Barnesiella sp.), OTU 192.6S (Barnesiella sp.), (d) cecal contents: OTU 499.5S (Lachnospiraceae), OTU 192.5S (Lachnospiraceae).

Unexpectedly, among the bacteria identified were several OTUs belonging to the genus Helicobacter. Helicobacter spp. are known to naturally infect and cause disease in laboratory mice, most commonly H. hepaticus, H. bilis and H. typhlonius (Feng et al., 2005). Although a difference in Helicobacter abundance would be expected to result in an altered risk for enteritis or other hepatobiliary inflammatory processes, in a protected laboratory environment animals with Helicobacter may exhibit a more subtle or subclinical phenotype. In humans, although H. hepaticus and H. bilis are both associated with hepatoenteric disease (Goldman and Mitchell, 2010), the most prevalent and best-studied pathogenic Helicobacter is H. pylori.

H. pylori predominantly inhabits the gastric mucosa and is associated with a spectrum of human intestinal diseases ranging from gastritis to malignancy (Timothy and Martin, 2009). Most H. pylori is located within the overlying gastric mucus layer and does not interact with the underlying epithelium. However, under some conditions H. pylori can adhere to the gastric mucosa, triggering virulence factors in the bacterium and an inflammatory response from the host (Timothy and Martin, 2009). Thus, the mere presence of Helicobacter may result in a bystander effect by which the nearby-resident microbiota may be influenced not only by niche competition with Helicobacter but also by more general environmental changes due to mucosal inflammation and upregulation of host defenses.

Adhesion to host glycans presented on the gastric mucosa, notably carbohydrate antigens of the ABH and Lewis blood group systems, is a critical step in the pathogenesis of H. pylori (Kobayashi et al., 2009). Furthermore, H. pylori is known to be capable of specifically binding additional carbohydrate moieties, including terminal β-1,3-GalNAc residues (Miller-Podraza et al., 2005), while other murine enterohepatic Helicobacter spp. isolates also demonstrate evidence of carbohydrate-specific adhesion (Hynes et al., 2003). In the duodenal and colonic mucosa, two indicator Helicobacter spp. were detected in the B4galnt2+/+ animals, yet these Helicobacter were largely undetectable in B4galnt2−/− individuals (Figure 6). Thus, we postulate a novel direct interaction between Helicobacter and host mucosal B4galnt2-derived β-1,4-GalNAc residues to be a likely mechanism responsible for the significant increase in abundance of these Helicobacter species observed in the B4galnt2+/+ mice.

Figure 6

Abundance of Helicobacter spp. by B4galnt2 genotype. Relative abundance of OTUs of two Helicobacter spp. in the duodenum and colon segregated by B4galnt2 genotype (individuals are the same as in Figure 5, B4galnt2+/+=wt and B4galnt2−/−=ko). *ko6 duodenal mucosal sample was not available for analysis (see Results).

The conservation of intestinal B4galnt2 expression across species suggests an important functional role for B4galnt2 in the GI tract. This hypothesis is supported by our finding that the loss of B4galnt2 expression significantly impacts the composition of the resident microbiota. We propose that the complex signatures of natural selection observed in house mice (Johnsen et al., 2009) are at least in part due to the variation of B4galnt2-GalNAc residues on intestinal mucosal surfaces (the loss of glycans during evolution is discussed by Bishop and Gagneux (2007)). Taken together, these data support a scenario in which the loss of intestinal B4galnt2 expression offers a significant fitness advantage in the face of pathogens reliant on the presence of the otherwise ubiquitous intestinal mucosal B4galnt2-derived GalNAc. Host B4galnt2 glycans may serve as a carbon source for both symbiotic and disease-causing organisms and/or change specific binding targets for GI pathogens. For example, even in this study intended to characterize the symbiotic microbiota, we detected a difference in the pathogenic genus Helicobacter, which is known to adhere to other host blood group carbohydrate structures during pathogenesis.

In summary, we have found that B4galnt2 expression influences the intestinal microbiota. Helicobacter and other enteric pathogens known to interact with GalNAc residues, including protozoans (Entamoeba histolytica (Frederick and Petri, 2005)), viruses (Norovirus (Shirato et al., 2008)) and a spectrum of bacteria (Krivan et al., 1988; Karlsson, 1995), are excellent candidates to underlie the striking signatures of selection at the B4galnt2 locus in wild-mouse populations.