Introduction

The stomach was long considered a sterile and hostile environment because of its extreme acidity, but it has since been established that it is populated by microbes. Some inhabitants, which can infect the stomach for decades, are members of the genus Helicobacter [1], which can roughly be divided into gastric and enterohepatic Helicobacter species [2]. The gastric group exclusively colonizes the stomach of mammals, whereas the enterohepatic group colonizes the liver or intestinal tract of animals ranging from mammals to birds and reptiles [3].

The specificity of gastric Helicobacter species to colonize and persist in the stomach is the consequence of a series of adaptations that occurred a long time ago [4].

H. pylori is by far the most explored gastric Helicobacter species. It colonizes the stomach of about half the human population and is predominantly transmitted within families rather than spread epidemically [5,6,7]. This bacterial species has been linked to a wide variety of gastric pathologies, including adenocarcinoma, which is the fifth most common cancer type worldwide [8]. Comparative genomics approaches applied to a substantial number of H. pylori strains have already revealed numerous insights into the evolution of this pathogen [4, 9,10,11]. H. pylori seems to be approximately as old as anatomically modern humans (at least 100 kya), highlighting that infection by this pathogen post-dated the evolution of humans [4]. This suggests that H. pylori was probably acquired via a single host jump from an unknown, non-human animal host, since the stomachs of many animals are also infected by diverse species of Helicobacter. The closest relatives of H. pylori are H. acinonychis from wild felines, which arose through a host jump from humans [12], and H. cetorum from dolphins and whales [13]. Genetically much more distinct from H. pylori are the Helicobacter species naturally colonizing the stomach of domestic animals (including cats, dogs, and pigs), which have been designated as the non-H. pylori Helicobacter species (NHPH) group [14]. Cats and dogs in particular are often infected with multiple species that can be regarded as sympatric (i.e., bacterial species that share the same niche and host and frequently encounter one another). Up to 8 different NHPH have been described so far in cats, dogs, and pigs [14, 15, 16], several of which have zoonotic potential. A distinct feature is their association with mucosa-associated lymphoid tissue (MALT) lymphoma, another type of gastric cancer that is less prevalent in humans [14, 16]. Not all NHPH colonize pets and pigs; for example, H. suncus colonizes the stomach of house musk shrews [17, 18].

Due to the ongoing discovery of novel species in the stomach of pets [16] and the similar disease outcomes caused by different gastric Helicobacter species that can infect one or multiple hosts [16], the need to study the evolutionary adaptation of these ecologically similar but genetically distinct species has become even more important. Available studies involving genomic analyses of animal-related gastric Helicobacter species are currently restricted to gene content identity and phylogeny [12, 13, 19,20,21,22,23]. This leaves open questions regarding population structure and gene flow, intra- and interspecific genetic variability, and the age of speciation. Therefore, to improve our understanding of the evolutionary forces acting on the Helicobacter genus, we analyzed 108 gastric Helicobacter genomes encompassing multiple strains of the available gastric species. In addition, 54 enterohepatic Helicobacter genomes from 23 species were included for comparison. These latter species are ecologically and genetically distinct from the gastric members of the genus Helicobacter.

Our study provides new insights into the evolution of gastric Helicobacter species, showing comprehensive evidence of admixture between species, as well as estimates of the divergence time when these gastric Helicobacter spp. have split from their most recent common ancestor.

Materials and methods

Helicobacter strains and genome sequencing

A total of 162 strains, consisting of 108 gastric and 54 enterohepatic Helicobacter spp., were included in this study (see Supplementary Table 1). More specifically, strains from 11 gastric and 23 enterohepatic species, as shown in Table 1, were selected. Not all gastric Helicobacter species could be included: H. suncus from house musk shrews, for example, was omitted because no genome sequence is available. Another gastric species, H. mustelae, is, like H. suncus, known to be phylogenetically more closely related to the enterohepatic than to the gastric Helicobacter species [24, 25], and is indicated as gastrointestinal in Table 1 to reflect its potential to colonize both the stomach and the intestinal tract.

Table 1 Overview of the number of strains per Helicobacter species (11 gastric, 1 gastrointestinal and 23 enterohepatic species) included in this study

In addition, 10 strains from unknown enterohepatic species were also taken into account. Genome sequences from 77 Helicobacter strains were already available from the ftp NCBI database (ftp://ftp.ncbi.nlm.nih.gov/; Supplementary Table 1). The other 85 strains were collected during this study and cultivated in vitro under microaerobic conditions for DNA extraction purposes using the Qiagen (Venlo, The Netherlands) Blood & Tissue kit [16]. Subsequently, the DNA samples were used for whole genome sequencing, as previously described [16]. All genomes, including those from the ftp NCBI database, were subjected to gene finding and automatic annotation using Prokka (rapid prokaryotic annotation; [26]).

Pangenome, niche-specific gene and phylogenetic reconstruction

The pangenome was estimated using the rapid large-scale prokaryotic pan genome analysis (Roary) tool [27]. Briefly, the annotated proteins from all 162 strains were first used for a BLASTP all-versus-all sequence similarity search. From the BLASTP output, groups of orthologous proteins were predicted using the MCL software, with paralogue clustering enabled. Core genes were defined as those present in ≥99% of the genomes. The remaining genes were considered the accessory genome. An overview is shown in Table 2. A pangenome matrix was then created based on the presence or absence of all genetic loci in each individual genome. Subsequently, niche-associated genes (gastric versus enterohepatic) were identified by Scoary analysis using the Roary output, based on an 80% BLAST identity cut-off and with paralogue clustering enabled [28]. We used HHblits [29, 30] for iterative protein sequence searching of the niche-associated genes.
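The core/accessory split described above can be illustrated with a minimal sketch. It is not Roary's implementation; it only applies the ≥99% presence rule to a gene presence/absence matrix. The function name, the dictionary layout, and the toy gene names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def split_core_accessory(
    presence: Dict[str, List[int]], core_fraction: float = 0.99
) -> Tuple[List[str], List[str]]:
    """Classify gene clusters as core or accessory.

    presence maps a gene cluster name to a list of 0/1 flags,
    one flag per genome (1 = cluster present in that genome).
    """
    core, accessory = [], []
    for gene, flags in presence.items():
        fraction_present = sum(1 for f in flags if f) / len(flags)
        if fraction_present >= core_fraction:
            core.append(gene)
        else:
            accessory.append(gene)
    return core, accessory


# Toy matrix with four genomes (hypothetical gene names).
toy_matrix = {
    "geneA": [1, 1, 1, 1],  # present everywhere
    "geneB": [1, 0, 1, 1],  # missing from one genome
    "geneC": [0, 0, 1, 0],  # present in a single genome
}
core, accessory = split_core_accessory(toy_matrix)
print(core)       # ['geneA']
print(accessory)  # ['geneB', 'geneC']
```

With 162 genomes, a 99% threshold means a cluster may be missing from at most one genome and still count as core.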

Table 2 The Helicobacter genus pangenome

Finally, a phylogenetic tree was created based on the concatenated core genes. The phylogenetic tree was built using the randomized accelerated maximum likelihood (RAxML) program [31] by applying the -f a, -p 12345, -x 12345, -# 100, -m GTRGAMMA parameters, and visualized using the interactive tree of life (iTOL) software [32].
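As a rough sketch of how the listed RAxML parameters fit together on one command line, the snippet below assembles an argument list in Python. The flags -f a, -p, -x, -# and -m are taken from the text; the binary name `raxmlHPC`, the input flag `-s`, the run-name flag `-n`, and both file names are assumptions added for illustration and may differ from the actual invocation used in the study.

```python
import shlex
from typing import List


def raxml_command(alignment: str, run_name: str, binary: str = "raxmlHPC") -> List[str]:
    """Build a RAxML argument list for rapid bootstrapping plus ML search."""
    return [
        binary,
        "-f", "a",          # rapid bootstrap analysis and best-tree search in one run
        "-p", "12345",      # random seed for parsimony starting trees
        "-x", "12345",      # random seed for rapid bootstrapping
        "-#", "100",        # number of bootstrap replicates
        "-m", "GTRGAMMA",   # GTR substitution model with gamma rate heterogeneity
        "-s", alignment,    # input alignment (assumed file name)
        "-n", run_name,     # suffix for output files (assumed run name)
    ]


cmd = raxml_command("core_gene_alignment.aln", "helicobacter_core")
print(shlex.join(cmd))
# The list could be passed to subprocess.run(cmd, check=True) where RAxML is installed.
```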

Population structure and admixture analysis

To elucidate the population structure of the 162 Helicobacter strains, we used the Bayesian analysis of population structure (BAPS) software to identify groups that are genetically divergent. BAPS is a popular tool for studying population structure and admixture (genetic flux between populations). We used BAPS in a hierarchical manner (hierBAPS) to resolve the population structure at a finer scale, after excluding the non-informative singleton SNPs, to find substructures inside the main clusters [33]. The clustering was performed using two hierarchical levels. Deeper clustering did not change the number of clusters, when compared to two-level clustering.
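The exclusion of singleton SNPs can be sketched as follows. This is an illustrative filter, not the procedure used by hierBAPS or by the authors: it assumes that an informative site is one where at least two alleles are each observed in at least two sequences, so that invariant sites and sites whose variants occur in only one strain are dropped. The function name and toy alignment are hypothetical.

```python
from collections import Counter
from typing import List


def informative_snp_columns(alignment: List[str]) -> List[int]:
    """Return 0-based indices of columns that are not invariant or singleton SNPs."""
    keep = []
    for col in range(len(alignment[0])):
        # Count unambiguous bases only; gaps and N are ignored.
        counts = Counter(seq[col] for seq in alignment if seq[col] in "ACGT")
        shared_alleles = [base for base, n in counts.items() if n >= 2]
        if len(shared_alleles) >= 2:
            keep.append(col)
    return keep


toy_alignment = [
    "ACGTA",
    "ACGTA",
    "ATGCA",
    "ATGAA",
]
# Column 1 (C/C/T/T) is informative; column 3 (T/T/C/A) carries only singleton variants.
print(informative_snp_columns(toy_alignment))  # [1]
```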

The admixture analysis was conducted using two different approaches:

(1) The admixture analysis was first conducted based on the 47 BAPS clusters (at hierarchical level 2), with the following specifications: 100 iterations to estimate the admixture coefficient for the individuals, 200 reference individuals from each population, and 20 iterations for estimating the admixture coefficient for reference individuals [33,34,35,36].

(2) Subsequently, the chromosome painting algorithm was applied to the genome-wide haplotype data using the linkage model (http://www.paintmychromosomes.com). This was used to calculate the expected number of chunks of DNA imported from a donor into a recipient genome [11]. The obtained values were summarized in a matrix. Such co-ancestry matrices were created to investigate the blockwise homology between the core genomes of the following subsets: (1) the canine/feline NHPH, (2) H. pylori and its two closest relatives (H. acinonychis and H. cetorum), and (3) H. heilmannii, H. ailurogastricus and H. suis. For the subset with the canine and feline NHPH, we excluded the H. baculiformis and H. cynogastricus singleton strains because chromosome painting is known to mistakenly infer distinct singleton strains as hybrids. Subsequently, we used these matrices as inputs for the fineSTRUCTURE algorithm to perform model-based clustering using a Bayesian Markov chain Monte Carlo (MCMC) approach. The fineSTRUCTURE algorithm was run for 100,000 iterations after an initial 100,000 iterations that were discarded as MCMC burn-in, and the thinning interval was set to 100.
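To make the structure of a co-ancestry matrix more concrete, the sketch below summarizes a strain-by-strain matrix (rows as recipients, columns as donors) into mean chunk counts per recipient/donor species pair. It is only an illustration of how such a matrix can be read; it is not part of ChromoPainter or fineSTRUCTURE, and the function name, species labels, and numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def mean_chunks_by_species(
    matrix: List[List[float]], species: List[str]
) -> Dict[Tuple[str, str], float]:
    """Average the expected chunk counts per (recipient species, donor species)."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for i, row in enumerate(matrix):
        for j, chunks in enumerate(row):
            if i == j:
                continue  # a strain does not donate to itself
            key = (species[i], species[j])  # (recipient, donor)
            totals[key] += chunks
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Toy example: two strains of species "sp_A" and one strain of "sp_B".
labels = ["sp_A", "sp_A", "sp_B"]
coancestry = [
    [0.0, 10.0, 2.0],
    [12.0, 0.0, 4.0],
    [3.0, 5.0, 0.0],
]
summary = mean_chunks_by_species(coancestry, labels)
print(summary[("sp_A", "sp_A")])  # 11.0 (intraspecies exchange)
print(summary[("sp_A", "sp_B")])  # 3.0  (imports from sp_B into sp_A)
```

Because the matrix is not symmetric, imports from sp_B into sp_A and from sp_A into sp_B are reported separately.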

Date estimates

The least-square dating (LSD) algorithm [37] was applied to estimate the coalescence of different core genome subsets (generated with Roary), more specifically that of the Hp (including H. pylori, H. acinonychis, and H. cetorum) and canine/feline NHPH groups. A root-to-tip analysis, implemented in TempEst [38], was first applied on each subset for investigating the temporal signal and clock-likeness of molecular phylogenies. Subsequently, date estimates of time to the most recent common ancestor (TMRCA) for each subset were calculated using the LSD software with the following parameters: a mutation rate r ranging approximately between 1.8 e−7 and 3.6 e−7 (mean: 2.6 e−7; s.d.: 8 e−7; 95%, 1.28 e−7–3.91 e−7) based on the long-term population-based mutation rate in the human pathogen H. pylori [4]; -d datefile (see Supplementary Table 1), -c, -v 2, -r a, -f 100 and -s (sequence length of the alignment file). A 95% confidence interval was also generated.
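The idea behind a root-to-tip analysis can be sketched with an ordinary least-squares regression of root-to-tip distance on sampling date: the slope approximates the substitution rate and the x-intercept approximates the date of the root. This is a simplified illustration under the assumption of a strict clock and a fixed root; it is not the TempEst or LSD implementation, and the data are made up.

```python
from typing import List, Tuple


def root_to_tip_regression(
    dates: List[float], distances: List[float]
) -> Tuple[float, float, float]:
    """Return (rate, root_date, r_squared) from a least-squares fit.

    dates are sampling times (e.g., calendar years) and distances are
    root-to-tip distances in substitutions per site.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    rate = sxy / sxx                      # substitutions per site per time unit
    intercept = mean_y - rate * mean_x
    root_date = -intercept / rate         # date at which the distance is zero
    r_squared = (sxy * sxy) / (sxx * syy)  # strength of the temporal signal
    return rate, root_date, r_squared


# Hypothetical, perfectly clock-like data.
sampling_dates = [2000.0, 2005.0, 2010.0, 2015.0]
tip_distances = [0.010, 0.015, 0.020, 0.025]
rate, root_date, r_squared = root_to_tip_regression(sampling_dates, tip_distances)
print(rate, root_date, r_squared)
```

A low coefficient of determination would indicate a weak temporal signal, in which case an externally supplied rate, such as the H. pylori rate used here, becomes the main source of calibration.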

Results

Pangenome reconstruction, niche-specific accessory genome distribution and core genome-based phylogeny

In total, 15,095 orthologous gene clusters were identified, of which 399, designated as the core genome, were present among all Helicobacter strains (Table 2). The investigation of accessory genomic signals for niche specialization (i.e., gastric versus enterohepatic specialization) revealed that the larger part of the Helicobacter accessory gene pool is specific to enterohepatic species (4416, 30%) and only a minor fraction (1365, 9.2%) is specific to Helicobacter species with gastric tropism (Supplementary Figure 1a). Of this accessory gene pool, 44 and 7 genes are core in gastric and enterohepatic Helicobacter species, respectively. These two sets of genes revealed differences in biological processes between the two Helicobacter groups that could be related to their niche specialization. The genes are listed in Tables 3 and 4. The majority of the gastric core genes (38 out of 44) were absent in H. mustelae (Table 3), whereas almost half of the enterohepatic core genes (3 out of 7) were present (Table 4).

Table 3 Functional annotation of core genes of gastric Helicobacter species
Table 4 Functional annotation of core genes of enterohepatic Helicobacter species

The division between enterohepatic and gastric Helicobacter species is also evident in the phylogenetic reconstruction based on the Helicobacter core genome, which resulted in two separate monophyletic groups (Fig. 1). The gastric helicobacters, characterized by long internal and short external branches, could be further divided into the Hp (including H. pylori and its two closest relatives, H. acinonychis and H. cetorum) and NHPH (including the canine, feline, and porcine helicobacters) clades, in which each species was clearly represented by a monophyletic group (Fig. 1). On the contrary, the enterohepatic clade appears to be rather star-like (i.e., short internal and long external branches), suggesting either a rapid ancient radiation event or pervasive recombination among different species. H. mustelae, which has been associated with gastritis, peptic ulcers, MALT lymphoma, and adenocarcinoma in domestic ferrets [20, 24, 25], clustered within the clade of the enterohepatic Helicobacter species based on the core genomic variation (Fig. 1). This phenomenon was also reflected in the accessory genome, where it can be seen that H. mustelae shares more genes with the enterohepatic Helicobacter species than with the gastric species (Tables 3 and 4, Supplementary Figure 1b). Although enteric colonization has not yet been described, H. mustelae has been detected in high amounts in feces of young ferrets, suggesting lower bowel colonization or transit of the organism from its gastric niche [24, 25]. The above findings thus confirm previous hypotheses [16, 18], which emphasize the capability of H. mustelae to colonize both the stomach and the intestinal tract.

Fig. 1

Phylogeny and population structure of the Helicobacter genus. Phylogenetic tree based on the Helicobacter core genome, showing that the gastric and enterohepatic species (dashed lines) separated into two distinct clades, in which the gastric clade could be further divided into the Hp (including H. pylori and its two closest relatives H. acinonychis and H. cetorum; red-colored clade) and NHPH (including the canine and feline H. felis, H. salomonis, H. baculiformis, H. cynogastricus, H. bizzozeronii, H. heilmannii, and H. ailurogastricus, and the porcine H. suis; blue-colored clade) clades. Besides the name and ID of each species, its host and geographical origin are also indicated. The hierarchical Bayesian analysis of the Helicobacter population structure identified 16 clusters at the first hierarchical level (first colored bar) and 47 at the second and deepest hierarchical level (second colored bar). The scale bar shows the number of substitutions per site. The figure was drawn using iTOL

Population structure of the genus Helicobacter

We estimated the number of populations by grouping the 162 strains into genetically divergent clusters using BAPS [33]. Overall, the population assignment of the Helicobacter genomes was well correlated with the different clades of the core genome-based phylogenetic tree (Fig. 1). In total, 16 clusters were identified at the first hierarchical level (first colored bar in Fig. 1; Supplementary Table 2), where each gastric species was designated as a different population, with the exception of H. acinonychis (from wild felines), H. cynogastricus (singleton from a dog), and H. baculiformis (singleton from a cat). These latter species clustered with the H. pylori (in particular, the ancestral African strains), H. felis (from dogs and cats), and H. salomonis (from dogs) populations, respectively (Fig. 1; Supplementary Table 2). For the enterohepatic Helicobacter spp., however, several populations contained more than one species, as in the case of the population containing H. mustelae (Fig. 1; Supplementary Table 2). Clustering at a deeper level (second colored bar in Fig. 1; Supplementary Table 2) further differentiated the Helicobacter strains into finer populations (47 in total) and reassigned species to new populations (Fig. 1; Supplementary Table 2). More specifically, at this finer level of resolution, H. pylori, H. cetorum (from dolphins and whales), H. bizzozeronii (from a cat, dogs, and humans), H. heilmannii (from cats), H. ailurogastricus (from cats) and H. suis (from pigs) were divided into 3, 3, 2, 4, 3, and 3 subpopulations, respectively. Conversely, H. acinonychis (from wild felines), H. cynogastricus (from a dog), H. baculiformis (from a cat), H. mustelae (from a ferret) and several enterohepatic helicobacters were each assigned to a separate population (Fig. 1; Supplementary Table 2). The H. pylori and H. cetorum subpopulations reflect differences in geographical locations or hosts, whereas the two H. bizzozeronii subpopulations could partly correspond to host differences, namely cats versus dogs (Fig. 1). On the contrary, the subpopulations found within H. heilmannii, H. ailurogastricus, and H. suis could not be explained by differences in hosts or geography. The isolates of these latter species were each obtained from a different individual animal of the same host species (cat or pig) residing in the same country.

Inference of admixture events supporting interchangeability between species sharing the same niche, and the ecological barrier to genetic exchange between gastric and enterohepatic Helicobacter species

As recombination is an important contributor to sequence diversity and population heterogeneity within a bacterial species [39,40,41], we searched for evidence of intra- and interspecies genetic exchange events using different approaches.

An admixture analysis was first applied to the 47 hierarchically clustered populations. Figure 2 shows a colored map of the Helicobacter core genome phylogeny and the 47 BAPS clusters, where each color represents the estimated percentage of ancestry that a Helicobacter strain (as listed in Supplementary Table 3) derives from the corresponding BAPS population. The BAPS admixture analysis revealed patterns of intra-species admixture within H. pylori, H. bizzozeronii, H. heilmannii, H. ailurogastricus, and H. suis (Fig. 2). Conversely, signatures of interspecies admixture were only observed within the Hp and NHPH clades, and more specifically in the canine H. cynogastricus strain (BAPS-26; 34% of its ancestry being derived from the canine/feline H. felis (BAPS-25)), the feline H. baculiformis strain (BAPS-35; 16 and 6% of its ancestry being derived from the canine/feline H. felis (BAPS-25) and the canine H. salomonis (BAPS-34), respectively), H. cetorum from white-sided dolphins (BAPS-22 and BAPS-23; 21% of its ancestry being derived from the human H. pylori Amerindian and East Asian subpopulations (BAPS-2)) and the human H. pylori ancestral Africa 2 population (BAPS-3; 21.5% of its ancestry being derived from the wild feline H. acinonychis (BAPS-4)) (Fig. 2; Supplementary Table 3). Interestingly, admixture between members of the Hp and NHPH clades was not observed (dashed cyan rectangle in Fig. 2). In addition, the enterohepatic species were included in the BAPS model to explore any potential genetic exchange between species residing in a different niche. However, no such genetic exchange was found (dashed orange rectangle in Fig. 2), suggesting the existence of barriers to genetic exchange between these ecologically distinct groups.
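The way interspecies admixture is read from per-strain ancestry proportions can be sketched as below. The 16% and 6% values for H. baculiformis are taken from the text; the remaining 78% assigned to its own cluster, the 5% reporting threshold, and the function name are assumptions made purely for illustration and do not come from Supplementary Table 3.

```python
from typing import Dict, List, Tuple


def foreign_ancestry(
    proportions: Dict[str, float], home_cluster: str, threshold: float = 0.05
) -> List[Tuple[str, float]]:
    """List clusters other than the home cluster contributing at least `threshold`."""
    hits = [
        (cluster, share)
        for cluster, share in proportions.items()
        if cluster != home_cluster and share >= threshold
    ]
    # Largest contribution first.
    return sorted(hits, key=lambda item: item[1], reverse=True)


# H. baculiformis (BAPS-35): 16% from BAPS-25 and 6% from BAPS-34 as reported;
# the 78% remainder is assumed to belong to its own cluster.
baculiformis = {"BAPS-35": 0.78, "BAPS-25": 0.16, "BAPS-34": 0.06}
print(foreign_ancestry(baculiformis, "BAPS-35"))
# [('BAPS-25', 0.16), ('BAPS-34', 0.06)]
```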

Fig. 2

Admixture analysis based on the 47 BAPS predicted population groups at hierarchical level 2 (columns), in association with the Helicobacter core genome phylogeny. Each color indicates the estimated percentage of a Helicobacter strain to originate from the corresponding BAPS population. The percentages of admixture range from 0% (blue color) to 100% (yellow color), and are also indicated in Supplementary Table 3. The admixture sources for (1) H. pylori (BAPS1–3), H. acinonychis (BAPS4), and H. cetorum (BAPS22 (138563_11), BAPS23 (138563_9), and BAPS24) and (2) H. felis (BAPS25), H. cynogastricus (BAPS26), H. salomonis (BAPS34), and H. baculiformis (BAPS35) populations are highlighted in green and yellow dashed boxes, respectively. Admixture sources for H. bizzozeronii (BAPS7–8), H. heilmannii (BAPS15–18), H. ailurogastricus (BAPS27–29), and H. suis (BAPS39–42) are highlighted in red, blue, pink, and dark green dashed boxes, respectively. The species names shown in the gastric clade of the core genome tree are presented in the same color, where admixture was identified in the corresponding BAPS clusters. Signatures of admixture between enterohepatic species were only observed between H. cinaedi (BAPS19–20) and H. magdeburgensis (BAPS21), and are highlighted by a purple dashed box. The absence of admixture between (1) the Hp and NHPH clades and (2) the gastric and enterohepatic groups is highlighted by cyan and brown-red dashed boxes, respectively. The figure was drawn using Phandango [59] (https://jameshadfield.github.io/phandango/)

Chromosome painting and the fineSTRUCTURE algorithm were then applied to infer possible genetic exchange among gastric Helicobacter species at the strain level. Based on the above BAPS admixture results, the patterns of genetic exchange were further investigated between the canine/feline gastric helicobacters (Fig. 3) and between H. pylori, H. acinonychis, and H. cetorum (Fig. 4). In addition, the porcine H. suis species is a sister clade of the feline H. heilmannii and H. ailurogastricus species (based on the core genome phylogeny; Fig. 1). It might therefore be assumed that these species, despite not sharing the same host, are more closely related to each other than to the other members of the NHPH group. The inference of admixture among these three species was therefore also considered (Fig. 5). The co-ancestry matrices, visualized as heatmaps in Figs. 3–5, showed that many events of intra- and interspecies genetic exchange have occurred in both the Hp and NHPH groups.

Fig. 3

Co-ancestry matrix with evidence of genetic flux between H. felis, H. bizzozeronii, H. salomonis, and H. heilmannii, based on fineSTRUCTURE analysis. The singleton strains H. baculiformis and H. cynogastricus were excluded from this analysis. The color of each cell in the matrix indicates the expected number of chunks imported from a donor genome (column) into a recipient genome (row). The strain IDs (see Supplementary Table 1) and their corresponding host are indicated on the right and the species name is indicated in the tree topology. The highest interspecies recombination seen for H. heilmannii (1) and H. bizzozeronii (2) is highlighted by dashed black boxes

Fig. 4

Co-ancestry matrix, showing evidence of genetic flux between H. pylori, H. acinonychis, and H. cetorum, based on fineSTRUCTURE analysis. The color of each cell in the matrix indicates the expected number of chunks imported from a donor genome (column) into a recipient genome (row). The strain IDs (see Supplementary Table 1) and their corresponding hosts are indicated on the right and the species name is indicated in the tree topology. Admixture between (1) H. pylori (ancestral population Africa 2) and H. acinonychis, (2) H. pylori (particularly, ancestral Africa 2 and the W-African subpopulation) and H. cetorum and (3) H. acinonychis and H. cetorum, are highlighted by black dashed boxes

Fig. 5

Co-ancestry matrix, showing evidence of genetic flux between H. heilmannii, H. ailurogastricus, and H. suis, based on fineSTRUCTURE analysis. The color of each cell in the matrix indicates the expected number of chunks imported from a donor genome (column) into a recipient genome (row). The strain IDs (see Supplementary Table 1) and their corresponding hosts are indicated on the right and the species name is indicated in the tree topology. DNA imports from H. suis into H. heilmannii (1) or H. ailurogastricus (2) are highlighted by black dashed boxes

More specifically, the interspecific genetic exchange was particularly evident among canine and feline gastric Helicobacter spp. that can share the same hosts. H. heilmannii and H. bizzozeronii showed the highest level of interspecies recombination (Fig. 3). Both species have exchanged genetic material with each other, in particular the feline H. bizzozeronii strain (56877_22) with all feline H. heilmannii strains (Fig. 3). Moreover, from H. bizzozeronii, which also showed the highest intraspecies recombination, DNA was imported into the canine and feline H. felis, the feline H. ailurogastricus, and the canine H. salomonis species, whereas from H. heilmannii DNA was additionally imported into the feline H. ailurogastricus species (Fig. 3).

Within the Hp clade, H. cetorum showed the highest intraspecies admixture (Fig. 4). The analysis also revealed admixture between (1) H. pylori (ancestral population Africa 2) and H. acinonychis, (2) H. pylori (particularly, ancestral Africa 2 and the W-African subpopulation) and H. cetorum and (3) H. acinonychis and H. cetorum (Fig. 4).

Within the NHPH clade, furthermore, a few signatures of genetic exchange from porcine H. suis to the feline H. heilmannii or H. ailurogastricus were noted (Fig. 5).

Dates of the most recent common ancestor (TMRCA) within the gastric Helicobacter clade

The ages of splits between individual lineages within the gastric Helicobacter species were inferred by applying fast dating using least-squares criteria and algorithms implemented in the LSD software, which estimates the substitution rate and the dates of all ancestral nodes of a given tree [37]. We first analyzed the date of the split between the most ancient H. pylori population (Africa 2) and H. acinonychis, and their TMRCA was estimated to be 158.7 kya (95% CI: 157.5–159.7). This is in the range of the previously inferred coalescent dates (within the last 200 kya [12]; 88–116 kya [4]), thus validating our approach. To infer the split between H. pylori and H. cetorum, we analyzed the core genome alignment of the H. pylori Africa 2 population and the H. cetorum strains from marine mammals. The split date between H. cetorum and H. pylori was estimated to be ~610 kya (95% CI: 608.2–612.5) (Fig. 6a).

Fig. 6

Time trees of Helicobacter subgroups. a The divergence between H. pylori (ancestral Africa 2) and H. cetorum began 610 kya. b Time of divergence among the H. baculiformis, H. salomonis, H. cynogastricus and H. felis subgroups ranged between 1.52 Mya and 689 kya. The scale bar indicates the number of years. Figures were drawn using iTOL

Interestingly, within the NHPH clade, speciation appeared to have occurred much earlier than within the H. pylori clade. More specifically, the TMRCA for the NHPH species was estimated to be 1.96 Mya (95% CI: 1.947–1.967) (Fig. 6b). Moreover, the TMRCAs of (1) the H. felis, H. baculiformis, H. cynogastricus, and H. salomonis subgroup and (2) the H. heilmannii and H. ailurogastricus subgroup were inferred to be 1.52 Mya (95% CI: 1.514–1.525) and 690 kya (95% CI: 687.5–693.5), respectively (Fig. 6b).

Discussion

The Helicobacter genus currently comprises 46 Gram-negative species that have established symbiotic relationships in the gastrointestinal tract of one or more hosts; several of these are of pathogenic importance both to humans and animals [17]. This work aimed to elucidate a scenario of Helicobacter evolution and, in particular, the natural history of gastric Helicobacter species based on comparative genomics encompassing the species known to date.

Here, we first characterized the Helicobacter pangenome at the genus level, to investigate gene variation and phylogeny. The enterohepatic Helicobacter species harbor a larger accessory gene pool than the gastric Helicobacter species (Supplementary Figure 1), which could be attributed to their larger genome size (enterohepatic: median genome length of ca. 2 Mb versus gastric: median genome length of 1.63 Mb; [16]). However, this could also be due to the fact that the intestinal niche is less hostile to bacteria than the acidic environment of the stomach, and is thus populated by a richer microbiome that allows frequent exchange of genetic features [1, 42]. Identifying mechanisms involved in niche adaptation is crucial to understanding pathogen evolution [43,44,45]. In our study, we explored genetic features that distinguish gastric from enterohepatic Helicobacter species, as in recent studies [46, 47], but with a more than 3-fold larger set of genome sequences of gastric NHPH and enterohepatic species. Genes specific to the gastric Helicobacter species were related to nickel homeostasis (e.g., nickel-binding protein Hpn [47]), peptide transport [48], outer membrane biogenesis (particularly outer membrane proteins (OMPs) from the Hor and Hof family; [49]) and tryptophan metabolism. Genes specific to the enterohepatic Helicobacter species were related to resistance to macrolides and the ability to synthesize L-arginine from L-ornithine. Arginine and ornithine are precursors of nitric oxide and polyamines, respectively, and play essential roles in permeability and adaptive responses of the gut [50]. Acquisition of the above traits is likely to be important for the adaptation of Helicobacter to the gastric or enterohepatic niche.

Ecological demarcations were further underlined by the Helicobacter core genome phylogeny, which resolved separate gastric and enterohepatic clades (Fig. 1). Boundaries between bacterial populations due to ecological segregation could limit recombination events, which play an important evolutionary role in the speciation process by defining the population structure of bacterial species [40, 41]. Indeed, in our study, genetic exchange between the gastric and enterohepatic Helicobacter species was not observed (Fig. 2), thus supporting the presence of an ecological barrier, which could have arisen when the first part of the digestive tract became specialized (i.e., the stomach with acid production). Nevertheless, a genetic barrier effect could not be ruled out as an alternative explanation. In this latter case, the reduced efficiency of mismatch repair following homologous recombination between divergent sequences may represent a major contributing factor, as shown for Campylobacter [34, 44].

For the gastric Helicobacter group, however, extensive gene-flow occurred within each gastric species, emphasizing the genetic variability which is a known hallmark of Helicobacter [16, 40]. Admixture between species was also noted. As pets can be colonized by multiple gastric species [14], the genetic exchange among the canine and feline NHPH was evident with H. heilmannii and H. bizzozeronii, showing the highest level of inter-species recombination (Fig. 3). These two latter species are also the most prevalent species in cats and dogs, respectively [16, 17]. Furthermore, our admixture analysis based on the BAPS population assignment indicated that the singleton pet-associated strains canine H. cynogastricus and feline H. baculiformis are hybrids, in that they have received a considerable amount of DNA from H. felis (Fig. 2).

Signatures of admixture were also observed between closely related species not sharing the same host. These phenomena most likely represent remaining signals of shared ancestry. The signatures of genetic exchange from the porcine H. suis to the feline H. heilmannii or H. ailurogastricus (Fig. 5) might reflect shared metabolic genetic features, since these species have the same in vitro growth requirements, as opposed to the other NHPH [17, 22, 51]. Signals of remaining ancestry were also observed within the Hp clade (Fig. 4). The admixture noted between H. pylori ancestral Africa 2 and H. acinonychis from wild felines is in agreement with previous findings [4], where it was shown that the progenitors of this ancient H. pylori population are the source of H. acinonychis. Furthermore, gene flow events were also observed between H. cetorum from white-sided dolphins and the human Africa 2 and W-African H. pylori, based on the fineSTRUCTURE analysis at the strain level (Fig. 4). The shared ancestry observed between H. cetorum and the ancient African H. pylori strains might suggest that the common ancestor of both species originated on the African continent, as already suggested for H. pylori [4].

Elucidating the ancestral ages of splits between individual lineages within the gastric Helicobacter phylogeny completes the reconstruction of the historical path of these species [4]. This has already been well examined for H. pylori and H. acinonychis, where the minimum age of association between H. pylori and humans was estimated to be ~100 kya. Furthermore, H. acinonychis resulted from a later host jump from humans (ancestral Africa 2) to large felines ca. 43–56 kya [4]. In our study, we found that the coalescent for H. pylori (ancestral Africa 2) and H. cetorum dated to ca. 600 kya. Modern humans and Neanderthals also diverged in this timeframe [52]. This confirms the presence of H. pylori in humans before domestication and that the acquisition of H. pylori by humans resulted from a jump from an animal host [4]. Furthermore, the evolution of cetaceans had already occurred on the Indian subcontinent 5 Mya [53]. This indicates that H. cetorum did not coevolve with marine mammals, but made the jump to white-sided dolphins much more recently (ca. 500 kya), and diverged thereafter to other cetaceans ca. 410 kya. Coevolution between a microbe and its host generally results in decreased pathogenicity, but a disruption caused by jumps between hosts has been associated with an increase in disease severity, as shown for Staphylococcus aureus and recently also for H. suis [54, 55]. Since infection with H. cetorum has been correlated with gastric ulcers in marine mammals [13, 17], these animals are probably not the natural host. Whether H. cetorum originates from a marine or terrestrial ecosystem remains to be elucidated.

The gastric Helicobacter species from domestic animals are much more distinct from H. pylori and its closest relatives, H. cetorum and H. acinonychis. Given the lack of admixture signals between Hp and NHPH, it is likely that these two groups did not evolve together, but rather in parallel, following very distinct evolutionary paths. This hypothesis was supported by the estimated times of divergence between the different canine and feline NHPH, which indicate an ancient association between these species and their hosts. The most recent common ancestor for the different canine/feline NHPH subsets ranged from 1.96 Mya to 690 kya. Taking into account that the common ancestor of cats and dogs is dated to ca. 3–4 Mya [56, 57] and that domestication has occurred more recently [4], our estimates suggest that these pet-associated Helicobacter species coevolved with their hosts long before the domestication of either cats or dogs. To date, these Helicobacter species cause little or no harm in cats and dogs [14], suggesting that pets are most probably their original hosts. In contrast, the common ancestor of H. suis, the other member of the NHPH group, which initially originated from asymptomatic non-human primates, existed until approximately 200 kya (193–197 kya) [55]. This species made a host jump to pigs between 100 and 15 kya, where it causes gastric disease, and pig domestication has had a significant impact on the spread of H. suis in the pig population [55, 58]. As NHPH can occasionally also infect humans, it is plausible to assume that their zoonotic potential emerged after domestication of cats, dogs and pigs (ca. 100 kya).

In summary, our data provide new insights into the evolution of gastric Helicobacter species, with comprehensive evidence of admixture between species and date estimates for the historical events of gastric Helicobacter speciation.