Pangenome analysis reveals genetic isolation in Campylobacter hyointestinalis subspecies adapted to different mammalian hosts

Campylobacter hyointestinalis is an emerging pathogen currently divided into two subspecies: C. hyointestinalis subsp. lawsonii, which is predominantly recovered from pigs, and C. hyointestinalis subsp. hyointestinalis, which can be found in a much wider range of mammalian hosts. Despite C. hyointestinalis being reported as an emerging pathogen, its evolutionary and host-associated diversification patterns are still largely unexplored. For this reason, we generated whole-genome sequences of 13 C. hyointestinalis subsp. hyointestinalis strains and performed a comprehensive comparative analysis including publicly available C. hyointestinalis subsp. hyointestinalis and C. hyointestinalis subsp. lawsonii genomes, to gain insight into the genomic variation of these differentially adapted subspecies. Both subspecies are distinct phylogenetic lineages that present an apparent barrier to homologous recombination, suggesting genetic isolation. This is further supported by accessory gene patterns that recapitulate the core genome phylogeny. Additionally, C. hyointestinalis subsp. hyointestinalis presents a bigger and more diverse accessory genome, which probably reflects its capacity to colonize different mammalian hosts, unlike C. hyointestinalis subsp. lawsonii, which is presumably host-restricted. This greater plasticity in the accessory genome of C. hyointestinalis subsp. hyointestinalis correlates with a higher incidence of genome-wide recombination events, which may be the underlying mechanism driving its diversification. Concordantly, both subspecies present distinct patterns of gene families involved in genome plasticity and DNA repair, such as CRISPR-associated proteins and restriction-modification systems. Together, our results provide an overview of the genetic mechanisms shaping the genomes of C. hyointestinalis subspecies, contributing to the understanding of the biology of Campylobacter species that are increasingly recognized as emerging pathogens.

infections. Among them, C. hyointestinalis is an emerging pathogen that was first isolated from swine with proliferative enteritis 5 and has since been sporadically recovered from human infections but also as a commensal from a wide variety of wild, farm and domestic mammals (including cattle, pigs, dogs, hamsters, deer and sheep 6).

C. hyointestinalis is currently divided into two subspecies, namely C. hyointestinalis subsp. lawsonii and C. hyointestinalis subsp. hyointestinalis, based on genetic and phenotypic traits 9,10. While C. hyointestinalis subsp. hyointestinalis has a broad host range, C. hyointestinalis subsp. lawsonii is restricted to pigs. Some pioneering studies at both genetic and protein levels have suggested that C. hyointestinalis harbors even further intra-species diversity 11-13, which could facilitate its adaptation to diverse hosts and environments. However, these observations remain to be assessed at higher resolution due to the lack of available genomic data for both subspecies, so the evolutionary forces driving its genetic and ecological distinction have not been explored at the whole-genome level.

Here, we whole-genome sequenced 13 C. hyointestinalis subsp. hyointestinalis strains isolated from healthy cattle and one strain isolated from a natural watercourse that were sampled on farms located around Sherbrooke, Québec, Canada. By incorporating this information into the available genomes of both subspecies, we performed a pangenome analysis to elucidate the main sources of molecular diversity in both subspecies and the probable genetic mechanisms and functional characteristics that distinguish the host-restricted C. hyointestinalis subsp. lawsonii from the generalist C. hyointestinalis subsp. hyointestinalis. Our work provides the first comprehensive analysis of C. hyointestinalis subspecies at the pangenome level and will guide future efforts to understand the patterns of host-associated evolution in emerging Campylobacter pathogens.

By whole-genome sequencing 13 C. hyointestinalis subsp. hyointestinalis strains, we enlarged by 45% the current collection of available genomes for C. hyointestinalis. Then, by recovering 29 additional genomes of C. hyointestinalis subsp. hyointestinalis (n = 19) and C. hyointestinalis subsp. lawsonii (n = 10) from public databases, we built a genomic dataset consisting of 42 genomes (Table 1). These genomes represent strains isolated between 1985 and 2016 from 5 different hosts in 6 different countries. This dataset was subsequently used to apply comparative pangenomic, phylogenetic and ecological approaches to uncover the main sources of genetic variability in C. hyointestinalis subspecies.

C. hyointestinalis subspecies are genetically isolated lineages. We first reconstructed the species clonal phylogeny starting from a core genome alignment that consisted of 1,320,272 positions (representing 66% of the longest genome), but after removing recombinant regions only 81,000 positions (representing 6% of the original core genome alignment) remained in the clonal frame. The resulting clonal phylogeny showed a highly structured topology with both subspecies completely separated in two distinct lineages (Fig. 1A, Fig. S1). This observation, together with the clear differences in host distribution suggesting that both subspecies occupy isolated ecological niches (Fig. 1B), led us to hypothesize that C. hyointestinalis subspecies are undergoing a speciation process driven by host allopatry. Indeed, this was supported by a mean Average Nucleotide Identity (ANI) of ~95% separating C. hyointestinalis subsp. hyointestinalis from C. hyointestinalis subsp. lawsonii (Fig. 1C), which is assumed to be a lower boundary to assign bacterial genomes to the same species 14.
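To make the ANI comparison above concrete, the following is a minimal sketch of the underlying arithmetic, not the procedure used in this study. It assumes that genome fragments have already been aligned pairwise (real ANI tools fragment one genome and align the pieces against the other). The function names, the toy sequences and the `SPECIES_THRESHOLD` constant are hypothetical; only the ~95% boundary is taken from the text.

```python
def percent_identity(a, b):
    """Percent of identical positions between two equal-length aligned
    fragments, skipping columns where either fragment has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("aligned fragments must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def mean_ani(fragment_pairs):
    """Unweighted mean of per-fragment identities (a simplification)."""
    identities = [percent_identity(a, b) for a, b in fragment_pairs]
    return sum(identities) / len(identities)


# Boundary quoted in the text; the constant name is an assumption.
SPECIES_THRESHOLD = 95.0

# Hypothetical aligned fragment pairs, not data from the study.
pairs = [
    ("ACGTACGTAC", "ACGTACGTAC"),  # 10 of 10 columns identical
    ("ACGTACGTAC", "ACGTTCGTAC"),  # 9 of 10 columns identical
    ("ACG-ACGTAC", "ACGTACGAAC"),  # gap column skipped, 8 of 9 identical
]

ani = mean_ani(pairs)
same_species = ani >= SPECIES_THRESHOLD
print(f"mean ANI = {ani:.2f}%, at or above threshold: {same_species}")
```

In this toy case the mean identity falls below the boundary, so the two hypothetical genomes would not be assigned to the same species under that criterion.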
Further evidence supporting the genetic isolation of both subspecies comes from exploring genome-wide recombination patterns, which revealed a strong barrier to homologous recombination between C. hyointestinalis subsp. hyointestinalis from C.
hyointestinalis subsp. hyointestinalis seems to be much more recombinogenic than C. hyointestinalis subsp. lawsonii, respectively. Accordingly, Figure 2A shows a statistically significant difference in accessory genome size in favor of C. hyointestinalis subsp. hyointestinalis (p = 0.023, Mann-Whitney U test). This tendency was also observable when calculating the diversity of accessory genes using the inverse Simpson's index for both subspecies (p = 0.00021, Mann-Whitney U test) (Fig. 2B). Accessory gene presence/absence patterns also allowed complete discrimination between C. hyointestinalis subsp. hyointestinalis and C. hyointestinalis subsp. lawsonii using a Principal Components Analysis, indicating that they have subspecies-specific accessory gene repertoires (Fig. 2C). Indeed, 1,562 accessory gene clusters were exclusively found in C. hyointestinalis subsp. hyointestinalis genomes while only 618 were specific to C. hyointestinalis subsp. lawsonii genomes.
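The accessory genome comparisons above can be illustrated with a small sketch, assuming a presence/absence table of accessory gene clusters per genome. All genome names, cluster identifiers and helper functions below are hypothetical. The sketch computes only the Mann-Whitney U statistic (a p-value would need the exact null distribution or a statistics library such as SciPy's `scipy.stats.mannwhitneyu`), and it shows the inverse Simpson's index itself without claiming how the index was applied to each genome in this study.

```python
from itertools import product

# Hypothetical toy data: genome name -> set of accessory gene cluster IDs.
ACCESSORY = {
    "hyo_1": {"g1", "g2", "g3", "g4", "g5"},
    "hyo_2": {"g1", "g2", "g4", "g6", "g7", "g8"},
    "hyo_3": {"g2", "g3", "g5", "g6", "g9", "g10", "g11"},
    "law_1": {"g1", "g12", "g13"},
    "law_2": {"g1", "g12", "g14", "g15"},
}
SUBSPECIES = {
    "hyo_1": "hyo", "hyo_2": "hyo", "hyo_3": "hyo",
    "law_1": "law", "law_2": "law",
}


def mann_whitney_u(x, y):
    """U statistic for x versus y: the number of (xi, yj) pairs with
    xi > yj, with each tie counted as 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0 for a, b in product(x, y)
    )


def inverse_simpson(counts):
    """Inverse Simpson's index, 1 / sum(p_i ** 2), over non-zero counts."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def clusters_of(group):
    """Union of accessory clusters over all genomes of one subspecies."""
    members = [g for name, g in ACCESSORY.items() if SUBSPECIES[name] == group]
    return set().union(*members)


# Accessory genome size per genome, grouped by subspecies.
sizes = {
    group: [len(ACCESSORY[n]) for n in sorted(ACCESSORY) if SUBSPECIES[n] == group]
    for group in ("hyo", "law")
}
u_stat = mann_whitney_u(sizes["hyo"], sizes["law"])

# Clusters found exclusively in one subspecies.
hyo_only = clusters_of("hyo") - clusters_of("law")
law_only = clusters_of("law") - clusters_of("hyo")

print("U statistic:", u_stat)
print("hyo-specific clusters:", len(hyo_only))
print("law-specific clusters:", len(law_only))
print("inverse Simpson of four equally frequent clusters:",
      inverse_simpson([1, 1, 1, 1]))
```

Representing each genome as a set keeps the subspecies-specific counts a simple set difference; a larger dataset would more naturally use a binary matrix, which is also the usual input for a Principal Components Analysis.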

These results support the hypothesis that both subspecies have been diverging in isolation from each other for a considerable time, which has probably affected the dynamics of their accessory genes and has resulted in specific gene repertoires confined to each subspecies.

indicating that these systems are differentially present in ecologically distinct niches, resembling again the patterns we observed between C. hyointestinalis subspecies.

The maintenance of lineage-specific repertoires of molecular machineries that modulate genome plasticity is probably a widespread mechanism in Campylobacter, considering that recombination is an important evolutionary force for the adaptation and acquisition of a host signature in well-known Campylobacter pathogens 22. In general, adaptation occurs in favor of gradual host specialization, but generalism is also widely observed in nature, for example in extremely successful C. jejuni lineages that can be found at high prevalence in both agricultural sources and human infections 23. A generalist phenotype can be thought of as an advantage for bacteria that colonize farm animals, since it allows persistence in multiple mammalian species that thrive in close proximity. However, this also represents an increased risk for zoonotic transmission, since these animals are usually in contact with humans. Indeed, this scenario is reflected in C. hyointestinalis subspecies, given that the generalist C. hyointestinalis subsp. hyointestinalis has been frequently isolated from human infections, in contrast to C. hyointestinalis subsp. lawsonii, which is restricted to pigs and very infrequently reported in humans.