In silico proteomic and phylogenetic analysis of the outer membrane protein repertoire of gastric Helicobacter species

Helicobacter (H.) pylori is an important risk factor for gastric malignancies worldwide. Its outer membrane proteome plays an important role in colonization of the human gastric mucosa. However, in zoonotic non-H. pylori helicobacters (NHPHs) also associated with human gastric disease, the composition of the outer membrane (OM) proteome and its relative contribution to disease remain largely unknown. By means of a comprehensive survey of the diversity and distribution of predicted outer membrane proteins (OMPs) identified in all known gastric Helicobacter species with fully annotated genome sequences, we found genus- and species-specific families known or thought to be implicated in virulence. Hop adhesins, part of the Helicobacter-specific family 13 (Hop, Hor and Hom), were restricted to the gastric species H. pylori, H. cetorum and H. acinonychis. Hof proteins (family 33) were putative adhesins with predicted Occ- or MOMP-family-like 18-stranded β-barrels. They were found to be widespread amongst all gastric Helicobacter species and only sporadically detected in enterohepatic Helicobacter species. The latter are other members of the genus Helicobacter that are ecologically and genetically distinct. LpxR, a lipopolysaccharide remodeling factor, was also detected in all gastric Helicobacter species but was lacking from the enterohepatic species H. cinaedi, H. equorum and H. hepaticus. In conclusion, our systematic survey of Helicobacter OMPs points to species- and infection-site-specific members that are interesting candidates for future virulence and colonization studies.


Results
Identification of OMP families. Fifty-four genomes were analysed for candidate OMPs by screening their open reading frames for proteins with signal peptide and β-strand propensities using the HHomp tool 22. A total of 3380 putative OMPs were identified among the different strains of E. coli (500) and the genera Campylobacter (141) and Helicobacter (2739). An overview of the number of OMPs per species and per strain is presented in Table 1. For the Campylobacter and the E. coli strains, the total number of OMPs per genome ranged between 21 and 27, and between 88 and 116, respectively. The mean number of OMPs was lower for the 4 enterohepatic helicobacters (a mean of 33 OMPs, with a minimum of 24 OMPs for H. equorum eqF1 and a maximum of 55 OMPs for H. trogontum R3554) than for the gastric Helicobacter species (a mean of 66 OMPs, with a minimum of 47 OMPs for H. salomonis M45 and a maximum of 118 OMPs for H. cetorum MIT 00-7128). H. mustelae, which is able to colonize both the gastric and intestinal environment 23, contained 49 OMPs (Table 1). Subsequently, these OMPs were classified into families based on protein sequence homology. Of the 3380 OMPs, 2794 proteins could be classified into one of the 90 families known from the OMPdb reference database (OMPdb.org). The results are listed in Supplementary Table S1. For each of the 90 OMP families, the name and function are shown, as well as the OMPs from the E. coli, Campylobacter and Helicobacter strains that clustered into that family.
The remaining 586 protein sequences were then searched against the Pfam protein database, resulting in the additional classification of 308 proteins into 31 putative OMP families with unknown biological function. These families were assigned a systematic family ID from "X1" to "X31". The names of these 31 candidate OMP families and the predicted OMPs from the E. coli, Campylobacter and Helicobacter strains that clustered among these families are shown in Supplementary Table S2.
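The stepwise classification described above works as a funnel, and its bookkeeping can be checked with a few lines of Python. The sketch below uses only the counts reported in the text; the variable names are illustrative and not part of the original analysis.

```python
# Counts of putative OMPs per group, as reported in the text.
per_group = {"E. coli": 500, "Campylobacter": 141, "Helicobacter": 2739}

total = sum(per_group.values())      # all putative OMPs
in_ompdb = 2794                      # assigned to one of the 90 OMPdb families
in_pfam = 308                        # assigned to one of the 31 Pfam-derived X families

after_ompdb = total - in_ompdb       # sequences left after the OMPdb step
after_pfam = after_ompdb - in_pfam   # sequences left after the Pfam step

print(total, after_ompdb, after_pfam)
```

The last value is the number of sequences that remain unclassified after both database searches and that are passed on to sequence clustering.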
Finally, the remaining 278 unclassified protein sequences were clustered into phylogenetic groups using the CD-HIT program with algorithm settings chosen to produce the smallest possible number of clusters. With this method, 106 clusters were produced. The clusters with single members that had a length of less than 120 amino acids (a total of 52 sequences) were excluded. This final clustering revealed 75 families of unknown annotation or function, which were assigned a systematic family ID of "Y1" to "Y75" (Supplementary Table S3). To visualize the relative distribution of the OMP families amongst E. coli and the Campylobacter and Helicobacter species, the predicted OMPdb families and the X and Y families were presented in Venn diagrams (Fig. 1A). The OMP families from the OMPdb database seemed to be well conserved among E. coli and Helicobacter, and to a lesser extent in Campylobacter (Fig. 1A). Of the 90 OMPdb families, 19 families were unique for E. coli, 10 families for the Helicobacter species, and only 1 family for the Campylobacter species. Several of the X-families from the Pfam database were found to be genus-specific (Fig. 1A). Of the 31 X-families, 10 families were unique for E. coli, 3 families for Campylobacter, and 3 for Helicobacter. Of the 75 Y-families, 46 families were unique for Helicobacter (especially in the helicobacters from cats and dogs), 2 families for Campylobacter and 6 for E. coli (Fig. 1A).
The phylogenetic tree of OMP family 13 shows that 6 subgroups (clades) can be distinguished (Fig. 2). Strikingly, the known Hop adhesins (2) all cluster into a monophyletic clade (Fig. 2, clade 2) that is specific for H. pylori (dark blue), its closest relative H. acinonychis (red), and H. cetorum (light purple), and that was found to be completely absent in canine, feline and porcine gastric NHPH. In H. pylori family 13, clade 2 comprises several group members, of which AlpA, AlpB, HopA, HopF, HopI, HopG, HopL and OipA were found to be conserved in H. acinonychis and H. cetorum, which additionally held orthologs of BabB and HopD, and of SabA, respectively (Fig. 2). BLAST analyses revealed that several putative OMPs from the canine, feline and porcine gastric NHPH species clustered into these subgroups. For the enterohepatic Helicobacter species and H. mustelae, only orthologous putative OMPs of HorD and HorG were present in this family (clade 1 in Fig. 2). The orthologous OMP from C. coli (76339) and C. jejuni (00-2425) also clustered in this latter subgroup. Interestingly, subgroup 6 (Fig. 2) contained putative OMPs that were present in canine, feline and porcine Helicobacter species only.
[...] proteins (Fig. 3A). These 8 subgroups were highly conserved among H. acinonychis, H. cetorum, as well as canine, feline and porcine NHPH species, except for HofB, which is only present in H. bizzozeronii. In contrast, the analysed enterohepatic helicobacters and H. mustelae only contained a few putative Hof-like OMPs, which did not cluster with any of the 8 Hof proteins known from H. pylori. Hof proteins lack close homologs outside the genus Helicobacter. They are of unknown function except for reports in H. heilmannii that implicate HofE and HofF in adherence to human gastric mucins 19. To gain insight into the putative Hof function, representative sequences of the 8 Hof subgroups were subjected to remote homology and fold recognition searches using the Protein Homology/analogy Recognition Engine 2 (PHYRE2) 24 and RaptorX 25. Both algorithms picked up high-confidence structural homology to 18-stranded β-barrels of the Outer membrane Carboxylate Channel (Occ) family proteins (formerly Opd/Opr) found in Pseudomonas and Acinetobacter 26,27, as well as to the 18-stranded Major Outer Membrane Protein (MOMP) of C. jejuni 28.
In support of this homology assignment, the transmembrane elements in the 3D threaded structure are also found by the transmembrane β-barrel protein predictor BOCTOPUS 29, have plausible hydrophobic residue distributions, and show the presence of 2 aromatic "belts", as is frequently observed in OM β-barrels (Supplementary Fig. S1). The Occ family OMPs are monomeric 18-stranded porins found in species lacking large-channel trimeric porins, and are implicated in non-specific diffusion of small carboxylate-containing solutes across the OM 26,27. C. jejuni MOMP can be found either as a monomeric or as a trimeric porin of 18-stranded β-barrels involved in cation-selective passive diffusion over the OM 30, though it has also been implicated in bacterial adherence 31.
Family 36 - Systemic factor protein A (SfpA/LpxR). The prototype of this family is the systemic factor protein A (SfpA) of Yersinia enterocolitica. Lipid A deacylase (LpxR), an SfpA homologue in Salmonella Typhimurium, has been shown to be important for immune evasion through lipopolysaccharide modification 32. In H. pylori, too, LpxR homologs play a key role in the establishment of long-term colonization 33.
Except for H. cinaedi, H. equorum, and H. hepaticus, the examined Helicobacter strains contained between 1 and 2 orthologs of SfpA/LpxR per genome (Family 36 of the OMPdb database, Supplementary Table S1), suggesting that LPS remodeling by lipid A deacylation is a common characteristic of most Helicobacter species. In contrast, only one orthologous protein was present in E. coli strain TW14359, and no members of the SfpA/LpxR family were detected in Campylobacter and the other E. coli strains (Supplementary Table S1). As shown in the phylogenetic tree (Fig. 4), the H. mustelae orthologous OMP clustered closer to enterohepatic helicobacters than to gastric NHPH species.
Family X1 and X3 - (Putative) vacuolating cytotoxin family. The secreted vacuolating cytotoxin A (VacA), belonging to the autotransporter OMP family, is an important virulence factor of H. pylori. After binding to and internalization into host epithelial cells, VacA induces cellular vacuolation and various other responses 34,35. Strikingly, most other Helicobacter species lack VacA homologs, except for H. cetorum and H. acinonychis.
In the present study, all analysed H. pylori and H. cetorum strains each harbored one vacA copy (Family X1 of the Pfam database, Supplementary Table S2), whilst H. acinonychis contained more than one vacA copy, although, unlike in H. pylori and H. cetorum, these were inactivated by insertion sequences, as described before 15,36. Besides VacA, H. pylori also contains 3 VacA-like autotransporters that each enhance its capacity to colonize the stomach 35. Previously, we reported the presence of such a vacA-like autotransporter gene in canine, feline and [...] (6); H. pylori HorA, HopK and HopJ (5) (Fig. 5).
Well-conserved OMP families present among Helicobacter, Campylobacter and/or E. coli with a role in virulence or colonization.
Family 14 - Imp/OstA. Imp (increased membrane permeability) or OstA (organic solvent tolerance) is an organic solvent tolerance protein in Gram-negative bacteria that participates in outer membrane biogenesis and integrity 38,39. The Imp/OstA protein is implicated in the translocation and insertion of LPS into the outer leaflet of the OM bilayer. It has also been associated with membrane permeability, organic solvent tolerance and resistance to antibiotics in H. pylori 40,41. We identified 48 orthologous outer membrane proteins in the Imp/OstA family (Family 14 of the OMPdb database, Supplementary Table S1). In each of the examined strains of Campylobacter and Helicobacter, except for H. trogontum, one orthologous Imp/OstA protein was found. However, the latter species harbored an additional putative OstA paralog (Family X18, Supplementary Table S2). The phylogenetic tree of OMP family 14 is shown in Supplementary Fig. S2. In accordance with the other families, H. mustelae clustered closer to enterohepatic helicobacters than to gastric NHPH species.
Family 42 - Outer membrane factor (OMF). Gram-negative bacteria possess energy-dependent transport systems to export proteins, carbohydrates, drugs and heavy metals across the two membranes of the cell envelope 42. Type 1 protein secretion systems (e.g. HlyBD-TolC) and RND multidrug efflux systems (e.g. AcrA/B-TolC or MexAB-OprM) consist of a cytoplasmic or inner membrane (IM) export system, a membrane fusion protein (MFP) and an outer membrane factor (OMF). TolC, the prototype OMF found in E. coli, forms a trimeric 12-stranded β-barrel (3 × 4 strands) with an extended coiled-coil domain reaching into the periplasm and contacting the IM-localized export system via the MFP 43. These transport systems have been shown to play a role in protein export and multidrug efflux, the latter producing both intrinsic and elevated multidrug resistance 42,44,45. In total, 219 orthologous proteins were identified as belonging to the OMF family (Family 42 of the OMPdb database, Supplementary Table S1). Different members of the OMF family were present in all the analysed E. coli, Campylobacter and Helicobacter strains. With protein BLAST, several different OMP subgroups could be distinguished in the OMF family (Supplementary Table S4). These different subgroups are indicated in the phylogenetic tree of the OMF family (Supplementary Fig. S3). Here too, the H. mustelae orthologs of this OMP family clustered with enterohepatic helicobacters rather than with gastric Helicobacter species.

Family 3 - Outer Membrane Receptor (OMR-TonB Dependent Receptor) and Family X2 - TonB-dependent Receptor Plug Domain. TonB-dependent transporters are bacterial outer membrane proteins that bind and transport ferric chelates called siderophores, as well as vitamin B12, nickel complexes, and carbohydrates into the periplasm, for which they use energy from the proton motive force of the cytoplasmic membrane via the TonB-ExbB-ExbD membrane proteins. The TonB-dependent outer membrane receptors have also been shown to be required for bacterial virulence 46. In the present study, two TonB-dependent outer membrane receptor families were identified. [...] apart from H. salomonis. OMPs belonging to this TonB-dependent receptor family were absent in C. coli and C. jejuni. The phylogenetic tree of OMP family X2 is shown in Supplementary Fig. S5.
Family 38 - Outer Membrane Phospholipase (OMPLA). The outer membrane phospholipase A (OMPLA), encoded by the pldA gene, which is widespread among Gram-negative bacteria, hydrolyses acyl ester bonds in phospholipids and lysophospholipids 47,48. OMPLA has been described as a virulence factor. For instance, in C. coli, OMPLA was identified as a major hemolytic factor, and H. pylori OMPLA has been shown to be involved in the colonization and invasion of the human gastric mucosa 47,49-52. Moreover, H. pylori isolates with high OMPLA activity have been associated with peptic ulcer disease in human patients 53,54. In the present study, 53 proteins were classified in the OMPLA family (Family 38 of the OMPdb database, Supplementary [...] Fig. S6).

Discussion
In this study, we have analysed the genome sequences from a total of 54 different strains of the genera Helicobacter, Campylobacter and E. coli for the presence of OMPs. Considering the genome length and the total protein count, gastric Helicobacter species harbor proportionally more OMPs than other helicobacters or the Campylobacter or Escherichia reference genomes. Gastric helicobacters have small genomes and proteomes compared to E. coli, which has a median genome length and protein count of, respectively, 5.17 Mb and 4931 proteins according to the NCBI database. In contrast, H. pylori has a median genome length of only 1.63 Mb with a median protein count of 1451. However, H. pylori's surface-localized proteome did not shrink proportionally, and ~4% of its total protein count consists of OMPs, compared to just ~2% for E. coli. This large collection of OMPs in an otherwise reductionist genome suggests they form important fitness factors in the survival and adaptation to the harsh gastric environment 4. Among the gastric Helicobacter species, the total OMP number was the highest in H. acinonychis (a mean of 97) and H. cetorum strains (a mean of 110), although it should be noted that these species contain multiple fragmented OMPs (Table 1).
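The proportions quoted above can be sanity-checked from the numbers given in the text. The following is a rough sketch only: it combines the per-genome OMP range reported for the E. coli strains in the Results with the NCBI median protein count, which is an approximation because those figures describe different sets of genomes. The variable and function names are illustrative.

```python
ECOLI_PROTEINS = 4931         # median E. coli protein count quoted in the text
HPYLORI_PROTEINS = 1451       # median H. pylori protein count quoted in the text
ECOLI_OMP_RANGE = (88, 116)   # OMPs per E. coli genome reported in the Results


def omp_fraction(n_omps: int, n_proteins: int) -> float:
    """Share of the proteome that consists of predicted OMPs."""
    return n_omps / n_proteins


ecoli_low = omp_fraction(ECOLI_OMP_RANGE[0], ECOLI_PROTEINS)
ecoli_high = omp_fraction(ECOLI_OMP_RANGE[1], ECOLI_PROTEINS)

# Number of OMPs that a 4% share would correspond to in a median H. pylori proteome.
implied_hpylori_omps = 0.04 * HPYLORI_PROTEINS

print(f"E. coli OMP share: {ecoli_low:.1%} to {ecoli_high:.1%}")
print(f"OMPs implied by a 4% share in H. pylori: {implied_hpylori_omps:.0f}")
```

Under these assumptions the E. coli share brackets the quoted value of about 2%, and the implied H. pylori count is of the same order as the per-genome OMP numbers reported for gastric helicobacters.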
In general, we found that the clustering of a strain or species' OMPs in the phylogenetic trees is similar to the phylogenetic clustering of their full genomes. The phylogenetic reconstructions of the different families revealed a clear division between enterohepatic and gastric Helicobacter species. The gastric helicobacters could be further divided into H. pylori and its two closest relatives, H. acinonychis and H. cetorum, and the NHPH species, comprising the canine, feline, and porcine helicobacter clades. Interestingly, H. mustelae, which has been associated with gastritis, peptic ulcers, MALT lymphoma, and adenocarcinoma in domestic ferrets 55,56, clustered within the clade of the enterohepatic Helicobacter species. This supports previous hypotheses, which emphasize the capability of H. mustelae to colonize both the stomach and the intestinal tract 15.
The primary function of the outer membrane of Gram-negative bacteria is to form a barrier against hazardous substances from the environment such as enzymes, detergents, and antimicrobials. The permeability of the outer membrane is determined by the presence of OMPs that function as porins. They contain transmembrane diffusion channels through which small hydrophilic molecules, nutrients, and small antibiotics can be transported across the outer membrane 4,57. In our study, two Helicobacter-specific porin families were found, namely Family 13 and 33 (Supplementary Table S1, Figs 2 and 3). Both families mainly contain OMP orthologs from the genus Helicobacter, albeit to a greater extent in the gastric helicobacters than in the enterohepatic ones. C. coli and C. jejuni only harbor one such OMP, and members of these families are lacking altogether in E. coli. Families 13 and 33 were thus probably acquired by Helicobacter after splitting off from a last common ancestor. Moreover, most gastric NHPH species lack all H. pylori Hop adhesins, suggesting that these OMPs were acquired after H. pylori speciation 15-18. The H. pylori-specific Hop proteins function as adhesins for gastric epithelial cells 4. Interestingly, the adhesive properties of the blood group antigen binding adhesin BabA were found to be pH responsive and to provide the bacteria with a reversible adherence profile that is fine-tuned to the pH gradients in the stomach mucosa 58. The canine, feline and porcine gastric NHPH species have also been shown to attach to the gastric mucosa 15,19. The absence of the H. pylori Hop adhesins in these NHPHs suggests that other OMPs function as adhesins in these organisms. Indeed, genes encoding orthologs of H. pylori Hof proteins seem to be well conserved in the canine, feline and porcine gastric NHPH species, of which HofE and HofF have recently been identified as adhesins in H. heilmannii 19. HofF has also been shown to be important for H. pylori colonization, but the function of the other H. pylori Hof OMPs remains largely unknown 19,59. Furthermore, the exact role of the other NHPH OMPs from Families 13 and 33 in NHPH colonization remains to be further elucidated. Remote homology recognition and 3D threading of family 33 (Hof) members indicate structural similarity with 18-stranded porins of the Occ family in Pseudomonas and Acinetobacter, and with the C. jejuni major outer membrane protein (MOMP), which are implicated in cation-selective solute diffusion across the OM, as well as in adherence in the case of MOMP 28,30,31.
The explicit OMP variation among species might also be favorable for evasion of the host's immune response. H. pylori has developed a very large repertoire of mechanisms to evade both innate and adaptive immune recognition 60. One way in which H. pylori evades the immune response is by avoiding recognition of its bacterial surface molecules, such as LPS and flagellin, by Toll-like receptors. Here, we identified the SfpA/LpxR OMP family (Family 36, Supplementary Table S1, Fig. 4), present in all gastric Helicobacter species, which might be involved in immune evasion by removing the 3′-acyloxyacyl group of lipid A 61,62. For the enterohepatic Helicobacter species tested here, SfpA/LpxR was only detected in H. trogontum. This OMP family might thus be more specific for gastric species than for the enterohepatic members of the genus Helicobacter.
A very well-studied outer membrane virulence factor of H. pylori is the secreted vacuolating cytotoxin A (VacA; Family X1, Supplementary Table S2) that causes uncontrolled cellular vacuolation 34,35. In agreement with previous research 15,36, we identified the VacA OMP only in H. pylori and H. cetorum and short fragments of this protein in H. acinonychis, but not in other NHPH species. We additionally found a Helicobacter-specific putative VacA-like cytotoxin family (Family X3, Supplementary Table S2, Fig. 5). The VacA-like autotransporters belonging to this family enhance the bacterium's colonization capacity of the stomach 63. The VacA-like OMP is well conserved among the different gastric Helicobacter species, although their protein sequences exhibit much variation 15. In addition, we showed variation in the number of VacA-like autotransporters, not only between species but also between strains of the same species. For instance, in H. bizzozeronii strain 10, no VacA-like autotransporter could be identified, whereas the other examined strains of H. bizzozeronii each contained one copy of this OMP. This underlines the genetic diversity among strains within a species. However, it should be noted that the genomes of most NHPH species that were analysed in this study, including that of H. bizzozeronii strain 10, are draft genomes that lack approximately 5% of the full genome sequence. Therefore, it cannot be excluded that the gene encoding the VacA-like autotransporter of this H. bizzozeronii strain is part of the missing 5% of its genome sequence. The genomes of enterohepatic Helicobacter species, except for H. trogontum, lack a VacA-like autotransporter as well. This OMP family may therefore be more specific for gastric helicobacters. However, in the H. mustelae strain included in our study, the VacA-like OMP is absent as well. For all other families that were analysed, too, the OMPs of this H. mustelae strain clustered closer together with enterohepatic helicobacters, or separately between enterohepatic and gastric species. Thus, although H. mustelae has been described as a gastric Helicobacter species associated with gastric malignancies, recent studies suggest that this species is an enterohepatic species by origin that has adapted its colonization niche from the intestinal to the gastric environment 23,56,64,65.
In addition, several OMP families were found to be well conserved among E. coli and the Campylobacter and Helicobacter genera. The Imp/OstA family (Family 14, Supplementary Table S1, Fig. S2), an organic solvent tolerance protein, and the OMF family (Family 42, Supplementary Table S1, Fig. S3), part of type 1 protein secretion systems and RND multidrug efflux systems, maintain a barrier for antimicrobial agents and play a role in the resistance to drugs. In contrast to the Helicobacter-specific outer membrane porins, which utilize passive diffusion for solute uptake, outer membrane receptor proteins, such as TonB-dependent receptors, carry out high-affinity binding and energy-dependent uptake of specific substrates including iron. In this study, two TonB-dependent receptor families (Family 3 and Family X2, Supplementary Tables S1 and S2, Figs S4 and S5) were identified that contribute to bacterial virulence 46. Remarkably, OMPs from both Family 3 and Family X2 were lacking in H. salomonis. This may suggest that this species has other iron uptake mechanisms at its disposal to maintain iron homeostasis.
Finally, a virulence factor that has been shown to influence the colonization capacity and pathogenicity of Gram-negative bacteria is the outer membrane phospholipase A (OMPLA) (Family 38, Supplementary Table S1, Fig. S6). Orthologs of this family could not be found in the genomes of H. cetorum strain MIT 00-7128, H. pylori strain G27, H. equorum, and H. mustelae. Whether the absence of OMPLA in these strains influences their virulence and colonization capacity remains to be further investigated.
In conclusion, several important OMP families, mainly from gastric Helicobacter species, were identified using comparative genomic and phylogenetic analyses. To our knowledge, this is the first report analysing OMP occurrence and diversity in NHPH species, since previous studies on the OMP repertoire in the genus Helicobacter have mostly concentrated on the human pathogen H. pylori. Two Helicobacter-specific outer membrane protein families with possible functions in adhesion (Family 13, i.e. Hop, Hor and Hom; and Family 33, Hof), the Helicobacter-specific SfpA/LpxR OMP (Family 36) that functions in immune evasion, and a Helicobacter-specific VacA-like cytotoxin family with a role in colonization capacity (Family X3) were identified primarily in gastric species. Furthermore, we showed that most Helicobacter species contain an outer membrane factor (OMF; Family 42) and Imp/OstA (Family 14), both involved in antimicrobial resistance, TonB-dependent OMPs with a function in metal and vitamin uptake (Family 3 and Family X2), and an outer membrane phospholipase (OMPLA; Family 38) that plays a role in colonization capacity.
In summary, our systematic survey of Helicobacter OMPs points to species- and infection-site-specific members that are interesting candidates for future virulence and colonization studies.
Table 1.

Escherichia coli, Campylobacter and
Data management and integration. Genomic data were modeled using the Django object relational mapper (ORM) (http://www.djangoproject.com) and database tables were automatically created using the management command 'syncdb'. The FASTA protein sequences of putative outer membrane proteins (OMP) of the selected strains of E. coli, Campylobacter and Helicobacter were extracted from the genomes by using the HHomp tool 22 as described before 15. For the strains whose genome sequences are fully known, the sequences were deposited in the EMBL databases and the accession numbers are shown in the tables. For the other strains, whose genome sequences are not fully known, the OMPs were uploaded to the database. Subsequently, all OMPs were combined into a single 'combined.fasta' file using the Unix 'cat' command. Next, a Python script was written using IPython Notebook (http://ipython.org/notebook.html) to load all sequences into the database.
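The loading step can be illustrated with a small FASTA reader. This is a minimal sketch and not the script used in the study: the function name `parse_fasta` and the example records are hypothetical, and the Django model layer that would receive the records is deliberately left out.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {identifier: sequence} mapping."""
    records: Dict[str, str] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("header line without an identifier")
            # The identifier is the first whitespace-delimited token of the header.
            current = fields[0]
            if current in records:
                raise ValueError(f"duplicate identifier: {current}")
            records[current] = ""
        elif current is None:
            raise ValueError("sequence data found before the first header")
        else:
            # Sequence lines of one record are concatenated.
            records[current] += line
    return records


# Hypothetical two-record input, standing in for the contents of 'combined.fasta'.
example = """>omp_001 hypothetical strain A
MKKTLLA
GLSLA
>omp_002 hypothetical strain B
MRKLF
""".splitlines()

records = parse_fasta(example)
print(records)
```

With a real file, the same function can be applied to an open file handle, for example `with open("combined.fasta") as handle: records = parse_fasta(handle)`, after which each record would be saved through the ORM.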
Classification into OMP families. The seed alignments from the 90 OMP families that are defined in OMPdb.org were downloaded. For each of these families, the seed alignments were converted to HMM profiles using the HMMER3 suite of programs (http://hmmer.janelia.org). Specifically, the 'hmmbuild' program was executed for each seed alignment as follows: 'hmmbuild <hmmfile_out> <seed_alignment>'. The generated HMM profiles were then stored in the database. Next, the HMM profiles for all of the OMP families were collected and "pressed" for faster searches using the program 'hmmpress' in HMMER3, resulting in an HMM database of OMP families. Then, each OMP protein sequence was searched against this HMM database in order to classify it into one of those OMP families. The remaining unclassified proteins were searched against the Pfam database using 'hmmscan' on the HMMER3 web server.
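The profile-building and search steps can be laid out as a command plan. The sketch below only assembles argument lists and does not run HMMER. The file names are hypothetical, and the use of a local 'hmmscan' run with '--tblout' for the search against the pressed OMP family database is an assumption, since the text names 'hmmscan' only for the Pfam search.

```python
from pathlib import Path
from typing import List, Sequence


def hmmbuild_cmd(seed_alignment: str, hmm_out: str) -> List[str]:
    # Argument order as quoted in the text: hmmbuild <hmmfile_out> <seed_alignment>
    return ["hmmbuild", hmm_out, seed_alignment]


def hmmpress_cmd(hmm_db: str) -> List[str]:
    # hmmpress expects one file holding all profiles concatenated together.
    return ["hmmpress", hmm_db]


def hmmscan_cmd(hmm_db: str, fasta: str, table_out: str) -> List[str]:
    # Assumed search step: scan each sequence against the pressed profile database.
    return ["hmmscan", "--tblout", table_out, hmm_db, fasta]


def plan(seed_alignments: Sequence[str],
         hmm_db: str = "omp_families.hmm",      # hypothetical file name
         fasta: str = "combined.fasta",
         table_out: str = "omp_hits.tbl") -> List[List[str]]:
    """Return the commands in the order in which they would be executed."""
    commands = [
        hmmbuild_cmd(seed, str(Path(seed).with_suffix(".hmm")))
        for seed in seed_alignments
    ]
    # Between these steps the per-family .hmm files must be concatenated into hmm_db.
    commands.append(hmmpress_cmd(hmm_db))
    commands.append(hmmscan_cmd(hmm_db, fasta, table_out))
    return commands


commands = plan(["family_13.sto", "family_33.sto"])
for command in commands:
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` on a machine where HMMER3 is installed. Keeping the plan separate from its execution makes the order of the steps easy to inspect.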
Finally, the last remaining 278 unclassified proteins were clustered into groups with CD-HIT using the settings 'word length = 2, identity cutoff = 40%' in order to produce the smallest possible number of clusters. The clusters with single members with an amino acid length of <120 (52 sequences in total) were dropped. The distribution of the OMP families from the Escherichia, Campylobacter and Helicobacter genera was determined with 'pandas', a data processing package in Python (http://pandas.pydata.org). For plotting and visualization, the 'matplotlib' plotting package was used (http://matplotlib.org).
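The clustering and filtering rule can be sketched as follows. The mapping of the quoted settings onto the CD-HIT options '-n 2' (word length) and '-c 0.4' (identity cutoff) is an assumption, as are the file names and the toy cluster data; the filter itself applies the rule stated in the text, namely that single-member clusters shorter than 120 amino acids are dropped.

```python
from typing import Dict, List

MIN_SINGLETON_LENGTH = 120  # amino acids, threshold given in the text


def cd_hit_cmd(fasta_in: str, out_prefix: str) -> List[str]:
    # Assumed mapping of 'word length = 2, identity cutoff = 40%' to CD-HIT options.
    return ["cd-hit", "-i", fasta_in, "-o", out_prefix, "-c", "0.4", "-n", "2"]


def filter_clusters(clusters: Dict[str, List[str]],
                    lengths: Dict[str, int]) -> Dict[str, List[str]]:
    """Drop clusters that have a single member shorter than the threshold."""
    kept: Dict[str, List[str]] = {}
    for cluster_id, members in clusters.items():
        short_singleton = (
            len(members) == 1 and lengths[members[0]] < MIN_SINGLETON_LENGTH
        )
        if not short_singleton:
            kept[cluster_id] = members
    return kept


# Hypothetical clusters and sequence lengths, for illustration only.
clusters = {"c0": ["a", "b"], "c1": ["c"], "c2": ["d"]}
lengths = {"a": 300, "b": 280, "c": 95, "d": 410}

kept = filter_clusters(clusters, lengths)
print(sorted(kept))
```

In this toy input only the cluster holding the 95-residue singleton is removed, while a singleton of sufficient length is retained as a family of its own.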
Alignment and phylogenetic analysis. The OMP families with at least 3 identified members were subjected to phylogenetic analysis. The OMP protein sequences of these families were written into multi-sequence FASTA files (one for each family). Multiple sequence alignment was performed using the Clustal Omega program (http://www.clustal.org/omega/). The alignments were trimmed with the 'TrimAl' program (http://trimal.cgenomics.org) in order to remove sequence regions that potentially blur the phylogenetic signal, such as highly variable loop regions. The trimmed alignments were then fed into the 'FastTree' phylogenetic tree-building program (http://www.microbesonline.org/fasttree). The phylogenetic trees were visualized and edited with the online tool 'Interactive Tree Of Life' (iTOL) 66. The best-fitting root was selected with TempEst v1.5 67, formerly known as 'Path-O-Gen'. The OMP protein sequences of all species and strains were subjected to protein BLAST (NCBI). In this way, the individual OMPs belonging to each family as well as possible subgroups could be distinguished. The relative positions of all species' OMPs in the phylogenetic trees were studied and evaluated, and the possible roles in virulence and colonization are presented.
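The first two steps of this workflow, selecting families with at least 3 members and writing one multi-sequence FASTA text per family, can be sketched in plain Python. The function names, protein identifiers and sequences below are hypothetical; the alignment, trimming and tree-building programs themselves are not invoked here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

MIN_MEMBERS = 3  # families with fewer members are not analysed phylogenetically


def group_by_family(assignments: Sequence[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (protein_id, family_id) pairs and keep families of sufficient size."""
    families: Dict[str, List[str]] = defaultdict(list)
    for protein_id, family_id in assignments:
        families[family_id].append(protein_id)
    return {fam: ids for fam, ids in families.items() if len(ids) >= MIN_MEMBERS}


def to_fasta(protein_ids: Sequence[str], sequences: Dict[str, str],
             width: int = 60) -> str:
    """Render the given proteins as multi-sequence FASTA text."""
    lines: List[str] = []
    for protein_id in protein_ids:
        sequence = sequences[protein_id]
        lines.append(f">{protein_id}")
        # Wrap the sequence at a fixed line width.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


# Hypothetical assignments: family "13" has three members, family "X2" only two.
assignments = [("p1", "13"), ("p2", "13"), ("p3", "13"), ("p4", "X2"), ("p5", "X2")]
sequences = {"p1": "MKT", "p2": "MKS", "p3": "MRT", "p4": "MAA", "p5": "MAG"}

eligible = group_by_family(assignments)
fasta_by_family = {fam: to_fasta(ids, sequences) for fam, ids in eligible.items()}
print(fasta_by_family)
```

Each FASTA text would then be written to its own file and passed on to the alignment, trimming and tree-building programs named above.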
Accession codes. The NCBI or EMBL accession numbers of the species and strains with fully annotated genomes are provided in Table 1.