Structure of FlgK reveals the divergence of the bacterial Hook-Filament Junction of Campylobacter

Evolution of a nano-machine consisting of multiple parts, each with a specific function, is a complex process. A change in one part should eventually result in changes in other parts, if the overall function is to be conserved. In bacterial flagella, the filament and the hook have distinct functions and their respective proteins, FliC and FlgE, have different three-dimensional structures. The filament functions as a helical propeller and the hook as a flexible universal joint. Two proteins, FlgK and FlgL, assure a smooth connectivity between the hook and the filament. Here we show that, in Campylobacter, the 3D structure of FlgK differs from that of its orthologs in Salmonella and Burkholderia, whose structures have previously been solved. Docking the model of the FlgK junction onto the structure of the Campylobacter hook provides some clues about its divergence. These data suggest how evolutionary pressure to adapt to structural constraints, due to the structure of Campylobacter hook, causes divergence of one element of a supra-molecular complex in order to maintain the function of the entire flagellar assembly.

into two sub-domains, D1a and D1b, and a more compact domain D2, consisting mostly of β-strands (Fig. 1). Sub-domain D1a, which has eight discontinuous α-helices, could be described as a 4-helix bundle. The first segment of the helical bundle of domain D1, in the N-terminal region, is composed of α-helices α1 and α2, which consist of segments [Asp69-Arg99] and [Ile108-Asn124] that are connected by a short coil. The second segment of the bundle is made from a single long helix α3 that runs through the length of domain D1a. The third segment of the bundle is composed of 3 tandem helices, α4, α6 and α7, that consist of segments [Thr193-Leu213], [Gln291-Arg299] and [Ile314-Ser339], respectively. Helices α4 and α6 are separated by the sub-domain D1b. The fourth segment of the bundle consists of α-helices α12 and α13, which correspond to segments [Asp500-Tyr510] and [Met527-Gly568], respectively. These two helices are separated by a β-hairpin β19-β20. Domain D1b, which consists of segment [Val214-Gly289], is made of 7 antiparallel β-strands, β1 to β4 and β6 to β7, connected by relatively short loops. Domain D2 comprises segment [Ser340-Asn499]. It is located at one extremity of the helical bundle of domain D1 and consists of a combination of antiparallel β-strands connected by small loops and α-helices.

Divergence of FlgK from C. jejuni
Structures of two orthologs of FlgKcj have previously been solved by X-ray crystallography. The structure of a 64 kDa fragment of FlgK from the beta-proteobacterium Burkholderia pseudomallei (B. pseudomallei) has been reported 12. Furthermore, a 49 kDa fragment of FlgK from the gamma-proteobacterium Salmonella enterica typhimurium (S. enterica) has been deposited in the Protein Data Bank (PDB-id 2D4Y). These two structures will be hereafter referred to as FlgKbp64 and FlgKse49, respectively. Structural comparison shows that FlgKcj58 has diverged from both of these structures. The full-length FlgK protein from S. enterica (FlgKse) is 59 kDa while FlgK from B. pseudomallei (FlgKbp) is 67.4 kDa. FlgKcj has sequence identities of 24% and 21% to FlgKse and FlgKbp, respectively, and sequence similarities of 40% and 35%. Insertions in the sequence can be found at various positions (Fig. 2). For comparison, FlgKse and FlgKbp have a sequence identity and similarity of 32% and 48%, respectively. All three structures of FlgK, FlgKcj58, FlgKse49, and FlgKbp64, lack the N- and C-terminal segments that form the coiled-coil of domain D0, seen in the structures of the bacterial flagellar hook and filament 13,14. The three-dimensional structure of FlgKcj58 aligns with those of FlgKse49 and FlgKbp64 with a root mean square deviation (RMSD) for alpha carbons of 1.38 Å over 302 residues and 1.54 Å over 291 residues, respectively. Overall structural alignments show that domain D1 of FlgK is well conserved (Figs 3A-C, S2, S3). The structure of the D1a domain of FlgKcj58, which consists of the helical bundle, is similar to that of FlgKse49 and of FlgKbp64. However, small differences can be noticed in this domain. The β-strands β10 and β20, which divide the C-terminal helix in FlgKcj58, are replaced by a short coil in both FlgKse49 and FlgKbp64.
Furthermore, FlgKcj58 has an inserted segment [Gly300 -Gly313], which divides the segment [Gln291 -Ser339] into helices α6 and α7, while the equivalent segment corresponds to a continuous long α-helix in both FlgKse49 and FlgKbp64. The structure of domain D1b is conserved in all these three structures of FlgK.
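The sequence identities quoted above are computed from pairwise alignments. As a minimal sketch of how such a percentage can be obtained, the Python function below counts identical residues over the ungapped columns of an alignment. The choice of denominator (columns where both sequences carry a residue) is an assumption, since the text does not state which convention was used, and the two aligned fragments are toy strings, not FlgK sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Assumption: gap columns are excluded from the denominator; other
    conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, for illustration only)
toy_a = "MKT-AILLG"
toy_b = "MRTQAI-LG"
print(round(percent_identity(toy_a, toy_b), 1))
```

In the toy example, seven columns are ungapped and six of them match, so the function reports six sevenths as a percentage.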
Domain D2 of FlgKcj58 has a different fold compared to domains D2 of FlgKse49 and of FlgKbp64 (Figs 3A-C, S1, S3). Domain D2, in FlgKbp64 and FlgKse49, can be described as a beta-barrel made by eight β-strands connected by long loops. On the other hand, in FlgKcj58, the domain D2 has six β-strands, in a "V" shape, completed with α-helices. Furthermore, the relative position of domain D2 of C. jejuni FlgK differs from that of domains D2 of FlgK in both S. enterica and B. pseudomallei. FlgKbp64 has a third domain, D3, that neither FlgKse49 nor FlgKcj58 has. This domain D3 of FlgKbp64 has a fold similar to that of its domain D2 (Fig. S3). In FlgKcj58, a short loop links the N-terminus of domain D2 to the helix α7 in domain D1 (Fig. 3D). The size of this loop would prevent domain D2 of FlgKcj58 from adopting a position similar to that in FlgKse49 or FlgKbp64. In the case of FlgKse49 and FlgKbp64, their D2 domain is linked to domain D1 by long loops (Fig. 3E

Model of the FlgK ring and the hook
The proteins that comprise flagellar axial protein complexes, such as FliC for the filament, FlgE for the hook, and FlgG for the rod, frustrate crystallization of the full-length proteins because of their tendencies to polymerize or to aggregate in solution 9,15,16. A domain D0, found in all these proteins, is involved in coiled-coil interactions, preventing crystallization of full-length proteins, so crystallization is only possible after removal of this domain 17,18. Domain D0 is missing from all the X-ray crystallography structures of FlgKcj58, FlgKbp64 and FlgKse49. A model of the full-length FlgK was built using Swiss-Model 19,20, which selected the hook protein of Campylobacter 13 (FlgEcj, PDB-id 5JXL) as the best template for domain D0 of FlgK. Domains D0 of FlgKcj and FlgEcj share 33% sequence identity and 48% sequence similarity. For comparison, domains D0 of FlgEcj and of the distal rod protein FlgG from Campylobacter, which are known to have very similar structures, share 29% sequence identity and 45% sequence similarity. The resulting theoretical model includes the 3D structure of domain D0 of FlgKcj, not present in FlgKcj58 (Fig. 4A). Domain D0 of FlgKcj was first aligned to domain D0 of the FlgE protein in the hook (Fig. 4B,C), enabling the construction of a ring made of eleven molecules of FlgKcj58; we then performed further structural refinement as described under Methods (Fig. 4D,E). The overall structure of the bacterial flagellar hook and filament consists of eleven protofilaments, with the N- and C-terminal chains being the driving force of the structural organization through coiled-coil interactions 8,13,16. The high sequence similarity between D0 domains of FlgKcj and FlgEcj leads us to believe that the ring of FlgK will also consist of eleven molecules, each interacting with one molecule of FlgE in the hook. The ring was docked on top of the hook of C. jejuni, initially using domain D0 of FlgE as a template for alignment (Fig. 4F; see Methods for details). When docked onto the structure of the Campylobacter hook, the FlgK ring is almost completely hidden by molecules of the hook, leaving only the tips of a few molecules of FlgKcj protruding from the top of the hook (Fig. 4F).
In the bacterial flagellum, the lengths of the hook and the rod are well controlled 21,22, but the mechanism controlling the number of FlgK molecules necessary to make the junction between the hook and the filament is unknown. However, a model of this junction consisting of more than eleven molecules of FlgK produces clashes between domains D1a of the hook-distal FlgK molecules and the hook-proximal domains. These steric constraints seem to limit the number of FlgK molecules to eleven.
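The geometry of an eleven-subunit ring laid over eleven protofilaments can be pictured with a short calculation. The Python sketch below places eleven subunit centers on a one-start helix. Every numerical parameter is a labeled assumption rather than a value from this work: the twist uses the roughly 5.5 subunits per turn commonly quoted for eleven-protofilament flagellar assemblies, the axial rise is a placeholder, and the radius is simply half of the 200 Å ring diameter reported in the Discussion (actual subunit centers would sit closer to the axis).

```python
import math


def helical_positions(n_subunits, radius, twist_deg, rise):
    """Return (x, y, z) centers of subunits placed along a one-start helix."""
    positions = []
    for i in range(n_subunits):
        angle = math.radians(i * twist_deg)
        positions.append(
            (radius * math.cos(angle), radius * math.sin(angle), i * rise)
        )
    return positions


# Hypothetical, illustrative parameters (not taken from the refined model)
N_SUBUNITS = 11
RADIUS_A = 100.0          # half of the 200 A ring diameter, used only for scale
TWIST_DEG = 360.0 / 5.5   # about 5.5 subunits per helical turn (assumption)
RISE_A = 4.0              # placeholder axial rise per subunit

ring = helical_positions(N_SUBUNITS, RADIUS_A, TWIST_DEG, RISE_A)
for index, (x, y, z) in enumerate(ring):
    print(f"subunit {index:2d}: x={x:8.2f}  y={y:8.2f}  z={z:6.2f}")
```

With 5.5 subunits per turn, eleven subunits span exactly two turns and land on eleven distinct, evenly spaced azimuthal positions, one per protofilament. This is consistent with the picture of each FlgK molecule sitting above one FlgE molecule, although the sketch says nothing about steric clashes.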

Discussion
Campylobacter is the only known bacterium that secretes toxins through its flagella, so understanding the overall structure of the flagellum of C. jejuni is important. The model of the FlgK ring from C. jejuni and the model of the connection between the ring and the hook from C. jejuni give some clues about why the structure of FlgKcj has a different fold from those of FlgKse and FlgKbp.
The ring made of FlgKcj58 was refined prior to its docking on top of the hook. The refined model of the ring shows eleven well-packed molecules of FlgKcj58 with no room for extra molecules. FlgKcj58 from the refined structure has an RMSD of 3.3 Å with the crystal structure. The obtained FlgK ring has a diameter of 200 Å (Fig. 5A). In this model, domains D1b and D2 make the external surface of the ring while helices of domain D1a make the central core. Most interactions between molecules in the ring are between domains D1a of neighboring molecules (Fig. 5B), between the beginning of α-helix α1 and the end of α-helix α13. We also found some interactions between the short helix α11, in domain D2, and the strand β20 that divides helices α12 and α13 in domain D1a. These subtle interactions are not surprising, as most of the interactions will be between domains D0, similar to the interactions seen between FlgE molecules in the hook 13.
Figure 2. Sequence alignment of FlgK proteins. Sequence alignment of FlgK from C. jejuni and S. enterica with a representation of their respective secondary structures. Domain D0, composed of the N-terminal and the C-terminal chains, which are involved in coiled-coil interactions, was missing from each of these structures obtained by X-ray crystallography. The secondary structures of domains D1a, D1b, and D2, found in FlgKcj and FlgKse, are in blue, green, and purple, respectively. The red squares represent the amino acid residues conserved between both sequences. Amino acid sequences of proteins were aligned using Clustal Omega 37, and secondary structure rendering used ESPript 3.0 (ref. 38).
Scientific Reports | 7: 15743 | DOI:10.1038/s41598-017-15837-0
The junction between the hook and FlgKcj58 ring was refined by optimizing the relative orientation and position of the FlgKcj ring (see Methods for details). The refined junction shows that FlgE molecules in the hook interact mostly with domain D1b of FlgKcj58 (Fig. 6). Domain D0, which is missing from FlgKcj58, would make the contacts to assure the continuation of the core region made by domain D0 of FlgE and FlgK (Fig. 4D). Domain D2, which does not seem to have any contacts with FlgE, might serve as the interacting domain in the FlgK-FlgL ring junction.
The necessity for divergence of the structure of Campylobacter FlgK arises, in part, from the structure of its hook protein, FlgE. Campylobacter FlgE (FlgEcj) has two extra domains compared to S. enterica FlgE (FlgEse) 13,23. The model of the junction between FlgKcj and the hook shows that sub-domain D1b of FlgKcj, which is conserved in all known structures of FlgK, interacts with domain D2 of FlgE. Structural studies of FlgE from different proteobacteria have shown that domains D0, D1, and D2 of FlgE are structurally well conserved 13,23-25. Based on structural homology of the FlgK and FlgE proteins, it is fair to assume that sub-domain D1b of FlgK will interact with domain D2 of FlgE as shown in our model. Our model of the junction shows that domain D2 of FlgKcj58 interacts with domains D3 and D4 of FlgEcj from C. jejuni (Fig. 7A). Structural alignment of FlgKcj and FlgKse shows that, if FlgKcj had a 3D fold similar to that of FlgKse, with its domain D2 located in the same position as for FlgKse49, its domain D2 would prevent formation of the junction with FlgEcj due to steric clashes (Fig. 7B). Indeed, domains D2 of FlgKse and FlgKbp would superimpose on domain D3 of FlgEcj (Fig. 7C) (Fig. 7D) and eventually makes contact with the ring of FlgL, which connects FlgK to the filament (Fig. S1).
The hook protein of Caulobacter crescentus (C. crescentus), FlgEcc, whose structure has previously been reported 25, has an extra domain D3 not found in S. enterica. However, Yoon and colleagues have shown that domain D3 from FlgEcj and domain D3 of FlgEcc have different structures and are positioned differently compared to the other domains 25. It would be interesting to know how our proposed models apply to the hook of C. crescentus. With 702 amino acid residues, the sequence of FlgK from C. crescentus strain CB15 (ATCC 19089) is longer than the sequences of FlgK previously described. The sequence comparison of FlgK proteins from C. jejuni, S. enterica, B. pseudomallei and C. crescentus shows that, like FlgKbp, FlgKcc has a third domain D3 (Fig. S4). FlgKcc has a sequence identity of 23.7% and a sequence similarity of 37% with FlgKbp, which is higher than with FlgKse and FlgKcj. This leads us to believe that the structure of C. crescentus FlgK, which is not known, might be similar to that of B. pseudomallei, with a slightly larger domain D3 (Fig. S4). Our FlgE-FlgK junction model shows that domain D2 of FlgKcj could not interact with domain D3 of FlgEcc (Fig. 7E), making it a possible fit as a junction for the C. crescentus hook. However, FlgKbp could be a better model for FlgK of C. crescentus (Fig. 7F), and the sequence alignment of FlgKcc and FlgKbp (Fig. S4) seems to support this hypothesis.
showing two molecules of FlgKcj58 as they appear in the FlgK ring. Cα backbone trace of each molecule is color-coded from blue to red from the N to the C terminus, while the surface is colored in cyan, green and magenta for domain D1a, D1b and D2, respectively.
Bacterial flagella grow in a sequential process 26,27. Once the hook is completed, FlgK starts appearing at the top of the hook to form the first junction, followed by FlgL, which builds up at the top of the ring made of FlgK. The hook is very flexible, while the filament, although more rigid, is able to undergo structural transitions between different states with distinct helical properties 14,28. The junctions made by FlgK and FlgL play important roles in bridging the hook to the filament. The structural divergence of FlgKcj is in accordance with the existence of extra domains in the hook of C. jejuni, but pressure for this change could have diverse origins. In Campylobacter and in related organisms, FlgL has between 750 and 950 residues, compared to 410 for Burkholderia and 317 for Salmonella. The divergence of FlgKcj is also an indication of the changes that could be expected in the connection between FlgK and FlgL in Campylobacter. Overall, the structural divergence of FlgK from Campylobacter is a case where one element of a supra-molecular complex diverges to compensate for changes in another element, in order to maintain the entire assembly and its function.

Methods
Cloning. The DNA sequence encoding FlgKcj58 (amino-acid residues 70-580) was amplified by PCR from C. jejuni strain NCTC 11168 genomic DNA with the 5′ primer GGCTCTCATATG GATGAGTATTCTTACTATAAATTAAAAGGTGC, generating an NdeI site and a start codon, and the 3′ primer GCGACGCTCGAGATTATATAAGGGCGGCTAATTCTTCATTTGTAT, generating a stop codon and an XhoI site. The PCR fragment was digested with NdeI and XhoI and ligated into the T7 expression vector pET22b (+) (Novagen). The plasmid was transformed into Escherichia coli strain BL21 (DE3) for expression.
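The restriction sites carried by the two primers can be checked with a few lines of code. The Python sketch below is an illustrative sanity check, not part of the original protocol: it locates the NdeI (CATATG) and XhoI (CTCGAG) recognition sequences in the primers quoted above and reverse-complements the 3′ primer to view it on the coding strand. The space printed inside the 5′ primer is assumed to be a typesetting break and is removed here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Primers as quoted in the Cloning paragraph (space in the 5' primer removed)
FORWARD = "GGCTCTCATATGGATGAGTATTCTTACTATAAATTAAAAGGTGC"
REVERSE = "GCGACGCTCGAGATTATATAAGGGCGGCTAATTCTTCATTTGTAT"

NDEI = "CATATG"  # NdeI recognition site; its ATG doubles as the start codon
XHOI = "CTCGAG"  # XhoI recognition site (palindromic)

print("NdeI site in 5' primer at index:", FORWARD.find(NDEI))
print("XhoI site in 3' primer at index:", REVERSE.find(XHOI))

# Coding-strand view of the 3' primer: the stop codon precedes the XhoI site
coding_view = reverse_complement(REVERSE)
print("3' primer on the coding strand:", coding_view)
```

On the coding strand the 3′ primer ends with TAA, one spacer base, the XhoI site and a short 5′ overhang, which matches the statement that it introduces a stop codon followed by an XhoI site.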
Protein expression and purification. BL21 (DE3) cells harboring the FlgKcj58 expression plasmid were grown in 6 L Luria broth (LB) containing 100 μg mL−1 ampicillin. Protein expression was induced with 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at an OD600 of 0.8 and cultivation continued for 4 h at 310 K. The cells were harvested by centrifugation at 8,000 g for 15 min. The cell pellet was suspended in 20 mM NaCl, 50 mM Tris-HCl pH 8.0 buffer and cells were disrupted by sonication. The solution of sonicated cells was then centrifuged at 100,000 g for 1 h and the supernatant was loaded onto a HiLoad Q Sepharose 'High Performance' anion-exchange column (GE Healthcare) equilibrated with 20 mM NaCl, 50 mM Tris-HCl pH 8.0 buffer. Protein elution was performed with a linear gradient of NaCl from 0 to 0.5 M. The main fractions were dialyzed overnight against 50 mM Tris-HCl pH 8.0 buffer and loaded again onto the same Q Sepharose column. The anion-exchange chromatography procedure followed by dialysis was repeated three more times in order to bring the protein to an optimal level of purity. The eluted fractions were then pooled and loaded onto a Superdex 200 gel-filtration column (GE Healthcare) equilibrated in 100 mM NaCl, 10 mM Tris-HCl pH 8.0 buffer and eluted with the same solution. Gel-filtration fractions were pooled and dialyzed overnight against 5 mM Tris-HCl pH 8.0 buffer. The protein was concentrated to 16 mg mL−1 using a Centriprep centrifugal filter device (Millipore). The protein stock solution was stored at 277 K.
Crystallization and data collection. The initial screening of crystallization conditions was carried out using the sitting-drop vapor-diffusion method (200 nL protein solution was mixed with 200 nL reservoir solution and equilibrated against 120 μL reservoir solution) in 96-well plates using an automated nanolitre liquid-handling system (Mosquito, TTP Labtech). The following screening kits were employed: Wizard I, II, III, and Cryo I, II (Emerald Biosystems) and Crystal Screen (Hampton Research). Two protein concentrations, 7 and 16 mg mL−1, were used and crystallization plates were equilibrated at 293, 288, 283 and 278 K. Crystals of FlgKcj58 were obtained after 14 days, at 293 K and 16 mg mL−1, with condition 9 of Crystal Screen I (30% PEG 4k, 0.2 M ammonium acetate, 0.1 M sodium citrate pH 5.6). Optimization of FlgKcj58 crystals was performed manually, using the hanging-drop vapor-diffusion method with a reservoir volume of 1 mL and a drop volume of 4 μL. The optimal crystallization buffer was found to be: 30% PEG MME 2,000, 0.4 M ammonium acetate, 0.1 M sodium citrate pH 5.6, 2% MPD, 4% 2-propanol and 5% ethylene glycol. Prior to X-ray diffraction measurements, all crystals were cryo-protected by soaking them briefly in a solution corresponding to the reservoir solution supplemented with 25% ethylene glycol. Crystals were then mounted on a cryo-loop and data were collected at 100 K. Crystals diffracted to a resolution of 2.5 Å and belonged to the space group P2₁2₁2.
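Because the screening drops mix equal volumes of protein and reservoir solution, every component starts at half its stock concentration before vapor diffusion concentrates the drop. The Python sketch below works through that dilution arithmetic for the hit condition described above. The function name is hypothetical, and the calculation covers only the initial state of the drop, not its equilibrated composition.

```python
def initial_drop_concentration(stock, sample_volume, other_volume):
    """Concentration of a component right after mixing two solutions.

    `stock` is the concentration in the solution that contains the
    component; the two volumes must share the same unit.
    """
    return stock * sample_volume / (sample_volume + other_volume)


PROTEIN_NL = 200.0    # nL of protein solution in the screening drop
RESERVOIR_NL = 200.0  # nL of reservoir solution in the screening drop

# Protein comes from the protein stock; precipitants come from the reservoir
protein = initial_drop_concentration(16.0, PROTEIN_NL, RESERVOIR_NL)
peg = initial_drop_concentration(30.0, RESERVOIR_NL, PROTEIN_NL)
acetate = initial_drop_concentration(0.2, RESERVOIR_NL, PROTEIN_NL)

print(f"protein: {protein:.1f} mg/mL")
print(f"PEG 4k: {peg:.1f} %")
print(f"ammonium acetate: {acetate:.2f} M")
```

The drop therefore starts at about 8 mg/mL protein and 15% PEG 4k, and moves toward the reservoir composition as water leaves the drop during equilibration.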
Docking methodology. A model of the full-length FlgK protein was built using Swiss-Model 19,20 . This theoretical model included the 3D structure of FlgKcj domain D0 that is missing from the experimental structure of FlgKcj58 and was modeled based on FlgE of Campylobacter jejuni (PDB-id 5JXL). This model was used to build the initial ring of FlgKcj based on the alignment with FlgE molecule in the hook.
Modeling of FlgK 11-mer. We generated an initial full-length FlgK structure by beginning with the D1D2 crystal structure, and then manually threading secondary structure elements from the D0 domain of FlgE onto regions with matching secondary structure predictions (via the I-TASSER server 29) for D0. After refining the hybrid model using iterative steps of FG-MD 30 and manual refinement to minimize clashes, we threaded the complete FlgK sequence onto that model using I-TASSER to generate a complete monomer structure.
An initial structure for the FlgKcj ring (11-mer) was obtained based on the alignment with the hook molecule FlgE. We then performed minimal manual adjustments of the structure to remove clashes, and allowed the structure to relax in a molecular dynamics simulation by performing 5,000 steps of minimization and then of Langevin dynamics; next, we applied 500 ps of steered molecular dynamics simulation (after an additional 500-step minimization to account for the new restraints) pulling the center of mass of the FlgK ring toward the FlgE, with harmonic restraints keeping the FlgE monomers in place, at a force constant of 500 kcal/(mol nm²). The pulling velocity was 0.01 nm/ps, with a spring constant of 1000 kcal/(mol nm²) in the pulling direction and 5000 kcal/(mol nm²) transverse to the pulling direction. After 500 ps of pulling, we analyzed the internal alpha carbon RMSDs of the FlgK subunits to determine at which point they began to deform. The position of the FlgK ring relative to FlgE immediately before large internal deformations of the FlgK monomers began was taken to be the optimal interaction distance. Beginning from that structure, we applied a final MD simulation of 1,000 steps of minimization and 100 ps of Langevin MD with no applied restraints, and then performed final relaxation with Rosetta as described above for the FlgK ring. In this final relaxation step, the D0 domain of FlgK was removed to avoid speculation regarding the interface of the uncrystallized portion.
Data Availability. Atomic coordinates of FlgKcj58 have been deposited in the Protein Data Bank under accession code 5XBJ. The refined models have been deposited in DataDryad (doi:10.5061/dryad.fv8h6) and will be made available to the public. The X-ray diffraction data were collected at SPring-8 (Harima, Japan) under the proposal numbers 2012B6739, 2013B6845, 2014A6901, 2014B6901, 2015A6501.
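The steered molecular dynamics step in the ring-docking protocol can be followed with some simple arithmetic. The Python sketch below is an illustration under stated assumptions, not the analysis code used in this work. It computes how far the pulling anchor travels at the quoted velocity, evaluates the harmonic spring energy for a given lag between anchor and center of mass, and picks the last frame before a per-frame internal Cα RMSD series first exceeds a cutoff. The cutoff value and the RMSD series are hypothetical, since the text gives no numerical criterion for the onset of deformation.

```python
def anchor_displacement(velocity_nm_per_ps, time_ps):
    """Distance travelled by the moving SMD anchor (nm)."""
    return velocity_nm_per_ps * time_ps


def spring_energy(k, anchor_pos_nm, com_pos_nm):
    """Harmonic energy 0.5*k*dx^2 in kcal/mol, with k in kcal/(mol nm^2)."""
    return 0.5 * k * (anchor_pos_nm - com_pos_nm) ** 2


def last_frame_before_deformation(rmsd_series, threshold):
    """Index of the last frame before RMSD first exceeds `threshold`.

    Returns None if the first frame already exceeds it, and the final
    index if the threshold is never exceeded.
    """
    for index, value in enumerate(rmsd_series):
        if value > threshold:
            return index - 1 if index > 0 else None
    return len(rmsd_series) - 1


# Values quoted in the text: 0.01 nm/ps for 500 ps, k = 1000 kcal/(mol nm^2)
total_pull_nm = anchor_displacement(0.01, 500.0)
print(f"anchor travel: {total_pull_nm:.1f} nm")

# Hypothetical lag of 0.1 nm between the anchor and the ring center of mass
print(f"spring energy: {spring_energy(1000.0, 5.0, 4.9):.1f} kcal/mol")

# Hypothetical internal C-alpha RMSD per frame (angstrom) and cutoff
toy_rmsd = [0.5, 0.6, 0.7, 1.9, 3.2]
print("chosen frame:", last_frame_before_deformation(toy_rmsd, threshold=1.5))
```

At the quoted velocity the anchor moves 5 nm (50 Å) over the 500 ps pull, which is why the internal RMSD has to be monitored: the ring is driven well past first contact, and the frame just before the subunits begin to deform is the one retained.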