Ligand co-crystallization of aminoacyl-tRNA synthetases from infectious disease organisms

Aminoacyl-tRNA synthetases (aaRSs) charge tRNAs with their cognate amino acid, an essential precursor step to loading of charged tRNAs onto the ribosome and addition of the amino acid to the growing polypeptide chain during protein synthesis. Because of this important biological function, aminoacyl-tRNA synthetases have been the focus of anti-infective drug development efforts and two aaRS inhibitors have been approved as drugs. Several researchers in the scientific community requested aminoacyl-tRNA synthetases to be targeted in the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. Here we investigate thirty-one aminoacyl-tRNA synthetases from infectious disease organisms by co-crystallization in the presence of their cognate amino acid, ATP, and/or inhibitors. Crystal structures were determined for a CysRS from Borrelia burgdorferi bound to AMP, GluRS from Borrelia burgdorferi and Burkholderia thailandensis bound to glutamic acid, a TrpRS from the eukaryotic pathogen Encephalitozoon cuniculi bound to tryptophan, a HisRS from Burkholderia thailandensis bound to histidine, and a LysRS from Burkholderia thailandensis bound to lysine. Thus, the presence of ligands may promote aaRS crystallization and structure determination. Comparison with homologous structures shows conformational flexibility that appears to be a recurring theme with this enzyme class.

intermediate 6,7 . Due to the presence of an ester bond which is rapidly hydrolyzed in blood plasma, mupirocin is limited to topical use.
A number of aaRS inhibitors are in preclinical development 2 . These include natural products such as borrelidin, which targets a number of ThrRSs through an allosteric mechanism, and ochratoxin A, which targets PheRS as an active site inhibitor. Structures have not yet been determined for either of these inhibitors bound to their target aaRS, although modeling studies based on resistance mutations have provided insight into the putative borrelidin binding site 8 . In contrast, a number of structures have been published for analogs of the aminoacyl-adenylate reactive intermediate, such as sulfonamides [9][10][11][12] . However, these compounds typically also inhibit the human homolog and have therefore been abandoned as drug candidates. More recently, a series of diamino quinoline compounds has been developed against Gram-positive bacterial MetRSs, first by GlaxoSmithKline 13 , then by Replidyne [14][15][16] , and then by an academic group 5,17 . These compounds exhibit strong selectivity for the bacterial MetRS over human MetRS, but may suffer from poor bioavailability 5 . Therefore, further research is necessary both in lead development and in aaRS structural biology. The crystal structures of P. falciparum LysRS and ProRS with cladosporin or halofuginone represent valuable studies of aaRS complexes with natural product-like anti-malarial inhibitors 18,19 .
Due to their biological importance and potential as therapeutic targets, aminoacyl-tRNA synthetases have been targeted by a number of structural genomics centers. Perhaps the most successful structural genomics center at studying aaRSs has been the Medical Structural Genomics of Pathogenic Protozoa (MSGPP), which along with subsequent efforts has resulted in nearly twenty aaRS crystal structures 17,[20][21][22][23][24][25] . Nearly one hundred aaRSs have entered the Seattle Structural Genomics Center for Infectious Disease (SSGCID) [26][27][28] structure determination pipeline, both as internally selected targets and as targets nominated by the scientific community. These targets largely comprise aaRSs from Gram-negative bacteria such as Borrelia burgdorferi, which causes Lyme disease 29 ; Brucella melitensis, which causes brucellosis or Malta fever; orthologs of Mycobacterium tuberculosis, which causes tuberculosis; and Rickettsia prowazekii, the etiologic agent of epidemic typhus. Other targets include a smaller number of aaRSs from eukaryotic pathogens such as Ehrlichia chaffeensis and Encephalitozoon cuniculi. Several aaRSs from Burkholderia thailandensis were identified as candidate essential genes in a transposon screen 30 . A number of these selected aaRS enzymes have been successfully purified, although none of them reached structure determination through first-pass pipeline techniques. Here we describe our efforts to obtain aminoacyl-tRNA synthetase structures from infectious disease organisms, which have resulted in six new aaRS co-crystal structures; the initial structure of a seventh target identified via this strategy was recently reported along with inhibitor complexes 31 . All of these structures contain a ligand, which may be important for stabilizing the enzyme and promoting crystallizability.

Results and Discussion
Co-crystallization of aaRSs from SSGCID organisms. For the initial round of crystallization of SSGCID targets, no ligands were added to the protein solution, and in general two crystallization trials were set up in 96-well format most commonly in the JCSG+ and PACT sparse matrix screens 32 , although depending on the day-to-day availability one or both of these screens were substituted with Wizard III/IV, Wizard I/II, CSHT or Morpheus. If diffraction quality crystals were not obtained from the initial round of crystallization trials, 6-8 additional sparse matrix trials were set up in 96-well format for high value targets such as those requested by the scientific community. During this second round, the protein concentration was adjusted depending on the percentage of drops containing precipitation in the first round (aiming for approximately 30-50% precipitation as optimal). During the third round of crystallization, a subset of the available aaRS protein samples were incubated with 5 mM ATP (Sigma-Aldrich) and 5 mM of the cognate amino acid (Sigma-Aldrich), and four additional sparse matrix screens were initiated, typically JCSG+, PACT, Wizard III/IV and CSHT. The combined results for the first three rounds of crystallization trials are shown in Table 1. Of the 31 proteins selected for co-crystallization trials, 18 produced crystals (58%), 11 produced crystals which diffracted to better than 6 Å resolution (35%), and 7 crystal structures were determined (23%). Overall, these rates are comparable with other protein classes in the SSGCID pipeline. X-ray diffraction data and structure determination statistics for the six structures are shown in Table 2 and the individual structures are detailed below. Although we solved one or more structures of most aaRS subclasses, we were unable to obtain a co-crystal structure of subclass 2c, perhaps in part due to the low solubility of L-phenylalanine or ochratoxin A in aqueous solution at crystallography concentrations.
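The success rates quoted above follow directly from the counts in the text (31 targets, 18 with crystals, 11 diffracting to better than 6 Å, 7 structures). A minimal sketch of that arithmetic is given below; the function name and the stage labels are illustrative assumptions and are not part of the SSGCID pipeline software.

```python
def funnel_rates(total, stages):
    """Return the percentage of targets reaching each stage, rounded to whole percent.

    total  -- number of targets entering co-crystallization trials
    stages -- mapping of stage label to number of targets reaching that stage
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return {name: round(100.0 * count / total) for name, count in stages.items()}


# Counts taken from the text; the labels are hypothetical.
stages = {
    "crystals": 18,
    "diffraction_better_than_6A": 11,
    "structures": 7,
}

for name, percent in funnel_rates(31, stages).items():
    print(f"{name}: {percent}%")
```

Note that the count of 7 structures includes the seventh target reported separately 31 , whereas Table 2 covers the six structures described here.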
CysRS from Borrelia burgdorferi bound to AMP. Crystal structures of cysteinyl-tRNA synthetase (CysRS, E.C. 6.1.1.16) from E. coli have been reported as apo, bound to substrate 33 , and in complex with tRNA 34 . Interestingly, in some organisms a CysRS enzyme has not been identified, and a prolyl-tRNA synthetase (ProRS) exhibits cross-reactivity to charge tRNAs with cysteine, although this may represent mis-acylation rather than a truly bifunctional enzyme 10 . A structure of human CysRS has not yet been solved. We determined a 2.55 Å resolution structure of CysRS, a class Ia aaRS, from B. burgdorferi, the causative agent of Lyme disease 29 (Figs 1A and 2A). For each of the six co-crystal structures determined here, a view of the full monomeric structure is shown in Fig. 1. The active site of each aaRS is highlighted in Fig. 2 for class I aaRSs and in Fig. 3 for class II aaRSs. This was the second organism for which a CysRS structure has been reported, although the structure of CysRS from Coxiella burnetii has now been reported (PDB ID 3TQO 35 ) with RMSD 1.46 Å and sequence homology of 34%. The B. burgdorferi CysRS structure was solved during the second round of crystallization trials for this target, as detailed above. The central catalytic domain is quite similar among the E. coli, B. burgdorferi, and C. burnetii CysRSs, although the C-terminal anti-codon recognition domain adopts dramatically different conformations with respect to the catalytic domain. The E. coli CysRS cysteine- and zinc-bound structure (1LI7) 33 had a backbone RMSD of 1.61 Å compared to the B. burgdorferi CysRS structure. After determining the structure of CysRS from B. burgdorferi, initial inspection of the electron density maps revealed two strong difference density features. The first difference density peak most likely corresponded to the catalytic zinc ion, as inferred from the E. coli homolog, and a zinc ion modeled at this position refined appropriately. The second strong difference density feature was consistent with AMP or an AMP-containing molecule, which resides in the same location as the A of the CCA tail (the site of acylation) in the E. coli CysRS/tRNACys crystal structure. Due to additional residual density off the phosphate of the AMP, it appears likely that a mixture of AMP-containing compounds is present, which may have co-purified from the expression host or may represent a mixed population of degraded or disordered ATP, which was added during co-crystallization. Attempts to co-crystallize with tRNA mini-helices containing the CCA acceptor stem were unsuccessful.
Scientific Reports | 7: 223 | DOI:10.1038/s41598-017-00367-6

GluRS from Borrelia burgdorferi and Burkholderia thailandensis bound to L-glutamic acid.
A number of glutamyl-tRNA synthetase (GluRS, E.C. 6.1.1.17) crystal structures have been reported in the literature from bacteria, eukaryotes, and archaea. Unfortunately, the human structure has yet to be solved by X-ray crystallography. We solved two co-crystal structures of the class Ib GluRS bound to L-glutamic acid, one from B. burgdorferi at 2.6 Å resolution and one from B. thailandensis at 2.05 Å resolution (Figs 1B,C and 2B,C). Differences between the two GluRS structures in the cognate amino acid binding pocket are apparent. For example, in the B. thailandensis GluRS structure, His209 makes a hydrogen bond with the main chain carboxylate of the cognate glutamic acid, but no hydrogen bond is observed from the equivalent Trp residue in the B. burgdorferi structure. Several water-mediated interactions were observed in the amino acid binding pocket of the B. thailandensis GluRS structure that were not observed in the B. burgdorferi GluRS structure, presumably due to the higher resolution of the B. thailandensis structure. The two structures have an RMSD of 1.08 Å and share 34% sequence identity and 53% sequence similarity.
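For readers unfamiliar with the comparison metrics quoted for these structures, the sketch below illustrates how an RMSD and a percent sequence identity can be computed. It is not the procedure or software used in this work, which is not specified in this section. It assumes that the two coordinate sets are already superposed and paired atom by atom, and that the two sequences are supplied as a pre-computed pairwise alignment with '-' as the gap character; identity is taken over aligned, non-gap columns, which is only one of several conventions in use. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superposed (x, y, z) coordinates in Å."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns (gaps in either sequence excluded) with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("alignment has no residue-residue columns")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy inputs, not taken from any of the structures discussed here.
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)]))
print(percent_identity("AC-DE", "ACGDF"))
```

A percent similarity would additionally require a definition of which residue pairs count as similar (for example, a substitution matrix threshold), which is not stated in the text and is therefore omitted.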
TrpRS from Encephalitozoon cuniculi bound to L-tryptophan. Crystal structures have been reported for human 36 , yeast 37 , eukaryotic pathogen 22,25 , and bacterial 38 tryptophanyl-tRNA synthetases (TrpRS, E.C. 6.1.1.2), and structures have also been reported for human TrpRS/tRNATrp (2AKE, 2DR2) 36 . We solved a 2.6 Å resolution crystal structure of TrpRS, a class 1c aaRS, from the eukaryotic pathogen E. cuniculi with its cognate amino acid L-tryptophan (Figs 1D and 2D). The E. cuniculi TrpRS structure was solved during the second round of crystallization trials for this target, as detailed above. Comparison with the L-tryptophan-bound human structure (2QUH) 36 demonstrates that the same three amino acids, Glu124, Gln119, and Tyr84, make the same interactions with the cognate amino acid in both structures (Fig. 4A). These three residues make up the only hydrogen bonding interactions of the binding pocket in both structures.

HisRS from Burkholderia thailandensis bound to L-histidine.
[...] 39 , bacteria (2EL9; no primary citation), and a eukaryotic pathogen (3HRI) 23 . We solved a 2.65 Å resolution structure of HisRS, a class 2a aaRS, from the Gram-negative bacterium B. thailandensis bound to its cognate amino acid L-histidine (Figs 1E and 3A). B. thailandensis is commonly used as a model for B. pseudomallei because of their genetic similarity and its far less pathogenic nature. A comparison of the human and the B. thailandensis HisRS structures reveals a backbone RMSD of 1.13 Å. The two proteins share 24% sequence identity and 42% sequence similarity. Unfortunately, the human structure is an apo protein, so we can only speculate as to the similarities of the binding pocket residue interactions for the human protein (Fig. 4B), but we see homologous human residues for Tyr269, Tyr270, Thr92 and Glu90 that likely play a role in hydrogen bonding of the cognate amino acid in the human protein, much like they do in the B. thailandensis HisRS structure (Fig. 2).

Table 2. X-ray diffraction data and structure determination statistics. a Class I aaRS enzymes contain a Rossmann fold and class II aaRS enzymes contain an anti-parallel β-sheet. Additional differences are described 3 . b Values in parentheses indicate the highest resolution shell. 20 shells were used in XSCALE 56 .

Conclusion
In Fig. 1 the six protein structures of the aaRSs are oriented with the aminoacylation domain up and the anticodon tRNA binding domain down. The differences in the overall folds of the aminoacylation domains are apparent for the class I aaRS enzymes, which have a Rossmann fold (Fig. 1A-D), in comparison with the class II aaRS enzymes, which have an anti-parallel β-sheet (Fig. 1E,F). As mentioned earlier, the differences in the cognate amino acid binding pocket relative to comparable human structures are subtle. For example, in the E. cuniculi TrpRS crystal structure the three residues that make hydrogen bonds with the cognate amino acid, Glu124, Gln119, and Tyr84, overlay almost exactly with the homologous residues of the human structure. Any compound that is selective between these two proteins would need to exploit more than just these three amino acids in the aminoacyl binding pocket. Differences, especially those just outside the aminoacyl binding pocket, need to be taken advantage of when trying to gain selectivity with a molecular probe compound or potential lead compound. Koh et al. used the T. cruzi HisRS to build compounds from a site just adjacent to the aminoacyl binding pocket; these compounds exploit a cysteine residue found in the T. cruzi structure, but not in the human one, to act as covalent binders 17 . Along similar lines, a number of ProRS inhibitors have been identified with high specificity for pathogenic ProRS enzymes over human enzymes, and these inhibitors, such as TCMDC-124506 or glyburide, largely bind outside the aminoacyl binding pocket 19 . In addition to the MetRS compounds mentioned above, there are natural products that target other aaRSs, such as febrifugine, which might lend more confidence to aaRSs being a viable antibiotic target for some of the organisms discussed in this manuscript.
Additionally, aaRS inhibitors in clinical trials, such as halofuginone, make the whole class of aaRSs an interesting group of enzymes from a therapeutic perspective. Another clinically relevant aaRS inhibitor, tavaborole, is a topical antifungal medication that inhibits leucyl-tRNA synthetase in onychomycosis fungal infections. The aaRSs have thus been validated as useful targets for the development of therapeutic compounds; we hope our work will lead to inhibitors against the organisms discussed here. Ideally, these six structures can help guide the creation of more inhibitors and subsequent structures from other organisms.

Methods
Protein expression and purification. Detailed SSGCID cloning, protein expression, and purification protocols have been reported previously 43,44 . Briefly, SSGCID targets were cloned from genomic DNA into an [...] ButhA.00612.a (LysRS), and EncuA.00600.a (TrpRS) that resulted in crystal structures, the expression and affinity tag was not removed prior to crystallization. For ButhA.01187.a (GluRS) and BobuA.01348.a (GluRS) the affinity tag was not removed. As a polishing step for crystallography, all protein samples were further purified by size exclusion chromatography in 20 mM HEPES pH 7.0, 300 mM NaCl, 2 mM DTT, and 5% glycerol. Fractions containing pure protein were collected, pooled, concentrated to ~20-30 mg/ml, and stored at −80 °C prior to crystallization experiments.
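As a worked example of the size exclusion buffer composition given above, the following sketch converts the stated concentrations into amounts for a chosen volume. Several points are assumptions rather than details from the protocol: the molecular weights are standard values for HEPES free acid, NaCl, and DTT; the 5% glycerol is interpreted as volume per volume, which the text does not specify; and pH adjustment to 7.0 is not modeled. The function and variable names are hypothetical.

```python
# Standard molecular weights in g/mol (assumed reagent forms: HEPES free acid, NaCl, DTT).
MOLECULAR_WEIGHT = {"HEPES": 238.30, "NaCl": 58.44, "DTT": 154.25}

# Concentrations from the text, in mM.
BUFFER_MM = {"HEPES": 20.0, "NaCl": 300.0, "DTT": 2.0}

GLYCEROL_PERCENT = 5.0  # assumed to be % (v/v); the text does not state v/v or w/v


def buffer_recipe(volume_l):
    """Return grams of each solid component and mL of glycerol for the given volume in litres."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    recipe = {}
    for name, conc_mm in BUFFER_MM.items():
        moles = conc_mm / 1000.0 * volume_l  # mM -> mol/L, then multiply by litres
        recipe[name + "_g"] = moles * MOLECULAR_WEIGHT[name]
    recipe["glycerol_mL"] = GLYCEROL_PERCENT / 100.0 * volume_l * 1000.0
    return recipe


for component, amount in buffer_recipe(1.0).items():
    print(f"{component}: {amount:.3f}")
```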