Introduction

During protein synthesis, aminoacylated tRNAs bind to the ribosome with the anticodon loop pairing with the codon of the mRNA template while delivering the incoming amino acid to the elongating polypeptide. Aminoacyl-tRNA synthetases (aaRSs or aminoacyl tRNA ligases) charge tRNAs with their cognate amino acids in a two-step mechanism1. First, the aaRS combines the specific amino acid with adenosine-5′-triphosphate (ATP) to produce an activated aminoacyl-adenylate intermediate. Second, this intermediate reacts with the appropriate tRNA to produce the aminoacylated tRNA. Inhibition of either step results in the buildup of uncharged tRNAs in the cell and consequently on the ribosome, thereby inhibiting protein synthesis2. In general, aaRSs are divided into two classes based on the global fold and sequence conservation3. The active sites of class I aaRSs contain a Rossmann fold with two highly conserved sequence motifs, HIGH and KMSKS. The active sites of class II aaRSs contain an anti-parallel β-sheet. Class I aaRSs recognize the CCA acceptor stem by approaching via the minor groove, whereas class II aaRSs recognize the CCA acceptor stem via the major groove, a recognition strategy similar to that of an in vitro selected aminoacyl tRNA synthetase ribozyme4. Each class is further divided into three subclasses based on subunit structure and sequence conservation. Significant differences have been noted between the prokaryotic and eukaryotic homologs of several aaRSs, implying that these enzymes may be viable targets for antimicrobial drugs2, 5. Indeed, the methicillin-resistant Staphylococcus aureus (MRSA) IleRS inhibitor mupirocin has been approved for clinical use, and its binding site has been shown by X-ray crystallography to overlap with that of the Ile-AMP reactive intermediate6, 7. Because it contains an ester bond that is rapidly hydrolyzed in blood plasma, mupirocin is limited to topical use.

A number of aaRS inhibitors are in preclinical development2. These include natural products such as borrelidin, which targets a number of ThrRSs through an allosteric mechanism, and ochratoxin A, which targets PheRS as an active site inhibitor. Structures have not yet been determined for either of these inhibitors bound to their target aaRS, although modeling studies based on resistance mutations have provided insight into the putative borrelidin binding site8. In contrast, a number of structures have been published for aminoacyl-adenylate reactive intermediate analogs, such as sulfonamides9,10,11,12. However, these compounds typically also inhibit the human homolog and therefore have been abandoned as drug candidates. More recently, a series of diamino quinoline compounds has been developed against Gram-positive bacterial MetRSs, first by GlaxoSmithKline13, then by Replidyne14,15,16, and an academic group5, 17. These compounds exhibit strong selectivity for the bacterial MetRS over human MetRS, but may suffer from poor bioavailability5. Therefore, further research is necessary both in lead development and aaRS structural biology. The crystal structures of P. falciparum LysRS and ProRS with cladosporin or halofuginone represent valuable studies of aaRS complexes with natural product-like anti-malarial inhibitors18, 19.

Due to their biological importance and potential as therapeutic targets, aminoacyl-tRNA synthetases have been targeted by a number of structural genomics centers. Perhaps the most successful structural genomics center at studying aaRSs has been the Medical Structural Genomics of Pathogenic Protozoa (MSGPP), which along with subsequent efforts has resulted in nearly twenty aaRS crystal structures17, 20,21,22,23,24,25. Nearly one hundred aaRSs have entered the Seattle Structural Genomics Center for Infectious Disease (SSGCID)26,27,28 structure determination pipeline, both as internally selected targets and as targets nominated by the scientific community. These targets are composed largely of aaRSs from Gram-negative bacteria such as Borrelia burgdorferi, which causes Lyme disease29; Brucella melitensis, which causes brucellosis or Malta fever; orthologs of Mycobacterium tuberculosis, which causes tuberculosis; and Rickettsia prowazekii, the etiologic agent of epidemic typhus. Other targets include a smaller number of aaRSs from eukaryotic pathogens such as Ehrlichia chaffeensis and Encephalitozoon cuniculi. Several aaRSs from Burkholderia thailandensis were identified as candidate essential genes in a transposon screen30. A number of these selected aaRS enzymes have been successfully purified, although none of them reached structure determination through first pass pipeline techniques. Here we describe our efforts to obtain aminoacyl-tRNA synthetase structures from infectious disease organisms, which have resulted in six new aaRS co-crystal structures; the initial structure of a seventh target identified via this strategy was recently reported along with inhibitor complexes31. All of these structures contain a ligand, which may be important for stabilizing the enzyme and promoting crystallizability.

Results and Discussion

Co-crystallization of aaRSs from SSGCID organisms

For the initial round of crystallization of SSGCID targets, no ligands were added to the protein solution. In general, two crystallization trials were set up in 96-well format, most commonly in the JCSG+ and PACT sparse matrix screens32, although depending on day-to-day availability one or both of these screens were substituted with Wizard III/IV, Wizard I/II, CSHT or Morpheus. If diffraction quality crystals were not obtained from the initial round of crystallization trials, 6–8 additional sparse matrix trials were set up in 96-well format for high value targets such as those requested by the scientific community. During this second round, the protein concentration was adjusted depending on the percentage of drops containing precipitation in the first round (aiming for approximately 30–50% precipitation as optimal). During the third round of crystallization, a subset of the available aaRS protein samples was incubated with 5 mM ATP (Sigma-Aldrich) and 5 mM of the cognate amino acid (Sigma-Aldrich), and four additional sparse matrix screens were initiated, typically JCSG+, PACT, Wizard III/IV and CSHT. The combined results for the first three rounds of crystallization trials are shown in Table 1. Of the 31 proteins selected for co-crystallization trials, 18 produced crystals (58%), 11 produced crystals which diffracted to better than 6 Å resolution (35%), and 7 crystal structures were determined (23%). Overall, these rates are comparable with other protein classes in the SSGCID pipeline. X-ray diffraction data and structure determination statistics for the six structures are shown in Table 2 and the individual structures are detailed below. Although we solved one or more structures of most aaRS subclasses, we were unable to obtain a co-crystal structure of subclass 2c, perhaps in part due to the low solubility of L-phenylalanine or ochratoxin A in aqueous solution at crystallography concentrations.
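The success rates quoted above follow directly from the reported counts. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only, and rounding to whole percentages is an assumption that happens to match the figures in the text.

```python
# Counts reported in the text for the co-crystallization trials.
selected = 31
outcomes = {
    "produced crystals": 18,
    "diffracted to better than 6 A": 11,
    "structures determined": 7,
}


def success_rate(count, total):
    """Return the percentage of targets reaching a stage, rounded to a whole number."""
    return round(100 * count / total)


# Percentage of the 31 selected proteins that reached each stage.
rates = {stage: success_rate(n, selected) for stage, n in outcomes.items()}

for stage, pct in rates.items():
    print(f"{stage}: {outcomes[stage]}/{selected} = {pct}%")
```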

Table 1 Co-crystallization of aaRSs from SSGCID organisms.
Table 2 X-ray diffraction data and structure determination statistics.

CysRS from Borrelia burgdorferi bound to AMP

Crystal structures of cysteinyl-tRNA synthetase (CysRS, E.C. 6.1.1.16) from E. coli have been reported as apo, bound to substrate33, and in complex with tRNA34. Interestingly, in some organisms a CysRS enzyme has not been identified, and a prolyl-tRNA synthetase (ProRS) exhibits cross-reactivity to charge tRNAs with cysteine, although this may represent mis-acylation rather than a truly bifunctional enzyme10. A structure of human CysRS has not yet been solved. We determined a 2.55 Å resolution structure of CysRS, a class Ia aaRS, from B. burgdorferi, the causative agent of Lyme disease29 (Figs 1A and 2A). For each of the six co-crystal structures determined here, a view of the full monomeric structure is shown in Fig. 1. The active sites are highlighted in Fig. 2 for class I aaRSs and in Fig. 3 for class II aaRSs. This was the second organism for which a CysRS structure was reported, although the structure of CysRS from Coxiella burnetii has since been reported (PDB ID 3TQO35), with an RMSD of 1.46 Å and sequence homology of 34%. The B. burgdorferi CysRS structure was solved during the second round of crystallization trials for this target, as detailed above. The central catalytic domain is quite similar among the E. coli, B. burgdorferi, and C. burnetii CysRSs, although the C-terminal anti-codon recognition domain adopts dramatically different conformations with respect to the catalytic domain. The E. coli CysRS cysteine and zinc bound structure (1LI7)33 had a backbone RMSD of 1.61 Å compared to the B. burgdorferi CysRS structure. After determining the structure of CysRS from B. burgdorferi, initial inspection of the electron density maps revealed two strong difference density features. The first difference density peak most likely corresponded to the catalytic zinc ion, as inferred from the E. coli homolog, and it modeled and refined appropriately.
The second strong difference density was consistent with an AMP or AMP-containing molecule, which resides in the same location as the A of the CCA tail (the site of acylation) in the E. coli CysRS/tRNACys crystal structure. Due to additional residual density off the phosphate of the AMP, it appears likely that a mixture of AMP-containing compounds either co-purified from the expression host or represents a mixed population of degraded or disordered ATP, which was added during co-crystallization. Attempts to co-crystallize with tRNA mini-helices containing the CCA acceptor stem were unsuccessful.

Figure 1

Overview of co-crystal structures of aaRS enzymes from infectious disease organisms. In the current study, we have determined 6 co-crystal structures of aminoacyl tRNA synthetase (aaRS) enzymes from infectious disease organisms: CysRS from Borrelia burgdorferi (A), GluRS from B. burgdorferi (B) and Burkholderia thailandensis (C), TrpRS from Encephalitozoon cuniculi (D), HisRS from B. thailandensis (E), and LysRS from B. thailandensis (F). For the sake of simplicity, only a single monomer is shown, although some are biological oligomers, such as HisRS, which is a dimer.

Figure 2

Ligand recognition by class 1 aaRS enzymes from infectious disease organisms. (A) Class 1a CysRS from Borrelia burgdorferi, (B) class 1b GluRS from B. burgdorferi, (C) class 1b GluRS from Burkholderia thailandensis, and (D) class 1c TrpRS from Encephalitozoon cuniculi.

Figure 3

Ligand recognition by class 2 aaRS enzymes from infectious disease organisms. (A) Class 2a HisRS from B. thailandensis and (B) class 2b LysRS from B. thailandensis.

GluRS from Borrelia burgdorferi and Burkholderia thailandensis bound to L-glutamic acid

A number of glutamyl-tRNA synthetase (GluRS, E.C. 6.1.1.17) crystal structures have been reported in the literature from bacteria, eukaryotes, and archaea. Unfortunately, the human structure has yet to be solved by X-ray crystallography. We solved two co-crystal structures of the class Ib GluRS bound to L-glutamic acid, one from B. burgdorferi at 2.6 Å resolution and one from B. thailandensis at 2.05 Å resolution (Figs 1B,C and 2B,C). Differences between the two GluRS structures in the cognate amino acid binding pocket are apparent. For example, in the B. thailandensis GluRS structure, His209 makes a hydrogen bond with the main chain carboxylate of the cognate glutamic acid, but a hydrogen bond is not observed from the equivalent Trp residue in the B. burgdorferi structure. In the B. thailandensis GluRS structure several water-mediated interactions were observed in the amino acid binding pocket compared to the B. burgdorferi GluRS structure, presumably due to the higher resolution of the B. thailandensis GluRS structure. The two structures have an RMSD of 1.08 Å, and their amino acid sequences are 34% identical and 53% similar.
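The RMSD values quoted for pairs of structures summarize how far equivalent atoms lie from one another after superposition. As a minimal sketch of that calculation, assuming the two coordinate lists are already superposed and paired atom by atom (the text does not state which program performed the superposition or computed the values), the root-mean-square deviation can be written as follows. The function name and the toy coordinates are illustrative and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in Å between two paired lists of (x, y, z) points.

    Assumes the coordinates are already superposed and listed in the
    same order; no alignment or fitting is performed here.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: three atoms, each displaced by exactly 1 Å along y.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
toy_rmsd = rmsd(structure_a, structure_b)
print(f"RMSD = {toy_rmsd:.2f} Å")
```

In practice the superposition step (for example, a least-squares fit over aligned Cα or backbone atoms) determines which atoms are paired, so reported values depend on the alignment used.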

TrpRS from Encephalitozoon cuniculi bound to L-tryptophan

Crystal structures have been reported for human36, yeast37, eukaryotic pathogen22, 25, and bacterial38 tryptophanyl-tRNA synthetases (TrpRS, E.C. 6.1.1.2), and structures have also been reported for human TrpRS/tRNATrp (2AKE, 2DR2)36. We solved a 2.6 Å resolution crystal structure of TrpRS, a class 1c aaRS, from the eukaryotic pathogen E. cuniculi with its cognate amino acid L-tryptophan (Figs 1D and 2D). The E. cuniculi TrpRS structure was solved during the second round of crystallization trials for this target, as detailed above. The L-tryptophan-bound human (2QUH)36 and E. cuniculi TrpRS structures are fairly similar, with an RMSD of 1.13 Å. The E. cuniculi and human protein sequences are 46% identical and 65% similar. Comparison of the TrpRS structures from E. cuniculi (3TZE) and human (2QUH)36 demonstrates that the same three residues, Glu124, Gln119, and Tyr84, make the same interactions with the cognate amino acid in both structures (Fig. 4A). These three residues make up the only hydrogen bonding interactions of the binding pocket in both structures.

Figure 4

Comparison of the active sites and cognate ligand recognition between aaRSs from human and infectious disease organisms. (A) Overlay of E. cuniculi TrpRS (PDB ID 3TZE) showing the cognate amino acid binding pocket with human TrpRS (2QUH)36 also containing the cognate amino acid, (B) B. thailandensis HisRS (4E51) showing the cognate amino acid binding pocket with human HisRS (4X5O)39 which lacks the cognate amino acid, (C) B. thailandensis LysRS (4EX5) showing the cognate amino acid binding pocket with human LysRS (3BJU)40 also containing the cognate amino acid and an ATP molecule.

HisRS from Burkholderia thailandensis bound to L-histidine

A number of histidyl-tRNA synthetase (HisRS, E.C. 6.1.1.21) crystal structures have been reported in the literature, including examples from human (4X5O)39, bacteria (2EL9; no primary citation), and a eukaryotic pathogen (3HRI)23. We solved a 2.65 Å resolution structure of HisRS, a class 2a aaRS, from the Gram-negative bacterium B. thailandensis bound to its cognate amino acid L-histidine (Figs 1E and 3A). B. thailandensis is commonly used as a model for B. pseudomallei because of their genetic similarity and its far less pathogenic nature. A comparison of the human and the B. thailandensis HisRS structures reveals a backbone RMSD of 1.13 Å. The sequences of these two proteins are 24% identical and 42% similar. Unfortunately, the human structure is an apo protein, so we can only speculate as to the similarities of the binding pocket residue interactions for the human protein (Fig. 4B). However, we see homologous human residues for Tyr269, Tyr270, Thr92 and Glu90 that likely play a role in hydrogen bonding of the cognate amino acid in the human protein, much like they do in the B. thailandensis HisRS structure (Fig. 3A).

LysRS from Burkholderia thailandensis bound to L-Lysine

Crystal structures have been reported for lysyl-tRNA synthetase (LysRS, E.C. 6.1.1.6) from eukaryotic (including human, 3BJU)40 and bacterial organisms. We solved a 2.4 Å resolution structure of LysRS, a class 2b aaRS, from B. thailandensis bound to L-lysine (Figs 1F and 3B). The cognate amino acid binding pockets of the B. thailandensis and human structures are very similar (Fig. 4C) and make many of the same hydrogen bonding interactions. The B. thailandensis structure reported here (4EX5) and the human LysRS structure (3BJU) superimpose with an overall RMSD of 0.91 Å. The sequences of these two proteins are 39% identical and 55% similar. Recently, several groups have been interested in the inhibitor cladosporin and have solved crystal structures of cladosporin bound to lysyl-tRNA synthetases from Cryptosporidium parvum (PDB ID 4ELO; no primary citation), Loa loa (PDB ID 5HGQ)41, and Plasmodium falciparum (PDB ID 4YCV)42.

Conclusion

In Fig. 1 the six protein structures of the aaRSs are oriented with the aminoacylation domain up and the anticodon tRNA binding domain down. The differences in the overall folds of the aminoacylation domains are apparent for the class I aaRS enzymes, which have a Rossmann fold (Fig. 1A–D), in comparison with the class II aaRS enzymes, which have an anti-parallel β-sheet (Fig. 1E,F). As mentioned earlier, the cognate amino acid binding pocket differences between comparable human structures are subtle. For example, in the E. cuniculi TrpRS crystal structure the three residues that make hydrogen bonds with the cognate amino acid, Glu124, Gln119, and Tyr84, overlay almost exactly with the homologous residues of the human structure. Any compound selective between these two proteins would need to utilize more than just these three amino acids in the aminoacyl binding pocket. Differences, especially just outside the aminoacyl binding pocket, need to be taken advantage of when trying to gain selectivity with a molecular probe compound or potential lead compound. Koh et al. used the T. cruzi HisRS to build compounds from a site just adjacent to the aminoacyl binding pocket; these compounds utilize a cysteine residue found in the T. cruzi structure, but not in the human one, to bind covalently17. Along similar lines, a number of ProRS inhibitors have been identified with high specificity for pathogenic ProRS enzymes over human enzymes, and these inhibitors, such as TCMDC-124506 or glyburide, largely bind outside the aminoacyl binding pocket19. In addition to the MetRS compounds mentioned above, there are natural products that target other aaRSs (e.g., febrifugine), which might lend more confidence to aaRSs being viable antibiotic targets for some of the organisms discussed in this manuscript.
Additionally, there are aaRS inhibitors in clinical trials (e.g., halofuginone) that also make the whole class of aaRSs an interesting group of enzymes from a therapeutic perspective. Another clinically relevant aaRS inhibitor, tavaborole, is a topical antifungal medication that inhibits leucyl-tRNA synthetases in onychomycosis fungal infections. The aaRSs have thus been validated as useful targets for the development of therapeutic compounds; we hope our work will lead to inhibitors against the organisms discussed here. Ideally, these six structures can help guide the creation of more inhibitors and subsequent structures from other organisms.

Methods

Protein expression and purification

Detailed SSGCID cloning, protein expression, and purification protocols have been reported previously43, 44. Briefly, SSGCID targets were cloned from genomic DNA into an expression vector (pAVA0421) encoding an N-terminal histidine affinity tag followed by the human rhinovirus 3C protease cleavage sequence (the entire tag is MAHHHHHHMGTLEAQTQGPGS). All SSGCID targets were forward and reverse sequence verified. Proteins were expressed in E. coli using BL21 (DE3) R3 Rosetta cells and auto-induction media in a LEX bioreactor. The cells were pelleted and frozen at −80 °C. Cells were re-suspended in lysis buffer, sonicated, and clarified by centrifugation. The proteins were purified initially by immobilized metal affinity chromatography. The affinity tag was removed by cleavage with 3C protease followed by a subtractive nickel affinity column for about 60% of all protein samples. For BobuA.00133.a (CysRS), ButhA.00063.a (HisRS), ButhA.00612.a (LysRS), and EncuA.00600.a (TrpRS) that resulted in crystal structures, the expression and affinity tag was not removed prior to crystallization. For ButhA.01187.a (GluRS) and BobuA.01348.a (GluRS) the affinity tag was not removed. All protein samples were further purified, as a polishing step for crystallography, by size exclusion chromatography equilibrated in 20 mM HEPES pH 7.0, 300 mM NaCl, 2 mM DTT, and 5% glycerol. Fractions containing pure protein were collected, pooled, concentrated to ~20–30 mg/ml, and stored at −80 °C prior to crystallization experiments.
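To illustrate what tag removal leaves behind, the sketch below splits the tag sequence given above at the 3C protease site. It assumes the protease cuts between the Gln and the Gly of the Gln-Gly-Pro motif, as is typical for human rhinovirus 3C protease; the exact cleavage position is not stated in the text, and the variable names are illustrative.

```python
# Full N-terminal tag as given in the text.
tag = "MAHHHHHHMGTLEAQTQGPGS"

# Assumption: 3C protease cleaves between Gln (Q) and Gly (G) of the Q-G-P motif.
motif = "QGP"
motif_start = tag.find(motif)
if motif_start == -1:
    raise ValueError("no Q-G-P motif found in the tag")

cut = motif_start + 1     # cut position falls right after the Gln
removed = tag[:cut]       # portion released by cleavage (His tag and linker)
remaining = tag[cut:]     # residues left on the N-terminus of the target protein

print(f"tag length:       {len(tag)} residues")
print(f"removed portion:  {removed}")
print(f"residual residues: {remaining}")
```

Under this assumption, cleaved samples would retain a short N-terminal remnant, whereas uncleaved samples carry the full tag into crystallization.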

Crystallization

Crystallization trials were set up using the CryoFull, JCSG+, Morpheus, PACT, Synergy, Wizard Full (I/II), and Wizard III/IV sparse matrix crystallization screens from Rigaku Reagents and CSHT, Index, and Salt Rx from Hampton Research. Sitting drop vapor diffusion crystallization trials were set up at 16 °C using 0.4 µL of protein and 0.4 µL of precipitant against 80 µL of reservoir in Compact Jr 96-well crystallization plates from Rigaku Reagents. CysRS from Borrelia burgdorferi (BobuA.00133.a) crystallized in the presence of 25% PEG 3350 and 0.2 M Na/K tartrate from the PACT screen condition E9. Both GluRS from Borrelia burgdorferi (BobuA.01348.a) supplemented with 20 mM glutamic acid and TrpRS from Encephalitozoon cuniculi (EncuA.00600.a) crystallized in the presence of 20% PEG 3350 and 0.2 M potassium nitrate from the Wizard III/IV screen condition A8. HisRS from Burkholderia thailandensis (ButhA.00063.a) supplemented with 5 mM L-histidine crystallized in the presence of 350 mM Mg formate, 12% PEG 3350 from a Rigaku Reagents E-Wizard optimization screen from the initial Wizard III/IV screen condition A3 hit. LysRS from Burkholderia thailandensis (ButhA.00612.a) crystallized in the presence of 10% PEG 20,000, 20% PEG 550 MME, 0.1 M MOPS/HEPES pH 7.5, 0.02 M of DL-alanine, L-glutamic acid, glycine, DL-lysine and DL-serine from the Morpheus screen condition H5. GluRS from Burkholderia thailandensis (ButhA.01187.a) crystallized in the presence of 0.1 M MES/imidazole, 12.5% PEG 1000, 12.5% PEG 3350, 12.5% MPD, 0.02 M L-glutamate, alanine, lysine, serine, glycine from the Morpheus screen condition H4. Crystals were typically cryo-protected with crystallization reservoir supplemented with 10–25% ethylene glycol or 20% glycerol for ButhA.00063.a and flash frozen by plunging into liquid nitrogen. ButhA.00612.a (Morpheus H5) and ButhA.01187.a (Morpheus H4) were flash frozen without supplemental cryo-protectant.
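Because each sitting drop mixes equal volumes of protein and precipitant, every component starts at half its stock concentration before vapor diffusion against the reservoir concentrates the drop. The sketch below works through that dilution for the 0.4 µL + 0.4 µL drops, using the ~20–30 mg/ml protein stock range from the purification protocol and the 25% PEG 3350 of the CysRS condition as examples. The function name is illustrative, and the calculation ignores any volume change on mixing.

```python
def initial_drop_concentration(stock, component_volume_ul, other_volume_ul):
    """Concentration of a component immediately after mixing the drop.

    Simple dilution only; equilibration against the reservoir is not modeled.
    """
    return stock * component_volume_ul / (component_volume_ul + other_volume_ul)


protein_ul = 0.4       # volume of protein solution per drop
precipitant_ul = 0.4   # volume of precipitant solution per drop

# Protein stock range (mg/ml) from the purification protocol.
protein_low = initial_drop_concentration(20.0, protein_ul, precipitant_ul)
protein_high = initial_drop_concentration(30.0, protein_ul, precipitant_ul)

# Example precipitant component: 25% PEG 3350 (CysRS condition, PACT E9).
peg_start = initial_drop_concentration(25.0, precipitant_ul, protein_ul)

print(f"protein at setup: {protein_low:.1f}-{protein_high:.1f} mg/ml")
print(f"PEG 3350 at setup: {peg_start:.1f}%")
```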

Data collection and structure determination

Data sets were collected (Table 2). Diffraction images are available (http://www.csgid.org/csgid/pages/diffraction_images). Molecular replacement was performed using PHASER45 from the CCP4 suite46. The structure of CysRS from B. burgdorferi (BobuA.00133.a) was solved using the structure of CysRS from E. coli (PDB ID 1LI533, 33% sequence identity) as a search model. The structure of GluRS from B. burgdorferi was solved using 1J0947 as a search model. The structure of GluRS from B. thailandensis was solved using 4GRI as a search model. The crystal structure of TrpRS from E. cuniculi (EncuA.00600.a) was solved using human TrpRS (PDB ID 1ULH48, 46% sequence identity) as a search model. The structure of HisRS from B. thailandensis (ButhA.00063.a) was solved using HisRS from E. coli (PDB ID 1HTT49, 55% sequence identity) as a search model. The structure of LysRS from B. thailandensis (ButhA.00612.a) was solved using LysRS from E. coli (PDB ID 1BBU50, 58% sequence identity) as a search model. Structures were built using automated building in BUCCANEER51 followed by numerous iterative rounds of manual rebuilding in Coot52 and refinement in REFMAC553 or Phenix.Refine54. The correctness of each structure was examined, validated, and improved using Molprobity55.