Introduction

Recent advancements in both microscope hardware and computational processing have led to cryo-electron microscopy (cryoEM) emerging as a mainstream method for biomolecular structure determination. While in ideal cases cryoEM data approaches atomic resolutions1,2,3, most structures determined by cryoEM are in the 3–5 Å resolution range. At these resolutions, model building is time-consuming, error prone, and often ambiguous. To assist this process, methods have been developed to automatically build de novo polypeptide chains into EM data4,5,6,7, and with the advent of AlphaFold 2, high-quality starting models can oftentimes be obtained from sequence information alone8,9. While these methods help build protein models into cryoEM density, tools for automatic fitting of small molecule ligands into cryoEM data are limited. Given the widespread adoption of cryoEM in academia and in industry to support translational studies of drug targets, the ability to accurately model ligand-bound structures is paramount.

There are numerous automated tools from X-ray crystallography for modeling small molecule ligands10,11,12,13. However, their methodology is unproven for interpreting all but the highest-resolution cryoEM maps. Traditional ligand fitting methods rely on shape and topological features of density maps to match10 or build11,12,13 the ligand into density. But as resolution worsens beyond 3 Å, the topological features these methods rely on become less defined, and their success rate in modeling ligands within 1 Å RMSD of a reference ligand falls below 20%13,14. While these software packages have been updated to consider cryoEM data, the updates either focus on protein modeling without reported updates to small molecule modeling15,16 or focus on small molecule refinement instead of automatic model building17.

Along with map features, chemical force fields have provided an energetic approach to accurately fit ligands into their respective density. Two approaches—GemSpot18 and MDFF19—utilize the ligand-docking software GLIDE to model ligands into cryoEM data. However, both require user input in either selecting models during the protocol or choosing a starting configuration, limiting the automation and applicability of these approaches. The protein modeling software Rosetta recently incorporated a small molecule force field, RosettaGenFF, which accurately models the energetics of arbitrary biomolecules in a manner balanced against Rosetta’s protein force field20. Combining this energy model with a genetic-algorithm (GA) optimization method allowing for full receptor side chain flexibility, GALigandDock, yielded superior performance in ligand docking accuracy compared to other state-of-the-art methods.

Here, we leverage the docking power of RosettaGenFF and GA optimization to overcome the challenges of modeling small molecules at near-atomic resolution. We integrate cryoEM density data with the physically realistic force field of RosettaGenFF to create RosettaEMERALD (EM Maps ERoded for Automatic Ligand Docking) for robust ligand modeling into cryoEM maps with no user input during the protocol. We evaluate the performance of EMERALD on all non-ion-mediated ligand-bound protein structures deposited in the EMDB21 and compare our results to their respective deposited structures and high-resolution crystal structures when available.

Results

An overview of EMERALD is illustrated in Fig. 1. GALigandDock places ligands in a protein pocket by iteratively refining a pool of 100 conformations, selecting the best 100 models at each generation using predicted energy. To enable this method to use cryoEM density, two changes were integral: density-guided initial ligand placement and the use of density in model selection at each round. Our initial placement (fully described in Methods) first models density as a pseudo-atomic skeleton (Fig. 1b). When generating the initial population of ligands, ligands are placed at the center of the skeleton and restrained to points in the skeleton. At each iteration, the population of ligand conformers along with their surrounding flexible side chains are further optimized against the sum of a weighted density correlation and the RosettaGenFF energy (Fig. 1c) and finally refined in Rosetta to minimize the energy of the models (Fig. 1d). The full protocol generates a structure in 30–120 min, depending on the size of the ligand and the cryoEM map.

To test EMERALD, we ran our docking protocol on all ligands with 25 or fewer rotatable torsion angles present in deposited cryoEM structures determined at 6 Å nominal resolution or better. This yielded 1053 ligands to be placed. For each model, we ran three independent trajectories, and we analyzed the resulting models using three different criteria: (a) agreement of the lowest-energy predicted structure with the deposited model; (b) density fit and number of protein/ligand hydrogen bonds; and (c) convergence of the three trajectories. This last criterion is used to evaluate the confidence in a predicted model.

The results of these docking trajectories are summarized in Fig. 2. In 57% of the cases, our density-guided docking produced a top model within 1 Å RMSD (considering all non-hydrogen atoms in the ligand) of the deposited model after energy minimization (match, Fig. 2a). While an RMSD cutoff of 2 Å has traditionally been used for docking success, the lack of confidence in the low-resolution reference models and the inability of RMSD to account for receptor contacts led us to divide results further by density correlation and hydrogen bond contacts. There were 401 cases (38%) where EMERALD produced a model with an RMSD value >1 Å, and the model was similar to or better than the deposited model in both metrics (non-match, similar or better quality). The smallest group comprised 48 cases (5%) where the deposited model was not recapitulated and the EMERALD model had a worse density fit or fewer hydrogen bonds than the deposited model (non-match, worse quality). Modeling accuracy decreases as ligand flexibility increases and as the local resolution of the map surrounding the ligand worsens (Fig. 2b, c). We also found that incorporating EM data in GALigandDock is necessary for recapitulating deposited ligand structures with high success rates (Supplementary Fig. 1).

Because of the low resolution of the density maps, it is difficult to interpret the quality of docked poses from density fit and receptor interactions alone. To increase confidence in docking results, we analyzed the convergence among the top-ranked ligand poses across three replicates (Fig. 2d–f). Of the cases within 1 Å RMSD, two or more of the trajectories converge in 81%, further supporting the quality of the matched cases (Fig. 2d). In contrast, only 23% of the worse-quality cases converge on the same ligand model (Fig. 2e). Given how well trajectory convergence agrees with these categories, it can serve as a proxy for confidence in the ambiguous cases where our docked model differs from the reference model. Of these ambiguous cases, 42% have similar top models across our trajectories (Fig. 2f), giving us confidence in an alternative model to the deposited structure for those entries.

Our dataset includes 15 of the 20 cases benchmarked for the GemSpot pipeline18, with five cases filtered out of the dataset for being peptides or having inter-residue bonds like ion coordination. For 13 of the 15 ligands, EMERALD produced a ligand within 1 Å of the deposited structure, with nine of those placements assessed as confident. For the other two cases, our models disagreed with the deposited model; GemSpot also found solutions different from the deposited model in these two cases.

Crystal models confirm alternate conformations for EM data

To cross-validate our results—particularly in cases where we found a different solution than the deposited model—we looked for all models with a corresponding high-resolution crystal structure (see Methods). We identified 100 cases where EMERALD converged on a ligand placement and a corresponding high-resolution crystal structure was available. The converged docked model was within 1 Å RMSD of the ligand modeled in the crystal structure for 67% of cases, while 58% of the deposited EM models were within this distance. Considering cases where the model predicted from EMERALD and the reference EM model differ, there were six cases where the EMERALD model was within 1 Å RMSD to the crystal structure while the EM model was not, three cases where the EM model was within 1 Å of the crystal structure but the EMERALD model was not, and eight cases where both models differed from the crystal structure by more than 1 Å. In addition, in five of the six cases where our model predicts the crystal structure, our ligand model improves density correlation by at least 0.03, compared to the deposited cryoEM model.

We show docked models supported by crystal structures in Fig. 3 to highlight the quality of our protocol. These examples include: (a) the hippocampal AMPA receptor with the antagonist MPQX22, where our model makes additional hydrogen bond and π-stacking interactions with the ligand, matching the crystal structure23 (Fig. 3a); (b) NBQX in an AMPA receptor24, where the ligand is flipped, better matching the density, and making bidentate interactions with a nearby arginine residue (Fig. 3b); (c) DNMDP bound to the SLFN12-PDE3A complex25, where small changes better match the crystal structure (Fig. 3c); (d) an ADP molecule in ClpB disaggregase26 (Fig. 3d), where the phosphate groups recapitulate the crystal structure; and (e) a glutamate ligand in the AMPA glutamate receptor24, which was missing an oxygen atom in the deposited structure; when the full glutamate molecule is docked, the carboxylates are placed in a configuration matching the crystal structure27 (Fig. 3e).

There were three cases where our docking protocol found a ligand pose different from the crystal structure, while the EM model matched the crystal structure closely. All three cases were different maps of the same system, a folate molecule bound to MERS-CoV28,29. In all three, the EMERALD model and the crystal structure differ only in the placement of a flexible arm with high B-factors in the crystallographic data (Supplementary Fig. 2)30. These results lend more support for EMERALD convergence as a confidence metric, which we used to further find instances of alternate ligand conformations.

Docked poses reveal plausible alternate conformations

Even without crystal structures for reference, trajectory convergence and improved ligand density fit provide confidence in other docked poses. In the case of an antimicrobial bound multiple transferable resistance (Mtr) pump31, our protocol converges on an ampicillin molecule that is flipped so that its phenyl group is now in a pocket of unassigned density (Fig. 4a, b). While the deposited model places the phenyl group sandwiched between two phenylalanine residues (Fig. 4a), our docked model packs the group near a cluster of hydrophobic residues known to interact with other antibiotics31 (Fig. 4b). In addition, nearby arginine, serine, and threonine residues have been suggested to generally coordinate ligands binding to the pump31; our model has the carboxyl group positioned to make interactions with these residues directly or possibly through bridging water molecules. While it is likely that an antibiotic would bind non-specifically to this site, EMERALD ranks our presented orientation the highest across all three trajectories, and there is a large predicted energy gap (about 10 kcal/mol) between the converged conformation and the best-scoring conformation with the phenyl group outside this hydrophobic pocket, suggesting that this pose is strongly favored by EMERALD.

Another instance of improved density fit and receptor interactions is the lipid phosphatidylinositol 4,5-bisphosphate (PIP2) bound to transient receptor potential melastatin member 8 (TRPM8)32. The EMERALD docked model correlates with the map 10% better than the deposited model, placing all the phosphate groups into density and placing the likely disordered glycerol backbone and beginning of the lipid tails in weaker density (Fig. 4c, d). Moreover, the 4,5 phosphate groups of our docked model make more interactions with basic residues that bind PIP2 in other structures of TRPM832. While the start of the acyl chains is oriented away from the transmembrane region of the protein, this is likely because the chains are truncated. Given these observations, and given that the phosphate placements of the deposited model do not appear in the top 20 lowest-energy models for any trajectory, EMERALD predicts a more accurate model of PIP2 binding.

Additional cases with confident alternative models are shown in Fig. 4e–h. For the ATP analog in a structure of the ATP11C flippase33, the gamma phosphate sticks out of density in the deposited model (Fig. 4e) but is modeled into the density and interacts with a nearby lysine residue in the docked model (Fig. 4f). Finally, our EMERALD model of a small molecule GO52 bound to the CD4-HIV-1 Env SOSIP complex34 confidently fits the amide and piperidine groups into the map better than the deposited model, while keeping the same hydrophobic interactions as the deposited model (Fig. 4g, h).

We next identified cases where: (a) the EMERALD model and deposited structure were different, and (b) half maps were available in the EMDB. For these cases, models were refined into one half map and validated against the other using real-space density correlation. When comparing the deposited and EMERALD models (Supplementary Fig. 3a), we found two instances where EMERALD’s model fits the validation map worse (Supplementary Fig. 3b–e), seven cases where it fits the validation map better (one of which is shown in Fig. 4d), and saw equivalent quality for the remaining 53 cases.

Low-confidence unmatched cases show pseudo-symmetry or weak density

While our analysis confidently discovers alternate ligand models, 58% of docked molecules with similar quality to the deposited model have medium or low confidence. We found that these low-confidence cases are typically small molecules with pseudo-symmetry or flexible moieties, whose density is often noisy and inconclusive. In some instances, two or more replicates of EMERALD agree on a substructure of the molecule (dark blue, Fig. 5a, b)35, but differ in a rotamer of a functional group or a flexible group (light blue, Fig. 5a, b). For other ligands, ambiguous density leads to little agreement among the reference model and low-energy Rosetta models (Fig. 5c, d). The authors of the structure with an allosteric modulator of a dopamine receptor note the lack of confidence in the deposited structure36, but have mutagenesis studies to confirm the conformation modeled (Fig. 5c)37. However, one model found with EMERALD aligns with their opposing model and fills a region of density left unexplained by the deposited model (Fig. 5d). Altogether, these entries show the difficulty of interpreting cryoEM data at medium to low resolution, where a single map can support several density explanations, and the limits of automated ligand docking using our protocol.

Cases with worse ligand models show poor initial sampling

To learn what improvements could be made to EMERALD in the future, we looked at instances where EMERALD predicts a ligand with worse metrics than the reference model. We found that these cases often had density that is discontinuous or noisy, leading to incorrect skeletonization. For a ubiquinone binding electron transport protein38, the density skeleton only finds density near the head group (Supplementary Fig. 4c). Without a complete skeleton, the initial population struggles to find the deposited conformation, placing the head group exposed to solvent (Supplementary Fig. 4b). In this case, if the 2.63 Å data is instead truncated at 4.0 Å resolution, the density becomes more continuous, and the skeleton generated by EMERALD matches the ligand conformation much more closely (Supplementary Fig. 4d). With a complete skeleton, the docked model is no longer worse than the deposited model. The head group of the lowest-energy model makes the same hydrogen bond interactions as the deposited model, and the docked model improves density correlation by 0.03 (Supplementary Fig. 4e). This underscores the importance of the initial sampling step, especially when evaluating ligands with a large number of rotatable bonds, and identifies areas for future upgrades in EMERALD.

Blind modeling of linoleic acid

To demonstrate our protocol’s utility in structure determination, we used EMERALD to create a model for linoleic acid bound in a previously undetermined protein structure. Determining this model manually would be an arduous task considering the high flexibility of the ligand (Fig. 6a). Despite the difficulty of modeling the suspected ligand, EMERALD predicts a small molecule conformation that fits the density, makes an anchoring electrostatic interaction with a neighboring arginine residue, and introduces little torsional strain throughout the hydrophobic tail (Fig. 6b). This placement is supported by the structure of linoleic acid bound to a related protein39. Creating the model required no user input once ligand restraint files were made, and the ease and accuracy of modeling linoleic acid demonstrate the value of EMERALD for structure determination.

Discussion

Here, we present EMERALD, a method capable of accurately and automatically producing deposition-ready small molecule models in cryoEM maps without human bias during modeling. Benchmarked on over one thousand ligand-bound entries in the EMDB, EMERALD identifies a confident solution in 62% of entries, in some cases identifying alternate models supported by crystal structures and map validation. Moreover, we show this fully automated protocol determining the conformation of linoleic acid in a previously unsolved structure.

The method should be generally applicable to most ligands with fewer than 25 rotatable bonds; larger ligands have too large a search space for this algorithm to sample effectively. Discontinuous or noisy density also proved challenging, though modified map processing to improve density connectivity was shown to rescue at least one of these cases. Our current approach only models a single ligand at a time, which complicates density assignment for structures with ligands close together, such as electron transport proteins. Finally, Rosetta’s poor handling of metal ions precludes modeling ions as cofactors or as ligands themselves, leaving a significant group of proteins unanalyzed40.

Currently, our method requires the modeler to know the identification and approximate binding location of the ligand, a non-trivial task when studying novel protein-ligand complexes. For more utility during model building, expanding our method to recognize potential unmodeled ligand blobs and quickly assess possible ligands to determine identity would be beneficial. As is, however, EMERALD offers an automatic tool for ligand modeling that will prove helpful for the now common scenario of ligand-bound structure determination through cryoEM, and EMERALD will serve as a valuable addition to the toolkit of Rosetta EM modeling methods4,41,42 for model building under one software package.

Methods

Creating the protein-ligand dataset

All single-particle EMDB entries with an associated ligand-bound structure at 6 Å nominal resolution or better as of September 03, 2021 were obtained. Because modeling ions and glycans requires specialized treatment, structures with only these types of ligands were excluded from the dataset. In addition, the set had several cases with small molecules in close proximity. To simplify the docking problem, entries with two or more ligands within the binding pocket as defined in our docking protocol were also eliminated from the set. To retain only entries with complete macromolecule-ligand complex models that fit the EM density well, structures with a density correlation below 0.4 or that left large regions of density unmodeled were dropped. Considering the first instance of a unique ligand for each EMDB entry, there were a total of 1704 cases to process for docking.

File preparation for docking

For accurate ligand docking, small molecules need proper protonation states and partial charges. However, the protonation state assigned can depend on the protonation assignment method. To determine the most likely protonation state for a small molecule, we calculated protonation states with three assignment tools—phenix.elbow43, openbabel44, and dimorphite45—and selected the protonation state assigned with two or more methods. If there was no agreement or failures during the assignment, the phenix.elbow assignment was used for modeling. SDF files of the first instance of each unique ligand-entry pair were downloaded from the PDB and used for input. For processing with phenix.elbow, all possible hydrogen atoms were added to the SDF file using openbabel, and then hydrogen atoms were removed to the final protonation assignment using phenix.elbow. To generate the protonation state with openbabel, the hydrogen atoms were simply added to the downloaded SDF file at a pH of 7.4. Instead of adding protons to a structure, dimorphite (which utilizes RDKit46) protonated small molecules as SMILES strings, which were then converted to a structure via openbabel. All three protonation assignment methods agreed for 794 instances with two methods agreeing for 157 cases (Supplementary Fig. 5).
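The two-of-three vote described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name is hypothetical, each tool's output is assumed to have been reduced to a comparable canonical string beforehand, and a failed tool is assumed to be represented by `None`.

```python
from collections import Counter

def consensus_protonation(elbow, openbabel, dimorphite):
    """Pick the protonation state assigned by at least two of three tools.

    Each argument is a canonical string for the assigned state (an
    assumption of this sketch), or None if that tool failed.
    Falls back to the phenix.elbow assignment when no two tools agree.
    """
    votes = Counter(s for s in (elbow, openbabel, dimorphite) if s is not None)
    if votes:
        state, count = votes.most_common(1)[0]
        if count >= 2:
            return state
    return elbow  # no agreement: use the phenix.elbow assignment
```

Here a failure in one tool still allows the other two to form a majority; the text does not say whether that is how failures were treated, so this is an interpretation.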

With the protonation state assigned, a mol2 file with AM1-BCC partial charges was generated with antechamber47,48. Finally, a Rosetta specific parameters file was created for each ligand. Receptors were cleaned by eliminating non-macromolecular atoms in the PDB file and replacing modified residues with their unmodified correspondent. The ligand to be docked was added to its position in the deposited structure and randomly translated 0.0–2.0 Å in any direction before docking.
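The random starting displacement can be illustrated with a short sketch. The text only gives the 0.0–2.0 Å range, so the sampling details here are assumptions: the direction is drawn isotropically and the shift length uniformly from that range, and the function name is hypothetical.

```python
import math
import random

def random_translation(max_shift=2.0, rng=random):
    """Return an (x, y, z) shift in Angstroms with length in [0, max_shift].

    Direction: isotropic, from a normalized Gaussian vector.
    Length: uniform in [0, max_shift] (an assumption of this sketch).
    """
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:  # avoid dividing by a near-zero vector
            break
    length = rng.uniform(0.0, max_shift)
    return tuple(length * c / norm for c in v)
```

Passing a seeded `random.Random` instance as `rng` makes the perturbation reproducible between runs.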

Density erosion and alignment

To ensure the quality of ligand conformations in the ligand pool during the genetic algorithm, randomly perturbed ligands were aligned into unmodeled density to generate the initial pool. Voxels in the density map within 10 Å of the center of mass of the ligand but >2.5 Å from an atom in the receptor were searched and eroded in a modified erosion algorithm from previously described methods12,49. Briefly, voxels were labeled as 0 if their associated density map value was below a density value threshold and labeled as 1 if above it. The voxels were searched in order of density value, and voxels with neighboring voxels of 0 value were removed from the skeleton. If removing a voxel breaks skeleton continuity or if all of the voxel’s neighbors had a value of 0, then the voxel was added back into the skeleton. This process was repeated until a skeleton remained of voxels with high density values.
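The removal rule in this erosion step can be made concrete with a small sketch. It covers only the binarization and the per-voxel test (remove a voxel that touches an empty neighbour unless it is isolated or its removal splits the skeleton); the search ordering, the distance filters, and the stopping criterion of the actual implementation are not reproduced. All names are hypothetical, voxels are integer grid triples, and only face-sharing neighbours are considered.

```python
from collections import deque

# Offsets to the six face-sharing neighbours of a voxel.
FACE_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def face_neighbours(voxel):
    x, y, z = voxel
    return [(x + dx, y + dy, z + dz) for dx, dy, dz in FACE_OFFSETS]

def binarize(density, threshold):
    """density maps voxel -> map value; keep voxels above the threshold (label 1)."""
    return {voxel for voxel, value in density.items() if value > threshold}

def count_components(voxels):
    """Number of face-connected groups in a set of voxels (breadth-first search)."""
    remaining = set(voxels)
    components = 0
    while remaining:
        components += 1
        queue = deque([remaining.pop()])
        while queue:
            for n in face_neighbours(queue.popleft()):
                if n in remaining:
                    remaining.remove(n)
                    queue.append(n)
    return components

def can_remove(voxel, skeleton):
    """Apply the removal rule to one voxel of the current skeleton."""
    occupied = [n for n in face_neighbours(voxel) if n in skeleton]
    if len(occupied) == len(FACE_OFFSETS):
        return False  # interior voxel: no empty neighbour, so not a candidate
    if not occupied:
        return False  # isolated voxel: all neighbours empty, kept in the skeleton
    # Removal is allowed only if it does not split the skeleton.
    return count_components(skeleton - {voxel}) == count_components(skeleton)
```

For a straight run of three voxels, the end voxels pass the test while the middle one does not, because removing it would disconnect the two ends.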

Blobs of density are often discontinuous and difficult to separate from noise at lower resolution. To account for the low resolution, skeletonizing density was performed in two successive steps with increasing strictness on erosion. On the first pass, peaks in the density were detected and eroded only considering voxels sharing a face with each other. This keeps connections between density blobs that may be disjointed. The remaining voxels were clustered into potential skeleton networks by separating groups of voxels that are 3 Å away from another group. Only the largest network of voxels was chosen for further erosion to eliminate noisy voxels. The largest group of voxels underwent a second, stricter erosion that considered all voxels that share a face or edge with each other, leading to a pseudo-atomic skeleton.
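The grouping of surviving voxels into candidate networks, and the choice of the largest one, might look like the sketch below. It assumes single-linkage grouping (a voxel joins a network if it lies closer than the 3 Å cutoff to any member) on Cartesian coordinates in Angstroms; the function names and the strict inequality are assumptions.

```python
import math
from collections import deque

def cluster_points(points, cutoff=3.0):
    """Group points so that separate groups are at least `cutoff` apart.

    Returns a list of clusters, each a sorted list of indices into `points`.
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster = [seed]
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited if math.dist(points[i], points[j]) < cutoff]
            for j in near:
                unvisited.remove(j)
                cluster.append(j)
                queue.append(j)
        clusters.append(sorted(cluster))
    return clusters

def largest_cluster(points, cutoff=3.0):
    """Indices of the largest network, the one kept for the second erosion pass."""
    return max(cluster_points(points, cutoff), key=len)
```

With three points spaced 1 Å apart and a separate pair 8 Å away, `largest_cluster` returns the indices of the first three points.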

The skeleton was used during initial ligand conformer generation of the genetic algorithm to ensure a starting pool of ligands that already fit into the density. Small molecules were randomly translated and perturbed in the binding pocket and half of the small molecules in the initial pool were aligned to the skeleton. For alignment, the ligands were centered on the center of mass of the skeleton, and then atom-skeleton point pairs were determined. The shortest distance of an atom-skeleton pair while searching over all pairs was found, and this search was repeated until either all atoms or all skeleton points had a unique pairing. For the coordinates in each atom-skeleton pair, the topped out harmonic function in Eq. (1) was used to restrain ligand atoms:

$$E_{ij} = 36\left(1 - e^{-x_{ij}^{2}/9}\right) \tag{1}$$

where Eij is the energy penalty applied and xij is the distance in Ångströms between atom i and skeleton point j of an atom-skeleton pair. The ligand is aligned into the density over two stages of energy minimization, with 20 and 15 short rounds of minimization respectively, and the atom-skeleton restraints are updated after each round.
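The pairing and restraint steps can be sketched together. The pairing below is a greedy reading of the description (take the globally shortest remaining atom-to-point distance, then repeat), and the penalty is Eq. (1) as written; function names are hypothetical, and the minimization that uses these restraints is not shown.

```python
import math

def pair_atoms_to_skeleton(atoms, skeleton_points):
    """Greedily pair atoms with skeleton points, shortest distance first.

    Stops once every atom or every skeleton point has a unique partner.
    Returns a list of (atom_index, point_index, distance) tuples.
    """
    candidates = sorted(
        (math.dist(a, p), i, j)
        for i, a in enumerate(atoms)
        for j, p in enumerate(skeleton_points)
    )
    used_atoms, used_points, pairs = set(), set(), []
    for dist, i, j in candidates:
        if i in used_atoms or j in used_points:
            continue
        used_atoms.add(i)
        used_points.add(j)
        pairs.append((i, j, dist))
    return pairs

def restraint_energy(distance):
    """Eq. (1): flat-topped penalty, 0 at zero distance, approaching 36 far away."""
    return 36.0 * (1.0 - math.exp(-distance ** 2 / 9.0))

def total_restraint(atoms, skeleton_points):
    """Sum of Eq. (1) over all atom-skeleton pairs."""
    return sum(restraint_energy(d) for _, _, d in pair_atoms_to_skeleton(atoms, skeleton_points))
```

Because the penalty saturates at 36, a badly mispaired atom contributes a bounded amount rather than dominating the score, which is the practical reason for capping a harmonic-like term.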

Docking protocol and analysis

An initial population of 100 ligands was generated by randomly perturbing across a six-fold axis and the torsion angles of the ligand to be docked. Half of the initial ligands were aligned to the density as described above, while the other half of the population was selected as the top 50 models of 5000 random ligand conformations to ensure diversity in the initial population. The population initialization accounts for the largest share of EMERALD’s run time. All side chains within 5 Å plus the radius of the ligand of the initial ligand center of mass were also considered for optimization. The ligand population and nearby side chains were optimized over 10 generations of a genetic algorithm using default parameters in GALigandDock and a scoring function with a high electron density score weight of 100 to evaluate a ligand’s fit into density. The top 20 ligand conformers at the end of the GA were further optimized along with nearby macromolecule atoms using a cartesian minimization in Rosetta. Example scripts for running density-guided ligand docking are provided below.
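The sketch here is not one of the authors' scripts; it only illustrates how a density term weighted by 100 could be combined with a force-field energy when ranking a pool of 100 poses. The `Pose` fields, the sign convention (lower combined score is better), and the plain truncation selection are all assumptions and do not reproduce GALigandDock's actual scoring or selection logic.

```python
from dataclasses import dataclass

DENSITY_WEIGHT = 100.0  # density score weight quoted in the text
POOL_SIZE = 100         # population size quoted in the text

@dataclass(frozen=True)
class Pose:
    name: str
    genff_energy: float         # hypothetical force-field energy, lower is better
    density_correlation: float  # hypothetical map correlation, higher is better

def combined_score(pose, weight=DENSITY_WEIGHT):
    """Lower is better: energy minus the weighted density correlation."""
    return pose.genff_energy - weight * pose.density_correlation

def select_pool(parents, offspring, pool_size=POOL_SIZE):
    """Keep the best-scoring poses from the current generation and its offspring."""
    ranked = sorted(list(parents) + list(offspring), key=combined_score)
    return ranked[:pool_size]
```

With a weight this large, a pose with a modest energy but a clearly better map fit outranks a lower-energy pose that sits outside the density.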

All entries were run in triplicate, and the lowest-energy model from each individual run was further analyzed for docking success. Only cases with 25 or fewer torsion angles were analyzed, as the search space of ligands with more torsions becomes difficult to fully explore during a GA. This, along with cases lost to failures during ligand processing, left 1053 cases to analyze. Because of low confidence in the reference models due to their low resolution, docked models were not compared directly to the deposited reference models. Instead, all reference models were first relaxed into their EM density map in Rosetta using the same cartesian minimization applied after the genetic algorithm. Along with a symmetry-independent RMSD value, docked models were compared to reference models by the number of residues that make hydrogen bonds with the ligand and a density correlation calculated in Rosetta. These metrics were used to categorize docking results as matches (docked pose within 1 Å of relaxed reference model); non-match, similar quality (>1 Å RMSD, density correlation_dock − density correlation_deposited > −0.025 and hydrogen bonds_dock − hydrogen bonds_deposited > −1); or non-match, worse quality (>1 Å RMSD, density correlation_dock − density correlation_deposited < −0.025 or hydrogen bonds_dock − hydrogen bonds_deposited < −1). Further support for docking success was calculated by determining the convergence of the lowest-energy ligand models across the triplicate runs. The distances between atom pairs across models were calculated, and results were further divided into those with two or more trajectories having their lowest-energy models within 1 Å RMSD, within 1 Å for more than 60% of atoms, or within 1 Å for fewer than 60% of atoms.
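The categorization and convergence rules can be summarized in a short sketch. Several choices are assumptions rather than details from the text: every non-match that is not flagged as worse is treated as "similar or better", boundary values count as within the cutoff, the labels "high", "medium" and "low" are illustrative, and the RMSD here uses a fixed atom order with no symmetry handling.

```python
import math
from itertools import combinations

def classify_docking(rmsd, delta_cc, delta_hbonds, cc_tol=0.025, hbond_tol=1):
    """Categorize a docked pose against the relaxed reference model.

    delta_cc = docked density correlation minus reference density correlation.
    delta_hbonds = docked minus reference count of hydrogen-bonded residues.
    """
    if rmsd <= 1.0:
        return "match"
    if delta_cc < -cc_tol or delta_hbonds < -hbond_tol:
        return "non-match, worse quality"
    return "non-match, similar or better quality"

def convergence_level(models, cutoff=1.0, atom_fraction=0.6):
    """Compare lowest-energy models from replicate runs (same atom order).

    'high'   : some pair of runs is within `cutoff` RMSD.
    'medium' : some pair has more than `atom_fraction` of atoms within `cutoff`.
    'low'    : neither condition holds.
    """
    level = "low"
    for a, b in combinations(models, 2):
        dists = [math.dist(p, q) for p, q in zip(a, b)]
        rmsd = math.sqrt(sum(d * d for d in dists) / len(dists))
        if rmsd <= cutoff:
            return "high"
        if sum(d <= cutoff for d in dists) / len(dists) > atom_fraction:
            level = "medium"
    return level
```

A pose that is 2 Å from the reference but loses neither density fit nor hydrogen bonds therefore lands in the ambiguous middle category, where convergence across the three runs is what decides how much to trust it.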

The local resolution of a cryoEM map often varies from its nominal resolution, so to analyze the performance of EMERALD against map resolution, we compared docking results to local resolution rather than nominal resolution. Maps with local resolution calculations were generated with MonoRes via the Xmipp software package50. The deposited maps were filtered with a Gaussian kernel with a sigma of 0.02 times the map dimensions. Binary masks were created from the filtered maps by keeping voxels with a value above 0.05 times the maximum voxel value in the filtered map. With the binary masks, local resolution estimate maps for all instances were created. To calculate the local resolution surrounding the modeled ligand, the local resolution values of all voxels within 5 Å of the ligand were averaged. Voxels with zero local resolution values were not included in the average. Considering that ligand binding sites are often less-resolved areas of a map, the nominal resolution was reported if the calculated local resolution around the ligand was more than 1 Å better than the nominal resolution, since an error likely occurred.
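The averaging and fallback rule for the ligand's local resolution can be written as a small function. It assumes the local resolution values of voxels within 5 Å of the ligand have already been extracted into a list; the function name is hypothetical, and returning the nominal resolution when no non-zero voxel is available is an added assumption.

```python
def ligand_local_resolution(voxel_resolutions, nominal_resolution, max_gain=1.0):
    """Average local resolution (Angstroms) around a ligand, with a sanity check.

    Zero-valued voxels are skipped. If the average is more than `max_gain`
    Angstroms better (numerically lower) than the nominal resolution, the
    estimate is treated as an error and the nominal resolution is returned.
    """
    nonzero = [r for r in voxel_resolutions if r > 0]
    if not nonzero:
        return nominal_resolution  # assumption: no usable voxels, fall back
    mean_resolution = sum(nonzero) / len(nonzero)
    if nominal_resolution - mean_resolution > max_gain:
        return nominal_resolution
    return mean_resolution
```

For example, voxel values of 0, 3.0 and 4.0 Å around a ligand in a 3.2 Å map give 3.5 Å, while values of 1.5 Å in the same map are rejected in favor of the nominal 3.2 Å.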

The following command in Rosetta was used for the low-pass filter of map EMDB-30475:

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.