Improving cryo-EM structure validation

A community-wide challenge yields recommendations for improving cryo-EM structure validation.

Validation metrics are an essential part of how biomolecular structures are vetted before publication and interpreted after publication, but judging the accuracy of features within structural models is almost always challenging. Lawson and colleagues report on a community-wide model building and validation challenge, highlight progress in developing robust validation of atomic models of biological macromolecules, and offer recommendations on how to improve cryogenic electron microscopy (cryo-EM) structure validation [1].

In cryo-EM experiments, large numbers of noisy, two-dimensional projection images of individual macromolecules are processed computationally to yield three-dimensional (3D) volumetric maps of their electron scattering potential [2]. The technique’s popularity has exploded in recent years because it can be used to study the 3D structure of virtually any protein or macromolecular complex, regardless of biochemists’ ability to coax it into forming rigid crystals for X-ray diffraction experiments. If the protein or complex is large enough (currently ~50 kDa or more) and if it can be purified and placed into a cryo-EM instrument for imaging, 3D structures at resolutions sufficient to answer biological questions and/or guide drug discovery are usually attainable. Recently, several groups have even obtained maps so detailed that each individual atom in the macromolecule is resolved as a distinct, sharp spheroid [3]. When such high (so-called ‘atomic’) resolution is achieved, structural biologists can assign the positions of the thousands of atoms in a macromolecule with very little doubt about the 3D coordinates of each atom.

However, the vast majority of 3D maps obtained by cryo-EM do not resolve individual atoms. At these more common, lower resolutions, the task of assigning precise 3D coordinates to each atom in the macromolecule (‘model building’) is much more arduous: the 3D shape features corresponding to individual atoms are blurred together, so the smallest resolvable features may be entire amino acid residues rather than individual atoms or even chemical groups. When the placement of atoms is ambiguous, structural biologists and the software they use are likely to make errors during model building and refinement, even when armed with prior knowledge such as typical bond distances, angles and torsions, or the amino acid sequence of a protein (Fig. 1).

Fig. 1: Building an atomic model is much more error-prone when the map resolution is limited.

The maps and models shown in this figure were part of the recent cryo-EM community model building and validation challenge [1]. Top row: a region of two 3D maps obtained from the same cryo-EM dataset of apoferritin, but with some of the image data withheld in one case so as to reduce the resolution from 1.8 to 3.1 Å (available from the EMDB under accession numbers EMD-20028 and EMD-20026, respectively). Left column: two atomic models built into the 3.1-Å map and submitted as part of the challenge (blue, entry 35_1; pink, entry 28_1; available from http://model-compare.emdataresource.org) interpret the map with significant differences. For example, the peptide bond between Val46 and Ala47 is rotated ~180 degrees, and different rotamers of the Asp45 and Val46 side chains are modeled. At a global resolution of 3.1 Å, a resolution typical of recently published cryo-EM maps, it is not possible to be confident about which model is correct (note that the local resolution in this loop region may be worse than 3.1 Å). Validation metrics can flag unexpected geometry in the atomic model or disagreement between model and map, and even help choose among otherwise equally plausible models. At 1.8 Å, even though individual atoms are not resolved, the correct model (in pink) is easily identified.
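To make the idea of a geometry-based validation check concrete, the following is a minimal Python sketch that computes a torsion angle from four atomic positions and flags values far from planarity. It is only an illustration: the function names, the example coordinates and the 30-degree tolerance are assumptions made here, and this is not the procedure used by CaBLAM or by the validation pipelines in the challenge.

```python
import math


def _sub(a, b):
    """Component-wise difference of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_deg(p0, p1, p2, p3):
    """Torsion (dihedral) angle in degrees defined by four points.

    Returns a value in [-180, 180]; 0 means p0 and p3 are eclipsed
    (cis), +/-180 means they are on opposite sides (trans).
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def deviates_from_planarity(angle_deg, tolerance_deg=30.0):
    """True if the angle is more than tolerance_deg away from both
    0 and 180 degrees. The default tolerance is an illustrative
    assumption, not a value taken from the article."""
    deviation = min(abs(angle_deg), 180.0 - abs(angle_deg))
    return deviation > tolerance_deg


# Made-up coordinates (in angstroms) for four consecutive atoms.
trans_like = torsion_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
twisted = torsion_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
print(deviates_from_planarity(trans_like), deviates_from_planarity(twisted))
```

A check like this only detects unusual geometry; it cannot by itself say which of two plausible conformations agrees better with the map, which is why model-to-map metrics are used alongside geometric ones.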

How then do cryo-EM practitioners ensure their models are sound given imperfect experimental results? Thankfully, other structural biologists have been there before. The X-ray crystallography community, for example, came up with metrics and algorithms to ascertain the quality and ‘believability’ of 3D models of biological macromolecules, long before cryo-EM maps warranted the building of models at all. Some of these metrics were quickly adopted by the cryo-EM community, but the two techniques are sufficiently different that new tools and metrics were needed. Academic groups have been working on filling this gap.

The search for a complete set of robust validation metrics for cryo-EM maps and models is far from over, but a recent model validation challenge, reported in this issue [1], marks substantial progress. In the challenge, four high-quality 3D maps were distributed and set as targets for model building. Anyone wishing to participate could download the target maps, build atomic models into them, and submit models for consideration. Anonymized models were then processed using validation pipelines that included well-established as well as more experimental validation metrics, with the dual goals of judging the quality of the submitted models and of characterizing the validation metrics themselves.

On the first count, most submitted models were of higher quality than reference structures of the chosen targets, and the results were highly reproducible across submissions. This is encouraging but perhaps not surprising: as model building and refinement tools improve, the overall quality of models should also improve. What’s more, many of the participants in this challenge were experts in model building and refinement methods. Expertise is still an important factor here, with most participants reporting that manual modification of models was required to obtain optimized structures. Even among this group of participants, some typical errors were recurrent: in a nice demonstration that the field is still evolving (and fast!), one of the validation tools, called CaBLAM [4], which has been gaining in popularity but is not yet routinely used by all practitioners, found geometrical errors or imperfections in at least two-thirds of submitted models. In fact, only two of the thirteen teams taking part in the challenge managed to completely avoid this type of error. In other words: some aspects of model building are still challenging, but tools are becoming available that will help the community root out more and more errors, provided these tools become widely adopted.

And this is where this challenge and report can really make a difference. The authors’ recommendations with regard to validation practices should not only help individual practitioners but also guide future improvements to public data repositories such as the Protein Data Bank, whose validation reports are widely used during peer review and are available for inspection online at any time after a structure is published. By characterizing the behavior and performance of a number of validation methods, the group arrives at specific recommendations as to which metrics appear robust enough, and orthogonal enough to existing metrics, to warrant consideration for inclusion in validation reports. This should hasten the community-wide acceptance of these metrics and, eventually, improve the accuracy of all cryo-EM structures.

References

  1. Lawson, C. L. et al. Nat. Methods https://doi.org/10.1038/s41592-020-01051-w (2020).

  2. Cheng, Y., Grigorieff, N., Penczek, P. A. & Walz, T. Cell 161, 438–449 (2015).

  3. Herzik, M. A. Jr. Nature 587, 39–40 (2020).

  4. Prisant, M. G., Williams, C. J., Chen, V. B., Richardson, J. S. & Richardson, D. C. Protein Sci. 29, 315–329 (2020).

Author information

Corresponding author

Correspondence to Alexis Rohou.

Ethics declarations

Competing interests

A.R. is an employee of Genentech, a subsidiary of Roche, and holds Roche stocks.

Cite this article

Rohou, A. Improving cryo-EM structure validation. Nat Methods 18, 130–131 (2021). https://doi.org/10.1038/s41592-021-01062-1
