  • Brief Communication

Improving resolution and resolvability of single-particle cryoEM structures using Gaussian mixture models

Abstract

Cryogenic electron microscopy is widely used in structural biology, but its resolution is often limited by the dynamics of the macromolecule. Here we developed a refinement protocol based on Gaussian mixture models that integrates particle orientation and conformation estimation and improves the alignment for flexible domains of protein structures. We demonstrated this protocol on multiple datasets, resulting in improved resolution and resolvability, locally and globally, by visual and quantitative measures.


Fig. 1: Refinement of GPCR (EMPIAR-10786) and SARS-CoV-2 spike (EMPIAR-10492) datasets.
Fig. 2: Refinement of the ABC transporter (EMPIAR-10374).

Data availability

The three datasets used in this paper are publicly available through EMPIAR: EMPIAR-10786 for the GPCR, EMPIAR-10492 for the SARS-CoV-2 spike, and EMPIAR-10374 for the ABC transporter. The structures produced in this paper have been deposited in EMDatabank, and the corresponding refined models in the PDB: EMD-41840 and 8U26 for the GPCR; EMD-41841 and 8U28 for the SARS-CoV-2 spike; and EMD-41845 and 8U2C for the ABC transporter.

Code availability

All computational tools described here are implemented in EMAN2, a free and open-source software package for cryoEM and cryogenic electron tomography image processing. The code is available at github.com/cryoem/eman2, and a tutorial for these tools can be found at eman2.org/e2gmm_refine.

Acknowledgements

This research was supported by NIH grants R21MH125285 and R01GM150905 to M.C., and by R01GM113195 and the Chan Zuckerberg Initiative fund from the Silicon Valley Community Foundation (award no. 2021-234593) to W.C. Computational resources from the SLAC Shared Scientific Data Facility (SDF) were used for this work. We thank G. Pintilie for help with the final model refinement for PDB deposition.

Author information

Contributions

M.C. designed the algorithm. M.C. implemented the software and performed the experiments. M.C., M.F.S. and W.C. contributed to data interpretation and manuscript preparation.

Corresponding author

Correspondence to Muyuan Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Ellen Zhong and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Arunima Singh, in collaboration with the Nature Methods team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Workflow diagram for GMM-based particle orientation and conformation refinement.

Each block represents one processing step, and the arrows indicate the order in which the steps are run. The DNN-based heterogeneity analysis shown on the right side is optional and is recommended only when large-scale movement is present in the system. The result of a DNN-based refinement focused on one region can be treated as one patch and merged into the patch-by-patch refinement results from the main workflow.

Extended Data Fig. 2 Detailed results of the GPCR dataset (EMPIAR-10786).

(a) ‘Gold-standard’ FSC curves of the reconstruction using initial orientation (gray), global GMM-based refinement (yellow) and patch-by-patch refinement (pink). (b) Q-score comparison of the three corresponding maps. (c, d) Overall structure of the three reconstructions, colored by local resolution (c), and comparison of local real-space features (d). The cyan boxes in c highlight the locations of the features shown in the corresponding columns of d.
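As an illustration of how an FSC curve like the ones in panel a is read, the sketch below computes the correlation for one resolution shell and finds where a curve crosses a threshold. It is a minimal sketch, not the EMAN2 implementation: the function names are hypothetical, the input is assumed to be Fourier coefficients already grouped by shell, and the 0.143 cutoff is the threshold conventionally used with ‘gold-standard’ FSC rather than a value stated in the caption.

```python
def shell_correlation(f1, f2):
    """Normalized cross-correlation of two lists of complex Fourier
    coefficients taken from the same resolution shell of two half maps."""
    num = sum((a * b.conjugate()).real for a, b in zip(f1, f2))
    den = (sum(abs(a) ** 2 for a in f1) * sum(abs(b) ** 2 for b in f2)) ** 0.5
    return num / den if den > 0.0 else 0.0


def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (1 / spatial frequency) at which the curve
    first drops below `threshold`, interpolating linearly between shells.
    Returns None if the curve never crosses the threshold."""
    for i in range(1, len(fsc)):
        if fsc[i] < threshold <= fsc[i - 1]:
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / f
    return None


# Toy curve: spatial frequencies in 1/Å and made-up FSC values.
freqs = [0.1, 0.2, 0.3, 0.4]
curve = [1.0, 0.9, 0.243, 0.043]
resolution = resolution_at_threshold(freqs, curve)  # in Å
```

The toy curve values are invented for the example and do not correspond to any dataset in the paper.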

Extended Data Fig. 3 Detailed results of the SARS-CoV-2 dataset (EMPIAR-10492).

(a) ‘Gold-standard’ FSC curves of the reconstruction using initial orientation (gray), global GMM-based refinement (yellow) and patch-by-patch refinement (pink). (b) Q-score comparison of the three corresponding maps. (c, d) Overall structure of the three reconstructions, colored by local resolution (c), and comparison of local real-space features (d). The cyan boxes in c highlight the locations of the features shown in the corresponding columns of d.

Extended Data Fig. 4 Detailed results of the ABC transporter dataset (EMPIAR-10374).

(a) ‘Gold-standard’ FSC curves of the reconstruction using initial orientation (gray), global GMM-based refinement (yellow) and patch-by-patch refinement (pink). (b) Q-score comparison of the three corresponding maps. (c, d) Overall structure of the three reconstructions, colored by local resolution (c), and comparison of local real-space features (d). The cyan boxes in c highlight the locations of the features shown in the corresponding columns of d.

Extended Data Fig. 5 Visualization of GMMs.

(a) Reconstruction of the GPCR using the voxel map representation, determined at 3.3 Å resolution. 6 × 10⁶ floating point values are required to represent the structure. (b) GMM representation of the map in a, shown in isosurface view. 5 × 10⁴ floating point values are used to represent the structure. (c) FSC curve between a and b. The two structures are virtually identical up to 3.3 Å. (d) Visualization of the GMM from b as a 3D scatter plot. Each point is colored by the amplitude of the Gaussian function, and the size of each point corresponds to the width of the Gaussian function. (e) Overlay of the coordinates of the Gaussian functions on the corresponding density map.
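The difference in storage between the two representations can be made concrete with a short sketch. It assumes each Gaussian is isotropic and stored as five numbers (three coordinates, an amplitude and a width), which matches the quantities shown in panel d but is not spelled out in the caption. The box size of 182 voxels and the count of 10,000 Gaussians are hypothetical values chosen only to reproduce the orders of magnitude quoted above.

```python
import math


def gmm_density(point, gaussians):
    """Evaluate a sum of isotropic 3D Gaussians at one point.

    Each Gaussian is a tuple (x, y, z, amplitude, sigma): five floats.
    """
    px, py, pz = point
    total = 0.0
    for x, y, z, amp, sigma in gaussians:
        r2 = (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2
        total += amp * math.exp(-r2 / (2.0 * sigma ** 2))
    return total


def n_floats_voxel(box_size):
    """Number of values in a cubic voxel map with the given edge length."""
    return box_size ** 3


def n_floats_gmm(n_gaussians, floats_per_gaussian=5):
    """Number of values in a GMM with the assumed five-float layout."""
    return n_gaussians * floats_per_gaussian


# Hypothetical sizes, chosen to match the orders of magnitude in the caption.
voxel_floats = n_floats_voxel(182)    # about 6 x 10^6
gmm_floats = n_floats_gmm(10_000)     # 5 x 10^4

# A two-Gaussian toy model evaluated at the origin.
toy = [(0.0, 0.0, 0.0, 1.0, 1.0), (3.0, 0.0, 0.0, 0.5, 1.5)]
density_at_origin = gmm_density((0.0, 0.0, 0.0), toy)
```

Under these assumptions the GMM needs roughly a hundred times fewer parameters than the voxel map.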

Extended Data Fig. 6 Diagram for the patch-by-patch refinement.

(a) Input density maps and GMMs from the global orientation refinement. (b) Scatter plot of Gaussian coordinates, colored by clustering result. (c) Soft masks for each cluster. Each mask is a sphere covering all Gaussian coordinates of the corresponding cluster, with a soft falloff. (d) Focused refinement results using the soft masks, colored by local resolution. Note that the same refinement process is done independently for the even/odd subsets of particles using the corresponding half map/GMM as reference. (e) Masked out density from the individual focused refinement result. (f) Final composite map generated by weighted averaging.
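Steps b, c and f of this diagram can be sketched in a few lines. This is a hedged illustration, not the EMAN2 code: the caption does not name the clustering algorithm, so plain k-means is used as a stand-in, the sphere is centered on the cluster centroid, the falloff is a cosine ramp, and the composite uses the mask values as weights. All function names and these choices are assumptions.

```python
import math


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers, labels


def soft_sphere_mask(point, center, radius, falloff):
    """1 inside the sphere, cosine ramp down to 0 over `falloff` outside it."""
    d = math.dist(point, center)
    if d <= radius:
        return 1.0
    if d >= radius + falloff:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (d - radius) / falloff))


def composite_value(values, weights):
    """Weighted average of the per-patch map values at a single voxel."""
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(v * w for v, w in zip(values, weights)) / total


# Toy Gaussian coordinates forming two well-separated groups.
coords = [(0, 0, 0), (10, 0, 0), (1, 0, 0), (11, 0, 0), (0, 1, 0), (10, 1, 0)]
centers, labels = kmeans(coords, k=2)

# Each sphere must cover every Gaussian coordinate in its cluster.
radii = [max(math.dist(p, centers[c])
             for p, lab in zip(coords, labels) if lab == c)
         for c in range(2)]

# Mask weights of both patches at a voxel lying between the two clusters.
voxel = (5.0, 0.0, 0.0)
weights = [soft_sphere_mask(voxel, centers[c], radii[c], falloff=5.0)
           for c in range(2)]
```

In a real map, `composite_value` would be applied at every voxel using the focused refinement result of each patch as the values.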

Extended Data Fig. 7 Orientation and translation change before and after GMM-based refinement.

(a) Histogram of particle orientation assignment change after the GMM-based global refinement. Mean = 2.34, std = 1.37 degrees. (b) Histogram of particle translation change after the GMM-based global refinement. Mean = 0.66, std = 0.46 pixels. (c) Histogram of particle orientation assignment change after the GMM-based focused refinement. Mean = 3.74, std = 1.76 degrees. (d) Histogram of particle translation change after the GMM-based focused refinement. Mean = 0.89, std = 0.62 pixels.
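One way to compute per-particle changes of this kind is sketched below. The caption does not say how the orientation change was measured, so the sketch assumes it is the angle of the relative rotation between the old and new orientation, and that the translation change is the Euclidean length of the shift difference. The helper names are hypothetical, and the population standard deviation is used as an assumption.

```python
import math


def rot_z(deg):
    """3x3 rotation matrix for a rotation of `deg` degrees about z."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_angle_deg(r1, r2):
    """Angle in degrees of the relative rotation r1^T r2."""
    # trace(r1^T r2) equals the sum of elementwise products of r1 and r2.
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_change(shift_before, shift_after):
    """Euclidean distance in pixels between two 2D particle shifts."""
    return math.dist(shift_before, shift_after)


def mean_std(values):
    """Mean and population standard deviation of a list of numbers."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var)


# Toy example: one particle rotated by 2.5 degrees and shifted by 0.5 pixels.
angle = rotation_angle_deg(rot_z(10.0), rot_z(12.5))
shift = translation_change((1.0, 2.0), (1.3, 2.4))
```

Collecting `angle` and `shift` over all particles and passing each list to `mean_std` would give summary statistics in the same form as those reported in the caption.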

Extended Data Fig. 8 Performance of GMM-based refinement in datasets at lower resolution.

(a–c) Refinement of a small subset of the GPCR dataset. (d–f) Refinement of a small subset of the SARS-CoV-2 dataset. (a, d) ‘Gold-standard’ FSC curves of the reconstruction using voxel-based refinement (gray), global GMM-based refinement (yellow) and patch-by-patch refinement (pink). (b, e) Overall structure of the initial reconstruction and the final patch-by-patch refinement result, colored by local resolution. (c, f) Comparison of local features from the voxel-based, global GMM-based and patch-by-patch refinement.

Extended Data Fig. 9 Impact of flexible domains on the global FSC curve.

(a) Simulated density map of β-galactosidase, with two rigid subunits resolved at 2.5 Å and two flexible subunits resolved at 15 Å. (b) Same simulated map as in a, but with the two flexible subunits resolved at 7 Å. (c) Comparison of the ‘gold-standard’ FSC curves of the two structures. Blue, a; red, b.

Supplementary information

Reporting Summary

Peer Review File

Supplementary Video 1

Structure comparison of the GPCR dataset (EMPIAR-10786). Gray: reconstruction using initial orientation. Pink: reconstruction after GMM-based patch-by-patch refinement.

Supplementary Video 2

Structure comparison of the SARS-CoV-2 dataset (EMPIAR-10492). Gray: reconstruction using initial orientation. Pink: reconstruction after GMM-based patch-by-patch refinement.

Supplementary Video 3

Structure comparison of the ABC transporter dataset (EMPIAR-10374). Gray: reconstruction using initial orientation. Pink: reconstruction after GMM-based patch-by-patch refinement.

Supplementary Video 4

Converting structural heterogeneity to particle orientation focusing on the Fab of the ABC transporter (EMPIAR-10374).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, M., Schmid, M.F. & Chiu, W. Improving resolution and resolvability of single-particle cryoEM structures using Gaussian mixture models. Nat Methods 21, 37–40 (2024). https://doi.org/10.1038/s41592-023-02082-9

  • DOI: https://doi.org/10.1038/s41592-023-02082-9
