Abstract
An increasing number of protein structures are being determined by cryogenic electron microscopy (cryo-EM). Although the resolution of determined cryo-EM density maps is improving in general, there are still many cases where amino acids of a protein are assigned with different levels of confidence. Here we developed a method that identifies potential misassignment of residues in the map, including residue shifts along an otherwise correct main-chain trace. The score, named DAQ, computes the likelihood that the local density corresponds to different amino acids, atoms, and secondary structures, estimated via deep learning, and assesses the consistency of the amino acid assignment in the protein structure model with that likelihood. When DAQ was applied to different versions of model structures in the Protein Data Bank that were derived from the same density maps, a clear improvement in the DAQ score was observed in the newer versions of the models. DAQ also found potential misassignment errors in a substantial number of deposited protein structure models built into cryo-EM maps.
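As a rough illustration of the idea described in the abstract, the sketch below scores each residue by comparing the probability that a network assigns to the modeled amino acid at that position with the average probability of the same amino acid over all positions in the model. It is a minimal sketch under stated assumptions, not the DAQ implementation: the log-ratio form, the window averaging, the function names `log_ratio_scores` and `window_average`, and the toy probability table are all hypothetical and chosen for illustration only.

```python
import numpy as np


def log_ratio_scores(probs, assigned, eps=1e-9):
    """Per-residue consistency score (illustrative assumption, not DAQ itself).

    probs:    (n_residues, n_classes) array; probs[i, c] is the predicted
              probability that the density at residue i belongs to class c.
    assigned: (n_residues,) integer class index assigned to each residue
              in the structure model.
    """
    probs = np.asarray(probs, dtype=float)
    assigned = np.asarray(assigned, dtype=int)
    rows = np.arange(len(assigned))
    # Probability of the assigned class at each residue position.
    p_assigned = probs[rows, assigned]
    # Background: mean probability of that same class over all positions.
    background = probs.mean(axis=0)[assigned]
    # Positive when the local density supports the assignment better than average.
    return np.log((p_assigned + eps) / (background + eps))


def window_average(scores, window=3):
    """Average scores over a sliding window centered on each residue.

    The window is truncated at the chain ends (an assumption of this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    half = window // 2
    return np.array(
        [scores[max(0, i - half): i + half + 1].mean() for i in range(len(scores))]
    )


# Toy input: 3 residues, 3 hypothetical amino acid classes.
toy_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
]
toy_assigned = [0, 1, 1]  # the third residue is modeled as class 1

raw = log_ratio_scores(toy_probs, toy_assigned)
smoothed = window_average(raw, window=3)
print(raw)
print(smoothed)
```

In this toy input the third residue gets a negative raw score, because the class assigned to it in the model is less probable at that position than it is on average across the model. Averaging over a window trades per-residue sensitivity for robustness to noise in individual predictions.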
Code availability
The DAQ program is freely available for academic use on GitHub at https://github.com/kiharalab/DAQ. It can also be run in a Google Colab notebook at https://bit.ly/daq-score.
Acknowledgements
This work was partly supported by the National Institutes of Health (R01GM133840, R01GM123055, and 3R01GM133840-02S1 to D.K.; R01CA254402, R01CA221289, and R01HL071818 to J.J.G.T.); the National Science Foundation (CMMI1825941, MCB1925643, DBI2003635, and DBI2146026) to D.K.; and the Walther Foundation for Cancer Research to J.J.G.T.
Author information
Contributions
J.J.G.T. and D.K. conceived the study. G.T. designed and implemented the DAQ score. X.W. coded and trained Emap2sec+ and computed probability values of structure features for cryo-EM maps. S.R.M.V.S. participated in coding Emap2sec+. G.T. and X.W. constructed datasets. G.T. and X.W. performed the computation and G.T., D.K., X.W., and J.J.G.T. analyzed the data. J.J.G.T. examined individual examples of potentially misassigned models. G.T. drafted the manuscript and J.J.G.T. and D.K. edited it. All the authors read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Grigore Pintilie, Alexis Rohou, and Carlos Óscar Sánchez-Sorzano for their contribution to the peer review of this work. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team. Peer reviewer reports are available.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs 1–12 and Supplementary Tables 1–9.
Supplementary Table
Supplementary Tables 1, 2, 4, 6, and 7.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Terashi, G., Wang, X., Maddhuri Venkata Subramaniya, S.R. et al. Residue-wise local quality estimation for protein models from cryo-EM maps. Nat Methods 19, 1116–1125 (2022). https://doi.org/10.1038/s41592-022-01574-4
DOI: https://doi.org/10.1038/s41592-022-01574-4
This article is cited by
- Xanomeline displays concomitant orthosteric and allosteric binding modes at the M4 mAChR. Nature Communications (2023)
- DAQ-Score Database: assessment of map–model compatibility for protein structure models from cryo-EM maps. Nature Methods (2023)