Abstract
DNA and RNA play fundamental roles in various cellular processes, where their three-dimensional structures provide information critical to understanding the molecular mechanisms of their functions. Although an increasing number of nucleic acid structures and their complexes with proteins are determined by cryogenic electron microscopy (cryo-EM), structure modeling for DNA and RNA remains challenging particularly when the map is determined at a resolution coarser than atomic level. Moreover, computational methods for nucleic acid structure modeling are relatively scarce. Here, we present CryoREAD, a fully automated de novo DNA/RNA atomic structure modeling method using deep learning. CryoREAD identifies phosphate, sugar and base positions in a cryo-EM map using deep learning, which are traced and modeled into a three-dimensional structure. When tested on cryo-EM maps determined at 2.0 to 5.0 Å resolution, CryoREAD built substantially more accurate models than existing methods. We also applied the method to cryo-EM maps of biomolecular complexes in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
Data availability
The entries of the maps and corresponding structure models used in this study are provided in Supplementary Tables 1 and 4. The experimental EM maps can be downloaded from the EMDB (https://www.emdataresource.org/). The corresponding experimentally determined structures can be downloaded from the RCSB (https://www.rcsb.org/). The structures modeled by CryoREAD are available at https://doi.org/10.5281/zenodo.8274164. Source data are provided with this paper.
Code availability
The source code of CryoREAD is available at https://github.com/kiharalab/CryoREAD (ref. 41). The webserver is available at https://em.kiharalab.org/algorithm/CryoREAD, where users can upload a map and obtain the modeled structures without installing any software. Users can also access a Google Colab notebook at https://bit.ly/CryoREAD. A detailed tutorial for CryoREAD is available at https://kiharalab.org/emsuites/cryoread.php.
References
Warner, K. D., Hajdin, C. E. & Weeks, K. M. Principles for targeting RNA with drug-like small molecules. Nat. Rev. Drug Discov. 17, 547–558 (2018).
Huang, P.-S., Boyken, S. E. & Baker, D. The coming of age of de novo protein design. Nature 537, 320–327 (2016).
Churkin, A. et al. Design of RNAs: comparing programs for inverse RNA folding. Brief. Bioinform. 19, 350–358 (2018).
Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of COOT. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 (2010).
Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 75, 861–877 (2019).
Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta Crystallogr. D Biol. Crystallogr. 67, 235–242 (2011).
Alnabati, E. & Kihara, D. Advances in structure modeling methods for cryo-electron microscopy maps. Molecules 25, 82 (2020).
Pfab, J., Phan, N. M. & Si, D. DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes. Proc. Natl Acad. Sci. USA 118, e2017525118 (2021).
Terashi, G. & Kihara, D. De novo main-chain modeling for EM maps using MAINMAST. Nat. Commun. 9, 1618 (2018).
Maddhuri Venkata Subramaniya, S. R., Terashi, G. & Kihara, D. Protein secondary structure detection in intermediate-resolution cryo-EM maps using deep learning. Nat. Methods 16, 911–917 (2019).
Song, Y. et al. High-resolution comparative modeling with RosettaCM. Structure 21, 1735–1742 (2013).
Emsley, P. & Cowtan, K. COOT: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004).
Pettersen, E. F. et al. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
Schlick, T. & Pyle, A. M. Opportunities and challenges in RNA structural modeling and design. Biophys. J. 113, 225–234 (2017).
Keating, K. S. & Pyle, A. M. RCrane: semi-automated RNA model building. Acta Crystallogr. D Biol. Crystallogr. 68, 985–995 (2012).
Kappel, K. et al. Accelerated cryo-EM-guided determination of three-dimensional RNA-only structures. Nat. Methods 17, 699–707 (2020).
Huang, H. et al. Unet 3+: a full-scale connected unet for medical image segmentation. in 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 1055–1059 (IEEE, 2020).
Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 234–241 (Springer, 2015).
Carreira-Perpinan, M. A. Acceleration strategies for Gaussian mean-shift image segmentation. in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR) 1160–1167 (IEEE, 2006).
Psaraftis, H. N. Dynamic vehicle routing problems. Veh. Routing Methods Stud. 16, 223–248 (1988).
Rossi, F., Van Beek, P. & Walsh, T. Handbook of Constraint Programming (Elsevier, 2006).
Afonine, P. V. et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr. D Struct. Biol. 74, 531–544 (2018).
Wang, X. et al. Detecting protein and DNA/RNA structures in cryo-EM maps of intermediate resolution using deep learning. Nat. Commun. 12, 2302 (2021).
Kim, M.-S. et al. Cracking the DNA code for V(D)J recombination. Mol. Cell 70, 358–370 (2018).
Grimm, C. et al. Structural basis of poxvirus transcription: vaccinia RNA polymerase complexes. Cell 179, 1537–1550 (2019).
Li, S. et al. Structural basis of amino acid surveillance by higher-order tRNA–mRNA interactions. Nat. Struct. Mol. Biol. 26, 1094–1105 (2019).
Nikolay, R. et al. Snapshots of native pre-50S ribosomes reveal a biogenesis factor network and evolutionary specialization. Mol. Cell 81, 1200–1215 (2021).
Shi, M. et al. SARS-CoV-2 Nsp1 suppresses host but not viral translation through a bipartite mechanism. Preprint at BioRxiv https://doi.org/10.1101/2020.09.18.302901 (2020).
Schubert, K. et al. SARS-CoV-2 Nsp1 binds the ribosomal mRNA channel to inhibit translation. Nat. Struct. Mol. Biol. 27, 959–966 (2020).
Thoms, M. et al. Structural basis for translational shutdown and immune evasion by the Nsp1 protein of SARS-CoV-2. Science 369, 1249–1255 (2020).
Naydenova, K. et al. Structure of the SARS-CoV-2 RNA-dependent RNA polymerase in the presence of favipiravir-RTP. Proc. Natl Acad. Sci. USA 118, e2021946118 (2021).
Wang, Q. et al. Structural basis for RNA replication by the SARS-CoV-2 polymerase. Cell 182, 417–428 (2020).
Chen, J. et al. Structural basis for helicase-polymerase coupling in the SARS-CoV-2 replication-transcription complex. Cell 182, 1560–1573 (2020).
Terwilliger, T. C., Adams, P. D., Afonine, P. V. & Sobolev, O. V. A fully automatic method yielding initial models from high-resolution cryo-electron microscopy maps. Nat. Methods 15, 905–908 (2018).
Sudre, C. H., Li, W., Vercauteren, T., Ourselin, S. & Jorge Cardoso, M. Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. in Deep learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support 240–248 (Springer, 2017).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. in Proceedings of International Conference on Learning Representations (2015).
Fukunaga, K. & Hostetler, L. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans. Inform. Theory 21, 32–40 (1975).
Toth, P. & Vigo, D. The Vehicle Routing Problem (SIAM, 2002).
Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
Wang, X., Terashi, G. & Kihara, D. CryoREAD: de novo structure modeling for nucleic acids in cryo-EM maps using deep learning. Zenodo. https://doi.org/10.5281/zenodo.8274181
Acknowledgements
The authors thank J. C. Verburgt, H. Kannan, A. Jain and C. Christoffer for their help in literature search, discussion and proofreading. The authors also thank J. A. Nash, S. Ellis and J. Chen for their suggestions on optimizing the released software. This work was partly supported by the National Institutes of Health (R01GM133840, 3R01 GM133840-02S1) and the National Science Foundation (DMS2151678, DBI2003635, CMMI1825941, MCB2146026 and MCB1925643). X.W. is a recipient of the MolSSI graduate fellowship.
Author information
Contributions
D.K. conceived the study. X.W. designed and implemented CryoREAD and computed results. G.T. designed the core strategy of molecular structure building pipeline and participated in implementing the algorithm. All the authors analyzed the results. X.W. drafted the manuscript and D.K. edited it. All the authors read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Allison Doerr, in collaboration with the Nature Methods team. Peer reviewer reports are available.
Extended data
Extended Data Fig. 1 The detailed network architecture of CryoREAD.
a, The network architecture. The entire network consists of two stages of U-Net networks; the stage 1 network is shown here. It concatenates two U-Net architectures, each a 3D U-shaped convolutional network (UNet) with full-scale skip connections and deep supervision. The channel size of each layer is also shown in the figure. b, The Encoder Block (Enc1 in panel a). c, The Merge Encoder Block (MEnc). d, The Decoder Block (Dec). Conv3D, a three-dimensional (3D) convolutional layer with a filter size of 3*3*3, stride 1 and padding 1. BatchNorm, a normalization layer that uses the statistics of a batch to normalize the input data. ReLU, rectified linear unit, a commonly used activation layer. The stage 1 network is a cascaded U-Net, where the first U-Net (on the left) focuses on the high-level detection of sugar, phosphate, base and protein, while the second U-Net (on the right) focuses on predicting the different base types: A, C, G and T/U. The processed information of the first U-Net encoder is also passed as input to the second U-Net to help its predictions (dashed lines in orange). We applied deep supervision to the loss on the outputs of the different decoders, which was shown to improve the performance. The stage 2 network includes only the first U-Net architecture of the stage 1 network. It takes the predicted probabilities from the stage 1 network in a box of 8*64*64*64 Å³ (8 probabilities: protein, phosphate, sugar, base and four different base types) and outputs the refined probabilities in a box of 8*64*64*64 Å³.
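The Conv3D settings given in the legend (filter size 3, stride 1, padding 1) leave the edge length of a box unchanged. The sketch below checks this with the standard convolution output-size formula. The function name and the use of an edge length of 64 (taken from the 64*64*64 boxes named in the legend) are illustrative assumptions and are not taken from the CryoREAD source code.

```python
def conv_output_size(size: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Edge length along one axis after a single convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


# Apply the legend's Conv3D settings to each axis of a 64*64*64 box.
box = (64, 64, 64)
out = tuple(conv_output_size(n) for n in box)
print(out)  # (64, 64, 64)
```

With these settings a Conv3D layer changes only the number of channels, not the spatial size of the box it processes.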
Extended Data Fig. 2 The running time of CryoREAD on 11 structures of different sizes.
The experiments were carried out on a computer server with 1 NVIDIA TITAN RTX 24GB GPU and 24 CPUs. The five colors correspond to the five steps of the CryoREAD pipeline: 1) Structure Detection by Deep Learning; 2) Representative Node Clustering; 3) Backbone Tracing; 4) Sequence Assignment; 5) Full Atom Model. The actual data points of the 11 maps are shown as dots.
Extended Data Fig. 3 The distribution of the size of nucleic acids in the testing set.
The test set includes 68 cryo-EM maps. The x-axis shows the resolution from 2.0 to 5.0 Å and the y-axis denotes the number of nucleotides in each map, ranging from 57 to 4,286.
Extended Data Fig. 4 Grid level detection accuracy (recall) of 8 structural classes.
Each grid point was assigned the structure class of a structure that lies within 2 Å of it. If multiple different structures were within 2 Å, the closest one was assigned to the grid point. A detection by the deep network for a grid point was considered correct if the probability of the correct structure class was over 0.5. pho, phosphate. Results of the stage 1 and stage 2 networks are shown. The statistics are calculated over n = 68 independent experimental EM maps, with each point's value derived from Supplementary Table 2. For stage 1, the values of minima, maxima, center, bounds of box and whiskers of different categories in order: Sugar(0.333,0.893,0.729,0.601/0.804,0.333/0.893), Phos(0.138,0.849,0.656,0.463/0.754,0.138/0.849), Base(0.380,0.947,0.836,0.772/0.889,0.669/0.947), Protein(0.349,0.929,0.808,0.763/0.879,0.621/0.929), A-Base(0.034,0.890,0.549,0.362/0.746,0.034/0.890), U/T-Base(0.019,0.797,0.460,0.294/0.646,0.019/0.797), C-Base(0.088,0.886,0.539,0.369/0.711,0.088/0.886), G-Base(0.153,0.933,0.637,0.490/0.823,0.153/0.933), Overall(0.438,0.909,0.764,0.685/0.822,0.478/0.909). For stage 2, the values of minima, maxima, center, bounds of box and whiskers of different categories in order: Sugar(0.357,0.920,0.775,0.654/0.846,0.453/0.920), Phos(0.228,0.883,0.693,0.509/0.810,0.228/0.883), Base(0.408,0.952,0.858,0.795/0.905,0.695/0.952), Protein(0.490,0.969,0.903,0.867/0.936,0.772/0.969), A-Base(0.013,0.893,0.501,0.296/0.745,0.013/0.893), U/T-Base(0.024,0.824,0.479,0.291/0.682,0.024/0.824), C-Base(0.104,0.920,0.611,0.449/0.775,0.104/0.920), G-Base(0.237,0.951,0.737,0.609/0.858,0.237/0.951), Overall(0.492,0.952,0.827,0.756/0.878,0.577/0.952). For the moiety-level accuracy, see Fig. 2 in the main text.
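The grid-level recall described in this legend can be illustrated with a small sketch. It labels each grid point with the class of the closest atom within 2 Å and counts a detection as correct when the predicted probability of that class exceeds 0.5. The function names, the data layout (coordinate tuples and per-class probability dictionaries) and the toy values are assumptions made for illustration; they do not reproduce the authors' evaluation code.

```python
import math


def assign_label(grid_point, atoms, cutoff=2.0):
    """Return the class of the closest atom closer than `cutoff` Å, or None."""
    best_label, best_dist = None, cutoff
    for coord, label in atoms:
        d = math.dist(grid_point, coord)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def grid_recall(grid_points, probabilities, atoms, threshold=0.5):
    """Fraction of labeled grid points whose true class has probability > threshold."""
    correct = total = 0
    for point, probs in zip(grid_points, probabilities):
        label = assign_label(point, atoms)
        if label is None:  # no structure within the cutoff: not evaluated
            continue
        total += 1
        if probs.get(label, 0.0) > threshold:
            correct += 1
    return correct / total if total else float("nan")


# Hypothetical toy data: two atoms and three grid points.
atoms = [((0.0, 0.0, 0.0), "phosphate"), ((3.0, 0.0, 0.0), "sugar")]
grid_points = [(0.5, 0.0, 0.0), (2.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
probabilities = [
    {"phosphate": 0.8, "sugar": 0.1},  # true class phosphate, detected
    {"phosphate": 0.6, "sugar": 0.3},  # true class sugar, missed
    {"sugar": 0.9},                    # farther than 2 Å from any atom, skipped
]
print(grid_recall(grid_points, probabilities, atoms))  # 0.5
```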
Extended Data Fig. 5 Nucleotide moiety-based accuracy relative to the resolution.
A nucleotide moiety was considered correctly detected if the majority of the atoms in the moiety were correctly detected. The data are the same as those used for Fig. 2a. a. Base detection accuracy relative to the map resolution. The equation of the regression line is y = −0.032x + 1.019 (Pearson correlation coefficient: −0.256, p-value: 0.035, standard error: 0.015). b. Moiety-based accuracy of detecting 2-ring bases (A/G). If A or G was detected as either A or G, it was considered a correct detection. The equation of the regression line is y = −0.116x + 1.239 (Pearson correlation coefficient: −0.582, p-value: 1.925e-7, standard error: 0.020). c. Accuracy of detecting 1-ring bases (U/T/C). The equation of the regression line is y = −0.191x + 1.412 (Pearson correlation coefficient: −0.658, p-value: 1.109e-9, standard error: 0.027). d. Accuracy of detecting Adenine (A). The equation of the regression line is y = −0.312x + 1.712 (Pearson correlation coefficient: −0.758, p-value: 6.951e-14, standard error: 0.033). e. Accuracy of detecting Uracil/Thymine (U/T). The equation of the regression line is y = −0.277x + 1.561 (Pearson correlation coefficient: −0.754, p-value: 1.098e-13, standard error: 0.030). f. Accuracy of detecting Cytosine (C). The equation of the regression line is y = −0.206x + 1.423 (Pearson correlation coefficient: −0.679, p-value: 1.891e-10, standard error: 0.027). g. Accuracy of detecting Guanine (G). The equation of the regression line is y = −0.094x + 1.158 (Pearson correlation coefficient: −0.446, p-value: 1.367e-4, standard error: 0.023).
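The two rules in this legend, the majority rule for a moiety and the grouping of A and G as 2-ring bases in panel b, can be written as two short predicates. This is a sketch with assumed names and toy inputs. Reading "majority" as strictly more than half of the atoms is also an assumption, since the legend does not say how ties are treated.

```python
TWO_RING_BASES = {"A", "G"}


def moiety_detected(atom_hits):
    """True if strictly more than half of the moiety's atoms were correctly detected."""
    return sum(atom_hits) > len(atom_hits) / 2


def two_ring_correct(true_base, predicted_base):
    """Panel b rule: a true A or G predicted as either A or G counts as correct."""
    return true_base in TWO_RING_BASES and predicted_base in TWO_RING_BASES


print(moiety_detected([True, True, False]))  # True: 2 of 3 atoms detected
print(moiety_detected([True, False]))        # False: a tie is not a majority here
print(two_ring_correct("A", "G"))            # True
print(two_ring_correct("A", "C"))            # False
```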
Extended Data Fig. 6 Correlation between sequence recall and sequence recall (match).
Sequence recall (match) only considers nucleotides in the reference structure that have a corresponding nucleotide in the model (an average atom pair distance of less than 5 Å).
Extended Data Fig. 7 Sequence match relative to the map resolution.
To compute sequence match, we first identified the nucleotide in the model that corresponds to each nucleotide in the reference structure by assigning the nucleotide in the model with the closest average atom distance, then checked whether the bases are identical. Sequence match only considers nucleotides in the reference structure that have a corresponding nucleotide in the model (an average atom pair distance of less than 5 Å). In this figure, we compared the sequence match of the initial assignment with that after the sequence alignment. The initial assignment considers the base type obtained by the base predictions at base nodes of the atomic structures being developed. The initial assignment here is different from the base moiety accuracy reported in Fig. 2a and Extended Data Fig. 5 because Fig. 2a and Extended Data Fig. 5 concern the initial grid-based accuracy of bases by deep learning, while the initial sequence assignment here considers the accuracy of the base assignment in the modeled tertiary structure, where the base positions are determined in consideration of other atoms in the nucleic acids, including phosphate and sugar positions. Seq Match is the base type reassigned by sequence assignment to backbone paths. a. Overall sequence match. For initial assignment, the equation of the regression line is y = −0.125x + 0.984 (Pearson correlation coefficient: −0.782, p-value: 3.380e-15, standard error: 0.012). For seq match, the equation of the regression line is y = −0.110x + 0.997 (Pearson correlation coefficient: −0.684, p-value: 1.230e-10, standard error: 0.014). b. Sequence match of Adenine (A) relative to the map resolution. For initial assignment, the equation of the regression line is y = −0.169x + 1.045 (Pearson correlation coefficient: −0.684, p-value: 1.307e-11, standard error: 0.022). For seq match, the equation of the regression line is y = −0.143x + 1.040 (Pearson correlation coefficient: −0.591, p-value: 1.127e-7, standard error: 0.024). c. Sequence match of Uracil/Thymine (U/T). For initial assignment, the equation of the regression line is y = −0.137x + 1.018 (Pearson correlation coefficient: −0.671, p-value: 3.771e-10, standard error: 0.019). For seq match, the equation of the regression line is y = −0.132x + 1.042 (Pearson correlation coefficient: −0.680, p-value: 1.881e-10, standard error: 0.018). d. Sequence match of Cytosine (C). For initial assignment, the equation of the regression line is y = −0.072x + 0.873 (Pearson correlation coefficient: −0.482, p-value: 3.141e-8, standard error: 0.016). For seq match, the equation of the regression line is y = −0.074x + 0.911 (Pearson correlation coefficient: −0.452, p-value: 1.095e-4, standard error: 0.018). e. Sequence match of Guanine (G). For initial assignment, the equation of the regression line is y = −0.114x + 0.997 (Pearson correlation coefficient: −0.578, p-value: 2.381e-7, standard error: 0.020). For seq match, the equation of the regression line is y = −0.100x + 1.012 (Pearson correlation coefficient: −0.595, p-value: 9.017e-8, standard error: 0.017).
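The sequence match computation described at the start of this legend can be sketched as follows. Each reference nucleotide is paired with the model nucleotide that has the smallest average atom distance, pairs with an average distance of 5 Å or more are ignored, and the score is the fraction of remaining pairs with identical bases. Pairing atoms by shared atom name, the dictionary layout and all names and coordinates below are assumptions for illustration only.

```python
import math


def average_atom_distance(atoms_a, atoms_b):
    """Mean distance over atoms whose names occur in both nucleotides."""
    shared = atoms_a.keys() & atoms_b.keys()
    if not shared:
        return math.inf
    return sum(math.dist(atoms_a[n], atoms_b[n]) for n in shared) / len(shared)


def sequence_match(reference, model, cutoff=5.0):
    """Fraction of matched reference nucleotides whose base equals the model base."""
    matched = identical = 0
    for ref in reference:
        # Closest model nucleotide by average atom distance.
        best = min(
            model,
            key=lambda m: average_atom_distance(ref["atoms"], m["atoms"]),
            default=None,
        )
        if best is None or average_atom_distance(ref["atoms"], best["atoms"]) >= cutoff:
            continue  # no corresponding nucleotide in the model
        matched += 1
        identical += ref["base"] == best["base"]
    return identical / matched if matched else float("nan")


# Hypothetical toy structures with two atoms per nucleotide.
reference = [
    {"base": "A", "atoms": {"P": (0.0, 0.0, 0.0), "C4'": (1.0, 0.0, 0.0)}},
    {"base": "G", "atoms": {"P": (10.0, 0.0, 0.0), "C4'": (11.0, 0.0, 0.0)}},
    {"base": "C", "atoms": {"P": (50.0, 0.0, 0.0), "C4'": (51.0, 0.0, 0.0)}},
]
model = [
    {"base": "A", "atoms": {"P": (0.5, 0.0, 0.0), "C4'": (1.5, 0.0, 0.0)}},
    {"base": "U", "atoms": {"P": (10.0, 1.0, 0.0), "C4'": (11.0, 1.0, 0.0)}},
]
print(sequence_match(reference, model))  # 0.5
```

In the toy data the third reference nucleotide has no model nucleotide within 5 Å, so it is excluded from the denominator, which is the behavior the legend describes for sequence match.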
Extended Data Fig. 8 The number of atom clashes before and after the structure refinement.
An atom clash is defined as a pair of heavy atoms closer than 3.0 Å. The line shown is y = x.
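The clash definition above can be illustrated by counting heavy-atom pairs closer than 3.0 Å. The sketch applies the distance criterion literally to every pair in a list of coordinates. The legend does not state how covalently bonded or neighboring atoms are treated, so any such exclusion would have to be added by the caller. The function name and coordinates are hypothetical.

```python
import math
from itertools import combinations


def count_clashes(heavy_atoms, cutoff=3.0):
    """Count unordered pairs of heavy-atom coordinates closer than `cutoff` Å."""
    return sum(
        1 for a, b in combinations(heavy_atoms, 2) if math.dist(a, b) < cutoff
    )


# Hypothetical coordinates in Å; two of the six pairs are closer than 3.0 Å.
heavy_atoms = [
    (0.0, 0.0, 0.0),
    (2.0, 0.0, 0.0),
    (6.0, 0.0, 0.0),
    (6.0, 2.5, 0.0),
]
print(count_clashes(heavy_atoms))  # 2
```

The pairwise loop scales quadratically with the number of atoms, which is fine for a toy example; large models would typically use a spatial index instead.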
Extended Data Fig. 9 Examples of atomic structures modeled by CryoREAD for experimental maps without full atomic structures.
Detailed evaluation results are shown in Supplementary Table 3. In this figure, from left to right, the three columns show 1) the EM map and its corresponding structure; 2) only the RNA structures in the map, where, in addition to the RNA structure modeled by the authors, we also show homologous structures of the missing RNAs in the map, which we searched by BLAST1; and 3) the atomic structure model by CryoREAD. a. The initial Shwachman-Bodian-Diamond syndrome (SBDS) protein closed state of the nascent 60S ribosomal subunit (EMD-3145, PDB 5AN9, Resolution: 3.3 Å; protein lengths: 1905 aa; RNA length: 1162 nt): Backbone recall: 0.888; Sequence match: 0.696. Identified homologous structure (PDB 5XXB, RNA length: 3352 nt, Sequence Identity: 84.3%, RMSD: 1.2 Å): Backbone recall: 0.832. b. The SBDS open state of the nascent 60S ribosomal subunit (EMD-3146, PDB 5ANB, Resolution: 4.1 Å; protein lengths: 3025 aa; RNA length: 1162 nt): Backbone recall: 0.871; Sequence match: 0.544. Identified homologous RNA structure (PDB 5XXB, RNA length: 3352 nt, Sequence Identity: 84.3%, RMSD: 1.3 Å): Backbone recall: 0.821. c. The ELF1 accommodated state of the nascent 60S ribosomal subunit (EMD-3147, PDB 5ANC, Resolution: 4.2 Å; protein lengths: 2801 aa; RNA length: 1162 nt): Backbone recall: 0.881; Sequence match: 0.535. Identified homologous structure (PDB 5XXB, RNA length: 3352 nt, Sequence Identity: 84.3%, RMSD: 1.3 Å): Backbone recall: 0.804. d. TnaC-stalled ribosome complex with the titin I27 domain folding close to the ribosomal exit tunnel (EMD-0322, PDB 6I0Y, Resolution: 3.2 Å; protein lengths: 3552 aa; RNA length: 3049 nt): Backbone recall: 0.901; Sequence match: 0.632. Identified homologous RNA structure (PDB 7D80, RNA length: 4761 nt, Sequence Identity: 100%, RMSD: 1.2 Å): Backbone recall: 0.834. e. RNC-SRP-SR complex early state (EMD-8000, PDB 5GAD, Resolution: 3.7 Å; protein lengths: 4087 aa; RNA length: 3049 nt): Backbone recall: 0.914; Sequence match: 0.655. Identified homologous structure (PDB 7D80, RNA length: 4761 nt, Sequence Identity: 100%, RMSD: 1.0 Å): Backbone recall: 0.847. f. Structure of the 40S ABCE1 post-splitting complex (EMD-4071, PDB 5LL6, Resolution: 3.9 Å; protein lengths: 3429 aa; RNA length: 1325 nt): Backbone recall: 0.856; Sequence match: 0.586. Identified homologous structure (PDB 7OSM, RNA length: 1740 nt, Sequence Identity: 99%, RMSD: 1.0 Å): Backbone recall: 0.784.
In Extended Data Fig. 9, we show cases where only a part of the structures in an EM map was modeled by the authors. They are maps from three different sets of EM maps of ribosomal subunits. The first set (panels a-c) includes three different states of eIF6 release from the nascent 60S ribosomal subunit2 of Dictyostelium discoideum, where only part of the 26S ribosomal RNA is modeled by the authors. The second set (panels d-e) presents two different forms of the 70S ribosome3 of Escherichia coli, where the authors modeled only the 50S ribosomal subunit and left the 30S ribosomal subunit unmodeled. The third example (panel f) is the 40S ribosomal subunit of Saccharomyces cerevisiae, where only part of the 18S ribosomal RNA was modeled. We filled the missing RNA structure in the maps with homologous RNA structures found by BLAST1 against PDB. Sequence identities of the identified RNAs were 84.3% to 100%. CryoREAD models for the missing RNA structures had backbone recall of 0.784 to 0.847 when the homologous structures were considered as reference. Backbone recall of CryoREAD models for RNAs with the authors' model ranged from 0.856 to 0.914.
Extended Data Fig. 10 Structure model evaluation on the 68 experimental EM maps with Phenix.
a, The number of nucleotides modeled by Phenix map_to_model and CryoREAD. For Phenix, two sets of models were generated: models built from map regions that are predicted to include nucleic acid atoms (Phenix (Mask), blue) and models built from the entire map (Phenix, orange). b, Comparison of backbone atom/sequence recall of Phenix (Mask) and Phenix. c, Backbone atom recalls of Phenix (Mask), Phenix and CryoREAD relative to map resolution. For CryoREAD, the equation of the regression line is y = −0.042x + 1.012 (Pearson correlation coefficient: −0.320, p-value: 0.008, standard error: 0.015). For Phenix (Mask), the equation of the regression line is y = −0.091x + 0.877 (Pearson correlation coefficient: −0.445, p-value: 1.456e-4, standard error: 0.023). For Phenix, the equation of the regression line is y = −0.108x + 0.839 (Pearson correlation coefficient: −0.492, p-value: 2.006e-5, standard error: 0.023). d, Sequence recalls of Phenix (Mask), Phenix and CryoREAD relative to map resolution. For CryoREAD, the equation of the regression line is y = −0.117x + 0.961 (Pearson correlation coefficient: −0.632, p-value: 7.280e-9, standard error: 0.017). For Phenix (Mask), the equation of the regression line is y = −0.064x + 0.444 (Pearson correlation coefficient: −0.550, p-value: 1.190e-6, standard error: 0.012). For Phenix, the equation of the regression line is y = −0.063x + 0.408 (Pearson correlation coefficient: −0.525, p-value: 4.254e-6, standard error: 0.013).
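To make the regression lines in panel c easier to compare, the sketch below evaluates each fitted line at a single resolution. The slopes and intercepts are copied from the legend; the function name and the choice of 4.0 Å are illustrative. The printed values are points on the fitted trend lines, not measured recalls for any particular map.

```python
# (slope, intercept) of the backbone atom recall regression lines in panel c.
BACKBONE_FITS = {
    "CryoREAD": (-0.042, 1.012),
    "Phenix (Mask)": (-0.091, 0.877),
    "Phenix": (-0.108, 0.839),
}


def fitted_recall(method, resolution):
    """Value of the fitted line y = slope * x + intercept at a given resolution (Å)."""
    slope, intercept = BACKBONE_FITS[method]
    return slope * resolution + intercept


for method in BACKBONE_FITS:
    print(f"{method}: {fitted_recall(method, 4.0):.3f}")
# CryoREAD: 0.844
# Phenix (Mask): 0.513
# Phenix: 0.407
```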
Supplementary information
Supplementary Information
Supplementary Figs. 1–4 and legends for Supplementary Tables 1–7.
Source data
Source Data
Source data for Figs. 2–5 and Extended Data Figs. 3–7, 9 and 10.
About this article
Cite this article
Wang, X., Terashi, G. & Kihara, D. CryoREAD: de novo structure modeling for nucleic acids in cryo-EM maps using deep learning. Nat Methods 20, 1739–1747 (2023). https://doi.org/10.1038/s41592-023-02032-5
DOI: https://doi.org/10.1038/s41592-023-02032-5