Deep learning improves macromolecule identification in 3D cellular cryo-electron tomograms

An Author Correction to this article was published on 16 November 2021

This article has been updated

Abstract

Cryogenic electron tomography (cryo-ET) visualizes the 3D spatial distribution of macromolecules at nanometer resolution inside native cells. However, automated identification of macromolecules inside cellular tomograms is challenged by noise and reconstruction artifacts, as well as the presence of many molecular species in the crowded volumes. Here, we present DeepFinder, a computational procedure that uses artificial neural networks to simultaneously localize multiple classes of macromolecules. Once trained, the inference stage of DeepFinder is faster than template matching and performs better than other competitive deep learning methods at identifying macromolecules of various sizes in both synthetic and experimental datasets. On cellular cryo-ET data, DeepFinder localized membrane-bound and cytosolic ribosomes (roughly 3.2 MDa), ribulose 1,5-bisphosphate carboxylase–oxygenase (roughly 560 kDa soluble complex) and photosystem II (roughly 550 kDa membrane complex) with an accuracy comparable to expert-supervised ground truth annotations. DeepFinder is therefore a promising algorithm for the semiautomated analysis of a wide range of molecular targets in cellular tomograms.

Fig. 1: Overview of DeepFinder.
Fig. 2: Target generation strategies for training.
Fig. 3: Analysis of algorithm performance on the synthetic dataset (SHREC 2019 challenge).
Fig. 4: Comparison of score maps obtained with template matching and DeepFinder, and analysis of structural resolution through subtomogram averaging (Dataset 2).
Fig. 5: DeepFinder localizes small macromolecules in cellular tomograms.

Data availability

The synthetic dataset (Dataset 1) is available on the website of the SHREC 2019 challenge (http://www2.projects.science.uu.nl/shrec/cryo-et/2019/). A tomogram from the experimental dataset of C. reinhardtii cells (Dataset 2; refs. 34 and 60) can be found in the EMDB under accession number EMD-3967 (ref. 61). The test tomogram of the Chlamydomonas pyrenoid used for subtomogram averaging (Dataset 3; ref. 37) can be downloaded from the EMDB under accession number EMD-12749, and the raw tilt series data for this tomogram are available at the Electron Microscopy Public Image Archive (EMPIAR) under accession number EMPIAR-10694. All four tomograms used to train and test the detection of PSII in Chlamydomonas thylakoids (Dataset 4; ref. 38) can be downloaded from the EMDB under accession numbers EMD-10780, EMD-10781, EMD-10782 and EMD-10783.

Code availability

The code can be downloaded for free from our GitLab website (https://gitlab.inria.fr/serpico/deep-finder) along with accompanying documentation (https://deepfinder.readthedocs.io/en/latest/). DeepFinder is embedded into the new release of Scipion (ref. 62; https://github.com/scipion-em/scipion-em-deepfinder), an open-source image processing framework for cryo-electron microscopy (http://scipion.i2pc.es/).

Each step of DeepFinder shown in Fig. 1a can be executed either with scripts using the API (examples are provided) or with a graphical user interface (GUI). These steps may also be embedded in other workflows, for example, if the user needs only the segmentation step. To implement DeepFinder, we used Keras (http://keras.io), an open-source toolbox written in Python and using the TensorFlow framework.

All training procedures were performed on an Nvidia Tesla K80 GPU running CUDA 8 and cuDNN 6. In the code, we display the memory consumption of DeepFinder for different training parameters.

Change history

References

  1. Schaffer, M. et al. Optimized cryo-focused ion beam sample preparation aimed at in situ structural studies of membrane proteins. J. Struct. Biol. 197, 73–82 (2017).
  2. Frank, J. Approaches to large-scale structures. Curr. Opin. Struct. Biol. 5, 194–201 (1995).
  3. McEwen, B., Renken, C., Marko, M. & Mannella, C. Principles and practice in electron tomography. Methods Cell Biol. 89, 129–168 (2008).
  4. McIntosh, R., Nicastro, D. & Mastronarde, D. New views of cells in 3D: an introduction to electron tomography. Trends Cell Biol. 15, 43–51 (2005).
  5. Nicastro, D., Frangakis, A., Typke, D. & Baumeister, W. Cryo-electron tomography of Neurospora mitochondria. J. Struct. Biol. 129, 48–56 (2000).
  6. Guesdon, A., Blestel, S., Kervrann, C. & Chrétien, D. Single versus dual-axis cryo-electron tomography of microtubules assembled in vitro: limits and perspectives. J. Struct. Biol. 181, 169–178 (2013).
  7. Best, C., Nickell, S. & Baumeister, W. Localization of protein complexes by pattern recognition. Methods Cell Biol. 2007, 615–638 (2007).
  8. Albert, S. et al. Direct visualization of degradation microcompartments at the ER membrane. Proc. Natl Acad. Sci. USA 117, 1069–1080 (2020).
  9. Förster, F., Pruggnaller, S., Seybert, A. & Frangakis, A. S. Classification of cryo-electron sub-tomograms using constrained correlation. J. Struct. Biol. 161, 276–286 (2008).
  10. Martinez-Sanchez, A. et al. Template-free detection and classification of membrane-bound complexes in cryo-electron tomograms. Nat. Methods 17, 209–216 (2020).
  11. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
  12. LeCun, Y., Kavukcuoglu, K. & Farabet, C. Convolutional networks and applications in vision. In Proc. IEEE Int. Symp. on Circuits and Systems, 253–256 (2010).
  13. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Proc. Neural Inf. Processing Systems (NIPS) (eds Pereira, F., Burges, C. J. C., Bottou, L. & Weinberger, K. Q.) 1–9 (2012).
  14. Long, J., Shelhamer, E. & Darrell, T. Fully convolutional networks for semantic segmentation. In Proc. Conf. Comput. Vis. Pattern Recognition (CVPR), 3431–3440 (2015).
  15. Falk, T. et al. U-net—deep learning for cell counting, detection, and morphometry. Nat. Methods 16, 67–70 (2019).
  16. Belthangady, C. & Royer, L. Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction. Nat. Methods 16, 1215–1225 (2019).
  17. Ouyang, W., Aristov, A., Lelek, M., Hao, X. & Zimmer, C. Deep learning massively accelerates super-resolution localization microscopy. Nat. Biotechnol. 36, 460–468 (2018).
  18. Weigert, M. et al. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nat. Methods 15, 1090–1097 (2018).
  19. Wu, X., Zeng, X., Zhu, Z., Gao, X. & Xu, M. Template-based and template-free approaches in cellular cryo-electron tomography structural pattern mining. Comp. Biol. 11, 1146–1152 (2019).
  20. Wang, F. et al. DeepPicker: a deep learning approach for fully automated particle picking in cryo-EM. J. Struct. Biol. 195, 325–336 (2016).
  21. Al-Azzawi, A., Ouadou, A., Tanner, J. J. & Cheng, J. AutoCryoPicker: an unsupervised learning approach for fully automated single particle picking in cryo-EM images. BMC Bioinform. 20, 326 (2019).
  22. Wagner, T. et al. SPHIRE-crYOLO is a fast and accurate fully automated particle picker for cryo-EM. Commun. Biol. 2, 218 (2019).
  23. Bepler, T. et al. Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs. Nat. Methods 16, 1153–1160 (2019).
  24. Tegunov, D. & Cramer, P. Real-time cryo-electron microscopy data preprocessing with Warp. Nat. Methods 16, 1146–1152 (2019).
  25. Chen, M. et al. Convolutional neural networks for automated annotation of cellular cryo-electron tomograms. Nat. Methods 14, 983–985 (2017).
  26. Che, C. et al. Improved deep learning based macromolecules structure classification from electron cryo tomograms. Mach. Vis. Appl. 29, 1227–1236 (2018).
  27. Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In Proc. Med. Image Comput. Comput. Assist. Interv. (MICCAI) 9351, (eds Navab, N., Hornegger, J., Wells, W. M. & Frangi, A.) 234–241 (2015).
  28. Förster, F. & Hegerl, R. Structure determination in situ by averaging of tomograms. Cell. Electron Microsc. 79, 741–767 (2007).
  29. Gubins, I. et al. SHREC’19 Track: classification in cryo-electron tomograms. In Proc. Eurographics Workshop on 3D Object Retrieval, SHREC–3D Shape Retrieval Contest 2019 https://www2.projects.science.uu.nl/shrec/cryo-et/2019/ (Utrecht Univ., 2019).
  30. Hrabe, T. et al. PyTOM: a python-based toolbox for localization of macromolecules in cryo-electron tomograms and subtomogram analysis. J. Struct. Biol. 178, 177–188 (2012).
  31. Gubins, I. et al. SHREC 2020: classification in cryo-electron tomograms. Comput. Graphics 91, 279–289 (2020).
  32. Moebel, E. & Kervrann, C. A Monte Carlo framework for missing wedge restoration and noise removal in cryo-electron tomography. J. Struct. Biol. X 4, 100013 (2020).
  33. Rolnick, D., Veit, A., Belongie, S. & Shavit, N. Deep learning is robust to massive label noise. Preprint at arXiv https://arxiv.org/abs/1705.10694v2 (2017).
  34. Pfeffer, S. et al. Dissecting the molecular organization of the translocon-associated protein complex. Nat. Commun. 8, 14516 (2017).
  35. Chen, Y., Pfeffer, S., Hrabe, T., Schuller, J. M. & Förster, F. Fast and accurate reference-free alignment of subtomograms. J. Struct. Biol. 182, 235–245 (2013).
  36. Sanchez-Garcia, R., Segura, J., Maluenda, D., Carazo, J. & Sorzano, C. Deep consensus, a deep learning-based approach for particle pruning in cryo-electron microscopy. IUCrJ 5, 854–865 (2018).
  37. Freeman-Rosenzweig, E. et al. The eukaryotic CO2-concentrating organelle is liquid-like and exhibits dynamic reorganization. Cell 171, 148–162 (2017).
  38. Wietrzynski, W. et al. Charting the native architecture of Chlamydomonas thylakoid membranes with single-molecule precision. eLife 9, e53740 (2020).
  39. Förster, F., Han, B. G. & Beck, M. Visual proteomics. Meth. Enzymol. 483, 215–243 (2010).
  40. Vendeville, A., Larivière, D. & Fourmentin, E. An inventory of the bacterial macromolecular components and their spatial organization. FEMS Microbiol. Rev. 35, 395–414 (2011).
  41. Gipson, B. R. et al. Automatic recovery of missing amplitudes and phases in tilt-limited electron crystallography of two-dimensional crystals. Phys. Rev. E 84, 011916 (2011).
  42. Deng, Y. et al. ICON: 3D reconstruction with ‘missing-information’ restoration in biological electron tomography. J. Struct. Biol. 195, 100–112 (2016).
  43. Biyani, N. et al. Image processing techniques for high-resolution structure determination from badly ordered 2D crystals. J. Struct. Biol. 203, 120–134 (2018).
  44. He, S. et al. The structural basis of Rubisco phase separation in the pyrenoid. Nat. Plants 6, 1480–1490 (2020).
  45. Sheng, X. et al. Structural insight into light harvesting for photosystem II in green algae. Nat. Plants 5, 1320–1330 (2019).
  46. Kingma, D. P. & Ba, J. L. ADAM: a method for stochastic optimization. Preprint at arXiv https://arxiv.org/abs/1412.6980v9 (2014).
  47. Salehi, S. S. M., Erdogmus, D. & Gholipour, A. Tversky loss function for image segmentation using 3D fully convolutional deep networks. In Proc. MICCAI workshop on Machine Learning in Medical Imaging (MLMI), (eds Wang, Q., Shi, Y., Suk, H. I. & Suzuki, K.) 379–387 (2017).
  48. Milletari, F., Navab, N. & Ahmadi, S.-A. V-Net: fully convolutional neural networks for volumetric medical image segmentation. In Proc. IEEE Int. Conf. 3D Vision (3DV), 565–571 (2016).
  49. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proc. Int. Conf. Learn. Representation (eds Bengio, Y. & LeCun, Y.) 1–14 (2015).
  50. Comaniciu, D., Meer, P. & Member, S. Mean Shift: a robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24, 603–619 (2002).
  51. Martinez-Sanchez, A., Garcia, I., Asano, S., Lucic, V. & Fernandez, J.-J. Robust membrane detection based on tensor voting for electron tomography. J. Struct. Biol. 186, 49–61 (2014).
  52. Kremer, J. R., Mastronarde, D. N. & McIntosh, J. R. Computer visualization of three-dimensional image data using IMOD. J. Struct. Biol. 116, 71–76 (1996).
  53. Zivanov, J. et al. New tools for automated high-resolution cryo-EM structure determination in RELION-3. eLife 7, e42166 (2018).
  54. Bharat, T. B. & Scheres, S. Resolving macromolecular structures from electron cryo-tomography data using subtomogram averaging in RELION. Nat. Protoc. 11, 2054–2065 (2016).
  55. Harauz, G. & van Heel, M. Exact filters for general geometry three dimensional reconstruction. Optik 78, 146–156 (1996).
  56. Rosenthal, P. B. & Henderson, R. Optimal determination of particle orientation, absolute hand, and contrast loss in single-particle electron cryomicroscopy. J. Mol. Biol. 333, 721–745 (2003).
  57. Chen, S. et al. High-resolution noise substitution to measure overfitting and validate resolution in 3D structure determination by single particle electron cryomicroscopy. Ultramicroscopy 135, 24–35 (2013).
  58. Pettersen, E. F. et al. UCSF Chimera—a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
  59. Goddard, T. et al. UCSF ChimeraX: meeting modern challenges in visualization and analysis. Protein Sci. 27, 14–25 (2018).
  60. Albert, S. et al. Proteasomes tether to two distinct sites at the nuclear pore complex. Proc. Natl Acad. Sci. USA 114, 201716305 (2017).
  61. Henderson, R. Avoiding the pitfalls of single particle cryo-electron microscopy: Einstein from noise. Proc. Natl Acad. Sci. USA 110, 18037–18041 (2013).
  62. de la Rosa-Trevín, J. et al. Scipion: a software framework toward integration, reproducibility and validation in 3D electron microscopy. J. Struct. Biol. 195, 93–99 (2016).

Acknowledgements

This work was jointly supported by the Fourmentin-Guilbert Foundation and Région Bretagne (Brittany Council). Calculations were performed on the Inria Rennes computing grid facilities partly funded by France-BioImaging infrastructure (French National Research Agency—ANR-10-INBS-04-07, ‘Investments for the future’) and at the Max Planck Institute for Biochemistry computing cluster, Martinsried, Germany. L.L., R.D.R., W.W., T.P. and B.D.E. were supported by DFG grant no. EN 1194/1-1 as part of FOR 2092, The Munich School for Data Science (MUDS) and Helmholtz Association. A.M.-S. was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy EXC 2067/1-390729940.

We thank F. Förster and M. Killinger for fruitful discussions about cryo-ET data analysis and deep learning applied to large 3D volumes analysis, respectively.

We thank the organizers of the SHREC 2019 and SHREC 2020 challenges for helpful assistance and for providing the template matching results: I. Gubins and R.C. Veltkamp (Utrecht University, Department of Information and Computing Sciences), G. van der Schot and F. Förster (Utrecht University, Department of Chemistry).

Finally, we thank S. Prima for careful reading of the paper and valuable suggestions and comments.

Author information

Contributions

E.M. designed and implemented the presented DeepFinder method and carried out the biocomputing experiments. C.K. supervised the project and was in charge of overall direction and planning. E.F., D.L. and C.K. devised the project and the main conceptual ideas, with assistance from A.M.-S. B.D.E. and W.B. facilitated access to datasets. B.D.E., S.A., W.W. and S.P. provided the C. reinhardtii datasets and annotations (Datasets 2, 3 and 4). A.M.-S., J.O. and B.D.E. conceived experiments on real datasets. L.L., R.D.R., W.W. and T.P. performed experiments on datasets depicting thylakoid membranes and pyrenoid matrices within vitreously frozen C. reinhardtii cells. E.M., B.D.E. and C.K. cowrote the paper. All authors provided critical feedback and helped shape the research, analysis and paper.

Corresponding authors

Correspondence to Benjamin D. Engel or Charles Kervrann.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Rita Strack was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Two workflows for macromolecule localization in cryo-ET.

a, Conventional processing pipeline based on template matching. b, DeepFinder (analysis stage): a multi-class approach able to localize particles of several different macromolecular species in one pass. Panels a and b highlight why DeepFinder is more agile than template matching when several macromolecule classes need to be localized. c, CNN architecture used in DeepFinder and based on U-Net (ref. 27). The architecture adopts the encoder-decoder paradigm, which produces an output volume with the same size as the input volume. Each green box represents a convolutional layer. The number of filters n and the filter size s are labeled as n × (s × s × s). All convolutional layers are followed by a ReLU activation function, except the last layer, which uses a soft-max function. The up-sampling is achieved with up-convolutions (also called ‘backward-convolution’). Feature maps from different scales are combined by concatenation along the channel dimension. The total number of architecture parameters is approximately 903k. More precisely, this number depends slightly on Ncl, the number of classes: 902,928 + Ncl × 33.
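The parameter-count formula in this caption can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is hypothetical and not part of DeepFinder. One plausible reading of the constant 33, which the caption does not state, is that each output class adds one 1 × 1 × 1 filter over 32 feature channels plus a bias term.

```python
def deepfinder_param_count(n_classes: int) -> int:
    """Evaluate the caption formula 902,928 + Ncl x 33 (hypothetical helper)."""
    if n_classes < 1:
        raise ValueError("n_classes must be a positive integer")
    base_params = 902_928      # class-independent part quoted in the caption
    params_per_class = 33      # per-class increment quoted in the caption
    return base_params + params_per_class * n_classes


if __name__ == "__main__":
    # Three classes, as in Dataset 2, and 13 as an arbitrary larger example.
    for n_classes in (3, 13):
        print(n_classes, deepfinder_param_count(n_classes))
```

For any realistic number of classes the total stays close to the 903k quoted above, because the class-dependent term is tiny compared with the rest of the network.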

Extended Data Fig. 2 DeepFinder graphical user interface.

a, Training interface composed of a first window for parametrizing the procedure and a second window for displaying the training metrics in real-time. b, Segmentation interface which also opens a data visualization tool. This tool allows the user to explore the tomogram with superimposed segmentations. In addition, DeepFinder also incorporates interfaces for tomogram annotation, target generation and clustering (see the documentation at https://gitlab.inria.fr/serpico/deep-finder for more information).

Extended Data Fig. 3 Analysis of algorithm performance on the synthetic dataset (SHREC’20 challenge).

a, Performance (F1-score) of DeepFinder, UMC and template matching algorithms and ability of algorithms to discriminate between 12 classes/subclasses of macromolecules. The highest (best) possible value of an F1-score is 1.0 and the lowest (worst) possible value is 0. The scores of template matching were provided by the SHREC’20 challenge organizers (Utrecht University, Department of Information and Computing Sciences and Department of Chemistry). b, Performance of DeepFinder implemented as a multi-class network architecture and as an architecture made of 12 binary networks. These two architectures differ only by the number of output neurons. c, Influence of the training target generation method (‘shapes’ versus ‘spheres’). In the case of ‘shapes’, the exact shapes of the macromolecules have been used to annotate the tomograms. In the case of ‘spheres’, the shape and the orientation of macromolecules are not needed to generate the training targets. This analysis used eight tomograms for training, one tomogram for validation, and one tomogram for testing.

Extended Data Fig. 4 Evolution of F1-scores with respect to sizes of the training sets (number of tomograms) on the synthetic SHREC dataset (12 classes).

Scores are displayed for both the SHREC 2019 (a) and 2020 (b) editions. This figure gives an estimation of the amount of annotated data needed to identify macromolecules. This amount depends on the size of the target macromolecule: smaller targets require more annotations. Each tomogram contains on average 208 macromolecules per class. The macromolecules have been categorized into four groups (large, medium, small and tiny). This analysis used eight tomograms for training, one tomogram for validation, and one tomogram for testing.

Extended Data Fig. 5 Evolution of F1-score with respect to training iterations and training set size on real cryo-ET Dataset #2, Chlamydomonas reinhardtii (3 classes).

a, The loss, which quantifies the segmentation quality, is computed for the training set as well as for the validation set. Comparing both curves allows assessment of the generalization capabilities of DeepFinder. Ideally, the curves for both sets overlap; a validation loss that departs from the training loss indicates overfitting (the network memorizes training samples instead of learning discriminating features). One epoch equals 100 training iterations. b, The F1-score, which quantifies the localization performance, computed on the test set. The F1-score is obtained by comparing the membrane-bound ribosomes found by DeepFinder to expert annotations. The time axis was obtained using a Tesla K80 GPU. The curve indicates that competitive particle picking results are obtained after 20 epochs, or 4.3 hours on this GPU. This analysis used 48 tomograms for training, one tomogram for validation, and eight tomograms for testing. c, In a similar fashion to Extended Data Fig. 4, this curve provides an estimate of the quantity of training data required to achieve a competitive result. It appears that this quantity is 1,400 ribosomes (nine tomograms), which is a typical size for a cryo-ET dataset. At first glance, this estimate seems to contradict the estimates in Extended Data Fig. 4: the numbers do not coincide (the curve labeled ‘Large’ estimates that quantity at 208 particles). Note that SHREC’19 is a synthetic dataset composed of 12 classes, whereas here we are dealing with a real cellular dataset consisting of three classes (membrane, mb-ribo and ct-ribo). It appears that having a larger number of classes enables the use of smaller training sets. On the other hand, the case of real data is more difficult, notably because of the presence of ‘label noise’ (errors due to the annotation pipeline) and other sources of signal corruption such as the missing wedge, the contrast transfer function and the low signal-to-noise ratio (in part caused by increased molecular crowding inside cells). This analysis used one tomogram for validation and eight tomograms for testing.

Extended Data Fig. 6 Quantitative analysis of overlap with expert annotations on cellular cryo-ET data (Dataset #2, mb-ribos).

We varied the thresholds of template matching (a) and DeepFinder (b) to compute the Recall (ratio between the number of true positives and the number of particles in the ground truth), Precision (ratio between the number of true positives and the number of detected particles) and F1-score (2 × (Recall × Precision) / (Recall + Precision)) curves. The threshold parameter for template matching is the constrained correlation coefficient, and for DeepFinder it is the cluster size, which corresponds to the macromolecule volume (in voxels). We obtained a maximum F1-score of 0.86 for DeepFinder and a maximum F1-score of 0.50 for template matching (with no post-classification step, see Extended Data Fig. 1a). Template matching and DeepFinder both have good Recall values, but template matching has a lower Precision than DeepFinder. This suggests that template matching can be recommended to select many candidates, but a time-consuming post-classification is required to improve Precision. DeepFinder has much higher Precision values, which confirms the results from the synthetic dataset (SHREC’19 challenge). This analysis used 48 tomograms for training, one tomogram for validation, and eight tomograms for testing.
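The Recall, Precision and F1-score definitions given in this caption translate directly into code. The sketch below is illustrative only: the function name and the example counts are made up and are not taken from the paper or from the DeepFinder code base.

```python
def detection_scores(n_true_positives: int, n_detected: int, n_ground_truth: int):
    """Return (recall, precision, f1) following the caption's definitions.

    recall    = true positives / particles in the ground truth
    precision = true positives / detected particles
    f1        = 2 * (recall * precision) / (recall + precision)
    """
    recall = n_true_positives / n_ground_truth if n_ground_truth else 0.0
    precision = n_true_positives / n_detected if n_detected else 0.0
    denominator = recall + precision
    f1 = 2 * recall * precision / denominator if denominator else 0.0
    return recall, precision, f1


if __name__ == "__main__":
    # Hypothetical counts: 80 correct picks out of 100 detections,
    # with 160 particles annotated in the ground truth.
    recall, precision, f1 = detection_scores(80, 100, 160)
    print(f"recall={recall:.3f} precision={precision:.3f} f1={f1:.3f}")
```

Evaluating this function once per threshold value (correlation coefficient for template matching, cluster size for DeepFinder) would give curves of the kind described in the caption.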

Extended Data Fig. 7 DeepFinder handles ice contamination on the lamella surface.

a, Tomogram slice depicting the border of a FIB-milled lamella. The lamella contains a Chlamydomonas reinhardtii cell, with a lamella surface suffering from ice contamination. b, Tomogram slice with superimposed DeepFinder segmentation. Most of the ice contamination artifacts have been correctly classified as ‘background’. Nonetheless, some misclassifications exist, as can be observed in the zoomed-in boxes (in dashed red). In boxes 1 and 2, DeepFinder confuses some artifacts with membranes, and some features are wrongly classified as membrane-bound ribosomes. Such misclassifications can be filtered out, either by masking the boundaries of the lamella, or by rejecting segmented objects that are too small (using the ‘cluster size’ attribute given by the clustering step of the DeepFinder analysis stage). This analysis used 48 tomograms for training, one tomogram for validation, and eight tomograms for testing.
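The two filtering strategies mentioned in this caption can be sketched as simple post-processing of an object list. Everything here is an assumption for illustration: the `SegmentedObject` record, its field names, the axis along which the lamella boundaries are masked and the threshold values are hypothetical and do not reproduce the DeepFinder object-list format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SegmentedObject:
    label: str                     # class name, e.g. "mb-ribo"
    cluster_size: int              # object volume in voxels, from clustering
    center: Tuple[int, int, int]   # (x, y, z) position in voxels


def reject_small_objects(objects: List[SegmentedObject],
                         min_cluster_size: int) -> List[SegmentedObject]:
    """Keep only objects whose cluster size reaches the threshold."""
    return [o for o in objects if o.cluster_size >= min_cluster_size]


def mask_lamella_boundaries(objects: List[SegmentedObject],
                            z_min: int, z_max: int) -> List[SegmentedObject]:
    """Keep only objects whose center lies inside the slab z_min..z_max."""
    return [o for o in objects if z_min <= o.center[2] <= z_max]


if __name__ == "__main__":
    picks = [
        SegmentedObject("mb-ribo", 950, (120, 88, 60)),
        SegmentedObject("mb-ribo", 40, (130, 90, 62)),    # too small
        SegmentedObject("mb-ribo", 900, (300, 45, 3)),    # on the surface
    ]
    kept = mask_lamella_boundaries(reject_small_objects(picks, 500), 10, 110)
    print(len(kept), kept[0].center)
```

Both filters are independent, so either one can be applied alone depending on where the contamination sits in a given tomogram.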

Extended Data Fig. 8 The generalization potential of DeepFinder on P19 cells.

DeepFinder was trained on the Chlamydomonas (algae) dataset and then applied to a tomogram of mouse P19 cells (EMD-10439). Although the ribosome has a different structure for the two species, for a given voxel size (13.68 Å) the structures are similar enough for DeepFinder to identify and localize mb-ribo particles in a P19 cell. a, Tomographic slice with both the superimposed segmented cell membrane (gray) and mb-ribo particles (blue). b, Average density from 300 mb-ribo particles. c, Histogram of mb-ribo particle distance from the nearest cell membrane. In this histogram, the maximum mode is located at 136.8 Å, which corresponds to the ribosome radius. This analysis used 48 tomograms for training, one tomogram for validation, and one tomogram for testing.
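The distance histogram in panel c relies on converting voxel distances to ångströms with the voxel size quoted above (13.68 Å), so a distance of 10 voxels corresponds to 136.8 Å. The following is a minimal brute-force sketch of that computation; the function names, coordinates and bin width are hypothetical, and the authors' actual procedure may differ.

```python
import math
from collections import Counter

VOXEL_SIZE_ANGSTROM = 13.68  # voxel size quoted in the caption


def nearest_membrane_distance(particle, membrane_voxels,
                              voxel_size=VOXEL_SIZE_ANGSTROM):
    """Distance (in angstroms) from a particle center to the closest membrane voxel."""
    return voxel_size * min(math.dist(particle, m) for m in membrane_voxels)


def distance_histogram(distances, bin_width):
    """Count distances per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    return Counter(int(d // bin_width) for d in distances)


if __name__ == "__main__":
    membrane = [(0, 0, 0), (0, 5, 0)]        # made-up membrane voxel positions
    particles = [(10, 0, 0), (0, 5, 12)]     # made-up particle centers
    distances = [nearest_membrane_distance(p, membrane) for p in particles]
    print([round(d, 2) for d in distances])
    print(distance_histogram(distances, bin_width=50.0))
```

A brute-force minimum over all membrane voxels is quadratic in the number of points; for full tomograms a distance transform or a spatial index would be the more practical choice.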

Supplementary information

Supplementary Information

Supplementary Table 2, Figs. 1–8 and Note 1 (with two figures).

Reporting Summary

About this article

Cite this article

Moebel, E., Martinez-Sanchez, A., Lamm, L. et al. Deep learning improves macromolecule identification in 3D cellular cryo-electron tomograms. Nat Methods 18, 1386–1394 (2021). https://doi.org/10.1038/s41592-021-01275-4
