  • Protocol
Integrative structural modeling of macromolecular complexes using Assembline

Abstract

Integrative modeling enables structure determination of macromolecular complexes by combining data from multiple experimental sources such as X-ray crystallography, electron microscopy or cross-linking mass spectrometry. It is particularly useful for complexes not amenable to high-resolution electron microscopy—complexes that are flexible, heterogeneous or imaged in cells with cryo-electron tomography. We have recently developed an integrative modeling protocol that allowed us to model multi-megadalton complexes as large as the nuclear pore complex. Here, we describe the Assembline software package, which combines multiple programs and libraries with our own algorithms in a streamlined modeling pipeline. Assembline builds ensembles of models satisfying data from atomic structures or homology models, electron microscopy maps and other experimental data, and provides tools for their analysis. Compared with other methods, Assembline enables efficient sampling of conformational space through a multistep procedure, provides new modeling restraints and includes a unique configuration system for setting up the modeling project. Our protocol achieves exhaustive sampling in less than 100–1,000 CPU-hours even for complexes in the megadalton range. For larger complexes, resources available in institutional or public computer clusters are needed and sufficient to run the protocol. We also provide step-by-step instructions for preparing the input, running the core modeling steps and assessing modeling performance at any stage.
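As a rough guide to what the 100–1,000 CPU-hour figure means on a cluster, the sketch below converts a CPU-hour budget into wall-clock time. It assumes the sampling can be split into independent runs that spread evenly across cores, which is an idealization and not a statement from the protocol; the function name `wall_clock_hours` and the `efficiency` parameter are illustrative and not part of Assembline.

```python
def wall_clock_hours(cpu_hours, n_cores, efficiency=1.0):
    """Estimate wall-clock hours for a CPU-hour budget spread over n_cores.

    Assumption (not from the protocol): the workload splits into independent
    runs, so elapsed time scales as 1 / (n_cores * efficiency).
    """
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return cpu_hours / (n_cores * efficiency)


if __name__ == "__main__":
    # Lower and upper ends of the range quoted in the abstract.
    for budget in (100, 1000):
        for cores in (16, 100):
            hours = wall_clock_hours(budget, cores)
            print(f"{budget} CPU-hours on {cores} cores: about {hours:.1f} h")
```

Under these assumptions, the upper end of the range fits into an overnight job on about 100 cores, whereas a single workstation core would need weeks.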

Fig. 1: Assembline workflow.
Fig. 2: Detailed workflow regarding the calculation of fit libraries.
Fig. 3: Workflow of the global optimization step.
Fig. 4: An example of Xlink Analyzer interface used for configuring the modeling project.
Fig. 5: Workflow of the refinement step.
Fig. 6: Integrative models of ScNPC.
Fig. 7: Integrative modeling of Elongator complex.
Fig. 8: Validation of the integrative model of the Elongator complex by a high-resolution cryo-EM structure.

Data availability

All datasets needed for the step-by-step tutorials (i.e., for yeast NPC and Elongator complex modeling) are provided at https://git.embl.de/rantos/scnpc_tutorial.git (for the yeast NPC) and https://git.embl.de/kosinski/elongator_tutorial.git (for the Elongator complex).
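The two tutorial datasets can be fetched with `git clone`. The following is a minimal sketch that builds and optionally runs the clone commands from Python; it assumes `git` is installed and on the `PATH`, and the helper names `clone_command` and `fetch_tutorials` as well as the `tutorials` target directory are hypothetical choices made for illustration.

```python
import subprocess
from pathlib import Path

# Repository URLs as given in the Data availability statement.
TUTORIAL_REPOS = [
    "https://git.embl.de/rantos/scnpc_tutorial.git",
    "https://git.embl.de/kosinski/elongator_tutorial.git",
]


def clone_command(url, workdir):
    """Build the `git clone` argument list for one repository URL."""
    name = url.rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[:-4]
    return ["git", "clone", url, str(Path(workdir) / name)]


def fetch_tutorials(workdir="tutorials", dry_run=True):
    """Print the clone commands (dry run) or execute them; return the commands."""
    commands = [clone_command(url, workdir) for url in TUTORIAL_REPOS]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            Path(workdir).mkdir(parents=True, exist_ok=True)
            # check=True raises CalledProcessError if git exits with an error.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: nothing is downloaded until dry_run=False is passed.
    fetch_tutorials(dry_run=True)
```

Calling `fetch_tutorials(dry_run=False)` would perform the downloads; the dry run is the default so that the commands can be inspected first.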

Code availability

The Assembline software is freely available as an open-source Python package (https://www.embl-hamburg.de/Assembline/), which can be installed from source code (https://git.embl.de/kosinski/Assembline) or from the Anaconda repository (https://anaconda.org/kosinskilab/assembline). Documentation is provided in the Supplementary Manual and can also be found online (https://assembline.readthedocs.io/en/latest/), and in the form of the step-by-step tutorials for modeling the yeast NPC (Supplementary Tutorial 1, online version: https://scnpc-tutorial.readthedocs.io/en/latest/) and Elongator complex (Supplementary Tutorial 2, online version: https://elongator-tutorial.readthedocs.io/en/latest/).

References

  1. Marsh, J. A. & Teichmann, S. A. Structure, dynamics, assembly, and evolution of protein complexes. Annu. Rev. Biochem. 84, 551–575 (2015).

  2. Patel, A. B. et al. Structure of human TFIID and mechanism of TBP loading onto promoter DNA. Science 362, eaau8872 (2018).

  3. Vos, S. M., Farnung, L., Urlaub, H. & Cramer, P. Structure of paused transcription complex Pol II–DSIF–NELF. Nature 560, 601–606 (2018).

  4. Lyumkis, D. Challenges and opportunities in cryo-EM single-particle analysis. J. Biol. Chem. 294, 5181–5197 (2019).

  5. Beck, M. & Baumeister, W. Cryo-electron tomography: can it reveal the molecular sociology of cells in atomic detail? Trends Cell Biol. 26, 825–837 (2016).

  6. Allegretti, M. et al. In-cell architecture of the nuclear pore and snapshots of its turnover. Nature 586, 796–800 (2020).

  7. O’Reilly, F. J. et al. In-cell architecture of an actively transcribing-translating expressome. Science 369, 554–557 (2020).

  8. Turk, M. & Baumeister, W. The promise and the challenges of cryo‐electron tomography. FEBS Lett. 594, 3243–3261 (2020).

  9. Wagner, F. R. et al. Preparing samples from whole cells using focused-ion-beam milling for cryo-electron tomography. Nat. Protoc. 15, 2041–2070 (2020).

  10. Bharat, T. A. M. & Scheres, S. H. W. Resolving macromolecular structures from electron cryo-tomography data using subtomogram averaging in RELION. Nat. Protoc. 11, 2054–2065 (2016).

  11. Fäßler, F., Dimchev, G., Hodirnau, V. V., Wan, W. & Schur, F. K. M. Cryo-electron tomography structure of Arp2/3 complex in cells reveals new insights into the branch junction. Nat. Commun. 11, 6437 (2020).

  12. Malhotra, S., Träger, S., Dal Peraro, M. & Topf, M. Modelling structures in cryo-EM maps. Curr. Opin. Struct. Biol. 58, 105–114 (2019).

  13. Ferber, M. et al. Automated structure modeling of large protein assemblies using crosslinks as distance restraints. Nat. Methods 13, 515–520 (2016).

  14. Orbán-Németh, Z. et al. Structural prediction of protein models using distance restraints derived from cross-linking mass spectrometry data. Nat. Protoc. 13, 478–494 (2018).

  15. Gräwert, T. W. & Svergun, D. I. Structural modeling using solution small-angle X-ray scattering (SAXS). J. Mol. Biol. 432, 3078–3092 (2020).

  16. Koukos, P. I. & Bonvin, A. M. J. J. Integrative modelling of biomolecular complexes. J. Mol. Biol. 432, 2861–2881 (2020).

  17. Braitbard, M., Schneidman-Duhovny, D. & Kalisman, N. Integrative structure modeling: overview and assessment. Annu. Rev. Biochem. 88, 113–135 (2019).

  18. Rout, M. P. & Sali, A. Principles for integrative structural biology studies. Cell 177, 1384–1403 (2019).

  19. Viswanath, S., Chemmama, I. E., Cimermancic, P. & Sali, A. Assessing exhaustiveness of stochastic sampling for integrative modeling of macromolecular structures. Biophys. J. 113, 2344–2353 (2017).

  20. Zimmerli, C. E. et al. Nuclear pores constrict upon energy depletion. Preprint at bioRxiv https://doi.org/10.1101/2020.07.30.228585 (2020).

  21. Webb, B. et al. Integrative structure modeling with the Integrative Modeling Platform. Protein Sci. 27, 245–258 (2018).

  22. Bordoli, L. et al. Protein structure homology modeling using SWISS-MODEL workspace. Nat. Protoc. 4, 1–13 (2009).

  23. Kosinski, J. et al. Molecular architecture of the inner ring scaffold of the human nuclear pore complex. Science 352, 363–365 (2016).

  24. Kosinski, J. et al. Xlink analyzer: software for analysis and visualization of cross-linking data in the context of three-dimensional structures. J. Struct. Biol. 189, 177–183 (2015).

  25. Pettersen, E. F. et al. UCSF Chimera: a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).

  26. Russel, D. et al. Putting the pieces together: Integrative Modeling Platform software for structure determination of macromolecular assemblies. PLoS Biol. 10, e1001244 (2012).

  27. Saltzberg, D. et al. Modeling biological complexes using Integrative Modeling Platform. Methods Mol. Biol. 2022, 353–377 (2019).

  28. Dauden, M. I. et al. Architecture of the yeast Elongator complex. EMBO Rep. 18, 264–279 (2017).

  29. Dauden, M. I. et al. Molecular basis of tRNA recognition by the Elongator complex. Sci. Adv. 5, eaaw2326 (2019).

  30. Beckham, K. S. H. et al. Structure of the mycobacterial ESX-5 type VII secretion system pore complex. Sci. Adv. 7, eabg9923 (2021).

  31. Bui, K. H. et al. Integrated structural analysis of the human nuclear pore complex scaffold. Cell 155, 1233–1243 (2013).

  32. Fisher, R. A. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika 10, 507 (1915).

  33. Strimmer, K. fdrtool: a versatile R package for estimating local and tail area-based false discovery rates. Bioinformatics 24, 1461–1462 (2008).

  34. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57, 289–300 (1995).

  35. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).

  36. Schneidman-Duhovny, D., Hammel, M. & Sali, A. FoXS: a web server for rapid computation and fitting of SAXS profiles. Nucleic Acids Res. 38, W540–W544 (2010).

  37. Lasker, K., Topf, M., Sali, A. & Wolfson, H. J. Inferential optimization for simultaneous fitting of multiple components into a CryoEM map of their assembly. J. Mol. Biol. 388, 180–194 (2009).

  38. Dominguez, C., Boelens, R. & Bonvin, A. M. J. J. HADDOCK: a protein–protein docking approach based on biochemical or biophysical information. J. Am. Chem. Soc. 125, 1731–1737 (2003).

  39. Karaca, E., Rodrigues, J. P. G. L. M., Graziadei, A., Bonvin, A. M. J. J. & Carlomagno, T. M3: an integrative framework for structure determination of molecular machines. Nat. Methods 14, 897–902 (2017).

  40. Marrink, S. J., Risselada, H. J., Yefimov, S., Tieleman, D. P. & De Vries, A. H. The MARTINI force field: coarse grained model for biomolecular simulations. J. Phys. Chem. B 111, 7812–7824 (2007).

  41. DiMaio, F., Tyka, M. D., Baker, M. L., Chiu, W. & Baker, D. Refinement of protein structures into low-resolution density maps using Rosetta. J. Mol. Biol. 392, 181–190 (2009).

  42. Wriggers, W. Conventions and workflows for using Situs. Acta Crystallogr. D. Biol. Crystallogr. 68, 344–351 (2012).

  43. Topf, M. et al. Protein structure fitting and refinement guided by cryo-EM density. Structure 16, 295–307 (2008).

  44. Trabuco, L. G., Villa, E., Mitra, K., Frank, J. & Schulten, K. Flexible fitting of atomic structures into electron microscopy maps using molecular dynamics. Structure 16, 673–683 (2008).

  45. Pandurangan, A. P., Vasishtan, D., Alber, F. & Topf, M. γ-TEMPy: simultaneous fitting of components in 3D-EM maps of their assembly using a genetic algorithm. Structure 23, 2365–2376 (2015).

  46. Vitalis, A. & Caflisch, A. Equilibrium sampling approach to the interpretation of electron density maps. Structure 22, 156–167 (2014).

  47. López-Blanco, J. R. & Chacón, P. IMODFIT: efficient and robust flexible fitting based on vibrational analysis in internal coordinates. J. Struct. Biol. 184, 261–270 (2013).

  48. Ratje, A. H. et al. Head swivel on the ribosome facilitates translocation by means of intra-subunit tRNA hybrid sites. Nature 468, 713–716 (2010).

  49. Saha, M. & Morais, M. C. FOLD-EM: automated fold recognition in medium- and low-resolution (4–15 Å) electron density maps. Bioinformatics 28, 3265–3273 (2012).

  50. de Vries, S. J. & Zacharias, M. ATTRACT-EM: a new method for the computational assembly of large molecular machines using cryo-EM maps. PLoS One 7, e49733 (2012).

  51. Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).

  52. Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).

  53. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

  54. Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).

  55. Gil, V. A. & Guallar, V. pyRMSD: a Python package for efficient pairwise RMSD matrix calculation and handling. Bioinformatics 29, 2363–2364 (2013).

  56. McInnes, L., Healy, J. & Astels, S. hdbscan: hierarchical density based clustering. J. Open Source Softw. 2, 205 (2017).

  57. Wickham, H. et al. Welcome to the Tidyverse. J. Open Source Softw. 4, 1686 (2019).

  58. Šali, A. & Blundell, T. L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815 (1993).

  59. Beck, M. & Hurt, E. The nuclear pore complex: understanding its function through structural insight. Nat. Rev. Mol. Cell Biol. 18, 73–89 (2017).

  60. Lin, D. H. et al. Architecture of the symmetric core of the nuclear pore. Science 352, aaf1015 (2016).

  61. Drin, G. et al. A general amphipathic α-helical motif for sensing membrane curvature. Nat. Struct. Mol. Biol. 14, 138–146 (2007).

  62. Kim, S. J. et al. Integrative structure and functional anatomy of a nuclear pore complex. Nature 555, 475–482 (2018).

  63. Dauden, M. I., Jaciuk, M., Müller, C. W. & Glatt, S. Structural asymmetry in the eukaryotic Elongator complex. FEBS Lett. 592, 502–515 (2018).

Acknowledgements

We thank the coauthors of the publications in which the Assembline protocol has been applied. We are grateful to D. Ziemianowicz and K. Kaszuba for comments on the manuscript and the protocol, and to A. Obarska for feedback on the protocol. The work has been supported by the Federal Ministry of Education and Research of Germany (FKZ 031L0100).

Author information

Contributions

V.R., K.K. and J.K. developed the protocol. V.R. and J.K. wrote the manuscript.

Corresponding author

Correspondence to Jan Kosinski.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Protocols thanks André Hoelz, Peter J. Peters and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Beckham, K. S. H. et al. Sci. Adv. 7, eabg9923 (2021): https://doi.org/10.1126/sciadv.abg9923

Allegretti, M. et al. Nature 586, 796–800 (2020): https://doi.org/10.1038/s41586-020-2670-5

Zimmerli, C. E. et al. Preprint at bioRxiv (2020): https://doi.org/10.1101/2020.07.30.228585

Dauden, M. I. et al. EMBO Rep. 18, 264–279 (2017): https://doi.org/10.15252/embr.201643353

Kosinski, J. et al. Science 352, 363–365 (2016): https://doi.org/10.1126/science.aaf0643

Essential data used in this protocol

Allegretti, M. et al. Nature 586, 796–800 (2020): https://doi.org/10.1038/s41586-020-2670-5

Dauden, M. I. et al. EMBO Rep. 18, 264–279 (2017): https://doi.org/10.15252/embr.201643353

Supplementary information

Supplementary Information

Supplementary Figs. 1–3, Supplementary Manual and Supplementary Tutorials 1 and 2.

Reporting Summary

Supplementary Video 1

CR Y-complex trajectory from global optimization. Animation depicting the trajectory of the best-scoring CR Y-complex model of the wild-type in-cell ScNPC (ref. 6) produced with the global optimization step. The model is shown in the coarse-grained representation inside the respective EM map.

Supplementary Video 2

NR Y-complex trajectory from refinement. Animation depicting the trajectory of the best-scoring NR Y-complex model of the wild-type in-cell ScNPC (ref. 6) produced with the refinement step. The model and its symmetrical copies are shown in the coarse-grained representation inside the respective EM map.

About this article

Cite this article

Rantos, V., Karius, K. & Kosinski, J. Integrative structural modeling of macromolecular complexes using Assembline. Nat Protoc 17, 152–176 (2022). https://doi.org/10.1038/s41596-021-00640-z

  • DOI: https://doi.org/10.1038/s41596-021-00640-z
