Domain-agnostic predictions of nanoscale interactions in proteins and nanoparticles

A preprint version of the article is available at bioRxiv.

Abstract

Although challenging, the accurate and rapid prediction of nanoscale interactions has broad applications for numerous biological processes and material properties. While several models have been developed to predict the interaction of specific biological components, they use system-specific information that hinders their application to more general materials. Here we present NeCLAS, a general and efficient machine learning pipeline that predicts the location of nanoscale interactions, providing human-intelligible predictions. NeCLAS outperforms current nanoscale prediction models for generic nanoparticles up to 10–20 nm, reproducing interactions for biological and non-biological systems. Two aspects contribute to these results: a low-dimensional representation of nanoparticles and molecules (to reduce the effect of data uncertainty), and environmental features (to encode the physicochemical neighborhood at multiple scales). This framework has several applications, from basic research to rapid prototyping and design in nanobiotechnology.
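The two ingredients named in the abstract can be illustrated with a minimal sketch. This is not the NeCLAS implementation: the function names, the centroid-based coarse-graining, and the Gaussian-weighted neighborhood sums are all assumptions chosen for illustration. The sketch first reduces atoms to beads (a low-dimensional representation) and then summarizes a per-bead property over the surrounding beads at several length scales (environmental features).

```python
import numpy as np

def coarse_grain(atom_coords, bead_ids):
    """Map atoms to bead centroids (hypothetical helper, not the authors' scheme).

    atom_coords: (n_atoms, 3) positions; bead_ids: (n_atoms,) integer bead index
    per atom. Every index from 0 to max(bead_ids) must own at least one atom.
    """
    atom_coords = np.asarray(atom_coords, dtype=float)
    bead_ids = np.asarray(bead_ids, dtype=int)
    n_beads = bead_ids.max() + 1
    sums = np.zeros((n_beads, 3))
    np.add.at(sums, bead_ids, atom_coords)            # accumulate coordinates per bead
    counts = np.bincount(bead_ids, minlength=n_beads)  # atoms per bead
    return sums / counts[:, None]

def multiscale_environment_features(bead_coords, bead_props, scales=(0.5, 1.0, 2.0)):
    """Gaussian-weighted sums of a per-bead property at several length scales.

    Assumed feature form: for bead i and width sigma,
    f_i(sigma) = sum over j != i of p_j * exp(-d_ij^2 / (2 sigma^2)).
    Returns an array of shape (n_beads, len(scales)).
    """
    bead_coords = np.asarray(bead_coords, dtype=float)
    bead_props = np.asarray(bead_props, dtype=float)
    diff = bead_coords[:, None, :] - bead_coords[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)                    # pairwise squared distances
    columns = []
    for sigma in scales:
        weights = np.exp(-d2 / (2.0 * sigma ** 2))
        np.fill_diagonal(weights, 0.0)                 # a bead is not its own neighbor
        columns.append(weights @ bead_props)
    return np.stack(columns, axis=1)

# Toy usage: four atoms grouped into two beads, one property value per bead.
atoms = [[0, 0, 0], [2, 0, 0], [0, 0, 10], [0, 2, 10]]
beads = coarse_grain(atoms, [0, 0, 1, 1])
feats = multiscale_environment_features(beads, [1.0, -1.0], scales=(1.0, 100.0))
```

Small widths make each feature sensitive only to nearby beads, while large widths capture the wider neighborhood, so stacking several widths gives each bead a description of its surroundings at multiple scales.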

Fig. 1: Methods and data.
Fig. 2: Predictive performances of different methods.
Fig. 3: Interactions between molecular tweezers and the 14-3-3σ protein.
Fig. 4: Interaction of PSMα1 and GQDs.
Fig. 5: Predicted and simulated interactions of GQDs.

Data availability

Additional data are available at Deep Blue Data, an open and permanent data repository maintained by the University of Michigan (ref. 56). This repository contains all raw files too large to include with the paper, including atomic coordinate files, simulation inputs and outputs, and individual pairwise predictions. The source data for Figs. 1–5 are available with this paper.

Code availability

The code used in this work and the related documentation are available on Code Ocean (ref. 57). Public releases of the code can be found at https://gitlab.eecs.umich.edu/violigroup/ml/neclas/-/releases/.

References

  1. Ghosh, G. & Panicker, L. Protein–nanoparticle interactions and a new insight. Soft Matter 17, 3855–3875 (2021).

  2. Russ, K. A. et al. C60 fullerene localization and membrane interactions in RAW 264.7 immortalized mouse macrophages. Nanoscale 8, 4134–4144 (2016).

  3. Liu, C. et al. Predicting the time of entry of nanoparticles in lipid membranes. ACS Nano 13, 10221–10232 (2019).

  4. Pawson, T. & Scott, J. D. Signaling through scaffold, anchoring, and adaptor proteins. Science 278, 2075–2080 (1997).

  5. Holzinger, M., Le Goff, A. & Cosnier, S. Nanomaterials for biosensing applications: a review. Front. Chem. 2, 63–73 (2014).

  6. Cha, S.-H. et al. Shape-dependent biomimetic inhibition of enzyme by nanoparticles and their antibacterial activity. ACS Nano 9, 9097–9105 (2015).

  7. Adcock, S. A. & McCammon, J. A. Molecular dynamics: survey of methods for simulating the activity of proteins. Chem. Rev. 106, 1589–1615 (2006).

  8. Yan, Y., Tao, H., He, J. & Huang, S.-Y. The HDOCK server for integrated protein–protein docking. Nat. Protoc. 15, 1829–1852 (2020).

  9. Lim, S. et al. A review on compound–protein interaction prediction methods: data, format, representation and model. Comput. Struct. Biotechnol. J. 19, 1541–1556 (2021).

  10. Krivák, R. & Hoksza, D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. J. Cheminform. 10, 39 (2018).

  11. Gainza, P. et al. Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning. Nat. Methods 17, 184–192 (2020).

  12. Sanchez-Garcia, R., Sorzano, C., Carazo, J. M. & Segura, J. BIPSPI: a method for the prediction of partner-specific protein-protein interfaces. Bioinformatics 35, 470–477 (2019).

  13. Dai, B. & Bailey-Kellogg, C. Protein interaction interface region prediction by geometric deep learning. Bioinformatics 37, 2580–2588 (2021).

  14. Minhas, F. u. A. A., Geiss, B. J. & Ben-Hur, A. PAIRpred: partner-specific prediction of interacting residues from sequence and structure. Proteins 82, 1142–1155 (2014).

  15. Fout, A., Byrd, J., Shariat, B. & Ben-Hur, A. Protein interface prediction using graph convolutional networks. In Advances in Neural Information Processing Systems, Vol. 30 (Eds Guyon, I. et al.) (Curran Associates, Inc. 2017).

  16. Vreven, T. et al. Updates to the integrated protein-protein interaction benchmarks: Docking Benchmark version 5 and Affinity Benchmark version 2. J. Mol. Biol. 427, 3031–3041 (2015).

  17. Monopoli, M. P., Åberg, C., Salvati, A. & Dawson, K. A. Biomolecular coronas provide the biological identity of nanosized materials. Nat. Nanotechnol. 7, 779–786 (2012).

  18. Findlay, M. R., Freitas, D. N., Mobed-Miremadi, M. & Wheeler, K. E. Machine learning provides predictive analysis into silver nanoparticle protein corona formation from physicochemical properties. Environ. Sci. Nano 5, 64–71 (2018).

  19. Ouassil, N., Pinals, R. L., Bonis-O’Donnell, J. T. D., Wang, J. W. & Landry, M. P. Supervised learning model predicts protein adsorption to carbon nanotubes. Sci. Adv. 8, eabm0898 (2022).

  20. Alex, J. M. et al. Calixarene-mediated assembly of a small antifungal protein. IUCrJ 6, 238–247 (2019).

  21. Clark, J. J., Orban, Z. J. & Carlson, H. A. Predicting binding sites from unbound versus bound protein structures. Sci. Rep. 10, 15856 (2020).

  22. Costanzo, L. D. & Geremia, S. Atomic details of carbon-based nanomolecules interacting with proteins. Molecules 25, 3555 (2020).

  23. Cha, M. et al. Unifying structural descriptors for biological and bioinspired nanoscale complexes. Nat. Comput. Sci. 2, 243–252 (2022).

  24. Porollo, A. & Meller, J. Prediction-based fingerprints of protein-protein interactions. Proteins 66, 630–645 (2006).

  25. Yang, J., Roy, A. & Zhang, Y. Protein–ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics 29, 2588–2595 (2013).

  26. Jiménez, J., Doerr, S., Martínez-Rosell, G., Rose, A. S. & De Fabritiis, G. DeepSite: protein-binding site predictor using 3D-convolutional neural networks. Bioinformatics 33, 3036–3042 (2017).

  27. Mylonas, S. K., Axenopoulos, A. & Daras, P. DeepSurf: a surface-based deep learning approach for the prediction of ligand binding sites on proteins. Bioinformatics 37, 1681–1690 (2021).

  28. Le Guilloux, V., Schmidtke, P. & Tuffery, P. Fpocket: an open source platform for ligand pocket detection. BMC Bioinformatics 10, 168 (2009).

  29. Andreeva, A., Kulesha, E., Gough, J. & Murzin, A. G. The SCOP database in 2020: expanded classification of representative family and superfamily domains of known protein structures. Nucleic Acids Res. 48, D376–D382 (2019).

  30. Bier, D. et al. Molecular tweezers modulate 14-3-3 protein–protein interactions. Nat. Chem. 5, 234–239 (2013).

  31. Pintar, A., Carugo, O. & Pongor, S. CX, an algorithm that identifies protruding atoms in proteins. Bioinformatics 18, 980–984 (2002).

  32. Stanton, D. T. & Jurs, P. C. Development and use of charged partial surface area structural descriptors in computer-assisted quantitative structure–property relationship studies. Anal. Chem. 62, 2323–2329 (1990).

  33. Stanton, D. T., Egolf, L. M., Jurs, P. C. & Hicks, M. G. Computer-assisted prediction of normal boiling points of pyrans and pyrroles. J. Chem. Inf. Comput. Sci. 32, 306–316 (1992).

  34. Wang, Y. et al. Anti-biofilm activity of graphene quantum dots via self-assembly with bacterial amyloid proteins. ACS Nano 13, 4278–4289 (2019).

  35. Elvati, P., Baumeister, E. & Violi, A. Graphene quantum dots: effect of size, composition and curvature on their assembly. RSC Adv. 7, 17704–17710 (2017).

  36. Suzuki, N. et al. Chiral graphene quantum dots. ACS Nano 10, 1744–1755 (2016).

  37. Noid, W. G. et al. The multiscale coarse-graining method. I. A rigorous bridge between atomistic and coarse-grained models. J. Chem. Phys. 128, 244114 (2008).

  38. Izvekov, S. & Voth, G. A. A multiscale coarse-graining method for biomolecular systems. J. Phys. Chem. B 109, 2469–2473 (2005).

  39. Baranwal, M. et al. Struct2Graph: a graph attention network for structure based predictions of protein–protein interactions. BMC Bioinformatics 23, 370 (2022).

  40. Deguchi, S., Alargova, R. G. & Tsujii, K. Stable dispersions of fullerenes, C60 and C70, in water. Preparation and characterization. Langmuir 17, 6013–6017 (2001).

  41. Kim, K.-H. et al. Protein-directed self-assembly of a fullerene crystal. Nat. Commun. 7, 11429 (2016).

  42. Zaheer, M. et al. Deep sets. In Advances in Neural Information Processing Systems, Vol. 30 (Eds Guyon, I. et al.) (Curran Associates, Inc. 2017).

  43. Martinetz, T., Berkovich, S. & Schulten, K. ‘Neural-gas’ network for vector quantization and its application to time-series prediction. IEEE Trans. Neural Netw. 4, 558–569 (1993).

  44. Sanner, M. F., Olson, A. J. & Spehner, J.-C. Reduced surface: an efficient way to compute molecular surfaces. Biopolymers 38, 305–320 (1996).

  45. Kawabata, T. Detection of multiscale pockets on protein surfaces using mathematical morphology. Proteins 78, 1195–1211 (2010).

  46. Todeschini, R. & Gramatica, P. The WHIM theory: new 3D molecular descriptors for QSAR in environmental modelling. SAR QSAR Environ. Res. 7, 89–115 (1997).

  47. Dolinsky, T. J. et al. PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res. 35, W522–W525 (2007).

  48. Hornak, V. et al. Comparison of multiple AMBER force fields and development of improved protein backbone parameters. Proteins 65, 712–725 (2006).

  49. Gasteiger, J. & Marsili, M. A new model for calculating atomic charges in molecules. Tetrahedron Lett. 19, 3181–3184 (1978).

  50. Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 134, 074106 (2011).

  51. Gastegger, M., Schwiedrzik, L., Bittermann, M., Berzsenyi, F. & Marquetand, P. WACSF—weighted atom-centered symmetry functions as descriptors in machine learning potentials. J. Chem. Phys. 148, 241709 (2018).

  52. Berman, H., Henrick, K. & Nakamura, H. Announcing the worldwide Protein Data Bank. Nat. Struct. Mol. Biol. 10, 980–980 (2003).

  53. Halgren, T. A. Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. J. Comput. Chem. 17, 490–519 (1996).

  54. Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems. TensorFlow https://www.tensorflow.org/ (2015).

  55. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).

  56. Saldinger, J., Raymond, M., Elvati, P. & Violi, A. Supporting data: domain-agnostic predictions of nanoscale interactions in proteins and nanoparticles. University of Michigan–Deep Blue Data https://doi.org/10.7302/58q6-0q88 (2023).

  57. Saldinger, J., Raymond, M., Elvati, P. & Violi, A. Domain-agnostic predictions of nanoscale interactions in proteins and nanoparticles. https://codeocean.com/capsule/8157811/tree. Code Ocean https://doi.org/10.24433/CO.8157811.v1 (2023).

Acknowledgements

This work was supported by the BlueSky Initiative, funded by The University of Michigan College of Engineering (principal investigator A.V.), the Army Research Office MURI (grant no. W911NF-18-1-0240) (A.V.), and the National Science Foundation Graduate Research Fellowship under grant no. 1256260 (J.C.S.). We thank C. Scott for insightful feedback and discussions on ML and C. Luyet for help with the all-atom simulation of 6C-g3OH. We acknowledge Advanced Research Computing, a division of Information and Technology Services at the University of Michigan, for computational resources and services provided for the research.

Author information

Contributions

A.V. and P.E. conceived and supervised the project. J.C.S. and P.E. conceived chemical features and representations. M.R. and J.C.S. designed, trained and tested the machine learning models. J.C.S. designed experiments and created the database. P.E. designed and ran the MD simulations. All authors read, revised and approved the paper.

Corresponding author

Correspondence to Angela Violi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editors: Ananya Rastogi and Fernando Chirigati, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–11, Notes, Discussion, Tables 1–5 and equations 1–7.

Reporting Summary

Source data

Source Data Fig. 1

Fig1_c_source.csv: first two principal components and bead type, Fig1_def_source.csv: RMSD values for protein–protein and protein–nanoparticle datasets.

Source Data Fig. 2

Fig2_a_source.csv: per-complex AUC values for protein–nanoparticle datasets, Fig2_b_source: per-complex AUC for protein–protein dataset with leave-one-out cross-validation, Fig2_c_source: per-complex AUC for protein–protein dataset on Docking Benchmark Dataset split.

Source Data Fig. 3

Fig3_a_source.csv: mean residue predictions, Fig3_b_source.csv: residue predictions and statistics, Fig3_c_source.csv: feature values.

Source Data Fig. 4

Fig4_b_source.csv: residue interaction predictions and ground truth, Fig4_c_source.csv: interaction predictions.

Source Data Fig. 5

Fig5a_d_source.csv: interaction potentials for internal and external beads of g3OH and g3CHO.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Saldinger, J.C., Raymond, M., Elvati, P. et al. Domain-agnostic predictions of nanoscale interactions in proteins and nanoparticles. Nat Comput Sci 3, 393–402 (2023). https://doi.org/10.1038/s43588-023-00438-x

  • DOI: https://doi.org/10.1038/s43588-023-00438-x
