Predicting drug–protein interaction using quasi-visual question answering system

Zheng, Shuangjia; Li, Yongjian; Chen, Sheng; Xu, Jun; Yang, Yuedong

doi:10.1038/s42256-020-0152-y

Article
Published: 14 February 2020

Nature Machine Intelligence volume 2, pages 134–140 (2020)

A Publisher Correction to this article was published on 11 August 2020

This article has been updated

Abstract

Identifying novel drug–protein interactions is crucial for drug discovery. For this purpose, many machine learning-based methods have been developed based on drug descriptors and one-dimensional protein sequences. However, protein sequences cannot accurately reflect the interactions in three-dimensional space. Direct input of three-dimensional structures, on the other hand, is inefficient because the three-dimensional matrix is sparse, and is further hindered by the limited number of co-crystal structures available for training. Here we propose an end-to-end deep learning framework that predicts the interactions by representing proteins as two-dimensional distance maps derived from monomer structures (Image) and drugs as molecular linear notations (String), following the visual question answering mode. For efficient training of the system, we introduce a dynamic attentive convolutional neural network to learn fixed-size representations from the variable-length distance maps and a self-attentional sequential model to automatically extract semantic features from the linear notations. Extensive experiments demonstrate that our model obtains competitive performance against state-of-the-art baselines on the directory of useful decoys, enhanced (DUD-E), human and BindingDB benchmark datasets. Attention visualization further provides biological interpretation by highlighting the relevant regions of both protein and drug molecules.
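As an illustration of the protein representation described in the abstract, the following minimal sketch computes a two-dimensional distance map from residue coordinates. It assumes one representative atom per residue (for example Cα) and plain Euclidean distances; the function name `distance_map` and the toy coordinates are hypothetical, and this is not the authors' preprocessing code, which may use different atoms, cut-offs or transformations.

```python
import numpy as np


def distance_map(coords: np.ndarray) -> np.ndarray:
    """Return the (L, L) matrix of pairwise Euclidean distances.

    coords: array of shape (L, 3), one 3D point per residue
    (assumed here to be a single representative atom such as C-alpha).
    """
    # Broadcasting builds all pairwise difference vectors: shape (L, L, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    # Euclidean norm over the last axis gives the L x L distance map.
    return np.sqrt((diff ** 2).sum(axis=-1))


# Toy example with three residues (coordinates in angstroms, made up).
toy_coords = np.array(
    [[0.0, 0.0, 0.0],
     [3.0, 4.0, 0.0],
     [0.0, 0.0, 12.0]]
)
toy_map = distance_map(toy_coords)
print(toy_map.shape)  # the map is square, one row and column per residue
```

The map is symmetric with a zero diagonal, and its size grows with the protein length L, which is why the abstract refers to learning fixed-size representations from variable-length distance maps.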

**Fig. 1: The framework of the proposed DrugVQA model.**

**Fig. 2: Performance comparisons of our proposed method and baselines on seen and unseen protein targets from the BindingDB dataset.**

**Fig. 3: Importance visualization of pocket and ligand pairs.**

Data availability

All data used in this paper are publicly available and can be accessed at http://dude.docking.org for the DUD-E dataset, https://github.com/IBMInterpretableDTIP for the BindingDB-IBM dataset, https://github.com/masashitsubaki/CPI_prediction/tree/master/dataset for the human dataset and https://www.rcsb.org for the protein crystal structures.

Code availability

Demo, instructions and code for DrugVQA are available at https://github.com/prokia/drugVQA.

Change history

11 August 2020
A Correction to this paper has been published: https://doi.org/10.1038/s42256-020-0224-z

Acknowledgements

The work was supported in part by the National Key R&D Program of China (2018YFC0910500), the GD Frontier and Key Tech Innovation Program (2018B010109006 and 2019B020228001), the National Natural Science Foundation of China (61772566, U1611261, 81801132 and 81903540) and the programme for Guangdong Introducing Innovative and Entrepreneurial Teams (2016ZT06D211).

Author information

These authors contributed equally: Shuangjia Zheng and Yongjian Li.

Authors and Affiliations

School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, China
Shuangjia Zheng & Jun Xu
National Supercomputer Center in Guangzhou, Sun Yat-sen University, Guangzhou, China
Shuangjia Zheng, Yongjian Li, Sheng Chen & Yuedong Yang

Contributions

S.Z., Y.L. and Y.Y. contributed to the concept and implementation. S.Z. and Y.L. co-designed the experiments. S.Z. and Y.L. were responsible for programming. All authors contributed to the interpretation of results. S.Z. and Y.Y. wrote the manuscript. All authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Jun Xu or Yuedong Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

42256_2020_152_MOESM1_ESM.pdf

Supplementary dataset details, neural network training and performance details, visualization details, Supplementary Figs. 1–5 and Table 1.

About this article

Cite this article

Zheng, S., Li, Y., Chen, S. et al. Predicting drug–protein interaction using quasi-visual question answering system. Nat Mach Intell 2, 134–140 (2020). https://doi.org/10.1038/s42256-020-0152-y

Received: 19 September 2019
Accepted: 13 January 2020
Published: 14 February 2020
Issue Date: 01 February 2020
DOI: https://doi.org/10.1038/s42256-020-0152-y
