BIONIC: biological network integration using convolutions

Forster, Duncan T.; Li, Sheena C.; Yashiroda, Yoko; Yoshimura, Mami; Li, Zhijian; Isuhuaylas, Luis Alberto Vega; Itto-Nakama, Kaori; Yamanaka, Daisuke; Ohya, Yoshikazu; Osada, Hiroyuki; Wang, Bo; Bader, Gary D.; Boone, Charles

doi:10.1038/s41592-022-01616-x

Article
Published: 03 October 2022

Nature Methods volume 19, pages 1250–1261 (2022)

Abstract

Biological networks constructed from varied data can be used to map cellular function, but each data type has limitations. Network integration promises to address these limitations by combining and automatically weighting input information to obtain a more accurate and comprehensive representation of the underlying biology. We developed a deep learning-based network integration algorithm that incorporates a graph convolutional network framework. Our method, BIONIC (Biological Network Integration using Convolutions), learns features that contain substantially more functional information compared to existing approaches. BIONIC has unsupervised and semisupervised learning modes, making use of available gene function annotations. BIONIC is scalable in both size and quantity of the input networks, making it feasible to integrate numerous networks on the scale of the human genome. To demonstrate the use of BIONIC in identifying new biology, we predicted and experimentally validated essential gene chemical–genetic interactions from nonessential gene profiles in yeast.

**Fig. 2: Comparison of BIONIC integration to three input networks.**

**Fig. 3: Comparison of BIONIC to existing integration approaches.**

**Fig. 4: Supervised performance of BIONIC compared with an existing supervised integration approach.**

**Fig. 5: Network quantity and network size performance comparison across integration methods.**

**Fig. 6: BIONIC essential gene chemical–genetic interaction predictions.**

Data availability

All data, standards, BIONIC yeast features and chemical–genetic interaction data are available in the following Figshare repository: https://figshare.com/projects/BIONIC_Biological_Network_Integration_using_Convolutions/122585. There are no restrictions on the data. Source data are provided with this paper.

Code availability

The BIONIC code is available at https://github.com/bowang-lab/BIONIC (ref. 77). Code to reproduce the main figure analyses (Figs. 2–6) is available at https://github.com/duncster94/BIONIC-analyses (ref. 78), and a library implementing the coannotation prediction, module detection and gene function prediction evaluations is available at https://github.com/duncster94/BIONIC-evals (ref. 79). The BIONIC integrated yeast features (PEG features) can be explored at https://bionicviz.com.

References

Fraser, A. G. & Marcotte, E. M. A probabilistic view of gene function. Nat. Genet. 36, 559 (2004).
Malod-Dognin, N. et al. Towards a data-integrated cell. Nat. Commun. 10, 805 (2019).
Wang, P., Gao, L., Hu, Y. & Li, F. Feature related multi-view nonnegative matrix factorization for identifying conserved functional modules in multiple biological networks. BMC Bioinf. 19, 394 (2018).
Argelaguet, R. et al. Multi-omics factor analysis—a framework for unsupervised integration of multi-omics data sets. Mol. Syst. Biol. 14, e8124 (2018).
Mostafavi, S., Ray, D., Warde-Farley, D., Grouios, C. & Morris, Q. GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function. Genome Biol. 9, S4 (2008).
Wang, B. et al. Similarity network fusion for aggregating data types on a genomic scale. Nat. Methods 11, 333 (2014).
Cho, H. et al. Compact integration of multi-network topology for functional analysis of genes. Cell Syst. 3, 540–548.e5 (2016).
Huttenhower, C., Hibbs, M., Myers, C. & Troyanskaya, O. G. A scalable method for integration and functional analysis of multiple microarray datasets. Bioinformatics 22, 2890–2897 (2006).
von Mering, C. et al. STRING: a database of predicted functional associations between proteins. Nucleic Acids Res. 31, 258–261 (2003).
Alexeyenko, A. & Sonnhammer, E. L. L. Global networks of functional coupling in eukaryotes from comprehensive data integration. Genome Res. 19, 1107–1116 (2009).
Gligorijević, V., Barot, M. & Bonneau, R. deepNF: deep network fusion for protein function prediction. Bioinformatics 34, 3873–3881 (2018).
Perozzi, B., Al-Rfou, R. & Skiena, S. DeepWalk: online learning of social representations. In Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (eds Macskassy, S. & Perlich, C.) 701–710 (Association for Computing Machinery, 2014).
Grover, A. & Leskovec, J. node2vec: scalable feature learning for networks. KDD 2016, 855–864 (2016).
Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In Proc. International Conference on Learning Representations (2017).
Defferrard, M., Bresson, X. & Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. In Proc. Advances in Neural Information Processing Systems (NIPS 2016) Vol. 29, 3844-3852 (Curran Associates, Inc., 2016).
Hamilton, W., Ying, Z. & Leskovec, J. Inductive representation learning on large graphs. In Proc. Advances in Neural Information Processing Systems (NIPS 2017) Vol. 30, 1024-1034 (Curran Associates, Inc., 2017).
Veličković, P. et al. Graph attention networks. In Proc. International Conference on Learning Representations (2018).
Piotrowski, J. S. et al. Functional annotation of chemical libraries across diverse biological processes. Nat. Chem. Biol. 13, 982–993 (2017).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 770–778 (IEEE, 2016).
Krogan, N. J. et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440, 637–643 (2006).
Hu, Z., Killion, P. J. & Iyer, V. R. Genetic reconstruction of a functional transcriptional regulatory network. Nat. Genet. 39, 683–687 (2007).
Costanzo, M. et al. A global genetic interaction network maps a wiring diagram of cellular function. Science 353, aaf1420 (2016).
Myers, C. L. et al. Discovery of biological networks from diverse functional genomic data. Genome Biol. 6, R114 (2005).
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Vo, T. V. et al. A proteome-wide fission yeast interactome reveals network evolution principles from yeasts to human. Cell 164, 310–323 (2016).
Martín, R. et al. A PP2A-B55-mediated crosstalk between TORC1 and TORC2 regulates the differentiation response in fission yeast. Curr. Biol. 27, 175–188 (2017).
Ryan, C. J. et al. Hierarchical modularity and the evolution of genetic interactomes across species. Mol. Cell 46, 691–704 (2012).
Ashburner, M. et al. Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000).
Orchard, S. et al. The MIntAct project–IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Res. 42, D358–363 (2014).
Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000).
Fernandez, C. F., Pannone, B. K., Chen, X., Fuchs, G. & Wolin, S. L. An Lsm2-Lsm7 complex in Saccharomyces cerevisiae associates with the small nucleolar RNA snR5. Mol. Biol. Cell 15, 2842–2852 (2004).
Chowdhury, A., Mukhopadhyay, J. & Tharun, S. The decapping activator Lsm1p-7p-Pat1p complex has the intrinsic ability to distinguish between oligoadenylated and polyadenylated RNAs. RNA 13, 998–1016 (2007).
Wilson, J. D., Baybay, M., Sankar, R., Stillman, P. & Popa, A. M. Analysis of population functional connectivity data via multilayer network embeddings. Netw. Sci. 9, 99–122 (2021).
Huttlin, E. L. et al. Architecture of the human interactome defines protein communities and disease networks. Nature 545, 505 (2017).
Huttlin, E. L. et al. The bioplex network: a systematic exploration of the human interactome. Cell 162, 425–440 (2015).
Hein, M. Y. et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell 163, 712–723 (2015).
Rolland, T. et al. A proteome-scale map of the human interactome network. Cell 159, 1212–1226 (2014).
Roemer, T. & Boone, C. Systems-level antimicrobial drug and drug synergy discovery. Nat. Chem. Biol. 9, 222–231 (2013).
Ayscough, K. R. et al. High rates of actin filament turnover in budding yeast and roles for actin in establishment and maintenance of cell polarity revealed using the actin inhibitor latrunculin-A. J. Cell Biol. 137, 399–416 (1997).
Persaud, R. et al. Clionamines stimulate autophagy, inhibit Mycobacterium tuberculosis survival in macrophages, and target Pik1. Cell Chem. Biol. 29, 870–882 (2021).
Simpkins, S. W. et al. Using BEAN-counter to quantify genetic interactions from multiplexed barcode sequencing experiments. Nat. Protoc. 14, 415–440 (2019).
Kato, N., Takahashi, S., Nogawa, T., Saito, T. & Osada, H. Construction of a microbial natural product library for chemical biology studies. Curr. Opin. Chem. Biol. 16, 101–108 (2012).
Protchenko, O., Rodriguez-Suarez, R., Androphy, R., Bussey, H. & Philpott, C. C. A screen for genes of heme uptake identifies the FLC family required for import of FAD into the endoplasmic reticulum. J. Biol. Chem. 281, 21445–21457 (2006).
Kitagaki, H., Wu, H., Shimoi, H. & Ito, K. Two homologous genes, DCW1 (YKL046c) and DFG5, are essential for cell growth and encode glycosylphosphatidylinositol (GPI)-anchored membrane proteins required for cell wall biogenesis in Saccharomyces cerevisiae. Mol. Microbiol. 46, 1011–1022 (2002).
Ram, A. F. et al. Loss of the plasma membrane-bound protein Gas1p in Saccharomyces cerevisiae results in the release of beta1,3-glucan into the medium and induces a compensation mechanism to ensure cell wall integrity. J. Bacteriol. 180, 1418–1424 (1998).
Tomishige, N. et al. Mutations that are synthetically lethal with a gas1Delta allele cause defects in the cell wall of Saccharomyces cerevisiae. Mol. Genet. Genomics 269, 562–573 (2003).
Ragni, E., Fontaine, T., Gissi, C., Latgè, J. P. & Popolo, L. The Gas family of proteins of Saccharomyces cerevisiae: characterization and evolutionary analysis. Yeast 24, 297–308 (2007).
Neiman, A. M., Mhaiskar, V., Manus, V., Galibert, F. & Dean, N. Saccharomyces cerevisiae HOC1, a suppressor of pkc1, encodes a putative glycosyltransferase. Genetics 145, 637–645 (1997).
Simpkins, S. W. et al. Predicting bioprocess targets of chemical compounds through integration of chemical-genetic and genetic interactions. PLoS Comput. Biol. 14, e1006532 (2018).
Pasikowska, M., Palamarczyk, G. & Lehle, L. The essential endoplasmic reticulum chaperone Rot1 is required for protein N- and O-glycosylation in yeast. Glycobiology 22, 939–947 (2012).
Machi, K. et al. Rot1p of Saccharomyces cerevisiae is a putative membrane protein required for normal levels of the cell wall 1,6-beta-glucan. Microbiology 150, 3163–3173 (2004).
Levinson, J. N., Shahinian, S., Sdicu, A.-M., Tessier, D. C. & Bussey, H. Functional, comparative and cell biological analysis of Saccharomyces cerevisiae Kre5p. Yeast 19, 1243–1259 (2002).
Azuma, M., Levinson, J. N., Pagé, N. & Bussey, H. Saccharomyces cerevisiae Big1p, a putative endoplasmic reticulum membrane protein required for normal levels of cell wall beta-1,6-glucan. Yeast 19, 783–793 (2002).
Roemer, T., Delaney, S. & Bussey, H. SKN1 and KRE6 define a pair of functional homologs encoding putative membrane proteins involved in beta-glucan synthesis. Mol. Cell. Biol. 13, 4039–4048 (1993).
Kubo, K. et al. Jerveratrum-type steroidal alkaloids inhibit β-1,6-glucan biosynthesis in fungal cell walls. Microbiol. Spectr. 10, e0087321 (2022).
Usaj, M. et al. TheCellMap.org: a web-accessible database for visualizing and mining the global yeast genetic interaction network. G3 7, 1539–1549 (2017).
Elnaggar, A. et al. ProtTrans: towards cracking the language of life's code through self-supervised deep learning and high performance computing. IEEE Trans. Pattern Anal. Mach. Intell. https://doi.org/10.1109/TPAMI.2021.3095381 (2021).
Almagro Armenteros, J. J., Sønderby, C. K., Sønderby, S. K., Nielsen, H. & Winther, O. DeepLoc: prediction of protein subcellular localization using deep learning. Bioinformatics 33, 3387–3395 (2017).
Mattiazzi Usaj, M. et al. Systematic genetics and single‐cell imaging reveal widespread morphological pleiotropy and cell‐to‐cell variability. Mol. Syst. Biol. 16, 30 (2020).
Paszke, A. et al. Automatic differentiation in PyTorch. in NIPS Autodiff Workshop (2017).
Fey, M. & Lenssen, J. E. Fast Graph Representation Learning with PyTorch Geometric. in ICLR 2019 Workshop on Representation Learning on Graphs and Manifolds (2019).
Kingma, D. P. & Ba, J. Adam: A Method for Stochastic Optimization. in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings (eds. Bengio, Y. & LeCun, Y.) (2015).
Stark, C. et al. BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 34, D535–D539 (2006).
Hibbs, M. A. et al. Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics 23, 2692–2699 (2007).
Myers, C. L., Barrett, D. R., Hibbs, M. A., Huttenhower, C. & Troyanskaya, O. G. Finding function: evaluation methods for functional genomic data. BMC Genomics 7, 187 (2006).
Aggarwal, C.C., Hinneburg, A., Keim, D.A. (2001). On the Surprising Behavior of Distance Metrics in High Dimensional Space. In: Van den Bussche, J., Vianu, V. (eds) Database Theory — ICDT 2001. ICDT 2001. Lecture Notes in Computer Science, vol 1973. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44503-X_27
Davis, J. & Goadrich, M. The relationship between Precision-Recall and ROC curves. In Proc. 23rd International Conference on Machine Learning: June 25–29, 2006; Pittsburgh, Pennsylvania (eds Cohen, W. W. & Moore, A.) 233–240 (ACM Press, 2006).
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
Friedman, J. H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Platt, J. C. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. in Advances in Large Margin Classifiers (eds Smola, A. J. et al.) 61-74 (MIT Press, 1999).
Deshpande, R. et al. Efficient strategies for screening large-scale genetic interaction networks. Preprint at bioRxiv https://doi.org/10.1101/159632 (2017).
Beyer, H. Tukey & John, W. Exploratory Data Analysis. Addison-Wesley Publishing Company Reading, Mass.—Menlo Park, cal., London, Amsterdam, Don Mills, Ontario, Sydney 1977, XVI, 688S. Biom. J. 23, 413–414 (1981).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300 (1995).
Kitamura, A., Someya, K., Hata, M., Nakajima, R. & Takemura, M. Discovery of a small-molecule inhibitor of β-1,6-glucan synthesis. Antimicrob. Agents Chemother. 53, 670–677 (2009).
Yamanaka, D. et al. Development of a novel β-1,6-glucan-specific detection system using functionally-modified recombinant endo-β-1,6-glucanase. J. Biol. Chem. 295, 5362–5376 (2020).
Forster, D. Biological Network Integration using Convolutions (BIONIC) v.0.2.4. Zenodo https://doi.org/10.5281/zenodo.6762584 (2022).
Forster, D. BIONIC analyses v.0.1.0. Zenodo https://doi.org/10.5281/zenodo.6762596 (2022).
Forster, D. BIONIC evaluations (BIONIC-evals) v.0.1.0. Zenodo https://doi.org/10.5281/zenodo.6762602 (2022).

Acknowledgements

We thank B. Andrews, M. Costanzo and C. Myers for their insightful comments. We also thank M. Fey for adding important features to the PyTorch Geometric library for us. This work was supported by NRNB (US National Institutes of Health, National Center for Research Resources grant number P41 GM103504). Funding for continued development and maintenance of Cytoscape is provided by the US National Human Genome Research Institute under award number HG009979. This work was also supported by the Canadian Institutes of Health Research Foundation grant number FDN-143264, US National Institutes of Health grant number R01HG005853 and joint funding by Genome Canada (OGI-163) and the Ministry of Economic Development, Job Creation and Trade, under the program Bioinformatics and Computational Biology. This work was supported by the National Research Council of Canada through the AI for Design program. This work was supported by CIFAR AI Chair programs. This work was also supported by JSPS KAKENHI grant numbers JP15H04483 (C.B. and Y.O.), JP17H06411 (C.B. and Y.Y.), JP18K14351 (K.I.-N.), JP19H03205 (Y.O.), JP20K07487 (D.Y.) and a RIKEN Foreign Postdoctoral Fellowship (S.C.L.). This research was enabled in part by support provided by SciNet and the Digital Research Alliance of Canada.

Author information

These authors contributed equally: Duncan T. Forster, Sheena C. Li.

Authors and Affiliations

Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
Duncan T. Forster, Gary D. Bader & Charles Boone
The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
Duncan T. Forster, Sheena C. Li, Zhijian Li, Luis Alberto Vega Isuhuaylas, Gary D. Bader & Charles Boone
Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
Duncan T. Forster & Bo Wang
RIKEN Center for Sustainable Resource Science, Wako, Saitama, Japan
Sheena C. Li, Yoko Yashiroda, Mami Yoshimura, Hiroyuki Osada & Charles Boone
Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
Kaori Itto-Nakama & Yoshikazu Ohya
Laboratory for Immunopharmacology of Microbial Products, School of Pharmacy, Tokyo University of Pharmacy and Life Sciences, Hachioji, Tokyo, Japan
Daisuke Yamanaka
Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Tokyo, Japan
Yoshikazu Ohya
Peter Munk Cardiac Center, University Health Network, Toronto, Ontario, Canada
Bo Wang
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
Bo Wang & Gary D. Bader
Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
Bo Wang
The Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
Gary D. Bader
Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
Gary D. Bader

Contributions

D.T.F. conceived and developed the method and computational experiments. S.C.L. and M.Y. performed the chemical–genetic screens. Z.L. provided resources for the TS mutant collection. L.A.V.I. preprocessed and provided the chemical–genetic data. H.O. provided the chemical matter and information about the screened compounds. S.C.L. and Z.L. constructed the drug-hypersensitive TS mutant collection. K.I.-N., D.Y. and H.O. performed the jervine biochemical validation. D.T.F., S.C.L., Y.Y., Y.O., B.W., G.D.B. and C.B. wrote the manuscript. B.W., G.D.B. and C.B. conceived and supervised the project.

Corresponding authors

Correspondence to Bo Wang, Gary D. Bader or Charles Boone.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Kevin Yuk-Lap Yip and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lin Tang, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Detailed view of individual BIONIC network encoder.

A more detailed view of an individual network encoder, including residual connections. A network specific graph convolutional network is used to encode the input network for increasing neighborhood sizes. The first GCN in the sequence learns features for a given node based on the node’s immediate neighborhood (1st order features). The next GCN learns features based on the node’s second order neighborhood (2nd order features), and so on. The node feature matrices learned by each GCN pass are summed together to create the final learned, network-specific features. Summing the outputs of the various GCNs in this way creates residual connections, allowing features from multiple neighborhood sizes to generate the final learned features, rather than just the final neighborhood size. This figure shows three GCN layers, but BIONIC uses the same pattern of connections for any number of GCN layers. Note that the GCN layers for a given encoder share their weights, so in effect, there is a single GCN layer for each encoder.
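The encoder pattern described in this caption can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the BIONIC implementation: the row-normalised adjacency with self-loops, the ReLU activation, the square weight matrix and the names `encode`, `normalize_adjacency` and `num_passes` are all hypothetical choices made for the example. It shows only the two ideas stated above: one weight matrix is reused for every pass, and the outputs of all passes are summed.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def relu(m: Matrix) -> Matrix:
    """Element-wise max(0, x); the activation is an assumption of this sketch."""
    return [[max(0.0, v) for v in row] for row in m]


def normalize_adjacency(adj: Matrix) -> Matrix:
    """Add self-loops, then row-normalise (an assumed, simple normalisation)."""
    n = len(adj)
    looped = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in looped]


def encode(adj: Matrix, features: Matrix, weight: Matrix, num_passes: int = 3) -> Matrix:
    """Apply one shared-weight graph convolution repeatedly and sum every pass.

    Pass k mixes information from the k-th order neighbourhood; summing the
    per-pass outputs plays the role of the residual connections.
    """
    if num_passes < 1:
        raise ValueError("num_passes must be at least 1")
    a_hat = normalize_adjacency(adj)
    hidden = features
    summed = None
    for _ in range(num_passes):
        # The same `weight` is used on every pass (shared weights).
        hidden = relu(matmul(matmul(a_hat, hidden), weight))
        summed = hidden if summed is None else add(summed, hidden)
    return summed


# Toy path graph 0 - 1 - 2 with one-hot input features and an identity weight.
path = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
first_order = encode(path, identity, identity, num_passes=1)
up_to_second = encode(path, identity, identity, num_passes=2)
```

In the toy graph, node 0 carries no signal from node 2 after one pass, because node 2 is two hops away, but it does after two passes. Summing the passes keeps the first-order signal alongside the second-order one.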

Extended Data Fig. 2 Comparison of individual network features produced by BIONIC.

A comparison of individual networks (denoted ‘Net’), their corresponding features encoded using the unsupervised BIONIC (denoted ‘BIONIC’), as well as the BIONIC integration of these networks (denoted ‘GI+COEX+PPI BIONIC’). BP = Biological Processes, GI = Genetic Interaction, COEX = Co-expression, PPI = Protein-protein Interaction. These are the same networks and evaluations used in Fig. 2. Data are presented as mean values. Error bars indicate the 95% confidence interval for n = 10 independent samples.

Extended Data Fig. 3 Dynamics of BIONIC feature space through training.

Comparison of pairwise gene similarities (cosine similarity in the case of BIONIC, direct binary adjacency in the case of the network), as defined by IntAct Complexes for known co-complex relationships (positive pairs) and no co-complex relationships (negative pairs), between a yeast PPI network (as used in the Fig. 2 analyses) and the unsupervised BIONIC features produced from this network. The BIONIC similarities are shown throughout the training process (epochs), whereas the input network is constant so its pairwise similarities do not change. ‘Network’ denotes the input PPI network, ‘BIONIC’ denotes the features learned from this network using BIONIC.
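As a rough illustration of the similarity measure named in this caption, the sketch below computes cosine similarity between gene feature vectors and averages it over positive and negative pairs. The gene names, feature values and the helper `mean_pair_similarity` are hypothetical and exist only for the example; they are not taken from the paper's data.

```python
import math
from typing import Dict, Iterable, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pair_similarity(
    features: Dict[str, Sequence[float]],
    pairs: Iterable[Tuple[str, str]],
) -> float:
    """Average cosine similarity over a collection of gene pairs."""
    scores = [cosine_similarity(features[a], features[b]) for a, b in pairs]
    return sum(scores) / len(scores)


# Hypothetical learned features for three genes.
toy_features = {"geneA": [1.0, 0.0], "geneB": [2.0, 0.0], "geneC": [0.0, 3.0]}
positive_pairs = [("geneA", "geneB")]  # assumed co-complex pair
negative_pairs = [("geneA", "geneC")]  # assumed non-co-complex pair

positive_mean = mean_pair_similarity(toy_features, positive_pairs)
negative_mean = mean_pair_similarity(toy_features, negative_pairs)
```

Tracking these two averages at each training epoch would give the kind of curve the caption describes. For the input network, the corresponding quantity is simply the binary adjacency value of each pair, which does not change during training.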

Extended Data Fig. 4 Coverage of BIONIC and input network captured modules.

Coverage of functional gene modules by individual networks and the unsupervised BIONIC integration of these networks (denoted BIONIC), as determined by a parameter optimized module detection analysis where the clustering parameters were optimized for each module individually. The number of captured modules is reported for a range of overlap scores (Jaccard threshold). Higher threshold indicates greater correspondence between the clusters obtained from the dataset and their respective modules given by the standard. PPI = protein-protein interaction. These are the same networks and BIONIC features as Fig. 2.

Extended Data Fig. 5 Captured modules comparison for BIONIC and input networks for optimal clustering parameters.

Known protein complexes (as defined by the IntAct standard) captured by individual networks and the unsupervised BIONIC integration of these networks (denoted BIONIC). Hierarchical clustering was performed on the datasets and resulting clusters were compared to known IntAct complexes and scored for set overlap using the Jaccard score (ranging from 0 to 1). The clustering algorithm parameters were optimized for each module individually, unlike the analysis in Fig. 2 where the clustering parameters were optimized for the standard as a whole. Each point is a protein complex, as in Fig. 2c. The dashed line indicates instances where the given data sets achieve the same score for a given module. Histograms indicate the distribution of overlap (Jaccard) scores for the given dataset, and the labelled dashed line indicates the mean of this distribution. The individual modules shown here as well as for the KEGG Pathways and IntAct Complexes module standards can be found in Supplementary Data File 4. The LSM2-7 complex is indicated by the arrows. PPI = protein-protein interaction. This analysis uses the same networks and BIONIC features as Fig. 2.
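The overlap scoring used in the module detection captions above can be sketched as follows. This is a simplified illustration, assuming each module is scored against its best-matching cluster and counted as captured when that score reaches a chosen Jaccard threshold. The gene identifiers, cluster memberships and function names are hypothetical, and the clustering step itself is not shown.

```python
from typing import Dict, FrozenSet, Iterable, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_overlap(module: Set[str], clusters: Iterable[Set[str]]) -> float:
    """Highest Jaccard score between one module and any cluster."""
    return max((jaccard(module, cluster) for cluster in clusters), default=0.0)


def captured_modules(
    modules: Dict[str, Set[str]],
    clusters: Iterable[Set[str]],
    threshold: float,
) -> FrozenSet[str]:
    """Names of modules whose best cluster overlap meets the threshold."""
    cluster_list = list(clusters)
    return frozenset(
        name
        for name, genes in modules.items()
        if best_overlap(genes, cluster_list) >= threshold
    )


# Hypothetical standard with two modules, and clusters from some dataset.
toy_modules = {"complex_1": {"g1", "g2", "g3"}, "complex_2": {"g6", "g7"}}
toy_clusters = [{"g1", "g2"}, {"g3", "g4", "g5"}, {"g6", "g7"}]

score_complex_1 = best_overlap(toy_modules["complex_1"], toy_clusters)
captured_at_half = captured_modules(toy_modules, toy_clusters, threshold=0.5)
captured_at_strict = captured_modules(toy_modules, toy_clusters, threshold=0.9)
```

Sweeping `threshold` from 0 to 1 and counting the captured modules at each value gives a coverage curve of the kind reported in Extended Data Fig. 4.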

Extended Data Fig. 6 Interpretability of BIONIC feature space.

Co-annotation evaluations of the unsupervised BIONIC features subset to different clusters of feature dimensions (denoted ‘Cluster’). The number of feature dimensions for each cluster is given in parenthesis. The performance of the original BIONIC features (denoted BIONIC (512)) is also displayed. Data are presented as mean values. Bars indicate 95% confidence interval for n = 10 independent samples.

Extended Data Fig. 7 Integration method performance for yeast-two-hybrid network inputs.

Performance comparison of 5 yeast-two-hybrid network integrations across functional standards, evaluation types and unsupervised integration methods. Data are presented as mean values. Bars indicate 95% confidence interval for n = 10 independent samples. BP = Biological Process, multi-n2v = multi-node2vec.

Extended Data Fig. 8 Effects of label poisoning on BIONIC semi-supervised and unsupervised performance.

Semi-supervised BIONIC comparisons. a) A label poisoning experiment, where progressively more permutation noise is added to the label sets the semi-supervised BIONIC is trained on. ‘Noise’ indicates the proportion of permutation noise applied (multiply by 100 for percentages). Data are presented as mean values. Bars indicate 95% confidence interval for n = 10 independent samples. b) UMAP plots comparing the embedding space of the TFIID complex and the 100 nearest neighbors of this complex for unsupervised and semi-supervised BIONIC over a range of label noise values. SS = average silhouette score of TFIID members.
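One way to picture the label poisoning in panel a is the sketch below. It assumes that "permutation noise" means selecting a given proportion of genes and shuffling their labels among themselves; the exact procedure used in the paper is not specified in this caption, so the function `poison_labels`, its parameters and the toy labels are hypothetical.

```python
import random
from typing import Dict, Hashable


def poison_labels(
    labels: Dict[str, Hashable],
    noise: float,
    seed: int = 0,
) -> Dict[str, Hashable]:
    """Return a copy of `labels` where a `noise` fraction of genes have permuted labels.

    The selected genes keep the same multiset of labels, but the assignment
    of labels to genes is shuffled (an assumed reading of permutation noise).
    """
    if not 0.0 <= noise <= 1.0:
        raise ValueError("noise must be between 0 and 1")
    rng = random.Random(seed)
    genes = sorted(labels)
    n_selected = round(noise * len(genes))
    selected = rng.sample(genes, n_selected)
    shuffled = [labels[g] for g in selected]
    rng.shuffle(shuffled)
    poisoned = dict(labels)
    for gene, label in zip(selected, shuffled):
        poisoned[gene] = label
    return poisoned


# Hypothetical labels: 12 genes spread over 3 classes.
toy_labels = {f"g{i}": i % 3 for i in range(12)}
clean = poison_labels(toy_labels, noise=0.0)
half_poisoned = poison_labels(toy_labels, noise=0.5)
n_changed = sum(half_poisoned[g] != toy_labels[g] for g in toy_labels)
```

Because the labels are only permuted, the class sizes stay the same at every noise level. At most `noise` times the number of genes can end up with a different label, and fewer will if a shuffled gene happens to receive its original class.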

Extended Data Fig. 9 Computational scalability of BIONIC.

Graphics processing unit (GPU) memory usage in gigabytes (left) and average wall clock epoch time in minutes (right) for a range of network sizes and number of networks. GB = gigabyte, min = minutes. Gray squares indicate a scenario where BIONIC exceeded the maximum memory of the GPU and failed to complete. The experiments were run on a Titan Xp GPU with a 2.4 GHz Intel Xeon CPU and 32 GB of system memory.

Extended Data Fig. 10 β-1,6-glucan levels in yeast strains.

The amount of glucan per cell was calculated using pustulan as a standard. Data are presented as mean values. Error bars indicate standard deviation for n = 3 biologically independent samples. kre6Δ compared to wild type p-value = 0.01473, jervine compared to wild type p-value = 0.01520. * Significant difference (p-value < 0.05 after Bonferroni correction, Welch’s one-sided t-test).

Supplementary information

Supplementary Information

Supplementary Figs. 1–7 and Notes 1–4.

Reporting Summary

Supplementary Data 1

Hyperparameter optimization results. Hyperparameter optimization results across integration methods integrating three S. pombe networks. The chosen (best) hyperparameter combinations for each method are highlighted.

Supplementary Data 2

Integrated network details. Publication, gene count, edge count and experimental type for each yeast network and each human network used in Figs. 2–6. Rows in yellow indicate the three yeast networks used in Figs. 2–4 and 6 integrations.

Supplementary Data 3

Evaluation standards details. Gene count, coannotation count, module count and class count details for each standard used in the Figs. 2–5 evaluations.

Supplementary Data 4

Module detection results. Overlap of standard-optimized clusters obtained from the Fig. 2c module detection analysis for networks as well as integration methods. Module standards are IntAct complexes, KEGG pathways and GO Biological Processes.

Supplementary Data 5

Extended Data Figs. 4–5 Module detection results. Overlap of known per-module-optimized clusters obtained from the Figs. 2 and 3 input networks and integration methods, with IntAct complex, KEGG pathway and GO Biological Process modules.

Supplementary Data 6

50 compound TS allele screen results. Files containing the TS allele chemical–genetic scores and IQR scores, screened against 50 compounds (at multiple concentrations) that were selected by BIONIC.

Supplementary Data 7

Essential gene compound sensitivity predictions. Essential yeast gene compound sensitivity predictions for 50 selected compounds using BIONIC.

Supplementary Data 8

Integrated BIONIC features. Learned BIONIC features from yeast networks (protein–protein interaction, coexpression and genetic interaction) integrated and used in Figs. 2, 3 and 6.

Supplementary Data 9

Evaluation standards. Yeast evaluation standards for coannotation prediction, module detection and gene function prediction used in Figs. 2a, 3a, 4 and 5a as well as the human coannotation standard used in Fig. 5b.

Source data

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Forster, D.T., Li, S.C., Yashiroda, Y. et al. BIONIC: biological network integration using convolutions. Nat Methods 19, 1250–1261 (2022). https://doi.org/10.1038/s41592-022-01616-x

Received: 04 December 2020
Accepted: 16 August 2022
Published: 03 October 2022
Issue Date: October 2022
DOI: https://doi.org/10.1038/s41592-022-01616-x

This article is cited by

Interpreting biologically informed neural networks for enhanced proteomic biomarker discovery and pathway analysis
- Erik Hartman
- Aaron M. Scott
- Johan Malmström
Nature Communications (2023)
