Predicting compound potency is a key task in drug design, and various machine learning methods have been proposed for this purpose. For example, graph neural network (GNN) models have been reported that predict ligand affinity from graph representations of protein–ligand interactions. However, the accuracy and relevance of GNN affinity predictions remain controversial, largely because models trained on very different data volumes have yielded similar correlations with experimental data, whereas different partitions of training and test data have produced notable differences in model performance. In light of this debate, Jürgen Bajorath and colleagues systematically predicted protein–ligand affinities using different types of GNNs and applied explainable artificial intelligence (XAI) to rationalize the predictions.
Six GNNs with distinct architectures were considered: a graph convolutional network (GCN), a graph attention network (GAT), a graph isomorphism network (GIN), an edge-feature-aware variant of GIN (GINE), a GNN with a generalized aggregation function (GraphSAGE), and a GNN using the graph convolutional operator (GC-GNN). EdgeSHAPer, an XAI method for GNNs, was employed to quantify the importance of individual edges for GNN learning, making it possible to determine which parts (or subgraphs) of the interaction graphs were responsible for model decisions. Comparisons across the GNNs showed little difference between the predicted affinities from the individual architectures. However, analysis of the proportions of protein, ligand and interaction edges among the top 25 edges showed that ligand memorization dominated predictions across affinity levels; conversely, protein information was not found to contribute substantially to the predictions. Combined with a further assessment using GNNExplainer, the results demonstrated that, while GNNs may not comprehensively account for protein–ligand interactions and physical reality, depending on the model they can balance ligand memorization with learning of interaction patterns. These results help to clarify what GNNs actually learn in protein–ligand affinity prediction and should help to address some of the ongoing debate about their accuracy.
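The core idea behind EdgeSHAPer, attributing a model's output to individual graph edges via Shapley values, can be illustrated with a minimal sketch. The toy graph, edge labels and scoring function below are hypothetical stand-ins (this is not the published EdgeSHAPer code, which perturbs edges of a real trained GNN); the sketch only shows the Monte Carlo permutation scheme used to estimate each edge's Shapley value, from which a "top edges" ranking like the one analysed in the study could be derived:

```python
# Minimal Monte Carlo Shapley-value sketch for edge importance, in the
# spirit of EdgeSHAPer. The graph, edge labels and toy_model below are
# illustrative assumptions, not the authors' data or model.
import random

random.seed(0)

# Toy "interaction graph": each edge is tagged by its origin.
EDGES = [
    ("e1", "ligand"), ("e2", "ligand"), ("e3", "protein"),
    ("e4", "interaction"), ("e5", "interaction"),
]

def toy_model(present_edges):
    """Stand-in for a trained GNN: scores whichever edge subset is present.
    Assumed weights: interaction edges 2.0, ligand edges 1.0, protein 0.0."""
    weight = {"ligand": 1.0, "protein": 0.0, "interaction": 2.0}
    return sum(weight[kind] for name, kind in EDGES if name in present_edges)

def shapley_edge_importance(n_samples=2000):
    """Estimate each edge's Shapley value by averaging its marginal
    contribution over random orderings of the edges."""
    names = [name for name, _ in EDGES]
    phi = {name: 0.0 for name in names}
    for _ in range(n_samples):
        order = names[:]
        random.shuffle(order)
        present = set()
        prev_score = toy_model(present)
        for name in order:
            present.add(name)
            score = toy_model(present)
            phi[name] += score - prev_score  # marginal contribution
            prev_score = score
    return {name: total / n_samples for name, total in phi.items()}

phi = shapley_edge_importance()
# Ranking edges by phi gives the kind of "top edges" list whose
# protein/ligand/interaction composition the study analysed.
top_edges = sorted(phi, key=phi.get, reverse=True)
```

Because the toy scoring function is additive, the estimated values coincide with the assumed edge weights exactly; for a real GNN the score is non-additive and the Monte Carlo average is only an approximation, which is why sampling-based methods such as EdgeSHAPer are needed.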