Tools and machine learning (ML) formalizations for antibody binding prediction are scarce. Despite recent advances in protein or antibody structure modelling1,2, predicting antibody binding to an antigen remains extremely challenging, even for AlphaFold2 (refs. 3,4), which relies on the availability of evolutionary information. De novo prediction of new antibodies has performed poorly so far, and methods for adapting successful protein ML paradigms to small antibody datasets that lack evolutionary information5 are critically needed. Furthermore, the type and size of structural, sequence or affinity-based datasets required for accurate binding prediction are still unknown6, which leads to inefficient experimental data generation. Recent hybrid sequence and structure ML formalisms, such as binding-residue prediction7 or screening of compatible modelled antibody–antigen structures8, have suggested that antibody binding prediction is feasible with current types of datasets in a protein–protein interaction (PPI) setting, where unrelated binding pairs are shuffled to create non-binders. Zhang et al.9 demonstrate a new adaptive architecture to predict the neutralization landscape of many antibodies against many antigen variants. Such new approaches and ML formalizations are needed to learn the global binding landscape of antibody and antigen variants, and to move from the restricted neighbourhood of known binders towards generalizable binding prediction.
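The shuffling of unrelated pairs used in the PPI setting can be illustrated with a minimal sketch. The function name `shuffle_negatives` and the toy identifiers are hypothetical and are not taken from any of the cited methods; the sketch only assumes that every antibody–antigen pair absent from the list of known binders is treated as a non-binder.

```python
import random


def shuffle_negatives(binding_pairs, n_negatives, seed=0):
    """Sample presumed non-binders by re-pairing antibodies and antigens.

    Assumption: any (antibody, antigen) pair that is not a recorded binder
    is labelled negative. These pairs are never measured experimentally.
    """
    rng = random.Random(seed)
    known = set(binding_pairs)
    antibodies = sorted({ab for ab, _ in binding_pairs})
    antigens = sorted({ag for _, ag in binding_pairs})
    # Every cross pair that is not a recorded binder is a candidate negative.
    candidates = [
        (ab, ag) for ab in antibodies for ag in antigens if (ab, ag) not in known
    ]
    rng.shuffle(candidates)
    return candidates[:n_negatives]


# Toy identifiers, for illustration only.
positives = [("Ab1", "AgA"), ("Ab2", "AgB"), ("Ab3", "AgC")]
negatives = shuffle_negatives(positives, n_negatives=3)
```

Because the negatives are presumed rather than measured, a truly cross-reactive pair could be mislabelled as a non-binder, which is one reason measured neutralization data are attractive.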

Zhang et al. developed a graph convolutional network (GCN)-based architecture, termed DeepAAI, that can predict the neutralization capacity of antibodies that were never seen during training (Fig. 1). This capability differs from typical PPI prediction methods, which already include all known antibodies in both training and test datasets, albeit with different binding partners. Prediction takes the form of either binary classification or regression of the antibodies’ IC50 neutralization score (IC50 is a standard quantitative measure of the potency of a molecule in inhibiting a biochemical function). DeepAAI maps antibody sequences to nodes in a graph-based functional latent space, and a flexible GCN takes this graph as input. A new antibody can be assigned a new node, and the GCN, having been trained on the other nodes, predicts its properties, akin to a transfer learning scheme. Zhang et al. therefore address an important research question: given a latent space of the neutralization landscape of a set of antibodies against multiple antigens, is it possible to project new antibodies into this latent space and learn which targets they neutralize?
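The two prediction targets can be made concrete with a small sketch. The cutoff of 10 and the units (µg/ml) are illustrative assumptions rather than values reported here, and `ic50_targets` is a hypothetical helper; the only fact relied upon is that a lower IC50 means a more potent antibody.

```python
import math


def ic50_targets(ic50_ug_per_ml, threshold=10.0):
    """Return (binary_label, regression_target) for one IC50 measurement.

    Assumptions: IC50 is given in ug/ml and 'threshold' is an illustrative
    cutoff below which an antibody counts as neutralizing.
    """
    if ic50_ug_per_ml <= 0:
        raise ValueError("IC50 must be positive")
    # Classification target: 1 = neutralizing, 0 = non-neutralizing.
    label = 1 if ic50_ug_per_ml < threshold else 0
    # Regression target: log10 compresses the wide dynamic range of IC50.
    log_ic50 = math.log10(ic50_ug_per_ml)
    return label, log_ic50


potent_label, potent_log = ic50_targets(0.1)
weak_label, weak_log = ic50_targets(50.0)
```

The binary label discards potency information that the regression target retains, which is consistent with regression being the more demanding of the two tasks.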

Fig. 1: The adaptive DeepAAI ML workflow9 predicts the neutralization of unseen antibodies and learns a functional-based internal representation of antibodies.

The philosophy of Zhang et al.’s architecture lies in building an internal representation of antibodies (‘global features’) based on functional similarity in binding to predefined antigens. DeepAAI embeds each antibody sequence, from its k-mer composition or alignment properties (position-specific scoring matrix, PSSM), as one node in a graph. Edge weights quantify the relation between the antibodies’ embeddings. A graph convolutional network (GCN) analyses a node (antibody) and the weights to its neighbours (all other antibodies) in order to predict whether it neutralizes an antigen. Following Zhang et al.’s hypothesis, two antibodies that neutralize the same sets of antigens should have close embeddings and provide a similar input to the GCN layer. New antibodies (white) that are absent from training receive a new node embedding and edge weights adaptively at prediction time, and the GCN predicts neutralization for this new node. The same representation is built symmetrically for the antigen space, which means that new antigens can also be considered. The authors provide evidence that antibody sequences that are close in the latent space bind to the same immunogenic region of the antigen, which supports the hypothesis of a function-based similarity embedding. The GCN-based antibody and antigen embeddings were not sufficient to ‘beat’ baseline architectures, and the authors needed to add, as a parallel track of ‘local features’7, a convolutional neural network (CNN) that leveraged two-amino-acid motifs inside antibody and antigen sequences. Ab, antibody; Ag, antigen; RBD, receptor-binding domain; AR, adaptive relation.
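The idea of placing an unseen antibody in a graph of known antibodies can be sketched in a few lines. This is not the DeepAAI implementation: cosine similarity of 2-mer counts stands in for the learned adaptive relation, a single untrained weighted average stands in for the trained GCN layers, and the sequences, labels and function names are toy assumptions.

```python
import math
from collections import Counter


def kmer_counts(seq, k=2):
    """Count overlapping k-mers; positional information is discarded."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity between two k-mer count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def neighbour_score(new_seq, train_seqs, train_labels, k=2):
    """Edge-weighted mean of the training labels for a new node.

    Edge weights are computed at prediction time from the new sequence,
    so the new antibody does not need to be present during training.
    """
    new_vec = kmer_counts(new_seq, k)
    weights = [cosine(new_vec, kmer_counts(s, k)) for s in train_seqs]
    total = sum(weights)
    if total == 0:
        return 0.5  # no informative neighbour: fall back to an uninformed guess
    return sum(w * y for w, y in zip(weights, train_labels)) / total


# Toy fragments and labels (1 = neutralizes a given antigen), illustration only.
train_seqs = ["ARDYYGSS", "ARDYYGST", "GGWLPFDY"]
train_labels = [1, 1, 0]
score = neighbour_score("ARDYYGSA", train_seqs, train_labels)
```

In this toy case the new fragment shares six of its seven 2-mers with each of the two neutralizing neighbours and one with the non-neutralizing neighbour, so its score is pulled towards 1. Note that edges built from sequence similarity are exactly what a function-based embedding aims to move beyond; in DeepAAI the relations are learned so that functionally similar antibodies end up close.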

Previous studies relied on sequence similarity in their loss function to build latent spaces10,11,12,13, but benchmarking on simulated datasets6 has shown that sequence similarity is not a good surrogate for binding, and it potentially disregards dissimilar sequences with a shared structure or binding landscape14, which structure-based methods would have identified15. By contrast, DeepAAI9 builds a function-based internal representation, potentially clustering sequences based on binding rules. ML models that predict antibody structure1,4 would also be likely to develop predictive functional latent representations of antibodies, but they have not yet been fine-tuned to antibody–antigen binding or neutralization. Interestingly, the authors show that the latent space clusters antibodies that tend to recognize the same immunogenic regions, which supports a successful functional clustering. Such a latent space enables the generation of new antibodies with desired properties, and could reveal interpretable binding rules such as predictive motifs for cross-reactivity. The latent space can also be fine-tuned to predict the binding residues at the antibody–antigen binding interface (the paratope and epitope).

It might seem obvious that ML models use unseen data for testing. Yet the authors go further than usual in guarding against ‘data leakage’, which denotes the existence of shared information between training and test datasets. Sequence similarity is a first level of data leakage: antibodies similar to a training instance might be predicted well without the model learning generalizable rules. A second level, which can be present in a PPI formalism, is knowledge of how the same antibody behaves against other targets. By keeping sequences with 90% or greater similarity from being split across training and test datasets, and by using unseen antibodies for prediction, DeepAAI shows that binding prediction can be transferred beyond both levels. However, more advanced types of data leakage might exist: if an antibody in the training set has a binding profile similar to that of the new tested antibody, the model may predict correctly without necessarily learning rules, in which case it learns a multiclass problem where each explored neutralization pattern describes a class. A stress test of DeepAAI with adversarial simulated datasets6 of antibodies that share binding rules but not the binding landscape would probably help, and would show how far the DeepAAI latent space can represent complex binding profiles.
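A similarity-aware split of this kind can be sketched as follows. The sketch assumes that `difflib.SequenceMatcher.ratio` from the Python standard library is an acceptable stand-in for alignment-based sequence identity, and it uses simple single-linkage grouping; dedicated tools such as CD-HIT or MMseqs2 are commonly used for this in practice. The procedure actually used for DeepAAI is not detailed here, and all names and sequences below are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in for sequence identity (assumption: not alignment-based)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster(seqs, threshold=0.9):
    """Single-linkage grouping: a sequence joins every cluster it resembles."""
    clusters = []
    for s in seqs:
        merged, kept = [s], []
        for c in clusters:
            if any(similarity(s, m) >= threshold for m in c):
                merged.extend(c)  # s links this cluster to the merged one
            else:
                kept.append(c)
        clusters = kept + [merged]
    return clusters


def split_by_cluster(seqs, test_fraction=0.2, threshold=0.9):
    """Assign whole clusters to test or train, never splitting a cluster."""
    train, test = [], []
    target = test_fraction * len(seqs)
    for c in sorted(cluster(seqs, threshold), key=len):
        (test if len(test) < target else train).extend(c)
    return train, test


# Toy fragments; the first two differ by a single residue.
seqs = ["ARDYYGSSWFDY", "ARDYYGSSWFDV", "GGWLPFDYAMDV", "TRHGNYYFDS", "AKEGSSGWYVFDI"]
train, test = split_by_cluster(seqs)
```

Holding out whole clusters addresses only the first level of leakage. An antibody with a dissimilar sequence but a near-identical neutralization profile would still pass this filter, which is the more advanced form of leakage discussed above.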

Provocatively, one could say that sequence neutralization datasets may replace structural datasets. Interestingly, sequence positional information was not required for neutralization prediction, as k-mer sequence representations combined with a convolutional neural network (CNN) sufficed, whereas position-specific scoring matrices were needed for IC50 regression, which hints at a possible role for hidden structural information. The question therefore remains where the key information comes from. Does DeepAAI really infer interpretable binding rules, or does it rely on the existence of other antibodies with a similar neutralization landscape in the training dataset? DeepAAI predictions were transferable from HIV to influenza virus and COVID-19 datasets, which supports the finding that neutralization rules might be predictable with smaller datasets than previously expected. Whether or not sequence datasets ultimately prevail, binding structures will remain the gold standard for checking predictions, especially at the paratope–epitope level. If the DeepAAI latent space represents structural binding modes, then each functional cluster might be described by one or a few experimental structures, and DeepAAI could indicate which experimental structures are missing and which would be redundant to generate, thereby helping to prioritize expensive antibody–antigen structure measurements.

Finally, structural datasets have not yet captured many cases of antibody cross-reactivity to multiple antigens. As a result, current ML methods may have been skewed towards ‘one antibody, one target’ formulations, in which improving affinity has overshadowed the hard task of reducing off-target binding16. Zhang et al.’s formulation enables the exploration of the antibody specificity landscape, which is necessary to investigate which antibodies are cross-reactive in the context of off-target antigen recognition. This capability could prove useful for developing safe immunotherapies.