Bridging the neutralization gap for unseen antibodies

Robert, Philippe A.; Greiff, Victor

doi:10.1038/s42256-022-00594-1


News & Views
Published: 12 December 2022

Antibody binding prediction

Nature Machine Intelligence volume 5, pages 8–10 (2023)

Antibodies are an essential class of therapeutics, but low breadth and off-target binding are major concerns for antibody–drug efficiency and safety. To predict which targets an antibody can neutralize, a machine learning pipeline based on an adaptive graph convolutional network architecture is proposed that learns the binding landscape of antibodies to multiple mutated viruses at the same time.

Tools and machine learning (ML) formalizations for antibody binding prediction are scarce. Despite recent advances in protein or antibody structure modelling¹,², predicting antibody binding to an antigen remains extremely challenging, even for AlphaFold2 (refs. ³,⁴), which relies on the availability of evolutionary information. The performance of de novo prediction of new antibodies has been poor so far, and methods for adapting successful protein ML paradigms to small-sized antibody datasets that lack evolutionary information⁵ are critically needed. Furthermore, the required type and size of structural, sequential or affinity-based datasets for accurate binding prediction is still unknown⁶, which leads to inefficient experimental data generation. Recent hybrid sequence and structure ML formalisms such as binding residues prediction⁷ or screening compatible antibody–antigen modelled structures⁸ have suggested that antibody binding prediction is feasible with current types of datasets, in a protein–protein interaction (PPI) setting where unrelated binding pairs are shuffled to create non-binders. Zhang et al.⁹ demonstrate a new adaptive architecture to predict the neutralization landscape of many antibodies to many antigen variants. Such new approaches and ML formalizations are needed to learn the global binding landscape of antibody and antigen variants, and to transition from the restricted neighbourhood of known bindings towards generalizable binding.

Zhang et al. developed a graph convolutional network (GCN)-based architecture, termed DeepAAI, that can predict the neutralization capacity of antibodies that were completely unseen during training (Fig. 1). This capability differs from typical PPI prediction methods, which already include all known antibodies in both training and test datasets, albeit with different binding partners. Prediction involves either binary classification or regression of the antibodies’ IC₅₀ neutralization score (IC₅₀ is a standard quantitative measure of the potency of a molecule to inhibit a biochemical function). DeepAAI matches antibody sequences to nodes in a graph-based functional latent space. A flexible GCN takes this graph as input. A new antibody can be assigned a new node, and the GCN predicts its property after being trained on the other nodes, akin to a transfer learning scheme. Zhang et al. therefore answer the important research question: given a latent space of the neutralization landscape of a set of antibodies to multiple antigens, is it possible to project new antibodies into this latent space and learn which targets they neutralize?

**Fig. 1: The adaptive DeepAAI ML workflow⁹ predicts the neutralization of unseen antibodies and learns a functional-based internal representation of antibodies.**
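The idea of attaching an unseen antibody as a new node and reusing shared weights can be illustrated with a toy example. The following is a minimal sketch, not the DeepAAI architecture: the mean-aggregation layer, the function names `add_node` and `graph_conv`, the toy features, the fixed weights and the choice of neighbours are all assumptions made for illustration.

```python
# Illustrative sketch only: one mean-aggregation graph-convolution layer.
# This is NOT the DeepAAI architecture; graph, features and weights are toy values.

def add_node(adjacency, neighbours):
    """Return a copy of the graph with one extra node linked to `neighbours`."""
    new_id = len(adjacency)
    updated = {node: set(links) for node, links in adjacency.items()}
    updated[new_id] = set(neighbours)
    for n in neighbours:
        updated[n].add(new_id)
    return updated, new_id


def graph_conv(adjacency, features, weights):
    """Average each node with its neighbours, apply a shared linear map, then ReLU."""
    n_in, n_out = len(weights), len(weights[0])
    hidden = {}
    for node, links in adjacency.items():
        group = [node, *sorted(links)]
        mean = [sum(features[g][d] for g in group) / len(group) for d in range(n_in)]
        hidden[node] = [
            max(0.0, sum(mean[d] * weights[d][k] for d in range(n_in)))
            for k in range(n_out)
        ]
    return hidden


# Three "training" antibodies (nodes 0-2) with toy 2-dimensional features.
graph = {0: {1}, 1: {0}, 2: set()}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
weights = [[1.0, 0.0], [0.0, 1.0]]  # stands in for weights learned on nodes 0-2

# An unseen antibody becomes node 3, linked to the nodes it is assumed to resemble.
graph, new_id = add_node(graph, neighbours=[0, 1])
features[new_id] = [0.5, 0.5]
hidden = graph_conv(graph, features, weights)
print(new_id, hidden[new_id])
```

Because the layer's weights are shared across nodes, the new node needs no parameters of its own: its representation is computed from its own features and those of its neighbours.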

Previous studies relied on sequence similarity in their loss function to build latent spaces¹⁰,¹¹,¹²,¹³, but benchmarking on simulated datasets⁶ has shown that sequence similarity is not a good surrogate for binding, and it potentially disregards dissimilar sequences with a shared structure or binding landscape¹⁴, which other structure-based methods would have identified¹⁵. By contrast, DeepAAI⁹ builds a function-based internal representation, potentially clustering sequences based on binding rules. ML models that predict antibody structure¹,⁴ would also likely develop predictive functional latent representations of antibodies, but they have not yet been fine-tuned to antibody–antigen binding or neutralization. Interestingly, the authors show that the latent space clusters antibodies that tend to recognize the same immunogenic regions, which supports a successful functional clustering. Such a latent space enables the generation of new antibodies with desired properties, and could reveal interpretable binding rules such as predictive motifs for cross-reactivity. The latent space can also be fine-tuned to predict the binding residues at the antibody–antigen binding interface (the paratope and epitope).

It might seem obvious that ML models use unseen data for testing. Yet the authors achieve a bigger jump than usual with respect to ‘data leakage’, which denotes the existence of shared information between training and test datasets. Sequence similarity is a first level of data leakage: antibodies similar to a training instance might be predicted well without learning generalizable rules. A second level, which can be present in a PPI formalism, arises from known properties of the same antibody towards other targets. By keeping sequences with 90% sequence similarity from being split between training and test datasets, and by using unseen antibodies for prediction, DeepAAI shows that binding prediction can be transferred beyond these two levels of data leakage. However, more advanced types of data leakage might exist: if an antibody in the training set has a similar binding profile to the new tested antibody, the model may predict correctly without necessarily learning rules, in which case the model learns a multiclass problem where any explored neutralization pattern describes a class. A stress test of DeepAAI with adversarial simulated datasets⁶ of antibodies sharing binding rules but not the binding landscape would likely help, and would inform how far the DeepAAI latent space can represent complex binding profiles.
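A similarity-aware split of the kind described above can be sketched as follows. This is an assumption-laden illustration rather than the procedure used by Zhang et al.: the commentary states only the 90% threshold, so the similarity measure (normalized edit distance), the single-linkage clustering, the function names and the toy sequences are all hypothetical choices.

```python
# Illustrative sketch: keep sequences that are >= 90% similar on the same side
# of a train/test split. The similarity measure and clustering are assumptions.

def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Fraction of positions preserved, relative to the longer sequence."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else (longest - edit_distance(a, b)) / longest


def cluster_by_similarity(seqs, threshold=0.9):
    """Single-linkage clusters: any pair at or above the threshold is merged."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if similarity(seqs[i], seqs[j]) >= threshold:
                parent[find(j)] = find(i)
    clusters = {}
    for i, seq in enumerate(seqs):
        clusters.setdefault(find(i), []).append(seq)
    return list(clusters.values())


def split_clusters(clusters, test_fraction=0.25):
    """Assign whole clusters to the test set until the requested fraction is reached."""
    total = sum(len(c) for c in clusters)
    train, test = [], []
    for cluster in sorted(clusters, key=len):
        if len(test) + len(cluster) <= test_fraction * total:
            test.extend(cluster)
        else:
            train.extend(cluster)
    return train, test


# Toy CDR-like strings (made up for illustration); the first two differ by one residue.
seqs = ["CARDYGSSYFDY", "CARDYGSSYFDV", "CTRGWELLRFAY", "CVSNQPTHMIEW"]
clusters = cluster_by_similarity(seqs)
train, test = split_clusters(clusters)
print(train, test)
```

Splitting by cluster rather than by individual sequence guarantees that no test sequence has a training neighbour at or above the threshold, which addresses only the first level of leakage discussed above.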

Provocatively, one could say sequence neutralization datasets may replace structural datasets. Interestingly, sequence positional information was not required for neutralization prediction owing to the use of k-mer sequence representation and the use of a convolutional neural network (CNN), while position-specific scoring matrices were needed for IC₅₀ regression prediction, which shows the potential implication of hidden structural information. The question therefore remains where the key information comes from. Does DeepAAI really infer interpretable binding rules, or does it rely on the existence of other antibodies with a similar neutralization landscape in the training dataset? DeepAAI predictions were transferable from HIV to the influenza virus and to COVID-19 datasets, which supports the finding that neutralization rules might be predictable with smaller datasets than previously expected. Whether sequence datasets will ultimately prevail or not, binding structures will remain the gold standard to check predictions, especially at the paratope–epitope level. If the DeepAAI latent space represents structural binding modes, then each functional cluster might be described by one or a few experimental structures, and DeepAAI can inform which experimental structures are missing, and which ones would be redundant to generate, therefore helping to prioritize expensive antibody–antigen structure measurements.
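To see why a k-mer representation carries little positional information, consider a plain k-mer count. The sketch below is illustrative only: the commentary does not specify the value of k or how DeepAAI encodes k-mers, so the function name `kmer_counts`, the default `k=3` and the toy strings are assumptions.

```python
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count every overlapping substring of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# A repeated motif is counted, but where it occurs is not recorded.
counts = kmer_counts("CARCAR")
print(counts)

# Two different toy strings can share exactly the same 2-mer counts,
# so the count vector alone cannot tell them apart.
first = kmer_counts("GYSYG", k=2)
second = kmer_counts("YGYSY", k=2)
print(first == second)
```

A count-based encoding therefore records which short motifs are present and how often, while discarding their order along the chain.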

Finally, structural datasets have not yet captured many cases of antibody cross-reactivity to multiple antigens. As a result, current ML methods may have been skewed to ‘one antibody, one target’ formulations, where improving affinity has shadowed the hard task of decreasing off-targets¹⁶. Zhang et al.’s formulation enables the exploration of the antibody specificity landscape, which is necessary to investigate which antibodies are cross-reactive in the context of off-target antigen recognition. This capability could prove useful for developing safe immunotherapies.

References

1. Ruffolo, J. A., Chu, L.-S., Mahajan, S. P. & Gray, J. J. Preprint at Biophys. J. 121, 155a–156a (2022).
2. Lin, Z. et al. Preprint at bioRxiv https://doi.org/10.1101/2022.07.20.500902 (2022).
3. Yin, R., Feng, B. Y., Varshney, A. & Pierce, B. G. Protein Sci. 31, e4379 (2022).
4. Abanades, B., Georges, G., Bujotzek, A. & Deane, C. M. Bioinformatics 38, 1877–1880 (2022).
5. Chowdhury, R. et al. Nat. Biotechnol. https://doi.org/10.1038/s41587-022-01432-w (2022).
6. Robert, P. A. et al. Nat. Computat. Sci. (in the press).
7. Pittala, S. & Bailey-Kellogg, C. Bioinformatics 36, 3996–4003 (2020).
8. Schneider, C., Buchanan, A., Taddese, B. & Deane, C. M. Bioinformatics 38, 377–383 (2021).
9. Zhang, J. et al. Nat. Mach. Intell. 4, 964–976 (2022).
10. Friedensohn, S. et al. Preprint at bioRxiv https://doi.org/10.1101/2020.02.25.965673 (2020).
11. Akbar, R. et al. MAbs 14, 2031482 (2022).
12. Ruffolo, J. A., Sulam, J. & Gray, J. J. Patterns (NY) 3, 100406 (2021).
13. Leem, J., Mitchell, L. S., Farmery, J. H. R., Barton, J. & Galson, J. D. Deciphering the language of antibodies using self-supervised learning. Patterns (NY) 3, 100513 (2022).
14. Mason, D. M. et al. Nat. Biomed. Eng. 5, 600–612 (2021).
15. Richardson, E. et al. MAbs 13, 1869406 (2021).
16. Cunningham, O., Scott, M., Zhou, Z. S. & Finlay, W. J. J. MAbs 13, 1999195 (2021).

Author information

Authors and Affiliations

Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
Philippe A. Robert & Victor Greiff
Department of Biomedicine, University of Basel, Basel, Switzerland
Philippe A. Robert

Corresponding authors

Correspondence to Philippe A. Robert or Victor Greiff.

Ethics declarations

Competing interests

P.A.R.’s current postdoctoral position at University of Basel was funded by Hoffmann-La Roche, Basel. V.G. declares advisory board positions in aiNET GmbH, Enpicom B.V, Specifica Inc, Adaptyv Biosystems, EVQLV, Omniscope, Diagonal Therapeutics, and Absci. V.G. is a consultant for Roche/Genentech, immunai, and Proteinea.

About this article

Cite this article

Robert, P.A., Greiff, V. Bridging the neutralization gap for unseen antibodies. Nat Mach Intell 5, 8–10 (2023). https://doi.org/10.1038/s42256-022-00594-1

Published: 12 December 2022
Issue Date: January 2023
DOI: https://doi.org/10.1038/s42256-022-00594-1
