In two recent studies published in Nature and Science, researchers successfully developed unified AI architectures capable of predicting the structures of all biomolecules. This remarkable advancement is expected to have a profound impact on future biomedical research and drug design by offering crucial information regarding the interactions that govern both physiological and pathological processes.

The prediction of monomeric protein structures has been largely solved by AlphaFold2 (AF2) [1]. Once the monomer structures are known, the next step is to predict how these monomers interact with other biomolecules, including proteins, nucleic acids, and small molecules. These interactions drive the complex and dynamic behavior of living systems. Accurate prediction of biomolecular interactions remains a grand challenge in computational structural biology.

Recently, several methods, including AlphaFold3 (AF3) [2] and RoseTTAFold All-Atom (RFAA) [3], have been developed to predict biomolecular interactions. These advances build on the tremendous success of monomer structure prediction by AF2 and RoseTTAFold [4]. Overall, these works demonstrate that it is possible to predict the structures of all biomolecules under a unified AI architecture.

AF3 makes several key modifications to AF2 so that it can handle more types of biomolecules than proteins alone. One of the most notable changes is the replacement of the invariant point attention (IPA)-based structure module in AF2 with a generative diffusion architecture in AF3. The diffusion module operates directly on raw atomic coordinates and does not require rotational or translational equivariance. Unlike unconditional diffusion models, the diffusion in AF3 is conditioned on the input embeddings obtained from the multiple sequence alignments and homologous templates of proteins and/or nucleic acids, as well as reference conformers of ligands. This conditioning ensures that the generated structure models are compatible with the inputs, which matches the goal of conventional structure prediction. Removing this conditioning may enable the network to generate dynamic structures, which would be particularly valuable for studying the alternative functional states of structured proteins and the conformational ensembles of intrinsically disordered proteins.
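
The idea of conditional diffusion on raw coordinates can be illustrated with a toy sketch. It is not the AF3 network, noise schedule, or sampler. The function names, the Gaussian stand-in for a trained denoiser, and all parameter values are assumptions made purely for illustration; the conditioning signal is reduced to a reference set of atom positions.

```python
import numpy as np

def toy_denoiser(x_noisy, sigma, cond_coords, data_std=0.1):
    """Hypothetical stand-in for a trained conditional denoiser.

    Assumes the plausible structures are Gaussian around cond_coords with
    standard deviation data_std, so the expected clean coordinates given
    the noisy ones have a closed form.
    """
    weight = data_std**2 / (data_std**2 + sigma**2)
    return cond_coords + weight * (x_noisy - cond_coords)

def sample_coordinates(cond_coords, sigma_max=10.0, sigma_min=0.01,
                       n_steps=50, seed=0):
    """Deterministic denoising loop on raw (n_atoms, 3) coordinates."""
    rng = np.random.default_rng(seed)
    # Noise levels decrease geometrically and end at exactly zero.
    sigmas = np.append(np.geomspace(sigma_max, sigma_min, n_steps), 0.0)
    # Start from noise in Cartesian space; no local frames are involved.
    x = sigma_max * rng.standard_normal(cond_coords.shape)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        x_clean = toy_denoiser(x, sigma, cond_coords)
        # Euler-style update: shrink the residual noise toward the estimate.
        x = x_clean + (sigma_next / sigma) * (x - x_clean)
    return x

# Conditioning signal: here simply a reference set of 10 atom positions.
reference = np.random.default_rng(1).uniform(-5.0, 5.0, size=(10, 3))
sample = sample_coordinates(reference)
max_deviation = float(np.abs(sample - reference).max())
```

Because every denoising step sees the conditioning input, the sample ends up close to the reference coordinates rather than at an arbitrary structure. Dropping `cond_coords` from the denoiser would leave nothing to tie the output to a particular input.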

AF3 demonstrates significantly improved accuracy in predicting protein–ligand interactions. Benchmarks show that AF3 outperforms both classical and deep learning-enhanced docking tools. In the benchmark reported in the AF3 study, the percentage of correctly predicted ligand poses, defined as those with a pocket-aligned ligand root mean square deviation (RMSD) below 2 Å, was 76.4% for AF3 [2]. In comparison, Vina achieved a success rate of only 52.3% [5]. This improvement is remarkable given that the ground-truth structure of the protein bound to the ligand (the holo structure) is used as input for Vina but not for AF3. When the true ligand-binding pocket information is provided as additional input, the accuracy of AF3 increases to 90.2%.
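
The success criterion above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the benchmark. It assumes that pocket atoms and ligand atoms are already matched one-to-one between prediction and reference. How pocket atoms are selected and how ligand symmetry is handled are left out. The function names are hypothetical.

```python
import numpy as np

def kabsch_transform(mobile, target):
    """Rigid transform (R, t) that best superimposes mobile onto target.

    Both arrays have shape (n, 3) with rows matched one-to-one.
    """
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    covariance = (mobile - mobile_center).T @ (target - target_center)
    u, _, vt = np.linalg.svd(covariance)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    translation = target_center - rotation @ mobile_center
    return rotation, translation

def pocket_aligned_ligand_rmsd(pred_pocket, true_pocket,
                               pred_ligand, true_ligand):
    """Superimpose on pocket atoms only, then measure the ligand RMSD."""
    rotation, translation = kabsch_transform(pred_pocket, true_pocket)
    moved = pred_ligand @ rotation.T + translation
    squared = np.sum((moved - true_ligand) ** 2, axis=1)
    return float(np.sqrt(squared.mean()))

def success_rate(rmsds, cutoff=2.0):
    """Fraction of poses whose RMSD (in Å) is below the cutoff."""
    return sum(r < cutoff for r in rmsds) / len(rmsds)
```

The alignment uses only the pocket atoms, so a ligand placed in the wrong part of a correctly predicted pocket still receives a large RMSD. Aligning on the ligand itself would hide that error.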

However, in reality, the holo protein structure is often unavailable. When predicted structure models (by AlphaFold-Multimer v2.3 [6]) are used as inputs, Vina’s accuracy drops to 13.1%. With sequence input, AF3 almost doubles the accuracy of RFAA (76.4% vs 42.0%). Both AF3 and RFAA outperform protein–ligand docking methods that use predicted protein structure models as inputs. A likely explanation is that proteins change conformation upon ligand binding, the so-called “induced fit” mechanism. Docking methods mostly treat the protein structure as a rigid body, which is unlikely to work well when conformational changes happen during the binding event. In contrast, AF3 and RFAA fold the complex structures from scratch (“simultaneous folding”), which can better address conformational changes. Additionally, because AF3 and RFAA are data-driven approaches, their high accuracy is closely tied to the availability of rich, high-quality data on protein–ligand interactions (> 500,000 biologically relevant interactions in the Q-BioLiP database [7]).

AF3 also performs better on other prediction targets, including protein monomers, RNAs, protein–protein interactions, protein–nucleic acid interactions, and post-translational modifications. The improvement over AlphaFold-Multimer for protein monomers is marginal (~1.5%), probably because AlphaFold-Multimer is already very accurate for monomeric protein structure prediction. Although AF3 outperforms other automated methods for RNAs, protein–protein interactions, and protein–nucleic acid interactions, the overall accuracy is still far from satisfactory. One of the key reasons is that the experimental data available for training are limited [7].

The prediction of protein–ligand interactions by AF3 and RFAA has the potential to greatly expedite the drug discovery process. By providing more precise modeling of these interactions, virtual screening of drugs can become more effective. However, due to the less-than-ideal accuracy and success rate of AF3, it remains uncertain to what extent virtual screening can be improved. It would be particularly interesting to investigate whether AF3 can identify, within a chemical library, the small molecules that bind to a target protein, and to determine the true positive rate. To further enhance these methods, future research could integrate the binding affinity between the ligand and protein, as well as the experimental conditions used for measuring the affinity, into the training process of the networks. To accomplish this, the pharmaceutical industry may need to consider making its measured binding affinity data for numerous ligand–protein pairs publicly available, so that enough information can be gathered to predict not only the complex structure but also the binding affinity and even the “druggability”.

Notably, the diffusion modules used in AF3 and RFdiffusion All-Atom (RFdiffusionAA) [3] are generative in nature. RFdiffusionAA can generate new proteins around small molecules of interest. These generated proteins were confirmed through experimental validation, and RFdiffusionAA holds significant potential for the development of small molecule-binding proteins and sensors. Conversely, by fine-tuning AF3 or RFAA on diffusion denoising tasks, it may be feasible to generate novel small molecules that fit into specific pockets of target proteins. The prospect of designing small-molecule drugs solely based on protein sequence information is fascinating.

Unlike RFAA, AF3 has not been made open source, which could limit its impact. Currently, only a web server with restricted access is offered. After an open letter entitled “AlphaFold3 Transparency and Reproducibility” was posted on May 11, 2024, the DeepMind team said they were “working on releasing the AF3 model for academic use” within 6 months. We hope to see the AF3 model released shortly so that it can be used across a wide range of scientific and technological domains and ultimately improve human health.