Deep learning methods in natural language processing generally become more effective with larger datasets and bigger networks. But it is not evident whether the same is true for more specialized domains such as cheminformatics. Frey and colleagues provide empirical explorations of chemistry models and find that neural-scaling laws hold true even for the largest tested models and datasets.
The immense number of Wikipedia articles makes it challenging for volunteers to ensure that cited sources support the claims they are attached to. Petroni et al. use an information-retrieval model to assist Wikipedia users in improving verifiability.
Identifying unknown peptides in tandem mass spectrometry is challenging as fragmentation of precursor peptides can be incomplete. Mao and colleagues present a method based on graph neural networks and a path-searching model to create more stable sequence predictions.
Computational methods for analysing single 2D tissue slices from spatial transcriptomics studies are well established, but their extension to the 3D domain is challenging. Wang et al. develop a deep learning framework that can perform 3D reconstruction of cellular structures in tissues as well as whole organisms.
Contact prediction between two proteins is still computationally challenging, but is vital for understanding multi-protein complexes. Lin et al. use a geometric deep learning approach to provide accurate predictions of inter-protein residue–residue contacts.
Deconvolution of cell types in tissue proteomic data is a challenging computational task for the bioinformatics community. A deep-learning method termed scpDeconv is introduced that makes efficient use of single-cell proteomics data to deconvolve cell types and states from bulk proteomics measurements.
AlphaFold2 has revolutionized bioinformatics, but its ability to predict protein structures with high accuracy comes at the price of a costly database search for multiple sequence alignments. Fang and colleagues pre-train a large-scale protein language model and use it in conjunction with AlphaFold2 as a fully trainable and efficient model for structure prediction.
It is widely known that AI-based recommendation systems on social media and news websites can isolate humans from diverse information, eventually trapping them in so-called information cocoons, where they are exposed to a narrow range of viewpoints. Li et al. introduce an adaptive information dynamics model to uncover the origin of information cocoons in complex human–AI interaction systems, and test their findings on two large real-world datasets.
Deep learning can help develop non-invasive technology for decoding speech from brain activity, which could improve the lives of patients with brain injuries. Défossez et al. report a contrastive-learning approach to decode speech listening from human participants, using public databases of recordings based on non-invasive magnetic and electrical measurements.
Online matching platforms are increasingly used for applications with positive social impact such as matching blood donors with recipients, where matching algorithms need to balance fairness with an efficiency objective. The authors demonstrate, both in computational simulations and using real data from the Facebook Blood Donations tool, that introducing a simple online matching policy can substantially increase the likelihood of donor action.
Recovering fine motor skills in hand rehabilitation is challenging because existing rehabilitation gloves offer limited finger-movement sensing and closed-loop control. Sui et al. develop a soft-packaged rehabilitation glove, integrating sensing, actuation, a human–machine interface, power, electronics and a closed-loop algorithm. The portable glove helps patients recover fine motor skills of the fingers after a stroke.
Identifying interventions that can induce a desired effect is challenging owing to the combinatorial number of possible choices in design space. Zhang and colleagues propose an active learning approach with theoretical guarantees to discover optimal interventions in causal models, and demonstrate the framework in the context of genetic perturbation design using single-cell transcriptomic data.
State-of-the-art image reconstruction for multispectral optoacoustic tomography is currently too slow for clinical applications. Dehner, Zahnd et al. propose a deep learning framework to reconstruct optoacoustic images in real time while maintaining similar quality.
The recent accessibility of large language models has brought them into contact with a large number of users and, due to the social nature of language, it is hard to avoid ascribing human characteristics such as intentions to a chatbot. Pataranutaporn and colleagues investigated how framing a bot as helpful or manipulative can influence this perception and the behaviour of the humans that interact with it.
Despite their efficiency advantages, the performance of photonic neural networks is hampered by the accumulation of inherent systematic errors. Zheng et al. propose a dual backpropagation training approach, which allows the network to adapt to systematic errors, thus outperforming state-of-the-art in situ training approaches.
Local methods of explainable artificial intelligence identify where important features or inputs occur, while global methods try to understand what features or concepts have been learned by a model. The authors propose a concept-level explanation method that bridges the local and global perspectives, enabling more comprehensive and human-understandable explanations.
With the advances in neural language models, the question arises whether some models align better with human processing than others. Golan et al. identify sentences that language models disagree about and use them to compare the shortcomings of different language models.
An outstanding challenge in materials science is doing large-scale simulations with complex electron interactions. Deng and colleagues introduce a universal graph neural network-based interatomic potential integrating atomic magnetic moments as charge constraints, which allows for capturing subtle chemical properties in several lithium-based solid-state materials.
Generating novel molecules that bind to specific protein targets is a challenging but important task in computational drug design. Zhang and colleagues present a molecular generation method based on hierarchical auto-regression.
Virtual protein docking requires an accurate scoring function that evaluates how likely a protein conformation is. Stebliankin and colleagues present a method based on vision transformers that provides a more accurate score by evaluating individual binding interfaces as multi-channel images.