A set of 20 computational metrics was evaluated to determine whether they could predict the functionality of synthetic enzyme sequences produced by generative protein models. The result was a computational filter, COMPSS, that increased experimental success rates by 50–150%. The evaluation covered over 500 natural and AI-generated enzymes.
Borderlands Science is a casual mini-game released within a mass-market video game that crowdsources the alignment of one million RNA sequences from the human microbiome. In 3 years, 4 million participants generated over 135 million puzzle solutions that were used to build a reference alignment and improve microbial phylogeny.
Using synthetic biology, we engineered a cellulose-producing bacterium that can produce eumelanin and respond to light, so that it is possible to grow a microbial leather material that is colored black or contains projected black patterns.
Cells interact with their local environment to enact global tissue function. By harnessing gene–gene covariation in cellular neighborhoods from spatial transcriptomics data, the covariance environment (COVET) niche representation and the environmental variational inference (ENVI) data integration method model phenotype–microenvironment interplay and reconstruct the spatial context of dissociated single-cell RNA sequencing datasets.
The underrepresentation of functional glial cells is a major challenge in brain organoid models. We developed an astroglia-enriched cortical organoid model that allows efficient generation of functional astrocytes and enables the formation of astroglial morphological subclasses with layer-specific gene expression profiles upon transplantation into the mouse brain.
The Iniquitate pipeline assessed the impacts of cell-type imbalance on single-cell RNA sequencing integration through perturbations to dataset balance. The results indicated that cell-type imbalance not only leads to loss of biological signal in the integrated space, but also can change the interpretation of downstream analyses after integration.
We developed a deep learning-based method, EMRNA, to automatically model RNA structures from cryo-electron microscopy maps. Evaluation of EMRNA on diverse test sets of RNA maps shows that it builds RNA models with high accuracy and efficiency.
We developed the OMArk software package for evaluating protein-coding gene annotation quality. In addition to assessing the completeness of a proteome, OMArk estimates the overall quality of the gene set’s content, a feature that will help to improve public protein sequence data.
Protein language models learn from diverse sequences spanning the evolutionary tree and have proven to be powerful tools for sequence design, variant effect prediction and structure prediction. What are the foundations of protein language models, and how are they applied in protein engineering?
Models like ChatGPT and DALL-E2 generate text and images in response to a text prompt. Given that proteins involve different data and goals, how can generative models be useful for protein engineering?
By applying the logic of conditional enzymes, we have developed a zinc-finger-dependent recombinase system whose editing activity is induced by zinc finger DNA binding. The system combines the precision of recombinases with the DNA target site programmability of zinc finger domains.
Many questions on the activity of the Ras proto-oncogene are unanswered due to the lack of tools for detecting active Ras in living cells. Here, we used protein design and structure prediction algorithms to develop biosensors that detect the activity and environment of endogenous Ras.