Machine learning | Nature Communications

Article
22 September 2023 | Open Access

Determining subunit-subunit interaction from statistics of cryo-EM images: observation of nearest-neighbor coupling in a circadian clock protein complex

Deciphering interactions between subunits in protein complexes is an important problem. By combining cryo-EM imaging and statistical modeling, Han and colleagues reveal a significant cooperativity between subunits in the clock protein hexamer KaiC.

Xu Han, Dongliang Zhang & Qi Ouyang

Article
21 September 2023 | Open Access

DeepSlice: rapid fully automatic registration of mouse brain imaging to a volumetric atlas

Navigating the complex structure of the brain poses a challenge to neuroscientists. Here, the authors have trained an AI (DeepSlice) that can automatically register brain images with speed and accuracy, thus simplifying this process.

Harry Carey, Michael Pegios & Simon McMullan

Article
19 September 2023 | Open Access

Translating genomic tools to Raman spectroscopy analysis enables high-dimensional tissue characterization on molecular resolution

Spatial transcriptomics of histological sections has revolutionized basic research, while the actual biomolecular composition of the sample has fallen behind. Here, the authors propose a novel approach to analyze untargeted spatiomolecular Raman spectroscopy data through bioinformatic tools developed for transcriptomic analyses, and integrate them with additional Omics techniques.

Manuel Sigle, Anne-Katrin Rohlfing & Meinrad Paul Gawaz

Article
16 September 2023 | Open Access

Integrating end-to-end learning with deep geometrical potentials for ab initio RNA structure prediction

Here the authors developed an open-source program (DRfold) for RNA tertiary structure prediction from sequence. Through a unique combination of end-to-end learning and geometry restraint guided simulations, the method demonstrates an advantage over peer methods.

Yang Li, Chengxin Zhang & Yang Zhang

Article
15 September 2023 | Open Access

Machine learning coarse-grained potentials of protein thermodynamics

Understanding protein dynamics is a complex scientific challenge. Here, the authors construct coarse-grained molecular potentials using artificial neural networks, significantly accelerating protein dynamics simulations while preserving their thermodynamics.

Maciej Majewski, Adrià Pérez & Gianni De Fabritiis

Article
14 September 2023 | Open Access

DNA methylation profiling to determine the primary sites of metastatic cancers using formalin-fixed paraffin-embedded tissues

Molecular tests that can determine the tissue of origin of cancers of unknown primary (CUP) are still needed. Here, the authors develop a DNA methylation profiling assay and a machine learning classifier to predict the origin of metastatic tumours in CUP patients using formalin-fixed, paraffin-embedded samples.

Shirong Zhang, Shutao He & Hongcang Gu

Article
09 September 2023 | Open Access

Transfer Learning with Kernel Methods

Transfer learning can be applied in computer vision and natural language processing to utilize knowledge from a source task to improve performance on a target task. The authors propose a framework for transfer learning with kernel methods for improved image classification and virtual drug screening.

Adityanarayanan Radhakrishnan, Max Ruiz Luyten & Caroline Uhler

Article
07 September 2023 | Open Access

Mining multi-center heterogeneous medical data with distributed synthetic learning

Here the authors present Distributed Synthetic Learning, a system that addresses data privacy, isolated data islands, and heterogeneity concerns in healthcare analytics by learning to generate state-of-the-art synthetic data for downstream tasks.

Qi Chang, Zhennan Yan & Dimitris N. Metaxas

Article
02 September 2023 | Open Access

Prediction of base editor off-targets by deep learning

Base editors can induce unwanted off-target effects. Here the authors design libraries of gRNA-off-target pairs and perform a screen to obtain editing efficiencies for ABE and CBE; they use the datasets to train deep learning models (ABEdeepoff and CBEdeepoff) which can predict mutation tolerance at potential off-targets.

Chengdong Zhang, Yuan Yang & Yongming Wang

Article
02 September 2023 | Open Access

Interpreting biologically informed neural networks for enhanced proteomic biomarker discovery and pathway analysis

Deep neural networks hold significant promise in capturing the complexity of biological systems. However, they suffer from a lack of interpretability. Here, the authors present a generalizable method for developing, interpreting, and visualizing biologically informed neural networks for proteomics data.

Erik Hartman, Aaron M. Scott & Johan Malmström

Article
25 August 2023 | Open Access

Projecting RNA measurements onto single cell atlases to extract cell type-specific expression profiles using scProjection

Many expression deconvolution approaches have been developed to estimate the percentage of RNA contributed by diverse cell types to mixed RNA measurements. Here, the authors have developed a complementary approach called scProjection to recover cell type-specific expression profiles from mixed RNA measurements.

Nelson Johansen, Hongru Hu & Gerald Quon

Article
19 August 2023 | Open Access

DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications

Chemical structures are typically published as non-machine-readable images in scientific literature. Here, the authors present DECIMER.ai, an open platform for translating chemical structures in publications into machine-readable representations.

Kohulan Rajan, Henning Otto Brinkhaus & Christoph Steinbeck

Article
19 August 2023 | Open Access

APOGEE 2: multi-layer machine-learning model for the interpretable prediction of mitochondrial missense variants

APOGEE 2 is a machine-learning tool for assessing the fragility of the mitochondrial genome, evaluating genetic variant pathogenicity and ultimately enhancing our understanding of the clinical heterogeneity of mitochondrial genetic diseases.

Salvatore Daniele Bianco, Luca Parca & Tommaso Mazza

Article
17 August 2023 | Open Access

A deep learning method for replicate-based analysis of chromosome conformation contacts using Siamese neural networks

Siamese neural networks are a powerful deep learning approach for image analysis. Here, the authors adapt this method to the replicate-based analysis of Hi-C data and find that it successfully discriminates technical noise from biological variation.

Ediem Al-jibury, James W. D. King & Daniel Rueckert

Article
16 August 2023 | Open Access

Whole genome deconvolution unveils Alzheimer’s resilient epigenetic signature

The authors present a deep learning method that deconvolutes ATAC-seq samples into cell type-specific chromatin accessibility profiles. Applied to 191 samples, the method unveils cell type-specific pathways and nominates potential epigenetic mediators underlying resilience to Alzheimer’s disease.

Eloise Berson, Anjali Sreenivas & Thomas J. Montine

Article
15 August 2023 | Open Access

Deep transfer learning for inter-chain contact predictions of transmembrane protein complexes

Membrane proteins are encoded by approximately a quarter of human genes. Here, the authors propose a deep transfer learning method for predicting inter-chain residue-residue contacts of transmembrane protein complexes.

Peicong Lin, Yumeng Yan & Sheng-You Huang

Article
11 August 2023 | Open Access

Domain loss enabled evolution of novel functions in the snake three-finger toxin gene superfamily

Three-finger toxins are unique to the venoms of caenophidian snakes. This study traces the evolution of these toxins in snakes, highlighting a key shift from membrane-bound to secretory proteins. This transformation, involving the loss of a membrane-anchoring domain and changes in gene expression, paved the way for their venomous function.

Ivan Koludarov, Tobias Senoner & Burkhard Rost

Article
09 August 2023 | Open Access

A machine-learning approach to human ex vivo lung perfusion predicts transplantation outcomes and promotes organ utilization

Ex vivo perfusion is a unique platform to study isolated human lungs. Here, the authors show that a machine learning model, InsighTx, derived from data generated during ex vivo lung perfusion can accurately predict transplant outcomes and increase organ utilization rates.

Andrew T. Sage, Laura L. Donahoe & Shaf Keshavjee

Article
07 August 2023 | Open Access

Experimental validation of the free-energy principle with in vitro neural networks

Empirical applications of the free-energy principle entail a commitment to a particular process theory. Here, the authors reverse engineered generative models from neural responses of in vitro networks and demonstrated that the free-energy principle could predict how neural networks reorganized in response to external stimulation.

Takuya Isomura, Kiyoshi Kotani & Karl J. Friston

Article
07 August 2023 | Open Access

Getting personal with epigenetics: towards individual-specific epigenomic imputation with machine learning

The authors present eDICE, an attention-based model that enables accurate imputation of missing portions of the observed epigenetic landscape, and show that eDICE can be used to predict individual-specific epigenomic variation in the EN-TEx dataset.

Alex Hawkins-Hooker, Giovanni Visonà & Gabriele Schweikert

Article
03 August 2023 | Open Access

A neural-mechanistic hybrid approach improving the predictive power of genome-scale metabolic models

Mechanistic models estimate the phenotype of microorganisms in different environments but may have limited predictive capabilities. Here, the authors develop trainable hybrid models with improved predictability using mechanistic insights and smaller training sets than conventional machine learning techniques.

Léon Faure, Bastien Mollet & Jean-Loup Faulon

Article
03 August 2023 | Open Access

Segmenting functional tissue units across human organs using community-driven development of generalizable machine learning algorithms

Constructing the human reference atlas requires integration and analysis of massive amounts of data. Here the authors report the setup and results of the Hacking the Human Body machine learning algorithm development competition hosted by the Human Biomolecular Atlas and the Human Protein Atlas teams.

Yashvardhan Jain, Leah L. Godwin & Katy Börner

Article
20 July 2023 | Open Access

Identification of transcriptional programs using dense vector representations defined by mutual information with GeneVector

In single-cell RNA-seq analyses, it is critical to measure the relationships between genes. Here, the authors develop GeneVector, a framework for single-cell dimensionality reduction that incorporates gene-specific relationships, and use it for tasks such as annotating cell types and analysing pathway variation after treatment.

Nicholas Ceglia, Zachary Sethna & Andrew McPherson

Article
18 July 2023 | Open Access

Detecting shortcut learning for fair medical AI using shortcut testing

Diagnosing shortcut learning in clinical models is difficult, as sensitive attributes may be causally linked with disease. Using multitask learning, the authors propose a method to directly test for the presence of shortcut learning in clinical ML systems.

Alexander Brown, Nenad Tomasev & Jessica Schrouff

Article
18 July 2023 | Open Access

Next generation pan-cancer blood proteome profiling using proximity extension assay

Comprehensive and scalable proteomic profiling of plasma samples can improve the screening and diagnosis of cancer patients. Here, the authors use the Olink Proximity Extension Assay technology to characterise the plasma proteomes of 1477 patients across twelve cancer types, and use machine learning to obtain a protein panel for cancer classification.

María Bueno Álvez, Fredrik Edfors & Mathias Uhlén

Article
13 July 2023 | Open Access

Deep structured learning for variant prioritization in Mendelian diseases

In individuals with rare, monogenic disorders it often remains challenging to identify the disease-causing genetic variants among numerous potential candidates. Here, the authors develop a neural network ensemble for variant pathogenicity prediction, specifically for this type of disorder.

Matt C. Danzi, Maike F. Dohrn & Stephan Züchner

Article
13 July 2023 | Open Access

Exon-intron boundary inhibits m⁶A deposition, enabling m⁶A distribution hallmark, longer mRNA half-life and flexible protein coding

m⁶A mRNA modification is not typically found near splice junctions in mRNAs. Here the authors show that the exon-intron boundary inhibits m⁶A deposition within a ~100 nt region near the splice site, enabling the m⁶A distribution hallmark, more stable mRNA and flexible protein coding.

Zhiyuan Luo, Qilian Ma & Shengdong Ke

Article
13 July 2023 | Open Access

Discovering functionally important sites in proteins

An important step in understanding and using proteins is to identify the residues that are important for function. The authors present a machine-learning based method to predict functional sites that leverages and combines the information available in protein sequences and structures.

Matteo Cagiada, Sandro Bottaro & Kresten Lindorff-Larsen

Article
12 July 2023 | Open Access

Multi-batch single-cell comparative atlas construction by deep learning disentanglement

Comparing single-cell RNA-seq and ATAC-seq data from multiple batches is challenging due to technical artifacts. Here, the authors propose a method that disentangles technical and biological effects, facilitating batch-confounded chromatin and gene expression state discovery and enhancing the analysis of perturbation effects on cell populations.

Allen W. Lynch, Myles Brown & Clifford A. Meyer

Article
12 July 2023 | Open Access

Turnover number predictions for kinetically uncharacterized enzymes using machine and deep learning

The turnover numbers of most enzyme-catalyzed reactions are unknown. Kroll et al. developed a general model that can predict turnover numbers even for enzymes dissimilar to those used for training, outperforming existing models.

Alexander Kroll, Yvan Rousset & Martin J. Lercher

Article
11 July 2023 | Open Access

Spatial cellular architecture predicts prognosis in glioblastoma

Intra-tumoral heterogeneity and cell-state plasticity contribute to the development of therapeutic resistance in glioblastoma (GBM). Here the authors use two deep learning models to predict spatial transcriptional programs and prognosis from histology images in GBM.

Yuanning Zheng, Francisco Carrillo-Perez & Olivier Gevaert

Article
11 July 2023 | Open Access

Large depth-of-field ultra-compact microscope by progressive optimization and deep learning

Traditional optical microscopes, while bulky, often fail to deliver optimal performance. Here, the authors have engineered an integrated microscope with a volume of 0.15 cm³ and a weight of 0.5 g, which outperforms a commercial microscope and can be seamlessly integrated with a smartphone.

Yuanlong Zhang, Xiaofei Song & Qionghai Dai

Article
08 July 2023 | Open Access

Leveraging spatial transcriptomics data to recover cell locations in single-cell RNA-seq with CeLEry

Cell location information is important for understanding how tissue is spatially organized. Here, the authors develop CeLEry, a machine learning method that aims to recover cell locations for single-cell RNA-seq data by leveraging information learned from spatial transcriptomics.

Qihuang Zhang, Shunzhou Jiang & Mingyao Li

Article
17 June 2023 | Open Access

A data-driven approach for predicting the impact of drugs on the human microbiome

Drugs can impact the gut microbiome. Here, Algavi and Borenstein developed a machine-learning framework that successfully predicts the impact of thousands of drugs on hundreds of gut microbes, explaining drug-induced dysbiosis and side effects.

Yadid M. Algavi & Elhanan Borenstein

Article
15 June 2023 | Open Access

Biomedical knowledge graph learning for drug repurposing by extending guilt-by-association to multiple layers

Computational drug repurposing models that leverage biomedical knowledge graphs to associate drugs with diseases are biased towards genes. Here, the authors present DREAMwalk, which extends guilt-by-association for multi-layer knowledge graph learning using a semantic information-guided random walk.

Dongmin Bang, Sangsoo Lim & Sun Kim

Article
13 June 2023 | Open Access

Pacpaint: a histology-based deep learning model uncovers the extensive intratumor molecular heterogeneity of pancreatic adenocarcinoma

Rapid and effective molecular subtyping of pancreatic adenocarcinoma (PDAC) is important for prognosis and treatment. Here, the authors develop PACpAInt, a deep learning model for PDAC molecular subtyping from whole-slide histological imaging that enables the analysis of heterogeneity and prognostic predictions.

Charlie Saillard, Flore Delecourt & Jerome Cros

Article
13 June 2023 | Open Access

Predicting the antigenic evolution of SARS-COV-2 with deep learning

SARS-CoV-2’s rapid evolution threatens public health. Here, the authors present a deep learning approach to forecast high-risk mutations that may appear in the future, aiding vaccine development and enhancing preparedness against future variants.

Wenkai Han, Ningning Chen & Xin Gao

Article
12 June 2023 | Open Access

Machine learning optimization of candidate antibody yields highly diverse sub-nanomolar affinity antibody libraries

Therapeutic antibody discovery is time- and cost-intensive. Here, the authors develop a machine learning-driven method enabling accelerated design of large and diverse single-chain variable fragments with high binding efficiency, especially at high levels of diversity.

Lin Li, Esther Gupta & Matthew E. Walsh

Article
10 June 2023 | Open Access

Discovery of senolytics using machine learning

Cellular senescence is involved in many disease processes but few senolytic compounds are currently known. Here, the authors report the discovery of three senolytics using machine learning models trained solely on published data, with large reductions in drug screening costs.

Vanessa Smer-Barreto, Andrea Quintanilla & Diego A. Oyarzún

Article
09 June 2023 | Open Access

Empowering drug off-target discovery with metabolic and structural analysis

The authors present a workflow integrating metabolic perturbations with protein structural analysis to identify drug off-targets, demonstrating how combining machine learning methods with mechanistic analyses can benefit off-target identification.

Sourav Chowdhury, Daniel C. Zielinski & Eugene I. Shakhnovich

Article
03 June 2023 | Open Access

Improvement of cryo-EM maps by simultaneous local and non-local deep learning

Map post-processing is crucial for cryo-EM model building. Here, the authors present a deep learning approach to improve both the quality and interpretability of cryo-EM maps by simultaneously considering local and non-local effects.

Jiahua He, Tao Li & Sheng-You Huang

Article
27 May 2023 | Open Access

Inference of cell type-specific gene regulatory networks on cell lineages from single cell omic datasets

Cell type-specific gene expression patterns are outputs of transcriptional gene regulatory networks (GRNs) that connect transcription factors and signaling proteins to target genes. Here, the authors present single-cell Multi-Task Network Inference (scMTNI), a multi-task learning framework to infer cell type-specific GRN dynamics from scRNA-seq and scATAC-seq datasets collected for diverse cell fate specification trajectories.

Shilu Zhang, Saptarshi Pyne & Sushmita Roy

Article
15 May 2023 | Open Access

A general model to predict small molecule substrates of enzymes based on machine and deep learning

For many enzymes, it is unknown which primary and/or secondary reactions they catalyze. Here, the authors use machine and deep learning to develop a general model for the prediction of enzyme-small molecule substrate pairs and make the resulting model available through a webserver.

Alexander Kroll, Sahasra Ranjan & Martin J. Lercher

Article
11 May 2023 | Open Access

Metal3D: a general deep learning framework for accurate metal ion location prediction in proteins

Zinc is an essential metal for many proteins. Here, the authors propose a model based on 3D convolutional networks to predict the location of zinc in experimental and computationally predicted structures within a framework readily extensible to other metals.

Simon L. Dürr, Andrea Levy & Ursula Rothlisberger

Article
09 May 2023 | Open Access

Candida expansion in the gut of lung cancer patients associates with an ecological signature that supports growth under dysbiotic conditions

Here, Seelbinder et al. show that high Candida levels in cancer patients’ stool correlate with greater metabolic flexibility but less robust bacterial communities and, combined with machine learning models that predict Candida levels from bacterial data, suggest that lactate-producing bacteria may fuel Candida overgrowth in the gut during dysbiosis.

Bastian Seelbinder, Zoltan Lohinai & Gianni Panagiotou

Article
06 May 2023 | Open Access

Improving de novo protein binder design with deep learning

Recently, a pipeline for the design of protein-binding proteins using only the structure of the target protein was reported. Here, the authors report that the incorporation of deep learning methods into the original pipeline increases the experimental success rate ten-fold.

Nathaniel R. Bennett, Brian Coventry & David Baker

Article
04 May 2023 | Open Access

Chemistry-intuitive explanation of graph neural networks for molecular property prediction with substructure masking

Attempts to explain molecular property predictions of neural networks are not always compatible with chemical intuition based on chemical substructures. Here the authors propose the substructure mask explanation method to tackle this challenge.

Zhenxing Wu, Jike Wang & Tingjun Hou

Article
03 May 2023 | Open Access

Explainable multi-task learning for multi-modality biological data analysis

Multimodal biological data is challenging to analyze. Here, the authors develop UnitedNet, an explainable deep neural network for analyzing single-cell multimodal biological data and estimating relationships between gene expression and other modalities with cell-type specificity.

Xin Tang, Jiawei Zhang & Jia Liu

Article
02 May 2023 | Open Access

Machine learning-driven multifunctional peptide engineering for sustained ocular drug delivery

Sustained drug delivery is critical for patient adherence to chronic disease treatments. Here the authors apply machine learning to engineer multifunctional peptides with high melanin binding, high cell-penetration, and low cytotoxicity, enhancing the duration and efficacy of peptide-drug conjugates for sustained ocular delivery.

Henry T. Hsueh, Renee Ti Chou & Laura M. Ensign

Article
28 April 2023 | Open Access

PeakDecoder enables machine learning-based metabolite annotation and accurate profiling in multidimensional mass spectrometry measurements

Alternative algorithms exploiting advantages of multidimensional mass spectrometry in untargeted metabolomics are needed. Here, the authors develop and demonstrate PeakDecoder for confident and accurate metabolite profiling in 116 microbial sample runs using a library built from 64 standards.

Aivett Bilbao, Nathalie Munoz & Kristin E. Burnum-Johnson
