Featured
-
Article | Open Access
DeepSlice: rapid fully automatic registration of mouse brain imaging to a volumetric atlas
Navigating the complex structure of the brain poses a challenge to neuroscientists. Here, the authors have trained an AI (DeepSlice) that can automatically register brain images with speed and accuracy, thus simplifying this process.
- Harry Carey
- , Michael Pegios
- & Simon McMullan
-
Article | Open Access
Translating genomic tools to Raman spectroscopy analysis enables high-dimensional tissue characterization on molecular resolution
Spatial transcriptomics of histological sections has revolutionized basic research, while analysis of the actual biomolecular composition of the sample has fallen behind. Here, the authors propose a novel approach to analyze untargeted spatiomolecular Raman spectroscopy data through bioinformatic tools developed for transcriptomic analyses, and integrate them with additional omics techniques.
- Manuel Sigle
- , Anne-Katrin Rohlfing
- & Meinrad Paul Gawaz
-
Article | Open Access
Integrating end-to-end learning with deep geometrical potentials for ab initio RNA structure prediction
Here the authors developed an open-source program (DRfold) for RNA tertiary structure prediction from sequence. Through a unique combination of end-to-end learning and geometry restraint-guided simulations, the method demonstrates advantages over peer methods.
- Yang Li
- , Chengxin Zhang
- & Yang Zhang
-
Article | Open Access
Machine learning coarse-grained potentials of protein thermodynamics
Understanding protein dynamics is a complex scientific challenge. Here, the authors construct coarse-grained molecular potentials using artificial neural networks, significantly accelerating protein dynamics simulations while preserving their thermodynamics.
- Maciej Majewski
- , Adrià Pérez
- & Gianni De Fabritiis
-
Article | Open Access
DNA methylation profiling to determine the primary sites of metastatic cancers using formalin-fixed paraffin-embedded tissues
Molecular tests that can determine the tissue of origin of cancers of unknown primary (CUP) are still needed. Here, the authors develop a DNA methylation profiling assay and a machine learning classifier to predict the origin of metastatic tumours in CUP patients using formalin-fixed, paraffin-embedded samples.
- Shirong Zhang
- , Shutao He
- & Hongcang Gu
-
Article | Open Access
Transfer Learning with Kernel Methods
Transfer learning can be applied in computer vision and natural language processing to utilize knowledge from a source task to improve performance on a target task. The authors propose a framework for transfer learning with kernel methods for improved image classification and virtual drug screening.
- Adityanarayanan Radhakrishnan
- , Max Ruiz Luyten
- & Caroline Uhler
-
Article | Open Access
Mining multi-center heterogeneous medical data with distributed synthetic learning
Here the authors present Distributed Synthetic Learning, a system that addresses data privacy, isolated data islands, and heterogeneity concerns in healthcare analytics by learning to generate state-of-the-art synthetic data for downstream tasks.
- Qi Chang
- , Zhennan Yan
- & Dimitris N. Metaxas
-
Article | Open Access
Prediction of base editor off-targets by deep learning
Base editors can induce unwanted off-target effects. Here the authors design libraries of gRNA-off-target pairs and perform a screen to obtain editing efficiencies for ABE and CBE. They use the datasets to train deep learning models (ABEdeepoff and CBEdeepoff) that can predict mutation tolerance at potential off-targets.
- Chengdong Zhang
- , Yuan Yang
- & Yongming Wang
-
Article | Open Access
Interpreting biologically informed neural networks for enhanced proteomic biomarker discovery and pathway analysis
Deep neural networks hold significant promise in capturing the complexity of biological systems. However, they suffer from a lack of interpretability. Here, the authors present a generalizable method for developing, interpreting, and visualizing biologically informed neural networks for proteomics data.
- Erik Hartman
- , Aaron M. Scott
- & Johan Malmström
-
Article | Open Access
Projecting RNA measurements onto single cell atlases to extract cell type-specific expression profiles using scProjection
Many expression deconvolution approaches have been developed to estimate the percentage RNA contributions of diverse cell types to mixed RNA measurements. Here, the authors have developed a complementary approach called scProjection to recover cell type-specific expression profiles from mixed RNA measurements.
- Nelson Johansen
- , Hongru Hu
- & Gerald Quon
-
Article | Open Access
DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications
Chemical structures are typically published as non-machine-readable images in scientific literature. Here, the authors present DECIMER.ai, an open platform for translating chemical structures in publications into machine-readable representations.
- Kohulan Rajan
- , Henning Otto Brinkhaus
- & Christoph Steinbeck
-
Article | Open Access
APOGEE 2: multi-layer machine-learning model for the interpretable prediction of mitochondrial missense variants
APOGEE 2 is a machine-learning tool for assessing the fragility of the mitochondrial genome, evaluating genetic variant pathogenicity and ultimately enhancing our understanding of the clinical heterogeneity of mitochondrial genetic diseases.
- Salvatore Daniele Bianco
- , Luca Parca
- & Tommaso Mazza
-
Article | Open Access
A deep learning method for replicate-based analysis of chromosome conformation contacts using Siamese neural networks
Siamese neural networks are a powerful deep learning approach for image analysis. Here, the authors adapt this method to the replicate-based analysis of Hi-C data and find that it successfully discriminates technical noise from biological variation.
- Ediem Al-jibury
- , James W. D. King
- & Daniel Rueckert
-
Article | Open Access
Whole genome deconvolution unveils Alzheimer’s resilient epigenetic signature
The authors present a deep learning method that deconvolutes ATAC-seq samples into cell type-specific chromatin accessibility profiles. Applied to 191 samples, the method unveils cell type-specific pathways and nominates potential epigenetic mediators underlying resilience to Alzheimer’s disease.
- Eloise Berson
- , Anjali Sreenivas
- & Thomas J. Montine
-
Article | Open Access
Deep transfer learning for inter-chain contact predictions of transmembrane protein complexes
Membrane proteins are encoded by approximately a quarter of human genes. Here, the authors propose a deep transfer learning method for predicting inter-chain residue-residue contacts of transmembrane protein complexes.
- Peicong Lin
- , Yumeng Yan
- & Sheng-You Huang
-
Article | Open Access
Domain loss enabled evolution of novel functions in the snake three-finger toxin gene superfamily
Three-finger toxins are unique to the venoms of caenophidian snakes. This study traces the evolution of these toxins in snakes, highlighting a key shift from membrane-bound to secretory proteins. This transformation, involving the loss of a membrane-anchoring domain and changes in gene expression, paved the way for their venomous function.
- Ivan Koludarov
- , Tobias Senoner
- & Burkhard Rost
-
Article | Open Access
A machine-learning approach to human ex vivo lung perfusion predicts transplantation outcomes and promotes organ utilization
Ex vivo perfusion is a unique platform to study isolated human lungs. Here, the authors show that a machine learning model, InsighTx, derived from data generated during ex vivo lung perfusion can accurately predict transplant outcomes and increase organ utilization rates.
- Andrew T. Sage
- , Laura L. Donahoe
- & Shaf Keshavjee
-
Article | Open Access
Experimental validation of the free-energy principle with in vitro neural networks
Empirical applications of the free-energy principle entail a commitment to a particular process theory. Here, the authors reverse-engineered generative models from neural responses of in vitro networks and demonstrated that the free-energy principle could predict how neural networks reorganized in response to external stimulation.
- Takuya Isomura
- , Kiyoshi Kotani
- & Karl J. Friston
-
Article | Open Access
Getting personal with epigenetics: towards individual-specific epigenomic imputation with machine learning
The authors present eDICE, an attention-based model that enables accurate imputation of missing portions of the observed epigenetic landscape, and show that eDICE can be used to predict individual-specific epigenomic variation in the EN-TEx dataset.
- Alex Hawkins-Hooker
- , Giovanni Visonà
- & Gabriele Schweikert
-
Article | Open Access
A neural-mechanistic hybrid approach improving the predictive power of genome-scale metabolic models
Mechanistic models estimate the phenotype of microorganisms in different environments but may have limited predictive capabilities. Here, the authors develop trainable hybrid models with improved predictability using mechanistic insights and smaller training sets than conventional machine learning techniques.
- Léon Faure
- , Bastien Mollet
- & Jean-Loup Faulon
-
Article | Open Access
Segmenting functional tissue units across human organs using community-driven development of generalizable machine learning algorithms
Constructing the human reference atlas requires integration and analysis of massive amounts of data. Here the authors report the setup and results of the Hacking the Human Body machine learning algorithm development competition hosted by the Human Biomolecular Atlas and the Human Protein Atlas teams.
- Yashvardhan Jain
- , Leah L. Godwin
- & Katy Börner
-
Article | Open Access
Identification of transcriptional programs using dense vector representations defined by mutual information with GeneVector
In single-cell RNA-seq analyses, it is critical to measure the relationships between genes. Here, the authors develop GeneVector, a framework for single-cell dimensionality reduction that incorporates gene-specific relationships, and use it for tasks such as annotating cell types and analysing pathway variation after treatment.
- Nicholas Ceglia
- , Zachary Sethna
- & Andrew McPherson
-
Article | Open Access
Detecting shortcut learning for fair medical AI using shortcut testing
Diagnosing shortcut learning in clinical models is difficult, as sensitive attributes may be causally linked with disease. Using multitask learning, the authors propose a method to directly test for the presence of shortcut learning in clinical ML systems.
- Alexander Brown
- , Nenad Tomasev
- & Jessica Schrouff
-
Article | Open Access
Next generation pan-cancer blood proteome profiling using proximity extension assay
Comprehensive and scalable proteomic profiling of plasma samples can improve the screening and diagnosis of cancer patients. Here, the authors use the Olink Proximity Extension Assay technology to characterise the plasma proteomes of 1477 patients across twelve cancer types, and use machine learning to obtain a protein panel for cancer classification.
- María Bueno Álvez
- , Fredrik Edfors
- & Mathias Uhlén
-
Article | Open Access
Deep structured learning for variant prioritization in Mendelian diseases
In individuals with rare, monogenic disorders, it often remains challenging to identify the disease-causing genetic variants among numerous potential candidates. Here, the authors develop a neural network ensemble for variant pathogenicity prediction, specifically for this type of disorder.
- Matt C. Danzi
- , Maike F. Dohrn
- & Stephan Züchner
-
Article | Open Access
Exon-intron boundary inhibits m6A deposition, enabling m6A distribution hallmark, longer mRNA half-life and flexible protein coding
m6A mRNA modification is not typically found near splice junctions in mRNAs. Here the authors show that exon-intron boundaries inhibit m6A deposition in the ~100 nt region near splice sites, enabling the hallmark m6A distribution, more stable mRNA and flexible protein coding.
- Zhiyuan Luo
- , Qilian Ma
- & Shengdong Ke
-
Article | Open Access
Discovering functionally important sites in proteins
An important step in understanding and using proteins is to identify the residues that are important for function. The authors present a machine learning-based method to predict functional sites that leverages and combines the information available in protein sequences and structures.
- Matteo Cagiada
- , Sandro Bottaro
- & Kresten Lindorff-Larsen
-
Article | Open Access
Multi-batch single-cell comparative atlas construction by deep learning disentanglement
Comparing single-cell RNA-seq and ATAC-seq data from multiple batches is challenging due to technical artifacts. Here, the authors propose a method that disentangles technical and biological effects, facilitating batch-confounded chromatin and gene expression state discovery and enhancing the analysis of perturbation effects on cell populations.
- Allen W. Lynch
- , Myles Brown
- & Clifford A. Meyer
-
Article | Open Access
Turnover number predictions for kinetically uncharacterized enzymes using machine and deep learning
The turnover numbers of most enzyme-catalyzed reactions are unknown. Kroll et al. developed a general model that can predict turnover numbers even for enzymes dissimilar to those used for training, outperforming existing models.
- Alexander Kroll
- , Yvan Rousset
- & Martin J. Lercher
-
Article | Open Access
Spatial cellular architecture predicts prognosis in glioblastoma
Intra-tumoral heterogeneity and cell-state plasticity contribute to the development of therapeutic resistance in glioblastoma (GBM). Here the authors use two deep learning models to predict spatial transcriptional programs and prognosis from histology images in GBM.
- Yuanning Zheng
- , Francisco Carrillo-Perez
- & Olivier Gevaert
-
Article | Open Access
Large depth-of-field ultra-compact microscope by progressive optimization and deep learning
Traditional optical microscopes, while bulky, often fail to deliver optimal performance. Here, the authors have engineered an integrated microscope of 0.15 cm³ in volume and a weight of 0.5 g, which outperforms a commercial microscope and can be seamlessly integrated with a smartphone.
- Yuanlong Zhang
- , Xiaofei Song
- & Qionghai Dai
-
Article | Open Access
Leveraging spatial transcriptomics data to recover cell locations in single-cell RNA-seq with CeLEry
Cell location information is important for understanding how tissue is spatially organized. Here, the authors develop CeLEry, a machine learning method that aims to recover cell locations for single-cell RNA-seq data by leveraging information learned from spatial transcriptomics.
- Qihuang Zhang
- , Shunzhou Jiang
- & Mingyao Li
-
Article | Open Access
A data-driven approach for predicting the impact of drugs on the human microbiome
Drugs can impact the gut microbiome. Here, Algavi and Borenstein developed a machine-learning framework that successfully predicts the impact of thousands of drugs on hundreds of gut microbes, explaining drug-induced dysbiosis and side effects.
- Yadid M. Algavi
- & Elhanan Borenstein
-
Article | Open Access
Biomedical knowledge graph learning for drug repurposing by extending guilt-by-association to multiple layers
Computational drug repurposing models that leverage biomedical knowledge graphs to associate drugs with diseases are biased towards genes. Here, the authors present DREAMwalk, which extends guilt-by-association for multi-layer knowledge graph learning using a semantic information-guided random walk.
- Dongmin Bang
- , Sangsoo Lim
- & Sun Kim
-
Article | Open Access
PACpAInt: a histology-based deep learning model uncovers the extensive intratumor molecular heterogeneity of pancreatic adenocarcinoma
Rapid and effective molecular subtyping of pancreatic adenocarcinoma (PDAC) is important for prognosis and treatment. Here, the authors develop PACpAInt, a deep learning model for PDAC molecular subtyping from whole-slide histological imaging that enables the analysis of heterogeneity and prognostic predictions.
- Charlie Saillard
- , Flore Delecourt
- & Jerome Cros
-
Article | Open Access
Predicting the antigenic evolution of SARS-COV-2 with deep learning
SARS-CoV-2’s rapid evolution threatens public health. Here, the authors present a deep learning approach to forecast high-risk mutations that may appear in the future, aiding vaccine development and enhancing preparedness against future variants.
- Wenkai Han
- , Ningning Chen
- & Xin Gao
-
Article | Open Access
Machine learning optimization of candidate antibody yields highly diverse sub-nanomolar affinity antibody libraries
Therapeutic antibody discovery is time- and cost-intensive. Here, the authors develop a machine learning-driven method enabling accelerated design of large and diverse single-chain variable fragments with high binding efficiency, especially at high levels of diversity.
- Lin Li
- , Esther Gupta
- & Matthew E. Walsh
-
Article | Open Access
Discovery of senolytics using machine learning
Cellular senescence is involved in many disease processes but few senolytic compounds are currently known. Here, the authors report the discovery of three senolytics using machine learning models trained solely on published data, with large reductions in drug screening costs.
- Vanessa Smer-Barreto
- , Andrea Quintanilla
- & Diego A. Oyarzún
-
Article | Open Access
Empowering drug off-target discovery with metabolic and structural analysis
The authors present a workflow integrating metabolic perturbations with protein structural analysis to identify drug off-targets, demonstrating how combining machine learning methods with mechanistic analyses can benefit off-target identification.
- Sourav Chowdhury
- , Daniel C. Zielinski
- & Eugene I. Shakhnovich
-
Article | Open Access
Improvement of cryo-EM maps by simultaneous local and non-local deep learning
Map post-processing is crucial for cryo-EM model building. Here, the authors present a deep learning approach to improve both the quality and interpretability of cryo-EM maps by simultaneously considering local and non-local effects.
- Jiahua He
- , Tao Li
- & Sheng-You Huang
-
Article | Open Access
Inference of cell type-specific gene regulatory networks on cell lineages from single cell omic datasets
Cell type-specific gene expression patterns are outputs of transcriptional gene regulatory networks (GRNs) that connect transcription factors and signaling proteins to target genes. Here, the authors present single-cell Multi-Task Network Inference (scMTNI), a multi-task learning framework to infer cell type-specific GRN dynamics from scRNA-seq and scATAC-seq datasets collected for diverse cell fate specification trajectories.
- Shilu Zhang
- , Saptarshi Pyne
- & Sushmita Roy
-
Article | Open Access
A general model to predict small molecule substrates of enzymes based on machine and deep learning
For many enzymes, it is unknown which primary and/or secondary reactions they catalyze. Here, the authors use machine and deep learning to develop a general model for the prediction of enzyme-small molecule substrate pairs and make the resulting model available through a webserver.
- Alexander Kroll
- , Sahasra Ranjan
- & Martin J. Lercher
-
Article | Open Access
Metal3D: a general deep learning framework for accurate metal ion location prediction in proteins
Zinc is an essential metal for many proteins. Here, the authors propose a model based on 3D convolutional networks to predict the location of zinc in experimental and computationally predicted structures within a framework readily extensible to other metals.
- Simon L. Dürr
- , Andrea Levy
- & Ursula Rothlisberger
-
Article | Open Access
Candida expansion in the gut of lung cancer patients associates with an ecological signature that supports growth under dysbiotic conditions
Here, Seelbinder et al. show that high Candida levels in cancer patients’ stool correlate with greater metabolic flexibility but less robust bacterial communities. Combined with machine learning models that predict Candida levels from bacterial data, the results suggest that lactate-producing bacteria may fuel Candida overgrowth in the gut during dysbiosis.
- Bastian Seelbinder
- , Zoltan Lohinai
- & Gianni Panagiotou
-
Article | Open Access
Improving de novo protein binder design with deep learning
Recently, a pipeline for the design of protein-binding proteins using only the structure of the target protein was reported. Here, the authors report that the incorporation of deep learning methods into the original pipeline increases the experimental success rate ten-fold.
- Nathaniel R. Bennett
- , Brian Coventry
- & David Baker
-
Article | Open Access
Chemistry-intuitive explanation of graph neural networks for molecular property prediction with substructure masking
Attempts to explain molecular property predictions of neural networks are not always compatible with chemical intuition based on chemical substructures. Here the authors propose the substructure mask explanation method to tackle this challenge.
- Zhenxing Wu
- , Jike Wang
- & Tingjun Hou
-
Article | Open Access
Explainable multi-task learning for multi-modality biological data analysis
Multimodal biological data is challenging to analyze. Here, the authors develop UnitedNet, an explainable deep neural network for analyzing single-cell multimodal biological data and estimating relationships between gene expression and other modalities with cell-type specificity.
- Xin Tang
- , Jiawei Zhang
- & Jia Liu
-
Article | Open Access
Machine learning-driven multifunctional peptide engineering for sustained ocular drug delivery
Sustained drug delivery is critical for patient adherence to chronic disease treatments. Here the authors apply machine learning to engineer multifunctional peptides with high melanin binding, high cell-penetration, and low cytotoxicity, enhancing the duration and efficacy of peptide-drug conjugates for sustained ocular delivery.
- Henry T. Hsueh
- , Renee Ti Chou
- & Laura M. Ensign
-
Article | Open Access
PeakDecoder enables machine learning-based metabolite annotation and accurate profiling in multidimensional mass spectrometry measurements
Alternative algorithms exploiting advantages of multidimensional mass spectrometry in untargeted metabolomics are needed. Here, the authors develop and demonstrate PeakDecoder for confident and accurate metabolite profiling in 116 microbial sample runs using a library built from 64 standards.
- Aivett Bilbao
- , Nathalie Munoz
- & Kristin E. Burnum-Johnson