-
Article
| Open Access | Systematic detection of functional proteoform groups from bottom-up proteomic datasets
Many proteins exist in various proteoforms but detecting these variants by bottom-up proteomics remains difficult. Here, the authors present a computational approach based on peptide correlation analysis to identify and characterize proteoforms from bottom-up proteomics data.
- Isabell Bludau
- , Max Frank
- & Ruedi Aebersold
-
Article
| Open Access | Preventing corneal blindness caused by keratitis using artificial intelligence
Keratitis is the main cause of corneal blindness worldwide. Most vision loss caused by keratitis is avoidable through early detection and treatment, but both are challenging in resource-limited settings. Here, the authors develop a deep learning system for the automated classification of keratitis and other corneal abnormalities.
- Zhongwen Li
- , Jiewei Jiang
- & Wei Chen
-
Article
| Open Access | α-Helical peptidic scaffolds to target α-synuclein toxic species with nanomolar affinity
α-Synuclein (αS) aggregation is a driver of several neurodegenerative disorders. Here, the authors identify a class of peptides that bind toxic αS oligomers and amyloid fibrils but not monomeric functional protein, and prevent further αS aggregation and associated cell damage.
- Jaime Santos
- , Pablo Gracia
- & Salvador Ventura
-
Article
| Open Access | Machine learning differentiates enzymatic and non-enzymatic metals in proteins
The authors generate the largest structural dataset of enzymatic and non-enzymatic metalloprotein sites to date. They use this dataset to train a decision-tree ensemble machine learning algorithm that allows them to distinguish between catalytic and non-catalytic metal sites. The computational model described here could also be useful for the identification of new enzymatic mechanisms and de novo enzyme design.
- Ryan Feehan
- , Meghan W. Franklin
- & Joanna S. G. Slusky
-
Article
| Open Access | Single-cell RNA-seq reveals fibroblast heterogeneity and increased mesenchymal fibroblasts in human fibrotic skin diseases
Fibroblasts are found to be heterogeneous in multiple fibrotic diseases, but fibroblast heterogeneity in fibrotic skin diseases is not well characterized. Here the authors employ scRNA-seq to explore fibroblast heterogeneity in keloid, a paradigm of fibrotic skin diseases.
- Cheng-Cheng Deng
- , Yong-Fei Hu
- & Bin Yang
-
Article
| Open Access | Model-based prediction of spatial gene expression via generative linear mapping
Single-cell RNA-seq loses the spatial information of gene expression in multicellular systems because tissue must be dissociated. Here, the authors show that spatial gene expression profiles can be reconstructed accurately and robustly by Perler, a new computational method based on generative linear mapping.
- Yasushi Okochi
- , Shunta Sakaguchi
- & Honda Naoki
-
Article
| Open Access | MolDiscovery: learning mass spectrometry fragmentation of small molecules
Large numbers of mass spectra have been collected from different samples, but identifying small molecules from these spectra requires database searches, which remain challenging. Here, the authors report molDiscovery, a mass spectral database search method that uses an algorithm to generate mass spectrometry fragmentations and learns a probabilistic model to match small molecules with their mass spectra.
- Liu Cao
- , Mustafa Guler
- & Hosein Mohimani
-
Article
| Open Access | Controlling the pandemic during the SARS-CoV-2 vaccination rollout
Despite the consensus that mass vaccination against SARS-CoV-2 will ultimately end the pandemic, it is not clear when and which control measures can be relaxed during the rollout of vaccination programmes. Here, the authors investigate relaxation scenarios using an age-structured transmission model that has been fitted to data for Portugal.
- João Viana
- , Christiaan H. van Dorp
- & Ganna Rozhnova
-
Article
| Open Access | Learning mutational signatures and their multidimensional genomic properties with TensorSignatures
Currently available tools for the analysis of mutational signatures do not make full use of genomic properties other than mutation patterns. Here, the authors present TensorSignatures, an efficient framework that jointly infers mutational signatures and their genomic determinants.
- Harald Vöhringer
- , Arne Van Hoeck
- & Moritz Gerstung
-
Article
| Open Access | Integrated analysis of Xist upregulation and X-chromosome inactivation with single-cell and single-allele resolution
X-chromosome inactivation (XCI) ensures dosage compensation between the sexes. Here the authors perform allele-specific single-cell RNA sequencing in differentiating mouse embryonic stem cells to provide a detailed profile of the onset of XCI.
- Guido Pacini
- , Ilona Dunkel
- & Edda G. Schulz
-
Article
| Open Access | Insights into household transmission of SARS-CoV-2 from a population-based serological survey
Household-based studies can provide insights into SARS-CoV-2 transmission. Here, the authors fit transmission models to serological data from Geneva, Switzerland, and estimate that the risk of infection from single household exposure (17.3%) was higher than for extra-household exposure (5.1%).
- Qifang Bi
- , Justin Lessler
- & Didier Trono
-
Article
| Open Access | Children’s exploratory play tracks the discriminability of hypotheses
People can infer unobserved causes of perceptual data (e.g. the contents of a box from the sound made by shaking it). Here the authors show that children compare what they hear with what they would have heard given other causes, and explore longer when the heard and imagined sounds are hard to discriminate.
- Max H. Siegel
- , Rachel W. Magid
- & Laura E. Schulz
-
Article
| Open Access | Engineering the protein dynamics of an ancestral luciferase
Directed evolution commonly relies on point mutations but InDels frequently occur in evolution. Here the authors report a protein-engineering framework based on InDel mutagenesis and fragment transplantation resulting in greater catalysis and longer glow-type bioluminescence of the ancestral luciferase.
- Andrea Schenkmayerova
- , Gaspar P. Pinto
- & Jiri Damborsky
-
Article
| Open Access | Benchmarking microbiome transformations favors experimental quantitative approaches to address compositionality and sampling depth biases
Here, the authors use simulated quantitative gut microbial communities to benchmark the performance of 13 common data transformations in determining diversity as well as microbe-microbe and microbe-metadata associations. They find that quantitative approaches incorporating microbial load variation outperform computational strategies in downstream analyses. The authors urge widespread adoption of quantitative approaches and recommend specific computational transformations whenever determining the microbial load of samples is not feasible.
- Verónica Lloréns-Rico
- , Sara Vieira-Silva
- & Jeroen Raes
-
Article
| Open Access | Reliable identification of protein-protein interactions by crosslinking mass spectrometry
Cross-linking mass spectrometry (MS) can identify protein-protein interaction (PPI) networks but assessing the reliability of these data remains challenging. To address this issue, the authors develop and validate a method to determine the false-discovery rate of PPIs identified by cross-linking MS.
- Swantje Lenz
- , Ludwig R. Sinn
- & Juri Rappsilber
-
Article
| Open Access | Analysis of the genomic landscape of yolk sac tumors reveals mechanisms of evolution and chemoresistance
Yolk sac tumours are a type of ovarian germ cell tumour. Here, the authors perform exome and RNA sequencing of clinical samples from 30 patients, characterize the mutational landscape of these rare tumours, and identify molecular features associated with resistance to cisplatin-based therapies.
- Xuan Zong
- , Ying Zhang
- & Jiaxin Yang
-
Article
| Open Access | Maps and metrics of insecticide-treated net access, use, and nets-per-capita in Africa from 2000-2020
Insecticide-treated nets (ITNs) are an important part of malaria control in Africa, and WHO targets aim for 80% coverage. This study estimates the spatio-temporal access and use of ITNs in Africa from 2000-2020, and shows that both metrics have improved over time but access remains below WHO targets.
- Amelia Bertozzi-Villa
- , Caitlin A. Bever
- & Samir Bhatt
-
Article
| Open Access | Effect of specific non-pharmaceutical intervention policies on SARS-CoV-2 transmission in the counties of the United States
Disentangling the impacts of non-pharmaceutical interventions on COVID-19 transmission is challenging as they have been used in different combinations across time and space. This study shows that, early in the epidemic, school/daycare closures and stopping nursing home visits were associated with the biggest reduction in transmission in the United States.
- Bingyi Yang
- , Angkana T. Huang
- & Derek A. T. Cummings
-
Article
| Open Access | Citywide serosurveillance of the initial SARS-CoV-2 outbreak in San Francisco using electronic health records
Population-based surveys are the gold standard for estimating seroprevalence but are expensive and often only capture a small geographic area or window of time. This study describes a new platform, SCALE-IT, for serosurveillance based on algorithmic sampling of electronic health records, and uses it to estimate the seroprevalence of SARS-CoV-2 in San Francisco.
- Isobel Routledge
- , Adrienne Epstein
- & Isabel Rodriguez-Barraquer
-
Article
| Open Access | Machine learning analyses of antibody somatic mutations predict immunoglobulin light chain toxicity
Systemic light chain amyloidosis (AL) is caused by the production of toxic light chains and can be fatal, yet effective treatments are often not possible due to delayed diagnosis. Here the authors show that a machine learning platform analyzing light chain somatic mutations allows the prediction of light chain toxicity to serve as a possible tool for early diagnosis of AL.
- Maura Garofalo
- , Luca Piccoli
- & Andrea Cavalli
-
Article
| Open Access | Algebraic graph-assisted bidirectional transformers for molecular property prediction
Despite considerable efforts, quantitative prediction of various molecular properties remains a challenge. Here, the authors propose an algebraic graph-assisted bidirectional transformer, which incorporates massive unlabeled molecular data into molecular representations via a self-supervised learning strategy and is assisted by 3D stereochemical information from graphs.
- Dong Chen
- , Kaifu Gao
- & Feng Pan
-
Article
| Open Access | Machine-learning predicts genomic determinants of meiosis-driven structural variation in a eukaryotic pathogen
Structural variation in genomes of the same species is frequent but what drives the rearrangements remains unclear. Machine-learning of rearrangement patterns among telomere-to-telomere assemblies can accurately identify regions of intrinsic DNA instability in a eukaryotic pathogen.
- Thomas Badet
- , Simone Fouché
- & Daniel Croll
-
Article
| Open Access | Rapid detection of identity-by-descent tracts for mega-scale datasets
Traditional methods to identify genomic regions identical-by-descent (IBD) do not scale well to biobank-level datasets. Here, the authors describe a new IBD algorithm, iLASH, which uses LocAlity-Sensitive Hashing to provide rapid IBD estimation when applied to the PAGE and UK Biobank datasets.
- Ruhollah Shemirani
- , Gillian M. Belbin
- & José Luis Ambite
-
Article
| Open Access | Cell segmentation-free inference of cell types from in situ transcriptomics data
Inaccurate cell segmentation has been the major problem for cell-type identification and tissue characterization in in situ spatially resolved transcriptomics data. Here we present SSAM, a robust cell segmentation-free computational framework for identifying cell types and tissue domains in 2D and 3D.
- Jeongbin Park
- , Wonyl Choi
- & Naveed Ishaque
-
Article
| Open Access | Hybrid AI-assistive diagnostic model permits rapid TBS classification of cervical liquid-based thin-layer cell smears
Technical advancements have significantly improved early diagnosis of cervical cancer, but accurate diagnosis is still difficult due to various practical factors. Here, the authors develop an artificial intelligence assistive diagnostic solution to improve cervical liquid-based thin-layer cell smear diagnosis according to clinical TBS criteria in a large multicenter study.
- Xiaohui Zhu
- , Xiaoming Li
- & Yanqing Ding
-
Article
| Open Access | R2DT is a framework for predicting and visualising RNA secondary structure using templates
Non-coding RNA function is poorly understood, partly due to the challenge of determining RNA secondary (2D) structure. Here, the authors present a framework for the reproducible prediction and visualization of the 2D structure of a wide array of RNAs, which enables linking RNA sequence to function.
- Blake A. Sweeney
- , David Hoksza
- & Anton I. Petrov
-
Article
| Open Access | Time trajectories in the transcriptomic response to exercise - a meta-analysis
Regular exercise promotes overall health and prevents non-communicable diseases, but the adaptation mechanisms are unclear. Here, the authors perform a meta-analysis to reveal time-specific patterns of the acute and long-term exercise response in human skeletal muscle, and identify sex- and age-specific changes.
- David Amar
- , Malene E. Lindholm
- & Euan A. Ashley
-
Article
| Open Access | Variant-specific inflation factors for assessing population stratification at the phenotypic variance level
Pooling participant-level genetic data into a single analysis can result in variance stratification, reducing statistical performance. Here, the authors develop variant-specific inflation factors to assess variance stratification and apply this to pooled individual-level data from whole genome sequencing.
- Tamar Sofer
- , Xiuwen Zheng
- & Kenneth M. Rice
-
Article
| Open Access | Deep learning connects DNA traces to transcription to reveal predictive features beyond enhancer–promoter contact
Recent advances in super-resolution microscopy have made it possible to measure chromatin 3D structure and transcription in thousands of single cells. Here, the authors present a deep learning-based approach to characterise how chromatin structure relates to the transcriptional state of individual cells and to determine which structural features of chromatin regulation are important for gene expression state.
- Aparna R. Rajpurkar
- , Leslie J. Mateo
- & Alistair N. Boettiger
-
Article
| Open Access | Systematic benchmarking of tools for CpG methylation detection from nanopore sequencing
Several existing algorithms predict the methylation of DNA using Nanopore sequencing signals, but it is unclear how they compare in performance. Here, the authors benchmark the performance of several such tools, and propose METEORE, a consensus tool that improves prediction accuracy.
- Zaka Wing-Sze Yuen
- , Akanksha Srivastava
- & Eduardo Eyras
-
Article
| Open Access | MOGONET integrates multi-omics data using graph convolutional networks allowing patient classification and biomarker identification
Our understanding of human disease can be improved by integrating the abundance of high-throughput biomedical data. Here, the authors apply deep learning methods that have proven successful on images to integrate various types of omics data, improving patient classification and identifying disease biomarkers.
- Tongxin Wang
- , Wei Shao
- & Kun Huang
-
Article
| Open Access | Large variation in anti-SARS-CoV-2 antibody prevalence among essential workers in Geneva, Switzerland
Many job sectors classified as ‘essential’ have continued operating with limited restrictions during the COVID-19 pandemic, potentially placing workers at higher risk of infection. Here, the authors show that seropositivity rates in workers vary widely across and between job sectors in Geneva, Switzerland.
- Silvia Stringhini
- , María-Eugenia Zaballa
- & Idris Guessous
-
Article
| Open Access | Optimizing vaccine allocation for COVID-19 vaccines shows the potential role of single-dose vaccination
Most COVID-19 vaccines require two doses but a single dose provides partial protection, so it is unclear how best to prioritize vaccine distribution in the context of limited supply. Here, the authors show that campaigns in which some age groups receive one dose while others receive both doses may be optimal.
- Laura Matrajt
- , Julia Eaton
- & Holly Janes
-
Article
| Open Access | HiC-DC+ enables systematic 3D interaction calls and differential analysis for Hi-C and HiChIP
The genome-wide investigation of chromatin organization enables insights into global gene expression control. Here, the authors present a computationally efficient method for the analysis of chromatin organization data and use it to recover principles of 3D organization across conditions.
- Merve Sahin
- , Wilfred Wong
- & Christina S. Leslie
-
Article
| Open Access | Replicate sequencing libraries are important for quantification of allelic imbalance
Allele-specific expression in diploid organisms can be quantified by RNA-seq, and it is common practice to rely on a single library. Here, the authors show that the standard approach has a variable error rate and present Qllelic as a tool to improve the reproducibility of allele-specific RNA-seq analysis.
- Asia Mendelevich
- , Svetlana Vinogradova
- & Alexander A. Gimelbrant
-
Article
| Open Access | Deep learning boosts sensitivity of mass spectrometry-based immunopeptidomics
The identification of HLA peptides by mass spectrometry is non-trivial. Here, the authors extend and use the wealth of data from the ProteomeTools project to improve the prediction of non-tryptic peptides using deep learning, and show that their approach enables a variety of immunological discoveries.
- Mathias Wilhelm
- , Daniel P. Zolg
- & Bernhard Kuster
-
Article
| Open Access | Quantitative single-cell proteomics as a tool to characterize cellular hierarchies
Single-cell proteomics can provide insights into the molecular basis for cellular heterogeneity. Here, the authors develop a multiplexed single-cell proteomics and computational workflow, and show that their strategy captures the cellular hierarchies in an Acute Myeloid Leukemia culture model.
- Erwin M. Schoof
- , Benjamin Furtwängler
- & Bo T. Porse
-
Article
| Open Access | MOCCASIN: a method for correcting for known and unknown confounders in RNA splicing analysis
Confounding factors in gene expression analysis can be analyzed by several existing tools. Here, the authors develop an algorithm called MOCCASIN to correct for the effect of known and unknown confounders on RNA splicing quantification.
- Barry Slaff
- , Caleb M. Radens
- & Yoseph Barash
-
Article
| Open Access | Systems genetics in diversity outbred mice inform BMD GWAS and identify determinants of bone strength
Osteoporosis GWAS faces two challenges, causal gene discovery and a lack of phenotypic diversity. Here, the authors use the Diversity Outbred mouse population to inform human GWAS using networks and map genetic loci for 55 bone traits, identifying new potential bone strength genes.
- Basel M. Al-Barghouthi
- , Larry D. Mesner
- & Charles R. Farber
-
Article
| Open Access | A deep learning approach to identify gene targets of a therapeutic for human splicing disorders
Drugs that modify RNA splicing are promising treatments for many genetic diseases. Here the authors show that deep learning strategies can predict drug targets, strongly supporting the use of in silico approaches to expand the therapeutic potential of drugs that modulate RNA splicing.
- Dadi Gao
- , Elisabetta Morini
- & Susan A. Slaugenhaupt
-
Article
| Open Access | Anchor extension: a structure-guided approach to design cyclic peptides targeting enzyme active sites
Cyclic peptides are of particular interest due to their pharmacological properties, but their design for binding to a target protein is challenging. Here, the authors present a computational “anchor extension” methodology for de novo design of cyclic peptides that bind to the target protein with high affinity, and validate the approach by developing cyclic peptides that inhibit histone deacetylases 2 and 6.
- Parisa Hosseinzadeh
- , Paris R. Watson
- & David Baker
-
Article
| Open Access | The regulatory landscape of Arabidopsis thaliana roots at single-cell resolution
Existing studies of chromatin accessibility, the primary mark of regulatory DNA, in Arabidopsis are based mainly on bulk samples. Here, the authors report the regulatory landscape of Arabidopsis thaliana roots at single-cell resolution.
- Michael W. Dorrity
- , Cristina M. Alexandre
- & Josh T. Cuperus
-
Article
| Open Access | RNA structure probing reveals the structural basis of Dicer binding and cleavage
Sequencing methods such as icSHAPE were developed to probe RNA structures transcriptome-wide in cells. To probe intact RNA structures, the authors develop icSHAPE-MaP and apply it to Dicer-bound substrates, showing that distance measuring is important for Dicer cleavage of pre-miRNAs.
- Qing-Jun Luo
- , Jinsong Zhang
- & Qiangfeng Cliff Zhang
-
Article
| Open Access | Nuclear compartmentalization of TERT mRNA and TUG1 lncRNA is driven by intron retention
RNA localization plays an important role in transcriptome regulation. The majority of TERT transcripts are detected in the nucleus and TUG1 lncRNAs in both the nucleus and cytoplasm. Here, the authors combine single-cell RNA imaging, antisense oligonucleotides and splicing analyses to show that retention of specific introns drives stable compartmentalization of TERT and TUG1 transcripts in the nucleus, and that splicing of TERT retained introns is mitotically regulated.
- Gabrijela Dumbović
- , Ulrich Braunschweig
- & John L. Rinn
-
Article
| Open Access | Crowdsourced mapping of unexplored target space of kinase inhibitors
The IDG-DREAM Challenge carried out crowdsourced benchmarking of predictive algorithms for kinase inhibitor activities on unpublished data. This study provides a resource to compare emerging algorithms and prioritize new kinase activities to accelerate drug discovery and repurposing efforts.
- Anna Cichońska
- , Balaguru Ravikumar
- & Tero Aittokallio
-
Article
| Open Access | Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network
Mammalian genomes are scattered with repetitive sequences, but their biology remains largely elusive. Here, the authors show that transcription can initiate from short tandem repetitive sequences, and that genetic variants linked to human diseases are preferentially found at repeats with high transcription initiation levels.
- Mathys Grapotte
- , Manu Saraswat
- & Charles-Henri Lecellier
-
Article
| Open Access | Model-based analysis uncovers mutations altering autophagy selectivity in human cancer
Although autophagy has been linked to tumourigenesis, it is unclear how genomic alterations affect autophagy selectivity in tumours. Here, the authors establish a pipeline that integrates computational and experimental approaches to show that altered autophagy selectivity is frequent in cancer cells and link glycogen autophagy with tumourigenesis.
- Zhu Han
- , Weizhi Zhang
- & Da Jia
-
Article
| Open Access | Generative modeling of single-cell time series with PRESCIENT enables prediction of cell trajectories with interventions
Single-cell RNA-Seq allows us to observe snapshots of how biological systems change over time at cellular resolution. Here, the authors develop a generative framework that uses time-resolved single-cell data to model how cells change in physical time, including in response to perturbations.
- Grace Hui Ting Yeo
- , Sachit D. Saksena
- & David K. Gifford
-
Article
| Open Access | Integrating genomics and metabolomics for scalable non-ribosomal peptide discovery
Current genome mining methods predict many putative non-ribosomal peptides (NRPs) from their corresponding biosynthetic gene clusters, but it remains unclear which of those exist in nature and how to identify their post-assembly modifications. Here, the authors develop NRPminer, a modification-tolerant tool for the discovery of NRPs from large genomic and mass spectrometry datasets, and use it to find 180 NRPs from different environments.
- Bahar Behsaz
- , Edna Bode
- & Hosein Mohimani