Featured
Article
| Open Access | Integrating genomics and metabolomics for scalable non-ribosomal peptide discovery
Current genome mining methods predict many putative non-ribosomal peptides (NRPs) from their corresponding biosynthetic gene clusters, but it remains unclear which of those exist in nature and how to identify their post-assembly modifications. Here, the authors develop NRPminer, a modification-tolerant tool for the discovery of NRPs from large genomic and mass spectrometry datasets, and use it to find 180 NRPs from different environments.
- Bahar Behsaz
- , Edna Bode
- & Hosein Mohimani
Article
| Open Access | Enhancing CRISPR-Cas9 gRNA efficiency prediction by data integration and deep learning
High-quality gRNA activity data is needed for accurate on-target efficiency predictions. Here the authors generate activity data for over 10,000 gRNAs and build a deep learning model, CRISPRon, for improved performance predictions.
- Xi Xiang
- , Giulia I. Corsi
- & Yonglun Luo
Article
| Open Access | Automated annotation and visualisation of high-resolution spatial proteomic mass spectrometry imaging data using HIT-MAP
MALDI-mass spectrometry imaging (MSI) can reveal the distribution of proteins in tissues but tools for protein identification and annotation are sparse. Here, the authors develop an open-source bioinformatic workflow for false discovery rate-controlled protein annotation and spatial mapping from MALDI-MSI data.
- G. Guo
- , M. Papanicolaou
- & A. C. Grey
Article
| Open Access | Multimodal analysis of cell-free DNA whole-genome sequencing for pediatric cancers with low mutational burden
Liquid biopsies enable minimally invasive applications for diagnosis and treatment monitoring. Here the authors analyse fragmentation patterns of circulating tumour DNA on multiple levels and develop a bioinformatic tool, LIQUORICE, to accurately detect and classify paediatric cancers with low mutational burden.
- Peter Peneder
- , Adrian M. Stütz
- & Eleni M. Tomazou
Article
| Open Access | Retention time prediction using neural networks increases identifications in crosslinking mass spectrometry
Predicting chromatographic retention times (RTs) has proven beneficial in proteomics but has not yet been achieved for crosslinked peptides. Here, the authors develop an RT prediction tool for crosslinked peptides and leverage predicted RTs to increase identifications in crosslinking mass spectrometry studies.
- Sven H. Giese
- , Ludwig R. Sinn
- & Juri Rappsilber
Article
| Open Access | Impact of DNA methylation on 3D genome structure
Multi-layered epigenetic regulation in higher eukaryotes makes it challenging to disentangle the individual effects of modifications on chromatin structure and function. Here, the authors expressed mammalian DNA methyltransferases in yeast, which have no DNA methylation, to show that methylation has intrinsic effects on chromatin structure.
- Diana Buitrago
- , Mireia Labrador
- & Modesto Orozco
Article
| Open Access | Chromosomal copy number heterogeneity predicts survival rates across cancers
Intratumour heterogeneity (ITH) is associated with worse prognosis in cancer, and efficient frameworks to measure it are needed. Here the authors develop a method to estimate copy number heterogeneity, and propose that it is driven by chromosomal instability and can predict pan-cancer survival.
- Erik van Dijk
- , Tom van den Bosch
- & Daniël M. Miedema
Article
| Open Access | Structure-based protein function prediction using graph convolutional networks
The rapid increase in the number of proteins in sequence databases and the diversity of their functions challenge computational approaches for automated function prediction. Here, the authors introduce DeepFRI, a Graph Convolutional Network for predicting protein functions by leveraging sequence features extracted from a protein language model and protein structures.
- Vladimir Gligorijević
- , P. Douglas Renfrew
- & Richard Bonneau
Article
| Open Access | Identification of putative causal loci in whole-genome sequencing data via knockoff statistics
Association analyses that capture rare and noncoding variants in whole genome sequencing data are limited by factors like statistical power. Here, the authors present KnockoffScreen, a statistical method using the knockoff framework to detect, localise and prioritise rare and common risk variants at genome-wide scale.
- Zihuai He
- , Linxi Liu
- & Iuliana Ionita-Laza
Article
| Open Access | An electronic neuromorphic system for real-time detection of high frequency oscillations (HFO) in intracranial EEG
A major challenge across a variety of fields is how to process the vast quantities of data produced by sensors without large computation resources. Here, the authors present a neuromorphic chip which can detect a relevant signature of epileptogenic tissue from intracranial recordings in patients.
- Mohammadali Sharifshazileh
- , Karla Burelo
- & Giacomo Indiveri
Article
| Open Access | Health improvement framework for actionable treatment planning using a surrogate Bayesian model
Clinical decision-making regarding treatments based on personal characteristics leads to effective health improvements. Here, the authors introduce a modeling framework to evaluate the actionability of treatment pathways.
- Kazuki Nakamura
- , Ryosuke Kojima
- & Yasushi Okuno
Article
| Open Access | Synthetic neural-like computing in microbial consortia for pattern recognition
Complex biological systems have individual cells acting collectively to solve complex tasks. Here the authors implement neural network-like computing in bacterial consortia to recognise patterns.
- Ximing Li
- , Luna Rizik
- & Ramez Daniel
Article
| Open Access | Detecting and phasing minor single-nucleotide variants from long-read sequencing data
Cellular genetic heterogeneity is common across biological conditions, yet application of long-read sequencing to this subject is limited by error rates. Here, the authors present iGDA, a tool for detection and phasing of minor variants from long-read sequencing data, allowing accurate reconstruction of haplotypes.
- Zhixing Feng
- , Jose C. Clemente
- & Eric E. Schadt
Article
| Open Access | Evolution of core archetypal phenotypes in progressive high grade serous ovarian cancer
High-grade serous ovarian cancer (HGSOC) is prone to developing resistance to treatment. Here, the authors use single-cell RNA-seq and an analysis of archetypes, and find that shifts in metabolism and proliferation are associated with the response to treatment and clonal heterogeneity in HGSOC.
- Aritro Nath
- , Patrick A. Cosgrove
- & Andrea H. Bild
Article
| Open Access | Clump sequencing exposes the spatial expression programs of intestinal secretory cells
Combining scRNA-seq with spatial information to enable the reconstruction of spatially resolved cell atlases is challenging for rare cell types. Here the authors present ClumpSeq, an approach for sequencing small clumps of tissue-attached cells, and apply it to establish spatial atlases for all secretory cell types in the small intestine.
- Rita Manco
- , Inna Averbukh
- & Shalev Itzkovitz
Article
| Open Access | Mining mutation contexts across the cancer genome to map tumor site of origin
The vast majority of somatic mutations observed in tumors are rare. Here, the authors show that these large numbers of rare mutations are more predictive of the tissue of origin of a tumor than the information from a few common driver mutations.
- Saptarshi Chakraborty
- , Axel Martin
- & Ronglai Shen
Article
| Open Access | Permutation-based identification of important biomarkers for complex diseases via machine learning models
Study of human disease remains challenging due to convoluted disease etiologies and complex molecular mechanisms at genetic, genomic, and proteomic levels. Here, the authors propose a computationally efficient Permutation-based Feature Importance Test to assist interpretation and selection of individual features in complex machine learning models for complex disease analysis.
- Xinlei Mi
- , Baiming Zou
- & Jianhua Hu
Article
| Open Access | Controlling COVID-19 via test-trace-quarantine
Initial COVID-19 containment in the United States focused on limiting mobility, including school and workplace closures, with enormous societal and economic costs. Here, the authors demonstrate the feasibility of a test-trace-quarantine strategy using an agent-based model and detailed data on the Seattle region.
- Cliff C. Kerr
- , Dina Mistry
- & Daniel J. Klein
Article
| Open Access | Interpretation of T cell states from single-cell transcriptomics data using reference atlases
One challenge of single cell RNA sequencing analysis is how to consistently identify cell subtypes and states across different datasets. Here the authors propose the use of a reference single-cell atlas as a stable system of coordinates to characterize T cell states across studies, diseases and species.
- Massimo Andreatta
- , Jesus Corria-Osorio
- & Santiago J. Carmona
Article
| Open Access | Gene-level metagenomic architectures across diseases yield high-resolution microbiome diagnostic indicators
Here, combing the massive gene universe of the gut microbiome to identify strain-specific, cross-disease associations across seven human diseases, the authors introduce the concept of microbiome architecture, defined as the complete set of positive and negative associations between microbial genes and human host disease, and highlight microbiome architectures as potential diagnostic indicators.
- Braden T. Tierney
- , Yingxuan Tan
- & Chirag J. Patel
Article
| Open Access | A global resource for genomic predictions of antimicrobial resistance and surveillance of Salmonella Typhi at pathogenwatch
Whole genome sequencing data are increasingly becoming routinely available but generating actionable insights is challenging. Here, the authors describe Pathogenwatch, a web tool for genomic surveillance of S. Typhi, and demonstrate its use for antimicrobial resistance assignment and strain risk assessment.
- Silvia Argimón
- , Corin A. Yeats
- & David M. Aanensen
Article
| Open Access | Supervised dimensionality reduction for big data
Biomedical measurements usually generate high-dimensional data where individual samples are classified in several categories. Vogelstein et al. propose a supervised dimensionality reduction method which estimates the low-dimensional data projection for classification and prediction in big datasets.
- Joshua T. Vogelstein
- , Eric W. Bridgeford
- & Mauro Maggioni
Article
| Open Access | The neutrotime transcriptional signature defines a single continuum of neutrophils across biological compartments
Differentiating neutrophil functional states is difficult. Here the authors show, using single cell RNA-sequencing and trajectory analyses, that mouse neutrophils can be presented as a transcriptome continuum rather than discrete subsets, but are affected by inflammation to express distinct transcriptional states.
- Ricardo Grieshaber-Bouyer
- , Felix A. Radtke
- & Hideyuki Yoshida
Article
| Open Access | Connectivity characterization of the mouse basolateral amygdalar complex
The basolateral amygdala is implicated in several behavior-related states including anxiety, autism, and addiction. The authors apply circuit-level pathway tracing methods combined with computational techniques to provide a comprehensive connectivity atlas of the mouse basolateral amygdala complex.
- Houri Hintiryan
- , Ian Bowman
- & Hong-Wei Dong
Article
| Open Access | AutoSpill is a principled framework that simplifies the analysis of multichromatic flow cytometry data
Flow cytometry allows the simultaneous quantification of many markers in and on a cell, but the analysis of such data is complicated. Here, the authors propose AutoSpill, a framework that facilitates the analysis of such data by automating parts of the analysis and requiring fewer controls.
- Carlos P. Roca
- , Oliver T. Burton
- & Adrian Liston
Article
| Open Access | Hierarchical progressive learning of cell identities in single-cell data
Classification methods for scRNA-seq data are limited in their ability to learn from multiple datasets simultaneously. Here the authors present scHPL, a hierarchical progressive learning method that automatically finds relationships between cell populations across multiple datasets and constructs a classification tree.
- Lieke Michielsen
- , Marcel J. T. Reinders
- & Ahmed Mahfouz
Article
| Open Access | Total genetic contribution assessment across the human genome
Quantifying the effects of individual loci on the human phenome is a challenging task. Here, the authors introduce a modelling technique, TGCA, that assesses total genetic contribution per locus and apply this to UK Biobank phenotype domains, revealing top loci and links to tissue-specific gene expression.
- Ting Li
- , Zheng Ning
- & Xia Shen
Article
| Open Access | Using mobile phone data to reveal risk flow networks underlying the HIV epidemic in Namibia
Human mobility influences the spatial distribution of infectious diseases such as HIV. Here, the authors use call data records from mobile phones to model HIV networks in Namibia and estimate that ~40% of the risk of HIV acquisition is driven by mobility.
- Eugenio Valdano
- , Justin T. Okano
- & Sally Blower
Article
| Open Access | Tissue context determines the penetrance of regulatory DNA variation
The functional consequences of variation in human regulatory DNA depend on the local chromatin environment and the cell/tissue context. Here the authors use highly diverged hybrid mice to study genetic effects on DNA accessibility in vivo across multiple cell and tissue types.
- Jessica M. Halow
- , Rachel Byron
- & Matthew T. Maurano
Article
| Open Access | Pairing a high-resolution statistical potential with a nucleobase-centric sampling algorithm for improving RNA model refinement
Predicting RNA structure from sequence is challenging due to the relative sparsity of experimentally-determined RNA 3D structures for model training. Here, the authors propose a way to incorporate knowledge on interactions at the atomic and base–base level to refine the prediction of RNA structures.
- Peng Xiong
- , Ruibo Wu
- & Yaoqi Zhou
Article
| Open Access | Landscape of allele-specific transcription factor binding in the human genome
Single-nucleotide variants in enhancers or promoters may affect gene transcription by altering transcription factor binding sites. Here the authors present a meta-analysis empowered by a new statistical method covering thousands of ChIP-Seq experiments resulting in the identification of more than 500 thousand allele-specific binding (ASB) events in the human genome.
- Sergey Abramov
- , Alexandr Boytsov
- & Ivan V. Kulakovskiy
Article
| Open Access | Distinct axial and lateral interactions within homologous filaments dictate the signaling specificity and order of the AIM2-ASC inflammasome
AIM2-ASC inflammasomes are filamentous signalling platforms that play a central role in host innate defence. Here, the authors present the filament cryo-EM structure of the inflammasome receptor AIM2, which is very similar to the adaptor ASC filament structure. By employing Rosetta and Molecular Dynamics simulations the authors provide further insights into the directionality and recognition mechanisms of the individual AIM2 and ASC filaments, which is further validated with biochemical and cellular experiments.
- Mariusz Matyszewski
- , Weili Zheng
- & Jungsan Sohn
Article
| Open Access | Estimating COVID-19 mortality in Italy early in the COVID-19 pandemic
Estimates of COVID-19-related mortality are limited by incomplete testing. Here, the authors perform counterfactual analyses and estimate that there were 59,000–62,000 deaths from COVID-19 in Italy until 9th September 2020, approximately 1.5 times higher than official statistics.
- Chirag Modi
- , Vanessa Böhm
- & Uroš Seljak
Article
| Open Access | The epidemicity index of recurrent SARS-CoV-2 infections
Several prognostic indices are available to predict the long-term fate of emerging infectious diseases and the effect of their containment measures, including a variety of reproduction numbers. Here, the authors introduce the epidemicity index, a complementary index to evaluate the potential for transient increases of SARS-CoV-2 epidemics.
- Lorenzo Mari
- , Renato Casagrandi
- & Marino Gatto
Article
| Open Access | Overcoming false-positive gene-category enrichment in the analysis of spatially resolved transcriptomic brain atlas data
Identifying enriched gene sets in transcriptomic data is routine analysis. Here, the authors show that conventional gene category enrichment analysis (GCEA) applied to brain-wide atlas data yields biased results and develop a flexible ensemble-based null model framework to enable appropriate inference in GCEA.
- Ben D. Fulcher
- , Aurina Arnatkeviciute
- & Alex Fornito
Article
| Open Access | Neural network aided approximation and parameter inference of non-Markovian models of gene expression
Cells are complex systems that make decisions biologists struggle to understand. Here, the authors use neural networks to approximate the solution of mathematical models that capture the history and randomness of biochemical processes in order to understand the principles of transcription control.
- Qingchao Jiang
- , Xiaoming Fu
- & Ramon Grima
Article
| Open Access | Integration of machine learning and genome-scale metabolic modeling identifies multi-omics biomarkers for radiation resistance
Personalized prediction of tumor radiosensitivity would facilitate development of precision medicine workflows for cancer treatment. Here, the authors integrate machine learning and genome-scale metabolic modeling approaches to identify multi-omics biomarkers predictive of radiation response.
- Joshua E. Lewis
- & Melissa L. Kemp
Article
| Open Access | A machine learning model for identifying patients at risk for wild-type transthyretin amyloid cardiomyopathy
Transthyretin amyloid cardiomyopathy is a treatable but often unrecognized cause of heart failure. We derived and validated a machine learning model based on medical diagnostic codes that identifies heart failure patients at risk for wild-type transthyretin amyloid cardiomyopathy.
- Ahsan Huda
- , Adam Castaño
- & Sanjiv J. Shah
Article
| Open Access | The RNA landscape of the human placenta in health and disease
Placental dysfunction can have catastrophic or barely discernible effects ranging from miscarriage to apparently normal birth. Here the authors present a comprehensive analysis of the human placental transcriptome and identify circular RNAs and piRNAs.
- Sungsam Gong
- , Francesca Gaccioli
- & D. Stephen Charnock-Jones
Article
| Open Access | SARS-CoV-2 gene content and COVID-19 mutation impact by comparing 44 Sarbecovirus genomes
The SARS-CoV-2 gene set remains unresolved, hindering dissection of COVID-19 biology. Comparing 44 Sarbecovirus genomes provides a high-confidence protein-coding gene set. The study characterizes protein-level and nucleotide-level evolutionary constraints, and prioritizes functional mutations from the ongoing COVID-19 pandemic.
- Irwin Jungreis
- , Rachel Sealfon
- & Manolis Kellis
Article
| Open Access | Artificial intelligence-enabled fully automated detection of cardiac amyloidosis using electrocardiograms and echocardiograms
Cardiac amyloidosis is difficult to identify, given its low prevalence and the similarity of its symptoms to those of more prevalent disorders. Here the authors present a multi-modality, artificial intelligence-enabled pipeline that enables automated detection of cardiac amyloidosis from inexpensive and accessible measures.
- Shinichi Goto
- , Keitaro Mahara
- & Rahul C. Deo
Article
| Open Access | Global population structure and genotyping framework for genomic surveillance of the major dysentery pathogen, Shigella sonnei
Whole genome sequencing is increasingly being adopted for Shigella sonnei outbreak investigation and surveillance, but there is no global classification standard. Here, the authors develop and validate a genomic framework implemented using open-source software, and demonstrate its application using surveillance data.
- Jane Hawkey
- , Kalani Paranagama
- & Kathryn E. Holt
Article
| Open Access | A coordinated progression of progenitor cell states initiates urinary tract development
Nephric duct (ND)-derived ureteric buds (UB) form the kidney collecting duct system, while ureteric tips promote nephron formation. Here the authors use single-cell RNA-seq and introduce Cluster RNA-seq to identify four progenitor populations in developing ND/UB regulated by the transcription factors Tfap2a/b and Gata3.
- Oraly Sanchez-Ferras
- , Alain Pacis
- & Maxime Bouchard
Article
| Open Access | Comprehensive cell type decomposition of circulating cell-free DNA with CelFiE
Tissue damage and turnover lead to the release of DNA in the blood and can be used to monitor changes in tissue state. Here, the authors developed a tool to accurately estimate the proportion of cell types contributing to cell-free DNA in the blood, with an application to pregnant women and ALS patients.
- Christa Caggiano
- , Barbara Celona
- & Noah Zaitlen
Article
| Open Access | Genoppi is an open-source software for robust and standardized integration of proteomic and genetic data
Genetic variation can impact protein complexes and interaction networks, but reconciling genetic and proteomic information remains challenging. To address this need, the authors develop Genoppi, a computational tool for integrating genetics and cell-type-specific proteomics data.
- Greta Pintacuda
- , Frederik H. Lassen
- & Kasper Lage
Article
| Open Access | Deep learning-based predictive identification of neural stem cell differentiation
The differentiation of neural stem cells (NSCs) into neurons is a critical part of devising potential cell-based therapeutic strategies for central nervous system diseases, but determining and predicting NSC fate is problematic. Here, the authors present a deep neural network model for the reliable prediction and identification of NSC fate.
- Yanjing Zhu
- , Ruiqi Huang
- & Rongrong Zhu
Article
| Open Access | Systematic inference and comparison of multi-scale chromatin sub-compartments connects spatial organization to cell phenotypes
Computational algorithms to infer chromatin sub-compartments and compartment domains require high-resolution Hi-C maps. Here the authors present Calder, an algorithm that can infer sub-compartments and compartment domains with variable resolution Hi-C data, and they apply it to more than a hundred Hi-C experiments to study sub-compartment repositioning.
- Yuanlong Liu
- , Luca Nanni
- & Giovanni Ciriello
Article
| Open Access | Decoupling epithelial-mesenchymal transitions from stromal profiles by integrative expression analysis
Epithelial cancer cells can transition into a mesenchymal phenotype to enable invasion and metastasis. Here, the authors use previously published single-cell and bulk RNA sequencing datasets to decouple the mesenchymal expression profiles of cancer and stromal cells.
- Michael Tyler
- & Itay Tirosh
Article
| Open Access | Regression plane concept for analysing continuous cellular processes with machine learning
High-content screening prompted the development of software enabling discrete phenotypic analysis of single cells. Here, the authors show that supervised continuous machine learning can drive novel discoveries in diverse imaging experiments and present the Regression Plane module of Advanced Cell Classifier.
- Abel Szkalisity
- , Filippo Piccinini
- & Peter Horvath