Article | Open Access
scDisInFact: disentangled learning for integration and prediction of multi-batch multi-condition single-cell RNA-sequencing data
Here the authors propose a deep learning model that integrates multi-condition, multi-batch single-cell RNA-sequencing datasets. The model disentangles biological variation (condition effect) from technical confounders (batch effect) and overcomes some limitations of existing approaches.
Ziqi Zhang, Xinye Zhao & Xiuwei Zhang

Article | Open Access
Longitudinal quantification of Bifidobacterium longum subsp. infantis reveals late colonization in the infant gut independent of maternal milk HMO composition
Here, the authors develop a high-throughput method to quantify Bifidobacterium longum subsp. infantis (BL. infantis), a proficient HMO-utilizer, from metagenomic sequencing data and apply it to a longitudinal cohort of 21 mother-infant dyads. The results suggest that BL. infantis colonization starts late in the breast-feeding period.
Dena Ennis, Shimrit Shmorak & Moran Yassour

Article | Open Access
Single-cell analysis of psoriasis resolution demonstrates an inflammatory fibroblast state targeted by IL-23 blockade
Single cell profiling of tissue from patients undergoing therapy has the potential to identify drug-induced immune changes. Here the authors show a skin scRNA-seq study of psoriasis patients treated with an IL-23 inhibitor and characterize changes in cell states during early treatment.
Luc Francis, Daniel McCluskey & Satveer K. Mahil

Article | Open Access
Semi-supervised integration of single-cell transcriptomics data
Batch effects hinder multi-sample single-cell data analyses. Here, authors present STACAS, a scalable single-cell RNA-seq data integration tool that uses prior cell type knowledge to preserve biological variability, demonstrating robustness to noisy input cell type labels.
Massimo Andreatta, Léonard Hérault & Santiago J. Carmona

Article | Open Access
Utility of long-read sequencing for All of Us
Using All of Us pilot data, the authors compare short- and long-read performance across medically relevant genes and showcase the utility of long reads for improving variant detection and phasing in both easy- and hard-to-resolve medically relevant genes.
M. Mahmoud, Y. Huang & F. J. Sedlazeck

Article | Open Access
Single-cell multiomics decodes regulatory programs for mouse secondary palate development
Development of the secondary palate is a complex process. Here, the authors profile mouse palatogenesis through single-cell multiome sequencing, revealing dynamic gene regulation across embryonic days (E) 12.5, E13.5, E14.0, and E14.5.
Fangfang Yan, Akiko Suzuki & Zhongming Zhao

Article | Open Access
Trajectory inference across multiple conditions with condiments
scRNA-Seq has enabled the study of dynamic systems, such as the response to a drug, at the individual cell and gene levels. Here the authors introduce a framework to interpret differences at the trajectory, cell population, and individual gene levels.
Hector Roux de Bézieux, Koen Van den Berge & Sandrine Dudoit

Article | Open Access
SiFT: uncovering hidden biological processes by probabilistic filtering of single-cell data
Cells simultaneously encode multiple signals, some harder to recover. Here, authors introduce SiFT (Signal FilTering), a kernel-based projection method, revealing underlying biological processes in single-cell data.
Zoe Piran & Mor Nitzan

Article | Open Access
Compositional and temporal division of labor modulates mixed sugar fermentation by an engineered yeast consortium
Synthetic microbial communities are suitable for mixed substrates fermentation and long metabolic pathway engineering. Here, the authors combine fermentation experiments with mathematical modeling to reveal the effect of compositional and temporal changes on division of labor in cellulosic ethanol production using two yeast strains.
Jonghyeok Shin, Siqi Liao & Yong-Su Jin

Article | Open Access
Anti-correlated feature selection prevents false discovery of subpopulations in scRNAseq
Typical single-cell RNAseq pipelines will subcluster homogeneous cells. Here, authors present a computational algorithm for accurately identifying cell-type marker genes in single-cell data analysis with a low false discovery rate.
Scott R. Tyler, Daniel Lozano-Ojalvo & Eric E. Schadt

Article | Open Access
Emergence of periodic circumferential actin cables from the anisotropic fusion of actin nanoclusters during tubulogenesis
Periodic circumferential cytoskeletons support biological tube formation. Here, the authors show that self-assembled actin nanoclusters undergo biased fusion and develop into periodic cables in response to the membrane anisotropy of the expanding Drosophila tracheal tube.
Sayaka Sekine, Mitsusuke Tarama & Shigeo Hayashi

Article | Open Access
Rational strain design with minimal phenotype perturbation
No consensus exists on the computationally tractable use of dynamic models for strain design. To tackle this, the authors report a framework, nonlinear-dynamic-model-assisted rational metabolic engineering design, for efficiently designing robust, artificially engineered cellular organisms.
Bharath Narayanan, Daniel Weilandt & Vassily Hatzimanikatis

Article | Open Access
Human whole-exome genotype data for Alzheimer’s disease
The heterogeneity of whole-exome sequencing (WES) data generation methods presents a challenge to joint analysis. Here, the authors present a bioinformatics strategy to generate high-quality data from processing diversely generated WES samples, as applied in the Alzheimer’s Disease Sequencing Project.
Yuk Yee Leung, Adam C. Naj & Li-San Wang

Article | Open Access
Ultraconserved bacteriophage genome sequence identified in 1300-year-old human palaeofaeces
Bacterial viruses (phages) are generally recognised as rapidly evolving biological entities. Here, Rozwalak et al. analyse DNA sequence datasets generated from ancient palaeofaeces and identify 298 phage genomes from the last 5300 years, including a 1300-year-old phage genome nearly identical to a present-day virus that infects human gut bacteria.
Piotr Rozwalak, Jakub Barylski & Andrzej Zielezinski

Article | Open Access
MARS an improved de novo peptide candidate selection method for non-canonical antigen target discovery in cancer
Detection of neoepitopes from tumours is time consuming and requires the integration of genomic and/or RNA sequencing expression data. Here, the authors propose a machine learning method to enable direct identification of additional, tumour-specific sequences using mass spectrometry through integration of de novo peptide sequencing scores, MHC class I binding prediction, and peptide retention time prediction.
Hanqing Liao, Carolina Barra & Nicola Ternette

Article | Open Access
Segment anything in medical images
Segmentation is an important fundamental task in medical image analysis. Here the authors show a deep learning model for efficient and accurate segmentation across a wide range of medical image modalities and anatomies.
Jun Ma, Yuting He & Bo Wang

Article | Open Access
Molecular quantitative trait loci in reproductive tissues impact male fertility in cattle
Investigating the genetics of male fertility requires comprehensive genotype and phenotype data. Here, the authors characterize the transcriptional complexity of bovine male reproductive tissues to identify loci associated with male fertility.
Xena Marie Mapel, Naveen Kumar Kadri & Hubert Pausch

Article | Open Access
Using big sequencing data to identify chronic SARS-Coronavirus-2 infections
Chronic SARS-CoV-2 infections have been hypothesised to be sources of new variants. Here, the authors use large-scale genome sequencing data to identify mutations predictive of chronic infections, which may therefore be relevant in future variants.
Sheri Harari, Danielle Miller & Adi Stern

Article | Open Access
Shifting patterns of dengue three years after Zika virus emergence in Brazil
Dengue virus circulation was unusually low in Brazil in 2015-2018 following the emergence of Zika virus, but subsequently resurged causing large outbreaks with a lower mean age of infection. Here, the authors use mathematical modelling to investigate the links between dengue dynamics and prior Zika infection.
Francesco Pinotti, Marta Giovanetti & José Lourenço

Article | Open Access
Structure-guided discovery of anti-CRISPR and anti-phage defense proteins
Bacteria use various defense systems to protect themselves from phage infection, and phages have evolved diverse counter-defense measures to overcome host defenses. Here, the authors use protein structural similarity and gene co-occurrence analyses for identification of new anti-phage and counter-defense systems.
Ning Duan, Emily Hand & Akintunde Emiola

Article | Open Access
Distinguishing examples while building concepts in hippocampal and artificial networks
While the hippocampus is well-known to store specific memories, it can also learn common features that are shared across individual memories. Here, the authors show how this ability arises from dual input pathways and how it can inspire better machine learning methods.
Louis Kang & Taro Toyoizumi

Article | Open Access
PROST: quantitative identification of spatially variable genes and domain detection in spatial transcriptomics
Understanding biological mechanisms requires a thorough exploration of spatiotemporal transcriptional patterns in complex tissues. Here, authors present PROST to quantify spatial gene expression patterns and detect spatial domains using spatial transcriptomics data of varying resolutions.
Yuchen Liang, Guowei Shi & Zhonghui Tang

Article | Open Access
Clinical application of tumour-in-normal contamination assessment from whole genome sequencing
Assessing tumour contamination in normal samples is critical for accurate variant calling in cancer samples. Here, the authors develop TINC, a computational method to determine the level of tumour in normal contamination, and demonstrate its application in the Genomics England 100,000 Genomes Project dataset.
Jonathan Mitchell, Salvatore Milite & Giulio Caravagna

Article | Open Access
Longitudinal single cell atlas identifies complex temporal relationship between type I interferon response and COVID-19 severity
Single cell transcriptomics can reveal at high resolution the body’s response to infection. Here the authors have applied this technology to a longitudinal SARS-CoV-2 infected cohort and identified gene expression changes that may predict disease severity and reveal the underlying molecular mechanisms.
Quy Xiao Xuan Lin, Deepa Rajagopalan & Shyam Prabhakar

Article | Open Access
Determinants of epidemic size and the impacts of lulls in seasonal influenza virus circulation
Seasonal influenza levels were unusually low when non-pharmaceutical interventions for COVID-19 were in place. Here, the authors analyse serological and epidemiological evidence for the hypothesis that such lulls in influenza transmission lead to reduced immunity and therefore larger epidemics in subsequent seasons.
Simon P. J. de Jong, Zandra C. Felix Garza & Colin A. Russell

Article | Open Access
From interaction networks to interfaces, scanning intrinsically disordered regions using AlphaFold2
Here, the authors show that AlphaFold2 accurately predicts protein interfaces involving disordered regions. Combining different delimitations and sequence alignments increases the success rate, while scanning short overlapping fragments identifies binding sites.
Hélène Bret, Jinmei Gao & Raphaël Guerois

Article | Open Access
Leveraging single-cell ATAC-seq and RNA-seq to identify disease-critical fetal and adult brain cell types
This study analyzes data from human cells assayed with single-cell technologies, together with data associating genetic variants with disease, to identify fetal and adult brain cell types that critically influence the etiology of disease.
Samuel S. Kim, Buu Truong & Alkes L. Price

Article | Open Access
Effective binning of metagenomic contigs using contrastive multi-view representation learning
Here, the authors present COMEBin, a metagenomics binning method based on contrastive multi-view representation learning that uses data augmentation to generate multiple fragments (views) of each contig, resulting in high-quality embeddings of heterogeneous features. COMEBin outperforms state-of-the-art binning methods, particularly in recovering near-complete genomes from real environmental samples.
Ziye Wang, Ronghui You & Shanfeng Zhu

Article | Open Access
Haplotype-aware modeling of cis-regulatory effects highlights the gaps remaining in eQTL data
Genetic variants that influence gene expression play a major role in human phenotypic variability and disease susceptibility. Here, the authors introduce a computational method to estimate the regulatory effect size in genes with multiple conditionally independent regulatory variants.
Nava Ehsan, Bence M. Kotis & Pejman Mohammadi

Article | Open Access
Systematic detection of co-infection and intra-host recombination in more than 2 million global SARS-CoV-2 samples
SARS-CoV-2 coinfections may lead to recombination events which could be important in the emergence of new variants. Here, the authors develop an automated bioinformatics pipeline to identify coinfections in genomic data and test it on >2 million publicly available raw read data sets collected globally.
Orsolya Anna Pipek, Anna Medgyes-Horváth & István Csabai

Article | Open Access
BIDCell: Biologically-informed self-supervised learning for segmentation of subcellular spatial transcriptomics data
Subcellular in situ spatial transcriptomics offers the promise to address biological problems that were previously inaccessible but requires accurate cell segmentation to uncover insights. Here, authors present BIDCell, a biologically informed, deep learning-based cell segmentation framework.
Xiaohang Fu, Yingxin Lin & Jean Y. H. Yang

Article | Open Access
DiffDomain enables identification of structurally reorganized topologically associating domains
Topologically associating domains (TADs) are critical structural units in 3D genome organization, and their reorganization between health and disease states is associated with essential genome functions. However, computational methods for identifying reorganized TADs are still in the early stages of development. Here, the authors present an algorithm leveraging random matrix theory to identify reorganized TADs.
Dunming Hua, Ming Gu & Dechao Tian

Article | Open Access
Cryo-EM structure and B-factor refinement with ensemble representation
Cryo-EM is the go-to method for visualizing large, flexible biomolecules. Here, authors introduce a new Gaussian mixture modelling method for cryo-EM modelling tasks, including refinement, composite map generation and ensemble representation.
Joseph G. Beton, Thomas Mulvaney & Maya Topf

Article | Open Access
Ligand coupling mechanism of the human serotonin transporter differentiates substrates from inhibitors
The serotonin transporter, targeted by several medications, terminates neurotransmission by clearing serotonin from the synaptic cleft. Combining biochemical results with in silico data, the authors show the key interactions that initiate substrate transport.
Ralph Gradisch, Katharina Schlögl & Thomas Stockner

Article | Open Access
Integrating genetic regulation and single-cell expression with GWAS prioritizes causal genes and cell types for glaucoma
The molecular and cellular causes of glaucoma are not well understood. Here, the authors integrate GWAS with genetic regulation and single cell expression from multiple eye tissues to identify genes and key cell types that affect glaucoma pathogenesis.
Andrew R. Hamel, Wenjun Yan & Ayellet V. Segrè

Article | Open Access
Mechanism-centric regulatory network identifies NME2 and MYC programs as markers of Enzalutamide resistance in CRPC
Heterogeneous response to Enzalutamide remains a critical issue in castration-resistant prostate cancer (CRPC). Here, the authors reconstruct a CRPC-specific mechanism-centric regulatory network to identify signatures of Enzalutamide response and predict patients at risk of Enzalutamide resistance.
Sukanya Panja, Mihai Ioan Truica & Antonina Mitrofanova

Article | Open Access
Mesozoic evolution of cicadas and their origins of vocalization and root feeding
The evolution of cicadas is unclear due to a lack of understanding of transitional features. Here, the authors assess adult and nymph mid-Cretaceous cicadas to elucidate their morphological evolution and identify evidence of the origins of cicada sound generation and subterranean lifestyle.
Hui Jiang, Jacek Szwedo & Bo Wang

Article | Open Access
Developmental basis of SHH medulloblastoma heterogeneity
The role of developmental pathways in medulloblastoma tumours (MB) with sonic hedgehog (SHH) activation remains to be explored. Here, the authors perform multi-omic analysis and characterise the key transcriptomic and metabolic patterns of highly differentiated cells in SHH MBs.
Maxwell P. Gold, Winnie Ong & Ernest Fraenkel

Article | Open Access
Gene-SGAN: discovering disease subtypes with imaging and genetic signatures via multi-view weakly-supervised deep clustering
Many diseases can display distinct brain imaging phenotypes across individuals, potentially reflecting disease subtypes. However, biological interpretability is limited if the derived subtypes are not associated with genetic drivers or susceptibility factors. Here, the authors describe a deep-learning method that links imaging phenotypes with genetic factors, thereby conferring genetic correlations to the disease subtypes.
Zhijian Yang, Junhao Wen & Christos Davatzikos

Article | Open Access
MarsGT: Multi-omics analysis for rare population inference using single-cell graph transformer
Identifying rare cell populations is key to understanding cancer progression and response to therapy. Here, authors introduce MarsGT, an end-to-end deep learning model for rare cell population identification from scMulti-omics data.
Xiaoying Wang, Maoteng Duan & Qin Ma

Article | Open Access
MENDER: fast and scalable tissue structure identification in spatial omics data
Identifying tissue structure in large-scale spatial omics datasets from multiple slices is challenging. Here, authors present MENDER, an optimisation-free spatial clustering method that can scale to million-level spatial data, enabling efficient analysis of spatial cell atlases.
Zhiyuan Yuan

Article | Open Access
Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing
Utilising geometric information and reducing computational costs are key challenges in the molecular modelling field. Here, authors propose ViSNet, which efficiently extracts geometric features, accurately predicts molecular properties, and drives simulations with interpretability.
Yusong Wang, Tong Wang & Tie-Yan Liu

Article | Open Access
Radiomic tractometry reveals tract-specific imaging biomarkers in white matter
Diffusion MRI is used for tract-specific microstructural analysis of the white matter. Here, the authors introduce radiomic tractometry (RadTract), enhancing tractometry with radiomics-based imaging biomarkers for improved predictive modelling.
Peter Neher, Dusan Hirjak & Klaus Maier-Hein

Article | Open Access
Quick model-based viscoelastic clot strength predictions from blood protein concentrations for cybermedical coagulation control
Available viscoelastic models of blood flow and blood coagulation are unsuited for a cybermedical input-output type of control system application. Here the authors present validated viscoelastic coagulation models that use quickly-measurable protein concentrations to forecast slow clot strength curves for future automation.
Damon E. Ghetmiri, Alessia J. Venturi & Amor A. Menezes

Article | Open Access
Improving deep neural network generalization and robustness to background bias via layer-wise relevance propagation optimization
Image background features can undesirably affect deep networks’ decisions. Here, the authors show that the optimization of Layer-wise Relevance Propagation explanation heatmaps can hinder such influence, improving out-of-distribution generalization.
Pedro R. A. S. Bassi, Sergio S. J. Dertkigil & Andrea Cavalli

Article | Open Access
The juxtamembrane linker of synaptotagmin 1 regulates Ca2+ binding via liquid-liquid phase separation
Synaptotagmin (syt) 1 is a calcium sensor for neuronal exocytosis. Here, the authors show that the juxtamembrane linker of this integral membrane protein negatively regulates its calcium sensing activity by mediating self-association via liquid-liquid phase separation.
Nikunj Mehta, Sayantan Mondal & Edwin R. Chapman

Article | Open Access
Petascale pipeline for precise alignment of images from serial section electron microscopy
Segmentation accuracy of serial section electron microscopy (ssEM) images can be limited by the step of aligning 2D section images to create a 3D image stack. Here the authors report a computational pipeline for aligning ssEM images and apply this to a whole fly brain dataset.
Sergiy Popovych, Thomas Macrina & H. Sebastian Seung

Article | Open Access
Angiogenesis-on-a-chip coupled with single-cell RNA sequencing reveals spatially differential activations of autophagy along angiogenic sprouts
The functional heterogeneity of autophagy in endothelial cells during angiogenesis remains incompletely understood. Here, the authors apply a 3D angiogenesis-on-a-chip coupled with single-cell RNA sequencing to find distinct autophagy functions in two different endothelial cell populations during angiogenic sprouting.
Somin Lee, Hyunkyung Kim & Noo Li Jeon

Article | Open Access
rworkflows: automating reproducible practices for the R community
Reproducibility is essential for the progress of research, yet achieving it remains elusive even in computational fields. Here, authors develop the rworkflows suite, making robust CI/CD workflows easy and freely accessible to all R package developers.
Brian M. Schilder, Alan E. Murphy & Nathan G. Skene