Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Downstream of trajectory inference for cell lineages based on scRNA-seq data, differential expression analysis yields insight into biological processes. Here, Van den Berge et al. develop tradeSeq, a framework for the inference of within- and between-lineage differential expression, based on negative binomial generalized additive models.

    • Koen Van den Berge
    • , Hector Roux de Bézieux
    •  & Lieven Clement
  • Article
    | Open Access

    Methylation of a lysine residue in Hsp90 is a recently discovered post-translational modification but the mechanistic effects of this modification have remained unknown so far. Here the authors combine biochemical and biophysical approaches, molecular dynamics (MD) simulations and functional experiments with yeast and show that this lysine is a switch point, which specifically modulates conserved Hsp90 functions including co-chaperone regulation and client activation.

    • Alexandra Rehn
    • , Jannis Lawatscheck
    •  & Johannes Buchner
  • Article
    | Open Access

    Diagnosing acute infections based on transcriptional host response shows promise, but generalizability is wanting. Here, the authors use a co-normalization framework to train a classifier to diagnose acute infections and apply it to independent data on a targeted diagnostic platform.

    • Michael B. Mayhew
    • , Ljubomir Buturovic
    •  & Timothy E. Sweeney
  • Article
    | Open Access

    The analysis of RNA-seq data is complicated by dropouts, and these are usually treated as a problem to be addressed. Here, Peng Qiu uses dropouts as a source of information and presents a co-occurrence clustering algorithm to cluster cells based on the dropout pattern; this could be a complementary approach to existing methods.

    • Peng Qiu
  • Article
    | Open Access

    GWAS analysis currently relies mostly on linear mixed models, which do not account for linkage disequilibrium (LD) between tested variants. Here, Sesia et al. propose KnockoffZoom, a non-parametric statistical method for the simultaneous discovery and fine-mapping of causal variants, assuming only that LD is described by hidden Markov models (HMMs).

    • Matteo Sesia
    • , Eugene Katsevich
    •  & Chiara Sabatti
  • Article
    | Open Access

    Metabolic engineering is often hampered by non-linear kinetics and allosteric regulatory mechanisms. Here, the authors construct a quantitative model for the pentose degradation Weimberg pathway in Caulobacter crescentus and demonstrate its biotechnological applications in cell-free systems and standard metabolic engineering.

    • Lu Shen
    • , Martha Kohlhaas
    •  & Bettina Siebers
  • Article
    | Open Access

    Aperiodic structure imaging suffers from limitations when Fourier analysis is used. The authors report an algorithm based on nonconvex optimization that quantitatively overcomes these limitations, demonstrated by studying aperiodic structures via the phase-sensitive interference in STM images.

    • Sky C. Cheung
    • , John Y. Shin
    •  & Abhay N. Pasupathy
  • Article
    | Open Access

    Reference databases are essential for studies on host-microbiota interactions. Here, the authors present the construction of VIRGO, a human vaginal non-redundant gene catalog, which represents a comprehensive resource for taxonomic and functional profiling of vaginal microbiomes from metagenomic and metatranscriptomic datasets.

    • Bing Ma
    • , Michael T. France
    •  & Jacques Ravel
  • Article
    | Open Access

    Quantifying somatic evolutionary processes in cancer and healthy tissue is a challenge. Here, the authors use single time point multi-region sampling of cancer and normal tissue, combined with evolutionary theory, to quantify in vivo mutation and cell survival rates per cell division.

    • Benjamin Werner
    • , Jack Case
    •  & Andrea Sottoriva
  • Article
    | Open Access

    In Mendelian randomization (MR) studies, one typically selects SNPs as instrumental variables that do not directly affect the outcome to avoid violation of MR assumptions. Here, Cho et al. present a framework, MR-TRYX, that leverages knowledge of such outliers of horizontal pleiotropy to identify putative causal relationships between exposure and outcome.

    • Yoonsu Cho
    • , Philip C. Haycock
    •  & Gibran Hemani
  • Article
    | Open Access

    Neurotransmitter:sodium symporters (NSS) serve as targets for drugs including antidepressants and psychostimulants. Here, the authors report the X-ray structure of the prokaryotic NSS member LeuT in a Na+/substrate-bound, inward-facing occluded conformation, which is a key intermediate in the LeuT transport cycle.

    • Kamil Gotfryd
    • , Thomas Boesen
    •  & Ulrik Gether
  • Article
    | Open Access

    Understanding tumour development at a granular level is a challenge in solid tumours. Here, the authors provide a cell atlas across tumour development in a genetic model of salivary gland squamous cell carcinoma using single-cell transcriptome and epitope profiling.

    • Samantha D. Praktiknjo
    • , Benedikt Obermayer
    •  & Nikolaus Rajewsky
  • Article
    | Open Access

    The study of disease modules facilitates insight into complex diseases, but their identification relies on knowledge of molecular networks. Here, the authors show that disease modules and genes can also be discovered in deep autoencoder representations of large human gene expression datasets.

    • Sanjiv K. Dwivedi
    • , Andreas Tjärnberg
    •  & Mika Gustafsson
  • Article
    | Open Access

    Complex diseases often share genetic determinants and symptoms, but the mechanistic basis of disease interactions remains elusive. Here, the authors propose a network topological measure to identify proteins linking complex diseases in the interactome, and identify mediators between COPD and asthma.

    • Enrico Maiorino
    • , Seung Han Baek
    •  & Amitabh Sharma
  • Article
    | Open Access

    For single-cell RNA-seq experiments the sequencing budget is limited, and how it should be optimally allocated to maximize information is not clear. Here the authors develop a mathematical framework to show that, for estimating many gene properties, the optimal allocation is to sequence at the depth of one read per cell per gene.

    • Martin Jinye Zhang
    • , Vasilis Ntranos
    •  & David Tse
  • Article
    | Open Access

    Large 3D electron microscopy data sets frequently contain noisy data due to accelerated imaging, and denoising techniques require specialised skill sets. Here the authors introduce DenoisEM, an ImageJ plugin that democratises denoising EM data sets, enabling fast parameter tuning and processing through parallel computing.

    • Joris Roels
    • , Frank Vernaillen
    •  & Yvan Saeys
  • Article
    | Open Access

    Localizing phosphorylation sites by data-independent acquisition (DIA)-based proteomics is still challenging. Here, the authors develop algorithms for phosphosite localization and stoichiometry determination, and incorporate them into single-shot DIA-phosphoproteomics workflows.

    • Dorte B. Bekker-Jensen
    • , Oliver M. Bernhardt
    •  & Jesper V. Olsen
  • Article
    | Open Access

    Microbes secrete a repertoire of extracellular proteins to serve various functions depending on the ecological context. Here the authors examine how bacterial community composition and habitat structure affect the extracellular proteins, showing that generalist species and those living in more structured environments produce more extracellular proteins, and that costs of production are lower in more diverse communities.

    • Marc Garcia-Garcera
    •  & Eduardo P. C. Rocha
  • Article
    | Open Access

    Most currently available statistical tools for the analysis of ATAC-seq data were repurposed from tools developed for other functional genomics data (e.g. ChIP-seq). Here, Gabitto et al. develop ChromA, a Bayesian statistical approach for the analysis of both bulk and single-cell ATAC-seq data.

    • Mariano I. Gabitto
    • , Anders Rasmussen
    •  & Richard Bonneau
  • Article
    | Open Access

    Whether immune system aging differs between men and women is poorly understood. Here the authors characterize gene expression, chromatin state and immune subset composition in the blood of healthy humans 22 to 93 years of age, uncovering shared as well as sex-unique alterations, and create a web resource to interactively explore the data.

    • Eladio J. Márquez
    • , Cheng-han Chung
    •  & Duygu Ucar
  • Article
    | Open Access

    The authors present SVclone, a computational method for inferring the cancer cell fraction of structural variants from whole-genome sequencing data.

    • Marek Cmero
    • , Ke Yuan
    •  & Christian von Mering
  • Article
    | Open Access

    Multi-omics datasets pose major challenges to data interpretation and hypothesis generation owing to their high-dimensional molecular profiles. Here, the authors develop the ActivePathways method, which uses data fusion techniques for integrative pathway analysis of multi-omics data and candidate gene discovery.

    • Marta Paczkowska
    • , Jonathan Barenboim
    •  & Christian von Mering
  • Article
    | Open Access

    The authors previously developed the Protein Common Interface Database (ProtCID), which compares and clusters the interfaces of pairs of full-length protein chains with defined Pfam domain architectures in different PDB entries to identify biological assemblies. Here the authors extend ProtCID to the clustering of domain-domain interactions that also allows analyzing domain interactions with peptides, nucleic acids, and ligands.

    • Qifang Xu
    •  & Roland L. Dunbrack Jr.
  • Article
    | Open Access

    Understanding deregulation of biological pathways in cancer can provide insight into disease etiology and potential therapies. Here, as part of the PanCancer Analysis of Whole Genomes (PCAWG) consortium, the authors present pathway and network analysis of 2583 whole cancer genomes from 27 tumour types.

    • Matthew A. Reyna
    • , David Haan
    •  & Christian von Mering
  • Article
    | Open Access

    In somatic cells the mechanisms maintaining the chromosome ends are normally inactivated; however, cancer cells can re-activate these pathways to support continuous growth. Here, the authors characterize the telomeric landscapes across tumour types and identify genomic alterations associated with different telomere maintenance mechanisms.

    • Lina Sieverling
    • , Chen Hong
    •  & Christian von Mering
  • Article
    | Open Access

    Copy number alterations (CNAs) can drive tumor progression in cancer by altering gene expression levels, but transcriptional adaptation can skew CNA impact. Here, the authors present transcriptional adaptation to CNA (TACNA) profiling, a tool to extract the transcriptional effect of CNAs from expression data without requiring paired CNA profiles.

    • Arkajyoti Bhattacharya
    • , Rico D. Bense
    •  & Rudolf S. N. Fehrmann
  • Article
    | Open Access

    There are many methods to detect cancer-driving mutations. Here, the authors harness the variant allele frequency of mutations in tumor cells of a single individual to present a method that can estimate growth patterns and identify driver gene evolution at a patient-specific level.

    • Leonidas Salichos
    • , William Meyerson
    •  & Mark Gerstein
  • Article
    | Open Access

    Antimicrobial resistance (AMR) represents a global health threat. Here, the authors analyse the oral and gut resistomes from metagenomes of diverse populations and find that the oral resistome harbours higher abundance but lower diversity of antimicrobial resistance genes than the gut resistome.

    • Victoria R. Carr
    • , Elizabeth A. Witherden
    •  & David L. Moyes
  • Article
    | Open Access

    Cell-surface proteins serve as phenotypic cell markers and in many cases are more indicative of cellular function than the transcriptome. Here, the authors introduce a transfer learning framework to impute surface protein abundances from scRNA-seq data.

    • Zilu Zhou
    • , Chengzhong Ye
    •  & Nancy R. Zhang
  • Article
    | Open Access

    Ibrutinib, a Bruton tyrosine kinase inhibitor, provides effective treatment for chronic lymphocytic leukemia (CLL). Here, the authors describe time-dependent molecular changes to malignant cells and to the immune system in patients undergoing ibrutinib therapy, which can be used for therapy monitoring.

    • André F. Rendeiro
    • , Thomas Krausgruber
    •  & Christoph Bock
  • Article
    | Open Access

    The endothelial to haematopoietic transition (EHT) is the process where haemogenic endothelium differentiates into haematopoietic stem and progenitor cells (HSPCs). Here the authors use single cell transcriptomics and antibody screening to identify CD44 as a marker of EHT that is required for EHT and HSPC development.

    • Morgan Oatley
    • , Özge Vargel Bölükbası
    •  & Christophe Lancrin
  • Article
    | Open Access

    The likelihood of linking within a complex network is important for solving real-world problems, but it is challenging to predict. Sun et al. show that the link predictability limit can be well estimated by measuring the shortest compression length of a network, without the need for a prediction algorithm.

    • Jiachen Sun
    • , Ling Feng
    •  & Yanqing Hu
  • Article
    | Open Access

    Uveal melanoma is highly metastatic and unresponsive to checkpoint immunotherapy. Here, the authors present single-cell transcriptomics of 59,915 cells in 8 primary and 3 metastatic samples, highlighting the diversity of the tumour microenvironment.

    • Michael A. Durante
    • , Daniel A. Rodriguez
    •  & J. William Harbour
  • Article
    | Open Access

    Chromosome arm-level aneuploidies (CAAs) are frequently observed in cancer. Here, the authors analyse CAA landscapes across different tumour types, relating these chromosome arm gains and losses to tumour evolution, metastasis, patient survival and response to a range of anti-cancer therapies.

    • Ankit Shukla
    • , Thu H. M. Nguyen
    •  & Pascal H. G. Duijf
  • Article
    | Open Access

    Poliovirus has a higher mutation rate than HIV, yet has been almost eradicated by vaccination while an effective vaccine against HIV does not exist. Here, the authors develop a fitness model for poliovirus viral protein 1 to show that it is subject to stringent evolutionary constraints that limit its ability to avoid vaccine-induced immune responses.

    • Ahmed A. Quadeer
    • , John P. Barton
    •  & Matthew R. McKay
  • Article
    | Open Access

    Here, via a metagenomics analysis of population-based and disease cohorts, Vich Vila et al. study the impact of 41 commonly used medications on the taxonomic structures, metabolic potential and resistome of the gut microbiome, underscoring the importance of correcting for multiple drug use in microbiome studies.

    • Arnau Vich Vila
    • , Valerie Collij
    •  & Rinse K. Weersma
  • Article
    | Open Access

    The biology of Alzheimer’s disease (AD) remains unknown. Here, the authors propose that AD is a protein connectivity-based dysfunction disorder whereby a switch of the chaperome into epichaperomes rewires proteome-wide connectivity, leading to brain circuitry malfunction that can be corrected by novel therapeutics.

    • Maria Carmen Inda
    • , Suhasini Joshi
    •  & Gabriela Chiosis
  • Article
    | Open Access

    Quantifying the effect of mutations on binding free energy is important to understand protein-protein interactions (PPIs). Here the authors develop a method based on yeast display and next-generation sequencing to generate quantitative binding landscapes for any PPI regardless of its Kd value.

    • Michael Heyne
    • , Niv Papo
    •  & Julia M. Shifman
  • Article
    | Open Access

    Accounting for the effects of gene expression in genome-scale metabolic models is challenging. Here, the authors introduce a model formulation that efficiently simulates thermodynamically compliant fluxes and enzyme and mRNA concentration levels, allowing omics integration and broad analysis of in silico cellular physiology.

    • Pierre Salvy
    •  & Vassily Hatzimanikatis
  • Article
    | Open Access

    Low sample numbers often limit the robustness of analyses in biomedical research. Here, the authors introduce a method to generate realistic scRNA-seq data using GANs that learn gene expression dependencies from complex samples, and show that augmenting sparse cell populations improves downstream analyses.

    • Mohamed Marouf
    • , Pierre Machart
    •  & Stefan Bonn
  • Article
    | Open Access

    Data-independent acquisition (DIA) is an emerging technology in proteomics but it typically relies on spectral libraries built by data-dependent acquisition (DDA). Here, the authors use deep learning to generate in silico spectral libraries directly from protein sequences that enable more comprehensive DIA experiments than DDA-based libraries.

    • Yi Yang
    • , Xiaohui Liu
    •  & Liang Qiao