Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    The blood transcriptome of human subjects can be profiled on an almost routine basis in translational research settings. Here the authors show that a fixed and well-characterized repertoire of transcriptional modules can be employed as a reusable framework for the analysis, visualization and interpretation of such data.

    • Matthew C. Altman, Darawan Rinchai & Damien Chaussabel
  • Article
    | Open Access

    DNA probes used in next generation sequencing (NGS) have variable hybridisation kinetics, resulting in non-uniform coverage. Here, the authors develop a deep learning model to predict NGS depth using DNA probe sequences and apply it to human and non-human sequencing panels.

    • Jinny X. Zhang, Boyan Yordanov & David Yu Zhang
  • Article
    | Open Access

    Vo’, Italy, is a unique setting for studying SARS-CoV-2 antibody dynamics because mass testing was conducted there early in the pandemic. Here, the authors perform two follow-up serological surveys and estimate seroprevalence, the extent of within-household transmission, and the impact of contact tracing.

    • Ilaria Dorigatti, Enrico Lavezzo & Andrea Crisanti
  • Article
    | Open Access

    Advances in omics approaches could enable quantitative predictions of microbial functional composition. Here the authors re-analyze 885 metagenome-assembled genomes from Tara Oceans, and use a network approach to quantify protein functional clusters and explore their biogeography.

    • Emile Faure, Sakina-Dorothée Ayata & Lucie Bittner
  • Article
    | Open Access

    Unmasking the decision-making process of machine learning models is essential for implementing diagnostic support systems in clinical practice. Here, the authors demonstrate that adversarially trained models can significantly enhance the usability of pathology detection as compared to their standard counterparts.

    • Tianyu Han, Sven Nebelung & Daniel Truhn
  • Article
    | Open Access

    Patients with chronic lung disease (CLD) have an increased risk of severe coronavirus disease 2019 (COVID-19) and poor outcomes. Here the authors compare the transcriptomes of single cells isolated from healthy and CLD lungs to identify molecular characteristics of lung cells that may account for worse COVID-19 outcomes in these patients.

    • Linh T. Bui, Nichelle I. Winters & Laure Emmanuelle Zaragosi
  • Article
    | Open Access

    Finding durable, high-density media for data storage is necessary to support the ever-expanding generation of digital data. Here, the authors use peptide sequences to store digital data and retrieve them using tandem mass spectrometry, proving that peptides can be used as a storage medium.

    • Cheuk Chi A. Ng, Wai Man Tam & Zhong-Ping Yao
  • Article
    | Open Access

    Deep learning algorithms trained on data streamed temporally from different clinical sites and from a multitude of physiological sensors are generally affected by a degradation in performance. To mitigate this, the authors propose a continual learning strategy that employs a replay buffer.

    • Dani Kiyasseh, Tingting Zhu & David Clifton
  • Article
    | Open Access

    Mutations in 5’ untranslated regions (UTRs) have a functional role in gene expression in cancer. Here, the authors develop a sequencing-based high-throughput functional assay named PLUMAGE and show the effects of these mutations on gene expression and their association with clinical outcomes in prostate cancer.

    • Yiting Lim, Sonali Arora & Andrew C. Hsieh
  • Article
    | Open Access

    Our ability to interpret single-cell multivariate signaling responses is still limited. Here the authors introduce fractional response analysis (FRA), involving fractional cell counting, capable of deconvoluting heterogeneous multivariate responses of cellular populations.

    • Karol Nienałtowski, Rachel E. Rigby & Michał Komorowski
  • Article
    | Open Access

    Existing genetic prediction tools typically assume that genetic variants contribute equally towards the phenotype. The authors develop eight prediction tools that allow the user to specify the heritability model, and show that these tools enable substantially improved prediction of complex traits.

    • Qianqian Zhang, Florian Privé & Doug Speed
  • Article
    | Open Access

    It remains unclear how spatial information controls endothelial cell identity and behavior in the developing heart. Here the authors perform single-cell RNA sequencing at key developmental timepoints in mice to interrogate cellular contributions to coronary vessel patterning and maturation in the epicardium.

    • Pearl Quijada, Michael A. Trembley & Eric M. Small
  • Article
    | Open Access

    Few studies have provided functional analysis of the epigenetic landscape in the regenerating liver. Here the authors define chromatin states in the quiescent vs. regenerating mouse liver through integration of genome-wide profiles of DNA methylation, histone modifications, and chromatin accessibility, identifying H3K27me3 as an epigenetic mark conferring regenerative potential.

    • Chi Zhang, Filippo Macchi & Kirsten C. Sadler
  • Article
    | Open Access

    Precision medicine needs prognostic markers to select the patients who will benefit most from targeted therapy. Here the authors show that a high level of baseline T cell receptor diversity is an indicator of favourable prognosis in multiple cancer types, and that monoclonal expansion of T cells correlates with good response to immune checkpoint blockade therapy in metastatic melanoma patients.

    • Sara Valpione, Piyushkumar A. Mundra & Richard Marais
  • Article
    | Open Access

    RNA modifications appear to play a role in determining RNA structure and function. Here, the authors develop a deep learning model that predicts the location of 12 RNA modifications using primary sequence, and show that several modifications are associated, which suggests dependencies between them.

    • Zitao Song, Daiyun Huang & Jia Meng
  • Article
    | Open Access

    The superior colliculus (SC) receives diverse cortical inputs to drive many behaviors. Here, based on comprehensive mapping of cortico-tectal projections, the authors subdivide the superior colliculus into medial, centromedial, centrolateral, and lateral zones, and characterize the input-output connectivity and morphology of neurons in each zone that serve the role of the SC in goal-directed behaviors.

    • Nora L. Benavidez, Michael S. Bienkowski & Hong-Wei Dong
  • Article
    | Open Access

    A more comprehensive map of viral host ranges can help identify and mitigate zoonotic and animal-disease risks. A divide-and-conquer approach that separates viral, mammalian and network features predicts over 20,000 unknown associations between known viruses and susceptible mammalian species.

    • Maya Wardeh, Marcus S. C. Blagrove & Matthew Baylis
  • Article
    | Open Access

    To benchmark single-cell bioinformatics tools, data simulators can provide a robust ground truth. Here the authors present dyngen, a multi-modal simulator, and apply it to aligning cell developmental trajectories, cell-specific regulatory network inference and estimation of RNA velocity.

    • Robrecht Cannoodt, Wouter Saelens & Yvan Saeys
  • Article
    | Open Access

    Small-molecule bioactivity descriptors are enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Here the authors present a collection of deep neural networks able to infer bioactivity signatures for any compound of interest, even when little or no experimental information is available for it.

    • Martino Bertoni, Miquel Duran-Frigola & Patrick Aloy
  • Article
    | Open Access

    The class Frizzled of G protein-coupled receptors (GPCRs) consists of ten Frizzled (FZD1-10) subtypes and Smoothened (SMO). Here the Schulte laboratory demonstrates that FZDs differ substantially from SMO in receptor activation-associated conformational changes: whereas SMO manifests a preference for a straight TM6, the TM6 of FZDs is kinked upon activation.

    • Ainoleena Turku, Hannes Schihada & Gunnar Schulte
  • Review Article
    | Open Access

    Natural products are an important source of bioactive compounds and have versatile applications in different fields, but their discovery is challenging. Here, the authors review the recent developments in genome mining for discovery of natural products, focusing on compounds from unconventional microorganisms and microbiomes.

    • Kirstin Scherlach & Christian Hertweck
  • Article
    | Open Access

    A deeper knowledge of the immune cell profile within the brain cancer tumor microenvironment (TM) could identify targets to improve immunotherapy efficacy. Here, in glioblastoma, the authors find haematopoietic stem and progenitor cells in the TM, which are associated with poor prognosis and increased immunosuppression.

    • I-Na Lu, Celia Dobersalske & Igor Cima
  • Article
    | Open Access

    Bacterial microcompartments (BMCs) are organelles consisting of a protein shell in which certain metabolic reactions take place separated from the cytoplasm. Here, Sutter et al. present a comprehensive catalog of BMC loci, substantially expanding the number of known BMCs and describing distinct types and compartmentalized reactions.

    • Markus Sutter, Matthew R. Melnicki & Cheryl A. Kerfeld
  • Article
    | Open Access

    Many proteins exist in various proteoforms but detecting these variants by bottom-up proteomics remains difficult. Here, the authors present a computational approach based on peptide correlation analysis to identify and characterize proteoforms from bottom-up proteomics data.

    • Isabell Bludau, Max Frank & Ruedi Aebersold
  • Article
    | Open Access

    Keratitis is the main cause of corneal blindness worldwide, but most vision loss caused by keratitis is avoidable via early detection and treatment, which are challenging in resource-limited settings. Here, the authors develop a deep learning system for the automated classification of keratitis and other corneal abnormalities.

    • Zhongwen Li, Jiewei Jiang & Wei Chen
  • Article
    | Open Access

    α-Synuclein (αS) aggregation is a driver of several neurodegenerative disorders. Here, the authors identify a class of peptides that bind toxic αS oligomers and amyloid fibrils but not monomeric functional protein, and prevent further αS aggregation and associated cell damage.

    • Jaime Santos, Pablo Gracia & Salvador Ventura
  • Article
    | Open Access

    The authors generate the largest structural dataset of enzymatic and non-enzymatic metalloprotein sites to date. They use this dataset to train a decision-tree ensemble machine learning algorithm that allows them to distinguish between catalytic and non-catalytic metal sites. The computational model described here could also be useful for the identification of new enzymatic mechanisms and de novo enzyme design.

    • Ryan Feehan, Meghan W. Franklin & Joanna S. G. Slusky
  • Article
    | Open Access

    Single-cell RNA-seq loses spatial information of gene expression in multicellular systems because tissue must be dissociated. Here, the authors show that spatial gene expression profiles can be both accurately and robustly reconstructed by a new computational method using a generative linear mapping, Perler.

    • Yasushi Okochi, Shunta Sakaguchi & Honda Naoki
  • Article
    | Open Access

    Large numbers of mass spectra have been collected from different samples, but identifying small molecules from these spectra requires database searches, which remain challenging. Here, the authors report molDiscovery, a mass spectral database search method that uses an algorithm to generate mass spectrometry fragmentations and learns a probabilistic model to match small molecules with their mass spectra.

    • Liu Cao, Mustafa Guler & Hosein Mohimani
  • Article
    | Open Access

    Despite the consensus that mass vaccination against SARS-CoV-2 will ultimately end the pandemic, it is not clear when and which control measures can be relaxed during the rollout of vaccination programmes. Here, the authors investigate relaxation scenarios using an age-structured transmission model that has been fitted to data for Portugal.

    • João Viana, Christiaan H. van Dorp & Ganna Rozhnova
  • Article
    | Open Access

    People can infer unobserved causes of perceptual data (e.g. the contents of a box from the sound made by shaking it). Here the authors show that children compare what they hear with what they would have heard given other causes, and explore longer when the heard and imagined sounds are hard to discriminate.

    • Max H. Siegel, Rachel W. Magid & Laura E. Schulz