Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Drug and target discovery for advanced liver disease is hampered by a lack of suitable models for clinical translation. Here the authors present a human liver cell-based system that models a clinical prognostic signature, and use it to propose nizatidine for the treatment of advanced liver fibrosis and the prevention of hepatocellular carcinoma.

    • Emilie Crouchet
    • Simonetta Bandiera
    • Thomas F. Baumert
  • Article
    | Open Access

    Dual-energy X-ray absorptiometry and the Fracture Risk Assessment Tool are recommended tools for osteoporotic fracture risk evaluation, but are underutilized. Here, the authors present an opportunistic tool to identify fractures, predict bone mineral density and evaluate fracture risk using plain pelvis and lumbar spine radiographs.

    • Chen-I Hsieh
    • Kang Zheng
    • Chang-Fu Kuo
  • Article
    | Open Access

    Peptide-protein interactions play fundamental roles in cellular processes and are crucial for designing peptide therapeutics. Here, the authors present a deep learning framework for simultaneously predicting peptide-protein interactions and identifying peptide binding residues involved in the interactions.

    • Yipin Lei
    • Shuya Li
    • Jianyang Zeng
  • Article
    | Open Access

    Image-based simulation for obtaining physical quantities is limited by the uncertainty in the underlying image segmentation. Here, the authors introduce a workflow for efficiently quantifying segmentation uncertainty and creating uncertainty distributions of the resulting physical quantities.

    • Michael C. Krygier
    • Tyler LaBonte
    • Scott A. Roberts
  • Article
    | Open Access

    Little is known about how human parental relatedness varied across ancient populations. Runs of homozygosity (ROH) in the offspring’s genome can give clues. Here, the authors present a method to identify ROH in ancient genomes and infer low rates of close kin unions across most ancient populations.

    • Harald Ringbauer
    • John Novembre
    • Matthias Steinrücken
  • Article
    | Open Access

    Despite extensive genetic heterogeneity, nearly half of all multiple myeloma (MM) cases are driven by cyclin D2 (CCND2) over-expression. Here the authors dissect the chromatin landscape of MM to provide insights into the transcriptional regulatory landscape driving MM and divergent transcriptomes corresponding to different MM genetic subtypes.

    • Jaime Alvarez-Benayas
    • Nikolaos Trasanidis
    • Anastasios Karadimitris
  • Article
    | Open Access

    The molecular basis of Alzheimer’s disease has been obscured by the heterogeneity and scarcity of brain gene expression data, which limit the effectiveness of complex models. Here, the authors introduce a multi-task deep learning framework to learn generalizable and nuanced relationships between gene expression and neuropathology.

    • Nicasia Beebe-Wang
    • Safiye Celik
    • Su-In Lee
  • Article
    | Open Access

    Newly emerged pathogens are inherently difficult to forecast, due to many unknowns about their biology early in an epidemic. Here, the authors assess forecasts of a suite of models during the Zika epidemic in Colombia, finding that the models that performed best changed over the course of the epidemic.

    • Rachel J. Oidtman
    • Elisa Omodei
    • T. Alex Perkins
  • Article
    | Open Access

    Complex biomolecular networks are fundamental to the functioning of living systems, both at the cellular level and beyond. In this paper, the authors develop a systems framework to elucidate the interplay of networks and the spatial localisation of network components.

    • Govind Menon
    • J. Krishnan
  • Article
    | Open Access

    Intratumour heterogeneity (ITH) and mutational signatures are typically analysed separately, even though they are not necessarily independent. Here, the authors present CloneSig, a tool for the joint estimation of ITH and mutational signatures, with which they analyse the TCGA and PCAWG datasets.

    • Judith Abécassis
    • Fabien Reyal
    • Jean-Philippe Vert
  • Article
    | Open Access

    Disambiguating abbreviations is important for automated clinical note processing; however, deploying machine learning for this task is restricted by a lack of good training data. Here, the authors present novel data augmentation methods that use biomedical ontologies to improve abbreviation disambiguation in many datasets.

    • Marta Skreta
    • Aryan Arbabi
    • Michael Brudno
  • Article
    | Open Access

    Historical interbreeding between Neanderthals and humans should leave signatures of historical demographics in modern human genomes. Analysing the size distribution of Neanderthal fragments in non-African genomes suggests consistent differences in the generation interval across Eurasia, and that this could explain mutational spectrum variation.

    • Moisès Coll Macià
    • Laurits Skov
    • Mikkel Heide Schierup
  • Comment
    | Open Access

    Spatially resolved transcriptomic data demand new computational analysis methods to derive biological insights. Here, we comment on these associated computational challenges as well as highlight the opportunities for standardized benchmarking metrics and data-sharing infrastructure in spurring innovation moving forward.

    • Lyla Atta
    • Jean Fan
  • Article
    | Open Access

    Genetic plasticity drives phenotypic differences. Here, the authors develop a framework to quantify the individual and combinatorial contributions of SNPs to a phenotype of interest and use it to identify SNP-SNP interactions associated with variations in bacteria’s response to external changes.

    • Dengcheng Yang
    • Yi Jin
    • Rongling Wu
  • Article
    | Open Access

    Trypanosoma brucei undergoes developmental steps during host infection. Here, using oligopeptide-induced differentiation in vitro, the authors model the transition from replicative ‘slender’ to transmissible ‘stumpy’ bloodstream forms and identify developmental and cell cycle regulators by single-cell transcriptomics.

    • Emma M. Briggs
    • Federico Rojas
    • Thomas D. Otto
  • Article
    | Open Access

    Identifying the molecular mechanisms of response to systemic therapy in prostate cancer remains crucial. Here, the authors apply single-cell ATAC-seq and RNA-seq to models of early treatment response and resistance to enzalutamide and identify chromatin and gene expression patterns that can predict treatment response.

    • S. Taavitsainen
    • N. Engedal
    • A. Urbanucci
  • Article
    | Open Access

    The intrinsic disorder of histone tails poses challenges in their characterization. Here the authors apply extensive molecular dynamics simulations of the full nucleosome to show reversible binding to DNA with specific binding modes of different types of histone tails, where charge-altering modifications suppress tail-DNA interactions and may boost interactions between nucleosomes and nucleosome-binding proteins.

    • Yunhui Peng
    • Shuxiang Li
    • Anna R. Panchenko
  • Article
    | Open Access

    The application of polygenic risk scores to individual-level disease susceptibility is challenging, as risk is evaluated at the group level. Here, the authors describe a machine learning method, Mondrian Cross-Conformal Prediction, that reports a conditional probability of disease status at the individual level.

    • Jiangming Sun
    • Yunpeng Wang
    • Kasper Lage
  • Article
    | Open Access

    The global pattern of the mammalian methylome is formed by changes in methylation and demethylation. Here the authors describe a metric, methylation concurrence, that measures the ratio of unmethylated CpGs inside partially methylated reads, and show that methylation concurrence is associated with epigenetically regulated tumour suppressor genes.

    • Jiejun Shi
    • Jianfeng Xu
    • Wei Li
  • Article
    | Open Access

    Finding a biologically relevant inductive bias for training DNNs on large fitness landscapes is challenging. Here, the authors propose a method called Epistatic Net that improves DNN prediction accuracy and interpretation speed by integrating the knowledge that higher-order epistatic interactions are usually sparse.

    • Amirali Aghazadeh
    • Hunter Nisonoff
    • Kannan Ramchandran
  • Article
    | Open Access

    The analysis of NMR spectra of complex biochemical samples with respect to individual resonances is challenging but critically important. Here, the authors present a deep learning-based method that accelerates this process, including for crowded NMR data that are non-trivial to analyze even for expert NMR spectroscopists.

    • Da-Wei Li
    • Alexandar L. Hansen
    • Rafael Brüschweiler
  • Article
    | Open Access

    The authors present epiScanpy, a computational framework for the analysis of single-cell epigenomic data, covering both ATAC-seq and DNA methylation data, with examples for clustering, cell type identification, trajectory learning and atlas integration, and show its performance in distinguishing cell types.

    • Anna Danese
    • Maria L. Richter
    • Maria Colomé-Tatché
  • Article
    | Open Access

    Boolean networks allow a simplified representation of interactions. Here, the authors systematically analyze regulation in dozens of biological Boolean networks, finding mathematical regularities that suggest biological systems could be controlled through a relatively small number of components.

    • Enrico Borriello
    • Bryan C. Daniels
  • Article
    | Open Access

    The echocardiogram allows for a comprehensive assessment of the cardiac musculature and valves, but its rich temporally resolved data remain underutilized. Here, the authors develop a video AI system trained to predict post-operative right ventricular failure.

    • Rohan Shad
    • Nicolas Quach
    • William Hiesinger
  • Article
    | Open Access

    Forecasting models have been used extensively to inform decision making during the COVID-19 pandemic. In this preregistered and prospective study, the authors evaluated 14 short-term models for Germany and Poland, finding considerable heterogeneity in predictions and highlighting the benefits of combined forecasts.

    • J. Bracher
    • D. Wolffram
    • Frost Tianjian Xu
  • Article
    | Open Access

    Alternative polyadenylation regulates localization, half-life and translation of mRNA isoforms. Here the authors investigate alternative polyadenylation using single-cell RNA sequencing data from mouse embryos and identify 3’-UTR isoforms that are regulated across cell types and developmental time.

    • Vikram Agarwal
    • Sereno Lopez-Darwin
    • Jay Shendure
  • Article
    | Open Access

    miRNAs are loaded into Argonaute proteins and repress complementary mRNA targets. Here the authors show a previously unappreciated role of RNA-binding proteins in efficient miRNA targeting and expand the current understanding of miRNA targeting.

    • Sukjun Kim
    • Soyoung Kim
    • Daehyun Baek
  • Article
    | Open Access

    Vilazodone (VLZ) is a drug for the treatment of major depressive disorders that targets the serotonin transporter (SERT). Here, the authors combine pharmacology measurements and cryo-EM structural analysis to characterize VLZ binding to SERT and observe that VLZ exhibits non-competitive inhibition of serotonin transport and binds with nanomolar affinity to an allosteric site in SERT.

    • Per Plenge
    • Dongxue Yang
    • Claus J. Loland
  • Article
    | Open Access

    Mass gathering events represent a risk for transmission of SARS-CoV-2. Here, the authors describe an experimental indoor test event in which individual contacts were measured, and use aerosol and epidemiological modelling to evaluate transmission risks under different types of restrictions in the arena.

    • Stefan Moritz
    • Cornelia Gottschick
    • Rafael Mikolajczyk
  • Article
    | Open Access

    The link between gRNA sequence and Cas9 activity is well established but the mechanism underlying this relationship is not well understood. Here the authors show that gRNA sequence primarily influences activity by dictating the time it takes for Cas9 to find the target site in a species-specific manner.

    • E. A. Moreb
    • M. D. Lynch
  • Article
    | Open Access

    Glycomics can uncover important molecular changes but measured glycans are highly interconnected and incompatible with common statistical methods, introducing pitfalls during analysis. Here, the authors develop an approach to identify glycan dependencies across samples to facilitate comparative glycomics.

    • Bokan Bao
    • Benjamin P. Kellman
    • Nathan E. Lewis
  • Article
    | Open Access

    Mass spectrometry-based metabolomics is a powerful method for profiling large clinical cohorts but batch variations can obscure biologically meaningful differences. Here, the authors develop a computational workflow that removes unwanted data variation while preserving biologically relevant information.

    • Taiyun Kim
    • Owen Tang
    • Jean Yee Hwa Yang
  • Article
    | Open Access

    Reopening universities to students following COVID-19 restrictions risks increased transmission due to high numbers of social contacts and the potential for asymptomatic transmission. Here, the authors use a mathematical model with social contact data to estimate the impacts of reopening a typical non-campus-based university in the UK.

    • Ellen Brooks-Pollock
    • Hannah Christensen
    • Leon Danon
  • Article
    | Open Access

    Lineage tracing and snapshots of transcriptional state at the single-cell level are powerful, complementary tools for studying development. Here, the authors propose a mathematical method combining lineage tracing with trajectory inference to improve our understanding of development.

    • Aden Forrow
    • Geoffrey Schiebinger
  • Article
    | Open Access

    Age-related clonal hematopoiesis is associated with risk for diseases like acute myeloid leukemia (AML), yet it is unclear why some individuals do not progress despite having AML driver mutations. Here, the authors use deep learning and population genetics models to investigate how the interplay of positive and negative selection influences AML progression.

    • Kimberly Skead
    • Armande Ang Houle
    • Philip Awadalla
  • Article
    | Open Access

    The phenotypic consequence of 3D genome boundary disruption on developmental processes remains insufficiently understood. Here, the authors show that perturbation of a SOX17 boundary in human pluripotent stem cells interferes with proper differentiation and that germline variations affecting such boundaries are subject to selection, resulting in underrepresentation in the human population.

    • Hua-Jun Wu
    • Alexandro Landshammer
    • Franziska Michor
  • Article
    | Open Access

    Performing multiple histological stains on a biopsy can be costly and time-consuming. Here the authors present a method for the digital transformation of H&E-stained tissue into special stains (e.g., PAS, Masson’s Trichrome and Jones silver stain), and demonstrate that it improves diagnoses over the use of H&E only.

    • Kevin de Haan
    • Yijie Zhang
    • Aydogan Ozcan
  • Article
    | Open Access

    DNA is increasingly used as a medium to store non-genetic information. Here the authors present a dynamic stack data structure, implemented in DNA polymer chemistry, that can record and retrieve signals in last-in, first-out order.

    • Annunziata Lopiccolo
    • Ben Shirt-Ediss
    • Natalio Krasnogor
  • Article
    | Open Access

    Local gene co-expression is found throughout the genome, but systematic analysis of these co-expressed genes is needed. Here, the authors identify locally co-expressed genes in 49 tissues and characterize the genetic variants that may affect their expression and contribute to disease.

    • Diogo M. Ribeiro
    • Simone Rubinacci
    • Olivier Delaneau
  • Article
    | Open Access

    SARS-CoV-2 was first detected in Kenya in March 2020 and there was evidence of local transmission in the following months. Here, the authors characterise the early stages of the epidemic in coastal Kenya using phylogenetics and find evidence of multiple strain importations from international points of entry.

    • George Githinji
    • Zaydah R. de Laurent
    • Charles N. Agoti