Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Circular RNAs have been identified using short-read RNA sequencing. Here, the authors report isoCirc, a long-read sequencing method to characterize full-length circRNA isoforms and generate a catalogue of full-length circRNA isoforms in 12 human tissues and one human cell line.

    • Ruijiao Xin, Yan Gao & Yi Xing
  • Article
    | Open Access

    Patients with solid cancers have high rates of clonal haematopoiesis associated with increased risk of secondary leukemias. Here, by using peripheral blood sequencing data from patients with solid non-hematologic cancer, the authors profile the landscape of mosaic chromosomal alterations and gene mutations, defining patients at high risk of leukemia progression.

    • Teng Gao, Ryan Ptashkin & Elli Papaemmanuil
  • Article
    | Open Access

    The growing need for realism in addressing complex public health questions calls for accurate models of the human contact patterns that govern disease transmission. Here, the authors generate effective population-level contact matrices by using highly detailed macro (census) and micro (survey) data on key socio-demographic features.

    • Dina Mistry, Maria Litvinova & Alessandro Vespignani
  • Article
    | Open Access

    Sparse testing early in the SARS-CoV-2 pandemic hinders estimation of the dates and origins of initial case importations. Here, the authors show that the main source of cases imported from China shifted from Wuhan to other Chinese cities by mid-February, especially for African locations.

    • Tigist F. Menkir, Taylor Chin & Rene Niehus
  • Article
    | Open Access

    Mass spectrometry-based covalent labeling techniques such as hydroxyl radical protein footprinting (HRPF) provide information about protein tertiary structures. Here, the authors present a dynamics-driven, HRPF-guided algorithm for protein structure prediction that is incorporated in the Rosetta software suite and requires only the protein sequence and HRPF data as input, and they demonstrate its successful application to four benchmark proteins.

    • Sarah E. Biehn & Steffen Lindert
  • Article
    | Open Access

    Skeletal muscle conveys the beneficial effects of physical exercise, but its heterogeneity makes it challenging to study the effects of exercise on muscle fibres. Here, the authors carry out proteomic analysis of myofibres from freeze-dried muscle biopsies and show fibre-type-specific changes in response to exercise, with oxidative and glycolytic muscle fibres adapting differentially to exercise training.

    • A. S. Deshmukh, D. E. Steenberg & J. F. P. Wojtaszewski
  • Article
    | Open Access

    Post-transcriptional gene regulation is an important contributor to cell type-specific differences at the transcriptomic level. Here, the authors use a multiomics approach to characterize neuronal diversity in the mouse nervous system, analyzing the relative contributions of multiple layers of transcriptomic regulation in the specification of cell type identity.

    • Kevin C. H. Ha, Timothy Sterne-Weiler & Benjamin J. Blencowe
  • Article
    | Open Access

    Low-resource settings can face additional challenges in managing the COVID-19 pandemic. Here, the authors use mathematical modelling to investigate transmission in the state of Bahia, Brazil, and quantify control measures needed to prevent the hospital system becoming overwhelmed.

    • Juliane F. Oliveira, Daniel C. P. Jorge & Roberto F. S. Andrade
  • Article
    | Open Access

    Many computational tools identify mRNA variations by analyzing transformed RNA-seq data, such as collapsed reads. Here, the authors report a computational method that uses shape changes in the RNA-seq coverage profile to discover changes in mRNA expression and alternative splicing.

    • Hyo Young Choi, Heejoon Jo & D. Neil Hayes
  • Article
    | Open Access

    Contact tracing is critical to controlling COVID-19, but most protocols only “forward-trace” to notify people who were recently exposed. Using a stochastic branching-process model, the authors show that “bidirectional” tracing to identify infector individuals and their other infectees robustly improves outbreak control.

    • William J. Bradshaw, Ethan C. Alley & Kevin M. Esvelt
  • Article
    | Open Access

    Lack of a widespread surveillance network hampers accurate infectious disease forecasting. Here the authors provide a framework to optimize the selection of surveillance site locations and show that accurate forecasting of respiratory diseases for locations without surveillance is feasible.

    • Sen Pei, Xian Teng & Jeffrey Shaman
  • Article
    | Open Access

    Cellular senescence is a hallmark of ageing and is important for the pathogenesis of ageing-related diseases. Here, the authors develop a morphology-based deep learning system to identify senescent cells and a quantitative scoring system to evaluate the state of endothelial cells and assess the effects of anti-senescent reagents.

    • Dai Kusumoto, Tomohisa Seki & Shinsuke Yuasa
  • Article
    | Open Access

    While temperature impacts the function of all cellular components, it is hard to determine how the temperature dependence of cell phenotypes emerges from the temperature dependence of individual components. Here, the authors develop a Bayesian genome-scale modelling approach to identify thermal determinants of yeast metabolism.

    • Gang Li, Yating Hu & Jens Nielsen
  • Article
    | Open Access

    Balancing high resolution and broad genome coverage in single-cell Hi-C approaches remains challenging. Here, the authors describe a computational method for the reconstruction of a large 3D-ensemble of single-cell chromatin conformations from population Hi-C measurements and apply this model to study embryogenesis in Drosophila.

    • Qiu Sun, Alan Perez-Rathke & Jie Liang
  • Article
    | Open Access

    Digital trace data from search engines lacks information about the experiences of the individuals generating the data. Here the authors link search data and human computation to build a tracking model of influenza-like illness.

    • Stefan Wojcik, Avleen S. Bijral & David Lazer
  • Article
    | Open Access

    Safely reducing the necessary duration of quarantine for COVID-19 could lessen the economic impacts of the pandemic. Here, the authors demonstrate that testing on exit from quarantine is more effective than testing on entry, and can enable quarantine to be reduced from fourteen to seven days.

    • Chad R. Wells, Jeffrey P. Townsend & Alison P. Galvani
  • Article
    | Open Access

    Computational approaches to predict water’s role in host-ligand binding attract a great deal of attention. Here the authors use a metadynamics enhanced sampling method and machine learning to compute binding energies for host-guest systems from the SAMPL5 challenge and provide details of water structural changes.

    • Valerio Rizzi, Luigi Bonati & Michele Parrinello
  • Article
    | Open Access

    Integration of single cell data modalities increases the richness of information about the heterogeneity of cell states, but integration of imaging and transcriptomics is an open challenge. Here the authors use autoencoders to learn a probabilistic coupling and map these modalities to a shared latent space.

    • Karren Dai Yang, Anastasiya Belyaeva & Caroline Uhler
  • Article
    | Open Access

    The determination of whether cancer cell lines recapitulate the molecular features of corresponding patient tumours remains essential for the selection of appropriate cell line models for preclinical studies. The method developed here, Celligner, integrates cancer cell line and tumour RNA-seq datasets and reveals large differences in their concordance across cell lines and cancer types.

    • Allison Warren, Yejia Chen & James M. McFarland
  • Article
    | Open Access

    Age is one of the strongest risk factors for severe illness from COVID-19. By integrating human lung transcriptomes with experimental data on SARS-CoV-2, the authors pinpoint specific age-associated factors that could contribute to the heightened severity of COVID-19 in older populations.

    • Ryan D. Chow, Medha Majety & Sidi Chen
  • Article
    | Open Access

    In most model yeast species, the Origin Recognition Complex (ORC) binds defined, species-specific base sequences, whereas in humans the determinants of binding appear to be more complex. Here the authors reveal that the binding specificity of yeast ORC depends on a 19-amino-acid insertion helix in the Orc4 subunit, which is lost in humans.

    • Clare S. K. Lee, Ming Fung Cheung & Bik-Kwoon Tye
  • Article
    | Open Access

    RNA-sequencing data from tumours can be used to predict the prognosis of patients. Here, the authors show that a neural network meta-learning approach can be useful for predicting prognosis from a small number of samples.

    • Yeping Lina Qiu, Hong Zheng & Olivier Gevaert
  • Article
    | Open Access

    The uptake of hydrophobic molecules by bacterial FadL channels is implicated in quorum sensing, interactions with eukaryotic hosts and biodegradation of many pollutants. Insights into monoaromatic hydrocarbon uptake by TodX and CymD channels suggest that all FadL channels mediate substrate uptake via lateral diffusion.

    • Kamolrat Somboon, Anne Doble & Bert van den Berg
  • Article
    | Open Access

    The potential for accidental or deliberate misuse of biotechnology is of concern for international biosecurity. Here the authors apply machine learning to DNA sequences and associated phenotypic data to facilitate genetic engineering attribution and identify country-of-origin and ancestral lab of engineered DNA sequences.

    • Ethan C. Alley, Miles Turpin & Kevin M. Esvelt
  • Article
    | Open Access

    Pathogenicity scores are instrumental in prioritizing variants for Mendelian disease, yet their application to common disease is largely unexplored. Here, the authors assess the utility of pathogenicity scores for 41 complex traits and develop a framework to improve their informativeness for common disease.

    • Samuel S. Kim, Kushal K. Dey & Alkes L. Price
  • Article
    | Open Access

    The systematic characterization of C. elegans morphology during development has yet to be performed. Here, the authors produce a 3D atlas of C. elegans morphology from 17 embryos and 54 developmental stages, using an automated pipeline, CShaper (combining segmentation of fluorescently labeled membranes with automated cell lineage tracing).

    • Jianfeng Cao, Guoye Guan & Hong Yan
  • Article
    | Open Access

    The mechanical strength of in situ assembled nuclear lamin filaments arranged in a 3D meshwork is unclear. Here, using mechanical, structural and simulation tools, the authors report the hierarchical organization of the lamin meshwork that imparts strength and toughness to lamin filaments on par with silk and Kevlar®.

    • K. Tanuj Sapra, Zhao Qin & Ohad Medalia
  • Article
    | Open Access

    Here, the authors develop a genome evolution model to investigate the origin of functional redundancy in the human microbiome by analyzing its genomic content network and illustrate potential ecological and evolutionary processes that may contribute to its resilience.

    • Liang Tian, Xu-Wen Wang & Yang-Yu Liu
  • Article
    | Open Access

    Most approaches for modeling membrane protein complexes cannot incorporate the topological information provided by the membrane. Here, the authors present an integrative computational protocol for the modeling of membrane-associated protein assemblies, specifically complexes consisting of a membrane-embedded protein and a soluble partner.

    • Jorge Roel-Touris, Brian Jiménez-García & Alexandre M. J. J. Bonvin
  • Article
    | Open Access

    The TGFβ signaling pathway has been shown to regulate transcription by regulating enhancer activity. Here, the authors perform a comprehensive analysis of enhancers in normal mammary epithelial gland cells to elucidate how TGFβ-dependent enhancers control gene transcription in these cells.

    • Jose A. Guerrero-Martínez, María Ceballos-Chávez & Jose C. Reyes
  • Article
    | Open Access

    High numbers of COVID-19-related deaths have been reported in the United States, but estimation of the true numbers of infections is challenging. Here, the authors estimate that by 1 June 2020, 3.7% of the US population had been infected with SARS-CoV-2, and that 0.01% was infectious on that date, with wide variation by state.

    • H. Juliette T. Unwin, Swapnil Mishra & Seth Flaxman
  • Article
    | Open Access

    The long noncoding RNA XIST plays a central role in sex-specific gene expression in humans by silencing one of the two X chromosomes in female cells. Here the authors show that higher-order secondary structure creates the modular domain structure of the XIST ribonucleoprotein complex and the spatial separation of its functions.

    • Zhipeng Lu, Jimmy K. Guo & Howard Y. Chang
  • Article
    | Open Access

    Genome-wide maps of evolutionary constraint and large-scale compendia of epigenomic and transcription factor data provide complementary information for genome annotation. Here, the authors develop the Constrained Non-Exonic Predictor (CNEP) that enables better understanding of their relationship.

    • Olivera Grujic, Tanya N. Phung & Jason Ernst
  • Perspective
    | Open Access

    The IMEx consortium provides one of the largest resources of curated, experimentally verified molecular interaction data. Here, the authors review how IMEx evolved into a fundamental resource for life scientists and describe how IMEx data can support biomedical research.

    • Pablo Porras, Elisabet Barrera & Sandra Orchard
  • Article
    | Open Access

    Combinatorial treatments have become a standard of care for various complex diseases including cancers. Here, the authors show that combinatorial responses of two anticancer drugs can be accurately predicted using factorization machines trained on large-scale pharmacogenomic data for guiding precision oncology studies.

    • Heli Julkunen, Anna Cichonska & Juho Rousu
  • Article
    | Open Access

    Replicate runs of maximum likelihood phylogenetic analyses can generate different tree topologies due to differences in parameters, such as random seeds. Here, Shen et al. demonstrate that replicate runs can generate substantially different tree topologies even with identical data and parameters.

    • Xing-Xing Shen, Yuanning Li & Antonis Rokas
  • Article
    | Open Access

    Single-cell transcriptomics has enhanced our ability to profile heterogeneous cell populations, but it is not known which statistical frameworks perform well at detecting subpopulation-level responses. Here, the authors develop a simulation framework to evaluate various methods across a range of scenarios.

    • Helena L. Crowell, Charlotte Soneson & Mark D. Robinson