Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Cancer cells reprogramme their metabolism with unclear clinical implications. Here, the authors analyse the expression of metabolic genes across 20 types of solid cancers and find that clinical aggressiveness, poor survival and metastasis are associated with the deregulation of mitochondrial metabolism.

    • Edoardo Gaude
    •  & Christian Frezza
  • Article
    | Open Access

    Expression of TEM β-lactamase is a predominant mechanism underlying antibiotic resistance in pathogenic Gram-negative bacteria. Here, the authors use Markov state models to reveal and experimentally confirm hidden conformations that determine TEM substrate specificity.

    • Kathryn M. Hart
    • , Chris M. W. Ho
    •  & Gregory R. Bowman
  • Article
    | Open Access

    Retinitis pigmentosa is often caused by mutations that affect the activity or transport of rhodopsin, but some mutations cause disease even though an apparently functional protein is produced. Here the authors show that three such enigmatic mutants retain scramblase activity but are unable to dimerize.

    • Birgit Ploier
    • , Lydia N. Caro
    •  & Anant K. Menon
  • Article
    | Open Access

    Many drugs are small-molecule inhibitors of cell signalling. Here, through single-cell analysis and mathematical modelling, the authors show that cell-to-cell variability diversifies inhibition response into digital and analogue, and that the two translate into distinct long-term functional responses.

    • Robert M. Vogel
    • , Amir Erez
    •  & Grégoire Altan-Bonnet
  • Article
    | Open Access

    Building multi-component enzymatic processes in one pot is challenged by the inherent complexity of each biochemical system. Here, the authors use online mass spectroscopy and engineering systems theory to achieve forward design of a ten-membered reaction cascade.

    • Christoph Hold
    • , Sonja Billerbeck
    •  & Sven Panke
  • Article
    | Open Access

    A wealth of gene expression data is publicly available, yet is of little use without additional human curation. Ma’ayan and colleagues report a crowdsourcing project involving over 70 participants to annotate and analyse thousands of human disease-related gene expression datasets.

    • Zichen Wang
    • , Caroline D. Monteiro
    •  & Avi Ma’ayan
  • Article
    | Open Access

    Plasticity and clonal population structure in bacterial genomes can hinder traditional SNP-based genetic association studies. Here, Corander and colleagues present a method to identify variable-length sequence elements enriched in a phenotype of interest, and demonstrate its use in human pathogens.

    • John A. Lees
    • , Minna Vehkala
    •  & Jukka Corander
  • Article
    | Open Access

    Genome interpretation and analysis of allelic activity require appropriate haplotype phasing. Here the authors present phASER, a fast and accurate method for variant phasing from RNA-seq and genome sequencing data.

    • Stephane E. Castel
    • , Pejman Mohammadi
    •  & Tuuli Lappalainen
  • Article
    | Open Access

    To modulate gene expression, the glucocorticoid receptor binds to response elements (RE) that vary in sequence. Here, the authors show that RE sequences can modulate glucocorticoid receptor structure and activity, which might provide regulatory specificity towards individual target genes.

    • Stefanie Schöne
    • , Marcel Jurk
    •  & Sebastiaan H. Meijsing
  • Article
    | Open Access

    Clonal haematopoiesis has been thought to occur in less than 10% of individuals younger than 70 years old. Here, the authors use an error-corrected next-generation sequencing method to find clonal haematopoiesis in the peripheral blood of 19 of 20 healthy 50–70-year-old individuals.

    • Andrew L. Young
    • , Grant A. Challen
    •  & Todd E. Druley
  • Article
    | Open Access

    Long non-coding RNAs are increasingly recognised to be important factors in regulating cellular processes and comprise a large fraction of the transcriptome; however, most are uncharacterised. Here the authors present RACE-Seq, a tool to improve and extend the annotation of low-expression transcripts.

    • Julien Lagarde
    • , Barbara Uszczynska-Ratajczak
    •  & Jennifer Harrow
  • Article
    | Open Access

    Despite their complexity, ecological networks appear robust to species loss. Here, Strona and Lafferty use artificial life simulations and real-world data to show that such robustness applies to stable conditions, but can collapse when the environment changes.

    • Giovanni Strona
    •  & Kevin D. Lafferty
  • Article
    | Open Access

    The cornea is formed of cells that originate from the outer circle of stem cells and that move towards its centre. Here, the authors show that the movement pattern is self-organised, requiring no cues, and that stem cell leakage may account for the presence of stem cells at the centre of the cornea.

    • Erwin P. Lobo
    • , Naomi C. Delic
    •  & J. Guy Lyons
  • Article
    | Open Access

    It is difficult to image haematopoietic stem cells (HSCs) in their niche. Here, the authors present a new high-throughput computational approach to visualise HSCs in vivo at a high spatial and temporal resolution and also use a Msi2-reporter to label endogenous HSCs and progenitors, enabling cell tracking.

    • Claire S. Koechlein
    • , Jeffrey R. Harris
    •  & Tannishtha Reya
  • Article
    | Open Access

    Chromosomal aberrations can be detected by global gene expression analysis. Here, the authors report eSNP-Karyotyping, a new method that can detect chromosomal aberrations by measuring the ratio of expression between the two alleles without comparison to a matched diploid sample.

    • Uri Weissbein
    • , Maya Schachter
    •  & Nissim Benvenisty
  • Article
    | Open Access

    Here, Libertini and colleagues devise a computational tool that can analyze whole-genome bisulfite sequencing (WGBS) data to recover ∼30% of the lost differential methylation position information. They use COMETgazer and COMETvintage to analyze 13 different methylome datasets to demonstrate their performance.

    • Emanuele Libertini
    • , Simon C. Heath
    •  & Stephan Beck
  • Article
    | Open Access

    Clinical RNA-seq datasets can predict clinical outcomes. Here, Shen et al. report a statistical method for survival analysis of mRNA isoform variation using clinical RNA-seq datasets. On TCGA data from six cancer types, the identified isoform-based survival predictors outperform gene expression-based survival predictors.

    • Shihao Shen
    • , Yuanyuan Wang
    •  & Yi Xing
  • Article
    | Open Access

    Digital and analogue gene circuits each have distinct advantages in natural and engineered cells. Here, Rubens et al. engineer synthetic gene circuits that implement mixed-signal digital and analogue computations in living cells.

    • Jacob R. Rubens
    • , Gianluca Selvaggio
    •  & Timothy K. Lu
  • Article
    | Open Access

    Identifying and annotating functional elements in the human genome remains a challenging but important task. Here the authors propose a priority annotation score to rank identifications and suggest how proteogenomics evidence can be interpreted and what additional information substantiates protein-coding potential for annotation.

    • James C. Wright
    • , Jonathan Mudge
    •  & Jennifer Harrow
  • Article
    | Open Access

    3D genome structures are plastic and vary from cell to cell even in an isogenic sample. Here, the authors present an approach to identify frequent 3D chromatin clusters across a population of genome structures, either deconvoluted from ensemble-averaged Hi-C data or from a collection of single-cell Hi-C data.

    • Chao Dai
    • , Wenyuan Li
    •  & Xianghong Jasmine Zhou
  • Article
    | Open Access

    Stochastic reaction-diffusion systems are used for modelling spatial dynamics in many disciplines, but parameter inference and model selection remain challenging. Here the authors offer a solution enabled by a connection between reaction-diffusion and the well-studied spatio-temporal Cox processes.

    • David Schnoerr
    • , Ramon Grima
    •  & Guido Sanguinetti
  • Article
    | Open Access

    Upstream open reading frames (uORFs) can repress gene expression. Here, Guo-Liang Chew and colleagues use bioinformatics approaches to show that conservation of uORF-mediated translational repression is mediated by sequence features in human, mouse and zebrafish genomes.

    • Guo-Liang Chew
    • , Andrea Pauli
    •  & Alexander F. Schier
  • Article
    | Open Access

    Significant morphological changes occur during the conversion of the immature HIV virion into a mature infectious form. Here the authors use coarse-grained molecular dynamics simulations to model HIV-1 capsid self-assembly and disassembly events, suggesting several metastable capsid intermediates that are sensitive to local conditions.

    • John M. A. Grime
    • , James F. Dama
    •  & Gregory A. Voth
  • Article
    | Open Access

    The prolactin receptor consists of a folded extracellular domain, a transmembrane domain and an intracellular intrinsically disordered domain. Here the authors use a combined experimental and computational approach to obtain a structure of a class I cytokine receptor, the human prolactin receptor.

    • Katrine Bugge
    • , Elena Papaleo
    •  & Birthe B. Kragelund
  • Article
    | Open Access

    Sudden arrhythmic death is a leading cause of mortality; however, approaches to identify at-risk patients are of low sensitivity and specificity. Here, the authors develop a personalized approach to assess arrhythmia risk in post-infarction patients based on cardiac imaging and computational modelling that significantly outperforms existing clinical metrics.

    • Hermenegild J. Arevalo
    • , Fijoy Vadakkumpadan
    •  & Natalia A. Trayanova
  • Article
    | Open Access

    Use of general linear mixed models (GLMMs) in genetic variance analysis can quantify the relative contribution of additive effects from genetic variation on a given trait. Here, Jonathan Mosley and colleagues apply GLMM in a phenome-wide analysis and show that genetic variations in the HLA region are associated with 44 phenotypes, 5 of which were not previously reported in GWASes.

    • Jonathan D. Mosley
    • , John S. Witte
    •  & Joshua C. Denny
  • Article
    | Open Access

    Here, Anders Krogh and colleagues describe Kaiju, a metagenome taxonomic classification program that uses maximum (in-)exact matches at the protein level to account for evolutionary divergence. The authors show that Kaiju is faster and more sensitive than existing algorithms and can be used on a standard computer.

    • Peter Menzel
    • , Kim Lee Ng
    •  & Anders Krogh
  • Article
    | Open Access

    The target of rapamycin (Tor) is a Ser/Thr protein kinase that regulates a wide range of anabolic and catabolic processes. Here the authors describe a sub-nanometer cryo-EM structure of a yeast Tor–Lst8 complex and propose an overall topology that differs from that previously suggested for mTORC1.

    • Domagoj Baretić
    • , Alex Berndt
    •  & Roger L. Williams
  • Article
    | Open Access

    Analyses of data from high-throughput genomic technologies are challenging given large data dimensionality. Here, Liu and colleagues describe a method called MANCIE (Matrix Analysis and Normalization by Concordant Information Enhancement) that can conduct genomic data normalization and bias correction to detect biologically relevant information.

    • Chongzhi Zang
    • , Tao Wang
    •  & X. Shirley Liu
  • Article
    | Open Access

    The global measurement of ribosome occupancy on mRNAs is commonly used as a proxy in estimating rates of protein synthesis. Here the authors describe Xtail, a computational approach that facilitates the extraction of accurate quantitative insight from ribosome profiling data (Ribo-Seq).

    • Zhengtao Xiao
    • , Qin Zou
    •  & Xuerui Yang
  • Article
    | Open Access

    The human genome is highly organized, with one-dimensional chromatin states packaged into higher level three-dimensional architecture. Here, the authors present EpiTensor, which can identify 3D spatial associations from 1D epigenetic information.

    • Yun Zhu
    • , Zhao Chen
    •  & Wei Wang
  • Article
    | Open Access

    The validation and analysis of X-ray crystallographic data is essential for reproducibility and the development of crystallographic methods. Here, the authors describe a repository for crystallographic datasets and demonstrate some of the ways it could serve the crystallographic community.

    • Peter A. Meyer
    • , Stephanie Socias
    •  & Piotr Sliz
  • Article
    | Open Access

    Quantitative analysis of embryonic cell dynamics from large data sets remains a major challenge in the field of developmental biology. Here the authors develop software and a workflow to reconstruct cell lineage trees from 3D time lapse imaging data sets from several developing organisms including zebrafish, tunicates and sea urchins.

    • Emmanuel Faure
    • , Thierry Savy
    •  & Paul Bourgine
  • Article
    | Open Access

    Analysis of RNAi screens is a multi-step process requiring the sequential use of several unrelated resources. Here the authors generate an online resource integrating RNAi analytic tools and filters into a seamless workflow, which improves the specificity, selectivity and reproducibility of the results.

    • Bhaskar Dutta
    • , Alaleh Azhir
    •  & Iain D. C. Fraser
  • Article
    | Open Access

    The folding of protein domains can occur concomitant with their synthesis, and the rates at which individual codons are translated by the ribosome can affect the folding process. Here the authors present a kinetic model that accurately predicts the probability that a nascent protein domain will co-translationally fold in vivo.

    • Daniel A. Nissley
    • , Ajeet K. Sharma
    •  & Edward P. O’Brien
  • Article
    | Open Access

    TATA boxes in gene promoters are associated with high levels of cell-to-cell variation in gene expression. Through integration of multiple data sets, the authors now provide insights into how the interactions of TBP with DNA and other proteins can lead to noisy expression.

    • Charles N. J. Ravarani
    • , Guilhem Chalancon
    •  & M. Madan Babu
  • Article
    | Open Access

    The influence of species conservation on food webs is less well understood than the effects of species loss. Here, the authors test several indices against optimal food web management and find no current metrics are reliably effective at identifying species conservation priorities.

    • E. McDonald-Madden
    • , R. Sabbadin
    •  & H. P. Possingham
  • Article
    | Open Access

    The clinical application of new sequencing techniques is expected to accelerate pathogen identification. Here, Bradley et al. present a clinician-friendly software package that uses sequencing data for quick and accurate prediction of antibiotic resistance profiles for S. aureus and M. tuberculosis.

    • Phelim Bradley
    • , N. Claire Gordon
    •  & Zamin Iqbal
  • Article
    | Open Access

    Availability of computing power can limit computational analysis of large genetic and genomic datasets. Here, Canela-Xandri et al. describe software called DISSECT that is capable of analyzing large-scale genetic data by distributing the work across thousands of networked computers.

    • Oriol Canela-Xandri
    • , Andy Law
    •  & Albert Tenesa