Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Here, Libertini and colleagues devise a computational tool that can analyze whole-genome bisulfite sequencing (WGBS) data to recover ∼30% of the lost differential methylation position information. They use COMETgazer and COMETvintage to analyze 13 different methylome datasets to demonstrate their performance.

    • Emanuele Libertini
    • , Simon C. Heath
    •  & Stephan Beck
  • Article
    | Open Access

    Clinical RNA-seq datasets can predict clinical outcomes. Here, Shen et al. report a statistical method for survival analysis of mRNA isoform variation using clinical RNA-seq datasets, and the identified isoform-based survival predictors outperform gene-expression-based survival predictors using TCGA data on six cancer types.

    • Shihao Shen
    • , Yuanyuan Wang
    •  & Yi Xing
  • Article
    | Open Access

    Digital and analogue gene circuits each have distinct advantages in natural and engineered cells. Here, Rubens et al. engineer synthetic gene circuits that implement mixed-signal digital and analogue computations in living cells.

    • Jacob R. Rubens
    • , Gianluca Selvaggio
    •  & Timothy K. Lu
  • Article
    | Open Access

    Identifying and annotating functional elements in the human genome remains a challenging but important task. Here the authors propose a priority annotation score to rank identifications and suggest how proteogenomics evidence can be interpreted and what additional information substantiates protein-coding potential for annotation.

    • James C. Wright
    • , Jonathan Mudge
    •  & Jennifer Harrow
  • Article
    | Open Access

    3D genome structures are plastic and vary from cell to cell even in an isogenic sample. Here, the authors present an approach to identify frequent 3D chromatin clusters across a population of genome structures, either deconvoluted from ensemble-averaged Hi-C data or from a collection of single-cell Hi-C data.

    • Chao Dai
    • , Wenyuan Li
    •  & Xianghong Jasmine Zhou
  • Article
    | Open Access

    Stochastic reaction-diffusion systems are used for modelling spatial dynamics in many disciplines, but parameter inference and model selection remain challenging. Here the authors offer a solution enabled by a connection between reaction-diffusion and the well-studied spatio-temporal Cox processes.

    • David Schnoerr
    • , Ramon Grima
    •  & Guido Sanguinetti
  • Article
    | Open Access

    Upstream open reading frames (uORFs) can repress gene expression. Here, Guo-Liang Chew and colleagues use bioinformatics approaches to show that conservation of uORF-mediated translational repression is mediated by sequence features in human, mouse and zebrafish genomes.

    • Guo-Liang Chew
    • , Andrea Pauli
    •  & Alexander F. Schier
  • Article
    | Open Access

    Significant morphological changes occur during the conversion of the immature HIV virion into a mature infectious form. Here the authors use coarse-grained molecular dynamics simulations to model HIV-1 capsid self-assembly and disassembly events; the simulations suggest several metastable capsid intermediates sensitive to local conditions.

    • John M. A. Grime
    • , James F. Dama
    •  & Gregory A. Voth
  • Article
    | Open Access

    The prolactin receptor consists of a folded extracellular domain, a transmembrane domain and an intracellular intrinsically disordered domain. Here the authors use a combined experimental and computational approach to obtain a structure of a class I cytokine receptor, the human prolactin receptor.

    • Katrine Bugge
    • , Elena Papaleo
    •  & Birthe B. Kragelund
  • Article
    | Open Access

    Sudden arrhythmic death is a leading cause of mortality; however, approaches to identify at-risk patients are of low sensitivity and specificity. Here, the authors develop a personalized approach to assess arrhythmia risk in post-infarction patients based on cardiac imaging and computational modelling that significantly outperforms existing clinical metrics.

    • Hermenegild J. Arevalo
    • , Fijoy Vadakkumpadan
    •  & Natalia A. Trayanova
  • Article
    | Open Access

    Use of general linear mixed models (GLMMs) in genetic variance analysis can quantify the relative contribution of additive effects from genetic variation on a given trait. Here, Jonathan Mosley and colleagues apply GLMM in a phenome-wide analysis and show that genetic variations in the HLA region are associated with 44 phenotypes, 5 of which were not previously reported in GWASes.

    • Jonathan D. Mosley
    • , John S. Witte
    •  & Joshua C. Denny
  • Article
    | Open Access

    Here, Anders Krogh and colleagues describe Kaiju, a metagenome taxonomic classification program that uses maximum (in-)exact matches at the protein level to account for evolutionary divergence. The authors show that Kaiju is faster and more sensitive than existing algorithms and can be used on a standard computer.

    • Peter Menzel
    • , Kim Lee Ng
    •  & Anders Krogh
  • Article
    | Open Access

    The target of rapamycin (Tor) is a Ser/Thr protein kinase that regulates a wide range of anabolic and catabolic processes. Here the authors describe a sub-nanometer cryo-EM structure of a yeast Tor–Lst8 complex and propose an overall topology that differs from that previously suggested for mTORC1.

    • Domagoj Baretić
    • , Alex Berndt
    •  & Roger L. Williams
  • Article
    | Open Access

    Analyses of data from high-throughput genomic technologies are challenging given large data dimensionality. Here, Liu and colleagues describe a method called MANCIE (Matrix Analysis and Normalization by Concordant Information Enhancement) that can conduct genomic data normalization and bias correction to detect biologically relevant information.

    • Chongzhi Zang
    • , Tao Wang
    •  & X. Shirley Liu
  • Article
    | Open Access

    The global measurement of ribosome occupancy on mRNAs is commonly used as a proxy in estimating rates of protein synthesis. Here the authors describe Xtail, a computational approach that facilitates the extraction of accurate quantitative insight from ribosome profiling data (Ribo-Seq).

    • Zhengtao Xiao
    • , Qin Zou
    •  & Xuerui Yang
  • Article
    | Open Access

    The human genome is highly organized, with one-dimensional chromatin states packaged into higher-level three-dimensional architecture. Here, the authors present EpiTensor, which can identify 3D spatial associations from 1D epigenetic information.

    • Yun Zhu
    • , Zhao Chen
    •  & Wei Wang
  • Article
    | Open Access

    The validation and analysis of X-ray crystallographic data are essential for reproducibility and the development of crystallographic methods. Here, the authors describe a repository for crystallographic datasets and demonstrate some of the ways it could serve the crystallographic community.

    • Peter A. Meyer
    • , Stephanie Socias
    •  & Piotr Sliz
  • Article
    | Open Access

    Quantitative analysis of embryonic cell dynamics from large data sets remains a major challenge in the field of developmental biology. Here the authors develop software and a workflow to reconstruct cell lineage trees from 3D time lapse imaging data sets from several developing organisms including zebrafish, tunicates and sea urchins.

    • Emmanuel Faure
    • , Thierry Savy
    •  & Paul Bourgine
  • Article
    | Open Access

    Analysis of RNAi screens is a multi-step process requiring the sequential use of several unrelated resources. Here the authors generate an online resource integrating RNAi analytic tools and filters into a seamless workflow, which improves the specificity, selectivity and reproducibility of the results.

    • Bhaskar Dutta
    • , Alaleh Azhir
    •  & Iain D. C. Fraser
  • Article
    | Open Access

    The folding of protein domains can occur concomitant with their synthesis, and the rates at which individual codons are translated by the ribosome can affect the folding process. Here the authors present a kinetic model that accurately predicts the probability that a nascent protein domain will co-translationally fold in vivo.

    • Daniel A. Nissley
    • , Ajeet K. Sharma
    •  & Edward P. O’Brien
  • Article
    | Open Access

    TATA boxes in gene promoters are associated with a high level of cell-to-cell variation in gene expression. Through integration of multiple data sets, the authors now provide insights into how the interactions of TBP with DNA and other proteins can lead to noisy expression.

    • Charles N. J. Ravarani
    • , Guilhem Chalancon
    •  & M. Madan Babu
  • Article
    | Open Access

    The influence of species conservation on food webs is less well understood than the effects of species loss. Here, the authors test several indices against optimal food web management and find no current metrics are reliably effective at identifying species conservation priorities.

    • E. McDonald-Madden
    • , R. Sabbadin
    •  & H. P. Possingham
  • Article
    | Open Access

    The clinical application of new sequencing techniques is expected to accelerate pathogen identification. Here, Bradley et al. present a clinician-friendly software package that uses sequencing data for quick and accurate prediction of antibiotic resistance profiles for S. aureus and M. tuberculosis.

    • Phelim Bradley
    • , N. Claire Gordon
    •  & Zamin Iqbal
  • Article
    | Open Access

    Availability of computing power can limit computational analysis of large genetic and genomic datasets. Here, Canela-Xandri et al. describe software called DISSECT that is capable of analyzing large-scale genetic data by distributing the work across thousands of networked computers.

    • Oriol Canela-Xandri
    • , Andy Law
    •  & Albert Tenesa
  • Article
    | Open Access

    Cancer genetics has benefited from the advent of next generation sequencing, yet a comparison of sequencing and analysis techniques is lacking. Here, the authors sequence a normal-tumour pair and perform data analysis at multiple institutes and highlight some of the pitfalls associated with the different methods.

    • Tyler S. Alioto
    • , Ivo Buchhalter
    •  & Ivo G. Gut
  • Article
    | Open Access

    Single-cell RNA-sequencing (scRNA-seq) can be applied to dissect the kinetics of gene expression and patterns of allele-specific expression. Here, Kim et al. report a generative statistical model that can separate biological variability from technical noise by quantifying technical noise using external RNA spike-ins.

    • Jong Kyoung Kim
    • , Aleksandra A. Kolodziejczyk
    •  & John C. Marioni
  • Article
    | Open Access

    In chromatin endogenous cleavage (ChEC), micrococcal nuclease (MNase) is fused to a protein of interest and its cleavage is thus targeted to specific genomic loci in vivo. Here, the authors show that time-resolved ChEC-seq (high-throughput sequencing after ChEC) can detect DNA shape patterns regardless of motif strength.

    • Gabriel E. Zentner
    • , Sivakanthan Kasinathan
    •  & Steven Henikoff
  • Article
    | Open Access

    Das et al. present a novel Bayesian approach called expression Quantitative Trait enhancer Loci (eQTeL), which effectively integrates genetic and epigenetic information to identify combinations of regulatory genomic variants underlying expression variance. Using various functional data, the authors show the variants identified by eQTeL are likely to be causal.

    • Avinash Das
    • , Michael Morley
    •  & Sridhar Hannenhalli
  • Article
    | Open Access

    Assessing the functional impact of mutations in cancer on gene expression can improve our understanding of cancer biology and may identify potential therapeutic targets. Here, Ding et al. describe a novel statistical model named xseq for a systematic survey of how mutations impact transcriptome landscapes across 12 different tumour types.

    • Jiarui Ding
    • , Melissa K. McConechy
    •  & Sohrab P. Shah
  • Article
    | Open Access

    The biochemical pathways of central carbon metabolism are highly conserved across all domains of life. Here, Court et al. use a computational approach to test all possible pathways of glycolysis and gluconeogenesis and find that the existing trunk pathways may represent a maximal flux solution selected for during evolution.

    • Steven J. Court
    • , Bartlomiej Waclaw
    •  & Rosalind J. Allen
  • Article
    | Open Access

    Comprehensive digital information on species distributions is crucial for research in ecology, evolution and conservation. Here, Meyer et al. find large gaps and biases in global vertebrate point records, especially in emerging economies, and identify key factors currently limiting information.

    • Carsten Meyer
    • , Holger Kreft
    •  & Walter Jetz
  • Article
    | Open Access

    Cell-to-cell communication relies upon interactions between secreted ligands and cell surface receptors. Here, Ramilowski et al. present a draft cell-to-cell communication network based on expression of ligand-receptor pairs in 144 different human cell types.

    • Jordan A. Ramilowski
    • , Tatyana Goldberg
    •  & Alistair R. R. Forrest
  • Article
    | Open Access

    TALE proteins are popular tools for genome engineering because they can recognize specific DNA sequences; however, off-target effects are a routine problem. Here Rogers and Barrera et al. comprehensively map TALE–DNA interactions to develop a computational model to predict binding specificity.

    • Julia M. Rogers
    • , Luis A. Barrera
    •  & Martha L. Bulyk
  • Article
    | Open Access

    Adverse drug reactions are an important clinical problem. Here the authors combine information about drug-induced gene expression changes and genetic variability of patients with a genome-scale metabolic model to identify drug-induced changes in cellular metabolism that may be linked to drug side effects.

    • Daniel C. Zielinski
    • , Fabian V. Filipp
    •  & Bernhard O. Palsson
  • Article
    | Open Access

    Proteins are sometimes implicated in separate and seemingly unrelated processes, so-called moonlighting functions. Here the authors use bioinformatics tools to identify extreme multifunctional proteins and define a signature of extreme multifunctionality.

    • Charles E. Chapple
    • , Benoit Robisson
    •  & Christine Brun
  • Article
    | Open Access

    Planctomycetes appear to differ from all other bacteria in their cellular organization and their apparent lack of a peptidoglycan (PG) cell wall. Here Jeske et al. show that Planctomycetes do possess a typical PG cell wall and that their cellular architecture resembles that of Gram-negative bacteria.

    • Olga Jeske
    • , Margarete Schüler
    •  & Christian Jogler
  • Article

    There is currently no consensus on how best to identify and delimit biogeographical regions. Here the authors develop a network-based approach incorporating complex presence–absence patterns that can successfully identify commonly recognized biogeographical regions, and apply it to two large-scale data sets of plants and amphibians.

    • Daril A. Vilhena
    •  & Alexandre Antonelli
  • Article
    | Open Access

    Cells constantly integrate information from multiple stimuli. By considering every possible means by which two stimuli can interact, Cappuccio et al. define 10 interaction modes and demonstrate their preferential use by dendritic cells responding to different combinations of microbial and host inflammatory cues.

    • Antonio Cappuccio
    • , Raphaël Zollinger
    •  & Vassili Soumelis
  • Article

    Artifacts caused by whole-genome amplification bias are a recurrent challenge in single-cell sequencing analysis. Here, the authors develop statistical models and demonstrate an efficient strategy for controlling amplification errors by a joint analysis of single cell genomes.

    • Cheng-Zhong Zhang
    • , Viktor A. Adalsteinsson
    •  & J. Christopher Love
  • Article

    Sequential segmentation in development is best described in vertebrates, where it relies on cell proliferation and shows regular periodicity. Here, the authors show that in the flour beetle segments are added at an irregular rate and their elongation during periods of fast growth relies mostly on cell movements.

    • A. Nakamoto
    • , S. D. Hester
    •  & T. A. Williams
  • Article
    | Open Access

    The key regulators that allow the transition from a proliferative to an invasive phenotype in melanoma cells have not yet been identified. The authors perform chromatin and transcriptome profiling followed by comprehensive bioinformatics analysis, identifying new candidate regulators for two distinct cell states of melanoma.

    • Annelien Verfaillie
    • , Hana Imrichova
    •  & Stein Aerts
  • Article
    | Open Access

    Gradients of the secreted morphogen Sonic Hedgehog (Shh) pattern the neural tube in vertebrates. Cohen et al. quantify Shh signalling in developing mice, and by constructing a computational model of the process, identify mechanisms by which the dynamics of Shh signalling are regulated.

    • Michael Cohen
    • , Anna Kicheva
    •  & James Briscoe
  • Article

    The evolutionary origin of Hippopotamidae, the family of hippos, is poorly understood. Here, the authors describe a new fossil from Kenya that unambiguously roots Hippopotamidae into the group that includes the first large terrestrial mammals to invade Africa, more than 30 million years ago.

    • Fabrice Lihoreau
    • , Jean-Renaud Boisserie
    •  & Stéphane Ducrocq
  • Article
    | Open Access

    The activity of sensory neurons can be correlated with perceptual decisions and this effect may provide insights into how sensory information is processed during perceptual tasks. Here the authors develop a network model of sensory and decision-making areas and propose that the dynamics across the network hierarchy explain the choice probabilities.

    • Klaus Wimmer
    • , Albert Compte
    •  & Jaime de la Rocha