Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    The increasing amount of raw RNA-seq data calls for new computational methods to mine information. Here, the authors present ASCOT, a computational resource to identify splice variants in RNA-seq data, and apply it to splicing patterns in neurons and unique splicing patterns in rod photoreceptors.

    • Jonathan P. Ling, Christopher Wilks & Seth Blackshaw
  • Article
    | Open Access

    Although transcription factor (TF) cooperativity is widespread, a global mechanistic understanding of its role is still lacking. Here the authors introduce a statistical learning framework that, based on next-generation sequencing data, provides structural and mechanistic insight into TF cooperativity, its functional consequences and its impact on protein-phenotype interactions.

    • Ignacio L. Ibarra, Nele M. Hollmann & Judith B. Zaugg
  • Article
    | Open Access

    Multivariable Mendelian randomization (MR) extends the standard MR framework to consider multiple risk factors in a single model. Here, Zuber et al. propose MR-BMA, a Bayesian variable selection approach to identify the likely causal determinants of a disease from many candidate risk factors, such as those found in high-throughput data sets.

    • Verena Zuber, Johanna Maria Colijn & Stephen Burgess
  • Article
    | Open Access

    Quantification and characterization of circRNAs in sequencing data remain challenging, hindering efforts to understand their roles and regulation. The algorithm introduced here enables accurate circRNA quantification and permits insight into competitive splicing between linear and circular isoforms.

    • Jinyang Zhang, Shuai Chen & Fangqing Zhao
  • Article
    | Open Access

    Identification of cancer driver genes, especially those that can act as tumour suppressors or oncogenes depending on context, remains a challenge. Here, the authors introduce Moonlight, a tool that integrates multi-omic data to address this challenge and identify numerous dual-role cancer genes.

    • Antonio Colaprico, Catharina Olsen & Elena Papaleo
  • Article
    | Open Access

    We lack effective treatment for half of children with high-risk neuroblastoma. Here, the authors introduce an algorithm that can predict the effect of interventions on gene expression signatures associated with disease processes and high risk, and use it to identify and validate promising drug targets.

    • Elin Almstedt, Ramy Elgendy & Sven Nelander
  • Article
    | Open Access

    RNA-sequencing is mostly used to assess gene expression; however, it can also give information about genetic variants. Here, the authors present CaSpER, a statistical framework that utilises RNA-sequencing reads to identify and visualise CNV events by integrating transcriptome-wide expression and allelic shift profiles.

    • Akdes Serin Harmanci, Arif O. Harmanci & Xiaobo Zhou
  • Article
    | Open Access

    It is important to analyse the local resolution of cryo-EM maps. Here the authors present MonoDir, a fully automatic and parameter-free method for the directional local resolution analysis of cryo-EM maps that requires only the final map as input; they also propose indicators for assessing map quality.

    • Jose Luis Vilas, Hemant D. Tagare & Carlos Oscar S. Sorzano
  • Article
    | Open Access

    Integrating independent large-scale pharmacogenomic screens can enable unprecedented characterization of genetic vulnerabilities in cancers. Here, the authors show that the two largest independent CRISPR-Cas9 gene-dependency screens are concordant, paving the way for joint analysis of the data sets.

    • Joshua M. Dempster, Clare Pacini & Francesco Iorio
  • Article
    | Open Access

    Dengue and Zika virus are related flaviviruses, and introduction of Zika in the Americas may have impacted dengue epidemiology. Here, Borchering et al. show that dengue incidence was unusually low in 2017 in Brazil and Colombia, and simulations incorporating immune-mediated interactions predict reductions in dengue following Zika outbreaks with subsequent rebounds.

    • Rebecca K. Borchering, Angkana T. Huang & Derek A. T. Cummings
  • Article
    | Open Access

    Common Fragile Sites (CFSs) are chromosome regions that are prone to breakage upon replication stress and are known to drive chromosome rearrangements during oncogenesis. Here the authors use genome-wide and single-cell techniques to assess how replication timing and transcriptional activity correlate with genome stability.

    • Olivier Brison, Sami El-Hilali & Chun-Long Chen
  • Article
    | Open Access

    Glioblastoma cells are known to adapt easily to different environments. Here, the authors study the dynamic adaptation of glioblastoma cells to the heterogeneous brain tumor microenvironment, showing that tumor cells demonstrate varying plasticity of their transcriptomic profiles and an ability to survive new stimuli, in part by propagating stochastic perturbations over their gene-regulatory network.

    • Orieta Celiku, Mark R. Gilbert & Orit Lavi
  • Article
    | Open Access

    Multiple sequence alignments of proteins carry information about evolution, the protein’s fitness landscape and its stability in the face of mutations. Here, the authors demonstrate the utility of latent space models learned using variational autoencoders to infer these properties from sequences.

    • Xinqiang Ding, Zhengting Zou & Charles L. Brooks III
  • Article
    | Open Access

    Increasing evidence supports the existence of ordered nanodomains (or rafts) in cholesterol-rich cell membranes. Here, the authors present molecular dynamics simulations and EPR experiments to monitor permeation of oxygen and water through membranes in the liquid-ordered and liquid-disordered phases.

    • An Ghysels, Andreas Krämer & Richard W. Pastor
  • Article
    | Open Access

    Existing computational approaches to predict long-range regulatory interactions do not fully exploit high-resolution Hi-C datasets. Here the authors present a Random Forests regression-based approach to predict high-resolution Hi-C counts using one-dimensional regulatory genomic signals.

    • Shilu Zhang, Deborah Chasman & Sushmita Roy
  • Article
    | Open Access

    Stem-cell-specific genes regulate processes such as maintenance, identity and/or division. Here, the authors show that, in the Arabidopsis root, TCX2, a gene expressed across different stem cell populations (a stem-cell-ubiquitous gene), controls division and identity by regulating stem-cell-type-specific networks.

    • Natalie M. Clark, Eli Buckner & Rossangela Sozzani
  • Article
    | Open Access

    Mechanistic insight into the regulation of transcriptional modules remains scarce. Here, the authors identify statistically independent gene sets by applying independent component analysis to a high-quality E. coli RNA-seq data compendium and find that most gene sets represent the effects of specific transcriptional regulators.

    • Anand V. Sastry, Ye Gao & Bernhard O. Palsson
  • Article
    | Open Access

    Disease heritability and genetic correlations between traits depend on genetics, the environment and their interaction. Here, Jia et al. compute disease prevalence curves and disease embeddings from electronic health records and impute heritability for hundreds of diseases and genetic correlations for thousands of disease pairs.

    • Gengjie Jia, Yu Li & Andrey Rzhetsky
  • Article
    | Open Access

    Sequencing cancer genomes reveals low-frequency novel somatic variants without known function. Here, the authors leverage statistical methodology from the fields of computational linguistics and ecology to highlight the potentially important signals harboured by these often-dismissed novel variants.

    • Saptarshi Chakraborty, Arshi Arora & Ronglai Shen
  • Article
    | Open Access

    Viral genomic DNA is often modified to evade the host bacterial restriction system. Here the authors identified 2′-deoxy-7-deazaguanine modifications on phage DNA by comparative genomics and experimental validation, showing their role in genome protection.

    • Geoffrey Hutinet, Witold Kot & Valérie de Crécy-Lagard
  • Article
    | Open Access

    How reproducible human kidney organoids derived from different iPSC lines are, and how faithful they are to human kidney tissue, remains unclear. Here, the authors use four human iPSC lines to derive kidney organoids and show that organoid composition is reproducible, comparable to human tissue and of improved quality after transplantation.

    • Ayshwarya Subramanian, Eriene-Heidi Sidhom & Anna Greka
  • Article
    | Open Access

    Visualisation tools that use dimensionality reduction, such as t-SNE, provide poor visualisation of large data sets containing millions of observations. Here the authors present opt-SNE, which automatically finds data set-tailored parameters for t-SNE to optimise visualisation and improve analysis.

    • Anna C. Belkina, Christopher O. Ciccolella & Jennifer E. Snyder-Cappione
  • Article
    | Open Access

    Haplotype information inferred by phasing is useful in genetic and genomic analysis. Here, the authors develop SHAPEIT4, a phasing method that exhibits sub-linear running time, provides accurate haplotypes and enables integration of external phasing information.

    • Olivier Delaneau, Jean-François Zagury & Emmanouil T. Dermitzakis
  • Article
    | Open Access

    Identification of clinically relevant gene expression signatures for cancer stratification remains challenging. Here, the authors introduce a flexible nonlinear signal superposition model that enables dissection of large gene expression data sets into signatures and extraction of gene interactions.

    • Michael Grau, Georg Lenz & Peter Lenz
  • Article
    | Open Access

    Structural variants may be omitted in sequence analysis despite their importance in genome variation and phenotypic impact. Here the authors present GraphTyper2, which uses pangenome graphs to genotype structural variants from short reads and can be applied in large-scale sequencing studies.

    • Hannes P. Eggertsson, Snaedis Kristmundsdottir & Pall Melsted
  • Article
    | Open Access

    Machine learning algorithms can be trained to estimate age from brain structural MRI. Here, the authors introduce a new deep-learning-based age prediction approach, and then carry out a GWAS of the difference between predicted and chronological age, revealing two associated variants.

    • B. A. Jonsson, G. Bjornsdottir & M. O. Ulfarsson
  • Article
    | Open Access

    Cancer prevalence is disproportionately high in males. Here, the authors analyse the tumour suppressor p53 in sporadic cancers, highlighting a higher incidence of its mutation in males. Males are further disadvantaged by a failure to shield against the expression of damaged X-linked genes in p53 networks. These factors likely contribute to the sex disparity.

    • Sue Haupt, Franco Caramia & Ygal Haupt
  • Article
    | Open Access

    Reconstructing system dynamics on complex high-dimensional energy landscapes from static experimental snapshots remains challenging. Here, the authors introduce a framework to infer the essential dynamics of physical and biological systems without need for time-dependent measurements.

    • Philip Pearce, Francis G. Woodhouse & Jörn Dunkel
  • Article
    | Open Access

    In Drosophila, dosage compensation involves a twofold transcriptional upregulation of the single male X chromosome. Here the authors show that global conformational differences are specifically present in the male X chromosome and detectable using Hi-C data, indicating that dosage compensation affects global chromosome structure.

    • Koustav Pal, Mattia Forcato & Francesco Ferrari
  • Article
    | Open Access

    Genome sequencing is being widely adopted for diagnosis of genetic diseases, but identifying the causal variants remains challenging. Here, the authors introduce a tool that incorporates tissue-specific gene expression data into predicting variant pathogenicity, improving accuracy.

    • Denise Anderson, Gareth Baynam & Timo Lassmann
  • Article
    | Open Access

    Whole genome sequencing (WGS) holds promise to solve a subset of Mendelian disease cases for which exome sequencing did not provide a genetic diagnosis. Here, Wells et al. report a supervised machine learning model trained on functional, mutational and structural features for rank-scoring and interpreting variants in non-coding regions from WGS.

    • Alex Wells, David Heckerman & Julia di Iulio
  • Article
    | Open Access

    Drug target identification is a crucial step in drug development. Here, the authors introduce a Bayesian machine learning framework that integrates multiple data types to predict the targets of small molecules, enabling identification of a new set of microtubule inhibitors and the target of the anti-cancer molecule ONC201.

    • Neel S. Madhukar, Prashant K. Khade & Olivier Elemento
  • Article
    | Open Access

    Metabolic syndrome is characterized by complex phenotypes that increase the risk of cardiovascular disease and type 2 diabetes. Here, the authors’ integrative network analysis suggests that the BTK inhibitor ibrutinib is a promising treatment because it lowers obesity-associated inflammation.

    • Karla Misselbeck, Silvia Parolo & Corrado Priami
  • Article
    | Open Access

    Allele-specific expression at single-cell resolution can reveal stochastic and dynamic features of gene expression in greater detail. The authors propose scBASE, a soft zero-and-one inflated model that improves estimation of cellular allelic proportions by pooling information across cells.

    • Kwangbom Choi, Narayanan Raghupathy & Gary A. Churchill
  • Article
    | Open Access

    Imaging heart development is challenging due to constant tissue movement and changing physical landmarks. Here the authors present an algorithm capable of maintaining phase-locked imaging throughout a 24-hour timespan, enabling long-term time-lapse imaging studies of zebrafish heart development, repair and regeneration.

    • Jonathan M. Taylor, Carl J. Nelson & Martin A. Denvir
  • Article
    | Open Access

    Our understanding of the mechanisms of drug interactions remains limited. Here the authors introduce a framework to study how complex cellular perturbations induced by different drugs affect each other in morphological feature space.

    • Michael Caldera, Felix Müller & Jörg Menche
  • Article
    | Open Access

    The clonal origins of metastases and the timing of dissemination remain open questions for most cancer types. Using primary and metastatic samples taken from one colorectal cancer patient, Alves et al. use Bayesian phylogenetics to reconstruct the history of metastasis.

    • Joao M. Alves, Sonia Prado-López & David Posada
  • Article
    | Open Access

    N1-methyladenosine (m1A) was recently reported as a new mRNA modification, but its prevalence has been controversial. Here the authors show that m1A, if present in mRNA, is at very low stoichiometry, with the notable exception of MT-ND5. Further, they show that the previously reported enrichment of m1A near the start of transcripts reflects false-positive identifications caused by cross-reactivity of the commonly used m1A antibody with mRNA caps.

    • Anya V. Grozhik, Anthony O. Olarerin-George & Samie R. Jaffrey
  • Article
    | Open Access

    Various approaches are being used for polygenic prediction including Bayesian multiple regression methods that require access to individual-level genotype data. Here, the authors extend BayesR to utilise GWAS summary statistics (SBayesR) and show that it outperforms other summary statistic-based methods.

    • Luke R. Lloyd-Jones, Jian Zeng & Peter M. Visscher