Computational biology and bioinformatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Karyotyping of cancer genomes at the base level is technically challenging. Here, the authors introduce InfoGenomeR, an algorithm that can infer cancer genome karyotypes from whole-genome sequencing data, test their model on breast, ovarian and brain cancer samples, and identify private and shared mutations between primary and metastatic cancer samples.

    • Yeonghun Lee & Hyunju Lee
  • Article
    | Open Access

    Methods to produce haplotype-resolved genome assemblies often rely on access to family trios. The authors present FALCON-Phase, a tool that combines ultra-long range Hi-C chromatin interaction data with a long read de novo assembly to extend haplotype phasing to the contig or scaffold level.

    • Zev N. Kronenberg, Arang Rhie & Sarah B. Kingan
  • Article
    | Open Access

    Highly endangered species like the Sumatran rhinoceros are at risk from inbreeding. Five historical and 16 modern genomes from across the species range show mutational load, but little evidence for local adaptation, suggesting that future inbreeding depression could be mitigated by assisted gene flow among populations.

    • Johanna von Seth, Nicolas Dussex & Love Dalén
  • Article
    | Open Access

    The ability to design functional sequences is central to protein engineering and biotherapeutics. Here the authors introduce a deep generative alignment-free model for sequence design applied to highly variable regions and design and test a diverse nanobody library with improved properties for selection experiments.

    • Jung-Eun Shin, Adam J. Riesselman & Debora S. Marks
  • Article
    | Open Access

    Data-rich networks can be difficult to interpret beyond a certain size. Here, the authors introduce a platform that uses virtual reality to allow the visual exploration of large networks, while interfacing with data repositories and other analytical methods to improve the interpretation of big data.

    • Sebastian Pirch, Felix Müller & Jörg Menche
  • Article
    | Open Access

    Current aptamer discovery approaches are unable to probe the complete space of possible sequences. Here, the authors use machine learning to facilitate the development of DNA aptamers with improved binding affinities, and truncate them without significantly compromising binding affinity.

    • Ali Bashir, Qin Yang & B. Scott Ferguson
  • Article
    | Open Access

    Our understanding of the age-related molecular alterations in cancer is still limited. Here, the authors perform a pan-cancer analysis of age-associated genomic, transcriptomic, and epigenetic alterations, linking age-related gene expression changes to age-related DNA methylation alterations.

    • Kasit Chatsirisupachai, Tom Lesluyes & João Pedro de Magalhães
  • Article
    | Open Access

    Dysregulated phosphorylation is well known in cancers, but it has largely been studied in isolation from mutations. Here the authors introduce HotPho, a tool that can discover spatial interactions between phosphosites and mutations, which are associated with activating mutations and genetic dependencies in cancer.

    • Kuan-lin Huang, Adam D. Scott & Li Ding
  • Article
    | Open Access

    Functional RNA secondary structure is important for pre-mRNA processing, including splicing, cleavage and polyadenylation, and RNA editing. Here the authors present a catalog of conserved long-range RNA structures in the human transcriptome by defining pairs of conserved complementary regions (PCCRs) in pre-aligned evolutionarily conserved regions.

    • Svetlana Kalmykova, Marina Kalinina & Dmitri Pervouchine
  • Article
    | Open Access

    De novo design of self-assembling protein nanostructures and materials is of significant interest, but the design of complex, multi-component assemblies is challenging. Here, the authors present a stepwise hierarchical approach to build such assemblies using helical repeat and helical bundle proteins as building blocks, and provide an in-depth structural characterization of the resulting assemblies.

    • Yang Hsia, Rubul Mout & David Baker
  • Article
    | Open Access

    Multiple molecular profiling methods are required to study urothelial non-muscle-invasive bladder cancer (NMIBC) due to its heterogeneity. Here the authors integrate multi-omics data of 834 NMIBC patients, identifying a molecular subgroup associated with multiple alterations and worse outcomes.

    • Sia Viborg Lindskrog, Frederik Prip & Lars Dyrskjøt
  • Article
    | Open Access

    Epigenetic and transcriptional dynamics are critical for both tissue homeostasis and injury response in the kidney. Leveraging a single cell multiomics atlas of the developing mouse kidney, the authors reveal key events in chromatin regulation and gene expression dynamics during postnatal development.

    • Zhen Miao, Michael S. Balzer & Katalin Susztak
  • Article
    | Open Access

    Deep learning methods show great promise for the analysis of microscopy images, but there is currently an accessibility barrier for many users. Here the authors report a convenient entry-level deep learning platform that can be used at no cost: ZeroCostDL4Mic.

    • Lucas von Chamier, Romain F. Laine & Ricardo Henriques
  • Article
    | Open Access

    Triple-negative breast cancer (TNBC) is an aggressive breast cancer subtype with poor prognostic outcomes. Here the authors characterize super-enhancer heterogeneity and identify genes that are specifically regulated by TNBC-specific super-enhancers, including FOXC1, MET and ANLN.

    • Hao Huang, Jianyang Hu & Y. Rebecca Chin
  • Article
    | Open Access

    Cancers in different populations have been shown to be genetically distinct. Here, the authors sequence breast cancers from Mexican-Hispanic patients and find that these patients have a higher percentage of Akt1 mutations compared to Caucasian and Asian populations, suggesting these are clinically actionable.

    • Sandra L. Romero-Cordoba, Ivan Salido-Guadarrama & Alfredo Hidalgo-Miranda
  • Article
    | Open Access

    Modern biological research is complicated by the difficulty of collecting, transforming, annotating, and integrating datasets. Here, the authors present Go Get Data, a fast, reproducible approach to installing standardized data recipes, with an application to genomics data.

    • Michael J. Cormier, Jonathan R. Belyeu & Aaron R. Quinlan
  • Article
    | Open Access

    Methods for profiling differences between individual cells are constantly expanding. Here, the authors present a computational framework for the analysis of chromatin accessibility data at the single-cell level that takes into account previous knowledge and data-specific characteristics.

    • Shengquan Chen, Guanao Yan & Zhixiang Lin
  • Article
    | Open Access

    Conventional single-cell RNA sequencing analysis relies on genome annotations that may be incomplete or inaccurate, especially for understudied organisms. Here the authors present a bioinformatic tool that leverages single-cell data to uncover biologically relevant transcripts beyond the best available genome annotation.

    • Michael F. Z. Wang, Madhav Mantri & Iwijn De Vlaminck
  • Article
    | Open Access

    Estimating the effects of non-pharmaceutical interventions for COVID-19 is challenging, partly due to variations in testing. Here, the authors use viral sequence data as an alternative means of inferring intervention effects, and show that delays in implementation resulted in more severe epidemics.

    • Manon Ragonnet-Cronin, Olivia Boyd & Erik Volz
  • Article
    | Open Access

    Despite linguistic and geographic diversity in South Eastern Bantu-speaking (SEB) groups of South Africa, genetic variation in these groups has not been investigated in depth. Here, the authors analyse genome-wide data from 5056 individuals, providing insights into demographic history across SEB groups.

    • Dhriti Sengupta, Ananyo Choudhury & Michèle Ramsay
  • Article
    | Open Access

    Variable number tandem repeats (VNTRs) are implicated in human diseases yet have been difficult to analyse computationally. Here, the authors describe a neural network method, adVNTR-NN, that allows rapid and accurate genotyping of VNTRs from large whole genome sequencing datasets.

    • Mehrdad Bakhtiari, Jonghun Park & Vineet Bafna
  • Article
    | Open Access

    Genetic correlation analyses give insight into complex disease, yet are limited by oversimplification. Here, the authors present LOGODetect, a method using summary statistics from genome-wide association studies to identify genomic regions with correlation signals across multiple phenotypes.

    • Hanmin Guo, James J. Li & Lin Hou
  • Article
    | Open Access

    Patient-derived xenografts are widely used for drug development, but the impact of murine viral infection remains underexplored. Here, the authors demonstrate the extensive existence of murine viral sequences in patient-derived xenografts and significant expression change of crucial genes in samples with high virus load.

    • Zihao Yuan, Xuejun Fan & W. Jim Zheng
  • Article
    | Open Access

    Likelihood optimization in phylogenetic tree reconstruction is computationally intensive, especially as the number of sequences and taxa included increases. Here, Azouri et al. show how an artificial intelligence approach can reduce computational time without losing accuracy of tree inference.

    • Dana Azouri, Shiran Abadi & Tal Pupko
  • Article
    | Open Access

    In genome-wide association meta-analysis, it is often difficult to find an independent dataset of sufficient size to replicate associations. Here, the authors have developed MAMBA to calculate the probability of replicability based on consistency between datasets within the meta-analysis.

    • Daniel McGuire, Yu Jiang & Dajiang J. Liu
  • Article
    | Open Access

    Many countries have closed schools as part of their COVID-19 response. Here, the authors model SARS-CoV-2 transmission on a network of schools and households in England, and find that risk of transmission between schools is lower if primary schools are open than if secondary schools are open.

    • James D. Munday, Katharine Sherratt & Sebastian Funk
  • Article
    | Open Access

    COVID-19 has caused many healthcare systems to become overwhelmed, potentially impacting patient care. Here, the authors show that COVID-19-related in-hospital mortality rates in Israel increased in periods of moderate or high hospital load, independent of patient characteristics.

    • Hagai Rossman, Tomer Meir & Malka Gorfine
  • Article
    | Open Access

    Drug use or bacterial infection can cause significant alterations of the gastric microbiome. Here, the authors show how advanced pattern recognition by nonlinear machine intelligence can help disclose a bacteria-metabolite network that sheds light on the mechanisms behind such perturbations.

    • Claudio Durán, Sara Ciucci & Carlo Vittorio Cannistraci
  • Article
    | Open Access

    Clustering cells based on similarities in gene expression is the first step towards identifying cell types in scRNASeq data. Here the authors incorporate biological knowledge into the clustering step to facilitate the biological interpretability of clusters, and subsequent cell type identification.

    • Tian Tian, Jie Zhang & Hakon Hakonarson
  • Article
    | Open Access

    Single-cell RNA-Seq suffers from heterogeneity in sequencing sparsity and complex differential patterns in gene expression. Here, the authors introduce a graph neural network based on a hypothesis-free deep learning framework as an effective representation of gene expression and cell–cell relationships.

    • Juexin Wang, Anjun Ma & Dong Xu
  • Article
    | Open Access

    Artificial intelligence and machine learning promise to transform cancer therapies by accurately predicting the most appropriate drugs to treat individual patients. Here, the authors present an approach which uses omics data to produce ordered lists of drugs based on their effectiveness in decreasing cancer cell proliferation.

    • Henry Gerdes, Pedro Casado & Pedro R. Cutillas
  • Article
    | Open Access

    Radiographic imaging is routinely used to evaluate treatment response in solid tumors. Here, the authors present a multi-task deep learning approach that allows simultaneous tumor segmentation and response prediction from longitudinal images in a multi-center study on rectal cancer.

    • Cheng Jin, Heng Yu & Ruijiang Li