Genome informatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Chromatin loops bridging distant loci within chromosomes can be detected by a variety of techniques such as Hi-C. Here the authors present Chromosight, an algorithm applied to mammalian, bacterial, viral and yeast genomes that detects various types of patterns in chromosome contact maps, including chromosomal loops.

    • Cyril Matthey-Doret, Lyam Baudry & Axel Cournac
  • Article
    | Open Access

    The molecular basis for the unique taste and aroma of tea cultivars is largely unknown, but is critical for breeding new cultivars. Here the authors use transcriptomics and metabolomics to study the relationship among phylogenetic groups and specialized metabolites from 136 tea accessions in China.

    • Xiaomin Yu, Jiajing Xiao & Renyi Liu
  • Article
    | Open Access

    Acral melanoma occurs on the soles of the feet, palms of the hands and in nail beds. Here, the authors report the genomic landscape of 87 acral melanomas and find that some tumors harbor a UV signature and that the tumors are diverse at the levels of mutational signatures, structural aberrations and copy number signatures.

    • Felicity Newell, James S. Wilmott & Nicholas K. Hayward
  • Article
    | Open Access

    Genomic analysis of neuroblastoma has revealed important aspects of disease etiology. In this study, the authors assembled whole genome, exome and transcriptome data from over 700 neuroblastomas and identified molecular signatures correlated with age, and rare, potentially targetable variants overlooked in smaller cohorts.

    • Samuel W. Brady, Yanling Liu & Jinghui Zhang
  • Article
    | Open Access

    The evolutionary progression from primary to metastatic prostate cancer is largely uncharted, and the implications for liquid biopsy are unexplored. Here, the authors use deep genomic sequencing and histopathological information to trace tumor evolution both within the prostate and during metastasis in ten men.

    • D. J. Woodcock, E. Riabchenko & D. C. Wedge
  • Article
    | Open Access

    Despite the identification of genetic risk loci for late-onset Alzheimer’s disease (LOAD), the genetic architecture of the disease and its prediction remain unclear. Here, the authors use genetic risk scores for prediction of LOAD across three datasets and show evidence suggesting oligogenic variant architecture for this disease.

    • Qian Zhang, Julia Sidorenko & Peter M. Visscher
  • Article
    | Open Access

    A biologically interpretable and robust metric that provides insight into one’s health status from a gut microbiome sample is an important clinical goal in current human microbiome research. Herein, the authors introduce a species-level index that predicts the likelihood of having a disease.

    • Vinod K. Gupta, Minsuk Kim & Jaeyun Sung
  • Article
    | Open Access

    An emerging body of evidence shows how biological sex affects cancer incidence, treatment and underlying biology. Here, using a large pan-cancer dataset, the authors further highlight how sex differences shape the cancer genome.

    • Constance H. Li, Stephenie D. Prokopec & Christian von Mering
  • Article
    | Open Access

    Pseudogenes are key markers of genome remodelling processes. Here the authors present genome-wide annotation of the pseudogenes in the mouse reference genome and 18 inbred mouse strains, update human pseudogene annotations, and characterise the transcription and evolution of mouse pseudogenes.

    • Cristina Sisu, Paul Muir & Mark Gerstein
  • Article
    | Open Access

    CRISPR-Cas is a host adaptive immune system, and viruses harbor diverse anti-CRISPR proteins (Acrs). Here, the authors develop a random forest machine-learning approach to predict Acrs, identifying 2500 candidate Acr families, which expand the current repertoire of predicted Acrs by two orders of magnitude.

    • Ayal B. Gussow, Allyson E. Park & Eugene V. Koonin
  • Article
    | Open Access

    Predicting chromatin loops from genome-wide interaction matrices such as Hi-C data provides insight into gene regulation events. Here, the authors present Peakachu, a Random Forest classification framework that predicts chromatin loops from genome-wide contact maps, and apply it to systematically predict chromatin loops in 56 Hi-C datasets, with results available at the 3D Genome Browser.

    • Tarik J. Salameh, Xiaotao Wang & Feng Yue
  • Article
    | Open Access

    Joint analysis of multiple traits can increase power and provide insights into shared genetic architecture. Here, Nguyen et al. develop multi-trait TADA (mTADA), an extension of TADA (transmission and de novo association test) that jointly analyses de novo mutations of traits for improved risk-gene identification power.

    • Tan-Hoang Nguyen, Amanda Dobbyn & Eli A. Stahl
  • Article
    | Open Access

    Upstream open reading frames (uORFs), located in 5’ untranslated regions, are regulators of downstream protein translation. Here, Whiffin et al. use the genomes of 15,708 individuals in the Genome Aggregation Database (gnomAD) to systematically assess the deleteriousness of variants creating or disrupting uORFs.

    • Nicola Whiffin, Konrad J. Karczewski & James S. Ware
  • Article
    | Open Access

    Regulation of chromosome structure plays essential roles in many nuclear processes. Here, the authors present TADdyn, a tool that integrates time-course 3C data, restraint-based modelling, and molecular dynamics to simulate the structural rearrangements of genomic loci, and find that during gene activation, transcription start sites come into contact with open chromatin regions, forming active physical domains.

    • Marco Di Stefano, Ralph Stadhouders & Marc A. Marti-Renom
  • Article
    | Open Access

    Plasmids can mediate the exchange of genetic material between bacterial cells. Here, Acman et al. use network analyses to study the population structure and dynamics of over 10,000 plasmids, assigning them into cliques that correlate with gene content, host range, and existing classifications based on replicon and mobility types.

    • Mislav Acman, Lucy van Dorp & Francois Balloux
  • Article
    | Open Access

    Empirical examples documenting the pace of adaptation across the whole genome in wild populations are scarce. Here the authors study wild stickleback populations from lake and stream habitats and show that there is a genome-wide signature of adaptation to stream habitats within just one generation.

    • Telma G. Laurentino, Dario Moser & Daniel Berner
  • Article
    | Open Access

    Evolutionary steering uses therapies to control tumour evolution by exploiting trade-offs. Here, using a barcoding approach applied to large cell populations, the authors explore evolutionary steering in lung cancer cells treated with EGFR inhibitors.

    • Ahmet Acar, Daniel Nichol & Andrea Sottoriva
  • Article
    | Open Access

    A fraction of mammalian CTCF binding sites fall within transposable elements (TEs), but their contribution to the evolution of 3D chromatin structure is unknown. Here the authors investigate the effect of TE-driven CTCF binding site expansions on chromatin looping in humans and mice, and provide evidence that TEs contribute to cell-specific and species-specific chromatin looping diversity and variable gene regulation in mammalian genomes.

    • Adam G. Diehl, Ningxin Ouyang & Alan P. Boyle
  • Article
    | Open Access

    Population structure, even subtle differences within seemingly homogeneous populations, can have an impact on the accuracy of polygenic prediction. Here, Sakaue et al. use dimensionality reduction methods to reveal fine-scale structure in the Biobank Japan cohort and explore the performance of polygenic risk scores.

    • Saori Sakaue, Jun Hirata & Yukinori Okada
  • Article
    | Open Access

    Prior to genome assembly, the raw sequencing reads must be analyzed for assessment of major genome characteristics such as genome size, heterozygosity, and repetitiveness. For this purpose, the authors introduce GenomeScope 2.0, an extension of GenomeScope for polyploid genomes, and Smudgeplot, which can estimate a genome’s ploidy.

    • T. Rhyker Ranallo-Benavidez, Kamil S. Jaron & Michael C. Schatz
  • Article
    | Open Access

    For single-cell RNA-seq experiments the sequencing budget is limited, and how it should be optimally allocated to maximize information is not clear. Here the authors develop a mathematical framework to show that, for estimating many gene properties, the optimal allocation is to sequence at the depth of one read per cell per gene.

    • Martin Jinye Zhang, Vasilis Ntranos & David Tse
  • Article
    | Open Access

    Microbes secrete a repertoire of extracellular proteins to serve various functions depending on the ecological context. Here the authors examine how bacterial community composition and habitat structure affect extracellular protein production, showing that generalist species and those living in more structured environments produce more extracellular proteins, and that costs of production are lower in more diverse communities.

    • Marc Garcia-Garcera & Eduardo P. C. Rocha
  • Article
    | Open Access

    Copy number alterations (CNAs) can drive tumor progression in cancer by altering gene expression levels, but transcriptional adaptation can skew CNA impact. Here, the authors present transcriptional adaptation to CNA (TACNA) profiling, a tool to extract the transcriptional effect of CNAs from expression data without requiring paired CNA profiles.

    • Arkajyoti Bhattacharya, Rico D. Bense & Rudolf S. N. Fehrmann
  • Article
    | Open Access

    Transcript assembly is an important step in analysis of RNA-seq data whose accuracy influences downstream quantification, detection and characterization of alternative splice variants. Here, the authors develop PsiCLASS, a transcript assembler leveraging simultaneous analysis of multiple RNA-seq samples.

    • Li Song, Sarven Sabunciyan & Liliana Florea
  • Article
    | Open Access

    In-depth functional characterization of genomes relies on comprehensive transcriptome data. Here, the authors employ four complementary RNA sequencing technologies to explore the transcription landscape across 16 tissues or different organ types in diploid A genome cotton using a newly developed computational pipeline.

    • Kun Wang, Dehe Wang & Yuxian Zhu
  • Article
    | Open Access

    The HiChIP/PLAC-seq assay is popular for profiling 3D genome interactions among regulatory elements at kilobase resolution. Here the authors describe FitHiChIP, an empirical null-based, flexible computational method for statistical significance estimation and loop calling from HiChIP data.

    • Sourya Bhattacharyya, Vivek Chandra & Ferhat Ay
  • Article
    | Open Access

    Archaea and bacteria often have gene pairs with overlapping stop and start codons, suggesting translational coupling. Here, Huber et al. analyse overlapping gene pairs from 720 genomes, and validate translational coupling via termination-reinitiation for 14 gene pairs in Haloferax volcanii and Escherichia coli.

    • Madeleine Huber, Guilhem Faure & Jörg Soppa
  • Article
    | Open Access

    Sequencing platforms such as Oxford Nanopore or Pacific Biosciences generate long-read data that preserve long-range genomic information but have high error rates. Here, the authors develop MetaMaps, a computational tool for strain-level metagenomic assignment and compositional estimation using long reads.

    • Alexander T. Dilthey, Chirag Jain & Adam M. Phillippy
  • Article
    | Open Access

    Simulated single-cell RNA sequencing data is useful for method development and comparison. Here, the authors develop SymSim, a simulator that explicitly models the main factors of variation in single-cell data.

    • Xiuwei Zhang, Chenling Xu & Nir Yosef
  • Article
    | Open Access

    Genome-wide association studies (GWAS) have so far uncovered more than 200 loci for multiple sclerosis (MS). Here, the authors integrate data from various sources for a cell type-specific pathway analysis of MS GWAS results that specifically highlights the involvement of the immune system in disease pathogenesis.

    • Lohith Madireddy, Nikolaos A. Patsopoulos & Sergio E. Baranzini
  • Article
    | Open Access

    Bacteroidetes genomes contain polysaccharide utilization loci (PULs), each of which encodes enzymes for the breakdown of one particular glycan. By analyzing the enzyme composition of 13,537 PULs, the authors suggest that the natural glycan diversity is orders of magnitude lower than previously proposed.

    • Pascal Lapébie, Vincent Lombard & Bernard Henrissat
  • Article
    | Open Access

    Nanopore sequencing technology generates longer reads than other current technologies, but with more errors. Here, the authors develop new analytical tools to improve accuracy and evaluate the potential of nanopore sequencing for clinical human genomics.

    • Rory Bowden, Robert W. Davies & Peter Donnelly
  • Article
    | Open Access

    Any DNA sequence can be represented by a chiral partner sequence – an exact copy arranged in reverse nucleotide order. Here, the authors show that chiral DNA sequence pairs share important properties and show the utility of synthetic chiral sequences (sequins) as controls for clinical genomics.

    • Ira W. Deveson, Bindu Swapna Madala & Tim R. Mercer
  • Article
    | Open Access

    Due to various structural and sequence complexities, the human Y chromosome is challenging to sequence and characterize. Here, the authors develop a strategy to sequence native, unamplified, flow-sorted Y chromosomes with a nanopore sequencing platform, and report the first assembly of a human Y chromosome of African origin.

    • Lukas F. K. Kuderna, Esther Lizano & Tomas Marques-Bonet