Genome informatics articles within Nature Communications

Featured

  • Article
    | Open Access

    Any DNA sequence can be represented by a chiral partner sequence – an exact copy arranged in reverse nucleotide order. Here, the authors show that chiral DNA sequence pairs share important properties and show the utility of synthetic chiral sequences (sequins) as controls for clinical genomics.

    • Ira W. Deveson
    • , Bindu Swapna Madala
    •  & Tim R. Mercer
  • Article
    | Open Access

    Due to various structural and sequence complexities, the human Y chromosome is challenging to sequence and characterize. Here, the authors develop a strategy to sequence native, unamplified, flow-sorted Y chromosomes with a nanopore sequencing platform, and report the first assembly of a human Y chromosome of African origin.

    • Lukas F. K. Kuderna
    • , Esther Lizano
    •  & Tomas Marques-Bonet
  • Article
    | Open Access

    Gene pairs that are coexpressed across various environmental conditions in multiple species suggest functional similarity. Here the authors analyze patterns of gene expression co-evolution across diverse eukaryotes, and identify hundreds of protein complexes and pathways whose gene expression levels have co-evolved since their ancient divergence.

    • Trevor Martin
    •  & Hunter B. Fraser
  • Article
    | Open Access

    The evolution and genetic nature of metastatic lesions are not completely characterized. Here the authors perform a comprehensive whole-genome study of colorectal metastases in comparison to matched primary tumors and define a multistage progression model and metastasis-specific changes that, in part, are therapeutically actionable.

    • Naveed Ishaque
    • , Mohammed L. Abba
    •  & Heike Allgayer
  • Article
    | Open Access

    Integrated analyses of multiple large-scale screenings can be complicated by batch effects and technical artefacts. McFarland et al. introduce DEMETER2, a hierarchical model coupled with model-based normalization, which allows the assessment of differential dependencies across genes and cell lines.

    • James M. McFarland
    • , Zandra V. Ho
    •  & Aviad Tsherniak
  • Article
    | Open Access

    Divergent transcription from promoters and enhancers occurs in many species, but it is unclear if it is a general feature of all eukaryotic cis regulatory elements. Here the authors define cis regulatory elements in worms, flies, and humans, and identify several differences in regulatory architecture among metazoans.

    • Mahmoud M. Ibrahim
    • , Aslihan Karabacak
    •  & Uwe Ohler
  • Article
    | Open Access

    Short tandem repeats (STRs), like single nucleotide polymorphisms (SNPs), contribute to complex traits, but their ascertainment by next-generation sequencing is costly. Here, Saini et al. provide a SNP+STR haplotype reference panel that allows imputation of STRs from SNP array data.

    • Shubham Saini
    • , Ileena Mitra
    •  & Melissa Gymrek
  • Article
    | Open Access

    Transcription elongation (TE) is a key point of inducible gene expression regulation. Here, the authors report widespread TE defects (TEdeff) in a high proportion of cancers that correlate with poor immunotherapy response, highlighting TE defects as potential routes for immune resistance.

    • Vishnu Modur
    • , Navneet Singh
    •  & Kakajan Komurov
  • Article
    | Open Access

    CTCF mediates long-range chromatin interactions, which are important for genome organization and function. Here, the authors demonstrate that the CTCF-mediated interactome exhibits extensive plasticity and present Lollipop, a machine-learning framework that predicts CTCF-mediated long-range interactions using genomic and epigenomic features.

    • Yan Kai
    • , Jaclyn Andricovich
    •  & Weiqun Peng
  • Article
    | Open Access

    Sharing of whole genome sequencing (WGS) data improves study scale and power, but data from different groups are often incompatible. Here, US genome centers and NIH programs define WGS data processing standards and a flexible validation method, facilitating collaboration in human genetics research.

    • Allison A. Regier
    • , Yossi Farjoun
    •  & Ira M. Hall
  • Article
    | Open Access

    Clinical oncology is rapidly adopting next-generation sequencing technology for nucleotide variant and indel detection. Here the authors present a three-platform approach (whole-genome, whole-exome, and whole-transcriptome) in pediatric patients for the detection of diverse types of germline and somatic variants.

    • Michael Rusch
    • , Joy Nakitandwe
    •  & Jinghui Zhang
  • Article
    | Open Access

    Accurate detection of TADs requires ultra-deep sequencing and sophisticated normalisation procedures, which limits the analysis of Hi-C data. Here the authors develop a normalisation-free method to decode the domains of chromosomes (deDoc) that utilizes structural entropy to predict TADs with ultra-low sequencing data.

    • Angsheng Li
    • , Xianchen Yin
    •  & Zhihua Zhang
  • Article
    | Open Access

    The majority of the human reference genome assembly is represented as a single consensus haplotype. Here, Wong et al. analyze de novo assemblies of 17 diverse, haplotype-resolved genomes to gain insights into the structure of genetic diversity and compile a list of alternative haplotypes across populations.

    • Karen H. Y. Wong
    • , Michal Levy-Sakin
    •  & Pui-Yan Kwok
  • Article
    | Open Access

    Temporal programs of genome replication show different levels of conservation between closely or distantly related species. Here, the authors generate genome-wide replication timing profiles for ten yeast species, and analyze their evolutionary dynamics.

    • Nicolas Agier
    • , Stéphane Delmas
    •  & Gilles Fischer
  • Article
    | Open Access

    A central question in cancer research is how specific driver mutations are acquired and maintained during cancer development. Here Temko et al. use public sequencing data to infer the effect of mutation and selection on a set of driver mutations and suggest that selection frequently dominates.

    • Daniel Temko
    • , Ian P. M. Tomlinson
    •  & Trevor A. Graham
  • Article
    | Open Access

    Pulmonary arterial hypertension (PAH) is a rare lung disorder characterised by narrowing and obliteration of small pulmonary arteries ultimately leading to right heart failure. Here, the authors sequence whole genomes of over 1000 PAH patients and identify likely causal variants in GDF2, ATP13A3, AQP1 and SOX17.

    • Stefan Gräf
    • , Matthias Haimel
    •  & Nicholas W. Morrell
  • Article
    | Open Access

    Long-read sequencing technologies facilitate efficient and high quality genome assembly. Here Michael et al. achieve a fast reference assembly for Arabidopsis thaliana KBS-Mac-74 accession using the handheld Oxford Nanopore MinION sequencer and consumer computing hardware, and demonstrate its usefulness in resolving complex structural variation.

    • Todd P. Michael
    • , Florian Jupe
    •  & Joseph R. Ecker
  • Article
    | Open Access

    Centromeres and large-scale structural variants evolve and contribute to genome diversity during vertebrate speciation. Here Ichikawa et al. perform de novo long-read genome assembly of three inbred medaka strains, and report the long-range structure of centromeres and their methylation, as well as the correlation of structural variants with differential gene expression.

    • Kazuki Ichikawa
    • , Shingo Tomioka
    •  & Shinichi Morishita
  • Article
    | Open Access

    While non-coding synonymous and intronic variants are often not under strong selective constraint, they can be pathogenic through affecting splicing or transcription. Here, the authors develop a score that uses sequence context alterations to predict pathogenicity of synonymous and non-coding genetic variants, and provide a web server of pre-computed scores.

    • Sahar Gelfman
    • , Quanli Wang
    •  & David B. Goldstein
  • Article
    | Open Access

    Biomphalaria glabrata is a freshwater snail that acts as a host for the trematode Schistosoma mansoni, which causes intestinal infection in humans. This work describes genome and transcriptome analyses from 12 different tissues of B. glabrata, and identifies genes for snail behavior and evolution.

    • Coen M. Adema
    • , LaDeana W. Hillier
    •  & Richard K. Wilson
  • Article
    | Open Access

    Genome assembly for many plant species can be challenging due to large size and high repeat content. Here, the authors use in vitro proximity ligation to assemble the genome of lettuce, revealing a family-specific triplication event and providing a comprehensive reference genome for a member of the Compositae.

    • Sebastian Reyes-Chin-Wo
    • , Zhiwen Wang
    •  & Richard W. Michelmore
  • Article
    | Open Access

    The gene-battery model posits that transposable elements (TEs) may be cis-regulatory elements that control gene expression. Here, mouse-specific TEs are shown to be binding sites for multiple collaborating transcription factors in embryonic stem cells, and to act as cis-regulatory modules in a synergistic fashion.

    • Vasavi Sundaram
    • , Mayank N. K. Choudhary
    •  & Ting Wang
  • Article
    | Open Access

    Assembling genomes using currently available computational methods can be time consuming. Here, Coin and colleagues describe a bioinformatics tool named npScarf that can scaffold and complete an existing short read assembly in real-time using nanopore sequencing.

    • Minh Duc Cao
    • , Son Hoang Nguyen
    •  & Lachlan J. M. Coin
  • Article
    | Open Access

    Sequencing initiatives have detected multiple types of mutations in cancer. Here the authors, analysing enhancer-targeting sequence data, show that small insertions in transcriptional enhancers are frequently found near oncogenes, and demonstrate how one mutation deregulates expression of LMO2 in leukemia cells.

    • Brian J. Abraham
    • , Denes Hnisz
    •  & Richard A. Young
  • Article
    | Open Access

    Fission yeast Schizosaccharomyces pombe has diverse traits. Jeffares et al. characterize large copy number variations (CNVs) and rearrangements in S. pombe, and show that CNVs are transient with effects on quantitative traits and gene expression, whereas rearrangements influence intrinsic reproductive isolation.

    • Daniel C. Jeffares
    • , Clemency Jolly
    •  & Fritz J. Sedlazeck
  • Article
    | Open Access

    Currently available metagenomic data analysis relies on reference genomes. Here, the authors describe a new de novo metagenomic assembly method, metaSort, that constructs bacterial genomes from metagenomic samples to reduce microbial community complexity while increasing genome recovery and assembly.

    • Peifeng Ji
    • , Yanming Zhang
    •  & Fangqing Zhao
  • Article
    | Open Access

    Altered DNA methylation is a feature of cancer and between-patient variability is prevalent. Here, the authors integrate data on thousands of human tumours, and find that expression levels of methionine metabolism genes are predictive of methylation features, and that the breakdown of this relationship is a negative prognostic marker.

    • Mahya Mehrmohamadi
    • , Lucas K. Mentch
    •  & Jason W. Locasale
  • Article
    | Open Access

    To modulate gene expression, the glucocorticoid receptor binds to response elements (RE) that vary in sequence. Here, the authors show that RE sequences can modulate glucocorticoid receptor structure and activity, which might provide regulatory specificity towards individual target genes.

    • Stefanie Schöne
    • , Marcel Jurk
    •  & Sebastiaan H. Meijsing
  • Article
    | Open Access

    The global measurement of ribosome occupancy on mRNAs is commonly used as a proxy in estimating rates of protein synthesis. Here the authors describe Xtail, a computational approach that facilitates the extraction of accurate quantitative insight from ribosome profiling data (Ribo-Seq).

    • Zhengtao Xiao
    • , Qin Zou
    •  & Xuerui Yang
  • Article
    | Open Access

    The clinical application of new sequencing techniques is expected to accelerate pathogen identification. Here, Bradley et al. present a clinician-friendly software package that uses sequencing data for quick and accurate prediction of antibiotic resistance profiles for S. aureus and M. tuberculosis.

    • Phelim Bradley
    • , N. Claire Gordon
    •  & Zamin Iqbal
  • Article
    | Open Access

    Cancer genetics has benefited from the advent of next generation sequencing, yet a comparison of sequencing and analysis techniques is lacking. Here, the authors sequence a normal-tumour pair and perform data analysis at multiple institutes and highlight some of the pitfalls associated with the different methods.

    • Tyler S. Alioto
    • , Ivo Buchhalter
    •  & Ivo G. Gut
  • Article
    | Open Access

    The correct assembly of genomes from sequencing data remains a challenge due to difficulties in correctly assigning the location of repeated DNA elements. Here the authors describe GRAAL, an algorithm that utilizes genome-wide chromosome contact data within a probabilistic framework to produce accurate genome assemblies.

    • Hervé Marie-Nelly
    • , Martial Marbouty
    •  & Romain Koszul