Sequence annotation articles within Nature Communications

Featured

  • Article
    | Open Access

    The cattle genome has been assembled only to a limited level of resolution. Here, using long-range nanopore sequencing, the authors present a cattle genome assembly from one animal that concentrates on characterising the immunogenomic loci, particularly the T cell receptor (TR), immunoglobulin (IG) and MHC genes.

    • Ting-Ting Li, Tian Xia & Tao Li
  • Article
    | Open Access

    Accurate long-read RNA sequencing facilitates analysis of full-length transcripts. Here the authors develop an integrative toolkit, optimised for Iso-Seq data analysis, that includes transcript alignment, annotation, quantification and gene fusion detection.

    • Yuchao Xia, Zijie Jin & Ruibin Xi
  • Article
    | Open Access

    Long-read sequencing is promising for the detection of structural variants (SVs), which requires algorithms with high sensitivity and precision. Here, the authors develop DeBreak, an algorithm for comprehensive and accurate SV detection in long-read sequencing data across different platforms, which outperforms other SV callers.

    • Yu Chen, Amy Y. Wang & Zechen Chong
  • Article
    | Open Access

    Genetic association studies for rare variants suffer from lack of power and thus there is a need for methods to improve rare variant discovery. Here, the authors present functionally informed association tests with increased statistical power to aid discovery and interpretation of rare variants.

    • Remo Monti, Pia Rautenstrauch & Christoph Lippert
  • Article
    | Open Access

    Here, we present TP-DB, a pattern-based search engine based on 1.67 million helices from the Protein Data Bank (PDB). We demonstrate the utility of TP-DB in identifying microbe-specific antigens, as well as in designing antimicrobial peptides and protein-protein interaction blockers.

    • Cheng-Yu Tsai, Emmanuel Oluwatobi Salawu & Lee-Wei Yang
  • Article
    | Open Access

    Non-coding RNA function is poorly understood, partly due to the challenge of determining RNA secondary (2D) structure. Here, the authors present a framework for the reproducible prediction and visualization of the 2D structure of a wide array of RNAs, which enables linking RNA sequence to function.

    • Blake A. Sweeney, David Hoksza & Anton I. Petrov
  • Article
    | Open Access

    The SARS-CoV-2 gene set remains unresolved, hindering dissection of COVID-19 biology. Comparing 44 Sarbecovirus genomes provides a high-confidence protein-coding gene set. The study characterizes protein-level and nucleotide-level evolutionary constraints, and prioritizes functional mutations from the ongoing COVID-19 pandemic.

    • Irwin Jungreis, Rachel Sealfon & Manolis Kellis
  • Article
    | Open Access

    Conventional single-cell RNA sequencing analyses rely on genome annotations that may be incomplete or inaccurate, especially for understudied organisms. Here, the authors present a bioinformatic tool that leverages single-cell data to uncover biologically relevant transcripts beyond the best available genome annotation.

    • Michael F. Z. Wang, Madhav Mantri & Iwijn De Vlaminck
  • Article
    | Open Access

    Accurate prediction of variant pathogenicity is essential to understanding genetic risks in disease. Here, the authors present a deep neural network method for prediction of missense variant pathogenicity, MVP, and demonstrate its utility in prioritizing de novo variants contributing to developmental disorders.

    • Hongjian Qi, Haicang Zhang & Yufeng Shen
  • Article
    | Open Access

    Researchers can make use of a variety of computational tools to prioritize genetic variants and predict their pathogenicity. Here, the authors evaluate the performance of six of these tools in three typical biological tasks and find generally low concordance between predictions and experimental confirmation.

    • Li Liu, Maxwell D. Sanderford & Sudhir Kumar
  • Article
    | Open Access

    An association between variants in 11 different genes and breast cancer risk has been established, and sequencing of these genes is recommended to provide personalized diagnosis, therapy, and surveillance for high-risk patients and their relatives. Here, the authors analyse the frequency of germline pathogenic mutations in these genes specifically in a Japanese population.

    • Yukihide Momozawa, Yusuke Iwasaki & Michiaki Kubo
  • Article
    | Open Access

    The globally distributed Ranidae (true frogs) are the largest frog family. Here, Hammond et al. present a draft genome of the North American bullfrog, Rana (Lithobates) catesbeiana, as a foundation for future understanding of true frog genetics as amphibian species face difficult environmental challenges.

    • S. Austin Hammond, René L. Warren & Inanc Birol
  • Article
    | Open Access

    While non-coding synonymous and intronic variants are often not under strong selective constraint, they can be pathogenic by affecting splicing or transcription. Here, the authors develop a score that uses sequence context alterations to predict the pathogenicity of synonymous and non-coding genetic variants, and provide a web server of pre-computed scores.

    • Sahar Gelfman, Quanli Wang & David B. Goldstein
  • Article
    | Open Access

    Long non-coding RNAs are increasingly recognised as important factors in regulating cellular processes and comprise a large fraction of the transcriptome; however, most are uncharacterised. Here, the authors present RACE-Seq, a tool to improve and extend the annotation of low-expression transcripts.

    • Julien Lagarde, Barbara Uszczynska-Ratajczak & Jennifer Harrow