Sequence annotation

  • Article
    | Open Access

    Non-coding RNA function is poorly understood, partly due to the challenge of determining RNA secondary (2D) structure. Here, the authors present a framework for the reproducible prediction and visualization of the 2D structure of a wide array of RNAs, which enables linking RNA sequence to function.

    • Blake A. Sweeney
    • , David Hoksza
    •  & Anton I. Petrov
  • Article
    | Open Access

    The SARS-CoV-2 gene set remains unresolved, hindering dissection of COVID-19 biology. Comparing 44 Sarbecovirus genomes provides a high-confidence protein-coding gene set. The study characterizes protein-level and nucleotide-level evolutionary constraints, and prioritizes functional mutations from the ongoing COVID-19 pandemic.

    • Irwin Jungreis
    • , Rachel Sealfon
    •  & Manolis Kellis
  • Article
    | Open Access

    Conventional single-cell RNA sequencing analysis relies on genome annotations that may be incomplete or inaccurate, especially for understudied organisms. Here, the authors present a bioinformatic tool that leverages single-cell data to uncover biologically relevant transcripts beyond the best available genome annotation.

    • Michael F. Z. Wang
    • , Madhav Mantri
    •  & Iwijn De Vlaminck
  • Article
    | Open Access

    Accurate prediction of variant pathogenicity is essential to understanding genetic risks in disease. Here, the authors present a deep neural network method for prediction of missense variant pathogenicity, MVP, and demonstrate its utility in prioritizing de novo variants contributing to developmental disorders.

    • Hongjian Qi
    • , Haicang Zhang
    •  & Yufeng Shen
  • Article
    | Open Access

    Researchers can use a variety of computational tools to prioritize genetic variants and predict their pathogenicity. Here, the authors evaluate the performance of six of these tools in three typical biological tasks and find generally low concordance between predictions and experimental confirmation.

    • Li Liu
    • , Maxwell D. Sanderford
    •  & Sudhir Kumar
  • Article
    | Open Access

    Associations between variants in 11 different genes and breast cancer risk have been established, and sequencing of these genes is recommended to provide personalized diagnosis, therapy, and surveillance for high-risk patients and their relatives. Here, the authors analyse the frequency of germline pathogenic mutations in these genes specifically in a Japanese population.

    • Yukihide Momozawa
    • , Yusuke Iwasaki
    •  & Michiaki Kubo
  • Article
    | Open Access

    The globally distributed Ranidae (true frogs) are the largest frog family. Here, Hammond et al. present a draft genome of the North American bullfrog, Rana (Lithobates) catesbeiana, as a foundation for future understanding of true frog genetics as amphibian species face difficult environmental challenges.

    • S. Austin Hammond
    • , René L. Warren
    •  & Inanc Birol
  • Article
    | Open Access

    While non-coding synonymous and intronic variants are often not under strong selective constraint, they can be pathogenic through affecting splicing or transcription. Here, the authors develop a score that uses sequence context alterations to predict pathogenicity of synonymous and non-coding genetic variants, and provide a web server of pre-computed scores.

    • Sahar Gelfman
    • , Quanli Wang
    •  & David B. Goldstein
  • Article
    | Open Access

    Long non-coding RNAs are increasingly recognised as important factors in regulating cellular processes and comprise a large fraction of the transcriptome; however, most are uncharacterised. Here, the authors present RACE-Seq, a tool to improve and extend the annotation of low-expression transcripts.

    • Julien Lagarde
    • , Barbara Uszczynska-Ratajczak
    •  & Jennifer Harrow