Bioinformatics | Nature Genetics

Article | 28 March 2024 | Open Access

A single-cell atlas enables mapping of homeostatic cellular shifts in the adult human breast

Single-cell RNA sequencing analysis of over 800,000 human adult breast cells from 55 female donors identifies 41 cell subtypes and highlights age- and parity-dependent effects. Samples from healthy women with germline mutations in BRCA1 or BRCA2 showed signs of T cell exhaustion.

Austin D. Reed, Sara Pensa & Walid T. Khaled

Article | 27 February 2024 | Open Access

BANKSY unifies cell typing and tissue domain segmentation for scalable spatial omics data analysis

BANKSY is an algorithm with R and Python implementations that identifies both cell types and tissue domains from spatially resolved omics data by incorporating spatial kernels that capture microenvironmental information. It is applicable to a range of technologies and scales to millions of cells.

Vipul Singhal, Nigel Chou & Shyam Prabhakar

Technical Report | 15 February 2024 | Open Access

Accurate and sensitive mutational signature analysis with MuSiCal

MuSiCal is a mutational signature analysis tool combining minimum-volume nonnegative matrix factorization with other algorithmic innovations. Applied to PCAWG data, MuSiCal gives more accurate results, including resolving ambiguous flat signatures.

Hu Jin, Doga C. Gulhan & Peter J. Park

Article | 20 December 2023 | Open Access

Accurate detection of identity-by-descent segments in human ancient DNA

ancIBD identifies identity-by-descent regions in ancient DNA using a hidden Markov model optimized for such low-coverage data. Analysis of 4,248 individuals demonstrates that ancIBD can identify up to sixth-degree relatives and provides genealogical insights into ancient populations.

Harald Ringbauer, Yilei Huang & David Reich

Article | 10 August 2023 | Open Access

Genome-wide prediction of disease variant effects with a deep protein language model

A modified framework leveraging a protein language model (ESM1b) is used to predict the effects of all 450 million possible missense variants in the human genome and shows potential for generalizing to more complex genetic variations such as indels and stop-gains.

Nadav Brandes, Grant Goldman & Vasilis Ntranos

Correspondence | 25 April 2023

Annotating and prioritizing human non-coding variants with RegulomeDB v.2

Shengcheng Dong, Nanxiang Zhao & Benjamin C. Hitz

Article | 17 May 2021

An atlas of mitochondrial DNA genotype–phenotype associations in the UK Biobank

An analysis of the UK Biobank identifies 227 new associations between mitochondrial DNA (mtDNA) variants and phenotypes. mtDNA genetic architecture reflects regional UK nuclear genome ancestry.

Ekaterina Yonova-Doing, Claudia Calabrese & Joanna M. M. Howson

Article | 27 May 2019

Similarity regression predicts evolution of transcription factor sequence specificity

Similarity regression is an improved method for predicting transcription factor motifs, enabling analysis of DNA-binding motifs across eukaryotes and an expansion of the Cis-BP database of measured and predicted transcription factor motifs.

Samuel A. Lambert, Ally W. H. Yang & Timothy R. Hughes

Technical Report | 18 March 2019

Linked-read analysis identifies mutations in single-cell DNA-sequencing data

Linked-read analysis is a method for analyzing single-cell DNA-sequencing data that accurately identifies somatic single-nucleotide variants by using read-level phasing with nearby germline variants, enabling the characterization of mutational signatures and estimation of somatic mutation rates in single cells.

Craig L. Bohrson, Alison R. Barton & Peter J. Park

Article | 04 February 2019

The landscape of selection in 551 esophageal adenocarcinomas defines genomic biomarkers for the clinic

Genomic analysis of 551 esophageal adenocarcinomas identifies new driver mutations and biomarkers associated with poor prognosis. More than 50% of esophageal adenocarcinomas contain sensitizing events for CDK4/CDK6 inhibitors, thus providing an evidence base for targeted therapeutics.

Alexander M. Frankell, SriGanesh Jammula & Rebecca C. Fitzgerald

Article | 06 November 2017

High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing

RNA Capture Long Seq (CLS) is a new method for transcript annotation that combines targeted RNA capture with long-read sequencing. CLS reannotates GENCODE lncRNAs and increases the number of validated splice junctions and transcript models for targeted loci.

Julien Lagarde, Barbara Uszczynska-Ratajczak & Rory Johnson

Article | 30 January 2017

Avian W and mammalian Y chromosomes convergently retained dosage-sensitive regulators

David Page and colleagues report the sequence of the chicken W sex chromosome and compare ancestral W-linked genes across bird species. They find that the W chromosome did not acquire genes expressed exclusively in reproductive tissue, but retained genes through selection to maintain appropriate dosage levels of broadly expressed genes.

Daniel W Bellott, Helen Skaletsky & David C Page

News & Views | 29 September 2015

Gene signatures from pancreatic cancer tumor and stromal cells predict disease outcome

Pancreatic cancers consist of a heterogeneous amalgam of assorted cell types, making it challenging to develop a classification system that groups these tumors according to common molecular features. A new study tackles this important issue using bioinformatics approaches to decipher gene expression signatures derived specifically from either tumor cells or nonmalignant stromal cells that predict patient outcome and may inform personalized treatments.

Filippos Kottakis & Nabeel Bardeesy

News & Views | 29 July 2015

Running spell-check to identify regulatory variants

A major challenge in human genetics is pinpointing which non-coding genetic variants affect gene expression and disease risk. A new study in this issue describes a broadly applicable approach for this task that explicitly models cell type–specific regulatory motifs and generates variant effect predictions that are more accurate and interpretable than those of alternative tools.

Martin Kircher & Jay Shendure

Research Highlights | 26 February 2014

Rare variant association studies

Orli Bahcall

Article | 28 July 2013

Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model

Nadav Ahituv and colleagues use a massively parallel reporter assay to test 4,970 synthetic regulatory element sequences, containing patterns of 12 known liver transcription factor binding sites, in mice and in HepG2 cells. They systematically test the impact of binding site copy number, spacing, combination and order on gene expression.

Robin P Smith, Leila Taher & Nadav Ahituv

Article | 11 March 2012

Whole-genome analysis of diverse Chlamydia trachomatis strains identifies phylogenetic relationships masked by current clinical typing

Simon Harris and colleagues report whole-genome sequencing of 36 Chlamydia trachomatis representative strains from temporally and geographically diverse sources and use this to construct a genome-wide phylogeny of the species. They find that epidemic spread can be driven by clonal expansion from a single source and also report evidence for recombination in recent clinical strains both within and between biovars.

Simon R Harris, Ian N Clarke & Nicholas R Thomson

News & Views | 27 October 2011

More to Hi-C than meets the eye

Diversification and specialization of high-throughput technologies demand assay-specific treatment of data for reliable interpretation. A new study shows that data generated using the Hi-C approach contain hidden features of interchromosomal DNA interactions, which are revealed through analysis with an integrated probabilistic model that corrects for multiple sources of bias in the data.

Myong-Hee Sung & Gordon L Hager

Analysis | 16 October 2011

Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture

Amos Tanay and Eitan Yaffe report methods to correct biases in the Hi-C method for mapping chromosomal contacts on a genome-wide scale. Their analysis of Hi-C data shows interchromosomal aggregation of hypersensitive sites, transcriptionally active foci and other epigenetic markers of active chromatin.

Eitan Yaffe & Amos Tanay
