Technical Reports

Fine-mapping cellular QTLs with RASQUAL and ATAC-seq

Natsuhiko Kumasaka, Andrew Knights and Daniel Gaffney develop a new statistical approach for association mapping that models genetic effects and accounts for biases in sequencing data in a single probabilistic framework. They apply this method to generate a map of chromatin accessibility QTLs and show how it can be used to fine-map regulatory variants and link distal regulatory elements with genes.

Natsuhiko Kumasaka
Andrew J Knights
Daniel J Gaffney
Technical Report, 14 Dec 2015
Visualizing spatial population structure with estimated effective migration surfaces

Matthew Stephens and colleagues present a method for visualizing geographic patterns in genetic population structure. They apply this method to data from elephant, human and Arabidopsis thaliana populations and illustrate its potential to highlight barriers and corridors to gene flow.

Desislava Petkova
John Novembre
Matthew Stephens
Technical Report, 07 Dec 2015
In situ single-cell analysis identifies heterogeneity for PIK3CA mutation and HER2 amplification in HER2-positive breast cancer

Kornelia Polyak, Franziska Michor and colleagues report a novel method, STAR-FISH, for combined in situ single-cell analysis of point mutations and copy number alterations in archived tissue samples. They apply STAR-FISH to clinically relevant PIK3CA mutations and HER2 amplifications and observe associations between intratumoral diversity and clinical outcome.

Michalina Janiszewska
Lin Liu
Kornelia Polyak
Technical Report, 24 Aug 2015
A gene-based association method for mapping traits using reference transcriptome data

Hae Kyung Im and colleagues report a method for predicting gene expression perturbations from genotype data after training on reference transcriptome data sets. Association of predicted gene expression with disease traits identifies known and new candidate disease genes.

Eric R Gamazon
Heather E Wheeler
Hae Kyung Im
Technical Report, 10 Aug 2015
A method to predict the impact of regulatory variants from DNA sequence

Michael Beer and colleagues report a metric based on a regulatory region annotation method, gkm-SVM, and use this to predict the effects of regulatory variants from sequencing and DNase I–hypersensitive site data. They apply their method to autoimmune disease GWAS data and report several new predictions for causal SNPs.

Dongwon Lee
David U Gorkin
Michael A Beer
Technical Report, 15 Jun 2015
Statistical colocalization of genetic risk variants for related autoimmune diseases in the context of common controls

Mary Fortune, Chris Wallace and colleagues report a new method that allows statistical colocalization of genetic risk variants for related autoimmune diseases in the context of common controls. They apply their method to type 1 diabetes, rheumatoid arthritis, celiac disease and multiple sclerosis and highlight the complexity in genetic variation underlying these distinct autoimmune diseases.

Mary D Fortune
Hui Guo
Chris Wallace
Technical Report, 08 Jun 2015
Improved genome inference in the MHC using a population reference graph

Gil McVean, Alexander Dilthey and colleagues present a graphical model-based method for accurate genomic assembly that uses the diversity present in multiple reference sequences, as represented by a population reference graph. The method is applied to simulated and empirical data from the human MHC region to demonstrate the improved accuracy of genomic inference.

Alexander Dilthey
Charles Cox
Gil McVean
Technical Report, 27 Apr 2015
Exploring population size changes using SNP frequency spectra

Xiaoming Liu and Yun-Xin Fu present a model-flexible method for inferring changes in population size over time on the basis of the composite likelihood of SNP frequencies. They apply the method to 1000 Genomes Project data to infer changes in human population size on the timescale of 10,000 to 200,000 years ago.

Xiaoming Liu
Yun-Xin Fu
Technical Report, 06 Apr 2015
Testing for genetic associations in arbitrarily structured populations

John Storey and colleagues report a statistical test for genetic association for use with data from structured populations. They demonstrate the use of this test on both simulated data and empirical data from the Northern Finland Birth Cohort, from which they identify significant loci not detected by other methods.

Minsun Song
Wei Hao
John D Storey
Technical Report, 30 Mar 2015
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies

Benjamin Neale and colleagues report the LD Score regression method, used to distinguish the relative contributions of confounding bias and polygenicity to inflated test statistics in GWAS. They apply their method to summary statistics from GWAS for over 30 phenotypes, confirm that polygenicity accounts for the majority of inflation in test statistics and demonstrate use of this method as a correction factor.

Brendan K Bulik-Sullivan
Po-Ru Loh
Benjamin M Neale
Technical Report, 02 Feb 2015
Efficient Bayesian mixed-model analysis increases association power in large cohorts

Alkes Price, Po-Ru Loh and colleagues report the BOLT-LMM method for mixed-model association. They apply their method to 9 quantitative traits in 23,294 samples and demonstrate that it provides improvements in computational efficiency as well as gains in power that increase with the size of the cohort, making it useful for the analysis of large cohorts.

Po-Ru Loh
George Tucker
Alkes L Price
Technical Report, 02 Feb 2015
Large multiallelic copy number variations in humans

Steven McCarroll and colleagues report an analysis of multiallelic copy number variants (mCNVs). They characterize mCNVs in 849 whole-genome sequences from the 1000 Genomes Project and find that mCNVs give rise to most gene dosage variation in humans.

Robert E Handsaker
Vanessa Van Doren
Steven A McCarroll
Technical Report, 26 Jan 2015
A method for calculating probabilities of fitness consequences for point mutations across the human genome

Adam Siepel and colleagues develop a statistical method, fitCons, which combines comparative and functional genomic data to estimate the probability that a point mutation will influence fitness. They generate fitCons scores for three human cell types from ENCODE data sets and demonstrate improved predictive power for cis-regulatory elements in comparison to conventional conservation-based scores.

Brad Gulko
Melissa J Hubisz
Adam Siepel
Technical Report, 19 Jan 2015
Leveraging population admixture to characterize the heritability of complex traits

Noah Zaitlen, Alkes Price and colleagues report a new approach to estimate the narrow-sense heritability of complex traits from unrelated individuals in a recently admixed population. They apply this approach to estimate the heritability for 13 quantitative or case-control phenotypes in 21,497 African-American individuals, and their results suggest that family-based h² estimates are inflated.

Noah Zaitlen
Bogdan Pasaniuc
Alkes L Price
Technical Report, 10 Nov 2014
A multiscale statistical mechanical framework integrates biophysical and genomic data to assemble cancer networks

Peter Sorger, Mohammed AlQuraishi and colleagues present a statistical framework for integrating biophysical and genomic data to predict the consequences of cancer-related mutations on protein-protein interactions. They apply their framework to the SH2 phosphoprotein network using publicly available data from The Cancer Genome Atlas.

Mohammed AlQuraishi
Grigoriy Koytiger
Peter K Sorger
Technical Report, 02 Nov 2014
Comprehensive variation discovery in single human genomes

David Jaffe and colleagues report a new algorithm, DISCOVAR, for variant calling and de novo genome assembly. They test the algorithm on a new reference variant call set and demonstrate improved variant calling, particularly in challenging regions of the genome.

Neil I Weisenfeld
Shuangye Yin
David B Jaffe
Technical Report, 19 Oct 2014
Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing

Frank Steemers and colleagues report a new method for genome-wide haplotyping based on contiguity-preserving transposition and combinatorial indexing. They apply this method to assemble over 95% of the heterozygous variants in a human genome into long, accurate haplotype blocks.

Sasan Amini
Dmitry Pushkarev
Frank J Steemers
Technical Report, 19 Oct 2014
Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications

Gerton Lunter and colleagues report Platypus software, which combines a haplotype-based multi-sample variant caller with local sequence assembly in a Bayesian statistical framework. They demonstrate applications to exome and whole-genome data sets, to the identification of de novo mutations in parent-offspring trios and to the genotyping of HLA loci.

Andy Rimmer
Hang Phan
Gerton Lunter
Technical Report, 13 Jul 2014
Inferring human population size and separation history from multiple genome sequences

Stephan Schiffels and Richard Durbin report the multiple sequentially Markovian coalescent (MSMC) method for inferring human population size and separation history from multiple genome sequences. Their application to the whole-genome sequences of 34 individuals from 9 populations allows inferences about events in human population history as recent as 2,000 years ago.

Stephan Schiffels
Richard Durbin
Technical Report, 22 Jun 2014
Ancestry estimation and control of population stratification for sequence-based association studies

Gonçalo Abecasis, Chaolong Wang and colleagues report a new statistical method, implemented in the publicly available software program LASER, that uses a reference panel to estimate an individual's genetic ancestry directly from off-target sequence reads in targeted sequencing experiments. Their simulations and testing on real data sets show accurate inference of worldwide continental ancestry with whole-genome shotgun coverage as low as 0.001× and of fine-scale ancestry within Europe with coverage as low as 0.1×.

Chaolong Wang
Xiaowei Zhan
Gonçalo R Abecasis
Technical Report, 16 Mar 2014

