Technical Reports

  • Technical Report

    The authors present a high-throughput single-cell ChIP-seq method with coverage of up to 10,000 loci per cell. They identify diverse chromatin landscapes in breast cancer cells characterized by dynamic H3K27me3 levels.

    • Kevin Grosselin
    • , Adeline Durand
    • , Justine Marsolier
    • , Adeline Poitou
    • , Elisabetta Marangoni
    • , Fariba Nemati
    • , Ahmed Dahmani
    • , Sonia Lameiras
    • , Fabien Reyal
    • , Olivia Frenoy
    • , Yannick Pousse
    • , Marcel Reichen
    • , Adam Woolfe
    • , Colin Brenan
    • , Andrew D. Griffiths
    • , Céline Vallot
    •  & Annabelle Gérard
  • Technical Report

    Signature Multivariate Analysis is a new computational tool that detects the mutational signature of homologous-recombination deficiency in clinical samples sequenced with targeted panels, enabling the identification of patients who are likely to respond to poly(ADP-ribose) polymerase inhibitor therapy.

    • Doga C. Gulhan
    • , Jake June-Koo Lee
    • , Giorgio E. M. Melloni
    • , Isidro Cortés-Ciriano
    •  & Peter J. Park
  • Technical Report

    Linked-read analysis is a method for analyzing single-cell DNA-sequencing data that accurately identifies somatic single-nucleotide variants by using read-level phasing with nearby germline variants, enabling the characterization of mutational signatures and estimation of somatic mutation rates in single cells.

    • Craig L. Bohrson
    • , Alison R. Barton
    • , Michael A. Lodato
    • , Rachel E. Rodin
    • , Lovelace J. Luquette
    • , Vinay V. Viswanadham
    • , Doga C. Gulhan
    • , Isidro Cortés-Ciriano
    • , Maxwell A. Sherman
    • , Minseok Kwon
    • , Michael E. Coulter
    • , Alon Galor
    • , Christopher A. Walsh
    •  & Peter J. Park
  • Technical Report

    UTMOST (unified test for molecular signatures) is a method for cross-tissue gene expression imputation for transcriptome-wide association analyses. Cross-tissue TWAS using UTMOST identifies new candidate genes for late-onset Alzheimer’s disease.

    • Yiming Hu
    • , Mo Li
    • , Qiongshi Lu
    • , Haoyi Weng
    • , Jiawei Wang
    • , Seyedeh M. Zekavat
    • , Zhaolong Yu
    • , Boyang Li
    • , Jianlei Gu
    • , Sydney Muchnik
    • , Yu Shi
    • , Brian W. Kunkle
    • , Shubhabrata Mukherjee
    • , Pradeep Natarajan
    • , Adam Naj
    • , Amanda Kuzma
    • , Yi Zhao
    • , Paul K. Crane
    • , Hui Lu
    •  & Hongyu Zhao
  • Technical Report

    GARFIELD is a new approach that classifies genomic features related to phenotypes on the basis of integrating GWAS signals with functional annotations. GARFIELD is used to characterize enrichment patterns for 29 traits integrated with ENCODE and Roadmap Epigenomics annotations.

    • Valentina Iotchkova
    • , Graham R. S. Ritchie
    • , Matthias Geihs
    • , Sandro Morganella
    • , Josine L. Min
    • , Klaudia Walter
    • , Nicholas John Timpson
    • , Ian Dunham
    • , Ewan Birney
    •  & Nicole Soranzo
  • Technical Report

    Graph Genome Pipeline is a read-alignment and variant-calling pipeline based on graph genomes that offers improved read-mapping and variant-calling accuracy while achieving speed comparable to that of linear reference genome pipelines.

    • Goran Rakocevic
    • , Vladimir Semenyuk
    • , Wan-Ping Lee
    • , James Spencer
    • , John Browning
    • , Ivan J. Johnson
    • , Vladan Arsenijevic
    • , Jelena Nadj
    • , Kaushik Ghose
    • , Maria C. Suciu
    • , Sun-Gou Ji
    • , Gülfem Demir
    • , Lizao Li
    • , Berke Ç. Toptaş
    • , Alexey Dolgoborodov
    • , Björn Pollex
    • , Iosif Spulber
    • , Irina Glotova
    • , Péter Kómár
    • , Andrew L. Stachyra
    • , Yilong Li
    • , Milos Popovic
    • , Morten Källberg
    • , Amit Jain
    •  & Deniz Kural
  • Technical Report

    StructLMM is a new method to identify genotype–environment interactions (G×E) that involve multiple exposures or environments. When applied to UK Biobank and eQTL data, StructLMM discovers new G×E signals.

    • Rachel Moore
    • , Francesco Paolo Casale
    • , Marc Jan Bonder
    • , Danilo Horta
    • , Bastiaan T. Heijmans
    • , Peter A. C. ’t Hoen
    • , Joyce van Meurs
    • , Aaron Isaacs
    • , Rick Jansen
    • , Lude Franke
    • , Dorret I. Boomsma
    • , René Pool
    • , Jenny van Dongen
    • , Jouke J. Hottenga
    • , Marleen M. J. van Greevenbroek
    • , Coen D. A. Stehouwer
    • , Carla J. H. van der Kallen
    • , Casper G. Schalkwijk
    • , Cisca Wijmenga
    • , Alexandra Zhernakova
    • , Ettje F. Tigchelaar
    • , P. Eline Slagboom
    • , Marian Beekman
    • , Joris Deelen
    • , Diana van Heemst
    • , Jan H. Veldink
    • , Leonard H. van den Berg
    • , Cornelia M. van Duijn
    • , Bert A. Hofman
    • , André G. Uitterlinden
    • , P. Mila Jhamai
    • , Michael Verbiest
    • , H. Eka D. Suchiman
    • , Marijn Verkerk
    • , Ruud van der Breggen
    • , Jeroen van Rooij
    • , Nico Lakenberg
    • , Hailiang Mei
    • , Maarten van Iterson
    • , Michiel van Galen
    • , Jan Bot
    • , Peter van ’t Hof
    • , Patrick Deelen
    • , Irene Nooren
    • , Matthijs Moed
    • , Martijn Vermaat
    • , Dasha V. Zhernakova
    • , René Luijk
    • , Freerk van Dijk
    • , Wibowo Arindrarto
    • , Szymon M. Kielbasa
    • , Morris A. Swertz
    • , Erik W. van Zwet
    • , Inês Barroso
    •  & Oliver Stegle
  • Technical Report

    A machine-learning approach automates the refinement of somatic variant calls and reduces bias stemming from inter-reviewer variability.

    • Benjamin J. Ainscough
    • , Erica K. Barnell
    • , Peter Ronning
    • , Katie M. Campbell
    • , Alex H. Wagner
    • , Todd A. Fehniger
    • , Gavin P. Dunn
    • , Ravindra Uppaluri
    • , Ramaswamy Govindan
    • , Thomas E. Rohan
    • , Malachi Griffith
    • , Elaine R. Mardis
    • , S. Joshua Swamidass
    •  & Obi L. Griffith
  • Technical Report

    Tri-C is a new 3C approach to identify concurrent chromatin interactions at individual alleles. The authors observe specific higher-order structures involving simultaneous interactions between multiple enhancers and promoters, called regulatory hubs.

    • A. Marieke Oudelaar
    • , James O. J. Davies
    • , Lars L. P. Hanssen
    • , Jelena M. Telenius
    • , Ron Schwessinger
    • , Yu Liu
    • , Jill M. Brown
    • , Damien J. Downes
    • , Andrea M. Chiariello
    • , Simona Bianco
    • , Mario Nicodemi
    • , Veronica J. Buckle
    • , Job Dekker
    • , Douglas R. Higgs
    •  & Jim R. Hughes
  • Technical Report

    BayesTyper is a new probabilistic genotyping algorithm that offers superior sensitivity and accuracy relative to existing methods by using exact alignment of read k-mers to a graph representation of the reference and candidate variants.

    • Jonas Andreas Sibbesen
    • , Lasse Maretty
    •  & Anders Krogh
  • Technical Report

    LeafCutter is a new tool that identifies variable intron splicing events from RNA-seq data for analysis of complex alternative splicing. The method does not require transcript annotation and can be used to map splicing quantitative trait loci.

    • Yang I. Li
    • , David A. Knowles
    • , Jack Humphrey
    • , Alvaro N. Barbeira
    • , Scott P. Dickinson
    • , Hae Kyung Im
    •  & Jonathan K. Pritchard
  • Technical Report

    Covariates for multiphenotype studies (CMS), a new approach for testing for associations in large-scale datasets, leverages genetic and environmental factors shared between correlated variables measured on the same samples. Applying CMS to real and simulated data demonstrates a large increase in power, equivalent to that gained by doubling the sample size.

    • Hugues Aschard
    • , Vincent Guillemot
    • , Bjarni Vilhjalmsson
    • , Chirag J Patel
    • , David Skurnik
    • , Chun J Ye
    • , Brian Wolpin
    • , Peter Kraft
    •  & Noah Zaitlen
  • Technical Report

    Graphtyper is a fast and scalable method for variant genotyping that aligns short-read sequence data to a pangenome. Graphtyper was able to accurately genotype 90 million sequence variants in the whole genomes of 28,000 Icelanders, including those in six HLA genes.

    • Hannes P Eggertsson
    • , Hakon Jonsson
    • , Snaedis Kristmundsdottir
    • , Eirikur Hjartarson
    • , Birte Kehr
    • , Gisli Masson
    • , Florian Zink
    • , Kristjan E Hjorleifsson
    • , Aslaug Jonasdottir
    • , Adalbjorg Jonasdottir
    • , Ingileif Jonsdottir
    • , Daniel F Gudbjartsson
    • , Pall Melsted
    • , Kari Stefansson
    •  & Bjarni V Halldorsson
  • Technical Report

    Adam Siepel and colleagues report a new computational method, LINSIGHT, that combines evolutionary conservation and functional genomic information to predict the fitness consequences of noncoding mutations in the human genome. They use LINSIGHT to show that fitness consequences of enhancer mutations depend on tissue and cell type specificity and promoter constraints.

    • Yi-Fei Huang
    • , Brad Gulko
    •  & Adam Siepel
  • Technical Report | Open Access

    Adam Phillippy, Curtis Van Tassell, Timothy Smith and colleagues present a new reference genome assembly for the domestic goat using a pipeline that improves contiguity of the assembly by more than 250-fold. The pipeline uses a combination of short- and long-read sequencing, optical mapping, and chromatin interaction mapping.

    • Derek M Bickhart
    • , Benjamin D Rosen
    • , Sergey Koren
    • , Brian L Sayre
    • , Alex R Hastie
    • , Saki Chan
    • , Joyce Lee
    • , Ernest T Lam
    • , Ivan Liachko
    • , Shawn T Sullivan
    • , Joshua N Burton
    • , Heather J Huson
    • , John C Nystrom
    • , Christy M Kelley
    • , Jana L Hutchison
    • , Yang Zhou
    • , Jiajie Sun
    • , Alessandra Crisà
    • , F Abel Ponce de León
    • , John C Schwartz
    • , John A Hammond
    • , Geoffrey C Waldbieser
    • , Steven G Schroeder
    • , George E Liu
    • , Maitreya J Dunham
    • , Jay Shendure
    • , Tad S Sonstegard
    • , Adam M Phillippy
    • , Curtis P Van Tassell
    •  & Timothy P L Smith
  • Technical Report

    Kun Zhang and colleagues present a metric called methylation haplotype load (MHL) that quantifies methylation patterns within blocks of tightly linked CpG dinucleotides. They show that the MHL can distinguish samples from different human somatic tissues and that it can be used to improve detection of cancer-derived circulating DNA and identify its tissue of origin.

    • Shicheng Guo
    • , Dinh Diep
    • , Nongluk Plongthongkum
    • , Ho-Lim Fung
    • , Kang Zhang
    •  & Kun Zhang
  • Technical Report

    Stuart Orkin, Daniel Bauer and colleagues present DNA Striker, a computational tool to design variant-aware saturating-mutagenesis screens with multiple CRISPR-associated nucleases. They apply their methodology to the HBS1L-MYB intergenic region, which is associated with red-blood-cell traits, and identify putative regulatory elements that control MYB expression.

    • Matthew C Canver
    • , Samuel Lessard
    • , Luca Pinello
    • , Yuxuan Wu
    • , Yann Ilboudo
    • , Emily N Stern
    • , Austen J Needleman
    • , Frédéric Galactéros
    • , Carlo Brugnara
    • , Abdullah Kutlar
    • , Colin McKenzie
    • , Marvin Reid
    • , Diane D Chen
    • , Partha Pratim Das
    • , Mitchel A Cole
    • , Jing Zeng
    • , Ryo Kurita
    • , Yukio Nakamura
    • , Guo-Cheng Yuan
    • , Guillaume Lettre
    • , Daniel E Bauer
    •  & Stuart H Orkin
  • Technical Report

    Yun Song and colleagues present SMC++, a statistical method for population history inference capable of analyzing unphased whole genomes and sample sizes much larger than can be analyzed by current methods. The authors apply SMC++ to sequence data from human, Drosophila and finch populations.

    • Jonathan Terhorst
    • , John A Kamm
    •  & Yun S Song
  • Technical Report

    James Liley, John Todd and Chris Wallace present a statistical method for determining whether disease-associated variants have different effect sizes in phenotypically defined subgroups of disease cases. The test can be combined with existing methods to determine whether genetic heterogeneity is driven by population stratification or by different mechanisms of disease pathology.

    • James Liley
    • , John A Todd
    •  & Chris Wallace
  • Technical Report

    John Storey, David Blei and colleagues present a method, TeraStructure, for estimating population structure from human genomic data sets on a scale not possible with current methods. TeraStructure is able to analyze data from the Human Genome Diversity Panel and the 1000 Genomes Project in less than three hours.

    • Prem Gopalan
    • , Wei Hao
    • , David M Blei
    •  & John D Storey
  • Technical Report

    Gill Bejerano and colleagues present M-CAP, a classifier that estimates variant pathogenicity in clinical exome data sets. They show that M-CAP outperforms other existing methods at all thresholds and correctly dismisses 60% of rare missense variants of uncertain significance at 95% sensitivity.

    • Karthik A Jagadeesh
    • , Aaron M Wenger
    • , Mark J Berger
    • , Harendra Guturu
    • , Peter D Stenson
    • , David N Cooper
    • , Jonathan A Bernstein
    •  & Gill Bejerano
  • Technical Report

    Po-Ru Loh, Alkes Price and colleagues present Eagle2, a reference-based phasing algorithm that allows for highly accurate and efficient phasing of genotypes across a broad range of cohort sizes. They demonstrate an approximately 10% improvement in accuracy and 20% improvement in speed compared to a competing method, SHAPEIT2.

    • Po-Ru Loh
    • , Petr Danecek
    • , Pier Francesco Palamara
    • , Christian Fuchsberger
    • , Yakir A Reshef
    • , Hilary K Finucane
    • , Sebastian Schoenherr
    • , Lukas Forer
    • , Shane McCarthy
    • , Goncalo R Abecasis
    • , Richard Durbin
    •  & Alkes L Price
  • Technical Report

    Runjun Kumar, S. Joshua Swamidass and Ron Bose present an unsupervised parsimony-guided method, ParsSNP, for prioritizing candidate cancer driver mutations. They apply ParsSNP to a gastric cancer data set and predict potential driver mutations not detected by other methods, including truncations in known tumor-suppressor genes and previously confirmed drivers.

    • Runjun D Kumar
    • , S Joshua Swamidass
    •  & Ron Bose
  • Technical Report

    Victoria Hore, Jonathan Marchini and colleagues present a method for multiple-tissue gene expression studies aimed at uncovering gene networks linked to genetic variation. They apply their method to RNA sequencing data from adipose tissue, skin and lymphoblastoid cell lines and identify several biologically relevant gene networks with a genetic basis.

    • Victoria Hore
    • , Ana Viñuela
    • , Alfonso Buil
    • , Julian Knight
    • , Mark I McCarthy
    • , Kerrin Small
    •  & Jonathan Marchini
  • Technical Report

    Richard Mott, Simon Myers and colleagues present a new imputation method, STITCH, which does not require genotyping arrays or high-quality reference panels. They use STITCH to accurately impute genotypes in both outbred laboratory mice and a sample human population directly from low-coverage (<2×) sequencing data.

    • Robert W Davies
    • , Jonathan Flint
    • , Simon Myers
    •  & Richard Mott
  • Technical Report

    Po-Ru Loh, Pier Francesco Palamara and Alkes Price develop a new long-range phasing method, Eagle, that harnesses long, shared identical-by-descent tracts and can be applied to large outbred populations. They use Eagle to phase samples from the UK Biobank and find that it is faster and has better accuracy than existing methods.

    • Po-Ru Loh
    • , Pier Francesco Palamara
    •  & Alkes L Price
  • Technical Report

    Jonathan Marchini and colleagues develop a new method for haplotype phasing, SHAPEIT3, capable of handling large data sets from biobanks containing >100,000 genotyped samples. They find that their method is fast and accurate, with a low switch error rate, and can be scaled to data sets from increasingly larger cohorts.

    • Jared O'Connell
    • , Kevin Sharp
    • , Nick Shrine
    • , Louise Wain
    • , Ian Hall
    • , Martin Tobin
    • , Jean-Francois Zagury
    • , Olivier Delaneau
    •  & Jonathan Marchini
  • Technical Report

    Soumya Raychaudhuri, Buhm Han and colleagues present a statistical method to distinguish whether shared genetic risk variants among complex traits are driven by whole-group pleiotropy or by a subset of individuals who constitute a genetically heterogeneous subgroup. They use the method to examine genetic sharing among autoimmune diseases and between major depressive disorder and schizophrenia. They find that most genetic sharing cannot be explained by subgroup heterogeneity but that seronegative rheumatoid arthritis, in contrast, is a heterogeneous condition.

    • Buhm Han
    • , Jennie G Pouget
    • , Kamil Slowikowski
    • , Eli Stahl
    • , Cue Hyunkyu Lee
    • , Dorothee Diogo
    • , Xinli Hu
    • , Yu Rang Park
    • , Eunji Kim
    • , Peter K Gregersen
    • , Solbritt Rantapää Dahlqvist
    • , Jane Worthington
    • , Javier Martin
    • , Steve Eyre
    • , Lars Klareskog
    • , Tom Huizinga
    • , Wei-Min Chen
    • , Suna Onengut-Gumuscu
    • , Stephen S Rich
    • , Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium
    • , Naomi R Wray
    •  & Soumya Raychaudhuri
  • Technical Report

    Andy Dahl and colleagues present a method for imputing missing phenotype data in genetic studies with multiple correlated phenotypes where samples can have any level of relatedness. They apply their method to simulated and real data sets and show that it improves the sensitivity to detect association signals.

    • Andrew Dahl
    • , Valentina Iotchkova
    • , Amelie Baud
    • , Åsa Johansson
    • , Ulf Gyllensten
    • , Nicole Soranzo
    • , Richard Mott
    • , Andreas Kranis
    •  & Jonathan Marchini
  • Technical Report

    Iuliana Ionita-Laza, Kenneth McCallum and colleagues develop an unsupervised statistical approach, Eigen, that integrates different functional annotations into a single measure of functional importance for coding and noncoding variants. Their meta-score can outperform the recently proposed CADD score and can be applied to fine-mapping studies.

    • Iuliana Ionita-Laza
    • , Kenneth McCallum
    • , Bin Xu
    •  & Joseph D Buxbaum
  • Technical Report

    Natsuhiko Kumasaka, Andrew Knights and Daniel Gaffney develop a new statistical approach for association mapping that models genetic effects and accounts for biases in sequencing data in a single probabilistic framework. They apply this method to generate a map of chromatin accessibility QTLs and show how it can be used to fine-map regulatory variants and link distal regulatory elements with genes.

    • Natsuhiko Kumasaka
    • , Andrew J Knights
    •  & Daniel J Gaffney
  • Technical Report

    Matthew Stephens and colleagues present a method for visualizing geographic patterns in genetic population structure. They apply this method to data from elephant, human and Arabidopsis thaliana populations and illustrate its potential to highlight barriers and corridors to gene flow.

    • Desislava Petkova
    • , John Novembre
    •  & Matthew Stephens
  • Technical Report

    Kornelia Polyak, Franziska Michor and colleagues report a novel method, STAR-FISH, for combined in situ single-cell analysis of point mutations and copy number alterations in archived tissue samples. They apply STAR-FISH to clinically relevant PIK3CA mutations and HER2 amplifications and observe associations between intratumoral diversity and clinical outcome.

    • Michalina Janiszewska
    • , Lin Liu
    • , Vanessa Almendro
    • , Yanan Kuang
    • , Cloud Paweletz
    • , Rita A Sakr
    • , Britta Weigelt
    • , Ariella B Hanker
    • , Sarat Chandarlapaty
    • , Tari A King
    • , Jorge S Reis-Filho
    • , Carlos L Arteaga
    • , So Yeon Park
    • , Franziska Michor
    •  & Kornelia Polyak
  • Technical Report

    Hae Kyung Im and colleagues report a method for predicting gene expression perturbations from genotype data after training on reference transcriptome data sets. Association of predicted gene expression with disease traits identifies known and new candidate disease genes.

    • Eric R Gamazon
    • , Heather E Wheeler
    • , Kaanan P Shah
    • , Sahar V Mozaffari
    • , Keston Aquino-Michaels
    • , Robert J Carroll
    • , Anne E Eyler
    • , Joshua C Denny
    • , GTEx Consortium
    • , Dan L Nicolae
    • , Nancy J Cox
    •  & Hae Kyung Im
  • Technical Report

    Michael Beer and colleagues report a metric based on a regulatory region annotation method, gkm-SVM, and use this to predict the effects of regulatory variants from sequencing and DNase I–hypersensitive site data. They apply their method to autoimmune disease GWAS data and report several new predictions for causal SNPs.

    • Dongwon Lee
    • , David U Gorkin
    • , Maggie Baker
    • , Benjamin J Strober
    • , Alessandro L Asoni
    • , Andrew S McCallion
    •  & Michael A Beer
  • Technical Report

    Mary Fortune, Chris Wallace and colleagues report a new method that allows statistical colocalization of genetic risk variants for related autoimmune diseases in the context of common controls. They apply their method to type 1 diabetes, rheumatoid arthritis, celiac disease and multiple sclerosis and highlight the complexity in genetic variation underlying these distinct autoimmune diseases.

    • Mary D Fortune
    • , Hui Guo
    • , Oliver Burren
    • , Ellen Schofield
    • , Neil M Walker
    • , Maria Ban
    • , Stephen J Sawcer
    • , John Bowes
    • , Jane Worthington
    • , Anne Barton
    • , Steve Eyre
    • , John A Todd
    •  & Chris Wallace
  • Technical Report

    Gil McVean, Alexander Dilthey and colleagues present a graphical model-based method for accurate genomic assembly that uses the diversity present in multiple reference sequences, as represented by a population reference graph. The method is applied to simulated and empirical data from the human MHC region to demonstrate the improved accuracy of genomic inference.

    • Alexander Dilthey
    • , Charles Cox
    • , Zamin Iqbal
    • , Matthew R Nelson
    •  & Gil McVean
  • Technical Report

    Xiaoming Liu and Yun-Xin Fu present a model-flexible method for inferring changes in population size over time on the basis of the composite likelihood of SNP frequencies. They apply the method to 1000 Genomes Project data to infer changes in human population size on the timescale of 10,000 to 200,000 years ago.

    • Xiaoming Liu
    •  & Yun-Xin Fu
  • Technical Report

    John Storey and colleagues report a statistical test for genetic association for use with data from structured populations. They demonstrate the use of this test on both simulated data and empirical data from the Northern Finland Birth Cohort, from which they identify significant loci not detected by other methods.

    • Minsun Song
    • , Wei Hao
    •  & John D Storey
  • Technical Report

    Benjamin Neale and colleagues report the LD Score regression method, used to distinguish the relative contributions of confounding bias and polygenicity to inflated test statistics in GWAS. They apply their method to summary statistics from GWAS for over 30 phenotypes, confirm that polygenicity accounts for the majority of inflation in test statistics and demonstrate use of this method as a correction factor.

    • Brendan K Bulik-Sullivan
    • , Po-Ru Loh
    • , Hilary K Finucane
    • , Stephan Ripke
    • , Jian Yang
    • , Schizophrenia Working Group of the Psychiatric Genomics Consortium
    • , Nick Patterson
    • , Mark J Daly
    • , Alkes L Price
    •  & Benjamin M Neale
  • Technical Report

    Alkes Price, Po-Ru Loh and colleagues report the BOLT-LMM method for mixed-model association. They apply their method to 9 quantitative traits in 23,294 samples and demonstrate that it provides improvements in computational efficiency as well as gains in power that increase with the size of the cohort, making it useful for the analysis of large cohorts.

    • Po-Ru Loh
    • , George Tucker
    • , Brendan K Bulik-Sullivan
    • , Bjarni J Vilhjálmsson
    • , Hilary K Finucane
    • , Rany M Salem
    • , Daniel I Chasman
    • , Paul M Ridker
    • , Benjamin M Neale
    • , Bonnie Berger
    • , Nick Patterson
    •  & Alkes L Price
  • Technical Report

    Steven McCarroll and colleagues report an analysis of multiallelic copy number variants (mCNVs). They characterize mCNVs in 849 whole-genome sequences from the 1000 Genomes Project and find that mCNVs give rise to most gene dosage variation in humans.

    • Robert E Handsaker
    • , Vanessa Van Doren
    • , Jennifer R Berman
    • , Giulio Genovese
    • , Seva Kashin
    • , Linda M Boettger
    •  & Steven A McCarroll
  • Technical Report

    Adam Siepel and colleagues develop a statistical method, fitCons, which combines comparative and functional genomic data to estimate the probability that a point mutation will influence fitness. They generate fitCons scores for three human cell types from ENCODE data sets and demonstrate improved prediction power for cis regulatory elements in comparison to conventional conservation-based scores.

    • Brad Gulko
    • , Melissa J Hubisz
    • , Ilan Gronau
    •  & Adam Siepel
  • Technical Report

    Noah Zaitlen, Alkes Price and colleagues report a new approach to estimate the narrow-sense heritability of complex traits from unrelated individuals in a recently admixed population. They apply this approach to estimate the heritability of 13 quantitative or case-control phenotypes in 21,497 African-American individuals, and their results suggest that family-based h² estimates are inflated.

    • Noah Zaitlen
    • , Bogdan Pasaniuc
    • , Sriram Sankararaman
    • , Gaurav Bhatia
    • , Jianqi Zhang
    • , Alexander Gusev
    • , Taylor Young
    • , Arti Tandon
    • , Samuela Pollack
    • , Bjarni J Vilhjálmsson
    • , Themistocles L Assimes
    • , Sonja I Berndt
    • , William J Blot
    • , Stephen Chanock
    • , Nora Franceschini
    • , Phyllis G Goodman
    • , Jing He
    • , Anselm J M Hennis
    • , Ann Hsing
    • , Sue A Ingles
    • , William Isaacs
    • , Rick A Kittles
    • , Eric A Klein
    • , Leslie A Lange
    • , Barbara Nemesure
    • , Nick Patterson
    • , David Reich
    • , Benjamin A Rybicki
    • , Janet L Stanford
    • , Victoria L Stevens
    • , Sara S Strom
    • , Eric A Whitsel
    • , John S Witte
    • , Jianfeng Xu
    • , Christopher Haiman
    • , James G Wilson
    • , Charles Kooperberg
    • , Daniel Stram
    • , Alex P Reiner
    • , Hua Tang
    •  & Alkes L Price