A method to estimate the contribution of regional genetic associations to complex traits from summary association statistics

Pare, Guillaume; Mao, Shihong; Deng, Wei Q.

doi:10.1038/srep27644

Download PDF

Article
Open access
Published: 08 June 2016

A method to estimate the contribution of regional genetic associations to complex traits from summary association statistics

Guillaume Pare^1,2,3,4,
Shihong Mao³ &
Wei Q. Deng⁵

Scientific Reports volume 6, Article number: 27644 (2016) Cite this article

1813 Accesses
3 Citations
4 Altmetric
Metrics details

Subjects

Abstract

Despite considerable efforts, known genetic associations only explain a small fraction of predicted heritability. Regional associations combine information from multiple contiguous genetic variants and can improve variance explained at established association loci. However, regional associations are not easily amenable to estimation using summary association statistics because of sensitivity to linkage disequilibrium (LD). We now propose a novel method, LD Adjusted Regional Genetic Variance (LARGV), to estimate phenotypic variance explained by regional associations using summary statistics while accounting for LD. Our method is asymptotically equivalent to a multiple linear regression model when no interaction or haplotype effects are present. It has several applications, such as ranking of genetic regions according to variance explained or comparison of variance explained by two or more regions. Using height and BMI data from the Health Retirement Study (N = 7,776), we show that most genetic variance lies in a small proportion of the genome and that previously identified linkage peaks have higher than expected regional variance.

An integrated framework for local genetic correlation analysis

Article 14 March 2022

Evaluating and improving heritability models using summary statistics

Article 23 March 2020

A saturated map of common genetic variants associated with human height

Article Open access 12 October 2022

Introduction

Currently known genetic associations only explain a relatively small proportion of complex traits variance. In accordance with the widely accepted polygenic nature of complex traits, it has been proposed that weak, yet undetected, associations underlie complex trait heritability of a wide variety of phenotypes such as height¹, cognitive function² or rheumatoid arthritis³, etc. We have recently shown that the joint association of multiple weakly associated variants over large chromosomal regions contributes to complex traits variance⁴. Such regional associations are not easily amenable to estimation using summary-level association statistics because of sensitivity to linkage disequilibrium (LD). Nonetheless, only large meta-analyses have the necessary power to identify weakly associated variants and results are typically reported in the form of summary association statistics. In this report, we propose a novel method to assess the contribution of regional associations to complex traits variance using summary association statistics. Estimation of regional associations with our novel method, LD Adjusted Regional Genetic Variance (LARGV), can help identify key genomic regions involved in regulation of complex traits.

Clustering of weak associations within defined chromosomal regions has been previously suggested⁵ and can increase variance explained at established association loci as compared to genome-wide significant SNPs alone⁶. Such regional associations extended up to 433.0 Kb from genome-wide significant SNPs⁴, a distance compatible with long-range cis regulation of gene expression^7,8. Furthermore, regional associations appeared to be the results of multiple weak associations rather than one or a few very significant univariate associations. These results point towards the existence of key regulatory regions where functional genetic variants aggregate, the identification of which can lead to novel biological insights and a better understanding of complex traits genetics.

Several methods have been described to estimate the overall contribution of common genetic variants to complex traits variance, but no method was specifically designed to estimate regional association using summary association statistics while accounting for LD (Table 1). For instance, a popular approach is based on variance component models using genetic relatedness as the variance-covariance matrix of the random effect. An implementation of this approach uses REML¹ to estimate the genetic effect and modifications have been reported to either take into account LD between SNPs⁹ or to handle large datasets¹⁰. While very useful and informative, all of these approaches require individual-level data. This latter pitfall has been overcome by the development of LDScore¹¹, which uses summary-level association statistics as inputs and has been shown to be equivalent to Elston regression¹². However, LDScore is ill suited for estimation of regional heritability because it is based on regressing genetic effects on the sum of linkage disequilibrium and requires large number of SNPs for precise estimations. A multi-SNP locus-association method has been described that uses summary association statistics, but it necessitates to first prune SNPs for p-value (p < 0.01) and LD (r² < 0.1)¹³. Finally, alternative methods^14,15 estimate genetic variance from the distribution of effect sizes and are not appropriate for regional genetic variance because the small number of SNPs will lead to an imprecise estimation and linkage equilibrium is assumed. There is thus a need for a method to estimate the regional contribution of common genetic variants using summary association statistics while simultaneously taking LD into account.

Table 1 Advantages and disadvantages of existing methods to estimate regional genetic variance from summary association statistics.

Full size table

Results

Comparison of genetic variance estimated using summary statistics and variance component models

Our method estimates the regional contribution to complex trait variance using summary data by adjusting the variance explained by each SNP for its LD with neighboring SNPs. Simulations using 1000 Genomes Project¹⁶ (1000G) data showed that genetic regional variance is accurately estimated by our method when no haplotype or interaction effects are present (Figure S1), as predicted by theoretical derivations (see Methods). We sought to compare estimation of overall genetic variance by our novel method to variance component models. We divided the genome into SNP blocks of median size 250 Kb (85–95 contiguous SNPs) minimizing inter-block LD and applied our method to BMI and height in the Health Retirement Study¹⁷ (HRS; N = 7,776) for which individual-level genotypes were available. Using only summary association statistics for our method and corresponding individual-level data for variance component models, both methods provided consistent estimates of genome-wide genetic variance (Fig. 1). We explored the impact of adjustment for genetic principal components. The first 20 components provided adequate protection against population stratification while the inclusion of fewer components led to the inflation in genetic variance, especially for height. Using 20 components, genetic variance was estimated at 0.12 (95% CI 0.01–0.24) for BMI with our novel method and 0.14 (95% CI 0.05–0.23) with variance component models¹⁸. The corresponding estimates for height were 0.28 (95% CI 0.17–0.39) and 0.30 (95% CI 0.20–0.39).

Using summary association statistics to identify regional associations

We next sought to compare the ability of our method to identify regional associations with other approaches also using summary association statistics. We ranked SNP blocks (median size of 250 Kb) according to decreasing regional genetic variance by applying three different methods on Genetic Investigation of Anthropometric Traits consortium (GIANT) summary association statistics^19,20: (1) our proposed approach (LARGV), (2) LDScore and (3) the multi-SNP locus-association proposed by Ehret and colleagues¹³. We then used SNP block ranks derived from GIANT by each of the three methods to estimate genetic variance in HRS (which is not part of GIANT) with variance component models, successively adding a higher proportion of SNP blocks. As illustrated in Fig. 2, the genetic variance estimated by LARGV increased more rapidly with the proportion of top SNP blocks included as compared to other two methods. The multi-SNP locus-association method performed well, but a large proportion of SNP blocks did not have any SNP left after LD and p-value pruning (68.4% for BMI and 26.7% for height), highlighting the disadvantage of pruning instead of adjusting for LD. We obtained consistent results when using LD data from 1000G data instead of HRS (Figure S2), changing SNP block size (Figure S3), or changing the number of neighboring SNPs included in LD calculations (Figure S4).

A relatively small proportion of blocks contributed disproportionately to genetic variance. The top 25% SNP blocks explained 0.94 of BMI genetic variance (i.e. = 0.132/0.140) and 0.83 of height genetic variance when using LARGV on GIANT data to rank the genetic variance of each SNP block. These results could potentially be explained by the presence of one or more very strong associations in each of these top SNP blocks. To explore this possibility, we recorded the minimum univariate association SNP p-value for each block in both GIANT and HRS (Fig. 3). Median minimum univariate p-value was 2.0 × 10⁻² for BMI in GIANT, with 1% of blocks having one or more genome-wide significant associations (p < 5 × 10⁻⁸). On the other hand, the median minimum p-value was 1.9 × 10⁻³ for height in GIANT, with 9% of blocks having one or more genome-wide significant associations. The median minimum univariate p-values were 1.8 × 10⁻² and 1.7 × 10⁻² for BMI and height in HRS. No SNP reached genome-wide significance in HRS.

Analysis of known linkage peaks

A unique application of our method is the estimation of genetic variance over extended genomic regions using summary association statistics. We therefore tested the hypothesis that previously identified linkage peaks are enriched for regional associations. Based on results from the largest linkage study of height and BMI, three peaks with suggestive (LOD>2.0) evidence of linkage in Europeans were identified, all for height²¹. The only peak with LOD > 3.0 showed a significant (p = 0.002) enrichment in regional association within a distance of +/−7.5 Mb from the linkage marker, corresponding to an estimated excess regional genetic variance of 0.0044 over the genome-wide average (Table 2). Upon a closer inspection, the region encompasses several sub-regions with genome-wide significant associations.

Table 2 Excess regional genetic variance at three suggestive (LOD > 2.0) linkage peaks for height using GIANT summary association statistics.

Full size table

Discussion

We propose a novel method to estimate regional genetic variance from summary association statistics. Using this method, we confirmed a major role of regional associations in complex trait heritability, whereby the aggregation of genetic associations contributes disproportionately to phenotypic variance. Selecting the top SNP blocks from the GIANT meta-analysis, we showed that 25% of the genome is responsible for up to 0.94 and 0.83 of BMI and height genetic variance. A large proportion of these blocks had unremarkable minimum univariate p-values, suggesting the presence of multiple weak associations underlies their impact on phenotypic variance, especially for BMI. The concentration of genetic associations within these regions supports the existence of critical nodes in the genetic regulation of complex traits such as height and BMI, with implications not only for association testing but also for population genetics and natural selection. These results also suggest that a combination of strong genetic associations and regional associations contribute to complex traits variance, with the relative proportions varying across traits. For instance, a higher proportion of genetic variance was found in the top 25% blocks for BMI yet these blocks had less significant minimum univariate p-value than height.

Our method can also be used to estimate the genetic variance explained by extended regions. We therefore tested the hypothesis that some of the previously identified linkage peaks are the result of regional associations. The only known linkage peak with LOD > 3.0 for height showed a marked and significant enrichment in regional association. This region had been previously identified in multiple linkage studies^21,22. Genetic variance explained by the region was estimated at 0.0044, which is unlikely to explain the linkage peak by itself. Nonetheless, the juxtaposition of linkage and regional associations points towards concentration of functional variants as a potential explanation for the observed linkage. The lack of regional association at other peaks can be explained by false-positive linkage results, rare variant associations undetected by genome-wide association studies or by differences in genetic architecture between studied populations.

Our proposed method has several advantages and is complementary to other methods. First, it provides results that are highly consistent with the widely used variance component models requiring individual-level data. Second, it is computationally straightforward and can be applied to large datasets. Third, it is agnostic and therefore complementary to functional annotations of the genome. Fourth, since SNPs are not pruned by either LD or significance, every SNP block or region can be evaluated.

A few limitations are worth mentioning. First, the assumption that SNPs contribute to genetic variance without any interaction or haplotype effects can lead to an underestimation of genetic variance. While our estimates were consistent with variance component models, there is a need for statistical models that better capture genetic variance when strong haplotype effects are expected²³. Second, estimation of overall genome-wide genetic variance using summary association statistics is dependent on both the accuracy of SNP effect size estimates and the correct specification of LD structure. For instance, differences in adjustment for population stratification (e.g. principal components) in individual studies that participate in a meta-analysis could potentially influence the results. Ideally, LD data would come from the same population used to derived the summary association statistics, which could be included as summary statistics for each SNP (i.e. η_d, see Methods). While this information is currently not available, consistent results were observed when using LD data from either HRS or 1000G, demonstrating the robustness of our approach to LD misspecification. Third, we have only tested continuous traits. Nevertheless, our method can be easily adapted to other outcome types through the use of generalized linear models.

In this report, we establish a novel method to estimate the regional contribution of common variants to complex traits variance using summary association statistics. Our method has several applications, such as ranking of genetic regions for genetic variance and identification of key regions contributing to genetic variance. Our method can also be used to perform network analysis using summary association statistics, or to combine summary association statistics with other types of genetic annotations as we have shown with linkage results.

Methods

Methods overview

We have previously shown that large-region joint association, where multiple genetic variants are included as independent variables in a linear model, is a simple and powerful way to test for regional associations when individual-level data are available⁴. However, such regional joint associations are not easily amenable to estimation using summary data because of sensitivity to linkage disequilibrium. To evaluate the contribution of large regions joint associations to the genetic variance of complex traits, we first devised an algorithm to divide the genome in blocks of SNPs in such a way to minimize inter-block linkage disequilibrium and thus “spillage” associations. We then derived a method to estimate regional associations using summary data and showed this method to be equivalent to a multiple linear regression model when genetic effects are strictly additive (i.e. no haplotype or interaction effect). Using regional variance estimates from summary association statistics for height and BMI from the Genetic Investigation of Anthropometric Traits (GIANT) consortium, we estimated the distribution of genetic effects across the genome and validated results in the Health Retirement Study (HRS), which is not part of GIANT. A user-friendly software is available to produce SNP blocks and LD adjustment upon request at pareg@mcmaster.ca.

Dividing the genome into SNP blocks

We first divided the genome into regions of contiguous SNPs varying in size (e.g. from 195 SNPs to 205 SNPs), herein referred to as SNP blocks and used as units for regional associations. To minimize inter-block LD and thus “spillage” associations, we devised a greedy algorithm optimizing the choice of block boundary sequentially from one end of a chromosome to the other. Briefly, using a user-defined minimum and maximum block size (in number of SNPs) and starting at one end of a chromosome arm, each possible “cut-point” between the first and second block are tested and maximal LD (r²) between pairs of SNPs crossing block boundary is calculated. The cut-point that minimizes the maximal LD is chosen, thus defining the first block and the procedure is repeated for each subsequent block until all SNPs on a chromosome arm have been assigned to a block. We empirically determined that SNP blocks of size 85 SNPs to 95 SNPs (median 90 SNPs) had a median physical size of 250 Kb.

Estimating the contribution of regional genetic associations with individual-level genotypes

The use of adjusted R² lends itself nicely to estimation of regional variance explained when individual-level genotypes are available⁴. In this context, SNPs comprised in a given SNP block are included as independent variables in a multiple linear regression model and the goodness of fit statistic, adjusted R² calculated. Because the adjusted R²accounts for the number of SNPs included in each block, the expected adjusted R² is zero under the null hypothesis of no association and the expected sum of adjusted R² over all SNP blocks is also zero. The overall contribution of regional associations to complex traits variance can be estimated by simply summing the adjusted R² over all (or selected) SNP blocks. Furthermore, the distribution of adjusted R² under the null hypothesis of no association has been previously described²⁴ and can be used to derive the distribution of the sum of adjusted R².

Estimating the contribution of regional genetic associations with summary association statistics

Estimating the contribution of regional genetic associations from summary association statistics is challenging when the exact SNP linkage disequilibrium structure of source populations is unknown. While approaches have been described to perform joint or conditional associations^23,25 using estimated SNP covariance matrices, they do not perform well when estimating regional variance explained because of sensitivity to misspecification of linkage disequilibrium (data not shown) and ensuing an overestimation of regional associations. We therefore created a simple procedure to estimate regional variance explained from summary association statistics data, adjusting for linkage disequilibrium.

Without loss of generalizability, we assume a quantitative trait (Y) standard normally distributed and genotypes normalized to have mean = 0 and standard deviation = 1 throughout. Given an n × m genotype matrix X representing genotypes at m SNPs in n individuals and the pairwise linkage disequilibrium (r²) between two SNPs k and l as r²_k,l, for a SNP d, the following LD adjustment (η_d) can be defined as the summation of LD between the d^th SNP and 100 SNPs upstream and downstream:

with a distance of 100 SNPs assumed sufficient to ensure linkage equilibrium (other values might be used). Only including SNPs with summary GWAS statistics in the sum, variance explained by each SNP d is given by:

where b_d denotes the univariate regression coefficient commonly reported in GWAS results (assuming genotypes have been standardized to have mean zero and SD = 1). Regional variance explained is then given by the sum of over SNPs in a given region. Assuming a strictly additive genetic model where each SNP contributes additively to a trait without any interaction or haplotype effects, we demonstrate the expected total variance explained over a region is approximately equal to the expected value of the multiple linear regression variance explained when the sample size is sufficiently large.

To simplify the calculation, we define D such that is an m by m symmetric matrix, where D_k is a m × 1 vector whose entries represent the k^th column of D. We will make use of the following properties:

where N is the total number of individuals in the sample.

Estimation of regional genetic variance with multiple linear regression models

Suppose the genotype matrix is fixed while the true genetic effect is a random vector β, whose individual components, i.e. the SNPs, i = 1, 2, …, m, have mean zero and variance σ². The size of the variance σ² is on the scale of M⁻¹, where M is the number of genome-wide SNPs. The genetic model can be expressed as:

where ε is a vector of standard normal error with identity variance covariance matrix. Then, the vector of estimated multiple linear regression coefficients B is given by:

The multivariate variance explained R² can be written in terms of the true effect and the error term:

and since the error term has identity variance covariance and the true effect β has variance covariance matrix σ²I, the expected variance explained is simplified to

and the variance can be calculated accordingly:

Estimation of regional genetic variance using summary association statistics

The univariate regression coefficients, denoted by lower case b and directly obtained from GWAS summary statistics, are given by

The total variance explained over a region can be calculated using only the univariate regression coefficients from GWAS:

We can rewrite the expression by defining:

Thus, we have (Equation 10):

Since as LD tends to be weak 100 SNPs upstream and downstream away from the index SNP, we can simplify the expected total variance explained using the summary statistics to (Equation 11):

The variance of can be similarly derived. By using the cyclic property of trace and the fact that DΛD is a positive definite matrix, an upper bound for the variance can be expressed as (Equation 12):

where and both terms are small with respect to .

Comparison of regional genetic variance estimated using multiple regression models and summary association statistics

The expected values are equivalent between multiple linear regression model and the summary statistics derived regional sum (), with the number of SNPs m replaced by the “effective” number of genetic markers . Variance is slightly bigger for multiple linear regression models as compared to regional sum because the “effective” number of genetic markers is always equal (no LD) or less than the number of markers (m). In other words, is expected to be equal (no LD) or lower than the corresponding .

Health Retirement Study

We conducted large region joint association analysis for height using genome-wide data from the publicly available Health Retirement Study (HRS; dbGaP Study Accession: phs000428.v1.p1). HRS quality control criteria were used for filtering of both genotype and phenotype data, namely: (1) SNPs and individuals with missingness higher than 2% were excluded, (2) related individuals were excluded, (3) only participants with self-reported European ancestry genetically confirmed by principal component analysis were included, (4) SNPs with Hardy-Weinberg equilibrium p < 1 × 10⁻⁶ were excluded, (5) individuals for whom the reported sex does not match their genetic sex were excluded, (6) SNPs with minor allele frequency lower than 0.02 were removed. The final dataset included 7,776 European participants genotyped for 740,748 SNPs. Height and BMI was adjusted for age and sex in all analyses. To mitigate the effect of outliers, we have removed values outside the 1^st and 99^th percentile range for each of height and BMI. All analyses are adjusted for the first 20 genetic principal components unless stated otherwise. All LD estimates used throughout the manuscript were derived from HRS genotypes. HRS was not part of the GIANT meta-analysis of height and BMI^26,27.

Additional Information

How to cite this article: Pare, G. et al. A method to estimate the contribution of regional genetic associations to complex traits from summary association statistics. Sci. Rep. 6, 27644; doi: 10.1038/srep27644 (2016).

References

Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nature Genetics 42, 565–569, doi: 10.1038/ng.608 (2010).
Article CAS PubMed PubMed Central Google Scholar
Davies, G. et al. Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N = 53949). Mol Psychiatry 20, 183–192, doi: 10.1038/mp.2014.188 (2015).
Article CAS PubMed PubMed Central Google Scholar
Stahl, E. A. et al. Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis. Nature genetics 44, 483–489, doi: 10.1038/ng.2232 (2012).
Article CAS PubMed PubMed Central Google Scholar
Pare, G., Asma, S. & Deng, W. Q. Contribution of large region joint associations to complex traits genetics. PLoS Genet 11, e1005103, doi: 10.1371/journal.pgen.1005103 (2015).
Article CAS PubMed PubMed Central Google Scholar
Beyene, J., Tritchler, D., Asimit, J. L. & Hamid, J. S. Gene- or region-based analysis of genome-wide association studies. Genet Epidemiol 33 Suppl 1, S105–110, doi: 10.1002/gepi.20481 (2009).
Article PubMed PubMed Central Google Scholar
Gusev, A. et al. Quantifying missing heritability at known GWAS loci. PLoS Genet 9, e1003993, doi: 10.1371/journal.pgen.1003993 (2013).
Article CAS PubMed PubMed Central Google Scholar
Cheung, V. G. & Spielman, R. S. Genetics of human gene expression: mapping DNA variants that influence gene expression. 10, 595–604, doi: 10.1038/nrg2630 (2009).
Article CAS Google Scholar
Consortium, T. E. P. An integrated encyclopedia of DNA elements in the human genome. 489, 57–74, doi: 10.1038/nature11247 (2012).
Article CAS Google Scholar
Speed, D., Hemani, G., Johnson, M. R. & Balding, D. J. Improved heritability estimation from genome-wide SNPs. Am J Hum Genet 91, 1011–1021, doi: 10.1016/j.ajhg.2012.10.010 (2012).
Article CAS PubMed PubMed Central Google Scholar
Loh, P.-R. et al. Contrasting regional architectures of schizophrenia and other complex diseases using fast variance components analysis. bioRxiv (2015).
Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 47, 291–295, doi: 10.1038/ng.3211 (2015).
Article CAS PubMed PubMed Central Google Scholar
Bulik-Sullivan, B. Relationship between LD Score and Haseman-Elston Regression. bioRxiv (2015).
Ehret, G. B. et al. A multi-SNP locus-association method reveals a substantial fraction of the missing heritability. Am J Hum Genet 91, 863–871, doi: 10.1016/j.ajhg.2012.09.013 (2012).
Article CAS PubMed PubMed Central Google Scholar
Palla, L. & Dudbridge, F. A Fast Method that Uses Polygenic Scores to Estimate the Variance Explained by Genome-wide Marker Panels and the Proportion of Variants Affecting a Trait. Am J Hum Genet 97, 250–259, doi: 10.1016/j.ajhg.2015.06.005 (2015).
Article CAS PubMed PubMed Central Google Scholar
So, H. C., Li, M. & Sham, P. C. Uncovering the total heritability explained by all true susceptibility variants in a genome-wide association study. Genet Epidemiol 35, 447–456, doi: 10.1002/gepi.20593 (2011).
Article PubMed Google Scholar
Genomes Project, C. et al. A global reference for human genetic variation. Nature 526, 68–74, doi: 10.1038/nature15393 (2015).
Article ADS CAS Google Scholar
Sonnega, A. et al. Cohort Profile: the Health and Retirement Study (HRS). Int J Epidemiol 43, 576–585, doi: 10.1093/ije/dyu067 (2014).
Article PubMed PubMed Central Google Scholar
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet 88, 76–82, doi: 10.1016/j.ajhg.2010.11.011 (2011).
Article CAS PubMed PubMed Central Google Scholar
Locke, A. E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206, doi: 10.1038/nature14177 (2015).
Article CAS PubMed PubMed Central Google Scholar
Wood, A. R. et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nature genetics 46, 1173–1186, doi: 10.1038/ng.3097 (2014).
Article CAS PubMed PubMed Central Google Scholar
Sammalisto, S. et al. Genome-wide linkage screen for stature and body mass index in 3.032 families: evidence for sex- and population-specific genetic effects. Eur J Hum Genet 17, 258–266, doi: 10.1038/ejhg.2008.152 (2009).
Article CAS PubMed Google Scholar
Perola, M. et al. Combined genome scans for body stature in 6,602 European twins: evidence for common Caucasian loci. PLoS Genet 3, e97, doi: 10.1371/journal.pgen.0030097 (2007).
Article CAS PubMed PubMed Central Google Scholar
Vilhjalmsson, B. et al. Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores. bioRxiv (2015).
Ohtani, K. & Tanizaki, H. Exact Distributions of R2 and Adjusted R2 in a Linear Regression Model with Multivariate Error Terms. Journal Of The Japan Statistical Society 34, 101–109, doi: 10.14490/jjss.34.101 (2004).
Article MathSciNet MATH Google Scholar
Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet 44, 369–375, S361–363, doi: 10.1038/ng.2213 (2012).
Article CAS PubMed PubMed Central Google Scholar
Berndt, S. I. et al. Genome-wide meta-analysis identifies 11 new loci for anthropometric traits and provides insights into genetic architecture. Nat Genet 45, 501–512, doi: 10.1038/ng.2606 (2013).
Article CAS PubMed PubMed Central Google Scholar
Lango Allen, H. et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832–838, doi: 10.1038/nature09410 (2010).
Article ADS CAS PubMed PubMed Central Google Scholar

Download references

Author information

Authors and Affiliations

Department of Pathology and Molecular Medicine, McMaster University, L8S 4L8, Hamilton, ON, Canada
Guillaume Pare
Department of Clinical Epidemiology and Biostatistics, Population Genomics Program, McMaster University, Hamilton, L8S 4L8, ON, Canada
Guillaume Pare
Population Health Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, L8L 2X2, ON, Canada
Guillaume Pare & Shihong Mao
Thrombosis and Atherosclerosis Research Institute, Hamilton, L8L 2X2, ON, Canada
Guillaume Pare
Department of Statistical Sciences, University of Toronto, Toronto, M5S 3G3, ON, Canada
Wei Q. Deng

Authors

Guillaume Pare
View author publications
You can also search for this author in PubMed Google Scholar
Shihong Mao
View author publications
You can also search for this author in PubMed Google Scholar
Wei Q. Deng
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

G.P. designed the experiment; G.P. and W.Q.D. wrote the manuscript; S.M. analyzed the data and prepared tables and figures; All authors reviewed the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Pare, G., Mao, S. & Deng, W. A method to estimate the contribution of regional genetic associations to complex traits from summary association statistics. Sci Rep 6, 27644 (2016). https://doi.org/10.1038/srep27644

Download citation

Received: 29 February 2016
Accepted: 18 May 2016
Published: 08 June 2016
DOI: https://doi.org/10.1038/srep27644

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.