Guidance for the utility of linear models in meta-analysis of genetic association studies of binary phenotypes

Abstract

Linear mixed models are increasingly used for the analysis of genome-wide association studies (GWAS) of binary phenotypes because they can efficiently and robustly account for population stratification and relatedness through inclusion of random effects for a genetic relationship matrix. However, the utility of linear (mixed) models in the context of meta-analysis of GWAS of binary phenotypes has not been previously explored. In this investigation, we present simulations to compare the performance of linear and logistic regression models under alternative weighting schemes in a fixed-effects meta-analysis framework, considering designs that incorporate variable case–control imbalance, confounding factors and population stratification. Our results demonstrate that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case–control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale. Our conclusions thus provide essential recommendations for the development of robust protocols for meta-analysis of binary phenotypes with linear models.

Introduction

Linear mixed models (LMMs) have received increasing prominence in the analysis of genome-wide association studies (GWAS) of complex human traits because they account for genetic structure, across participants, which arises from population stratification, cryptic relatedness or close familial relationships.1, 2, 3, 4, 5, 6, 7 In this framework, structure is modelled by means of a genetic relationship matrix (GRM), constructed from genome-wide SNP genotype data across study participants (or from known familial relationships). A random-effects model is then used to evaluate the evidence of association for an SNP by accounting for the contribution of the GRM to the overall variance of the trait. This flexible modelling framework can incorporate fixed effects to account for covariates, and can be used to estimate components of heritability that are explained by (subsets of) genotyped SNPs.8, 9

Linear models assume that the outcome of interest is a quantitative trait with a Gaussian distribution. However, it has become increasingly common to use LMM approaches in population- and family-based GWAS of binary phenotypes because of their flexibility in accounting for structure, and their computational tractability in comparison with logistic mixed models. Linear models have the disadvantage that allelic effect estimates cannot be interpreted, directly, in terms of the odds ratio (OR), although approximations on the log-odds scale can be obtained.10 Recent studies have also demonstrated that LMMs have less power than traditional logistic regression modelling techniques in GWAS of case–control phenotypes unless ascertainment is adequately accounted for.11, 12

While the properties of linear (mixed) models in the analysis of GWAS of binary phenotypes at the cohort level have been explored previously,10 their utility in the context of meta-analysis has not been investigated. In this study, therefore, we present simulations to compare the type I error rates and power of generalised linear (mixed) models under alternative weighting schemes in a fixed-effects meta-analysis framework. We consider a range of study designs that incorporate variable case–control imbalance across GWAS to reflect the increasing use of large-scale, population-based biobanks, and investigate the impact of confounders and population stratification on the properties of the analytical strategies. We conclude by making recommendations for the development of robust protocols for meta-analysis of GWAS of binary phenotypes with linear (mixed) models, which will be highly relevant in the era of large-scale consortium efforts to unravel the genetic basis of complex human diseases.

Materials and methods

Consider a GWAS of n participants, with binary phenotypes, genome-wide genotypes and additional covariates denoted by y, G and x, respectively. We denote the phenotype of the ith participant by yi{0, 1}, and their genotype at the jth SNP by Gij[0, 2], coded under a dosage model in the number of minor alleles. In a generalised linear mixed modelling framework,

where g(.) is the link function, β is the allelic effect of the jth SNP on the phenotype and γ is a vector of covariate regression parameters. In this expression, u is a vector of random effects, defined by u~MVN(0, λ K), for the variance component λ and GRM K, derived from genome-wide SNP data (or known familial relationships) to account for population structure. A likelihood ratio test with one degree of freedom is then formed by comparing the maximised log likelihood of the unconstrained model (1) with that obtained under the null hypothesis of no association, β=0. Note that model (1) reduces to a generalised linear model (no random effects) for λ=0, which is appropriate in the absence of structure because of population stratification and/or familial relationships.

Under a logistic regression model, for the logit link function, the maximum-likelihood estimate of the allelic effect, , can be interpreted directly as the log-OR of the jth SNP. However, under a linear regression model, for the identity link function, the maximum-likelihood estimate of the allelic effect, , is measured on the wrong scale. Nevertheless, we can obtain an approximation of the allelic log-OR and corresponding variance from the linear model,10 given by

and

where is the maximum-likelihood estimate of the intercept. In practice, is usually obtained from the null model for which βLIN=0, because the effect of any SNP on the phenotype is expected to be small. Here, we estimate by the proportion of participants that are cases, for which the correction factor is minimised when the number of cases and controls in the study is equal (ie, no imbalance). This transformation of parameter estimates from the linear regression model has been demonstrated to provide an accurate approximation of the allelic log-OR provided that genetic effects are small, the case–control ratio is well balanced and the SNP is common.10

Fixed-effects meta-analysis

Consider N GWAS, for which we have tested for association of the phenotype with the jth SNP under a generalised linear model (1). We denote the effective sample size of the kth GWAS by nk, given by

where n0k and n1k denote the number of controls and cases, respectively. In the kth GWAS, we also denote the P-value obtained from the regression model by pk, and the estimated allelic effect from the regression model by .

Under an effective sample size weighting scheme, we obtain a combined Z-score for association of the jth SNP across GWAS by

where φ−1 is the inverse normal distribution function. Alternatively, under an inverse-variance weighting scheme, we obtain an estimate of the allelic effect of the jth SNP on the phenotype, and the corresponding variance, across GWAS by

where

We then obtain a combined Z-score for association of the jth SNP across GWAS by

Simulation study

We have performed a series of detailed simulations to investigate the type I error rates and power of alternative approaches to study-level association testing of a binary phenotype (linear and logistic regression modelling) in the context of fixed-effects meta-analysis (with effective sample size or inverse-variance weighting schemes), summarised in Table 1.

Table 1 Summary of approaches to study-level association testing of a binary phenotype

Our first study design consisted of 10 cohorts of a binary phenotype, ascertained from the same population, each comprising of 2000 participants. We considered three scenarios for case–control imbalance, described in Table 2, such that the meta-analysis comprised a total of 10 000 cases and 10 000 population controls: (i) no imbalance (1:1 ratio in each cohort); (ii) moderate imbalance (variable ratio of 3:1 to 1:3 across cohorts); and (iii) extreme imbalance (variable ratio of 19:1 to 1:19 across cohorts). For each scenario, we investigated models of association parameterised according to: (i) the risk allele frequency (RAF) of the causal SNP, denoted q; and (ii) the allelic OR for the risk allele, denoted ψ.

Table 2 Summary of case–control counts in each cohort for alternative imbalance scenarios considered in the simulation study

For each model, we generated 10 000 replicates of genotype data for the causal SNP in the study participants. For each replicate, genotypes were simulated in the required numbers of cases and controls in each cohort, according to the causal SNP RAF and allelic OR, and assuming Hardy–Weinberg equilibrium. Specifically, genotypes in cases and controls were simulated from a multinomial distribution, with probabilities given by

where R denotes the risk allele and .

To assess the impact of confounders on the alternative analysis strategies, we also simulated a binary covariate for each individual from a Bernoulli distribution, taking the value 1 in cases with probability and 0 otherwise, and taking the value 1 in controls with probability and 0 otherwise.

We also investigated the impact of population stratification on the alternative analysis strategies. Within each cohort, cases and controls were ascertained from sub-population A with probabilities θ and (1−θ), respectively, and were otherwise ascertained from sub-population B. The RAFs in sub-populations A and B were assumed to be 0.4 and 0.6, respectively, and used to generate genotypes at the causal SNP under Hardy–Weinberg equilibrium, from a multinomial distribution, as defined above. For each individual, we then simulated genotype data for 1000 additional uncorrelated SNPs, assuming Hardy–Weinberg equilibrium, and independent of case–control status, from a multinomial distribution. For each SNP, we assumed minor allele frequencies of 0.2 and 0.8, respectively, in sub-populations A and B. Genotypes at the 1000 SNPs were then used to construct the GRM within each cohort.

Our second study design consisted of two cohorts of a binary phenotype, ascertained from the same population. The first cohort consisted of 1000 cases and 1000 controls. The second cohort represented a large biobank of 100 000 individuals, within which we investigated the impact of the extent of case–control imbalance on the meta-analysis. For each scenario, we assumed a causal SNP RAF of 0.5 and an allelic OR of 1.25, and generated 10 000 replicates of genotype data for the causal SNP in the study participants. For each replicate, genotypes were simulated in the required number of cases and controls in the two cohorts, assuming Hardy–Weinberg equilibrium, from a multinomial distribution, as described above.

For both study designs, we used a linear Wald test, implemented in EPACTS, to obtain parameter estimates and association P-values under a linear regression model (no random effects) within each cohort for each replicate. To obtain parameter estimates under a logistic regression model (no random effects) within each cohort, we used a Firth bias-corrected likelihood ratio test, also implemented in EPACTS, which has been demonstrated to be more robust to case–control imbalance than Wald or score statistics for binary outcomes.13 To obtain parameter estimates under a LMM (random effects for GRM) within each cohort, we used EMMAX,1 also implemented in EPACTS. We combined summary statistics through fixed-effects meta-analysis with effective sample size and inverse-variance weighting using METAL14 and GWAMA,15 respectively.

Across all scenarios, each test of association, after meta-analysis, was evaluated at nominal significance thresholds of P<0.05 and P<0.01, and at the traditional genome-wide standard of P<5 × 10−8. For estimated allelic effect sizes on the log-odds scale (from the logistic regression model and after conversion from the linear regression model), we also evaluated bias and mean square error (MSE).

Results

No population stratification or confounders

We first considered the properties of fixed-effects meta-analysis of association summary statistics obtained from linear and logistic regression models without random effects for the GRM and for simulations generated in the absence of structure or confounders. Supplementary Figure S1 presents the type I error rate (at a nominal 5% significance threshold) of each of the analytical strategies considered (Table 1) for an SNP with RAF in the range of 1–50%. For all frequencies investigated, the type I error rate was consistent with the nominal significance threshold of P<0.05, irrespective of the analytical approach and the extent of case–control imbalance.

Figure 1 presents the power (at genome-wide significance) of each of the analytical strategies considered (Table 1), as a function of the allelic OR, for an SNP with RAF in the range of 1–50%. There is no appreciable difference in power between the five approaches unless there is extreme case–control imbalance. In this extreme imbalance setting, the power of the meta-analysis under inverse-variance weighting of effect sizes from the linear model (without conversion to the log-odds scale) is substantially lower compared with that for the other approaches. However, we also observe a loss in power of the meta-analysis under inverse-variance weighting of effect sizes from the logistic regression model for rare SNPs (RAF 1%), irrespective of the extent of case–control imbalance, which has not been reported previously. We observe the same pattern of results at less stringent significance levels (Supplementary Figure S2), with the inverse-variance weighting of effect sizes from the linear model (without conversion to the log-odds scale) being substantially less powerful when there is extreme case–control imbalance.

Figure 1
figure1

Power to detect association (at genome-wide significance, P<5 × 10−8) of a binary phenotype with a causal SNP, in the absence of population stratification or confounders, using alternative meta-analysis strategies for summary statistics obtained from linear and logistic regression models without random effects for the GRM (Table 1). Results are presented as a function of the allelic OR, for a causal SNP with RAF in the range of 1–50% and for variable extent of case–control imbalance (defined in Table 2).

Supplementary Figures S3 and S4 present the bias and MSE of the estimated allelic OR after meta-analysis under the inverse-variance weighting of effect sizes from the logistic regression model and the linear regression model after conversion to the log-odds scale. Results are presented as a function of the allelic OR. There is minimal difference in both metrics between the two meta-analysis strategies. However, for rare SNPs (RAF 1%), the meta-analysis under inverse-variance weighting of effect sizes from the logistic regression model underestimates the allelic OR, irrespective of case–control imbalance, explaining the reduction in power of this strategy that was observed above.

Impact of a confounding variable in the absence of population stratification

We next considered the properties of fixed-effects meta-analysis of association summary statistics obtained from linear and logistic regression models without random effects for the GRM and for simulations generated in the absence of structure, but where the binary phenotype was also correlated with a confounding variable. We assumed a causal SNP with RAF 50% and an allelic OR of 1.15 for the binary phenotype. Supplementary Figure S5 presents the power (at genome-wide significance) of each of the five analytical strategies considered (Table 1), as a function of the relative risk of the confounding variable, defined by . As expected, there is a general decline in power to detect association across analytical strategies as the relative risk of the confounder of the binary phenotype increases. However, as demonstrated by the simulations in the absence of confounders, the inverse-variance weighting of effect sizes from the linear model (without conversion to the log-odds scale) was less powerful when there is extreme case–control imbalance.

Supplementary Figure S5 also presents the bias and MSE of the estimated allelic OR after meta-analysis under the inverse-variance weighting of effect sizes from the logistic regression model and the linear regression model after conversion to the log-odds scale. Results are presented as a function of the relative risk of the confounding variable. Irrespective of the case–control imbalance, the estimated allelic OR after conversion to the log-odds scale becomes increasingly biased (underestimated) as the relative risk of the confounding variable increases, although power is not affected.

Impact of population stratification

We then considered the properties of fixed-effects meta-analysis of association summary statistics obtained from linear regression models, with and without random effects for the GRM and for simulations generated in the presence of population stratification (cases and controls ascertained from sub-populations A and B). Supplementary Figure S6 presents the type I error rate (at a nominal 5% significance threshold) of each analytical strategy considered (Table 1) as a function of the probability, θ, that a case is ascertained from sub-population A. Irrespective of the extent of population stratification, the type I error rate was consistent with the nominal significance threshold of P<0.05 for any fixed-effects meta-analysis strategy using the linear model with random effects for the GRM. However, as expected, type I error rates became increasingly inflated as the extent of population stratification was elevated for all fixed-effects meta-analysis strategies using the linear model without a random effect for the GRM.

Figure 2 presents the power (at genome-wide significance) of the three fixed-effects meta-analysis strategies that aggregate association summary statistics from the linear model with random effects for the GRM, for a causal SNP with allelic OR of 1.15 for the binary phenotype. There is no appreciable difference in power between the analytical strategies, unless there is extreme case–control imbalance. In this extreme imbalance setting, the power of the meta-analysis under inverse-variance weighting of effect sizes from the linear model (without conversion to the log-odds scale) is substantially lower compared with that for the other approaches. The difference in power between these approaches is consistent, irrespective of the extent of population stratification.

Figure 2
figure2

Power to detect association (at genome-wide significance, P<5 × 10−8) of a binary phenotype with a causal SNP, in the presence of population stratification (cases and controls ascertained from sub-populations (A and B), using alternative meta-analysis strategies for summary statistics obtained from linear regression models with random effects for the GRM (Table 1). Results are presented as a function of the probability that a case is ascertained from sub-population A, for a causal SNP with allelic OR of 1.15 for the binary phenotype and for variable extent of case–control imbalance (defined in Table 2).

Impact of inclusion of a population biobank with extreme case–control imbalance

Finally, we considered the properties of fixed-effects meta-analysis of association summary statistics obtained from linear and logistic regression models without random effects for the GRM, for simulations generated in the absence of structure. In these simulations, association summary statistics were aggregated from a population biobank of 100 000 participants with extreme case–control imbalance and a balanced case–control study of 2000 participants. Figure 3 presents the power (at genome-wide significance) of each of the analytical strategies considered (Table 1), for a causal SNP with RAF 50% and an allelic OR of 1.25, as a function of the number of cases in the population biobank. As reported above, in this extreme imbalance setting, the power of the meta-analysis under inverse-variance weighting of effect sizes from the linear model (without conversion to the log-odds scale) is substantially lower compared with that for the other approaches. The difference in power reduces as the extent of the imbalance in the biobank decreases (i.e. the proportion of cases increases), and thus has most detrimental impact for rare diseases.

Figure 3
figure3

Power to detect association (at genome-wide significance, P<5 × 10−8) of a binary phenotype with a causal SNP, in the absence of population stratification or confounders, using alternative meta-analysis strategies for summary statistics obtained from linear and logistic regression models without random effects for the GRM (Table 1). Association summary statistics were aggregated from a population biobank of 100 000 participants with extreme case–control imbalance and a balanced case–control study of 2000 participants. Results are presented for a causal SNP with RAF 50% and an allelic OR of 1.25, as a function of the number of cases in the population biobank.

Discussion

We have presented simulations to evaluate the utility of linear models in the context of meta-analysis of GWAS of binary phenotypes. Our results highlight that the extent of case–control imbalance across studies can have a major impact on the performance of a linear regression model. We have demonstrated that, for extreme imbalance, meta-analysis under inverse-variance weighting of allelic effect estimates from a linear regression model results in a substantial reduction in power, unless they are first converted onto the log-odds scale. This is of particular importance because existing, widely used software16 for the meta-analysis of association summary statistics from LMMs implements inverse-variance weighting of allelic effect estimates without conversion to the log-odds scale.

For a binary phenotype, under a linear regression model, the standard error of an allelic effect estimate is dependent on multiple factors, including allele frequency, total sample size, OR and variance of the trait. For a fixed total sample size, the variance of the trait (and thus standard error of the allelic effect estimate) decreases as the case–control imbalance becomes more extreme. However, the power to detect association with the binary phenotype is less in imbalanced studies, and they should, in fact, be given less weight in any meta-analysis. Correction of allelic effect estimates from the linear regression model onto the log-odds scale circumvents this issue by inflating the corresponding standard error by a factor that is inversely proportional to the case–control imbalance.

Case–control imbalance is becoming increasingly widespread in GWAS of binary phenotypes, particularly with the availability of large-scale, extensively studied, population-based biobanks, often with linkage to electronic medical records.17, 18, 19, 20 The utility of linear models in these extremely imbalanced case–control designs has not been previously studied in the context of meta-analysis. Crucially, our investigation highlights that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case–control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale.

Our simulations demonstrate that meta-analysis of association summary statistics for binary phenotypes from LMMs is robust to population stratification, even in the presence of extreme case–control imbalance. However, it is important to note that this conclusion is valid only when population stratification does not lead to violation of the LMM assumption of homoscedasticity, for which residual variances are constant, irrespective of covariates.21, 22 Heteroscedasticity can occur in the presence of population stratification, for example, when strata have variable case–control imbalance or heterogeneous disease risk. Under these circumstances, LMMs are valid only for variants that have similar RAFs across strata, such that there is only weak confounding due to structure. Otherwise, computationally efficient software will be required to implement logistic mixed models on the scale of the whole genome.

References

  1. 1

    Kang HM, Sul JH, Service SK et al: Variance component model to account for sample structure in genome-wide association studies. Nat Genet 2010; 42: 348–354.

  2. 2

    Zhang Z, Ersoz E, Lai CQ et al: Mixed linear model approach adapted for genome-wide association studies. Nat Genet 2010; 42: 355–360.

  3. 3

    Price AL, Zaitlen NA, Reich D, Patterson N : New approaches to population stratification in genome-wide association studies. Nat Rev Genet 2010; 11: 459–463.

  4. 4

    Lippert C, Listgarten J, Liu Y, Kadie CM, Davidson RI, Heckerman D : FaST linear mixed models for genome-wide association studies. Nat Methods 2011; 8: 833–835.

  5. 5

    Listgarten J, Lippert C, Kadie CM, Davidson RI, Eskin E, Heckerman D : Improved linear mixed models for genome-wide association studies. Nat Methods 2012; 9: 525–526.

  6. 6

    Zhou X, Stephens M : Genome-wide efficient mixed-model analysis for association studies. Nat Genet 2012; 44: 821–824.

  7. 7

    Svishcheva GR, Axenovich TI, Belonogova NM, van Duijn CM, Aulchenko YS : Rapid variance components-based method for whole-genome association analysis. Nat Genet 2012; 44: 1166–1170.

  8. 8

    Yang J, Benyamin B, McEvoy BP et al: Common SNPs explain a large proportion of the heritability for human height. Nat Genet 2010; 42: 565–569.

  9. 9

    Zaitlen N, Kraft P : Heritability in the genome-wide association era. Hum Genet 2012; 131: 1655–1664.

  10. 10

    Pirinen M, Donnelly P, Spencer CCA : Efficient computation with a linear mixed model on large-scale genetic data sets with applications to genetic studies. Ann App Stat 2013; 7: 369–390.

  11. 11

    Yang J, Zaitlen NA, Goddard ME, Visscher PM, Price AL : Advantages and pitfalls in the application mixed-model association methods. Nat Genet 2014; 46: 100–106.

  12. 12

    Hayeck TJ, Zaitlen NA, Loh P-R et al: Mixed model with correction for case–control ascertainment increases association power. Am J Hum Genet 2015; 96: 720–730.

  13. 13

    Ma C, Blackwell T, Boehnke M, Scott LJ GoT2D Investigators: Recommended joint and meta-analysis strategies for case–control association testing of single low-count variants. Genet Epidemiol 2013; 37: 539–550.

  14. 14

    Willer CJ, Li Y, Abecasis GR : METAL: fast and efficient meta-analysis of genome-wide association scans. Bioinf 2010; 26: 2190–2191.

  15. 15

    Magi R, Morris AP : GWAMA: software for genome-wide association meta-analysis. BMC Bioinf 2010; 11: 288.

  16. 16

    Liu DJ, Peloso GM, Zhan X et al: Meta-analysis of gene-level tests for rare variant association. Nat Genet 2014; 46: 200–204.

  17. 17

    Roden DM, Pulley JM, Basford MA et al: Development of a large-scale de-identified DNA biobank to enable personalised medicine. Clin Pharmacol Ther 2008; 84: 362–369.

  18. 18

    Tayo BO, Teil M, Tong L et al: Genetic background of patients from a university medical center in Manhattan: implications for personalized medicine. PLoS One 2011; 6: e19166.

  19. 19

    Leitsalu L, Haller T, Esko T et al: Cohort profile: Estonian Biobank of the Estonian Genome Center, University of Tartu. Int J Epidemiol 2015; 44: 1137–1147.

  20. 20

    Sudlow C, Gallacher J, Allen N et al: UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med 2015; 12: e1001779.

  21. 21

    Conomos MP, Laurie CA, Stilp AM et al: Genetic diversity and association studies in US Hispanic/Latino populations: applications in the Hispanic Community Health Study/Study of Latinos. Am J Hum Genet 2016; 92: 165–184.

  22. 22

    Chen H, Wang C, Conomos MP et al: Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. Am J Hum Genet 2016; 98: 653–666.

Download references

Acknowledgements

Andrew P Morris is a Wellcome Trust Senior Fellow in Basic Biomedical Science (under award WT098017).

Author information

Correspondence to Andrew P Morris.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplementary Information accompanies this paper on European Journal of Human Genetics website

Supplementary information

Supplementary Figures (PDF 673 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Further reading

  • Genome-wide association study of cerebral small vessel disease reveals established and novel loci

    • Jaeyoon Chung
    • , Sandro Marini
    • , Joanna Pera
    • , Bo Norrving
    • , Jordi Jimenez-Conde
    • , Jaume Roquer
    • , Israel Fernandez-Cadenas
    • , David L Tirschwell
    • , Magdy Selim
    • , Devin L Brown
    • , Scott L Silliman
    • , Bradford B Worrall
    • , James F Meschia
    • , Stacie Demel
    • , Steven M Greenberg
    • , Agnieszka Slowik
    • , Arne Lindgren
    • , Reinhold Schmidt
    • , Matthew Traylor
    • , Muralidharan Sargurupremraj
    • , Steffen Tiedt
    • , Rainer Malik
    • , Stéphanie Debette
    • , Martin Dichgans
    • , Carl D Langefeld
    • , Daniel Woo
    • , Jonathan Rosand
    •  & Christopher D Anderson

    Brain (2019)

  • Genome-wide association study of type 2 diabetes in Africa

    • Ji Chen
    • , Meng Sun
    • , Adebowale Adeyemo
    • , Fraser Pirie
    • , Tommy Carstensen
    • , Cristina Pomilla
    • , Ayo P. Doumatey
    • , Guanjie Chen
    • , Elizabeth H. Young
    • , Manjinder Sandhu
    • , Andrew P. Morris
    • , Inês Barroso
    • , Mark I. McCarthy
    • , Anubha Mahajan
    • , Eleanor Wheeler
    • , Charles N. Rotimi
    •  & Ayesha A. Motala

    Diabetologia (2019)

  • Discovery of common and rare genetic risk variants for colorectal cancer

    • Jeroen R. Huyghe
    • , Stephanie A. Bien
    • , Tabitha A. Harrison
    • , Hyun Min Kang
    • , Sai Chen
    • , Stephanie L. Schmit
    • , David V. Conti
    • , Conghui Qu
    • , Jihyoun Jeon
    • , Christopher K. Edlund
    • , Peyton Greenside
    • , Michael Wainberg
    • , Fredrick R. Schumacher
    • , Joshua D. Smith
    • , David M. Levine
    • , Sarah C. Nelson
    • , Nasa A. Sinnott-Armstrong
    • , Demetrius Albanes
    • , M. Henar Alonso
    • , Kristin Anderson
    • , Coral Arnau-Collell
    • , Volker Arndt
    • , Christina Bamia
    • , Barbara L. Banbury
    • , John A. Baron
    • , Sonja I. Berndt
    • , Stéphane Bézieau
    • , D. Timothy Bishop
    • , Juergen Boehm
    • , Heiner Boeing
    • , Hermann Brenner
    • , Stefanie Brezina
    • , Stephan Buch
    • , Daniel D. Buchanan
    • , Andrea Burnett-Hartman
    • , Katja Butterbach
    • , Bette J. Caan
    • , Peter T. Campbell
    • , Christopher S. Carlson
    • , Sergi Castellví-Bel
    • , Andrew T. Chan
    • , Jenny Chang-Claude
    • , Stephen J. Chanock
    • , Maria-Dolores Chirlaque
    • , Sang Hee Cho
    • , Charles M. Connolly
    • , Amanda J. Cross
    • , Katarina Cuk
    • , Keith R. Curtis
    • , Albert de la Chapelle
    • , Kimberly F. Doheny
    • , David Duggan
    • , Douglas F. Easton
    • , Sjoerd G. Elias
    • , Faye Elliott
    • , Dallas R. English
    • , Edith J. M. Feskens
    • , Jane C. Figueiredo
    • , Rocky Fischer
    • , Liesel M. FitzGerald
    • , David Forman
    • , Manish Gala
    • , Steven Gallinger
    • , W. James Gauderman
    • , Graham G. Giles
    • , Elizabeth Gillanders
    • , Jian Gong
    • , Phyllis J. Goodman
    • , William M. Grady
    • , John S. Grove
    • , Andrea Gsur
    • , Marc J. Gunter
    • , Robert W. Haile
    • , Jochen Hampe
    • , Heather Hampel
    • , Sophia Harlid
    • , Richard B. Hayes
    • , Philipp Hofer
    • , Michael Hoffmeister
    • , John L. Hopper
    • , Wan-Ling Hsu
    • , Wen-Yi Huang
    • , Thomas J. Hudson
    • , David J. Hunter
    • , Gemma Ibañez-Sanz
    • , Gregory E. Idos
    • , Roxann Ingersoll
    • , Rebecca D. Jackson
    • , Eric J. Jacobs
    • , Mark A. Jenkins
    • , Amit D. Joshi
    • , Corinne E. Joshu
    • , Temitope O. Keku
    • , Timothy J. Key
    • , Hyeong Rok Kim
    • , Emiko Kobayashi
    • , Laurence N. Kolonel
    • , Charles Kooperberg
    • , Tilman Kühn
    • , Sébastien Küry
    • , Sun-Seog Kweon
    • , Susanna C. Larsson
    • , Cecelia A. Laurie
    • , Loic Le Marchand
    • , Suzanne M. Leal
    • , Soo Chin Lee
    • , Flavio Lejbkowicz
    • , Mathieu Lemire
    • , Christopher I. Li
    • , Li Li
    • , Wolfgang Lieb
    • , Yi Lin
    • , Annika Lindblom
    • , Noralane M. Lindor
    • , Hua Ling
    • , Tin L. Louie
    • , Satu Männistö
    • , Sanford D. Markowitz
    • , Vicente Martín
    • , Giovanna Masala
    • , Caroline E. McNeil
    • , Marilena Melas
    • , Roger L. Milne
    • , Lorena Moreno
    • , Neil Murphy
    • , Robin Myte
    • , Alessio Naccarati
    • , Polly A. Newcomb
    • , Kenneth Offit
    • , Shuji Ogino
    • , N. Charlotte Onland-Moret
    • , Barbara Pardini
    • , Patrick S. Parfrey
    • , Rachel Pearlman
    • , Vittorio Perduca
    • , Paul D. P. Pharoah
    • , Mila Pinchev
    • , Elizabeth A. Platz
    • , Ross L. Prentice
    • , Elizabeth Pugh
    • , Leon Raskin
    • , Gad Rennert
    • , Hedy S. Rennert
    • , Elio Riboli
    • , Miguel Rodríguez-Barranco
    • , Jane Romm
    • , Lori C. Sakoda
    • , Clemens Schafmayer
    • , Robert E. Schoen
    • , Daniela Seminara
    • , Mitul Shah
    • , Tameka Shelford
    • , Min-Ho Shin
    • , Katerina Shulman
    • , Sabina Sieri
    • , Martha L. Slattery
    • , Melissa C. Southey
    • , Zsofia K. Stadler
    • , Christa Stegmaier
    • , Yu-Ru Su
    • , Catherine M. Tangen
    • , Stephen N. Thibodeau
    • , Duncan C. Thomas
    • , Sushma S. Thomas
    • , Amanda E. Toland
    • , Antonia Trichopoulou
    • , Cornelia M. Ulrich
    • , David J. Van Den Berg
    • , Franzel J. B. van Duijnhoven
    • , Bethany Van Guelpen
    • , Henk van Kranen
    • , Joseph Vijai
    • , Kala Visvanathan
    • , Pavel Vodicka
    • , Ludmila Vodickova
    • , Veronika Vymetalkova
    • , Korbinian Weigl
    • , Stephanie J. Weinstein
    • , Emily White
    • , Aung Ko Win
    • , C. Roland Wolf
    • , Alicja Wolk
    • , Michael O. Woods
    • , Anna H. Wu
    • , Syed H. Zaidi
    • , Brent W. Zanke
    • , Qing Zhang
    • , Wei Zheng
    • , Peter C. Scacheri
    • , John D. Potter
    • , Michael C. Bassik
    • , Anshul Kundaje
    • , Graham Casey
    • , Victor Moreno
    • , Goncalo R. Abecasis
    • , Deborah A. Nickerson
    • , Stephen B. Gruber
    • , Li Hsu
    •  & Ulrike Peters

    Nature Genetics (2019)

  • Bayesian multiple logistic regression for case-control GWAS

    • Saikat Banerjee
    • , Lingyao Zeng
    • , Heribert Schunkert
    • , Johannes Söding
    •  & Michael P. Epstein

    PLOS Genetics (2018)

  • Transformation of Summary Statistics from Linear Mixed Model Association on All-or-None Traits to Odds Ratio

    • Luke R. Lloyd-Jones
    • , Matthew R. Robinson
    • , Jian Yang
    •  & Peter M. Visscher

    Genetics (2018)