Article | Published:

A weighted burden test using logistic regression for integrated analysis of sequence variants, copy number variants and polygenic risk score

European Journal of Human Genetics volume 27, pages 114–124 (2019)


Previously described methods of analysis allow variants in a gene to be weighted more highly according to rarity and/or predicted function, with the variant contributions then summed into a gene-wise risk score that can be compared between cases and controls using a t-test. However, this approach does not allow covariates to be incorporated into the analysis. Schizophrenia is an example of an illness where there is evidence that different kinds of genetic variation can contribute to risk, including common variants contributing to a polygenic risk score (PRS), very rare copy number variants (CNVs) and sequence variants. A logistic regression approach has been implemented to compare the gene-wise risk scores between cases and controls, while incorporating as covariates population principal components, the PRS and the presence of pathogenic CNVs and sequence variants. A likelihood ratio test is performed, comparing the likelihoods of logistic regression models with and without this score. The method was applied to an ethnically heterogeneous exome-sequenced sample of 6000 controls and 5000 schizophrenia cases. In the raw analysis, the test statistic is inflated, but inclusion of principal components satisfactorily controls for this. In this dataset, the inclusion of the PRS and the effects of CNVs and sequence variants had only a small impact. The set of genes which are FMRP targets showed some evidence for enrichment of rare, functional variants among cases (p = 0.0005). This approach can be applied to any disease in which different kinds of genetic and non-genetic risk factors make contributions to risk.
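The likelihood ratio test described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names (`gene_score`, `fit_logistic`, `likelihood_ratio_test`, `simulate`), the Newton-Raphson fitting routine, the simulated data and the use of a chi-square reference distribution with one degree of freedom are all assumptions made for illustration. The sketch fits one logistic regression model with covariates only and one with covariates plus the gene-wise score, then compares their log-likelihoods.

```python
import math
import random


def gene_score(allele_counts, weights):
    """Hypothetical weighted burden: sum of variant weight times allele count."""
    return sum(w * g for w, g in zip(weights, allele_counts))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_logistic(X, y, iters=50, tol=1e-8):
    """Fit logistic regression by Newton-Raphson; return (beta, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += p * (1.0 - p) * xi[a] * xi[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        # y*eta - log(1 + exp(eta)), written in a numerically stable form
        ll += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return beta, ll


def chi2_sf_1df(x):
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def likelihood_ratio_test(covariates, scores, y):
    """Compare models without and with the gene-wise score (assumed 1 df)."""
    X0 = [[1.0] + list(c) for c in covariates]        # intercept + covariates
    X1 = [row + [s] for row, s in zip(X0, scores)]    # ... plus the score
    _, ll0 = fit_logistic(X0, y)
    _, ll1 = fit_logistic(X1, y)
    stat = max(0.0, 2.0 * (ll1 - ll0))
    return stat, chi2_sf_1df(stat)


def simulate(n=500, effect=1.0, seed=1):
    """Simulated data: two principal-component-like covariates and one score."""
    rng = random.Random(seed)
    covariates, scores, y = [], [], []
    for _ in range(n):
        pcs = [rng.gauss(0, 1), rng.gauss(0, 1)]
        score = rng.gauss(0, 1)
        eta = -0.2 + 0.5 * pcs[0] + effect * score
        covariates.append(pcs)
        scores.append(score)
        y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)
    return covariates, scores, y


if __name__ == "__main__":
    covs, scores, status = simulate()
    stat, p = likelihood_ratio_test(covs, scores, status)
    print(f"LRT statistic = {stat:.2f}, p = {p:.3g}")
```

In this sketch, further covariates such as the PRS or indicators for pathogenic CNVs would simply be extra columns in `covariates`, so that the score is tested conditional on them. In practice an established statistics library would normally replace the hand-written fitting routine.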





The datasets used for the analysis described in this manuscript were obtained from dbGaP through accession number phs000473.v2.p2. Samples used for data analysis were provided by the Swedish Cohort Collection supported by the NIMH grant R01MH077139, the Sylvan C. Herman Foundation, the Stanley Medical Research Institute and The Swedish Research Council (grants 2009–4959 and 2011–4659). Support for the exome sequencing was provided by the NIMH Grand Opportunity grant RCMH089905, the Sylvan C. Herman Foundation, a grant from the Stanley Medical Research Institute and multiple gifts to the Stanley Center for Psychiatric Research at the Broad Institute of MIT and Harvard.

Author information


  1. Centre for Psychiatry, Barts and the London School of Medicine and Dentistry, Charterhouse Square, London, EC1M 6BQ, UK

    • David Curtis
  2. UCL Genetics Institute, UCL, Darwin Building, Gower Street, London, WC1E 6BT, UK

    • David Curtis



Conflict of interest

The authors declare that they have no conflict of interest.

Corresponding author

Correspondence to David Curtis.
