Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes

A Publisher Correction to this article was published on 03 June 2019

This article has been updated


Stroke has multiple etiologies, but the underlying genes and pathways are largely unknown. We conducted a multiancestry genome-wide-association meta-analysis in 521,612 individuals (67,162 cases and 454,450 controls) and discovered 22 new stroke risk loci, bringing the total to 32. We further found shared genetic variation with related vascular traits, including blood pressure, cardiac traits, and venous thromboembolism, at individual loci (n = 18), and using genetic risk scores and linkage-disequilibrium-score regression. Several loci exhibited distinct association and pleiotropy patterns for etiological stroke subtypes. Eleven new susceptibility loci indicate mechanisms not previously implicated in stroke pathophysiology, with prioritization of risk variants and genes accomplished through bioinformatics analyses using extensive functional datasets. Stroke risk loci were significantly enriched in drug targets for antithrombotic therapy.


Fig. 1: MEGASTROKE study design.
Fig. 2: Association results of the transancestral GWAS meta-analysis and the prespecified ancestry-specific meta-analysis in European samples.
Fig. 3: Genetic overlap between stroke and related vascular traits at the 32 genome-wide-significant loci for stroke.
Fig. 4: Shared genetic contribution between stroke and related vascular traits.
Fig. 5: Connection between stroke risk genes and approved drugs for antithrombotic therapy.

Change history

  • 03 June 2019

    An amendment to this paper has been published and can be accessed via a link at the top of the paper.


  1. GBD 2015 DALYs and HALE Collaborators. Global, regional, and national disability-adjusted life-years (DALYs) for 315 diseases and injuries and healthy life expectancy (HALE), 1990–2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet 388, 1603–1658 (2016).
  2. GBD 2015 Mortality and Causes of Death Collaborators. Global, regional, and national life expectancy, all-cause mortality, and cause-specific mortality for 249 causes of death, 1980–2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet 388, 1459–1544 (2016).
  3. Gudbjartsson, D. F. et al. Variants conferring risk of atrial fibrillation on chromosome 4q25. Nature 448, 353–357 (2007).
  4. Gudbjartsson, D. F. et al. A sequence variant in ZFHX3 on 16q22 associates with atrial fibrillation and ischemic stroke. Nat. Genet. 41, 876–878 (2009).
  5. International Stroke Genetics Consortium (ISGC) et al. Genome-wide association study identifies a variant in HDAC9 associated with large vessel ischemic stroke. Nat. Genet. 44, 328–333 (2012).
  6. Woo, D. et al. Meta-analysis of genome-wide association studies identifies 1q22 as a susceptibility locus for intracerebral hemorrhage. Am. J. Hum. Genet. 94, 511–521 (2014).
  7. Kilarski, L. L. et al. Meta-analysis in more than 17,900 cases of ischemic stroke reveals a novel association at 12q24.12. Neurology 83, 678–685 (2014).
  8. Traylor, M. et al. A novel MMP12 locus is associated with large artery atherosclerotic stroke using a genome-wide age-at-onset informed approach. PLoS Genet. 10, e1004469 (2014).
  9. NINDS, Stroke Genetics Network (SiGN) & International Stroke Genetics Consortium (ISGC). Loci associated with ischaemic stroke and its subtypes (SiGN): a genome-wide association study. Lancet Neurol. 15, 174–184 (2016).
  10. Neurology Working Group of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, the Stroke Genetics Network (SiGN) & the International Stroke Genetics Consortium (ISGC). Identification of additional risk loci for stroke and small vessel disease: a meta-analysis of genome-wide association studies. Lancet Neurol. 15, 695–707 (2016).
  11. Malik, R. et al. Low-frequency and common genetic variation in ischemic stroke: the METASTROKE collaboration. Neurology 86, 1217–1226 (2016).
  12. Traylor, M. et al. Genetic variation at 16q24.2 is associated with small vessel stroke. Ann. Neurol. 81, 383–394 (2017).
  13. Williams, F. M. et al. Ischemic stroke is associated with the ABO locus: the EuroCLOT study. Ann. Neurol. 73, 16–31 (2013).
  14. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
  15. Morris, A. P. Transethnic meta-analysis of genomewide association studies. Genet. Epidemiol. 35, 809–822 (2011).
  16. Mishra, A. & MacGregor, S. VEGAS2: software for more flexible gene-based testing. Twin Res. Hum. Genet. 18, 86–91 (2015).
  17. Traylor, M. et al. Genome-wide meta-analysis of cerebral white matter hyperintensities in patients with stroke. Neurology 86, 146–153 (2016).
  18. Hara, K. et al. Association of HTRA1 mutations and familial ischemic cerebral small-vessel disease. N. Engl. J. Med. 360, 1729–1739 (2009).
  19. Verdura, E. et al. Heterozygous HTRA1 mutations are associated with autosomal dominant cerebral small vessel disease. Brain 138, 2347–2358 (2015).
  20. Gould, D. B. et al. Role of COL4A1 in small-vessel disease and hemorrhagic stroke. N. Engl. J. Med. 354, 1489–1496 (2006).
  21. Jeanne, M. et al. COL4A2 mutations impair COL4A1 and COL4A2 secretion and cause hemorrhagic stroke. Am. J. Hum. Genet. 90, 91–101 (2012).
  22. Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).
  23. Lubitz, S. A. et al. Independent susceptibility markers for atrial fibrillation on chromosome 4q25. Circulation 122, 976–984 (2010).
  24. Kato, N. et al. Trans-ancestry genome-wide association study identifies 12 genetic loci influencing blood pressure and implicates a role for DNA methylation. Nat. Genet. 47, 1282–1293 (2015).
  25. Surakka, I. et al. The impact of low-frequency and rare variants on lipid levels. Nat. Genet. 47, 589–597 (2015).
  26. Morris, A. P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat. Genet. 44, 981–990 (2012).
  27. Bis, J. C. et al. Meta-analysis of genome-wide association studies from the CHARGE consortium identifies common variants associated with carotid intima media thickness and plaque. Nat. Genet. 43, 940–947 (2011).
  28. Verhaaren, B. F. et al. Multiethnic genome-wide association study of cerebral white matter hyperintensities on MRI. Circ. Cardiovasc. Genet. 8, 398–409 (2015).
  29. Sinner, M. F. et al. Integrating genetic, transcriptional, and functional analyses to identify 5 novel genes for atrial fibrillation. Circulation 130, 1225–1235 (2014).
  30. Germain, M. et al. Meta-analysis of 65,734 individuals identifies TSPAN15 and SLC44A2 as two susceptibility loci for venous thromboembolism. Am. J. Hum. Genet. 96, 532–542 (2015).
  31. Nikpay, M. et al. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).
  32. Ellinor, P. T. et al. Meta-analysis identifies six new susceptibility loci for atrial fibrillation. Nat. Genet. 44, 670–675 (2012).
  33. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
  34. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
  35. Trynka, G. et al. Chromatin marks identify critical cell types for fine mapping complex trait variants. Nat. Genet. 45, 124–130 (2013).
  36. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
  37. Pers, T. H. et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nat. Commun. 6, 5890 (2015).
  38. Geer, L. Y. et al. The NCBI BioSystems database. Nucleic Acids Res. 38, D492–D496 (2010).
  39. Mishra, A. & MacGregor, S. A novel approach for pathway analysis of GWAS data highlights role of BMP signaling and muscle cell differentiation in colorectal cancer susceptibility. Twin Res. Hum. Genet. 20, 1–9 (2017).
  40. Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007).
  41. Yang, H. & Wang, K. Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR. Nat. Protoc. 10, 1556–1566 (2015).
  42. Wang, W. et al. LNK/SH2B3 loss of function promotes atherosclerosis and thrombosis. Circ. Res. 119, e91–e103 (2016).
  43. Ramensky, V., Bork, P. & Sunyaev, S. Human non-synonymous SNPs: server and survey. Nucleic Acids Res. 30, 3894–3900 (2002).
  44. Kumar, P., Henikoff, S. & Ng, P. C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073–1081 (2009).
  45. GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).
  46. Eicher, J. D. et al. GRASP v2.0: an update on the Genome-Wide Repository of Associations between SNPs and phenotypes. Nucleic Acids Res. 43, D799–D804 (2015).
  47. Leslie, R., O’Donnell, C. J. & Johnson, A. D. GRASP: analysis of genotype-phenotype results from 1390 genome-wide association studies and corresponding open access database. Bioinformatics 30, i185–i194 (2014).
  48. Higasa, K. et al. Human genetic variation database, a reference database of genetic variations in the Japanese population. J. Hum. Genet. 61, 547–553 (2016).
  49. Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).
  50. Adams, D. et al. BLUEPRINT to decode the epigenetic signature written in blood. Nat. Biotechnol. 30, 224–226 (2012).
  51. Franzén, O. et al. Cardiometabolic risk loci share downstream cis- and trans-gene regulation across tissues and diseases. Science 353, 827–830 (2016).
  52. Erbilgin, A. et al. Identification of CAD candidate genes in GWAS loci and their expression in vascular cells. J. Lipid Res. 54, 1894–1905 (2013).
  53. The ARIC investigators. The Atherosclerosis Risk in Communities (ARIC) Study: design and objectives. Am. J. Epidemiol. 129, 687–702 (1989).
  54. Suhre, K. et al. Connecting genetic risk to disease end points through the human blood plasma proteome. Nat. Commun. 8, 14357 (2017).
  55. Brænne, I. et al. Prediction of causal candidate genes in coronary artery disease loci. Arterioscler. Thromb. Vasc. Biol. 35, 2207–2217 (2015).
  56. Flister, M. J. et al. Identifying multiple causative genes at a single GWAS locus. Genome Res. 23, 1996–2002 (2013).
  57. Okada, Y. et al. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature 506, 376–381 (2014).
  58. Kemp, J. P. et al. Identification of 153 new loci associated with heel bone mineral density and functional involvement of GPC6 in osteoporosis. Nat. Genet. 49, 1468–1475 (2017).
  59. Li, Y. & Kellis, M. Joint Bayesian inference of risk variants and tissue-specific epigenomic enrichments across multiple complex human diseases. Nucleic Acids Res. 44, e144 (2016).
  60. Lee, B. K. et al. Cell-type specific and combinatorial usage of diverse transcription factors revealed by genome-wide binding studies in multiple human cells. Genome Res. 22, 9–24 (2012).
  61. Sanseau, P. et al. Use of genome-wide association studies for drug repositioning. Nat. Biotechnol. 30, 317–320 (2012).
  62. Nelson, M. R. et al. The support of human genetic evidence for approved drug indications. Nat. Genet. 47, 856–860 (2015).
  63. den Hoed, M. et al. Identification of heart rate-associated loci and their effects on cardiac conduction and rhythm disorders. Nat. Genet. 45, 621–631 (2013).
  64. Pfeufer, A. et al. Genome-wide association study of PR interval. Nat. Genet. 42, 153–159 (2010).
  65. Christophersen, I. E. et al. Large-scale analyses of common and rare variants identify 12 new loci associated with atrial fibrillation. Nat. Genet. 49, 946–952 (2017).
  66. Verweij, N. et al. Genetic determinants of P wave duration and PR segment. Circ. Cardiovasc. Genet. 7, 475–481 (2014).
  67. Le Scouarnec, S. et al. Dysfunction in ankyrin-B-dependent ion channel and transporter targeting causes human sinus node disease. Proc. Natl. Acad. Sci. USA 105, 15617–15622 (2008).
  68. Schott, J. J. et al. Congenital heart disease caused by mutations in the transcription factor NKX2-5. Science 281, 108–111 (1998).
  69. Ellesøe, S. G. et al. Familial atrial septal defect and sudden cardiac death: identification of a novel NKX2-5 mutation and a review of the literature. Congenit. Heart Dis. 11, 283–290 (2016).
  70. Mohler, P. J. et al. Ankyrin-B mutation causes type 4 long-QT cardiac arrhythmia and sudden cardiac death. Nature 421, 634–639 (2003).
  71. Shi, H., Kichaev, G. & Pasaniuc, B. Contrasting the genetic architecture of 30 complex traits from summary association data. Am. J. Hum. Genet. 99, 139–153 (2016).
  72. Kato, N. et al. Meta-analysis of genome-wide association studies identifies common variants associated with blood pressure variation in east Asians. Nat. Genet. 43, 531–538 (2011).
  73. Surendran, P. et al. Trans-ancestry meta-analyses identify rare and common variants associated with blood pressure and hypertension. Nat. Genet. 48, 1151–1161 (2016).
  74. Winkler, T. W. et al. Quality control and conduct of genome-wide association meta-analyses. Nat. Protoc. 9, 1192–1212 (2014).
  75. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
  76. So, H. C., Gui, A. H., Cherny, S. S. & Sham, P. C. Evaluating the heritability explained by known susceptibility variants: a survey of ten complex diseases. Genet. Epidemiol. 35, 310–317 (2011).
  77. Feigin, V. L., Lawes, C. M., Bennett, D. A. & Anderson, C. S. Stroke epidemiology: a review of population-based studies of incidence, prevalence, and case-fatality in the late 20th century. Lancet Neurol. 2, 43–53 (2003).
  78. Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375, S1–S3 (2012).
  79. Liu, J. Z. et al. A versatile gene-based test for genome-wide association studies. Am. J. Hum. Genet. 87, 139–145 (2010).
  80. Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
  81. International Consortium for Blood Pressure Genome-Wide Association Studies et al. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature 478, 103–109 (2011).
  82. Wain, L. V. et al. Genome-wide association study identifies six new loci influencing pulse pressure and mean arterial pressure. Nat. Genet. 43, 1005–1011 (2011).
  83. Wellcome Trust Case Control Consortium et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat. Genet. 44, 1294–1301 (2012).
  84. Johnson, A. D. et al. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 24, 2938–2939 (2008).
  85. Arnold, M., Raffler, J., Pfeufer, A., Suhre, K. & Kastenmüller, G. SNiPA: an interactive, genetic variant-centered annotation browser. Bioinformatics 31, 1334–1336 (2015).
  86. Ward, L. D. & Kellis, M. HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res. 44, D877–D881 (2016).
  87. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  88. Wishart, D. S. et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 34, D668–D672 (2006).
  89. Yang, H. et al. Therapeutic target database update 2016: enriched resource for bench to clinical drug target and targeted pathway information. Nucleic Acids Res. 44, D1069–D1074 (2016).
  90. Hachiya, T. et al. Genetic predisposition to ischemic stroke: a polygenic risk score. Stroke 48, 253–258 (2017).
  91. Kanai, M. et al. Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. Nat. Genet. (2018).



A full list of Acknowledgements appears in the Supplementary Note.

Author information





Writing and editing the manuscript: R.M., G.C., M.T., M.S., Y.O., S.D., and M.D. Study design/conception: R.M., M.D., S.D., B.M.P., G.J.F., J.W.J., J.I.R., J.G.W., M.F., H.I.Y., C.J., S. Seshadri, W.T.L., B.B.W., B.D.M., S.J.K., H.S.M., J.D., J.R., K.S., and O.M. Statistical analysis: A.-K.G., G.J.F., M.F., C.D.L., Y.O., E.L., B.R.S., R.M., M.S., M.T., A. Mishra, E.G.H., C.D.A., T.M.B., C. Carrera, I.C., W.-Y.L., S.L.P., K. Rannikmäe, K. Rice, S. Tiedt, J.C.C., A.D.J., P.I.W.d.B., S.W.v.d.L., P. Almgren, S. Gretarsdottir, and F.T. Sample/phenotype contribution: M.D., S.D., C.D.A., C. Cruchaga, I.C., H.I.H., J.W.J., N.S.R., A.S.B., A.C., A.S., A.S.H., A.P.R., A.L.D., A. Rolfs, A. Ruusalepp, A.G.L., A. Manichaikul, B.M.K., C.L.C., C.R., C.K., C. Tanislav, C. Tzourio, C.M.v.D., D.I.C., D.W., D.A.T., D.O.K., D.K.S., D.L., E.S.T., E.E.S., E.I., F.-C.H., G.P., H.A., H.H.H.A., H.S.M., I.E.C., J. Haessler, J. He, J. Hata, J.F.M., J.S.K., J.-M.L., J.D., J.W.C., J.R., J.J-C., J.A.J., K.S., K.M.R., K.L.K., K.L.W., L.J.L., L.A.L., M.A.N., M.A.I., M.d.H., M.R.I., M.J.O., M. Kanai, M. Kubo, M.W., M.M.S., N.J.W., N.K., O.R.B., P.F.M., P.T.E., P.K.M., P.E., P. Amouyel, P.v.d.H., Q.D., Q.Y., R.P.G., R.L.S., R.F.G., R.S., S.Y., S.K., S.T.E., S.B., S.A.L., S.J.K., S.R.H., S.W.-S., T.B.H., T.R., T.H.M., T.P., T.T., U.S., U.T., V.C., V.G., W.-M.C., V.N.S.T., X.J., B.M.P., J.I.R., J.G.W., O.M., C.J., J.C.H., S. Seshadri, T.A., G.B.B., R.D.B., A.H., N.L.S., R.L., C.M.L., T.N., P. M. Ridker, P. M. Rothwell, V.S., C.O.S., P.S., C.L.M.S., K.D.T., M. Civelek, D. Saleheen, D. Strbian, S. Sakaue, S. Gustafsson, S. Tiedt, S. Trompet, and I.F.-C. Critical revision of article: R.M., M.D., S.D., B.M.P., C.J., J.I.R., O.M., S. Seshadri, G.J.F., J.W.J., W.T.L., C.D.A., D. Strbian, E.G.H., I.F.-C., S. Tiedt, C.L.M.S., C.O.S., C. Cruchaga, G.B.B., I.C., J.C.B., J. Hata, K. Rice, S.L.P., N.S.R., S.S.R., T.A., T.N., J.M.M.H., T.M.B., and V.S. Supervision: M.D., S.D., C.D.A., J.M.M.H., J.I.R., S. Seshadri, C.M.L., C.L.M.S., J.W.J., V.S., and J.C.B. GWAS analyses: R.M., G.C., M.T., S. Gretarsdottir, G.T., J. Hata, A.K.G., M. Chong, J.L.M.B., C. Carrera, A.H., G.J.F., and Y.K. Functional annotation: M.S., A. Mishra, R.M., G.C., M.T., L.R.-J., and A.K.G. Gene-based analysis: A. Mishra. Pathway analyses: A. Mishra, R.M., M. Chong, and K. Rice. Drug-target analysis: Y.O. Scoring method: M.S., R.M., S.D., and M.D. wGRS analysis: M.S. and R.M. LD-score regression analysis: R.M., M.S., and Y.K. Credible-SNP-set analysis: R.M., G.C., and M.S. Data for GWAS analysis, cross-phenotype analysis or QTL analysis: AFGen Consortium, Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, iGEN-BP Consortium, INVENT Consortium, STARNET, and Biobank Japan Cooperative Hospital Group. Consortia providing stroke data: COMPASS Consortium, EPIC-CVD Consortium, EPIC-InterAct Consortium, ISGC, METASTROKE Consortium, Neurology Working Group of the CHARGE Consortium, NINDS-SiGN, UK Young Lacunar DNA Study, and MEGASTROKE Consortium. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute or the National Institute of Neurological Disorders and Stroke.

Corresponding authors

Correspondence to Stephanie Debette or Martin Dichgans.

Ethics declarations

Competing interests

S. Gretarsdottir, G.T., U.T., and K.S. are all employees of deCODE Genetics/Amgen, Inc. M.A.N. is an employee of Data Tecnica International. P.T.E. is the PI on a grant from Bayer HealthCare to the Broad Institute, focused on the genetics and therapeutics of atrial fibrillation. S.A.L. receives sponsored research support from Bayer HealthCare, Biotronik, and Boehringer Ingelheim, and has consulted for St. Jude Medical and Quest Diagnostics. E.I. is a scientific advisor for Precision Wellness, Cellink and Olink Proteomics for work unrelated to the present project. B.M.P. serves on the DSMB of a clinical trial funded by Zoll LifeCor and on the Steering Committee of the Yale Open Data Access Project funded by Johnson & Johnson. The remaining authors have no disclosures.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–13, Supplementary Tables 1, 3–5, 8–10, 12, 14, 16 and 26, and Supplementary Note

Life Sciences Reporting Summary

Supplementary Table 2: Sample overview and genetic information of all studies

Given for each sample are the age distribution, gender distribution and risk-factor distribution, if available. Further information on genotyping platform, technique, imputation parameters and QC parameters is given, if available.

Supplementary Table 6: Variance explained by the 32 lead SNPs. Shown are the lead SNPs of the 32 risk loci for stroke and the phenotypic variance explained as estimated by the method of So et al.

Variances are given for the Europeans-only and the East Asian-only meta-analysis. If a SNP was not available in the analysis, variance explained was set to zero.

Supplementary Table 7: Results from the Gene-based tests using VEGAS2

Data were analyzed for each ethnicity and a meta-analysis was calculated using Stouffer’s Z. Genome-wide-significant results are displayed in bold (P < 2.02 × 10⁻⁶, the Bonferroni-corrected threshold for the number of genes).
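As an illustration of the combination step described for Supplementary Table 7, the following is a minimal sketch of Stouffer's Z applied to ancestry-specific gene-based P-values. It is not the authors' pipeline: the example P-values are hypothetical, the inputs are treated as one-sided, the default is the unweighted form of the method (the legend does not say whether weights were used), and the gene count is an assumed figure chosen only because 0.05 divided by it reproduces the quoted threshold.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def stouffer_z(p_values, weights=None):
    """Combine one-sided P-values into a single Z-score and combined P-value."""
    if weights is None:
        weights = [1.0] * len(p_values)  # unweighted Stouffer by default
    if len(weights) != len(p_values):
        raise ValueError("need exactly one weight per P-value")
    # Upper-tail Z for each P-value; -inv_cdf(p) stays accurate for tiny p.
    z_scores = [-STD_NORMAL.inv_cdf(p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    z_meta = numerator / sqrt(sum(w * w for w in weights))
    return z_meta, STD_NORMAL.cdf(-z_meta)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Hypothetical gene-based P-values for one gene in three ancestry groups.
z_meta, p_meta = stouffer_z([0.001, 0.02, 0.3])
n_genes = 24_752  # assumption: 0.05 / 24,752 is approximately 2.02e-6
print(z_meta, p_meta, p_meta < bonferroni_threshold(0.05, n_genes))
```

Passing `weights` (for example, square roots of sample sizes) gives the weighted variant; whether that matches the published analysis is not stated in this legend.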

Supplementary Table 11: Results of the conditional analysis (GCTA-COJO) in the European sample

Shown are the 2-SNP or 3-SNP solutions for each lead SNP after conditioning on the lead SNP in Europeans. P-values of SNP2 and SNP3 were considered significant at P < 5 × 10⁻⁸. SVS is omitted because there were no genome-wide-significant signals to investigate.

Supplementary Table 13: Results from look-ups of the 32 genome-wide significant loci for stroke in published GWAS data from related phenotypes

Column D specifies the index SNPs of the non-stroke phenotype or SNPs in high LD with the index SNP (r² > 0.9) with the lowest P-value in the respective non-stroke phenotype. Index SNPs or proxy SNPs reaching P < 1.30 × 10⁻⁴ (0.05/32 loci/12 related vascular traits) in the respective related phenotype are shown. Index SNPs and proxy SNPs reaching genome-wide significance are marked by an asterisk in column G. Column F specifies the r² between the index SNP and the lead SNP in stroke.
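The look-up rule in the Supplementary Table 13 legend can be sketched as follows. This is one reading of the legend, not the authors' code: the `LookupHit` record, its field names, and the SNP identifiers are hypothetical, and the selection step assumes that the reported SNP is simply the lowest-P candidate among the index SNP and its high-LD proxies.

```python
from dataclasses import dataclass


@dataclass
class LookupHit:
    rsid: str            # SNP tested in the related-trait GWAS (hypothetical ID)
    r2_to_index: float   # LD (r^2) with the trait's index SNP; 1.0 for the index SNP
    p_value: float       # association P-value in the related trait


def lookup_threshold(alpha=0.05, n_loci=32, n_traits=12):
    """Threshold quoted in the legend: 0.05 / 32 loci / 12 related traits."""
    return alpha / n_loci / n_traits


def best_hit(candidates, r2_min=0.9):
    """Lowest-P candidate among the index SNP and proxies with r^2 > r2_min."""
    eligible = [c for c in candidates if c.r2_to_index > r2_min]
    return min(eligible, key=lambda c: c.p_value) if eligible else None


candidates = [
    LookupHit("rs_index", 1.0, 1e-3),
    LookupHit("rs_proxy", 0.95, 1e-5),
    LookupHit("rs_weak_ld", 0.50, 1e-9),  # excluded: r^2 is not above 0.9
]
hit = best_hit(candidates)
print(f"{lookup_threshold():.2e}")  # 1.30e-04
print(hit.rsid, hit.p_value < lookup_threshold())
```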

Supplementary Table 15: MR-Egger regression and comparison with Inverse-Variance Weighted (IVW) estimates, for vascular wGRS showing a significant association with stroke risk

IVW estimates are derived from a fixed-effects analysis using the GTX software (Online Methods); for the intercept of the MR-Egger analysis (Egger_intercept, Online Methods) we used a significance threshold of P < 0.05. Effect estimates are given per unit increase in the wGRS. CI: confidence interval; OR: odds ratio. *The MR-Egger intercept estimate was nominally significant (P = 0.015) only for the association between the SBP wGRS and AS, and this was no longer the case after removing 6 of 37 SNPs that appeared as outliers on the leave-one-out plot (Online Methods), leading to causal estimates in broad agreement across regression techniques, with larger standard errors using the MR-Egger method as is typically the case ( and PMID: 26050253, 28527048). The causal estimates obtained by the weighted median approach (PMID: 27061298) are also in broad agreement with those from the IVW and the MR-Egger (beta ± s.e.: 0.032 ± 0.005, OR (95% CI): 1.03 (1.02–1.04), P = 9.48 × 10⁻¹⁰).
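For readers unfamiliar with the two estimators compared in Supplementary Table 15, the sketch below shows textbook forms of the fixed-effects IVW slope and the MR-Egger intercept and slope, computed from per-SNP summary statistics, plus the conversion from a log-odds estimate to an odds ratio with a 95% CI. It is not the GTX implementation used by the authors; the function names and inputs are assumptions, and standard errors for the MR-Egger terms are omitted for brevity.

```python
from math import exp, sqrt


def ivw_fixed(bx, by, se_by):
    """Fixed-effects IVW: weighted regression of by on bx through the origin."""
    w = [1.0 / s ** 2 for s in se_by]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den, sqrt(1.0 / den)  # (estimate, standard error)


def mr_egger(bx, by, se_by):
    """MR-Egger point estimates: the same weighted regression with an intercept."""
    # Orient each SNP so that its effect on the exposure is positive.
    signs = [1.0 if x >= 0 else -1.0 for x in bx]
    bx = [s * x for s, x in zip(signs, bx)]
    by = [s * y for s, y in zip(signs, by)]
    w = [1.0 / s ** 2 for s in se_by]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, bx)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope  # (intercept, slope)


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and approximate 95% CI from a log-odds estimate."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


# Weighted-median values quoted in the legend: beta = 0.032, s.e. = 0.005.
or_, lo, hi = odds_ratio_ci(0.032, 0.005)
print(f"{or_:.2f} ({lo:.2f}, {hi:.2f})")  # 1.03 (1.02, 1.04)
```

An MR-Egger intercept close to zero is what the legend's P < 0.05 test is probing: a non-zero intercept suggests directional pleiotropy among the instruments.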

Supplementary Table 17: Results of the epigwas analysis

Shown is the enrichment P-value of GWAS results in specific tissues. We used epigwas to calculate enrichment P-values for H3K4me1 (enhancers), H3K4me3 (promoters) and H3K9ac (active promoters).

Supplementary Table 18: Results of DEPICT pathway analysis

For each stroke subtype, SNPs with BF > 5 from the trans-ethnic meta-analysis were analyzed. Gene sets with an FDR < 0.05 were considered significant. Columns E-N show the Z-scores of the genes in the gene set.

Supplementary Table 19: Results from the Ingenuity Pathway Analysis

Shown are enrichment P-values for the corresponding Ingenuity canonical pathway and the proteins involved in the respective pathway. P-values are derived from Fisher’s exact test. Pathways with FDR < 0.05 were considered significant and are displayed in bold. For the IPA Diseases and Bio Functions and for the IPA Tox Functions, P-values are given for the enrichment of specific function annotations.
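The two statistical steps named in this legend can be illustrated with a short sketch: a right-tailed Fisher's exact test on a 2×2 overlap table, followed by a false-discovery-rate adjustment. The legend does not say which FDR procedure was used, so the Benjamini-Hochberg step below is an assumption, as are the function names and the example counts; this is not the Ingenuity software's implementation.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided (enrichment) P-value for the 2x2 table [[a, b], [c, d]].

    a: candidate genes in the pathway      b: candidate genes outside it
    c: background genes in the pathway     d: background genes outside it
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, col1)
    # Sum hypergeometric probabilities for overlaps at least as large as a.
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical overlap counts for three pathways.
p_vals = [fisher_right_tail(3, 1, 1, 3), fisher_right_tail(5, 0, 1, 9), fisher_right_tail(1, 4, 3, 7)]
print([q < 0.05 for q in benjamini_hochberg(p_vals)])
```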

Supplementary Table 20: Results from the VEGAS2 pathway analysis

Shown are pathways for each stroke subtype, the ethnicity-specific P-values and the meta-analysis P-value. Pathways with FDR < 0.05 were considered significant and are displayed in bold (CES only).

Supplementary Table 21: Results of the 95% credible set analysis

Results were obtained separately in European, East Asian, and African American ancestry samples. Shown is the number of SNPs in the 95% credible set (numerator) and the total number of SNPs in the analysis (denominator, r² > 0.1).
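A 95% credible set of the kind summarized in Supplementary Table 21 is commonly built by ranking SNPs at a locus by posterior probability and adding them until the cumulative probability reaches 0.95. The sketch below assumes a single causal variant per locus, equal prior probability for every SNP, and per-SNP Bayes factors supplied on the log10 scale; the SNP identifiers and values are hypothetical, and the authors' exact procedure is described in their Online Methods rather than here.

```python
def credible_set(log10_bf, coverage=0.95):
    """Smallest set of SNPs whose posterior probabilities sum to >= coverage.

    log10_bf maps SNP ID -> log10 Bayes factor for association at one locus.
    """
    if not log10_bf:
        raise ValueError("need at least one SNP")
    # Subtract the maximum before exponentiating to avoid overflow.
    top = max(log10_bf.values())
    weights = {snp: 10.0 ** (value - top) for snp, value in log10_bf.items()}
    total = sum(weights.values())
    ranked = sorted(weights, key=weights.get, reverse=True)
    chosen, cumulative = [], 0.0
    for snp in ranked:
        chosen.append(snp)
        cumulative += weights[snp] / total  # posterior under equal priors
        if cumulative >= coverage:
            break
    return chosen, cumulative


# Hypothetical locus with three SNPs that passed the LD filter.
locus = {"rsA": 6.0, "rsB": 4.0, "rsC": 3.0}
snps, mass = credible_set(locus)
print(f"{len(snps)}/{len(locus)}", round(mass, 3))  # numerator/denominator as in the table
```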

Supplementary Table 22: Detailed functional and biological information on SNPs at the 32 stroke risk loci

Shown are the lead SNPs and all proxy SNPs with r² > 0.8. We show information on nearby genes, the genomic consequence (intergenic, intronic, missense, regulatory), chromatin marks, eQTLs (GRASP_v2, GTEX_v6, BIOS, BLUEPRINT, STARNET, UCLA and HGVD), meQTLs (BLUEPRINT and ARIC) and pQTLs (KORA). We also indicate whether each SNP is included in the 95% credible set analysis and give the P-value of the Riviera-beta-analysis.

Supplementary Table 23: Relation of the lead and proxy SNPs (r² > 0.8) from 32 stroke risk loci with the best cis eQTL, meQTL and pQTL from various human bio-resources, grouped per tissue or cell type

Shown is the stroke subtype showing the most significant association; for meQTLs, CpG probe numbers are indicated in brackets after the gene name.

Supplementary Table 24: Biological candidate gene prioritization of 149 genes located in the 32 stroke associated risk loci

For each gene, we first list the biological score derived from 14 biological criteria and the overall score by including other biological information. All colored boxes have a value of 1; values of 0 signify no information or criteria not satisfied. For the genomic context, filled red boxes indicate that the criteria are satisfied. Filled blue boxes indicate significant QTL association (eQTL, gene expression; meQTL, methylation; pQTL, protein). Filled yellow boxes indicate overlap with H3K4me3, H3K9ac and H3K4me1 peaks in cell types that showed significant enrichment in epigwas analysis. Filled green boxes indicate significantly enriched pathways. Filled purple boxes indicate overlap with drug target genes (ATC-C: Cardiovascular; ATC-B01: Antithrombotic).

Supplementary Table 25: Results of the drug class enrichment analysis

Shown is the number of genes falling into the respective Anatomical Therapeutic Chemical (ATC) drug class together with the respective statistics for genome-wide loci (BF > 6) and suggestive loci (BF > 5) both with and without the SH2B3 locus.

Supplementary Table 27: Information on the SNPs selected for the wGRS analysis

Given are the related vascular traits from which the respective wGRS were derived, the marker name (rs_id), the risk/other allele and the beta used as weight for the wGRS approach.
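The columns listed for Supplementary Table 27 (marker, risk/other allele, beta) are the ingredients of a weighted genetic risk score. As a hedged illustration only, the sketch below shows the generic individual-level definition, the sum over SNPs of beta times risk-allele dosage, with allele flipping when a genotype file counts the other allele. The data layout, identifiers and values are hypothetical, and the published analysis may have worked from summary statistics rather than individual genotypes.

```python
def weighted_grs(weights, genotypes):
    """Weighted genetic risk score for one individual.

    weights:   SNP ID -> (risk_allele, other_allele, beta)
    genotypes: SNP ID -> (coded_allele, dosage), dosage in [0, 2]
    SNPs missing from `genotypes` are skipped.
    """
    score = 0.0
    for rsid, (risk, other, beta) in weights.items():
        if rsid not in genotypes:
            continue
        coded, dosage = genotypes[rsid]
        if coded == risk:
            risk_dosage = dosage
        elif coded == other:
            risk_dosage = 2.0 - dosage  # flip so that we count risk alleles
        else:
            raise ValueError(f"allele mismatch at {rsid}: {coded}")
        score += beta * risk_dosage
    return score


# Hypothetical weights and one individual's imputed dosages.
weights = {"rs1": ("A", "G", 0.10), "rs2": ("C", "T", 0.20)}
person = {"rs1": ("A", 2.0), "rs2": ("T", 1.5)}
print(weighted_grs(weights, person))  # 0.10 * 2.0 + 0.20 * 0.5
```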


Cite this article

Malik, R., Chauhan, G., Traylor, M. et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat Genet 50, 524–537 (2018).
