Venous thromboembolism is a significant cause of mortality¹, yet its genetic determinants are incompletely defined. We performed a discovery genome-wide association study in the Million Veteran Program and UK Biobank, with testing of approximately 13 million DNA sequence variants for association with venous thromboembolism (26,066 cases and 624,053 controls) and meta-analyzed both studies, followed by independent replication with up to 17,672 venous thromboembolism cases and 167,295 controls. We identified 22 previously unknown loci, bringing the total number of venous thromboembolism-associated loci to 33, and subsequently fine-mapped these associations. We developed a genome-wide polygenic risk score for venous thromboembolism that identifies 5% of the population at an equivalent incident venous thromboembolism risk to carriers of the established factor V Leiden p.R506Q and prothrombin G20210A mutations. Our data provide mechanistic insights into the genetic epidemiology of venous thromboembolism and suggest a greater overlap among venous and arterial cardiovascular disease than previously thought.
The full summary-level association data from the MVP trans-ancestry VTE meta-analysis from this report are available on request through dbGAP, accession code no. phs001672.v2.p1. Data contributed by the CARDIoGRAMplusC4D investigators are available at http://www.CARDIOGRAMPLUSC4D.org/. Data on large artery stroke have been contributed by the MEGASTROKE investigators and are available at http://www.megastroke.org/. The genetic and phenotypic UK Biobank data are available on application to the UK Biobank.
Heit, J. A. Epidemiology of venous thromboembolism. Nat. Rev. Cardiol. 12, 464–474 (2015).
Bertina, R. M. et al. Mutation in blood coagulation factor V associated with resistance to activated protein C. Nature 369, 64–67 (1994).
Poort, S. R., Rosendaal, F. R., Reitsma, P. H. & Bertina, R. M. A common genetic variation in the 3′-untranslated region of the prothrombin gene is associated with elevated plasma prothrombin levels and an increase in venous thrombosis. Blood 88, 3698–3703 (1996).
Klarin, D., Emdin, C. A., Natarajan, P., Conrad, M. F. & Kathiresan, S. Genetic analysis of venous thromboembolism in UK biobank identifies the ZFPM2 locus and implicates obesity as a causal risk factor. Circ. Cardiovasc. Genet. 10, e001643 (2017).
Hinds, D. A. et al. Genome-wide association analysis of self-reported events in 6135 individuals and 252 827 controls identifies 8 loci associated with thrombosis. Hum. Mol. Genet. 25, 1867–1874 (2016).
Heit, J. A. et al. A genome-wide association study of venous thromboembolism identifies risk variants in chromosomes 1q24.2 and 9q. J. Thromb. Haemost. 10, 1521–1531 (2012).
Germain, M. et al. Meta-analysis of 65,734 individuals identifies TSPAN15 and SLC44A2 as two susceptibility loci for venous thromboembolism. Am. J. Hum. Genet. 96, 532–542 (2015).
Hernandez, W. et al. Novel genetic predictors of venous thromboembolism risk in African Americans. Blood 127, 1923–1929 (2016).
Tang, W. et al. A genome-wide association study for venous thromboembolism: the extended cohorts for heart and aging research in genomic epidemiology (CHARGE) consortium. Genet. Epidemiol. 37, 512–521 (2013).
Trégouët, D. A. et al. Common susceptibility alleles are unlikely to contribute as strongly as the FV and ABO loci to VTE risk: results from a GWAS approach. Blood 113, 5298–5303 (2009).
Collins, R. What makes UK Biobank special? Lancet 379, 1173–1174 (2012).
Gaziano, J. M. et al. Million veteran program: a mega-biobank to study genetic influences on health and disease. J. Clin. Epidemiol. 70, 214–223 (2016).
Lindstrom, S. et al. Genomic and transcriptomic association studies identify 16 novel susceptibility loci for venous thromboembolism. Blood https://doi.org/10.1182/blood.2019000435 (2019).
Glynn, R. J. et al. A randomized trial of rosuvastatin in the prevention of venous thromboembolism. N. Engl. J. Med. 360, 1851–1861 (2009).
Liu, D. J. et al. Exome-wide association study of plasma lipids in >300,000 individuals. Nat. Genet. 49, 1758–1766 (2017).
Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
Eitzman, D. T., Westrick, R. J., Nabel, E. G. & Ginsburg, D. Plasminogen activator inhibitor-1 and vitronectin promote vascular thrombosis in mice. Blood 95, 577–580 (2000).
Liu, B., Gloudemans, M. J., Rao, A. S., Ingelsson, E. & Montgomery, S. B. Abundant associations with gene expression complicate GWAS follow-up. Nat. Genet. 51, 768–769 (2019).
Sun, B. B. et al. Genomic atlas of the human plasma proteome. Nature 558, 73–79 (2018).
Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).
Fogo, A. B. Renal fibrosis: not just PAI-1 in the sky. J. Clin. Invest. 112, 326–328 (2003).
Pawlinski, R. & Mackman, N. Cellular sources of tissue factor in endotoxemia and sepsis. Thromb. Res. 125 (Suppl. 1), S70–S73 (2010).
Henke, P. K. et al. Deep vein thrombosis resolution is modulated by monocyte CXCR2-mediated activity in a mouse model. Arterioscler. Thromb. Vasc. Biol. 24, 1130–1137 (2004).
Obi, A. T. et al. Plasminogen activator-1 overexpression decreases experimental postthrombotic vein wall fibrosis by a non-vitronectin-dependent mechanism. J. Thromb. Haemost. 12, 1353–1363 (2014).
Wassel, C. L. et al. A genetic risk score comprising known venous thromboembolism loci is associated with chronic venous disease in a multi-ethnic cohort. Thromb. Res. 136, 966–973 (2015).
Ridker, P. M. et al. Rosuvastatin to prevent vascular events in men and women with elevated C-reactive protein. N. Engl. J. Med. 359, 2195–2207 (2008).
Mihaylova, B. et al. The effects of lowering LDL cholesterol with statin therapy in people at low risk of vascular disease: meta-analysis of individual data from 27 randomised trials. Lancet 380, 581–590 (2012).
Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
Bahl, V. et al. A validation study of a retrospective venous thromboembolism risk scoring method. Ann. Surg. 251, 344–350 (2010).
Anderson, F. A. Jr. & Spencer, F. A. Risk factors for venous thromboembolism. Circulation 107, I9–I16 (2003).
The Women’s Health Initiative Study Group. Design of the Women’s Health Initiative clinical trial and observational study. Control. Clin. Trials 19, 61–109 (1998).
Loh, P. R., Palamara, P. F. & Price, A. L. Fast and accurate long-range phasing in a UK Biobank cohort. Nat. Genet. 48, 811–816 (2016).
Howie, B., Fuchsberger, C., Stephens, M., Marchini, J. & Abecasis, G. R. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat. Genet. 44, 955–959 (2012).
Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
Klarin, D. et al. Genetics of blood lipids among ~300,000 multi-ethnic participants of the Million Veteran Program. Nat. Genet. 50, 1514–1523 (2018).
Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
Winkler, T. W. et al. Quality control and conduct of genome-wide association meta-analyses. Nat. Protoc. 9, 1192–1212 (2014).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Bellenguez, C., Strange, A., Freeman, C., Donnelly, P. & Spencer, C. C. A robust clustering algorithm for identifying problematic samples in genome-wide association studies. Bioinformatics 28, 134–135 (2012).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
Eitzman, D. T. et al. Bleomycin-induced pulmonary fibrosis in transgenic mice that either lack or overexpress the murine plasminogen activator inhibitor-1 gene. J. Clin. Invest. 97, 232–237 (1996).
Baldwin, J. F. et al. The role of urokinase plasminogen activator and plasmin activator inhibitor-1 on vein wall remodeling in experimental deep vein thrombosis. J. Vasc. Surg. 56, 1089–1097 (2012).
Wojcik, B. M. et al. Interleukin-6: a potential target for post-thrombotic syndrome. Ann. Vasc. Surg. 25, 229–239 (2011).
Diaz, J. A. et al. Critical review of mouse models of venous thrombosis. Arterioscler. Thromb. Vasc. Biol. 32, 556–562 (2012).
Obi, A. T. et al. Endotoxaemia-augmented murine venous thrombosis is dependent on TLR-4 and ICAM-1, and potentiated by neutropenia. Thromb. Haemost. 117, 339–348 (2017).
Henke, P. K. et al. Targeted deletion of CCR2 impairs deep vein thrombosis resolution in a mouse model. J. Immunol. 177, 3388–3397 (2006).
Laser, A. et al. Deletion of cysteine-cysteine receptor 7 promotes fibrotic injury in experimental post-thrombotic vein wall remodeling. Arterioscler. Thromb. Vasc. Biol. 34, 377–385 (2014).
Funding was received from the Department of Veterans Affairs Office of Research and Development, Million Veteran Program (grant no. MVP000). This publication does not represent the views of the Department of Veterans Affairs or the US Government. This research was also supported by three additional Department of Veterans Affairs awards (no. I01-01BX03340 to K.C. and P.W.; no. I01-BX003362 to P.T. and K.M.C.; and no. I01-CX001025 to P.W.) and used the resources and facilities at the VA Informatics and Computing Infrastructure (no. VA HSR RES 13-457). S.M.D. is supported by the Veterans Administration (no. IK2-CX001780). S.K. is supported by a Research Scholar award from the Massachusetts General Hospital, the Donovan Family Foundation and the National Institutes of Health (NIH) (no. R01HL127564). P.N. is supported by the NIH/National Heart, Lung, and Blood Institute (NHLBI) (nos. K08HL140203 and R01HL142711). D.T. was financially supported by the EPIDEMIOM-VTE Senior Chair from the Initiative of Excellence of the University of Bordeaux. C.K. is supported by the NIH (grant no. HL116854). Data on coronary artery disease have been contributed by the CARDIoGRAMplusC4D investigators. Data on large artery stroke have been contributed by the MEGASTROKE investigators. The MEGASTROKE project received funding from sources specified at http://www.megastroke.org/acknowledgements.html. The WHI program is funded by the NHLBI, NIH and the US Department of Health and Human Services (contract nos. HHSN268201600018C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C and HHSN268201600004C). For a list of all the investigators who have contributed to WHI science, see https://www.whi.org/researchers/Documents%20%20Write%20a%20Paper/WHI%20Investigator%20Long%20List.pdf. This research has been conducted using the UK Biobank resource, application no. 7089.
P.N. reports grant support from Amgen, Apple and Boston Scientific, and consulting income from Apple, all unrelated to the submitted work. S.K. reports grant support from Regeneron Pharmaceuticals and Bayer, grant support and personal fees from Aegerion Pharmaceuticals, personal fees from the Regeneron Genetics Center, Merck, Celera Corporation, Novartis, Bristol-Myers Squibb, Sanofi, AstraZeneca, Alnylam Pharmaceuticals, Eli Lilly and Company and Leerink Partners, personal fees and other support from Catabasis and other support from San Therapeutics outside the submitted work. He is also the chair of the scientific advisory board at Genomics plc and the Chief Executive Officer of Verve Therapeutics. S.D. reports grants to his institution in the last 3 years outside the submitted work: AbbVie Inc.; Anolinx LLC; Astellas Pharma Inc.; AstraZeneca Pharmaceuticals LP; Boehringer Ingelheim International GmbH; Celgene Corporation; Eli Lilly and Company; Genentech, Inc.; Genomic Health, Inc.; Gilead Sciences, Inc.; GlaxoSmithKline; Innocrin Pharmaceuticals Inc.; Janssen Pharmaceuticals; Kantar Health; Myriad Genetic Laboratories, Inc.; Novartis International AG; and Parexel International Corporation. C.K. reports grants to his institution from Janssen Pharmaceuticals, Diagnostica Stago and Siemens Healthcare Diagnostics for research related to VTE, but not related to the current work. J.C. is now with the US Food and Drug Administration.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The primary analysis consisted of a genome-wide association study to identify novel VTE risk variants. Secondary analyses included: an analysis of VTE and atherosclerosis overlap; a fine-mapping analysis, colocalization analysis and functional analysis of PAI-1 using trans-ethnic summary statistics, pQTL data and murine models, respectively; a closer examination of autosomal VTE risk variants through PheWAS; generation and analysis of a 297-variant VTE polygenic risk score; and a Mendelian randomization analysis of blood lipids and VTE. Abbreviations: PAI-1, Plasminogen Activator Inhibitor-1; BMI, Body-Mass Index; CAD, Coronary Artery Disease; GLGC, Global Lipids Genetics Consortium; GTEx, Genotype-Tissue Expression Project; LAS, Large Artery Stroke; MVP, Million Veteran Program; PAD, Peripheral Artery Disease; PheWAS, Phenome-wide Association Study; VTE, Venous Thromboembolism; WHI, Women’s Health Initiative.
Supplementary Fig. 2 Quantile-quantile plot for the discovery VTE GWAS in MVP (N = 11,844 VTE cases and 251,951 controls).
The expected logistic regression association P values versus the observed distribution of P values for VTE association (Wald statistic) are displayed. Quantile-quantile plots were inspected for ancestry-specific analyses in MVP (European/African/Hispanic) and genomic control values were < 1.05 for each racial group (data not shown). No systemic inflation was observed (λGC = 1.04). All P values were two-sided. Abbreviations: GWAS, Genome-wide Association Study; MVP, Million Veteran Program; VTE, Venous Thromboembolism.
Supplementary Fig. 3 Quantile-quantile plot for the discovery VTE GWAS in UK Biobank (N = 14,222 VTE cases and 372,102 controls).
The expected logistic regression association P values versus the observed distribution of P values for VTE association (Wald statistic) are displayed. No systemic inflation was observed (λGC = 1.08). All P values were two-sided. Abbreviations: GWAS, Genome-wide Association Study; VTE, Venous Thromboembolism.
Supplementary Fig. 4 Quantile-quantile plot for the trans-ethnic VTE GWAS meta-analysis in MVP and UK Biobank (N = 26,066 VTE cases and 624,053 controls).
The expected logistic regression association P values versus the observed distribution of P values for VTE association (Wald statistic) are displayed. No systemic inflation was observed (λGC = 1.06). All P values were two-sided. In a linkage disequilibrium (LD) score regression analysis restricted to Europeans (N = 23,151 VTE cases and 553,439 controls), the LD score intercept was 1.02, suggesting that nearly all the inflation in test statistics is due to genuine polygenicity of VTE as a trait. Abbreviations: GWAS, Genome-wide Association Study; VTE, Venous Thromboembolism.
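The genomic control values quoted in these quantile-quantile captions are conventionally computed as the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null. The following is a minimal sketch of that conventional definition only; it is not the authors' pipeline, and the function name `genomic_inflation` and the constant name are hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom
# (the expected median test statistic when there is no inflation).
CHI2_1DF_MEDIAN = 0.4549364231195724


def genomic_inflation(p_values):
    """Return lambda_GC for two-sided P values, each in the interval (0, 1].

    Hypothetical helper for illustration; assumes each P value comes from
    a 1-degree-of-freedom test such as a Wald test on one variant.
    """
    std_normal = NormalDist()
    # A two-sided P value p corresponds to z = Phi^-1(p / 2);
    # z squared is the equivalent chi-square statistic with 1 df.
    chi2_stats = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced P values mimic a well-calibrated null distribution.
    n = 1001
    null_p = [(i + 0.5) / n for i in range(n)]
    print(round(genomic_inflation(null_p), 3))
```

Values near 1 indicate little inflation. As the caption notes, a value modestly above 1 can reflect genuine polygenicity rather than confounding, which is what the LD score regression intercept is used to distinguish.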
Supplementary Fig. 5 Manhattan plot for the trans-ethnic VTE GWAS (N = 26,066 VTE cases and 624,053 controls).
Plot of -log10(P) for association (logistic regression Wald statistic) of genotyped and imputed variants by chromosomal position (alternating blue and yellow) for all autosomal polymorphisms analyzed in the UK Biobank and MVP VTE GWAS meta-analysis. Logistic regression two-sided P values are displayed. Abbreviations: GWAS, Genome-Wide Association Study; VTE, Venous Thromboembolism.
Supplementary Fig. 6 LocusCompare visualization of colocalization between ZFPM2 VTE GWAS and PAI-1 pQTL signals.
Colocalization between the ZFPM2 locus in the VTE GWAS (N = 23,151 VTE cases, 553,439 controls) and PAI-1 human plasma pQTL (N = 3,301) signals. The pQTL P values were derived from the plasma samples of 3,301 participants of the INTERVAL study¹⁹ based on a linear regression model. The GWAS P values were derived from a logistic regression model (Wald statistic) and meta-analysis from the current study. Two-sided values of P are displayed.
Using linkage disequilibrium score regression, a stronger positive genetic correlation was observed across the genome between VTE [N = 14,222 cases; 372,102 controls] and PAD [N = 24,009 cases; 150,983 controls] (rg = 0.47, P = 2.0 × 10⁻¹⁵) than between VTE and CAD [N = 60,801 cases; 123,504 controls] (rg = 0.27, P = 1.2 × 10⁻⁷) or VTE and LAS [N = 6,688 cases; 454,450 controls] (rg = 0.35, P = 0.02). All values of P are two-sided; genetic correlations with associated standard errors are displayed. Abbreviations: VTE, Venous Thromboembolism; CAD, Coronary Artery Disease; LAS, Large Artery Stroke; PAD, Peripheral Artery Disease.
Klarin, D., Busenkell, E., Judy, R. et al. Genome-wide association analysis of venous thromboembolism identifies new risk loci and genetic overlap with arterial vascular disease. Nat Genet 51, 1574–1579 (2019) doi:10.1038/s41588-019-0519-3