Imputation of behavioral candidate gene repeat variants in 486,551 publicly-available UK Biobank individuals


Some of the most widely studied variants in psychiatric genetics are variable number tandem repeats (VNTRs) in SLC6A3, DRD4, SLC6A4, and MAOA. While initial findings suggested large effects, their importance for psychiatric phenotypes remains much debated, with broadly conflicting results. Despite broad interest, these loci remain absent from the largest available samples, such as the UK Biobank, limiting researchers’ ability to test these contentious hypotheses rigorously. Here, using two independent reference datasets, we report out-of-sample imputation accuracy estimates of >0.96 for all four VNTRs and one modifying SNP, depending on the reference and target dataset. We describe the imputation procedures for these candidate variants in 486,551 UK Biobank individuals, and have made the imputed variant data available to UK Biobank researchers. This resource, provided to the scientific community, will allow the most rigorous tests to date of the roles of these variants in behavioral and psychiatric phenotypes.






We thank the participants of the FTP, CADD/GADD and UK Biobank studies. This work was supported by NIH R01MH100141 to MCK and the Institute for Behavioral Genetics. RB is supported by NIH T32MH016880. The FTP was supported by NICHD HD064687. CADD was supported by NIDA DA011015 and DA035804. GADD was supported by DA012845, DA035804, and DA021692. This work utilized the RMACC Summit supercomputer, which is supported by the National Science Foundation (awards ACI-1532235 and ACI-1532236), the University of Colorado Boulder, and Colorado State University.

Author information

Correspondence to Luke M. Evans.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

