Genetic variation in CADM2 as a link between psychological traits and obesity

CADM2 has been associated with a range of behavioural and metabolic traits, including physical activity, risk-taking, educational attainment, alcohol and cannabis use and obesity. Here, we set out to determine whether CADM2 contributes to mechanisms shared between mental and physical health disorders. We assessed genetic variants in the CADM2 locus for association with phenotypes in the UK Biobank, IMPROVE, PROCARDIS and SCARFSHEEP studies, before performing meta-analyses. A wide range of metabolic phenotypes were meta-analysed. Psychological phenotypes analysed in UK Biobank only were major depressive disorder, generalised anxiety disorder, bipolar disorder, neuroticism, mood instability and risk-taking behaviour. In UK Biobank, four, 88 and 172 genetic variants were significantly (p < 1 × 10−5) associated with neuroticism, mood instability and risk-taking respectively. In meta-analyses of 4 cohorts, we identified 362, 63 and 11 genetic variants significantly (p < 1 × 10−5) associated with BMI, SBP and CRP respectively. Genetic effects on BMI, CRP and risk-taking were all positively correlated, and were consistently inversely correlated with genetic effects on SBP, mood instability and neuroticism. Conditional analyses suggested an overlap in the signals for physical and psychological traits. Many significant variants had genotype-specific effects on CADM2 expression levels in adult brain and adipose tissues. CADM2 variants influence a wide range of both psychological and metabolic traits, suggesting common biological mechanisms across phenotypes via regulation of CADM2 expression levels in adipose tissue. Functional studies of CADM2 are required to fully understand mechanisms connecting mental and physical health conditions.

health behaviours, hormone dysregulation and shared genetic risk factors 1,2 . A number of potential shared pathways between mood disorders and cardiometabolic disease have been suggested, including abnormal circadian rhythms, hypothalamic-pituitary-adrenal (HPA) axis dysfunction and inflammation. However, the molecular mechanisms of these pathways are poorly understood.
Single nucleotide polymorphisms (SNPs) in the locus encoding the synaptic cell adhesion molecule 2 (CADM2) on chromosome 3 have been associated with a number of psychological traits, including educational attainment 3 , alcohol consumption 4 , cannabis use 5 , physical activity habits 6 , risk-taking behaviour 7,8 , attention-deficit/hyperactivity disorder 9 and obesity 10 . Several lines of evidence point to CADM2 being the gene through which SNPs are having their effects, including genotype-specific effects on CADM2 mRNA expression levels 7,8 , CADM2 being predominantly expressed in the brain, and cadm2 knockout models demonstrating relevant phenotypes. Specifically, cadm2-knockout mice have reduced adiposity, reduced systemic glucose levels, improved insulin sensitivity, increased locomotor activity, increased energy expenditure rate and raised core body temperature, suggesting an important role in systemic energy homeostasis 11 .
We set out to systematically evaluate the relationship between CADM2 SNPs and psychological and physical traits, and assess whether there is evidence for distinct signals influencing metabolic versus psychological traits.

Materials and Methods
CADM2 locus. We defined the CADM2 locus as the CADM2 gene plus 250 kb upstream and downstream (Chromosome 3:84758000-86374000, UCSC genome browser, https://genome-euro.ucsc.edu/).
Study cohorts. High CVD risk population: IMPROVE is a cohort of individuals with no symptoms or history of cardiovascular disease, but with at least three classic risk factors (namely any combination of the following: family history of CVD, type 2 diabetes, smoking, hypertension, dyslipidaemia, male sex or women at least 5 years post-menopause) 12 . In brief, 3,711 participants were recruited from 7 centres across 5 European countries (Finland, Sweden, the Netherlands, France and Italy) between January 2004 and June 2005. Participants completed a structured medical history and lifestyle questionnaire at baseline, as well as standard biochemical tests and genotyping. Ethics committee approval was granted by the Institutional review board (IRB) at each recruitment centre: Karolinska Institutet, Stockholm, Sweden; University of Milan, Milan, Italy; University of Kuopio and Kuopio Research Institute of Exercise Medicine, Kuopio, Finland; University Hospital Groningen, Groningen, The Netherlands; University of Perugia, Perugia, Italy; Groupe Hôpital Pitie-Salpetriere, Paris, France. Informed consent was provided by all participants. The study was conducted in accordance with the Helsinki Declaration.
Young CVD case-control cohort: SCARFSHEEP is a case-control cohort of Swedish participants (N = 2,513) recruited in Stockholm 13,14 . Cases were those with a first myocardial infarction before 60 years of age. Controls were age- and sex-matched from the general population of the same county. Standard biochemical phenotyping was available for all participants. Approval was granted by the Karolinska Hospital and Karolinska Institutet Ethics Committees (for SCARF and SHEEP respectively). Informed consent was provided by all participants. The study was conducted in accordance with the Helsinki Declaration.
CVD case-control cohort: PROCARDIS is a case-control cohort 15 , where cases (n = 5,688) were diagnosed with coronary artery disease before 66 years and controls (n = 2,310) are unrelated participants without coronary artery disease at 66 years. Participants were recruited from 4 centres across 4 European countries (Sweden, the UK, Germany and Italy). Participants completed a questionnaire at baseline. Standard biochemical phenotyping was available for all participants. Ethics Committee approval was granted by the IRB at each recruitment centre: the Regional Ethics Review Board at Karolinska Institutet, Stockholm in Sweden, the IRB at the University of Munster, Munster, in Germany, the IRB at the Mario Negri Institute, Milano in Italy and the IRB at the University of Oxford, Oxford, United Kingdom. Informed consent was provided by all participants. The study was conducted in accordance with the Helsinki Declaration.
General population cohort: UK Biobank is a cohort of over 500,000 participants aged 40-69 at baseline 16 17 at the SNP&SEQ Technology Platform in Uppsala. For both cohorts, imputation to the 1000 Genomes reference panel was conducted according to standard protocols, as described previously 19 .
PROCARDIS participants were genotyped at the Centre National du Genotypage, Paris and the SNP&SEQ Technology Platform in Uppsala, using the Illumina 1 M and 610 K arrays. Imputation to the 1000 Genomes panel was conducted according to standard protocols, as described 20 .
UK Biobank participants were genotyped using either the Affymetrix UK Biobank Axiom or the Affymetrix BiLEVE Axiom array 16 . A modified version of SHAPEIT2 was used for phasing and IMPUTE2 for imputation. The data from UK Biobank was released in two phases. The UK Biobank was imputed to the 1000 Genomes, UK10K haplotype (first release) and Haplotype Reference Consortium (merged with the first release for the second release) reference panels 21 .
We applied standard quality control procedures to all cohorts, including SNP exclusion for low call rate (<95%), low minor allele frequency (MAF < 1%), deviation from Hardy-Weinberg equilibrium (p < 5 × 10−6) or imputation quality score <0·4 and subject exclusion for sex mismatch, cryptic relatedness, low call rate (<95%) and non-Caucasian ancestry (self-reported or based on principal component analysis). For UK Biobank, further exclusions based on relatedness were applied (one of each pair of individuals with a KING-estimated kinship coefficient >0·0442 was randomly removed). After quality control, 5,684, 2,786, 5,452 and 5,361 SNPs were available for IMPROVE, SCARFSHEEP, PROCARDIS and UK Biobank respectively. In total, 2,123 SNPs were available in all four cohorts, with 2,133 overlapping between the three CAD case-control cohorts and 2,434 overlapping in the three cohorts with biomarker data.
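As an illustration of the SNP-level exclusion criteria, the following minimal Python sketch applies the four stated thresholds to per-SNP summary statistics. The `SnpStats` record, its field names and the example values are hypothetical and chosen only for illustration; the quality control in this study was performed with standard genetics software, not with this code.

```python
from dataclasses import dataclass
from typing import List

# Exclusion thresholds quoted in the text.
MIN_CALL_RATE = 0.95  # exclude SNPs with call rate < 95%
MIN_MAF = 0.01        # exclude SNPs with minor allele frequency < 1%
MIN_HWE_P = 5e-6      # exclude SNPs with Hardy-Weinberg p < 5e-6
MIN_INFO = 0.4        # exclude SNPs with imputation quality score < 0.4


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary record (not a real file format)."""
    snp_id: str
    call_rate: float
    maf: float
    hwe_p: float
    info: float


def passes_qc(snp: SnpStats) -> bool:
    """Return True if the SNP survives all four exclusion criteria."""
    return (
        snp.call_rate >= MIN_CALL_RATE
        and snp.maf >= MIN_MAF
        and snp.hwe_p >= MIN_HWE_P
        and snp.info >= MIN_INFO
    )


def filter_snps(snps: List[SnpStats]) -> List[SnpStats]:
    """Keep only the SNPs that pass quality control."""
    return [s for s in snps if passes_qc(s)]


# Made-up values for illustration only.
example = [
    SnpStats("snp_ok", 0.99, 0.20, 0.50, 0.90),
    SnpStats("snp_rare", 0.99, 0.005, 0.50, 0.90),
    SnpStats("snp_hwe", 0.99, 0.20, 1e-8, 0.90),
    SnpStats("snp_poor_info", 0.99, 0.20, 0.50, 0.30),
]
print([s.snp_id for s in filter_snps(example)])
```

Subject-level exclusions (sex mismatch, relatedness, ancestry) are not covered by this sketch.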
Phenotypes. Psychiatric and psychological phenotypes were only available in the UK Biobank. The baseline questionnaire included questions to assess mood instability ("Does your mood often go up and down?", variable #1920) and risk-taking behaviour ("Would you describe yourself as someone who takes risks?", variable #2040). Single-item questions are imperfect ways to measure psychological traits; however, the validity of the questions used here has been demonstrated relative to more detailed phenotyping (at least for risk-taking) 22 and in terms of the expected associations with psychiatric disorders (for mood instability 23 and risk-taking 7,8,24,25 ). Neuroticism was assessed using the Eysenck Personality Questionnaire (Revised Short Form), where 12 yes/no questions were asked. These were summed, resulting in a score between one and 12 for each individual 26 . Phenotyping in relation to psychiatric disorders was based upon the online "Thoughts and Feelings" questionnaire 27 , which requested information on lifetime symptoms of mental disorders. This enabled classification of likely major depressive disorder (MDD), bipolar disorder (BD), generalised anxiety disorder (GAD) and addiction.
Anthropometric and blood pressure phenotypes (BMI, waist and hip circumferences, SBP and DBP) were assessed in a standardised and comparable manner. Waist to hip circumference ratio adjusted for BMI (WHRadjBMI) was calculated as per Shungin et al. 28 . For those on anti-hypertensive medication, values of SBP and DBP were adjusted, with 15 and 10 mmHg, respectively, being added prior to analysis 29 . Current smoking was assessed by questionnaire in all cohorts.
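The medication adjustment for blood pressure can be sketched as follows. The function name and argument names are hypothetical; only the offsets of 15 mmHg (SBP) and 10 mmHg (DBP) are taken from the text.

```python
from typing import Tuple

SBP_OFFSET_MMHG = 15.0  # added to SBP for participants on anti-hypertensive medication
DBP_OFFSET_MMHG = 10.0  # added to DBP for participants on anti-hypertensive medication


def adjust_blood_pressure(
    sbp: float, dbp: float, on_medication: bool
) -> Tuple[float, float]:
    """Return (SBP, DBP) in mmHg after the medication adjustment."""
    if on_medication:
        return sbp + SBP_OFFSET_MMHG, dbp + DBP_OFFSET_MMHG
    return sbp, dbp


# Illustrative values only.
print(adjust_blood_pressure(130.0, 80.0, on_medication=True))
print(adjust_blood_pressure(130.0, 80.0, on_medication=False))
```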
Metabolic parameters were available in IMPROVE, SCARFSHEEP and PROCARDIS, where fasting glucose, lipid (HDL, LDL and TG) and CRP levels were measured using standard methodology at the Department of Clinical Chemistry, Karolinska University Hospital. Fasting insulin levels were measured by radio-immunoassay 30,31 . HOMA indices were calculated from fasting glucose and insulin levels as described 32 . Type 2 diabetes was defined as diagnosis, medication and/or fasting glucose levels ≥7 mmol/L for IMPROVE, SCARFSHEEP and PROCARDIS. The definition of T2D in UK Biobank has been described 33 , and is generally comparable with the assessment used for the other cohorts.
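For orientation, the sketch below shows how HOMA indices and the glucose-based component of the type 2 diabetes definition could be computed. It assumes the widely used original HOMA formulas (HOMA-IR = glucose × insulin / 22.5 and HOMA-B = 20 × insulin / (glucose − 3.5), with glucose in mmol/L and insulin in µU/mL); the text does not state which HOMA variant was used, so this is an assumption, and all function names are hypothetical.

```python
T2D_FASTING_GLUCOSE_MMOL_L = 7.0  # threshold stated in the text


def homa_ir(glucose_mmol_l: float, insulin_uu_ml: float) -> float:
    """Insulin resistance index (assumed original HOMA formula)."""
    return glucose_mmol_l * insulin_uu_ml / 22.5


def homa_b(glucose_mmol_l: float, insulin_uu_ml: float) -> float:
    """Beta-cell function index (assumed original HOMA formula)."""
    if glucose_mmol_l <= 3.5:
        raise ValueError("HOMA-B is undefined for glucose <= 3.5 mmol/L")
    return 20.0 * insulin_uu_ml / (glucose_mmol_l - 3.5)


def has_type2_diabetes(
    diagnosed: bool, on_medication: bool, fasting_glucose_mmol_l: float
) -> bool:
    """Diagnosis, medication and/or fasting glucose >= 7 mmol/L."""
    return (
        diagnosed
        or on_medication
        or fasting_glucose_mmol_l >= T2D_FASTING_GLUCOSE_MMOL_L
    )


# Illustrative values only.
print(homa_ir(4.5, 5.0), homa_b(5.5, 10.0))
print(has_type2_diabetes(False, False, 7.2))
```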
Coronary vascular disease (CVD) was defined as clinically diagnosed myocardial infarction, symptomatic acute coronary syndrome, angina or coronary artery revascularisation before the age of 66 years for PROCARDIS 15 . The criterion for inclusion as a CVD case in the SCARFSHEEP study was a clinical diagnosis of myocardial infarction 34 . For UK Biobank, CVD was defined as clinical diagnosis of heart attack/myocardial infarction or angina (variable #6150).
Statistical analyses. All continuous phenotypes were assessed for normality and, where necessary, were natural log transformed prior to analysis. For each cohort, phenotypes were analysed in PLINK 1.07 using linear or logistic regression (for continuous vs. binary traits respectively), assuming additive allelic effects. With the exception of WHRadjBMI, all models included age, sex and population structure (3 principal components for PROCARDIS, SCARFSHEEP and IMPROVE, 8 for UK Biobank), with further adjustment for genotyping chip being applied for UK Biobank and PROCARDIS analyses. For analysis of lipid traits, lipid-lowering medication and CVD case-control status were included as covariates. For glucometabolic traits, individuals with type 2 diabetes were excluded and CVD case-control status was included as a covariate.
Results from the individual studies were combined in inverse variance-weighted meta-analyses using METAL 35 (with binary effect sizes being analysed as Beta coefficients). Inverse variance-weighted meta-analysis was chosen, as the phenotype measurements and data transformation were comparable (including consistent units) between studies. Averages and standard errors of allele frequencies were computed. No additional filters were applied. Supplementary Table 1 summarises the phenotypes analysed in each cohort, covariates used and total sample number in the meta-analyses. Only SNPs present in at least 3 cohorts (of the 3 or 4 cohorts with data for a given phenotype) were considered. Despite the prior knowledge implicating this locus in mental and physical health traits, we used a conservative approach, with genome-wide significance being set at p < 5 × 10−8 and suggestive evidence of association being set at p < 1 × 10−5. LocusZoom was used to visualise the results 36 .
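The principle of a fixed-effect inverse variance-weighted meta-analysis for a single SNP can be sketched as below. This is a simplified illustration rather than a reproduction of METAL: the function and class names are hypothetical, the p-value uses a two-sided normal approximation, and heterogeneity is summarised with the conventional I2 statistic derived from Cochran's Q. The two significance thresholds are those stated in the text.

```python
import math
from typing import NamedTuple, Sequence

GENOME_WIDE_P = 5e-8  # genome-wide significance threshold from the text
SUGGESTIVE_P = 1e-5   # suggestive threshold from the text


class MetaResult(NamedTuple):
    beta: float  # pooled effect size
    se: float    # pooled standard error
    p: float     # two-sided p-value (normal approximation)
    i2: float    # heterogeneity I2, in percent


def inverse_variance_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> MetaResult:
    """Pool per-cohort effects, weighting each by 1 / SE^2."""
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))
    # Cochran's Q and the derived I2 statistic.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return MetaResult(beta, se, p, i2)


def classify(p: float) -> str:
    """Label a p-value using the two thresholds above."""
    if p < GENOME_WIDE_P:
        return "genome-wide"
    if p < SUGGESTIVE_P:
        return "suggestive"
    return "not significant"


# Made-up per-cohort estimates for illustration only.
consistent = inverse_variance_meta([0.1, 0.1, 0.1], [0.02, 0.02, 0.02])
discordant = inverse_variance_meta([0.0, 0.2], [0.01, 0.01])
print(classify(consistent.p), consistent.i2)
print(classify(discordant.p), discordant.i2)
```

Identical per-cohort estimates give I2 = 0, whereas strongly discordant estimates give an I2 close to 100%.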
Genetic architecture. In order to determine whether the different traits have distinct signals, or whether there is one signal influencing all traits, two approaches were used. Firstly, SNPs meeting suggestive or genome-wide significance thresholds for at least one phenotype (candidate SNPs) were identified and linkage disequilibrium (LD) was assessed. For analysis of genetic architecture, a random subset of 1000 unrelated white British participants from the UK Biobank was selected. In this subset of UK Biobank, candidate SNPs were filtered to leave only independent SNPs, using PLINK (independent pairwise selection with default settings, including LD r2 threshold 0·5). LD between the independent SNPs and lead/index SNPs was calculated and visualised using Haploview 37 .
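The idea behind LD-based filtering can be illustrated with a deliberately simplified sketch: r2 is computed as the squared Pearson correlation between genotype dosages (0, 1 or 2 copies of an allele), and SNPs are retained greedily if their r2 with every SNP already retained does not exceed 0·5. This is not PLINK's windowed algorithm, and the function names and toy genotypes are hypothetical.

```python
from typing import Dict, List, Sequence


def r_squared(g1: Sequence[float], g2: Sequence[float]) -> float:
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(g1)
    mean1 = sum(g1) / n
    mean2 = sum(g2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var1 * var2)


def greedy_prune(
    genotypes: Dict[str, Sequence[float]], threshold: float = 0.5
) -> List[str]:
    """Keep a SNP only if its r2 with every kept SNP is <= threshold."""
    kept: List[str] = []
    for snp_id, dosages in genotypes.items():
        if all(r_squared(dosages, genotypes[k]) <= threshold for k in kept):
            kept.append(snp_id)
    return kept


# Toy genotypes for six individuals (illustration only).
toy = {
    "snp_a": [0, 1, 2, 0, 1, 2],
    "snp_b": [0, 1, 2, 0, 1, 2],  # identical to snp_a, so r2 = 1
    "snp_c": [0, 0, 1, 2, 2, 1],  # uncorrelated with snp_a, so r2 = 0
}
print(greedy_prune(toy))
```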
Secondly, conditional analyses were performed to further examine the possibility of multiple signals in the CADM2 locus. Here, the risk-taking and BMI analyses were repeated, with the index SNP (coded as an additive genetic effect, namely 0 for common homozygote, 1 for heterozygotes and 2 for rare homozygotes) from each other phenotype in turn included as a covariate.
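The logic of a conditional analysis can be sketched with ordinary least squares: the trait is regressed on the SNP of interest alone, and then again with the conditioning (index) SNP added as a covariate. The example below is a toy illustration with made-up genotypes and a noise-free continuous trait driven entirely by the index SNP; it uses a small hand-written solver with hypothetical function names, omits the other covariates (age, sex, principal components, chip) and uses linear rather than logistic regression, so it is not the procedure run in PLINK.

```python
from typing import List, Sequence


def ols(predictors: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares fit of y on an intercept plus the given predictors.

    Returns [intercept, beta_1, beta_2, ...] by solving the normal equations
    with Gauss-Jordan elimination and partial pivoting.
    """
    n = len(y)
    rows = [[1.0] + [float(col[i]) for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for c in range(k):
        pivot_row = max(range(c, k), key=lambda r: abs(a[r][c]))
        if abs(a[pivot_row][c]) < 1e-12:
            raise ValueError("predictors are collinear")
        a[c], a[pivot_row] = a[pivot_row], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(k):
            if r != c:
                factor = a[r][c]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[c])]
    return [a[i][k] for i in range(k)]


# Additive coding: 0 = common homozygote, 1 = heterozygote, 2 = rare homozygote.
index_snp = [0, 0, 1, 1, 2, 2, 0, 1]
other_snp = [0, 1, 1, 1, 2, 1, 0, 2]  # partly correlated with index_snp
trait = [2.0 * g for g in index_snp]  # trait driven only by index_snp

marginal_beta = ols([other_snp], trait)[1]
conditional_beta = ols([other_snp, index_snp], trait)[1]
print(round(marginal_beta, 6), round(conditional_beta, 6))
```

In this single-signal toy case the effect of the second SNP is removed by conditioning on the index SNP; an independent signal would instead retain a similar effect size after conditioning.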
www.nature.com/scientificreports/
Data-mining. The GWAS catalogue (https://www.ebi.ac.uk/gwas/, accessed 2018-09-04, 3:84,758,000-86,374,001) was used to identify CADM2 locus SNPs previously associated with relevant phenotypes (specifically cardio-metabolic and psychiatric disorder-related traits). All SNPs in the CADM2 locus with suggestive or genome-wide evidence for association with at least one phenotype were assessed for predicted functional effects using the Variant Effect Predictor 38 . For lead and index SNPs, the GTEx portal 39 was queried to identify genotype-specific gene expression patterns (expression quantitative trait loci (eQTLs)).

Results
The cohort characteristics are presented in Table 1 and the phenotypes assessed are presented in Table 2.
Meta-analysis of cardiovascular and metabolic phenotypes. Only SNPs present in at least 3 of the 4 cohorts were considered. It is worth noting that the heterogeneity I2 value was high for many SNPs in the meta-analysis of UK Biobank, IMPROVE, PROCARDIS and SCARFSHEEP. This could be due to differences in participant selection (UK vs European, or population-based vs case-control). Therefore we present results for both a lead SNP (defined as the SNP with the lowest p-value) and an index SNP (defined as the SNP with the lowest p-value with heterogeneity I2 = 0), but for robustness we focus on the index SNPs.
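The distinction between lead and index SNPs can be expressed as a short selection rule. In the sketch below the record layout, function name and values are hypothetical; only the two definitions (lowest p-value overall, and lowest p-value among SNPs with I2 = 0) come from the text.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, object]


def lead_and_index(results: List[Record]) -> Tuple[Record, Optional[Record]]:
    """Return (lead SNP, index SNP) from per-SNP meta-analysis results.

    lead  = SNP with the lowest p-value
    index = SNP with the lowest p-value among those with I2 = 0
            (None if no SNP has I2 = 0)
    """
    lead = min(results, key=lambda r: r["p"])
    homogeneous = [r for r in results if r["i2"] == 0]
    index = min(homogeneous, key=lambda r: r["p"]) if homogeneous else None
    return lead, index


# Made-up results for illustration only.
toy_results = [
    {"snp": "snp_a", "p": 1e-9, "i2": 62.0},
    {"snp": "snp_b", "p": 3e-8, "i2": 0.0},
    {"snp": "snp_c", "p": 2e-6, "i2": 0.0},
]
lead, index = lead_and_index(toy_results)
print(lead["snp"], index["snp"])
```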
No associations were observed for WHRadjBMI, DBP, T2D or current smoking (Supplementary Fig. 1A-D). For CVD analysis, SNPs were only considered if they were present in all three cohorts. None met the threshold for suggestive significance (Supplementary Fig. 1E).
Cross-trait observations. A total of 49 SNPs demonstrated at least suggestive associations with multiple phenotypes (Supplementary Table 2). This observation demonstrates, firstly that the same SNPs influence both metabolic and psychological traits, and secondly that effects on risk-taking, BMI and CRP were positively correlated and these were inversely correlated with effects on neuroticism, mood instability and SBP.
Genetic architecture of CADM2. In order to determine whether the associations with psychological and physical traits reflect the same or distinct signals, two approaches were used. Firstly, filtering the 1,533 candidate SNPs (SNPs meeting suggestive significance of association with any phenotype) in the 1.62 Mb CADM2 locus by LD gave 75 independent loci (Fig. 2A). The index SNPs for neuroticism (rs818219) and mood instability (rs818225) are in perfect LD and therefore represent the same signal, whereas LD between index SNPs for other traits (BMI, rs11708632; SBP, rs6803322; CRP, rs11708024; risk-taking, rs4856591) is low (maximum r2 = 0·37, Fig. 2B), which could indicate independent signals for each of the other phenotypes.
Secondly, conditional analysis using the index SNPs (BMI, rs11708632; SBP, rs6803322; CRP, rs11708024; risk-taking, rs4856591; neuroticism, rs818219; mood instability, rs818225) was performed, using the UK Biobank data. If there is only one signal in the locus, then adjusting for the index SNP would remove the effects/significance of other SNPs in the locus. Alternatively, if there are additional signals in the locus which are independent of the index SNP, then adjusting for the index SNP would have little or no impact on the independent signal in terms of effect size or significance. The risk-taking results demonstrated that inclusion of index SNPs had some effect on the p-value of the risk-taking signal (Fig. 3), but the effect size was stable (primary risk-taking analysis Beta = 0.056, conditional Betas = 0.054-0.057), which concurs with the LD analysis suggestion that the signals are independent. The BMI results were similar, but the effect size was less stable (primary BMI analysis Beta = −0.099, conditional Betas = −0.087 to −0.101) and the plots support the possibility of more than one signal in this region (Supplementary Fig. 4).
eQTL analysis. Firstly, SNPs with genotype-specific effects on CADM2 mRNA levels were identified using the GTEx portal (Fig. 4). The location of the eQTLs appears to be tissue specific (Fig. 4): in subcutaneous adipose tissue, eQTLs are located upstream and centrally relative to the gene location, whereas those in visceral adipose tissue are restricted to the central part of the gene (Fig. 4A,B). In heart (left ventricle) tissue and skeletal muscle, eQTLs are preferentially (but not exclusively) located centrally, whereas eQTLs in lung tissue are mainly upstream (Fig. 4D,F). In contrast, there are no eQTLs for CADM2 in the brain (cerebellum); instead, the eQTLs in this brain region are for CADM2-AS1 (Fig. 4C).
These findings suggest the potential for differential regulation of CADM2 levels across a range of tissues, which could explain how different SNPs within the locus can influence a variety of traits.
Secondly, 1747 SNPs with eQTL effects on CADM2 and CADM2-AS1 were identified and are presented in Supplementary Table 3. No eQTLs for CADM2-AS2 were identified. For CADM2 there were 3702 eQTLs (each SNP can have eQTL effects in more than one tissue), of which 41% were in adipose tissue (subcutaneous or visceral) and 5% in brain tissues. For CADM2-AS1, 1628 eQTLs were identified, of which 90% were in brain tissues and none were in adipose tissues. Finally, all of the index SNPs available in GTEx (rs11708024 was not available) demonstrated eQTLs for CADM2 and none were eQTLs for CADM2-AS1 (Table 4). Expression levels of CADM2 are highest in the brain (Fig. 5A). However, it was interesting to note that the trait-increasing alleles of rs11708632 (BMI) and rs4856591 (risk-taking) were associated with increased CADM2 expression levels in adipose tissue (Fig. 5B,C). Consistent with their inverse correlations with BMI and risk-taking, the trait-decreasing alleles of rs818225 (mood instability) and rs818219 (neuroticism) were associated with increased CADM2 expression in adipose tissue (Fig. 5D,E).

Data-mining.
None of the index SNPs have previously been associated with any traits in the GWAS catalogue. Additionally, none of the candidate SNPs were predicted by Variant Effect Predictor to have more than low or modifier impact.
SNPs in the CADM2 locus which have previously been associated with psychological or metabolic traits are presented in Supplementary Table 4. Where comparison was possible, SNPs with reported effects on measures of obesity demonstrated consistent effect directions in our study compared to those previously reported 10,28,40,41 , with one exception: rs12495178 has the opposite effect direction in the Japanese population 42 compared to our study. It is noteworthy that SNPs associated with educational attainment or intelligence 3,43 were associated with increased BMI, which is somewhat surprising, but consistent with effects on risk-taking behaviour 7,8 . Perhaps unsurprisingly, rs62253088-T (associated with strenuous exercise 6 ) had positive associations with risk-taking and negative associations with neuroticism. Other associations with physical activity habits 6 and alcohol consumption 4 demonstrate less consistent associations with risk-taking.

Discussion
We identified novel associations between CADM2 genetic variants and SBP, CRP levels, neuroticism and mood instability, and have highlighted a possible link between SNPs associated with psychological traits and adiposity via CADM2 expression levels in adipose tissue.
Associations between CADM2 SNPs and obesity have previously been reported 10,28,40-42,44-47 and were observed here. It is possible that the associations of CADM2 SNPs with CRP and SBP are secondary to the effects on obesity, as increased fat accumulation is associated with systemic inflammation 2 and reduced cardiovascular fitness 48 . Associations between CADM2 SNPs and risk-taking behaviour have also been reported 7,8,25 . We previously suggested that the association between risk-taking and obesity might be behavioural, with risk-takers choosing to disregard health-related advice and/or being prone to aberrant reward circuitry predisposing them to poor dietary choices and excessive intake 7 .
Pleiotropy, where genetic variants influence more than one trait, is a concern in genetic studies. Pleiotropy can be classified as biological (where a genetic variant has true effects on more than one trait), mediated (where a genetic variant has a true effect on one trait, but because there is a causal relationship between that trait and a second trait, effects of the genetic variant are seen on the second trait as well) or spurious (where biases in the study result in genetic effects on multiple traits) 49 . When considering the effects of CADM2 variants on both psychological and obesity traits, a possible explanation is mediated pleiotropy, for example through physical exercise. If the CADM2 variant effects on behavioural traits such as risk-taking, neuroticism, mood instability (observed here) and physical exercise 6 are true, then the effects on obesity might be knock-on effects of physical exercise. Whilst possible, the CADM2 variants associated with increased physical activity were associated with increased BMI 6 , therefore the logic of this argument is flawed. Spurious pleiotropy is another possibility; however, there are consistent effects of CADM2 on psychological and obesity traits in a number of cohorts with different recruitment and study designs (population-based, CVD case-control, high CVD risk) and populations (European, UK, North American, Pakistani), which would be expected to differ in their biases. Whether spurious pleiotropy can result from such a variety of biases is doubtful.
In contrast, biological pleiotropy is supported by the body of evidence indicating a role for CADM2 in psychological and obesity traits from other types of studies. Mouse models demonstrate a clear effect of Cadm2 on obesity and gluco-metabolic parameters: A global Cadm2-knockout mouse demonstrated reduced body weight, improved insulin sensitivity and improved glucose tolerance 50 . Furthermore, this effect was maintained when Cadm2-knockout mice were crossed with the traditional obesity model, the Leptin-knockout mouse 11 . Rat models also demonstrate that increased Cadm2 (via a knockdown of a cadm2 regulator) reduced neurite outgrowth in response to ischemic damage 51 whilst errors in axon pathfinding and neurite outgrowth were observed in Cadm2-deficient chick embryos 52 . In vivo and in vitro studies of tumour models have demonstrated that increased expression of CADM2 mRNA or protein is associated with reduced cell viability, proliferation, migration and invasion in glioma 53 , retinoblastoma 54 , renal cell 55 , hepatocellular 56,57 , endometrial 58 , prostate 59 and oesophageal squamous cell carcinomas 60 . The wide range of cell types suggests that CADM2 regulation of cell turnover could be ubiquitous. With this in mind, neuronal remodelling is important for health and pathology and requires cell turnover, so levels of CADM2 likely influence plasticity of the brain. Similarly, to increase the fat that can be stored, adipocytes either increase in volume (metabolically detrimental) or in number (metabolically benign) 61 . CADM2 levels could be a part of the volume vs number fate determination.
Further support for biological pleiotropy comes from the observation that CADM2 SNPs associated with risk-taking (rs4856591), neuroticism (rs818219) and mood instability (rs818225) had eQTLs for adipose tissue (and therefore potentially direct effects on adiposity). This is especially interesting in light of the established risk of obesity in psychiatric disorders and the relevance of these traits to a wider range of psychiatric disorders (MDD, GAD, SCZ, BD) than risk-taking (SCZ and BD). A common biological mechanism, such as CADM2, might also be consistent with the recent observation of a bi-directional link between depression and obesity 62 .
This study did not find evidence for effects of CADM2 on psychiatric diagnoses, although it should be noted that the number of cases for these analyses was low. We also cannot exclude psychiatric medication as a confounder in our analyses of cardiometabolic variables; however, the very low percentage of individuals on psychiatric medication means that this is very unlikely. No associations were identified for CVD, which is unlikely to be due to the number of cases present. We also note that, in comparison to the Cadm2-knockout mice, no effects on glucose-related traits were observed. This may be due to a smaller sample size (N = 10,128) and thus reduced power for these biomarkers, or selection bias due to studying cardiovascular cohorts. It is also possible that effects of CADM2 on insulin sensitivity and glucose levels are secondary to effects on BMI, making them harder to discern.
A surprising finding of this study was the long-range linkage disequilibrium within the CADM2 locus, despite low pairwise LD between the index SNPs, with effects being evident over a region of nearly 1 Mb. This means that the SNPs identified for psychological and metabolic traits are not independent. Haplotype analysis would be of value here; however, the standard approaches rely on higher LD and smaller regions, so this has yet to be attempted. This consideration, combined with a plethora of eQTLs in a variety of tissues, results in complexity regarding the regulation of CADM2 levels. Further functional investigation of CADM2 is required for complete mechanistic understanding of this locus.
There are some limitations to this study, notably incomplete genetic coverage of the locus in the IMPROVE and SCARFSHEEP cohorts and reduced sample sizes for the biomarker analyses. In addition, only the UK Biobank had both psychological and cardio-metabolic phenotyping. As is typical for the majority of cardio-metabolic studies, history of psychiatric illness was an exclusion criterion for IMPROVE, SCARFSHEEP and PROCARDIS, therefore it is possible that these cohorts have lower levels of variants associated with psychiatric disorders than the general population. Conversely, these cohorts have higher rates of cardiovascular risk factors and disease than the general population (average/general population, to moderate/early CVD, to high risk/at least 3 CVD risk factors). Whilst this is a strength when looking at cardiovascular phenotypes, it is likely to contribute to the high I2 demonstrated for some SNPs in the meta-analyses. Despite this, there are a number of variants that show significant associations, with I2 = 0, and these were used for the follow-up analyses. The consistency in effect sizes and directions for these associations with BMI and CRP is striking, especially being irrespective of CVD risk burden. These findings also demonstrate that the effects of CADM2 variants on cardio-metabolic parameters are generalizable to a wider European ancestry population. The same cannot be assumed for the psychological phenotypes, which were analysed exclusively in white British UK Biobank participants. Strengths of the study include large sample sizes for most analyses. The meta-analyses of several cohorts provide robust results, whereas consistent phenotyping is a clear advantage of the UK Biobank study.
In conclusion, we have conducted a systematic, large-scale analysis of multiple datasets providing evidence that CADM2 represents a putative shared biological link between metabolic and psychological disorders. Future work, including animal models which investigate both metabolic and behavioural traits in the same animals, is now needed to understand the functional biological mechanisms that might explain this link.

Data Availability
Data are available on request; contact either UK Biobank (UK Biobank-only analyses) or the corresponding author (all other analyses).