Phenotypic and genetic analysis of a wellbeing factor score in the UK Biobank and the impact of childhood maltreatment and psychiatric illness

Wellbeing is an important aspect of mental health that is moderately heritable. Specific wellbeing-related variants have been identified via GWAS meta-analysis of individual questionnaire items. However, a multi-item within-subject index score has the potential to capture greater heritability, enabling improved delineation of genetic and phenotypic relationships across traits and exposures, which is not possible with aggregate data. This research employed data from the UK Biobank resource, and a wellbeing index score was derived from indices of happiness and satisfaction with family/friendship/finances/health, using principal component analysis. GWAS was performed in Caucasian participants (N = 129,237) using the derived wellbeing index, followed by polygenic profiling (independent sample; N = 23,703). The wellbeing index, its subcomponents, and negative indicators of mental health were compared via phenotypic and genetic correlations, and relationships with psychiatric disorders were examined. Lastly, the impact of childhood maltreatment on wellbeing was investigated. Five independent genome-wide significant loci for wellbeing were identified. The wellbeing index had SNP-heritability of ~8.6%, and stronger phenotypic and genetic correlations with its subcomponents (0.55–0.77) than with mental health phenotypes (−0.21 to −0.39). The wellbeing score was lower in participants reporting various psychiatric disorders compared to the total sample. Childhood maltreatment exposure was also associated with reduced wellbeing, and a moderate genetic correlation (rg = ~−0.56) suggests an overlap in heritability of maltreatment with wellbeing. Thus, wellbeing is negatively associated with both psychiatric disorders and childhood maltreatment. Although notable limitations, biases and assumptions are discussed, this within-cohort study aids the delineation of relationships between a quantitative wellbeing index and indices of mental health and early maltreatment.


Defining the wellbeing phenotype
We first reviewed all items in the baseline questionnaire (issued in 2006-2010) and the follow-up mental health questionnaire (issued from 2016) to identify those self-report questions that provide some measurement of subjective or psychological wellbeing. Due to the time lag between the baseline and follow-up questionnaires, and the smaller number of wellbeing-related questions and participants who completed the follow-up assessment, we focused on relevant items from the baseline questionnaire, which covered only subjective elements of wellbeing. The only question related to psychological wellbeing, "To what extent do you feel your life to be meaningful?" (Data-Field 20460), was in the follow-up questionnaire, and we did not use it in the current study.
We identified six items in the baseline dataset that related to the 'core' concept of wellbeing: happiness, and satisfaction with family relationships, friendships, financial situation, health, and job. These questions (Data-Fields 4526, 4559, 4570, 4581, 4548, 4537 respectively) were rated on a 6-point Likert scale (e.g., from 'extremely happy' to 'extremely unhappy'). Responses of "Do not know" or "Prefer not to answer" were coded as missing.
While job satisfaction is also an important aspect of wellbeing, ~29% of individuals at baseline had answered "I am not employed". Thus, to maximise the sample size and avoid circumstantial bias in the wellbeing score relating to employment status, the job satisfaction variable was excluded from factor analysis. Items were reverse scored so that higher scores corresponded to higher happiness/satisfaction. Principal component analysis was performed in SPSS v25 for the five subjective wellbeing items, and a factor score was created, which we herein refer to as the "wellbeing index" (Figure S2). Participants who had missing data for any of the five core wellbeing questions were excluded. To evaluate the goodness of fit of this factor model, we performed a confirmatory factor analysis in the validation dataset using the lavaan package (version 0.6.7) in R.
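The derivation described above can be illustrated with a minimal sketch. The analysis itself was run in SPSS v25, so this is not the authors' procedure: it assumes simulated responses on a 1 to 6 scale where 1 is the most satisfied category, and it uses power iteration on the item correlation matrix as a stand-in for a full principal component routine. All function and variable names are hypothetical.

```python
import math
import random


def reverse_score(response, scale_max=6):
    """Flip a 1..scale_max Likert response so that higher means more satisfied."""
    return scale_max + 1 - response


def complete_cases(rows):
    """Keep only participants with no missing (None) item."""
    return [r for r in rows if all(v is not None for v in r)]


def standardise(column):
    """Centre to mean 0 and scale to sample standard deviation 1."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (len(column) - 1))
    return [(v - mean) / sd for v in column]


def first_component_scores(rows, iterations=500):
    """Return (loadings, scores) for the first principal component
    of the item correlation matrix."""
    n, k = len(rows), len(rows[0])
    z_cols = [standardise([r[j] for r in rows]) for j in range(k)]
    corr = [[sum(a * b for a, b in zip(z_cols[i], z_cols[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    # Power iteration converges to the leading eigenvector of corr.
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    raw = [sum(vec[j] * z_cols[j][i] for j in range(k)) for i in range(n)]
    return vec, standardise(raw)


# Simulated stand-in data: five items driven by one latent factor (assumption).
random.seed(1)
raw_rows = []
for _ in range(300):
    latent = random.gauss(0, 1)
    raw_rows.append([min(6, max(1, round(3.5 - latent + random.gauss(0, 1))))
                     for _ in range(5)])
raw_rows[0][2] = None  # a participant with one missing item is excluded

scored = [[reverse_score(v) for v in r] for r in complete_cases(raw_rows)]
loadings, wellbeing_index = first_component_scores(scored)
print(len(wellbeing_index), [round(w, 2) for w in loadings])
```

Standardising the component score gives an index with mean 0 and unit variance, comparable in scale to the standardised index shown in the supplementary figures.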

Independent confirmatory sample
Participants in this sample were selected from those who completed the same baseline questionnaire during the imaging visit (after 2014), were not in our discovery sample, and had no missing data for any of the five items used to derive the wellbeing index (happiness, friendship, family, financial and health satisfaction). Further, we identified and removed any participants who were genetically related to participants in the discovery sample (up to third-degree relatives, IBD kinship coefficients >0.044), leaving N=23,703 participants in this independent confirmatory sample.
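The relatedness filter can be sketched as follows. This is an illustrative outline only: the pairwise kinship table, the participant identifiers and the function name are hypothetical, and only the 0.044 threshold is taken from the text.

```python
def remove_related(candidates, discovery_ids, kinship_pairs, threshold=0.044):
    """Drop candidates who are in the discovery sample or who have a kinship
    coefficient above the threshold with any discovery participant.

    kinship_pairs: iterable of (id_a, id_b, kinship) tuples (hypothetical layout).
    """
    discovery = set(discovery_ids)
    related = set()
    for id_a, id_b, kinship in kinship_pairs:
        if kinship <= threshold:
            continue
        if id_a in discovery:
            related.add(id_b)
        if id_b in discovery:
            related.add(id_a)
    return [c for c in candidates if c not in related and c not in discovery]


# Hypothetical identifiers and kinship values.
pairs = [("c1", "d1", 0.25), ("c2", "d2", 0.02), ("d3", "c3", 0.06), ("c4", "c5", 0.25)]
kept = remove_related(["c1", "c2", "c3", "c4", "c5"], ["d1", "d2", "d3"], pairs)
print(kept)
```

Pairs in which neither member belongs to the discovery sample are left alone here, since the text describes only removal of relatives of discovery participants.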

Negative indicators of mental health: coding
For loneliness, the question "Do you often feel lonely?" (Data-Field 2020), was answered with a binary "yes" or "no" response.
The neuroticism measure was a summary score based on 12 neurotic behaviour domains (Data-Field 20127), with a score of 0-12 [1]. These 12 questions are from the Eysenck Personality Inventory Neuroticism scale (EPIN-R) [2]. The questions are: "Does your mood often go up and down?"; "Do you ever feel 'just miserable' for no reason?"; "Are you an irritable person?"; "Are your feelings easily hurt?"; "Do you often feel 'fed-up'?"; "Would you call yourself a nervous person?"; "Are you a worrier?"; "Would you call yourself tense or 'highly strung'?"; "Do you worry too long after an embarrassing experience?"; "Do you suffer from 'nerves'?"; "Do you often feel lonely?"; "Are you often troubled by feelings of guilt?"
A binary depressive symptoms variable was constructed using two lifetime depressive symptom questions: "Have you ever had a time when you were feeling depressed or down for at least a whole week?" and "Have you ever had a time when you were uninterested in things or unable to enjoy the things you used to for at least a whole week?" (Data-Fields 4598 and 4631, respectively). Participants who answered yes to one or both questions were coded as "yes" for depressive symptoms, and participants who answered no to both were coded as "no".
For the "seen GP or psychiatrist" binary variable, participants who answered yes to either of the two questions, "Have you ever seen a psychiatrist for nerves, anxiety, tension or depression?" and "Have you ever seen a general practitioner (GP) for nerves, anxiety, tension or depression?" (Data-Fields 2100 and 2090, respectively), were coded as "yes", and those who answered no to both were coded as "no".
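The either/both coding rule used for the two binary variables can be written as a small helper. This is a sketch with a hypothetical function name; the text does not say how a participant with one "no" and one missing answer was handled, so treating that case as missing is an assumption.

```python
def code_either(answer_a, answer_b):
    """Combine two yes/no answers into one binary variable.

    'yes' if either answer is 'yes'; 'no' only if both are 'no'.
    Any other combination returns None (assumption: treated as missing).
    """
    if answer_a == "yes" or answer_b == "yes":
        return "yes"
    if answer_a == "no" and answer_b == "no":
        return "no"
    return None


print(code_either("no", "yes"))   # yes to one question is enough
print(code_either("no", "no"))    # no to both questions
print(code_either("no", None))    # undetermined under the stated assumption
```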

Childhood maltreatment
There are five items that relate to childhood maltreatment in the UK Biobank Mental Health Questionnaire, which were derived from the Childhood Trauma Questionnaire [3], which was part of [...]. To evaluate the cumulative effect, each maltreatment type was dichotomised from a 5-point Likert response ("never true" to "very often true") to a binary trauma "exposed" or "not-exposed" variable, and a sum score was created by summing across the first four maltreatment types (ranging from 0 to 4). For sexual, physical, and emotional abuse, all responses except "never true" were considered trauma exposed. For emotional neglect, all responses except "very often true" and "often true" were considered trauma exposed (see Table S2 for details), consistent with a higher cutoff for emotional neglect within the Childhood Trauma Questionnaire [3].
The physical neglect category ("...there was someone to take me to the doctor if I needed it"; Data-Field 20491) was not included in the maltreatment sum score, because analysis of the wellbeing index against the categorical responses indicated a potential misunderstanding of this question. The participants who answered "never" (N=893, mean=0.053) were more numerous than those who answered "rarely" (N=317) and had a significantly higher wellbeing index score than both the population mean and the mean value of the individual "exposed" subcategories (Figure S8.e), suggesting that some participants may have responded "never" if they did not need to be taken to the doctor (i.e., they had no ailments), rather than because they had no one to take them.
There were 45,723 participants with non-missing childhood maltreatment sum score and wellbeing index for analysis.
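The dichotomisation and sum score described in this section can be sketched as below. The response labels between "never true" and "very often true" follow the usual Childhood Trauma Questionnaire wording and are an assumption here, as are the dictionary keys, the function names, and the choice to return a missing sum score when any of the four items is missing.

```python
# Responses counted as NOT exposed for each of the four summed maltreatment types.
NOT_EXPOSED = {
    "sexual_abuse": {"never true"},
    "physical_abuse": {"never true"},
    "emotional_abuse": {"never true"},
    # Positively worded item, so only the two highest frequencies are not exposed.
    "emotional_neglect": {"often true", "very often true"},
}


def exposed(maltreatment_type, response):
    """Return 1 if exposed, 0 if not exposed, None if the response is missing."""
    if response is None:
        return None
    return 0 if response.lower() in NOT_EXPOSED[maltreatment_type] else 1


def maltreatment_sum(responses):
    """Sum the four binary exposures (0 to 4); None if any item is missing (assumption)."""
    flags = [exposed(t, responses.get(t)) for t in NOT_EXPOSED]
    if any(f is None for f in flags):
        return None
    return sum(flags)


participant = {
    "sexual_abuse": "Never true",
    "physical_abuse": "Rarely true",
    "emotional_abuse": "Never true",
    "emotional_neglect": "Never true",
}
print(maltreatment_sum(participant))
```

Physical neglect is deliberately absent from the mapping, mirroring its exclusion from the sum score.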

Effect of childhood maltreatment on the wellbeing index score
First, we examined the impact of each individual type of childhood maltreatment on the wellbeing index to investigate their effects as categorical variables relating to the frequency of the maltreatment experience. Then, the cumulative impact was investigated using the maltreatment sum score. Mean differences across all categories were tested using the Kruskal-Wallis test, and mean differences between two adjacent groups were tested using the Wilcoxon test. Further, linear regression models adjusting for age, age-squared, sex and Townsend Deprivation Index (Data-Field 189) were used to test how different types of childhood maltreatment (or the sum of exposures across multiple types of maltreatment) influence the wellbeing index in the presence of possible confounding.
Standardised estimates from the linear models were taken as the magnitude of the effect of each predictor on the dependent variable (the wellbeing index). To investigate the unique effect of each type of maltreatment, a subset of participants who reported a single childhood maltreatment type was compared with those reporting no maltreatment of any type, in terms of mean wellbeing index score.
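As an illustration of how a standardised estimate arises from an adjusted linear model, the sketch below z-scores the outcome and every predictor and then solves the least-squares normal equations. It is a simplified outline on simulated data, not the authors' analysis code; the variable names, effect sizes and helper functions are all hypothetical.

```python
import math
import random


def zscore(values):
    """Centre to mean 0 and scale to sample standard deviation 1."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def standardised_betas(outcome, predictors):
    """Least-squares fit on z-scored variables; returns one beta per predictor.

    No intercept is needed because every z-scored variable has mean zero.
    """
    y = zscore(outcome)
    cols = [zscore(p) for p in predictors]
    k = len(cols)
    xtx = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(cols[i], y)) for i in range(k)]
    return solve(xtx, xty)


# Simulated stand-in data with an assumed negative effect of exposure.
random.seed(7)
n = 2000
age = [random.uniform(40, 70) for _ in range(n)]
sex = [random.choice([0, 1]) for _ in range(n)]
deprivation = [random.gauss(0, 3) for _ in range(n)]
exposure = [1 if random.random() < 0.3 else 0 for _ in range(n)]
wellbeing = [-0.4 * e + 0.01 * a - 0.05 * d + random.gauss(0, 1)
             for e, a, d in zip(exposure, age, deprivation)]

betas = standardised_betas(
    wellbeing, [exposure, age, [a * a for a in age], sex, deprivation])
print("standardised estimate for exposure:", round(betas[0], 3))
```

The first coefficient is the change in wellbeing, in standard deviation units, per standard deviation of the exposure variable, holding age, age-squared, sex and deprivation fixed.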

Functional Mapping and Annotation of GWAS loci
Functional Mapping and Annotation of Genome-Wide Association Studies (FUMA) was used for the wellbeing GWAS annotation [4]. FUMA first identified linkage disequilibrium (LD) independent SNPs with genome-wide significant association (r² < 0.6, p < 1×10⁻⁸) as independent significant SNPs, then defined independent lead SNPs as those with p < 1×10⁻⁸ and LD r² < 0.1. The LD blocks of independent lead SNPs were merged into one genomic locus if they were closer than 250kb. All SNPs (with maximum p < 0.05) with r² ≥ 0.6 with any of the independent significant SNPs were included for further annotation. We used the UK Biobank release 2bB reference panel for all analyses, except for annotation of the genome-wide significant locus on chromosome 14 (rs79167904), an insertion-deletion variant which was excluded from the UK Biobank release 2bB reference panel. Instead, the smaller 1000 Genomes Phase 3 reference panel, which includes indel variants, was used to annotate rs79167904.
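The locus-merging rule (LD blocks closer than 250kb are combined into one genomic locus) is an interval-merging step, sketched below. This is an illustration of the rule as stated in the text, not FUMA's own code; the function name and the example coordinates are hypothetical.

```python
def merge_ld_blocks(blocks, max_gap=250_000):
    """Merge LD blocks on the same chromosome whose gap is below max_gap base pairs.

    blocks: list of (chromosome, start_bp, end_bp) tuples.
    Returns merged loci sorted by chromosome and position.
    """
    merged = []
    for chrom, start, end in sorted(blocks):
        same_chrom = bool(merged) and merged[-1][0] == chrom
        if same_chrom and start - merged[-1][2] < max_gap:
            # Extend the current locus to cover this block.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(locus) for locus in merged]


# Hypothetical LD blocks around lead SNPs.
blocks = [
    (14, 1_000_000, 1_100_000),
    (14, 1_300_000, 1_350_000),  # 200kb from the previous block: merged
    (14, 2_000_000, 2_050_000),  # 650kb from the merged locus: kept separate
    (3, 500_000, 600_000),
]
loci = merge_ld_blocks(blocks)
print(loci)
```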
FUMA uses three strategies to map SNPs to genes: 1) positional mapping, based on physical proximity of a SNP to a gene; 2) expression quantitative trait loci (eQTL) mapping, to identify genes whose transcript expression is significantly associated with allelic variation of the SNP; and 3) chromatin interaction mapping, which links SNPs to genes based on significant chromatin interactions between each SNP's genomic region and nearby or distant genes. For positional mapping, we used a 10kb window, without any other filters. For eQTL mapping, only databases employing brain tissue were selected (PsychENCODE [5], xQTLServer [6], CommonMind Consortium [7], BRAINEAC [8], and GTEx brain tissues V6-8 [9]), and only significant SNP-gene pairs with FDR<0.05 were used. For chromatin interaction mapping, we included PsychENCODE and brain tissues/cell types from the Hi-C chromatin interaction dataset and set an FDR threshold of 1×10⁻⁶ to define significant interactions.
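Of the three strategies, positional mapping is the simplest to illustrate: a SNP is assigned to every gene whose boundaries, extended by the 10kb window, contain the SNP position. The sketch below shows that rule only; the gene names and coordinates are hypothetical placeholders, and it does not reproduce FUMA's implementation.

```python
def positional_map(snp_chrom, snp_pos, genes, window=10_000):
    """Return names of genes whose span, extended by `window` bp on each side,
    contains the SNP position.

    genes: list of (name, chromosome, start_bp, end_bp) tuples.
    """
    return [name for name, chrom, start, end in genes
            if chrom == snp_chrom and start - window <= snp_pos <= end + window]


# Hypothetical gene annotations.
genes = [
    ("GENE_A", 14, 100_000, 150_000),
    ("GENE_B", 14, 165_000, 200_000),
    ("GENE_C", 3, 100_000, 150_000),
]
print(positional_map(14, 158_000, genes))  # within 10kb of both chromosome 14 genes
print(positional_map(14, 215_000, genes))  # more than 10kb from every gene
```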
Tissue-specific enrichment of genes mapped to the wellbeing index GWAS signals, employing the full distribution of SNP p-values, was performed in MAGMA [10] (implemented in FUMA) using the GTEx v8 differentially expressed gene data set for 54 tissue types.
[...] SNPs across the genome, with chromosome and base-pair position on the x-axis and the negative logarithm of the p-value from the infinitesimal model on the y-axis. The red line indicates the genome-wide significance threshold of p < 5×10⁻⁸. For the loneliness, neuroticism, depressive symptoms and 'seen a psychiatrist or GP for nerves, anxiety, tension, or depression' phenotypes, the sample size is smaller due to data missingness. For childhood maltreatment, the sample is drawn from the whole UKB sample with childhood maltreatment data. a) happiness, b) friendship satisfaction, c) family satisfaction, d) financial satisfaction, e) health satisfaction, f) neuroticism, g) loneliness, h) depressive symptoms, i) seen GP or psychiatrist for nerves, anxiety, tension, or depression, j) childhood maltreatment.
[...] [11]. Error bars represent standard errors.
Figure S8. Effect of frequency of exposure to different childhood maltreatment types on wellbeing index score. The mean difference in wellbeing index score across all categories for each maltreatment exposure was tested using the Kruskal-Wallis test (all p < 2.2×10⁻¹⁶). In all graphs, the mean wellbeing index z-score for each group category is shown above each violin plot; pairwise mean differences between adjacent groups were tested using the Wilcoxon test, and the p-values are presented. The interquartile range is represented by vertical black lines inside the violin plots, and the dotted horizontal line is the median wellbeing index score in the discovery sample (N=129,237). The number of participants in each group is shown at the bottom of each category.
Figure S9. Regression analysis on exposure to different types of childhood maltreatment (as binary exposed/not-exposed variables) on the wellbeing index. Panel A represents the standardised estimates (95% confidence interval) from linear regressions where wellbeing is the outcome and one type of childhood maltreatment (or the sum score) is the predictor. Panel B represents the estimates from a multiple linear regression model with all types of maltreatment fitted as predictors in the same model. All regressions are adjusted for age, age-squared, sex and Townsend deprivation index. Maltreatment types have been coded as binary variables, and the maltreatment sum score represents the accumulation of different types of trauma. Participants represented in each trauma type category are not mutually exclusive, as one person can report multiple types of trauma. The sample size varies slightly across different maltreatment categories due to variable missingness across items (N=45,723 to 46,374).
Figure S10. The effect of a single type of maltreatment exposure compared to no maltreatment exposure on mean wellbeing index. Participants with exposure to multiple types of maltreatment have been removed from this analysis. The mean wellbeing index z-score for each group category is shown above each violin plot; pairwise mean differences between adjacent groups were tested using the Wilcoxon test, and the p-values are presented. The interquartile range is represented by vertical black lines inside the violin plots, and the dotted horizontal line is the median wellbeing index score in the sample (N=38,204). The number of participants in each group is shown at the bottom of each category.
Figure S11. Genetic correlation (rg) between the childhood maltreatment sum score GWAS and the wellbeing index, its subcomponents, and negative mental health traits in the UK Biobank. Green and orange colouring indicate positive and negative genetic correlations, respectively. All genetic correlations are significant after Bonferroni correction (p < 6.94×10⁻⁴). Error bars represent 95% confidence intervals.
Figure S12. Distribution of wellbeing index scores within each category of self-reported psychiatric diagnoses in the discovery cohort. The number (N) and frequency (%) of participants with each diagnosis are provided along the x-axis below the violin plots. The mean wellbeing index score, standardised to a mean of 0 in the general population, is represented by the black dot within each plot, and its corresponding numeric value is provided along the x-axis above the violin plots for each diagnosis. The interquartile range is represented by vertical black lines inside the violin plots, and the dotted horizontal line is the median wellbeing index score in the discovery sample (N=129,237). Membership in each category was not mutually exclusive, as some participants endorsed receiving more than one lifetime psychiatric diagnosis.