Genome-wide DNA methylation analysis of heavy cannabis exposure in a New Zealand longitudinal cohort

Cannabis use is of increasing public health interest globally. Here we examined the effect of heavy cannabis use, with and without tobacco, on genome-wide DNA methylation in a longitudinal birth cohort (Christchurch Health and Development Study, CHDS). A total of 48 heavy cannabis users were selected from the CHDS cohort, on the basis of their adult exposure to cannabis and tobacco, and DNA methylation assessed from whole blood samples, collected at approximately age 28. Methylation in heavy cannabis users was assessed, relative to non-users (n = 48 controls) via the Illumina Infinium® MethylationEPIC BeadChip. We found the most differentially methylated sites in cannabis with tobacco users were in the AHRR and F2RL3 genes, replicating previous studies on the effects of tobacco. Cannabis-only users had no evidence of differential methylation in these genes, or at any other loci at the epigenome-wide significance level (P < 10−7). However, there were 521 sites differentially methylated at P < 0.001 which were enriched for genes involved in neuronal signalling (glutamatergic synapse and long-term potentiation) and cardiomyopathy. Further, the most differentially methylated loci were associated with genes with reported roles in brain function (e.g. TMEM190, MUC3L, CDC20 and SP9). We conclude that the effects of cannabis use on the mature human blood methylome differ from, and are less pronounced than, the effects of tobacco use, and that larger sample sizes are required to investigate this further.


Introduction
Cannabis use is an important global public health issue, and a growing topic of controversy and debate 1,2 . It is the most widely used illicit psychoactive substance in the world 3 , and the potential medicinal and therapeutic benefits of cannabis and its main active ingredients tetrahydrocannabinol (THC) and cannabidiol (CBD) are gaining interest [4][5][6] . There is strong evidence to suggest that the heavy and prolonged use of cannabis may be associated with increased risk of adverse outcomes in a number of areas, including mental health (psychosis 7-9 , schizophrenia 10,11 , depression 12,13 ) and illicit drug abuse 14 .
Drug metabolism, drug response and drug addiction have known genetic components 15 , and multiple genome-wide association studies (GWAS) have identified genes and allelic variants that are likely contributors to substance use disorders 16,17 . There are aspects of cannabis use disorder that are heritable [18][19][20][21] , and several candidate loci for complex phenotypes such as lifetime cannabis use have recently been identified 3,22 that explain a proportion of the variance in cannabis use heritability. Complex phenotypes like these are influenced by multiple loci, each of which usually has a small individual effect size 23 , and such loci are frequently located in non-coding regions of the genome 24,25 , making their biological role difficult to elucidate.
Epigenetic mechanisms are involved in the interaction between the genome and environment; they respond to changes in environmental stimuli (such as diet, exercise, drugs), and act to alter chromatin structure and thus regulate gene expression 26 . Epigenetic modifications, such as DNA methylation, contribute to complex traits and diseases 27,28 . Methylation of cytosine residues within CpG dinucleotides is an important mechanism of variation and regulation in the genome [29][30][31][32] . Cytosine methylation, particularly in the promoter region of genes, is often associated with a decrease in transcription 33 , and DNA methylation in the first intron and gene expression is correlated and conserved across tissues and vertebrate species 34 . Furthermore, modulation of methylation at CpG sites within the human genome can result in an epigenetic pattern that is specific to individual environmental exposures, and these may contribute to disease 26,[35][36][37] . For example, environmental factors such as drugs, alcohol, stress, nutrition, bacterial infection, and exercise 36,[38][39][40][41] have been associated with methylation changes. A number of these methylation changes have been shown to endure and induce lasting biological changes 36 , whereas others are dynamic and transient. For example, alcohol consumption affects genome-wide methylation patterns in a severity-dependent manner 42 and some of these changes revert upon abstinence from alcohol consumption 43 . A similar observation is reported for former tobacco smokers, with DNA methylation changes after cessation eventually reaching levels close to those who had never smoked tobacco 44 . Thus, DNA methylation can be indicative of a particular environmental exposure, shed light on the dynamic interaction between the environment and the genome, and provide new insights in to the biological response.
Recreational drug use (an environmental stimulus) has been associated with adverse mental health outcomes, particularly in youth [45][46][47][48][49] , and epigenetics may play a role in mediating the biology involved. Therefore, we sought to determine whether regular cannabis users displayed differential cytosine methylation compared with noncannabis users. Cannabis users in this study are participants from the Christchurch Health and Development Study (CHDS), a longitudinal study of a birth cohort of 1265 children born in 1977 in Christchurch, New Zealand. Users often consume cannabis in combination with tobacco. Unusually, the CHDS cohort contains a subset of cannabis users who have never consumed tobacco, thus enabling an investigation of the specific effects of cannabis consumption, in isolation, on DNA methylation in the human genome.

Cohort and study design
The Christchurch Health and Development Study includes individuals who have been studied on 24 occasions from birth to the age of 40 (n = 987 at age 30, with blood collected at approximately age 28). In the early 1990s, research began into the initiation and consequences of cannabis use amongst CHDS participants; cannabis use was assessed prospectively over the period up to the collection of DNA [11][12][13][14][48][49][50][51][52][53][54] . A subset of n = 96 participants for whom a blood sample was available are included in the current study. Cases (regular cannabis users, n = 48) were matched with controls (n = 48) for sex (n = 37 male, n = 11 female each group, for additional information see Supplementary Table 1). Case participants were partitioned into two subsets: one that contained cannabis-only users (who had never consumed tobacco, "cannabis-only", n = 24 [n = 21 male, n = 3 female]), and one that contained cannabis users who also consumed tobacco ("cannabis with tobacco", n = 24 [n = 16 male, n = 8 female]) and were selected on the basis that they either met DSM-IV 55 diagnostic criteria for cannabis dependence, or had reported using cannabis on a daily basis for a minimum of three years, prior to age 28. Of the 48 cannabis users, 6 participants had ceased cannabis use by 28 years of age, however, still met the diagnostic criteria for cannabis dependence. Mode of cannabis consumption was via smoking, for all participants. The median duration of regular use was 9 years (range 3-14 years). Control participants had never used cannabis or tobacco. In addition, comprehensive single nucleotide polymorphism (SNP) data was available for all participants 56

Bioinformatics and statistics
All analysis was carried out using R (Version 3.5.2 57 ). Prior to normalisation, quality control was performed on the raw data. Firstly, sex chromosomes and 150 failed probes (detection P value > 0.01 in at least 50% of samples) were excluded from analysis. Furthermore, potentially problematic CpGs with adjacent SNVs, or that did not map to a unique location in the genome 58 , were also excluded, leaving 700,296 CpG sites for further analysis. The raw data were then normalised with the NOOB procedure in the minfi package 59 (Supplementary Fig. 1). Normalisation was checked by visual inspection of intensity densities and the first two components from Multi-Dimensional Scaling of the 5000 most variable CpG sites ( Supplementary Figs. 2 and 3). The proportions of cell types (CD4+, CD8+ T cells, natural killer, B cells, monocytes and granulocytes) in each sample were estimated with the Flow.Sorted.Blood package 60 . Linear models were fitted to the methylated/unmethylated or M ratios using limma 61 . Separate models were fitted for cannabis-only vs. controls, and cannabis plus tobacco users vs. controls. Both models contained covariates for sex (bivariate), socioeconomic status (three levels), batch (bivariate), population stratification (four principal components from 5000 most variable SNPs) and cell type (five continuous). β values were calculated, defined as the ratio of the methylated probe intensity (M)/the sum of the overall intensity of both the unmethylated probe (U) + methylated probe (M). P values were adjusted for multiple testing with the Benjamini and Hochberg method and assessed for genomic inflation with bacon 62 . Differentially methylated CpG sites that were intergenic were matched to the nearest neighbouring genes in Hg19 using GRanges 63 , and the official gene symbols of all significantly differentially methylated CpG sites (nominal P < 0.001) in cannabis-only users were tested for enrichment in KEGG 2019 human pathways with EnrichR 64 .

Data normalisation
Modelled effects showed no indication of genomic inflation with λ = 1.04 for cannabis-only users (Supplementary Fig. 4a) and λ = 0.855 for cannabis with tobacco users (Supplementary Fig. 4b), versus controls. These were confirmed with bacon for cannabis-only (inflation = 0.98, bias = 0.044) and cannabis with tobacco users (inflation = 0.91, bias = 0. 19). Inflation values <1 suggest that the results may be conservative.
Cannabis with tobacco users had a significantly lower estimated proportion of natural killer cells than controls (1.8%, 0.4-3.2%, P < 0.014) with no other proportions differing significantly. After adjusting for multiple comparisons this was not significant (P = 0.08), however, we note that it is consistent with other findings that NK-cells are suppressed in the plasma of tobacco smokers 65,66 .

Differential methylation
The most differentially methylated CpG sites for cannabis users relative to controls differed in the absence (Table 1) and presence ( Table 2) of tobacco smoking. Five individual CpG sites were significantly differentially methylated (P adjusted <0.008) between cannabis users and controls when cannabis with tobacco was used (Table 2 and Fig. 1). The top CpG sites in the AHRR, ALPG and F2RL3 genes ( Table 2) are consistent with previous studies on tobacco use without cannabis (e.g. refs. 44,67-69 ), and cg17739917 is in the same CpG-island as other CpGs previously shown to be hypomethylated in response to tobacco 70 . Cannabis-only users showed no CpG sites differentially methylated after correction for multiple testing (Table 1 and Fig. 2), however, the most differentially methylated site was hypermethylation of cg12803068 in the gene MYO1G, which has been reported to be hypermethylated in response to tobacco use 67 . We identified 28 genes with multiple (two or more) differentially methylated CpG probes (Supplementary Table 2). Of these 28 genes, 25 have all sites hypermethylated, one has two sites hypomethylated, two have one hypermethylated and one hypomethylated probe.
To describe the data we chose a nominal P value of 0.001, and observed that both cannabis-only and cannabis with tobacco users showed relatively higher rates of hypermethylation than hypomethylation compared with controls and that the distribution of these CpG sites was similar with respect to annotated genomic features (Table 3). Four CpG sites overlapped between the cannabis-only and cannabis with tobacco users analyses; two were hypermethylated; cg02514528, in the promoter of MARC2, and cg27405731 in CUX1, and one, cg26542660 in the promoter of CEP135, was hypomethylated in comparison to controls. The second most differentially methylated site (ranked by P value) in cannabis-only users was cg02234936 which maps to ARHGEF1; this was hypermethylated in the cannabis with tobacco users.

Pathway enrichment analyses
We then took the genes containing differentially methylated CpG sites at P < 0.001 for the cannabis-only group, or the closest gene where that CpG was intergenic (Supplementary Table 3) and compared them with human KEGG pathways using Enrichr. The hypermethylated CpG sites (n = 420) showed enrichment in the arrhythmogenic right ventricular cardiomyopathy, long-term potentiation, cAMP signalling, adrenergic signalling in cardiomyocytes, glutamatergic synapse, hypertrophic cardiomyopathy, dilated cardiomyopathy and nicotine addiction pathways at an adjusted P < 0.05 (Fig. 3). Enrichment analysis of hypomethylated loci (n = 101) in cannabis-only users did not identify any KEGG pathways at or near adjusted significance (P > 0.05, Fig. 4). We further submitted all differentially methylated CpG sites (hyper and hypomethyated) at a nominal P < 0.001 to Enrichr, revealing significant enrichment for genes involved in the glutamatergic synapse (adjusted P = 0.012), arrhythmogenic right   ventricular cardiomyopathy (adjusted P = 0.011) and longterm potentiation pathways (adjusted P = 0.039) (Fig. 5).

Discussion
Many countries have recently adopted, or are considering, lenient polices regarding the personal use of cannabis [71][72][73] . This approach is supported by the evidence that the prohibition of cannabis can be harmful 53 . Further, the therapeutic benefits of cannabis are gaining traction, most recently as an opioid replacement therapy 74 . However, previous studies, including analyses of the CHDS cohort, have reported an association between cannabis use and poor health outcomes, particularly in youth 75,76 . Epigenetic mechanisms, including DNA methylation, provide the interface between the environment (e.g. cannabis exposure) and genome. Therefore, we investigated whether changes in an epigenetic mark, DNA methylation, were altered in cannabis users, versus controls, a comparison made possible by the deep phenotyping of the CHDS cohort with respect to cannabis use, and the fact that the widespread practice of mulling or mixing cannabis with tobacco, is not common in New Zealand.
Consistent with previous reports of tobacco exposure, we observed greatest differential methylation in cannabis with tobacco users in the AHRR and F2RL3 genes 44,67-69 . These changes, however, were not apparent in the cannabis-only data. Only two nominally significantly differentially methylated (P < 0.05) CpG sites were observed in both the cannabis-only and cannabis with tobacco  analyses. This suggests that tobacco may have a more pronounced effect on DNA methylation and/or dominates any effects of cannabis on the human blood methylome, and that caution should be taken when interpreting similar cannabis exposure studies which do not, or cannot, exclude tobacco smokers. Interestingly, the two nominally significant CpG sites (P < 0.05) that overlap between the cannabis-only and the cannabis with tobacco data are located within the MARC2 and CUX1 genes, which both have reported roles in brain function; a SNP in MARC2 has been provisionally associated with the biological response to antipsychotic therapy in schizophrenia patients 77 , and the CUX1 gene has an established role in neural development 78 . Cannabis affects the brain, leading to perceptual alterations, euphoria and relaxation 18 , and prolonged use is associated with mood disorders, including adult psychosis 7,8,49,79,80 , mania 13 , and depression 12 . We did not detect significantly differentially methylated loci associated with exclusive cannabis use at the epigenome-wide level. However, an assessment of those top loci reaching nominal significance (P < 0.05) identified CpG sites within genes involved in brain function and mood disorders, including MUC3L 81,82 , CDC20 83  Pathway enrichment revealed that differential methylation in cannabis-only users was over-represented in genes associated with neural signalling and cardiomyopathies. This is consistent with the literature which raises clinical concerns around cardiac complications potentially associated with cannabis use [100][101][102][103] . The enrichment of genes associated with neural signalling pathways is also consistent with the literature, including previous analyses of the CHDS cohort, which report associations between cannabis exposure and brain related biology such as mood disorders 7,12,48,49,[51][52][53][54]104,105 . Our study was limited by sample size, achieving~10% power at P = 10 −7 to detect the largest standardised effect size found. However, while we have not implicated any gene at the genome-wide significance level with respect to differential methylation associated with cannabis-only exposure, our data are suggestive of a role for DNA methylation in the biological response to cannabis, a possibility which definitely warrants further investigations in larger cohorts.
In summary, while tobacco use has declined on the back of state-sponsored cessation programmes 106 , rates of cannabis use remain high in New Zealand and globally, and might be predicted to increase further with the decriminalisation or legalisation of cannabis use for therapeutic and/or recreational purposes 107 . Therefore, analysis of the potential effects of cannabis (an environmental stimuli) on DNA methylation, an epigenetic mechanism, is timely. Our data are suggestive of a role for DNA methylation in the biological response to cannabis, significantly contributes to the growing literature studying  the biological effects of heavy cannabis use, and highlights areas of further analysis in particular with respect to the epigenome.