INTRODUCTION

Major depression and cocaine dependence (CD) frequently co-occur. Major depression ranks as one of the top three mental disorders in terms of burden of disease (Evans and Charney, 2003), and CD ranks as the third most prevalent illicit drug dependence diagnosis (SAMSHA, 2009). A high rate of comorbidity between CD and depression is observed clinically. Although there are no reported statistics for prevalence of comorbid CD and depression, it is estimated that approximately 30% of people with major depressive disorder have a lifetime drug use disorder, and the lifetime prevalence of affective disorders is about 35% among cocaine abusers (Regier et al, 1990). Similar comorbidity rates were observed in the Sequenced Treatment Alternatives to Relieve Depression study (Davis et al, 2006) and in the 2008 National Survey on Drug Use and Health (SAMSHA, 2009). This comorbidity is associated with a higher risk of suicide, greater social and functional impairment, and greater risk of other co-occurring psychiatric disorders (Davis et al, 2006). The family data used for this study show that CD–major depressive episode (MDE) subjects’ relatives have specifically increased risk of CD–MDE. The probability of at least two siblings with CD–MDE among all the families with CD–MDE is 0.28 for African American (AA) and 0.37 for European American (EA), both of which are higher than the prevalence in the general population. In the United States, the lifetime prevalence of MDD was estimated by the National Health and Nutrition Examination Survey III (Riolo et al, 2005) to be 0.104 and 0.075 for EA and AA, respectively. Although prevalence estimates are not available for CD–MDE, the trait for which we aimed to map susceptibility genomic regions in this study, it has a lower population prevalence than CD alone by virtue of being a subset of those affected with the disorder.

Evidence from adoption, twin, and family studies supports the important role of genetic factors in vulnerability to comorbid depression and substance dependence (Ingraham and Wender, 1992; Prescott et al, 2000; Maher et al, 2002). Heritability estimates from twin studies are as high as 72% for CD (Goldman et al, 2005) and as high as 58% for depression (Uhl and Grow, 2004).

The first genomewide linkage scan (GWLS) of comorbid alcoholism and depression (Nurnberger et al, 2001) was published by the Collaborative Study of the Genetics of Alcoholism research group. Although they found significant logarithm of odds (lod) scores on four chromosomes, the significant signals each appeared in only one of the two data sets. They reported that chromosome 1, near 120 centimorgans (cM), might harbor genes for the alcoholism or depression phenotypes with lod scores of 5.12 and 1.52 in the two data sets, respectively (4.66 in the combined data set) and concluded that a locus on chromosome 1 influences vulnerability to alcoholism and affective disorder.

Genomewide association studies (GWAS) have been widely employed in the last few years to identify risk genes for complex traits. Many risk loci have been identified, but loci found through the GWAS generally account for only a small amount of the genetic risk for the trait, typically with an odds ratio of 1.2 or lower for each locus (Manolio et al, 2009). More recently, it has been recognized that multiple rare variants at risk loci may account for some of the ‘missing’ heritability (Manolio et al, 2009; Eichler et al, 2010). Such disease-associated rare variants are unlikely to be detected through GWAS because of the low correlation between common single-nucleotide polymorphisms (SNPs) and rare variants, but an adequately powered genetic linkage analysis could detect their effects (McMillan and Robertson, 1974; Ng et al, 2009), because segregation of multiple causal rare variants at the same locus can sum to a detectable linkage signal over a sample. Therefore, linkage studies are still a useful method for gene discovery. To our knowledge, there have been no previous GWLS reports for the comorbidity of CD and depression, and specific genes associated with comorbid CD and depression have yet to be identified. To address this issue, we conducted an autosomal GWLS for comorbid CD and MDE to identify genomic regions likely to contain risk loci for the co-occurrence of the two disorders.

MATERIALS AND METHODS

Recruitment

Subjects were initially ascertained for genetic studies of CD and opioid dependence (OD) (Gelernter et al, 2006) using the affected sibling pair (ASP) linkage approach. There were four recruitment sites: Yale University School of Medicine (APT Foundation; New Haven, Connecticut), University of Connecticut Health Center (UConn; Farmington, Connecticut), Medical University of South Carolina (MUSC; Charleston, South Carolina), and McLean Hospital (Harvard Medical School; Belmont, Massachusetts). The recruitment and assessment were nearly identical between the two studies (other than for primary trait of interest), and the recruitment periods overlapped. Families were screened and recruited based on the belief that at least two siblings would meet diagnostic criteria for CD or OD. CD families were recruited from all sites, while OD families were recruited only at the UConn and Yale sites. The families used for this linkage study included all those with at least one CD sib-pair, which also included those comorbid with OD (but no OD families that did not also include at least one CD sib-pair). Probands with an axis I clinical diagnosis of a major psychotic disorder such as schizophrenia or schizoaffective disorder were excluded from participation. Regardless of affection status, other siblings and parents were recruited whenever available, to increase linkage information. After a complete description of the study was provided to the subjects, written informed consent was obtained from all subjects. The institutional review board at each recruitment site approved all study materials and the National Institute on Drug Abuse issued a certificate of confidentiality for the work.

Diagnosis and Study Subjects

Subjects were interviewed with the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA) (Pierucci-Lagha et al, 2005; Gelernter et al, 2005). In addition to a section on the diagnosis of CD, a section of the SSADDA is devoted to the assessment of depressive disorders. The diagnoses of CD and MDE were based on the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV). On the basis of DSM-IV, we drew a distinction between an MDE that is substance induced and one that is independent. However, we combined the two to increase sample size and denote the combined category as ‘MDE.’ An individual was considered affected for the comorbid CD and MDE (CD–MDE) only if diagnosed with both CD and MDE (not necessarily at the same time). Table 1 presents the demographic characteristics of the sample, which was approximately evenly divided by sex and ranged in age from 18 to 61 (mean±SD=39.5±8.8). In all, 65% of the families had one or more affected relatives with a lifetime history of MDE, and 92.5% of the families that were positive for MDE had one or more family members who were comorbid for CD and MDE. Table 2 shows the number of informative families used in the GWLS for each trait.

Table 1 Characteristics of the Study Sample and the Distribution of the Number of Families for AAs and EAs with Respect to the Three Phenotypes
Table 2 Summary of Non-Parametric Linkage Analysis for LOD Scores Exceeding the Empirically Derived Suggestive Linkage Thresholds for the Three Phenotypes

Genotyping and Quality Control

DNA was obtained from immortalized cell lines in most cases, but for a small number of subjects DNA was obtained directly from blood or saliva. A total of 1630 individuals were genotyped at the Center for Inherited Disease Research for the 6008 SNPs Illumina Human Linkage IVb Marker Panel. An additional 266 individuals were genotyped at Yale (Keck Center) with the 6090 SNP Illumina Infinium-12 Human Linkage Marker Panel. We limited our analyses to the autosomal SNPs, and used 5636 and 5735 SNPs from the first and second panels, respectively. Among the SNPs in the two panels, 4518 SNPs were in common across the two platforms, and were used for the following quality control and analyses. PLINK software (Purcell et al, 2007) was used to calculate allele frequencies and examine Hardy–Weinberg equilibrium (HWE) in each population using a set of unrelated subjects (355 EA and 384 AA subjects were randomly selected, one per family). SNPs with a genotyping rate <0.95, a minor allele frequency (MAF) <0.1, or not in HWE (p<0.01) were excluded. We used PedCheck (O’Connell and Weeks, 1998) and Merlin (Abecasis et al, 2002) to identify Mendelian inconsistencies. We also used Merlin to identify potentially incorrect genotypes based on estimation of the probability of double-crossover events using the Merlin ‘--error’ option. We used the Pedigree RElationship Statistical Test (PREST) (McPeek and Sun, 2000) to verify family relationships.
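The marker-level exclusion rule above can be illustrated with a minimal Python sketch. It assumes the cutoffs are a genotyping rate below 0.95, a MAF below 0.1, and an HWE p-value below 0.01; the `SnpStats` record, the `passes_qc` helper, and the SNP names are hypothetical and are not part of PLINK or of the original pipeline.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical container for illustration)."""
    name: str
    call_rate: float  # fraction of non-missing genotypes
    maf: float        # minor allele frequency
    hwe_p: float      # p-value of the Hardy-Weinberg equilibrium test


def passes_qc(snp, min_call_rate=0.95, min_maf=0.1, min_hwe_p=0.01):
    """Keep a SNP only if it clears all three assumed cutoffs."""
    return (
        snp.call_rate >= min_call_rate
        and snp.maf >= min_maf
        and snp.hwe_p >= min_hwe_p
    )


# Made-up example values, one SNP failing each criterion.
snps = [
    SnpStats("snp_a", call_rate=0.99, maf=0.25, hwe_p=0.40),   # passes
    SnpStats("snp_b", call_rate=0.90, maf=0.30, hwe_p=0.50),   # low call rate
    SnpStats("snp_c", call_rate=0.99, maf=0.05, hwe_p=0.50),   # low MAF
    SnpStats("snp_d", call_rate=0.99, maf=0.30, hwe_p=0.001),  # HWE departure
]
kept = [s.name for s in snps if passes_qc(s)]
```

In this sketch the filter is applied separately within each population, mirroring the per-population counts of failed SNPs reported in the text.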

There were 21 (15), 318 (68), and 46 (40) SNPs that failed the criteria of genotyping rate, MAF, and HWE in AAs (EAs), respectively. We limited our analyses to the 4133 (4395) remaining autosomal markers in AAs (EAs). A total of 168 (46) Mendelian inconsistencies were identified by PedCheck and possible genotyping error rates of 0.048% (0.079%) were suggested by Merlin in AAs (EAs). These problematic genotypes were set to missing in the linkage analysis. Pedigree errors were detected in two AA families and five EA families by PREST. Of these, the relationships in one AA family and five EA families were corrected based on the shared identical by descent (IBD) patterns and the re-assigned family relationships were verified by PREST. One AA family relationship could not be resolved and the family was excluded from further analysis. Population was assigned by individual genetic profile (refer to Supplementary File 1 for details).

Linkage Analyses

Initially, we used Merlin (Abecasis et al, 2002) and a non-parametric (ie, model free), penetrance-independent, affected-only, allele-sharing model to test for linkage. Allele frequencies were estimated by counting all genotyped individuals. The Kong and Cox linear allele-sharing model (Kong and Cox, 1997) was used to estimate the lod score.

We used Monte Carlo simulations under the null hypothesis of no linkage between phenotype and genotype to assess the empirical thresholds for genomewide suggestive and significant linkages. We used the gene-dropping algorithm implemented in Merlin to simulate 1000 data sets conditional on the observed family structure, marker spacing, allele frequencies, and missing data pattern. To reduce the computational burden attributable to modeling marker-marker linkage disequilibrium (LD), we used a screened marker set with low LD (3675 SNPs in AAs and 3760 SNPs in EAs), that is, r²<0.1 for each pair of markers. Each simulated data set was then analyzed using the same procedure as for the observed data. The highest lod score for each chromosome was recorded for each simulated data set. The genomewide suggestive linkage threshold is defined as the maximum lod score expected once by chance per genome scan (Lander and Kruglyak, 1995) and set as the 1000th highest lod score out of the 22 000 lod scores (one for each of the 22 autosomal chromosomes in each of the 1000 simulations). The genomewide significant threshold was set as the 95th percentile of the distribution of 22 000 lod scores. These empirical thresholds are shown in Table 2. The autosomal genomewide empirical significance of an observed lod score was derived from the same simulation. It was estimated by counting how often the entire genome had a maximum lod score greater than or equal to the observed score across the 1000 simulated replicates.
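As a rough illustration of how the suggestive threshold and the genomewide empirical p-value could be read off the simulation output, the following Python sketch operates on a table of per-chromosome maximum lod scores, one row per simulated replicate. The function names and the toy numbers are hypothetical; the real analysis used 1000 replicates and 22 autosomes, and the simulation itself was done in Merlin, not in this code.

```python
def suggestive_threshold(sim_max_lods):
    """Lod score exceeded, on average, once per genome scan.

    sim_max_lods: list of replicates, each a list holding the maximum lod
    score per chromosome. With R replicates, the R-th highest pooled score
    is exceeded about once per replicate.
    """
    n_reps = len(sim_max_lods)
    pooled = sorted((lod for rep in sim_max_lods for lod in rep), reverse=True)
    return pooled[n_reps - 1]


def genomewide_p(observed_lod, sim_max_lods):
    """Fraction of replicates whose genomewide maximum reaches the observed lod."""
    hits = sum(1 for rep in sim_max_lods if max(rep) >= observed_lod)
    return hits / len(sim_max_lods)


# Toy input: 4 replicates x 3 chromosomes (made-up values).
toy = [
    [0.2, 1.1, 0.5],
    [2.4, 0.3, 0.9],
    [0.7, 0.1, 1.6],
    [0.4, 3.1, 0.8],
]
threshold = suggestive_threshold(toy)
p_value = genomewide_p(2.0, toy)
```

With the toy input, the fourth highest of the twelve pooled scores is the suggestive threshold, and two of the four replicates have a genomewide maximum of at least 2.0.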

Parametric linkage analysis may increase statistical power by including information contributed by families with only one affected individual and by unaffected individuals. We conducted parametric linkage analysis using the program Merlin (Abecasis et al, 2002) to verify the significant linkage signals from non-parametric analysis and identify other linkage regions. To avoid missing evidence for linkage because of genetic model misspecification, the parameters used in the analyses covered a wide range of both dominant and recessive models: nine levels of penetrance from 0.1 to 0.9 in increments of 0.1 for the disease gene carriers, two levels of penetrance of 0.005 and 0.01 for the non-risk genotype, and fixed disease allele frequencies of 0.05 and 0.3 for the dominant and recessive models, respectively. Hence, 36 models were examined and correction for multiple testing was done using empirical simulations on a genomewide scale. We calculated heterogeneity lod (hlod) scores to allow for linkage heterogeneity using Merlin, which internally maximizes parameters to estimate the degree of heterogeneity. The maximum hlod score over the 36 examined genetic models is presented for each locus. The empirical thresholds of Lander–Kruglyak for suggestive and significant linkages (Lander and Kruglyak, 1995), and the empirical genomewide significance for the maximized hlod scores, were assessed by 1000 simulations in the same way as the non-parametric linkage analysis described above.
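The count of 36 models follows from 2 modes of inheritance × 9 carrier penetrances × 2 non-risk penetrances. A minimal Python sketch of that grid is given below. The mapping of the carrier and non-risk penetrances onto genotypes with 0, 1, or 2 risk alleles is an assumption made for illustration, and the dictionary layout is hypothetical rather than Merlin's model-file format.

```python
from itertools import product

carrier_penetrances = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1 ... 0.9
nonrisk_penetrances = [0.005, 0.01]
# Fixed disease allele frequency per mode of inheritance, as stated in the text.
allele_freq_by_mode = {"dominant": 0.05, "recessive": 0.3}


def penetrance_vector(mode, f_carrier, f_nonrisk):
    """Penetrances for 0, 1, and 2 copies of the risk allele (assumed mapping)."""
    if mode == "dominant":
        return (f_nonrisk, f_carrier, f_carrier)
    return (f_nonrisk, f_nonrisk, f_carrier)


models = [
    {
        "mode": mode,
        "allele_freq": freq,
        "penetrances": penetrance_vector(mode, f_carrier, f_nonrisk),
    }
    for (mode, freq), f_carrier, f_nonrisk in product(
        allele_freq_by_mode.items(), carrier_penetrances, nonrisk_penetrances
    )
]
n_models = len(models)
```

Each entry would correspond to one parametric run, with the maximum hlod over all entries reported per locus.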

RESULTS

Comorbidity in the Sample

The presence of either CD or MDE increased the risk of the other disorder, as is commonly observed. More than one-third (37.4%) of 1896 participants or 39.1% of 739 families had co-occurring lifetime diagnoses of both CD and MDE. The presence of CD nearly doubled the likelihood of a diagnosis of MDE (44.8% in those with CD vs 24.3% in those without CD). The presence of MDE also increased the risk of CD; of those affected with MDE, 90.3% had CD, compared with 78.5% of those without MDE. The high rates of CD, with or without MDE, reflect the fact that the majority of the sample was ascertained as CD sibling pairs.
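The size of these effects can be checked directly from the reported percentages. The short Python sketch below computes the ratio of proportions (risk ratio) and the odds ratio; the helper names are illustrative, and the inputs are the rounded percentages quoted above, so the results are approximate.

```python
def risk_ratio(p_exposed, p_unexposed):
    """Ratio of two proportions."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of the odds implied by two proportions."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed


# MDE in subjects with CD (44.8%) vs without CD (24.3%).
rr_mde = risk_ratio(0.448, 0.243)
or_mde = odds_ratio(0.448, 0.243)

# CD in subjects with MDE (90.3%) vs without MDE (78.5%).
rr_cd = risk_ratio(0.903, 0.785)
or_cd = odds_ratio(0.903, 0.785)
```

On these inputs the risk ratio for MDE given CD is about 1.8 and the odds ratio is about 2.5; the odds ratio computed in the other direction is nearly identical, as expected for a symmetric measure of association.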

Population Assignment

The marker-based population assignment resulted in the majority of self-reported AA and Hispanic black subjects clustering with the AA group, and the majority of self-reported white and Hispanic white subjects clustering with the EA group. Of the self-reported AAs, 5.7% were re-classified to EA. Of the self-reported EAs, 0.76% were re-classified to AA. For mixed-race families, 10.8% were assigned to AA and 15.3% to EA.

Non-Parametric Linkage Analysis

Figure 1 shows non-parametric linkage lod scores from the analysis of AAs. Table 2 lists details of the chromosomal regions that exceeded the empirical threshold for genomewide suggestive linkage for the three phenotypes in AAs. The strongest evidence for linkage reached genomewide significance with the highest lod score of 3.8 on chromosome 7 at 183.4 cM in the analysis of the comorbid CD–MDE phenotype (genomewide empirical p=0.016; point-wise p=0.00001). In the same region, we also observed a genomewide suggestive linkage in the analysis of MDE (lod=3.01, genomewide empirical p=0.065; point-wise p=0.0001), whereas no evidence for linkage was detected for CD. Five other linkage peaks exceeded the empirical genomewide suggestive linkage cutoffs: three peaks on chromosomes 3 (145.5 cM), 9 (99.8 cM), and 10 (144.4 cM) for CD, and two peaks on chromosome 2 at 59.9 and 71.9 cM for MDE and CD–MDE, respectively.
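For readers who want to relate a lod score to a point-wise p-value, the sketch below uses the standard asymptotic conversion for a one-sided linkage test: 2·ln(10)·lod is treated as a chi-square statistic with 1 degree of freedom and the tail probability is halved. This is an assumption for illustration; the text does not state how the reported point-wise p-values were computed, and the function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod):
    """Approximate one-sided point-wise p-value for a lod score.

    Assumes 2*ln(10)*lod follows a 50:50 mixture of a point mass at zero
    and a chi-square distribution with 1 degree of freedom.
    """
    if lod <= 0:
        return 0.5
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


p_mde = lod_to_pointwise_p(3.01)   # MDE peak on chromosome 7
p_low = lod_to_pointwise_p(1.52)   # a modest lod score, for comparison
```

Under this approximation a lod of 3.01 corresponds to a point-wise p-value of roughly 0.0001, in line with the value reported for the MDE peak.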

Figure 1

Lod scores on 22 autosomal chromosomes resulting from non-parametric linkage analyses including African-American families with cocaine dependence (CD), major depressive episode (MDE), and comorbid CD with MDE (CD–MDE). The dashed lines denote the thresholds of lod scores for the three phenotypes for genomewide-‘suggestive’ linkage. (Some of these thresholds appear as one dashed line because they are very close).

The linkage peaks identified in the EA sample were different from those in the AA sample (Figure 2 and Table 2). A nearly genomewide significant linkage peak with lod=2.95 was observed on chromosome 5 at 14.3 cM for the comorbid CD–MDE phenotype (genomewide empirical p=0.055; point-wise p=0.00012), while at the same location there was only modest evidence for linkage for MDE (lod=1.52, point-wise p=0.004) and no evidence of linkage for CD. Four other chromosomal regions showed suggestive linkage signals: chromosomes 18 (110.5 cM), 13 (102.7 cM), and 16 (76.3 cM) for CD, 10 (153.5 cM) for MDE, and 16 (76.3 cM) for CD–MDE.

Figure 2

Lod scores on 22 autosomal chromosomes resulting from non-parametric linkage analyses including European-American families with cocaine dependence (CD), major depressive episode (MDE), and comorbid CD and MDE (CD–MDE). The dashed lines denote the thresholds of lod scores for the three phenotypes for genomewide suggestive linkage. (Some of these thresholds appear as one dashed line because they are very close).

Parametric Linkage Analysis

We performed parametric analysis to examine the robustness of the non-parametric results and potentially identify additional linkage peaks. In the AA sample, the linkage peaks identified by non-parametric analysis were confirmed (Figure 3). Linkage peaks for CD on chromosome 10 at 144.4 cM (peak hlod=4.05) and for MDE (peak hlod=3.60) and CD–MDE (peak hlod=4.58) on chromosome 7 at 183.4 cM attained genomewide significance (Supplementary Table 1 and Supplementary Figure S1). Additionally, a new suggestive linkage peak for CD was identified on chromosome 1 at 61.5 cM (peak hlod=2.04). In the EA sample, parametric analysis confirmed the suggestive linkage peaks for CD–MDE (Figure 3 and Supplementary Figure S2), and the peak on chromosome 5 at 15.3 cM (peak hlod=3.28) attained genomewide significance (Supplementary Table 1). Four new regions on chromosomes 21 (34.6 cM) for CD, 3 (0.98 cM) for MDE, 15 (92.9 cM) for MDE, and 9 (82.2 cM) for CD–MDE reached suggestive linkage thresholds. However, the two suggestive linkage regions on chromosomes 18 (110.5 cM) for CD and 13 (102.7 cM) for MDE identified by the non-parametric approach did not reach genomewide suggestive linkage criteria in the parametric linkage analysis.

Figure 3

Hlod scores on chromosome 7 for African Americans and chromosome 5 for European Americans resulting from the parametric linkage analysis of all possible sibling pairs with cocaine dependence (CD), major depressive episode (MDE), and comorbid CD and MDE (CD–MDE). The dashed lines denote the thresholds of hlod scores for the three phenotypes for genomewide suggestive linkage. (Some of these thresholds appear as one dashed line because they are very close).

DISCUSSION

In this first GWLS for comorbid CD and MDE, we found genomewide significant evidence for loci on chromosome 5 in the EA families and chromosome 7 in the AA families. We also obtained genomewide significant evidence in the AA sample for linkage of CD to a novel locus on chromosome 10. This linkage peak is located 10–15 cM proximal to a suggestive linkage peak for CD–MDE observed in the EA families. Moreover, the chromosome 10 locus for CD in the AA families is 17 and 27 cM distal from linkage peaks we found for alcohol dependence (AD) previously in the EA and AA families, respectively (Gelernter et al, 2009; Panhuysen et al, 2010).

The linkage region on chromosome 5 for CD–MDE spans from 11.2 to 16.7 cM as defined by the 1-lod support interval, is predicted to contain 18 genes, and harbors a particularly promising candidate gene, SRD5A1. This gene encodes 5-alpha-reductase type I, which catalyzes the synthesis of the neurosteroid allopregnanolone, a potent positive modulator of gamma amino butyric acid (GABA) action at GABA-A receptors in the brain (Agís-Balboa et al, 2006). The de novo synthesis of neurosteroids in the brain during stress and after alcohol consumption has been reported (Reddy, 2003; Kumar et al, 2004), and disruption of steroid-regulated GABAergic inhibition has been implicated in anxiety, major depression, AD, and schizophrenia (Eser et al, 2006; MacKenzie et al, 2007; Tanchuck et al, 2009).

The linkage region on chromosome 7 is about 2 cM from the telomere of the long arm and harbors approximately 20 genes. Of these, VIPR2, UBE3C, and PTPRN2 are three promising candidate genes for the comorbid CD–MDE phenotype. VIPR2 is one of a group of circadian genes; it encodes the VIPR2 receptor. The circadian genes have been implicated in cocaine sensitization (Andretic et al, 1999) and drug dependence (Perreau-Lenz et al, 2007) as well as mood disorders (Mendlewicz, 2009).

A recent GWAS of response to the antidepressant citalopram in major depressive disorder showed the most significant associated SNP (rs6966038, p=4.65 × 10^−7) to be located 51 kb from UBE3C (Garriock et al, 2010). UBE3C encodes E3 ubiquitin ligase, which targets and degrades unneeded or damaged proteins by ubiquitinization and proteolysis; ubiquitin-mediated proteolysis has important roles in various types of substance dependence (Self et al, 1998; French et al, 2001). Taken together, ubiquitin-mediated proteolysis may be important in the development of MDE and drug dependence.

PTPRN2 encodes a member of the protein tyrosine phosphatase (PTP) family. PTPRN2 knockout mice showed decreased insulin secretion and significantly decreased release of norepinephrine, dopamine, and 5-HT in the brain, which led to changes in anxiety-like behavior and learning (Nishimura et al, 2009).

In a previous linkage study of comorbid alcoholism and depression, Nurnberger et al (2001) reported a lod score of 3.97 for depression on chromosome 7 at 150 cM in a discovery data set that was predominantly EA subjects, but not in a replication data set. That locus is approximately 33 cM from our linkage peak for MDE and CD–MDE in the AA sample.

The linkage region on chromosome 10 for CD, CD–MDE, and AD in our previous report contains at least three candidate genes whose protein products are involved in G protein-coupled receptor (GPCR) signaling: GRK5 (GPCR kinase 5), RGS10 (regulator of G protein signaling 10), and GPR26 (GPCR 26).

As all the families in this study were ascertained through affected sib-pairs for CD or OD, there are very few subjects having MDE without a substance dependence disorder. Thus, it is difficult to discern whether linkage findings for the CD–MDE phenotype represent a locus for a subtype of CD or for MDE alone. The linkage peaks on chromosomes 2 and 7 in the AA families and on chromosome 16 in the EA families (with similar linkage peaks for CD–MDE and MDE, but almost none for CD only), suggest the existence of a CD subtype (CD comorbid with MDE). However, the non-overlapping linkage peaks in the EA families on chromosome 13 for MDE alone and on chromosomes 5 and 10 for CD–MDE favor the alternative hypothesis: namely, that the findings are best explained by a gene predisposing to MDE. These two linkage signal patterns for CD, MDE, and CD–MDE might co-exist, because the complexity of comorbid addictive (eg, CD) and psychiatric (eg, depression) disorders might involve shared genetic liability with individual-specific genes mediating the development of different disorders. Future studies are needed to identify the gene(s) associated with the CD–MDE phenotype to clarify its relationship with the constituent phenotypes of CD and MDE.

Depression phenotypes are quite heterogeneous and include those that are recurrent and early onset, comorbid with anxiety disorders or bipolar disorders and so on. The most replicated linkage regions associated with depression (any subtype) include chromosomes 3 (105 cM), 8 (25.1 cM), 12 (100–105 cM), 15 (105–115 cM), 17 (28.0 cM), and 18 (75–88 cM) (Abkevich et al, 2003; Camp et al, 2005; Holmans et al, 2004, 2007; Levinson et al, 2007). The linkage regions reported for CD and related phenotypes include regions of chromosomes 3, 10, 12, and 18 (Gelernter et al, 2005). The linkage signals identified in this study do not coincide with any of the previously identified regions. The different phenotypes examined in previous studies and those examined in this study provide a likely explanation for the distinct linkage regions reported.

The results obtained here differed for the EA and AA samples. Considering that the SNP markers used in the linkage analyses were not identical between the AA and EA samples because of the cleaning procedure and analysis routines that excluded different markers for the two populations, we re-ran the analysis using the largest common set of cleaned SNP markers. The resultant linkage signals remained the same within each sample, excluding the possibility that the population-specific findings were due to the use of different marker sets. Many studies have yielded different linkage signals for the same phenotype in different populations (eg, type 2 diabetes mellitus (Malhotra et al, 2009), cardiovascular-related phenotype (Lynch et al, 2005), schizophrenia (Suarez et al, 2006), and nicotine dependence (Li et al, 2008)). There are several possible mechanisms that may underlie the observation of distinct linkage regions between populations, including genetic heterogeneity arising from random genetic drift or differences in adaptation, environmental differences, differences in allele frequencies and population history, and random effects pertaining to sampling. For instance, differences in allele frequencies across populations may very well lead to differences in the ability to detect IBD patterns such that the detected linkage signals are different across populations. In view of this, our results appear to indicate that distinct genetic susceptibility regions underlying comorbid CD–MDE in AAs and EAs were detected because of random effects pertaining to the factors mentioned above.

Strengths of our study include the fact that we ascertained >700 families with ASPs for CD and/or OD in AAs and EAs, and the corroboration of most non-parametric linkage findings by the parametric linkage analysis. The veracity of the linkage on chromosomes 5 and 7 is reinforced by the two analytical approaches. One set of results strengthens the other and reduces the possibility of a false positive.

Major depression and CD are common disorders that exact a heavy toll on individuals, families, and society. Clinically, these two disorders affect each other in terms of treatment outcome, the presence of one disorder resulting in poorer treatment outcomes, and higher relapse rates in the other (Hasin et al, 2002; Nunes and Levin, 2004). Subjects are more prone to relapse if either one of the disorders is untreated. In this study, we show that these comorbid disorders share genetic vulnerability. Our GWLS identified several novel chromosomal regions likely to harbor genes for CD–MDE or CD alone. Genes on other chromosomes may also affect this vulnerability. Further studies by genomewide association, pathway analysis, or next generation sequencing are needed to find causative variants in the linkage regions. Elucidating the genetic basis of CD comorbid with MDE to increase our understanding of the etiology of this specific subtype of CD could foster prevention efforts and the development of more effective treatments.