INTRODUCTION

Cigarette smoking is the most common form of tobacco use (Smith and Fiore, 1999), and is one of the most significant sources of morbidity and death worldwide (Murray, 2006). In the United States, more than 20% of adults are current smokers (CDC, 2007), and cigarette smoking is responsible for ∼438 000 premature deaths and an estimated economic cost of $167 billion annually (CDC, 2005b). In addition, about 20% of high school students (CDC, 2006) and 8% of middle school students (CDC, 2005a) in the United States smoke. Moreover, every day, about 4000 teenagers in the United States initiate cigarette smoking, and more than 1000 of them may become daily cigarette smokers (SAMHSA, 2006). Although a large fraction of smokers try to quit (CDC, 2007), available treatments are effective for only a fraction of them (Hughes et al, 2007; Lerman et al, 2007). Thus, developing therapeutic approaches that can help smokers achieve and sustain abstinence, as well as methods that can prevent people, especially youths, from starting to smoke, remains a major public health challenge.

Cigarette smoking is a complex behavior that includes a number of stages such as initiation, experimentation, regular use, dependence, cessation, and relapse (Ho and Tyndale, 2007; Malaiyandi et al, 2005; Mayhew et al, 2000). Although the initiation of tobacco use, the progression from initial use to smoking dependence, and the ability to quit smoking are undoubtedly affected by various environmental factors, twin, family, and adoption studies have provided strong evidence that genetics has a substantial role in the etiology of these phenotypes (Goode et al, 2003; Lerman and Berrettini, 2003; Lerman et al, 2007; Osler et al, 2001). Earlier studies revealed a considerable genetic contribution to the risk of smoking initiation (SI) (Hardie et al, 2006; Kendler et al, 1999; Li et al, 2003; Mayhew et al, 2000; Morley et al, 2007; Vink et al, 2004), nicotine dependence (ND) (Kendler et al, 1999; Lessov et al, 2004a; Maes et al, 2004; Malaiyandi et al, 2005; Sullivan and Kendler, 1999; True et al, 1999), as well as smoking cessation (SC) (Hamilton et al, 2006; Heath et al, 1999; Morley et al, 2007; Xian et al, 2003).

Nicotine is the main psychoactive ingredient in cigarettes and evokes its physiological effects by stimulating the mesolimbic brain reward system through binding to nicotinic acetylcholine receptors (nAChRs). So far, the majority of candidate gene-based association studies have focused on those genes that may predispose to addictive behavior by virtue of their effects on key neurotransmitter pathways (for example, dopamine and serotonin) and genes that may affect response to nicotine (for example, nAChRs and nicotine metabolism) (Ho and Tyndale, 2007). However, genetic studies have indicated that, for complex behaviors such as cigarette smoking, the individual differences can be attributed to hundreds of genes and their variants. Genes involved in different biological functions may act in concert to account for the risk of vulnerability to smoking behavior, with each gene having a moderate effect (Hall et al, 2002; Lessov et al, 2004b; Tyndale, 2003). Polymorphisms in related genes may cooperate in an additive or synergistic manner and modify the risk of smoking rather than act as sole determinants. Consistent with this belief, more and more genes have been found to be associated with smoking behavior over the past decades, especially during the past few years. Whereas some plausible candidate genes (for example, nAChRs and dopamine signaling) have been reported and the findings have been partially replicated, numerous genes involved in other biological processes and pathways also have been associated with different smoking behaviors. This is especially true now that genome-wide association (GWA) studies are commonly used in genetic studies of complex traits such as smoking, and the underlying genetic factors can be investigated in a high-throughput and more comprehensive manner. In this situation, a systematic approach that is able to reveal the biochemical processes underlying the genes associated with smoking behaviors will not only help us understand the relations of these genes but also provide further evidence of the validity of the individual gene-based association studies.

In this study, we searched the literature to identify genes purportedly associated with SI/progression (P), ND, and SC. We then examined whether these genes are enriched in biochemical pathways important in neuronal and brain function.

MATERIALS AND METHODS

Identification of Smoking-Related Genes

Contemporary genetic association studies of smoking behaviors are focused primarily on SI, progression to smoking dependence, ND (assessed by various measures or scales such as DSM-IV, Fagerstrom Test for Nicotine Dependence, Fagerstrom Tolerance Questionnaire, and/or smoking quantity, etc.), or SC. Only limited studies have been conducted on SI and progression to ND, and considering the potential overlap of these two highly related behaviors, we combined them into the single category of SI/P.

The list of candidate genes for the three smoking-related phenotypes was created by searching all human genetic association studies deposited in PUBMED (http://www.ncbi.nlm.nih.gov/pubmed/). Similar to Sullivan et al (2004), we used the query ‘(Smoking [MeSH] OR Tobacco Use Disorder [MeSH]) AND (Polymorphism [MeSH] OR Genotype [MeSH] OR Alleles [MeSH]) NOT (Neoplasms [MeSH])’, which retrieved a total of 1790 hits as of September 2008. The abstracts of these articles were reviewed, and the association studies of any of the three smoking-related behaviors were selected. From the selected publications, we narrowed our selection by focusing on those reporting a significant association of one or more genes with any of the three phenotypes. To reduce the number of false-positive findings, the studies reporting negative or nonsignificant associations were not included, although it is likely that some of the genes analyzed in these studies might be associated with the phenotypes that we were interested in. The full reports of the selected publications were reviewed to ensure that the conclusions were supported by the content. From these studies, genes reported to be associated with each phenotype were selected for the current study.
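The search can also be expressed programmatically. The following is a minimal sketch, assuming the NCBI E-utilities `esearch` endpoint is used rather than the PubMed web interface; the function name `build_search_url`, the exact cutoff date, and the `retmax` value are illustrative assumptions that do not come from the text. It only assembles the request URL and performs no network call, and a search run today would not necessarily return the 1790 records retrieved in September 2008.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# NCBI E-utilities search endpoint (assumed here; not named in the text).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string as given in the text.
QUERY = (
    "(Smoking [MeSH] OR Tobacco Use Disorder [MeSH]) "
    "AND (Polymorphism [MeSH] OR Genotype [MeSH] OR Alleles [MeSH]) "
    "NOT (Neoplasms [MeSH])"
)


def build_search_url(term=QUERY, max_date="2008/09/30", retmax=2000):
    """Return an esearch URL restricted to records published up to max_date."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",       # filter on publication date
        "mindate": "1900/01/01",  # esearch expects mindate and maxdate together
        "maxdate": max_date,      # assumed cutoff; the text says only "September 2008"
        "retmax": retmax,
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url())
```

The abstracts returned by such a query would still need the manual review described above; the sketch covers only the retrieval step.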

The results from several GWA studies were included. In the work of Bierut et al (2007), 35 of 31 960 SNPs were identified with p-values <0.0001, and several genes were suggested to be associated with ND, including neurexin 1 (NRXN1), vacuolar sorting protein (VPS13A), transient receptor potential channel (TRPC7), as well as a classic candidate gene related to smoking, neuronal nicotinic cholinergic receptor β3 (CHRNB3). All the genes nominated in this study were included in our list for ND. In another large-scale candidate gene-based association study, Saccone et al (2007) analyzed 3713 SNPs corresponding to more than 300 candidate genes. The top five SNPs with the smallest false discovery rate (FDR) values (ranging from 0.056 to 0.166) corresponded with neuronal nicotinic cholinergic receptor α3 (CHRNA3), α5 (CHRNA5), and CHRNB3. In the work of Uhl et al (2007), allele frequencies in nicotine-dependent and control individuals were compared for 520 000 SNPs, and 32 genes were suggested to be potentially associated with ND; all of them were included in the ND-related gene list. In a recent study on SC, Uhl et al (2008) performed GWA studies on three independent samples to identify genes facilitating SC success with bupropion hydrochloride vs nicotine replacement therapy. Various genes involved in cell adhesion, transcription regulation, transportation, and signal transduction were suggested to be candidates contributing to successful SC. From this study, we included those genes that showed significant association with SC in all three samples (eight genes) or in two samples with at least two nominally significant SNPs in each sample (55 genes).
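The two inclusion criteria applied to the SC results of Uhl et al (2008) can be written as a simple filter. The sketch below is one possible reading of the rule, assuming that "significant association in a sample" means at least one nominally significant SNP in that sample. The function name, the data layout, and the gene labels are hypothetical, and the counts are invented for illustration.

```python
def select_sc_genes(sig_snps_per_sample):
    """Apply the two inclusion criteria to per-gene SNP counts.

    sig_snps_per_sample maps a gene symbol to a list with one entry per
    sample: the number of nominally significant SNPs in that sample.
    Returns a dict mapping each selected gene to the criterion it met.
    """
    selected = {}
    for gene, counts in sig_snps_per_sample.items():
        if len(counts) == 3 and all(c >= 1 for c in counts):
            # Criterion 1: associated in all three samples.
            selected[gene] = "all three samples"
        elif sum(c >= 2 for c in counts) >= 2:
            # Criterion 2: two samples, each with >= 2 significant SNPs.
            selected[gene] = "two samples, >=2 SNPs each"
    return selected


# Hypothetical counts for illustration only; not data from Uhl et al (2008).
example = {
    "GENE_A": [1, 3, 1],  # meets criterion 1
    "GENE_B": [2, 0, 4],  # meets criterion 2
    "GENE_C": [1, 0, 5],  # only one sample with >= 2 SNPs: excluded
    "GENE_D": [0, 0, 0],  # no association: excluded
}

if __name__ == "__main__":
    print(select_sc_genes(example))
```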

Identification of Enriched Biochemical Pathways

By literature search, we collected a list of genes associated with each smoking-related phenotype. To get a better understanding of the underlying biological mechanisms, multiple bioinformatics tools were used to identify the significantly enriched pathways involved in the smoking phenotypes. The available pathway analysis tools can be classified into three categories (Tarca et al, 2009): (1) Over-representation analysis (ORA), which compares the genes of interest with genes in predefined pathways and identifies the pathways that include a statistically higher number of genes in the list of interest as overrepresented; (2) functional class scoring (FCS), which compares the genes in chosen pathways with the entire list of genes sorted by certain criteria and identifies the pathways showing statistically significant correlation with the phenotypes under study; and (3) impact analysis, which is similar to ORA, but also considers the connections of genes in the pathways. The FCS approach, for example, GSEA (Subramanian et al, 2005), is not feasible for the current analysis as it requires gene expression measurements. The following is a brief description of the four pathway analysis tools used in the current study.

Ingenuity pathway analysis

The core of Ingenuity pathway analysis (IPA) (http://www.ingenuity.com/) is the Ingenuity Pathways Knowledge Base (IPKB), a manually curated knowledge database consisting of function, interaction, and other information of genes/proteins. On the basis of such information, IPA is able to perform analysis on global canonical pathways, dynamically generated biological networks, and global functions from a list of genes. Currently, the IPKB includes 81 canonical metabolic pathways involved in various metabolism processes such as energy metabolism, metabolism of amino acids, and complex carbohydrates. It also includes 202 signaling pathways, such as those related to neurotransmitter signaling, intracellular and secondary signaling, and nuclear receptor signaling.

In our analysis, the gene symbol and the corresponding GenBank accession numbers of genes associated with each smoking phenotype were uploaded into the IPA and compared against the genes in each canonical pathway included in the IPKB. All the pathways with one or more genes overlapping the candidate genes were extracted. A significance value was assigned by the program to measure the chance that the genes of interest participate in a given extracted pathway. Briefly, the p-value for a given pathway was calculated by considering: (1) the number of input genes that could be mapped to this pathway in the IPKB, denoted by m; (2) the number of genes involved in this pathway, denoted by M; (3) the total number of input genes that could be mapped to the IPKB, denoted by n; and (4) the total number of known genes included in the IPKB, denoted by N. Then the p-value was calculated using the right-tailed Fisher's exact test (which is identical to the hypergeometric distribution in this case):

\[
p = \sum_{k=m}^{\min(M,\,n)} \frac{C(M, k)\, C(N-M, n-k)}{C(N, n)}
\]

where C(M, k), C(N−M, n−k), and C(N, n) are binomial coefficients. In general, a p-value <0.05 indicates a statistically significant, non-random association.
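The right-tailed test described above can be computed directly from the four counts m, M, n, and N. The following is a minimal sketch using only the Python standard library; the function name `pathway_p_value` is illustrative, and the code is not the implementation used inside IPA.

```python
from math import comb


def pathway_p_value(m, M, n, N):
    """Right-tailed hypergeometric probability.

    m: input genes mapped to the pathway
    M: genes in the pathway
    n: input genes mapped to the knowledge base
    N: all genes in the knowledge base
    Returns P(X >= m) when n genes are drawn without replacement from N.
    """
    upper = min(M, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    tail = sum(comb(M, k) * comb(N - M, n - k) for k in range(m, upper + 1))
    return tail / comb(N, n)


if __name__ == "__main__":
    # Toy numbers: 2 of 3 input genes fall in a 4-gene pathway, 10 genes overall.
    print(pathway_p_value(2, 4, 3, 10))
```

Because the sum uses exact integer arithmetic before the final division, the result does not suffer from floating-point underflow even when N is in the tens of thousands.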

As many pathways were examined, multiple comparison correction for the individually calculated p-values was necessary to permit reliable statistical inferences. The output p-values for the pathways associated with each smoking phenotype were analyzed separately using the MATLAB Bioinformatics Toolbox (The Mathworks, Natick, MA), which calculated the FDR by the method of Benjamini and Hochberg (1995).
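For readers without access to MATLAB, the Benjamini and Hochberg (1995) adjustment can be sketched in a few lines of Python. This is an illustrative re-implementation of the step-up procedure, not the toolbox code used in the study; the function name `benjamini_hochberg` is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical pathway p-values for illustration only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A pathway is then reported at a given FDR level if its adjusted value falls below that level, which corresponds to the FDR<0.05 threshold used in the Results.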

The database for annotation, visualization, and integrated discovery

The database for annotation, visualization, and integrated discovery (DAVID) (http://david.abcc.ncifcrf.gov) (Hosack et al, 2003; Huang da et al, 2009) is a bioinformatics resource consisting of an integrated biological knowledge database and analytic tools aimed at extracting biological themes from gene/protein lists systematically. Compared with other tools, DAVID can provide an integrated and expanded back-end annotation database, multiple modular enrichment algorithms, and exploratory ability in an integrated data-mining environment. In our analysis, the input genes were analyzed using different text- and pathway-mining tools including gene functional classification, functional annotation chart or clustering, and functional annotation table. Pathway analysis was performed using its Functional Annotation Tool based on the Kyoto Encyclopedia of Genes and Genomes (KEGG; www.genome.jp/kegg) and Biocarta (www.biocarta.com) pathway databases. The enrichment of given pathways in the gene list was measured by EASE score, a modified Fisher exact test p-value. The program also performed p-value correction based on the method of Benjamini and Hochberg (1995).
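The EASE score is commonly described as a conservative variant of the Fisher exact test in which one gene is removed from the list hits before the probability is computed. The sketch below illustrates that idea with a right-tailed hypergeometric probability; it is an approximation under this stated assumption, the function names are illustrative, and DAVID's own contingency-table construction may differ in detail.

```python
from math import comb


def hypergeom_tail(m, M, n, N):
    """P(X >= m) for a hypergeometric draw of n genes from N, M in the pathway."""
    return sum(
        comb(M, k) * comb(N - M, n - k) for k in range(m, min(M, n) + 1)
    ) / comb(N, n)


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Conservative enrichment p-value: recompute with one list hit removed."""
    penalized_hits = max(list_hits - 1, 0)
    return hypergeom_tail(penalized_hits, pop_hits, list_total, pop_total)


if __name__ == "__main__":
    # Hypothetical counts: 3 of 300 list genes in a 40-gene pathway,
    # against a background of 30 000 genes.
    print("unmodified:", hypergeom_tail(3, 40, 300, 30000))
    print("EASE-style:", ease_score(3, 300, 40, 30000))
```

The penalty matters most for pathways supported by very few genes, where removing a single hit changes the p-value substantially.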

GeneTrail

GeneTrail (Keller et al, 2008) (http://genetrail.bioinf.uni-sb.de) is a web-based bioinformatics tool that statistically evaluates gene/protein lists for enrichment of functional categories. It can analyze a wide variety of biological categories and pathways based on multiple databases such as KEGG and Gene Ontology (GO; http://www.geneontology.org/). The GeneTrail analysis tool used in this work was the ‘Over-representation Analysis’ module, which compares the gene list with a reference set of genes and identifies the overrepresented pathways.

Onto Pathway-Express

Onto Pathway-Express (http://vortex.cs.wayne.edu/ontoexpress/) is a pathway analysis tool based on the KEGG pathway database. Unlike other tools such as IPA and DAVID, this bioinformatics tool integrates the pathway topology and the position information of each gene in the pathway into its enrichment analysis. By implementing an impact factor analysis, Onto Pathway-Express incorporates both the probabilistic component and the gene interactions into pathway identification (Draghici et al, 2007). Briefly, on the basis of the input gene list, Onto Pathway-Express calculates a perturbation factor for each gene by taking into account its expression level and the perturbation of genes downstream from it in each selected pathway. The impact factor of the entire pathway includes a probabilistic term that takes into consideration the proportion of differentially regulated genes in the pathway and the perturbation factors of all genes in the pathway. The output pathways are assigned significance levels according to their impact factors, and the FDR values are computed by the method of Benjamini and Hochberg (1995).

Of the four tools, IPA, DAVID, and GeneTrail use the ORA approach. However, the pathway databases underlying these tools differ: IPA relies on a proprietary database, whereas DAVID (KEGG and Biocarta) and GeneTrail (KEGG) use public databases. Although Onto Pathway-Express also performs its analysis on the basis of the KEGG database, its analysis algorithm differs from those of the other tools. By combining these methods, we expect to obtain a relatively comprehensive evaluation of the pathways associated with the genes important to each smoking-related phenotype.

RESULTS

Identification of Genes Reported to be Associated with Each Smoking Behavior

By searching PUBMED, we extracted publications on the genetic association studies related to tobacco smoking. In the current work, we focused only on the studies related to one of the three phenotypes: SI/P, ND, and SC. For each phenotype, only the publications reporting a significant association of one or more genes with this phenotype were collected; those in which the original authors reported a negative result or a nonsignificant association were not included. A detailed list of all genes reported to be associated with each of the three phenotypes is provided in Table 1.

Table 1 Genes Associated with Smoking-Related Behaviors

For SI/P, 16 genes were identified in 15 studies, all of which were performed at individual gene level. Among them are five nAChR subunit genes, that is, CHRNA3, CHRNA5, CHRNA6, CHRNB3, and CHRNB4; dopamine receptor D2 (DRD2) and D4 (DRD4); and one serotonin receptor (HTR6). The genes encoding transporters of dopamine (DAT1 or SLC6A3) and serotonin (5-HTT or SLC6A4) were also included. The other genes were those involving the functions related to nicotine or neurotransmitter metabolism/synthesis such as COMT, CYP2A6, and TPH1; signal transduction (for example, PTEN and RHOA); or immune response (for example, IL8).

Regarding ND, there were 76 publications, including 73 studies focused on either single or a few genes. In these papers, 63 genes were reported to be significantly associated with ND by the original authors. The other three studies were either on a genome-wide scale (Bierut et al, 2007; Uhl et al, 2007) or on hundreds of candidate genes (Saccone et al, 2007), and they nominated a total of 41 genes. Collectively, 99 unique genes are included in the final list. The most prominent genes were those encoding acetylcholine receptors (CHRM1, CHRM5, CHRNA4, CHRNA5, and CHRNB2), dopamine receptors (DRD1, DRD2, DRD3, and DRD4), GABA receptors (GABRA2, GABRB2, GABARAP, and GABRA4), serotonin receptors (HTR1F and HTR2A), as well as proteins involved in nicotine or neurotransmitter metabolism/synthesis (for example, CYP2A6, DBH, MAOA, and TPH1).

For SC, 63 genes were nominated by a GWA study (Uhl et al, 2008) and 12 by 23 candidate gene-based association studies. These genes were involved in various biological functions, such as dopamine receptor signaling (DRD2, DRD4, and SLC6A3), glutamate receptor signaling (GRIK1, GRIK2, GRIN2A, and SLC1A2), and calcium signaling (for example, CACNA2D3, CACNB2, CDH13, and ITPR2).

Among the genes associated with the three smoking phenotypes, five were included in all three lists, that is, COMT, CYP2A6, DRD2, DRD4, and SLC6A3. Another six genes, that is, CHRNA3, CHRNA5, CHRNB3, PTEN, SLC6A4, and TPH1, were associated with SI/P and ND. Ten genes, that is, A2BP1, ARRB2, CDH13, CHRNB2, CSMD1, CYP2B6, DBH, OPRM1, PRKG1, and PTPRD, were associated with ND and SC.
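Overlaps of this kind can be derived with plain set operations. The sketch below is illustrative only: it builds partial gene sets from the shared genes named in this paragraph (the complete lists are given in Table 1, so phenotype-specific genes are omitted), and the function name `overlap_summary` is an assumption.

```python
# Shared genes as named in the text; phenotype-specific genes are omitted.
ALL_THREE = {"COMT", "CYP2A6", "DRD2", "DRD4", "SLC6A3"}
SIP_AND_ND = {"CHRNA3", "CHRNA5", "CHRNB3", "PTEN", "SLC6A4", "TPH1"}
ND_AND_SC = {
    "A2BP1", "ARRB2", "CDH13", "CHRNB2", "CSMD1",
    "CYP2B6", "DBH", "OPRM1", "PRKG1", "PTPRD",
}

# Partial per-phenotype lists reconstructed from the shared genes only.
sip = ALL_THREE | SIP_AND_ND
nd = ALL_THREE | SIP_AND_ND | ND_AND_SC
sc = ALL_THREE | ND_AND_SC


def overlap_summary(sip, nd, sc):
    """Partition the shared genes by which phenotype lists contain them."""
    return {
        "all_three": sip & nd & sc,
        "sip_nd_only": (sip & nd) - sc,
        "nd_sc_only": (nd & sc) - sip,
        "sip_sc_only": (sip & sc) - nd,
    }


if __name__ == "__main__":
    for name, genes in overlap_summary(sip, nd, sc).items():
        print(name, len(genes), sorted(genes))
```

Applied to the full lists in Table 1, the same function would also expose any genes shared only by SI/P and SC, a category the text does not report.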

Enriched Biological Pathways Associated with Each Smoking-Related Phenotype

On the basis of the genes related to each smoking phenotype, enriched biochemical pathways were identified by IPA and other bioinformatics tools. For SI/P, the 16 genes (Table 1) were overrepresented in 9 pathways defined in the IPA database (p<0.05; Table 2). For five of these pathways (calcium signaling, dopamine receptor signaling, serotonin receptor signaling, cAMP-mediated signaling, and G-protein-coupled receptor signaling), the corresponding FDR values were <0.05. The other four pathways (tryptophan metabolism, tight junction signaling, IL-8 signaling, and integrin signaling) had slightly higher FDR values (0.085–0.116).

Table 2 Pathways Overrepresented by Genes Associated with Smoking Initiation/Progression

The IPA assigned 51 of the 99 genes associated with ND to 21 overrepresented pathways (p<0.05; Table 3). Fourteen of these pathways (for example, dopamine receptor signaling, cAMP-mediated signaling, G-protein-coupled receptor signaling, and serotonin receptor signaling) had an FDR<0.05, and the other pathways (for example, fatty acid metabolism and synaptic long-term potentiation (LTP)) had an FDR<0.14.

Table 3 Pathways Overrepresented by Genes Associated with Nicotine Dependence

For SC, 13 pathways were found to be enriched in 18 of the 75 genes associated with this phenotype (p<0.05; Table 4). Four of the pathways (dopamine receptor signaling, glutamate receptor signaling, cAMP-mediated signaling, and calcium signaling) had an FDR<0.05, and the remaining pathways (for example, synaptic LTP, G-protein-coupled receptor signaling, and synaptic long-term depression (LTD)) had an FDR ranging from 0.082 to 0.18.

Table 4 Pathways Overrepresented by Genes Associated with Smoking Cessation

Of the pathways enriched in the genes associated with each smoking phenotype, four, that is, calcium signaling, cAMP-mediated signaling, dopamine receptor signaling, and G-protein-coupled receptor signaling, were associated with all three smoking behaviors (Table 5). Two other enriched pathways (that is, serotonin receptor signaling and tryptophan metabolism) were shared by SI/P and ND, and three enriched pathways (neurotrophin/TRK signaling, synaptic LTP, and tyrosine metabolism) were shared by ND and SC.

Table 5 Identified Common and Specific Pathways for Each Smoking Behavior Category

The enrichment of these pathways in multiple smoking phenotypes was consistent with the fact that synaptic transmission-related biological processes, such as nicotine-nAChR and dopamine signaling, were the key biochemical components underlying different smoking-related behaviors. This also implies that the genes involved in these three smoking phenotypes indeed overlap highly. On the basis of these biochemical relationships, we present in Figure 1 a schematic representation of the major pathways associated with the three phenotypes.

Figure 1

Schematic representation of the genes and major pathways involved in smoking initiation/progression (SI/P), nicotine dependence (ND), or smoking cessation (SC). Genetic studies have indicated that tobacco smoking is a complex disorder. On the basis of the genes associated with SI/P, ND, and SC, we identified various enriched pathways corresponding to each phenotype of interest. These pathways were then connected on the basis of their biological relations. Because many pathways overlap among the three phenotypes, all identified pathways are shown together for simplicity.

DISCUSSION

Over recent decades, much has been learnt from animal or cell models about the molecular mechanisms underlying nicotine treatment. Numerous genes and pathways have been found to have a role, either directly or indirectly, in these important smoking-related phenotypes. However, it is less clear whether the same sets of genes and pathways are involved in tobacco dependence of humans. Epidemiological studies have shown that genetic factors are responsible for a significant portion of the risk for SI, ND, and SC (Hamilton et al, 2006; Lerman and Berrettini, 2003; Li et al, 2003; Mayhew et al, 2000; Sullivan and Kendler, 1999). Moreover, significant genetic overlaps have been identified among these three phenotypes (Ho and Tyndale, 2007; Kendler et al, 1999; Maes et al, 2004). Identifying vulnerability genes for the three phenotypes, especially the biochemical pathways associated with them, will not only provide a systematic overview of the genetic factors underlying different smoking behaviors but is also helpful in guiding selection of potentially important genes for further analysis. With a thorough review of the genes contributing to the genetic risk of smoking behaviors, and a systematic search for gene networks using various pathway analysis tools, herein, we provide a comprehensive view of the biochemical pathways involved in the three major smoking phenotypes (see Figure 1 for details).

Although candidate gene-based association studies have provided much of our knowledge about factors contributing to smoking behaviors, a systematic approach, as reported in this study, has significant advantages. For complex disorders such as tobacco smoking, the presence of genetic heterogeneity and multiple interacting genes, each with a small to moderate effect, is considered to be the major hurdle in genetic association studies (Ho and Tyndale, 2007; Lessov-Schlaggar et al, 2008). Numerous genetic factors have been implicated, but in many cases, these findings cannot be replicated in independent studies. At the same time, because of resource limitations, a significant proportion of reported genetic studies might not have sufficient sample size or enough replication samples to reduce the rate of false-positive associations arising from multiple testing. This is especially true for GWA studies, in which tens of thousands of SNPs can be analyzed simultaneously. A pathway approach, which takes account of the biochemical relevance of genes identified from association studies, not only can be more robust to potential false positives caused by factors such as low density of markers, small sample sizes, different ethnicities, and heterogeneity within and between samples but also may yield a more comprehensive view of the genetic mechanism underlying smoking behaviors. Moreover, although in candidate gene-based association studies, the selection of targets may be focused on some specific biological processes or pathways, the results from GWA studies seem to be more diverse. In such cases, pathway analysis becomes more necessary to detect the main biological themes from the genes involved in different functions. For example, in a recently reported GWA study, Vink et al (2009) identified 302 genes associated with SI and current smoking, but no gene involved in classic targets, such as dopamine receptor signaling or nAChRs, was detected. Instead, they identified genes related to glutamate receptor signaling, tyrosine kinase signaling, and cell-adhesion proteins. In our analysis based on genes other than those reported by Vink et al, glutamate receptor signaling was enriched among the genes associated with SC, and TRK signaling was enriched in both ND and SC (see Tables 3 and 4, and Figure 1). With the increased interest in conducting GWA studies for smoking behavior and other complex traits, a pathway approach will become more useful.

However, there are several limitations of this study. First, our pathway analysis results depend entirely on genes reported to be associated with each smoking phenotype of interest. Given that identification of susceptibility genes for each smoking phenotype is an ongoing process, the pathways identified in this report should be treated in the same way. Therefore, the pathways identified here are only some of the pathways that may be involved in the regulation of the three phenotypes. This is especially true for SI/P and SC, as significantly more genetic studies have been conducted on ND compared with the other smoking phenotypes. Second, we adopted the conclusions drawn by the original authors of each study in our pathway analysis. This means that some of our conclusions may be biased by some of those original reports because of their small sample size, the presence of heterogeneity, or absence of correction for multiple testing. Initially, we tried to apply a general standard to all those reported studies but had to abandon this because different research groups conducted those studies over different time periods. It was challenging to redraw a conclusion from those studies reported by other researchers. However, we do not think this will affect our results greatly, as we have included as many reports as we could get from the literature. Third, for the sake of simplicity and increasing the number of genes included in each smoking phenotype, we classified more than 100 reports on smoking-related behaviors from different ethnic populations into three broad categories, that is, SI/P, ND, and SC. This is certain to bring a heterogeneity issue to the three phenotypes of interest, especially for SI/P and ND. Fourth, the direction of association is an important issue. For example, some variations may be associated with a protective effect against SI or ND, whereas others may increase the risk of such tendencies. Considering the fact that the direction of association depends on genetic variants under investigation for a given phenotype, we did not consider it in our current analyses. Because at this stage we are more interested in the genes and pathways potentially associated with smoking behaviors, focusing on the genes without considering the association directions will not create a serious problem. In addition, to simplify the analysis and reduce the number of false-positive genes, we did not include publications reporting negative or nonsignificant results. However, we realize that some genes from these studies may be among the factors associated with the smoking behaviors of interest. The fact that they were not found to be associated is likely attributable to other factors such as the small sample size or the presence of heterogeneity in their samples.

Although there are some limitations to this study, some interesting findings emerged, which probably never would have been identified in any single genetic study, including GWA, in one or a few samples. For example, we found that calcium signaling, dopamine receptor signaling, and cAMP-mediated signaling are the main pathways enriched in all three smoking phenotypes. The most prominent calcium signaling-related genes associated with each phenotype were nACh receptors. By mediating intracellular Ca2+ concentration, these ligand-gated cation channels have an important role in regulating various neuronal activities, including neurotransmitter release (Marshall et al, 1997; Wonnacott, 1997). Transcription factors, such as CREBs (cAMP responsive element-binding proteins), are crucial for conversion of events at cell membranes into alterations in gene expression. Regulation of the activity of CREB by drugs of abuse or stress has a profound effect on an animal's responsiveness to emotional stimuli (Carlezon et al, 2005; Conti and Blendy, 2004). The CREB function in the neurons is normally regulated by glutamatergic and dopaminergic inputs (Dudman et al, 2003).

The mesolimbic dopamine pathway is believed to be one of the central pathways underlying addiction to various drugs of abuse (Nestler, 2005). Genes included in this pathway are among the major targets of association study for ND. Although this pathway is enriched in all three smoking-related phenotypes, the genes associated with each smoking phenotype are different. For SI/P, the genes reported in the literature, such as COMT, DRD2, DRD4, and SLC6A3, were shared by ND and SC. For SC, two genes, FREQ and PPP2R2B, were uniquely detected. The FREQ protein (also known as neuronal calcium sensor 1, NCS1), a member of the neuronal calcium sensor family, has been implicated in the regulation of a wide range of neuronal functions such as membrane traffic, cell survival, ion channels, and receptor signaling (Burgoyne, 2007). In mammalian cells, FREQ may couple the dopamine and calcium signaling pathways by direct interaction with DRD2, implying an important role in the regulation of dopaminergic signaling in normal and diseased brain (Kabbani et al, 2002). The interaction between variants of DRD2 and FREQ significantly impacts the efficacy of nicotine replacement therapy (Dahl et al, 2006). PPP2R2B encodes a brain-specific regulatory subunit of protein phosphatase 2A (PP2A) and gives rise to multiple splice variants in neurons (Dagda et al, 2003; Schmidt et al, 2002). The product of this gene is suggested to be localized in the outer mitochondrial membrane and involved in neuronal survival regulation through the mitochondrial fission/fusion balance (Dagda et al, 2008). A CAG-repeat expansion in a non-coding region of this gene is responsible for the neurodegenerative disorder, spinocerebellar ataxia type 12 (SCA12) (Holmes et al, 1999). Although the dopamine receptor pathway has an important role in all three smoking phenotypes, it is possible that different parts of this pathway are involved in each smoking behavior, with SI/P and ND having greater similarity than SC. Given the importance of this pathway in the development of drug addiction, more genes need to be verified to obtain a more specific picture of its role underlying each phenotype.

Serotonin modulates dopamine release and has been implicated in nicotine reinforcement. Earlier work has shown that serotonin concentrations are increased by nicotine administration and decreased during withdrawal. Serotonin receptor signaling was enriched in the genes associated with SI/P and ND, but not in those associated with SC, in our analysis. In several recent studies designed to investigate the association between genes from the serotonin receptor signaling pathway and SC, no positive result was obtained (Brody et al, 2005; David et al, 2007b, 2008a; Munafo et al, 2006; O'Gara et al, 2008). Similar to the serotonin receptor signaling pathway, tryptophan metabolism, the pathway involved in the biological synthesis of serotonin, is enriched in the genes associated with SI/P and ND, but not in those associated with SC. Consistent with this result, to date, the clinical effects of serotonergic-based drugs in SC are largely negative (Fletcher et al, 2008). Although more studies are needed, these results suggest that the genetic variants in serotonin receptor signaling and tryptophan metabolism pathways may be less important in SC.

Glutamate receptor signaling was found to be enriched in the genes associated with SC, but not in those associated with the other two phenotypes. In a recent GWA study (Vink et al, 2009), multiple genes from the glutamate receptor signaling pathway were suggested to be associated with SI and current smoking. Similarly, the glutamate receptor signaling-related genes associated with SC were also identified by a GWA study (Uhl et al, 2008). The genes in this pathway that are associated with SC include GRIK1, GRIK2, GRIN2A, and SLC1A2, whereas GRIN2A, GRIN2B, GRIK2, and GRM8 were associated with SI and current smoking (Vink et al, 2009). Another gene, GRM7, was suggested to be associated with ND in an earlier GWA study (Uhl et al, 2007). Taken together, these results suggest that glutamate receptor signaling is involved in all three phenotypes of interest. In addition, most of the genes from this pathway have so far been identified by GWA studies, showing the great potential of the GWA approach for identifying genetic variants related to smoking behavior.
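
The overlap among the glutamate receptor signaling genes listed above can be made explicit with a few set operations. The following is only an illustrative sketch: the gene lists are copied from the text, whereas the variable names are our own and are not part of the original analysis.

```python
# Glutamate receptor signaling genes named in the text, grouped by phenotype.
# Variable names are illustrative assumptions, not taken from the original analysis.
sc_genes = {"GRIK1", "GRIK2", "GRIN2A", "SLC1A2"}   # SC (Uhl et al, 2008)
si_genes = {"GRIN2A", "GRIN2B", "GRIK2", "GRM8"}    # SI and current smoking (Vink et al, 2009)
nd_genes = {"GRM7"}                                 # ND (Uhl et al, 2007)

# Genes reported for both SC and SI/current smoking.
shared_sc_si = sorted(sc_genes & si_genes)

# Genes reported for only one of the phenotypes.
sc_only = sorted(sc_genes - si_genes - nd_genes)
si_only = sorted(si_genes - sc_genes - nd_genes)
nd_only = sorted(nd_genes - sc_genes - si_genes)

print("Shared by SC and SI:", shared_sc_si)
print("SC only:", sc_only)
print("SI only:", si_only)
print("ND only:", nd_only)
```

On these lists, GRIK2 and GRIN2A are common to SC and SI/current smoking, whereas the remaining genes are each reported for a single phenotype.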

Our analysis indicates that the LTP pathway was enriched in genes associated with ND and SC, and the LTD pathway was enriched in genes associated with SC. Repeated exposure of neurons to nicotine eventually modulates the functioning of the neural circuits in which the neurons operate. LTP and LTD are thought to be critical mechanisms that contribute to such modifications in neuronal plasticity (Kauer, 2004; Saal et al, 2003; Thomas and Malenka, 2003). In the development of ND, the LTP and LTD pathways may be essential for neurons to form new synapses and eliminate unnecessary ones in order to adapt to a new environment. In the process of SC, these pathways may be invoked to interrupt some of the neuronal connections formed during the development of nicotine addiction, helping the reward circuit return to normal. Until now, only a few genes related to LTP and LTD have been identified in association studies. Considering the importance of these pathways in ND development and SC, other genes associated with these processes represent potential targets for future studies of these phenotypes.

In a recent study, five pathways were suggested to be associated with addiction to cocaine, alcohol, opioids, and nicotine in humans (Li et al, 2008). These pathways are gap junction, GnRH signaling, LTP, MAPK signaling, and neuroactive ligand–receptor interaction. As shown in Table 5, three of the pathways (gap junction, LTP, and MAPK signaling) were enriched in genes associated with ND or SC. Although another pathway, neuroactive ligand–receptor interaction, was identified for all three smoking phenotypes by either Onto Pathway-Express or DAVID analysis in our work, it was not included in the current report because several more specific pathways, such as calcium signaling, dopamine receptor signaling, and serotonin receptor signaling, were also identified and reported herein. Our results provide further evidence that nicotine addiction may share some biological mechanisms with addiction to other substances. However, we also identified multiple specific pathways related to smoking behavior, suggesting that the mechanisms underlying nicotine addiction are complex and may differ in certain ways from those associated with addiction to other drugs.
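
For readers less familiar with what "enriched" means in this context, the idea of over-representation can be illustrated with a generic one-sided hypergeometric test. This is only a minimal sketch and not the procedure applied here: the tools named above use their own scoring schemes, the function name is our own, and all counts in the usage line are hypothetical placeholders rather than values from this study.

```python
from math import comb


def hypergeom_enrichment_p(background, pathway_size, list_size, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    background   -- number of genes in the reference set
    pathway_size -- number of background genes annotated to the pathway
    list_size    -- number of genes in the phenotype-associated list
    overlap      -- number of listed genes that fall in the pathway
    """
    upper = min(pathway_size, list_size)
    total = comb(background, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts chosen purely for illustration (NOT data from this study).
p_value = hypergeom_enrichment_p(
    background=20000, pathway_size=80, list_size=100, overlap=6
)
print(f"p = {p_value:.3g}")
```

A small p-value indicates that the pathway contains more of the listed genes than would be expected by chance for gene sets of these sizes; in practice such values are also corrected for the number of pathways tested.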

The significantly overrepresented pathways suggest a view of neuronal responses in different conditions of nicotine–neuron interaction (Figure 1). On binding by nicotine, the nAChRs open and cause the influx of Ca2+ and Na+ into the presynaptic neuron, which evokes depolarization of the neuron, as well as activation of the Ca2+ signaling cascade. The Ca2+ signaling cascade is directly related to the presynaptic release of neurotransmitters, including dopamine, serotonin, GABA, and glutamate, in different neurons. The neurotransmitters interact with their specific receptors, triggering a series of signaling pathways, such as cAMP-mediated signaling and PKC signaling. Through these pathways, various physiological processes such as neuronal excitability and energy metabolism may be regulated. Variations in some of the genes in these pathways may change the efficiency or function of the pathways and, eventually, the psychopathological phenotype. Although a significant number of genes associated with these pathways have been identified, our understanding of the genetic determinants of smoking is still in its early stages (Munafo and Johnstone, 2008). It can be expected that as more genetic factors are determined, more detailed pathways and a more comprehensive understanding of the mechanisms of human smoking behavior will be obtained.