Introduction

In humans, norepinephrine (NE) is essential to fundamental cognitive and emotional processes including attention, learning and memory, perception of emotions, and perception of pain (Foote et al. 1983; Jasmin et al. 2002). NE is also involved in autonomic control via its actions in the brainstem and as the primary neurotransmitter at postganglionic sympathetic nerve terminals (Hahn et al. 2003). The majority of brain noradrenergic neurons are concentrated in the locus coeruleus, a phylogenetically ancient and developmentally precocious structure. These NE neurons project to limbic regions critical to cognition and affect.

NE released at central and peripheral synapses is inactivated through active transport into terminals by the presynaptically localized norepinephrine transporter (NET) (Iversen 1974). NET recaptures as much as 90% of released NE, making it a critical mediator of NE inactivation and presynaptic catecholamine homeostasis (Schomig et al. 1989). Thus, NET plays a role in controlling the intensity and duration of signal transduction (Zahniser et al. 2001). NE interacts with many other neurotransmitters both in normal cortical regulation and in the therapeutic response to psychoactive compounds, and one critical interacting neurotransmitter is dopamine (DA) (Jordan et al. 1994). DA is the precursor of NE, so levels of both neurotransmitters are regulated by common factors, for example, tyrosine hydroxylase activity. NET can also transport DA, and drugs that block NET increase extracellular levels of both NE and DA (Tanda et al. 1997; Bymaster et al. 2002; Gu et al. 1996). Monoamine transporters are initial sites of action for several antidepressant drugs (including several that are relatively NET selective) as well as psychostimulants including cocaine and the amphetamines (Pacholczyk et al. 1991; Ritz et al. 1990; Tatsumi et al. 1997; Sacchetti et al. 1999). Decreases in NE uptake sites and activity have been observed in hypertension, diabetes, cardiomyopathy, and heart failure (Esler et al. 1981; Merlet et al. 1992; Bohm et al. 1995; Schnell et al. 1996; Backs et al. 2001), and insufficient NE clearance may contribute to the progression of these diseases (Bohm et al. 1998).

The human NET gene (SLC6A2, hCG2025341) is located on chromosome 16q12.2 (Brüss et al. 1993) and has 15 exons spanning 48 kb (Pörzgen et al. 1995, 1998). The cDNA sequence encodes a 617-amino acid protein with 12 highly hydrophobic membrane domains and a high level of amino acid identity to other members of the Na+/Cl−-dependent monoamine transporter family, e.g., HTT (serotonin transporter) and DAT (dopamine transporter) (Nelson 1998; Hahn and Blakely 2002). SLC6A2 has five alternative splice transcripts. Resequencing of SLC6A2 identified 13 DNA sequence variants, among them five low-frequency missense substitutions (Stober et al. 1996). The reported missense substitutions Val69Ile, Thr99Ile, Val245Ile, Val449Ile, and Gly478Ser are located in putative transmembrane domains 1, 2, 4, 9, and 10, respectively. The Thr99Ile substitution is at the 5th position of a putative leucine zipper in transmembrane domain 2. A rare Ala457Pro substitution in exon 9 resulting in more than 98% loss of function has recently been detected (Ivancsits et al. 2003). A synonymous substitution also located in exon 9, A1287G, has been used in a series of association studies of NE-related phenotypes including hypertension and mood disorders (Stober et al. 1996; Leszczynska-Rodziewicz et al. 2002; Samochowiec et al. 2002). However, no common functional NET polymorphisms are known.

The NET markers used so far in linkage studies do not capture the potential information on NET functional variation, and the results obtained so far have been nondefinitive. A haplotype approach combining abundant missense polymorphisms with a series of loci chosen for haplotype informativeness offers the potential for detection of effects of any allele of moderate abundance and effect size, regardless of whether the allele is presently known or unknown (Gabriel et al. 2002). Concerning linkage disequilibrium (LD), many regions of the genome have a block-like structure such that all loci within the block region tend to be in strong LD. However, haplotype block boundaries, strength of LD, haplotype diversity, and optimal marker panels to fully capture haplotype diversity vary across populations. In this study, we report the haplotype structure of SLC6A2 obtained using 26 single-nucleotide polymorphisms (SNPs) genotyped in four populations: Finnish and American Caucasians, American Indians, and African Americans. We also describe marker panels for each block, which maximize haplotype information content.

Materials and methods

Participants

A total of 384 participants were genotyped, including 96 individuals from each of four populations: Finns, US Caucasians, African Americans, and Plains American Indians. Informed consent was obtained according to human research protocols approved by the human research committees of the recruiting institutes, including the National Institute on Alcohol Abuse and Alcoholism, National Institute of Mental Health, Rutgers University, and University of Helsinki. All participants had been psychiatrically interviewed, and none had been diagnosed with a psychiatric disorder.

SNP markers

The physical position and frequency of minor alleles (>0.05) from a commercial database (Celera Discovery System, CDS, July 2003) were used to select SNPs (including A1287G and Ala457Pro). A total of 50 SLC6A2 SNPs were identified in the database. 5′ nuclease assays (see below) could be designed for 35, and of these, 26 SNP assays detected sequence polymorphisms and could be genotyped with high accuracy. This panel of 26 equally spaced markers covered the 48-kb gene plus 4 kb upstream and 4 kb downstream.

Genomic DNA

Genomic DNA was extracted from lymphoblastoid cell lines and diluted to a concentration of 10 ng/μl; 1-μl aliquots were dried in 384-well plates.

Polymerase chain reaction (PCR) amplification

Genotyping was performed by the 5′ nuclease method (Shi et al. 1999) using fluorogenic allele-specific probes. Oligonucleotide primer and probe sets were designed based on gene sequence from the CDS, July 2003. Primers and detection probes for each locus are listed in Table 1.

Table 1 Primer and probe sequences for 5′ nuclease genotyping of 26 SLC6A2 single-nucleotide polymorphisms (SNPs)

In each reaction well, 2.5 μl of PCR Master Mix (Applied Biosystems, CA, USA) containing AmpliTaq Gold DNA Polymerase, dNTPs, gold buffer, and MgCl2 was mixed with 900 nmol of each forward and reverse primer and 100 nmol of each reporter and quencher probe. DNA was allowed to stand at 50°C for 2 min and at 95°C for 10 min, amplified by 40 cycles at 95°C for 15 s and 60°C for 1 min, and then held at 4°C. PCR was carried out with a GeneAmp PCR system 9700 (Applied Biosystems).

Allele-specific signals were distinguished by measuring endpoint 6-FAM or VIC fluorescence intensities at 508 and 560 nm, respectively, and genotypes were generated using Sequence Detection System Software Version 1.7 (Applied Biosystems). Genotyping error rate was directly determined by regenotyping 25% of the samples, randomly chosen, for each locus. The overall error rate was <0.005. Genotype completion rate was 0.98.
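The duplicate-based error estimate described above can be illustrated with a minimal sketch. It assumes the error rate is the fraction of discordant calls among genotypes called in both passes; the function name and the dictionary layout keyed by (sample, locus) are hypothetical and are not taken from the study.

```python
def duplicate_error_rate(first_pass, second_pass):
    """Fraction of discordant calls among genotypes called in both passes.

    Both arguments map a (sample_id, locus_id) key to a genotype string
    such as "AG"; None marks a failed call. This layout is an assumption.
    """
    compared = 0
    discordant = 0
    for key, call_1 in first_pass.items():
        call_2 = second_pass.get(key)
        if call_1 is None or call_2 is None:
            continue  # skip pairs where either pass failed to call
        compared += 1
        if call_1 != call_2:
            discordant += 1
    if compared == 0:
        raise ValueError("no genotype was called in both passes")
    return discordant / compared


# Toy data: four regenotyped calls, one of which disagrees.
first = {("s1", 1): "AA", ("s2", 1): "AG", ("s3", 1): "GG", ("s4", 1): "AG"}
second = {("s1", 1): "AA", ("s2", 1): "AG", ("s3", 1): "AG", ("s4", 1): "AG"}
rate = duplicate_error_rate(first, second)
```

In practice the second pass would cover only the randomly chosen 25% of samples, so keys missing from it are simply skipped.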

Haplotype analysis

Haplotype frequencies were estimated using a Bayesian approach implemented with PHASE (Stephens et al. 2001). These frequencies closely agreed with results from a maximum likelihood method implemented via an expectation–maximization (EM) algorithm (Long et al. 1995).
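To make the EM comparison concrete, the following is a minimal sketch of maximum likelihood haplotype frequency estimation for the simplest case of two biallelic SNPs. It is not the PHASE algorithm and not the implementation used in this study; the function name, the 0/1/2 genotype coding, and the starting values are assumptions chosen for illustration. Only double heterozygotes have ambiguous phase, so the E-step splits each of them between the two possible haplotype pairs in proportion to the current frequency estimates.

```python
from itertools import product


def em_two_snp_haplotypes(genotypes, n_iter=200, tol=1e-10):
    """Estimate two-SNP haplotype frequencies by expectation-maximization.

    genotypes: list of (g1, g2), where each value is the number of copies
    (0, 1, or 2) of allele "1" at SNP 1 and SNP 2 (hypothetical coding).
    Returns a dict mapping haplotype (a1, a2) to its estimated frequency.
    """
    haps = list(product((0, 1), repeat=2))
    freq = {h: 0.25 for h in haps}  # uniform starting values
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haps}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: phase is ambiguous, (0,0)/(1,1) versus (0,1)/(1,0).
                p_cis = freq[(0, 0)] * freq[(1, 1)]
                p_trans = freq[(0, 1)] * freq[(1, 0)]
                total = p_cis + p_trans
                w = p_cis / total if total > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1.0 - w
                counts[(1, 0)] += 1.0 - w
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                hap_a = (g1 // 2, g2 // 2)
                hap_b = (g1 - g1 // 2, g2 - g2 // 2)
                counts[hap_a] += 1.0
                counts[hap_b] += 1.0
        # M-step: expected haplotype counts become the new frequencies.
        new_freq = {h: counts[h] / n_chrom for h in haps}
        change = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if change < tol:
            break
    return freq


# Toy data: two homozygous classes plus two double heterozygotes.
toy = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
estimated = em_two_snp_haplotypes(toy)
```

For blocks of eight or nine SNPs the same idea applies, but the number of compatible haplotype pairs per individual grows quickly, which is one reason dedicated software is used.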

Results and discussion

Of a total of 26 SLC6A2 SNPs, 25 were polymorphic in all four populations. Dramatic interpopulation differences in allele frequencies were observed for most of the SNPs. Ala457Pro, the previously reported in vitro functional variant, was monomorphic in the 384 individuals representing the four populations. Allele frequencies of SLC6A2 SNPs and their locations in the gene are shown in Table 2. The majority of the markers are located in the intronic sequence of the gene, one synonymous substitution (A1287G) is located in exon 9, one marker is located in the 3′ UTR region, five are in the 3′ region, and one is in the 5′ region (Fig. 1). All genotype frequencies conformed to Hardy–Weinberg equilibrium.
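A Hardy–Weinberg check of this kind can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on the three genotype counts at a biallelic SNP. The text does not state which test was applied, so this choice, the function name, and the example counts are assumptions for illustration only.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions at a biallelic locus.

    Returns (chi2, p_value), using one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for one SNP in a sample of 96 individuals.
chi2_stat, p_val = hwe_chi_square(24, 48, 24)
```

With small samples or rare alleles an exact test is often preferred over the chi-square approximation.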

Table 2 Locations and allelic frequencies of 26 SLC6A2 SNPs in 96 individuals from each of four populations. #, single-nucleotide polymorphism (SNP) marker with location shown in Fig. 1. Orientation is #26–#1 (5′–3′).
Fig. 1 Location of single-nucleotide polymorphisms (SNPs) genotyped in the human SLC6A2 gene. Coding exons are shown as solid blocks; 5′ and 3′ UTRs are indicated by unfilled rectangles. Physical locations are from the Celera Discovery System (CDS) database, July 2003

Within SLC6A2, three conserved LD blocks (1, 13.6 kb; 2, 12.5 kb; 3, 25 kb) were observed in all four populations. Definition of haplotype blocks and block boundaries is an inexact science. Some disruptions of D′ (a measure of LD) occurring within blocks are clearly attributable to low allele frequencies that lead to increased variance in estimation of D′. We discounted low D′ values that appeared to originate from this cause. In the SLC6A2 haplotype block regions, D′ was generally >0.85 from one end of the region to the other. Average D′ was 0.83, 0.94, and 0.94 in blocks 1, 2, and 3, respectively, and, perhaps more importantly, the median D′ value within haplotype blocks was 0.97, 0.97, and 1.00 for blocks 1, 2, and 3, meaning that most of the SNP loci were in very high LD. We note that if haplotype block boundaries are drawn too widely, an increased number of haplotypes will be observed for the block and an increased number of markers will be required to capture this diversity. SLC6A2 haplotype block boundaries could be drawn somewhat differently than we have done, and there is also some variation from population to population. For example, there was some disruption of LD within block 1 in both Finns and Plains Indians. However, the marker panels we genotyped were sufficient to capture diversity in the blocks in the four populations we studied, as described below.
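For readers unfamiliar with D′, the sketch below computes it for one pair of biallelic SNPs using Lewontin's standard normalization, reported as an absolute value. The function name and the example frequencies are hypothetical; the inputs would in practice come from estimated haplotype frequencies.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute value of Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2.
    p_a, p_b: frequencies of allele A and allele B.
    Returns NaN when either locus is monomorphic.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    if d_max == 0:
        return float("nan")
    return abs(d) / d_max


# Complete LD: allele A is found only together with allele B.
complete = d_prime(p_ab=0.2, p_a=0.2, p_b=0.5)
# Partial LD between two common alleles.
partial = d_prime(p_ab=0.3, p_a=0.5, p_b=0.5)
```

Because the denominator shrinks as the minor allele frequency falls, a shift of only a few haplotype counts can move D′ substantially at rare SNPs, which is the source of the variance noted above.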

Pairwise LD values within each haplotype block are summarized in Tables 3, 4, and 5; all pairwise LD values among 25 SNPs across four populations are represented in Tables 6, 7, 8, and 9. Haplotype frequencies for the three blocks in four populations are shown in Table 10. For each population and haplotype block, 3–6 common (≥0.05) haplotypes accounted for most of the total: 85–96% of Caucasian and Plains Indian haplotypes and 75–89% of African American haplotypes. The numbers of common (≥0.05) haplotypes were block 1: 4, 5, 3, 6; block 2: 5, 5, 4, 5; and block 3: 4, 5, 5, 6 for U.S. and Finnish Caucasians, Plains Indians, and African Americans, respectively.

Table 3 Pairwise linkage disequilibrium (D′) among eight single-nucleotide polymorphisms (SNPs) in haplotype block 1 across four populations
Table 4 Pairwise linkage disequilibrium (D′) among eight single-nucleotide polymorphisms (SNPs) in haplotype block 2 across four populations
Table 5 Pairwise linkage disequilibrium (D′) among nine single-nucleotide polymorphisms (SNPs) in haplotype block 3 across four populations
Table 6 Pairwise linkage disequilibrium (D′) among 25 SLC6A2 single-nucleotide polymorphisms (SNPs) in U.S. Caucasians. Marker #9, Ala457Pro, was monomorphic and thus excluded
Table 7 Pairwise linkage disequilibrium (D′) among 25 SLC6A2 single-nucleotide polymorphisms (SNPs) in Plains Indians. Marker #9, Ala457Pro, was monomorphic and thus excluded
Table 8 Pairwise linkage disequilibrium (D′) among 25 SLC6A2 SNPs in Finnish Caucasians. Marker #9, Ala457Pro, was monomorphic and thus excluded
Table 9 Pairwise linkage disequilibrium (D′) among 25 SLC6A2 single-nucleotide polymorphisms (SNPs) in African Americans. Marker #9, Ala457Pro, was monomorphic and thus excluded
Table 10 Frequencies of haplotypes constructed from 25 single-nucleotide polymorphisms (SNPs) (1 = allele 1, 2 = allele 2)

For each haplotype block, a panel of markers sufficient to maximize genetic information content was available. Excluding Ala457Pro, the numbers of SNPs available for the three haplotype blocks were 8, 8, and 9 for blocks 1, 2, and 3, respectively. With these panels in hand, the value of additional SNP markers for haplotype diversity (informativeness) can be evaluated. We began with the haplotypes derived from all available markers in the block and successively subtracted markers, first subtracting markers whose removal caused no change in haplotype diversity, then subtracting markers whose removal changed diversity the least, and so on until we were left with a single, independent SNP marker of highest heterozygosity. The figures graphically depict a reversal of this process, showing that diversity can be maximized using a smaller group of tag SNPs selected from a larger panel. After each marker addition along the pathway determined by the subtraction analysis, haplotype frequencies and diplotype heterozygosity (the chosen measure of diversity) were recalculated. At some point for each haplotype block and each population, the addition of a new SNP marker did not appreciably increase diversity, as shown in Fig. 2.
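The subtraction procedure can be sketched as a greedy backward elimination. The sketch assumes that diplotype heterozygosity is computed as one minus the sum of squared haplotype frequencies (the expected fraction of individuals carrying two different haplotypes under random mating); that formula, the function names, and the toy frequencies are assumptions and are not taken from the study.

```python
from collections import defaultdict


def heterozygosity(hap_freqs, markers):
    """1 - sum of squared frequencies of haplotypes restricted to `markers`."""
    collapsed = defaultdict(float)
    for hap, freq in hap_freqs.items():
        collapsed[tuple(hap[i] for i in markers)] += freq
    return 1.0 - sum(f * f for f in collapsed.values())


def addition_path(hap_freqs):
    """Greedy subtraction of markers, returned in reverse as an addition path.

    hap_freqs maps a haplotype (tuple of alleles, one per marker) to its
    frequency. Returns a list of (marker_indices, heterozygosity) pairs for
    panels of size 1, 2, ..., n.
    """
    n_markers = len(next(iter(hap_freqs)))
    remaining = list(range(n_markers))
    removed = []
    while len(remaining) > 1:
        # Drop the marker whose removal preserves the most diversity.
        drop = max(
            remaining,
            key=lambda m: heterozygosity(
                hap_freqs, [i for i in remaining if i != m]
            ),
        )
        remaining.remove(drop)
        removed.append(drop)
    order = remaining + removed[::-1]
    return [
        (order[:k], heterozygosity(hap_freqs, order[:k]))
        for k in range(1, n_markers + 1)
    ]


# Toy block of three SNPs; markers 0 and 1 carry identical information.
toy_block = {(1, 1, 1): 0.5, (2, 2, 1): 0.3, (2, 2, 2): 0.2}
path = addition_path(toy_block)
```

In this toy block the second marker added already recovers the full diversity, so the third is redundant, mirroring the plateau described for Fig. 2.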

Fig. 2a–c Effect of successively adding optimal SNPs on haplotype diversity in four populations. The successive addition of SNPs determined by the subtraction path, which maximally retained haplotype diversity, identifies a minimal set of tag SNPs to maximize diversity (diplotype heterozygosity) within SLC6A2 haplotype blocks (a–c) in four populations

The required number of tag SNPs varies according to the haplotype diversity of the region (and population) and the information content of the markers available. Haplotype diversity in block 1 was greatest in African Americans, and more markers (5–6) were needed to maximize it. For the other populations, 2–3 markers sufficed for block 1, but the optimal markers differed across populations. Thus, each SNP had different information content in different populations. For association/linkage studies, different tag SNPs could be used in different target populations. Alternatively, the entire panel of 25 SNPs could be applied to reliably capture haplotype diversity across populations. As illustrated in Fig. 2, genotyping larger panels of markers yields a steadily diminishing return, but another purpose of this approach is to capture more information on certain rarer haplotypes. The focus of haplotype-based genetic association studies has been the detection of effects of moderately abundant loci, because haplotypes and functional alleles of low frequency are not well represented in small datasets. However, power increases in larger datasets that may be available for certain noradrenergic-related phenotypes, for example, diseases such as hypertension and mood disorders, which are readily diagnosed and which afflict very large segments of populations.

For SLC6A2, the 25-locus SNP panel defines a three-block LD structure across the entire gene region and would be sufficient to capture the signal of any moderately abundant SNP. For example, A1287G is the marker most extensively used in NET linkage studies, and this SNP is located in block 2, for which the SLC6A2 SNP panel includes another seven markers in addition to A1287G. When A1287G is excluded, 99% of the information content of block 2 is still captured.

In conclusion, the SLC6A2 haplotype map and marker panel are a comprehensive tool for genetic linkage studies on phenotypes related to noradrenergic function. This map is a surrogate for moderately abundant effective alleles, which may be unknown or unrecognized as functional.