Introduction

The serotonin transporter in the brain is the principal site of action for many antidepressant medications used to treat major depressive disorder. The serotonin transporter gene (5-HTT) has several polymorphic loci that affect its expression or function. The promoter polymorphism known as the 5-HTT-linked polymorphic region (5-HTTLPR) is an insertion/deletion (ins/del) polymorphism that most commonly occurs as a long allele (L) or a short allele (S), with 16 or 14 repeats, respectively, of a 20–23 bp unit. Variations in the promoter region of 5-HTT have been reported to be associated with differential remission rates following selective serotonin reuptake inhibitor treatment.1, 2 The long form of the ins/del promoter variant has 43 or 44 more nucleotides than the short form and has been shown to drive transcription to more than twice the level of the S allele.3 In some previous studies, European subjects homozygous for the L allele of the ins/del promoter variant have been reported to benefit more from treatment with medications that block serotonin reuptake than subjects without the L allele.1, 4 Nakamura et al.,5 after isolating and sequencing all the repeat units in the promoter, categorized four S allelic variants and six L allelic variants on the basis of the sequence of the different repeats. The S alleles were 14-A, 14-B, 14-C and 14-D, and the L alleles were 16-A, 16-B, 16-C, 16-D, 16-E and 16-F. They also observed longer alleles composed of 19, 20 and 22 repeat units; extra long versions of the LPR with 18, 19 and 20 repeat units were also reported by others.6, 7, 8 Recently, a 17-repeat allele and an extra short allele of 11 repeats were reported.8 Hu et al.3 designated the L allele with an adenine at single nucleotide polymorphism (SNP) rs25531 as LA and the L allele with a guanine at rs25531 as LG. The LA and LG designations correspond to alleles 16-A and 16-D, respectively, in Nakamura's nomenclature.
The LA was reported to have higher activity than LG.3 The A → G substitution is in the μ unit (sixth unit), and this gave rise to six different genotypes in the 5-HTTLPR: SS, LGLG, SLG, LALG, SLA and LALA. Of these, SS expressed the least 5-HTT mRNA and LALA expressed the most, and the two heterozygous genotypes that contain one copy of the S allele did not exhibit lower than expected expression as predicted by the S-dominant model.3 The LG allele created a stronger AP2-DNA binding site which, in turn, suppressed transcription. The LA allele was also associated with higher 5-HTT-binding potential, an index of 5-HTT density, in the putamen, as measured by 3-(11) C-amino-4-(2-dimethylaminomethylphenyl-sulfanyl) benzonitrile ([11C]DASB) positron emission tomography.9 The G substitution is also observed in the context of the S allele (SG), which is very rare and was designated as 14-B and 14-D by Nakamura.5 The finding that rs25531 is also associated with selective serotonin reuptake inhibitor response demands that 5-HTTLPR and rs25531 be viewed as two independent loci for genotyping, and demonstrates a need for more comprehensive genotyping procedures.10

Studies have shown that StIn2, another variable number tandem repeat (VNTR) region, located in the second intron, may act as a transcriptional enhancer of the 5-HTT gene, with the 12-repeat allele having more activity than the 10-repeat allele.11 The allele frequency of the 9-repeat allele is low (1–3% in the European population). The intronic polymorphism has been investigated mainly in affective illness. An association of StIn2.9 with unipolar depression was shown by Ogilvie et al.12 StIn2.12 has been shown to be associated with bipolar disorder in multiple studies.6, 13, 14 A potential association, or linkage disequilibrium, between the intronic polymorphism and schizophrenia was also shown.15

Hranilovic et al.11 demonstrated the combined effect of the 5-HTTLPR and StIn2 polymorphisms. They separated the groups into those with high-expressing genotypes at both loci (L/L 12/12), those with a low-expressing genotype at one locus (L/L 10 and S 12/12) and those with low-expressing genotypes at both loci (S 10). The mean 5-HTT expression was highest with two high-expressing genotypes, 20% lower with one low-expressing genotype and 50% lower with low-expressing genotypes at both loci. Lovejoy et al.16 suggested that there is an additional layer of transcriptional complexity based on the primary sequence of the VNTR, and that the controversy over the correlation of VNTR copy number with predisposition to affective disorders could be resolved by reanalyzing the data and taking into account the primary sequence of the VNTR.

The method we developed to genotype 5-HTTLPR, StIn2 and the SNP rs25531 uses fluorescently labeled PCR primers. An advantage of a fluorescent dye detection system is that DNA fragments that overlap in size can be labeled with different dyes and thus detected simultaneously in a single lane or injection on the electrophoresis instrument. Since both the 5-HTTLPR and StIn2 are composed of repeat regions, their fragment sizes may overlap. Labeling with two different dyes provides a significant increase in throughput over more traditional methods of detecting the different alleles. Using different fluorescent dyes for the two loci, LPR and IN2, and designing the products to yield defined and unique fragment lengths allows the amplicons from the four separate PCRs to be pooled and resolved in the same well during electrophoresis. Furthermore, the amplifications were designed to use the same cycling conditions. The allele-specific PCR for rs25531 eliminates the restriction enzyme digest of the PCR amplicon, further streamlining the assay, reducing the chance of errors and lowering its cost. Here we present the simultaneous genotyping of multiple polymorphisms in the human serotonin transporter gene and the novel variants identified.

Materials and methods

Generating the PCR fragments

Genomic DNA was extracted using the Qiagen EZ-1 extraction method (Hilden, Germany). The assay was composed of four separate amplifications: one to amplify the ins/del region of the promoter using primers 5HTTLPRF and 6FAM5HTTLPRR, one to amplify the StIn2 region using primers 5HTTIN2F and HEX5HTTIN2R, and two to examine the SNP rs25531 region using the allele-specific primers 5HTTSNPAF and 5HTTSNPGF, each with the common reverse primer 6FAM5HTTLPRR. The promoter region reverse primer and the StIn2 region reverse primer were fluorescently labeled with 6FAM and HEX, respectively. The primer sequences are given in Table 1. In a 12.5-μl reaction volume, 25 ng of genomic DNA was amplified with 2.5 μl of 5 × Promega Flexi buffer, 1 μl of 25 mM MgCl2, 0.5 μl of 10 mM dNTP mix, 3 μl of 5 M betaine (Sigma-Aldrich, St Louis, MO, USA) and 0.1 μl of Promega GoTaq Hot Start polymerase (Promega Corporation, Madison, WI, USA). Primer stocks were at 25 μM; 0.5 μl of each primer was used per reaction, except for the allele-specific primers, of which 0.1 μl was used. Thermal cycling consisted of 2 min of denaturation at 94 °C, followed by 30 cycles of 94 °C (30 s), 62 °C (30 s) and 72 °C (60 s), and a final extension step of 10 min at 72 °C.
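The final concentration of each component in the reaction follows from the stock concentrations and volumes listed above by simple dilution (C1 × V1 = C2 × V2). The short sketch below works through that arithmetic. The function and variable names are illustrative only, and the code assumes that the stated 12.5 μl is the total reaction volume; whether the 10 mM dNTP stock is per nucleotide or total is not stated, so the result should be read accordingly.

```python
# Dilution arithmetic for the 12.5-ul PCR described in the text.
# Names are illustrative; stock concentrations and volumes are from the text.

REACTION_VOLUME_UL = 12.5  # assumed to be the total reaction volume

# reagent -> (stock concentration, unit, volume added in ul)
REAGENTS = {
    "Flexi buffer": (5.0, "x", 2.5),
    "MgCl2": (25.0, "mM", 1.0),
    "dNTP mix": (10.0, "mM", 0.5),
    "betaine": (5.0, "M", 3.0),
    "standard primer": (25.0, "uM", 0.5),
    "allele-specific primer": (25.0, "uM", 0.1),
}


def final_concentration(stock, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Return the concentration after dilution: C2 = C1 * V1 / V2."""
    return stock * volume_ul / total_ul


if __name__ == "__main__":
    for name, (stock, unit, volume_ul) in REAGENTS.items():
        conc = final_concentration(stock, volume_ul)
        print(f"{name}: {conc:.2f} {unit}")
```

Under these assumptions the standard primers end up at 1 μM and the allele-specific primers at 0.2 μM, a five-fold difference that follows directly from the 0.5 μl versus 0.1 μl volumes.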

Table 1 Primers and primer sequences

Detection on ABI 3130xl genetic analyzer

The PCR products were analyzed on an ABI 3130xl Genetic Analyzer. A volume of 1 μl of sample (prepared by pooling 2 μl from each of the four amplicons) was added to a mix of 0.5 μl of GS 600Liz size marker (Applied Biosystems, Foster City, CA, USA) and 9 μl of Hi-Di formamide (Applied Biosystems). A volume of 9 μl of the marker mix and 1 μl of pooled amplicon was denatured at 95 °C for 5 min, cooled in an ice-water slurry for 5 min and subsequently run on the ABI 3130xl analyzer. The data were analyzed using GeneMarker v1.9 (SoftGenetics, State College, PA, USA).

Sequence confirmation

The samples analyzed by fragment analysis were verified by direct DNA sequencing. The primers used for generating the sequencing amplicons and sequencing primer details are given in Table 1. The primers used to generate the amplicons for sequencing have M13 sequences on the 5′ end. This simplifies the sequencing by using two common primers for any number of specific regions. Direct sequencing was performed using BigDye terminator v3.1/Sequencing standard kit (Applied Biosystems) following the manufacturer's directions and separating products on an ABI 3130xl.

The sequence data were analyzed using MutationSurveyor (SoftGenetics). The samples heterozygous (L and S) for the promoter region were complex and difficult to analyze using the software. To confirm the results we obtained, the amplicons were separated on a 2% agarose gel, and the short and the long bands were excised and purified using GeneClean (Qbiogene, Solon, OH, USA) before sequencing.

Results

Genotyping by fragment analysis

Since the reverse primers of the promoter region and the StIn2 region were fluorescently labeled, we were able to detect the alleles simultaneously by mixing the PCR products. The alleles expected for the promoter region are L (long), S (short), LA, LG, SA and SG. The expected StIn2 VNTR alleles are StIn2.9, StIn2.10 and StIn2.12 (9, 10 and 12 repeats). The expected peak sizes are shown in Figure 1 and Table 2 (R15, A97, M09 and C126 are DNA from the Coriell repository, Camden, NJ, USA). The L allele peak was at 450 bases and the S allele peak was at 406 bases. The LA peak was at 327 bases; the G allele-specific primer was made nine bases shorter than the A allele-specific primer, so the LG peak was at 318 bases. The SA peak was at 284 bases (the SG peak should be at 275 bases, but we did not find an SG peak in any of the samples reported here). The StIn2 alleles yielded fragments of 248 (9), 265 (10) and 298 (12) bases. The software was set to allow a variation of ±3 bases for both the promoter polymorphisms and the StIn2. As the reverse primer for the promoter region (6FAM5HTTLPRR) was the same for the 5-HTTLPR and the allele-specific PCR, any change in the size of the L or S allele in the variant sequences was also reflected in the A/G fragment.
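The allele-calling logic described above (expected fragment sizes per dye channel, with a ±3-base window) can be sketched as a small lookup. This is a minimal illustration, not the GeneMarker implementation: the function names, the data layout and the example peak list are hypothetical, while the expected sizes and tolerance are those given in the text.

```python
# Illustrative allele caller: match each observed peak to the expected
# fragment size in its dye channel, within a +/-3-base window.

EXPECTED_PEAKS = {
    # 6FAM labels the promoter reverse primer (LPR and rs25531 amplicons)
    "6FAM": {"L": 450, "S": 406, "LA": 327, "LG": 318, "SA": 284, "SG": 275},
    # HEX labels the StIn2 reverse primer
    "HEX": {"StIn2.9": 248, "StIn2.10": 265, "StIn2.12": 298},
}
TOLERANCE_BASES = 3


def call_peak(dye, size, tolerance=TOLERANCE_BASES):
    """Return the allele whose expected size is closest to `size` and
    within `tolerance` bases, or None if no allele matches."""
    best_allele, best_diff = None, None
    for allele, expected in EXPECTED_PEAKS[dye].items():
        diff = abs(size - expected)
        if diff <= tolerance and (best_diff is None or diff < best_diff):
            best_allele, best_diff = allele, diff
    return best_allele


def call_sample(peaks):
    """Group called alleles by dye; peaks outside every window are
    collected under 'uncalled' for manual review."""
    calls = {"6FAM": [], "HEX": [], "uncalled": []}
    for dye, size in peaks:
        allele = call_peak(dye, size)
        if allele is None:
            calls["uncalled"].append((dye, size))
        else:
            calls[dye].append(allele)
    return calls


if __name__ == "__main__":
    # Hypothetical peak list for an LA/LG, StIn2.12 sample
    example = [("6FAM", 450.4), ("6FAM", 326.8), ("6FAM", 318.3), ("HEX", 297.6)]
    print(call_sample(example))
```

Because the closest expected sizes within a channel are nine bases apart, a ±3-base window cannot assign one peak to two alleles. Peaks that fall outside every window, such as the extra long fragments, are left uncalled rather than forced into a known allele.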

Figure 1

Electropherograms from the ABI 3130xl. (a) LA/LA for the promoter and StIn2.10. (b) LA/LG and StIn2.12. (c) SA/SA and StIn2.12. (d) SA/LA with StIn2.12.

Table 2 Fragment sizes of different alleles detected and genotype calls.

Identification of novel alleles

We detected novel alleles using the genotyping method we developed. The peak sizes of the new alleles are shown in Figure 2 and Table 2; fragments longer than the L allele were called extra long (XL). Sample 1 was heterozygous for the 5-HTTLPR polymorphism, with S and SA peaks and an XL peak, and had a StIn2.9/12 genotype. The XL peak was at 470 instead of 450, 20 bases longer, and the XLG peak was at 338 instead of 318, again 20 bases longer. This sample also had two additional peaks from the promoter region at 141 and 163 bases. The samples in Figures 2b-e were all heterozygous for the promoter ins/del polymorphism, and the L peak of each was larger than 450; we called this genotype short/extra long (S/XL). As the reverse primer was the same for the LPR region and the A/G SNP, a change in the size of the L allele produced the same size change in the LA/LG fragment if the insertion was after the A/G SNP. Sample 2 had an XL peak at 494 and an XLA peak at 371; sample 3 had an XL peak at 492 and an XLA peak at 369; sample 4 had an XL peak at 537 and an XLA peak at 413; and sample 5 had an XL peak at 623 (approximate, because it was run with a 600-base ladder) and an XLA peak at 504.
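The relationship between an XL peak size and its repeat number can be illustrated with a rough calculation: the excess over the 450-base L peak (16 repeats) divided by a typical unit length. The sketch below is an approximation for illustration only, not the procedure used in this study, where repeat numbers were established by sequencing. The mean unit length of 21.5 bp is an assumption (the midpoint of the 20–23 bp range), and the function names are hypothetical.

```python
# Rough repeat-number estimate from XL fragment size, and comparison of the
# size shift seen in the LPR peak with that seen in the allele-specific peak.

L_PEAK = 450        # expected L allele peak (bases), from the text
L_REPEATS = 16      # repeat units in the L allele
MEAN_UNIT_BP = 21.5 # ASSUMPTION: midpoint of the 20-23 bp unit length

# sample -> (XL peak, allele-specific peak, expected size of that SNP peak)
SAMPLES = {
    1: (470, 338, 318),  # XLG compared with LG
    2: (494, 371, 327),  # XLA compared with LA
    3: (492, 369, 327),
    4: (537, 413, 327),
    5: (623, 504, 327),  # sized beyond the 600-base ladder, so approximate
}


def estimate_repeats(xl_peak):
    """Estimate total repeat units from the excess over the L peak."""
    extra_units = round((xl_peak - L_PEAK) / MEAN_UNIT_BP)
    return L_REPEATS + extra_units


def peak_shifts(xl_peak, snp_peak, snp_expected):
    """Return (shift of the LPR peak, shift of the allele-specific peak)."""
    return xl_peak - L_PEAK, snp_peak - snp_expected


if __name__ == "__main__":
    for sample, (xl, snp, snp_expected) in SAMPLES.items():
        lpr_shift, snp_shift = peak_shifts(xl, snp, snp_expected)
        print(f"sample {sample}: ~{estimate_repeats(xl)} repeats, "
              f"LPR shift {lpr_shift}, SNP-fragment shift {snp_shift}")
```

With this assumed unit length, the estimates for the five samples agree with the repeat numbers established by sequencing (17, 18, 18, 20 and 24). The two shifts agree to within a base for samples 1 to 4; sample 5, whose XL peak lay beyond the size ladder, differs by four bases.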

Figure 2

Novel variants identified. (a) The XL peak was at 470 instead of 450 and the XLG peak was at 338; this sample had 17 repeats in the 5-HTTLPR XL allele and was StIn2.9/12. (b) The XL peak was at 494 and the XLA peak was at 371; the XL allele had 18 repeats and the sample was StIn2.12. (c) The XL peak was at 492 and the XLA peak was at 369; this was also an 18-repeat allele and the sample was StIn2.12. (d) The XL peak was at 537 and the XLA peak was at 413; the XL allele had 20 repeats and the sample was StIn2.12. (e) The XL peak was at 623 (approximate sizing, because it was outside the ladder) and the XLA peak was at 504; this was a 24-repeat allele and the sample was StIn2.10. All five samples were heterozygous for 5-HTTLPR and showed both S and SA peaks.

Sequence verification of the novel alleles

The sequence data of the novel alleles are given in Supplementary information. We used the same Greek letter designation used by Nakamura et al.5 for the different units.

The genetic architecture of the novel alleles is shown in Figure 3 in comparison with LA and LG. Sample 1 had the same units as the LG allele (16-D; α, β, γ, δ, ɛ, μ, o, ζ, η, θ, ι, κ, λ, μ, ν, ξ) up to the κ unit; in place of λ there was a ζ, followed by two μ units instead of one (boldface in Figure 3). We designate the first of these μ units as μ' because its sequence had an extra A at the end. This accounted for the extra 20 or 21 bases, making this a 17-repeat allele. The two extra peaks at 141 and 163 bases arose from the μ' and μ units at the end.
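The unit architecture of sample 1 can be written out explicitly by starting from the LG (16-D) unit order given above and applying the described change. The sketch below is only a bookkeeping aid: the unit names are spelled out in ASCII, `mu_prime` stands for μ', and the function name is hypothetical.

```python
# Build the sample 1 unit order from the LG (16-D) unit order in the text:
# lambda is replaced by zeta, and the single mu after it by mu' then mu.

LG_UNITS = [
    "alpha", "beta", "gamma", "delta", "epsilon", "mu", "omicron", "zeta",
    "eta", "theta", "iota", "kappa", "lambda", "mu", "nu", "xi",
]


def build_sample1(lg_units):
    """Return the 17-unit order described for sample 1."""
    i = lg_units.index("lambda")
    if lg_units[i + 1] != "mu":
        raise ValueError("expected a mu unit directly after lambda")
    # Keep everything up to kappa, swap in zeta / mu' / mu, keep nu and xi.
    return lg_units[:i] + ["zeta", "mu_prime", "mu"] + lg_units[i + 2:]


if __name__ == "__main__":
    sample1 = build_sample1(LG_UNITS)
    print(len(LG_UNITS), "units in LG;", len(sample1), "units in sample 1")
    print(" ".join(sample1))
```

Replacing two units (λ, μ) with three (ζ, μ', μ) is what takes the allele from 16 to 17 repeats.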

Figure 3

5-HTTLPR novel variants and the different units identified (the unique units are shown in boldface or within brackets).

Samples 2 and 3 had 18 repeat units. Sample 4 had 20 repeat units; these units differed from those of the 20-repeat allele reported before,5, 6 which were α, β, γ, δ, ɛ, ζ, η, ζ, η, ζ, η, ζ, η, θ, ι, κ, λ, μ, ν, ξ. In sample 4, the ζ unit (6th unit) was followed by o, ζ, η, ζ, o, ζ, η; after the η, it had the same pattern as LA. Sample 5 was the longest one we found, with 24 repeat units (Figure 3).
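To make the difference between the two 20-repeat alleles concrete, the unit orders can be compared position by position. In the sketch below the previously reported order is copied from the text. The sample 4 order is assembled from the description above, and the LA order is an assumption inferred from the LG order with ζ as the sixth unit. Unit names are spelled out in ASCII and the helper name is hypothetical.

```python
# Compare the 20-repeat allele in sample 4 with the previously reported
# 20-repeat allele, unit by unit (1-based positions).

REPORTED_20 = (
    "alpha beta gamma delta epsilon zeta eta zeta eta zeta eta zeta eta "
    "theta iota kappa lambda mu nu xi"
).split()

# ASSUMPTION: LA (16-A) is the LG unit order with zeta as the sixth unit.
LA_UNITS = (
    "alpha beta gamma delta epsilon zeta omicron zeta eta theta iota kappa "
    "lambda mu nu xi"
).split()

# Sample 4: the first six units, then o, zeta, eta, zeta, o, zeta, eta,
# then the LA pattern that follows eta.
AFTER_SIXTH = "omicron zeta eta zeta omicron zeta eta".split()
SAMPLE4 = LA_UNITS[:6] + AFTER_SIXTH + LA_UNITS[LA_UNITS.index("eta") + 1:]


def differing_positions(a, b):
    """Return the 1-based positions at which two unit orders differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print(len(SAMPLE4), len(REPORTED_20))
    print(differing_positions(SAMPLE4, REPORTED_20))
```

Under these assumptions both alleles have 20 units and differ only at positions 7 and 11, where sample 4 carries an o unit in place of η.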

Discussion

We developed a method for genotyping the following polymorphisms in the human serotonin transporter gene: the 5-HTTLPR polymorphism, the rs25531 SNP in the promoter and the StIn2 VNTR. All four amplicons were amplified in separate reactions under the same cycling conditions, and the polymorphisms were detected simultaneously in a single electrophoretic step. This method eliminates time-consuming enzyme digestions and labor-intensive detection methods such as running gels, and is therefore more appropriate for a clinical setting. The analysis and interpretation are straightforward, and the assay as a whole is more cost-effective than digestion- and gel-based approaches. This method will be validated for routine clinical testing.

We identified novel variants in 5-HTTLPR using the method we developed, and the new variants were verified by direct DNA sequencing. We report for the first time a 17-repeat allele (XL) carrying the G allele of SNP rs25531. This allele had the same units as LG up to the κ unit and differed by having a ζ instead of λ, followed by μ' and μ units. The μ' unit had the nucleotide A at the end of its sequence. Typically the only unit that has an A at the end is the λ unit, which was missing at this position; therefore, this new repeat sequence must be a recombinant of μ and λ. The 17-repeat allele reported recently had a tandem duplication of a single κ unit.8 The μ that is the fourteenth unit in LG (16-D) is not usually picked up by the assay because the unit before it is λ and the primer anneals only weakly in this region, whereas in this particular sample the unit before the μ' was ζ, so the primer bound strongly and the fragment was detected.

The other four alleles we identified (the two 18-repeat alleles, the 20-repeat allele and the 24-repeat allele) followed a similar pattern, in which the repeat elements were inserted after the ζ unit of LA (seventh unit). Of the two 18-repeat alleles we detected, one had alternating units of o and ζ from repeat 7 to 10, which was identical to the one reported by Michaelovsky.7 The other 18-repeat allele had alternating units of η and ζ from repeat 7 to 10 (the tandem repeats are shown in boldface in Figure 3); repeats 1 to 6 and 11 to 18 were the same in both alleles.

The 20-repeat allele we found had different units from the ones reported by Kunugi6 and Nakamura5 (boldface units in Figure 3) from repeat 7 to 11. The 24-repeat allele had alternating units of o and ζ from repeat 7 to 10; units 11 to 16 and units 17 to 23 were in tandem repeat. The two μ units (sixteenth and twenty-second units) in the 24-repeat allele were not picked up by the assay, as they were preceded by λ units.

All the rare alleles we found were in heterozygous samples carrying the S allele and an XL allele, as reported before.6 All the repeat insertions occurred after the ζ unit, which seemed to be a hot spot for recombination. Also, the recombination region in all the variants was followed by the η unit, and sometimes the η unit was inserted repeatedly; this is consistent with the suggestion of Heils et al.17 that a ‘hot spot’ sequence for deletion mutagenesis (TGCAGCC) is present in the η repeat element.

The 5-HTTLPR region is highly polymorphic; alleles of up to 24 repeat units exist in humans and are a common occurrence in non-human primates.18 The presence of 5-HTTLPR in humans and simians, but not in other mammals such as mice, may be related to the anxiety-related personality traits so common in humans.18 The allelic distribution of the 5-HTT gene shows remarkable ethnic variation, and the XL alleles seem to be present largely in the African or African-American population. In the Japanese population, L and XL alleles were found more frequently in sudden infant death syndrome victims and in patients with chronic fatigue syndrome or temporomandibular disorder,19, 20, 21 whereas Haas et al.22 were not able to replicate the association between 5-HTT and sudden infant death syndrome in the Caucasian population. A large Taiwanese population displayed a much higher frequency of XL alleles than other studies, challenging the contention that the XL alleles are rare.23 As the L allele shows more efficient serotonin function than the S allele, this raises the question of whether the XL allele would confer even greater resilience to depression than the L allele.23

Very few functional studies have been reported on these rare alleles. It is well established that the L allele is associated with better and faster response to selective serotonin reuptake inhibitor therapy. Smeraldi et al.24 found significant differences in response to fluvoxamine among carriers of the different alleles identified by Nakamura et al.5 They found that carriers of 16-F showed only a partial response, whereas carriers of 16-D showed a marginally better response than carriers of 16-A. The authors attributed this difference to the interaction of various transcription factors with the different consensus regions.24 A similar observation was made for the StIn2 polymorphism, in whose regulation the transcription factors YB-1 (Y box-binding protein) and CTCF (CCCTC-binding factor) have an important role.25 Ehli et al.8 showed that there was no significant difference in expression among the extra short 11-repeat, S and LG alleles, and a slight decrease in expression with the 17-repeat allele compared with LA. It will be interesting to see whether SNP rs25531 affects expression in the 17-repeat allele.

Serotonin uptake is genetically controlled, and dysregulation of 5-HTT function has been reported in various complex behavioral traits and disorders such as depression, bipolar disorder, anxiety, obsessive-compulsive disorder, schizophrenia and neurodegenerative disorders, substance abuse and eating disorders.26, 27, 28, 29, 30, 31 As 5-HTT is the main target of widely used antidepressants, it will be beneficial in the future to study the significance of these XL and other rare variants in relation to selective serotonin reuptake inhibitor therapy.