Introduction

Stuttering is a common disorder of speech fluency characterized by repetitive fragmentation of the beginnings of words, prolongation of initial sounds and large gaps between words or syllables, which are known as silent blocks.1

Stuttering has been shown to be highly heritable and has been reported to aggregate in families.2, 3 Approximately half of people who stutter have a family history of the disorder,4, 5 and significant genetic linkage to stuttering has been observed on chromosome 12.6 We have shown that mutations in the GNPTAB (GlcNAc-phosphotransferase; EC 2.7.8.17) gene within this region are associated with this disorder.7 One mutation in GNPTAB, Glu1200Lys, was found in a number of Pakistani stuttering families and in several unrelated affected individuals of South Asian descent.7

The aim of the present study was to further characterize the Glu1200Lys mutation in GNPTAB. We first sought to determine whether this mutation represents a founder mutation descended from a common ancestor, or a recurrent mutation at the same position. We also sought to estimate the age of the mutation and to construct a cladogram tracing the mutation through our study population, in order to further understand the history of this mutation.

Materials and methods

Eight unrelated individuals previously shown to carry at least one copy of the Glu1200Lys mutation in the GNPTAB gene were examined in this study. Four of the eight subjects were heterozygous for the mutation, providing a total of 12 affected and 4 unaffected chromosomes from these 8 individuals. We genotyped 33 SNPs across the 650-kb region surrounding the Glu1200Lys mutation by sequencing genomic DNA. We also genotyped the four nearest microsatellite markers listed on the Marshfield Map Marker Database. Forty-eight random Pakistani individuals from a geographically matched location in Pakistan (=96 chromosomes) were also genotyped with all markers to generate control allele frequencies.
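The chromosome counts above follow from simple bookkeeping, sketched below in Python. The sketch assumes that the four subjects not described as heterozygous are homozygous for the mutation, which is what the stated totals imply; the function and variable names are illustrative only.

```python
def count_chromosomes(n_homozygous, n_heterozygous):
    """Return (mutation-carrying, non-carrying) chromosome counts.

    Each homozygous carrier contributes two mutation-carrying chromosomes;
    each heterozygous carrier contributes one carrying and one non-carrying.
    """
    carrying = 2 * n_homozygous + n_heterozygous
    non_carrying = n_heterozygous
    return carrying, non_carrying


# Assumption: 4 homozygous + 4 heterozygous subjects (8 in total).
carrying, non_carrying = count_chromosomes(n_homozygous=4, n_heterozygous=4)

# 48 diploid control individuals contribute two chromosomes each.
control_chromosomes = 2 * 48

print(carrying, non_carrying, control_chromosomes)
```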

To identify additional informative markers in this region, the 20-kb region immediately surrounding the Glu1200Lys mutation was sequenced to completion. This sequencing resulted in the discovery of five novel SNPs. The resulting genotypes were then analyzed with PHASE (v2.1.1) to determine the most likely haplotypes for each individual.8, 9

The resulting haplotypes (12 mutation-containing chromosomes and 96 Pakistani control chromosomes) were analyzed with DMLE+ (version 2.2) to estimate the approximate age of the Glu1200Lys mutation.10 Trials were run using differing numbers of markers, ranging from all 33 informative SNP markers in the region to 7 markers immediately surrounding the Glu1200Lys mutation. The location of the Glu1200Lys mutation within the haplotype was defined as 0.022 cM from the first marker in the analysis using all 33 markers, and as 0.0039 cM when the 7 markers directly surrounding the mutation were analyzed.

TREEFINDER11 was used to generate a phylogenetic tree illustrating the relationships between the 16 chromosomes from the eight individuals. The J3+I nucleotide substitution model was used, and a bootstrap analysis with 1000 replicates was run to generate possible phylogenetic trees. A consensus program was run to choose the statistically most likely tree from those generated. The full sequence of the 650-kb region we had previously evaluated with SNP and microsatellite analysis was used. These sequences were derived from multiple sequencing traces aligned with the reference sequence using SeqMan, and included the relevant allele at each SNP or microsatellite position.
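For readers unfamiliar with bootstrap analysis, the resampling step can be sketched as follows. This is not TREEFINDER's implementation and it does not infer trees; it only illustrates how each bootstrap replicate is formed by sampling alignment columns with replacement. The chromosome labels and sequences are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement.

    `alignment` maps a chromosome label to its aligned sequence; all
    sequences are assumed to have the same length.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment; the real analysis used the full 650-kb region.
alignment = {"1-1M": "ACGTAC", "1-2M": "ACGTAC", "5-2": "ATGTCC"}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(1000)]
```

In a full analysis, a tree would be inferred from each replicate and the resulting trees summarized as a consensus.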

Results

Eight unrelated individuals carrying one or two copies of the Glu1200Lys mutation were analyzed by a combination of sequencing and genotyping in a 650-kb region surrounding the mutation. The resulting genotypes were then analyzed using PHASE to determine the most likely haplotypes carried by each of these 8 individuals and 48 control individuals. All 12 chromosomes containing the Glu1200Lys mutation were found to share a common haplotype immediately surrounding this position (Figure 1 and Supplementary Table 1). At its minimum, this haplotype consists of a unique combination of alleles at the seven SNPs nearest to the mutation, and is approximately 6.67 kb in length. This haplotype was not found in any of the 96 chromosomes in the control sample.
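The comparison described above, a core haplotype shared by every mutation-carrying chromosome and absent from controls, can be sketched in a few lines of Python. The allele strings below are invented placeholders, not the study's genotypes, and the helper names are illustrative; phased haplotypes are assumed to be available already (for example, from PHASE output).

```python
def shared_core(haplotypes):
    """Return the common haplotype if all entries are identical, else None."""
    unique = set(haplotypes)
    return unique.pop() if len(unique) == 1 else None


def frequency(haplotype, chromosomes):
    """Fraction of chromosomes that carry exactly this haplotype."""
    return sum(1 for c in chromosomes if c == haplotype) / len(chromosomes)


# Hypothetical alleles at the seven SNPs nearest the mutation.
carriers = ["ACGTTAG"] * 12                       # 12 carrier chromosomes
controls = ["ACGTTAA"] * 40 + ["GCGTCAG"] * 56    # 96 control chromosomes

core = shared_core(carriers)
print(core, frequency(core, controls))
```

A core haplotype present on all carrier chromosomes with a control frequency of zero is the pattern expected for a single-origin (founder) mutation.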

Figure 1

The Glu1200Lys mutation is surrounded by a shared haplotype. The marker identifiers are listed above the topmost line. SNPs newly discovered in this study are ss252444755, ss252444756, ss252444757, ss252444758 and SNP 1, which was not assigned an rs/ss designation because it was monomorphic in our normal control sample. The horizontal blue bar illustrates the location of the GNPTAB gene, and the lysine mutation is shown by the vertical black line in the center. The gray horizontal bars indicate regions of shared haplotype in each chromosome.

Because these results suggest a single origin of the Glu1200Lys mutation, we sought to estimate the age of this allele. In our analysis using DMLE+, the estimated age of the Glu1200Lys mutation was 572 generations (95% credible set: 467–697), or 14 300 years based on a 25-year generation time.12 We also constructed a phylogenetic tree of the 16 chromosomes in the eight unrelated individuals carrying the Glu1200Lys mutation using genotypes and DNA sequence spanning the 650-kb region shown in Figure 1. The consensus tree is illustrated in Figure 2. Chromosomes segregate into two distinct branches, one that contains the 12 chromosomes carrying the mutation, and another branch containing the four that do not. Additionally, the chromosomes that carry the mutation tend to segregate with other chromosomes that contain a similar amount of the shared haplotype. This can be seen in a comparison between Figures 1 and 2.

Figure 2

Cladogram of 16 chromosomes from eight unrelated individuals carrying the Glu1200Lys mutation. Chromosomes are labeled by subject number, followed by the chromosome number within that subject. Chromosomes carrying the mutation are designated by an M.

Discussion

Our results reveal that all chromosomes carrying the Glu1200Lys mutation in the GNPTAB gene share a single, apparently unique haplotype surrounding this mutation. This indicates that the Glu1200Lys mutation is a founder mutation that occurred once and has been inherited by all of the affected individuals in our sample. The shared haplotype was found to be as short as 6.67 kb in length, suggesting that this mutation may be relatively old. Our estimation of the age of this mutation supports this hypothesis. Using a variety of parameters, we obtained an age estimate of 572 generations, or 14 300 years.
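The conversion from generations to years is straightforward arithmetic, shown here as a short Python sketch. It uses the 25-year generation time and the 95% credible set reported in the Results; the bounds in years are derived here for illustration and are not figures stated in the text.

```python
GENERATION_YEARS = 25  # generation time assumed in the study


def generations_to_years(generations, years_per_generation=GENERATION_YEARS):
    """Convert an age in generations to an age in years."""
    return generations * years_per_generation


point = generations_to_years(572)   # point estimate
low = generations_to_years(467)     # lower bound of 95% credible set
high = generations_to_years(697)    # upper bound of 95% credible set

print(point, low, high)  # 14300 11675 17425
```

A different assumed generation time would scale all three values proportionally.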

The phylogenetic tree generated from the data also supports the founder mutation hypothesis. The chromosomes with a larger portion of the mutation-carrying haplotype are clustered closer together than chromosomes sharing a lesser amount. All chromosomes carrying the mutation are separated on the cladogram from those shown not to carry the mutation, further supporting the conclusion that they derive from a common ancestor.