Main

Alström syndrome was initially mapped to an interval of 6.1 cM between loci D2S286 and D2S327 (refs 3–5). Although several candidate genes have been investigated, no mutations have previously been identified6,7,8. We have shown that the 2p13 breakpoint in the individual with the 46,XY,t(2;11)(p13;q21)mat translocation is between these loci by metaphase fluorescence in situ hybridization (FISH) analysis using the BACs RP11-355F16 (containing D2S286) and RP11-480F1 (located 150 kb proximal to D2S327) as probes. The BAC RP11-582H21 crosses the translocation breakpoint (Fig. 1a,b) and is overlapped by RP11-79N18, which contains CCT7, a member of a chaperonin gene family9. Mutations in MKKS (encoding a putative chaperonin) have been detected in Bardet-Biedl syndrome10,11, which overlaps phenotypically with Alström syndrome (including obesity, insulin resistance and retinopathy). We therefore examined individuals with Alström syndrome for mutations in CCT7, but did not identify any changes in the coding sequence.

Figure 1: Partial karyotype and FISH analysis of 46,XY,t(2;11)(p13;q21).

a, Partial ideograms and karyotypes showing translocation breakpoints. b, BAC RP11-582H21 (red) hybridizes to chromosomes 2, derivative 2 (der2) and derivative 11 (der11). c, Long-range PCR product of 9 kb (red) from within RP11-582H21 that contains ALMS1 exons 3–5 and that also hybridizes to chromosomes 2, der2 and der11, and thus crosses the 2p13 breakpoint. Chromosome 2 PAC RP5-1011O17 (green) and chromosome 11–specific alphoid DNA (green) were used as control probes.

Sequence analysis of RP11-582H21 suggested the presence of a single gene represented by the cDNA KIAA0328 (ref. 12). Northern-blot analysis using an RT–PCR product of 3.9 kb derived from this cDNA disclosed a transcript of 12–13 kb, suggesting that the 6.3-kb KIAA0328 database entry was a partial sequence. We used RT–PCR with exon prediction and specific primer design to identify a longer transcript that was confirmed by northern-blot analysis (Fig. 2). The extended sequence (ALMS1) is 12.9 kb in length and contains an open reading frame of 12.5 kb comprising 23 exons (Fig. 3a) that span over 224 kb of genomic DNA (although the most abundant transcript detected by PCR does not contain exon 2). Analysis of the ALMS1 transcript revealed the predicted start codon in an efficient Kozak site with suitable transcription initiation less than 100 bp upstream of the 5′ limit of our cDNA sequence.

Figure 2: Multiple-tissue northern blot showing ALMS1 expression.

Hybridization was carried out with probes consisting of the PCR product of ALMS1 (3,651–4,671 bp of the coding sequence) and then with ACTB as a control.

Figure 3: Structure of ALMS1 and protein.

a, Intron and exon organization (not to scale). ALMS1 exons are shown in dark blue, untranslated regions are shown in light blue and the copies of exons 17–21 in the duplicated region are shown in brown. NAT8 exons are shown in red and are in the opposite orientation to ALMS1 (red and blue arrows). The position of the 2p13 translocation breakpoint is indicated by a dotted line. The genomic BAC contig is shown beneath. b, Primary structure of ALMS1 and position of premature stop codons causing protein truncation.

A duplication of the 6.5-kb region corresponding to exons 17–21 lies 61 kb downstream of ALMS1. The region shares 94.5% sequence identity and is in the same orientation; however, the presence of frameshift mutations indicates that the duplicated versions of exons 17–21 are not expressed. We designed PCR primers so as to screen the coding copies of exons 17–21 only. In between ALMS1 and the duplicated region is NAT8, a gene on the opposite strand that encodes a product similar to bacterial acetyltransferases13.

We localized the maternally inherited 2p13 breakpoint to a long-range PCR product of 9.0 kb (Fig. 1c), derived from RP11-582H21, containing exons 3–5. We refined the localization to an EcoRI fragment of 1.7 kb containing exon 4 and the start of exon 5 by Southern-blot analysis. In the same individual, we also detected a deletion of 2 bp in exon 8 of the paternal ALMS1 allele (2141delCT) that is predicted to cause premature termination five codons downstream of the deletion. In agreement with these findings, sequencing of an RT–PCR product derived from lymphocyte mRNA showed expression of only the transcript containing the 2-bp deletion (2141delCU), consistent with monoallelic expression of the paternal allele and disruption of the maternal ALMS1 transcript between exons 4 and 5.

We studied six other families with Alström syndrome (having either affected individuals or obligate gene carriers) and identified five additional mutations, each causing a premature stop codon (see Table 1 and Web Fig. A online). In families 8 and 9, we were unable to identify pathogenic mutations in coding sequences. We identified two mutations in exon 16 of family 2. The affected siblings are compound heterozygotes harboring a nonsense mutation (Trp3664X) and a single-base pair deletion (10775delC) resulting in a predicted frameshift causing premature termination five codons downstream (Thr3592fs). DNA from the father (F2F) was heterozygous with respect to the 10775delC mutation, but lacked the Trp3664X mutation, consistent with carrier status. Maternal DNA was unavailable. Sequencing of RT–PCR products from lymphocyte mRNA revealed expression of the respective mutations in the father and children. We also detected the 10775delC mutation in three additional families not known to be related to family 2. In family 3, the affected individual inherited the 10775delC mutation from his mother; sequencing the entire coding and splice-site regions failed to detect a paternally inherited mutation. We identified a paternally inherited, heterozygous A→G transition in exon 8, causing a substitution of aspartic acid for asparagine (Asn1788Asp); however, we also found this change in 3 of 72 normal chromosomes, suggesting that it is not pathogenic.

Table 1 Families investigated and mutations detected

We identified mutations in three additional affected individuals. One was homozygous with respect to a nonsense mutation in exon 16 (Gln3495X). The second was homozygous with respect to an insertion of a single base pair in exon 8 (7132insA), causing a frameshift mutation (Thr2378fs). The third had a deletion of 4 bp in exon 8 (6571delTCAC), causing a frameshift mutation (Ser2191fs), in addition to the 10775delC mutation. None of the truncating mutations identified in this study were found in 100 normal chromosomes. These data indicate that ALMS1 is the gene underlying Alström syndrome.
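The relationship between the nucleotide positions and the codon numbers quoted for the frameshift mutations can be checked with a short calculation. The following is a minimal sketch, assuming that coding-sequence positions are numbered from 1 at the first base of the start codon (an assumption about the numbering convention, not something stated in the text); the helper name `codon_of` is hypothetical.

```python
def codon_of(cds_pos: int) -> int:
    """Return the 1-based codon number that contains a 1-based coding-sequence position.

    Assumes position 1 is the first base of the start codon.
    """
    if cds_pos < 1:
        raise ValueError("coding-sequence positions are 1-based")
    # Integer ceiling of cds_pos / 3
    return (cds_pos + 2) // 3


# Nucleotide positions of two frameshift mutations as quoted in the text
mutations = {"10775delC": 10775, "7132insA": 7132}

codons = {name: codon_of(pos) for name, pos in mutations.items()}

for name, codon in codons.items():
    print(f"{name}: first affected codon {codon}")
```

Under this numbering assumption, the computed codons agree with the residue numbers given for Thr3592fs and Thr2378fs.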

The ALMS1 protein consists of 4,169 aa and has a predicted molecular weight of 461.2 kD. In silico analysis predicts a leucine zipper at aa 2480–2501 and a potential signal peptide at aa 211–223, but no other known evolutionarily conserved sequence domains are apparent (Fig. 3b). A striking feature of the sequence is a large tandem-repeat domain comprising 34 imperfect repetitions of 47 aa (aa 540–2201; Fig. 4) that contains no cysteine residues. This array constitutes approximately 40% of the protein and is encoded entirely by exon 8. Repeat length varies from 45 to 50 aa, and repeat identity ranges from 40% to 90%. ALMS1 also contains a run of 17 glutamic-acid residues (aa 13–29) encoded by (GAG)₁₃GAA(GAG)₃, followed by a run of 7 alanine residues (aa 30–36).
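The figures in this paragraph can be cross-checked arithmetically. The sketch below is illustrative only: it rebuilds the glutamate-encoding tract from the codon counts given above (13 GAG codons, one GAA, then 3 further GAG codons) and computes the share of the protein occupied by the tandem-repeat domain from the quoted residue coordinates. All variable names are hypothetical.

```python
# Both GAG and GAA encode glutamic acid in the standard genetic code
GLU_CODONS = {"GAG", "GAA"}

# Tract rebuilt from the codon counts quoted in the text
tract = "GAG" * 13 + "GAA" + "GAG" * 3
tract_codons = [tract[i:i + 3] for i in range(0, len(tract), 3)]
n_glu = sum(codon in GLU_CODONS for codon in tract_codons)

# Residue coordinates quoted in the text
PROTEIN_LENGTH = 4169
GLU_RUN_START, GLU_RUN_END = 13, 29
REPEAT_START, REPEAT_END = 540, 2201

glu_run_len = GLU_RUN_END - GLU_RUN_START + 1
repeat_len = REPEAT_END - REPEAT_START + 1
repeat_fraction = repeat_len / PROTEIN_LENGTH

print(f"glutamate codons in tract: {n_glu}")
print(f"glutamate run length from coordinates: {glu_run_len}")
print(f"repeat domain: {repeat_len} aa ({repeat_fraction:.1%} of the protein)")
```

The tract yields 17 glutamate codons, matching the 17-residue run at aa 13–29, and the repeat domain spans 1,662 residues, or roughly 40% of the 4,169-aa protein.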

Figure 4: Tandem amino-acid repeat.

The sequence alignment of 34 copies of a tandem-repeat unit identified in ALMS1 (residues 540–2201). The repeat unit is not perfect (length 45–50 aa and identity 40–90%). The array is interrupted by a region of 60 aa composed of incomplete degenerate repeats (residues 2001–2060), which for purposes of alignment is not shown. The array is encoded entirely by exon 8 and contains no cysteine residues. Light and dark shading indicate greater than 60% and greater than 80% conservation, respectively.

To identify other potential pathogenetic mechanisms, we analyzed the GAG trinucleotide repeat and polyalanine tract in exon 1 for size variation. Sequencing of 72 normal chromosomes revealed a size variation of 12–20 glutamic acids; however, the GAG allele sizes of the obligate carriers (F3F, F8M, F8F) and affected individuals (F9Ch), in whom no other mutations have been detected, lay within the normal variation range. We did not find any variation in the size of the polyalanine tract in either normal individuals or obligate gene carriers.
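Sizing the glutamic-acid tract from sequence data amounts to counting consecutive in-frame GAG or GAA codons and comparing the count with the range seen in normal chromosomes. The following is a minimal sketch of that idea, not the procedure used in the study: the function names, the flanking bases and the example allele are all hypothetical, and the input is assumed to start in the correct reading frame.

```python
GLU_CODONS = ("GAG", "GAA")
NORMAL_RANGE = (12, 20)  # glutamic-acid tract sizes reported for normal chromosomes


def longest_glu_run(cds: str) -> int:
    """Length, in codons, of the longest in-frame run of GAG/GAA codons."""
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    best = current = 0
    for i in range(0, usable, 3):
        if cds[i:i + 3] in GLU_CODONS:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


def in_normal_range(n_glu: int) -> bool:
    """True if a tract size lies within the reported normal variation."""
    low, high = NORMAL_RANGE
    return low <= n_glu <= high


# Hypothetical allele: invented flanking codons around a 17-codon tract
example_allele = "ATGGCC" + "GAG" * 13 + "GAA" + "GAG" * 3 + "GCG" * 7

size = longest_glu_run(example_allele)
print(f"tract size: {size} codons, within normal range: {in_normal_range(size)}")
```

A tract size outside the 12–20 range would flag an allele for closer inspection; in the families examined here, all sizes fell inside it.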

Using RT–PCR, we demonstrated ALMS1 expression in all of the tissues and cells investigated: fetal heart, aorta, liver, kidney, lung, neural tube, eye, adrenal glands, placenta, lymphocytes and the WERI retinoblastoma cell line.

Cloning breakpoints of balanced translocations has proved a successful strategy for identifying dominant and X-linked, but not autosomal recessive, disease genes. We believe that ALMS1 is the first human autosomal gene involved in recessive disease to be identified using this strategy. The six independent mutations predicted to cause protein truncation and the accompanying report by Collin et al.14 confirm that dysfunction of ALMS1 causes Alström syndrome. We failed to identify mutations in two families, which raises the possibility of genetic heterogeneity, although our methods of mutation detection did not exclude larger DNA deletions or mutations within regulatory regions.

The function of ALMS1 is not clear. There are similarities between the structural organization of ALMS1 and that of mucin genes15. Mucins are secreted proteins that are heavily glycosylated and have a large tandem-repeat domain encoded by a single exon. Like ALMS1 repeats, mucin tandem repeats have a low cysteine-residue content, but they also have characteristic high threonine and serine content15 not found in ALMS1 repeats, suggesting that ALMS1 is not a mucin. ALMS1 is a new protein involved in an insulin-resistance syndrome and may therefore lead to the discovery of a new pathway.

Methods

Patient samples.

We obtained samples from nine families with Alström syndrome, including lymphoblastoid cell lines from members of three families from the European Collection of Cell Cultures (Salisbury, UK; BV0752, BV0754, BV0757, BV0791, BV0792, BV0764). In total, we used DNA from nine affected children; in two families, DNA was available from only one parent. We used the DNA of 50 unrelated normal individuals as controls. Ethical approval was obtained from the Newcastle Local Research Ethics Committee.

Sequence analysis.

We identified BACs within the 6.1-cM critical region using BLAST to query the nr and htgs databases with marker sequences. We then built BAC contigs electronically using the TIGR (The Institute for Genomic Research) BAC end-sequence database and NIX analysis. We identified genes using NIX and analyzed protein sequences using PIX and InterProScan.

Mutation detection.

We sequenced PCR products using a BigDye Terminator Cycle Sequencing Kit (Applied Biosystems) according to the manufacturer's instructions and analyzed the reaction products on an ABI 377 sequencer. We designed primers to amplify exon and splice-site sequences from genomic DNA.

FISH analysis.

We labeled probes for FISH following the manufacturer's instructions (Vysis) and hybridized them to metaphase chromosomes. We obtained RP11 BACs from BAC/PAC Resources and CTD-2005P16 from Research Genetics. We generated long-range PCR products using an Expand Long Template PCR kit (Roche).

RT–PCR and RACE.

We extracted total RNA from lymphoblastoid cells using Tri reagent (Sigma) and generated first-strand cDNA using Superscript II (Invitrogen). We carried out 5′ RACE as described16. We obtained fetal tissues with approval from the Southampton and South West Hampshire Joint Local Research Ethics Committee and after receiving informed consent.

Southern-blot analysis.

We separated restriction digests of total genomic DNA on a 1% agarose gel and transferred them to membrane by alkali blotting. We purified a 346-bp PCR product spanning part of exon 4 and part of the following intron by gel extraction and labeled it with [α-32P]dCTP using a Rediprime II kit (Amersham Pharmacia Biotech). We followed a standard filter hybridization and washing protocol.

Northern-blot analysis.

We hybridized human multiple-tissue northern blots (Clontech) with a probe of either a 3.9-kb RT–PCR product spanning exons 10–20 (8,244–12,161 bp of the ALMS1 coding sequence) or the PCR product of Ex8fF/Ex8fR (3,651–4,671 bp) and then with ACTB as a control. We purified and labeled the probes as described above for Southern-blot analysis and followed the manufacturer's instructions for hybridization and washing conditions (Clontech).

GenBank accession number.

ALMS1 mRNA sequence, AJ417593.

Note: Supplementary information is available on the Nature Genetics website.