Main

We initially mapped ALMS1 within a region of 14.9 cM (ref. 6) on chromosome 2p13 in a large French Acadian kindred; we later refined the interval to 6.1 cM (refs 7,8). Further recombinational and physical mapping resolved the critical interval to less than 2 cM, encompassing a region of 1.2 Mb (Fig. 1). We assembled the physical contig from publicly available sequence data (GenBank) by aligning overlapping BAC clones and adjoining fragments by transcription unit content. We identified candidate genes for mutation analysis by comparing the contig sequence with sequences of identified genes and expressed sequence tag (EST) clusters, using the NIX (UK Human Genome Mapping Project Resource Center) annotation pipeline and individual searches of the LifeSeq database (Incyte Genomics) and GenBank. We identified 16 genes and EST clusters within the minimal interval (Fig. 1) and prioritized candidate genes based on their expression pattern and function. Genes that are known to be expressed in the eye (RAI15, a retinoic acid–responsive transcript) and genes that, based on their function, might be involved in obesity and retinal degeneration (the transcript KIAA0919, encoding a secretory pathway component, and CCT7, encoding a chaperonin) were screened first; we proceeded to carry out a systematic screening of all genes in the region.

Figure 1: Fine-resolution and physical maps of the ALMS1 region.

Recombinations in an affected child from the French Acadian Kindred 1 and a child from a small nuclear family, Kindred 53, place the ALMS1 critical interval in a region of less than 2 cM. Affected haplotypes are shown as darkened bars. Eight overlapping BACs complete a 1.2-Mb contig. Locations of 16 known and predicted genes identified from EST clusters are shown as darkened bars. Genes not tested for mutation analysis are marked with an asterisk. 1–16: EST (KIAA0919), SPR (sepiapterin reductase), EMX1 (empty spiracles, Drosophila homolog 1), THC529835 (Caenorhabditis elegans sre2 homolog), EST (KIAA0857), EST (THC551446), a predicted gene related to EMX homeobox protein, RAI15 (retinoic acid–induced 15), CCT7 (chaperonin containing TCP1, subunit 7), EST (THC530316), EST (AI014261), EGR4 (early growth response 4), EST (KIAA0328), DUSP11 (dual-specificity phosphatase 11), AMSH (STAM-associated molecule) and ACTG2 (actin, γ-2, smooth muscle, enteric).

One EST cluster, including transcript KIAA0328, was composed of cDNA fragments expressed in many tissues that are affected in individuals with Alström syndrome. To obtain the full-length coding sequence of this new gene, we aligned KIAA0328 with overlapping transcripts from the databases of Incyte Genomics, GenBank and The Institute for Genomic Research (TIGR). We initially predicted a human cDNA sequence of 6,612 bp. Wilson et al.9 report the identification of additional exons at the 5′ end of the gene in an accompanying paper. We confirmed the presence of these exons by sequence analysis of RT–PCR products from human and mouse (C57BL/6J) brain. We identified an additional exon (exon 2) in the mouse cDNA sequence. This exon is conserved in the human genomic sequence and was subsequently confirmed to be present in human transcripts by direct amplification of human brain cDNA. We derived a cDNA sequence of 12,871 bp with an open reading frame encoding 4,169 amino acids (see Web Fig. A on the supplementary information page of Nature Genetics online). We found a putative translation initiation site at nt 1 and two polyadenylation signals (AATAAA) at nt 12594 and nt 12796, respectively. Alignment of the cDNA sequence with the genomic sequence allowed us to identify 23 exons, varying in length from 64 bp to 6,108 bp (Fig. 2a).
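As an illustration of how the positional features of a cDNA can be checked, the following minimal Python sketch scans a sequence for AATAAA hexamers and computes the length of an open reading frame from its residue count. The function names and the toy sequence are hypothetical and are not taken from the ALMS1 sequence. Assuming an uninterrupted reading frame starting at nt 1, 4,169 codons plus a stop codon would occupy 12,510 nt, which lies upstream of both reported polyadenylation signals.

```python
from typing import List


def find_polya_signals(cdna: str, signal: str = "AATAAA") -> List[int]:
    """Return the 1-based start position of every occurrence of the signal."""
    cdna = cdna.upper()
    positions = []
    start = cdna.find(signal)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based nt
        start = cdna.find(signal, start + 1)  # allow overlapping hits
    return positions


def orf_length_nt(n_amino_acids: int) -> int:
    """Nucleotides needed to encode n residues plus one stop codon."""
    return 3 * (n_amino_acids + 1)


# Hypothetical toy sequence (not ALMS1) carrying two AATAAA hexamers.
toy_cdna = "GGCCAATAAAGGTTCCAATAAAGG"
print(find_polya_signals(toy_cdna))  # 1-based start positions
print(orf_length_nt(4169))           # ORF length in nt, stop codon included
```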

Figure 2: Genomic structure and alternative splicing of ALMS1.

a, Exon–intron structure of KIAA0328 (ALMS1) drawn to scale (see Web Table A online). The gene comprises 23 exons spanning more than 230 kb of genomic DNA. 5′ and 3′ untranslated regions and exon regions are depicted by open and filled boxes, respectively. Slash bars indicate introns for which there is incomplete sequence information. b, Original EST cluster, KIAA0328, from human brain. c, Human brain RT–PCR product. d–f, Alternative transcripts of ALMS1 in testes. g, Splicing pattern of Alms1, as determined by RT–PCR. Human transcripts containing exon 2 seem to be rare, as they were observed only when using exon 2–specific primers.

We designed intronic primers to amplify and sequence the coding region in DNA of six unrelated individuals with Alström syndrome. In a large consanguineous Acadian kindred (K1)10, we identified an insertion of 19 bp in exon 16 (Fig. 3a), which causes a frameshift resulting in early termination at codon 3530. All five affected subjects from the extended pedigree were homozygous with respect to the insertion. Transmission of the insertion allele in unaffected carriers is consistent with previously reported haplotypes6. The insertion allele was not seen in 100 unrelated individuals from the general population. We identified five additional mutations in five unrelated families of diverse ethnicity (Fig. 3b–e and Table 1). All mutations segregate with ALMS1. We identified a homozygous mutation, 8383C→T, generating a TAA termination signal, in a consanguineous Italian family (K57). We observed a frameshift mutation, 10609GA→T;10614G→A, resulting in a premature termination signal at codon 3546, in three affected siblings of a consanguineous French family (K22). We identified a TAA nonsense mutation, 11449C→T, in two distant cousins of a consanguineous Portuguese family (K59). One cousin harbors the mutation in a homozygous state; the other carries the TAA nonsense mutation in one allele and an insertion mutation (8395insA) in the other.
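The two classes of mutation described above (a C→T substitution that creates a TAA codon, and an insertion that shifts the reading frame onto a premature stop) can be illustrated with a minimal Python sketch. The helper names and the 21-nt toy coding sequence are hypothetical; they show the logic only and do not reproduce the ALMS1 sequence or its codon numbering.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cds: str) -> Optional[int]:
    """Return the 1-based codon number of the first in-frame stop, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None


def substitute(cds: str, pos: int, ref: str, alt: str) -> str:
    """Apply a single-base substitution at 1-based position pos."""
    if cds[pos - 1] != ref:
        raise ValueError(f"expected {ref} at nt {pos}, found {cds[pos - 1]}")
    return cds[:pos - 1] + alt + cds[pos:]


def insert_after(cds: str, pos: int, bases: str) -> str:
    """Insert bases immediately after 1-based position pos."""
    return cds[:pos] + bases + cds[pos:]


# Hypothetical toy coding sequence: ATG GCA CAA GGT AAA CTG TAG
toy_cds = "ATGGCACAAGGTAAACTGTAG"
nonsense = substitute(toy_cds, 7, "C", "T")  # CAA -> TAA at codon 3
frameshift = insert_after(toy_cds, 6, "T")   # single-base insertion after codon 2

print(first_stop_codon(toy_cds))     # normal stop
print(first_stop_codon(nonsense))    # premature stop created by the substitution
print(first_stop_codon(frameshift))  # premature stop in the shifted frame
```

In this toy example the substitution truncates the product at the mutated codon itself, whereas the insertion produces a short stretch of altered codons before the new stop, which mirrors the difference between the nonsense and frameshift alleles reported here.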

Figure 3: Mutations in six unrelated families segregating with ALMS1.

We observed mutations in all affected subjects. a–d, All mutations segregate with the disease in a homozygous state in affected individuals. a, Detection of a 19-bp insertion in exon 16 of KIAA0328 in a large consanguineous Acadian kindred (K1)13. The chromatogram shows the sequence variation between a normal control and an affected individual. PCR amplification of the 19-bp insertion from a nuclear family within the Acadian kindred is shown on the right. The parents (1,2) are heterozygous with respect to the mutation (carriers), the unaffected child (3) is homozygous with respect to the normal allele (439 bp, noncarrier) and the affected child (4) is homozygous with respect to the insertion (458 bp). The transmission of the insertion is in full agreement with previously reported haplotypes (data not shown). b, An 8383C→T nonsense mutation in exon 10 of an affected individual of Italian descent. c, 10609GA→T;10614G→A mutations in exon 16, resulting in a frameshift in all three affected siblings of French descent. d, A 10775delC frameshift mutation, which results in a premature termination signal at codon 3597, was identified in exon 16 of two kindreds. The genealogical relationship between these two kindreds has not been established. e, An 11449C→T nonsense mutation in second cousins A7 (homozygous) and A8 (heterozygous). We identified a second mutation (8395insA) in subject A8, which results in a frameshift. Slashed symbols indicate that the individual is deceased. Black symbols represent affected individuals, and gray symbols depict those individuals unavailable for study.

Table 1 Summary of mutations found in six kindreds, segregating with Alström syndrome

We found a 10775delC mutation in two unrelated young adults (Fig. 3d): a 19-year-old male of British ancestry (K42, subject A5) and a 21-year-old male who traces his ancestry to Britain two centuries ago (K3, subject A6). Both presented with infantile cardiomyopathy within the first two months of life and subsequently developed short stature, scoliosis, type 2 diabetes and renal insufficiency. However, they differed in the course of their disease presentation. Subject K42-A5 experienced a sudden recurrence of dilated cardiomyopathy at age 18 and has no evidence of hepatic dysfunction, whereas subject K3-A6 presented with severe hepatic failure at age 20 and has not had a recurrence of cardiomyopathy. This difference in disease progression in individuals carrying the same mutation suggests that the phenotypic variability observed in many individuals with Alström syndrome may be the result of genetic or environmental modifiers interacting with the ALMS1 locus.

Analysis by RT–PCR of human cDNA panel 1 showed that ALMS1 is ubiquitously expressed (Fig. 4a). We failed to detect expression of ALMS1 when probing a human multiple-tissue blot with ALMS1 DNA fragments after seven days of exposure, suggesting low abundance of the transcript. We then carried out northern-blot analysis of RNA from human testes and fetal brain. We found four transcripts of approximately 12.6 kb, 7.8 kb, 6.0 kb and 4.8 kb (data not shown). Additional tissues, not tested directly by RT–PCR or northern blot, that show expression as determined by in silico analysis of human cDNA libraries from the LifeSeq database include adrenal, thyroid, pituitary and mammary glands, thymus, uterus, urinary tract, colon and connective tissue (see Web Table B online). Northern-blot analysis of RNA from mouse tissues indicates that Alms1 is also ubiquitously expressed at low levels; it is expressed in the lung, heart, kidney, large intestine, spleen, eye and ovary. By contrast, expression in testis is high (Fig. 4b).

Figure 4: Expression of ALMS1 in adult human and mouse tissues.

a, RT–PCR of human cDNA multiple-tissue panel. b, Northern blot of mouse tissue, hybridized with a 490-bp cDNA fragment spanning exon 8, and β-actin mRNA as control.

We identified several splice variants of ALMS1 by comparing sequences from public databases and the LifeSeq cDNA library databases as well as from our RT–PCR analyses (Fig. 2). We estimated the relative abundance of the variants from the number of GenBank clones representing the different sequences and the tissue distribution analysis reported in the LifeSeq database. The most abundant variants contain exons 22 and 23 and are collectively referred to as the β-form (Fig. 2e,f). Exons 22 and 23 are not represented in the originally identified KIAA0328 transcript, which has an alternative polyadenylation site in intron 21. The predicted open reading frame of KIAA0328 terminates immediately after exon 21. To obtain a longer sequence of the β-form, we amplified human testis cDNA using oligonucleotide primers designed to amplify exons 8–23. Fragments ranging from 0.6 kb to 5.5 kb were produced, suggesting complex alternative splicing. The sequences of two fragments (β-5.4 and β-5.3) confirmed that alternative splicing occurred. Form β-5.4 has a shortened exon 8 (368 nt), presumably resulting from the use of an internal splice donor site within exon 8 in conjunction with the use of the splice acceptor in exon 20. The variant β-5.3 also has a shortened exon 8 (555 nt), but splices into intron 21, 16 nt upstream of exon 22. Other, larger β-variants exist (data not shown), but have not been fully sequenced. In addition, a γ-variant may exist, represented by Incyte clone 2011622, that uses an alternative polyadenylation site in intron 16. Database searches also identify other rare variant sequences reported in GenBank and the LifeSeq database.

We assembled the mouse Alms1 cDNA sequence of 9.8 kb by aligning several EST sequences (GenBank and TIGR) and by aligning human cDNA sequence with mouse genomic trace data (GenBank) and mouse genomic fragments (Celera Genomics). We confirmed the assembled sequence by sequencing PCR-amplified cDNA derived from C57BL/6J mice. Excluding the region encoded by exon 8, the deduced amino-acid sequence of mouse Alms1 is 63.7% identical to the protein sequence of human ALMS1. Exon 8 consists primarily of a short repeat unit of approximately 140 bp. There are roughly 35 copies of the repeat in the human gene and about 15 copies in the mouse gene. As the repeat units are highly similar but not identical in sequence, we estimated the number of repeats on the basis of their sequence similarity.
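A percentage identity of this kind is computed from a pairwise alignment. The minimal Python sketch below shows one common convention, in which columns containing a gap in either sequence are excluded from the denominator. This convention is an assumption made for illustration, as the denominator used for the 63.7% figure is not specified here, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns that are ungapped in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 6 ungapped columns, 5 of them identical.
print(round(percent_identity("MKT-LLSA", "MRTELL-A"), 1))
```

Other conventions (for example, dividing by the full alignment length or by the length of the shorter sequence) give lower values for the same alignment, so the choice of denominator should be stated whenever identities are compared.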

Although the degree of sequence similarity is low for orthologous genes, three lines of evidence suggest that we have identified the true mouse ortholog and not a family member. First, BLAST searches of Celera Genomics and public databases did not show other sequences with significant similarity (except across the small ALMS domain). Second, the genomic sequence of ALMS1 is flanked by EGR4 and DUSP11, as is the putative Alms1 genomic sequence in the Celera assembly (data not shown). Finally, the putative Alms1 was mapped to central mouse chromosome 6, closest to D6Mit22, using the T31 mouse radiation-hybrid panel. This corresponds to the homologous region for human chromosome 2p13, in which ALMS1 is located.

We carried out motif and homology searches using Prosite and Pfam databases. No signal sequences or transmembrane regions were detected, which, together with the overall hydrophilic nature of the protein, suggests an intracellular localization. The ALMS1 protein contains a predicted leucine zipper motif (aa 2480–2501) and a serine-rich region (aa 3857–3873). In addition, we identified potential nuclear localization signals (aa 3805–3830 and aa 3937–3954), as well as a histidine-rich region (aa 3486–3523) in the mouse sequence. All of these features are conserved between human and mouse ALMS1; however, because of the frequent occurrence of such sequences in various proteins, the functional significance of these motifs must be tested experimentally. In addition to the above domains, we identified a region of 120 aa at the C terminus of ALMS1 that has sequence similarity to regions of two predicted proteins from macaque and mouse (Fig. 5). Owing to the relatively small region of homology, it is unlikely that these sequences represent additional gene family members. It is possible that this well conserved ALMS motif defines a protein domain that may have structural or functional significance.
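For illustration, the Prosite leucine zipper pattern (PS00029, L-x(6)-L-x(6)-L-x(6)-L) can be expressed as a regular expression. The minimal Python sketch below reports 1-based start and end positions of every match, including overlapping ones. The function name and the toy peptide are hypothetical and do not come from ALMS1. The pattern spans 22 residues, which agrees with the length of the predicted motif at aa 2480–2501.

```python
import re
from typing import List, Tuple

# Prosite PS00029: L-x(6)-L-x(6)-L-x(6)-L. The lookahead allows overlapping hits.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(protein: str) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) coordinates of each pattern match."""
    hits = []
    for m in LEUCINE_ZIPPER.finditer(protein.upper()):
        start = m.start() + 1
        end = m.start() + len(m.group(1))
        hits.append((start, end))
    return hits


# Hypothetical toy peptide: 3-residue prefix, then leucines spaced 7 apart.
toy_protein = "MKA" + "L" + "AEEAKR" + "L" + "QEENER" + "L" + "KAENEK" + "L" + "GS"
print(find_leucine_zippers(toy_protein))
```

Because this pattern is short and permissive, matches are expected by chance in many proteins, which is consistent with the caution above that such motifs require experimental testing.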

Figure 5: Amino-acid similarity between human and mouse ALMS1 with mouse and macaque domains.

Asterisks indicate identical amino acids, and dots represent conserved amino-acid changes.

Obesity and type 2 diabetes, pervasive public health problems, are associated with increased risk of morbidity and mortality and affect a large percentage of the population11. Both diseases are influenced by environmental conditions but also by a strong genetic component12,13. Most of the genes identified so far that lead to obesity and type 2 diabetes have been in the context of syndromic diseases such as Bardet–Biedl syndrome14,15,16. The infantile obesity observed in individuals with Alström syndrome is probably caused by mutation of ALMS1, as it constitutes a relatively early (as early as 6 months) phenotype observed in all affected children. The early onset of obesity, anecdotal reports of hyperphagia, and the sensory deficits observed in individuals with Alström syndrome seem to point to a neuronal dysfunction and suggest that the obesity is due to loss of ALMS1 function in the central nervous system.

The insulin resistance and chronic hyperglycemia, accompanied by hyperlipidemia and atherosclerosis, that are seen in Alström syndrome also occur in common forms of adult-onset type 2 diabetes, although they occur at an accelerated rate in individuals with Alström syndrome. That nearly all individuals with Alström syndrome develop type 2 diabetes suggests that ALMS1 may be involved in diabesity—that is, both obesity and diabetes susceptibility are due to the altered function of one gene. This distinguishes it from the common forms of obesity, in which the genes that are presumably involved are thought to interact with independently segregating genes that confer diabetes susceptibility, as not all obese individuals develop type 2 diabetes. In such a scenario, an ALMS1 mutation in the hypothalamus might lead to hyperphagia followed by obesity, and subsequently the obesity, combined with the defective function of ALMS1 in another organ system, such as the pancreas, might lead to the development of type 2 diabetes.

The ubiquitous expression of ALMS1 could explain the syndromic nature of the disease, as concomitant, defective ALMS1 function in multiple tissues may cause the many phenotypes observed. Interactions between genes may also have a role, however, as suggested by the different disease presentations seen in two of our affected individuals who carry the same mutation. Recently, a triallelic inheritance pattern has been proposed to occur in the phenotypically similar, heterogeneous Bardet–Biedl syndrome17. It was suggested that Bardet–Biedl syndrome may be a complex trait requiring three mutant alleles to manifest a disease phenotype. This does not seem to be the case in Alström syndrome, as we have not identified an unaffected individual carrying two mutated ALMS1 alleles. In over 90 families studied so far, linkage to the ALMS1 locus is both necessary and sufficient to explain the affected status. Nevertheless, the genes involved in the Bardet–Biedl syndrome and Laurence Moon syndrome may be good candidates as modifier genes of Alström syndrome.

It is unlikely that mutations in ALMS1 have a major role in common diseases in the general population; the value of studying this gene lies in its potential to uncover new metabolic and regulatory pathways involved in the etiology of obesity, type 2 diabetes, neurosensory diseases and related disorders. Many examples of this paradigm—identifying single-gene mutations, thus allowing the identification of upstream and downstream molecules in a pathway—have been reported (see, for example, ref. 18). Determining the function of ALMS1 may provide insight into how variants of this gene interact with those of other genes to produce its pathological effects.

Methods

Families.

We isolated DNA from individuals with Alström syndrome, their family members and control subjects from peripheral whole blood using a standard protocol19. Inclusion criteria were based upon the assessment of the cardinal features of Alström syndrome as well as the clinical diagnosis. We obtained written informed consent from all subjects. The Institutional Review Board at The Jackson Laboratory approved all experimental protocols.

Genotyping.

Oligonucleotide primers for amplification of short tandem repeat polymorphisms (STRPs) were either obtained from Research Genetics or designed (MacVector 6.0)20 and custom-made (One Trick Pony). We carried out PCR amplification of STRPs with [33P]oligonucleotides as previously described21. PCR products were separated on a 6% denaturing polyacrylamide gel and visualized by autoradiography.

Mutation analysis.

We amplified exons 8–23 of ALMS1 by standard PCR protocols. We separated amplified products on a 1–1.2% gel, purified products using Nucleospin columns (Clontech) and sequenced with an ABI Prism 3700. We compared sequencing results to an unaffected control, human BAC sequence and cDNA (KIAA0328). PCR primer sequences are available upon request.

Expression analysis.

To generate the probe for northern-blot analysis of mouse, we amplified mouse C57BL/6J retinal cDNA with primers specific to exon 8, using the Expand Template system (Roche). We purified the 490-bp product and radiolabeled probes with the Rediprime II labeling system (Amersham Pharmacia). Mouse multiple-tissue blots22 were pre-hybridized for 1 h with Rapid Hyb buffer (Amersham Pharmacia) and hybridized overnight. Membranes were washed, and hybridized products were visualized by autoradiography after an 8-d exposure. Blots were incubated with a probe representing β-actin coding sequence as a control22. For northern-blot analysis of human tissue, we hybridized human testes and fetal brain (5 μg) and human multiple-tissue blots (FirstChoice Blot 1, Ambion) with a 394-bp probe generated by PCR amplification of genomic DNA and carried out hybridization as above. For RT–PCR, we amplified human multiple-tissue cDNA panel I (Clontech) using forward and reverse primers for 35 cycles at an annealing temperature of 56 °C. Primer sequences are available upon request.

Mouse cDNA sequence.

We prepared total RNA from whole brain of male C57BL/6J mice. Tissues were homogenized and RNA was isolated by TRIzol (Life Technologies) treatment according to the manufacturer's protocol. We generated cDNA using the Superscript One-Step RT–PCR kit (Life Technologies). We designed primers for PCR amplification of Alms1 from sequences of aligned ESTs from the Celera Genomics database and carried out PCR using the Expand Template system (Roche).

Radiation-hybrid mapping.

To place Alms1 on The Jackson Laboratory radiation-hybrid map, we used clones from the Mouse T31 Radiation Hybrid panel (Research Genetics). We amplified 100 hybrid clones using primer pair sequences of Alms1 genomic sequence (Alms1 exon 10 and Alms1 intron 10) and submitted retention banding patterns to The Jackson Laboratory Mapping Panels.

Databases.

These data were generated using the databases of Incyte Genomics LifeSeq, The Institute for Genomic Research, GenBank, Celera Discovery System and Celera Genomics.

Accession numbers.

Sequence data for human transcripts: KIAA0328, AB002326; KIAA0919, AB023136; mouse C57BL/6J cDNA, AF425257; human BACs, AC069346 and AC074008. Splice variant sequence: β-form, THC530050; γ-form, AL041387; rare variants, W11846 and AW082244. Protein motifs: leucine zipper motif, PS00029; potential nuclear localization signal, PS50079.

URL.

http://www.hgmp.mrc.ac.uk/NIX/

Note: Supplementary information is available on the Nature Genetics website.