Introduction

Pax genes are a highly conserved family of developmental control genes that encode transcription factors and are present in organisms ranging from nematodes to man.1 Nine Pax genes have been identified in mouse and man and are dispersed around the genome.2 The Pax gene family has been classified into four subfamilies (Pax1 and Pax9; Pax2, Pax5 and Pax8; Pax3 and Pax7; Pax4 and Pax6), according to their genomic organisation, the sequences of the paired domains and their expression patterns.1 The most conserved functional motif in all Pax proteins is the 128 amino-acid paired domain, which exhibits DNA-binding activity.3 Paired-box-containing genes were first discovered in Drosophila melanogaster (Drosophila paired segmentation gene),4 where they have multiple functions during embryogenesis. Pax genes have also been shown to play a role in pattern formation during embryogenesis in vertebrates, possibly by determining the time and place of organ initiation or morphogenesis.3 Although the primary developmental action of Pax transcription factors in most tissues has yet to be elucidated, they are thought to play a role in signal transduction during tissue interactions that could lead to a position-specific regulation of cell proliferation. Indeed, analysis of spontaneous and transgenic mouse mutants has shown that vertebrate Pax genes are key regulators during organogenesis of the eye, ear, nose, limb muscles, kidney, vertebral column and brain.5,6,7,8,9,10,11,12

Loss-of-function mutations in Pax genes have been associated with both spontaneous mouse mutants (undulated (Pax1),5 Splotch (Pax3),6 Small eye (Pax6),7 Neu (Pax2)8) and congenital human diseases such as Waardenburg syndrome (PAX3),9 aniridia and Peters anomaly (PAX6),10,11 renal coloboma syndrome (PAX2),12 thyroid dysgenesis (PAX8)13 and oligodontia (PAX9).14 All these show defects in development, and, in each case, haploinsufficiency is the pathogenetic mechanism.

The homology between mouse and human mutants has been a useful tool in defining conditions in man that are caused by Pax gene mutations. However, the human homologue of the first mouse Pax mutant undulated (un)5 remains elusive. Un is caused by a missense mutation in the paired box of Pax1, which decreases the DNA-binding affinity of the protein and alters its DNA-binding specificity.15 The un mouse, first described in 1947,16 has segmentation anomalies along the entire axial skeleton where Pax1 is expressed. The intervertebral discs are irregular and reduced, and there are also anomalies in the development of the pectoral girdle, including the absence of the acromion of the scapula or its replacement with a ligament.17 Pax1 is also expressed in the thymus which, in un mice, is only half the normal size.

Human PAX1 maps to chromosome 20p11.2.2,18 Trisomy and monosomy of this locus have been associated with vertebral anomalies.19 Based on the mouse studies, the human phenotype of a PAX1 mutant would be expected to have vertebral segmentation anomalies possibly in conjunction with pectoral girdle and thymus abnormalities. Klippel–Feil syndrome (KFS)20 is a condition characterised by failed segmentation of the cervical vertebrae with the clinical sequelae of a short, immobile neck and a low posterior hairline. Vertebral fusions may also occur elsewhere along the spine and other vertebral anomalies such as hemivertebrae may be present.21 Some patients have a ligamentous or bony connection between the vertebrae and scapula. Other features include Sprengel's shoulder, renal, cardiac and neurological abnormalities.22,23,24 There are three subtypes of the condition categorised according to the extent of vertebral involvement.25 KFS appears to be an aetiologically heterogeneous condition, with sporadic occurrence and both autosomal dominant and autosomal recessive modes of inheritance reported.26 The phenotypic similarities between the undulated mouse and KFS encouraged us to screen KFS patients for PAX1 mutations. In this paper, we define the genomic structure of the human PAX1 gene and describe the sequence variations detected in a panel of 63 KFS patients screened for mutations in this gene.

Materials and methods

Clinical ascertainment of KF patients

A clinical study of KFS in the UK was undertaken and samples were obtained from patients for both chromosome analysis and DNA extraction. Chromosome analysis was carried out for karyotyping and no gross abnormalities were detected. A total of 63 KFS patient samples were obtained for molecular genetic analysis. Clinical information relating to the patients was also obtained so that they could be categorised according to their phenotype.

PCR amplification and sequencing

DNA was extracted from peripheral blood using standard automated extraction. PCR was used to amplify regions of the PAX1 gene from patient DNA for mutation screening by single-strand conformational polymorphism (SSCP) and heteroduplex analysis, and to amplify fragments for direct sequencing. PCR reactions were carried out in a 20 μl reaction volume using 50–100 ng genomic DNA as template and 10 pmol of each primer. The reactions were carried out in PCR buffer with a final Mg2+ concentration of 3.7 mM, 0.75 mM of each dNTP and 0.5 U Taq polymerase. The amplifications were carried out under the following conditions: 95°C for 3 min, 30 cycles of 94°C, 55–65°C and 72°C each for 1 min and a final extension at 72°C for 5 min. PAX1 PCR and sequencing primers and annealing conditions are summarised in Table 1.
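As a quick arithmetic check on the reaction set-up and cycling programme described above, the following minimal sketch derives the final primer concentration and the programmed run time. The constant and function names are illustrative assumptions, and the run time ignores ramping between temperatures, which depends on the thermal cycler.

```python
# Values taken from the PCR conditions stated in the text.
REACTION_VOLUME_UL = 20        # reaction volume in microlitres
PRIMER_PMOL = 10               # amount of each primer in pmol

INITIAL_DENATURATION_MIN = 3   # 95 C
CYCLE_STEPS_MIN = {            # each step lasts 1 min
    "denature 94C": 1,
    "anneal 55-65C": 1,
    "extend 72C": 1,
}
CYCLES = 30
FINAL_EXTENSION_MIN = 5        # 72 C


def primer_concentration_um(pmol, volume_ul):
    """pmol per microlitre is numerically equal to micromolar."""
    return pmol / volume_ul


def programmed_minutes():
    """Total hold time of the programme, excluding temperature ramps."""
    per_cycle = sum(CYCLE_STEPS_MIN.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_EXTENSION_MIN


print(primer_concentration_um(PRIMER_PMOL, REACTION_VOLUME_UL))  # 0.5 (micromolar)
print(programmed_minutes())                                      # 98 (minutes)
```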

Table 1 PAX1 PCR primer sequences

Mutation screening

Mutations were screened for using SSCP–heteroduplex analysis. PCR products were denatured in formamide dye (0.01 M EDTA, 98% formamide, trace xylene cyanol and bromophenol blue) and run on 8% polyacrylamide gels (acrylamide 49:1) at 4°C for 16 h at 350 V. The bands were visualised by silver staining. PCR products were cleaned for sequencing using Centricon 100 columns (Amicon), according to the manufacturer's instructions. Band shifts seen on SSCP–heteroduplex analysis were sequenced using a fluorescent cycle-sequencing kit (sequencing conditions were: initial denaturation for 1 min at 96°C; 30 cycles of 96°C for 30 s, 15 s at 50°C and 60°C for 4 min). Products were run on an ABI 373 automatic sequencer and analysed using ABI sequence analysis software version 3.3.

Results

Genomic organisation of PAX1

At the start of the project the only published PAX1 sequence was HuP48,27 which represented the paired box of PAX1. We identified the complete human PAX1 genomic sequence by searching the GenBank database with HuP48 and the mouse Pax1 cDNA sequence (Accession number NM008780). The human genomic clone RP5-1065O2 (GenBank Accession number AL035562) contained the whole PAX1 gene, allowing us to determine its genomic structure (Table 2) and predict the PAX1 cDNA sequence (Figure 1).

Table 2 Genomic organisation of PAX1
Figure 1

Predicted human PAX1 cDNA sequence. The putative stop codon (nucleotide bases 1364–1366) is underlined. The paired-box region is highlighted in bold type (nucleotide bases 254–673) and the conserved octapeptide sequence (nucleotide bases 839–862) is in bold type and italics. The polyA tail is shown in bold type and underlined. Amino-acid 71 (*A underlined) corresponds to the start codon in mouse Pax1 cDNA.

PAX1 appears to span a genomic region of approximately 10 kb, and comprises four exons, analogous to the mouse Pax1 gene. Analysis of the genomic sequence using exon prediction and ORF software (NCBI ORF finder) suggests that the PAX1 protein is composed of 440 amino-acid residues with a molecular weight of 45703.8 Da and a theoretical pI of 9.54. Comparisons between the predicted amino-acid sequence of the human PAX1, mouse Pax1 (Accession number AF285175) and the Drosophila pox meso paired-box region (Accession number X16992) showed an extra 70 amino-acid residues 5′ to the paired box in the human gene (see Figure 2). This finding is compatible with the sequence of HuP48, a human genomic clone taken to be derived from the PAX1 gene.

Figure 2

Comparison of human and mouse PAX1 amino-acid sequence and Drosophila pox meso paired-box sequence. The paired box is shown in bold type and the conserved octapeptide is underlined (ClustalW alignment).

To confirm the exon boundaries of the human PAX1 gene as well as the existence of the extra amino-acid residues predicted, we sequenced a number of IMAGE cDNA clones. The largest PAX1 cDNA clone (Accession number BF237649) started at nucleotide 71 of the predicted PAX1 cDNA sequence (Figure 2), but did not contain a methionine start codon upstream of the paired-box region, suggesting that it was not full length. The methionine codon at the amino-acid position corresponding to the start codon of the mouse gene was also absent from this cDNA clone 71 (a TCC sequence was present at nucleotide position 230–233 instead of the ATG in mouse Pax1), confirming that the sequence from genomic clones RP5-1065O2 and HuP48 was correct. Analysis of the genomic clone sequences suggests that the human PAX1 methionine start codon lies further upstream, at base 44 in the proposed PAX1 cDNA sequence.
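The relationship between cDNA nucleotide numbering and amino-acid numbering follows directly from the proposed start codon at base 44. The minimal sketch below assumes 1-based cDNA positions and an uninterrupted reading frame from base 44. The function names are illustrative and not part of any software used in the study. It checks that the coordinates given in the Figure 1 legend and for the sequence variants reported in the Results are mutually consistent.

```python
START_CODON_BASE = 44  # first base of the predicted ATG, as proposed in the text


def codon_of(cdna_pos, start=START_CODON_BASE):
    """Return (codon number, position within codon 1-3) for a 1-based cDNA base."""
    if cdna_pos < start:
        raise ValueError("position lies upstream of the start codon")
    offset = cdna_pos - start
    return offset // 3 + 1, offset % 3 + 1


def first_base_of_codon(codon, start=START_CODON_BASE):
    """Return the 1-based cDNA position of the first base of a codon."""
    return start + 3 * (codon - 1)


# Coordinates from the Figure 1 legend
print(codon_of(254))    # (71, 1): the paired box begins at codon 71
print(codon_of(1364))   # (441, 1): stop codon is codon 441, giving 440 residues

# Variant positions reported in the Results
for pos in (224, 890, 908, 976, 526):
    print(pos, codon_of(pos))
```

With these assumptions, bases 224, 890 and 908 fall on the first base of codons 61, 283 and 289, and bases 976 and 526 fall on the third base of codons 311 and 161, in agreement with the codon changes described for each variant.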

Mutation analysis

Intronic primers, based on the genomic PAX1 sequence, were designed to amplify the four exons of the gene. Three sets of primers were designed for exon 1, splitting it into three fragments of approximately 200 bp, which were more suitable for SSCP mutation screening. A total of 63 affected individuals were screened for mutations in the PAX1 gene. SSCP band shifts were detected in eight individuals. The sequence changes were defined and healthy controls screened, by SSCP and sequence analysis, to determine whether these sequence differences were population polymorphisms.

PAX1 mutations and clinical description of KFS patients

The clinical phenotypes of patients with the detected PAX1 sequence changes are described below.

Missense mutations – most likely to be pathogenic

Patient 1 (932240): This patient is a Caucasian female. X-rays demonstrated that she had only two cervical vertebrae with extensive fusion including the atlas and axis. She had six lumbar vertebrae and fusion of the first and second ribs anteriorly on the left as well as severe sensorineural deafness and developmental delay. Both parents were clinically normal.

A band shift was detected on heteroduplex analysis in patient 1. Sequencing showed a C>G base change at position 224 in the cDNA sequence resulting in a missense mutation (CCC>GCC; P61A) 38 bp upstream from the paired-box region. This mutation resides in the extra amino-acid sequence not present in published mouse cDNA sequences. This mutation was also detected in the mother of the proband, who does not show clinical signs of KFS, but not in the paternal sample. A total of 303 chromosomes tested negative for this mutation, suggesting it is not a common polymorphism.

Patient 2 (931369): This female Caucasian was diagnosed with KFS type II, with vertebral fusion of C4/5, at age 10 years. Her only clinical features were a low hairline and a slight head tilt to the left.

Exon 2 gave an SSCP shift, and sequencing showed a G>C base change at position 890 in the cDNA sequence, resulting in a missense mutation (GCC>CCC; A283P). This sequence change was not detected in 100 controls (200 chromosomes) tested, indicating that it is not a common polymorphism.

Patient 3: This patient was diagnosed overseas with KFS, and had the typical clinical phenotype.

A G>A sequence change at base 908 in the PAX1 cDNA sequence was detected in this patient, which results in a missense mutation in exon 2 (GGC>AGC; G289S). This change was not detected in 100 controls (200 chromosomes) screened, suggesting that it is not a common polymorphism.

Intronic changes

Patient 4 (921897): This Pakistani male was diagnosed at birth with KFS. He had a right Sprengel shoulder, mirror movements, small stature and torticollis. X-rays showed a gross complex anomaly of lower cervical and upper thoracic spine, multiple hemivertebrae and hypoplasia of the vertebral bodies with a short neck. Mother (921896) and maternal grandmother both had short necks but both had normal neck X-rays. The mother later had a son with spina bifida.

An SSCP shift was detected in the mother of patient 4, but was not present in either of her sons. Sequencing identified a G>A base change in intron 2, 10 bp from the donor splice junction of exon 2 (exon 2+12). This mutation was not detected in 60 individuals (120 chromosomes) tested.

Patient 5 (932917): This patient is a female of Caucasian/Algerian origin. She had fusion of several cervical vertebrae and was diagnosed at birth with KFS type 1. Other phenotypes included a heart murmur, VSD, scoliosis, Sprengel shoulder, mirror movements, short stature (third centile), posteriorly rotated ears, high palate, webbed neck and torticollis. Karyotype analysis did not detect any chromosome abnormalities.

The SSCP shift detected in patient 5 was an A>C base change in intron 3, 10 bp from the donor splice junction of exon 3 (exon 3+12). This mutation was not detected in 60 individuals (120 chromosomes) tested.

Silent mutation

Patient 6 (931423): This female Caucasian had a number of congenital abnormalities. Polyhydramnios was noted in pregnancy. She had an imperforate anus with fistula, oesophageal atresia, unilateral renal agenesis with vesicoureteric reflux in the remaining kidney and radial ray dysplasia. Subsequent investigation showed an absent uterus and vagina. She also had bilateral preauricular tags and was diagnosed with VATER association. Her development was normal but she had bilateral conductive hearing loss needing speech therapy. She also had mirror movements, a low hairline, torticollis, webbing of the neck and scoliosis. X-rays showed a block vertebra in her neck and multiple vertebral anomalies in the mid-thoracic region with at least six lumbar vertebrae and a fusion bar between the two lowermost vertebrae, which have ribs on them. Hemivertebrae were present from T4–T7 and the sacrum was hypoplastic. Her spinal abnormalities placed her in the type III KFS group.

The base change (C>G, at base 976) in patient 6 is a silent mutation in exon 2 (CCC>CCG; P311P), which creates a new BglI restriction site. Digestion of the exon 2 PCR product with BglI in individuals with the wild-type sequence gives three bands (170, 65 and 35 bp), whereas the heterozygote containing the P311P mutation gives an additional band of 135 bp. Only patient 6 gave the four-band mutant pattern (data not shown). This sequence change was not detected in 100 controls (200 chromosomes) screened in this way, indicating that it is not a common polymorphism.
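The band pattern described above can be reproduced with a simple fragment-size calculation. This is a minimal sketch, not the authors' analysis: the product length of 270 bp is inferred from the sum of the wild-type bands, and the order of the fragments and the position of the new site are hypothetical, chosen so that the new site splits the 170 bp fragment into 135 bp and 35 bp.

```python
def fragments(length, cut_sites):
    """Fragment sizes from cutting a linear product of `length` bp at the given positions."""
    edges = [0] + sorted(cut_sites) + [length]
    return [b - a for a, b in zip(edges, edges[1:])]


PRODUCT_LENGTH = 270         # 170 + 65 + 35, inferred from the wild-type bands
WILD_TYPE_CUTS = [170, 235]  # hypothetical fragment order: 170 | 65 | 35
NEW_SITE = 135               # hypothetical position inside the 170 bp fragment

wild_type = fragments(PRODUCT_LENGTH, WILD_TYPE_CUTS)
mutant = fragments(PRODUCT_LENGTH, WILD_TYPE_CUTS + [NEW_SITE])

# A heterozygote carries both alleles, so the gel shows every distinct size.
heterozygote_bands = sorted(set(wild_type) | set(mutant), reverse=True)

print(wild_type)            # [170, 65, 35]
print(mutant)               # [135, 35, 65, 35]
print(heterozygote_bands)   # [170, 135, 65, 35]
```

Under these assumptions the mutant allele yields two 35 bp fragments that co-migrate, so the heterozygote shows four distinct bands, with 135 bp as the only band absent from the wild-type pattern.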

Rare polymorphism

Patient 7 (932345): This patient is a male Caucasian with a cleft palate, a left radial nerve palsy and mild talipes of the right foot and a KF anomaly. His low hairline, webbed neck and good neck movement suggested he had KFS type II.

Patient 8 (931978): This male Afro-Caribbean had the clinical triad of KFS as well as a Sprengel shoulder, mirror movements and strabismus as a child.

A rare polymorphism was identified in the paired-box region in both patients 7 and 8. The sequence change (G>A) at base 526 of the PAX1 cDNA sequence is silent (AAG>AAA; K161K). Two of 303 control chromosomes tested also had this change, suggesting that it is not pathogenic and represents a rare polymorphism. Interestingly, this base change corresponds to the normal mouse Pax1 cDNA sequence.
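One way to put the control screens into quantitative terms is sketched below. It assumes that control chromosomes are independent draws from a single population (a simple binomial model, which is our illustrative assumption and not an analysis performed in the study). The function names are hypothetical.

```python
def observed_frequency(copies, chromosomes):
    """Allele frequency estimated from the number of copies seen in controls."""
    return copies / chromosomes


def upper_bound_if_unseen(chromosomes, confidence=0.95):
    """Largest allele frequency still compatible, at the given confidence,
    with observing zero copies among `chromosomes` independent chromosomes.

    Solves (1 - f) ** chromosomes = 1 - confidence for f.
    """
    return 1 - (1 - confidence) ** (1 / chromosomes)


# K161K: two copies among 303 control chromosomes
print(round(observed_frequency(2, 303), 4))

# Variants not seen in 120, 200 or 303 control chromosomes
for n in (120, 200, 303):
    print(n, round(upper_bound_if_unseen(n), 4))
```

Under this model, K161K has an estimated frequency of about 0.7% in controls, while a variant absent from 200 control chromosomes could still have a population frequency of up to roughly 1.5%. Absence from controls therefore argues against a common polymorphism but cannot by itself exclude a rare benign variant.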

Discussion

It is interesting that the predicted human PAX1 gene product has a significant number of additional amino acids upstream of the paired box compared to the mouse Pax1 and the Drosophila pox meso sequence. This is unusual as Pax family genes usually show high homology, particularly between human and mouse, and the paired-box domain is located very near the N-terminus in most Pax proteins. This deviation from the mouse sequence could reflect an additional or different function of the human PAX1 protein, which could be affected by mutations in this region. If that were true, it is not self-evident that mutations in this region would produce a human phenotype similar to the undulated mouse.15

Mutation screening of our patients with KFS identified two with a polymorphism within the paired box (analogous to the mouse normal sequence), and six patients with variants that were not seen in controls. Although we did not detect any nonsense, frameshift or evident splicing mutations, the missense mutations could potentially have a pathogenic role. Of the missense mutations, those in exon 2 (A283P, G289S) lie outside the paired box but affect amino acids that are conserved in the mouse sequence. The mutation P61A lies in the run of amino acids upstream of the paired box (not present in the mouse sequence), and was detected in both the proband (patient 1) and her clinically unaffected mother, but not in any control individuals tested. The lack of clinical signs in the mother could be explained by reduced penetrance of KFS, which has been reported in other families with KFS.28 Silent and intronic mutations have the potential to be pathogenic if they activate cryptic splice sites, leading to exon skipping or the inclusion of intronic sequence into the mature transcript. Pathogenesis by silent mutations has been described in other diseases, for example, limb girdle muscular dystrophy.29 Unfortunately, since PAX1 is expressed primarily during development and only in the thymus in adults, we are unlikely to access mRNA from these patients to investigate anomalous splicing.

We also cannot exclude the possibility that all these changes are rare nonpathogenic variants, which probably applies in the case of the mother of patient 4 (who may have a mild form of KFS, with short neck and deafness) with an intronic mutation not transmitted to her classically affected KFS son. Alternatively, the variable penetrance seen in some families could be explained by modifying effects of another unlinked locus or by digenic inheritance as in retinitis pigmentosa, where mutations in both ROM1 and peripherin/RDS are necessary to cause disease in some individuals.30 In support of this hypothesis, a missense mutation in a conserved region of the PAX1 paired box has been described in a foetus with a neural tube defect.31 The mutation Q42H (Q115H in our sequence) was inherited through an unaffected mother and grandmother. A phenotype reminiscent of extreme spina bifida occulta in humans is also seen in mice doubly mutant for undulated and Patch.32

Our study shows that mutations in the PAX1 coding sequence are not the cause of all cases of KFS, but this does not exclude PAX1 as a candidate gene for some cases. KFS is clearly a highly heterogeneous condition. All three diagnostic signs for the syndrome (failure of segmentation of two or more vertebrae, a short neck with limitation of head movement and a low posterior hairline) are present in fewer than 50% of patients. KFS has also been described as a manifestation of foetal alcohol syndrome,33 and a phenotype similar to undulated is provoked in mice by maternal treatment with the anticonvulsant valproic acid.34 The possibility remains that PAX1, alone or in conjunction with other genetic or environmental factors, plays a role in the pathogenesis of KFS. Future work will concentrate on identifying and screening more candidate genes implicated in the PAX1 pathway or in other aspects of skeletal development. Gross chromosomal abnormalities and association with other syndromes may provide vital clues as to the location of candidate gene(s) involved in the pathogenesis of KFS. In fact, a familial KFS gene locus on 8q has been identified (inv(8)(q22.2q23.3) – segregated with congenital vertebral fusion), which provides a good starting point for candidate gene approaches.35