Introduction

Osteoarthritis (OA) is the most common joint disease and the leading cause of disability among the elderly. It is a degenerative disease of the joints characterized by cartilage degeneration and subchondral bone remodelling. Clinically, OA manifests as pain, stiffness, disability and loss of joint function. Twin, sibling and segregation studies have highlighted substantial heritability for primary OA, ranging between 39 and 80% depending on sex and joint location.1, 2 OA often arises as a complex trait, but rare familial forms with autosomal dominant transmission have been reported.3, 4, 5 The phenotype of these rare OA families resembles common OA at later ages in the population except for the early age of onset (20–50 years) and the progressive course of the disease. In familial forms of osteochondrodysplasia, which often present with secondary early-onset OA, mutations in several structural genes of the extracellular cartilage matrix (ECM), including COL2A1, COL9A1 and COMP, have been identified.6 There is no convincing evidence yet that these genes are implicated in the susceptibility to primary early-onset OA occurring without dysplasia.4, 6, 7 Linkage and association studies on the basis of joint- or sex-specific OA definitions have yielded several loci associated with common OA susceptibility including MATN3,8, 9 FRZB,10, 11, 12 IL4R,13 ASPN14 and CALM1.15

Previously, we mapped a locus for early-onset osteoarthritis (familial OA, FOA) in seven families to 2q33.3–2q34.16 The high LOD score in the early-onset OA families and the absence of variants in the two most promising genes (FZD5 and PTHR2) prompted us to perform an extended systematic mutation analysis of candidate genes in the 2q33.3–2q34 linkage area. Additionally, we screened for mutations in the FRZB gene, which is localised at 2q32, slightly outside our linkage region. In the present study, we report results of mutation screening of the entire coding region, splice sites, and 5′ and 3′ untranslated regions (UTR) of 17 positional candidate genes and the FRZB gene in affected family members with early-onset generalised OA.

Materials and methods

Families

Families containing patients who express primary generalised OA in each generation were collected from different parts of the Netherlands. Informed consent was obtained from all patients and the Medical Ethics Committee of the Leiden University Medical Centre approved the study. Probands were identified through Rheumatology outpatient clinics. Family members were recruited via probands. Initially, we used questionnaires to select eligible families. For eligible families, complete medical history and available radiographs were obtained from Rheumatologists for almost all affected family members (81%). Radiographs were re-evaluated for signs of chondrodysplasia, spinal dysplasia and abnormal development or growth of the articular and growth-plate cartilages including epiphyses of the peripheral joints. As previously described in Meulenbelt et al.,5 these features were absent in all families. The presence of radiographic OA (ROA) was assessed according to Kellgren/Lawrence criteria17 by an experienced reader. Some individuals had marked Heberden's nodes and ankle OA. A selection of affected and unaffected individuals of families 1, 2, 4 and 7 was additionally visited for physical examination in order to prevent misclassification. The mean age of onset of OA in these patients was 33 years, ranging between 20 and 50 years. The phenotype within these families is characterised by distinct progressive OA in the absence of mild or severe chondrodysplasia, with symptoms and ROA at multiple joint sites simultaneously, including the hands (with noduli), knees, hips, ankles and spine. Individuals with clinical and radiographic evidence of OA in two or more joint sites before the age of 50 years were considered affected. An extensive description of the phenotype in family 1, which is representative of the phenotypes of the other extended families included, is given elsewhere.5 All clinical diagnostic decisions were made independently of the genetic linkage analysis, and homogeneity of the phenotype between different families was checked.

The Rotterdam sample

The Rotterdam study, which comprises 7983 Caucasian participants, is a prospective, population-based cohort study of the determinants and prognosis of chronic diseases in the elderly.18 The Medical Ethics Committee of the Erasmus University Medical Centre approved the study, and informed consent was obtained from all subjects. In a random sample of 809 unrelated subjects (the Rotterdam sample), aged 55–65 years, radiographs were scored for the presence of ROA in two knees, two hips,19 36 hand joints and three levels of the thoracolumbar spine.19, 20 All radiographs were scored according to the Kellgren/Lawrence grading system (grades 0–4)17 by two independent readers, blinded to all other data of the participant. Definite ROA at a particular joint site was defined as a Kellgren/Lawrence score of two or more.17 In the hands, 36 separate joints were scored comprising eight joint groups: distal interphalangeal joints, the interphalangeal joint of the thumb, the proximal interphalangeal joints, the metacarpophalangeal joints, the first carpometacarpal joints, the trapezoscaphoideal joints, the radionavicular joints and the distal radioulnar joints. By definition, ROA of the spine is confined to the apophyseal joints, but these joints could not be assessed on the lateral radiographs of the spine that were available. Instead, we assessed disc degeneration (DD) of the spine at three levels, that is, thoracic (Th4 to Th12), lumbar (L1–L4 or L5) and lumbosacral (L5–S1 or L5–L6).

We analysed and evaluated the occurrence and the generalised OA status of carriers of novel variants in the population-based sample as a qualitative trait using previously described definitions.11 In brief, subjects with two or more of the following four criteria were considered as affected with generalised ROA: hand ROA in three or more hand joint groups (the right and left hands were considered separately), spinal DD in two or more disc levels, knee ROA in one or two knees and hip ROA in one or two hips. Subjects affected with generalised ROA were compared to the complete Rotterdam sample from which subjects with generalised ROA were excluded. Phenotypic data for assessment of generalised ROA was available for 790 subjects.
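The case definition above amounts to a simple counting rule. The following is a minimal sketch in Python, not the code used in the study. The function and parameter names are hypothetical, and it assumes that the number of affected hand joint groups (each hand counted separately) and the number of disc levels with DD have already been derived from the radiograph scores.

```python
KL_DEFINITE = 2  # Kellgren/Lawrence score of two or more = definite ROA


def has_definite_roa(kl_grade: int) -> bool:
    """True when a joint's Kellgren/Lawrence grade (0-4) indicates definite ROA."""
    return kl_grade >= KL_DEFINITE


def is_generalised_roa(hand_groups_affected: int,
                       disc_levels_affected: int,
                       knee_kl: tuple[int, int],
                       hip_kl: tuple[int, int]) -> bool:
    """Apply the 'two or more of four criteria' rule for generalised ROA.

    hand_groups_affected: hand joint groups with ROA (assumed pre-computed)
    disc_levels_affected: spinal levels with disc degeneration (0-3)
    knee_kl, hip_kl:      Kellgren/Lawrence grades for the two knees / hips
    """
    criteria = [
        hand_groups_affected >= 3,                  # hand ROA in >= 3 joint groups
        disc_levels_affected >= 2,                  # spinal DD at >= 2 disc levels
        any(has_definite_roa(g) for g in knee_kl),  # knee ROA in one or two knees
        any(has_definite_roa(g) for g in hip_kl),   # hip ROA in one or two hips
    ]
    return sum(criteria) >= 2
```

For example, a hypothetical subject with three affected hand joint groups and one knee graded 2 meets two criteria and would be classified as affected.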

Mutation analysis strategy

As the FOA phenotype is likely a Mendelian trait, severe and with an early age of onset, we expected a mutation with a large effect, most likely located in a coding region. Moreover, cosegregating haplotypes among families that contributed to the linkage were not identical, indicating either genetic or allelic heterogeneity. Consequently, private mutations in each family may exist. Initially, three affected family members (individual 14 from family 1, individual 10 from family 2 and individual 9 from family 4) were screened for possible mutations by direct sequencing in both the forward and reverse directions (Figure 1). To investigate whether detected variants were novel or common SNPs, all variants observed in affected family members were checked against known SNPs using the National Center for Biotechnology Information (NCBI) SNP BLAST, build 124 (http://www.ncbi.nlm.nih.gov/SNP/snpblastByChr.html). If a novel variant was identified, unaffected family members of these families and family members of the remaining families were sequenced and analysed for cosegregation with OA. If the variant cosegregated in at least one family, it was genotyped in a random sample of 790 subjects scored for radiographic OA to determine the population frequency. The impact of a novel variant involving an amino-acid change was examined using PolyPhen (http://tux.embl-heidelberg.de/ramensky/index.shtml) and SIFT (http://blocks.fhcrc.org/sift/SIFT.html).21 To test the effect on the splicing process, exonic variants were screened for exonic splicing enhancer sequences using http://exon.cshl.edu/ESE.22 Conservation was determined using the Multiz Alignments and Conservation track of the UCSC genome browser version 140 (http://genome.ucsc.edu/). We applied the nomenclature of den Dunnen and Antonarakis23 to describe these variants.
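The triage described in this section can be summarised as a short decision procedure. The sketch below is illustrative only: the `Variant` class, its fields and the `next_steps` function are hypothetical names introduced here, and the step labels paraphrase the text rather than reproduce any software used in the study.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    name: str
    in_dbsnp: bool                       # already listed as a known SNP
    cosegregating_families: list[int] = field(default_factory=list)
    changes_amino_acid: bool = False
    exonic: bool = False


def next_steps(v: Variant) -> list[str]:
    """Return the follow-up steps for a variant after segregation analysis."""
    if v.in_dbsnp:
        return ["stop: known SNP"]
    if not v.cosegregating_families:
        return ["stop: no cosegregation with OA"]
    steps = ["genotype in population sample", "check conservation"]
    if v.changes_amino_acid:
        steps.append("PolyPhen/SIFT prediction")
    if v.exonic:
        steps.append("exonic splicing enhancer screen")
    return steps


# Example using a variant reported later in the text (IDH1 Y183C, family 2).
print(next_steps(Variant("IDH1 Y183C", in_dbsnp=False,
                         cosegregating_families=[2],
                         changes_amino_acid=True, exonic=True)))
```

Only variants that are both novel and cosegregating in at least one family reach the population genotyping and in silico steps, which is the filter that reduces the initial list of variants to the candidates evaluated in the Results.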

Figure 1

Family pedigrees segregating for generalised early-onset OA and chromosome 2q33.3–2q34 loci. Blackened circles and squares indicate affected female and male subjects, respectively. White individuals represent unaffected family members. Crosses indicate recombinations. Shaded individuals indicate individuals with unknown disease status. The black-lined box shows the most likely haplotype (A1), allowing one phenocopy (individual 9) in family 2.

Mutation screening

Genomic DNA was isolated from EDTA blood of affected and unaffected family members. Reference sequences corresponding to all coding and 5′ and 3′ UTR regions of the genes were obtained from the UCSC genome browser assembly May 2004 (http://genome.ucsc.edu/) or the Ensembl Genome database v35 (www.ensembl.org), NCBI build 35. Table 1 shows the Genbank numbers. To amplify exons, forward and reverse primer sets (primer sequences upon request) were designed with at least 25 bp flanking intronic sequences using Primer3 (http://www.broad.mit.edu/cgi-bin/primer/primer3_www.cgi) with the conditions described by Vieux et al.24 The 3′UTR of NRP2 and exons 1–27 of PIP5K3 were not sequenced because of genome browser updates at the time of the study. PCR amplifications were carried out in a volume of 15 μl that contained 15 ng genomic DNA, 4.1 pmol of the PCR primers, 1.5 mM MgCl2, 0.2 mM dNTPs and 0.6 U of rTaq polymerase (Amersham Biosciences), or 0.6 U of HotfirePol® DNA polymerase and solution S (Solis Biodyne) for GC-rich regions, or standard conditions of the GC-rich PCR system (Roche). Reactions were incubated at 94°C for 1 min (15 min for GC-rich regions), then cycled for 35 cycles of 94°C for 30 s, 57°C for 1 min 15 s and 72°C for 30 s, and finally incubated for 6 min at 72°C on B&L primus HT cyclers. PCR products were purified using Multiscreen 96-well plates (Millipore) filled with Sephadex (Amersham Biosciences) and quantified on 1.5% agarose gels. PCR products were sequenced for possible mutations using an ABI3730 capillary sequencer with Big Dye chemistry (Applied Biosystems).

Table 1 Known genes between the markers D2S72 and D2S2178 based on RefSeq, mRNA, TrEMBL and Swiss–Prot prediction using Ensembl genome database v35

Genotyping

Nine novel variants were genotyped in 790 random subjects from the population-based Rotterdam study.18 All variants were in Hardy–Weinberg equilibrium (HWE). Variants were genotyped using Sequenom homogenous Mass Extend MassARRAY System (Sequenom Inc., San Diego, CA, USA) using standard conditions. Genotypes were analysed using Genotyper version 3.0 software (Sequenom Inc.).

Statistical analysis

HWE was calculated with an exact HWE test for rare alleles implemented in R version 2.3.1 (http://www.r-project.org/). A logistic regression model was fitted to measure the strength of association, which is expressed as odds ratios (ORs) with 95% confidence intervals (95% CI) adjusted for age (years), body mass index (BMI in kg/m2) and sex. In these analyses, homo- and heterozygous carriers of the risk allele were pooled. Instead of adjusting P-values a priori (e.g. for multiple testing), exact P-values are provided so that the reader can interpret the level of significance. All analyses were performed with SPSS version 11 software (SPSS, Chicago, IL, USA).
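For readers unfamiliar with exact HWE testing, the idea can be sketched as follows. This is a Python restatement of the standard exact test conditional on the observed allele counts, not the R implementation used in the study (which is not named in the text). The function name and the genotype counts in the usage note are hypothetical.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Two-sided exact HWE p-value for genotype counts AA, AB, BB.

    Sums the probabilities of all heterozygote counts that are no more
    likely than the observed one, conditional on the allele counts.
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(het: int) -> float:
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (lgamma(n + 1) - lgamma(hom_a + 1) - lgamma(het + 1)
                - lgamma(hom_b + 1) + het * log(2.0)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    # Possible heterozygote counts share the parity of the allele counts.
    probs = {h: exp(log_prob(h)) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    p_obs = probs[n_ab]
    # Small tolerance so that ties with the observed probability are included.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))
```

With a rare allele, for instance hypothetical counts of 754 non-carriers, 9 heterozygotes and no homozygous carriers, the observed configuration is the most probable one and the test gives no indication of deviation from HWE. This is why an exact test is preferred over a chi-square approximation when expected homozygote counts are close to zero.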

Results

Linkage analysis

Results of the genome-wide scan and screening of PTHR2 and FZD5 have previously been reported.16 In brief, significant linkage was observed at 2q33.3–2q34 between markers D2S1384 and D2S2178, implicating a 5 cM interval (4.6 Mb).16 Recombinant haplotypes were constructed for all families that contributed positively to the linkage. In families 2 and 4, most affected individuals carry identical heterozygous haplotypes, resulting in two possible haplotypes explaining the linkage (Figure 1). In family 2, the most likely haplotype 3678 (A1) allows one phenocopy (individual 9) and the second likely haplotype 7658 (A2) allows two phenocopies (subjects 6 and 17). In family 4, the most likely haplotype 5515232 (A1) allows no phenocopies and the second likely haplotype 5144984 (A2) allows one phenocopy (individual 7). Only the recombinant haplotypes of family 1 and family 2 were used to determine the minimal restricted region. Cosegregating haplotypes differed among families, indicating either allelic or genetic heterogeneity and raising the possibility of private mutations in one or different genes for each family. To detect sequence changes, we initially screened three affected family members from the three families contributing most to the linkage: individual 14 in family 1, individual 10 in family 2 and individual 9 in family 4. When a novel variant was found, segregation analysis of the variant with OA within all family members of the seven families was performed.

Mutation screening of positional candidate genes

Following up these results, we performed extended mutation analysis of the chromosome 2 locus. Using the human genome resources, 18 known RefSeq genes and nine predicted RefSeq genes (four genes with model status and five genes with predicted status in NCBI build 35) were identified within the linkage area of chromosome 2, as shown in Table 1. Near the linkage region, at marker D2S72, lies the ICOS-CTLA4-CD28 cluster, which is implicated in cytokine secretion and T-cell immunity.25 FRZB, which is localised at 2q32, slightly outside our linkage region, is a very consistent locus associated with hip10, 12 and generalised OA.11 To exclude the possibility that a FRZB mutation influences the early-onset generalised OA in any of our families, we also included this gene. The entire coding region, splice sites, and 5′ and 3′ untranslated regions of FRZB, CTLA4, CD28, ICOS, NRP2, NM_017759, NDUFS1, EEF1B2, GPR1, XM_371590, ADAM23, MDH1B, CPO, KLF7, CREB1, NM_030804, FZD5, IDH1, PIP5K3 and PTHR2 were sequenced, comprising 17 RefSeq genes and three predicted genes. Results of the screening of PTHR2 and FZD5 have previously been reported.16 If the considered variant appeared novel in the current dbSNP database (build 124) and cosegregated with OA in at least one affected family, it was evaluated according to the criteria previously mentioned.

Evaluation of novel variants

The initial screening indicated 26 novel variants (17 SNPs and nine insertion/deletion polymorphisms). Of these 26 variants, only nine cosegregated with OA within one or more families, as illustrated in Table 2. Three of these variants were found in coding regions and involved an amino-acid change: XM_371590 R2133S, IDH1 Y183C and PTHR2 A225S. Six variants were located in UTR regions (PIP5K3 c.8429T>A, PIP5K3 c.8434insC and NRP2 c.941A>C) or in the vicinity of exon/intron boundaries (NRP2 c.1938-21T>C, ADAM23 c.2065+24C>T and IDH1 c.933-28C>T). To estimate the allele frequencies, these nine cosegregating variants were genotyped in a random population sample that had been screened for radiographic signs of OA in hand, hip, spine and knee.

Table 2 Possible mutations segregating in FOA families

Using PolyPhen, SIFT and ESE finder analysis to predict possible functional effects of these variants, two variants emerged as potential mutations: XM_371590 R2133S and IDH1 Y183C (Table 2). In addition, three variants (NRP2 c.941A>C, PIP5K3 c.8429T>A and PIP5K3 c.8434insC) were conserved across other species and might be of functional importance.

The predicted XM_371590 gene probably belongs, as predicted in Unigene, to the fibronectin type III and M protein repeat family in C. elegans. Fibronectin is a component of the ECM and XM_371590 may therefore be an excellent candidate gene. The G/T nucleotide change in the third exon of XM_371590 (Q9HCK1) disrupts exonic splicing enhancer motifs that serve as binding sites for serine/arginine proteins 40 and 55 and might, therefore, be a functional variant. However, because this gene is a predicted gene, little is known about other possible functional effects on the protein. The novel variant XM_371590 R2133S cosegregated with OA in families 4 and 7 and showed a low population frequency of 0.01, corresponding to nine carriers among 763 genotyped subjects (Table 3). In the Rotterdam sample, we did not observe a significant association of this variant with generalised ROA. It is unlikely that this variant is a causal mutation.

Table 3 Frequencies of novel segregating variants in population-based sample scored for generalised ROA (GOA)

Isocitrate dehydrogenase 1 (IDH1) encodes a cytoplasmic enzyme that catalyses the oxidative decarboxylation of isocitrate to 2-oxoglutarate and has a significant role in cytoplasmic NADPH production.26 In IDH1, two variants (Y183C and c.933-28C>T) cosegregated with the OA phenotype. IDH1 Y183C cosegregated in affected family members in family 2 (A2, Figure 1). This variant is located in exon 6, which encodes the isocitrate/isopropylmalate dehydrogenase domain (PF00180) of IDH1. It was predicted by SIFT and PolyPhen to be probably damaging to the protein structure/function and is highly conserved across all species investigated. Based on these results, this variant could be functionally relevant to the onset of generalised OA. In the Rotterdam sample, we observed 14 carriers out of 767 genotyped, corresponding to a frequency of 0.02. In addition, carriers of this variant had an OR for generalised ROA, adjusted for age, BMI and sex, of 2.8 (95% CI, 0.82–9.7, P=0.10), as shown in Table 3.
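The relation between a reported OR, its 95% CI and the P-value can be checked with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which is the usual output of logistic regression software but is not stated explicitly in the text. The function name is hypothetical, and the result is approximate because the published figures are rounded.

```python
from math import erfc, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_p_from_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided P-value implied by an OR and its 95% CI."""
    # The CI spans 2 * Z_95 standard errors on the log-odds scale.
    se = (log(ci_high) - log(ci_low)) / (2 * Z_95)
    z = abs(log(odds_ratio)) / se
    return erfc(z / sqrt(2))  # equals 2 * (1 - Phi(z))


# IDH1 Y183C: OR 2.8 (95% CI, 0.82-9.7) as reported above.
print(round(wald_p_from_ci(2.8, 0.82, 9.7), 2))
```

For the IDH1 Y183C estimate this gives a value of about 0.10, in line with the reported P-value, and illustrates that an interval including 1 corresponds to a non-significant association at the 5% level.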

Another variant in this gene, IDH1 c.933-28C>T, was identified in families 2 and 4, near the intron/exon boundary of exon 7. This variant was not conserved across other species, and was not associated with generalised ROA in the Rotterdam sample (frequency 0.04).

Neuropilin 2 (NRP2) is an interesting gene because it encodes the co-receptor of vascular endothelial growth factor 165 (VEGF165), which is an essential factor for endochondral ossification.27 Furthermore, VEGF and its receptors are expressed in OA cartilage and VEGF stimulates production of ECM-degrading matrix metalloproteinases (MMPs).28, 29 In the NRP2 gene, two novel variants were found: c.941A>C and c.1938-21T>C. In family 4, NRP2 c.941A>C was identified in a residue with a low conservation score. This variant showed a frequency of 0.03 in the random population and no significant association with generalised ROA was observed. The second NRP2 variant, c.1938-21T>C, was not conserved, cosegregated in three families (1, 2 and 4) and was more frequent in the population (0.07). Carriers of at least one risk allele of the NRP2 c.1938-21T>C variant had an increased risk of generalised ROA (OR 2.1; 95% CI, 1.1–4.1; P=0.032), adjusted for age, sex and BMI (Table 3).

Phosphatidylinositol-3-phosphate/phosphatidylinositol 5-kinase, type III catalyses the phosphorylation of phosphatidylinositol-4-phosphate and has a role in endosome-related membrane trafficking.30 We found two novel variants (PIP5K3 c.8429T>A and PIP5K3 c.8434insC) in the 3′UTR region of PIP5K3 in family 4. PIP5K3 c.8429T>A involved a highly conserved residue. PIP5K3 c.8434insC was not conserved. In the Rotterdam sample, we observed that PIP5K3 c.8429T>A and PIP5K3 c.8434insC showed a population frequency of 0.04 and were in complete LD (D′=1, r2=1). In the Rotterdam sample, no significant associations of these variants with generalised ROA were observed excluding a possible pathogenic role in relation to the onset of FOA. Even though PIP5K3 c.8429T>A occurred in a conserved residue, it is likely that these variants are neutral polymorphisms.
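The LD measures quoted above follow from the standard definitions for two biallelic loci. The sketch below is a generic illustration, not the software used in the study. The function name is hypothetical and the allele frequency in the usage note is an assumed value chosen for illustration, since the text does not state whether the reported frequency of 0.04 refers to alleles or carriers.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of alleles A and B, each strictly between 0 and 1
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2
```

When two rare alleles always occur on the same haplotype, for example `ld_stats(0.02, 0.02, 0.02)`, both D′ and r2 equal 1 up to floating-point rounding. This is the pattern described for the two PIP5K3 variants: each variant then carries the same information in an association test.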

Finally, we also examined whether some novel variants were inherited together in different families to identify a possible LD pattern or genetic interaction resulting in a high LOD score linked to OA. As shown in Table 2, only two variants, NRP2 c.1938-21T>C and IDH1 c.933-28C>T, occurred together on haplotype A1 in family 2 and on haplotype A2 in family 4. In the random population, this inheritance pattern was observed only once in 754 genotyped subjects (0.0013). This individual had spinal DD at three disc levels, which has a prevalence of 0.04 in the random population.

Discussion

Upon screening 20 genes localised within or near the area of linkage in three families that contributed most to the linkage, we identified nine novel variants cosegregating with OA. We evaluated the significance of the novel variants by prediction of pathogenicity using in silico functional analysis and by establishing the frequency in the random population. The IDH1 Y183C variant, which cosegregated on haplotype A2 with the OA phenotype in family 2, was predicted to be probably damaging to the protein structure/function and concerned a highly conserved residue. Among carriers of this variant in the general population (frequency 0.02), the OR for generalised ROA was 2.8 (95% CI, 0.82–9.7, P=0.10). Given these results and the likelihood of allelic heterogeneity among these families, in combination with the significant LOD score in family 2 alone, this variant may contribute to the FOA susceptibility in family 2. IDH1 supplies NADPH for antioxidant systems, suggesting a regulatory role in cellular defense against oxidative stress and in senescence.31 Little is known about a possible role of IDH1 in cartilage, but we speculate that increased oxidative stress could make chondrocytes more susceptible to cell death, which might contribute to the onset of OA.

A second variant, NRP2 c.1938-21T>C, emerged from our mutation analysis; it cosegregated in the three families (1, 2 and 4) contributing most to the linkage. Carriers of this variant had an increased risk of generalised ROA (OR 2.1; 95% CI, 1.1–4.1; P=0.032).

NRP2 acts as a co-receptor of VEGF165, which is produced from hypertrophic chondrocytes and is also expressed in OA cartilage.28, 32 VEGF is an essential coordinator of growth plate morphogenesis and triggers cartilage remodelling.27 VEGF may contribute to OA cartilage destruction through stimulation of MMPs.28, 29, 33

Given the high frequency (0.02 and 0.07) and the low effect sizes for generalised ROA in the random population, both variants are unlikely to represent a causative, highly penetrant mutation. Mendelian traits are often oligogenic and/or have modifier genes explaining phenotypic variability.34 Although our linkage signal may not be explained by these variants, a possible modulating role for these variants or genetic interaction with other causal variants at this or other loci cannot be excluded.

NRP2 c.1938-21T>C occurred together with IDH1 c.933-28C>T on haplotype A1 in family 2 and on haplotype A2 in family 4. This haplotype is extremely rare in the random population (0.0013). At this point, we conclude that a causal variant in LD with both variants may contribute to OA in families 2 and 4. The pathogenic potential of these variants and the role of this haplotype in the OA phenotype should be confirmed in other populations with advanced OA or supported by functional assays.

The reduced penetrance among carriers in the Rotterdam study may be explained by the low frequency of carriers, the absence of a clinical OA assessment and the relatively high frequency of generalised ROA above 55 years (0.17), which can result in reduced power of the statistical test, misclassification and spurious associations. We also analysed the presence of ROA among carriers in the random population as a quantitative trait, using a sum score of the number of affected joints,16 which revealed similar findings (data not shown).

There is a possibility that the family members could be affected with generalised ROA by chance alone although the phenotype is more severe (clinical OA before age 50 years). In family 2 and family 4, we observed two possible haplotypes segregating with the disease allowing one or two phenocopies, suggesting that these individuals could be sporadic patients. Both haplotypes (A1 and A2) confined the linkage area. Consistent with an age-related disorder, we were not able to perform further segregation analysis because first-degree relatives died or are too young to reveal symptoms of the disease.

We prioritised genes for sequencing based on the function known from the literature and human genome resources. Unselected genes could therefore carry an OA-causing allele and will be screened for mutations in the future. In addition, we selected the families that contribute most to the linkage for the initial mutation survey. Consequently, the remaining four families could harbour causal variants in one of the screened genes. Alternatively, affected family members may have a noncoding regulatory mutation in a promoter or intron, or a heterozygous deletion of one or more exons in one of the screened genes, that has not been detected in our study.

In addition, predictions on the basis of computational algorithms such as PolyPhen, SIFT and ESE finder are difficult to interpret. Furthermore, even variants that are associated with disease may be in linkage disequilibrium with the true causal variant or may be rare polymorphisms.

Although we were able to identify nine novel variants cosegregating with the FOA phenotype by our extended mutation analysis, we found no robust evidence for a major disease gene responsible for the observed linkage to the FOA phenotype. Our results, however, might indicate a possible modulating role for variants in or near NRP2 and IDH1. Further mutation analysis of the linkage area on chromosome 2q33.3–2q34 and confirmation of the most promising variants in other populations with advanced OA are needed.