Dawn of the GWAS era of OA genetics

We are in the Bone and Joint Decade (http://www.boneandjointdecade.org/). Bone and joint diseases are serious problems worldwide. Among these diseases, osteoarthritis (OA) is the greatest threat to our present and future society. OA is the most common form of human arthritis; far more than 50 000 000 patients all over the world suffer from this intractable disease. OA is characterized by progressive cartilage loss, with or without osteophyte (aberrant bone) formation and subchondral bone sclerosis.1 OA causes pain and loss of joint function, which results in disability and a reduction in quality of life; however, there is no fundamental treatment.

OA is a polygenic disease; susceptibility to OA is influenced by genetic and environmental factors. Twin studies show that the heritability of radiographic OA of the hand, hip and knee joints in women is estimated at between 39 and 65%, independent of known environmental or demographic confounding factors.2, 3, 4 The genetic component is determined by susceptibility genes, the identification and characterization of which would lead to a better fundamental understanding of OA. Recent advances in genomic research have enabled us to investigate OA susceptibility genes.5 Several groups, including ours, have reported the identification of OA susceptibility genes, mainly using candidate gene association studies (Table 1). However, the candidate gene approach depends on a priori knowledge of the target. In addition, only a small region of the human genome has been examined; thus, many important genes have probably slipped through the coarse net of this search method.

Table 1 Main previous reports of osteoarthritis susceptibility genes

As a solution to this problem, genome-wide association studies (GWAS), in particular those that use single nucleotide polymorphisms (SNPs), have recently become very popular.6 After the first successful study, on myocardial infarction by Ozaki et al.,7 and with the resources of the International HapMap Project,8, 9 GWAS has come into use worldwide, resulting in many published papers. For example, Nature Genetics published 47 papers in the last 3 months (August–October, 2009), and 24 of these papers were GWAS-based. The breakthrough selected by the journal Science in 2007 as the No. 1 scientific event of the year was related to GWAS.10 This technological movement is now coming to the field of OA genetics.

Here, we review recent progress in the study of susceptibility genes for OA, with particular emphasis on GWAS and large-scale replication studies.

GWAS of OA

Several GWAS have been reported in the study of OA. Through a GWAS using ∼100 000 SNPs selected from a population-specific SNP database (JSNP, http://snp.ims.u-tokyo.ac.jp/index.html), Mototani et al.11 found an association of CALM1, which encodes calmodulin, a Ca²⁺-binding protein and the principal mediator of calcium signalling, with hip OA in a Japanese population. However, replication studies in European Caucasian and Chinese populations found no association for the associated SNP (rs12885713).12, 13, 14, 15

Using the same system, Miyamoto et al.16 identified an association of DVWA (double von Willebrand factor type A domain) with knee OA. DVWA was a previously unknown gene, not in the gene database. This genetic association was identified in two independent Japanese case–control cohorts and has been replicated in a Japanese population cohort and a Han Chinese case–control cohort (combined P=7.3 × 10⁻¹¹). The studies showed that there were two highly associated missense SNPs (rs11718863: N169Y and rs7639618: Y260C) in the VWA domain. Interestingly, one of the disease genes of multiple epiphyseal dysplasia, a skeletal dysplasia characterized by early-onset OA, is the gene for matrilin 3 (MATN3). Matrilin 3 has a VWA domain, and MATN3 mutations that cause multiple epiphyseal dysplasia are clustered in this domain.17, 18 The DVWA protein binds to β-tubulin, and the binding is influenced by the missense SNPs. The isoform produced from the allele overrepresented in knee OA (Y169-C260) showed a weaker interaction with β-tubulin than those produced from other alleles. The DVWA protein also binds to α-tubulin and cytosolic proteins (S Ikegawa, unpublished data). This is a good example of the power and potential of GWAS, as a new gene and a new pathway associated with a disease were identified. The roles of DVWA and tubulin in the pathogenesis of OA should be examined further.

Valdes et al.19 examined knee OA susceptibility genes in women using pooled DNAs from 357 cases and 285 controls from UK Caucasians and 410 000 SNPs in the Illumina550 Duo array (Illumina, San Diego, CA, USA). They checked replication of the top 28 SNPs in UK, USA and Dutch Caucasian cohorts (a total of 1177 cases and 2355 controls). They did not find any association that reached genome-wide significance. The most associated marker was rs4140564 on chromosome 1, which is situated between the PTGS2 and PLA2G4A genes (odds ratio (OR)=1.55, 95% confidence interval (CI) of 1.30–1.85, P=6.9 × 10⁻⁷). The SNP lies in the intergenic region, >50 kb away from the 5′ ends of both genes. Further investigation of the most associated SNP in the linkage disequilibrium block defined by the screening (∼200-kb region), followed by functional studies of the causal variant, is necessary, as is screening in males and other ethnic groups.
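To illustrate why a P-value of this size can still fall short of genome-wide significance, the following minimal sketch applies a Bonferroni correction to the number of SNPs screened. The family-wise error rate of 0.05 and the choice of Bonferroni correction are assumptions made here for illustration; the criterion actually used by Valdes et al. may differ.

```python
# Sketch of a Bonferroni-style genome-wide threshold.
# Assumption: family-wise error rate of 0.05, corrected for every SNP screened.
FAMILY_WISE_ALPHA = 0.05   # assumed, not taken from the study
N_SNPS = 410_000           # number of SNPs screened, as stated in the text


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


threshold = bonferroni_threshold(FAMILY_WISE_ALPHA, N_SNPS)
top_snp_p = 6.9e-7         # reported P-value for rs4140564

print(f"per-SNP threshold: {threshold:.2e}")
print("reaches genome-wide significance:", top_snp_p < threshold)
```

Under these assumptions the per-SNP threshold is roughly an order of magnitude stricter than the conventional 0.05 divided by a few thousand tests, which is why the top marker is reported as suggestive rather than significant.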

Zhai et al.20 examined hand OA susceptibility genes by a two-stage GWAS approach using a total of 3266 Caucasians from various regions. They identified an association of the SNP rs716508 in intron 1 of the A2BP1 gene, but the association fell far short of the genome-wide significance level (meta-analysis P=1.81 × 10⁻⁵). Replication studies are necessary to verify this association.

Problems with GWAS

The current method of GWAS is far from perfect, and it would be more aptly named a genome-wide association screen or scan. Current GWAS screens the genome with surrogate markers that cover ∼85–96% of the genome (depending on the platform and ethnicity), if all the genotyping experiments are successful. However, in most cases, the number of SNPs that are successfully genotyped and actually subjected to association analysis is only 60–85% of the total number of SNPs in the screen. Thus, even if the screen potentially covered 95% of the genome (this is near the maximum with the current collection of SNPs) and 80% of SNPs were successfully genotyped, the actual coverage of the GWAS would be ∼76% (0.95 × 0.80=0.76).
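The coverage arithmetic above can be written out as a short sketch. It assumes, as the paragraph does, that effective coverage is simply the product of the potential coverage and the genotyping success rate; the function name is illustrative only.

```python
def effective_coverage(potential_coverage: float, genotyping_success: float) -> float:
    """Effective genome coverage, assuming coverage is lost in proportion
    to the fraction of SNPs that fail genotyping (a simplification)."""
    return potential_coverage * genotyping_success


# Example from the text: 95% potential coverage, 80% of SNPs genotyped.
example = effective_coverage(0.95, 0.80)

# Extremes of the ranges quoted in the text (85-96% coverage, 60-85% success).
worst_case = effective_coverage(0.85, 0.60)
best_case = effective_coverage(0.96, 0.85)

print(f"example:    {example:.2f}")
print(f"worst case: {worst_case:.2f}")
print(f"best case:  {best_case:.2f}")
```

Across the quoted ranges, this simple product puts effective coverage anywhere between about half and about four-fifths of the genome.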

As already seen in studies of other lifestyle-associated diseases, including diabetes mellitus and coronary heart disease, the use of combined analyses of several GWAS will increase.21 Although very significant P-values are obtained using such combined analyses, most of these methods simply identify markers (indirect association). We should go further than just genotyping the commercial set of SNPs (in most cases, genotyping has been outsourced to companies). Starting from these ‘marker’ SNPs, we must identify the true (functional) sequence variations and their functionalities, leading to clarification of the pathogenesis of the diseases and, hopefully, to the development of new treatments.

Replication studies

Association studies are powerful tools for identifying susceptibility genes. However, the most significant problem associated with large-scale association studies, including GWAS, is the occurrence of ‘false positives’ due to several factors, including multiple testing and population stratification. The most practical measure to check and eliminate false positives is to perform a replication study using different populations, in particular, different ethnic groups. Associations that are replicated in different racial and ethnic backgrounds have high reliability, although a lack of such replication may be due to ethnic specificity of the susceptibility genes. Until recently, only one gene, ASPN, which encodes the cartilage-specific extracellular matrix protein asporin,22, 23 had been replicated in different ethnic groups. The asporin D14 allele has been associated with OA in Japanese, Chinese and European Caucasian populations.24, 25, 26 Interestingly, the asporin D14 allele is also associated with lumbar disc degeneration in East Asians.27

Several replication studies of previously reported ‘well-associated’ genes using more than a few thousand subjects have been reported, mainly from Western Europe. The most common problems of these studies are the considerable diversity between populations (even among Europeans and individuals of the same nationality) and the inconsistent inclusion criteria used in the individual studies. Often, non-biological criteria, such as whether subjects had total joint replacement surgery (which is a surgeon's decision), are used without clinical and/or radiographic data to support the pathology.

GDF5

The association of GDF5 with symptomatic OA in hips and knees was first reported in Japanese and Chinese populations.28 The most associated SNP (rs143383) is a functional variant, and the susceptibility allele shows decreased GDF5 transcription in vitro and in vivo.28, 29, 30 The same allele was reported to be associated with adult height in Caucasians through a GWAS using subjects with non-insulin-dependent diabetes mellitus.31 Valdes et al.32 assessed rs143383 for OA in UK Caucasians using 999 knee OA, 843 hip OA and 1166 control samples, and they replicated the association of GDF5 in knee OA (OR=1.29, 95% CI of 1.14–1.47; P=8 × 10⁻⁵).

Chapman et al. performed a meta-analysis of the association between rs143383 and OA using combined data for more than 11 000 individuals from European and Asian populations. They found strong evidence of an association of the GDF5 SNP with knee OA in both Europeans and Asians. The combined association for both ethnic groups was highly significant for the dominant model (P<0.0001, OR=1.48). These findings represent the first highly significant evidence for an OA susceptibility gene that affects diverse ethnic groups.33

Vaes et al.34 studied the association between rs143383 and radiographic OA in hands, knees and hips, together with height, bone size parameters and fracture risk, in a population-based cohort that consisted of a total of 6365 Caucasians. The authors found an association of GDF5 with hand and knee OA, height, bone size and fracture risk in women; however, no associations with hip OA or bone mineral density (BMD) were detected, and no associations were observed in men. Female homozygotes for the rs143383 C-allele had a 37% lower risk for hand OA (P=8 × 10⁻⁶), a 28% lower risk for knee OA (P=0.003), a 29% increased risk of incident non-vertebral fractures (P=0.02) and were 1.1 cm taller (P=0.001). The difference between the sexes is difficult to explain.

In a large-scale meta-analysis using 3500–5800 cases and 5200–10 800 controls, collected by 14 research groups in Western Europe and East Asia, Evangelou et al.35 examined the association of OA with three SNPs in previously reported genes: rs143383 in GDF5, together with rs7775 and rs288326 in FRZB. A significant association of rs143383 with knee OA, without significant between-study heterogeneity, was identified (P=9.4 × 10⁻⁷). The random-effects summary OR was 1.15 (95% CI, 1.09–1.22). Estimates of the effect sizes for hip OA were similar to those for knee OA; however, large between-study heterogeneity was observed and the statistical significance was marginal (P=0.016). They concluded that the association between the GDF5 rs143383 polymorphism and OA is strong, and that the genetic effect for knee OA is consistent across different populations. Although the effect size is modest, the association of GDF5 is solid. Replication studies in Africans and other ethnic groups are required to address whether GDF5 is really a ‘global’ OA gene.
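For readers unfamiliar with how a random-effects summary OR and a between-study heterogeneity statistic are obtained, the following is a minimal sketch of DerSimonian-Laird pooling. The per-study odds ratios and confidence intervals are hypothetical values invented for illustration; they are not the data of Evangelou et al., and the original analysis may have used a different estimator.

```python
import math
from statistics import NormalDist

# Hypothetical per-study results: (odds ratio, lower 95% CI, upper 95% CI).
# Invented for illustration; NOT the data analysed by Evangelou et al.
STUDIES = [
    (1.20, 1.02, 1.41),
    (1.10, 0.95, 1.27),
    (1.18, 1.01, 1.38),
    (1.12, 0.98, 1.28),
]

Z95 = NormalDist().inv_cdf(0.975)  # critical value for a 95% CI, about 1.96


def random_effects_summary(studies):
    """DerSimonian-Laird random-effects pooling of log odds ratios."""
    log_or = [math.log(odds_ratio) for odds_ratio, _, _ in studies]
    # Standard error recovered from the CI width on the log scale.
    se = [(math.log(hi) - math.log(lo)) / (2 * Z95) for _, lo, hi in studies]

    w = [1 / s ** 2 for s in se]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * y for wi, y in zip(w, log_or)) / sum(w)

    # Cochran's Q measures between-study heterogeneity.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_or))
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero

    w_re = [1 / (s ** 2 + tau2) for s in se]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_re, log_or)) / sum(w_re)
    pooled_se = math.sqrt(1 / sum(w_re))
    p_value = 2 * (1 - NormalDist().cdf(abs(pooled / pooled_se)))

    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - Z95 * pooled_se),
               math.exp(pooled + Z95 * pooled_se)),
        "p": p_value,
        "q": q,
        "tau2": tau2,
    }


summary = random_effects_summary(STUDIES)
print(summary)
```

When the studies agree closely, the estimated between-study variance is truncated to zero and the random-effects summary coincides with the fixed-effect one; large heterogeneity, as reported for hip OA, widens the confidence interval.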

On the other hand, analyses of FRZB polymorphisms and haplotypes did not reveal any statistically significant association.35 The associations reported in the original paper were marginal.36 When multiple testing is considered, replication studies have found no association, even in European Caucasians.13, 37, 38 We also did not find an association of the FRZB variants in Japanese and Han Chinese populations (D Shi et al., unpublished data). Taken together with the lack of functional characterization of the FRZB polymorphisms and of evidence of causality in OA, the previously reported FRZB association36 seems to be just another example of a false-positive association.

DVWA

Valdes et al.32 assessed the DVWA variants rs11718863 and rs7639618 for OA susceptibility in UK Caucasians. rs7639618 was not associated with susceptibility to knee OA, but the study did not have sufficient power to yield a reliable conclusion. A meta-analysis of the Asian and UK knee OA data indicated highly significant heterogeneity.39

Meulenbelt et al.39 examined the association of three DVWA SNPs (rs7639618, rs11718863 and rs9864422) by genotyping 1120 knee OA cases, 1482 hip OA cases and 2147 controls, all of white European descent from the Netherlands, UK, Spain and Greece. The authors also assessed a more global effect by meta-analysis, including the original Japanese and Chinese data together with the Western European data. The meta-analyses provided evidence for a global association of rs7639618 with knee OA, with an OR of 1.29 (95% CI of 1.15–1.45) and a P-value of 2.70 × 10⁻⁵. This effect, however, showed moderate heterogeneity, and the association of rs7639618 with knee OA in Europeans was marginal, with an OR of 1.16, 95% CI of 0.99–1.35 and a P-value of 0.063. In addition, no association was observed with hip OA in Europeans. The lower effect size, in combination with the higher risk allele frequency in the European samples, again highlights the ethnic differences in OA susceptibility genes. Larger scale studies are necessary to confirm the association of DVWA rs7639618 with knee OA.
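The reported odds ratios, confidence intervals and P-values can be cross-checked against each other with a simple normal approximation on the log-odds scale. The sketch below is such a consistency check, not the test used by Meulenbelt et al.; because the published figures are rounded and the original analysis may have used a different statistic, only approximate agreement should be expected.

```python
import math
from statistics import NormalDist


def p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                 level: float = 0.95) -> float:
    """Approximate two-sided P-value implied by an OR and its confidence
    interval, assuming the log odds ratio is normally distributed."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Half the CI width on the log scale equals z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Values quoted in the text for rs7639618 and knee OA.
global_p = p_from_or_ci(1.29, 1.15, 1.45)    # reported P = 2.70e-5
european_p = p_from_or_ci(1.16, 0.99, 1.35)  # reported P = 0.063

print(f"global meta-analysis: P ~ {global_p:.1e}")
print(f"Europeans only:       P ~ {european_p:.3f}")
```

A confidence interval that includes 1, as in the European-only analysis, necessarily corresponds to a P-value above 0.05 under this approximation.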

Future of association studies

The OA-associated variants identified so far by GWAS and replication studies cannot explain the entire heritability of OA. For example, even for knee OA in the Japanese population, in which three significant genes with functional proof, namely, ASPN,24 GDF5 (see Miyamoto et al. 28) and DVWA,16 have been identified, the population attributable risk of the OA-associated variants of the three genes is estimated to be 31% (H Takahashi et al., unpublished data). Further genetic association studies of OA susceptibility will be necessary to capture the complete picture of the genetic basis of OA.
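As an illustration of how a combined population attributable risk can be derived from several variants, the sketch below uses Levin's formula for each variant and combines them under the assumption that the variants act independently. The exposure frequencies and relative risks are hypothetical placeholders, not the values underlying the 31% estimate, and the unpublished analysis may have used a different method.

```python
def levin_par(exposure_freq: float, relative_risk: float) -> float:
    """Levin's population attributable risk for one risk factor.
    exposure_freq: fraction of the population carrying the risk genotype.
    relative_risk: risk in carriers relative to non-carriers
                   (an odds ratio is often used as an approximation)."""
    excess = exposure_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def combined_par(pars) -> float:
    """Combined attributable risk, assuming the factors act independently."""
    remaining = 1.0
    for par in pars:
        remaining *= 1.0 - par
    return 1.0 - remaining


# Hypothetical (exposure frequency, relative risk) pairs, for illustration only.
HYPOTHETICAL_VARIANTS = {
    "variant_A": (0.10, 1.5),
    "variant_B": (0.70, 1.3),
    "variant_C": (0.60, 1.4),
}

individual = {name: levin_par(freq, rr)
              for name, (freq, rr) in HYPOTHETICAL_VARIANTS.items()}
total = combined_par(individual.values())

for name, par in individual.items():
    print(f"{name}: PAR = {par:.3f}")
print(f"combined PAR = {total:.3f}")
```

The combined value is always smaller than the sum of the individual values, because cases attributable to more than one variant are not counted twice.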

The problem with replication studies is the occurrence of ‘false negatives’ due to small effect sizes and insufficient sample sizes. In future, larger scale replication studies using more diverse ethnic populations need to be performed. The power of association studies depends on the sample size. For both GWAS and replication studies, future work will consist of combined analyses of several studies using more than 10 000 samples. However, the identification of associations, in particular in GWAS, is only the first step toward the identification of causal variants and the molecular pathogenesis of diseases. Development of an efficient system to identify the real causal variant in the linkage disequilibrium block indicated by marker SNPs, and to verify its functionality, is also needed.
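The dependence of power on sample size can be made concrete with a rough calculation for an allelic case–control test, using a normal approximation for the difference in allele frequencies. All inputs below are assumptions chosen for illustration: a risk allele frequency of 0.30 in controls, an odds ratio of 1.15, and the conventional genome-wide significance level of 5 × 10⁻⁸.

```python
import math
from statistics import NormalDist


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Risk allele frequency in cases implied by the control frequency
    and the per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases: int, n_controls: int, control_freq: float,
                       odds_ratio: float, alpha: float) -> float:
    """Approximate power of a two-sided allelic test (normal approximation).
    Each subject contributes two alleles; the negligible probability of
    rejecting in the wrong direction is ignored."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    se = math.sqrt(p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(p1 - p0) / se
    return NormalDist().cdf(z_effect - z_crit)


ALPHA = 5e-8          # assumed genome-wide significance level
CONTROL_FREQ = 0.30   # assumed risk allele frequency in controls
ODDS_RATIO = 1.15     # assumed modest effect size

small = allelic_test_power(1_000, 1_000, CONTROL_FREQ, ODDS_RATIO, ALPHA)
large = allelic_test_power(10_000, 10_000, CONTROL_FREQ, ODDS_RATIO, ALPHA)

print(f"1 000 cases / 1 000 controls:   power ~ {small:.4f}")
print(f"10 000 cases / 10 000 controls: power ~ {large:.2f}")
```

Under these assumptions, a study with about a thousand cases has almost no chance of detecting a modest effect at genome-wide significance, whereas a combined analysis of the size discussed above detects it most of the time.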