Genome-wide association study identifies COL2A1 locus involved in the hand development failure of Kashin-Beck disease

Kashin-Beck disease (KBD) is a chronic osteochondropathy. The pathogenesis of growth and development failure of hand of KBD remains elusive now. In this study, we conducted a two-stage genome-wide association study (GWAS) of palmar length-width ratio (LWR) of KBD, totally including 493 study subjects. Affymetrix Genome Wide Human SNP Array 6.0 was applied for genome-wide SNP genotyping of 90 KBD patients. Association analysis was conducted by PLINK. Imputation analysis was performed by IMPUTE against the reference panel of the 1000 genome project. Two SNPs were selected for replication in an independent validation sample of 403 KBD patients. In the discovery GWAS, significant association was observed between palmar LWR and rs2071358 of COL2A1 gene (P value = 4.68 × 10−8). In addition, GWAS detected suggestive association signal at rs4760608 of COL2A1 gene (P value = 1.76 × 10−4). Imputation analysis of COL2A1 further identified 2 SNPs with association evidence for palmar LWR. Replication study observed significant association signals at both rs2071358 (P value = 0.017) and rs4760608 (P value = 0.002) of COL2A1 gene. Based on previous and our study results, we suggest that COL2A1 was a likely susceptibility gene involved in the hand development failure of KBD.

GWAS and imputation analysis. All GWAS subjects were clustered together as one homogeneous sample in the STRUCTURE analysis. EIGENSTRAT analysis calculated a genomic control inflation factor λ = 1.07. After quality control filtering, a total of 532,894 SNP were used for GWAS of palmar LWR of KBD (Fig. 1). Across the whole genome, one significant association signal was observed at the rs2071358 of COL2A1 gene (P value = 4.68 × 10 −8 ). In addition, GWAS detected suggestive association signal at the rs4760608 of COL2A1 gene (P value = 1.76 × 10 −4 ). Further imputation analysis identified 2 SNPs, including rs3782915 (P value = 4.22 × 10 −7 ) and rs740024 (P value = 9.34 × 10 −5 ). The basic information of the identified SNPs was summarized in Table 2.
Replication study. To validate the association between COL2A1 and palmar LWR, two genotyped SNP rs2071358 and rs4760608 identified by the GWAS, were selected for replication study. Replication association analysis observed that rs2071358 (P value = 0.017) and rs4760608 (P value = 0.002) of COL2A1 gene were also significantly associated with the palmar LWR of KBD at Bonferroni corrected significance threshold P value < 0.025.

Discussion
Besides finger joint deformities, the growth and development failure of hand is another important outcome of KBD. To the best of our knowledge, no genetic study of palmar LWR of KBD has been conducted by now. To identify the genes involved in the growth and development failure of KBD, we conducted a two-stage GWAS of palmar LWR of KBD. Our results suggest that COL2A1 gene might be involved in the hand development failure of KBD. COL2A1 gene encodes the alpha-1 chain of type II collagen, which is specific for cartilaginous tissues. Type II collagen have four isoforms, including IIA, IIB, IIC and IID 16 . They are expressed in a developmentally regulated manner in chondrogenic tissue. For instance, the IIA isoform is mainly expressed in chondroprogenitor cells, while IIB mRNA is predominantly expressed in differentiated chondrocytes [17][18][19] . Type II collagen is the major extracellular matrix (ECM) component, which makes up about 50% of all protein in cartilage.
It is well documented that type II collagen is essential for the normal embryonic development of the skeleton and linear growth. For instance, a recent study demonstrated that type II collagen was required for proper long bone development of mice 20 . Collagen II knock-out mice didn't form epiphyseal growth plate 21 . In human, COL2A1 mutation is capable of causing bone dysplasia, characterized by skeletal dysplasia, short stature, and sensorial defects 22,23 .
KBD patients have obvious skeletal growth retardation, such as shortened fingers, shortened humeri and short statue. Previous studies have observed that dysfunction of type II collagen contributed to the cartilage lesion of KBD patients. For instance, the expression level of type II collagen in KBD articular cartilage was significantly lower than that in healthy articular cartilage 24,25 . Serum type II collagen can serve as the biomarker of the joint damage of KBD 26 . However, no genetic evidence supporting the role of type II collagen in the hand development failure of KBD has been reported by now. Based on previous functional study results of type II collagen and this study results, we infer that COL2A1 was likely implicated in the hand development failure of KBD.
To reveal the functional significanceof the SNPs identified by this study, we aligned the identified SNPs with public expression quantitative trait loci (eQTLs) and methylation quantitative trait loci (meQTLs) database. We did not find overlapped SNP between eQTLs/meQTLs lists and the identified SNPs. We also aligned the identified SNPs withpublished GWAS results of height. No significant association between height and the identified SNPs has been reported by now. This may be partly explained by that the identified SNPswereclose to the causal genetic variations, which were not detected by this study. Additionally, only the genotyped SNP rs2071358 and rs4760608 identified by the discovery GWAS, were selected for replication study. Because rs2071358 and rs4760608 are linkage disequilibrium with other identified SNPs, we believe that the association signals of rs2071358 and rs4760608 in the replication study generally represented the association between COL2A1 and KBD LWR. Further fine mapping and biological studies are needed toconfirm our finding and clarify the potential mechanism of the identified SNPs involved in the hand development failure of KBD.
In summary, we conducted the first GWAS of palmar LWR of KBD. Our study results suggest that COL2A1 was a likely susceptibility gene of KBD. COL2A1 may be implicated in the growth and development failure of hand of KBD. This study may provide new clues for clarifying the pathogenesis and rationale of therapies for KBD.

Materials and Methods
Human subjects. In this two-stage GWAS, we totally used four hundred and ninety three Chinese Han subjects, which were previously recruited for genetic studies of KBD 13 . In the discovery study, we used 90 representative grade II or III KBD patients 13 . The 90 KBD patients were collected from Linyou county of Xi'an city of Shaanxi Province in China 13 . In the replication study, another independent sample of 403 KBD patients having LWR data, was recruited from Yongshou county and Bin county of Xi'an city of Shaanxi Province in China 13 . All subjects underwent careful clinical examination and radiography of hands. Palmar length and width of left hand were measured using a digital caliper with a resolution of 0.01 mm. Clinical data of each participant was recorded by nurse-administered questionnaire, including self-reported ethnicity, lifestyle characteristics, health status, family and medical histories. According to the KBD Chinese diagnosis criteria (WS/T207-2010), KBD was diagnosed by at least two experts specialized in KBD. We excluded the subjects with genetic bone and cartilage diseases, primary osteoarthritis and rheumatoid arthritis. More informationabout study subjects can be found in our previous study 13 . 5 mL peripheral blood was drawn from each participant.
Genotyping and quality control. Genomic DNA specimens were extracted from peripheral blood leukocytes using the E.Z.N. A Blood DNA Midi Kit (Omega Bio-tek, Norcross, USA) following manufacturers' protocol. Affymetrix Genome Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA, USA) was applied for genotyping. Fluorescence intensities were quantified by Affymetrix 30007 G scanner (Affymetrix). To control the impact of cryptic relatedness on our study results, all GWAS samples passed a pairwise identity by descent testing-basedgenetic relatedness check implemented by PLINK 1,3 . No genetically related sample (pi-hat > 0.2) was detected in this study. For quality control, the SNPs with Hardy-Weinberg Equilibrium testing P values < 0.001, call rates < 98.00% or minor allele frequencies (MAF) < 0.05 were excluded.
Association study. The genetic background of GWAS samples was evaluated by STRUCTURE software 27 .
The genomic control inflation factor λ was calculated by ELGENSTRAT software 28 . With PLINK software, a logistic regression model assuming additive genetic effects was used for association analysis adjusting for sex as a covariate 29 . Significant associations were identified at genomic control corrected P value < 5.0 × 10 −8 . Replication study. Based on the Chinese SNP data of the Hapmap project (http://hapmap.ncbi.nlm.nih.gov/), two SNPs rs2071358 and rs4760608 of COL2A1 gene were selected for replication in the independent validation sample of 403 KBD patients. Sequenom MassARRAY platform (Sequenom, San Diego, CA, USA) was applied for genotyping according to the manufacturer's protocol. The MassARRAY Assay Design 3.1 (Sequenom, San Diego, CA, USA) was used to design the primers of polymerase chain reaction experiment. Sequenom MassArray TYPER 4.0 (Sequenom, San Diego, CA, USA) was used for SNP calling. Association analysis was conducted by PLINK software adjusting for sex as a covariate. Significant associations were identified at Bonferroni corrected significance threshold P value < 0.025. Ethics Statement. This study was approved by the Institutional Review Board of Xi'an Jiaotong University (Project Number 2014008). The study was conducted in accordance with the principles of Helsinki Declaration. Written Informed consent was obtained from all subjects.