Introduction

Kawasaki disease (KD; OMIM 611775), also called mucocutaneous lymph node syndrome, is an acute, self-limited vasculitis that predominantly affects children younger than 5 years. It was first reported by Tomisaku Kawasaki from Japan in 1974 in the English-language literature1. KD occurs worldwide and is most prevalent in East Asian populations, such as Japanese2, Koreans3 and Taiwanese4. The major clinical manifestations of KD include prolonged fever, bilateral non-purulent conjunctivitis, diffuse mucosal inflammation, polymorphous skin rashes, peripheral extremity changes and cervical lymphadenopathy5,6. Approximately 15–25% of untreated and 3–5% of treated children develop coronary artery lesions, including coronary artery dilatation or coronary artery aneurysm and even myocardial infarction7,8, making KD the leading cause of acquired heart disease in childhood in developed countries.

Although almost 40 years have passed since the first description of the disease, its etiology is still not completely clear. However, the familial aggregation of KD patients suggests that genetic factors contribute to KD susceptibility and outcome9,10. Previous association studies based on the candidate gene approach have identified a set of common variants contributing to KD risk, but few have been consistently replicated. The inositol 1,4,5-triphosphate 3-kinase C (ITPKC) gene may be one of the most studied targets in the genetic susceptibility to KD. A genome-wide linkage analysis conducted in Japanese KD sibling pairs led to the identification of a functional single nucleotide polymorphism (SNP), rs28493229, located on chromosome 19q13.2 in the ITPKC gene11,12, which was subsequently confirmed by a meta-analysis integrating 10 case-control studies and 2 transmission/disequilibrium tests13. ITPKC encodes one of the three isoenzymes of inositol 1,4,5-triphosphate 3-kinase, which phosphorylates inositol 1,4,5-triphosphate and acts as a negative regulator of the Ca2+/nuclear factor of activated T-cells (NFAT) signaling pathway in T cells12. In addition, another SNP in this gene, rs2290692, was found to be associated with KD susceptibility in a Chinese population14.

It is believed that T-cell activation plays an important role in the pathogenesis of vascular endothelial cell injury by eliciting proinflammatory reactions at the onset of KD15,16, and the Ca2+/NFAT signaling pathway may play a crucial role in this process. Once the T-cell receptor receives a stimulus, phospholipase Cγ1 is activated and generates diacylglycerol and inositol 1,4,5-triphosphate, the latter of which binds to its receptor on the endoplasmic reticulum (ER) membrane and causes the release of Ca2+ into the cytoplasm. The depletion of the Ca2+ store in the ER is sensed by stromal interaction molecule (STIM), which triggers the entry of extracellular Ca2+ through calcium release-activated Ca2+ channels on the plasma membrane. Calcineurin is activated by the binding of cytoplasmic Ca2+ to calmodulin and then dephosphorylates NFAT in the cytoplasm, leading to nuclear translocation of NFAT. NFAT in the nucleus drives transcription of genes, such as IL-2, that are important in T-cell activation17.

Caspase 3 (CASP3) is one of the effector caspases that play an important role in the execution phase of apoptosis. It has been reported that CASP3 cleaves NFATc2 as one of its substrates in T cells18 and also acts as a negative regulator of the Ca2+/NFAT pathway. In addition, another study conducted by Onouchi et al. revealed that a G to A substitution of one commonly associated SNP located in the 5′-untranslated region of CASP3 (rs113420705, formerly rs72689236) abolished binding of NFAT to the DNA sequence surrounding the SNP19.

Compared with ITPKC and CASP3, few studies have investigated other loci involved in the Ca2+/NFAT signaling pathway, such as STIM1, STIM2, NFATc1, NFATc2 and IL-2, in relation to KD susceptibility, in spite of their contributions to the immune and inflammatory system20,21,22,23,24,25. Therefore, we hypothesized that multiple common genetic variants involved in the Ca2+/NFAT signaling pathway may individually or interactively contribute to the etiology and pathogenesis of KD. Interactions can be challenging for traditional analytic strategies, such as logistic regression (LR), to characterize fully, since sparseness of the data in high dimensions leads to large standard errors of parameter estimates and an increase in type I errors26,27. Moreover, statistical power decreases and type II errors increase when interactions are detected by LR in a relatively small sample28. Thus, a non-parametric data mining approach, classification and regression tree (CART), has been applied to explore high-order gene-gene and gene-environment interactions, given its statistical advantages in overcoming inaccurate parameter estimates and providing good power for identifying high-order interactions29.

Here, in a case-control study of a Chinese population, we investigated 16 SNPs in 7 candidate genes involved in the Ca2+/NFAT signaling pathway; the SNPs were either predicted to be functional or identified by previous studies. We examined the individual and combined effects of these 16 variants using a traditional unconditional LR model, and explored high-order gene-gene interactions in modulating KD risk using CART analysis.

Results

Subject characteristics

A total of 428 KD children and 493 healthy controls were enrolled in this study. There were 263 (61%) males in cases and 303 (61%) males in controls and no statistically significant difference was observed between cases and controls in the distribution of gender (Pearson χ2 = 0.000, P = 0.997).
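The gender comparison above can be checked by hand. The following is a minimal sketch, assuming an uncorrected Pearson χ2 test on a 2 × 2 table; the female counts are not reported in the text and are derived here as group total minus males, and the function name `pearson_chi2_2x2` is illustrative only.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Male counts are taken from the text; female counts are derived (assumption).
cases_male, cases_female = 263, 428 - 263
controls_male, controls_female = 303, 493 - 303

chi2, p_value = pearson_chi2_2x2(cases_male, cases_female,
                                 controls_male, controls_female)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

Under these assumptions the sketch gives values that round to the reported χ2 = 0.000 and P = 0.997.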

Association analysis between individual SNP and KD risk

The genotyping call rates of all 16 SNPs were >95% except for CASP3 rs113420705 (92%), so rs113420705 was excluded from further analysis. The genotype distributions of the remaining 15 SNPs in our control subjects were all in Hardy-Weinberg equilibrium (HWE, P > 0.05) and their minor allele frequencies (MAFs) were similar to those in the HapMap database for Han Chinese in Beijing, China (CHB, Supplementary Table S1 online). As shown in Table 1, two SNPs were associated with an increased risk of KD under the dominant model: CASP3 rs2720378 significantly (odds ratio (OR) = 1.39, 95% confidence interval (CI) = 1.07–1.80, P = 0.014) and IL-2 rs2069762 marginally (OR = 1.28, 95% CI = 0.98–1.67, P = 0.066).
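To illustrate what an OR under the dominant model means, the sketch below collapses heterozygotes and variant homozygotes into one carrier group and computes a crude OR with a Wald 95% CI. The genotype counts are hypothetical placeholders (Table 1 is not reproduced here), and the study itself used unconditional LR, so this is an approximation of the idea rather than the authors' calculation.

```python
import math

def dominant_model_or(case_counts, control_counts, z=1.96):
    """Crude OR and Wald CI for carriers vs. reference homozygotes.

    Each argument is (n_reference_homozygote, n_heterozygote, n_variant_homozygote).
    """
    a = case_counts[1] + case_counts[2]        # carriers among cases
    b = case_counts[0]                         # non-carriers among cases
    c = control_counts[1] + control_counts[2]  # carriers among controls
    d = control_counts[0]                      # non-carriers among controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical genotype counts, for illustration only.
example_cases = (100, 150, 50)
example_controls = (150, 120, 30)
or_, ci_low, ci_high = dominant_model_or(example_cases, example_controls)
print(f"OR = {or_:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

A CI that excludes 1 corresponds to a significant association at the 0.05 level, as for rs2720378; a CI that just includes 1 corresponds to a marginal result, as for rs2069762.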

Table 1 Association between individual SNP and KD risk

Association of high-order interactions with KD risk by CART analysis

In the final optimal decision tree generated by the CART analysis (Figure 1), the initial split of the root node was CASP3 rs2720378, and individuals harboring the rs2720378 C allele (GC or CC genotype) had a higher risk of KD than those with the GG genotype, suggesting that CASP3 rs2720378 was the strongest risk factor for KD among the 15 SNPs examined. A deeper exploration of the classification tree structure demonstrated distinct interaction patterns between individuals carrying the rs2720378 GC or CC genotype and those with the rs2720378 GG genotype. Individuals harboring the GG genotype had the lowest risk of KD, with 42% cases, so we used this terminal node as the reference. As shown in Table 2, individuals carrying the combination of the CASP3 rs2720378 GC or CC genotype, the IL-2 rs2069762 AC or CC genotype and the STIM1 rs1561876 AA genotype exhibited the highest risk of KD (OR = 2.12, 95% CI = 1.46–3.07, P < 0.001).

Table 2 Risk estimates of CART terminal nodes
Figure 1 CART analysis of genetic variants in Ca2+/NFAT signaling pathway and KD risk.

Cases and controls are denoted by white and black boxes, respectively, and each node contains the frequencies and percentages of cases and controls in each subgroup.

Afterwards, we tested pairwise interactions between the three SNPs (rs2720378, rs2069762 and rs1561876) by conventional logistic regression analysis. However, we did not find a positive result for any interaction term, on either the multiplicative or the additive scale (Supplementary Table S2 online).

Cumulative effect analysis

The cumulative effect of the 3 SNPs identified in the CART analysis was evaluated by LR analysis, with the rs2720378 C allele, rs2069762 C allele and rs1561876 A allele as risk alleles. Participants were categorized into 3 groups based on the number of risk alleles they harbored (0–2, 3–4 and 5–6; the lowest categories were pooled because the subgroup with 0 risk alleles was very small), and the subgroup with 0–2 risk alleles was used as the reference group. The subgroups with 3–4 and 5–6 risk alleles showed marginally significant and significant associations with increased KD risk, respectively (OR = 1.32, 95% CI = 1.00–1.74, P = 0.052; OR = 2.93, 95% CI = 1.66–5.17, P < 0.001), and the Cochran-Armitage trend test indicated a significant dose effect, with ORs increasing with the number of risk alleles (Ptrend < 0.001, Table 3).
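The grouping and trend test described above can be sketched as follows. The risk alleles and group boundaries come from the text; the genotype encoding (two-letter strings), the integer group scores 0, 1, 2 and the example counts are assumptions made for illustration, and the trend statistic uses the common large-sample form of the Cochran-Armitage test.

```python
import math

# Risk alleles as named in the text.
RISK_ALLELES = {"rs2720378": "C", "rs2069762": "C", "rs1561876": "A"}

def count_risk_alleles(genotypes):
    """genotypes maps SNP id -> two-letter genotype string such as 'GC'."""
    return sum(genotypes[snp].count(allele) for snp, allele in RISK_ALLELES.items())

def risk_group(n_alleles):
    """Assign the three categories used in the cumulative analysis."""
    if n_alleles <= 2:
        return "0-2"
    if n_alleles <= 4:
        return "3-4"
    return "5-6"

def cochran_armitage(cases, totals, scores=None):
    """Trend z statistic and two-sided P value across ordered groups."""
    scores = scores if scores is not None else list(range(len(cases)))
    n, r = sum(totals), sum(cases)
    p_bar = r / n
    # Score-weighted excess of cases over the number expected under no trend.
    t_stat = sum(s * (c - m * p_bar) for s, c, m in zip(scores, cases, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    z = t_stat / math.sqrt(variance)
    return z, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical individual and hypothetical group counts, for illustration only.
person = {"rs2720378": "CC", "rs2069762": "AC", "rs1561876": "AA"}
n_risk = count_risk_alleles(person)
z_trend, p_trend = cochran_armitage(cases=[10, 20, 30], totals=[100, 100, 100])
print(n_risk, risk_group(n_risk), round(z_trend, 3), p_trend)
```

A positive z indicates that the proportion of cases rises across the ordered groups, which is the dose effect that Ptrend summarizes.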

Table 3 Cumulative effect of the 3 SNPs (rs2720378, rs2069762 and rs1561876) between KD patients and normal controls

Discussion

KD is an immune-mediated multi-systemic vasculitis, and the most widely accepted view is that it results from an unknown infectious trigger in a subset of genetically susceptible hosts19,30. A number of genes have been reported to be significantly associated with susceptibility to KD in different populations, such as CD40L31, HLA-E32, FAM167A-BLK33 and FCGR2A34, through candidate gene approaches or genome-wide association studies. However, most of these genes were identified in isolation, whichever strategy was applied. The transforming growth factor-beta signaling pathway, the only pathway studied in KD so far, has been shown to play an important role in KD pathogenesis, and genetic variants in three genes in the pathway (TGFB2, TGFBR2, SMAD3) influence KD susceptibility, coronary artery aneurysms, aortic root dilatation and intravenous immunoglobulin treatment response35.

To the best of our knowledge, this is the first study of the individual and combined effects of genetic variants in the Ca2+/NFAT signaling pathway on KD risk in a Chinese population. In our study, the candidate SNPs were predicted to be functional by bioinformatics analyses or identified by previous studies. We first demonstrated that CASP3 rs2720378 and IL-2 rs2069762 were nominally associated with an increased risk of KD. The subsequent CART analysis revealed the predictive value of gene-gene interactions among the CASP3 rs2720378, IL-2 rs2069762 and STIM1 rs1561876 variants for KD risk, and the cumulative analysis further indicated their synergistic effect on KD risk.

The Ca2+/NFAT signaling pathway plays a crucial role in immune cell functions, including cell proliferation, cytokine gene expression, differentiation and cell death36. Accumulated evidence indicates that abnormal expression or anergy caused by mutations of any component of the Ca2+/NFAT signaling pathway could alter signal intensity or transmission, thus contributing to the pathogenesis or progression of diseases, especially immune-related diseases37,38,39,40,41,42. Regarding our current findings, the positive result for SNP rs2720378 was also observed in a previous study conducted by Onouchi et al.19. The investigators explored the biological effect of CASP3 rs113420705 and found that the G > A substitution weakened the binding between CASP3 and NFATc2, reduced CASP3 transcription and presumably would inhibit T-cell apoptosis. In addition, rs2720378 was in tight linkage disequilibrium (LD) with rs113420705 in Japanese (r2 = 0.88). Accordingly, we could speculate that rs2720378 might not be the “real” causal variant but only a “proxy”. However, as Onouchi et al. pointed out in their study, there remained a possibility that rs2720378 also affected CASP3 expression by other, unknown mechanisms. With the in silico tool SNPinfo43, rs2720378 was classified as affecting transcription-factor-binding site (TFBS) activity, with a moderate difference score between the two alleles, which might be a possible hint. On the other hand, SNP rs2069762, a single base change proximal to the IL-2 promoter region (−330A > C), was also predicted to affect TFBS activity, with the C allele having a significantly higher score than the A allele. 
Moreover, it has been shown that individuals homozygous for the C allele of rs2069762 produced over three times as much IL-2 as their AA and AC counterparts after anti-CD3/CD28 stimulation25, and this SNP has been correlated with a series of inflammatory or immunological diseases, such as ulcerative colitis40, multiple sclerosis41 and childhood lymphoma42. Therefore, we considered it biologically plausible that SNP rs2069762 was also associated with increased KD risk. As for STIM1 rs1561876, few studies have investigated its effect on KD risk, but a recent study has demonstrated that the G allele of rs1561876 was associated with lower intracellular 1-β-D-arabinofuranosyl-CTP levels, inferior response after remission induction therapy, greater risk of relapse and poorer overall survival in acute myeloid leukemia patients receiving cytarabine and cladribine44. Moreover, this SNP was predicted to affect TFBS activity as well as microRNA-binding site activity, which might provide a clue for further study. The multiple-risk genetic variants detected by CART, CASP3 rs2720378, IL-2 rs2069762 and STIM1 rs1561876, suggested a stronger combined effect of SNPs, which might prompt researchers to pay more attention to gene-gene interactions contributing to KD risk, if they truly exist.

Although the association of ITPKC rs28493229 with KD susceptibility has been reported in patients of Japanese and American ethnicities12,45, we failed to reproduce this finding in our current study, which is in accordance with a 2012 study of a Chinese population from Sichuan province, China14. The relatively low statistical power caused by the insufficient number of study subjects might need to be taken into account. Moreover, according to data from 1000 Genomes (http://www.1000genomes.org/), the frequency of the risk-conferring C allele of rs28493229 is significantly lower in CHB (7%) than in Japanese in Tokyo or Americans (both 15%), so the discrepancy might simply reflect ethnic variation.

Considering the vital importance of the Ca2+/NFAT pathway in the pathogenesis of KD, exploring novel inhibitors of this pathway seems promising for the treatment of KD. Recently, two clinical trials of cyclosporine A, a calcineurin inhibitor that potently suppresses T-cell activity, have indicated that cyclosporine A appears to be a safe and effective approach for patients with refractory KD46,47.

Certain limitations of this study should be noted. Firstly, the candidate genes we studied were the primary genes, but not all the genes, involved in the Ca2+/NFAT pathway, and the screening strategy for candidate SNPs was not rigorous enough for the selected SNPs to cover all the probably functional SNPs in these gene regions. Thus, our results might not fully reflect the genetic effects of this pathway on KD susceptibility, which might weaken our conclusions. Secondly, these findings are preliminary, since the sample size was relatively small and the findings were not replicated in the current study. Thirdly, the lack of information about environmental factors that might play roles in KD onset, such as family history and infection history, limited further investigation of gene-environment interactions. Fourthly, the significant associations we observed were purely statistical, and we did not evaluate their biological functions.

In summary, our study suggests the importance of single- and multiple-risk variants in the Ca2+/NFAT signaling pathway in KD susceptibility and provides a clue for detecting potential gene-gene interactions conferring KD risk. Further studies with more comprehensive SNP coverage, different ethnicities and larger sample sizes are warranted, and follow-up studies are also needed to identify the true causal variants and reveal the biological mechanism of the genetic etiology.

Methods

Study subjects

This study included 428 children diagnosed with KD, all of whom were unrelated ethnic Han Chinese and were consecutively recruited between April 2009 and September 2012 from Children's Hospital, Zhejiang University School of Medicine, China. The diagnosis of KD was based on the 5th revised edition of the guidelines established by the Kawasaki Disease Research Committee in Japan in 2002 (http://kawasaki-disease.org/diagnostic/). The controls comprised 493 ethnically and gender-matched unrelated healthy Chinese children without any evidence of infection. They were recruited from the same hospital at the time of a routine physical examination.

This study was approved by the ethics committees of the Children's Hospital, Zhejiang University School of Medicine and the methods were carried out in accordance with the approved guidelines. Informed consent was obtained from all participants or parents/caregivers of all the subjects who were studied.

Identification of candidate SNPs and genotyping

Seven candidate genes involved in the Ca2+/NFAT pathway (ITPKC, CASP3, STIM1, STIM2, NFATc1, NFATc2 and IL-2) were selected based on recent findings on genetic variants of KD, previous association studies and biological evidence. The procedure for screening SNPs predicted to be functional in these genes was as follows. First, we extracted the ranges of the physical positions covering the 7 genes and 2 kb up- and downstream of each from the HapMap database (http://hapmap.ncbi.nlm.nih.gov/, HapMap Data Rel 24/phaseII Nov08, on NCBI B36 assembly, dbSNP b126), CHB. Second, we entered the position of each gene into the integrated bioinformatics tool “SNPinfo”43 (http://snpinfo.niehs.nih.gov/snpinfo/snpfunc.htm) and retrieved a set of SNPs with possible functions. Third, we filtered these SNPs by MAF >10% in CHB and obtained a total of 52 SNPs in the 7 candidate genes. Detailed information on all these SNPs is listed in Supplementary Table S3 online, including the possible functions of allelic difference that the SNPinfo tool can predict (marked with the letter Y), such as TFBS, splicing site, microRNA-binding site, nonsynonymous SNP and polymorphism phenotyping. Fourth, we tested the LD among these SNPs using Haploview (v4.2)48 (Supplementary Fig. S1 online); SNPs in strong LD with each other (r2 ≥ 0.80) were considered redundant and only one was retained. As a result, a total of 32 SNPs remained for further selection (Supplementary Table S4 online). Since a few of these SNPs, such as ITPKC rs7251246 and STIM1 rs12806698, did not appear in the Haploview results, we rechecked the LD of these 32 SNPs using the online database of 1000 Genomes (http://www.1000genomes.org/) and found that several pairs of SNPs were in strong LD with each other (Supplementary Table S5 online). Applying the same filtering procedure therefore left 27 SNPs (Supplementary Table S6 online). 
Next, we gave priority to SNPs predicted to harbor multiple functional motifs over SNPs with a single predicted functional effect in the same gene (Supplementary Table S7 online). Afterwards, among the genes that harbored more than one probably functional SNP, we removed five SNPs (rs9962479, rs4811191, rs2581732, rs9518 and rs4647601).
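The LD-based redundancy filter described above (r2 ≥ 0.80, one SNP retained per correlated set) can be sketched as a greedy pass over the SNP list. The text does not say which SNP of a correlated pair was retained, so keeping the first one encountered is an assumption; the SNP names and r2 values below are hypothetical, and `prune_by_ld` is an illustrative helper, not part of Haploview.

```python
def prune_by_ld(snps, r2, threshold=0.80):
    """Keep a SNP only if its r2 with every already-kept SNP is below threshold.

    r2 maps frozenset({snp_a, snp_b}) -> pairwise r2; missing pairs count as 0.
    """
    kept = []
    for snp in snps:
        if all(r2.get(frozenset((snp, other)), 0.0) < threshold for other in kept):
            kept.append(snp)
    return kept

# Hypothetical SNP ids and pairwise r2 values, for illustration only.
candidate_snps = ["snpA", "snpB", "snpC", "snpD"]
pairwise_r2 = {
    frozenset(("snpA", "snpB")): 0.92,  # redundant with snpA
    frozenset(("snpA", "snpC")): 0.10,
    frozenset(("snpB", "snpC")): 0.15,
    frozenset(("snpC", "snpD")): 0.80,  # exactly at the threshold, so redundant
}
retained = prune_by_ld(candidate_snps, pairwise_r2)
print(retained)
```

Because the result depends on the order of the input list, a real pipeline would need an explicit rule for choosing which SNP of each pair to keep, such as the functional-priority step described in the text.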

Moreover, we added CASP3 rs113420705 and ITPKC rs28493229, which have been previously identified as biologically functional and associated with KD risk12,19. Thus, a total of 16 SNPs were collected for this study (Supplementary Table S1 online).

The allelic differences in effect on DNA motifs that SNPinfo could predict are shown in Supplementary Table S8 online. For the different possible functions, the prediction scores were calculated by different methods, such as Match49, miRanda50, ESEfind51 and PolyPhen52; if there were several pairs of scores for the two alleles of a given SNP, we chose the pair with the maximal difference between the two alleles. The only exception was rs2290692, for which the two alleles had the same score. Since it had been found to be associated with KD susceptibility in a Chinese population in a previous study14, we retained it for further study.

Genomic DNA was extracted from a 2 mL peripheral blood sample collected from each participant at recruitment, using the RelaxGene Blood System DP319-02 (Tiangen, Beijing, China) according to the manufacturer's instructions. The concentration and optical density of the DNA were confirmed with a NanoDrop 1000 spectrophotometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA). All the SNPs were genotyped using the TaqMan OpenArray Genotyping Assay System (Applied Biosystems, Foster City, CA, USA); the main procedure was as follows. The final reaction volume per well was 4 μL for a 384-well plate, comprising TaqMan OpenArray Master Mix, 2 × (2 μL), which was customized by Applied Biosystems, and normalized DNA sample (2 μL). We used the OpenArray AccuFill System (Applied Biosystems, Foster City, CA, USA) to load the samples into OpenArray plates (32 × 96 format) automatically, then placed the loaded plates into the cycling case and UV cured them for 90 seconds. Next, we loaded the plates into the thermal cycler with an initial temperature of 93°C for 10 minutes, followed by 50 cycles of 95°C for 45 seconds, 94°C for 13 seconds and 53°C for 134 seconds; after that, the temperature dropped to 25°C for 2 minutes and was maintained at 4°C. Finally, the allelic discrimination plate read and analysis were performed using the TaqMan OpenArray System, following the manufacturer's protocol.

Notably, genotyping was performed blind to the case-control status of the participants. A masked, random 5% of the samples were genotyped twice for quality control and the reproducibility was 100%.

Statistical methods

SNPs with genotyping call rates <95% or those that deviated from HWE in controls (P < 0.05) were excluded. HWE for genotypes was assessed by a goodness-of-fit χ2 test in the control group. Differences in the distributions of gender and genotype frequencies between cases and controls were examined by Pearson's χ2 test or Fisher's exact test, as appropriate. Unconditional LR analysis was then applied to estimate ORs and their 95% CIs for the effects of the SNPs on KD susceptibility, under the assumption that variant alleles were the risk alleles. To increase statistical power, the most likely inheritance model for each SNP was selected instead of testing three models simultaneously. The statistical power to detect the effects of the SNPs was calculated with Power (v3.0.0)53,54. For example, for SNPs with MAFs of 0.070, 0.244 and 0.500, the power of our sample size to detect an OR of 1.50 was 0.40, 0.79 and 0.86, respectively.
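The HWE goodness-of-fit check in controls can be sketched as below, assuming a biallelic SNP and the standard χ2 test with 1 degree of freedom. The genotype counts are hypothetical, and `hwe_chi2` is an illustrative name rather than a function from the software used in the study.

```python
import math

def hwe_chi2(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test of Hardy-Weinberg proportions (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip empty expected classes, which occur only for monomorphic SNPs.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts, for illustration only.
chi2_ok, p_ok = hwe_chi2(25, 50, 25)     # exactly in HWE proportions
chi2_bad, p_bad = hwe_chi2(60, 20, 20)   # strong heterozygote deficit
print(f"in HWE: P = {p_ok:.3f}; deviating: P = {p_bad:.2e}")
```

Under the exclusion rule stated above, a SNP behaving like the second example (P < 0.05 in controls) would be removed before association testing.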

The potential gene-gene interactions were examined by CART, a non-parametric technique that requires no assumptions about the distribution of the data. A CART tree is constructed by splitting the data recursively into binary subsamples, beginning with the root node, which contains the whole learning sample, and leading to the formation of daughter nodes (nodes that can be split further) and terminal nodes (nodes that cannot be split any further). The Gini criterion is used to achieve a high degree of homogeneity in the terminal nodes or subgroups. After a large tree is grown, which tends to be complex and difficult to interpret, a pruning procedure is performed to avoid overfitting the model. In the final stage, the optimal tree is selected based on the lowest misclassification error rate, which can be assessed by cross validation55. Subgroups of individuals with differential risk associations with KD were identified in the different terminal nodes of the tree, indicating potential interactions. Finally, the risk of these subgroups was evaluated by LR analysis, with the subgroup with the lowest percentage of cases as the reference, and ORs and 95% CIs were adjusted for gender28. After the CART analysis, the combined risk of the variants that potentially interacted with each other was evaluated by a cumulative effect analysis.
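The Gini criterion that drives each binary split can be sketched as follows. This is a toy illustration of how a single candidate split is scored, not the SPSS implementation used in the study; the data rows, the helper names and the dominant-style genotype grouping are assumptions.

```python
def gini(n_cases, n_controls):
    """Gini impurity of a node with two classes."""
    n = n_cases + n_controls
    if n == 0:
        return 0.0
    p = n_cases / n
    return 2 * p * (1 - p)

def class_counts(rows):
    """Return (number of cases, number of controls) in a list of rows."""
    cases = sum(1 for _, is_case in rows if is_case)
    return cases, len(rows) - cases

def split_gain(rows, snp, left_genotypes):
    """Decrease in Gini impurity when rows are split on one SNP.

    rows is a list of (genotypes dict, is_case) pairs; individuals whose
    genotype at snp is in left_genotypes go to the left daughter node.
    """
    left = [r for r in rows if r[0][snp] in left_genotypes]
    right = [r for r in rows if r[0][snp] not in left_genotypes]
    n = len(rows)
    weighted = (len(left) / n) * gini(*class_counts(left)) \
        + (len(right) / n) * gini(*class_counts(right))
    return gini(*class_counts(rows)) - weighted

# Hypothetical toy data in which carrier status separates cases perfectly.
toy_rows = [
    ({"rs2720378": "GG"}, False),
    ({"rs2720378": "GG"}, False),
    ({"rs2720378": "GC"}, True),
    ({"rs2720378": "CC"}, True),
]
gain = split_gain(toy_rows, "rs2720378", {"GC", "CC"})
print(gain)
```

A tree-growing routine would evaluate this gain for every SNP and genotype grouping at each node and split on the best one, which is why the SNP chosen at the root node is interpreted as the strongest single predictor.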

All of the above statistical analyses were conducted with SPSS software v13.0 (SPSS, Inc., Chicago, IL), and all P values were two-tailed with a significance level of 0.05.