Genetic sharing and heritability of paediatric age of onset autoimmune diseases

Li, Yun R.; Zhao, Sihai D.; Li, Jin; Bradfield, Jonathan P.; Mohebnasab, Maede; Steel, Laura; Kobie, Julie; Abrams, Debra J.; Mentch, Frank D.; Glessner, Joseph T.; Guo, Yiran; Wei, Zhi; Connolly, John J.; Cardinale, Christopher J.; Bakay, Marina; Li, Dong; Maggadottir, S. Melkorka; Thomas, Kelly A.; Qui, Haijun; Chiavacci, Rosetta M.; Kim, Cecilia E.; Wang, Fengxiang; Snyder, James; Flatø, Berit; Førre, Øystein; Denson, Lee A.; Thompson, Susan D.; Becker, Mara L.; Guthery, Stephen L.; Latiano, Anna; Perez, Elena; Resnick, Elena; Strisciuglio, Caterina; Staiano, Annamaria; Miele, Erasmo; Silverberg, Mark S.; Lie, Benedicte A.; Punaro, Marilynn; Russell, Richard K.; Wilson, David C.; Dubinsky, Marla C.; Monos, Dimitri S.; Annese, Vito; Munro, Jane E.; Wise, Carol; Chapel, Helen; Cunningham-Rundles, Charlotte; Orange, Jordan S.; Behrens, Edward M.; Sullivan, Kathleen E.; Kugathasan, Subra; Griffiths, Anne M.; Satsangi, Jack; Grant, Struan F. A.; Sleiman, Patrick M. A.; Finkel, Terri H.; Polychronakos, Constantin; Baldassano, Robert N.; Luning Prak, Eline T.; Ellis, Justine A.; Li, Hongzhe; Keating, Brendan J.; Hakonarson, Hakon

doi:10.1038/ncomms9442

Article
Open access
Published: 09 October 2015

Yun R. Li ORCID: orcid.org/0000-0002-8077-4975^1,2,
Sihai D. Zhao³,
Jin Li¹,
Jonathan P. Bradfield¹,
Maede Mohebnasab¹,
Laura Steel¹,
Julie Kobie⁴,
Debra J. Abrams¹,
Frank D. Mentch¹,
Joseph T. Glessner¹,
Yiran Guo ORCID: orcid.org/0000-0002-6549-8589¹,
Zhi Wei^1,5,
John J. Connolly¹,
Christopher J. Cardinale¹,
Marina Bakay¹,
Dong Li¹,
S. Melkorka Maggadottir^1,6,
Kelly A. Thomas ORCID: orcid.org/0000-0002-9719-2664¹,
Haijun Qui¹,
Rosetta M. Chiavacci¹,
Cecilia E. Kim¹,
Fengxiang Wang¹,
James Snyder¹,
Berit Flatø⁷,
Øystein Førre⁷,
Lee A. Denson⁸,
Susan D. Thompson⁹,
Mara L. Becker¹⁰,
Stephen L. Guthery¹¹,
Anna Latiano¹²,
Elena Perez¹³,
Elena Resnick¹⁴,
Caterina Strisciuglio¹⁵,
Annamaria Staiano¹⁵,
Erasmo Miele¹⁵,
Mark S. Silverberg¹⁶,
Benedicte A. Lie¹⁷,
Marilynn Punaro¹⁸,
Richard K. Russell¹⁹,
David C. Wilson²⁰,
Marla C. Dubinsky²¹,
Dimitri S. Monos^22,23,
Vito Annese²⁴,
Jane E. Munro^25,26,
Carol Wise²⁷,
Helen Chapel²⁸,
Charlotte Cunningham-Rundles¹⁴,
Jordan S. Orange²⁹,
Edward M. Behrens^23,30,
Kathleen E. Sullivan^6,23,
Subra Kugathasan³¹,
Anne M. Griffiths³²,
Jack Satsangi³³,
Struan F. A. Grant^1,23,
Patrick M. A. Sleiman^1,23,
Terri H. Finkel³⁴,
Constantin Polychronakos³⁵,
Robert N. Baldassano^23,36,
Eline T. Luning Prak³⁷,
Justine A. Ellis^38,39,
Hongzhe Li⁴,
Brendan J. Keating^1,23 &
Hakon Hakonarson^1,23,40

Nature Communications volume 6, Article number: 8442 (2015)

Abstract

Autoimmune diseases (AIDs) are polygenic diseases affecting 7–10% of the population in the Western Hemisphere with few effective therapies. Here, we quantify the heritability of paediatric AIDs (pAIDs), including JIA, SLE, CEL, T1D, UC, CD, PS, SPA and CVID, attributable to common genomic variations (SNP-h²). SNP-h² estimates are most significant for T1D (0.863±s.e. 0.07) and JIA (0.727±s.e. 0.037), more modest for UC (0.386±s.e. 0.04) and CD (0.454±0.025), largely consistent with population estimates and generally greater than those previously reported by adult GWAS. On pairwise analysis, we observed that the diseases UC-CD (0.69±s.e. 0.07) and JIA-CVID (0.343±s.e. 0.13) are the most strongly correlated. Variations across the MHC strongly contribute to SNP-h² in T1D and JIA, but do not significantly contribute to the pairwise rG. Together, our results partition contributions of shared versus disease-specific genomic variations to pAID heritability, identifying pAIDs with unexpected risk sharing, while recapitulating known associations between autoimmune diseases previously reported in adult cohorts.

Introduction

Autoimmune (AI) diseases affect approximately 1 in 12 individuals living in the Western Hemisphere, representing a significant cause of morbidity, chronic disability and health-care burden. High rates of sibling recurrence and twin–twin concordance, both within and across multiple independent AI diseases, coupled with recent results from genome-wide association studies (GWAS), suggest that a set of shared genetic risk factors underlie paediatric AI disease (pAID) aetiology^1,2,3. Moreover, a number of AI diseases show clear familial clustering, such as inflammatory bowel disease (IBD)⁴, whereas others (for example, type 1 diabetes (T1D), AI thyroiditis (THY) and celiac disease (CEL)) may manifest as comorbid diseases in polyglandular AI syndromes². Although the concept of genetic sharing among AI diseases is intriguing, it remains unclear whether this is due to ‘pleiotropic’ risk factors that predispose to multiple AI diseases via shared mechanisms or whether multiple, independent risk factors are responsible.

GWAS have identified single-nucleotide polymorphisms (SNPs) across hundreds of loci as being associated with an increased risk of developing AI^{5,6,7,8,9,10,11,12}. These findings, coupled with those from epidemiological studies, strongly support the existence of (i) an overlapping ‘AI disease genetic landscape’^13,14 and (ii), consequently, a shared heritability across these diseases. Heritability, in the broad sense (H²), is defined as the entirety of an individual's phenotypic variation explained by genetic variance, but in practice it can be difficult to quantify and partition precisely¹⁵. A major contribution to H² is the narrow-sense or additive heritability (h²), which can be more accurately quantified¹⁵. Recently, a new method was established to estimate the total phenotype variance attributable to additive genetic variations using genome-wide SNP genotyping data^16,17,18,19. The method has since been applied to dozens of GWAS-examined traits and extended to examine jointly the co-heritability of related diseases²⁰.

We systematically quantified the narrow-sense heritability, h², as well as the pairwise joint heritability of pAIDs attributable to common genomic variation using a single-centre accrued cohort of over 5,000 unrelated cases composed of nine independent pAIDs and 36,000 shared, population-based healthy controls. We first report the genome-wide SNP genotype-derived heritability estimates (referred to as SNP-h²) and then the genetic correlation (SNP-rG) across pairs of the nine investigated pAIDs. We contextualize these findings alongside a comprehensive review of available literature and epidemiological data sets, illustrate a method for quantifying genetic risk factor sharing across pAIDs, and provide considerations for how such genetic data can aid disease prediction.

Results

Quantifying the heritability of paediatric AI diseases

To quantify the SNP-h² of the nine pAIDs, we utilized genome-wide SNP genotypes ascertained from DNA samples of patients of each pAID cohort along with samples from population-based control subjects with no known diagnosis of autoimmunity or immunodeficiency. Following extensive quality control (QC), in which we removed SNPs with low minor allele frequency (MAF), high missingness, differential missingness between cases and controls, or deviation from Hardy–Weinberg equilibrium (see Methods), we retained 461,301 SNPs. We excluded samples for low genotyping rates, cryptic relatedness and genetic outliers, leaving a cohort consisting of 4,956 cases distributed across nine pAIDs and 27,451 unrelated shared population-based controls (Table 1). We also included, for comparison, a non-immune-mediated dichotomous trait, paediatric-onset epilepsy (EPI); this cohort of ∼800 case subjects was recruited and genotyped at our centre using the same platforms over the same time period.

Table 1 Summary of cohorts included.

We used a previously described method for estimating disease variance explained by additive genetic factors using GWAS data (referred to as SNP-based heritability or SNP-h²)¹⁷. We transformed the SNP-h² estimates from the observed to the liability-scale using respective observed disease prevalence. To assess if our SNP-h² estimates are consistent with previously published findings and other population-based heritability estimates (POP-h²), we performed a systematic literature search followed by manual curation of prevalence and heritability estimates for each of the nine pAIDs (Fig. 1a and Supplementary Tables 1 and 2).
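The observed-to-liability conversion described above can be illustrated with a minimal sketch. It assumes the widely used transformation for ascertained case-control samples, h²_liability = h²_observed × K²(1−K)² / (z² P(1−P)), where K is the population prevalence, P is the proportion of cases in the sample and z is the height of the standard normal density at the liability threshold. The text does not spell out the formula, so the function name, the parameter names and the formula itself are assumptions rather than a description of the authors' pipeline.

```python
import math
from statistics import NormalDist


def liability_scale_h2(h2_observed, prevalence, case_fraction):
    """Convert an observed-scale SNP-h2 to the liability scale (assumed formula).

    prevalence    -- population prevalence K of the disease (0 < K < 1)
    case_fraction -- proportion P of cases in the analysed sample (0 < P < 1)
    """
    normal = NormalDist()
    # Liability threshold above which an individual is affected.
    threshold = normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at that threshold.
    z = normal.pdf(threshold)
    scale = (prevalence * (1.0 - prevalence)) ** 2 / (
        z ** 2 * case_fraction * (1.0 - case_fraction)
    )
    return h2_observed * scale
```

Under this formula the conversion depends on the assumed prevalence, which is one reason curated prevalence estimates matter when comparing SNP-h² across diseases.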

**Figure 1: Autoimmune disease prevalence and heritability estimates.**

Among the pAIDs examined where the SNP-h² estimates were at least nominally significant (P<0.05), T1D and juvenile idiopathic arthritis (JIA) were the most highly heritable (Fig. 1b). Considerably lower estimates were observed for ulcerative colitis (UC) and Crohn's disease (CD; Supplementary Fig. 1A), suggesting that environmental factors may play a much larger role in IBD aetiology (Fig. 1d). We also observed relatively low SNP-h² estimates for systemic lupus erythematosus (SLE; 0.205±s.e. 0.076).

Contribution of the MHC region and ChrX to SNP-h²

Given the known association of variants across the MHC with AI diseases, we quantified their contribution to the SNP-h² for each of the nine pAIDs. We first performed HLA imputation²¹ to identify the SNP, amino acid or HLA allele most strongly associated with each pAID (Supplementary Table 5), and we estimated POP-h² attributable to the extended MHC based on previous analyses (Supplementary Tables 6 and 7). The MHC-specific SNP-h² estimates correlated well with the strength of the lead MHC P-value. For example, variations across the extended MHC region accounted for 32.7% of the total autosomal SNP-h² in T1D and 24.7% of that in CEL, with no significant contribution to the SNP-h² estimates in psoriasis (PS), SLE, CD or the non-pAID, EPI. Despite the pervasive association between SNPs within the MHC and both JIA and UC, contributions of the extended MHC to their total SNP-h² (10.7% and 5.8%, respectively) were limited (Fig. 1c and Table 2). Despite the known association with HLA-DRB1*0103 and HLA-B*52 in UC¹³, we observed that removing the extended MHC did not significantly reduce the observed SNP-h² for either UC or the related IBD phenotype, CD (Supplementary Table 8). As expected, the contribution of ChrX to the overall SNP-h² was small across all pAIDs (Supplementary Table 2). These estimates are consistent with expectations, as ChrX makes up only about 5% of the total genome²², has comparatively fewer coding bases and is less polymorphic²³.
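One simple way to express a regional contribution of this kind is as the relative drop in SNP-h² when the region is excluded. The sketch below assumes that definition. The text does not state exactly how the percentages were derived, so the function and its argument names are illustrative only.

```python
def mhc_share_percent(h2_autosomal, h2_without_mhc):
    """Percentage of autosomal SNP-h2 lost when the extended MHC is removed.

    Assumes the regional share is (total - remainder) / total; this is an
    illustrative definition, not necessarily the one used in the study.
    """
    if h2_autosomal <= 0:
        raise ValueError("autosomal SNP-h2 must be positive")
    return 100.0 * (h2_autosomal - h2_without_mhc) / h2_autosomal
```

With hypothetical inputs of 0.5 for the full autosomal estimate and 0.4 after exclusion, the function reports a share of about 20%.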

Table 2 Contribution of autosomal, autosomal with extended MHC removed (exMHC) and ChrX variations to pAID heritability (h²).

Disease prediction using support vector machines (SVMs)

Given that we observed relatively high rates of heritability across many of the pAIDs, we evaluated the utility of common genomic variations in predicting pAID disease risk, using an SVM model-based approach. Using a tenfold cross-validation study design, we built a linear SVM model using the top GWAS signals observed in nine out of ten of the total samples and tested this SVM predictor in the remaining 10% of the samples. Based on previous analyses in both case–control²⁴ and quantitative traits²⁵, we expect disease prediction accuracy to behave as a function of heritability, sample size and the number of causal variants. We assessed the mean and maximum area under the receiver operating characteristic curve (AUC) achieved, showing that our SVM predictor was most effective for JIA and T1D (AUC_max>0.9; AUC_mean>0.85), although satisfactory results were also seen in CEL (AUC_max>0.8 and AUC_mean>0.7). These findings are consistent with those recently reported by Speed et al. using an independent adult CEL cohort²⁶. The predictability of all nine pAIDs was fairly robust to the range of P-value thresholds used for selecting SNP predictors in building the SVM model (Fig. 3 and Supplementary Table 11).
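The mean and maximum AUC across cross-validation folds can be summarized as in the sketch below. It covers only the evaluation step: the classifier scores are assumed to come from whatever model was fitted on the training folds, and the function names are hypothetical. AUC is computed here as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half.

```python
def auc(scores, labels):
    """Area under the ROC curve from predictor scores and 0/1 labels."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(cases) * len(controls))


def summarize_folds(fold_results):
    """Return (mean AUC, max AUC) over a list of (scores, labels) folds."""
    aucs = [auc(scores, labels) for scores, labels in fold_results]
    return sum(aucs) / len(aucs), max(aucs)
```

The pairwise loop is quadratic in fold size. A rank-based computation would be preferable for large test folds, but the direct form keeps the definition explicit.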

**Figure 3: Disease prediction using a support vector machine model.**

Estimation of pairwise co-heritability across pAIDs

To investigate diseases with shared underlying genetic risk factors, we assessed the genetic correlation (rG) for each pair of pAIDs and between each of the nine pAIDs and EPI, which provided a comparative baseline for non-significant genetic correlation²⁰. We used both a strict (P_BS) and a more relaxed (P_BL) Bonferroni correction to adjust for either 45 comparisons (all pairwise combinations) or 9 comparisons (combinations per pAID; see Methods). We observed the highest rG between UC and CD (rG=+0.66; P_BS<0.001), consistent with the reported sharing of association loci by several published GWAS, immunochip and fine-mapping studies^11,27,28,29 (Supplementary Table 10). We also noted a positive rG between common variable immunodeficiency disorder (CVID) and JIA (rG=+0.34), although it was more modest (P_BL<0.01). We did observe a marginally positive rG for CD and T1D, consistent with results from a published GWAS meta-analysis³⁰, although it did not reach significance at the liberal Bonferroni threshold (rG=+0.096; P_BL=0.17). Of note, we did not observe a significant reduction in rG estimates when the extended MHC was entirely removed from the analysis across any of the pAID pairs, making it unlikely that the sharing of common HLA alleles could significantly account for the degree of co-heritability observed (Fig. 2b).
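The two correction factors follow from counting comparisons among ten phenotypes (nine pAIDs plus EPI): all unordered pairs give 45 tests, whereas each single pAID is paired with nine other phenotypes. The sketch below shows this arithmetic together with a plain Bonferroni adjustment. The helper name is hypothetical and the P-values passed to it would be the unadjusted ones.

```python
from math import comb

N_PHENOTYPES = 10  # nine pAIDs plus the EPI comparator

# Strict correction: every unordered pair of phenotypes.
N_STRICT = comb(N_PHENOTYPES, 2)
# Liberal correction: the comparisons involving one given pAID.
N_LIBERAL = N_PHENOTYPES - 1


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

An unadjusted P-value therefore has to be below 0.05/45 (about 0.0011) to pass the strict threshold, but only below 0.05/9 (about 0.0056) to pass the liberal one.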

**Figure 2: Prevalence of AI disease co-morbidities and estimates of genetic correlation (co-heritability) across pAIDs.**

Discussion

To our knowledge, this is the most comprehensive assessment of heritability and disease prediction using genome-wide dense genotyping data across multiple pAIDs. The results show that SNP-h² estimates were significantly higher for the pAID cohorts as compared with those obtained for the non-immune-mediated disease EPI (Fig. 1a and Supplementary Tables 1 and 2). Among the pAIDs examined where the SNP-h² estimates were at least nominally significant (P<0.05), T1D and JIA were the most highly heritable (Fig. 1b). These results are in keeping with the SNP-h² estimates reported for T1D and rheumatoid factor-positive (RF+) rheumatoid arthritis (RA) in adults, using the Wellcome Trust Case Control Consortium data sets^17,26,31. Considerably weaker SNP-h² estimates were observed for UC and CD, consistent with previous reports in adults³² (Supplementary Fig. 1A). Although the sample size of CD was several fold greater than those of T1D and UC, and twice that for JIA, the SNP-h² estimates are lower in CD, suggesting that environmental factors play a much larger role in CD disease aetiology (Fig. 1d). This finding is in keeping with studies demonstrating a key role for the gut microbiome and faecal flora in disease onset and severity in the IBDs^11,33,34.

As noted, the SNP-h² observed for JIA was high despite the known heterogeneous nature of this disease, including seven distinct JIA subtypes³⁵. Little is known about the heritability of JIA as it is fairly uncommon. However, in RA, the more common JIA counterpart in adults, a range of SNP-h² estimates has been reported^17,26,31,36. Some of the heterogeneity in SNP-h² estimates for RA may be attributable to the different ratios of RF+ vs RF− patients across different study cohorts, as recent analyses suggest that RF+ RA may be ‘distinct’ from RF− forms of RA in terms of genetic aetiology³⁷. Moreover, the subphenotype of JIA that is most similar to RF+ RA (i.e. RF+ JIA) made up only a small component of our JIA cohort (4.9%). Thus, the high estimated heritability observed in JIA suggests that despite the heterogeneous clinical findings, there may be a strongly shared genetic component contributing to a common aetiology.

We observed relatively low SNP-h² estimates for SLE (Fig. 1b). Although these estimates are lower than those reported by So et al.,^38,39 they are higher than the POP-h² reported based on sibling recurrence⁴⁰. These observations are consistent with strong environmental and epigenetic components to SLE liability^41,42. We included in our analysis a non-immune-mediated disease, early-onset EPI, as a comparator cohort. As expected, the SNP-h² estimate on the liability scale, albeit non-zero, was relatively low compared with any of the AI diseases. That we observed slightly higher heritability estimates across our paediatric cohorts than previously reported in adults is also in keeping with the notion that paediatric-onset diseases reflect disease aetiologies with a stronger genetic component²⁹ and less confounding due to the reduced timespan of environmental exposure(s). Adult or late-onset AI diseases can be associated with environmental precipitating factors such as viral infections or drug exposures, which have been implicated in a range of AI diseases including T1D, CEL and SLE^3,42.

Although estimates for JIA and T1D are higher than SNP-h² estimates reported previously, our estimates for RA and T1D are more consistent with, although still falling short of, those reported by population estimates from twin-based or familial studies (Supplementary Table 2). That these SNP-h²-based estimates still generally fall short of estimates made from epidemiological studies illustrates the ‘missing heritability’ phenomenon. Disparities between POP-h² and SNP-h² estimates may be at least partially attributable to inflation of population-based estimates in the presence of ascertainment bias and/or insufficient adjustment for confounding effects. The latter tends to occur if there are significant non-additive or shared environmental factors that contribute to phenotypic variation^36,43.

A number of previous epidemiological and genetic studies have suggested a significant degree of shared risk across AI diseases^44,45,46,47. There are a number of reasons why our results may differ from these reports. In such population-based studies, observed sharing of risk in the population is inevitably confounded by common environmental factors or gene–environment interactions, neither of which would be parsed out from purely epidemiological observations. In addition, it can be challenging to perform these comparisons in heterogeneous populations because they may be composed of different underlying genetic backgrounds, and genetic ancestry is known to dramatically affect the risk for many AI diseases (for example, greater risk of CEL and JIA in Caucasians)^4,20.

Although there are several prior large-scale analyses of genetic sharing among AI diseases using GWAS data, these are based on somewhat different analytical approaches or study methodology than those employed here. A notable example comes from Cotsapas et al., who derived a Cross-Phenotype Meta-Analysis test statistic that powerfully combines multiple independent AI data sets to analyse the likelihood that a SNP is shared across disease phenotypes. They applied this test statistic to the 140 top genetic risk variants reported previously by GWAS across seven AI diseases⁴⁷. Although there is no doubt that findings from this study are informative, the targeted candidate approach has clear limitations and only summary statistics were available. Another concern, which is not unique to the study by Cotsapas et al. but applies to most large GWAS meta-analyses, is inter-study heterogeneity: these studies often combine summary data obtained from independent case–control study cohorts accrued and genotyped across North America and Europe using different genotyping platforms and QC/analysis steps, requiring post-hoc statistical adjustments for heterogeneity, genetic variation and the use of SNP proxies. Although single-institution study designs can have limited applicability, in our study, using a common shared control cohort accrued in the same institution and genotyped on the same platform does limit the effect of inter-study heterogeneity in our analysis.

As expected, we found that variations across the extended MHC strongly contributed to both heritability estimates and disease risk predictability in T1D and CEL, and more modestly in UC and JIA. The contribution of the extended MHC to total phenotypic variance explained correlated with the strength of the strongest association signal within the extended MHC. However, as recent reports have shown, this method for estimating h² is sensitive to the variation in linkage-disequilibrium (LD) across the genome^18,31. We therefore examined the effect of LD on the SNP-h² estimates by comparing the results with those obtained using non-correlated SNP markers (Supplementary Fig. 2B; see Methods for details). As anticipated, the effect of the pruning is mostly attributable to the strong role of the MHC in the heritability of these diseases, as pruning had little effect on the heritability estimates once the extended MHC was removed. Thus, the number and degree of LD for the input SNPs used for calculating h² can be important for diseases where the MHC plays a major role, consistent with previous studies^31,48,49.
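Selecting non-correlated SNP markers is typically done by LD pruning. The sketch below is a simplified greedy version, assuming genotypes coded as 0/1/2 allele counts and an r² threshold supplied by the caller. It compares every SNP against all previously kept SNPs instead of using the sliding windows that production tools apply, and it is not the procedure described in the paper's Methods, which are not reproduced here.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNPs carry no LD information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def prune_by_ld(genotypes, max_r2):
    """Greedily keep SNP indices whose r2 with every kept SNP is <= max_r2."""
    kept = []
    for i, snp in enumerate(genotypes):
        if all(r_squared(snp, genotypes[j]) <= max_r2 for j in kept):
            kept.append(i)
    return kept
```

In a region of long-range LD such as the MHC, a procedure of this kind discards a large share of markers, which is consistent with pruning having its main effect there.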

The SNP-h² for T1D was most strongly affected by the removal of the extended MHC, emphasizing the importance of MHC polymorphisms in T1D pathogenesis. In addition, the estimates for SPA and CEL both fell significantly when markers across the MHC were excluded from further analysis. The relatively limited contribution of the genetic polymorphisms across the MHC to heritability in IBD was consistent with prior GWAS results, as the MHC SNP (rs1626392, P<2.27 × 10⁻⁷) most significantly associated with CD did not reach genome-wide significance, defined as P<5 × 10⁻⁸ (Supplementary Table 8). Aside from the MHC, recent work has examined the degree to which functional or coding loci, for example, DNAse I Hypersensitivity Sites⁵⁰, contribute to disease heritability. Such studies, currently underway, will help delineate biological functions and connect genetic associations with mechanistic roles of such functional variants.

A still unrealized, but much anticipated goal of personalized medicine is to utilize genomic data to accurately predict disease risk^26,51,52,53. We found that for the three pAIDs (T1D, JIA and CEL) that were most predictable, a range of P-value thresholds (P<1 × 10⁻⁶ and P<1 × 10⁻⁸) could be used to identify the predictive SNPs without significantly impacting the maximum or mean AUC achieved, suggesting that the SVM model was robust to this parameter (Fig. 3 and Supplementary Table 11). In comparison, we obtained fairly modest AUCs for CD, UC and CVID (AUC_max>0.7, AUC_mean>0.65). These results are in keeping with our expectation that genetic prediction should rest on underlying genetic heritability and confirm the value of SNP heritability analysis.

Indeed, the above observations are perhaps not surprising, given recent findings that support a strong contribution for environmental factors in disease susceptibility. For example, host-microbial interactions have been implicated in the pathogenesis of IBD and RA^11,54. Furthermore, in CVID, it is well-established that although genetic risk factors play a role in disease risk, there is significant within-disease heterogeneity in terms of aetiology. Patients with CVID are often diagnosed in late adolescence, suggesting that environmental risk factors play a greater role. Likewise, most cases of paediatric-onset IBD also have a post-pubescent age of onset. This is in contrast to T1D, JIA or CEL, which are commonly diagnosed by or before the age of 12 years, although some degree of variability is observed. This is consistent with the correlations noted above, in that the three diseases with more moderate SNP-h² estimates were also less predictable.

Among the three largest cohorts, namely JIA, UC and CD, CD was by far the largest. However, the heritability estimated for CD in our data set was the lowest of the three. As we know from prior studies that disease prediction is a function of heritability, sample size and the number of causal variants, we might expect the accuracy of disease prediction for CD to be relatively poor. This is exactly what we observed. In contrast, we had somewhat limited sample sizes for SPA, PS and CEL cohorts, and we caution against the interpretation of the high heritability estimates observed for PS. Another limitation of the present study is that we have not considered the role of rare, or potentially de novo, variants in the overall estimates of genetic heritability. As more sequencing data using either whole-exome or whole-genome approaches become available, future studies will help address this question.

A unique opportunity provided by our cohort was the ability to quantify pairwise pAID genetic correlations, as numerous epidemiological analyses have shown that subsets of pAIDs co-cluster in families or exhibit high rates of comorbidity^55,56,57 (Table 3). As pAID co-heritability has not been systematically examined using genome-wide SNP data, we aimed to identify pAIDs showing significantly positive rG (that are consequently co-heritable) versus diseases that are either genetically unrelated or negatively correlated (and are consequently ‘mutually-protective’). We calculated the rG for each pAID pair and between each of the nine diseases and EPI. This latter analysis provides a ‘control’ or contextual baseline, akin to the inclusion of CD as a ‘null comparator’ phenotype by the Psychiatric Genomics Consortium²⁰. We observed a strongly positive rG for UC-CD and a moderately positive rG for JIA-CVID.

Table 3 pAID joint heritabilities or genetic correlation (rG) reaching nominal significance.

Although the MHC made major contributions to the disease-specific heritability, we found no evidence that variations across the MHC significantly contributed to the pAID co-heritability for any of the investigated disease pairs (Fig. 2b). For the pAID pairs with significantly positive rG values (UC-CD and JIA-CVID), we did not observe a significant reduction in rG estimates when the SNPs within the MHC were removed from the analysis, making it unlikely that genetic sharing of MHC haplotypes can explain the genetic correlation observed among pAIDs in this data set (Fig. 2b). In addition, the finding that the UC-CD and JIA-CVID pairs had the two largest positive rG values is consistent with results we obtained using an independent genome-wide pairwise sharing metric for genetic correlation, in which we considered all genome-wide SNP markers except those within the extended MHC locus (Fig. 2c, see Methods for details). Although this may appear surprising given the known association with the MHC across all pAIDs, these results are in keeping with our finding that the most significant MHC association signals identified for each pAID were disease-specific and did not overlap across the nine pAIDs (Supplementary Table 5).

Somewhat unexpectedly, we observed a marginally negative rG across several pAID pairs, including SLE-CD, SPA-CD, PS-UC and PS-T1D. Although none of these was significant following a Bonferroni correction, in each of the negatively correlated pAID pairs, one of the two diseases is considered a ‘classic autoimmune’ disease (that is, SLE, UC, and T1D), whereas the other pAID in the pair (that is, CD and PS) has been noted to have a strong ‘inflammatory’ component.

Taken together, we report genome-wide SNP genotype-derived heritability estimates and genetic correlations of disease liability across pairs of nine investigated pAIDs using common and low-frequency genetic variants. We contextualized these findings alongside a comprehensive review of available literature and epidemiological data sets, illustrated a method for quantifying genetic risk factor sharing across pAIDs and provided considerations for how such genetic data can aid in disease prediction. We observed that SNP-h² estimates in paediatric AI diseases tend to be greater in magnitude than the SNP-h² reported previously based on GWAS data from studies of adult AI disease cohorts, particularly for T1D, UC and JIA/RA. Moreover, we also observed that the ‘co-heritability’ across pAIDs was minimally attributable to shared MHC variations. While genomic screening in the general population on a large scale is not currently feasible, or of high utility (given the low disease prevalence and, consequently, limited positive predictive value as well as the limitations in interpretability), our analysis suggests that there is a high heritability and disease predictability across the pAIDs. Future studies in larger sample sizes and in adult cohorts will be helpful in validating these results and developing new and improved methods for genome-based disease prediction and for the development of novel biomarkers that can be used to predict pAID risk.

Methods

Study population

Information regarding the patient cohorts has been published previously and is summarized briefly below.

Cases and controls were either directly ascertained as described in prior studies^{29,53,58,59,60,61,62,63,64,65} or obtained from de-identified samples and associated electronic medical records (EMRs) residing in the genomics biorepository at the Children’s Hospital of Philadelphia. EMR searches were conducted using previously described algorithms^58,59 based on phenotype mapping established using PheWAS ICD-9 code mapping tables^53,58,60 in consultation with qualified physician specialists for each disease cohort. All DNA samples were assessed for QC and genotyped on the Illumina HumanHap550 or HumanHap610 platforms at the Center for Applied Genomics (CAG) at the Children’s Hospital of Philadelphia (CHOP, Philadelphia, Pennsylvania, USA). Note that the patient counts below refer to the total recruited sample size from which we excluded non-qualified samples/genotypes that did not pass QC criteria required for inclusion in the genetic analysis (for example, because of relatedness or poor genotyping rate; see details below).

The IBD cohort comprised 2,796 individuals aged 2–17 years of European ancestry with biopsy-proven disease, including 1,931 with CD and 865 with UC, excluding all patients with unclassified type (IBD-U). Affected individuals were recruited from multiple centres from four geographically discrete countries and diagnosed before their nineteenth birthday according to the standard IBD diagnostic criteria, as previously reported^3,29.

The T1D cohort consisted of 1,120 cases from nuclear family trios (one affected child and two parents), including 267 independent Canadian T1D cases collected in paediatric diabetes clinics in Montreal, Toronto, Ottawa and Winnipeg (Canada) and 203 T1D cases recruited at CHOP since September 2006. All patients were Caucasians by self-report and ranged in age between 3 and 17 years, with 7.9 years being the median age at onset. All patients have been treated with insulin since diagnosis. Disease diagnosis was based on these clinical criteria, rather than any laboratory tests.

The JIA cohort was recruited in the United States of America, Australia and Norway and comprised a total of 1,123 patients with onset of arthritis at <16 years of age. JIA diagnosis and JIA subtype were determined according to the International League of Associations for Rheumatology revised criteria³⁵ and confirmed using the JIA Calculator software⁶⁶ (http://www.jra-research.org/JIAcalc/), an algorithm-based tool adapted from the International League of Associations for Rheumatology criteria. Before standard QC procedures and exclusion of non-European ancestry, the JIA cohort comprised 464 case subjects from Texas Scottish Rite Hospital for Children (Dallas, Texas, USA) and the Children’s Mercy Hospitals and Clinics (Kansas City, Missouri, USA) of self-reported European ancestry; 196 subjects from the CHOP; 221 subjects from the Murdoch Childrens Research Institute (Royal Children’s Hospital, Melbourne, Australia) and 504 subjects from the Oslo University Hospital (Oslo, Norway).

The CVID study population consisted of 223 patients from the Mount Sinai School of Medicine (New York City, New York, USA); 76 patients from the University of Oxford (Oxford, England); 47 patients from the CHOP and 27 patients from the University of South Florida (Tampa, Florida). The diagnosis in each case was validated against the ESID/PAGID diagnostic criteria, as previously described⁶⁷. Although the diagnosis of CVID is most commonly made in young adults (aged 20–40 years), all of the CHOP and University of South Florida cases had paediatric age of onset disease, whereas the majority of the cases from the Mount Sinai School of Medicine and Oxford had onset in young adulthood. We note that, as the number of individuals with adult-onset CVID disease is so small (less than 5% of all cases presented) and all ten diseases have paediatric age of onset disease, we have elected to refer to the cohort material as pAIDs.

The remaining paediatric AI disease samples (SPA, PS, CEL and SLE) were accrued by our biorepository at the CHOP, which includes over 60,000 paediatric patients recruited and enrolled by the Center for Applied Genomics at CHOP. These individuals were ascertained as having a confirmed diagnosis of SPA, PS, CEL or SLE at 1–17 years of age at the time of diagnosis and were required to fulfil the clinical criteria for the respective disorder, as confirmed by a specialist. Only cases confirmed on EMR search to have at least two in-person visits, at least one of which carried the specified ICD-9 diagnosis code(s), were pursued for clinical confirmation (see Supplementary Table 12 for ICD-9 inclusion and exclusion codes). We used ICD-9 codes previously identified and utilized for PheWAS or EMR-based GWAS^59,60 and agreed upon by board-certified physicians.

Age- and gender-matched control subjects, including the EPI cohort of both generalized and focal idiopathic EPI (ICD-9 345.9 and 345.4, respectively), were identified from the CHOP-CAG biobank and ascertained by exclusion of any patient with any ICD-9 codes for disorders of autoimmunity or immunodeficiency⁵⁸ (http://eicd9.com/). Research Ethics Boards at the CHOP and at each of the collaborating centres, including the Mount Sinai School of Medicine, University of Oxford, University of South Florida, the Children’s Mercy Hospitals and Clinics, Texas Scottish Rite Hospital for Children, Murdoch Children’s Research Institute, Oslo University Hospital, Cincinnati Children's Hospital Medical Center, McGill University, RCCS ‘Casa Sollievo della Sofferenza’, University of Toronto, University of Edinburgh, Emory University, University of Naples ‘Federico II’, Cedars Sinai Medical Center, Yorkhill Hospital for Sick Children, University of Miami Miller School of Medicine, Careggi University Hospital, University of Utah School of Medicine and Primary Children's Medical Center, approved this study.

Written informed consent was obtained from all subjects (or their legal guardians). Genomic DNA extraction and sample QC before and following genotyping were performed using standard methods⁶¹. To minimize confounding because of population stratification, we focused only on individuals of European ancestry, as determined by both self-reported ancestry and principal component analysis (PCA), in the present study (see below and Supplementary Fig. 4).

Genotyping and QC

All samples were genotyped at the CAG on the HumanHap550 or 610 BeadChip arrays. Although some published analyses using GWAS data to derive heritability estimates have applied whole-genome imputation because of the presence of samples with non-matching platforms, this is not ideal given (i) added risk of artefacts and (ii) consequent variations in coverage (genotyping density) across the genome. Without adding significant additional information, this can result in biased heritability estimates unless careful corrections are made to apply additional down-sampling/weighting of more densely imputed regions^31,48,49. As over 90% of the markers on the two arrays are shared, whole-genome imputation was not necessary and we utilized only the set of directly overlapping genotyped SNPs in the analysis (∼500,000).

After extracting the overlapping SNPs from the two platforms, SNPs with a low genotyping rate (<95%), low MAF (<0.01) or significant departure from the expected Hardy–Weinberg equilibrium (P<0.01) were excluded. Samples with a low average genotyping call rate (<95%) or determined by PCA to be outliers from European ancestry (any of the top ten principal components (PCs) >6.0 standard deviations, as reported by SMARTPCA/EIGENSTRAT⁶⁸) were removed. In addition, one of each pair of distantly related individuals, as determined by Identity-by-State analysis (>0.05), was excluded, such that the largest sample size would be retained in the final cohort.
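As an illustration, the three marker-level filters described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the function name `snp_passes_qc` and its genotype-count inputs are hypothetical, and the Hardy–Weinberg check uses a 1-d.f. χ² goodness-of-fit test, whereas dedicated tools often use an exact test.

```python
import math

def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.95, min_maf=0.01, min_hwe_p=0.01):
    """Apply call-rate, MAF and HWE filters to one SNP (hypothetical helper).

    n_aa, n_ab, n_bb: genotype counts among called samples.
    n_missing: number of samples with no genotype call.
    """
    called = n_aa + n_ab + n_bb
    total = called + n_missing
    if called == 0 or called / total < min_call_rate:
        return False

    # Allele frequencies and minor allele frequency.
    p = (2 * n_aa + n_ab) / (2 * called)
    q = 1.0 - p
    if min(p, q) < min_maf:
        return False

    # Chi-square goodness-of-fit test against HWE expectations (assumption:
    # chi-square rather than an exact test).
    expected = (p * p * called, 2 * p * q * called, q * q * called)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))
    return hwe_p >= min_hwe_p
```

The default thresholds mirror the values stated in the text (95% call rate, MAF 0.01, HWE P 0.01); sample-level filters such as PCA outlier removal are not covered by this sketch.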

Web-based access to all novel data included in this manuscript is available through our website at http://www.caglab.org.

Population stratification correction

The final cohort, following all above-noted QC, included a total of 4,956 pAID cases across nine pAIDs and 27,451 population-matched controls, as well as a cohort consisting of 819 cases of paediatric-onset EPI. To avoid confounding, we assigned individuals fitting the diagnosis criteria for two or more pAIDs to the smaller disease cohort by sample size. No individual was included twice. To ensure that the markers tested across the cohorts were consistent, we included only SNPs that passed all QC criteria (461,301 SNPs). The filtered SNPs were tested in cases and controls for association with disease and used for the estimation of the genetic relationship matrix (see below). We used logistic regression to estimate ORs/betas, 95% confidence intervals and P-values for trend, using additive coding for genotypes (0, 1 or 2 minor alleles). We adjusted for gender and population stratification by including binary gender and the first ten PCs (GCTA) from the PCA, calculated from a set of 100,000 pruned SNPs, as covariates in the logistic regression analyses⁶⁹. From the results of the association testing, we determined the genomic inflation for each disease-specific case–control cohort with common controls. All disease-specific, case–control GWAS had λ_GC values at or below 1.04 with the exception of CD (1.09), consistent with that previously reported for this data set²⁹. Final counts from each pAID cohort, included controls and genomic inflation calculated from median χ² association test statistics are reported in Table 1.
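The genomic inflation factor mentioned above is conventionally the median observed χ² association statistic divided by the median of a χ² distribution with one degree of freedom (about 0.455). The following Python sketch shows that calculation starting from two-sided P-values; the function name `genomic_inflation` is hypothetical, and recovering χ² statistics from P-values is an assumption made for illustration.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(p_values):
    """Estimate lambda_GC from two-sided association P-values in (0, 1]."""
    nd = NormalDist()
    # A two-sided P-value p corresponds to |z| = -inv_cdf(p / 2); chi2 = z**2.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

For P-values that are uniformly distributed, as expected under the null hypothesis, the returned value is close to 1; values noticeably above 1 suggest residual stratification or other inflation.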

Estimation of the variance components for each pAID

Only individuals and SNPs that passed all QC metrics were used to estimate the variance components for the ten diseases (nine pAIDs and one non-pAID condition, EPI). For disease-specific analysis, the common set of controls was used for each case–control analysis cohort, after excluding individuals related up to the fifth degree. The genetic relationship between individuals was estimated using (i) all autosomal SNPs, (ii) all autosomal SNPs excluding the extended MHC (chr6:26.5–34 Mb) and (iii) SNPs found only on the X-chromosome (ChrX).
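To make the genetic relationship estimate concrete, the sketch below computes one off-diagonal entry of a SNP-based genetic relationship matrix using the standardized form published for GCTA, A_jk = (1/N) Σ_i (x_ij − 2p_i)(x_ik − 2p_i) / (2p_i(1 − p_i)), together with a helper that flags SNPs inside the extended MHC interval given above. The function names and the string chromosome label are hypothetical, and this is an illustration of the definition rather than the software's implementation.

```python
def in_extended_mhc(chrom, pos_bp):
    """True if a SNP lies in chr6:26.5-34 Mb (positions in base pairs)."""
    return chrom == "6" and 26_500_000 <= pos_bp <= 34_000_000

def grm_entry(geno_j, geno_k, allele_freqs):
    """Genetic relationship between individuals j and k.

    geno_j, geno_k: per-SNP counts (0, 1 or 2) of the reference allele.
    allele_freqs: per-SNP frequency p_i of that same allele.
    """
    total = 0.0
    n_used = 0
    for x_j, x_k, p in zip(geno_j, geno_k, allele_freqs):
        if p <= 0.0 or p >= 1.0:
            continue  # monomorphic SNPs carry no information
        total += (x_j - 2 * p) * (x_k - 2 * p) / (2 * p * (1 - p))
        n_used += 1
    if n_used == 0:
        raise ValueError("no polymorphic SNPs available")
    return total / n_used
```

Dropping SNPs for which `in_extended_mhc` is true before calling `grm_entry` corresponds to option (ii) in the paragraph above.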

We applied the previously described linear mixed model method for estimating whole-genome SNP-based heritability using both common and low-frequency variants, which is implemented in the software GCTA. We estimated the genetic variance associated with genome-wide SNPs on the observed scale (SNP-heritability or SNP-h²)⁷⁰, conditioning on the top 20 ancestry PCs, which were also obtained using GCTA from a pruned set of ∼100,000 independent SNPs across the same data set (that is, PLINK --indep-pairwise 50 10 0.2). As our phenotypes are dichotomous traits, we subsequently transformed these results to the liability scale based on the approximate disease prevalence observed at our centre for each trait (Table 1). Note that the total control sample size utilized varied slightly, as we optimized our analysis to maximize the retained sample size when conservatively removing distantly related individuals during QC. Because rare variants (MAF<0.01) were excluded, they do not contribute to the heritability estimates.
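The observed-to-liability transformation can be illustrated with the widely used formula of Lee et al. (listed in the references): h²_liability = h²_observed × [K(1 − K)]² / [z² P(1 − P)], where K is the population prevalence, P is the proportion of cases in the sample and z is the height of the standard normal density at the liability threshold. The sketch below assumes this is the transformation applied; the function name and the example inputs are hypothetical and are not values from Table 1.

```python
from statistics import NormalDist

def liability_scale_h2(h2_observed, prevalence, case_fraction):
    """Convert observed-scale SNP-h2 to the liability scale.

    prevalence: population prevalence K of the disease (0 < K < 1).
    case_fraction: proportion P of cases in the analysed sample (0 < P < 1).
    """
    nd = NormalDist()
    threshold = nd.inv_cdf(1.0 - prevalence)  # liability threshold t
    z = nd.pdf(threshold)                     # normal density at t
    k, p = prevalence, case_fraction
    return h2_observed * (k * (1 - k)) ** 2 / (z ** 2 * p * (1 - p))

# Hypothetical example: observed-scale estimate 0.2, prevalence 0.5%,
# and a sample in which 10% of individuals are cases.
example = liability_scale_h2(0.2, 0.005, 0.10)
```

Because the result depends strongly on the assumed prevalence, the same observed-scale estimate can map to quite different liability-scale values for rare and common diseases.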

Joint heritability across pAID pairwise combinations

We estimated the genetic correlation in disease risk for each pAID pair using a bivariate linear mixed model, as described previously¹⁷. For each pairwise analysis, the pooled control samples passing QC were randomly allocated evenly to the two diseases and the top 20 PCs were again included as covariates. By jointly analysing a pair of cohorts, this approach estimates both the SNP-h² of liability to each disease and the SNP-genetic correlation (rG) between these liabilities. We determined the significance of the rG using a likelihood ratio test by fixing the genetic correlation at zero¹⁷. Significantly positive (or negative) rG’s should reflect a shared (or disparate) genetic background, as a positive (negative) rG means that the correlation in the genetic variance components is higher (lower) between case subjects than between the case subjects and the respective control cohorts.

Genome-wide pairwise sharing analysis

We applied a novel test to detect the presence of SNPs anywhere in the genome that are simultaneously associated with each of two diseases; these SNPs are the genetic risk factors shared by that pair of pAIDs. Most existing tests require choosing a significance threshold to determine which SNPs are associated with which disease, but it is unknown how best to choose this threshold. Our method is threshold-free and requires no tuning parameters. Specifically, for any two diseases D₁ and D₂, we converted the P-values for all SNPs in the genome into Z-scores, denoted X_j for D₁ and Y_j for D₂ at SNP j.

The test statistic, γ, used to detect genetic sharing between two diseases is

γ = max_{1≤j≤n} min(|X_j|, |Y_j|),

which is the maximum of the pairwise minima of the signals across all of the n SNPs. The rationale is that if SNP j is associated with both D₁ and D₂, the magnitudes of both X_j and Y_j should be large. The more shared SNPs there are, the greater the likelihood that the maximum of the pairwise minimal values will be large. Under the null hypothesis that any genetic sharing is due only to chance, γ should be relatively small. We can obtain the P-value of this statistic by permuting the labels of the Z-scores relative to each other in order to simulate the null hypothesis. In fact, these P-values can be calculated analytically using a hypergeometric distribution, and no actual permutation is needed. Note that no significance threshold is required. This test was performed for all 45 pairwise pAID combinations (hence, the reported P-values are Bonferroni-adjusted for 45 independent tests).
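The max–min statistic and a hypergeometric P-value consistent with the description above can be sketched in Python. Several details here are assumptions rather than the authors' stated implementation: the conversion |Z| = −Φ⁻¹(P/2) from two-sided P-values, the function names, and the specific analytic form, which is the probability that at least one SNP exceeds the observed γ in both lists when one list is randomly relabelled.

```python
from statistics import NormalDist

def p_to_abs_z(p):
    """Assumed conversion of a two-sided P-value in (0, 1] to |Z|."""
    return -NormalDist().inv_cdf(p / 2.0)

def max_min_statistic(x, y):
    """gamma: maximum over SNPs of min(|X_j|, |Y_j|)."""
    return max(min(abs(a), abs(b)) for a, b in zip(x, y))

def sharing_p_value(x, y):
    """Return (gamma, P-value) under random relabelling of y against x."""
    n = len(x)
    gamma = max_min_statistic(x, y)
    a = sum(1 for v in x if abs(v) >= gamma)  # SNPs reaching gamma for D1
    b = sum(1 for v in y if abs(v) >= gamma)  # SNPs reaching gamma for D2
    # Hypergeometric probability that the two sets share no SNP.
    p_none = 1.0
    for i in range(b):
        if n - a - i <= 0:
            p_none = 0.0
            break
        p_none *= (n - a - i) / (n - i)
    return gamma, 1.0 - p_none

def bonferroni(p, n_tests=45):
    """Bonferroni adjustment for the 45 pairwise comparisons."""
    return min(1.0, p * n_tests)
```

No significance threshold enters the calculation; the counts `a` and `b` are determined by the observed γ itself, which is why the procedure needs no tuning parameter.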

Disease prediction using a linear SVM

Given that we observed relatively high heritability across many of the pAIDs, we sought to evaluate the utility of genome-wide SNP data in predicting pAID disease liability, using a previously described SVM pipeline that can be applied to GWAS results for a dichotomous trait⁵².

We identified SNPs to be used as predictors based on the strength of association with a given disease in a training set, testing graded P-value thresholds (P<1 × 10⁻⁵, 1 × 10⁻⁶, 1 × 10⁻⁷, 1 × 10⁻⁸, 1 × 10⁻⁹) for selecting SNP predictors, where the P-value is derived from the case–control association testing using samples in the training data set. We used each set of SNPs passing the tested threshold to then train the linear SVM model.

We then validated the SVM model by testing the accuracy of disease liability predictions for each of the nine pAIDs in the remaining independent sample set. We reported the prediction performance as the mean and maximum AUC achieved in both the training and validation sets (Fig. 3 and Supplementary Table 11).
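For readers unfamiliar with the reported metric, the sketch below computes the AUC as the probability that a randomly chosen case receives a higher prediction score than a randomly chosen control, and then summarizes AUCs across P-value thresholds as a mean and a maximum. The function names, the 0/1 label coding and the idea of keying results by threshold are assumptions made for illustration; the SVM training step itself is not shown.

```python
def auc(scores, labels):
    """Area under the ROC curve from scores and 0/1 labels (1 = case)."""
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Count case-control pairs ranked correctly; ties count as half.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def summarize_auc(auc_by_threshold):
    """Return (mean AUC, maximum AUC) across P-value thresholds."""
    values = list(auc_by_threshold.values())
    return sum(values) / len(values), max(values)
```

An AUC of 0.5 corresponds to chance-level prediction and 1.0 to perfect separation of cases from controls. The pairwise loop is quadratic in sample size, so a rank-based formulation would be preferable for large validation sets.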

Additional information

How to cite this article: Li, Y. R. et al. Genetic sharing and heritability of paediatric age of onset autoimmune diseases. Nat. Commun. 6:8442 doi: 10.1038/ncomms9442 (2015).

References

Anaya, J.-M., Gómez, L. & Castiblanco, J. Is there a common genetic basis for autoimmune diseases? Clin. Dev. Immunol. 13, 185–195 (2006).
Rojas-Villarraga, A., Amaya-Amaya, J., Rodriguez-Rodriguez, A., Mantilla, R. D. & Anaya, J.-M. Introducing polyautoimmunity: secondary autoimmune diseases no longer exist. Autoimmune Dis. 2012, 254319 (2012).
Lettre, G. & Rioux, J. D. Autoimmune diseases: insights from genome-wide association studies. Hum. Mol. Genet 17, R116–R121 (2008).
Nunes, T., Fiorino, G., Danese, S. & Sans, M. Familial aggregation in inflammatory bowel disease: is it genes or environment? World J. Gastroenterol. 17, 2715–2722 (2011).
Cooper, J. D. et al. Seven newly identified loci for autoimmune thyroid disease. Hum. Mol. Genet. 21, 5202–5208 (2012).
Tsoi, L. C. et al. Identification of 15 new psoriasis susceptibility loci highlights the role of innate immunity. Nat. Genet. 44, 1341 (2012).
Hinks, A. et al. Dense genotyping of immune-related disease regions identifies 14 new susceptibility loci for juvenile idiopathic arthritis. Nat. Genet. 45, 664–669 (2013).
Liu, J. Z. et al. Dense fine-mapping study identifies new susceptibility loci for primary biliary cirrhosis. Nat. Genet. 44, 1137 (2012).
Liu, J. Z. et al. Dense genotyping of immune-related disease regions identifies nine new risk loci for primary sclerosing cholangitis. Nat. Genet. 45, 670–675 (2013).
Eyre, S. et al. High-density genetic mapping identifies new susceptibility loci for rheumatoid arthritis. Nat. Genet. 44, 1336 (2012).
Jostins, L. et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 491, 119 (2012).
Beecham, A. H. et al. Analysis of immune-related loci identifies 48 new susceptibility variants for multiple sclerosis. Nat. Genet. 45, 1353–1360 (2013).
Zhernakova, A. et al. Detecting shared pathogenesis from the shared genetics of immune-related diseases. Nat. Rev. Genet. 10, 43–55 (2009).
Parkes, M., Cortes, A., van Heel, D. A. & Brown, M. A. Genetic insights into common pathways and complex relationships among immune-mediated diseases. Nat. Rev. Genet. 14, 661–673 (2013).
Visscher, P. M., Hill, W. G. & Wray, N. R. Heritability in the genomics era--concepts and misconceptions. Nat. Rev. Genet. 9, 255–266 (2008).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Lee, S. H., Yang, J., Goddard, M. E., Visscher, P. M. & Wray, N. R. Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood. Bioinformatics 28, 2540–2542 (2012).
Lee, S. H., Wray, N. R., Goddard, M. E. & Visscher, P. M. Estimating missing heritability for disease from genome-wide association studies. Am. J. Hum. Genet. 88, 294–305 (2011).
Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010).
Lee, S. H. et al. Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs. Nat. Genet. 45, 984–994 (2013).
Jia, X. et al. Imputing amino acid polymorphisms in human leukocyte antigens. PLoS ONE 8, e64683 (2013).
Schaffner, S. F. The X chromosome in population genetics. Nat. Rev. Genet. 5, 43–51 (2004).
Gottipati, S., Arbiza, L., Siepel, A., Clark, A. G. & Keinan, A. Analyses of X-linked and autosomal genetic variation in population-scale whole genome sequencing. Nat. Genet. 43, 741–743 (2011).
Daetwyler, H. D., Villanueva, B. & Woolliams, J. A. Accuracy of predicting the genetic risk of disease using a genome-wide approach. PLoS ONE 3, e3395 (2008).
Lee, S. H. & Wray, N. R. Novel genetic analysis for case-control genome-wide association studies: quantification of power and genomic prediction accuracy. PLoS ONE 8, e71494 (2013).
Speed, D. & Balding, D. J. MultiBLUP: improved SNP-based prediction for complex traits. Genome Res 24, 1550–1557 (2014).
Franke, A. et al. Genome-wide meta-analysis increases to 71 the number of confirmed Crohn’s disease susceptibility loci. Nat. Genet. 42, 1118–1125 (2010).
Barrett, J. C. et al. Genome-wide association study of ulcerative colitis identifies three new susceptibility loci, including the HNF4A region. Nat. Genet. 41, 1330–1334 (2009).
Imielinski, M. et al. Common variants at five new loci associated with early-onset inflammatory bowel disease. Nat. Genet. 41, 1335–1340 (2009).
Wang, K. et al. Comparative genetic analysis of inflammatory bowel disease and type 1 diabetes implicates multiple loci with opposite effects. Hum Mol Genet 19, 2059–2067 (2010).
Speed, D., Hemani, G., Johnson, M. R. & Balding, D. J. Improved heritability estimation from genome-wide SNPs. Am. J. Hum. Genet. 91, 1011–1021 (2012).
Chen, G.-B. et al. Estimation and partitioning of (co)heritability of inflammatory bowel disease from GWAS and immunochip data. Hum. Mol. Genet 23, 4710–4720 (2014).
Cho, I. & Blaser, M. J. The human microbiome: at the interface of health and disease. Nat. Rev. Genet. 13, 260–270 (2012).
Márquez, A. et al. Specific association of a CLEC16A/KIAA0350 polymorphism with NOD2/CARD15(-) Crohn’s disease patients. Eur. J. Hum. Genet. 17, 1304–1308 (2009).
Petty, R. E. et al. International League of Associations for Rheumatology classification of juvenile idiopathic arthritis: second revision, Edmonton, 2001. J. Rheumatol. 31, 390–392 (2004).
Zaitlen, N. et al. Using extended genealogy to estimate components of heritability for 23 quantitative and dichotomous traits. PLoS Genet. 9, e1003520 (2013).
Han, B. et al. Fine mapping seronegative and seropositive rheumatoid arthritis to shared and distinct HLA alleles by adjusting for the effects of heterogeneity. Am. J. Hum. Genet. 94, 522–532 (2014).
So, H.-C., Gui, A. H. S., Cherny, S. S. & Sham, P. C. Evaluating the heritability explained by known susceptibility variants: a survey of ten complex diseases. Genet. Epidemiol. 35, 310–317 (2011).
So, H.-C., Yip, B. H. K. & Sham, P. C. Estimating the total number of susceptibility variants underlying complex diseases from genome-wide association studies. PLoS ONE 5, e13898 (2010).
Harley, J. B. et al. Genome-wide association scan in women with systemic lupus erythematosus identifies susceptibility variants in ITGAM, PXK, KIAA1542 and other loci. Nat. Genet. 40, 204–210 (2008).
Kamen, D. L. Environmental influences on systemic lupus erythematosus expression. Rheum. Dis. Clin. North Am 40, 401–412 vii (2014).
Mok, C. C. & Lau, C. S. Pathogenesis of systemic lupus erythematosus. J. Clin. Pathol. 56, 481–490 (2003).
Manolio, T. A. et al. Finding the missing heritability of complex diseases. Nature 461, 747–753 (2009).
Rzhetsky, A., Wajngurt, D., Park, N. & Zheng, T. Probing genetic overlap among complex human phenotypes. Proc. Natl Acad. Sci. USA 104, 11694–11699 (2007).
Eaton, W. W., Rose, N. R., Kalaydjian, A., Pedersen, M. G. & Mortensen, P. B. Epidemiology of autoimmune diseases in Denmark. J. Autoimmun. 29, 1–9 (2007).
Cooper, G. S., Bynum, M. L. K. & Somers, E. C. Recent insights in the epidemiology of autoimmune diseases: improved prevalence estimates and understanding of clustering of diseases. J. Autoimmun. 33, 197–207 (2009).
Cotsapas, C. et al. Pervasive sharing of genetic effects in autoimmune disease. PLoS Genet. 7, e1002254 (2011).
Speed, D. et al. SNP-based heritability analysis with dense data. Am. J. Hum. Genet. 93, 1155–1157 (2013).
Lee, S. H. et al. Estimation of SNP heritability from dense genotype data. Am. J. Hum. Genet. 93, 1151–1155 (2013).
Gusev, A. et al. Partitioning heritability of regulatory and cell-type-specific variants across 11 common diseases. Am. J. Hum. Genet. 95, 535–552 (2014).
Kang, J., Kugathasan, S., Georges, M., Zhao, H. & Cho, J. H. Improved risk prediction for Crohn’s disease with a multi-locus approach. Hum. Mol. Genet 20, 2435–2442 (2011).
Mittag, F. et al. Use of support vector machines for disease risk prediction in genome-wide association studies: Concerns and opportunities. Hum. Mutat. 33, 1708–1718 (2012).
Liao, K. P. et al. Electronic medical records for discovery research in rheumatoid arthritis. Arthritis Care Res. (Hoboken) 62, 1120–1127 (2010).
Scher, J. U. et al. Expansion of intestinal Prevotella copri correlates with enhanced susceptibility to arthritis. Elife 2, e01202 (2013).
Ramos, P. S. et al. A comprehensive analysis of shared loci between systemic lupus erythematosus (SLE) and sixteen autoimmune diseases reveals limited genetic overlap. PLoS Genet. 7, e1002406 (2011).
Tait, K. F. et al. Clustering of autoimmune disease in parents of siblings from the Type 1 diabetes Warren repository. Diabet. Med. 21, 358–362 (2004).
Lin, J.-P. et al. Familial clustering of rheumatoid arthritis with other autoimmune diseases. Hum. Genet. 103, 475–482 (1998).
Denny, J. C. et al. PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics 26, 1205–1210 (2010).
Ritchie, M. D. et al. Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. Am. J. Hum. Genet. 86, 560–572 (2010).
Liao, K. P. et al. Associations of autoantibodies, autoimmune risk alleles, and clinical diagnoses from the electronic medical records in rheumatoid arthritis cases and non-rheumatoid arthritis controls. Arthritis Rheum. 65, 571–581 (2013).
Hakonarson, H. et al. A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene. Nature 448, 591 (2007).
Kugathasan, S. et al. Loci on 20q13 and 21q22 are associated with pediatric-onset inflammatory bowel disease. Nat. Genet. 40, 1211–1215 (2008).
Orange, J. S. et al. Genome-wide association identifies diverse causes of common variable immunodeficiency. J. Allergy Clin. Immunol. 127, 1360–1367 e6 (2011).
Behrens, E. M. et al. Association of the TRAF1-C5 locus on chromosome 9 with juvenile idiopathic arthritis. Arthritis Rheum. 58, 2206–2207 (2008).
Grant, S. F. et al. Association of the BANK 1 R61H variant with systemic lupus erythematosus in Americans of European and African ancestry. Appl. Clin. Genet. 2, 1–5 (2009).
Behrens, E. M. et al. Evaluation of the presentation of systemic onset juvenile rheumatoid arthritis: data from the Pennsylvania Systemic Onset Juvenile Arthritis Registry (PASOJAR). J. Rheumatol. 35, 343–348 (2008).
Conley, M. E., Notarangelo, L. D. & Etzioni, A. Diagnostic criteria for primary immunodeficiencies. Representing PAGID (Pan-American Group for Immunodeficiency) and ESID (European Society for Immunodeficiencies). Clin. Immunol. 93, 190–197 (1999).
Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904 (2006).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Nasu, Y. et al. Trichostatin A, a histone deacetylase inhibitor, suppresses synovial inflammation and subsequent cartilage destruction in a collagen antibody-induced arthritis mouse model. Osteoarthr. Cartil. 16, 723–732 (2008).

Acknowledgements

We thank the patients and their families for their participation in the genotyping studies and in the Biobank Repository at the Center for Applied Genomics. We are also thankful for the contributions of the Italian IBD Group, including Cucchiara S (Roma), Lionetti P (Firenze), Barabino G (Genova), de Angelis GL (Parma), Guariso G (Padova), Catassi C (Ancona), Lombardi G (Pescara), Staiano AM (Napoli), De Venuto D (Bari), Romano C (Messina), D'incà R (Padova), Vecchi M (Milano), Andriulli A and Bossa F (S. Giovanni Rotondo). Y.R.L. is supported by the Paul and Daisy Soros Fellowship for New Americans and an NIH F30 Individual NRSA Training Grant (1F30AR066486). This study was supported by Institutional Development Funds from the Children’s Hospital of Philadelphia, and by DP3DK085708, RC1AR058606, U01HG006830, the Crohn’s and Colitis Foundation, the Juvenile Diabetes Research Foundation and a grant from the LRI to E.L.P.

Author information

Authors and Affiliations

Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, 19104, Pennsylvania, USA
Yun R. Li, Jin Li, Jonathan P. Bradfield, Maede Mohebnasab, Laura Steel, Debra J. Abrams, Frank D. Mentch, Joseph T. Glessner, Yiran Guo, Zhi Wei, John J. Connolly, Christopher J. Cardinale, Marina Bakay, Dong Li, S. Melkorka Maggadottir, Kelly A. Thomas, Haijun Qui, Rosetta M. Chiavacci, Cecilia E. Kim, Fengxiang Wang, James Snyder, Struan F. A. Grant, Patrick M. A. Sleiman, Brendan J. Keating & Hakon Hakonarson
Medical Scientist Training Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104, Pennsylvania, USA
Yun R. Li
Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, 61820, Illinois, USA
Sihai D. Zhao
Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104, Pennsylvania, USA
Julie Kobie & Hongzhe Li
Department of Computer Science, New Jersey Institute of Technology, Newark, 07103, New Jersey, USA
Zhi Wei
Division of Allergy and Immunology, Children’s Hospital of Philadelphia, Philadelphia, 19104, Pennsylvania, USA
S. Melkorka Maggadottir & Kathleen E. Sullivan
Department of Rheumatology, Oslo University Hospital, Rikshospitalet, 0372, Oslo, Norway
Berit Flatø & Øystein Førre
Division of Gastroenterology, Center for Inflammatory Bowel Disease, Cincinnati Children's Hospital Medical Center, Cincinnati, 45229, Ohio, USA
Lee A. Denson
Division of Rheumatology, Cincinnati Children’s Hospital Medical Center, Cincinnati, 45229, Ohio, USA
Susan D. Thompson
Division of Rheumatology and Division of Clinical Pharmacology, Toxicology, and Therapeutic Innovation, Children’s Mercy-Kansas City, Kansas City, 64108, Missouri, USA
Mara L. Becker
Department of Pediatrics, University of Utah School of Medicine and Primary Children's Medical Center, Salt Lake City, 84113, Utah, USA
Stephen L. Guthery
Division of Gastroenterology, RCCS ‘Casa Sollievo della Sofferenza’, San Giovanni Rotondo, 71013, Italy
Anna Latiano
Division of Pediatric Allergy and Immunology, University of Miami Miller School of Medicine, Miami, 33136, Florida, USA
Elena Perez
Department of Medicine, Institute of Immunology, Icahn School of Medicine at Mount Sinai, Mount Sinai Hospital, New York, 10029, New York, USA
Elena Resnick & Charlotte Cunningham-Rundles
Department of Translational Medical Science, Section of Pediatrics, University of Naples "Federico II", Naples, 80138, Italy
Caterina Strisciuglio, Annamaria Staiano & Erasmo Miele
IBD Centre, Mount Sinai Hospital, University of Toronto, 441-600 University Avenue, Toronto, Ontario, M5G 1X5, Canada
Mark S. Silverberg
Department of Immunology, Oslo University Hospital, Rikshospitalet, 0027, Oslo, 0372, Norway
Benedicte A. Lie
Texas Scottish Rite Hospital for Children, Dallas, 750219, Texas, USA
Marilynn Punaro
Yorkhill Hospital for Sick Children, Glasgow, G38SJ, Scotland
Richard K. Russell
Paediatric Gastroenterology and Nutrition, Royal Hospital for Sick Children, Edinburgh and Child Life and Health, University of Edinburgh, Edinburgh, EH9 1UW, UK
David C. Wilson
Departments of Pediatrics and Common Disease Genetics, Cedars Sinai Medical Center, Los Angeles, 90048, California, USA
Marla C. Dubinsky
Department of Pathology, Children’s Hospital of Philadelphia, Philadelphia, 19104, Pennsylvania, USA
Dimitri S. Monos
Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104, Pennsylvania, USA
Dimitri S. Monos, Edward M. Behrens, Kathleen E. Sullivan, Struan F. A. Grant, Patrick M. A. Sleiman, Robert N. Baldassano, Brendan J. Keating & Hakon Hakonarson
Department of Medical and Surgical Specialties, Unit of Gastroenterology, Careggi University Hospital, Viale Pieraccini 18, Florence, 50139, Italy
Vito Annese
Paediatric Rheumatology Unit, Royal Children’s Hospital, Parkville, 3052, Victoria, Australia
Jane E. Munro
Arthritis and Rheumatology Research, Murdoch Childrens Research Institute, Parkville, 3052, Victoria, Australia
Jane E. Munro
Sarah M. and Charles E. Seay Center for Musculoskeletal Research, Texas Scottish Rite Hospital for Children, Dallas, 750219, Texas, USA
Carol Wise
Department of Clinical Immunology, Nuffield Department of Medicine, University of Oxford, OX1 1NF, UK
Helen Chapel
Department of Pediatric Medicine, Section of Immunology, Allergy, and Rheumatology, Texas Children’s Hospital, Houston, 77030, Texas, USA
Jordan S. Orange
Division of Rheumatology, Children’s Hospital of Philadelphia, Philadelphia, 19104, Pennsylvania, USA
Edward M. Behrens
Department of Pediatrics, Emory University School of Medicine and Children’s Health Care of Atlanta, Atlanta, 30329, Georgia, USA
Subra Kugathasan
Hospital for Sick Children, University of Toronto, 555 University Avenue, Toronto, M5G 1X8, Ontario, Canada
Anne M. Griffiths
Division of Medical Sciences, Gastrointestinal Unit, School of Molecular and Clinical Medicine, University of Edinburgh, Western General Hospital, Edinburgh, EH4 2XU, UK
Jack Satsangi
Department of Pediatrics, Nemours Children’s Hospital, Orlando, 32827, Florida, USA
Terri H. Finkel
Departments of Pediatrics and Human Genetics, McGill University, Montreal, H3H 1P3, Quebec, Canada
Constantin Polychronakos
Division of Gastroenterology, Children’s Hospital of Philadelphia, Philadelphia, 19104, Pennsylvania, USA
Robert N. Baldassano
Department of Pathology and Lab Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104, Pennsylvania, USA
Eline T. Luning Prak
Genes, Environment and Complex Disease, Murdoch Childrens Research Institute, Parkville, 3052, Victoria, Australia
Justine A. Ellis
Department of Paediatrics, University of Melbourne, Parkville, 3052, Victoria, Australia
Justine A. Ellis
Division of Pulmonary Medicine, Children’s Hospital of Philadelphia, Philadelphia, 19104, Pennsylvania, USA
Hakon Hakonarson

Authors

Yun R. Li
Sihai D. Zhao
Jin Li
Jonathan P. Bradfield
Maede Mohebnasab
Laura Steel
Julie Kobie
Debra J. Abrams
Frank D. Mentch
Joseph T. Glessner
Yiran Guo
Zhi Wei
John J. Connolly
Christopher J. Cardinale
Marina Bakay
Dong Li
S. Melkorka Maggadottir
Kelly A. Thomas
Haijun Qui
Rosetta M. Chiavacci
Cecilia E. Kim
Fengxiang Wang
James Snyder
Berit Flatø
Øystein Førre
Lee A. Denson
Susan D. Thompson
Mara L. Becker
Stephen L. Guthery
Anna Latiano
Elena Perez
Elena Resnick
Caterina Strisciuglio
Annamaria Staiano
Erasmo Miele
Mark S. Silverberg
Benedicte A. Lie
Marilynn Punaro
Richard K. Russell
David C. Wilson
Marla C. Dubinsky
Dimitri S. Monos
Vito Annese
Jane E. Munro
Carol Wise
Helen Chapel
Charlotte Cunningham-Rundles
Jordan S. Orange
Edward M. Behrens
Kathleen E. Sullivan
Subra Kugathasan
Anne M. Griffiths
Jack Satsangi
Struan F. A. Grant
Patrick M. A. Sleiman
Terri H. Finkel
Constantin Polychronakos
Robert N. Baldassano
Eline T. Luning Prak
Justine A. Ellis
Hongzhe Li
Brendan J. Keating
Hakon Hakonarson

Contributions

Y.R.L. and H.H. were leading contributors in the design, analysis and writing of this study. D.J.A., M.M. and L.S. contributed to data collection and literature review. B.F., Ø.F., L.A.D., S.D.T., M.L.B., S.L.G., A.L., E.P., E.R., C.S., A.S., E.M., M.S.S., B.A.L., M.P., R.K.R., D.C.W., H.C., C.C.-R., J.S.O., E.M.B., K.E.S., S.K., A.M.G., J.S., T.F., C.P., R.N.B. and J.A.E. contributed samples and phenotypes. F.D.M., K.A.T., H.Q., R.M.C., C.E.K., F.W. and J.S. assisted with sample genotyping and data processing. J.K., S.D.Z., J.P.B., J.L. and H.L. contributed to, advised on and supervised the statistical analysis. E.T.L.P., J.A.E. and B.J.K. assisted in composing and revising the manuscript. All authors read, edited and approved the manuscript.

Corresponding author

Correspondence to Hakon Hakonarson.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Figures 1-4, Supplementary Tables 1-12 and Supplementary References. (PDF 2195 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Li, Y., Zhao, S., Li, J. et al. Genetic sharing and heritability of paediatric age of onset autoimmune diseases. Nat Commun 6, 8442 (2015). https://doi.org/10.1038/ncomms9442

Received: 14 February 2015
Accepted: 21 August 2015
Published: 09 October 2015
DOI: https://doi.org/10.1038/ncomms9442

This article is cited by

Variable immunodeficiency score upfront analytical link (VISUAL), a proposal for combined prognostic score at diagnosis of common variable immunodeficiency
- Kissy Guevara-Hoyer
- Adolfo Jiménez-Huete
- Silvia Sánchez-Ramón
Scientific Reports (2021)
Moving towards a molecular taxonomy of autoimmune rheumatic diseases
- Guillermo Barturen
- Lorenzo Beretta
- Marta E. Alarcón-Riquelme
Nature Reviews Rheumatology (2018)
GWAS for male-pattern baldness identifies 71 susceptibility loci explaining 38% of the risk
- Nicola Pirastu
- Peter K. Joshi
- James F. Wilson
Nature Communications (2017)
Indications to Epigenetic Dysfunction in the Pathogenesis of Common Variable Immunodeficiency
- William Rae
Archivum Immunologiae et Therapiae Experimentalis (2017)
Rare phenotypes in the understanding of autoimmunity
- Yvonne Zeissig
- Britt‐Sabina Petersen
- Sebastian Zeissig
Immunology & Cell Biology (2016)
