Introduction

Immune-mediated diseases (IMDs) are a clinically heterogeneous group of immune system diseases.1 Diseases such as psoriatic arthritis (PsA), coeliac disease (CeD), rheumatoid arthritis (RA), psoriasis (Ps), primary biliary cirrhosis (PBC), spondyloarthropathies (SpA), juvenile idiopathic arthritis (JIA), type 1 diabetes (T1D), autoimmune thyroid disease (AITD) and inflammatory bowel diseases (IBDs) have very different clinical presentations and appear to be unrelated, but they have been recognized as sharing common pathogenic mechanisms.2, 3, 4, 5, 6 Most IMDs are epidemic in highly developed industrialized countries and rank in the top 10 causes of death in women under 65 years of age in the United States of America.7 The National Institutes of Health has estimated that over 23.5 million Americans are afflicted with IMDs, which affect ~5–7% of Western populations.2, 8, 9 As more IMDs are identified, the expected overall prevalence is also rising.10

Knowledge of the pathogenic mechanisms of IMDs remains limited, and they are believed to occur as a consequence of an imbalance in the complex interplay between genetic and environmental factors.11 The marked emergence of IMDs in highly developed industrialized countries in the latter half of the 20th century suggests that environmental factors are involved in the onset of IMDs.7 Many environmental determinants of some IMDs, such as cigarette smoking, appendectomy, urbanization, pollution, diet, antibiotic use, hygiene status, socioeconomic status and microbial exposure, have been identified, but whether the relationship between those factors and IMDs is causal remains to be established.7, 12, 13, 14

IMDs are considered to share a substantial portion of their heritable aetiology.15 First, epidemiologic observations have indicated that IMDs can co-occur in either the same individual or in closely related family members, and second, clinical data have shown that the same therapies exhibit similar efficacies across diseases.3, 16, 17, 18, 19 The best known and undeniable hallmark of IMDs is their association with shared HLA haplotypes; however, variation at this locus cannot explain all of the familial risk.20 Over the past decade, significant progress has been made in the identification of genetic factors in IMDs, with associations found at convincing levels of statistical significance for over 200 common or rare variants.20, 21 These discoveries have been driven largely by genome-wide association studies (GWAS), which provide hypothesis-free surveys of the human genome for common variants associated with disease susceptibility.22, 23 The number of variants found to be associated with each disease from the GWAS data is increasing every day; however, these allelic variants together account for a relatively small share of risk, explaining between 20 and 50% of heritability.24, 25 The introduction of the Immunochip, which is a single-nucleotide polymorphism (SNP) microarray designed for deep replication and fine mapping, represents a step forward because it allows for the identification of genes more strictly related to the immune/inflammatory pathways.26 Many studies on IMD genetics using the Immunochip have successfully identified a significant number of common genetic variants associated with multiple IMDs and have highlighted several phenotype-specific SNPs.25 The sharing of genetic loci across IMDs is not only common but also complex.

In this review, we provide an overview of advances in the genetics of various IMDs, focusing on the overlapping genetic factors shared among multiple IMDs and some disease-specific genes that exert relatively large effects. Comparing and contrasting the overlapping genetic factors between GWAS and the Immunochip studies allowed us to gain insight into the underlying biological mechanisms of these genes associated with pathogenesis of IMDs. Studying disease-specific genetics also allowed us to better understand the notion that genotype determines phenotype. We also review the ethnic differences in genetic factors and discuss what the next steps should be for the immune genetics field to leverage these observations into a more complete understanding of disease mechanisms.

Overview of genomic landscape of IMDs

Familial clustering of multiple diseases, epidemiological co-occurrence and the similar efficacy of therapies across diseases suggest that genetic factors predispose individuals to IMDs.17, 18, 27, 28 Studies have compared the concordance rates of IMDs in monozygotic twins and dizygotic twins and showed that the disease concordance was much higher in monozygotic twins compared with that in dizygotic twins.29, 30, 31 In addition, the recurrence risk for a patient’s sibling is larger compared with that for a sibling of an individual from a healthy population.27 These results indicate that genetic factors are important in the development of IMDs.
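The two comparisons described above can be written down as simple ratios. The following is a minimal sketch, not taken from the studies cited: the function names are hypothetical, the counts and risks are made-up illustrative values, and the sibling recurrence risk ratio (often written λs) is a standard genetic-epidemiology quantity that the text describes but does not name.

```python
def pairwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Fraction of twin pairs (with at least one affected twin) in which both are affected."""
    total = concordant_pairs + discordant_pairs
    if total == 0:
        raise ValueError("at least one twin pair is required")
    return concordant_pairs / total


def sibling_recurrence_ratio(sibling_risk: float, population_prevalence: float) -> float:
    """Risk to a patient's sibling divided by the risk in the general population."""
    if population_prevalence <= 0:
        raise ValueError("population prevalence must be positive")
    return sibling_risk / population_prevalence


# Illustrative (hypothetical) numbers only, not data from any cited study.
mz = pairwise_concordance(concordant_pairs=30, discordant_pairs=70)
dz = pairwise_concordance(concordant_pairs=5, discordant_pairs=95)
lambda_s = sibling_recurrence_ratio(sibling_risk=0.05, population_prevalence=0.005)

print(f"MZ concordance: {mz:.2f}, DZ concordance: {dz:.2f}")
print(f"Sibling recurrence risk ratio: {lambda_s:.1f}")
```

A monozygotic concordance well above the dizygotic one, or a ratio well above 1, is the pattern that the twin and sibling studies are reported to show.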

The major histocompatibility complex (MHC) loci were the first reported genetic loci associated with IMDs and remain the most strongly associated genes for most IMDs.25 MHC is an extremely gene-dense region with hundreds of immunologically active genes.27 More than 28% of the expressed transcripts from the MHC region are believed to have essential functions in the human immune system, and the strongest association with IMDs maps to the HLA region.27, 32 However, variations in the MHC loci cannot entirely explain familial risk, and other genetic determinants must also be shared in complex patterns of pathogenesis.20

Before the introduction of GWAS (pre-2006), the identification of disease-predisposing genes was based on candidate gene approaches and family-based linkage studies.27 Although some studies successfully clarified the role of the CTLA4, PTPN22 and NOD2 genes in IMDs, in most cases these approaches were unsuccessful because of limited knowledge of disease processes and the lack of a map of common genetic variation.

After the completion of the Human Genome Project and International HapMap Project and the introduction of array-based approaches, GWAS allowed for the testing of every gene in the human genome for association with a disease or trait of interest.27, 33, 34 Hundreds of loci harbouring risk alleles for a variety of IMDs have been identified by GWAS, and one of the most striking observations from the GWAS results is the overlap in signals between different diseases.27, 35 There are also some disease-specific genetic loci that often exert relatively large effects.25 Although the number of variants found to be associated with IMDs is increasing daily, they can explain <50% of the genetic variation for each IMD. It is likely that the GWAS have missed many rare or population-specific variants that contribute to heritability.25

The Immunochip array is expected to reveal even more of the shared genetic factors because it allows for identification of genes that are more strictly related to the immune/inflammatory pathways.24, 27 With the Immunochip, pathways shared by IMDs and specific genetic variants associated with different clinical features are beginning to emerge.25 The Immunochip array has been used in CeD, IBD, PBC, Ps, systemic sclerosis, Takayasu arteritis and RA investigations, and it has already revealed several interesting observations.36, 37, 38, 39, 40, 41, 42

Functional data, such as gene expression and protein–protein interaction data, can be integrated with GWAS loci to help identify the potentially pathogenic cell types associated with IMDs.43, 44, 45, 46, 47 Genetically engineered mouse models can reproduce phenotypes similar to IMDs and help to prioritize candidate genes in GWAS loci.27 Isis Ricaño-Ponce and Cisca Wijmenga extracted all the lead SNPs from studies performed in Caucasian individuals, together with their perfect proxies, and successfully annotated SNPs associated with regulatory sequences as follows: 7.6% of the variants mapped to promoter histone marks, 18.8% mapped to enhancer histone marks, 32.1% were present at DNase-hypersensitive sites, 42.3% changed motifs and 14.4% mapped to protein-bound regions.

Overlapping loci between diseases

The notion that IMDs share some of their genetic background is now well accepted, and the most striking observation from GWAS is the overlap in signals between different diseases.16, 25, 27, 48 The GWAS on 12 IMDs revealed that 68 non-HLA loci are shared across IMDs and that 11 SNPs are shared by more than four diseases.25 The introduction of the Immunochip array is expected to reveal even more of the shared genetic factors, and it allows for the identification of genes more strictly related to the immune/inflammatory pathways.24, 25, 27 A good example is provided by Van Sommeren et al.,49 who assessed shared disease pathways by performing an extensive pathway analysis of protein–protein interactions and cotranscriptional analysis and identified 370 protein–protein interactions that are shown to cluster in specific biological pathways.

The sharing of genetic loci across IMDs is not only common but also complex. For example, a shared SNP can confer increased risk for more than one disease (‘correlated and concordant’); a risk allele may appear to impart risk for some diseases but is protective against other diseases (‘correlated but opposite’); or different haplotypes may be implicated in a shared locus (‘non-correlated’).25 Here, we summarize the overlapping genetic loci and pathways that are involved in IMDs.
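The three sharing patterns can be made concrete with a small classifier. This is only a sketch under stated assumptions: the class and function names are hypothetical, the SNP identifiers and odds ratios are invented placeholders, and 'different haplotypes' is approximated here by different lead SNPs, whereas a real analysis would compare linkage disequilibrium between the signals.

```python
from dataclasses import dataclass


@dataclass
class Association:
    disease: str
    lead_snp: str       # lead SNP reported at the shared locus (placeholder IDs below)
    effect_allele: str  # allele for which the odds ratio is reported
    odds_ratio: float   # > 1 means risk, < 1 means protective


def classify_sharing(a: Association, b: Association) -> str:
    """Label the sharing pattern of one locus between two diseases."""
    # Assumption: distinct lead SNPs stand in for distinct haplotypes.
    if a.lead_snp != b.lead_snp:
        return "non-correlated"
    # Express both effects relative to the same allele of the biallelic SNP.
    or_b = b.odds_ratio if a.effect_allele == b.effect_allele else 1.0 / b.odds_ratio
    same_direction = (a.odds_ratio > 1) == (or_b > 1)
    return "correlated and concordant" if same_direction else "correlated but opposite"


# Hypothetical values for illustration only.
d1 = Association("disease 1", "rsA", "A", 1.5)
d2 = Association("disease 2", "rsA", "A", 1.2)
d3 = Association("disease 3", "rsA", "A", 0.8)
d4 = Association("disease 4", "rsB", "T", 1.4)

print(classify_sharing(d1, d2))
print(classify_sharing(d1, d3))
print(classify_sharing(d1, d4))
```

Aligning both odds ratios to the same allele before comparing directions matters because studies may report the effect for either allele of a SNP.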

Major histocompatibility complex

The MHC is an extremely gene-dense region with long-range linkage disequilibrium and hundreds of immunologically active genes, including the HLA genes.25 The association of individual diseases with multiple MHC genes has long been suspected, and more than 28% of the expressed transcripts from the MHC region are believed to have essential functions in the human immune system.27, 32 Not surprisingly, the strongest association with IMDs maps to the HLA region.27 A good example is that the HLA-B27 genotype exists in 90% of ankylosing spondylitis (AS) patients and in only 2–8% of healthy Europeans.50

Some of the HLA alleles associated with IMDs are present in several IMDs. For example, seropositive diseases are typically associated with HLA class II alleles as follows: CeD, T1D, AITD and systemic lupus erythematosus (SLE) are associated with the HLA-DR3-DQ2 haplotype; and RA is associated with different alleles of HLA-DRB1.51, 52 Extended haplotypes and strong linkage disequilibrium make it difficult to identify independent association signals within the HLA haplotype and its region.27

Although classic HLA alleles constitute most of the associations across IMDs, the non-HLA genes in the MHC region are also likely to contribute to IMD heritability.25 Yang et al.53 analysed epidemiologic studies on C4A or C4B deficiencies in human SLE and found that C4A deficiencies were present in 40–60% of SLE patients from almost all ethnic groups and races investigated. The co-occurrence of IMDs is attributed to the influence of shared HLA haplotypes, but variation at this locus cannot entirely explain familial risk, as other genetic determinants must also be shared in complex patterns of overlap.16, 20

The IL-12 family

The interleukin-12 (IL-12) family comprises four heterodimeric cytokines: IL-12, IL-23, IL-27 and IL-35.11 Their respective receptors have key roles in immune responses, and a surprising number of autoimmune diseases (AIDs) have been found to be associated with genetic variation in this gene family.48 Based on the association of AIDs with the various IL-12 regions, two major clusters can be distinguished.

The first cluster shows strong associations with IL-23R, and the majority are associated with the IL-12B gene region as well.25 This result indicates an important pathogenic role for the T-helper type 17 (Th17) pathway and, possibly, the T-helper type 1 (Th1) pathway. Crohn’s disease (CrD), ulcerative colitis (UC), Ps, PsA, AS and RA are all encompassed in this cluster.11 IL-23 has an important role in amplifying and stabilizing Th17 cells,54, 55 and the IL-23R gene appears to be the gene with the most consistent associations in the first cluster of AIDs. The majority of the IL-23R SNPs associated with these diseases protect against the development of these diseases, suggesting that the presence of these SNPs decreases the function of the proinflammatory Th17 cells.56 Another IL-23R variant (rs10889677) located in the 3′-untranslated region is associated with an increased disease susceptibility to IBD.56 IL-12 is produced by antigen-presenting, phagocytic and B cells in response to infection and promotes the development of naive CD4+ T cells into proinflammatory Th1 cells, which secrete the proinflammatory cytokine interferon-γ.57 The polymorphisms in the 3′-untranslated region of the IL-12B gene include the TaqI restriction fragment length polymorphism at position 1188 (rs3212227), which is associated with multiple AIDs.58 Moreover, genetic variations in the coding region of murine IL-12b resulted in more efficient binding of the IL-12p40 variant to the IL-12p35 and IL-23p19 subunits, likely because of conformational changes resulting from differential glycosylation induced by the polymorphism.59

The second cluster is associated with polymorphisms in the IL-12A gene region, suggesting an important pathogenic role for IL-12 and/or IL-35 in the Th1/IL-35 pathway.25 PBC, CeD, multiple sclerosis (MS) and Graves’ disease are all encompassed in this cluster.11 Increased IL-12p35 mRNA levels in the small intestinal mucosa of CeD patients correlate with the rs9811792 risk allele located ~10 kb proximal to the start codon of the IL-12A gene.60 Another polymorphism (rs582054) located in an IL-12A intron is associated with atopic dermatitis and was shown to be inversely correlated with blood eosinophil count.61 However, it is still unknown whether altered IL-12A gene expression primarily affects IL-12 formation or the recently identified IL-35 expression.

In addition to the two clusters described above that categorize diseases based on their respective associations with the IL-12 gene cluster, there are additional IMDs that do not show associations strong enough to enable their categorization into either of the two clusters. Asthma, atopic dermatitis (AD) and T1D are classified in this group and may be associated with Th2.11 Th2 cells are involved in allergic disorders and protect against extracellular pathogens.62 The decreased IL-12 secretion in these AIDs results in impaired inhibition of Th2 cell differentiation, leading to an exaggerated Th2 response.63

GWAS and Immunochip studies have identified several SNPs in IL-12 family genes that are associated with more than one IMD. As an example of genetic overlap between IMDs, rs11209026, a missense polymorphism in IL-23R, is shared among CrD, UC, AS and Ps.11 Another example is that UC, CrD, AS and Ps have all been linked to an SNP located upstream of the 5′-untranslated region of the IL-12B gene.11

Th17 pathway

Th17 cells, a distinct subset of CD4+ T cells with IL-17 as their major cytokine, orchestrate the pathogenesis of inflammatory diseases and AIDs, and their deregulation contributes to these conditions. A further understanding of the mechanistic role of Th17 cells will shed light on therapeutic targets that can potentially be exploited for disease management. For example, the frequency of Th17 cells is higher in patients with RA compared with that in healthy controls, suggesting that a Th17/Treg (regulatory T cell) cell imbalance may contribute to the pathogenesis and progression of RA.64 In this disease, Th17 cells represent a proinflammatory subset, whereas Treg cells have an antagonistic effect. Studies have demonstrated that IL-17 upregulates the production of IL-1β and tumour necrosis factor-α (TNFα) in antigen-presenting cells in arthritic joints.65 Similarly, IL-17 and Th17 cells have a role in Behcet's disease and uveitis,66 demonstrating that Th17 cells participate in the pathogenesis of AIDs. SLE is another AID and is a complex disease of unknown aetiology; however, multiple links between SLE and Th17 have emerged.67 A strong correlation between Th17 cell expression and SLE disease activity has been found, with a higher percentage of CD3+CD4+ T cells producing IL-17 in SLE patients compared with healthy controls.68 In Sjögren syndrome (SS), IL-17 and IL-23 levels are increased in the salivary gland as well as in the serum of patients. These cytokines and their receptors were also shown to be expressed within lymphocytic infiltrates and ductal areas in the salivary glands of patients.69, 70, 71, 72, 73 There is evidence that IL-17 has a crucial pathogenic role; thus, it may serve as a potential therapeutic target for amelioration of Sjögren syndrome.
In Ps, which is a common skin disorder, IL-17A acts on keratinocytes to induce the expression of CCL20, thereby recruiting CCR6+ Th17 cells and dendritic cells to the skin74, 75 and potentially providing a positive feedback loop that results in the maintenance of these cells in psoriatic lesions.76 Experimental evidence is also available to support a role for IL-17A in other immune inflammatory diseases, including PsA, AS, IBD, CrD, MS, diabetes and asthma.77, 78, 79, 80 T1D involves CD4+ and CD8+ T-cell-mediated destruction of pancreatic β-cells, and the uncontrolled expansion of Th17 cells is involved in the pathology of the disease.81 Moreover, in patients with allergic asthma, the number of Th17 cells in peripheral blood, sputum and bronchoalveolar lavage fluids is increased compared with that in healthy controls, and levels of Th17 cells positively correlate with the severity of airway remodelling.82, 83 Overall, there is accumulating evidence that Th17 cells regulate many types of inflammatory diseases and AIDs. An understanding of the role of Th17 in these conditions will provide important insights and elucidate novel targets for therapeutic intervention.

TNF and its superfamily

TNF is one of a large group of cytokines collectively named the ‘TNF superfamily’, which includes cytokines that share molecular and functional similarities. Most of these cytokines are involved in the regulation of several steps of the biological processes related to inflammatory and immune responses.84, 85

Extensive preclinical and clinical investigations have shown that TNF has a pivotal role in the pathogenesis and pathophysiology of IMDs, which has been confirmed by the efficacy of anti-TNF biotechnological drugs, such as etanercept, infliximab and adalimumab, in the therapeutic management of these disorders.2, 86 Overexpression of TNF has been shown to promote proinflammatory processes. In particular, along with the dysregulation of other cytokines and a variety of cell types, TNF is implicated in the pathogenesis of IMDs, such as RA, CrD, Ps, PsA, SLE, T1D, MS, asthma, allergy and UC.87

In preclinical and clinical RA studies, it was shown that abnormal elevations of TNF concentrations at inflammatory sites were a primary factor in disease activity, and these observations generated the hypothesis that the removal of excess TNF from inflamed joints would confer therapeutic benefits.88 In support of these concepts, transgenic mice with an overexpression of TNF were found to spontaneously develop an arthritic pathology that displayed clinical and histological features similar to those of RA.89 In addition, in an experimental model of collagen-induced arthritis, the blockade of TNF was effective in inhibiting disease activity.90 In experimental and clinical studies of the pathogenesis of IBDs, it was suggested that TNF levels are elevated in the inflamed mucosa of patients with CrD91 and that enhanced levels of TNF are present in patients with both CrD and UC.92 TNF is also involved in the mechanisms underlying the pathogenesis of both Ps and PsA.93 With regard to Ps, TNF may be involved in the disease pathogenesis through several mechanisms, including stimulation of the maturation of Langerhans cells and dendritic cells, with skewing of lymphocyte differentiation;94 promotion of dendritic cell migration from the skin to lymph nodes;95 accumulation of leucocytes in the inflamed skin through the induction of adhesion molecules and chemokines on dermal microvascular endothelial cells, keratinocytes and dermal fibroblasts;96 induction of dermal vascular changes via the production of vascular endothelial growth factor by keratinocytes and hyperproliferation of keratinocytes;97 and induction of itching through the activation of TNF receptors on sensory nerve endings.93 In PsA, it has been shown that TNF has a primary role in the induction of inflammation and joint and bone damage through the following mechanisms: production of lytic enzymes, such as matrix metalloproteases;98 contribution to synovial vascular proliferation by the induction of angiogenic growth factors; stimulation of bone resorption; inhibition of bone formation; and inhibition of the synthesis of proteoglycans, with the subsequent occurrence of bone erosions leading to osteolysis, new bone deposition or both.99

TNF is implicated in the pathogenesis of many IMDs, such as RA, CrD, Ps, PsA, SLE, T1D, MS, asthma, allergy and UC,87 and GWAS and Immunochip studies have suggested that one reason might be the TNFα-induced protein 3 (TNFAIP3) locus. TNFAIP3, which is also known as A20 and is rapidly induced by TNF, is associated with RA, CeD, T1D, SLE, scleroderma, Ps, CrD and UC.25

Opposite alleles

The observation that some SNPs show the strongest association across IMDs but act in opposite directions indicates that the genetic architecture of immune-related diseases may be even more complex than previously thought.27 Several risk alleles appear to impart risk for some diseases but are protective against others.100, 101 A well-known example in the literature is the R620W variant in PTPN22, known to confer risk for T1D102 and the development of autoantibodies,103 RA104 and vitiligo105 but to be protective against CrD.106 Another emerging example is the rs744166 polymorphism in an intron of signal transducer and activator of transcription 3 on chromosome 17, where the A allele confers risk for CrD106 and the G allele confers risk for MS.107 As a consequence, treatments targeting one IMD may inadvertently provoke or exacerbate another.25 For example, the rs1800696 variant of TNF receptor 1 is protective against AS but increases the risk for MS.108

Currently, the mechanisms of these opposing risk effects remain unclear. These opposing allelic effects may be worth investigating, as they could help us better understand the underlying disease biology.27

Ethnicity

Genetic diversity between ethnic groups can be informative for gene mapping. For example, in Caucasians, IL-23R polymorphisms have been identified in CrD and AS. However, IL-23R has not been shown to be associated with CrD in East Asians.109 In 2009, we confirmed that rs11209026, which has been identified as the key IL-23R polymorphism associated with AS, is not polymorphic in the Han Chinese population.110

In several other IMDs, the polymorphic status of SNPs has contributed to differing levels of association between ethnic groups. PTPN22 is strongly associated with RA in Caucasians; however, it has been shown that the C1858T SNP, which is thought to be the key SNP associated with the disease in European Caucasians, is not polymorphic in East Asians, explaining the lack of association of PTPN22 with RA in these populations.111 Similarly, the NOD2/CARD15 polymorphisms associated with CrD in European Caucasian populations are not polymorphic in Asians.

These findings suggest that there are significant differences in susceptibility genes between different ethnicities, which could be exploited in an effort to distinguish true disease-associated polymorphisms from linkage disequilibrium effects.

From genotype to phenotype

Although overlapping loci between IMDs are very common, there are some convincingly phenotype-specific loci. For example, in contrast to seropositive diseases, which are typically associated with HLA class II alleles,51, 52 seronegative diseases, such as CrD, Ps and AS, are generally associated with HLA class I alleles, which tend to be disease-specific.25 HLA-B27 is confirmed to be associated with AS; HLA-Cw6 is closely related to Ps; and HLA-B51 is strongly linked to Behcet's disease.112, 113, 114, 115, 116 CrD is associated with SNPs in the MHC class I region, but it is not associated with any classical HLA allele.25, 39 There are also non-HLA examples for seronegative diseases; specifically, NOD2 (nucleotide-binding oligomerization domain-containing 2) and ATG16L1 (autophagy-related 16-like 1) are associated with CrD, and HNF4A is strongly linked to UC.117, 118, 119 Furthermore, all three are confirmed to have a vital role in the pathogenesis of CrD and UC.117, 118, 119, 120, 121, 122

There are also some phenotype-specific gene loci associated with seropositive IMDs. For example, INS (encoding insulin) was found to be associated with T1D, and TSHR (encoding thyroid-stimulating hormone receptor) was closely associated with Graves’ disease.123 PRTN3 (encoding proteinase 3) was confirmed to be strongly linked to antineutrophil cytoplasmic antibody-associated vasculitis, and PADI4 (encoding peptidyl arginine deiminase, type IV) is highly associated with RA.124, 125

Another good example was derived from TNF-overexpressing mouse models. Overexpression of human TNF results in the spontaneous development of severe systemic inflammation and destructive polysynovitis in mice, and the model phenocopies human RA rather than SpA.89 However, the blockade of DKK-1, an inhibitor of the Wnt signalling pathway, reverses the destructive phenotype of this model into a remodelling phenotype characterized by new bone formation in peripheral and sacroiliac joints, suggesting that TNF overexpression may partially phenocopy human SpA under specific conditions.89, 126, 127, 128

An interesting phenomenon is that the disease-specific gene loci often confer large effects. Given the phenotype specificity of these loci, their generally large effect sizes, and their links to specific pathogenic pathways, it is likely that they have key phenotype-determining roles in each IMD.25 Furthermore, targeting disease-specific pathways may yield effective, disease-specific therapies with less systemic toxicity.

Conclusion and future directions

In conclusion, GWAS and the Immunochip have facilitated the discovery of disease-associated SNPs in IMDs, but several limitations exist regarding the GWAS and Immunochip studies conducted to date.

First, almost all of the studies were performed on cohorts of Caucasian descent, although a few were performed on Ashkenazi Jews and Chinese, Japanese and Korean individuals. More studies should be designed to analyse different ethnic groups.

Second, a large number of shared risk SNPs have been found across IMDs, but only a few of the IMD loci seem to be unique to a single disease. Future research should also focus on these disease-specific factors, as they may yield important clues regarding disease mechanisms and may provide avenues for prevention and/or treatment.

Third, studies of epigenetic interactions would be helpful for assessing the expression of single alleles in relevant tissues. The significant insights gained from genetics will yield information on the genes and pathways that are associated with each disease. However, environmental factors or epigenetic events will alter the transition process from genotype to phenotype, which will be addressed in our next study.

Finally, more studies should focus on more personalized therapeutic approaches. Okada et al.129 performed a GWAS meta-analysis in European and Asian ancestries, identifying 98 biological candidate genes as the targets of approved therapies for RA and further suggesting that drugs approved for other indications may be repurposed for the treatment of RA. This comprehensive genetic study can provide important information for drug discovery.129