Genomic imprinting analyses identify maternal effects as a cause of phenotypic variability in type 1 diabetes and rheumatoid arthritis

Blunk, Inga; Thomsen, Hauke; Reinsch, Norbert; Mayer, Manfred; Försti, Asta; Sundquist, Jan; Sundquist, Kristina; Hemminki, Kari

doi:10.1038/s41598-020-68212-x

Article
Open access
Published: 14 July 2020

Inga Blunk ORCID: orcid.org/0000-0003-4945-3302¹,
Hauke Thomsen^2,3,
Norbert Reinsch¹,
Manfred Mayer¹,
Asta Försti^2,4,5,6,
Jan Sundquist^4,7,8,
Kristina Sundquist^4,7,8 &
…
Kari Hemminki^2,4,9

Scientific Reports volume 10, Article number: 11562 (2020)

Abstract

Imprinted genes, giving rise to parent-of-origin effects (POEs), have been hypothesised to affect type 1 diabetes (T1D) and rheumatoid arthritis (RA). However, maternal effects may also play a role. By using a mixed model that is able to simultaneously consider all kinds of POEs, the importance of POEs for the development of T1D and RA was investigated in a variance components analysis. The analysis was based on Swedish population-scale pedigree data. With P = 0.18 (T1D) and P = 0.26 (RA) imprinting variances were not significant. Explaining up to 19.00% (± 2.00%) and 15.00% (± 6.00%) of the phenotypic variance, the maternal environmental variance was significant for T1D (P = 1.60 × 10⁻²⁴) and for RA (P = 0.02). For the first time, the existence of maternal genetic effects on RA was indicated, contributing up to 16.00% (± 3.00%) of the total variance. Environmental factors such as the social economic index, the number of offspring, birth year as well as their interactions with sex showed large effects.

Introduction

The failure of the immune system to distinguish self from non-self antigens is the basis for autoimmune disorders (AIs)¹. Type 1 diabetes (T1D) is an AI that causes chronic destruction of pancreatic islet β-cells and hyperglycemia due to reduced insulin production². With the incidence said to be increasing by 3–4% yearly, more than 20 million individuals are estimated to have T1D worldwide³. Rheumatoid arthritis (RA) is associated with autoantigen presentation with antigen-specific T and B cell activation and aberrant inflammatory cytokine production. Consequences thereof include synovitis, proliferation of synovia and cartilage, and subchondral bone destruction⁴. The occurrence of RA is relatively constant and ranges between 0.5 and 1.0% in European and North-American populations⁵. The exact etiology of T1D and RA remains largely unknown¹; however, a complex interplay of genetic, environmental, and epigenetic factors is assumed^4,6,7.

With regard to genetic factors, the strongest effects have been found within the major histocompatibility complex or human leukocyte antigen system. T1D and RA show genetic overlap in terms of associations within HLA, PTPN22, CTLA4, and TAGAP⁸. Causal loci explain over 80% of T1D heritability which reportedly ranges between 40 and 92%³. For RA, associated variants outside and inside of the major histocompatibility complex region explain about 5% and 60% of the heritability^9,10. Heritability estimates range between 12 and 60%^11,12,13.

As disorder concordance rates in monozygotic twins have been observed to be less than 100%, AIs are assumed to be subject to epigenetic modifications^4,14,15,16. Perhaps the best-known example of an epigenetic phenomenon is imprinting, in which the expression of genes is either maternally or paternally inactivated. Inactivation can either be full or partial¹⁷. Partial imprinting occurs when the inactivation of alleles is not complete. For example, loci may be imprinted in a tissue-specific manner¹⁸, or the imprinting status may vary over time during the developmental stages¹⁹. As they appear as phenotypic differences between heterozygotes depending on their parental allele origin, imprinting effects belong to the class of parent-of-origin effects (POEs)²⁰. Imprinting has been identified in mammals, insects, and plants²¹. It is nevertheless assumed that less than 1% of all genes in mammals are imprinted^20,22; however, these genes have important functions in stem cells, neuronal differentiation, development, and growth^22,23. In humans, imprinted genes are associated with diseases such as Prader–Willi syndrome²⁴, Angelman syndrome²⁵, and cancer^26,27. They are also assumed to affect susceptibility to diabetes. This assumption originates from observations that T1D is preferentially expressed by children of T1D-affected fathers^28,29,30. Whether this observation is due to imprinting or other factors is not clear since findings are contradictory^29,31. With regard to RA, the existence of imprinting has been discussed since its incidence is considerably higher in women than in men³². However, imprinting studies are rare and results have been inconclusive; the role of imprinting in RA susceptibility is therefore not yet understood^33,34. Imprinted genes are difficult to detect in conventional association studies as their effects depend on the parental origin of the risk allele³⁵.
The incorporation of knowledge on whether imprinting affects susceptibility to T1D and RA could increase the power to find causal genes³⁴. Moreover, the development of therapeutic approaches targeting these genes or their regulators could be improved³³. Therefore, the first goal of this study was to investigate the impact of imprinting on the susceptibility to T1D and RA in a variance components analysis by applying a unique mixed model (imprinting model). The model allows for the simultaneous consideration of all kinds of imprinting patterns (full, partial, maternal, and paternal). As it has never been applied to human population data before, it opens up new opportunities for understanding the etiology of T1D and RA^{17,36,37,38,39,40}.

In an imprinting variance components analysis, maternal effects must be accounted for in the model to avoid inflated estimates⁴¹. The second research goal was therefore to incorporate maternal effects into the statistical model; not only to prevent biases in the imprinting variances, but also to investigate the maternal contribution to T1D and RA susceptibility. Like imprinting effects, maternal effects contribute to the broader class of POEs. However, their variation is assigned to the environmental contribution to the phenotypic variance. According to Falconer⁴², they are defined as prenatal and postnatal effects on offspring and can have two main components. The first component is the maternal genotypic effect on, for example, the birthweight of her children (maternal genetic effect)⁴³. The second component is the maternal environmental effect on the birthweight of her offspring⁴². This component refers to the permanent environmental effects of the mother on all of her offspring and can therefore also be considered a shared household effect⁴². Although T1D and RA differ in their average age of onset, attention must be given to the maternal contribution in the development of both diseases since early environmental factors can permanently modify the development of the immune system⁴⁴. The imprinting model is able to separate maternal effects from maternal imprinting effects, allowing the first imprinting variance components analysis to be performed in human population genetics.

The third goal of this study was to gain knowledge on the importance of sex and environmental triggers such as birth year, social economic index, and the number of offspring on the susceptibility to T1D and RA. Overall, this study brought to light the complex interplay between genetic, epigenetic and environmental factors in the development of autoimmunity.

Theory

A unique mixed model, previously used on animal data, was applied to investigate the existence of imprinting^{36,37,38,39,40,45}. The advantage this model confers is that it is able to simultaneously consider all kinds of imprinting (i.e. maternal, paternal, full, and partial imprinting) in its analyses¹⁷, ultimately separating maternal imprinting effects from maternal “non-imprinting” effects (e.g., maternal environmental and maternal genetic effects). This was not possible with previous population-scale imprinting analyses models, for example, that of Engellandt and Tier⁴⁶. Our imprinting model estimates two parental gametic variances and one covariance simultaneously. It is written as:

$$\mathbf{Y} = \mathbf{X}\mathbf{b} + \mathbf{Z}_{s}\mathbf{g}_{s} + \mathbf{Z}_{d}\mathbf{g}_{d} + \mathbf{e} \qquad \text{(imprinting model)}$$

where Y is a vector of the response variable; b is a vector of fixed effects; g_s is the vector of random gametic effects under a paternal expression pattern; g_d is the vector of random gametic effects under a maternal expression pattern; X, Z_s, and Z_d are the corresponding incidence matrices; and e is the vector of random residuals. The variance–covariance structure is:

$$\mathrm{Var}\begin{bmatrix} \mathbf{g}_{s} \\ \mathbf{g}_{d} \end{bmatrix} = G \otimes \begin{bmatrix} \sigma_{s}^{2} & \sigma_{sd} \\ \sigma_{sd} & \sigma_{d}^{2} \end{bmatrix},$$

where $\sigma_{s}^{2}$ and $\sigma_{d}^{2}$ are the gametic variances and $\sigma_{sd}$ is the covariance. Matrix G is the gametic relationship matrix reflecting the relationships between the gametes of all individuals in a pedigree. Its dimension is therefore twice the number of individuals included in the analysis^47,48. The symbol ⊗ denotes the Kronecker product. The imprinting effect is defined as the vector of differences (g_s − g_d) and the corresponding variance of differences is $\sigma_{i}^{2} = \sigma_{s}^{2} + \sigma_{d}^{2} - 2\sigma_{sd}$, which represents the imprinting variance. Where no imprinting is observed, $\sigma_{s}^{2} = \sigma_{d}^{2} = \sigma_{sd}$ and $\sigma_{i}^{2} = 0$.
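As a minimal numerical sketch of the definitions above (not the authors' software), the imprinting variance can be computed from the three gametic (co)variance components, and the covariance of the gametic effects can be formed with a Kronecker product. The component values and the 2 × 2 identity matrix standing in for G are hypothetical and chosen only for illustration; the function names are likewise assumptions of this sketch.

```python
def imprinting_variance(var_s, var_d, cov_sd):
    """Variance of the differences g_s - g_d: sigma_s^2 + sigma_d^2 - 2*sigma_sd."""
    return var_s + var_d - 2.0 * cov_sd


def kronecker(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


# Hypothetical gametic (co)variance components, for illustration only.
sigma = [[0.05, 0.04],
         [0.04, 0.05]]
# Hypothetical stand-in for the gametic relationship matrix G
# (two unrelated gametes); a real G is built from the pedigree.
G = [[1.0, 0.0],
     [0.0, 1.0]]

var_i = imprinting_variance(sigma[0][0], sigma[1][1], sigma[0][1])
cov_gametic = kronecker(G, sigma)  # G (x) Sigma, in the order written in the text

# With equal variances and covariance the imprinting variance vanishes.
no_imprinting = imprinting_variance(0.05, 0.05, 0.05)
```

Here the covariance is smaller than the two variances, so the sketch yields a positive imprinting variance, whereas the last call reproduces the no-imprinting case in which all three components coincide.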

Results

Parent-of-origin effects

Type 1 diabetes

Genomic imprinting

Using a REML log-likelihood ratio test (RLRT), the significance of the imprinting variance was tested by comparing the logarithmic value of the restricted maximum likelihood (REML log-likelihood) of the linear imprinting model to the REML log-likelihood outcome of a corresponding linear Mendelian model (equivalent null model that assumes the non-existence of imprinting). At a 5% significance level, the analysis revealed that imprinted genes did not significantly contribute to the total genetic variance in T1D susceptibility in the Swedish population data (P = 0.18).
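The comparison described above can be sketched as follows. This is an illustrative sketch only: it assumes a plain chi-square reference distribution without any boundary correction for variance components, supports only one or two degrees of freedom (where the survival function has a closed form), and uses hypothetical log-likelihood values. The number of degrees of freedom appropriate for the imprinting test is not stated in this section and is left to the caller.

```python
import math


def rlrt_p_value(logl_full, logl_null, df):
    """Statistic and P value of a REML log-likelihood ratio test for nested models.

    Uses the plain chi-square distribution with `df` degrees of freedom
    (closed forms for df = 1 and df = 2 only). Boundary corrections for
    variance components are ignored, which is an assumption of this sketch.
    """
    statistic = max(0.0, 2.0 * (logl_full - logl_null))
    if df == 1:
        return statistic, math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return statistic, math.exp(-statistic / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


# Hypothetical REML log-likelihoods of a full model and its null model.
statistic, p_value = rlrt_p_value(logl_full=-1200.0, logl_null=-1202.0, df=1)
significant = p_value < 0.05  # 5% significance level, as used in the text
```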

Maternal effects

Initially, data were analysed using linear models in order to test the significance of the variance components. First, a linear model that ignored maternal effects was applied, i.e. only the genetic effect of the individual was included in the model (Mendelian model 1). This led to a T1D heritability estimate (h²) of 0.19 (± 0.1 × 10⁻¹), i.e. 19% of the phenotypic variation in T1D is due to the variation in genetic factors in the analysed population (Fig. 1). Adding a maternal environmental effect to the model (Mendelian model 2) revealed significant maternal environmental variance with P = 1.60 × 10⁻²⁴. The relative maternal T1D environmental variance was 0.19 (± 0.2 × 10⁻¹), i.e. 19% of the phenotypic variance in T1D is due to the variation in maternal environmental effects (Fig. 1). Heritability was reduced to 0.10 (± 0.1 × 10⁻¹). Augmentation of Mendelian model 2 by the maternal genetic effect (Mendelian model 3) did not change the REML log-likelihood or variance component ratios (Fig. 1). More detailed information on the variance component estimates in T1D and REML log-likelihood models is provided in Supplementary Table S1.
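To make the variance ratios concrete, the following sketch expresses each component as a share of the phenotypic variance. It assumes that the phenotypic variance is the plain sum of the components (no covariance between direct and maternal genetic effects), which may differ from the authors' parameterisation. The component values are hypothetical and scaled to sum to one so that they mirror the ratios reported for Mendelian model 2.

```python
def variance_ratios(var_a, var_c, var_m, var_e):
    """Return (h2, c2, m2) as shares of the phenotypic variance.

    var_a: additive genetic variance, var_c: maternal environmental variance,
    var_m: maternal genetic variance, var_e: residual variance.
    """
    var_p = var_a + var_c + var_m + var_e  # assumed: simple sum, no covariances
    if var_p <= 0.0:
        raise ValueError("phenotypic variance must be positive")
    return var_a / var_p, var_c / var_p, var_m / var_p


# Hypothetical components, for illustration only.
h2, c2, m2 = variance_ratios(var_a=0.10, var_c=0.19, var_m=0.0, var_e=0.71)
```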

In addition to the linear models, threshold models were applied to account for the binary nature of the phenotypic traits. However, each of the threshold models could only pick up one variance component, i.e. with the addition of parameters, the same amount of variation was explained by additive genetic effects, then by maternal environmental effects, and then by maternal genetic effects (Table 1).
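The idea behind a threshold model can be illustrated with a small simulation in which a binary disease status arises when an unobserved continuous liability exceeds a threshold. This is only a sketch of the underlying liability concept, not the estimation procedure used in the study; the heritability, prevalence, sample size, and seed are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_binary_trait(n, h2, prevalence, seed=0):
    """Simulate 0/1 disease status from a liability-threshold model.

    Liability = genetic part (variance h2) + residual part (variance 1 - h2),
    so the total liability is standard normal. An individual is affected
    when the liability exceeds the threshold implied by the prevalence.
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    status = []
    for _ in range(n):
        genetic = rng.gauss(0.0, h2 ** 0.5)
        residual = rng.gauss(0.0, (1.0 - h2) ** 0.5)
        status.append(1 if genetic + residual > threshold else 0)
    return threshold, status


# Hypothetical settings, for illustration only.
threshold, status = simulate_binary_trait(n=50_000, h2=0.3, prevalence=0.05)
observed_prevalence = sum(status) / len(status)
```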

Table 1 Heritability (h²), relative maternal environmental variance (c²), and relative maternal genetic variance (m²) for type 1 diabetes (T1D) and rheumatoid arthritis (RA) estimated using threshold models with a gametic effect (g), a maternal environmental effect (c), a maternal genetic effect (m), and a residual effect (e).

Rheumatoid arthritis

Genomic imprinting

As maternally derived environmental and genetic effects could not be unambiguously disentangled, the imprinting model was applied in two forms: (a) with only the maternal environmental effect in addition to the two parental gametic effects, and (b) with only the maternal genetic effect in addition to the two parental gametic effects. The RLRT of model version (a) did not indicate significant imprinting variance (P = 0.26). Model version (b) led to a REML log-likelihood of 8,408.70, which was slightly smaller than the REML log-likelihood obtained from the corresponding null model containing a gametic effect and a maternal genetic effect (8,408.88). Because the addition of a parameter to a model should result in a REML log-likelihood that is equal to or larger than that of the null model, these results could indicate a flat likelihood surface or numerical inaccuracies.
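The consistency requirement for nested models can be checked directly with the two REML log-likelihoods reported above. The function name and the tolerance are assumptions of this sketch.

```python
def nested_fit_is_consistent(logl_full, logl_null, tol=1e-6):
    """True if the larger (full) model fits at least as well as the null model."""
    return logl_full >= logl_null - tol


# REML log-likelihoods reported in the text for model version (b)
# and for its corresponding null model.
logl_full = 8408.70
logl_null = 8408.88

deficit = logl_null - logl_full
consistent = nested_fit_is_consistent(logl_full, logl_null)
```

With these values the full model falls short of the null model by about 0.18 log-likelihood units, so the check fails and the likelihood ratio statistic would be negative.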

Maternal effects

Maternal effects were initially ignored (Mendelian model 1), which resulted in an h² value of 0.10 (± 0.3 × 10⁻¹; Fig. 1). Following the inclusion of a maternal environmental effect (Mendelian model 2), the h² value was reduced to 0.85 × 10⁻¹ (± 0.3 × 10⁻¹). The corresponding maternal variance component was significant at a 5% significance level (P = 0.02). The relative maternal environmental variance was 0.15 (± 0.6 × 10⁻¹; Fig. 1). While the REML log-likelihood value was not significantly altered upon addition of the maternal genetic effect (P = 0.21; Mendelian model 3), the relative maternal RA genetic variance estimate was 0.14 (± 0.4 × 10⁻¹; Fig. 1), the h² estimate dropped to zero, and the relative maternal environmental variance was reduced from 0.15 (± 0.6 × 10⁻¹) to 0.49 × 10⁻¹ (± 0.7 × 10⁻¹). To investigate the importance of maternal genetic effects in more detail, a linear model that corresponded to Mendelian model 2 but replaced the maternal environmental effect with a maternal genetic effect was applied. The application of this model resulted in a relative maternal genetic variance of 0.16 (± 0.3 × 10⁻¹) and an h² estimate of zero (Fig. 1). The RLRT indicated a significantly better fit in comparison to Mendelian model 1 (P = 0.01). Comparing the results to those associated with Mendelian model 3, the REML log-likelihood was not significantly different (P = 0.53). Detailed information on RA variance component estimates and REML log-likelihoods of the models is provided in Supplementary Table S2.

The threshold version of Mendelian model 1 resulted in an h² of 0.26 × 10⁻¹ (± 0.2 × 10⁻¹). As an equal additive genetic variance, and thus the same h², was found using the threshold version of Mendelian model 2, maternally derived environmental factors appeared not to play a role (Table 1) in RA. However, when the maternal genetic effect was added (Mendelian model 3) the h² value was 0.65 × 10⁻² (± 0.1), while the maternal environmental variance remained zero and the relative maternal genetic variance was 0.20 × 10⁻¹ (± 0.1; Table 1).

Environmental and sex effects

Type 1 diabetes

Birth year

Across all models (including linear and threshold models), the effects of the year of birth (ranging from 1944 to 2012) were shown to differ significantly (P < 1.00 × 10⁻³; Table 2; Supplementary Table S3). Effects increased until the end of the 1950s and started declining slightly at the beginning of the 1960s. The effects increased after 1972 until the mid-1980s, declined again until the mid-1990s, and have been increasing ever since (Fig. 2). With the exception of a strong increase of effects and standard errors in 1992 when applying the threshold models (data not shown), trends observed and effects generated under the threshold models were in accordance with those observed for the linear models.

Table 2 Overview of incremental Wald F values (F), number of numerator degrees of freedom (DF), number of denominator degrees of freedom (DF_den), and the P values (P) for all fixed effects on type 1 diabetes (T1D) and rheumatoid arthritis (RA), which were sex, birth year, social economic index (SEI), number of offspring (no. offspring), medical region, SEI of the mother (SEI_mother), years under observation (years_obs), and whether an individual was a single child or not (single child).

Social economic index of the mother

When considering the social economic index (SEI), analyses revealed that the effects of the mother’s SEI differed significantly for T1D with P values ranging from 2.56 × 10⁻¹² to 1.13 × 10⁻⁹ across all models (Table 2; Supplementary Table S3). Although small, the largest effect (0.02; ± 3.00 × 10⁻³) was found for the intermediate group of non-manual employees (code 4). The lowest effect (− 2.00 × 10⁻³; ± 5.00 × 10⁻³) was found for professionals as well as higher civil servants and executives (code 5).

Medical region

To investigate the effect of geographical location on T1D susceptibility, medical regions were used. Using linear and threshold models, effects differed significantly for T1D across medical regions with P values ranging from 2.12 × 10⁻¹⁵⁶ to 4.60 × 10⁻¹⁰⁴ (Table 2; Supplementary Table S3). As depicted in Fig. 3, effect sizes varied widely across Sweden.

Sex

A slight male skew towards T1D was observed (14,626 male vs. 12,629 female), with significantly different effects seen across all models for sex. P values ranged from 8.05 × 10⁻²²⁵ to 3.23 × 10⁻²⁰ (Table 2; Supplementary Table S3). The analyses further revealed significant interactions between sex and birth year with P values ranging from 1.50 × 10⁻⁵⁰ to 3.45 × 10⁻¹¹ (Table 2; Supplementary Table S3). Minimal changes were found for estimates across models. The effect of male sex on T1D increased proportionally with birth year starting in 1965, reaching its highest point in the late 1970s, and declined until no interactions could be observed in 1990 (Supplementary Fig. S1).

Rheumatoid arthritis

Birth year

Ranging from 1939 to 2007, the effects of birth year differed significantly across all models with P values ranging from 3.34 × 10⁻²⁶⁵ to 2.64 × 10⁻²⁴ (Table 2; Supplementary Table S3). Negative effects were observed from 1939 with the lowest point obtained in 1958. Since then, RA susceptibility has increased with a positive effect being observed in 1963; a trend that continued until the end of the 1970s. The trend remained constant for approximately 10 years, with a slight increase noticeable towards the end of the 1980s. Increasing standard errors must however be noted (Fig. 2). While a similar trend was observed for the threshold models (data not shown), effects increased in 1974 and remained constant before decreasing in 1991. Large standard errors were observed for effects after 1973.

SEI of individual

The SEIs of individuals significantly differed for RA with an average P value of 1.66 × 10⁻⁹ for linear models and P = 0.01 for threshold models (Table 2; Supplementary Table S3). Effect estimates varied little across models. The largest effect (0.11; ± 0.03) was found for unskilled or semi-skilled workers, while the lowest effect (0.02; ± 8.00 × 10⁻³) was observed for foremen in industrial production and assistant non-manual employees.

SEI of the mother

Maternal SEIs had a small but significant effect on RA susceptibility in offspring under both the linear and threshold models. P values ranged from 0.01 to 0.02 (Table 2; Supplementary Table S3). Effects varied little across models with similar estimates being calculated. The lowest effect was found for foremen in industrial production and assistant non-manual employees (− 0.03; ± 0.01), followed by skilled manual workers (− 0.01; ± 0.01). Except for the unknown SEI group, the highest effect was found for the group of professionals as well as higher civil servants and executives (4.00 × 10⁻³; ± 0.01).

Number of offspring

In the dataset, women had an average number of 1.94 children (ranging from zero to 11 children; sd = 1.25), while men had an average number of 2.04 children (ranging from zero to 11 children; sd = 1.37). The number of offspring affected RA development significantly across all models (P < 1.00 × 10⁻³). An inverse and almost linear relationship between RA susceptibility and the number of children is depicted in Fig. 4. While effects greater than zero were estimated for individuals with zero, one or two children, decreasing effects were observed below zero for individuals with more than two children.

Medical region

Medical regions, serving as the proxy for residential and geographic location, differed significantly in their impact on RA with P values ranging from 0.70 × 10⁻² to 0.01 across all models (Table 2; Supplementary Table S3). Effect sizes varied widely across Sweden and were generally small with large standard errors (Supplementary Fig. S2).

Single child

We found that being a single child or having siblings made a significant difference regarding RA susceptibility with P values ranging from 0.01 to 0.02 across the models (Table 2; Supplementary Table S3). For the linear models, the mean estimated effect of being a single child was − 0.02 (± 9.00 × 10⁻³), while an effect of − 0.15 (± 0.06) was observed under the threshold models.

Sex

The incidence of RA was considerably higher in women than in men with 11,442 female and 4,408 male cases, respectively. Sex effects were significantly different with P values ranging from 6.86 × 10⁻¹⁴⁹ to 6.61 × 10⁻¹⁰¹ across models (Table 2; Supplementary Table S3). Interactions between sex and birth year were significant only under the linear models (P = 0.03; P = 1.00 in the threshold models). No clear trend was visible for interaction effects amongst male patients (Supplementary Fig. S1). Significant interactions between sex and SEI (average P value of 4.56 × 10⁻⁶) as well as between sex and the number of offspring were observed (average P value of 2.55 × 10⁻¹⁵). The latter interaction was not significant using threshold models (average P value of 0.48; Supplementary Table S3).

Linear versus threshold model

First, the concordance between the linear and threshold model results was investigated by comparing the predicted genetic values under Mendelian model 1. High Pearson correlation coefficients were obtained with r = 0.99 for both T1D and RA. The linear relationships are shown in Supplementary Fig. S3. Second, threshold genetic values were fitted using the linear genetic values as independent variables. These results and their respective residual values are shown in Supplementary Fig. S3. The residual variation was fairly constant with some outliers observed over the entire range for T1D and RA.
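The two comparison steps can be sketched as follows: a Pearson correlation between the two sets of predicted genetic values, followed by a least-squares fit of the threshold values on the linear values and inspection of the residuals. The five pairs of values are hypothetical and serve only to show the calculation; they are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares intercept, slope, and residuals of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    return intercept, slope, residuals


# Hypothetical predicted genetic values under the two model types.
linear_values = [-0.02, -0.01, 0.00, 0.01, 0.03]
threshold_values = [-0.41, -0.19, 0.02, 0.18, 0.62]

r = pearson_r(linear_values, threshold_values)
intercept, slope, residuals = fit_line(linear_values, threshold_values)
```

A roughly constant spread of `residuals` across the range of `linear_values` would correspond to the pattern described for the supplementary figure.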