Common patterns of gene regulation associated with Cesarean section and the development of islet autoimmunity – indications of immune cell activation

Birth by Cesarean section increases the risk of developing type 1 diabetes later in life. We aimed to elucidate common regulatory processes observed after Cesarean section and the development of islet autoimmunity, which precedes type 1 diabetes, by investigating the transcriptome of blood cells in the developing immune system. To investigate Cesarean section effects, we analyzed longitudinal gene expression profiles from peripheral blood mononuclear cells taken at several time points from children with increased familial and genetic risk for type 1 diabetes. For islet autoimmunity, we compared gene expression differences between children after initiation of islet autoimmunity and age-matched children who did not develop islet autoantibodies. Finally, we compared both results to identify common regulatory patterns. We identified the pentose phosphate pathway and pyrimidine metabolism - both involved in nucleotide synthesis and cell proliferation - to be differentially expressed in children born by Cesarean section and after islet autoimmunity. Comparison of global gene expression signatures showed that transcriptomic changes were systematically and significantly correlated between Cesarean section and islet autoimmunity. Moreover, signatures of both Cesarean section and islet autoimmunity correlated with transcriptional changes observed during activation of isolated CD4+ T lymphocytes. In conclusion, we identified shared molecular changes relating to immune cell activation in children born by Cesarean section and children who developed autoimmunity. Our results serve as a starting point for further investigations on how a type 1 diabetes risk factor impacts the young immune system at a molecular level.

between 9 months and two years 3 . Notably, an increasing incidence of type 1 diabetes has been observed within the last few decades, especially in children 4 . Several previous studies indicated transcriptional changes in immune cells caused by the development of islet autoantibodies 5,6 , indicating substantial molecular changes long before the onset of type 1 diabetes.
Children born by Cesarean section have an odds ratio of 1.23 for the development of type 1 diabetes compared to vaginally delivered children 7 . Moreover, Cesarean section has been reported to be associated with a faster progression to type 1 diabetes, but not with an increase the risk for islet autoimmunity 8 . The underlying mechanisms of these associations are not yet fully understood. Some reports indicate that the mode of delivery affects colonization of microbiota in the intestinal tract 9 , which in turn affects the developing immune system of infants 10 . Such differences in the gut microbiome and its interaction with the immune system may lead to increased risk of asthma, childhood allergies 11 , and autoimmune diseases, such as type 1 diabetes.
Here we investigated the impact of Cesarean section and islet autoimmunity on the immune system by analyzing gene expression data in peripheral blood mononuclear cells (PBMCs) as a readout (Fig. 1a). The overall goal of the analysis was to elucidate common immune modulation patterns between an early risk factor, Cesarean section, and the development of autoimmunity. We first investigated the effects of Cesarean section on gene expression profiles of children in the first year of life (Fig. 1b). To this end, we analyzed data from several time points in this early period in children with an increased familial risk for type 1 diabetes. A generalized additive  Overview. (a) Study workflow: Parallel analyses were performed to detect differential gene expression and pathway enrichment for Cesarean section and islet autoimmunity. The results were then compared in a combined analysis and related to expression patterns of lymphocyte activation. (b) Schematic overview of the longitudinal study design for Cesarean section and islet autoimmunity analyses. (c) Schematic illustration of the generalized additive mixed effect model (GAMM) to analyze the longitudinal dataset, including intercept, a time effect, a random effect for multiple measurements, and the investigated Cesarean section vs. vaginal delivery effect. Abbreviations: CS = Cesarean section, VD = Vaginal delivery, Ab+ = Islet autoantibodypositive, Ab− = Islet autoantibody-negative. mixed model was used to account for the longitudinal information when extracting gene expression differences between children born by Cesarean section and vaginal delivery (Fig. 1c). In a second analysis, we compared gene expression levels of children with increased risk of familial type 1 diabetes after initiation of islet autoimmunity with age-matched children who did not develop islet autoantibodies. We then combined both results to identify genes and pathways that were differentially regulated in both analyses, to link Cesarean section and islet autoimmunity at the level of gene expression. Finally, we compared the patterns identified for Cesarean section and islet autoantibodies with gene expression data from activated human CD4+ T lymphocytes to identify candidates for common molecular mechanisms of immune activation.

Effects of Cesarean section on the transcriptome in the first year of life.
To focus on early postpartum effects on genetic regulation after delivery by Cesarean section, we restricted our analysis to samples from children in their first year of life ( Supplementary Fig. S1). This resulted in a total of 154 PBMC gene expression samples (71 Cesarean section, 83 vaginal delivery) from 87 children (39 Cesarean section, 48 vaginal delivery), Fig. 1a,b. The number of samples per child ranged between 1 and 4 ( Fig. S1). Details on the dataset and preprocessing steps are provided in the Methods section.
We used a generalized additive mixed effect model (GAMM) to account for the longitudinal nature of the dataset. The model consisted of three major parts ( Fig. 1c): (1) the change of expression profiles over time was modeled using a spline-based function, (2) the overall trend of expression values per child was captured by a random mixed effect, and (3) a term representing the Cesarean section effect extracted the association in which we were primarily interested. Applying this model to the data set, we observed an enrichment of low p-values before adjusting for multiple testing ( Fig. 2a and Supplementary Table S1). However, after multiple testing correction by controlling the false discovery rate (FDR) at 0.1, no significant single genes could be identified in this first step (Fig. 2b).
To improve statistical power and facilitate biological interpretability, we performed a pathway enrichment analysis based on the KEGG pathway database 12 . From this analysis, we obtained two significantly differentially expressed pathways after multiple testing correction (FDR <= 0.1): "Pentose phosphate pathway" and "Pyrimidine metabolism" (Fig. 2c and Supplementary Table S1). The "Pentose phosphate pathway" included 27 genes (20 up-and 7 down-regulated genes in children born by Cesarean section) and the "Pyrimidine metabolism" pathway contained 92 genes (57 up-regulated and 35 down-regulated).
Both pathways are well known to be involved in the synthesis of nucleotides during cell proliferation 13,14 . This indicated that there is a change in expression profiles of proliferation-related nucleotide synthesis pathways as an early consequence of delivery by Cesarean section.
Effects of islet autoimmunity on the transcriptome. To identify gene expression signatures associated with seroconversion, we compared the earliest sample of children after the development of islet autoantibodies (up to 6 months post-seroconversion; 15 children) with all available age-matched samples of children who did not develop islet autoantibodies (74 children). We applied linear regression to explain gene expression differences induced by islet autoantibody status, corrected for age (see Methods). We detected a strong accumulation of differentially expressed genes (Fig. 3a), of which 3,867 were significant after multiple testing correction (FDR ≤ 0.1, Fig. 3b). The majority of these genes were found to be up-regulated in children with islet autoimmunity (64% up-regulated vs. 36% down-regulated, Supplementary Table 1).
Pathway enrichment analysis on KEGG pathways identified 20 as significantly regulated (FDR <= 0.1, Fig. 3c and Supplementary Table 1). The top-ranked pathways were "p53 signaling pathway", "Ubiquitin mediated proteolysis", and "RIG-I-like receptor signaling pathway. " The two significant pathways from the Cesarean section analysis, "Pyrimidine metabolism" and "Pentose phosphate pathway", also appeared as significant in the islet www.nature.com/scientificreports www.nature.com/scientificreports/ autoantibody analysis. Notably, comparing gene expression samples of children before seroconversion (up to 6 months before) and age-matched children without islet autoantibodies yielded no significant genes or pathways ( Supplementary Fig. S2). In addition, we investigated children that had gene expression samples both before and after seroconversion (up to 6 month before and after) in a paired analysis, leaving only seven children. No genes or pathways were found to be differentially expressed ( Supplementary Fig. S2).
Taken together, we observed several immune system-related and nucleotide synthesis pathways associated with islet autoimmunity, which included the pathways found in our Cesarean section analysis.

Coherent gene expression changes between Cesarean section and islet autoimmunity.
We further investigated the similarities of transcriptional changes between the two risk factors. First, we found that the individual gene regulation of both "Pyrimidine metabolism" and the "Pentose phosphate pathway" pathway showed similar patterns of up-and down-regulation between Cesarean section and islet autoimmunity (Figs 4a and S3).
Extending this analysis, we quantified the relationship of gene regulation between the two factors at a systematic level. We correlated the standardized effects from Cesarean section and development of islet autoantibodies across all genes (Fig. 4b), revealing a striking correlation of 0.606 (p = 0.0062, Fig. 4c). In other words, genes regulated by Cesarean section, despite the rather weak signal strength in our study, were also regulated in the same direction by the initiation of islet autoimmunity. A similar correlation between the effects of Cesarean section and islet autoimmunity was observed at the pathway level, with a correlation of 0.49 (p = 0.02, Fig. 4d,e), supporting the functional agreement of transcriptional changes between the two risk factors. Importantly, we did not observe that children born by Cesarean section developed islet autoantibodies more frequently than children born by vaginal delivery, ruling out a confounding effect not related to gene expression ( Supplementary Fig. S4).
In contrast to the profound Cesarean section to islet autoantibodies correlation at transcript level, we found the effects of gender, maternal diabetes and multiple first-degree relatives to be randomly correlated with Cesarean section at the single gene level (maternal diabetes, r = −0.22, p = 0.53; gender, r = 0.09, p = 0.78; multiple first-degree relatives, r = −0.24, p = 0.49) and at the pathway level (maternal diabetes, r = 0.01, p = 0.97; gender, r = −0.01, p = 0.97; multiple first-degree relatives, r = −0.23, p = 0.34), as shown in Figs 4c,e and S5. This demonstrates the specificity of the Cesarean section-islet autoantibody effect correlation.
In summary, this analysis showed that the gene expression changes in these two risk factors, Cesarean section and islet autoimmunity, are remarkably coherent.
Finally, we investigated whether the two activated pathways are specific to the development of islet autoantibodies and therefore type 1 diabetes. To this end, we performed the same pathway enrichment analysis for gene expression data from lupus disease (GSE81622), juvenile idiopathic arthritis and juvenile dermatomyositis (GSE11083, includes samples for both diseases), comparing cases against controls. In contrast to seroconversion, we did not observe the pentose phosphate pathway and pyrimidine metabolism to be activated, providing evidence that these two pathways are specific to islet autoimmunity of type 1 diabetes (Supplementary Table S2).
signatures of immune cell activation. The pentose phosphate pathway is a universal, central metabolic pathway in the cytosol, which supports cell proliferation and survival 15 . The non-oxidative branch of the pentose phosphate pathway branches off glycolysis and generates ribose 5-phosphate as a precursor for the synthesis of nucleotides and amino acids necessary for cell growth and division. Moreover, the pyrimidine metabolism pathway is directly related to the pentose phosphate pathway in its role in nucleotide synthesis. Since we investigated PMBCs, these differentially regulated pathways in Cesarean section and islet autoimmunity point toward a general activation of immune cells. A direct proof of this hypothesis in the same children was unfeasible in the context of this study. Instead, we collected evidence from several secondary analysis steps.   www.nature.com/scientificreports www.nature.com/scientificreports/ First, we investigated whether regulated genes from both analyses enriched immune genes annotated in innateDB 16 . Indeed, there was a significant accumulation of higher standardized effects for immune genes compared to non-immune genes, for both Cesarean section (Wilcoxon rank sum test: p = 0.0002) and autoimmunity (p = 2.6 × 10 −15 ); see Supplementary Table S3.
Second, we compared results from the mixture of PBMC cells with published data on activation in isolated immune cells. For the analysis, we used transcriptomics data from isolated naïve and activated human CD4+ T cells 17 . We calculated gene expression differences before and after activation, and applied enrichment analysis to identify differentially expressed pathways. In particular, the "Pentose phosphate pathway" (p = 0.049) and "Pyrimidine metabolism" (p = 0.035) pathways were significantly differentially expressed in activated CD4+ T cells (Supplementary Table S1). To compare the effects of CD4+ T cell activation with effects from the Cesarean section and islet autoantibody analyses, we calculated correlations of the standardized effects. We observed borderline significant correlations between changes in activated CD4+ T cell and Cesarean section (r = 0.20, p = 0.049, Fig. 5a,b) and islet autoimmunity (r = 0.24, p = 0.052, Fig. 5c,d). Remarkably, the association was  www.nature.com/scientificreports www.nature.com/scientificreports/ substantially stronger at pathway level for both Cesarean section (r = 0.45, p = 0.021, Fig. 5e,f) and islet autoimmunity (r = 0.57, p = 0.008, Fig. 5g,h). The pathway association was replicated in a second transcriptomics dataset from naïve and activated CD4+ T cells, monocytes, and natural killer cells, published in 18 . We found that the pathway association was present in CD4+ T cells and monocytes; however, we did not observe any association for natural killer cells. Detailed results are discussed in Supplementary Note S1.
In summary, we observed a significant correlation between the functional effects of human lymphocyte activation and the effects of Cesarean section and islet autoimmunity on gene expression in PBMCs.

Discussion
We identified coherent gene expression signatures for Cesarean section, an early risk factor for type 1 diabetes, and islet autoantibody positivity, an obligatory stage of autoimmune response prior to the development of autoimmune type 1 diabetes. Specifically, at the transcriptome level, we identified two pathways involved in nucleotide synthesis and cell proliferation that were regulated in PBMCs of children born by Cesarean section. This analysis required an extended statistical model to incorporate the complex time information in our present dataset. Islet autoantibody analysis revealed various pathways generally involved in immune processes, including the aforementioned nucleotide synthesis pathways. Comparing global gene expression signatures, we found that transcriptomic changes were systematically and significantly correlated between the Cesarean section and islet autoantibody-positive analyses. Importantly, transcriptional signatures of Cesarean section did not correlate with gender, maternal diabetes, or multiple first-degree relatives, demonstrating that the correlation is specific to Cesarean section and islet autoimmunity. In a functional follow-up analysis, both Cesarean section and islet autoimmunity signatures were significantly correlated with gene expression changes observed during activation of isolated CD4+ T lymphocytes. At pathway level, this correlation was also observed for monocytes and natural killer cells in a second dataset.
We can speculate on the biological basis of our statistical observations. The coherent regulation of proliferation pathways in blood PBMCs may indicate a general activation of the immune system and immune cells for both Cesarean section and seroconversion. In the case of Cesarean section, this activation might indirectly reflect different microbial exposures during birth. This idea is indirectly supported by findings that the microbiome is affected by the mode of delivery, and in turn affects the developing immune system 9,10 . For the autoimmunity results, the precise interplay and timing of the occurrence of environmental stimuli, immune response and the development of islet autoantibodies still need to be elucidated. Regarding a hypothetical disease trajectory leading to type 1 diabetes, it is conceivable that the observed gene expression signatures may reflect a transient deflection www.nature.com/scientificreports www.nature.com/scientificreports/ of the immune system and that this primes a child for subsequent progression to type 1 diabetes in the presence of other risk factors.
An interesting general observation in our analysis was the increased correlation of gene expression signatures when performing pathway analysis instead of single gene analysis. This pattern was observed both for the comparison of Cesarean section and islet autoantibody-positive signatures, and for comparisons of these two signatures with the isolated immune cells. These findings indicate that pathway analysis substantially reduced the noise compared to single gene analysis, which allowed us to identify common patterns at a functional level.
Our study could be extended and improved in several directions. (1) The present dataset has rather low statistical power for the islet autoantibody analysis, with only 15 positive samples available for at most 6 months post-seroconversion. While differential expression changes were remarkably significant, the analysis should be validated with a larger sample size. (2) The hypothesis of coherent immune system activation should ideally be confirmed in an isolated primary T cell population from children after Cesarean section or after seroconversion. We took an indirect route using PBMCs rather than a more general activation experiment with isolated immune cells. (3) The investigated samples stem from a study with increased familial risk for type 1 diabetes. Ideally, future studies would include gene expression measurements of children without familial risk. (4) The incorporation of further environmental factors, such as nutrition and medical parameters, in combination with the children's genetic background, is expected to give a more complete picture of the parameters leading to autoimmunity and type 1 diabetes.
In summary, we found a transcriptional link between Cesarean section and islet autoimmunity, pointing toward a transiently altered immune system in the susceptible period of islet autoimmunity generated by Cesarean section, which was remarkably coherent with the changes observed after islet autoimmunity.
www.nature.com/scientificreports www.nature.com/scientificreports/ sample permutation-based approach for p-value calculation, which avoids false positive results due to correlating transcripts. The number of permutations was set to 5,000.
For node coloring of pathways (Fig. 4a,b), we used a "directed p-value", defined as (−log10(p-value) * sign(regression coefficient)). For nodes in the pathway that had multiple genes annotated, the highest effect was shown.
Comparing Cesarean section and islet autoimmunity results. To compare the results obtained from the Cesarean section and islet autoimmunity (Ab+) analyses, we calculated the Pearson correlation between the standardized effect estimates of both analyses. Standardized effect estimates were represented by the t-statistic in the single gene analysis, and a "directed p-value" (analogously to previous section) in the pathway analysis. To assess the statistical significance of the correlation, we calculated the correlation between the effects of Cesarean section and permuted islet autoimmunity status. The number of permutations was fixed at B = 5,000 for the single gene analysis and 1,000 for the pathway analyses. The same analysis with 1,000 permutations was repeated for the factors maternal diabetes, gender, and multiple first-degree relatives.
Immune cell activation analysis. To investigate the enrichment of genes within annotated immune genes from innateDB 16 , we calculated Wilcoxon rank sum tests using the t-statistics from the single gene level analysis (see above) to identify differences between the distribution of immune genes and non-immune genes. For comparison of standardized effects, we used two published datasets, one from isolated CD4+ T cells before and after activation (GEOD-33272, processed data downloaded) and one containing different subsets of human leukocytes before and after activation (GEOD-22886, processed data downloaded). Single gene differential analysis was performed on log2-transformed expression values using linear regression, with gene expression as the response, and activation status as the explaining variables. For SAFE pathway enrichment, the number of permutations was set to 1,000. To calculate permutation-based p-values of Pearson correlations, we used 1,000 permutations.
All statistical analyses were performed using the computing environment R version 3.3.2.
ethics approval and consent to participate. The BABYDIET study was approved by the ethics committee of Ludwig-Maximilians University (Protocol No. 329/00) and is registered at ClinicalTrials.gov NCT01115621. The parents or guardians of each child provided informed consent for participation in the BABYDIET study. All samples and information were collected after obtaining signed informed consent.

Data Availability
This publication uses publicly available gene expression datasets, available via ArrayExpress (accession number E-MTAB-1724) and the Gene Expression Omnibus (Accession Numbers GSE33272 and GSE22886). The informed consent given by study participants does not cover data posting of the clinical covariates in public databases.