Introduction

Epidemiologic associations, clinical phenotype, and natural history differ across CD age-of-diagnosis.1,2,3,4 Older children, adolescents, and young adults develop CD involving the ileum most commonly,5 whereas colon-only involvement is predominant in the first decade of life and in the elderly.1,3 Variation in antimicrobial serology with increasing age-of-diagnosis within pediatric CD has suggested potential age-dependent differences in the ileal microbial community and/or immune responses.6 These observations have informed the Paris classification system6 that sub-divides the Montreal A1 (0–16 years) classification for CD into diagnosis prior to age 10 years (A1a) and age 10–16 years (A1b). Recently, the younger A1a group was subdivided further to children diagnosed prior to 6 years of age, the very early onset type7,8 that was linked to monogenic forms of Inflammatory Bowel Disease (IBD) that differ from the polygenic form of CD diagnosed in older ages.4,9 Whether there was a specific mucosal basis for the clinical-based classification of the polygenic CD to younger A1a (6 years to < 10 years) and older A1b (10–16 years) was not known.

The gut-associated lymphoid Peyer’s patches (PP), concentrated (50%) in the distal ileum (TI),10,11 interact with the overlying epithelium to induce tolerance or defense against luminal antigens.12 Importantly, PP undergo dynamic maturation from birth, whereby the number and size of PP increase up to the second/third decade of life and then decline with age. The spatiotemporal relationship between the peak incidence of ileal CD (iCD) and PP evolution is represented primarily by IFNγ-expressing Th1 cells; this has led to the hypothesis that this dynamic mucosal immune process plays a central role in iCD pathogenesis.5,10,11 In fact, murine studies have revealed maturation of an IFN-γ associated immune signature in adult vs. pre-weaned mice,13 and a prior study in healthy children showed a pronounced Th1 polarization (increased IFNγ and IL-12) within PP and adjacent ileal mucosa in response to bacterial antigens.14 However, comprehensive human mucosal-based gene expression analysis to test for maturation of a Th1 signature across the pediatric age-of-diagnosis range within CD had not been performed.

Paneth cells, located at the base of the intestinal crypts, produce antimicrobial peptides such as lysozymes and α-defensins to modulate the intestinal microbiota and are an important arm of the innate immune response.15 Reduced expression of the human α-defensins (DEFA5 and DEFA6) was previously documented in patients with iCD.16 However, it is unclear if this deficiency is a primary event in iCD or a secondary event occurring as a consequence of inflammation.17 Whether differences in α-defensin expression would be observed as a function of age-of-diagnosis in pediatric iCD, and whether this in turn is associated with alterations in the microbial community, is not known. Interestingly, our group has recently shown that in a competing-risk model, older age-of-diagnosis, African American race, and ASCA IgA and CBir1 sero-positivity were associated with internal penetrating (B3) disease complications.18 To improve our understanding of pediatric CD pathogenesis across different ages-of-diagnosis, we employed combined ileal mRNA and 16S rRNA sequencing approaches to define host transcriptome and microbial community differences in patients stratified by their age-of-diagnosis. We show that despite an age-independent microbial dysbiosis in pediatric CD, there are robust differences in the host ileal gene expression signature, with higher expression of inflammatory-related genes and lower expression of Paneth cell α-defensins with increasing age-of-diagnosis.

Materials and methods

The RISK cohort

The Crohn's and Colitis Foundation sponsored RISK prospective inception cohort study18,19,20 included newly diagnosed pediatric CD patients (Table 1) at 28 North American pediatric gastroenterology centers. 946 patients had inflammatory (B1) disease behavior and no disease complications (B2 or B3) at the time of diagnosis. All patients were required to undergo baseline colonoscopy and confirmation of characteristic endoscopic features and chronic active colitis/ileitis by histology prior to diagnosis and treatment, and documented initial clinical severity and disease location. Non-IBD controls were subjects suspected to have IBD, but with normal radiographic, endoscopic, and histologic findings. Ileal biopsy samples from a CD sub-cohort representative of the overall RISK cohort (age, gender, and disease phenotype and severity) and non-IBD controls (age and gender) are included in our mucosal mRNAseq analysis (254 CD and 50 control (Ctl), Tables 1, 2). Ileal biopsies of CD patients (n = 272) and Ctl (n = 178) were also analyzed for ileal microbial profiles and were already included in our recent reports,18,19,20 of those 197 also had mRNAseq (160 CD and 37 Ctl). Patient-preparation for endoscopy and treatment course were according to the dictates of their physicians, not by standardized protocols. Serological determination of perinuclear anti-neutrophil cytoplasmic antibodies (pANCA), anti-Saccharomyces cerevisiae antibodies (ASCA) IgG, ASCA IgA, and anti-CBir1, was performed at Cedars-Sinai Hospital (Los Angeles, CA, USA).21 Granulocyte-macrophage colony-stimulating factor (GM-CSF) autoantibodies were measured at Cincinnati Children’s Hospital Medical Center (Cincinnati, OH, USA). Positive serologies were defined based on specific predefined cut points. 
Anti-GM-CSF was considered positive at values above 1.6 mcg/mL,22 the value previously reported within pediatric-onset CD patients linked to higher risk of stricturing disease complications, and subsequently validated within adult-onset CD patients for this outcome. Other antibody levels were determined and results are expressed as ELISA units (EU/ml), which are relative to a Cedars-Sinai Laboratory standard, derived from a pool of CD patient sera with well-characterized disease found to have reactivity to this antigen; ASCA IgA was considered positive above 20 EU/ml, ASCA IgG above 40 EU/ml, anti-CBir1 above 25 EU/ml, and pANCA above 30 EU/ml.
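The cut points above can be applied mechanically. The following is a minimal sketch, not the laboratories' actual procedure: the thresholds are taken from the text, while the dictionary keys, the function name `classify_serology`, and the example values are hypothetical and introduced only for illustration. It assumes "above" means strictly greater than the cut point.

```python
# Cut points as stated in the text; note that the units differ by assay.
CUTOFFS = {
    "anti_GMCSF": 1.6,   # mcg/mL
    "ASCA_IgA": 20.0,    # EU/ml
    "ASCA_IgG": 40.0,    # EU/ml
    "anti_CBir1": 25.0,  # EU/ml
    "pANCA": 30.0,       # EU/ml
}


def classify_serology(levels):
    """Return {assay: True/False} for every measured assay with a known cut point.

    A value is called positive only when it is strictly above its cut point
    (an assumption about how "above" is applied). Assays missing from
    `levels` are skipped rather than assumed negative.
    """
    return {
        assay: levels[assay] > cutoff
        for assay, cutoff in CUTOFFS.items()
        if assay in levels
    }


# Hypothetical patient; the values are invented for illustration only.
example = {"anti_GMCSF": 2.3, "ASCA_IgA": 12.0, "ASCA_IgG": 55.0, "anti_CBir1": 25.0}
result = classify_serology(example)
```

A value exactly at the cut point (anti-CBir1 of 25.0 in the example) is treated as negative under the strict-inequality assumption.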

Table 1 Clinical and Demographic Characteristics stratified by age-of-diagnosis of the RISK cohort
Table 2 Clinical and demographic characteristics

Ileal DNA and RNA extraction and RNA-seq

DNA and RNA were isolated from ileal biopsies as previously described.19 Reads were quantified by kallisto,23 using Gencode v23 as the reference genome and Transcripts per Million (TPM) as the output. We included 13,206 protein-coding genes with TPM above 5 in 5 samples. Differentially expressed genes were determined in GeneSpring® software with fold change differences (FC) ≥ 1.5 and using false discovery rate correction (FDR < 0.05). The Euclidean distance metric and Ward’s linkage rule were used for unsupervised hierarchical clustering. ToppGene24 and ToppCluster25 software were used to perform functional annotation enrichment analyses. Visualization of the network was obtained using Cytoscape v3.0.2.26 The RNASeq data are deposited in the GEO repository (GSE101794).
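The filtering and thresholding logic described in this paragraph can be sketched as follows. This is an illustration only and does not reproduce the GeneSpring® implementation: the function names are hypothetical, "in 5 samples" is assumed to mean "in at least 5 samples", the fold change is assumed to be a symmetric ratio of group means, and the FDR correction is assumed to be the Benjamini-Hochberg procedure.

```python
def passes_expression_filter(tpm_values, min_tpm=5.0, min_samples=5):
    """Keep a gene if its TPM is above `min_tpm` in at least `min_samples` samples."""
    return sum(1 for v in tpm_values if v > min_tpm) >= min_samples


def fold_change(mean_a, mean_b):
    """Symmetric fold change: larger group mean divided by the smaller (always >= 1)."""
    hi, lo = max(mean_a, mean_b), min(mean_a, mean_b)
    return float("inf") if lo == 0 else hi / lo


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_differentially_expressed(mean_a, mean_b, q_value, min_fc=1.5, max_fdr=0.05):
    """Apply both thresholds from the text: FC >= 1.5 and FDR < 0.05."""
    return fold_change(mean_a, mean_b) >= min_fc and q_value < max_fdr


# Toy p-values, invented for illustration only.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

In practice the raw p-values would come from a per-gene statistical test, which the text does not specify and which this sketch therefore leaves out.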

Microbial community profiling and analysis of associations testing between microbial taxa and clinical and molecular metadata

Detailed protocols used for 16S rRNA amplification and sequencing are as previously described.18,20,27 In brief, 16S rRNA amplicon sequencing of 450 ileal samples was performed using the Illumina MiSeq v2 platform, targeting the V4 region of the SSU rRNA gene, and generating paired-end reads of 175 bp. Samples with at least 3000 reads were included. Taxonomy was assigned based on the Greengenes database.28 Each OTU was required to occur at a relative abundance of at least 0.01% across all samples and be present in at least 5% of samples.
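The sample and OTU inclusion criteria can be sketched as below. This is one possible reading, not the authors' pipeline: the abundance criterion is assumed to refer to the mean relative abundance across retained samples, the prevalence criterion is assumed to mean a non-zero count in at least 5% of samples, and the function names and toy counts are hypothetical.

```python
def filter_samples(counts_by_sample, min_reads=3000):
    """Keep samples whose total read count is at least `min_reads`."""
    return {
        sample: counts
        for sample, counts in counts_by_sample.items()
        if sum(counts.values()) >= min_reads
    }


def filter_otus(counts_by_sample, min_mean_rel_abund=1e-4, min_prevalence=0.05):
    """Return the sorted OTU ids that pass both criteria.

    min_mean_rel_abund: 0.01% expressed as a fraction (assumed to be a mean
        relative abundance across samples).
    min_prevalence: fraction of samples in which the OTU must be detected.
    """
    n_samples = len(counts_by_sample)
    if n_samples == 0:
        return []
    all_otus = sorted({otu for counts in counts_by_sample.values() for otu in counts})
    kept = []
    for otu in all_otus:
        rel = []
        for counts in counts_by_sample.values():
            total = sum(counts.values())
            rel.append(counts.get(otu, 0) / total if total else 0.0)
        mean_rel = sum(rel) / n_samples
        prevalence = sum(1 for r in rel if r > 0) / n_samples
        if mean_rel >= min_mean_rel_abund and prevalence >= min_prevalence:
            kept.append(otu)
    return kept


# Toy read counts, invented for illustration only.
raw = {
    "s1": {"otu_a": 2990, "otu_b": 10},
    "s2": {"otu_a": 20000, "otu_c": 1},
    "s3": {"otu_a": 100},  # below the 3000-read minimum
}
samples = filter_samples(raw)
kept_otus = filter_otus(samples)
```

Here `s3` is dropped for insufficient depth, and `otu_c` is dropped because its mean relative abundance falls below 0.01%.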

Differentially abundant taxa were determined based on multivariate statistical analyses using Multivariate Analysis by Linear Models (MaAsLin).18,19,20 The following metadata were investigated in the analysis: age, gender, body mass index (BMI, as a measure of nutritional status), clinical phenotype (Ctl or CD), endoscopic severity (deep ulcers in ileum), clinical severity (Pediatric Crohn’s Disease Activity Index, PCDAI), ileal gene expression of CSF2, CXCR1, IFNG, MMP3, DEFA5, GSTA1, and LCT, and NOD2 and ATG16L1 IBD risk allele carriage. Significant association was considered below a q value (Benjamini-Hochberg) threshold of 0.2.

Immunohistochemistry

Immunohistochemistry detection of the human alpha-defensin 5 protein encoded by the DEFA5 gene in control and CD ileal biopsies was performed as previously described in the CCHMC Digestive Health Center core facility,19 using anti-DEFA5 (mouse monoclonal anti-human alpha-defensin NP5 Ab, clone nr 8C8, code ab62757, Abcam, Cambridge, UK). Staining was examined using an Olympus BX51 light microscope and digitally recorded at ×20 and ×40 magnification.

Ethical considerations

The institutional review board reviewed and approved the protocol and informed written consent was obtained. ClinicalTrials.gov identifier is NCT00790543.

Results

The RISK cohort

RISK is a prospective inception cohort study which enrolled pediatric CD patients at diagnosis at 28 sites in North America. RISK includes 946 treatment naïve newly diagnosed CD patients that were all classified as having an inflammatory B1 phenotype without B2 (stricturing/narrowing) or B3 (internal penetrating behavior) complications at diagnosis. Change in disease behavior from B1 inflammatory to either B2 stricturing or B3 penetrating behavior was recorded during follow up. For the purpose of our study, patients were stratified based on the Paris age-of-diagnosis classification, where diagnosis prior to age 10 years is defined as A1a, and age 10–16 years is defined as A1b. The younger A1a were further subdivided to very early onset (VEO, <6 years7), and to early onset (EO) (6–9 years).7,8

Table 1 shows demographic and clinical characteristics categorized by age-of-diagnosis. Of the 946 CD patients in the RISK cohort, only 36 (4%) were classified as VEO (<6 years) CD, 196 (21%) as EO younger CD (6–9 years), and the rest (75%) as EO older CD (10–16 years). Clinical severity defined using the Pediatric Crohn's Disease Activity Index (PCDAI) was not significantly different between the groups at diagnosis. Perianal involvement was significantly higher in older CD. As previously reported,6 sero-positivity for anti-GM-CSF and ASCA increased significantly across the age-of-diagnosis classifications. There were no significant differences in baseline PCDAI, early anti-TNF exposure that was previously associated with lower rate of penetrating (B3) complications,18 and PCDAI six months after diagnosis between older and younger CD cases. However, there was a significantly higher prevalence of penetrating complications (B3) during 3 years of follow up in older vs. younger CD (Table 1).

Based on those differences between older (10–16 years) and younger EO CD, the relatively low number of VEO cases, and the likely association between the VEO cases and the monogenic IBD type, we included hereafter only patients older than 6 years in the mucosal transcriptomics and microbial analyses. Ileal mucosal transcriptomics and microbial characterization were done on a representative sub-group of ileal CD (iCD, n = 198), colon only CD (cCD, n = 56), and non-IBD Ctl (n = 50) (Table 2 and Suppl. Table 1). cCD patients are subjects who met diagnostic criteria for CD but lacked ileal inflammation on endoscopy. This unique large patient population and its biospecimens offered an opportunity to directly test for the association between the presence of mucosal inflammation, host mucosal genes and pathways, and mucosal microbial composition by performing age-dependent analyses in treatment-naive patients.

Decreased expression of α-defensins in older iCD

We identified 365 genes (Suppl. Table 2 and Fig. 1) that were significantly differentially expressed between older and younger iCD. We have previously reported the identification of a core iCD gene expression signature.19 Interestingly, the α-defensin genes (DEFA5 & DEFA6) and REG3A were not included in the core iCD list (Fig. 1a), as these were only suppressed in older iCD compared to both younger iCD and Ctl (Fig. 1c). Results of unsupervised clustering using the 365 differentially expressed genes are shown in Fig. 1b with the indicated location of α-defensins. 51 of 148 (34.5%) older CD and 5 of 48 (10.4%) younger CD (chi-square p = 0.0013) are in block 1, while 20 of 148 (13.5%) older and 16 of 48 (33%) younger CD (chi-square p = 0.002) are in block 2b(ii). DNASE1, which was negatively associated with abnormal Paneth cell morphology in iCD,29 showed a similar pattern of lower expression in older iCD as the α-defensins and REG3G/A, in contrast to the stable increased expression of lysozyme (LYZ, Fig. 1c and Suppl. Table 2). Consistent with this, a reduced frequency of alpha-defensin 5 positive epithelial cells was detected in ileal biopsies from older CD patients, compared to younger CD patients and controls (Fig. 1d). Unsupervised hierarchical clustering analysis identified groups of biopsies with similar ileal gene expression profiles. Unsupervised clustering using the top third most differentially expressed genes (121 of 365 genes) demonstrated that patients diagnosed at age 6, 7, 8, and 9 years clustered together and separately from patients who were diagnosed at age 10, 11, 12, 13, 14, 15, and 16 years (Fig. 1e and Suppl. Fig. 1).
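The block-membership comparisons above are 2×2 contingency tests. As a worked sketch, the Pearson chi-square statistic can be computed from the reported counts using only the standard library. Whether a continuity correction was applied is not stated in the text, so the uncorrected statistic is an assumption here, and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p). With 1 degree of freedom the upper-tail probability of
    the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for observed, row, col in (
        (a, row1, col1),
        (b, row1, col2),
        (c, row2, col1),
        (d, row2, col2),
    ):
        expected = row * col / n
        chi2 += (observed - expected) ** 2 / expected
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Counts from the text: rows are older / younger CD,
# columns are "in block" / "not in block".
block1_chi2, block1_p = chi_square_2x2(51, 148 - 51, 5, 48 - 5)
block2_chi2, block2_p = chi_square_2x2(20, 148 - 20, 16, 48 - 16)
```

Under the no-correction assumption, both p-values come out on the order of those reported (0.0013 and 0.002).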

Fig. 1

Decreased epithelial Paneth cell α-defensins signature in pediatric iCD ≥10 years at diagnosis. a Venn diagram shows an overlap between the previously reported 1340 core iCD gene signature19 and the 365 genes that were differentially expressed in the ileum between older (A1b) and younger (A1a) iCD (FDR correction [0.05], fold-change ≥1.5). 124 of these 365 genes including α-defensins and REG3A were not included in the core iCD gene list. b Unsupervised hierarchical clustered heat map of the 365 differentially expressed genes between older and younger iCD with upregulated genes in red and down-regulated genes in blue. Above the heat map, younger Ctl (light blue), older Ctl (dark blue), younger CD (orange), and older CD (red) samples are indicated. c TPM ileal gene expression is shown for the indicated genes for the indicated groups stratified by age-of-diagnosis with Kruskal-Wallis test with Dunn's multiple comparisons. d Immunohistochemistry in representative CD patients and non-IBD controls. Relatively high DEFA5 staining is shown for older non-IBD control (i) and younger CD (ii) that correlated with transcripts per million (TPM) values of >10,000 and 6143 for gene expression by RNASeq, respectively, and relatively low DEFA5 staining in older CD that correlated with a TPM value of 1239 for gene expression by RNASeq. Images were captured using an Olympus BX51 light microscope and digitally recorded at ×20 magnification. e Averaged and unsupervised hierarchical clustering heat map of 121 of 365 (top third differentially expressed genes between older and younger iCD) stratified by yearly age-of-diagnosis as indicated. Number of patients for each age group is indicated adjacent to the heat map

Enhanced ileal immune responses in older iCD

We next tested whether the ileal transcriptome of older iCD exhibited mucosal immunologic maturation represented by a Th1 gene expression profile compared with younger iCD. Functional annotation enrichment analyses using ToppGene24 and ToppCluster25 were used to map groups of related genes within the 365 gene signature to biologic functions and immune cell types (Fig. 2a, Suppl. Table 3 and 4). Genes up-regulated in older iCD were notable for higher expression of an inflammatory-related signature including GM-CSF (CSF2), IFNγ, matrix metalloproteinases, and collagens (Suppl. Table 2). This gene signature showed functional annotation enrichment (Fig. 2a and Suppl. Table 4) for genes induced in response to molecules of bacterial origin (P < 4.82E−07) including several monocyte-derived pro-inflammatory cytokines and the Th1-related cytokine IFNγ. Additional functional annotation enrichment was noted for genes expressed by granulocytes (P < 2.65E−06), B cells (P < 2.72E−04), and lymphoid stromal cells (P < 8.1E−14, Fig. 2b). The upregulated genes also showed a remarkable increase for genes encoding extracellular matrix (ECM) and extracellular matrix-associated proteins (P < 4.95E−16), extracellular space (P < 1.12E−26), and genes defining epithelial-mesenchymal transition (P < 4.51E−15). The functional annotation enrichment analyses included enrichment for genes associated with a specific drug or supplement, with one of the top enrichments for curcumin (P < 1.15E−9, Suppl. Table 4). We also noted enrichment for collagen genes (P < 1.05E−05) including COL12A1, COL1A1, COL3A1, COL6A3, COL7A1, and COL8A1. Of note, this age-dependent gene signature was specific, as several other mucosal inflammatory genes including the NADPH oxidase DUOX2, FOLH1, REG1A, and SAA2, that we previously identified as part of a core iCD gene signature,19 were upregulated to the same degree in older and younger iCD and are not included in the 365 gene list. 
These results defined a maturation of components of both the innate and Th1 adaptive ileal gene signature in pediatric CD patients with older age-of-diagnosis.

Fig. 2

Enhanced ileal innate and adaptive Th1 immune responses in pediatric iCD ≥10 years at diagnosis. a Top functional annotation enrichment analyses using Toppgene/ToppCluster25 platforms of the 171 upregulated and differentially expressed genes between older and younger iCD and Cytoscape26 was used for visualization. b Top functional annotation enrichment analyses using Toppgene/ToppCluster25 platforms using the 194 downregulated and differentially expressed genes between older and younger iCD and Cytoscape26 was used for visualization

A similar approach was applied to the downregulated genes between older and younger iCD (Fig. 2b). These analyses identified decreased expression of genes associated with brush border localization (P < 5.23E−24), digestion (P < 1.65E−13), lipid metabolic processes (P < 1.58E−08), and vitamin digestion and absorption (P < 1E−07) in older iCD. Additionally, we found that 32 of the 365 genes had a FC ≥1.5 in older vs. younger Ctl (Suppl. Table 5), including LCT, which showed a ≥2-fold decrease in expression in the older vs. younger groups. However, only 2 passed the FDR-corrected P-value threshold of ≤0.05.

Decreased α-defensins expression in older CD is associated with Th1 IFNγ and inflammation but not with NOD2 or ATG16L1 genotype

Our cohort also included cCD patients, who met diagnostic criteria for CD but lacked ileal inflammation on endoscopy. This unique patient population offered an opportunity to directly test for the association between mucosal inflammation and gene expression by performing age-dependent analyses. Unsupervised clustering using specifically the 365 genes that were differentially expressed between older and younger iCD demonstrated that while patient groups first clustered based on disease phenotype (CD and Ctl), older cCD clustered with younger iCD and cCD (Fig. 3a). α-defensin gene expression was not significantly decreased in the ileum of older cCD (A1b) in comparison to younger cCD or Ctl in a univariate non-parametric t-test. 84 (Suppl. Table 6) of the 365 genes differentially expressed between older and younger iCD showed a FC > 1.5 in the ileum of older vs. younger cCD, including DEFA5, but none passed the FDR < 0.05 threshold. To capture differences between older CD with and without clinical ileal inflammation (iCD vs. cCD), we identified 1135 genes that were significantly differentially expressed between older iCD and cCD (Suppl. Table 7). Those 1135 included CSF2 and IFNG, which were up-regulated in all CD forms in comparison to Ctl, with further increased expression in older iCD. Of note, no genes were differentially expressed between younger iCD and cCD, showing that genes such as CSF2 and IFNG were associated with overt mucosal inflammation specifically in the older age group. We next asked whether α-defensin expression would be associated with NOD2 or ATG16L1 genotype. We did not observe a significant difference when we compared DEFA5 expression between patients stratified by ATG16L1 or NOD2 (Fig. 3b and c) risk allele carriage.

Fig. 3

Decreased epithelial Paneth cell signature in pediatric CD ≥10 years at diagnosis is associated with clinical ileal inflammation. a Unsupervised averaged hierarchical clustered genes heat map of 365 genes differentially expressed between older and younger iCD is shown for each clinical sub-group [Ctl, cCD, and iCD divided to <10 and ≥10 years]. b, c DEFA5 TPM ileal gene expression is shown for the indicated younger and older CD age-of-diagnosis groups as indicated, stratified by their ATG16L1 (b) or NOD2 (c) genotype. R; risk allele (homozygote or heterozygotes), nn; no risk allele, nR; heterozygote for the risk allele. RR; homozygote for the risk allele

Th1-related IFNγ was suggested to suppress human α-defensin gene expression,30 while α-defensin expression was shown to specifically inhibit Th1-related IFNγ inflammation in a murine model.31 We identified a significant negative association between IFNG and DEFA5 gene expression (Pearson r = −0.33, P < 0.0001) in our cohort. Collectively, these results characterized an unexpected age-dependent downregulation of α-defensin genes within pediatric CD which was in turn associated with increasing IFNG expression and intensified extracellular matrix and collagen expression in patients with older age-of-diagnosis.
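For readers unfamiliar with the statistic, the Pearson coefficient used for the IFNG and DEFA5 association can be sketched as follows. The function name and the expression values are hypothetical and invented for illustration; they are not study data, and the text does not state whether raw or log-transformed TPM values were correlated.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Toy TPM values for five hypothetical biopsies (not study data).
ifng_tpm = [1.0, 2.0, 3.0, 4.0, 5.0]
defa5_tpm = [9000.0, 7000.0, 8000.0, 3000.0, 2000.0]
r = pearson_r(ifng_tpm, defa5_tpm)  # negative: DEFA5 falls as IFNG rises
```

A negative coefficient, as in the toy data, indicates that higher IFNG expression tends to accompany lower DEFA5 expression; it does not by itself establish the direction of regulation.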

Microbial dysbiosis in pediatric CD is already established in the younger group and lacks systematic changes observed in non-IBD Ctls

We have previously described a pediatric CD-associated ileal dysbiosis.19,20 Variation in microbial samples was compared by using Principal Coordinate Analysis (PCoA) with Bray-Curtis distance and samples were colored by either the Chao1 α-diversity (richness within a sample) or Paris age (Fig. 4a). The frequency of younger and older CD with PC1 values >0 was not significantly different between the two groups (41% of the younger CD and 45% of the older CD). Differences in α-diversity were the main driver of variation rather than age or diagnosis, as was previously observed. Further, α-diversity was significantly reduced in older CD (CD A1b) vs. Ctl of any age. To capture age-dependent differences we applied a multivariate statistical framework (MaAsLin)20 to Ctl samples (n = 177, n = 47 for younger and n = 130 for older) and CD samples (n = 272, n = 64 for younger and n = 208 for older), respectively. We were able to detect significant differences in microbial abundances between older vs. younger Ctl (≥10 years vs. <10 years) while controlling for body mass index (BMI), ethnicity, gender, and NOD2 and ATG16L1 risk allele carriage (Fig. 4c). This included a significant decrease of Enterobacteriaceae, Parabacteroides distasonis, Streptococcus, Veillonella, Gemellaceae, Lachnospiraceae, and Enterococcus in older Ctl. Applying the same analysis to CD samples, while additionally controlling for antibiotic usage and ileal deep ulcers, yielded no significant associations. Collectively, these analyses demonstrated no systematic changes in CD across age-of-diagnosis, suggesting that CD-associated dysbiosis was already established at diagnosis in the younger CD patients.

Fig. 4

Age-associated shifts in the ileal microbial community composition detected in non-IBD controls are not present within CD. a PCoA with Bray−Curtis distance comparing microbial community diversity in samples from CD patients (n = 272) and Ctl (n = 178). Left panel, samples are colored by the Chao1 diversity index. Right panel, samples are colored by <10 and ≥10 years. Triangular shape indicates Ctl, filled circles indicate CD samples. b Mean and standard deviation of Chao1 α-diversity is shown for Ctl and CD divided in younger <10 and older ≥10 years. *P < 0.01, a two-sided t-test was used. c The bar graph shows fold change (mean older A1b Ctl/mean younger Ctl) for significant associations between the indicated taxa as determined by MaAsLin while taking Paris age, gender, and body mass index (BMI) into account

Host gene expression differences associated with ileal microbial community composition

We next applied MaAsLin19,20 in order to test specifically for associations between age-dependent CD host gene expression and specific microbial taxonomy factor (n = 229). We included in the analysis differentially expressed genes from selected representative pathways (Figs. 1 and 2) including DEFA5, GSTA1, and LCT that were downregulated and CSF2, CXCR1, IFNG, and MMP1 that were upregulated in older vs. younger CD. We also included clinical phenotype (Ctl, CD), endoscopic severity (ileal deep ulcers), clinical severity (PCDAI), age-of-diagnosis (≥10 years vs. <10 years), gender, BMI, recent exposure to antibiotics, and NOD2, and ATG16L1 risk allele carriage. Overall, we identified 38 significant associations for clinical phenotype, and 5 significant associations with host gene expression (Fig. 5).

Fig. 5

Co-variation of the ileal microbial community structure with host gene expression. a The bar graph shows fold change (mean CD/mean Ctl) for significant associations between the indicated taxa and clinical phenotype (Ctl, CD) as determined by MaAsLin while taking Paris age, gender, body mass index (BMI, as a measure of nutritional status), endoscopic severity (deep ulcers in ileum), clinical severity (Pediatric Crohn’s Disease Activity Index, PCDAI), antibiotics, ileal gene expression of CSF2, CXCR1, IFNG, MMP3, DEFA5, GSTA1, and LCT, and NOD2, and ATG16L1 IBD risk allele carriage into account. b Scatter plots are shown for significant associations between the indicated taxa (y-axis) and host gene expression (x-axis) based on the multivariate statistical analysis described in a

The bar graph (Fig. 5a) illustrates fold change differences between CD and Ctl taking into account the clinical factors and host gene expression. As was previously reported by us19,20 and others,32 CD phenotype showed a decrease in taxa from the Firmicutes phylum including Faecalibacterium prausnitzii, Lachnospiraceae, and Ruminococcaceae OTUs and an increase in taxa from the Proteobacteria and Fusobacteria phyla including Pasteurellaceae, Campylobacteraceae, Enterobacteriaceae, and Fusobacteriaceae organisms. Further, we determined a positive association between Bacteroides abundance and α-defensin expression and a negative association between Acinetobacter and α-defensin expression. Finally, we noted a negative association between Lachnospiraceae abundance and IFNγ expression and between Bacteroides fragilis and MMP1 expression. No significant associations were captured for CSF2, CXCR1, GSTA1, and LCT host gene expression. Altogether, these results demonstrated that while some of the age-dependent host gene expression dynamics were associated with specific microbial taxa, gene expression differences were largely not associated with systematic microbial shifts within CD age-of-diagnosis.

Discussion

Epidemiologic associations and clinical phenotype, as well as natural history, differ across CD age-of-diagnosis.1,2,3,4 Using the RISK cohort, we were able to show that while there were no significant differences in baseline clinical severity (PCDAI), early exposure to anti-TNFα,18 and PCDAI six months after diagnosis between older (A1b) and younger (A1a) CD, the prevalence of penetrating complications (B3) during follow up was significantly higher in older CD (Table 1). By using whole transcriptome mRNA-seq analyses, we were able to identify 365 genes that were differentially expressed between older and younger iCD, with older iCD showing an increased Th1-related IFNγ profile associated with amplified inflammatory activation including an enriched innate myeloid and lymphoid stromal cell signature, and enhanced extracellular matrix and collagen signatures. Remarkably, this signature for enhanced immune activation occurred in the older iCD group in association with a specific reduction in epithelial Paneth cell α-defensin expression, and was not associated with systematic changes in the local mucosal microbial communities across CD age-of-diagnosis (Summarizing cartoon, Fig. 6). We therefore suggest that age-dependent ileal host inflammatory intensification and depression of α-defensin genes are largely intrinsic to the mucosal dynamics and are not associated with a systematic local microbial community shift.

Fig. 6

Summarizing cartoon. Using high throughput mRNA and 16S rRNA amplicon sequencing of ileal biopsies from treatment naïve, newly diagnosed CD cases, we identified 365 genes that were robustly differentially expressed between older iCD (10–16 years) and younger iCD (<10 years). Those differentially expressed genes showed an increased Th1-related IFNγ profile associated with amplified innate myeloid inflammatory activation, and enhanced extracellular matrix and collagen signatures in older iCD cases. Remarkably, these signatures for enhanced immune activation were associated with a specific reduction in epithelial Paneth cell α-defensin expression in older iCD, and were not associated with systematic changes in the local mucosal microbial communities

While a reduction in α-defensins was previously described in ileal CD, there was an ongoing controversy as to whether NOD2 and ATG16L1 genotype and/or the presence of inflammation is the primary driver of low α-defensin levels.16,33,34 We detected preservation of DEFA5/DEFA6 expression in the ileum of younger iCD patients similar to Ctl levels, and a significantly reduced expression in older iCD patients. We specifically show that older age-of-diagnosis is associated with decreased expression of Paneth-cell associated α-defensins and that these age-related differences cannot be explained solely by either NOD2 or ATG16L1 genotype. We did not, however, identify similar decreased expression of TCF4,35 LRP6,36 and TCF737 (also known as TCF-1), which were previously linked to α-defensin expression in iCD. DNASE1, which was negatively associated with abnormal Paneth cell morphology in iCD,29 showed a similar pattern of lower expression in older iCD as the α-defensins and REG3G/A, in contrast to the stable increased expression of lysozyme (LYZ), which was one of the differentially expressed core iCD genes19 but not included in the 365 genes (Suppl. Table 2). We did identify a negative association between IFNγ and α-defensin expression, consistent with the previously reported bi-directional negative regulation of these genes.31 Th1-related IFNγ was suggested to specifically repress α-defensin gene expression by leading to Paneth cell extrusion and subsequent cell death.30

Early life gut microbial maturation and its implications in health and disease were previously characterized.38 By the end of the third year of life, the microbiome composition evolves toward an adult-like configuration.39 However, a specific understanding of the microbiome of older pediatric cohorts (>3 years) and pre-adulthood (<18 years) has been limited,40,41 whereby the largest cohort included 62 stool samples from residents of the United States.39 Our cohort19,20 includes 178 Ctl with ileal mucosal microbial composition. We were able to show that dynamic microbial changes occur in the ileum of older Ctl cases in comparison to younger Ctl cases, with no similar significant association detected in older vs. younger CD patients. Collectively, these data show that the CD dysbiosis, characterized by increased abundance of pro-inflammatory taxa including taxa from the Enterobacteriaceae, Pasteurellaceae, Veillonellaceae, and Fusobacteriaceae families, and decreased abundance of anti-inflammatory taxa from the Erysipelotrichaceae, Ruminococcaceae, and Lachnospiraceae families, was largely independent of CD age-of-diagnosis.

To test for associations between the microbiota and age-dependent ileal gene expression, we performed multivariate analysis that included Ctl versus CD clinical phenotype, age-of-diagnosis, and host genes that were differentially expressed in older vs. younger CD. Indeed, we identified significant associations between expression of the Th1 gene IFNγ and the tissue remodeling gene MMP1, both upregulated in older CD, and the abundance of Lachnospiraceae and Bacteroides fragilis OTUs, respectively. We also identified a significant positive association of Bacteroides abundance, and a negative association of Acinetobacter abundance, with expression of the antimicrobial gene DEFA5. Importantly, DEFA5-transgenic mice also showed a significant increase in taxa from the Bacteroidetes phylum.15 These data demonstrate that while some of the mucosal gene expression differences were associated with the abundance of specific microbial taxa, much of the robust age-dependent maturation of the mucosal immune gene signature is likely driven by host mucosal factors. However, as changes in the intestinal microbiota are defined further, specific strains may be found to play important roles in mucosal gene expression. For instance, one Faecalibacterium prausnitzii OTU was found to be decreased in CD, whereas other F. prausnitzii OTUs are increased, suggesting potential strain-specificity.42

Our study has several strengths, but also some limitations. Although it is reasonable to assume that puberty influences the observed gut maturation, we lacked consistent Tanner Stage information to specifically address this hypothesis. However, the clinical practice approach, namely the Paris classification, is specifically based on age-of-diagnosis and not on Tanner Stage. Future studies will need to test whether Tanner Stage and puberty-related hormones can further define primary pathways driving mucosal gene expression maturation in association with disease onset at specific ages. Because we required OTUs to occur at a relative abundance of at least 0.01% across all samples, it remains possible that a rare immunomodulatory organism might explain the differences between the younger and older groups. Moreover, we lacked power to test for differences in mucosal gene expression between subsets of younger and older patients stratified by measures of microbial diversity. Additionally, we profiled whole biopsies, which comprise a mixture of cell types, rather than using single-cell transcriptomics. However, for capturing the overall pathogenic process, and as a potential future diagnostic/prognostic tool, there are also substantial advantages in using whole mucosal biopsies, which are the basis for diagnosis and follow-up in the clinical setting. Another limitation is the inability to replicate these results in an independent, large, treatment-naïve human cohort. RISK is the largest treatment-naïve inception cohort, involving 28 sites in the US, and other such large human cohorts are not yet available. However, there are several murine and mechanistic studies supporting our findings.
An age-specific response to enteric Salmonella infection, with developmentally regulated intestinal expression of IFN-γ and its target genes, has already been noted.13 Paneth cell degranulation does not occur directly upon stimulation with microbial antigens or bacteria, but IFN-γ induces rapid and complete loss of Paneth cell granules, coupled with induction of apoptosis, luminal extrusion, and death of Paneth cells.30 Finally, similar to our findings, Firmicutes abundance was significantly lower and Bacteroidetes abundance was significantly higher in DEFA5 transgenic (+/+) mice than in controls.15

In summary, our data identify important clinical and biologic differences between older and younger pediatric CD. We suggest that the pediatric A1b CD phenotype and mucosal gene expression signature are likely very similar to those of adult (A2) CD,16,43,44 and different from those of the younger pediatric age group. Interestingly, in mice and humans the initial formation of Peyer’s patches occurs before birth in a relatively sterile environment and hence is likely independent of the microbiota.45 Consistent with this observation, our data suggest that much of the age-dependent gene expression signature in pediatric CD (≥10 years) is intrinsic to the host ileal mucosa, and not primarily associated with systematic shifts in the local microbial community. These specific host and microbial factors may offer the potential to tailor future age-based therapy.

Transcript profiling

The ileal RNASeq data have been deposited in the GEO repository under accession number GSE101794.