Gene set enrichment analysis of the bronchial epithelium implicates contribution of cell cycle and tissue repair processes in equine asthma

Severe equine asthma is a chronic inflammatory condition of the lower airways similar to adult-onset asthma in humans. Exacerbations are characterized by bronchial and bronchiolar neutrophilic inflammation, mucus hypersecretion and airway constriction. In this study we analyzed the gene expression response of the bronchial epithelium within groups of asthmatic and non-asthmatic animals following exposure to a dusty hay challenge. After challenge we identified 2341 and 120 differentially expressed genes in asthmatic and non-asthmatic horses, respectively. Gene set enrichment analysis of changes in gene expression after challenge identified 587 and 171 significantly enriched gene sets in asthmatic and non-asthmatic horses, respectively. Gene sets in asthmatic animals pertained, but were not limited, to cell cycle, neutrophil migration and chemotaxis, wound healing, hemostasis, coagulation, regulation of body fluid levels, and the hedgehog pathway. Furthermore, transcription factor target enrichment analysis in the asthmatic group showed that transcription factor motifs with the highest enrichment scores for up-regulated genes belonged to the E2F transcription factor family. It is postulated that engagement of hedgehog and E2F pathways in asthmatic horses promotes dysregulated cell proliferation and abnormal epithelial repair. These fundamental lesions may prevent re-establishment of homeostasis and perpetuate inflammation.

Gene set enrichment analysis. Within the asthmatic group, 587 gene sets were significantly enriched and linked by 18,777 edges while in non-asthmatic horses 171 gene sets were significantly enriched and linked by 2326 edges. Results of GSEA (FDR < 0.05, p < 0.001) were visualized using the Enrichment Map plugin available for Cytoscape (Figs 3 and 4, detailed results in Supp. . In Fig. 3, linked gene sets present only in horses with asthma are shown. Gene sets involved in cell cycle regulation dominated, and genes involved in mechanisms such as inflammation/immune-response, metabolism, extracellular matrix (ECM) degradation, and protein translation and processing ( Fig. 3; Suppl. Tables 3-4) were also enriched. Such gene sets showed high redundancy as indicated by tight clusters of nodes (Figs 3 and 4, red and blue squares), in particular for cell cycle, Toll-like receptor (TLR) pathway, wound healing, glycosylation and other inflammatory and defense response gene sets. In horses without asthma, similar but fewer enriched gene sets were noted ( Fig. 4; Suppl. Tables 5-6). Enriched gene sets suggest concurrent expression of multiple genes in related pathways.

Enrichment of neutrophil migration and chemotaxis gene sets.
Within the asthmatic group, significantly up-regulated gene sets with lowest rank at max metric, suggesting higher likelihood of involvement in disease process, pertained to granulocyte and neutrophil chemotaxis and migration (Fig. 5, Table 1). Additional significant gene-sets were related to phases and components of the cell cycle (Table 2). In horses without asthma, gene sets with lowest rank at max included those connected to mitosis, cytoskeleton, protein binding and leukocyte migration and chemotaxis (Table 3). Gene sets linked to M phase and protein translation were identified as most highly significant ( Table 4).

Enrichment of cell cycle, hedgehog and hemostasis-related gene sets. Detailed analysis of cell
cycle phases identified significantly enriched gene sets involved in all phases except G1 within the asthmatic group following challenge, while only those in M phase were significantly enriched in the non-asthmatic group (Fig. 6). High enrichment of cell cycle gene sets in the asthmatic group correlated with significantly up-regulated expression of M-phase inducer phosphatase 1 (CDC25A) in horses with asthma (Suppl. Table 1). Forkhead box protein M1 (FOXM1) was significantly up-regulated only in horses with asthma (Suppl. Table 1). Expression of CDC25A and FOXM1 corresponded to enrichment in cell cycle gene sets (Fig. 3).
Multiple gene sets interacting with hedgehog (Hh) were identified as significantly enriched within the asthmatic but not the non-asthmatic group (Fig. 7): "hedgehog 'on' state" (FDR = 0.0172; REACT_268718.1),  , and in horses without asthma (B). Red dots represent the genes that differ significantly in horses with asthma before and after challenge (A), and in horses without asthma before and after challenge (B). Horizontal blue lines delineate 1-fold change. Significance set at FDR < 0.05. Differentially expressed genes can be observed in both groups, but a greater number were observed in horses with asthma. "hedgehog ligand biogenesis" (FDR = 6.14E −05 ; REACT_264605.1), "Hh ligand biogenesis disease" (FDR = 1.92E −05 ; REACT_263883.1) and "processing-defective hh variants abrogate ligand secretion" (fdr = 7.47e −05 ; react_264623.1). Two gene sets linked to promotion of Wnt signaling were also significantly enriched in only asthmatic horses: "Deletions in the axin genes in hepatocellular carcinoma result in elevated wnt signalling" (FDR = 2.10E −03 ; REACT_264286.1) and "Axin mutants destabilize the destruction complex, activating Wnt signaling" (FDR = 2.57E −03 ; REACT_264496.1) (Suppl. Table 3). Significant up-regulation of hypoxia-inducible factor 1α (HIF1α) and enrichment of the p53-hypoxia pathway (MSIGDB_C2) gene set was detected only in horses with asthma. Similarly, gene sets involved in "response to wounding and hemostasis" (Fig. 8), were significant only in horses with asthma. Complete lists of gene sets are in Suppl. Tables 3-6. iRegulon. Within the asthmatic group, a large number of significantly up-regulated genes were identified as potential targets of E2F transcription factors. The ten transcription factor motifs with the highest enrichment score in asthmatic and non-asthmatic horses are listed in Tables 5 and 6, respectively. Of note, E2F2, E2F3 and E2F8 were significantly up-regulated in asthmatic horses while only E2F1 was significantly down-regulated in non-asthmatic horses. Complete lists of targets for each of the transcription factor motifs are in Suppl. Tables 7-10. Each node (square) corresponds to a gene set either up-regulated (red) or down-regulated (blue) in response to asthmatic challenge. Edges (green lines) link sets with shared genes, and thickness of lines correlates with the number of genes in common between two sets. Only gene sets with FDR < 0.05 and p < 0.01 were included in visualizations; disconnected nodes and small clusters were removed. Clusters of gene sets involved in cell cycle, Toll-like receptor (TLR) pathways, wound healing, glycosylation and other inflammatory and defense responses are outlined in black circles.

Discussion
The goal of this study was to identify key pathways in the pathogenesis of asthma through identification and analysis of gene sets associated with the response to challenge in a group of asthmatic horses. This study built upon our previous work where bronchial biopsies from asthmatic and non-asthmatic horses were obtained before and after a challenge (dusty hay), and differences in gene expression between the two groups were determined 26 . In the current study, the gene expression response to an asthmatic challenge was analyzed for each group separately, and then gene sets related by biological process or regulation were identified. For this purpose, GSEA was applied to evaluate whether genes that were a priori assigned to a specific biological process (gene set) were associated with the asthmatic phenotype 25 . GSEA builds upon gene expression data derived from RNA-seq to cluster genes based on common functions, locations, pathways, interactions or other connecting properties. Enrichment Map software with the Cytoscape plugin was applied to visualize GSEA output and to facilitate interpretation of results. After differential expression analysis with edgeR, data from each group of animals were further analyzed to identify significantly enriched gene sets and their linked edges. Edges reflect genes shared within a gene set, with the thickness of an edge corresponding to the number of overlapping genes, and similar gene sets being clustered together 26 . The number of genes with expression that differed significantly within groups of asthmatic and non-asthmatic horses after exposure to challenge was 2,341 and 120, respectively. Analysis yielded 587 significantly enriched gene sets linked by 18,777 edges, and 171 gene sets linked by 2,326 edges, respectively, which was in approximate relation to the number of differentially expressed genes. The enrichment map highlighted many closely positioned groups of gene sets, reflecting a high degree of redundancy of genes within sets. In addition, one or several genes linked most gene sets, which implies that most pathways and networks responsible for specific features of asthma were related.
Neutrophil migration and chemotaxis gene sets. Neutrophil influx into airways is a hallmark of equine severe asthma 27 . Neutrophil and granulocyte migration and chemotaxis gene sets were ranked at max classification in asthmatic horses. Rank at max classification represents the maximum enrichment score (ES), which in turn is the maximal deviation from zero calculated for each gene descending the ranked list. Therefore, the rank represents the degree of over-representation of a gene set at the top or bottom of the ranked gene list. Enriched genes involved in neutrophil and granulocyte migration and chemotaxis were identified between positions 3 and 16 of the ranked list in asthmatic horses, yielding the highest score of all gene sets (Table 1). Hence, the top genes differentially expressed in the epithelium of horses after asthmatic challenge are highly enriched for neutrophil migration and chemotaxis, which is consistent with IL8 and CXCR2 previously identified as differentially expressed in asthmatic compared to non-asthmatic horses 28 . Cell cycle, hedgehog and hemostasis-related gene sets. Of particular interest was that gene sets affecting all phases of the cell cycle, except for part of the G1 phase, were significantly enriched in asthmatic horses while only those involved in M phase were significantly enriched in non-asthmatic horses. As apparent in Fig. 3, proximity visualization placed cell cycle gene sets very close together and linked them by a large number of edges, reflecting a high degree of redundancy. Detailed analysis of individual cell cycle phases ( Fig. 6) showed that gene sets in M, G0 and early G1, and G2 were most significant in asthmatic horses. Although the significance of relative differences between groups cannot be derived from this analysis, we hypothesize that processes in G0, early G1 and G2 phase are most altered in asthmatic relative to non-asthmatic horses because release of injurious mediators from granulocytes stimulates induction of cell cycling in the epithelium. In agreement with this hypothesis, CDC25A (ENSECAG00000016336), an M2-inducing phosphatase, was up-regulated and differentially expressed in asthmatic compared to non-asthmatic horses 28 . Altered cell cycle regulation was previously reported when peripheral blood mononuclear cells (PBMC) from asthmatic horses were exposed to hay dust extract 18 . However, that study assessed changes in PBMC, which are comprised of monocytes and lymphocytes, while here changes in the bronchial epithelium were investigated. Induction of genes associated with cell proliferation in both studies suggests that hay dust extract has a direct effect on PBMC (which may include previously in vivo sensitized lymphocytes), and that bronchial inflammatory cells or inhaled hay dust also stimulate epithelial cell cycling. Significant up-regulation of FOXM1, a regulator of G1/S and G2/M cell cycle transition, further confirmed that there was induction and progression of the cell cycle 29,30 . FOXM1 regulates CDC25A gene transcription by direct promoter binding and indirectly via the E2F transcription factor 31 . Previous work showed that patched 1 (PTCH1) was significantly down-regulated in horses with asthma compared to non-asthmatic horses 28 . Here, the Hh-related gene sets were enriched. Activation of Hh signaling through aberrant FOXM1 and GLI Family Zinc Finger 1 (Gli1) activation was observed in colorectal epithelial cells, and was considered essential for cell growth and proliferation 32 . Therefore, it is plausible that transient down-regulation of PTCH1 and activation of Hh signaling lead to persistent bronchial cell proliferation, which in turn may be mediated by FOXM1 and CDC25A activation.
It has been proposed that injury of terminal bronchioles in airway inflammation leads to luminal exudation of plasma and accumulation of fibrinogen, thrombin and mucus in airways, and in turn a pro-coagulant state 33 . Horses with exacerbated asthma were considered to be in a hypercoagulable state and to have systemic inflammation during active disease, with hypercoagulability persisting during remission 34 . We detected here significantly enriched gene sets associated with wound healing, hemostasis, blood coagulation and regulation of body fluid levels in the asthmatic group following challenge, lending support to linkage and importance of these pathways in equine asthma. Smooth muscle cells of the bronchioles are hypercontractile in human asthmatics 35 , and SRF and its co-factor MYOCD are thought to contribute to airway remodeling in severe equine asthma 19 . Several other genes previously identified as differentially expressed between asthmatic and non-asthmatic horses are involved in wound healing, coagulation, hemostasis and regulation of body fluids, including thrombospondin 1 (THBS1), oncostatin (OSM), pleckstrin (PLEK) and others 28 . Therefore, diverse and functionally overlapping sets of genes are altered in asthmatics, though their precise roles in pathogenesis remain to be defined.
iRegulon. Up-regulated genes in the asthmatic group were highly enriched in E2F-linked transcription factor-binding motifs: E2F8 was most significantly up-regulated, followed by E2F2 and E2F3. Members of the E2F transcription factor family are key players in cell cycle regulation 36 . E2F8 is thought to be a transcriptional repressor of DNA-damage responses in cancer 37 , and also to have opposing roles as oncogene or tumor suppressor under specific conditions 38 . E2F2 and E2F3 are transcriptional activators 36 . Nuclear overexpression of E2F3 has been associated with lung cancer 39 , and E2F2 was identified in non-small cell lung carcinoma in humans 40 . In non-asthmatic horses the transcriptional activator E2F1 was significantly down-regulated 28 . FOXM1 in part activates CDC25A through E2F transcription factor-pathways 31 . Interestingly, in humans, E2F1 inhibition was linked to reduced airway SMC proliferation 41 . Hence, data presented here suggest that E2F8, E2F2 and E2F3 regulate cell proliferation in epithelial cells of asthmatic horses. Transcription factor-binding motifs containing RFX were overrepresented among significantly down-regulated genes in both groups. Specifically, RFX2 was down-regulated in both groups while RFX5 was down-regulated in asthmatics only. RFX2 is involved in ciliogenesis 42 , while RFX5 regulates expression of MHC II genes 43,44 . These findings suggest that MHC II may have a limited role in the exacerbation phase of asthma and that ciliary regeneration may be impaired in equine asthma, as also described in human asthma 45 . Equine versus human severe asthma. Clinical features of severe equine asthma closely resemble those of low-Th2 late-onset human asthma. The pathogenesis of this particular human asthma phenotype remains poorly  understood, which precludes devising effective interventions. In this study of equine asthma, gene sets including but not limited to neutrophil migration and chemotaxis, Hh signalling, wound healing, hemostasis, coagulation and regulation of body fluid, were significantly enriched in the asthmatic group following challenge. Enrichment of neutrophil migration and chemotaxis gene sets is consistent with influx of neutrophils into airway lumens during exacerbation, and with differential expression of IL8 and CXCR2 in asthmatic and non-asthmatic horses 26 . Neutrophils are the prevailing luminal leukocyte in equine asthma, and also predominate in most late-onset human asthma 9,46 . Yick et al., using RNA-seq of endobronchial biopsies from human asthmatics and controls, found comparable differential expression of genes in networks concerning cellular movement, cell death and cellular morphology, genetic disorder, and cellular movement and development 47 . Following epithelial injury, inflammation and tissue repair mechanisms in the bronchial epithelium are activated. Resolution of inflammation is crucial for orderly epithelial tissue repair 48 , but over-expression of innate immune response, tissue repair, cell cycle, Hh signalling and other genes intimates that re-establishment of epithelial homeostasis fails in exacerbated asthma. Hence, we propose that chronic inflammation causes dysregulation of the epithelial cell cycle and of progenitor cell differentiation, which in turn promotes tissue remodeling and SMC proliferation characteristic of advanced asthma. Hh signalling modulates cell cycle and remodelling processes through expression of PTCH1 49 . While the Hh pathway is crucial for compartmentalization in the developing lung, additional roles in adult lung diseases were recently identified. SNPs close to the negative regulator hedgehog interacting protein (HHIP) on 4q31 were associated with decreased lung function and lower HHIP promoter activity in COPD patients 50 . In humans with chronic lung fibrosis, Hh was expressed specifically at affected epithelial sites, and the Hh receptor PTCH1 on infiltrating leukocytes 51 . SNPs in PTCH1 and HHIP regions were associated with decreased lung function in asthmatic patients 52 . Together, these results support the concept that impaired Hh signalling in horses and humans may be a causal factor in pathologic airway remodelling through dysregulation of cell cycle and differentiation. However, the Hh pathway is unlikely activated in isolation, and the Wnt pathway is a possible candidate also linked to asthma susceptibility. In a genome-wide association study (GWAS) of asthmatic humans the Wnt signalling pathway was significantly enriched, and multiple SNPs near Wnt interacting genes were strongly associated with asthma susceptibility 53 . The Wnt/catenin pathway has been considered to regulate fibrosis and SMC remodelling in a mouse model of asthma 54 . In this study, two Wnt signalling gene sets were significantly enriched in asthmatic horses following challenge. Since Wnt and Hh signalling pathways co-exist and interact in diseases such as colon cancer, it is conceivable that both may contribute to aberrant epithelial repair and concomitant stromal cell abnormalities in asthma 55 .   Finally, based on target enrichment analysis, we suggest that E2F transcription regulates gene expression in the asthmatic lung. E2F2, E2F3 and E2F8 were significantly up-regulated in asthmatic horses only, while E2F1 was significantly down-regulated in non-asthmatics. E2F transcription factors are crucial in regulating cell cycle 56 , lung development 57 and transcriptional programs for inflammation, immunity, metabolism and stress-response 58 . E2Fs factors can have redundancy in function through either promotion or repression of transcription, sometimes shifting function under different conditions, such as different development stages 59 . For instance, E2F8, a known transcriptional repressor 60,61 can suppress E2F1 62 but is overexpressed in lung cancer 63 . Overall, E2F transcription is complex and particular transcription factors have not yet been directly linked to human or equine asthma. Chromatin immunoprecipitation and protein-protein interaction assays with specific E2Fs antibodies could more specifically clarify their roles in asthma.

Conclusion
We assessed changes in gene expression in the bronchial epithelium within groups of asthmatic and non-asthmatic horses after a dusty hay challenge. In asthmatic horses, we observed enrichment of 587 gene sets and transcription factor targets linked to activation of cell cycle and inflammation/immune response programs. A small number of cell cycle gene sets were also significantly enriched in non-asthmatics, but restricted to M phase of the cell cycle. These findings point at mechanisms underlying impaired regeneration of the epithelial barrier in asthmatics, and may be pivotal events leading to subepithelial tissue remodeling and eventual irreversible lung parenchymal changes. Furthermore, gene sets and transcription factor targets affecting cilia were significantly enriched among down-regulated genes in both groups. Although interesting, the biological significance of this findings is unknown and awaits further investigation. Additional time-points would have yielded a more dynamic representation of tissue response, and might have more precisely defined the divergence of physiological from pathological epithelial changes. Such data might also allow analysis of dynamic changes in specific pathways. Tissue factor motifs identified as enriched were consistent with gene set changes, but the biological role of their cognate binding partners needs to be confirmed with functional assays.

Animals and procedures.
Overall study design is outlined in Fig. 1. Animal procedures, sample collection and sample processing in this study were performed as previously described by Tessier et al. 28 . Briefly, six horses with asthma in remission and seven horses without asthma (mean ages of 15 and 12 years (p = 0.352, unpaired t test, respectively) were placed in a dust-free indoor environment for 24 hours before exposure to dusty hay until respiratory impairment was apparent in asthmatic horses (range 1 to 3 days, average 2.2 days). Non-asthmatic horses were exposed to dusty hay for 3 days. Physical examination, pulmonary function test (PFT) and bronchoalveolar lavage (BAL) were performed before and after exposure to the asthmatic challenge, and endoscopic bronchial biopsies were obtained from a contralateral lung lobe. All procedures were approved by the Institutional Animal Care Committee of the University of Guelph (protocol R10-031) and conducted in compliance with Canadian Council on Animal Care guidelines.
RNA-Seq sample preparation and analysis. RNA extraction, RNA-seq library generation and sequencing were performed as previously described 28 . Briefly, total RNA was extracted from endobronchial biopsies (Qiagen, Toronto, ON). RNA quality and concentration was determined with the Bioanalyzer RNA Nanochip (Agilent, ON) and capillary electrophoresis. RNA-seq unstranded library preparation and sequencing were performed at The Centre for Applied Genomics (TCAG; Toronto, ON) using the Illumina TruSeq RNA sample preparation and sequencing protocols (Illumina, San Diego, CA). For each sample, approximately 1 μg of non-degraded, high quality total RNA was enriched for poly-A RNA, fragmented into 200 to 300 bases, and  65 . The STAR_pass2 alignment protocol was followed including these adaptations: horse Ensembl version 70 GTF annotation file for first-and second-pass, and the junction SJ.tab file generated by STAR for the second-pass after non-canonical junctions were removed. Default settings were used except for:-runThreadN 8-outFilterScoreM-inOverLread 0.5-outFilterMatchNminOverLread 0.5. Read counts were generated from STAR alignment files using HTSeq version 0.6.1p1 66 with settings -s no -f bam -r name.
Statistical Analysis. Differential expression (DE) analysis was performed in R, version 3.2.1 (www.r-project. org), with the edgeR package version 3.10.2 67-69 as previously described 28 with exception of the applied contrast. Paired DE analysis was performed to assess changes within groups (before versus after asthmatic challenge in asthmatics and non-asthmatics separately). The edgeR analysis was based on section 3.5 of the edgeR user's guide (last revised April 10, 2017). The model matrix was designed as followed: ~group + group:horse + group:challenge, Figure 8. Curves of GSEA enrichment scores for wound healing and hemostasis-associated pathways in horses with asthma. These gene sets were significantly enriched only in horses with asthma, suggesting higher activity of pathways related to these processes. See caption of Fig. 5 Table 6. Most enriched transcription factor motifs among significantly up-and down-regulated genes in horses without asthma following challenge. ^N ormalized enrichment score. *Transcription factors significantly upregulated following challenge. # Transcription factors significantly down-regulated following challenge.
where "group" refers to non-asthmatic and asthmatic groups, "horse" refers to each horse, and "challenge" refers to samples before and after the asthmatic challenge. Fit of the generalized linear model and tests for differences in expression were performed with the "glmFit" and "glmLRT" functions, respectively. For the current study, GlmLRT(fit, coef = "groupNonAsthmatic:postChallenge"), and glmLRT(fit,coef = "groupasthmatic:postChallenge") were used to analyze asthmatic challenge effect within the non-asthmatic and asthmatic horse group (before challenge versus after challenge), respectively. Statistical significance was set at a false discovery rate (FDR) < 0.05.

Gene set enrichment, modules and network analysis. Ranked Gene Set Enrichment Analysis
(GSEA) was performed with software version 2.1.1 25,70 . The human gene set file excluded annotations with evidence codes -'IEA' (inferred from electronic annotation), 'ND' (no biological data available), 'RCA' (inferred from reviewed computational analysis) and was downloaded from: http://download.baderlab.org/EM_Genesets/ January_28_2015/Human/symbol/Human_GO_AllPathways_no_GO_iea_January_28_2015_symbol.gmt 71 . GSEA pre-ranked analysis (GseaPreranked) was performed using default settings except for "Collapse dataset to gene symbols" set to "False". Prior to analysis, a ranked list was calculated with each gene assigned a score based on the FDR and the direction of the log fold-change ("+" or "−"). Horse Ensembl IDs were converted to HUGO gene symbols. Non-matching symbols were enriched using the human orthologs when percent identity was above 80% for target and query sequence. Input gene and score lists for asthmatic and non-asthmatic horses are included in Suppl. Files 1 and 2, respectively. Gene sets identified as significant (FDR < 0.05, p < 0.001) with GSEA were visualized using the Enrichment Map plugin available for Cytoscape version 3.4.0 26,72 . Connected nodes only were included, and gene set clusters were summarized and labeled manually with the WordCloud plugin 73 for Cytoscape version 3.4.0 72 .
Transcription factor target enrichment. Transcription factor target enrichment among differentially expressed genes in asthmatic and non-asthmatic horses was performed with Cytoscape (v 3.4.0) in combination with the iRegulon plugin 74 . Analysis was conducted using the Homo sapiens database and default settings. Genes with FDR < 0.05 were included in the analysis, and up-and down-regulated genes were analyzed separately. iRegulon input for each of the groups is included in Suppl. Files 3-6. Transcription factors of interest with motif enrichment scores (NES) > 3 were selected for further analysis.
The datasets generated and analyzed in this study are available in the NCBI Sequence Read Archive at https:// www.ncbi.nlm.nih.gov/Traces/study/?acc=SRP106023.