Introduction

Allergic rhinitis tends to cluster with asthma (A) in multimorbidity1. Clinically, however, two rhinitis (R) phenotypes can be identified: (i) R alone, affecting around 70–80% of patients with R, and (ii) R in multimorbidity with A (R + A), affecting 20–30%. Conversely, the majority of patients with A have or have had R1. Furthermore, airway remodelling, a constant feature of A, simply does not exist in R. The clinical, immunological and genetic differences between monosensitisation (to one allergen) and polysensitisation (to more than one allergen), together with the link between polysensitisation and multimorbidity, further increase the heterogeneity of R. This suggests the existence of distinct molecular pathways in R + A and R alone2. Consequently, the concept of “one-airway-one-disease”, coined over 20 years ago, may be an oversimplification1,3.

Previous efforts to understand the links between R and R + A have focused on the atopic march sequence4. An alternative approach is to characterise the molecular mechanisms of these diseases and their interactions. A complete characterisation of cellular function can only emerge from studying how gene products interact with one another, forming a dense molecular network known as the interactome. The interactome can be defined as the representation of all interactions (regulatory, metabolic, physical, etc.) among the gene products present at a given time within a cell. This is where interactomics, a branch of systems biology, comes into play: it applies data mining and biostatistical methodologies (i) to identify molecular pathways and (ii) more generally, to provide a molecular context that facilitates the understanding of complex phenotypes. During the last decade, the analysis of the interactome has provided important insights into the inner operations of the cell under different conditions, including pathological ones5,6.

The MeDALL study, which aimed to unravel the complexity of allergic diseases, showed that the coexistence of eczema, R and A in the same child is more common than expected by chance alone, both in the presence and in the absence of IgE sensitisation. This suggests that these diseases share causal mechanisms7. A MeDALL in silico study suggested the existence of a multimorbidity cluster between A, eczema and R, and that type 2 signalling pathways represent a relevant multimorbidity mechanism of allergic diseases8. The in silico analysis of the interactome at the cellular level implied the existence of differentiated multimorbidity mechanisms between A, eczema and R at cell type level, as well as mechanisms common to distinct cell types9. A transcriptomics study of samples from MeDALL birth cohorts identified a signature of eight genes associated with multimorbidity for A, R and eczema7. In that study, the genes of R alone differed from those of R + A multimorbidity, without any overlap.

In this study, we used transcriptomic information obtained in MeDALL cohorts to compare the molecular mechanisms of R and R + A, assessing how the relationship between these diseases should be understood in a multimorbidity framework using an interactomics approach.

Materials and methods

Study design

We used the transcriptomics data from Lemonnier et al. obtained in MeDALL (Mechanisms of the Development of Allergy)10. The analysis comprised a cross-sectional study carried out in participants from three MeDALL cohorts using whole blood. It compared participants with a single allergic disease (asthma, dermatitis or rhinitis) or with multimorbidity (A + D, A + R, D + R, or A + D + R) to those without asthma, dermatitis or rhinitis and non-allergic participants. We characterised the molecular pathways associated with R alone and R + A using an interactomics approach.

Settings and participants

Three birth cohorts were used: BAMSE (Swedish abbreviation for Children, Allergy, Milieu, Stockholm, Epidemiology, Sweden), INMA (INfancia y Medio Ambiente, Spain) and GINIplus (German Infant Study on the Influence of Nutrition Intervention plus Air pollution and Genetics on Allergy Development, Germany)10.

Datasets

Differentially expressed genes (DEGs) for R alone and for R + A were obtained from the MeDALL gene expression study10,11,12,13. The full dataset is supplied in Supplementary Table S1.

The interactome

The first-degree interactomes of the DEGs for R alone and for R + A were independently generated using the IntAct database, which contains a curated collection of > 10^6 experimentally determined protein–protein interactions in human cells14. Data were downloaded via the IntAct web-based tool at the European Bioinformatics Institute (https://www.ebi.ac.uk/intact/). Ensembl Gene IDs were used instead of HGNC names to avoid ambiguities. Self-interactions and expanded interactions were discarded. Interactomes are supplied in Supplementary Table S2. Network density was calculated as the number of edges divided by the maximal possible number of edges. Random distributions were used to test the degree of interconnectedness of the interactomes. They were generated by random sampling of gene sets (of the same size) of each interactome over 10,000 iterations.
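The density calculation and the random-sampling test described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes an undirected network without self-loops (so that the maximal number of edges is n(n − 1)/2), a hypothetical background network supplied as a list of gene pairs in which each pair appears once, and a one-sided z-test against the sampled distribution. All function names are illustrative.

```python
import math
import random
from statistics import mean, stdev


def network_density(n_nodes, n_edges):
    """Edges divided by the maximal number of edges, n(n-1)/2 (assumed undirected)."""
    if n_nodes < 2:
        return 0.0
    return n_edges / (n_nodes * (n_nodes - 1) / 2)


def induced_edge_count(edges, nodes):
    """Count background edges whose two endpoints both fall in the node set."""
    nodes = set(nodes)
    return sum(1 for a, b in edges if a != b and a in nodes and b in nodes)


def density_z_test(edges, gene_set, iterations=10_000, seed=0):
    """Compare the density of gene_set with same-sized random gene sets.

    Returns (observed density, mean random density, z-score, one-sided P).
    """
    rng = random.Random(seed)
    universe = sorted({gene for edge in edges for gene in edge})
    size = len(set(gene_set))
    observed = network_density(size, induced_edge_count(edges, gene_set))
    # Null distribution: densities of random gene sets of the same size.
    null = [
        network_density(size, induced_edge_count(edges, rng.sample(universe, size)))
        for _ in range(iterations)
    ]
    mu, sigma = mean(null), stdev(null)
    if sigma > 0:
        z = (observed - mu) / sigma
    else:
        z = 0.0 if observed == mu else math.copysign(math.inf, observed - mu)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal
    return observed, mu, z, p
```

Under these assumptions, the ratio of the observed density to the mean random density is a fold difference of the kind reported in the Results, and the z-score and P value quantify how unusual the observed density is relative to random gene sets.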

Functional annotation

The interactomes were functionally annotated using the DAVID web-based tool15, with the Reactome database as the source of functional information16 and the default gene background. Functional pathways were considered significant at FDR < 0.05. To simplify the functional annotation and remove redundancy, pathways associated with diseased or defective cellular processes were removed, and we only considered pathways in the intermediate levels (levels 3 and 4) of the Reactome hierarchy. Furthermore, we assigned each pathway to a generic functional family (“Signal Transduction”, “Cell Cycle”, etc.) to help interpret the results. We did this by (1) clustering all the pathways according to their Szymkiewicz-Simpson overlap17; (2) identifying the best partition using the Pearson’s Gamma method18 implemented in the fpc R package; and (3) assigning each cluster in the best partition to a generic pathway (i.e. a Reactome pathway with > 800 genes) by means of a Fisher’s Exact Test19. Full functional annotation is available in Supplementary Table S3.
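Three of the quantities involved in this step are simple to state explicitly. The sketch below assumes that pathways are represented as plain sets of gene identifiers and writes the one-sided Fisher's exact test directly from the hypergeometric distribution. It is an illustration only: the study used DAVID and R packages, and the function names and argument layout here are hypothetical.

```python
from math import comb


def overlap_coefficient(a, b):
    """Szymkiewicz-Simpson overlap: |A ∩ B| / min(|A|, |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def fold_enrichment(hits, list_size, pathway_size, background_size):
    """Fraction of list genes in the pathway over the background fraction."""
    return (hits / list_size) / (pathway_size / background_size)


def fisher_exact_greater(hits, list_size, pathway_size, background_size):
    """One-sided Fisher's exact P: chance of seeing >= hits pathway genes in the list."""
    total = comb(background_size, list_size)
    upper = min(list_size, pathway_size)
    # Sum the hypergeometric probabilities from the observed count upwards.
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

An overlap of 1 means that the smaller pathway is entirely contained in the larger one, which is why this coefficient is convenient for grouping nested pathways from a hierarchy such as Reactome; one minus the overlap could serve as a distance for clustering.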

Software

All data mining and statistical analyses were carried out using the R programming language20.

Results

Demographic characteristics of the participants

Among the 786 participants included in the analysis, 54.8% had no allergic disease (BAMSE 39%, GINIplus 67%, INMA 54%) (Table 1). Among those with an allergic disease, 45% had asthma (61% of them with multimorbidity), 42% dermatitis (48%) and 51% rhinitis (49%). Asthma was more common in BAMSE (63%), dermatitis in INMA (74%) and rhinitis in GINIplus (79%).

Table 1 Characteristics of the population study.

Topological analysis of the interactomes

We generated interactomes for R alone and R + A, which can be seen as snapshots of the cellular mechanisms behind these conditions (Fig. 1). The interactome of R alone consisted of 464 genes connected by 466 edges. The interactome of R + A consisted of 130 genes connected by 149 edges. The interactome of R alone is 2.18 times denser than random expectation, which is statistically significant (z-test; P = 1.09 × 10^-11). Similarly, the interactome of R + A is 6.22 times denser than random expectation, which is also statistically significant (z-test; P = 3.42 × 10^-50). There were no DEGs common to R alone and R + A in the MeDALL study, but we identified 25 genes common to both interactomes, which implied a degree of interconnectedness significantly larger than random expectation (z-test; P = 2.52 × 10^-22).
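As a rough check on the scale of these figures, the densities can be recomputed from the reported node and edge counts. The sketch below assumes an undirected network without self-loops, so that the maximal number of edges is n(n − 1)/2; the "implied random" values are simply the computed densities divided by the reported fold differences, and are a back-of-the-envelope illustration rather than values taken from the study.

```python
def network_density(n_nodes, n_edges):
    """Edges over the maximal number of edges, n(n-1)/2 (assumed undirected)."""
    return n_edges / (n_nodes * (n_nodes - 1) / 2)


# Node and edge counts as reported in the text.
r_alone = network_density(464, 466)   # 466 / 107,416
r_plus_a = network_density(130, 149)  # 149 / 8,385

# Random-expectation densities implied by the reported fold differences
# (an assumption-laden illustration, not a reported result).
r_alone_random = r_alone / 2.18
r_plus_a_random = r_plus_a / 6.22

print(f"R alone: {r_alone:.4f} (implied random: {r_alone_random:.4f})")
print(f"R + A:   {r_plus_a:.4f} (implied random: {r_plus_a_random:.4f})")
```

Under this assumption the densities are approximately 0.0043 for R alone and 0.0178 for R + A, so the smaller R + A interactome is the denser of the two in absolute terms as well as relative to random expectation.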

Figure 1

Interactomes of rhinitis alone and rhinitis associated with multimorbidity. DEGs: differentially expressed genes. For clarity, only genes with HGNC symbol are shown.

Functional annotation

Functional annotation revealed marked differences between the molecular pathways of the two phenotypes. Pathways specific to R alone (Fig. 2; in table form in Supplementary Table S4) involved a number of Toll-like receptor (TLR), IL-17 and MyD88 (myeloid differentiation primary response gene 88) signalling cascades, as well as WNT5A-dependent signalling, RHO GTPase activity and the small ubiquitin-related modifier (SUMO) pathways. In contrast, pathways associated with R + A (Fig. 3; in table form in Supplementary Table S4) were much richer in signal-transduction-related processes such as IL-mediated and fibroblast growth factor receptor (FGFR)-mediated signalling. IL-33 stood out in particular, with a ~68-fold enrichment.

Figure 2

Pathways unique to rhinitis alone. Pathways were classified in broad functional families (coloured rectangles).

Figure 3

Pathways unique to rhinitis in multimorbidity. Pathways were classified in broad functional families (coloured rectangles).

The pathways common to both R phenotypes are shown in Fig. 4 (in table form in Supplementary Table S4). The pathways with the largest fold enrichment (both in R alone and in R + A) were estrogen-stimulated signalling through PRKCZ and RAS-mediated signalling.

Figure 4

Pathways common to rhinitis alone and rhinitis in multimorbidity. Pathways were classified in broad functional families (coloured rectangles).

Discussion

Using topological and functional analysis, we identified a core of mechanisms common to the two phenotypes of R, but also found significant differences between them. Densely interconnected groups of genes within the interactome are known to be contributors to the same pathological phenotypes21. The high level of connectivity that we observed within the interactome of each phenotype, together with the lack of common DEGs, suggests that R alone and R + A are largely mechanistically different diseases, affecting different molecular pathways.

TLRs stand out as strong drivers of R alone. TLRs are type I transmembrane receptors employed by the innate immune system22. Variation in the TLR genes has been associated with R in several candidate gene studies. A significant excess of rare variants in R patients was found in TLR1, TLR5, TLR7, TLR9 and TLR10 but not in TLR8 (ref. 23). Children carrying a minor rs1927911 (TLR4) allele may be at a higher risk of R24. In turn, TLRs are strongly associated with MyD88 pathways, mediating the activation of type 2 innate lymphoid cells (ILC2) and eosinophilic airway inflammation25,26. IL-17 was also identified, as were SUMO pathways, which are known to regulate many cellular processes including signal transduction and immune responses27,28. Furthermore, SUMOylation is known to play a critical role in the expression of TSLP in airway epithelial cells, and inhibition of SUMOylation attenuates house dust mite-induced epithelial barrier dysfunction29.

On the other hand, a number of mechanisms, such as NF-κB-mediated signalling and IL-1 and IL-33 activity, seem to be driving R + A multimorbidity. IL-33 and IL1RL1/ST2 are among the most highly replicated susceptibility loci for asthma30,34,35, and IL-33 has a known role in infection-mediated A susceptibility31. FGFR (fibroblast growth factor receptor) signalling is also increased. The FGF/FGFR signalling system regulates a variety of biological processes, including embryogenesis, angiogenesis, wound repair and lung development32, and it may be relevant to airway remodelling in A. Interleukin-33 (IL-33), which belongs to the interleukin-1 (IL-1) family, is an alarmin cytokine with critical roles in tissue homeostasis, pathogenic infection, inflammation, allergy and type 2 immunity. IL-33 transmits signals through its receptor IL-33R (also called ST2), which is expressed on the surface of T helper 2 (Th2) cells and group 2 innate lymphoid cells (ILC2s), thus inducing transcription of Th2-associated cytokine genes and host defence against pathogens33. However, IL-33 is not associated with rhinitis alone36.

The exposure of the airway epithelium to external stimuli such as allergens, microbes and air pollution triggers the release of the alarmin cytokines IL-25, IL-33 and thymic stromal lymphopoietin (TSLP). IL-25, IL-33 and TSLP interact with their respective receptors, IL-17RA, IL1RL1 and TSLPR, which are expressed by cells including dendritic cells, ILC2 cells, endothelial cells and fibroblasts. Alarmins play key roles in driving type 2-high, and to a lesser extent type 2-low, responses in asthma1,3,37. Future analyses could exploit tissue-specific transcriptomic data to highlight the differences between the epithelium of the upper and lower airways38,39.

Finally, some signal transduction pathways common to R alone and R + A have an impact on the IgE-mediated immune response. They include activation of RAS in B cells, CD209 signalling40, MAPK kinase41, FceRI MAPK kinase activation42, ERK activation43, Raf kinases44 and VEGFA (ref. 45).

Limitations of the study

A major limitation is the lack of data on A without R. This is not the first study in which either the A-alone population was too small or there was no signal for A alone10. The same applied to the present study, and we were unable to include an A-alone group. It is likely that A is almost always associated with R in children.

Incompleteness and spurious interactions have long been limitations of studies that make use of data from the human interactome. However, authors have argued that data noise does not prevent a successful application of the interactome to the investigation of disease mechanisms46,47. The human interactome is also known to be biased towards certain genes of interest (a category that includes many disease-associated genes)48, but non-biased interactomes have a much lower coverage, which makes them unsuitable for some topology-based studies49. Lastly, time-dependent and location-dependent interaction patterns are not captured in our study, which only considers an interactome that is static in time.

Impact of the study

Clinical data, epidemiologic studies50, mHealth-based studies51 and genomic approaches7 all support the existence of two distinct diseases: R alone and R with A multimorbidity. This study helps to better understand the differences between R and R + A and to refine the ARIA-MeDALL hypothesis on allergic diseases52. It also highlights the importance of IL-17 (ref. 53), IL-33 (ref. 54) and their interactions for understanding allergic multimorbidity.

Conclusions

The interactomes of R alone and R + A showed topological characteristics that suggest that the cellular mechanisms involved are different for each phenotype. We identified mechanisms specific to R alone (TLR and MyD88 signalling cascades, SUMO pathways) and mechanisms specific to R + A (IL-33-mediated signalling, FGFR-mediated signalling).