The nasal methylome as a biomarker of asthma and airway inflammation in children

The nasal cellular epigenome may serve as a biomarker of airway disease and environmental response. Here we collect nasal swabs from the anterior nares of 547 children (mean-age 12.9 y), and measure DNA methylation (DNAm) with the Infinium MethylationEPIC BeadChip. We perform nasal Epigenome-Wide Association analyses (EWAS) of current asthma, allergen sensitization, allergic rhinitis, fractional exhaled nitric oxide (FeNO) and lung function. We find multiple differentially methylated CpGs (FDR < 0.05) and Regions (DMRs; ≥ 5-CpGs and FDR < 0.05) for asthma (285-CpGs), FeNO (8,372-CpGs; 191-DMRs), total IgE (3-CpGs; 3-DMRs), environment IgE (17-CpGs; 4-DMRs), allergic asthma (1,235-CpGs; 7-DMRs) and bronchodilator response (130-CpGs). Discovered DMRs annotated to genes implicated in allergic asthma, Th2 activation and eosinophilia (EPX, IL4, IL13) and genes previously associated with asthma and IgE in EWAS of blood (ACOT7, SLC25A25). Asthma, IgE and FeNO were associated with nasal epigenetic age acceleration. The nasal epigenome is a sensitive biomarker of asthma, allergy and airway inflammation.

Reviewer #3 (Remarks to the Author): This manuscript by Cardenas et al reports results of an epigenome-wide association study (EWAS) of current asthma, allergic sensitization, allergic rhinitis, fractional exhaled nitric oxide (FeNO) and lung function using the Illumina EPIC array on nasal swabs from the anterior nares of 547 children from Project Viva (mean-age 12.9y; 67.1% White, 16.1% Black, 4.2% Hispanic, 3.1% Asian and 9.3% of more than one race). The analysis was done using linear models adjusting for race/ethnicity, sex, age at nasal sample collection, body mass index (BMI) z-score, maternal education, smokers living in the household, and sine and cosine of season at sample collection. The authors also fit models adjusted for cell proportions estimated by a reference-free method, which substantially reduced the number of associations, but they discuss both sets of results. All phenotypes tested have some associations, except for lung function, in which case nothing was significant. Validation was performed using data from peripheral blood from the Liang et al Nature publication (3 independent cohorts that focused on IgE analysis).
Strengths of the study are in large numbers using a very well established cohort and in superb statistical analysis. There are, however, several limitations of this study as outlined below.
Major issues: 1. Use of cells collected from anterior nares without any measures of cell composition is a major issue with this study that unfortunately cannot be overcome since this information was not collected. I believe this is a HUGE confounder in this analysis. There is no way that these samples are only 1% immune cells if they see so many changes in immune genes and they report replication in peripheral blood. These samples also likely have varying proportions of columnar and squamous epithelia. This cannot be adequately adjusted for with a reference-free method. Samples from inferior turbinates can also have a mix of cells but previously published studies selected samples that predominantly have columnar cells. While use of nasal swabs from anterior nares may be desirable in younger kids, I am not sure why the authors chose this method. They have increased their sample size over previous studies but introduced so much variability and such confounding, which they did not try to address adequately by obtaining cell count information. 2. There is no gene expression data so there is no way to know which of these methylation changes may have potential functional impact. I am surprised that they do not have gene expression data given that the publication they cite to justify the use of this collection method (J Allergy Clin Immunol. 2015 Oct;136(4):1120-3.e4) has expression data, but this manuscript says that the authors "were unable to collect sufficient high-quality RNA from the anterior nares". 3. While there is replication in blood, there is no attempt to replicate in another cohort of nasal epithelial cells, and yet conclusions are drawn about nasal epithelia. 4. The way results are presented is extremely confusing. They report single CpGs and regions, with and without adjustment for cell proportions, in relation to 4 outcomes: current asthma, allergic sensitization, allergic rhinitis, and FeNO.
The summary table is helpful but results are discussed jumping from one model to another, and it is hard to follow. Are there any commonalities among all these models? Or maybe pick one outcome to focus the discussion on?
Minor comments: 1. Are there differences in methylation by race/ethnicity? Main models adjusted for it but it seems that putting all these diverse individuals together may wash out some signal specific to one race/ethnicity. 2. The first ICAC study (97 allergic asthmatics vs 97 controls) was in PBMCs and not in nasal cells (the second study of 36 allergic asthmatics vs 36 controls was in nasal cells).

Reviewer #1 (Remarks to the Author):
The authors report an EWAS on 547 subjects from the Project Viva study, where associations between nasal methylation levels and different allergy- and asthma-related outcomes were assessed. While the results are intriguing and potentially of great interest, I have a couple of major concerns as outlined below.

Major comments
1) The most obvious limitation in this study is the lack of an independent replication dataset for the top findings. After all, the study includes for example only 65 current asthmatics and 47 rhinitis subjects. Although the authors checked if the Xu et al CpGs for asthma also were significant in this study using nasal epithelial cells, which they were, there is no other way forward than to replicate your own top hits in an independent dataset. As discussed further below, there is reason to believe that there are numerous false positive findings in the unadjusted models in this study. Similar to any state-of-the-art GWAS study of today, where replication attempts are always performed (at least in decent papers), this should be the standard also for EWAS.

Response:
We thank the reviewer for this constructive comment and completely agree that replication should be the standard. This suggestion motivated us to look for external replication of our results. We have substantially revised our manuscript and results to include replication of findings in nasal epithelial cells of an independent cohort of children. We now only present and interpret results of analyses fully adjusted for cell-type heterogeneity. We incorporated the following major revisions based on the reviewer's comments:

1) We excluded all unadjusted findings and now discuss, present, and interpret only cell-type adjusted results throughout the manuscript, to reduce the potential for false positives as suggested.
2) Additionally, we used an external replication cohort of inner-city children aged 10 to 12 years with persistent atopic asthma (n=36) versus healthy control subjects (n=36) with nasal epithelial DNA methylation measurements (Yang, Ivana V., et al. Journal of Allergy and Clinical Immunology 139.5 (2017): 1478-1488; PMID: 27745942). The largest limitation of this replication cohort is that samples were measured using Illumina's Infinium Human Methylation 450K BeadChip, which shares approximately 328,000 CpGs with the newer EPIC BeadChip used in our study. In this replication cohort, nasal sampling occurred in the inferior turbinate, in contrast to the anterior nares location in our study. However, this is currently the most similar publicly available dataset with which to attempt replication of our multiple findings and provide comparability to the published literature.
To test for external replication, we downloaded publicly available data from the Gene Expression Omnibus GEO (GSE65163) from the study performed by Yang and colleagues. We now describe this replication approach in the manuscript with track changes on page 25 lines 527-538:

"Replication in Epithelial Nasal Cells. We sought to replicate our top differentially methylated findings of asthma and allergic asthma in an external cohort with nasal epithelial cells collected from the posterior portion of the inferior turbinate from the Inner City Asthma Consortium 1. Briefly, in this study samples of nasal epithelial cells from 36 atopic asthmatics and 36 controls with at least 80% ciliated epithelial cells were collected and DNA methylation was measured using Illumina's Infinium Human Methylation 450K BeadChip. We downloaded publicly available data from the Gene Expression Omnibus repository (GSE65163) from the study performed by Yang and colleagues 1. To allow for direct comparability we carried out the same pre-processing and analytical strategy used in our study, including adjusting for cell-type using ReFACTor (9 PCs). Among differentially methylated CpGs found for asthma and allergic asthma we compared differences in DNA methylation in adjusted models and controlling the FDR<0.05." Although this replication cohort was relatively small (36 kids with atopic asthma and 36 controls), we were able to replicate many of our findings after adjusting for multiple comparisons in the replication phase. Namely, among the 95 CpGs found to be differentially methylated relative to current asthma in our data and present in the 450K assay of Yang and colleagues, 58 CpGs (61%) replicated with FDR<0.05 and relatively small unadjusted p-values (p < 1×10⁻⁴) in the replication cohort, with the same direction and similar magnitude of association for all 58 CpGs.
For allergic asthma, 375 CpGs were present in the Yang et al. 450K data, of which 199 CpGs (53%) replicated with an FDR<0.05, and 197 of these had a consistent direction of association.

2) ... present table after table with top hits not adjusted for cell type (Tables 3-8). The unadjusted models seem to primarily represent cell type composition, which the authors also acknowledge, but these are not the results of interest when it comes to methylation changes and health outcomes. I would strongly recommend presenting cell-type adjusted top findings.
Response: This is a valid concern and valuable comment that motivated us to re-evaluate our approach and presentation of results. We followed the suggestion and removed all unadjusted analyses (Tables 3-8). We now only present tables for top differentially methylated CpGs and regions found in analyses bioinformatically adjusted for cell-type (Table 2). Finally, we include all fully adjusted results, both CpGs and regions, in a supplementary CSV file for future reproducibility and comparison (Supplementary Tables S3-S9).
3) Please provide Q-Q plots for the different models, both the unadjusted (for cell type) and the adjusted ones. Looking at the numbers of significant hits in the unadjusted models together with the lambda values suggests issues with inflation and false positive findings. The Manhattan plots also indicate very unspecific patterns, mainly driven by cell-type issues (I would guess). Lambdas from the adjusted models using ReFACTor look much more sensible and trustworthy.
Response: This is a great suggestion. We now provide overlapping Q-Q plots for models adjusted and not adjusted for cell-type heterogeneity for all 10 EWAS performed, in the supplementary material (Supplementary Figure S1). As predicted by the reviewer, unadjusted results show very large inflation (black). Furthermore, we now only present Manhattan plots adjusted for cell-type composition in Figure 2, which as predicted are greatly attenuated.
We now include this information in the manuscript on page 6, lines 109-112: "... Figure S1). Manhattan plots of fully adjusted EWAS of FeNO, current asthma and allergic asthma are shown in Figure 2."

Supplementary Figure S1. Quantile-Quantile plots of expected vs observed p-values from epigenome-wide association analyses (EWAS): p-values adjusted for confounders but not cell-type are in black and p-values from models further adjusted for cell-type heterogeneity using ReFACTor (10 PCs) are in blue.
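
The lambda values discussed in comment 3 can be illustrated with a minimal sketch. It assumes the usual definition of the genomic inflation factor (median observed 1-df chi-square statistic divided by its expected median under the null); the source does not state how lambda was computed, and the function name and toy p-values below are hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549),
# obtained from the standard normal quantile at 0.25.
_CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.25) ** 2


def genomic_inflation(pvalues):
    """Estimate lambda: median observed chi-square over the expected null median.

    Assumes two-sided p-values from 1-df tests; this is an illustrative
    definition, not necessarily the one used in the manuscript.
    """
    if not pvalues:
        raise ValueError("pvalues must not be empty")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    normal = NormalDist()
    # Convert each two-sided p-value to a 1-df chi-square statistic.
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in pvalues]
    return median(chi2) / _CHI2_1DF_MEDIAN


# Toy data (not from the study): evenly spaced p-values mimic a null EWAS,
# and squaring them mimics systematic inflation.
null_p = [(i - 0.5) / 1001 for i in range(1, 1002)]
inflated_p = [p ** 2 for p in null_p]
lambda_null = genomic_inflation(null_p)
lambda_inflated = genomic_inflation(inflated_p)
```

A lambda close to 1 corresponds to a Q-Q plot that follows the diagonal, while values well above 1 correspond to the early departure from the diagonal described for the unadjusted models.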

4) Biologically relevant genes were identified for the studied outcomes. There is however yet another limitation in the study in that the authors were not able to directly assess the functional relevance of hypo- or hypermethylation at the identified CpG sites.
Response: This is correct and is a limitation of this study. It was likely a consequence of sampling in the anterior nares, which yields adequate DNA with methylation measures comparable to inferior turbinate samples, but may yield insufficient RNA for gene expression measurement (Lai, Peggy S., et al. "Alternate methods of nasal epithelial cell sampling for airway genomic studies." Journal of Allergy and Clinical Immunology 136.4 (2015): 1120-1123; PMID: 26037550). For us, the great advantage of sampling in this location was the acceptability of the procedure to the children (also documented by Lai 2015) and to the Institutional Review Board (IRB) that approved the protocols, with the resultant large numbers of children providing samples. Sampling from the inferior turbinate involves using a speculum and causes greater discomfort. Thus, while we were able to obtain sufficient DNA for high-quality DNA methylation assessment, unfortunately we were not able to obtain RNA in sufficient quantity or of high enough quality to perform transcriptomic analyses. For example, of 250 samples for which we attempted to recover RNA, half had a concentration lower than 10 ng/uL and not a single sample had as much as 250 ng/uL. In addition, preliminary quality checks using spectrophotometry showed low OD ratios for most samples (A260/A280 usually lower than 2.0). Based on these disappointing initial results, we elected to abort the RNA analysis and did not proceed to measure RIN values or to conduct any transcriptomic analysis. We now state this in the manuscript on pages 18-19, lines 395-397: "Additionally, while it was our intention to measure gene expression we were unable to collect sufficient high-quality RNA from the anterior nares or enough concentration (most samples yielded less than 10 ng/uL)."
We also stated our rationale and justification for sampling from the anterior nares on page 17, lines 357-359: "Sampling from the anterior nares does not require a speculum, and subjects report less discomfort with this technique as compared to inferior turbinate sampling, making it more amenable for use in pediatric populations 39."

5) The use of multiple, related phenotypes could be an asset, but there is also a risk of "fishing expedition" issues and multiple testing problems. Again, the only way to tackle potential drawbacks with such an approach is to apply rigorous statistical and methodological analyses and to replicate key findings.

Response:
We chose to present all analyses conducted for transparency. As mentioned, the IgE, asthma and allergic asthma results are multiple related phenotypes, but we believe presenting all results will allow testing for reproducibility in future studies, which might collect one outcome but not the other. In addition, we hypothesized that lung function would be related to nasal epithelial DNA methylation, but this hypothesis was not supported. We would still like to include these results, as they could be very important for other cohorts that might want to test this hypothesis in the near future.
Lastly, our replication efforts show generalizability of findings even for samples from other regions of the nose (i.e. inferior turbinate), a different ethnic population and a different DNA methylation technology (450K). Thus, we believe that our results are generalizable.
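
The replication criterion described above (FDR<0.05 within the set of CpGs tested in the replication cohort, plus a check of effect direction) can be sketched as follows. This is a minimal illustration that assumes the Benjamini-Hochberg procedure, which the response does not name explicitly; the function names and toy numbers are hypothetical and are not the study data.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def replication_summary(discovery_effects, replication_effects,
                        replication_pvalues, alpha=0.05):
    """Count CpGs passing FDR in replication and, of those, same-sign effects."""
    qvalues = benjamini_hochberg(replication_pvalues)
    passed = [i for i, q in enumerate(qvalues) if q < alpha]
    concordant = [i for i in passed
                  if discovery_effects[i] * replication_effects[i] > 0]
    return len(passed), len(concordant)


# Toy example with four CpGs (illustrative values only).
toy_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
n_passed, n_concordant = replication_summary(
    discovery_effects=[-0.02, 0.01, 0.03, -0.01],
    replication_effects=[-0.03, 0.02, -0.01, -0.02],
    replication_pvalues=[0.001, 0.20, 0.002, 0.0005],
)
```

Restricting the correction to the CpGs carried into the replication phase (for example, the 95 asthma CpGs present on the 450K array) is what "FDR<0.05 for 95 comparisons" amounts to under this assumption.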

Reviewer #2 (Remarks to the Author):
This is a very interesting study from a leading group. They have, with great care, recovered cells from the anterior nasal airway by swabs and performed an EWAS on extracted DNA. They discover multiple hits that are very relevant to asthma and to atopy.
It is very difficult to fault the methodology of the study, which seems to have been carried out with great attention to detail at every step, including the elegant statistical analyses. This alone makes the paper important, and establishes the relevance of nasal sampling to the understanding of atopic asthma at the same time as laying down the standards to which all subsequent investigators will refer.
The method for controlling for cell counts appears effective, but has the disadvantage of giving no insights into the relative contribution of specific cell types (eosinophils, neutrophils, etc.) or their functional state. I wonder if the paper could be strengthened by including information from published cell-specific methylation patterns. The use of correlation networks (e.g. from the WGCNA package) may derive cell-specific eigenvectors that can be included in the overall analyses.
It might also be valuable to explore factors that are specific to particular phenotypes such as FeNO, and I wonder if random forest or other analyses may help in this regard.

Response:
We appreciate the reviewer's positive comments. The suggestions above motivated us to conduct another analysis of factors that are specific to some outcomes. To improve interpretability, we performed enrichment analyses of gene ontology categories for allergic asthma, asthma and FeNO findings fully adjusted for cell-type (Figure 3). We find overlap for differentially methylated genes among allergic asthmatics and FeNO measurements for the neutrophil degranulation, signaling by interleukins, and interleukin-4 and interleukin-13 signaling biological processes. We now elaborate on this on page 10, lines 192-196: "Among differentially methylated genes, we also looked at biological pathways enriched in the Reactome database. The top differentially methylated Reactome pathway was observed for neutrophil degranulation followed by signaling by interleukins pathways for allergic asthma and FeNO. Genes found for asthmatics were enriched in the interleukin-2 family signaling as well as the SUMOylation of intracellular receptor pathways (Figure 3)."

Reviewer #3 (Remarks to the Author):
This manuscript by Cardenas et al reports results of an epigenome-wide association study (EWAS) of current asthma, allergic sensitization, allergic rhinitis, fractional exhaled nitric oxide (FeNO) and lung function using the Illumina EPIC array on nasal swabs from the anterior nares of 547 children from Project Viva (mean-age 12.9y; 67.1% White, 16.1% Black, 4.2% Hispanic, 3.1% Asian and 9.3% of more than one race). The analysis was done using linear models adjusting for race/ethnicity, sex, age at nasal sample collection, body mass index (BMI) z-score, maternal education, smokers living in the household, and sine and cosine of season at sample collection. The authors also fit models adjusted for cell proportions estimated by a reference-free method, which substantially reduced the number of associations, but they discuss both sets of results. All phenotypes tested have some associations, except for lung function, in which case nothing was significant. Validation was performed using data from peripheral blood from the Liang et al Nature publication (3 independent cohorts that focused on IgE analysis).
Strengths of the study are in large numbers using a very well established cohort and in superb statistical analysis. There are, however, several limitations of this study as outlined below.
Major issues: 1. Use of cells collected from anterior nares without any measures of cell composition is a major issue with this study that unfortunately cannot be overcome since this information was not collected. I believe this is a HUGE confounder in this analysis. There is no way that these samples are only 1% immune cells if they see so many changes in immune genes and they report replication in peripheral blood. These samples also likely have varying proportions of columnar and squamous epithelia. This cannot be adequately adjusted for with a reference-free method. Samples from inferior turbinates can also have a mix of cells but previously published studies selected samples that predominantly have columnar cells. While use of nasal swabs from anterior nares may be desirable in younger kids, I am not sure why the authors chose this method.
They have increased their sample size over previous studies but introduced so much variability and such confounding, which they did not try to address adequately by obtaining cell count information.
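
The "sine and cosine of season at sample collection" covariates mentioned in the reviewer's summary can be illustrated with a minimal sketch. The exact parameterization used in the manuscript is not given here, so the annual period and the function name below are assumptions made for illustration.

```python
import math
from datetime import date


def season_terms(collection_date):
    """Return (sin, cos) of the collection date's position within its year.

    Assumes a single annual cycle; the manuscript's exact encoding of
    season may differ.
    """
    year_start = date(collection_date.year, 1, 1)
    next_year_start = date(collection_date.year + 1, 1, 1)
    days_in_year = (next_year_start - year_start).days
    day_index = (collection_date - year_start).days
    angle = 2.0 * math.pi * day_index / days_in_year
    return math.sin(angle), math.cos(angle)


# Dates at opposite ends of the calendar year map to nearby points on the
# unit circle, which is the motivation for using both terms as covariates.
jan_first = season_terms(date(2015, 1, 1))
dec_last = season_terms(date(2015, 12, 31))
mid_year = season_terms(date(2015, 7, 2))
```

Entering both terms in a linear model lets the fitted seasonal effect peak at any time of year without treating 31 December and 1 January as far apart.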

Response:
We appreciate the reviewer's concern and agree that even samples from the inferior turbinates contain a mixture of cells. Therefore, in theory there is no ideal sample. To ensure generalizability of our findings, we replicated our results in an independent cohort of nasal DNA methylation samples collected from asthmatics and controls from the posterior portion of the inferior turbinate, verified to have at least 80% ciliated epithelial cells visualized from slides (Yang, Ivana V., et al. Journal of Allergy and Clinical Immunology 139.5 (2017): 1478-1488; PMID: 27745942). We replicated over 50% of our findings present in their data for asthma and allergic asthma even after adjusting for multiple comparisons in the replication phase. We believe this is evidence that our samples are appropriate and comparable to those that might be obtained from the inferior turbinates.
We now include the results from this replication on pages 10-11, lines 212-225: "Replication in Epithelial Nasal Cells. We tested for replication of our top differentially methylated findings of asthma and allergic asthma in an external cohort with nasal epithelial cells collected from the posterior portion of the inferior turbinate and analyzed with the Infinium HumanMethylation450K BeadChip 6. We checked for replication of the 285 differentially methylated CpGs for asthmatics of which 95 probes were present on the 450K from the replication cohort. Among the 95 CpGs found to be differentially methylated for asthmatics in our study and present in the 450K, 58 CpGs replicated (61%) after controlling for multiple comparisons (FDR<0.05 for 95 comparisons) in the replication cohort with consistent direction and magnitude of association for all 58 CpGs (Supplemental Table 10
In terms of confounding by cell-type, the reference-free method we used (ReFACTor) has been shown to adequately control the false positive rate even in highly confounded scenarios. For example, in highly confounded epigenome-wide association analyses of DNAm from mixed leukocytes of rheumatoid arthritis cases and controls (
However, the method we applied (ReFACTor) has been shown to adequately control the false positive rate, as stated on page 24, lines 511-512: "We chose ReFACTor as it has been shown to control the false positive rate even when compared to reference-based methods 40."