High miR203a-3p and miR-375 expression in the airways of smokers with and without COPD

Smoking is a leading cause of chronic obstructive pulmonary disease (COPD). It is known to have a significant impact on gene expression and (inflammatory) cell populations in the airways involved in COPD pathogenesis. In this study, we investigated the impact of smoking on the expression of miRNAs in healthy and COPD individuals. We aimed to elucidate the overall smoking-induced miRNA changes and those specific to COPD. In addition, we investigated the downstream effects on regulatory gene expression and the correlation to cellular composition. We performed a genome-wide miRNA expression analysis on a dataset of 40 current- and 22 ex-smoking COPD patients and a dataset of 35 current- and 38 non-smoking respiratory healthy controls and validated the results in an independent dataset. miRNA expression was then correlated with mRNA expression in the same patients to assess potential regulatory effects of the miRNAs. Finally, cellular deconvolution analysis was used to relate miRNAs changes to specific cell populations. Current smoking was associated with increased expression of three miRNAs in the COPD patients and 18 miRNAs in the asymptomatic smokers compared to respiratory healthy controls. In comparison, four miRNAs were lower expressed with current smoking in asymptomatic controls. Two of the three smoking-related miRNAs in COPD, miR-203a-3p and miR-375, were also higher expressed with current smoking in COPD patients and the asymptomatic controls. The other smoking-related miRNA in COPD patients, i.e. miR-31-3p, was not present in the respiratory healthy control dataset. miRNA-mRNA correlations demonstrated that miR-203a-3p, miR-375 and also miR-31-3p expression were negatively associated with genes involved in pro-inflammatory pathways and positively associated with genes involved in the xenobiotic pathway. Cellular deconvolution showed that higher levels of miR-203a-3p were associated with higher proportions of proliferating-basal cells and secretory (club and goblet) cells and lower levels of fibroblasts, luminal macrophages, endothelial cells, B-cells, amongst other cell types. MiR-375 expression was associated with lower levels of secretory cells, ionocytes and submucosal cells, but higher levels of endothelial cells, smooth muscle cells, and mast cells, amongst other cell types. In conclusion, we identified two smoking-induced miRNAs (miR-375 and miR-203a-3p) that play a role in regulating inflammation and detoxification pathways, regardless of the presence or absence of COPD. Additionally, in patients with COPD, we identified miR-31-3p as a miRNA induced by smoking. Our identified miRNAs should be studied further to unravel which smoking-induced inflammatory mechanisms are reactive and which are involved in COPD pathogenesis.

www.nature.com/scientificreports/ lung cancer 2,3 and other respiratory diseases 4,5 , including chronic obstructive pulmonary disease (COPD). Characteristics of COPD are an inflammatory response in the airways and lung parenchyma, associated with progressive, irreversible airflow limitation, dyspnea, hypersecretion of mucus, and alveolar destruction, i.e., emphysema.
We have previously shown that current smoking affects levels of gene expression and DNA methylation in bronchial biopsies 6 . However, current smoking on microRNA expression in the human airway wall is not well studied so far 7,8 . MicroRNAs (miRNA) are small non-coding RNA transcripts (approximately 17-27 nucleotides long), which interact with the Argonaute protein (AGO protein) in the RNA-induced silencing complex (RISC), leading to cleavage or silencing of a target mRNA 9 . Most human protein-coding genes contain miRNA target sites 10 . A single miRNA can influence the expression of hundreds of genes 11 and regulate a broad range of biological processes, such as cell proliferation, apoptosis, and the immune system 12,13 . Because of this, deregulation of miRNA function is associated with numerous diseases, including COPD 14-16 . We also have previously shown that miRNAs play an essential role in mediating inflammatory responses and corticoid effects in asthma and COPD [17][18][19][20] . However, although several studies have assessed the impact of smoking on miRNA expression, very few have focused on the airway wall in an unbiased method 21 . Here, we investigated the influence of current smoking on miRNA expression in bronchial biopsies in two independent datasets consisting of current and ex-smoking COPD patients and current and non-smoking respiratory healthy controls to elucidate the overall smoking-induced miRNA changes and those specific to COPD. In addition, we correlated miRNA to matched mRNA expression to identify genes and pathways influenced by the miRNAs affected by smoking to investigate the downstream effects on gene expression. Finally, to investigate the relation with cellular composition, differential miRNAs expression was associated with gene signatures of specific cell populations and immunohistological staining of specific cell types in matched biopsies.

Methods
Subjects. Affymetrix GeneChip miRNA 1.0 Array and RNA-Seq data from baseline bronchial biopsies from COPD patients and respiratory healthy controls were used for this study as described previously 22,23 .
COPD patients participated in the Groningen Leiden Universities and Corticosteroids in Obstructive Lung Disease study (GLUCOLD, ClinicalTrails.gov NCT00158847) (n = 62). Respiratory healthy controls participated in the Study to Obtain Normal Values of Inflammatory Variables from Healthy Subjects (NORM study, Clini-calTrails.gov NCT00848406, n = 73). In-and exclusion criteria and more extensive characteristics of subjects participating in the GLUCOLD and NORM studies were previously described 24,25 .
Briefly, for the GLUCOLD study, all patients were current or ex-smokers with > 10 pack years. Participants had irreversible airflow with postbronchodilator forced expiration volume in one second (FEV 1 ) < 80% predicted and postbronchodilator FEV 1 /forced vital capacity (FVC) below 70%. Patients took no oral corticosteroids for at least three months before starting the study and did not use inhaled corticosteroids for at least six months prior to the study. Patients classified as ex-smokers did not smoke at least one year prior to the study.
Participants of the NORM study had normal lung function (FEV 1 > 80% and FEV 1 /FVC > 70%) and no bronchial hyperresponsiveness to methacholine (PC 20 > 16 mg/ml). Participants defined as non-smokers did not smoke during the year leading up to the study and did not smoke more than one year in total, with a total of < 0.5 packyears.
The local medical ethics committees approved all studies, and all subjects gave their written informed consent (the NORM study was approved by the ethics committee of the UMCG, and the GLUCOLD study was approved by the same committee and the ethics committee of the LUMC) [24][25][26] . All methods were performed in accordance with the relevant guidelines and regulations.

Statistics.
To identify differentially expressed miRNAs in bronchial biopsies of current and ex-smoking COPD patients and current and non-smoking healthy participants, we used a linear model using Limma (R-package version 3.40.6, R statistical software version 3.6). We used current smoking status as a categorical variable, adjusting for age and sex. A Benjamini-Hochberg corrected p-value (False Discovery Rate (FDR)) of < 0.05 was considered statistically significant.
We performed the identification of potential targets of differentially expressed miRNAs by directly correlating miRNA expression to normalized 27 gene expression data available from the same bronchial biopsies. We correlated significant differentially expressed miRNAs to all expressed mRNAs using the Pearson Correlation Coefficient. Significant correlations between miRNA expression levels and gene expression levels were determined using Psych (R-package version 1.8.12), and we performed a meta-analysis on the results of both studies. Significant correlations with an absolute R-value above 0.25 were considered biologically relevant.
We verified the smoking-related expression changes of the identified miRNAs using an independent miRNA dataset (Illumina HiSeq) obtained from bronchial brushings from 30 COPD patients and 30 non-COPD controls 28 . COPD was defined in this dataset by FEV 1 /FVC < 70% and FEV 1 < 80% predicted and non-COPD controls were defined as FEV 1 /FVC > = 70 or FEV 1 /FVC < 70% and FEV 1 % predicted > 80%. We used a linear model and smoking status as a categorical variable while correcting for age, sex and pack years. We analyzed the COPD and non-COPD participants separately. miRNA predicted target approach. We calculated the most likely target for each miRNA using miR-NAtap (R-package version 1.18.0). The default settings were used to calculate the geometric mean of ranks from the databases/algorithms PicTar, Diana, TargetScan, miRanda, and miRDB. The minimum number of sources required for a potential target was 2. 3) on the mRNAs positively and negatively correlated with our differentially expressed smoking-associated miRNAs. A separate analysis was done for the negatively correlated predicted target genes only. We compared the ranked lists to BioCarta and KEGG gene set databases. We then performed a meta-analysis on the pathways found in both studies and adjusted the p-value using Benjamini-Hochberg's method.
Association of miRNAs with composite scores of cell-type-specific genes based on GSVA. We used cell-type-specific gene signatures based on single-cell data for 15 cell types (activated endothelium, B-cells, basal cells, proliferating basal cells, ciliated cells, club cells and goblet cells, fibroblasts, inflammatory dendric cells, ionocytes, luminal macrophages, mast cells, neutrophils, smooth muscle cells, and submucosal cells) 29 . We then performed Gene Set Variation Analysis (GSVA 30 , R-package version 1.32.0) to calculate the composite scores for each cell type. We then correlated each composite score to relevant miRNAs using the Pearson correlation coefficient.

Evaluation of miRNAs with the presence of cell populations in matched bronchial biopsies.
For the differentially expressed miRNAs, we assessed the correlation between miRNA expression and numbers of eosinophils, neutrophils, macrophages, mast cells, CD3 or CD4 or CD8 positive T-cells, and percentage of Periodic acid-Schiff stain (PAS)-positive goblet cells in paraffin-embedded bronchial biopsies from the same patients taken at the same location in the lung. These data were obtained from previously published data sets for the GLUCOLD 31 and NORM 26 studies. A flow diagram showing the analysis approach is provided in Fig. 1.

Results
In the current study, we investigated smoking-related miRNA changes in 62 COPD patients (ex-smokers, n = 22, and current smokers, n = 40) and 73 respiratory healthy individuals (non-smokers, n = 38 and current smokers, n = 35). The subject characteristics of the two studies are presented in Table 1. There was a significant difference Figure 1. Flow diagram of the study approach. Two studies were used: the GLUCOLD study with current or ex-smoker COPD patients and the NORM study with current or non-smoking respiratory healthy controls. A linear regression model was used to identify smoking-associated miRNAs. The overlapping two miRNAs were used for further analysis of COPD patients. A meta-analysis of Pearson correlations was performed to identify miRNA associated mRNA. The predicted targets were obtained, and a list of negatively correlated predicted targets was compiled. Pathway analysis was performed using this list as well as on all the negative and positively correlated mRNAs. Negatively correlated predicted targets were repeated in the NORM. Additionally, miRNA expression was correlated to cell type signatures and cell populations. Changes in miRNA expression associated with current smoking. We identified three miRNAs (miR-31-3p, miR-203a-3p, and miR-375) differentially expressed between current-and ex-smoking COPD patients (FDR < 0.05). All three miRNAs showed a higher level of expression in current-compared to ex-smokers (Table 2). A volcano plot and heatmap are depicted in Fig. 2A,B. We next repeated this analysis in respiratory healthy individuals. A total of 22 miRNAs were differentially expressed in the current-versus non-smokers (FDR < 0.05). Of these, 18 miRNAs showed a higher expression level in current smokers than non-smokers, while four were lower expressed ( Table 2). A volcano plot and heatmap are depicted in Fig. 2C,D, respectively.
Two miRNAs, miR-375 and miR-203a-3p, were associated with current smoking in the same direction in both COPD patients and respiratory healthy controls ( Fig. 2E-G).
MiR-31-3p, the third smoking-associated miRNA in the COPD patients, was not present in the respiratory healthy small-RNA-seq dataset.
We validated our findings in an independent dataset of epithelial brushes. We confirmed higher miR-375 expression in current-than ex-smokers in COPD and non-COPD, but no effect of smoking status on miR-203a-3p or miR-31-3p. (Supplementary Table S4).
Associations between changes in miRNA expression and mRNA expression. To identify potential genes and pathways that are regulated by the three smoking-associated miRNAs, we used two different methods: (1) we investigated the global effects of the miRNAs by doing direct correlations with all expressed genes in the same biopsies, and (2) we investigated the direct effects by focusing on the negatively correlated predicted targets for each miRNA.
For the first analysis, we examined whether miR-375, miR-203a-3p and miR-31-3p affected the expression of mRNA transcripts in each study separately and then performed a meta-analysis for the miRNAs present in both studies (miR-375 and miR-203a-3p). Here, we identified 983 significant negatively correlated genes for miR-203a-3p (FDR < 0.05), and 2254 positively correlated genes. We found 2442 significantly negatively correlated genes for miR-375 and 865 positively correlated genes, and we found 1106 significant negative correlations for miR-31-3p and 1029 positive correlations (Table S1). The top 3 most strongly correlated genes (PDGFD, ARL4C and MBNL1) of both miR-375 and miR-203a-3p are shown in Fig. 3C-H. Gene functions are described in Table 3.  predicted targets for miR-203a-3p, miR-375 and miR-31-3p. First, predicted targets were identified using miR-NAtap, and then the overlap was taken from the negatively correlated genes from the previous analysis.
Of the 983 genes negatively correlated with miR-203a-3p, two genes were also predicted targets; PDGFD, previously associated with fibroblast proliferation and survival, and EBF3, previously associated with B cell differentiation and cell survival through apoptosis and cell cycle arrest [32][33][34] .

Pathways affected by miRNA expression that change with current smoking patients.
To investigate the pathways influenced by changes in miRNA levels, we performed pathway analysis on (1) all genes of which the expression correlated to the identified miRNAs, for a global effect (Fig. 3A,B), and (2) the significantly negatively correlated predicted targets of the miRNAs, for a direct effect of these miRNAs. For miR-203a-3p, we found a higher expression associated with genes involved in cellular protection (xenobiotic and glutathione metabolism pathways), cellular repair (cell cycle, citrate cycle pathways), and protease pathway, as well as negative associations with pro-inflammatory pathways. In addition, the MAPK signalling pathway was negatively associated with miR-203a-3p associated genes in COPD patients (Table 4).
We found a global positive effect of genes associated with a higher expression of miR-375 on olfactory transduction, glycan biosynthesis and linoleic acid metabolism, and adverse effects on pro-inflammatory pathways in COPD patients, for example, via the cytokine-cytokine receptor integration pathway (Table 4).
A higher expression of miR-31-3p had positive associations with genes involved in pathways related to xenobiotic metabolism, cellular repair and linoleic acid metabolism, as well as negative associations with genes involved in pro-inflammatory pathways. Table 3. Selective literature of predicted target genes that were negatively correlated to miR-203a-3p, miR-375 and miR-31-3p, whilst significant after our meta-analysis (miR-203a-3p and miR-375) or FDR significant (miR-31-3p) after correlation. www.nature.com/scientificreports/ Relation between cell-type composition and genes associated with miRNAs. Two methods were used to assess potential effects of cell-type composition on the expression of the miRNA-associated genes, i.e. GSVA based on single-cell gene signatures derived from a publically available dataset and correlation analysis with inflammatory cell counts assessed in adjacent biopsies of the same patients.
With the GSVA analysis, we found a significant negative correlation between miR-203a-3p expression and cell type composite scores of single-cell signatures for fibroblasts, luminal macrophages, endothelial cells, mast cells, smooth muscle cells, B-cells, submucosal cells and inflammatory dendric cells. In contrast, we found a positive association of this miRNA with proliferating basal cells and club and goblet cells with current smoking in both COPD patients and respiratory healthy controls.
For miR-375, we found a negative correlation with cell type composite scores for endothelial cells, smooth muscle cells, mast cells, fibroblasts, basal cells and luminal macrophages., and a positive correlation with club Table 4. Significant gene sets negatively affected by the identified miRNAs in COPD patients and asymptomatic participants. Only the top pathways with a family-wise error rate < 0.05 in the COPD patients are shown. NES normalized enrichment score, FWER P-value family-wise error rate p-value.  Tables S2 and S3). Activated endothelium, fibroblasts, luminal macrophages, mast cells and smooth muscle cells GSVA composite scores were negatively associated with mir-31-3p while proliferating basal cells and club and goblet cells were positively associated.
In addition, we correlated the miRNA expression profiles to cell counts of eosinophils, neutrophils, macrophages, mast cells, CD3 + , CD4 + or CD8 + T-cells, CD20 + B-cells, and percentage of PAS staining positive mucus in the COPD study. We found a negative correlation between eosinophils and miR-203a-3p expression and a positive correlation between miR-31-3p and epithelial cells, but no association with miR-375 (data not shown).

Discussion
In the current study, we identified two miRNAs (miR-203a-3p and miR-375) that are significantly higher expressed with current smoking in COPD patients and respiratory healthy subjects, indicating a common effect of smoking regardless of disease status. Additionally, we found that miR-31-3p is associated with current smoking only in COPD patients.
We found that miR-375 is higher expressed in bronchial biopsies of current-smoking COPD patients and respiratory healthy controls, and we validated this finding in an independent dataset. We identified 18 predicted target genes that decrease with increased miR-375 expression. These included TCF12, WBP1L, CXCL12, MMD, and LST1 that were previously associated with inflammation via T-cell precursor differentiation 35 , altering B-cell development 36 , chemotaxis 57 , differentiation of monocytes to macrophages 53 , or differentiation of other cells 59 . These predicted target genes were not enriched in one specific pathway. However, our global miRNA pathway analysis showed that genes associated with miR-375 are lower expressed in pro-inflammatory pathway activity, such as the IL2RB and B-cell receptor signalling pathway. We continued our investigation using cellular deconvolution and found that higher levels of miR-375 are associated with lower levels of basal cells, fibroblasts, mast cells, smooth muscle cells, endothelial cells and luminal macrophages and higher levels of secretory cells, submucosal cells and ionocytes. In contrast, no association was found with inflammatory cell populations as determined histologically. As we consistently find miR-375 being increased by smoke across all datasets and a positive correlation between miR-375 with secretory cells, this may indicate cell-specific expression in secretory cells as it is well established that the levels of secretory cells are higher in current smokers 37 . The high number of genes and pathways related to inflammation changed by miR-375, but the lack of correlation to gene signatures typical for inflammatory cells might indicate an increased activation of inflammatory cells caused by miR-375 rather than a change in cell numbers or composition.
We showed that miR-203a-3p is more highly expressed in bronchial biopsies of current-compared to exsmokers with COPD and in current-versus non-smoking respiratory healthy subjects. Furthermore, two negatively correlated predicted targets of miR-203a-3, EBF3 and PDGFD, were involved in cell cycle arrest and apoptosis 31 and fibroblast proliferation 29,30 , respectively. Therefore, the lower EBF3 and PDGFD in response to smoking in association with miRNA-203a-3p by smoking potentially indicates a protective response and stimulation of repair as a response to cigarette smoke. In addition, we found that genes associated with xenobiotic metabolism were positively correlated with miR-203a-3p, suggesting that this miRNA facilitates the metabolism of harmful particles in cigarette smoke 38 . Additionally, we found that genes involved in pro-inflammatory pathways (e.g. cytokine-cytokine receptor interaction, IgA production and chemokine signalling) were negatively correlated with miR-203a-3p, suggesting an anti-inflammatory effect for this miRNA. We associated these microRNAs with cell signatures to assess whether a change in cell populations may drive the smoking-induced increase in miR-203a-3p expression. We found higher smoking-induced expression of miR-203a-3p associated with lower proportions of fibroblasts, luminal macrophages, endothelial cells, mast cells, smooth muscle cells, B-cells, inflammatory dendritic cells, and submucosal cells and higher proportions of secretory cells (club and goblet cells) and proliferating basal cells. Therefore, it could be speculated that miR-203a-3p might play a role in the basal cell's differentiation towards secretory cells, leading to more mucus production in smokers or that this miRNA is selectively expressed in secretory cells. This, however, has now only been demonstrated in biopsies and requires further investigation in isolated cells to see the relationship of this miRNA with basal cell differentiation towards secretory cells, with and without smoke exposure.
MiR-31-3p was higher in current-compared to ex-smoking COPD patients, and this miRNA was not found in the respiratory healthy control dataset. We found 14 predicted target genes that decreased with an increase of miR-31-3p. Among these genes was PDE5A, which was involved in smooth muscle contraction and relaxation, as well as cell proliferation, cell signalling and included pro-inflammatory w Inhibition of PDE5A is involved in preventing tobacco smoke-induced emphysema 39 , suggesting a protective role of this miRNA in current smokers with COPD as an attempt to limit the harmful effects of smoking 40 . In a previous study, it was shown that PDE5A was decreased after smoke exposure in lung tissue of healthy individuals 40 . The higher expression of miR-31-3p could suppress PDE5A expression even further in patients with COPD.
There were some limitations to the current study. As the miRNAs expression in the healthy group were conducted on different platforms, these miRNAs were not directly comparable. This may have led to the lack of expression of miRNA-31-3p in the healthy cohort. However, as the healthy cohort was conducted with small-RNA-Seq, it should have been picked up when present, and therefore we are confident in the current results. As a strength of the study, we were able to directly correlate the miRNA to matched expression data in the same subject allowing for the identification of direct miRNA-target gene interactions. Predicted targets for the miRNAs were identified using miRNAtap, which relies on algorithms and databases. However, as these are only predictions and not necessarily accurate, having the paired gene expression and miRNA expression needs additional confirmation of the associations. www.nature.com/scientificreports/ In conclusion, we found two miRNAs, miR-203a-3p and miR-375, that are upregulated in bronchial biopsies of smokers compared to ex-and non-smokers in both COPD and respiratory healthy controls. These miRNAs play a role in the detoxification and inflammatory response to smoke response. In addition, we identified that miR-31-3p was upregulated only in current smokers versus ex-smokers with COPD. miR-31-3p might have a protective role via decreasing PDE5A expression and thus smoke-induced emphysema. We propose that our identified miRNAs are candidates for future studies aimed at unravelling which smoking-induced inflammatory mechanisms are mainly reactive and which are involved in COPD pathogenesis.