Smoking is one of the leading causes of preventable death worldwide, and despite the well-known health effects, over 15% of the world’s population still smoked in 2019, according to the WHO. The smoking of cigarettes exposes the lung to more than 4000 components1. As a result, it is strongly associated with the development of lung cancer2,3 and other respiratory diseases4,5, including chronic obstructive pulmonary disease (COPD). Characteristics of COPD are an inflammatory response in the airways and lung parenchyma, associated with progressive, irreversible airflow limitation, dyspnea, hypersecretion of mucus, and alveolar destruction, i.e., emphysema.

We have previously shown that current smoking affects levels of gene expression and DNA methylation in bronchial biopsies6. However, current smoking on microRNA expression in the human airway wall is not well studied so far7,8. MicroRNAs (miRNA) are small non-coding RNA transcripts (approximately 17–27 nucleotides long), which interact with the Argonaute protein (AGO protein) in the RNA‐induced silencing complex (RISC), leading to cleavage or silencing of a target mRNA9. Most human protein-coding genes contain miRNA target sites10. A single miRNA can influence the expression of hundreds of genes11 and regulate a broad range of biological processes, such as cell proliferation, apoptosis, and the immune system12,13. Because of this, deregulation of miRNA function is associated with numerous diseases, including COPD14,15,16.

We also have previously shown that miRNAs play an essential role in mediating inflammatory responses and corticoid effects in asthma and COPD17,18,19,20. However, although several studies have assessed the impact of smoking on miRNA expression, very few have focused on the airway wall in an unbiased method21. Here, we investigated the influence of current smoking on miRNA expression in bronchial biopsies in two independent datasets consisting of current and ex-smoking COPD patients and current and non-smoking respiratory healthy controls to elucidate the overall smoking-induced miRNA changes and those specific to COPD. In addition, we correlated miRNA to matched mRNA expression to identify genes and pathways influenced by the miRNAs affected by smoking to investigate the downstream effects on gene expression. Finally, to investigate the relation with cellular composition, differential miRNAs expression was associated with gene signatures of specific cell populations and immunohistological staining of specific cell types in matched biopsies.



Affymetrix GeneChip miRNA 1.0 Array and RNA-Seq data from baseline bronchial biopsies from COPD patients and respiratory healthy controls were used for this study as described previously22,23.

COPD patients participated in the Groningen Leiden Universities and Corticosteroids in Obstructive Lung Disease study (GLUCOLD, NCT00158847) (n = 62). Respiratory healthy controls participated in the Study to Obtain Normal Values of Inflammatory Variables from Healthy Subjects (NORM study, NCT00848406, n = 73). In- and exclusion criteria and more extensive characteristics of subjects participating in the GLUCOLD and NORM studies were previously described24,25.

Briefly, for the GLUCOLD study, all patients were current or ex-smokers with > 10 pack years. Participants had irreversible airflow with postbronchodilator forced expiration volume in one second (FEV1) < 80% predicted and postbronchodilator FEV1/forced vital capacity (FVC) below 70%. Patients took no oral corticosteroids for at least three months before starting the study and did not use inhaled corticosteroids for at least six months prior to the study. Patients classified as ex-smokers did not smoke at least one year prior to the study.

Participants of the NORM study had normal lung function (FEV1 > 80% and FEV1/FVC > 70%) and no bronchial hyperresponsiveness to methacholine (PC20 > 16 mg/ml). Participants defined as non-smokers did not smoke during the year leading up to the study and did not smoke more than one year in total, with a total of < 0.5 packyears.

The local medical ethics committees approved all studies, and all subjects gave their written informed consent (the NORM study was approved by the ethics committee of the UMCG, and the GLUCOLD study was approved by the same committee and the ethics committee of the LUMC)24,25,26. All methods were performed in accordance with the relevant guidelines and regulations.


To identify differentially expressed miRNAs in bronchial biopsies of current and ex-smoking COPD patients and current and non-smoking healthy participants, we used a linear model using Limma (R-package version 3.40.6, R statistical software version 3.6). We used current smoking status as a categorical variable, adjusting for age and sex. A Benjamini–Hochberg corrected p-value (False Discovery Rate (FDR)) of < 0.05 was considered statistically significant.

We performed the identification of potential targets of differentially expressed miRNAs by directly correlating miRNA expression to normalized27 gene expression data available from the same bronchial biopsies. We correlated significant differentially expressed miRNAs to all expressed mRNAs using the Pearson Correlation Coefficient. Significant correlations between miRNA expression levels and gene expression levels were determined using Psych (R-package version 1.8.12), and we performed a meta-analysis on the results of both studies. Significant correlations with an absolute R-value above 0.25 were considered biologically relevant.

We verified the smoking-related expression changes of the identified miRNAs using an independent miRNA dataset (Illumina HiSeq) obtained from bronchial brushings from 30 COPD patients and 30 non-COPD controls28. COPD was defined in this dataset by FEV1/FVC < 70% and FEV1 < 80% predicted and non-COPD controls were defined as FEV1/FVC >  = 70 or FEV1/FVC < 70% and FEV1% predicted > 80%. We used a linear model and smoking status as a categorical variable while correcting for age, sex and pack years. We analyzed the COPD and non-COPD participants separately.

miRNA predicted target approach

We calculated the most likely target for each miRNA using miRNAtap (R-package version 1.18.0). The default settings were used to calculate the geometric mean of ranks from the databases/algorithms PicTar, Diana, TargetScan, miRanda, and miRDB. The minimum number of sources required for a potential target was 2.

Pathway analysis

To predict which pathways are influenced by smoking-induced miRNAs, we performed Gene Set Enrichment Analysis (GSEA version 4.0.3) on the mRNAs positively and negatively correlated with our differentially expressed smoking-associated miRNAs. A separate analysis was done for the negatively correlated predicted target genes only. We compared the ranked lists to BioCarta and KEGG gene set databases. We then performed a meta-analysis on the pathways found in both studies and adjusted the p-value using Benjamini-Hochberg's method.

Association of miRNAs with composite scores of cell-type-specific genes based on GSVA

We used cell-type-specific gene signatures based on single-cell data for 15 cell types (activated endothelium, B-cells, basal cells, proliferating basal cells, ciliated cells, club cells and goblet cells, fibroblasts, inflammatory dendric cells, ionocytes, luminal macrophages, mast cells, neutrophils, smooth muscle cells, and submucosal cells)29. We then performed Gene Set Variation Analysis (GSVA30, R-package version 1.32.0) to calculate the composite scores for each cell type. We then correlated each composite score to relevant miRNAs using the Pearson correlation coefficient.

Evaluation of miRNAs with the presence of cell populations in matched bronchial biopsies

For the differentially expressed miRNAs, we assessed the correlation between miRNA expression and numbers of eosinophils, neutrophils, macrophages, mast cells, CD3 or CD4 or CD8 positive T-cells, and percentage of Periodic acid–Schiff stain (PAS)-positive goblet cells in paraffin-embedded bronchial biopsies from the same patients taken at the same location in the lung. These data were obtained from previously published data sets for the GLUCOLD31 and NORM26 studies.

A flow diagram showing the analysis approach is provided in Fig. 1.

Figure 1
figure 1

Flow diagram of the study approach. Two studies were used: the GLUCOLD study with current or ex-smoker COPD patients and the NORM study with current or non-smoking respiratory healthy controls. A linear regression model was used to identify smoking-associated miRNAs. The overlapping two miRNAs were used for further analysis of COPD patients. A meta-analysis of Pearson correlations was performed to identify miRNA associated mRNA. The predicted targets were obtained, and a list of negatively correlated predicted targets was compiled. Pathway analysis was performed using this list as well as on all the negative and positively correlated mRNAs. Negatively correlated predicted targets were repeated in the NORM. Additionally, miRNA expression was correlated to cell type signatures and cell populations.


In the current study, we investigated smoking-related miRNA changes in 62 COPD patients (ex-smokers, n = 22, and current smokers, n = 40) and 73 respiratory healthy individuals (non-smokers, n = 38 and current smokers, n = 35). The subject characteristics of the two studies are presented in Table 1. There was a significant difference in sex between the COPD patients and respiratory healthy individuals (Chi-Squared p-value = 3E−4), as well as a significant difference in age (Mann–Whitney U p-value = 2E−11).

Table 1 Patient demographics in relation to smoking status.

Changes in miRNA expression associated with current smoking

We identified three miRNAs (miR-31-3p, miR-203a-3p, and miR-375) differentially expressed between current- and ex-smoking COPD patients (FDR < 0.05). All three miRNAs showed a higher level of expression in current- compared to ex-smokers (Table 2). A volcano plot and heatmap are depicted in Fig. 2A,B.

Table 2 Differential expression of miRNAs for respiratory healthy controls and subjects with COPD when comparing smoking status.
Figure 2
figure 2

Differentially expressed microRNAs (miRNAs) when comparing smoking status in patients with COPD and respiratory healthy controls. Volcano plot (A) shows changes in expression levels of 220 miRNAs in patients with COPD (n = 62). Heat map (B) shows expression changes of the three significantly differentially expressed miRNAs in COPD patients. Volcano plot (C) shows changes in the expression of 395 miRNAs in respiratory healthy controls (n = 73). Heat map (D) shows expression changes of the 22 significantly differentially expressed miRNAs in respiratory healthy controls. Venn diagram (E) shows the overlap between significantly differentially expressed miRNAs in COPD patients compared to respiratory healthy controls. Boxplot (F) shows the normalized expression values of the miRNAs miR-203a-3p and miR-375 per smoking status group in the COPD patients, and boxplot (G) shows the normalized expression values for the same miRNAs in the respiratory healthy controls. Red indicates an increase in expression, whereas blue indicates a decrease in expression. A false discovery rate (FDR) cut-off of 0.05 was used.

We next repeated this analysis in respiratory healthy individuals. A total of 22 miRNAs were differentially expressed in the current- versus non-smokers (FDR < 0.05). Of these, 18 miRNAs showed a higher expression level in current smokers than non-smokers, while four were lower expressed (Table 2). A volcano plot and heatmap are depicted in Fig. 2C,D, respectively.

Two miRNAs, miR-375 and miR-203a-3p, were associated with current smoking in the same direction in both COPD patients and respiratory healthy controls (Fig. 2E–G).

MiR-31-3p, the third smoking-associated miRNA in the COPD patients, was not present in the respiratory healthy small-RNA-seq dataset.

We validated our findings in an independent dataset of epithelial brushes. We confirmed higher miR-375 expression in current- than ex-smokers in COPD and non-COPD, but no effect of smoking status on miR-203a-3p or miR-31-3p. (Supplementary Table S4).

Associations between changes in miRNA expression and mRNA expression

To identify potential genes and pathways that are regulated by the three smoking-associated miRNAs, we used two different methods: (1) we investigated the global effects of the miRNAs by doing direct correlations with all expressed genes in the same biopsies, and (2) we investigated the direct effects by focusing on the negatively correlated predicted targets for each miRNA.

For the first analysis, we examined whether miR-375, miR-203a-3p and miR-31-3p affected the expression of mRNA transcripts in each study separately and then performed a meta-analysis for the miRNAs present in both studies (miR-375 and miR-203a-3p). Here, we identified 983 significant negatively correlated genes for miR-203a-3p (FDR < 0.05), and 2254 positively correlated genes. We found 2442 significantly negatively correlated genes for miR-375 and 865 positively correlated genes, and we found 1106 significant negative correlations for miR-31-3p and 1029 positive correlations (Table S1). The top 3 most strongly correlated genes (PDGFD, ARL4C and MBNL1) of both miR-375 and miR-203a-3p are shown in Fig. 3C–H. Gene functions are described in Table 3.

Figure 3
figure 3

Significant pathway negatively enrichment by miRNA-203a-3p and miR-375 and correlation of aforementioned miRNAs with genes in both COPD patients as well as respiratory healthy controls, excluding correlations that were considered not biologically relevant due to a low r-value. (A) Enriched pathways affected by miR-203a-3p in COPD patients. (B) The enriched pathways affected by miR-375 in COPD patients. Scatter plots (CE) show the correlation between miRNA expression and gene expression levels in COPD patients for all significant negatively correlated predicted targets; miR-203a-3p and PDGFD, miR-3758, and ARL4C, and miR-375 expression and MBNL1, respectively. Smoking participants are shown in blue, ex-, and non-smokers are shown in red. Scatter plots (FH) show the correlation between miRNA expression and gene expression levels in respiratory healthy controls between miR-203a-3p and PDGFD, miR-375, and ARL4C, and mir-375 and MBNL1, respectively.

Table 3 Selective literature of predicted target genes that were negatively correlated to miR-203a-3p, miR-375 and miR-31-3p, whilst significant after our meta-analysis (miR-203a-3p and miR-375) or FDR significant (miR-31-3p) after correlation.

Correlation with predicted target genes

For the targeted analysis, we focused on negatively correlated predicted targets for miR-203a-3p, miR-375 and miR-31-3p. First, predicted targets were identified using miRNAtap, and then the overlap was taken from the negatively correlated genes from the previous analysis.

Of the 983 genes negatively correlated with miR-203a-3p, two genes were also predicted targets; PDGFD, previously associated with fibroblast proliferation and survival, and EBF3, previously associated with B cell differentiation and cell survival through apoptosis and cell cycle arrest32,33,34.

Of the 2442 genes negatively correlated to miR-375, 18 were also predicted targets; ZNF385D, MBNL1, ARL4C, PDE5A, MMD, APBB2, LDHB, HAS2, CXCL12, CNIH4, APBB1IP, DTHD1, LST1, RLF, SLC16A2, TCF12, TRAPPC6B and WBP1L (Fig. 3C–H).

For miR-31-3p, we found that 14 of the 1106 negatively correlated genes were also predicted targets; NFATC2, PLCB1, PDE5A, RXFP1, DLC1, TOX, C1QTNF7, PIK3CG, IL15, DIRAS3, WASF3, LGI2, KCNIP2 and TMOD2.

Pathways affected by miRNA expression that change with current smoking patients

To investigate the pathways influenced by changes in miRNA levels, we performed pathway analysis on (1) all genes of which the expression correlated to the identified miRNAs, for a global effect (Fig. 3A,B), and (2) the significantly negatively correlated predicted targets of the miRNAs, for a direct effect of these miRNAs.

For miR-203a-3p, we found a higher expression associated with genes involved in cellular protection (xenobiotic and glutathione metabolism pathways), cellular repair (cell cycle, citrate cycle pathways), and protease pathway, as well as negative associations with pro-inflammatory pathways. In addition, the MAPK signalling pathway was negatively associated with miR-203a-3p associated genes in COPD patients (Table 4).

Table 4 Significant gene sets negatively affected by the identified miRNAs in COPD patients and asymptomatic participants. Only the top pathways with a family-wise error rate < 0.05 in the COPD patients are shown.

We found a global positive effect of genes associated with a higher expression of miR-375 on olfactory transduction, glycan biosynthesis and linoleic acid metabolism, and adverse effects on pro-inflammatory pathways in COPD patients, for example, via the cytokine-cytokine receptor integration pathway (Table 4).

A higher expression of miR-31-3p had positive associations with genes involved in pathways related to xenobiotic metabolism, cellular repair and linoleic acid metabolism, as well as negative associations with genes involved in pro-inflammatory pathways.

Relation between cell-type composition and genes associated with miRNAs

Two methods were used to assess potential effects of cell-type composition on the expression of the miRNA-associated genes, i.e. GSVA based on single-cell gene signatures derived from a publically available dataset and correlation analysis with inflammatory cell counts assessed in adjacent biopsies of the same patients.

With the GSVA analysis, we found a significant negative correlation between miR-203a-3p expression and cell type composite scores of single-cell signatures for fibroblasts, luminal macrophages, endothelial cells, mast cells, smooth muscle cells, B-cells, submucosal cells and inflammatory dendric cells. In contrast, we found a positive association of this miRNA with proliferating basal cells and club and goblet cells with current smoking in both COPD patients and respiratory healthy controls.

For miR-375, we found a negative correlation with cell type composite scores for endothelial cells, smooth muscle cells, mast cells, fibroblasts, basal cells and luminal macrophages., and a positive correlation with club and goblet cells, ionocytes and submucosal cells in both studies (Supplementary Tables S2 and S3). Activated endothelium, fibroblasts, luminal macrophages, mast cells and smooth muscle cells GSVA composite scores were negatively associated with mir-31-3p while proliferating basal cells and club and goblet cells were positively associated.

In addition, we correlated the miRNA expression profiles to cell counts of eosinophils, neutrophils, macrophages, mast cells, CD3 + , CD4 + or CD8 + T-cells, CD20 + B-cells, and percentage of PAS staining positive mucus in the COPD study. We found a negative correlation between eosinophils and miR-203a-3p expression and a positive correlation between miR-31-3p and epithelial cells, but no association with miR-375 (data not shown).


In the current study, we identified two miRNAs (miR-203a-3p and miR-375) that are significantly higher expressed with current smoking in COPD patients and respiratory healthy subjects, indicating a common effect of smoking regardless of disease status. Additionally, we found that miR-31-3p is associated with current smoking only in COPD patients.

We found that miR-375 is higher expressed in bronchial biopsies of current-smoking COPD patients and respiratory healthy controls, and we validated this finding in an independent dataset. We identified 18 predicted target genes that decrease with increased miR-375 expression. These included TCF12, WBP1L, CXCL12, MMD, and LST1 that were previously associated with inflammation via T-cell precursor differentiation35, altering B-cell development36, chemotaxis57, differentiation of monocytes to macrophages53, or differentiation of other cells59. These predicted target genes were not enriched in one specific pathway. However, our global miRNA pathway analysis showed that genes associated with miR-375 are lower expressed in pro-inflammatory pathway activity, such as the IL2RB and B-cell receptor signalling pathway. We continued our investigation using cellular deconvolution and found that higher levels of miR-375 are associated with lower levels of basal cells, fibroblasts, mast cells, smooth muscle cells, endothelial cells and luminal macrophages and higher levels of secretory cells, submucosal cells and ionocytes. In contrast, no association was found with inflammatory cell populations as determined histologically. As we consistently find miR-375 being increased by smoke across all datasets and a positive correlation between miR-375 with secretory cells, this may indicate cell-specific expression in secretory cells as it is well established that the levels of secretory cells are higher in current smokers37. The high number of genes and pathways related to inflammation changed by miR-375, but the lack of correlation to gene signatures typical for inflammatory cells might indicate an increased activation of inflammatory cells caused by miR-375 rather than a change in cell numbers or composition.

We showed that miR-203a-3p is more highly expressed in bronchial biopsies of current- compared to ex-smokers with COPD and in current- versus non-smoking respiratory healthy subjects. Furthermore, two negatively correlated predicted targets of miR-203a-3, EBF3 and PDGFD, were involved in cell cycle arrest and apoptosis31 and fibroblast proliferation29,30, respectively. Therefore, the lower EBF3 and PDGFD in response to smoking in association with miRNA-203a-3p by smoking potentially indicates a protective response and stimulation of repair as a response to cigarette smoke. In addition, we found that genes associated with xenobiotic metabolism were positively correlated with miR-203a-3p, suggesting that this miRNA facilitates the metabolism of harmful particles in cigarette smoke38. Additionally, we found that genes involved in pro-inflammatory pathways (e.g. cytokine-cytokine receptor interaction, IgA production and chemokine signalling) were negatively correlated with miR-203a-3p, suggesting an anti-inflammatory effect for this miRNA. We associated these microRNAs with cell signatures to assess whether a change in cell populations may drive the smoking-induced increase in miR-203a-3p expression. We found higher smoking-induced expression of miR-203a-3p associated with lower proportions of fibroblasts, luminal macrophages, endothelial cells, mast cells, smooth muscle cells, B-cells, inflammatory dendritic cells, and submucosal cells and higher proportions of secretory cells (club and goblet cells) and proliferating basal cells. Therefore, it could be speculated that miR-203a-3p might play a role in the basal cell's differentiation towards secretory cells, leading to more mucus production in smokers or that this miRNA is selectively expressed in secretory cells. This, however, has now only been demonstrated in biopsies and requires further investigation in isolated cells to see the relationship of this miRNA with basal cell differentiation towards secretory cells, with and without smoke exposure.

MiR-31-3p was higher in current- compared to ex-smoking COPD patients, and this miRNA was not found in the respiratory healthy control dataset. We found 14 predicted target genes that decreased with an increase of miR-31-3p. Among these genes was PDE5A, which was involved in smooth muscle contraction and relaxation, as well as cell proliferation, cell signalling and included pro-inflammatory w Inhibition of PDE5A is involved in preventing tobacco smoke-induced emphysema39, suggesting a protective role of this miRNA in current smokers with COPD as an attempt to limit the harmful effects of smoking40. In a previous study, it was shown that PDE5A was decreased after smoke exposure in lung tissue of healthy individuals40. The higher expression of miR-31-3p could suppress PDE5A expression even further in patients with COPD.

There were some limitations to the current study. As the miRNAs expression in the healthy group were conducted on different platforms, these miRNAs were not directly comparable. This may have led to the lack of expression of miRNA-31-3p in the healthy cohort. However, as the healthy cohort was conducted with small-RNA-Seq, it should have been picked up when present, and therefore we are confident in the current results. As a strength of the study, we were able to directly correlate the miRNA to matched expression data in the same subject allowing for the identification of direct miRNA-target gene interactions. Predicted targets for the miRNAs were identified using miRNAtap, which relies on algorithms and databases. However, as these are only predictions and not necessarily accurate, having the paired gene expression and miRNA expression needs additional confirmation of the associations.

In conclusion, we found two miRNAs, miR-203a-3p and miR-375, that are upregulated in bronchial biopsies of smokers compared to ex- and non-smokers in both COPD and respiratory healthy controls. These miRNAs play a role in the detoxification and inflammatory response to smoke response. In addition, we identified that miR-31-3p was upregulated only in current smokers versus ex-smokers with COPD. miR-31-3p might have a protective role via decreasing PDE5A expression and thus smoke-induced emphysema. We propose that our identified miRNAs are candidates for future studies aimed at unravelling which smoking-induced inflammatory mechanisms are mainly reactive and which are involved in COPD pathogenesis.