Introduction

Respiratory pathophysiology is a common long-term consequence of preterm birth1, and Bronchopulmonary Dysplasia (BPD), also called Chronic Lung Disease of Prematurity, remains a major clinical challenge for neonatologists2. It is well established that survivors of preterm birth, both with and without a diagnosis of BPD, are at risk of persistent lung function deficits throughout childhood and adulthood3, with an increased propensity to respiratory symptoms, hospitalisation and inhaler use4, and potential premature development of chronic obstructive pulmonary disease (COPD)5. Whilst many of these children are diagnosed with asthma, it is becoming apparent that there are more complex respiratory phenotypes following preterm birth6,7; however, the underlying pathophysiology remains poorly characterised8. Our recent randomised controlled trial (RCT) showed that a combination of inhaled corticosteroids and long-acting β2-agonists was an effective treatment for prematurity-associated lung disease7, but appropriate therapeutic interventions can be developed and brought into clinical use only by understanding the underlying mechanisms and identifying the various endotypes.

Exhaled breath condensate (EBC) is a useful sample to study in children because it is easy to collect. EBC is composed of droplets of the airway lining fluid (ALF), evolved from all compartments of the lung during tidal breathing. It is a complex mixture of DNA, RNA, proteins, metabolites and volatile organic compounds reflecting lung tissue biology9. EBC is of interest for studying respiratory pathology because its collection is simple, non-invasive and easily repeatable10, and it has been used to study mechanisms in asthma, COPD and bronchiolitis. Identifying the proteome in EBC has been challenging, but recent developments aid identification and accurate quantification of large arrays of proteins. Proteomics methods simultaneously analyse the entire protein complement of biological samples and have gained interest clinically as a potential tool for unravelling disease pathogenesis and identifying biomarkers11,12,13.

We hypothesised that the EBC proteome in preterm-born children with lung disease would be altered when compared to term-born controls, and that treatment, in an RCT, would normalise any abnormalities detected. We therefore compared the EBC proteome obtained from preterm-born school-aged children, with and without BPD, to that of term-born controls. In addition, we compared the EBC proteome in preterm-born children with low lung function who were treated with inhaled corticosteroids (ICS), a combination of ICS and long-acting β2 agonist (LABA), or placebo in an RCT7.

Methods

Participants

This study was conducted on a cohort of children recruited to the Respiratory Health Outcomes in Neonates study (RHiNO, EudraCT: 2015-003712-20) which has been described previously6,7. Briefly, children from a previous study4 were supplemented with additional preterm-born children identified by the NHS Wales Informatics Service and sent a respiratory and neurodevelopmental questionnaire if they were born ≤ 34 or ≥ 37 weeks’ gestation and were aged 7–12 years. Children with significant congenital malformations, cardiopulmonary or neuromuscular disease were excluded. Ethical approval was obtained from the South-West Bristol Research Ethics Committee (15/SW/0289). Parents gave informed written consent and children provided assent. The study was conducted according to the Good Clinical Practice (GCP) guidelines and the Declaration of Helsinki.

Responders were assessed at their home and a subset attended the hospital children’s research facility for comprehensive respiratory testing including collection of EBC (RTube®, Respiratory Research Inc. Texas, USA), conducted by a trained nurse and paediatrician between January 2017 and November 2019. Spirometry (MasterScreen Body and PFT systems, Vyaire Medical, Germany) was performed according to ATS/ERS guidelines14 and normalised using Global Lung Function Initiative (GLI) references15. Those preterm-born children with low lung function (PTlow), defined as percent predicted forced expiratory volume in 1 second (%FEV1) of ≤ 85%, were enrolled into the RCT7. Term-born children who had %FEV1 > 90% were included as term controls. BPD was defined as oxygen-dependency of 28 days or greater for those born < 32 weeks’ gestation and at 56 days of age for those born ≥ 32 weeks’ gestation16. Intrauterine growth restriction (IUGR) was defined as birthweight < 10th percentile adjusted for sex and gestation (LMS Growth version 2.77, Medical Research Council, UK). Neonatal history was corroborated with medical records.
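The group definitions above can be summarised as a short sketch. This is illustrative only and is not the study's code: the function and argument names are hypothetical, gestation is assumed to be supplied in weeks, and the treatment of every preterm-born child who is not PTlow as a preterm control (PTc) is inferred from the group sizes reported in the Results.

```python
def classify_participant(gestation_weeks, fev1_percent_predicted):
    """Return 'PTlow', 'PTc', 'Term control', or None if not classified.

    Hypothetical helper: thresholds follow the text (preterm <= 34 weeks,
    term >= 37 weeks, PTlow %FEV1 <= 85, term control %FEV1 > 90).
    """
    if gestation_weeks <= 34:  # preterm-born
        return "PTlow" if fev1_percent_predicted <= 85 else "PTc"
    if gestation_weeks >= 37:  # term-born
        return "Term control" if fev1_percent_predicted > 90 else None
    return None  # 35-36 weeks' gestation was outside the recruitment criteria


def meets_bpd_definition(gestation_weeks, oxygen_days=0, on_oxygen_at_56_days=False):
    """Apply the BPD definition quoted in the text (hypothetical inputs)."""
    if gestation_weeks < 32:
        return oxygen_days >= 28  # oxygen dependency of 28 days or greater
    return on_oxygen_at_56_days  # assessed at 56 days of age


print(classify_participant(29, 80))   # PTlow
print(meets_bpd_definition(29, oxygen_days=40))  # True
```

Term-born children with %FEV1 of 90% or below fall outside the term control definition, so the sketch returns None for them rather than guessing a group.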

PTlow participants were enrolled to a twelve-week blinded RCT, receiving ICS (50 μg fluticasone propionate), a combination of ICS and LABA (ICS/LABA) (50 μg fluticasone propionate and 25 μg salmeterol xinafoate) or placebo. Following treatment, RCT participants underwent repeat EBC sampling. The RCT has recently been published7 and is described in detail in the online supplement.

EBC sampling

EBC was collected in a standardised manner using a cooling tube (RTube®, Respiratory Research Inc. Texas, USA) over a period of 10 min of passive tidal breathing whilst the participant wore a nose clip, stopping briefly to swallow saliva if needed, as per manufacturer’s guidance. The RTube® is a single-patient, single-use design, preventing cross contamination, and features a large ‘Tee’ section to separate saliva from exhaled breath, thereby ensuring collection of airway lining fluid and not secretions from the oropharynx. Once collected, samples were immediately separated into aliquots and stored at − 70 °C pending analysis.

Sample analysis

In brief, 100 µL of each sample (ensuring that no sample contained more than 50 µg of protein) was digested with trypsin, labelled with Tandem Mass Tag (TMT) 11-plex reagents (Thermo Fisher Scientific, Loughborough, UK) and the labelled samples pooled. The TMT-labelled pool was fractionated using an Ultimate 3000 nano-LC system (Thermo Scientific). All spectra were acquired using an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific) controlled by Xcalibur 3.0 software (Thermo Scientific) and operated in data-dependent acquisition mode using an SPS-MS3 workflow. The raw data files were processed and quantified using Proteome Discoverer software v2.1 (Thermo Scientific) and searched against the UniProt Human database (downloaded October 2019: 150,786 entries). Further detail on sample processing and proteomic analysis is given in the online supplement.

Statistical analysis

Baseline population and RCT group characteristics were compared using Chi-squared, t-test or one-way ANOVA with Bonferroni correction as appropriate. Replicate numbers (number of samples in which a particular protein was detected) were calculated. Relative protein abundances, determined from the quantity of TMT-tag counts at each detected peptide’s spectral peak, were log2-transformed, the data were inspected for normality, and fold changes (log2FC) between groups were compared. Welch’s t-test or ANOVA with post-hoc Tukey correction was used for baseline samples as appropriate, and a paired-samples t-test for pre/post-RCT samples. p < 0.05 was considered statistically significant. All analyses were performed using R (v4.0.4)17. Gene name is used synonymously with protein name. Gene names were unavailable for four proteins. WebGestalt was used to perform functional enrichment analysis18. Ingenuity Pathway Analysis (IPA, Qiagen®, Germany) identified relationships between significantly different proteins using network maps, which were reproduced for publication in Cytoscape (v3.9)19. Linear regression models were used to identify associations between participant characteristics and proteins of interest.
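As an illustration of the baseline comparison described above, the following minimal sketch computes a log2 fold change and a Welch t statistic for a single protein. It is not the study's R code. The function names and toy abundances are hypothetical, and it assumes the log2FC is the difference between the group means of log2-transformed abundances, which the text does not state explicitly. Obtaining a p-value additionally requires the t distribution (for example `scipy.stats.ttest_ind` with `equal_var=False`).

```python
import math
from statistics import mean, variance


def log2_fold_change(group_a, group_b):
    """Mean log2 abundance of group_a minus that of group_b (assumed definition)."""
    return mean(math.log2(x) for x in group_a) - mean(math.log2(x) for x in group_b)


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom.

    Inputs are values already on the log2 scale; group variances are
    not assumed to be equal.
    """
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy raw abundances (hypothetical values, not study data)
bpd = [4.0, 4.0, 8.0]
no_bpd = [8.0, 8.0, 16.0]
print(log2_fold_change(bpd, no_bpd))  # negative: less abundant in the first group
print(welch_t([math.log2(x) for x in bpd], [math.log2(x) for x in no_bpd]))
```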

Results

From 1426 returned questionnaires, 768 children had home assessments and 241, including 53 enrolling into the RCT, underwent detailed assessments. EBC was successfully collected and analysed from 218 (91%) children at baseline. Forty-eight of the 53 RCT participants completed treatment and 46/48 (96%) post-treatment EBC samples were successfully collected and analysed. Participant demographics at baseline and for the RCT groups are shown in Table 1. At baseline, significant differences were noted between the preterm-born and term-born children for age at testing (mean 11.01 ± 1.24 years vs 10.43 ± 1.09, p = 0.001) and asthma diagnosis (34 (23%) vs 5 (7%), p = 0.007). Thirty-seven (25%) of the preterm-born children had a neonatal diagnosis of BPD and 53 (36%) were classed PTlow, all of whom joined the RCT. There were no differences for asthma diagnosis (10 (27%) vs 24 (21%); p = 0.67) or IUGR (8 (22%) vs 19 (17%); p = 0.70) between the preterm BPD and No BPD groups, nor between the PTlow and preterm control (PTc) groups (asthma: 16 (30%) vs 18 (19%); p = 0.19; IUGR: 12 (23%) vs 15 (16%); p = 0.35 respectively). Marginally less EBC was collected from preterm-born than from term-born children (1.13 ml vs 1.28 ml, p = 0.001), but no significant differences were noted between the preterm groups (BPD or PTlow vs preterm controls, p = 1.0) or between the three RCT groups. Demographics were similar for the three RCT groups. However, the placebo group produced more EBC after treatment than before (p = 0.02), whereas the ICS and ICS/LABA groups did not (p > 0.1).

Table 1 Participant demographics.

We identified 210 different proteins as detailed in the online supplementary Table 1 together with replicate number. The distribution of detected proteins across all samples is shown in the heatmap (Supplementary Fig. 1). Functional enrichment analysis (which determines classes of proteins that are over-represented within a large group of proteins) was possible for 192 proteins (Supplementary Fig. 2). Twenty-eight proteins were identified with a significant difference between one or more of the group comparisons and functional enrichment analysis was possible for 27 of these. Most proteins with significantly differing abundances were functionally related to protein/ion binding and cell structure.

Baseline samples

Nineteen proteins were detected in all 218 baseline EBC samples (Table 2). Cytokeratins were the most frequently detected protein class. Increased abundance of two keratins, type II cytoskeletal 5 (KRT5) (Log2FC 0.12, p = 0.03) and 6A (KRT6A) (Log2FC 0.14, p = 0.02), was observed when all preterm-born children were compared with term-born children.

Table 2 Proteins detected in every sample.

Exploratory analyses of proteins with a significant abundance difference between the groups that were not detected in every sample are shown in Supplementary Table 2, ordered by decreasing replicate number. Eleven proteins were detected with significant differences between preterm- and term-born children, nine between BPD and No BPD, and seven between PTlow and PTc groups. Figure 1 shows all significantly different protein abundances between the BPD and No BPD groups; and PTlow and PTc groups.

Figure 1

Volcano Plots demonstrating baseline protein abundance by BPD and Lung Function Status for Preterm-born children. Vertical line represents a Log2FC of 0. Horizontal line is equivalent to p-value 0.05. Size of point is relative to number of samples in which protein was detected. Gene name associated with protein given if p < 0.05. BPD, bronchopulmonary dysplasia; PTlow, preterm-born with low lung function; PTc, preterm-born control; Log2FC, Log2 fold-change between groups.
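For readers unfamiliar with volcano plots, the sketch below shows how one protein could be placed on such a plot. It is a hypothetical illustration rather than the plotting code used for the figure: the function name is invented, and it assumes the vertical axis is −log10(p), so that the horizontal reference line for p = 0.05 lies at about 1.30.

```python
import math


def volcano_point(log2fc, p_value, alpha=0.05):
    """Return plot coordinates and whether the point would be labelled.

    x is the log2 fold change; y is -log10(p), an assumed axis convention.
    """
    return {
        "x": log2fc,
        "y": -math.log10(p_value),
        "labelled": p_value < alpha,  # gene name shown only if p < 0.05
    }


threshold_y = -math.log10(0.05)  # height of the horizontal reference line
print(round(threshold_y, 2))
print(volcano_point(-0.26, 0.02))  # DSG1, BPD vs No BPD, values from the text
```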

BPD vs No BPD

For proteins detected in every sample, significantly decreased abundance (determined by TMT tag count at the spectral peak) of the desmosome proteins desmoglein-1 (DSG1) (Log2FC − 0.26, p = 0.02), desmocollin-1 (DSC1) (− 0.27, p = 0.02) and junctional plakoglobin (JUP) (− 0.23, p = 0.04), and increased abundance of KRT6A (0.68, p = 0.01) was observed when the BPD and No BPD groups were compared (Table 2). No significant differences were noted for DSG1, DSC1 and JUP between the No BPD and Term groups (Fig. 2).

Figure 2

Violin plots: Desmosome/cell adhesion baseline protein abundances in children with history of BPD. Term, term-born control; No BPD, preterm-born without BPD; BPD, bronchopulmonary dysplasia; Dot and bars represent mean and standard error (SEM); Comparison bars between violin plots give p-values by ANOVA with post-hoc Tukey’s for multiple comparisons. Coloured areas represent distribution of sample values.

Protein network maps highlighting significant protein pathways (including proteins detected in all or only some samples) comparing BPD and No BPD groups are shown in Fig. 3. For proteins detected in a proportion of samples, dermcidin (DCD) was detected in n = 146 (98%) samples, was less abundant in the BPD group (Log2FC − 0.43, p = 0.03), and was related to DSG1 and DSC1 in the network map. KRT6B was detected in 133 (89%) samples and, like KRT6A, was more abundant in the BPD group than in the No BPD group (0.76, p = 0.03). Small proline-rich protein 2E (SPRR2E), secretory leukocyte peptidase inhibitor (SLPI) and gamma-glutamyl hydrolase (GGH) were all significantly less abundant in the BPD group (− 0.92, p = 0.04; − 2.08, p = 0.04; − 0.79, p = 0.03 respectively); however, these were detected in < 25% of the samples.

Figure 3

Protein network map of significant protein differences between BPD and No BPD Preterm-born children.

Univariable linear regression analyses for DSG1, DSC1 and JUP (Table 3) identified that a history of BPD had a significant association with each of these three proteins (β = −0.23, p = 0.014; β = −0.27, p = 0.019; β = −0.23, p = 0.008 respectively) but sex, age, diagnosis of asthma and low lung function did not. Since low lung function in those with BPD was associated with lower desmosome proteins, we assessed the interaction of BPD and low lung function: the reduced abundance of DSG1, DSC1 and JUP was significantly or near-significantly associated with having both BPD and low lung function (β = −0.35, p = 0.012; β = −0.30, p = 0.06; β = −0.30, p = 0.01 respectively) but not with BPD and normal lung function (Table 3).
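One way to read the β coefficients for the combined BPD and lung function groups is as differences in mean log2 abundance from a reference group. The sketch below shows this for a dummy-coded categorical predictor, where each ordinary least squares coefficient equals the group mean minus the reference mean. It is an assumption-laden illustration: the text does not state how the interaction was coded, the function and group names are hypothetical, and the values are toy numbers rather than study data.

```python
from statistics import mean


def dummy_coded_betas(values_by_group, reference):
    """OLS coefficients for a single dummy-coded categorical predictor.

    With one categorical predictor and an intercept, each coefficient is
    the group's mean outcome minus the reference group's mean outcome.
    """
    ref_mean = mean(values_by_group[reference])
    return {
        group: mean(values) - ref_mean
        for group, values in values_by_group.items()
        if group != reference
    }


# Toy log2 abundances for one protein (hypothetical values)
log2_abundance = {
    "No BPD": [1.0, 1.2],
    "BPD, normal lung function": [1.0, 1.1],
    "BPD, PTlow": [0.7, 0.8],
}
print(dummy_coded_betas(log2_abundance, reference="No BPD"))
```

In this toy example only the group with both BPD and low lung function shows a clearly negative coefficient, mirroring the pattern reported in Table 3. Standard errors and p-values would need a full regression fit.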

Table 3 Linear regression analyses in preterm-born school-aged children.

PTlow vs PTc

For proteins detected in all samples, no significant differences were noted between the PTlow and PTc groups at baseline. A protein network map including all detected proteins with significant differences between PTlow and PTc groups is given in Supplementary Fig. 3. Three antiproteases (Annexin A1 [ANXA1], Serpin B3 [SERPINB3], SLPI) were less abundant in the PTlow group (− 0.77, p = 0.02; − 0.86, p = 0.01; − 2.32, p = 0.03 respectively), as was fatty acid-binding protein 5 (FABP5) (− 0.39, p = 0.04), when compared to the PTc group. The network map (Supplementary Fig. 3) did not demonstrate any direct links between these proteins.

RCT group

Figure 4, which includes proteins detected in all or some samples, shows significant differences before and after the three blinded inhaler treatments. Supplementary Table 3 shows the changes observed in the RCT treatment groups for proteins detected in all samples. Significant increases in abundance of DSG1 (0.58, p = 0.003), DSC1 (0.47, p = 0.048), JUP (0.52, p = 0.002), KRT2 (0.32, p = 0.047) and KRT10 (0.27, p = 0.04) occurred after ICS/LABA treatment. For proteins not detected in every sample, increases in Protein-glutamine gamma-glutamyltransferase-E (TGM3) (Log2FC 1.82, p = 0.005), Filaggrin-2 (FLG2) (0.76, p = 0.007) and Rab5 GDP/GTP exchange factor (RABGEF1) (0.76, p = 0.02), and a decrease in Heat shock protein beta-1 (HSPB1) (− 3.09, p = 0.04) abundances were noted after ICS/LABA treatment. The protein network map for ICS/LABA is shown in Supplementary Fig. 4.

Figure 4

Volcano Plots demonstrating protein abundance pre- and post-RCT treatment. Log2FC: Log2 fold-change between groups. Vertical line represents a Log2FC of 0. Horizontal line is equivalent to p-value 0.05. Size of point is relative to number of samples in which protein was detected. Gene name associated with protein given if p < 0.05.

Following ICS treatment, a significant increase in abundance of cytokeratin-1 (KRT1) (0.34, p = 0.03) and decreases in abundance of cystatin-A (CSTA) (− 0.66, p = 0.01) and Zinc-alpha-2-glycoprotein (AZGP1) (− 0.70, p = 0.03) were seen. The protein network map is shown in Supplementary Fig. 5. After placebo treatment, no differences were observed for proteins detected in every sample; immunoglobulin kappa constant (IGKC) (− 2.02, p = 0.04), Lipocalin-1 (LCN1) (− 1.35, p = 0.03), Plakophilin-1 (PKP1) (− 0.80, p = 0.03) and Catalase (CAT) (− 0.33, p = 0.04) decreased, but these proteins were detected in only some samples.

Figure 5 shows significant increases in DSG1, DSC1 and JUP occurred after ICS/LABA treatment which were not noted after ICS intervention. The PTlow group who had BPD in infancy had significant increases in abundance of all three proteins after ICS/LABA treatment, whereas PTlow without BPD only had significantly increased JUP abundance. Following ICS/LABA treatment in the PTlow with BPD group, levels of DSG1, DSC1 and JUP were comparable to the term control group at baseline (p = 0.56, 0.12, 0.06 respectively). Supplementary Fig. 4 demonstrates the biological links between these proteins and the changes observed for TGM3, FLG2, HSPB1, KRT2 and KRT10 as described above.

Figure 5

Violin plots of desmosome proteins before and after treatment with Placebo, ICS or ICS/LABA by BPD status. BPD, bronchopulmonary dysplasia; Coloured areas represent distribution of sample values. Dot and bars represent mean and standard error (SEM); Comparison bars between violin plots give p-values by paired samples t-test.
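The pre/post-treatment comparisons rely on a paired-samples t-test, which works on each child's change in log2 abundance rather than on the two group means. A minimal sketch follows, using a hypothetical function name and toy values that are not study data. The p-value would come from a t distribution with n − 1 degrees of freedom, for example via `scipy.stats.ttest_rel`.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Mean within-participant change, paired t statistic and degrees of freedom."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs), mean(diffs) / standard_error, n - 1


# Toy log2 abundances for four participants (hypothetical values)
pre_treatment = [1.0, 2.0, 3.0, 4.0]
post_treatment = [1.5, 2.0, 4.0, 4.5]
print(paired_t(pre_treatment, post_treatment))
```

Because both measurements are on the log2 scale, the mean difference can be read directly as a log2 fold change after treatment.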

Discussion

In this exploratory proteomic analysis, we have shown that preterm-born school-aged children who had BPD in infancy have significant differences in protein abundances for key structural proteins involving desmosomes and the cytoskeleton, several years after the initial pulmonary insult which occurred in the neonatal period. Linear regression models demonstrated that, for DSG1, DSC1 and JUP, children with a history of BPD and current low lung function had reduced abundance of these proteins but the BPD group with normal lung function did not. We have recently demonstrated that ICS/LABA is an effective treatment for preterm-born children with low lung function, increasing FEV1 by over 14% after treatment7. In this study, we show that the decreases for the desmosome proteins DSG1, DSC1 and JUP were reversed to levels observed in the term controls after 12 weeks of blinded ICS/LABA inhaler therapy. This effect was predominantly noted in children who had BPD and low lung function.

Why some preterm-born children continue to experience lung function deficits in later life, including in adulthood, remains incompletely understood. Desmosomes have historically been thought to provide inert structural support to tissues, providing strong cell-to-cell adhesion; however, more recent evidence shows that they have an active role in cell signalling, proliferation, migration, and apoptosis20,21. Despite minimal published evidence taking a proteomics approach, a reduction in desmosome proteins has been implicated in other respiratory pathologies. In a murine asthma model, bronchial wall tissue analysis noted reduced DSG1 expression following asthma exacerbation and reduced epithelial barrier integrity, potentially predisposing to further exacerbations22. Desmosome size and number are reduced in bronchial biopsies taken from asthmatic adults23, and two in-vitro studies reported that pro-inflammatory cytokines (TNF-α and IFN-γ) reduced desmosomes and JUP expression in bronchial epithelium, which was reversed by corticosteroids24,25. The reasons why we observed changes in DSG1, DSC1 and JUP after ICS/LABA therapy but not after ICS use are unclear. However, the pathophysiology of BPD differs from that of asthma, with post-mortems of infants dying from BPD showing airway smooth muscle extension distally into peripheral airways26, together with peri-bronchial fibrosis and CD8+ T-lymphocyte epithelial infiltrate in adolescent survivors of BPD27. A recent study of adult BPD survivors has also shown a higher proportion of CD8+ cells in bronchoalveolar lavage fluid, a finding in keeping with adults with chronic obstructive pulmonary disease (COPD)28.

The BPD group also had increased cytokeratins (KRT6A and KRT6B). Cytokeratins comprise the intracytoplasmic cytoskeleton of epithelial tissues forming important components of intermediate filaments, which connect to desmosomes, aiding resistance to mechanical stress29. Although cytokeratin detection could be due to epidermal contamination, cytokeratins have previously been shown to be the most abundant proteins in EBC30, and both KRT6A and KRT6B have been identified as potential biomarkers for lung carcinomas in EBC proteomic analyses13,31. In addition, cytokeratins in EBC are potential markers of lung injury in ventilated adults32, and serum cytokeratin-19 fragments are increased in ventilated preterm infants who develop BPD33. In conjunction with the changes in DSG1, DSC1 and JUP, increased KRT6A and KRT6B in the BPD group suggest persistence of parenchymal structural abnormalities which can potentially explain the abnormal lung function observed in preterm-born subjects in childhood and adulthood.

We focused our main analyses on proteins detected in every sample to capitalise on our large sample size and ensure robust findings. Overall, the protein content of EBC was low, as previously reported12,30, and close to the limits of detection. Thus, we performed exploratory analyses of proteins detected only in a proportion of samples, as our methodology allowed robust quantification of these proteins in multiple replicates, most of which exceed sample sizes of many other published proteomic studies. We detected DCD in a high proportion of our samples (98%), noting significantly decreased abundance in the BPD group. DCD, a peptidase with antimicrobial activity, has been described in EBC samples previously30; increased detection was weakly associated with asthma in a small paediatric proteomic study12. In addition, we observed reduced abundance of several protease inhibitors in the BPD and PTlow groups, including ANXA1, SERPINB3, CSTA and SLPI, with reduced abundance of SLPI being noted in both BPD and PTlow groups. We have previously demonstrated that protease/antiprotease imbalance, and subsequent tissue remodelling, may be implicated early in the pathogenesis of BPD34, but this has not been reported in later life. Tracheobronchial aspirates from ventilated preterm-born neonates who developed BPD have relative deficits of SLPI, with increased protease activity early in life35. Our result should be interpreted with caution as SLPI was detected in a minority of samples. ANXA1, a protease inhibitor also known to have innate immune properties, which was decreased in the PTlow group but not in the BPD group, has been implicated in early lung injury in neonatal mouse models36, and also in proteomic studies of EBC from adults with pneumonia31. The decrease in antiproteases suggests an imbalance in protease/antiprotease activity, but additional work will be required in more invasive samples (e.g. bronchoalveolar lavage or induced sputum) which are ethically more challenging to obtain.

It is established that preterm-born survivors, both with and without BPD, are at risk of lung function deficits in later life3, but there is increasing evidence that a diagnosis of BPD in infancy16 is a poor predictor of future lung function deficits6,37. In our cohort, we saw fewer differences in biologically related proteins at baseline when comparing PTlow and PTc groups than when comparing those with and without BPD, and fewer than half of the children in the RCT had BPD. It is most likely that the decrease observed in DSG1, DSC1 and JUP in the BPD group, which is reversed by combination inhaler therapy, is due to cellular injury secondary to continuing airway inflammation38,39, although further work is needed to clarify this speculation. HSPB1 decreased in children treated with ICS/LABA. HSPB1 is a small heat-shock protein family member, which controls protein folding and prevents aggregation. HSPB1, which has been previously detected in the EBC proteome40, has been shown to have an important role in cellular responses to oxidative stress, preventing apoptosis and regulating inflammation41, adding further to the suggestion that chronic airway inflammation contributes to low lung function. Previous studies have also demonstrated evidence of persistent airway inflammation in children several years after preterm birth (< 32 weeks’ gestation), with increased neutrophils and IL-8 in induced sputum38; however, the link between this chronic inflammation and lung function parameters is not clear. We did not observe differences in proteins between the PTlow group who did not have BPD in infancy and the control groups; previous publications have also not been able to report a link between EBC biomarkers and lung function parameters42. We noted a change in protein abundances after placebo treatment which we are not able to explain. This change was not associated with any early or current life factors available to us, and whether it is due to a placebo effect is speculative (especially as lung function remained unaltered).

This study represents one of the largest proteomic analyses of EBC, and the first time, to our knowledge, that preterm-born children have been studied in this way. By using EBC, we have been able to directly sample ALF, representative of the biochemistry of the airways, in a simple, well-tolerated and non-invasive manner. We have demonstrated that it is technically possible to perform a quantitative proteomic analysis of EBC using TMT on a large sample size and identify meaningful differences between our clinical groups. By restricting our primary analysis to proteins detected in every sample, we report robust findings, strengthened further by the modulation of these proteins of interest after inhaler treatment. Our untargeted approach and exploratory analysis of less frequently detected proteins have also implicated potentially important protease/antiprotease dynamics that future work should explore. Although collected EBC volumes varied marginally between preterm- and term-born children, our methodology was based on identifying differences in relative protein abundances, so our findings are likely to remain robust. Although some children did not complete their treatment in the RCT7, and some could not provide EBC after treatment, we believe that we had sufficient numbers in each group to compare EBC protein results before and after treatment. Limitations include the overall low protein content of EBC, as discussed above, and the relatively low number of proteins detected in every sample, which limited the statistical analysis approaches we could undertake. There may have been very low levels of some proteins in the samples which did not reach the limit of detection for the TMT methodology we utilised.

In conclusion, in an exploratory proteomics analysis we report significantly decreased desmosome components, DSG1, DSC1 and JUP, in children who had BPD in infancy. Furthermore, additional analyses suggest that there may be a protease/antiprotease imbalance. Taken together with the recent RCT findings7, these data suggest that persistent structural injury to the parenchyma in those who develop BPD in infancy is a major contributor to decreased lung function in childhood and possibly adulthood but encouragingly can be reversed by ICS/LABA treatment.