Introduction

Chronic respiratory disease is a global health problem that brings heavy economic and health burdens to society1. The prevention and management of chronic respiratory diseases have become a great challenge. Spirometry is an important tool in the diagnosis, assessment, and management of chronic respiratory diseases2,3. However, the result of spirometry contains various parameters and may be complicated for some patients to understand. Spirometry-derived lung age (LA), a simplification of spirometry data, might be an alternative tool in the management of chronic respiratory diseases.

The concept of “lung age” was first proposed in 1985 to make spirometry data easier to understand and as a physiological instrument to evaluate the lung function damage caused by smoking4. A lung age older than the chronological age is considered an indication of the accelerated decline or impairment of lung function, and the difference between lung age and chronological age (i.e., ∆ lung age) is used to estimate the severity of this functional impairment. For example, if a 50-year-old man has a lung age of 60 years old, then his ∆ lung age is 10 years, indicating there may be an impairment of his lung function. A randomized controlled trial showed that telling smokers their lung age can improve the success rate of smoking cessation5. Lung age is also used as a clinical indicator in studies regarding physiological changes in individuals with morbid obesity6 and treatment efficacy in asthmatic patients7. However, it has not yet been widely used or fully exploited due to the doubt regarding the current lung age estimation methods8.

As a simplified lung function indicator, lung age was originally estimated by the backward calculation of the reference equation of spirometric parameter (usually the forced expiratory volume in 1 s [FEV1]) based on the assumption that the lung age of an individual equals the chronological age of a healthy nonsmoker who has the same FEV1 level as the individual (Morris et al.4, 1985; Hansen et al.9, 2010; Newbury et al.10, 2010). However, this estimation method may decrease the reliability of lung age, as it estimates lung age by using only one spirometric parameter11, whereas various spirometric parameters can reflect the functional changes of the lung during aging or in the context of disease. More importantly, lung age derived by this method is a mean value, not a normal range of the population. Neglecting the normal variability in lung function between individuals makes lung age unable to be correctly interpreted as normal or abnormal8. To overcome these problems, Yamaguchi et al.11,12 proposed the use of multiple linear regression to develop lung age estimation equations, with the chronological age of healthy nonsmokers as the dependent variable and spirometric parameters as well as height as explanatory variables. The limitation of this method is the assumption that the relationship between age and spirometric parameters can be approximated by a linear function11. Previous studies13 have revealed that the relationship between age and spirometric parameters is nonlinear over the full age range, indicating that nonlinear regression may be more suitable for depicting the relationship between lung age and spirometric parameters.

Considering the potential value of lung age in the management of chronic respiratory diseases, this study aimed to develop new lung age estimation equations, under the hypothesis that nonlinear regression is a more appropriate method for building them. Furthermore, the lung age of patients with chronic obstructive pulmonary disease (COPD) and asthma was estimated to explore the clinical application of lung age in chronic respiratory diseases.

Results

Demographic characteristics

As shown in Fig. 1, 2931 healthy subjects were included in the modelling group and 478 healthy subjects in the validation group. The demographic characteristics and spirometric variables of the healthy subjects are presented in Table 1 and Supplementary Fig. 1. After propensity score matching (PSM), 280 patients with COPD (70 patients in each stage) and 70 COPD-matched healthy subjects were included in the analysis. As the number of stage IV asthmatic patients was limited, PSM was performed between stage I–III asthmatic patients and healthy subjects, and 285 asthmatic patients (78 patients in each of stages I–III and 51 patients in stage IV) and 78 asthma-matched healthy subjects were included in the analysis. The distributions of age, height and sex ratio were similar between the matched healthy subjects and patients with COPD or asthma (Tables 2 and 3).

Fig. 1: Study flow chart.

The left panel displays the data inclusion and exclusion of healthy subjects in the modelling group; the right panel displays the data inclusion and exclusion of healthy subjects in the validation group and of patients with COPD and asthma.

Table 1 Demographic characteristics and spirometric parameters (mean ± standard deviation) of healthy subjects.
Table 2 Demographic characteristics and spirometric parameters (mean ± standard deviation) of COPD patients and matched healthy subjects.
Table 3 Demographic characteristics and spirometric parameters (mean ± standard deviation) of asthmatic patients and matched healthy subjects.

Lung age estimation equations

A series of models composed of different variables were built by multiple linear regression, piecewise linear regression, and the natural cubic spline method, respectively. Models with the highest adjusted R2 values of each method are presented in Fig. 2 and Supplementary Table 1. Among these models, the one with the highest adjusted R2 was built by the spline method and composed of FEV1, FEF50%, FEF75%, and height as explanatory variables (Table 4). This model was defined as the estimation equation of lung age and was used to derive lung age for patient groups.
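The selection rule described above (keep the candidate with the highest adjusted R2) can be sketched as follows. This is a minimal illustration that assumes the standard definition of adjusted R2; the function name, the candidate labels, and every number in the example are hypothetical and are not taken from the study.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R2: 1 - (1 - R2) * (n - 1) / (n - p - 1).

    n_predictors counts estimated terms excluding the intercept
    (for a spline model, each basis term counts as one).
    """
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("not enough observations for this many predictors")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical candidates: (label, raw R2, number of estimated terms).
# These values are placeholders for illustration, not study results.
candidates = [
    ("model_a", 0.60, 4),
    ("model_b", 0.62, 8),
    ("model_c", 0.63, 12),
]
n_obs = 1000  # hypothetical sample size

# Pick the candidate with the highest adjusted R2.
best = max(candidates, key=lambda c: adjusted_r2(c[1], n_obs, c[2]))
```

Because the adjustment penalises extra terms, a more flexible model is preferred only when its gain in raw R2 outweighs the added parameters.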

Fig. 2: Fitting curves of the lung age estimation equations developed by different methods.

Panel A, fitting curves of lung age estimation equations in males; Panel B, fitting curves of lung age estimation equations in females.

Table 4 Lung age estimation equations developed by the spline method.

Internal validation showed that the coefficients of independent variables, adjusted R2 and mean square error (MSE) of the bootstrap validation models were similar to those of the primary model (see Supplementary Table 2), suggesting that the equations developed in this study performed well in internal prediction. External validation showed that the MSE of ∆ lung age in the validation group (69.9) was smaller than that of the modelling group (73.0–75.6), indicating that the validation group did not present larger differences between the estimated lung age and the chronological age than the modelling group (Supplementary Tables 2 and 3). The MSE of ∆ lung age estimated by nonlinear regression (piecewise linear regression: 69.3, spline method: 69.9) was smaller than that estimated by multiple linear regression (78.8), suggesting that nonlinear regression had smaller errors than multiple linear regression in estimating lung age in the validation group (Supplementary Table 3).
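The MSE of ∆ lung age reported above can be read as the mean of the squared differences between estimated lung age and chronological age. The sketch below assumes that definition, which the text does not spell out; the function name and the sample values are hypothetical and are not study data.

```python
def mse_delta_lung_age(lung_ages, chronological_ages):
    """Mean squared delta lung age, with delta = estimated lung age - chronological age."""
    if len(lung_ages) != len(chronological_ages) or not lung_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    deltas = [la - ca for la, ca in zip(lung_ages, chronological_ages)]
    return sum(d * d for d in deltas) / len(deltas)


# Hypothetical values for illustration only (not study data):
# deltas are 10, -3 and 0 years.
example_mse = mse_delta_lung_age([60.0, 42.0, 35.0], [50.0, 45.0, 35.0])
```

Since chronological age is the regression target in healthy subjects, a smaller MSE means that the estimated lung age tracks chronological age more closely in that group.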

Upper normal limit of ∆ lung age

Lung age and ∆ lung age of healthy subjects in the modelling group were calculated using the new lung age equations. As ∆ lung age is of greater practical use in the interpretation of lung age, the normal limit of ∆ lung age was explored. Analysis of the ∆ lung age of the modelling group showed that ∆ lung age was negatively correlated with chronological age but not with height or FEV1 (shown in Supplementary Fig. 2), and there was no significant difference in ∆ lung age between healthy males and females (male median: 0.92 years, female median: 0.89 years, P = 0.966). Thus, age-dependent normal limits of ∆ lung age were derived from a regression model between ∆ lung age and chronological age (Supplementary Fig. 2a). Since those with higher ∆ lung age (older lung age) are of greater clinical interest, we only derived the upper limit of normal (ULN) of ∆ lung age, which was calculated according to the results of the regression model as follows: ULN of ∆ lung age (years) = 12.243 − 0.323 × Age (years) + 1.645 × 7.037, where 7.037 is the residual standard error. In addition, we derived a constant ULN of ∆ lung age by calculating the 95th percentile of the healthy subjects, which was 12.5 years. To compare the practical use of lung age estimated by different regression methods, we also derived the ULN of ∆ lung age estimated by the multiple linear regression (MLR) method in the same way, that is, ULN of ∆ lung age (MLR) (years) = 14.690 − 0.392 × Age (years) + 1.645 × 7.387.
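The two age-dependent ULN equations can be evaluated directly. The sketch below uses only the coefficients reported above; the function names are hypothetical, and the lung age passed in is assumed to have been estimated separately with the equations in Table 4.

```python
Z_95 = 1.645  # one-sided 95% quantile of the standard normal distribution


def uln_delta_lung_age(age_years):
    """Age-dependent ULN of delta lung age (spline-based equations), in years."""
    return 12.243 - 0.323 * age_years + Z_95 * 7.037


def uln_delta_lung_age_mlr(age_years):
    """Age-dependent ULN of delta lung age (multiple linear regression), in years."""
    return 14.690 - 0.392 * age_years + Z_95 * 7.387


def exceeds_uln(lung_age, age_years):
    """True if delta lung age (lung age - chronological age) is above the spline-based ULN."""
    return (lung_age - age_years) > uln_delta_lung_age(age_years)
```

Under the spline-based equation, a 50-year-old has a ULN of about 7.7 years, so a lung age of 60 years (∆ lung age of 10 years) would fall above the limit. Because the slope is negative, the limit tightens with increasing age.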

Proportion of subjects with ∆ lung age above the ULN

As shown in Table 5, the proportions of patients with ∆ lung age above different ULNs (age-dependent ULN derived by the spline method, constant ULN derived by the spline method, age-dependent ULN derived by multiple linear regression, and the ULN proposed by a previous study [Yamaguchi et al., 2012]11) were compared. The age-dependent ULN derived by the spline method identified more patients with COPD or asthma than the other ULNs, with 52.9% of stage I and 100% of stage II–IV COPD patients exceeding the age-dependent ULN (Table 5). For healthy subjects in the validation group, 94.1% (450/478) had a ∆ lung age within the age-dependent ULN derived by the spline method (Fig. 3), indicating that the equations and the derived normal limit are acceptable in healthy subjects.

Table 5 Number (percentage) of patients with ∆ lung age over the ULN.
Fig. 3: ∆ lung age of healthy subjects in the validation group.

The grey area represents the upper limit of normal of the ∆ lung age.

∆ lung age of COPD and asthma

As shown in Fig. 4, the ∆ lung age of stage I COPD patients (mean ± SD: 4.88 ± 6.81 years) was higher than that of the matched healthy subjects (−4.59 ± 9.42 years, P < 0.05), and a progressive increase in ∆ lung age was shown in stage I–IV COPD patients (stage II: 25.85 ± 9.30 years, stage III: 50.56 ± 9.00 years, stage IV: 65.43 ± 10.08 years, between-group P < 0.05). Similarly, the ∆ lung age of stage I asthmatic patients (2.45 ± 9.16 years) was higher than that of the matched healthy subjects (−1.95 ± 7.99 years, P < 0.05), and a progressive increase in ∆ lung age was shown in stage I–IV asthmatic patients (stage II: 28.44 ± 11.63 years, stage III: 54.27 ± 12.90 years, stage IV: 68.85 ± 12.77 years, between-group P < 0.05).

Fig. 4: Comparisons of ∆ lung age between matched healthy subjects and patients with COPD and asthma.

Stage I, 80% ≤ FEV1 %pred; Stage II, 50% ≤ FEV1 %pred < 80%; Stage III, 30% ≤ FEV1 %pred < 50%; Stage IV, FEV1 %pred < 30%. The centre line in the box indicates the median, the lower and upper bounds of the box indicate the first and third quartiles, the whiskers indicate the minimum and maximum values, and the points indicate outliers.

Discussion

In this study, spirometry-derived lung age estimation equations were developed based on the data of healthy nonsmokers aged 18–80 years old by a spline method. Analysis of lung age in patients with COPD and asthma revealed that ∆ lung age progressively increases with the degree of airflow limitation.

Scatter plots of spirometric parameters against age in healthy subjects of the modelling group (Fig. 2) were consistent with previous studies13,14 in showing that the relationship between age and spirometric parameters is nonlinear. The present study showed that building lung age equations by nonlinear regression (piecewise linear regression or the spline method) improved the goodness of fit of the equations and yielded smaller errors in estimating lung age than multiple linear regression. Moreover, lung age estimated by nonlinear regression identified more patients with COPD or asthma than lung age estimated by linear regression. These findings support our hypothesis that nonlinear regression is more suitable for developing estimation equations of lung age. Although the lung age values estimated by piecewise linear regression and the spline method appeared similar in Fig. 2, we chose the equations built by the spline method as the lung age equations given their higher adjusted R2 and their continuity. Calculation tools for the estimation of lung age with our new equations are available at https://cltshiny.shinyapps.io/LungAge/ for an individual and in an Excel spreadsheet in the Supplementary material for large datasets.

While several studies have developed equations for the estimation of lung age15, the interpretation of lung age remains controversial. When first proposed, lung age was used to reflect the functional damage or premature aging of the lungs caused by smoking4,9. However, Quanjer et al.8 questioned whether an older lung age (a mean value estimated by the backward calculation method) should be interpreted as lung damage caused by smoking, as it disregarded the variability between individuals. Yamaguchi et al.11 and Ben Saad et al.16 proposed using a three-step procedure based on the limits of normal to judge the abnormality of lung age: if the calculated lung age of an individual is within the ULN and LLN (lower limit of normal), then his or her lung age should be interpreted to be consistent with his or her chronological age; otherwise, his or her lung age is judged to be older (when lung age >ULN) or younger (when lung age <LLN) than the chronological age. Based on their study populations, the LLN/ULN of ∆ lung age proposed by Yamaguchi et al.11 or Ben Saad et al.16 was −13.4/+13.4 years or −16.90/+16.90 years in males, and −15.0/+15.0 years or −14.77/+14.77 years in females, respectively. Our study showed that the ∆ lung age of healthy subjects was correlated with chronological age, and the age-dependent ULN of ∆ lung age performed better than the constant ULN in identifying patients with diseases, especially patients with mild disease. Thus, we propose to use an age-dependent normal limit of ∆ lung age instead of a constant normal limit in the interpretation of lung age. As demonstrated in Table 5, the ULN derived by our new method could markedly improve the capacity to identify patients with COPD and asthma when compared with the ULN of the previous study (Yamaguchi et al., 2012)11. Unlike previous studies, we also suggest that the estimated lung age, instead of the chronological age, should be adopted when the lung age value is above the chronological age but within the ULN, so that such individuals can be aware of their lower lung function compared to that of the population and thus be more active in early intervention, such as smoking cessation.

To our knowledge, this is the first study to analyze the levels of ∆ lung age in COPD and asthmatic patients with varying airflow limitations. Although FEV1 or FEV1%pred is the “gold standard” in the assessment of lung function in clinical or research fields2, it may be too abstract for some patients to understand. In this study, we found that ∆ lung age progressively increases with the degree of airflow limitation, suggesting that lung age may be used as a simple surrogate to inform patients about the severity of their disease. Moreover, even though the FEV1 of stage I COPD patients was numerically normal (FEV1%pred ≥80%), the ∆ lung age values of 53% of stage I COPD patients exceeded the ULN, indicating that lung age may be more helpful than FEV1%pred in making patients aware of their lung function impairment. As lung age makes it easier for patients to understand their lung function level and is also well accepted by the majority of primary care physicians17, we believe that lung age may be a useful tool in the assessment and management of chronic respiratory diseases with lung function impairment, especially in primary care.

Because the aim was to develop lung age equations based on the data of healthy subjects, the present study excluded smokers from the analysis. It may be of more practical value to include smokers or those with occupational exposures when analyzing the ∆ lung age of those at risk of diseases. Moreover, the analysis of ∆ lung age in patients was exploratory, and we did not perform sample size estimation with a prior hypothesis; thus, although the between-group P-value was <0.05, this does not guarantee that the between-group difference was clinically significant. The practical application value of lung age in chronic respiratory diseases should be validated in prospective studies.

In conclusion, spirometry-derived lung age estimation equations were developed by a spline modelling method. ∆ Lung age derived by the new estimation equations can reflect the level of lung function in patients with COPD and asthma. Thus, lung age may be used in the assessment of chronic obstructive respiratory diseases by both health care providers and patients to better understand and manage the disease.

Methods

The study was based on retrospective data obtained from research databases. Firstly, spirometric lung age estimation equations were established and validated based on the data of healthy nonsmokers. Secondly, lung age and ∆ lung age of patients with COPD and asthma were analyzed. The study was performed in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of First Affiliated Hospital of Guangzhou Medical University with a waiver of informed consent as it is a retrospective study (approval number: No. 2019-72).

Study population

Healthy subjects of this study comprised a modelling group and a validation group: the former was used for the establishment and internal validation of the lung age equations, and the latter was used for the external validation. Subjects of the modelling group were from the “Reference Values for Spirometry in Chinese Aged 4–80 Years” study that collected multicenter spirometric data in 2007–2010 (research database 1). Subjects of the validation group were from the “Reference Values for Respiratory Impedance with Impulse Oscillometry in Healthy Chinese Adults” study that collected multicenter spirometric data in 2016–2018 (research database 2). Details of these studies have been reported previously14,18. Briefly, the inclusion criteria of healthy subjects in this study were: age 18–80 years old; no history of smoking or occupational exposures; no symptoms or history of chronic cardiopulmonary diseases; body mass index ≤ 30 kg/m2; and FEV1, forced vital capacity (FVC), FEV1/FVC, and maximum mid-expiratory flow (MMEF) all within the normal limits.

Spirometric data and medical records of patients with COPD or asthma from 2016 to 2019 were derived from the Respiratory Health Big Data database of Guangzhou Respiratory Health Institute, First Affiliated Hospital of Guangzhou Medical University (database 3). Patients with COPD included in the analysis met the following criteria: clinically diagnosed as COPD according to the guideline of the Global Initiative for Chronic Obstructive Lung Disease (GOLD)2; ≥40 years old; no exacerbation within the last 4 weeks before the spirometry measurement; and no history of asthma, interstitial lung diseases, pulmonary tuberculosis, or lung cancer. Patients with asthma included in the analysis met the following criteria: clinically diagnosed as asthma according to the guideline of the Global Initiative for Asthma19; ≥18 years old; and no history of COPD, interstitial lung diseases, pulmonary tuberculosis, or lung cancer. The severity of airflow limitation in patients with COPD was categorized according to GOLD2: stage I (80% ≤ FEV1 %pred), stage II (50% ≤ FEV1 %pred < 80%), stage III (30% ≤ FEV1 %pred < 50%) and stage IV (FEV1 %pred < 30%). For comparability, patients with asthma were also categorized according to the same criteria.
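The staging thresholds above map onto a simple lookup. The sketch below assumes that FEV1 %pred has already been computed for a patient with an established diagnosis, and takes stage IV as FEV1 %pred < 30%, as in the Fig. 4 legend; the function name is hypothetical.

```python
def airflow_limitation_stage(fev1_pct_pred):
    """Return the airflow limitation stage ("I" to "IV") from FEV1 %pred.

    Thresholds follow the staging used in this study:
    I: >= 80, II: 50 to < 80, III: 30 to < 50, IV: < 30.
    """
    if fev1_pct_pred < 0:
        raise ValueError("FEV1 %pred cannot be negative")
    if fev1_pct_pred >= 80:
        return "I"
    if fev1_pct_pred >= 50:
        return "II"
    if fev1_pct_pred >= 30:
        return "III"
    return "IV"
```

The function only grades airflow limitation in patients who are already diagnosed; it is not a diagnostic rule.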

Spirometric parameters

Spirometric parameters analyzed in this study included FEV1, FVC, FEV1/FVC, MMEF, FEF50% (forced expiratory flow at 50% of FVC), and FEF75% (forced expiratory flow at 75% of FVC).

Statistical analysis

One-way ANOVA was used for the comparisons of age, height, weight and ∆ lung age of multiple groups, and Tamhane’s T2 test was used for post hoc multiple comparisons as the variances were unequal. The chi-square test was used for the comparisons of sex ratio.

Sex-specific lung age estimation models with the chronological age of healthy subjects as the dependent variable and spirometric parameters and height as explanatory variables were built via multiple linear regression, piecewise linear regression, and the natural cubic spline method. The goodness of fit of the model was assessed by the adjusted coefficient of determination (R2). The model with the highest adjusted R2 was used as the final estimation equations of lung age. The final model was internally validated using the bootstrap resampling method. Differences between the estimated lung age and the chronological age (∆ lung age) of healthy subjects in the validation group, and the proportion of ∆ lung age exceeding the normal limit in the validation group were analyzed for the external validation of the equations.
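The bootstrap internal validation described above can be illustrated with a deliberately simplified stand-in. The sketch below refits a one-predictor least-squares line on resampled data and collects the slopes; it is not the multi-variable spline model used in the study, the data are synthetic, and all function names are hypothetical.

```python
import random


def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_slopes(xs, ys, n_boot=200, seed=0):
    """Refit the model on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            slope, _ = ols_slope_intercept([xs[i] for i in idx],
                                           [ys[i] for i in idx])
        except ValueError:
            continue  # skip a degenerate resample with a single x value
        slopes.append(slope)
    return slopes


# Synthetic data (not study data): y = 2x + 1 plus a small repeating offset.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + ((i % 3) - 1) * 0.1 for i, x in enumerate(xs)]

primary_slope, _ = ols_slope_intercept(xs, ys)
boot = bootstrap_slopes(xs, ys)
boot_mean = sum(boot) / len(boot)
```

Comparing `boot_mean` and the spread of `boot` with `primary_slope` mirrors the check reported in the Results, where bootstrap coefficients, adjusted R2 and MSE were compared with those of the primary model.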

As the distributions of age and height, and the proportion of sex were different between healthy subjects and patients with COPD or asthma, PSM was performed to balance these factors between healthy subjects and patients. R® Version 4.0.3 and GraphPad Prism® Version 8.0.1 were used in the analysis and for graphics.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.