Considering lead-time bias in evaluating the effectiveness of lung cancer screening with real-world data

Low-dose computed tomography screening can be used to diagnose lung cancer at a younger age compared to no screening. Real-world studies observing mortality after lung cancer diagnosis are subject to lead-time bias. This study developed a method using a nationwide cancer registry and the stage shift from a trial to adjust for lead-time bias. A total of 78,897 lung cancer patients aged 55–82 from Taiwan's nationwide registry were matched with 788,820 referents randomly selected from the general population at a ratio of 1:10 by age, sex, calendar year, and comorbidities, to estimate the pathology- and stage-specific life expectancy (LE). Loss-of-LE is the difference between the LE of cancer patients and that of referents. By multiplying LE and loss-of-LE by the pathology and stage shift in the National Lung Screening Trial (NLST), we compared the effectiveness of cancer screening measured by LE gained and loss-of-LE saved. The mean LEs of stage IA and IV adenocarcinoma were 14.5 and 1.9 years, respectively, indicating an LE gain of 12.6 years. However, the mean loss-of-LEs of stage IA and IV adenocarcinoma were 3.7 and 15.1 years, respectively, with a saving of only 11.4 years, implying an adjustment for different distributions of age, sex, and calendar year of diagnosis from stage shift and a reduction in lead-time bias. Applying such estimations to the results of 10,000 participants with the same pathology and stage shift in the NLST, the benefit of screening using LE gained would be 410.3 (95% prediction interval: 328.4 to 503.3) years. It became 297.1 (95% prediction interval: 187.8 to 396.4) years when using loss-of-LE saved, indicating that the former approach would overestimate the effectiveness by 38%. Our approach of multiplying loss-of-LE by pathology and stage shift to estimate loss-of-LE saved could adjust for different distributions of age, sex, and calendar year at early diagnosis and reduce lead-time bias.


Survival of patients stratified by pathology and stage. Because eligible participants in the NLST were between 55 and 75 years of age and a maximum follow-up of seven years was used to determine the incidence of lung cancer 1, we abstracted all pathologically verified lung cancer patients aged 55–82 during 2002–2015 from the Taiwan National Cancer Registry database for the analysis. We categorized the histopathology into small-cell lung cancer, squamous-cell non-small-cell lung cancer (SqCC), adenocarcinoma, and non-SqCC other than adenocarcinoma. Each patient's tumor stage was defined according to the classifications provided by the American Joint Committee on Cancer, 6th and 7th editions. Because a completely different categorization of lung adenocarcinoma was proposed in 2011 19,20, adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA), and bronchioloalveolar carcinoma (BAC) in stage 0/IA/IB were assigned as three specific subcategories for analysis. Each patient's identification information was linked to the National Mortality Registry database, and patients were followed up from the day of diagnosis until the end of 2017. We used the Kaplan-Meier method to estimate the survival function up to the limit of follow-up.
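The Kaplan-Meier step can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `kaplan_meier` and the toy follow-up data are assumptions introduced only for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up durations (any consistent unit, e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit update: multiply by the conditional survival at t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data (hypothetical): months of follow-up and death indicators.
toy_times = [2, 3, 3, 5, 8, 8, 12]
toy_events = [1, 1, 0, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate is only defined up to the limit of follow-up and needs extrapolation for lifetime quantities.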
Extrapolating the survival to lifetime for life expectancy (LE). The survival functions of different lung cancer subtypes were extrapolated to lifetime using a rolling extrapolation method 21, which was carried out in the following four steps. First, using the life tables of different calendar years, we simulated an age- and sex-matched reference population and estimated its lifetime Kaplan-Meier survival function. Second, we calculated the survival ratio between the lung cancer cohort and the reference population at each time point and performed a logit transformation of the ratio. Third, we used restricted cubic spline models to fit the logit-transformed relative survival (see Supplementary Fig. S1). Fourth, the first restricted cubic spline model, together with the survival function of the reference population beyond the follow-up limit, was used to extrapolate survival for the next month. Then, with the new end of survival, we fitted the second restricted cubic spline model to extrapolate survival one further month. By repeating this procedure month by month (i.e., rolling over), we obtained the survival of lung cancer patients over their lifetimes. This method has been shown to produce a relatively accurate estimate of lifetime survival 21. We used the iSQoL2 statistical package, which can be downloaded at http://sites.stat.sinica.edu.tw/isqol/, for computations.
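The four steps can be sketched in Python under simplifying assumptions. This is not the iSQoL2 implementation: the restricted cubic spline is replaced here by a straight-line least-squares fit over the most recent months, and all function names, the window length, and the toy survival curves are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def rolling_extrapolate(patient_surv, ref_surv, window=24):
    """Extend monthly patient survival to the length of ref_surv.

    patient_surv : observed S_patient(t) for months 1..T (T >= window)
    ref_surv     : reference S_ref(t) for months 1..L, with L >= T
    """
    surv = list(patient_surv)
    while len(surv) < len(ref_surv):
        t_end = len(surv)
        months = list(range(t_end - window + 1, t_end + 1))
        # Logit of the survival ratio (patients / reference) in the window.
        ratios = [logit(surv[m - 1] / ref_surv[m - 1]) for m in months]
        # Simplification: a straight line stands in for the spline model.
        slope, intercept = fit_line(months, ratios)
        next_ratio = expit(intercept + slope * (t_end + 1))
        # Back-transform with the reference survival of the next month.
        surv.append(next_ratio * ref_surv[t_end])
    return surv


def life_expectancy_years(monthly_surv):
    """Approximate area under a monthly survival curve, in years."""
    return sum(monthly_surv) / 12.0


# Toy inputs (hypothetical): 30 years of reference survival, 5 years of follow-up.
toy_ref = [math.exp(-0.005 * m) for m in range(1, 361)]
toy_patient = [toy_ref[m - 1] * math.exp(-0.03 * m) for m in range(1, 61)]
extended = rolling_extrapolate(toy_patient, toy_ref)
le_patient = life_expectancy_years(extended)
le_ref = life_expectancy_years(toy_ref)
loss_of_le = le_ref - le_patient
```

Refitting after every appended month is what makes the procedure "rolling". Because the toy reference curve stops at 30 years, both areas are truncated, so the numbers only illustrate the mechanics.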
Age-, sex-, calendar year-, and comorbidities-matched referents. We interlinked the National Registries for Beneficiaries and Catastrophic Illnesses databases and matched lung cancer patients with referents randomly selected from the general population at a ratio of one-to-ten. The matching criteria were age, sex, calendar year, and major comorbidities at the time of lung cancer diagnosis. The comorbidities included the following: (1) malignant neoplasms other than skin cancer or in-situ carcinoma; (2) acute cerebrovascular disease, spinal cord injury, and motor neuron disease; (3) end-stage heart failure, chronic pulmonary diseases, and primary neuromuscular diseases, which required ventilation for 21 or more days; (4) end-stage renal disease; (5) cirrhosis of the liver with poorly-controlled ascites, varicose bleeding, or hepatic coma. These catastrophic illnesses generally cause premature mortality, and the diagnoses are credible because patients with any of the above diagnoses can be waived from co-payments. To prevent abuse, the National Health Insurance requires two physicians to validate the documents before approval.
www.nature.com/scientificreports/
Loss-of-LE is the difference between the LE of cancer patients and that of matched referents. We also used the iSQoL2 statistical package to carry out these computations. The 95% confidence intervals (CIs) for estimates of LE and loss-of-LE were obtained through 100 bootstraps. The Taiwan National Lung Screening Program is a single-arm research-based study 22. Although it contains pathology and stage information of screened lung cancer cases, we were not able to find a comparable control group diagnosed without screening. The pathology and stage shift under Medicare LDCT screening services has not yet been disclosed. Therefore, we borrowed the pathology and stage shift data from the NLST 1. The LE and loss-of-LE of pathology- and stage-specific lung cancer were multiplied by the pathology and stage shift for LE gained and loss-of-LE saved.
To compute the 95% prediction intervals of LE gained and loss-of-LE saved, we created a decision tree (see Supplementary Fig. S2) and simulated 10,000 NLST participants with 1,000 iterations. Consistent with the approach in prior literature for following up lung cancer incidence over 7 years 23 , we assumed all excess lung cancers in the screening arm were over-diagnosed.
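The weighting by stage shift and the simulation-based prediction interval can be sketched as follows. This is a simplified illustration, not the decision tree of Supplementary Fig. S2: the LE and loss-of-LE values for stage IA and IV adenocarcinoma are quoted from the text, whereas the per-participant outcome probabilities, the normal approximation to the reported CIs, and all function names are assumptions. Both toy arms have the same expected number of cancers, so no excess (over-diagnosed) cases are modeled.

```python
import random
from collections import Counter

# Stage IA and IV adenocarcinoma: (mean, 95% CI lower, 95% CI upper), from the text.
LE = {"IA": (14.5, 12.6, 16.4), "IV": (1.9, 1.8, 2.1)}
LOSS_OF_LE = {"IA": (3.7, 1.0, 5.8), "IV": (15.1, 14.8, 15.4)}

# HYPOTHETICAL per-participant outcome probabilities; these are not NLST data.
ARM_PROBS = {
    "screen": {"IA": 0.006, "IV": 0.004, "none": 0.990},
    "control": {"IA": 0.003, "IV": 0.007, "none": 0.990},
}


def expected_total(per_case, probs, n):
    """Expected total of a per-case quantity over n participants (mean values)."""
    return sum(per_case[s][0] * probs[s] * n for s in per_case)


def draw_values(rng, table):
    """Draw one value per stage using a normal approximation to the reported CI."""
    return {s: rng.gauss(m, (hi - lo) / (2 * 1.96)) for s, (m, lo, hi) in table.items()}


def simulate_arm(rng, probs, n, per_case):
    """Assign each participant an outcome and total the per-case quantity."""
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    counts = Counter(rng.choices(outcomes, weights=weights, k=n))
    return sum(counts[s] * per_case[s] for s in per_case)


def prediction_interval(n=10_000, n_iter=1_000, seed=0):
    """Empirical 2.5th and 97.5th percentiles of simulated loss-of-LE saved."""
    rng = random.Random(seed)
    saved = []
    for _ in range(n_iter):
        loss = draw_values(rng, LOSS_OF_LE)
        control = simulate_arm(rng, ARM_PROBS["control"], n, loss)
        screen = simulate_arm(rng, ARM_PROBS["screen"], n, loss)
        saved.append(control - screen)
    saved.sort()
    return saved[int(0.025 * (n_iter - 1))], saved[int(0.975 * (n_iter - 1))]


n = 10_000
le_gained = expected_total(LE, ARM_PROBS["screen"], n) - expected_total(LE, ARM_PROBS["control"], n)
loss_saved = (expected_total(LOSS_OF_LE, ARM_PROBS["control"], n)
              - expected_total(LOSS_OF_LE, ARM_PROBS["screen"], n))
interval = prediction_interval(n_iter=200)  # fewer iterations keep the toy run fast
```

With these hypothetical probabilities, 30 cases per 10,000 shift from stage IV to stage IA, so the point estimates are 30 × 12.6 = 378 years of LE gained and 30 × 11.4 = 342 years of loss-of-LE saved.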
LE and loss-of-LE for lung cancer patients who were smokers. Our National Cancer Registry data included smoking information of lung cancer patients beginning from 2011. They were followed up until the end of 2018. We applied our method to extrapolate the survival function and estimate the LE. The 8-year follow-up period was too short for estimating the lifetime survival functions of age-, sex-, calendar year-, and comorbidities-matched referents accurately. We thus directly compared the survival of lung cancer patients with that of the age-, sex-, calendar year-matched reference population simulated from the life tables (i.e., comorbidities were not matched) for loss-of-LE. We also applied the method to the study cohort for comparison.

Sensitivity analyses. We conducted a sensitivity analysis using lung cancer patients who were smokers. The magnitude of over-diagnosis depends critically on the length of follow-up after the final screening 24. Notably, the over-diagnosis rate in the NLST became 3% after extending the follow-up to 11.3 years 25, and the lifetime over-diagnosis rate was predicted to be zero. To obtain a more appropriate estimation and inference, we performed sensitivity analyses hypothesizing that 50%, 3%, and 0% of excess cancers were over-diagnosed. We accounted for the timing of cancer detection in the screening and control arms from the NLST 23. The test sensitivity of LDCT in the NLST was 93.1% 26, and could be as low as 59% 27. We thus assumed test sensitivity to be 80% and 60% for analysis. We also borrowed the stage shift from the Dutch-Belgian lung cancer screening trial (NELSON) 2 for sensitivity analysis.
Validating the extrapolation method. We used the survival data of patients who were diagnosed during the first 8 years and extrapolated their survival up to 16 years. Because these patients were actually followed from 2002 until the end of 2017, the mean survival duration within the 16-year follow-up period using the Kaplan-Meier method was considered as the gold standard. The relative bias was computed by comparing the difference in values between the extrapolation and the Kaplan-Meier estimation. The same validation method was applied to lung cancer patients who were smokers. That is, we used the survival data of patients diagnosed during the first 5 years (2011-2015) and extrapolated them to 8 years (2011-2018), which were then compared with the 8-year Kaplan-Meier estimates for relative biases.
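The relative bias can be written as a one-line calculation. The sketch below assumes the usual definition, (extrapolated mean − Kaplan-Meier mean) / Kaplan-Meier mean, which the text does not spell out; the function name and the input values are hypothetical.

```python
def relative_bias(extrapolated_mean, km_mean):
    """Relative bias of the extrapolated mean survival against the Kaplan-Meier mean."""
    return (extrapolated_mean - km_mean) / km_mean


# Hypothetical mean survival (years) within a 16-year horizon.
km_mean_16y = 3.00
extrapolated_16y = 3.12
bias = relative_bias(extrapolated_16y, km_mean_16y)
print(f"relative bias: {bias:.1%}")
```

A positive value means the extrapolation overstates the mean survival observed within the validation window.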
Ethics approval, consent to participate. The Institutional Review Board of National Cheng Kung University Hospital approved this study before commencement (B-EX-107-050). Consent was waived by the Institutional Review Board. Study methods were performed in accordance with the STROBE guidelines.

Results
Comparing life expectancies for loss-of-LE. During 2002–2015, a total of 78,897 lung cancer patients aged 55–82 were analyzed and one-to-ten matched with 788,820 referents randomly selected from the general population. The clinical characteristics of lung cancer patients and referents were completely matched (Table 1). Table 2 summarizes the LEs of lung cancer patients and matched referents stratified by pathology and stage. Patients with stage IA and stage IV adenocarcinoma were diagnosed at an average age of 66.3 (standard deviation (SD): 7.2) and 68.4 (SD: 7.7) years, and had LEs of 14.5 (95% CI: 12.6 to 16.4) and 1.9 (95% CI: 1.8 to 2.1) years, respectively. Figure 1 shows the loss-of-LE, which is the difference between the area under the lifetime survival curve of patients and that of matched referents. The loss-of-LEs of patients with stage IA and stage IV adenocarcinoma were 3.7 (95% CI: 1.0 to 5.8) and 15.1 (95% CI: 14.8 to 15.4) years, respectively (Table 2). We found that lung cancer patients who were smokers usually lost 0.5–1 year more in LE than the whole study cohort stratified by pathology and stage. Figure 2 illustrates how lead-time bias could be adjusted by using loss-of-LE. According to Table 2, a patient with stage IV adenocarcinoma was generally diagnosed at a mean age of 68.4 and had a mean LE of 1.9 years. If the patient was diagnosed earlier at stage IA (at a mean age of 66.3 and with a mean LE of 14.5 years), the average gain in LE would be 12.6 years. However, if we took different age, sex, calendar year of diagnosis, and major comorbidities into consideration and compared the loss-of-LEs, the average savings of loss-of-LE would be 11.4 (= 15.1 − 3.7) years. In other words, our proposed method of comparing loss-of-LEs, or measuring difference-in-differences, adjusts for the overestimation of LE gained resulting from early diagnosis of lung cancer, and the magnitude was 1.2 (= 12.6 − 11.4) years.
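The difference-in-differences reading of these numbers can be made explicit. In the worked equations below, the referent LEs (18.2 and 17.0 years) are not quoted from Table 2; they are the values implied by the definition of loss-of-LE as referent LE minus patient LE, and the symbols are introduced here only for illustration.

```latex
\begin{align*}
\text{loss-of-LE}_s &= \mathrm{LE}^{\mathrm{ref}}_s - \mathrm{LE}_s, \qquad s \in \{\mathrm{IA}, \mathrm{IV}\} \\
\mathrm{LE}^{\mathrm{ref}}_{\mathrm{IA}} &= 14.5 + 3.7 = 18.2, \qquad
\mathrm{LE}^{\mathrm{ref}}_{\mathrm{IV}} = 1.9 + 15.1 = 17.0 \\
\text{LE gained} &= \mathrm{LE}_{\mathrm{IA}} - \mathrm{LE}_{\mathrm{IV}} = 14.5 - 1.9 = 12.6 \\
\text{loss-of-LE saved} &= \text{loss-of-LE}_{\mathrm{IV}} - \text{loss-of-LE}_{\mathrm{IA}} \\
&= \left(\mathrm{LE}_{\mathrm{IA}} - \mathrm{LE}_{\mathrm{IV}}\right)
 - \left(\mathrm{LE}^{\mathrm{ref}}_{\mathrm{IA}} - \mathrm{LE}^{\mathrm{ref}}_{\mathrm{IV}}\right) \\
&= 12.6 - (18.2 - 17.0) = 12.6 - 1.2 = 11.4
\end{align*}
```

The 1.2-year term is the extra LE that the referents matched to stage IA patients would have had anyway, for example because they were matched at a younger age, which is the part of the apparent gain that loss-of-LE removes.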
Applying such estimations to the results of 10,000 participants with the same pathology and stage shift in the NLST and assuming that 100% of excess lung cancers were over-diagnosed, we compared the effectiveness by measuring LE gained and loss-of-LE saved (Fig. 3; see also the Supplementary material). Table 3 shows that the LE gained and loss-of-LE saved of smokers became lower after weighting the stage shift. Sensitivity analyses hypothesizing that 50%, 3%, and 0% of excess cancers were over-diagnosed were performed. We found that the differences between LE gained and loss-of-LE saved slightly increased when the over-diagnosis rate decreased in lung cancer. Sensitivity analyses assuming that the test sensitivity of LDCT ranged from 60 to 80%, and using the stage shift from the NELSON, are also shown. All estimates using LE gained were higher than those using loss-of-LE saved.
In the validation of the extrapolation method (Supplementary Table S3), the relative biases of the extrapolation ranged between 1.1% and 7.1%. The results for lung cancer patients who were smokers were similar.
Table 1. Clinical characteristics of lung cancer patients and age-, sex-, year-, comorbidities-matched referents. BAC bronchioloalveolar carcinoma, ESRD end-stage renal disease, SCLC small-cell lung cancer, SqCC squamous-cell non-small-cell lung cancer. *Selected major comorbidities include: 1. malignant neoplasms other than skin cancer or in-situ carcinoma; 2. acute cerebrovascular disease, spinal cord injury, and motor neuron disease; 3. end-stage heart failure, chronic pulmonary diseases, and primary neuromuscular diseases, which required ventilation for 21 or more days; 4. end-stage renal disease; 5. cirrhosis of liver with poorly-controlled ascites, varicose bleeding, or hepatic coma. **Adenocarcinoma in situ (n = 254) and minimally invasive adenocarcinoma (n = 102) were not analyzed due to small sample sizes and high censored rates.

Discussion
Although a randomized screening trial beginning follow-up from randomization would eliminate lead-time bias 28, detailed trial data might not be available to researchers. In contrast, research using observational real-world data is subject to lead-time bias. For example, cases detected through an effective national screening program usually show increased proportions of early-stage cases, and these patients would be younger than those not diagnosed through screening, indicating the existence of lead time. We proposed estimating the loss-of-LE saved to adjust for different distributions of age, sex, and calendar year at early diagnosis to reduce this bias, and we offer the following arguments in support. First, the validation of our extrapolation method using the first half (i.e., 8 years) of the follow-up period showed that the relative biases did not exceed 7.1% (Supplementary Table S3). The month-by-month rolling-over algorithm, combined with the well-fitting restricted cubic spline model (Supplementary Fig. S1) for the full period of survival, also supports the accuracy of the LE estimates 21. As our follow-up of 16 years was generally longer than the usual LE of lung cancer patients, the validity of extrapolated results would be acceptable. Second, since the referents were selected by matching every case by age, sex, calendar year of diagnosis, and major comorbidities (Table 1), the difference between the LE of pathology- and stage-specific cancer patients and that of matched referents could mainly be attributed to the occurrence of lung cancer. That is, there would be little confounding on all estimates of loss-of-LE (Table 2).
Third, as the weighted sums of loss-of-LE in the screening and control arms have already accounted for the different pathologies and stage distributions of lung cancer patients (Supplementary Table S2), comparing them to obtain the loss-of-LE saved adjusts for the stage shift and excess incidence of lung cancer. In real-world practice, it is challenging to select a matched control from the beginning to observe survival for adjustment of lead-time bias. Our method might provide an alternative solution for this issue. However, even though the Centers for Medicare & Medicaid Services initiated reimbursement for LDCT screening in February 2015 4, real-world pathology and stage shift has not yet been disclosed. We therefore tentatively borrowed the pathology and stage shift in the NLST 1 and simulated 10,000 participants to demonstrate the calculations of LE gained and loss-of-LE saved (Fig. 3). Compared with an incremental LE of 316 life-years per 10,000 participants in a landmark cost-effectiveness study 28, the loss-of-LE saved in our study, equal to 297.1 life-years per 10,000 participants, is slightly more conservative. Once the real-world pathology and stage shift becomes available, our method could be applied directly to provide a quick evaluation of the effectiveness of LDCT screening.
It is interesting to note that cases with BAC and stage IA adenocarcinoma were diagnosed at a younger age than late-stage patients (Fig. 2). The difference between the ages of diagnosis at early and late stages could be regarded as an approximate estimate of lead time within the same pathology, assuming that people with the same age, sex, and calendar year of diagnosis would be of the same risk set. Further matching performed on major comorbidities would improve the validity of this assumption. However, because of the higher cell proliferation rates 29 and shorter doubling times 30,31 of small-cell lung cancer (SCLC) and SqCC than those of adenocarcinoma, patients with these subtypes of cancers had a shorter preclinical phase. The differences in age distribution between early- and late-stage lung cancers of such subtypes were smaller. Consequently, our proposed method [...] (Table 2). If the patients were diagnosed earlier at limited stage for SCLC and stage I for SqCC with mean LEs of 2.3 and 7.4 years, the mean LEs gained would be 1.5 and 6.4 years, respectively. If we compared the loss-of-LEs, the mean loss-of-LEs saved would be 1.7 (= 14.2 − 12.5) and 6.7 (= 13.1 − 6.4) years, respectively, which were similar to the LEs gained. In other words, our proposed method of estimating loss-of-LE saved would not be applicable in lung cancers with rapid proliferation rates, which offer only a short or minimal preclinical phase for early detection.
Several limitations must be acknowledged in this study. First, we used a diagnosed lung cancer cohort from Taiwan's nationwide database instead of cases detected from a screening program. The results might not be applicable to settings in other countries. In addition, most of the diagnoses were made through symptoms instead of screening, which could lead to underestimation of the lead time.
However, the early-stage patients in our study had higher proportions of other malignancies upon diagnosis than late-stage ones (Table 1), which might imply that some of the early-stage patients were diagnosed through active surveillance. Therefore, the magnitude of underestimation would not be overly large. Second, due to the lack of information on smoking among the reference population, we did not directly account for the difference in exposure to smoking while extrapolating the survival for LE and matching the comorbidities of referents for loss-of-LE. However, a sensitivity analysis using lung cancer patients who were smokers was conducted (Table 3). The LE gained and loss-of-LE saved of smokers were lower than those of the study cohort. Third, we hypothesized that 100% of excess lung cancers were over-diagnosed and provided no details on the pathology and stage of the cases considered to be over-diagnosed. Over-diagnosis denotes detection of lung cancer that will not go on to cause symptoms or mortality before death from other causes 32. That is, death from other causes preceding lung cancer death, or no loss-of-LE, is a prerequisite for over-diagnosis. Our method for estimating loss-of-LE saved might have indirectly adjusted for this bias. We assumed 50%, 3%, and 0% of excess cancers to be over-diagnosed in the sensitivity analyses (Table 3), which might account for the influence of the length of follow-up on over-diagnosis. AIS and MIA have been associated with an increased risk of over-diagnosis 33. However, they became new categories of lung cancer starting from 2011 19,20. Given the small sample sizes and high censored rates, their loss-of-LEs could not be estimated. No inferior survival for AIS and MIA cases compared to that of referents was observed during the period (Supplementary Fig. S3).
Figure 2. [...] (Table 2). If the patient was diagnosed earlier at stage IA (at a mean age of 66.3), the average gain in life expectancy (LE) would be 14.5 − 1.9 = 12.6 years. However, if we took different age, sex, year of diagnosis, and comorbidities into consideration and compared the loss-of-LE, the average savings of loss-of-LE would be 15.1 − 3.7 = 11.4 years, which implies an adjustment for lead-time bias. The values in parentheses for age and LE/loss-of-LE denote the standard deviations and 95% confidence intervals, respectively. † denotes mortality. BAC bronchioloalveolar carcinoma.
In conclusion, estimating LE gained to evaluate the effectiveness of early diagnosis of lung cancer is subject to lead-time bias. Our approach of multiplying the loss-of-LE by pathology and stage shift to estimate the loss-of-LE saved could adjust for different distributions of age, sex, and calendar year at early diagnosis and reduce this bias.

Data availability
The datasets generated and/or analyzed during this study are not publicly available due to confidentiality reasons, but the sufficiently aggregated data used for analyses may be available from the corresponding author upon reasonable request.