Clinical and pathological characteristics associated with the presence of the IS6110 Mycobacterium tuberculosis transposon in neoplastic cells from non-small cell lung cancer patients

Lung cancer (LC) and pulmonary tuberculosis (TB) are the deadliest neoplastic and bacterial infectious diseases worldwide, respectively. Clinicians and pathologists have long discussed the co-existence of LC and TB, and several epidemiologic studies have presented evidence indicating that TB could be associated with the development of LC, particularly adenocarcinoma. Nonetheless, these data remain controversial, and the mechanism that could underlie the association remains largely unexplored. Some bioinformatic studies have shown that human cancer biopsies have a very high frequency of bacterial DNA integration; since Mycobacterium tuberculosis (MTb) is an intracellular pathogen, it could play an active role in cellular transformation. Our group performed an exploratory study in a cohort of 88 LC patients treated at the Instituto Nacional de Cancerología (INCan) of Mexico City to evaluate the presence of MTb DNA in LC tissue specimens. For the first time, our results show the presence of the MTb IS6110 transposon in 40.9% (n = 36/88) of patients with lung adenocarcinomas. Additionally, through in-situ PCR we identified the presence of IS6110 in the nuclei of tumor cells. Furthermore, shotgun sequencing of two samples identified traces of MTb genomes present in tumor tissue, suggesting that similar MTb strains could be infecting both patients.

www.nature.com/scientificreports/ (8.9-11.0) in 2019 alone 1 . Moreover, the WHO estimates that approximately 23% of the world population is infected by MTb, though only 1 in 10 individuals will develop active disease 3,4 . The importance of this infectious agent cannot be overstated, and although several world regions have gained control through public health measures, other areas still carry a considerable burden of pulmonary TB 1 . Another relevant disease in terms of global mortality is LC. LC is currently the leading cause of cancer-related deaths worldwide, and although some encouraging data have emerged in recent years, the survival rate for this neoplasm remains at merely 14-15% 2 . Each year, approximately 1.76 million deaths and 2.09 million incident cases are attributed to LC; most cases are diagnosed in the advanced setting, where therapies generally lack curative intent 5,6 . Non-small cell lung cancer (NSCLC) accounts for approximately 85% of all LCs, and the most common subtype is lung adenocarcinoma 7 . Although tobacco smoke plays an undeniable role in the development of LC, a substantial proportion of patients develop this disease without a history of smoking 8 . Non-smokers comprise approximately 10-20% of lung cancer cases globally, although this number can increase in specific subgroups; for example, among female patients in Mexico, only 30% of cases have a positive smoking history 9 . Several other risk factors have been well described and causally linked to the development of LC, including radon exposure, asbestos, arsenic, and others 9,10 . Infections also seem to play a role, although their specific mechanism remains elusive. In this regard, a history of TB has been associated with LC development in several epidemiologic studies, particularly for the development of adenocarcinoma 11 .
Animal models have presented experimental evidence regarding the association between TB and LC; in such models, the increased inflammatory process from chronic TB infection induces cell dysplasia and squamous cell carcinoma. Nonetheless, epidemiological studies in humans have observed an association between TB and lung adenocarcinoma, rather than squamous cell carcinoma, which would suggest other mechanistic pathways 12 . Interestingly, although the International Agency for Research on Cancer Monographs has identified eleven biological agents as group 1 carcinogens, MTb has not been included in this list 13,14 .
The co-occurrence of TB and LC has been frequently reported in the literature, though the nature of this observation remains undescribed 11,15 . Although results have been inconclusive, one large cohort in China identified that the risk of LC was significantly increased in subjects with a previous TB history 16,17 . Similarly, a prospective cohort in Asia showed an increased incidence of LC in TB patients, a risk which was further increased by the presence of comorbidities such as COPD and risk factors such as smoking 18,19 . Interestingly, results also indicate that adenocarcinoma is most frequently associated with a TB history, as reported by a systematic review 11 . Furthermore, Wong et al. found an association between TB and lung adenocarcinoma (OR 1.31, 95% CI 1.03-1.66, p = 0.027) among never-smoking Asian women in a genome-wide association study using Mendelian randomization and pathway analysis 20 . Finally, a recent meta-analysis concluded that pre-existing TB increases the risk of LC (RR 2.170 [1.833-2.569]). These results emphasize the importance of LC screening in this patient subgroup, as there could be a need for considerable follow-up after the infection has been treated 21 .
A history of TB appears not only to increase the risk of developing LC but also to negatively affect prognosis. In a cohort study conducted in the Netherlands, which included over 8000 persons and had a follow-up of 18 years, a total of 214 cases of LC were found, of which 13 had a history of pulmonary TB. The overall survival of patients with LC and a history of pulmonary TB was significantly lower than that of patients without a history of TB (HR 2.36, 95% CI 1.1-4.9), with an average difference of 311 days between the two groups 22 . The role of a previous TB infection in terms of prognosis is scarcely understood, though it might be related to specific molecular alterations and response to treatment. For example, a correlation has been observed between TB and EGFR mutations in patients with LC 23 . Further, considerable differences have been observed in response to treatment with tyrosine kinase inhibitors (TKIs) 24 , which could be related to a high expression of Epiregulin in EGFR-positive tumors 25 . Interestingly, inducible nitric oxide synthase (iNOS) and Epiregulin are highly expressed in chronic inflammatory processes 12,24 .
Previous preclinical and clinical studies have provided information regarding the possible mechanisms underlying the relationship between TB and LC. MTb induces a strong inflammatory response in the lung tissue of infected hosts, and this can in turn promote cancer development and progression [26][27][28][29] . Moreover, the infection is also characterized by the formation of TB-induced scars, a process which has been suggested to play an etiologic role in LC development due to the cellular proliferation which occurs during tissue repair 11,[27][28][29][30] . The role of angiogenesis, which is characteristic of repair processes, has also been explored 31 .
Considering the extensive evidence of a relationship between LC and TB, we sought to explore the presence of MTb genetic material within tissue samples of patients with LC treated at the Instituto Nacional de Cancerología (INCan) in Mexico City.

Results
Patient characteristics. The baseline characteristics of the eighty-eight patients included in this study ( Fig. 1) (Fig. 3). This result also suggests that the strain present in both patients is closely related to KZN 1435.

Discussion
Despite the well-documented epidemiological association between pulmonary TB infection and LC, the molecular mechanisms by which TB promotes the development and progression of LC remain unknown 19 . Nonetheless, the association has been documented even in large population-based studies, highlighting a considerable risk also for secondary lung cancer even after adjusting for important covariates 32 . TB appears to be a significant factor in LC development, though the mechanism behind this association has seldom been explored. It is well known that TB induces an inflammatory response, and that chronic inflammation is conducive to neoplastic processes through several pathways [26][27][28][29]31 . Furthermore, other infectious agents have been well characterized for their oncogenic properties, accounting for a considerable proportion of worldwide cancers 33 3 , nevertheless, it must be noted that this cohort of LC patients represents a selected population and therefore a selection bias could be responsible for this difference, compared with the open population and other epidemiological reports which have shown an association between LC and TB [26][27][28][29]31 . This is not the first time MTb DNA has been found in the absence of a TB histological reaction. We previously showed that mycobacterial DNA was identified in 38% and 35% of normal lung tissue samples from Ethiopia and Mexico, respectively, by conventional PCR, in subjects without a previous TB history 35 .
Another interesting observation is the difference in age between patients who tested positive for IS6110 and those who tested negative. The mean age of positive patients was 54.75 (± 16.07) years. Among them, 63.9% (23/36) were younger than 60 years old (P = 0.03). Our group recently reported that the mean age of patients with LC in Latin America was 62.2 years (± 12.3) 36 , and the median age for this cohort is similar to the one reported for patients with no known risk factors for LC 9 . It is essential to highlight that previous studies have already shown this important association between young age and increased risk of lung cancer among subjects with a TB history. In a study which compared 3776 pulmonary TB patients with 18,880 matched controls, the authors identified that the risk for lung cancer increased as a function of younger age. The risks for lung cancer were HR 9.85, 7.1, 3.32, and 2.57 in patients aged < 50, 50-59, 60-69, and ≥ 70 years, respectively 37 .
Another critical factor to consider regarding the baseline characteristics of patients with a positive IS6110 result is the significant predominance of female patients. In a study by Chang et al., the authors performed a retrospective cohort study among patients with lung cancer receiving EGFR-targeted therapy, with the objective of assessing whether TB affects the outcome of patients with NSCLC. In this study, the authors  24 . This could be a hypothesis worth exploring, given the association between female sex and EGFR mutations, and the possible mechanistic pathway for TB-induced lung adenocarcinoma, which some hypothesize could involve the EGFR pathway. In a study which included 477 patients with pulmonary adenocarcinoma, 39% had EGFR-mutated tumors, while 21% had pre-existing TB lesions. In this same study, the authors report that the frequency of EGFR mutations is significantly higher in the subgroup of patients with TB lesions, and multivariate analysis revealed that pre-existing TB lesions were an independent factor associated with EGFR mutations 38 . Our study included a small number of EGFRm patients because many samples were not tested for this alteration and tissue was not available to perform the test at the time. Nevertheless, it is important to highlight that among IS6110-positive patients, those with an EGFRm had a numerically higher median OS than wild-type subjects (56.0 vs. 21.4 months). Although the difference is considerable, it did not reach statistical significance, likely due to sample size and patients lost to follow-up. Interestingly, this tendency was not observed for IS6110-negative patients with EGFR mutations.
To confirm that MTb was present in the lung cancer tissue without histological evidence of TB lesions, in-situ PCR was performed in samples that had a positive end-point PCR for the IS6110 transposon. Twelve patients had enough paraffin-embedded tumor tissue to perform in-situ PCR, with a positive result in 41.7% (5/12) of the cases. Among these, 2 samples showed nuclear labeling in neoplastic cells (Fig. 2) and were selected for whole-genome sequencing. We found a low number of sequencing reads mapping to MTb genomes, and an average of 82.33% were from three strains: Mycobacterium tuberculosis KZN 1435, Mycobacterium tuberculosis str. Haarlem/NITR202, and Mycobacterium tuberculosis RGTB327. The reads of both patients were mainly associated with the Mycobacterium tuberculosis KZN 1435 strain, suggesting that the MTb strains present in the two patients are closely related. Interestingly, MTb KZN 1435 was first isolated from patients in KwaZulu-Natal, South Africa, and is considered a multidrug-resistant (MDR) strain (resistant to isoniazid and rifampicin) 39 . The information from this study adds to the body of knowledge which seeks to explore the relationship between TB and LC. Nonetheless, the association is difficult to determine due to several challenges, including the subclinical nature of the primary infection, which makes it difficult to identify when it occurred, and several confounding environmental and host factors which can modulate pathogenesis 33 . Despite these challenges, this is an association worth exploring further because of its considerable impact on both follow-up and screening.
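Because the detection rates reported here rest on small counts (36 of 88 samples by end-point PCR and 5 of 12 by in-situ PCR), their sampling uncertainty is wide. The following is a minimal sketch, not part of the original analysis, of how a Wilson score interval could be computed for such proportions. The function name `wilson_interval` and the choice of a 95% level (z = 1.96) are assumptions made for illustration only.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Illustrative helper (not from the study); z = 1.96 assumes a 95% level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Counts taken from the text: end-point PCR (36/88) and in-situ PCR (5/12).
    for label, k, n in (("end-point PCR", 36, 88), ("in-situ PCR", 5, 12)):
        low, high = wilson_interval(k, n)
        print(f"{label}: {k}/{n} = {k / n:.3f}, interval {low:.3f} to {high:.3f}")
```

The Wilson interval is used in this sketch rather than the simpler normal approximation because it behaves better for small samples such as the twelve in-situ PCR specimens.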

Conclusion
Results from this study show that in patients with lung adenocarcinoma from Latin America, a large proportion of tumor samples carry the MTb DNA sequence IS6110. Additionally, we report that these sequences can be identified in the nuclear area of neoplastic cells by in-situ PCR. The shotgun sequencing effort suggests that the genotypes of the two sequenced patients could be related. The presence of MTb DNA by PCR in this cohort is significantly associated with sex and younger age. Although chronic inflammation is increasingly implicated as a cancer development mechanism following bacterial infection, proto-oncogene disruption by MTb DNA could provide another mechanism in lung oncogenesis. However, further controlled experiments should be undertaken to assess the possible mechanisms by which TB participates in the development of LC.

Materials and methods
Experimental design. The present work was a clinical, longitudinal, prospective, observational, and analytical study of a non-probabilistic sample drawn from a cohort of lung cancer patients.
Patients and tissue samples. From January 2015 to December 2017, patients admitted to the Instituto Nacional de Cancerología (INCan) with a pulmonary lesion suggestive of primary lung carcinoma were biopsied prospectively. Lung tissues were obtained from the clinically suspected primary tumor by computed tomography-guided tru-cut biopsy (CareFusion, San Diego, CA, USA) after informed consent was obtained. Patients with histologically confirmed locally advanced or metastatic lung cancer (stages III B and IV) were eligible for inclusion in the present study. Patients whose histologic report did not indicate primary lung cancer were excluded. A complete clinical-medical history was included, and all lung tumor specimens were collected after confirmation of diagnosis. This study was conducted according to the principles of the World Medical Association Declaration of Helsinki "Ethical Principles for Medical Research Involving Human Subjects" 41 . All experimental protocols were approved by the Scientific and Bioethical committees at INCan (014/009/ICI, CEI/870/14). All patients provided signed written consent to participate in genotyping/genomic studies. Primary tumor core biopsy was performed before any treatment, and the specimen was snap-frozen in liquid nitrogen for DNA extraction. A total of 88 tumor samples were included in the study (Consort).

Statistical analysis. Continuous variables were tabulated as medians with ranges, or as means with standard deviations (SDs), depending on the distribution of the data. The distribution was assessed using the Shapiro-Wilk test, with a P-value greater than 0.05 taken to indicate a normal distribution. Two-group comparisons were tested using Student's t test or the Mann-Whitney U test, depending on the distribution of the data. Nominal data were analyzed using the chi-square (χ²) test. All data were analyzed using the SPSS package v. 20 (SPSS, Inc., Chicago, Ill, US) following methods previously reported by this research group 42 .
DNA isolation. Frozen tumor biopsies were weighed and cryo-fractured in liquid nitrogen before DNA extraction. Extraction and purification of total DNA from tissue (up to 5 mg tissue) were performed using the QIAGEN QIAamp DNA UCP Micro Kit (Cat. 5204).
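The test-selection rule and the chi-square comparison described under Statistical analysis can be sketched as follows. This is a minimal illustration using only the Python standard library, not the SPSS procedure used in the study. The function names, the requirement that both groups pass the normality check, the absence of a continuity correction, and the counts in the usage example are all assumptions introduced for illustration; the counts are toy values and are not study data.

```python
import math
from typing import Sequence, Tuple


def choose_two_group_test(p_shapiro_a: float, p_shapiro_b: float,
                          alpha: float = 0.05) -> str:
    """Pick the two-group test from Shapiro-Wilk P-values.

    Assumption: both groups must have P > alpha to be treated as normal.
    """
    if p_shapiro_a > alpha and p_shapiro_b > alpha:
        return "Student's t test"
    return "Mann-Whitney U"


def chi_square_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Returns (statistic, P-value) with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if 0 in rows or 0 in cols:
        raise ValueError("every row and column must have a non-zero total")
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = rows[i] * cols[j] / n
            stat += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    # Toy counts for illustration only (NOT study data).
    toy_table = [[12, 8], [5, 15]]
    stat, p_value = chi_square_2x2(toy_table)
    print(f"chi-square = {stat:.3f}, P = {p_value:.4f}")
    print(choose_two_group_test(0.20, 0.01))
```

In practice a statistical package would also report exact tests for small expected counts; this sketch only shows the arithmetic behind the chi-square statistic.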
End-point and in-situ PCR. DNA from tumor tissue was evaluated for the MTb transposon IS6110 by end-point PCR. The DNA concentration was assessed using an ND-1000 Spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) and the quality by agarose gel electrophoresis. End-point PCR was performed in a 25 μl mixture containing 100 ng of DNA, HotStar Taq Master Mix (Qiagen), and the following IS6110 primers: IS-1 5′-CCT GCG AGC GTA GGC GTC GG-3′ and IS-2 5′-CTC GTC CAG CGC CGC TTC GG-3′, as previously described 43 . The PCRs were performed with the following cycling conditions: an initial hold at 95 °C for 15 min; 40 cycles of denaturation at 95 °C for 30 s, annealing at 70 °C for 30 s, and extension at 72 °C for 45 s; and a final extension at 72 °C for 10 min. The presence of the 123 bp transposon product was evaluated in a 2% agarose gel. As a positive control, we used MTb H37Rv DNA 35,44 . Samples that tested positive for IS6110 by end-point PCR were selected for in-situ PCR with the same IS6110 primers, as previously reported 35 . Briefly, 4 μm sections were obtained from paraffin-embedded tumor tissue using a microtome and placed on electrostatically charged slides. Paraffin was removed by incubation at 60 °C for 20 min, and the sections were then gradually hydrated through xylene, alcohol, and DEPC-water. Tissue permeabilization was performed with hydrochloric acid (0.02 M) for 10 min; the sections were then digested with proteinase K (1 mg/L) at 37 °C for 30 min and fixed with 20% acetic acid. The reaction mix contained FastStart Taq DNA polymerase (Roche), dNTPs coupled to digoxigenin (PCR DIG labeling Mix, Roche), and the following primers: IS-1 5′-CCT GCG AGC GTA GGC GTC GG-3′ and IS-2 5′-CTC GTC CAG CGC CGC TTC GG-3′. The in-situ PCR was performed using the Amplicover discs and AmpliClips system from Applied Biosystems in a Touchgene thermocycler, and the cycling conditions were the same as for the end-point PCR described above. The PCR products were detected with an anti-digoxigenin antibody (1:500) incubated for 30 min in a wet chamber; NBT/BCIP (1:50) was used as a substrate, incubated for 30 min in a wet chamber, and Nuclear Fast Red was used as a counterstain 35 .
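The cycling conditions and primers described above can be written down as a small data structure, which makes it easy to check the programmed run time and basic primer properties. The following is a minimal sketch under stated assumptions: the names `Step`, `programmed_seconds`, and `gc_fraction` are hypothetical, the primer sequences are copied from the text with spaces removed, and the time calculation counts programmed hold times only (instrument ramp times are not included).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Cycling program as described in the text (hold times only, no ramping).
INITIAL_HOLD = Step("initial hold", 95, 15 * 60)
CYCLE = (
    Step("denaturation", 95, 30),
    Step("annealing", 70, 30),
    Step("extension", 72, 45),
)
N_CYCLES = 40
FINAL_EXTENSION = Step("final extension", 72, 10 * 60)

# IS6110 primers from the text, written 5' to 3' without spaces.
PRIMERS = {
    "IS-1": "CCTGCGAGCGTAGGCGTCGG",
    "IS-2": "CTCGTCCAGCGCCGCTTCGG",
}


def programmed_seconds() -> int:
    """Total programmed hold time of the run, in seconds."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_HOLD.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    print(f"programmed time: {programmed_seconds() / 60:.0f} min")
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
```

The high GC content of both primers is consistent with the relatively high annealing temperature given in the protocol, although the sketch does not attempt to estimate melting temperatures.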