Taxonomic diversity of sputum microbiome in lung cancer patients and its relationship with chromosomal aberrations in blood lymphocytes

Here we report a pilot study comparing the taxonomic composition of the sputum microbiome in 17 newly diagnosed lung cancer (LC) patients and 17 controls. A second objective was to compare the representation of individual bacterial genera and species in sputum with the frequency of chromosomal aberrations in the blood lymphocytes of LC patients and controls. Both groups were male; average age was 56.1 ± 11.5 years in patients and 55.7 ± 4.1 years in controls. Differences in the species composition of bacterial communities between LC patients and controls were significant (pseudo-F = 1.94; p = 0.005). Increased prevalence in LC patients was detected for the genera Haemophilus and Bergeyella, whereas a decrease was observed for the genera Atopobium, Stomatobaculum, Treponema and Porphyromonas. Donors with high frequencies of chromosomal aberrations had a significant reduction in representatives of the genus Atopobium in the microbiome and a simultaneous increase in representatives of the genus Alloprevotella compared to donors with a low level of chromosomal aberrations in lymphocytes. Thus, a comparison of the bacterial composition in the sputum of donors with cytogenetic damage in their lymphocytes warrants further investigation of the potential role of microorganisms in the process of mutagenesis in somatic cells of the host body.

Lung cancer (LC) is the most common malignant tumor and the leading cause of cancer death in the world. In particular, deaths from LC in men account for about one third of deaths from all malignant neoplasms 1 . The treatment of LC, despite the enormous efforts reflected in basic and clinical studies, still shows unsatisfactory results on survival, and prevention programs show little effect on its incidence. Complex mechanisms of pathological changes and some key biomarkers of LC are recognized, but there is a significant gap in the understanding of their mechanisms of action in the course of LC development and progression. Among causative factors in the etiology of LC, smoking certainly plays a leading role 2 . At the same time, 15 to 25% of all cases of lung cancer are observed in never-smoking patients. Among women, this figure reaches 53% 3 . Therefore, in addition to smoking, there are other factors that can significantly affect the risk of developing LC.
In addition to the effects of hereditary components in the causation of LC, a number of contributory factors have been reported: passive smoking 4 , air pollution 5 , occupational exposure to carcinogens 6 , chronic exposure to high doses of radon 7 and, recently, the composition of the lung microbiome 8 .
In connection with the recent development of metagenomic studies, information has been rapidly accumulated regarding the presence and diversity of the microbiota populating the lungs and the entire respiratory tract.
It was noted that alpha diversity (richness and evenness of the distribution of taxa in samples) is significantly higher in non-cancerous lung tissues than in tumor lung tissues 9,10 , while community similarity (beta diversity) varies widely. Researchers of the LC microbiota agree that the phylum Firmicutes makes the main contribution to the development of LC [11][12][13] . At taxonomic levels below phylum, the results to date are mixed. In most cases, correlations have been observed for individual genera. In particular, two genera, Veillonella and Megasphaera, were noted as possible LC biomarkers 13 . Erb-Downward and colleagues observed a link between Acidovorax and small cell carcinoma 14 , while other researchers attribute an increased risk of developing LC to the detection of Granulicatella 10,15 , Abiotrophia 10 , Streptococcus 10,12 , Haemophilus influenzae, Enterobacter spp., Escherichia coli 16 , Capnocytophaga, Selenomonas, Veillonella and Neisseria 17 . Given the inconsistency of the previously obtained results, and taking into account that the taxonomic composition of the respiratory microbiota depends on environmental factors, it is important to continue research, including the analysis of microbiota in patients with LC from different regions of the world. This is especially true of regions with high levels of environmental pollution, for example, due to coal burning, as was previously shown for one of the provinces of China 18 .
It was shown that changes that upset the balance in the community of microorganisms inhabiting the lungs lead to the predominance of Haemophilus influenzae, Acidovorax, Klebsiella, Moraxella catarrhalis, Mycobacterium tuberculosis, and Granulicatella adiacens 19 .
It is also known that inflammatory reactions can be triggered by opportunistic species such as Enterobacter spp., E. coli, Pneumococcus 16 , Legionella 20 and Moraxella 21 . An equally important contribution to the development of LC can be made by microbiota acting on Th17-positive T-helpers, responsible for a balanced immune response and necessary for the control of autoimmune reactions. It was shown that commensal bacteria have a direct effect on the Th17-dependent pathway and calcineurin expression 22 .
Another, no less important mechanism of microbiota-induced carcinogenesis is the genotoxicity of many bacterial metabolites. Bacterial genotoxins, such as colibactin, cytolethal distending toxin (CDT) and others, have been identified as compounds that directly damage DNA in host cells 23 . In other cases, mutagenesis in the cells of the host organism is associated with the formation of DNA-reactive metabolites due to bacterial activity, the formation of radicals, or the immune modulation of the host cells 24 . This is a mechanism of action for Helicobacter pylori, Pseudomonas aeruginosa, Enterococcus faecalis, Shigella flexneri, Bacteroides fragilis, Neisseria gonorrhoeae, Listeria monocytogenes, Chlamydia trachomatis and others. This list is clearly not exhaustive and will be updated over time.
In general, one can suggest that bacteria use different strategies to ensure their survival and replication, including inhibiting DNA repair in host cells, which contributes to the survival of infected cells despite the presence of DNA damage 25 . The genotoxic potential of bacterial microbiota is the basis for the hypothesis that the stability of the genome of somatic human cells, especially under the influence of genotoxic and carcinogenic factors, can depend directly (or indirectly) on the taxonomic composition of the bacterial community.
To test this hypothesis, we have undertaken efforts to analyze the taxonomic composition of the microbiota in the sputum of LC patients and healthy donors living in the environmentally challenged coal-mining region of Western Siberia (Kuzbass), Russia.
Another objective was to correlate the representation of individual bacterial genera and species in sputum with the frequency of chromosomal aberrations (CA) in the blood lymphocytes of LC patients and controls.

Methods
Cohort information. The composition of the bacterial microbiome was studied in 17 patients with newly diagnosed LC (male only, average age 56.1 ± 11.5 years) who were admitted to the Kemerovo Regional Oncology Center (Kemerovo, Russian Federation) and 17 healthy male donors, residents of Kemerovo (average age 55.7 ± 4.1 years). Mean age did not differ significantly between patients and controls (p = 0.062). Among LC patients, 70.6% were active smokers; among the controls, 64.7%. A summary of the information regarding LC and control subjects is shown in Table 1. An individual questionnaire was filled out for each study participant, containing information about the place and date of birth, profession, exposure to occupational hazards, health status, diet, medication use (antibiotic use in the three months before the study), X-ray procedures, and smoking and drinking status. For LC patients, the results of clinical and histological analyses were additionally taken into account. The distribution of LC diagnoses among the patients analyzed was: squamous cell carcinoma, 5 (29.4%); adenocarcinoma, 5 (29.4%); large cell lung carcinoma, 3 (17.7%); other forms of LC (mesenchymal, non-small cell undifferentiated), 4 (23.5%). In addition, for each patient, the stage of the disease was determined in accordance with the TNM classification 26 . Accordingly, 6 patients (35.3%) had stage I-II and 11 patients (64.7%) had stage III-IV disease. Additionally, metastases in distant organs were present in 23.5% of LC patients.
Taxonomy quantification using 16S rRNA gene sequences and statistical methods. The results were processed with the program QIIME2 30 . A quality check was carried out and a sequence library was generated.
The sequences were clustered into operational taxonomic units (OTUs) at a 99% nucleotide similarity threshold using the Greengenes (version 13-8) and SILVA (version 132) reference sequence libraries, followed by the removal of singletons (OTUs containing only one sequence). Alpha diversity indices were calculated, and community similarity (beta diversity) was analyzed using the UniFrac method 31 . The total diversity of prokaryotic communities in sputum (alpha diversity) was estimated by the number of observed OTUs (an analogue of species richness) and by the Shannon index ($H = -\sum_i p_i \ln p_i$, where $p_i$ is the proportion of the $i$-th species in the community). For the calculation of diversity indices, samples were normalized to 386 sequences (the minimum number of sequences obtained per sample). The variation in bacterial community structure between samples (beta diversity) was analyzed using UniFrac, a method common in microbial ecology that estimates the difference between communities based on the phylogenetic relationships of the taxa present. In this study, we used the unweighted version of UniFrac, which takes into account only the presence of taxa, but not their share in the community. The significance of differences between groups of samples was evaluated by the PERMANOVA method (Adonis).
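The idea behind unweighted UniFrac can be illustrated with a minimal sketch. It assumes a hypothetical toy tree (the node names and branch lengths below are invented for illustration) and is not the QIIME2 implementation used in the study: the distance is the branch length leading to taxa found in only one of two communities, divided by the branch length leading to taxa found in either.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy tree: node -> (parent, length of the branch to the parent).
# The root has no entry. Names and lengths are illustrative assumptions only.
Tree = Dict[str, Tuple[str, float]]

TOY_TREE: Tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("n2", 2.0),
    "D": ("n2", 2.0),
    "n1": ("root", 0.5),
    "n2": ("root", 0.5),
}


def covered_branches(taxa: Set[str], tree: Tree) -> Set[str]:
    """Return every branch (named by its child node) on a path from taxa to root."""
    covered: Set[str] = set()
    for taxon in taxa:
        node = taxon
        while node in tree:  # stop once the root is reached
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(a: Set[str], b: Set[str], tree: Tree) -> float:
    """Unique branch length divided by total observed branch length."""
    branches_a = covered_branches(a, tree)
    branches_b = covered_branches(b, tree)
    observed = branches_a | branches_b  # branches seen in at least one sample
    unique = branches_a ^ branches_b    # branches seen in exactly one sample
    total = sum(tree[n][1] for n in observed)
    if total == 0:
        return 0.0
    return sum(tree[n][1] for n in unique) / total
```

Only presence or absence enters the calculation, so two samples with the same taxa at very different abundances have a distance of zero under this measure.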
In addition, the Mann-Whitney U test was used to assess the significance of differences in the relative percentage of individual bacterial taxa in sputum, as well as in the frequency of chromosomal aberrations in lymphocytes. The Fisher exact test was used to compare frequencies of occurrence. Calculations were performed using the STATISTICA 10 software package (StatSoft, USA).
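For readers who wish to reproduce this kind of comparison without STATISTICA, the two tests can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the software used in the study: the Mann-Whitney p-value uses the normal approximation with tie correction and no continuity correction, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic of x, two-sided p by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell.
        return math.comb(row1, k) * math.comb(row2, col1 - k) / math.comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(k) for k in range(lowest, highest + 1)
                if prob(k) <= p_observed * (1 + 1e-9))
    return min(1.0, total)


# Smoking status: 12/17 and 11/17 smokers are back-calculated from the
# reported 70.6% and 64.7%; the counts themselves are an assumption.
p_smoking = fisher_exact_two_sided(12, 5, 11, 6)
```

With groups of 17, an exact or permutation-based p-value for the Mann-Whitney test may be preferable to the normal approximation shown here.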

Results
Characterization of sputum bacterial communities. In our 16S rRNA (V3-V6) sequencing of sputum samples from LC patients and controls, we identified a total of 11 phyla with relative frequencies above 0.1%. The prevailing phyla in our dataset were Firmicutes, Bacteroidetes, Actinobacteria and Proteobacteria (Fig. 1), as could be expected from previous studies 15,18 .
Regarding alpha diversity, neither the number of observed OTUs nor the Shannon index showed significant differences between LC patients and controls. Overall, the bacterial communities in both groups were fairly diverse, as indicated by the Shannon index at the genus level (5.632 in LC vs 5.634 in controls). This suggests that any changes to the sputum microbiome as a result of a malignancy are not large-scale community shifts.
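The alpha-diversity calculation described in Methods (subsampling each sample to a common depth, then computing the Shannon index) can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up counts, not the QIIME2 routine used in the study; the logarithm base is left as a parameter because different tools use different bases.

```python
import math
import random
from collections import Counter
from typing import Dict


def rarefy(counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Randomly subsample a taxon count table to `depth` reads without replacement."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


def shannon(counts: Dict[str, int], base: float = math.e) -> float:
    """Shannon index H = -sum(p_i * log(p_i)) over taxa with non-zero counts."""
    total = sum(counts.values())
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p, base)
    return h


# Illustrative counts only; 386 is the rarefaction depth reported in Methods.
example_sample = {"Veillonella": 300, "Prevotella": 200, "Atopobium": 100}
rarefied = rarefy(example_sample, depth=386)
h_example = shannon(rarefied)
```

Because H cannot exceed the logarithm of the number of taxa, index values are comparable only when the same base and the same rarefaction depth are used.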
Differences in the structure of bacterial communities in sputum samples of lung cancer patients and controls are shown in Fig. 2. The PERMANOVA (Adonis) test on the distance matrix constructed by the unweighted UniFrac method showed a significant difference between the prokaryotic sputum communities of healthy people and LC patients (pseudo-F = 1.94; p = 0.005).
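To make the reported pseudo-F statistic concrete, the following sketch computes a one-way PERMANOVA from a square distance matrix and group labels. It is an illustration under assumptions (hypothetical function names, a toy distance matrix, label permutation with a fixed seed) and not the Adonis routine used for the analysis.

```python
import random
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pseudo_f(dist: Sequence[Sequence[float]], labels: Sequence[str]) -> float:
    """Pseudo-F: among-group over within-group sums of squared distances."""
    n = len(labels)
    groups: Dict[str, List[int]] = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    a = len(groups)
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist: Sequence[Sequence[float]], labels: Sequence[str],
              n_perm: int = 999, seed: int = 0) -> Tuple[float, float]:
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    f_observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign samples to groups at random
        if pseudo_f(dist, shuffled) >= f_observed:
            hits += 1
    return f_observed, (hits + 1) / (n_perm + 1)


# Toy data: six samples on a line, two well-separated groups (illustrative only).
positions = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(p - q) for q in positions] for p in positions]
toy_labels = ["LC", "LC", "LC", "control", "control", "control"]
f_toy, p_toy = permanova(toy_dist, toy_labels)
```

In the study itself the distance matrix would hold the unweighted UniFrac distances between the 34 sputum samples.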
In LC patients, compared to controls, a decrease in the occurrence of individual representatives of the same genera was observed at the species level: Atopobium rimae (1.1 ± 0.91 vs 2.27 ± 1.52; p = 0.003), Treponema amylovorum (0.27 ± 0.52 vs 0.05 ± 0.17; p = 0.016), as well as an increase in Bergeyella zoohelcum (0.25 ± 0.25 vs 0.03 ± 0.11; p = 0.004). Additionally, two different species from the genus Prevotella (P. histicola and P. sp. oral clone DO014) were significantly less represented in the sputum of patients than in the controls (Table 3).
Smoking status, as a factor that may have influenced the composition of the bacterial flora in LC patients and controls, was studied separately. Among LC patients differing in smoking status, significant differences in occurrence in sputum were found for the species Selenomonas bovis (3.52 ± 2.55% in smokers and 0.91 ± 1.62% in non-smokers; p = 0.044) and for the genera Bacteroides (4.02 ± 2.51% in smokers and 1.35 ± 1.61% in non-smokers; p = 0.039), Zhouea (0% in smokers and 1.03 ± 1.39% in non-smokers; p = 0.006), Selenomonas (4.07 ± 2.42% in smokers and 1.03 ± 1.55% in non-smokers; p = 0.013) and Peptostreptococcus (0% in smokers and 0.73 ± 0.21% in non-smokers; p = 0.031).
Comparison of the species and generic microbiome compositions in the sputum of LC patients of the main pathomorphological forms (squamous, adenocarcinoma and large cell carcinoma) revealed significant differences in the content of the two bacterial genera Veillonella and Leptotrichia. In the sputum of patients diagnosed with large cell carcinoma, Veillonella representatives were recorded with a higher frequency than in patients with adenocarcinoma (15.35 ± 3.39% vs 7.68 ± 3.01%, respectively; p = 0.036). Representatives of the genus Leptotrichia were also recorded with a greater frequency in the sputum of patients with large-cell carcinoma as compared to adenocarcinoma (4.78 ± 2.81% vs 1.11 ± 1.13%, respectively; p = 0.036).
Comparison of the species and generic microbiome compositions in patients with different stages of LC (I-II and III-IV) revealed a significant difference in only one species, Porphyromonas endodontalis, the content of which was significantly higher in the sputum of patients with stage I-II LC (0.53 ± 0.6%) than in patients with stage III-IV LC (0.06 ± 0.21%; p = 0.021).
The presence of metastases to distant organs was associated with an increase in the sputum content of representatives of the genus Capnocytophaga (3.05 ± 1.58% vs 0.71 ± 1.05% in LC patients without metastases; p = 0.016). In addition, Atopobium rimae was decreased in the sputum of patients with metastases compared with patients without metastases (0.21 ± 0.43% vs 1.37 ± 0.84%, respectively; p = 0.017).
Chromosomal aberrations in lymphocytes and sputum microbiome composition. Lung cancer patients had a significantly increased total frequency of CAs in comparison with controls (4.31 ± 1.86% vs. 2.27 ± 1.03%, p = 0.002). In LC patients, significant increases in the frequency of chromosome-type aberrations (CSAs), such as chromosome breaks (1.16 ± 0.94% vs. 0.38 ± 0.45%, p = 0.012), were detected in comparison with the control group (Table 4). No significant differences in chromatid-type aberrations (CTAs) were found between LC patients and controls. The Spearman correlation coefficients of the CA, CTA and CSA frequencies with age (for the patients and the controls) were not significant (p > 0.05). No significant difference in the frequencies of CAs was detected between squamous cell lung cancer, adenocarcinoma and large cell lung carcinoma, or between TNM stages. Additionally, smoking status did not significantly affect the CA, CTA and CSA frequencies in the two groups studied.
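The Spearman coefficient mentioned above is the Pearson correlation of the ranked values. A minimal sketch follows, with hypothetical function names and returning only the coefficient (no p-value); the values used for illustration are not study data.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up ages and aberration frequencies, for illustration only.
rho_example = spearman_rho([48, 52, 55, 61, 67], [2.0, 3.5, 1.5, 4.0, 2.5])
```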
To analyze the possible effect of the taxonomic composition of the microbiome on the level of genetic instability in somatic cells of the host organism, based on the results of cytogenetic analysis, subgroups of patients and controls were formed that differed in the frequency of lymphocytes with CAs.
The subgroup with a low background level of CA (0-3.5%; average 2.26 ± 0.86%) consisted of 7 patients and 16 controls, and the subgroup with a high level of CA (more than 3.5%; average 5.67 ± 1.15%) consisted of 9 patients and 1 control. Comparison of the percentage representation of the species Atopobium rimae in the bacterial microbiome revealed a significant decrease of Atopobium rimae in the sputum of lung cancer patients with a high level of CA as compared to those with a low level (0.59 ± 0.63 versus 1.72 ± 0.9; p = 0.014).
In addition, we compared the content of bacterial genera and species in the combined subgroups of patients and controls with high and low levels of CA in lymphocytes.
In the sputum of donors with high CA frequencies, there was a significant decrease in representatives of the genus Atopobium, particularly of the species Atopobium rimae, as well as a decrease in representatives of Prevotella sp. oral clone DO014 (Fig. 3).
At the same time, an increase in the level of genetic instability in donor lymphocytes was associated with a significant increase in the content in the sputum of representatives of the genus Alloprevotella.
Representatives of the genus Actinobacillus were noted only in the sputum of donors with a high frequency of CA (0.29 ± 0.57%) and were not found in donors with a low level of aberrations in lymphocytes.

Discussion
The contribution of the microbiota of the upper respiratory tract to the pathological transformation of the tissues of these organs is a recognized phenomenon in studies of such a widespread disease as LC.
The urgent problems remaining are the formation of a database that is as complete as possible on the microbiome of the respiratory organs of people from different geographical regions of the planet, the establishment of strong cause-effect or associative relationships, and the identification of the molecular mechanisms of LC pathogenesis. This will open up prospects for early diagnosis and, ultimately, new approaches to the treatment of LC.
This pilot study, for the first time, presents the results of an investigation of the microbiome of the respiratory tract in a small group of patients with lung cancer of various histological types and stages, residing in Kuzbass, an industrial region of Russia.
Through the use of 16S metagenomic sequencing, we determined differences in the composition of the microbiome up to the species level in 17 men with a confirmed diagnosis of lung cancer (LC) compared with 17 age-matched controls.
In our study, we found 12 phyla, 119 genera and 118 OTUs and analyzed 187361 sequences in total. The taxonomic structure of the upper respiratory tract microbiome at the phylum level is shown in Fig. 1, from which it follows that the microbiome is represented mainly by the phyla Bacteroidetes, Firmicutes and Actinobacteria.
At the species level, differences in the composition of the sputum microbiome between patients with LC and control donors (Table 3) are characterized by a statistically significant decrease in the prevalence of four bacterial species, namely Prevotella histicola, Prevotella sp. oral clone DO014, Atopobium rimae and Treponema amylovorum, and an increase in Bergeyella sp. AF14 (Flavobacteriaceae) in LC patients.
Prevotella histicola and Prevotella sp. oral clone DO014 belong to the phylum Bacteroidetes, members of which were recently associated with a worse prognosis for patients with LC 32 . Although a phylum may be overrepresented in LC patients, an individual species can still be underrepresented in patients with LC, such as Prevotella sp. oral clone DO014 in our case, by analogy with Filifactor species. Filifactor bacteria were recently found mainly in healthy study participants and were even noted as a good marker for distinguishing healthy participants from LC patients 11 , while the phylum to which Filifactor belongs is represented more frequently in patients with lung cancer 13 . We also detected Filifactor mainly in healthy people.

Table 4. Frequency of chromosome aberrations in lung cancer patients and controls (columns: minimum-maximum, mean ± SD).
Our data are consistent with previously published data: representatives of the genera Treponema and Filifactor were reported as ideal biomarkers of healthy participants in an LC study 11 .
Representatives of the genus Atopobium are usually found in the oral cavity. Atopobium rimae is associated with chronic periodontitis and can cause bacteremia. Bacteria of this species are strictly anaerobic, Gram-positive, short and elliptical in shape, and have a low G + C content 33 . The association of this species with LC has not been noted previously.
In our study, representatives of the genus Stomatobaculum were less represented in the sputum of patients with LC. This genus has only recently been described as part of the microbiota of the human oral cavity and is positively associated with halitosis [34][35][36] and with an increased fasting glucose level (a sign of an increase in blood glucose after 2 years and of the development of insulin insensitivity). It is also associated with a slowdown in bone restoration during implantation 37 . Stomatobaculum is strictly anaerobic, Gram-positive and non-spore-forming, and contains inclusions of iron and sulfur. For this genus as well, an association with LC has not been noted previously.
Sequencing of bacterial 16S rRNA genes from the sputum of the lung cancer patients in our study has shown that the genera Atopobium and Stomatobaculum are significantly less abundant in samples from lung cancer patients than in those from controls. The association of these two genera with the development of LC has not been noted previously. However, it was previously reported that the family Lachnospiraceae, to which Stomatobaculum belongs, showed a positive relationship with LC 32 .
The representation of bacterial species of Atopobium rimae in the sputum of patients with LC decreases with progression of metastases according to our study, but this requires confirmation in a larger sample collection.
Our 16S rRNA sequencing data indicate an increase in the content of bacteria of the species Bergeyella sp. AF14 (Flavobacteriaceae) in the sputum of LC patients. Previously, this species was not associated with LC. Published data link it to animal respiratory disease 38,39 . Only recently have these bacteria been identified as a cause of infectious endocarditis in humans 40 .
Interestingly, Bergeyella is represented to a lesser extent in patients with halitosis 34 . To establish a more reliable association of this bacterium with the diagnosis of LC, a study with more participants will be required.
Regarding such a widespread respiratory tract bacterium as Haemophilus influenzae, we were able to detect it in our samples by analyzing bacterial 16S rRNA gene sequences against the SILVA database, which covers small-subunit (16S/18S) rRNA sequences of Bacteria, Archaea and Eukarya. We see that H. influenzae is present in 17 patients with LC and in only 1 subject in the control group (p = 0.016). This contradicts previously published data that H. influenzae is mainly detected in healthy individuals 15 . It is known that most strains of H. influenzae are opportunistic pathogens that coexist with the host without causing disease, and only concomitant factors such as viral infections, decreased immunity, or chronically inflamed tissues, such as in allergies, predispose to pathogenic infections with H. influenzae. Thus, the reproduction of this pathogen and, therefore, higher chances of its detection should be expected precisely in the case of LC, as our data show. In any case, this discrepancy in the data deserves further study, with the involvement of more participants in the study of the microbiota of the upper respiratory tract.
The frequencies of bacterial genera detected in our study indicate a decrease in the diversity of the microbiome of LC patients as compared to controls (Table 3). Seven representatives of unique bacterial genera were found in patients with LC compared to 14 in the controls.
Our data are consistent with previously published data showing that the lung microbiota in LC patients has a less complex composition. LC patients are distinguished by a decrease in the alpha diversity of the lung microbiota 41 .
Comparison of the microbiome composition in the sputum of patients with lung cancer of the main histological types (squamous cell carcinoma, adenocarcinoma and large-cell carcinoma) at the species and genera level did not reveal significant differences, except for the occurrence of bacteria of the genus Bergeyella. In the sputum of patients diagnosed with adenocarcinoma, representatives of Bergeyella were recorded at a higher frequency than in patients with squamous cell carcinoma (0.4 ± 0.24% and 0.04 ± 0.1%, respectively; p = 0.045).
Comparison of the microbiome in the sputum of patients with lung cancer of different stages (I-II and III-IV) at the species and genera level did not reveal significant differences.
Comparison of the microbiome in the sputum of lung cancer patients with and without metastases in distant organs at the species and genus level did not reveal significant differences, with the exception of the genus Atopobium, whose representatives were much less frequently detected in patients with metastases (0.21 ± 0.43% and 1.37 ± 0.084%, respectively; p = 0.017), in particular the species Atopobium rimae (0.21 ± 0.43% and 1.21 ± 0.089%, respectively; p = 0.034).
Interestingly, there was a different and non-overlapping representation of bacterial species in the sputum microbiome of smokers as compared to non-smokers in both LC patients and healthy participants in the study.
We examined the chromosome aberrations in the blood lymphocytes of all participants in the study and found that there is a significant correlation between the level of bacteria of a certain species in the sputum and chromosomal aberrations (CA).
Representatives of the genus Alloprevotella were most frequently associated with chromosomal aberrations in all participants in our study (Fig. 3). Bacteria of this genus have recently been identified as reliably associated with the metastatic stage of melanoma 42 .
Interestingly, representatives of the genus Actinobacillus were found only in the group of participants with high levels of CA. Representatives of the genus Actinobacillus (A. ureae and A. hominis) were previously found in the respiratory tract of healthy people and can cause bronchopneumonia and meningitis 43 .
Previous publications reported an increase in chromosome damage in the lymphocytes of primary LC patients [44][45][46] . The results obtained in our study confirm this (Table 4).
This increase in the level of cytogenetic lesions cannot be explained by the consequences of radiation or chemical therapy, since biological material from patients was collected before any treatment procedures. Instead, the cytogenetic instability of somatic cells in untreated patients with LC is a sign that reflects the influence of endogenous genotoxic factors on the human genome. In particular, one such factor can be oxidative stress, which is an indispensable attribute of the tumor process. In addition, possible clastogenic effects of the bacterial environment in the lung tissue cannot be ruled out as a causative agent. The influence of lung microbiota on lung carcinogenesis, immunity and immunotherapy is summarized in a recent review, the major points of which agree with our results as follows: the microbiota of the healthy lung is different from that of the neoplastically transformed lung; bacterial products might promote host oncogene activation; the lung immune system is under the influence of the microbiota 47 .
It would be interesting to conduct a screening of the microbiome of the respiratory tract on a larger scale and for a longer period of time, for example, over three years, in order to identify the types of bacteria detected by us among healthy people from the same age group and region, in order to compare the data with the subsequent diagnosis of lung cancer and other diseases of the respiratory tract.
We plan to confirm the representation of different bacterial species detected by us in the sputum of patients with LC and healthy subjects by other methods, for example, by the method of specific quantitative PCR.

Conclusion
The method of massively parallel sequencing of 16S ribosomal RNA genes was used to determine the taxonomic composition of the microbiome in the sputum of LC patients and healthy donors (for the first time in the Russian population). Bacterial taxonomic groups whose representation differs significantly between patients and controls have been identified. The discrepancy between our data and the results of previous studies of the LC microbiome probably reflects the specific epidemiological circumstances of the environment in the population we studied, a region with intensive coal mining and processing. The results obtained for the taxonomic composition of the microbiome rely on experimental data that will be further confirmed using a larger number of patients and controls. The sputum microbiome, although it does not reflect a specific location in the respiratory tract, can potentially serve as an important non-invasive biomarker in LC. Our results show a correlation between chromosomal aberrations in the host genome and bacterial representation in the pharyngeal microbiome.
Thus, a comparison of the bacterial composition in the sputum of donors with cytogenetic damage in their lymphocytes warrants further investigation of the potential role of microorganisms in the process of mutagenesis in somatic cells of the host body.
Knowledge of the specific composition of the microbiota in the respiratory tract of a patient will make it possible to predict the effectiveness of chemoradiotherapy, gene therapy, immunotherapy and other treatment methods, and may also contribute to the development of innovative strategies for early prevention and personalized treatment of lung cancer.