Molecular Changes of Lung Malignancy in HIV Infection

Malignancy of the lung is a major source of morbidity and mortality in persons with human immunodeficiency virus infection; as the most prevalent non-acquired immunodeficiency syndrome-defining malignancy, it represents an important and growing problem confronting HIV-infected patients. To evaluate the molecular changes of lung malignancy in HIV infection, we analyzed differential gene expression profiles and screened for early detection biomarkers of HIV-associated lung cancer using Affymetrix arrays and IPA analysis. A total of 59 patients were diagnosed with HIV-associated lung cancer from Jan 2010 to May 2018. The primary outcome was a significant difference in survival outcome between stages III-IV (10.46 ± 1.87 months) and I-II (17.66 ± 2.88 months). We identified 758 differentially expressed genes in HIV-associated lung cancer. The expression levels of SIX1 and TFAP2A are specifically increased in HIV-associated lung cancer and are associated with poorly differentiated tumor tissue. We also found decreased ADH1B, INMT and SYNPO2 mRNA levels in HIV lung cancer. A comprehensive network and pathway analysis of the dysregulated genes revealed that these genes were associated with four network functions and six canonical pathways relevant to the development of HIV-associated lung cancer. The molecular changes in lung malignancy may help screen the growing population of HIV patients who have or will develop this malignancy.

Immunosuppression, but not HIV infection, accounts for the higher rates of lung cancer in HIV patients 14 .
The HIV tat gene product increases the expression of some proto-oncogenes, including c-myc, c-fos and c-jun. Downregulation of HIV-tat interacting protein promotes metastatic progression of lung cancer 9 . However, there is no clear relationship between the degree of immunosuppression and the risk of lung cancer, so the decision to screen an HIV-infected patient for cancer should include an assessment of individualized risk for cancer, life expectancy, and the harms and benefits associated with the screening test and its potential outcome. Thus, screening the differentially expressed genes in lung cancer with HIV infection needs to be discussed.
Understanding the mechanisms underlying lung carcinogenesis in HIV infection may improve its treatment and the screening of the growing population of HIV patients who have or will develop this malignancy. The aim of this study is to heighten the awareness of lung malignancies occurring in HIV/AIDS while highlighting some of the clinical features in order to facilitate early recognition and diagnosis.

Results
Patient characteristics. The 59 patients with HIV-associated lung cancer enrolled in the study ranged in age from 40 to 77 years, with a mean age of 56.40 ± 9.12 years. Most patients (88.14%) were male; 11.86% were female. The pathological types were adenocarcinoma (36 cases), squamous cell carcinoma (14 cases) and small cell lung cancer (SCLC; 9 cases). The clinical characteristics of these patients are presented in Table 1. The median overall survival (OS) of the 59 patients was 14.12 months (95% CI, 10.63-17.61 months). In univariate analysis with SPSS, OS did not differ by age, sex, smoking, HAART, complication, CD4+ count, or pathological type. However, a pairwise comparison analysis across the pathological types of HIV-associated lung cancer showed a significant difference in survival between TNM stages I-II (17.66 ± 2.88 months) and stages III-IV (10.46 ± 1.87 months) (p = 0.026).
Transcriptomic profiles in HIV-associated lung cancer. To identify possible mechanisms of lung malignancy in HIV infection and to screen for early-diagnosis biomarkers, we performed gene microarray profiling of tissue from AIDS patients with lung cancer. We collected 7 pairs of tumor/adjacent normal paraffin-embedded specimens and 10 pairs of tumor/adjacent normal fresh tissue samples, and we successfully isolated 34 RNA samples (17 pairs) from HIV-associated lung cancer tissue and cancer-adjacent tissue. A genome-wide analysis of the gene transcripts expressed in the HIV lung cancer tissue samples was performed using an Affymetrix array, and the microarray data were analyzed statistically with Partek software. Figure S1 shows the microarray quality for the cancer tissue of our AIDS patients with lung cancer and shows that the arrays distinguish cancer tissue from adjacent tissue. Analysis of the microarray hybridization data revealed that 758 genes exhibited more than a 1.5-fold change in expression level (p ≤ 0.05) in the 4 pairs of HIV lung cancer tumor/adjacent normal tissue samples. We then took the 52 differentially expressed genes (DEGs) identified with a 5.0-fold change in expression level (p ≤ 0.05) in HIV-associated lung cancer (Table 2) and examined them in The Cancer Genome Atlas (TCGA) data for multiple cancer types. This analysis identified 10 DEGs (FAT3, NFASC, SLIT2, FMO2, ITGB8, SCARA5, ABCA6, ABCA8, MACC1, and TACC1) that had a high incidence of genetic alterations in lung adenocarcinoma (TCGA-nature 2014, 6.0-21.0%; TCGA-provisional, 5.0-10.0%) and lung squamous cell carcinoma (5.0-18.0%), with an incidence cutoff value of ≥5.0% in the three lung cancer TCGA lists of 929 patients (Table 3). These genes were also frequently altered in other cancer types, including breast invasive carcinoma (2.1-14.0%) and bladder urothelial carcinoma (0.8-16.0%).
Our analysis results suggest that alterations in these 10 candidate genes interact in at least a subset of tumors.
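The fold-change and p-value thresholds used to call DEGs can be illustrated with a short sketch. This is not the Partek workflow used in the study; the function name `filter_degs`, the input layout (gene, log2 tumor/normal ratio, p-value) and the example values are assumptions made purely for illustration.

```python
import math


def filter_degs(rows, min_fold=1.5, max_p=0.05):
    """Select genes passing a fold-change and p-value cutoff.

    rows: iterable of (gene, log2_ratio, p_value), where log2_ratio is
    log2(tumor / adjacent normal). This layout is an assumption.
    Returns a list of (gene, direction) with direction "up" or "down".
    """
    min_abs_log2 = math.log2(min_fold)  # 1.5-fold corresponds to about 0.585
    selected = []
    for gene, log2_ratio, p_value in rows:
        # "more than a 1.5-fold change" in either direction, with p <= 0.05
        if abs(log2_ratio) > min_abs_log2 and p_value <= max_p:
            direction = "up" if log2_ratio > 0 else "down"
            selected.append((gene, direction))
    return selected


# Made-up values for illustration only; these are not data from the study.
example = [
    ("GENE_A", 2.8, 0.001),   # large increase, significant
    ("GENE_B", -3.1, 0.004),  # large decrease, significant
    ("GENE_C", 0.3, 0.010),   # significant, but below the fold-change cutoff
    ("GENE_D", 1.9, 0.200),   # large change, but not significant
]

print(filter_degs(example))                # 1.5-fold cutoff
print(filter_degs(example, min_fold=5.0))  # stricter 5.0-fold cutoff
```

Raising `min_fold` from 1.5 to 5.0 mirrors the step from the 758-gene list to the 52-gene list described above, under the assumption that the same p-value cutoff is applied in both cases.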
Since the large-scale sequencing of human cancers can be used to comprehensively discover mutated genes that confer a selective advantage to cancer cells and find genes that drive cancer based on their patterns of mutation in large patient cohorts 15 , we applied these TCGA datasets for driver gene prediction. We observed a strong tendency toward co-occurrence of genetic alterations in these DEGs between ADH1B and FAT3, FAT3 and SLIT2, FIGNL1 and ITGB8, MACC1 and ITGB8, SCARA5 and TACC1, ABCA6 and ABCA8, FMO2 and NFASC, KIAA0895 and ITGB8, and MAL and ABCA8 (p < 0.01) (Table S1). Considering the regulatory role of these 10 candidate genes, the underlying mechanisms and cellular consequences of these interactions could be critical for understanding HIV-associated lung cancer pathology. (Fig. 1).
To investigate the expression status of TFAP2A and SIX1 proteins in benign and malignant lung tissue, we performed IHC staining of 7 lung tumor samples and 7 adjacent non-cancer tissue samples. TFAP2A- and SIX1-specific staining was clearly observed in the nucleus of the lung cancer cells, including HIV-associated squamous cell carcinoma (SCC) and adenocarcinoma (AC) (Fig. 2). For TFAP2A (Fig. 2A,B), 4 of 7 malignant cases showed positive staining, and 2 of 7 adjacent non-cancer tissue samples were positive. Notably, 6 of 7 (85.7%) malignant cases showed positive staining for SIX1, and none of the adjacent non-cancer tissue samples was positive; the difference in SIX1 expression between lung cancer tissue and adjacent non-cancer tissue samples was statistically significant (p < 0.001) (Table S2). Although SIX1 expression did not differ between Grade II and Grade III HIV-associated AC (Supplemental Fig. S2), anti-SIX1 staining of HIV-associated SCC was variable, with poorly differentiated tumor tissue (Grade III) showing higher expression of the protein (Fig. 3). Little or no staining of SIX1 was observed in adjacent non-cancer tissue samples. Together, these results indicate that the expression levels of SIX1 and TFAP2A are specifically increased in HIV-associated lung cancer. To better distinguish the cancer tissue from the surrounding tissue, we finally stained for p63 and TTF-1 as lung cancer cell-specific markers 16 . Both SCC cases showed strong positive staining for p63 (Fig. 4A), and none of the adenocarcinoma samples was positive (Fig. 4B), whereas TTF-1 staining was observed in 60.0% of adenocarcinomas (Fig. 4D and Table S2).

Functional networks and pathways of HIV lung cancer.
The genetic networks and cellular pathways dysregulated in HIV lung cancer were identified using the IPA software program. Expression microarray profiling studies revealed that 758 genes were dysregulated in HIV lung cancer. A comprehensive network and pathway analysis of the DEGs revealed that these genes were associated with four network functions and six canonical pathways relevant to the development of HIV lung cancer. The scores take into account the number of focus proteins and the size of the network to approximate the relevance of the network to the original list of focus proteins. In each of the four genetic networks, the DEGs constituted about half of the molecules involved in network-associated (1) cancer, connective tissue disorders, organismal injury and abnormalities; (2) nucleic acid metabolism, small molecule biochemistry, organ morphology; (3) cardiovascular system development and function, embryonic development, organismal development; (4) cancer, organismal injury and abnormalities, developmental disorder (Table S3 and Supplemental Fig. S3). The differentially expressed genes belong to six canonical signaling pathways, such as cellular effects of sildenafil; dopamine-DARPP32 feedback in cAMP signaling; purine nucleotides de novo biosynthesis II; 5-aminoimidazole ribonucleotide biosynthesis I; and tetrahydrofolate salvage from 5,10-methenyltetrahydrofolate pathways (Table S4).
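How a score can reflect both the number of focus molecules and the network size can be sketched as follows. The sketch assumes the score is the negative base-10 logarithm of a right-tailed Fisher's exact (hypergeometric) p-value, which is how IPA network scores are generally described; the exact reference set IPA uses is not stated in this paper, so the function name `network_score` and all counts below are hypothetical.

```python
import math


def network_score(universe, focus_total, network_size, focus_in_network):
    """-log10 of the right-tailed hypergeometric p-value.

    universe:         number of molecules in the reference set (assumed)
    focus_total:      number of focus molecules (e.g. DEGs) in the universe
    network_size:     number of molecules in the network
    focus_in_network: number of focus molecules found in the network
    """
    upper = min(network_size, focus_total)
    # P(X >= focus_in_network) when drawing network_size molecules at random
    tail = sum(
        math.comb(focus_total, k)
        * math.comb(universe - focus_total, network_size - k)
        for k in range(focus_in_network, upper + 1)
    )
    p_value = tail / math.comb(universe, network_size)
    return -math.log10(p_value)


# Hypothetical counts for illustration only; not values from Table S3.
for hits in (5, 10, 17):
    score = network_score(universe=20000, focus_total=758,
                          network_size=35, focus_in_network=hits)
    print(hits, round(score, 1))
```

Under this assumption, a network in which about half of the molecules are focus genes, as reported for the four networks here, scores far higher than one containing only a few, because such enrichment is very unlikely to arise by chance.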

Discussion
Lung cancer is the most common cause of cancer mortality worldwide for both men and women, and it also represents an important and growing problem confronting HIV-infected patients 5 . The incidence of lung cancer has risen among women in the past several decades, but the incidence is still higher among men than among women 17,18 . Several factors, such as HAART, repeated lung infections, chronic pulmonary inflammation and/or immunosuppression, have been reported to promote the development of lung cancer 9,10 . In this study, however, most patients did not have a low CD4+ cell count: the median CD4+ count at lung cancer diagnosis was 281/mL, with 14 patients having a CD4+ count <200/mL and 41 patients having a CD4+ count >200/mL. Moreover, survival did not differ significantly between patients with a CD4+ count <200/mL (11.46 ± 2.72 months) and those with a CD4+ count >200/mL (15.70 ± 2.34 months) (p = 0.324), which suggests that the HIV+ patients with lung cancer were not particularly immunosuppressed. We did identify a significant difference in survival outcome between TNM stages I-II (17.66 ± 2.88 months) and stages III-IV (10.46 ± 1.87 months). Lung cancer can often be asymptomatic in the early stages and may be diagnosed purely by chance 19 . Brock found that advanced stage in HIV-infected lung cancer patients was associated with worse survival compared to HIV-uninfected patients 20 . Although our data showed that HIV-infected lung cancer patients have shortened survival mainly due to advanced stage, there is concern for lead-time bias in lung cancer screens 19 . Lead-time bias cannot be excluded as a reason for the longer overall survival in those HIV-infected patients diagnosed as stage I-II compared to stage III-IV.
These results imply that TNM stage is a major prognostic factor not only in the general population but also in HIV-infected patients, and that diagnosis at an earlier stage might yield a survival benefit, although our study population was small.
To identify the molecular mechanisms underlying the gene expression profiles of lung malignancy in HIV infection, we performed an integrative network analysis using IPA. Using this tool, we identified four significant networks of lung malignancy in HIV infection (score ≥32; Table S3 and Supplemental Fig. S3). The scores take into account the number of focus proteins and the size of the network to approximate the relevance of the network to the original list of focus proteins. The networks altered in lung malignancy in HIV infection were associated with "Cancer", "Nucleic acid metabolism", "Organismal injury" or "Developmental disorder". We then carried out a canonical pathway analysis of the dysregulated genes in this malignancy, which revealed cellular effects of sildenafil; dopamine-DARPP32 feedback in cAMP signaling; purine nucleotides de novo biosynthesis II; 5-aminoimidazole ribonucleotide biosynthesis I; and tetrahydrofolate salvage from 5,10-methenyltetrahydrofolate. Most of these pathways are involved in cellular metabolism and are frequently dysregulated in HIV-associated lung cancer 21-25 . Sildenafil, an inhibitor of cGMP-degrading phosphodiesterase 5, is used to treat erectile dysfunction and potentiates a cGMP-dependent pathway to promote melanoma growth; it can also affect the innate and adaptive immune system in patients 26,27 .
In the initial biomarker identification stage, gene expression profiling was performed to assay the differentially expressed genes. The candidate differentially expressed genes were then validated using quantitative real-time polymerase chain reaction assay and immunohistochemistry analysis. We found that SIX1, PROM1, and TFAP2A were upregulated in HIV-associated lung cancer. SIX1 (SIX Homeobox 1) has been reported to play a role in the development of tumors, including breast, colorectal, gastric, and pancreatic cancer 28-31 . The SIX family homeobox genes have been demonstrated to be involved in tumor initiation and progression; they play distinct roles in the tumorigenesis of non-small cell lung cancer (NSCLC) and can be potential biomarkers for predicting the prognosis of NSCLC patients 32 . SIX1 can promote cell proliferation by reactivating the cell cycle-related protein cyclin A 33 and can stimulate malignant transformation of nontumorigenic cells 31 . Gene Expression Omnibus database analysis also confirmed that luminal breast cancer patients with SIX1 overexpression had worse overall survival, shorter relapse-free survival, and much worse prognosis 34 . More importantly, it was found to be closely linked to poor clinical prognosis in cancer patients 35 .
Table 3. The Cancer Genome Atlas consortium data on the incidence of genetic alteration (a) of DEGs in HIV-associated lung cancer by cancer type. (a) Genetic alterations consist of mutations and/or CNV. (b) Cancer Genome Atlas Research Network, Nature, 2014 15 .
bladder, breast, endometrium, and pancreas cancers, and it was reported to be a predictive marker for good response and survival after cisplatin-containing chemotherapy in several cancers 37-39 . High levels of AP-2α protein in bladder cancer were associated with good response to cisplatin 38 , and it was a newly identified prognostic marker for chemotherapy 40 . PROM1, a cancer stem cell marker, was identified as both a hematopoietic and a neuroepithelial stem cell marker 41,42 . PROM1 has been identified in colorectal, hepatocellular, and pancreatic cancer 43 as one of the most important markers of tumor-initiating cells and as an adverse prognostic factor in colon cancer, gliomas, and medulloblastoma 43,44 , and it is associated with decreased survival in a variety of human tumors, including brain, colorectum, endometrium, gliomas, liver, medulloblastoma, NSCLC, ovary, and stomach 45,46 . On the other hand, SYNPO2, ADH1B and INMT were found to be repressed in HIV-associated lung cancer. SYNPO2 (synaptopodin-2) is a repressor of tumor cell invasion that is predominant in prostate acinar epithelial and basal cells and induces the formation of complex stress fiber networks in the cell body 47 . Kopantzev also confirmed, using qRT-PCR and microarray analyses, that ADH1B and INMT were down-regulated in NSCLC compared with adjacent normal tissues 48 . Against this background, there is an opportunity to develop novel gene signatures for HIV-associated cancer that could aid early detection of HIV-associated lung cancer. These molecular changes of lung malignancy offer the hope of early detection as well as of tracking disease progression and recurrence. Such cancer biomarkers may therefore improve the management of cancer patients by enhancing the efficiency of early detection, diagnosis, and treatment.
Finally, we acknowledge that this study is exploratory research on the molecular changes of lung malignancy in HIV infection and that the current sample size is small. We hope that these molecular insights can help optimize cancer screening and prevention strategies for HIV-infected populations and guide the treatment of HIV-associated lung cancer.

Materials and Methods
Patients and tissue specimens. This
Quantitative real-time polymerase chain reaction analysis. Thirty-four RNA samples (17 pairs of lung cancer and adjacent non-cancer tissue) from 13 patients with AC, 3 patients with SCC, and 1 patient with a neuroendocrine tumor (NET) were used for qRT-PCR analysis; these comprised 7 pairs of tumor/adjacent normal paraffin-embedded specimens and 10 pairs of tumor/adjacent normal fresh tissue samples. Total RNA (1 μg) was reverse transcribed to cDNA using the First Strand cDNA synthesis kit (Invitrogen). qRT-PCR was performed using the ABI Prism 7900 HT sequence detection system and Taqman Universal PCR master mix (both from Applied Biosystems, Foster City, CA, USA). Relative expression of the mRNAs was calculated using the comparative Ct (2^−ΔΔCt) method, with 18S as the endogenous control to normalize the data.
Immunohistochemical staining. IHC staining for TFAP2A, TTF-1, p63 and SIX1 was performed on 14 formalin-fixed, paraffin-embedded tissue samples retrieved from the Shanghai Public Health Clinical Center. The study was approved by the Institutional Review Boards. Paraffin-embedded tissues were dewaxed in xylene, rehydrated in serial concentrations of ethanol, and then rinsed in PBS followed by treatment with 3% H2O2 to inhibit endogenous peroxidase. After being heated at 60 °C overnight, the sections were incubated with 10% normal goat serum at room temperature for 10 min to block non-specific reactions. This was followed by a PBS wash and incubation with Anti-TFAP2A (Abcam, ab108311), Anti-TTF-1 (Dako Corporation, CA), Anti-p63 (Novocastra Laboratories Ltd, UK), or Anti-SIX1 (Cell Signaling Technology, #16960, Danvers, MA) antibodies. SIX1, TTF-1, p63 and TFAP2A proteins appeared as brownish granules after staining. The expression status of SIX1, TTF-1, p63 and TFAP2A was scored using a 5-point scale based on the intensity of positive staining and the distribution of positive cells in 5 random high-power fields.
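The comparative Ct calculation named in the qRT-PCR subsection can be written out as a short sketch. The function and argument names are illustrative, the Ct values are made up, and the sketch assumes roughly 100% amplification efficiency, which is the usual assumption behind the 2^−ΔΔCt formula.

```python
def relative_expression(ct_target_tumor, ct_ref_tumor,
                        ct_target_normal, ct_ref_normal):
    """Fold change of a target gene in tumor versus adjacent normal tissue.

    Each Ct is first normalized to the endogenous control (18S in this
    study), then the tumor sample is compared with its paired normal sample.
    """
    delta_ct_tumor = ct_target_tumor - ct_ref_tumor     # dCt, tumor
    delta_ct_normal = ct_target_normal - ct_ref_normal  # dCt, normal
    delta_delta_ct = delta_ct_tumor - delta_ct_normal   # ddCt
    return 2.0 ** (-delta_delta_ct)


# Made-up Ct values for illustration only; not measurements from the study.
fold = relative_expression(ct_target_tumor=24.0, ct_ref_tumor=10.0,
                           ct_target_normal=26.0, ct_ref_normal=10.0)
print(fold)  # ddCt = (24 - 10) - (26 - 10) = -2, so the fold change is 4.0
```

A value above 1 indicates higher expression in tumor than in the paired adjacent tissue, and a value below 1 indicates lower expression.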
Statistical analysis. Data analyses were performed using SPSS statistical package 17.0 (SPSS Inc., Chicago, IL, USA). Statistical values are presented as the mean ± standard deviation. The Student t-test was used to assess differences between groups. A univariate analysis was performed using the Kaplan-Meier estimator method and a log-rank test. The median survival time was calculated using SPSS. p < 0.05 was considered to indicate a statistically significant difference.
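For readers who want to see what the Kaplan-Meier estimate and the log-rank test compute, a minimal pure-Python sketch follows. It is not the SPSS procedure used in the study; the function names and the follow-up times are hypothetical, and the p-value uses the chi-square distribution with one degree of freedom for a two-group comparison.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times:  follow-up time per patient (e.g. months)
    events: 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0  # no informative event times
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical follow-up data (months); not patient data from the study.
early_t, early_e = [20, 25, 18, 30, 22], [1, 0, 1, 0, 1]
late_t, late_e = [6, 9, 12, 8, 15], [1, 1, 1, 1, 0]
print(kaplan_meier(late_t, late_e))
print(logrank_p(early_t, early_e, late_t, late_e))
```

The sketch treats censored patients as at risk up to and including their last follow-up time, which is the usual convention for both calculations.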