High expression of oxidative phosphorylation genes predicts improved survival in squamous cell carcinomas of the head and neck and lung

Mitochondrial activity is a critical component of tumor metabolism, with profound implications for tumorigenesis and treatment response. We analyzed clinical, genomic and expression data from patients with oral cavity squamous cell carcinoma (OCSCC) in order to map metabologenomic events which may correlate with clinical outcomes and identified nuclear genes involved in oxidative phosphorylation and glycolysis (OXPHOG) as a critical predictor of patient survival. This correlation was validated in a secondary unrelated set of lung squamous cell carcinoma (LUSC) and was shown to be driven largely by over-expression of nuclear encoded components of the mitochondrial electron transport chain (ETC) coordinated with an increase in tumor mitochondrial DNA copy number and a strong threshold effect on patient survival. OCSCC and LUSC patients with a favorable OXPHOG signature demonstrated a dramatic (>2fold) improvement in survival compared to their counterparts. Differential OXPHOG expression correlated with varying tumor immune infiltrates suggesting that the interaction between tumor metabolic activity and tumor associated immunocytes may be a critical driver of improved clinical outcomes in this patient subset. These data provide strong support for studies aimed at mechanistically characterizing the interaction between tumor mitochondrial activity and the tumor immune microenvironment.

Code was written in Python (see below) to adapt the consensus clustering scheme to Ward's method of agglomerative hierarchical clustering. Essentially, "N" number of input samples are randomly resampled by taking a fixed proportion "p" of starting samples and individually running Ward's clustering for "K" number of resamplings. After each individual clustering run, the dendrogram is divided into increasing numbers of clusters ranging from 2 to "c", where the input "c" corresponds to the maximum number of cluster divisions considered. For a given value of "c" the program then calculates the total fraction of times that any two samples are grouped together in the same cluster over K resamplings. The matrix of co-localization proportions for each value of "c" is referred to as the similarity matrix. For each "c" the similarity matrix is used as the final input to then perform Ward's hierarchical agglomerative clustering to produce the consensus cluster groupings with the same choice of "c" clusters. Thus, a different similarity matrix is produced for each choice of cluster numbers ranging from 2 to "c". For the analyses in this manuscript, K=300, p=0.8, and c was set to a maximum of 15. For consensus clustering of the second dimension (e.g., genes or features), sample-wise Z scores for each feature were first calculated before transposing the input table, and then the same consensus clustering algorithm was performed independently to find the optimal number of clusters for features. Dendrograms created in JMP using consensus matrix values from the optimal cluster choices (described below) were combined with Z-score heatmaps (samples along the horizontal axis and features across the vertical axis) to generate the final heatmap figures.
Optimal choice of cluster number: A method for choosing optimal cluster numbers has been described 21 however; it still requires visual inspection. We approached the problem with a novel calculus (See Supplementary Figure 1). An optimal choice of "c" clusters is the one that generates the best similarity matrix. The ideal similarity matrix ordered horizontally and vertically according to clusters should have diagonal blocks of values equal to 1 within clusters and equal to zero outside of clusters because elements of the cluster should group together 100% of the time during resampling but 0% of the time with outside elements. However, the ideal consensus cluster matrix will have different numbers of "1s" and zeros depending upon the number of clusters chosen.
To overcome this limitation, the similarity matrix is transformed so that values outside the cluster blocks are defined as the probabilistic compliment or 1 minus the existing value. This redefines values outside the cluster blocks as the proportion of times the elements do not cluster together, which theoretically should be 100% or a value of 1. Now, the ideal transformed consensus matrix has values of 1 in every cell regardless of how many clusters are chosen and comparisons can be made across choices of "c" number of clusters The goodness of the cluster choice can be taken as the deviation of the transformed consensus matrix from the ideal transformed consensus matrix (i.e. the identity matrix), by calculating the Euclidean distance in matrix space (Supplementary Figure 1). Because the Euclidean distances will be proportionate to the square root of the total sum of squares in the formula (equal to N^2), the distances can be normalized by dividing by the number of samples in a cohort to yield a Normalized Euclidean Distancemaking it possible to compare metric across different cohorts with differing numbers of starting samples.
In practice, the NED will fluctuate with the choice of "c" number of clusters but theoretically will approach a value equal to zero when the number of clusters is equal to the number of samples. However, there is a price to be paid for choosing too many clusters, resulting in a loss of information about relatedness that defeats the original purpose. Therefore, the best practice is to plot the NED versus number of "c" chosen clusters and identify local minima, so that both NED and number of clusters is simultaneously optimized.
An example pot of NED for data in this paper are depicted in Supplementary Figure 7.
Immune subset gene lists: We began with a published list of genes previously used to distinguish leukocyte subsets for ssGSEA 22 and further filtered the list by vetting genes based on their crosscorrelation. Genes that are unique for immune subsets and truly specific for leukocytes should be tightly correlated. Cross-correlation coefficients of genes in pooled RNA-Seq data from the HNSCC and LUSC cohorts were clustered and heat maps generated with JMP software to identify genes that were highly correlated (e.g., Supplementary Figure 2) for each immune subset. For genes to be selected, the geometric mean of their correlations with other genes defining the immune subset had to be ≥ 0.3 (e.g., import pandas as pd import numpy as np import random, math, time, openpyxl from sklearn.cluster import AgglomerativeClustering from openpyxl.styles import PatternFill from scipy.stats import zscore from scipy.spatial.distance import euclidean # VARIABLES TO EDIT PER RUN # enclose these variables in quotes path = "C:\\"your path"\\"file name".xlsx" # directory of input file (double backslash (\\) # single backslash (\) | periods (.) in excel file name are invalid sheet = "Sheet1" # name of the sheet containing the matrix input sample_column = 'samples' # name of sample column (exactly what is contained in cell A1) | if noth # enter "samples" (without quotes) in A1 and set this variable to "samples" (with quotes) # do not use quotes for these variables c = 8 # clusters k = 300 # resamples p = 0.8 # resample proportion t = 50 # number of rows by which time increments will be reported z = True # True = calculate z scores treating vertical columns as separate datasets (running as gen # False = do not calculate z scores (running as features, data set must already be in

Supplementary Figure Legends
Supplementary Figure 1. Method to calculate normalized Euclidean distance from a similarity matrix to inform optimal cluster choice. (A) Mock similarity matrix output of the type generated from consensus clustering (See Supplementary Methods) for an artificial data set with 9 samples labeled "A"to"H" (left panel). In this theoretical example, the number of clusters "c" after each re-sampling was set to equal 3 to count the number of times any two samples end up in the same cluster. In practice, the procedure is repeated for different values of "c" over a chosen range (e.g., 2 ≤ c ≤ 15) with the desired number of re-samplings "K", but keeping the same re-sampling input sets common across different values of c. Values in the matrix values represent the fraction of times two samples co-cluster together for a given choice of c. The similarity matrix values themselves are subsequently used for clustering to generate "c" number of consensus clusters (e.g., pink, grey, and light blue). The heatmap and cluster tree for the example consensus clusters (A,C, B; D,E; and F,I,G, H) is pictured in the right panel. (B) The similarity matrix is then transformed, so that values outside the consensus clusters (e.g., colored diagonal blocks) are substituted with the probabilistic complement or fraction of times two samples do not co-occur in the same cluster (i.e., 1-similarity matrix value) to yield a transformed similarity matrix (left panel), also represented as a heatmap (right panel). (C) If consensus clustering were perfect, the theoretical transformed similarity matrix would have values of 1 in every cell because samples in the same cluster should always cocluster together and never cluster with outside members, 100% of the time. (D) Formula for normalized Euclidean distance in matrix space, which defines how far apart the observed transformed matrix for a given value of c is from theoretical perfection or the identity matrix.

Supplementary Figure 2. Heatmap for gene expression cross correlation coefficients among the 15 genes previously published to define cytotoxic cells.
RNA expression from the TCGA OCSCC and LUSC tumor cohorts was pooled and cross correlation coefficients calculated among all 15 candidate genes. Correlations were then analyzed by hierarchical clustering to depict the subset of genes best correlating. Genes that clustered together graphically with good correlation were further analyzed as described in the methods to identify the subset of genes with correlations ≥ 0.3 (e.g., Supplementary Tables 4 and 5). Heatmaps representing unsupervised hierarchical consensus clustering of patient tumors based on expression of nuclear-encoded mitochondrial pathway genes (top left), glycolysis genes (top right), and pentose phosphate pathway genes (bottom left). (B) Patients in cluster 1 with highest expression of mitochondrial pathway genes had significantly better survival than patients in clusters 2, 3 and 4 grouped together. In contrast, patients in clusters 6 and 7 with highest glycolysis pathway gene expression(right) had significantly worse survival compared to patients in cluster 1. None of the survival differences for patients clustered by pentose phosphate pathway gene expression approached significance. P-values were calculated in comparison to cluster 1 for all graphs. Figure 7. Graphical approach to identify optimal values for "c" (i.e., number of consensus clusters) using nuclear encoded mitochondrial pathway gene expression data from the OCSCC cohort. (A) Normalized Euclidean distances (NEDs) for transformed similarity matrices plotted over a range of possible consensus cluster values following hierarchical consensus clustering of OCSCC patient samples. Red symbols indicate local minima and acceptable choices for number of consensus clusters, while the blue symbol represents the number of clusters actually used for the survival analysis in the text because it corresponds to a good trade off between NED minima and loss of information from too many clusters. (B) Plot of NEDs for transformed similarity matrices plotted over a range of possible consensus clusters following hierarchical consensus clustering of the second dimension (genes). The optimal number of consensus clusters for genes is 5 (blue symbol). For more details see Supplementary Methods.

Supplementary figure 8. A composite list of metabolic genes involved in mitochondrial oxidative phosphorylation and cellular glycolysis identifies a cluster of patients with good prognosis in OCSCC and LUSC.
(A) OCSCC patients were clustered based upon combined expression pattern of manually curated metabolic genes from Supplementary Tables 1 and 2, which included 36 glycolysis pathway members and 89 nuclear encoded mitochondrial genes involved in oxidative phosphorylation. Patients in cluster 1 with high expression of oxidative phosphorylation genes and low expression of glycolysis pathway genes had very high survival compared to patients in clusters 2,3, and 4. (B). A similar pattern of patient clusters was found for LUSC where patient cluster 1 also had improved survival relative to other clusters. P-values for Kaplan-Meier curves from both cancers represent comparisons between cluster 1 and other clusters. Figure 9. Overlap of metabolic genes from the manually curated lists and the 118 OXPHOG gene list generated from GO search terms. A Venn diagram illustrates that only 20/36 manually curated glycolysis genes and 45/79 manually curated oxidative phosphorylation genes (OXPHOS =mitochondrial pathway genes) were in common with the 118 GO gene list of OXPHOG genes that also included 53 unique genes. where patients with the highest expression of oxidative phosphorylation genes (i.e., cluster 1) now had the worst MS. P-value reflects comparisons to cluster 1. Figure 12. Comparison of OXPHOG patient clusters in OSCC and HPV+ OPSCC patients. Pooled unsupervised hierarchical consensus clustering of tumor samples from both subsites was performed. Original cluster assignments for samples when just OCSCC (e.g., Figure 1) or just HPV+OPSCC (e.g., Supplementary Figure  11) were considered are annotated beneath the samples (e.g., red boxes were originally in cluster 1). The new pooled cluster 1 is composed almost exclusively of OCSCC samples, with most HPV+ OPSCC samples originating from OPSCC cluster 1 now associated with OCSCC samples originally from cluster 2. The ratio of geometric means between OCSCC samples originally in cluster 1 and HPV+ OPSCC tumors originally in cluster 1 (but now in pooled cluster 2) is annotated vertically. Despite being grouped in distinct clusters when considered together, OCSCC cluster 1 tumors do not differ that much from HPV+OPSCC tumors originally found in cluster 1 when the sub-sites are clustered separately. Figure 13. Validation of the muscle-specific gene list. The 32 muscle specific genes were used in a ssGSEA of RNA expression from normal skeletal muscle, esophageal mucosa, and normal lung samples available in the GTEX database. As expected, ssGSEA scores from muscle tissue are significantly higher (P<0.0001) than the other normal samples by several orders of magnitude.