Characterization of normal and cancer stem-like cell populations in murine lingual epithelial organoids using single-cell RNA sequencing

The advances in oral cancer research and therapies have not improved the prognosis of patients with tongue cancer. The poor treatment response of tongue cancer may be attributed to the presence of heterogeneous tumor cells exhibiting stem cell characteristics. Therefore, there is a need to develop effective molecular-targeted therapies based on the specific gene expression profiles of these cancer stem-like cell populations. In this study, the characteristics of normal and cancerous organoids, which are convenient tools for screening anti-cancer drugs, were analyzed comparatively. As organoids are generally generated by single progenitors, they enable the exclusion of normal cell contamination from the analyses. Single-cell RNA sequencing analysis revealed that p53 signaling activation and negative regulation of cell cycle were enriched characteristics in normal stem-like cells whereas hypoxia-related pathways, such as HIF-1 signaling and glycolysis, were upregulated in cancer stem-like cells. The findings of this study improved our understanding of the common features of heterogeneous cell populations with stem cell properties in tongue cancers, that are different from those of normal stem cell populations; this will enable the development of novel molecular-targeted therapies for tongue cancer.

www.nature.com/scientificreports/ squamous cell carcinomas of the tongue 8,9 . Additionally, previous studies on the mutational landscape of 4-NQOinduced tumors revealed that common mutations in human tongue carcinomas, including Trp53, Pik3ca, and Notch1, were recapitulated in the mouse model 10 . Therefore, this model was selected for this study. This study aimed to characterize cells with stem cell-like features in normal and cancerous tongue epithelial tissues. To enrich rare stem cell populations, we isolated cells from lingual epithelium of untreated and 4-NQOtreated mice and expanded progenitor cells in organoid cultures in vitro. The findings of this study demonstrated that cancer organoids mimicked the histological features of lingual cancer tissues. Single-cell RNA sequencing (scRNA-seq) analysis revealed that negative regulation of the cell cycle and the activation of p53 signaling were enriched characteristics of stem-like cells from normal organoids whereas hypoxia-related pathways, such as HIF-1 signaling and glycolysis, were upregulated in stem-like cells from cancer organoids.

Results
Induction of tongue cancer. To induce tongue cancer, Sox2-GFP mice were administered 4-NQO (100 µg/mL) through drinking water for 16 weeks. Next, the mice were administered plain drinking water for 8 weeks and sacrificed. The tongue was harvested and subjected to immunohistochemical analysis and organoid culture (Fig. 1a). The lingual epithelium of untreated mice was covered with cone-shaped filiform papillae. The expression of Krt14, p63, and Ki-67 was detected in the basal layer of the epithelium of untreated mice. In contrast, 4-NQO-treated mice exhibited thick focal lesions with disrupted epithelial organization lacking filiform papillae. The expanded basal layer exhibited widespread expression of Krt14, p63, and Ki-67, while the outer keratinized layers exhibited the expression of Krt8 (Fig. 1b).

Establishment and characterization of control and cancer lingual organoids. Sox2+ keratino-
cytes residing in the basal layer of the tongue epithelium include stem cells for both taste and non-taste lingual epithelia 11 . Furthermore, only the Sox2+ cells formed lingual organoids in vitro. To culture normal and cancerous lingual organoids, the tongue epithelia from the control (untreated) and 4-NQO-treated Sox2-GFP mice were digested and separated into single-cell suspensions. GFP + cells were isolated from the cell suspensions using a cell sorter and seeded in Matrigel for 3D organoid culture. The control organoids derived from normal tissue were round and symmetrical, whereas the 3D cultures of cells isolated from cancer tissue were characterized by large organoids that exhibited accelerated, aberrant growth and irregular shapes with bud-like proliferative structures, although some smaller organoids, similar to those from normal tissue, were formed as well (Fig. 1c). The expression of the proliferation marker Ki-67 in the cells of the outer and basal layers in cancer organoids was higher than that in the control organoids (Fig. 1d). The expression of Krt14, however, was observed only in the basal layer of control organoids but detected in the inner suprabasal layers of cancer organoids. Krt8 was widely expressed in cancer organoids but detected only in some suprabasal cells in the control organoids (Fig. 1c). These expression patterns recapitulated the immunohistochemical patterns of tongue epithelial tissue in control and cancer mice. Thus, cancer organoids can be used as an in vitro model for tongue cancer (Fig. 1b).
scRNA-seq of control and cancer lingual organoids. To analyze the gene expression profiles between control and cancer organoids in a comparative manner, organoids derived from one control mouse and one 4-NQO treated mouse were digested and separated into single-cell suspensions. The cell suspensions were subjected to cell sorting and scRNA-seq. Approximately 100,000 cells (corresponding to approximately 500 organoids) from the control sample and 60,000 cells (corresponding to approximately 150 organoids) from the cancer sample were collected and used for subsequent scRNA-seq analysis (Fig. 2a).
Clustering and StemID value determination for control and cancer datasets. Seurat was used to cluster and analyze control and cancer organoid datasets. The analysis revealed 14 and 10 clusters in the former and latter, respectively (Fig. 2b). To identify stem cells in the two datasets, the RaceID and FateID R packages were used to identify the clusters with the highest StemID values (most stem-like cell clusters). By comparing expression of the top markers for the RaceID clusters with high StemID values and their expression pattern in the Seurat object, the Seurat clusters enriched in stem-like cells could be determined ( Supplementary Fig. 1). Interestingly, while the control organoids had one cluster (RaceID cluster 7, corresponding to Seurat cluster 5) with a significantly higher StemID score than the other clusters, none of the cancer organoid clusters had such a high score, with the highest StemID score for RaceID cluster 2 (corresponding to Seurat cluster 3), but with several other clusters displaying only slightly lower scores ( Fig. 2c; Supplementary Fig. 1A). This suggests the presence of several cell populations with stem cell features in the cancer organoids, thus indicating the heterogeneous nature of tongue tumors.
Marker expression in the lingual organoids. The expression of known basal and progenitor cell markers of the lingual epithelium in the uniform manifold approximation and projection (UMAP) plots of the Seurat objects was examined. The progenitor marker Lgr6 was expressed in the stem-like cell clusters of both control and cancer organoids, whereas Hopx was mostly expressed in clusters with low StemID values ( Fig. 2d; Supplementary Fig. 1C). The number of Lgr6 + cells in the control organoids was higher than that in the cancer organoids, while the number of Hopx + cells in the cancer organoids was higher than that in the control organoids. Mki67 + cells, which divide actively, were separated in a single cluster in both datasets (clusters 9 and 8 for the control and cancer organoids, respectively). The basal cell marker Krt14 was widely expressed in both control and cancer organoids. Krt13, a marker for differentiated suprabasal cells in the lingual epithelium, was expressed in cluster 2 in the control organoids but sparsely expressed in the cancer organoids (Fig. 2e) www.nature.com/scientificreports/ cells were observed in most cancer clusters but were few in the stem-like cell cluster. In the control, on the other hand, Krt8 was mostly expressed in the stem-like cell cluster.

Trajectory analysis.
To investigate changes in gene expression during the differentiation of stem/progenitor cells in the control and cancer organoids, trajectory analysis of the Seurat objects was performed using the Monocle3 package (Fig. 3a). The clusters with the highest StemID values were denoted the root nodes and the pseudotime and trajectory paths were estimated (Fig. 3b). Selected markers for stem, proliferative, and differentiated cells were plotted in pseudotime for the control and cancer datasets (left and right, respectively, in Fig. 3c). In both control and cancer organoids, the expression of Hopx and the suprabasal keratinocyte markers Krt13, Krt4, and Krt6a increased with pseudotime while Lgr6-expressing cells were enriched in undifferentiated cells. The expression of Krt8 increased with pseudotime in the cancer organoids but was high in both stem-like and differentiated cells in the control organoids. Furthermore, the expression of Trp63 decreased gradually over pseudotime in cancer organoids but peaked in the middle of the trajectory in control organoids. The expression Integration of control and cancer organoid datasets. For direct comparison in the same plot, the two datasets were integrated using reciprocal principal component analysis (RPCA) in Seurat, which generated an integrated Seurat object with nine clusters (Fig. 4a-c). The contribution of cells from each sample is shown in the bar graph in Fig. 4d. To analyze the broad differences between control and cancer organoids, the genes that were the most significantly altered between the control and cancer organoids in all clusters were determined using Seurat. These genes were subjected to gene ontology (GO) enrichment analysis (Fig. 4e). The analysis revealed the regulation of cell adhesion, response to wounding, extracellular matrix, positive regulation of cell projection organization, and chemotaxis among the most enriched terms in the control organoids while terms such as ATP metabolism, apoptosis, response to hypoxia, and cell migration were enriched in the cancer organoids.

Characteristics of stem-like cells in control and cancer organoids.
Comparing the expression of markers for the clusters with the highest StemID values in the control and cancer data sets with their expression in the integrated object, we found that cell populations with stem-like features from the two datasets were clustered separately. Stem-like cells from the control and cancer organoids clustered in clusters 7 and 0, respectively, of the integrated object ( Supplementary Fig. 2). To analyze the differences between stem-like cells from cancer and control organoids, the genes that were most significantly altered between the two clusters were identified using Seurat. These genes were subjected to enrichment analysis to identify the pathways and biological processes that were the most significantly altered between the two clusters. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that the p53 signaling pathway was upregulated in the control cells (Fig. 5a), whereas hypoxia-related pathways, such as HIF-1 signaling and glycolysis, were upregulated in the cancer cells (Fig. 5b). GO enrichment analysis revealed that apoptotic signaling, cell cycle arrest, and negative regulation of cell population proliferation were among the most significantly enriched terms in the control stemlike cell cluster. Extracellular matrix organization, epithelial cell proliferation, and positive regulation of cell motility were enriched terms in the cancer stem-like cell cluster (Fig. 5c,d). Several cell cycle inhibitors, including Cdkn1a, Cdkn2a, Mdm2, Trp53inp1, Ccng, and Notch1-a known tumor suppressor in head and neck squamous cell carcinoma (HNSCC) 12 -were upregulated in the control stem-like cell cluster but not in the cancer stem-like cell cluster ( Fig. 5e; Supplementary Table 1). The expression of known cancer-related genes such as Sostdc1, S100a4, Crabp2, and Id3 was significantly upregulated in the cancer stem-like cell cluster ( Fig. 5f; Supplementary Table 2).

Discussion
To improve the treatment of heterogeneous malignancies, such as tongue carcinoma, tumor cells-particularly cell populations with stem cell properties-must be characterized at the single-cell level. To the best of our knowledge, there has been only one study on single-cell RNA-sequence analysis of 4-NQO-induced tongue cancer tissue in mice 13 . The authors reported that the most significantly enriched pathway in cancer-related clusters was the Myc targets regulatory pathway. Some studies performed bulk RNA sequencing of 4-NQO-induced tongue cancers. GO enrichment and pathway enrichment analyses revealed that angiogenesis, proliferation, and migration pathways were upregulated and that the citric acid cycle was downregulated in cancer tissue 14,15 . The 4-NQO-induced tumor tissue is heterogeneous and comprises both non-cancerous and cancerous regions; therefore, it is difficult to interpret the analysis of gene expression of cells isolated directly from the tissues. Additionally, this analysis may fail to detect rare stem cell and progenitor populations. We therefore chose to enrich cancer and normal progenitor cells by subjecting them to 3D organoid culture before performing scRNA-seq analysis. Some small, round organoids were observed in the culture of cells isolated from 4-NQO treated mice. However, most lingual organoids exhibited distinct characteristics including accelerated growth and the formation of bud-like proliferating structures. We previously demonstrated that organoids are generally generated by a single cell 5 . Irregularly shaped organoids should be derived from single immature cancerous cells www.nature.com/scientificreports/ with stem-like properties. The cancer organoids mimicked the histological features of tongue cancer tissue as evidenced by the expression patterns of markers such as Ki-67 and Krt14. scRNA-seq analysis indicated that, consistent with organoid growth patterns, the proliferation of stem-like cells in the control organoids was tightly regulated when compared with that of stem-like cells in the cancer organoids. The p53 signaling pathway was activated and the expression of genes involved in adhesion, apoptosis, and negative control of cell division was upregulated in the control organoids. The expression of tumor suppressor genes, such as Cdkn2a and Notch1, was downregulated in stem-like cancer cells compared to control stem-like cells. This was consistent with the situation in human oral cancer where loss or mutation of these genes are frequent 12,16 .
Cancer organoid stem-like cells showed enrichment in proliferation, motility, and migration gene sets and upregulation of pathways such as HIF-1 signaling and glycolysis. These pathways are activated by hypoxia, which is a common feature of many solid tumors including oral squamous cell carcinoma 17 . The active uptake of glucose and glycolysis under aerobic conditions (the Warburg effect) supplies energy and metabolites necessary to support the increased proliferation of cancer cells in various tissues and may be the mechanism underlying the uncontrolled proliferation of cancer organoids 18 . These findings were consistent with those of previous studies, which reported the activation of angiogenesis and hypoxia pathways in 4-NQO-induced murine tongue cancer models 14,15 . Thus, these findings support the hypothesis that organoids can be used as in vitro models for tongue cancer.
Several novel genes including Sostdc1, S100a4, Crabp2, and Id3 were identified among the top upregulated genes in the cancer stem-like cell population. Sostdc1 is a BMP/Wnt antagonist that often functions as a tumor suppressor, but was reported to promote invasion and liver metastasis in the case of colorectal cancer 19 . S100a4 is an epithelial-mesenchymal transition mediator associated with the maintenance of cancer-initiating cells in HNSCC 20 . Aberrant expression of Crabp2 is correlated with oncogenesis in lung and liver cancers 21,22 . Id3, often in concert with Id1, is essential for the self-renewal of tumor-initiating cells in several malignancies including glioma, breast cancer, and colon cancer [23][24][25] . Future studies should focus on the expression of these genes in murine lingual cancer tissue and human primary cancer samples and examine the effect of a targeted knockdown of these genes on the proliferation of cancer organoids.
To the best of our knowledge, this is the first study to analyze the gene expression profiles of control and cancer organoids derived from the mouse lingual epithelium at the single-cell level in a comparative manner. Future work should study gene expression patterns of organoids from various stages of cancer progression. This study analyzed the organoids isolated at one time point from one control and one 4-NQO-treated mouse. However, the organoid culture of lingual epithelial cells recapitulated the phenotypic and gene expression characteristics of healthy and malignant lingual epithelia. Thus, organoid culture can be used as a model to further characterize cancer cells and screen anti-cancer drugs in vitro. The findings of this study will contribute to elucidating the characteristics of heterogeneous cell populations with stem cell properties in tongue tumors, thereby enabling the development of novel molecular-targeted cancer therapies for tongue cancers based on specific genes expressed in cancer stem-like cell populations.

Induction of tongue tumors.
A stock 4-NQO solution was diluted in drinking water to a concentration of 100 µg/mL and administered to Sox2-GFP mice for 16 weeks. The mice were administered with plain distilled water for 8 additional weeks and then sacrificed. The tongue was harvested for immunohistochemical analysis and lingual epithelial cell culture.

Establishment of mouse lingual organoids.
The tongue was excised from the euthanized control and 4-NQO-treated mice and incubated with dispase (50 U/mL) for 10 min at 37 °C. The epithelial tissue was peeled off the tongue, minced into small pieces, and incubated in TrypLE Express (Thermo Fisher Scientific, Waltham, MA) for 30 min at 37 °C with pipetting every 10 min. The cell suspension was then passed through a 70 µm filter into 5 mL of culture medium. The filtrate was centrifuged at 440g for 5 min. The supernatant was removed such that 3 mL remained and the pellet was resuspended in the remaining supernatant. The cell suspension was passed through a 40 µm filter into a fluorescence-activated cell sorting tube. The filtrate was centrifuged for 5 min and the supernatant was discarded. The pellet was resuspended in phosphate-buffered saline (PBS) supplemented with 10% fetal bovine serum and incubated with 2.5 µg/mL of 7-amino-actinomycin D (7AAD) viability staining solution (Thermo Fisher Scientific) for 5 min to label the non-viable cells. The samples were then subjected to sorting in a SH800S cell sorter (SONY, Tokyo, Japan). The 7AAD − GFP + cells were sorted and collected into one well of a 96-well plate. The cells were then centrifuged at 120g for 5 min. The supernatant was removed such that only 10 µL remained. The cell pellet was then resuspended in 90 µL Matrigel on ice and seeded in a collagen-coated plate. The Matrigel was allowed to polymerize for 5 min. The cells were cultured in an organoid culture medium [Advanced Dulbecco's modified Eagle medium/F12 supplemented with penicillin/ streptomycin, HEPES (10 mM), Glutamax, N2, B27, N-acetylcysteine (1 µM), epithelial growth factor (50 ng/ www.nature.com/scientificreports/ mL), noggin (100 ng/mL), and R-spondin1 (1 µg/mL)]. ROCK inhibitor (Y-27632; 10 mM) was added to the medium for the first 3-4 days after plating but was not supplemented during subsequent changes of the medium. After a few days, organoids were visible in the Matrigel. The organoids were dispersed into single cells and replated after they reached a large size (days 7-14 post seeding).
Immunohistochemistry. Tissues and organoids were fixed in 4% paraformaldehyde, frozen in OCT compound, and sectioned into 5-µm thick sections using a microtome. The sections were transferred to slides, dried, washed in PBS, blocked with 3% bovine serum albumin (BSA)/0.1% Triton-X/PBS for 1 h, incubated with the primary antibodies overnight, washed in PBS, and incubated with the secondary antibodies. The nuclei were stained with Hoechst33342 (1 µg/mL; Thermo Fisher Scientific; H3570). Antigen retrieval using citric acid treatment at 121 °C for 20 min was performed before incubation with Ki67, Sox2, and p63 antibodies.
Antibodies. Tissues and organoids were subjected to immunostaining using the anti-Ki67 ( Microscopy. The frozen sections were imaged using a FV3000 confocal laser scanning microscope (Olympus Corporation, Tokyo, Japan). The images were analyzed using the software ImageJ (National Institutes of Health, Bethesda, MD). Live images of organoids in culture were captured using a BZ-X710 All-in-One fluorescence microscope (Keyence Corporation, Osaka, Japan). Downstream analysis of 3′-scRNA-seq data. The scRNA-seq output data were processed with the Cell Ranger pipeline (10 × Genomics) against the mouse reference (mm10) datasets. The gene-barcode matrices were analyzed and visualized using the Seurat R package (version 4.0.1) 26 . Genes expressed in less than 3 cells and cells containing less than 200 genes were filtered out. To remove potential doublets and low-quality cells, cells with unique gene counts of more than 6500 or a mitochondrial gene percentage greater than 10% were discarded. After filtration, the control and cancer organoid datasets contained information on 17,642 genes expressed in 9164 cells and 17,134 genes expressed in 12,378 cells, respectively. The expression of each gene was normalized, and the data were log-transformed. Principal component (PC) analysis was performed using highly variable genes. The first 40 PCs were used to generate cell clusters with the Louvain algorithm (14 and 10 clusters in the control and cancer datasets) at a resolution of 0.5. Nonlinear dimensional reduction and visualization were performed using the UMAP algorithm and the PCs selected above. The RaceID3 algorithm was used to identify stem cell-related clusters 27 . The datasets of cells with more than 2000 transcripts per cell were filtered. Mitochondrial and ribosomal genes were excluded. K-medoids cluster calculations were performed. The lineage tree was calculated using the StemID2 algorithm included in the RaceID3 R package. The number of significant links and the transcriptome entropy (reflecting the uniformity of the transcriptome) were calculated for each cluster. The StemID score for each cluster was determined by multiplying the link number and Δentropy. Marker gene expression was determined for the RaceID clusters. The expression of these markers in the RaceID object and the corresponding Seurat object was used to determine which Seurat cell cluster corresponded to each RaceID cluster. Trajectory/pseudotime analysis was performed using the Monocle3 algorithm 28 . To order the cells according to pseudotime, the clusters with the highest StemID scores were chosen as the root nodes. The control and cancer datasets were integrated into one dataset using the reciprocal PCA method in Seurat to allow direct comparisons.

scRNA-seq.
KEGG pathway and GO enrichment analyses. Differentially expressed genes between cancer stemlike cell and control stem-like cell clusters (for the integrated Seurat object) or those between the control/cancer stem-like cell clusters and other cell clusters (for the individual control and cancer Seurat objects) were identified using the Seurat FindMarkers function with default parameters. KEGG pathway enrichment analyses of these differentially expressed genes and the graphical representation were performed using the WEB-based GEne SeT AnaLysis Toolkit (http:// www. webge stalt. org/) 29 . KEGG is developed by Kanehisa Laboratories (http:// www. kegg. jp/ kegg/ kegg1. html). GO enrichment analysis was performed using the web-based Metascape database (https:// metas cape. org/) 30 .

Data availability
The datasets analyzed in this study are available from the corresponding author upon reasonable request.