Molecular characteristics and spatial distribution of adult human corneal cell subtypes

Bulk RNA sequencing of a tissue captures the gene expression profile from all cell types combined. Single-cell RNA sequencing identifies discrete cell-signatures based on transcriptomic identities. Six adult human corneas were processed for single-cell RNAseq and 16 cell clusters were bioinformatically identified. Based on their transcriptomic signatures and RNAscope results using representative cluster marker genes on human cornea cross-sections, these clusters were confirmed to be stromal keratocytes, endothelium, several subtypes of corneal epithelium, conjunctival epithelium, and supportive cells in the limbal stem cell niche. The complexity of the epithelial cell layer was captured by eight distinct corneal clusters and three conjunctival clusters. These were further characterized by enriched biological pathways and molecular characteristics which revealed novel groupings related to development, function, and location within the epithelial layer. Moreover, epithelial subtypes were found to reflect their initial generation in the limbal region, differentiation, and migration through to mature epithelial cells. The single-cell map of the human cornea deepens the knowledge of the cellular subsets of the cornea on a whole genome transcriptional level. This information can be applied to better understand normal corneal biology, serve as a reference to understand corneal disease pathology, and provide potential insights into therapeutic approaches.

The human cornea serves as a barrier against the external environment and provides the main refractive power to focus light to the retina. The cornea has five distinctive layers, three composed by cellular elements (epithelium, stroma, endothelium) and two by extracellular membranes (Bowman's and Descemet's membranes). The molecular, biomechanical and structural attributes of each layer is associated with characteristic cellular composition. The outermost stratified epithelium provides a protective barrier between the environment and the anterior chamber of the eye. Superficial epithelial cell exuviation necessitates renewal by epithelial stem cells located in the corneal limbus (LESC) that differentiate into early progenitor and transient amplifying cells which continually differentiate and migrate to replenish the epithelium. Structural integrity and clarity are provided by a thick stromal layer (about 90 percent of the cornea). Stromal keratocytes deposit and maintain the extracellular matrix that provides corneal structure and transparency. The corneal endothelium (CenC) is a monolayer that unlike the epithelium does not divide. This layer separates the cornea from the aqueous humor with extensive tight junctions to maintain barrier functions to pump fluids from the stroma in order to maintain proper hydration and corneal clarity.
The molecular bases underlying these highly distinct physiologic functions have not been fully elucidated by previous technology platforms, including proteomics, microarrays, and bulk RNAseq [1][2][3][4][5][6][7][8] . We aimed to generate a single-cell RNAseq transcriptome of the human cornea to characterize the functions of each cell type at the whole genome transcriptomic level and gain insights into the corneal cell organization using RNAscope. Single-cell RNAseq offers the ability to investigate both prominent and rare cell populations within the same complex tissue and has been successfully utilized for studying ocular cell subtypes [9][10][11][12][13] . Here we present a single-cell RNAseq transcriptomic map from six healthy adult cornea samples. Clustering analysis revealed 16 unique cell clusters and captures all major cellular regions of the cornea.
This transcriptome map of the healthy human cornea could be utilized to better understand the cell populations affected by corneal dystrophy mutations and inform potential cellular and gene therapy approaches. Furthermore, transcriptomic information can provide functional insight into mechanisms of related polygenic corneal disorders and those associated with complex systemic or other organ diseases.

Molecular characteristics of cornea endothelial cells (CenCs).
Previous studies aiming to characterize human CenCs were either based on low throughput analysis of a subset of genes or genome-wide transcriptomic analysis of cultured and expanded CenCs which may not always fully reflect primary tissues [14][15][16][17][18] . Single-cell RNAseq of primary tissues better ensured CenC purity and provides a comprehensive genome-wide transcriptomic profile which cannot be achieved with bulk RNAseq approaches 3,19 .
Besides those shown in Fig. 1c, Fig. 2a includes an expanded set of CenC specific or enriched marker genes (heatmap: Fig. S2; full gene list: Table S1) RNAscope results indicated that the expression of CA3, SLC4A11 and RGS5 were restricted to the CenC monolayer confirming their identity (Fig. 2b). Previously, CA3 and SCL4A11 have been reported to be highly expressed in CenCs 3,15,19 . We also find functionally related family members, CA12 and SLC4A4, to be highly enriched in CenCs (Fig. 2a). CA3 and CA12 belong to a family of carbonic anhydrase (CA) proteins involved in fluid transport, a major CenC biological function.
Among the CenC marker genes in Fig. 2a, MSMP and EDN3 are highly CenC enriched. TMEM178A is reported as a quality assessment marker for CenC cultures 16 . Both WNT and FGF pathways are manipulated during in vitro CenC differentiation and expansion for potentially engineering CenC grafts 20 . We find FZD2, a component in WNT signaling pathway, and FGF7 and FGFR1, components of FGF signaling pathway, are enriched in CenCs (Fig. 2a). Pathway analysis suggests a higher expression of genes involved in cellular respiration in CenCs compared to other cornea cells: SLC4A11, SLC4A4, and N + /K + ATPases ATP1A1, ATP1B1, ATP1B2 (Fig. 2c). This finding aligns with the key CenC function of actively transporting fluid requiring high metabolic activity.
RGS5 has been considered as a marker of pericytes which are associated with vascular endothelial cells. In the cornea we find RGS5 highly enriched in CenCs (Fig. 2a,b). The negative expression of another pericyte marker PDGFRB ruled out the possibility that this cluster was a mixture of CenCs and pericytes and supports RGS5 as another CenC marker (Table S2).
Molecular signature underlying key cornea stromal functions. The cornea stroma is a thick layer primarily comprised of highly ordered layering of collagens that provide structure and strength to the cornea while also permitting passage of light. The dominant cell type in the stroma is the keratocyte. The stroma cluster (Stro) markers include classical keratocyte markers: KERA, LUM, DCN ( Fig. 3a; heatmap: Fig. S3). Mutations in these genes are associated with stromal corneal dystrophies. As evidenced by knock-out mice with abnormal corneas, KERA is critical in maintaining proper corneal shape 21 and LUM is required for ordered stromal collagen fibral spacing for corneal transparency 22 . Both DCN and KERA RNAscope results demonstrate staining throughout the stroma resembling the expected locations of keratocytes (Fig. 3b). DCN transcript was also found lightly staining basal cells in the limbus and peripheral cornea (Fig. 3b), which agrees with our observation that the expression of these genes is highly enriched in but not unique to the keratocytes (Fig. 3a). Similarly, in human skin DCN is found both in keratocytes and suprabasal epidermis layer 23 . Table 1. Cell numbers contributing to each cell cluster from each donor cornea.

Molecular signature of limbal progenitor cells (LPC), transiently amplifying cells, and other basal epithelial cells.
One of the characteristics of cornea epithelial cells is the ability to continually self-renew. Previous bulk transcriptomics of peripheral cornea and limbal niche report upregulated genes for cell cycling and proliferation in the cornea which is reflective of maintaining epithelial turn over 6 . The limbal epithelial stem cells (LESCs) reside within the basal layer palisades of Vogt and have been challenging to characterize within the heterogeneous limbal niche. www.nature.com/scientificreports/ We identified marker genes for LPC-1 (Table S1), with representative genes presented in Fig. 5a. We hypothesize that LPC-1 are early limbal progenitor cells stemming from LESC differentiation due to their enriched expression of genes expected of LESC and other stem cells from previous reports: SCRG1, FRZB, and GPHA2 (Fig. 5a). Additionally, LECT1, NPPC, and CDH19 are also highly enriched in LPC-1 compared to the remaining epithelial cell clusters (Fig. 5a, heatmap: Fig. S5). GPHA2 is highly enriched in LPC-1, but also expressed in LPC-2 and Epi-B1 ( Fig. 5a) with RNAscope demonstrating expression in basal cells within the limbus and the limbal-adjacent peripheral cornea, with no staining in the basal epithelium of the central cornea (Fig. 5c). We find FRZB predominately expressed in the limbus and absent from the central cornea (Fig. 5c). RNAscope suggests that LPC-1 enriched genes can also be expressed in a few clusters which reside in the basal epithelium of the peripheral cornea. Therefore, LPC-1 was chosen as the origin point of the other corneal epithelial cell clusters in our PAGA and pseudotime analyses ( Fig. 4c-e). www.nature.com/scientificreports/ LPC-2 is most similar in gene expression to LPC-1 (Fig. 4a,c,d) and may be the next transitional state of late limbal progenitor cells. For instance, MMP10 is found diffusely staining in the basal cell layer of the limbal niche and is enriched in both LPC-1 and 2 ( Fig. 5a,d). To better capture the unique signaling pathways defining LPC-1 and LPC-2, we performed pathway analysis on differentially expressed genes. The top-most enriched signaling pathways included p53, p63 and c-Myc (Fig. 5b). These pathways are important for cell growth, differentiation and apoptosis and were down-regulated in early progenitors (LPC-1) as compared to late progenitors (LPC-2) (Fig. 5b).
Stem and early progenitor cells in healthy corneas infrequently divide whereas transiently amplifying cells are often dividing and can be identified by proliferation markers. Epi-B1 have distinctive expression of cell cycle-related genes, such as PCNA, TOP2A, MKI67 and UBE2C (Fig. 1c). CKS2, is highly enriched in Epi-B2  (Fig. 5e). Epi-B2 represents a trajectory stemming from related limbal and basal cell clusters (Fig. 4c-e). WNT10A is expressed in LPC-1,2,Epi-B1,B2 and is most enriched in Epi-B2 (Fig. 4a). RNAscope for WNT10A demonstrates staining in basal cells beginning within the limbal region, continuing into the periphery, and extending into the central cornea (Fig. 5e). WNT10A is reported to be upregulated in central cornea compared to the limbal niche 30 . Therefore, Epi-B1 are likely transiently amplifying cells which move from the limbus towards central cornea contributing to basal epithelial cells along with Epi-B2. www.nature.com/scientificreports/ Molecular characteristics of conjunctival and corneal epithelial cells. Genes differentially expressed between Epi-S1,2,3 and Conj-1,2,3 were identified and subjected to NextBio correlation analysis which identified a cornea and conjunctiva comparison in the database (Fig. S6) 31 . The top overlap genes distinguishing the two types of epithelial are demonstrated in a heatmap (Fig. 6a). KRT13 and KRT12 staining was used to confirm the locations of the bulbar conjunctival epithelium and the start of the superficial mature corneal epithelium respectively, as previously reported 31,32 . KRT13 is enriched in the conjunctiva and extends over the superficial layer of the limbal region whereas KRT12 is enriched in the mature cornea and is seen from the peripheral and extending throughout the central cornea (Figs. 4a and 6b). The bulbar conjunctival epithelium (Conj-1,2,3) extends into the corneal region and resides superficially within the limbal region and peripheral-most cornea (Figs. 4a, 6b,c). KRT4 and AQP5 are enriched within the superficial epithelial layer of the limbal region with a few cells found within the peripheral cornea (Fig. 6c). CEACAM7 is highly enriched in the Conj-3 cluster compared to Conj-1,2 and marks cells on the further edge of the superficial bulbar conjunctiva above the limbus (Fig. 6c). These conjunctival markers are used to distinguish from adjacent corneal epithelial cells 31,33 .
Epi-S1,2,3 represent the central mature superficial corneal epithelium and comprise a developmental branch in the corneal epithelial trajectory (Figs. 4a, 6b,d). RNAscope of enriched marker gene KRT3 demonstrates strong staining within the superficial layers of the central cornea with sparse staining in the limbus and peripheral cornea, and no staining in basal cells or palisades of Vogt (Fig. 6d). KRT3 expression is associated with the differentiation of transient amplifying cells into mature superficial epithelium and is absent from basal and limbal epithelium [34][35][36][37] . MAL staining was also enriched in the superficial cells of the central cornea, with more positive cells in the limbus and peripheral cornea than KRT3 (Fig. 6c). MAL plays a role in lipid-raft-mediated apical sorting of epithelium proteins 38 . www.nature.com/scientificreports/ RNAscope staining of LYPD2 (Epi-S3, Conj-3) and KRT24 (Epi-S3) demonstrate that these are expressed within the superficial-most epithelial layers (Fig. 6e). KRT24 has a few positive cells in the superficial layer of the peripheral cornea, but no staining in the limbus (Fig. 6e) which agrees with a report of high expression in superficial cornea epithelium 37 rather than conjunctiva.
Finally, Epi-T cells are defined by the lack of unique marker genes and sharing the expression of marker genes of many other clusters, suggestive of a transitional epithelial state (Fig. 4a,c).
To further characterize the epithelial clusters, we examined the expression of genes key to epithelial functions. Multiple cellular junction genes were found to be expressed at lower levels in Conj-1,2,3 conjunctival than in Epi-S1,2,3 corneal epithelium (Fig. 6f). Several connexins (GJA2, GJB2, GJB4, GJB6) were significantly differentially expressed which may be important in coordinating cell-to-cell communication and signals among epithelial cells 39 . Desmosome components, DSG1 and DSC2, which provide structural strength to adhere neighboring cells together, are enriched in the corneal epithelial layer (Fig. 6f). DSG1 is known to be enriched in the skin's superficial epithelial layer and we demonstrate a similar pattern in the cornea (Fig. 6f). This reflects the more prominent function of maintaining a tight physical barrier and cell-cell adhesion in the corneal as compared to the conjunctival epithelium.
A wide range of keratins have been reported to be highly expressed in cornea and contribute to the global corneal signature in comparison to other ocular regions 2,37 and are the most abundant proteins expressed by corneal epithelial cells 4 . Keratins play important roles in maintaining corneal transparency by organizing extracellular collagen proteins and retaining water in the cornea through negatively charged glycosaminoglycans. Keratin expression patterns are also utilized to identify epithelial differentiation states, as well as residing in the conjunctiva or cornea. We examined the expression patterns of the top four keratins across epithelial subtypes, and similar to the gene expression profiles outlined above, the relative expression of keratins also defined related clusters based on anatomical location and biological functions (Fig. 6g). KRT13 is enriched in the conjunctiva (Conj-1-3) while KRT12 dominates the mature corneal epithelial cells (Epi-S1-3) (Fig. 6b,g). KRT14 and KRT15 are expressed predominately in clusters residing in the basal epithelial layer across all cornea regions (LPC-1,2,Epi-B1,B2,T) (Fig. 6g). Transitional Epi-T has homogenous expression of all four keratins (Fig. 6g). Together with the physical locations of epithelial sub-clusters we obtained from RNAScope, our observation provides a spatial location of keratin expression patterns.
The majority of monogenic diseases display pathology in a single layer and require corneal grafts for treatment. Reported causal genes for monogenic corneal dystrophies and their expression patterns among the clusters is shown in Fig. 7 (heatmap: Fig. S7). As expected, the causal genes are predominately enriched in clusters where the pathology occurs.
There is a category of seven corneal dystrophies, epithelial-stromal TGFBI dystrophies, that are all caused by 70 reported mutations in TGFBI [42][43][44] . TGFBI (kerato-epithelin) is an extracellular protein enriched in the epithelium. It is a critical protein for maintaining structure in the corneal epithelial and stromal layers through its cell adhesion and cell-collagen interactions. Expression of TGFBI is widespread among the clusters which corresponds to the wide range of mutation-associated pathologies (Fig. 7).

Discussion
A strength of this study is that it provides a deep single-cell-based transcriptomic-based identification of 16 cell clusters from six healthy adult human corneas and adjacent bulbar conjunctiva. We were able to generate whole-genome transcriptomic profile for each cell type identified and map out relationships between epithelial cell clusters. We are confident in our assignment of the small but important LPC-1 and CenC cells as early limbal progenitor and corneal endothelial cells, but the paucity of them in our population prevented further in-depth characterization of their transcriptome. Early LPC cells are challenging to distinguish from LESC, due to the absence of exclusive markers or sufficient cell numbers, making it difficult to assert LESC with great confidence. Future studies including additional healthy corneas to collect larger cell numbers for each cluster would be advantageous to increase the chances of detecting low-expression genes particularly in rare cell types. A recent report of single-cell RNAseq which included four healthy adult human corneas also revealed a diverse set of unique clusters from the cornea and conjunctiva, identifying similar key genes and cell types 45 . Collectively, these two studies strengthen the transcriptomic map foundation of the healthy adult human cornea.
The location of the clusters in human cornea based on gene expression profiles and as shown by RNAscope is graphically summarized in Fig. 8. This single-cell transcriptomic map offers further detailed insight into the complexity of the human cornea which is best reflected in the wide heterogeneity of classified epithelial cells. There are two major superficial/suprabasal epithelial groups, the conjunctival (Conj-1-3) and the corneal (Epi-S1-3) epithelium. The peripheral cornea is a transition zone of dividing and migrating cells which is reflected by the gradient of cell types and genes shared among the peripheral cornea clusters. We propose a trajectory of epithelial cells originating from the early LPC (LPC-1), transitioning to late LPC (LPC-2) and transiently amplifying cells (Epi-B1) and then to transitional cells (Epi-T) and centrally enriched basal epithelial cells (Epi-B2), www.nature.com/scientificreports/ and ultimately to the mature superficial corneal epithelium (Epi-S1-3) (Figs. 4c,d, 8b). These cellular transitions represent a complex development and migratory process underlying the remodeling of the epithelial cells. We aimed to obtain whole-genome transcriptomic profile of primary unmanipulated CenCs. Despite only obtaining 28 CenCs, the identification is clear based on the unique transcriptome. Since the CenC is a monolayer, this population is considerably smaller than the epithelial or stroma populations. Due to the small cell number and low detection sensitivity of scRNASeq technology, we most likely failed to detect many low-level expression genes in CenCs. Nevertheless, the detection of 11,706 genes provided a better insight of primary CenC biology than what was previously available. For instance, manipulation WNT or FGF pathways was utilized in CenC culturing due to high gene expression of family members 20 . Our CenC signature provides molecular characteristics that can be used for monitoring the quality of cultured and expanded CenCs for potential cellular therapy.
The LESC are essential for maintaining cornea epithelial homeostasis and wound healing. Wnt signaling is essential for LESC proliferation and maintenance 46,47 and we found expression of members in LPC and basal layer cells. Limbal epithelial stem cell deficiencies (LSCD) result in abnormal epithelial wound healing, neovascularization, and vision loss 48 . Limbal transplants offer transient restoration due to allograft failure at a higher rate than central corneal transplants 49 . Due to the heterogenous nature of the limbal region and diverse types of cells present, it has been challenging to identify resting primary LESC and LPC markers. Numerous genes have been suggested to be LESC/LPC markers, but many can also be detected in other corneal cells 6,7,34,[50][51][52][53] . The enrichment of a gene in the limbal niche does not indicate that it is exclusive to LESC or surrounding cells. We propose that the enriched genes comprising the signature of LPC-1 indicates they are the earliest progenitor cells arising from LESC differentiation. Expression of CDH19, GPHA2, and FRZB have been reported in several gene profiling studies of human LESC or limbal niche 6,7,52,54 . Interestingly, a recent study independently also identified GPHA2 as a novel putative LPC marker using scRNAseq and found its expression to be lost as LPC cells are expanded in vitro 45 . FRZB has been found enriched in the palisades of Vogt with staining in spots along basal peripheral, but not central, cornea 6,30 . Additionally, we find SCRG1, LECT1, NPPC, and CDH19 to also be highly enriched in LPC-1 and these genes are reported in other stem and early-progenitor cell populations of the bone and peripheral nervous system [55][56][57][58][59] . A more comprehensive understanding of unique genes in the LPC, limbal stem cell niche, and transient amplifying cells may offer insight into the underlying pathological origin of corneal diseases and extended damage from injury.
Apart from identifying unique markers for the regions of the epithelium, analysis of differential expression of gene families reveals patterns associated with conjunctival and corneal epithelial regions. Connexins were highly expressed in the superficial corneal epithelium. These gap junction proteins permit cell-to-cell communication and are found in various epithelial tissues, have differential cell and tissue isoform distributions, and are involved with wound healing responses 39 . Mutations in GJB2, GJB6, and GJA1 result in keratitis, corneal opacity, skin disorders, and hearing loss 60,61 . Multiple connexin isoforms are found with differential spatial expression in corneal epithelium and display altered patterns in diseased or injured corneas [62][63][64] . Desmosomes maintain strong www.nature.com/scientificreports/ attachments between cells and the weak corneal epithelial attachment in vitamin-D receptor knockout mice has been attributed to decreased expression of DSG1 and DSC2 65 . We find enrichment of both gene families in corneal epithelium as compared to the conjunctival epithelium highlighting the critical need for tight regulation of cell connections in maintaining the cornea epithelial barrier function in an avascular environment. The top four expressed keratins, KRT12, KRT13, KRT14, KRT15, showed differential expression patterns among the epithelial clusters (Fig. 6g). KRT13 was enriched in the bulbar conjunctival epithelium and KRT12 was enriched in the corneal epithelium, a pattern consistent with previous reports 31,32,66,67 . Meesmann dystrophy presents with unstable and fragile corneal epithelium, and is associated with KRT3 or KRT12 mutations, both of which are corneal epithelial differentiation markers and not highly-expressed in the basal or limbal epithelium 37,68 . KRT14 and KRT15 have previously been reported by several groups as potential LESC markers 26,52,53,69 , presumably due to expression in the basal layers as we find in this present study. KRT14 is present in basal cells of stratified epithelium in skin and cornea 37 . Furthermore, we see that KRT15 is absent from the central cornea clusters while KRT14 is maintained to a small degree which corresponds to findings from the bovine and mouse cornea 70,71 .
A meta-analysis of whole cornea tissue transcriptome compared to other ocular tissues revealed enrichment for collagen metabolism and extracellular matrix organization 1 which is not surprising due to the relatively large stromal contribution to the cornea thickness. Over 15% of cells in our analysis are keratocytes within a single cluster. Keratocytes are fibroblasts which are normally quiescent and are stimulated by injury to repair the cornea by transitioning to different phenotypes 72 . Keratoconus is a multi-factorial disease where keratocyte and stromal pathology are evident. Keratocyte apoptosis and stress-related dysfunctions have been associated with keratoconus and transcriptional alterations in many functional pathways are altered in keratoconus stromal cells [73][74][75] . We found COL12A1 as the highest collagen gene expressed in keratocytes, which plays an important role in establishing and maintaining stromal structure and function 76 . We also find matrix metalloproteinases MMP2 and MMP3 and their inhibitors TIMP1 and TIMP2 highly enriched in keratocytes which contributes to their function in regulating extracellular matrix composition.
Many genes found in the various epithelial groups are also shared with the skin epithelium as well, suggesting shared mechanisms in the barrier structures. There were several key relevant genes with biological functions in wound healing that helped define cell clusters. In addition to the well-known and severe retinal complications in diabetic retinopathy and systemic wound healing deficiencies in diabetics, there is a wide range of frequent but www.nature.com/scientificreports/ underdiagnosed corneal abnormalities in diabetics 77,78 . One such example is that MMP10 expression is elevated in diabetic retinopathy patients' corneas and implicated in the abnormal wound repair response 79 . Wnt10A deficient mice have delayed skin wound healing and reduced synthesis of collagen 80 . Prominent markers of the conjunctiva epithelium are associated with wound healing responses. S100A8 and S100A9 are important host defense and wound healing proteins within the body, including epithelial layers 81,82 . Additionally, AQP5 knockout mice have reduced corneal wound healing 83 . The cornea and conjunctiva epithelium, like the skin, is in direct contact with the exterior world thus rapid and effective wound healing responses are critical for maintaining functional integrity. Treatments focused on key wound healing genes may facilitate reversal of corneal pathology. Vascular endothelium, melanocytes, and Langerhans cells, while not common, are present in the cornea. These supportive cells are necessary for the healthy functioning of the limbal niche and subsequently the corneal epithelium.
Corneal dystrophies are typically progressive and those that result in visual loss are still predominately treated by corneal transplantation. Transplants can be complicated by disease recurrence, graft failures, the need for repeated transplant over the patient's lifetime, and accessibility of graft tissue in some regions of the world 42,84 . Therefore, there is a clinical need to better understand these dystrophies from a genetic, cellular, and molecular level to advance novel gene-editing and cell-based therapies.
The aim of this study was to provide a reference single-cell transcriptomic and spatial map of the healthy adult human cornea. Identification of cellular subsets within the cornea by using transcriptional information can help inform future cellular studies. For example, with the success of iPSC differentiation into terminally differentiated cells for research or transplantation, there is a need for well-defined characterization markers to ensure the correct cell has been generated in vitro to match its intended in vivo counterpart. These results contribute to that effort.

Materials and methods
Human tissue. Corneas from deceased human donors were approved for research purposes with informed consent obtained by The Eye-Bank for Sight Restoration (New York City) and all identifiable information removed. Ethical approval for research use of the donor corneas is approved by The Eye-Bank for Sight Restoration Research Review Committee. The guidelines of removing all identifiable information and ethical handling of the donor corneas were set by The Eye-Bank for Sight Restoration Research Review Committee. These research corneas were used in accordance with guidelines and regulations outlined by The Eye-Bank for Sight Restoration Research Review Committee. The corneas were collected a median of 12 h postmortem and placed into Optisol medium and stored at 4C until processing. One cornea from each of the six donors (2 male, 4 female; median age 57) with no noted ocular or corneal disease history were processed for FACS and single-cell RNAseq using 10X Genomics v2 platform.
Cornea tissue dissociation and cell sorting. Corneas were individually dissociated. Briefly, corneas were finely chopped in the storage Optisol media and transferred to DMEM containing a digestion mix of Collagenase A, Dispase II, and DNAse I. The samples were incubated at 37 °C for 15 min with 1-2 intervals of pipette mixing to aid in tissue disruption. Samples were centrifuged and the cell pellets were further digested in 0.5% Trypsin for 5 min at 37 °C. Hoeschst nuclear dye was added to each sample with a small portion of unstained sample reserved for FACS gating. Cells were washes in 5%PBS and 2 mM EDTA to stop digestion and pelleted. Topro3 viability dye was added to each sample. Viable cells were identified as Hoechst positive and Topro3 negative and sorted with the MoFlo Astrios (Beckman Coulter). Samples were collected and pelleted for 10X Genomics v2 preparation and processing.
Single-cell RNA-seq. Cells counts were generated using the Nucleocounter NC-250 Chromium v2 platform (10X Genomics). Single cells suspended in PBS with 0.04% BSA were loaded on a Chromium Single Cell Instrument (10X Genomics). RNASeq libraries were prepared using Chromium Single Cell 3' Library, Gel Beads & Multiplex Kit, v2 (10X Genomics). RNASeq libraries were prepared using Chromium. Paired-end sequencing was performed on Illumina NextSeq500 (Read 1 26-bp for unique molecular identifier (UMI) and cell barcode, 8-bp i7 sample index, 0-bp i5, and Read 2 55-bp transcript read). Cell Ranger Single-Cell Software Suite (10X Genomics, v2.0.0) was used to perform sample de-multiplexing, alignment, filtering, and UMI counting. The Human b37.3 Genome assembly and UCSC gene model for human were used for the alignment.
Data quality assessment, normalization and integration. Data quality of each cell was assessed using three parameters: total UMI, number of genes detected and the ratio between mitochondria gene UMI and total UMI. To avoid cell doublets and possibly unhealthy cells due to the experiment manipulation, cells with number of genes detected over 5000 or mitochondrial vs. total UMI ratio over 0.2 were removed. Seurat 3.0.1 program 85,86 using R (R Core Team, 2013) was then used to perform all further single cell transcriptome analyses for this project unless mentioned otherwise. Seurat program scaled the total UMI to 10,000 for each cell followed by UMI natural log transformation. Data integration across the 6 samples was performed using two Seurat functions: FindIntegrationAnchors (based on top 2000 variable genes identified by FindVariableFeatures) and IntegrateData. In both functions, 30 scaling dimensions of variables were used. Figure S8 outlines the cell filtering based on QC.
Cell cluster generation and identification of cluster marker genes or genes differentially expressed between two cell populations. Once  www.nature.com/scientificreports/ of variables) and FindClusters (using resolution 0.6). Cluster IDs were given to cell clusters in an ascending order based on the number of cells in each cluster: the smallest cluster ID was given to the largest cell cluster.
Cluster marker genes were identified by FindAllMarkers of Seurat using three cutoff thresholds: detection rate over 25% cells in the cluster of interest, natural log transformed fold change over 0.25, and a p values smaller than 0.01 in Wilcoxon Rank Sum test. Genes differentially expressed between clusters were identified using FindMarkers with the same significant cutoff criteria as used in FindAllMarkers, except that the gene expression detection rate over 25% is required in at least one of the 2 clusters in comparison.
Genes differentially expressed between 2 cell populations were identified by the function FindMarkers of Seurat. The cutoff thresholds used were the same as mentioned above for FindAllMarkers, except that the fold change can be either increase or decrease.
Cell cluster connectivity and pseudotime analyses. Cell cluster connectivity was analyzed by PAGA (Partition-based graph abstraction) program developed by Wolf et al. 29 and pseudotime reconstruction 87 using SCANPY package 88 in Python 89 .
Single cell RNASeq data visualization. For QC assessment, histograms and scatter plots were generated using R. Seurat RunUMAP function was used to generate UMAP (Uniform Manifold Approximation and Projection) plots for the visualization of cell cluster ID of or gene expression in each individual cell in conjunction with Seurat function DimPlot or FeaturePlot, respectively. Seurat DotPlot was used to visualize the average expression and percentage of cells expressing a gene in each cluster. Stacked violin plots of the expression of multiple genes across multiple clusters were generated using stacked_violin function of SCANPY in Python 88 . Heatmap was generated using Seurat function DoHeatmap, except in the supplemental figures, heatmaps were generated by using the average of normalized UMI of each gene in each cluster and colored based on the value range of each gene across cell types using Excel conditional formatting (Microsoft Excel 2016, Microsoft Corporation, Redmond, WA). NextBio correlation analysis. NextBio correlation analysis was performed using genes differentially expressed between Conj-1, 2, 3 and Epi-S1, 2, 3 (Illlumina, San Diego, CA). Bar plots of resulted p values were generated by Excel bar chart (Microsoft Excel 2016, Microsoft Corporation, Redmond, WA).

RNAscope.
Human corneas were received from The Eye-Bank for Sight Restoration and were immediately placed in proprietary fixative and shipped to Excalibur Pathology Inc. The samples were then paraffin embedded and sectioned at a thickness of 6 mm. The RNAscope 2.5 Duplex kit (ACD Bio) was used and the protocol from ACD Bio was followed. Briefly, slides were incubated at 60 °C for 1 h prior to deparaffination, which consisted of 2-5 min Xylene washes and 2-1 min 100% ethanol washes. Slides were left to dry at RT for 5 min. Afterwards, samples were incubated in hydrogen peroxide (ACD Bio) at RT for 10 min, incubated in RNAscope Target Retrieval Reagent (ACD Bio) at 100 °C for 15 min, and incubated in protease III (ACD Bio) at 40 °C for 15 min, with diH2O washes in between. The slides we dried at 60 °C for 15 min, cooled at RT for 5 min, a barrier was drawn around the samples and stored at RT overnight. Samples were washed in RNAscope Wash Buffer Reagent (ACD Bio) 2 × 2 min and then probes were added to the sections and incubated at 40 °C for 2 h within a humidified chamber. Signal amplification and detection reagents (ACD Bio) were applied sequentially and incubated in AMP 1, AMP 2, AMP 3, AMP 4, AMP 5, AMP 6, AMP 7, AMP 8, Amp 9, and AMP 10 reagents (AMP 1-4,7,8 at 40 °C; AMP 5,6,9,10 at RT), for 30,15,30,15,30,15,15,30,30,15 min, respectively. Before adding each AMP reagent, samples were washed 2 × 2 min with wash buffer. The samples were then counterstained with 50% Gill's hematoxylin I (Vector Laboratories) for 6-8 s at room temperature and briefly rinsed with tap water 2x. Mounting media (VectaMount Permanent Mounting Media; Vector Laboratories) and cover slips were then added to slides for imaging. Images were captured using a Keyence BZX-700 microscope.