Single-cell transcriptomic profile of human pulmonary artery endothelial cells in health and pulmonary arterial hypertension

Pulmonary arterial hypertension (PAH) is an insidious disease characterized by severe remodeling of the pulmonary vasculature caused in part by pathologic changes of endothelial cell functions. Although heterogeneity of endothelial cells across various vascular beds is well known, the diversity among endothelial cells in the healthy pulmonary vascular bed and the pathologic diversity among pulmonary arterial endothelial cells (PAEC) in PAH is unknown and previously unexplored. Here single-cell RNA sequencing technology was used to decipher the cellular heterogeneity among PAEC in the human pulmonary arteries isolated from explanted lungs from three patients with PAH undergoing lung transplantation and three healthy donor lungs not utilized for transplantation. Datasets of 36,368 PAH individual endothelial cells and 36,086 healthy cells were analyzed using the SeqGeq bioinformatics program. Total population differential gene expression analyses identified 629 differentially expressed genes between PAH and controls. Gene Ontology and Canonical Ingenuity analysis revealed pathways that are known to be involved in pathogenesis, as well as unique new pathways. At the individual cell level, dimensionality reduction followed by density based clustering revealed the presence of eight unique PAEC clusters that were typified by proliferative, angiogenic or quiescent phenotypes. While control and PAH harbored many similar subgroups of endothelial cells, PAH had greater proportions of angiogenic and proliferative subsets. These findings identify that only specific subgroups of PAH PAEC have gene expression different than healthy PAEC, and suggest these subpopulations lead to the pathologic functions leading to remodeling.

Pulmonary artery hypertension (PAH) is a deadly cardiopulmonary disease characterized by profound structural changes to the pulmonary artery wall 1 . Pulmonary artery endothelial cells (PAEC) are causally linked the vascular remodeling in PAH [2][3][4][5][6] . For example, apoptosis-resistant proliferation and dysregulated metabolism in PAEC are key features of PAH [4][5][6][7] . Although endothelium exhibits a marked heterogeneity in structure and function [8][9][10] , little is known about the single-cell transcriptomic heterogeneity of the pulmonary vasculature. The pulmonary vasculature is distinct from other vascular beds; it is a low pressure vascular bed and accommodates the complete venous return from the systemic vascular bed. Because endothelial remodeling in PAH is patchy 11 , PAEC isolated from explanted lung contain a mixture of cells from healthy and affected areas. Analysis of these cells in bulk cultures may mask phenotypically altered PAEC subsets. Here, we hypothesized that the transcriptomic diversity of PAEC isolated from healthy and PAH lungs would differ and that PAH would have specific expansion of pathological phenotypes of cells. To test this, we evaluated PAH and control PAEC by methodologies to identify cell clusters and their phenotypes. Eight distinct clusters of endothelial cells were identified from healthy Figure 1. Overall work flow differential gene expression and clustering analysis. 10 × Single-Cell Omics was performed on PAEC derived from 3 control and 3 PAH lungs. The total number of cells in each sample was 36 K. Global gene expression analysis between the total populations of control and PAH cells revealed that 629 genes from 1,116 detected genes were differentially expressed. Among these, 445 genes were upregulated and 184 were downregulated in PAH. 416 highly dispersed genes were found in the total control and PAH population. Principle Component Analysis followed by Trimap dimentionality reduction was performed to depict the high-dimentional data in a two-dimentional space. Density Clustering Analysis identified 8 clusters of cells among control and PAH PAEC. Differential gene expression analysis of clusters in each sample, followed by gene ontology analysis of the differentially expressed genes showed that there were 4 transcriptomic profiles among the 8 clusters of PAECs. Three PAH clusters had an altered profile compared to the corresponding control cluster. www.nature.com/scientificreports/ Clustering of control and PAH PAEC. Clustering of control and PAH PAECs was first performed using a nonbiased approach. Dimensionality reduction in single cell analyses summarizes heterogeneous populations in a high dimensional space in a two-dimensional embedding. tSNE, UMAP and TriMAP dimensionality reduction were applied (Fig. 3). The TriMap algorithm was used because it gave an improved preservation of the global data structure in the two-dimensional space 12 . Three clustering approaches, PhenoGraph, Kmeans and density clustering were used to identify PAEC subsets. A minimum number of clusters found by these methods was 8 (Supplemental Fig. S3). Because the density clustering gave the clearest separation among the clusters we proceeded with this clustering approach to explore the expression patterns that defined the 8 subsets. The distribution of the differentially expressed genes between PAH and controls found in the total population (629 genes) was assessed for possible localization of differential expression in any specific clusters Fig. 4. In fact, the total population differentially expressed genes did not localize to any single cluster. We next investigated the specific phenotypes of cells in each cluster. For this purpose, we performed differential gene expression analysis between each cluster as compared to the remaining cells not part of that specific cluster, within control or PAH populations. The complete lists of these genes are shown in Supplemental Tables S3-S6. Gene Ontology, a bioinformatics platform providing functional information about gene products and biological processes in which gene sets are involved 18,19 , was utilized to gauge functionality and naming of the clusters (Fig. 5A,B). Cluster 1 in control and PAH PAEC exhibited genes primarily all down-regulated relative to the remaining clusters. The majority of the cells in cluster 1 were involved in cell-cycle and cell proliferation, for example centromere protein F (cenpf), cell division cycle 25B (cdc25b), cyclin F (ccnf), regulator of cell cycle (rgcc) and marker of proliferation Ki-67 (mki67). Based on the down-regulated nature profile of this cluster, it was assigned the name of quiescent cluster.
A cluster 2 cell group was identified in controls and PAH, but their gene expression was dissimilar. In control PAEC, cluster 2 genes involved in angiogenesis, cell migration and proliferation were all down-regulated. These genes included kinase insert domain receptor (kdr), regulator of cell cycle (rgcc), platelet and endothelial cell adhesion molecule 1 (pecam), placental growth factor (pgf), C-X-C chemokine receptor type 4 (cxcr4), Angiopoietin-like 4 (angptl4), G1/S-specific cyclin-D2 (ccnd2) and, Chemokine (C-C motif) ligand 14 (ccl14). The transcriptomic profile suggests that this healthy PAEC cluster has a quiescent phenotype. In contrast, PAH PAEC cluster 2 had both down-and up-regulated genes. Down-regulated genes, such as pgf, kdr, fibronectin 1 (fn1) and C-C motif chemokine ligand 2 (ccl2), are mainly involved in new blood vessel formation. Eight genes Table 1. Differential expression of endothelial and mitochondrial pathways by ontology analysis. FDR false discovery rate. www.nature.com/scientificreports/ were strikingly up-regulated in PAH cluster 2, including denticleless e3 ubiquitin protein ligase homolog (dtl), minichromosome maintenance complex component 4 (mcm4) and minichromosome maintenance 10 replication initiation factor (mcm10). Gene Ontology term enrichment analyses showed that these genes are involved in DNA replication, cell-cycle checkpoint, DNA strand elongation and cell proliferation. Thus cluster 2 in control PAEC contains quiescent cells, but in PAH contain cells with a proliferative phenotype. Cluster 3 in both control and PAH were characterized by decreased expression of genes in cell-cycle and cell proliferation pathways, including cenpf, cdc25b, cenpw (centromere protein W), pclaf 9 (pcna clamp associated  The tSNE, UMAP, TriMAP and PC plots depict the expression of all genes in higher dimensional space in a two-dimensional embedding. Thus, from left to right the data depicts maximum local structure (for example the needles on a pine tree) to maximum global structure (for example the clusters of pine trees in a pine forest). Eight clusters were identified by visual inspection of the TriMap space. www.nature.com/scientificreports/ factor), and cdkn3 (cyclin dependent kinase inhibitor 3). Therefore, cluster 3 in both PAH and controls were assigned the term of quiescent phenotype. A cluster 4 was identified in both PAH and control PAEC but the clusters were different in their transcriptomic profiles. Cluster 4 in control PAEC had 142 transcripts that were down-regulated relative to the other clusters. The genes were from wide-ranging pathways, including mitotic cell cycle, cell division, chromosome organization and segregation, spindle organization and microtubule cytoskeleton organization. Based on this transcriptome, cluster 4 among control PAEC was termed quiescent. In contrast, although some cell proliferation genes were down-regulated in PAH cluster 4, 23 genes in angiogenesis, migration and adhesion pathways were up-regulated, such as (claudin 4 (cldn5), pgf, cxcr4, nidogen 2 (nid2), and ccl14). This profile suggests that PAH cluster 4 has a unique angiogenic phenotype not found in the control PAEC cluster 4.
The majority of the differentially expressed genes of cluster 5 were upregulated in both control and PAH cells. The genes upregulated in this cluster control cell proliferation, such as cenpf, cytoskeleton-associated protein 2 (ckap2), kinesin family member 15 (kif15), kinesin family member 18a (kif18a), mik67, serine/threonine-protein kinase 2 (nek2) and kinetochore protein (spc25). This cluster is thus active in proliferation and was accordingly named Proliferative cluster.
Control and PAH Cluster 8 had cells with down-regulation of genes mainly involved in cell proliferation. While there were no genes upregulated in PAH cluster 8, control PAEC cluster 8 had upregulated expression of genes controlling the enzyme-linked-receptor-protein-signaling pathways inhibin subunit beta A (inhba), follistatin like 3 (fstl3), zinc finger protein 703 (znf703), hedgehog interacting protein (hhip), eph receptor B6 (ephb6), integrin subunit beta 3 (itgb3) and plasminogen activator, tissue type (plat). These clusters were therefore named Receptor-signaling in control cluster 8 and Quiescent in PAH cluster 8.
Differential gene expression analysis was subsequently performed to compare transcriptomic profiles of control and PAH clusters (2, 4 and 8) (Fig. 5C). Genes down-regulated in PAH cluster 2 relative to control cluster 2 were involved in microtubules, spindle and centromere formation (tuba1b, tuba1c and cenp10). Upregulated genes regulate intracellular iron storage (fth1), chemoattractant of inflammatory cells know to have essential roles in PAH such as neutrophils, monocytes, T-cell and dendritic cells (cxcl6, cxcl8 and ccl2) [20][21][22] . In clusters 4 and 8, mt2a, a member of the metallothionein family controlling intracellular metabolism of essential metals 23 , and mmp1 (matrix metallopeptidase 1) were overexpressed. In cluster 8, humanin-like pepdide 8 (mtrnr2l8), a mitochondrial genome encoded gene with receptor activity regulator activity, was down-regulated 24 . Importantly, the majority of genes differentially expressed between corresponding control and PAH clusters were not identified in the global differential gene expression analysis with the total control and PAH cells. Only HIST1H4C (downregulated) and FTH1, CXCL6, CCND1 IFI27 and CCL2 (upregulated) were found in the analysis of the total populations.
The proportion of cells in each individual cluster was similar between control and PAH (Fig. 6A). Comparison of the total number of cells enriched in proliferation pathways between conditions, (control cluster 5 vs. www.nature.com/scientificreports/ PAH clusters 2 + 5) showed more proliferative cells in PAH (Fig. 6B). Similarly, the total number of cells in an angiogenesis pathway was significantly higher in PAH (control clusters 6 + 7 vs PAH clusters 4 + 6 + 7) (Fig. 6C).
Overall, PAH PAEC contained more proliferating and angiogenic cells due to a phenotypic change in specific clusters. In a biased approach, we also evaluated only the top ten differentially overexpressed genes in each cluster in control (Table 2) and PAH samples (Table 3). These genes revealed similar transcriptomic profiles as the Gene Ontology term enrichment.
Expression of BMPR2, TGF-β signaling genes and endothelial-to-mesenchymal transition markers. BMPR2, TGF-β signaling and Endothelial-to-Mesenchymal transition have been reported in PAH [25][26][27] . Several genes involved in these pathways were expressed by PAECs in both control and PAH across all clusters (Fig. 7). BMPR2 and SMAD mRNA in TGF-β signaling pathways 28 were down-regulated in PAH, consistent with the literature 29 . Similarly, genes in Endothelial-to-Mesenchymal transition pathways were also Expression of genes in therapeutic pathways. Next, we focused on the three main therapeutic pathways in PAH; eNOS, endothelin-1 and prostacyclin 30,31 . It has been previously reported that endotheliumderived vasodilators, e.g., prostacyclin and nitric oxide (NO), are downregulated while endothelium-derived vasoconstrictors, e.g., endothelin-1, are upregulated in PAH 4,32-36 Here, eNOS signaling related gene expression including calmodulin (CALM)2/3, NOS-interacting protein (NOSIP), placental growth factor (PGF), and E3 ubiquitin protein ligase (STUB1) were significantly increased, while heat shock protein family A member 5 (HSPA5), HSP90AB1, HSP90B1, and vascular endothelial growth factor receptor 1 (FLT1) were significantly reduced in PAH PAEC as compared with controls (Supplementary Tables S1 and S2) (Fig. 8). Gene expression of prostaglandin E synthase 2 (PTGES2) and prostaglandin reductase 1 (PTGR1) involved in prostacyclin pathway were significantly upregulated (Supplementary Tables S1 and S2) (Fig. 7). In addition, abnormality of endothelin-1 signaling related gene expression including transcription factor AP-1 (JUN), Ras family small GTP binding protein H-Ras (HRAS), RRAS, RRAS2, protein disulfide isomerase family A member 3 (PDIA3), and Src homology 2 domain containing E (SHE) was also found in PAH PAEC (Supplementary Tables S1 and S2) (Fig. 8). From all the genes involved in PAH therapeutic pathways, only pgf was differentially expressed across the clusters within controls and PAH. In both sources of PAEC, pgf was down-regulated in clusters 1, 2, and 3, and upregulated in clusters 6 and 7. In addition, pgf was also upregulated in PAH cluster 4.

Discussion
Here we define the cellular heterogeneity among primary cultures of PAEC isolated from the main pulmonary artery and 1st-4th branches. Differential gene expression analysis on the whole population of control and PAH PAEC showed altered expression of more than 600 genes. Gene ontology enrichment analysis on these differentially expressed transcripts revealed that altered mitochondrial and angiogenesis processes characterized PAH PAEC. Oxidative phosphorylation and mitochondrial dysfunction are known to be dysregulated in PAH PAEC 5,14,15 . Ingenuity pathways analysis of the differentially expressed genes demonstrated dramatic up-regulation of EIF2 signaling, essential for protein synthesis, among PAH PAEC. EIF2 is essential for the initiation of protein synthesis via mediating the binding of tRNA to the ribosome. The role of EIF2 in PAH vascular remodeling and proliferation is well-established 11,14,16,17 . mTOR/eIF2α pathway is known to mediate hypoxic responses and vascular remodeling in pulmonary arterial vascular smooth muscle cells in a rat model of hypoxia-induced PAH 37 . The expression of IL-8 by PAH PAEC is another novel finding. IL-8 has an established role in PAH 38,39 and is a key survival, proliferation and angiogenic factor 40 . Alterations in endothelium-derived vasodilator therapeutic pathways confirmed dysregulated vascular homeostasis in PAH 4,32-36 . This report is the first single-cell omics analysis of a systemic large vessel vasculature in humans. Transcriptional profiling of single cells obtained from explanted idiopathic arterial hypertension and normal controls lungs showed that endothelial cells, in addition to pericyte/smooth muscle cells exhibited the highest number of differentially expressed gene sets, relative to other lung cells 41 . Even with a lenient false discovery rate of 10%, only 33 endothelial genes were upregulated in PAH in this study which analyzed a limited number of endothelial cells. Unfortunately, the exact fraction of endothelial cells among the average 3,688 total cells per patient was not reported. From the 33 unregulated endothelial cell genes, 3 genes, apln, gas6 and arl6IP4, also found in our cohort. Clustering within the endothelial cell subset was not reported in their study. Another study utilizing the mouse Tabula Muris consortium and Seurat Findcluster approach, identified 13 endothelial clusters in the lungs and other organs. The authors concluded that transcriptomic endothelial clustering may be independent of tissue of origin 42 .
We used a novel TriMap algorithm that provides a better preservation of the global data structure in the twodimensional space 12 . Eight novel PAEC subsets were identified in both health control and PAH. Gene ontology analysis and highest overexpressed genes revealed key phenotypes of each cluster. PAH PAEC were characterized by three clusters with altered transcriptomic phenotypes. These included clusters 2, 4 and 8. While these clusters www.nature.com/scientificreports/ had a quiescent phenotype in controls, PAH cluster 2 had a proliferative and cluster 4 an angiogenic phenotype. Differential gene expression analysis between corresponding PAH and control clusters with altered profiles, revealed that recruitment of inflammatory cells, homeostasis of essential metals and coagulation regulation, essential processes in the pathogenesis of PAH 1,11,22 , are attributable to specific subsets of PAECs. In addition, the reduced expression of BMPR2 and altered expression of TGF-β signaling and Endothelial-to-Mesenchymal Transition marker among PAH PAECs indicated that PAH endothelial pathophysiology is well represented by PAECs. Overall, these data suggest that fundamental processes defining PAEC heterogeneity and identifying Table 2. Top ten upregulated genes in control clusters. Clusters 1 and 3-5 among control PAEC exhibited mostly down-regulated genes. The set of top ten overexpressed genes in control 2 cluster was enriched in genes involving cell proliferation, including cenpf, ckap2l, kif15, kif18a, mik67, nek2 and spc25. Control cluster 6 had increased expression of top ten genes regulating immune cell-endothelial interaction and new vessel formation such as cd200, Carbohydrate Sulfotransferase 1 (chst1), cxcr4, matrix Gla protein (mgp), and Ras guanyl-releasing protein 3 (rasgrp3). Sprouting and migratory gene signatures, A-kinase anchoring protein 12 (akap12), aplin (apln), H2. 0 Like Homeobox (hlx), Neurogenic locus notch homolog 4 (notch4), and pdgfb were enriched in control cluster 7. Highly expressed genes in control cluster 8 were characteristic of inflammation and angiogenesis, including fstl3, hhip, itgb3, ptx3 and tgfb1. www.nature.com/scientificreports/ Table 3. Top ten upregulated genes in PAH clusters. Clusters 1 and 8 in PAH exhibited mostly downregulated genes. Similar to control cluster 2, PAH cluster 2 overexpressed genes involved in cell proliferation. Interestingly, PAH cluster 3 was also enriched in overexpression of genes involved in cell proliferation (cdt1, DNA replication complex GINS protein (gins2), mcm10, mcm4, and mcm5). PAH cluster 4 did not have any upregulated genes. In cluster 5 contained genes reported in aberrant angiogenesis (ccl14, cxcr4, rasgrp3 and Extracellular sulfatase Sulf-2 (sulf2)). Similar to control cluster 6, PAH cluster 6 was enriched in genes regulating immune cell-endothelial interaction and new vessel formation. Genes upregulated in PAH cluster 7 were compatible with an angiogenic phenotype (A-Kinase Anchoring Protein 12 (akap12); ccl14; pgf and Cell Division Cycle 42 Pseudogene 5 (cdc42p5)).

Materials and methods
Isolation of human PAEC. Human primary pulmonary arterial endothelial cells (PAEC) were collected either from unused explanted control donor lungs or explanted from PAH patients undergoing lung transplantation. The study was approved by the Cleveland Clinic Institutional Review Board, and informed written consent was obtained from all participants or tissue was under an Cleveland Clinic IRB exempt protocol. All methods were performed in accordance with the relevant guidelines and regulations. PAEC from explanted PAH lungs or donor lungs not used for transplantation were harvested and cultured as previously described 43 . Donors for PAH lungs included 2 women and were of average age 33 ± 7 years old. They were transplanted at 16, 7 and 2 years after their diagnoses with PAH. All patients were on sildenafil, two of the patients were also on treprostinil, and one patient was also treated with prostacyclin and bosentan. Hemodynamic measurements were available and the mean pulmonary artery pressures were 53 ± 8.7 mmHg; cardiac output 3.5 ± 0.18 L/min; cardiac index 2 ± 0.19 L/min/m 2 (mean ± SD, n = 3). Briefly, excess fat and connective tissue were removed from the main artery and rinsed (3 times) with Hanks' Balanced Salt Solution (Invitrogen, Grand Island, NY). The arterial endothelial cells were removed by placing the arteries in 10-15 ml PBS containing 2 mg/ml type II collagenase (Worthington Biochem, Lakewood, NJ) at 37 °C and 5% CO 2 with 90% humidity for 20 min. After incubation the arteries were placed in into MCDB-107 medium containing 14. 3. The endothelial cells were detach by a gentle shake and the residual arteries were removed before centrifugation at 330 g for 7 min at room temperature. The cell pellet was re-suspended in MCDB-107 medium and seeded on fibronection (1 mg/cm 2 ) (CalBiochem, La Jolla, CA) coated cell culture plates. The cells were incubated at 37 °C, 5% CO 2 with 90% humidity followed by media changes at 24 h and every 3 days until confluence. At passage 5 we tested the purity of the cell culture by immunophenotyping for CD31. All cells used in this paper were more than 96% CD31 positive. The passage of cells used in experiments was passage 6. Venous 44 and alveolar capillary 45 gene signatures were dimly present among the cells (Supplemental Fig. S4), showing that we sampled the same endothelial cell populations in PAH and controls.
10 × Single-cell RNA-Seq. PAEC suspensions with viabilities between 96 and 98% were used on 10 × platform for single-cell partitioning. The Chromium 3' Solution V2 chemistry was used as instructed by the manufacturer.
Library Prep and sequencing. Concentration and quality of double-stranded cDNA was assessed using a high-sensitivity DNA assay on an Agilent 2100 Bioanalyzer. Library preparation for sequencing on an Illumina platform was accomplished for each sample following the manufacturer's protocol (CG000183 Rev A). Quality of library construction was again assessed using the Bioanalyzer. Samples were fluorometrically quantified with a Qubit (Thermo Fisher Scientific), pooled, and quantified on a Quantabio Q cycler using Quantabio SparQ Fast Library Quant kit. Sequence data was generated on a NovaSeq 6000 using parameters recommended by 10x (CG000183 Rev A).
Cell ranger quality control and mapping. Raw sequencing data were processed with Cell Ranger (version 3.0.2), a proprietary package from 10 × Genomics. Reads QC and mapping were also performed within the Cell Ranger analysis pipeline. Reads were mapped to human reference genome, assembly GRCh38, using STAR aligner 46 . After obtaining mapped reads from the Cell Ranger pipeline, count matrices for the samples were produced using data in filtered_feature_bc_matrix directories and subsequently filtered to remove genes without any counts. Gene Ontology (GO) term enrichment analysis for Biological Processes was performed using a browser interface to MSigDB database v7.1 47-49 A list of differentially expressed genes was provided and the list was compared to gene sets in the GO database using the hypergeometric test. Canonical pathway analysis of differentially expressed genes was done using QIAGEN Ingenuity Pathway Analysis (QIAGEN IPA, Qiagen, Redwood City, CA).
SeqGeq Bioinformatics. Data was analyzed in the SeqGeq bioinformatics platform, from the makers of FlowJo 50 . Techniques applied were done so following recommended bioinformatics protocols developed by the scientific community and honed by experts in single-cell data exploration at FlowJo, LLC. All differential expression analysis defined genes as up or down regulated if features showed Fold-Change >|2.0| and Bonferroni adjusted p-Values < 0.05 relative to comparator populations. Data files were concatenated together and normalized to Counts per 10 k reads, for uniform analysis of all matrices together. This ensured that uniform treatment was given to each dataset and made results more easily comparable between samples and conditions downstream.