Introduction

Induced pluripotent stem cells (iPSCs) are broadly used in scientific research to generate different cell types in the context of basic or applied research to develop potential therapies1,2,3, for example, the generation of isogenic iPSC models, either patient-derived or healthy donor-derived, to model pathogenic conditions and drug screening4. The advancement in genetic manipulation and iPSC differentiation methods allows investigators to gain new insight into the functional consequences of specific molecular alterations. We previously generated in vitro model using iPSCs to study vascular tumor pseudomyogenic hemangioendothelioma (PHE). Rearrangement involving FOSB is a characteristic of PHE, the most common binding partners are SERPINE1 and ACTB5,6. To mimic PHE, we introduced chromosomal translocation t(7;19)(q22;q13) leading to the SERPINE1-FOSB fusion using iPSC model7.

PHE is a rare, locally aggressive, rarely metastasizing endothelial neoplasm that occurs in bone and soft tissue in young adults with a strong male predominance8. The tumour is often multifocal and locally aggressive. It forms rarely metastasis and is classified clinically as an intermediate category9. After surgical resection, approximately 60% of patients experience local recurrences or develop additional tumors in the same anatomical location9. Chemotherapy or radiotherapy is generally given to treat patients with multifocal disease. Recently, the use of systemic targeted therapies using mTOR inhibition (sirolimus, everolimus, and rapamycin) or telatinib, a VEGFR1–4/PDGFRA multi-tyrosine kinase inhibitor has shown clinical benefit in reported cases10,11,12,13. Histologically, these tumors consist of loose fascicles of plump spindle cells with abundant and brightly eosinophilic cytoplasm8. Immunohistochemically, there is a characteristic expression of ERG, CD31, and keratin AE1/AE3, while CD34 and desmin are negative and INI1 retained8, highlighting its vascular differentiation despite the lack of vasoformation. The SERPINE1-FOSB leads to overexpression of FOSB at RNA and protein as the fusion transcript is driven by the SERPINE1 promoter region14,15.

Given that the presumed cell of origin of vascular tumors is endothelial cells, the iPSCSERPINE1-FOSB cells towards the endothelial lineage (iPSC-ECSERPINE1-FOSB) to facilitate the functional evaluation of specific genetic alterations7. This model may shed light on the involved pathways and facilitate the identification of targeted drugs for treatment. Previously, a PHE patient was shown to have undergone complete remission during treatment with telatinib, possibly through the inhibition of FLT1, FLT4, and PDGFRA signaling13. However, the effect of telatinib in treating PHE has yet to be validated. QRT-PCR expression analysis could be used to monitor cellular reaction to drug treatments, such as telatinib, in iPSC-ECSERPINE1-FOSB and detect the expression of FOSB, SERPINE1, and the SERPINE1-FOSB fusion transcript.

QRT-PCR is a broadly used technique that allows the relative quantification of gene expression. For reliable measurement and comparison, the identification and inclusion of stably-expressed housekeeping genes are needed. Examples of frequently used housekeeping genes are GAPDH, B2M, and HPRT16. However, these genes are differentially expressed in mammalian tissue types17. During reprogramming of human iPSC, expression of GAPDH was relatively stable compared to other common housekeeping genes such as ACTB, B2M, and HPRT18. Likewise in iPSC and neurons derived from iPSC, GAPDH is one of the most stably expressed genes19. However, an extensive study that compared GAPDH expression in 72 different human tissues revealed that the expression of GAPDH varies across tissue types20. Housekeeping genes are extensively used across various cell lines, though these housekeeping genes may not necessarily be stable in all cell lines19,20. Definitive identification of a stable housekeeping gene panel is laborious work. There is a need for validated reference genes in iPSC and iPSC-EC.

This study aimed to identify a set of housekeeping genes that remains stable between iPSC and iPSC-EC with and without gene manipulation and upon drug treatment. We used the transcriptome data from our previously iPSC and iPSC-EC samples with and without a gene fusion7. Five widely used algorithms, delta-Ct, geNorm, NormFinder, BestKeeper, and RefFinder21,22,23,24,25 were utilized to identify and verify the best reference housekeeping gene set. Here, we identified a panel of 15 candidate genes and compared their stability to the five most commonly used housekeeping genes, B2M, GAPDH, GUSB, HMBS, and HPRT1, in this field. From a total of 20 genes, RPL36AL, TMBIM6, MORF4L2, HPRT1, and SLC25A3 were superior to B2M and GAPDH based on a comprehensive housekeeping gene ranking. The top two ranked housekeeping genes, RPL36AL and TMBIM6 remain stable in genetically manipulated and endothelial differentiated cells during drug treatment.

Results

Housekeeping genes and their primer specificity

Using our transcriptome sequencing data from iPSC and iPSC-EC with and without SERPINE1-FOSB fusion7, we ranked gene expressions according to the lowest standard deviation (SD) and coefficient of variation (CV) of their fragments per kilobase of exon per million mapped fragments (FPKM) value. FPKM is a unit of expression which estimate gene expression based on transcriptome sequencing data26. We subsequently listed 15 candidate genes with an FPKM value of more than 100 and determined their suitability as housekeeping genes (Table 1). Of these 15 candidate genes, the YWHAZ gene is frequently used as a housekeeping gene and ranked first based on our ranking metrics27. We added five common housekeeping genes, B2M, GAPDH, GUSB, HMBS, and HPRT1, to compare their stability with the 15 candidate genes. The specificity and the amplification range of the primers that were designed for these 20 genes were analysed using a dilution of the template cDNA and using melt runs to identify by-products. The designed primers were specific and efficient when tested over a dilution range, as represented by their respective melting curve shown in Supp Fig. 1 and coefficient of R2 value between 0.98 to 1.00 (Supp Table 2).

Table 1 Transcriptome values of housekeeping genes.

Stability of reference genes using various algorithms

Five different, commonly used algorithms (delta-Ct, Bestkeeper, geNorm, Normfinder, and Reffinder) were tested on iPSCWT, iPSC-ECWT, and iPSC-ECSERPINE1-FOSB to identify the best matching housekeeping gene set using qRT-PCR. The data generated in the five algorithms were from three biological runs in technical triplicates. The expression of the 20 housekeeping genes is represented in Fig. 1. The mean Ct values range from 18 to 25.5 cycles. PRKCSH and HMBS showed the lowest expression levels, both with a mean Ct value of 25.5, while UBA52 and RPL36AL displayed the highest expression levels with mean Ct values of 19.0 and 19.1, respectively (Supp Table 1).

Figure 1
figure 1

Mean Ct values. Mean Ct values of iPSCWT, iPSCSSERPINE1-FOSB, iPSC-ECWT of each housekeeping gene were shown in a box-and-whisker plot and sorted from the lowest (left) to the highest (right). The five added common housekeeping genes are denoted with an asterisk. The whiskers represent SD of nine samples, three samples per cell line (iPSCWT, iPSCSSERPINE1-FOSB, and iPSC-ECWT).

Delta-Ct

The delta-Ct method determines the stability of candidate genes by comparing the relative expression of candidate genes among the samples21. The two most stable genes based on delta-Ct method are YWHAZ and COX7A2, with corresponding mean SD values of 0.15 and 0.16, while the least stable genes were PRKCSH and UBA1 with mean SD values of 0.39 and 0.41, respectively. Of the five common housekeeping genes included in the study, only HPRT1 falls in the top 10 stable mean SD values based on delta-Ct.

Bestkeeper

Bestkeeper evaluates candidate genes and ranks based on the lowest SD and CV. COX7A2 and UBA52 have the lowest variations in gene expression with corresponding CV values of 1.3% and 1.5% (Table 2). We observed that the lower expressed genes, such as UBA1 and PRKCSH, have higher variation with a CV value of CP 3.6% and 4.5%, respectively. Only one commonly used housekeeping gene, HPRT1, was ranked third in this algorithm. GUSB, GAPDH, HMBS, and B2M remained low in the ranking, with a CV value of CP between 2.4% to 3.9%.

Table 2 Bestkeeper housekeeping gene ranking sorted according to their SD and CV% values and their crossing point (CP) .

geNorm

geNorm measures the stability value (M) by calculating the pairwise expression ratio for each candidate gene against all the other genes24,28. Based on the geNorm ranking of M value, 14 genes were evaluated to be ideal housekeeping genes and six as acceptable. The genes with the highest stability M value (of 2.26) are SLC25A3 and PRELID1 (Fig. 2). The six genes that were designated as “acceptable”: YWHAZ, COX7A2, ATP5F1C, UBA1, PRKCSH, and B2M. The other four commonly used housekeeping genes, HPRT1, GUSB, GAPDH, and HMBS were classified as ideal.

Figure 2
figure 2

geNorm ranking of stability M value. Housekeeping genes are plotted based on their stability M value from the highest (left) to the lowest (right). Black bars are ideal housekeeping genes and grey bars are acceptable housekeeping genes. The five added common housekeeping genes are denoted with an asterisk. Note: this tool is not calculating SD.

NormFinder

NormFinder measures the intra- and intergroup variation to calculate the stability of candidate genes23. Similar to geNorm analysis, the two most stable genes identified by Normfinder are SLC25A3 and PRELID1, and the two least stable genes are PRKSCH and B2M (Fig. 3). The most stable genes, SLC25A3 and PRELID1 have the lowest stability value of 0.09 and 0.12. The least stable genes, PRKSCH and B2M have high stability values of 0.74 and 0.94. Among the common housekeeping genes, GUSB ranked fairly high in NormFinder according to its stability value of 0.20. HPRT1 was also ranked fairly, in the 8th position, with a stability value of 0.25. The other common housekeeping genes, GAPDH, HMBS, and B2M, were ranked 14th, 15th, and 20th, respectively. NormFinder algorithms calculated that the two genes that gave the best stability in our panel were HPRT1 and RPN2, each with a stability value of 0.07.

Figure 3
figure 3

NormFinder ranking based on stability value. Housekeeping genes are plotted based on their stability value from the least (left) to the most stable (right). The five added common housekeeping genes are denoted with an asterisk.

RefFinder

RefFinder utilizes the four other algorithms to evaluate and rank candidate genes by assigning appropriate weights to candidate genes and calculating the geometric mean of their weights for the final ranking25. The comprehensive ranking generated by RefFinder ranked RPL36AL as the most stable gene with the lowest geometric mean of 2.91 (Table 3). TMBIM6 was ranked second with the same geometric mean as RPL36AL. The top five genes were RPL36AL, TMBIM6, MORF4L2, HPRT1, and SLC25A3, respectively. In all five algorithms, PRKCSH, UBA1, and B2M were the least stable housekeeping genes.

Table 3 Housekeeping gene ranking of all five algorithms.

Stability of housekeeping genes during telatinib treatment

Using a single housekeeping gene for normalization could lead to relatively large errors, so we used a combination of two housekeeping genes for further studies16,24. Based on the comprehensive ranking of candidate housekeeping genes, we selected the top two genes, RPL36AL and TMBIM6, and the bottom two genes, UBA1 and B2M, for further analysis of their gene expression stability during drug treatment with telatinib. QRT-PCR data generated are from biological duplicates and technical triplicates. The relative expression of FOSB, SERPINE1, SERPINE1-FOSB was normalized to the top or bottom two housekeeping genes (Fig. 4). When the expression of FOSB and SERPINE1 was normalized to housekeeping genes RPL36AL and TMBIM6, telatinib treated iPSC-ECSERPINE1-FOSB showed a slight reduction in expression as compared to iPSC-ECWT. However, when FOSB and SERPINE1 expression were normalized to UBA1 and B2M, telatinib treated iPSC-ECSERPINE1-FOSB showed a slight increase in expression compared to iPSC-ECWT. In addition, we observed that SERPINE1-FOSB expression was reduced in telatinib treated iPSC-ECSERPINE1-FOSB compared to untreated iPSC-ECSERPINE1-FOSB when normalized to most ideal selected genes while the expression increased when normalized to least ideal selected genes. Of note, the significance level of the relative normalized expression of SERPINE1-FOSB between the top and bottom selected housekeeping gene is p < 0.4.

Figure 4
figure 4

Gene expression of telatinib-treated cells. Relative to iPSC-ECSERPINE1-FOSB untreated, the log scale expression of FOSB, SERPINE1 or SERPINE1-FOSB were normalized to the top two, RPL36AL and TMBIMB, or bottom two, UBA1 and B2M, housekeeping genes of telatinib treated or untreated iPSC-ECWT and iPSC-ECSERPINE1-FOSB samples. Three biological samples of each treatment were pooled together and run qRT-PCR in technical duplicates.

Discussion

In this study, we aimed to identify a set of housekeeping genes that remain stable in iPSC and iPSC-EC with and without genetic manipulation or drug treatment for qRT-PCR. A previously established transcriptome sequencing data set of iPSC, iPSC-EC with and without CRISPR-Cas9 induced translocation was used to identify stable housekeeping genes. Next to the identified new housekeeping genes, we aimed to compare these with a commonly used housekeeping gene panel and showed that most of the new housekeeping genes were more stable than the commonly used housekeeping genes. Multiple studies have investigated candidate reference genes in iPSCs or ECs18,29,30,31,32,33, during reprogramming and differentiation18,33. In addition, various publications reported a selection of housekeeping genes in endothelial cells, such as microvascular EC29, human retinal EC31, and human blood–brain barrier EC32. The candidate reference genes examined in these studies were among the commonly used housekeeping genes in our study, including B2M, GAPDH, GUSB, HPRT1, HMBS, and YWHAZ. A potential drawback of the use of common housekeeping genes is their variable expression across different tissue types16,17,20,34,35.

From our transcriptome sequencing analysis, 14 of 15 genes were not validated previously as reference genes in the context of iPSC and derived EC. Intriguingly, these 14 genes were also identified in two independent studies through gene expression data from EST (expressed sequence tag), SAGE (series analysis of gene expression), and/or microarray that suggested their potential as candidate reference genes36,37. Some of these novel candidate reference genes showed better stability in various human frozen tissues and cell lines as compared to common housekeeping genes such as B2M, HMBS, and GAPDH36. In our study of 20 housekeeping genes, PRKCSH, UBA1, and B2M were ranked the three lowest of all housekeeping genes validated. B2M was previously investigated as a candidate reference gene when examined in reprogrammed iPSC or EC and was ranked between four to eight of a panel of 10 housekeeping genes18,29,38. Likewise in the iPSC and derived neurons study, GAPDH and HMBS were ranked among the top five of 16 reference genes19. In our study using all five algorithms, GAPDH was ranked between 11 to 15 of 20, while HMBS was ranked between 13 to 17 of 20. This suggests that using other reference genes, such as RPL36AL or TMBIM6, could yield better results than the common housekeeping genes. However, the stability of these reference genes in other types of differentiated iPSC should be validated.

Next to tissue type differences, variation in the expression of common housekeeping genes was observed within the same cell line when exposed to different conditions30,31,39. For example, statin-treated HUVEC cells showed that HPRT1 and YWHAZ were considered the most suitable reference genes39. However, in homocysteine-treated HUVEC cells, YWHAZ was not stably expressed and ranked seven of eight31. In addition, hydrogen peroxide-treated HUVEC showed HPRT1 was a less reliable reference gene with a ranking of 11 of 1530. These studies stressed the importance of validating reliable reference genes in different physiological conditions. In telatinib-treated cells, the SERPINE1-FOSB expression showed a reduction in expression, when normalized to the top two housekeeping genes—RPL36AL and TMBIM6. However, when normalized to the bottom two housekeeping genes—UBA1 and B2M, an increase in expression was observed. The discrepancy in the expression of SERPINE1-FOSB between the top and bottom selected housekeeping genes was concerning and further supports the importance of verifying the stability of housekeeping genes during drug treatment.

In this study, the validation of housekeeping genes should fit into three criteria, 1. Stable expression in both iPSC and iPSC-EC, 2. Stable expression in wild-type and mutant lines, and 3. Stable expression during treatment with telatinib. As such, in our study, we found that RPL36AL, TMBIM6, MORF4L2, HPRT1, and SLC25A3 were ranked the top five genes. Common housekeeping genes such as GAPDH, HMBS, and B2M are less stably expressed in our panel of cell lines. The top two ranked genes, RPL36AL and TMBIM6, remained stable in both iPSC and iPSC-EC with and without SERPINE1-FOSB translocation and during drug treatment with telatinib, making them ideal housekeeping genes.

We identified a panel of HKGs that could be utilized in various conditions using iPSC and iPSC-derived endothelial cells as well as genetically modified iPSC cells for drug treatment.

Materials and methods

Cell lines and cell culture and drug treatment

The human iPSC line LUMC0054iCTRL (http://hpscreg.eu/cell-line/LUMCi001-A)40 was cultured on Vitronectin XF™ (STEMCELL technologies, 07180) coated plates in TeSR™-E8™ Kit for ESC/iPSC Maintenance (STEMCELL technologies, 05990) according to manufacturers’ instructions. The SERPINE1-FOSB translocated iPSC (iPSCSERPINE1-FOSB) was generated in our previous study7. iPSCs were differentiated into endothelial cells in three independent biological replicates as previously described41. Endothelial cell-differentiated iPSC were treated with or without 5 µM of telatinib for 14 h in serum-starved condition followed by four hours of serum stimulation. Each treatment condition were carried out in biological triplicates but pooled together when harvested for further analysis.

RNA isolation and quantitative real-time-polymearse chain reaction

Cells were homogenized and RNA was isolated using TRIzol (Ambion, 15596018) and DNase I treated and purified according to the RNeasy kit (Qiagen, 74104). First-strand cDNA was synthesized using the iScript™ cDNA Synthesis Kit (Bio-rad, 1708891). qRT-PCR reactions were carried out with iQ™ SYBR® Green Supermix (Bio-rad, 1708886). The samples that were used to determine the stability of reference genes are iPSCWT, iPSCSERPINE1-FOSB, and iPSC-ECWT. For iPSCWT and iPSCSERPINE1-FOSB, RNAs at three different passages were harvested. While for iPSC-ECWT, cells from three independent differentiation setups were harvested for analysis. These samples (biological replicates in triplicate) were run in triplicate in PCR for analysis (technical replicates). Untreated and drug-treated samples of iPSC-ECWT and iPSC-ECSERPINE1-FOSB of a single passage number were run in technical duplicate and used to analyze the stability of selected reference genes. All of the samples were run in a two-step PCR setting with an annealing temperature of 60 °C for 45 cycles. An overview of the 20 housekeeping genes and their primer sequences is shown in Supp Table 2. The primers used to amplify FOSB, SERPINE1, and SERPINE1-FOSB are listed in Supp Table 3. The amplification and melt curve of the designed primers was examined over a dilution range (1, 1/4, 1/16, 1/64, and 1/256) for their specificity.

Transcriptome data identification of housekeeping gene candidates

Previously generated and published transcriptome data (accession PRJNA448372) was used to select stable genes across various conditions7. Normalized FPKM gene expressions of induced pluripotent stem cells and derived endothelial cells of WT and SERPINE1-FOSB translocation were obtained and sorted according to the lowest SD and CV. The top 15 expressed genes with an FPKM value of > 100 were selected. In addition, independent of the ranking results, five commonly used housekeeping genes such as GUSB, GAPDH, B2M, HMBS, and HPRT1 were included for comparison.

Statistical analysis

The same generated qRT-PCR data were utilized in five algorithms to assess the expression stability of 20 reference genes, delta-Ct21, Bestkeeper22, geNorm24, NormFinder23, and RefFinder25. Three of the algorithms, delta-Ct, Bestkeeper, and RefFinder use untransformed Ct values as input. The remaining two algorithms, geNorm and NormFinder, the average delta Ct values were used. The delta-Ct method compares the mean SD of reference genes and the calculated values were exported from CFX Maestro software. The excel based software, Bestkeeper, was manually extended to accommodate 20 reference genes and the Ct values were used for analysis22. Bestkeeper tabulates the geometric mean, arithmetic mean, the minimal and maximal Ct values and the percentage of CV based on the crossing point values of each reference gene as opposed to other excel based software, NormFinder, in which the delta Ct values were input and the ANOVA-based mathematical analysis was used to calculate expression stability values23. geNorm was tabulated through the reference gene selector tool in the CFX Maestro software that calculates expression stability M values per reference genes as detailed in Vandesompele 2002. In both geNorm and NormFinder, a low stability value indicates more stable expressed gene23,24. We obtained the comprehensive ranking using a web-based tool, RefFinder, that compares and ranks the reference genes based on four algorithms, geNorm, NormFinder, Bestkeeper and delta-CT methods25. Statistical analysis comparing the expression of FOSB, SERPINE1, and SERPINE1-FOSB between the top and bottom two housekeeping genes was carried out using a t test.