Identification of candidate PAX2-regulated genes implicated in human kidney development

PAX2 is a transcription factor essential for kidney development and the main causative gene for renal coloboma syndrome (RCS). The mechanisms of PAX2 action during kidney development have been evaluated in mice but not in humans. This is a critical gap in knowledge since important differences have been reported in kidney development in the two species. In the present study, we hypothesized that key human PAX2-dependent kidney development genes are differentially expressed in nephron progenitor cells from induced pluripotent stem cells (iPSCs) in patients with RCS relative to healthy individuals. Cap analysis of gene expression revealed 189 candidate promoters and 71 candidate enhancers that were differentially activated by PAX2 in this system in three patients with RCS with PAX2 mutations. By comparing this list with the list of candidate Pax2-regulated mouse kidney development genes obtained from the Functional Annotation of the Mouse/Mammalian (FANTOM) database, we prioritized 17 genes. Furthermore, we ranked three genes—PBX1, POSTN, and ITGA9—as the top candidates based on closely aligned expression kinetics with PAX2 in the iPSC culture system and susceptibility to suppression by a Pax2 inhibitor in cultured mouse embryonic kidney explants. Identification of these genes may provide important information to clarify the pathogenesis of RCS, human kidney development, and kidney regeneration.

PAX2 is an essential transcription factor for kidney development. PAX2 is expressed in multiple urogenital tissues, including the nephric duct, cap mesenchyme, and differentiating nephron and collecting duct of the developing kidney 1,2 . In mice, Pax2 is required for the differentiation of the mesenchyme to the epithelium. Therefore, Pax2 −/− mice completely lack metanephric kidneys in the embryonic period and have no kidneys at birth and die immediately after birth. Pax2 +/− mice develop kidney hypoplasia and vesicoureteral reflux 3 . In humans, PAX2 is one of the key disease genes that are defective in renal coloboma syndrome (RCS) 4,5 , which is characterized by kidney hypoplasia or dysplasia and optic nerve dysplasia.
In mice, it is well known that PAX2-regulated gene expression was influenced by many transcriptional and intracellular signaling factors 6 and epigenetic mechanisms regulated by Pax transactivation domain interacting protein and mixed-lineage leukemia complex 7,8 . Nevertheless, the mechanisms of PAX2 regulation of human kidney development are poorly understood, in part because the human embryonic kidney tissue is not accessible. Recently, methods of inducing kidney lineage cells from induced pluripotent stem cells (iPSCs) have been developed 9,10 . iPSC organoids derived from these cells may become useful in studies of human kidney development. www.nature.com/scientificreports/ To identify candidate genes regulated by PAX2 during human kidney development, we performed comprehensive transcriptome analysis of nephrons developing in vitro from iPSCs from patients with RCS and healthy donors by cap analysis of gene expression (CAGE). CAGE is based on the preparation and sequencing of concatamers of DNA tags derived from the initial 20 nucleotides from the 5′ end of mRNAs. CAGE allows high throughput gene expression analysis and accurate profiling of transcription start sites, including promoter and enhancer usage analysis 11 .
In this study, we investigated PAX2-regulated genes in human kidney development using induced kidney lineage cells from RCS patient-derived iPSCs (RCS-iPSCs) and CAGE. Moreover, 189 promoters and 71 enhancers, especially three influential candidate genes, which are conserved in mice and humans, were detected. This knowledge provides new clues for a deeper understanding of PAX2-associated biology in humans, including the molecular pathogenesis of RCS, and mechanisms of human kidney development and regeneration 12,13 .

Establishment and characterization of iPSCs derived from patients with RCS. Three unrelated
adult Japanese female patients with RCS who were diagnosed with morphological kidney defects, ophthalmologic abnormalities, and family history of PAX2 gene mutations were selected for the study 14 . Their clinical characteristics are shown in Table 1. Patient 1 has a heterozygous frameshift mutation in exon 2, which is known as the key region for DNA binding. Patients 2 and 3 have heterozygous missense mutations in exon 2. These two single-nucleotide mutations (ex2 c.212G > C, R71T and ex2 c.187G > A, G63S) were suggested to influence the binding to target genes in a previous study 14 . RCS-iPSCs were established by transfection of patient peripheral blood mononuclear cells with episomal vectors encoding OCT4, SOX2, KLF4, L-MYC, and LIN28 and p53 shRNA. Control iPSCs were previously established from healthy blood donors using the same method.

Induction of kidney lineage cells in control and RCS-iPSCs. Kidney lineage cells were induced from
both control and RCS-iPSCs using previously reported methods 9 . We analyzed the markers of nephron progenitor cells on day 14 and markers of podocyte and proximal tubular cells on day 21, following a previous study 9 . On day 21 of induction, all control and patient iPSC cultures produced cells that were differentially positive by immunohistochemistry for PAX2 and the proximal tubule cell marker LTL (Lotus tetragonolobus lectin) and podocyte marker PODXL (podocalyxin) (Fig. 2a). This was confirmed at the RNA level for PAX2 and PODXL and extended to additional kidney lineage markers WT1, SIX2, and AQP1 (Fig. 2b). On day 14, the inductive efficiency for generating nephron progenitors, evaluated using the known surface marker INTEGRINα8+ PDGFRα− by fluorescence-activated cell sorting (FACS) (Fig. 2c), was approximately 20% and equivalent in control and patient iPSCs (Fig. 2d). Thus, at the level of light microscopy and marker gene expression, we observed no abnormalities conferred by PAX2 mutation in human nephron development in this model system.

Identification of PAX2-regulated genes in nephron progenitor cells using RCS-iPSCs and CAGE.
To identify genes that might be important in human kidney development, we examined the influence of PAX2 expression and RCS PAX2 mutations on global gene expression in developing nephrons grown from RCS-iPSCs in vitro using CAGE. In our nephron induction protocol, PAX2 was strongly expressed in control cells on day 14 according to both quantitative reverse transcription-polymerase chain reaction (qRT-PCR) and Western blot analysis (Figs. 2b, 3a, Supplementary Fig. S1). Therefore, we selected day 14 as the time point to evaluate the influence of PAX2 in this study. We purified PAX2-positive cells on day 14 of the cultures by sorting Table 1. Clinical and genetic information of patients with renal coloboma syndrome (RCS). The ocular fundus score was evaluated based on the methods described in previous studies 14 . The score of ocular fundus: Score 0, normal optic disc; Score 1, optic disc dysplasia with an unusual pattern of the retinal vessels and cilioretinal arteries; Score 2, optic disc pit associated with vascular abnormalities and cilioretinal artery; Score 3, large coloboma involving the entire surface of the optic disc; Score 4, large coloboma of the optic disc and adjacent retina or morning glory anomaly (with radial emergence of the retinal vessels) 14 . HD hemodialysis, Cr serum creatinine, Ex exon. www.nature.com/scientificreports/    Table S1, S2). As expected, in cluster analysis using these CAGE data, both promoter and enhancer usage data were clearly divided into two primary clusters, one for the iPSCs and the other for INTEGRINα8+ PDGFRα− nephron progenitor cells after 14 days of culture. Both primary clusters were composed of four subclusters, one for the healthy control and one for each of the three patients ( Fig. 3c,f). Heat map analysis based on the promoter usage patterns showed that 189 promoters were strongly activated in the control cells compared to those in the patient cells ( Fig. 3d, Supplementary Table S3). Gene Ontology (GO) analysis by The Database for Annotation, Visualization, and Integrated Discovery (DAVID) was used to classify these 189 candidate promoters by functional annotation, which revealed 14 transcription factors, 12 extracellular matrix protein genes, 13 extracellular matrix genes, and 7 extracellular matrix component genes (Fig. 3e). Furthermore, heat map analysis based on the enhancer usage patterns showed that 71 enhancers were activated in the control cells compared to those in the patient cells (Fig. 3g).
To identify the best candidates, we focused on 189 activated promoters and not the candidate enhancers because the Functional Annotation of the Mouse/Mammalian Genome (FANTOM) database is best suited for analyzing promoter usage. We searched the FANTOM database for the time course of promoter usage during mouse embryonic kidney development and found 2,122 promoters whose expression correlated with Pax2 expression levels with correlation coefficients > 0.80 (Fig. 4a). Of these, 17 genes were prioritized for validation as PAX2-dependent kidney development genes by merging the gene list established by CAGE for the human nephron culture system and gene list identified by the mouse FANTOM database (Fig. 4b,c). The expression patterns of 17 extracted genes (28 promoters) were similar to that of PAX2 promoter in CAGE data using our human organoid (Supplementary Fig. S2).
Validation of candidate PAX2-dependent genes involved in renal development. To test whether 17 candidate genes were indeed regulated by PAX2, expression was re-evaluated by qRT-PCR in two ways: (1) over time in culture for nephrons developing from iPSCs (Fig. 5) and (2) at different times of in vitro culture for mouse embryonic kidney samples (Fig. 6). In the first approach, we verified that the expression of 14 of 17 genes was attenuated at day 14 of culture in patient INTEGRINα8+ PDGFRα− cells compared to those of healthy controls (Fig. 5a,b, Supplementary Fig. S3). Furthermore, we confirmed the data for unsorted nephron progenitor cell clusters over time in culture for nephrons developing from iPSCs. Detailed inspection of the time course revealed that the gene expression patterns for eight candidates (PBX1, POSTN, ITGA9, LRRC17, HAND2, STAR , MDK, and MEIS1) most closely resembled the PAX2 expression pattern, which is strongly induced between days 11 and 14 of culture in the system and downregulated in RCS-iPSC-derived samples (Fig. 5c,d, Supplementary  Fig. S4).  www.nature.com/scientificreports/ Consistent with this, when mouse embryonic kidney cells were cultured in vitro with the Pax2 inhibitor EG1, we observed a reduction at 72 h of incubation in both the number of ureteric tip branches (Fig. 6a-c) and  www.nature.com/scientificreports/ expression of Pbx1, Postn, and Itga9. In this way, we narrowed down 17 candidate genes based on the expression pattern during the differentiation from iPSCs to day 14 and expression pattern of organ culture using mouse embryonic kidney with Pax2 inhibitor. Thereby, we identified the three corresponding genes (PBX1, POSTN, and ITGA9) as the best PAX2-regulated target gene candidates for the regulation of kidney development (Fig. 6d, Supplementary Fig. S5). Finally, we performed chromatin immunoprecipitation (ChIP)-PCR using differentiated nephron progenitor cells to evaluate whether PAX2 binds to the promoters of three genes (PBX1, POSTN, and ITGA9). ChIP-PCR showed that PAX2 may bind to the promoter regions of these genes (Supplementary Fig. S6).

Discussion
In the present study, we investigated PAX2-regulated genes in human kidney development. We performed sequential and comprehensive analysis using induced kidney lineage cells from RCS-iPSCs with PAX2 mutations. A total of 189 promoters and 71 enhancers derived from our human-based screening method were identified as PAX2-regulated regions. These data would be useful in human kidney development with respect to humanderived samples. To identify the best candidate genes, we narrowed down the 189 candidate promoters using bioinformatic analysis of a mouse database and model of kidney development. Finally, we identified three top genes (PBX1, POSTN, and ITGA9) that are regulated by PAX2 and conserved in humans and mice.
Pbx1 is expressed in the nephrogenic mesenchyme during mouse kidney development. Pbx1 knockout mice have hypoplastic kidneys, poorly defined cortical and medullary regions, and reduced number of differentiating nephrons because of defects in ureteric branching and abnormal expansion of induced mesenchyme 15 . Consistent with this, PBX1 is one of the causative genes of congenital abnormality of kidney and urinary tract in humans 16,17 . Particularly, PBX1 mutations cause bilateral renal hypoplasia and unilateral renal agenesis and deafness and developmental delay 16 . These phenotypes are similar to those of patients with RCS with PAX2 mutations, approximately 10% of whom also have sensorineural hearing loss 18 . Postn is a secreted extracellular matrix protein that is expressed in the renal stroma and ureteric mesenchyme during kidney development. Although Postn is not normally detected in the adult kidney, it is highly induced in renal disease, and its expression is associated with  www.nature.com/scientificreports/ the development or protection of renal lesions 19,20 . Itga9 is known to be involved in cell adhesion. Although its role during kidney development is unclear 21 , it is possible that it is regulated by Pax2 and involved in kidney development via cell adhesion and cell-to-cell interactions. Studies on kidney development have been mainly conducted using mouse models. However, there are differences in gene activity between the mouse and human kidneys during early development 22,23 . For example, nephron and interstitial markers mix and persist into epithelializing nephrons in the developing human kidney, unlike the mouse kidney. This is a critical problem especially for the investigation of Pax2 because Pax2 is involved in regulating the border between nephron and interstitial progenitors in the mouse 24 . Human iPSCs are useful in solving this problem. Previous studies have evaluated iPSC organoids targeting some genes as candidates (e.g., PKD1, PKD2, and PAX2) and confirmed that iPSC organoids could reproduce their phenotype 25,26 . In the present study, we evaluated PAX2-regulated genes during human kidney development, especially nephron formation, using RCS-iPSCs and CAGE. Future work will be needed to further interrogate the importance of genes we have prioritized for roles in human kidney development. The general developmental model or bioinformatic pipeline we have described may be useful in obtaining additional clues for the analysis of human kidney development.
The knowledge of PAX2-regulated genes in this study will be useful to examine the kidney development process and other pathophysiologies involving PAX2. RCS is an autosomal dominant disease. Approximately half of patients with RCS have PAX2 mutations. The causative genes in the remaining half of patients with RCS have not been identified. We previously reported that KIF26B is a novel causative gene in patients with RCS who lack Pax2 mutations by analyzing known genes related to kidney development 14 . The identified candidate PAX2-regulated genes need to be evaluated further to unravel potential disease-causing roles in RCS. We have also reported that patients with PAX2 mutation-positive RCS have more severe symptoms 14 , but the underlying mechanisms have not been defined. Pax2 is also reactivated in proximal tubular cells after kidney injury 14 . Part of the process of kidney regeneration is similar to the developmental stages of renal tubule specification. Although the relationship between kidney development and repair after kidney injury remains controversial 27 , Pax2 also potentially plays a role in kidney repair and regeneration, as it does during kidney development 28 . Therefore, our study may add helpful knowledge to research on the mechanism of RCS and kidney regeneration via PAX2.
Our investigation has some limitations. First, this study focused on nephron progenitor lineages. However, PAX2 is expressed in both nephron progenitor cells and ureteric buds during kidney development. The role of the PAX2 gene in ureteric bud development was not evaluated. Recently, some new methods of inducing ureteric buds have been reported 26,29 . Further investigation focusing on ureteric buds will be required to evaluate the role of PAX2 comprehensively. Second, we alternatively extracted PAX2-positive cells using the surface marker phenotype INTEGRINα8+ PDGFRα− for nephron progenitor cells because the PAX2 antibody for FACS did not work well 30 . The establishment of PAX2 reporter lines by genome editing or single-cell RNA-seq analysis can solve these limitations 31,32 . Third, we couldn't evaluate the transcriptional activity using PAX2 mutation knock-in model except for RCS-iPSCs. We showed the influence of PAX2 mutation to the transcriptional activity based on the transcriptional change using CAGE, ChIP-qPCR, and qPCR alternatively. In addition, our previous report provide us the possibility that PAX2 mutations influence transcriptional activity in the view of their structure 14 . Lastly, iPSC clones differ in their inductive efficiency for various reasons 33 . To reduce the effect of this limitation, we selectively used iPSC clones with the highest induction efficiency of OSR1, which is known as the intermediate mesoderm marker.
In conclusion, we successfully established disease-specific iPSCs from patients with RCS and differentiated them into kidney lineage cells. We identified 189 candidate promoters, 71 candidate enhancers, and 3 conservative genes (PBX1, POSTN, and ITGA9) in humans and mice as PAX2-regulated targets using patient-derived iPSCs and CAGE. These findings can be used to focus future studies on the mechanism of RCS, kidney development, and kidney regeneration. Further human and animal studies will be required to define the mechanism of the pathogenesis of RCS.

Methods
Ethics statement. This study was approved by the Ethics Committees of Kanazawa University and conducted according to the guidelines of the Declaration of Helsinki. Informed consent was obtained from all patients for whom individual information is included in this study in accordance with our institutional research ethics committee (approval number, 2016-046). Animal experiments were approved by the Institute for Experimental Animals, Kanazawa University Advanced Research Center (registration number, AP-143297) and conducted in accordance with the institutional guidelines.
Generation of iPSCs. RCS-iPSCs were established by transfection of patient PMNCs with episomal vectors encoding OCT4, SOX2, KLF4, L-MYC, LIN28, and p53 shRNA as previously described 34 . iPSCs were maintained with feeder-free cultures using StemFit AK02N medium (ReproCELL, Tokyo, Japan) on cell culture plates (Cell-BIND; Corning, NY, USA) coated with Synthemax (Corning). The medium was changed daily, and passage was performed at 80-90% confluence. iPSCs were routinely monitored for mycoplasma contamination. Three iPSC cell lines (585A1, 585B1, and 648A1), which were generated from PMNCs of 30 s healthy Japanese men using the same method, were used as control iPSCs. EB formation. EB formation was performed as previously described 35 . iPSCs were scraped off, dissociated in primate ES medium, and distributed into low-attachment six-well plates (Corning) for floating culture with EB medium (Knockout Dulbecco's Modified Eagle Medium, 20% KSR, 0.1 mM non-essential amino acid solution, 2 mM l-glutamine, 500 U/mL P/S, and 0.5 mM 2-mercaptoethanol; all reagents were obtained from Thermo Fisher Scientific). On day 8, the EBs were transferred into gelatin-coated six-well plates and cultured with EB medium for another 8 days.

Mutation analysis. Cell Lysis Solution (Qiagen, Hilden, Germany) and Protein Precipitation Solution
Teratoma formation. Teratoma formation was performed as previously described 35 . Undifferentiated iPSCs were transplanted into the testes of 8-week-old male CB17-Prkdcscid/L nonobese diabetic/severe combined immunodeficient (NOD-SCID) mice (Charles River Laboratories Japan, Yokohama, Japan). Nine weeks after transplantation, teratomas were dissected from the mice, fixed in 4% paraformaldehyde (PFA) at 4 ℃ overnight, washed three times with phosphate-buffered solution (PBS), and maintained in 70% ethanol. The fixed teratomas were sectioned and stained with hematoxylin and eosin.

Differentiation into nephron progenitors and kidney lineage cells. Three RCS-iPSCs and three
control human iPS cell lines were induced toward nephron progenitors as previously described 9 , with a minor modification. Briefly, human iPSCs were aggregated at 5 × 10 4 cells in V-bottomed 96-well low-cell-binding plates (Sumitomo Bakelite Co., Ltd.) using the medium containing 10 μM Y27632 and 0.5 ng/mL human bone morphogenic protein 4 (BMP4, R&D System). After 24 h (on day 1), the medium was changed to one containing 1 ng/mL human activin A (R&D System) and 20 ng/mL human fibroblast growth factor 2 (FGF2, R&D System). On day 3, the medium was changed to one containing 1 ng/mL human BMP4 and 10 μM CHIR99021 (Wako). On day 9, the medium was changed to one containing 10 ng/mL activin A, 3 ng/mL BMP4, 3 μM CHIR99021, 0.1 μM retinoic acid (Sigma), and 10 μM Y27632. On day 11, the medium was changed to one containing 1 μM CHIR99021, 5 ng/mL human FGF9 (R&D System), and 10 μM Y27632. On day 14, induced aggregates were cultured with NIH3T3 fibroblast expressing Wnt4 cells 36  www.nature.com/scientificreports/ removed. Reads were aligned to human genome hg19 using STAR version 2.5.0a 39 with the following parameters: -outSAMtype BAM SortedByCoordinate-outFilterMultimapNmax 1. The index for STAR was generated based on default settings, except for information on splice junctions obtained from Gencode v27lift37 ("comprehensive") 40 .
Expression analysis of CAGE data. Annotation files for human FANTOM5 promoters 41 were downloaded from http:// fantom. gsc. riken. jp/5/ datafi les/ latest/ extra/ CAGE_ peaks/. Coverage of 5′ positions at the single-base level was calculated using bedtools v2.25.0 42 in a strand-specific manner with the following parameters: genomecov -5 -bg -strand + or genomecov -5 -bg -strand -. bedGraph files were converted to bigWig files using bedGraphtobigWig 43 . Reads mapped to FANTOM5 promoters were counted using bigWigAverageOverBed 43 . We identified differentially expressed promoters between nephron progenitor cells from RCS-iPSCs (n = 9) and controls (n = 3) cultured for 14 days using edgeR version 3.16.5 44 . Briefly, count data were normalized based on relative log expression, and the dispersion values were estimated by the quantile-adjusted conditional maximum likelihood method. Then, exact tests were computed for differences in the means between two groups of negative-binomially distributed counts. P values were adjusted by Benjamini-Hochberg method 45 . For clustering and heatmap analysis, log 2 [counts per million (CPM) plus one] was utilized. Clustering analysis was performed using identified 110,659 promoters (read counts in at least one sample > 0), and the distance was measured by one minus Pearson's correlation coefficient. Heatmap analysis of differentially expressed promoters was performed using gplots version 2.2.1 46 . GO analysis of the differentially expressed genes was conducted using DAVID (https:// david. ncifc rf. gov/ home. jsp) 47,48 . De novo enhancers were identified as previously reported 49 .
Identified enhancers were combined with FANTOM-NET enhancers 50  To count reads mapped to the enhancer regions, a similar analysis was performed as for promoters. Then, the count data were normalized using the normalization factors computed by the count data of promoters. After filtering enhancers with no expression in any sample, differential enhancer expression analysis was performed as described above for differential promoter expression analysis. A prior count of 0.25 was added to counts for each enhancer, and log 2 [CPM] was calculated for clustering and heatmap analysis. Clustering analysis was performed using identified 36,374 enhancers (read counts in at least one sample > 0), and the distance measure was the same as described for the clustering analysis of promoters. The heatmap was generated with differentially expressed enhancers. In this subsection, R version 3.3.3 51 was used for analyzing and plotting CAGE data. Moreover, CAGE data were analyzed using Z-score after log2 conversion. The correlation to the PAX2 promoter was evaluated. CAGE data were available at the DNA Data Bank of Japan (DDBJ) database (accession number DRA009653).
Data mining from FANTOM database. The promoter activities during mouse kidney development were obtained using FANTOM5 Table Extraction Tool. The data were analyzed using Z-score after log2 conversion. Finally, the correlation to Pax2 promoter (indicated as p@chr19:44831187. 0.44831204,+) was evaluated.
ChIP-qPCR. Induced cell aggregates from iPSC at day 14 were dissociated and fixed with fresh 1% formaldehyde (final concentration) for 10 min at room temperature. Simple ChIP Enzymatic Chromatin IP Kit (Cell Signaling Technologies, MA, USA) was used according to the manufacturer's instructions to perform the ChIP experiment. Chromatin from 5 × 10 6 cells was used for each ChIP experiment. After chromatin fragmentation by enzyme treatment and sonication, ChIP-qPCR was performed with three technical replicates. ChIP was performed with rabbit anti-Pax, rabbit anti-histone H3 as a technical positive control (Cell Signaling Technologies, MA, USA), and normal rabbit IgG as a negative control (Cell Signaling Technologies, MA, USA). After reverse cross-linking and DNA purification, purified DNA was used for ChIP-qPCR. The reactions of ChIP-qPCR were performed in the same method as that in the qRT-PCR section. Detailed information on primers and antibodies is provided in Supplementary Table S4 and S5.

Statistical analyses.
The results are presented as mean ± SEM. Unpaired two-tailed Student's t-test was used to compare two groups, and a P-value < 0.05 was considered significant. For multiple group comparisons, one-way ANOVA followed by post hoc correction with Dunnett's test or Bonferroni's test was applied. Statistical analyses were performed using GraphPad Prism 8.0.

Data availability
The data underlying this study are available in the DDBJ database at https:// ddbj. nig. ac. jp/ DRASe arch/ and can be accessed with DRA009653.