Whole-transcriptome analysis of UUO mouse model of renal fibrosis reveals new molecular players in kidney diseases

Transcriptome analysis by RNA-seq technology allows novel insights into gene expression and regulatory networks in health and disease. To better understand the molecular basis of renal fibrosis, we performed RNA-seq analysis in the Unilateral Ureteric Obstruction (UUO) mouse model. We analysed sham operated, 2- and 8-day post-ligation renal tissues. Thousands of genes with statistical significant changes in their expression were identified and classified into cellular processes and molecular pathways. Many novel protein-coding genes were identified, including critical transcription factors with important regulatory roles in other tissues and diseases. Emphasis was placed on long non-coding RNAs (lncRNAs), a class of molecular regulators of multiple and diverse cellular functions. Selected lncRNA genes were further studied and their transcriptional activity was confirmed. For three of them, their transcripts were also examined in other mouse models of nephropathies and their up- or down-regulation was found similar to the UUO model. In vitro experiments confirmed that one selected lncRNA is independent of TGFβ or IL1b stimulation but can influence the expression of fibrosis-related proteins and the cellular phenotype. These data provide new information about the involvement of protein-coding and lncRNA genes in nephropathies, which can become novel diagnostic and therapeutic targets in the near future.


Results
To identify new molecular players in renal fibrosis, high throughput RNA-seq was used in the mouse UUO model. Kidneys of 6 UUO mice (time intervals 2 and 8 days post-ligation) and 4 Sham operated mice (Fig. 1A) were harvested and total RNA was used as input to generate Illumina TrueSeq libraries. Prior to RNA-seq analysis, RNA samples and tissue samples were analyzed to confirm molecular changes indicative of the fibrotic signature ( Fig. 1B; Supplemental Fig. 1 and data not shown 9 ). Libraries were sequenced, low-quality reads and rRNA sequences were filtered, total clean reads were mapped to genome and mapped reads were assembled into putative transcripts (Supplemental Table 1). The number of detected genes per sample as defined by RPKM values (reads per kilobase of exon per million reads) are reported in Supplemental Table 2, while the mean number of identified  In all cases, red color is indicative of up-regulated and green color of downregulated genes. Blue color indicates genes that even though are differentially expressed in statistical significant manner they do not pass the cutoff values of 1 and −1 in log 2 scale (> 2 folds up-regulation and < 0.5 folds down-regulation, respectively) that were utilized to identify up-regulated and down-regulated genes, respectively. The cutoffs are schematically represented as vertical lines in the volcano plots of (A-C). (B) Volcano plots for the comparison between Sham Operated (SO) and 8D ligated (8D) samples. Only the statistically significant genes are represented in the graphs. (C) Volcano plot for the comparison between 2D genes per group, defined by the same means, were 18790, 19572 and 20061 for the Sham Operated, 2D ligated and 8D ligated groups respectively. These data have been deposited in NCBI's Gene Expression Omnibus 10,11 and are accessible through GEO Series accession number GSE79443. (https://www.ncbi.nlm.nih.gov/geo/query/acc. cgi?acc= GSE79443).

Identification of differentially expressed genes during progression of renal fibrosis.
Multidimensional scaling analysis confirmed high correlation and reproducibility among individual samples of each group (Fig. 1C). The bioinformatics analysis was focused on three independent comparisons: 2 days ligated vs Sham operated (SO vs 2D), 8 days ligated vs Sham operated (SO vs 8D) and 8 days ligated vs 2 days ligated (2D vs 8D). Statistically significant (p < 0.05), differentially expressed genes are reported in detail (Supplemental Tables 3-5). We identified 2962 genes differentially expressed in SO vs 2D, 5152 genes in SO vs 8D and 1854 genes in 2D vs 8D. An absolute fold change cutoff value of 1 in log 2 scale was utilized to identify up-regulated and down-regulated genes, respectively, in each independent comparison (cutoffs represented as vertical lines in volcano plots of Fig. 2). Venn diagrams were used to schematically represent similarities between SO vs 2D and SO vs 8D (Fig. 2D-F). Many genes were found only in one of the two comparisons, indicating differential temporal expression ( Fig. 2D-F). Hierarchical clustering (Euclidean distance, Ward linkage) and heatmaps (Fig. 2G) show the expression changes between SO, 2D and 8D. The depicted genes exhibit statistically significant differential expression (Benjamini-Hochberg FDR < 0.05) in at least one of the comparisons. Fig. 2A-C,G are available as interactive links online (http://epigenomics.fleming.gr/metaseqr_runs/HS/000005sub/ and http://epigenomics. fleming.gr/inchlib_heatmaps/HS/BRFAA/Charonis/val_005_r/). Identification of cellular processes, molecular pathways and specific genes involved in renal fibrosis. Gene Ontology (GO) and pathway analyses were carried out to determine molecular processes and biological pathways associated with differentially expressed genes in all comparisons. We associated differentially expressed mRNAs with two structured networks provided by the Genecodis website: biological process (BP) and molecular function (MF) (Supplementary Tables 6-17). The GO analysis further validated our experimental set up, by demonstrating that many biological processes and molecular functions, already associated with fibrosis, are also enriched in our datasets, including cell adhesion, inflammatory response, apoptosis, cell proliferation, cell differentiation, proteolysis, etc (Supplementary Tables 6-17). Interestingly, we were able to identify a large number of individual genes in these categories that have not been previously correlated with renal fibrosis or kidney pathologies, even though the specific GO category/term is strongly associated with this disease. Only in the up-regulated genes for GO term: Cell Adhesion, in SO vs 2D comparison (Supplementary Table 6), we were able to identify 13 genes with very important biochemical/biological functions in other tissues or organs. The list of genes includes Fblim1, Epha2, Wisp1, Parvg, Madcam1, Mybpc2, Cdh3, Gpr56, Troap, Pstpip1, Pcdh8, Nlgn2 and Mfap4, genes that based on Pubmed and Kidney and Urinary Pathway Knowledge Base 12 searches have not been previously associated with renal fibrosis. Their importance is significant as: For example, Fblim1 is involved in the assembly and stabilization of actin-filaments and plays a role in modulating cell adhesion, cell morphology and cell motility 13,14 . Epha2 belongs to the ephrin receptor subfamily of the protein-tyrosine kinase family and regulates cellular morphology, adhesion and migration 15,16 . Wisp1 is a member of the WNT1 inducible signaling pathway and localized in the extracellular matrix, where it acts to regulate adhesion, proliferation and apoptosis [17][18][19] . All these genes and many more in the other GO categories or comparisons (Supplementary Tables 6-17) constitute an extremely rich source for potential target genes in understanding and combating renal fibrosis.
In the same comparison by screening the GO categories containing the term transcriptional regulation, we identified 42 genes encoding for transcription factors or regulators not previously associated with renal fibrosis (as defined by searches in Pubmed and kupkb) that were up-regulated in SO vs 2D comparison; most of these factors were also up-regulated in SO vs 8D comparison, indicative of consistent actions and/or functional roles ( Table 1). Many of these factors play critical roles in other organs' development and homeostasis and in human diseases, suggesting similar roles in renal pathophysiology. Collectively, these data offer a long list of novel candidate genes with potential implications in renal fibrosis. For example, Sox9, which is induced by 55.1 folds in SO vs 2D and 50.4 in SO vs 8D, is a HMG-box class transcription factor and a master regulator of the pancreatic program, liver progenitors, duodenal development, chondrogenesis, sex determination, Sertoli cell differentiation during testis development, craniofacial development and is mutated in campomelic dysplasia, a disorder characterized by skeletal malformations, XY sex reversal, and neonatal lethality [20][21][22] . Runx1, which is induced by 5.1 folds in SO vs 2D and 18.2 in SO vs 8D, is a heterodimeric transcription factor that binds to the core element of many enhancers and promoters, with important regulatory functions in hematopoietic stem/progenitor cells and causal roles in human leukemias as well as solid tumors 23,24 . Uhrf1, induced 11.4 folds in SO vs 2D and 11.5 in ligated (2D) and 8D ligated (8D) samples. Only the statistically significant genes are represented in the graphs. (D) Total statistically significant differentially expressed genes (Dif. Exp.) in the comparisons between Sham operated vs 2D ligated (SO vs 2D) and Sham operated vs 8D ligated (SO vs 8D). (E) Up-regulated statistically significant differentially expressed genes in the comparisons between Sham operated vs 2D ligated (SO vs 2D) and Sham operated vs 8D ligated (SO vs 8D) as defined by the cut-off value of > 1 log 2 fold change. (F) Downregulated statistically significant differentially expressed genes in the comparisons between Sham operated vs 2D ligated (SO vs 2D) and Sham operated vs 8D ligated (SO vs 8D) as defined by the cut-off value of < − 1 log 2 fold change. (G) Hierarchical clustering (Euclidean distance, Ward linkage) and heatmap showing the expression changes between Sham operated, 2D ligated, 8D ligated, expressed as the log2 of normalized and summed exonic read counts in Sham operated, 2D ligated and 8D ligated samples.
SO vs 8D, and Ezh2 (member of the Polycomp repressive complex), induced 2.3 folds in SO vs 2D and 2.01 in SO vs 8D, are both central epigenetic regulators that control chromatin structure and re-organization upon various stimuli. Most importantly these factors regulate gene expression program and cellular behavior, including cell proliferation, migration, differentiation, identity and malignant transformation [25][26][27] . These observations support a potential role of these factors in early chromatin re-organization events upon induction of fibrotic processes in the kidney. Furthermore, several other critical transcription factors have been found to be strongly up-regulated, including immediate early transcription factors, Fosl1, Fosl2, Fos, Jun, JunD, Egr2, Creb5, as well as other factors such as FoxJ1, Ikzf4, Atf5, Sox4 and Sox11. Collectively, these data offer a long list of novel candidate genes for potential implication in renal fibrosis. Interestingly, for many of these proteins, pharmacological agonists and antagonists have been recently developed, raising exciting possibilities for new therapeutic approaches.
Next, by pathway analysis we documented a number of biological pathways characterized as up-or down-regulated in all three comparisons. The predominant 25 up-or down-regulated pathways when comparing   Tables 24-26). In particular, we identified 62 lncRNAs that were differentially expressed in SO vs 2D, 110 lncRNAs in SO vs 8D and 24 in 2D vs 8D (Fig. 4A). Next, lncRNA genes were classified according to their relationship with protein-coding genes in their genomic loci (intronic, bidirectional, intergenic, sense/antisense), since this correlation might have functional importance. Details of the distribution of the lncRNAs are schematically presented in Fig. 4B. The majority of the identified lncRNAs, in all comparisons, were intergenic.
Proximity of a lncRNA gene to a protein-coding gene may be a factor contributing to regulatory mechanisms. Therefore, we first performed extensive bioinformatics searches 100 kb upstream and 100 kb downstream of each of these lncRNA genes for the identification of protein-coding genes. These searches were based on the publicly available data of the UCSC genome browser. The examination of the lncRNA's genomic loci revealed a large number of protein-coding genes that could be potentially affected in cis by the differential expression of the corresponding lncRNAs (listed in Supplemental Tables 27 and 28). In addition, we evaluated these protein-coding genes for similar or opposite patterns of expression with that of corresponding lncRNAs during progression of UUO model, based on our RNA-Seq data. Interestingly, we identified a large number of protein-coding genes with similar or opposite patterns of expression in the genomic vicinity of our differentially expressed lncRNAs (also enlisted in Supplemental Tables 27 and 28).
Following this analysis, and based on proximity of lncRNAs to protein-coding genes with important functions in kidney and/or other tissues as well as on the fold change in their expression levels, we focused on 26 lncRNAs for further analysis. We have chosen the distance of less than 3 kb as a criterion of proximity, mainly due to the general observation that several regulatory elements lay on distances less than 3 kb from their target genes. Regarding the importance, we first sorted out the protein-coding genes in close proximity to our differentially expressed lncRNAs and then by literature search we focused our analysis on genes that were associated with renal pathologies. Expression patterns were analyzed by real-time RT-qPCR (Fig. 4C) and expression levels and fold changes were found in good agreement with the RNA-seq results (Fig. 4C). Although in some cases RNA variation in fold changes was different between real time RT-qPCR and RNA-seq, the expression patterns were coincident between these two techniques (Supplemental Table 29). These data suggest that all 26 lncRNAs are transcribed in renal tissue and their changes in expression levels are either positively or negatively associated with renal pathology in the UUO model. Chromatin immunoprecipitation experiments (ChIP) with antibodies against RNA pol-II and tri-methylation of the fourth lysine (K) residue from the N-terminus of histone H3 (H3K4me3), were performed to specifically evaluate the promoters of lncRNA genes for RNA pol-II recruitment   and H3K4me3, which is tightly associated with transcriptional start sites of actively transcribed genes. This analysis was performed for 10 out of the 26 lncRNAs. In all cases, recruitment of RNA pol-II and H3K4me3 were verified for the promoters of these lncRNAs in renal tissue (3 selected cases are depicted in Fig. 5 and the other 7 in Supplemental Figure 2). In most cases the changes in the expression profile were nicely associated with changes in RNA pol-II recruitment and levels of H3K4me3. In detail, in 6/10 cases (more specifically the cases of 3110045C21Rik, AI662270, RP23-45G16.5, 3110099E03Rik, 330005D01Rik and D630024D03Rik) the changes in the expression profile (up-or down-regulation compared to Sham operated) at both 2D-and 8D-interval were followed by similar changes in RNA pol-II recruitment and levels of H3K4me3. In particular, when a lncRNA was characterized as up-regulated in the 6 cases mentioned we also recorded increased recruitment of RNA-pol II and levels of H3K4me3.
Next, the functional role of 3 of these lncRNAs was investigated. We focused on RP23-45G16.5 transcript, due to its close genomic association with Cdkn1b gene encoding for p27-Kip1 regulator of the cell cycle 28,29 (Fig. 5A,B), 3110045C21Rik, due to its close genomic association with Ddr2 gene with critical roles in ECM and fibrosis 30,31 (Fig. 5A,C), and AI662270 due to an early up-regulation in 2D interval (Fig. 5A,D). Interestingly, the changes in the expression profile of RP23-45G16.5 lncRNA show a positive correlation with the associated protein-coding gene (Cdkn1b) (Fig. 5B), while the profile of 3110045C21Rik shows a negative correlation (Fig. 5C). RP23-45G16.5, 3110045C21Rik and AI662270 lncRNAs are correlated with renal pathology in other animal models for kidney diseases. The expression levels of the three selected lncRNAs were further evaluated in three additional animal models (anti-GBM, renin overexpression and ischemia/reperfusion). In the anti-GBM model, the expression levels were evaluated in SV129 wild type female mice, 8 days after the administration of PBS and/or nephrotoxin (NTS). Changes in the expression profiles of the lncRNAs followed similar pattern with that of the UUO (Fig. 6A-C).

Changes in the expression levels of
In the renin overexpression model, Sv129 10 month-old male mice were used in both wild type (WT) and RenTg background. In two of the lncRNAs cases (AI662270 and RP23-45G16.5) the expression profiles were very similar with those in the UUO model while for the case of 3110045C21Rik no change was observed (Fig. 6).
In the ishemia/reperfusion model, Sv129 male mice subjected to nephrectomy and ischemia for 35 min and reperfusion after 48 h (IR) and wild type (WT) sham operated mice were used. All three lncRNAs expression profiles were very similar to the UUO model (Fig. 6). Collectively, these observations suggest possible involvement of these three lncRNAs in renal pathologies and therefore suggest a more general role for them in kidney diseases. 45G16.5, 3110045C21Rik and AI662270 lncRNAs affect the molecular properties and proliferation of M1 kidney cell line. Based on their involvement in many diverse nephropathy models, we explored the putative functional significance of the altered expression of these 3 lncRNAs. Towards this goal, specific phenotypic changes induced by their overexpression were examined in cultured tubular epithelial cells, using the M1 murine cell line, focusing on a panel of fibrosis-related and cell cycle-related genes. The overexpression of the lncRNAs was evaluated by RT-qPCR (Supplemental Figure 3). We show that each lncRNA affects the expression of selected fibrosis-related genes in a different fashion: the overexpression of RP23-45G16.5 leads to up-regulation of Ccl2 (chemokine (C-C motif) ligand 2, a pro-inflammatory factor), the overexpression of 3110045C21Rik leads to up-regulation of Cdh1 (E-cadherin) and down-regulation of Acta2 (alpha-SMA) and Tgfb1 (Transforming Growth Factor, Beta 1), while overexpression of AI662270 has no statistically significant effect on these genes (Fig. 7A). Regarding cell cycle-related genes, several genes whose products are inhibitors of the cell cycle are significantly up-regulated (Fig. 7A). This finding is further supported by the results from proliferation assays (Fig. 7B), which indicate that at least in our cell system, all three lncRNAs lead to reduction of their proliferative capacity, without affecting cell death (data not shown). Overall, these findings provide evidence that the differential expression of these lncRNAs may exert an important influence on cellular phenotype.

RP23-
Of special interest is the effect of 3110045C21Rik overexpression on Cdh1, Acta2 and Tgfb1, genes known to be involved in fibrosis. In most animal models (UUO, anti-GBM, ischemia-reperfusion) we recorded down-regulation of 3110045C21Rik during the progress of fibrosis (Fig. 6B) while in the same conditions Cdh1 is down-regulated and Acta2 and Tgfb1 are up-regulated. Interestingly when we overexpressed this lncRNA in cell lines, the expression levels of these genes were reversed. Taking into consideration that these changes are critical for the development of fibrosis it was interesting to check whether these can be induced by well-known factors of fibrosis. Therefore, we examined whether exogenous administration of TGFb and IL1b can induce these changes in the expression of 3110045C21Rik. We provide evidence that both factors are not able to cause changes in the expression of this lncRNA (Fig. 8), suggesting that 3110045C21Rik may be critically involved in renal fibrosis, without being upstream regulated by TGFb of IL1b signaling.

Discussion
Next generation sequencing techniques are useful means for detecting novel genes, transcriptional and epigenetic networks. These methodologies have recently been employed in common medical research; however, their use in nephrology is still very preliminary 32,33 and they have not addressed so far the changes in the transcriptomic profile during the development of renal fibrosis.
To address this issue, we performed whole transcriptome analysis combined with bioinformatics analysis in the UUO mouse model of renal fibrosis. We present novel information regarding up-regulated and down-regulated genes, pathways and biological processes associated with the progression of fibrosis. Moreover, we provide lists of differentially expressed protein-coding genes, not suspected so far to be involved in the process, including many transcriptional regulators. We propose here that a subset of these factors may participate in gene regulatory networks involved in renal fibrosis. More specifically, we were able to show strong up-regulation of many transcription factors implicated in signaling and transcriptional cascades (Table 1, i.e. members of the AP-1 transcriptional cascade, Fosl1, Fosl2, Fos, Jun, JunD, Egr2 and Creb5). This pathway regulates the extracellular stimuli-mediated changes in gene expression program of many organs and diseases to mediate cellular changes related to inflammation, proliferation, adhesion, migration, apoptosis, wound healing, viral infections and neuronal plasticity 34 , raising the hypothesis that it may also be involved in transcriptional regulation during progression of renal fibrosis. Therefore, our data provide an initial map to facilitate the understanding of molecular changes in gene expression program that correlate with and mediate renal fibrosis. In a recent report, by Zhou et al. 35 a transcriptomics analysis of Smad3 knock-out mice was performed in UUO injury. The comparison between their data and ours revealed similarities as well as interesting differences, probably due to different time points examined and analytical methodology used.
Thorough analysis of our data led to the association of a long list of non-coding RNA genes with UUO, not previously associated with renal pathology. Specifically, we showed that a large number of lncRNAs are expressed in renal tissue and exhibit altered expression patterns upon progression of renal fibrosis. The differential expression of these new molecular players in UUO model was firmly confirmed and established at the RNA and chromatin organization levels. Furthermore, we validated and documented the alterations in the expression profiles of selected lncRNAs in three other murine models of renal pathologies, and provided initial evidence for a putative functional role of this class of molecular regulators in renal cells. These observations add a new layer of molecular complexity in the understanding of renal fibrosis, not previously anticipated or considered.
In general, the new sequencing technologies and data from ENCODE and FANTOM have revolutionized our view of the organization, activity and regulation of the mammalian genome [36][37][38] , realising that the majority of the genome is transcribed producing vast numbers of formerly unknown regulatory lncRNA species 37,39,40 . LncRNAs are transcribed by RNApol II, can be post-transcriptional processed by 5′ capping, polyadenylation, splicing, RNA editing, and exhibit specific sub-cellular localization 41 . It was very recently shown that lncRNAs participate in the gene regulatory networks for the control of ESC pluripotency, proliferation and metastasis of cancer cells, as well as development and function of various tissues [42][43][44] . They have also been correlated with many human diseases and syndromes, either as biomarkers or causative factors 45,46 . In this respect, the list of novel lncRNAs uncovered in this study represents an important new source for potential regulators of renal pathologies. In the present report we focused on a limited number of lncRNAs that exhibited consistent transcriptional behaviour (up-regulation or down-regulation) in four different animal models of nephropathies, three of them of chronic nature, exhibiting fibrosis and one of acute nature, not associated with fibrosis. It is interesting that when tested on a tubular epithelial cell line, they all exerted different phenotypic alterations upon overexpression, suggesting that they can participate in a rather intricate pattern of regulatory networks during the development of nephropathies. However, it should be stressed that many other members of the identified lncRNAs might be even more interesting and possibly been involved in specific processes characteristic of special/exclusive aspects of each nephropathy.
In conclusion, our data from RNA-seq analysis of a murine model of renal fibrosis provide a list of novel and interesting sequences of coding and non-coding RNAs, which will allow us to better understand the complexity of the pathogenetic mechanisms encompassing but not limited to fibrosis and design more specific, targeted therapeutics in the near future.

Methods
Mouse model of UUO. Eight-to 12-week-old male C57BL/6 mice, were supplied from the colony of the Center of Experimental Surgery of our Institute. Mice underwent ligation of the right ureter and were divided into three groups during the surgery: sham operated, 2 days ligated and 8 days ligated. Before surgery the mice were anesthetized via face mask delivering sevoflurane 47 . The right ureter was exposed through a midline abdominal incision and was either completely obstructed 1 cm below the renal pelvis with 5.0 silk ligature (ligated animals) or manipulated similarly but not ligated (sham operated animals). Two or eight days after surgery the kidneys were collected, rinsed with isotonic saline, dissected and stored in liquid nitrogen for further analysis.
All aspects of animal experimentation were performed in adherence to the NIH and the European Union Guide for the Care and Use of Laboratory Animals and were approved by the Institutional Review Board and the Animal Experimental Committee of the Biomedical Research Foundation of the Academy of Athens.
Other mouse models. The Renin overexpression model is a mouse model of hypertensive nephropathy, where Ren-1 and Ren-2 genes are inserted into a liver-specific locus. The resulting trans-gene expresses renin ectopically at a constantly high level and achieves elevated plasma levels of pro-renin, renin and angiotensin II throughout life. Sustained up-regulation of pro-inflammatory and pro-fibrotic macromolecules is observed, followed by high systemic blood pressure, leading to structural damage and functional decline of the kidney [48][49][50] .
The renal ischemia-reperfusion is a model of acute kidney injury, achieved by clamping the renal artery alone or the pedicle. The procedure usually involves ischemia induction from 15 to 60 min, followed by reperfusion; during the process, changes in coloration of renal parenchyma confirm the procedure's effectiveness 51 .
The nephrotoxic serum-induced glomerulonephritis model is induced by injection of de-complemented heterologous serum containing immunoglobulins against glomerular antigens resulting in an acute inflammatory reaction characterized by immune cell infiltration. This model mimics the rapidly progressive glomerulonephritis in humans 52,53 .