HMGA1-pseudogene7 transgenic mice develop B cell lymphomas

We have recently identified and characterized two pseudogenes (HMGA1P6 and HMGA1P7) of the HMGA1 gene, which has a critical role in malignant cell transformation and cancer progression. HMGA1P6 and HMGA1P7 act as microRNA decoys for HMGA1 and other cancer-related genes, upregulating their protein levels. We have previously shown that they are upregulated in several human carcinomas, and that their expression positively correlates with poor prognosis and advanced cancer stage. To evaluate the in vivo oncogenic activity of HMGA1 pseudogenes, we generated a transgenic mouse line overexpressing HMGA1P7. By a mean age of 12 months, about 50% of the transgenic mice developed splenomegaly and accumulation of lymphoid cells in several body compartments. For these mice, FACS and immunohistochemical analyses suggested a diagnosis of B-cell lymphoma, which was further supported by clonality analyses and by the RNA expression profile of the pathological tissues of the HMGA1P7 transgenic mice. Therefore, these results clearly demonstrate the oncogenic activity of HMGA1 pseudogenes in vivo.

A large body of evidence indicates that long non-coding RNAs (lncRNAs) are key modulators of different biological phenomena. Given this scenario, it is not surprising that deregulated expression and aberrant function of lncRNAs are involved in the development of several diseases, including cancer 1 . Among lncRNAs, pseudogenes, a subgroup of genes that arise from protein-coding genes but have lost the capacity to produce proteins, have long been considered non-functional genomic junk 1 . However, recent studies have unveiled important functions of pseudogenes in regulating the expression of their parental genes. Indeed, the majority of the identified pseudogenes have high sequence homology with their protein-coding parental counterparts, enabling them to take part in the post-transcriptional control of their parental genes. The regulation of the parental gene relies on several mechanisms: (i) the generation of endogenous short interfering RNAs (siRNAs) 2,3 ; (ii) the recruitment of regulatory proteins to the parental gene by pseudogene RNAs to control gene expression and chromatin remodelling 4,5 ; (iii) the ability of pseudogenes to compete with the parental genes for RNA-binding proteins and the translation machinery [6][7][8] ; (iv) the ability of pseudogenes to compete with their parental genes for a common pool of shared microRNAs (miRNAs) 9 through the high sequence homology of the 3′ untranslated region (UTR), thus regulating each other's expression as competing endogenous RNAs (ceRNAs) 10 .
The HMGA protein family includes the HMGA1a, HMGA1b and HMGA2 members 11 . The first two are encoded by the same gene through alternative splicing. They have no transcriptional activity per se but, by modifying the chromatin architecture, they are able to positively or negatively regulate the expression of several genes, particularly those involved in cancer progression 11,12 . Consistently, these proteins are expressed at very low levels in normal adult tissues, but are abundant in almost all human malignant neoplasms 11 , and their expression significantly correlates with the capability of cancer cells to metastasize and with poor patient prognosis [13][14][15] . Moreover, in vitro and in vivo models support a causal role of the HMGA proteins in cell transformation and cancer development 11,16,17 . We have recently identified two human HMGA1 processed pseudogenes (HMGA1P6 and HMGA1P7) that are not present in the mouse genome. HMGA1P6 and HMGA1P7 can compete with HMGA1 for miRNA binding, leading to the upregulation of HMGA1 cellular levels and thereby enhancing the malignant features of cells [18][19][20][21][22][23] . The overexpression of these HMGA1 pseudogenes (HMGA1Ps) also increases the levels of HMGA2 and other cancer-related genes, such as EZH2 and VEGF, by inhibiting the suppression of their synthesis 18 . Notably, HMGA1Ps were found to be overexpressed in several human cancer types, supporting their involvement in carcinogenesis 18,[20][21][22][23] . To investigate the role of HMGA1 pseudogene overexpression in vivo, we generated a transgenic mouse model overexpressing HMGA1P7 (HMGA1P7-TG) 18,[22][23][24] . Mouse Embryonic Fibroblasts (MEFs) derived from HMGA1 pseudogene transgenic mice showed a higher growth rate and a later onset of senescence than their wild-type (WT) counterparts 18 .
Here, we report that HMGA1 pseudogene transgenic mice develop haematological neoplasia characterized by monoclonal B-cell populations, most of them diagnosed as large B-cell lymphoma. These results validate the oncogenic role of the HMGA1 pseudogenes 18 .

Results
HMGA1P7 transgenic mice develop lymphoproliferative lesions. Transgenic mice carrying the HMGA1P7 gene were generated by injecting the transgene into C57BL/6N-derived zygotes, which were then transferred into pseudo-pregnant females, as previously described 18 . The expression of HMGA1P7 was assessed in lungs, spleens and kidneys explanted from transgenic mice (Fig. 1).
Interestingly, HMGA1P7 mice showed significantly increased mortality with respect to the WT mice (Gehan-Breslow-Wilcoxon test, p < 0.0001), with a mean age of death of about 52 weeks (Fig. 2A). About 50% of 12-month-old HMGA1P7 transgenic mice displayed splenomegaly at necropsy, whereas WT mice showed no relevant alteration in splenic size or weight (Fig. 2B,C). Histological sections of the HMGA1P7-TG spleens showed a clear distinction between the red and the white pulp. In the red pulp, multiple foci of extramedullary haematopoiesis, as well as hemosiderin-laden macrophages, were frequently observed (Fig. 3A,III). The white pulp showed a moderate expansion with some confluent areas and partial loss of normal structures and germinal centres. In some mice, higher magnification showed a diffuse, monotonous lymphoid population composed of medium-to-large rounded cells with scant cytoplasm, round to oval nuclei and single or multiple prominent nucleoli, often adherent to the nuclear membrane (Fig. 3A,IV). Mitotic activity was medium to high (<10 × 10 HPF). Intriguingly, histopathological analyses revealed monotonous lymphoid cells infiltrating the liver (≈25%), kidneys (≈25%), lung (≈30%), and pancreas (≈20%) (Fig. 3B). Immunohistochemical analysis of the lymphoid component displayed a predominant CD45/B220-positive population intermingled with few, scattered CD3-positive cells (Fig. 3C). Based on morphology and immunophenotype, a diagnosis of large B-cell lymphoma with immunoblastic features was made (human counterpart: DLBCL, immunoblastic variant) 25 .
Furthermore, FACScan analysis of lymphocytes isolated from WT or pathological spleens using the CD3, CD19 and NK anti-mouse antibodies confirmed the immunohistochemical data. The CD19 population was almost doubled, while the CD3 population was decreased, in HMGA1P7-TG mouse spleens in comparison with WT animals (Fig. 4A).
To investigate the clonal status of the accumulation of the CD19 positive population in HMGA1P7-TG mice, genomic DNAs from TG and WT spleens were analysed. As shown in Fig. 4B, only one dominant PCR product was generated by the amplification of the DNA extracted from the transgenic spleens, whereas DNA derived from a WT spleen yielded three prominent PCR products of 1.0, 0.7 and 0.12 kb, corresponding to DJH2, DJH3 and DJH4 Immunoglobulin (Ig) gene rearrangements, respectively 26 .
Taken together, these results indicate that the lymphoid expansion in HMGA1P7-TG mice was monoclonal, further supporting the diagnosis of B-cell lymphoma.
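The reading of the DJH PCR assay described above can be sketched in a few lines of Python. This is only an illustration of the interpretation logic, not the procedure used in this study: the function names, the size tolerance and the "indeterminate" category are assumptions introduced here, while the three expected product sizes (1.0, 0.7 and 0.12 kb) are those reported in the text.

```python
# Expected PCR products in a polyclonal (WT) spleen, in kb, as reported in the text.
POLYCLONAL_BANDS_KB = {"DJH2": 1.0, "DJH3": 0.7, "DJH4": 0.12}


def match_bands(observed_kb, tolerance=0.05):
    """Return the names of the expected DJH products matched by the observed bands.

    `tolerance` (kb) is a hypothetical allowance for gel sizing error.
    """
    matched = set()
    for size in observed_kb:
        for name, expected in POLYCLONAL_BANDS_KB.items():
            if abs(size - expected) <= tolerance:
                matched.add(name)
    return matched


def classify_lane(observed_kb, tolerance=0.05):
    """Classify a gel lane from the sizes of its prominent bands.

    One dominant band suggests a monoclonal population; the full set of
    three expected products suggests a polyclonal population.
    """
    if len(observed_kb) == 1:
        return "monoclonal-like"
    if match_bands(observed_kb, tolerance) == set(POLYCLONAL_BANDS_KB):
        return "polyclonal-like"
    return "indeterminate"


# Hypothetical lanes: a WT-like pattern and a single dominant product.
wt_lane = classify_lane([1.0, 0.7, 0.12])
tg_lane = classify_lane([0.7])
```

In practice a lane would still be judged by eye on the gel; the sketch only makes explicit the rule that a single dominant product is read as clonal expansion.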
Then, we validated the results obtained by RNA-Seq analyses by testing the expression of a panel of deregulated mRNAs in spleens from HMGA1P7 mice by qRT-PCR (Fig. 7). Among the upregulated genes we chose CCAAT/enhancer-binding protein delta (Cebpd), chemokine (C-C motif) ligand 24 (Ccl24), Bcl-2-like 1 (Bcl2l1), Fos and Interleukin 1 Alpha (Il1a); among the downregulated genes we chose BTB and CNC homolog 2 (Bach2). Next, the increased expression levels of Cebpd, Bcl2l1 and Fos were also confirmed by western blot analyses (Fig. 7). Finally, to test whether HMGA1P7 acts through a ceRNA mechanism on the genes deregulated in pathological spleens (Fig. 8A), we inserted the 3′-UTRs of these genes downstream of the luciferase open reading frame. These reporter vectors were transfected into NIH3T3 cells overexpressing or not overexpressing HMGA1P7. As expected, the luciferase activity was markedly increased in the cells that overexpressed HMGA1P7 (Fig. 8B), confirming the ceRNA action exerted by HMGA1P7 on these new targets.
Therefore, on the basis of the FACS and immunohistochemical data combined with the RNA-Seq analyses, we can conclude that the lymphoproliferation in the HMGA1P7 transgenic mice shares transcriptome features with DLBCL of the non-GCB type [43][44][45][46][47][48][49] .

Discussion
We have previously reported that the overexpression of HMGA1Ps accelerates cell proliferation by enhancing the G1-S transition and increases cell migration ability, likely by raising the levels of HMGA1 and other oncogenic proteins such as HMGA2 and EZH2 18 . Moreover, the MEFs obtained from HMGA1Ps transgenic mice showed a reduced proliferation time and a later onset of senescence in comparison with the WT MEFs 18 . Therefore, the aim of this study was to better characterize the transgenic mice overexpressing HMGA1P7 in order to validate its oncogenic activity in vivo. The analysis of HMGA1P7 transgenic mice at 12 months of age shows that about 50% of these mice developed a pathology characterized by splenomegaly and invasion of lymphoid cells into different anatomical sites. The pathological spleens showed a diffuse and monotonous lymphoid population effacing the splenic parenchyma, with loss of typical structures and germinal centres. Neoplastic lymphoid cells were medium to large, rounded, with scant cytoplasm and round to oval nuclei with single or multiple prominent nucleoli.
By immunohistochemistry and FACS analyses, we found that the neoplastic cells were positive for CD45/B220 and CD19, respectively, suggesting a B cell phenotype of the lymphoid cells. The clonality assay on pathological spleens revealed the clonal expansion of the CD19-positive lymphoid population, supporting a diagnosis of B cell lymphoma for these lesions. Interestingly, RNA-Seq analyses performed on spleens derived from WT and HMGA1P7 mice revealed a deregulation of several genes, likely due to HMGA1P7 ceRNA activity. The deregulated genes were involved in inflammation pathways such as the NFKB pathway, IL6/JAK/STAT3 and MTOR signalling, oxidative phosphorylation, and targets of MYC, E2F, STAT3, AP1 and ATF3. Moreover, the spleens from HMGA1P7 mice had a gene expression signature compatible with an induction of senescence and immune escape (Il13ra2, Il1a, Mmp3, Il1b, Pvrl2, Il10, Cd160, Ido1) [35][36][37][38][39][40][41][42] .
Notably, the genes suppressed by BCR inhibitors in DLBCL were found to be significantly enriched in the pathological tissues of HMGA1P7 mice. In particular, the downregulated genes were enriched in transcripts that are decreased in post-GC BCL6-dependent B cell lymphomas and present in the GCB DLBCL signature. Therefore, the transcriptome study of the lymphoproliferative lesions in the HMGA1P7 transgenic mice unveils a pathology compatible with DLBCL of the non-GCB type.
Consistent with the ability of the HMGA1Ps to regulate gene expression by a ceRNA mechanism, bioinformatic analyses showed that several upregulated genes that emerged from the RNA-Seq data share the same microRNA Responsive Elements with HMGA1P7 (i.e. Cebpd, Ccl24, Bcl2l1, Fos, Il1a).
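The idea of two transcripts sharing microRNA Responsive Elements can be illustrated with a minimal seed-match search in Python. This is a simplified sketch and not the bioinformatic analysis performed in this study: it assumes that a site is a perfect match to the reverse complement of miRNA nucleotides 2-8, and all miRNA names and sequences below are invented toy examples.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_site(mirna):
    """Target-site sequence complementary to the miRNA seed (nucleotides 2-8)."""
    return reverse_complement(mirna[1:8])


def mirnas_targeting(utr, mirnas):
    """Names of the miRNAs with at least one perfect seed site in the 3' UTR."""
    return {name for name, seq in mirnas.items() if seed_site(seq) in utr}


def shared_mirnas(utr_a, utr_b, mirnas):
    """miRNAs with seed sites in both transcripts (candidate ceRNA crosstalk)."""
    return mirnas_targeting(utr_a, mirnas) & mirnas_targeting(utr_b, mirnas)


# Toy, invented sequences (not real miRNAs or UTRs).
toy_mirnas = {
    "toy-miR-1": "UACGGAUCCAGUUACGAUCGAA",
    "toy-miR-2": "UGGCAUUCAGCAUGGAUCCUAA",
}
toy_pseudogene_utr = "AAAGAUCCGUAAACCCGAAUGCCUUU"  # sites for both toy miRNAs
toy_gene_utr = "CCCGAUCCGUGGGAAAUUU"                # site for toy-miR-1 only

shared = shared_mirnas(toy_pseudogene_utr, toy_gene_utr, toy_mirnas)
```

Here `shared` contains only `toy-miR-1`, the one miRNA whose seed site occurs in both toy sequences. Dedicated target-prediction tools also weigh site context and conservation, which this sketch ignores.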
Surprisingly, HMGA1 was not upregulated by HMGA1P7 overexpression in the analysed pathological spleens, suggesting that pseudogene-induced lymphomas were based on other molecular targets already described. Moreover, we did not find any change in HMGA1 expression levels during spleen development of HMGA1P7-transgenic mice (data not shown). However, we cannot exclude the possibility of increased HMGA1 protein levels in a limited cell compartment in the initial steps of lymphomagenesis. Altogether, the data presented here show that deregulated expression of the HMGA1P7 pseudogene has an oncogenic role in vivo as well, and that pseudogenes thus represent a new class of genes involved in cancer pathology, as their upregulation occurs frequently in multiple human cancers 50 . An oncogenic role for pseudogenes has already been reported. Indeed, mice engineered to overexpress the full-length murine B-Raf pseudogene Braf-rs1 develop an aggressive malignancy resembling human diffuse large B cell lymphoma through a ceRNA mechanism that elevates BRAF expression 50 .
Notably, preliminary studies on a mouse strain overexpressing the HMGA1P6 pseudogene show that several mice develop a lymphoid pathology characterized by splenomegaly that resembles that found in HMGA1P7-TG mice.
Therefore, our mouse model confirms the oncogenic potential of pseudogenes and provides compelling support for a causal link between altered pseudogene expression and cancer, mediated by a ceRNA mechanism. Studies are in progress to evaluate the expression levels of HMGA1 pseudogenes in human lymphomas. Preliminary results indicate HMGA1P1 overexpression, which could contribute to lymphomagenesis by a similar ceRNA mechanism.

Histology and immunohistochemistry
Light microscopy was performed as previously described 52 . Definition and classification of lymphoid disease were based on criteria reported elsewhere 25 .

Flow cytometric analysis (FACS)
For FACS analyses, spleens were collected from WT and transgenic mice, pressed through a stainless-steel mesh, resuspended in PBS and then in Red Blood Lysing Buffer (Sigma-Aldrich, Saint Louis, MO, USA) for 3 min. After two washes in PBS, lymphocytes (5 × 10^5) were seeded in 96-well round-bottom plates.

Analysis of the clonality of lymphomas
Genomic DNA was extracted from fresh spleens through Phenol/Chloroform/Isoamyl Alcohol Extraction (Thermofisher, Waltham, MA, USA). The obtained DNAs were utilized as PCR templates with DSF and JH4 primers that recognize mouse DNA DJ rearrangement 26 .
DSF primer: 5′-AGGGATCCTTGTGAAGGGATCTACTACTGTG-3′; JH4 primer: 5′-AAAGACCTCCAGAGGCCATTCTTACC-3′.

RNA-Seq analyses
The FastQC tool 56 was used for quality control of the generated raw sequence files (.fastq files). Cutadapt was used to remove the adapter sequences. Paired-end reads were mapped with STAR (version 2.5.2b) 57 to the mm10 reference genome assembly obtained from Ensembl 58,59 . The quantification of transcripts expressed in each replicate of the sequenced samples was performed with the HTSeq-Count algorithm 60 . The differential expression analysis was performed with DESeq2 61 .
Gene Set Enrichment Analysis (GSEA) was used for functional annotation on pre-ranked lists using MSigDB 5.2 62 , the SignatureDB collection 63 and gene sets obtained from different publications 64,65 , applying a false discovery rate (FDR) value < 0.05 as threshold.
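How a pre-ranked gene list for GSEA can be derived from differential expression results is sketched below in Python. The ranking metric is not specified in the text; the sketch assumes one common convention, the sign of the log2 fold change multiplied by -log10 of the p-value. The function names are introduced here for illustration and the numerical values are invented; only the gene symbols come from the text.

```python
import math


def rank_score(log2_fold_change, p_value, floor=1e-300):
    """Signed significance score: sign(log2FC) * -log10(p).

    `floor` avoids log10(0) when a p-value underflows to zero.
    The metric itself is an assumed convention, not taken from the text.
    """
    magnitude = -math.log10(max(p_value, floor))
    return math.copysign(magnitude, log2_fold_change)


def preranked(results):
    """Build a pre-ranked list from (gene, log2FC, p-value) records.

    Returns (gene, score) pairs sorted from most upregulated to most
    downregulated, the ordering expected by a pre-ranked enrichment run.
    """
    scored = [(gene, rank_score(lfc, p)) for gene, lfc, p in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented example values for genes named in the text.
toy_results = [
    ("Bach2", -1.8, 1e-4),
    ("Cebpd", 2.0, 1e-6),
    ("Fos", 1.5, 1e-3),
]
ranked = preranked(toy_results)
ranked_genes = [gene for gene, _ in ranked]
```

With these toy values the order is Cebpd, Fos, Bach2. Other ranking metrics, such as the test statistic reported by the differential expression tool, are equally plausible.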

Statistical analysis
Two-sided unpaired Student's t tests and Mann-Whitney tests were used to analyse data (GraphPad Prism, GraphPad Software, Inc.). P values < 0.05 were considered statistically significant. The mean values ± s.d. were obtained from three or more separate experiments. GraphPad Prism was also used to obtain regression analyses and correlation coefficients.
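The two test statistics named above can be written out in plain Python, which may help in checking values by hand. This is a minimal sketch and not a substitute for the statistics software used in the study: it computes only the statistics, not the p-values, and assumes the equal-variance (pooled) form of the unpaired t test. The function names and sample values are hypothetical.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in both groups.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def mann_whitney_u(a, b):
    """Mann-Whitney U statistic for sample `a`.

    Counts the pairs in which a value from `a` exceeds a value from `b`;
    ties contribute 0.5.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Invented measurements for two groups of three samples each.
group_wt = [1.0, 2.0, 3.0]
group_tg = [4.0, 5.0, 6.0]
t_stat, dof = student_t(group_wt, group_tg)
u_wt = mann_whitney_u(group_wt, group_tg)
```

A two-sided p-value would then be obtained from the t distribution with `dof` degrees of freedom, or from the exact or approximate distribution of U.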