Hematopoietic stem and progenitor cell-restricted Cdx2 expression induces transformation to myelodysplasia and acute leukemia

The caudal-related homeobox transcription factor CDX2 is expressed in leukemic cells but not during normal blood formation. Retroviral overexpression of Cdx2 induces AML in mice; however, the developmental stage at which CDX2 exerts its effect is unknown. We developed a conditionally inducible Cdx2 mouse model to determine the effects of in vivo, inducible Cdx2 expression in hematopoietic stem and progenitor cells (HSPCs). Cdx2-transgenic mice develop myelodysplastic syndrome with progression to acute leukemia associated with acquisition of additional driver mutations. Cdx2-expressing HSPCs demonstrate enrichment of hematopoietic-specific enhancers associated with pro-differentiation transcription factors. Furthermore, treatment of Cdx2 AML with azacitidine decreases leukemic burden. Extended scheduling of low-dose azacitidine shows greater efficacy than intermittent higher-dose azacitidine, linked to more specific epigenetic modulation. Conditional Cdx2 expression in HSPCs is an inducible model of de novo leukemic transformation and can be used to optimize treatment in high-risk AML.

The caudal-related homeobox gene CDX2 is not expressed in normal hematopoietic stem cells (HSCs), but is expressed in ~90% of acute myeloid leukemia (AML) patients 1,2 , as well as those with high-risk myelodysplastic syndrome (MDS) and advanced chronic myeloid leukemia (CML). Retroviral Cdx2 expression in bone marrow (BM) progenitor cells facilitates in vitro self-renewal and causes a serially transplantable AML in vivo [1][2][3] . CDX2 is thought to be necessary for leukemia growth, as knockdown of human CDX2 by lentiviral-mediated short hairpin RNA (shRNA) impairs growth of AML cell lines and reduces clonogenicity in vitro 1 . These data indicate that aberrant Cdx2 expression may promote HSC transformation to leukemia stem cells (LSCs).
Cdx2 plays a critical role in embryogenesis and early developmental hematopoiesis [4][5][6] . Loss of Cdx2 in murine blastocysts results in lethality at 3.5 days post-coitum 7 . Cdx2 is a critical regulator of the trophectoderm layer, the first cell lineage to differentiate in mammalian embryos 8 . Cdx2 downregulation in embryonic stem cells (ESCs) causes ectopic expression of the pluripotency markers Oct4 and Nanog, while Cdx2 upregulation triggers trophectoderm differentiation. Cdx2 is also essential for in vitro trophoblast stem cell self-renewal, demonstrating a pivotal role for Cdx2 in ESC fate specification [7][8][9][10] . In developmental hematopoiesis, CDX2 and other caudal-related family members (CDX1 and CDX4) are transcriptional regulators of homeobox (HOX) genes [11][12][13] . HOX gene function has been closely linked to self-renewal pathways in ESCs and HSCs, and the reactivation of these pathways by aberrant HOX expression has been implicated in leukemogenesis [14][15][16][17] . Despite this association, evidence of direct interaction between CDX2 and the HOX cluster is lacking 18,19 . CDX2 may also act via non-HOX pathways including via downregulation of KLF4 20,21 . Therefore, understanding targets of CDX2 in hematological malignancy and mechanisms of transformation may provide new opportunities to treat patients with leukemia.
Retroviral overexpression models of oncogenesis provide a powerful tool to study the functional consequences of genetic mutations. However, these models also have limitations, including the ex vivo manipulation of cells and preferential transduction of proliferative progenitor cells rather than long-term HSCs. To overcome these barriers and to understand the mechanism of in vivo transformation of HSCs, we generated a transgenic model of Cdx2 overexpression in hematopoietic stem and progenitor cells (HSPCs) to depict the cellular dynamics of transcriptional deregulation. Ectopic Cdx2 expression in HSPCs results in lethal MDS, characterized by abnormal blood cell counts, dysgranulopoiesis, and thrombocytopenia, followed by secondary transformation to acute leukemia (AL) in a percentage of surviving mice. This is dependent on Cdx2 expression within HSPCs, as myeloid-restricted Cdx2 expression attenuates the phenotype. Unexpectedly, we observe reduced expression of Hox cluster genes and upregulation of differentiation factors in Cdx2 HSPCs, signifying that non-Hox-mediated pathways drive these hematological diseases. Cdx2-driven leukemia is sensitive to azacitidine, with enhanced sensitivity when the drug is administered at a lower dose on an extended schedule than at a higher dose on a shorter schedule. This work provides a model of MDS with stepwise transformation to AML that can be used to provide clinically relevant information for patients with MDS and AML with multilineage dysplasia.

Results
Ectopic expression of Cdx2 alters function of HSPCs. To examine the effects of Cdx2 expression in adult hematopoiesis, we generated a transgenic mouse by insertion of Cdx2 (NCBI gene ID: 12591) and mCherry (Clontech) open reading frames downstream from a CAG promoter and a loxP-flanked stop cassette in the mouse Rosa26 locus of C57BL/6 ES cells (LSL-Cdx2-mCherry, TaconicArtemis). The Cdx2 and mCherry cDNAs were separated by a T2A self-cleaving peptide, which allowed for co-expression of the two proteins after Cre excision of the stop cassette between the loxP sites. Thus, mCherry reported expression of Cdx2 in cells following Cre-recombinase mediated activation ( Supplementary Fig. 1a). The LSL-Cdx2-mCherry mice were crossed to Scl-CreER T mice 22 to generate offspring that can inducibly express Cdx2-mCherry in HSCs following tamoxifen exposure (Scl:Cdx2; Supplementary Fig. 1b). Scl:Cdx2 and control mice (Ctrl; consisting of mice from the genotypes: C57BL/6 wildtype [WT], Scl-CreER T , and LSL-Cdx2-mCherry) were fed a diet of rodent chow containing tamoxifen (400 mg/kg) for two weeks. Cdx2 expression was confirmed in mCherry-positive BM cells by western blot on whole BM and by quantitative reverse transcriptase PCR (qRT-PCR) ( Supplementary Fig. 1c, d). Scl:Cdx2 mice showed mCherry expression by flow cytometry at two weeks, and this rose further by four weeks after tamoxifen (Fig. 1a). To evaluate Cdx2 expression differences between previously published retroviral models 1 and Scl:Cdx2 transgenic cells, we transduced Scl-CreER T lineage-negative BM with MSCV-IRES-GFP (MIG)-Cdx2 and MIG-Empty retrovirus. Retroviral CDX2 overexpression resulted in approximately 900-fold higher CDX2 expression than Scl:Cdx2 transgenic cells, potentially accounting for phenotypic differences ( Supplementary  Fig. 1d).
Cdx2 expression in HSPCs also led to depletion of long-term hematopoietic stem cells (LTHSCs [LKS+ CD150+ CD48−]) and short-term HSCs (STHSCs [LKS+ CD150− CD48−]) in Scl:Cdx2 BM (Fig. 1h, i, Supplementary Fig. 1m). In vitro colony forming cell (CFC) assays with BM cells four weeks after tamoxifen induction demonstrated enhanced colony formation after two weeks (Fig. 1j), with enrichment of mCherry-positive cells at each passage (Supplementary Fig. 1n), but Cdx2 expression did not facilitate in vitro serial replating beyond two weeks. These LTHSCs exclusively harbor self-renewing potential 23 , implying that the cellular mechanism of LTHSC exhaustion might involve enforced cell cycle entry and loss of quiescence. To test this hypothesis, we performed in vivo competitive BM transplantations using Scl-CreER T or Scl:Cdx2 donor BM (expressing CD45.2) from uninduced (naïve) mice, mixed with congenically marked competitor wild type (WT) BM (expressing CD45.1) (Fig. 1k). There was equivalent engraftment of donor cells of both genotypes four weeks after transplantation in the absence of tamoxifen. Induction of Cdx2 expression following intraperitoneal (IP) injection of 5 mg of tamoxifen caused a progressive loss of PB chimerism (Fig. 1l), which was associated with reduced BM HSPC populations (LKS+ and LTHSC) in induced mice transplanted with Scl:Cdx2 BM (Supplementary Fig. 1o-r). Altogether, these data indicate that cell-intrinsic expression of Cdx2 impairs HSC function with reduced capacity to sustain long-term hematopoiesis (Fig. 1m).
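The peripheral blood chimerism readout in the competitive transplant can be illustrated with a short calculation. The sketch below assumes chimerism is reported as the CD45.2 (donor) share of all CD45.1 plus CD45.2 events; the function name and the event counts are hypothetical and are not taken from the study.

```python
def donor_chimerism(cd45_2_events: int, cd45_1_events: int) -> float:
    """Percentage of congenically marked events that carry the donor marker CD45.2."""
    total = cd45_2_events + cd45_1_events
    if total == 0:
        raise ValueError("no CD45.1 or CD45.2 events recorded")
    return 100.0 * cd45_2_events / total


# Hypothetical (donor, competitor) event counts for one recipient over time.
timeline = {
    "week 4": (5200, 4800),
    "week 12": (3000, 7000),
    "week 24": (900, 9100),
}
chimerism = {week: donor_chimerism(d, c) for week, (d, c) in timeline.items()}

for week, pct in chimerism.items():
    print(f"{week}: {pct:.1f}% donor (CD45.2)")
```

A falling donor percentage after tamoxifen, as in these made-up numbers, is the pattern the text describes as progressive loss of PB chimerism.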
Cdx2 expression in HSPCs induces MDS and AL. After tamoxifen induction of Cre-recombinase, Scl:Cdx2 mice developed a variety of hematological diseases including MDS, myeloproliferative neoplasm (MPN) and AL (Fig. 2a, b). Scl:Cdx2 mice had a median survival of 43 weeks, while no disease was seen in Scl-CreER T controls (Fig. 2a, b, Supplementary Fig. 2a). MDS was evidenced by reduced blood counts together with reticulocytosis, fragmented erythrocytes, anisopoikilocytosis, and neutrophil dysplasia (Fig. 2a). MPN was characterized by leukocytosis, reticulocytosis, and hypersegmented neutrophils (Fig. 2a, c, d). AL was diagnosed by >20% blasts in PB and BM (Fig. 2a, Supplementary Fig. 2b), together with leukocytosis, splenomegaly, and anemia (Fig. 2c, d, g). All moribund mice had reduced hemoglobin compared with controls (Fig. 2e), while all Scl:Cdx2 mice (regardless of health state) showed mild to profound thrombocytopenia (Fig. 2f, Supplementary Fig. 2c). All Scl:Cdx2 mice showed a propensity for hypersegmented neutrophils, expansion of Gr1-positive myeloid cells, and a decrease in B220 B cells compared with Scl-CreER T controls (Supplementary Fig. 2d). Approximately 20% of Scl:Cdx2 mice did not develop overt hematological disease (Fig. 2b) aside from thrombocytopenia and neutrophil dysplasia. In mice that developed AL, we observed biphasic disease, with initial MDS (dysplasia, leukopenia and thrombocytopenia; Supplementary Fig. 2h, i) followed by the later onset of leukocytosis, anemia, and increased mCherry+ and c-Kit+ cells in PB (Supplementary Fig. 2j-m). Immunophenotyping revealed distinct leukemia lineage commitments (Fig. 2h). Scl:Cdx2 #252 showed a clonal expansion of c-Kit+ B220int CD3int cells (Supplementary Fig. 2e), Scl:Cdx2 #882 PB leukemic cells were c-Kit+ CD3+ mCherry+, representative of acute T-cell leukemia (Supplementary Fig. 2f), but most mice (#2259, #2261, and #472) developed acute myeloid/erythro-myeloid leukemia with a c-Kit+ mCherry+ population predominantly Gr1+ CD11b− (Supplementary Fig. 2g). The evolution of MDS to AL in Scl:Cdx2 mice (Supplementary Fig. 2h-j) with an expansion of mCherry-expressing c-Kit+ cells (Supplementary Fig. 2k-m) is likely due to the acquisition of transformation events and is consistent with secondary leukemia after MDS observed in patients. The leukemias were transplantable, as irradiated recipient mice phenocopied the primary donor in all cases (example in Fig. 2i) and had shortened survival compared with the primary setting (Fig. 2j), demonstrating rapid expansion of the leukemic clone.
Taken together, these data show that Cdx2 is able to transform HSPC populations in situ into a faithful model of MDS with secondary AML.
Secondary genetic lesions cooperate with Cdx2 expression. AML transformation is mediated through co-operative mutations in genes that confer a proliferative advantage to cells together with pathways that primarily impair cellular differentiation 24 . To determine whether co-operating mutations had contributed to Scl:Cdx2 HSPC full transformation, we performed whole exome sequencing (WES) of three AL samples and one MPN sample. WES was performed on genomic DNA of CD45.2-sorted cells (ie. donor cells) from transplanted leukemic mice. Tumor samples were sourced from mCherry-positive donor cells and compared with germline samples that were mCherry-negative donor cells. We found a number of frameshift and non-synonymous somatic mutations in known tumor-associated genes, including positive (Jak1, Raf1, Zap70) and negative regulators (Pten, Cgref1) of signal transduction, cell adhesion molecules (Fat1), transcription factors (Etv6, Ikzf1, Trp53), and DNA-binding proteins (Nabp2) (Supplementary Table 1). PTEN is a known tumor suppressor commonly altered in human AML 25 , and was mutated in Scl: Cdx2 AML along with ETV6, a recurring fusion partner with CDX2 2,26 . Other AML single nucleotide variants (SNVs) were uncovered in Fat1 and Raf1. Loss-of-function of cadherin-like protein Fat1 and mutations in the Ras effector Raf1 are also previously described in AML 27,28 . Bilineage ALL cells harbored a frameshift insertion in Ikzf1 zinc-finger protein, which is frequently mutated in human B-ALL and to a lesser extent in T-ALL [29][30][31][32] . Mutations in the tyrosine kinase Jak1 (sample #252), are more prevalent in T-ALL than B-ALL and are associated with poor prognosis 33,34 . Finally, Cdx2-induced erythro-myeloid leukemia #472 harbored a loss of heterozygosity (LOH) event in the commonly mutated tumor suppressor gene Trp53. 
To further determine the significance of these SNVs, we confirmed their presence in functional protein domains similar to pathogenic SNVs in human orthologues 35 . We did not observe any SNVs in cancer-associated genes (as listed in the MSK-HemePACT cancer panel and COSMIC 36 ) from Scl:Cdx2 MPN BM, indicating that secondary mutations emerged exclusively in AL.
Using the Beat AML trial cohort 37 , we found significant coexpression of CDX2 and FLT3 in AML patients, as well as increased CDX2 expression in FLT3-internal tandem duplication (ITD)-positive samples compared with FLT3-ITD-negative samples ([…] Fig. 3b, c). We therefore tested whether Scl:Cdx2 mice would accelerate development of AML when crossed with mice harboring Flt3-ITD, a common oncogene in AML 38 (Fig. 3a). Scl:Cdx2/Flt3 ITD/+ double mutant mice had shorter survival and disease latency compared with Scl:Cdx2 mice and Scl/Flt3 ITD/+ alone (Fig. 3b). There was a trend to leukocytosis in some Scl:Cdx2/Flt3 ITD/+ mice compared with controls (Fig. 3c, Supplementary Fig. 3d). Hemoglobin levels showed a wide range across biological replicates; however, there was no significant difference between the means of Scl:Cdx2/Flt3 ITD/+ and controls (Fig. 3d). Scl:Cdx2/Flt3 ITD/+ double mutants showed severe thrombocytopenia (Fig. 3e) and succumbed to advanced MPN (Fig. 3b) characterized by splenomegaly (Fig. 3f) and an increase in Gr1-positive myeloid cells in PB compared with control or single knockin Cdx2 mice (Fig. 3g).
Fig. 1 (legend, panels j-m). j Colony forming cell (CFC) assay of BM cells initially plated (p0) and replated (p1) in M3434 methylcellulose. Each BM sample was plated in triplicate and each data point represents the mean of triplicate plates (Ctrl n = 9; Scl:Cdx2 n = 7). k Diagram of BM transplant experiment setup. Scl-Cre (n = 5) and Scl:Cdx2 (n = 10, split into n = 5 per treatment arm) BM chimeras. Tamoxifen or corn oil (vehicle) was administered by intraperitoneal (IP) injection to indicated groups. l PB chimerism to monitor relative contribution of Scl-Cre or Scl:Cdx2 BM to peripheral hematopoiesis. Experiment was performed in duplicate. Arrow indicates IP injection time point. m Model of Scl:Cdx2 hematopoietic cell hierarchy showing decreases in LTHSC, CMP and MEP leading to a loss of platelets (thrombocytopenia) and erythrocytes (anemia), and a relative increase in GMP resulting in greater levels of myeloid cells: monocytes and granulocytes. N = biologically independent animals. Statistical analyses performed using two-tailed Mann-Whitney test except (l), which used a mixed-effects model with Tukey's multiple comparisons test. Data are plotted as mean values +/− SD. n.s., not significant. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.
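The figure legend names a two-tailed Mann-Whitney test for most group comparisons. As a minimal sketch of what that test computes, the code below derives the U statistic and an exact two-tailed p-value by enumerating all group relabelings. It assumes small groups without ties; the function names and the platelet values are hypothetical, and whether the authors' software used an exact or an asymptotic p-value is not stated in the text.

```python
from itertools import combinations


def u_statistic(x, y):
    """U for group x: pairs where x exceeds y count 1, tied pairs count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_tailed_p(x, y):
    """Exact two-tailed p-value by enumerating every way to relabel the pooled data."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    mean_u = n * m / 2  # expected U under the null hypothesis
    observed = abs(u_statistic(x, y) - mean_u)
    hits = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        # Count relabelings at least as extreme as the observed split.
        if abs(u_statistic(gx, gy) - mean_u) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical platelet counts (x10^9/L) for two small groups.
ctrl_platelets = [980, 1010, 1100]
cdx2_platelets = [310, 420, 500]
u = u_statistic(cdx2_platelets, ctrl_platelets)
p = exact_two_tailed_p(cdx2_platelets, ctrl_platelets)
print(f"U = {u}, exact two-tailed p = {p}")
```

With three animals per group the smallest attainable two-tailed p-value is 0.1, which illustrates why group sizes matter when reading the significance stars in the legend.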
Unexpectedly, there was downregulation of Hox cluster genes in Scl:Cdx2 HSPCs (Supplementary Fig. 5d). Other groups have shown increased expression of Hox genes after enforced Cdx2 overexpression 1,3 ; however, the specific Hox gene targets were non-overlapping. In the context of normal HSC function, decreased Hox gene function is associated with loss of self-renewal 41,42 and progressive downregulation of Hox genes is seen in normal differentiation (Supplementary Fig. 5e) 43 . We therefore hypothesized that Cdx2 may bind to factors that regulate myeloid differentiation, leading to concomitant downregulation of Hox genes in stem cell populations. To understand the regulatory activity of Cdx2 within rare HSPCs, we utilized Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-Seq) on purified Scl:Cdx2 HSPCs vs. controls (Scl-CreER T alone), to identify changes in chromatin accessibility mediated by Cdx2 44 . In total, 62,711 peaks were identified in Cdx2-expressing cells and 28,282 peaks in the Scl-CreER T controls. The majority of chromatin accessible regions (26,099) were shared between both groups. These common regions were dominated by promoter elements whereas condition-specific regions were dominated by distal elements (Fig. 6a, Supplementary Fig. 6b). Within the Scl:Cdx2-specific distal elements, we found the Cdx2 motif (p = 1.6e−34) centrally enriched and also motifs belonging to the CCAAT/enhancer-binding protein family (Cebpb [p = 9.8e−89], Cebpe [p = 1.6e−89], Cebpa [p = 7.5e−83], Cebpd [p = 9.2e−89]) (Fig. 6b), confirmed with another algorithm (HOMER 45 ; Supplementary Fig. 6c). We also compared our data to the publicly available CEBPα ChIP-Seq dataset (GSM1187163) performed in GMP and found a significant overlap in peaks in Scl:Cdx2 BM but not Scl-Cre control samples (Fig. 6c, Supplementary Fig. 6d).
ATAC-Seq provides representation of Cdx2 binding and suggests that Cdx2 expression associates with chromatin changes that increase the accessibility of pro-differentiation myeloid transcription factor binding sites of the CCAAT/enhancer-binding protein family.
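The split of accessible regions into shared and condition-specific sets can be sketched as an interval-overlap problem. The code below is illustrative only: it assumes peaks are half-open (chromosome, start, end) intervals and that any overlap counts as shared, which may differ from the criteria used in the study. The function names and toy coordinates are hypothetical. The final lines simply subtract the reported shared count from each reported total, assuming the shared figure applies to both groups.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def split_by_overlap(peaks_a, peaks_b):
    """Partition peaks_a into (shared, specific) according to overlap with peaks_b."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))
    merged = {c: merge_intervals(iv) for c, iv in by_chrom.items()}
    ends = {c: [e for _, e in iv] for c, iv in merged.items()}

    shared, specific = [], []
    for chrom, start, end in peaks_a:
        hit = False
        if chrom in merged:
            # First merged interval whose end lies beyond this peak's start.
            i = bisect_right(ends[chrom], start)
            hit = i < len(merged[chrom]) and merged[chrom][i][0] < end
        (shared if hit else specific).append((chrom, start, end))
    return shared, specific


# Toy peak sets (hypothetical coordinates).
cdx2_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
ctrl_peaks = [("chr1", 150, 250), ("chr2", 80, 120)]
shared, cdx2_specific = split_by_overlap(cdx2_peaks, ctrl_peaks)

# Arithmetic on the counts reported in the text.
cdx2_only = 62_711 - 26_099  # peaks seen only in Cdx2-expressing cells
ctrl_only = 28_282 - 26_099  # peaks seen only in controls
print(len(shared), len(cdx2_specific), cdx2_only, ctrl_only)
```

Under that assumption, roughly 36,600 regions are specific to Cdx2-expressing cells and about 2,200 to controls, consistent with the text's emphasis on gained distal accessibility.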
Coordinated RNA-Seq and ATAC-Seq data provide evidence of transcriptional and epigenetic reprogramming of leukemic stem cell populations. ATAC-Seq showed enrichment of early myeloid progenitor programs in pre-leukemia samples, with progressive acquisition of committed megakaryocyte erythroid progenitor chromatin architecture in erythroid leukemia, and lymphoid chromatin architecture in lymphoid leukemias, even though these cells retained a stem cell surface immunophenotype ( Supplementary Fig. 6e, f). Furthermore, we used RNA-Seq profiles of each Cdx2-expressing leukemia to identify differentially expressed genes that were upregulated in T-ALL (#882) and B/T-ALL (#252) but not other samples (Supplementary Data 3). Here using the tool Enrichr 46,47 , we found significant enrichment for genes deregulated upon transcription factor alteration in T lymphocytes and T cell leukemia (p < 0.05), again showing lymphoid priming within the stem cell populations (Supplementary Data 4). These data are consistent with a Cdx2-induced transcriptional program priming LKS + towards progenitor cell differentiation. In support of this, RNA-Seq also showed upregulation of Cebp family genes in Scl:Cdx2 LKS+ (representing pre-leukemic HSPC, Supplementary Fig. 5f) in keeping with myeloid differentiation. Interestingly, transformed Scl:Cdx2 LKS+ BM cells from acute leukemic mice showed similar or decreased levels of Cebp gene transcripts compared with control cells, with the sole exception of Cebpb (Supplementary Fig. 5f), suggesting these leukemia cells downregulate effectors of myeloid commitment as a mechanism of transformation.
Next, we performed chromatin immunoprecipitation sequencing (ChIP-Seq) to identify Cdx2 binding sites in hematopoietic cells. We validated MSCV-IRES-GFP-Cdx2 1 tagged with a FLAG epitope (Cdx2-FLAG) by immunoprecipitation with rabbit anti-FLAG monoclonal antibody and confirmed expression and binding of FLAG-tagged Cdx2 in Ba/F3 cells (Supplementary Fig. 6a). We next transduced lineage-negative WT mouse BM with Cdx2-FLAG or empty vector (EV) and performed ChIP-Seq. Cdx2-FLAG ChIP-Seq confirmed strong central enrichment of Cdx2 motifs at peaks in both promoter and distal regions (Fig. 6b). To overcome any dilution of binding signal of Cdx2-expressing HSPCs, we sought to integrate Cdx2-FLAG ChIP-Seq data from lineage-negative cells with ATAC-Seq on LKS+ and publicly available CEBPα ChIP-Seq on GMP. The top 1000 gained peaks in either Scl:Cdx2 or Scl-Cre ATAC-Seq showed correlation with Cdx2-FLAG and CEBPα ChIP-Seq peaks (Fig. 6c), suggesting these cell populations share similar chromatin identity despite immunophenotypic differences. To further functionally assess the relevance of the Scl:Cdx2 gained or lost distal accessible chromatin regions in HSPCs and myeloid progenitors, we analyzed ATAC-Seq and histone methylation marks associated with enhancers (H3K4me1) and ChIP-Seq data of LKS+ (or MPPs), CMPs, and GMPs 43 . Scl:Cdx2 HSPCs had less accessibility at enhancer regions (regions that are ATAC-accessible or histones with H3K4me1 modification) that regulate fate in LKS/MPP cells (Fig. 6c, d; Supplementary Fig. 6g, h). In contrast, Scl:Cdx2 HSPCs show an increase in accessibility of distal enhancer peaks that regulate committed myeloid progenitor cell differentiation (Fig. 6c, d; Supplementary Fig. 6g, h). These data suggest that Cdx2 results in chromatin remodeling at distal enhancers, with a bias towards increased accessibility of enhancers associated with myeloid differentiation and reduced accessibility at enhancers of cell types with self-renewal potential.
Concordantly, ChIP-Seq revealed Cdx2 peaks at HoxA and HoxB loci consistent with Cdx2 binding (Fig. 6e) (Supplementary Fig. 5d). Together these data suggest that Cdx2 represses certain Hox genes and primes HSPCs for myeloid differentiation.
Cdx2 leukemia is sensitive to myeloid disease therapy. Scl:Cdx2-induced MDS with secondary transformation to AML (sAML) is mediated by common oncogenic mutations seen in human disease, and thus, this model provides an opportunity to examine the preclinical efficacy of anti-leukemic drugs. sAML is refractory to standard chemotherapy and is associated with dismal survival. We performed preclinical studies to evaluate the activity and in vivo mechanism of 5-azacitidine (Aza), a clinically approved therapy for high-risk MDS and AML 48 . We evaluated mice that had received secondary transplants from Scl:Cdx2 AML, together with support WT BM cells. Aza treatment commenced once donor engraftment was established, at 2 mg/kg by IP injection daily for one week followed by three weeks of rest (Supplementary Fig. 7a), mimicking the clinical schedule 49 . After one cycle of treatment, we observed a dramatic reduction in WBC counts in Aza-treated mice but not in vehicle-treated controls (Fig. 7a). This was supported by a similarly pronounced reduction in mCherry cells and c-Kit expression in PB of Aza mice (Fig. 7b, c). In all experiments, the leukemia relapsed by the end of cycle one; however, a second cycle of Aza dosing led to reduced leukemic burden and significant improvement in overall survival (Fig. 7d). There was increased apoptosis of Aza-treated Cdx2 cells, showing direct cytotoxicity of Aza on leukemia cells, with minimal effects on WT support cells or vehicle-treated mice (Fig. 7e, Supplementary Fig. 7b). We also compared the standard 7-day regimen (2 mg/kg, 14 mg/kg total per cycle) to a lower dose of Aza administered for 14 days over a 28-day cycle (1 mg/kg, qd, Monday-Friday, 14 mg/kg total per cycle) (Supplementary Fig. 7c), to mimic the prolonged exposure to low-level drug that is seen with oral Aza dosing 50,51 .
Interestingly, greater improvement was seen in mice receiving low exposure, extended duration (LE-ED) with Aza, compared with high exposure, limited duration (HE-LD) Aza (Fig. 7f, g). These data were confirmed in an independent myeloid leukemia model, also driven by Cdx2 (#2261) (Fig. 7h, i), suggesting that dose and scheduling may be relevant in optimizing clinical responses to Aza in MDS/AML. RNA-Seq was performed on mCherry-positive LKS+ cells from mice treated with vehicle vs. HE-LD and LE-ED Aza ( Supplementary Fig. 7d, Supplementary Data 5). LE-ED Aza treatment enriched for gene signatures associated with DNA hypomethylation (Fig. 7j), in accordance with mechanistic changes supported through extended oral dosing of Aza 50,51 . In contrast, HE-LD Aza treatment enriched for DNA damage and apoptosis signatures suggesting cytotoxicity of this regimen (Fig. 7k, Supplementary Fig. 7e, f). Both groups of Aza-treated cells showed significant upregulation of Trp53 and downregulation of Mycn (Fig. 7l), supporting a general mechanism of Aza in the induction of p53 and suppression of cellular proliferation 52,53 . The gene expression changes seen after Aza treatment mimicked the signature found in Cdx2 expressing cells prior to AML transformation (Fig. 7m), suggesting that Aza may revert AML to a pre-leukemic state, and also upregulates Klf4 (Fig. 7l), a gene known to be repressed by Cdx2 and has been shown to have a tumor suppressor function in AML 21 . Altogether, these data demonstrate the preclinical efficacy of Aza in MDS/AML and suggest that extended schedules of low-dose therapy may have improved efficacy compared with standard regimens.
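The two Aza regimens deliver the same cumulative dose per kilogram and differ only in how it is spread across the 28-day cycle. The sketch below lays both out as day-by-day plans. It assumes the cycle starts on a Monday and that the 14 low-dose injections fall on the first 14 weekdays; the text does not spell out the exact calendar, so this layout and the function names are assumptions.

```python
CYCLE_DAYS = 28


def he_ld_schedule(dose_mg_per_kg=2.0, dosing_days=7):
    """High exposure, limited duration: daily dosing on consecutive days, then rest."""
    return [dose_mg_per_kg if day < dosing_days else 0.0 for day in range(CYCLE_DAYS)]


def le_ed_schedule(dose_mg_per_kg=1.0, n_doses=14):
    """Low exposure, extended duration: weekday dosing until n_doses are given."""
    plan, given = [], 0
    for day in range(CYCLE_DAYS):
        is_weekday = day % 7 < 5  # day 0 is assumed to be a Monday
        if is_weekday and given < n_doses:
            plan.append(dose_mg_per_kg)
            given += 1
        else:
            plan.append(0.0)
    return plan


he_ld = he_ld_schedule()
le_ed = le_ed_schedule()

total_he_ld = sum(he_ld)  # cumulative mg/kg per cycle
total_le_ed = sum(le_ed)
last_he_ld_day = max(i for i, d in enumerate(he_ld) if d > 0) + 1
last_le_ed_day = max(i for i, d in enumerate(le_ed) if d > 0) + 1

print(f"HE-LD: {total_he_ld} mg/kg over days 1-{last_he_ld_day}, peak {max(he_ld)} mg/kg/day")
print(f"LE-ED: {total_le_ed} mg/kg over days 1-{last_le_ed_day}, peak {max(le_ed)} mg/kg/day")
```

Because the totals match, any difference in response between the arms reflects scheduling and peak exposure rather than the amount of drug given.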

Discussion
Transcriptional deregulation is a common leukemic mechanism that is thought to perturb cellular self-renewal and differentiation by modifying developmental cues. CDX2 is essential for ESC fate determination and is aberrantly expressed in myeloid malignancy. We generated a conditional transgenic mouse model of Cdx2 activation and characterized the de novo phenotype of Cdx2 expression in various hematopoietic subsets. Mice expressing Cdx2 in HSPCs develop lethal hematological diseases with prominent features of MDS and subsequent transformation into AL. The development of AL shows a long clinical latency with stepwise acquisition of oncogenic mutations, suggesting that Cdx2 expression predisposes cells to a pre-leukemic state that is permissive for the accumulation of cooperating secondary genetic events. This closely reflects the progression of human MDS to AML, where stepwise genetic mutations occur within HSPCs, and identifies these immature populations as the reservoir for leukemia-initiating activity in vivo. Importantly, this model allows temporal control of Cdx2 expression within HSCs, leading to in situ transformation of HSPCs to LSCs, thereby eliminating the confounding effects of ex vivo manipulation of HSPC populations and retroviral models.
In humans, ectopic CDX2 expression is described in AML but also approximately 80% of newly diagnosed ALL or pediatric ALL 54,55 , underscoring the clinical relevance of this model. Unexpectedly, our model shows strong downregulation of Hox factors in fully transformed leukemia, which contrasts with other studies 3 , and suggests that Cdx2 can activate a number of discrete oncogenic pathways for leukemogenesis. We suggest that CDX2 expression correlates differently with HOX expression in different contexts. For example, expression levels of CDX2 are comparable in ALL and AML samples 3 , however HOX deregulation is much less common in ALL than AML. Furthermore, in embryogenesis, Cdx2 coordinates posterior development via Hox-independent mechanisms 56 . In keeping with other publications 21 , we frequently observed repression of Klf4 in all cases of AL.
When Cdx2 is expressed in HSPCs, mice show a propensity to develop secondary mutations followed by the development of a range of ALs of varying lineages. Conversely, when Cdx2 expression is restricted to myeloid cells in LysM:Cdx2 mice, there is a more homogeneous phenotype, typified by myelocytic expansion, leukocytosis, and splenomegaly, but without the thrombocytopenia that is hallmark to Scl:Cdx2 mice. Transformation to leukemia was not observed in this model, consistent with the hypothesis that HSPCs represent a leukemia-initiating seed population that is required for full disease penetrance.
As mCherry is not observed in LysM:Cdx2 MEPs, platelet numbers are not affected in these mice. In contrast, hypersegmented neutrophils are present in both LysM:Cdx2 and Scl:Cdx2 models, suggesting that Cdx2 expression within GMP cells is key to this phenotype. In addition, Cdx2 expression at the HSPC level is also seen to affect lymphoid lineages, highlighting the multipotent nature of Scl:Cdx2 cells. Emphasizing this, we observe a key regulatory role of aberrant Cdx2 on common hematopoietic developmental pathways.
BM characterization of moribund pre-leukemic Scl:Cdx2 mice shows a relative decrease in MEPs and a relative increase in GMPs compared with controls. This phenotype may represent differentiation arrest at this cellular level and is consistent with reports from human high-risk MDS patients 57 , such as those with MDS with excess blasts. Initially, we observed a modest increase in in vitro colony formation of Scl:Cdx2 BM but no immortalization. In competitive BM transplant assays, Scl:Cdx2 cells had cell-autonomous HSC self-renewal defects. These data indicate that Cdx2 leads to impaired clonogenicity, a trait that is similar to other animal models of MDS mutations 58 .
Pathological changes in high-risk MDS cells may confer apoptotic resistance and provide growth and survival advantages, leading to leukemia progression 59 . This is consistent with the observation that apoptosis rates are elevated in low-risk MDS and decrease in high-risk MDS and AML 60,61 . We observe mild abnormalities in apoptosis induction in Scl:Cdx2 HSPCs, and these cells are prone to enhanced cycling. It is thought that arrest in G1 ("proliferative quiescence") is critical for cell fate decisions 62 , and that commitment to self-renewal or proliferation is determined in G1 phase by G1 cyclins 63 . Given the importance of maintaining the balance between self-renewal and differentiation in HSPCs and LSCs, thorough investigation of these aberrant processes in Scl:Cdx2 cells may identify key regulators of leukemia evolution.
We did not observe leukemia cooperativity between Scl:Cdx2 and Flt3 ITD/+ models. Transgenic single mutant models of known oncogenes are frequently observed to be insufficient for […]
Advanced MDS and leukemic transformation have traditionally been challenging to model in animals. For this reason, studies into the use of azacitidine in MDS have largely come from primary patient samples 67 . In transplant experiments of Scl:Cdx2 secondary leukemias, we find that Aza prolongs survival of mice compared with vehicle-treated controls, and Aza is preferentially toxic to Cdx2-mCherry-positive cells. Using dosing schedules comparable to CC-486 oral Aza regimens used in human clinical trials 50,51 , Aza appears to be more effective and more specific for hypomethylating genes when administered on a lower-dose, extended schedule compared with a higher-dose, limited schedule. These preclinical findings warrant follow-up clinical trials, for example, through the use of extended schedules of oral Aza in patients with MDS who do not respond to standard Aza. Our data suggest that Aza alone is insufficient to deplete LSCs, as all mice relapsed after 1-2 cycles of treatment. This is consistent with the clinical scenario, and it is likely that combination strategies (for example, with venetoclax 68 ) may be required to induce meaningful long-term remissions.
Altogether, this work characterizes a model of conditional Cdx2 expression that demonstrates transformation of normal HSPCs to MDS and AL in situ. Cdx2 alters HSPC identity and confers pre-leukemic progenitor cell characteristics, facilitating clonal evolution with important biological correlates of human leukemia. This model can be used to study the clinical effects of Aza, and demonstrates that prolonged, low doses of hypomethylating agents may increase specificity and efficacy of these agents against MDS and AML.

Methods
Animals and phenotypic analysis. Experimental animals were maintained on a C57BL/6J strain in a pathogen-free animal facility and procedures were approved by the QIMR Berghofer Animal Ethics Committee (A11605M). Mice were housed in clean cages with shredded tissue as nesting material, and environmental enrichment provided as often as possible. Cages were maintained at an ambient temperature of 20-26°C on a 12 h light/dark cycle. LSL-Cdx2-mCherry mice were generated by TaconicArtemis. Flt3 ITD/+ mice 38 were obtained from Dr. Wallace Langdon, Perth. Scl-CreER T mice 22 were obtained from Dr. Carl Walkley, Melbourne. LysM-Cre mice were obtained from Jackson Laboratories. Azacitidine was dissolved in 0.9% saline by vortexing for 60 seconds and injected intra-peritoneally within two hours. Any remaining solution was discarded after use due to shortterm stability of the drug. Peripheral blood (PB) was collected by retro-orbital venous blood sampling into EDTA-coated tubes and analyzed on a Hemavet 950 analyzer (Drew Scientific). PB smears were prepared and stained with Wright-Giemsa (BioScientific) according to the manufacturer's protocol. Twenty microliters of fresh PB was lysed with 1 mL Pharmlyse (BD Biosciences) and stained with B220, CD33, Gr1, Mac1, and c-Kit for 15-30 min at 4°C. Flow cytometric data collection was performed on a fluorescence-activated cell sorter LSRII Fortessa (BD Biosciences) with BD FACSDiva software (version 8.0.1) and analyzed using FlowJo (version 9.9.6). Flow cytometry antibodies were used at 1:100 dilution unless otherwise specified (Supplementary Table 5). BM cells were harvested by flushing femur and tibia bones. LKS + (Lineage low cKit + Sca1 + ) cells were stained as previously described 69 . In brief, cells were stained with a lineage cocktail comprising of biotinylated antibodies (B220, CD3e, CD5, Gr1, Mac1, Ter-119). Cells were then stained with Streptavidin, c-Kit and Sca1. 
Common myeloid progenitors (CMP), granulocyte-macrophage progenitors (GMP) and megakaryocyte-erythroid progenitors (MEP) were identified with the addition of CD34 and CD16/32. Short-term (ST-HSC) and long-term hematopoietic stem cells (LT-HSC) were stained with the addition of CD48 and CD150. Incubations were performed for 20-30 min at 4°C. For sorting, cells were purified using a FACSAriaIII (BD Biosciences). Cell cycle analysis was performed by staining cells with surface markers for LKS+ followed by fixation and permeabilization according to the manufacturer's instructions (Fix & Perm kit, Thermo Fisher). Cells were stained with Ki-67 (B56) (1:100) in permeabilization buffer for 30 min at 4°C. Cells were washed and resuspended in PBS with Hoechst 33342 (20 μg/mL, Invitrogen) prior to flow cytometry analysis. Events were acquired at <1000 events/s. Apoptosis analysis was performed by staining cells for LKS+ markers, with incubation times kept to 15 min to minimize cell death. Washed cells were then stained with 2.5 μL Annexin V (Biolegend) in 50 μL Annexin V binding buffer (BD Biosciences) (1:20) for 15 min in the dark at room temperature. Cells were not washed, and 250 μL of Annexin V binding buffer containing 0.25 μL of Sytox blue (Invitrogen) was added. Cells were analyzed by flow cytometry within one hour.
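The staining dilutions quoted above follow from simple volume ratios, and the short Python sketch below reproduces that arithmetic. It is an illustration only: the function names are hypothetical, and it assumes a dilution of 1:N means reagent volume divided by the stated buffer volume, which is how the 1:20 Annexin V figure is reported here.

```python
def dilution_factor(reagent_ul: float, buffer_ul: float) -> float:
    """Return N for a 1:N dilution, taken as buffer volume / reagent volume.

    Assumption: the stated buffer volume is treated as the final volume,
    matching the 2.5 uL in 50 uL = 1:20 convention used in the text.
    """
    if reagent_ul <= 0 or buffer_ul <= 0:
        raise ValueError("volumes must be positive")
    return buffer_ul / reagent_ul


def reagent_volume_ul(buffer_ul: float, factor: float) -> float:
    """Return the reagent volume (uL) needed for a 1:factor dilution."""
    if buffer_ul <= 0 or factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    return buffer_ul / factor


# Values taken from the staining steps described above.
annexin_factor = dilution_factor(2.5, 50)    # Annexin V in binding buffer
sytox_factor = dilution_factor(0.25, 250)    # Sytox blue in binding buffer

# Hypothetical example: antibody volume for a 100 uL stain at 1:100.
antibody_ul = reagent_volume_ul(100, 100)
```

The same helper can be used to scale a staining mix to a different sample volume while holding the stated dilution constant.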
Colony forming assay. BM cells were washed with PBS and seeded into 1 mL of methylcellulose (M3434; Stem Cell Technologies) in 35 × 10 mm dishes (Corning). A total of 1 × 10³ BM cells were plated in triplicate and cultured at 37°C. Colonies were counted after 7 days, prior to passage.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The RNA-Seq datasets generated and analysed during the current study are available in the GEO database (https://www.ncbi.nlm.nih.gov/geo) under the SuperSeries accession number GSE133829 (GSE133679, BM LKS four weeks after tamoxifen treatment; GSE133680, Cdx2-mediated AML treated with Azacitidine or vehicle; GSE133828, Cdx2-mediated acute leukemia BM LKS). Publicly available datasets published by Lara-Astiaso et al. 43 were obtained from the GEO database, accession numbers GSE60101 (RNA-Seq) and GSE59992 (ATAC-Seq). The publicly available dataset for Cebpa ChIP-Seq on mouse GMP was obtained from the GEO database, accession number GSM1187163.
The ChIP-Seq dataset generated during this study on Cdx2-FLAG-transduced mouse BM is available under accession number GSE146598. The ATAC-Seq dataset generated from BM LKS four weeks after tamoxifen treatment can be accessed here: https://genome.ucsc.edu/s/JasminS/VU_2019_CDX2_ATAC. Whole exome sequencing (WES) datasets generated from Cdx2 mouse BM are available at the Sequence Read Archive (SRA) under accession number PRJNA552223. Whole exome sequencing data for Supplementary Fig. 3a and Supplementary Table 1 can be found in Supplementary Data 1. RNA-Seq data for Fig. 5a-h and Supplementary Fig. 5d-f can be found in Supplementary Data 2. RNA-Seq data for Fig. 7j-m, Supplementary Fig. 5f and Supplementary Fig. 7d-g can be found in Supplementary Data 5. All other data supporting the findings of this study are available within the article and its supplementary information files and from the corresponding authors upon reasonable request.
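For readers who wish to locate the GEO records listed in this section, the following Python sketch builds the record-page URL for each accession. It is a convenience illustration, not part of the study's pipeline: the dictionary labels and the function name are hypothetical, and the only external assumption is the standard GEO accession display endpoint (acc.cgi with an acc query parameter). No network request is made.

```python
import re
from urllib.parse import urlencode

# Standard GEO accession display page (assumed endpoint for record lookup).
GEO_ACC_ENDPOINT = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# Accessions as stated in this section; the labels are descriptive only.
GEO_ACCESSIONS = {
    "GSE133829": "SuperSeries, RNA-Seq (this study)",
    "GSE133679": "BM LKS four weeks after tamoxifen treatment",
    "GSE133680": "Cdx2-mediated AML treated with Azacitidine or vehicle",
    "GSE133828": "Cdx2-mediated acute leukemia BM LKS",
    "GSE146598": "Cdx2-FLAG ChIP-Seq (this study)",
    "GSE60101": "Lara-Astiaso et al. RNA-Seq",
    "GSE59992": "Lara-Astiaso et al. ATAC-Seq",
    "GSM1187163": "Cebpa ChIP-Seq on mouse GMP",
}


def geo_record_url(accession: str) -> str:
    """Return the GEO record-page URL for a series (GSE) or sample (GSM)."""
    if not re.fullmatch(r"GS[EM]\d+", accession):
        raise ValueError(f"not a GSE/GSM accession: {accession!r}")
    return f"{GEO_ACC_ENDPOINT}?{urlencode({'acc': accession})}"


if __name__ == "__main__":
    for acc, label in GEO_ACCESSIONS.items():
        print(f"{acc}\t{label}\t{geo_record_url(acc)}")
```

The SRA project PRJNA552223 and the UCSC session link are not GEO accessions and are deliberately left out of this helper.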