Nature Genetics
23, 348 - 353 (1999)
doi:10.1038/15531
Leukaemia disease genes: large-scale cloning and pathway predictions
Jiayin Li1,5, Haifa Shen1,5, Karen L. Himmel2, Adam J. Dupuy2, David A. Largaespada2, Takuro Nakamura3, John D. Shaughnessy Jr4, Nancy A. Jenkins1 & Neal G. Copeland1
1 Mammalian Genetics Laboratory, ABL-Basic Research Program, NCI-Frederick Cancer Research and Development Center, Frederick, Maryland, USA. 2 Department of Genetics, Cell Biology and Development, University of Minnesota Cancer Center, Minneapolis, Minnesota, USA. 3 PRESTO, JST and Japanese Foundation for Cancer Research, Tokyo, Japan. 4 Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA. 5 These authors contributed equally to this work.
Correspondence should be addressed to Neal G. Copeland (copeland@ncifcrf.gov)
Retroviral insertional mutagenesis in BXH2 and AKXD recombinant inbred mice induces a high incidence of myeloid or B- and T-cell leukaemia1,2 and the proviral integration sites in the leukaemias provide powerful genetic tags for disease gene identification. Some of the disease genes identified by proviral tagging are also associated with human disease3,4,5, validating this approach for human disease gene identification. Although many leukaemia disease genes have been identified over the years, many more remain to be cloned6. Here we describe an inverse PCR (IPCR) method for proviral tagging that makes use of automated DNA sequencing and the genetic tools provided by the Mouse Genome Project, which increases the throughput for disease gene identification. We also use this IPCR method to clone and analyse more than 400 proviral integration sites from AKXD and BXH2 leukaemias and, in the process, identify more than 90 candidate disease genes. Some of these genes function in pathways already implicated in leukaemia, whereas others are likely to define new disease pathways. Our studies underscore the power of the mouse as a tool for gene discovery and functional genomics.
Our IPCR method is a modification of described methods7,8,9 (Fig. 1). Briefly, we cleaved mouse leukaemia cell DNA with a restriction endonuclease that cuts once within the provirus, ligated the cut DNA to form circles and amplified the circles using proviral DNA-specific primers. The sensitivity and specificity of the IPCR is increased by a secondary PCR amplification using a nested set of primers that have dUMP tails to facilitate cloning into plasmid pAMP1. We tailored IPCR conditions so as to make it possible to amplify fragments as large as 12 kb. We then used sequencing primers homologous to the cloning site to sequence approximately 600−700 bp of cellular DNA from each end of the insert. These proviral tagged sequences (PTSs) were then compared with each other to identify common sites (which have independent viral integrations in at least two leukaemias), BLAST-searched against the non-redundant and expressed sequence-tagged (EST) databases to identify coding regions, and used as probes for chromosome mapping to determine whether they co-segregate with any previously mapped candidate disease genes.
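The definition of a common site given above (independent viral integrations in at least two leukaemias) can be illustrated with a minimal sketch. It assumes that each PTS has already been assigned to a locus by sequence comparison or mapping; the function name, the locus labels and the tumour identifiers are hypothetical and are not data from this study.

```python
from collections import defaultdict


def common_sites(integrations, min_tumours=2):
    """Return loci with integrations in at least `min_tumours` distinct leukaemias.

    `integrations` is an iterable of (locus, tumour_id) pairs. Repeat isolations
    of the same site from one tumour count once, because tumours are kept in a set.
    """
    tumours_by_locus = defaultdict(set)
    for locus, tumour_id in integrations:
        tumours_by_locus[locus].add(tumour_id)
    return {
        locus: sorted(tumours)
        for locus, tumours in tumours_by_locus.items()
        if len(tumours) >= min_tumours
    }


# Hypothetical records for illustration only.
example = [
    ("locus_A", "tumour_1"),
    ("locus_A", "tumour_2"),
    ("locus_B", "tumour_1"),
    ("locus_B", "tumour_1"),  # second isolation from the same tumour
    ("locus_C", "tumour_3"),
]
print(common_sites(example))
```

In this toy input only locus_A qualifies, because locus_B was isolated twice from a single tumour and locus_C only once.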
To evaluate this method, we cloned and analysed 237 SacII and 72 BamHI IPCR products from BXH2 myeloid leukaemias, and 82 SacII and 28 BamHI IPCR products from AKXD T- and B-cell lymphomas (Table 1). The number of proviral integration sites amenable to cloning by this method was increased by cloning both 5' and 3' junction fragments and by using more than one endonuclease to cleave leukaemic cell DNA. SacII was chosen because previous studies showed that it cleaves cellular DNA preferentially in CpG islands and enriches for proviruses located near disease genes10. IPCR products representing endogenous murine (Emv) loci, or multiple isolations of the same integration site, are not included in the data presented (Table 2).
Table 2. Common sites of retroviral integration, genes and ESTs identified by IPCR
The power of the IPCR method was demonstrated by the fact that it identified all BXH2 disease genes, and many AKXD B- and T-cell leukaemia disease genes, identified previously by proviral tagging (Table 2). Furthermore, 19 new common integration sites (Table 3) and 39 genes and 35 ESTs that were cloned from a single leukaemia were identified in the screen (Tables 2 and 4). BLAST searches or chromosome mapping identified candidate genes for 12 common sites (Table 3). As expected, SacII IPCR products were more informative than BamHI IPCR products (Table 1). Proviral integrations were often located in or near the 5' or 3' ends of the genes (data not shown), although in many cases this was not possible to predict with certainty due to the lack of coding region information. The effect of proviral integration on gene expression has not been determined.
Consistent with previous studies indicating that the Ras pathway is often deregulated in mouse and human myeloid leukaemias11, one common integration site, Evi18, tagged Rasgrp, encoding a Ras guanine nucleotide exchange factor (GEF); a second, Evi17, tagged Cdc25l, encoding Ras-related protein-1a (Rap1a) GEF (Table 3; Refs 12,13). Although mammalian cells express multiple GEF genes, it is notable that these two GEFs belong to the same subfamily, which is speculated to couple cell-surface receptors that signal through Ca2+ and diacylglycerol to the Ras and Rap1a pathways12,13.
We also found multiple integrations in the 3' UTR (Evi15) and upstream (Evi16) of Sox4, a lymphocyte transcriptional activator14. Another common site, Evi13, tagged the human myeloid leukaemia disease gene Cbfa2, whereas Evi14 and Evi22 tagged the genes encoding cyclin D3 (Ccnd3) and the retinoic acid receptor-γ (Rarg), respectively, two candidate leukaemia disease genes. Evi21 tagged the homeobox gene Hhex. Homeobox genes, or the genes that regulate them, are an important class of leukaemia disease genes6.
Evi19 tagged Hmgcr, encoding an enzyme involved in cholesterol biosynthesis, whereas Evi20 tagged Clabp, a gene for which little functional information is available. A role for these two genes in leukaemia is less clear. One possibility is that the leukaemia genes affected by the Evi19 and Evi20 proviral integrations lie some distance from the viral integration sites and Hmgcr and Clabp are not the 'real' leukaemia genes affected by viral integration15. In future studies, it will be important to validate all candidate disease genes identified in these studies to confirm that they are disease-causing genes. A second possibility is that they are leukaemia disease genes and their role in leukaemia has yet to be elucidated.
Evi23 tagged Sharp1, encoding a basic-helix-loop-helix transcription factor, whereas Evi24 tagged Zfp36, encoding zinc finger protein 36. Transcription factors are attractive candidates for disease-causing genes. Although not much is known about Sharp1, Zfp36 is part of a negative feedback loop that interferes with Tnf-α (encoded by Tnf) production by binding to and destabilizing Tnf mRNA. Zfp36-null mice develop a complex syndrome of inflammatory arthritis, dermatitis, cachexia and autoimmunity in addition to myeloid hyperplasia16,17.
The mouse chromosomal localization of all but one new common site was determined by interspecific backcross (IB) analysis18 or from published data. None of the common sites encoding unknown genes map near published common sites or known or predicted cancer-causing genes, suggesting that these sites will ultimately identify novel disease genes.
We also identified 39 genes that are not yet located at common sites. Although these genes have been identified only once, many belong to gene classes associated with cancer (Table 4). There is also evidence to indicate that at least seven of these genes, Nkbkia, Bcl5, Rara, Hras1, Ccnd2, Spn and Rcpn, are cancer genes. Because these are not located at common sites, however, their role in leukaemia induction should be interpreted with caution.
In many of the leukaemias, the IPCR identified more than one candidate disease gene (Table 2). This was not surprising, as 42% of the IPCR products identified potential cancer genes (Table 1) and a typical BXH2 or AKXD leukaemia contains multiple (3−4) proviral integrations. This kind of genetic information is very valuable, however, because it provides genetic evidence for cooperativity between genes, the caveat being that we analysed primary leukaemias, which in many cases are oligoclonal, rather than cell lines.
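The expectation stated above can be made concrete with a rough binomial calculation. This is only a sketch under strong simplifying assumptions that the study does not make: that every proviral integration in a leukaemia is recovered as an IPCR product, and that each product independently identifies a potential cancer gene with probability 0.42. The function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """Binomial probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 0.42 is the fraction of IPCR products identifying potential cancer genes;
# 3 and 4 are the typical numbers of proviral integrations per leukaemia.
for n_integrations in (3, 4):
    p_multi = prob_at_least(2, n_integrations, 0.42)
    print(f"{n_integrations} integrations: P(two or more candidate genes) = {p_multi:.3f}")
```

Under these assumptions the probability of recovering two or more candidate genes from a single leukaemia is about 0.38 for three integrations and about 0.56 for four, which is consistent with the observation being unsurprising.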
Some of the genes identified by the IPCR can be assigned to pathways already associated with leukaemia, such as the Ras or Hox pathways. These genes include Hras1 and Dusp2. Dusp2 encodes a protein phosphatase that specifically dephosphorylates the mitogen-activated protein kinases ERK1 and ERK2 (ref. 19), two proteins that function downstream of Ras. They also include Rara, Rarg, Top2a and Laptm5, which are known, or can be envisioned, to function in Hox regulation. Yet another gene, Rfng, functions in Notch signalling. The Notch pathway is another important signalling pathway in leukaemia induction20,21,22.
Genes identified by IPCR may also reveal disease pathways that have not been previously implicated in leukaemia induction. One such pathway suggested by the IPCR is involved in cholesterol/lipid metabolism. In addition to Hmgcr, we identified two genes, Ldlr and Ehhadh, involved in cholesterol and lipid metabolism. Ldlr has an important role in cholesterol homeostasis, providing a source of cholesterol for membrane biogenesis and removing low-density lipoprotein from circulation. Ehhadh encodes a β-oxidation enzyme that generates hydrogen peroxide required for peroxisomal biogenesis and lipid metabolism. Increased LDLR and EHHADH activity has been seen in some acute leukaemia23 and human carcinoma24 patients, supporting a role for these genes in disease.
A negative regulator of transcription, Yy1, was also identified by the IPCR. Yy1 is a negative regulator of Hmgcr and Ldlr expression25. It is also possible that mutant Yy1 is leukaemogenic, as Yy1 regulates Myc expression26. Myc is a target of Myb, which is activated by proviral integration in 8−10% of BXH2 acute myeloid leukaemia (AML).
The power of the IPCR method described here is that it is sequence based. PTSs can be compared with each other to look for common sites or BLAST-searched against EST and nonredundant databases to identify coding sequences. BLAST searches have become a very useful tool in recent years for disease gene identification due to the large number of genes contained in current databases as a result of the mouse and human genome projects. PTSs can also be used as probes for chromosome mapping. Human-mouse comparative maps can then be used to look for human disease associations. Mapping can also help to identify disease genes that were not hit by BLAST searches or common integration sites that were missed because the proviral integrations are not tightly clustered, and the different PTSs do not overlap. In the future, it should be possible to use techniques such as radiation hybrid mapping to determine the chromosomal location of all IPCR products and the exact number of PTSs located at common sites.
In the next few years, the complete sequence of the mouse genome will become available. By aligning these PTSs with the mouse sequence, the exact position of every PTS in the mouse genome will become known, as will every gene that maps in the vicinity. This advance should produce a quantum leap in the power of IPCR for disease gene identification. Another advance that we can look forward to is the development of insertional element systems that are capable of inducing cancers other than leukaemia30. Then, methods such as IPCR will provide a truly universal approach for cancer genetics.
Methods
Mice. BXH2 and AKXD RI strains were obtained from The Jackson Laboratory, and aged in our own colony at the NCI-FCRDC.
DNA isolation, BAC screening and Southern-blot analysis. We extracted high-molecular-weight genomic DNAs from frozen mouse tissues as described27. DNA probes used for Southern analysis, BAC screening and IB mapping were derived from restriction fragments or PCR products of IPCR clones and varied from 200 bp to 2 kb. We labelled probes with (α-32P)dCTP by random priming28. We purchased high-density mouse BAC filters and BAC clones (Research Genetics). BAC screening using ExpressHyb (Clontech) was performed according to the manufacturer's protocol. We performed genomic and BAC DNA Southern-blot analysis as described27.
IPCR cloning. Tumour DNA (5 µg) was digested to completion with SacII or BamHI (New England BioLabs) in a total volume of 40 µl. Reactions were stopped by heating at 65 °C for 20 min (SacII) or phenol/chloroform extraction followed by ethanol precipitation (BamHI). Digested DNA was then self-circularized by dilution and ligation using T4 DNA ligase (3,200 U; New England BioLabs) in a total volume of 600 µl at 16 °C for 16 h. Circular DNA was precipitated with ethanol and dissolved in TE (40 µl); 2 µl was used in the primary PCR in a 50-µl reaction volume containing dNTPs (20 nmol each), forward and reverse primers (10 pmol each), 1× buffer 2 and enzyme mix (2.5 U) in the Expand Long Template PCR System (Boehringer). We used an Omnigene Hybaid thermocycler programmed as 92 °C for 2 min, followed by 10 cycles of 92 °C for 10 s, 63 °C for 30 s, 68 °C for 15 min, then 20 cycles of 92 °C for 10 s, 63 °C for 30 s, 68 °C for 15 min with a 20-s auto extension, and a final extension step at 68 °C for 30 min. The amount of primary PCR product was semiquantitated by electrophoresis on a 1% agarose gel and primary PCR product (0.01−1 µl) was used as the template in the secondary PCR reaction under the same conditions, except secondary primers were used. The secondary PCR product was separated on a 1% agarose gel, purified using the QIAquick Gel Extraction Kit (Qiagen) and directly cloned using the CloneAmp pAMP1 System (GibcoBRL) according to the supplied protocol.
In the primary PCR reactions, we used primers: S3'1F (5'−GGCTGCCATGCACGATGACCTT−3') and S3'1R (5'−CGGCCAGTACTGCAACTGACCAT−3') for SacII 3' cloning; S5'1F (5'−GAGGCCACCTCCACTTCTGAGAT−3') and S5'1R (5'−CTCTGTCGCCATCTCCGTCAGA−3') for SacII and BamHI 5' cloning; and S3'1F and B3'1R (5'−CGGGAAGGTGGTCGTCGGTCT−3') for BamHI 3' cloning.
The secondary PCR primers were: S3'2F (5'−CUACUACUACUAGGGAGGGTCTCCTCAGAGTGATT−3') and S3'2R (5'−CAUCAUCAUCAUGGAAAGCCCGAGAGGTGGT−3') for SacII 3' cloning; S5'2F (5'−CAUCAUCAUCAUCCTGCCCCCTCTCCCATAGTGT−3') and S5'2R (5'−CUACUACUACUAGGCGTTACTGCAGTTAGCTGGCT−3') for SacII and BamHI 5' cloning; and S3'2F and B3'2R (5'−CAUCAUCAUCAUGGGGCCCCGAGTCTGTAATTT−3') for BamHI 3' cloning.
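The dUMP tails on the secondary primers can be read directly from the sequences listed above: each primer begins with a 12-nucleotide repeat, either CUACUACUACUA or CAUCAUCAUCAU, followed by the proviral-specific portion. The following sketch separates the two parts and checks that the forward and reverse primers of each pair carry different tails. The helper name and the dictionary layout are hypothetical; the sequences and pairings are copied from the text.

```python
# The two 12-nt uracil-containing tails seen at the 5' ends of the secondary primers.
TAILS = ("CUACUACUACUA", "CAUCAUCAUCAU")

SECONDARY_PRIMERS = {
    "S3'2F": "CUACUACUACUAGGGAGGGTCTCCTCAGAGTGATT",
    "S3'2R": "CAUCAUCAUCAUGGAAAGCCCGAGAGGTGGT",
    "S5'2F": "CAUCAUCAUCAUCCTGCCCCCTCTCCCATAGTGT",
    "S5'2R": "CUACUACUACUAGGCGTTACTGCAGTTAGCTGGCT",
    "B3'2R": "CAUCAUCAUCAUGGGGCCCCGAGTCTGTAATTT",
}

# Primer pairings as stated in the Methods.
PAIRS = {
    "SacII 3' cloning": ("S3'2F", "S3'2R"),
    "SacII and BamHI 5' cloning": ("S5'2F", "S5'2R"),
    "BamHI 3' cloning": ("S3'2F", "B3'2R"),
}


def split_tail(primer):
    """Split a primer into (tail, proviral-specific sequence)."""
    for tail in TAILS:
        if primer.startswith(tail):
            return tail, primer[len(tail):]
    raise ValueError(f"no recognised dUMP tail in {primer!r}")


for use, (fwd, rev) in PAIRS.items():
    fwd_tail, fwd_core = split_tail(SECONDARY_PRIMERS[fwd])
    rev_tail, rev_core = split_tail(SECONDARY_PRIMERS[rev])
    status = "different tails" if fwd_tail != rev_tail else "same tail"
    print(f"{use}: {fwd} core {fwd_core}, {rev} core {rev_core} ({status})")
```

Every pair in the list combines one primer of each tail type, so the two ends of each secondary product carry different tails.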
We detected a minor 2.2-kb product after the secondary PCR in every leukaemic sample analysed when we used 3' SacII primers for amplification. Sequencing studies showed that this product is a virus dimer formed by head-to-tail fusion of viral sequences. The source of this dimer remains unknown.
Southern analysis showed that some of the IPCR products were derived from proviral sequences, which are present in submolar amounts in leukaemic cell DNA. This result was expected, because previous studies showed that a number of these leukaemias are oligoclonal and the IPCR method is very sensitive and capable of amplifying these submolar proviral integrations. Many of these submolar proviral integrations identified genes that are known BXH2 or AKXD leukaemia disease genes, or are good candidates for leukaemia disease genes, demonstrating the value of these submolar proviral integrations for disease gene identification (data not shown).
DNA sequencing. DNA sequencing was performed using the PRISM BigDye Cycle Sequencing kit (Perkin Elmer) on an ABI Model 373A DNA Sequencer (Applied Biosystems). We purchased SP6 and T7 sequencing primers (Gibco BRL).
Chromosome mapping. IB mapping was performed at the NCI-FCRDC as described18.
Sequence comparison. We compared PTSs with each other by MacVector, and with non-redundant and EST databases by BLASTn. Sequences were deemed matching when a PTS unambiguously matched one member of a gene family and had a BLASTn probability value of 10^−25 or less after visual inspection to confirm that the matching segments were free of repetitive DNA. In a few cases, a short exon was the only matching segment between a cDNA and a genomic clone; here, the probability value was often better than 10^−25.
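The matching criteria above can be sketched as a simple filter. This is an illustration only: the record layout, the function name and the example hits are hypothetical, the repeat check (done by visual inspection in the study) is represented by a boolean flag, and "unambiguous" is taken here to mean that exactly one gene passes the cutoff for a given PTS.

```python
from dataclasses import dataclass

P_CUTOFF = 1e-25  # BLASTn probability threshold stated in the Methods


@dataclass(frozen=True)
class Hit:
    pts: str           # proviral tagged sequence identifier
    gene: str          # database entry matched
    p_value: float     # BLASTn probability value
    repeat_free: bool  # stands in for visual inspection for repetitive DNA


def accepted_matches(hits):
    """Map each PTS to its gene when exactly one gene passes all criteria."""
    passing = {}
    for hit in hits:
        if hit.p_value <= P_CUTOFF and hit.repeat_free:
            passing.setdefault(hit.pts, set()).add(hit.gene)
    return {pts: next(iter(genes)) for pts, genes in passing.items() if len(genes) == 1}


# Hypothetical hits for illustration only.
hits = [
    Hit("pts_1", "gene_a", 1e-40, True),
    Hit("pts_1", "gene_a", 1e-30, True),   # second segment of the same gene
    Hit("pts_2", "gene_b", 1e-12, True),   # fails the probability cutoff
    Hit("pts_3", "gene_c", 1e-50, False),  # match lies in repetitive DNA
    Hit("pts_4", "gene_d", 1e-60, True),
    Hit("pts_4", "gene_e", 1e-58, True),   # ambiguous between family members
]
print(accepted_matches(hits))
```

Only pts_1 is accepted in this toy input; the other three are rejected for the reasons given in the comments.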
Received 8 September 1999; Accepted 15 September 1999
REFERENCES
1. Bedigian, H.G., Johnson, D.A., Jenkins, N.A., Copeland, N.G. & Evans, R. Spontaneous and induced leukemias of myeloid origin in recombinant inbred BXH mice. J. Virol. 51, 586-594 (1984).
2. Gilbert, D.J., Neumann, P.E., Taylor, B.A., Jenkins, N.A. & Copeland, N.G. Susceptibility of AKXD recombinant inbred mouse strains to lymphomas. J. Virol. 67, 2083-2090 (1993).
3. Ogawa, S. et al. Structurally altered Evi-1 protein generated in the 3q21q26 syndrome. Oncogene 13, 183-191 (1996).
4. Copeland, N.G. & Jenkins, N.A. Myeloid leukemia: disease genes and mouse models. in Animal Models of Cancer Predisposition Syndromes (eds Hiai, H. & Hino, O.) 53-63 (Karger, Basel, 1999).
5. Roberts, T., Chernova, O. & Cowell, J.K. NB4S, a member of the TBC1 domain family of genes, is truncated as a result of a constitutional t(1;10)(p22;q21) chromosome translocation in a patient with stage 4S neuroblastoma. Hum. Mol. Genet. 7, 1169-1178 (1998).
6. Look, A.T. Oncogenic transcription factors in the human acute leukemias. Science 278, 1059-1064 (1997).
7. Silver, J. & Keerikatte, V. Novel use of polymerase chain reaction to amplify cellular DNA adjacent to an integrated provirus. J. Virol. 63, 1924-1928 (1989).
8. Sorensen, A.B., Duch, M., Jorgensen, P. & Pedersen, F.S. Amplification and sequence analysis of DNA flanking integrated proviruses by a simple two-step polymerase chain reaction method. J. Virol. 67, 7118-7124 (1993).
9. Valk, P.J.M., Joosten, M., Vankan, Y., Lowenberg, B. & Delwel, R. A rapid RT-PCR based method to isolate complementary DNA fragments flanking retrovirus integration sites. Nucleic Acids Res. 25, 4419-4421 (1997).
10. Nakamura, T., Largaespada, D.A., Shaughnessy, J.D.Jr, Jenkins, N.A. & Copeland, N.G. Cooperative activation of Hoxa and Pbx1-related genes in murine myeloid leukaemias. Nature Genet. 12, 149-153 (1996).
11. Shannon, K. The Ras signaling pathway and the molecular basis of myeloid leukemogenesis. Curr. Opin. Hematol. 2, 305-308 (1995).
12. Kawasaki, H. et al. A Rap guanine nucleotide exchange factor enriched highly in the basal ganglia. Proc. Natl Acad. Sci. USA 95, 13278-13283 (1998).
13. Ebinu, J.O. et al. RasGRP, a Ras guanyl nucleotide-releasing protein with calcium- and diacylglycerol-binding motifs. Science 280, 1082-1086 (1998).
14. van de Wetering, M., Oosterwegel, M., van Norren, K. & Clevers, H. Sox-4, an Sry-like HMG box protein, is a transcriptional activator in lymphocytes. EMBO J. 12, 3847-3854 (1993).
15. Jonkers, J. & Berns, A. Retroviral insertional mutagenesis as a strategy to identify cancer genes. Biochim. Biophys. Acta 1287, 29-57 (1996).
16. Carballo, E., Lai, W.S. & Blackshear, P.J. Feedback inhibition of macrophage tumor necrosis factor-α production by tristetraprolin. Science 281, 1001-1005 (1998).
17. Taylor, G.A. et al. A pathogenetic role for TNFα in the syndrome of cachexia, arthritis, and autoimmunity resulting from tristetraprolin (TTP) deficiency. Immunity 4, 445-454 (1996).
18. Copeland, N.G. & Jenkins, N.A. Development and applications of a molecular genetic linkage map of the mouse genome. Trends Genet. 7, 113-118 (1991).
19. Ward, Y. et al. Control of MAP kinase activation by the mitogen-induced threonine/tyrosine phosphatase PAC1. Nature 367, 651-654 (1994).
20. Pear, W.S. et al. Exclusive development of T cell neoplasms in mice transplanted with bone marrow expressing activated Notch alleles. J. Exp. Med. 183, 2283-2291 (1996).
21. Rohn, J.L., Lauring, A.S., Linenberger, M.L. & Overbaugh, J. Transduction of Notch2 in feline leukemia virus-induced thymic lymphoma. J. Virol. 70, 8071-8080 (1996).
22. Ellisen, L.W. et al. TAN-1, the human homolog of the Drosophila notch gene, is broken by chromosomal translocations in T lymphoblastic neoplasms. Cell 66, 649-661 (1991).
23. Vitols, S., Gahrton, G., Bjorkholm, M. & Peterson, C. Hypocholesterolaemia in malignancy due to elevated low-density lipoprotein-receptor activity in tumour cells: evidence from studies in patients with leukaemia. Lancet 2, 1150-1154 (1985).
24. Cable, S. et al. Peroxisomes in human colon carcinomas. A cytochemical and biochemical study. Virchows Arch. B Cell Pathol. Incl. Mol. Pathol. 62, 221-226 (1992).
25. Ericsson, J., Usheva, A. & Edwards, P.A. YY1 is a negative regulator of transcription of three sterol regulatory element-binding protein-responsive genes. J. Biol. Chem. 274, 14508-14513 (1999).
26. Austen, M., Cerni, C., Luscher-Firzlaff, J.M. & Luscher, B. YY1 can inhibit c-Myc function through a mechanism requiring DNA binding of YY1 but neither its transactivation domain nor direct interaction with c-Myc. Oncogene 17, 511-520 (1998).
27. Jenkins, N.A., Copeland, N.G., Taylor, B.A., Bedigian, H.G. & Lee, B.K. Ecotropic murine leukemia virus DNA content of normal and lymphomatous tissues of BXH-2 recombinant inbred mice. J. Virol. 42, 379-388 (1982).
28. Feinberg, A.P. & Vogelstein, B. A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132, 6-13 (1983).
29. Mucenski, M.L., Taylor, B.A., Jenkins, N.A. & Copeland, N.G. AKXD recombinant inbred strains: models for studying the molecular genetic basis of murine lymphomas. Mol. Cell. Biol. 6, 4236-4243 (1986).
30. Luo, G., Ivics, Z., Izsvak, Z. & Bradley, A. Chromosomal transposition of a Tc1/mariner-like element in mouse embryonic stem cells. Proc. Natl Acad. Sci. USA 95, 10769-10773 (1998).
Acknowledgments
We thank N. O'Sullivan for technical assistance; D. Gilbert for help with the IB mapping; and H.C. Morse and M.C. Dean for helpful comments. This research was sponsored in part by the National Cancer Institute, DHHS, under contract with ABL.