An integrative approach to reveal driver gene fusions from paired-end sequencing data in cancer


Cancer genomes contain many aberrant gene fusions—a few that drive disease and many more that are nonspecific passengers. We developed an algorithm (the concept signature or 'ConSig' score) that nominates biologically important fusions from high-throughput data by assessing their association with 'molecular concepts' characteristic of cancer genes, including molecular interactions, pathways and functional annotations. Copy number data supported candidate fusions and suggested a breakpoint principle for intragenic copy number aberrations in fusion partners. By analyzing lung cancer transcriptome sequencing and genomic data, we identified a novel R3HDM2-NFE2 fusion in the H1792 cell line. Lung tissue microarrays revealed 2 of 76 lung cancer patients with genomic rearrangement at the NFE2 locus, suggesting recurrence. Knockdown of NFE2 decreased proliferation and invasion of H1792 cells. Together, these results present a systematic analysis of gene fusions in cancer and describe key characteristics that assist in new fusion discovery.
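The scoring idea in the abstract, ranking fusion candidates by how strongly their partner genes associate with concepts typical of cancer genes, can be illustrated with a minimal sketch. This is not the published ConSig formula, which the text here does not give. The log-ratio weighting, the pseudocount, the choice to score a fusion by its higher-scoring partner, and all gene and concept names below are assumptions made for illustration only.

```python
from collections import defaultdict
from math import log2


def concept_weights(gene_concepts, known_cancer_genes, pseudocount=1.0):
    """Weight each concept by its enrichment among known cancer genes.

    Assumed weighting: log2 of the smoothed fraction of cancer genes carrying
    the concept over the smoothed fraction of all other genes carrying it.
    """
    cancer = set(known_cancer_genes) & set(gene_concepts)
    background = set(gene_concepts) - cancer
    in_cancer = defaultdict(int)
    in_background = defaultdict(int)
    for gene, concepts in gene_concepts.items():
        counter = in_cancer if gene in cancer else in_background
        for concept in set(concepts):
            counter[concept] += 1
    weights = {}
    for concept in set(in_cancer) | set(in_background):
        cancer_rate = (in_cancer[concept] + pseudocount) / (len(cancer) + 2 * pseudocount)
        background_rate = (in_background[concept] + pseudocount) / (len(background) + 2 * pseudocount)
        weights[concept] = log2(cancer_rate / background_rate)
    return weights


def gene_score(gene, gene_concepts, weights):
    """Average concept weight for one gene; 0.0 if the gene has no annotations."""
    concepts = set(gene_concepts.get(gene, ()))
    if not concepts:
        return 0.0
    return sum(weights.get(c, 0.0) for c in concepts) / len(concepts)


def rank_fusions(fusions, gene_concepts, weights):
    """Rank fusions by the higher-scoring partner (an assumption), best first."""
    scored = [
        (max(gene_score(a, gene_concepts, weights), gene_score(b, gene_concepts, weights)), a, b)
        for a, b in fusions
    ]
    return sorted(scored, reverse=True)


# Toy annotations with hypothetical gene and concept names (not real data).
gene_concepts = {
    "K1": ["kinase", "signaling"],
    "K2": ["kinase"],
    "B1": ["metabolism"],
    "B2": ["metabolism", "structural"],
    "B3": ["structural"],
    "X1": ["kinase", "signaling"],
    "X2": ["structural"],
    "Y1": ["metabolism"],
    "Y2": ["structural"],
}
known_cancer_genes = {"K1", "K2"}

weights = concept_weights(gene_concepts, known_cancer_genes)
ranked = rank_fusions([("Y1", "Y2"), ("X1", "X2")], gene_concepts, weights)
for score, a, b in ranked:
    print(f"{a}-{b}\t{score:.2f}")
```

In this toy setting, the fusion whose partner shares concepts with the known cancer genes ranks above the fusion whose partners carry only background concepts. That mirrors the driver-versus-passenger distinction the abstract describes.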

Figure 1: Exploring cancer-related gene fusions in the context of known molecular interaction networks.
Figure 2: Distinguishing biological features of gene fusions and point mutations in cancer.
Figure 3: Characterizing the genomic imbalances of recurrent gene fusions in acute lymphocytic leukemia.
Figure 4: Discovery and validation of the R3HDM2-NFE2 fusion using the ConSig algorithm and the fusion breakpoint principle.

We thank F. Mitelman for providing the list of fusion genes from the Mitelman database; T.W. Glover for important comments that improved the manuscript; J. Granger for help with editing the manuscript; S. Qin for useful discussions about biostatistics; NCIBI colleagues L. Ke and A. Ade for help implementing tools and technologies; C. Yang for guidance in drug informatics; and Z. Hu of Boston University for help with network visualization. This work was supported by the National Institutes of Health (NIH; U54 DA021519) and an NIH Cancer Biology Training Grant (CA009676-18 to J.R.P.). J.R.P. is a Fellow of the University of Michigan Medical Scientist Training Program. A.M.C. is supported by NIH Early Detection Research Network grant U01 CA111275, DOD W81XWH-09-2-0014, the Doris Duke Foundation and the American Cancer Society.

Author information




X-S.W., G.S.O. and A.M.C. designed the study. X-S.W., J.R.P. and M.A.S. performed bioinformatics analyses. X-S.W., J.R.P., G.C., Q.C., S.M.D., R.P., X.C. and S.V. performed experimental studies. B.H. and N.P. performed FISH analysis. D.G.T., T.J.G. and D.G.B. coordinated the clinical and pathology components. X-S.W., J.R.P., G.S.O. and A.M.C. wrote the manuscript.

Corresponding authors

Correspondence to Gilbert S Omenn or Arul M Chinnaiyan.

Supplementary information

Supplementary Text and Figures

Supplementary Results (PDF 1812 kb)

About this article

Cite this article

Wang, XS., Prensner, J., Chen, G. et al. An integrative approach to reveal driver gene fusions from paired-end sequencing data in cancer. Nat Biotechnol 27, 1005–1011 (2009).
