Diversity of TMPRSS2-ERG fusion transcripts in the human prostate


TMPRSS2-ERG gene fusions have recently been reported to be present in a high proportion of human prostate cancers. In the current study, we show that great diversity exists in the precise structure of TMPRSS2-ERG hybrid transcripts found in human prostates. Fourteen distinct hybrid transcripts are characterized, each containing different combinations of sequences from the TMPRSS2 and ERG genes. The transcripts include two that are predicted to encode a normal full-length ERG protein, six that encode N-terminal truncated ERG proteins and one that encodes a TMPRSS2-ERG fusion protein. Interestingly, distinct patterns of hybrid transcripts were found in samples taken from separate regions of individual cancer-containing prostates, suggesting that TMPRSS2-ERG gene fusions may be arising independently in different regions of a single prostate.


Gene fusions have been reported in many classes of solid human tumours including adult and childhood sarcomas, thyroid cancer, kidney cancer, and in some classes of benign tumours such as lipomas and leiomyomas (Cooper, 2001). Considerable variation may occur in the structure of individual fusions. For example, in Ewing's sarcoma, the N-terminal domain of EWS can become fused to the C-terminal domains of several members of the ETS family of transcription factors including FLI1, ERG, ETV1, ETV4 and FEV (Delattre and Sevenet, 2002). For hybrid EWS-FLI1 transcripts, many combinations of fusions of exons from EWS and exons from FLI1 can encode in-frame fusion transcripts that result in different lengths and compositions of the encoded chimaeric protein (Kojima et al., 2002). Specifically, EWS exons 7, 8 or 10 may be fused to FLI1 exons 5, 6 or 8. Irrespective of the EWS binding partner or precise structure of the hybrid transcript, the encoded protein always retains the C-terminal DNA-binding domain of the ETS transcription factor, highlighting the key role of this domain in cellular transformation (Janknecht, 2005). Analysis of alternative hybrid transcripts may also lead to information that is prognostically relevant. In Ewing's sarcoma, the presence of type 1 fusion (fusion of exon 7 of EWS to exon 6 of FLI1) is an independent predictor of favourable clinical outcome (de Alava et al., 1998; Lin et al., 1999).
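As an illustration of how quickly such exon-to-exon combinations multiply, the pairings quoted above for EWS-FLI1 can be enumerated in a few lines of Python. This is a minimal sketch: the function and variable names are hypothetical, and the code only lists pairings. It does not test whether a given pairing preserves the reading frame.

```python
from itertools import product

# Exon numbers quoted in the text for EWS-FLI1 fusions.
EWS_EXONS = (7, 8, 10)
FLI1_EXONS = (5, 6, 8)

# Type 1 fusion as defined in the text: EWS exon 7 joined to FLI1 exon 6.
TYPE_1 = (7, 6)


def enumerate_junctions(five_prime_exons, three_prime_exons):
    """Return every (5' exon, 3' exon) pairing as a list of tuples."""
    return list(product(five_prime_exons, three_prime_exons))


junctions = enumerate_junctions(EWS_EXONS, FLI1_EXONS)

for ews, fli1 in junctions:
    label = " (type 1)" if (ews, fli1) == TYPE_1 else ""
    print(f"EWS exon {ews} / FLI1 exon {fli1}{label}")
```

The same helper could be applied to any pair of exon lists, for example TMPRSS2 and ERG exons, to list the simple exon-to-exon junctions that a screen might encounter.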

Recently, fusion of the TMPRSS2 gene to three members of the ETS family of transcription factor genes, ERG, ETV1 and ETV4, has been reported and confirmed in human prostate cancer (Tomlins et al., 2005, 2006; Soller et al., 2006). ERG is the ETS family gene most commonly altered, with Tomlins et al. (2005) reporting the fusion of exon 1 of TMPRSS2 to either exon 2 or 4 of ERG. In the present study, we have used a reverse transcriptase–PCR (RT–PCR)-based approach to screen for the presence of TMPRSS2-ERG fusions in a series of 26 prostate cancers, 17 samples of morphologically normal prostate taken adjacent to prostate cancer, 31 samples of benign prostatic hyperplasia (BPH) and 20 normal (non-prostate) tissue RNAs. cDNAs prepared from RNA from these samples were subjected to PCR using a 5′ primer from exon 1 of the TMPRSS2 gene and a 3′ primer from exon 6 of the ERG gene (Figure 1a). We were interested to find that, in contrast to the simple pattern of fusion initially reported, an extensive array of different-sized PCR products was observed. Selected examples of these are shown in Figure 1. Samples that failed to yield PCR products in a first round of amplification were subject to nested PCR (e.g. see Figure 1b) to see whether lower levels of hybrid transcript were present. Samples that failed to yield a visible PCR product following either a first round or nested PCR reaction were scored as negative for TMPRSS2-ERG transcripts. Individual PCR products were extracted from the gels and identified by DNA sequencing. The results are presented in Figures 2, 3 and 4. A total of 14 distinct TMPRSS2-ERG hybrid transcripts were detected (1–14; Figure 2), which included fusions of exons 1, 2 and 3 of TMPRSS2 to exons 2, 3, 4, 5 or 6 of ERG. Five of the 14 hybrids (1–5) contained in-frame stops and are unlikely to encode functional ERG proteins. 
For three of these (transcripts 1, 2 and 4), translational initiation is predicted to occur within TMPRSS2 sequences, but an in-frame stop codon occurs soon after the transition to ERG sequences. Although the 3′-ERG exon sequences are intact, based on rules of translational control for transcripts with premature stop codons that have been worked out for other mammalian genes such as β-globin (Lykke-Andersen et al., 2000), these TMPRSS2-ERG transcripts are unlikely to direct translation of additional protein products. Three novel TMPRSS2-ERG fusion transcripts recently reported by Soller et al. (2006) are also included in Figure 2 (transcripts 15–17). All three of these transcripts are predicted to initiate translation within TMPRSS2 sequences but contain in-frame stops soon after transition to ERG sequences and are again unlikely to be translated to form truncated ERG proteins.
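The reasoning about in-frame stops can be made concrete with a short Python sketch. It is not the prediction tool used in this study (open reading frames were predicted with the algorithms referenced in the figure legends). The function names and the toy sequence are hypothetical, and only the standard stop codons are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, cDNA alphabet


def first_in_frame_stop(cdna, start):
    """Return the 0-based index of the first stop codon in frame with
    `start`, or None if the frame stays open to the end of the sequence."""
    seq = cdna.upper()
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def stop_follows_junction(cdna, start, junction):
    """True if the first in-frame stop lies at or beyond `junction`, the
    0-based index of the first base contributed by the 3' fusion partner."""
    stop = first_in_frame_stop(cdna, start)
    return stop is not None and stop >= junction


# Hypothetical toy sequence, NOT a real TMPRSS2-ERG transcript.
toy = "GGATGGCCTGAAAATGA"
start = toy.find("ATG")  # position of the first ATG
junction = 5             # pretend the 3' partner begins at index 5

print("start codon at", start)
print("first in-frame stop at", first_in_frame_stop(toy, start))
print("stop after junction:", stop_follows_junction(toy, start, junction))
```

A transcript in which translation starts upstream of the junction and the first in-frame stop falls shortly after it corresponds to the situation described for transcripts 1, 2 and 4.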

Figure 1

Diversity of TMPRSS2-ERG fusions in human prostate. (a) cDNA prepared from human prostate samples was subjected to PCR using the 5′-TMPRSS2 exon 1 primer and 3′-ERG exon 6 primer described by Tomlins et al. (2005) and the resulting PCR products were separated by agarose gel electrophoresis. Lane 1, M: kilobase markers (Invitrogen, Paisley, UK, 500 bp is indicated by the red arrowhead); lane 2, NCI-H660 prostate cancer cell line (Carney et al., 1985); lanes 3–9, samples from human prostate. All products shown on this gel were generated in first round PCR reactions. The identity of the individual PCR product was determined by subcloning (TA cloning kit, Invitrogen) and DNA sequencing. Some of the PCR products could not be subcloned and we have established that, at least in some cases, these represent heteroduplexes of DNA strands from other PCR products present in the reaction (results not shown). The TMPRSS2 exons (T1, T2, etc.) and ERG exons (E1, E2, etc.) involved in the fusions are indicated. I and III indicate intronic sequences for TMPRSS2 intron 1 and ERG intron 3, respectively, present in the fusion transcripts. The precise structure of each of the fusions is shown in Figure 2. PCRs were essentially carried out using the conditions described by Tomlins et al. (2005), whereas all other methods used were described by Clark et al. (1994). Transcripts highlighted by ‘*’ are those found by Tomlins et al. (2005). The quality of all cDNAs that failed to give TMPRSS2-ERG PCR products was verified by PCR using ETV1 primers (result not shown). (b) A 3–5 mm prostate slice was preserved for research in RNALater as described by Jhavar et al. (2005). Histopathological H and E assessment of 5 μm whole-mount sections taken from prostate slices below and above the preserved research slice revealed regions harbouring cancer or non-malignant glands (see regions marked in the upper panel). 
RNA prepared from samples dissected from within corresponding regions in the research slice was used to make cDNA for first round PCR and gel electrophoresis (middle panel) as described above. Nested PCR (lower panel) was carried out using 0.25 μl of primary PCR product and the following primers: ‘TMPRSS2-Ex1-nest’, GGAGCGCCGCCTGGAG, and ‘ERG_Ex6-nest’, CCATATTCTTTCACCGCCCACTCC.
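For readers who wish to inspect the nested primers, the following Python sketch computes their length, GC fraction and reverse complement. The primer sequences are those given in the legend. The helper names are hypothetical, and the assumption that both primers are written 5′ to 3′ is ours.

```python
# Nested primer sequences as quoted in the legend to Figure 1.
PRIMERS = {
    "TMPRSS2-Ex1-nest": "GGAGCGCCGCCTGGAG",
    "ERG_Ex6-nest": "CCATATTCTTTCACCGCCCACTCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"reverse complement {reverse_complement(seq)}")
```

If the ERG primer is written 5′ to 3′ on the antisense strand, as is conventional for a 3′ primer, its reverse complement gives the matching sense-strand sequence within ERG exon 6.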

Figure 2

Structure of TMPRSS2-ERG hybrid transcripts. The structures of the 14 (numbered 1–14) distinct fusion transcripts detected in the experiments described in the legend to Figure 1 are shown. Sequences derived from TMPRSS2 are shown in blue; sequences derived from ERG are shown in red. Light colour, untranslated regions; heavy colour, open reading frames predicted by http://www.dnalc.org/bioinformatics; horizontal stripes, frame shift resulting in an alternative reading frame terminating at a premature stop. ‘IIIa, IIIb, IIIc’ in fusion transcripts 12–14 are three different sequences from intron 3 of ERG. IIIc corresponds to exon 4 of an ERG splice variant designated AY204740; ‘I’ is a sequence from TMPRSS2 intron 1; diagonal stripes indicate multiple exons. All ERG translation initiation sites other than that shown in exon 3 are putative. PCR detection of hybrid transcripts was carried out using an exon 6 3′-ERG primer; ERG sequences 3′ to this primer are assumed to be present. The structure of novel hybrid transcripts recently identified by Soller et al. (2006) is also shown (15–17). Transcripts 6 and 9 (highlighted by ‘*’) are the same as those detected by Tomlins et al. (2005).

Figure 3

Structure of predicted TMPRSS2-ERG fusion proteins. (a) TMPRSS2 (RefSeq NM_005656) and ERG (RefSeq NM_004449) transcripts aligned with encoded proteins (492 and 462 aa, respectively). The relative positions of the exon boundaries in the transcripts are shown and aligned with the encoded protein. A red downward arrowhead indicates novel fusion positions detected in our studies, whereas grey (*) downward arrowheads indicate fusions found by Tomlins et al. (2005). Upward pointing black arrowheads indicate the position of the fusion breakpoints found within the ERG gene in Ewing's sarcoma. Within the proteins, the transmembrane (TM) domain and serine protease domains of TMPRSS2 are indicated, as are the PTN (pointed), ETS (DNA binding) and CTD (C-terminal transactivator domain) domains of ERG. (b) Structure of the fusion proteins (A–F) encoded by the open reading frames predicted with algorithms at http://www.dnalc.org/bioinformatics from the TMPRSS2-ERG hybrid transcript structures shown in Figure 2. The numbers (left-hand side) indicate the identity of the TMPRSS2-ERG fusion transcripts from Figure 2 that are predicted to encode each protein. The table on the right-hand side indicates the number of times we observed fusion transcripts predicted to encode each protein in first round PCR (primary) and second round nested PCR (nest) reactions (based on the data shown in Figure 4). The number of samples in which each ERG protein is predicted to occur as sole fusion encoded protein product is indicated in the right-hand column (sole coding fusion).

Figure 4

Pattern of TMPRSS2-ERG hybrid transcripts found in individual tissue specimens. (a) RT–PCR analysis was carried out on three cell lines (DuCaP, VCaP and NCI-H660), 26 prostate cancer specimens, 17 morphologically normal samples taken adjacent to prostate cancer, three commercial prostate (CNP) RNAs claimed to be from normal prostate (Clontech 1,2, Ambion 3) and 31 benign prostatic hyperplasias (BPH). In addition to the data displayed, 20 normal (non-prostate) tissue RNAs were analysed and found to be negative for a fusion transcript. Analyses were carried out as described in the text and in the legend to Figure 1. Red indicates a product detected by first round PCR. Blue indicates that the band was detected only after nested PCR. Column heading numbers and descriptions correspond to the fusion products in Figures 1 and 2. E-ORF, open reading frame encoding normal ERG protein; F-ORF, transcript encodes TMPRSS2-ERG fusion protein; IFS, in-frame premature stop codon; NTT, N-terminal truncated ERG protein. The data in (b) are represented to allow comparisons of TMPRSS2-ERG fusions present in paired specimens of prostate cancer and adjacent morphologically normal prostate tissue from individual prostates. T, tumour; NN, morphologically normal region selected near the main cancer; NA, morphologically normal region selected away from the main cancer. The right-hand column indicates whether the sample yielded products in the first round PCR (red), following nested PCR (blue) or in either of these reactions (black).

The predicted structures of the proteins encoded by the remaining nine hybrid transcripts are shown in Figure 3. Two encode normal ERG proteins, six encode N-terminal truncated ERG proteins, whereas one encodes a TMPRSS2-ERG chimaeric protein. Interestingly, all protein products except that encoded by the fusion transcript 11 retained the PTN (pointed) domain of ERG in addition to the DNA-binding (ETS) domain and the C-terminal transactivator domain (CTD). This contrasts with the situation found in EWS-ERG fusions in Ewing's sarcoma where the PTN domain is usually lost. The PTN domain, when present in other proteins, has been implicated in homo-oligomerization, hetero-dimerization and transcriptional repression (Sharrocks, 2001).

The fusion transcripts detected most frequently (Figure 4) were formed by fusion of exon 1 of TMPRSS2 to exons 4 and 5 of ERG (designated T1/E4 and T1/E5, respectively). The T1/E4 transcript was previously reported by Tomlins et al. (2005). As these two hybrid transcripts usually occur together in individual samples, it is likely that they result from alternative splicing of an mRNA encoded by a single TMPRSS2-ERG gene fusion. Alternative splicing of transcripts encoded by a single ERG gene fusion has also been reported to occur in acute myelogenous leukaemia (AML). In individual AML patients, expression of three alternatively spliced FUS-ERG transcripts designated types A–C is observed, with only type B transcripts encoding a functional protein (Ichikawa et al., 1994; Kong et al., 1997). This situation contrasts with the TMPRSS2-ERG fusions, as both T1/E4 and T1/E5 are predicted to encode N-terminal truncated ERG proteins. The T1/E5 fusion was found in the absence of T1/E4 in three samples. This could be explained if fusion at the DNA level in these samples had occurred within intron 4 of the ERG gene. The second hybrid transcript (T1/E2) detected by Tomlins et al. (2005) was found in three prostate samples, but only occurred in the absence of the more frequent T1/E4 and T1/E5 transcripts in a single case (Figure 4).

Analysis of the data set of TMPRSS2-ERG transcripts (Figure 4) highlights ERG proteins A, B, C and D as potentially important proteins: each is predicted to be encoded in individual prostate samples in the absence of other fusion encoded ERG proteins (Figure 3b). Underscoring the particular importance of proteins C and D, they (i) occur as sole ERG proteins encoded by TMPRSS2-ERG fusions in nine samples (Figure 3b) and (ii) occur together as the only two fusion encoded ERG proteins in a further 15 samples (Figure 4). A and C were reported as predicted ERG proteins encoded by TMPRSS2-ERG fusion transcripts by Tomlins et al. (2005). Our studies additionally highlight the potential importance of ERG proteins B and D. Unfortunately, the tendency of TMPRSS2-ERG fusions to lose ERG 3′ coding exons (e.g., compare transcripts 3 and 7 and transcripts 5 and 6 in Figure 2) means that RT–PCR approaches that measure distinct 5′-TMPRSS2-exon to 3′-ERG-exon boundaries may not be suitable to quantitate fusion transcripts encoding individual proteins.

TMPRSS2-ERG hybrid transcripts were found in the three prostate cell lines VCaP, DuCaP and NCI-H660. VCaP and DuCaP were reported to harbour TMPRSS2-ERG by Tomlins et al. (2005), and had been included in our study as positive controls. The data for NCI-H660 represent a novel report of a cell line containing this fusion. All three cell lines harboured both T1/E4 and T1/E5 hybrid transcripts.

Human prostates were assessed for the presence of TMPRSS2-ERG fusions. A 3–5 mm thick ‘research slice’ was taken from individual prostatectomy specimens as described by Jhavar et al. (2005). Haematoxylin and eosin-stained whole-mount sections from immediately above and below this slice were subjected to histopathological examination. When cancer was detected at a particular location in both these sections, it was inferred that cancer was also present in the research slice at the same position and a ‘cancer’ area was marked (see Figure 1b, upper panel). When non-malignant epithelial cells were detected in both these sections, a ‘non-malignant’ area was marked (see Figure 1b, upper panel). TMPRSS2-ERG hybrid transcripts were detected in 19/26 (73%) samples taken from the areas designated as primary prostate cancer but also in 8/17 (47%) samples taken from regions assigned as non-malignant. One such example is shown in Figure 1b where T1/E4 (major), T1/E5 and T2/E4 transcripts were detected in the cancer, whereas T1/E5 (major) and T2/E5 are present in apparently morphologically normal tissue taken from the opposite side of the prostate. The transcript patterns obtained in 10 paired samples of non-malignant and cancer tissues taken from the same prostate are shown in Figure 4b. In four cases, the pattern of hybrid transcripts present in samples assigned as non-malignant was distinct from that found in the corresponding cancer. In a further two cases, TMPRSS2-ERG fusions were present in the non-malignant specimen but not in the cancer specimen. It is not possible to conclude from these experiments that the TMPRSS2-ERG fusion is present in individual non-malignant prostate cells, because the histopathological assessment carried out using sections taken above and below the ‘research slice’ (see above) may have missed small areas of cancer cells present entirely within the slice.
Nonetheless, these analyses suggest that distinct TMPRSS2-ERG fusions may be arising independently in different regions of a single cancerous prostate.

TMPRSS2-ERG fusions were detected in two of 31 (6%) BPH specimens. When these two specimens were subjected to a repeat histopathological examination, no prostatic malignancy was identified. Interestingly, TMPRSS2-ERG gene fusions were also detected in three independent samples of commercial normal prostate RNA, one of which contained only a unique hybrid transcript (transcript 14) that was not observed in any other sample. In contrast, we failed to detect hybrid transcripts in commercial samples of RNA from 20 other (non-prostate) human tissues (results not shown).
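The detection rates quoted for each tissue class can be recomputed directly from the counts given in the text. The short Python sketch below does this. The dictionary layout and function name are illustrative only, and the counts are those reported above.

```python
# (positive samples, total samples) as reported in the text.
COUNTS = {
    "prostate cancer": (19, 26),
    "morphologically normal, adjacent to cancer": (8, 17),
    "benign prostatic hyperplasia": (2, 31),
    "non-prostate tissues": (0, 20),
}


def percent(positive, total):
    """Percentage of samples scored positive for a hybrid transcript."""
    return 100.0 * positive / total


for label, (positive, total) in COUNTS.items():
    print(f"{label}: {positive}/{total} = {percent(positive, total):.0f}%")
```

Rounding to the nearest whole per cent matches the precision used for the figures reported in the text.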

The purpose of this study was to investigate the diversity of TMPRSS2-ERG hybrid transcripts present in human prostate samples. Including the three variant fusion transcripts detected by Soller et al. (2006), a total of 17 variant structures containing different combinations of TMPRSS2 and ERG gene sequences are now reported, with nine of these predicted to encode truncated ERG proteins or a TMPRSS2-ERG fusion protein. Our results further indicate that distinct fusions may arise independently in different regions of a single prostate. A priority now must be to assess the precise stage at which TMPRSS2-ERG fusion occurs during the development of prostate cancer and to determine whether the presence of individual fusion transcripts has prognostic significance.


  1. Carney DN, Gazdar AF, Bepler G, Guccion JG, Marangos PJ, Moody TW et al. (1985). Establishment and identification of small cell lung cancer cell lines having classic and variant features. Cancer Res 45: 2913–2923.

  2. Clark J, Rocques PJ, Crew AJ, Gill S, Shipley J, Chan AM et al. (1994). Identification of novel genes, SYT and SSX, involved in the t(X;18)(p11.2;q11.2) translocation found in human synovial sarcoma. Nat Genet 7: 502–508.

  3. Cooper CS (ed). (2001). Translocations in Solid Tumours. Landes Bioscience: Texas, USA.

  4. de Alava E, Kawai A, Healey JH, Fligman I, Meyers PA, Huvos AG et al. (1998). EWS-FLI1 fusion transcript structure is an independent determinant of prognosis in Ewing's sarcoma. J Clin Oncol 16: 1248–1255.

  5. Delattre O, Sevenet N. (2002). Chromosomes Translocations in the Ewing Family of Tumors. Landes Bioscience: Texas, USA.

  6. Ichikawa H, Shimizu K, Hayashi Y, Ohki M. (1994). An RNA-binding protein gene, TLS/FUS, is fused to ERG in human myeloid leukemia with t(16;21) chromosomal translocation. Cancer Res 54: 2865–2868.

  7. Janknecht R. (2005). EWS-ETS oncoproteins: the linchpins of Ewing tumors. Gene 363: 1–14.

  8. Jhavar SG, Fisher C, Jackson A, Reinsberg SA, Dennis N, Falconer A et al. (2005). Processing of radical prostatectomy specimens for correlation of data from histopathological, molecular biological, and radiological studies: a new whole organ technique. J Clin Pathol 58: 504–508.

  9. Kojima T, Asami S, Chin M, Yoshida Y, Mugishima H, Suzuki T. (2002). Detection of chimeric genes in Ewing's sarcoma and its clinical applications. Biol Pharm Bull 25: 991–994.

  10. Kong XT, Ida K, Ichikawa H, Shimizu K, Ohki M, Maseki N et al. (1997). Consistent detection of TLS/FUS-ERG chimeric transcripts in acute myeloid leukemia with t(16;21)(p11;q22) and identification of a novel transcript. Blood 90: 1192–1199.

  11. Lin PP, Brody RI, Hamelin AC, Bradner JE, Healey JH, Ladanyi M. (1999). Differential transactivation by alternative EWS-FLI1 fusion proteins correlates with clinical heterogeneity in Ewing's sarcoma. Cancer Res 59: 1428–1432.

  12. Lykke-Andersen J, Shu MD, Steitz JA. (2000). Human Upf proteins target an mRNA for nonsense-mediated decay when bound downstream of a termination codon. Cell 103: 1121–1131.

  13. Sharrocks AD. (2001). The ETS-domain transcription factor family. Nat Rev Mol Cell Biol 2: 827–837.

  14. Soller MJ, Isaksson M, Elfving P, Soller W, Lundgren R, Panagopoulos I. (2006). Confirmation of the high frequency of the TMPRSS2/ERG fusion gene in prostate cancer. Genes Chromosomes Cancer 45: 717–719.

  15. Tomlins SA, Mehra R, Rhodes DR, Smith LR, Roulston D, Helgeson BE et al. (2006). TMPRSS2:ETV4 gene fusions define a third molecular subtype of prostate cancer. Cancer Res 66: 3396–3400.

  16. Tomlins SA, Rhodes DR, Perner S, Dhanasekaran SM, Mehra R, Sun XW et al. (2005). Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 310: 644–648.



This work was funded by Cancer Research UK, the National Cancer Research Institute, the Grand Charity of Freemasons and the Rosetrees Trust. We thank Christine Bell for help with typing the manuscript. This work was approved by the Clinical Research and Ethics Committee at the Royal Marsden NHS Foundation Trust and Institute of Cancer Research.

Author information



Corresponding author

Correspondence to J Clark.


Clark, J., Merson, S., Jhavar, S. et al. Diversity of TMPRSS2-ERG fusion transcripts in the human prostate. Oncogene 26, 2667–2673 (2007). https://doi.org/10.1038/sj.onc.1210070


  • prostate cancer
  • ERG gene
  • TMPRSS2 gene
  • ERG fusion transcripts
