Main

By screening a human melanoma cDNA library with an anti-tumour cytolytic T lymphocyte (CTL), we isolated a 1004 bp-long transcript coding for the tumour antigen recognised by the CTL.1 It was named BAGE, for human B melanoma antigen (gb U19180) and, hereafter, will be referred to as BAGE1a. BAGE1a encodes a putative protein of 43 amino acids, and the antigen recognised by the CTL consists of the BAGE1-encoded peptide AARAVFLAL bound to HLA-Cw16 molecules.1 BAGE1a was found to be expressed in tumours of different histological types. It is silent in normal and foetal tissues except for testis.

Several other unrelated families of genes with a pattern of expression similar to that of BAGE1 have been identified, notably the MAGE, GAGE and LAGE/NY-ESO-1 families.2,3,4,5,6,7 These genes are located on the X chromosome and their function is still unknown. The MAGE family contains 18 related genes divided into three clusters (MAGE-A, B, and C) located on the X chromosome.3,8,9

A subset of these male germline-specific genes are activated in a wide variety of tumours, in which they code for tumour-specific antigens recognised by autologous T lymphocytes.10 Male germline cells do not express molecules of the major histocompatibility complex (MHC), which are surface molecules required to present antigenic peptides to T lymphocytes.11 The antigens recognised by T lymphocytes that are encoded by BAGE, MAGE, GAGE and LAGE/NY-ESO-1 families are, therefore, strictly tumour-specific and may prove useful for cancer immunotherapy. Clinical trials involving defined MAGE and LAGE tumour antigens are proceeding.12,13 Long-lasting tumour regressions have been observed in a minority of patients.

During annotation of genomic sequences of human chromosome 21,14 we identified three DNA stretches having 92–99% nucleotide identity with BAGE1a mRNA. The BAGE-related sequences are located less than 1 Mb away from the centromere: two in 21p and one in 21q. We searched for new BAGE transcripts and show here that BAGE is a gene family composed of expressed genes that map to the juxtacentromeric regions of chromosomes 13 and 21, and of unexpressed gene fragments that are scattered in the juxtacentromeric regions of several chromosomes. We have analysed the expression of the new BAGE genes in tumours and normal tissues.

Materials and methods

Primer sequences and detailed protocols are available on-line (http://www.igh.cnrs.fr/equip/centromere/Bage.html).

Screening of a melanoma library

The MZ2.MEL melanoma cell line cDNA library was screened using probes derived from the 5′ and 3′ regions of BAGE1a. Hybridisations were done under stringent conditions according to standard procedures.15

5′ and 3′ RACE and cDNA cloning

5′ and 3′ RACE experiments were done using the Marathon Ready human testis cDNA (#7414-1, Clontech) and the Expand PCR kit (Boehringer) according to the manufacturers' protocols. 5′ and 3′ RACE products were purified using the Wizard kit (Promega) and cloned using the pGem-T PCR cloning kit (Promega). Afterwards, specific primers were designed to conserved regions of the 5′ and 3′ end sequences and were used to reamplify three cDNA libraries: a testis and a melanoma (MZ2-Mel43) cDNA library1 and a purchased testis cDNA library (Clontech #7414-1). The resulting 2-kb products were gel purified and cloned using the pGem-T PCR cloning kit (Promega).

BAGE genomic structure

The genomic structure of BAGE genes was determined through alignment of transcripts with genomic sequences. The genomic organisation of BAGE1 was obtained through alignment of the six transcripts (BAGE1a, BAGE1b, BAGE1c, BAGE1d, BAGE1e, and BAGE1f) with the genomic sequences AF499647 (containing exons 1 and 2) and AC064811 (containing exons 3 to 7). The genomic organisation of BAGE2 and BAGE5 was obtained through alignment of the transcripts with the genomic sequences AL163201 and AL161418, respectively.

Sequencing

Sequencing was done with an Applied Biosystems 373XL sequencer.

Computer analysis

Nucleotide sequences and predicted amino acid sequences were analysed by BLAST16 and by CLUSTALW at http://www2.ebi.ac.uk/clustalw

Chromosome mapping

In situ hybridisation

A 1.8-kb genomic fragment spanning from exon 1 to exon 2 was labelled by nick translation with Bio-16-dUTP (Boehringer) and hybridised to normal human metaphase chromosomes as described in Fantes et al.17 Hybridisation signals were visualised using a Zeiss Axioplan epifluorescence microscope. Images were captured using Digital Scientific Smartcapture software.

Somatic hybrid cell lines

Primer pairs specific for BAGE were used to amplify DNA from the monochromosome somatic hybrid cell lines (NIGMS mapping panel 2).18 PCR products were cloned in pGem-T vector (Promega). To map BAGE1, 10 colonies per cloning reaction were sequenced with universal primers SP6 and T7.

Expression analysis

Expression analysis was done on nine normal tissues (heart, brain, placenta, lung, liver, skeletal muscle, kidney, pancreas, and testis; Clontech #1420-1; #7414-1) and on 215 tumour samples. Total RNA was extracted from tumour samples by the guanidine-isothiocyanate procedure as described.19 Reverse transcription was done on 2 μg of total RNA with 2 mM oligo(dT)15 primer and 200 U of MoMLV reverse transcriptase (GIBCO–BRL). The quality of the RNA preparation was tested by PCR amplification of human β-actin. Primers used for the expression analysis were located in different exons. A first set of primers, bage14 and bage19, was used to amplify a PCR product of 490 bp corresponding to BAGE1 and BAGE5 transcripts. A second set of primers, bage20 and bage21, was used to amplify a 368-bp PCR product corresponding to BAGE2, BAGE3, and BAGE5 transcripts. To identify individual genes, 15 μl of the PCR product was digested with restriction enzymes. Digestion of the 368 bp PCR product with either AluI or BstNI allowed us to identify BAGE2 and BAGE3 cDNAs, respectively. Digestion of the 490 bp PCR product with BssKI allowed us to identify BAGE1 (presence of the restriction site) and BAGE5 (absence of the restriction site).
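The digestion step can be illustrated with a minimal sketch that scans an amplicon for enzyme recognition sites. The recognition sequences used here (AluI: AGCT; BstNI: CCWGG; BssKI: CCNGG) are the commonly cited ones and should be checked against the supplier's data sheets. The function names and the two toy amplicons are hypothetical and are not BAGE sequences. Only the BssKI rule stated above (site present: BAGE1; site absent: BAGE5) is encoded.

```python
import re

# Recognition sequences written as regular expressions (W = A or T, N = any base).
# Assumption: these are the commonly cited sites; verify against supplier data.
RECOGNITION_SITES = {
    "AluI": "AGCT",
    "BstNI": "CC[AT]GG",
    "BssKI": "CC[ACGT]GG",
}


def site_positions(sequence, enzyme):
    """Return 0-based start positions of every recognition site for `enzyme`."""
    pattern = RECOGNITION_SITES[enzyme]
    # A zero-width lookahead also reports overlapping sites.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]


def call_bage1_or_bage5(amplicon):
    """Apply the BssKI rule from the text: site present -> BAGE1, absent -> BAGE5."""
    return "BAGE1" if site_positions(amplicon, "BssKI") else "BAGE5"


# Hypothetical toy amplicons for illustration only (NOT real BAGE sequences).
toy_with_site = "ATGCATCCCGGTTAGCAT"
toy_without_site = "ATGCATCACAGTTAGCAT"

if __name__ == "__main__":
    for name, seq in [("with site", toy_with_site), ("without site", toy_without_site)]:
        print(name, site_positions(seq, "BssKI"), call_bage1_or_bage5(seq))
```

In practice the call is read from the band pattern on a gel rather than from sequence; the sketch only makes the presence/absence logic explicit.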

To analyse the expression of BAGE gene fragments we amplified a testis cDNA library (Clontech #7417-1) with primers bage4 and bage5. The PCR product was cloned in pGEM-T vector (Promega). Twenty colonies were sequenced and compared with genomic sequences in databases.

Results

BAGE transcripts and genes

We renamed BAGE1 the gene corresponding to the transcript (gbU19180) previously isolated from a human melanoma cDNA library.1 The same library was hybridised with probes corresponding to the 5′ and 3′ regions of BAGE1, and six alternatively spliced mRNAs, whose sizes range from 1 to 2.3 kb, were isolated: BAGE1a (gbU19180, ref. 1), BAGE1b (AF527550), BAGE1c (AF527551), BAGE1d (AF527552), BAGE1e (AF527553), and BAGE1f (AF527554) (Figure 1A).

Figure 1

(A) Strategy used to isolate new BAGE transcripts: (left side) Hybridisation of a melanoma cDNA library with probes specific for BAGE1 allowed us to isolate six alternatively spliced BAGE1 isoforms; (right side) Through RACE experiments on a testis cDNA library we cloned the 5′ and 3′ ends of different BAGE transcripts. Then using primers designed in the conserved sequence of different BAGE ends, we amplified full length cDNAs on three different (melanoma and testis) cDNA libraries. These transcripts correspond to new BAGE genes. (B) Exonic structure of the BAGE1 transcripts. Open rectangles are exons, filled (grey) rectangles correspond to open reading frames (arrows indicate the initiation and stars the stop codons). Exons 2c, 4d, and 6 are not drawn to scale. (C) Nucleotide sequence identity among different members of the BAGE family. Black rectangles are transcribed exons. White rectangles are predicted exons, i.e. exon 3 of BAGE1 (an Alu sequence included in the 3′ untranslated region of BAGE1a and BAGE1e mRNA variants) was also present in BAGE2 and BAGE5 genomic sequences, but not in the corresponding transcripts. Likewise, exon 7 of BAGE2 was also present in the BAGE5 genomic sequence, but not in the BAGE5 transcript.

Afterwards, 5′ and 3′ RACE amplifications were done on a testis Marathon cDNA library (Clontech), using primers specific for BAGE1 (Figure 1A). We failed to amplify sequences upstream of the 5′ region of BAGE1 and concluded that the 5′ untranslated region is complete. Interestingly, 3′ RACE amplifications yielded fragments that differed in sequence and length from the BAGE1 mRNAs. PCR primers were chosen in the conserved 5′ and 3′ regions and full length cDNAs were amplified from two testis cDNA libraries and one melanoma cDNA library. Full length cDNAs were cloned in a plasmid and, for each PCR product, 10 colonies were sequenced. We isolated three 1.9 kb-long transcripts, which were named BAGE2 (gbAF218570), BAGE3 (gbAF339514), and BAGE5 (gbAF339516).

BAGE2 and BAGE3 have putative open reading frames (ORFs) of 330 bp encoding predicted proteins of 109 amino acids. BAGE5, like BAGE1, has an ORF of 132 bp and encodes a predicted protein of 43 amino acids.
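The relation between ORF length and predicted protein length can be checked with a short calculation. The sketch below assumes that the quoted ORF lengths include the stop codon, which is the reading under which both pairs of figures agree; the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp):
    """Amino acids encoded by an ORF whose length (in bp) includes the stop codon."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    # One codon per 3 bp, minus the terminal stop codon.
    return orf_bp // 3 - 1


if __name__ == "__main__":
    for label, orf_bp in [("BAGE2/BAGE3", 330), ("BAGE1/BAGE5", 132)]:
        print(label, orf_bp, "bp ->", protein_length_from_orf(orf_bp), "amino acids")
```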

We determined the exon/intron structure of the BAGE genes (Figure 1B and Table 1). BAGE1 comprises seven exons. We could not measure the gene length because the corresponding genomic sequence is unordered and incomplete. BAGE2 and BAGE5 span 76 kb and comprise 9 and 10 exons, respectively. All the splicing sites conform to the consensus gt/ag sequence. The exon/intron structure of BAGE3 could not be determined because the corresponding genomic sequence is not available in databases.

Table 1 Exon/intron structure of BAGE genes

BAGE genomic sequences

BAGE genomic sequences were searched in databases and we retrieved 14 sequences with a significant nucleotide identity (90–100%, P < 10⁻⁴) with the BAGE transcripts described above (Table 2). BAGE genomic sequences were classified as follows: gene, when a transcript could be assigned to the genomic sequence; predicted gene, when no transcript could be assigned to the genomic sequence, but the predicted ORF was intact; gene fragment, when the gene was truncated and the predicted ORF was disrupted by deleterious mutations (deletions and nucleotide changes) that introduced stop codons and/or erased the initiation codon.
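The three-way classification above amounts to a simple decision rule, sketched below. The record type, field names and example entries are hypothetical and stand in for whatever annotation was actually recorded for each database sequence.

```python
from dataclasses import dataclass


@dataclass
class GenomicSequence:
    name: str             # hypothetical label, not a real accession number
    has_transcript: bool  # a transcript could be assigned to the sequence
    orf_intact: bool      # predicted ORF has a start codon and no premature stop


def classify(seq):
    """Return 'gene', 'predicted gene' or 'gene fragment' following the text's criteria."""
    if seq.has_transcript:
        return "gene"
    if seq.orf_intact:
        return "predicted gene"
    return "gene fragment"


if __name__ == "__main__":
    examples = [
        GenomicSequence("seq_A", has_transcript=True, orf_intact=True),
        GenomicSequence("seq_B", has_transcript=False, orf_intact=True),
        GenomicSequence("seq_C", has_transcript=False, orf_intact=False),
    ]
    for seq in examples:
        print(seq.name, "->", classify(seq))
```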

Table 2 List of BAGE genomic sequences

Overall, we identified three genes (BAGE1, BAGE2, and BAGE5), two predicted genes (BAGE6 and BAGE7), and nine gene fragments (Table 2). BAGE gene fragments are 2–8 kb long and correspond to the 5′ region of BAGE genes (they span from exon 1 to truncated intron 2).

BAGE sequences (genes and gene fragments) share extensive regions of high nucleotide identity: nucleotide identity is higher among genes (97–99%) than between genes and gene fragments (90–96%) (Figure 1C).

Chromosome mapping

To map BAGE genomic sequences, we hybridised a 1.8 kb genomic probe to human metaphase chromosomes. The probe was amplified by PCR from the 5′ region that is common to all the BAGE sequences. BAGE sequences mapped to the juxtacentromeric regions of chromosomes 9, 13, 14, 15, 18, 21, and 22 (Figure 2). Hybridisation to chromosomes 14 and 15 was observed only in some metaphases suggesting that these BAGE sequences were more divergent.

Figure 2

(A) and (B) In situ hybridisation of the 5′ region of BAGE to two partial human metaphases. (C) and (D) DAPI-stained G banding of the partial metaphases.

We then amplified exon 1, which is common to all the BAGE sequences, on a panel of monochromosome somatic cell lines. Specific amplifications were obtained with chromosomes 9, 13, 15, 18, 21, and 22 (data not shown). No amplification was obtained with chromosome 14; this result may be due to nucleotide divergence between the primers used and the target sequence.

To map individual BAGE loci, we analysed the localisation of the genomic sequences retrieved from databases (Table 2). Assignment of genomic sequences to chromosomes 9, 13, 18, and 21 was consistent with our mapping results; by contrast, assignment to chromosomes 4 and 5 was at variance with both in situ hybridisation and somatic hybrid analysis. To ascertain the actual localisation of BAGE1, which has 100% nucleotide identity with the genomic clone AC064811 assigned to chromosome 4, we did FISH experiments: the genomic clone AC064811 hybridises to chromosomes 13, 14, 21 and 22 (data not shown). We confirmed this result by PCR on somatic hybrids: a genomic sequence matching (100% nucleotide identity) the BAGE1 transcripts was amplified from chromosome 13 and from no other chromosome (data not shown).

In conclusion, BAGE sequences map to the juxtacentromeric regions of different human chromosomes, each one containing more than one locus: BAGE genes map to chromosomes 13 and 21, whereas BAGE gene fragments map to chromosomes 9, 13, 18, and 21. BAGE-related sequences that are not yet characterised map to chromosomes 14, 15 and 22.

Predicted BAGE proteins

Multialignment of BAGE predicted proteins shows that amino acid sequence identity ranges from 88 to 98% (Figure 3). Although BAGE2 and BAGE3 differ only in two amino acids, transcripts encoding these proteins are unlikely to be allelic because they have 1.2% nucleotide divergence.

Figure 3

Multialignment of BAGE predicted proteins. BAGE6 and BAGE7 are translations of predicted genes. Non-conserved amino acids are shaded. The frame encloses the sequence of the BAGE1 antigenic nonapeptide.

The nonapeptide AARAVFLAL (amino acids 2 to 10 of the BAGE1 predicted protein; boxed in Figure 3) is the sequence of the antigen recognised by a CTL.1 The other BAGE predicted proteins have two amino acid changes with respect to the sequence of the BAGE1 antigenic peptide: (R→G)3 and (A→V)4. Synthetic peptides containing these amino acid variations are not recognised by the CTL that recognises BAGE1 (data not shown).
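As an illustration, the two changes can be applied to the BAGE1 nonapeptide to obtain the corresponding variant sequence. The sketch assumes that positions 3 and 4 are numbered within the nonapeptide rather than within the full protein; the check on the expected residue supports this reading, since R and A occupy positions 3 and 4 of AARAVFLAL. The function name is hypothetical.

```python
def substitute(peptide, changes):
    """Apply (position, expected_residue, new_residue) changes; positions are 1-based."""
    residues = list(peptide)
    for position, expected, new in changes:
        found = residues[position - 1]
        if found != expected:
            raise ValueError(f"expected {expected} at position {position}, found {found}")
        residues[position - 1] = new
    return "".join(residues)


BAGE1_PEPTIDE = "AARAVFLAL"
# (R->G) at peptide position 3 and (A->V) at peptide position 4, as given in the text.
VARIANT_PEPTIDE = substitute(BAGE1_PEPTIDE, [(3, "R", "G"), (4, "A", "V")])

if __name__ == "__main__":
    print(BAGE1_PEPTIDE, "->", VARIANT_PEPTIDE)
```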

After database searches, we concluded that BAGE predicted proteins have no significant identity/similarity to any known protein.

Expression analysis

We analysed the expression of BAGE genes in 215 tumour samples of various histological types. To distinguish individual genes, we took advantage of the few nucleotide variations that characterise different transcripts and we designed a strategy based upon PCR followed by restriction enzyme digestion.

A first set of primers (bage14/bage19) amplified a PCR product of 490 bp corresponding to BAGE1 and BAGE5 transcripts. Fifteen per cent (15 out of 103) of melanomas scored positive, whereas all the normal tissues were negative, with the exception of testis (Table 3). In a previous study with a different set of primers that amplified the same transcripts, the percentage of positive melanomas was comparable (22%).1

Table 3 Expression of BAGE genes in normal tissues and tumour samples

A second set of primers (bage20/bage21) was also used to amplify a PCR product of 368 bp corresponding to BAGE2, BAGE3, and BAGE5 transcripts. Thirty-four per cent (35 out of 103) of melanomas scored positive, indicating that a significant number of tumours expressed BAGE2 or BAGE3 without expressing the other BAGE genes, in particular in primary melanomas (Table 3). Here again, all the normal tissues were negative, with the exception of testis.

PCR products were then digested with restriction enzymes BssKI, AluI, BstNI, to identify individual transcripts for BAGE1/BAGE5, BAGE2 or BAGE3, respectively. In melanomas, BAGE2 (23%) was more frequently expressed than BAGE1 (14%), BAGE3 (14%) and BAGE5 (9%) (Table 3). Individual tumours generally expressed more than one BAGE gene simultaneously (data not shown).

In accordance with a previous analysis,1 our results confirmed that BAGE genes are expressed in melanoma, bladder and lung carcinomas and in a few tumours of other histological types. Leukaemias, colorectal carcinomas, and renal carcinomas scored negative. Here, no head and neck tumours expressed BAGE, whereas 8% were found positive in the previous study. This could be explained by the small number of samples analysed.
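The sample-size argument can be made concrete with a simple binomial calculation: if the true frequency of positive tumours were 8%, the probability of observing no positive among n independent samples is (1 − 0.08)^n. The sketch below is illustrative only. It treats the 8% figure from the previous study as the true rate, and the sample sizes are hypothetical, since the number of head and neck tumours analysed is not stated in the text.

```python
def prob_no_positives(true_rate, n_samples):
    """Probability that none of n independent samples is positive (binomial model)."""
    if not 0.0 <= true_rate <= 1.0 or n_samples < 0:
        raise ValueError("true_rate must be in [0, 1] and n_samples non-negative")
    return (1.0 - true_rate) ** n_samples


if __name__ == "__main__":
    # Hypothetical sample sizes, chosen only to show how quickly the probability falls.
    for n in (5, 10, 20, 40):
        print(n, round(prob_no_positives(0.08, n), 3))
```

With small n the probability of seeing no positive sample remains substantial, which is why a 0% result is compatible with the earlier 8% estimate.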

To analyse transcription of BAGE truncated genes we amplified testis and melanoma cDNA libraries with primers derived from the 5′ region that is common to genes and gene fragments. After cloning the PCR product, we sequenced 20 colonies and obtained no transcript corresponding to a truncated gene. We tentatively concluded, therefore, that BAGE gene fragments are not expressed.

Discussion

In this paper, we show that the BAGE gene family comprises genes that are transcribed and translated (as indicated by the antigenic properties of BAGE1) and gene fragments that are not expressed. Genes and predicted genes map to the juxtacentromeric regions of human chromosomes 13 and 21, whereas gene fragments map to the juxtacentromeric regions of chromosomes 9, 13, 18, and 21. The list of BAGE genes and gene fragments is incomplete: (i) BAGE-related sequences that are not yet characterised map to chromosomes 14, 15, and 22; and (ii) most human juxtacentromeric regions are still unsequenced.20

The sizes of the new BAGE transcripts isolated in this work are consistent with those of the two mRNA species (1 and 2.4 kb) previously identified in Northern blot experiments.1 BAGE1 transcripts have a great variety of alternative terminal exons. Eleven per cent of genes undergoing alternative splicing have alternative terminal exons.21

BAGE genes and gene fragments share extensive regions of 90–99% nucleotide identity, which may be accounted for by concerted evolution. Concerted evolution is thought to be the molecular mechanism responsible for sequence conservation among human ribosomal genes. Ribosomal genes map to the short arms of the five acrocentric chromosomes and participate in the formation of a common nucleolus. In human germline cells, acrocentrics undergo frequent interchromosome DNA exchanges.22,23 BAGE sequences may therefore undergo similar interchromosome exchanges that promote sequence homogeneity.

In addition, we observed that BAGE genes have a higher nucleotide identity than BAGE gene fragments. Interestingly, we have found that BAGE genes, but not the BAGE gene fragments, are under selective pressure and that the BAGE gene family was generated by chromosome rearrangements during the evolution of hominoids (De Sario, personal communication).

Similar to BAGE1, the new BAGE genes are expressed in different cancer cells but silent in normal tissues other than testis. Chromatin compaction and/or DNA methylation may account for gene silencing. DNA methylation was already shown to be the primary mechanism responsible for the silencing of the genes encoding the MAGE antigens.24,25

Given their expression profile restricted to cancer cells, the BAGE genes isolated during this work may encode new tumour antigens useful for cancer immunotherapy. BAGE2, which is expressed in 23% of melanomas and has a putative coding region of 109 residues, is the most promising.

Other genes located in the regions flanking human centromeres have a cancer/testis expression profile: TPTE (Transmembrane Phosphatase with Tensin homology)26 and CT2 (Creatine Transporter 2),27 mapping to the juxtacentromeric regions of chromosomes 21 and 16, respectively, are exclusively expressed in testis; an NF1-related gene mapping next to the centromere of chromosome 15 is only expressed in neuroblastoma.28 These results lead us to suggest that a restricted pattern of expression is a feature of the few genes mapping to juxtacentromeric regions and that these are candidate genes encoding tumour antigens.