Main

Chromosome 1 is recurrently altered in a number of human malignancies. In solid tumours, structural aberrations include several recurrent chromosomal translocation sites, as well as frequent gains or losses involving either chromosomal arm (Struski et al, 2002; Teixeira et al, 2002). In breast cancer, chromosome 1 is the site of rare stereotypic rearrangements: isochromosome i(1)(q10) and der(1;16)(q10;p10) (Tsarouha et al, 1999). More significantly, it has been shown, by either LOH or CGH work, to be frequently involved in copy number changes (CNCs) (Kerangueven et al, 1997; Osborne and Hamshere, 2000). Fifty to 60% of breast tumours analysed by CGH presented gains at 1q, whereas the short arm showed predominantly losses, except the 1p31–p32 region, which presented occasional gains (Courjal and Theillet, 1997; Tirkkonen et al, 1998). Gains at 1q frequently affect the whole arm; however, a number of tumours or cell lines exhibit interstitial gains, sometimes reduced to a chromosomal band or sub-band (Courjal and Theillet, 1997; Larramendy et al, 2000). These data, suggesting the existence of several regions of gain at 1q, were thus concordant with LOH studies indicating the occurrence of at least four regions of allelic imbalance in breast tumours (Kerangueven et al, 1997). Because gains at 1q were observed in both low- and high-grade breast tumours, their implication in early stages of disease development has been suggested (Tirkkonen et al, 1998; Cummings et al, 2000). Recent data using BAC-based array-CGH on independent sets of breast tumours have confirmed the frequent nature of gains on chromosome 1, as well as the existence of multiple cores of amplification (Stange et al, 2006).

Altogether, these data suggested the presence of several important cancer genes on chromosome 1. Several known oncogenes (NRAS, JUN, MYCL, TAL1, BLYM, LCK) map on chromosome 1, but their implication in breast cancer has remained elusive, whereas genes like MUC1 and PLU-1/JARID1B were proposed as candidates (Bieche et al, 1997; Lu et al, 1999). However, it seems clear that most of the genes involved remain to be identified. This notion was reinforced by recent expression profiling studies in breast tumours, which showed that 25 genes located on the long arm of chromosome 1 had increased expression levels in conjunction with DNA copy number increase (CNI) (Hyman et al, 2002).

Our goal in this work was to determine more precisely the boundaries of regions of chromosome 1 showing CNCs in breast tumours and to gain insight into the genes involved. To achieve this, we built a genomic array covering both arms of chromosome 1 at an average density of one BAC clone/0.85 Mb and analysed 30 breast cancer cell lines and 30 primary breast tumours by array-CGH. Based on the array-CGH profiles, we defined shortest regions of overlap (SROs) of copy number gain or loss. A total of 20 regions of loss, all located at chromosome 1p, and 19 regions of gain, one at 1p and 18 at 1q, were defined. Because gains at 1q were found in over 60% of the analysed samples and increased copy number is clearly related to augmented gene expression, we focused our expression study on the identification of candidate genes at 1q. To this aim, we studied expression profiles of 307 known genes located on the long arm of chromosome 1. Using a supervised analysis method, we selected 30 genes showing significantly increased RNA expression in relation to genomic gains. RNA expression levels of 28 out of 30 genes were verified by quantitative (Q)–RT–PCR and the overexpression in relation to gains was confirmed for 11 out of 28 genes, whereas 20 out of 28 showed overexpression compared to normal breast.

Materials and methods

Tumours and blood samples

Thirty breast tumours were obtained from the Pathology Department at the Val d'Aurelle Cancer Center of Montpellier (France). Tumour biopsies were snap-frozen in liquid nitrogen upon surgical removal and stored at −80°C until DNA and RNA extraction. The tumour cohort was composed of 63.7% invasive ductal carcinoma, 18% invasive lobular carcinoma, 15% invasive adenocarcinoma of unspecified type and 3.3% other types of carcinomas of the breast. The mean age of patients was 58 years. Tumours were mostly grade 2 and 3 (46.7 and 29.2%, respectively), whereas 13.9% were grade 1 and 10% were of undetermined grade.

Cell lines and tumours

Breast cancer cell lines used in this study included BRCAMZ01, MDAMB175, MDAMB453 (D Birnbaum, INSERM U119, Marseille, France), CAL51, MDAMB435, SKBR7, ZR7530 (P Edwards, Department of Pathology, Cambridge, UK), BT474, MCF7Rich (F Vignon, INSERM U540, Montpellier, France), HS578T, MDAMB436 (A Puisieux, INSERM U590, Lyon, France), SUM149, SUM185, SUM52 (S Ethier, University of Michigan, Ann Arbor, MI, USA), EFM19 (DSMZ, Braunschweig, Germany), BT20, BT483, HCC1187, HCC1395, HCC1428, HCC1937, HCC1954, HCC2218, MDAMB157, MDAMB361, MDAMB468, SKBR3, T47D, UACC812 and ZR751 (ATCC, American Type Culture Collection, Manassas, VA, USA). All cell lines were cultured as recommended by suppliers.

Genomic arrays

We built a genomic array covering chromosomes 1, 8 and 17. Coverage of chromosomes 8 and 17 has been described by Orsetti et al (2004) and Gelsi-Boyer et al (2005). Chromosome 1 was covered by 257 BAC clones selected as follows: 225 BAC clones from the Barbara Trask collection (CHORI) http://www.ncbi.nlm.nih.gov/genome/cyto/hbrc.shtml and 32 clones selected according to their cytogenetic position and content in genetic markers. Clones were arranged according to the human genome freeze of April 2003. This resulted in an average density of one clone/0.85 Mb±0.95 Mb. However, clone distribution was uneven and thus could produce local variations in resolution (a complete list of BAC clones with precise coordinates is available in Supplementary Table S1).

Arrays were produced according to the following procedure. BAC, PAC and cosmid DNA were isolated using Nucleobond BAC100 from Macherey-Nagel (Hoerdt, France). Probe DNA to be spotted was prepared by DOP-PCR amplification on 10 ng of BAC matrix DNA in a final reaction volume of 100 μl. Primer sequences and the DOP-PCR protocol used are available on the Sanger Center web site (http://www.sanger.ac.uk/HGP/methods/cytogenetics/DOPPCR.shtml) (Orsetti et al, 2004). We followed this protocol with one slight modification: the second round DOP-PCR primer was not aminolinked. Purification of PCR products was carried out using Nucleofast 96 PCR plates (Macherey-Nagel, Hoerdt, France). Purified PCR products were re-suspended in dd H2O at 2 μg μl−1. An aliquot was run on an agarose gel in order to ascertain even distribution of the product in all the wells. Prior to spotting, products were diluted 1 : 1 in spotting solution (GE-Healthcare, Orsay, France) and spotted in quadruplicate onto Corning GapsII slides (Schiphol-Rijk, The Netherlands) using a Lucidea array spotter IV (Amersham Biosciences, Orsay, France).

Array-CGH probe labelling, hybridisation, image capture and data analysis

Genomic DNA was digested by NdeII according to the supplier's recommendations (Roche Diagnostics, Meylan, France). Three hundred nanograms of digested genomic DNA was labelled by random priming in a 50 μl reaction containing 0.02 mM dATP, 0.02 mM dGTP, 0.02 mM dTTP, 0.05 mM dCTP, 0.04 mM Cy3-dCTP or Cy5-dCTP, 25 U of Klenow Fragment (50 U μl−1, New England Biolabs, Ozyme, Saint Quentin Yvelines, France), 10 mM β-mercaptoethanol, 5 mM MgCl2, 50 mM Tris-HCl (pH 6.8) and 300 μg ml−1 random octamers. The reaction was incubated at 37°C for 20 h and stopped by adding 2.5 μl EDTA 0.5 M pH 8. The reaction product size was about 100 bp. We purified labelled products using microcon 30 filters (Amicon, Millipore, Molsheim, France). Abundance of the labelled DNA was checked using a spectrophotometer and incorporation of dyes was calculated using Molecular Probes software (http://www.probes.com/resources/calc/basedyeratio.html). A mix of 700 pmol Cy5- and 700 pmol Cy3-labelled probes was ethanol-precipitated in the presence of 250–300 μg of human Cot-1 DNA (Roche Diagnostics, Meylan, France) and 100 μg herring sperm DNA (Promega, Charbonnières, France). The pellet was dried and re-suspended in 110 μl Hybrisol VII (Appligene Oncor, Qbiogen, Illkirch, France). The probes were denatured at 80°C for 10 min, and repetitive sequences were blocked by pre-annealing at 37°C for 30 min.

Slide processing was performed using a HS4800 hybridisation station (Tecan, Lyon, France). Slides were treated with a blocking buffer (5 × SSC, 0.2% SDS, 1% BSA) at 42°C for 30 min and washed three times at 42°C using 2 × SSC, 0.2% SDS. Pre-annealed probes were injected in the chambers and hybridisation took place at 37°C for 16 h with mild agitation. Post-hybridisation washes were as follows: three washes at 52.5°C in solution 1 (2 × SSC, 0.2% SDS), followed by three washes in solution 2 (0.5 × SSC, 0.2% SDS) and one wash in solution 3 (0.1 × SSC) at 52.5°C. To remove salt and detergent residues, a brief wash with dd water was performed at 37°C and slides were dried within the chambers by an injection of N2 at 30°C.

Arrays were scanned by a GenIII Array Scanner (Amersham Biosciences, Orsay, France). Images were analysed by ARRAY-VISION 6.0 software (Amersham Biosciences, Orsay, France). Spots were defined by use of the automatic grid feature of the software and manually adjusted when necessary. Fluorescence intensities of all spots were then calculated after subtraction of local background. These data were then analysed using a custom-made MS-Excel VBA script. Cy3 and Cy5 global intensities were normalised with the entire set of spots on the array and Cy3/Cy5 ratios were calculated. Each BAC clone was spotted in four replicates; the median values of replicate spots were calculated and used to define the selection threshold for individual spots. Only replicates showing less than 15% deviation from the median were kept, and a clone was taken into consideration when at least three of four replicates showed values within the 15% deviation limit. Profiles were represented with log 2 ratios on the Y-axis and the Mb position of clones (http://genome.ucsc.edu, April 2003 freeze) along the chromosome on the X-axis. For each sample, at least two experiments were performed (Cy3/Cy5 and Cy5/Cy3), and the final profile corresponds to the mean of two experiments.
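The replicate filter and the averaging of the two dye orientations described above can be sketched as follows. This is a minimal illustration in Python, not the MS-Excel VBA script used in the study: the function names are hypothetical, summarising the retained replicates by their mean is an assumption (the text does not state which summary was used), and the sign inversion of the dye-swapped experiment is likewise assumed. Global normalisation is taken to have been applied beforehand.

```python
import math
from statistics import mean, median

def clone_ratio(replicates, max_dev=0.15, min_kept=3):
    """Summarise the replicate ratios of one BAC clone.

    Replicates deviating from the replicate median by max_dev (15%) or more
    are discarded; the clone is rejected (None) when fewer than min_kept
    replicates remain. Using the mean of the kept replicates is an assumption.
    """
    med = median(replicates)
    kept = [r for r in replicates if abs(r - med) < max_dev * abs(med)]
    if len(kept) < min_kept:
        return None
    return mean(kept)

def dye_swap_log2(ratio_forward, ratio_swapped):
    """Mean log2 ratio of a forward and a dye-swapped experiment.

    Assumes ratio_forward is test/reference and ratio_swapped is
    reference/test, so the swapped log2 ratio is inverted before averaging.
    """
    return (math.log2(ratio_forward) - math.log2(ratio_swapped)) / 2

# Example: one outlying replicate is dropped, three remain, clone is kept.
print(clone_ratio([1.0, 1.05, 0.98, 2.0]))
# Example: a two-fold gain seen in both dye orientations.
print(dye_swap_log2(2.0, 0.5))
```

Inverting the dye-swapped experiment before averaging is what lets dye-specific bias cancel out, which is the usual motivation for running both orientations.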

RNA expression profiling of chromosome 1q using cDNA arrays

Variations in gene expression levels were analysed by large-scale measurement with home-made cDNA mini-arrays (7.5 × 9 cm; 720 human genes; 11 genes cm−2) produced as described (Nugoli et al, 2003). More specifically, our mini-arrays comprised 319 ESTs corresponding to 307 known genes mapping at 1q (Supplementary Table S2). Selection of cDNA clones was performed according to information gathered (and crosschecked) from different web-based databases: Genemap: http://www.ncbi.nlm.nih.gov/genemap99/, Genecards: http://genecards.weizmann.ac.il/, Genelynx: http://www.genelynx.org/ or UCSC Genome Browser, release April 2003: http://genome.ucsc.edu/. Hybridisation signals were quantified using the HDG Analyzer software (Genomic Solutions, Ann Arbor, MI, USA) by integrating all spot pixel signal intensities and removing spot background values determined in the neighbouring area.

Quantitative RT–PCR

RNAs from cell lines and normal breast samples used for real-time PCR were isolated using the RNeasy Minikit (Qiagen, France) in accordance with the supplier's conditions. One microgram of total RNA, treated beforehand with RNase-free DNase (Promega, France), was reverse-transcribed using the SuperScript II RT and 250 ng of random hexamers (Invitrogen, France). Q–PCR reactions were carried out in an ABI Prism 7000 instrument (Applied Biosystems, France) in a final volume of 15 μl according to the supplier's recommendations using SYBR Green as a detector. Primers were as described in Supplementary Table S3. We designed the primers for 17 genes with the assistance of the Primer Express software (Applied Biosystems, France), and for the remaining 14 we used the Quantitect Primer Assays from the Gene Globe database (Qiagen, France). ESRRG primers were as described by Ariazi et al (2002). Standard curves were determined for each gene analysed by the use of serial dilutions from the same pool of cDNAs. Relative quantities were calculated by reference to these curves and relative expression levels of each target gene were normalised to 28S RNA.
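The standard-curve quantification can be illustrated with a short sketch. It assumes the usual relative standard curve approach, in which threshold cycle (Ct) values of the serial dilutions are regressed on the log10 of the relative input quantity; the text does not give these details, so the function names, the use of Ct values and the example numbers are all hypothetical.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares line Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def relative_quantity(ct, slope, intercept):
    """Read a relative quantity off the standard curve for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)

def normalised_expression(ct_target, curve_target, ct_reference, curve_reference):
    """Target quantity divided by the reference (here 28S RNA) quantity."""
    return (relative_quantity(ct_target, *curve_target)
            / relative_quantity(ct_reference, *curve_reference))

# Hypothetical ten-fold dilution series of a cDNA pool (made-up Ct values).
dilutions = [0, -1, -2, -3]
target_curve = fit_standard_curve(dilutions, [20.0, 23.0, 26.0, 29.0])
reference_curve = fit_standard_curve(dilutions, [10.0, 13.0, 16.0, 19.0])
print(normalised_expression(23.0, target_curve, 13.0, reference_curve))
```

Because each gene is read off its own curve, differences in amplification efficiency between primer pairs are absorbed into the slopes rather than assumed away.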

Identification of aberrantly expressed genes in regions of CNC

We applied a supervised analysis scheme to identify genes significantly correlated to CNCs. Sample selection was based on array-CGH profiles. For each consensus region, samples in which at least 25% of the BACs included in the region had a log 2 ratio exceeding 0.25 were considered as amplified. For each available gene at 1q, we computed a discriminating score (DS) by comparing expression levels between the subgroup of samples presenting amplification (subgroup 1) and the subgroup of samples without amplification (subgroup 2). The discriminating score (Golub et al, 1999) was defined as DS=(M1−M2)/(S1+S2), where M1 and S1 represent the mean and s.d. of expression levels of one gene in subgroup 1, and M2 and S2 the corresponding values in subgroup 2. Confidence levels were calculated by performing 1200 iterative random permutations per gene as described previously (Bertucci et al, 2004). The significance threshold for expression differences was a DS of 0.32, corresponding to <0.01 false positives. For Q–PCR results, we applied a t-test analysis.
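The discriminating score and its permutation-based confidence level can be sketched in Python as follows. The formula is the one given above; the use of the sample standard deviation, the one-sided comparison and the way permutations are drawn are assumptions, since these details are given only by reference to earlier work. Function names and the example expression values are hypothetical.

```python
import random
from statistics import mean, stdev

def discriminating_score(group1, group2):
    """DS = (M1 - M2) / (S1 + S2) for one gene (sample s.d. assumed)."""
    return (mean(group1) - mean(group2)) / (stdev(group1) + stdev(group2))

def permutation_p_value(group1, group2, n_permutations=1200, seed=0):
    """Fraction of random relabellings whose DS reaches the observed DS.

    Sample labels (gain vs no gain) are shuffled while group sizes are kept.
    Each group needs at least two samples with non-identical values.
    """
    rng = random.Random(seed)
    observed = discriminating_score(group1, group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if discriminating_score(pooled[:n1], pooled[n1:]) >= observed:
            hits += 1
    return observed, hits / n_permutations

# Made-up expression levels of one gene in samples with and without gain.
with_gain = [5.1, 5.9, 6.3, 5.5]
without_gain = [1.2, 0.8, 1.9, 1.5]
ds, p = permutation_p_value(with_gain, without_gain)
print(ds, p)
```

A gene would be retained in this sketch when its DS reaches the chosen cut-off (0.32 in the text) and the permutation fraction stays below the accepted false-positive rate.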

Results

Patterns of gains and losses at chromosome 1 in breast cancer

We analysed genomic profiles of 30 primary tumours and 30 cancer cell lines by array-CGH using a home-built array covering chromosome 1 at an average density of one clone/0.85 Mb, with some local variations resulting in higher density locally at 1q. All cell lines studied, preselected on the basis of classical CGH profiles, presented gains and/or losses at either 1p or 1q. Array-CGH profiles were in good concordance with classical CGH data, confirming the prevalence of losses on the short arm combined with gains at 1q. However, in contrast to classical CGH data, gains encompassing the whole 1q were rare, with profiles typically showing multiple subregions of gains (Figure 1 and Supplementary Figure S1). Most prevalent gains were at 1q21–q22, 1q23–q24, 1q32 and 1q42–q44, whereas losses were noticeably rare on the long arm (Figures 1 and 2). On average, tumours and cell lines presented 1–3 regions of gains per sample (Figure 1).

Figure 1

Profiles of gains and losses on chromosome 1 in breast cancer. Definition of SROs and events of high magnitude. Grey horizontal lines represent regions of gains (top) or losses (bottom) observed in each tumour or cell line (minimum two BACs involved with a log 2 ratio ≥0.25 or ≤−0.25). Shortest regions of overlap are indicated as bold grey bars with gains above the chromosome ideogram and losses below. Shortest regions of overlap correspond to the smallest overlap shared by at least six tumours or cell lines. Arrow heads indicate events of high magnitude, either peaks of amplification or loss. They corresponded to events with log 2 ratio >0.7 in at least three tumours or cell lines. Code for cell lines 1: BRCAMZ01, 2: BT20, 3: BT474, 4: BT483, 5: CAL51, 6: EFM19, 7: HCC1187, 8: HCC1395, 9: HCC1428, 10: HCC1937, 11: HCC1954, 12: HCC2218, 13: Hs578T, 14: MCF7Rich, 15: MDAMB157, 16: MDAMB175, 17: MDAMB361, 18: MDAMB435, 19: MDAMB436, 20: MDAMB453, 21: MDAMB468, 22: SKBR3, 23: SKBR7, 24: SUM52, 25: SUM149, 26: SUM185, 27: T47D, 28: UACC812, 29: ZR751 and 30: ZR7530. Code for primary tumours 1: VA1593, 2: VA4055, 3: VA4380, 4: VA4390, 5: VA4435, 6: VA4956, 7: VA5033, 8: VA5077, 9: VA5101, 10: VA5410, 11: VA5450, 12: VA6088, 13: VA6190, 14: VA6204, 15: VA6219, 16: VA6277, 17: VA6582, 18: VA6586, 19: VA6660, 20: VA7079, 21: VA7106, 22: VA7417, 23: VA6052, 24: VA6094, 25: VA6138, 26: VA6143, 27: VA6270, 28: VA6403, 29: VA6603 and 30: VA7072.

Figure 2

Definition of consensus regions of gain at 1q. Consensus regions were based on the curve of cumulated occurrence of gains (log 2 ratio ≥0.25) at 1q in 30 cell lines and 30 primary tumours. Low points defined boundaries and high points possible cores. Only regions showing an occurrence exceeding the mean (9.0) were considered. Plots are based on the Mb positioning of the clones on the array. Hence, clones positioned close to each other may appear as merged. Consensus regions of gains were designated G1 through G7 and represented as bold grey lines. Short grey lines represent the position of SROs relative to the consensus regions.

Our aim was to define the cores of the different regions of CNC on chromosome 1 and thus, it was important to determine their boundaries. Correspondingly, we delineated the SROs involved in either gains or losses on the whole chromosome 1. We overlaid all the array-CGH profiles and searched for shortest overlaps shared by at least six independent tumours or cell lines. We defined 19 SROs of gains (one at 1p and 18 at 1q) and 20 SROs of losses (all at 1p) whose sizes ranged from 170 kb to 3.2 Mb (Figure 1). Precise locations and BAC content are described in Supplementary Data (Table S4). However, it must be pointed out that the actual sizes of these regions of overlap may change according to the resolution of the array used to define them.
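One way to picture the overlay step is as an interval sweep: each sample contributes the intervals in which it shows a gain (or a loss), and a region is reported wherever at least six samples overlap. The sketch below is an illustration under that reading, not the procedure actually used; the function name is hypothetical, intervals are treated as half-open (start, end) positions in Mb, and intervals within one sample are assumed not to overlap each other.

```python
def shortest_regions_of_overlap(sample_intervals, min_samples=6):
    """Return the regions covered by at least min_samples samples.

    sample_intervals: one list of (start, end) tuples per sample,
    with start < end and no overlap within a sample (assumed).
    """
    events = []
    for intervals in sample_intervals:
        for start, end in intervals:
            events.append((start, 1))   # an altered segment opens
            events.append((end, -1))    # an altered segment closes
    # At equal positions, closings (-1) sort before openings (+1).
    events.sort(key=lambda event: (event[0], event[1]))

    regions = []
    depth = 0
    region_start = None
    for position, delta in events:
        depth += delta
        if delta == 1 and depth == min_samples:
            region_start = position
        elif delta == -1 and depth == min_samples - 1:
            regions.append((region_start, position))
            region_start = None
    return regions

# Six hypothetical samples with a gain around 10 to 25 Mb.
samples = [[(10, 20)], [(12, 22)], [(11, 19)], [(13, 25)], [(9, 18)], [(14, 21)]]
print(shortest_regions_of_overlap(samples))
```

With real data, the reported boundaries can only be as precise as the spacing of the clones on the array, which is the caveat raised at the end of the paragraph above.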

Although gains were generally of low-to-moderate level, high magnitude amplifications were observed. Similarly, we observed high magnitude losses (Figure 1). We were interested to see whether high magnitude amplifications occurred at recurrent sites and, accordingly, defined seven peaks of amplification, all of which, except that at 1p12, matched with SROs (Figure 1). This discrepancy can be explained by the different criteria used to define peaks of amplification and SROs. Whereas SROs had to be shared by at least six samples to be retained, peaks of amplification needed to occur in only three tumours or cell lines.

Identification of candidate genes involved in CNCs at 1q

Because the relation between genomic gains and increased RNA expression is well established and linked to a selective advantage for cancer cells, we concentrated our efforts on the identification of the genes showing significantly increased expression levels as a consequence of gains at 1q. To this end, we analysed RNA expression profiles of 307 genes located on chromosome 1q in 29 cell lines and 26 tumours using self-made cDNA arrays. We performed a supervised analysis aiming at selecting genes differentially expressed in tumours or cell lines presenting a gain. We formed groups of tumours and cell lines according to their ‘gain’ or ‘no gain’ status in each region. However, based on the 19 SROs, this resulted in a large number of subclasses whose sample sizes were too small to reach statistical significance. To obviate this problem, we defined larger regions of gains, designated consensus regions, which encompassed several SROs. To do this, we determined the occurrence curve for gains at each target clone at 1q. We reasoned that breaks and low points in the curve represented the boundaries of the different regions (Figure 2). We retained only the events whose occurrence exceeded the mean (horizontal bar in Figure 2) and boundaries were defined by vertical lines tangential to the occurrence curve. Seven consensus regions of gains (G1 through G7), ranging from 3.6 to 11 Mb and encompassing two to three SROs on average, were defined at 1q (Table 1 and Figure 2). Gains located between 170 and 180 Mb were not considered because their occurrence was below the threshold. Of the 307 genes studied, 178 were located within the consensus regions of gains defined at 1q. To identify genes whose expression levels were significantly modified in relation to CNC, we calculated the DS followed by 1200 random permutations (gain vs no gain); our significance threshold for expression differences was a DS of 0.32, corresponding to <0.01 false positives. This resulted in the selection of 30 genes distributed in consensus regions G1 through G7 (Table 2). Interestingly, we noted that a number of the selected genes were located in close vicinity to each other, suggesting the existence of local clusters, possibly related to the existence of core regions of gain.
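The occurrence curve and the mean-based cut-off can be illustrated with a small sketch. It only covers the first part of the procedure, counting gains per clone and keeping stretches of clones whose occurrence exceeds the mean; splitting such stretches at internal low points of the curve, as described above, is not attempted. Function names, the strict ">" comparison against 0.25 and the toy profiles are assumptions made for illustration.

```python
from statistics import mean

def occurrence_curve(log2_profiles, threshold=0.25):
    """Count, per clone, the samples whose log2 ratio exceeds threshold.

    log2_profiles: one list of log2 ratios per sample, all in the same
    clone order; None marks a clone that failed quality control.
    """
    n_clones = len(log2_profiles[0])
    return [
        sum(1 for profile in log2_profiles
            if profile[i] is not None and profile[i] > threshold)
        for i in range(n_clones)
    ]

def regions_above_mean(positions_mb, occurrence):
    """Contiguous runs of clones whose occurrence exceeds the mean occurrence."""
    cutoff = mean(occurrence)
    regions = []
    start = None
    for i, count in enumerate(occurrence):
        if count > cutoff and start is None:
            start = i
        elif count <= cutoff and start is not None:
            regions.append((positions_mb[start], positions_mb[i - 1]))
            start = None
    if start is not None:
        regions.append((positions_mb[start], positions_mb[-1]))
    return regions

# Three toy samples over six clones at hypothetical Mb positions.
profiles = [
    [0.0, 0.3, 0.4, 0.0, 0.5, 0.0],
    [0.1, 0.5, 0.3, 0.1, 0.3, None],
    [0.0, 0.0, 0.6, 0.0, 0.0, 0.0],
]
positions = [145, 146, 147, 148, 149, 150]
curve = occurrence_curve(profiles)
print(curve, regions_above_mean(positions, curve))
```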

Table 1 Description of consensus regions of gain at 1q
Table 2 Gene expression analysis at 1q and correlation with copy number gain

Candidate gene verification by Q–RT–PCR

In order to confirm expression profiling results, we measured the RNA expression levels of 28 out of 30 genes by Q–RT–PCR in 25 cell lines typed by array-CGH. The c1orf2 and HNRPU genes could not be studied because of unsuccessful primer design. In addition to the 28 genes selected from the cDNA array data, we studied the recently identified candidate oncogene RAB25 (Cheng et al, 2004), which is located in consensus region G2, in close vicinity to two of our candidate genes, MAPBPIP and CCT3 (Table 2). A t-test analysis revealed that only 5 out of 29 genes showed P-values ≤0.05, indicative of significant expression differences in relation to gains. We reasoned that this may be owing to the small sample size (we had to restrict our Q–PCR analysis to the 25 cell lines because tumour RNAs were no longer available) and decided to consider genes with P-values ≤0.1. This allowed us to pick out a total of 11 genes (Table 2). It was, however, noticeable that the RAB25 gene was not selected in this test, whereas it was selected when we compared mean expression levels in cancer cell lines with those in a series of five normal breast tissues (t-test P-value=0.002). We thus applied this test to the whole set of genes, which revealed that 21 out of 29 genes were significantly overexpressed in cancer cell lines compared to normal breast.

Discussion

Chromosome 1 is a prevalent site of numerical anomalies combining losses on the short arm and gains on the long one in breast carcinomas (Courjal and Theillet, 1997; Teixeira et al, 2002). Gains at 1q are found in over 50% of breast tumours. Although being frequent in high-grade breast cancer, they have been related to ER-positive cancers (Rennstam et al, 2003; Loo et al, 2004) and have been suggested to occur early in the natural history of the disease (Buerger et al, 1999; Malamou-Mitsi et al, 1999). These particularities fostered our interest in characterising the genomic regions involved in CNCs and identifying genes at 1q whose expression was modified in relation to gains.

Array-CGH data presented here confirm chromosomal CGH results showing the duality on chromosome 1, with the short arm being mainly involved in losses and the long arm almost exclusively in gains. Our data clearly indicated the existence of multiple subregions of losses at 1p and of gains at 1q. In an attempt to define these subregions with greater precision and possibly delimitate their cores, we determined the SROs for gains (19 SROs) and losses (20 SROs) on chromosome 1, whose sizes ranged from 170 kb to over 3 Mb. Shortest regions of overlap were defined according to the classical LOH scheme, in order to narrow down genetic intervals encompassing candidate genes. Our data thus suggest that numerical anomalies at chromosome 1, be it losses or gains, are complex and involve a large number of subregions and possibly combinations of anomalies.

Although losses at 1p were observed in a sizeable portion of the tumours and cell lines, gains were notably prevalent. This was in full agreement with previous chromosomal CGH results by us and other groups (Courjal and Theillet, 1997; Tirkkonen et al, 1998; Malamou-Mitsi et al, 1999; Larramendy et al, 2000). Interestingly, gains at 1q were of low-to-moderate level with a lower prevalence of amplifications compared to other chromosomes. Furthermore, no sharp transitions were observed at the boundaries of amplification peaks at 1q, in contrast to chromosomes 8p or 17q, where such recurrent breakpoint sites were common (Orsetti et al, 2004; Gelsi-Boyer et al, 2005).

The relation between aberrant gene dosage and gene expression is well accepted and is best shown in the case of CNI. The common nature of genomic gains (which include DNA amplification) in breast tumours indicates that it is an effective mechanism of positive genetic selection in cancer cells (Upender et al, 2004). By cDNA-array expression profiling, we identified 30 genes whose RNA expression was significantly increased in relation to genomic gains. Overexpression in the presence of genomic gain could be confirmed for only 11 genes by Q–RT–PCR. We suspect that these numbers may be related to the small size of our sample. We had to restrict our Q–PCR verification to 25 cell line RNAs, because tumour RNAs were no longer available. We noted that 21 genes presented significant overexpression when compared to normal breast, suggesting the involvement of a larger number of genes within our original selection. It was interesting to see that, with the exception of PLU-1/JARID1B (Lu et al, 1999), all the genes identified in our study were newly proposed as candidate cancer genes. Furthermore, MUC1 (Schroeder et al, 2004), a long known cell surface marker overexpressed in a sizeable fraction of breast tumours, and KIF14 (Corson et al, 2005), a recently proposed candidate at 1q31, presented DSs below the threshold and were excluded from our selection. Genes selected in our study belong to rather diverse functional groups, of which three appeared prevalent. The first corresponded to a broad collection of positive regulators of cell proliferation. They include PIP5K1A, MAPBPIP, RAB25A, PCTK3, RAB4 and MPZL1. The second was made of genes whose products were related to transcriptional regulation or chromatin remodelling, such as USF1, JARID1B, TBX19 or CROC4. The third included genes involved in cellular trafficking (VPS45A, ARF1, LYST, CCT3) or basic cellular metabolism (CA14, ALDH9A1). Note that RAB25 has also been related to the activation of protein trafficking between the membrane and the endoplasmic reticulum (Cheng et al, 2005). Similar functional groups have been observed in other selections of genes involved in genomic gains or amplifications, thus indicating the importance of activated transcription, increased signalling and protein trafficking or catabolism in cancer. However, 8 out of 24 overexpressed genes did not belong to any of the above-mentioned functional groups. Although two genes, PDZK1 and MLLT11, were clearly relevant to cancer, as both have been proposed as candidate oncogenes in diverse haematological malignancies (Busson-Le Coniat et al, 1999; Inoue et al, 2004; Tse et al, 2004), six were more difficult to relate to cancer. Three corresponded to genetic determinants of genetic syndromes (MTMR, DISC1, MTX1) and the three others bore functions with no obvious link to cancer (NENF, ENSA, TARBP1).

We were interested to verify the concordance between our analysis and the recently described ‘Transcriptome Correlation Map’ (Reyal et al, 2005), which defined groups of collinear genes showing coordinated expression. Their data set indicated 235 genes presenting a significant Transcriptome Correlation Score (TCS) at 1q, of which 147 mapped within the regions of gains defined in our work; of these, 72 (48%) were located in G1 and G2 (1q21 or 1q22). Genes within consensus regions of gains presented a significantly higher TCS, in accordance with the existence of a link between increased expression and copy number gains at 1q. This was further corroborated by the fact that 14 out of 30 (43%) genes selected by DS showed significant TCS, which is an increase compared to the 55 out of 178 (31%) genes common to both studies and located in the regions of gains. This suggested an enrichment of genes belonging to the transcriptome correlation map in our set of candidate genes at 1q and contrasted with our previous findings at 8p (Gelsi-Boyer et al, 2005).

Despite their frequent nature, numerical anomalies affecting chromosome 1 in breast and other cancers have drawn less attention than they deserve. Most studies focussed on specific subregions or candidate genes. In this work, we characterised at high resolution the regions recurrently involved in copy number alterations on chromosome 1 in breast cancer and identified 24 candidate genes overexpressed in regions of gains at 1q. To our knowledge, this is the first study mapping regions of loss and gain at high resolution over the whole length of chromosome 1 and proposing a series of candidate genes affected by CNCs. Further work will be needed to ascertain the true relevance of these candidate genes to breast cancer. This will require bioclinical and functional studies. Moreover, as our screen was based on a set of 307 known genes representing 40–50% of the genes assigned at 1q, our selection leaves room for the identification of additional candidate genes.