Identification and characterisation of constitutional chromosome abnormalities using arrays of bacterial artificial chromosomes

Constitutional chromosome deletions and duplications frequently predispose to the development of a wide variety of cancers. We have developed a microarray of 6000 bacterial artificial chromosomes for array-based comparative genomic hybridisation, which provides an average resolution of 750 kb across the human genome. Using these arrays, subtle gains and losses of chromosome regions can be detected in constitutional cells, following a single overnight hybridisation. In this report, we demonstrate the efficiency of this procedure in identifying constitutional deletions and duplications associated with predisposition to retinoblastoma, Wilms tumour and Beckwith–Wiedemann syndrome.

Predisposition to a variety of cancer syndromes occurs as a result of the inheritance of gains and losses of chromosomes and chromosome regions. Currently, Giemsa-banding analysis of constitutional chromosomes is the most usual way of identifying these chromosome abnormalities. Typically, these cytogenetic approaches can readily identify whole chromosome changes as well as intrachromosomal deletions, with a maximum resolution of approximately 10 megabase pairs (Mbp). The limitation of these analyses is that they cannot always reliably detect small chromosome deletions or amplifications. If the regions of the chromosome involved are known, for example, based on the clinical features of the syndrome, it is possible (Kempski and Cowell, 1993; Cowell et al, 1994) to analyse the chromosomes using fluorescence in situ hybridisation (FISH). Duplications of chromosome regions are harder to identify by FISH, especially if the duplication is small and tandem. A further limitation of this FISH approach is that it is usually performed on a locus-by-locus basis, which requires some clinical indication of the part of the genome that is involved. Identifying the specific chromosome abnormality can be important for the clinical management of the patient, since the larger the deletion, the more severe the associated clinical phenotype.
Despite the limitations of karyotype analysis, it has been the cornerstone for providing the diagnosis of human chromosome-related syndromes. A method that unequivocally identifies the presence of the chromosome changes in an unbiased analysis, and also defines the exact region involved directly from DNA samples in a relatively short time, would address most of the shortcomings of cytogenetics-based approaches. Over the past several years, we have been developing a hybridisation approach that allows an analysis of chromosome deletions and amplifications, without the need to study metaphase chromosomes. This approach is referred to as array comparative genome hybridisation (CGHa), and consists of a series of mapped BACs arrayed on a glass slide to which DNA from test and control samples is competitively hybridised (Hodgson et al, 2001; Snidjers et al, 2001; Cowell and Nowak, 2003).
In previous reports, CGHa has been used in highly focused studies using limited sets of BACs, but only from well-defined regions of the genome. Thus, Veltman et al (2002) used CGHa arrays which comprised <100 BACs from the subtelomeric regions of the human chromosomes to identify subterminal deletions. Similarly, Veltman et al (2003b) used approximately 100 BACs from chromosome 18 to specifically investigate the deletions associated with congenital aural atresia, and approximately 100 BACs from chromosome 22 were used to identify deletions of the NF2 gene in NF2 patients (Bruder et al, 2001). In these cases, it was necessary to know where in the genome to look, and any other genetic changes in the genome would be missed. Larger arrays have been used in the analysis of a number of different cancers, but again the level of resolution has been in the range of 1-1.5 Mbp or less. Thus, Wessendorf et al (2003) used an array of approximately 500 BACs to study B-cell non-Hodgkin lymphoma, but these BACs only sampled regions of the genome which had previously been shown to be involved in this malignancy. Arrays containing approximately 2000 BACs were used to study renal cell cancer (Wilhelm et al, 2002), bladder cancer (Veltman et al, 2003a) and a number of cell lines (Fiegler et al, 2003), and could identify large genetic changes. The resolution of the BAC arrays, however, is an important consideration, since more dense arrays will detect smaller deletions and amplifications, which, in turn, provide the best opportunity to define the driver gene for the abnormality.
In this report, we describe an array of 6000 BACs, which provides an average inter-BAC interval across the genome of 500 kb. In a single hybridisation, using as little as 50-100 ng of DNA, numerical chromosome abnormalities can be identified over the whole genome at a far greater resolution than previous reports (Cowell et al, 2003, 2004a, 2004b; Cowell and Nowak, 2004). We have used these CGH arrays to investigate their ability to define constitutional chromosome abnormalities associated with a number of different syndromes carrying deletions and duplications of varying size. In all cases, the specific abnormalities could be detected, which supports the idea that CGHa for this application could replace conventional karyotype analysis for most of the cancer predisposition syndromes that result from structural chromosome abnormalities.

MATERIALS AND METHODS
DNA samples were prepared from lymphoblastoid cell lines previously derived from peripheral blood leucocytes. These cell lines were cultured in RPMI medium supplemented with 10% foetal calf serum and 10 mM glutamine.

BAC array generation
A genome-wide resource of ~6000 FISH-mapped, gene/marker content verified, and sequenced BAC clones (Cheung et al, 2001) from the RPCI-11 human BAC library is represented as immobilised DNA targets on glass slides for array-based CGH analysis. Each clone is spotted in triplicate at 280 μm intervals (see http://genomics.roswellpark.org for a complete list of clones). The average inter-BAC interval on the array is approximately 500 kb, although the regions flanking the centromeres of all of the chromosomes are relatively under-represented, because of the high density of repetitive elements.

DNA preparation
Genomic DNA was prepared from all samples using the FlexiGene DNA Isolation kit (Qiagen, Inc.) according to the manufacturer's instructions. Two control DNA pools are used for BAC CGH array analysis. The male control and female control pools each contain DNA from 15 cytogenetically normal individuals. For procedural quality control, all analyses are performed as sex-mismatch hybridisations. This allows determination of chromosome X and Y copy number as an internal reference standard (see Figure 1).

Labelling of DNA
A measure of 1 μg of control and test genomic DNA was random primer labelled using a BioPrime DNA labelling kit (Invitrogen, Inc.) for 3 h at 37°C, with the appropriate Cyanine dye (Cy3 or Cy5). After ethanol precipitation, the probes are resuspended in H2O and combined. Unincorporated Cy dye is removed by passage over a Qiagen spin column. The labelled probes are dried and stored at −20°C until hybridisation.

Hybridisation
Briefly, the arrays are preblocked with 110 μl Ambion SlideHyb Buffer #3 and 1 μl of 20 μg μl−1 Human Cot-1 DNA solution at 50°C in a GeneTAC hybridisation station (Genomic Solutions, Inc.) for 30 min. Prior to hybridisation, the probe is resuspended in 110 μl Ambion SlideHyb Buffer #3 containing 5 μl of 20 μg μl−1 Cot-1 and 5 μl of 100 μg μl−1 yeast tRNA, heated to 95°C for 5 min and placed on ice. The prehybridisation buffer is removed, the entire probe added to the hybridisation chamber, and hybridisation proceeds for 16 h at 65°C in the GeneTAC. After hybridisation, the slide is washed in decreasing concentrations of SSC and SDS, followed by one 0.1× SSC wash, one 95% EtOH rinse and centrifugal drying for 3 min.

Image analysis
The hybridised slides are scanned using an Affymetrix 428 Scanner to generate high-resolution (10 μm) images for both Cy3 and Cy5 channels. Image analysis is performed on the raw image files using ImaGene (V4.1, BioDiscovery). Each spot is defined by a circular region, the size of which is programmatically adjusted to match the size of the spot. A buffer region of 2-3 pixels around the spot is ignored and a region 2-3 pixels wide outside the buffer region is considered the local background for that spot. Each spot and its background region are segmented using a proprietary optimised segmentation algorithm, which excludes pixels that are not representative of the rest of the pixels in that region. The background-corrected signal for each BAC is the mean signal (of all the pixels in the region) minus the mean local background. The output of the image analysis is in the form of two tab-delimited files, one for each channel, containing all of the fluorescence data.
Figure 1 CGHa profile of constitutional DNA from patient GOS 115. The individual BACs for all chromosomes show a test/control ratio about a mean of 0 (no change, log scale), with the exception of the 13q14 region where the ratio is −0.5, indicating the presence of a heterozygous deletion (see text). The sex chromosome mismatch for this male patient was an XX control, which demonstrates a ratio of −0.5, which is expected in this experiment for a hemizygous deletion.

Data analysis
The output of the image analysis is processed by a program written in Perl and R, developed at RPCI. For each spot, the ratio is calculated from the background-subtracted mean signal of the two channels. The ratios are then normalised on the log scale with a nonlinear normalisation algorithm. Basically, for all spots that are flagged as having met the qualitative spot criterion, the log2 background-subtracted mean signal is plotted and a lowess function is applied. The normalised ratios are the computed ratios minus the expected values on the curve.
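The normalisation step described above can be illustrated with a minimal Python sketch. It is not the authors' Perl/R program. The text does not state which intensity measure the lowess curve is fitted against, so the sketch assumes the common choice of fitting the log2 ratio against the mean log2 intensity of the two channels. The function names (`log2_ratio`, `lowess_fit`, `normalise`) and the smoothing fraction `frac=0.5` are hypothetical.

```python
import math


def log2_ratio(test_signal, test_bg, control_signal, control_bg):
    """Background-subtracted log2(test/control) for a single spot."""
    test = test_signal - test_bg
    control = control_signal - control_bg
    if test <= 0 or control <= 0:
        return None  # no usable signal above background
    return math.log2(test / control)


def lowess_fit(x, y, frac=0.5):
    """Local linear regression with tricube weights (a basic lowess).

    Returns the fitted value of y at every x. `frac` is the fraction of
    points used in each local fit (assumed value, not from the paper).
    """
    n = len(x)
    k = max(3, min(n, math.ceil(frac * n)))
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        w = [max(0.0, 1 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalise(log_ratios, mean_log_intensity, frac=0.5):
    """Normalised ratio = computed ratio minus the expected value on the curve."""
    curve = lowess_fit(mean_log_intensity, log_ratios, frac)
    return [m - c for m, c in zip(log_ratios, curve)]
```

Only spots that pass the quality flag would be passed to `normalise`. An intensity-dependent dye bias then appears as a trend in the fitted curve and is subtracted from each ratio.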
The results of the triplicate replicas are combined by taking the mean of the log2 ratios and the standard error is calculated. Any BAC that has fewer than two replicates flagged as having met the qualitative spot criterion is excluded. Mapping information is added to the resulting ratios and standard errors. The mapping data for each BAC are found by querying the human genome sequence at http://genome.ucsc.edu. The Nov 14 2002 build is currently being used to precisely position the BAC clones on the draft sequence. The output, a tab-delimited file, is imported to Excel for graphing.
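As an illustration of the replicate-combination rule, the following Python sketch computes the mean log2 ratio and its standard error per BAC and drops any BAC with fewer than two replicates passing the spot criterion. The input layout, the function name `combine_replicates` and the clone labels used in the usage note are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of passing replicates.

```python
import math
from statistics import mean, stdev


def combine_replicates(replicates, min_flagged=2):
    """Combine replicate spots per BAC.

    replicates: dict mapping clone name -> list of (log2_ratio, passed_qc).
    Returns a dict mapping clone name -> (mean log2 ratio, standard error).
    BACs with fewer than `min_flagged` passing replicates are excluded.
    """
    combined = {}
    for clone, spots in replicates.items():
        good = [ratio for ratio, passed in spots if passed]
        if len(good) < min_flagged:
            continue  # too few reliable spots for this BAC
        # stdev needs at least two values; guard in case min_flagged is lowered
        se = stdev(good) / math.sqrt(len(good)) if len(good) > 1 else float("nan")
        combined[clone] = (mean(good), se)
    return combined
```

For example, a hypothetical clone with three passing spots at −0.5, −0.4 and −0.6 is reported with a mean of −0.5, whereas a clone with only one passing spot is omitted from the output.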

Interpretation
The final ratio represents the relative amounts of DNA from the experimental sample and the reference control sample. Equal amounts of control and test DNA are labelled and the ratios are normalised to 1 (0 on the log scale), effectively normalising the array to the average modal number of the test sample. Knowledge of the sex of the test sample is used to perform a sex mismatch between the test sample and the control, providing an internal control for copy number. Typically, a degree of suppression is observed in these ratios. The X chromosome, therefore, can be used to estimate the amount of suppression when the test sample has a normal number of sex chromosomes, that is, XX or XY.

RESULTS AND DISCUSSION
Using conventional cytogenetics, retinoblastoma (Rb) patients with mental retardation and dysmorphic features have typically been shown to carry deletions on the long arm of chromosome 13 involving the 13q14.3 region (Cowell et al, 1989a). These deletions eliminate one copy of the retinoblastoma predisposition gene (RB1). Small deletions, however, do not have the same range of diagnostic clinical features, and screening all Rb patients cytogenetically is cost prohibitive, since less than 10% will carry deletions (Cowell et al, 1989a). The esterase-D (ESD) gene was localised adjacent to RB1 (van Heyningen et al, 1975), such that deletions involving this region could be identified by measuring ESD activity in red blood cells from the patient. To identify these deletions, we developed a screening procedure which involved measuring the ESD activity in retinoblastoma patients (Cowell et al, 1986a). Since the ESD gene lies approximately 650 kb centromeric to RB1 (Young et al, 1988), this was the only chromosome-independent approach for the detection of deletions at that time. Esterase-D quantitation for the discovery of deletions was occasionally confounded by the possibility that the proximal breakpoint of the deletion separated the ESD and RB1 genes (Cowell et al, 1987; Mitchell and Cowell, 1988). Although ESD quantitation proved to be a very quick and effective screening tool (Cowell et al, 1986a), the low endogenous activity of the rarer '2' allele meant that 2-2 homozygotes would often produce enzyme levels that were close to 50% of the normal controls (Cowell et al, 1986b). This was particularly problematic in groups such as the Japanese population (Horai and Matsunaga, 1984), where the incidence of the 2-allele is significantly greater (40%) than in Caucasian (10%) populations (Cowell et al, 1986a).
To assess the ability of the CGH BAC arrays to identify 13q14 deletions in Rb patients, we performed hybridisations using DNA derived from a series of lymphoblastoid cell lines (Cowell et al, 1989a) established from representative Rb patients with deletions of varying length. An example of a complete genomic profile that is produced from the 6000-clone BAC array is shown in Figure 1. All of the chromosomes show clustering of the hybridisation ratios about the mean of 0 (diploid on the log2 scale). The heterozygous deletion in two patients (GOS 115 and GOS 191) was readily identified as the only abnormality in the sample (Figure 2). The deletion associated with patient GOS 191 was cytogenetically easily detectable (Cowell et al, 1989a) and this deletion was shown by CGHa to extend between BACs RP11-11k16 (32.04 Mbp) and RP11-37i8 (64.09 Mbp), which, based on the human genome sequence, represents a distance of 32.05 Mbp. The deletion associated with patient GOS 115, however (Figure 2), was more subtle (Cowell et al, 1989a), and was seen as a subband deletion surrounding the RB1 locus. Our CGHa analysis demonstrates that, in fact, this deletion spans the region of 13q14 between BACs RP11-20k19 (44.86 Mbp) and RP11-37i8 (64.09 Mbp), which constitutes maximally 20 Mbp. This analysis, therefore, also provides an approximate relationship between the DNA sequence and the appearance of deletions in metaphase chromosomes at the 850-band resolution. The RB1 gene is located at position 47.81-47.99 Mbp along the long arm of chromosome 13.
The CGHa data demonstrate some important empirical details for interpretation of the profiles. The prediction from the XX/XY mismatch is that, on the log scale, there should be a ratio of −1 for DNA from males and +1 for females. In fact, we consistently see that this ratio is closer to ±0.5. Although we do not know the full cause of this suppression of the hybridisation ratio, we presume that much of it is due to nonspecific hybridisation in the system, most likely due to repetitive sequences which cannot be competed out using Cot-1 DNA (Cowell and Nowak, 2003). Despite this reduction in hybridisation ratio, however, it is clear that deletions and duplications (see below) show a change in ratio that is consistent with that seen for the X chromosome in the same experiment, which makes definition of the chromosome change relatively easy.
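The use of the X chromosome as an internal reference can be sketched as follows. This is an illustrative Python example, not part of the published analysis pipeline. It computes the theoretical log2 ratio for X-chromosome BACs in a sex-mismatched hybridisation, estimates the suppression as the observed median X ratio divided by that theoretical value, and checks whether an autosomal ratio is of similar magnitude to the X reference. All function names are hypothetical, and the tolerance of 0.2 is an arbitrary assumption.

```python
import math
from statistics import median


def expected_x_log2(test_sex):
    """Theoretical log2(test/control) for X BACs in a sex-mismatched experiment."""
    copies = {"male": (1, 2), "female": (2, 1)}  # (test X copies, control X copies)
    test, control = copies[test_sex]
    return math.log2(test / control)


def suppression_factor(x_log2_ratios, test_sex):
    """Observed median X ratio as a fraction of the theoretical ratio."""
    return median(x_log2_ratios) / expected_x_log2(test_sex)


def matches_x_reference(ratio, x_reference, tolerance=0.2):
    """True if an autosomal ratio has a magnitude similar to the X reference.

    The tolerance is an assumed value chosen for illustration only.
    """
    return abs(abs(ratio) - abs(x_reference)) <= tolerance
```

With a male test sample, the theoretical X ratio is −1. An observed median of −0.5 therefore corresponds to a suppression factor of 0.5, and an autosomal region at −0.5 is flagged as matching the single-copy reference in the same experiment.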
During our analysis of retinoblastoma patients using ESD screening, we identified a potential chromosome deletion in patient GOS 203, where the enzyme levels were 50% of that seen in normal controls (Cowell et al, 1986b). This patient was shown, using starch gel electrophoresis, to be homozygous for the '2' allele, which we had already shown had an inherently lower activity than the 1-allele. Heterozygotes show reduced levels (Cowell et al, 1986b), but not large enough to suggest a 50% reduction in activity. Patient GOS 203 showed mild dysmorphic features and reduced cognitive ability. Chromosome analysis, however, appeared normal but did not, together with the ESD levels and clinical phenotype, rule out the presence of a submicroscopic deletion unequivocally. Clearly, although a rare case, genetic counselling in this family was inadequate, since we could not ignore the potential that she carried a deletion based on the enzyme assays. These deliberations were tempered by the lack of convincing cytogenetic data and the mild clinical phenotype. The CGHa profile for chromosome 13 from this patient is shown in Figure 2, and clearly shows diploid levels along the length of the chromosome and in particular for BAC RP11-174i10, which contains the RB1 gene. This result formally demonstrates that this patient does not carry a 13q14 deletion. Although it has been some time since genetic counselling was given to this patient, the ESD result clearly influenced this family in their choice not to have children at the time.
We next extended our CGHa analysis to patients who had been reported as having 13q-syndrome, which involves various partial deletions in the q22-qter region (Luo et al, 2000; Gutierrez et al, 2001; Hewson and Carter, 2002). These patients have a well-defined set of clinical phenotypes, including mental retardation, where the deletion was generally assumed to involve the terminal region of 13q, although somatic cell hybrid studies (Hawthorn and Cowell, 1995) suggested that these were, in fact, subterminal deletions. In this CGHa study, we analysed two patients, GOS 71 and GOS 107, reportedly with 13q-syndrome (Figure 3). The syndrome in patient GOS 71 included hypertonia, small stature, low set posteriorly rotated ears, bilateral simian creases, metatarsus varus, cryptorchidism, high arched palate and wide alveolar margins. In contrast, the features of patient GOS 107 included developmental delay, short stature, hearing impairment, tracheo-oesophageal fistula, renal impairment and asthma. Although the prior cytogenetic analysis had suggested the same diagnosis in these cases, the clinical phenotypes were different and CGHa analysis provided the basis for this. GOS 107 showed the typical deletion involving the 13q31-33 region, but not including the telomere, confirming that, in fact, this is an interstitial deletion which was located between BACs RP11-86c3 (89.2 Mbp) and RP11-7b23 (101.15 Mbp), which spans a 12 Mbp region. By contrast, GOS 71 showed a much more proximal deletion involving the 13q12-13 region, between BACs RP11-179a7 (33.2 Mbp) and RP11-269c23 (43.87 Mbp), a distance of 10.67 Mbp. Clearly, although there is some overlap in the clinical phenotype between these two patients, the deletions are very different, which accounts for the discrepancy in their clinical phenotype. Importantly, the original cytogenetic diagnosis for GOS 71 was misinterpreted and this deletion does not include RB1, which is located at 47.9 Mbp.
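The interval arithmetic behind these chromosome 13 comparisons can be checked with a short Python sketch. It uses only the flanking BAC positions and the RB1 coordinates quoted in the text. The dictionary layout and the function names are hypothetical, and the positions of the flanking BACs are treated as the maximal extent of each deletion.

```python
# RB1 position on chromosome 13 as quoted in the text (Mbp)
RB1 = (47.81, 47.99)

# Flanking BAC positions (Mbp) reported for each chromosome 13 deletion
DELETIONS = {
    "GOS 191": (32.04, 64.09),
    "GOS 115": (44.86, 64.09),
    "GOS 107": (89.2, 101.15),
    "GOS 71": (33.2, 43.87),
}


def size_mbp(interval):
    """Distance between the two flanking BACs, rounded to 0.01 Mbp."""
    start, end = interval
    return round(end - start, 2)


def contains(interval, gene):
    """True if the gene lies entirely between the flanking BACs."""
    return interval[0] <= gene[0] and gene[1] <= interval[1]


def summarise(deletions, gene):
    """Map each patient to (deletion size in Mbp, whether the gene is included)."""
    return {
        patient: (size_mbp(iv), contains(iv, gene))
        for patient, iv in deletions.items()
    }
```

Calling `summarise(DELETIONS, RB1)` reproduces the distances given in the text (32.05, 19.23, 11.95 and 10.67 Mbp) and shows that RB1 falls within the GOS 191 and GOS 115 intervals but outside those of GOS 107 and GOS 71.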
The CGHa study of constitutional deletions on chromosome 13 clearly has the advantage of speed and accuracy of diagnosis. To extend our studies, we investigated DNA from other patients with syndrome-related chromosome abnormalities. Aniridia is a rare hereditary disease resulting in the absence of irises. In the familial form, the phenotype segregates as an autosomal dominant disorder due to mutations in the PAX6 gene (Davis and Cowell, 1993). Sporadic cases of aniridia show a 50% increased risk of developing Wilms tumour (WT), a pediatric cancer of the kidney (Riccardi et al, 1978). In these patients, the cancer predisposition results from the presence of a constitutional deletion involving the 11p13 region containing both the WT1 and PAX6 genes. Patients with PAX6 gene mutations clearly represent the hereditary form of the disease, and are not at increased risk of developing WT. From a genetic counselling standpoint, a sporadic case of aniridia could either carry a deletion predisposing to WT, or carry a de novo mutation in the PAX6 gene. Therefore, being able to exclude the tumour risk in these patients would involve either a mutation study of the PAX6 gene, which is a complex and time-consuming procedure, or a cytogenetic analysis of the 11p13 region. To assess the utility of the CGHa approach in this situation, we used DNA from patient GOS 157, which we had previously demonstrated to carry a small deletion involving the 11p13 region (Cowell et al, 1989b). The results are shown in Figure 4, where the presence of a heterozygous deletion is clearly seen in the short arm of chromosome 11, spanning a distance between 26.62 and 44.6 Mbp (18 Mbp), which includes the PAX6 and WT1 genes.
From the experiments described above, it is clear that CGHa can be used to quickly and efficiently identify heterozygous constitutional deletions, and could be easily extended to other syndromes such as the 15q deletion associated with Angelman/Prader Willi syndrome (Vogels and Fryns, 2002) or the 22q deletions associated with DiGeorge syndrome (Baldini, 2002). Other genetic syndromes, however, are associated with extra copies of small chromosome regions. An example of this (Mannens et al, 1994) is the chromosome imbalance that is associated with the pediatric cancer predisposition syndrome, Beckwith–Wiedemann syndrome (BWS). In these cases, it has been demonstrated that duplication of a region in 11p15, which results in three copies of the 11p15.5 region, is responsible for the phenotype in some cases. BWS may sometimes be confused with the phenotypically similar Perlman's syndrome, which has a much higher frequency of cancer than BWS (Grundy et al, 1992), and for which no chromosome abnormality has yet been identified. To determine whether these types of chromosome aberration can also be detected using CGHa, we analysed the DNA from a patient who had been shown by extensive molecular and cytogenetic analysis to carry a nonreciprocal chromosome translocation t(5;11)(p15;p15), which resulted in the triplication of the 11p15 region (Grundy et al, 1998). The CGHa profile from this patient, GOS 637, is shown in Figure 4, and demonstrates that the translocation event is more probably the result of an insertion of the 11p15.5 region spanning BACs RP11-120e20 (3.67 Mbp) and RP11-6k5 (20.37 Mbp), covering approximately 16.7 Mbp, in the distal region of 5p15. On the log scale, all of the BACs in this region show an intensity ratio of +0.5, confirming the presence of an extra copy of this region. The most distal BACs (0-3.67 Mbp), however, show a ratio closer to 1. The most telomeric BAC on the array, RP11-123f4, is clearly only present in a diploid complement and, although the adjacent series of BACs show a ratio of 1.2, this is still within the range of 'noise' shown for the other BACs along the chromosome, suggesting this region is also present in only two copies. Thus, although the diagnosis of BWS is not at issue, this analysis provides valuable information about the extent, and hence the gene content, of the region involved.
Our CGHa analysis of constitutional chromosome abnormalities, therefore, provides a demonstration that heterozygous, predisposing chromosome deletions and duplications can easily be detected and accurately defined. The high resolution of this array means that small chromosome deletions, which cannot be detected using conventional chromosome analysis, will also be identified. The other advantage of CGHa is that only small amounts of DNA are required for the hybridisation, without the need for the preparation of metaphase chromosome spreads, which means that the analysis can be performed using DNA from nondividing tissue such as buccal swabs or skin biopsies. As a consequence, the diagnosis can be made in a relatively short time, since there is no need for extensive culture periods for sample preparation. One area where this advantage would be particularly useful is in the analysis of amniotic fluid cells or chorionic villus samples for prenatal diagnosis of hereditary chromosome abnormality syndromes. The rapid turnaround time associated with CGHa also presents clear advantages for the clinical management of these patients, and has important implications for genetic counselling.
The analysis of specific chromosome deletions has frequently led to the discovery of the gene(s) involved in the associated phenotype. The ability of CGHa to clearly define the gene content within the deleted or amplified region and to compare these observations between patients provides a rapid way of selecting candidate genes for more detailed study.