Main

Considerable effort has been invested in establishing frameworks for the evaluation of genetic tests. A notable example is the ACCE format, which concentrates on four key areas: the analytical validity, clinical validity, clinical utility, and ethical, legal, and social implications of genetic tests.1 This paper reports the results of an evaluation of the clinical validity of array-based comparative genomic hybridization (array CGH) in patients with learning disability (LD) and discusses some of the challenges faced in evaluating an emerging genetic test. This project was undertaken at the request of the UK Genetic Testing Network to examine any implications for its use in the UK National Health Service.

LD can be defined as a significant impairment of cognitive and adaptive functions, with onset before 18 years of age.2,3 The disease burden from LD is substantial: the World Health Organization has estimated a prevalence of 3% in industrialized countries.4 Genetic factors have been estimated to be the main cause of LD in approximately half of all patients with severe LD and approximately 15% of patients presenting with mild LD.5 Chromosomal abnormalities are present in approximately 16% of individuals with LD (range: 4.0–34.1%).6 These structural chromosomal abnormalities are often associated with dysmorphic features, congenital abnormalities, and growth problems, many of which are nonspecific. Some syndromes, such as Down syndrome and Turner syndrome, are due to copy number changes involving whole chromosomes and are easily detectable with the light microscope. Other syndromes, such as cri-du-chat syndrome (deletion 5p), are due to loss or gain of part of a chromosome and are detectable by alterations in the pattern of G-banding and are again visible by light microscopy. Techniques such as fluorescent in situ hybridization (FISH) and multiplex ligation–dependent probe amplification (MLPA) can identify submicroscopic chromosome deletions and even deletions of single genes located on specific chromosomes.7 Deletions causing Williams syndrome, Prader-Willi syndrome, and the 22q11 deletion syndrome, as well as subtelomeric copy number changes, fall into this category. A new method of analysis, array CGH, is now being used to investigate children with LD and dysmorphic features when conventional cytogenetic analysis results proved negative.8,9

The clinical assessment of children with LD typically includes clinical examination by a pediatrician, followed by appropriate investigations, which consist of biochemical and hematological tests as well as chromosomal tests. At present, a karyotype analysis is performed, followed by FISH or MLPA where indicated. Currently, array CGH is used almost entirely in a research setting as an add-on test to look for chromosomal abnormalities that are strongly suspected on clinical grounds and when results of existing cytogenetic tests have proved negative.

ARRAY-BASED COMPARATIVE GENOMIC HYBRIDIZATION

CGH is a method for identifying copy-number variations (amplifications or deletions) within the genome.8 The procedure relies on combining fluorescence with microarray technology to allow not only the identification and measurement of changes in DNA sequence copy number, but also the simultaneous mapping of these sites within the genomic sequence. Because a microarray can contain thousands of individual DNA probes (or reporter sequences) representing the complete genome (with partial or complete sequence information), hybridization at a specific spot provides a much more precise indication of the site of aberrations in the genomic sequence than a band on a chromosome could do, yet still within a single experiment. Array CGH has many potential advantages over other cytogenetic techniques because it can provide rapid genome-wide assessments at a high resolution (≤1 Mb), and the information provided can be linked directly to physical and genetic maps of the human genome. Array CGH can detect single-copy gains and losses in specific chromosomal areas, telomeres, and whole chromosomes and thus has the potential to completely replace currently available cytogenetic techniques for the detection of known genetic abnormalities and clinical syndromes.

The main drawback to using array CGH in LD is its potential for identifying novel copy-number variations that may not be responsible for the patient's LD.10,11 Even if a variant is present in an affected individual but absent from “normal” parental genomes, it does not necessarily follow that it is a pathogenic change; it may instead represent an innocuous copy-number polymorphism (a normal variation in the human genome). Probes for array CGH generally avoid the use of sequences that hybridize to multiple genomic locations and thus are shielded from the detection of large-scale copy-number variations to some extent. Constructing and interpreting CGH arrays is a skilled process, requiring communication between specialists to associate apparent abnormalities with specific clinical features. To facilitate this information sharing, a number of international databases have been established, such as DECIPHER (Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources; http://www.sanger.ac.uk/PostGenomics/decipher/), the Toronto-based Database of Genomic Variants (http://projects.tcag.ca/variation/), and ECARUCA, the European Cytogeneticists Association Register of Unbalanced Chromosome Aberrations (http://www.ecaruca.net/).

Clinicians determine the clinical relevance of identified genetic abnormalities to patients' phenotypes by consulting databases such as DECIPHER, by considering the putative functional location of detected abnormalities, and by establishing whether the abnormality has been inherited from the parents (who may or may not be phenotypically normal).

MATERIALS AND METHODS

Test definition

The concept of an assay can be distinguished from that of a test. An assay is any method for analyzing or determining the presence of a substance in a sample, and a test is the application of an assay in a clinical context for a specific purpose; thus, the same assay may be used in a number of different tests. Following on from this distinction, a genetic test is the use of an assay to detect specific genetic variants, in relation to a particular target disorder, in a defined population, for a specific purpose.12 This distinction is important because analytical validity is primarily concerned with evaluating the performance of the assay in the laboratory (accuracy and reliability), whereas clinical validity and clinical utility are concerned with evaluating the performance of the test in patients. For the purposes of this evaluation, we defined the assay as array CGH and the test as:

  • The application of array CGH (the assay)

  • For LD (the target disorder)

  • In patients with dysmorphic features and negative results from conventional cytogenetic analysis (the population)

  • To identify genetic subsets of the LD phenotype (the purpose)

Fundamentally, in the context of this evaluation, array CGH is not being used in the conventional sense to diagnose LD because this diagnosis has already been established; it is being used to classify the putative cause of the patient's LD.
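
The four-part test definition above can be written down as a small record, which makes the assay/test distinction concrete: two tests may share an assay yet differ in population or purpose. The following is only an illustrative sketch; the class name `GeneticTest`, its field names, and the second "first-line" test are hypothetical and are not part of the evaluation itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneticTest:
    """A test = an assay applied to a disorder, in a population, for a purpose."""
    assay: str
    target_disorder: str
    population: str
    purpose: str


# The test evaluated in this review, using the wording of the definition above.
array_cgh_add_on = GeneticTest(
    assay="array CGH",
    target_disorder="learning disability (LD)",
    population="patients with dysmorphic features and negative results "
               "from conventional cytogenetic analysis",
    purpose="identify genetic subsets of the LD phenotype",
)

# Hypothetical second test: same assay, different population, so it would
# need its own evaluation of clinical validity and utility.
array_cgh_first_line = GeneticTest(
    assay="array CGH",
    target_disorder="learning disability (LD)",
    population="general LD patient population (hypothetical example)",
    purpose="identify genetic subsets of the LD phenotype",
)

same_assay = array_cgh_add_on.assay == array_cgh_first_line.assay
same_test = array_cgh_add_on == array_cgh_first_line
print(f"same assay: {same_assay}, same test: {same_test}")
```

Under this framing, analytical validity attaches to the `assay` field alone, whereas clinical validity and clinical utility attach to the whole record.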

Systematic review inclusion criteria

Studies were included that used array CGH to identify genetic abnormalities in patients with LD and developmental delay or dysmorphism, in whom results of conventional cytogenetic analysis proved negative. Both case series and cohort studies were eligible for inclusion.

Search strategy and data extraction

PubMed, MEDLINE, EMBASE, and BIOSIS databases were searched during February 2006 using both free text and MeSH terms, appropriately modified for the specific database (Appendix). No language or other search restrictions were imposed. Reference lists of primary studies were scrutinized for additional references, and experts in the field were contacted in an attempt to identify other unpublished studies. Two reviewers (S.S.-I. and G.S.) independently extracted data using prepiloted proformas. Reviewers compared results and resolved any differences through discussion or by involving other members of the team (S.S. and C.S.-S.).

Assessment of study quality

No quantitative methods were used to rate study quality, but the following quality indicators were assessed1,13: (1) clear description of the setting and study population; (2) whether criteria used for patient selection were clearly described; (3) evidence of appropriate pretesting with karyotyping, FISH, and telomere tests; (4) whether control samples were included and, if so, described clearly; (5) description of the array CGH platform, software, and assay process; (6) description of steps to identify and exclude known copy-number polymorphisms using genome databases; (7) appropriate follow-up testing; (8) clear description of the process of interpretation of array CGH results.

Statistical analysis

Array CGH can identify hitherto unknown genetic abnormalities previously undetectable by other cytogenetic techniques, taking us into an arena where there is no gold-standard reference test available that can be applied to all patients. Conventional measures of test discrimination, such as the sensitivity and specificity, cannot be used to evaluate its performance. We therefore adopted a pragmatic alternative, which was to evaluate the extent to which array CGH met its clinical objective,14,15 which is to identify genotypic subsets of the LD and dysmorphism phenotype. We measured the effectiveness by quantifying

  • The diagnostic yield: proportion of causal variants detected in those tested

  • The false-positive yield: proportion of noncausal variants detected in those tested

The number needed to test was also determined as the number of tests performed to identify one causal variant (calculated as 1/diagnostic yield).
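
As a minimal sketch of how these three indices relate, the function below computes them from simple counts. The function name `yield_indices` and the counts in the usage example are hypothetical and are not data from the included studies.

```python
def yield_indices(n_tested, n_causal, n_noncausal):
    """Return diagnostic yield, false-positive yield and number needed to test.

    n_tested    -- number of patients tested
    n_causal    -- patients in whom a causal variant was detected
    n_noncausal -- patients in whom only a noncausal variant was detected
    """
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    if n_causal < 0 or n_noncausal < 0 or n_causal + n_noncausal > n_tested:
        raise ValueError("counts are inconsistent with n_tested")

    diagnostic_yield = n_causal / n_tested
    false_positive_yield = n_noncausal / n_tested
    # Number needed to test is the reciprocal of the diagnostic yield;
    # it is undefined (infinite) when no causal variant was found.
    if diagnostic_yield > 0:
        number_needed_to_test = 1 / diagnostic_yield
    else:
        number_needed_to_test = float("inf")

    return {
        "diagnostic_yield": diagnostic_yield,
        "false_positive_yield": false_positive_yield,
        "number_needed_to_test": number_needed_to_test,
    }


# Hypothetical study: 100 patients tested, 10 causal and 5 noncausal variants.
example = yield_indices(n_tested=100, n_causal=10, n_noncausal=5)
print(example)
```

Both yields share the same denominator (all patients tested), so they describe what a clinician can expect per test ordered rather than per variant found.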

Before meta-analysis, consistency of findings (often called heterogeneity) was tested using standard χ² methods and by using the I² statistic, which describes the proportion of total variation in estimates due to heterogeneity rather than random error.16,17 The meta-analysis was conducted using a random-effects model, which assumes that heterogeneity can be represented by a distribution of underlying effects, conventionally taken to be a normal distribution.
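
The heterogeneity statistics and random-effects pooling described above can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the analysis: it assumes the DerSimonian-Laird estimator of between-study variance and pools untransformed proportions with binomial variances, neither of which is specified in the text. The function name and the example counts are hypothetical.

```python
import math


def random_effects_pool(events, totals):
    """Pool study proportions with an assumed DerSimonian-Laird model.

    events, totals -- per-study counts; each study must have
                      0 < events < total so its variance is non-zero.
    """
    if len(events) != len(totals) or len(events) < 2:
        raise ValueError("need at least two studies with matching counts")
    if any(not (0 < e < n) for e, n in zip(events, totals)):
        raise ValueError("each study needs 0 < events < total")

    props = [e / n for e, n in zip(events, totals)]
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * pi for wi, pi in zip(w, props)) / sum(w)

    # Cochran's Q (the chi-square heterogeneity statistic) and I-squared.
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, props))
    df = len(props) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Between-study variance (tau squared), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau squared to each study's variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * pi for wi, pi in zip(w_re, props)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "q": q,
        "df": df,
        "i2_percent": i2,
        "tau2": tau2,
    }


# Hypothetical counts for three studies (not the studies in this review).
result = random_effects_pool(events=[5, 12, 8], totals=[50, 100, 60])
print(result)
```

When Q does not exceed its degrees of freedom, both I² and the between-study variance are truncated to zero and the random-effects estimate coincides with the fixed-effect one.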

RESULTS

Study characteristics

Seven primary studies, incorporating a total of 462 subjects, were identified that met the inclusion criteria (Table 1).7,18–22 Only one study was conducted in Asian patients.19 All the studies were relatively small, ranging from 20 to 140 patients. All studies included sampling of control DNA as part of their protocol. The five studies investigating 1-Mb resolution arrays all used the same array as the Shaw-Smith et al. study,7 one used an array with a resolution of 50 kb,20 and another used a specific set of 2173 clones, resulting in an average resolution of 1.4 Mb.19 Control samples varied from 2 to 40 normal people, whereas Menten and colleagues18 used samples from other patients in the cohort as controls. There was some variation in the clinical criteria for patient selection and testing, with some investigators using a clinical severity score.7,20,21

Table 1 Identified studies and their characteristics

Test performance

The combined diagnostic yield of causal genetic abnormalities in the seven studies was 13% (95% confidence interval [CI]: 10–17%; Table 2 and Fig. 1). There was no evidence of heterogeneity (χ² = 2.41, P = 0.878; I² = 0%). The number needed to test was eight (95% CI: 6–10). The proportion of noncausal variants detected by array CGH ranged from 5% to 67%. However, the range is distorted by a high false-positive yield in the study by Miyake et al.19; in the other studies it ranged from 5% to 10%. A meta-analysis of the five studies, excluding Miyake et al.,19 gives a combined false-positive yield of 7% (range: 5–10%).
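
The reported heterogeneity figures and number needed to test are mutually consistent, as the short check below suggests. It assumes 6 degrees of freedom for the χ² statistic (seven studies minus one) and rounds the number needed to test to the nearest whole number; both are assumptions, and the helper names are hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability, closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # P(X > x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def i_squared(q, df):
    """I-squared as a percentage, truncated at zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def nnt_with_interval(yield_point, yield_low, yield_high):
    """Invert a yield and its CI; the bounds swap because 1/x is decreasing."""
    return (round(1 / yield_point), round(1 / yield_high), round(1 / yield_low))


q_reported = 2.41
df_assumed = 7 - 1  # assumption: seven studies contribute to the statistic

p_value = chi2_upper_tail_even_df(q_reported, df_assumed)
i2 = i_squared(q_reported, df_assumed)
nnt = nnt_with_interval(0.13, 0.10, 0.17)

print(f"P = {p_value:.3f}, I2 = {i2:.0f}%, NNT = {nnt[0]} ({nnt[1]} to {nnt[2]})")
# prints: P = 0.878, I2 = 0%, NNT = 8 (6 to 10)
```

Because the χ² statistic is smaller than its degrees of freedom, I² is truncated to 0%, which is why the random-effects and fixed-effect estimates would agree here.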

Table 2 Genetic abnormalities identified by array comparative genomic hybridization in idiopathic learning disability
Fig. 1 Random effects meta-analysis of diagnostic yield from array-based comparative genomic hybridization in patients with learning disability. CI, confidence interval.

DISCUSSION

This meta-analysis of seven studies has found that array CGH is able to identify causal genetic abnormalities in patients with LD and dysmorphism, in whom conventional cytogenetic analysis results had proved negative. The variability in diagnostic yield between studies is consistent with random error alone, with no evidence of underlying study heterogeneity (I² = 0%). We believe that we have identified all currently available studies by using a comprehensive and sensitive search strategy. However, because of the low number of included studies, conventional graphical methods for assessing publication bias (such as funnel plots) were not used.23

Array CGH also identifies genetic abnormalities that are deemed to be noncausal; if the study by Miyake et al.19 is excluded (false-positive yield: 67%), the false-positive yield is an acceptable 5% to 10%. However, the reasons for the extreme results of Miyake et al.19 are unclear; they do not appear to be related to the array's resolution because Miyake et al. used the lowest-resolution array (1.4 Mb) and the study with the highest-resolution array (de Vries et al.20) had one of the lowest false-positive yields (5%). The spectrum of patients tested also appears to be similar to that of other studies.

One possible explanation is that there are important differences in the design, calibration, and use of their array and especially their choice of clones, although the similarity of the diagnostic yield compared with the other studies is striking. Another explanation is that ethnicity may be influencing the results, as this was the only study reporting data from patients in the Eastern hemisphere. It will be interesting to see whether future studies conducted in Asian patients report similar results.

The evaluation of emerging genetic tests, such as array CGH, is challenging because such tests have the potential to outperform currently available technologies, raising a number of important issues that may be applicable to other test evaluations. The first is the need for a very clear definition of the test being evaluated, and the conceptual distinction between assay and test is helpful here. There is also the question of how genetic disorders should be defined for evaluation purposes because they may be defined by reference to the phenotype, the genotype, or to a combination of the two. When evaluating a test, it is important that the target disorder is defined either by reference to the phenotype or, for a genotypic definition, by an alternative assay reference method, to prevent the problem of incorporation bias.12,24 Thus, in a known LD syndrome, such as DiGeorge syndrome, the traditional evaluative approach can be applied using either the phenotype or the genotype as the reference standard (assuming that the deletion could be detected by an alternative technology such as FISH). In this particular setting, the definition of the target disorder was very broad and largely encompassed patients with hitherto unknown genetic abnormalities. This meant that a phenotypic reference standard could not be defined.

Test evaluation was further hampered by the lack of an available genotypic reference standard for array CGH that could independently verify the “truth” in all subjects, especially in those testing negative. Although it may not always be possible to calculate the clinical sensitivity and specificity, the evaluation of test effectiveness is a pragmatic alternative. Indices such as the diagnostic yield, false-positive yield, and number needed to test coupled with data from control populations can provide useful indicators of test performance that are clinically meaningful (Table 3).

Table 3 Examining the prevalence of genetic abnormalities detected by array comparative genomic hybridization in LD patients with dysmorphic features compared with population controls

Implications for clinical practice

The results of this systematic review suggest that array CGH is a promising technology for investigating patients with LD in whom conventional cytogenetic analysis has proven negative. Before widespread introduction of this technology into clinical practice, a number of important technical questions need answering, such as the optimal array resolution, the choice of included clones, the most appropriate platforms, and the establishment of quality assurance mechanisms for use in a clinical setting. Although there is a need for more studies evaluating the test in highly selected patient groups, it is also important in the interests of the most equitable and efficient use of resources that the performance of array CGH be compared directly with existing cytogenetic tests as a first-line replacement test in the general LD patient population.25 The prevalence of normal copy-number variants may be much higher in the general patient population and the signal-to-noise ratio may be very different. More information is also needed about the clinical utility of array CGH testing. Its potential benefits include the considerable value that parents and caregivers place on a diagnosis, providing valuable information for explaining the LD diagnosis, and for improving clinical management, reproductive choice, access to genetic counseling, and reducing the “diagnostic odyssey” of multiple investigations that patients with LD often endure. Potential harms may include false reassurance due to false-negative results or a sense of fatalism, which may be fostered by a genetic test result indicating an abnormality. There are also important questions about cost-effectiveness, as array CGH is currently more expensive and time-consuming than existing technologies (although these costs are likely to decrease over time). Counseling patients and parents about the suitability of array CGH testing and results is likely to have a significant impact on clinical workload.
Thus, we recommend pragmatic clinical trials in a service setting to examine these issues. The challenge of patient and professional education, the need for careful interpretation of the results, and the impact of array CGH testing on clinical services must be quantified before array CGH can be recommended as a first-line test to replace karyotyping in the assessment of patients with LD.

Implications for the evaluation of genetic tests

First, the unique features of many emerging genetic tests create problems for those attempting to produce or use evaluation frameworks, and due consideration must be given to providing flexibility for individual evaluations.24 Second, for newly emerging technologies, identification of either a phenotypic or genotypic reference standard can be problematic. In these situations, conventional indices of test discrimination cannot be calculated, but it is worth exploring pragmatic indices based on clinical effectiveness. Third, the movement of technologies from a research to a clinical setting requires time to build experience of test use and an evidence base; we suggest that the routine and prospective collection of data in databases, such as DECIPHER, is a key factor for evaluating the performance of array CGH, and similar models should be used for other emerging novel tests.