Over the past decade, DNA microarrays have emerged as an important tool for monitoring the expression of genes in a highly parallel fashion, made possible by a combination of miniaturized technologies and vast and ever-increasing amounts of sequence information. Key trends in recent years have included the shift from cDNA- to oligonucleotide-based microarrays and from 'home-brew' to commercial platforms as the technology has become more affordable and as platforms have become more automated, offering improvements in sensitivity, specificity and reproducibility. Although microarrays are still predominantly used for gene expression analysis, they are also finding application in genotyping and resequencing, as well as in comparative genomic hybridization studies1.

Affymetrix of Santa Clara, California dominated the market for many years, applying photolithographic technologies culled from the semiconductor industry to the production of high-density microarrays. Its GeneChip system is still the most widely used commercial platform. But the dynamics of the market are changing. New players have entered in recent years (see Box 1, 'Microarray Marketplace'), and some have flexible technologies that can be used to manufacture custom high-density arrays in less time and at lower cost than for other customized array platforms. This allows researchers to quickly apply experimental results obtained with one array design to subsequent designs, and to have iterative rounds of probes synthesized quickly, with nominal up-front fees, for refining and optimizing the probe sets.

Illumina's 96-sample Sentrix Array Matrix (top) and the multi-sample Sentrix BeadChips. (Courtesy of Illumina.)

Xeotron of Houston, Texas and NimbleGen Systems of Madison, Wisconsin are two such companies. Both have developed technologies—digital projection in the case of Xeotron and a series of tiny aluminum mirrors in the case of NimbleGen—to shine light in specific patterns and so direct on-chip probe synthesis. This eliminates the need for the time-consuming and expensive manufacture of photomasks used to create patterns of light in conventional photolithographic processes. Interestingly, Xeotron has now been acquired by Invitrogen and NimbleGen has partnered with Affymetrix (Box 1).

And print!

Like Affymetrix, NimbleGen and Xeotron, Agilent Technologies of Palo Alto, California also uses an in situ synthesis process—although the underlying technology is very different. Agilent's SurePrint technology involves a noncontact inkjet process that prints oligonucleotide probes of 60-mer length (one probe per gene) base by base onto specially prepared glass slides. The company says its inkjet format is flexible, enabling rapid design changes to be made.

“When we introduced the products a few years ago, we were very conscious of the fact that there were a lot of people using the 1 × 3 [inch] platform,” says Mel Kronick, chief scientist (gene expression platforms) of Agilent's Integrated Biology Solutions business. Because many researchers at the time were spotting their own arrays, it became “a de facto standard,” he says. To ease the transition from home-brew microarrays to commercial microarrays, Kronick says the company developed an 'open' platform whereby researchers can either buy what they need and still use their existing laboratory setup, or purchase the entire system.

Agilent's 1 × 3–inch glass slides can be read on most commercial microarray scanners, although the company does offer an automated 48-slide microarray scanner that can read any combination of glass slides (both those from Agilent and others). The scanner is optimized for use with cyanine 3 (Cy3) and cyanine 5 (Cy5) and provides simultaneous two-color scanning at 10-micron resolution, taking about 8 minutes per slide.

One problem that can compromise the performance of any commercial array used for two-color competitive hybridization, in which targets are fluorescently labeled with Cy5 and Cy3 dyes, is degradation of the Cy5 signal. It results from exposure of the processed slides to atmospheric ozone. However, the effects of ozone-mediated degradation of Cy5 can now be mitigated through use of Agilent's proprietary stabilization and drying solution, which contains an ozone scavenger dissolved in acetonitrile.
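To see why loss of Cy5 signal matters in a two-color experiment, consider the log ratio that is commonly used to compare the two channels. The sketch below is illustrative only: the function names, the intensities and the 50% loss figure are assumptions chosen for the example, not values reported by Agilent or any other supplier.

```python
import math


def log2_ratio(cy5: float, cy3: float) -> float:
    """Log2 ratio of Cy5 to Cy3 intensity for a single feature."""
    return math.log2(cy5 / cy3)


def apparent_log2_ratio(cy5: float, cy3: float, cy5_loss: float) -> float:
    """Log2 ratio observed when a fraction `cy5_loss` of the Cy5 signal is lost.

    `cy5_loss` is a hypothetical degradation fraction in the range [0, 1).
    """
    if not 0.0 <= cy5_loss < 1.0:
        raise ValueError("cy5_loss must be in the range [0, 1)")
    return log2_ratio(cy5 * (1.0 - cy5_loss), cy3)


# A feature with equal signal in both channels (no real expression change).
true_ratio = log2_ratio(1000.0, 1000.0)
# The same feature after a hypothetical 50% loss of Cy5 signal.
observed_ratio = apparent_log2_ratio(1000.0, 1000.0, 0.5)

print(f"true log2 ratio:     {true_ratio:+.2f}")
print(f"observed log2 ratio: {observed_ratio:+.2f}")
```

Under these assumptions, a gene that is unchanged between the two samples would appear to be twofold lower in the Cy5 channel, which is the kind of artifact an ozone scavenger is meant to prevent.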

Agilent offers both off-the-shelf and custom-printed microarrays, and launched three whole-genome microarrays this year (see Box 2, 'Microarrays: How Low Can They Go?'). In February, the company began selling a human whole-genome microarray representing over 41,000 human genes and transcripts. This was followed in August by a whole-genome oligonucleotide microarray of Arabidopsis thaliana, the primary model for gene expression research in higher plants, and a mouse whole-genome microarray representing over 41,000 mouse genes and transcripts.

Amersham Biosciences (now part of GE Healthcare) also offers a gene expression platform that requires a minimal up-front investment, using technology that the company purchased from Motorola in 2002. The CodeLink system consists of a range of prearrayed 30-mer oligonucleotide bioarrays (with one validated probe per gene), various processing tools, and software, and is compatible with a variety of commercial scanners. In April, the company began selling CodeLink Human Whole Genome Bioarrays, which target approximately 57,000 transcripts and expressed sequence tags (ESTs), including some 45,000 well-characterized human gene and transcript targets. The Rat Whole Genome Bioarray began shipping in August, and the release of the Mouse Whole Genome array, representing approximately 40,000 transcripts, is imminent. “We deliberately made the system open-ended,” says Chockalingam Palaniappan, head of R&D for the molecular diagnostics division of GE Healthcare.

Researchers at the University of Rochester Medical Center, Rochester, New York are using the CodeLink system to shed light on the pathogenesis of Parkinson disease. In a study published in the August issue of the Journal of Neuroscience, Howard Federoff, director of the Center for Aging and Developmental Biology, and colleagues were able to identify small yet significant changes in gene expression within the substantia nigra of mice treated with the neurotoxicant 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP)2. The analysis uncovered dysregulation of genes in key areas related to neuronal function. “The platform seemed to be able to identify gene expression changes of a relatively small magnitude that can be subsequently validated by quantitative RT-PCR,” says Federoff.

Slowly but surely

Although Applied Biosystems (ABI) of Foster City, California only made its first foray into the expression microarray market last April, the company is hoping that researchers will think it has been worth the wait. In the years since the draft sequence of the human genome was published, ABI has been working with sister company Celera Genomics of Rockville, Maryland, and a number of other collaborators, to annotate the genome. “What we wanted to do is to remove any ambiguity and launch a product that was clearly gene-centric that had gene identities clearly itemized,” says Clark Mason, senior product line manager for gene expression arrays at ABI.

The Agilent Whole Human Genome Microarray represents over 41,000 human genes and transcripts. (Courtesy of Agilent Technologies.)

The company's Expression Array System consists of the 1700 Microarray Analyzer, three supporting labeling and detection kits, the arrays and accompanying software, which can export results in text file and MAGE-ML format for further analysis using third-party software packages such as Spotfire DecisionSite or GeneSpring.

Like Agilent, ABI uses 60-mer oligonucleotides. Mason says this provides a “good compromise between specificity and tightness of binding.” But whereas Agilent synthesizes its oligonucleotide probes directly on the chip, ABI synthesizes its oligos off line and then contact spots them onto a nylon substrate. The 3′ end of the oligonucleotide is covalently coupled to the nylon substrate through a carbon spacer, which lifts the oligonucleotide off the microarray to avoid problems with steric hindrance.

The use of chemiluminescence detection is one of the main features that distinguishes the ABI system from other commercial platforms. Whereas fluorescence detection relies on light to excite the bound label, chemiluminescence produces light when the label binds to a substrate. This approach provides a detection sensitivity at the femtomolar level, enabling the system to distinguish expression levels of less than one copy per cell using as little as 500 ng of total RNA starting material.

ABI teamed up with another sister company, Tropix, to develop a microarray substrate that would work with chemiluminescence. “What we wanted to do was to get closer to RT-PCR performance,” says Mason. “The art in getting it to work was to get a stable signal, and our signals are stable for over 1 hour on the microarray, even though we just image for 25 seconds.”

The gene expression system was also developed to integrate fully with the company's existing product portfolio, including the TaqMan Gene Expression Assays and Real-Time PCR Systems. Mason says this provides researchers with the means to transition from global expression profiling on microarrays and the identification of genes that are differentially expressed to the examination of a more focused set of candidate genes using TaqMan-based assays. In this way researchers can validate microarray results, obtain absolute quantification of transcripts and investigate alternative splicing.

The company's Human Genome Survey Microarray contains 27,868 human genes from public and Celera databases. Mason says that despite marketing claims to the contrary, some of the other products do not represent the whole genome. ABI has cross-mapped its product against competing products and “at least 6,000 of those [genes] can't be found in any other commercial array,” he says.

Each Expression Array System also comes with a flat file that contains the most up-to-date information on the human genome from public and proprietary databases, which the company plans to update every three months. An updated version of the whole human genome array will be launched soon, in which the number of genes will increase to over 29,000. ABI also began shipping a Mouse Genome Survey Microarray in May, which contains probes representing approximately 32,000 mouse genes, and a rat genome array is in the works.

Bucking convention

Despite the success of the standard 1 × 3–inch slide microarray format, some companies are developing new array formats that lend themselves to the parallel processing of multiple samples. The aim is to drive per-sample running costs down and so expand the reach of microarray technology, particularly into large-scale clinical applications.

One such company offering alternatives to conventional microarrays is Illumina of San Diego, California. The company's bead-based technology is available on two distinct substrates, the Sentrix Array Matrix (for up to 96 samples) and the Sentrix LD BeadChip (up to 8 samples). Each is fabricated into an 'array of arrays' format for processing multiple samples at a time, and both support SNP genotyping and gene expression profiling applications.

Each array on each substrate contains thousands of tiny etched wells, into which 3-micron beads self-assemble in a random fashion. The company uses 50-mer gene-specific probes that are concatenated with 'address' sequences and immobilized on the bead surface. After bead assembly, each array is 'decoded' to determine which bead type containing which sequence is in each well of the substrate—information that is supplied on CD-ROM to customers at the time of purchase. The decoding algorithm that underpins Illumina's bead array technology is described in a paper by Kevin Gunderson et al. that appeared in the May issue of Genome Research3.

“Because the assembly of the features is random, we incorporate a redundancy to ensure that every different bead type will be present in the array,” says Todd Dickinson, Illumina's market manager for gene expression. More than 1,500 unique probe sequences or bead types are represented in each array, with an average 30-fold redundancy for each bead type, says Dickinson.
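The effect of this redundancy can be sketched with a simple probability model. The code below assumes, purely for illustration, that every well is filled independently and that each bead type is equally likely; Illumina's actual assembly statistics are not described here. The figures of 1,500 bead types and 30-fold average redundancy come from the text, while the function names are hypothetical.

```python
import math


def prob_type_absent(n_types: int, n_beads: int) -> float:
    """Probability that one given bead type never appears among n_beads wells.

    Assumes each well independently receives one of n_types bead types with
    equal probability (an illustrative model, not a measured one).
    """
    # (1 - 1/n_types) ** n_beads, computed in log space for numerical stability.
    return math.exp(n_beads * math.log1p(-1.0 / n_types))


def expected_missing_types(n_types: int, n_beads: int) -> float:
    """Expected number of bead types absent from a randomly assembled array."""
    return n_types * prob_type_absent(n_types, n_beads)


N_TYPES = 1_500  # unique bead types per array (from the text)

for redundancy in (1, 5, 30):
    n_beads = N_TYPES * redundancy
    missing = expected_missing_types(N_TYPES, n_beads)
    print(f"{redundancy:>2}-fold redundancy: "
          f"expected missing bead types = {missing:.3g}")
```

Under this model, an array with no redundancy would be expected to lack several hundred bead types, whereas at 30-fold average redundancy the expected number of missing types is vanishingly small.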

The higher-throughput Array Matrix has a 6-micron bead-to-bead spacing and so requires the 1-micron scanning resolution of the company's BeadArray reader. But researchers need not buy a BeadArray reader in order to evaluate the technology. The lower-density Sentrix LD BeadChips have 20-micron feature spacing and so can also be scanned on GenePix 4000B and 4200A scanners from Axon Instruments.

As a complement to its focused arrays that offer standard (catalogue) or customized content, Illumina will soon make available two new high-density Sentrix BeadChips for whole-genome analysis that will enable the processing of multiple samples per experiment. The Human-6 BeadChip will contain over 10 million features and will enable researchers to interrogate 48,000 transcripts per sample, six samples at a time. It will be geared towards researchers interested in profiling all known genes and many of their alternative splice forms. The second, RefSeq-8, will contain the same approximately 24,000 transcripts from the National Center for Biotechnology Information (NCBI) RefSeq database as are on the Human-6 BeadChip and will allow eight samples to be analyzed at a time. Both products will have the same 6-micron feature-to-feature density as the Array Matrix for use in conjunction with the company's BeadArray reader.
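As a rough check on these figures, the number of features available per transcript can be estimated from the numbers quoted for the Human-6 BeadChip. This is a back-of-the-envelope sketch: it treats "over 10 million" as exactly 10 million, assumes every feature carries a transcript probe (ignoring any control features) and assumes one probe per transcript. The function name is hypothetical.

```python
def features_per_transcript(total_features: int,
                            transcripts_per_sample: int,
                            samples: int) -> float:
    """Average number of features available for each transcript in each sample."""
    return total_features / (transcripts_per_sample * samples)


# Figures quoted for the Human-6 BeadChip; 10 million is a lower bound.
human6 = features_per_transcript(
    total_features=10_000_000,
    transcripts_per_sample=48_000,
    samples=6,
)
print(f"Human-6: about {human6:.0f} features per transcript per sample")
```

Under these assumptions the estimate comes out at roughly 35 features per transcript per sample, which is of the same order as the 30-fold average redundancy quoted for the focused arrays.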

An important difference between Illumina's focused set and the whole-genome products coming out later this year is the number of probes per gene. By adding an empirical screening step, where only the best of three bioinformatically designed probes is incorporated into the array, “we were able to reduce the number of probes per gene from two to one,” says Dickinson.

In a paper published last month in the American Journal of Pathology4, Marina Bibikova and colleagues at Illumina, along with researchers from Veridex of Warren, New Jersey (a Johnson & Johnson company), adapted the BeadArray technology for the high-throughput expression profiling of archived formalin-fixed, paraffin-embedded clinical material. Unlike fresh or frozen tissue, tissue treated in this manner is not amenable to conventional microarray-based analysis because RNA extracted from the tissue is often significantly degraded, leaving only small amounts accessible to cDNA synthesis. Consequently, gene expression analysis has typically involved the use of immunohistochemical staining and quantitative RT-PCR, but these methods allow only a few genes to be analyzed at a time. The cDNA-mediated annealing, selection, extension and ligation (DASL) assay described in a recent report5 allowed the parallel analysis of hundreds of genes at a time using as little as 50 ng of total RNA isolated from formalin-fixed tissues that had been stored for up to 10 years. It should now pave the way for both prospective and retrospective analyses to be performed on this type of archived clinical material.

CodeLink Human Whole Genome Bioarrays show sensitivity of detection down to a 1:2,000,000 mass ratio, as determined by spiking experiments at the cRNA level. The dynamic range shown spans spike concentrations of 0.05–50 pM. (Courtesy of GE Healthcare.)

Home-brew versus off-the-shelf

So does all this mean the end for home-brew microarrays? Perhaps not. “We're still printing arrays for the National Cancer Institute (NCI),” says Ernest Kawasaki, head of the microarray facility at NCI's Advanced Technology Center in Bethesda, Maryland. “It's a technology you have to be good at,” he says. “It's not idiot-proof yet.” When he arrived three years ago, the facility was printing cDNAs. “That's too much work and it's hard to QC,” says Kawasaki, and so he looked around for companies selling long oligo libraries. Kawasaki's preference is to print long oligos, because many of the researchers at NCI are working with complex mammalian systems where there is always a chance of cross-hybridization because there are so many bases. “With a 60-mer, there's practically no chance that there will be a sequence that can hybridize by chance,” he says.
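Kawasaki's point about probe length can be illustrated with a simple estimate of how often a probe of a given length would find an exact match purely by chance. The sketch below assumes a random genome of about 3 billion bases with all four bases equally likely and counts both strands; real genomes are not random, and cross-hybridization does not require a perfect match, so the numbers indicate the trend only. The function name and the shorter probe lengths are chosen for comparison and are not from the text.

```python
def expected_chance_matches(probe_length: int,
                            genome_length: int = 3_000_000_000) -> float:
    """Expected number of exact chance matches to one probe on both strands.

    Assumes a random sequence in which each base is equally likely, which is
    a deliberate simplification.
    """
    positions = genome_length - probe_length + 1
    return 2 * positions / 4 ** probe_length


for length in (16, 25, 60):
    print(f"{length}-mer: {expected_chance_matches(length):.3g} "
          "expected chance matches")
```

In this model a 16-mer is expected to match somewhere by chance roughly once, whereas the expected number of chance matches for a 60-mer is many orders of magnitude below one.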

Gary Hardiman, assistant professor in the department of medicine at the University of California, San Diego, runs the BioMedical Genomics Microarray Facility at the university. When he arrived five years ago there was already an Affymetrix-based core on campus and so his unit was set up as an alternative to Affymetrix, making low-cost, custom spotted arrays in house. But in order to generate good data for researchers using homemade microarrays, units like Hardiman's are often faced with running at a financial loss.

As more systems became commercially available, providing a level of quality that was hard to match, Hardiman says he decided to switch. “Now obviously they're not going to cover everything that researchers at universities are going to want to print,” says Hardiman. For example, “we have researchers working on Dictyostelium and there's no commercial arrays for that,” he says. In addition to specialized, or boutique, arrays, his unit also fabricates focused sets of arrays.

Core facilities themselves can run the gamut: from a scanner and a computer at one end of the scale to a unit fully equipped with one or more commercial platforms and all the necessary robotics and PCR machines for printing arrays in house at the other. “You can easily go through $500,000 to set up a lab like we have,” says Hardiman.

Beyond expression analysis

Looking forward, there will be a continued desire for more automation and for more user-friendly data analysis tools to make sense of the mountain of data6, as well as tools to better evaluate the performance of gene expression assays within and across platforms. There will also be a continued push to increase the amount of information that can be gleaned per array, and one way to do that is to further reduce the feature size. This is a technical challenge, and ways will need to be found to carry out hybridizations involving just a few molecules that are quantitative and kinetically fast enough, and to read features perhaps at the one-micron level.

But as the technology has matured and with the abundance of available sequence data from human and model organisms, researchers are now able to use microarrays to ask increasingly complex questions. With the completed human genome sequence now in hand, the next step is to identify all the functional elements in that sequence—not just the protein-coding genes but also non-protein-coding genes and transcriptional regulatory elements. The ENCODE (ENCyclopedia Of DNA Elements) project, a public-private consortium funded by the National Human Genome Research Institute, was set up last year to do just that and aims to create a 'parts list' for a representative one percent of the human genome specified by the ENCODE project7. The pilot phase is focused on the development of high-throughput approaches for detecting functional sequences, which will be crucial if the project's aims are to be met.

Affymetrix is part of the effort and in October announced that its GeneChip ENCODE01 Array is now available to qualified laboratories through its Early Access Program. The array contains millions of DNA probes evenly spaced, or 'tiled,' across 35 million base pairs of DNA and can be used for the genome-wide analyses of transcription, transcription factor binding sites, sites of chromatin modification, sites of DNA methylation and chromosomal origins of replication.
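The idea of tiling can be sketched as follows: probes are placed at regular intervals along a region regardless of where annotated genes lie. The probe length and step used below are hypothetical values for illustration, not the specifications of the ENCODE01 array, and the function name is likewise an assumption.

```python
def tile_starts(region_length: int, probe_length: int, step: int) -> range:
    """Start coordinates (0-based) of probes tiled every `step` bases.

    Only probes that fit entirely within the region are included.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    if region_length < probe_length:
        return range(0)
    return range(0, region_length - probe_length + 1, step)


# Small example: a 100-base region tiled end to end with 25-base probes.
print(list(tile_starts(region_length=100, probe_length=25, step=25)))

# Hypothetical larger example: one strand of a 35-million-base region.
starts = tile_starts(region_length=35_000_000, probe_length=25, step=35)
print(f"{len(starts):,} probe positions on one strand")
```

A smaller step gives denser coverage at the cost of more probes, and covering both strands or adding control probes would increase the count further.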

Thomas Gingeras, vice-president of biological sciences at Affymetrix, believes this is where the microarray field is heading. He says the company has focused considerable effort over the past few years on fabricating array designs “to look across the genome in a sort of unbiased fashion without building probes specifically to where the annotation suggests the functional regions are located.”

Researchers can also purchase tiling arrays for human chromosomes 21 and 22 through Affymetrix's Early Access Program. Tiling arrays for the entire human genome and several model organisms, including Drosophila melanogaster, A. thaliana, Saccharomyces cerevisiae and Schizosaccharomyces pombe, should be commercially available in the second half of 2005.

On the clinical side, microarray-based products are now making their way into the diagnostic and healthcare fields. In September, Affymetrix announced that it had received European Union clearance to market the firm's GeneChip platform for in vitro diagnostic use. This was the first marketing approval for Affymetrix and Roche under their collaboration to develop and sell molecular diagnostic products based on the Affymetrix GeneChip system. The GeneChip System 3000Dx will enable clinical laboratories in Europe to analyze microarray diagnostics, such as the Roche AmpliChip CYP450 Test. The CYP450 Test can be used to identify certain naturally occurring variations in the drug-metabolism genes CYP2D6 and CYP2C19 that affect the rate at which a person metabolizes many commonly used drugs. The hope is that this kind of information may ultimately be used by physicians to stratify disease, predict therapeutic outcomes and make more informed treatment choices, perhaps heralding an era of personalized medicine.

Table 1. Suppliers guide: companies offering DNA microarrays, platforms and software