MicroRNAs may be small, but these noncoding RNAs that regulate gene expression are creating a big stir. Finding differences in the expression of microRNAs between, say, healthy and diseased cells could potentially be used to diagnose diseases or to assess treatment effects. If researchers can understand how they work, microRNAs could provide tools for manipulating genes, not to mention help to untangle how genes are regulated.

At first glance, studying microRNAs seems more manageable than studying the menagerie of other types of RNA. Typical expression profiling experiments for protein-coding genes examine thousands of molecules; those for microRNAs examine hundreds. But researchers are still figuring out the most reliable ways to measure these important molecules.

The most common techniques for profiling microRNAs are deep sequencing, microarrays and quantitative real-time PCR (qPCR). All are supported by several commercial offerings (see Boxes 1, 2, 3). Though specific products and techniques vary, researchers generally agree on the relative strengths and weaknesses of the platforms. The best choice depends on the application, says Muneesh Tewari, who studies microRNAs at the Fred Hutchinson Cancer Research Center. “It's a balance of cost, precision, accuracy and sample quantity,” he says. “If the purpose is to screen a bunch of samples to find a few microRNAs that change and you can tolerate a false negative, then the microarray may be the best platform. If the purpose is to detect microRNAs where the sample amount is limiting, then qPCR has better sensitivity, and if you are trying to see different isoforms or very similar microRNAs, then sequencing is going to be the best approach.”

But not all researchers are aware of how the choice of product influences the data. “If you take the same sample and analyze microRNAs in different profiling technologies, the overlap can be surprisingly poor,” says Robert Blelloch, a stem cell biologist at the University of California, San Francisco. “It's a really murky world,” he says. “The community has to come together to come up with a strategy here.”

Uncertain profile

Scientists at the Cancer Research UK Cambridge Research Institute and the European Bioinformatics Institute in Cambridge recently assessed how well deep sequencing, microarrays from six manufacturers, and two forms of qPCR identified differences in microRNA amounts among three biological samples (ref. 1). One analysis examined transcripts that were upregulated in a breast progenitor cell line compared with normal breast tissue: 136 microRNAs were identified in total, but only 53 were found in common by five assays. (Results from qPCR and microarrays from two manufacturers were excluded because of their high rates of false calls.)
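The kind of overlap count reported above can be sketched in a few lines of Python. This is only an illustration: the assay names and microRNA identifiers below are hypothetical placeholders, not data from the study, and the study's own analysis pipeline is not described here.

```python
# Hypothetical calls: each assay maps to the set of microRNAs it flagged
# as upregulated. Names are placeholders, not real study data.
calls = {
    "sequencing": {"miR-A", "miR-B", "miR-C", "miR-D"},
    "array_1": {"miR-A", "miR-B", "miR-C"},
    "array_2": {"miR-A", "miR-B", "miR-E"},
}


def overlap_summary(calls_by_assay):
    """Return (microRNAs called by any assay, microRNAs called by every assay)."""
    sets = list(calls_by_assay.values())
    called_by_any = set().union(*sets)
    called_by_all = set.intersection(*sets)
    return called_by_any, called_by_all


any_call, all_call = overlap_summary(calls)
print(f"{len(all_call)} of {len(any_call)} microRNAs were called by every assay")
```

Comparing the size of the intersection with the size of the union gives a quick sense of how much the platforms agree before any effort goes into validation.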

Don Baldwin of the University of Pennsylvania Molecular Profiling Facility believes a synthetic reference library of microRNAs can help to develop clinical assays and to make profiling more accurate.

“I don't think any of the platforms demonstrate a significantly better view on the absolute truth,” says Don Baldwin of the University of Pennsylvania Molecular Profiling Facility, who cochairs a research group with the Association of Biomolecular Resource Facilities that recently compared four microarrays and two sequencing platforms commonly used to profile microRNAs (http://www.abrf.org/ResearchGroups/Microarray/Activities/R7_Baldwin.pdf; Table 1). Baldwin advises researchers to find the technologies that they or their core facilities can use with the least technical variance, then stick to that and pour their energies into planning their study. “The microRNA fraction is less complex,” he says, “but that doesn't mean you can get away with a poor experimental design or fewer replicates.”

Table 1 Platform comparison for microRNA profiling

The obvious solution is to verify results using different techniques, but that kind of cross-validation does not always happen, says Carlo Croce, director of human cancer genetics at Ohio State University. “Most people use one method and that, I think, is wrong.” He emphasizes that platforms are far from the only cause of variable results. Getting the right sample is crucial, and different samples and questions suit different techniques. Still, validating findings takes significant effort, and researchers could be more efficient if they understood patterns of false positives and false negatives across different platforms.

Proper prep

Before launching any kind of profiling study, researchers need to assess the quality of their RNA sample, says Kelli Bramlett, senior manager of research and development with Ambion, a division of Applied Biosystems. To test sequencing applications, she recommends spike-in controls generated from the plasmids of the External RNA Controls Consortium, an effort coordinated by the National Institute of Standards and Technology. Analysis on an RNA gel or Agilent Bioanalyzer can assess whether RNA is too degraded to be used for a particular experiment. Getting rid of unwanted RNA can also be useful. Illumina recently introduced an enzyme designed specifically to clear out ribosomal RNA, which can make up more than 99% of total RNA in a sample.

Some technologies require an amplification step, although University of Pennsylvania's Baldwin says that it is difficult to represent all microRNAs in a sample while preserving their relative abundance. The amplification step is unavoidable in current deep sequencing protocols, but microarrays often skip amplification in favor of directly labeling microRNA.

Anna Git of the Cambridge Research Institute believes differential labeling of microRNAs causes much experimental variation. The field has too few controls, she says. Credit: L'Oreal

This labeling step introduces much of the variability, says Anna Git of the Cambridge Research Institute, who co-led a comparison of profiling techniques. The detection components of various platforms often work well, she says. “The main problem is that the methods we use to label RNA are imperfect.” Each company's preparation treats some RNA molecules differently and so creates different artifacts. Worse, such biases tend to be more serious in degraded samples.

Many techniques add oligonucleotides to microRNA transcripts using RNA ligase, but that enzyme favors certain sequences over others. Enzymes that add other labels also have preferences. Deep sequencing is prey to similar biases; some microRNA sequences are preferentially ligated and amplified depending on the preparation technique (ref. 2).

Muneesh Tewari of the Fred Hutchinson Cancer Research Center thinks profiling technologies for microRNAs are more different from each other than are those for profiling mRNAs. Credit: © 2010 Susie Fitzhugh

Not everyone is convinced that these differences pose a big problem for expression profiling. It is true that each method has its own peculiarities that will affect quantification, and some microRNAs survive some preparation techniques better than others, says David Bartel of the Whitehead Institute, but that should not make a huge difference in the results. “Usually those biases are going to be the same in the different samples, so if you are looking at the same microRNA in different samples you can still see if it has changed,” he says.
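Bartel's argument can be made concrete with a toy calculation. The sketch below assumes a deliberately simple model, in which each platform multiplies the true abundance of a given microRNA by a fixed, sequence-specific bias; the function name, platform names and numbers are all hypothetical.

```python
def observed_signal(true_abundance, platform_bias):
    """Model a measurement as the true abundance scaled by a fixed bias."""
    return true_abundance * platform_bias


# Hypothetical values: one microRNA, two samples, two platforms whose
# biases for this microRNA differ eightfold.
true_healthy, true_diseased = 100.0, 400.0
bias_by_platform = {"platform_x": 0.25, "platform_y": 2.0}

fold_changes = {}
for name, bias in bias_by_platform.items():
    healthy = observed_signal(true_healthy, bias)
    diseased = observed_signal(true_diseased, bias)
    # The bias appears in both numerator and denominator, so it cancels.
    fold_changes[name] = diseased / healthy
    print(f"{name}: healthy={healthy}, diseased={diseased}, "
          f"fold change={fold_changes[name]}")
```

Under this model the absolute signals disagree between platforms, yet both report the same fourfold change. The cancellation holds only if the bias really is identical in the two samples, which may not be the case when, for example, one sample is more degraded than the other.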

Shorter is harder

Technologies for studying microRNAs have been adapted from techniques for studying DNA and RNA molecules that are hundreds or thousands of nucleotides long. MicroRNAs are much smaller, typically about 22 nucleotides long. The short length gives researchers fewer options for designing complementary sequences: the entire microRNA sequence is often used for a single probe on a microarray.

MicroRNAs also exist in families in which members frequently vary by as little as a single nucleotide, and so are hard to distinguish. One solution is to boost the specificity of a probe or primer for its target with high temperatures, so that only the best matches bind. Genome-wide, however, microRNAs vary greatly in their GC content, the percentage of their sequence that consists of guanines and cytosines. This means that the temperature at which microRNAs dissociate from complementary sequences varies greatly, perhaps by more than 20 °C, complicating efforts that depend on the separation and reannealing of complementary sequences.
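To see why GC content matters, consider a rough estimate of melting temperature for two 22-nucleotide sequences. The sketch below uses a widely quoted rule of thumb for short DNA oligonucleotides, Tm ≈ 64.9 + 41 × (G + C − 16.4) / N; applying it to RNA targets is an assumption made purely for illustration, since real duplex stability also depends on salt, probe chemistry and sequence context. Both sequences are invented examples, not real microRNAs.

```python
def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def approx_tm_celsius(seq):
    """Rule-of-thumb melting temperature for a short oligonucleotide.

    Uses Tm = 64.9 + 41 * (G + C - 16.4) / N, a coarse approximation
    originally intended for DNA oligos; treat the result as indicative only.
    """
    seq = seq.upper()
    gc_count = sum(base in "GC" for base in seq)
    return 64.9 + 41.0 * (gc_count - 16.4) / len(seq)


# Invented 22-nucleotide sequences (not real microRNAs).
low_gc = "UAUUGAAUCAUUAGAUUACAUA"
high_gc = "CGGCUGCCGAGGCCUGCGGCAC"

for label, seq in (("low GC", low_gc), ("high GC", high_gc)):
    print(f"{label}: GC={gc_fraction(seq):.0%}, "
          f"approx Tm={approx_tm_celsius(seq):.1f} °C")
```

Because both sequences have the same length, the estimated difference in melting temperature depends only on the difference in G and C counts, which is why a single hybridization temperature cannot be optimal for every microRNA on an array.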

Sequencing experiments have found variation even within microRNAs encoded by the same gene. Some single-nucleotide polymorphisms occur within microRNAs, and these variants are linked to differences in the expression of protein-coding genes. Another source of variation, post-transcriptional modification, can be identified through sequencing but complicates profiling by other techniques. “Not only does this bear significant implications on the function of the resulting microRNA,” says Git, “but it also introduces a mismatch between probes [which are designed against the genomic sequence] and the real edited target, resulting in an inaccurate readout of expression.”

To make matters worse, kits and analysis tools built on genome-scanning algorithms, or even on miRBase, the common repository for microRNA sequences, could very well be designed to find molecules that are not transcribed or functional. “Papers have been written about imaginary microRNAs,” says Bartel, who believes that more than a quarter of mouse microRNAs deposited in miRBase may not really be microRNAs.

Bartel and colleagues sequenced 60 million small RNA molecules from a wide variety of mouse tissues and found that some 150 miRBase microRNAs were either not represented or likely to be artifacts (ref. 3). (The study also found 108 genes not represented in miRBase that did seem to represent microRNAs and showed that representative transcripts were recognized by microRNA-processing enzymes.) The bright side, says Bartel, is that these false microRNAs could be used as negative controls that indicate what researchers should expect if a microRNA is not present.

A common reference

Git advises researchers to use many more controls than they think they need. For example, to validate microRNAs whose expression changes between different types of sample, researchers need to identify several other microRNAs that can be used for comparison within those same samples, including some that seem to change in opposite ways and some whose expression seems constant.

The Association of Biomolecular Resource Facilities (ABRF) hopes to help researchers discover precisely which microRNAs are favored by different techniques, says Baldwin. By the end of this year, ABRF plans to release a pool of synthetic RNA molecules that can be analyzed across microarray, qPCR and deep sequencing platforms. As the concentrations and identities in the pool will be known, says Baldwin, differences caused by the techniques themselves can be revealed.
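The logic of a known synthetic pool can be sketched as follows. This is a minimal illustration, assuming a hypothetical equimolar pool of four synthetic RNAs and invented read counts; it is not ABRF's analysis method, and the function name `recovery_bias` is made up for this example.

```python
def recovery_bias(expected_fraction, observed_counts):
    """Ratio of observed share to expected share for each RNA in a known pool.

    A value of 1.0 means the platform recovered the RNA in proportion to its
    known input; values above or below 1.0 suggest over- or under-representation.
    """
    total = sum(observed_counts.values())
    return {
        name: (observed_counts[name] / total) / expected_fraction[name]
        for name in expected_fraction
    }


# Hypothetical equimolar pool and invented read counts from one platform.
expected = {"syn-1": 0.25, "syn-2": 0.25, "syn-3": 0.25, "syn-4": 0.25}
observed = {"syn-1": 500, "syn-2": 250, "syn-3": 125, "syn-4": 125}

bias = recovery_bias(expected, observed)
for name, ratio in bias.items():
    print(f"{name}: recovered at {ratio:.2f}x its expected share")
```

Running the same pool on several platforms and comparing the resulting ratios would show which sequences each technique favors, independent of any biological variation.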

Other researchers prefer to use biologically derived references, combining cell lines and tissue samples into a large pool that can be assessed alongside the samples of interest. Agilent, Ambion and other companies produce such universal reference pools for microRNA studies. Tewari, who used a small synthetic reference collection for assessing sequencing and qPCR, says that the technique works well, as long as scientists keep in mind how it might perform differently from a biological sample. A synthetic sample that contains a greater variety of microRNAs than typically seen in a biological sample may introduce high levels of cross-hybridization or competition. It might also be tough to mimic the dynamic range in biological samples, particularly if some assays saturate. Most importantly, says Tewari, biological samples probably contain less than 0.01% microRNA. The rest is ribosomal RNA, tRNA and mRNA, which together form a background matrix that could compete with preparative enzymes and otherwise affect results.

But Baldwin thinks simplicity makes assessment possible. “All our tests so far have used messy biological samples, so we don't really know the absolute truth. That's why a synthetic reference sample may be useful.” Such samples could also be spiked with cell extracts to mimic biological samples, he says.

No matter what the ultimate approach, researchers agree that no reference sample can substitute for using high-quality controls and constantly questioning the tools and protocols used for experiments. “Treat your assays and your kits with suspicion,” says Git, “and we're going to end up with better science.”

Table 2 Suppliers guide: Companies offering microRNA profiling technology