The finding that RNA can regulate gene expression is among the most exciting discoveries of recent years. Alongside RNA interference (RNAi), post-transcriptional gene silencing (PTGS) in plants and quelling in Neurospora, there are the small temporal RNAs (stRNAs) that regulate developmental timing. Discovered in Caenorhabditis elegans, the stRNAs lin-4 and let-7 are the founding members of a group of small RNAs now referred to as microRNAs (miRNAs). Typical miRNAs are about 22 nt long and are cleaved from larger (about 70 nt) precursors that form a characteristic stem-loop structure. Families of miRNA genes are present in both plant and animal genomes. Now, Bartel and colleagues take a computational genomics approach to identify miRNAs that are conserved across vertebrates. Their computational procedure (MiRscan) predicts that vertebrate genomes contain 200–255 miRNA genes, representing nearly 1% of the predicted genes in the human genome.

MiRscan, the details of which are being published elsewhere, identifies evolutionarily conserved stem-loop precursors. Within each potential precursor, it scans 21 nt at a time to find the closest match to the original worm miRNAs. The authors compared the human, mouse and Fugu rubripes genomes and identified 15,000 stem-loop segments in the human genome. All of these fell outside protein-coding regions and were at least partially conserved in mouse and Fugu. MiRscan narrowed this number down to 188 candidates. However, the sensitivity of the scoring indicated that these 188 might represent only 74% of all miRNA genes, which sets the maximum number at 255.
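The two quantitative steps in this paragraph, the 21-nt window scan and the upper-bound estimate, can be illustrated with a minimal Python sketch. It is not MiRscan itself, whose scoring is not described here. The function names `best_window` and `upper_bound` are hypothetical, and the similarity measure (a plain count of matching positions against reference sequences) is an assumption standing in for the real scoring. The upper bound assumes the figure comes from dividing the number of candidates by the sensitivity and rounding up.

```python
import math

WINDOW = 21  # window length in nt, as stated in the text


def best_window(precursor, references, window=WINDOW):
    """Return (start, segment, score) for the best-scoring window, or None.

    Assumption: similarity is the number of identical positions between
    the window and a reference sequence. This is a toy stand-in for
    MiRscan's actual scoring, which the text does not describe.
    """
    best = None
    for start in range(len(precursor) - window + 1):
        segment = precursor[start:start + window]
        # Score against every reference and keep the highest score.
        score = max(
            (sum(a == b for a, b in zip(segment, ref)) for ref in references),
            default=0,
        )
        if best is None or score > best[2]:
            best = (start, segment, score)
    return best


def upper_bound(candidates, sensitivity):
    """Estimate the total gene count if `candidates` is only a fraction
    (`sensitivity`) of all true genes, rounded up to a whole gene."""
    return math.ceil(candidates / sensitivity)


if __name__ == "__main__":
    # Toy sequences for illustration only, not real miRNA sequences.
    toy_reference = "A" * 21
    toy_precursor = "C" * 10 + "A" * 21 + "G" * 10
    print(best_window(toy_precursor, [toy_reference]))
    print(upper_bound(188, 0.74))
```

With the figures quoted above, 188 divided by 0.74 is just over 254, which rounds up to the reported maximum of 255.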

Given that some miRNA loci were already known, and that MiRscan identified 107 new candidates, Lim et al. point out that no more than 40 new miRNA loci remain to be discovered in the human genome. This estimate depends on the accuracy of the MiRscan prediction, so the authors set out to verify their candidates. Some of the candidates turned out to be closely related to previously cloned miRNAs, whereas others could be detected in a zebrafish cDNA library that had been constructed specifically to contain miRNAs and siRNAs. Nonetheless, Lim et al. were left with 55 candidates that could not be verified, either because they were false positives or simply because their levels of expression were too low. So, the authors calculated the minimum specificity value and, taking into account the sensitivity of the zebrafish experiment and the incompleteness of the genome, proposed 200 as the lower limit for the total number of human miRNA genes.

Although MiRscan was 'trained' on worm miRNAs, it was able to identify most of the vertebrate counterparts. This indicates that, although most miRNA sequences have not been conserved, some of the generic features of miRNAs and their precursors have been. The authors also draw a parallel between protein-coding and miRNA gene families: miRNA genes represent nearly 1% of the predicted human genes, a proportion similar to that of other families of regulatory genes. Because miRNA genes are absent from yeast, Lim et al. speculate that they might have evolved to regulate cell differentiation and developmental patterning. This is certainly true for some of the miRNAs that are already known; undoubtedly, functions will soon be assigned to the new miRNAs identified in this study.