We are developing a Rat Gene Index (RGI) to serve as a non-redundant resource for rat genes, containing information on expression patterns, gene identity and function, as well as links to mapping information and orthologous genes in other species. The non-redundant transcript database was generated by assembling cleaned ESTs and non-redundant expressed transcripts (ETs) into tentative rat consensus sequences (TCs). The TCs were searched against a non-redundant amino acid database and assigned a putative identity where possible. ESTs that failed to assemble into TCs (singletons) were searched individually. Of the more than 100,000 rat ESTs currently in GenBank, The Institute for Genomic Research has contributed over 50,000. The current RGI release, version 2.0, contains almost 90,000 ESTs, which yield almost 14,000 TCs and over 26,000 singletons. We have assigned role categories to the identified genes to estimate how many genes are represented in each of a variety of cellular and organismal functions. We have identified several hundred TCs that show no significant similarity to any known gene in several public databases. These TCs are candidates for novel, previously unidentified genes. We have used the Index to identify tissue-specific TCs and have confirmed the tissue-specific expression of several hypothetical and unknown proteins by Northern analysis. Finally, we have compared gene expression patterns among rat fibroblasts containing various Src-oncogene derivatives using a glass array containing cDNAs from over 4,000 clones selected from the Rat Gene Index.
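
The bookkeeping behind the TC and singleton counts, and behind a first-pass screen for tissue-specific TCs, can be illustrated with a minimal sketch. It is not the pipeline used to build the Index: the cluster layout (an identifier mapped to a list of `(est_id, tissue)` pairs), the function names, the `min_ests` threshold, and the toy data are all hypothetical, and the criterion "every member EST comes from one tissue" is an assumed stand-in for whatever rule was actually applied.

```python
from collections import Counter

def split_clusters(clusters):
    """Separate multi-sequence assemblies (TCs) from unassembled ESTs (singletons)."""
    tcs = {cid: members for cid, members in clusters.items() if len(members) > 1}
    singletons = {cid: members for cid, members in clusters.items() if len(members) == 1}
    return tcs, singletons

def tissue_specific(tcs, min_ests=3):
    """Return {tc_id: tissue} for TCs whose ESTs all come from a single tissue.

    min_ests is a hypothetical cutoff that avoids calling a TC tissue-specific
    on the evidence of only one or two ESTs.
    """
    specific = {}
    for cid, members in tcs.items():
        tissues = Counter(tissue for _est_id, tissue in members)
        if len(tissues) == 1 and len(members) >= min_ests:
            specific[cid] = next(iter(tissues))
    return specific

# Toy input (invented identifiers and tissues, for illustration only).
clusters = {
    "TC1": [("est1", "liver"), ("est2", "liver"), ("est3", "liver")],
    "TC2": [("est4", "brain"), ("est5", "kidney")],
    "S1": [("est6", "heart")],
}

tcs, singletons = split_clusters(clusters)
specific = tissue_specific(tcs)
print(len(tcs), "TCs;", len(singletons), "singletons; tissue-specific:", specific)
```

A candidate flagged in this way would still require experimental confirmation, such as the Northern analysis described above, because EST sampling depth differs between libraries.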