Bone marrow cells that express the CD34 antigen (CD34+ cells) represent a continuum of early and committed progenitor cells, including the most primitive stem cells. We used gene filter arrays to develop a complementary DNA (cDNA) database representative of the CD34+ transcriptosome, comparing expression between human and baboon CD34+ cells to identify the genes that are most conserved and highly abundant in both species. RNA-based probes were sequentially hybridized to GeneFilters (Research Genetics) containing 25,920 human UniGene cDNAs. The human cells expressed 15,970 (62%) of the cDNAs on the filters, of which 97% varied less than threefold in expression level between species and 60% were expressed sequence tags (ESTs). Reverse transcription polymerase chain reaction (RT-PCR) analysis confirmed that selected genes were expressed at the predicted levels. To demonstrate the utility of this stem cell database for gene discovery, we mined it for transcriptional regulators by searching UniGene cluster descriptors for text strings corresponding to relevant protein motifs, including transcription factor, zinc-finger, POU, homeobox, helix-loop-helix and leucine zipper. This method selected 230 cDNAs, of which 186 were known genes, including 71 transcription factors, and the remaining 44 were ESTs. This preliminary description of the transcriptional regulators expressed in CD34+ cells offers a starting point for investigating the molecular basis of normal hematopoietic cell proliferation. The database should also prove useful for preparing a rationally designed cDNA array for expression analysis of normal hematopoiesis, stem cells and leukemias.
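The descriptor search described above can be illustrated with a minimal sketch. It assumes the database is available as (cluster ID, descriptor) pairs and that matching is case-insensitive on whole words; neither detail is stated in the abstract. The function names, the cluster IDs and the descriptors are hypothetical and serve only to show the idea, not the procedure actually used.

```python
import re

# Motif strings listed in the abstract.
MOTIFS = ["transcription factor", "zinc-finger", "POU", "homeobox",
          "helix-loop-helix", "leucine zipper"]


def _normalize(text):
    """Lowercase and collapse hyphens/whitespace so 'Zinc-Finger' == 'zinc finger'."""
    return re.sub(r"[-\s]+", " ", text.lower()).strip()


def build_pattern(motifs):
    """Compile one whole-word pattern covering every motif string.

    Word boundaries are an assumption made here: without them a short
    string such as 'POU' would also match inside 'compound'.
    """
    alternatives = "|".join(re.escape(_normalize(m)) for m in motifs)
    return re.compile(r"\b(?:" + alternatives + r")\b")


def mine_descriptors(clusters, motifs=MOTIFS):
    """Return the (cluster_id, descriptor) pairs whose descriptor names a motif."""
    pattern = build_pattern(motifs)
    return [(cluster_id, descriptor)
            for cluster_id, descriptor in clusters
            if pattern.search(_normalize(descriptor))]


# Hypothetical records for illustration only; not taken from the database.
example_clusters = [
    ("Hs.EX1", "zinc finger protein, example"),
    ("Hs.EX2", "POU domain transcription factor, example"),
    ("Hs.EX3", "ESTs"),
    ("Hs.EX4", "compound-binding protein, example"),
    ("Hs.EX5", "basic Helix-Loop-Helix protein, example"),
]

selected = mine_descriptors(example_clusters)
for cluster_id, descriptor in selected:
    print(cluster_id, descriptor)
```

In this sketch the first, second and fifth records are selected, while the uncharacterized EST and the "compound-binding" descriptor are skipped. A purely text-based search of this kind cannot find regulators whose descriptors lack the motif strings, which is one reason to treat the selected set as preliminary.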