A growing variety of statistical analysis approaches is available to identify groups of genes that share common expression patterns; however, interpreting the biological characteristics of the genes in such clusters remains primarily a manual task. We have developed a data-mining method that uses indexing terms from the published literature linked to specific genes to present a view of the conceptual similarity of genes within a cluster or group of interest. The method takes advantage of the hierarchical nature of the medical subject headings used to index citations in the MEDLINE database and of the registry numbers applied to enzymes. The results are generated as dynamic HTML with links to the citations whose keywords appear in the term hierarchies. We have applied this method to the gene clusters in the publication by Golub et al. [1] describing statistical methods for classifying acute myeloblastic leukemia (AML) and acute lymphoblastic leukemia (ALL) without a priori biological knowledge. In both sets of genes, the most common enzymatic descriptor class is that of complement-activating enzymes. In the ALL-predictive set of genes, these enzyme descriptors include endonucleases, endopeptidases, amidohydrolases and acid anhydride hydrolases. In the AML-predictive set, several plasminogen activators occur as keywords, a finding that may correlate with defibrination syndromes and other hemostatic abnormalities that are associated with AML but not with ALL. Overall, complement activation is a common and potentially clinically significant phenomenon in acute leukemias, and the high frequency of this descriptor in the set of highly expressed genes is consistent with our observations that informative genes were not merely markers of hematopoietic lineage, but encoded proteins important in cancer pathogenesis.
These conceptual similarities, revealed by the automated summing and organization of literature keywords associated with these 50 genes, are a new finding that complements the interpretations of the authors of the original paper.
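The summing of keywords over a term hierarchy can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes that each keyword has already been mapped to a dot-separated tree number of the kind used for medical subject headings, where every prefix of a tree number names an ancestor term. The gene names, the tree numbers and the functions `ancestors` and `roll_up` are hypothetical placeholders, and counting distinct genes per node (rather than citations) is likewise an assumption.

```python
from collections import defaultdict


def ancestors(tree_number):
    """Yield a tree number followed by each of its ancestors.

    For example, 'X01.100.200' yields 'X01.100.200', 'X01.100', 'X01'.
    """
    parts = tree_number.split(".")
    for depth in range(len(parts), 0, -1):
        yield ".".join(parts[:depth])


def roll_up(gene_terms):
    """Count, for every hierarchy node, the distinct genes that have
    at least one keyword at or below that node."""
    genes_per_node = defaultdict(set)
    for gene, tree_numbers in gene_terms.items():
        for tree_number in tree_numbers:
            for node in ancestors(tree_number):
                genes_per_node[node].add(gene)
    return {node: len(genes) for node, genes in genes_per_node.items()}


# Hypothetical input: placeholder genes and made-up tree numbers,
# not real MEDLINE annotations.
gene_terms = {
    "geneA": ["X01.100.200", "X01.100.300"],
    "geneB": ["X01.100.200"],
    "geneC": ["X02.500"],
}

counts = roll_up(gene_terms)

# Print the hierarchy as an indented outline, one level per dot.
for node in sorted(counts):
    print("  " * node.count(".") + f"{node}: {counts[node]}")
```

Rolling counts up to ancestor nodes is what lets a broad descriptor class stand out even when individual genes are indexed with different, more specific terms beneath it.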