Genome-wide knowledge of gene expression in cancer cells promises to illuminate many aspects of their clinical behaviour. We have begun a study of gene expression in lymphoid malignancies by constructing a specialized cDNA microarray, termed the ‘Lymphochip’, that is enriched in genes selectively expressed in lymphocytes and in genes that regulate lymphocyte function. Because most human lymphomas appear to represent malignant transformation of germinal center B lymphocytes, we created a cDNA library from germinal center B lymphocytes purified by flow sorting from human tonsils. We obtained 50,898 sequences from this library, over 10% of which had not been observed previously in other libraries. In addition, we generated 14,645 EST sequences from cDNA libraries of a variety of B cell malignancies, with a similarly high rate of gene discovery. This rich source of novel genes formed the basis of the Lymphochip microarray, which currently contains over 18,000 clones.

Initial experiments with the Lymphochip have focused on three B cell malignancies: diffuse large cell lymphoma, follicular lymphoma and chronic lymphocytic leukaemia. These malignancies were chosen for study because each may encompass a variety of molecularly distinct diseases that cannot be distinguished morphologically. We identified disease-specific sets of genes that were characteristically expressed in all cases of one malignancy but not in the others. Nonetheless, gene expression varied substantially between patients within a given diagnostic group. In diffuse large cell lymphoma, we defined two subgroups of patients, each with a subgroup-specific gene expression signature composed of 50–100 genes. Most genes that define these diffuse large cell lymphoma subgroups are novel genes derived from the germinal center B cell library. Our results demonstrate that molecular diagnosis of cancer using gene expression profiling can discover “diseases within a disease”.