The human immune system is important in immune responses, cancer biology and several genetic diseases. Studies on the expression levels of genes related to the immune system are critical to understanding the mechanisms of the immune response and the pathology of human diseases. The recent development of cDNA microarray technology makes it possible to study gene expression on a genome scale. Lymphochip is a cDNA microarray designed to represent genes involved in a wide variety of lymphocyte differentiation and activation processes. Screening over 70,000 EST sequences from germinal centre B cell, follicular lymphoma, follicular mixed small and large cell lymphoma, mantle cell lymphoma and chronic lymphocytic leukemia libraries required the development of automated computational tools and a structured, robust data management system. We used multiple criteria to select clones for Lymphochip. The first was the frequency with which a gene sequence had been sequenced from the selected set of B cell and lymphoid libraries, and from an expanded set of lymphoid libraries, relative to all other libraries. Clones were classified as unique to a library, unique to the pool of libraries, or mostly lymphoid (i.e. 75% of matching hits were derived from an expanded set of lymphoid libraries). This analysis was performed by tabulating the BLAST results of each candidate sequence against the dbEST database. The second criterion was a match to a specific set of named genes of interest, including genes encoding cytokines, cytokine receptors, adhesion molecules, cell surface differentiation markers, signal transduction and transcription factors, cell cycle and apoptosis proteins, oncogenes and tumour suppressors, as well as human viral genes. Additionally, where one existed, we chose a second clone from each cluster that met a criterion. Computational algorithms were designed to automate the selection process for these criteria.
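The library-based classification described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The library labels, the function name `classify_clone` and the reading of the 75% rule as "at least 75% of hits" are all assumptions made for illustration, and the input is assumed to be the source library of each dbEST BLAST hit for one candidate clone, already tabulated.

```python
# Hypothetical labels for the five source libraries named in the text.
CORE_LIBRARIES = {"GCB", "FL", "FMSLCL", "MCL", "CLL"}

# Hypothetical expanded lymphoid set; the real set is not listed in the text.
EXPANDED_LYMPHOID = CORE_LIBRARIES | {"thymus", "spleen", "tonsil"}


def classify_clone(hit_libraries, lymphoid_fraction=0.75):
    """Classify one clone from the source libraries of its dbEST BLAST hits.

    hit_libraries: list with one library label per matching hit.
    lymphoid_fraction: assumed threshold for "mostly lymphoid" (>= 75%).
    """
    if not hit_libraries:
        return "no_hits"

    distinct = set(hit_libraries)

    # All hits come from a single core library.
    if len(distinct) == 1 and distinct <= CORE_LIBRARIES:
        return "unique_to_library"

    # All hits come from the pool of core libraries.
    if distinct <= CORE_LIBRARIES:
        return "unique_to_pool"

    # Otherwise, check the share of hits from the expanded lymphoid set.
    lymphoid_hits = sum(1 for lib in hit_libraries if lib in EXPANDED_LYMPHOID)
    if lymphoid_hits / len(hit_libraries) >= lymphoid_fraction:
        return "mostly_lymphoid"

    return "other"
```

For example, a clone whose four hits come from one core library, two other lymphoid libraries and one non-lymphoid library has three of four hits in the expanded set and would be labelled mostly lymphoid under this reading of the threshold.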
To minimise the redundancy of clones included on Lymphochip, we used a modified version of the CLEANUP algorithm to cluster overlapping sequences. UniGene clustering was also used as a reference, and some additional clones from “mostly lymphoid” UniGene clusters were selected. The current version of Lymphochip includes 17,835 clones, of which 9,865 are unique to the B cell libraries. The clones on Lymphochip represent a total of 6,343 UniGene clusters. Given the rapid accumulation of new EST sequence data and new studies on known genes, iterative analysis with the computational algorithms is necessary to include further novel clones derived from B cell libraries and new genes of interest on Lymphochip. A number of experiments with clinical samples or laboratory cell lines have been performed with Lymphochip.
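The redundancy-reduction step, clustering overlapping sequences and keeping at most two clones per cluster, can be sketched as follows. This is not the CLEANUP algorithm or the authors' modification of it. It is a generic single-linkage clustering using union-find, and it assumes that pairwise overlaps between clones have already been computed. The names `cluster_clones` and `pick_representatives` are hypothetical.

```python
def cluster_clones(clone_ids, overlaps):
    """Group clones into clusters by single linkage over overlapping pairs.

    clone_ids: iterable of clone identifiers.
    overlaps: iterable of (clone_a, clone_b) pairs judged to overlap
              (assumed to be precomputed, e.g. from pairwise alignments).
    """
    clone_ids = list(clone_ids)
    parent = {c: c for c in clone_ids}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in overlaps:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for c in clone_ids:
        clusters.setdefault(find(c), []).append(c)
    return list(clusters.values())


def pick_representatives(clusters, max_per_cluster=2):
    """Keep at most max_per_cluster clones from each cluster, in input order."""
    return [c for cluster in clusters for c in cluster[:max_per_cluster]]
```

Taking the first clones in input order is an arbitrary choice made here for brevity; a real selection would presumably rank clones within a cluster by some quality measure, which the text does not specify.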