To identify differentially expressed genes in infiltrating ductal carcinoma (IDC) of the breast, we measured the gene expression profiles of 15 IDC and 13 normal human breast tissues using the Affymetrix GeneChip array platform, which allows simultaneous analysis of over 60,000 genes. Fold-change comparison between normal and IDC breast tissue samples revealed 830 genes that were significantly over- or underexpressed by threefold or greater in the IDC samples: 286 overexpressed genes and 544 underexpressed genes. The 830 genes were further evaluated for tissue-specific expression by E-northern analysis of 28 different normal tissues, revealing tissue-specific candidate targets. To determine whether gene expression profiles could distinguish IDC from normal breast tissue samples, we applied principal component analysis (PCA) and hierarchical cluster analysis (HCA) to 5,467 prefiltered genes. Based on gene expression patterns, both PCA and HCA separated the normal and tumor-derived samples into two distinct populations, with the exception of two samples. One aberrant tumor sample was explained by its histology, which showed only partial (<20%) involvement by malignant cells. We also employed cluster analysis to evaluate differences among samples at the single-gene level. Examination of one cluster of overexpressed genes revealed a multitude of differentially expressed sequence tags along with the genes for topoisomerase II, cyclin B, CDC2, KI-67, and thymidine kinase, suggesting some functional similarities. Our studies provide a rational basis for additional analysis of differentially expressed genes in breast cancer, which may lead to the identification of therapeutic targets and diagnostic markers.
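
The threefold fold-change filter described above can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function name `fold_change_calls`, the use of group means, and the gene labels and intensities are all hypothetical, and the statistical test that accompanied the fold-change cutoff is omitted. The sketch only shows how a symmetric threefold threshold sorts genes into over- and underexpressed sets, assuming strictly positive expression values.

```python
from statistics import mean

def fold_change_calls(normal, tumor, threshold=3.0):
    """Label genes whose mean tumor/normal ratio meets a symmetric fold cutoff.

    `normal` and `tumor` map a gene label to a list of positive expression
    values, one per sample. Genes below the cutoff are left out of the result.
    """
    calls = {}
    for gene in normal:
        ratio = mean(tumor[gene]) / mean(normal[gene])
        if ratio >= threshold:
            calls[gene] = "over"        # at least threefold higher in tumor
        elif ratio <= 1.0 / threshold:
            calls[gene] = "under"       # at least threefold lower in tumor
    return calls

# Toy intensities for illustration only; these are not data from the study.
normal = {
    "GENE_A": [100.0, 120.0, 110.0],
    "GENE_B": [900.0, 950.0, 1000.0],
    "GENE_C": [200.0, 210.0, 190.0],
}
tumor = {
    "GENE_A": [400.0, 380.0, 420.0],
    "GENE_B": [250.0, 300.0, 200.0],
    "GENE_C": [220.0, 180.0, 200.0],
}

calls = fold_change_calls(normal, tumor)
print(calls)
```

In this toy input, GENE_A is called overexpressed, GENE_B underexpressed, and GENE_C is left uncalled because its mean ratio is 1.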