We have made a DNA microarray that includes not only all the open reading frames (ORFs) and other features in the yeast genome, but also all the intergenic regions. We are using this array to construct genome-wide maps of DNA-protein interactions for proteins that interact directly or indirectly with DNA or chromatin in vivo. Proteins are crosslinked to DNA in vivo using formaldehyde, and the crosslinked DNA is extracted and sheared. DNA specifically associated with a protein of interest is immunoprecipitated using an antibody specific to the protein or to an epitope tag fused to it. The selected DNA, representing loci that the protein interacts with in vivo, is identified by fluorescently labelling it and hybridizing it to the microarray along with an appropriate reference probe. This approach is being used to map the genome-wide interactions of sequence-specific DNA-binding proteins, components of the transcription machinery, and chromatin components under a variety of conditions.

We have analysed the genome-wide interactions of SWI4, a sequence-specific DNA-binding activator that regulates genes expressed in the G1/S phase of the cell cycle. We detected specific interactions of SWI4 with the promoters of HO, CLN1, CLN2, PCL1 and other G1/S-specific genes. Most, but not all, of the sites that SWI4 appears to interact with in vivo are in the promoters of G1/S-specific genes.

Initiation of transcription requires the specific binding of TATA-box binding protein (TBP) to the core promoter, and enhancement of this binding is believed to be a key step in the activation of transcription. Using the intergenic microarrays, we found increased association of TBP with the promoters of heat shock-induced genes 15 minutes after yeast cells were heat shocked. This supports the idea that recruitment of TBP to these promoters increases upon heat shock and is an important step in the transcriptional activation of these genes.

Comparing the sites where a protein actually interacts in vivo with its predicted binding sites is likely to give us a better idea of the sequence determinants of its DNA interactions. Since these microarrays represent nearly every locus in the yeast genome, they can also be used to identify novel transcripts in the intergenic regions that may not have been predicted by the sequence annotation of the genome.
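The comparison between observed and predicted sites can be sketched as a simple set operation. The sketch assumes that predicted sites are exact matches to a consensus motif on either strand; the motif used here (CACGAAA, given as an assumed consensus for the SWI4-dependent SCB element), the region names and the sequences are illustrative placeholders, not data from this work.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_site(seq, motif):
    """True if the motif occurs exactly on either strand of seq."""
    seq = seq.upper()
    return motif in seq or reverse_complement(motif) in seq


def compare_bound_to_predicted(sequences, bound, motif):
    """Partition regions by array-detected binding and motif presence."""
    predicted = {name for name, seq in sequences.items() if has_site(seq, motif)}
    return {
        "bound_with_site": sorted(bound & predicted),
        "bound_without_site": sorted(bound - predicted),
        "site_not_bound": sorted(predicted - bound),
    }


# Hypothetical intergenic sequences and a hypothetical set of bound regions.
example_sequences = {
    "region_1": "TTACACGAAAGG",  # motif on the forward strand
    "region_2": "GGTTTCGTGAA",   # motif on the reverse strand
    "region_3": "ATATATATGC",    # no motif
    "region_4": "CCCACGAAATT",   # motif present
}
example_bound = {"region_1", "region_2", "region_3"}

summary = compare_bound_to_predicted(example_sequences, example_bound, "CACGAAA")
print(summary)
```

Regions in the "bound_without_site" and "site_not_bound" groups are the informative ones: the former suggest indirect binding or a looser sequence requirement, and the latter suggest that a motif match alone is not sufficient for occupancy in vivo. A position weight matrix would be a natural replacement for the exact-match test used here.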