Main

Human chromosome 12 has been estimated to be 125 megabases (Mb) in size. Several low-density clone maps, genetic linkage maps and radiation hybrid maps that include chromosome 12 have been described (refs 3–5). The map presented here is based on large bacterial clones that include bacterial artificial chromosomes (BACs), P1 (phage) artificial chromosomes (PACs; http://www.chori.org/bacpac) and a few cosmids (ref. 6). To construct the map, we divided the chromosome into 1–2 centiMorgan (cM) intervals and assembled all sequence-tagged site (STS) markers known to be in each interval. We used probes derived from the non-polymorphic markers in an interval to screen colony arrays representing six to ten genome equivalents from the RPCI libraries. To extend contigs and link adjacent contigs, we used STSs corresponding to end sequences of BACs at the outer boundaries of each contig to screen members of the contig from which they were derived as well as possible adjacent contigs. This process was repeated until the region was covered by overlapping clones (ref. 7). The STS-content map was assembled manually and the data stored in a relational database. A representative portion of the map corresponding to 5 cM of 12p can be seen in Supplementary Information; the entire map is available on our website (http://sequence.aecom.yu.edu/chr12/).
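The principle behind STS-content contig building can be illustrated with a minimal sketch: clones that test positive for the same STS marker are taken to overlap, and chains of overlapping clones form a contig. This is not the procedure used for the map, which was assembled manually; the function name `build_contigs`, the clone names and the marker names below are all hypothetical.

```python
from collections import defaultdict


def build_contigs(clone_markers):
    """Group clones into contigs; clones sharing an STS marker are linked.

    clone_markers maps a clone name to the set of STS markers it contains.
    Returns a sorted list of contigs, each a sorted list of clone names.
    """
    # Union-find: every clone starts as its own contig.
    parent = {clone: clone for clone in clone_markers}

    def find(clone):
        # Follow parent links to the contig representative, compressing the path.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    first_clone_for_marker = {}
    for clone, markers in clone_markers.items():
        for marker in markers:
            if marker in first_clone_for_marker:
                # A shared marker implies overlap, so merge the two contigs.
                parent[find(clone)] = find(first_clone_for_marker[marker])
            else:
                first_clone_for_marker[marker] = clone

    contigs = defaultdict(set)
    for clone in clone_markers:
        contigs[find(clone)].add(clone)
    return sorted(sorted(members) for members in contigs.values())


# Hypothetical screening results: clone name -> STS markers it is positive for.
example = {
    "cloneA": {"sts1", "sts2"},
    "cloneB": {"sts2", "sts3"},
    "cloneC": {"sts3", "sts4"},
    "cloneD": {"sts7"},
    "cloneE": {"sts7", "sts8"},
}
contigs = build_contigs(example)
print(contigs)
```

In this toy data set, clones A, B and C chain together through sts2 and sts3, while D and E form a second contig. An end-sequence STS from cloneC that also hit cloneD would merge the two, which is the effect of the contig-extension step described above.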

Our map contains 3,090 STS markers. Of these, 438 are polymorphic, 1,836 are non-polymorphic and the remaining 816 are based on genes or expressed sequence tags (ESTs). These markers were obtained from gene maps (ref. 8), linkage maps (refs 3,9,10) or the Whitehead Institute radiation hybrid map (http://www.genome.wi.mit.edu), or developed at the Albert Einstein College of Medicine Genome Center. If the size of chromosome 12 is 125 Mb, the STS markers provide an average resolution of 40 kb. There are more than 5,300 large-insert clones on the map, and each marker is present in an average of 8.1 bacterial clones. This depth provides a high level of confidence in the quality of the map. Originally, 1,154 tiling path and seeder clones were identified for sequencing chromosome 12; 1,115 of them are being sequenced. The map indicates that some of the seeder clones are redundant and do not need to be finished. The estimated minimal set of clones covering chromosome 12 is 1,025. On average, adjacent clones overlap by 30%. As the average size of the clones in the RPCI-11 library is 174 kb, we can estimate the physical size of chromosome 12 excluding the gaps (see Supplementary Information) to be 125 Mb.
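The two estimates in this paragraph can be reproduced with simple arithmetic. The sketch below assumes that each tiling-path clone contributes the non-overlapping 70% of its length as unique sequence; the text does not state the formula explicitly, so this reading is an assumption, as are the variable names.

```python
# Figures quoted in the text.
CHROMOSOME_SIZE_KB = 125_000
STS_MARKERS = 3_090
MINIMAL_TILING_CLONES = 1_025
MEAN_CLONE_SIZE_KB = 174
MEAN_OVERLAP_PERCENT = 30

# The three marker classes should add up to the total marker count.
marker_classes = {"polymorphic": 438, "non-polymorphic": 1_836, "gene/EST": 816}
total_markers = sum(marker_classes.values())

# Average marker spacing if the markers were evenly distributed.
marker_spacing_kb = CHROMOSOME_SIZE_KB / STS_MARKERS

# Assumption: each clone adds (100 - overlap)% of its length as new sequence.
unique_kb_per_clone = MEAN_CLONE_SIZE_KB * (100 - MEAN_OVERLAP_PERCENT) / 100
covered_mb = MINIMAL_TILING_CLONES * unique_kb_per_clone / 1000

print(f"markers: {total_markers}")
print(f"marker spacing: {marker_spacing_kb:.1f} kb")
print(f"clone-covered size: {covered_mb:.1f} Mb")
```

Under these assumptions the marker spacing comes to roughly 40 kb and the clone-covered size to just under 125 Mb, in line with the figures given in the text.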

We assessed the accuracy of the map in two ways. First, we analysed tiling path clones with any available sequence to determine whether they overlapped as predicted by the map. Most overlaps were confirmed using electronic polymerase chain reaction (ePCR), BLAST (http://www.ncbi.nlm.nih.gov/) or the annotation of finished clones. Second, we analysed clones located at 1-Mb intervals along the chromosome using fluorescence in situ hybridization (FISH) as part of the Cancer Chromosome Aberration Project (CCAP) (refs 11,12). We chose 120 clones, and their order proved to be consistent with the order in the integrated map. The names of the clones, their cytogenetic locations, and details of their distribution are available at http://www.ncbi.nlm.nih.gov/CCAP/.

The STS-content map of chromosome 12 has been compared and integrated with the whole-genome map generated by BAC clone fingerprinting. The RPCI-11 BAC library was the primary resource for both maps, so the two are fully compatible. The comparison validated both methods and revealed them to be complementary. When the comparison was initiated, the STS-content map contained 74 contigs anchored to the genetic map. The whole-genome map contained 75 contigs anchored to the G4 radiation hybrid map (ref. 4) and nine additional contigs that were assigned to chromosome 12 but unanchored.

The integrated map now consists of 18 contigs that are not linked by PCR or sequence data (see Supplementary Information). The genetic markers on the STS map provide strong evidence for the orientation and order of most of these 18 contigs. This permits the direct comparison of fingerprints of end clones that are adjacent to each other on the map. In two cases we find that the fingerprints overlap and consider these contigs to be manually closed (ref. 2); in one case sequence shows that the clones do not overlap, and in one case the fingerprints are difficult to interpret (see below). The remaining 12 pairs of end clones do not have similar fingerprints and additional clones may be required to achieve contiguity. By these criteria, we consider the chromosome 12 map to have 14 gaps (see Supplementary Information), excluding the centromere, in its bacterial clone coverage.
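The gap count follows from the figures in this paragraph, as the short tally below shows. It assumes that exactly one of the 17 junctions between the 18 contigs is the centromere, which the text implies but does not state outright; the category labels are paraphrases.

```python
CONTIGS = 18
junctions = CONTIGS - 1      # boundaries between adjacent contigs
CENTROMERE_JUNCTIONS = 1     # assumption: one junction spans the centromere

# Outcome of comparing end-clone fingerprints at the remaining junctions.
end_clone_pairs = {
    "fingerprints overlap (manually closed)": 2,
    "no overlap, shown by sequence": 1,
    "fingerprints difficult to interpret": 1,
    "fingerprints dissimilar": 12,
}

examined = sum(end_clone_pairs.values())
closed = end_clone_pairs["fingerprints overlap (manually closed)"]
gaps = examined - closed

print(f"junctions excluding centromere: {junctions - CENTROMERE_JUNCTIONS}")
print(f"end-clone pairs examined: {examined}")
print(f"remaining gaps: {gaps}")
```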

We previously reported a yeast artificial chromosome (YAC) map of chromosome 12 (ref. 5). When the YAC, BAC and whole-genome fingerprint maps are combined, YACs cover all but five of the gaps, so we expect that these gaps are less than 1 Mb. The gaps not covered by YACs include one in a repetitive region at 43.6 cM (described below), and four between 137.5 and 169.1 cM.

Some of the gaps differ in character from the others and may reflect significant genomic features. Understanding these regions will benefit the mapping and sequencing of this and other genomes. For example, the gap at 13.9 cM is covered by a YAC, but we have been unable to identify a bacterial clone to fill it, despite screening the libraries RPCI-1, 3, 4, 5, 11 and 13 several times with different probes. Furthermore, sequence is currently available for most of the clones in the region, but there are no sequence hits in any of the databases for the ends of the gap. This is the only region of the chromosome that appears to be missing from the available bacterial clone libraries, suggesting that true deficiencies in the bacterial libraries may be rare.

Two gaps appear in the region corresponding to 43.6 cM on 12p. Screening of the 12X BAC library with markers in this region yielded a set of clones whose number far exceeded that expected for a region represented once in the genome. STS-content analysis of the clones showed that each marker was positive for as many as 30–40 clones and the order of the clones could not be determined logically. These clones had similar restriction patterns and fell into very dense fingerprint contigs that included clones containing markers on chromosomes 2, 3, 11, 12 and 14. The combined fingerprint and PCR results indicate that this region may contain large duplications reminiscent of those described on chromosome 22 (refs 13,14) and it may also represent a region duplicated in other parts of the genome. The fingerprints of the clones flanking one gap in this region have been examined and the data strongly support coverage. Neither fingerprint nor STS data can currently resolve the second gap. This repetitive region may represent one of the most difficult types of region to map and sequence in full, in both the human and other genomes.
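A rough sense of why 30–40 positive clones per marker signals duplication can be had by dividing the number of hits by the library coverage. This back-of-the-envelope estimate is not part of the analysis reported here. It assumes uniform cloning efficiency, a marker that amplifies from every copy, and a single-copy marker hitting about as many clones as the library's fold coverage; the function name `apparent_copy_number` is hypothetical.

```python
def apparent_copy_number(positive_clones, library_coverage):
    """Estimate how many genomic copies a marker has from its clone hits.

    Assumes a single-copy marker is found in about `library_coverage` clones.
    """
    return positive_clones / library_coverage


LIBRARY_COVERAGE = 12  # the 12X BAC library mentioned in the text

low = apparent_copy_number(30, LIBRARY_COVERAGE)
high = apparent_copy_number(40, LIBRARY_COVERAGE)
print(f"apparent copies: {low:.1f} to {high:.1f}")
```

Under these assumptions the markers would appear to be present in roughly two to three or more copies, which is consistent with the suggestion of large duplications but does not by itself distinguish copies on chromosome 12 from copies elsewhere in the genome.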

This map is one of the most complete human chromosome maps available. The known marker order for each clone allows easy assembly of draft sequence and the map will be invaluable in directing efforts towards completing the sequencing of chromosome 12 as well as future functional genomic studies.