Complete sequences of Schizosaccharomyces pombe subtelomeres reveal multiple patterns of genome variation

Genome sequences have been determined for many model organisms; however, repetitive regions such as centromeres, telomeres, and subtelomeres have not yet been sequenced completely. Here, we report the complete sequences of the subtelomeric homologous (SH) regions of the fission yeast Schizosaccharomyces pombe. To overcome the technical difficulty of sequencing subtelomeric repeats, we constructed strains that each possess a single SH region of a standard laboratory strain. In addition, some natural isolates of S. pombe were analyzed using existing sequencing data. The complete SH sequences revealed that each SH region consists of two distinct parts with mosaics of multiple common segments or blocks that vary greatly among subtelomeres and strains. Subtelomeric regions show a relatively high frequency of nucleotide variation among strains compared with other chromosomal regions. Furthermore, we identified the subtelomeric RecQ-type helicase genes tlh3 and tlh4, which add to the already known tlh1 and tlh2, and found that the tlh1–4 genes show high sequence variation, with missense mutations, insertions, and deletions, but no severe effects on their RNA expression. Our results indicate that SH sequences are highly polymorphic and are hot spots for genome variation. These features of subtelomeres may have contributed to genome diversity and, conversely, various diseases.

(d) Detailed information for the Ch3 ends in the three strains. The sequence of SAS, which has not been found in PomBase-972, is the same as that of box q (see Fig. 6).
Yellow box, common sequence (19 bp) between SAS and block III; blue arrow with a purple line, homologous region with the tlh2 ORF defined in PomBase; pale purple line, predicted extension of the coding region of the tlh2 gene as described in Fig. 8b; pink line, predicted in-frame extension of the coding region in the same reading frame; blue arrowhead, nucleotide deletion; dM (downstream M), the original first methionine codon of the tlh2 ORF in PomBase-972 or its corresponding first methionine codon in KYP33 and JP1225; uM (upstream M) and uM′, putative first methionine codons upstream of dM (also see the main text and Fig. 8b). Note that the 24-bp deletions just downstream of the dM codon in KYP33 and JP1225 do not cause frame shifts or premature nonsense mutations.
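The frame-preservation part of the note above follows from simple arithmetic: a deletion keeps the downstream reading frame only if its length is a multiple of three. The minimal sketch below illustrates this check; the function names are hypothetical and are not part of the analysis pipeline used in this study. The length check addresses frame shifts only. Whether a stop codon arises at the deletion junction depends on the actual sequence, which is not examined here.

```python
def is_in_frame(indel_length_bp: int) -> bool:
    """Return True if an indel of this length preserves the reading frame.

    Hypothetical helper for illustration: a frame is preserved when the
    number of inserted or deleted bases is a whole number of codons.
    """
    if indel_length_bp < 0:
        raise ValueError("indel length must be non-negative")
    return indel_length_bp % 3 == 0


def codons_removed(deletion_length_bp: int) -> int:
    """Return how many whole codons an in-frame deletion removes."""
    if not is_in_frame(deletion_length_bp):
        raise ValueError("deletion is not a whole number of codons")
    return deletion_length_bp // 3


if __name__ == "__main__":
    # The 24-bp deletions described in the legend.
    print(is_in_frame(24))     # True
    print(codons_removed(24))  # 8
```

Under this check, a 24-bp deletion corresponds to the loss of eight codons with no shift of the downstream frame.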

Supplementary Fig. 2 Sequencing of SH-P regions by the serial deletion method.
The SH-P region with various common segments (~5 kb) was amplified by PCR from a 972SD4 strain and inserted into a vector. The resultant plasmid was digested with the restriction enzymes KpnI and XhoI at the multi-cloning sites (MCSs) of the vector.
The linearized plasmid was serially deleted from the 5′-overhang at the XhoI site by treatment with exonuclease III (ExoIII) and mung bean nuclease (MBN) for fixed times.
After deletion of the SH-P DNA, both ends of the plasmid were blunted with Klenow fragment (KF) and ligated with DNA ligase, and the re-circularized plasmids were cloned in Escherichia coli (E. coli). The partially deleted SH-P DNAs were sequenced using primers that anneal to the MCSs of the vector. The sequence reads were assembled using overlapping sequences.

(a) Homologous Y sequences located at the ends of the 3.7 kb-change (purple and pink boxes in Fig. 5a-b). YSH1L-L (purple) and YSH1L-R (pink) are Y sequences on the left side and the right side of the 3.7 kb-insertion in SH1L in Fig. 5a-b, respectively; YSH1R, YSH2L, and YSH2R (pink) are Y sequences in SH1R, SH2L, and SH2R, respectively. Differences in sequences between YSH1L-L and the others are highlighted in yellow. Note that the sequences of the pairs YSH1L-R and YSH1R, and YSH2L and YSH2R, are 100% identical.
(c) Homologous W sequences at the ends of the 7.1 kb-change (brown, red, and orange boxes in Fig. 5a-b). WSH1L-L (brown) and WSH1L-R (red) are sequences on the left side and the right side of the 7.1 kb-insertion, respectively, in SH1L in Fig. 5a

Supplementary Fig. 9 Sequence alignment of the tlh1-4 genes in 972SD4 with the PomBase-tlh2 gene.