Introduction

Simple sequence repeats (SSRs), or microsatellites, consist of tandem repeats of 1–4 nucleotide motifs and are distributed throughout eukaryotic genomes (Charlesworth et al., 1994). SSR loci are expected to show high levels of length polymorphism, on account of differences in the number of repeat units, and to show codominance. In addition, once specific primers have been designed, SSR loci can be amplified from small amounts of genomic DNA by PCR. SSR markers are therefore useful and powerful tools for population studies. Recently, SSR markers have been developed for several woody plants (Smith & Devey, 1994; Dow et al., 1995; Kostia et al., 1995; Echt et al., 1996; Van de Ven & McNicol, 1996; Chase et al., 1996a).

Shorea curtisii is a species of the Dipterocarpaceae, a family that includes major component species of the tropical forests of south-east Asia. Although the family is important in tropical forestry and ecosystems, genetic information about member species is limited. Therefore, to determine genetic diversity and structure in S. curtisii and related species, we planned to develop a series of SSR markers.

Several methodological improvements for the efficient development of SSR markers have been reported recently (Ostrander et al., 1992; Karagyozov et al., 1993; Lyall et al., 1993; Cifarelli et al., 1995; Kirkpatrick et al., 1995; Takahashi et al., 1996), one of which is the vectorette PCR strategy (Lench et al., 1996). The procedure relies on PCR amplification using a vectorette-specific primer in combination with anchored dinucleotide repeat primers.

In this study, SSR markers of S. curtisii were developed using a vectorette PCR method and a general method involving screening by colony hybridization. The applicability of these primers to other species of the Dipterocarpaceae was also examined.

Materials and methods

Plant materials

Leaves of S. curtisii were collected at the research plot at Semangkok Forest Reserve in Selangor, Malaysia (Niiyama et al., 1993). The leaves of various other species of Dipterocarpaceae were collected from the arboretum of the Forest Research Institute of Malaysia (FRIM). Total DNA was extracted from leaves of each individual by a slightly modified CTAB method (Tsumura et al., 1996).

Screening the genomic DNA library

The genomic DNA samples were digested with Sau3AI, HaeIII, AluI and RsaI, and fragments of 300 to 500 bp were fractionated. The DNA fragments were ligated into pUC18, and the ligation mixtures were then transformed into E. coli JM109. Colony hybridization was carried out using a DIG detection kit (Boehringer Co. Ltd) according to the manufacturer’s instructions, except that 6×SSC was used in the hybridization buffer. Two oligonucleotides, (CA)15 and (CT)15, were used as probes to survey SSR sequences. The hybridization temperature was 64°C for each probe. Positive clones were isolated and sequenced, using an ABI 377 automatic sequencer, according to the manufacturer’s instructions (Perkin-Elmer ABI Co. Ltd).

Vectorette PCR

This procedure was based on the protocol described by Lench et al. (1996). Genomic DNA of S. curtisii was digested with a mixture of restriction enzymes including EcoRI, EcoRV, XbaI and XhoI. Digested fragments between 500 bp and 2500 bp in size were fractionated. Fragments were then repaired by the Klenow method to make the ends blunt, and were ligated to pUC18. The ligation mixtures were transformed into E. coli JM109 and a DNA sample was prepared by alkaline lysis.

Vectorette PCR was performed in 25 μL reaction volumes containing 10 ng DNA from the genomic DNA library, and 0.128 μM universal or reverse vectorette primer for pUC18 (Primer 1: CCCAG TCACG ACGTT GT, or Primer 3: GGAAA CAGCT ATGAC CATG; Nippon Gene Co.) in combination with each one (separately) of the 12 anchored dinucleotide repeat primers: (CT)10A, (CT)10T, (CT)10G, (CT)10CA, (CT)10CG, (CT)10CC, (CA)10A, (CA)10T, (CA)10G, (CA)10CT, (CA)10CG and (CA)10CC. Reaction mixtures were denatured at 94°C for 3 min, followed by 34 cycles of amplification consisting of: 94°C for 30 s, 55°C for 30 s and 72°C for 30 s. Amplified fragments were then fractionated in 2% agarose gel. When a single fragment was observed, the PCR products were purified by Suprec-02 columns (TAKARA Co). When multiple fragments were observed, individual fragments were picked up from the gel by pipette tip. The tips were washed in PCR reaction mixture directly, and a single fragment was obtained through one more PCR cycle. The isolated fragments, including SSR flanking regions, were sequenced. Specific forward primers were designed for the flanking regions of each SSR based on the sequence data with the aid of the OLIGO program (version 4.0; National Bioscience). Using the forward primers in combination with primers for the other side of the vectorette primers, vectorette libraries were reamplified by PCR. Amplified fragments were sequenced, and reverse primers were also designed.
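
The 12 anchored primers listed above follow a regular pattern: each repeat block is extended at its 3' end by every one- or two-base anchor that does not simply continue the repeat. The following Python sketch reproduces the list under that reading. The function names and the base ordering are illustrative assumptions, not part of the published protocol.

```python
BASES = "ATGC"  # ordering chosen only so the output matches the order in the text


def anchors(motif):
    """Return the 3' anchors for a dinucleotide motif such as 'CT'.

    Assumed rule: any single base other than the motif's first base, plus the
    motif's first base followed by any base other than its second base.
    """
    first, second = motif
    single = [b for b in BASES if b != first]
    double = [first + b for b in BASES if b != second]
    return single + double


def anchored_primers(motifs=("CT", "CA"), repeats=10):
    """Map primer names like '(CT)10CA' to their full sequences."""
    return {
        f"({m}){repeats}{a}": m * repeats + a
        for m in motifs
        for a in anchors(m)
    }


if __name__ == "__main__":
    for name, seq in anchored_primers().items():
        print(name, seq)
```

Running the script prints the twelve primer names in the order given in the text, each with its expanded 21- or 22-base sequence.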

SSR polymorphism of S. curtisii

PCR was performed in 20 μL reaction volumes containing 12.5 ng genomic DNA. The PCR annealing temperature was 52–56°C, as appropriate for each pair of primers. PCR amplification conditions were as follows: reaction mixtures contained 10 mM Tris-HCl, pH 8.0, 50 mM KCl, 1.5 mM MgCl2, 0.16 mM each dNTP, 0.128 μM of each primer, 12.5 ng of template DNA, and 0.5 units of Taq polymerase. A PCR amplification was carried out for 3 min at 94°C, followed by 35 cycles of 45 s at 94°C, 30 s at 52–56°C and 45 s at 72°C, with a final 3 min incubation at 72°C with GeneAmp PCR System Model 9600 (Perkin-Elmer ABI Co. Ltd). We investigated 40 individuals from the Semangkok Forest Reserve population in Selangor, Malaysia, to estimate the genetic diversity of each SSR locus. Fragment analysis was carried out using an ABI 310 Genetic Analyzer (Perkin-Elmer ABI Co. Ltd) and each fragment size was determined by the GENESCAN program (Perkin-Elmer ABI Co. Ltd). We also calculated the number of alleles, the allele frequency and the heterozygosity in each locus. We also carried out sequence analysis of amplified fragments to confirm the SSR in these fragments.
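
The per-locus statistics mentioned above (number of alleles, allele frequencies and heterozygosity) can be sketched in a few lines of Python. The text does not state which estimator was used, so this sketch assumes the expected heterozygosity He = 1 − Σp², without a small-sample correction. The function name and the fragment sizes in the example are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Summarise one codominant locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual,
    where alleles are fragment sizes in bp (or any hashable label).
    """
    alleles = [a for pair in genotypes for a in pair]
    n = len(alleles)
    counts = Counter(alleles)
    freqs = {allele: c / n for allele, c in counts.items()}
    # Expected heterozygosity (assumed estimator: 1 - sum of squared frequencies)
    he = 1.0 - sum(p * p for p in freqs.values())
    # Observed heterozygosity: share of individuals carrying two different alleles
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return {"n_alleles": len(counts), "freqs": freqs, "He": he, "Ho": ho}


if __name__ == "__main__":
    # Hypothetical fragment sizes for four individuals at one locus
    example = [(150, 152), (150, 150), (152, 154), (150, 154)]
    print(locus_diversity(example))
```

For real data the input would be the GENESCAN fragment sizes of the 40 individuals, scored as one allele pair per individual and locus.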

Application of SSR primers to other dipterocarp species

PCR was carried out using the five primer pairs (Shc01, Shc04, Shc07, Shc09 and Shc11) that showed the highest levels of polymorphism in S. curtisii. PCR conditions were the same except that the annealing temperature was set at 50°C for three cycles for each locus. All amplified fragments of Shc07 and Shc11 were sequenced, and randomly selected fragments obtained using the other three SSR loci were sequenced to confirm the SSR sequences in the various species.

Results

Screening of SSRs

Colony hybridization with two kinds of probes resulted in four positive clones being isolated from 6000 genomic S. curtisii clones. Though three clones contained SSR sequences, the repeat length of one was too short, and another clone was not long enough to design a primer. Consequently, only one SSR primer pair, Shc01, was obtained.

By the vectorette PCR method, 24 fragments were amplified independently out of 104 clones, 20 of which could be isolated. Thus, 20 fragments were sequenced, and 17 specific forward primers were designed (three of the fragments had AT-rich sequences which prohibited forward primer design). Using the forward primers in combination with the other side of the vectorette primer in PCR, 15 amplified fragments were acquired and sequenced. Each fragment contained an SSR sequence, but four of them had insufficient repeat numbers for PCR amplification. Finally, therefore, eight SSR loci, Shc02, Shc03, Shc04, Shc07, Shc08, Shc09, Shc11 and Shc17, which could be amplified by PCR, were identified.

A summary of the SSR markers is shown in Table 1. Shc02, Shc03, Shc04, Shc08 and Shc09 were simple CT repeats and the others were compound repeats of CT, CA, AT and CTCA tetranucleotide repeats.
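
The distinction between simple and compound repeats can be illustrated with a small Python sketch that scans a sequence for runs of dinucleotide motifs. The motif list, the minimum of three units per run, the rule that more than one run means a compound repeat, and the example sequence are all assumptions made for illustration. They are not the criteria used to build Table 1.

```python
import re


def find_repeat_runs(seq, motifs=("CT", "CA", "AT"), min_units=3):
    """Return (start, motif, units) for each tandem run of at least min_units."""
    runs = []
    for motif in motifs:
        pattern = f"(?:{motif}){{{min_units},}}"  # e.g. (?:CT){3,}
        for m in re.finditer(pattern, seq):
            runs.append((m.start(), motif, len(m.group()) // len(motif)))
    return sorted(runs)


def classify(seq, **kwargs):
    """Label a core sequence 'simple', 'compound' or 'none' by its run count."""
    runs = find_repeat_runs(seq, **kwargs)
    if not runs:
        return "none"
    return "simple" if len(runs) == 1 else "compound"


if __name__ == "__main__":
    # Hypothetical sequence: flank, (CT)5, (CA)4, flank
    demo = "GGA" + "CT" * 5 + "CA" * 4 + "TTG"
    print(find_repeat_runs(demo))
    print(classify(demo))
```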

Table 1 SSR loci in Shorea curtisii, forward and reverse primer sequences, marker sizes based on sequence data, and optimum primer annealing temperatures

Analysis of polymorphism of the SSRs

Genetic diversity of eight SSR loci among 40 S. curtisii specimens in the Semangkok Forest Reserve population was investigated. The number of alleles per locus ranged from two to 20 and the average was 7.9 (Table 2). The expected heterozygosity ranged from 0.180 to 0.922 and the average was 0.639. Four SSRs, Shc01, Shc04, Shc07 and Shc09, were highly polymorphic (He>0.8), and another four SSRs (Shc02, Shc03, Shc11 and Shc17) detected two to four alleles. Shc08 gave weak amplification and was not used for the diversity study.

Table 2 SSR allele number and expected heterozygosity for eight loci in the Semangkok population of Shorea curtisii

For simple repeat loci such as Shc04 and Shc09, allele polymorphism was found to depend mainly on differences in the number of CT repeats. However, in some cases, short compound repeats were found in the SSRs, especially at the Shc11 and Shc17 loci. The polymorphism at these loci was found to depend on insertions or deletions in the flanking region. Shc01 and Shc07 were also compound repeats, and highly polymorphic. Their polymorphism depended not only on differences in the repeat number of each repeat unit, but also on the combination of different repeat units. Therefore, we observed many alleles per locus for Shc01 and Shc07.

Conservation of SSR loci within Dipterocarpaceae

The primers developed for analysing five of the SSR loci were used to assess conservation of the loci among 30 species from 10 genera of the Dipterocarpaceae. Within the genus Shorea, all five SSRs were well conserved (Table 3), but multiple amplification products were obtained from S. lepidota, S. macrophylla, S. ovalis and S. scaberrima DNA using Shc01 primers, from S. scaberrima DNA using Shc07 primers, and from S. lepidota DNA using Shc09 primers. For species in the genus Hopea, Shc01, Shc07 and Shc11 were strongly amplified by PCR. However, Shc04 and Shc09 were amplified weakly from the DNA of certain Hopea species: Shc04 from H. latifolia, H. nervosa, H. sangal and H. subalata DNA, and Shc09 from H. dyeri DNA. All loci in Neobalanocarpus heimii, Parashorea lucida and Dryobalanops aromatica were strongly amplified, except for Shc04 in Parashorea. Among the other seven species (Dipterocarpus baudii, D. kerrii, D. oblongifolius, Anisoptera oblonga, Vatica odorata, Cotylelobium malayanum and Upuna bornensis), the sequences of the flanking SSR regions were found to be less conserved than they were in Shorea and Hopea.

Table 3 Application of SSR primers developed for Shorea curtisii to other Dipterocarpaceae species

Shc07 and 11 were selected for sequence analysis of the 30 test species. Shc07 was chosen because the PCR amplification lengths for this marker differed on 3% agarose gels: Hopea species, in particular, giving longer lengths than other genera. Based on the sequence data for Shc07, the average amplification length is 200 bp in Hopea, 130 bp in Neobalanocarpus heimii, 168 bp in Shorea and Parashorea lucida, and 143 bp in the other seven species included in the tribe Dipterocarpeae. Shc11 was selected because this locus was well amplified in all species except for Upuna bornensis. We observed some base substitutions between species at this locus, but the locus was well conserved within the family.

Discussion

In this study, nine polymorphic SSR markers were developed using two different strategies. Comparing the two strategies, a vectorette PCR approach may generally be a more useful strategy than the commonly used method of colony hybridization because it allows more rapid surveys of numerous clones. However, as shown in Table 1, some SSR loci isolated by the vectorette PCR method included short or compound repeats. Hybridization screening of genomic libraries allows SSR loci which have short or compound repeats to be discarded before primers are designed, but for the vectorette primer method, it was necessary to design forward primers for all possible SSR loci to obtain SSR core sequences. This problem could be overcome by optimization of PCR conditions. In the colony hybridization method, we could detect only one SSR locus. One of the reasons might be that we used a relatively high stringency condition for the hybridization.

The CT/AG motif in Shorea curtisii is as abundant as it is in other plants (Lagercrantz et al., 1993; Wang et al., 1994). Simple CA repeats, known to be common in animals, were not isolated in this study.

The polymorphism of Shc04 and 09, which have simple repeats, depends mainly on differences in CT repeat number, but the polymorphism of alleles involving compound repeat SSR loci is much more complex (Table 2). Such complexity may lead to errors in designating genotypes. In other words, even if amplified fragments are the same in size, they may not always have the same sequences. The complex compound repeats like Shc01 and Shc07 were found to be highly polymorphic, but it may be better to develop SSR markers only for simple repeat loci to limit errors in genotype identification. However, SSRs containing complex compound and interrupted repeats can be used for estimates of mutation rates and analysis of homoplasy (Jarne & Lagoda, 1996).

The frequency of dinucleotide repeats in the genome of several woody plants has been assessed, the estimates ranging from one repeat every 64 to 1105 kb (Condit & Hubbell, 1991). In our study, two SSRs were found out of 6000 clones from a genomic DNA library in which the average insert size was 400 bp, and eight SSR loci were isolated out of 104 clones from a genomic library with an average insert size of 1500 bp. Thus, according to these results, although the frequency of the dinucleotide repeats in the Shorea genome could not be determined exactly, they seem to occur approximately once every 1200 kb. In S. curtisii, therefore, dinucleotide repeats are apparently more widely separated than in most other species.
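
The arithmetic behind this kind of estimate can be sketched as follows, assuming it is simply the total length of DNA screened divided by the number of SSRs found. With the colony hybridization figures (6000 clones, 400 bp average insert, two SSRs) this gives the 1200 kb value quoted above. The function name is hypothetical, and the text does not say whether the vectorette library entered the estimate.

```python
def kb_per_ssr(n_clones, mean_insert_bp, n_ssrs):
    """Average spacing between SSRs, in kb, over the DNA that was screened."""
    screened_kb = n_clones * mean_insert_bp / 1000
    return screened_kb / n_ssrs


if __name__ == "__main__":
    # Colony hybridization library: 6000 clones x 400 bp = 2400 kb screened
    print(kb_per_ssr(6000, 400, 2))   # 1200.0
    # Vectorette library, shown for the arithmetic only: its fragments were
    # selected by PCR with repeat primers, so it is not a random genomic sample
    print(kb_per_ssr(104, 1500, 8))   # 19.5
```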

The SSR primers developed for S. curtisii are also useful for analysing other dipterocarp species. Therefore they are potentially powerful tools for genetic analysis of the tropical forest. The results also indicate that affinities among these species are relatively close. Methods of phylogenetic inference using SSRs have recently been discussed by Goldstein & Pollock (1997) and Ellegren et al. (1997).

Recently, molecular phylogenies of the same 30 species of Dipterocarpaceae were constructed using PCR–RFLP analysis of chloroplast genes (Tsumura et al., 1996) and sequence analysis of chloroplast DNA (Kajita et al., 1998). According to these studies, there are two major groups of Dipterocarpaceae: Shoreae, which includes Shorea, Hopea, Parashorea, Neobalanocarpus and Dryobalanops, and Dipterocarpeae, which includes Dipterocarpus, Anisoptera, Upuna and Cotylelobium. The molecular phylogeny analysis also revealed close affinities among Shorea, Hopea, Parashorea, Neobalanocarpus and Dryobalanops.

PCR amplification patterns of these 30 species, using the five SSR loci markers developed for Shorea curtisii, do not show clear correlations with the molecular phylogeny. However, the amplification patterns of Shc04 and Shc09 suggest that the sequences of the flanking regions of these SSR loci are not well conserved in relatively distant species. These loci were well amplified in Shorea, but not in several species from other genera. Dayanandan et al. (1997) showed that SSR loci developed for Pithecellobium elegans were conserved among closely related species, and that there is high potential for the transfer of SSR markers between closely related taxa, as in our study. Conservation of SSRs for closely related plant taxa was also reported by Kijas et al. (1995) and Wu & Tanksley (1993). Some reports in animals have also shown similar observations (Moore et al., 1991; Schlotterer et al., 1991; Levine et al., 1995) but sequences were not conserved in humans (Moore et al., 1991).

The average size of Shc07 amplification length also seems to correlate well with the molecular phylogeny. According to the sequence data, these differences mainly depend on CATA repeats in core sequences. In the tribe Dipterocarpeae, CATA repeats do not seem to exist. In contrast, although base substitutions in Shc11 were detected in several species, it is not clear whether this reflects genetic distance.

In tropical tree species, several studies have reported the development and use of SSR markers (Terauchi, 1994; Chase et al., 1996a, b; White & Powell, 1997). However, further study is required in order to evaluate the generality of SSR conservation, and to understand the evolution of SSR markers during speciation, and the genetic mechanisms involved.