Introduction

The presence of large numbers of repetitive sequences is a common feature of both plant and animal genomes. These sequences are characterized by high variability and constitute families of repetitive DNA that represent a substantial component of eukaryote genomes, in some species forming more than 80% of the DNA content (Ridley, 1996). In most cases, repetitive DNA is composed of relatively short units arranged in tandemly reiterated arrays and is termed satellite DNA. These sequences are associated with regions of constitutive heterochromatin (Singer, 1982). In fact, satellite DNA sequences have been located in positive C-band regions, such as centromeres, telomeres and other heterochromatic regions of autosomes and sex chromosomes (Modi et al, 1988; Schwarzacher-Robinson et al, 1988; Hamilton et al, 1990; Kunze et al, 1999; Fernández et al, 2001). Satellite DNAs are generally A–T rich and show high variability in monomer size, nucleotide sequence, copy number, and genome and chromosome organization and localization (Charlesworth et al, 1994).

Most satellite DNAs do not seem to have any defined function. Generally these sequences are not transcribed, although some exceptions have been found (Neitzel et al, 1998). Thus, despite numerous efforts to identify a biological function for these DNA sequences, none has been clearly established, although a variety of hypotheses have been put forward to explain their role in genomic structure and evolution (Singer, 1982; Vogt, 1992; Hennig, 1999; Garagna et al, 2001).

The genome size of most bat species is between 50 and 87% of the genome size of other eutherian mammalian genomes (Burton et al, 1989). It has been suggested that the difference in the nuclear content of bats in relation to other mammalian groups is mainly because of their lack of repetitive DNA sequences (Van Den Bussche et al, 1995). However, C-band studies in bat chromosomes have revealed several species with large amounts of heterochromatic regions (Haiduk et al, 1981; Morielle-Versute et al, 1996; Volleth et al, 2001).

Satellite DNA sequences have been well studied in almost all mammalian groups, but little information is available at the molecular level for these sequences in bats. Recently we have described a G–C-rich repetitive DNA sequence in three megabat species from the genus Pteropus (Barragán et al, 2002) and an A–T-rich repetitive DNA sequence in the microbat species Miniopterus schreibersi (Barragán et al, 2002). Here, we describe a HindIII family of repetitive DNA sequences present in the bat species Pipistrellus pipistrellus, a family that is located in the centromeric heterochromatin of the autosomes and the X chromosome.

Materials and methods

DNA extraction

A total of nine individuals from two Spanish populations of the bat species P. pipistrellus were analyzed: five individuals were from the Guadix (Granada) population and four from the Campillo de Arenas (Jaén) population. Genomic DNAs were extracted from different tissues according to standard phenol–chloroform procedures (Sambrook et al, 1989).

Cloning and sequencing

Genomic DNAs of individuals from both populations were digested with the AluI, HindIII, BamHI, PstI, EcoRI and SacI restriction endonucleases. After gel electrophoresis, a prominent band of 418 bp could be seen in all HindIII-digested DNA samples from individuals captured in Jaén. This band was eluted from the agarose gel and ligated with the pGEM-T vector (Promega) as described by Sánchez et al (1996) and Fernández et al (2002).

Escherichia coli JM109 competent cells were transformed with the ligation reactions. Recombinant bacteria containing the sequences of interest were selected by screening with the eluted HindIII band as probe, labelled with digoxigenin by random priming (Roche).

Positive clones were sequenced in both directions using the Thermosequenase fluorescent cycle sequencing kit from Amersham. Sequence reactions were analyzed in a 6.5% polyacrylamide gel in a LICOR-400L automated sequencer.

Southern blot

Genomic DNAs were digested with restriction endonucleases (HindIII, HpaII or MspI). Fragments were separated in 1% agarose gels and blotted onto nylon membranes (Amersham) according to Sambrook et al (1989). The membranes were probed overnight with the digoxigenin-labelled HindIII eluted band at 55°C. Alkaline phosphatase detection was carried out according to the supplier's recommendations (Roche).

Dot blots

Serial dilutions of genomic DNA and cloned satellite DNA from P. pipistrellus (clone P.pip 1.1) were denatured by boiling and placed onto nylon membranes using a dot-blot apparatus. Membranes were rinsed in 2 × SSC, dried and UV crosslinked. The insert from clone P.pip 1.1 was isolated after digestion and used as probe after Dig-labelling by random priming. Hybridization and detection conditions were the same as described for Southern blots.

Chromosome preparations and fluorescence in situ hybridization (FISH)

P. pipistrellus chromosomes were obtained from fibroblast cultures according to Volleth (1987).

The FISH procedure followed was essentially that of Pinkel et al (1986). The probe (clone P.pip 1.1) was labelled with biotin-16-dUTP (Roche) by PCR (Lo et al, 1990). Hybridization was carried out in 20 μl (50% formamide, 10% dextran sulphate, 5 mg herring-sperm DNA, 2 × SSC pH 7) for 16 h at 37°C in a moist chamber. Posthybridization washes were first at 40°C in 40% formamide/2 × SSC and then at 37°C in 2 × SSC, three times for 5 min each. Slides were blocked for 30 min at 37°C in 4 × SSC/5% blocking reagent (4M), incubated in fluorescein isothiocyanate (FITC)-conjugated avidin (Q-BIOgene, France) at 37°C for 30 min, and then washed three times (5 min each) in 4 × SSC/0.05% Tween 20 (Roche) (4T). Slides were stained with 4′,6-diamidino-2-phenylindole (DAPI).

Images were collected with a Nikon E-800 microscope equipped with a Hamamatsu CCD camera. Gray-scale images for either FITC or DAPI filter sets were pseudocolored and further processed with AnalySIS 2.11.005 software.

Sequence analysis

Pairwise sequence alignment and multiple alignments were carried out with the program CLUSTAL W 1.6 (Thompson et al, 1994). A sequence homology search was performed in GenBank using the BLASTN 2.2.2 program with default parameters (Altschul et al, 1997).

Results and discussion

Bat genomes are characterized by low DNA content (small C value) compared to the genomes of most mammalian species. Van Den Bussche et al (1995) suggested that the difference in the nuclear DNA content of bats compared to other mammalian groups is mainly because of the lack of interspersed repetitive DNA sequences. Despite this fact, little information about repetitive DNA sequences in bat species at the molecular level is available. Hence, it is of interest to characterize these sequences in this representative mammalian group.

Satellite DNA characterization and tandem repeat organization

Two Spanish populations of P. pipistrellus were analysed. Digestion of genomic DNAs with HindIII and further electrophoresis revealed the existence of specific patterns characteristic of each population. Thus, only DNA samples of individuals of the Jaén population showed a prominent 418 bp band, while no band was observed in DNA samples of individuals of the Granada population (Figure 1a). When genomic DNAs from individuals of both populations were completely digested with this endonuclease and probed with the 418 bp eluted band, a regular ladder pattern of bands was observed in all samples (Figure 1b). This indicates that the repeated sequences are arrayed in tandem in both populations. However, the ladder pattern also shows the existence of several differences between populations. While in the Jaén population the most prominent band is the monomer and the ladder can be seen only up to pentameric multimers, in the Granada population, more bands can be seen, increasing in intensity with the size of the band. That is, in samples from Granada the monomer is the faintest band, while in samples from Jaén the monomer is the most prominent. These results imply that while in samples from the Jaén population HindIII cuts in most of the monomer units, in samples from the Granada population the target sequences for this enzyme have been lost from most monomers (this result was confirmed in three independent Southern blot experiments where genomic DNAs were completely digested). These differences between the two populations could be explained as the result of homogenization processes of the satellite DNA acting in each population independently. In fact, in satellite DNAs, some particular monomer variants can be spread and homogenized throughout the genome and distributed among both homologous and nonhomologous chromosomes by unequal crossing over and amplification (Dover, 1986).
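The link between site loss and ladder shape can be illustrated with a simple model. The sketch below is not part of the original analysis: it assumes that each junction between monomers carries the HindIII site independently with probability p_site (a hypothetical parameter) and that hybridization signal is proportional to fragment length. Under these assumptions, a high p_site gives a dominant monomer band, whereas a low p_site gives bands whose intensity increases with size over the first multimers.

```python
def ladder_intensities(p_site, max_multimer=8):
    """Relative band intensity (DNA mass) for each n-mer in a complete digest.

    p_site is the assumed fraction of monomer junctions that still carry
    the restriction site. An n-mer needs a site at both ends and n - 1
    consecutive junctions without one; its signal scales with its length n.
    """
    if not 0.0 < p_site <= 1.0:
        raise ValueError("p_site must be in (0, 1]")
    return {
        n: n * p_site**2 * (1.0 - p_site) ** (n - 1)
        for n in range(1, max_multimer + 1)
    }


MONOMER_BP = 418  # monomer length reported in the text

# The two p_site values are illustrative, not estimates from the blots.
for label, p in (("most sites retained", 0.9), ("most sites lost", 0.1)):
    print(label)
    for n, signal in ladder_intensities(p, max_multimer=5).items():
        print(f"  {n * MONOMER_BP:>5} bp  relative signal {signal:.4f}")
```

The model ignores partial digestion, transfer efficiency and array boundaries, so it is only a qualitative guide to the two patterns described.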

Figure 1

(a) Gel electrophoresis of HindIII-digested genomic DNAs from P. pipistrellus individuals from the Jaén population. (b) Southern blot of HindIII-digested genomic DNAs from individuals of Jaén and Granada populations probed with the 418 bp band eluted from (a). (c) Southern blot of genomic DNA digested with the methylation-insensitive enzyme MspI and the methylation-sensitive enzyme HpaII, probed with the digoxigenin-labelled 418 bp HindIII eluted band. Left: DNA sample from the Jaén population and right: from the Granada population.

Sequence analysis of cloned monomers

Five positive clones containing a monomer unit of this satellite DNA were obtained, all from DNA samples from the Jaén population: P.pip 1.1, 1.3, 1.6, 1.7 and 1.9. Sequence alignments allowed the determination of the consensus sequence, which was 418 bp long and 62.20% A–T rich (Figure 2). A high A–T content is considered to be a general feature of most mammalian satellite DNAs. Recently, we have described a repetitive DNA satellite in the bat species Miniopterus schreibersi that is 57.85% A–T rich (Barragán et al, 2002). Furthermore, the A–T content of satellite DNAs of several rodent species ranges from 60 to 63% (Singer, 1982). However, in species of the genus Pteropus (megabat), we have cloned a G–C-rich family of satellite DNA (Barragán et al, 2002), despite the fact that genomes of Pteropus species are characterized by a high A–T content (Pettigrew and Kirsch, 1995).
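A–T content of the kind reported here is a simple base count. The following sketch shows the calculation on a made-up toy sequence; the function name and the toy sequence are illustrative and are not taken from the clones. The final line checks that, for instance, 260 A or T bases in 418 bp would round to the reported 62.20%; the actual base counts are not given in the text.

```python
def at_content(seq):
    """Return the percentage of A and T among the unambiguous bases of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    at_bases = sum(1 for base in counted if base in "AT")
    return 100.0 * at_bases / len(counted)


# Toy sequence for illustration only (not from the paper)
toy = "AAGCTTATTAGCATTTACGA"
print(f"{at_content(toy):.2f}% A-T")

# Arithmetic check under the assumed count of 260 A/T bases in 418 bp
print(f"{100 * 260 / 418:.2f}% A-T")
```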

Figure 2

Sequence alignment of the consensus sequence with the five cloned monomer units. Partial target sequences for MspI and HpaII are in bold and single underlined. The putative CENP-B box is in bold and double underlined. Asterisks denote conserved positions of this sequence with the Mus musculus CENP-B box, and lower case letters denote the five conserved positions of the nine positions which are considered to be necessary for CENP-B protein binding. These sequences have been submitted to GenBank and have been assigned the accession numbers AJ457179 to AJ457183.

The identity of the different monomer units with the consensus sequence ranges between 96.88 (monomer P.pip 1.1) and 99.76% (monomers P.pip 1.7 and 1.3) (Figure 2). Comparative analysis demonstrated that base-substitution mutations are randomly spread along the sequences and that the sequence identity between monomer units is very high (ranging between 94.97 (monomers P.pip 1.1 and 1.6) and 100.00% (monomers P.pip 1.3 and 1.7)). This high sequence identity between monomers could be because of homogenization processes such as sister chromatid exchanges, gene conversion or transposition (Dover, 1982; Walsh, 1987; Fernández et al, 2001). Alternatively, the sequence identity could be maintained by selective forces. In fact, as this satellite is located on pericentromeric heterochromatin (see below), it could have one or several roles associated with chromosome (centromeric) structure and function.
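Identity values such as these are computed column by column from an alignment. The sketch below shows one possible convention, assumed here for illustration: columns in which either sequence has a gap count as mismatches, and columns gapped in both are skipped. The source does not state how gaps were treated, so values from CLUSTAL W output may differ slightly. The toy sequences are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two rows of an alignment.

    Both strings must have equal length, with '-' marking gaps.
    Columns gapped in both rows are ignored; a gap in one row only
    counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Toy alignment for illustration only
print(f"{percent_identity('AAGCTTAC-G', 'AAGCTAACTG'):.2f}%")

# A single difference over 418 aligned positions gives the reported 99.76%
print(f"{100 * 417 / 418:.2f}%")
```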

A very common feature of satellite DNAs is the presence of internal direct and inverted subrepeats (Zhang and Horz, 1984; Bogenberger et al, 1985; Lee and Lin, 1996). Nevertheless, no internal subrepeats were found in this satellite of P. pipistrellus. Furthermore, a BLAST search in GenBank using the consensus sequence did not find any significant sequence homology with repetitive or nonrepetitive DNAs from other organisms.

Chromosome localization

We investigated the chromosomal location of this satellite DNA by FISH. The karyotype of P. pipistrellus has a diploid number of 2n=44 chromosomes, with a metacentric X chromosome and a small Y chromosome (Volleth et al, 2001). The FISH technique localized the repetitive DNA in the pericentromeric regions of all autosomes and the X chromosome. No fluorescent signal was detected on the Y chromosome (Figure 3). C-banding techniques have demonstrated that heterochromatin is located in the pericentromeric regions of all the autosomes and the X chromosome of this species, but is absent from the entire Y chromosome, including the centromeric region (Volleth et al, 2001). Hence, the location of this repetitive DNA coincides with the location of the pericentromeric constitutive heterochromatin previously described. The presence of satellite DNA in the centromeres of autosomes and X chromosomes and its absence from the Y chromosome centromere is a common feature of satellite DNAs from other mammal species (Lee and Lin, 1996; Kunze et al, 1999; Fernández et al, 2001).

Figure 3

(a) Fluorescence in situ hybridization with cloned repeat DNA on male metaphase chromosomes from P. pipistrellus. (b) The same image, reverse printed in black and white.

Several centromeric satellite DNAs are considered to be functional components of centromeres because they contain the CENP-B box, the binding site for centromeric protein B (Kipling and Warburton, 1997). CENP-B box-like sequences have been described in several centromeric satellite DNAs in rodent species (Kunze et al, 1999; Stitou et al, 1999). The search for the 17-bp motif of the mammalian CENP-B box in the satellite DNA from Pipistrellus resulted in a sequence with nine and 10 base pair positions shared with the Homo sapiens and Mus musculus consensus sequences, respectively. The presence of this putative CENP-B box in the satellite of Pipistrellus suggests a possible role in the centromeric function. However, it is also possible that this box may not be functional as this putative CENP-B box has only five of the nine conserved positions which are considered to be necessary for CENP-B protein binding (Figure 2).
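A motif search of this kind can be sketched as a sliding-window comparison. The code below is an illustration, not the procedure used in this study. It assumes the commonly cited form of the 17-bp CENP-B box, NTTCGNNNNANNCGGGN, whose nine non-N positions are taken as those required for binding; this string is not spelled out in the text and should be treated as an assumption. Only one strand is scanned, and the test sequence is invented.

```python
# Assumed form of the 17-bp CENP-B box; N stands for any base.
CENP_B_CORE = "NTTCGNNNNANNCGGGN"


def best_cenpb_match(seq, motif=CENP_B_CORE):
    """Return (start, window, core_matches) for the best-scoring window.

    core_matches counts agreement at the non-N positions of the motif.
    Returns None when seq is shorter than the motif. Ties keep the
    first window found.
    """
    seq = seq.upper()
    core_positions = [i for i, base in enumerate(motif) if base != "N"]
    best = None
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        score = sum(1 for i in core_positions if window[i] == motif[i])
        if best is None or score > best[2]:
            best = (start, window, score)
    return best


# Invented sequence with a perfect box embedded at position 3
toy = "GGA" + "CTTCGTTGGAAACGGGA" + "TTC"
print(best_cenpb_match(toy))
```

A fuller search would also scan the reverse complement and report every window above a chosen threshold.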

Methylation analysis

We have investigated the methylation status of the repetitive DNA sequence of P. pipistrellus in DNA samples from both populations. For this purpose, the genomic DNAs were digested independently with the methylation-insensitive enzyme MspI and with its methylation-sensitive isoschizomer HpaII. After Southern blotting, the membrane was probed with the HindIII eluted band labelled with digoxigenin. With both enzymes, the genomic DNA gives rise to the same regular ladder band pattern (Figure 1c). Hence, we can conclude that the cytosines present in the CCGG target sites in this repetitive DNA sequence are not methylated. The existence of a regular ladder implies that the target site for these restriction endonucleases is present in this satellite DNA, even though the sequence CCGG is absent in all the monomer units cloned and sequenced. This could be because of the existence of several positions where incomplete target sequences are located. All these positions can give rise to the CCGG sequence with a single base change (see Figure 2). Hence it is possible that this sequence could be present in most monomer units of satellite DNA although we have not cloned any of them.
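Incomplete target sequences of the kind mentioned above can be located by scanning for windows that differ from CCGG at a single position. The sketch below illustrates the idea on an invented sequence; the function name and parameters are hypothetical and the output does not reproduce the positions marked in Figure 2.

```python
def near_sites(seq, site="CCGG", max_mismatches=1):
    """List (position, window, mismatches) for windows close to site.

    Positions are 0-based. Windows with zero mismatches (exact sites)
    are included so that they can be told apart from partial sites.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - len(site) + 1):
        window = seq[start:start + len(site)]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Invented sequence for illustration only
toy = "ATCCAGTTCTGGAACCGGTA"
for position, window, mismatches in near_sites(toy):
    kind = "exact site" if mismatches == 0 else "one change from CCGG"
    print(position, window, kind)
```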

Other bat satellite DNAs with totally or partially methylated CCGG sequences have been described. In fact, a PstI family of repetitive DNA reported in three species of the genus Pteropus was totally methylated (Barragán et al, 2002), while an EcoRI family of Miniopterus schreibersi was partially methylated (Barragán et al, 2002).

Genomic content of satellite DNA

We estimated the amount of this repetitive DNA family by dot-blot hybridization. The results obtained indicate that this repetitive family represents approximately 3% of the whole genome of this bat species (results not shown). If the genome size of P. pipistrellus is similar to that of other Pipistrellus species, which ranges between 1.99 pg in P. savii (Capanna and Manfredi Romanini, 1971, 1973) and 2.49 pg in P. abramus (Kato et al, 1980), the number of monomer units per haploid genome would range between 141 000 and 176 000.
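The copy-number estimate follows from the genome size, the genomic fraction and the monomer length. The sketch below reproduces the arithmetic under two assumptions that are not stated in the text: that the quoted genome sizes are haploid C values, and that 1 pg of DNA corresponds to about 0.978 × 10^9 bp, a commonly used conversion factor. With this factor the results come out at roughly 140 000 and 175 000 copies, close to but not identical with the figures given above, which presumably rest on a slightly different conversion.

```python
BP_PER_PG = 0.978e9  # assumed conversion factor, not taken from the text


def copies_per_haploid_genome(c_value_pg, genome_fraction, monomer_bp,
                              bp_per_pg=BP_PER_PG):
    """Estimate the number of monomer units in a haploid genome.

    c_value_pg      haploid DNA content in picograms
    genome_fraction share of the genome made up by the satellite (0 to 1)
    monomer_bp      length of one repeat unit in base pairs
    """
    genome_bp = c_value_pg * bp_per_pg
    return genome_bp * genome_fraction / monomer_bp


# Genome sizes, fraction and monomer length as reported in the text
for species, c_value in (("P. savii", 1.99), ("P. abramus", 2.49)):
    copies = copies_per_haploid_genome(c_value, 0.03, 418)
    print(f"{species}: about {copies:,.0f} copies")
```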

The flying vertebrates, that is, bats and birds, have the highest metabolic rate and the smallest genome size (Hughes and Hughes, 1995; for a review see Redi et al, 2001). In fact, bat genomes are characterized by small C values compared to the genomes of most mammalian species (Capanna and Manfredi Romanini, 1971, 1973; Burton et al, 1989). Van Den Bussche et al (1995) examined the distributions of three classes of repetitive DNA sequences (dinucleotide microsatellites, ribosomal DNA cistrons and a repetitive DNA family) in the bat species Macrotus waterhousii. These authors found reduced copy number of these sequences when compared with other species of mammals, and suggested that the difference in the nuclear content of bats compared to other mammalian groups is mainly because of the lack of repetitive DNA sequences. They also proposed the existence of an unknown mechanism that maintains low copy numbers of repetitive DNA families. This mechanism could reduce different families proportionately (Van Den Bussche et al, 1995).

However, C-banding studies have demonstrated the presence of pericentromeric heterochromatic regions in most karyotypes of bat species analysed to date, with similar size to those described in most of the mammal species (Pathak et al, 1973; Bickham, 1979; Haiduk et al, 1981; Morielle-Versute et al, 1996; Santos and Souza, 1998; Volleth et al, 1999; Volleth et al, 2001).

The amount of pericentromeric satellite DNA is highly variable in mammals. In several species of the genus Acomys (Rodentia), a family of centromeric satellite DNAs represents between 0.53 and 0.88% (60 000 copies) of the haploid genome (Kunze et al, 1999). In mouse the minor satellite represents about 0.5–1%, and the major satellite DNA approximately 6% of the genome (see Kunze et al, 1999). The caribou (Rangifer tarandus caribou) centromeric satellite II DNA sequence represents 3.9% of the genome (157 000 copies per haploid genome) (Li et al, 2000). The data from P. pipistrellus pericentromeric satellite DNA, which represents 3% of the haploid genome (between 141 000 and 176 000 copies), are very similar to the data of other mammal species. Hence, if a mechanism that controls the amount of repetitive DNA sequences in bats exists, pericentromeric satellite DNAs must have escaped this control, at least in P. pipistrellus. However, more studies about pericentromeric satellite DNAs in bats and the percentage of the genome that they represent are necessary to test this possibility.

Species (genus)-specific sequences

Genomic DNAs from five other Vespertilionidae (Eptesicus fuscus, E. serotinus, Miniopterus schreibersi, Myotis myotis) and one Rhinolophidae species (Rhinolophus hipposideros) were digested with HindIII and probed with the repetitive sequence from Pipistrellus. In this case, no bands were observed in the Southern blot (data not shown). This result implies that these sequences are absent from the genomes of these species and that they are exclusive to the Pipistrellus genome.

The genera Pipistrellus and Eptesicus are so similar morphologically that it is quite difficult to distinguish them using morphological characters. Recently, several chromosomal differences have been described that allow the two genera to be told apart (Volleth et al, 2001). The presence of this repetitive DNA family in the genus Pipistrellus and its absence in the genus Eptesicus could also be employed as a new molecular taxonomic criterion, easier to use than chromosome criteria.