Introduction

Satellite DNAs (satDNAs) are non-protein-coding DNA elements consisting typically of a sequence motif (the monomer, between 100 and 500 bp long) that is highly repeated in arrays of 10⁴–10⁶ tandemly organized, slightly varying, monomers. Arrays of a single family of satDNA may be located at one or more genomic sites, in the heterochromatic regions of chromosomes, near the centromere or in subtelomeric or intercalary regions; each species includes multiple, unrelated families of satDNA monomers (Ugarkovic and Plohl, 2002; Heslop-Harrison et al., 2003). SatDNAs often represent more than 30% of total genomic DNA (Pons et al., 2004). In Drosophila, changes in satDNA copy number during evolution positively correlate with genome sizes, which in turn might affect organismal traits of selective value, such as developmental rate, body size and/or sperm size (Gregory and Johnston, 2008).

Apart from monomeric organization, satDNAs can show periodicity at the level of higher-order repeat (HOR) units, made up of multimers with a number of diverged monomers. A two-monomer (2-mer) HOR has been reported for scallop PmPst9 satDNA (Biscotti et al., 2007), the plant rye has a highly abundant subtelomeric satellite sequence that is organized as a 3-mer HOR, where monomers are defined by HaeIII restriction sites and 3-mer HOR by TaqI sites (Vershinin et al., 1995), whereas centromeric repeats consisting of six monomers have been reported in rice (Lee et al., 2006). In humans, the α-satellite DNA comprises 171 bp monomers organized in a tandem, head-to-tail orientation, that can be arranged as several chromosome-specific HORs, such as the 6-mer HOR of chromosome 7 (Waye et al., 1987). In insects, higher-order satDNA repeats have been reported in some beetle species (Pons et al., 2002; Palomeque and Lorite, 2008) although there are no reports in Drosophila.

Satellite DNAs often exhibit a higher degree of sequence similarity among repeats from the same species compared with lower similarity between orthologous repeats, even in closely related species (Bachmann and Sperlich, 1993; Mantovani et al., 1997; Fernández et al., 2001; Heslop-Harrison et al., 2003; Rudd et al., 2006). Models to explain this phenomenon, known as concerted evolution, assume that once mutations arise in individual repeats, repeated cycles of unequal crossing-over and gene conversion can duplicate and spread them simultaneously throughout a sequence family and a population of reproductive individuals (Dover, 1982; Kuhn et al., 2008). Reproductive barriers would subsequently allow homogenization of different mutations in the genome of different populations and species. There is evidence that the homogenization process is more efficient within clusters of neighboring monomers than between more distal monomers located in the same chromosome or in nonhomologous chromosomes (Hall et al., 2005; Rudd et al., 2006).

There have been relatively few attempts to characterize the nature of the ‘ends’ of long tandem arrays of satDNAs. Whole-genome projects have many sharp junctions of satellites with generally uncharacterized genomic DNA, although accurate assembly of individual sequence reads into satDNA arrays may be extremely difficult (see Lee et al., 2006 for care required for assembly of blocks of HORs). Nevertheless, in examples that have been verified, abrupt transitions involving satDNA × satDNA or satDNA × defined non-satDNA junctions have been reported in a number of organisms, such as humans (Gaff et al., 1994), plants (Alkhimova et al., 2004; Lee et al., 2006) and insects (McAllister and Werren, 1999; Sun et al., 2003; Krzywinski et al., 2005).

We have been studying different aspects of the satDNA structure, organization and evolution in the Drosophila buzzatii cluster (D. repleta group). This is a monophyletic group comprising seven closely related cactus-breeding species of South American origin: D. buzzatii, D. koepferae, D. serido, D. antonietae, D. gouveai, D. seriema and D. borborema (Manfrin and Sene, 2006). These species share at least five abundant unrelated satDNAs, three of them formally described (Kuhn and Sene, 2005; Kuhn et al., 2007; Franco et al., 2008). Two of these satDNA families with contrasting evolutionary and structural characteristics have been studied in more detail. The AT-rich pBuM family can be divided into two subfamilies: pBuM-1 consists of α repeats of approximately 190 bp, whereas pBuM-2 consists of α/β repeats of approximately 370 bp units, each consisting of an α unit and an additional 180 bp unrelated β unit (Kuhn and Sene, 2005). A complex pattern of variation of pBuM repeats across species was detected concerning abundance, chromosomal distribution, long-range organization, homogenization rate and nature of homogenized repeats (Kuhn et al., 2008). Intraspecific homogenization for a particular set of α or α/β variants occurred in the genome of five species of the cluster. The second family, DBC-150, is slightly GC rich (55.3% on average) and comprises repeat units of approximately 150 bp, showing no homology to pBuM repeats (Kuhn et al., 2007). In contrast to the pBuM family, no intraspecific homogenization for particular DBC-150 variants was detected in any of the seven species of the buzzatii cluster.

The basic karyotype of the seven species of the buzzatii cluster comprises one pair of sex chromosomes, four pairs of telocentric autosomes of similar size and one pair of microchromosomes (also known as the sixth chromosome pair or ‘dot’ chromosomes). The amount of heterochromatin present in the sex chromosomes and in the microchromosomes discriminates the karyotypes of most species of the cluster (reviewed by Kuhn et al., 2007).

The DBC-150 family was found to be restricted to the microchromosomes, whereas the pBuM family had a broader chromosomal distribution (Kuhn et al., 2007, 2008). Several Drosophila species present one pair of microchromosomes, which are typically heterochromatic, contain only a few genes and are stably transmitted to the progeny (Powell, 1997). In D. melanogaster, the analysis of thousands of normal flies conducted by Bridges (1935) revealed that the microchromosomes do not undergo crossing-over during meiosis under natural conditions (reviewed by Riddle and Elgin, 2006). Recent studies suggested that suppressed meiotic recombination is probably a characteristic of the microchromosomes found in the buzzatii cluster too (Casals et al., 2005; Kuhn et al., 2007).

We showed previously by independent single-color fluorescence in situ hybridization (FISH) experiments that the microchromosome pair of D. gouveai, D. seriema and D. antonietae is highly enriched with repeats from both pBuM and DBC-150 families (Kuhn et al., 2007, 2008). In this paper, we investigated the organization of pBuM and DBC-150 repeats in the metaphase microchromosomes, interphase nucleus and extended DNA fibers by double-target FISH, and analyzed the structure and dynamics of sequence junctions between the two satellites, isolated by PCR. Altogether, the data showed chromosomal regions and DNA fibers where the two satellites are interspersed at the molecular level, and sequence analysis of six pBuM × DBC-150 junctions isolated from D. antonietae and D. gouveai helped to identify possible factors associated with such organization.

Materials and methods

Drosophila strains

The following strains were analyzed: D. antonietae (J23A90M), D. gouveai (J79M4) and D. seriema (D73C3B). The geographical location of these strains in Brazil can be seen in Kuhn et al. (2008). The laboratory strains J79M4 and D73C3B are isofemale lines, derived from single females collected in the wild. The strain J23A90M was founded by several females.

Isolation of satDNA sequence junctions

Genomic DNA was extracted using standard procedures from a sample of approximately 20 adult flies from each strain. Different combinations of pBuM and DBC-150 primers were used to amplify pBuM × DBC-150 sequence junctions. The pBuM primers A2-F: 5′-CGGAGTATTTTTCATTCGAC-3′ and A2-R: 5′-GGTATGCCATAAAGAAGTCG-3′ are specific for α pBuM motifs present in both α and α/β pBuM repeats; BPS-F: 5′-GGCTCTAGCTAGGTTATCCT-3′ and BPS-R: 5′-GAGGGTTTACGTTACCTAAG-3′ are specific for β motifs from α/β repeats. The DBC-150 primers DBC-P1: 5′-TCCAACACCCGCGTATAACG-3′ and DBC-P2: 5′-CCAAAATTCTACGCCGAACG-3′ are specific for DBC-150 repeats. Each PCR was carried out in a 25 μl reaction mixture containing 30–50 ng of genomic DNA, 0.4 μM of each primer, 1 × PCR buffer, 0.2 mM dNTPs and 1 U Taq DNA polymerase. The PCR program consisted of a first denaturation step at 94 °C for 2 min followed by 30 cycles at 94 °C for 1 min, at 45 or 50 °C for 1 min and at 72 °C for 1 min, with a final extension step at 72 °C for 5 min. PCR products were cloned into the pGEM-T plasmid vector (Promega, Madison, WI, USA) before sequencing in both directions.
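As a rough illustration of how the six primers listed above compare in base composition, the following minimal Python sketch computes the GC fraction of each primer and an approximate melting temperature. The melting temperature uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is an assumption introduced here and not a calculation reported in this study; the function names are hypothetical.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "A2-F": "CGGAGTATTTTTCATTCGAC",
    "A2-R": "GGTATGCCATAAAGAAGTCG",
    "BPS-F": "GGCTCTAGCTAGGTTATCCT",
    "BPS-R": "GAGGGTTTACGTTACCTAAG",
    "DBC-P1": "TCCAACACCCGCGTATAACG",
    "DBC-P2": "CCAAAATTCTACGCCGAACG",
}


def gc_content(seq):
    """Fraction of G and C bases in a primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate Tm in degrees C by the Wallace rule (assumption, short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_content(seq):.0%}, Tm ~ {wallace_tm(seq)} C")
```

Such estimates are only a coarse guide; the annealing temperatures actually used (45 or 50 °C) are those stated in the protocol above.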

Sequence analysis

Multiple sequence alignments were performed using CLUSTALW (Thompson et al., 1994) and improved by eye. The MEGA software version 4 (Tamura et al., 2007) was used for the analysis of DNA primary structure, nucleotide variability, calculation of Kimura two-parameter (K-2P) distances (Kimura, 1980) and construction of neighbor-joining (NJ) dendrograms (Saitou and Nei, 1987). In addition to the NJ tree, a phylogenetic network was constructed using the Neighbor-Net method (Bryant and Moulton, 2004) combined with K-2P genetic distances. This method provides a more accurate representation of conflicting signals generated by recombination events, such as those involved in satDNA evolution.
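For readers unfamiliar with the K-2P distance, the following minimal Python sketch shows the standard formula, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. It is an illustration only and not the MEGA implementation; the function name is hypothetical, and skipping every site that contains a gap or ambiguous base is an assumption about gap handling.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For example, `k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")` corresponds to one transition in ten sites and gives a distance of about 0.112, slightly above the uncorrected 10% difference.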

Molecular cytogenetics

The preparation of metaphase chromosomes and extended DNA fibers was carried out as described in Kuhn et al. (2008). The interphase nuclei and metaphase chromosomes were obtained from neuroblasts of third-instar larvae. DNA fibers were isolated from adult flies. Probes for the detection of pBuM and DBC-150 repeats were prepared by PCR labeling. The reactions consisted of a 25 μl mixture containing 1–10 ng of plasmid DNA, 0.6 μM of each primer, 1 × PCR buffer, 0.2 mM each dATP, TTP, dCTP and dGTP, 0.04 mM of digoxigenin-11-dUTP or biotin-16-dUTP (Roche, Indianapolis, IN, USA) and 1.5 U Taq DNA polymerase (Yorkshire Bioscience, York, UK). The M13 primers were used for labeling cloned α/β or DBC-150 monomers. In addition, a β-probe lacking plasmid-flanking sequences was prepared by PCR using an α/β cloned monomer as template and the β-specific primers BPS-F and BPS-R.

Probes labeled with biotin and digoxigenin were detected with streptavidin-Alexa 594 and anti-digoxigenin-FITC, respectively. In situ hybridization experiments were performed as described previously (Kuhn et al., 2008). After hybridization, the chromosomes (or DNA fibers) were counterstained with DAPI (4′,6-diamidino-2-phenylindole) and analyzed with a Zeiss Axioplan epifluorescence microscope equipped with a Jenoptik ProgRes C12 digital image capture system. Images were processed in Adobe Photoshop CS2 using only cropping and adjustment functions that affected the whole image equally.

The length of single DNA fibers was measured considering that 320 pixels (objective × 100) or 180 pixels (objective × 63) correspond to 10 μm, which roughly corresponds to 29 kb of the extended DNA fiber.
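The calibration above amounts to a simple proportional conversion, sketched below in Python. Only the calibration values (320 or 180 pixels per roughly 29 kb, depending on the objective) come from the text; the function name and the example pixel counts are hypothetical.

```python
# Pixels spanning the calibration length (about 29 kb of extended fiber),
# keyed by objective magnification, as stated in the text.
PIXELS_PER_CALIBRATION = {100: 320, 63: 180}
KB_PER_CALIBRATION = 29.0


def fiber_length_kb(pixels, objective):
    """Convert a measured fiber length in pixels to kilobases."""
    if objective not in PIXELS_PER_CALIBRATION:
        raise ValueError("no calibration for this objective")
    return pixels / PIXELS_PER_CALIBRATION[objective] * KB_PER_CALIBRATION


if __name__ == "__main__":
    # Hypothetical measurement: 1200 pixels with the x100 objective.
    print(f"{fiber_length_kb(1200, 100):.1f} kb")
```

Under this calibration, a signal of about 1100 pixels at × 100 would correspond to roughly 100 kb, the scale at which long arrays are reported in the Results.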

Results

Organization of pBuM and DBC-150 repeats

Double-target FISH with DBC-150 and pBuM (α/β) probes revealed contrasting patterns of genomic organization of the two satellite families (Figure 1). In D. gouveai and D. antonietae, the distribution of DBC-150 repeats overlaps with the distribution of pBuM repeats (α or α/β) in some regions of the microchromosome (Figures 1a, b, d and e, yellow color showing overlapping hybridization). The submetacentric microchromosome of D. seriema showed pBuM and DBC-150 repeats occupying different chromosome domains (no yellow color in Figure 1h), with DBC-150 repeats clustering in the pericentromeric region and pBuM repeats clustering at the subtelomeric regions of both chromosome arms (Figures 1g and h). In D. gouveai and D. antonietae, zones of overlapping distribution between the two satDNAs were evident in the interphase nuclei (Figures 1c–f). In D. seriema, large arrays of pBuM and DBC-150 repeats are organized in distinct domains in the interphase nucleus, with overlapping distribution detected only in some punctate genomic regions (Figure 1i).

Figure 1

Double-target in situ hybridization with α/β (pBuM) (red) and DBC-150 (green) probes to the microchromosomes (left half of each image) or interphase nuclei (right half of each image) of D. gouveai, D. antonietae and D. seriema. Yellow signals show regions where the two satellite sequences are interspersed. In a, d and g, the microchromosomes are shown counterstained with DAPI (4′,6-diamidino-2-phenylindole, blue) and b, e and h show only the hybridization signals generated with the two satellite probes. Pictures of representative interphase nuclei (c, f and i) show overlays of the pBuM (red), DBC-150 (green) and DAPI DNA stain (blue). Scale bar: 1 μm except 2 μm in g. The color reproduction of this figure is available on the html full text version of the manuscript.

High-resolution double-target FISH in extended DNA fibers revealed several examples where pBuM and DBC-150 repeats were found interspersed with each other within the same arrays (Figure 2). The same pattern of interspersion was observed with probes prepared with PCR using M13 or β-specific primers, or where labels were exchanged. The most common interspersion pattern observed in D. gouveai, D. antonietae and D. seriema consisted of long arrays (>100 kb) comprising predominantly repeats of one family but with a few intruding repeats from the other (Figures 2a–e). In two cases, we detected adjacent arrays of pBuM and DBC-150 (Figures 2a and b). The highest level of interspersion of pBuM and DBC-150 repeats was found in D. gouveai, with repeats showing mutual arrangements in regions spanning more than 100 kb of the chromatin fiber (Figure 2c). In contrast, D. seriema showed continuous arrays of repeats from one satDNA with few (Figure 2e) or no intruding repeats from the other (Figure 2f).

Figure 2

Extended DNA fibers hybridized with α/β (pBuM) (red) and DBC-150 (green) probes showing the organization of pBuM and DBC-150 satDNAs in representative fibers from D. gouveai (a–c), D. antonietae (d) and D. seriema (e and f). Interspersion and junctions are visualized by adjacent red and green hybridization sites, yellow where signals overlap. Scale bar represents 10 kb of stretched DNA. The color reproduction of this figure is available on the html full text version of the manuscript.

The pattern of interspersion of pBuM and DBC-150 in the same DNA fibers is consistent with the numerous gaps observed within pBuM arrays in our previous report (Kuhn et al., 2008). Apparent gaps with no hybridization of either satDNA were also observed in some arrays. Some may be for technical reasons (for example, breakage of single fibers), but we also investigated possible interspersion with a major transposon family. The foldback element Galileo was found to accumulate in the heterochromatic regions of several chromosomes and was more pronounced in the microchromosomes of several species from the buzzatii cluster (Casals et al., 2005). However, PCR experiments (data not shown) using single Galileo primers (G7 and E14 in Casals et al., 2005) and a combination of Galileo and pBuM or DBC-150 primers showed little evidence of interspersion between Galileo and pBuM or DBC-150 arrays in genomic regions spanning 0.1–2 kb.

Characterization of junctions between pBuM and DBC-150 satDNA families

PCR amplification of genomic DNA in D. gouveai, D. antonietae and D. seriema using combinations of DBC-150 and pBuM (α and α/β) primers produced DNA fragments that varied in number and size according to the species and combination of primers tested. In total, we sequenced nine D. antonietae clones derived from six DNA bands and four primer combinations, ten D. gouveai clones derived from another six DNA bands and three primer combinations, and three D. seriema clones derived from two DNA bands and two primer combinations (Table 1). Sequence analysis of clones from the first two species showed that they belong to six different satellite junctions, two of them exclusive to D. antonietae (junctions I and II), and four exclusive to D. gouveai (junctions III–VI) (Table 1). There was no direct relationship between the number of DNA bands, primer combination and the number of junctions. For example, sequences from junction II of D. antonietae were isolated from DNA bands of four different sizes generated with primers DBC-P1 × A2-R, sequences from junctions IV and V came from the same DNA band obtained with primers DBC-P1 × A2-R, and sequences from junctions I, II and III came from distinct DNA bands generated with two different primer combinations (Table 1). These results also confirm that the satDNA junctions are not PCR artifacts.

Table 1 Satellite DNA sequences

Figure 3A shows a schematic representation of the nature and organization of pBuM and DBC-150 repeats involved in each satDNA junction type. DNA spacers were not detected between the two satellites. Instead, the junctions comprise only α/β (pBuM family) and DBC-150 repeats, with a number of adjacent repeats of each satellite flanking the junction site. This situation thus allowed pBuM or DBC-150 primers to anneal at different ‘junction repeats’ (JRs), producing in some cases DNA products of different sizes derived from the same pBuM × DBC-150 junction.

Figure 3

(A) Schematic representation of the α/β (pBuM) and DBC-150 junction repeat units (JRs) found in the six different junctions (I–VI). Neighbor DBC-150 JRs are distinguished by letters (a–f), which are not meant to be equivalent between different junctions. A 20 bp sequence spanning the junction is shown in detail below the representation of each junction. The upper strand represents the actual sequence junction, and the lower strand represents the predicted nucleotide sequence continuation of each repeat reconstructed based on DBC-150 and pBuM consensus sequences. In Junction V, two possibilities for the predicted nucleotide sequences of the DBC-150 JR are given (see also Figure 4). The sequences were shaded with different colors corresponding to each family. (B) Location of the six junctions in an α/β consensus sequence derived from α/β junctions. The β sequence is in bold. (C) Distribution of the six junctions in a DBC-150 junction consensus sequence.

The sequences of junctions were aligned with the monomer sequences of DBC-150 and pBuM to characterize the molecular breakpoints (Figure 3A). Junctions I, IV, V and VI are abrupt transitions between pBuM and DBC-150 JRs. Junctions II and III are characterized each by a different 10 bp tandem duplication of a DBC-150 JR immediately bordering the junction. Each breakpoint occurred at a different position in both pBuM and DBC-150 monomers (Figures 3B and C), suggesting each was an independent recombination event. An alignment (Figure 3A) made with the 20 bp sequence of each junction with the predicted following nucleotide sequence of each repeat assuming no junction showed very low degree of sequence identity between the sequences involved in the junction (from 10% in junction IV to 35% in junction III). In four junctions (I, II, III and VI), 2–3 nucleotides immediately flanking the junction are ambiguous and could be classified as either belonging to a pBuM or a DBC-150 sequence.

None of the three clones from D. seriema obtained with primers DBC-P2 × BPS-F or DBC-P2 × A2 represents a satDNA junction. The clone Sma220/1 showed no significant sequence similarity to pBuM, DBC-150 or any other sequence available in GenBank (July 2008). The two clones Sma370/1-2 contain only DBC-150 neighbor repeats.

Nature and organization of DBC-150 and pBuM junction repeats

The six DBC-150 JRs from D. gouveai and D. antonietae share some nucleotide features (including nucleotide substitutions and indels) that could not be found or are found at low frequencies in a collection of 35 randomly isolated DBC-150 monomers from all seven species of the buzzatii cluster (see nucleotide alignment in Figure 4), indicating that the same pool of DBC-150 variants was involved in the formation of all six satDNA junctions. Among these features is a long, almost perfect, palindrome sequence 5′-CCGGA(TA)nCCGG-3′ (positions 116–130 in Figure 4) present in at least 14 of the 20 isolated DBC-150 JRs (its presence or absence could not be determined in some JRs directly adjacent to the breakpoints, such as those in junctions II and IV). Another hallmark of DBC-150 JRs is a deletion of an approximately 50 bp segment spanning from the end of a repeat unit (positions 141–162) to the beginning of the following repeat unit (positions 1–27) (Figure 4). The average nucleotide variability between neighbor DBC-150 JRs in D. gouveai and D. antonietae is also higher (13 and 12%) compared to that observed between randomly isolated monomers from each species (7 and 10%).
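To make the palindromic motif concrete, the following minimal Python sketch scans a sequence for 5′-CCGGA(TA)nCCGG-3′ with a regular expression and reports the number of TA units in each match. The function name and the test sequence are hypothetical, and treating n as any value of at least 1 is an assumption, since the allowed range of n is not stated here.

```python
import re

# CCGGA, then one or more TA units (assumption: n >= 1), then CCGG.
MOTIF = re.compile(r"CCGGA((?:TA)+)CCGG")


def find_palindrome_motif(seq):
    """Return (start, end, n) for each motif match; 0-based, end exclusive."""
    hits = []
    for match in MOTIF.finditer(seq.upper()):
        n_units = len(match.group(1)) // 2  # number of TA repeats
        hits.append((match.start(), match.end(), n_units))
    return hits


if __name__ == "__main__":
    # Hypothetical fragment carrying a 15 bp version of the motif (n = 3).
    print(find_palindrome_motif("TTCCGGATATATACCGGAA"))
```

A 15 bp match with n = 3 has the same length as the interval spanning positions 116–130 of the alignment; applying the function to each aligned repeat (after removing alignment gaps) would give a presence or absence call per JR.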

Figure 4

Nucleotide alignment containing (a) a DBC-150 consensus sequence derived from 35 DBC-150 monomers, with conserved nucleotides (0% variation across 35 repeats) gray-shaded. (b) Consensus sequences for each neighbor DBC-150 junction repeat found in the six pBuM × DBC-150 junctions (J–I to J–VI) (see Figure 3 for structure of each junction). (c) Consensus sequences of neighbor repeats (a–c) derived from twelve 3-monomer higher-order repeats (HORs) of D. serido (Supplementary material S2). The beginning of the alignment was defined as the beginning of the sequence generated by PCR. The sequence of the primers was removed before the generation of the alignment and their positions are indicated in the margin of the alignment. PF and PR stand for forward and reverse primers (see Table 1 for primer combinations). The position and organization of α/β repeats (αβ, αβα, αβαβ or βα) within each junction is also indicated. Nucleotide sequences from pBuM repeats are indicated by ‘>’ and unknown nucleotide sequences are indicated by ‘?’. The palindromic sequence 5′-CCGGA(TA)nCCGG-3′ (positions 116–130) is shown within a box in the DBC-150 consensus sequence. Asterisks represent deletions and dots indicate identical nucleotides (in relation to CONS-DBC-150).

Nucleotide comparisons and NJ trees showed that α/β (pBuM) JRs in D. gouveai and D. antonietae belong to the same α/β variant type characteristic of each species (α/β I and III in Kuhn et al., 2008; data not shown). Therefore, the two junctions of D. antonietae and the four junctions of D. gouveai probably originated independently after the split of these two species from their common ancestor. Within each species, α/β JRs are indistinguishable from a collection of 7 or 36 randomly isolated monomers from D. antonietae and D. gouveai, respectively. Therefore, in contrast to DBC-150 JRs, no atypical nucleotide features can be related to the α/β JRs.

The DBC-150 junction repeats are genetically related to a DBC-HOR present in D. serido

PCR with DBC-150 primers (DBC-P1/P2) produced DNA fragments corresponding to the sizes of monomers and multimers of DBC-150 repeats in all species of the buzzatii cluster (Kuhn et al., 2007). However, in addition to these DNA fragments, D. serido showed a rather prominent DNA fragment at a size of approximately 400 bp, which roughly corresponds to the size of DBC-150 trimers. We sequenced 12 clones from these DNA bands, derived from two independent PCR experiments. Sequence analysis revealed that they comprise three adjacent DBC-150 repeats (a, b and c), which are arranged in the form of a 3-mer HOR (Supplementary material S2). The average nucleotide variability between the a, b and c subrepeats is 9.4±1.8%, and the average nucleotide variability between the 3-mer HORs is 8±0.9%. The average nucleotide variability of the 3-mer HOR is slightly lower than that observed among random DBC-150 monomers (9.2±1.5%). Sequence analysis of four additional clones of D. serido obtained from an approximately 300 bp PCR DNA band with the same primers revealed dimers that originated by recombination between a, b and c subrepeats (see below).

Consensus sequences derived from each of the a, b and c subrepeats from the 12 D. serido 3-mer HORs are shown in the nucleotide alignment of Figure 4c. The 3-mer HOR of D. serido shares several nucleotide features with DBC-150 JRs, including the 5′-CCGGA(TA)nCCGG-3′ palindromic sequence, the 50 bp deletion and several other nucleotide substitutions that are frequent among DBC-150 JRs of D. gouveai and D. antonietae (Figure 4b). Therefore, the origin of these structural changes predates the cladogenesis of the three species.

Phylogenetic relationships of DBC-150 sequences

The NJ tree in Figure 5a shows that DBC-150 JRs from D. gouveai and D. antonietae are grouped into two main branches, one composed of repeats with the palindromic sequence and another without it. The latter group is clustered in a separate branch within the main branch leading to 35 DBC-150 monomers from all seven species of the buzzatii cluster. Repeats from different junctions and species were clustered together, confirming previous observations of a general lack of species specificity of repeats belonging to the DBC-150 satDNA family (Kuhn et al., 2007).

Figure 5

(a) Neighbor-joining tree containing (1) consensus sequences derived from each DBC-150 junction repeat (JR) present in junctions I, II, III, V and VI isolated from D. antonietae (ANT-DBC) and D. gouveai (GOU-DBC); (2) consensus sequences derived from each a, b and c subrepeats from D. serido 3-mer higher-order repeats (HORs) (SDO-DBC-HOR/a–c) and (3) 35 randomly isolated DBC-150 monomers from D. antonietae (ANT150), D. gouveai (GOU150), D. serido (SDO150), D. seriema (SMA150), D. koepferae (KOE150), D. borborema (BOR150) and D. buzzatii (BUZ150). Partial DBC-150 JR sequences with less than 80 bp were not included in generation of the tree (for example, GOU-DBC/1a-J-IV). Different symbols at the beginning of each sequence were added to facilitate discrimination of repeats from each species and junction. Black symbols represent DBC-150 JRs and D. serido 3-mer HORs; gray and white symbols represent DBC-150 monomers. The branch leading to DBC-150 JRs containing the 11 bp palindromic sequence is shown in bold. (b) Neighbor-Net network with the same 55 sequences shown in (a). Squares represent junction sequences (sequences featuring palindromes are black-shaded) and circles represent a, b and c subrepeats from D. serido 3-mer HORs. In (a) and (b) genetic distances were calculated according to the Kimura two-parameter (K-2P) model.

The a and b subrepeats from the 3-mer HOR of D. serido were clustered in the same branch containing DBC-150 JRs presenting the palindrome, whereas the c subrepeat was clustered with the DBC-150 monomers (Figure 5a). This result raises the possibility that the satDNA junctions involved DBC-150 repeats with a certain level of higher-order organization.

Five (out of six) repeats from junction II were clustered in two specific branches. One branch comprises repeats a, c and e, whereas the other comprises repeats b and d (Figure 5a). Such an alternating pattern of interrepeat variation suggests that the DBC-150 JRs from junction II (panels a–f in Figure 3A) are members of a 2-mer HOR. In fact, the average nucleotide variability among all DBC-150 JRs from junction II is 12.2%, but falls by half (to 6.0%) when the 2-mer HOR (ab, cd, ef) is taken as the repeat unit. Evidence of HORs was not detected in the remaining satDNA junctions, although we cannot exclude the possibility that the number of adjacent DBC-150 JRs was too small for their detection.
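The reasoning behind this comparison can be sketched as follows: if adjacent monomers alternate between two diverged types, the mean pairwise difference among single monomers is inflated by between-type comparisons, whereas the mean among consecutive 2-mer units is not. The minimal Python example below uses invented 10 bp toy monomers and uncorrected p-distances; both are assumptions for illustration and do not reproduce the actual junction II data or the distance measure used in this study.

```python
from itertools import combinations


def p_distance(s1, s2):
    """Proportion of differing sites between two aligned sequences."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)


def mean_pairwise(seqs):
    """Mean p-distance over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def group_units(monomers, k):
    """Join consecutive monomers into k-mer units, dropping any remainder."""
    usable = len(monomers) - len(monomers) % k
    return ["".join(monomers[i:i + k]) for i in range(0, usable, k)]


# Toy array in genomic order: repeats a, c, e share one type; b, d, f the other.
toy_array = [
    "AAAAAAAAAA",  # a
    "CCAAAAAAAA",  # b
    "AAAAAAAAAT",  # c
    "CCAAAAAAAA",  # d
    "AAAAAAAAAA",  # e
    "CCAAAAAAAG",  # f
]

if __name__ == "__main__":
    print(f"monomers: {mean_pairwise(toy_array):.3f}")
    print(f"2-mer units: {mean_pairwise(group_units(toy_array, 2)):.3f}")
```

In this toy array the mean difference drops from 0.18 among monomers to about 0.067 among 2-mer units, the same qualitative signature as the drop from 12.2% to 6.0% reported for junction II.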

The same DBC-150 sequences were used to construct a phylogenetic network based on the Neighbor-Net method. The resulting network (Figure 5b) displayed splits representing the same main groups of repeats illustrated in the NJ tree. The presence of several boxes throughout the network illustrates conflicting splits more likely generated by recombination events between repeats.

Recombination between satDNA repeats

Recombination events between neighbor DBC-150 satDNA repeats were detected seven times in our data. All the recombination events resulted in total reciprocal exchanges. In the satDNA junctions, the events involved repeats e/c (Rec1) and e/f (Rec2) from junction II (D. antonietae), and b/d (Rec3) from junction III (D. gouveai) (Supplementary material S1). The average pair-wise variability values between repeats involved in Rec1, Rec2 and Rec3 are 5.2, 13.7 and 6.1%, respectively. Three clones representing 3-mer HORs of D. serido showed recombination between b/c (Rec4-6), and one clone showed recombination between a/b repeats (Rec7) (Supplementary material S2). The average nucleotide variability between repeats involved in Rec4-6 is 10.9% and in Rec7 is 14.4%.

At least one of the junctions is repeated

The seven clones spanning the junction II of D. antonietae (Table 1) represent at least six different sequences showing variability values ranging from 0.0 to 3.2% (average 0.9±0.2%). As the clones were derived from a strain founded by several females (collection date 22 March 1998), they may represent either different alleles or different paralogous sequences.

The seven clones spanning the junction III of D. gouveai were isolated from a strain founded by a single female collected recently from nature (collection date 27 February 2002). Either one (homozygous) or two (heterozygous) different types of sequences would be expected if they represent different alleles. Notably, these seven clones represent seven different sequences showing pair-wise variability values ranging from 0.3 to 4.4% (on average 1.9±0.4%), so there are at least five copies of junction III in the genome of D. gouveai. The higher nucleotide variability of junction III compared to junction II clones (Figure 6) probably reflects a more ancient origin of junction III.

Figure 6

Graph showing the frequency distribution (y axis) of pair-wise nucleotide variability values (x axis) between full DBC-150 × pBuM junction sequences, calculated separately for D. gouveai (junction III) and D. antonietae (junction II).

Discussion

Interspersion of satellite DNA families

Using FISH on metaphase and interphase chromosomes, we showed that the localization of DBC-150 and pBuM satDNA families overlaps in some regions of the microchromosomes of D. gouveai and D. antonietae, and high-resolution FISH revealed extended DNA fibers containing interspersed repeats from both the pBuM and DBC-150 families (Figures 1 and 2). Notably in D. gouveai, mutual arrangements of DBC-150 and pBuM repeats were found in regions spanning more than 100 kb of the chromatin fiber. The species D. seriema showed the lowest level of interspersion of both satellites (Figures 1 and 2).

Nonhomologous satDNA families present in the same species and chromosomal region(s) are typically organized in separate arrays. For example, in D. melanogaster, different satDNA families are often located near one another on the same chromosome (Lohe et al., 1993), and sequence analysis of a 31 kb DNA centromeric region from the X-derived Dp1187 minichromosome revealed that the highly repeated satellites AATAT and TTCTC are organized in uniform arrays directly juxtaposed, with neither interspersion nor intervening DNA (Sun et al., 2003). In humans, the centromeric region of each chromosome contains at least one α-satellite DNA together with a unique combination of other satDNAs. These satDNAs are organized in discrete and uniform arrays that can be directly adjacent to each other, such as satellites 1, 3 and β in chromosome 22 (Shiels et al., 1997).

Reports of interspersion of satDNAs usually involve repeats with marked homologies, such as satellites I and II in the beetle Tribolium madens (Zinic et al., 2000), the pBuM-1 and pBuM-2 satellite subfamilies in the D. buzzatii cluster (Kuhn et al., 2008) or psr2 and psr18 in the wasp Nasonia vitripennis (Reed et al., 1994). Interspersion of nonhomologous satellites is rarely reported, although it was observed between the pSc200 and pSc250 satellites in subtelomeric regions of rye chromosomes (Alkhimova et al., 2004).

Most studies on satDNAs involve the characterization of repeat units, their chromosomal location and their evolutionary dynamics across different species. In contrast, there are only a few data sets on the structure and organization of satDNA arrays at the molecular level (Shiels et al., 1997; Zinic et al., 2000; Alkhimova et al., 2004; Lee et al., 2006). It remains to be investigated whether the interspersion noted here is a regular occurrence or is associated with some peculiar features of the Drosophila microchromosomes, such as their highly heterochromatic nature, suppressed meiotic recombination and the presence of few genes (reviewed by Riddle and Elgin, 2006), which might allow a more complex interplay between satDNAs without deleterious effects.

Illegitimate recombination and the origin of satDNA × satDNA junctions

In mammals and plants, in vivo and in vitro experiments showed that efficient recombination between sequences requires a minimum of approximately 200 bp of uninterrupted homology (Rubnitz and Subramani, 1984; Opperman et al., 2004). Below this threshold, sequence divergence has an inhibitory effect on homologous recombination, which can decrease threefold with a divergence level as low as 0.16%. In our data, all recombination products between DBC-150 direct repeats involved sequences with more than 5% sequence divergence (reaching 14% in Rec7). In another study, Okumura et al. (1987) detected recombination between human α-satellite DNA repeats (171 bp long) showing 20–30% sequence divergence. Therefore, recombination among neighboring satDNA repeats seems to require less homology than in other systems.

Very little sequence identity (<35%) exists over 20 bp regions spanning the α/β × DBC-150 junctions, and the α/β and DBC-150 JRs do not share any segment of 20 bp with more than 70% sequence identity, ruling out the involvement of homologous recombination in generation of the satDNA interspersion. pBuM and DBC-150 repeats do not have structural, sequential or terminal features of transposons, so we do not consider transposase-mediated transposition as a mechanism for amplification of repeats or generation of the interspersion patterns. Therefore, our results indicate that alternative illegitimate recombination mechanisms created the satDNA junctions.
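A window-based identity screen of the kind referred to above can be sketched as follows. This is an illustrative, brute-force version that compares every ungapped 20 bp window of one sequence with every ungapped 20 bp window of the other on the same strand; the function name, the same-strand restriction and the absence of gaps are assumptions of the sketch, not a description of the software we used.

```python
def best_window_identity(seq_a: str, seq_b: str, window: int = 20):
    """Return (identity_percent, start_a, start_b) for the best-matching
    pair of ungapped windows of length `window`.

    Ties are resolved in favor of the first pair encountered.
    """
    a, b = seq_a.upper(), seq_b.upper()
    if window < 1 or len(a) < window or len(b) < window:
        raise ValueError("both sequences must be at least one window long")
    best = (-1.0, 0, 0)
    for i in range(len(a) - window + 1):
        win_a = a[i:i + window]
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(win_a, b[j:j + window]))
            identity = 100.0 * matches / window
            if identity > best[0]:
                best = (identity, i, j)
    return best


# Toy sequences: the best 4 bp window pair is the shared "CCCC"
print(best_window_identity("AAAACCCC", "GGCCCCGG", window=4))  # (100.0, 4, 2)
```

Under this scheme, two repeats would be said to lack a 20 bp segment above a given identity threshold when the returned maximum falls at or below that threshold. For monomers of a few hundred base pairs the quadratic cost is negligible.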

Illegitimate recombination can occur by different mechanisms (reviewed by Müller et al., 1999). In the ‘breakage-and-rejoining’ model, free DNA ends can be joined either by direct ligation of blunt ends or by pairing of single-stranded cohesive ends mediated by short stretches (2–6 bp) of homology (microhomologies). In the ‘copy-choice’ model, illegitimate recombination can occur during DNA replication, when the polymerization of a new DNA strand is interrupted at certain sites and continues at a different template position. Such slippage events can be facilitated by the presence of palindrome sequences forming secondary structures, such as that found in one of the junctions between satellites pSc200 and pSc250 in rye (Alkhimova et al., 2004).

Sequence analysis of six different DBC-150 × pBuM satDNA junctions revealed that they consist of a number of α/β and DBC-150 JRs flanking abrupt transitions where there are no intervening bases between the two satellite sequences. The absence of intervening bases in our junctions favors the ‘breakage-and-rejoining’ model as the main mechanism creating at least the six isolated satellite junctions. This is further supported by the fact that microhomologies between DNA ends were found in four junctions (I, II, III and VI). Illegitimate recombination is likely to occur during DNA transcription or replication, when DNA topoisomerases introduce nicks on one or both DNA strands to change the superhelical state of DNA (Wang, 1985), leaving temporarily free 3′ DNA ends. In our data, such satDNA exchanges might have been facilitated by the fact that some arrays are very close to each other (Figures 2a and b).
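To make the notion of microhomology at an abrupt junction concrete, the sketch below models a recombinant as the left part of one parental sequence joined to the right part of another, and reports the bases around the breakpoint that are identical in both parents and could therefore derive from either. The parental sequences, breakpoint coordinates and function name are hypothetical.

```python
def microhomology(left_parent: str, left_break: int,
                  right_parent: str, right_break: int) -> str:
    """Bases at a junction that match in both parental sequences.

    The recombinant is modeled as
        left_parent[:left_break] + right_parent[right_break:]
    and the match is extended in both directions from the breakpoint.
    """
    a, b = left_parent.upper(), right_parent.upper()
    if not (0 <= left_break <= len(a) and 0 <= right_break <= len(b)):
        raise ValueError("breakpoint outside sequence")

    # Extend to the left of the breakpoint
    m = 0
    while (left_break - m > 0 and right_break - m > 0
           and a[left_break - m - 1] == b[right_break - m - 1]):
        m += 1

    # Extend to the right of the breakpoint
    k = 0
    while (left_break + k < len(a) and right_break + k < len(b)
           and a[left_break + k] == b[right_break + k]):
        k += 1

    return a[left_break - m:left_break + k]


# Hypothetical parents sharing "ACGT" around the breakpoint
shared = microhomology("TTGCACGTAAA", 6, "GGGACGTCCC", 5)
print(shared, len(shared))  # ACGT 4
```

An empty string corresponds to a junction with no microhomology, as expected for direct ligation of blunt ends; a result of 2–6 bp falls in the range described for joining through cohesive ends.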

Two of the junctions (II and III) can also be distinguished by the presence of a tandem duplication of a 10 bp segment of the DBC-150 JR, different in each junction, immediately adjacent to the junction. Interestingly, a 9 bp tandem duplication of the human α-satellite monomer was also found in abrupt junctions between this satellite and satellite III (Gaff et al., 1994), pointing to a similar recombination pathway.

All the DBC-150 JRs feature in at least one of the bordering repeats an almost perfect palindrome sequence 5′-CCGGA(TA)nCCGG-3′, which was not found in a collection of 35 DBC-150 monomers (Figures 4 and 5). The involvement of palindromes in recombination events between repeats has been suggested by other studies (Stary and Sarasin, 1992; Reed et al., 1994). The 11 bp gap observed in the region of the palindrome can be explained by the fact that palindromes form hairpin structures that can be deleted during DNA replication, following repair mechanisms (Trinh and Sinden, 1993).
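The sense in which 5′-CCGGA(TA)nCCGG-3′ is an "almost perfect" palindrome can be checked with a short sketch: a sequence is a perfect palindrome when it equals its own reverse complement, and here removal of a single base restores that property. The value n = 2 and the function names are assumptions chosen for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase or lowercase DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def single_deletion_palindromes(seq: str):
    """Positions (0-based) whose deletion yields a perfect palindrome."""
    s = seq.upper()
    return [i for i in range(len(s)) if is_palindrome(s[:i] + s[i + 1:])]


n = 2  # assumed repeat number of the (TA) unit, for illustration only
motif = "CCGGA" + "TA" * n + "CCGG"

print(motif, is_palindrome(motif))         # CCGGATATACCGG False
print(single_deletion_palindromes(motif))  # [4, 8]
```

Deleting either of the two A residues flanking the central TAT leaves a fully self-complementary sequence, which is consistent with a hairpin formed by the CCGG arms.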

We isolated two different pBuM × DBC-150 satDNA junctions of D. antonietae and another four of D. gouveai through PCR. The junctions were found in pBuM repeats, which are characteristic of each species, suggesting that they probably originated after the split of the two species from their common ancestor. Each junction originated by an independent recombination event. Taking into account the interspersion level observed between pBuM and DBC-150 repeats in some arrays (Figure 2), it is reasonable to expect that additional junctions are present in the genome of these two species. First, they might contain divergent pBuM and DBC-150 sequences that could have escaped amplification because of our PCR settings (primer composition and annealing temperatures). Second, different junctions might be present in the same DNA band, as we showed twice in our data (Table 1). Finally, we also found evidence that at least one junction (junction III of D. gouveai) is repeated. Unequal crossing-over between homologous sequences flanking the junction is the most likely mechanism responsible for the propagation of satDNA junctions and the consequent increase in interspersion. Examples of reiteration of specific satDNA × satDNA junctions are scarce in the literature. In humans, a junction between satellite III and α-satellite DNAs is present in at least three chromosomes (Gaff et al., 1994), and in the beetle T. madens, there are multiple copies of a junction between satellites I and II (Mravinac and Plohl, 2007).

Evolutionary dynamics of satDNA × satDNA junctions

Unequal crossing-over between sister chromatid repeats has been proposed to be a mechanism that promotes both homogenization and amplification/deletion of satDNA repeats during evolution (Smith, 1976). The model also predicts that repeats immediately adjacent to satDNA × satDNA or satDNA × non-satDNA junctions should present higher levels of sequence divergence compared with repeats from more central regions of the array, where unequal crossing-over is more likely to happen. There are very limited data sets in the literature concerning the analysis of satDNA junctions in this context, but they generally support the expectations of the model (McAllister and Werren, 1999; Schueler et al., 2001; Mravinac and Plohl, 2007). The homogeneity of a junction between the AATAT and TTCTC satellites in the D. melanogaster Dp1187 minichromosome is an exception (Sun et al., 2003). Other molecular mechanisms, such as replication slippage, might be important in the homogenization and amplification of simple satDNA repeats.

In our data, α/β repeats bordering the pBuM × DBC-150 junctions from D. antonietae and D. gouveai cannot be distinguished from a collection of randomly isolated α/β monomers from each species, indicating similar rates of mutation and homogenization in JRs and non-JRs. In contrast, neighboring DBC-150 JRs showed on average higher interrepeat variability compared with randomly isolated DBC-150 monomers from each species. Although we lack evidence of functionality for pBuM or DBC-150 sequences, it is unlikely that selection has played a major role in the conservation of α/β JRs present in all six independent junctions. A possible explanation is that some satDNA junctions we sequenced came from small ‘islands’ of DBC-150 repeats located within long pBuM arrays. FISH on DNA fibers supports this pattern of organization. In such a situation, the homogenization rate of α/β JRs is not expected to differ from the overall rate of homogenization in the array, because they are separated by only a small cluster of DBC-150 repeats. In contrast, unequal exchanges and consequently homogenization between adjacent DBC-150 JRs are expected to occur at a lower rate, with mutations tending to accumulate at a higher rate (Dover, 1982). We detected three recombination events between DBC-150 JRs (Supplementary material S1), one of which involved only the last two repeats before the junction (e/f in junction II). However, such recombination events could have happened before the DBC-150 transposition event and the formation of junction II. In three junctions where the number of neighboring JRs enabled interrepeat comparisons, we verified that the last DBC-150 JR before the junction is also the most divergent repeat in two cases (Figure 7). Therefore, the dynamics of DBC-150 JRs is generally in agreement with the unequal crossing-over model, but the dynamics of α/β JRs is clearly not.

Figure 7

Neighbor-joining tree containing DBC-150 junction repeats (JRs) a–f from junction II (a), b–e from junction III (b) and a–c from junction V (c). The last repeat from each junction is gray-shaded. Bootstrap values >70% are indicated at branches; the scale bar represents genetic distances calculated according to the Kimura two-parameter (K-2P) model.
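For reference, the K-2P distance named in the legend of Figure 7 is d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of sites differing by a transition and by a transversion, respectively. The sketch below is a minimal implementation of this standard formula for two aligned sequences; skipping sites that are not A, C, G or T in both sequences is an assumption of the example, and the function name is illustrative.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(a: str, b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in BASES or y not in BASES:
            continue
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0.0 or w2 <= 0.0:
        raise ValueError("sequences too divergent for the K-2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# One transition in ten sites (toy sequences)
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))  # 0.1116
```

A matrix of such distances between all pairs of repeats is the input from which a neighbor-joining tree is built.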