Integrative genetic map of repetitive DNA in the sole Solea senegalensis genome shows a Rex transposon located in a proto-sex chromosome

Repetitive sequences play an essential role in the structural and functional evolution of the genome, particularly in the sex chromosomes. The Senegalese sole (Solea senegalensis) is a valuable flatfish in aquaculture, although few studies have addressed the mapping and characterization of its repetitive DNA families. Here we analyzed the content of Simple Sequence Repeats (SSRs) and Transposable Elements (TEs) in fifty-seven BAC clones (spanning 7.9 Mb) of this species, located on chromosomes by the multiple fluorescence in situ hybridization (m-BAC-FISH) technique. The SSR analysis revealed an average density of 675.1 loci per Mb and a high dinucleotide coverage (59.69%), with ‘AC’ being the most abundant motif. An SSR-FISH analysis using eleven probes was also carried out, and seven of the 11 probes yielded positive signals. ‘AC’ probes were present as large clusters in almost all chromosomes, supporting the bioinformatic analysis. Regarding TEs, DNA transposons (Class II) were the most abundant. In Class I, LINE elements were the most abundant, and the hAT family was the most represented in Class II. The Rex/Babar subfamily, observed in two BAC clones mapping to chromosome pair 1, showed the longest match. This chromosome pair has recently been reported as a putative proto-sex chromosome in this species, highlighting the possible role of the Rex element in the evolution of this chromosome. In the Rex1 phylogenetic tree, the Senegalese sole Rex1 retrotransposon could be associated with one of the four major ancient lineages in fish genomes, the lineage that includes O. latipes.

using the library plate number, and columns and rows coordinates. Clones were used in m-FISH experiments and in repetitive sequence analysis. BAC clones were sequenced as described in Garcia-Cegarra et al. 33 . Briefly, DNA from the S. senegalensis BAC genome library was isolated and purified using the Large-Construct Kit (Qiagen, Hilden, Germany), and then digested with Hae II and Rsa I enzymes (20 U). A total of 454 sequencings were performed according to supplier's recommendations.
BAC sequences from another twenty-five BAC clones previously described 28,29,33 were used for the repetitive sequences study and integrated mapping analysis. BACs with several chromosome locations were counted as many times as they were localized. Overall, sixty-four BAC clone sequences have been analyzed in this work, integrating information about their chromosome localization, number and distribution of SSRs and TEs (Accession Numbers AC278047-AC278120).
The experimental procedures were in accordance with the recommendation of the University of Cádiz (Spain) for the use of laboratory animals (https://bit.ly/2tPVbhY) and the Guidelines of the European Union Council (86/609/EU). The experiment was authorised by the Ethics Committee of the University of Cádiz (Spain).
FISH analysis. Chromosome preparations. Chromosome preparations were made according to Cross et al. 35 . Briefly, 2-3 day-old S. senegalensis larvae were pretreated with 0.02% colchicine for 3 h. Then they were subjected to hypotonic shock with KCl (0.4%) and finally fixed in a freshly-prepared solution of absolute ethanol:acetic acid (3:1). Larvae were homogenized in Carnoy, and the preparations were then dropped onto wet slides and placed on a hot plate with damp paper to create the necessary moisture for a good spread of the chromosomes 29 .
mBAC-FISH. BAC clones labeling was carried out with a first amplification by DOP-PCR, followed by a conventional PCR for labeling, as described previously in Garcia Angulo et al. 28 . Three different fluorochromes were used: Texas red (Thermo Fisher Scientific, USA), fluorescein-isothiocyanate (FITC) (Enzo, USA), and diethyl-aminocoumarin (DEAC) (Vysis, USA). The chromosomes were pretreated with pepsin and fixed in formaldehyde. Finally, the chromosome preparation was dehydrated with ethanol series and air-dried before hybridization. Hybridization was done according to Portela-Bens et al. 29 .
Bioinformatic analysis. SSR and TE analysis. After determining the chromosome location of BAC clones using the FISH technique, the genomic sequences obtained from those clones (taking into account several multi-loci situations) were loaded into a local pool. A configuration file was used together with the Perl script MISA (Microsatellite identification tool) 37 . DNA sequences were then searched for both perfect and compound microsatellites, with a basic motif of 2-8 bp. Only motifs of 1 to 6 nucleotides were considered, and the minimum number of repeat units was defined as 10 for mononucleotide repeats, 6 for dinucleotide repeats, and 5 for tri-, tetra-, penta- and hexa-nucleotide repeats. The maximum number of bases interposed between two SSRs in a compound microsatellite was set at 100. A homology-based approach using RepeatMasker 38 with the Repbase database (release 23.07) was also applied. Analysis of TE distribution in the Senegalese sole genome was made possible by combining the BAC clone positions obtained from the FISH technique with the coordinates of the TE elements reported by the RepeatMasker software. Statistical analysis to determine the frequency and distribution by chromosome of both TE and SSR elements was done using SPSS software (v17.0).
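The repeat thresholds described above can be illustrated with a minimal sketch. This is not the MISA script used in the study: it is a hypothetical regular-expression search (the function name `find_perfect_ssrs` and the dictionary `MIN_REPEATS` are introduced here only for illustration) that applies the same minimum repeat numbers to perfect microsatellites and does not attempt to merge compound microsatellites.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text:
# 10 for mono-, 6 for di-, 5 for tri- to hexa-nucleotide motifs.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Illustrative only: compound microsatellites (SSRs separated by up to
    100 bp) are not joined, unlike in the MISA analysis described above.
    """
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One captured motif followed by at least (min_rep - 1) copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC"),
            # so each locus is reported once under its shortest motif.
            is_redundant = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if not is_redundant:
                hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


# Toy sequence: (AC)8 and (A)12 pass the thresholds, (AGC)4 does not.
example = "TTG" + "AC" * 8 + "GGT" + "A" * 12 + "CCG" + "AGC" * 4 + "TT"
ssrs = find_perfect_ssrs(example)
```

In this toy sequence the (AGC)4 run is ignored because trinucleotide loci need at least five repeat units, which shows how the thresholds shape the loci counts reported in the Results.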
Phylogenetic analysis. In order to generate the phylogenetic tree for the Rex retrotransposon, fish Rex1 sequences were downloaded from Repbase (GIRI Repbase, https://www.girinst.org/). In addition, the BLASTn algorithm 39 was used in the Ensembl database (https://www.ensembl.org) to find homologies with sequences matching the S. senegalensis Rex1 element, and the matched sequences were also used. One hundred and twenty-five fish sequences were included in the phylogenetic tree. All sequences were then aligned in MAFFT software 40 using an iterative method. To eliminate poorly-aligned positions and divergent regions of DNA, the Gblocks server was used, and different options for a less stringent selection (allowing smaller final blocks, allowing gap positions within the final blocks, and allowing less-strict flanking positions) were applied to the analysis. Then the SMS program (Smart Model Selection) was applied to determine the best-fit phylogenetic model 41 and, finally, the PhyML 3.0 software 42 was used to run the model. The resulting best-fit model was GTR + G + I. The proportion of invariable sites was 0.012, the number of substitution rate categories was 4, and the estimated Gamma shape parameter was 1.389. The statistic used for model selection was the Akaike information criterion (AIC), the value of which was 235939.11, and the log-likelihood (lnL) was -117712.55986. Branch support was tested by the fast likelihood-based method using aLRT SH-like 43 . Tree editing was carried out using MEGA version 7 44 .
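As a worked check of the model-selection figures above, the AIC can be recomputed from the reported log-likelihood with AIC = 2k - 2 lnL. The parameter count k used here is an assumption, not a value stated in the text: it supposes an unrooted binary tree of 125 sequences (2n - 3 branch lengths) plus the free parameters of GTR + G + I with estimated base frequencies.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 lnL."""
    return 2 * n_params - 2 * log_likelihood


# Assumed parameter count (not given in the text).
n_taxa = 125
branch_lengths = 2 * n_taxa - 3   # unrooted binary tree: 247 branches
gtr_rates = 5                     # six exchange rates, one fixed
base_frequencies = 3              # four frequencies summing to 1
gamma_shape = 1                   # +G
prop_invariable = 1               # +I
k = branch_lengths + gtr_rates + base_frequencies + gamma_shape + prop_invariable

ln_l = -117712.55986              # log-likelihood reported in the text
aic_value = aic(ln_l, k)
print(k, round(aic_value, 2))
```

Under these assumptions k is 257 and the recomputed AIC is about 235939.12, which agrees with the reported 235939.11 up to rounding of the last digit.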
Multiple hybridization analysis showed four clones producing a single signal and not co-localizing with other clones. Specifically, BAC9-J4 presented a signal in the small metacentric chromosome pair 3. The other three BACs appear in acrocentric chromosomal pairs: BAC15-I19 in pair 10; BAC36-M2 in chromosomal pair 18; and BAC4-M14 gives a signal in the chromosomal pair 20.
The hybridization signals of 5 new BACs were observed in the medium-sized metacentric chromosome pair 2. In one arm appear five BACs: 4D-15, 52-G10 and 4C-5, which co-localize with two BACs previously detected by Portela-Bens et al. 29 , namely 6-P22 and 19-J21, the latter also giving a signal in the acrocentric pair 15. In the other arm, the BACs 36-I3 and 36-K1 give signals; these co-localize with BAC 21-O23, which also gives a single signal in the acrocentric pair 14, as described in Portela-Bens et al. 29 .
In six acrocentric chromosomal pairs, signals from new BAC clones were detected co-localizing with others already described. In pair 12, BACs 35-D17 and 13-F2 co-localize with BAC30-J4, located in an almost centromeric position and also detected on chromosome 4 32 . In chromosome pair 13 we detected the BAC29-D4 signal, also located in a more centromeric position, which co-localized with BAC8-07, itself also localized in the long arm of the subtelocentric chromosome 7 29 . In pair 15 we found two signals, corresponding to BACs 4-F-12 and 36-E3, which co-localize with BACs 19-J21 29 and 16-E36; the latter also localizes in the large metacentric chromosome 1 and in the long arm of the subtelocentric chromosome 6 28 . In chromosomal pair 16 we detected the signals corresponding to BACs 52-E7 and 30-P17, which co-localize with BAC9-N8 27 . In chromosomal pair 19 we detected the signals of BACs 31-C1 and 13-F4, which co-localize with BAC12-K6 27 ; and finally, in pair 21 we found the signal of BAC 63-A3 co-localizing with the signal of BAC30-H22 29 .

FISH mapping of SSRs in S. senegalensis.
To study the distribution of SSR sequences in the S. senegalensis genome, 2 mono-, 2 di-, 4 tri-and 2 tetra-nucleotide probes were used (Table 1). Seven out of the 11 probes yielded positive signals: (A) 20 , (C) 20 , (AC) 10 , (AG) 10 , (GCA) 5 , (GACA) 4 , (GATA) 4 (Fig. 3). (AC) 10 probe presented the largest and most intense signals. This SSR was found in subtelomeric position both in larger (metacentric and submetacentric) and acrocentric chromosomes. This distribution was similar to the location of the (GACA) 4 probe, and in a smaller quantity, the GATA repeats. The (AG) 10 probe displayed a dispersed pattern of FISH

NGS analysis of SSRs and TEs in the S. senegalensis BAC clones.
To study the number, distribution and abundance of microsatellites in S. senegalensis, fifty-seven BAC clones from a genome library were analyzed with MISA software. Twenty-three of them had been sequenced previously [27][28][29] (Table S1). The 57 clones comprise 6.9 Mb. As described in the BAC-FISH results, some BACs were localized in two or more chromosomes, so these were included as many times as they appear in the S. senegalensis chromosomes. Taking this into account, the total number of BACs used in the SSR analysis was 64, and the total sequence length analyzed was 7.9 Mb. The number of SSR loci observed was 5330, comprising 1.27% of the genome analyzed and presenting a total of 53505 repeat units. The average number of loci per Mb was calculated as the total number of identified loci (5330) divided by the length of the BAC sequences analyzed (7.9 Mb), normalized by Mb. On average, 675.1 SSR loci per Mb were found in the S. senegalensis genome. The coverage, calculated as the quantity of SSR sequence (bp) divided by the length of the BAC sequences analyzed (7.9 Mb) and again normalized by Mb, was 12716.63 bp. With respect to the motif length of microsatellite DNA, the dinucleotide motif showed the largest number of identified SSR repeats, with almost thirty thousand repeats (29968) in the Senegalese sole BAC clones, followed by mononucleotide repeats (16246). The mean number of repeats per locus was higher for the mononucleotide motif than for the dinucleotide motif (11.63 and 10.33 respectively). The analysis also showed a high level of dinucleotide coverage (measured as nucleotides per Mb sequenced), with an abundance of 59.69%. The mononucleotide and trinucleotide abundances presented lower values (16.18 and 16.32% respectively) than dinucleotides (59.69%) (Table 2).
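The normalization described above can be reproduced with a short calculation. The sketch below uses only the figures reported in this paragraph; the helper name `per_mb` is introduced here for illustration. Note that dividing 5330 loci by the rounded length of 7.9 Mb gives about 674.7 loci per Mb, so the reported 675.1 presumably reflects the unrounded sequence length (about 7.895 Mb), which is an inference rather than a value given in the text.

```python
def per_mb(count, length_mb):
    """Normalize a count (loci or bp) by the sequence length in Mb."""
    return count / length_mb


total_loci = 5330            # SSR loci reported in the text
length_mb = 7.9              # rounded total BAC sequence length (Mb)
reported_density = 675.1     # loci per Mb reported in the text
coverage_bp_per_mb = 12716.63  # bp of SSR per Mb reported in the text

density_from_rounded_length = per_mb(total_loci, length_mb)
implied_length_mb = total_loci / reported_density
coverage_percent = coverage_bp_per_mb / 1_000_000 * 100

print(round(density_from_rounded_length, 1))  # loci per Mb with 7.9 Mb
print(round(implied_length_mb, 3))            # length implied by 675.1 loci/Mb
print(round(coverage_percent, 2))             # SSR share of analyzed sequence
```

The last value, about 1.27%, matches the fraction of the analyzed genome that the text attributes to SSRs.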
When microsatellite abundance per motif length class is studied, it can be seen that the mononucleotide "A" is rather more abundant than "C" (81.3% vs 18.7%). In the dinucleotide class, "AC" was the most abundant in the genome analyzed (67.7%). The most abundant trinucleotide motifs were "AAT" and "AGC" (28.4 and 21.8% respectively) ( Table 3).
After positioning BACs on chromosomes by means of FISH, the location and genome abundance of SSRs (measured as bp of SSR per Mb) could be studied (Fig. 4). Chromosome 17 showed the highest SSR coverage, with more than 41000 bp of SSR per Mb. Chromosomes 1 and 20 showed the lowest SSR coverage (9644 and 5457 bp per Mb). When number of loci was measured, similar results were found: chromosomes 1 and 20 show the lowest values and chromosome 17 the highest (Suppl. File 2).
All BAC sequences, with information about their chromosome position, were also analyzed using RepeatMasker software. After removing simple repeats and artifacts, 4685 BAC clone positions matching known Repbase TE elements were obtained. Results were organized by: Class I (retrotransposons); Class II (DNA transposons); and Other repeat elements (Table 4). As can be observed in Table 4, Class I transposons showed 1549 elements in the genome sampled (BACs sequenced), which represents 33.04% of the TEs in the genome analyzed. Within Class I, 717 elements were identified as LINEs. Within the LINEs we found 14 families, the most abundant in number of elements (660 out of 717 = 92%) being the following: L2 (364), RTE-BovB (112), Rex (88), L1 (63) and Penelope (33). When LINE elements were filtered by length (longer than 1 kb), only the Rex and L2 families (5 loci) showed matching repeats (L2: 1199-2113 bp and Rex: 2551 bp). When filtering for matches longer than 500 bp, again only these two families were found. The DNA transposons (Class II) were the most abundant, with 54.9% of the TEs found, the hAT family being the element with the greatest presence in the S. senegalensis genome: 900 elements and an abundance of 19%. Other repeated elements such as rRNA, tRNA and scRNA show an abundance of 12.06%. Regarding the genome distribution of TE elements by chromosome, results showed a heterogeneous distribution (Fig. 5). Chromosomes 8 and 17 showed the greatest abundance, with more than 450 loci per Mb. The Class II:Class I ratio was 1.86 on average for all chromosomes, with an extreme value (ratio: 5) in chromosome 14 because of the very low Class I TE value.
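The proportions quoted in this paragraph can be checked with simple arithmetic. The sketch below uses only counts and percentages stated in the text; the variable names are illustrative. The overall ratio computed here is taken from the pooled percentages, so it need not equal a per-chromosome average.

```python
# LINE families listed in the text, with their element counts.
line_families = {"L2": 364, "RTE-BovB": 112, "Rex": 88, "L1": 63, "Penelope": 33}
total_lines = 717

top_family_elements = sum(line_families.values())
top_family_share = top_family_elements / total_lines * 100

# Class shares of all TE matches, as percentages reported in the text.
class_i_pct = 33.04
class_ii_pct = 54.9
other_pct = round(100 - class_i_pct - class_ii_pct, 2)

# Pooled ratio of DNA transposons to retrotransposons.
pooled_class_ii_to_i = class_ii_pct / class_i_pct

print(top_family_elements, round(top_family_share, 1))
print(other_pct, round(pooled_class_ii_to_i, 2))
```

The five listed families sum to 660 elements (about 92% of LINEs), and the remainder after Class I and Class II is 12.06%, as stated for the other repeated elements.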
Within the TE elements found in the S. senegalensis genome analyzed (BAC sequences), hAT elements were the most abundant (900 elements, Table 4). Within hAT, elements such as Charlie, Ac and Tip100 were the most frequent (818 out of 900 elements: > 90%). Figure 6 represents the percentage of these elements out of the total hAT elements. As can be observed, more than half of the Class II hAT elements analyzed (51.22%; 461 elements) were hAT-Ac, followed by hAT-Charlie (32.11%; 289 elements), hAT-Tip100 (7.56%; 68 elements) and other minority elements (9.11% cumulative).
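The hAT breakdown above follows directly from the element counts. A minimal sketch, using only the numbers given in this paragraph (the dictionary name `hat_counts` is illustrative):

```python
total_hat = 900
hat_counts = {"hAT-Ac": 461, "hAT-Charlie": 289, "hAT-Tip100": 68}

# Elements not assigned to the three main subfamilies.
hat_counts["other"] = total_hat - sum(hat_counts.values())

# Percentage of each subfamily out of all hAT elements.
hat_percent = {name: round(n / total_hat * 100, 2) for name, n in hat_counts.items()}

for name, pct in hat_percent.items():
    print(f"{name}: {hat_counts[name]} elements, {pct}%")
```

The three main subfamilies account for 818 elements, leaving 82 elements (9.11%) in the minority group.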
After BAC clone sequence analysis, the distribution and abundance of TE elements in the S. senegalensis genome was assessed. Within LINES elements, five of them matched regions longer than 1 kb, with the Rex/Babar subfamily showing the longest one (match length 2551 bp). This Rex family was observed in two BACs localized in chromosome 1 (10-L10 and 5-K5). According to recent literature this chromosome could be a proto-sex chromosome. In this sense, we measured the coverage of Rex elements per chromosome, finding the highest value (7427) in chromosome 1, followed by chromosome 4 (3277) and chromosome 19 (1277). In addition, short sequences from BACs of different chromosomes showed similarities with Rex transposon, having a wide distribution across the genome (Fig. 7).
The Rex/Babar sequence, from BACs localized in chromosome 1, was then used as a query in a BLAST search against other teleost genomes in the Ensembl database. The matches obtained were then extracted as FASTA files and a phylogenetic tree was made (Fig. 8).

Discussion
In the present work, we studied repeated DNA in S. senegalensis using cytogenetic techniques and BAC sequencing. Although the fraction of the Senegalese sole genome studied in the present work is approximately 1.1% of the estimated total size, it can be considered a representative sample to detect the repetitive elements present in the genome of this species 45 . In another species, the pea (Pisum sativum), it has been found that low-pass sequencing of its genome is sufficient to capture the repetitive sequences present in its genome with at least 1000 copies 46 , and the potential of bioinformatic analysis of low-depth sequencing data for investigation of repeats has been further demonstrated in several other studies 47,48 . Consequently, the analysis of BAC sequences, together with the knowledge of their location by BAC-FISH and supported by the results of SSR-FISH, has enabled us to quantify for the first time the number and distribution of the repetitive elements of the genome of S. senegalensis.
In previous studies an integrated genetic map was constructed in the Senegalese sole 27,29 ; this map comprises the sequence and localization of more than 50 BACs. In this study, using the mFISH technique, we determined the chromosome location of 32 new BAC clones and their genome sequences. The main advantage of this approach is that, after BAC hybridization, it allows us to study the distribution of repetitive elements in the various Senegalese sole chromosomes. The results obtained in the present work indicate a frequency of SSRs similar to that in the published transcriptome data 34 . In the genome studied here, dinucleotides were the most abundant motifs (at around 60%), followed by trinucleotides and mononucleotides (both at 16%); in the transcriptome the most abundant microsatellites were dinucleotides, followed by trinucleotides and tetranucleotides in decreasing abundance. The most common SSR motifs in the S. senegalensis transcriptome were AC and GT for dinucleotides (at 74.6%), and in the present study their abundance is slightly lower (at 67%). The most abundant trinucleotides in the transcriptome are AGG and CCT (at 21.5%): this finding differs from that detected in BACs (AAT and CCG, at 28.3%) 34 .
In this study, the source of SSRs are BAC clone sequences that have been located throughout the genome of the sole. A total of 5330 microsatellites were identified based on BAC sequences and comprising 1.27% of the genome analyzed, a value similar to that found in the genome of the fugu puffer fish T. rubripes (1.29%) 49 and slightly less than half that in the green puffer fish T. nigroviridis, where SSRs account for 3.21% of the genome 50 . These data are consistent with the long-standing assumption that microsatellites are present in all the vertebrate and invertebrate species so far studied. The abundance of microsatellites in sole is similar to that found in humans (>1.5%) 49 and a little lower than that found in mouse (2%) 51 and snake (2.8%) 52 .
Although it is widely assumed that the abundance of microsatellites rises with genome size, many exceptions have been recorded in animals and plants 53,54 . The microsatellite frequency described in this work (on average, 675 loci per Mb) is similar to that obtained in Drosophila, with a genome three times smaller (180 Mb), in human, with a genome five times larger (3000 Mb), and in mouse 51 . The relative abundance of length classes of microsatellite motifs exhibits remarkable inter-species variation, but dinucleotides and mononucleotides are predominant in the majority of cases 55,56 . In S. senegalensis the dinucleotide motifs are the most abundant, in a proportion of 59.69%. Next, with lower but similar values, we found the mononucleotides and the trinucleotides (16.18 and 16.32% respectively).
The most abundant dinucleotide motifs are AC and GT, a finding similar to that described for the swamp eel genome 57 and that of human 58 . The AC and GT motifs have been reported as the most frequent SSRs in the intergenic and intron regions of vertebrates 6 and are 2.3 times more frequent than (AT) n, the second most common type of dinucleotide 6 . The most notable trinucleotide repeats are AAT, AGC and AGG, with the AAT motif showing the highest relative abundance; this also occurs in the swamp eel genome 57 . This shows a predominance of A-rich repeats during the evolution of the genome in teleosts. The extent of the repeats is probably affected by their secondary structures and their influence on DNA replication 58 , or it could reflect a genetic adaptation to the aquatic environment during speciation of fish. Trinucleotide, tetranucleotide, penta- and hexanucleotide microsatellites are much less frequent than dinucleotides and are usually present 1 to 5 times less frequently than dinucleotides in the genomic DNA of vertebrates 6,50 . The mononucleotides detected have an abundance similar to that of the trinucleotides (around 16%) and the most abundant motifs are A and T, which account for about 81%. In primates, mononucleotides are represented by A and T motifs and are the most frequent among SSRs 6 . In relation to the distribution of SSRs throughout the genome, using mapped BAC sequencing as sampling, there is some variation among the particular chromosomes, and it is noteworthy that chromosome 1 is one of those with the lowest SSR abundance values.
Using SSR probes in FISH experiments, we observed that some di- and tetra-nucleotide microsatellites produce the strongest FISH signals. The bioinformatic analysis of the BAC clones indicated that the AC motif has the highest relative abundance value. Our SSR-FISH results support this finding: AC shows up in clusters with the brightest and most intense signal in almost all chromosomes. Furthermore, the GACA, GATA and AG elements showed similar patterns after applying the FISH technique to localize them. Hence, these four microsatellites are probably present as an established combination of repetitive elements in the heterochromatin of sole. Mononucleotide probes (A) and (C) were found scattered throughout the chromosomes. Using FISH there was no clear correspondence between the frequency of the microsatellite motif and the intensity of the signal, since in our study the AAT motif, with a frequency of 28%, gave no signal, while the C motif with 18% and AG with 14% gave more intense signals. In fish species such as D. rerio, Rineloricaria latirostris and Steindachneridion scripta, these repetitive sequences tend to be grouped in the telomeric and centromeric regions 59 .
In relation to repeated elements, we have identified a total of 4686, from which, 4121 were TE elements. These TE elements showed a total length of 467144 bp (5.94% of the genome analyzed). When compared with other fish, this proportion is similar to that found in the two smallest reported genomes of teleost fish, the green spotted pufferfish, and the fugu (T. rubripes), with genome sizes of approximately 342 and 393 Mb, respectively, that contain only ~ 6% of their DNA derived from TE 60 . The proportion of TEs in stickleback, cod and European eel, with values of 12-15% of their genome, is twice that observed in the S. senegalensis genome 13 . In the group of tilapia, platyfish, medaka and spotted gar the proportion of TEs is even higher, with values between 20 and 30%; and the proportion observed in the coelacanth is 25% of the genome 13 . The number and proportions of TEs differ widely among genomes of actinopterygian (ray-finned) fishes, especially teleosts. In fact, a large part of the zebrafish genome (~1.4 Gb) consists of TEs (55%) 14 . The abundance of TEs seems to be the main determining factor of genome size in this group 13,61 . However, TEs proportion in the small genome of tetraodon (representing just 7.13% of its genome) and in other vertebrates as birds (TE content values ranging 8-10%) are also close to those found in sole 13,61 .
Considering the genome distribution of TEs by chromosome, our results show a heterogeneous distribution. Class II TEs (DNA transposons) cover almost 55% of the total repetitive DNA found in the S. senegalensis BAC sequences analyzed, which is similar to the 60% of Class II TEs detected in cichlids and somewhat greater than the 39% of the same type of TEs in the zebrafish genome. In S. senegalensis the retrotransposons (Class I) account for 33%, with a coverage of LINEs of 15% (the most abundant, with 717 matches), a coverage of SINEs (short interspersed elements) of 10%, and of LTRs of almost 8%, whereas in cichlids and zebrafish retrotransposons represent less than 12% of each type 14,62 . The DNA transposons (Class II) were the most abundant, with 55% of the TEs found, and within them the hAT family had the highest presence (900 elements and an abundance of 19%) in the S. senegalensis genome analyzed. In particular, two main TIR (terminal inverted repeat) families (hAT and Tc-Mariner), with many subfamilies, constitute the largest fraction of DNA transposons in the sole genome. To a lesser extent, Harbinger has also been detected.
The TIR family of hAT transposons is worth mentioning, given its coverage of 1% of the genome studied. This value represents a coverage ten times higher than that detected in the coelacanth (0.11%) and lungfish (0.1%) 13 . The hAT transposons are also found in the genomes of mammals, including humans, where they are the most abundant DNA transposons and comprise 1.55% (195 Mb) of the total genome 22 . In chicken, values similar to those of fish have been found (0.1%), and the value detected in salamander (0.63%) is also lower than that of sole 13 . The highest value of genome coverage by the hAT superfamily (6.10%) was detected in frog 1 . Few data are available on the role played by the hAT superfamily in fish; however, it is known that none of the hAT elements in the human genome have been active during the last 50 million years 22 . In vertebrates, most hAT transposons are inactive, since host cells have developed the mechanism of vertical inactivation to silence and prevent the deleterious effects of active transposons on genome stability 63 .
We have found a sequence that presents homology with the Rex retrotransposon of many species. The abundance of Rex-type TEs detected mostly in chromosome pair 1 of the S. senegalensis genome raises the hypothesis that this chromosome could be a proto-sex chromosome 29 . It is known that Rex-type transposons are very important in the evolution of the eukaryotic genome, and participate in processes of chromosomal rearrangement 64 and chromosomal sex differentiation [65][66][67] . Several authors have also associated these transposable and retro-transposable elements with chromosomal sex differentiation in groups of fish such as Cyprinodontiformes 68 , Characiformes 69 , and Beloniformes 70 . Indeed, in the characiform Semaprochilodus taeniurus, Terencio et al. 69 observed a significant increase in the size of the W chromosome due to repetitive DNA accumulation, and among these DNA sequences was Rex1.
In O. niloticus, Rex elements are concentrated in the first pair of chromosomes 18 . In this species, the first pair of chromosomes seems to correspond to the sex chromosomes 71 , which possibly originated from fusion processes 72 . The location of the Rex1 elements in chromosome pair 1 could have had some role in chromosomal rearrangements of the S. senegalensis genome, as occurs in O. niloticus 18 .
In S. senegalensis, our SSR-FISH results showed a higher concentration in subtelomeric positions of several probes that are probably present as a combination of repetitive elements in the heterochromatin of sole. This heterochromatin is present in metacentric chromosomes, such as chromosome pair 1. In addition, one of the BACs where Rex1 presented the highest length and abundance values (BAC10-L10), was found in a subtelomeric position in chromosome 1. Hence, this subtelomeric region could be comprised of heterochromatin in which (or adjacent to which) the Rex1 retrotransposon could occur.
It has been described that the preferential position of Rex1, Rex2 and Rex6 genes in heterochromatic regions of the genomes of some fish 73,74 could indicate some mechanism of regulation of these elements that impedes or prevents excessive dispersion and propagation in the genome, since the presence of heterochromatin could be regulating, through epigenetic mechanisms, the dispersion of these sequences without modifying their sequence 75 . Several studies have shown a relationship between the preferential presence of repetitive sequences in sexual chromosomes and heterochromatin regions. Thus, in Harttia carvalhoi (Loricariidae) it has been discussed how the location of the retroelements Rex1, Rex3 and Rex6 in the pericentromeric region of an X chromosome could have influenced its fission, which led to the formation of chromosomes Y1 and Y2 69 .
The first reference to the existence of Rex1 was published by Volff et al. 68 , after finding an insert in a cosmid from the Y sex chromosome of X. maculatus that revealed a sequence encoding a product with similarities to the reverse transcriptase (RT) of non-LTR retrotransposons. That sequence was called Rex1-XimJ 68 . After a wide analysis, the phylogeny of Rex1 sequences was explained by the presence of four major ancient lineages in fish genomes. Lineage 4 contained sequences from O. latipes and O. niloticus, among others. Lineage 4 is observed in all Acanthopterygii, but not in C. carpio, D. rerio or O. mykiss, among others 68 . In the Rex1 phylogenetic tree constructed here, the Senegalese sole Rex1 retrotransposon could be associated with one of the four major ancient lineages in fish genomes, the lineage that includes O. latipes.
One of the hypotheses to explain the wide distribution of lineage 4 of Rex1 in fishes is the possibility of horizontal transfer 68 . Horizontal transfer has been well documented for some DNA transposons and for LTR retrotransposons 76 . The possibility of a horizontal transfer (HT) event between phylogenetically distant species (Perciformes and Batrachoidiformes orders) has been recently reported in fishes 77 . It has also been demonstrated that 5S rRNA genes and retrotransposons can interact with one another 78 , and this interaction might be the cause of the pattern of evolution and the dispersed arrangement observed in some organisms. Therefore, a putative role of the Rex1 retrotransposon, and of its presence in a heterochromatic region of S. senegalensis, in the evolution of this putative proto-sex chromosome 1 should not be ruled out. On the other hand, chromosome pair 15 has also shown a high abundance of Rex1 sequences in the BACs localized in this pair. In a previous work, BAC 19-J21 was also localized in this chromosome, and it carried the SOX9 gene 29 . In the Prochilodontidae fish family, the W chromosome of the species Semaprochilodus taeniurus has significantly increased in size due to the accumulation of repetitive DNAs, like the Rex1 retroelement, with the consequent differentiation of the ZZ/ZW system of sex chromosomes 69 . In that study, one of the W-specific fragments showed high similarity with the transcription factor of the SOX9 gene in T. rubripes. SOX9 is a gene related to sex determination in many organisms and is present in BAC 19-J21 in S. senegalensis. Hence the presence of the Rex1 element in the regions where it occurs, and the role it has played in certain events related to sex determination, must be taken into account in studies of the evolution of the Senegalese sole genome.

Conclusions
Our work represents a first approach to the study of the repetitive elements of the genome of the Senegalese sole (S. senegalensis). The analysis of the location of SSRs allowed the description of large clusters of microsatellites in centromeric and subtelomeric positions, as well as the study of their composition by bioinformatic analysis. These results reflect a prevalence of A-rich repeats during the evolution of this species, as occurs in the genomes of other teleosts. The study of TEs revealed that the most abundant family in the genome of this flatfish is hAT, and identified a Class I transposable element, Rex, in the largest metacentric chromosome pair, recently described as a possible proto-sex chromosome. The presence of this element on this chromosome and its position in a heterochromatin region might have been relevant during the evolution of the chromosome. Our results represent an important advance in understanding the evolution of the S. senegalensis genome through the analysis of the distribution and quantification of repetitive elements and the role that Rex1 may have played in certain events related to sex determination.