Introduction

Sex chromosomes are some of the most dynamic parts of genomes, including their repetitive DNA content (Steinemann and Steinemann 2005; Graves 2008; Matsunaga 2009; Hobza et al. 2017). Across diverse taxa, these chromosomes have originated independently from ordinary pairs of autosomes during evolution, but they present similar evolutionary fates (Bachtrog et al. 2011; Bachtrog 2013; Wright et al. 2016; Charlesworth 2017). Current models suggest that sex chromosomes evolved by acquiring a major sex-determining locus in the proto-Y or proto-W of the heterogametic sex. Next, the accumulation of sexually antagonistic loci, whose fitness consequences are in opposition between sexes, may favor recombination reduction followed by recombination suppression and repetitive DNA accumulation, resulting in mutational decay and gene loss in the sex-limited region of the Y or W chromosomes (Bachtrog et al. 2014; Wright et al. 2016; Charlesworth 2017). The genetic erosion of the Y or W chromosomes may even lead to complete disappearance, or perhaps to translocation to the autosomes (Bachtrog 2013; Blackmon and Demuth 2014, 2015; Blackmon et al. 2017; Daish and Grützner 2019).

Cases of loss of the heteromorphic element (Y or W), leading to the emergence of X0 or Z0 sex systems, are well documented in insects and worms (Bachtrog et al. 2014; Blackmon et al. 2017; Daish and Grützner 2019), yet the persistence of Y or W chromosomes in many insects suggests that they may arise de novo by fusion of the ancestral X or Z with autosomes. This results in newly evolved neo-sex chromosomes (Charlesworth et al. 2005; Veltsos et al. 2008), such as the neo-XY, neo-ZW, and multiple sex systems reported in orthopterans and lepidopterans (Bidau and Marti 2001; Traut et al. 2007; Marec et al. 2010; Castillo et al. 2010a; Blackmon et al. 2017). However, despite many cytological works explaining the repeated evolution of neo-sex chromosomes across insects, a detailed molecular view of how and what kinds of changes occur in the neo-Y is still lacking.

Among grasshoppers (Orthoptera), the sex-determining system X0 in males and XX in females is considered ancestral. However, in representatives of Melanoplinae (Acrididae), which is one of the largest subfamilies of Acrididae grasshoppers distributed in Eurasia and America (Chintauan-Marquier et al. 2011), a high frequency of transition to neo-sex chromosome systems is observed (Castillo et al. 2010a). Among Melanoplinae, the genus Ronderosia is represented by ten valid species, which are morphologically similar. They are commonly found in the Pampas and Atlantic forests, and some species may cause damage to agricultural crops (Cigliano et al. 2014). The highest Ronderosia diversity is found in Argentina, Brazil, Paraguay, and Uruguay, where most species are sympatric in restricted regions (Cigliano et al. 2018; Castillo et al. 2019).

Ronderosia is a monophyletic group in which chromosomal rearrangements led to the origin of simple neo-XY and multiple neo-X1X2Y/X1X1X2X2 systems. In some representatives of Ronderosia, the neo-XY system evolved several times independently by repeated centric fusion of the ancestral X with one of the autosomes. However, some species with the ancestral state of neo-XY, i.e., fusion involving the same autosome pair with the X chromosome, have been documented (Castillo et al. 2019). Moreover, in R. dubia, a multiple neo-X1X2Y sex system evolved by a second centric fusion between the neo-Y and an autosomal pair. In another Ronderosia species, R. bergii, besides the fusion that led to the karyotype 2n = 22 and neo-XY, a large pericentric inversion on the neo-Y and complete heterochromatinization of this element took place (Castillo et al. 2010b; Palacios-Gimenez et al. 2015, 2018). In contrast, the neo-X shows heterochromatic blocks on terminal and pericentromeric regions, and in autosomes, the heterochromatin occupies primarily the pericentromeres. In meiosis, the neo-XY of R. bergii presents synapsis and recombination limited to the distal end of XR and short arm of neo-Y (Castillo et al. 2010a; Palacios-Gimenez et al. 2015).

With the advent of high-throughput sequencing methods and bioinformatics tools, it is possible to identify and characterize much of the repetitive portions of the genomes, allowing for a better picture about the chromosomal organization and evolution of this genomic fraction. Recent works utilizing such approaches in many species have successfully elucidated the evolutionary history of sex chromosomes (Palacios-Gimenez et al. 2017; Lisachov et al. 2019; Rodríguez et al. 2019; Schemberger et al. 2019). Part of the composition of the R. bergii neo-Y chromosome was studied (Palacios-Gimenez et al. 2015, 2018). Using two satellite DNAs (satDNA) as probes, the occurrence of multiple paracentric inversions in the neo-Y resulting in cryptic polymorphism in one natural population from Rio Claro, São Paulo/Brazil was revealed (Palacios-Gimenez et al. 2018). Here we extend earlier works in R. bergii (Palacios-Gimenez et al. 2015, 2018), aiming to more deeply understand the repeat composition and evolutionary history of its intriguing neo-sex chromosomes, and the action of repetitive DNAs in the evolution of sex chromosomes after the establishment of a large chromosomal rearrangement. For this, we characterized the composition of satDNAs and transposable elements (TEs) in this species by combining computational and cytogenetic tools, focused on the differential composition of repeats between sexes. Our data support the accumulation of various satDNA families exclusively on the neo-Y chromosome and low differential accumulation of TEs in both sex chromosomes. Furthermore, we provided compelling evidence for extensive reshuffling of neo-X and neo-Y at an intrapopulation level, resulting from both reorganization of satDNAs and complex chromosomal rearrangements. 
We hypothesize the possible mechanisms involved in the origin of multiple variants of the neo-sex chromosomes, presenting empirical data on the high dynamism of a sex chromosome after the establishment of a large non-recombining state.

Materials and methods

Sample collection and chromosome preparations

Males and females of R. bergii were collected on the campus of São Paulo State University in Rio Claro, São Paulo (SP), Brazil (22°24′45″ S, 47°31′28″ W). Adult males were anesthetized, and the testis follicles were dissected and fixed in modified Carnoy’s solution (3:1, absolute ethanol:glacial acetic acid) to obtain meiotic chromosome preparations. Some females were maintained in captivity until oviposition. About 15 days after egg laying, oothecas were dissected to obtain embryo neuroblasts for mitotic chromosome preparations, according to Webb et al. (1978). Whole animals were stored in 100% ethanol for DNA extraction. The material was stored at −20 °C until use. The slides were obtained by maceration of follicles or embryos in a drop of 50% glacial acetic acid, followed by spreading on a hot plate at 40–45 °C.

Genome sequencing and trimming

We extracted genomic DNA from femurs (saltatory legs) of one adult male harboring the neo-Y variant type II, which is the most common among adults (see Palacios-Gimenez et al. 2018), and of one adult female using the Qiagen DNeasy kit (Qiagen Inc., CA, USA), following the manufacturer’s protocol. The neo-Y variant type II is characterized by the presence of the satDNA Rber248 (new nomenclature RbeSat25-166) proximal to the centromere in the long arm and Rber299 (new nomenclature RbeSat23-285) located in the middle of the long arm (Palacios-Gimenez et al. 2018). The extracted DNAs were sonicated to obtain fragments of ~500 bp and used to generate paired-end genomic libraries as recommended by Illumina. The libraries were constructed following the TruSeq DNA PCR-Free protocol, using the corresponding kit (Illumina Inc., San Diego, CA, USA). The libraries were sequenced on an Illumina HiSeq 4000 platform by Macrogen Inc. (Seoul, Republic of Korea). The sequencing yielded 11,956,092 and 11,577,878 reads of 2 × 101 nt for the male and the female individuals, respectively. We used Trimmomatic software (Bolger et al. 2014) to perform the quality trimming of the reads at Q20 and minimum length of 101 nt. The sequencing data are deposited in GenBank under accession numbers SRX8370569 and SRX8370570.

SatDNA analysis

We used the sequenced genomes for comparative analysis of satDNAs, focusing on differences between sexes in order to obtain sequences enriched on neo-sex chromosomes and to describe their molecular composition. Sequences enriched on neo-Y are expected to be more represented in male (neo-XY) reads than in female (neo-XX) reads, while sequences enriched on neo-X should be more represented in female than in male genomes. The differential enrichment of satDNAs across sexes was estimated by normalizing the abundance of a given repeat in the male genome by its abundance in the female genome and vice versa. A satDNA was considered enriched in one sex when its abundance was 1.1 times that of the other sex or higher. These enriched sequences were selected for cytogenetic analysis, i.e., FISH chromosomal mapping.
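The enrichment criterion described above can be illustrated with a minimal Python sketch. The function name `sex_bias` and the choice to treat a ratio exactly equal to the threshold as enriched are assumptions made for illustration only; the values in the usage line are the total satDNA proportions reported in the Results (2.77% in the male and 2.44% in the female genome).

```python
def sex_bias(male_abundance, female_abundance, threshold=1.1):
    """Classify a repeat by its male/female abundance ratio.

    Hypothetical helper: abundances are genome proportions (any common unit).
    Assumption: a ratio equal to the threshold counts as enriched.
    Returns (label, ratio), where ratio is the larger abundance over the smaller.
    """
    if male_abundance <= 0 or female_abundance <= 0:
        raise ValueError("abundances must be positive")
    male_over_female = male_abundance / female_abundance
    female_over_male = female_abundance / male_abundance
    if male_over_female >= threshold:
        return "male-biased", male_over_female
    if female_over_male >= threshold:
        return "female-biased", female_over_male
    return "unbiased", max(male_over_female, female_over_male)


# Total satDNA content reported in the Results: 2.77% (male), 2.44% (female)
label, ratio = sex_bias(2.77, 2.44)
print(label, round(ratio, 3))
```

With these inputs the ratio is about 1.135, which corresponds to the 13.5% higher satDNA abundance in males mentioned in the Results.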

For satDNA prospection, we used the satMiner protocol (Ruiz-Ruano et al. 2016) available at GitHub (https://github.com/fjruizruano/satminer). The protocol identifies the maximum possible number of satDNAs by several rounds of clustering in RepeatExplorer (Novák et al. 2013) followed by DeconSeq (Schmieder and Edwards 2011) to filter out reads previously assembled in RepeatExplorer. We started this analysis with a random selection of 150,000 read pairs using the rexp_prepare.py script for the male and the female genomes together as an input for comparative RepeatExplorer, using default options suggested in Ruiz-Ruano et al. (2016). Then we searched for putative satDNA clusters by selecting those clusters with a high graph density and a spherical or ring-like shape. Finally, we studied the internal structure of the contigs from those clusters to search for tandem repetitions using Tandem Repeats Finder (Benson 1999) and the dotplot tool implemented in Geneious v4.8 software (Drummond et al. 2009). Sequences found to be arranged in tandem by this analysis were considered satDNAs, which was confirmed by a ladder pattern in PCR amplification. The clustering and filtering steps were repeated five times, adding 2 × 50,000 filtered reads in each iteration, until no new satDNA was detected.
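As a rough illustration of what a tandem arrangement looks like computationally, the sketch below estimates the monomer length of a contig by comparing the sequence with shifted copies of itself, which is the same signal that appears as parallel diagonals in a dotplot. This is a toy stand-in written for this explanation, not the algorithm used by Tandem Repeats Finder or Geneious; the function name `best_period` and the example sequence are hypothetical.

```python
def best_period(seq, min_period=2, max_period=None):
    """Return (period, identity) for the shift with the highest self-identity.

    For each candidate period p, the sequence is compared with itself shifted
    by p positions. A tandem repeat of monomer length p gives an identity close
    to 1.0. Ties are resolved in favor of the shortest period.
    """
    seq = seq.upper()
    n = len(seq)
    if max_period is None:
        max_period = n // 2
    best = (None, 0.0)
    for p in range(min_period, max_period + 1):
        matches = sum(1 for i in range(n - p) if seq[i] == seq[i + p])
        identity = matches / (n - p)
        if identity > best[1]:  # strict '>' keeps the shortest period on ties
            best = (p, identity)
    return best


# Hypothetical contig: six perfect copies of a 7-bp monomer
contig = "ACGGTCA" * 6
print(best_period(contig))
```

Real satDNA arrays contain diverged monomers, so in practice the identity at the true period would be below 1.0 and dedicated tools remain the appropriate choice.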

We searched for homologous satDNAs by performing an all-against-all RepeatMasker (Smit et al. 2013–2015) comparison of the recovered satDNA monomers, using the “rm_homology.py” script (https://github.com/fjruizruano/ngs-protocols). Then, we classified satDNA sequences into superfamilies, families, or variants, and named these according to Ruiz-Ruano et al. (2016). We calculated abundance and divergence of all the recovered satDNA families in the female and in the male genomes using RepeatMasker. For this purpose, we randomly selected 11.5 million read pairs per library (23 million reads per library in total) with the seqtk tool (https://github.com/lh3/seqtk) and aligned them against dimers of satDNA consensus sequences. For smaller satDNAs, several monomers were concatenated until they approximately reached the read length. We used the generated alignment files to estimate the average Kimura 2-parameter distances (K2P) of each satDNA family using the calcDivergenceFromAlign.pl script from the RepeatMasker utility scripts. Next, we calculated the genomic abundance for every satDNA family as the number of nucleotides aligned with the reference consensus sequence divided by the library size. We compared the divergence of satDNAs between sexes by generating repeat landscapes, showing the relative abundance of repeat elements on the Y axis at 1% intervals of K2P distance from the consensus on the X axis. Recently amplified elements are unlikely to have accumulated a large number of mutations that differentiate them from their consensus sequences. The relative divergence thus was calculated as the K2P genetic distance between the satDNA pair.
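The quantities described above can be sketched in a few lines of Python. The K2P formula used is the standard one, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions. Everything else is an illustrative assumption: the function names are hypothetical, the library size is taken to be expressed in nucleotides, and the sketch works on pre-aligned sequences rather than on RepeatMasker output, so it does not reproduce calcDivergenceFromAlign.pl.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def genomic_abundance(aligned_nt, library_nt):
    """Aligned nucleotides divided by library size (assumed to be in nt)."""
    return aligned_nt / library_nt


def repeat_landscape(hits, library_nt):
    """Bin hits into 1% K2P intervals.

    hits: iterable of (k2p_percent, aligned_nt) pairs.
    Returns {bin_start_percent: abundance} for the non-empty bins.
    """
    nt_per_bin = {}
    for k2p_percent, aligned_nt in hits:
        bin_start = math.floor(k2p_percent)
        nt_per_bin[bin_start] = nt_per_bin.get(bin_start, 0) + aligned_nt
    return {b: nt / library_nt for b, nt in sorted(nt_per_bin.items())}


# Hypothetical example: one transition (A/G) in ten aligned positions
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

In a landscape built this way, copies falling in the lowest bins are the least diverged from the consensus, which is how recently amplified elements are recognized in the Results.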

TE characterization and quantification

We used the dnaPipeTE pipeline (Goubert et al. 2015), freely available at GitHub (https://github.com/clemgoub/dnaPipeTE), to estimate TE abundance and divergence across the male and female genomes. Because TEs in grasshoppers are quite abundant and frequently widely spread on euchromatin (see, e.g., Montiel et al. 2012; Palacios-Gimenez et al. 2014), we used a 5× difference in copy number as the threshold to consider TE enrichment in males and females. The sequences that were more than five times as abundant in one genome as in the other were used for FISH chromosomal mapping. DnaPipeTE is a suitable tool for de novo assembly of TEs and uses low-coverage raw reads as input (<1× genome coverage). It uses Trinity (Grabherr et al. 2011) as assembler, followed by automatic annotation and quantification of TEs in sequencing raw reads, together with the Repbase database. DnaPipeTE thus provides a good estimate of the recent TE age distribution in a small sample of sequencing raw reads, unlike RepeatMasker, which quantifies repeats with a wide range of ages in assembled genomes (Goubert et al. 2015). To optimize the amount of sequencing data for dnaPipeTE subsampling of each genome, we selected the male genome and ran dnaPipeTE on subsamples ranging between 500,000 and 1,000,000 reads in intervals of 100,000 reads (6 runs). Among the six runs in the male genome, we selected the subsample yielding the highest contig N50 metric in the Trinity assembly step of dnaPipeTE, as a measure of optimized read subsampling. The optimized read subsample in the male genome (800,000 reads) was then used to run dnaPipeTE on the female genome. We annotated TEs using the -RM_lib option in dnaPipeTE with Arthropoda Repbase as database and an unpublished custom-curated TE library generated from grasshopper genomes. The dnaPipeTE output landscape graph (TE age distribution) depicts the BLASTn divergence distribution between reads and the contigs on which they map.
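The subsample optimization described above relies on the contig N50, which can be sketched as follows. The helper names and the N50 values in the usage example are hypothetical and only serve to show the selection step; they are not results from this study, and the sketch does not call dnaPipeTE or Trinity.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def pick_subsample(n50_by_subsample):
    """Return the subsample size whose assembly has the highest N50."""
    return max(n50_by_subsample, key=n50_by_subsample.get)


# Six subsample sizes, 500,000 to 1,000,000 reads in steps of 100,000
subsample_sizes = list(range(500_000, 1_000_001, 100_000))

# Hypothetical N50 values, for illustration only
example_n50 = dict(zip(subsample_sizes, [410, 450, 480, 520, 500, 490]))
print(pick_subsample(example_n50))
```

Using a dictionary keyed by subsample size keeps the selection step independent of how the assemblies were produced, so the same helper could be applied to N50 values parsed from any assembler report.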

Amplification of repetitive sequences through PCR

We selected the sequences (satDNAs and TEs) with higher male versus female abundance differences to proceed with PCR isolation and FISH mapping (see the next few sections). We used the obtained consensus sequences of each repeat family to design divergent primers for satDNAs and convergent ones for TEs (Supplementary File 1). Divergent primers were designed manually or using Geneious software on conserved regions.

Genomic DNA of males was used to amplify the repetitive elements by PCR using the following mix: 10× PCR Rxn Buffer, 0.2 mM MgCl2, 0.16 mM dNTPs, 2 mM of each primer, 1 U of Taq Platinum DNA Polymerase (Invitrogen, San Diego, CA, USA), and 50–100 ng/μL of template DNA. The PCR conditions included an initial denaturation at 94 °C for 5 min and 30 cycles at 94 °C (30 s), 55 °C (30 s), and 72 °C (80 s), plus a final extension at 72 °C for 5 min. The PCR products were separated by electrophoresis on a 1% agarose gel. The monomeric bands were isolated and purified with the Zymoclean™ Gel DNA Recovery Kit (Zymo Research Corp., The Epigenetics Company, USA), following the manufacturer’s recommendations, and used as a source for reamplification through PCR. PCR products were sequenced by the Sanger method using the service of Macrogen Inc. (Seoul, Republic of Korea) to check the correct amplification of the desired sequences. Sequences are deposited in GenBank under accession numbers MT501156–MT501208.

Fluorescence in situ hybridization

The sequences of each satDNA and TE obtained through PCR were labeled with digoxigenin-11-dUTP (Roche, Mannheim, Germany) or biotin-14-dATP (Invitrogen) by nick translation. The telomeric probe was obtained by non-template PCR using self-complementary primers (TTAGG)5 and (CCTAA)5 according to Ijdo et al. (1991).

Unstained slides of mitotic or meiotic preparations were used for single or two-color FISH following the Pinkel et al. (1986) protocol with modifications (Cabral-de-Mello 2015). Probes labeled with digoxigenin-11-dUTP were detected using anti-digoxigenin-Rhodamine (Roche), while probes labeled with biotin-14-dATP were detected with Alexa Fluor 488-conjugated streptavidin (Invitrogen). After FISH, preparations were counterstained with 4′,6-diamidino-2-phenylindole (DAPI) and mounted in VECTASHIELD (Vector, Burlingame, CA, USA). The slides were observed using an Olympus BX61 microscope equipped with a fluorescence lamp and appropriate filters. The images were documented with a DP70 cooled digital camera in grayscale and then pseudo-colored. Images were merged and optimized for brightness and contrast with Adobe Photoshop CS6. To describe the patterns of satDNA and TE distribution, at least 15 metaphases were analyzed.

Experimental design for analysis of neo-sex chromosome variability through satDNA location

First, we mapped the satDNAs with differential abundance between male and female genomes in meiotic cells (metaphase I) of one adult individual harboring the neo-Y variant type II, which is the most common variant among adults (Palacios-Gimenez et al. 2018). Bearing in mind the variability of the neo-Y chromosome reported by Palacios-Gimenez et al. (2018), we then mapped satDNAs with signals on neo-Y or neo-X chromosomes plus satDNAs Rber248 (new nomenclature RbeSat25-166), Rber299 (new nomenclature RbeSat23-285), and the telomere motif in embryos from the same and from different oothecas to check the variability of sex chromosomes. We used a total of 20 embryos, eight selected randomly from distinct oothecas, and 12 selected from four individual oothecas (three per ootheca). Each embryo was used to produce nine slides, to allow the mapping of all satDNAs and the telomeric probe by two-color FISH. This enabled us to define the position of each satDNA on the neo-Y and neo-X. We then performed an experiment to check the location of the satDNAs relative to each other on the neo-Y chromosome. For this, we used two slides prepared from the same embryo (with the most common neo-Y chromosome), and hybridized and photographed each slide after FISH rounds with different satDNAs. After each FISH experiment, we removed the coverslips and probes by three washes, 5 min each, in 4× SSC and 1% Triton X-100. After that, we refixed the preparations in 3.7% formaldehyde diluted in wash-blocking buffer (0.4× SSC, 0.1% Triton C, and 1% skimmed milk), and then followed the standard FISH protocol.

Results

Bioinformatic satDNA characterization

Computational analysis of repetitive DNAs from male and female genomes revealed the occurrence of a total of 53 satDNA families, all of which were shared between sexes (Fig. 1a, b). Among the 53 satDNA families detected, 31 were differentially enriched across sexes. From these, 27 satDNAs were male-biased, suggesting enrichment on the neo-Y chromosome, while four satDNAs were female-biased, and therefore putatively enriched on the neo-X chromosome (Table 1, Fig. 1e).

Fig. 1: Repeat landscapes from male and female genomes showing abundance and divergence for satDNAs and TEs identified in Ronderosia bergii.
Bar color-coded repeat landscapes for the satDNA families in males (a) and females (b), and for the most abundant TE superfamilies in males (c) and females (d). The graphs represent genome coverage (y axis) for each type of repeats (color coded) in the different genomes analyzed, clustered according to Kimura distances to their corresponding consensus sequence (x axis, K value from 0 to 40, for a, b) or BLASTn divergence (x axis, BLASTn from 0 to 30, for c, d). Copies clustering on the left of the graph do not diverge very much from the consensus sequence of the element, and potentially correspond to recent copies, while sequences on the right might correspond to ancient/degenerated copies. Note the higher amount of satDNA for male genomes, while for TEs, there is almost no difference between the landscapes. Bar color-coded comparative quantitative analysis for satDNA families (e) and TE superfamily variants (f) enriched between sexes. Each column represents 100% of the reads of a specific sequence with proportions in male (blue) and female (red) genomes.

Table 1 SatDNA families recovered from six rounds of RepeatExplorer in male and female genomes of R. bergii and their general characteristics, length, A+T content, number of variants, abundance, divergence, and chromosomal occurrence.

Sequence similarity comparison between the 53 satDNAs did not reveal sequences with similarity higher than 50% that could be grouped into superfamilies. The total satDNA content is 2.77% in the male and 2.44% in the female genome. This means that satDNA abundance was 13.5% higher in males than in females. The satDNA monomer length varied from 11 bp to 748 bp, and A+T content was on average 58.90% (range 36.40 to 79.40%). Each satDNA family was named in decreasing order of abundance based on the male genome. The ten satDNAs previously identified by Palacios-Gimenez et al. (2018) were renamed to follow this criterion. The most abundant satDNA (RbeSat01-53) represented 0.36% of the male genome and 0.43% of the female genome. The least-abundant satDNA (RbeSat53-21) represented 0.0022% of the male genome and 0.0025% of the female genome. The K2P genetic distance between satDNA families was 8.62% in males and 10.24% in females, on average. The least-divergent satDNA in males was RbeSat36-222 (K2P 1.02%), and in females it was RbeSat52-176 (K2P 1.43%). The most divergent satDNA in males was RbeSat17-175 (K2P 23.72%), while in females it was RbeSat21-209 (K2P 24.54%). More detailed data about satDNAs are summarized in Table 1.

Bioinformatic characterization of TEs

The TEs were much more abundant than satDNAs in both male and female genomes. However, the difference between sexes was lower, suggesting less accumulation on sex chromosomes (Fig. 1c, d). We were able to identify a total of 56 superfamilies, corresponding to 52.74% of male and 53.72% of female genomes. The most abundant TE was an LTR/Gypsy element (corresponding to 8.47% of male and 8.15% of female genomes) (Supplementary File 2). With a threshold of fivefold difference between sexes (see “Materials and methods”), none of the TE families were highly enriched. However, we noted some differences in TE enrichment when specific variants of superfamilies were examined. Following the same criterion used to analyze biased TE superfamilies across sexes, a total of six variants from distinct superfamilies were biased in males and four were biased in females (Fig. 1f). Although we detected enrichment of TEs in one of the sexes, these superfamily variants represented only a small proportion of the genomes, the largest being 0.33% in males and 0.37% in females (Supplementary File 3).

Chromosomal location of satDNAs and TEs with differential abundance between sexes

Aiming to understand the repetitive DNA composition and its organization on sex chromosomes, we selected repeats that were enriched in one of the sexes, according to bioinformatic analysis for FISH mapping. In this way, we selected 27 satDNAs from a total of 31 differentially enriched across sexes. We excluded four of them because they were previously mapped by Palacios-Gimenez et al. (2018), RbeSat08-11 (previous name Rber61), RbeSat12-176 (previous name Rber158), RbeSat37-22 (previous name Rber185), and RbeSat40-16 (previous name Rber370). Although the RbeSat23-285 (previous name Rber299) and RbeSat25-166 (previous name Rber248) were previously mapped, we used them in FISH experiments because they were used by Palacios-Gimenez et al. (2018) to describe neo-Y variants.

To confirm the general satDNA chromosomal location, we mapped the 27 satDNAs on metaphase I of one individual harboring the neo-Y variant type II (Palacios-Gimenez et al. 2018), which is the most common neo-Y variant in adults. This revealed the occurrence of loci on sex chromosomes for 14 satDNAs, whereas five were located exclusively on autosomes and eight did not reveal FISH signals (Table 1). Among the satDNAs with loci on sex chromosomes, we identified seven exclusively placed on neo-Y, none exclusive on neo-X, three located on neo-X and neo-Y, one of which was located on neo-X and autosomes, and one located on both neo-Y and autosomes (Supplementary File 4). Two other satDNAs with exclusive occurrence in neo-Y were also mapped, RbeSat23-285 (Rber299) and RbeSat25-166 (Rber248). To get more precise information about the distribution of the newly detected satDNA families, we mapped 11 families plus the families RbeSat23-285 (Rber299) and RbeSat25-166 (Rber248) on one embryo harboring neo-Y variant type I, the most common variant among embryos (Palacios-Gimenez et al. 2018). Two satDNAs were exclusively located on the short arm, RbeSat19-162 near the terminal region and RbeSat35-374 occupying the entire extension of the neo-Y. RbeSat09-216 and RbeSat10-272 probes mapped to the centromere. One locus of each of the satDNAs RbeSat34-312, RbeSat10-272, RbeSat15-289, and RbeSat18-239 occurred in the middle of the long arm. The other satDNAs (RbeSat21-209, RbeSat25-166, RbeSat03-391, RbeSat44-296, and RbeSat23-285), besides loci of RbeSat34-312, RbeSat10-272, RbeSat15-289, and RbeSat18-239, were grouped on the first half of the long arm, except RbeSat36-222, which presented an additional band in the second half of the long arm. As expected, telomeric repeats mapped to terminal regions of the neo-Y chromosome (Fig. 2).

Fig. 2: FISH mapping of 13 satDNAs and telomere probes on neo-Y chromosome.
Each probe was pseudo-colored with the indicated colors (a, b), and in (c) all the probes were merged in an ideogram showing their relative positions. The neo-Y chromosome in (a′, b′) is aligned by the centromere, and the arrow points to the middle part of the long arm. As the FISH mapping was done in distinct neo-Y chromosomes from the same embryo, the satDNAs RbeSat25-166 and RbeSat15-288 were used to anchor the two neo-Y chromosomes, allowing us to precisely define the position of all FISH signals. This neo-Y chromosome represents the most common one observed in the population, regarding satDNA distribution. Bar 2.5 µm.

The ten TE superfamily variants enriched in one of the sexes detected by computational analysis did not reveal signals in FISH experiments (results not shown). This may partly be due to the low abundance of these sequences in the genomes or scattered organization, which may be below the threshold for FISH resolution.

satDNA landscape of neo-Y

We generated a comparative satDNA landscape for sequences exclusively located on the neo-Y chromosome (based on FISH data), i.e., RbeSat03-391, RbeSat21-209, RbeSat23-285, RbeSat25-166, RbeSat34-312, RbeSat36-222, and RbeSat44-296, to check the differential amplification and divergence of copies between sequences found in female and male genomes. The seven satDNAs, which presented lower K2P divergence in male than in female genomes, were also more abundant in males than in females. In addition, the satDNA RbeSat03-391 showed monomers with low K2P divergence in males, indicating the occurrence of highly divergent copies between sexes (Fig. 3).

Fig. 3: Line color-coded landscape showing the temporal accumulation of seven satDNA families exclusively located on neo-Y chromosome detected by FISH analysis.
The graphs represent the Kimura distance-based copy divergence analysis to the consensus on the x axis and the satDNA abundance on the y axis. Copies clustering on the left of the graph do not diverge very much from the consensus sequence of the element, and correspond to recently amplified neo-Y-linked copies (in males), while sequences on the right might correspond to ancient/degenerated copies.

SatDNA mapping reveals variable neo-sex chromosomes

The mapping of satDNAs in distinct embryos revealed variability in the distribution of satDNAs on neo-sex chromosomes. Among 20 analyzed male embryos, we observed eight neo-Y chromosomes that differed from the satDNA distribution presented in Fig. 2, which corresponds to the most common variant (occurring in twelve embryos). Based on the analysis of Palacios-Gimenez et al. (2018), it corresponds to the neo-Y variant I. Only slight differences were observed for two embryos that revealed an additional centromeric locus for RbeSat15-289 (Fig. 4a), and for one embryo with repositioning of RbeSat36-222 and RbeSat15-288 (Fig. 4b). On the other hand, we noted extensive reshuffling for the other five neo-Y chromosomes, as follows. Three embryos showed repositioning of satDNAs RbeSat23-285, RbeSat15-288, RbeSat36-222, RbeSat10-272, and RbeSat18-239. Among these three embryos, RbeSat15-288 and RbeSat36-222 occupied two distinct positions (Fig. 4c, d). One embryo, besides repositioning of these five satDNAs, also presented an additional subterminal locus on the long arm for RbeSat25-166 and one centromeric band for RbeSat15-288 (Fig. 4e). Finally, we observed one embryo with repositioning of satDNAs RbeSat23-285, RbeSat36-222, RbeSat10-272, and RbeSat18-239, an additional locus for RbeSat25-166, and repositioning plus additional loci for RbeSat15-288 (Fig. 4f). Interestingly, among these embryos with distinct neo-Y chromosomes, three belonged to the same ootheca (Fig. 4c, d).

Fig. 4: Cryptic neo-Y variants of Ronderosia bergii.
FISH mapping of satDNAs revealing variants of the neo-Y chromosome in distinct male embryos (a–f). Colored squares next to each neo-Y chromosome indicate the location of the satDNA on the most common neo-Y variant (see Fig. 2). The neo-Y chromosomes are aligned by the centromere, and the arrowheads point to the middle part of the long arm. Bar 2.5 µm.

The neo-X chromosome from the 20 male embryos analyzed presented small FISH signals (dot-like), variable in position and in loci number. We followed the classification from White (1973) to describe neo-X chromosome arms, i.e., XL is the ancestral X and XR is the arm that shares homology with neo-Y. Considering this, the short arm of the neo-X is the XR, while the long arm of the neo-X is XL. The satDNAs RbeSat12-176, RbeSat25-166, RbeSat26-141, RbeSat03-391, RbeSat44-296, RbeSat34-312, and RbeSat23-285 did not reveal loci on the neo-X chromosome (results not shown). On the other hand, in all embryos, the satDNAs RbeSat07-277, RbeSat09-215, RbeSat10-272, RbeSat15-288, and RbeSat19-162 were placed on neo-X (Fig. 5a). RbeSat07-277 occurred invariably in the terminal region of XL; RbeSat10-272 occurred invariably in the proximal region of XR; RbeSat19-162 occurred invariably in the terminal region of XR; RbeSat09-215 revealed multiple loci occurring in both arms, variably in terminal, interstitial, or proximal regions; and RbeSat15-288 occurred as a single locus or multiple loci exclusively on the XR, or as single/multiple loci in both the XR and XL arms. Finally, the satDNAs RbeSat35-374 and RbeSat18-239 occurred in some neo-X but not in others (Fig. 5b). RbeSat35-374 occurred as a single locus terminally located on XR in 10% of the embryos, while RbeSat18-239 occurred in 80% of the embryos mainly as single loci on XR or XL with terminal or interstitial location, but in one embryo, two loci (subterminal and interstitial) on XR were observed. All variations were observed between individuals, and no intraindividual variability was noticed for either neo-X or neo-Y chromosomes. Some satDNAs with loci on the neo-X and autosomes detected by FISH in mitotic chromosome preparations from embryos were not well visualized on meiotic chromosomes (metaphase I). This is probably due to a differential condensation state of chromosomes, and small signal size or population variability for the presence and absence of loci.

Fig. 5: SatDNAs located on the neo-X chromosome and its variants found in male embryos.
In (a), satDNAs that presented signals on the neo-X of all embryos, and in (b), satDNAs that were present or absent on the neo-X, depending on the embryo. XL represents the ancestral X chromosome and XR is the arm that shares homology with the neo-Y. The neo-X chromosomes are aligned by the centromere. Bar 2.5 µm.

Discussion

Despite a few empirical observations of the accumulation of repeats in neo-sex chromosomes of orthopterans (Palacios-Gimenez et al. 2013, 2015, 2017, 2018; Palacios-Gimenez and Cabral-de-Mello 2015), insight into the potential role of repetitive DNAs in neo-X and neo-Y differentiation remains limited. Here, by recovering repetitive DNAs in R. bergii, we provide compelling evidence of differential enrichment of satDNAs on the neo-X and neo-Y chromosomes, leading to their differentiation. These data address questions about (i) the molecular divergence between neo-X and neo-Y and (ii) the possible forces driving sex chromosome differentiation and variability at the intrapopulation level, clarifying the evolutionary history of the intriguing sex chromosome system in R. bergii.

Besides the satDNAs and TEs described here, the U2 snDNA and H3 histone genes and some microsatellites were differentially enriched in one of the sex chromosomes of R. bergii (Palacios-Gimenez et al. 2015). The accumulation of distinct classes of repetitive DNAs is a common and convergent pattern of sex chromosomes that evolved repeatedly and independently across various taxa (Bachtrog 2013; Wright et al. 2016; Charlesworth 2019; Daish and Grützner 2019). Accordingly, accumulation of satDNA families, as observed here, has been reported in many other species, for example, in crickets (Palacios-Gimenez et al. 2017), lizards (Giovannotti et al. 2018), frogs (Gatto et al. 2018), fish (Utsunomia et al. 2019), rodents (Acosta et al. 2007), and bovids (Escudeiro et al. 2019). The involvement of TEs in sex chromosome evolution has also been noticed in many other species, like Drosophila miranda (Bachtrog et al. 2019), woodpeckers (Bertocchi et al. 2018), and the fish Apareiodon sp. (Schemberger et al. 2019). R. bergii differs from these cases, as we observed low accumulation of this kind of repetitive DNA in comparison with satDNAs, which suggests that TEs play a minor role in sex chromosome differentiation in this species. The TEs differentially enriched by sex have low abundance in the R. bergii genome, with no clusters detectable by FISH, implying that if they are accumulated on the neo-sex chromosomes, they are likely scattered and in low copy number.

Considering the satDNAs mapped here and those previously studied (Palacios-Gimenez et al. 2018), we can observe three main patterns of chromosomal distribution: (i) exclusively on the neo-Y chromosome (e.g., RbeSat03-391 and RbeSat44-296, Fig. 2, Supplementary File 4), (ii) highly amplified on the neo-Y chromosome, with dot signals in chromosomal arms of the neo-X and autosomes (e.g., RbeSat09-215, Figs. 2, 4, Supplementary File 4), and (iii) amplified at centromeres of the neo-X and some autosomes (e.g., RbeSat08-11, Fig. 1b from Palacios-Gimenez et al. 2018). These data support extensive differentiation between the sex chromosomes through differential amplification of satDNAs in the neo-Y chromosome, in accordance with its heterochromatic nature (Palacios-Gimenez et al. 2015).

Among the satDNAs enriched on the neo-Y, seven were exclusively mapped on this chromosome (FISH data, Fig. 2) and presented high homogeneity (low K2P divergence, Fig. 3), which indicates recent amplification, i.e., after the origin of the neo-sex chromosomes. The occurrence of these satDNA families at low abundance in the female genome implies that they were present in the R. bergii genome before the establishment of the neo-Y and were amplified during its evolution. Interestingly, RbeSat03-391 presents similar abundance between sexes, and its landscape reveals two peaks, one with low divergence (male repeats) and another with higher divergence (male and female repeats) (Fig. 3). This indicates the occurrence of very recently amplified and homogenized copies of this repeat in the neo-Y chromosome, relative to other chromosomes. The specific amplification on the neo-Y chromosome is corroborated by the absence of FISH signals on other chromosomes.
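
For readers less familiar with the divergence measure behind the repeat landscapes, the following is a minimal sketch of the Kimura 2-parameter (K2P) distance between two aligned sequences. It is an illustration only, not the pipeline used in this study; the function name `k2p_distance` and the toy sequences are hypothetical. Low values correspond to homogeneous, recently amplified copies.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration. Positions where either sequence
    has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # Transition: purine<->purine or pyrimidine<->pyrimidine
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable positions")

    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # K2P: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    # math.log raises ValueError for saturated (too divergent) pairs.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy example: a single A<->G transition in ten aligned positions.
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

In a repeat landscape, such distances are typically computed between individual copies and the family consensus, and the distribution of values is plotted against abundance.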

Although the neo-sex chromosomes are highly differentiated, they still share some repetitive DNAs with each other and with autosomes, providing insights into their evolution. The similarity in satDNA composition between the centromeres of the neo-X chromosome and those of autosomes indicates that the neo-X chromosome preserved the ancestral centromeric composition, while the neo-Y underwent composition turnover. For the neo-XY, besides RbeSat35-374, which is exclusively shared between the neo-X and neo-Y chromosomes, these chromosomes share another five satDNA families that are confined to discrete regions of the neo-X chromosome and are also shared with some autosomes. These data suggest conservation of ancestral homology for the ancient autosome involved in the origin of the neo-sex chromosome system. Interestingly, RbeSat35-374 and RbeSat19-162 are shared between the terminal region of XR (short arm of the neo-X) and the short arm of the neo-Y, and could be involved in the proper recognition, orientation, and segregation of these chromosomes during meiosis.

It is believed that repetitive sequences can facilitate and mediate chromosomal rearrangements, like inversions, nonhomologous or ectopic recombination, translocations, and transpositions (Molnár et al. 2011; Skinner and Griffin 2012; Raphael 2012; Li et al. 2016; Christmas et al. 2019). Exceptionally, in R. bergii these mechanisms generated multiple variants of the neo-sex chromosomes in the same population. Five variants of the neo-Y were previously documented in R. bergii and were thought to have arisen through three independent paracentric inversions (Palacios-Gimenez et al. 2018). Our mapping of satDNAs suggests that the turnover of the neo-Y chromosome in R. bergii is much more complex, involving amplifications and transpositions, as well as multiple paracentric inversions (Fig. 4). These events generated a highly variable distribution of satDNAs between individuals of R. bergii, mainly in the proximal two-thirds of the long arm of the neo-Y. Some variants were also observed for the neo-X chromosome, but only amplification of a minor amount of satDNAs took place in its chromosomal arms (Fig. 5). Interestingly, the high variability of satDNAs on the neo-sex chromosomes occurred at the intrapopulation level, a pattern that is not well documented. Variability of sex chromosomes, and the putative role of repetitive DNAs in sex chromosome differentiation at the intrapopulation level as noticed for R. bergii, have been reported in only a few other species, based on less detailed analyses: in Mazama gouazoubira, Cervidae (Valeri et al. 2018), by analysis of heterochromatin distribution, and in Omophoita aequinoctialis, Coleoptera, by C-banding and mapping of repetitive probes, i.e., 18S and 5S rDNAs (Goll et al. 2018).

In summary, our data support a high differentiation of neo-sex chromosomes in R. bergii caused primarily by the action of satDNAs and by events like inversions, amplifications, and transpositions. The critical role of satDNAs in neo-sex chromosome evolution is exemplified by the presence of 21 families in the neo-sex chromosomes (39% of the families found in the genome of the species). The amplification of satDNAs in the neo-Y is supported by the occurrence of 13.5% more satDNA in male than in female genomes. Moreover, seven satDNA families with recent amplification were exclusively enriched on the neo-Y, as validated by bioinformatic analyses and FISH experiments, which accounted for its differentiation from the rest of the genome. The turnover of sex chromosomes in this species led to the emergence of multiple variants of neo-X and neo-Y at the population level, making R. bergii an interesting species for intrapopulation analysis of sex chromosome evolution. Finally, considering that recombination between the neo-sex chromosomes in R. bergii has been mostly suppressed at least since the large pericentric inversion took place, we document an empirical example of differential accumulation and variability of satDNA as a force driving sex chromosome evolution after the suppression of recombination.