Introduction

Transposable elements (TEs) are DNA sequences that are able to self-replicate and move within a genome1. TEs can be divided into two classes on the basis of their transposition mechanism. Class I elements, also known as retrotransposons, move by reverse transcription of an RNA intermediate. The presence/absence of long terminal repeats (LTRs) at the element ends further distinguishes LTR and non-LTR retroelements. Non-LTR TEs include long interspersed elements (LINEs) and non-autonomous short interspersed elements (SINEs)2. Class II elements, or transposons, move mainly via a mechanism of DNA sequence excision and insertion, although a few of them exploit a rolling-circle mechanism1.

The rule of TE vertical inheritance (i.e. the transmission of genetic material from parents to offspring through sexual or asexual reproduction) may be broken by horizontal transfer (HT) events between lineages3,4. Horizontal transfer is the passage of genetic material between reproductively isolated taxa and it has been proposed as an essential part of the lifecycle of some TE types. TEs are, in fact, usually subject both to host suppression mechanisms, which limit their mobility and copy number expansion, and to stochastic losses. In this view, HT events can be considered as an opportunity for TEs to invade a new genome and persist in evolution5. HT is apparently more frequent among Class II than Class I TEs. Among the latter, it is more frequent for LTR retrotransposons. This is probably linked to the different transposition mechanisms used by the two classes. DNA transposons have a more stable double-stranded DNA intermediate, while retrotransposons have a relatively unstable RNA intermediate that is reverse-transcribed directly into the chromosomal target site, so that transfer outside the cell appears unlikely. Yet, among Class I elements, there is a small group of LTR retrotransposons (such as the Gypsy elements) that are similar to retroviruses in terms of replication mechanism and structural organization and that may successfully mediate the transfer6. Concerning non-LTR retrotransposons, previous analyses assigned these elements to 11 distinct clades but no evidence for HTs was detected within or between these clades during the past 600 Myr7,8. However, some cases of HT involving both LINEs (CR1, R4, L1 and BovB)9,10,11 and SINEs (Sauria and RUDI)12,13 have been suggested.

Mechanisms leading to HT are still unclear, although different routes have been observed or suggested: from intrinsic TE features (such as the presence of retrovirus-like proteins) to vector-mediated transmission and to host-parasite and/or trophic relationships6,12,14,15,16,17. HT has also been attributed to hybridization/introgression events18,19, therefore linking possible TE lateral transfers to specific reproductive traits of the taxa involved.

Among non-LTR retrotransposons, R2 is probably one of the most investigated elements. It has a specific insertion site in the repeated 28S rRNA genes. R2 has a single open reading frame (ORF) flanked by two untranslated sequences (UTRs) of variable length. The translated protein comprises the central reverse transcriptase (RT) domain, the DNA-binding motifs at the N-terminus and the restriction enzyme-like endonuclease (RLE) domain at the C-terminus20. Phylogenetic analyses demonstrated that R2 elements can be divided into four main clades (R2-A, -B, -C and -D), in agreement with the number and configuration of N-terminal zinc-finger motifs, and that the element has been present in the animal kingdom since the Radiata-Bilateria split21,22,23. R2 is subject to significant changes in copy number, even within a single species, due to its rapid turnover24,25. This may account for the overall phylogenetic incongruence with host species: an R2 phylogenetic analysis across metazoans, in fact, suggested that replacements and extinctions of paralogous lineages during the long-term evolution of this non-LTR element may explain its distribution, without assuming horizontal transfer events22. A further study suggested that ancestral libraries of paralogous lineages may also account for the observed diversity of multiple lineages within a single genome23.

The genus Bacillus includes three Mediterranean species of stick insects: B. rossius (with eight subspecies), B. grandii (with three subspecies) and B. atticus (with three karyological/allozyme races). Their taxonomy, systematics and phylogeny are well established and inferred from nuclear (allozyme, centromeric satellite DNA) and mitochondrial markers, as well as from karyological and cytological analyses26. The evolutionary history of the genus shows an ancestral divergence, about 22.5 Myr ago, that separated the lineage leading to B. rossius from the other ones. Then, around 17 Myr ago, the B. grandii/B. atticus clade radiated, ending with a paraphyletic relationship between the two species. In fact, the B. grandii grandii subspecies appears more related to B. atticus than to the B. grandii benazzii/B. grandii maretimi clade26,27. Bacillus species show a wide range of reproductive strategies, from strict bisexuality (in B. grandii) to facultative (B. rossius) or obligatory (B. atticus) parthenogenesis. Moreover, these species gave origin to inter-specific hybrid taxa, reproducing either through obligate parthenogenesis or hybridogenesis26. Hybridogenesis, or hemiclonal reproduction, takes place when the hybrid individual (often a female) discards the paternal haploset during meiosis. The hybrid condition is then restored through fertilization of the egg containing the maternal unrecombined haploset with a paternal sperm26. Bacillus hybridogenetic strains involve B. rossius as maternal lineage and B. g. grandii or B. g. benazzii as paternal lineage26. In these taxa, further, natural androgenesis (i.e. offspring with full paternal nuclear genome derived from two sperms of fertilizing males) was also demonstrated for the first time in the Animal Kingdom28. Hybridogenesis and androgenesis are known to occur in both vertebrates and invertebrates where also a peculiar social hybridogenetic system has been demonstrated29,30,31.

We recently analyzed R2 in stick insects of the genus Bacillus, addressing the evolutionary dynamics of their insertions in different reproductive strategies32,33, as well as the coevolution of multiple lineages in the genome of B. rossius34. In order to have a full picture of R2 evolution in the Bacillus species complex, we isolated and characterized R2 from the obligatory parthenogenetic B. atticus and the strictly gonochoric B. grandii. The analyses of the evolutionary history of all retrieved Bacillus R2 lineages suggest the occurrence of HT events that can be explained taking into account the hybridogenetic mechanism.

Results

R2 sequences characterization

Structures and features of newly obtained R2 elements are given in Fig. 1. One R2 element was found in B. atticus (R2Ba); it is 3507 bp long, excluding the 3′ terminal poly-(A) tail of nine nucleotides, with an ORF of 3177 bp encoding a protein of 1058 amino acids. Two R2 elements were isolated from B. g. grandii. R2BggA is 3513 bp long, excluding the poly-(A) tail of seven nucleotides, and shows two overlapping ORFs of 1545 bp and 1698 bp. The analysis of the two proteins suggests they are actually part of a 3243 bp long ORF that encodes a protein of 1079 amino acids in which a frameshift mutation occurred (at pos. 1726). The same can be observed for the second B. g. grandii element (R2BggB), which is 4832 bp long excluding the poly-(A) tail, with a putative ORF of 4264 bp in which a frameshift mutation occurred (at pos. 2191). This element also shows two stop codons (pos. 2913 and pos. 4722) and three sequence duplications of 333 bp, 68 bp and 57 bp, respectively. The first and second duplications are within the ORF while the third one is in the 5′ UTR. The element found in B. g. benazzii (R2Bgb) is 3479 bp long, excluding the poly-(A) tail, with an ORF of 3165 bp encoding a protein of 1054 amino acids. In the genome of B. g. maretimi, two R2 elements were retrieved. The first (R2Bgm) is 3485 bp long, excluding the poly-(A) tail. Its ORF appears to be 3164 bp long but, once again, it has one frameshift mutation disrupting the sequence at 875 bp. The second element is 3059 bp long, excluding the poly-(A) tail, and it is nearly identical to R2Bgm (0.4% divergence), with the exception of a 426 bp deletion located between positions 1035–1460. It has therefore been called R2Bgmdel. This element shows a complete ORF of 2739 bp encoding a protein of 912 amino acids. Interestingly, in a 5′ end sequence survey carried out in B. rossius34, an incomplete element was retrieved (R2Brdel), showing 99% similarity with R2Bgmdel and sharing the same internal deletion. The R2Brdel fragment encompasses the whole 5′ UTR and the first 1293 bp of the ORF; the protein sequence obtained from that part of the ORF is 430 amino acids long, and the RT domain was only partially covered.
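Some of the length figures reported above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis; it assumes that the reported length of an intact ORF includes the terminal stop codon, and the function names are hypothetical. Only the three elements described as having an undisrupted ORF are included.

```python
# ORF length (bp) and protein length (aa) as reported in the text for the
# elements with an undisrupted ORF.
INTACT_ORFS = {
    "R2Ba": (3177, 1058),
    "R2Bgb": (3165, 1054),
    "R2Bgmdel": (2739, 912),
}


def protein_length(orf_bp: int) -> int:
    """Amino acids encoded by an intact ORF.

    Assumption (not stated in the text): the ORF length counts the stop codon,
    so the protein has one residue fewer than the number of codons.
    """
    if orf_bp % 3 != 0:
        raise ValueError("an intact ORF must be a multiple of 3 bp")
    return orf_bp // 3 - 1


def deletion_size(first_pos: int, last_pos: int) -> int:
    """Number of positions in an inclusive coordinate range."""
    return last_pos - first_pos + 1


if __name__ == "__main__":
    for name, (orf_bp, reported_aa) in INTACT_ORFS.items():
        print(name, protein_length(orf_bp) == reported_aa)
    # R2Bgmdel is R2Bgm (3485 bp) minus the deletion at positions 1035-1460.
    print(3485 - deletion_size(1035, 1460))
```

Under the same assumption, the 3164 bp ORF reported for R2Bgm is not a multiple of three, which is what one would expect for a reading frame disrupted by a frameshift.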

Figure 1: Schematic representation of presently characterized R2 elements in the Bacillus genus.
For comparison, also the fragment representing R2Brdel has been reported. Black boxes indicate the open reading frame (ORF) with the zinc finger (ZF), reverse transcriptase (RT) and restriction enzyme-like endonuclease (RLE) domains. Vertical grey lines indicate frameshift mutations; ovals represent stop codon(s); grey squares represent duplications and triangles represent large deletions.

All isolated elements showed one zinc finger motif (CCHH) at the protein N-terminal end, with the exception of R2BggB that shows two zinc finger motifs (CCHC + CCHH).

Subsequent comparisons involved presently isolated elements (B. atticus: R2Ba; B. g. grandii: R2BggA and R2BggB; B. g. benazzii: R2Bgb; B. g. maretimi: R2Bgm and R2Bgmdel; Fig. 1) together with previously identified ones (B. rossius: R2Brfun, R2Brdeg and R2Brdel32,34).

The comparisons between all Bacillus R2 lineages at the nucleotide level indicated that the R2BggB element is the most differentiated, its divergence ranging from 55.1% (R2BggA) to 56.3% (R2Bgmdel; Table 1). R2s from B. g. maretimi and B. g. benazzii are less divergent from B. rossius R2Brfun and R2Brdeg (3.9–4.6% and 10.3–10.6%, respectively) than from B. g. grandii R2BggA and B. atticus R2Ba (19.3–19.6% and 19.7–20.1%, respectively; Table 1). Moreover, the R2Brdel fragment divergence ranges from 0.9% (R2Bgm) to 55.5% (R2BggB).

Table 1 Nucleotide and amino acid (in parentheses) p-distances between Bacillus R2 elements (below the diagonal).

Phylogenetic analyses

The Maximum Likelihood tree (Fig. 2), computed on amino acid sequences spanning from the RT domain to the C-terminal end and thus including the RLE domain, clearly reflects the clustering pattern based on the number of zinc-finger motifs at the N-terminal end22,23. Accordingly, the element R2BggB clusters within clade R2-B, while the remaining Bacillus R2s fall within the clade R2-D and are all grouped in a fully supported, monophyletic clade.

Figure 2: Phylogenetic position of Bacillus R2 elements.
Maximum Likelihood tree (-ln L = 35666.30) built on RT amino acid sequence of Bacillus elements and of elements representative of the four main clades (R2-A, R2-B, R2-C and R2-D). Numbers at nodes represent bootstrap values ≥50%.

Since the amino acid sequence is not available for the degenerate element R2Brdeg and only the N-terminal end is available for R2Brdel, another phylogenetic tree was constructed based on the nucleotide sequences of Bacillus elements only, excluding R2BggB as its sequence is too divergent and belongs to a distantly related clade. The Maximum Likelihood and Bayesian dated phylogenies fully agree (Fig. 3a) and overlap the topology of the Bacillus cluster in the amino acid sequence analysis (Fig. 2). Elements from B. g. grandii (R2BggA) and B. atticus show a closer relationship, while those from B. rossius, B. g. benazzii and B. g. maretimi are included in the same clade. Interestingly, B. rossius, B. g. benazzii and B. g. maretimi elements show a paraphyletic relationship, where R2s from B. g. benazzii and B. g. maretimi are grouped in a monophyletic cluster. R2Bgmdel and R2Brdel show a well-supported relationship with R2Bgm. Node ages indicate that B. rossius, B. g. benazzii and B. g. maretimi elements radiated about 18 Myr ago and that the cluster including B. g. benazzii and B. g. maretimi R2s split from R2Brfun around 8.6 Myr ago. Finally, R2Bgb and R2Bgm/R2Bgmdel/R2Brdel diverged about 2.6 Myr ago.

Figure 3: Bacillus R2 elements phylogeny.
(a) Dated phylogeny based on R2 nucleotide sequences; numbers at nodes indicate, above, Maximum Likelihood (-ln L = 2449.07) bootstrap values ≥50%/Bayesian posterior probabilities ≥0.90 and, below, the 95% highest posterior density interval of the age estimate. Nodes with black dots are those where age calibration was applied. The R2BggB element was not included due to its extreme nucleotide divergence. (b) Schematic drawing of Bacillus taxa phylogeny as derived from ref. 27.

The R2 tree topology is only partially congruent with the host species phylogeny (Fig. 3b), the significant difference being the placement of B. g. benazzii and B. g. maretimi. In fact, these two taxa are known to have a closer relationship with B. g. grandii and B. atticus than with B. rossius26,27.

Divergence vs. age analysis

When the nucleotide or amino acid sequence divergences were plotted against host species split ages, comparisons between the R2 elements of B. rossius (R2Brfun, R2Brdel), B. g. benazzii (R2Bgb) and B. g. maretimi (R2Bgm, R2Bgmdel) were found to be less divergent than expected (Fig. 4). In fact, they showed nucleotide p-distance values ranging from 0.009 (R2Bgm vs. R2Brdel) to 0.047 (R2Bgb vs. R2Brdel), while coeval sequence divergences ranged from 0.187 (R2Bgg vs. R2Brfun) to 0.222 (R2Bgg vs. R2Brdel) (Table 1; Fig. 4a). Amino acid sequence divergences followed the same trend (Table 1; Fig. 4b). Interestingly, when all comparisons were plotted, R² values were very low (0.034 and 0.041 for nucleotide and amino acid analyses, respectively) and correlations were not significant. On the other hand, when comparisons between the R2 elements of B. rossius, B. g. benazzii and B. g. maretimi were excluded, R² values were higher (0.796 for both nucleotide and amino acid analyses) and correlations became significant (p < 0.001; Fig. 4).

Figure 4: Age versus divergence analysis.
Plot of host age vs. nucleotide divergence (a) and vs. amino acid divergence (b). Red diamonds indicate comparisons relative to putative horizontal transfers (HTs). Two trend-lines are reported: the dotted one is relative to the analysis including all comparisons; the solid one is relative to the analysis that excludes putative HT comparisons. Coefficients of determination (R²), together with probability (ns: not significant; *p < 0.001), are reported near the respective trend-line.

Discussion

R2 non-LTR elements are probably among the most widely studied TEs in both model and non-model organisms. One of the main issues regarding R2 evolution is the frequent incongruence between its phylogeny and that of the host species, for which different hypotheses have been put forward. Kojima and Fujiwara22, while analyzing a number of elements sampled across metazoans, observed numerous instances of “local” congruence between R2 and host phylogenies and did not identify clear cases of putative HT. Instead, they found instances of paralogous lineages. Therefore, they concluded that the evolution of R2 can be explained by vertical inheritance with extensive extinction/diversification of paralogous lineages. Luchetti and Mantovani23, in addition, suggested that in some ancestral genomes a library of unrelated elements might have been present. The differential amplification of these elements in the derived taxa - irrespective of their phylogeny - would also explain the presence in some genomes of multiple, unrelated R2 lineages.

Data reported here, though, suggest the possibility of HT between congeneric species of stick insects. In order to verify this aspect, we checked if data fit the three criteria that are considered relevant to identify an HT event: (i) a divergence between elements from different species lower than expected on the basis of host split ages, (ii) the phylogenetic incongruence between the putatively transferred element and the host species, and (iii) the patchy distribution among closely related taxa3,5.

R2 elements of B. g. benazzii and B. g. maretimi are less divergent from B. rossius R2Brfun than from the element of the conspecific B. g. grandii. Accordingly, the divergence versus age analysis indicated the comparisons R2Bgb/R2Brfun, R2Bgb/R2Brdel, R2Bgm/R2Brfun, R2Bgm/R2Brdel, R2Bgmdel/R2Brfun and R2Bgmdel/R2Brdel as significantly less divergent than expected on the basis of the host split age (22.8 Myr ago)27.

The phylogenetic analyses conducted on amino acid sequences indicated that all Bacillus elements have a well-supported monophyletic origin and fall in the R2-D clade, with the exception of the R2BggB element that is nested within the R2-B clade. This distribution is in full agreement with the number and type of zinc-finger motifs found at N-terminal end22,23. Both phylogenetic analyses conducted on amino acid and nucleotide sequence evidenced that, within the Bacillus cluster in the R2-D clade, the relationships between elements only partially overlap those between host taxa. In fact, elements from B. g. maretimi, B. g. benazzii and B. rossius group together in a well-supported cluster in clear-cut contrast with the sister relationship occurring between the B. g. maretimi/B. g. benazzii clade and the B. g. grandii/B. atticus one, evidenced on the basis of allozymes, mitochondrial DNA and satellite DNA26,27.

Patchy distribution refers to the presence of a given element in one lineage and its absence within the sister lineage. In the case of R2, besides the number of zinc fingers, there are no unequivocal criteria to determine lineages; on the other hand, Stage and Eickbush35, in Drosophila species, considered that elements diverging >1% would represent separate lineages. If we take into account this threshold, then R2Bgmdel and R2Brdel belong to the same lineage (but not to R2Bgm due to the large deletion occurring within the ORF). Therefore, the “R2Bgmdel-R2Brdel” lineage shows a patchy distribution.

On the whole, data presented here suggest two interspecific HT events in Bacillus stick insects, one occurring from B. rossius toward the B. g. benazzii/B. g. maretimi ancestor, accounting for the origin of the R2Bgb and R2Bgm elements from R2Brfun, and another one from B. g. maretimi toward B. rossius, accounting for the origin of R2Brdel from the R2Bgmdel element (Supplementary Figure S1). Random lineage loss and/or the possibility of having missed some lineages in B. g. grandii and B. atticus could explain the patchy distribution and phylogenetic incongruence: however, the observed divergence, which is lower than the expected one considering the host species split, appears to rule out these possibilities. In addition, notwithstanding the large error associated with age estimates, the dated phylogeny clearly indicates a more recent origin of the R2s involved in HTs with respect to the species/subspecies split age. In the R2 phylogeny, the Bacillus lineages including R2Bgb and R2Bgm derived from the R2Brfun element with a divergence time of about 8.6 Myr ago; in contrast, B. rossius and B. grandii taxa split more than 22 Myr ago27. Moreover, the monophyletic origin of R2Bgm, R2Bgmdel and R2Brdel and their very recent radiation (less than 1 Myr ago) suggest that the second HT event would have happened much more recently.

TE horizontal transfer has been linked to vectors able to survive outside the host cell (for example viruses)16,36, or to host-parasite14 or predator-prey17 relationships. At present, we have no evidence that HT between Bacillus taxa followed these routes, although a parasite-mediated transfer cannot be disregarded: Bacillus stick insects, in fact, have been recently observed to be infected by nematodes without apparent loss of fecundity (Mantovani, personal observation).

HT has also been attributed to hybridization/introgression18,19 and R2 HT events can be well explained taking into consideration the reproductive biology of Bacillus parental and hybrid taxa, which “provides unusual opportunities for genomes to cycle between the various systems”37. In particular, B. rossius and B. g. benazzii are the maternal and paternal species, respectively, of the hybridogenetic strain B. rossius - g. benazzii: these are hybrid females expressing both parental haplosets in the somatic tissues, but they discard the paternal B. g. benazzii haploset in the germ line. Before meiosis, the paternal haploset is eliminated and the maternal one is duplicated; thus, meiosis involves two maternal haplosets. This prevents the recombination between paternal and maternal chromosomes26. Accordingly, extensive allozyme data obtained from the hybridogenetic system have never revealed the occurrence of recombination between the maternal and paternal haplosets28,38. Then, the egg, containing the unrecombined B. rossius maternal haploset, can be fertilized by a sperm from a B. g. benazzii male, restoring the hybrid condition; in the lab, hybridogenesis also occurs with B. g. maretimi and B. rossius males; in the latter instance, pure B. rossius individuals are produced (Supplementary Figure S2)25,26,37. However, among hybridogenetic descendants, instances of maternal haploset exclusion at meiosis or of androgenesis may occur (Supplementary Figure S2)26,28,38. Maternal B. rossius haploset exclusion leads to an egg containing the paternal B. g. benazzii haploset: through the paternal haploset doubling at the development onset, a diploid B. g. benazzii nuclear genome in a B. rossius cytoplasm can be reconstituted. It is to be noted, though, that this instance has been only rarely observed28,38 and that the same kind of offspring could obviously be obtained through a backcross with a B. g. benazzii male. In Bacillus stick insects, androgenesis takes place when the loss of both parental genomes from the egg nucleus is followed by the entry and mixis of two sperm nuclei. This reconstitutes a diploid genome, resulting in fertile offspring with a fully paternal nuclear genome and a maternal cytoplasm. Breeding experiments in lab populations resulted in a high number of androgenetic individuals, especially when crossing either B. g. maretimi or B. rossius males.

In this context, HT events described here may have followed different routes. The common hybridogenetic pattern nicely explains the case of an element originating in the paternal genome and transferred to the maternal one. During the hybridogenetic stage, retrotransposition may lead to insertion in the maternal haploset; therefore, when the egg is fertilized by a maternal species male, the R2 element may enter into the maternal species lineage in the following generations (Fig. 5a). Thus, the HT involving R2Bgmdel and R2Brdel could have followed this route less than 2 Myr ago. B. g. maretimi diverged from B. g. benazzii about 2 Myr ago and it is presently restricted to the Marettimo Island. Although we have no evidence at present of B. rossius - B. g. maretimi hybridogenesis in nature, populations of the two taxa may have come into contact during the Pliocenic-Pleistocenic marine transgression that have connected the Island of Marettimo to western Sicily, allowing hybridization.

Figure 5: Possible models of R2 horizontal transfer (HT) in Bacillus stick insects.
(a) HT of an element from the paternal to the maternal species haploset. Via a standard hybridogenetic path followed by a backcross with a male of the maternal species, the element can invade the genome of the maternal species. (b) Horizontal transfer of an element from the maternal to the paternal species haploset. Following the standard hybridogenetic path, the element has no chance to enter into the paternal species genome given that the paternal haploset carrying the transferred element is lost. Only instances of maternal haploset exclusion may lead to offspring with fully paternal genome either through gynogenesis or backcross. Also androgenesis produces offspring with fully paternal genome in a maternal cytoplasm. In this case, though, the HT event must be mediated by the element RNA intermediate that allows the transfer from the excluded hybridogenetic genomes to the one derived from two sperm nuclei mixis. Dashed circles within cells are nuclei; red chromosomes represent the maternal haploset and green chromosomes represent the paternal haploset. Red and green small lines within cell cytoplasm represent transcribed RNAs. Thick and thin arrows indicate transcription and re-integration, respectively. Excluded haplosets are shaded in grey.

On the other hand, the interspecific transfer of an element from the maternal haploset to the paternal one, as in the case of R2Brfun giving origin to the R2Bgm/R2Bgb lineage, cannot be explained by a hybridogenetic route (first example of Fig. 5b). This could have occurred by one of the two possible patterns following the maternal haploset exclusion (second example of Fig. 5b) or via androgenesis. In the latter instance, no inheritance of any part of the hybrid nuclear genome is observed, as the offspring genome is given by the haplosets of two additional sperms. On the other hand, the androgenetic offspring inherits the cytoplasm from the maternal species. Here R2 RNAs can be present as, after transcription and maturation, they are transferred from the nucleus to the cytoplasm for protein translation. RNAs, then, re-enter the nucleus for the integration process (Supplementary Figure S3)39. It is thus possible that cytoplasmic R2 RNAs re-entered the nucleus and re-integrated after the mixis of two sperm nuclei (third example in Fig. 5b). As suggested by the data, the HT event originating both R2Bgb and R2Bgm was a single event, dating back to 7 Myr ago, while the R2Bgb and R2Bgm divergence time was close to that of the pertaining B. grandii subspecies split (~2 Myr ago, timing actually overlapping if we consider the error associated with the node age estimate). In contrast to the recent split of the B. g. benazzii and B. g. maretimi subspecies, B. rossius populations carrying R2Brfun were already present before the end of the Messinian (~5.4 Myr ago)27,32. On this evidence, we can hypothesize that hybridization may have already taken place between B. rossius and the ancestor of B. g. benazzii/B. g. maretimi and, in this scenario, the element R2Brfun could have been transferred into the B. g. benazzii/B. g. maretimi ancestor’s genome. It is finally to be noted that, with the exception of the derived R2Bgmdel, we did not detect other R2 lineages in B. g. benazzii nor in B. g. maretimi; it is thus likely that the invasion of the new R2 element led to the replacement of other possible R2 lineage(s) present in the ancestor genome.

On the whole, analyses presented here support the occurrence of HTs involving R2 retrotransposons, a process previously undetected for this element on the basis of a metazoan-wide analysis22. The peculiar way by which HT events seem to have occurred in stick insects further suggests some interesting speculations. Introgression is commonly observed when inter-specific hybridization occurs, and it is thought to have a beneficial effect on the receiving genome by promoting adaptation40. In Bacillus hybridogenetic lineages, though, gene introgression has never been observed, possibly owing to the peculiar cytological mechanism of haploset exclusion during egg maturation that prevents recombination41; as suggested above, however, TE spreading may not require crossing over events.

Transposons are well-known promoters of genome evolution: therefore, possible HTs would not only allow TE survival but may also bring this advantage to the invaded genome. In this view, the role of hybridization as a contributor to genome and species evolution39,41 can also be achieved by means of TE transfer from one genome to another, instead of a classical introgression. It is to be noted, in fact, that successful introgression usually occurs after several rounds of hybridization/backcrossing42, while a TE HT may even require a single generation.

Taking into account the complex history of hybridization and intertaxa relationships that characterize Bacillus stick insects, they can constitute an interesting evolutionary framework in which to address studies of such phenomena.

Materials and Methods

Sampling, DNA isolation and R2 sequencing

Samples of the obligatory parthenogenetic Bacillus atticus (subsp. atticus) and of the three subspecies of the strictly gonochoric Bacillus grandii were field-collected in Sicily and immediately frozen at −80 °C. Full-length R2 elements were isolated and characterized from one female of B. atticus (Scoglitti; BattSCO♀25) and from one male each of B. g. grandii (Ponte Manghisi; BggPMA♂54), B. g. benazzii (Torre Bennistra; BgbTBE♂4) and B. g. maretimi (Marettimo Island; BgmMAR♂2).

Total DNA was extracted from single stick insect legs with the standard phenol/chloroform protocol. R2 was isolated through PCR amplification, cloning and sequencing32. Universal and specifically designed primers used in this study are reported in Supplementary Table S1.

Sequence analysis

Sequences were edited and assembled using MEGA v. 6.043 and open reading frames (ORFs) were searched with the ORF Finder tool server (available at: http://www.ncbi.nlm.nih.gov/gorf/gorf.html). All newly characterized R2 elements are reported as Supplementary Material S1. Nucleotide and amino acid sequence alignments have been carried out using MAFFT 7.244 with L-INS-i parameters. Sequence divergences, calculated as uncorrected p-distances, were obtained using MEGA v. 6.0.
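For readers unfamiliar with the measure, the uncorrected p-distance is simply the proportion of compared alignment positions at which two sequences differ. The sketch below illustrates the calculation; it is not the MEGA implementation. The function name is hypothetical, and the pairwise-deletion treatment of gaps is an assumption, since the gap-handling option used in the study is not stated.

```python
# Characters treated as missing data (assumption: pairwise deletion of gaps).
GAP_CHARS = frozenset("-?")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned sequences.

    Positions where either sequence has a gap or missing-data symbol are
    skipped; the result is differences / positions compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # pairwise deletion: ignore this column for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differences / compared


if __name__ == "__main__":
    # Toy sequences, not R2 data: one mismatch over eight compared sites.
    print(p_distance("ACGTACGT", "ACGTACGA"))  # 0.125
```

The same function applies to aligned amino acid sequences, since it only compares characters column by column.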

Phylogenetic analyses

Presently obtained R2 elements have been analyzed together with those previously isolated from B. rossius: the functional element R2Brfun and the degenerate one R2Brdeg32. Present analysis includes also the 5′ half of a new element, carrying an internal deletion (hence called R2Brdel) obtained from B. rossius while screening for the 5′ end sequence variation34.

Two phylogenetic analyses have been performed. The first analysis was based on inferred amino acid sequences encompassing the reverse transcriptase (RT) and the restriction enzyme-like endonuclease (RLE) domains. This analysis included elements from several metazoan representatives of the main R2 clades (Supplementary Table S2)22,23,45,46,47,48,49,50,51,52. The second analysis included only Bacillus elements (with the exception of R2BggB) and was based on nucleotide sequences.

The best substitution models, chosen on the basis of the Bayesian information criterion (BIC), were calculated as LG + G + I for the amino acid dataset using Prottest v. 3.453 and as GTR + G for the nucleotide dataset with jModelTest v. 2.1.754. Maximum Likelihood phylogenetic analyses, with 100 bootstrap replicates for nodal support, were carried out using PhyML v. 3.055. A dated phylogeny was also built on the nucleotide sequence dataset using a Bayesian analysis implemented in BEAST v. 1.856. Two runs were performed with 3 × 10⁷ generations and evaluated for convergence both graphically and by checking for an Effective Sample Size (ESS) > 200. The tree search was set up using an uncorrelated, log-normal relaxed molecular clock and the Birth-Death speciation process. This analysis produces a tree topology with nodal supports (posterior probabilities) and relative node age estimates. Age calibration was implemented using secondary calibration points obtained from the Bacillus species dated phylogeny27. We calibrated the divergence between the B. rossius functional (R2Brfun) and degenerate (R2Brdeg) elements32 at a minimum of 5.4 Myr ago, that is the split between European B. r. rossius/B. r. redtenbacheri and the North African B. r. tripolitanus A: as all these taxa retain the two elements32, the R2s divergence should predate their subspecific split. The prior distribution of this calibration was modelled using a uniform distribution with hard bounds and a maximum of 50 Myr ago. We also calibrated the divergence between R2BggA and R2Ba (for R2 lineages acronyms see Fig. 1) based on the estimated split age between B. g. grandii and B. atticus at ~15.4 Myr ago. The prior distribution of this calibration was modelled using a log-normal distribution, with soft bounds (maximum set at 50 Myr ago), as there is little probability of a younger R2 divergence but, since multiple R2 lineages can be commonly found in a genome, it is possible that they may have diverged before the host species split.

A divergence versus age analysis was performed by plotting nucleotide and amino acid sequence divergences against the host split age. This analysis is based on the principle that elements deriving from an HT are less divergent than expected from the host split age, while paralogous lineages are more divergent than expected3,5,8.
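The logic of this analysis can be sketched as a comparison of two least-squares fits, one on all pairwise comparisons and one excluding the putative HT comparisons. The example below is only an illustration: the function names are hypothetical, and the data points are invented toy values chosen to show the effect, not values taken from Table 1 or Fig. 4.

```python
def r_squared(x, y):
    """Coefficient of determination (R^2) of a simple least-squares line."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when a series has no variance")
    return (sxy * sxy) / (sxx * syy)


def r_squared_with_and_without_ht(comparisons):
    """Return (R^2 on all comparisons, R^2 excluding putative HT comparisons).

    Each comparison is (host_split_age_myr, divergence, is_putative_ht).
    """
    all_x = [age for age, _, _ in comparisons]
    all_y = [div for _, div, _ in comparisons]
    kept_x = [age for age, _, ht in comparisons if not ht]
    kept_y = [div for _, div, ht in comparisons if not ht]
    return r_squared(all_x, all_y), r_squared(kept_x, kept_y)


if __name__ == "__main__":
    # Invented toy data (NOT from the paper): four clock-like comparisons and
    # two old-split comparisons with unexpectedly low divergence.
    toy = [
        (5.0, 0.05, False),
        (10.0, 0.10, False),
        (15.0, 0.15, False),
        (20.0, 0.20, False),
        (20.0, 0.02, True),
        (20.0, 0.03, True),
    ]
    r2_all, r2_without_ht = r_squared_with_and_without_ht(toy)
    print(r2_all < r2_without_ht)  # True
```

With vertically inherited elements only, divergence tracks host age and the fit is tight; low-divergence comparisons between anciently split hosts pull the points away from the line and depress R², which is the pattern the analysis looks for.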

Additional Information

How to cite this article: Scavariello, C. et al. Hybridogenesis and a potential case of R2 non-LTR retrotransposon horizontal transmission in Bacillus stick insects (Insecta Phasmida). Sci. Rep. 7, 41946; doi: 10.1038/srep41946 (2017).
