Introduction

Transposable elements (TEs) are selfish genetic elements that propagate through the population by exploiting the host sexual reproductive machinery. TEs have existed in the genomes of living organisms for hundreds of millions of years and are found in most plants and animals, with as much as 90% of the genome being TE-derived sequences (Finnegan, 1978). For example, there are 100 000 copies of LINE 1 elements and about a million copies of the Alu elements in humans (Xing et al., 2007; Lander et al., 2001). As genetic parasites, TEs exploit the host transcriptional system to increase copy number (Nuzhdin et al., 1996). Insertions into new chromosomal locations are generally deleterious because they disrupt gene expression and cause ectopic exchange and chromosomal breakage (Charlesworth et al., 1994). Although some TE-induced mutations are beneficial, it is clear that their net effect on host fitness is negative, with an estimated fitness cost in excess of 5% (Charlesworth et al., 1994). So how have they continued to parasitize the genome without self-destructing by killing the host? One explanation seems to be that the copy number increase resulting from TE activity is balanced by selection against hosts with TE-induced deleterious effects.

There are 50 families of TEs that occupy the genome of Drosophila melanogaster, a model species for studying the dynamic relationship between TEs and their host. Horizontal transmission is thought to be the primary mode of TE infection across species and populations. P-elements, for instance, have recently invaded the genome of D. melanogaster and have managed to rapidly sweep through the population within the last 50 years (Brookfield et al., 1984; Marzo et al., 2008). With such sweeps, the number of elements within a host might continue to increase and will do so at the expense of host fitness (Charlesworth and Langley, 1989; Mackay et al., 1992; Biémont, 1994; Houle and Nuzhdin, 2004). As TE survival is dependent on the host survival and reproduction, a decrease in host fitness will inevitably interfere with the chances of TEs propagating through the population. This dynamic power play between TEs and their hosts has been shown to ignite co-evolution between these players. Therefore, a self-imposed control mechanism is likely to evolve in TEs to prevent detrimental fitness effects on the host. Indeed, some elements such as Type I transposons can restrict their transposition at high copy number (Biémont, 1994; Lohe and Hartl, 1996). However, self-regulation seems not to be an evolutionarily stable strategy for retrovirus-like elements (Brookfield and Badge, 1997). For instance, copia and Doc elements do not show reduced transpositions when TE copy number increases (Nuzhdin et al., 1996).

Host fitness loss can be partially rescued if host genes controlling transpositions sweep through the infected population. However, these sweeps might not result in fixation of the host transposition suppressors but are likely to remain incomplete due to weak selection (Nuzhdin et al., 1998), which would then result in the overall low transposition rate observed in nature. This hypothesis stems from the fact that D. melanogaster genotypes differ in which TE families are active. One genotype, for example, showed transposition of hobo and I elements but none for the copia and 412 families (Harada et al., 1990). In another genotype, copia, 297, and FB4 were active but I, mdg1, gypsy, hobo, 2161, 2242, and Doc elements were not (Eggleston et al., 1988). In yet another genotype, out of the 13 TE families, only copia, I, roo, and Doc were found to be actively transposing (Nuzhdin and Mackay, 1994, 1995). These studies collectively suggest that there exists natural variation in TE stability and that the host genetic background is an important factor that influences TE activity and thus copy number maintenance in natural populations.

The gypsy element, one of the best-studied retrotransposons, offers a glimpse of genetic variation in TE transmission and host control factors. Two components are required for the activity of gypsy: maternally inherited ‘rogue’ or active elements and the permissive flamenco locus (Bucheton, 1995; Prud’homme et al., 1995; Mevel-Ninio et al., 2007). There are two classes of flamenco alleles, restrictive and permissive, present in natural populations (Pelisson et al., 1997). The restrictive alleles are dominant and repress the mobility of functional proviruses by suppressing gypsy transcripts through the Piwi-interacting RNA (piRNA)-silencing pathway (Pelisson et al., 2007). The piRNA pathway is a maternally transmitted defense mechanism that controls the proliferation of a variety of TE families, such as the TAHRE retroelement (Shpiz et al., 2007) and P transposons (Simmons et al., 2007). piRNAs are 29-nucleotide RNA molecules that interact with the silencing proteins Piwi, Aubergine, and Argonaute, which are involved in transcriptional and chromatin silencing of retrotransposons in the D. melanogaster germ line (Klenov et al., 2007). Worth noting is that piRNA has so far been shown to act by decreasing transcript levels, through either direct silencing of transcription or RNA decay (Hutvágner and Zamore, 2002; Grivna et al., 2006).

We focus on characterizing a potential host control mechanism for the copia element. Similar to the gypsy element, copia is a long terminal repeat (LTR) retrotransposon. However, unlike the TAHRE retroelement and P transposons, copia activity is suppressed in genotypes exhibiting high copia transcript levels (Nuzhdin et al., 1998). Here, we analyze the genetics of this unusual post-transcriptional suppression mechanism. Copia-Ty1 belongs to the Pseudoviridae retrotransposon family (Peterson-Burch and Voytas, 2002) and is abundant in plants, present in insects (1731 and copia families in Drosophila), but uncommon in vertebrates. Copia transpositions take place primarily in fly spermatocytes, where their transcript level is strongly elevated in the early stages of spermatogenesis (Pasyukova et al., 1997). Copia is stable in the majority of genotypes, including Oregon R (Ore), but active in several wild-type strains, such as 2b (Pasyukova and Nuzhdin, 1993). There is a strong positive correlation between copy number and transposition rate, suggesting that copia copy number is not self-regulated (Pasyukova et al., 1998). Copia, unlike gypsy, does not require maternal transmission of a ‘rogue’ copy for transposition to occur (Perdue and Nuzhdin, 2000).

We identified candidate genes and potential control mechanisms that suppress the copia element in D. melanogaster. According to Boeke and Chapman (1991; Figure 1), virus-like particles (VLPs) are assembled during the late stages of the transposition pathway and serve as sites for reverse transcription of RNA into linear and circular DNA transposition intermediates. For this reason, we quantified copia extrachromosomal DNA plasmids in 98 recombinant inbred lines (RILs) for QTL analyses. We mapped regions of the genome required for copia reverse transcription and conducted a deficiency mapping analysis to localize candidate genes within these QTL regions that interfere with copia reverse transcription. The most interesting gene represented in the analysis is the vasa gene, which is expressed in a special compartment of the germ cell called the nuage. In the cytoplasm, the nuage interacts with the PIWI protein, which is an important component of the piRNA machinery (Klattenhoff and Theurkauf, 2008). Therefore, we propose an unusual TE suppression mechanism that requires transcript sequestering during critical stages of spermatogenesis.

Figure 1

Copia transposition process. (a) Copia transcription. (b) Copia mRNA processing and translation of gag, pol, int, and RNase transcripts in the cytoplasm. (c) Translated proteins are delivered into the nucleus and VLPs are assembled from gag proteins. (d) Inside the VLPs, full-length mRNA is reverse transcribed into DNA. (e) Complex strand exchanges are required to fully replicate the two long terminal repeats (LTRs) and produce linear DNA and single and double LTR extrachromosomal plasmids. Finally, the newly formed element integrates into the host genome.

Materials and methods

Wild-type D. melanogaster lines and copia transpositions

We used the two isogenic lines 2b (Pasyukova et al., 1998) and Ore (Oregon R, Nuzhdin et al., 1996), and a set of RILs derived from them (Nuzhdin et al., 1997). Briefly, in the Ore line, no transpositions have been found over 6 years and copia positions are fixed (Nuzhdin et al., 1998). In the 2b line, a high copia transposition rate (10^−3 to 10^−2) was observed in 1991 (Pasyukova and Nuzhdin, 1993) and has continued since then (Pasyukova et al., 1998). Although copia has fixed in some sites, new transpositions segregate in the 2b line. We generated a set of 98 RILs by crossing one 2b male with an Ore female, then backcrossing F1 females with a 2b male, with repeated full-sib mating of subsequent progeny thereafter (see Nuzhdin et al., 1997 for details). We mapped copia positions (either fixed or segregating) in two replicate parental 2b males (Table 1). Accordingly, RILs vary not only in genetic composition, but also in the copy number of copia elements. Furthermore, several copia transpositions were detected during and after the construction of the RILs (see Nuzhdin et al., 1998 for details). In our analysis of the copia element in each of the RILs, we used the transcript level, copy number, and transposition activity data from Nuzhdin et al. (1998). Transcript level was evaluated by standard Northern analysis. Cytological positions were obtained by in situ hybridization of the plasmid cDM5002 containing a full-length copia element (Finnegan, 1978) to polytene salivary gland chromosomes of third instar larvae (Shrimpton et al., 1986). Probes were labeled with biotinylated dATP (bio-7-dATP, BRL) by nick translation. Hybridization was detected using the Vectastain ABC kit (Vector Labs, Burlingame, CA, USA) and visualized with diaminobenzidine. The element locations were determined at the level of cytological bands on the standard Bridges' map (Lefevre, 1976).

Table 1 Effects of deficiencies on copia plasmid concentrations

Amplification of copia plasmid from parental lines Ore and 2b

During retrotransposition, copia DNA plasmids are produced containing either a single LTR or a double LTR (dLTR). Although single LTR plasmids are thought to be abortive, it is unclear whether dLTR plasmids are abortive or direct intermediates of copia transposition (Flavell and Ish-Horowicz, 1981; Flavell, 1984). Based on the copia sequence reported by Csink and McDonald (1995), we designed primers to amplify the junction between the 5′ and 3′ LTRs of the copia dLTR plasmids (Supplementary Table 1). Six different primer pairs were designed to represent copia sequence variance (P1/P2, P1/P3, P1/P4, P1/P5, P1/P6, and P1/P7).

DNA samples were prepared from 10 males aged 1 week using the Puregene Kit (Gentra, Bangkok, Thailand). The DNA concentrations were determined on a Beckman DU-65 spectrophotometer. PCR amplifications were carried out in a 21 μl reaction containing 200 ng genomic DNA, 20 ng of the primer P1 and one of the primers P2–P7, 1 × PCR Buffer, 200 μM dNTP, 3 mM magnesium chloride, and 2.5 U of Taq polymerase (Promega, Madison, WI, USA). For each amplification, 0.15 μCi of γ-33P (Amersham, Piscataway, NJ, USA) was incorporated into primer P1 with 0.1 U of T4 Polynucleotide Kinase (New England Biolabs, Ipswich, MA, USA) in a buffer supplied by the manufacturer. Thermal cycling conditions were: 5 min at 94 °C, followed by 35 cycles of 0.5 min at 94 °C and 1 min at 60 °C, followed by a 5 min final extension at 72 °C. The amplification products were visualized after fractionation through a 6% acrylamide gel, using phosphorimaging on the Storm 8600 imager.

Electron microscopy images of VLPs in testes

As VLPs are believed to be the primary vehicles for retrotransposition (Boeke and Chapman, 1991), we took electron microscopy images of testes cross-sections to validate this assumption. Testes were dissected in phosphate buffer (pH=7.4) and fixed in 2% paraformaldehyde and 0.1% glutaraldehyde in phosphate buffer (pH=7.4) for 12 h at room temperature, then placed in 2% osmium tetroxide and cacodylate buffer for storage at 4 °C. The fixed testes were dehydrated through an ethanol (EtOH) dilution series up to 100% EtOH, infiltrated in a 1:1 EtOH/LR White mixture overnight, embedded in 100% LR White acrylic resin (Ted Pella Inc., Redding, CA, USA) in BEEM capsules, and incubated overnight at 60 °C. The blocks were then cut into ultrathin sections (75 nm in thickness) and placed on parlodion-coated nickel grids. Sections on grids were etched with 0.5% sodium metaperiodate for 10 min at room temperature to remove excess resin, and then washed five times for 10 min each on drops of 0.025 M Tris buffer (pH=7.4). Sections were viewed and photographed on a Philips CM10 electron microscope.

Copia dLTR plasmid sequencing

To verify that we were isolating copia dLTR plasmids, we sequenced the junction between the LTRs (see Figure 1d–e for the dLTR structure). DNA was extracted from individual flies, and primers P8 and P9 were designed to amplify the junction between the 5′ and 3′ LTR of the dLTR plasmid DNA (Supplementary Table 1). The PCR reaction mixture consisted of 1 × AmpliTaq Gold PCR Buffer, 2 U of AmpliTaq Gold DNA Polymerase, 100 mM MgCl2, 40 mM of dNTPs, 1 mM of each primer, and 200 ng of DNA template. Thermal cycling conditions were: 10 min at 94 °C, followed by 30 cycles of 15 s at 94 °C, 30 s at 60 °C, and 15 s at 72 °C. PCR products were visualized on a 1.5% acrylamide gel. We cut a band of the expected size (180 bp) from the gel and extracted the product using the Wizard SV Gel and PCR Clean-Up System (Promega). The extracted product was prepared for sequencing as follows: 6.4 pmol of copia P9 primer, 40 ng of template, and deionized water to bring the total volume to 18 μl. The sample was submitted to the USC Norris Core Facility for sequencing.

Semi-quantitative analysis of concentration of copia plasmid in each RIL

To estimate the relative differences in copia dLTR plasmids between the 2b and Ore parental lines, we extracted DNA as described above and semi-quantified relative copia plasmid concentrations. To measure the amount of target dLTR molecules, we compared the concentration of copia plasmids with a known quantity of the pBB54 plasmid, which contains two tandemly arranged LTRs separated by the 15 bp insertion 5′-AGGTGAAAAGGTTTC-3′ (Flavell and Ish-Horowicz, 1983). Both the total DNA and the pBB54 plasmid were amplified in the same reaction mixture, which included primers P1 and P4 with 0.15 μCi of γ-33P (Amersham) per amplification. Before amplification, the plasmid was linearized with the restriction enzyme Hind III (Promega) and diluted to 1 μg per 6000 μl and 1 μg per 60 000 μl. Each PCR amplification contained 200 ng of total DNA, 20 ng of each primer, and 1 μl of the plasmid pBB54. The intensities of the amplified products were recorded after fractionation through a 6% acrylamide gel by phosphorimaging on the Storm 8600 imager. The amount of DNA (33P) in each band was approximated from the signal level in rectangles of equal area covering the bands. The relative concentrations of the dLTR circles were estimated as the ratio of the signal intensity of the genomic DNA template band to that of the plasmid DNA template band generated from the same reaction mixture. Across 98 RILs and 2 parental lines, the average ratio was 0.472 for the plasmid concentration 1/6000 μg/μl, and 4.81 for the plasmid concentration 1/60 000 μg/μl. Correlations between copia copy numbers, transcript levels, and plasmid concentrations among RILs were calculated with the CORR procedure on data averaged between observations within each genotype (SAS Institute, 1989).
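The ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' analysis code: the function names and the example signal values are hypothetical, and the band signals are assumed to have been background-corrected already.

```python
def relative_plasmid_concentration(sample_signal, standard_signal):
    """Ratio of the dLTR band signal (genomic DNA template) to the
    co-amplified pBB54 standard band signal from the same reaction."""
    if standard_signal <= 0:
        raise ValueError("standard band signal must be positive")
    return sample_signal / standard_signal


def mean_ratio(measurements):
    """Average the ratio over replicate measurements of one line.

    `measurements` is a list of (sample_signal, standard_signal) pairs;
    the values used below are made-up numbers for illustration only.
    """
    ratios = [relative_plasmid_concentration(s, p) for s, p in measurements]
    return sum(ratios) / len(ratios)


# Two hypothetical replicate measurements of one line.
print(mean_ratio([(10.0, 20.0), (30.0, 20.0)]))
```

Because the standard is amplified in the same tube as the sample, the ratio is insensitive to tube-to-tube differences in amplification, which is the reason for co-amplifying the two templates.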

QTL mapping for copia plasmid concentration

Each RIL has been genotyped using 92 polymorphic roo TE markers on the X, second, third, and fourth chromosomes (Nuzhdin et al., 1997). We used these marker data to map factors controlling the concentration of copia plasmid. We replicated the statistical techniques of Nuzhdin et al. (1998) so that our new analyses could be directly linked to those described there. Briefly, we used the QTL Cartographer software (Basten et al., 1997) for composite interval mapping (Zeng, 1994) to test whether an interval flanked by two adjacent markers contains a QTL affecting the trait of interest. Multiple regression on markers outside of the test interval controls for the effects of QTLs that are chromosomally linked. In all, 76 cytological markers (16 markers out of 92 were completely linked with neighboring markers and excluded from the analysis; Nuzhdin et al., 1997) were used for the analysis with parameters 6 (model) and 10 (window size). The conditioning markers were chosen by stepwise forward regression. The likelihood ratio test statistic, LR, is −2ln(L0/L1), where L0/L1 is the ratio of the likelihood under the null hypothesis (there is no QTL in the interval) to the likelihood under the alternative hypothesis (there is a QTL in the interval). An empirical distribution of LR test statistics under the null hypothesis of no association between any of the intervals and the trait values was obtained by randomly permuting the trait data 1000 times and calculating the maximum LR statistic across all intervals for each permutation. LR statistics from the original data that were exceeded by the permutation maximum LR statistics <50 times are significant at P=0.05. The trait values were the ratios of the band intensities averaged over 2 (3) measurements per line as described above. The analysis of the log-transformed data yielded the same results (data not shown).
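The permutation procedure can be illustrated with a small Python sketch. It is not a re-implementation of QTL Cartographer: composite interval mapping is replaced here by a much simpler single-marker likelihood ratio (normal model, LR = n·ln(RSS0/RSS1)), and all function names are hypothetical. Only the permutation logic, taking the maximum LR across markers for each shuffle of the trait and counting exceedances, mirrors the text.

```python
import math
import random


def marker_lr(trait, genotypes):
    """Single-marker LR = -2 ln(L0/L1) under a normal model.

    Simplified stand-in for composite interval mapping:
    L0 fits one mean, L1 fits one mean per marker genotype.
    """
    n = len(trait)
    grand = sum(trait) / n
    rss0 = sum((y - grand) ** 2 for y in trait)
    if rss0 == 0:
        return 0.0
    rss1 = 0.0
    for g in set(genotypes):
        ys = [y for y, m in zip(trait, genotypes) if m == g]
        mu = sum(ys) / len(ys)
        rss1 += sum((y - mu) ** 2 for y in ys)
    if rss1 == 0:
        return math.inf
    return n * math.log(rss0 / rss1)


def permutation_maxima(trait, markers, n_perm=1000, seed=0):
    """Maximum LR across all markers for each random permutation of the trait."""
    rng = random.Random(seed)
    shuffled = list(trait)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(marker_lr(shuffled, m) for m in markers))
    return maxima


def is_significant(observed_lr, maxima, alpha=0.05):
    """Significant if fewer than alpha * n_perm permutation maxima exceed it."""
    exceed = sum(1 for m in maxima if m > observed_lr)
    return exceed < alpha * len(maxima)
```

With 1000 permutations and alpha = 0.05, `is_significant` reproduces the "<50 times" rule stated above. Taking the maximum across all markers within each permutation is what makes the threshold experiment-wise rather than per-interval.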

Deficiency mapping of copia plasmid concentration

We obtained Drosdel deficiency stocks from the Drosophila Genetic Resource Center in Kyoto, Japan. The deficiencies span the 33E–43E cytological region on the second chromosome and are maintained over SM6a balancer chromosomes. In choosing a subset of deficiencies, we attempted to minimize their number while maximizing coverage of the QTL controlling the amount of copia plasmid, as described in the Results section. Virgin females from isogenic 2b and Ore lines were crossed separately with males from each deficiency line. F1 progeny consisted of four classes: Df/2b, SM6a/2b, Df/Ore, and SM6a/Ore (Df corresponds to a deficiency and SM6a corresponds to a balancer chromosome with a dominant curly wing marker). We made two replicates of each cross, each consisting of five virgin females and three males in 5 ml vials with standard cornmeal-agar-yeast-corn syrup-malt-soy-yeast medium, and maintained them at 25 °C. F1 males from each genotype were collected and aged for 3–5 days before freezing in liquid nitrogen for DNA extractions.

The relative concentration of copia dLTR plasmids was analyzed using DNA samples from F1 males collected from the 2b and Ore deficiency crosses. Replicate DNA samples were extracted from 5 F1 males per genotype. DNA extractions were carried out in a 96-well format using the ABI Prism 6100 Nucleic Acid Prepstation (Applied Biosystems, Inc., Foster City, CA, USA), following the manufacturer's suggested DNA extraction protocol. The concentrations of extracted DNA were quantified spectrophotometrically. The Taqman real-time PCR assay was used to quantify relative concentrations of copia plasmids with the comparative method for relative quantification (Applied Biosystems). As the endogenous reference gene we chose mdg3, an LTR retrotransposon reported to be stable in both parental lines (Pasyukova and Nuzhdin, 1993). Amplification efficiencies for both target and reference genes should be between 90 and 105% (Real-Time PCR Applications Guide, Bio-Rad, Hercules, CA, USA). To optimize experimental conditions, preliminary experiments used 5 μl of each step of a 10-fold dilution series between 100 and 0.05 ng/μl of DNA extracted from the 2b parental line, targeting both mdg3 and copia plasmids.

As Taqman-based qPCR assays are very sensitive, the primers used in the semi-quantitative analysis did not produce efficient amplification of copia plasmids. We designed a new set of primers maintaining the same conceptual design in this experiment. The primers and probes were designed using Primer3 web-based software (http://fokker.wi.mit.edu/primer3/input.htm). mdg3 primers (P10/P11) and probe (P12) were chosen to flank the region between the 5′ and 3′ LTR in the genome. Copia primers P8 and P9 are described above and the copia probe (P13) was designed to hybridize to the complementary strand of the 5′ and 3′LTR junction. PCR amplifications were performed in 96-well reaction plates, using separate wells to detect copia and mdg3 sequences. We also included multiple replicates of calibrator samples and a 10-fold dilution series of a sample 2b DNA template for copia and mdg3 sequences to ensure reaction efficiencies are between 90 and 105% for each plate. The reaction mixture consisted of 12.5 μl of Taqman Universal PCR Master Mix (Applied Biosystems), 500 nM of each primer, 250 nM of each Taqman mdg3 FAM Probe (Applied Biosystems), 30–60 ng of DNA template and deionized water to bring the total reaction volume to 25 μl. The real-time PCR amplification was performed in an MJ Thermocycler (Bio-Rad) with the following conditions: 95 °C, 10 min; 95 °C, 15 s; 60 °C, 1 min; for 50 cycles.
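For readers unfamiliar with how an amplification efficiency is obtained from a dilution series, the following Python sketch shows the standard-curve calculation commonly used for this purpose: Ct is regressed on log10 of the template amount, and efficiency is taken as 10^(−1/slope) − 1. The text does not state which formula was applied, so this is an assumption, and the function names are hypothetical.

```python
import math


def amplification_efficiency(quantities, ct_values):
    """Efficiency from a dilution series via the standard-curve slope.

    Assumes the usual relation E = 10**(-1/slope) - 1, where slope is the
    least-squares slope of Ct against log10(template quantity).
    """
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ct_values) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return 10 ** (-1.0 / slope) - 1.0


def efficiency_acceptable(efficiency, low=0.90, high=1.05):
    """Check the 90-105% acceptance window quoted in the text."""
    return low <= efficiency <= high


# Idealized series: each 10-fold dilution delays Ct by log2(10) cycles,
# which corresponds to perfect doubling (efficiency of 100%).
quantities = [100.0, 10.0, 1.0, 0.1]
cts = [20.0 + math.log2(10) * i for i in range(4)]
print(round(amplification_efficiency(quantities, cts), 6))
```

A slope shallower than about −3.32 cycles per decade gives an apparent efficiency above 100%, which usually points to inhibition or pipetting error rather than better-than-doubling amplification.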

As the genomic DNA samples contain both chromosomal and extrachromosomal DNA, the target copia plasmid amplification efficiencies were low. We therefore verified the efficiency of the experimental conditions using PCR products amplified with the P8/P9 primers. The 2b, Ore, and deficiency parental genotypes were used as calibrators to compare and estimate the threshold cycle (Ct) values (that is, the fractional cycle number at which the amount of amplified sequence reaches the threshold; Bio-Rad) for the F1 samples. The amount of target sequence (copia) was normalized to the reference sequence (mdg3) and compared with the calibrator samples (parental genotypes: 2b/Ore/Def). The relative expression ratio of the target and calibrator samples was calculated as follows:

ΔCt(target) = Ct(copia, target sample) − Ct(mdg3, target sample)

ΔCt(calibrator) = Ct(copia, calibrator sample) − Ct(mdg3, calibrator sample)

ΔΔCt = ΔCt(target) − ΔCt(calibrator)

Relative expression ratio = 2^(−ΔΔCt)
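The comparative Ct calculation above can be written as a short Python function. This is a sketch for illustration: the function name and the Ct values in the example are hypothetical, and the formula assumes that both amplicons amplify with roughly 100% efficiency, which is why the base of the exponent is 2.

```python
def relative_expression_ratio(ct_copia_target, ct_mdg3_target,
                              ct_copia_calibrator, ct_mdg3_calibrator):
    """Comparative Ct (2^-ddCt) ratio of a target sample to a calibrator.

    copia is the target sequence and mdg3 the endogenous reference,
    following the definitions in the text.
    """
    delta_ct_target = ct_copia_target - ct_mdg3_target
    delta_ct_calibrator = ct_copia_calibrator - ct_mdg3_calibrator
    delta_delta_ct = delta_ct_target - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target sample reaches threshold two cycles
# earlier (relative to mdg3) than the calibrator.
print(relative_expression_ratio(24.0, 20.0, 26.0, 20.0))
```

A ratio above 1 means the F1 sample carries more copia plasmid per unit of mdg3 than the calibrator genotype, and a ratio below 1 means it carries less.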

Results

Copia plasmid in parental stocks

In the process of transposition, copia is believed to produce dLTR plasmid products that may be direct intermediates of retrotransposition (Flavell and Ish-Horowicz, 1981). To test the parental strains 2b and Ore for the presence of dLTR copia plasmid, we amplified the junction between the LTRs of the dLTR plasmid. Although copia elements and retroviruses share a similar transposition mechanism, along with structural and sequence homology, they differ in the mode of integration and excision. Retroviruses require an additional two base pair overhang for integration, whereas copia-Ty1 elements can integrate with blunt ends (Boeke and Chapman, 1991).

As seen in other Pseudoviridae retrotransposons, we detected an exact junction, without the additional two base pair insertion, between the LTRs of copia dLTR plasmids (Flavell and Ish-Horowicz, 1983). In the 2b and Ore lines, we used six different combinations of primers. In the 2b line, we observed bands of the expected sizes of 260 bp (P1/P4 and P1/P5), 310 bp (P1/P2 and P1/P3), and 330 bp (P1/P6 and P1/P7), in addition to multiple faint bands of different sizes, which may be products of neighboring or scrambled copia copies in the genome with LTRs in close proximity (Supplementary Table 2). We also observed multiple faint bands in the Ore samples, some partially overlapping with the bands amplified from 2b. However, no amplification product of the expected size was detected in the Ore parental line. These results suggest that dLTR plasmid structures may be intermediates required for copia transposition.

Electron microscopy images of VLPs in testes

As VLPs are believed to be the primary vehicle for copia plasmid formation and transposition (Boeke and Chapman, 1991), we took electron micrographs of testes cross-sections and found distinct differences between the 2b and Ore parental lines (Figure 2). We found VLPs in 2b, but not in Ore, in serial cross-sections taken from the distal parts of the testes containing maturing spermatids, specifically after the pre-individualization and before the coiling stages of maturing spermatids. As described in Rachidi et al. (2005), we saw spherical structures composed of clusters of well-defined particles 50 nm in diameter, which appear analogous to the A-type particles produced by yeast Ty1 LTR retrotransposons. This might indicate that VLPs and copia plasmids are a necessary part of the transposition pathway.

Figure 2

Electron micrographs of testes cross-sections from the parental lines. (a) and (c) Maturing spermatid from the 2b and Ore lines, respectively, with its axoneme (Ax), major mitochondrial derivative (M), and minor mitochondrial derivative (m). (b) and (d) A higher magnification of a single spermatid from 2b and Ore, respectively, with virus-like particles indicated by the arrow.

Copia plasmid in RILs

As plasmids are believed to be copia transposition intermediates, we semi-quantified copia plasmid concentrations in the 2b and Ore parental lines and found strong differences. We measured approximate concentrations of copia plasmids in each of the 98 RILs, to map the regions of the 2b and Ore genomes that underlie these phenotypic differences between the parents. The intensity of the 2b parent was 45-fold stronger than the known pBB54 plasmid concentration, whereas the Ore parental line and 70% of the RILs displayed amplification intensities similar to the pBB54 concentration (Figure 3). In the remaining 30% of RILs, the amplification intensity of the plasmid product was intermediate between the two parental strains, but closer to the stable Ore parent. Interestingly enough, it was previously observed that there was direct copia activity in five of the RILs with intermediate plasmid concentrations (Nuzhdin et al., 1998; Figure 3 shown as black bars), suggesting that these RILs may contain regions affecting copia transposition. Although our assay was only semi-quantitative, the differences were strong enough to allow us to test our hypothesis.

Figure 3

Distribution of copia plasmid concentrations among parental and recombinant inbred lines. The white and black triangles represent relative copia plasmid concentration of the parental Ore and 2b line, respectively. Black bars represent RILs exhibiting copia transpositions.

We observed a considerable amount of variation in the RILs, when comparing copia plasmid concentrations to copy number and transcript levels. There was a moderate positive Pearson product-moment correlation between copy number and plasmid concentration (r=0.285, P=0.004), shown in Figure 4 (filled circles are RILs with direct transpositions). Interestingly, the copia copy number is moderate in some RILs with high concentrations of copia plasmids, whereas some lines with high copy number had low plasmid concentrations. Similar to the relationship between copia copy number and plasmid concentration, there was a moderate positive correlation between copia transcript level and plasmid concentration (r=0.314, P=0.002). Some RILs exhibiting very high transcript levels yielded little to no detectable plasmids, whereas some RILs exhibiting relatively low transcript levels had high plasmid concentrations. These moderate correlations indicate that genomic factors, presumably encoded by the host, may have additional and possibly stronger effects on copia activity than copia copies themselves.
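As a reminder of what the reported coefficients measure, the Pearson product-moment correlation can be computed as in the Python sketch below. The authors used the SAS CORR procedure on per-genotype averages; this stand-alone version is only an illustration, the function names are hypothetical, and the example values are made up rather than taken from the RIL data.

```python
import math


def mean_by_genotype(observations):
    """Average replicate observations within each genotype.

    `observations` maps a genotype label to a list of replicate values.
    """
    return {g: sum(v) / len(v) for g, v in observations.items()}


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up example: three hypothetical lines with replicate plasmid ratios.
plasmid = mean_by_genotype({"A": [0.4, 0.6], "B": [1.0, 1.0], "C": [1.4, 1.6]})
copy_number = {"A": 20, "B": 30, "C": 40}
lines = sorted(plasmid)
print(pearson_r([copy_number[g] for g in lines], [plasmid[g] for g in lines]))
```

Since r squared is the fraction of variance shared by the two variables, coefficients near 0.3 leave roughly 90% of the variation in plasmid concentration unexplained by copy number or transcript level alone.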

Figure 4

Correlations of plasmid concentration with copy number and transcript level of copia element. Filled circles are RILs with direct transpositions.

QTL mapping of copia plasmid concentration

We measured copia plasmid concentrations in each of the panel of RILs and mapped QTLs that account for the genetic variation among these RILs. As QTL mapping outcomes might, in principle, be sensitive to the procedures of the analyses, we replicated the exact approach for mapping copia transcript levels among the same set of RILs used in Nuzhdin et al. (1998). We detected a single QTL in the pericentric region of the second chromosome (33A–43E) with substantial statistical support with an LR of 39, which is much higher than the permutation threshold of 24 (Figure 5). When we compared our current mapping results with those from Nuzhdin et al. (1998), the copia plasmid concentration QTL was positioned precisely within a QTL required for copia transposition. There are at least two (possibly three) regions of the 2b genome required for copia transpositions (Nuzhdin et al., 1998) but the copia plasmid concentration mapped only to one region: the pericentric region between the intervals 33A–43E.

Figure 5

Location of QTL for copia plasmid concentration with respect to transcript level and transposition rate. Plot from double-log likelihood ratio (LR) of composite interval mapping against recombination distance on the X (a), second (b), and third (c) chromosomes. Horizontal lines, Bonferroni-corrected LR critical value for experiment wise α=0.05.

Deficiency mapping of copia plasmid concentration

We used a quantitative deficiency complementation technique to localize more precisely the region directly affecting plasmid concentration within the 33A–43E interval on the second chromosome. We collected 18 deficiency lines covering the candidate pericentric region (33A–43E, Figure 6) to detect genetic interactions with the parental Ore and 2b alleles that influence copia plasmid concentrations. As copia is only active in males and transposition is not influenced by maternal effects (Nuzhdin et al., 1996), we made separate replicate one-way crosses between males from both parental lines and females from each deficiency line. DNA was extracted from the F1 males of 72 genotypes that fall into four classes: 2b/Def, 2b/SM6, Ore/Def, and Ore/SM6. Two-way analyses of variance were used to estimate the significance of the interaction between the parent of origin and the presence of the deficiency (Table 1). In the majority of the genotypes, copia plasmid concentration was higher in the F1 progeny with the balancer chromosome than in those with the deficiency chromosome (Supplementary Table 3).
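The interaction term of such a two-way analysis of variance can be sketched in Python for the simplest balanced case, with two parents, two chromosome types, and equal replication in every cell. This is an illustrative calculation only: the function name and the data are hypothetical, it returns the F statistic without a P value, and it is not the statistical software used for Table 1.

```python
def interaction_f(cells):
    """F statistic for the parent x chromosome interaction in a balanced design.

    `cells` maps (parent, chromosome) to a list of replicate values, e.g.
    ("2b", "Def"). Every cell must hold the same number of replicates (>= 2)
    and the within-cell variance must not be zero.
    """
    parents = sorted({p for p, _ in cells})
    chroms = sorted({c for _, c in cells})
    n = len(next(iter(cells.values())))
    if n < 2 or any(len(v) != n for v in cells.values()):
        raise ValueError("design must be balanced with >= 2 replicates per cell")

    def mean(xs):
        return sum(xs) / len(xs)

    cell_mean = {k: mean(v) for k, v in cells.items()}
    grand = mean([x for v in cells.values() for x in v])
    row_mean = {p: mean([cell_mean[(p, c)] for c in chroms]) for p in parents}
    col_mean = {c: mean([cell_mean[(p, c)] for p in parents]) for c in chroms}

    # Interaction: what remains of each cell mean after removing both main effects.
    ss_inter = n * sum(
        (cell_mean[(p, c)] - row_mean[p] - col_mean[c] + grand) ** 2
        for p in parents for c in chroms
    )
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
    df_inter = (len(parents) - 1) * (len(chroms) - 1)
    df_error = len(parents) * len(chroms) * (n - 1)
    return (ss_inter / df_inter) / (ss_error / df_error)


# Made-up values in which only the 2b/Def class is elevated.
example = {
    ("2b", "Def"): [5.0, 7.0], ("2b", "SM6"): [1.0, 3.0],
    ("Ore", "Def"): [1.0, 3.0], ("Ore", "SM6"): [1.0, 3.0],
}
print(interaction_f(example))
```

A large interaction F indicates that the effect of carrying the deficiency depends on whether the homologous chromosome comes from 2b or Ore, which is the signature sought in quantitative deficiency complementation.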

Figure 6

Black solid lines represent deficiencies spanning the pericentric region of chromosome arms 2R (regions 41–44) and 2L (regions 33–40).

F1 progeny from the crosses of two deficiency lines, however, showed significant deviations from the patterns described above. In the Df(2L)1196 line, with break points 36E6 and 37B1, the copia plasmid concentrations in the 2b/Def and Ore/Def genotype classes were higher than for the majority of the deficiencies (P<0.0001). Moreover, for both parental lines, the deficiency-containing F1 genotype classes (2b/Def and Ore/Def) had a much higher plasmid concentration than the progeny containing balancer chromosomes (2b/SM6 and Ore/SM6), a 4-fold change on the log scale (P<0.0001). We conclude that the deficiency had a strong effect on the copia reverse transcription process, irrespective of the 2b versus Ore alleles of the parent (P=0.78).

In contrast, Df(2L)1054, with the break points 35B10 and 35D4, exhibited epistatic interactions with the parental alleles. In the F1 progeny from the Ore and Df(2L)1054 cross, the genotypes containing balancers (Ore/SM6) had higher concentrations of copia plasmid than the ones containing the deficiency chromosome (Ore/Def). This is much like the typical pattern observed for the majority of the deficiencies described above. However, the F1 genotype with the 2b parental and Df(2L)1054 chromosomes (2b/Def) displayed elevated levels of copia plasmid concentration, which indicates that this region might be involved in copia plasmid regulation. The main effect of the parent (P<0.0001), deficiency versus balancer (P<0.0001), and the interaction term between them (P=0.006) were highly significant. This indicates that there is an interaction between the deficiency and the 2b-originated alleles defining copia plasmid concentrations. Out of 50 genes represented in the deficiency (Supplementary Table 4), the vasa gene seemed to be the most interesting, because it has a pivotal function in germ cell development (Styhler et al., 1998).

To test whether vasa directly influences plasmid concentrations, we conducted a complementation test using a heterozygous loss-of-function vasa mutant maintained over a CyO balancer (http://flybase.org/reports/FBst0000284.html). Following the same crossing scheme as for the deficiency crosses described in the Materials and methods section, we generated four genotype classes: 2b/vas−, 2b/CyO, Ore/vas−, and Ore/CyO. As expected, both heterozygous F1 combinations carrying the 2b chromosome (2b/vas− and 2b/CyO) displayed high levels of copia plasmids (Supplementary Table 3). However, the F1 progeny derived from the Ore parental line (Ore/vas−) did not show the pattern observed in the deficiency crosses. We previously observed lower plasmid concentrations in Ore-derived F1 males containing a deficiency chromosome (Ore/Def) than in the ones with a balancer chromosome (Ore/SM6), but the complementation results show that all Ore-derived genotypes (Ore/vas− and Ore/CyO) displayed approximately the same level of plasmids. Overall, 2b-derived genotypes (2b/vas− and 2b/CyO) had higher plasmid concentrations than the Ore-derived genotypes (Ore/vas− and Ore/CyO), a pattern similar to that observed for the Df(2L)1054 deficiency cross. The failure to completely complement may be due to the fact that the Df(2L)1054 and vasa mutant lines carry different balancer chromosomes, which makes direct comparisons challenging. Another complementation test using a loss-of-function vasa mutant with an SM6 balancer chromosome comparable to the deficiency mutants is required to validate that vasa directly influences copia activity. Although the deficiencies available to us had some gaps in coverage of the QTL support interval, they were sufficient to identify strong candidate regions that interact with alleles from the unstable 2b and stable Ore parental lines. At present, we can use these regions to infer possible genetic interactions that interfere with the general process of copia transposition.

Discussion

In this study, we explore the co-evolution between the copia retrotransposon and the host genome to understand the potential mechanism that alleviates the detrimental effects caused by retrotranspositions. The piRNA pathway is thought to serve as a pre-adaptive defense mechanism against viruses and TEs (Aravin et al., 2004, 2007; Blumenstiel and Hartl, 2005). Novel TEs spread unsuppressed until the element inserts itself into either a heterochromatic or a euchromatic piRNA-generating locus (Brennecke et al., 2007). This results in the production of piRNA, which then elicits a silencing response by the host that is generally mediated through direct transcriptional silencing or degradation of TE transcripts (see recent reviews by Peters and Meister, 2007; Buchon and Vaury, 2006; Seto et al., 2007; O’Donnell and Boeke, 2007; Klattenhoff and Theurkauf, 2008). However, copia control seems to deviate from this general mechanism. Nuzhdin et al. (1998) showed that copia transcript level QTLs did not coincide with the copia transposition QTLs, indicating that copia transposition is restricted not by transcriptional silencing but by a post-transcriptional mechanism. As copia transpositions are limited to male spermatocytes, we propose an alternative piRNA suppression mechanism that involves spermatogenesis-specific events.

During spermatocyte development in D. melanogaster, transcription ceases at an early stage, with limited transcription observed after the first meiosis (see Parvinen, 2005 for a helpful review). Accordingly, the transcripts required for completion of spermatogenesis must be produced during the early stages and preserved for use in the later stages. This kind of translational regulation is mediated by a special membrane-free microtubule-based structure called the nuage, also known as the chromatoid body in mammals (Parvinen, 2005). The nuage is composed of a variety of germ cell-related proteins, the most important being the vasa protein, a DEAD-box RNA helicase, which acts synergistically with the components from the piRNA system: Piwi, AGO3, and Argonaut. The nuage is thought to mediate translational regulation, silencing male-specific transcripts and transcripts originating from TEs through complex interactions among different families of piRNAs (Aravin et al., 2007; Kotaja and Sassone-Corsi, 2007; Lim and Kai, 2007). Therefore, we hypothesize that copia transcripts are suppressed in the nuage by a post-transcriptional mechanism during spermatogenesis.

Here, we show that the QTL mapping to the 33A–43E region associated with copia transposition (Nuzhdin et al., 1998) contributes to blocking one or more of the post-transcriptional stages of copia retrotransposition. The concentration of copia plasmids, purported to be an intermediate for copia transposition (Flavell, 1984), is strongly increased in the RILs with the 34A–43E pericentrical region originating from the active 2b parental line. We also establish a strong effect on copia plasmid concentration that is partially accounted for by the 35B10–35D4 segment (Supplementary Table 4). Among the 54 genes represented within this region, vasa stands out as a promising candidate because this protein is a critical component of the nuage-piRNA processing machinery. Vasa is a hallmark of germ cell development and belongs to a class of proteins that act as RNA chaperones (Parvinen, 2005). Vasa expression is restricted to testes and ovaries, and null mutations of this gene result in complete sterility (Styhler et al., 1998). Furthermore, there were strong fertility problems in the RILs with the pericentrical region inherited from 2b (Nuzhdin et al., 1997). Coincidentally, this commonly observed fertility problem, along with abdomen abnormalities (Nuzhdin and Pasyukova, 1991; Yoshioka et al., 1992), is among the phenotypes typical for vasa mutants (http://flybase.org/reports/FBgn003970.html). However, future studies examining the copy number variation, structure, and expression patterns of vasa are necessary to validate the direct role of vasa in copia activity of the 2b line.

Unfortunately, the deficiency lines for the most potent pericentrical regions on the second chromosome were not available in the co-isogenic deficiency collection (http://www.drosdel.org.uk/coverage.php). Accordingly, we were unable to test one interesting prediction of this hypothesis: that one of the piRNA-generating loci fails to complement the 2b deficiency in copia suppression. At present, our results provide a candidate post-transcriptional pathway that involves the nuage-piRNA machinery for copia suppression, but how this suppression is mediated remains unclear.

The only other well-studied element in the Pseudoviridae family of retroelements is 1731, which offers insight into how the nuage-piRNA machinery might silence TE transcripts. Much like copia, transcription of 1731 is strongly upregulated in the early stages of spermatogenesis, but 1731 VLPs are not assembled immediately after translation (Haoudi et al., 1995). Rather, the VLPs are observed in the nuclei during the late stages of spermatogenesis, when chromatin condensation would likely make transpositions impossible. Rachidi et al. (2005) interpreted these observations by reasoning that 1731 transcripts are perhaps silenced or sequestered in the nuage to alleviate detrimental effects of transposition during critical stages of germ cell development. Overall, this picture is consistent with the post-transcriptional blocks of copia transpositions that we have analyzed here.

Our experimental results suggest that translational silencing of copia elements through the nuage-piRNA machinery is a likely mechanism of host suppression. In our quantitative study of copia plasmid concentration (Nuzhdin et al., 1996), we detected copia plasmids in the active 2b line but detected little to no copia plasmids in the inactive Ore line. Consistent with these results, electron microscopy images of the testes showed VLPs in the 2b line but none in the Ore line. Both copia and 1731 VLPs are self-assembled in the nucleus through an autocatalytic process (Yoshioka et al., 1990), which serves as a site for mRNA reverse transcription into linear and plasmid DNA. Therefore, copia silencing is likely to occur at the post-transcriptional blocks of spermatogenesis, in which copia transcripts may be sequestered in the nuage by complex interactions among piRNAs and their associated proteins.

In the experiments reported here, we have clarified the different stages at which transpositions of Pseudoviridae TEs may be blocked by providing an analogous pathway to better understand the potential copia control mechanism. We mapped regions of the host genome controlling copia transpositions, and identified candidate genes and potential pathways that could explain how copia transposition is controlled during critical stages of germ cell development. We know that at least some of the mechanisms of TE suppression are post-reverse-transcriptional because copia transpositions are blocked at multiple stages, with several regions from different chromosomes interacting to suppress or enable copia mobility.