Transposable elements maintain genome-wide heterozygosity in inbred populations

Elevated levels of inbreeding increase the risk of inbreeding depression and extinction, yet many inbred species are widespread, suggesting that inbreeding has little impact on evolutionary potential. Here, we explore the potential for transposable elements (TEs) to maintain genetic variation in functional genomic regions under extreme inbreeding. Capitalizing on the mixed mating system of Arabidopsis lyrata, we assess genome-wide heterozygosity and signatures of selection at single nucleotide polymorphisms near transposable elements across an inbreeding gradient. Under intense inbreeding, we find systematically elevated heterozygosity downstream of several TE superfamilies, associated with signatures of balancing selection. In addition, we demonstrate increased heterozygosity in stress-responsive genes that consistently occur downstream of TEs. We finally reveal that TE superfamilies are associated with specific signatures of selection that are reproducible across independent evolutionary lineages of A. lyrata. Together, our study provides an important hypothesis for the success of self-fertilizing species.

However, it was not an easy read, and perhaps because of this, I have several major concerns, mostly related to the methodology.
Overall, I think the genomic analyses (i.e., the genotyping and TE identifications) were thorough. The only aspect I wonder about is whether the minimum read depth of 10 per called locus may have been on the low side? Authors may want to justify this choice in the paper. Related to this, please specific whether there was a minimum allele frequency required for a SNP to be called? Looking at the supplemental tables, SNPs are scored as 0 (hom) or 1 (het). Surely, for Bayescan, the actual allelic states were used? It would be good to provide these in a data repository, if this has not been done so already (in which case, please mention the link).
The statistical analyses were more difficult to follow, and I identified a few potentially major concerns: 1) Writing and presentation. Both the results and methods sections were hard to follow. In the results section, subheadings would be helpful to give some structure. Also, I would avoid too many almost literal stylistic repetitions of intro and methods, but rather formulate the results as findings, with a brief interpretation in relation to the hypotheses. For example: In support of *idea/hypothesis*, we found XX was related to YY.
2) Bayescan -have authors ignored purifying selection? Especially in the results, but also the methods, I think the Bayescan approach should be described better. The results strongly rely on the Bayescan outlier detections. However, the resulting classes of loci (under balancing selection, under divergent selection & neutral) are very much taken as a given for making certain comparisons. It took me a while to figure out that these classes had been inferred with Bayescan (probably based on the alpha parameter, but this should be explained in the paper, neither the methods L351-354 mention Bayescan, nor do the figure captions). A major concern related to this is the interpretation of outlier loci with alpha<0 as under balancing selection. In my understanding (which is limited, so please rebut if I missed something obvious) it is not really possible with Bayescan to distinguish between signatures of balancing vs. purifying selection. Thus, unless the authors have taken an analytical step enabling them to rule out purifying selection (if so, I don't think they have described it in their paper), their classification of loci with alpha<0 as being under balancing selection is misleading. I would personally expect strong signatures of purifying selection, as one would expect most mutations (including those due to TE activity) to be deleterious. If I am correct, this would somewhat weaken the overall author's interpretation that TE-associated mutations increase evolutionary potential, in which case the paper's general message should be toned down 3) Generalized linear mixed modelling. There are a couple of very confusing aspects to the modelling approached described: a) Relatedness seems to be a continuous variable but has been included in the model as a random effect. However, AFAIK, random effects are always categorical. So, it seems to me that relatedness should have been included as covariate. Despite the explanations in L318-320, I could not understand how it was calculated. What is a "sample" in this context? Relatedness to what? Is it simply the groupings arising from the PCA? b) I do not understand the logic of the competing background models, and I think it has not been explained how this works at all. Why is it better or more appropriate than using a single model with a binomial error term to explain the probability of heterozygosity (H) including fixed terms: TEsuperfam, FIS, Selection, Poly(TEdist quadratic), TEsense, SNPsense and (a choice of) their interactions? I am sure there is good reasons for the taken approach, but the reader needs much more guidance. It is currently impossible to judge whether the taken modelling approach is appropriate or not. 4) FIS. The authors use FIS as an estimation of inbreeding levels, and it is reassuring that the highest values correspond to populations known to display high selfing rates. Still, I disagree that we truly have a gradient here. In line with what we know from earlier work, most populations are either highly outcrossing, or highly selfing. The FIS values reflect this, as data points are really sparse between FIS=0.5 and FIS =0.9. The patterns in Fig 4 D and E remain extremely interesting regardless, with clearly increased heterozygosity at all FIS levels for loci under purifying (or balancing) selection. 5) Tests of significance. Authors mainly report F values, but they seemingly mention only one degree of freedom (I assume the residual df are not mentioned). For example -L136 F13=4.98**. Meaningless if we do not know the residual df. Moreover, the * are not explained (I assume ** means p<0.01). In my understanding, F values indicate whether fixed effects are significant or not. However, if such fixed effects have multiple levels, they do not inform about differences between the factorlevels, for which post hoc comparisons are required. It only became clear after looking at Table S5 that in reality the significance of factor level estimates are evaluated (z-tests).
6) Related to the previous point, I struggled to understand what the intercept represented in the models. For example, there are 13 TE superfamilies in figure 3A, and 13 factor levels listed in Table  S5, plus an intercept (i.e., 14 levels). It is probably my ignorance, and has a very simple explanation, but it would be good to include this explanation in the paper. Likewise, I struggled to understand how a binomial/binary Y can be zero-inflated. 7) TE superfamily. Could an individual SNP be affected by more than one TE superfamily? I think the models assume that SNP heterozygosity can only be affected by its closest TE, but how realistic is this? Could there be interactive effects of TEs? I also have a few conceptual concerns: 1) Inbreeding depression/purging as a strategy. In my view, urging is not a strategy, it just happens, as a consequence of inbreeding. L45deleterious alleles may be purged, but not complete loci. In the discussion, there is a misconception that repeated selfing increases the risk of inbreeding depression (L214-216). Given purging, the opposite is true.
2) The motivation for this study is understanding "the evolutionary success of selfers". While I completely agree that it is intriguing that selfers manage to occupy a wide range of habitats, they might not necessarily achieve this through adaptation. The alternative view -that selfers are evolutionary dead-ends, but may evolve repeatedly due to short term advantages -also requires attention. Under this scenario, selfers do not need to adapt per se. They may arise pre-adapted to the prevailing conditions (from a locally adapted outcrossing ancestor), or many formed selfing lineages may arise, after which selective filtering allows establishment of those lineages that are best-adapted to the prevailing conditions. Either way, such lineages may eventually go extinct when those conditions change due to limited adaptive potential. It would be good to discuss whether the observed patterns of increased heterozygosity in highly inbred individuals could actually reflect mutational meltdown rather than increased adaptive potential. This would at least explain the finding of many loci being under purifying selection (interpreted by the authors as balancing selection). The work by Emma Goldberg et al (Science 2010) comes to mind. In any case, I would play down L191-194 a bit. L302 -"only those" -only those SNPs or only those RAD loci? L232-239 This paragraph feels like a loose end not tied in with any findings. Should it be integrated into the next paragraph?
Reviewer #2: Remarks to the Author: In this manuscript by De Kort and colleagues the authors have reanalyzed RADseq data from 91 individuals of Arabidopsis lyrata (from US and EU, a total of around 5900 SNPs) with contrasted breeding systems, in order to test the potential role of transposable elements (TEs) in maintaining genetic variation in inbreeding conditions. While the biological hypothesis is very novel and promising, I see maybe one major drawback in this study. The study investigates the proximity of SNPs with given TEs based on the lyrata genome annotated with REPET. However it is well documented that a plethora of TE insertion polymorphisms (TIPS) are found in plant populations, even in narrow geographical areas (see in Nature Comm 2019 the works by Quadrana et al for Arabidopsis thaliana https://doi.org/10.1038/s41467-019-11385-5 and Carpentier et al, for Oryza sativa https://doi.org/10.1038/s41467-018-07974-5 for instance). In this context, how can the authors estimate the precise location of the TE families they investigate?
Minor comments: Line 209: Overstatement: The simultaneous enrichment of biological regulation and stressresponsiveness found here suggests that TEs are involved in genome-wide regulation and evolution of adaptive traits. Line 255: not described in the Results section! Line 261: I think the conclusions in this paragraph are not supported by the data. Could the authors comment please?
Reviewer #3: Remarks to the Author: Review of "Transposable elements maintain genome-wide heterozygosity in inbred populations" This study by De Kort et al reports association between heteozygosity estimated from RADseq-derived SNPs across populations of Arabidopsis lyrata and the presence of different transposable elements (TEs) in the reference genome. The biological system is well established for its gradient of inbreeding across North America due to transitions towards selfing, which is here used to discuss possible contrasts between TE-flanking and "background" regions of the genome. The manuscript is concisely written (sometime to such an extent that superficial descriptions hamper proper understanding; see below) and, bringing fresh perspectives on long-held questions, appears typical of what is published in Nature Communication. Although the introduction on selfing and purging offers valuable background (despite some seminal work by Willis on Mimulus being ignored), I found transposable elements being more superficially introduced. The supposedly 50x higher mutation rate around TEs should be clearly justified by literature (currently, based on loosely connected references). Also, available knowledge about TEs in A. lyrata and their evolution across populations shall be better introduced and used in the discussion (e.g. early insights from Gaut's lab a decade ago); also further justifying remaining gaps in our understanding that is here addressed. A crucial issue that should be considerably clarified is the possible limits of the approach. First, that RADseq offers accurate genotyping (and particularly heterozygosity) should be validated with additional details on the here-used protocol (e.g. coverage) and in depth post hoc analyses. Second, and most importantly, TEs are known as mutagenic by themselves and polymorphic copies are therefore expected to segregate with the species, although only sites flanking TE copies inserted in the reference genome of A. lyrata are here under scrutiny. How was it taken into consideration that some TE loci may be absent and, even more pervasive, that quite many sites free of TEs in the reference can be populated by inserted copies across populations under scrutiny? How does such a referencebias affect conclusions regarding SNPs flanking TEs? Finally, selection tests shall be spelled out and justified as conservative indirect estimates based e.g. on literature (the very limited neutral loci apparent on Fig.4a suggests non-conservative estimates). Generally, figures and their legends should be better linked to the core text and made comprehensive.

Reviewer 1:
This paper examines whether transposable elements play a role in "generating" genetic variation (heterozygosity) through increased mutation rates, thereby increasing adaptive potential. Specifically, the authors test whether highly inbred populations may benefit from this process (as a mechanism to compensate for the reduced overall genetic diversity due to inbreeding).
Using a RAD-seq approach, authors find evidence for elevated heterozygosity downstream of some TE superfamilies, and indications that this increased heterozygosity may be associated with functional genes through detection of signatures of selection (outlier detection) and gene ontology/enrichment analyses. In general, I think the paper addresses a timely and original question, takes an original approach and has a message that is of broad interest. However, it was not an easy read, and perhaps because of this, I have several major concerns, mostly related to the methodology.

1.
Overall, I think the genomic analyses (i.e., the genotyping and TE identifications) were thorough. The only aspect I wonder about is whether the minimum read depth of 10 per called locus may have been on the low side? Authors may want to justify this choice in the paper. Related to this, please specific whether there was a minimum allele frequency required for a SNP to be called? Looking at the supplemental tables, SNPs are scored as 0 (hom) or 1 (het). Surely, for Bayescan, the actual allelic states were used? It would be good to provide these in a data repository, if this has not been done so already (in which case, please mention the link).
Response: While the minimum read depth of 10 per locus is far from unusual for RADseq projects (e.g. a minimum depth of 5x per locus in Kim et al. 2017 in Nature Ecology & Evolution doi.org/10.1038/s41559-017-0235-2, and minimum read depth of 10x in Yan et al. 2020 in Nature Ecology & Evolution doi.org/10.1038/s41559-019-1081-1), the average read depth in our study is significantly higher (European mean locus depth = 73.18; North American mean depth = 60.09). We now provide supporting figures displaying the range of read depths across the SNP dataset. The minimum allele frequency was set to 0.05, which reduces false SNP calls as well as rare deleterious SNPs that could impact the interpretation of our outlier detection methods (please see comment 3 for more details on this issue). No missing data were allowed to avoid arbitrary imputation methods that could impact the accuracy of our data. For Bayescan, the actual allelic states were used; and we now add supporting tables with the vcf data containing this information (Tables S3 and S4). These vcf files were used for FIS calculations, PCadapt, and Bayescan.
Methods New L372-374: "We limited the dataset to highly qualitative SNPs through maintaining (i) minimum allele frequencies of 0.05, (ii) no missing data, and (iii) an average read depth of 73.18 (European samples) and 60.09 (North American samples) per SNP (Fig. S4)." 3. Bayescan -have authors ignored purifying selection? Especially in the results, but also the methods, I think the Bayescan approach should be described better. The results strongly rely on the Bayescan outlier detections. However, the resulting classes of loci (under balancing selection, under divergent selection & neutral) are very much taken as a given for making certain comparisons. It took me a while to figure out that these classes had been inferred with Bayescan (probably based on the alpha parameter, but this should be explained in the paper, neither the methods L351-354 mention Bayescan, nor do the figure captions). A major concern related to this is the interpretation of outlier loci with alpha<0 as under balancing selection. In my understanding (which is limited, so please rebut if I missed something obvious) it is not really possible with Bayescan to distinguish between signatures of balancing vs. purifying selection. Thus, unless the authors have taken analytical step enabling them to rule out purifying selection (if so, I don't think they have described it in their paper), their classification of loci with alpha<0 as being under balancing selection is misleading. I would personally expect strong signatures of purifying selection, as one would expect most mutations (including those due to TE activity) to be deleterious. If I am correct, this would somewhat weaken the overall author's interpretation that TE-associated mutations increase evolutionary potential, in which case the paper's general message should be toned down.
Response: We agree that purifying selection can give rise to signatures of apparent balancing selection when genetic variants are slightly deleterious. TE insertions have indeed been demonstrated to be subject to intense purifying selection due their overall deleterious impact on gene functioning (e.g. Baduel et al. 2021). Because mutations caused by TEs may have similar deleterious consequences, we have now (i) explored the possibility of purifying selection, (ii) clarified the balancing selection methodology, and (iii) devoted part of the discussion to this issue. Distinguishing purifying from balancing selection is a challenging task, particularly on low-density SNP sets where it is impossible to look at the shape of the allele frequency spectrum within genomic regions. However, a characteristic difference between signatures of balancing and purifying selection is the behavior of non-synonymous mutations (e.g. Charlesworth & Eyre-Walker 2008; Hernandez et al. 2019). Specifically, purifying selection more readily constrains non-synonymous alleles to low frequency than balancing selection, and can therefore be identified as low-frequency signatures of balancing selection. We now consider SNPs with negative alpha values as putative candidates of purifying selection if their alleles are rarer than expected under neutrality. More specifically, we consider SNPs with signatures of balancing selection as candidates of purifying selection if their minimum allele frequency (MAF) was (i) smaller than the MAF of neutral SNPs, and (ii) smaller for non-synonymous relative to synonymous codon usage. While SNPs near CACTA elements appear to be associated with some purifying selection, we found low-frequency signatures of balancing selection to be rare, particularly across the American populations (please see New Fig  Results New L172-182: "Because signatures of purifying selection can resemble those of balancing selection, we also explored the behavior of non-synonymous and synonymous substitutions in our set of putative balancing SNPs. Specifically, purifying selection more readily constrains non-synonymous substitutions to low frequency than balancing selection, and can therefore be identified as lowfrequency signatures of balancing selection. While SNPs near CACTA elements appeared to be associated with low-frequency signatures of balancing selection, which could point to purifying selectin acting on deleterious alleles, we found low-frequency signatures of balancing selection to be generally rare, particularly in the North American lineage (Fig. S3). Genomic regions with signatures of balancing selection were also significantly enriched for processes that have been frequently associated with balancing selection (e.g. response to biotic stimuli; Table S8), suggesting that our findings are not strongly impacted by purifying selection." Discussion New L325-335: "Our findings do not suggest a significant role for purifying selection and mutational meltdowns as drivers of genetic variation across the genome of self-fertilizing A. lyrata. While lethal TEs are rapidly purged from a population (Baduel et al. 2021), slightly or conditionally deleterious alleles arising near TEs are more difficult to purge, and may reach higher frequencies resembling signatures of balancing selection (Fijarczyk & Wiesław Babik 2015). The lack of lowfrequency signatures of balancing selection (Fig. S3) suggests that purifying selection does not considerably contribute to genome-wide variation across American A. lyrata populations, with the enrichment of processes related to signaling and biotic responses particularly supporting a role for balancing selection (Table S8). In A. thaliana, genome-wide signatures of balancing selection are often associated with resistance and stress responses (Wu et al. 2017), and with temporally fluctuating selection pressures acting upon defense metabolism and signaling genes (Kerwin et al. 2015)." Methods New L405-413: "To assess the role of purifying selection in producing negative alpha values, we explored the tendency of SNPs with negative alpha values to be associated with lowfrequency non-synonymous variants. If alleles at these loci were rarer than expected under neutrality, we considered these SNPs candidates of purifying selection (Charlesworth & Eyre-Walker 2008;Fijarczyk & Wiesław Babik 2015;Hernandez et al. 2019)

. Specifically, we considered SNPs with signatures of balancing selection as candidates of purifying selection if their minimum allele frequency (MAF) was (i) smaller than the MAF of neutral SNPs, and (ii) smaller for non-synonymous as compared to synonymous codon usage. Codon usage (4-fold and 0-fold positions in the A. lyrata reference genome) of individual SNPs was obtained using the NewAnnotateRef.py script (Williamson et al. 2014)."
New references: 4. Generalized linear mixed modelling. There are a couple of very confusing aspects to the modelling approached described: a) Relatedness seems to be a continuous variable but has been included in the model as a random effect. However, AFAIK, random effects are always categorical. So, it seems to me that relatedness should have been included as covariate. Despite the explanations in L318-320, I could not understand how it was calculated. What is a "sample" in this context? Relatedness to what? Is it simply the groupings arising from the PCA? b) I do not understand the logic of the competing background models, and I think it has not been explained how this works at all. Why is it better or more appropriate than using a single model with a binomial error term to explain the probability of heterozygosity (H) including fixed terms: TEsuperfam, FIS, Selection, Poly(TEdist quadratic), TEsense, SNPsense and (a choice of) their interactions? I am sure there is good reasons for the taken approach, but the reader needs much more guidance. It is currently impossible to judge whether the taken modelling approach is appropriate or not.
Response: We confirm that the random effect "Relatedness" consists of categorical groups obtained by PCadapt. This was explained in the Methods, but we now also mention it in the Results section and in the caption of a new table so as to facilitate the interpretation of the results (New Table 1).
Our modeling choices were based on two criteria: (i) providing a logical stepwise approach matching the hypotheses, and (ii) avoiding model convergence issues and minimizing model complexity. We thus first identified those TE superfamilies that significantly impact heterozygosity under inbreeding (cfr. our main hypothesis), which considerably reduced the degrees of freedom in subsequent models from 13 TE superfamilies (x interaction terms) down to one grouping variable discriminating between relevant TE superfamilies (group1) and background TE superfamilies (group2). We now also simplify most of the models, rendering them more compatible with the hypotheses. Specifically, we removed the "TE_Sense" variable from our first model, since it was not connected to our hypotheses and did not significantly impact heterozygosity. In addition, because distance to nearest TE did not contribute to Model 1, it was removed from subsequent models. Note that these new adjustments have not changed our key findings. One minor result that was difficult to explain in our previous version (i.e. the behavior of heterozygosity with distance to nearest TE), however, was not significant in these revised models. We changed the discussion on the distance effect accordingly (New Lxxx, see below). Finally, the use of competing models (HTE vs. H0) served to avoid complicated three-way interactions: "existing two-way interactions × TE group" and to facilitate the interpretation of our findings (see adjusted Methods section below for more details). To emphasize the relations between our models and our research questions, we also added a table that summarizes the models and their corresponding hypotheses. We hope that the rationale behind our approach is now clearer to the readers.
Methods New L416-428: "We followed a stepwise modeling approach that allowed testing our hypotheses consecutively without running complex models with many parameters. We first ran an explorative mixed model (Model 0,  (Fig. 3A,B) Response: We now explicitly test our TE effect in the self-fertilizing vs. outcrossing samples (Models 0a and 0b, please see reply to comment 4), and added a panel (Fig. 3B) showing that heterozygosity near Copia and Harbinger elements strongly differs between the highly inbred vs. outcrossing American populations. This could suggest a role for mating system as a potential driver of heterozygosity-TE relationships. To explore this, we now disentangle the potential role of mating system strategy from other processes shaping inbreeding coefficients through subsetting the European individuals (all characterized by a predominantly outcrossing mating system) according to their inbreeding coefficients. Specifically, we divided the European dataset into individuals from populations that likely faced stronger demographic bottleneck across glacial periods (FIS > -0.3) vs. individuals with FIS < -0.3) (Mattila et al. 2017). We found a marginal TE Effect on heterozygosity in Europe (see Table S9 and New Fig. S1), suggesting that demographic processes rather than mating strategy govern TE-heterozygosity relationships. Response: The subscript with our F values do not refer to degrees of freedom but to number of parameters, which is provided by the ANOVA tables for generalized models (as opposed to general models, where degrees of freedom are provided). We now removed these subscripts to avoid confusion, and mention in corresponding figure captions that significance of factor level estimates are provided in the corresponding supplementary tables.   figure 3A, and 13 factor levels listed in Table  S5, plus an intercept (i.e., 14 levels). It is probably my ignorance, and has a very simple explanation, but it would be good to include this explanation in the paper. Likewise, I struggled to understand how a binomial/binary Y can be zero-inflated.
Response: The intercept represented the first TE superfamily in alphabetical order (i.e. CACTA). We kept this arbitrary choice by the statistical software because the CACTA superfamily was characterized by intermediate heterozygosity levels that represent genome-wide heterozygosity (please see Fig. 3A). We clarified this in our Methods section and in the caption of Fig.3. Given the large number of factor levels, a full statistical contrast of all factor levels would be difficult to interpret and not offer more clarity. Note that our original model also erroneously included TEs with unknown superfamily (SuperfamilyTE_NA); we have now excluded this group from our dataset (hence the total number of TE superfamilies is now 13 including CACTA). We agree that binomial data cannot be statistically zero-inflated, but rather used this term to emphasize the high proportion of zeros in our dataset (because most SNPs are homozygous, value "0"). To avoid confusion, we have removed the zero-inflation statement from the text (New Lxxx).

Fig. 3 New L542-545: "The CACTA superfamily was associated with intermediate levels of heterozygosity in the American
A. lyrata lineage, and therefore taken as a reference group in the mixed model output of Table S6." 8. TE superfamily. Could an individual SNP be affected by more than one TE superfamily? I think the models assume that SNP heterozygosity can only be affected by its closest TE, but how realistic is this? Could there be interactive effects of TEs?
Response: Many TEs are indeed clustered into TE islands, and relationships between SNPs and their closest TE may represent an indirect relationship with another, more distant, TE. We now discuss the possibility of interacting TE effects in our manuscript, while emphasizing that our findings represent the genome-wide impact of TEs on genetic variation under inbreeding, irrespective of TE clustering behavior. TE interactions might indeed be expected to hide TE effects of individual TE superfamilies rather than generate false positive findings. We therefore believe our results are conservative in interpreting the effect of individual TE superfamilies.

Discussion New L344-349: "Second, many TEs cluster into TE islands (Schrader et al. 2014, De Kort et al. 2021), corresponding to the idea that functional genomic regions are frequently revisited by TEs (Baduel et al. 2021). In TE islands, the associations between heterozygosity at specific genetic variants and their nearest TE may be impacted by a slightly more distant TE. Such TE interactions may have reduced our ability to detect significant effects of individual TEs."
New reference:

Introduction New L45: "Purging of alleles associated with inbreeding depression therefore is thought to contribute to the success of self-fertilizing species."
Introduction New L48: "While purging can be an effective strategy to safeguard …"  "While purging can safeguard…" Discussion New L244: "Extreme inbreeding levels typically increase the risk for inbreeding depression,…".

10.
The motivation for this study is understanding "the evolutionary success of selfers". While I completely agree that it is intriguing that selfers manage to occupy a wide range of habitats, they might not necessarily achieve this through adaptation. The alternative view -that selfers are evolutionary dead-ends, but may evolve repeatedly due to short term advantages -also requires attention. Under this scenario, selfers do not need to adapt per se. They may arise pre-adapted to the prevailing conditions (from a locally adapted outcrossing ancestor), or many formed selfing lineages may arise, after which selective filtering allows establishment of those lineages that are best-adapted to the prevailing conditions. Either way, such lineages may eventually go extinct when those conditions change due to limited adaptive potential. It would be good to discuss whether the observed patterns of increased heterozygosity in highly inbred individuals could actually reflect mutational meltdown rather than increased adaptive potential. This would at least explain the finding of many loci being under purifying selection (interpreted by the authors as balancing selection). The work by Emma Goldberg et al (Science 2010) comes to mind. In any case, I would play down L191-194 a bit.
Response: We thank the referee for pointing us into the direction of purifying selection and mutational meltdowns. As detailed in our reply to comment 3 (on purifying selection) and comment 5 (selfers vs outcrossers), we now explore (i) the behavior of low-frequency non-synonymous mutations (which are more likely to be under purifying selection than more common nonsynonymous mutations), and (ii) the TE effect across the outcrossing populations in Europe (no mating system variation but a partial FIS gradient ranging from -0.237 to -0.637). We found limited signatures of purifying selection (please see comment 3 for details). In addition, we found that both in the American and European lineage, Copia and Harbinger elements were associated with increased heterozygosity where inbreeding levels where relatively high (New Fig. 3A and 3B), suggesting that (i) the TE effect is linked to levels of homozygosity (which in turn are influenced by demographic processes) rather than to the mating system, and (ii) signatures of balancing selection unlikely reflect an ongoing mutational meltdown. Note that while these additional findings suggest that the TE effect associated with inbreeding also arises in outcrossing species, it may still partially explain the success of self-fertilizing species (at least under the assumption that mutational meltdown is not an issue), through facilitating evolution under genome-wide genetic depletion. The idea that our signatures of selection reflect pre-adaptation and that self-fertilizing A. lyrata populations are not capable of evolving is interesting, but not in line with our finding of elevated genetic variation in stress-responsive genes, which suggest adaptive potential (see New L195-197).
Finally, we now refer to a study from Wright et al. (2002) specifically addressing the issue of mating system on rates and patterns of molecular evolution in inbred and outbred Arabidopsis, where no indications of mutational meltdown associated with a long history of self-fertilization could be identified. We also recognize and discuss the possibility of mutational meltdown in our paper.
Introduction New L31-33: "Arabidopsis thaliana, for instance, a self-compatible species with extreme inbreeding levels across its natural range, does not show widespread signatures of increased extinction risk but instead exhibits significant adaptive potential (Wright et al. 2002;Atwell et al. 2010;Platt et al. 2010 New Phytol. 198, 386-397 (2013). 15. L232-239 This paragraph feels like a loose end not tied in with any findings. Should it be integrated into the next paragraph? Response: We agree and now integrated our findings with respect to adaptive potential in this paragraph (New L267-284).

Reviewer 2:
In this manuscript by De Kort and colleagues the authors have reanalyzed RADseq data from 91 individuals of Arabidopsis lyrata (from US and EU, a total of around 5900 SNPs) with contrasted breeding systems, in order to test the potential role of transposable elements (TEs) in maintaining genetic variation in inbreeding conditions. While the biological hypothesis is very novel and promising, I see maybe one major drawback in this study. The study investigates the proximity of SNPs with given TEs based on the lyrata genome annotated with REPET. Response: We realize that TE insertions and other indels in genomes other than the reference genome likely are abundant across the sampling range, so we have an incomplete image of the true extent that TEs might enhance nearby heterozygosity. Nevertheless, several arguments justify the use of reference genome positions across our dataset for the purpose of our study. First, recent TE insertions in between our nearest neighbor SNP-TE pairs are expected to reduce the power of our analyses rather than generate false positive results. Second, our hypotheses focus on the impact of TE superfamilies on genome-wide heterozygosity and the orientation of SNPs relative to their nearest TE (downstream vs upstream). The precise positions of the SNPs and their nearest TE therefore is of lesser importance for most of our analyses. Third, as the reference genome is American, the largest impact of non-reference TE insertions could be expected in the European dataset, yet we found repeatable patterns for the American and European dataset. Finally, a recent study by Baduel et al. (2021) in A. thaliana demonstrated that the majority of TIPs are found at a very low frequency as a result of intense purifying selection acting upon deleterious TE insertions, unless they inserted into functional genomic regions. Such functional regions are frequently revisited by TEs, and explain the existence of TE islands near functional genes. We refer to our reply to comment 8 on the potential effect of TE interactions on heterozygosity. We also added some lines to our manuscript to show that we recognize the ubiquity of TIPs and that they may have created noise in our findings. Minor comments:

17.
Line 209: Overstatement: The simultaneous enrichment of biological regulation and stressresponsiveness found here suggests that TEs are involved in genome-wide regulation and evolution of adaptive traits. Response: We removed this sentence because it is redundant given the sentence before (New L277-278).
18. Line 255: not described in the Results section! Response: Indeed; we now described these compelling findings in the results section. Results New L187-189: "Interestingly, SNPs downstream of (i) MuDR TEs were enriched for processes involved in photosynthesis and growth, (ii) Helitron were typically associated with abiotic stresses (heat, light, drought), and (iii) MITEs with wounding and disease (Table S8)."

19.
Line 261: I think the conclusions in this paragraph are not supported by the data. Could the authors comment please? Response: The relationship of Copia and Harbinger elements with inbreeding suggests their involvement in specific demographic scenarios. In addition, the relationship of (i) MuDR TEs with photosynthesis and growth, (ii) Helitron with abiotic stresses (heat, light, drought), and (iii) MITEs with wounding and disease, suggests their involvement in specific environmental scenarios. We nevertheless rephrased so as to make clear that further research is required to consolidate these findings. New L306-307: "While calling for further research, our results collectively suggest that each TE superfamily can facilitate evolution in particular environmental and demographic situations. Specifically, ….."

Reviewer 3:
Review of "Transposable elements maintain genome-wide heterozygosity in inbred populations" This study by De Kort et al reports association between heterozygosity estimated from RADseqderived SNPs across populations of Arabidopsis lyrata and the presence of different transposable elements (TEs) in the reference genome. The biological system is well established for its gradient of inbreeding across North America due to transitions towards selfing, which is here used to discuss possible contrasts between TE-flanking and "background" regions of the genome. The manuscript is concisely written (sometime to such an extent that superficial descriptions hamper proper understanding; see below) and, bringing fresh perspectives on long-held questions, appears typical of what is published in Nature Communication.

20.
Although the introduction on selfing and purging offers valuable background (despite some seminal work by Willis on Mimulus being ignored), I found transposable elements being more superficially introduced. The supposedly 50x higher mutation rate around TEs should be clearly justified by literature (currently, based on loosely connected references). Also, available knowledge about TEs in A. lyrata and their evolution across populations shall be better introduced and used in the discussion (e.g. early insights from Gaut's lab a decade ago); also further justifying remaining gaps in our understanding that is here addressed.
Response: We completed our selfing and purging introduction with a key study by Willis (1999). We also toned down the mutation rate statement, as 50fold increases are rare. Increases of 10x and 15x are more common and explicitly shown in Wicker et al. 2016 andHabig et al. 2021. We also clarified the role of transposable elements in evolution, with an emphasis on what is known in A. lyrata. Indeed, Gaut's lab represents much of what is known about the demographic and evolutionary history in A. lyrata and the role of TEs herein. We thus added two more references involving the work of Gaut's lab. New L38: " (Lynch et al. 1995;Willis 1999;Coron et al. 2013;Kyriazis et al. 2020)" New L62-65: "Transposable elements (TEs) can rapidly generate genetic variation through increasing mutation rates of upstream and downstream flanking genomic regions (Schrader et al. 2014;Lu et al. 2017;Schrader and Schmitz 2019;Habig et al. 2021); several studies have observed increases over 10x the background mutation rate (Wicker et al. 2016;Habig et al. 2021)." New L102-112: "In the mixed mating species Arabidopsis lyrata, self-fertilizing populations are characterized by increased TE copy numbers and higher TE frequencies than outcrossing populations, aligning with models of neutral evolution and ectopic recombination (Bonchev & Willi 2018). More specifically, the higher levels of sequence homozygosity in self-fertilizing populations reduces the rate at which nonhomologous sequences recombine, thereby generating harmful chromosomal rearrangements. As a consequence, self-fertilization can facilitate genome-wide TE accumulation, potentially providing considerable opportunities for evolution in genomic sequences flanking TEs. However, while inbreeding may affect TE dynamics and evolution in A. lyrata (Lockton et al. 2010;Bonchev & Willi 2018;Lockton et al. 2018), the impact of inbreeding on genetic variation and evolution surrounding TEs has never been explicitly tested." New references: • . We now provide supporting files demonstrating the range of read depths across the SNP dataset. The minimum allele frequency was set to 0.05, which further reduced false SNP calls as well as rare deleterious SNPs that could impact the interpretation of our outlier detection methods. No missing data were allowed to avoid arbitrary imputation methods that could impact the accuracy of our data. We are not sure which post hoc analyses should be conducted to validate the approach. Given specific suggestions of such analyses, we would be happy to implement additional validation of the dataset if still deemed necessary. However, we do note that this RAD-seq dataset has been used successfully in several peer-reviewed scientific manuscripts (Buckley et al. 2018, Buckley et al. 2019.

Fig. S4 Distribution of mean read depth and mean genotype likelihood across (a) European samples and (b)
North American samples for SNPs included in the study.

22.
Second, and most importantly, TEs are known as mutagenic by themselves and polymorphic copies are therefore expected to segregate with the species, although only sites flanking TE copies inserted in the reference genome of A. lyrata are here under scrutiny. How was it taken into consideration that some TE loci may be absent and, even more pervasive, that quite many sites free of TEs in the reference can be populated by inserted copies across populations under scrutiny? How does such a reference-bias affect conclusions regarding SNPs flanking TEs?
Response: We fully agree that variable TE insertions (TIPs) are likely to populate our dataset. Importantly, however, a recent study in Arabidopsis thaliana (Baduel et al. 2021) highlights that the majority of recent TE insertions are deleterious and rare, in agreement with purifying selection against TEs that could insert between our SNPs and their nearest reference TE. As a result, it is unlikely that rare and randomly distributed non-reference TEs would have notable effects on our conclusions. Even when this is the case, it would likely just reduce the power of our analyses for detecting effects of TE superfamilies. As for reference TEs being absent across the dataset, we assume limited impacts on our result for two reasons: (1) common TE insertions typically re-visit genomic regions already occupied by TEs because genetic variation in these genomic regions provides adaptive potential (Baduel et al. 2021). This is reflected by our results because we found predictable patterns of natural selection between independent evolutionary lineages, suggesting that reference TEs involved in adaptive evolution are common across the dataset; (2) similar to nonreference TEs, reference TEs that are absent across the range are expected to be rare, and to add noise to our models rather than generate false positive findings. We discuss this issue in more depth (see also our reply to comment 16).

Reviewers' Comments:
Reviewer #1: Remarks to the Author: I reviewed the previous version of this interesting manuscript, and was happy to receive the revision. First, I want to compliment the authors on their very thorough rebuttal and revision, and the way they have taken my feedback (and that of the other reviewers) into consideration. I think the manuscript has improved a lot in terms of clarity and is now a much easier read.
I think there is only one major remaining issue.
The authors have attempted to explain better why they interpret the patterns in their data as signatures of balancing rather than purifying selection. However, I'm still confused. In the methods for example, the explanation makes sense until about L392. Then they lose me. L393-395: "Specifically, we considered SNPs with signatures of balancing selection as candidates of purifying selection if their minimum allele frequency (MAF) was (i) smaller than the MAF of neutral SNPs, and (ii) smaller for non-synonymous as compared to synonymous codon usage." To me, this wording implies that the authors consider negative alpha values as signatures of balancing selection, unless proven otherwise. This seems weird to me, as purifying selection should generally be more frequent than balancing selection. It is not until the discussion that there is some explanation (Fijarczyk & Wiesław Babik 2015).
Although it would be good to incorporate that explanation also in the methods and wherever else appropriate, I would have much less problems with the logic, if it were phrased as follows: "Specifically, we considered SNPs with signatures of selection (negative alpha values in Bayescan analysis) as candidates of purifying selection if their minimum allele frequency (MAF) was (i) smaller than the MAF of neutral SNPs, and (ii) smaller for non-synonymous as compared to synonymous codon usage. In any other cases, we considered SNPs with negative alpha values as candidates to be under balancing selection." Maybe you could also provide numbers (in the results) on the number/proportion of SNPs with negative alpha that were considered as under purifying vs. balancing selection. The discussion could then also discuss if the number of SNPs under putative balancing selection is normal/higher than expected, and why this may be. Related to this, in my opinion, the discussion should more carefully consider what may have caused this signature of balancing selection. The Fijarczyk & Wiesław Babik 2015 explanation is good, but what about other potential explanations? Heterozygote advantage (overdominance)? Or could it also be pseudo-overdominance?? I think so, but pseudo-overdominance seems to have been ignored by the authors. Could it be "real" balancing selection, by the way, through direct selection for heterozygotes??
L168/169/170 -"can be identified as low frequency signatures of balancing selection". I find this wording confusing. It seems to reflect that authors consider balancing selection the norm, and that purifying selection is a special case of this. However, I do not agree with that view. I think that SNP-variants with a relatively low frequency are indicative of negative (aka purifying) selection. Why call this "low frequency signatures of balancing selection"?? Why is this not simply a "signature of purifying selection"?
L191 -it would be helpful to briefly repeat the crux of the hypothesis.
L218 -authors correctly highlight that their evidence for the importance of TEs for evolution are INDIRECT. The abstract, however, lacks this disclaimer, which I find somewhat misleading.
L232-235 and throughout the ms -the term vital and essential are used in a confusing way (to me), mainly because vital (colloquially) is often used interchangeably with essential, and does not mean the same as vital in a strict genetic sense. It would help if the authors define what they mean by "vital" and perhaps replace with "non-essential". L233 mentions "signatures" of balancing selection in essential genes, but should this not be purifying?? How would one get balancing selection in essential genes that are supposed to stay intact?? Does the Fijarczyk & Wiesław Babik 2015 explanation apply here`?
L245 "Fitness-related genomic regions" -I have trouble understanding what is meant by this. Is it regions with many non-essential/"vital" genes L246 -I would say "findings may help explain earlier studies…" L312-313 -This explanation for why one would end up with signatures of balancing selection (rather than purifying selection) makes sense. I think the authors may want to bring this logic forward, rather than hiding it deep down in the middle of the discussion. This would have taken away much of my confusion. Maybe also change "Het SNP involved in evolution" to "Het SNP providing evolutionary potential"  Reviewer #2: Remarks to the Author: I thank the authors for having nicely prepared their answers to reviewers. Most points are clarified. There is nevertheless one point on which I disagree (point 16): "Recent findings in A. thaliana confirm that non-reference TE insertions appear at low frequency due to their deleterious nature": I think this statement is too general, in my opinion most TE insertions are neutral (except for the families with a tendency to jump into promoter/genic regions). This sentence refers to the work by Baduel et al, thus I encourage the authors to discuss this interpretation with Baduel's co-authors. My opinion on TE insertions does not affect the conclusions of the present work, and I believe it will be of great interest to the Nature Comm readers.