Genome and single-cell RNA-sequencing of the earthworm Eisenia andrei identifies cellular mechanisms underlying regeneration

The earthworm is particularly fascinating to biologists because of its strong regenerative capacity. However, many aspects of its regeneration in nature remain elusive. Here we report chromosome-level genome, large-scale transcriptome and single-cell RNA-sequencing data during earthworm (Eisenia andrei) regeneration. We observe expansion of LINE2 transposable elements and gene families functionally related to regeneration (for example, EGFR, epidermal growth factor receptor) particularly for genes exhibiting differential expression during earthworm regeneration. Temporal gene expression trajectories identify transcriptional regulatory factors that are potentially crucial for initiating cell proliferation and differentiation during regeneration. Furthermore, early growth response genes related to regeneration are transcriptionally activated in both the earthworm and planarian. Meanwhile, single-cell RNA-sequencing provides insight into the regenerative process at a cellular level and finds that the largest proportion of cells present during regeneration are stem cells.

This is clearly an important study with a lot of interesting data, which clearly provides useful insights for our understanding of regeneration in annelids. There are, however, problems that should be solved by the authors.
Major concerns: 1. One major concern is the lack of a clear description of E. andrei anterior regeneration.
The authors took the regenerated region at different time points after amputation, but we do not have any idea of what these regions look like at these different time points. When is wound healing completed? Are there proliferating cells at these different time points? When are differentiated cells or structures, such as muscles or neural cells, observed? Is the brain fully regenerated by 72 hours post-amputation? I think that this is crucial information for making in-depth use and interpretation of the nice transcriptomic data generated by the authors. This information should be provided.

2. My second concern is about the section « Evolution of Gene Families Related to Regeneration », which I found not very clear and misleading. The authors identified gene families that have been expanded in E. andrei, including some belonging to particular pathways such as the Wnt signaling pathway. I'm not sure what can be concluded from these data and how they can be related to regeneration. In particular, the sentence « These results are consistent with the conclusion that cell-cell communication and biosynthesis actively take place during regeneration to induce dedifferentiation/neoblast state, to regulate the proliferation of pluripotent cells and to specify the fates of the resulting cells to reconstruct the missing organs. » seems to me senseless. The final sentence « Collectively, our analyses suggested that the evolution of regeneration in earthworms might have been enhanced through the specific expansion of key genes or pathways that regulate the wound healing process or cellular proliferation. » is inappropriate, because this is not supported by the data.
3. The third main concern is about the sc-RNA-seq data. It is clearly a strong positive aspect of this paper that such an analysis has been conducted, and the authors should be congratulated for that. However, the assignment of cell clusters to cell types is, from my point of view, not really convincing. In particular, I'm really not sure that expression of sox2 is enough to demonstrate that these cells are pluripotent stem cells. In many species, including other annelids, orthologs of this gene are for example expressed in neural cells, including putative neural stem cells (which are not pluripotent) and probably also progenitors (not stem cells). Other genes whose expression is supposed to support a pluripotent stem cell fate are histone genes (H4, H14 and H2A). Their expression could maybe show that these clusters correspond to proliferating cells, but I don't see clearly how their expression can show that cells expressing these genes are pluripotent stem cells. The identification of neuron cells based on a single marker (NF70) is also not very convincing. Please note that I do not argue that the cell type identification is wrong, but that it should be much more substantiated by data. My other concern is that it is good practice to provide some experimental support for cell assignment in single-cell data analyses, for example, as is done in most or all such studies, by showing in situ hybridization for characteristic genes used to define the identities of cell clusters. The authors should provide such data.
Other questions and suggestions: 1. As mentioned by the authors, the genome sequence of the closely related species E. fetida has been published. The authors could add a reference to Bhambri et al. 2018 Plos One in addition to Zwarycz et al. 2015, as in fact the E. fetida genome has been sequenced twice independently. More importantly, the authors should make some comparisons between the E. andrei and E. fetida genomes. For example, one conclusion drawn from the E. fetida genome analysis was that this species (or one of its ancestors?) underwent extensive gene duplications. This seems to be the case in E. andrei as well, but did these gene duplications occur before or after the E. andrei/E. fetida divergence? On the other hand, is the LINE2 expansion described in this manuscript specific to E. andrei or also found in E. fetida? I found it quite strange that E. fetida was not included in diagram b of Figure 3 and in the corresponding analysis.
2. The authors chose to perform their transcriptomic analysis on anterior regeneration. I have no problem with this choice, but I think that they should briefly explain why they favored anterior over posterior regeneration (opposite choice was for example made by Bhambri et al. for E. fetida).
3. In the section « Temporal Gene Regulation Patterns in the Regeneration Response Process », the authors claimed, when discussing the « brown module », based on their expression data and the fact that the « neoblast » term was first coined for annelid cells, that « Therefore, our analyses suggest that the brown module, including vital regulators, is initially activated and may induce the activation of pluripotent stem cells and supply necessary materials for the cell cycle. ». This is again an overstatement, in particular because I think that there is no clear evidence for the existence of pluripotent stem cells in their annelid model, and I don't think that this can be inferred from the expression of genes « involved in cellular proliferation, differentiation and programmed cell death ». Along the same line, I don't think the sentence « Therefore, our results imply that the two modules might be vital for the proliferation and maintenance of pluripotent stem cells in the regenerative processes of earthworms. » is supported by data. These overstatements should be suppressed.
4. « Convergent Genes in Earthworm and Planarian Regeneration » is a very bad title for the corresponding section. First, because I don't understand what « convergent genes » means. Second, while I guess that the authors meant « convergent expression », convergence is an evolutionary hypothesis that requires some support to be proposed. Here I don't see what arguments would favour convergence over homology. It is possible that the three studied genes/gene families could be ancestrally involved in regeneration in bilaterians or even animals. What's also a bit annoying in this section, as in other sections in fact, is that the authors seemed to want, whatever the data they have, to find parallels and similarities with flatworms. It should not be an aim in itself! Along the same line, I don't agree with the sentence in the conclusion, "We report a convergent mechanism of earthworm and planarian regeneration, including the genes EGR, RUNT, JUN and FOS." -once more nothing to support convergence.
5. The authors should give more details about their protocol for single-cell sequencing. The sentence « Earthworm single-cell sample that had undergone regeneration for up to 72 hours was prepared, and ChromiumTM Single Cell Solution was applied. » is not enough. How cells are prepared is a very important step in a sc-RNA-seq analysis, and the authors should provide details about how samples were treated, how cell dissociation was performed, how many worms were used, how cell sorting was done (if it was done), …. This is important to judge the quality of the data, which is strongly dependent on the protocol used. Along the same line, it should be indicated for the bulk transcriptomic analysis how many worms were used for each biological replicate (single worms or pools of worms?).
6. I would have very much liked to have a real discussion section and not simply a few lines of conclusion. I think there are many aspects of this interesting work that deserve careful discussion.

Reviewer #2:
Remarks to the Author: The manuscript describes the sequencing of the earthworm genome Eisenia andrei. Formally, a couple of assemblies have been produced before for the related Eisenia fetida species but at much lower quality. The authors use the genome sequence to explore the genetic pathways related to regeneration using evolutionary data and gene expression data, including at the single-cell level. The findings are fairly descriptive in nature, no mechanism is truly uncovered, but the paper makes interesting observations, especially with respect to gene clusters either co-expressed during regeneration or with respect to cell types enriched in the regeneration process. I am no expert in regeneration biology, so I cannot really estimate how much of an advance this represents in the field, and how well the results are discussed with respect to this state-of-the art, but I thought that these analyses (Figures 5 & 6) were well conducted. I have some reservations with other results, namely those presented in Figures 3 & 4, as well as some minor comments.
1. Figure 3 presents evidence for the association of LINE2 expansion with regeneration-linked differentially expressed genes (DEG). I think that the specificity of this association should be much more carefully presented. First, both the text and Fig 3e present the proportion/frequency of LINE2 elements in DEG. The two proportions are ~0.84 and ~0.75 for DEG and non-DEG respectively, two mutually exclusive classes that together represent 100% of the genes in the earthworm. So taken literally, is the conclusion that 84% of LINE2 are present in DEG and 75% in non-DEG? If so, why is the sum more than 100% of LINE2 elements? Do the authors mean that 84% and 75% of the respective gene classes contain LINE2 elements? […] genes" relates to "most of the 19 genes presented in panel 3F" or to "most DEG genes"? If the former, then why would this be convincing since we do not know how these 19 genes were picked? If the latter, please provide exact numbers out of the ~6000 DE genes and a test of significance. When several LINE2 elements lie within 5 kb of a gene, do they all show a consistent expression change? Overall, I find that these results, because they are drawn from a large dataset, will always yield interesting subsets that fit whichever biological process is of interest. The authors should provide stronger evidence in favour of the LINE2 link.
2. Lines 234-261: in this section, an attempt is made at linking gene family expansion with differential gene expression. However, it is not clear to me how consistent and significant these results are, compared to some simple observational results. For example, Figure 4b is used to say that "These genes may be especially important as regulatory genes during the regenerative process". The alternative is that gene families expand and contract under some other influences (unrelated to regeneration). The overlap between these expanded gene families and the DEG gene set then captures a distribution of the former, with, as in all distributions, some samples at the extremes of this distribution (like the ZNFX1 gene family). But the manuscript does not provide any evidence that it is specifically linked to regeneration. The rest of Figure 4 runs through hand-picked gene families and the related text discusses them as suggestive evidence that regeneration in earthworms evolved under the influence of (i.e. was "enhanced" through, line 259) the "specific" expansion of key genes or pathways. But again, the data currently do not show this to be a specific link (a randomization test might be helpful in this regard).
[…] Table 8: what is the ranking scheme and why are some terms highlighted in red (the logic is not obvious).
7. Lines 213-214. Could the authors please indicate what threshold was used to identify the "substantially expanded" gene families?
8. Lines 312-320. I do not understand the connection between the regenerative blastema (which is formed after >96 hrs following amputation, well after the time line studied in the manuscript) and the data presented. In particular, I do not understand how it can help the authors make the conclusion line 321: "Thus, our analysis…"

Responses to reviewers' comments
Reviewer #1 (Remarks to the Author): In this submitted manuscript, Shao and co-workers present an impressive genomic and transcriptomic study of an annelid species, the earthworm Eisenia andrei. Earthworms, which are key species of the soil ecosystem, display several interesting biological properties, including important regenerative abilities. These worms can indeed regenerate lost body structures in their posterior body region, as well as in several cases (such as E. andrei) in their anterior region. Earthworms, and more generally annelids, are interesting models to study regeneration, notably because they have a much more elaborate body plan (with complex organs and organ systems) than other highly studied non-vertebrate species with high regenerative abilities such as flatworms and cnidarians. A better understanding of annelid regeneration is therefore of high interest for the whole regeneration field. In addition, despite the importance of annelids, there were until now only three published full genome sequences from this group. Shao et al. combined PacBio long-reads, Illumina short-reads and Hi-C sequencing to generate a high-quality assembly of the E. andrei genome, producing what is, to my knowledge, the first chromosome-level assembly of an annelid genome. The authors also performed a bulk-RNA-seq analysis at different time points of E. andrei anterior regeneration, identifying a large number of differentially-expressed genes during regeneration. They found that LINE2 transposable elements, which underwent a quite recent expansion in E. andrei, are often transcriptionally active during regeneration and might have an impact on the expression of adjacent genes, an interesting hypothesis for which the authors unfortunately did not show experimental evidence. The authors also studied the evolution of some gene families in E. fetida, and provide examples of expansion of some families by gene duplications (EGFR and TCAF families). Finally, the authors performed sc-RNA-seq at one time point after amputation, an analysis from which they drew the conclusion that major cell types of the regenerated region are pluripotent stem cells. This is clearly an important study with a lot of interesting data, and which clearly provides useful insights for our understanding of regeneration in annelids. There are however problems that should be solved by the authors.
Reply: Thank you for the time you spent reviewing our manuscript. We sincerely appreciate your valuable comments, which have definitely helped us to improve our manuscript. Please see our revisions in this version of the manuscript and our responses to your comments below.
Major concerns: 1. One major concern is the lack of a clear description of E. andrei anterior regeneration.

Reply:
We thank the reviewer for this comment. Following your comment, we now describe anterior regeneration after amputation of body segments 1-4 more clearly. A series of morphological photographs of the amputation plane was taken across regenerative stages (0h, 6h, 12h, 24h, 48h, 72h, 4d, 5d, 6d, 7d, 14d, 18d and 28d) to assess anterior regeneration compared with control E. andrei (Fig. 2b and Supplementary Fig 4). In addition, HE staining of transverse sections of the anterior region was performed for the early regenerative stages (Fig 2c and Supplementary Fig 5). Please see our revised main text, Fig. 2 and Supplementary Figs 4 and 5 for details.
We also provided these descriptions here: "Using Ki-67 immunofluorescent labeling, we found that cell proliferation initiated at 24 hours post-amputation, and at 48 and 72 hours post-amputation the proliferating cells increased rapidly and gradually migrated to the center of cross sections (Fig. 2d and Supplementary Fig 6). At 5 days post-amputation, the wound healing was fully accomplished and a small blastema (de-differentiated cells) appeared in center of the amputation plane (Supplementary Fig 4). At 6 and 7 days post-amputation, the blastema persistently experienced growth and elongation (Supplementary Fig 4). Although the newly produced body segments were not observed at 14 days post-amputation, the base of outgrowth has accumulated pigments (Supplementary Fig 4). At 18 days post-amputation, new body segments arise, and at 28 days post-amputation the obvious body segments take shape in regenerative appendages (Supplementary Fig 4)."
We provide revised Fig. 2a-2d.
The early growth response genes and transcriptional factor genes respectively were compared for two species (earthworm and planarian). The planarian gene expression changes were obtained from a previous study.
The authors took regenerated region at different time points after amputation, but we do not have any idea of what these regions look like at these different time points.
Reply: We thank the reviewer for pointing out this issue. Following this comment, we performed additional experiments to better describe these regions during the regeneration process. We provide high-resolution images of the amputation plane at different stages after amputation (0h, 6h, 12h, 24h, 48h and 72h) to give a general view of these regions (Fig. 2b and Supplementary Fig 4). We also performed HE staining of transverse sections of the anterior region in the early regeneration phases (Fig. 2c and Supplementary Fig 5). Please see our revised manuscript, Fig. 2b
When is wound healing completed?
Reply: A previous histological study of the earthworm E. andrei 1 found that wound healing was completed at 3-5 days post-amputation, and that by 5 days post-amputation the process was fully completed, because a regeneration blastema structure starts to appear.
Following your comment, we also performed a series of experiments. Consistent with reference 1, we found that at 3 days post-amputation the wound section was covered by an intact epithelium, at 5 days post-amputation a blastema structure could be observed, and at 6 days post-amputation the blastema was obvious (Supplementary Fig 4). Therefore, together with the previous study 1, we conclude that wound healing was completed at 5 days post-amputation. We now provide this information in the revised manuscript.
When are differentiated cells or structures, such as muscles or neural cells, observed? Is the brain fully regenerated by 72 hours post-amputation?
Reply: Thank you so much for this comment. It remains unclear when these differentiated cells or structures, such as muscles or neural cells, emerge in this earthworm. During this round of review, we used ISH of cell markers (including TPM and NF70) to track when these cells emerge. Unfortunately, owing to technical failure, we did not obtain positive results. However, we think this question is outside the scope of this study and does not change its results or conclusions. Future experimental studies using additional markers are necessary to answer these questions.
I think that these are crucial information to be able to make in depth use and interpretation of the nice transcriptomic data generated by the authors. This information should be provided.
Reply: Thank you for your comments. We fully agree with your points. Following your suggestions, we performed the experiments described above and used the resulting information to further explain our transcriptomic results. These revisions improved our manuscript.

2. My second concern is about the section «Evolution of Gene Families Related to Regeneration», which I found not very clear and misleading. The authors identified gene families that have been expanded in E. andrei, including some belonging to particular pathways such as Wnt signaling pathway. I'm not sure what can be concluded from these data and how they can be related to regeneration. In particular, the sentence « These results are consistent with the conclusion that cell-cell communication and biosynthesis actively take place during regeneration to induce dedifferentiation/neoblast state, to regulate the proliferation of pluripotent cells and to specify the fates of the resulting cells to reconstruct the missing organs. » seems to me senseless. The final sentence « Collectively, our analyses suggested that the evolution of regeneration in earthworms might have been enhanced through the specific expansion of key genes or pathways that regulate the wound healing process or cellular proliferation. » is inappropriate, because this is not supported by the data. Even the title of the section is misleading because I don't see clearly what are these « Gene Families Related to Regeneration ». EGFR, TCAF, ZNFX, and Collagen are likely to have many roles during development and life of the animal, and it is an over-interpretation to consider that, because some of them are expressed during regeneration, their duplication might have had a role in the evolution of regeneration in E. andrei. The authors should completely rewrite this section, sticking to what can really be inferred by the data, or suppress this section if no clear conclusion can be drawn.
Reply: Thank you very much for this comment. We agree with the reviewer that we overclaimed our results. Following your comment, we rewrote this section, narrowed down some claims and removed some descriptions. We changed the title of this section from "Evolution of Gene Families Related to Regeneration" to "Evolution of Gene Families in the Earthworm Genome". We also rewrote several of the sentences the reviewer commented on. Please see the details in the revised manuscript.
3. Third main concern is about sc-RNA-seq data. This is clearly a strong positive aspect of this paper that such an analysis has been conducted and the authors should be congratulated for that. However, the assignment of cell clusters to cell types is, to my point of view, not really convincing. In particular, I'm really not sure that expression of sox2 is enough to demonstrate that these cells are pluripotent stem cells. In many species, including other annelids, orthologs of this gene are for example expressed in neural cells, including putative neural stem cells (which are not pluripotent) and probably also progenitors (not stem cells). Other genes whose expression is supposed to support a pluripotent stem cell fate are histone genes (H4, H14 and H2A). Their expression could maybe show that these clusters correspond to proliferating cells, but I don't see clearly how their expression can show that cells expressing these genes are pluripotent stem cells. The identification of neuron cells based on a single marker (NF70) is also not much convincing. Please note that I do not argue that cell type identification is wrong, but that it should be much more substantiated by data. My other concern is that it is a good practice to provide some experimental support of cell assignment in single cell data analyses, for example, like it is done in most or all such studies, by showing in situ hybridization for characteristic genes used to define identities of cell clusters. The authors should provide such data.

Reply:
We sincerely thank the reviewer for the careful reading and professional comments, which we believe have greatly improved our manuscript.

To our knowledge, many studies have established that "Sox2 is a well-established pluripotent transcription factor that plays an essential role in establishing and maintaining pluripotent stem cells (PSCs). Together with octamer-binding transcription factor 4 and Nanog, they co-operatively control gene expression in PSCs and maintain their pluripotency." Many studies 2-4 have reported SOX2, a master regulator of pluripotency, as a marker of pluripotent stem cells (PSCs). We now describe this more clearly and cite additional references in our revised manuscript.
However, we agree with the reviewer that "In many species, including other annelids, orthologs of this gene are for example expressed in neural cells, including putative neural stem cells (which are not pluripotent) and probably also progenitors (not stem cells)". Therefore, we searched for more evidence to support the identification of PSCs. (1) Histone genes (i.e., H4, H1 and H2A) are highly expressed in clusters 0/1/3, although they are also highly expressed in other clusters in our data (Fig. 6b, and Supplementary Figs 26 and 28).

To clarify the identification of neuron cells, we further searched for more markers and evidence. A series of significantly highly expressed marker
Other questions and suggestions: 1. As mentioned by the authors, genome sequence of the closely-related species E. fetida has been published. The authors could add reference to Bhambri et al. 2018 Plos One in addition to Zwarycz et al. 2015, as in fact E. fetida genome has been sequenced twice independently. More importantly, the authors should make some comparisons between E. andrei and E. fetida genomes. For example, one conclusion drawn from E. fetida genome analysis was that this species (or one of its ancestors?) underwent extensive gene duplications. It seems to be the case in E. andrei as well, but did these gene duplications occur before or after the E. andrei/E. fetida divergence? On the other hand, is the LINE2 expansion described in this manuscript specific to E. andrei or also found in E. fetida? I found it quite strange that E. fetida was not included in diagram b of Figure 3 and in the corresponding analysis.
Reply: In the revision, we performed a further analysis that includes the genome of E. fetida. Indeed, in line with your points, we found that the genomes of both E. fetida and E. andrei potentially underwent extensive gene duplications (i.e., abundant expanded gene families in earthworm branches) (Fig. 4a and Supplementary Fig 16). Our analyses of Ks distributions suggest that these gene duplications occurred before E. andrei and E. fetida diverged (Fig. 4b). Furthermore, the E. fetida genome also contains abundant LINE2 content (~4.1%) (Fig. 3b), although the low quality of that genome assembly may have led to an underestimate.
Please see our revisions in the section "Evolution of Gene Families in the Earthworm Genome".
2. The authors chose to perform their transcriptomic analysis on anterior regeneration. I have no problem with this choice, but I think that they should briefly explain why they favored anterior over posterior regeneration (opposite choice was for example made by Bhambri et al. for E. fetida). Reply: We now explained this in the revision: "Some studies have documented transcriptomic and some phenotypic changes of posterior regeneration in the earthworms 16,22,23 , but very few researches are focused on the anterior regeneration 14 ".

3. In the section «Temporal Gene Regulation Patterns in the Regeneration Response Process», the authors claimed, when discussing about the « brown module », based on their expression data and the fact that the « neoblast » term was first coined for annelid cells, that « Therefore, our analyses suggest that the brown module, including vital regulators, is initially activated and may induce the activation of pluripotent stem cells and supply necessary materials for the cell cycle. ». This is again an overstatement in particular because I think that there is no clear evidence for existence of pluripotent stem cells in their annelid model and I don't think that this can be inferred by expression of genes «involved in cellular proliferation, differentiation and programmed cell death». Along the same line, I don't think the sentence «Therefore, our results imply that the two modules might be vital for the proliferation and maintenance of pluripotent stem cells in the regenerative processes of earthworms.» is supported by data. These overstatements should be suppressed.

Reply:
We thank the reviewer for pointing out this issue. We agree with the reviewer that we overclaimed our results. Following your comment, we rewrote this section, revised some descriptions and narrowed down some claims.

4. « Convergent Genes in Earthworm and Planarian Regeneration » is a very bad title for the corresponding section. First because I don't understand what means « convergent genes ». Second, while I guess that the authors meant « convergent expression », convergence is an evolutionary hypothesis that requires some support to be proposed. Here I don't see what are arguments that would favour convergence over homology. It is possible that the three studied genes/gene families could be ancestrally involved in regeneration in bilaterians or even animals. What's also a bit annoying in this section, as in other sections in fact, is that the authors seemed to want, whatever the data they have, to find parallels and similarities with flatworms. It should not be an aim in itself! Along the same line, I don't agree with sentence in the conclusion, "We report a convergent mechanism of earthworm and planarian regeneration, including the genes EGR, RUNT, JUN and FOS." -once more nothing to support convergence.
Reply: We sincerely thank the reviewer for this professional comment. We agree with the reviewer that we may have misused "convergent". We have now revised this section; in particular, we removed "convergence" and "convergent", and changed the title of this section to "Parallel Transcriptional Activation of Immediate Early response Genes in Earthworm and Planarian Regeneration". We also changed our conclusion: "our results suggest the earthworm and planarian potentially utilize a set of similar transcriptional activated immediate early response genes to regulate early regeneration process".
Please see the details in this section of the revised manuscript.

5. The authors should give more details about their protocol for single-cell sequencing. The sentence « Earthworm single-cell sample that had undergone regeneration for up to 72 hours was prepared, and ChromiumTM Single Cell Solution was applied. » is not enough. How cells are prepared is a very important step in a sc-RNA-seq analysis and the authors should provide details about how samples were treated, how cell dissociation was performed, how many worms were used, how cell sorting was done (if it was done), …. This is important to judge quality of the data, which is strongly dependent on the protocol used. Along the same line, it should be indicated for the bulk transcriptomic analysis how many worms were used for each biological replicate (single worms or pools of worms?).

Reply:
We thank the reviewer for pointing out this issue. We now provide more detailed information about our protocol for single-cell sequencing in the revised manuscript. Earthworm single-cell samples were prepared using the following protocol:
(1) 15 earthworms were cleaned, and soil was removed using PBS or ddH2O.
(2) We used tweezers to gently pull each earthworm so that its head extended naturally, and then quickly amputated the first four body segments (the brain is located in body segments 3~4 of the anterior).
(3) Amputated earthworms were placed into soil with fertilizer and cultivated at 25 °C for 72 hours, after which we collected the wound-healing plane segments from the 15 earthworms.
(4) These wound-healing segments were dissociated by adding Collagenase I (500 µl, 1 mg/ml) and incubating for 1.5~2 hours at 37 °C.
(5) Cells were pelleted by centrifugation at 3,000 rpm for 5 min; the supernatant was removed and the cell pellets were washed once with 1X PBS. We then added 200 µl of 0.25% TE, incubated the cells for 5~10 minutes, and neutralized with 1 ml of 1640/DMEM containing serum.
6. I would have much like to have a real discussion section and not simply a few lines of conclusion. I think there are many aspects of this interesting work that deserves careful discussion.
Reply: We sincerely thank the reviewer for these valuable comments. We fully agree with your points. Following your suggestions, we added a Discussion section to our revised manuscript. Please see the Discussion section of the main text.

Reviewer #2 (Remarks to the Author):
The manuscript describes the sequencing of the genome of the earthworm Eisenia andrei. Formally, a couple of assemblies have been produced before for the related species Eisenia fetida, but at much lower quality. The authors use the genome sequence to explore the genetic pathways related to regeneration using evolutionary data and gene expression data, including at the single-cell level. The findings are fairly descriptive in nature and no mechanism is truly uncovered, but the paper makes interesting observations, especially with respect to gene clusters either co-expressed during regeneration or with respect to cell types enriched in the regeneration process. I am no expert in regeneration biology, so I cannot really estimate how much of an advance this represents in the field, or how well the results are discussed with respect to the state of the art, but I thought that these analyses (Figures 5 & 6) were well conducted. I have some reservations about other results, namely those presented in Figures 3 & 4, as well as some minor comments.

Reply:
We sincerely thank you for your time spent on reviewing our manuscript. We appreciate your useful comments, which have largely helped us to improve our manuscript. Please see our revisions according to your comments in our revised manuscript, and our responses to your comments as follows.
1. Figure 3 presents evidence for the association of LINE2 expansion with regeneration-linked differentially expressed genes (DEG). I think that the specificity of this association should be much more carefully presented. First, both the text and Fig 3e present the proportion/frequency of LINE2 elements in DEG. The two proportions are ~0.84 and ~0.75 for DEG and non-DEG respectively, two mutually exclusive classes that together represent 100% of the genes in the earthworm. So taken literally, is the conclusion that 84% of LINE2 are present in DEG and 75% in non-DEG? If so, why is the sum more than 100% of LINE2 elements? Do the authors mean that 84% and 75% of the respective gene classes contain LINE2 elements?

Reply:
We thank the reviewer for pointing out this issue. We apologize for the unclear description of the frequency of LINE2 elements in this analysis. In addition, we performed a stricter screening of DEGs and removed those DEGs with a low expression value (0) in at least one of the compared conditions. We then re-calculated the proportions (5,065/6,048 vs. 19,421/25,769), and the difference remained statistically significant (P = 7.641E-07, χ² test).
We have now revised Fig. 3e and described it clearly in the figure legend. We also state it clearly in the main text: "We discovered that the proportion of DEGs (described above) harboring LINE2 elements, was significantly higher than that of non-DEGs (background genes) harboring LINE2 elements".

Second, the authors should make explicit in the Methods section how the LINE2 content of DEG and non-DE genes was computed.

Reply:
Thank you so much for your valuable suggestion. We now describe it more clearly in our Methods section.
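
As an illustration of the Fig. 3e comparison discussed in the first point, the following is a minimal sketch of a 2×2 Pearson χ² test on the counts quoted in the reply (5,065 of 6,048 DEGs and 19,421 of 25,769 non-DEGs harboring LINE2 elements). The function name `chi_square_2x2` and the choice of an uncorrected Pearson test are assumptions; the reply does not state which software or correction was used, so the sketch is not expected to reproduce the reported P value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 d.f., no continuity correction)
    for the contingency table [[a, b], [c, d]].

    Hypothetical helper written for illustration only.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts quoted in the reply: genes harboring LINE2 / total genes in the class.
deg_with, deg_total = 5065, 6048
non_with, non_total = 19421, 25769

deg_fraction = deg_with / deg_total
non_fraction = non_with / non_total

chi2, p_value = chi_square_2x2(
    deg_with, deg_total - deg_with,
    non_with, non_total - non_with,
)

print(f"DEGs with LINE2:     {deg_fraction:.3f}")
print(f"non-DEGs with LINE2: {non_fraction:.3f}")
print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

The two fractions correspond to the ~0.84 and ~0.75 values the reviewer asks about: each is the share of a gene class that contains LINE2 elements, which is why they need not sum to 1.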
Third, figure 3f is not clear. How were these 19 LINE2/gene combinations selected out of the ~6,000 DE genes? They are described as "representative", but of what? Is the pattern proposed by the authors in any way different from what would be expected under some null hypothesis? Lines 197-199: does "Most neighbouring genes" relate to "most of the 19 genes presented in panel 3F" or to "most DEG genes"? If the former, then why would this be convincing, since we do not know how these 19 genes were picked? If the latter, please provide exact numbers out of the ~6,000 DE genes and a test of significance. When several LINE2 elements lie within 5 kb of a gene, do they all show a consistent expression change? Overall, I find that these results, because they are drawn from a large dataset, will always yield interesting subsets that fit whichever biological process is of interest. The authors should provide stronger evidence in favour of the LINE2 link.

Reply:
We thank the reviewer for pointing out this issue, and we apologize for the unclear description. We have now described Fig. 3f and our methods more clearly.
We divided the GTF annotation of LINE2 elements located within the 5 kb flanking regions of gene loci into two GTF files, one for the 5 kb 5'-flanking region and one for the 5 kb 3'-flanking region. We then mapped our RNA-Seq data from the different time points (0, 6, 12, 24, 48 and 72 hours post-amputation) to the reference genome according to each of the two annotations, using the bowtie2 program in the tophat2 software (ref. 8). The expression abundance of each LINE2 element was quantified with the cuffquant program in cufflinks (ref. 9), and the cuffdiff program in cufflinks (ref. 9) was used to detect differentially expressed LINE2 elements (P < 0.05) between 0 hours and the other time points (6, 12, 24, 48 and 72 hours). To provide stronger evidence in favor of the LINE2 link, we performed additional analyses, plotting the mean expression of differentially expressed LINE2 elements in the 5 kb 5'-flanking and 5 kb 3'-flanking regions across the six regeneration time points (Fig. 3f and Supplementary Fig 12). Interestingly, the differentially expressed LINE2 elements in the 5 kb 5'-flanking region (44 DEL2s) displayed an increasing expression trend during regeneration (Fig. 3f and Supplementary Fig 12, p<0.05, Mann-Whitney U test). The differentially expressed LINE2 elements in the 5 kb 3'-flanking region (119 DEL2s) also exhibited an increasing expression trend (Fig. 3f and Supplementary Fig 12). This pattern suggests that some LINE2 elements may participate in the regeneration process.
Among the 44 + 119 significantly differentially expressed LINE2 elements, 19 had neighboring genes that were themselves DEGs and exhibited expression trends similar to those of the LINE2 elements during regeneration. Based on previous studies of these genes, we found that most of the 19 neighboring genes are involved in regeneration biology. We have added a section to the Methods to make our analyses clearer. Please check the revisions in the updated manuscript.

Supplementary Fig 12 | Expression profiles of differentially expressed LINE2 elements in the 5 kb 5'-flanking and 5 kb 3'-flanking regions of coding genes during regeneration. DEL2 denotes differentially expressed LINE2 elements.
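
The split of LINE2 annotations into 5'-flanking and 3'-flanking sets described in this reply can be sketched as follows. This is only an illustration under stated assumptions: the function `classify_flank` is hypothetical, coordinates are taken to be 1-based and inclusive as in GTF, an element counts as flanking when the gap between it and the gene is at most 5 kb, and elements overlapping the gene body are excluded. The reply does not spell out these criteria, so the authors' actual rules may differ.

```python
FLANK = 5000  # 5 kb window, as stated in the reply


def classify_flank(gene_start, gene_end, gene_strand, te_start, te_end, flank=FLANK):
    """Return "5prime", "3prime", or None for one LINE2 element relative to one gene.

    Hypothetical helper: coordinates are 1-based inclusive (GTF convention).
    Elements overlapping the gene body or farther than `flank` bp away give None.
    """
    if te_end < gene_start and gene_start - te_end <= flank:
        side = "left"   # element lies at lower coordinates than the gene
    elif te_start > gene_end and te_start - gene_end <= flank:
        side = "right"  # element lies at higher coordinates than the gene
    else:
        return None

    # On the plus strand the 5' end of the gene is its left edge;
    # on the minus strand it is the right edge.
    if gene_strand == "+":
        return "5prime" if side == "left" else "3prime"
    return "3prime" if side == "left" else "5prime"


# Toy coordinates, not taken from the E. andrei annotation.
print(classify_flank(10000, 12000, "+", 7000, 7500))   # left of a plus-strand gene
print(classify_flank(10000, 12000, "-", 7000, 7500))   # left of a minus-strand gene
print(classify_flank(10000, 12000, "+", 1000, 2000))   # more than 5 kb away
```

Making the strand handling explicit matters because a LINE2 element at lower coordinates is upstream of a plus-strand gene but downstream of a minus-strand gene.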
2. Lines 234-261: in this section, an attempt is made at linking gene family expansion with differential gene expression. However, it is not clear to me how consistent and significant these results are, compared to some simple observational results. For example, Figure 4b is used to say that "These genes may be especially important as regulatory genes during the regenerative process". The alternative is that gene families expand and contract under some other influences (unrelated to regeneration). The overlap between these expanded gene families and the DEG gene set then captures a distribution of the former, with, as in any distribution, some samples at the extremes of this distribution (like the ZNFX1 gene family). But the manuscript does not provide any evidence that it is specifically linked to regeneration. The rest of figure 4 runs through hand-picked gene families, and the related text discusses them as suggestive evidence that regeneration in earthworms evolved under the influence of (i.e. was "enhanced" through, line 259) the "specific" expansion of key genes or pathways. But again, the data currently does not show this to be a specific link (a randomization test might be helpful in this regard).

Reply:
Thank you for pointing out this issue. We apologize for these unclear descriptions, and we agree that we overclaimed some results. We have rewritten this section to make it clearer and have narrowed down some claims.
In addition, following your comment, we performed a randomization test. Briefly, we randomly chose 6,048 coding genes (equal to the total number of DEGs during regeneration) from the genome-wide annotated gene set (31,817 coding genes) using the sample function in R (https://www.r-project.org/). We then computed the statistical significance for the 35 candidate significantly expanded gene families harboring higher proportions of DEGs, using a χ² test between the observed values and the random values. We provide the P values in the updated Fig. 4c and Supplementary Fig 18. We also provide evidence that ZNFX1 is specifically linked to regeneration, based on its expression profiles during the regeneration process (Supplementary Fig 19).
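
A randomization test of this kind can be sketched in Python as follows. Note that the sketch swaps in a different technique from the one in the reply: instead of a single random draw compared with a χ² test, it repeats the draw many times and reports an empirical P value. The function name `family_enrichment_p`, the gene identifiers, and the toy set sizes are all hypothetical; the real analysis used 6,048 DEGs drawn from 31,817 coding genes.

```python
import random


def family_enrichment_p(all_genes, deg_genes, family_genes, n_draws=1000, seed=0):
    """Empirical P value for the overlap between one gene family and the DEG set.

    Hypothetical helper: draws random gene sets of the same size as the DEG set
    and counts how often the family overlap is at least as large as observed.
    """
    rng = random.Random(seed)
    deg_genes = set(deg_genes)
    family_genes = set(family_genes)
    observed = len(family_genes & deg_genes)

    hits = 0
    for _ in range(n_draws):
        drawn = rng.sample(all_genes, len(deg_genes))
        if len(family_genes.intersection(drawn)) >= observed:
            hits += 1

    # Add-one correction so the empirical P value is never exactly zero.
    return observed, (hits + 1) / (n_draws + 1)


# Toy data with made-up identifiers: 1,000 genes, of which 200 are "DEGs".
all_genes = [f"gene{i}" for i in range(1000)]
deg_genes = all_genes[:200]

enriched_family = all_genes[:20]       # every member is a DEG
unrelated_family = all_genes[500:520]  # no member is a DEG

print(family_enrichment_p(all_genes, deg_genes, enriched_family))
print(family_enrichment_p(all_genes, deg_genes, unrelated_family))
```

Repeating the draw gives a null distribution for each family's overlap with the DEG set, which addresses the reviewer's concern that a single extreme family could arise by chance.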
Additionally, we used qPCR to validate the expression trends of EGFRs at the regeneration time points, which confirmed our findings (Supplementary Fig 20).