Epiallelic variation of non-coding RNA genes and their phenotypic consequences

Epigenetic variations contribute greatly to the phenotypic plasticity and diversity. Current functional studies on epialleles have predominantly focused on protein-coding genes, leaving the epialleles of non-coding RNA (ncRNA) genes largely understudied. Here, we uncover abundant DNA methylation variations of ncRNA genes and their significant correlations with plant adaptation among 1001 natural Arabidopsis accessions. Through genome-wide association study (GWAS), we identify large numbers of methylation QTL (methylQTL) that are independent of known DNA methyltransferases and enriched in specific chromatin states. Proximal methylQTL closely located to ncRNA genes have a larger effect on DNA methylation than distal methylQTL. We ectopically tether a DNA methyltransferase MQ1v to miR157a by CRISPR-dCas9 and show de novo establishment of DNA methylation accompanied with decreased miR157a abundance and early flowering. These findings provide important insights into the genetic basis of epigenetic variations and highlight the contribution of epigenetic variations of ncRNA genes to plant phenotypes and diversity.

1. The manuscript would benefit from language editing.
2. Consider revising the title to "Epiallelic Variation of Non-Coding RNAs and Their Phenotypic Consequences."
3. Provide citations for line 56.
4. Provide citations for line 67. Early studies on this topic conducted genome-wide GWAS, not solely on gene bodies. They might have followed up on genes specifically, but the initial GWAS covered DMRs throughout the genome, regardless of location.
5. The first methylQTL mapping study in Arabidopsis by Schmitz et al., Nature, 2013, should be cited along with others mentioned on line 66.
6. Edit line 91 for clarity.
7. Correct the spelling of "epigenomes" on line 104.
8. While the correlations are valid, it is unknown whether the cause of the correlation is due to population structure, likely linked to genetic variations associated with ncRNA genes.
9. Line 204: Be cautious with these conclusions, as statistical power also influences the ability to detect distant methylQTLs. It is still possible that they exist.
10. Line 224: This hypothesis requires further experimentation for verification. It is currently a plausible hypothesis.
11. In general, there is a lack of discussion regarding the significance of genetic variants associated with methylation variation. The literature on this topic in plants (including TEs, copy number variants, etc.) is extensive. Line 300 attempts to address this, but it lacks citations to many studies that have focused on genetic variants associated with methylation in Arabidopsis and maize.
Reviewer #2 (Remarks to the Author): In the manuscript under consideration, the authors conducted a thorough study of DNA methylation of Arabidopsis genomes, focusing on ncRNAs. They first presented the profiling of methylation polymorphisms, and then studied the DNA polymorphisms affecting such methylations via GWAS (using a linear mixed model), discovering novel loci other than known methyltransferases. Focusing on the flowering time phenotype, they carried out an in-depth investigation of the functional roles of methylations in related biological pathways. In particular, miR157a was studied using CRISPR-dCas9 to show de novo establishment of DNA methylation and its association with early flowering.
The work looks solid and interesting. I don't have main concerns. The only minor point for the authors' consideration is on the GWAS analysis: would you please show a QQ-plot to demonstrate that the population structure is indeed under control?
Reviewer #3 (Remarks to the Author): This paper analyses epigenetic variation in plants by mining the extensive database of 811 accessions of Arabidopsis. Through GWAS they identified methylQTLs outside known DNA methyltransferases or related genes, and many of them were located close to non-coding RNA genes. The methylQTLs are enriched in specific chromatin states, and they could correlate methylation of 1400 ncRNA genes with many plant adaptation phenotypes, including flowering time. Then they tethered a DNA methyltransferase MQ1 to the miR157a locus and showed that this led to a reduction in miR157 expression and an early flowering phenotype. They have correlated these changes with a reduction in expression of SPL targets. They propose that epigenetic variations in ncRNA genes may contribute to plant diversity and adaptation. The paper is interesting and provides significant correlative information genome-wide about the role of ncRNAs in accession adaptation. However, there are several points that may require further analysis:
1. In Fig. 1e there are significant differences between snoRNA, snRNA genes and both long ncRNAs and protein-coding genes. The statement in lines 137-138 needs to be clarified.
2. The significant correlation in cytoscape needs to be better presented. What is the difference between development and root morphology to claim that high correlation with plant adaptation phenotypes exists?
3. They have constructed 2 KD miR157 lines and 2 overexpressor lines which showed opposite phenotypes. However, the expression of targets is not correlative (e.g. SPL3 is increased in KD lines but not reduced in OE lines; SPL10 has opposite expression between the 2 lines; even for SPL11, 13 and 15 the two OE lines differed?). This needs to be clarified or repeated.
4. Then they constructed 6 transgenic lines with pmiR157:MIR157A constructs. What for? This is partial overexpression (copy number variation?) compared to previous OE lines. It will be good to include the previous OE ones. Furthermore, to conclude that Fig. 3H and Fig. 3i are "correlative" may be an over-interpretation. Normally lines 5 and 6 should have longer flowering times than the others?
5. They use the methylQV1 strategy to induce DNA methylation on the miR157a locus, where a minor change in DNA methylation is observed (Fig. 4b). What happens with miR157a expression? Here SPL3 is strongly upregulated compared to other SPLs?
6. The DNA methylation targeting by using fusions of MQ1 to Cas9 was already described, but it is very nice that the authors leverage the potential of the approach to knock down a miR157 gene in this work. As expected, there is a related physiological output, i.e. early flowering. What is missing here is a deeper assessment of what may happen later, comparing the impact of epigenomic variation vs genomic variation. How do the next generations of segregated clean lines behave? Is the epiallele maintained through the next generation at least?
7. When does this change in methylation occur in natural accessions? Is it gradual or sudden during flowering? Can some links with environmental clues be deduced from the 1st part of the analysis (vernalization, latitude, longitude, cytoscape network?) with the miR157 methylQTL? It may be the case, but this should be better highlighted.

REVIEWER COMMENTS: Reviewer #1:
The study conducted by Liu and Zhong investigates the extent of epigenetic variation at ncRNAs in Arabidopsis thaliana. They have identified numerous significant methylQTLs, the majority of which are located proximal to the locus. These findings suggest the presence of genetic variants at these loci that likely contribute to the observed methylation variation. The researchers have performed correlations between publicly available trait data and methylation levels, revealing numerous associations. However, it is challenging to determine causality and distinguish between genetic and epigenetic factors. Notably, they have identified a methylQTL at miR157 and convincingly demonstrated that targeting this locus with a methyltransferase fused with the SunTag system induces ectopic methylation and an early flowering phenotype.
Overall, this study is timely given the growing interest in ncRNA biology. The subsequent investigation on miRNA157 provides a compelling example of linking natural methylation variation to phenotypic differences, although it would have been preferable to demonstrate this with an unknown flowering regulator. The manuscript could benefit from toning down some of the claims about links to adaptation/plasticity and providing a more comprehensive review of the literature on methylation QTL from the plant community, considering the extensive body of research in this area.
Response: We appreciate this reviewer's positive comments and helpful suggestions to improve this manuscript.
Major Comments:
Q1. It is worth noting in the introduction that while miRNAs are technically ncRNAs, they have well-established functions compared to the debated roles of other ncRNAs.
Response: We appreciate this comment and have added one sentence in the introduction to capture this point (Lines 85-86).
Q2. The study would benefit from toning down the implications of epialleles in adaptation, as there is no supporting data on adaptation presented in this study. Although epialleles may be important for adaptation, further research is required to establish this.
Response: Thanks for this suggestion. As suggested, we have revised the abstract and main text to tone down the implications of epialleles in adaptation.
Q3. Is the flowering phenotype stable when the transgene is segregated away? Testing whether the phenotype persists in the absence of the transgene would determine whether methylation or the SunTag-targeting system is the causal factor.
Response: We appreciate this comment. As suggested, we obtained F2 and F3 progenies with and without the transgene from heterozygous parental lines (Figure R1a). We genotyped individual F2 and F3 progeny and then determined their flowering time. We found that both transgene-free and transgene-containing (either homozygous or heterozygous) plants flowered significantly earlier than Col-0 in both populations. Further comparison between transgene-free and transgene-containing plants revealed no significant difference in flowering phenotype (Figure R1b). This result confirmed that the early flowering phenotype is not caused by the transgene insertion and is stable when the transgene is segregated away.
Figure R1. Flowering phenotype is stable when the transgene is segregated away.
1F is unclear. How is it controlled? Can the same analysis be performed with protein-coding genes for comparison? Otherwise, there is no basis for establishing a background control.
Response: We apologize for the confusion. We performed Pearson's correlation tests between 12,115 methylation data of 4,228 ncRNA genes and 303 phenotypes. As mentioned in the main text, we used a highly stringent threshold (P < 6.08 × 10⁻⁵, N ≥ 100) and only kept significant correlations to construct the network. Most of the correlations were non-significant. Thus, the non-significant correlations serve as the background control. The same analysis could also be performed with protein-coding genes. However, we feel that the non-significant ncRNAs are a better background control than protein-coding genes. For clarity, we have revised Fig. 1f and edited the legend with the following sentences: "The network of significant correlations between methylation of ncRNA genes and plant phenotypes. Each dot indicates one ncRNA gene and each line indicates significant correlation between the DNA methylation of this gene and the plant phenotype."
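The filtering described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and the nested-dictionary input layout are hypothetical, and the P-value is obtained from a Fisher z-transform normal approximation rather than an exact t-test, so values may differ slightly from those of a standard statistics package. Only the two thresholds (P < 6.08 × 10⁻⁵ and N ≥ 100) are taken from the text.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient; returns None if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided P-value from the Fisher z-transform (normal approximation).

    Assumption: an approximation chosen to keep the sketch dependency-free.
    """
    r = max(-0.999999999, min(0.999999999, r))  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * NormalDist().cdf(-abs(z))


def significant_pairs(methylation, phenotypes, alpha=6.08e-5, min_n=100):
    """Return (gene, trait, r, p, n) for pairs passing both thresholds.

    methylation: {gene: {accession: methylation level}}   (hypothetical layout)
    phenotypes:  {trait: {accession: phenotype value}}    (hypothetical layout)
    """
    hits = []
    for gene, meth in methylation.items():
        for trait, pheno in phenotypes.items():
            shared = sorted(set(meth) & set(pheno))
            if len(shared) < min_n:  # N >= 100 accessions with both values
                continue
            r = pearson_r([meth[a] for a in shared], [pheno[a] for a in shared])
            if r is None:
                continue
            p = approx_p_value(r, len(shared))
            if p < alpha:
                hits.append((gene, trait, r, p, len(shared)))
    return hits
```

Pairs that fail either threshold are simply not returned; in the terms of the response above, those non-significant pairs form the background against which the network edges are judged.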

Q5. The experiments targeting miR157 methylation are interesting, but it should be noted that it is not technically an epiallele until the targeting transgene has been segregated away. Is the methylation stable in the absence of the transgene, along with the flowering phenotype?
Response: As suggested, we have performed quantitative PCR following McrBC digestion of gDNA extracted from multiple randomly selected individuals from the F2 population (see Figure R1). As shown in Figure R2, we observed an increase in DNA methylation in three transgene-containing and six transgene-free plants compared with Col-0. This result indicated that the increased methylation could be inherited by the next generation in the absence of the transgene.
R1). We also found some CHG and CHH DMRs.
Q17. In general, there is a lack of discussion regarding the significance of genetic variants associated with methylation variation. The literature on this topic in plants (including TEs, copy number variants, etc.) is extensive. Line 300 attempts to address this, but it lacks citations to many studies that have focused on genetic variants associated with methylation in Arabidopsis and maize.
Response: We appreciate these comments. As suggested, we have added a discussion of the significance of other types of genetic variants, including TEs, repeat sequences, and gene duplications (Lines 306-316).

Reviewer #2:
In the manuscript under consideration, the authors conducted a thorough study of DNA methylation of Arabidopsis genomes, focusing on ncRNAs. They first presented the profiling of methylation polymorphisms, and then studied the DNA polymorphisms affecting such methylations via GWAS (using a linear mixed model), discovering novel loci other than known methyltransferases. Focusing on the flowering time phenotype, they carried out an in-depth investigation of the functional roles of methylations in related biological pathways. In particular, miR157a was studied using CRISPR-dCas9 to show de novo establishment of DNA methylation and its association with early flowering. The work looks solid and interesting. I don't have main concerns.
Response: We appreciate the enthusiastic and positive comments from this reviewer on this manuscript.
Q18. The only minor point for the authors' consideration is on the GWAS analysis: would you please show a QQ-plot to demonstrate that the population structure is indeed under control?
Response: Thanks for this good suggestion. We have generated a QQ-plot from the P-values of the GWAS of miR157a CHG methylation and added it to Fig. 3a (also attached below). From the QQ-plot, we confirmed that the population structure was well controlled.
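For readers unfamiliar with this diagnostic, the sketch below shows one common way to derive QQ-plot coordinates and a genomic inflation factor (lambda) from a list of GWAS P-values. It is an illustration only: the function names are hypothetical, no plotting is done, and it is not claimed to be the procedure used to produce Fig. 3a. A lambda close to 1, together with points lying on the diagonal except in the upper tail, is the usual indication that population structure is adequately controlled.

```python
import math
from statistics import NormalDist, median


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, ordered from smallest P.

    Expected values assume P-values are uniform on (0, 1) under the null.
    All P-values must be strictly greater than 0.
    """
    observed = sorted(p_values)
    n = len(observed)
    return [
        (-math.log10((i + 0.5) / n), -math.log10(p))
        for i, p in enumerate(observed)
    ]


def genomic_inflation(p_values):
    """Genomic inflation factor: median observed chi-square (1 df) statistic
    divided by the median of the null chi-square (1 df) distribution."""
    nd = NormalDist()
    # A two-sided P-value p corresponds to a chi-square(1) statistic z**2,
    # where z is the standard normal quantile at p / 2.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    null_median = nd.inv_cdf(0.75) ** 2  # approximately 0.4549
    return median(chi2) / null_median
```

Passing the output of `qq_points` to any plotting library (expected on the x-axis, observed on the y-axis) reproduces the familiar QQ-plot layout.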
Fig. 3a. QQ-plot for CHG methylation of miR157a

Reviewer #3:
This paper analyses epigenetic variation in plants by mining the extensive database of 811 accessions of Arabidopsis. Through GWAS they identified methylQTLs outside known DNA methyltransferases or related genes, and many of them were located close to non-coding RNA genes. The methylQTLs are enriched in specific chromatin states, and they could correlate methylation of 1400 ncRNA genes with many plant adaptation phenotypes, including flowering time. Then they tethered a DNA methyltransferase MQ1 to the miR157a locus and showed that this led to a reduction in miR157 expression and an early flowering phenotype. They have correlated these changes with a reduction in expression of SPL targets. They propose that epigenetic variations in ncRNA genes may contribute to plant diversity and adaptation. The paper is interesting and provides significant correlative information genome-wide about the role of ncRNAs in accession adaptation.
Response: We appreciate this reviewer's enthusiastic and positive comments.
Q24. The DNA methylation targeting by using fusions of MQ1 to Cas9 was already described, but it is very nice that the authors leverage the potential of the approach to knock down a miR157 gene in this work. As expected, there is a related physiological output, i.e. early flowering. What is missing here is a deeper assessment of what may happen later, comparing the impact of epigenomic variation vs genomic variation. How do the next generations of segregated clean lines behave? Is the epiallele maintained through the next generation at least?
Response: Please see our responses to Q3 and Q5 from reviewer #1 (see above). We observed the early flowering phenotype and increased CG methylation in the next-generation progenies without the transgene. This indicated that this epiallele and the phenotype are heritable for at least one more generation.
Q25. When does this change in methylation occur in natural accessions? Is it gradual or sudden during flowering? Can some links with environmental clues be deduced from the 1st part of the analysis (vernalization, latitude, longitude, cytoscape network?) with the miR157 methylQTL? It may be the case, but this should be better highlighted.
Response: Thanks for these interesting questions. As mentioned in the main text (Lines 235-236), we found a significant positive correlation between DNA methylation of miR157a and longitude (Pearson's r = 0.15, N = 768, P = 3.51 × 10⁻⁵). However, it is currently challenging to trace the methylation changes of miR157a across generations and environments due to the lack of necessary data, such as the abundance of miR157a in natural accessions. The "epimutation-clock" could be used to estimate the divergence of DNA methylation of miR157a in natural accessions. While multiple tools to estimate epigenetic age in animals are available, such a tool for Arabidopsis is still lacking. Recently, Yao et al. (2023) reported the application of an epigenetic clock in Arabidopsis, but the details of the method/software have not yet been made available.

Figure R2. Quantitative PCR following McrBC digestion (McrBC-qPCR)-based DNA methylation assay of miR157a

As a comparison, we re-analyzed published WGBS data of SunTag-MQ1v lines from the Jacobsen lab (Ghoshal et al., 2021) using the same pipeline and their own control. Our data showed fewer CG and CHG DMRs but variation in the number of CHH DMRs. We do not know whether these CHH DMRs are real or functional. The non-CG DMRs might be due to the plastic nature of non-CG methylation. Similarly large non-CG variations were observed among 54 high-quality WGBS datasets of Col-0 generated worldwide (Zhang et al., 2018).

Table R1. DMR analysis and comparison