Identification of epigenetic variation associated with synchronous pod maturity in mungbean (Vigna radiata L.)

Cytosine methylation in genomic DNA affects gene expression, potentially causing phenotypic variation. Mungbean, an agronomically and nutritionally important legume species, is characterized by nonsynchronous pod maturity, resulting in multiple harvest which costs extra time and labor. To elucidate the epigenetic influences on synchronous pod maturity (SPM) in mungbean, we determined the genome-wide DNA methylation profiles of eight mungbean recombinant inbred lines (RILs) and their parental genotypes, and compared DNA methylation profiles between high SPM and low SPM RILs, thus revealing differentially methylated regions (DMRs). A total of 3, 18, and 28 pure DMRs, defined as regions showing no significant correlation between nucleotide sequence variation and methylation level, were identified in CpG, CHG, and CHH contexts, respectively. These DMRs were proximal to 20 genes. Among the 544 single nucleotide polymorphisms identified near the 20 genes, only one caused critical change in gene expression by early termination. Analysis of these genome-wide DNA methylation profiles suggests that epigenetic changes can influence the expression of proximal genes, regardless of nucleotide sequence variation, and that SPM is mediated through gibberellin-mediated hormone signaling pathways. These results provide insights into how epialleles contribute to phenotypic variation and improve SPM in mungbean cultivars.

Scientific Reports | (2020) 10:17414 | https://doi.org/10.1038/s41598-020-74520-z www.nature.com/scientificreports/ for maintenance 7 . However, the effect of epigenetic variation on the phenotype remains unclear because only a few epigenetic marks have been characterized to date 8 . In soybean (Glycine max L.), the model species closest to mungbean, carbohydrate metabolism pathway genes are enriched in epigenetic variation 9 . In tomato (Solanum lycopersicum L.) fruits, vitamin E content is determined by epialleles 10 . Although genome-wide DNA methylation patterns have been investigated in two mungbean genotypes, VC1973A and V2984, epigenetic factors associated with any agronomic trait have not been studied in mungbean 5,11 . In this study, we aimed to characterize the epigenetic variation associated with SPM in mungbean. Whole genome resequencing and bisulfite-sequencing of four high SPM recombinant inbred lines (RILs), four low SPM RILs, and two parental genotypes (VC1973A and V2984) were conducted to determine genome-wide DNA methylation patterns. The mungbean cultivar VC1973A is widely grown in Asia and exhibits high SPM, whereas V2984 is a Korean landrace with low SPM 3 . Comparison of these DNA methylation profiles between high vs. low SPM lines revealed pure differentially methylated regions (DMRs), independent of genetic variation. Genes in these DMRs showed differential expression levels between the parental genotypes VC1973A and V2984. These results advance our understanding of SPM in mungbean and of the contribution of epialleles to phenotypic variation.

Results
Variation in the SPM index of the RIL population. A total of 187 RILs were harvested weekly for 8 weeks ( Figure S1). The SPM index (range: 0 to 1) was calculated for each RIL. A value of SPM close to 1 theoretically indicates that all pods mature at the same time. The number of pods in VC1973A (maternal parent) was the highest at week 2 (SPM index: 0.862), while the number of pods in V2984 (paternal parent) peaked again at week 7 (SPM index: 0.487) (Fig. 1, Table S1) 3 . Among the 187 RILs, we selected the four RILs (SK10, SK68, SK186, and SK49) with the highest SPM index (0.792, 0.783, 0.772, and 0.764, respectively) and the four RILs (SK94, SK152, SK176, and SK156) with the lowest SPM index (0.394, 0.374, 0.364, and 0.319, respectively) for bisulfite-sequencing and resequencing analyses with their parental lines. Groups of RILs with high and low SPM indices showed similar patterns of weekly harvests as VC1973A and V2984, respectively, indicating that the group with a high SPM index showed higher synchronicity in pod maturity (Fig. 1).

Identification of DMRs associated with SPM via bisulfite-sequencing.
To identify candidate epialleles involved in the regulation of SPM, the selected ten lines, including eight RILs and two parental genotypes, were subjected to Methyl-Seq after bisulfite conversion (Table S2). The bisulfite conversion efficiencies ranged from 98.5 to 99.0%, which were sufficiently high for subsequent biological applications such as Methyl-Seq analysis, as reported previously 12 . A total of 488, 336, and 406 DMRs in contexts of CHG, CHH, and CpG methylation were identified ( Figure S2), of which 134 (36%), 160 (19%), and 65 (33%) were located at genic regions, respectively, including upstream sequence (− 2 kb from the transcription start site), 5′ untranslated region (5′-UTR), coding sequence (CDS), intron, 3′-UTR, and downstream sequence (+ 2 kb from the transcription termination site) ( Table 1). After the removal of genes redundant in DMRs of all methylation contexts, 211 genes were identified as proximal to DMRs. Among the genes in genic DMRs associated with SPM, gene ontology (GO) analysis revealed the enrichment of reproductive structure development and transcription factor activity-related GO terms such as 'nucleotide binding' and 'catalytic activity' ( Figure S3). Additionally, 'metabolic pathways' , 'biosynthesis of secondary metabolites' , and 'plant hormone signal transduction' were the top three enriched pathways in the KEGG database (Table S3) 13 .
Methylation levels at genic DMRs. The distribution of methylation at genic regions, including upstream and downstream sequences of genes in mungbean, was consistent with that in Arabidopsis and soybean www.nature.com/scientificreports/ ( Fig. 2) 14,15 . A slight increase in methylation was detected in the middle of the gene body, and a steep increase was observed in upstream and downstream sequences in mungbean. On the basis of differences in methylation levels between high and low SPM groups, we classified genic DMRs into two sets: DMRs with significantly higher methylation in the high SPM group than in the low SPM group (hhDMRs), and DMRs with significantly higher methylation in the low SPM group than in the high SPM group (hlDMRs). In both DMR groups, differences in average methylation levels for CpG and CHG between high and low SPM lines were higher in gene bodies than in upstream and downstream regions. Cytosine residues methylated in the CpG context showed the largest difference in methylation levels in genic DMRs, whereas those methylated in the CHH context showed the lowest difference in methylation levels (Table S4). This result was consistent with the previous observation that methylation level of CHH had the least effect on gene expression levels in mungbean 11 .
Genetic diversity and pure genic DMRs. To identify pure DMRs, where no SNPs are significantly associated with methylation levels, approximately 5 Gb of Illumina raw data were generated for each RIL (Table S5). Raw sequence reads of eight RILs and V2984 (paternal genotype) were mapped against the mungbean reference genome (VC1973A; maternal genotype) 5 . The number of SNPs and Indels in each line ranged from 269 to 798 k and 26 to 134 k, respectively. Out of 282,517 SNPs identified among the ten genotypes, 40,178 SNPs co-segregated with high and low SPM groups. A total of 1081, 1343, and 397 SNPs were located within ± 1 kb of the DMRs in CpG, CHG, and CHH contexts, respectively; among these, 1059, 1248, and 247 SNPs, respectively, were significantly associated with methylation levels of the DMRs (p < 0.01) ( Table 2, Table S6). Thus, three CpG, 18 CHG, and 28 CHH pure DMRs were revealed, of which one CpG, ten CHG, and ten CHH were pure genic DMRs located within ± 2 kb sequence of 20 genes. These genes included Vradi11g00660 (log 2 fold-change [log 2 FC] = 4.51; encodes an expressed protein), Vradi09g08580 (log 2 FC = 2.28; encodes a receptor-like protein), Vradi01g11940 (log 2 FC = − 2.14; encodes a peroxisomal membrane protein), Vradi03g00420 (log 2 FC = 1.86; encodes a gibberellin receptor GID1L2), Vradi08g18940 (log 2 FC = 0.94; encodes an F-box domain containing protein), and Vradi01g09470 (log 2 FC = 0.92; encodes a transmembrane protein) ( Table 3, Fig. 3) 11 . The difference in methylation status possibly affected the difference in expression levels of these genes between VC1973A and V2984, independent of nucleotide sequence variation.

Discussion
There is an increased interest in utilizing epialleles as a potential breeding resource for harnessing previously unassessed heritable variation affecting plant phenotypes 15 . Because differential DNA methylation, together with genetic variation, can affect gene expression and thus phenotypic variation, DNA methylation should be considered as a critical factor in molecular breeding 6,14 .
To measure SPM in mungbean, we had tried three different approaches. The first approach was the length of productive days indicating each plant produces at least three pods when harvested each week. Synchronous plants should have shorter productive day compared to non-synchronous plants. The second approach was the ratio between the highest pod number among weekly harvests and total pod number. Synchronous plant should have a value close to one. These two approaches did not represent SPM well because of the presence of the second peak of weekly harvest (Fig. 1). The third approach was SPM index used in this study, which showed synchronicity of pod maturity the best among three approaches.
To elucidate the epigenetic effect on SPM in mungbean, DMRs were identified between high and low SPM groups, and genic DMRs were subsequently selected for bisulfite-sequencing and resequencing analyses (Table 1). Transcription factor activity-related GO terms were enriched among the 211 genes proximal to SPM-associated genic DMRs, and 17 of these genes encoded transcription factors ( Figure S3, Table S7). Because 'plant hormone signal transduction' was one of the enriched KEGG pathways in our data (Table S3), three genes, including Vra-di07g03990, Vradi08g06860, and Vradi0284s00060 encoding bZIP, AP2, and ARF transcription factors, respectively, were mapped to auxin, abscisic acid, and ethylene-related signal transduction pathways, respectively, suggesting that these plant hormones participate in the regulation of SPM in mungbean ( Figure S4) 13 .
To investigate whether gene expression was affected by nucleotide sequence variation or not, we identified 544 SNPs within 2 kb upstream and downstream of 20 genes proximal to 21 pure DMRs (Table 3). These DMRs   Figure S5). Most SNPs (> 80%) were located in non-coding regions of genes (Table S6). Only 24 SNPs were missense variants, and one SNP resulted in a premature stop codon, indicating that most nucleotide variants around the genes do not have a significant impact on the expression level of the 20 genes. Flowering pathways have been well studied in model plant species: photoperiod, autonomous, vernalization and gibberellin pathways 16 . Gibberellins are plant hormones that regulate various developmental processes, including flower induction and development, seed development, and fruit senescence 17 . Plants flower in response to gibberellins, and DELLA proteins negatively regulate this gibberellin signaling (Fig. 4) 18 . Degradation of DEL-LAs is mediated by gibberellins. Out of the 20 genes, we were able to retrieve four genes, where pure epialleles possibly affected gene expression levels (Fig. 3). These four genes, including Vradi01g09470, Vradi01g11940, Vradi03g00420, and Vradi09g08580, showed the highest log 2 FC values (0.92, − 2.14, 1.86, and 2.28, respectively) between VC1973A and V2984, and their Arabidopsis orthologs have been well characterized. Vradi03g00420 encodes a gibberellin receptor, which is a positive regulator of the gibberellin-mediated signaling pathway in Arabidopsis (Table 3) 19 . At4G08850, an Arabidopsis ortholog of Vradi09g08580, which encodes a receptor-like protein, is regulated by DELLA proteins in flowering buds 18 . Vradi01g09470 and Vradi01g11940 are annotated www.nature.com/scientificreports/ as encoding a transmembrane protein and peroxisomal membrane protein, respectively, which may participate in communication among cells 20,21 . These results suggest that the expression levels of these genes are influenced by epigenetic factors and gibberellin-mediated signaling pathways involve in synchronicity of pod maturity in mungbean. About 1 kbp genomic regions around the DMRs for these four genes were confirmed by Sanger sequencing (Table S8).
In this study, we identified pure genic DMRs between high and low SPM groups in a mungbean RIL population. Although the data was collected without environmental repeat, phenotypic variation was normally distributed (p-value < 0.01), widely ranged from 0.319 to 0.862 among 187 lines. Based on pure epialleles independent of genetic variation, epigenetic factors underlying SPM in mungbean were identified, which would not be detected by nucleotide-based analysis. The data consistently indicated that various transcription factors and receptor proteins regulate SPM in mungbean via plant hormone (especially gibberellin)-mediated signaling pathways. However, our study has a limitation for estimating environmental variance of SPM, as phenotypic data was collected in a single environment without replication. Functional validation for the candidate genes responsible for SPM and phenotypic data with environmental repeats are still needed to fully elucidate SPM in mungbean. Overall, our results will help to understand the potential of epialleles as a possible breeding resource for improving the SPM phenotype of mungbean elite cultivars.

Methods
Plant materials and phenotyping. 187 F 10 RIL population derived from a cross between VC1973A (maternal parent) and V2984 (paternal parent) was planted at Seoul National University Experimental Farm in Suwon, Korea (37°16′12.7″N, 126°59′19.2″E) in 15 July 2016 22 . Natural photoperiod and average temperature from July to September 2016 in Suwon were 11.9-14.7 h per day and 22.7-27.7 °C, respectively. The mungbean cultivar VC1973A was developed at the World Vegetable Center (AVRDC) in 1982, and the Korean landrace V2984 is mostly grown in the Kyung-Ki province of South Korea. Genomic DNA was extracted from the first trifoliate leaf of ten mungbean genotypes, including eight RILs and two parental genotypes, using GeneAll Exgene Plant SV Kit (GeneAll Biotechnology, Co., Ltd., Korea). 10 plants for each genotype were harvested once a week for 8 weeks from August 26 to October 14, 2016 3 . The SPM index was calculated as the highest sum of pod numbers of 2 consecutive weeks divided by total pod numbers. Genotypes with the highest and lowest SPM indices were selected for bisulfite-sequencing and resequencing analyses. Resequencing was performed on the Illumina HiSeq4000 platform.
Bisulfite-sequencing. Libraries for bisulfite-sequencing were constructed using TruSeq DNA Methylation Kit (Illumina, USA), after bisulfite conversion using Zymo EZ DNA Methylation Gold Kit (Zymo Research, USA), according to the manufacturer's guidelines. Methyl-Seq was performed on the HiSeq X platform (Table S2). For each library construction, 0.1% lambda DNA (Invitrogen, USA) was added to plant genomic DNA. Methyl-Seq reads were mapped onto the in silico bisulfite-converted lambda DNA genome, and the efficiency of bisulfite conversion was calculated as the ratio of the number of converted cytosine residues to the total number of cytosine residues (including converted and unconverted).

Identification of DMRs.
Raw Methyl-Seq reads were mapped onto the in silico bisulfite-converted mungbean reference genome using Bismark-0.20.0 23 . To minimize potential clonal bias by PCR amplification, reads that mapped to the same position were removed. Cytosine residues with three or more reads were selected. DMRs were identified using Metilene with default parameters 24 . DMRs located within ± 2 kb of gene sequences were defined as genic DMRs. Arabidopsis thaliana orthologs of mungbean genes containing DMRs were mapped to the Kyoto Encyclopedia of Genes and Genomes database (KEGG; https ://www.genom e.jp/kegg/tool/map_ pathw ay1.html) 13 . GO enrichment was analyzed using BiNGO plugin in Cytoscape software 25 .
To calculate the methylation levels of genic DMRs, upstream (− 2 kb from the transcription start site), gene body, and downstream (+ 2 kb from the transcription termination site) sequences containing DMRs were divided into 20 bins. Average methylation levels of cytosine residues in three consecutive bins were calculated and plotted using a sliding window approach. www.nature.com/scientificreports/ Identification of pure DMRs. Illumina short reads generated using the Hiseq4000 platform were mapped against the mungbean reference genome (VC1973A) using Burrows-Wheeler Aligner (BWA) ( Table S5) 5,26 . Single nucleotide polymorphisms (SNPs) and insertion/deletion mutations (Indels) were identified using Samtools with the following criteria: mapping quality > 30, minimum mapping depth ≥ 3, and maximum mapping depth ≤ 20 for RILs and ≤ 100 for V2984 27 . After the removal of duplicate variants, SNPs shared by all ten lines were used to identify pure DMRs. The association between methylation level and SNPs located within ± 1 kb of the DMRs was tested using Student's t-test with p-value < 0.01. SNPs were annotated using SnpEff 28 . The DMRs that showed no significant association with SNPs were defined as pure DMRs.