Genome wide in-silico miRNA and target network prediction from stress responsive Horsegram (Macrotyloma uniflorum) accessions

Horsegram (Macrotyloma uniflorum (Lam.) Verdc.) is a drought hardy food and fodder legume of Indo-African continents with diverse germplasm sources demonstrating alternating mechanisms depicting contrasting adaptations to different climatic zones. Tissue specific expression of genes contributes substantially to location specific adaptations. Regulatory networks of such adaptive genes are elucidated for downstream translational research. MicroRNAs are small endogenous regulatory RNAs which alters the gene expression profiles at a particular time and type of tissue. Identification of such small regulatory RNAs in low moisture stress hardy crops can help in cross species transfer and validation confirming stress tolerance ability. This study outlined prediction of conserved miRNAs from transcriptome shotgun assembled sequences and EST sequences of horsegram. We could validate eight out of 15 of the identified miRNAs to demonstrate their role in deficit moisture stress tolerance mechanism of horsegram variety Paiyur1 with their target networks. The putative mumiRs were related to other food legumes indicating the presence of gene regulatory networks. Differential miRNA expression among drought specific tissues indicted the probable energy conservation mechanism. Targets were identified for functional characterization and regulatory network was constructed to find out the probable pathways of post-transcriptional regulation. The functional network revealed mechanism of biotic and abiotic stress tolerance, energy conservation and photoperiod responsiveness.


Materials and methods
Query and reference datasets. Totalling 27,997 shotgun assembled contigs of transcriptome were collected from TSA sequence set: GANR01000001-GANR01027997 from European Nucleotide Archive (www.ebi. ac.uk/ena), SSH-Mu library of moisture stressed cDNA of Macrotyloma (https ://www.ncbi.nlm.nih.gov/nuces t NCBI dbEST ID 75866463 from) and 1008 ESTs were downloaded (www.ncbi.nlm.nih.gov/dbEST /) from publically available NCBI EST database to represent query sequences for miRNA homolog search. About 8496 miR-Base mature Viridiplantae miRNAs were used (https ://www.miRba se.org) (Released 21: June, 2014) database 23 in the present investigation and clustered by CD-HIT-EST, with threshold value of 100 24 . Out of 3777 clustered sequences, only non-redundant miRNAs were used as reference miRNAs for finding the homologs in M. uniflorum (Lam.) Verdc. candidate sequences to create a local nucleotide sequence database. To elucidate likely role of miRNAs involved in drought stress tolerance trait, the predicted microRNAs were screened for target genes listed in DroughtdB database (https ://pgsb.helmh oltz-muenc hen.de/droug htdb/).

Bioinformatics tools employed.
For conserved miRNA prediction of horsegram candidate sequences from Viridiplantae miRNAs reported in miRBase database, NCBI BLAST version 2.2.27 25 an alignment tool was used (https ://ftp.ncbi.nih.gov/blast /execu table s/blast +). The representative sequences of all plant miRNAs were obtained after clustering with CD-HIT-EST with threshold value of 100 24 . MFOLD online tool 26  www.nature.com/scientificreports/ unafo ld.rna.alban y.edu/?q=mfold ) used for pre-miRNAs secondary structures prediction. The miRNA targets were deduced using psRNAtarget server 27 . Online circoletto 28 tool was used to illustrate the miRNAs and its target genes by circos plot 29 . The phylogenetic analysis of predicted miRNAs with their closely related miRNA families was performed with MEGA7 30 .
Computational prediction of potential miRNA homologs. The prediction of miRNA homologs was performed using TSA and EST sequences of M. uniflorum available at NCBI and EBI databases. The plant miR-NAs were obtained from NCBI was clustered and the representative sequences generated were employed as reference miRNA database using BLAST-2.2.27+. The query sequence consisting of TSA and EST of horsegram were subjected to nucleotide blast (blastn) against reference sequences of plant miRNAs with the following set parameters: (i) word match size-7; (ii) length of mature miRNA sequence ≥ 18 nt without gap; (iii) mismatch range-0-2 (iv) e-value-0.1. The filtered sequences were utilized to find out the coding and non-coding candidate sequences by performing blastx against non-redundant protein database. The protein coding regions were removed whereas the non-coding regions were exploited for secondary structure prediction and validation.
Secondary structure prediction and validation. With maximum stringency only non-coding sequences were subjected to Zuker's folding algorithm based secondary structure prediction in MFold 26 . A sliding window of 100nt from ~ 80nt upstream and ~ 80nt nucleotide downstream of the mature miRNA were set to find the precursors 33 with parameters set as: (a) Folding temperature-37 °C, (b) Ionic conditions of 1 M NaCl without divalent ions, (c) linear RNA sequence, (d) percent sub-optimality number of 5 and (e) maximum interior/bulge loop size-30. As reported elsewhere, to validate the structures of pre-miRs adjusted minimal folding free energy (AMFE) and minimal folding free energy index (MFEI) were calculated 34 (Table 1). The secondary structures were screened for 35 : (i) Less than three nucleotide substitutions within predicted mature miRNA to reference miR, (ii) The candidate sequence must fold into an appropriate and proper stemloop hairpin secondary structure, (iii) The localization of mature miRNA must be in one arm of the stem-loop structure, (iv) Less than six mismatches between mature miRNA and its corresponding star sequence (miRNA*) Figure 1. Pipeline for prediction of miRNAs and its in-silico validation. From non-coding seqeuences and secondary structure analyses miRs were predicted in-silico as depicted in this pipeline.  mun-miR482d-3p  gma-miR482d-3p gguaugggagguguagggaaga  22  0  3′  −   mun-miR482b-5p  gma-miR482b-5p uuccuucccaauccccccaua  21  0  5′  −   mun-miR482a-5p  gma-miR482a-5p agaauuugugggaaugggcuga  22  0  5′  +   mun-miR482-5p  pvu-miR482-5p  ggaaugggcugauugggaagca  22  0  5′  +   mun-miR482a-3p  gma-miR482a-3p ucuucccaauuccgcccauuccua 24  0  3′  +   mun-miR156r  gma-miR156r  ugcucucuaucuucugucag  20  Phylogeny of horsegram miRNA. To enumerate the conserved nature and its phylogenetic relationship of miRs and its precursors were aligned with reported plant miRNAs using BLASTn with e-value, maximum mismatch and hits number set as 10, 3 and 5 respectively. The homologous precursor miRNAs were aligned to predicted miRNAs with Phylogeny.fr web tool (https ://www.phylo geny.fr/index .cgi) 37 by integrating multiple sequence alignment tools, MUSCLE 38 and GBlocks 39 to refine the alignment; phylogenetic tree construction by PhyML 40 and Tree rendering by TreeDyn 41 . The alignment output is used in MEGA7 30 for interpretation of molecular clock to estimate interfamily evolution.  Functional annotation of miRNA targets. As annotations are not available for horsegram, the closely related soybean was used as a reference to annotate miR targets of the predicted miRNAs. Putative identified query miRs were hit against horsegram mRNA sequences using psRNATarget tool 27 . The parameters set target prediction were (i) 9-11 nucleotide mismatch range for translational inhibition, (ii) maximum 4 mismatches without gap at complementary site, (iii) multiplicity of target sites -2, (iv) maximum expectation value -3 and (v) number of hits -10. The homologs of putative miRNAs and their targets were represented in Glycine max genome as circos plot to demonstrate multiple targets of identified putative miRNAs in horsegram 28 . In addition, to elucidate the role of identified miRs in drought stress response, specific miRNA target genes among drought tolerance contributing genes reported in Droughtdb (https ://pgsb.helmh oltz-muenc hen.de/droug htdb/) were confirmed. The genes reported for horsegram are insufficient, thus, A. thaliana was used as reference organism to determine gene targets for the new miRNAs predicted. The identified putative miRNAs were used as query against the A. to determine the functional roles and to classify Gene Ontology (GO) terms into molecular functions, biological processes and cellular components. Further, corresponding pathways were mined through KASS server, which assigns KEGG Orthology (KO) terms and employs KEGG pathways for miRNA target genes. The simplified schema of workflow adopted in present investigation for miRNA prediction is given in Fig. 1.

Quantitative expression analysis of identified miRNAs. Stress responsive horsegram germplasm
were identified from the germplasm core developed as described earlier (6). Selected stress responsive variety Paiyur 1 was raised in glass house under control and moisture stress conditions. Plants were maintained at field condition for control and at temporary wilting point for stress condition. Leaf tissues of control and stress plants were collected and fixed in RNALater and stored at − 80 °C till RNA isolation. High quality RNA samples were extracted from 100 mg (wet weight) leaf tissue using Plant RNA Easy mini spin coloumn kit following the manufacturer's guidelines (Qiagen). Quantity and quality of the RNA was quantified with the advanced Qiaexpert.
RNA samples of one microgram each were 3′ polyadenylated using Poly A RNA polymerase (Sigma Aldrich). Forward primers sequences of all 15 miRs and a common Poly T reverse primers were designed (Supplementary material 1).
Quantitative rtPCR using One Step PrimeScript™ RT-PCR Kit with SYBR green (Takara) was performed for all 15 identified miRs. qPCR capturing was conducted using the Illumina Eco RT PCR machine.
For qPCR cycling conditions set were 50 °C for 10 min for cDNA sysnthesis; 95 °C for 3 min for polymerase activation followed by 45° cycles of 95 °C 15 s, 60 °C for 30 s capturing with SYBR green filter at end of each cycle.
Results miRNAs in M.uniflorum. Significant miRNA homologs within reported 8496 miRNAs were identified by executing nucleotide blast (BLASTn) with 27,997 TSA contigs, SSH-Mu library sequences of moisture stressed horsegram cDNA (NCBI dbEST ID 75866463) and 1008 EST sequences as query resulted in 16 ESTs and 6303 TSA hits (E value 0.1) for further analysis. Stringent filtering (mismatch < 3 and length > 17) was incorporated to narrow down the hits to 5807 (8 ESTs and 5799 TSA) sequences for putative candidate sequences identifica- Table 4. Summary of the outcomes of the bioinformatics approach adopted to identify miRNAs in horsegram. Secondary structure prediction and validation. The predicted miRNAs were analyzed for various structural features to distinguish from other small RNAs such as tRNAs, rRNAs and mRNAs (Table 1 and Fig. 2). Most crucial characteristic feature of stable secondary structure is its minimal folding energy (MFE) ranging from 37.3 to 59.6 (−kcal/mol) for ten in-silico predicted premiRNAs. The MFE of the precursor miRNA was retained low to achieve thermodynamic stability 45 . Owing to their sequence length polymorphism, pre-miRNAs characterization was based only on MFEI (Minimum free energy index) following previous reports 22,46 . The MFEI ranges from 0.57 to 1.03 for ten in-silico predicted pre-miRNAs (Fig. 2). The (A + U) % of precursor horsegram miRNAs ranges from 39.24-60.38 satisfies the criteria included by Zhang et al. 18 . Nucleotide distribution (A = 31.6%, U = 32.31%, G = 26% and C = 23.8%) of the predicted pre-miR are heterogeneous as given Tables 1  and 3. On an average pre-miRs were of 111.4 bp length and mature miRs ranged from 18 to 24 ( Fig. 3a) with mean of 20.73 (Table 3). Of predicted miRs, miR482 family had more members (six) representing its presence as one of the supreme molecule in the consortia of horsegram regulatory miRs. Similarly, miR156 and miR390 were found to have two members each whereas, other miRs have singlets from each family (Fig. 3b). Although, no miRs were identified from ESTs and SSH library validating them as coding sequences, the TSA sequences showed miRNA frequency of 1 in 1937 contigs ( Table 4). The results of the predicted miRs and its targets in horsegram were statistically analyzed and summarized in Tables 3 and 4.
Evolutionary relationship with soybean. Soybean (G. max L.), the closely related model legume crop with complete genome information cracked has been used to establish a comparative syntenic map of predicted horsegram miRNAs to determine the putative regions of homologous miR genes. From mapping results, except www.nature.com/scientificreports/ mun-miR-482a-3p and miR2673a, rest of the putative horsegram miRNAs have orthologs widespread in G. max genome. miR156r and miR156k had more number of orthologs. The mapped miRs are depicted in soybean genome (Fig. 4). Comparative genomics enabled us to infer miR gene function, which could enumerate future research focus contributed by horsegram miRNAs in related species and in distant crops as well.
Conservation and phylogenetic analysis. The pre-miR homologs were identified by performing blast of horsegram pre-miRNA against the miRBase database. The hits were filtered to retrieve only miRNAs of same family. In this study, high degree of conservation was exemplified by comparison of mun-miR168 to other plant precursor miRNAs (Figs. 5 and 6). It is evident that horsegram miRNAs share similarity with related legume species (Fig. 7). It is conclusively apparent that, horsegram miRNAs seem to have evolved at different rates in different time period similar to other plants.
Putative target genes, their functions and networks. Inferring the function of miRNA targets is crucial to significantly substantiate the functional role of miRNAs in gene expression and regulation. For predicted 15 horsegram miRs, 39 target genes were identified and classified into different groups based on their functional annotation. As revealed by Mercator annotation results (Fig. 8) of predicted miR targets, diverse processes ranging from RNA transcription regulation, protein posttranscriptional modifications, development, signalling, biotic and abiotic stress tolerance and glycolysis were being regulated by the identified 15 miRs. In addition, target genes also regulate cell wall degradation, hormone biosynthesis and redox components like ascorbate  Table 5). The annotation of target genes using the Mercator tool categorized them into12 broad biochemical processes (Fig. 9). Differential expression of drought responsive miRNAs and gene networks Transcriptome analysis of Illumina sequence data from eight samples representing shoot and root tissues of contrasting horsegram genotypes M-191 (drought sensitive) and M-249 (drought tolerant) (NCBI Bioproject PRJNA216977) was used to decipher the horsegram miRNA expression in root and shoot samples each under normal and drought stress condition respectively 11 . The miRNA abundance estimated from blast output indicates the behavioral miRNA expression (Fig. 10). The clustering simplifies the differential gene expression levels of horsegram miRNA and samples based on the similar expression. Network predicted and depicted as target-target interaction (Fig. 11) designates machineries of energy conservation confirming earlier hypothesis of structural compaction and energy conservation to survive under stress conditions 14,15 . This hypothesis may best fit and appropriate for other crops as well. To extend this hypothesis we identified target homologs in other plants also (Fig. 12).

Discussion
Comprehensive studies on plant miRNAs endorse stage specificity and multitude of targets [47][48][49] . The non-coding miRNAs play a regulatory role in its target protein coding gene expression 50 . The miRNA multiplexes with RNA induced silencing complex (RISC) guiding the repression or cleavage of its target messenger RNA by seed nuclei base-pairing 3 . Complementarity between miRNAs and their target genes are high to regulate developmental processes, metabolism and stress responses 51,52 . Hence, there is enormous necessity to identify and validate miRNAs for further downstream applications in plants 22 . In horsegram like deficit moisture stress tolerant crop, major genes interaction network influence in expression of a tolerant phenotype. From contrasting stress responsive genotype data, and qPCR of identified miRNAs from a single stress responsive genotype under irrigated and deficit stress (Fig. 13), we could identify differential expression of miR genes. These predicted 15 miRs were clustered in to 11 different families and are conserved. To validated identified miRNAs qPCR was performed as reported earlier (53). Of the total eight miRs validated, two clusters were formed. Mun-miR 482 was found in both the clusters; whereas, mun-miR1507 was found in one cluster which distinguish the tested variety Paiyur1 for its ability to react for stress (Fig. 14). Of the 15 identified mun-miRs, miR 156, miR 171 and miR 390 were found to be differentially expressed in germinating seeds of halophyte Reaumuria soongorica under salt stress conditions (54). Legume genomes are rich in SNPs. Saturated map of SNPs were reported in legumes like vigna (55) and In pigeonpea (56). Of which highest haplotype desnsity of 0.7380 was reported for serine threonine kinase coding disease resistance gene (56)   www.nature.com/scientificreports/ to a great extent and we followed the trend in identifying them in a drought tolerant crop. Conserved miRNAs in ginger, garlic, coffee and tea were identified [60][61][62][63] . For predicted pre-miRNAs, MFEI resolution for length variation extends from 0.57 to 1.03 in horsegram. Pre-miRs illustrate the absence of large internal loops/bulges and at least 20 nucleotides for Wobble base pairing (G/U base pairings) or Watson-Crick base pairing between the miRNA and the star sequence 62,64 . Estimated (A + U) % of pre-miRs range satisfies the predetermined criteria 18 . The putative genomic region of these non-coding sequences were determined and intergenic sequences were utilized for secondary structure prediction with the stringent filtering criteria to attain potential precursor miRNA 26 . Resultant secondary structures were inspected for precursor miRNA and the positioning of the mature miRNA within its stem-loop structure manually. The sequences with suitable secondary structure 66 were characterized for structure attributes and its possible existence as potential precursor of the predicted miRs. The transcription of miRNAs from sense and antisense strands of genes were already reported 67 . Our results stand by the possibility of the miRNA in both sense and antisense strands ( Table 2) similar results were already reported in potato, tobacco and B. rapa 16,68,69 . The frequency of miRNA using expressed sequence tags is 1 in 1000 confirming previous reports 62 . The interaction of miRNAs with its target genes can enumerate evolutionary role of microRNAs 48,70,71 . Utilization of miRNAs in RNA interference (RNAi) mediated gene regulation has emerged as an important tool in novel traits engineering either by over expression of a miRNA or target genes by synthetic miRNAs. Gene silencing/knockout may help in understanding miRNA in plant responses to stress resulting increased productivity with improved nutritional value 72 . Cellular functions were not known for eight of the target proteins identified in the present investigation. Biotic/Abiotic stresses are critically affecting growth and development of plants. There are reports on contrasting mechanisms confirming grades of deficit moisture stress condition to control among different horsegram germplasm sources 6,15 . Among the identified miRNAs mun-miR156r, mun-miR156k, mun-miR482a, mun-miR390b-5p and mun-miR482a-5p were noticed to be involved in RNA processing, protein synthesis and modifications as well as plant development by altering the expression of their respective targets, whereas the genes encoding abiotic stress, biotic stress (NBS-LLR resistant class) and signalling associated proteins were targeted by mun-miR482a-5p, mun-miR482d-3p and mun-miR1507a respectively. Thus, there is concurrent substantiation that microRNAs can play pivotal role in crop improvement 73 and the miRNAs predicted from this investigation can be useful for future research. Closely related homologs of predicted putative miRs exhibited high degree of conservation as in mum-miR482 with related plant mature miRNAs (Fig. 6). The precursor sequences from miR482, miR390 and miR150 represented a strong candidate promulgating its necessary importance at post-transcriptional gene regulation in horsegram. The conserved nature of precursor and mature miRNAs has been reported in various plant groups from earlier studies 16,34,73,74 . The potential of miRNAs to bind corresponding target mRNA are comparable with their complementarity to degrade target mRNA. Therefore, to infer the contribution of microRNAs in cellular functions and regulatory gene networks, the miRNA target gene prediction is a crucial step 64 . The majority of the predicted miRNA targets in horsegram depict energy conservation mechanism which play important role in survival during adverse  www.nature.com/scientificreports/ conditions. Target gene prediction reveals the regulation of multiple genes by single miRNA with different levels of regulation 22,47,75 . Also, the genes targeted belong to more than one gene family, which shows the multitude of miRNA functions in various metabolic processes (Table 4). We could spot 39 target genes of predicted miRs in horsegram and its homologs in other plant species. This indicates that, drought tolerance phenotype can be manipulated by adjusting the expression of identified miRs and their targets.

Conclusions
Great concern is being publicized recently to unravel various mechanisms involved in drought tolerance with emphasis to changing climatic conditions. Even though, there are extensive inquiries on miRNAs discovery and their functional prediction were completed, some of the non-model plants with considerable traits were not subjected to systematic investigation to elucidate contrasting mechanisms and their key players. Eight novel miRNAs were 76 predicted from horsegram. However, these non-validated miRNAs were predicted with low stringency and standard nomenclature was not followed. Inappropriately, the drought hardy horsegram miRNAs still remain unknown and lacks extensive validation. In this study, 15 conserved miRNAs belonging to nine different families www.nature.com/scientificreports/ were identified from EST and TSA sequences and validated in this first report with its tissue specific differential expression (Fig. 10). Mun-miR482a-5p and mun-miR482d-3p are miR482 family miRNAs reported to regulate stress response in soybean and apple 77,78 . The same trend is observed in our results confirming the earlier report. Mun-miR390b-5p is a miR390b family miRNA earlier found for protein degradation and post transcriptional modifications in Arabidopsis, maize and soybean [79][80][81] . Upregulation of miR390b and downregulation of 1507 family miRNA was observed (Fig. 14) in qPCR of Paiyur1 variety which is a stress responsive accession. The present investigation indirectly links stress tolerance to energy conservation as indicated by the network rather than direct response to stress conditions (Fig. 15). Hence, we confirm the interplay of miRNA-stress response-NBS-LLR class R-protein response 77 in energy conservation to survive the stress conditions. Additionally, the study reveals that identified miRNA regulated target genes have differential biological functions including cell wall degradation, hormone synthesis and synthesis of redox component like ascorbate and glutathione (Table 5). These findings can further help in translational research of related legumes and may result in selection of better germplasm for higher productivity and nutritional security. The network predicted from this investigation confirms the earlier hypotheses: structural compaction to overcome stress tolerance 14,15 . Altogether, the outcomes of the present investigation deliver clarity that, horsegram is a drought adapted crop and can be considered as a model crop for drought tolerance research. Thus, the predicted horsegram miRNAs may unravel the unique Figure 11. Predicted miRNA target genes function network. The function based protein target network drawn using sting server 82 elucidates the role of miRs identified in cell development and maintenance of minimal functions during stress conditions. Most of the reductions in functions identified were correlated to energy conservation mechanism.
Scientific RepoRtS | (2020) 10:17203 | https://doi.org/10.1038/s41598-020-73140-x www.nature.com/scientificreports/ tolerance capability associated to its metabolic pathways and the present workflow represents simple and straightforward approach for the prediction and characterization of miRNAs in those plants for which genomes are yet to be splintered.  Under stress conditions AAO expression defines the tolerance with declined total sugar and chlorophyll contents. Pith autolysis and shrunken stomata was observed under soil moisture deficit stress.