Expressivity of the key genes associated with seed and pod development is highly regulated via lncRNAs and miRNAs in Pigeonpea

Non-coding RNA’s like miRNA, lncRNA, have gained immense importance as a significant regulatory factor in different physiological and developmental processes in plants. In an effort to understand the molecular role of these regulatory agents, in the present study, 3019 lncRNAs and 227 miRNAs were identified from different seed and pod developmental stages in Pigeonpea, a major grain legume of Southeast Asia and Africa. Target analysis revealed that 3768 mRNAs, including 83 TFs were targeted by lncRNAs; whereas 3060 mRNA, including 154 TFs, were targeted by miRNAs. The targeted transcription factors majorly belong to WRKY, MYB, bHLH, etc. families; whereas the targeted genes were associated with the embryo, seed, and flower development. Total 302 lncRNAs interact with miRNAs and formed endogenous target mimics (eTMs) which leads to sequestering of the miRNAs present in the cell. Expression analysis showed that notably, Cc_lncRNA-2830 expression is up-regulated and sequestrates miR160h in pod leading to higher expression of the miR160h target gene, Auxin responsive factor-18. A similar pattern was observed for SPIKE, Auxin signaling F-box-2, Bidirectional sugar transporter, and Starch synthetase-2 eTMs. All the identified target mRNAs code for transcription factor and genes are involved in the processes like cell division, plant growth and development, starch synthesis, sugar transportation and accumulation of storage proteins which are essential for seed and pod development. On a combinatorial basis, our study provides a lncRNA and miRNA based regulatory insight into the genes governing seed and pod development in Pigeonpea.

characteristic features of predicted lncRnAs. The average length of C. cajan lncRNAs were 1784nt, whereas the maximum number of lncRNAs were in the range of 200-1000 nt. A total of 1638 (54%) of lncRNAs obtained in C. cajan were mono-exonic, followed by 580 (19%) di-exonic and 232 (7.7%) were tri-exonic. A continuous variation of exon numbers from 1-18 was observed except for two lncRNA having 28 and 36 exons. It was noted that lncRNAs are AU-rich as compared to mRNAs. Similar kind of observations has been reported in Arabidopsis, Maize, Populus, and Rice [4][5][6][7] . The chromosome-wise distribution of lncRNAs in C. cajan genome shows that chromosome 11 has the highest number of lncRNAs, while chromosome 5 has lowest. The characteristic features of C. cajan lncRNA are provided in Supplementary S2 and Fig. 2.

Identification of differentially expressed lncRNAs between 0 days after pollination (DAP) bud
vs. all other developmental stages of seed and pod development. A total of 1159 differentially expressed (DE) lncRNAs were identified among all samples. DE [FC value > 2 and <−2 (p-value less than 0.005 and q-value less than 0.01)] lncRNAs were observed in tissue-specific manner. Highest DE lncRNAs were observed in 0 DAP vs. 30 DAP (576) followed by 0 DAP vs. 20 DAP (523) and 0 DAP vs. 30 DAS (511) with many DE lncRNAs common between them. The detailed expression pattern of differential expression between the tissues is shown in the heatmap. The lncRNAs expression in each tissue varied from 17.03 fold to −16.67 fold ( Fig. 2; Supplementary S3).

Identification of lncRNAs targeted mRNAs including TFs.
The lncRNAs control gene expression in two ways; viz. they positively regulate the target gene by sequestering gene-specific miRNA or via accommodating other regulatory elements in 5′UTRs of the genes. The super-secondary structure of lncRNAs provides the binding sites for miRNA, mRNA, TFs, and other regulatory elements. Hence, the formation of the super-secondary structure of lncRNA helps to bring together all transcription factors, miRNA, and other regulatory elements to influence gene expression in a positive manner or vice versa. In C. cajan seed and pod tissues, a total of 1,660 lncRNAs were found to have potential binding sites for 3768 mRNAs (Dng less than equal to −50), and among these, 83 were TFs belonging to 20 families (Supplementary S4). Pearson correlation coefficient was calculated for each pair of interacting lncRNA, and the target mRNA. The correlation coefficient was calculated between the expression changes of lncRNAs and their target genes in different developmental stages of seed and pod development viz. 5DAS, 10DAS, 20DAS, 30DAS, 5DAP, 10DAP, 20DAP and 30DAP (in terms of log2 fold changes). It was observed that among all the possible interaction with certain binding energy (cut-off value Dng less than equal to −50), 81% of the total lncRNA and its targets mRNA showed a correlation. Among the interacting pair, 79% have shown a positive correlation and 21% negative correlation. Overall 51% pair belongs to the correlation range of +0.8 to +1 and −0.8 to −1, 63% pair belongs to the correlation range of +0.6 to +1 and −0.6 to −1, and 69% pair belongs to the correlation range of +0.5 to +1 and −0.5 to −1(Supplementary S4).
The identified transcription families were MYB, Ethylene responsive TFs, bHLH, Nuclear TF-Y, TFIID, GATA, WRKY, etc. which are reported to be actively involved in embryo, and seed development activities of plant 36 . The identified lncRNAs target mRNAs are involved in the essential biological functions such as chromatin modification (topoisomerases), DNA replication (DNA Helicase, DNA ligase), transcription (transcription initiation, elongation factor etc.), cell division (expansion), vascular sorting, trafficking proteins, heat shock proteins, callose and cellulose synthase. Most of the mRNAs are related to hormone biosynthesis and signaling (auxin-induced protein, gibberellin-regulated proteins, auxin-responsive F-box proteins), serine/threonine-protein kinase, F-box-LRR repeat protein, and B-box zinc fingers. The lncRNAs also target the mRNAs for transporters (ABC transporters, sodium/potassium transporters, and sugar transporters). Many other mRNAs coding for enzymes like glycosyltransferase, glycerol-3-phosphate 2-O-acetyltransferase GDSL esterase/lipase were also a target for lncRNAs. The function of lncRNA targeted mRNAs are diverse and are required to carry out the routine as well as a specialized role such as seed development, maturation, and storage of reserved food in the seed 36 .
Real time-pcR validation of correlated lncRnAs and mRnAs pairs. To explore expression behavior of tissue and stage specificity, we performed expression analysis of lncRNAs along with its target genes in 7 different tissues (0 DAA, 10 DAS, 20 DAS, 30 DAS, 10 DAP, 20 DAP, and 30 DAP) using real-time PCR (Fig. 3).
The results of real-time PCR analysis revealed that the Cc_lncRNA-510 and its target mRNA Cc_lncTAR-510, a VQ motif-containing protein showed the highest expression after 10 DAA in both seed and pod tissues, and it was sustained up to seed and pod maturity. The VQ family proteins were known to be involved in seed development, photomorphogenesis and biotic, and abiotic stresses in plants 37 . VQ motif-containing proteins were seen to interact with WRKY transcription factor and showed a potential role in transcriptional regulation of genes involved in fruit ripening 38 . The Cc_lncRNA-604 and its target mRNA Cc_lncRNAtar-604, a Late Embryogenesis www.nature.com/scientificreports www.nature.com/scientificreports/ Abundant (LEA) protein showed the highest expression during 30DAS. It is the member of LEA protein which is known to be expressed during embryo maturation and seed development. During the seed maturation, LEA protein has a significant role in mitigating water loss and maintaining cellular stability within the desiccated seed 39 . The Cc_lncRNA-765 and its target mRNA Cc_lncTAR-765, a carboxy peptidase-like mRNA expression was observed to be consistently higher in seed tissue as compared to the pod tissue. Higher expression was seen in 10 DAS, which gradually reduced during the later stages of seed and pod development. In seed and pod development stages, both Cc_lncRNA-765 and its target mRNA expression level are highest at 30 DAP. Serine carboxypeptidase (SCP) is one of the major groups of enzymes involved in functional protein maturation through catalyzing proteolysis. Previous studies in Arabidopsis and rice showed that SCP expresses in roots, leaves, flowers, and predominately in seed and siliques 40 . Cc_lncRNA-810 and its target Cc_lncTAR-810, an F-box/Kelch repeat-containing protein showed consistent expression throughout the seed and pod development with the highest expression of Cc_lncTAR-810 and minimal expression of Cc_lncRNA-810 at 30DAP. The expression pattern of lncRNA and it's target showed negative correlation, which suggest that Cc_lncRNA-810 negatively regulate the expression of its target mRNA. F-box/Kelch repeat-containing protein was involved in the regulation of light, flower initiation, and grain filling in rice 41 . The Cc_lncRNA-902 and its target Cc_lncTAR-902, which codes for Aquaporin TIP4 showed the highest expression 10 DAP and 10 DAS with consistent lower expression at later stages. The members of aquaporins family proteins are engaged in the transport of water and small neutral solutes across the cell membrane. It was also reported that some of the members of the aquaporin family-like TIP3, TIP2, etc. showed upregulated expression in the developing seed and has a possible role in the transportation of organic compound during seed maturation 42 . The Cc_lncRNA-1331 and its target Cc_lncTAR-1331, Loricrin-like mRNAs showed consistent expression throughout the seed and pod developmental stages with a peak of expression at 20DAS. Both lncRNAs and its target showed higher expression in seed tissues as compared to pod tissues. Loricrin is involved in cornified cell envelope development in animal 43 , whereas its function is not elucidated in plants. The Cc_lncRNA-1516 and its target Cc_lncTAR-1516, codes for P24 oleosin like protein and showed maximum expression at 30 DAS. Oleosin is an integral component of oil bodies and has a structural role in stabilizing the lipid body during desiccation of the seed 44 . The Cc_lncRNA-1660 and its target Cc_lncTAR-1660, which codes for hydrolase enzyme showed higher expression at 10DAS and 20DAS and declined expression at 30DAS. It was observed that the appearance of hydrolase during milk gain stage was at the peak and decreased at seed maturation stage 45 . Likewise, Cc_lncRNA-1896 and its target mRNA CclncTAR-1896, a beta-conglycinin, the beta chain-like peptide expression pattern was positively correlated (0.99) and found to be at the highest expression level at 30 DAS. The reported studies suggest that beta-conglycinin and glycinin are the major proteins found in soybean seed development 46 . The Cc_lncRNA-2150 and its target Cc_lncTAR-2150, which codes for dirigent www.nature.com/scientificreports www.nature.com/scientificreports/ protein was highly expressed in pod tissues as compared to seed and showed a peak of expression at 30DAP. It is involved in deposition of defense cyanogenic glucosides in plants. In the flax seeds, synthesis of dirigent protein occurred during seed development, which was further utilized for lignin and cyanogenic glucoside formation 47 . Cc_lncRNA-2482 and its target mRNA Cc_lncTAR-2482, a Glycine max maturation associated protein (MAT-9) expression was highest in 30 DAS sample. It is either present at low levels or absent in leaves, but its level gradually increases in seed tissues as the seed enters maturity. The MAT-9, a maturation-associated plant-specific protein also known as DHN1 (Type 2 LEA protein) is commonly associated with seed maturation, seed dehydration, and desiccation tolerance during seed maturation 48 . comparative Go analysis of lncRnAs and its targeted mRnAs. In GO (gene ontology) study at the level of biological process, comparative analysis between lncRNAs (Supplementary Fig. 1, Supplementary S5), and lncRNA's target (Fig. 4, Supplementary S6), revealed that 49% and 48.40% of total lncRNAs and lncRNA's targets respectively were involved in a metabolic and cellular process. It was observed that 30% of lncRNAs and 30.57% of lncRNA's targets were associated with biological regulation, cellular component organization, biogenesis, and localization. Rest 21.7% of lncRNAs, and 21.1% of lncRNA's targets are involved in the biological process like localization, response to a stimulus, signaling, developmental process, reproductive process, growth, and other cellular processes.
The comparative analysis at the level of the molecular function of lncRNAs ( Supplementary Fig. 1, Supplementary S5), and lncRNA targeted mRNAs (Fig. 4, Supplementary S6), showed that majority of them are involved in a catalytic activity which covers 43.55% and 44.46% in lncRNAs and its targets respectively. Binding covers 38.8% of lncRNA and 40.02% of lncRNA's targets. Rest 18% of lncRNAs, and 15.53% of lncR-NA's targets were involved in activities like the transporter, molecular function regulator, signal transduction, and protein binding. The comparative analysis at the level of localization of lncRNAs ( Supplementary Fig. 1, Supplementary S5), and lncRNA targeted mRNAs (Fig. 4, Supplementary S6) also showed a positive correlation.
In the cell and cell part, 37.22% and 38.15% of lncRNAs and lncRNAs target mRNA were residing respectively while in membrane and membrane parts 36.1% and 28.79% of lncRNAs and lncRNAs target mRNA were present, respectively. The comparative analysis at three different levels suggests that lncRNAs and its targets coexist in the cell because their functionalities are interdependent and linked.

Identification and characterization of miRNAs involved in seed and pod development. A total
of 227 miRNAs were identified using transcriptome data obtained from various stages of seed and pod development in C. cajan using the pipeline depicted in Fig. 1. The miRNA identification pipeline used in the study has been computationally validated with high specificity and sensitivity in A. thaliana 23 , P. vulgaris 24 , and C. cajan 33 . All physical properties of identified miRNAs are provided in Supplementary S7. All the identified miRNAs are www.nature.com/scientificreports www.nature.com/scientificreports/ novel and represent 33 different families, the maximum numbers of miRNA fall under miR156 family (27) followed by miR169 (24), miR172 (23), and miR166 (22). A similar type of family-wise distribution of miRNA was reported in P. vulgaris 24 . The length of predicted mature miRNAs ranges from 14 to 25 nt. It was observed that 29 signature SSRs were present in these miRNA families, among them UUU signature frequency was highest followed by AAA. The complete information about the miR-family wise distribution of predicted miRNAs is shown in Fig. 5(A) and Supplementary S8. Identification of mRNA targeted by miRNA. All the identified 227 miRNA were subjected to psRNA-Target software for prediction of their target mRNAs. The 5′UTR and 3′UTR region of mRNA sequence from C. cajan database were used as miRNAs targets. We chose the UTR region of genes for target prediction analysis with the assumption that the miRNAs usually act on the UTR regions of the genes 20 . Total, 3060 potential mRNAs targets were identified and further classification of all target mRNA showed that 154 mRNA were coding for TFs, representing 21 TFs families including bHLH, WRKY, MYB, ERF, etc. (Fig. 5(B) and Supplementary S9). Studies on Arabidopsis mutants for carpel margin tissues and fruit patterning showed that bHLH families of transcription factors are fundamentally crucial for developmental and environmental responses 49 . The MYB family TFs (MYB-118) is actively involved in Arabidopsis endosperm development and seed maturation 50 . In our analysis, we found that MYB (14.5%) family and bHLH (12.5%) family mRNAs are targeted by Cc-miRNAs suggesting that Cc-miRNA may be involved in regulating seed and pod development activity by modulating the expression of these two TF family members.
The results of Cytoscape (Fig. 6) showed the interaction pattern between miRNAs and their targets, and it was observed that one miRNA could interact with many mRNAs. The miR156 showed its interaction with bHLH and MADS-box TF along with other mRNAs. miR-172l interacts with MYB TF, and miR408a interacts with TCP4 TF. MiR319l can target Calmodulin-binding transcription activator and bHLH TF. It is well known that relative level of miR156, miR172 helps in juvenile to reproductive phase transition 51 . MYB, MADS-box, and bHLH are involved in growth and development during different seed stages. All the miRNAs targets (3′ UTRs and 5′ UTRs) with their interacting mode is given in Supplementary S9. All the predicted miRNAs and their target mRNA interacting pattern is visualized via Cytoscape and shown in Supplementary Fig. S2.
Go annotation of miRnAs targeted mRnAs. We performed Blast2Go analysis to annotate the predicted mRNAs, which are targeted by miRNAs. The molecular function, biological processes, and cellular components of the targets are shown in Fig. 4 (Supplementary S10). Under the Biological process, the majority of the targets viz. 56.08% were involved in the metabolic and cellular process. Around 41.05% were involved in biological regulation, localization, regulation of the biological process, response to a stimulus, signaling, cellular component organization or biogenesis, multicellular organismal process, developmental process, reproduction, reproductive process, and growth. Rest 2.86% was associated with detoxification, positive and negative regulation of the biological process, cell population proliferation, immune system process, and nitrogen utilization.
Under molecular functions, catalytic activity covered 44.59%, binding covered 43.01% of transcripts and rest of 12.38% were involved in transporter activity, transcription regulator activity, structural molecule activity, www.nature.com/scientificreports www.nature.com/scientificreports/ antioxidant activity, molecular function regulation, molecular transducer activity, molecular carrier activity, nutrient reservoir activity and protein folding chaperone.
The targets localized in different cellular components, among them a large number of proteins localize in cell and cell part (38.25%), membrane and membrane parts (38.38%), organelle and organelle part (17.54%) and rest 5.12% in protein-containing complex, extracellular region, membrane-enclosed lumen, supramolecular complex, cell junction, microtubules, symplast, virion and nucleoid.

Identification of lncRNAs as candidate endogenous target mimics. Endogenous target mimics
(eTMs) is build up through pairing among miRNAs and lncRNAs occurring in cells 29,30 . The psRNATarget analysis revealed that a total of 227 miRNAs target 3060 mRNAs, among these 227 Cc-miRNAs, 139 Cc-miRNAs were also act as target for 703 Cc_lncRNAs, suggesting that miRNA could be involved in interacting with both lncRNA and miRNA. The results obtained in our study supports the possibility that the formation of endogenous target mimics (eTMs) between Cc_lncRNAs and miRNAs regulates the gene expression profile during seed and pod development.
The result of endogenous target mimics (eTMs) analysis using an in-house script and psRobot software revealed that 302 lncRNAs interact with 114 miRNAs and formed eTMs ( Fig. 5(C) and Supplementary S11). The obtained results support the hypothesis that lncRNA regulates the function of mRNAs indirectly by the mechanism of endogenous target mimic (eTMs) via miRNAs. Details of eTMs and their target mRNAs along with their function are provided in Table 1. The miRNA and lncRNA interacting network via eTMs is also executed and visualized via Cytoscape in Supplementary Fig. S3.

Validation of lncRNAs, miRNAs and eTMs targets through expression analysis. The expres-
sion patterns of predicted eTMs were validated using real-time PCR by taking nine pairs of eTMs, which are linked to seed and pod development pathways (Fig. 7). The real-time expression analysis revealed that the eTM (Cc-miR160h-Cc_lncRNA-2830) regulate the mRNA level of XM_020377020, which codes for ' Auxin response factor 18-like (ARF-18)' protein. Cc-miR160-h expression was minimal at 0 DAA and was progressively up-regulated at the later stages of seed and pod development, and found to be highest at 30 DAS. The expression pattern of Cc-miR160h target mRNA XM_020377020 was minimal at 0 DAS and maximum at 10 DAS and 20 DAS and reduced by 30 DAS. The obtained result does not support the existing hypothesis that only Cc-miRNA160h regulates the expression of XM_020377020, because, at 10 DAS and 20 DAS, sufficient amount of Cc-miR160h were present in the cell. Cc_lncRNA-2830 showed minimal expression at 0 DAS and progressively www.nature.com/scientificreports www.nature.com/scientificreports/ increased its expression up to 20 DAS with a decline towards 30 DAS. In conclusion, it can be assumed that in the presence of high amounts of Cc-miR160h at the 10 DAS and 20 DAS, expression of XM_020377020 was found to be high, because at same time abundance of Cc_lncRNA-2830 sequesters Cc-miR160h and XM_020377020 transcript could be freely available for translation. At 30 DAS, Cc_lncRNA-2830 expression is on a decline, and Cc-miR160h expression is up-regulated as a result, which could be leading to degradation of XM_020377020. It was observed that seeds and pods reach its maximum size and shape about 20 days after anthesis by vigorous cell division and cell growth, in which the ARF-18 factor is needed because it helps in auxin-regulated cell growth and development 57  Transcription factor SPIKE, also known as NAL1 in rice, regulates the development of seed/spike and is responsible for an increase in spike number and seed size 58 . In our study, it was observed that its expression was higher in 10DAP and 20DAS, which is the most critical stage of seed and pod development where most of the photosynthate produced by plant starts to accumulate in seed tissues as a sink. The auxin signaling F-box 2 is linked with positive regulation of auxin biosynthesis, which is involved in seed enlargement and seed size enhancement 53 . Bi-directional sugar transporters were involved in the transport of sugar molecule from one part of the plant to other 54 . Our study confirmed that after 10 and 20 days of anthesis in both seed and pod, bidirectional sugar transporter is activated to accumulate the plant photosynthetic sugar to seed as a sink. UDP-glucose-glycoprotein-glycosyltransferase is involved in glycosylation of protein moieties. Higher expression of this protein indicated that it gets collected in the seed as sink and gets glycosylated 59 . In case of LEA (Late embryogenesis abundant) protein, its expression was higher at 10 DAA which is similar to reports in other crops like barley, where its expression was higher in late embryo maturation and during seed dehydration and played a vital role in the seed maturation process. Likewise, the level of granule-bound starch synthase-2 expressions also depends on the relative level of miR-166f and Cc_lncRNA-2117. In our study, it was observed that highest level of granule-bound starch synthase-2 expression occurs during 20 DAS, which reveals that during seed and pod development, source synthesized by the plant through photosynthesis was converted into starch for long term storage in the seed. In the case of maize, it was seen that there is a gradual increase in expression of granule-bound starch synthase-2 from 6 to 30 days after anthesis when the maximum accumulation of amylase occurs. In the case of C. cajan also, at 20 DAA, the accumulation of amylase was at higher levels in the endosperm. So overall study reveals that a complex lncRNA, miRNA, and mRNA interacting network is involved during the seed and pod developmental stages and all the interacting eTMs (lncRNAs) miRNAs and mRNAs interaction pattern is shown in Supplementary Fig. S4. www.nature.com/scientificreports www.nature.com/scientificreports/ conclusion Noncoding RNA is an essential element for gene regulation in plant and animal genomes. In our study we have predicted high fidelity 3019 lncRNAs and 227 miRNAs from the transcriptome data; It was observed that the lncRNA targets a group of mRNAs including TFs which are involved in various hormonal pathways required for seed and pod development, cell proliferation, cell division, signaling based pathways and genes involved in seed development and maturation. It was also observed that the predicted miRNAs also target the TFs like bHLH, MADS-box, NAC, etc. which are primarily involved in seed development, starch synthesis, sugar transporters, seed maturation, and fatty acid biosynthesis. More interestingly, we deduced that lncRNAs and miRNAs interact to form an eTMs, which ultimately regulate the mRNA expression. We validated our eTMs results via real-time-PCR analysis and observed that miRNAs expression was reduced when its corresponding eTMs, (the lncRNAs) sequesters them; as a result, the miRNA targets like sugar transporter, auxin-related F-box protein, starch synthetase, SPIKE genes were up-regulated during seed and pod development stages. Overall our analysis gives a clear insight that these ncRNAs (lncRNAs and miRNAs) interact with mRNAs and also with each other during the seed and pod development in C. cajan.

Materials and Methods
Transcriptome data and lncRNA identification pipeline. RNAseq data generated from three tissues and five different stages of seed and pod development (C. cajan var Asha) was used for identification of Cc_lncR-NAs and miRNAs. The ten samples of tissues included leaf; flower bud at 0 days of anthesis (DAA); seed at five days after anthesis (DAS), 10 DAS, 20 DAS, and 30 DAS; pod at five days after anthesis (DAP), 10 DAP, 20 DAP, and 30 DAP. The sequence data of all the mentioned tissue and stages comprising NCBI accession number SRR4341972, SRR4341973, SRR4341974, SRR4341975, SRR4341976, SRR4341977, SRR4341978, SRR4341979, SRR4341980, and SRR054580 were downloaded from NCBI. Adaptor trimming and removal of low-quality reads was performed using Trimomatic 0.36 software with default parameters. Each dataset was mapped to the C. cajan genome through Tophat 2.0 program 34,35 , and all the dataset were assembled via the Cufflinks 2.0 program 35 , and Cuffmerge program from Cufflinks was used to merge all the pooled transcripts. Cuffdiff was performed with accepted BAM files (output file from Tophat 2.0) to calculate the abundance of the transcript. prediction of lncRnAs. All the transcripts having strand information with the characteristics of non-overlap to known genes were taken for further analysis. Intergenic lncRNA (lincRNA), intragenic lncRNA, and natural antisense transcripts (lncNAT) were identified based on their location and strand information. The transcripts prediction of lncRnAs targeted mRnAs. Both cis-acting and trans-acting mRNA targets were identified for lncRNAs. The lncRNAs present in 10 kb window of upstream and downstream of the gene were considered as potential cis-target genes 62 and mRNA having a complementary sequence to lncRNAs and coded by the genes which are not in the vicinity of lncRNA gene were considered as trans-acting targets. First, we used BLASTn to select target sequences that were complementary to the lncRNA, setting E-value ≤ 1e −5 , and identity ≥95%. Then we used the RNAplex software to calculate the complementary energy between two sequences for further screening and to select potential trans-acting target genes (RNAplex dNG-50) 63 .
We calculated the Pearson correlation coefficient between the expression changes (in terms of log2 fold changes as a reference to zero days of anthesis) of lncRNAs and their target mRNAs (genes) in different stages of seed and pod development. miRnA prediction from the transcript data related to seed and pod development. All known miRNA and pre-miRNA were downloaded from miRBase21. Non-coding transcripts obtained from all the ten transcriptome data were used for miRNA prediction. The miRNA and pre-miRNA from miRBase21 was subjected to BLASTn against these pooled transcripts with the parameter (mismatch less than 3, word size 7 and e-value cut-off of 1000). To confirm its noncoding property again, Blastx was carried out against C. cajan protein database with the sequence identity cutoff ≥80% and all the protein-coding sequences fall in this cutoff was removed. Rest of the sequences were processed through CPC and CNCI with reference to the NR database, and the coding sequences were excluded.
All the non-coding transcripts were processed using the following parameters (i) A Proper stem-loop hairpin secondary structure should be formed with minimum folding free energy and MFEI ≥ 0.41 (i) AU content should be 22-77% of the sequences (ii) The mature miRNA sequence should be positioned in one arm of the hairpin structure (iii) There should not be any loop or break in miRNA* sequence (iv) miRNA sequence with opposite miRNA* should not have more than six mismatches (v) Normalized Shannon entropy (NQ), Normalized base-pair distance (ND) and Normalized base-pairing propensity (Npb) value should be ≤ 0.45, ≤ 0.15 and ≥ 0.25, respectively (xvi) SSR(simple sequence repeat) signature value R ≥ 2.5 with corresponding miRNA family From the multiple BLAST hits, we chose only those sequences that follow all the parameter and having maximum MFEI and R-value. prediction of miRnA targets. All the predicted miRNAs sequences were subjected to identify their target mRNA using psRNATarget software with default parameter 64 . To identify mRNAs as miRNAs target, mRNA sequences of C. cajan were used as targets query against miRNAs. Similarly, miRNAs as lncRNA target were identified using the same software, but the predicted miRNAs were used as the target query against predicted lncRNAs.

Identification of lncRNAs as candidate endogenous target mimics. Endogenous target mimics
(eTMs) is build-up for pairing among miRNAs and lncRNAs as it occurs in cells 65,66 . The eTMs among lncRNAs and miRNAs were discovered via the parameters enlisted below (i) Only one bulge is allowed among miRNA and lncRNA, preferably 9 th to the 12 th nucleotide at 5′ end of miRNA. (ii) The bulge present in eTMs should be only for three nucleotides; perfect nucleotide pairing essential at 2 nd to 8 th nucleotide at end of lncRNAs. (iii) The total mismatches and G/U pairs within eTM and miRNA pairing regions should be less than three except for the central bulge. (Wu et al., 2012) 66 .
The psRobot software was used to identify the putative eTMs.

Prediction and visualization of interaction among and between non-coding and coding
RnAs. The secondary structures of lncRNAs and miRNAs were predicted with the Vienna RNA package Gene Ontology (GO) Annotation of lncRNAs, miRNA targeted mRNAs and lncRNA, targeted mRnAs. Gene Ontology annotation is an affirmation regarding the function of RNA or gene; this annotation program is always associating a RNA, or gene with a GO term. GO annotations confine about the gene functions and their involvement at the molecular level, cellular functions, and biological processes (pathways, programs). Molecular Function of RNAs reveals in which molecular activities they are involved. Cellular Component of RNAs interpret their location of availability, and the location where the RNAs and its products are active Biological Process interpret that the RNAs and their products are actively contributing the pathways and processes involved in the biological system. GO evidence code is supported each annotation. Annotation of lncRNAs, miRNA, mRNA targeted by miRNA, and mRNA targeted by lncRNAs was performed using Blast2Go analysis pipeline.
RNA isolation, quantification, and cDNA synthesis. Total  Validation of candidate lncRNAs, miRNAs and their target genes using quantitative real-time pcR. Prospective pigeon pea lncRNAs, miRNAs, and their target's expression were analyzed using Light Cycler (Roche) real-time PCR instrument. All the primers required in the analysis were designed through IDT software (Integrated DNA Technologies, Inc., the US) (Supplementary S12). All the reaction was carried out in 96 well PCR plate. Brilliant III Ultra-Fast SYBR Green QPCR Master Mix kit (Agilent Technologies, USA) was used for qPCR analysis. 2 µl of cDNA, 0.5 µl of each one primer, 10 µl SYBR green (ROX added) and DEPC water were used per sample. The cDNA obtained from all the selected tissues and stages were used as a template for qRT-PCR analysis. The following reaction condition was set in Light Cycler (Roche) real-time PCR for amplification of templates. The initial denaturation temperature was 94 °C for 3 min, followed by 40 cycles of 94 °C for the 30 s, 60 °C for 15 s and 72 °C for 20 s. The melting curve study ranged from 56 to 95 °C, with increasing temperature steps of 0.5 °C every 10 s after completion of the amplification cycles. Tubulin-3α was used as an internal control for all the qRT-PCR reaction. Ubiquitin 6 (U6) was used as an endogenous control for all miRNAs. Three technical replicates were used for each sample. The specificity of the primers for each amplicon was checked by performing the melting curve analysis. The 'comparative Ct method' was used for the calculation of relative expression of each sample. The correlation between lncRNAs expression deliberate by RNAseq (FPKM) and qRT-PCR was analyzed via the R statistical package.