Stage specific comparative transcriptomic analysis to reveal gene networks regulating iron and zinc content in pearl millet [Pennisetum glaucum (L.) R. Br.]

Pearl millet is an important staple food crop of poor people and excels all other cereals due to its unique features of resilience to adverse climatic conditions. It is rich in micronutrients like iron and zinc and amenable for focused breeding for these micronutrients along with high yield. Hence, this is a key to alleviate malnutrition and ensure nutritional security. This study was conducted to identify and validate candidate genes governing grain iron and zinc content enabling the desired modifications in the genotypes. Transcriptome sequencing using ION S5 Next Generation Sequencer generated 43.5 million sequence reads resulting in 83,721 transcripts with N50 of 597 bp and 84.35% of transcripts matched with the pearl millet genome assembly. The genotypes having high iron and zinc showed differential gene expression during different stages. Of which, 155 were up-regulated and 251 were down-regulated while during flowering stage and milking stage 349 and 378 transcripts were differentially expressed, respectively. Gene annotation and GO term showed the presence of transcripts involved in metabolic activities associated with uptake and transport of iron and zinc. Information generated will help in gaining insights into iron and zinc metabolism and develop genotypes with high yield, grain iron and zinc content.

RNA extraction and high-throughput sequencing. Total RNA was extracted from each replicate of the 12 samples (Table 1) using RNeasy Plant Mini Kit (QiaGen). The quality of RNA was analyzed on 1.2% agarose gel electrophoresis and concentration was measured using a Qubit RNA HS Assay Kit (Invitrogen) along with the Qubit 2.0 Fluorometer (Invitrogen). mRNA was extracted using Dynabeads mRNA DIRECT Kit (Thermofisher Scientific) from ~ 1 μg of total RNA input. After purification, mRNA was fragmented by RNase enzyme at 37 °C in RNase buffer provided in cDNA library preparation kit Total RNA-Seq Kit v2 (Thermofisher Scientific). These fragmented mRNA were hybridized and ligated with adapters followed by SPRI cleanup (Solid Phase Reversible Immobilization). The fragments were reverse transcribed using random hexamers and superscript II reverse transcriptase (Life Technologies) and further cleaned using Beckman colter agencourt ampure XP SPRI beads. The cDNA library was amplified using PCR for the enrichment of the adapter-ligated fragments. The individual libraries were measured using a Qubit 2.0 Fluorometer and validated for quality in E-Gel 2% Agarose (Invitrogen). Subsequently, these libraries of each sample were diluted up to 100 pM concentration and subjected to emulsion PCR (Ion One Touch 2, Life Technologies). Later, enriched template positive ion sphere particles were loaded on 540 chip and were sequenced in ION S5 next generation sequencer platform (Life technologies).
Preprocessing of RNA-Seq data and transcriptome profile analysis. Initially, low-quality sequences were removed from the raw reads from all the individual data files. The reads having > 50% bases with low-quality scores and/or > 10% bases unknown (N bases) were removed from each raw data for accuracy of results using CLC genomics tool kit 26  www.nature.com/scientificreports/ De novo transcriptome assembly, read mapping and sequence annotation. The reads were processed for sequencing adapter removal, further trimmed for quality and ambiguity. The de novo assembly was conducted using default parameters in the CLC Genomics workbench 20.01 (CLC bio, Aarhus, Denmark) to develop contigs/transcripts for all the samples. The transcript reads were further matched to the draft genome of pearl millet 27 to find out the percentage match. BlastX of assembled transcripts was done with the close relative of pearl millet i.e. Foxtail millet protein database with an e-value cut-off of 1E-6, as the protein database of pearl millet was not available. Sequence annotation was carried out with the software BLAST2GO. Based on these results, gene ontology (GO) terms were assigned to the sequences 28 .
Differential gene expression analysis. Two (Table 1). A significant genetic variation in Fe and Zn concentration was found among high and low Fe genotypes. The iron concentration in the genotype PPMI 953 is sufficient enough to fulfill the desired iron concentration of daily human need 33 . Ample work has been carried to enhance iron and zinc content in pearl millet [34][35][36][37][38][39][40][41] , however little or negligible work has been carried out to understand the mechanism or pathways involved in high Fe and Zn containing genotypes.
RNA sequencing and assembly. The samples for RNA-sequencing were collected from four pearl millet genotypes (two genotypes for high Fe and Zn while two for low Fe and Zn) at three different stages i.e. panicle initiation stage, flowering stage and milking stage. The samples were collected in three replicates for each stage. The total RNA was isolated from each replicate and were pooled before sequencing to make a total of 12 samples.
The sequencing was carried out in Ion S5 next generation sequencing platform with a chemistry of 200 bp. The total number of raw reads varied from approximately 9-20 million reads for the samples of different stages. The raw reads were further subjected to adaptor trimming and bases quality check. The percentage of high quality (HQ) reads varied from 76.6% to 91.44% and the number of HQ reads varied from ~ 7 to 15 million ( Table 2).
Mapping of HQ reads with draft pearl millet genome showed 82.39% to 86.71% of reads mapped. Further, the HQ reads of all the 12 samples were subjected to master assembly by combining all the reads of all the samples. A total of 83,721 contigs were generated with an average contig length of 519 bp and N 50 of 597 bp ( Table 3).
The complete data has been submitted in National Center for Biotechnology Information (NCBI) with Submission ID:SUB10645683 and BioProject ID:PRJNA779559.

Differentially expressed genes in high and low Fe and Zn genotypes. Differentially Expressed
Genes (DEGs) were defined as the fold change of the normalized (RPKM) expression values of at least 2 transcripts/contigs in both direction of log2 ratio ≥ 1 and p value ≤ 0.05. During panicle initiation stage, a total of 406 transcripts were differentially expressed of which 155 were up-regulated and 251were down-regulated in high Fe and Zn containing genotypes in comparison to low Fe and Zn containing genotypes. During flowering stage, 349 transcripts were differentially expressed of which 227 were up-regulated and 122 were down regulated. Similarly, during milking stage, 184 transcripts were up-regulated and 194 transcripts were down-regulated in HFZPG in comparison to LFZPG ( Table 4). The GO terms with their gene description and other functional activities of all the up and down regulated transcripts have been described in Supplementary Table S1 to S6. Among the differentially expressed genes, 62 transcripts were expressed during all the three stages of panicle development, while 239 transcripts were specifically expressed during panicle initiation and 163 and 261 transcripts were www.nature.com/scientificreports/ specifically expressed during flowering and milking stage, respectively (Fig. 1A). During panicle initiation and milking stage, 18 transcripts were expressed commonly, while during panicle and flowering stage, 87 transcripts were expressed commonly during both the stages. When the up-regulated genes were characterized, it was found that there were 32 transcripts which were up-regulated during all the three stages of panicle development. However, 68 transcripts up-regulated only during panicle initiation stage and 120 and 122 transcripts up-regulated only during flowering and milking stages (Fig. 1B). While among the down-regulated genes, 30 transcripts were down-regulated during all the three stages of panicle development (Fig. 1C).

Functional annotation of DEGs.
To identify the putative function, pearl millet's differentially expressed genes were compared against the Foxtail millet protein database using BLASTx search. Only 1,133 transcripts of HFZPG which includes 566 up-regulated and 567 down-regulated were used for functional annotation and assigning GO-based classification (Fig. 2). Cellular components (CC) GO terms i.e. plasma membrane, intracellular, cell periphery, protein-containing complex, transcription factor complex, membrane-bounded organelle, organelle membrane, cytoplasm, membrane, cytosol, extracellular region, apoplast, membrane, etc. were found to be enriched, indicating their participation in the uptake of micronutrients. Biological process (BP) GO terms likewise oxidation-reduction process, regulation of transcript, macromolecule localization, establishment of localization, metabolic process, organic substance metabolic process, biosynthetic process, small molecule metabolic process etc. were associated with up-regulated genes. The presence of nucleoside-triphosphate phosphatase as up regulated enzyme has been reported to be involved in different metal ions accumulating metabolic process 42 . Molecular functions (MF) GO terms like ion binding, small molecule binding, amide binding, lipid binding, transporter activity, transferase activity, transmembrane transporter activity, etc. were highly enriched in up-regulated genes. The further classification of ion binding components of MF indicated up-regulation of metal binding, zinc ion binding, iron ion binding, calcium ion binding etc. which may have contributed significantly in the uptake and transport of iron, zinc and other metals (Fig. 3). Fe was involved in chlorophyll synthesis, contributing to the maintenance of chloroplast structure and its function 43 . The presence of metal ion binding, iron ion binding, zinc ion binding and calcium ion binding in HFZPG genotypes indicates their role in accumulation of Fe and Zn in pearl millet genotypes ( Fig. 3 and Supplementary Table S7). [44][45][46] were analyzed to find out the differences in metabolic processes between HFZPG and LFZPG genotypes (Fig. 4). Pathway results infer that in HFZPG, starch/sucrose metabolism, glycolysis/gluconeogenesis, amino sugar and nucleotide sugar metabolism, carbon fixation in photosynthetic organisms, purine metabolism and glycerolipid metabolism pathways were highly enriched.. MapMan pathway analysis at milking stage of high Fe genotypes as depicted in Fig. 4, indicates up-regulated and down regulated genes in different locations of cell and their level of expression in red to dark blue color range. Genes involved in photorespiration process and carbon metabolism were found to be more up and down regulated (Fig. 5).

DEGs involved in transportation and uptakes of mineral in HFZPG.
Transportation and accumulation of minerals (Zn, Fe, Cu, Ca, Cu, Cd etc.) by plants from soil involves many complex regulatory processes directed by group of genes responsible for enhancing mineral uptake by the roots and their transportation to the most suitable place of accumulation. Different types of protein families involved in transportation of minerals from root/shoot to the other parts of the plant have been identified. In the present study, different protein families have been identified which are involved in different processes of mineral uptake at various stages of panicle development. During different stages of panicle development i.e. panicle initiation, flowering and milking different protein families responsible of transportation and uptake of minerals were either up-regulated or down-regulated in the HFZPG (Table 5 and Supplementary Table S7 and S8). A total of 21 protein families were identified to have function associated with minerals uptake in pearl millet. Among these 21 protein families, all the identified protein family were up-regulated during different stages of panicle development except for ABC transporter family protein and ZC3H15 family which were down regulated during milking stage and flavanone 3-dioxygenase 2 which was down regulated during panicle initiation stage. ABC transporter family proteins having functions like ATP binding, ATPase activity and transmembrane movement of substances, were found to be up-regulated during panicle initiation stage (↑panicle initiation) and down regulated during milking stage (↓milking stage). This indicates that the mineral transport process is active during panicle initiation stage and further deactivated during flowering and milking stage. Three transcripts related to multicopper oxidase family

Genes related to Fe and Zn biosynthesis in HFZPG.
In plants, for translocation of minerals from one part to another part, various kind of accumulators have been identified. Transport of Fe and Zn from soil to root and then to shoot is affected by many factors like pH, water availability, organic substances, which may affect transport of Fe and Zn to its target destinations. Many studies have been carried out on accumulator which may      www.nature.com/scientificreports/ play role in mineral transport and also in stress tolerance response (STR) 47 . Functional annotation of high Fe and Zn containing genotypes with foxtail millet database revealed transcripts belonging to different families of genes to be involved in higher Fe and Zn accumulation (Supplementary Table S7 and S8). The functional annotation of the DEG's indicated few transcripts to be belonging to the families having function related to metal binding and metal ion transport. The protein families identified in the high Fe and Zn containing genotypes having function specifically related to Fe and Zn were multicopper oxidase family, major facilitator superfamily, cytochrome P450 family, zinc-containing alcohol dehydrogenase family, ferredoxin-NADP reductase type 1 family and CONSTANS family (Table 5).

Discussion
Micronutrient malnutrition due to iron and zinc deficiencies is a serious public health problem in developing countries. Globally malnutrition has been in the prime focus of researchers across the world. The development of cereal crops rich in micronutrients is one of the major approaches to overcome this problem. In India alone, about 74% of children and 80% of the pregnant women suffer from iron and zinc deficiency. At present, knowledge of the genes involved in transportation and accumulation of Fe and Zn in the seeds of pearl millet is not known and hence a study on genes responsible for Fe and Zn has become important. Various strategies for iron uptake, iron transport and iron storage in many agricultural crops mainly in rice, wheat, common bean, maize reported earlier by many researchers. Movement of iron from soil to its destination place is a complex mechanism and require transportation system which is performed by different genes that regulate the mineral uptake 48 . Transcriptome analysis using Next Generation Sequencing (NGS) technologies is a robust and efficient method for exploring the pattern of gene expression in host. Transcriptome sequencing or RNA-seq surpasses cloning and is helpful in exploring the complexity of transcriptome by allowing RNA analysis through cDNA sequencing at massive scale 49 . It has the potential to identify important secondary metabolic pathways, various transcripts associated with diseases and helpful for discovering novel genes 50 and developing molecular markers 51 . RNA sequencing (RNA-Seq) is widely used among the different transcriptome analysis methods as it can efficiently detect unknown genes and novel transcripts and has much potential to study gene expression and their regulating pathways 52 . This approach has been successfully applied in many important crops, including sesame 53 , chickpea 54 , finger millet 55 , foxtail millet 56 , peanut 57 , maize 58 and cotton 59,60 .
In this study, we used transcriptome sequencing for identification of transcripts involved in Fe and Zn accumulation in the pearl millet genotypes. Two sets of pearl millet genotypes having high and low Fe and Zn in their grains were taken. The samples were collected at three stages of spike development i.e. panicle initiation, flowering stage and milking stage. mRNA was isolated and was subjected to sequencing using Ion S5 next generation sequencer. The analysis of the transcripts was carried out individually for each stage and later combining all the transcripts into two group i.e. high Fe and Zn containing pearl millet genotype (HFZPG) and low Fe and Zn containing pearl millet genotype (LFZPG).
The pearl millet developing spikes transcriptome sequence generated 43 million sequence reads resulting in 83,721 transcripts with N 50 of 597 bp. When matched with the pearl millet de novo assembly, 84.35% of transcripts matched with the genome assembly. Further, when the differentially expressed gene were analyzed, it was found that the maximum number of genes were differentially expressed during the panicle initiation stage of panicle development indicating that the accumulation and transportation of Fe and Zn might be taking place during the panicle initiation stage of panicle development. The group of genotypes having high Fe and Zn showed that during panicle initiation stage a total of 406 transcripts were differentially expressed of which 155 were up-regulated and 251were down-regulated while during flowering stage and milking stage 349 and 378 transcripts were differentially expressed, respectively. The gene annotation and GO term assigning of upregulated genes showed the presence of transcripts involved in metal ion binding, ATP binding, DNA binding, heme binding, iron ion binding, zinc ion binding, catalytic activity, transferase activity, transferring acyl groups, oxidoreductase activity, peroxidase activity, calcium ion binding, copper ion binding, nutrient reservoir activity, Table 5. List of transcripts along with their protein family having a functional association with minerals uptake in high Fe and Zn containing pearl millet genotypes (HFZPG). www.nature.com/scientificreports/ etc. indicating their role in uptake and transport of iron, zinc and other metals. The genes like multicopper oxidase family, major facilitator superfamily, cytochrome P450 family, Zinc-containing alcohol dehydrogenase family, ferredoxin-NADP reductase type 1 family and CONSTANS family having activities related to iron and zinc accumulation were also found to be up-regulated in the transcripts of high Fe and Zn containing genotypes. The iron we consume from plant-based foods undergoes several processing steps with the help of many plant proteins before it forms the final product in form metal cofactor or in storage form 61 . Enzymes such as proton ATPases, ferric reductases intervene with the solubilization of iron hydroxides in the soil. Further, Fe is transported from the apoplast to the symplast and numerous transporters are needed to distribute it within the plant via xylem and phloem. Specific transporter proteins are required for its transportation across intracellular membranes and biosynthetic enzymes helps in integrating it into heme or iron-sulfur clusters. Later, iron is stored in ferritin in the plastids or in the vacuoles. Hence, any of these proteins/ genes can be utilized as biofortification targets and iron uptake entails many genes 62,63 . But, only a few them have been identified as biofortification targets. Plants possess two main approaches for iron uptake -chelate based approach in grasses and reductive approach in non-grasses. Iron concentration was observed to increase by 1.7-fold in rice leaves while it was 1.1-fold in grains due to over expression of IRT1, a divalent metal transporter which is central to reductive iron uptake 64 . Thus, iron accumulates in the vegetative tissues in the absence of extra sink capacity in seeds. Infact, iron concentration increased in the endosperm up to fourfold in polished rice when IRT1 is overexpressed along with PvFER1 65 . Similarly, collective over expression of Arabidopsis IRT1 and FER1 leads to a 5.5-fold in cassava 66 . During reductive iron uptake mechanism, small molecules such as coumarin-derivatives are secreted in some species while flavins in others 63 , but till now it has not been confirmed if biosynthesis genes may be used for biofortification. Derivatives of NA-mugineic acid and deoxymugineic acid (DMA) are secreted in the rhizosphere in grasses and cereals and chelate Fe 3+ . The Fe-chelator complexes are transported into the cell by YSL15 in rice and YS1 in maize 67 . But, overexpression of IDS3 (Iron Deficiency Specific Clone 3, encoding 2'-deoxymugineic-acid 2'-dioxygenase) and YSL15 by increasing DMA resulted into modest increases in iron concentration in rice seeds 68,69 .

Conclusion
We successfully used transcriptome sequencing for identification of transcripts involved in Fe and Zn accumulation in the pearl millet genotypes and it is the first report for identification of metabolic pathways involved for Fe and Zn content in pearl millet. This will be further useful in developing hybrids/varieties rich in Fe and Zn content in pearl millet which will aid in mitigating hidden hunger and enhance nutritional security.