Evolution and functional analysis of Natural Resistance-Associated Macrophage Proteins (NRAMPs) from Theobroma cacao and their role in cadmium accumulation

The presence of the toxic metal cadmium (Cd2+) in certain foodstuffs is recognised as a global problem, and there is increasing legislative pressure to reduce the content of Cd in food. The present study was conducted on cacao (Theobroma cacao), the source of chocolate and one of the crops known to accumulate Cd under certain conditions. A range of genetic and agronomic methods is being tested as a route to such reduction. As part of a gene-based approach, we focused on the Natural Resistance-Associated Macrophage Proteins (NRAMPs), a family of proton/metal transporter proteins that is evolutionarily conserved from bacteria to humans. The plant NRAMP gene family is of particular importance because its members are responsible for uptake of the nutritionally vital divalent cations Fe2+, Mn2+ and Zn2+, as well as Cd2+. We identified the five NRAMP genes in cacao, sequenced these genes and studied their expression in various organs. We then examined the expression patterns in response to variation in nutrient cation availability and addition of Cd2+. Functional analysis by expression in yeast provided evidence that NRAMP5 encodes a protein capable of Cd2+ transport, and suggested this gene as a target for genetic selection/modification.

Scientific REPORTS | (2018) 8:14412 | DOI: 10.1038/s41598-018-32819-y

and AtNRAMP3-AtNRAMP4) were found in Arabidopsis. One syntenic block containing the paralogous pair OsNRAMP2 and 6 was found in rice (Fig. 4). No tandem duplicated pair of NRAMP genes was found in Arabidopsis or rice. To elucidate the duplication events, we used the Ks values for each paralogous pair within a syntenic block. The Ks values for the paralogous pairs in cacao (TcNRAMP2-TcNRAMP3) and Arabidopsis (AtNRAMP3-AtNRAMP4) were 1.46 and 1.63, respectively, which implies that these paralogous gene pairs may have evolved from the ancient hexaploidization event. The other paralogous gene pairs in Arabidopsis (AtNRAMP1-AtNRAMP6) and rice (OsNRAMP2-OsNRAMP6) had Ks values of 0.88 and 0.89, respectively, which suggests that they originated later from lineage-specific segmental duplication events. Comparative analysis of syntenic data from cacao, Arabidopsis and rice showed that TcNRAMP6 corresponds to a pair of recently duplicated Arabidopsis paralogs, AtNRAMP1-AtNRAMP6. The OsNRAMP2 gene corresponds to the TcNRAMP2 and 3 and AtNRAMP3 and 4 genes (Fig. 4). In summary, syntenic analysis of the cacao NRAMP gene family revealed three distinct types, i.e. Type 1, TcNRAMP2-TcNRAMP3 (segmental duplicates); Type 2, TcNRAMP6; and Type 3, TcNRAMP1-TcNRAMP5 (tandem duplicates). Examination of the phylogenetic tree topology (Fig. 1) showed that types 1, 2 and 3 are located in the vascular plant exclusive sub-clusters B3, C1 and C2, respectively.
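The step from Ks values to the relative age of a duplication can be made explicit. The following is a minimal sketch, assuming the commonly used approximation T = Ks / (2r). The text gives no substitution rate, so `ASSUMED_RATE` is a hypothetical placeholder applied to all three species, and the function names are illustrative. Absolute ages should not be read off this sketch; it only shows why the pairs with Ks near 1.5 are interpreted as older than those with Ks near 0.9.

```python
def duplication_age_mya(ks: float, subs_per_site_per_year: float) -> float:
    """Approximate duplication age in million years using T = Ks / (2 * r)."""
    if subs_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return ks / (2.0 * subs_per_site_per_year) / 1e6


# Ks values reported in the text for each paralogous pair.
KS_VALUES = {
    "TcNRAMP2-TcNRAMP3": 1.46,
    "AtNRAMP3-AtNRAMP4": 1.63,
    "AtNRAMP1-AtNRAMP6": 0.88,
    "OsNRAMP2-OsNRAMP6": 0.89,
}

# Hypothetical rate (synonymous substitutions per site per year), chosen only
# for illustration; it is not taken from the study.
ASSUMED_RATE = 6.5e-9


def rank_pairs_by_age(ks_values: dict, rate: float) -> list:
    """Return (pair, age in Mya) tuples ordered from oldest to youngest."""
    ages = {pair: duplication_age_mya(ks, rate) for pair, ks in ks_values.items()}
    return sorted(ages.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for pair, age in rank_pairs_by_age(KS_VALUES, ASSUMED_RATE):
        print(f"{pair}: ~{age:.0f} Mya under the assumed rate")
```

Because one rate is applied to every pair, the ordering simply follows the Ks values; lineage-specific rates would be needed for any real dating.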

Comparative expression analysis of cacao, Arabidopsis and rice NRAMPs in different organs.
To investigate functional synteny among the identified NRAMP paralogs in the three species, expression profiles of NRAMP homologs in cacao were determined experimentally, whereas Genevestigator was used to obtain transcriptome data of Arabidopsis and rice NRAMP genes.
Four diverse organ types including root, mature leaf, unopened flower bud and bean were subjected to RT-PCR to obtain expression pattern and relative abundance of cacao NRAMP transcripts. The reference gene Acyl Carrier Protein (ACP1) was constitutively expressed across the various tissues, thus proving its suitability as a comparator (Fig. S1). Among the target genes, TcNRAMP1 and 5 were specifically expressed in root, unopened flower bud and bean with a comparatively higher level of expression in root, suggesting their specificity to this organ. The TcNRAMP6 gene was also predominantly expressed in root and unopened flower bud among the tissues examined and its transcripts were found in low abundance in other tissues studied. The RT-PCR analysis revealed uniform expression of TcNRAMP2 and 3 across the various organs.

Figure caption: (a) Conserved motifs of NRAMP proteins were identified in 171 sequences from selected species of bacteria, algae, moss, spike moss (Selaginella) and angiosperms using the MEME search tool. Consensus and discriminative motifs are indicated by rectangular box and wedge shapes, respectively. The motif matches are shown with a cut-off p-value less than 0.00001. (b) Multiple alignment of coding nucleotide sequences of motif1 and motif9 was determined based on their corresponding amino acid translations using TranslatorX server. (c) Sequence logo of the motif1 and motif9 was generated by the WebLogo application. Four conserved residues of the substrate binding site are shaded grey.

Figure caption: (a) Pattern of conserved motifs of cacao, Arabidopsis and rice NRAMP proteins identified using the MEME search tool. The motif matches are shown with a cut-off p-value less than 0.00001. Shapes represent conserved motifs, whereas black lines indicate non-conserved regions. Conserved and non-conserved regions are exhibited proportionally. Scale at bottom is drawn on the Arabidopsis NRAMP1 protein. (b) Exon/intron organization of cacao, Arabidopsis and rice NRAMP genes. The blue rectangles represent 5′ and 3′ UTR, the pink round-cornered rectangles indicate exons, and the black lines indicate introns. The sizes of exons and introns can be estimated using the scale at bottom. The Maximum Likelihood tree presented in both figures was generated using the JTT matrix-based model in MEGA7 from full-length amino acid sequences of the 18 cacao (Tc), Arabidopsis (At) and rice (Os) NRAMP proteins.
Transcriptome data of Arabidopsis and rice NRAMP paralogs retrieved from Genevestigator are depicted in Supplementary Fig. S2. Comparison of the expression profile of cacao NRAMP paralogs with Arabidopsis and rice transcriptome data revealed a distinct pattern. Most of the paralogs from the three species grouped in cluster B3 either showed higher expression in leaf and reproductive tissues compared to root or were expressed universally across the organs. Root-specific NRAMP paralogs were grouped in cluster C2 and C3. The TcNRAMP1 and 5 genes, which previously showed strong structural homology with the rice root specific NRAMP5, also followed a similar pattern of expression. Similarly, AtNRAMP1, which tightly clustered with TcNRAMP6 in the phylogenetic tree, was also highly expressed in root compared to leaf and reproductive organs. The two genes TcNRAMP2 and AtNRAMP2, which clustered together, were expressed universally across the organs. Similarly, TcNRAMP3 showed association with its Arabidopsis counterpart NRAMP4 that demonstrated constitutive expression.

Expression pattern of cacao NRAMPs under metal cation deficiency.
To determine the role of the cacao NRAMP gene family in cation transport, we used qRT-PCR to conduct gene expression studies on seedlings grown hydroponically with various combinations of nutrient cations. In the first experiment, expression of the cacao NRAMP genes was assessed under two different hydroponic growth conditions, i.e. seedlings grown in standard Hoagland solution and in modified Hoagland solution lacking Fe2+, Zn2+ and Mn2+. Overall, TcNRAMP1, 5 and 6 transcripts were predominantly expressed in roots, whereas TcNRAMP2 and 3 were constitutively expressed in leaf and root. These findings are consistent with the RT-PCR results. Root-specific cacao NRAMP genes exhibited a high degree of sensitivity to nutrient cation deficiency (Fig. 5a). Expression of TcNRAMP5, 1 and 6 increased 15, 10 and 2.5 fold, respectively, under nutrient cation deficiency compared to the control. TcNRAMP3 also showed significant transcript sensitivity to nutrient deficiency in roots; however, its expression remained stable in leaf. Cation exclusion did not significantly influence expression of TcNRAMP2 in either organ (Fig. 5a,b). The high degree of sensitivity of cacao NRAMPs to nutrient cation deficiency found here suggests their putative role in cation uptake. However, the combined treatment cannot show which cation is responsible for the response. Therefore, in a subsequent experiment, we attempted to separate the individual effect of each of the divalent cations on the expression of TcNRAMP1, 3 and 5, which were found to be most sensitive to combined Fe2+, Zn2+ and Mn2+ deficiency. The expression data revealed that deficiency of Zn2+ and/or Mn2+ did not trigger any change in expression of any of the three genes compared to the control. However, a significant increase in expression was observed under Fe2+ deficiency (Fig. 6a). TcNRAMP5, 1 and 3 demonstrated a respective 9, 4 and 3 fold increase in expression under Fe2+ deficiency compared with the control.
Taking account of the significant change in expression under Fe2+-depleted conditions, we suggest that these genes may have a role in Fe2+ transport in cacao.
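Fold changes such as those quoted above are commonly derived from qRT-PCR threshold cycles (Ct) relative to a reference gene. The following is a minimal sketch, assuming the widely used 2^-ΔΔCt calculation with 100% amplification efficiency; this section does not state which quantification method or reference gene was used for the qRT-PCR experiments, and the Ct values in the usage line are hypothetical.

```python
def fold_change_ddct(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Relative expression (treated vs control) by the 2^-ddCt calculation.

    Assumes both primer pairs amplify with ~100% efficiency, so that one
    cycle corresponds to a two-fold difference in template.
    """
    # Normalise the target gene to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalised values between conditions.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold three cycles
    # earlier under deficiency while the reference gene is unchanged.
    print(fold_change_ddct(22.0, 18.0, 25.0, 18.0))
```

A value above 1 indicates up-regulation in the treated sample; a value of 1 indicates no change relative to the control.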
Expression response of cacao NRAMPs to cadmium.
The NRAMP gene family is fundamentally involved in uptake and transport of essential nutrient elements like Fe2+ and Mn2+. However, the encoded proteins exhibit limited selectivity for divalent metal cations and some NRAMP proteins can also mediate Cd2+ transport. Cadmium enters into root cells as an opportunistic hitchhiker on poorly specific transporters. The nutrient cations and Cd2+ are taken up through roots and then transported to other plant organs. This implies that the genes encoding transporters involved in uptake of Cd2+ from soil may be expressed more highly in roots compared to other plant organs. The Fe2+ and Mn2+ NRAMP transporters can also mediate Cd2+ transport. To investigate relative expression between species, we compared the pattern of expression under metal deficient conditions for Arabidopsis and rice root-specific NRAMPs retrieved from Genevestigator (Fig. S3) with data for the cacao root-specific NRAMPs from the present study. Analyses of microarray expression data revealed high sensitivity of the Arabidopsis Cd2+ transporters AtNRAMP1 and 4 to Fe2+ deficient conditions, with AtNRAMP4 found to be more sensitive than AtNRAMP1 to conditions of Fe2+ deficiency. The extent of sensitivity also increased with an increase in the duration of iron deficiency. Similarly, in rice, the root-specific iron transporter OsNRAMP1 is also involved in cadmium transport. As described previously, root-specific TcNRAMP1 and 5 both exhibited a high degree of sensitivity to Fe2+ deficiency, suggesting their putative role in Fe2+ uptake. In view of the significant structural and functional homology of these cacao NRAMPs with the functionally characterized Arabidopsis and rice Cd2+ transporters, they may have a similar role in cadmium uptake. To test the hypothesis, we conducted a third study to determine the transcript accumulation of TcNRAMP1, 3 and 5 in response to cadmium stress.
To separate the individual effect of Cd2+ from that of other nutrient cations, expression of TcNRAMP genes was assessed under three different hydroponic growth conditions, i.e. seedlings grown in standard Hoagland solution (CTRL), CTRL lacking Fe2+, Zn2+ and Mn2+ (T1), and T1 supplemented with Cd2+ (T2). Root-specific TcNRAMP genes were strongly upregulated under nutrient cation deficiency (Fig. 6b). Interestingly, the expression level of these genes under nutrient-deficient conditions supplemented with Cd2+ (T2) was significantly reduced, to the level of expression detected under control conditions. Expression of TcNRAMP3 was also induced in response to nutrient cation deficiency; however, addition of Cd2+ did not show any effect.
Functional characterization of cacao NRAMP genes.
Cloning of TcNRAMPs.
For functional characterization, the complete open reading frames (ORFs) of the five cacao NRAMPs were cloned by the RT-PCR approach. Interestingly, restriction analyses showed the presence of more than one cDNA clone for TcNRAMP5. Sequencing of TcNRAMP5 clones revealed a fully spliced 1671 bp ORF, and a 1527 bp misspliced variant containing partial deletion of exons 10 and 12 and complete deletion of exon 11. However, the deletion did not alter the reading frame, and the clone encoded a 509 amino acid protein with a deletion from aa 411 to aa 458 compared with the full-length protein. The variant is designated as TcNRAMP5s. Clones generated for TcNRAMP1, 2, 3 and 6 contained full-length coding sequences.
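The statement that the deletion in TcNRAMP5s preserves the reading frame can be checked directly from the numbers quoted above. The following is a small sketch of that arithmetic; the function and key names are illustrative, and only the four figures at the top come from the text.

```python
# Figures quoted in the text for TcNRAMP5 and its splice variant TcNRAMP5s.
FULL_ORF_BP = 1671
VARIANT_ORF_BP = 1527
DELETION_FIRST_AA = 411
DELETION_LAST_AA = 458


def deletion_summary(full_bp: int, variant_bp: int, first_aa: int, last_aa: int) -> dict:
    """Check that an internal deletion is in frame and matches a residue range."""
    deleted_bp = full_bp - variant_bp
    deleted_aa = last_aa - first_aa + 1  # residue range is inclusive
    return {
        "deleted_bp": deleted_bp,
        "deleted_codons": deleted_bp // 3,
        # A deletion keeps the downstream frame only if it removes whole codons.
        "in_frame": deleted_bp % 3 == 0,
        "matches_aa_range": deleted_bp == 3 * deleted_aa,
        "variant_codons": variant_bp // 3,
    }


if __name__ == "__main__":
    summary = deletion_summary(
        FULL_ORF_BP, VARIANT_ORF_BP, DELETION_FIRST_AA, DELETION_LAST_AA
    )
    for key, value in summary.items():
        print(f"{key}: {value}")
```

The 144 bp difference corresponds to exactly 48 codons, the same number of residues as the aa 411 to aa 458 range, and the 1527 bp variant spans 509 codons.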
Heterologous expression in yeast.
The role of the cacao NRAMPs in Fe2+, Zn2+ and Mn2+ transport was determined by testing for complementation of the phenotype of the double mutant S. cerevisiae strains DEY1453 (fet3fet4) and ZHY3 (zrt1zrt2) and the single mutant smf1, respectively. The strain DEY1453 is defective for both high- and low-affinity Fe2+ uptake systems [6]. Similarly, the ZHY3 strain lacks a functional copy of the high- and low-affinity Zn2+ transporters, ZRT1 and ZRT2, respectively [7]. The Mn2+ mutant strain smf1 lacks the SMF1 gene, essential for high-affinity Mn2+ uptake. Consequently, these mutants need a much higher amount of the respective metal for their growth compared to the parental wild-type strain. For the Fe2+ assay, the transformed mutant strain fet3fet4 and wild type cells were grown on SD medium either supplemented with the Fe2+ chelating agent bathophenanthroline disulfonate (BPS) or without BPS (control). The control plate showed good growth for all yeast cells. Under Fe2+-limited conditions, out of the five TcNRAMPs studied, only cells expressing TcNRAMP3 and 5 could rescue the phenotype (Fig. 7a). To test the various TcNRAMPs for their Mn2+ transport activity, the treatment plate was prepared by adding the divalent cation chelator ethylene glycol-bis(2-aminoethylether)-N,N,N′,N′-tetraacetic acid (EGTA) to the solid SD medium to limit the availability of Mn2+. Expression of TcNRAMP3, 5 and 6 successfully complemented the phenotype of the smf1 strain on the treatment plate. The control plate showed uniformly good growth for all yeast cells (Fig. 7b). In the assay of Zn2+ uptake, the strain carrying TcNRAMP5 showed significant Zn2+ transport activity (Fig. 7c). TcNRAMP1, 2 and the splice variant TcNRAMP5s failed to show transport activity for any of the metals tested.
As expected, wild type strain DY1457 containing the empty vector pDR195 grew well in both conditions, whereas mutant strains transformed with the empty vector pDR195 showed good growth on the control plate only. The growth assay conducted in low iron or zinc liquid medium produced similar results (Fig. 7d,e).
In addition to uptake and transport of essential nutrient elements, the NRAMP gene family is involved in transport of Cd2+ due to poor substrate specificity. Therefore, we investigated the cadmium transport ability of TcNRAMPs in yeast at different Cd2+ concentrations. For the cadmium uptake test, wild type yeast strain DY1457 was transformed with individual cacao NRAMPs, including the splice variant TcNRAMP5s, and with the empty vector pDR195, and spotted on control and Cd2+-containing SD medium agar plates. The yeast cells transformed with TcNRAMP5 showed no growth in the presence of 10 µM Cd2+, indicating high-affinity Cd2+ transport by this protein. However, the splice variant TcNRAMP5s failed to show sensitivity to Cd2+. Significant differences in growth of TcNRAMP6-transformed cells were observed at 10 and 20 µM Cd2+ compared to the control (Fig. 8a). Growth of the cells expressing TcNRAMP1, 2 or 3 was not significantly affected by the presence of Cd2+, and was comparable with growth of cells containing the empty vector. For quantitative assessment of the growth response to Cd2+, we evaluated the transformed yeast cells in liquid SD medium containing five different Cd2+ concentrations. Expression of TcNRAMP5 led to highly significant sensitivity to Cd2+. Growth inhibition of 70% compared to the control was recorded for cells expressing TcNRAMP5 even at the very low Cd2+ concentration of 2 µM. Cd2+ concentrations of 5 µM and above completely abolished growth of the cells (Fig. 8b). The yeast cells transformed with TcNRAMP6 also showed mild inhibition of 34% in relative growth at 2 µM Cd2+, which gradually increased to 45, 67, 80 and 84% at 5, 10, 20 and 50 µM Cd2+, respectively. The growth inhibition of cells expressing TcNRAMP1, 2 or 3 or the splice variant TcNRAMP5s was comparable with that of cells containing the empty vector.
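The inhibition percentages above express growth in the presence of Cd2+ relative to growth of the same strain without Cd2+. The following is a minimal sketch, assuming growth is measured as a culture density reading such as optical density; the measurement used is not stated in this section, and the function names are illustrative. The TcNRAMP6 percentages are those quoted in the text.

```python
def growth_inhibition_percent(od_treated: float, od_control: float) -> float:
    """Percent reduction in growth relative to the Cd-free control culture."""
    if od_control <= 0:
        raise ValueError("control growth must be positive")
    return (1.0 - od_treated / od_control) * 100.0


# Inhibition values (%) reported in the text for TcNRAMP6-expressing cells,
# keyed by Cd2+ concentration in micromolar.
TCNRAMP6_INHIBITION = {2: 34, 5: 45, 10: 67, 20: 80, 50: 84}


def is_dose_dependent(inhibition_by_dose: dict) -> bool:
    """True if inhibition never decreases as the concentration rises."""
    ordered = [inhibition_by_dose[dose] for dose in sorted(inhibition_by_dose)]
    return all(low <= high for low, high in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    # Hypothetical readings: a treated culture reaching 30% of control density.
    print(round(growth_inhibition_percent(0.3, 1.0), 1))
    print(is_dose_dependent(TCNRAMP6_INHIBITION))
```

Applied to the reported TcNRAMP6 values, the check confirms that inhibition rises monotonically across the five concentrations tested.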
In addition to the sensitivity test, accumulation of Cd2+ was quantified in the wild type yeast strains expressing TcNRAMP5 or 6, which showed sensitivity to Cd2+, and in control cells transformed with the empty vector. As the strain expressing TcNRAMP5 revealed very high sensitivity to Cd2+, the cells were exposed to a low concentration of 2 µM Cd2+ for 72 h. The yeast cells expressing TcNRAMP5 accumulated three times more Cd2+ than those with only the vector. Despite their hypersensitivity to Cd2+, cells expressing TcNRAMP6 did not differ significantly from the control in Cd2+ accumulation (Fig. 8c). These findings strongly suggest that TcNRAMP5 is able to transport Cd2+ in addition to essential nutrient metal cations.

Discussion
Natural resistance-associated macrophage proteins (NRAMPs) are reportedly involved in binding and transport of essential metal cations, and exist in all kingdoms of life. Cacao is an economically important crop renowned for its integral role in the chocolate and beverage industries. Members of the NRAMP family have been identified and functionally characterized in a number of plant species including Arabidopsis, rice and soybean [8-12]; however, to date such information is lacking in cacao. Here, we searched for and aligned Arabidopsis NRAMP homologs in selected viridiplantae species to gain insight into the evolutionary relationship of NRAMPs among plant species, including cacao. The Arabidopsis and rice NRAMP transporters, and the cacao NRAMPs identified through phylogenetic analysis, were then subjected to detailed structural predictions and functional analyses.
The genomes of recently evolved plant species encode a variable number of NRAMP proteins. Although the basal angiosperm A. trichopoda has three copies, these underwent lineage-specific expansion to the 10 and 13 copies found here in the monocot Panicum virgatum and the eudicot Glycine max, respectively. An informatics search of the cacao genome identified five NRAMP homologs compared to six and seven homologs in Arabidopsis and rice, respectively. Gene duplication contributes to expansion and functional diversification of gene families, and may result from tandem duplication, which arises through unequal crossing over, or from segmental duplication, including whole genome duplication (WGD) and duplications of large chromosomal regions [13]. Analyses of syntenic data of cacao revealed one tandem duplicated pair of paralogs (NRAMP1 and 5) and one segmental duplication (NRAMP2 and 3), which led to an increase in copy number to five. However, the three types of cacao NRAMPs (1-5; 2-3; 6) correspond to the three members found in the basal angiosperm (A. trichopoda). All the sequenced angiosperm genomes have undergone ancient and more recent WGD events [14]. Arabidopsis has experienced at least three rounds of such events since its divergence from other Brassicales [15]. The cacao genome has not undergone any WGD event since the pan-eudicot triplication [16]. Therefore, the segmental duplication (TcNRAMP2-TcNRAMP3) might have arisen from duplications of large chromosomal regions during evolution.
Phylogenetic analysis grouped NRAMP proteins from selected viridiplantae species into three clusters. Algae and moss NRAMPs formed a distinct cluster. Cluster B had representatives of NRAMP homologs from all selected viridiplantae species, whereas cluster C was formed by NRAMPs exclusively from vascular plants. The three NRAMP copies in the basal angiosperm A. trichopoda and the basal monocot S. polyrhiza nested in three distinct clusters, which suggests that each cluster had a common ancestor (Fig. 1). The five NRAMP proteins identified in cacao clustered into two groups, B and C, with cluster C being further branched into two distinct sub-clusters. Phylogenetic analyses of NRAMP proteins in Arabidopsis [17], rice [10] and soybean [12] have revealed the same clustering pattern found here in cacao. Since gene structures are reported to be conserved among paralogs and homologs in many gene families, these data may provide insights into the evolutionary history of the NRAMP gene family. The comparative analysis of conserved motifs and intron/exon organization in cacao, Arabidopsis and rice (Fig. 3a,b) corresponded to the phylogenetic groups. The number of exons/introns was highly conserved within each of the two subfamilies. For instance, TcNRAMP2 and 3; AtNRAMP2 through 5; and OsNRAMP2 and 6 in cluster B contain 3-4 exons, whereas the other subfamily represented by TcNRAMP1, 5 and 6; OsNRAMP2 through 5; and AtNRAMP1 and 6 have 12 to 13 exons.
Determination of tissue specificity of gene expression may aid the prediction of the physiological function of the respective gene. Expression profiling of cacao NRAMP genes revealed a diversified pattern of expression in different organs. TcNRAMP1, 5 and 6 were primarily expressed in roots whereas TcNRAMP2 and 3 were uniformly expressed across the organs (Fig. S; Fig. 6a,b). The root-specific expression of TcNRAMP1, 5 and 6 implies their role in uptake of metals from the external solution. We also retrieved microarray expression data for Arabidopsis and rice from Genevestigator (Fig. S2) to compare them with cacao. The expression pattern also supported the relationships found among the three species in the phylogenetic and gene structure analysis. For example, most of the paralogs grouped in cluster B either showed higher expression in leaf and reproductive tissues compared to roots or were expressed universally across the organs, whereas paralogs in cluster C were expressed exclusively in roots. The relationship found here in microarray data is also supported by tissue-specific expression determined through qRT-PCR for Arabidopsis [8] and rice [10] NRAMP transporters.
Functional divergence is a primary outcome of gene duplication, and may occur by either neo-functionalization (the duplicated gene gains an entirely new function compared to that of the ancestral gene) or sub-functionalization (the duplicate gene complements the function of the ancestral gene). Having occurred recently, tandem duplicates are more likely to show a higher level of complementarity than segmental duplicates [18]. Results of comparative expression analysis of the duplicated paralogous pairs predicted in the synteny analysis were found to be consistent with sub-functionalization. The cacao tandem duplicated pair (NRAMP1-NRAMP5) was co-expressed in root and flower bud, whereas the segmental duplicate pair (NRAMP2-NRAMP3) showed uniform expression.
Expression of NRAMP transporter genes has generally been shown to be up-regulated under divalent metal starvation. We also found significant transcript sensitivity to cation deficiency for cacao NRAMP genes in roots. The root-specific TcNRAMP1 and 5 transcripts were significantly upregulated under iron deficiency; however, limitation of Zn2+ and/or Mn2+ did not trigger any change in expression (Fig. 6a). Similar features have been reported for their homologs in Arabidopsis and rice. AtNRAMP1 was primarily expressed in roots and demonstrated up-regulated expression under Fe2+ starvation [8,19]. However, its expression was also upregulated by Mn2+ deficiency [20]. The rice homolog OsNRAMP5 was also predominantly expressed in roots and exhibited significantly high expression under Fe2+-limited conditions, but was unaffected by Mn2+ deficiency [21]. The protein sequence comparison showed that TcNRAMP1 and 5 had a high identity of 72% with OsNRAMP5 compared to 58% identity with AtNRAMP1. These findings suggest that TcNRAMP1 and 5 may have more functional homology with OsNRAMP5 than with AtNRAMP1 in relation to metal transport. TcNRAMP3 also exhibited highly significant expression under Fe2+-limited conditions (Fig. 6a), which corresponded to induced expression of the Arabidopsis ortholog AtNRAMP4 under Fe2+ deficiency [8]. AtNRAMP6 is expressed mainly in shoot and dry seeds and acts as an intracellular metal transporter [11]. TcNRAMP6, which showed mild sensitivity to exclusion of all three metal cations, failed to show sensitivity specific to deficiency of any particular metal and might have a role in intracellular metal transport. A recent study has reported detailed characterization of AtNRAMP2. The gene is expressed constitutively in root and shoot, but its expression was not altered by exclusion of Fe2+, Zn2+ or Mn2+ in either organ [22]. Similarly, we did not find any significant influence of cation exclusion on expression of the NRAMP2 ortholog in cacao (Fig. 6a).
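Percent identity figures such as the 72% and 58% quoted above are computed from a pairwise alignment of the two protein sequences. The following is a minimal sketch of one common definition, identical residues divided by aligned (non-gap) columns. The text does not state which definition or tool produced the reported values, and the short sequences in the usage line are hypothetical, not NRAMP sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment with one gap and one mismatch.
    print(percent_identity("MKT-AL", "MRTQAL"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values are only comparable when the same definition is used for every pair.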
Cadmium has long been recognized as a major health concern to humans. Plants have a tendency to take up Cd2+ from soil and accumulate it in edible parts, which represent the main source of Cd2+ in human food [23]. Cadmium, being toxic, is not vital for plant growth, and therefore at first sight no selection pressure is expected to favour a Cd2+-specific transporter. The question remains, however, as to whether Cd2+ accumulation might have an indirect benefit to plant performance, for example by inhibiting pests and/or pathogens [24-26], and therefore there might indeed be selection in favour of Cd2+ uptake. Evidence in support of such a theory comes from the increasing number of studies [27-32] that have identified an inhibitory effect of leaf cadmium on feeding and other behaviour of herbivorous caterpillars such as Spodoptera litura and Lymantria dispar.
The NRAMPs have broad substrate specificity and have been reported to transport cadmium in addition to Mn2+, Fe2+ and Zn2+ in Arabidopsis, rice and many other plants [8,10,21,33-36]. The significant structural and transcriptional similarity of TcNRAMP1, 3 and 5 with the functionally characterized Arabidopsis and rice Cd2+ transporters implies their possible role in Cd2+ uptake. To test the hypothesis, we conducted an expression study to determine the transcript accumulation of TcNRAMP1, 3 and 5 in response to Cd2+ stress. Expression studies on the response of NRAMPs to cadmium usually compare conditions where plants are grown in medium containing nutrient cations in the presence and absence of cadmium [10,35]. Because Cd2+ and nutrient cations share common transporters, a significant impact of Cd2+ on expression of these transporters cannot be expected in the presence of nutrient cations. To separate the individual effect of Cd2+ from that of nutrient cations, expression of the TcNRAMPs found to be sensitive to metal deficiency was assessed in the presence of nutrient cations, in their absence, and in their absence in media containing Cd2+. All three genes, as expected, showed high sensitivity to nutrient cation deficiency; however, addition of Cd2+ drastically reduced the expression of TcNRAMP1 and 5 to the level detected under control conditions (Fig. 6b). Differential expression of these genes in plants grown with and without Cd2+ suggests their role in cadmium transport. As previously discussed, Cd2+ enters the root cell opportunistically through poorly specific transporters. It can be argued that the control system (at present unknown) for TcNRAMP1 and 5 probably recognized Cd2+ as a divalent nutrient cation, which resulted in complete reversal of their extremely high expression under Cd2+-supplemented conditions.
These results are also supported by the reported role of the closely similar AtNRAMP1 [8] and OsNRAMP5 [10] in Cd2+ transport. Addition of Cd2+ did not trigger a change in expression of TcNRAMP3, which suggests a role in the transport of a metal other than Cd2+.
The expression pattern of a gene may not necessarily be informative with respect to its function. Therefore, we cloned five TcNRAMPs and expressed them in yeast strains for functional characterization. Results showed that the transporters encoded by TcNRAMP3 and 5 have broad substrate specificity including Fe2+ and Mn2+. TcNRAMP6 is specific for Mn2+ transport (Fig. 7). In addition to nutrient cations, yeast expressing NRAMP5 and 6 exhibited high sensitivity to Cd2+ (Fig. 8a,b). The yeast cells expressing TcNRAMP5 accumulated three times more Cd2+ than cells carrying the vector only. In contrast, cells expressing TcNRAMP6 showed the same level of Cd2+ accumulation as the control (Fig. 8c). TcNRAMP1 and 2 failed to show transport activity for the metals tested. The metal transport activity reported previously for Arabidopsis and rice NRAMPs supports the present findings for the respective orthologs in cacao. Like TcNRAMP5, OsNRAMP5 and AtNRAMP1 have been implicated in Fe2+, Mn2+ and Cd2+ transport [8,10]. Similarly, heterologous expression of AtNRAMP6 in yeast enhanced sensitivity to Cd2+ without affecting cadmium content in the cells [11]. However, TcNRAMP3, an ortholog of AtNRAMP3 and 4, showed transport activity for Fe2+ and Mn2+ but not for Cd2+. Recently, it has been reported that the AtNRAMP2 protein is involved in remobilization of Mn2+ in the Golgi for root growth instead of uptake through roots [22]. Based on structural and expression similarities, it is suggested that TcNRAMP2 may be involved in remobilization rather than uptake of metal cation(s).
Structure-function relationships in the context of substrate preference in the NRAMP family have been demonstrated in recent studies [2,37]. The crystal structure of the NRAMP transporter of the bacterium Staphylococcus capitis (ScaNRAMP) revealed a substrate-binding site that coordinates divalent transition-metal ions including Mn2+, Fe2+ and Cd2+. Four conserved residues, aspartic acid (D)49, asparagine (N)52, alanine (A)223 and methionine (M)226, directly bind the metal substrate. Functional investigations have established that mutation of the ion-coordinating residue N52 very strongly reduces binding affinity for Mn2+, Fe2+ and Cd2+. Inspection of the substrate-binding site in the sequences used in this study revealed that the metal substrate binding residue N52, located in motif1, was conserved in all sequences apart from TcNRAMP1, where serine replaced asparagine (Fig. 2b). TcNRAMP1 and TcNRAMP5 shared 92% similarity in protein sequence and showed a similar pattern in expression studies; however, TcNRAMP1 failed to show transport activity for the metals tested (Figs 7 and 8), a finding that might be attributed to the mutation at the conserved residue N52 (based on ScaNRAMP numbering) resulting in complete loss of function. To test this hypothesis, we developed a synthetic version (TcNRAMP1m) of TcNRAMP1 (S52N). Like TcNRAMP1, TcNRAMP1m did not show any transport activity for Fe2+, Mn2+ or Zn2+ when characterized in the yeast expression system (Fig. 7), which implies that restoring the N52 residue in cacao NRAMP1 is not sufficient to confer transport of the metals tested.
Complete/partial loss of function of the NRAMPs have been associated with mutation of highly conserved residues involved in metal selectivity 1,38 and the truncations leading to major structural changes 35 . The key metal selectivity role of conserved methionine (M226) is well established in bacteria. A methionine-to-alanine substitution reduced binding affinity and transport of Cd 2+ without affecting binding behaviour of Mn 2+ . However, it enables transport of calcium and magnesium, a finding that suggests that the conserved methionine is essential for transport of low-abundance transition metals in the presence of high-abundance divalent metals such calcium and magnesium 1 . Inspection of the conserved methionine in the sequences we identified here revealed 86% conservation. Among the various protein sequences, OsNRAT1 (NRAMP aluminium transporter 1) and AtNRAMP5 have a methionine-to-alanine substitution, whereas valine has replaced the methionine in OsNRAMP4. Rice OsNRAT1 transports the highly abundant trivalent aluminium metal 39 , hence the substitution of methionine may have led to the diverged function. However, AtNRAMP5 and OsNRAMP4 have not yet been functionally characterized. Phenylalanine (F398 ScaNRAMP numbering) is also highly conserved in NRAMPs and induced mutation of the residue in AtNRAMP4 has reduced its ability to transport Cd 2+ in yeast 38 . On other hand, a mutant identified in tobacco encoding a truncated NRAMP5 protein showed no transport activity for Mn 2+ and a weak transport activity for Cd 2+ compared to wild type 35 . Also, OsNRAMP5 mutants developed using either ion-beam irradiation 40 or the CRISPR/Cas9 gene-editing system 5 generated C terminal truncations that showed low uptake of Cd 2+ without compromising yield. We identified a splice variant among the TcNRAMP5 clones with partial deletion of exons 10 and 12 and complete deletion of exon11. 
The variant, which encodes a 510 aa peptide, contains the conserved residues that have been implicated as the substrate binding site; however, it lacks transmembrane domain 10 compared to full length TcNRAMP5 (Fig. S4). In contrast to TcNRAMP5, the splice variant did not show evidence for any metal transport activity when expressed in yeast (Figs 7 and 8); this suggests that a deletion in the C-terminal region may cause a complete loss of function despite the presence of the conserved substrate binding site. This result is consistent with previous findings in tobacco 35 and rice 5. This finding is also supported by an investigation in yeast that affirms the importance of the entire C-terminal domain in stability and trafficking of the membrane protein Pma1 H+-ATPase 41. Taken together, these results imply a role of TcNRAMP5 in Cd2+ uptake in cacao. Identification or induction of loss-of-function mutations in the gene may help in the development of cacao clones with reduced Cd2+.

NRAMP homolog prediction, phylogeny, and gene/protein bioinformatic analyses. A BlastP search was conducted using the protein sequences of the Arabidopsis metal transporters NRAMP1 to 6 as queries to identify orthologs in two algae and 28 other plant species (http://www.phytozome.net/). A list of the sequences is provided in Table S2 in Supplementary Information. Each selected sequence was inspected for the presence of the NRAMP domain (pfam01566) using Pfam (https://pfam.xfam.org/). The selected NRAMP sequences were subjected to multiple alignment using MUSCLE 42. The initial tree was generated using the Maximum Likelihood method based on the JTT matrix-based model in MEGA7 43. Bootstrap support was determined from 1000 replicates. The phylogenetic tree was visualized and drawn with iTOL 44.
The isoelectric point and molecular mass of the NRAMP sequences were predicted by the ProtParam tool (http://web.expasy.org/protparam/). Transmembrane domains (TMDs) were predicted by the TMHMM Server v. 2.0 (http://www.cbs.dtu.dk/services/TMHMM/). Genomic sequences of cacao NRAMP genes were downloaded from NCBI and gene structures were constructed on the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/). The expression data of Arabidopsis and rice NRAMPs were downloaded from Genevestigator (https://genevestigator.com/gv/) and heat maps were generated using heatmapper software (http://www1.heatmapper.ca/expression/).
Plant growth. Cacao clone NA702 was chosen for expression analysis and functional characterization of cacao NRAMP genes. For organ-specific expression, root, mature leaf, unopened flower bud and bean were obtained from the International Cocoa Quarantine Centre, Reading, UK (http://www.icgd.reading.ac.uk/icqc/). For metal sensitivity expression studies, beans were germinated in compost. Two-week-old seedlings were then transferred to half strength Hoagland solution with the pH adjusted to 5.2. The nutrient solution was aerated for 15 min every two hours, and renewed every week. Plants were cultured under controlled environment conditions (28/22 °C day/night temperature, 16 h photoperiod with 60% relative humidity). The seedlings were grown in the solution for 21 days and then subjected to different nutrient combinations to determine their effect on gene expression. In the first experiment, the seedlings were subjected to nutrient deficiency by exclusion of Fe2+, Zn2+ and Mn2+ from the half strength Hoagland solution. In a subsequent experiment, conducted to separate the individual effect of each of the divalent cations on the expression of cacao NRAMPs, the seedlings were exposed to three nutrient-deficient conditions, i.e. half strength Hoagland solution excluding Fe2+ (T1), Mn2+ (T2) or Zn2+ (T3).
In the third experiment, conducted to determine transcript accumulation in response to cadmium stress, seedlings were grown in half strength Hoagland solution excluding Fe2+, Zn2+ and Mn2+ (T1) or in T1 supplemented with 20 µM cadmium chloride (T2). Seedlings grown in half strength Hoagland solution were sampled as the control in each experiment. Each treatment included four biological replicates. Leaf and root tissues were sampled seven days after application of the treatments. The organs sampled for expression studies were stored at −80 °C prior to RNA extraction.
Expression analyses. Total RNA was isolated from various organs using a modified CTAB method, subsequently purified with the RNeasy® Plant Mini Kit (Qiagen) according to the manufacturer's instructions, and quantified using a NanoDrop 2000 Spectrophotometer. Additionally, aliquots of extracted RNA were run on a 1.5% agarose gel for quality determination. Isolated total RNA (1.0 µg) was converted into cDNA using the High-Capacity RNA-to-cDNA™ kit (Thermo Fisher Scientific). Nucleotide sequences for the reference gene and genes of interest were retrieved from the NCBI GenBank database (http://www.ncbi.nlm.nih.gov) and primers were designed using the Primer-BLAST tool (http://www.ncbi.nlm.nih.gov/tools/primer-blast). The list of primers used in the expression study is given in Supplementary Table S3. RT-PCR was performed in a Veriti Thermal Cycler, whereas the StepOnePlus™ Real-Time PCR system was used for real-time RT-PCR. BioMix™ (Bioline) and PowerUp™ SYBR® Green master mix (Applied Biosystems) were used as the PCR mix in RT-PCR and qRT-PCR, respectively. RT-PCR products were run on a 2.5% agarose gel stained with ethidium bromide. A relative standard curve assay was used to determine the relative expression of genes of interest in real-time analysis. The cacao Acyl Carrier Protein (ACP1, GenBank: TCM025966) was utilised as the reference gene for normalization.
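The general logic of a relative standard curve assay can be sketched as follows. This is an illustration under stated assumptions rather than the instrument software's procedure: a straight line is fitted to Ct against log10 of the relative quantity for a dilution series of each gene, sample quantities are interpolated from their Ct values, and the target quantity is divided by the reference gene quantity. All function names and Ct values below are hypothetical.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, curve):
    """Interpolate a relative quantity from a Ct value and a fitted curve."""
    slope, intercept = curve
    return 10 ** ((ct - intercept) / slope)


def relative_expression(target_ct, reference_ct, target_curve, reference_curve):
    """Target quantity normalised to the reference gene quantity."""
    return (quantity_from_ct(target_ct, target_curve)
            / quantity_from_ct(reference_ct, reference_curve))


# Hypothetical 10-fold dilution series (relative quantities 1 to 0.001).
dilutions = [0.0, -1.0, -2.0, -3.0]
target_curve = fit_standard_curve(dilutions, [20.0, 23.3, 26.6, 29.9])
reference_curve = fit_standard_curve(dilutions, [18.0, 21.4, 24.8, 28.2])

# Hypothetical sample Ct values for a gene of interest and the reference gene.
ratio = relative_expression(23.3, 21.4, target_curve, reference_curve)
print(f"Relative expression: {ratio:.2f}")
```

Fitting a separate curve for each gene means that differences in amplification efficiency between the target and the reference are absorbed by the two slopes.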
Cloning of cacao NRAMPs. Total RNA isolated from roots was converted into cDNA using SuperScript™ III First-Strand Synthesis SuperMix (Thermo Fisher Scientific) to obtain full-length products. The full coding sequence of each cacao NRAMP gene, including the stop codon, was amplified using Phusion Hot Start II High-Fidelity DNA Polymerase (Thermo Fisher Scientific) with primers containing attB overhangs (see Supplementary Table S4). TcNRAMP1 and 5 were first cloned into the pCR™4Blunt-TOPO® cloning vector and then into the pDONR221 donor vector, whereas TcNRAMP2, 3 and 6 were cloned directly into the donor vector. The entry clones were produced using Gateway® BP Clonase™ II Enzyme Mix (Thermo Fisher Scientific). The yeast expression vector pDR195 was converted to a Gateway destination vector by ligating Gateway® Reading Frame Cassette B at the XhoI cloning site, and designated pDR195GTW. Expression clones were generated using Gateway® LR Clonase™ II Enzyme Mix (Thermo Fisher Scientific). Integrity of the expression cassette was confirmed by restriction analysis and by sequencing of the promoter/gene/terminator region.
Heterologous expression in yeast. The S. cerevisiae strains used in this study included the double mutant strains DEY1453 (fet3fet4) and ZHY3 (zrt1zrt2) and the corresponding parental wild type strain DY1457, and the single mutant strain HomDip-YOL122C lacking SMF1 (transOMIC). Mutant and wild type strains were transformed with the yeast expression vector pDR195GTW containing one of the six cacao TcNRAMP sequences, including the splice variant TcNRAMP5s, or with the empty vector pDR195, using a yeast transformation kit (Sigma-Aldrich) following the manufacturer's instructions. Transformants were selected on synthetic defined medium containing 6.7 g/l yeast nitrogen base without amino acids (Thermo Fisher Scientific), 1 g/l of amino acid supplement without uracil (Sigma-Aldrich) and 2% glucose, designated SD-U medium. The SD-U medium was supplemented with 100 μM of ferric chloride (FeCl3), zinc chloride (ZnCl2) and manganese sulfate (MnSO4) for the selection of DEY1453, ZHY3 and YOL122C transformants, respectively. For the complementation assay, a single yeast colony from each plate was inoculated into the liquid medium used in the selection and grown to an OD600 of 1.0. The yeast cells were pelleted by centrifugation, washed in sterile water to remove metal adsorbed to cell walls and diluted to an OD600 of 0.1. Four 10-fold serial dilutions were prepared in water and 5 μl of each were spotted on the plate. Transformed fet3fet4 cells were spotted on SD-U medium plates (pH 4.0) supplemented with 10 μM FeCl3 or 10 μM of the Fe chelator BPS. Transformed zrt1zrt2 cells were assessed on SD-U medium plates (pH 5.8) supplemented with 100 μM ZnCl2 or with 100 μM of the metal chelator ethylenediaminetetraacetic acid (EDTA) and 10 μM each of FeCl3 and ZnCl2. Transformed smf1 cells were grown on SD-U medium plates (pH 5.2) supplemented with 100 μM MnSO4 or with 12.5 mM EGTA. The growth assay of ZHY3 and DEY1453 cells was conducted in liquid low zinc medium (LZM100) 7 and low iron medium (LIM1) 45, respectively. The wild type yeast strain DY1457 transformed with the empty vector pDR195 was included as a positive control in all three assays.
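The dilution arithmetic behind the spotting assay can be sketched as below. This is only an illustration: the 1 ml working volume is a hypothetical value, and the sketch assumes that the four 10-fold dilutions are made from the OD600 0.1 suspension, which the text does not state explicitly.

```python
def volume_to_dilute(stock_od, target_od, final_volume_ml):
    """Volume of stock culture (ml) needed to obtain `final_volume_ml`
    at `target_od`, from C1 * V1 = C2 * V2."""
    return target_od * final_volume_ml / stock_od


def serial_dilutions(start_od, fold=10, steps=4):
    """Nominal OD600 of each successive `fold`-fold dilution of a
    suspension that starts at `start_od`."""
    return [start_od / fold ** step for step in range(1, steps + 1)]


# Culture grown to OD600 1.0, adjusted to OD600 0.1 in a hypothetical 1 ml.
stock_ml = volume_to_dilute(stock_od=1.0, target_od=0.1, final_volume_ml=1.0)
series = serial_dilutions(start_od=0.1)

print(f"Mix {stock_ml:.2f} ml culture with {1.0 - stock_ml:.2f} ml water")
print([f"{od:.0e}" for od in series])
```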
For the cadmium sensitivity assay, transformed wild type DY1457 cells were spotted on SD-U medium plates containing no cadmium chloride (CdCl2) or supplemented with 10 or 20 μM CdCl2. The plates were incubated at 30 °C for 3 (fet3fet4, smf1, WT) or 6 (zrt1zrt2) days before photography. For quantitative assessment of the growth response to Cd2+, 10 ml of liquid SD-U medium containing 0, 2, 5, 10, 20 or 50 μM CdCl2 was inoculated with a primary culture of the transformed DY1457 cells at an OD600 of 0.01. The cells were grown at 30 °C with shaking at 250 rpm for 72 h. The OD600 was measured on a SpectraMax i3x (Molecular Devices) microplate reader. Growth inhibition was calculated by comparing the final OD600 of the treated cultures with that of the control (no CdCl2). Pelleted cells were washed with cold 20 mM EDTA for 10 min, rinsed three times with deionized water, and dried at 70 °C for 2 days. The dried cells were digested for 8 h in 5 ml of 70% nitric acid (TraceSELECT™ grade) in closed glass vessels at 110 °C. All digestions were performed in duplicate, and for quality control, a blank and a certified plant reference material (IAEA-359 cabbage leaves) were included. The Cd2+ accumulation in the cells was determined by inductively coupled plasma mass spectrometry (Thermo Scientific™ iCAP™ Q ICP-MS).
Data analysis. Significance analysis was performed by Student's t test using SPSS software. Differences at P < 0.05 and P < 0.01 were considered significant and highly significant, respectively.
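The two calculations mentioned above, growth inhibition relative to the no-CdCl2 control and a two-sample Student's t statistic, can be sketched as follows. The exact inhibition formula is not given in the text, so the percentage form used here is an assumption; the t statistic is the textbook pooled-variance version and is not a reproduction of the SPSS procedure. All OD600 readings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def growth_inhibition(od_treated, od_control):
    """Percent reduction in final OD600 relative to the control
    (assumed formula: (1 - treated / control) * 100)."""
    return (1 - od_treated / od_control) * 100


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.
    Returns (t, degrees of freedom); the P value must be looked up separately."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical final OD600 readings for one construct.
control = [1.00, 1.10, 0.90]
treated = [0.50, 0.60, 0.40]

inhibition = growth_inhibition(mean(treated), mean(control))
t_value, df = student_t(control, treated)
print(f"Growth inhibition: {inhibition:.1f}%")
print(f"t = {t_value:.2f} with {df} degrees of freedom")
```

The resulting t value would then be compared against the t distribution with the stated degrees of freedom to obtain the P value used for the 0.05 and 0.01 thresholds.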