
Genetic diversity at the Dhn3 locus in Turkish Hordeum spontaneum populations with comparative structural analyses

Abstract

We analysed Hordeum spontaneum accessions from 21 different locations to understand the genetic diversity of HsDhn3 alleles and the effects of single-base mutations on the intrinsically disordered structure of the resulting polypeptide (HsDHN3). HsDHN3 was found to be YSK2-type, with a low-frequency 6-aa deletion at the beginning of Exon 1. Diversity was relatively high in the intron region of HsDhn3 compared with the two exon regions. We found that subtle differences in the K-segments led to changes in the chemical properties of the amino acids. Predictions of protein interaction profiles suggest the presence of a protein-binding site in HsDHN3 that coincides with the K1 segment. Comparison of DHN3 with that of closely related cereals showed that all of them contain a nuclear localization signal sequence flanking the K1 segment and a novel conserved region located between the S and K1 segments [E(D/T)DGMGGR]. We found that H. vulgare, H. spontaneum, and Triticum urartu DHN3s have a greater number of phosphorylation sites for protein kinase C than other cereal species, which may be related to stress adaptation. Our results show that the nature and extent of mutations in the conserved segments of K1 and K2 are likely to be key factors in protection of cells.

Introduction

The wild progenitor of barley, Hordeum vulgare ssp. spontaneum (C. Koch), also known as H. spontaneum in modern taxonomy, has its primary habitat in the Fertile Crescent and the Irano-Turanian region and secondary habitats in the Mediterranean and Central Asia1,2. The Southeastern region of Turkey covers the “Anatolian group”, which is one of the diversity centres for H. spontaneum3. The region, as the northern part of the Fertile Crescent, is characterized by a hot, dry climate, with an average temperature of 38 °C in August and a monthly average precipitation of 1.2 mm near the Syrian border (Turkish State Meteorological Service 2015). H. spontaneum is widely distributed in semiarid regions where the temperature range is extreme and soil types, altitudes and photoperiods are diverse4,5. Agricultural fields of barley and wheat in Southeastern Turkey are frequently occupied by H. spontaneum during severe drought seasons. These observations, together with the data, suggest an increased survival ability and adaptive capacity of H. spontaneum populations under harsh environmental conditions. Today, wild relatives and landrace accessions are more important than ever as potential resources of genetic variation for the development of drought-tolerant crops.

Dehydrins are group 2 Late Embryogenesis Abundant (LEA) proteins, which were first characterized in cotton seeds as a protein group up-regulated during maturation6,7. Dehydrins are normally expressed at low levels in cells and are induced by drought, salinity, low temperatures or exogenous application of abscisic acid (ABA)8,9,10,11. Certain types of dehydrins have also been shown to be expressed constitutively12. Six to ten Dhn genes have been characterized in Arabidopsis thaliana, and eight of them have been found to be linked by duplication, either as tandem repeats or homologous pairs13. In barley, 13 dehydrin genes (Dhn1 to Dhn13) have been identified14,15. Like most of the group 2 LEA proteins, barley dehydrins contain an S-segment of poly-serine residues near the N-terminus and a lysine-rich 15-amino-acid consensus sequence named the K-segment (EKKGIMDKIKEKLPG) near the carboxyl-terminus16. The K-segment occurs in all dehydrins, but the number of copies varies from 1 to 11 within a single polypeptide14. Dehydrin molecules also contain another conserved, tyrosine-rich sequence, known as the Y-segment ([V/T]D[E/Q]YGNP), near the N-terminus. Sequence analysis of each of the Dhn genes demonstrated that their allelic variations have originated from deletion or duplication of Φ domains (named the Φ-segments) in the K-segment and from single nucleotide polymorphisms throughout the entire gene14. The Φ-segments are variable motifs rich in polar amino acids (Gly or Ala/Pro) located between or before the K-segments. There are five different subgroups of dehydrins (YnSK2, Kn, KnS, SKn, and Y2Kn types) based on the number and position of the conserved segments17. The Dehydrin3 (Dhn3) and Dehydrin4 (Dhn4) genes are located on barley chromosome 6H as consecutive genes, while other Dhn genes are distributed on 3H, 4H, 5H and 6H14. Expression patterns of barley Dhn genes have been found to be correlated with the known regulatory element compositions in their sequences14.

Barley Dhn3 and Dhn4 genes have been reported as early responsive genes to drought and other stress factors18,19,20,21. Both genes were rapidly up-regulated in drought-tolerant barley (Hordeum vulgare L. “Chalbori”), and their transfer and over-expression in Arabidopsis conferred tolerance on this plant18. There are few studies on the expression profile of Dhn genes of H. spontaneum under drought. However, Suprunova et al.22 have clearly shown that Dhn3 expression is induced in response to drought, as in barley.

Dehydrins are hydrophilic due to the presence of polar and charged amino acids (Ala, Gly, Lys, Asp, Glu, and Ser). They do not have stable three-dimensional structures and are therefore intrinsically disordered proteins (IDPs)23. Despite having high flexibility and minimal secondary structure, IDPs are an important class of proteins, functionally related to cell signalling, transcription, and assembly of protein complexes24. As cryoprotectants, dehydrins are known to interact with membrane phospholipids, metal ions, and water during stressful conditions25,26,27,28. Recent studies also suggest that conserved motifs of dehydrins, such as K-segments, have a role in their intrinsically disordered structure and in their binding affinity to other molecules and cell membranes24,29,30. IDPs such as dehydrins often gain structural stability when bound to ligands such as membranes, and they may change their oligomeric state when bound to ions31. Because of their flexible nature, disordered regions of proteins are difficult to study experimentally by X-ray diffraction, and often by nuclear magnetic resonance (NMR) spectroscopy, so predictive tools are often used in conjunction32.

Although there have been many investigations related to structure and functions of dehydrins, the genetic diversity of individual dehydrin proteins in natural populations and its effects on the resultant polypeptide structure are not well known. In the present study, we have selected DHN3 from the dehydrin family for characterization of biochemical features in comparison to other cereal species. Additionally, we identified indels and SNPs within HsDhn3 alleles to understand the range of mutations in H. spontaneum populations of Turkish origin, which represent an important gene pool.

Materials and Methods

Plant material

H. spontaneum L. accessions collected from 21 locations, mostly in Southeastern Turkey, and Hordeum vulgare L. cv. Tokak 157/37 (TK157/37) were used in the study (see Supplementary Table S1). Seeds were germinated and grown in pots filled with soil in a growth chamber (Angelantoni, Ekochl 700) under a short-day photoperiod (8 h light/16 h dark), at a temperature of 25 °C and 50–60% relative humidity. One plant was used from each accession.

DNA extraction and isolation of Dhn3 alleles

We extracted genomic DNA from the leaves of wild barley seedlings following the CTAB method33. Specific primers (forward: 5′ AGGCAACCAAGATCAACACC 3′ and reverse: 5′ TTCTGCAAGGTAGCCAGACC 3′) were designed with Primer3 software to amplify the whole sequence of the Dhn3 gene, based on the sequence of H. vulgare cv. Dicktoo deposited in the GenBank database (AF043089.1). The specificity of the designed primers was confirmed by BLASTN analysis34. Genomic DNA amplifications were performed in a 25 μl reaction containing 0.5 U of DreamTaq DNA polymerase (Thermo Scientific EP0702), 200 μM of each dNTP, 0.4 μM of primer, 2 mM MgCl2 and 50 ng genomic DNA. Thermocycling was performed at 95 °C for 5 min, followed by 35 cycles at 95 °C for 1 min, 61 °C for 40 s, 72 °C for 1 min, and a final extension at 72 °C for 10 min. The resulting amplicons were purified using the Wizard® SV Gel and PCR Clean-Up System (Promega, USA) and cloned into the pTZ57R/T plasmid vector (Thermo Scientific, USA) according to the manufacturer’s instructions. The cloned HsDhn3 fragments were sequenced by the dideoxy chain termination method using an ABI Prism 310 Genetic Analyzer (Applied Biosystems, USA).
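As a quick sanity check on the primer pair given above, the following minimal Python sketch computes primer length and GC content. It is illustrative only and is not part of the Primer3 workflow used in the study; the function name `gc_content` is hypothetical.

```python
# Primer sequences as quoted in the text (5' to 3').
FORWARD = "AGGCAACCAAGATCAACACC"
REVERSE = "TTCTGCAAGGTAGCCAGACC"


def gc_content(primer: str) -> float:
    """Return the percentage of G and C bases in a primer (A/C/G/T only)."""
    seq = primer.upper().replace(" ", "")
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty A/C/G/T string")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(f"{name}: {len(primer)} nt, GC = {gc_content(primer):.1f}%")
```

Both primers are 20 nt long, which is consistent with a single annealing temperature being used for the pair.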

DNA sequence analysis

The sequencing chromatograms were examined with Chromas Lite 2.1.1 (Technelysium Pty Ltd, Australia) and converted to FASTA format. The vector sequences were removed using the web-based VecScreen tool. The nucleotide and predicted amino acid sequences were compared with sequences in the GenBank and EMBL databases, respectively, using BLAST. The intron was identified by alignment to the known Dhn3 CDS of cv. Dicktoo (AF043089.1). Alignments of the predicted DHN3 amino acid sequences and of the nucleotide sequences were performed using CLUSTALW235 with default parameters. We identified SNPs among the HsDhn3 DNA sequences from different genotypes using MEGA 6.06 software36.

Measurement of nucleotide diversity

Nucleotide diversity (π) between the genotypes was calculated as the average number of pairwise nucleotide differences per site between two sequences according to Nei37 (1987) using the MEGA 6.06 software. The number of unique haplotypes (h) and haplotype diversity (Hd) were measured using DnaSP version 5.10.0138.
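The two statistics described above can be illustrated with a minimal Python sketch. It assumes equal-length, pre-aligned sequences, treats every character (including a gap) as a state, and uses the standard sample-size-corrected form of haplotype diversity, Hd = n/(n − 1) · (1 − Σp²); these choices are assumptions made for illustration and may differ from the gap handling and options applied in MEGA and DnaSP. The function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a: str, b: str) -> int:
    """Count the sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def nucleotide_diversity(seqs: list[str]) -> float:
    """Average pairwise difference per site over all sequence pairs (pi)."""
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * len(seqs[0]))


def haplotype_diversity(seqs: list[str]) -> float:
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


if __name__ == "__main__":
    toy = ["ACGT", "ACGT", "ACGA", "TCGA"]  # toy alignment, not study data
    print(f"pi = {nucleotide_diversity(toy):.5f}")
    print(f"h  = {len(set(toy))}")
    print(f"Hd = {haplotype_diversity(toy):.3f}")
```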

Protein sequence analyses

We analysed the physical and chemical properties of DHN3, including molecular weight, theoretical isoelectric point (pI), instability index and hydropathicity index (according to the amino acid scale values of Kyte and Doolittle39), using the ProtParam tool from Expasy40. Putative protein kinase C (PKC) and casein kinase 2 (CK2) phosphorylation sites were predicted using NetPhosK 1.041. The protein sequences of the DHN3 variants were submitted to the IntFOLD server42,43 to generate alternative 3D models using the latest methodology44. Predictions of the intrinsically disordered (natively unstructured) regions in the sequences were generated using DISOclust45, and likely disordered and protein-binding regions were predicted with DISOPRED346,47.
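For readers unfamiliar with the hydropathicity index, the sketch below shows how a grand average of hydropathicity (GRAVY) value and a sliding-window hydropathy profile can be derived from the Kyte and Doolittle scale. It is a simplified illustration, not the ProtParam implementation; the function names and the window size of 9 are assumptions. The example input is the K-segment consensus quoted in the Introduction.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy over all residues (negative = hydrophilic)."""
    seq = seq.upper()
    if not seq or set(seq) - set(KYTE_DOOLITTLE):
        raise ValueError("sequence must contain only the 20 standard residues")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def hydropathy_profile(seq: str, window: int = 9) -> list[float]:
    """Mean hydropathy in each sliding window (window size is an assumption)."""
    seq = seq.upper()
    return [gravy(seq[i:i + window]) for i in range(len(seq) - window + 1)]


if __name__ == "__main__":
    k_segment = "EKKGIMDKIKEKLPG"  # K-segment consensus from the text
    print(f"GRAVY of K-segment consensus: {gravy(k_segment):.3f}")
```

A strongly negative value for the K-segment consensus is in line with the hydrophilic character reported for the full-length proteins.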

Public protein sequences of other cereal species

For comparison of biochemical and structural characteristics, we used public sequences of H. vulgare (AF043089.1), Triticum aestivum WZY1 (AAL50791), Triticum urartu DHN3 (EMS45466.1), Aegilops tauschii DHN3 (EMT24840), Brachypodium distachyon DHN3-like (XP003574997), Zea mays RAB17 (CAM56274.1), Sorghum bicolor DHN (AAA19693), and Oryza sativa (NP001067843) downloaded from NCBI.

Results

Sequence Diversity

The H. spontaneum Dhn3 (HsDhn3) gene typically has 486 bp of coding sequence in two exons and 439 bp of noncoding sequence (59-bp 5′UTR, 113-bp intron and 267-bp 3′UTR), like Hordeum vulgare Dhn3 (Fig. 1). We sequenced a 692-bp region of the Dhn3 allele including all the coding regions (195-bp Exon 1 and 291-bp Exon 2) and 206 bp of noncoding regions. Across all sequences, we detected a total of 29 SNPs in the coding regions and intron of HsDhn3. The variation in the intron, with one SNP every 14 bp on average, was about one-and-a-half-fold as high as in the coding regions, with one SNP every 23 bp on average. Only one gap, 18 bp in length and located in Exon 1, was observed in the sequenced region of HsDhn3. This gap was observed in only three H. spontaneum genotypes. The nucleotide diversity for the whole HsDhn3 sequence was estimated by Nei’s37 π statistic to be 0.00684. There was higher diversity in the intron region (π = 0.01247) than in the exon regions (π = 0.00290 and 0.00720 for Exon 1 and Exon 2, respectively). Sixteen haplotypes were observed across all regions of Dhn3, and the highest number of haplotypes for a single region was observed in Exon 2, with 12 haplotypes. The allelic variation was measured according to haplotype diversity (Hd; Table 1). The lowest value was found in Exon 1 (Hd: 0.458), while Exon 2 showed the highest variability (Hd: 0.824). The Hd value was smaller in the intron (0.795) than in Exon 2.
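The SNP densities quoted above can be reproduced with simple arithmetic. The short Python sketch below does so under one stated assumption: the intron SNP count (8) is not given directly and is inferred here as the 29 SNPs in the coding regions and intron minus the 21 coding-region SNPs reported in Table 1. Variable and function names are hypothetical.

```python
# Region lengths (bp) and SNP counts. The intron count of 8 is inferred
# (29 total in coding + intron, minus 21 in the coding region); it is an
# assumption of this sketch rather than a value stated in the text.
EXON1_BP, EXON2_BP, INTRON_BP = 195, 291, 113
CODING_BP = EXON1_BP + EXON2_BP
TOTAL_SNPS, CODING_SNPS = 29, 21
INTRON_SNPS = TOTAL_SNPS - CODING_SNPS


def bp_per_snp(length_bp: int, snps: int) -> float:
    """Average spacing between SNPs in a region."""
    return length_bp / snps


if __name__ == "__main__":
    intron = bp_per_snp(INTRON_BP, INTRON_SNPS)
    coding = bp_per_snp(CODING_BP, CODING_SNPS)
    print(f"coding length: {CODING_BP} bp")
    print(f"intron: one SNP every {intron:.1f} bp")
    print(f"coding: one SNP every {coding:.1f} bp")
    print(f"intron/coding SNP density ratio: {coding / intron:.2f}")
```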

Figure 1: General structure of the Dhn3 locus in H. vulgare (Choi et al. 1999) that is conserved in H. spontaneum.
Arrows show the location of primers used to amplify Dhn3 alleles in this study.

Table 1: Summary statistics of nucleotide and haplotype diversity in the coding region and conserved motifs of the Dhn3 gene of H. spontaneum.

The SNP number, nucleotide diversity, and haplotype diversity were also calculated for each sub-region of the HsDhn3 gene (Table 1). These sub-regions are highly conserved regions containing one Y-segment, one S-segment, and two K-segments. There was also a spacer, named Ksp, between the two K-segments. Ten of the 21 SNPs were observed in the Ksp region. The Ksp region also showed the highest number of haplotypes and the highest haplotype diversity. Variation in K1 and K2 occurred at one SNP every 10 and 9 bp, respectively. The nucleotide diversity of K1 (π = 0.01354) is higher than that of K2 (π = 0.01058). The lowest variation among regions was observed in Y, with π = 0.00414.

The HsDhn3 gene is GC-rich, with a GC content of 66.9%. In total, 21 SNPs were detected within the coding region of HsDhn3 (Table 1). Regarding the nature of the base changes, transitions accounted for 76.2% of the total, while transversions accounted for about 23.8% (Table 2). A/G substitutions were the most frequent, at 57.1%. Six SNPs were synonymous, while 15 SNPs were non-synonymous and led to amino acid replacements.
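The distinction between transitions and transversions can be made explicit with a small Python sketch. The classifier below follows the standard definition (purine↔purine or pyrimidine↔pyrimidine changes are transitions). The counts of 16 transitions, 5 transversions and 12 A/G changes are not stated in the text; they are inferred here from the reported percentages of 21 coding SNPs and should be read as an assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    # Counts inferred from the reported percentages (assumption of this sketch).
    transitions, transversions, total = 16, 5, 21
    print(classify_substitution("A", "G"))
    print(f"transitions:   {100 * transitions / total:.1f}%")
    print(f"transversions: {100 * transversions / total:.1f}%")
```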

Table 2: Summary of nucleotide changes in the coding region of H. spontaneum Dhn3 according to the consensus sequence.

Biochemical features and motif structure of DHN3 in H. spontaneum

The molecular weight of Hordeum spontaneum DHN3 (HsDHN3) varied from 15.72 kDa to 16.22 kDa, with 155 or 161 amino acid residues, respectively (see Supplementary Table S2 online). HsDHN3s had a number of putative protein kinase C (PKC) phosphorylation sites, varying from 9 to 11, which was similar to that of H. vulgare (see Supplementary Table S2 online). PKC sites were outside of the conserved motifs, except for a serine residue within the S-segment (Fig. 2). Aliphatic index values of HsDHN3 were predicted to range from 32.19 to 35.22 (see Supplementary Table S2 online), thus showing high protein thermostability48. The instability index showed that the HsDHN3 proteins were highly stable, with values much lower than 40 (ref. 40). All HsDHN3 variants were found to be highly hydrophilic, with GRAVY values ranging from −1.020 to −1.128, and also basic, with theoretical pIs varying from 7.99 to 8.90.

Figure 2: Multiple sequence alignment of the deduced amino acid sequences of the DHN3 proteins from H. spontaneum (9 haplotypes) and two H. vulgare genotypes (cvs TK157/37 and Dicktoo).
The conserved segments (Y-, S-, and K-segments) are shown in yellow shade. The NLS segments are denoted in red and framed by a black line. The PKC phosphorylation sites are in boldface letters and the CK2 sites are underlined. The SNPs are shown in purple. The intron position is indicated by an arrow. Asterisks (*) indicate fully conserved residues, while colons (:) and periods (.) indicate less conserved residues.

Similar to the DHN3 protein of cultivated barley, HsDHN3 is YSK2-type containing one Y-segment, one S-segment and two K-segments (Fig. 2). The Y-segment sequences (DEYGNPV) were the same as in cultivated barley14, with the exception of the LH1 variant, which contains DEYGYPV, where the amino acid Asn was replaced by Tyr. S-segments are Ser rich conserved motifs and typically described as RSGSSSSSSS14 and interrupted by an intron (Fig. 1). Although S-segments appear to be conserved in all H. spontaneum genotypes, Ser was replaced by Thr in H. vulgare cv. TK157/37 (Fig. 2).

In barley, the K segment has two 15-mer Lys-rich consensus segments RKKGLKDKIKEKLPG and EKKGIMDKIKEKLPG named the K1-segment and the K2-segment, respectively14. In the K1-segment, the amino acid Asp is replaced by Glu in the K102, K169, K394, and LK8 variants (Fig. 2). In addition, the LK8 variant included an amino acid change of Gly to Ser. In the K1 segment, another non-synonymous substitution included Lys replaced with Arg in the LH4 variant. Regarding the Φ-segments, there were conserved GHFQ, GDQQ, YGQH, and YGQQ sequences found between the Y-S and K1-K2 segments, similar to cv. Dicktoo in all HsDHN3 variants with the exception of a Cys substitution occurring at position 101 in the Φ-segment of the AA3 variant. In addition, four amino acid substitutions, Gly to Ala, Thr to Ile, Thr to Ala, and Gly to Ser, were also observed between the K1- and K2- segments at positions 111, 112, 130, and 139, respectively.
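The segment-based naming used above (for example, YSK2) can be illustrated with a minimal motif scan in Python. This is a rough sketch, not the alignment-based annotation used in the study. It makes several assumptions: the Y-segment is matched with the consensus pattern quoted in the Introduction, an S-segment is taken to be a run of at least five Ser residues, and a K-segment is any window within three mismatches of the EKKGIMDKIKEKLPG consensus. The label is always assembled in Y, S, K order. The toy sequence is built from motifs quoted in the text and is not a real DHN3 protein.

```python
import re

K_CONSENSUS = "EKKGIMDKIKEKLPG"            # K-segment consensus from the text
Y_PATTERN = re.compile(r"[VT]D[EQ]YGNP")   # Y-segment consensus from the text
S_PATTERN = re.compile(r"S{5,}")           # assumption: run of >= 5 Ser


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_k_segments(seq: str, max_mismatches: int = 3) -> list[int]:
    """Start positions (0-based) of non-overlapping K-segment-like windows."""
    hits, i, width = [], 0, len(K_CONSENSUS)
    while i + width <= len(seq):
        if hamming(seq[i:i + width], K_CONSENSUS) <= max_mismatches:
            hits.append(i)
            i += width          # skip past the hit to avoid overlaps
        else:
            i += 1
    return hits


def dehydrin_type(seq: str) -> str:
    """Build a label such as 'YSK2' from Y-, S- and K-segment counts."""
    seq = seq.upper()

    def part(letter: str, n: int) -> str:
        return "" if n == 0 else (letter if n == 1 else f"{letter}{n}")

    return (part("Y", len(Y_PATTERN.findall(seq)))
            + part("S", len(S_PATTERN.findall(seq)))
            + part("K", len(find_k_segments(seq))))


if __name__ == "__main__":
    # Toy sequence assembled from motifs quoted in the text (not a real protein).
    toy = "MVDEYGNPGG" + "RSGSSSSSSS" + "AA" + "RKKGLKDKIKEKLPG" + "GTGHG" + K_CONSENSUS
    print(dehydrin_type(toy))
```

The mismatch threshold of three was chosen so that the K1 sequence quoted above (RKKGLKDKIKEKLPG) is still recognised against the K2 consensus; a different threshold may be needed for more divergent dehydrins.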

HsDHN3 is hydrophilic due to the presence of K-segments (Fig. 3A). Additionally, the region between positions 40 and 150 was found to be both hydrophilic and disordered in all HsDHN3 variants (Fig. 3B). At position 145, a Lys is replaced by an Arg in TR4982, which results in increased hydrophilicity (Fig. 3A).

Figure 3: Structural characteristics of DHN3 in H. spontaneum genotypes and H. vulgare cvs. TK157/37 and Dicktoo.
(A) Hydrophobicity values according to Kyte-Doolittle (1982). (B) Disorder probability predicted by DISOclust via the IntFOLD server. (C) Probability of protein binding amino acids predicted using DISOPRED3.

Comparison of DHN3 sequences in cereal species

We compared the predicted DHN3 proteins from H. spontaneum with those of other closely related cereals in terms of their general biochemical properties (Table 3). All DHN3 proteins were YSK2-type, with the exception of T. urartu DHN3 (YSK-type). The number of amino acids varied from 154 (S. bicolor) to 183 residues (B. distachyon), while molecular weights ranged between 15.73 kDa (A. tauschii) and 18.93 kDa (T. urartu). All DHN3 proteins were stable, with an instability index (II) under 40, with the exception of T. urartu (41.12). The most basic protein among the DHN3s was T. urartu DHN3, with a pI of 10.22. The number of predicted phosphorylation sites varied from 4 to 15 for PKC and from 1 to 5 for CK2 in DHN3s (Table 3). All the DHN3 proteins were found to be highly hydrophilic, with GRAVY values ranging from −0.946 to −1.145. H. spontaneum variants contain on average 55.5% charged and polar amino acids. The most frequent amino acid is Gly, a non-polar residue, which constitutes 26.7% of the amino acid content. The frequencies of Cys and Phe are less than 1%. Cys, a rare amino acid in dehydrins, was found in only one H. spontaneum variant, AA3 (Fig. 2). Trp residues were not detected in any of the DHN3 proteins (see Supplementary Table S3 online).

Table 3: Biochemical characteristics of DHN3 protein from H. spontaneum and the closely related cereals listed in Materials and Methods section.

Comparisons of the predicted DHN3 proteins in different cereal species indicate that the Y-, S-, and K-segments are highly conserved sequences (Fig. 4). A consensus motif of [V/T]D[E/Q]YGNP (the Y-segment), located near the N-terminus was found in all cereal DHN3s. The Val, the first amino acid of the Y segment, was replaced by Ile and Leu in T. urartu and S. bicolor, respectively. In addition, the E/Q to V substitution is also present in O. sativa DHN3. The S-segment (RSGSSSSSS) was conserved intact with an extra Ser in T. urartu, A. tauschii and O. sativa.

Figure 4: Multiple alignments of the predicted amino acid sequences of the DHN3 protein from H. spontaneum (consensus sequence), along with the ortholog proteins in other cereal species.
The conserved segments are shown in yellow shade. The NLS segments are denoted in red and framed by a black line. A novel conserved sequence is shown in blue shade. Amino acid substitutions in conserved regions (Y-, S-, and K-segments) are denoted in blue. Asterisks (*) indicate fully conserved residues, while colons (:) and periods (.) indicate less conserved residues.

The NLS peptide (RRKK), located just upstream from the K1 segment (first K-segment), was found in all cereal DHN3s (Fig. 4). Although the K1 segment (RKKGIKDKIKEKLPG) was found to be highly conserved in all cereals, some amino acid substitutions were discovered within it. The non-polar amino acid Ile was replaced with Leu and Met, which are also non-polar. The positively charged Lys was substituted with the non-charged Gly in the S. bicolor DHN3 protein. In addition, there was another amino acid replacement between Asp and Glu in B. distachyon, Z. mays, S. bicolor and O. sativa. Although DHN3s have two highly conserved, Lys-rich segments named K1 and K2 in all other cereals, the K2-segment occurring at the C-terminus (EKKGIMDKIKEKLPG) was not detected in the T. urartu DHN3. Two substitutions were discovered in the K2 segments at the same amino acid position: an Ile residue was replaced by Leu and Phe in B. distachyon and O. sativa, respectively. Another highly conserved region, [E(D/T)DGMGGR], not previously reported, was discovered between the S-segment and the K1-segment in all cereal DHN3s (Fig. 4). Only a single amino acid replacement, Asp to Thr, was found in the S. bicolor DHN3 for this conserved sub-sequence.

Structural predictions for DHN3 variants

As expected, the 3D models predicted by the IntFOLD server strongly suggest that all HsDHN3 variants are mostly unstructured. No high-quality globular 3D models were obtained; all models were highly variable and most of the models obtained were neither folded nor compact. The DISOclust results from the IntFOLD server, shown in Fig. 3B, also confirmed the extent of the intrinsic disorder for each of the variants, due to the large variations in the 3D locations of residues across the multiple alternative 3D models. The results in Fig. 3C and Supplementary Fig. S1 indicate that the putative protein binding regions are in the first 10–15 residues and the last 10–15 residues, with a peak in the region around residue ~80. Intrinsically disordered regions in proteins often coincide with protein binding sites, and the latest version of the DISOPRED method provides confidence scores for protein binding residues. Differences in protein binding profiles were observed between, for example, cultivated barley (TK157/37) and TR4982, with varying peak sizes occurring in the K1-segment around amino acid position ~80 (see Supplementary Fig. S1). The varying confidence scores indicate that the SNPs may affect the putative protein binding function of HsDHN3. Interestingly, the protein binding regions coincide with point mutations in the KxKIxEKLPx subsequence and at the C-terminal end of the K-segment (Fig. 3C).

Discussion

Dehydrins play a fundamental role in the response of plants to different abiotic stresses, especially dehydration, salinity and low temperatures, by accumulating in vegetative tissues. They are the best-investigated group within the LEA proteins, with characterized multilocus families including ten members in Arabidopsis13, eight members in rice49, fifty-four unigenes in wheat50, and thirteen members in barley14,15. Dehydrins are characterized by the presence and copy number of several conserved motifs named the K-, S-, and Y-segments. DHN3 from wild barley is YSK2-type and structurally highly similar to that of cultivated barley. Interestingly, an 18-bp deletion in Exon 1 was found in only three out of 21 H. spontaneum genotypes: TR4982 (Çanakkale), TR47002 (İzmir) and TR49085 (Adıyaman). Differences in polypeptide size as a result of the indel were previously reported in H. vulgare cv. Himalaya and cv. Dicktoo14. Dehydrins are known to be located in different compartments of the cell, including the cytoplasm, nucleus, mitochondria, chloroplast, and the vicinity of the plasma membrane31. YSK2-type dehydrins have both cytoplasmic and nuclear localizations but are mostly found in the nucleus51,52. Goday et al.51 also reported the in vitro interaction of a maize DHN5 homolog, RAB17, with an SV40 NLS signal for its import to the nucleus. In our study, an “RRKK” motif, postulated as a nuclear localization signal (NLS), was found just upstream from the first K-segment of nine cereal DHN3s (Fig. 4). Further experimental data are needed to confirm the functionality of the NLS sequence as well as the exact localization of DHN3s in barley.

In general, cereal DHN3s were found to be stable proteins, except that of T. urartu (with an instability index of 41.12). The presence of only one K-segment in T. urartu DHN3 (TuDHN3), in contrast to other dehydrins, may have a negative effect on protein stability. On the other hand, TuDHN3 was also the most basic protein, with the highest molecular weight (18.93 kDa). T. urartu is a wild diploid wheat and the progenitor species of the A genome of bread wheat. Despite the sparse T. urartu literature, LEA proteins have recently been found to be associated with cold tolerance in this species53. The NetPhosK 1.0 program predicted that HsDHN3 might be specifically phosphorylated by protein kinase C at 9–11 sites, which were mainly Thr residues. YnSKn-type DHNs are predominantly phosphorylated by protein kinase C group proteins, rather than CK2s29. Compared to other cereal species, H. vulgare and H. spontaneum had one of the highest occurrences of PKC phosphorylation sites, second only to T. urartu. In particular, phosphorylation by protein kinase C at K-segments has been found to be associated with the membrane binding functions of dehydrins29. Therefore, both HsDHN3 and TuDHN3 are good candidates for investigating membrane-dehydrin interactions. Amino acid changes led to the occurrence of a new phosphorylation site in two H. spontaneum accessions, LK8 and K169, by replacement of Thr at position 112. Brini et al.52 found that the phosphorylation pattern in wheat DHNs was related to abiotic stress tolerance. In particular, higher phosphorylation indicated higher tolerance to drought and salinity. This suggests that the extra phosphorylation site may play a role in the drought tolerance of LK8 and K169.

The amino acid composition of HsDHN3 showed a high proportion of Gly residues (26.7%), which, together with the lack of a hydrophobic core and other factors, confers flexibility on the protein. Moreover, 55.5% of HsDHN3 amino acids were polar amino acids with hydrophilic character, and this was also supported by the GRAVY results, with calculated values ranging from −1.020 to −1.128. In general, DHN3s are Gly-rich proteins and are known from the literature to be deficient in Trp and Cys. We have found a Cys residue in the H. spontaneum variant AA3. Among the cereal DHN3s, S. bicolor and T. urartu contained one Cys (0.6%) and one Trp (0.6%) residue. Intrinsically disordered proteins are also significantly depleted in Cys and Trp54, typically containing less than 1%, compared with the average folded protein in the Protein Data Bank. In general, His residues are rarely found in proteins and constitute approximately 2% of the amino acid content55,56. Nevertheless, dehydrins contain a higher proportion of His residues. For example, His content ranged from 3.2% to 13.5% in Arabidopsis DHNs56. We have found that HsDHN3s were relatively His-rich proteins containing 8.1% His residues. Moreover, conserved His residues were adjacent to both K-segments, forming Gly-His (GH) or Gln-His (QH) motifs in HsDHN3. Eriksson et al.29 reported that the ionization state of His residues flanking the K-segments modulates the affinity of dehydrins for cellular membranes in a pH-dependent manner. His residues were not concentrated as motifs in cereal DHN3s, in contrast to what was reported earlier for a Citrus dehydrin56.
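Percent composition values such as those discussed above are straightforward to compute from a protein sequence. The Python sketch below is a minimal illustration; the function name is hypothetical, and the example input is the K-segment consensus quoted in the Introduction rather than a full HsDHN3 sequence, so its output does not reproduce the percentages reported in the text.

```python
from collections import Counter


def amino_acid_composition(seq: str) -> dict[str, float]:
    """Return the percentage of each residue type present in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


if __name__ == "__main__":
    comp = amino_acid_composition("EKKGIMDKIKEKLPG")  # K-segment consensus
    for aa, pct in sorted(comp.items(), key=lambda item: -item[1]):
        print(f"{aa}: {pct:.1f}%")
    # Residues absent from the sequence (e.g. Trp, Cys) are simply missing keys.
    print("Trp present:", "W" in comp)
```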

The HsDHN3 protein variants are all likely to be intrinsically disordered (natively unstructured), with likely protein binding sites near the N- and C-termini and around residue 80. The C-terminal site and the site around residue 80 also coincide with predictions of alpha helices, and so these regions may undergo a disorder-to-order transition on protein binding. Importantly, the point mutations are observed to occur within these protein binding regions, and therefore the amino acid substitutions in these sequence variants may affect protein interactions. Similarly, the protein binding regions coincide with point mutations in the KxKIxEKLPx subsequence and in the C-terminal K-segment, a region highly conserved in plants27. Disordered regions often become ordered on binding, so it is of interest to predict secondary structures to determine whether local structures may form during protein-protein interactions. The specific nature of these interactions is not yet known, although the predicted helices are unlikely to form coiled-coil interactions according to results obtained from PCOILS. DHNs are also known to interact with lipids, membranes, metal ions, water, ice and DNA31. They function as cryoprotectants and have binding properties that allow chaperone activities. However, the exact mechanism and details of these interactions are not completely clear. Recently, dehydrin-dehydrin binding has been demonstrated in two plant species57,58. Yeast two-hybrid assays confirmed that K-segments and His residues are required for dimerization of Opuntia DHN158. In our study, the peaks shown in Fig. 3C and Supplementary Fig. S1 indicated disordered residues that may fold or become ordered upon protein binding. These regions are therefore likely to be the dimerization sites in the HsDHN3 variants.

In this study, we report the genetic structure and diversity of near-complete Dhn3 alleles from native H. spontaneum plants. By taking advantage of the additional data, we were able to compare predicted DHN3 sequences with those of other closely related cereals, which allowed us to distinguish polymorphisms and motif structures. Most of the SNPs identified occurred in non-coding and inter-segment positions and resulted in non-synonymous mutations. However, point mutations in several variants have resulted in amino acids with opposite chemical properties, as seen in the substitution of Met (sulphur-containing, hydrophobic) for Lys (basic, polar, and positively charged), or Gly (aliphatic and non-polar) for Ser (non-aromatic, hydroxyl-containing, polar). Dehydrins, as IDPs, are not globular folded molecules; however, they are proposed to be rich in functionality because of their flexibility and modularity. Bioinformatics tools such as IntFOLD, DISOclust and DISOPRED are well suited to deducing the nature and the functional properties of plant IDPs, and can act as a guide for further experimental studies. From the predictions in our work, we have shown the potential availability of at least one likely protein binding site in barley DHN3. Furthermore, point mutations within the conserved sequences in H. spontaneum variants affected the predicted protein binding profile. Our results may contribute to future experimental designs to resolve the interactions of barley DHNs with known and undetermined ligands, which lead to their diverse functions in plant cells as cryoprotectants and chaperones.

Additional Information

How to cite this article: Uçarlı, C. et al. Genetic diversity at the Dhn3 locus in Turkish Hordeum spontaneum populations with comparative structural analyses. Sci. Rep. 6, 20966; doi: 10.1038/srep20966 (2016).

Acknowledgements

This work was supported by the Scientific Research Projects Coordination Unit of Istanbul University, Projects No. 4712 and No. 30853. The authors thank Dr. Aydın Alp (Dicle Univ.) for providing H. spontaneum seeds of AA1, AA2 and AA3, and Dr. Haluk Ertan for his valuable contributions to the sequence data analyses.

Author information

Affiliations

  1. Department of Molecular Biology and Genetics, Faculty of Science, Istanbul University, Vezneciler 34134, Istanbul, Turkey

    • Cüneyt Uçarlı
    • Süleyman Çaputlu
    • Andres Aravena
    • Filiz Gürel

  2. School of Biological Sciences, University of Reading, Whiteknights, Reading RG6 6AS, UK

    • Liam J. McGuffin

Contributions

C.U. carried out the experiments and analyzed the DNA and protein sequence data; L.J.M. performed the structural bioinformatics and edited the manuscript; A.A. examined the protein data and reviewed the manuscript; S.Ç. assisted with the experiments; F.G. designed the study, managed the overall project and wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Filiz Gürel.

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/