Introduction

Bread wheat (Triticum aestivum) is a rich source of starch (~ 70%)1, which consists of amylose (~ 25%) and amylopectin (~ 75%)2. High amylose starch has nutritional benefits because it is classified as type-2 resistant starch (RS). Resistant starch is digested slowly in the human gut and therefore has a low glycemic index. It is beneficial for combating gastrointestinal diseases, diabetes and obesity3,4,5,6. In wheat, the natural variation for high amylose starch is narrow. It can be broadened by genetic engineering of the key pathway genes, but progress has been limited: overexpression and knock-down/knock-out of the two key genes, GBSSI and SBEII, have had limited success7,8,9. High amylose lines can also be developed through mutation breeding10. Many studies to date have described post-translational aspects of the starch biosynthesis pathway11,12,13, but ubiquitin-mediated post-translational regulation of this pathway has not yet been revealed, and no attempt has been made to exploit it for high amylose starch.

Ubiquitin-mediated post-translational modification that targets proteins for degradation has emerged as a crucial process controlling many cellular processes in eukaryotes14,15. It involves the sequential action of three enzymes: E1 (ubiquitin-activating enzyme), E2 (ubiquitin-conjugating enzyme) and E3 (ubiquitin ligase)16. In this multistep process, E3 ubiquitin ligases are the key determinants of target specificity for degradation by the 26S proteasome system17,18. They are mainly classified into two groups, HECT (Homologous to the E6-associated protein C terminus) and RING (Really Interesting New Gene) finger/U-box domain ligases17,19. Of the known E3 ligases, RING finger domain containing proteins make up the major proportion. RING domains are considered to be involved in protein–protein interactions and to be essential for E2-dependent ubiquitination, and RING proteins can function as single subunits or in multi-subunit complexes20,21,22. RING/U-box-type E3 ligases form the largest class of E3 ligases, with > 600 members in human; their functions are diverse and they act as allosteric activators of the E2. Initially, RING domains were characterized as RING-H2 or RING-HC type on the basis of a His or Cys, respectively, at the fifth metal ligand position. Five further modified RING types, RING-C2, RING-S/T, RING-v, RING-D and RING-G, which vary in the amino acid residues at the metal ligand positions, were also identified in A. thaliana23,24.

The roles of RING finger E3 ligases in plant development have been studied extensively for various biological processes. Well-known RING finger containing proteins include COP1, a photomorphogenic repressor25; XBAT35.2, involved in cell death induction and pathogen response26; HOS1 in cold response27; KEG in growth and development28; Capsicum annuum CaAIRF1 in ABA and drought signaling29; and ATL2/ATL9 in defense response30. Recent studies revealed that SP1, a RING-type E3 ligase, is involved in the degradation of TOC (translocon) components present in the chloroplast outer membrane and hence regulates chloroplast protein import31. However, the role of this degradation pathway in the turnover of amyloplast proteins remains unclear. In addition, the RING finger E3 ligase GW2 has been found to be involved in grain size and weight in rice and wheat, possibly by regulating the expression of starch biosynthetic pathway genes32,33,34. Reducing GW2 transcripts by RNAi in a durum wheat cultivar increased starch content by 10–40%, width by 4–13% and surface area by 3–5%, which suggests an active role of the GW2 RING finger E3 ligase in starch biosynthesis, although its interacting partners remain to be explored35. These previous studies provide a substantial basis for the current study.

In this study, we report the genome-wide identification and characterization of the RING domain containing E3 ligase family in wheat, and we identify putative RING E3 ligases that may be involved in high amylose biosynthesis based on quantitative gene expression and correlation analysis in developing wheat grains. A large set of 1255 proteins containing 1272 RING domains was identified in wheat for the first time, and 10 candidate RING protein genes were found to be potentially involved in high amylose biosynthesis and significantly associated with two starch biosynthesis genes, GBSSI and SBEIIa. Further, transcriptome sequencing using next-generation sequencing identified several unique induced mutations in 698 RING protein genes. Hence, this study lays the foundation for future research toward a better understanding of the involvement of RING finger E3 ligases in high amylose starch biosynthesis in cereal crops.

Results

Identification and classification of RING finger proteins in wheat

A total of 1272 potential RING domains in 1255 proteins were identified in the wheat proteome through in silico analysis (Supplementary Table S1). Among the identified proteins, 1241 contained a single RING domain, 12 contained two RING domains, and one protein each contained three and four domains. For proteins with multiple RING domains, a letter suffix was added to the gene ID. The predicted RING domain containing proteins ranged in size from 84 to 4749 amino acids, with domain sizes ranging from 30 to 102 amino acids. On the basis of the amino acid residues at the eight metal ligand positions (Cys and/or His) and the number of residues between them, the 1272 RING domains were classified into four groups according to Stone et al. (2005). The largest group was RING-H2 with 875 domains (68.79%), followed by RING-HC with 323 domains (25.39%), RING-v with 67 domains (5.27%) and RING-G with 7 domains (0.55%) (Supplementary Table S1, Supplementary Fig. S1). Representative sequence logos of the protein motifs of the four groups are shown in Fig. 1. Among the 875 RING-H2 domains, six were identified as RBX type, having Asp instead of Cys at the eighth metal ligand position (Supplementary Fig. S1). We did not identify in wheat the RING-D, RING-C2 and RING-S/T type domains that are present in Arabidopsis, Brassica rapa, apple and rice. More RING containing proteins were identified in wheat than in other plant species such as Arabidopsis (469), rice (488), apple (663), B. rapa (715) and B. oleracea (734)24,36–39. Additionally, 117 RING domains in 94 proteins were identified in silico, but because one or more amino acid residues at the metal ligand positions were absent or substituted, they could not be assigned to any group and were considered incomplete RING domains (Supplementary Table S2). Only complete RING protein genes were considered for the downstream analysis.
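
The counts reported above can be cross-checked with a few lines of Python. This is only an illustrative consistency check on the numbers given in this section; the variable names are ours and are not part of the original analysis pipeline.

```python
# Counts as reported in the text; all names here are illustrative only.
proteins_by_domain_count = {1: 1241, 2: 12, 3: 1, 4: 1}
domains_by_type = {"RING-H2": 875, "RING-HC": 323, "RING-v": 67, "RING-G": 7}

# Number of proteins, and number of domains implied by the per-protein counts.
total_proteins = sum(proteins_by_domain_count.values())
total_domains = sum(n * count for n, count in proteins_by_domain_count.items())

# Share of each RING type among all complete RING domains, in percent.
type_share = {
    name: round(100 * count / total_domains, 2)
    for name, count in domains_by_type.items()
}

print(total_proteins, total_domains)
print(type_share)
```

The per-protein domain counts and the per-type domain counts both add up to the same number of domains, which is what the percentages in the text are based on.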

Figure 1

Sequence logos of the overrepresented motifs found in RING-H2, RING-HC, RING-v and RING-G domains, respectively, generated using the online WebLogo program version 2.8.2 (https://weblogo.berkeley.edu/logo.cgi). The height of each letter indicates the conservation at that position. Asterisked letters indicate the conserved metal ligands and zinc-coordinating amino acid pairs.

Phylogenetic analysis

To expand the knowledge of wheat RING proteins, we performed a phylogenetic analysis of the 1272 RING domain sequences from the 1255 wheat RING proteins. Most domains of the same type (RING-H2, RING-HC, RING-v and RING-G) clustered together in various small subgroups (Supplementary Fig. S2), although some were intermixed with domains of other types. This topology might reflect divergence of the domain sequences, apart from the conserved metal ligands, during evolution40. RING-H2 and RING-HC domains tended to be interspersed with each other, forming several subgroups containing small numbers of domains. The RING-G and RING-v domains clustered together but lay within the major RING-H2 and RING-HC domain groups.

Additional domains in RING finger proteins

Out of the 1255 RING finger proteins, 885 (70.52%) contained one or more additional domains other than RING (Supplementary Table S3). On the basis of the presence/absence and organization of the additional domains, the 1255 RING domain containing proteins were grouped into 36 major groups (Table 1; Supplementary Table S3). Of the 885 proteins, 544 (group 6.1) contained an additional transmembrane domain (1–14 repeats) and 43 (group 3) contained only a coiled-coil domain (1–6 repeats). The Zinc-Ribbon-9, signal peptide and VWA domains were present in 26, 19 and 18 RING domain containing proteins of groups 13.1, 21.1 and 11.1, respectively. Some RING finger proteins had two or more other domains along with the aforementioned domains. In combination with the transmembrane domain, other domains, namely GIDE, IBR, PA, VWA, zinc_ribbon_9, Tmemb_185A, coiled-coil and signal peptide, were also present in RING domain containing proteins. The protein–protein interaction domain VWA was found together with the Vwaint domain in 20 RING proteins. The Vwaint domain was not previously found in Arabidopsis RING finger proteins24 but was later identified in Brassica39. Two domains, Copine and NB-ARC, were found specifically in wheat RING proteins and have not previously been found in those of any other plant species. Four domains involved in ubiquitination, ZnF_UBP, GIDE, RWD and CUE, were also identified. The nucleic acid binding domains DEXDc, HELICc, HIRAN, ZNF_C2H2, ZNF_C3H1 and ZNF_C2HC were identified in 27 RING finger proteins. The protein–protein interaction domains Ankyrin repeats, BRCT, TPR, VWA, coiled-coil and WD40 repeats, which help in substrate recognition by E3 ligases41,42,43, were also identified. Two protein–protein interaction domains, PHD and ZnF_ZZ, which belong to the same cross-brace zinc finger family as the RING domain, were also identified. The presence of these various additional domains indicates that, apart from the RING finger domain, these proteins have different features and are very distinct from each other.

Table 1 Wheat RING domain containing proteins grouped based on the presence or absence and organization of the additional domain/s in RING proteins. The possible wheat RING protein orthologs in other species are shown.

Chromosomal locations and duplication events analysis

The chromosomal locations of the 1255 RING protein genes were extracted from the T. aestivum genome annotation downloaded from Ensembl Plants (ftp://ftp.Ensemblgenomes.org/pub/plants/release-46/gff3/triticum_aestivum). Of the 1255 genes, 17 mapped to unassembled chromosomal locations and were excluded from this analysis. The remaining 1238 wheat RING protein genes were mapped onto the wheat genome and were distributed across all 21 chromosomes (1A, 1B, 1D to 7A, 7B, 7D) (Supplementary Table S4). The fewest RING protein genes were located on chromosome 4B (3.5%) and the most on 7D (6.7%) (Supplementary Fig. S3). The number of genes in each RING class (RING-H2, RING-HC, RING-v, RING-G) was also determined: RING-H2, RING-HC and RING-v type genes were distributed across all 21 chromosomes, whereas RING-G genes were present only on chromosomes 1B, 1D, 4B and 5A (Fig. 2). The average distance between two RING protein genes ranged from 7.6 Mb on chromosome 1D to 15.6 Mb on chromosome 4B. These results indicate that RING protein genes are unevenly distributed across the wheat genome. In addition, 4 genes were identified as singletons (not duplicated), and 49, 15, 90 and 1080 genes were assigned to dispersed, proximal, tandem and WGD/segmental duplication events, respectively (Supplementary Table S4). Whole genome/segmental, tandem and dispersed (except on 2A and 6D) duplicated genes were distributed all over the genome. Singletons were found only on 1B, 3D, 4B and 6D, while proximal duplicates were found on 1A, 1B, 1D, 2A, 3A, 5A, 5D, 7A and 7D. Whole genome/segmental duplicated genes from each chromosome are displayed in Fig. 3.
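
As a quick sanity check, the duplication categories reported above can be tallied against the number of mapped genes. The sketch below uses only the counts given in the text; the variable names are illustrative, and the derived percentage is our own arithmetic rather than a figure from the study.

```python
# Duplication categories and gene counts as reported in the text.
duplication_counts = {
    "singleton": 4,
    "dispersed": 49,
    "proximal": 15,
    "tandem": 90,
    "WGD/segmental": 1080,
}

# 1255 RING protein genes minus the 17 on unassembled locations.
mapped_genes = 1255 - 17
categorized_genes = sum(duplication_counts.values())

# Share of mapped genes assigned to WGD/segmental duplication, in percent.
wgd_share = round(100 * duplication_counts["WGD/segmental"] / mapped_genes, 2)

print(mapped_genes, categorized_genes, wgd_share)
```

The five categories account for every mapped gene, and WGD/segmental duplication covers by far the largest share of them.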

Figure 2

Histogram showing the distribution of RING protein genes of the four identified RING domain groups across the 21 chromosomes (Ta1A to Ta7D), based on chromosomal locations from the Ensembl Triticum aestivum database. Ta stands for Triticum aestivum and is followed by the chromosome name.

Figure 3

Chromosomal locations and whole genome/segmental duplication events of 1238 RING protein genes on the 21 chromosomes of hexaploid wheat. The suffix after ‘chr’ represents the wheat chromosome name (chr1A to chr7D), and each chromosome is shown in a different color.

Expression profiling during grain development

To explore the potential role of RING domain containing proteins during wheat grain development, gene expression values (TPM, transcripts per million) of the 1255 RING protein genes at three developmental stages, 2, 14 and 30 DAA, were retrieved from a publicly available wheat expression database. Of the 1255 RING protein genes, 698 were expressed at one or more of the three stages, suggesting a potential role during grain development. The 557 genes that were not expressed at any of the three stages were considered not relevant to grain development and were excluded from further analysis. Of the 698 RING protein genes, 306 were expressed at all three developmental stages. Three patterns involved expression at exactly two stages: (i) 2 and 14 DAA, (ii) 14 and 30 DAA, and (iii) 2 and 30 DAA, comprising 22, 3 and 156 genes, respectively. Stage-specific expression was also found, with 152, 5 and 54 genes expressed only at 2, 14 and 30 DAA, respectively (Fig. 4). To analyze the expression patterns of the 698 RING protein genes, a clustered heat map was generated (Fig. 5, Supplementary Fig. S4). Based on expression pattern, the genes were clustered into four major groups (Group I, II, III and IV), with Group III and Group IV each divided into two subgroups (III A, III B and IV A, IV B) (Fig. 5). Group I included 123 genes with high expression levels mainly at 2 DAA. A total of 194 genes with nearly identical expression at 2 DAA and 30 DAA were clustered in Group II. Group III A comprised 167 genes that were expressed mainly at 30 DAA rather than 2 DAA. Group III B included 66 genes that were expressed at 14 and 30 DAA but with lower expression values. Group IV A contained 133 genes expressed at 2 DAA and 14 DAA, and Group IV B included 15 genes expressed specifically at 14 DAA. Overall, this expression analysis showed that the majority of RING protein genes had higher expression at the initial (2 DAA) and late (30 DAA) stages of seed development than at the mid stage (14 DAA). This points to the involvement of RING protein genes in all stages of seed development and to their functional diversity.
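
The stage-wise partition described above (all three stages, exactly two stages, or a single stage) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the gene names and TPM values are made up, and the cutoff for calling a gene "expressed" (TPM > 0 here) is an assumption, since the text does not state one.

```python
from collections import Counter

STAGES = ("2 DAA", "14 DAA", "30 DAA")


def expression_pattern(tpm_by_stage, threshold=0.0):
    """Return the tuple of stages at which a gene is called expressed.

    `threshold` is an assumed cutoff; the text does not specify one.
    """
    return tuple(s for s in STAGES if tpm_by_stage.get(s, 0.0) > threshold)


# Hypothetical TPM values for four made-up genes (not real data).
toy_tpm = {
    "geneA": {"2 DAA": 5.1, "14 DAA": 2.3, "30 DAA": 7.8},
    "geneB": {"2 DAA": 3.0, "14 DAA": 0.0, "30 DAA": 1.2},
    "geneC": {"2 DAA": 0.0, "14 DAA": 0.0, "30 DAA": 4.4},
    "geneD": {"2 DAA": 0.0, "14 DAA": 0.0, "30 DAA": 0.0},
}
pattern_counts = Counter(expression_pattern(v) for v in toy_tpm.values())

# Consistency check on the gene counts reported in the text.
reported = {
    "all three stages": 306,
    "exactly two stages": 22 + 3 + 156,
    "single stage": 152 + 5 + 54,
}
print(pattern_counts)
print(sum(reported.values()))
```

An empty tuple corresponds to a gene with no detectable expression at any stage, that is, the kind of gene excluded from further analysis.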

Figure 4

Venn diagram showing the number of commonly and uniquely expressed RING finger containing genes at three seed developmental stages (2, 14 and 30 DAA). The data were taken from the publicly available RNA-seq databases of wheat (https://wheat.pw.usda.gov/GG3/node/237). DAA stands for days after anthesis.

Figure 5

Expression profiles of 698 wheat RING protein genes at three seed developmental stages (2, 14 and 30 days after anthesis, DAA). The data were extracted from an RNA-seq data set (https://wheat.pw.usda.gov/GG3/node/237). The clustered expression heat map was generated with MeV software version 4.9.0 and divided into four major groups according to gene expression patterns, represented as a color gradient. The color scale shows the largest gene expression values in pink, intermediate values in black and the smallest values in green.

Further, to examine the putative role of the 698 RING protein genes in amylose biosynthesis during grain development, differential gene expression analysis was performed using in-house transcriptome data of the high (‘TAC 75’) and low (‘TAC 6’) amylose mutant lines along with the parent (‘C 306’). Differentially expressed genes (DEGs) were analyzed in two groups: Group 1 (‘TAC 75’ vs ‘C 306’) and Group 2 (‘TAC 6’ vs ‘C 306’). Genes were considered significantly differentially expressed at |log2FC| > 1 and p-value ≤ 0.05. Under these criteria, fifteen genes were significant: nine in Group 1 and six in Group 2 (Supplementary Table S5). In Group 1, TraesCS6B02G286500 and TraesCS1A02G206100 showed significant (seven fold) up- and down-regulation, respectively, while in Group 2 TraesCS2A02G099300 showed a significant 11 fold down-regulation.
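
The significance criteria can be expressed as a small filter. In this sketch the fold-change cutoff is read as applying to the magnitude of log2FC, because both up- and down-regulated genes are reported; that reading, the function name and the toy records are all assumptions made for illustration.

```python
def is_significant_deg(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Apply the thresholds from the text: magnitude of log2FC > 1, p <= 0.05."""
    return abs(log2_fc) > fc_cutoff and p_value <= p_cutoff


# Hypothetical records (gene label, log2FC, p-value); not real data.
toy_records = [
    ("gene_up", 2.8, 0.001),
    ("gene_down", -3.4, 0.020),
    ("gene_small_change", 0.6, 0.001),
    ("gene_not_significant", 2.1, 0.300),
]

significant = [name for name, fc, p in toy_records if is_significant_deg(fc, p)]
print(significant)
```

Only the first two toy records pass both thresholds; the third fails on fold change and the fourth on p-value.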

The 15 differentially expressed RING protein genes in Group 1 and Group 2 might have a role in amylose biosynthesis. To test this, all 15 DEGs were further validated by qRT-PCR in the amylose mutant lines (‘TAC 75’ and ‘TAC 6’) and the parent variety (‘C 306’). Because RING protein genes are not ubiquitously expressed throughout grain development, an additional 21 highly expressed RING protein genes mined from the publicly available database (http://www.wheat-expression.com/) were also included in the qRT-PCR-based expression analysis.

Analysis and characterization of RING protein gene variants in mutant lines

Genotypic variants in ‘TAC 75’ and ‘TAC 6’ were identified relative to the parent variety (‘C 306’) and were distributed across the wheat genomes (A, B and D). This analysis revealed a total of 457 variants in 202 RING protein genes in ‘TAC 75’ and 667 variants in 250 RING protein genes in ‘TAC 6’. In both mutants, the highest number of mutations was located on chromosome 2B and the lowest on 4D. The mutations included single nucleotide polymorphisms (SNPs), insertion-deletions (InDels) and multiple nucleotide polymorphisms (MNPs) at genic, intergenic, and 3′ and 5′ UTR positions. The predicted effects of the variants were categorized as high, moderate, low or modifier; 26 mutations in ‘TAC 75’ and 23 in ‘TAC 6’ had a high predicted effect (Supplementary Table S6). The largest share of mutations fell within intergenic regions (29.1% and 28.64% in ‘TAC 75’ and ‘TAC 6’, respectively), followed by upstream gene variants (26.04% and 28.19%) and intronic variants (11.6% and 15.7%); these variants leave the amino acid sequence unaltered. Conservative, disruptive, splice, synonymous, missense and frameshift variants were comparatively less frequent in both ‘TAC 75’ and ‘TAC 6’ (Fig. 6), but some had deleterious effects on the proteins. The total numbers of identified variants are listed in Table 2. Mutant characterization indicated the presence of EMS-induced mutations in RING protein genes that are specific to the high (‘TAC 75’) or low (‘TAC 6’) amylose mutant line. A previous study from our lab6 also characterized the two mutant lines for the key amylose biosynthesis enzymes GBSSI and SBEII and established their amylose contents.

Table 2 Total number of variants identified in different variant types in mutant lines ‘TAC 75’ and ‘TAC 6’.

Figure 6

Percent distribution of different genetic variants in mutant lines ‘TAC 75’ and ‘TAC 6’.

qRT-PCR validation of candidate genes during seed development

A total of 36 RING protein genes, 15 from the in-house transcriptome data and 21 from the publicly available wheat expression database, were selected for qRT-PCR validation along with two key regulators of amylose biosynthesis (GBSSI and SBEIIa). The gene-specific primers used in the study are listed in Supplementary Table S7. Gene expression was analyzed in the high amylose (‘TAC 75’) and low amylose (‘TAC 6’) mutant lines and the parent variety (‘C 306’) at four developmental stages (7, 14, 21 and 28 DAA), chosen to cover the stages of endosperm development. Differential gene expression was analyzed in two combinations, Group 1 (‘TAC 75’ vs. ‘C 306’) and Group 2 (‘TAC 6’ vs. ‘C 306’). All 36 RING protein genes showed variable patterns of differential expression among the genotypes (Supplementary Table S8, Fig. 7). Most of the RING protein genes were down-regulated in the high amylose line compared with the parent (Group 1). Of the 36 RING protein genes, eight were consistently down-regulated in the high amylose line at all four developmental stages (Fig. 7). Four of these eight genes (TraesCS6D02G254700, TraesCS2D02G162600, TraesCS3B02G000800, TraesCS5A02G049400) were significantly down-regulated (|log2 FC| > 2) at the later stages (21 and 28 DAA) of seed development, whereas the remaining four (TraesCS6B02G301900, TraesCS3B02G461900, TraesCS6D02G278100, TraesCS7B02G118700) were significantly down-regulated during the early stages (7 and 14 DAA) (Fig. 8). These results suggest that the eight RING protein genes down-regulated in the high amylose line might act as negative regulators of high amylose biosynthesis.
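
For readers unfamiliar with how a log2 fold change is obtained from qRT-PCR data, the sketch below uses the widely used 2^(-ΔΔCt) approach. This section does not state which quantification method was applied, so the method, the assumption of ideal amplification efficiency, the function name and the Ct values are all illustrative assumptions rather than a description of the study's procedure.

```python
def log2_fold_change(ct_target_mutant, ct_ref_mutant,
                     ct_target_parent, ct_ref_parent):
    """Log2 fold change (mutant vs. parent) under the 2^(-ddCt) assumption.

    Each Ct is a mean threshold cycle; 'ref' is a reference (housekeeping)
    gene. Assumes the amount of product doubles every cycle.
    """
    delta_ct_mutant = ct_target_mutant - ct_ref_mutant
    delta_ct_parent = ct_target_parent - ct_ref_parent
    delta_delta_ct = delta_ct_mutant - delta_ct_parent
    # Fold change is 2 ** (-ddCt), so its log2 is simply -ddCt.
    return -delta_delta_ct


# Hypothetical Ct values (not real data): the target gene crosses the
# threshold later in the mutant than in the parent, i.e. it is less abundant.
example = log2_fold_change(
    ct_target_mutant=26.0, ct_ref_mutant=18.0,
    ct_target_parent=23.5, ct_ref_parent=18.0,
)
print(example)
```

A negative value indicates down-regulation in the mutant relative to the parent, and a positive value indicates up-regulation.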

Figure 7

Differential gene expression (log2 fold change) of 36 RING finger protein genes in the three genotypes ‘TAC 75’, ‘TAC 6’ and parent ‘C 306’ at four seed developmental stages (7, 14, 21 and 28 days after anthesis, DAA), determined by qRT-PCR. Differential gene expression was analyzed in two groups: (A) Group 1 (‘TAC 75’ vs. ‘C 306’) and (B) Group 2 (‘TAC 6’ vs. ‘C 306’). ‘TAC 75’ and ‘TAC 6’ are high and low amylose mutant lines derived from the parent variety ‘C 306’; the amylose contents of ‘TAC 75’, ‘TAC 6’ and ‘C 306’ are ~ 65%, ~ 7% and ~ 26%, respectively. The 11 candidate RING protein genes are marked in red (down-regulated in the high amylose line) and blue (up-regulated in the high amylose line).

Figure 8

Bar graphs showing the differential expression, determined by qRT-PCR, of the 11 candidate RING protein genes for amylose biosynthesis along with GBSSI and SBEIIa at (A) 7 DAA, (B) 14 DAA, (C) 21 DAA and (D) 28 DAA in ‘TAC 75’ vs. ‘C 306’ and ‘TAC 6’ vs. ‘C 306’. ‘TAC 75’ and ‘TAC 6’ are high and low amylose mutant lines derived from the parent variety ‘C 306’; the amylose contents of ‘TAC 75’, ‘TAC 6’ and ‘C 306’ are ~ 65%, ~ 7% and ~ 26%, respectively. The results were obtained using three technical and three biological replicates and are expressed as mean ± SD.

On the other hand, six RING protein genes were consistently up-regulated in the high amylose line at all four developmental stages (Fig. 7). Two of these six genes (TraesCS4A02G112900, TraesCS1A02G341400) showed log2 FC > 2 and were at the same time down-regulated in the low amylose line relative to the parent (Group 2) (Fig. 8). The RING protein gene TraesCS6B02G286500 was very strongly up-regulated at the later seed developmental stages (21 and 28 DAA) in Group 1 (Fig. 8). These three RING protein genes were up-regulated when amylose biosynthesis was high and might therefore act as positive regulators of amylose biosynthesis.

At the same time, the expression of GBSSI and SBEIIa, two key regulatory enzymes of starch biosynthesis, was also analyzed. GBSSI was highly up-regulated and SBEIIa highly down-regulated in the high amylose line (Group 1), whereas the opposite pattern was found in the low amylose line (Group 2) (Fig. 8). The higher expression of GBSSI and lower expression of SBEIIa in the high amylose line might explain its high amylose starch biosynthesis. The 11 candidate RING protein genes identified above (eight down-regulated and three up-regulated) might regulate GBSSI and SBEIIa via ubiquitin-mediated post-translational modification, either directly or through the modulation of other regulatory factors that have not yet been studied. RING proteins are involved in the ubiquitin-mediated degradation pathway and can play a positive or negative role in the regulation of a biosynthetic pathway44. Thus, both the down-regulation and the up-regulation of RING protein genes might contribute to high amylose by targeting GBSSI and SBEIIa, respectively.

Correlation analysis of RING protein genes with starch pathway genes (Pearson’s correlation)

The correlation between the normalized gene expression data of the 36 RING protein genes and the two starch pathway genes GBSSI and SBEIIa was analyzed. GBSSI and SBEIIa are key enzymes regulating amylose and amylopectin biosynthesis, respectively7,9, so statistical correlation with these key regulatory genes is essential for exploring the role of RING protein genes in starch biosynthesis. The pairwise correlation analysis revealed that 20 of the 36 RING protein genes (55.6%) were negatively correlated with GBSSI. Of these twenty, one gene (TraesCS2D02G162600) showed a very strong (r2 ≥ 0.80) and four genes (TraesCS5A02G049400, TraesCS6D02G254700, TraesCS6A02G274400, TraesCS3B02G000800) showed a strong (r2 ≥ 0.60) significant negative correlation with GBSSI at p-value ≤ 0.05. Only two genes, TraesCS4A02G112900 (r2 = 0.84) and TraesCS6B02G286500 (r2 = 0.78), showed a significant positive correlation with GBSSI at p-value ≤ 0.05 (Fig. 9). These seven RING protein genes, five with strong negative and two with strong positive correlation, might be involved in the regulation of GBSSI expression.
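
The pairwise analysis can be illustrated with a plain-Python Pearson correlation and the strength bands used above. This is a sketch under stated assumptions: "r2" is read as the squared coefficient, the expression values are invented, the function names are ours, and no p-value is computed here (a statistics package would be needed for that).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_label(r):
    """Label direction and strength using the bands quoted in the text."""
    r_squared = r * r  # assumption: 'r2' in the text means r squared
    direction = "negative" if r < 0 else "positive"
    if r_squared >= 0.80:
        strength = "very strong"
    elif r_squared >= 0.60:
        strength = "strong"
    else:
        strength = "weaker"
    return f"{strength} {direction}"


# Hypothetical normalized expression values at four stages (not real data).
ring_gene = [-1.2, -1.8, -2.5, -3.1]
gbssi = [0.9, 1.6, 2.2, 3.0]
print(correlation_label(pearson_r(ring_gene, gbssi)))
```

Working with the sign of r for direction and with its square for strength keeps the two pieces of information reported in the text separate.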

Figure 9

Pairwise Pearson's correlation analysis of 10 RING protein genes with the two starch pathway genes GBSSI and SBEIIa. Significant values of the correlation coefficient (r) are labeled with *p ≤ 0.05, **p ≤ 0.01 and ***p ≤ 0.001.

The correlation of the 36 RING protein genes with SBEIIa showed 16 genes to be negatively correlated and 20 to be positively correlated (Fig. 9). Four genes (TraesCS4A02G112900, TraesCS6B02G286500, TraesCS4B02G164000 and TraesCS1A02G341400) showed a significant strong negative correlation with SBEIIa (r2 ≥ 0.60). In contrast, two genes (TraesCS2D02G162600, TraesCS3B02G000800) showed a very strong (r2 ≥ 0.80) and two other genes (TraesCS6D02G254700, TraesCS3B02G461900) a strong (r2 ≥ 0.60) significant positive correlation with SBEIIa at p-value ≤ 0.05. These eight RING protein genes, four each with strong negative and positive correlation with SBEIIa, might be involved in the regulation of SBEIIa expression.

From the above correlation analysis, five RING protein genes (out of 36) had a significant negative or positive correlation with both GBSSI and SBEIIa; hence, in total, 10 RING protein genes were correlated with GBSSI and/or SBEIIa (Fig. 9). These results suggest the possible involvement of these RING protein genes in the regulation of both GBSSI and SBEIIa at the protein level, through post-translational modification via direct or indirect regulation.
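
The count of 10 candidate genes follows from the two gene lists given in the correlation paragraphs: seven genes correlated with GBSSI, eight with SBEIIa, and five shared between them. The short sketch below reproduces that set arithmetic; the variable names are illustrative.

```python
# Gene IDs as listed in the two correlation paragraphs. TraesCS4A02G112900
# is assumed to be the same gene in both lists.
gbssi_correlated = {
    "TraesCS2D02G162600", "TraesCS5A02G049400", "TraesCS6D02G254700",
    "TraesCS6A02G274400", "TraesCS3B02G000800",  # negative with GBSSI
    "TraesCS4A02G112900", "TraesCS6B02G286500",  # positive with GBSSI
}
sbeiia_correlated = {
    "TraesCS4A02G112900", "TraesCS6B02G286500",
    "TraesCS4B02G164000", "TraesCS1A02G341400",  # negative with SBEIIa
    "TraesCS2D02G162600", "TraesCS3B02G000800",
    "TraesCS6D02G254700", "TraesCS3B02G461900",  # positive with SBEIIa
}

correlated_with_both = gbssi_correlated & sbeiia_correlated
correlated_with_either = gbssi_correlated | sbeiia_correlated

print(len(correlated_with_both), len(correlated_with_either))
```

In these lists, each of the five shared genes has opposite signs of correlation with GBSSI and SBEIIa, in line with the opposite expression trends of the two enzymes in the high amylose line.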

Discussion

Many genetic analyses have identified large numbers of RING proteins as positive or negative regulators of various biological processes. Given their importance in plants, RING proteins were first identified in A. thaliana, where RING protein genes account for ~ 36% of the ~ 1400 E3 ligases24,45,46. In the present study, by exploring the hexaploid wheat genome (Triticum aestivum L., 2n = 6x = 42, AABBDD), we identified 1272 potential RING domains in 1255 RING finger proteins (Supplementary Table S1). These RING domains were classified into four major groups, RING-H2 (875), RING-HC (323), RING-v (67) and RING-G (7) (Fig. 1), showing that most of the identified RING domains are of the RING-H2 (68.79%) and RING-HC (25.39%) types. Previous studies in A. thaliana have shown the presence of ~ 2% RING protein genes with eight types of RING domains: RING-H2 (241), RING-HCa (145), RING-HCb (41), RING-D (10), RING-G (1), RING-S/T (4), RING-v (25) and RING-C2 (10)24. We were unable to identify in the wheat genome some of the RING domain types previously identified in Arabidopsis and other plant species. We were also unable to divide the RING-HC domains into RING-HCa and RING-HCb; all were of the RING-HCa type, as in rice37. Our analysis of additional domains in wheat RING finger proteins identified many protein–protein interaction, nucleic acid binding and ubiquitin binding domains in different organizations that have previously been identified in Arabidopsis, apple, B. rapa and B. oleracea24,36,37,38,39. Of the RING proteins, 29.48% had only a RING domain, whereas 70.52% had one or more additional domains (Supplementary Table S3). The transmembrane domain was the domain most frequently found together with the RING domain, indicating that these proteins are integral membrane proteins. The Copine and NB-ARC domains identified in wheat RING domain containing proteins are involved in lipid binding and nucleotide binding, respectively47,48. In wheat, multiple domains of unknown function (DUFs), DUF1117, DUF4792, DUF4793, DUF1232 and DUF3675, were identified, whereas only DUF1117 was found in Arabidopsis; in most cases the DUF domains occurred in combination with transmembrane domains (Supplementary Table S3). The presence of additional domains and their different organizations in RING proteins suggests roles in different biological processes that might be specific to different organisms.

Phylogenetic clustering of the RING finger domains reflected their evolutionary relationships. Domains of all four RING types (RING-H2, RING-HC, RING-v, RING-G) clustered with their respective groups, with some exceptions. The intermixing of some domains with those of other types might be due to variation in the protein sequences, apart from the conserved metal ligands, during evolution. This shows that, despite belonging to different RING types, the domains share similarities that point to a common origin. To look further into the evolutionary history of RING domain proteins, orthologous RING proteins from different species could be included in the phylogeny, which would give a clearer view of their origin and their structural and functional characteristics.

Furthermore, we determined the genomic locations of the RING protein genes and their gene duplication events within the three homeologous genomes (A, B and D) of wheat. Wheat RING protein genes were mapped onto all 21 chromosomes. The largest share of the RING protein genes was located on chromosome 7D (6.7%) and the smallest on 4B (3.5%) (Supplementary Fig. S3). Wheat (Triticum aestivum) is an allohexaploid (2n = 6x = 42, genome AABBDD) that originated through two separate hybridization events 10,000 years ago. Hence, wheat contains three genomes from closely related species and is assumed to carry triplicate copies of genes, with chromosome doubling having followed hybridization to maintain fertility49,50. We identified 4 singleton (not duplicated) genes and 49, 15, 90 and 1080 RING protein genes as dispersed, proximal, tandem and WGD/segmental duplicated genes, respectively. The RING protein genes on homeologous chromosomes 3 and 7 were whole genome/segmental duplicated within their A, B and D genomes, and no collinearity with other chromosomes was found. The genes on homeologous chromosome 1 (1A, 1B and 1D) were collinear within the chromosome group and also showed extensive collinearity with homeologous chromosome 3 (3A, 3B and 3D) (Fig. 3). These results indicate that whole genome duplication (WGD)/segmental duplication has played the major role in the expansion of the RING protein gene family in wheat.

Analysis of the RNA-seq data revealed that, of the 1255 RING protein genes, 557 showed no expression at any of the three developmental stages and were therefore considered unlikely to play a role in seed development. The 698 RING protein genes that were expressed at one or more seed developmental stages (2, 14 and 30 DAA) may play a role in the developing wheat seed. Of these, 43.84% were expressed at all three stages, while the remainder showed stage-specific expression (Fig. 4). The different expression patterns and levels of the RING protein genes indicate functional diversity, with varying degrees of involvement in biological processes. Determining the roles of these RING proteins could have a great impact on the control of seed quantity and quality.

Further, the 698 seed-specific RING protein genes were analyzed for their involvement in amylose biosynthesis. In seeds, amylose is the main component of resistant starch, a healthy starch that helps combat gastrointestinal diseases, diabetes and obesity and also plays an important role in food processing. Expression analysis of the in-house transcriptome data of amylose variants, together with qRT-PCR data for 36 wheat RING protein genes, revealed that RING protein genes were expressed at both early and late developmental stages, suggesting a putative role throughout seed development (Supplementary Table S8, Fig. 7). Differential gene expression analysis of the 36 RING protein genes in the amylose variant genotypes also revealed 8 strongly down-regulated and 3 up-regulated genes in the high amylose line. These RING protein genes might act as positive or negative regulators of amylose biosynthesis by functioning as E3 ligases in ubiquitin-mediated post-translational modification. Ubiquitin-mediated post-translational modifications are involved in various biological processes, including stress51, growth28, immunity52, photosynthesis53 and hormone signaling54, but their direct role in starch biosynthesis is not yet known. A study in wheat has shown a role for TaGW2, a RING E3 ligase, in seed size and weight; its authors hypothesized that the increased seed weight might result from higher accumulation of starch controlled by TaGW233. That study suggests only a probable role of RING proteins in starch biosynthesis, which needs to be explored. Therefore, to explore the role of RING proteins in the starch biosynthetic pathway, the expression of the 36 RING protein genes was tested for statistical correlation with that of GBSSI and SBEIIa, two key enzymes regulating amylose and amylopectin biosynthesis, respectively7,9.
The correlation analysis of the 36 RING protein genes against GBSSI and SBEIIa expression data identified 10 RING protein genes that were strongly correlated with GBSSI and SBEIIa. These results point to the involvement of RING protein genes in regulating the transcript levels, and in turn the protein levels, of GBSSI and SBEIIa, and hence to a possible effect on amylose biosynthesis, either through direct regulation or through negative and positive regulators of these enzymes. This study provides substantial information on RING protein genes, their correlation with starch biosynthetic genes, and potential targets for amylose regulation in wheat.

Conclusion

In the present study, we identified 1255 RING proteins in wheat through in silico approaches. The RING protein genes were distributed across the whole wheat genome, with duplications throughout. Most of the variation in, and expansion of, the RING protein gene family in wheat arose through several duplication events, which have contributed to the functional diversification of RING proteins. Expression analysis of 698 RING protein genes across seed developmental stages revealed their possible involvement in seed development. The mutations identified in RING protein genes could underlie the amylose variation in the wheat mutants. Interestingly, this study suggests that RING E3 ligases might act as positive or negative regulators of amylose biosynthesis, providing knowledge that is useful for grain quality enhancement. Overall, this study should help narrow the knowledge gap concerning ubiquitin-mediated post-translational modification of amylose regulatory enzymes and other seed functional traits.

Materials and methods

Plant materials

Two mutant lines (M6 generation), ‘TAC 75’ (amylose content ~ 65%) and ‘TAC 6’ (amylose content ~ 7%) developed via EMS mutagenesis along with their parent wheat variety, ‘C 306’ (amylose content ~ 26%) were used in this study10.

Genome-wide identification of RING domain containing proteins in wheat

To identify putative wheat RING finger domain containing proteins, two strategies were followed. In the first strategy, RING finger proteins of Oryza sativa37 and A. thaliana24 were used as the query data set to identify potential wheat orthologs in the Ensembl T. aestivum protein database (RefSeq 1.1) (https://plants.ensembl.org/Triticum_aestivum/Info/Index). The second strategy used InterProScan to identify proteins containing the RING [ZnF-RING (IPR001841)] and Pfam [zf-C3HC4 (PF00097)] domains in the same T. aestivum protein database55. To confirm the presence of RING domains in the non-redundant protein sequences retrieved by both strategies, all sequences were analyzed with the Simple Modular Architecture Research Tool (SMART) database (http://smart.embl-heidelberg.de/) integrated with the Pfam domain database. RING proteins were further confirmed by manual inspection of each protein sequence for the presence of all eight conserved zinc-coordinating metal ligands (Cys or His) and for the spacing between the metal ligands. The identified RING domain containing proteins were classified into groups based on the amino acid residues at the metal ligand positions and the number of residues between the metal ligands24. Proteins confirmed by the SMART database but lacking one or more metal ligands were not assigned to any group and were classified as incomplete RING domain containing proteins. Sequence logos of the multiple domain alignments for each representative group were created using the online WebLogo tool (https://weblogo.berkeley.edu/logo.cgi).
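The metal-ligand inspection described above was done manually in this study; a minimal Python sketch of how such a check could be pre-screened automatically is given here for illustration only. The consensus spacing used in the pattern (C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C/H-X2-C-X(4-48)-C-X2-C) is a commonly cited canonical RING consensus and is an assumption, not the exact criterion applied by the authors. The function name `classify_ring` is hypothetical, and only the RING-HC and RING-H2 types are covered.

```python
import re

# Assumed canonical RING consensus (not taken from this study):
# C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-[C/H]-X2-C-X(4-48)-C-X2-C
# The capture group holds the fifth metal ligand (Cys or His).
RING_PATTERN = re.compile(
    r"C.{2}C.{9,39}C.{1,3}H.{2,3}([CH]).{2}C.{4,48}C.{2}C"
)


def classify_ring(protein_seq):
    """Return 'RING-HC', 'RING-H2', or None for a protein sequence.

    RING-HC has Cys and RING-H2 has His at the fifth metal ligand
    position. None means no complete canonical domain was matched.
    """
    match = RING_PATTERN.search(protein_seq.upper())
    if match is None:
        return None
    return "RING-HC" if match.group(1) == "C" else "RING-H2"


if __name__ == "__main__":
    # Synthetic toy sequence, not a real wheat protein.
    toy = "CAAC" + "A" * 10 + "CAHAA" + "H" + "AAC" + "A" * 10 + "CAAC"
    print(classify_ring(toy))
```

A sequence for which this returns `None` is not necessarily free of a RING domain: modified types such as RING-v or RING-C2 use other ligand residues and spacings, so any hits or misses would still need manual confirmation.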

Multiple sequence alignment and phylogenetic analysis of RING domains

The identified RING domains were extracted from their respective protein sequences and aligned in Clustal X (Version 2.1; http://www.clustal.org/clustal2/). Pairwise alignment used a gap opening penalty of 40 and a gap extension penalty of 0.8 with the PAM350 protein matrix, with all other settings at default. The aligned RING domain sequences were then manually edited in BioEdit software to correctly align the eight metal ligand residues and were used for phylogenetic analysis in MEGA X software (Version 10.1). The phylogenetic tree was generated by the Neighbor-Joining (NJ) algorithm with 1000 bootstrap replicates to assess the support for the tree. Evolutionary distances were inferred using the Jones–Taylor–Thornton (JTT) model, and gaps were handled by pairwise deletion.
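To illustrate what pairwise deletion means for gap handling, the sketch below computes a simple proportion-of-differences (p) distance between two aligned sequences, skipping any column in which either sequence has a gap. This is a deliberately simplified stand-in: the study used JTT-corrected distances in MEGA X, not raw p-distances, and the function name is hypothetical.

```python
def p_distance_pairwise_deletion(seq_a, seq_b, gap="-"):
    """Proportion of differing residues between two aligned sequences.

    Columns where either sequence carries a gap are dropped for this
    pair only (pairwise deletion); other pairs may still use them.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped column for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns shared by this pair")
    return differences / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real RING domains.
    print(p_distance_pairwise_deletion("CKHC-C", "CAHCAC"))
```

Under complete deletion, by contrast, a column with a gap in any sequence would be removed for every pair, which can discard much of a variable-length domain alignment.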

Identification of additional RING domains

To identify other domains present in the RING finger proteins, the SMART database was used with the Pfam domain and signal peptide options enabled. RING containing proteins were classified into groups based on the presence or absence and the organization of the additional domains identified. Species with the same protein domain architectures as the wheat RING domain containing proteins were identified using NCBI BLASTp searches restricted to model organisms.

Chromosomal location of wheat RING protein genes

The chromosomal locations of the wheat RING protein genes were retrieved from the GFF file of the T. aestivum genome downloaded from Ensembl Plants (ftp://ftp.ensemblgenomes.org/pub/plants/release-46/gff3/triticum_aestivum). Genes that were not mapped to any chromosome were excluded from the analysis. The chromosomal locations of the genes were drawn onto the 21 T. aestivum chromosomes using Circos software (Version 0.69-9). All duplication categories in the RING protein genes, namely dispersed, proximal, tandem, whole genome (WGD)/segmental and singleton (not duplicated), were predicted with the MCScanX tool56. WGD/segmental duplications are shown in a comparative synteny analysis of the RING protein genes across all wheat chromosomes, drawn with Circos software.
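The per-chromosome tally behind such a map can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the standard nine-column, tab-separated GFF3 layout and that gene identifiers appear in the attributes column as `ID=gene:<identifier>`, and the function names and toy identifiers are hypothetical.

```python
from collections import Counter


def count_genes_per_chromosome(gff_lines, gene_ids):
    """Count the genes of interest on each sequence of a GFF3 file."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue  # keep gene features only
        attributes = dict(
            item.split("=", 1) for item in fields[8].split(";") if "=" in item
        )
        # Assumed ID format: "gene:<identifier>"; strip the prefix if present.
        gene_id = attributes.get("ID", "").split(":", 1)[-1]
        if gene_id in gene_ids:
            counts[fields[0]] += 1
    return counts


def percent_per_chromosome(counts):
    """Convert raw counts into percentages of all mapped genes."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {chrom: 100.0 * n / total for chrom, n in counts.items()}


if __name__ == "__main__":
    toy = [
        "##gff-version 3",
        "7D\tsrc\tgene\t1\t100\t.\t+\t.\tID=gene:G1",
        "4B\tsrc\tgene\t1\t50\t.\t+\t.\tID=gene:G2",
    ]
    print(percent_per_chromosome(count_genes_per_chromosome(toy, {"G1", "G2"})))
```

Genes placed on unanchored scaffolds would appear under their scaffold name in the first column and would need to be filtered out to match the exclusion described above.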

Expression analysis of RING containing genes in developing wheat seeds

The expression profiles of the wheat RING containing genes were analyzed at three seed developmental stages (2, 14, and 30 DAA) in previously reported RNA-seq data57 using the RefSeq1.1 nucleotide database in the wheat expression browser (http://www.wheat-expression.com/). Normalized transcripts per million (TPM) values were used to analyze the expression pattern at the different developmental stages. A clustered heat map showing the expression profile of the wheat RING containing genes was generated with MeV software58. Hierarchical clustering was performed using the uncentered Pearson correlation distance metric and the complete linkage clustering method. Additionally, differential expression of the RING protein genes was analyzed using the in-house transcriptome data of the mutant lines (‘TAC 75’ and ‘TAC 6’) and the parent (‘C 306’) in three biological replicates. The whole transcriptome data were analyzed with CLC Genomics Workbench (CLC GWB, Version 20). High-quality paired-end reads were aligned against the annotated wheat genome reference assembly (IWGSC RefSeq1.0). Differential expression analysis was performed using normalized RPKM (reads per kilobase of exon model per million reads) values; genes with a p-value ≤ 0.05 were considered significant and were used for downstream analysis. The differential expression values of the RING containing genes were obtained with DESeq2 (v1.22.1), and genes with a fold change > 2 were retained for further analysis. RING containing genes were selected on the basis of their expression values in the public and in-house transcriptome data and were used for qRT-PCR validation in the mutant lines along with the parent.
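The uncentered Pearson correlation distance used for the clustering can be written out explicitly. The sketch below assumes the usual definition, one minus the uncentered correlation (equivalent to one minus the cosine similarity of the two expression vectors); the exact scaling applied internally by MeV may differ, and the function name and TPM values are hypothetical.

```python
import math


def uncentered_pearson_distance(x, y):
    """Return 1 - uncentered Pearson correlation of two expression profiles.

    Unlike the ordinary Pearson correlation, the means are not
    subtracted, so profiles are compared relative to zero expression.
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of samples")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("distance is undefined for an all-zero profile")
    return 1.0 - dot / (norm_x * norm_y)


if __name__ == "__main__":
    # Hypothetical TPM values at three stages for two genes.
    gene_a = [1.0, 2.0, 3.0]
    gene_b = [2.0, 4.0, 6.0]
    print(uncentered_pearson_distance(gene_a, gene_b))
```

Two genes whose profiles differ only by a constant scaling factor have a distance of (approximately) zero under this metric, so clustering groups genes by the shape of their expression pattern rather than by absolute level. Genes with zero TPM at every stage have no defined distance and would have to be removed before clustering.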

RING protein genes variant identification in mutant lines

For variant analysis of the RING protein genes in the mutants ‘TAC 75’ and ‘TAC 6’, the Illumina-generated reads were used; high quality (Q30) reads were assembled and mapped against the wheat reference genome (Ensembl release 49) using the BWA-MEM tool (version 0.7.17; Paolacci et al.59). The resulting alignment file (.bam) was used for variant identification with Samtools, using the parameters described by Paolacci et al.59. Genetic variants such as SNPs, insertions, deletions and MNPs were annotated against the wheat reference genome using the SnpEff (version 4.3t) tool25. In-house Perl scripts were used to analyse the distribution of variants (SNPs and Indels) in the mutants ‘TAC 6’ and ‘TAC 75’ among the 698 identified RING protein genes possibly involved in grain development.

qRT-PCR validation of candidate RING protein genes during seed development

Genome-specific primers for the candidate RING protein genes, the limiting factor genes of starch biosynthesis (GBSSI and SBEIIa)10, and wheat ADP-Ribosylation Factor (ARF) as an internal control60 were designed using OligoCalc (http://biotools.nubic.northwestern.edu/OligoCalc.html). RNA extracted from primary individual spikes of three biological replicate samples was used for qRT-PCR analysis. Spikes were tagged on the first day of anthesis and harvested at 7, 14, 21, and 28 days after anthesis (DAA), frozen in liquid nitrogen, and stored at −80 °C until further use. RNA was isolated using the Trizol method with minor modifications20, and cDNA was synthesized using the iScript gDNA Clear cDNA Synthesis Kit (Bio-Rad, USA). Three biological replicates, each with three technical replicates, were used for qRT-PCR analysis with 1:10 diluted cDNA and Fast SYBR Green Master Mix, as per the manufacturer’s instructions, in a 7500 Fast Real-Time PCR System.

Statistical analysis

To analyze the relationship between the wheat RING protein genes and the key regulatory genes of starch metabolism (GBSSI, SBEIIa), pairwise Pearson’s correlation coefficients (r) and their significance tests were computed using GraphPad Prism (version 5), with a significance threshold of p-value ≤ 0.05. The correlation analysis used the mean normalized Ct values at the four seed developmental stages.
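The correlation test itself can be sketched without any statistics package. The example below computes Pearson's r and the corresponding t statistic, t = r·sqrt((n − 2)/(1 − r²)), and compares it with a two-tailed critical value. It assumes n = 4 paired values (one mean normalized Ct per developmental stage), for which the critical t at p = 0.05 with 2 degrees of freedom is 4.303 from standard tables. The Ct values and function names are hypothetical, and this is not a reproduction of the GraphPad Prism workflow.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two series of equal length with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def is_significant(r, n, t_critical=4.303):
    """Two-tailed test; the default critical value assumes n = 4, p = 0.05."""
    return abs(t_statistic(r, n)) >= t_critical


if __name__ == "__main__":
    # Hypothetical mean normalized Ct values at 7, 14, 21 and 28 DAA.
    ring_gene = [1.0, 2.0, 3.0, 4.0]
    gbssi = [1.0, 3.0, 2.0, 4.0]
    r = pearson_r(ring_gene, gbssi)
    print(r, is_significant(r, len(ring_gene)))
```

With only four time points the test is demanding: |r| must be at least about 0.95 to reach p ≤ 0.05, which is why only strongly correlated gene pairs would pass such a threshold.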

Ethics declarations

All experiments were performed in accordance with relevant institutional guidelines.