Introduction

Glutathione-S-transferases (GSTs) are enzymes encoded by a ubiquitous gene family in aerobic species, able to conjugate electrophilic xenobiotics and endogenous cell components with glutathione (GSH)1. GSTs in plants are composed of two subunits with a molecular mass of around 25–29 kD2.

Plant GSTs were first identified in Zea mays for their involvement in defense against herbicide damage3. The importance of GSTs in herbicide tolerance has been demonstrated by expressing maize GSTs in tobacco plants: the treated plants showed greater herbicide tolerance than untreated tobacco plants4. GSTs can also detoxify endogenous cell components. For example, Bronze 2 in maize has been shown to be involved in anthocyanin transport into cytoplasmic vacuoles5. A similar behavior has been reported for An9 in Petunia hybrida6, TT19 in Arabidopsis thaliana7, PGSC0003DMG400016722 in Solanum tuberosum8 and DQ198153 in Citrus sinensis, cultivar Moro nucellare9, suggesting that GSTs probably act in the last step of the anthocyanin biosynthetic pathway10, when these molecules are transported to the vacuole.

GSTs are also important for preventing heavy metal damage, since they facilitate the storage of heavy metals in the vacuole. In particular, a truncated isoform of the protein encoded by Bronze 2 in maize has a high affinity for heavy metals11. Moreover, GSTs may take part in hydrogen peroxide detoxification12.

GSTs have a high affinity for auxins and cytokinins, which suggests that they are important for hormone homeostasis and for plant defense against pathogens2,13. Indeed, Solanum tuberosum plants infected with the pathogen Phytophthora infestans showed a rapid increase in prp1-1 GST content, accompanied by an increase in intracellular auxin levels, suggesting that both phenomena are associated with defense against infection13.

Initially, plant GSTs were classified into four categories, type I, II, III and IV, based on amino acid sequence identity and on the conservation of gene structure14,15. This classification was later revised into seven GST classes: six cytoplasmic classes (Tau, Phi, Zeta, Theta, Lambda and Dhar) and one microsomal class (Mapeg)2,16.

The Tau and Phi classes are considered plant-specific and are the most represented in terms of number of sequences16. In 2016, however, Munyampundu et al. demonstrated that the Phi class is also present in bacteria, fungi and protists. Members of the Tau and Phi classes bind a wide range of xenobiotics16 or endogenous cell components17. These enzymes function as glutathione peroxidases (GPOXs), as flavonoid-binding proteins6,7,8,9, and as stress-signaling proteins18. Moreover, the expansion of the Tau class appears to be associated with plant adaptation to life on land19.

The Zeta class is linked to tyrosine degradation, catalyzing the GSH-dependent conversion of maleylacetoacetate to fumarylacetoacetate. The Theta class is similar to the corresponding mammalian class9 and is present in bacteria, insects, plants, fish, and mammals20.

The Lambda and Dhar classes were identified by comparing the human Omega GSTs against the Arabidopsis genome17.

Finally, the Mapeg class includes the microsomal GSTs, with transferase and peroxidase activities21.

Recently, six more GST classes have been identified in plants: TCHQD, EF1Bγ, URE2p, Omega-like, Iota and Hemerythrin19. Members of the URE2p class were found in Physcomitrella patens, in Selaginella moellendorffii and in bacteria, probably because of horizontal gene transfer events in bacteria, while the Iota GST class was found only in Physcomitrella patens and in Selaginella moellendorffii19. Hemerythrin GSTs are non-heme iron-binding proteins found in metazoans, prokaryotes, protozoans, and fungi22, which act in heavy metal detoxification by catalyzing the conjugation of GSH with metal ions19.

A phylogenetic analysis of both monocots (maize and rice) and dicots (soybean and Arabidopsis) demonstrated that the Zeta and Theta classes are monophyletic groups in monocots, dicots and mammals, suggesting that their origin might predate the divergence between plants and animals23. The Zeta and Theta classes have undergone one or two duplication events, with at most three paralogs in maize, rice, soybean and Arabidopsis. The Phi and Tau classes differ between monocots and dicots because of the extensive gene duplication events that each group underwent after their divergence. Extensive duplications also resulted in gene clusters that share high similarity within small genome regions. The reasons why these extensive gene duplications were retained are still unknown23.

A total of 1107 GSTs from 20 different plant species with sequenced genomes were analyzed (Table 1) to reveal the organization of this relevant family in plants. The species examined comprise two green algae, two Bryophytes, one Marchantiophyta, one Lycopodiophyta, one Gymnosperm, three monocots and ten dicots, including the reference plant species Arabidopsis thaliana (family Brassicaceae).

Table 1 List of plants considered for this study. Scientific name (name) of the organisms considered, their classification (A (CHL): Algae Chlorophyta, A (CHA): Algae Charophyta, B: Bryophyta, L: Lycophyta, MA: Marchantiophyta, G: Gymnosperms, M: Monocots, D: Dicots), number of chromosomes (Chr), genome size estimation in Mb (Genome), total number of genes currently estimated (Gene), genomics resource, bibliographical reference (Source + Reference) and publication year (Year).

Results

Class assignment of unclassified GSTs

The collection of 1107 GST protein sequences from the 20 species consisted of 214 Tau, 53 Phi, 41 Theta, 7 Lambda, 23 Dhar, 28 Zeta, 21 Mapeg, 10 Hemerythrin, 15 EF-gamma, 4 URE2p, 9 TCHQD, 2 Iota and 16 Omega-like GSTs. In addition, 666 unclassified GSTs were also included (Table 2, numbers in brackets).

Table 2 Number of GSTs per species and per class. Type classes as in Table 1.

To associate the unclassified GSTs with specific classes, the collection was analyzed by a multiple protein sequence alignment using Muscle24 and an associated phylogenetic tree based on the maximum likelihood method25 (Fig. 1). The analysis defined the class association of the 666 unclassified GSTs (Table 2, numbers not in brackets), highlighting the presence of GST-Tau in Chlorophytes, Marchantiophytes and Klebsormidiales, and confirming the results of Liu et al., 2013, concerning their absence in Bryophytes.
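The idea of reading a class off the tree can be illustrated with a small sketch. The text does not describe an automated rule, so the majority-vote rule below is an assumption made purely for illustration: each unclassified leaf receives the most common class among the classified leaves of the smallest clade that contains it. The tree is encoded as nested tuples, and all identifiers and labels are hypothetical toy data.

```python
from collections import Counter

def leaves(node):
    """Return all leaf names under a node (a leaf is a str, a clade is a tuple)."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]

def assign_classes(tree, known):
    """Give each unlabeled leaf the majority class of its smallest labeled clade.

    Assumption: this voting rule is illustrative, not the procedure of the study.
    """
    assigned = {}

    def visit(node, ancestors):
        if isinstance(node, str):
            if node in known:
                return
            # Walk from the closest enclosing clade outwards.
            for clade in reversed(ancestors):
                votes = Counter(known[l] for l in leaves(clade) if l in known)
                if votes:
                    assigned[node] = votes.most_common(1)[0][0]
                    break
            return
        for child in node:
            visit(child, ancestors + [node])

    visit(tree, [])
    return assigned

# Hypothetical toy tree and labels.
toy_tree = ((("tau_a", "tau_b"), "query_1"), (("phi_a", "query_2"), "phi_b"))
toy_known = {"tau_a": "Tau", "tau_b": "Tau", "phi_a": "Phi", "phi_b": "Phi"}
result = assign_classes(toy_tree, toy_known)
print(result)
```

In practice such an assignment would also need to take branch support into account, which is why ambiguous cases are checked with independent evidence.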

Figure 1

Phylogenetic tree of all the 1107 GSTs. Colors of the leaves indicate the species, while those of the branches indicate the GST class, as reported in the corresponding legends.

Plant phylogeny depicted by GSTs

Figure 1 shows that one GST (kfl00659_0030) from Klebsormidium flaccidum (Klebsormidiales) and two GSTs (213211, 49816) from Micromonas pusilla (Chlorophyta) were placed in the Tau class, as also summarized in Table 2.

In Liu et al., 2013, the authors suggested that GST-Tau genes were absent in algae and Bryophytes and helped Tracheophytes colonize land. Interestingly, our preliminary results also show that two GSTs (Mapoly0031s0032.1, Mapoly0118s0009.1) of Marchantia polymorpha (Marchantiophyta) belong to the Tau class.

Table 3 shows the results of further analyses on the assignment of these 5 sequences to a specific GST class. A BLASTp analysis26, versus all the other GST protein sequences collected here and versus the UniProtKB27 database, highlighted that the two Marchantia polymorpha (Mapoly0031s0032.1, Mapoly0118s0009.1) GST-Tau sequences are indeed significantly similar to other members of the Tau class. The same holds for one of the two Micromonas pusilla sequences (213211), although with lower significance (low score and identity values).

Table 3 Summary of the two BLASTp results.

On the other hand, the sequence from Klebsormidium flaccidum (kfl00659_0030) and the remaining one from Micromonas pusilla (49816) showed a significant alignment with members of the Mapeg class (Table 3).

A domain search using the InterPro tool28 (Figure S1) showed that the Micromonas pusilla sequence (213211), assigned to the Tau class by both the phylogenetic tree and the BLASTp analysis, is actually an Omega-like GST.

In Liu et al., 2013, the presence of the GST-Tau class in plants from Lycophytes to higher plants suggested that this class of proteins helped plants colonize land. We confirmed the absence of Tau GSTs in all Bryophytes with a multiple sequence alignment and an associated phylogenetic tree of all the available GSTs from this division together with the 1107 proteins from our collection (data not shown). In contrast, this study highlighted the presence of two Tau GSTs in the Marchantiophyte division. This evidence supports the hypothesis of a paraphyletic origin for Bryophytes29,30,31 (Fig. 2), in contrast with the general assumption that Bryophytes and Marchantiophytes form a clade separate from the one that gave rise to higher plants, and it also suggests that Marchantiophytes could indeed belong to the lineage leading to higher plants.

Figure 2

(A) Phylogenetic tree currently proposed for green plants evolution. (B) Green plants evolutionary tree resulting from Cooper 2014. (C) Green plants evolutionary tree proposed herein.

Tau subclasses

The data collected in this research clearly highlight the amplification of the GST-Tau class compared to other GST classes8 (Fig. 1). In the work of Wagner32, the authors suggested that GST-Tau in Arabidopsis could be divided into three subclasses. To further investigate the expansion of the Tau class, a pairwise similarity analysis of these proteins was carried out in Arabidopsis thaliana (Fig. 3) and in Solanum lycopersicum (Table S2). The results highlight the presence of four subclasses in Arabidopsis (Fig. 3), one more than Wagner32 described, whereas five subclasses were identified in tomato (Table S2).

Figure 3

Arabidopsis thaliana GST-Tau similarity matrix. Minimum and maximum values per column are indicated. The last columns indicate annotation of the gene in terms of chromosome (Chr), gene start (Start) and gene end (End), number of exons per gene (N. of exons) and the assignment to the identified subclass (Subclass number).
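A minimal sketch of how a pairwise similarity matrix with per-column minimum and maximum values could be computed from aligned sequences is given below. Percent identity over ungapped columns is used here as a simple stand-in and is an assumption; the matrices in this study were derived from PHYLIP "protdist" distances. The sequence names and residues are hypothetical toy data.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

def similarity_matrix(aligned):
    """Return (names, matrix) for a dict of equal-length aligned sequences."""
    names = list(aligned)
    matrix = [[percent_identity(aligned[r], aligned[c]) for c in names]
              for r in names]
    return names, matrix

def column_min_max(matrix):
    """Min and max of each column, ignoring the self-comparison on the diagonal."""
    summary = []
    for j in range(len(matrix)):
        column = [matrix[i][j] for i in range(len(matrix)) if i != j]
        summary.append((min(column), max(column)))
    return summary

# Hypothetical aligned toy sequences ("-" marks a gap).
toy = {"gst_a": "MKLV-AGT", "gst_b": "MKLVQAGT", "gst_c": "MRLV-SGT"}
names, matrix = similarity_matrix(toy)
for name, (low, high) in zip(names, column_min_max(matrix)):
    print(f"{name}: min={low:.2f} max={high:.2f}")
```

Groups of sequences whose mutual values are consistently higher than their values against the rest of the matrix are the candidates for subclasses.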

For further confirmation, two independent phylogenetic trees were drawn, one for Arabidopsis and one for tomato (Fig. 4). The trees support the results from the pairwise similarity matrices. Subsequently, a phylogenetic tree (Fig. 5) was built with fewer species than the one in Fig. 1, including only Arabidopsis, S. lycopersicum, V. vinifera, three monocots (maize, rice and greater duckweed), S. moellendorffii and M. polymorpha. The latter two species are considered ancestral plants33. The figure shows a specific grouping into five subclasses, numbered from 1 to 5, which had already been detected in the species-specific analysis of tomato Tau GSTs. Subclass 5 does not include GSTs from Arabidopsis.

Figure 4

Phylogenetic tree of GSTs from the class Tau in tomato (red) and Arabidopsis (yellow). The branches indicate the possible different subclasses, according to their color reported in the legend. Bootstrap values are also indicated.

Figure 5

Phylogenetic tree of GSTs from class Tau of nine different species (as reported in the leaves legend). The branches indicate the possible different subclasses, according to the color reported in the corresponding legend. Bootstrap values are also indicated.

In the work of Dixon and Edwards34, all Arabidopsis GSTs were assigned a specific role. Considering these functional assignments, subclass 1 includes nine Arabidopsis GSTs (AT3G43800.1, AT1G78370.1, AT1G78340.1, AT1G78380.1, AT1G78320.1, AT1G78360.1, AT1G17180.1, AT1G17190.1 and AT1G53680.1) that are reported to be expressed under abiotic and biotic stresses, since they bind herbicides (AT1G17190.1), 1-chloro-2,4-dinitrobenzene (AT1G78380.1, AT1G17180.1, AT1G53680.1), and salicylic (AT3G43800.1) or jasmonic acid (AT1G78370.1).

Subclass 2 includes eight Arabidopsis GSTs (AT1G59700.1, AT1G59670.1, AT1G69930.1, AT1G69920.1, AT1G27130.1, AT1G27140.1, AT1G10370.1 and AT1G10360.1), all reported to have a low capability of binding glutathione. These GSTs are abundant in the nucleus and also bind RNA.

The Arabidopsis Tau GSTs that are preferentially expressed in the root when the concentration of auxin and/or abscisic acid increases (AT3G09270.1, AT2G29480.1, AT2G29470.1, AT2G29490.1, AT2G29460.1, AT2G29440.1, AT2G29450.1 and AT2G29420.1) all fall in subclass 3. Finally, the three GSTs that are highly expressed in seeds under stress conditions (AT1G74590.1, AT5G62480.1 and AT5G62480.2) are all included in subclass 4.

Subclass 5 includes S. lycopersicum, V. vinifera and O. sativa members but no Arabidopsis GSTs. This aspect was further investigated by also considering Tau GSTs from B. oleracea, another Brassicaceae species in which 28 Tau GSTs have been characterized35. The phylogenetic tree including Tau GSTs from B. oleracea, V. vinifera, S. lycopersicum and A. thaliana (Figure S2) shows that GSTs from B. oleracea are not included in subclass 5, and suggests that the absence of members of subclass 5 could be a common feature of Brassicaceae.

Subclass 5 includes 47 GSTs (Fig. 5). LOC_Os12g02960.1, from O. sativa36, and Solyc01g081250.2.1 and Solyc09g063150.2.1, from S. lycopersicum37, are expressed under abiotic stress. Moreover, six V. vinifera GSTs in the subclass were each characterized as able to bind and transport flavonoids in the berry skin (VIT_201s0026g01340.1, VIT_207s0005g04890.1, VIT_215s0024g01630.1, VIT_215s0024g01650.1 and VIT_215s0107g00150.1, in the work of Costantini38, and VIT_215s0024g01540.1 in the work of Malacarne39). Interestingly, four V. vinifera GSTs (VIT_205s0051g00240.1, VIT_207s0005g04880.1, VIT_205s0049g01090.1, VIT_205s0049g01120.1)40 and one S. lycopersicum GST (Solyc01g081270.2.1)41 are expressed during abscission. This could suggest a functional divergence of the members of subclass 5 and a possible association with abscission mechanisms, which would explain their absence in Brassicaceae in contrast with their presence in grapevine and tomato42.

The GST-Tau sequences from M. polymorpha (Marchantiophyta) and S. moellendorffii (Lycopodiophyta) are all grouped in subclass 1. This may suggest that this Tau subclass represents the group of ancestral GST sequences.

Discussion

This analysis of 1107 GSTs from plants with sequenced genomes produced a wide phylogenetic tree that provides insights into the organization of the different GST classes and highlights the presence of subclasses in the major classes currently described.

Beyond the assignment of 666 unclassified proteins to specific GST classes, the main aspect presented in this study is the possible confirmation of the paraphyletic origin of Bryophytes, in contrast with the general assumption that Bryophytes and Marchantiophytes form a clade separate from the one that gave rise to higher plants. Moreover, the results indicate that Marchantiophytes could indeed belong to the lineage leading to higher plants.

The study also analyzes the GST-Tau class, revealing the presence of at least 5 subclasses, and attempts to define the function of these subclasses. The results highlight a GST-Tau subclass that includes all the GST sequences from ancestor species, suggesting a primordial functionality for its members. Finally, a possible subclass that includes genes associated with abscission appears to be absent in Brassicaceae.

Materials and Methods

Genomic resources

GST protein sequences were searched by keyword. For Amborella trichopoda (v1.0), Selaginella moellendorffii (v1.0), Sphagnum fallax (v0.5), Spirodela polyrhiza (v2), Zea mays (Ensembl-18), Micromonas pusilla CCMP1545 (v3.0), Marchantia polymorpha (v3.1) and Populus trichocarpa (v3.0), the sequences were downloaded from Phytozome 11 (ref. 43) (https://phytozome.jgi.doe.gov/pz/portal.html); GSTs from Picea abies (v1.0) were downloaded from Congenie (http://congenie.org/); GSTs from Klebsormidium flaccidum were downloaded from CGA (http://genome.microbedb.jp/Klebsormidium), while the ones from Oryza sativa were downloaded from TIGR44 (http://rice.plantbiology.msu.edu/); GST sequences from Coffea canephora were obtained by searching the Coffee Genome Hub database45 (http://coffee-genome.org/coffeacanephora); Glycine max GST protein sequences were downloaded from Gramene46 (http://www.gramene.org/); GST sequences of Solanum lycopersicum (iTAG2.4) and Capsicum annuum (v1.55) were downloaded from SGN47 (https://solgenomics.net/), while the ones of Solanum tuberosum (PGSC_DM_v_3.4) were obtained from Spud DB48 (http://solanaceae.plantbiology.msu.edu/); GST sequences of Arabidopsis thaliana were downloaded from TAIR10 (https://www.arabidopsis.org/). Vitis vinifera GST sequences (v2) were obtained from Cribi (http://genomes.cribi.unipd.it/grape/). GST sequences of Physcomitrella patens were obtained from ref. 19 and the ones from Citrus sinensis were obtained from ref. 9.
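A keyword search of this kind can be sketched as a filter over the description lines of a protein FASTA file. The keyword list below is an assumption, since the exact search terms are not given here, and the records are hypothetical toy data rather than real annotations.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Assumed search terms; the actual keywords are not stated in the text.
KEYWORDS = (
    "glutathione s-transferase",
    "glutathione-s-transferase",
    "glutathione transferase",
)

def select_by_keyword(records, keywords=KEYWORDS):
    """Keep records whose header contains any keyword (case-insensitive)."""
    return [(h, s) for h, s in records
            if any(k in h.lower() for k in keywords)]

# Hypothetical toy records.
toy_fasta = """>seq1 Glutathione S-transferase family protein
MAEEVKLL
GFWPSP
>seq2 Cytochrome P450
MLLAV
>seq3 putative glutathione transferase
MSKV
"""
hits = select_by_keyword(read_fasta(toy_fasta))
print([header.split()[0] for header, _ in hits])
```

Matching on annotation text depends on how each resource words its descriptions, so a filter like this would normally be followed by a check of the retrieved sequences.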

Phylogenetic Analysis

Multiple alignments were obtained using Muscle24 with default parameters (gap open penalty -2.9, gap extension penalty 0). The phylogenetic tree was built with RAxML25, using the maximum likelihood method with the PROTCATBLOSUM62 model and the bootstrap option. Finally, the trees were edited with iTOL v3 (ref. 49).
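The two steps can be sketched as command lines assembled in Python. Several details are assumptions not given in the text: the MUSCLE v3-style `-in`/`-out` flags, the `raxmlHPC` binary name, the rapid bootstrap mode (`-f a`), the number of replicates and the random seeds, as well as all file names.

```python
def muscle_command(fasta_in, alignment_out):
    """MUSCLE v3-style call; defaults are used, so no scoring flags are set."""
    return ["muscle", "-in", fasta_in, "-out", alignment_out]

def raxml_command(alignment, run_name, replicates=100, seed=12345):
    """RAxML maximum likelihood search with bootstrap under PROTCATBLOSUM62.

    Assumptions: rapid bootstrap ("-f a"), replicate count and seeds are
    illustrative choices, not values reported in the text.
    """
    return [
        "raxmlHPC",
        "-f", "a",                 # rapid bootstrap plus best-scoring ML tree
        "-m", "PROTCATBLOSUM62",   # protein model named in the text
        "-p", str(seed),           # seed for parsimony starting trees
        "-x", str(seed),           # seed for rapid bootstrapping
        "-#", str(replicates),     # number of bootstrap replicates
        "-s", alignment,
        "-n", run_name,
    ]

# Hypothetical file names.
align_cmd = muscle_command("gst_proteins.fasta", "gst_proteins.afa")
tree_cmd = raxml_command("gst_proteins.afa", "gst_tree")
print(" ".join(align_cmd))
print(" ".join(tree_cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a system where the corresponding program is installed; building the commands as lists avoids shell quoting problems with file names.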

To obtain the pairwise distances of GST-Tau protein sequences we used “protdist” from PHYLIP with the JTT matrix50. All the alignments, trees and matrices were built using shorter identifiers to indicate each gene. The conversion table between the original gene IDs and the codes used here is reported in the supplemental Table 1.
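A conversion table of this kind can be generated programmatically. The sketch below assumes a simple coding scheme (a prefix plus a zero-padded counter), which is hypothetical and not necessarily the scheme used for the supplemental table; it keeps codes within the 10-character name field of the strict PHYLIP format.

```python
def build_conversion_table(gene_ids, prefix="G"):
    """Map each original gene ID to a short, unique, fixed-width code."""
    width = max(4, len(str(len(gene_ids))))
    table = {}
    for gene_id in gene_ids:
        if gene_id not in table:  # duplicates keep their first code
            table[gene_id] = f"{prefix}{len(table) + 1:0{width}d}"
    return table

def rename_records(records, table):
    """Replace the identifier of each (gene_id, sequence) pair by its code."""
    return [(table[gene_id], seq) for gene_id, seq in records]

# Gene IDs taken from the text; the sequences are hypothetical placeholders.
ids = ["AT1G78370.1", "Solyc01g081250.2.1", "Mapoly0031s0032.1"]
table = build_conversion_table(ids)
reverse = {code: gene_id for gene_id, code in table.items()}
renamed = rename_records([(ids[0], "MAEE"), (ids[2], "MSKV")], table)
print(table)
```

The `reverse` dictionary restores the original gene IDs when labelling trees or matrices after the analysis.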

Class assignment for ambiguous cases

To determine the class of the three putative GST-Tau sequences of the two algae and of the two putative Tau GSTs of the Marchantiophyta, we performed a BLASTp26 search with default parameters versus the entire GST collection considered here. A BLASTp search with default parameters was also performed versus UniProtKB27. The M. pusilla putative GST-Tau was further investigated by an InterProScan28 analysis with default parameters.
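As an illustration of how BLASTp results can be turned into a class call, the sketch below keeps the best-scoring non-self hit for each query and looks up the class of that hit. It assumes the search was exported in BLAST+ tabular format (`-outfmt 6`), which is not stated in the text; the query names, hit names, scores and class lookup are all hypothetical toy values, not results of this study.

```python
import csv
import io

# Standard column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hits(tabular_text):
    """Return the highest-bitscore non-self hit for each query."""
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        if row["qseqid"] == row["sseqid"]:
            continue  # ignore the query matching itself in the collection
        score = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or score > float(best[query]["bitscore"]):
            best[query] = row
    return best

# Hypothetical toy hits (values are made up for illustration).
rows = [
    ["query_A", "tau_1", "45.0", "200", "110", "2", "1", "200", "5", "204", "1e-50", "180"],
    ["query_A", "mapeg_1", "30.0", "120", "84", "1", "10", "129", "3", "122", "1e-10", "60"],
    ["query_B", "query_B", "100.0", "140", "0", "0", "1", "140", "1", "140", "1e-90", "300"],
    ["query_B", "mapeg_1", "52.0", "140", "67", "0", "1", "140", "1", "140", "1e-40", "150"],
]
toy_output = "\n".join("\t".join(r) for r in rows)
known_class = {"tau_1": "Tau", "mapeg_1": "Mapeg"}  # hypothetical lookup

calls = {q: known_class[hit["sseqid"]] for q, hit in best_hits(toy_output).items()}
print(calls)
```

A best-hit call is only a first indication; low scores or low identity, as reported for one of the M. pusilla sequences, are a reason to seek confirmation from a domain search.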