Analyzing the G3BP-like gene family of Arabidopsis thaliana in early turnip mosaic virus infection

The Arabidopsis thaliana genome contains several genes that are known or predicted to participate in the formation of stress granules (SG). One gene family encodes Ras GTPase-activating protein–binding protein (G3BP)-like proteins. Seven genes were identified, one of which was shown in a previous study to interact with plant virus proteins. A phylogenetic analysis and a tissue-specific expression analysis by qRT-PCR, including laser-dissected phloem, were performed, and the sub-cellular localization of individual AtG3BP::EYFP fluorescent fusion proteins expressed in Nicotiana benthamiana epidermal cells was observed. Individual AtG3BP-protein interactions in planta were studied using the bimolecular fluorescence complementation approach in combination with confocal imaging in living cells. In addition, the early and late induction of G3BP-like expression upon Turnip mosaic virus (TuMV) infection was investigated by RNAseq and qRT-PCR. The results showed a high divergence of transcription frequency in the different plant tissues, promiscuous protein–protein interaction within the G3BP-like gene family, and a general induction by TuMV infection in A. thaliana. The information gained from these studies leads to a better understanding of stress granules, in particular their molecular mode of action in the plant and their role in plant virus infection.

primary target of virus infection. Viruses can employ different strategies to interfere with SG assembly 16,19–21 . For example, the non-structural protein 3 (nsP3) of Semliki Forest virus (SFV) binds to the NTF2-like domain of G3BP-1 via an 'FGDF' motif, thereby preventing SG assembly and promoting viral RNA translation 22–24 . Multiple poliovirus proteins also affect RNA granules to varying extents; G3BP-1, for example, is cleaved by 3Cpro 25 . The general pathway of SG formation and function seems to be conserved, but only a few studies on plant stress granules are available 18,19 compared to mammalian stress granules. For example, the G3BP-like family in A. thaliana has not yet been fully characterized. Abulfaraj and colleagues 26 described eight family members and generated and analyzed AtG3BP-7 (AT5G48650) overexpression (OEX) and knockout (KO) lines, which showed no phenotype compared to control plants. This might be because G3BPs in A. thaliana are functionally redundant, so that a KO of one AtG3BP could be compensated by one of the others. Nevertheless, G3BP in plants and its interactions with plant viruses are poorly understood, although 'FGDF'-like binding motifs can be found in some plant viruses 22 . For example, it has been shown that the helper component proteinase (HC-pro) of potato virus A (PVA) induces the formation of RNA granules 27 . Only a few studies have investigated the interaction between plant viruses and the host's G3BP homologs. Only recently was it shown that A. thaliana G3BP-2 interacts with the nuclear shuttle protein (NSP) of the plant virus Abutilon mosaic virus 9 , but the purpose of this interaction is still unknown.
Turnip mosaic virus (TuMV), a member of the genus Potyvirus in the family Potyviridae, has a single-stranded RNA genome. In general, the genomic RNA is < 10 kb in size and encodes a polyprotein that includes the P1 protease 28 . Two FGDF-like motifs are located at the N-terminal end of the TuMV P1 protein (P1-TuMV). This motif has been shown to bind the host protein G3BP 22 , which suggests that P1-TuMV also interacts with G3BP in plants in a similar manner. The aim of this study was therefore to gain more information about all putative members of the G3BP-like protein family in A. thaliana in terms of tissue expression, cellular localization, and protein-protein interaction, and to investigate the response of the AtG3BPs to a viral infection, here with TuMV. Special emphasis was placed on monitoring AtG3BP expression at an early time point of infection.

Results
Phylogenetic analysis. Bioinformatic analysis of the A. thaliana genome (TAIR10) revealed seven possible members of an A. thaliana G3BP-like gene family. These candidates are distributed across different chromosomes (Fig. 1a): AtG3BP-1, -2, and -7 are located on chromosome 5, AtG3BP-3 on chromosome 3, AtG3BP-4 and -5 on chromosome 1, and AtG3BP-6 on chromosome 2. All were identified by, and thus share, an N-terminal NTF2-like domain and a C-terminal RRM domain. Furthermore, all of them except AtG3BP-4 harbor the Gly-rich region and RG(G) motifs at the C-terminus (Fig. 1b). Additionally, they contain several conserved amino acids whose functional relevance has been demonstrated in the human homolog, such as phenylalanine at position 41 of the consensus sequence (equivalent to position 33 in HsG3BP-1), which all AtG3BPs, again with the exception of AtG3BP-4, share with their mammalian homolog. A protein sequence alignment was performed to show the intra-family relatedness between the postulated members (Fig. 1c). They cluster into two main groups, the first consisting of AtG3BP-1, -2, -3, and -7 and the second consisting of AtG3BP-4, -5, and -6.
Comparing each of the three AtG3BP domains separately did not significantly alter the phylogeny obtained from the full amino acid sequences. AtG3BP-1, -2, -3, and -7, and AtG3BP-4, -5, and -6 still cluster together if only the NTF2 domain is analyzed (Supplementary Fig. S1a), and with minor changes if the RRM domain is compared (Supplementary Fig. S1b). Strikingly, AtG3BP-1 and -7, and -2 and -3 cluster together when only the Gly-rich regions are compared (Supplementary Fig. S1c), because AtG3BP-4 lacks this region. The cluster of AtG3BP-5 and -6 remains unchanged. Table 1 summarizes the protein sequence similarity and the nucleotide sequence identity of the coding sequences based on the ClustalW sequence alignment. Here, AtG3BP-5 and AtG3BP-6 show the highest amino acid sequence identity (67%), whereas AtG3BP-2 and -5 show the lowest (36.2%).
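As an illustration of how a pairwise identity value of the kind reported in Table 1 can be derived from an alignment, the following is a minimal Python sketch. The function name, the toy sequences, and the choice of denominator (aligned columns in which neither sequence has a gap) are assumptions made for this example; the values in this study were calculated by the alignment software, whose exact convention may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: gapped columns are ignored (pairwise deletion). Other tools
    may divide by the full alignment length or by the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from the AtG3BP sequences.
toy_a = "MKV-LFGA"
toy_b = "MRVQLF-A"
print(round(percent_identity(toy_a, toy_b), 1))
```

In the toy alignment, six columns are gap-free and five of them match, so the sketch reports an identity of about 83%.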

Subcellular localization of AtG3BP::EYFP fusion proteins in N. benthamiana.
To study the subcellular localization, each AtG3BP was transiently overexpressed as a functional fusion with enhanced yellow fluorescent protein (EYFP) under the control of the CaMV 35S promoter in N. benthamiana epidermal cells (Fig. 2). All possible members of the AtG3BP family showed a cytoplasmic signal under ambient conditions, with AtG3BP-6 additionally showing a nuclear EYFP signal. Furthermore, AtG3BP-3, -5, and -7 already formed granule-like structures under ambient conditions. After heat shock treatment, these structures were also observed for the other AtG3BPs (Fig. 2). The nuclear EYFP signal of AtG3BP-6 was still observed after stress application (Supplementary Fig. S2). The expression of the fusion proteins was verified by western blot analysis (Supplementary Fig. S3).

AtG3BP protein interaction profiling in vivo.
Previous studies have shown that the NTF2 domain plays a key role in the oligomerization of HsG3BP-1 2 . Since all predicted A. thaliana G3BP homologs share this domain, we analyzed their ability to form homo- and heterooligomers by BiFC. All 49 pair-wise combinations were co-expressed in N. benthamiana leaves and monitored for a reconstituted fluorescent YFP signal under ambient conditions and after heat shock treatment. The data for homooligomerization are shown in Fig. 3a and b; those for heterooligomerization are summarized in Fig. 3c and d, with the corresponding images in Figs. S4-S7. Figure 3a shows the reconstituted YFP signals for the different AtG3BPs C-terminally fused to the respective YFP fragment before and after heat shock. All AtG3BPs except AtG3BP-6 showed a cytoplasmic YFP signal under ambient conditions, with AtG3BP-1, -4, -5, and -7 already forming granule-like structures under these conditions. After heat shock, the evenly distributed cytoplasmic YFP fluorescence changed almost completely into granular YFP signals. AtG3BP-6 C-terminally fused to splitYFP did not show any signal after heat shock. When fused N-terminally to the respective YFP fragments, only AtG3BP-1, -2, and -4 showed an exclusively cytoplasmic reconstituted YFP signal. After heat shock treatment, YFP-fluorescent granule-like structures were observed for AtG3BP-1, -2, -4, -5, and now also for -6. We also tested the ability of the different AtG3BPs to form heterodimers with each other in vivo by co-infiltrating the reciprocal splitYFP fusion proteins and examining the leaves for reconstituted YFP signals two days post agroinfiltration (dpai) by laser scanning microscopy (Figs. S4-S7). The results are summarized in Fig. 3c and d. Overall, all AtG3BPs seem to interact with each other under ambient conditions as well as under stress conditions (in this case heat shock).
To test whether the observed granule-like structures are indeed stress granules, we performed BiFC experiments with the stress granule protein AtUBP-24 (AT4G30890) 9 . As a control, we used the Clink protein of the Pea necrotic yellow dwarf virus (Accession JN133280) to show that the predicted AtG3BPs, and not AtUBP-24 alone, actually promote stress granule formation (Fig. 5). AtUBP-24 harbors an 'FGSF' motif at its N-terminal end, similar to the 'FGDF' motif of HsUSP-10, the protein's G3BP binding domain. C-terminal fusion constructs of AtG3BP or Clink with cYFP were co-agroinfiltrated into N. benthamiana leaves with AtUBP-24::nYFP (Fig. 4a), and AtG3BP::nYFP or Clink::nYFP with AtUBP-24::cYFP (Fig. 4b). Epidermal cells were monitored for reconstituted YFP signals at two dpai by confocal microscopy. All combinations of the different AtG3BPs with AtUBP-24 showed a granular YFP signal at ambient conditions and after heat shock, whereas co-expression of Clink with AtUBP-24 showed an exclusively nuclear signal under both conditions.
qRT-PCR analysis of AtG3BP gene transcript abundance. The presence of seven A. thaliana G3BP-like homologs raises the question of their distinct function(s), the possibility of functional redundancy in either the same or in different tissues, and their possible role in virus infection 9 . To investigate expression levels, a qRT-PCR analysis of transcript abundance was performed for the G3BP-like gene family members in different plant tissues, namely root, stem, flower, silique, leaf, and phloem, under ambient conditions. Except for AtG3BP-3 and AtG3BP-7, the different G3BP genes show ubiquitous expression throughout all tested tissues. While AtG3BP-3 is expressed only in leaf tissue, AtG3BP-7 is barely detectable in leaves and is found primarily in root tissue. Overall expression levels of the different AtG3BPs were comparable, with AtG3BP-4 showing the highest constitutive expression level (Fig. 5a).
Phloem tissue from a transgenic A. thaliana plant expressing GFP under the control of the AtSUC2 promoter was sampled using a laser dissection microscope (Supplementary Video S1) to monitor AtG3BP transcript abundance in this specialized tissue. With the exception of AtG3BP-3, all AtG3BPs were detectable in the phloem. AtG3BP-1 shows the highest relative expression, followed by AtG3BP-2, -6, and -7. AtG3BP-4 and -5 show the lowest expression in the phloem (Fig. 5b). A direct comparison between the relative expression values in the phloem tissue and the other tissues might be misleading because of differences in the RNA extraction procedure: petioles were cut, fixed, sectioned, and laser-dissected to obtain phloem tissue. Nevertheless, an expression level analysis of the different AtG3BP transcripts in roots, stem, flower, silique, and rosette leaves, which includes the data from phloem tissue, is shown in Supplementary Fig. S8.
Table 1. Sequence similarity of the different AtG3BPs in %. The protein sequence similarity (shaded in grey) and the nucleotide sequence identity (white boxes) of the coding sequences were calculated based on the ClustalW alignment with the BLOSUM62 matrix and a threshold of 0.
AtG3BP-2 was shown to interact with the nuclear shuttle protein of a plant virus 9 , which raised the question of whether the expression levels of the different A. thaliana G3BP-like homologs are affected by a plant virus, for example TuMV. TuMV, which also expressed an RFP fused with a nuclear localization signal (TuMV-RFP 29 [...] infection in plants, avoid dilution effects and thus to obtain the most meaningful information, micro-dissected tissue at 2 dpi was collected and analyzed first in a transcriptomic approach (RNAseq) (Table 2) and subsequently by qRT-PCR (Fig. 6b). The different AtG3BPs were unevenly expressed, either up- or down-regulated, with the striking feature that AtG3BP-4 was most probably significantly upregulated (1.7-fold), which was then confirmed by qRT-PCR analysis (Fig. 6b). AtG3BP-4 was induced sixfold by TuMV infection at 2 dpi.

Discussion
Human G3BP, first reported by Parker and colleagues 30 , binds to the SH3 domain of RasGAP. The N-terminal nuclear transport factor 2 (NTF2)-like domain is responsible for this binding 31 . The C-terminal portion of HsG3BP-1 contains an RRM motif, indicating that HsG3BP-1 has RNA-binding capacity. Indeed, HsG3BP-1 was reported to co-immunoprecipitate with mRNAs and to bind to and cleave, for example, the 3′ untranslated region of the c-myc mRNA in vitro 9 . Database searching has allowed the identification of members of the A. thaliana G3BP-like gene family. Seven G3BP-like proteins, within the group of NTF2 and RRM domain family proteins, were identified in A. thaliana and further investigated in this study. Abulfaraj and colleagues 26 additionally described an RNA-binding (RRM-RBD-RNP motif) domain nuclear transport factor 2 family protein (AT3G07250), but in contrast to the other seven, AT3G07250 has a different domain structure: it contains three RRM domains instead of one, and the Gly-rich region and the RG(G) motifs are missing. In addition, The Arabidopsis Information Resource (TAIR) database 32 and Klepikova eFP RNAseq data revealed that AT3G07250 is only expressed in flowers 12-14 33 . Attempts to amplify the ORF of AT3G07250 by RT-PCR from leaf tissue consequently failed (data not shown), whereas transcriptional analysis showed that AtG3BP-1 to -7 were expressed, mostly throughout the plant, with the exception of AtG3BP-3. AtG3BP-3 could not be amplified from phloem tissue and was expressed exclusively in leaf tissue, which might hint at guard cell-specific expression. Guard cells form stomata and are highly specialized cells in which gene expression is often regulated by specific promoter activity 34 . Promoter studies will provide insight into cell-specific expression of the AtG3BP gene family in the future. In summary, it is a reasonable assumption that the different G3BP-like proteins within the G3BP gene family in A. thaliana have redundant functions in different tissues at different developmental stages. It is therefore suggested that the AtG3BPs, like their animal counterparts, are likely to be involved in a wide variety of cellular processes, most importantly in stress granule formation. This suggestion is strongly supported by the fact that the two G3BP family members in mammals, G3BP-1 and G3BP-2, both co-localize in SGs when cells are subjected to stress 35,36 . Sub-cellular localization studies showed that the different AtG3BPs are also localized in SGs upon heat stress. In addition, AtG3BP-2 localizes to SGs if the plant cell is treated with KCN 9 . Kedersha and colleagues 37 concluded that G3BP binds 40S ribosomes through its RGG region and is essential for stress granule condensation, which is distinct from polysome disassembly, and that the condensation process is regulated by competitive Caprin1/USP10 binding to G3BP. The A. thaliana UBP-24 protein is the plant homolog of human USP10 9 . Consistent with this, all putative members of the AtG3BP family studied interact in planta with AtUBP-24. In addition, homo- and heterodimerization, as described for the human G3BPs 2 , was also confirmed in this study. The analyzed AtG3BP family members may, therefore, fulfill a similar role in SG assembly and disassembly as their well-investigated mammalian counterparts and thus may also play a pivotal role in viral infection 15,16 , as previously suggested by interaction studies of AtG3BP-2 and the nuclear shuttle proteins of ssDNA plant viruses 9 . Therefore, it was necessary to investigate whether the other identified members of the AtG3BP family were also affected by a virus infection. First, the tissue tropism and expression level of the AtG3BPs were determined, and then the question was addressed whether the expression level changes upon virus infection.
Not surprisingly, six of seven AtG3BP members were significantly upregulated in infected leaf tissue, but most interestingly only AtG3BP-4 was significantly induced in early infection, as shown by transcriptomic data obtained from dissected tissue and confirmed by qRT-PCR analysis. The complete analysis of the genes differentially expressed upon TuMV infection in A. thaliana tissue at 2 dpi will be described elsewhere, and further analyses of the role of AtG3BP-4 in early and systemic virus infection will be performed in the future. It is reasonable to assume that G3BP plays a role at an early stage of infection; for example, members of the Picornaviridae family also modulate SG accumulation during replication. Poliovirus regulates SGs in a time-dependent manner: at early times the 2A proteinase induces assembly of SGs, which are later disassembled by the 3C proteinase through G3BP-1 cleavage 38 . Similar temporal control of SG assembly is exhibited by Coxsackievirus B3, with assembly as early as 3 h post-infection. Given that SGs are generally associated with regulation of gene expression, viruses have evolved different mechanisms to counteract their assembly or to use them in their favor to replicate successfully within the host environment. In this study, data are presented on a key SG factor, namely G3BP. The AtG3BPs behave similarly to their mammalian equivalents, and their expression in early response to a virus infection was investigated and confirmed. Particular focus must, therefore, be on AtG3BP-4, which we show is highly upregulated upon TuMV infection. Analysis of KO and OEX plant lines in the future will shed more light on the G3BP family's role in virus infection in particular and other stresses in general.

Transient expression of fusion proteins in N. benthamiana.
All expression plasmids were transformed into Agrobacterium tumefaciens strain LBA4404 or GV3101 by electroporation. Agrobacterium cultures were grown in selective media and infiltrated at a final OD600 = 0.1. For bimolecular fluorescence complementation (BiFC) experiments, cultures were mixed and then co-infiltrated at a 1:1 ratio into the abaxial surface of 3-5 week-old N. benthamiana leaves.
Heat treatment of plant samples and microscopy. Heat stress was applied by incubating the samples for 45 min at 37 °C in a humidified box. Fluorescence was visualized in epidermal cell layers of the leaves 2-3 days after infiltration using confocal microscopy. A Leica TCS SP8 confocal laser-scanning microscope was used for visualization of EYFP and BiFC signals. Enhanced yellow fluorescent protein (EYFP) and VENUS were excited using a 488 and 514 nm laser. A Leica HC PL APO CS2 63×/1.20 (NA = 0.75) water objective was employed. BiFC images were acquired with a resolution of 1024 × 1024 pixels. Images for the analyses of the cellular localization were acquired with a resolution of 2632 × 2632 pixels for deconvolution with the SVI Huygens Essential software (Scientific Volume Imaging B.V., Hilversum, Netherlands) embedded in the Leica LAS X program (Leica Microsystems, Wetzlar, Germany) using the standard strategy. Images were processed and compiled using Photoshop CS4 (Adobe, San Jose, CA).
Western blot analysis. Proteins were extracted from 100 mg infiltrated leaf material with 250 µl SDS sample buffer (4% SDS, 20% glycerol, 5% β-mercaptoethanol, 100 mM Tris-HCl pH 6.8, 0.005% bromophenol blue) after grinding in liquid nitrogen. The extracts were then centrifuged at 13,000 rpm for 10 min and the supernatants were collected. Samples were resolved in a 4-20% Mini-PROTEAN® TGX™ Precast Protein Gel (Bio-Rad, Hercules, CA) and transferred to an Amersham™ Hybond™ P 0.45 PVDF membrane (GE Healthcare, Germany). PVDF membranes were blocked using 5% skim milk powder in TBS-T (0.05% Tween 20) and incu- [...]
We used laser microdissection to collect only infection sites with a minimum of non-infected surrounding tissue (CellCut laser microdissection microscope, MMI, Eching, Germany). Small pieces (~0.25 mm²) of infected and mock leaves were mounted onto membrane slides (Molecular Machines and Industries, Eching, Germany) and covered with 70% ethanol for 10 min for a quick fixation and then with 100% ethanol until the ethanol had evaporated from the slide. This step reduces the water content of the leaves and facilitates sectioning. RFP-labeled infection sites were precisely dissected, and an area of similar size was dissected from mock-inoculated plants, using the fluorescence module of the CellCut laser microdissection microscope. Three infection sites per plant were collected and pooled in the same 0.2 ml cap, and five replicates were obtained.
RNA isolation and qRT-PCR analysis. Total RNA from A. thaliana rosette leaves, stem, siliques, flowers, and roots was extracted using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. RNA extracts were treated with RNase-free DNase I and quantified with a NanoDrop2000 spectrophotometer. For cDNA synthesis, 1.5 µg of total RNA was transcribed with 300 U M-MLV Reverse Transcriptase using oligo(dT)12-18 primer (0.5 µM).
For each 20 µl qRT-PCR reaction, 1 µl of cDNA, 10 µl of KAPA SYBR® FAST qPCR Master Mix (2X), and primers (0.2 µM) specific for AtActin-2 (AT3G18780), AtG3BP-1 (AT5G60980), AtG3BP-2 (AT5G43960), AtG3BP-3 (AT3G25150), AtG3BP-4 (AT1G69250), AtG3BP-5 (AT1G13730), AtG3BP-6 (AT2G03640), and AtG3BP-7 (AT5G48650) were used. Reactions were run on a qTOWER³ G (Analytik Jena, Jena, Germany) and a Mastercycler® realplex (Eppendorf, Hamburg, Germany). Between 6 and 12 biological replicates, with 3 technical replicates each, were used in every experiment. Transcript levels and ratios were then calculated using the 2^(−ΔCt) or the 2^(−ΔΔCt) method, respectively, with AtActin-2 as the endogenous control. Statistical significance was assessed by a two-tailed t-test. The primers used for the detection of specific transcripts are listed in Supplementary Table S2.
Library preparation, characterization and sequencing. Individual libraries from mock and TuMV-RFP-infected plants were prepared according to the Smart-3SEQ protocol described previously 43 . The quality of the libraries was checked on an Agilent 2200 using the high-sensitivity DNA chip from Agilent. Libraries were sequenced on the NextSeq 500 using the NextSeq 500/550 High Output Kit v2.5, 75 cycles, from Illumina (Illumina Inc., San Diego, US).
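The 2^(−ΔCt) and 2^(−ΔΔCt) calculations named in the qRT-PCR methods can be sketched in Python as follows. This is a minimal illustration only: the function names and the Ct values are hypothetical, the reference gene stands in for AtActin-2, and averaging ΔCt across replicates before exponentiating is an assumption about the workflow rather than a description of the analysis actually performed.

```python
from statistics import mean


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Transcript level relative to the reference gene: 2^(-dCt)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


def fold_change(treated, control) -> float:
    """Expression ratio treated vs. control: 2^(-ddCt).

    `treated` and `control` are lists of (ct_target, ct_reference) pairs,
    one pair per replicate. Assumption: dCt values are averaged per group.
    """
    dct_treated = mean(t - r for t, r in treated)
    dct_control = mean(t - r for t, r in control)
    delta_delta_ct = dct_treated - dct_control
    return 2.0 ** -delta_delta_ct


# Hypothetical Ct values, chosen only to make the arithmetic easy to follow.
mock = [(26.0, 20.0)]      # dCt = 6
infected = [(24.0, 20.0)]  # dCt = 4, so ddCt = -2
print(relative_expression(26.0, 20.0))  # 2^-6
print(fold_change(infected, mock))      # 2^2
```

With these toy numbers the target transcript reaches the detection threshold two cycles earlier in the infected sample, which corresponds to a fourfold induction.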
Data preprocessing. After sequencing, reads were demultiplexed and converted to FASTQ with bcl2fastq (Illumina Inc., San Diego, US) with the adapters trimmed. UMI sequence, G-overhang, and A-tails were removed from the FASTQ data with the script umi_homopolymer.py as described in Ref. 43 . Reads shorter than 18 nt were removed with the tool Filter FASTQ 44 .
Read alignment, counting and analysis of gene expression. Final reads were mapped to the genome of A. thaliana Col-0 using Bowtie2 with default settings 45 [...] to count the reads with the tool featureCounts 46 . Differential gene expression between mock and TuMV-RFP-infected plants was calculated from the featureCounts output files with the DESeq2 tool 47 .
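The length filter from the data preprocessing step (removal of reads shorter than 18 nt) can be illustrated with a short Python sketch. This is not the Filter FASTQ tool used in the study; the function names and the toy reads are hypothetical, and the parser assumes plain four-line FASTQ records without wrapped sequence lines.

```python
from typing import Iterable, Iterator, Tuple

FastqRecord = Tuple[str, str, str, str]  # header, sequence, separator, quality


def parse_fastq(text: str) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records from a string (assumes unwrapped records)."""
    lines = text.splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain 4 lines per record")
    for i in range(0, len(lines), 4):
        header, seq, sep, qual = lines[i:i + 4]
        yield header, seq, sep, qual


def filter_min_length(records: Iterable[FastqRecord],
                      min_len: int = 18) -> Iterator[FastqRecord]:
    """Keep only reads whose sequence is at least `min_len` nt long."""
    for record in records:
        if len(record[1]) >= min_len:
            yield record


# Two hypothetical reads: one of 20 nt (kept) and one of 8 nt (removed).
toy_fastq = (
    "@read1\n" + "ACGT" * 5 + "\n+\n" + "I" * 20 + "\n"
    "@read2\n" + "ACGT" * 2 + "\n+\n" + "I" * 8 + "\n"
)
kept = list(filter_min_length(parse_fastq(toy_fastq)))
print([header for header, _, _, _ in kept])
```

Only the 20-nt read passes the filter in this toy input.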
Phylogenetic analysis. AtG3BP protein and coding sequences were retrieved from the TAIR10 genome annotation data set (http://www.arabidopsis.org/). Multiple sequence alignments were carried out in MEGA X software using the CLUSTAL W algorithm 48 . A BLOSUM62 matrix was used with a gap opening penalty of 10 and a gap extension penalty of 0.2. All phylogenetic trees were constructed with the Neighbor Joining (NJ) method 49 with 1000 bootstrap replicates 50 . Evolutionary distances were computed using the Poisson correction method 51 and are in units of the number of amino acid substitutions per site. Protein domains were identified using ExPASy PROSITE (https://prosite.expasy.org/) 52 . Geneious Prime® 2020.1.2 (https://www.geneious.com) was used to visualize alignments and protein domains.
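For orientation, the Poisson correction converts the observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = −ln(1 − p). The following Python sketch shows this conversion; the function name and the example counts are hypothetical, and it is a standalone illustration rather than the MEGA X implementation used here.

```python
import math


def poisson_distance(n_different: int, n_sites: int) -> float:
    """Poisson-corrected distance d = -ln(1 - p), with p = n_different / n_sites.

    Assumption: `n_sites` counts only the aligned positions that are compared.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    if not 0 <= n_different < n_sites:
        raise ValueError("need 0 <= n_different < n_sites so that p < 1")
    p = n_different / n_sites
    return -math.log(1.0 - p)


# Hypothetical example: 50 differing positions out of 100 compared sites.
print(poisson_distance(50, 100))  # equals ln(2)
```

The corrected distance always exceeds the raw proportion p for p > 0, because it accounts for multiple substitutions at the same site.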