Comprehensive molecular dissection of TIFY Transcription factors reveal their dynamic responses to biotic and abiotic stress in wheat (Triticum aestivum L.)

The plant specific TIFY (previously known as ZIM) transcription factor (TF) family plays crucial roles in cross talk between Jasmonic Acid and other phytohormones like gibberellins, salicylic acid, abscisic acid, auxin, and ethylene signaling pathways. Wheat yield is severely affected by rust diseases and many abiotic stresses, where different phytohormone signaling pathways are involved. TIFYs have been studied in many plants yet reports describing their molecular structure and function in wheat are lacking. In the present study, we have identified 23 novel TIFY genes in wheat genome using in silico approaches. The identified proteins were characterized based on their conserved domains and phylogenetically classified into nine subfamilies. Chromosomal localization of the identified TIFY genes showed arbitrary distribution. Forty cis-acting elements including phytohormone, stress and light receptive elements were detected in the upstream regions of TIFY genes. Seventeen wheat microRNAs targeted the identified wheat TIFY genes. Gene ontological studies revealed their major contribution in defense response and phytohormone signaling. Secondary structure of TIFY proteins displayed the characteristic alpha–alpha–beta fold. Synteny analyses indicated all wheat TIFY genes had orthologous sequences in sorghum, rice, maize, barley and Brachypodium indicating presence of similar TIFY domains in monocot plants. Six TIFY genes had been cloned from wheat genomic and cDNA. Sequence characterization revealed similar characteristics as the in silico identified novel TIFY genes. Tertiary structures predicted the active sites in these proteins to play critical roles in DNA binding. Expression profiling of TIFY genes showed their contribution during incompatible and compatible leaf rust infestation. TIFY genes were also highly expressed during the initial hours of phytohormone induced stress. This study furnishes fundamental information on characterization and putative functions of TIFY genes in wheat.


Results
Identification and nomenclature of TIFY TF gene family in wheat. In order to generate a robust dataset of TIFY in wheat, a total of 45 completely annotated TIFY protein sequences of Oryza sativa (21), Sorghum bicolor (06) and Arabidopsis thaliana (18), were retrieved from Plant TFDB (Accession IDs provided in Supplementary Table 1). No wheat TIFY sequence was present in these databases. BLASTN with wheat genomic sequences provided 92 matches. These sequences were inspected for the presence of conserved TIFY (QLTI-FYGGR) domain that provided 37 sequences. Of them, 23 sequences displayed complete ORF and were designated as putative novel TIFY genes in wheat. This also suggests that about four fifth, i.e., 103 out of initially identified 126 TaTIFY genes became non-operational in wheat, most likely due to transposon activity mediated genome rearrangements, which is very common in wheat 25 . The incomplete pseudo-genes were excluded (pipeline mentioned in Supplementary Figure S1). These novel genes were assigned unique identifier as TaTIFY (Triticum aestivum TIFY) followed by alpha numeric credential from TaTIFY1 to TaTIFY23 (Table 1). TaTIFY8 is the largest protein having 235 amino acids while TaTIFY 19 is the smallest having 57 amino acids. Five genes have their coding sequence in antisense orientation (Table 1). To further confirm, Pfam web server was used to examine the conserved domains. Twenty-one of these proteins have a TIFY domain with CCT2 motif (Jas) while TaTIFY4 and TaTIFY19 have only the TIFY domain (Fig. 1).
Two types of Nuclear localization signals (NLS) had been identified: monopartite (single stretch of basic amino acids) and bipartite (two stretches of basic amino acids) (Supplementary Table S6). Two proteins (TaTIFY5 and 11) had monopartite NLS, 20 proteins had bipartite NLS whereas one protein, TaTIFY10 did not contain any NLS.
Identification of conserved motifs, secondary structure prediction and sub cellular localization. Ten most conserved motifs were identified in TaTIFY proteins (Supplementary Table S7 Secondary structure of TaTIFY proteins was predicted using a feed-forward neural network which performs analysis on outputs obtained from PSI-BLAST. The number and positions of helices and strands present in all TaTIFY proteins were determined. All proteins formed an alpha-alpha-beta fold, the characteristic of TIFY family (Supplementary Figure S4).
MEMSAT tool on PSIPRED server predicted trans-membrane structure and topology: eight TaTIFY proteins had pore lining helices and their respective positions are shown in Supplementary Figure S5. Proteins disordered state was predicted in all 23 proteins using DISOPRED2 tool (Supplementary Figure S6). Disorder is functionally important to be associated with recognition and binding.

Tertiary structure prediction and validation.
For better characterization of TaTIFY proteins, the translated sequences were subjected to ab initio modeling and threading using I-TASSER server. The server generated five 3D atomic models for threading and iterative structural assembly simulations. The 3D models were predicted by ten threading templates with PDB hits. The most apt secondary structure was chosen based on maximum C-score (confidence score), TM-score (topological similarity score), RMSD (Root Mean Square Deviation), cluster density and was subjected to internal evaluation of self-consistency tests. Tertiary structure of TaTIFY1 is provided in Fig. 3, and rest in Supplementary Figure S7.
The stereo-chemical quality and reliability of predicted TaTIFY models were validated by subjecting the PDB files to PROCHECK and PDBsum server (TaTIFY1 is provided in Fig. 3B-E, and rest in Supplementary Figure S8). The Ramachandran plot statistics depicted > 90% residues in most favorable combination of phi/ psi values assuring high quality attributes of TaTIFY protein structures. The ProSA server displayed graphs containing z-scores of negative values of the models with comparable protein structures in PDB indicating the reliability of the structures. The binding sites of the proteins were predicted by DoGSiteScorer. The first two binding pockets were considered as prominent active sites (TaTIFY 1 in Fig Gene ontology and enrichment analysis. Blast2GO analysis classified TaTIFY genes into three main categories: biological process, molecular function and cellular component (Fig. 4). The biological processes associated GO terms were regulation of transcription (21 terms), ubiquitin dependent protein catalytic process (1),  (2), SA mediated signaling pathway (2), response to wounding (4), response to water deprivation (2), response to ethylene stimulus (2), negative regulation of defense response (2), JA mediated signaling pathway (4), JA biosynthesis process (4), hyperosmotic salinity response (2), flower development (2), defense response to fungus (2), defense response to bacteria (2), ABA mediated signaling pathway (2) etc. (Fig. 4A). The molecular function category included binding (23 terms: to DNA, protein, ATP, ADP), catalytic activity (6 terms), ubiquitin thiol esterase activity (15 terms), cysteine type peptidase activity (1 term) (  Table S10). Syntenic analysis revealed all TaTIFY genes had orthologous sequences in sorghum, rice, maize, barley and Brachypodium indicating presence of similar TIFY domains in monocots (Fig. 6). Comparative synteny indicated TIFY sequences shared 55-96% identity.
Considering that orthologs with more than 55% identity often retain equivalent functions during evolution, we examined the orthologous relationship between TaTIFY genes with the other monocots using Circos software. TaTIFY genes (1, 6, 18 and 20) located on chromosome 2DS in wheat were found to be located on chromosome 1 of Brachypodium, 7 of rice, 2 of barley, chromosomes 1 and 2 of sorghum, and chromosome 1, 2 and 7 of maize respectively. The crucial point is that all four TIFY genes are located on a single chromosome of rice, barley and Brachypodium while on more than one chromosome of sorghum and maize. Thus, it can be inferred that these four TaTIFY genes are highly conserved in rice, barley and Brachypodium in comparison to the other two species. TaTIFY genes (2, 10, 13, 16 and 19) located on chromosome 2BL in wheat were distributed on chromosome 2 of barley and different chromosomes of the other monocot species (Fig. 6, Supplementary Table S11). Detailed synteny between Oryza sativa japonica and Triticum aestivum showed highly conserved TIFY gene distribution (Fig. 7).
Molecular cloning, sequencing, and sequence characterization. Representative genomic and cDNAs were cloned from different clades of TaTIFY genes (Supplementary Figure S11) using primer pairs (Supplementary Table S12) and sequences were submitted to NCBI (Table 2). Pairwise alignment revealed intronic region in four TIFY genes (TaTIFY3, 19, 20, 23; Supplementary Figure S12). The physico-chemical characteriza-    Table S14). Bipartite NLS were identified in all proteins whereas monopartite NLS were found in four proteins (Supplementary Table S15). Secondary structure of these proteins exhibited the number and positions of helices and strands (Supplementary Figure S13, Supplementary Table S16). TaTIFY5 had the highest number of α-helices (6) while TaTIFY3 had five β-strands. All proteins had disordered state implicating roles in protein-DNA interactions (Supplementary Figure S14). Pore lining helices were identified in TaTIFY3 and TaTIFY5 (Supplementary Figure S15). Blast2GO revealed GO terms associated with biological process like metabolism, cellular process, response to stimulus, development, localization, multicellular organismal process, while molecular functions were related to catalytic activity and binding, specifically to nucleic acids (Supplementary Figure S16). Genome wide syntenic relationships of these TFs with other monocots (Rice, Sorghum, Maize, Barley and Brachypodium) displayed orthologous relationships (Supplementary Figure S17). The analysis showed high conservation of TaTIFY23 among wheat, maize and Brachypodium while TaTIFY3 between wheat and barley indicating maximum orthology between wheat and barley.
Protein-protein interactions among TIFY with other proteins were studied using STRING database functional links 33 . Since wheat genome sequence became only recently available in the public domain, it is not present in this database. Therefore, we functionally annotated these proteins in rice, sorghum, Brachypodium and maize. The results were displayed as interaction networks of TIFY with their functional partners in a variety of functions like binding to different DNA binding domains, activation of transcription factors and post translation modifications (Supplementary Figure S18). Some important interacting proteins included JAZ 11 or TIFY 3A of Arabidopsis, which are repressors of JA responses, ZIM3 of Zea mays, MYC4-like protein, bHLH protein etc. They were specifically involved in activation of GID1 (GA Insensitive Dwarf1), a gibberellin receptor. Functional partners for binding included B3 DNA binding domain, bZIP TF, helix loop helix DNA binding domain which were mainly involved in inactivation of GID2. Gene occurrence studies revealed the organisms in which these proteins were highly conserved since functional partners often have similar occurrence pattern. These proteins were highly conserved in Streptophyta which included green plants comprising charophyceae (streptophyta green algae) and embryophyta (land plants) (Supplementary Figure S19).

Study of expression of TIFY family TFs during biotic stress and abiotic stress. QRT-PCR studies
revealed the potential roles of TIFY genes during compatible and incompatible interactions involving the leaf rust pathogen. The time-points specific for development of leaf rust infection structures 34 produced by pathogen were correlated with TIFY expression patterns. The spatial and temporal expression patterns were compared between mock and pathogen inoculated susceptible and resistant NILs. Highly increased expression was observed in the resistant NIL after infection relative to susceptible NIL and mock-inoculated controls (Fig. 8).
Reduced expression of most TaTIFY genes was observed in both NILs just before infection (0 hpi). During incompatible interaction, expression was highly induced from 6 hpi onwards. The period of maximum expression at 12  The difference in expression pattern between pathogen and mock inoculated resistant and susceptible NILs was determined using Wilcoxon signed rank test, a non-parametric distribution free test used to illustrate null hypothesis that revealed significant differences (Supplementary Figure S20; Supplementary Table S17). The significance of difference in expression of TIFY genes were evaluated using ∆Ct values at p ≥ 0.01 (Supplementary Tables S18A-F). The results revealed significant differences in expression between mock and pathogen inoculated susceptible and resistant cultivars.
QRT-PCR was also performed for TIFY genes under different abiotic stresses ( Fig. 9; Supplementary Table S19). Most TIFY genes were highly induced at two and four hours post phytohormone treatment, highest being for MJ (198-fold) and JA (189-fold) for TIFY9 followed by JA (176-fold) and MJ (129-fold) for TIFY20. Their expression decreased after 12 h. It can be inferred that different TIFY genes respond differently to different phytohormones and the response was mediated through JA pathway since maximum expression was observed during JA and MJ treatment followed by SA.   Tables S20A-F). The correlation between different phytohormones used in this study was established using Spearman's correlation rank-order, a non-parametric version of the Pearson correlation. The correlation coefficient (ρ, also signified by r s ) measures the strength of association between two ranked variables accepting values from + 1 to − 1. A r s of + 1 indicates a perfect association of ranks, a r s of zero indicates no association between ranks and a r s of − 1 indicates a perfect negative association of ranks. Highest correlation was found in TIFY5 (ABA and MJ, ABA and JA and MJ and JA); TIFY3 also showed high correlation (Supplementary Table S19). The distinct spatio-temporal expression pattern of TIFY genes during phytohormone treatments demonstrates their positive roles during abiotic stress in wheat plants.

Discussion
Knowledge of definite TF repertoires in specific plant species provides insights into the diverse roles it contributes to the plant. Besides TFs, plant hormones are also involved in many aspects of plant growth, development, and response to environmental cues. As a result, cross talks between signaling pathways of various phytohormones and TFs balance development and stress responses in plants. The TIFY families of TF proteins play pivotal roles in various developmental and physiological processes as well as responses to biotic and abiotic stress conditions in plants.
In this study, a total of 92 TIFY genes had been initially identified, of which 23 genes had complete ORF containing the TIFY domain. Since limited resources of TIFY sequences are available at different TFDBs, in silico data mining represented an effective approach to identify and predict putative functions of TIFY TF families in wheat. After careful comparison at Pfam database together with TIFY from other plants we found that CCT2 and TIFY domains were highly conserved in these proteins. Based on this observation, TaTIFY TFs were classified into two sub-families JAZ and TIFY. An N-terminal Jas (CCT2) domain characterizes JAZ proteins. This domain interacts with MYC2, a bHLH transcription factor, to inhibit JA response together with TIFY domain 6 . TaTIFY genes were classified into nine different sub-families which were involved in stress responses in rice and Arabidopsis. As genes with sequence similarities generally have similar functions across different species, so the identified TIFY genes could be predicted to have similar functions in wheat also. Thus, construction of the phylogenetic tree of TIFY genes among different species provided plausible functions of the newly identified TIFY TF proteins in wheat.
Chromosomal synteny is important in genome comparison to reveal genomic evolution of related species 35 . Although it cannot be proclaimed with high accuracy, certainly gene duplications had played important roles leading to series of genomic rearrangements and expansion of this gene family among the different monocot species as manifested by synteny analysis. Shared synteny denotes the genomic fragments in different species had originated from identical ancestor 36 . With well-defined synteny relationships between well studied genomes (such as Oryza sativa, Brachypodium distachyon, Hordeum vulgare, Zea mays and Sorghum bicolor) and newly sequenced genomes like T. aestivum provide knowledge on differentiation of the less-studied gene family like TIFY after species divergence. More importantly, it helps improve the gene annotation of newly sequenced genomes.
We propose a model depicting the function of TIFY TFs in wheat built on P. triticina induced biotic stress response analysis (Supplementary Figure S21A). Based on our earlier study 37 and published literature on histopathological data 34 we correlated the expression of TIFY genes during pathogen infestation. The pathogen induces mechano-sensory reactions when encounters leaf surfaces, thereby activating local defense responses that stimulate systemic responses, like early responsive genes that in turn activate different defense and stress responsive genes. The outcome is stimulation of downstream responses in terms of increased inhibition of fungal growth, disease resistance and development of hypersensitive responses depending upon the nature and function of upor down-regulated genes and their cross talk between infested and neighboring cells.
Various plant growth regulatory phytohormones are also involved in plant responses to biotic and abiotic stresses. The phytohormone network has the regulatory potential that allows plants to quickly react to changes environment, thereby modulate plants to efficiently use available nutrient resources. In nature, plants being sessile, are exposed to many biotic and abiotic stresses. Hormone homeostasis is critical in the establishment of appropriate and effective defensive responses in plants against natural attackers and abiotic stresses (Supplementary Figure S21B). The interaction between these two types of environmental stresses requires a complex adaptive molecular response involving many factors 38 . The significant differences between the NILs regarding Table 2. Accession numbers of cDNA and genomic DNA of TIFY sequences submitted at NCBI.

Sl no.
Gene name Subgroup NCBI Accession numbers cDNA Genomic DNA   1  TaTIFY19  TIFY3  MH277603 MH151794   2  TaTIFY20  TIFY10A  MH161186 MH109179   3  TaTIFY9  TIFY10C  MH211595 MF059098   4  TaTIFY3  TIFY11A  MH161187 MH124204   5  TaTIFY5  TIFY11B  MH290739 MH136799   6  TaTIFY23  TIFY11E  MH151793    It is well established that SA, JA and ET signaling is involved in the regulation of plant-pathogen interactions. SA-mediated plant responses generally govern biotrophic pathogens while JA and ET regulate plant responses to necrotrophic pathogens. Whereas ABA-mediated plant responses are commonly associated with abiotic stresses. Various cross talks between Arabidopsis and its necrotrophic fungal pathogen Botrytis cinera revealed the involvement of eliciting molecules like cyclopentenones to modulate several stress-responsive transcription factors including WRKY33 39 and RAP2.4 40 . Since TIFY TFs administer regulation of a variety of phytohormones involved in biotic and abiotic stresses, this study uncovered the dynamic roles of TIFY genes.
In conclusion, TIFY TFs proteins regulate a wide range of biological processes, with their most distinctive role in plant defense response. To the best of our knowledge, this is the first study of structural and functional attributes of TIFY TF genes in wheat. Generally, TFs within same groups show recent common evolutionary relationships and conserved specific motifs that are related to similar molecular functions 41 . The close relationship between wheat, rice and Arabidopsis permitted the identification of highly homologous TIFY genes that allowed predicting their probable functions in wheat. This study provides useful information for future studies of TIFY TFs in regulation of wheat growth and development as well as for their incorporation into wheat breeding programs.
To identify presence of additional conserved motifs, Multiple Expectation Maximization for Motif Elicitation (MEME) version 5.1.1 was used (http:// meme-suite. org/). The parameters used were minimum width 3, maximum width 50 and the maximum number of motifs to identify any number of repetitions. The subcellular localization and secondary structures of these sequences were predicted by WoLF PSORT (http:// wolfp sort. org/) and PSIPRED Protein Analysis Workbench (http:// bioinf. cs. ucl. ac. uk/) respectively. Transmembrane domains and pore lining helices were identified using MEMSAT-SVM tool at PSIPRED that utilizes support vector machine algorithm.
Cis-regulatory elements present in the 2.0 Kb upstream regions of all the identified TaTIFY genes were investigated using PLACE database (http:// www. dna. affrc. go. jp/ htdocs/ PLACE/). Further, all identified TIFY genes were searched as target genes for conserved and novel wheat microRNAs available at miRBase, release 21 with psRNATarget tool (http:// plant gm. noble. org/ psRNA Target/) using an E-value cut off 3 for filtering microRNAs because low E-value specifies high resemblance between small RNAs and target genes.
Tertiary structure prediction and validation. Tertiary structures were predicted at I-TASSER 46 server based on ab initio methods for structure prediction. Five different models for each TaTIFY protein were generated and the model showing overall best stereo-chemical quality was selected for further assessment. The SAVES (http:// servi ces. mbi. ucla. edu/ SAVES/) server was used to examine stereo-chemical properties of protein structure and ProSA-web server (http:// prosa. servi ces. came. sbg. ac. at/ prosa. php) to calculate overall quality score to select the most reliable model. The 3D models were subjected to PyMOL Molecular Graphics System (http:// pymol. org/ ep) to obtain the final structures. The PDB files of the modeled TaTIFY proteins were subjected to PDBsum server (http:// www. ebi. ac. uk/ thorn tonsrv/ datab ases/ pdbsum/ Gener ate. html) for structural motif analysis and to ProSA-web server for obtaining the reliable values for the model generated. The final models  Table S19). The amplified products were cloned and plasmids from five independent clones, obtained with each primer set, were sequenced from both ends commercially. Total RNA was extracted using TriReagent (Sigma-Aldrich) and 5 µg was reverse-transcribed to cDNA using Transcriptor First Strand cDNA Synthesis Kit (Roche Diagnostic, Indianapolis, USA). PCR, cloning and sequencing was performed as mentioned earlier and each consensus sequence was analyzed 50,51 . The nucleotide and deduced amino acid sequences were characterized using the same bioinformatics tools as mentioned earlier.

Response of TIFY TF to biotic and abiotic stresses. Plant and pathogen material, stress treatment of plants and quantitative Real Time PCR.
Wheat near-isogenic lines (NILs) HD2329 (seedling leaf rust susceptible, infection type 3+) and HD2329 + Lr28 (seedling leaf rust resistant, nest immune, infection type 0-0) were used. The Lr28 gene, effective against all pathotypes of the pathogen in India, was derived from Aegilops speltoides (Teusch) and introgressed on the long arm of chromosome 4AL 52 . Puccinia triticina pathotype 77-5, the most predominant and devastating pathotype in all parts of the Indian subcontinent, was selected as experimental pathogen. Experiments were conducted at the National Phytotron Facility, IARI, New Delhi and seedlings of the isogenic lines were inoculated with either urediniospores of race 77-5 mixed with talc (1:1) or mock inoculated with talc. Leaf samples were collected at different time points for RNA isolation: 0 h (i.e. just before inoculation) and at 6, 12, 24, 48, 72-and 168-h post inoculation (hpi) from five independently treated susceptible and resistant NILs. The collected samples (three leaves from each plant) were immediately frozen in liquid Nitrogen and used for RNA isolation. Susceptible mock inoculated was named S-M, Susceptible Pathogen inoculated as S-PI, Resistant mock inoculated R-M and Resistant Pathogen inoculated as R-PI. The wheat cultivar Chinese Spring was used for the abiotic stress study. Four different phytohormones [SA, ABA, Methyl Jasmonate (MJ) and Jasmonic Acid (JA)] at concentration of 5 mM were used for abiotic stress treatments. The samples were divided in two groups: control: 4 plants and treated: 7 plants for each treatment. Treatments were performed in duplicates. Seeds were germinated on autoclaved composite soil containing peat, sand and soil (2:1:1 ratio), grown to three leaf stage (~ 14 days after germination) at the Green House facility, BIT, Mesra, Ranchi under ideal conditions (temperature 20 °C, relative humidity 80%, 14 h light at 100 µmol m −2 s −1 and 10 h of darkness). The pots were watered well regularly. The test seedlings were sprayed with respective phytohormones, while control plants were sprayed with only milli-Q water having no phytohormone. Leaf samples for RNA isolation were collected at 0, 1, 2, 4, 8, 12 and 24 h post phytohormone application. The collected samples (five leaves from each set) were immediately used for RNA isolation.
Primers used for qRT-PCR are mentioned in Supplementary Table S21. Samples were run in two biological and three technical replicates, and the experiment was repeated once again to check reproducibility. Wheat Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH; GenBank Accession No. AF521191), was chosen as the endogenous control for normalization of input RNA differences and examine efficiencies of reverse transcription among the samples 53 and gene expression levels were computed using the 2 −∆∆Ct method 54 .
Statistical analysis. The qRT-PCR data were statistically analyzed using Wilcoxon signed rank sum test to compare expression data from infected and mock-inoculated materials while, Spearman's correlation was used to compare expression data from phytohormone treated and control plants.

Data availability
All the processed sequences were deposited in the National Center for Biotechnology Information (NCBI) and can be accessed as mentioned in Table 2.