Analysis of the TCP genes expressed in the inflorescence of the orchid Orchis italica

TCP proteins are plant-specific transcription factors involved in many different processes. Because of their involvement in a large number of developmental pathways, their roles have been investigated in various plant species. However, there are almost no studies of this transcription factor family in orchids. Based on the available transcriptome of the inflorescence of the orchid Orchis italica, in the present study we identified 12 transcripts encoding TCP proteins. The phylogenetic analysis showed that they belong to different TCP classes (I and II) and groups (PCF, CIN and CYC/TB1), and that they display a number of conserved motifs when compared with the TCPs of Arabidopsis and Oryza. The presence of a specific cleavage site for the microRNA miRNA319, an important post-transcriptional regulator of several TCP genes in other species, was demonstrated for one transcript of O. italica, and the analysis of the expression pattern of the TCP transcripts in different inflorescence organs and in leaf tissue suggests that some TCP transcripts of O. italica exert their role only in specific tissues, while others may play multiple roles in different tissues. In addition, the evolutionary analysis showed a general purifying selection acting on the coding region of these transcripts.


Results and Discussion
Transcripts encoding TCP proteins in the inflorescence of O. italica. To identify transcripts encoding TCP proteins, the inflorescence transcriptome of O. italica 46 was screened through a standalone TBLASTN search using as query the sequences of the TCP DNA-binding domain of A. thaliana and O. sativa. Eleven not-redundant transcripts (ranging from 850 to 2788 bp) present in the inflorescence transcriptome of O. italica contain a region encoding the TCP domain (Supplementary Data S1 and Figure S1) flanked downstream by at least 500 bp of additional sequence. This indicates that at least eleven distinct TCP genes are expressed in the inflorescence of O. italica, increasing the small number of known TCP genes of orchids (five in Phalaenopsis hybrid cultivar and one in Dendrobium hybrid cultivar) 35 . Surprisingly, a first phylogenetic analysis of these virtually translated sequences of O. italica and the TCP proteins of Arabidopsis thaliana and Oryza sativa revealed that none of the TCP transcripts present in the inflorescence transcriptome of O. italica belong to the class II CYC/TB1-like group, whose members are involved in the establishment of bilateral symmetry in numerous plant species 26,28,[48][49][50][51] . The recent release of the assembled genome of the orchid species Phalaenopsis equestris 42 provided for the first time the opportunity to check for the presence of TCP genes belonging to the class II CYC/TB1-like group within the genome of an orchid species. Using the TCP transcripts of O. italica as queries, standalone BLASTN and BLASTX searches of the assembled genome of P. equestris were conducted. In addition, the annotated sequences of the genome of P. equestris were checked to verify the presence of additional TCP genes. Excluding three sequences with an incomplete TCP domain, in the genome of P. equestris there are 17 TCP-encoding genes and among them three belong to the CYC/TB1-like group (Supplementary Table  S1). To check for the presence of CYC/TB1-like sequences in O. italica, degenerate primers based on the CYC/TB1-like sequences of P. equestris were used to amplify the genomic DNA of O. italica. The single fragment of 378 bp obtained from this PCR amplification (named OitaTB1 and deposited in GenBank with the accession number KR858306) encodes the region from the TCP domain to the R domain. The online TBLAST search confirmed its homology with CYC/TB1-like genes. As expected, the attempts to obtain the full-length cDNA of this sequence starting from the RNA extracted from an inflorescence of O. italica failed. In fact, the absence of an assembled transcript in the inflorescence transcriptome of O. italica corresponding to OitaTB1 suggests that this gene is not expressed or it is expressed at very low levels and/or only in specific tissues of the inflorescence of O. italica at the stage of development examined. Further attempts to amplify the full-length cDNA of OitaTB1 starting from RNA extracted from various tissues of the inflorescence or from leaf tissue did not give positive results. However, the OitaTB1 sequence was used for all the subsequent analyses. The generally dispersed positions of the different TCPs of O. italica among the branches of the tree confirm the hypothesis of the expansion of the TCP family before the divergence of the lineages examined, as observed in some dicot species [37][38][39] . The robustness of the NJ tree was confirmed by the topology of the Minimum Evolution and the Maximum Likelihood trees (Supplementary Figure S2 and S3, respectively), both showing branch patterns in agreement with the one obtained from the NJ analysis. In addition, the  Table S1). The bootstrap percentages >50% are shown next to the branches. The red asterisks indicate the sequences of O. italica. The graph of the conserved domains and the corresponding legend were obtained from the MEME search using the full length of the TCP proteins. The scale below the conserved domains indicates the amino acid position. The amino acid sequences of the TCP proteins of O. italica, A. thaliana, and O. sativa were scanned to search for conserved shared domains using the motif-based sequence analysis tool MEME, and the results are shown in Fig. 1 and Supplementary Figure S6. The pattern distribution of the shared motifs is in general agreement with the phylogeny inferred from the alignment of the TCP domain. All the sequences share the TCP domain, included in Motif 1 and 2. The R domain (Motif 5) is shared by 9 sequences belonging to the class II TCP family, both CYC/TB1-like and CIN-like. Among them, two are sequences of O. italica, one CIN-like (comp5062) and one CYC/TB1-like (OitaTB1). Another interesting motif shared by 12 sequences of the CIN-like group (Motif 13) corresponds to the amino acid stretch encoded by the target site of the microRNA miR319 [12][13][14][15] . This motif is present in three sequences of O. italica (comp5062, comp1326 and comp16313). Many other motifs are restricted to specific sub-groups of the tree, suggesting that they might contribute to the specific function of the different TCP proteins. For example, Motif 9 is present in OsTCP6, AtTCP20 and in two sequences of O. italica, comp16641 and comp24776; Motif 8 is shared by AtTCP3, AtTCP4, OsPCF5, OsPCF7 and comp1326. Scanning the identified consensus motifs against the database of protein domains PROSITE 52 resulted in positive hits with the TCP domain or with other regions of the TCP proteins with unknown functions, or in no positive matches. In conclusion, excluding the TCP domain, the functions of the other conserved motifs identified are still unknown.
The estimates of the non-synonymous (Ka) and synonymous (Ks) substitution rates was performed on the putative orthologous TCP nucleotide sequences of O. italica and P. equestris to evaluate the evolutionary constraints acting on these sequences. The Ka/Ks ratio equal to 1 indicates neutral selection, while values lower or higher than 1 reveal purifying or positive selection, respectively 53 . The search for the best reciprocal BLASTP hits between the TCP sequences of O. italica and P. equestris gave significant scores for 10 pairs of sequences. The results of the evolutionary analysis are summarized in Table 1. Among the sequences examined, the mean Ka/Ks value is 0.222, and the pair-wise comparisons reveal that a strong purifying selection is acting on the nucleotide sequences encoding the TCP domain in O. italica and P. equestris. However, the comparison between the sequence comp16641 of O. italica and PEQU28429 of P. equestris resulted in a Ka/Ks value much higher than all the other comparisons and higher than 1, even though the p value was not significant. As shown in Fig. 1 and reported in Supplementary Table S1, the transcripts comp16641 of O. italica and PEQU28429 of P. equestris are related to the TCP factor AtTCP20 of A. thaliana. AtTCP20 belongs to the class I PCF-like group. It is expressed in different organs and developmental stages and is involved in cell division and elongation, which must be coordinated for normal plant development and differentiation 54 . The Ka/Ks value detected between the transcripts comp16641 of O. italica and PEQU28429 of P. equestris suggests a possible relaxation of the selective constraints acting on these sequences in orchids, which might drive the evolution of morphological innovations.

MicroRNA target sites and expression analysis.
To check for the presence of target sites of specific microRNAs (miRNAs) expressed in the inflorescence, the selected transcripts of O. italica encoding TCP-like proteins were scanned with the psRNATarget online tool 55 (Fig. 1). All of these transcripts have been described as targets of miR319 and are involved in different developmental processes, from petal and leaf formation to senescence and response to different stresses [12][13][14][15][16] . The validation of the miRNA cleavage was performed by conducting a modified 5′ -RACE experiment. Cleavage was confirmed for the transcript comp5062 (Fig. 2B,C), whereas for the transcripts comp1326 and comp16316, no cleavage fragment of the expected size was detected. The apparent absence of the cleaving activity of miR319 on comp1326 and 16313 might be due to the sequence divergence in the upstream and downstream regions surrounding the predicted cleavage site.
The analysis of the expression patterns is useful to hypothesize possible gene functions and overlapping expression profiles indicate probable functional redundancy. The expression analysis of miR319 and of its putative target transcripts in different tissues of the inflorescence of O. italica at two developmental stages (Fig. 3C) was evaluated by real time PCR. The early stage corresponds to floral buds with a diameter of approximately 9 mm (Fig. 3A) and the late stage to completely opened flowers after anthesis (Fig. 3B). In both stages, cell division is completed, and the flower organs are completely formed; however, in the early stage, cells are still elongating. Figure 4 shows the relative expression of the transcripts comp5062, comp1326, comp16316 and miR319 in the outer and inner tepal, lip, column, unpollinated ovary and in leaf tissue. The expression patterns of these three transcripts in the tissues of O. italica at the two different stages examined are similar but with different expression levels. In the inflorescence, they are mainly expressed in the organs of the perianth (tepals and lip) with some difference between the early and late stage. In the column and ovary, the expression of the three transcripts is generally lower than in the other tissues, including leaves. In contrast, the expression pattern of miR319 shows the opposite trend, with the highest expression levels observed in the column and ovary. The observed values of the relative expression of miR319 are negatively correlated with those of its three putative targets; however, the only statistically significant Pearson correlation coefficient (r = -0.61, p < 0.05) is relative to the transcript comp5062. This result, together with the failure to detect specific fragments cleaved by miR319 for the transcripts comp1326 and comp16316, suggests the existence of different regulators of the expression of these two TCP transcripts in the tissues of O. italica at the stages examined. Our results demonstrate for the first time the activity of miR319 on a TCP target in the floral tissues of a monocot species. In the model dicot species Arabidopsis, miR319 has been demonstrated to have a critical role in the development of petal and stamens, through the cleavage of a CIN-like TCP target 15 . The differences The different tissues of O. italica were examined to evaluate the expression pattern of the other identified transcripts that encode TCP proteins. The results reported in Fig. 5 show distinct expression profiles for some transcripts, suggesting their different roles in the various tissues and stages examined, while other transcripts share similar patterns. In particular, the transcript comp21881 is expressed almost exclusively in the ovary tissue and the transcript comp21123 in the column at the late stage, suggesting that they might have a specific function in these tissues. The transcript comp21881 is related to the gene AtTCP15 of Arabidopsis (Fig. 1), involved in many developmental pathways including gynoecium development 56,57 . The expression profile observed in O. italica suggests its functional conservation, at least for its possible role in the development of the female reproductive structures. The transcripts comp12442 and 13386, belonging to the same branch of the NJ tree, display generally overlapping profiles, being expressed in all the tissues examined, including leaves, with some differences in the ovary tissue. This suggests pleiotropic and redundant functions of these two transcripts in the tissues examined of O. italica, in agreement with the diffuse functional redundancy of the class I TCPs. The transcripts comp8378  The asterisks indicate statistically significant differences between the relative expression of the early and the late stages (*p < 0.05, **p < 0.01). Te_out, outer tepals; Te_inn, inner tepals; Lip, labellum; Co, column; Ov, ovary; Le, leaf. Rn, relative normalized expression. and 8964 belong to the same branch of the NJ tree, and they display similar patterns at the early stage. In the late stage, differences are detectable in tepals and ovary, where the levels of comp8378 significantly decrease, whereas the levels of comp8964 increase markedly in column and ovary. These two transcripts are related to the genes AtTCP21 and AtTCP7 (Fig. 1). In Arabidopsis these two genes show not fully overlapping expression profiles and functions. AtTCP7 is mainly involved in leaf development and AtTCP21 is a component of the circadian clock 58,59 . The expression pattern of comp8378 and 8964 indicates their possible functional diversification also in O. italica. Despite the fact that the transcripts comp16641 and 24776 are included in the same branch of the NJ tree, their expression profiles are very different. The first is highly expressed in all the floral tissues at the late stage and in leaves; the latter is very weakly expressed in all the tissues examined, suggesting that its functions are performed in different organs and/or developmental stages. The transcript comp16641 shows the strongest variation of the expression profile between the two stages examined and, as previously discussed, it shows relaxation of the selective constraints acting on its coding region. These findings let hypothesize a possible sub-or neo-functionalization of comp16641 and suggest to investigate in more detail the possible role of this gene in the development and maintenance of floral structures in orchids. Finally, OitaTB1 is expressed at a very low level in the inflorescence tissues and at a slightly higher level in leaves. OitaTB1 seems to be phylogenetically close to OsTB1 of Oryza sativa (Fig. 1, Supplementary Figure S2 and S5), a negative regulator of lateral branching 60 . It would be of great interest to ascertain whether OitaTB1 is also involved in this developmental process in O. italica and in orchids in general.

Conclusions
This study has led to the identification of 12 transcripts encoding TCP proteins in the Mediterranean orchid O. italica. This number is lower than the number of the TCP genes of the model species A. thaliana (24) or O. sativa (22) because the approach used in the present study was focused on the TCPs expressed in the inflorescence tissues. Other TCP transcripts in O. italica are possibly expressed in tissues and/or developmental stages different from those examined. However, the number of transcripts identified in O. italica is not much different from the number annotated in the genome of the orchid P. equestris (17), suggesting that in orchids there could be fewer TCP genes than in Arabidopsis and Oryza. The analysis of the expression profiles of the TCP transcripts of O. italica shows that some of them are expressed in all the tissues and developmental stages examined, suggesting their potential involvement in multiple pathways, whereas others are restricted to specific tissues, indicating a possible more specialized role. In addition, the presence of a cleavage site for miR319 on one TCP transcript of O. italica supports The asterisks indicate statistically significant differences between the relative expression of the early and the late stages (*p < 0.05, **p < 0.01). Te_out, outer tepals; Te_inn, inner tepals; Lip, labellum; Co, column; Ov, ovary; Le, leaf. Rn, relative normalized expression.
Scientific RepoRts | 5:16265 | DOi: 10.1038/srep16265 the existence of an evolutionary conserved role of this miRNA in regulating the activity of specific TCP factors.

Isolation of the TCP genes expressed in the inflorescence of O. italica. The sequences of the
TCP DNA-binding domain of Arabidopsis and Oryza (Pfam PF03634) were used as query to perform a standalone TBLASTN (e-value 1e-003) against the non-redundant sequences of the inflorescence transcriptome of O. italica 46 . In addition, the transcripts of the inflorescence transcriptome of O. italica annotated as TCP genes were selected. The transcripts of O. italica with significant hits from the TBLASTN search and those previously annotated as TCP genes were checked for the presence of an ORF that exceeded the TCP domain, excluding those shorter than 500 bp from further analyses.
The coding sequences (CDSs) and predicted proteins of the genome of the orchid Phalaenopsis equestris 42 were downloaded from the FTP site ftp://ftp.genomics.org.cn/from_BGISZ/20130120/. The coding sequences annotated as TCP genes were selected (Supplementary Table S1) and used for the subsequent analyses.  Table S1). The TCP domains of the orchids O. italica, P. equestris, P. hybrid cultivar and D. hybrid cultivar were aligned, and the NJ, ME and ML trees were constructed as described above.

Isolation of the class II
The whole amino acid sequences of the TCP proteins of O. italica, A. thaliana and O. sativa were scanned for the presence of shared conserved motifs using the online tool MEME 64 , setting the parameters to any number of repetitions, the optimum width from 4 to 70 and the maximum number of motifs to 20.

Evolutionary analysis.
To identify the best reciprocal hits, a custom-made perl script was used to perform BLASTP search on the TCP sequences of O. italica and P. equestris. Based on the results obtained, pair-wise nucleotide alignment between the resulting orthologous nucleotide sequence encoding the TCP domain of O. italica and P. equestris was constructed and used to estimate the synonymous (Ks) and non-synonymous sites (Ka) substitution rates using the pair-wise approximate YN method 65 implemented in the KaKs Calculator 66 .
Search for miRNA targets and cleavage validation. The psRNATarget online tool 55 was used to search for the presence of microRNA targets within the TCP transcripts of O. italica, using as the query the microRNAs expressed in the inflorescence of O. italica 47 and setting stringent search parameters (maximum expectation 0.0).
Based on the predicted position of the putative miRNA cleavage sites on the TCP transcripts of O.italica, specific reverse primers were designed, all of them annealing downstream from the predicted putative miRNA target, and used to verify the presence of the cleavage product by performing a modified 5′ -RACE experiment 67 . In brief, total RNA was extracted from the inflorescence of O. italica before anthesis (~9 mm diameter) using the Trizol Reagent (Ambion). After DNase treatment, RNA was quantified using the spectrophotometer Nanodrop 2000 (Thermo Scientific). The 5′ adaptor of the RLM-RACE GeneRace kit (Invitrogen) was ligated to the 5′ -terminus of the extracted RNA (500 ng) without any enzymatic treatment to remove the 5′ cap. The RNA was then reverse transcribed, and the cDNA was amplified using the GeneRace 5′ Primer and the TCP-specific reverse primers (Supplementary Table  S2). A second PCR amplification was performed on 1 μ l of the first reaction using the nested adaptor Scientific RepoRts | 5:16265 | DOi: 10.1038/srep16265 forward primer and the nested TCP-specific reverse primers (Supplementary Table S2). The amplification products were cloned into the pGEM-T Easy vector (Promega), and 10 clones were sequenced as described above.
The specimens of O. italica used in the present work are present in the orchids collection of the Botanical Garden of Naples (Italy). Expression analysis. Total RNA was extracted from different tissues of O. italica following the protocol described above. In particular, outer and inner tepal, lip, column and unpollinated ovary were dissected from the inflorescence of O. italica before anthesis (defined as the early stage, ~9 mm diameter) and after anthesis (late stage). Although in the early stage, cell division has been completed and flower organs formed, cell elongation is still occurring. In addition, total RNA was extracted from leaf tissue. RNA (350 ng) was reverse transcribed using the Advantage RT-PCR kit (Clontech) and an oligo dT primer.
Using the 5.8 S RNA as the endogenous control gene, real time PCR experiments were conducted on 30 ng of the first strand cDNA from each tissue using the primer pairs specific for the TCP transcripts of O. italica (Supplementary Table S2). The reactions were performed using the SYBR Green PCR Master Mix (Life Technologies) in technical triplicates and biological duplicates, and the thermal cycle was followed by a melting curve step 68 .
Stem-loop real time PCR experiments were conducted to evaluate the expression pattern of the microRNA miR319 of O. italica in all the tissues examined 69 . In brief, total RNA (150 ng) from each tissue was separately reverse transcribed using the miR319 and the 5.8 S stem-loop primers (Supplementary Table S2). The first strand cDNA (5 ng) was then amplified in technical triplicates and biological duplicates using the miR319 and 5.8 S specific primers and the stem-loop universal primer (Supplementary Table S2).
The PCR efficiency (E) and the optimal threshold cycle (C T ) for each well were calculated using the Real Time PCR Miner online tool 70 . The mean relative expression ratio (Rn) and standard deviation of the TCP transcripts and of the miR319 in the different tissues was calculated using the 5.8 S RNA as the endogenous control gene by applying the formula Rn = (1 + E target gene ) −CT target gene /(1 + E control gene ) −CT control gene . Differences in relative expression levels of the TCP transcripts and of miR319 between and/or among the different tissues were assessed by the two-tailed t test and ANOVA followed by the Tukey HSD post hoc test. To exclude the presence of PCR artifacts, the real time PCR products of several samples were cloned and sequenced as described above.