Transcriptomic differences between male and female Trachycarpus fortunei

Trachycarpus fortunei (Hook.) is a typical dioecious plant of considerable economic value. There is currently no method for identifying sex at the early stages of T. fortunei growth. The aim of this study was to identify differences in expression and in sequence sites between male and female T. fortunei transcriptomes. The transcriptomes of male and female plants were sequenced on the Illumina platform. Analysis of transcriptomic differences showed that chromodomain helicase DNA-binding protein 1 (CHD1), serine/threonine protein kinase (STPK), cytochrome P450 716B1, and UPF0136 were specifically expressed in T. fortunei males. Single nucleotide polymorphism (SNP) detection identified a total of 12 male-specific sites, and the THUMP domain protein homolog showed male-biased expression. Cytokinin dehydrogenase 6 (CKX6) was upregulated in male flowers, and lower concentrations of cytokinin (CTK) may be more conducive to male flower development. During new leaf growth, flavonoid and flavonol biosynthesis was initiated. Additionally, flavonoid 3′,5′-hydroxylase (F3′5′H) and flavonoid 3′-hydroxylase (F3′H) were upregulated, which may cause the pale yellow leaf phenotype. Based on these data, inter-sex differentially expressed genes (DEGs) and sex-specific SNP loci may be associated with sex determination in T. fortunei.

Gene expression analysis. Sequenced reads were aligned to the unigene library with Bowtie, and the expected number of fragments per kilobase of transcript sequence per million mapped fragments (FPKM) was used as the expression abundance of each unigene. Pt-ff and Pt-mf grouped together, as did Pt-ml and Pt-fl, so flowers and leaves were clearly distinguishable (Fig. 1a); the first and second principal components accounted for 57.8% and 9% of the variance, respectively (Fig. 1b).
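As an illustration of the FPKM unit used above, the following minimal Python sketch applies the standard definition (fragments mapped to a transcript, normalized by transcript length in kilobases and by millions of mapped fragments). The function name, unigene names, and counts are hypothetical and are not taken from this study; the sketch does not reproduce the expected-count estimation step of the actual pipeline.

```python
def fpkm(fragments_mapped, transcript_length_bp, total_mapped_fragments):
    """Return FPKM = fragments * 1e9 / (length in bp * total mapped fragments).

    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments_mapped * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example values (not from the paper).
counts = {"unigene_A": 500, "unigene_B": 1500}      # fragments mapped per unigene
lengths = {"unigene_A": 1000, "unigene_B": 3000}    # unigene lengths in bp
library_size = 20_000_000                           # total mapped fragments

abundance = {name: fpkm(counts[name], lengths[name], library_size) for name in counts}

for name, value in abundance.items():
    print(f"{name}: {value:.2f} FPKM")
```

Because the second unigene is three times longer and receives three times as many fragments, length normalization gives both unigenes the same abundance, which is the purpose of the per-kilobase term.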

RT-qPCR verification. Quantitative fluorescence analysis and RNA expression quantification were compared by linear regression analysis (Fig. 3a). The expression levels of STPK, UPF, and P450, which are specifically expressed in male plants, differed significantly between male and female plants; the specifically expressed genes are presented in Fig. 3b. For example, the expression level of STPK in Pt-ml and Pt-mf was 23.76, which was 10.52 times that in Pt-ff.
SNP full transcriptome detection. A transcriptome-wide search for single nucleotide polymorphisms (SNPs) identified 12 SNP loci at which male and female plants carried different bases, with each base homozygous and shared across the samples of the same sex (Table 3); these sites may serve as sex markers for distinguishing male T. fortunei. Notably, TRINITY_DN73064_c0_g1 is a DEG in both Pt-ff/Pt-mf and Pt-fl/Pt-ml that was upregulated in male plants; at site 1,066 (5′-3′) it carries a base substitution in which G is fixed in T. fortunei females and A in T. fortunei males. SSR detection found a dinucleotide repeat of 34 units within the TRINITY_DN73064_c0_g1 gene. At the same base sites of the remaining seven genes, male and female plants differ by C ↔ A, G ↔ A, T ↔ C, and C ↔ T substitutions, among others.
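The criterion described here (a site is a candidate sex marker when every female sample carries one base and every male sample carries a different one) can be sketched as follows. This is a minimal illustration, not the authors' SNP-calling pipeline. The base calls for TRINITY_DN73064_c0_g1 (site 1,066) and TRINITY_DN70551_c5_g1 (site 1762) follow Table 3, under the assumption that the first six calls in each row are female samples and the last six are male samples; the third site is a hypothetical negative control.

```python
def sex_specific_sites(calls):
    """Return sites where each sex is fixed for a single, different base.

    calls maps (unigene, position) -> {"female": [bases], "male": [bases]}.
    The result maps each qualifying site to (female_base, male_base).
    """
    hits = {}
    for site, by_sex in calls.items():
        female = set(by_sex["female"])
        male = set(by_sex["male"])
        # Fixed within each sex (one allele per sex) and different between sexes.
        if len(female) == 1 and len(male) == 1 and female != male:
            hits[site] = (next(iter(female)), next(iter(male)))
    return hits


calls = {
    # Base calls as listed in Table 3 (sex assignment of columns is assumed).
    ("TRINITY_DN73064_c0_g1", 1066): {"female": ["G"] * 6, "male": ["A"] * 6},
    ("TRINITY_DN70551_c5_g1", 1762): {"female": ["T"] * 6, "male": ["C"] * 6},
    # Hypothetical site that varies within females, so it must be rejected.
    ("hypothetical_unigene", 10): {"female": ["C", "T", "C", "C", "C", "C"],
                                   "male": ["T"] * 6},
}

for (unigene, pos), (f_base, m_base) in sex_specific_sites(calls).items():
    print(f"{unigene}:{pos} female={f_base} male={m_base}")
```

A real analysis would also apply read-depth and genotype-quality thresholds before calling a site fixed; those filters are omitted here for brevity.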

Discussion
There are 15,600 dioecious angiosperms in 987 genera and 175 families, accounting for 5-6% of the total species 11. Ascertaining the sex of seedlings early can accelerate artificial selection in breeding programs 12. The evolution of sex determination is a major transition in the evolutionary history of angiosperms; in dioecious species, sex-determining genes are usually located in the non-recombining regions of sex chromosomes 13. Next generation sequencing (NGS) technology is widely used to study sex determination in flowering plants. De novo RNA-Seq transcriptome assembly and expression analysis can inform the investigation of sex determination in dioecious species. Exploration of genome-wide sex-biased expression patterns in different species has revealed broad variation in the percentage of sex-biased genes (ranging from 2% of transcripts in Littorina saxatilis to 90% in Drosophila melanogaster), and one study showed that significantly more genes exhibited male-biased than female-biased expression in Asparagus officinalis 14. Sex-biased expression was pervasive in floral tissue in Populus balsamifera, but nearly absent in leaf tissue 15. Sex-specific expression may arise from silencing or inhibition of the related genes, or from deletion of homologous genes, in the tissues of the corresponding sex. There are several scenarios for the origin of sex-biased genes, including single-locus antagonism, sexual antagonism plus gene duplication, and duplication of sex-biased genes 16. In this study, transcriptomes from T. fortunei female and male flowers and leaves were sequenced and assembled to identify genes with sex-biased expression. Our study showed that more genes exhibited male-biased than female-biased expression in T. fortunei.
Genes with sex-biased expression often contribute substantially to the expression of sexually dimorphic traits. SNP calling in segregating populations of dioecious plants can help identify sex-associated SNPs and the corresponding loci; in Eucommia ulmoides, five DEGs were polymorphic, with SNPs common to each sex type 17. By searching transcriptomic SNPs, we identified 12 SNP loci at which male and female plants carried different bases, each homozygous and shared across the samples of the same sex. s4U (4-thiouridine), a modified nucleoside, is located at the junction of the acceptor arm and the D arm of eubacterial and archaeal tRNA. s4U8 is synthesized by 4-thiouridine synthase (ThiI); it stabilizes the folding of tRNA and acts as a sensitive trigger for UV irradiation response mechanisms 18,19. TRINITY_DN73064_c0_g1, the THUMP domain-containing protein 1 homolog, contains a THUMP domain, which is one of the constituent domains of ThiI; it has a dinucleotide (CT) repeat of 34 units and a male-specific site. It is differentially expressed between T. fortunei male and female plants; such polymorphic sex-biased genes may be linked within the non-recombining part of the sex-determining region (SDR) 20. CHD1 (TRINITY_DN61762_c0_g1) is highly expressed in T. fortunei male plants. In a previous study, rice CHD1 was highly expressed in leaves; in CHD1 mutant plants, the number of cells in the leaves and stems decreased, the surface epidermis increased, and the chlorophyll a/b content of leaves decreased 21. CHD1 shows little sequence variation overall, and the variation that exists is concentrated mainly in the introns. Interestingly, it has been used as a marker for early sex identification in birds and poultry [22][23][24].
Protein kinases are a class of enzymes that use ATP to phosphorylate other proteins and play important roles in controlling many aspects of cellular life. They are divided into three major subclasses: receptor tyrosine kinases (RTKs), serine/threonine protein kinases (STPKs), and histidine kinases 25. TRINITY_DN81704_c4_g4 encodes an STPK that acts synergistically with cyclins and serves as an important cellular regulatory factor. In Arabidopsis, the gene encoding the STPK OXI1 is induced in response to extensive H2O2 stimulation and serves as an important part of the signal transduction pathway that links several oxidative signals with downstream responses 26. STPK was more highly expressed in T. fortunei male flowers and leaves than in female flowers and leaves, suggesting that the transduction of oxidative stress signals may differ between male and female plants. UPF0136 (TRINITY_DN70437_c2_g4) and cytochrome P450 716B1 (cytochrome P450 716B1-like) (TRINITY_DN82812_c2_g2) were also highly expressed in male plants; however, their specific functions have yet to be explored. KEGG enrichment of the DEGs in Pt-ff/Pt-mf revealed that zeatin biosynthesis (ko00908) was enriched and that cytokinin dehydrogenase 6 (TRINITY_DN64789_c0_g1) was upregulated 6.98-fold in male flowers. Cytokinin oxidases/dehydrogenases (CKXs) catalyze the irreversible degradation of cytokinins 27. During the development of male and female T. fortunei flowers, the indole-3-acetic acid (IAA), abscisic acid (ABA), and zeatin riboside (ZR) contents of male flowers are lower than those of female flowers in the corresponding period 28. Overexpression of six CKX family members in transgenic Arabidopsis increased cytokinin breakdown, reducing CTK content to 30-45% of the wild-type level; CTK is essential for survival, and a lack of CTK leads to reduced activity of the apical meristem and leaf primordia 29.
Most Brassica napus BnCKXs are highly expressed in reproductive organs, such as buds, flowers, or siliques 30 . TRINITY_DN64789_c0_g1 was highly expressed in male flowers, indicating that CKXs were upregulated in male flowers and that lower CTK concentrations may be more favorable for male flower development. This spatial expression pattern may be related to the different CTK functions.
In Pt-ff/Pt-fl and Pt-mf/Pt-ml, flavonoid and flavonol biosynthesis (ko00944) were both > 1. Flavonoid 3′,5′-hydroxylase (F3′5′H) is a member of the cytochrome P450 family and is the key enzyme for the synthesis of 3′,5′-hydroxylated anthocyanin pigments 31. Flavonoid 3′-hydroxylase (F3′H) is also involved in flavonoid biosynthesis. The F3′H gene is expressed in different tissues, such as roots, stems, leaves, and flowers, and can change the color of flowers or seed coats 32. Flavonoids, carotenoids, and betalains are the main flower pigments 33. Among the flavonoids, aurones and chalcones are yellow pigments, whereas flavones and flavonols are colorless or very pale yellow 34. The upregulation of F3′5′H and F3′H during T. fortunei new leaf growth indicates that these enzymes may contribute to the light yellow leaf color phenotype.

Conclusion
By analyzing the transcriptomic differences between T. fortunei female and male flowers and leaves, we found that CHD1, STPK, cytochrome P450 716B1, UPF0136, and the THUMP domain-containing protein 1 homolog were highly expressed in T. fortunei males. SNP detection identified a total of 12 sex-specific sites. CKX6 exhibited increased expression in male flowers, and lower CTK concentrations may be more conducive to the development of male flowers. In the early stages of leaf growth, flavonoid and flavonol biosynthesis was initiated, and the upregulated expression of F3′5′H and F3′H may cause the pale yellow phenotype of the leaf.

Materials and methods
Materials. Materials were collected from an artificially planted T. fortunei forest located in Guiding County, Guizhou Province, China. The test site has a mid-subtropical humid monsoon climate. The soil in the forest is yellow soil, the annual rainfall is 1,143 mm, and the annual average temperature is 15 °C. Three male and three female T. fortunei plants were selected for sampling. Unexpanded male and female flower buds were collected and labeled as Pt-ff (female flower) or Pt-mf (male flower). The middle part of the unexpanded (light yellow) leaves at the top of the crown was collected and labeled as Pt-fl (female leaves) or Pt-ml (male leaves), and then promptly placed in liquid nitrogen. Each group had three biological replicates.

Table 3. Hypothetical sex-related genes in T. fortunei. POS is the site of the unigene where the sequence-specific site was located (5′-3′).
Unigene                 POS    Base calls in the four sample groups (three replicates each)
TRINITY_DN69373_c0_g2   89     C|7; C|7; C|7      C|7; C|6; C|7      A|7; A|4; A|6      A|2; A|12; A|6
                        143    G|20; G|7; G|20    G|7; G|7; G|10     A|9; A|5; A|7      A|3; A|11; A|7
TRINITY_DN70551_c5_g1   1762   T|24; T|22; T|23   T|26; T|47; T|20   C|76; C|45; C|69   C|69; C|70; C|69
TRINITY_DN73064_c0_g1   1,066  G|8; G|2; G|4      G|4; G|4; G|4      A|22; A|9; A|19    A|9; A|15; A|19
TRINITY_DN73101_c1_g3   728    C|28; C|15; C|24   C|5; C|3; C|5      T|24; T|27; T|6    T|4; T|2;

[...] were enriched for eukaryotic mRNA. Subsequently, fragmentation buffer was used to break the mRNA into short fragments, which were used as templates to synthesize first-strand cDNA with random hexamer primers. Then, double-stranded cDNA was synthesized, purified, and subjected to end repair, poly(A) tailing, and ligation of sequencing adapters. Fragment size selection was conducted using AMPure XP beads. The second strand of the cDNA, which contained U, was degraded using USER enzyme so that the strand orientation of the mRNA was retained. After the prepared library passed quality testing, sequencing was performed on an Illumina HiSeq 2500 system. There were 12 samples in total, and a separate library was constructed and sequenced for each sample.
Evaluation, assembly, and annotation of raw data quality. The raw sequencing data were quality-controlled; low-quality reads and reads with > 1% unknown bases were removed using the Trimmomatic tool 35.

Gene expression quantification, differential analysis, and functional enrichment. The reads obtained from sequencing were compared with the unigene library using Bowtie software 44.

RT-qPCR verification. Real-time quantitative PCR (RT-qPCR) amplification was conducted using a SYBR Premix Ex Taq II kit. Twelve genes were selected, and the actin gene was used as the reference gene. Primer Premier 5 software was used to design primers (refer to Table 4 for the primer list). All samples reported in the transcriptome were quantitatively verified; each sample had three biological and three technical replicates. First-strand cDNA was synthesized from total RNA. The RT-qPCR reaction conditions were as follows: initial denaturation at 95 °C for 30 s, followed by 40 cycles of denaturation at 95 °C for 5 s and annealing at 60 °C for 34 s. Relative expression levels were calculated using the 2^−ΔΔCt method 50.
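The 2^−ΔΔCt calculation named above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions: the Ct values are hypothetical (they are not measurements from this study), technical replicates are averaged before normalization, and the function name is invented for the sketch. The target gene is first normalized to the reference gene (actin in this study) within each sample, and the treated sample is then compared with the control sample.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    Each argument is a list of Ct values from technical replicates.
    dCt  = mean Ct(target) - mean Ct(reference), within one sample
    ddCt = dCt(sample) - dCt(control)
    """
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical Ct values: three technical replicates per measurement.
fold = relative_expression(
    ct_target_sample=[22.0, 22.5, 21.5],    # target gene, sample of interest
    ct_ref_sample=[18.0, 18.25, 17.75],     # reference gene, sample of interest
    ct_target_control=[25.0, 25.5, 24.5],   # target gene, control sample
    ct_ref_control=[18.0, 18.0, 18.0],      # reference gene, control sample
)
print(f"relative expression: {fold:.2f}-fold")
```

With these values the target gene crosses the threshold three cycles earlier in the sample than in the control after normalization, which corresponds to an eightfold higher relative expression. The method assumes roughly equal, near-100% amplification efficiency for the target and reference genes.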