To Be a Flower or Fruiting Branch: Insights Revealed by mRNA and Small RNA Transcriptomes from Different Cotton Developmental Stages

The architecture of the cotton plant, including fruit branch formation and flowering pattern, is the most important characteristic that directly influences light exploitation, yield and cost of planting. Nulliplex branch is a useful phenotype to study cotton architecture. We used RNA sequencing to obtain mRNA and miRNA profiles from nulliplex- and normal-branch cotton at three developmental stages. The differentially expressed genes (DEGs) and miRNAs were identified that preferentially/specifically expressed in the pre-squaring stage, which is a key stage controlling the transition from vegetative to reproductive growth. The DEGs identified were primarily enriched in RNA, protein, and signalling categories in Gossypium barbadense and Gossypium hirsutum. Interestingly, during the pre-squaring stage, the DEGs were predominantly enriched in transcription factors in both G. barbadense and G. hirsutum, and these transcription factors were mainly involved in branching and flowering. Related miRNAs were also identified. The results showed that fruit branching in cotton is controlled by molecular pathways similar to those in Arabidopsis and that multiple regulated pathways may affect the development of floral buds. Our study showed that the development of fruit branches is closely related to flowering induction and provides insight into the molecular mechanisms of branch and flower development in cotton.


Results
Transcriptome sequencing of mRNA libraries. To identify transcripts that are differentially expressed during the differentiation stage of flower and floral buds between nulliplex-and normal-branch cotton, samples from plants at three different developmental stages were used: the seedling stage, the pre-squaring stage, and the squaring stage. Four lines (varieties) were used to construct RNA-seq libraries (Fig. 1). The experiments were performed with three biological replicates, resulting in 36 RNA-seq libraries for sequencing. A total of 2,022,339,010 raw reads were obtained. After low-quality reads were filtered out, a total of 1,915,778,530 clean reads were selected for further analysis (See supplemental Table S1).
DEGs between nulliplex-and normal-branch cotton. Based on these expression analyses, we identified 422 upregulated and 307 downregulated DEGs in G. barbadense and 123 upregulated and 92 downregulated DEGs in G. hirsutum during the seedling stage. In G. barbadense and G. hirsutum, 2,538 upregulated and 2,419 downregulated and 339 upregulated and 304 downregulated DEGs, respectively, were identified during the pre-squaring stage and 1,069 upregulated and 943 downregulated and 1,344 upregulated and 990 downregulated DEGs, respectively, were identified during the squaring stage (Fig. 2, see supplemental Data set S1 to S6 online). There were more DEGs at the pre-squaring stage in G. barbadense than the other two stages, whereas the greatest number of DEGs was identified in the squaring stage in G. hirsutum.
Functional category enrichment of DEGs. All DEGs of the six groups were assigned to Mapman functional categories. During the seedling stage, the DEGs were almost equally distributed in each category. During the second and third stages, the DEGs were mainly enriched for RNA, protein, signalling, and transport in both Scientific RepoRts | 6:23212 | DOI: 10.1038/srep23212 types of cotton (Fig. 3A,B). Regarding protein metabolism categories, the DEGs were mainly associated with ribosomal protein, posttranslational modification, and degradation at the second stage of G. barbadense. The third stage was mainly enriched with genes for posttranslational modification and degradation. However, the DEGs were considerably enriched in genes regulating degradation at the second stage and enriched in genes involved in posttranslational modification and degradation at the third stage (Fig. 3C). Regarding RNA categories, the DEGs were mainly enriched for transcription factor families, including homeobox, MYB, AP2/EREBP, WRKY, C2H2, and bHLH at the second stage in G. barbadense and enriched for MYB at the squaring stage in G. hirsutum (Fig. 3D).

DEGs in flower and floral bud development.
For interlibrary comparison, read numbers were normalised to relative abundance as reads per kilobase transcriptome per million mapped reads (RPKM). The RPKM value (mean value of three biological replicates) of each gene was further used to compute the related coefficients between each sample. The expression correlations of genes showed accordance between the same development stages in both types of cotton, excluding Gb3WT. The correlations were substantially lower between the seedling stage and squaring stage and/or pre-squaring stages (Fig. 4). Partial correlations between pre-squaring stage samples and squaring stage samples were lower than same-stage samples, such as that between Gb2WT and Gb2np with a coefficient of 0.83, which was slightly lower than 0.96 for Gb3WT (Fig. 4). This finding suggests that the development of the pre-squaring stage and squaring stage may be closely associated with integral processes.
To study the molecular mechanisms of cotton fruit branch initiation and flower development, we analysed the stage-specific DEGs at three stages in each type of cotton. There were 187, 3,825, and 845 DEGs in the seedling stage, pre-squaring stages and squaring stages of G. barbadense between nulliplex and normal branches, respectively (Fig. 5A). In G. hirsutum, there were 66, 353, and 2,047 specific DEGs in the three stages between the branching types, respectively (Fig. 5B).
DEGs specifically expressed in the pre-squaring stage. Flowers and floral buds are forming in the pre-squaring stage; genes specifically expressed during this stage may play a key role in the early regulation of cotton flower and floral bud formation. Gene ontology (GO) enrichment analyses of DEGs specifically expressed during the pre-squaring stage revealed a major enrichment of oxidoreductase activity in both G. barbadense and   Tables S2 and S3; correction p < 0.05). In addition, DEGs were primarily enriched for transmembrane transporter activity and hydrolase activity and acting on glycosyl bonds in G. barbadense and ion binding in G. hirsutum (Fig. 6A, Tables 1 and 2). We further analysed the DEGs collectively expressed in the pre-squaring stage, and the results revealed 68 DEG candidates (See supplemental Fig. S1 and S2).
Then, we classified the genes that were specifically expressed during the pre-squaring stage using the Mapman program and found that these genes fell into six major groups: cell wall-related processes, secondary processes, stress response, miscellaneous, RNA processes, and development (Fig. 6B). More interestingly, 12 DEGs (approximately one-fifth of the tissue-specific genes) in the pre-squaring stage were mainly enriched in RNA processes, including key flower development genes such as short vegetative phase (SVP, CotAD_47245), late elongated hypocotyl (LHY, CotAD_32839), and pseudo-response regulator 5 (PRR5, CotAD_14148), and were related to ABI3/VP1 1 (RAV1, CotAD_22031), suggesting that the pre-squaring stage specifically expressed genes that tend to promote flower development. In addition, genes specifically expressed during bud and shoot apical meristem development included branched 1 (BRC1, CotAD_02759), breast cancer associated RING 1 (BARD1, CotAD_54816), and indole-3-acetic acid inducible 18 (IAA18, CotAD_00235), which implied that these genes control the growth and development of floral buds to form the cotton architecture (Table 3). SVP, RAV1, and IAA18 were substantially downregulated and NF-YC13, PCNA2, BARD1, and ATXRP were upregulated during the pre-squaring stage compared to the other two stages in both G. barbadense and G. hirsutum. Moreover, EPR1, PPR5, and LHY were upregulated in G. barbadense but downregulated in G. hirsutum at the second stage. For BRC1 and CotAD_00066, the tendency to change was in the opposite direction of the latter three genes (Fig. 7, see supplemental DEGs of different developmental stages. The DEGs were analysed among the different developmental stages of each cotton variety, and the results showed that the number of DEGs in the squaring stage compared to the seedling stage and pre-squaring stage compared to seedlings in G. barbadense was much higher than in G. hirsutum in both nulliplex-and normal-branch cotton (Fig. 8A). The pre-squaring stage may be controlled by the differentiation of the fruit branch tissue, whereas the squaring stage is mainly controlled by the growth of the fruit branches in normal-branch cotton. The results revealed 395 DEGs between the pre-squaring stage and seedling stage and 435 DEGs between the squaring stage and seedling stage in normal-branch cotton (Fig. 8B). In nulliplex branches, there were 328 DEGs between the pre-squaring stage and the seedling stage and 535 DEGs between the squaring stage and the seedling stage.  These DEGs were enriched for functional categories, as determined using the Mapman software. The results showed that the DEGs of the nulliplex branches (Gb1np-vs-Gb2np, Gb1np-vs-Gb3np, Gh1np-vs-Gh2np, and Gh1np-vs-Gh3np) were substantially enriched in signalling categories, and the DEGs of the normal branches (Gb1WT-vs-Gb2WT, Gb1WT-vs-Gb3WT, Gh1WT-vs-Gh2WT, and Gh1WT-vs-Gh3WT) were mainly enriched in transport categories (Fig. 8D). In addition, the DEGs of the normal branches in the floral bud's forming stage and seedling stage (Gb1WT-vs-Gb3WT and Gh1WT-vs-Gh3WT) were considerably enriched in development categories. These results suggest that nulliplex-branch and normal-branch cotton may have different developmental patterns.
Sequencing of small RNA libraries. Twelve small RNA samples (each sample was the same as above with three biological repeats) yielded a total of 422,920,856 reads and 175,305,470 unique reads. There were more than 11 million reads and more than 4 million unique reads for each library. The reads of each library were mapped back to the cotton genome and accounted for 75% to 88% of the total reads, representing 67% to 81% of their unique read counterparts (See supplemental Table S4). In both types of cotton, rRNA in the squaring stage were substantially different between nulliplex and normal branches, showing 0.56% unique reads in normal branches and 1.36% unique reads in nulliplex branches for G. barbadense (miGb3WT vs. miGb3np) and 0.38% unique reads in the normal branches and 0.85% unique reads in nulliplex branches for G. hirsutum (miGh3WT vs. miGh3np). However, the other RNA families had similar proportions, including miRNA (0.5-0.7% for unique reads), snRNA (0.02-0.05% for unique reads), snoRNA (approximately 0.02% for unique reads), and tRNA (approximately 0.15% for unique reads) (See supplemental Table S4). For all sRNA libraries, the major distribution of read lengths was between 21 and 24 nt.
Identification of known conserved miRNA families. A total of 712 known conserved miRNAs were identified in 12 samples (three biological replications). Of these, some miRNAs were highly and stably expressed in all samples, such as miR157a-5p, miR166a-3p, and miR3954 (See supplemental data set S7 online). However, other miRNA counts diverged sharply. For example, miRNA167h had more than 35 k reads in miGb3WT and miGb2np and no reads in miGb3np and miGh2np (See supplemental data set S7 online).  Among these DEMs, there were 11, 16, and 26 stage-specific DEMs in the G. barbadense seedling stage, pre-squaring stage, and squaring stage, respectively (Fig. 9B). In G. hirsutum, we identified 11, 18, and 30 stage-specific DEMs in the seedling stage, pre-squaring stage, and squaring stage, respectively (Fig. 9C). In G. barbadense, the uniquely expressed miRNAs at the pre-squaring stage, which included miR156c-3p, miR156i-3p, and miR171c of the miR156 and miR171 family may be related to floral transition (See supplemental Table S5) 27 . In G. hirsutum, the specifically expressed miRNAs at the pre-squaring stage including miR166c, a member of the miR166 family, were also involved in floral organ polarity and shoot apical meristem formation (See supplemental  Table S6) 27,35 . In addition, at the pre-squaring stage of G. barbadense, the DEMs included miR167 and miR166 family members and the stage-specific DEMs noted above (See supplemental data set S9 online).

Pathway
Gene number(2243) Pvalue Qvalue Pathway ID Novel miRNAs were identified and the results showed that of these 3,350 novel miRNAs, 13, 8, and 13 were DEMs in the seedling stage, pre-squaring stage, and squaring stage of G. barbadense and 11, 9, and 13 were DEMs in the three stages of G. hirsutum, respectively (See supplemental Fig. S3A and S3B).
DEM target identification at three stages. The DEM target pairs above were identified from G. hirsutum genome sequences based on negative regulation between miRNA and their targets by combining the results of DEG analysis with RNA-seq. Among the three stages, 80 targets of the specifically expressed miRNAs were screened out. In the pre-squaring stage of G. barbadense, 11 DEMs targeting 32 DEGs were identified, including CotAD_35367 (MADS box protein, a homologue of sepallata3 (SEP3), which is involved in flower development) 36 and CotAD_75643 (a homologue of HB8, which regulates post-embryonic meristem initiation and is also bound by SVP in vegetative tissue in Arabidopsis) ( Table 4) 37,38 .
In the seedling stage, there were two DEMs with three target DEGs in G. barbadense and one DEM and one target in G. hirsutum. At the squaring stage, 7 DEMs were found in 11 DEG target genes in G. barbadense, and in G. hirsutum, 6 DEMs were identified with a predicted 33 target genes (Table 4).

RNA-seq expression validation by qPCR.
We used qPCR to confirm the reliability of the RNA-seq data.
The results showed that among nine selected DEGs of the pre-squaring stages, approximately 90% were consistent with RNA-seq data (See supplemental Fig. S5).

Discussion
Plant architecture is fundamental to agricultural productivity and the artificial selection of desired growth habits. The regulation of cotton architecture has enormous potential applications for high-density planting, mechanised harvesting, decreasing the cost of cultivation and improving yield. In the present study, to investigate the intricate molecular mechanisms of fruit branch and flower development in cotton plants, RNA-seq technology was utilized to examine the global gene expression profile of shoot apexes by generating both mRNA and miRNA libraries from the shoot apex of four cotton varieties (lines), including two with nulliplex branches and two with normal branches in three developmental stages with the aim of subtracting the unrelated differences in genotype, applying the newly published allotetraploid genome of G. hirsutum as a reference genome, which provided us with more complete and informative results for credible data mining 33,34 .
While it would have been highly desirable to start with genetic material that differs only in the trait of interest (e.g., near-isogenic lines), it would have taken many years to establish such lines given that G. hirsutum and G. barbadense are allotetraploid.
In this study although the nulliplex branches and normal branches lines of G. hirsutum and G. barbadense did not share parentage with each other, the differentially expressed genes observed should include the result of both the np mutation and unrelated differences in genotype in each comparison of G. hirsutum or G. barbadense. In order to acquire the result of the np mutation and subtract the unrelated differences in genotype, our experimental design with three different developmental stages of the two cotton fruit branch types in G. hirsutum and G. barbadense can help to find the differentially expressed genes related to the development of fruit branch. Firstly, we screened the common DEGs among three stages (the seedling, pre-squaring and squaring stage) in two kinds of branch type (nulliplex-branch and normal branch) of G. barbadense and G. hirsutum lines, respectively. Secondly, the common DEGs between the above two groups of G. barbadense and G. hirsutum were selected and can be considered as candidate DEGs for the np mutation. Finally, in this way we have in reality obtained interesting results and selected 68 genes associated with development of the np mutation in the pre-squaring stage. Among them, we have acquired the key genes related to development of branch (Table 3), such as Branched1(GhBRC1), IAA 18 etc., which play central roles in the development of branch in plants 13,[39][40][41][42][43][44] . The results showed that the nulliplex branch is a useful trait and model for the study of cotton architecture.
The results indicated that the number of DEGs was significantly higher in the second stage than in the third stage of development in G. barbadense, but the DEG numbers of two stages in G. hirsutum showed the opposite results. The correlation coefficient between Gh2np and Gh2WT was 0.93, significantly higher than the 0.89 correlation found between Gh3np and Gh3WT. The correlation coefficient between Gb2np and Gb2WT was 0.83, lower than the correlation coefficient of 0.84 between Gb3np and Gb3WT (Fig. 4).
Mapman software analysis indicated that each developmental stage of the two cotton plant types was enriched with DEGs. The four samples of the seedling stage were still in the early period of vegetative growth, and many biochemical pathways were not initiated; consequently there was a relatively small number of DEGs. The pre-squaring stage is a key stage that may control the transition from vegetative growth to reproductive growth, and both types of growth may occur simultaneously during the squaring stage. DEGs between nulliplex branches and normal branches were mainly enriched in RNA, protein, secondary metabolism, stress, and signalling processes in the pre-squaring stage and squaring stage (Fig. 3A). The results showed remarkable differences between nulliplex-branch and normal-branch cotton in the development of the shoot apex, especially in the RNA category, which contains a large number of transcription factors. Fruiting branch development and differentiation from the apical or lateral meristem may be regulated by a series of transcription factors in cotton. However, there were different proportions of DEGs in some functional categories between G. barbadense and G. hirsutum (Fig. 3B). In the hormone metabolism category, the proportion of DEGs was significantly higher in G. barbadense than  in G. hirsutum. This difference indicates that the developmental process may be more closely regulated by plant hormones in G. barbadense than in G. hirsutum in these four varieties (lines).
Regarding the different stages, the DEGs of the normal-branch cotton plants in the squaring stage and seedling stage (Gb1WT-vs-Gb3WT and Gh1WT-vs-Gh3WT) were remarkably enriched in development categories. In normal-branch cotton, the fruit branch grows during the squaring stage (there are no fruit branches in nulliplex-branch cotton), which is in agreement with the RNA-seq results. The second stage is a key developmental stage that regulates the transition from vegetative growth to reproductive growth and fruit branch formation. We further examined specific-stage DEGs in the second stage, and 68 candidate genes were identified.
The screened DEGs included 12 transcription factors ( Table 3). The gene CotAD_02759 (GhBRC1), a homologue of Arabidopsis branched1 (AtBRC1), was downregulated in nulliplex-branch G. barbadense and upregulated in nulliplex-branch G. hirsutum at the pre-squaring stage. BRC1 downregulation leads to branch outgrowth in Arabidopsis 39 . Based on mutants and expression analysis in Arabidopsis, BRC1 is likely downstream of the more axillary growth pathway and is required for auxin-induced apical dominance 13,39,40 . The cotton GhBRC1 gene may have the same function as AtBRC1, and the normal branch (fruit branch) is analogous to branch outgrowth in Arabidopsis to form axillary meristems. Cotton GhBRC1 was upregulated in Gh2np (nulliplex-branch, G.  hirsutum), which may be repressed during fruit branch development. However, it remains unclear why GhBRC1 was downregulated in Gb2np, a nulliplex-branch (no fruit branch) variety of G. barbadense (Fig. 6). The homologue (CotAD_00235, GhIAA18) of the auxin-related transcription factor IAA18 (indole-3-acetic acid inducible 18) in Arabidopsis, which causes aberrant cotyledon placement in embryos apical patterning, has been shown to be downregulated in nulliplex-branch cotton of both G. barbadense and G. hirsutum 41 . This finding suggests that another regulated pathway of lateral development may play a role.
The axillary meristem differentiates into fruit branches to form floral buds. Based on our results, the homologous genes may also regulate the differentiation of apical meristems in addition to BRC1 modulation of florigen activity in the floral buds of the axillary meristems in Arabidopsis 13 . The gene CotAD_54816 (GhBARD1) is homologous with BARD1 in Arabidopsis (also named ROW1) and is a repressor of Wuschel1 45 . BARD1 regulates shoot apical meristem organisation and maintenance by limiting WUS expression to the organising centre. Recently, BARD1 was shown to be essential for QC maintenance and stem cell niche development through the repression of WOX5 in the proximal meristem 45,46 . During the pre-squaring stage, GhBARD1 was substantially upregulated in Gb2np and Gh2np, which may repress downstream gene expression or the shoot apical meristem formation of floral buds, causing nulliplex-branch cotton.
In addition, GhBRC1 and GhBARD1 are involved in floral meristem determinacy. BRC1 interacts with Flowering Locus T to repress the floral transition, and BARD1 can limit WUS expression, which forms a negative feedback loop with agamous (AG) and is involved in floral determinacy and the specialisation of reproductive floral organs 13,[47][48][49] . These results suggest that the formation of nulliplex-branch cotton is associated with lateral axillary meristem development and floral formations, similar to other flowering time genes.
In Arabidopsis, PRR5 can bind to LHY promoters and recruit transcriptional corepressors of the Groucho/tup1 corepressor family to repress LHY transcription in circadian clocks 50,51 . EPR1 is regulated by both phytochrome A and phytochrome B and is a component of a slave circadian oscillator in Arabidopsis 52 . To the best of our knowledge, the circadian clock gene and photoperiodic response can induce early flowering in Arabidopsis [53][54][55] . Based on our results, GhPRR5, GhLHY, and GhEPR1 genes were upregulated in Gb2np and downregulated in Gh2np (Fig. 7).The same expression trend implied that GhPRR5 might also bind to the GhLHY promoter and regulate the circadian clock of flowering time in cotton.

Continued
In addition, SVP is a negative regulator of the floral transition, which forms a flowering repressor complex together with FLC in Arabidopsis 56-59 . In our study, the SVP homologue GhSVP was downregulated in nulliplex-branch plants in both G. barbadense and G. hirsutum, which suggests that GhSVP may promote floral transition in nulliplex-branch cotton compared to normal-branch cotton in our four varieties (lines). Furthermore, the overexpression of RAV1, a homologue of cotton CotAD_22031 (GhRAV1) in Arabidopsis, slows the development of lateral roots and rosette leaves [60][61][62] . In addition, low expression causes an early flowering phenotype, implying that RAV1 may function as a negative regulator of growth and development 60 . In the present study, the GhRAV1 gene was downregulated in nulliplex-branch cotton at the pre-squaring stage in both G. barbadense and G. hirsutum, which implies that GhRAV1 may have accelerated the flowering speed of cotton varieties Xinhai16 and 3798.
Although there is no report on trehalose-6-phosphate synthase 5 (TPS5: the homology of cotton gene CotAD_16567, GhTPS5), TPS1 (a homology family number with TPS5) is essential for normal vegetative growth and transition to flowering in Arabidopsis 63 .
Similarity analyses of the above differentially expressed genes showed that many DEGs of the fruit branch and floral transition may be predominantly expressed in cotton similar to Arabidopsis, Theobroma cacao, Populus euphratica, and other species. However, most of the related evidence has been obtained from studies on Arabidopsis.
In the past decade, numerous studies have demonstrated that different miRNA families play important roles in regulating plant architecture and branching patterns 42,64,65 , although a high level of conservation of miRNAs and their regulatory pathways/target genes has been demonstrated in many species. In the present study, DEMs were analysed between nulliplex-branch and normal-branch plants. In the pre-squaring stage, stage-specific DEMs such as miR156 and miR171 family members were found in G. barbadense (See supplemental table S5) and miR166 was found in G. hirsutum (See supplemental table S6). The miR156 miRNA family promotes juvenile-to-adult phase transition and apical meristem formation 64,66 , and miR171 negatively regulates shoot branching 35,67 . In the pre-squaring stage, the miR166 family found in G. hirsutum regulates the shoot apical  meristem and lateral organ formation in Arabidopsis 66,68 . In addition, these miRNA families have been confirmed to be closely related to flowering time 26,60 , in agreement with our RNA-seq results. The majority of the miRNA families are expressed during several plant growth periods or developmental processes. The non-stage-specific DEMs may also be involved in branching and flowering processes. More members of mi156, miR166, and miR167 were differentially expressed in three developmental stages (See supplemental data set S8 to S13 online). Nevertheless, the target of the DEMs was predicted, and we could not identify the genes related to branching and flowering (Table 4). These results imply that there were a number of genes involved in branching and flowering that had not been previously identified, but the molecular mechanism of regulation remains unclear. This conclusion suggests that nulliplex branching may be related to a flower time-related candidate gene 5 .
In nulliplex-branch cotton, the floral bud directly arises from the leaf axils of the main shoot, whereas in normal-branch the floral bud arises from the node of the fruit branch. With the fruit branch growing on leaf axils, the floral bud is produced at the same time. Based on the above results, genes controlling the formation of nulliplex-branch cotton may also be involved in floral induction. Interestingly, in the four varieties (lines) used for sequencing in this study, the squaring stage of Xinhai16 (G. barbadense L., nulliplex branch) was significantly earlier than that of Hai1 (G. barbadense L., normal branch) and line 3,798 (G. hirsutum L., nulliplex-branch) was significantly earlier than Huazhong 94-3130 (G. hirsutum L., normal branch). The earlier squaring stage (flowering time) in the nulliplex-branch compared to the normal-branch cotton in both G. barbadense and G. hirsutum in the four varieties (lines) may coincide with branching control genes involved in floral induction.
Our results showed that multiple pathways, including genes related to fruit branch development, e.g., GhBRC1, may control nulliplex branching. At the same time, the flowering time genes may influence fruit branch growth. The different tendencies of DEGs in the island and G. hirsutum imply that multiple pathways may regulate branching. The KEGG pathway of the pre-squaring stage DEGs showed that the DEGs of the island and G. hirsutum were both enriched in circadian rhythm genes, which may be associated with flowering (See supplemental  Tables S5 and S6). In the circadian rhythm pathway, all enriched DEGs were downregulated (Fig. 10A). However, DEGs such as GhLHY have different expression patterns in G. barbadense compared to G. hirsutum. Other DEGs homologous to GhLHY showed the same expression trend as GhLHY in G. hirsutum, such as CotAD_17426. In addition, other genes in the circadian rhythm pathway have the same condition, such as CCA1, PIF3, and APR7 (Fig. 10B). Furthermore, the plant hormone signal transduction pathway was substantially enriched in G. barbadense but not in G. hirsutum (See supplemental Tables S5-S7). This finding supports the hypothesis that other regulation pathways may be involved in G. barbadense, and the results showed that many DEGs were related to multi-hormone signal transduction pathways (Fig. 10C, See supplemental Table S7).
In conclusion, GhBRC1 may play a key role in floral bud formation (flowering) and fruit branch development in cotton, which is similar to BRC1′ s central role in the architecture of Arabidopsis 42,43 . GhBRC1 may also be associated with plant hormones and the flowering network, including the circadian rhythm genes that control fruit branching and floral transition, and thus is involved in elaborate regulatory mechanisms and networks.

Conclusions
Overall, our results imply that the molecular pathways similar to those in Arabidopsis may control fruit branching in cotton. Multiple regulated pathways may affect the development of floral buds on leaf axils in G. barbadense and G. hirsutum. The development of the fruit branch is closely related to the induction of flowering. Our results provide a basis for plant type research in crops and an approach for elucidating the molecular mechanism of the development of the nulliplex branch and the normal branch in the flowering and fruit branch forming stage for improving plant architecture and yield in cotton and other plants.

Materials and Methods
Plant material. The cotton varieties Xinhai16 (G. barbadense L., nulliplex-branch), Hai1 (G. barbadense L., normal branch), line 3798 (G. hirsutum L., nulliplex-branch), and line Huazhong 94-3130 (G. hirsutum L., normal branch, obtained from the Cotton Institute of the Chinese Academy of Agricultural Sciences of China) were grown in a natural field environment (Kaifeng, China) in 2014, all the materials are stable and pure lines or varieties which have been self-fertilized for at least eight generations before we used in the studies. Shoot apices (approximately 10 mm) from each individual were collected at three sampling stage and stored immediately in liquid nitrogen after washing with water (See supplemental Fig. 3). The three stages used for constructing RNAseq libraries included the seedling stage with two leaves of Xinhai16 (Gb1np), Hai1 (Gb1WT), 3798 (Gh1np), and Huazhong 94-3130 (Gh1WT). The pre-squaring stage with five and six leaves for Xinhai16 (Gb2np), seven and eight leaves for Hai1 (Gb2WT), five leaves for 3798 (Gh2np), and six and seven leaves for Huazhong94-3130 (Gh2WT), respectively, all without any visible triangular floral bud, just in the forming stage of floral bud (flowering) and fruiting branch in the pre-squaring stage. The squaring stage with nine leaves for Xinhai16 (Gb3np), nine and ten leaves for Hai1 (Gb3WT), and eight to ten leaves for 3798 (Gh3np) and Huazhong94-3130 (Gh3WT), respectively, all with visible triangular floral buds in the field. These leaf numbers above refer to visual, expanded true leaves in cotton field 69 . The nodes of first fruiting branch we investigated for each line including Xinhai16, Hai1, 3798 and Huazhong 94-3130 were 6-7,8-9,5-6,7-8 true expanded leaves, respectively, in virtue of the different lines (varieties). The same RNA samples were used for sRNA-seq library construction, and we labelled the libraries with the prefix 'mi' as in 'miGb1np' .
Sample preparation and sequencing library construction. The  Then, the total RNA of each sample was sent to BGI (Shenzhen, China) for library construction and sequencing. Finally, an Agilent 2100 Bioanalyzer and ABI Step One Plus Real-Time PCR System was used for the quantification and qualification of the sample library, after which the library was sequenced using an Illumina HiSeq ™ 2000 platform.
Alignment of RNA-seq reads to the reference genome. Reference genome sequences and annotations were downloaded from the Institute of cotton Research of CAAS (http://cgp.genomics.org.cn) 33 . Reads were aligned to the genome with no more than five mismatches using the SOAP aligner/SOAP2 software after low-quality reads (reads with unknown sequences 'N'), adaptor sequence fragments, and empty reads were removed 70 . After passing the QC process of the alignment, the results were used for further analysis.
Differential gene expression analysis. RPKM was used to obtain the relative levels of expression 71,72 .
Differential expression analysis was performed using R packages of DESeq for comparisons among samples with three biological replicates 71,73 . We used a false discovery rate of < 0.001 and an absolute value of the log 2 ratio > 1 as the threshold for judging the significance of the gene expression differences 74 . The DEGs were further analysed by GO-Term Finder and path_finder for GO and KEGG enrichment 75 . The calculated p-value was Bonferroni corrected, taking the corrected p-value < 0.05 as a threshold for GO annotation. After the GO annotation was obtained for each gene, WEGO software was used to classify the genes by function and to determine the distribution of gene functions in the species at the macro level 76 . The DEGs were annotated using the Mercator web tool 77 and then loaded into Mapman software for function enrichment analysis [78][79][80] . The Venn diagrams in this study were prepared using the function Venn diagram in R based on the gene list for each tissue type.
Small RNA classification. After trimming adaptor sequences at the 5′ and 3′ ends of sequenced reads low-quality reads (reads with unknown sequences 'N'), adaptor sequence fragments, and empty reads were filtered out. Reads were aligned to the genome using the SOAP aligner/SOAP2 software with parameter settings for perfect matches. Then, sRNA-seq reads were compared to miRNA databases including repeat RNA, miRBase (version 21, http://www.mirbase.org/ftp.shtml), GenBank (ftp://ftp.ncbi.nlm.nih.gov/genbank/), Rfam (version 11, http://rfam.janelia.org/), and snoRNA (http://www.mir2disease.org/) using the BlastN program (version 2.2.16). Mireap software (version 0.1) was used for novel miRNA prediction. Differential expression analysis of small RNAs and target prediction. Differentially expressed miRNAs between samples were identified using DESeq software 73 . The false discovery rate of < 0.001 and an absolute value of the log2 ratio > 1 were used as the threshold for determining the significance of miRNA expression differences. miRNA target gene prediction was performed using psRobot and TargetFinder [81][82][83][84][85] . miRNA function categorisation and pathway enrichment were based on the GO and KEGG databases.
Data validation by qPCR. Total RNA was extracted as described for DEG library preparation and sequencing. Total RNA (2 μg) from each sample was reverse transcribed in a 20 μL reaction using Reverse Transcriptase M-MLV (Takara). The sequences of the primers used are shown in Supplemental Table S8. The actin gene and/ or UBQ7 gene of cotton (GeneBank accession No. is AY305733 and DQ116441) were used as internal control genes. qRT-PCR was performed using a SYBR Premix Ex Taq ™ Kit (Takara) according to the manufacturer's protocol. The 20 μL reaction volume consisted of forward and reverse primers (1 μL) SYBR premix Ex Taq II (10 μL), ddH 2 O (2.6 μL), cDNA (5 μL) and ROX Dye II (0.4 μL). The selected genes were verified using an ABI 7500 real-time PCR detection system with a cycling temperature of 60 °C and a single peak on the melting curve to ensure a single product. Relative transcript levels for each sample were obtained using the 2 −△△Ct method. At least three replicates were tested per sample.