Dynamic mRNA and miRNA expression analysis in response to intermuscular bone development of blunt snout bream (Megalobrama amblycephala)

Intermuscular bone (IB), which occurs only in the myosepta of lower teleosts, is attracting more attention because they are difficult to remove and make the fish unpleasant to eat. By gaining a better understanding of the genetic regulation of IB development, an integrated analysis of miRNAs and mRNAs expression profiling was performed on Megalobrama amblycephala. Four key development stages were selected for transcriptome and small RNA sequencing. A number of significantly differentially expressed miRNAs/genes associated with bone formation and differentiation were identified and the functional characteristics of these miRNAs/genes were revealed by GO function and KEGG pathway analysis. These were involved in TGF-β, ERK and osteoclast differentiation pathways known in the literature to affect bone formation and differentiation. MiRNA-mRNA interaction pairs were detected from comparison of expression between different stages. The function annotation results also showed that many miRNA-mRNA interaction pairs were likely to be involved in regulating bone development and differentiation. A negative regulation effect of two miRNAs was verified through dual luciferase reporter assay. As a unique public resource for gene expression and regulation during the IB development, this study is expected to provide forwards ideas and resources for further biological researches to understand the IBs’ development.

It is well known that most of the freshwater aquaculture fishes around the world, especially Cyprinidae species, possessed a certain amount of intermuscular bones (IBs), which are hard-boned spicules located in the muscle tissue on both sides of the vertebrae 1 . The IBs develop directly from mesenchymal condensation and consist of membranous ossifications of connective tissue in the muscular septum. IBs differ from other axial skeletons, such as the vertebral column, which develops from a mesenchymal cell population derived from the ventral somite 2 . IBs reduce the edibleness and economic value of a fish species 3 . In view of the potential value in eliminating IBs, researchers began to study IBs in fish as early as the 1960s 4 . Interestingly, prior studies revealed the significant differences in the number of IBs among fish of different ploidy, indicating that genetic improvement methods could have a significant effect on decreasing the number of IBs 5 . But so far, almost all existing studies have only focused on the morphology of IBs in fish species [6][7][8] . Few in-depth studies have been carried out and little information has been obtained regarding the molecular mechanism for development of IBs. Only one previous study has revealed the molecular properties of IBs through microRNA (miRNA) transcriptome analysis 9 .
High-throughput sequencing technologies are being widely used to investigate the complete repertoire of expressed RNA transcripts in specific tissues or cells at specific stages of development, and this knowledge is helping improve our understanding of the molecule mechanism affecting expression 10 , regulatory 11 and development 12 . In fish, independent transcriptome profiling (RNA-seq) of miRNAs or mRNAs has been used to identify gene and miRNA expression related to various physiological functions [13][14][15][16] . As miRNAs has been reported to regulate gene expression by promoting mRNA degradation and repressing translation, the combined analyses of mRNA with miRNA expression data has been conducted in plants and mammals to infer regulatory mechanisms associated with various physiological processes [17][18][19] . In fish species, the integrated analysis of mRNA and miRNA expression profiles has been performed on zebrafish (Danio rerio) for optic nerve regeneration and on darkbarbel catfish (Pelteobagrus vachelli) in response to hypoxia 20,21 .
Perazza et al. 22 has reported that specimens of tambaqui (Colossoma macropomum) from one culture population lack IBs, even though this fish normally has IBs. Xu et al. 23 has identified an intermuscular bone-deficient grass carp (Ctenopharyngodon idellus) mutant in an artificial gynogenetic group. Based on these two reports, it has been speculated that there maybe one or several key genes that regulate the development of IBs.
In the present study, we used one typical Cyprinidae species, blunt snout bream (Megalobrama amblycephala), for which the morphological characteristics, emergent periods and morphogenesis of IBs are known from our previous research 24 . IBs gradually appear during the growth and development of M. amblycephala. It emerged 20 days post hatching (dph) at a body length of 1.33 cm, first in the tail and then toward the head. When fry were 40 dph with the body length of 2.36 cm, all intermuscular bones had appeared. With the aim to generate the fundamental molecular resources across the different developmental stages from emerging to complete formation of IBs and try to find the putative key genes or miRNAs for regulating IBs' development, the transcriptome property, mRNA and miRNA expression profiling were investigated in the four key stages related to IBs development of M. amblycephala in this study. A certain amount of bone-related genes, period-specific or period-differential expressed mRNAs and miRNAs were identified, and the related signaling pathways were uncovered through the comparisons of the four development stages. Meanwhile, some interaction networks and regulatory modes of mRNAs and miRNAs were also revealed based on the integrated analysis of miRNA and mRNA expression profiles.

Materials and Methods
All animals and experiments were conducted in accordance with the "Guidelines for Experimental Animals" of the Ministry of Science and Technology (Beijing, China). The study was approved by the Institutional Animal Care and Use Ethics Committee of Huazhong Agricultural University. All efforts were made to minimize suffering. All experimental procedures involving fish were approved by the institution animal care and use committee of the Huazhong Agricultural University.
Experimental animals and tissue collection. Experimental fishes were selected from an artificially propagated population and grown in the recirculating aquaculture system after hatching. Before sampling, experimental animals were anaesthetized in well-aerated water containing the 100 mg/L concentration of Tricaine methanesulfonate (MS-222). Four key stages for the development of IBs were selected as done for our previous research 21 and shown in Fig. 1. Muscle tissue containing IBs in the tail (Fig. 2), which were the initial area of intermuscular bone differentiation, were sampled under an anatomical lens. Details about sampling are described in Supplementary Fig. S1-1. The accuracy of sample collection was validated using Alizarin red and Alcian blue staining to ensure that there was no interference form fins, scales and other impurities (Supplementary Fig. S1-2). Composite samples of 6-10 individuals were collected to extract total RNA for every stage and each stage possessed two biological replicates. In addition, six months old fish was used to sample different tissues for gene expression analysis.
Total RNA isolation and cDNA library construction. Total RNA was isolated from the samples at 4 stages and different tissues using RNAiso Plus Reagent (TaKaRa, China) according to the manufacturer's protocol. RNA quality and quantity were measured using the NanoDrop 2000 (Thermo Scientific, USA) and Agilent 2100 Bioanalyzer (Agilent, USA). All the samples were standardized to 500 ng/μ l. At each stage, two Digital Gene Expression Profiling libraries and one miRNA transcriptome library were constructed. Meanwhile, a reference transcriptome library was constructed by mixing equal volumes RNA from 4 stages. The schematic diagram for high throughput sequencing is shown in Supplementary Fig. S1-3. During the QC steps, Agilent 2100 Bioanaylzer and ABI StepOnePlus Real-Time PCR System were used to quantify and qualify the sample library preparations. Finally, the libraries were sequenced using an Illumina HiSeq ™ 2000 platform (BGI, Shenzhen, China).
All of the raw RNA-Seq data were submitted to the NCBI databases (http://trace.ncbi.nlm.nih.gov/Traces/sra/ sra.cgi?view= run_browser) under accession number SRR2335137 for transcriptome data, SRR2338087 for DGE data and SRR2338872 for miRNA data.
De novo assembly and functional annotation. All reads with low quality or shorter than 18 nt were eliminated. The 5′ or 3′ primer contaminants and polyA tails were removed. The clean sequencing reads were de novo assembled using the Trinity (http://trinityrnaseq.sourceforge.net/) with Kmer = 25. The TGI Clustering Tool (version 2.1) (http://sourceforge.net/projects/tgicl/) was used for further processing of sequence splicing and redundancy removal. BLASTx (version v2.2.26) (E-value cut-off 1e-5) protein database (NR, Swiss-Prot, KEGG and COG) alignment was performed. When results from different databases conflicted, a priority order of NR, Swiss-Prot (http://www.ebi.ac.uk/swissprot/), KEGG (http://www.genome.jp/kegg) and COG (http://www. ncbi.nlm.nih.gov/COG) was followed for annotation. When a unigene was found to be unaligned to the above databases, ESTScan software (http://estscan.sourceforge.net/) was introduced to decide its sequence direction. Furthermore, biological function annotation and classification were performed by mapping unigenes to each term of the Gene Ontology database (http://www.geneontology.org/), and pathways for biochemical and signal transduction were determined via a KEGG pathway analysis (http://www.kegg.jp/). Gene expression profiling of different development stages. After raw data filtering, clean reads of the 8 libraries were mapped to the reference transcriptome using SOAPaligner/soap2 (http://soap.genomics.org.cn/) to obtain unique mapped reads, allowing a maximum of two mismatches in the alignment. The number of unique-match reads was calculated and normalized to RPKM (reads per kb per million reads) for gene expression analysis. Comparison of unigene expression between different stages was performed using DESeq analysis 25 . Genes with log-fold difference (log2Ratio) ≥ 1 and false discovery rate (FDR < 0.001) were considered to be significantly differentially expressed. MiRNA sequencing and differential expression analysis. Four small RNA libraries were constructed and sequenced as previously described 15 . High-quality reads were blasted against the Rfam (ftp://sanger.ac.uk/ pub/databases/Rfam/) database and the GenBank noncoding RNA database (http://blast.ncbi.nlm.nih.gov/) to annotate rRNA, tRNA, snRNA and other ncRNA sequences, and then aligned to exons and introns of mRNA to screen and remove degraded fragments. Selected sequences were also mapped to the reference transcriptome  with a tolerance of one mismatch in the seed sequence to analyze their expression and distribution on the genome by SOAP. Conserved miRNAs were identified through a Blastn search against the miRNA database, miRBase20.0 (http://www.mirbase.org/ftp.shtml) using Mireap software (https://sourceforge.net/projects/mireap/).
MiRNAs with reads less than 100 were discarded, and miRNA expression levels were normalized by TPM (transcript per million) values (TPM = (miRNA total reads/total clean reads) × 10 6 ). Comparisons between developmental stages were made to find significantly differentially expressed miRNAs (|log2(fold change)| > 1 and P-value ≤ 0.05). Subsequently, gene expression differences between developmental stages showing complementarity with corresponding miRNA expression values were selected and analyzed with Targetscan (http:// www.targetscan.org/) and MiRanda (http://www.microrna.org) to predict the miRNA target. Furthermore, GO terms and KEGG enrichment analysis on differential expressed miRNAs was determined via a hypergeometric test with FDR ≤ 0.05. In addition, detection and interaction analysis of miRNA-mRNA pairs were performed based on the targets prediction, function annotation and negative regulation mechanism of mRNA and miRNA. The miRNA-mRNA interaction networks were displayed using visualizing maps.
Quantitative PCR for miRNA and mRNA expression. To validate the sequencing data, 9 miRNAs and 6 mRNAs showing sustained significant increases or declines in expression over the four stages and 13 miRNA-mRNA interaction pairs identified from the pairwise comparison of S1 and S3 (S1 and S3 showing significant differentiation in the number and form of IBs; Fig. 1) were selected, and specific primers were used to quantify the miRNA and mRNA for each stage of development (Supplementary Table S1). After acquiring high quality total RNA, miRNAs and mRNAs were reverse transcribed using PrimeScript RT reagent Kit with gDNA Eraser (TaKaRa, Japan, RR047A). Quantitative real-time PCR analyses on the miRNAs and the mRNAs were performed using the SYBR Green PCR Master Mix (TaKaRa, Japan, RR820A) on a Roche LightCycler 480 System II (Roche, Mannheim, Germany) according to the manufacturer's instructions. M. amblycephala 18s RNA and β -actin were used as internal controls for miRNA and mRNA RT qPCR, respectively. All of the real-time reactions were performed in triplicate and the relative expression levels were measured in terms of threshold cycle value (Ct) and were normalized using the equation 2 −ΔΔCt , in which Δ Ct = Ct miRNA/mRNA − Ct 18s/β-actin . Based on the annotation information of mRNA transcriptome data, we selected a number of bone-related genes to analyze their expression in different tissues (Supplementary Table S2), with the aim to find IB specific genes.
Vector Construction and Dual Luciferase reporter assays. The 3′ UTR of tgfbr1a and runx2a containing a miR-133b-3p binding site and the 3′ UTR of tgfbr1a and runx2b containing a miR-206-3p binding site were amplified from genomic DNA by PCR with the primers shown in Supplementary Tables S1-4. PCR products were cloned into pmirGLO using the MSS I and Xho I restriction sites. Dual luciferase reporter experiments were performed in HeLa cell lines (Cell Collection Center for Freshwater Organisms, Huazhong Agricultural University). When the cells reached 60% to 70% confluence in 24-well plates, pmirGLO-3′ UTR (200 ng) was co-transfected with 100 nM negative control or a microRNA mimics (GenePharma, Shanghai, China) using 2 μ L Fugene6 (Promega) according to the manufacturer's instructions. The relative luciferase activity was measured 24 h after transfection by the Dual-Luciferase Reporter Assay System (Promega).

Results and Discussion
Assembly and annotation of reference transcriptome. To obtain a reference transcriptome for the IBs' development in M. amblycephala, a RNA-Seq library was constructed using RNA from samples of four development stages. A total of 92,991,884 raw reads were generated through high-throughput sequencing. After quality control, approximately 7,882,445,160 nt of high-quality data with a Q20 percentage of 98.11% and GC percentage of 48.70% was available for analysis (Table 1). With the Trinity de novo assembler, a total of 127,712 contigs were generated, with an average length of 301 bp and an N50 of 437 bp. A total of 52,918 unigenes were further generated with an average length of 635 bp and an N50 of 865 bp ( Table 1). The size distribution of these contigs and unigenes is shown in Supplementary Fig. S1-4 and 5. A total of 46,569 unigenes were successfully annotated through alignment to reference databases. In total, 33,354 (63.03%), 45,648 (86.26%), 29,753 (56.22%), 24,094 (45.53%) and 9,072 (17.14%) unigenes could be annotated by NR, NT, Swiss-Prot, KEGG and COG database, respectively, with 6,349 (12.00%) unigenes showed no homology to known sequences deposited in these databases (Table 1).
Based on the NCBI nr database, E-value distribution ( Supplementary Fig. S1-6A) and homology percent of the unigenes was performed and mostly of them showed strong homology to available database sequences ( Supplementary Fig. S1-6B) Fig. S1-6C). A total of 22,222 M. amblycephala unigenes were classified into 3 gene ontology (GO) categories (cellular component, biological process and molecular function) ( Supplementary  Fig. S1-7) and 24,094 unigenes were mapped into 258 KEGG pathways. Metabolic, regulation of actin cytoskeleton, cancer and focal adhesion pathways dominated the KEGG functional analyses. In addition, some pathways related to osteocyte differentiation, such as the MAPK (mitogen-activated protein kinase), Wnt signaling pathway, osteoclast differentiation and TGF-β (transforming growth factor beta) signaling pathways, were highly represented in the results of the KEGG functional analyses (Supplementary Table S3).
Based on the annotation, hundreds of bone-related genes, such as FGFRs, SOXs, Runx2, TGFβ s, BMPs, SMAD, Osteocalein, MMP, cathepsin K, Col1a1/2, Col2a1, ColXa1, IGFs and so on, were identified in the IB transcriptome (Supplementary Table S4). These genes have a well-recognized function in formation and differentiation of cartilage and bone 26 . For example, the combination of SOX5, SOX6, and SOX9 suggests that the signals necessary for induction of permanent cartilage are present in the transcriptome 27 . BMP signaling is known to be involved in fish in fin growth, scleroblast differentiation 28 Table 2. The distribution of unique reads in the 8 DGE libraries was compared as an assay of gene coverage. Similar coverage was obtained for all 8 libraries. More than 34% of the unigene sequences in every library have a gene coverage of more than 70% (Supplementary Fig. S1-8), which is determined as the ratio of the base number in a gene covered by unique mapping reads to the total bases number of that gene. Sequencing saturation analysis showed the number of unique tags reached a plateau shortly after the amount of clean reads reached 20 million. Therefore, the 8 libraries fully represent the transcripts expressed in each developmental stage. Principal component analysis (PCA) showed that the biological replicates had very similar expression levels, suggesting good reproducibility of the method ( Supplementary Fig. S1-9).
The analysis of specific expression genes (SEGs) found 59, 239, 569 and 470 development-dependent specific expression genes in the S1, S2, S3 and S4 stages, respectively (Supplementary Table S6). The number of SEGs increased with the development of IBs, which showed that more unigenes expressed and functioned with the appearance and development of IBs. The identification of SEGs related to development stages may provide key genomic information to explore the mechanism of IB development.

Functional analysis of DEGs and SEGs during the four developmental stages. GO categoriza-
tion of DEGs in the six pairwise comparisons were all significantly enriched in cellular process, cell, cell part, binding and single-organism process (Supplementary Table S5), indicating the similarity in the major processes that respond to different developmental stages of IBs. Through aligning to the KEGG database, which categorizes gene functions with emphasis on biochemical pathways, a total of 13, 92, 52, 184, 121 and 184 DEGs involved in 12, 35, 31, 50, 55 and 52 pathways were predicted in the pairwise comparison of S1-vs-S2, S2-vs-S3, S3-vs-S4, S1-vs-S3, S2-vs-S4 and S1-vs-S4 (Supplementary Table S7), respectively. The metabolic pathway containing the most DEGs was ko01100, which performs a variety of anabolic and catabolic tasks, affecting energy conversion, macromolecular compounds synthesis and so on 33 . Other pathways with an abundance of DEGs included those involved with the biosynthesis of secondary metabolites (ko01110), protein processing in endoplasmic reticulum (ko04141) and ribonucleotide metabolism (ko00240, ko00230). Similar results were obtained for GO annotation and KEGG analysis of SEGs in four development stages (Supplementary Table S6).
In order to narrow the focus of our analyses to genes likely to be relevant to bone tissue development, we performed a detailed analysis of expression levels for genes involved in the TGF-β pathway, which is a known important factor in osteoblast differentiation and bone formation 32 . Twenty-six TGF-β genes were found to show a sustained decline in expression levels, while 14 TGF-β genes showed sustained increases in expression from stages S1 to S4 (Fig. 3C). By referring to the KEGG map of the TGF-β pathway, we found that the TGF-β pathway   cooperated with the MAPK and extracellular signal-regulated kinase (ERK) pathways to regulate and control osteoblast differentiation (Fig. 4). It is noteworthy that Unigene34269 and Unigene250 were down-regulated in the ERK 1/2 pathway and Unigene32329 was down-regulated in TGF-β 2 pathway. Unigene39770 and Unigene20588, which belong to the BMP (bone morphogenetic protein) signaling pathway, were differentially expressed in S1 and S3. Previous studies have reported that ERK plays a significant role in survival of osteoclasts and BMP promote differentiation of meschenchymal cells into chondrocytes and osteoblasts [34][35][36][37] .     Table S9). To evaluate the time course and development-dependent miRNA activities across the four development stages of IBs, we performed a time course differential expression miRNA analysis by comparing each two adjacent developmental stages (Supplementary Table S10). Using |log2Ratio| ≥ 1, P < 0.05 and reads ≥ 100 as the cut-off, we identified 132 (40 up-regulated and 92 down-regulated), 120 (69 up-regulated and 51 down-regulated) and 174 (108 up-regulated and 66 down-regulated) differentially expressed miRNAs in adjacent pairwise comparisons of S1-vs-S2, S2-vs-S3 and S3-vs-S4 (Fig. 5A). In nonadjacent pairwise comparisons, 194 (64 up-regulated and 130 down-regulated), 176 (119 up-regulated and 57 down-regulated) and 241 (113 up-regulated and 128 down-regulated) differentially expressed miRNAs were identified in S1-vs-S3, S2-vs-S4 and in S1-vs-S4, respectively (Fig. 5A). Similar to the mRNAs, comparisons of nonadjacent developmental stages obtained more differentially expressed miRNAs (Fig. 5A), and miRNA expression showed a long-term slowly increase/decrease with the IB development. Through Venn diagram analysis, 16 and 51 overlapped differentially expressed miRNAs were identified in the three adjacent pairwise comparisons and three nonadjacent pairwise comparisons, respectively (Fig. 5B). In order to further understand the regulatory functions of miRNAs, a total of 52,918 unique targets of M. amblycephala were predicted for the 420 known miRNAs and 41 novel miRNAs including 362 highly significant differently expressed miRNAs.

Sequence and expression profiling of IB development-dependent microRNAs.
Furthermore, in order to obtain significant miRNAs, which expression has a sustained increase/decline trend with the development of IBs, a comprehensive analysis was performed using standardized reads data from four stages. Twenty-two miRNAs with sustained increased expression and 18 miRNAs with sustained decreased expression from S1 to S4 stages were detected (Fig. 5C). Functions of these miRNAs in differentiation and development of bone have been verified in many species. For instance, prior studies provided evidence that let-7 miR-NAs and miR-140 play major roles in endochondral bone development 38 . Let-7 and miR-22 had also been shown to be positive regulators of bone development, promoting osteogenesis and suppressing adipogenesis of MSCs in vitro 39,40 . MiR-100, miR-26a and miR-148 play important roles in osteogenic differentiation and osteoclastogenesis, respectively [41][42][43] . A previous study also found that over-expression of the Hox-cluster miR-196 in zebrafish embryos reduces the number of ribs and somites. Reciprocally, miR-196 knockdown could evoke extra ribs and somites 44 . Therefore, some of the miRNAs that are differentially expressed between these important developmental stages are likely to have important roles in affecting osteoblast differentiation and osteogenesis.
Analysis of miRNA-mRNA interaction in different developmental stages. The investigation of individual miRNA-target interactions and the miRNA-target interaction network has been an exciting and challenging field of study. In general, increased miRNA activity has been found to reduce the expression of mRNA targets, and vice versa 45,46 . Therefore, integrated analysis of miRNA and mRNA expression profiles over the four developmental stages is an effective way to identify functional miRNA-mRNA interaction pairs involved in specific biological processes 47,48 . Through the simultaneous analysis of expression profiles for miRNA and target mRNA involved in IB development, we identified 201, 31, 92, 219, 93 and 500 miRNA-mRNA interaction pairs from six pairwise comparisons (S1-vs-S2, S2-vs-S3, S3-vs-S4, S1-vs-S3, S2-vs-S4 and S1-vs-S4), respectively (Supplementary Table S11). For the miRNA-mRNA interaction pairs, we found that some miRNAs were closely related to bone formation and differentiation, such as miR-133/133a/133b, miR-222 and miR-3960 [49][50][51] . The interaction map of our study performed that CL1948.Contig2 (S1-S4) was down-regulated with the up-regulated of miR-133, miR-133-3p and miR-133b, Figure 7. Comparison of expression levels for the 9 detected miRNAs and 6 mRNAs using RNA-Seq and qRT-PCR. All the data were shown as mean ± SE. The different letters for qRT-PCR results mean significant difference between stages (P < 0.05).
Quantitative analysis of miRNA and mRNA expression. Quantitative PCR results for 9 miRNAs and 6 mRNAs revealed similar changes in expression to that detected using RNA-seq data (Fig. 7). A similar result was also observed in the validation of the miRNA-mRNA interaction pairs, and 12 of 13 interaction pairs have a same expression pattern with the sequencing data except one of them performed a down-down regulating pattern (Fig. 8). We think the result is acceptable and it confirmed the credibility of molecular resources and sequencing data identified in our study. Because extensive researches performed that hundreds of miRNAs interact with thousands of target mRNAs to maintain proper gene expression patterns under various physiological functions, as the results we showed. The complexity of the miRNA-mRNA interaction network presents a great challenge for researchers to reveal the roles for specific miRNAs or miRNA-mRNA interactions in specific biological processes 60 . For instance, miR-709, miR-466 and miR-466i-3p constituted a complex interaction network with unigene32659, unigene7148, unigene15406 and many other miRNAs and mRNAs (Fig. 6B). Furthermore, miRNA-mRNA interaction is only one of the multiple mechanisms influencing the regulation of gene expression and our results will not be sensitive in instances when multiple factors, in addition to miRNA-mRNA interactions are involved 61,62 . Moreover, in order to explore IB specific genes, we performed expression analysis of 65 bone-related genes in IBs, muscle and rib tissues. However, the present results indicated that it was not possible, despite repeated attempts to demonstrate IB specific localization of gene from the transcriptome. The failure to demonstrate IB specific gene expression may be because of the fact that the transcriptome contained genes not only for IB, particularly since genes inducing IB development may be expressed in associated tissues, such as muscle. Clearly further research will be required to confirm the function of genes involved in IB development.

Conclusion
This is the first integrated analysis of miRNA and mRNA related to IBs development in fish species. MiRNA and mRNA expression profiles involved in IB development were revealed over four developmental stages. Putative miRNAs/genes associated with IB development were identified in the study. In addition, we found that a number of miRNA-mRNA interaction pairs were associated with bone formation and differentiation, adding to knowledge about the regulation of IB development. This study generated fundamental molecular resources for four developmental stages from emerging to complete formation of IBs, which can be used to improve understanding about the molecular processes affecting IB development.