Integrated microRNA-mRNA analyses reveal OPLL specific microRNA regulatory network using high-throughput sequencing

Ossification of the posterior longitudinal ligament (OPLL) is a genetic disorder which involves pathological heterotopic ossification of the spinal ligaments. Although studies have identified several genes that correlated with OPLL, the underlying regulation network is far from clear. Through small RNA sequencing, we compared the microRNA expressions of primary posterior longitudinal ligament cells form OPLL patients with normal patients (PLL) and identified 218 dysregulated miRNAs (FDR < 0.01). Furthermore, assessing the miRNA profiling data of multiple cell types, we found these dysregulated miRNAs were mostly OPLL specific. In order to decipher the regulation network of these OPLL specific miRNAs, we integrated mRNA expression profiling data with miRNA sequencing data. Through computational approaches, we showed the pivotal roles of these OPLL specific miRNAs in heterotopic ossification of longitudinal ligament by discovering highly correlated miRNA/mRNA pairs that associated with skeletal system development, collagen fibril organization, and extracellular matrix organization. The results of which provide strong evidence that the miRNA regulatory networks we established may indeed play vital roles in OPLL onset and progression. To date, this is the first systematic analysis of the micronome in OPLL, and thus may provide valuable resources in finding novel treatment and diagnostic targets of OPLL.


Results
MicroRNAs were differentially expressed in OPLL derived primary ligament cells compared to normal primary posterior longitudinal ligament cells. In order to study the global microRNA (miRNA) expression change in OPLL compared with PLL, we first obtained tissue samples from cervical spine corpectomy patients either with ossified posterior longitudinal ligament (OPLL, n = 3) or normal posterior longitudinal ligament (PLL, n = 3). Obtained tissues were subjected to primary ligament cell culture immediately, and only ligament cells from passage 1 were used in the high throughput profiling analysis. After the sequencing, approximately 1,000,000 clean reads per sample were generated. All reads were mapped with annotated miRNAs in miRBase database (version 20), and approximately 35% of the clean reads were mapped to mature miRNA in the database, while approximately another 35% of the clean reads were mapped to ribosome RNA (rRNA) or Small nucleolar RNAs (snoRNA) when mapped to other small RNA datasets.
Comparing the results of OPLL and PLL group, significant difference in mapped reads' chromosome distribution were found initially (Fig. 1a), which gave direct evidence of altered global miRNA expression between the two groups. After analyzing and normalizing all the mapped reads in both groups, we identified the existence of 1520 miRNAs altogether. After applying a stringent filtering critiria that compared OPLL with PLL (False discover rate, FDR < 0.01, fold change > 2 or < 0.5), we identified 144 upregulated and 74 downregulated miR-NAs (Fig. 1b, Table 1). Hierarchical cluster analysis was performed based on these data using Euclidean distance similarity metric to test the similarity of these samples in a global level (Fig. 1c). A heatmap of all 1520 detected miRNAs in two groups were generated, and showed distinct miRNA expression pattern between two groups. And the clustering of samples showed that samples in the same group were much alike as the two groups were divided initially in the cluster tree. Taken together, by comparing the mirNome expression data of OPLL and PLL primary ligament cells, we showed significant difference exists in the miRNA expression between the two groups.
Defining OPLL specific microRNA signatures through multiple microRNA profiling comparison.
Although altered miRNAs were found in OPLL using high throughput approaches, but the specificity of these miRNAs have not been tested, especially in the context that OPLL may share common traits with other ossification processes. To clarify and define OPLL specific miRNAs, we selected multiple miRNA profiling data from the Gene Expression Omnibus repository (GEO, http://www.ncbi.nlm.nih.gov/gds). We searched for miRNA profiling data from cell types that resemble posterior longitudinal ligament cells, and chose GSE19232 16 (Contains miRNA data of dermal fibroblasts, mesenchymal stem cells and osteo-differentiated mesenchymal stem cells), GSE34144 17 (Contains miRNA data of osteoblasts), GSE47025 18 (Contains miRNA data of dental pulp cells and periodontal ligament cells) for further analysis.
To further analyze the concordance of different miRNA profiling datasets, we first normalized these data using quantile normalization method. After normalization, Hierarchical cluster analysis was performed. The results showed that although samples tend to be clustered according to the utilized platform, but the up or down regulated miRNAs were consistently different between cell types (Fig. 2a). The results indicated that OPLL altered miRNAs may be specific to OPLL, as no significant similarity were found between other cell types. However, detailed identification are performed to further specify these microRNAs.
Mesenchymal stem cells (MSCs) are pluripotent cells that can differentiate into mature osteoblasts, and the differentiation process is also well characterized. To test whether OPLL related miRNA changes also take place in the osteo-differentiation of MSCs, we compared the upregulated and downregulated miRNAs in the two datasets. In the comparison, of the 124 miRNAs that were upregulated in the OPLL, only 2 were overlapped with MSC/ osteo transition (Fig. 2b). In the downregulated miRNAs, 3 out of 75 miRNAs were found overlapping (Fig. 2b).
To gain more evidence on the specificity of these dysregulated miRNAs, we compared their abundance of each cell type. We first ranked all miRNAs by their profiling expression in every cell type, and compared their rank with each cell type. We found that the most abundant miRNAs (Colour labeled) in OPLL make up approximately 80% of the total miRNA abundance (Normalized counts), and most of them still compromise more than 70% total miRNA abundance in the PLL (Fig. 2c). However, this expression pattern is not found in either osteoblasts or periodontal ligament cells. In the case of MSC and fibroblasts, although some miRNAs have similar rank places in OPLL, but the overall state is still more dissimilar (Fig. 2c). The initial result indicated that from the aspect of global miRNA expression, every cell type we tested is unique from each other. Comparing the rank place changes Scientific RepoRts | 6:21580 | DOI: 10.1038/srep21580 of formerly discovered OPLL dysregulated miRNAs in these cell types, we found over 50% of them do not have similar rank places in cell types other than OPLL or PLL (supplementary data 1). Taken together, although some OPLL altered microRNAs share common expression traits with osteo-differentiation of MSCs or fibroblasts, the majority of identified OPLL altered miRNAs are OPLL specific (Table 1).
Global transcriptome analysis revealed OPLL altered gene expressions are related to ossification and skeletal development. To fully decipher the OPLL miRNA regulatory network, we first analyzed the global transcriptome in both OPLL and PLL. Taking advantage of Hierarchical cluster, we found the global gene expression is greatly altered (Fig. 3a), with 1667 genes significantly upregulated (Fold change > 2, FDR < 0.05) and 1264 genes significantly downregulated (Fold change < 0.5, FDR < 0.05). In order to categorize the altered genes, we applied Gene Ontology (GO) analysis. Altogether 65 significant (P < 0.01) GO terms were associated with upregulated genes, among which extracellular matrix organization, embryonic skeletal system morphogenesis, embryonic skeletal system development, skeletal system development, collagen fibril organization, osteoblast differentiation, ossification are more significant and highly enriched (Fig. 3b, Supplementary data 2). However, these terms were not found in the GO analysis in genes downregulated in OPLL (Fig. 3c). In showing the number of total mapped miRNAs in both groups and the number of differentially expressed miRNAs between two groups. Hierarchical clustering (c) of both samples and the miRNAs were performed to visualize the miRNA expression patterns in two groups. Each column represents each sample indicated in the bottom, each row represents an identified miRNA. The expression levels are depicted according to the color scale (middle right). Red or green indicate expression levels above or below the median, respectively. The magnitude of deviation from the median is represented by color saturation. Continued the Kegg pathway analysis of altered genes, we also found that ECM-receptor interaction and Calcium signaling pathway are highly enriched in upregulated genes, while Mineral absorption related pathway and Glycerolipid metabolism are enriched in the downregulated genes. Taken these data together, we found that OPLL cells express more ossification related genes than PLL cells.
To summarize and to visualize the findings, we constructed the GO tree graphs using the GO terms analyzed before. As shown in Fig. 3d, upregulation of genes mainly take place under the GO terms of extracellular matrix organization, embryonic skeletal system and ossification, while majority of other terms consist of downregulated genes. Together, we showed that OPLL altered genes were strongly correlated to ossification. Integrated microRNA/mRNA analysis implied OPLL specific microRNAs are important regulators in controlling OPLL gene alteration. To generate OPLL specific miRNA/mRNA interacting network, we used the OPLL specific miRNAs identified previously and predicted their target mRNAs using  Targetscan (http://www.targetscan.org, Supplementary data 3). The predicted targets are overlapped with the transcriptome data, only genes with the expression pattern that negatively correlates with its targeting miRNAs are considered. MiRNAs with low abundance were also neglected (Normalized counts < 100 in both groups). To scale down the whole network, we chose the top 15 regulated miRNAs as core miRNAs to generate the network. As shown in Fig. 4a, 14 top downregulated miRNAs and their matched mRNA targets were inversely correlated. In order to classify the related functions of these miRNA regulated genes a GO analysis was performed, results showed that these top upregulated targets were mainly associated with skeletal development and collagen fibril organization, which is similar to the results of whole transcriptome GO analysis. On the other hand, in the top 14 miRNAs and their matched targets were shown in Fig. 4b. And the GO analysis of these targeted downregulated genes also shown similar results to that of whole transcriptome GO analysis. Taken these together, we showed that the miRNA/mRNA network generated using these top regulated miRNAs can indeed represent most of the transcriptome changes, which implies that these miRNAs have set up the core transcription regulatory network in the OPLL.
To unveil their importance in regulating longitudinal ligament cells transdifferentiating into osteogenic cells, we first verified the expression of 10 differentially expressed miRNAs using 4 PLL and 4 OPLL specimens. Real-time PCR identified the expression changes of these miRNAs between PLL and OPLL are similar to the sequencing results (Fig. 5a,b). To test their functions, we chose 5 most upregulated miRNAs and synthesized their The differentially expressed miRNAs and mRNAs that identified were integrated by connecting miRNA with relevant mRNA using Targetscan miRNA targeting algorithm. These inversely correlated miRNA/mRNA pairs were sorted and the visualized networks were constructed using only the miRNAs with highest or lowest fold changes in OPLL compared to PLL. The correlation network showing 14 highly downregulated miRNAs and its possible target mRNAs were constructed (a). The GO analysis of these target genes were shown in the right panel. The correlation network showing 14 highly upregulated miRNAs and its possible target mRNAs were constructed (b). The GO analysis of these target genes were shown in the right panel.
mimics. After transfecting these mimics to PLL cells, we found that the expression of osteogenic related genes Runx2 and Ibsp were upregulated in some of the groups, especially in miR-10a-5p overexpressed group (Fig. 5c). Alizarin Red staining further showed that PLL cells can trans-differentiate into osteo-lineage, and that miR-10a-5p showed robust calcium deposition compared with a scramble RNA oligonucleotide overexpressed group as control (Fig. 5d). These results further implied that the OPLL differentially expressed miRNAs are indeed of biological significance in controlling OPLL onset and development.

Discussion
The first report demonstrating OPLL can be traced back to 1960 s, which has a prevalence of 1.9-4.3% in the Japanese population 19 . Despite a long-standing predominance in Japan, this disease has also been recognized in other geographic regions and ethnicities, in the western regions the prevalence is only about 0.01-1.7% 20 . Recent study showed that the incidence rate of OPLL has reach more than 5% in some of the Asian regions outside Japan 21 . The disease also increases with age, with the average onset age of around 50 years old, and the male to female ratio at 2:1 22 . Despite advances in surgery and radiographic technologies, treating OPLL is still a task that has not been improved. Patients with serious symptoms that proceed to surgical treatment often resulted in unsatisfactory prognosis. Studies have been conducted to uncover the underlying factors in causing OPLL development, however, the factor seemed to be unspecific, indicating it's an interplay of numerous genetic and environmental factors 23 . Beside non-genetic factors like mechanical stress, nutrition, glucose intolerance, and high body mass index, recent studies showed numerous genes are associated with the disease. Most of these genes are vital to the ossification process like NPPS, COL11A2, COL6A1, BMP2, BMP4, TGF-β 1, TGF-β 3 [4][5][6][7]22,24 , and often aberrant expression of these genes can cause the pathological heterogeneous ossification 25,26 . This means a  (a,b). Moreover, the function of five OPLL upregulated miRNAs were tested by overexpressing synthesized mimics and analyzed the expression changes of the osteogenic related factors Runx2 and Ibsp (c). Images of Alizarin Red S staining of the miRNA overexpressed PLL cells were shown to analyze the calcium deposition after osteo-induction for 2 weeks (d). Data were shown as mean ± S.D. and the P value are shown as *P < 0.05, **P < 0.01. certain disorder in the upstream regulatory network of these critical factors may exists during the pathological process of OPLL.
There are many regulatory factors controlling certain genes' expression, including transcriptional factors, epigenetic factors and post transcriptional factors 27 . Among which microRNA is the most well characterized post transcriptional factor that has been systematically researched in many biological fields. Owing to the high throughput technologies, the existence of miRNAs are greatly expanded and identified recently, with 2588 mature miRNAs identified in the latest version of miRBase. Beside the advancement of high throughput technologies, miRNA target prediction methods have also developed quickly. Although many algorithms were developed nowadays, the most widely used is Targetscan, which use algorithm that mainly depend on the seed sequence binding to the 3′ untranslated region of mRNA. With the identification of the crystal structure of miRNA binding protein Argonaute-2 28,29 , it is now widely accepted that this algorithm can predict functional miRNA interacting pairs. In the present study, we used Next Generation sequencing technology (NGS) to assess the expression change in both miRNA and mRNA, which generated more and accurate data than the traditional microarray assays. Altogether 1520 miRNAs were identified in both OPLL and PLL, while other microarray datasets we used in the study only generated about 700 miRNA data. Using integrative methods, we validated that 194 miRNAs that differentially expressed in OPLL are cell type specific, which generated OPLL specific miRNA signatures for the first time. These miRNA signatures may be involved in the onset and progression of OPLL and provide novel targets for OPLL diagnostic or therapeutic causes.
To investigate in depth of how these signature miRNAs really participates in the gene alteration of OPLL, we took advantage of the bioinformatics analysis. The Gene Ontology (GO) analysis is widely recognized as the leading tool for the organization and functional annotation of molecular attributes 30 . By using this tool, numerous differentially expressed genes in the transcriptome sequencing analysis were categorized into a few significant GO terms. Among the identified GO terms, skeletal system development, ossification, osteoblast differentiation have a relative low P value, which indicates that most upregulated genes in OPLL were related to such function or biological process. The result of which seemed to be reasonable for primary ligament cells derived from OPLL patients may already been influenced and prepared for ossification transformation in the genetic level. And this also gave direct evidence that the osteophytes in the OPLL patients may indeed come from untransformed posterior longitudinal ligament cells with proper stimulation. However, the genes that involved in these GO terms are also non-specific, one such example is Runx2 (About 3.7 fold increased in expression in OPLL than PLL), a well-defined master transcription factor that participates in almost every event involving ossification. Other ossification related genes like IBSP, collagen I and TGFB family were also upregulated in OPLL. Using high throughput datasets from existing databases, we are surprised to found that the miRNA signatures are specific to OPLL. And by constructing the miRNA/mRNA interacting network, we showed that these specific miRNA signatures can indeed drive the ossification related genes' expression change during OPLL, which implies these specific signatures may be the key regulators that control and alter the transcriptome of ligament cells toward a more ossification ready state.
However, the clinical relevance of these specific signature miRNAs and their targeting pairs still need further validation. But still the study we presented here elucidated the altered mirNome and transcriptome, and identified predominant miRNA/mRNA interacting pairs that seemed to be of significance in the OPLL development. We are looking forward to further clarify their biological significance with further validation.

Methods
Sample collection and primary cell culture. The diagnosis of OPLL or PLL (spinal trauma patients who underwent cervical corpectomy) was confirmed using X-rays, computerized tomography, and magnetic resonance imaging preoperatively in the institution. OPLL or PLL specimens obtained during surgery and immediately put to primary cell culture. The ligaments were rinsed using PBS supplemented with 1% penicillin/streptomycin twice. After which the ligament tissue was carefully dissecting under microscope to avoid any contamination with osteogenic or other cells. The collected ligaments were minced and washed twice with PBS. Tissues were plated on a 100 mm culture dish and maintained in Dulbecco's Modified Eagle Medium (DMEM, Life technology Gibco, USA) supplemented with 10% fetal bovine serum (FBS, Gibco, USA), 1% L-glutamine (Gibco, USA), and 1% penicillin / streptomycin (Gibco, USA). Tissues were incubated in a humidified atmosphere containing 95% air and 5% CO 2 at 37 °C. The fibroblast-like cells that migrated from the tissues were harvested with 0.02% EDTA)/ 0.05% trypsin and plated in T25 culture flask for further analysis. The ethics committees of the Changzheng Affiliated Hospital of Second Military Medical University approved the study protocols, and each participant have written informed consent. The methods were carried out in accordance with the approved guidelines. We totally obtained 3 OPLL patient tissue samples during anterior cervical discectomy, and 3 PLL patient sample during anterior cervical corpectomy.
Transcriptome and micronome profiling. For transcriptome sequencing, total RNA was extracted using the TRIzol solution (Invitrogen, Carlsbad, USA), according to the manufactures' procotols. RNA sequencing libraries were constructed as described 31 with some modifications. In brief, RNA concentration was measured by Nanodrop and the quality was measured by both gel electrophoresis and Agilent 2100. Samples that passed RNA quality control check will begin library preparation. We use Ion Total RNA-Seq Kit v2 (Life Technologies, USA) for library construction following manufacturer's protocol. After purifying the poly-A containing mRNA molecules using poly-T oligo-attached magnetic beads, the mRNA is fragmented into small pieces using RNase III. The cleaved RNA fragments are concentrated again using the Nucleic Acid Binding Beads provided by the manufacturer. The purified RNA fragments are ligated and reverse transcribed into cDNA. The products are then purified and amplified with PCR (15-cycle) to create the final cDNA library. After purification, quantification and validation, validated cDNA libraries were sequenced on Ion Proton ™ System (Ion Proton, Life Technologies, USA) following the manufacturer's standard workflow. For small RNA sequencing, 10 μ g total RNA of each sample was used for small RNA cDNA library preparation as previously described 32 with some modifications. Strand-specific RNA libraries were prepared using the TruSeq Small RNA Sample Prep Kit (Illumina, San Diego, USA). Briefly, small RNA fragments ranging from 18-30 nt were isolated, purified and subsequently ligated were ligated to 3′ and 5′ adaptors sequentially, reverse transcribed to cDNA and then PCR amplified. The entire library was tested by gel electrophoresis, and bands corresponding to microRNA insertion were cut and eluted. After ethanol precipitation and washing, the purified small RNA libraries were quantified sequenced using the Illumina HiSeq ™ 2000 analyzer (Illumina, San Diego, USA) according to the manufacturer's instructions. All profiling works were done under the help of Shanghai NovelBio Bio-Pharm Technology Co., Ltd.
Analysis of RNA sequencing data. For analysis of transcriptome sequencing data, we aligned approximately 100 bp long reads to the human genome (hg19) using Tophat2/Bowtie2 33,34 allowing for five mismatches. We identified mapped data to gene structures derived from RefSeq using the summarize overlaps function with mode Intersect Strict (Genomic Ranges, Bioconductor). Using the initial raw counts data, we calculated reads per kilobase per million reads mapped (RPKM) values for the same gene set using Cufflinks. The differential analysis was carried out using edgeR 35 , applying TMM (trimmed Mean of M-values) library normalization and a 0.05 false discovery rate (FDR) to select expressed transcripts. For analysis of small RNA sequencing data, we first generated the clean reads and aligned with human genome (hg19) using Tophat2/Bowtie2 33,34 , followed by calculating the reads in different regions of the genome distribution. The clean reads were compared with the Rfam database (ftp://selab.janelia.org/pub/Rfam) to match the known lncRNA, miRNA, rRNA, snRNA, snRNA and tRNA sequences and were then compared with the human mature miRNAs database in miRBase (v21) to identify mature miRNAs and count their reads. The raw counts of miRNA reads were further normalized by transcripts per million (TPM) values ((miRNA total reads/total clean reads) × 10 6 ). The differentially expressed miRNAs (DEmiRNAs) between samples were identified by the EdgeR program using parameters of P ≤ 0.01 and fold change ≥ 2 or ≤ 0.5.
Gene Ontology and KEGG pathway analysis. Gene Ontology (GO) and KEGG pathway analyses were performed as previously described 36 . In brief, we calculated the p-value of each GO term using right-sided hypergeometric tests, and Benjamini-Hochberg adjustment was used for multiple test correction. An adjusted p-value that is lower than 0.05 indicated a statistically significant deviation from the expected distribution, and thus the corresponding GO terms and pathways were enriched in target genes. We analyzed all of the differentially expressed mRNAs using GO and KEGG pathway analyses, and we analyzed the mRNAs that were included in the miRNA regulator network using GO analysis. miRNA regulatory network construction. Targetscan was used to predict the binding of differentially expressed miRNAs to the putative targets. The predicted target genes were compared with the transcriptome profiling data, and only genes that are inversely correlated in expression with the targeting miRNA were included. After the correlation mapping, upregulated or downregulated miRNAs and their targeting genes were subjected to the network visualization analysis individually. The igraph package of the statistical language R was applied for functional profiling, and the network visualization and analysis tool Cytoscape22 was used to construct the network.
Hierarchical clustering. Hierarchical clustering was performed as described 37 using Cluster version 3.0 and Java TreeView version 1.1.6 to identify and visualize expression differences between the dataset. With the use of clustering algorithms, the samples and genes were grouped based on similarities in the expression profiles. For clustering of small RNA or mRNA sequencing data, we used normalized counts or RPKM for the analysis. Genes were filtered to exclude missing values. The average linkage and median centering were chosen, and unsupervised clustering were used. For clustering miRNA data of multiple cell types, we first downloaded the required data from GEO dataset under the Accession codes GSE19232, GSE34144 and GSE47025. The miRNA ID of each data were updated with miRbase v21 and thus can be integrated into a single dataset. Quintile normalization method was applied to scale each dataset to a uniform expression level. Unsupervised clustering using average linkage and median centering were sub sequentially performed.