Transcriptomic profiling of long non-coding RNAs in dermatomyositis by microarray analysis

Long non-coding RNAs (lncRNAs) are prevalently transcribed in the genome and have been found to be of functional importance. However, the potential roles of lncRNAs in dermatomyositis (DM) remain unknown. In this study, a lncRNA + mRNA microarray analysis was performed to profile lncRNAs and mRNAs from 15 treatment-naive DM patients and 5 healthy controls. We revealed a total of 1198 lncRNAs (322 up-regulated and 876 down-regulated) and 1213 mRNAs (665 up-regulated and 548 down-regulated) were significantly differentially expressed in DM patients compared with the healthy controls (fold change>2, P < 0.05). Subgrouping DM patients according to the presence of interstitial lung disease and anti-Jo-1 antibody revealed different expression patterns of the lncRNAs. Pathway and gene ontology analysis for the differentially expressed mRNAs confirmed that type 1 interferon signaling was the most significantly dysregulated pathway in all DM subgroups. In addition, distinct pathways that uniquely associated with DM subgroup were also identified. Bioinformatics prediction suggested that linc-DGCR6-1 may be a lncRNA that regulates type 1 interferon-inducible gene USP18, which was found highly expressed in the perifascicular areas of the muscle fibers of DM patients. Our findings provide an overview of aberrantly expressed lncRNAs in DM muscle and further broaden the understanding of DM pathogenesis.

regulatory factors. Eisenberg et al. firstly reported the altered expression of several microRNA molecules in the muscle tissue of myositis patients 11 . Further investigation demonstrated that microRNA-1, -133, -206, -126 contributed to the pathogenesis of myositis as important regulating factors of relevant gene expression 12,13 . Despite emerging data showing the regulatory role of microRNAs in myositis, there is a paucity of information concerning the expression and potential role of lncRNAs in myositis.
High throughput genome screening can provide a panoramic view of gene expression in pathological conditions and may consequently provide new insights for disease pathomechanism. Therefore, in order to explore the expression pattern of lncRNAs in different DM clinical subtypes and to find potential regulatory lncRNAs in DM muscle, we profiled the lncRNAs and mRNAs in muscle tissue by using microarray analysis.

Results
Expression profiles of lncRNAs and mRNAs in DM muscle. Through microarray analysis, we examined the lncRNA and mRNA expression profiles in the muscle of 15 DM patients and 5 healthy controls. The MA plot is shown to reflect the overall data quality (Fig. 1A). Hierarchical clustering was performed based on the lncRNA and mRNA expression values in the microarray (Fig. 1B,C). The microarray data revealed 1213 mRNAs and 1198 lncRNAs were differentially expressed in DM patients compared with the control group. Among them, 665 mRNAs were upregulated, and 548 mRNAs were downregulated in DM patients. Regarding the lncRNAs, we found that 322 were upregulated and 876 were downregulated in DM patients. Table 1 shows the top 10 upregulated and downregulated lncRNAs. All lncRNAs and mRNAs that were aberrantly expressed with an absolute fold-change greater than 5 are listed in Supplementary datasheet 2.
Additionally, we subgrouped the DM patients according to the presence of ILD and found differentially expressed lncRNAs in DM patients with ILD compared to patients without ILD. Figure 2A shows the different expression patterns of lncRNAs in the two subgroups. Additionally, the top 10 upregulated and downregulated lncRNAs are listed in Supplementary Table S3, with ENST00000428205.1 and ENST00000450016.1 as the lncR-NAs that display the most significant increase and decrease, respectively. Moreover, we also identified certain lncRNAs that were expressed differently between DM patients with anti-Jo-1 antibody and DM patients without the antibody (Fig. 2B), among which we found that ENST00000587036.1 and XR_110948.1 expressions differed the most between the two subgroups (Supplementary Table S4).
Quantitative real-time PCR validation. Five lncRNAs and five mRNAs were selected to be analyzed by quantitative RT-PCR to validate their expression levels in the 15 DM patients involved in microarray analysis. The qRT-PCR results showed that the expression levels of lncRNAs-ENST00000541196.1, uc011ihb.2, linc-DGCR6-1, and of mRNAs-USP18, IFIH1 were significantly increased (P values all < 0.05) in DM patients compared to that in healthy controls. In addition, the expression of lncRNAs-ENST00000551761.1, ENST00000583156.1 and of mRNAs-FOS, ALDH3B2, PFKFB3 were significantly decreased (P values all < 0.05). The qRT-PCR results were consistent with the results of the microarray analysis (Fig. 3), and thus providing reliable validation for the microarray results. lncRNA classification and subgroup analysis. According to previous reports, lncRNAs can be divided to three main subgroups: HOX lncRNAs 14 , lncRNAs with enhancer-like function 15 , and large intergenic noncoding RNAs (lincRNAs) 16 . Our profiling data indicated that the differentially expressed lncRNAs consisted of 110 lncRNAs with enhancer-like function, 5 lincRNAs, and no HOX lncRNAs. Supplementary datasheet 3 shows all of these lncRNA subsets.

Bioinformatics analysis of differentially expressed mRNAs in DM patients. Gene Ontology (GO)
is a bioinformatics initiative that seeks to better represent gene and gene product attributes, providing three structured networks of defined terms: biological process, cellular component, and molecular function. By applying GO analysis, we found that the genes corresponding to the up-regulated mRNA transcripts in DM were involved in 306 GO terms in the biological process network, 34 GO terms in the cellular component network and 11 GO terms in the molecular function network. Additionally, the genes corresponding to the down-regulated mRNAs were found to be involved in 17 GO terms in biological process, 2 GO terms in cellular component, 2 GO terms in molecular function. The top 30 significant GO terms associated with dysregulated mRNAs are shown in Fig. 4. Notably, GO clustering revealed that "type 1 interferon-mediated signaling pathway" (GO:0060337), "response to type 1 interferon" (GO:0034340) were significantly up-regulated, indicating that dysregulated type 1 interferon signaling play a role in DM pathogenesis.  We further performed pathway analysis on the differentially expressed mRNAs according to the following databases: the latest version of Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www. genome.jp/kegg), PID (http://pid.nci.nih.gov/), BioCyc (http://biocys.org/), Reactome (http://www.reactome. org/), and Panther (http://www.pantherdb.org/). Consequently, the biological pathways that were significantly enriched with these differentially expressed mRNAs were uncovered. The results showed that a total of 74 pathways were associated with the up-regulated mRNA transcripts, with "Interferon Signaling" recognized as the most enriched network. Furthermore, we also found 6 pathways that were significantly related to down-regulated mRNAs, "Erythrocytes take up oxygen and release carbon dioxide" being the most enriched one. The top 30 enriched pathways of up-regulated mRNAs and down-regulated mRNAs are shown in Fig. 5A,B.
Construction of the mRNA-lncRNA co-expression network. Based on the results of correlation analyses of the differentially expressed lncRNAs and mRNAs, mRNA-lncRNA co-expression networks were built. In total, we identified 718 pairs of co-expressed lncRNA-mRNA (Supplementary datasheet 5). Subsequently, 122 co-expression networks were constructed. Supplementary Figure S1 shows a representative mRNA-lncRNA co-expression network which involves 4 lncRNAs and 20 mRNAs. Target gene prediction for the differentially expressed lncRNAs. According to the mRNA-lncRNA co-expression networks, we predicted the coding genes that would be targeted by the differentially expressed lncRNAs by using bioinformatical approach. The prediction identified 12 lncRNA-mRNA pairs (Supplementary Table S5) and suggested the likelihood of these lncRNAs as regulators of corresponding mRNA expression. Of the highest interest, we noticed a lncRNA named linc-DGCR6-1 may target USP18, which is known as a type 1 interferon-inducible protein and considered to be a key regulator of interferon signaling 17,18 . High expression levels of USP18 in the muscles of DM patients. In order to validate the microarray findings of increased expression of USP18 in DM, we analyzed the expression of the USP18 protein by immunohistochemistry staining. Obvious expression of USP18 was prominently found in the perifascicular areas of the muscle fibers of DM patients. In contrast, we found no significant USP18 expression in healthy control muscle samples (Fig. 6).

Discussion
In this study, we profiled the expression of lncRNAs and mRNAs in DM muscle tissue by microarray analysis and identified lncRNAs and mRNAs that were differentially expressed in DM patients. Stratification by clinical subgroup according to the presence of ILD and of the anti-Jo-1 autoantibody revealed different expression patterns of lncRNAs in DM subgroups. Unique and shared pathways related to the three DM subsets were also unveiled. In particular, type 1 interferon signaling was found to be the most significantly dysregulated pathway in all DM subgroups, indicating that type 1 interferon may play a role in DM pathogenesis. Through constructing mRNA-lncRNA co-expression networks, we predicted the target genes of the differentially expressed lncRNAs by using a bioinformatic approach. Of note, we identified a lncRNA that is likely involved in the regulation of type 1 interferon-inducible molecule USP18. Additionally, immunohistochemistry staining showed that upregulated expression of USP18 proteins were mainly observed in perifascicular atrophy myofibers of DM patients.
In contrast to the previous consideration of noncoding transcripts as junk DNA with no functional purpose 19 , rapidly expanded research in the past decade has demonstrated that non-coding RNAs are essential for a variety of cellular processes 20 . The regulatory roles of non-coding RNAs have attracted considerable attention in recent years. In particular, recent findings have demonstrated that lncRNAs play significant roles in the regulation of immune functions [21][22][23][24] . Moreover, a growing body of evidence suggested that lncRNAs dysregulation may play a vital role in autoimmune diseases including systemic lupus erythematosus, rheumatoid arthritis 24 . Intriguingly, emerging data have unveiled lncRNAs that are involved in myopathies and that may play a role in muscle differentiation, regeneration and function 10,25-27 . However, little information is known about lncRNAs in DM. Systemic genome screening is a powerful tool for identifying differentially expressed RNA transcripts, which has been widely used to examine the expression profile of lncRNAs in various diseases [28][29][30][31][32][33] . Thereby we sought to investigate the expression profile of lncRNAs in DM muscle and relate it to the mRNA expression profile by using high throughput microarray analysis.
To our knowledge, this is the first study that address the transcriptomic profiling of lncRNAs in DM patients. In addition to a large number of differentially expressed mRNAs, we also found thousands of lncRNAs differentially expressed in DM muscle tissue. Our further bioinformatics prediction analysis points to a set of lncRNAs that may be potential gene regulators in DM pathogenesis. Therefore, the present data will unveil intriguing areas of inquiry for more in-depth exploration.
In order to gain insights into the biological pathways potentially involved in DM, pathway analysis was performed. The results revealed that several pathways that are involved in other autoimmune disorders -such as systemic lupus erythematosus, type 1 diabetes mellitus, autoimmune thyroid disease -were significantly dysregulated in DM, which suggests that DM may share some pathways with other autoimmune diseases. Recent genetic studies demonstrated that DM has a shared genetic etiology with other autoimmune disorders 34,35 . Our finding of several pathways that contribute to other autoimmune diseases were also dysregulated in DM, therefore, indicates that, besides genetic overlap, molecular pathways may also be shared by DM and other autoimmune disorders.
In our study, both GO and pathway analysis results suggested a role of type 1 interferon in DM, which is in accordance with previous reports 6 . An increasing number of investigations have shown highly expression of type 1 interferon-inducible proteins in the muscle and skin tissue of DM patients [4][5][6]36,37 . It is supposed that the overexpression and intracellular accumulation of type 1 interferon-inducible proteins might result in cellular toxicity in muscle fibers 38 . Moreover, the recent identification of an autoantibody against IFIH1(also known as anti-MDA5), which is a classic type 1 interferon-inducible protein in the patients with DM especially clinically amyopathic DM 39 , provides further evidence for the abnormality related to type 1 interferons in DM 40 . These studies imply that type 1 interferon pathway significantly contributes to the pathogenesis of DM. In our study, we also found significantly upregulated transcription levels of type 1 interferon-inducible genes, including IFN-stimulated ubiquitin-like modifier protein (ISG15), USP18, IFIT3, and IFIH1, which is consistent with previous reports [4][5][6]36 . Further analysis via immunohistochemistry staining, we confirmed that USP18 proteins were preferentially overproduced in perifascicular atrophy myofibers. Therefore the expression pattern of USP18 is similar to other type 1 interferon-inducible molecules such as ISG15 and MX1 which were found to be highly expressed in the perifascicular myofibers of DM muscle 36 . Thus, our study provides further evidence indicating the potential role of type 1 interferon signaling in DM pathogenesis.
On the other hand, although type 1 interferon signaling was the most significant shared pathway that was dysregulated in all DM subgroups, distinct pathways were also found in different DM subsets. Recently, Rothwell et al. revealed distinct genetic associations between PM, and DM and JDM, suggesting different predominating pathophysiology in different clinical subgroups 41 . Our data indicated that dysregulated molecular pathways may contribute to the clinical heterogeneity of DM patients.
What is more interesting, bioinformatics prediction in our study suggested that linc-DGCR6-1 might be a potential regulator of the USP18 gene, as linc-DGCR6-1 contains overlap sequence with the 3′ UTR of USP18 gene. Together with the finding of a significant correlation between the expression levels of linc-DGCR6-1 and the USP18 mRNA, our results indicate potential regulatory role for linc-DGCR6-1 in the expression of the USP18 gene. Recently, Gomez et al. reported that an enhancer-like lncRNA, termed NeST, functions in either cis or trans to promote interferon-γ expression 42 , providing direct evidence for a regulatory role that lncRNA may play in interferon pathway. Although further investigation is necessary to confirm the details of the role, our study suggests that lncRNAs may probably contribute to DM pathogenesis by regulating the expression of type 1 interferon-inducible molecules.
However, we acknowledge that our study has some limitations. First of all, the sample size of DM patients included in the microarray analysis was relatively small. This was partly due to our strict selection only of patients with very similar clinical features and muscle pathological manifestations. Additionally, the target genes of the differentially expressed lncRNAs were just predicted by bioinformatics approach, which require to be confirmed by biological function analysis.
Taken together, our study describes a molecular overview of aberrantly expressed lncRNAs in DM muscle and provide novel insights into the pathogenesis of DM. Further investigation of lncRNA function in DM may help to expand our understanding of the molecular mechanism of DM muscle damage.

Methods
Patients. From the inpatients who visited China-Japan Friendship Hospital, 15 patients who were diagnosed with DM according to Bohan and Peter criteria were recruited for this study. The diagnoses of the 15 DM patients were biopsy-proven and also fulfilled the ENMC criteria 43 . They all had typical skin rash and muscle weakness. Patients were considered to have ILD if they fulfilled the following criteria: (i) restrictive lung function impairment (total lung capacity (TLC), and diffusion capacity for carbon monoxide of the lung (DLCO) < 80% of predicted), and (ii) radiographic signs of ILD on HRCT (nodular, reticulonodular, linear or ground-glass opacities; consolidations; irregular interface; honeycombing; or traction bronchiectasis). Females outnumbered males in the cohort (13 females, 2 males). The DM patients ranged from 28-67 years of age. The clinical characteristics of the enrolled subjects are summarized in Supplementary Table S1. Magnetic resonance imaging-directed muscle biopsies were performed for diagnostic purposes, and the muscle tissue samples were stored in liquid nitrogen before use. All 15 DM patients received no treatment prior to biopsy. Control muscle tissue samples were obtained from 5 trauma patients who did not suffer from muscular diseases. This study was performed with the approval of the Human Ethics Board of China-Japan Friendship Hospital (Beijing, China) in accordance with the Declaration of Helsinki for human research. Written informed consent was obtained from all participating individuals.

RNA extraction.
Total RNA containing small RNA was extracted from frozen muscle tissue by using the Trizol reagent (Invitrogen) and purified with mirVana miRNA Isolation Kit (Ambion, Austin, TX, USA) according to manufacturer's protocol. The purity and concentration of RNA were determined from OD260/280 readings using the NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). RNA integrity was determined via 1% formaldehyde denaturing gel electrophoresis. Microarray hybridization and computational analysis. Microarray hybridization was performed according to the standard procedure by CapitalBio Corporation, Beijing, China. Briefly, each purified RNA sample was amplified and transcribed into cRNA along the entire length of the transcripts without 3′ bias by using a random priming method. Subsequently, the cRNAs were transcribed into cDNAs and labeled with a fluorescent dye (Cy3-dCTP) by using Labeling Kit (CapitalBio, Beijing, China). The labeled cDNAs were purified by a PCR NucleoSpin Extract II Kit (MN, Germany) and then were hybridized onto a human lncRNA + mRNA Array V4.0. The hybridization was performed in a Agilent Hybridization Oven overnight at a rotation speed of 20 rpm and a temperature of 42 °C and washed with two consecutive solutions (0.2% SDS, 2× saline sodium citrate buffer at 42 °C for 5 min, and 0.2× saline sodium citrate buffer for 5 min at room temperature).
The lncRNA + mRNA array data summarization, normalization and quality control were analyzed by using the GeneSpring software V12.0 (Agilent Technologies). Fold-change greater than 2 or less than 0.5, and a Benjamini-Hochberg corrected p value less than 0.05 were considered as the criteria for differential expressed genes selection. The data was Log 2 transformed and median-centered by genes using the Adjust Data function of CLUSTER 3.0 software then further analyzed with hierarchical clustering with average linkage. Finally, tree visualization was performed by using Java Treeview (Stanford University School of Medicine, Stanford, CA, USA).
Quantitative real-time PCR (qRT-PCR) assay. Quantitative real-time PCR analysis was further carried out to validate the expression levels of candidate genes. Two μ g of total RNA was reverse transcribed using the GoTaq ® 2-Step RT-qPCR system (Promega, Madison, USA) according to the manufacturer's protocol. Fluorescent quantitative RT-PCR was performed using the ABI PRISM 7500 system (Applied Biosystems, Foster City, USA) according to standard methods. Five differentially expressed mRNAs and five additional lncR-NAs were chosen to validate the gene chip results using qRT-PCR. Specific primers of each gene are listed in Supplementary Table S2. The relative fold change was calculated using the 2 −ΔΔCt method normalized to GAPDH. All experiments were performed in triplicate.
Construction of the mRNA-lncRNA co-expression network. The mRNA-lncRNA co-expression network was constructed based on the correlation analysis between the differentially expressed lncRNAs and mRNAs. For each pair of genes, the Pearson correlation analysis was performed and the significantly correlated pairs were chosen to construct the network. LncRNAs and mRNAs with Pearson correlation coefficients 0.95 or greater were selected to construct the network through the software Cytoscape.
Target gene prediction. The cis-acting lncRNA predictions were based on the tight correlations (Pearson correlation coefficient greater than 0.99) between the lncRNA and a group of expressed protein-coding genes. These lncRNA reside at genomic loci where a protein-coding gene and a lncRNA gene are within 10 kb of each other along the genome. Therefore, "cis" refers to the same locus (not necessarily same-allele) regulatory mechanisms, which include antisense-mediated regulation by the lncRNAs of protein-coding genes that are encoded at the same locus. The trans-predictions were made using blat tools (Standalone BLAT v. 35 × 1 fast sequence search command line tool, downloaded from: http://hgdownload.cse.ucsc.edu/admin/exe/) to compare the full sequence of the lncRNAs with the 3′ UTR of its co-expressed mRNAs, using the default parameter setting.
Immunohistochemistry staining. Eight-μ m-thick unfixed cryostat muscle sections were collected from diagnostic muscle biopsies. Anti-human ubiquitin-specific peptidase 18 (USP18) polyclonal antibodies (Abcam, Cambridge, UK) were used as primary antibodies for detecting USP18 protein at a working concentration of 5 μ g/ml. Horseradish peroxidase-conjugated anti-rabbit IgG (Santa Cruz Biotechnology, Santa Cruz, USA) was used as a secondary antibody. Rabbit IgG isotype control (Abcam, Cambridge, UK) was used as a negative control for the primary antibody. The immunohistochemistry staining was performed as previously described 44 . Statistical analysis. Statistical analysis was performed using SPSS V. 16.0 (SPSS, Chicago, USA). The fold change and the Student's t-test were used to analyze the statistical significance of the microarray and RT-PCR results. Additionally, the Benjamini-Hochberg FDR (the FDR cutoff was 0.05) was used for multiple-testing correction. P-values < 0.05 (two-tailed) were considered statistically significant.