Growth retardation-responsive analysis of mRNAs and long noncoding RNAs in the liver tissue of Leiqiong cattle

As an important type of non-coding RNA molecule, long non-coding RNAs (lncRNAs) have varied roles in many biological processes, and have been studied extensively over the past few years. However, little is known about lncRNA-mediated regulation during cattle growth and development. Therefore, in the present study, RNA sequencing was used to determine the expression level of mRNAs and lncRNAs in the liver of adult Leiqiong cattle under the condition of growth retardation and normal growth. We totally detected 1,124 and 24 differentially expressed mRNAs and lncRNAs, respectively. The differentially expressed mRNAs were mainly associated with growth factor binding, protein K63-linked ubiquitination and cellular protein metabolic process; additionally, they were significantly enriched in the growth and development related pathways, including PPAR signaling pathway, vitamin B6 metabolism, glyoxylate and dicarboxylate metabolism. Combined analysis showed that the co-located differentially expressed lncRNA Lnc_002583 might positively influence the expression of the corresponding genes IFI44 and IFI44L, exerting co-regulative effects on Leiqiong cattle growth and development. Thus, we made the hypothesis that Lnc_002583, IFI44 and IFI44L might function synergistically to regulate the growth of Leiqiong cattle. This study provides a catalog of Leiqiong cattle liver mRNAs and lncRNAs, and will contribute to a better understanding of the molecular mechanism underlying growth regulataion.


Methods
Animals and sample collection. The Leiqiong cattle (genus: Bos; species: Bos indicus) were born, raised, and maintained at the Guangdong Leiqiong cattle farm. In this study, all animals were healthy and received the same diet until they were slaughtered in adulthood (24 months). The experimental animals were divided into two groups: growth retardation cattle (GRC; the average body height was 95 cm, and the average live weight before slaughter was 153.5 kg) and normal growth cattle (NGC; the average body height was 106 cm, and the average live weight before slaughter was 206.7 kg). All the experimental animals were without any nutritional manipulations; they were simply spontaneously different in size and weight. Liver samples at the same predetermined site were collected from GRC and NGC (n = 3 at each group, female) and immediately frozen in liquid nitrogen for total RNA extraction after harvesting. cDNA library preparation. We isolated the total RNA by using the TRIzol reagent (Invitrogen, Carlsbad, CA, USA) and DNase I (Qiagen, Beijing, China). Then, 1.5% agarose gel electrophoresis was used to assess the quality of the purified RNA, confirming the absence of genomic DNA. The RNA integrity was estimated using an RNA Nano6000 Assay Kit and a Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA, USA) 21 .
For the cDNA library contruction, 3 μg of total RNA (the value of RNA integrity number was all larger than 7.0) was utilized as the input for each sample. Firstly, we discarded the ribosomal RNA (rRNA) through the Epicentre Ribo-zero rRNA Removal Kit (Epicentre, Madison, WI, USA), the rRNA-free residue was cleaned by ethanol precipitation. Next, sequencing libraries were constructed using the rRNA-deleted RNA with a NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB, Ipswich, MA, USA) referring to the manufacturer's instructions. After fragmentation in NEBNext First Strand Synthesis Reaction Buffer (5 ×), we subsequently synthesized the first strand and second strand cDNA. After adenylation of the 3′ ends of the DNA fragments, NEBNext Adaptors containing a hairpin loop structure were ligated to prepare for hybridization. The library fragments were purified using an AMPure XP system (Beckman Coulter, Beverly, MA, USA), preferentially selecting the cDNA fragments of 150-200 bp in length. Then, USER enzyme (NEB) was utilized to incubate with the size-selected, adaptor-ligated cDNA at 37 °C for 15 min followed by 5 min at 95 °C. PCR amplification was then performed using Phusion High-Fidelity DNA polymerase, Universal PCR primers, and Index (X) Primer. www.nature.com/scientificreports/ Finally, the products were purified (AMPure XP system) and the library quality was assessed using the Agilent Bioanalyzer 2100 system 21 .
Sequencing and transcriptome assembly. The libraries constructed were sequenced on an Illumina HiSeq 4000 platform and 150-bp paired-end reads were generated. After removing sequences containing adapters and the reads containing poly-N and low quality reads using in-house Perl scripts, clean data were obtained. All the downstream analyses were based on the high quality, clean data. To obtain the high quality reads, we performed the following filtering process: removing the reads containing more than 10% unknown nucleotides and the reads containing more than 50% low quality nucleotides with a Phred quality less than 20. Mapping to the reference genome was the next step. Reads that passed the quality control were then mapped to the Bovine reference genome (Bos_taurus.UMD3.1; Ensemble release 94). An index of the reference genome was built using bowtie2 v2.2.8 and paired-end clean reads were aligned to the reference genome using HISAT2 (v2.0.4) 22 .
HISAT2 was run with '-rna-strandness RF'; other parameters were set as default. Next, transcriptome assembly was performed using the mapped reads of each sample and StringTie (v1.3.1) 23 .
Prediction of differentially expressed long non-coding RNAs and protein-coding RNAs. Before screening, Cuffmerge was used to create the set of transcripts. Then, lncRNA screening was carried out using the following steps: Step 1: Select the number of exon ≥ 2 transcripts; Step 2: From the results from step 1, select the transcripts with a length > 200 bp; Step 3: Annotate the above transcripts using Cuffcompare software; Step 4: Calculate the expression level of each transcript using Cuffquant, and select transcripts with a fragments per kilo base of exon per million fragments mapped (FPKM) value ≥ 0.1; Step 5: Coding potential screening. The coding potential of the transcript was predicted by three software modules: CNCI (Coding-Non-Coding-Index) (v2) 24 , CPC (Coding Potential Calculator) (0.9-r2) 25 , and PFAM (Protein Families Database; Pfam Scan, v1.3) 26,27 ; the intersection of transcripts without coding potential screened using the above three software modules with default parameters was predicted as the lncRNA dataset. The Ballgown software was used to perform the straightforward linear-model-based differential expression analyses with the default statistical modeling framework 23,28 . Transcripts with an adjusted P value < 0.05 were assigned as differentially expressed.
Target gene prediction of the lncRNAs. For each lncRNA locus, the 100-kb regions upstream and downstream were chosen to screen for the cis-acting target genes using the UCSC Genome Bioinformatics tool, which were also named co-located genes.
GO and KEGG enrichment analysis. Gene Ontology (GO) enrichment analysis of differentially expressed genes (DEGs) or lncRNA target genes was implemented using the GOseq R package, in which gene length bias was corrected 29 . GO terms with corrected P < 0.05 were considered significantly enriched for DEGs.
Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource for understanding high-level functions and utilities of biological systems 30 , such as the cell, organism and ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (https ://www.genom e.jp/kegg/). We used KOBAS software to analyze the enrichment of DEGs or lncRNA target genes in KEGG pathways 31 . Ethics approval and consent to participate. The study was approved by the Ethics Committees of Laboratory Animal Center of South China Agricultural University (permit number: SYXK-2014-0136). All experiments were performed in accordance with South China Agricultural University guidelines.

Results
Read mapping. For RNA-seq data analysis, the proportion of the nucleotides with Q30 in the total nucleotides should be generally larger than 85%; in our study, the proportion of each sample exceeded 91%. In total, 89,186,385, 77,069,075, 80,073,002, 91,238,926, 94,484,701 and 89,487,657 mapped reads were obtained from the clean data from the GRC1, GRC2, GRC3, NGC4, NGC5, and NGC6 libraries, respectively, and more than 94% were mapped to the Bovine reference genome (Table 1).
Enrichment analysis of differentially expressed mRNAs. We totally discovered 1,124 differentially expressed mRNA transcripts. Compared with the NGC group, 571 and 553 mRNAs were respectively upregulated and downregulated in the GRC group ( Fig. 1A and Additional file 1). Systematic cluster analysis showed that the expression pattern of the differentially expressed mRNAs was obviously distinct between the two experimental groups (Fig. 1B). To further depict the biological function of the differentially expressed mRNAs, we performed GO and KEGG enrichment analysis. In total, 609 GO terms with P < 0.05 were found (Additional file 2); most of them were associated with growth and development, including growth factor binding, glucocor- www.nature.com/scientificreports/ ticoid receptor signaling pathway, "protein K63-linked ubiquitination, and cellular protein metabolic process (Table 2). Additionally, we detected 31 KEGG pathways significantly enriched by the differentially expressed mRNAs (P < 0.05) ( Table 3 and Additional file 3), several of which were related to growth and development, including PPAR signaling pathway, vitamin B6 metabolism, glyoxylate and dicarboxylate metabolism, and fatty acid metabolism. Of the top 50 DEGs, WIF1 was involved in the Wnt signaling pathway, and it also showed association with the regulation of fat cell differentiation, regulation of cell differentiation, and cellular developmental process; SDHAF4 was related to system development, organ development, single-organism developmental process and metabolic process.
Enrichment analysis of the differentially expressed lncRNAs. In our study, 24 lncRNAs were found differentially expressed ( Fig. 2 and Additional file 4); the sequence of them was included in Additional file 5. Compared with NGC group, 18 lncRNAs were upregulated in the GRC group and 6 lncRNAs were downregulated (Table 4). For these differentially expressed lncRNAs, LNC_002093 and LNC_001379 were discovered 4 exons; LNC_002628, LNC_002005, LNC_000989, LNC_003503 and LNC_001876 showed 3 exons; all the other 17 lncRNAs had 2 exons, accounting for 70.83% of the differentially expressed lncRNAs (Additional file 4).
In total, we detected 114 protein-coding neighbors corresponding to the differentially expressed lncRNAs (Additional file 6). Based on the above targets of the lncRNAs, GO analysis revealed 561 significantly enriched GO terms (P < 0.05) (Additional file 7). According to the GO annotation, we screened the GO terms associated with growth and development, including positive regulation of cytolysis, succinate-semialdehyde dehydrogenase (NAD+) activity, cell chemotaxis, regulation of MAPK cascade, and positive regulation of smooth muscle cell proliferation (Table 5). Pathway analysis showed that the co-location genes were significantly enriched in 18 KEGG pathways (P < 0.05) (Additional file 8), some of which were associated with growth and development, including Type I diabetes mellitus, chemokine signaling pathway and Toll-like receptor signaling pathway (Table 6).  www.nature.com/scientificreports/ Combined analysis of the differentially expressed lncRNAs and differentially expressed mRNAs. Among the 114 protein-coding genes co-located with the differentially expressed lncRNAs, nine of them could be found in the DEGs list, including ILKAP, GLTPD2, PSMB6, IFI44, IFI44L, OSGIN1, CASP6, BOLA and B2M. For the 24 differentially expressed lncRNAs, LNC_001319 was located 43,587 bp upstream of PSMB6, which was enriched in the pathway of proteasome; GO analysis showed that PSMB6 was associated with cellular metabolic process, protein metabolic process, catalytic activity and organic substance catabolic process. LNC_001264 was located 79,489 bp upstream of OSGIN1; and OSGIN1 was related to metabolic process, catalytic activity, positive regulation of cellular process and positive regulation of biological process. LNC_002961 was located 24,399 bp upstream of CASP6, which was mainly involved in protein metabolic process and organic substance metabolic process. LNC_000301 was located 12,704 bp upstream of B2M; pathway analysis revealed that B2M was enriched in antigen processing and presentation.  www.nature.com/scientificreports/  www.nature.com/scientificreports/

Validation of differentially expressed lncRNAs and mRNAs.
We randomly selected six differentially expressed lncRNAs and six differentially expressed mRNAs for qRT-PCR analysis. Results demonstrated that the gene expression trend in the two experimental groups was consistent between RNA-Seq and qRT-PCR method, although the fold change of the expression level, to some extent, differed between the two sets of data (Fig. 3).

Discussion
Liver is an important organ for nutrient absorption and metabolism. However, in the liver of growth retardation fetuses, there are always disorders of growth and development and nutrient metabolism [32][33][34][35] . Therefore, in the present study, we performed transcriptome sequencing for the liver of Leiqiong cattle with growth retardation and those with normal growth and analyzed the differentially expressed mRNAs and lncRNAs to clarify their roles in growth and development. In this study, our results provided a valuable catalog of functional lncRNAs and mRNAs associated with growth and development.
Through RNA-Seq, we totally identified 1,124 differentially expressed mRNAs and 24 differentially expressed lncRNAs between GRC and NGC experimental groups, which may have specific biological roles in growth retardation in Leiqiong cattle. Using qRT-PCR, twelve randomly selected differentially expressed lncRNAs and mRNAs transcripts were validated, and the results were found to be consistent with the RNA-seq data. Among Table 5. Gene ontology (GO) enrichment analysis of the protein-coding genes co-located with the differentially expressed cis-acting lncRNAs.  www.nature.com/scientificreports/ the identified DEGs, wnt inhibitor factor 1 (WIF1) and encoding succinate dehydrogenase complex assembly factor 4 (SDHAF4) are significantly differently expressed. WIF1 is an extracellular inhibitor of the wingless signaling pathway that binds directly to Wnt proteins, thereby inhibiting the transmission of Wnt signaling. The Wnt signaling pathway transmits a growth stimulation signal, and the abnormal activation of the pathway can cause abnormal proliferation and differentiation of cells and cause tumor formation 36 . The WIF1 gene also appears as a paternal gene-expressing imprinted gene in human embryonic trophoblasts 37,38 . In our study, the expression of WIF1 gene increased in the GRC group, biological analysis results revealed that WIF1 was engaged in the GO terms including "regulation of fat cell differentiation", "regulation of cell differentiation", "cellular developmental process". Studies demonstrated that SDHAF4 gene could promote mitochondrial succinate dehydrogenase activity and prevent neurodegeneration 39 . In livestock and poultry research, the biological function of SDHAF4 has been rarely reported. In our study, the expression of SDHAF4 decreased in the GRC group, and the bioinformatic analyses revealed that SDHAF4 was engaged in the GO terms including "cellular metabolic process", "succinate dehydrogenase (ubiquinone) activity", "organ development", and "mitochondrial respiratory chain complex assembly". Thus, we made the hypothesis that SDHAF4 might play roles in regulating the growth and development of Leiqiong cattle; however, the underlying mechanism still requires further investigation. The ACO2 gene can promote the biosynthesis of lysine, generate ATP by participating in the tricarboxylic acid cycle, affect the proportion of muscle fibers, and regulate muscle growth and development 40,41 . For the genes B3GALT1 and UPP2 that we found to be extremely different, there are no reports related to the growth and development of the body, which requires further cytological validation. In our study, the expression of ACO2 and UPP2 were decreased in the GRC group and the expression of B3GALT1 was increased. Through bioinformatics analysis, we found that ACO2, B3GALT1 and UPP2 are mainly involved in Metabolic pathways, Carbon metabolism, Citrate cycle (TCA cycle), Glyoxylate and dicarboxylate metabolism, Biosynthesis of amino acids. The JUNB gene product (JunB proto-oncogene, AP-1 transcription factor subunit) has been described as a growth-inhibiting protein, and JUNB deficient mice showed a retarded growth and a reduction in adipose tissue 42 . The expression of transcription factor JUNB was significantly down-regulated in the model of muscle atrophy induced by diabetes, denervation and starvation. JUNB also plays an important role in regulating the synthesis and degradation of skeletal muscle proteins. It has been reported that knocking down JUNB in mice can significantly reduce the cross-sectional area of muscle fibers, while overexpression can significantly increase the cross-sectional area of muscle fibers and promote muscle hypertrophy in mice. The main reason for muscle hypertrophy is that JUNB promotes protein synthesis and inhibits protein degradation 43 . In our study, the expression of JUNB gene increased in the GRC group, biological analysis results revealed that JUNB was engaged in the KEGG pathway including "osteoclast differentiation", "TNF signaling pathway". EGFR is a member of ErbB family. It has tyrosine kinase activity and is an important transmembrane receptor. EGFR was positively correlated with the development of central nervous system and the growth ability of cultured neurons in vitro. EGFR is activated by ligands to initiate intracellular signal transduction. It regulates transcription of transcription factor-activated genes through cascades of binding proteins and enzymes in cytoplasm, and directly participates in cell migration, adhesion, proliferation, differentiation and apoptosis. EGFR is expressed in chondrocyte and can promote chondrocyte proliferation by binding with ligands 44,45 . Overexpression of EGFR in mice can accelerate the proliferation and growth of osteoblasts 46 and the proliferation of osteoblasts is impaired after knockout 47 . In our study, the expression of EGFR gene www.nature.com/scientificreports/ decreased in the GRC group, biological analysis results revealed that EGFR was engaged in the KEGG pathway including "regulation of actin cytoskeleton", "PI3K-Akt signaling pathway", "ErbB signaling pathway", "MAPK signaling pathway". KEGG pathway enrichment analysis showed that PPAR signaling pathway was significantly enriched by the DEGs ACOX2, SCD5, CPT1A, PPARα, ACOX1, RXRB, PCK2, SLC27A5, CPT2, EHHADH and RXRA, playing key roles in the regulation of multiple types of cells metabolism process. As a key member of this pathway, PPARα is involved in lipid metabolism in the liver and muscle, and is a key regulator of hepatocytes growth and metabolism; it has been demonstrated that PPARα could be utilized as a candidate gene for markerassisted selection for growth in cattle [48][49][50] . CPT1A gene is reported to be involved in regulating the growth traits in goat 51 . RXRA could influence cells proliferation and survival, influencing animals immune reaction 52 . PCK2 is suggested to play certain roles in modifying the function of liver and the development of muscle 53,54 .Thus, we hypothesized that these above genes were involved in animals' growth and development process; however, the underlying mechanism still required further investigation. In the next work, we will further verify the function of the genes found in the above on the cells. Transcriptional regulation could be affected by the activities of non-coding RNA (ncRNA) and transcription factors. Only about 2% of the mammalian genome is transcribed as proteins; 75-90% is transcribed as ncRNAs, the vast majority of which were lncRNAs 55 . In human and mouse, lncRNAs have been reported to potentially regulate muscle growth and differentiation [56][57][58] . In the present study, we totally identified 3,705 lncRNAs; 24 of them were differentially expressed in pairwise comparison between GRC and NGC groups, which might have specific biological roles in regulating animals growth and development. Studies have demonstrated that lncRNAs could influence the myoblast proliferation and differentiation through regulating the growth related genes; additonally, lncRNAs could significantly affect miRNA biology by acting as a precursor for miRNAs, directly binding to and sequestering miRNAs, or indirectly interfering with miRNA expression and regulation, through which the animal growth and development process might be regulated 59,60 . And, lncRNAs might work as competing endogenous RNA (ceRNA) model, binding to and sequestering miRNAs to prevent their target transcript degradation 61,62 ; however, the ceRNA hypothesis remains controversial. In the present study, our identified lncRNAs and their functions were predicted by bioinformatics method, thus further functional experiment should be performed for the validation, facilitating the enrichment of the functional annotation of the identified lncRNAs. For the differentially expressed lncRNAs, most of them were discovered two exons, accounting for 70.83%; just two lncRNAs showed the highest number of the exon, which was four. These lncRNAs were spliced with fewer exons (2)(3)(4), which was consistent with previous studies 63 . Their lower number of exons compared to mRNAs could be caused by the weaker expression of lncRNAs, and this made their structure more diffcult to verify 64 . GO analysis showed that the differentially expressed lncRNAs targets were associated with smooth muscle cell proliferation, cell chemotaxis, and regulation of MAPK cascade. The target genes (MAPK8, RPS6KA2) of Lnc_002363 and Lnc_003503 were significantly enriched in the MAPK signaling pathway; this pathway is related to embryonic development, cell proliferation, division, inflammation, cancer and so on 65,66 . Therefore, in addition to mRNAs, the differentially expressed lncRNAs reported here could be considered as important novel regulatory factors involved in the growth and development process in Leiqiong cattle. Among the 24 screened lncRNAs, Lnc_002583 was found located between the differentially expressed genes IFI44 and IFI44L. IFI44 is a kind of interferon-induced gene. The increase of IFI44 expression induces interferon response and inhibits the expression of MSTN gene. Most interferon-inducible genes can regulate cell growth. Some studies have found that IFI44 can inhibit cell proliferation. By transfecting IFI44 into melanoma cells, it was found that IFI44 could reduce the sensitivity of cells to IFN-alpha and inhibit cell growth 67 . The expression of IFI44L was up-regulated in synovial tissue of patients with systemic lupus erythematosus 68 . In order to study the role of IFI44 and IFI44L in the growth and development of Leiqiong cattle, GO enrichment analysis was carried out. The results showed that they were associated with small molecule binding, nucleotide binding, purine ribonucleoside binding, organic cyclic compound binding, GTP binding, viral defense response, stress response and immune system process. Lnc_002583, IFI44 and IFI44L were all upregulated in GRC group; due to the potential cis-regulatory effect, Lnc_002583 might positively influence the expression of IFI44 and IFI44L, exerting co-regulative effects on Leiqiong cattle growth and development. Our identified differentially expressed lncRNAs were all novel lncRNAs, they might act as ceRNAs to regulate animals growth; their regulatory effects in other species are still uncertain, which needs further verification. For the long non-coding RNAs related to the growth and development of the above screening, and the regulation relationship with the target genes, further experiment should be performed to verify the function and explore their regulation mechanism on the growth and development of Leiqiong cattle.

conclusions
The present study provided a systematic description of the changes in mRNAs and lncRNAs in Leiqiong cattle under the condition of growth retardation.The data was helpful for further investigation of the lncRNAs function. Generally, our results contributed to basic information to elucidate the mechanism associated the regulation of growth retardation in Leiqiong cattle at molecular level.

Data availability
Our sequencing data is being loaded to the sequence read archive (SRA) of NCBI. Once the submission ID was obtained, we would include it in our manuscript. www.nature.com/scientificreports/