tRNA-derived small RNA dataset in multiple organs of intrauterine growth-restricted pig

Intrauterine growth restriction (IUGR) impairs neonatal weight and causes multiple organ dysplasia. IUGR not only threatens human health but is also a significant constraint to the development of animal husbandry. However, the molecular mechanism underlying IUGR remains to be further elucidated. tRNA-derived small RNA (tsRNAs) is a regulative non-coding RNA, which has recently been reported to correlate with the onset and progression of several diseases. In this study, we investigated the tsRNAs expression profiles of IUGR pigs. A tsRNAs dataset for multiple organs in normal and IUGR pigs was generated, including muscle, liver, spleen and intestine. We further analyzed the characteristics of tsRNAs in different organs of pigs, and KEGG pathway analysis was performed to investigate possible pathways involved. This dataset will provide valuable information for further exploring the molecular mechanism of IUGR formation.


Methods
Animals and sample collection.The study used 12 paternal half-sib female Duroc × Landrance × Yorkshire (DLY) piglets.They were divided into two groups according to birthweight: Normal piglets (mean birth weight 1.60 ± 0.05 g, n = 6) and IUGR piglets (mean birth weight 1.07 ± 0.04 g, n = 6).The body weight of IUGR piglets was significantly lower than the weight of normal piglets.IUGR is commonly defined as a birth weight less than two standard deviations below the normal 1 .The piglets were raised following standard commercial practice.Body weight measurements were taken at 1, 23, 26, 30, and 37 days (Fig. 2A).At 37 days, piglets were slaughtered according to a standard commercial procedure.The weight of longissimus dorsi muscle, liver, spleen, kidney and pancreas were measured separately (Fig. 2B).Longissimus dorsi muscle, liver, spleen and intestine (jejunum) samples were collected in cryopreservation tubes and stored at −80 °C until used.
RNA extraction and library construction.Tissue samples were ground in liquid nitrogen.The number of samples per group was three.We selected the three lightest piglets in the IUGR group and the three heaviest piglets in the normal group to extract RNA.Each sample was added 1 ml RNAiso reagent (TaKaRa, Japan) to extract the total RNA according to the manufacturer's instructions.Isolated total RNA was measured concentrations and purities using the NanoDrop 2000 spectrophotometer (Thermo Scientific, USA).The absorbance ratio of the sample at 260 nm and 280 nm is used to evaluate RNA purity.RNA samples with a ratio between 1.8 and 2.0 are used for sequencing analysis.RNA modifications are abundant in tsRNAs and interfere with small RNA-seq library construction.Before library preparation, the detailed processing flow of total RNA is as follows: 3′ -aminoacyl (charged) deacylation to 3′ -OH for 3′ adaptor ligation, 3′ -cP (2′, 3′ -cyclic phosphate) removal to 3′ -OH for 3′ adaptor ligation, 5′ -OH (hydroxyl group) phosphorylation to 5′ -P for 5′ adaptor ligation, m1A and m3C demethylation for efficient reverse transcription.The total RNA from each sample was pretreated and then utilized to prepare the tsRNA-seq library.NEBNext ® Multiplex Small RNA Library Prep Set for Illumina (NEBNext ® , USA) was used for library construction.Library construction steps were carried out as follows.Firstly, the RNA was ligated to 3′ and 5′ small RNA adapters.Next, cDNA was synthesized from the ligated RNA using Illumina's proprietary reverse transcription (RT) primers and amplification primers.Subsequently, PCR amplification was performed to generate fragments ranging in size from 134-160 bp, which were extracted and purified from the polyacrylamide gel electrophoresis (PAGE) gel.Finally, the completed libraries were quantified using the Agilent 2100 Bioanalyzer to determine the concentration and quality of the libraries.The purified libraries were mixed in equal amounts and then used for sequencing.
sequencing procedures.The libraries were denatured with 0.1 M NaOH to generate single-stranded DNA molecules and diluted to a loading volume of 1.3 ml and loading concentration of 1.8pM.Diluted libraries were loaded onto reagent cartridges and forwarded to sequencing run on Illumina NextSeq500 system using NextSeq 500/550 V2 kit (#FC-404-2005, Illumina), according to the manufacturer's instructions.For standard small RNA sequencing on Illumina NextSeq instrument, the sequencing type is 50 bp single-read.

Data analysis.
Raw sequence data in FASTQ format generated from the Illumina NextSeq500 sequencing platform were used for further analysis.First, the FastQC (v0.11.7) was used to assess the quality scores of sequencing reads (Table 1).tRNA cytoplasmic sequences were downloaded from the Genomic tRNA Database (GtRNAdb) 20 .The reference genome used was Sscrofa11.1.tRNA sequences of mitochondria were predicted with tRNAscan-SE 21 software.Raw sequence were trimmed 5′, 3′ -adaptor sequence and discarded reads (length < 14nt or length > 40nt) to generate trimmed reads using Cutadapt.Trimmed reads were aligned to mature tRNA sequences, allowing onely one mismatch, and then reads that did not map were aligned to precursor tRNA sequences, allowing one mismatch with Bowtie software.
The expression level of each tsRNA is evaluated using sequencing counts and is normalized as counts per million of total reads (CPM).The count and CPM of tsRNAs can be calculated with the following formula: i: the i-th read aligned to the tsRNA region; n: the number of the reads aligned to the tsRNA region; c i : the count of the i-th read; m i : the number of tsRNA generated from the i-th read (m i possibly occur great than one, only when allowing for more than 1 mismatch); N: the total number of reads mapped onto all of the mature or precursor tRNA.The obtained counts and CPM data of tsRNA were used for subsequent analysis.

Characteristics of tsRNAs expression profile.
Based on the tsRNAs expression level (CPM), we evaluated Spearman's correlations coefficients between 24 samples (Fig. 3A).Principal Component Analysis was performed based on read counts (Fig. 3B) and CPM (Fig. 3C) of tsRNAs for each sample.The number of tsRNAs identified from each group was depicted in the petal diagram, and 364 core tsRNAs were shared among all groups (Fig. 3D).The length distribution of the identified tsRNAs was analyzed (Fig. 3E).The majority of tsRNAs were 31-32 nt in length.According to the cleavage site of parental tRNAs, tsRNAs were classified into 9 subtypes, as shown in Fig. 3F.Conventional and specifically expressed tsRNAs between normal and IUGR groups are depicted in the Fig. 3F Venn diagram.The pie chart shows the number of each group's tsRNAs subpopulation.We further analyzed the percentages of per tsRNAs in each group.The percentage of the top 15 tsRNAs was also computed in all groups (Fig. 3H).Among them, tRF-Gly-GCC-037/038 was the highest in abundance, the sum of two tsRNAs exceeded the 60%.Interestingly, the top 6 tsRNAs all originate from the same tRNA-Gly-GCC.Figure 3H illustrates the sequence of tRF-Gly-GCC-037/038 and their parental tRNA-Gly-GCC cleavage site.We further analyzed the characteristics of tsRNAs identified in four tissues of pigs. Figure 4A demonstrates the tRNA types from which the tsRNAs originate and the number of tsRNA subtypes.It indicated that the tRNA-Glu-TTC produced the most significant number of tsRNAs.As shown in both Figs.3F, 4A diagrams, tRF-5c was the most abundant tsRNA subtype in any one sample.Moreover, the tRNA cleavage sites corresponding to each subtype were analyzed in Fig. 4B.We calculated the proportion of four bases in each break site of tRNA in Fig. 4B lower panel.

Identification of differentially expressed tsRNAs.
Differentially expressed tsRNAs analyses were performed with R package edgeR.The P-value of the exact test was calculated by negative binomial distribution.Then, multiple testing using a FDR was applied to obtain the Q-values.No differentially expressed tsRNAs were found with FDR correction.The threshold for screening differentially expressed tsRNAs was the absolute fold change > 1.5 and P-value < 0.05.Differentially expressed tsRNAs in four tissues between normal and IUGR groups were visualized according to fold change and P-value.The red circle represents up-regulated tsRNAs, and blue circle indicates down-regulated tsRNAs (Fig. 5

Functional enrichment analysis.
Multiple recent studies have demonstrated that tsRNAs have similar regulation mechanisms to microRNAs.Thus, we used the publicly available miRanda and TargetScan tools to predict the target genes of tsRNAs.Targetscan software threshold was set at 50 (context score percentile), and miRanda was set with a maximum binding free energy of less than −20.Those genes predicted jointly in miRanda and TargetScan were used as the target genes of tsRNAs for the Kyoto Encyclopedia of Genes and Genomics (KEGG) analysis.All up-regulated and down-regulated tsRNAs in four tissues were performed KEGG pathway enrichment analysis, respectively.The top 10 pathways for each tissue are shown in Fig. 6.Up-regulated tsRNAs were mainly enriched in the MAPK signaling pathway and metabolic pathway.Down-regulated tsRNAs were mainly enriched in the insulin signaling pathway and ErbB signaling pathway.We also constructed the relationship between these pathways and up-regulated and down-regulated tsRNAs.
statistical analyses.rson.Results in Fig. 2B were represented as means ± SD.Significant differences between normal and IUGR group were determined by the unpaired t-tests.P-values < 0.05 (*) represent significant difference.P-values < 0.01 (**) represent highly significant difference.

Data Records
The RNA-Seq raw data were deposited in the NCBI Sequence Read Archive (SRA) database under the accession number PRJNA974817 22 and PRJNA800654 23 .The tRNA sequences, results of differential expression analysis and the functional enrichment analysis are stored in figshare 24 .Validation of experimental sample.Pearsons correlation coefficient analysis was performed on all 24 samples.Strong correlations were seen between samples from the same tissue type (Fig. 3A).Principal component analysis (PCA) was also performed with all samples and the distances between the sample points represent the similarity of samples.Obviously, the distance between samples from the same tissue type is closer (Fig. 3BC).

Fig. 1
Fig. 1 Study workflow for the main analysis.
left panel).Heat map showing differentially expressed tsRNAs clustering for each tissues (Fig. 5 right panel).

Fig. 3
Fig. 3 Analysis of tsRNAs characteristics.(A) Spearman's correlations coefficients between all samples.Principal Component Analysis (PCA) based on read counts (B) and CPM value (C) of samples.(D) numbers of tsRNAs for each tissue type.(E) tsRNAs length distribution in muscle, liver, spleen and intestine of normal and IUGR pigs.(F) Venn diagram summarizing tsRNAs number and type of Normal and IUGR piglets.(G) Relative abundance of the top most abundant 15 tsRNAs.(H) The generation position of tRF-Gly-GCC-037/038 derived from tRNA-Gly-GCC.

Fig. 4
Fig. 4 Analysis of parental tRNA.(A) Statistics of amino acids corresponding to parent tRNA of tsRNAs.The bar chart represents the number of tsRNAs corresponding to different tRNAs.The spherical plot represents the number of tsRNAs corresponding to different amino acids.The bubble chart represents the number of different subtypes tsRNAs.(B) Cleavage position of each type of tsRNAs and base characteristics.The peak represents the probability of cleavage site.The sum of peak areas for each subtype is 1.

Fig. 5
Fig. 5 Analysis of differentially expressed tsRNAs.(A-D) represent muscle, liver, spleen and intestine, respectively.The left panel is the tsRNA rank plot, the right panel is the heatmap of differentially expressed tsRNAs.The heatmap is based on the expression level of tsRNA (CPM) and used Z-Score for standardization.

Table 1 .
Sequence reads information.TotalRead: Raw sequencing reads after quality filtering.TotalBase: Number of bases after quality filtering.BaseQ30(%): Number of bases of Q score more than 30 after quality filtering.BaseQ30(%): The proportion of bases (Q ≥ 30) number after quality filtering.

Table 1
. show the proportion of bases (Q ≥ 30) number after quality filtering.