Introduction

Epigenetics refers to heritable changes that modify DNA or associated proteins without changing the DNA sequence itself1. The epigenome is a dynamic entity influenced by predetermined genetic programs and external environmental cues2. Epigenetic mechanisms include DNA methylation, histone modifications and microRNAs. DNA methylation in a gene's promoter region (5′ end of the gene) and microRNA (miRNA) regulation at the 3′ untranslated region (3′-UTR) are important in the regulation of gene expression in most eukaryotes3.

DNA methylation is a normal process used by mammalian cells to maintain a normal expression pattern4,5,6 and has been implicated in diverse processes, including embryogenesis, genomic imprinting, X chromosome inactivation and transposon silencing in mammals and plants5,7,8. DNA methylation occurs almost exclusively on a cytosine in a CpG dinucleotide and is achieved by the addition of a methyl group to the 5 position of the cytosine ring, mediated by DNA methyltransferases (DNMTs)9. CpG sites are approximately 80% depleted in the genome and are asymmetrically distributed into dense regions called CpG “islands”10,11. CpG islands normally remain unmethylated9. Methylation of CpG islands in promoter regions is often associated with gene silencing5,12. In porcine adipose and muscle tissues, the differentially methylated regions in promoters are highly associated with the development of obesity via the repression of both known obesity-related genes and novel genes13. Although genome-wide DNA methylation maps have been reported for many organisms, such as humans14, Arabidopsis15,16, chicken17, pig13 and cattle18,19, the methylation pattern of bovine muscle tissue remains poorly studied. In this study, we constructed two DNA libraries from Chinese Qinchuan bovine muscle tissue at the fetal and adult stages. By high-throughput sequencing of the DNA libraries and subsequent bioinformatics analysis, a genome-wide DNA methylation map for bovine muscle tissue was generated.

In addition, recent studies have shown that small RNAs (sRNAs) are associated with DNA methylation20. The miRNAs are a class of small non-coding RNA molecules that regulate eukaryotic gene expression at the post-transcriptional level. They specifically bind mRNAs in their 3′-UTRs based on sequence complementarity and lead to translational repression and gene silencing21. miRNAs bind to their target mRNAs and downregulate their stability and/or translation. When binding to its target mRNA with complete complementarity, a miRNA can lead to degradation of the target9. The miRNAs can also bind to targets with incomplete complementarity, often in the 3′-UTR regions, which leads to translational suppression22,23,24. The large number of miRNAs discovered so far indicates that many biological processes, including cell cycle control, cell growth and differentiation, apoptosis and embryo development, are controlled by miRNA-mediated regulation of gene expression25. Each miRNA is predicted to have many targets and each mRNA may be regulated by more than one miRNA26,27,28.

It is evident that DNA sequence polymorphisms alone do not provide adequate explanations for the mechanisms regulating muscle tissue development in beef cattle. Recently, epigenetic factors, both DNA methylation and miRNA regulation, have been shown to suppress gene expression and the corresponding protein products3; thus, they play critical roles in cellular processes and the development of bovine muscle tissue. Identifying differentially methylated genes is an important first step in investigating the function of epigenetic modifications in the course of bovine growth and development.

In this study, we performed the first integrated genome-wide analysis of DNA methylation, miRNAs and mRNA transcriptional activity, using cattle as a model, in a well-defined Chinese Qinchuan beef cattle breed. We constructed two DNA libraries, two mRNA libraries and two small RNA cDNA libraries from the longissimus dorsi muscle (LDM) at the fetal and adult stages. The objectives of the present study were to assay the landscape of methylome distribution in the genome, analyze differentially methylated regions (DMRs) and identify genes involved in the development of muscle. We observed that gene expression is negatively correlated with DNA methylation in the proximal promoter regions in cattle. Additionally, the expression patterns of the high-read negatively correlated genes were evaluated in nine different tissues (LDM, heart, liver, spleen, lung, kidney, stomach, small intestine and fat) at multiple developmental stages (fetal, newborn and adult). In addition, we validated the MeDIP-seq results by bisulfite sequencing PCR (BSP) in some of the differentially methylated promoters. The work performed in this study will serve as a valuable resource for future functional validation and could aid in searching for epigenetic biomarkers for muscle growth prediction and in promoting further development of beef cattle as a model organism for muscle research in humans and other mammals.

Results

Landscape of the DNA methylomes

In the present study, we obtained 125,288,888 (fetal bovine, FB) and 127,819,566 (adult bovine, AB) raw reads from MeDIP-seq. In each group, approximately 65% of the clean reads were mapped to raw reads and approximately 96% of the reads were uniquely mapped to clean reads (Table S1). MeDIP-Seq reads were detected in most chromosomal regions (chromosomes 1–29 and chromosome X) in each group, although some gaps existed (Figure S1). Figure S2 shows the distribution of MeDIP-seq reads in regions of different CG density. Regions with a density of 5 to 10 CpGs showed the highest percentage of reads in both groups.

The distribution of MeDIP-Seq reads in different genome regions represents a genome-wide methylation pattern. The read distribution was analyzed in CpG islands, the 2 kb region upstream of the transcription start site (TSS), 5′-UTRs, CDS, introns, 3′-UTRs, the 2 kb region downstream of the transcription termination site (TTS), repeat regions and each class of repetitive elements (Figure S3). CpG islands were reported to have relatively low methylation levels14. The analysis of read distribution in different components of the genome showed that uniquely mapped reads were mainly present in repeat elements. Repeats showed a relatively high methylation level (FB = 30.18%; AB = 31.6%). Reads are concentrated in repeats because the total length of the repeats is much larger than that of other elements. Moreover, methylation levels differed among repeat types, with a high percentage of the repeat-mapped reads falling in LINE/L1 (FB: 25.90%, AB: 26.56%), LINE/RTE-BovB (FB: 24.42%, AB: 24.93%) and SINE/BovA (FB: 14.56%, AB: 14.95%) (Table S2).

We analyzed the distribution of DNA methylation in the 2 kb region upstream of the TSS, the gene body (the entire gene from the TSS to the end of the transcript) and the TTS. Generally, gene body regions show a higher level of DNA methylation than the 5′ and 3′ flanking regions of genes15. The region around the TSS is crucial for gene expression regulation. In cattle, the DNA methylation level decreased dramatically before the TSS, increased sharply towards the gene body regions and plateaued until the TTS (Figure 1). Previous studies have demonstrated that DNA methylation in gene body regions impeded transcription elongation in chicken, Neurospora crassa, Arabidopsis thaliana and mammalian cells16,17,29,30. The hypermethylation of the gene body regions in the chicken genome further indicates that this methylation pattern is most likely a mechanism for the regulation of gene expression that is conserved among species.

Figure 1

Distribution of MeDIP-Seq reads in the gene region.

Distribution of reads around gene bodies. The x axis indicates the position around gene bodies and the y axis indicates the normalized read number. This figure reflects the methylation level around gene bodies.

The uniquely mapped reads were used to detect highly methylated regions (HMRs), which are methylation-enriched regions, also called peaks. To determine the genome-wide DNA methylation profiles of cattle, we analyzed the peak distribution in different components of the genome by comparing their methylation densities. We identified 120,643 and 123,543 peaks in the fetal and adult bovine genomes, respectively (Table S3). For each peak length range, the number and percentage of peaks were calculated. The lengths of the methylated peaks in the bovine genome were mainly distributed between 500 and 1000 bp (Figure S4). Furthermore, we observed variation in the number of CpGs per peak (Figure S5); the largest share of peaks (approximately 22%) contained 10 to 15 CpG sites. Analysis of the HMR distribution across the different components of the genome showed that the HMRs were located mainly in introns (FB: 32.27%; AB: 32.37%) and coding sequences (CDS) (FB: 15.65%; AB: 14.97%) (Figure S6). Analysis of HMR coverage showed that the coverage of the downstream 2 kb, 5′-UTR, CDS, upstream 2 kb, 3′-UTR and intron regions was approximately 42%, 96%, 90%, 10%, 65% and 47%, respectively (Figure S7).

The peaks of the two groups of samples were merged as candidate differentially methylated regions (DMRs). We statistically tested the changes in methylation rate to define DMRs between the FB and AB groups. A total of 378 differentially methylated promoters were identified between the FB and AB libraries, of which 143 showed decreased methylation and 235 showed increased methylation in the LDM of adult bovine compared to fetal bovine (Table 1 and Figure S8). In other words, the DNA methylation level in the LDM was higher at the adult stage than in the fetal period. These results suggest that DNA methylation is dynamically altered during different stages of bovine growth.

Table 1 Number of differentially methylated genes in different gene regions

Association analysis with MeDIP-Seq and small RNA-Seq

Association analyses between methylation and small RNAs were based on the sequencing data from MeDIP-Seq and small RNA-Seq. To identify the small RNAs involved in bovine muscle proliferation and differentiation, total RNAs from bovine longissimus dorsi muscle at the fetal and adult stages were used to construct small RNA libraries. We obtained 15,454,182 clean reads (representing 97.31% of raw reads) from the fetal bovine library and 13,558,164 clean reads (representing 99.64% of raw reads) from the adult bovine library, after deleting some contaminant reads (Table S1).

We calculated the number of non-overlapping windows (window = 100 bp) to which MeDIP or small RNA reads were mapped; the coverage statistics are shown in Table S4. All the windows were divided into the following four types: windows with small RNA and MeDIP reads (FB: 1.94%, AB: 3.39%); windows with MeDIP reads but no small RNA reads (FB: 15.17%, AB: 21.33%); windows with small RNA reads but no MeDIP reads (FB: 9.61%, AB: 10.27%); and windows with neither small RNA nor MeDIP reads (FB: 72.74%, AB: 65.01%). We calculated the proportion of methylated miRNAs, scRNAs, snRNAs, tRNAs, rRNAs, snoRNAs and srpRNAs in the two libraries. A small percentage of miRNAs were classified as methylated RNAs (20.80% for fetal bovine and 15.68% for adult bovine) (Table S5).
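The four-way window classification described above can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used in the study: the function names are hypothetical, the toy coordinates are invented, and assigning each read to a window by its start position is a simplifying assumption.

```python
from collections import Counter

WINDOW = 100  # window size in bp, as stated in the text


def covered_windows(read_starts, window=WINDOW):
    """Return the set of window indices that contain at least one read start."""
    return {pos // window for pos in read_starts}


def classify_windows(genome_length, medip_starts, srna_starts, window=WINDOW):
    """Count windows in each of the four MeDIP / small RNA categories."""
    n_windows = -(-genome_length // window)  # ceiling division
    medip = covered_windows(medip_starts, window)
    srna = covered_windows(srna_starts, window)
    counts = Counter()
    for w in range(n_windows):
        key = ("MeDIP+" if w in medip else "MeDIP-",
               "sRNA+" if w in srna else "sRNA-")
        counts[key] += 1
    return counts


# Toy example (hypothetical data): a 1,000 bp sequence and a few read starts.
toy_counts = classify_windows(
    genome_length=1000,
    medip_starts=[5, 120, 130, 450],
    srna_starts=[110, 905],
)
```

Dividing each count by the total number of windows gives percentages of the kind reported in Table S4.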

All mature miRNA sequences represented by more than three reads were compared and transitively clustered into miRNA families31. All identified miRNAs were located in the genome using the search function in the Ensembl genome browser (http://www.ensembl.org/Bos_taurus/Info/Index). Target genes were predicted using MIREAP and TargetScan (http://www.targetscan.org/). The miRNA target genes with MeDIP read coverage greater than the effective chain depth were defined as methylated target genes. Methylated introns often corresponded to the target regions of miRNAs in both the fetal and adult bovine libraries. The reads of methylated miRNA target genes were concentrated in introns because the total length of the introns is much greater than that of the other elements (Figure S9). For each methylated target gene, we calculated the methylation level and the sRNA expression level as the number of reads located in the 2 kb region upstream of the TSS, the gene body and the 2 kb region downstream of the TTS combined. A scatter plot was used to show the correlation between the methylation and sRNA expression levels of the methylated miRNA target genes (Figure S10). Methylation levels correlated positively with sRNA expression levels in both the fetal (Pearson's r = 0.4296, P = 2.2 × 10^−16) and adult (Pearson's r = 0.2035, P = 2.2 × 10^−16) bovine genomes.
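As a rough illustration of the per-gene read counting and the Pearson correlation described above, the following Python sketch counts reads over the upstream 2 kb, gene body and downstream 2 kb of a gene and computes Pearson's r between two lists of such levels. The function names and the inclusive-coordinate convention are assumptions made for this example; the original analysis may have handled strand and coordinates differently.

```python
import math

FLANK = 2000  # 2 kb flanking regions, as stated in the text


def region_bounds(tss, tts, flank=FLANK):
    """Return (start, end) covering upstream flank + gene body + downstream flank."""
    start = max(min(tss, tts) - flank, 0)
    end = max(tss, tts) + flank
    return start, end


def count_reads(read_positions, start, end):
    """Count reads whose position lies in [start, end] (inclusive, by assumption)."""
    return sum(1 for p in read_positions if start <= p <= end)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

In practice, the per-gene MeDIP read counts would form one list and the per-gene sRNA read counts the other, with one entry per methylated miRNA target gene.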

Association analysis of MeDIP-Seq and the transcriptome (RNA-Seq)

These association analyses between the transcriptome and methylation were based on RNA-Seq and MeDIP-Seq sequencing data. We filtered the raw data to reduce the influence of sequencing errors, following the standard analysis described previously32,33. Using RNA-Seq, we compared the transcriptomic landscapes of longissimus dorsi muscle at the fetal and adult stages, from which the mRNA libraries were constructed. To accomplish this goal, two rounds of linear amplification of mRNA were used, ensuring that each individual animal produced enough RNA input for analysis. All samples were sequenced on the HiSeq 2000 system, resulting in approximately 2 billion base pairs and 26 million raw reads. The sequencing reads were analyzed using SOAP software34 by alignment with the bovine reference genome (btau4.0). A total of 26,771,484 (FB) and 28,245,946 (AB) raw reads were generated for the two samples. More than 95.64% (FB) and 92.81% (AB) of the clean reads were mapped and approximately 73.16% (FB) and 75.20% (AB) of the reads were uniquely mapped to the bovine genome. Unmapped (FB: 20.89%, AB: 22.08%) and multiply mapped (FB: 5.95%, AB: 2.73%) reads were excluded from further analyses (Table S1).

We calculated each gene's methylation level and expression level in fetal and adult bovine and determined their distribution characteristics in terms of DNA methylation and the mRNA transcriptome. Methylation levels correlated positively with expression levels (Pearson's r = 0.4326, P = 2.2 × 10^−16 for FB; Pearson's r = 0.6061, P = 2.2 × 10^−16 for AB) (Figure 2). We divided the genes into five approximately equal groups (lowest, lower, medium, higher, highest) according to their expression levels and counted the genes of each group in each sample. The lowest, lower, medium and higher expression groups each contained the same number of genes (fetal: 3,099; adult: 2,971), and the highest expression group contained 3,095 and 2,969 genes for fetal and adult bovine, respectively (Table 2 and Figure 3). There were 2,136, 1,437, 1,304, 1,408 and 1,366 genes commonly expressed in fetal and adult bovine in the highest, higher, medium, lowest and lower expression groups, respectively (Figure 3).
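The grouping into five expression classes can be sketched as follows. The totals used here (15,491 fetal and 14,853 adult genes) are simply the sums of the group sizes reported above, and the rule of filling groups with a ceiling-sized chunk from lowest to highest expression is an assumption; it is shown because it reproduces the reported group sizes, not because the source states it. Function names are hypothetical.

```python
def group_sizes(n_genes, n_groups=5):
    """Sizes obtained by filling groups of ceil(n_genes / n_groups) in order."""
    size = -(-n_genes // n_groups)  # ceiling division
    sizes = []
    remaining = n_genes
    for _ in range(n_groups):
        take = min(size, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


def split_by_expression(expression, n_groups=5):
    """Rank genes by expression (ascending) and cut them into consecutive groups.

    `expression` maps gene id -> expression level. The first group holds the
    lowest expressed genes and the last group the highest expressed genes.
    """
    ranked = sorted(expression, key=expression.get)
    groups = []
    start = 0
    for size in group_sizes(len(ranked), n_groups):
        groups.append(ranked[start:start + size])
        start += size
    return groups


fetal_sizes = group_sizes(15491)  # 4 x 3,099 + 3,095 genes
adult_sizes = group_sizes(14853)  # 4 x 2,971 + 2,969 genes
```

Under this rule the last (highest expression) group receives the remainder, which matches the slightly smaller highest-expression groups in the text.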

Table 2 Statistics of differentially methylated and expressed genes on promoter and gene body regions

Figure 2

Distribution characteristics of gene methylation and expression levels.

The Pearson's correlation was calculated between the log2 ratios of mRNA expression differences and the log2 ratios of the methylation differences. The statistical significance was calculated by one-way ANOVA. (A): fetal bovine; (B): adult bovine.

Figure 3

Gene numbers of different expression groups for each sample (fetal and adult bovine).

A: highest expression; B: higher expression; C: middle expression; D: lowest expression; E: lower expression.

The DNA methylation profiles in and around gene bodies were compared among these five gene expression levels. A clearly negative and monotonic correlation was found between DNA methylation levels around the TSS and gene expression levels. The TSS regions of highly expressed genes were relatively hypomethylated, whereas genes expressed at low levels were increasingly methylated (Figure 4).

Figure 4

DNA methylation level distributions of five levels of gene expression.

The DNA methylation profiles in the gene regions (TSS, gene body and TTS) are shown using the reads that aligned to a unique locus in the genome. The upstream and downstream 2 kb regions were split into 20 non-overlapping windows and the average alignment depth was calculated for each window. In the gene body, each gene was split into 40 equal windows and the average alignment depth was calculated for each window. The y axis represents the average of the normalized depth for each window. Genes were divided into five groups according to expression levels: lowest-level expressed genes (silent genes), lower-level expressed genes, middle-level expressed genes, higher-level expressed genes and highest-level expressed genes (house-keeping genes). Each line in the figure represents the DNA methylation level of one expression group. For both fetal and adult bovine, the red line indicates that genes with the lowest expression level had a relatively low methylation level, the blue line indicates that genes with a middle expression level had a middle methylation level and the green line indicates that genes with the highest expression level had a relatively high methylation level. The DNA methylation profiles in and around gene bodies were compared across these five gene expression levels. (A): fetal bovine; (B): adult bovine.
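The windowing scheme in this legend can be sketched in Python as below. This is an illustrative reading, not the authors' code: it assumes that each 2 kb flank is split into 20 windows (100 bp each), that the gene lies on the forward strand with the TSS before the TTS, and that a per-base depth list is available. All function names are hypothetical.

```python
def window_means(depth, n_windows):
    """Split a per-base depth list into n_windows near-equal windows; return their means."""
    n = len(depth)
    if n < n_windows:
        raise ValueError("region is shorter than the number of windows")
    means = []
    for i in range(n_windows):
        lo = i * n // n_windows
        hi = (i + 1) * n // n_windows
        chunk = depth[lo:hi]
        means.append(sum(chunk) / len(chunk))
    return means


def gene_profile(depth, tss, tts, flank=2000, flank_windows=20, body_windows=40):
    """Return 20 upstream + 40 gene-body + 20 downstream window means for one gene."""
    if tss - flank < 0 or tts + flank > len(depth) or tss >= tts:
        raise ValueError("gene and flanks must lie inside the depth array")
    upstream = depth[tss - flank:tss]
    body = depth[tss:tts]
    downstream = depth[tts:tts + flank]
    return (window_means(upstream, flank_windows)
            + window_means(body, body_windows)
            + window_means(downstream, flank_windows))
```

Averaging such 80-point profiles over all genes in an expression group would give one curve per group, comparable in layout to the lines in Figure 4.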

Methylated genes were defined in this study as genes overlapping (≥50%) with HMRs in the promoter or gene body regions. A total of 7,697 methylated genes were found in the fetal bovine, among which 886 genes were methylated only in promoters, 4,906 only in gene bodies and 1,905 in both promoters and gene bodies. A total of 7,744 methylated genes were found in the adult bovine, among which 910 genes were methylated only in promoters, 4,736 only in gene bodies and 2,098 in both promoters and gene bodies (Figure 5 and Table S14).
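The overlap rule above can be illustrated with a short Python sketch. The source does not say whether the 50% threshold refers to the fraction of the gene region or of the HMR; the sketch assumes the former (fraction of the promoter or gene body covered by the union of HMRs). Intervals are half-open `(start, end)` pairs and the function names are hypothetical.

```python
def covered_fraction(region, peaks):
    """Fraction of `region` covered by the union of `peaks` (half-open intervals)."""
    start, end = region
    if end <= start:
        raise ValueError("region must have positive length")
    clipped = sorted(
        (max(s, start), min(e, end)) for s, e in peaks if s < end and e > start
    )
    covered = 0
    cursor = start  # right edge of what has been counted so far
    for s, e in clipped:
        s = max(s, cursor)
        if e > s:
            covered += e - s
            cursor = e
    return covered / (end - start)


def methylation_class(promoter, gene_body, peaks, threshold=0.5):
    """Classify a gene by where HMRs cover at least `threshold` of the region."""
    in_promoter = covered_fraction(promoter, peaks) >= threshold
    in_body = covered_fraction(gene_body, peaks) >= threshold
    if in_promoter and in_body:
        return "both"
    if in_promoter:
        return "promoter_only"
    if in_body:
        return "gene_body_only"
    return "unmethylated"
```

The three methylated classes are mutually exclusive, which is consistent with the reported counts summing to the totals (886 + 4,906 + 1,905 = 7,697 in fetal bovine; 910 + 4,736 + 2,098 = 7,744 in adult bovine).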

Figure 5

Distribution of DNA methylation on promoters and gene bodies.

As expected35,36, we found a negative correlation between methylation and gene expression levels in this study; in both fetal and adult bovine, the promoters of highly expressed genes tended to exhibit low methylation levels, while the promoters of genes with low expression were usually highly methylated (Table 1, Table 2 and Table S6). According to the screening criteria chosen (FDR ≤ 0.001 and |log2Ratio| ≥ 1), the expression levels of 1,885 genes were upregulated and those of 4,889 genes were downregulated (Table S8). In the gene body regions, 1,274 genes showed high methylation levels (CDS: 1,076; intron: 198) and 907 genes showed low methylation levels (CDS: 792; intron: 115); in the promoter regions, 235 genes showed high methylation levels and 143 genes showed low methylation levels in adult bovine compared to fetal bovine (Table 1). A total of 77 and 1,054 genes showed a negative correlation between methylation and expression in the promoter and gene body regions, respectively, between the FB and AB libraries (Table 2). In addition, we examined the correlation between methylation levels and expression levels in the negatively correlated genes (Figure S15). In the promoter regions, 12 genes exhibited reduced methylation and increased expression and 65 genes exhibited the opposite trend in the LDM of adult bovine compared to fetal bovine (Table 2 and Table S16). In the gene body regions, 257 genes were down-methylated with up-regulated expression and 797 genes were up-methylated with down-regulated expression in adult bovine compared to fetal bovine (Table 2). Thus, up-methylated genes with down-regulated expression predominated in the LDM of adult bovine compared to fetal bovine.
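The screening criteria and the pairing of methylation and expression changes can be sketched in Python as follows. The thresholds come from the text; everything else is an assumption for illustration, in particular the direction of the ratio (adult over fetal) and the hypothetical function names.

```python
import math


def log2_ratio(adult, fetal):
    """log2 fold change of adult relative to fetal (direction assumed here)."""
    return math.log2(adult / fetal)


def is_differentially_expressed(fdr, log2_fc, fdr_max=0.001, min_abs_log2=1.0):
    """Apply the stated criteria: FDR <= 0.001 and |log2Ratio| >= 1."""
    return fdr <= fdr_max and abs(log2_fc) >= min_abs_log2


def pairing(meth_log2, expr_log2):
    """Label a gene by whether methylation and expression change in opposite directions."""
    if meth_log2 == 0 or expr_log2 == 0:
        return "unchanged"
    opposite = (meth_log2 > 0) != (expr_log2 > 0)
    return "negative" if opposite else "positive"
```

For example, a gene with increased methylation and decreased expression in adult bovine is labelled "negative". The reported subgroup counts are internally consistent with the totals for the negatively correlated genes: 12 + 65 = 77 in promoters and 257 + 797 = 1,054 in gene bodies.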

A total of 56 and 911 positively correlated genes were found in the promoter and gene body regions, respectively, between the FB and AB libraries (Table 2). In the promoter regions, 24 genes were up-methylated with up-regulated expression and 32 genes were down-methylated with down-regulated expression. In the gene body regions, 252 genes were up-methylated with up-regulated expression and 659 genes were down-methylated with down-regulated expression in adult bovine compared to the fetal period. Thus, a positive correlation between methylation and transcription was also detected for a subset of genes in bovine LDM tissue. These results suggest that DNA methylation changes occur in the LDM between the fetal and adult bovine developmental stages.

To further investigate the biological processes associated with the 77 and 1,054 negatively correlated genes with methylation and expression levels in the promoter and gene body regions, we performed GO analysis by running queries for each differentially expressed gene against the GO database37. The results of the GO functional annotation analysis are presented in Figure 6. However, further analysis of the negatively correlated genes with DNA methylation in all GO biological categories showed that there were no significantly enriched GO terms (P > 0.05) between the two LTT libraries from fetal and adult cattle.

Figure 6

The functional enrichment of genes with significantly correlated methylation and expression levels.

Multiple genes can cooperate to exercise their biological functions. Pathway enrichment analysis identifies significantly enriched metabolic or signal transduction pathways related to differentially methylated genes. Overall, 285 pathways were associated with differentially methylated genes. Five pathways were significantly enriched (Q ≤ 0.05) for differential methylation in the promoter (Table S10) and gene body (Table S11) regions. In the promoter regions, the most significant pathway terms were the citrate cycle and tight junctions. In the gene body regions, the most significant pathway terms were axon guidance, regulation of actin cytoskeleton and vitamin digestion and absorption.

Association analysis of MeDIP-Seq with small RNA-Seq and the transcriptome (RNA-Seq)

All combinations of DNA methylation, miRNA regulation and gene expression are shown in Figure 7. In the bovine genome, approximately 32.44% (n = 6,487, FB) and 32.40% (n = 6,478, AB) of the genes were methylated, miRNA-regulated and expressed; approximately 7.11% (n = 1,422, FB) and 8.55% (n = 1,790, AB) of the genes were methylated and miRNA-regulated; approximately 44.02% (n = 8,801, FB) and 41.10% (n = 8,218, AB) of the genes were miRNA-regulated and expressed; and approximately 16.37% (n = 3,273, FB) and 17.90% (n = 3,578, AB) of the genes were miRNA-regulated. The objective of this analysis was to identify methylated and miRNA-regulated genes affecting bovine muscle growth. This study also provided pairwise statistical analyses between DNA methylation, miRNA and the transcriptome at the fetal and adult developmental stages (Figure 8). We hypothesized that methylation and miRNA regulation of some genes might partially contribute to changes in the transcriptome during bovine growth from the fetal to the adult developmental stage.
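A tally of this kind can be sketched in Python by labelling each gene with a three-character string of "+" and "-" for methylation, miRNA regulation and expression, in that order. The input format, the function names and the toy genes are hypothetical and serve only to show the counting logic.

```python
from collections import Counter


def combination_label(methylated, mirna_regulated, expressed):
    """Encode the three flags as a string such as '++-' (order: meth, miRNA, expr)."""
    return "".join("+" if flag else "-" for flag in (methylated, mirna_regulated, expressed))


def tally_combinations(genes):
    """Return {label: (count, percentage of all genes)} for a dict of gene flags."""
    counts = Counter(combination_label(*flags) for flags in genes.values())
    total = len(genes)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


# Toy example with four hypothetical genes.
toy_genes = {
    "geneA": (True, True, True),
    "geneB": (True, True, False),
    "geneC": (False, True, True),
    "geneD": (False, True, True),
}
toy_tally = tally_combinations(toy_genes)
```

Each gene falls into exactly one combination, so the percentages across all labels sum to 100.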

Figure 7

Association analyses of genes in fetal (A) and adult bovine (B).

The numbers and percentages of genes in each possible combination of regulation are calculated. “+” represents methylated/miRNA regulated/expressed genes while “−” represents unmethylated/not miRNA regulated/not expressed genes. The other lines show the number of genes and the percentage of these genes among the whole gene sets.

Figure 8

Pairwise statistics of MeDIP-Seq and small RNA (A, D), transcriptome and MeDIP-Seq (B, E) and transcriptome and small RNA (C, F) in fetal and adult bovine.

We found that the average expression levels were the lowest in both fetal and adult bovine libraries for genes in which both the promoter and the gene body were modified compared to the average expression level of differentially expressed miRNA target genes (Figure 9, Table S12). The average methylation levels were higher for the miRNA-targeted genes than the non-miRNA-targeted genes in both fetal and adult bovine libraries (Figure 10 and Table S13). Based on these results, we inferred that miRNA activity on target genes may somehow encourage methylation of the gene, or else certain genes are so important to repress in the course of development that both methylation and miRNA systems are in place to keep expression fully off.

Figure 9

Average expression levels of differentially expressed miRNA-targeted genes.

Figure 10

Average methylation and expression levels of miRNA and non-miRNA target genes in the two libraries.

A large number of genes whose expression is mediated by both miRNAs and methylation were found in the FB and AB libraries. These genes were likely inhibited at some developmental stage and their functions can be explored by gene ontology (GO)38. Figure S11 shows the GO function analysis of the miRNA target genes. It should be noted that we only predicted target genes of miRNAs that were differentially expressed between the two samples (fetal and adult bovine) in the standard small RNA analysis, so the GO functions of the methylated miRNA target genes of the two samples were the same. The DMRs in the MeDIP standard information analysis showed that the methylation modifications differ between the samples, and the formation of DMRs may occur for many reasons. Recent studies have shown that miRNAs and siRNAs may affect DNA methylation in animals and plants9.

This analysis aimed to discover whether differential miRNA expression influences DMR formation (Table S6). The expression profiles of the two libraries are shown in Table S7. The 251 miRNAs comprised 230 with increased expression and 21 with reduced expression. We explored the correlation between miRNA expression levels and the expression levels of their cognate target genes (Figure S16 and Table S15) and found a very weak negative correlation (Pearson's r = −0.0056, P = 0.0091).

Identification and validation of the negatively correlated genes via qPCR

We used qPCR to validate the changes in expression levels and to gain insight into the possible roles of genes with a negative correlation between methylation and expression in different tissues at three developmental stages in cattle. We randomly picked several high-read genes, including 37 genes with high methylation and low expression and 11 genes with low methylation and high expression in muscle tissue in fetal bovine compared to adult bovine (Table S9).

In the present study, eight genes with upregulated methylation (LAMB1, HNRNPM, ACLY, CLCN2, CRABP2, ADAM12, MBOAT2 and PSD) had the lowest expression levels in adult bovine and the expression gradually decreased during the three muscle developmental stages. The expression patterns of these genes were similar; there was high expression in fetal muscle tissue and heart and low expression in the adult bovine (Figure 11). In contrast, six genes with down-regulated methylation (MYL2, PDLIM1, DUSP1, DTNBP1, CS and EEF1A2) had the highest expression levels in adult bovine and the expression gradually increased during the three developmental stages of the muscles (Figure 12).

Figure 11

The qPCR validation and expression analysis of 8 up-methylated and down-regulated genes in several bovine tissues and organs.

The mRNA expression was normalized using ACTB and GAPDH and is expressed relative to gene expression in the fetal bovine group (green bars), which is given as a negative control. Green bars: fetal bovine; blue bars: newborn bovine; yellow bars: adult bovine. Error bars represent the standard error of the mean (SE). Each column value represents the mean ± SE of three replicates.

Figure 12

The qPCR validation and expression analysis of 6 down-methylated and up-regulated genes in several bovine tissues and organs.

The mRNA expression was normalized using ACTB and GAPDH and is expressed relative to gene expression in the fetal bovine group (green bars), which is given as a negative control. Green bars: fetal bovine; blue bars: newborn bovine; yellow bars: adult bovine. Error bars represent the standard error of the mean (SE). Each column value represents the mean ± SE of three replicates.

Furthermore, 29 up-methylated and down-expressed genes (Figure S12) and 5 down-methylated and up-expressed genes (Figure S13), all with reciprocal expression patterns, were quantified in all tissues and several of them were expressed relatively consistently across all nine tissue types. A comparison of the expression profiles among tissues revealed that SNX4 and KIAA1524 (Figure S12) in the heart and EIF2AK4 (Figure S12), MYL2, PDLIM1, DTNBP1, CS and EEF1A2 (Figure 12) in muscle-related tissues or organs (skeletal muscle and heart) were highly expressed, as were CLCN2, MBOAT2, PSD (Figure 11) and SF3B5 (Figure S12) in fat. In addition, only PSD (Figure 11), CSDC2, DNAJC10, PADI6 and MSLN (Figure S12) were not expressed in the bovine newborn and/or adult muscles, while 31 genes (ACLY, CLCN2, CRABP2, ADAM12, MBOAT2, PSD, TUBA1B, THY1, SEC61A1, ANK2, SF3B5, CSDC2, CNN2, ALDH18A1, SNX4, PJA2, GPR124, DACT1, EIF2AK4, RBM23, PADI6, MARK1, MSLN, LETMD1, PDLIM1, DUSP1, DTNBP1, CS, EEF1A2, ITGB7 and LCK) were detected in all tissues except the stomach in adults. Another 18 genes (HNRNPM, CRABP2, ADAM12, MBOAT2, PSD, TUBA1B, SF3B5, LEF1, CNN2, ALDH18A1, DACT1, RBM23, MPPED2, KIAA1524, PDLIM1, DUSP1, CS and EEF1A2) were not found to be expressed in liver tissue. In this study, PSD, CSDC2, DNAJC10, PADI6 and MSLN were not detected in bovine newborn and/or adult muscles but were highly expressed in other differentiated tissues. It is possible that DNA methylation had repressive effects on gene expression between prenatal and postnatal bovine muscle development. The expression patterns and levels of the 48 genes in all tested bovine tissues suggest that these genes may be involved in a highly conserved biological process in cattle. Further studies to determine their regulatory functions are needed.

MeDIP-Seq data validation via BSP

According to the MeDIP-Seq and qPCR results, three genes with relatively low methylation and three genes with high methylation were selected randomly from 77 genes with negative correlations between methylation and expression to carry out BSP to validate the MeDIP-Seq data. We found that the bisulfite sequencing results were almost exactly in accordance with the MeDIP-Seq results.

DNA methylation levels change a great deal among different genes in the fetal and adult stages. Three genes with high methylation and low expression (P1–P3) and three genes with low methylation and high expression (P4–P6) obtained from the MeDIP-Seq data were selected and their methylation patterns were assessed by bisulfite sequencing. The results showed that the DNA methylation levels of LAMB1 (P1: FB = 38.6%, AB = 83.3%), CLCN2 (P2: FB = 52.6%, AB = 89.5%) and CRABP2 (P3: FB = 44.6%, AB = 93.6%) increased in muscle tissue from fetal to adult stage. However, the DNA methylation levels of PDLIM1 (P4: FB = 91.1%, AB = 47.6%), DTNBP1 (P5: FB = 89.2%, AB = 48.0%) and CS (P6: FB = 96.7%, AB = 43.8%) decreased in muscle tissue (Figure 13).
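Percentages of this kind are commonly obtained from bisulfite sequencing by dividing the number of methylated CpG calls by the total number of CpG calls across all sequenced clones. The source does not state the exact formula, so the Python sketch below is an assumption; the clone encoding ("M" for a methylated CpG, "U" for an unmethylated one) and the function name are hypothetical.

```python
def methylation_percent(clones):
    """Percentage of methylated CpG calls across all clones.

    `clones` is a list of strings, one per sequenced clone, where each
    character is 'M' (methylated CpG) or 'U' (unmethylated CpG).
    """
    calls = "".join(clones)
    if not calls:
        raise ValueError("no CpG calls supplied")
    if set(calls) - {"M", "U"}:
        raise ValueError("clone strings may contain only 'M' and 'U'")
    return 100.0 * calls.count("M") / len(calls)
```

For instance, three clones with four CpG sites each and six methylated calls in total would give a methylation level of 50%.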

Figure 13

Validation of MeDIP-seq data by bisulfite sequencing in fetal bovine (FB) and adult bovine (AB) muscle tissue.

Three genes with high methylation and low expression (P1–P3) and three genes with low methylation and high expression (P4–P6) obtained from MeDIP-Seq data were selected and their methylation pattern was assessed by bisulfite sequencing. Each line corresponds to a single strand of DNA and each circle represents a single CpG dinucleotide. Filled and open circles indicate methylated sites and unmethylated sites, respectively.

Discussion

Animal breeding theory assumes that most traits are affected by many genes, each of which contributes only a little to the variance of the trait39. In the past several years, DNA methylation and miRNAs have been studied extensively. However, few studies have focused on cattle, one of the most important livestock animals raised worldwide. This study is the first to compare systematically the genome-wide muscle DNA methylation profiles and their relationships to mRNA and miRNA in fetal and adult bovine, using two-tail samples of a Chinese Qinchuan beef cattle breed at different growth stages. The objective was to identify methylated genes affecting bovine growth. Methylated DNA fragments were detected using a highly sensitive method involving enrichment by MeDIP, and high-throughput sequencing enabled the non-biased mapping of DNA methylation sites across the genome of bovine muscle. There is no substitute for BS-Seq to resolve a methylome. However, MeDIP-Seq is clearly capable of enriching for highly methylated sequences at a fraction of the time and cost of BS-Seq40. MeDIP-Seq is a relatively low-resolution technique that can detect methylated regions of approximately 150 to 200 bp41. The percentage of methylation at each locus can be inferred from the sequencing data42. Although the highest possible resolution of a single base pair is desirable, the methylation state of neighboring CpG sites has been shown to be highly correlated over distances as great as 1000 bp43,44. Depending on the research question, single-base-pair resolution may not be absolutely necessary and the resolution provided by MeDIP-seq may be sufficient. MeDIP-seq can detect only differentially methylated regions, not single differentially methylated sites, so additional analysis is required to determine the state of the individual CpG sites involved41. 
This study provided a comprehensive analysis of DNA methylation profiles of bovine muscle by MeDIP-Seq and revealed 77 and 1,054 negatively correlated genes in the promoter and gene body regions, respectively, in the fetal and adult stages. Our data sets covered almost the entire genome with sufficient depth to identify differentially methylated regions, thereby providing high resolution and reproducibility and proved that MeDIP-seq is a cost-effective approach for comparative analyses of the mammalian DNA methylome. Although many researchers have sought to describe DNA methylome alterations in animals and plants, to our knowledge this is the first methylome study that effectively encompasses the entire bovine genome and is not limited to specific sequences. In our analysis, hypermethylation occurred not only at proximal promoters but also at exons and introns, including regions distal from the TSS. Since DNA methylation interrupts the binding of transcription factors to their response elements45,46, changes in methylation at distal regions may affect the expression of a gene.

Skeletal muscle is composed of myofibers, intramuscular adipocytes and connective tissue. Muscle fibers or myofibers are the structural units of skeletal muscle47. In livestock, all muscle fibers are formed during the prenatal stage. The fetal stage is crucial for skeletal muscle development. Fetal muscle development involves myogenesis, adipogenesis and fibrogenesis from mesenchymal multipotent cells, which are negatively affected by maternal nutrient deficiencies48. Bovine prenatal myogenesis can be briefly divided into three different generations of cells, which appear at around 60, 90 and 110 days of the fetal stage49. In contrast, postnatal skeletal muscle development is mainly due to the increase in muscle fiber size, and new muscle fibers are only generated during the adult stage to replace injured muscle fibers50,51. This pattern is significantly different between prenatal and postnatal bovine muscle development. Hence, in the present study, the fetal and adult Chinese Qinchuan bovine LDM were collected and two MeDIP-seq, two small RNA-seq and two RNA-seq libraries were constructed for Illumina sequencing, each library using DNA or RNA samples from three bovine LDM. To confirm the results from MeDIP-seq and mRNA-Seq, DNA methylation and mRNA expression of some negatively correlated genes were verified with BSP and qPCR in each sample. The methylation levels and mRNA expression levels obtained by the sequencing and verification methods were generally in accord with each other.

The scan of methylation enriched region (called peak) in MeDIP-seq was important to survey the global methylation pattern. In this study, peak distribution analysis demonstrated that upstream-2kb (promoter) and CpG islands (CGIs) were hypomethylated, whereas the methylation levels in gene body (CDS and intron) regions were relatively high. These results were in accordance with findings in other species52,53. DNA methylation in the gene body regions might alter chromatin structure and transcription elongation efficiency54,55. However, in contrast to previous research in animals52,53,56, we did not observe a higher methylation level in exons than in introns in cattle. The promoter methylation is a repressive epigenetic mark that downregulates gene expression. Gene body methylation and expression levels apparently have a complex relationship. Gene body DNA methylation is positively correlated with gene expression in humans53,57,58,59,60. DNA methylation within gene bodies is more prevalent than in promoters, but information on the role of DNA methylation in gene bodies is insufficient. However, the relationship between gene body DNA methylation and gene expression levels is not monotonic but rather bell-shaped in plants, invertebrates and even in humans; moderately expressed genes have the highest methylation levels16,61,62. In bovine LDM, moderately expressed genes have the highest degree of gene body DNA methylation.

The miRNAs negatively regulate gene expression by promoting degradation of target mRNAs or inhibiting their translation. The contribution of miRNA deregulation to skeletal muscle development has become increasingly evident in recent years. However, a complete understanding of the causes of deregulation is still lacking. The aim of this study was to get a deeper insight into the underlying mechanisms of miRNA deregulation in skeletal muscle development by integrating different layers of data from both the DNA and RNA levels. By considering two well-known mechanisms of DNA methylation and mRNA transcriptional regulation alterations, we wanted to identify miRNAs affected by such alterations at the genomic and epigenetic levels that were further reflected in miRNA expression. In this study, we investigated the combined effect of DNA methylation, miRNAs and transcriptional expression in muscle development.

In Qinchuan beef cattle, eight members of the let-7 gene family were sequenced at a high frequency in the muscle tissue (Table S7)63. A previous study also reported that the let-7 family is highly expressed and conserved across species, including mammals, flies, worms and plants64. These data suggest that let-7 miRNAs are some of the most important miRNA regulators of fundamental biological processes. The results showed that expression of bta-miRNA-133 increased in muscle tissue from the fetal to the adult stage. However, the expression levels of bta-miRNA-206 and miRNA-1 did not change between fetal bovine and adult bovine muscle tissues (Table S7). Previous studies have shown that miR-1, miR-133 and miR-206 can target multiple muscle-development-related genes. Specifically, muscle-specific miR-206, which is directly activated by MyoD, can target sequences in the Fstl1 and Utrn genes65. miR-1 promotes myogenesis by targeting HDAC4, a transcriptional repressor of muscle gene expression. In contrast, miR-133 enhances myoblast proliferation by repressing SRF66. Also, miR-1 and miR-206 regulate Pax7 directly in vivo67. Although the bovine-specific target genes of miRNA-1, miRNA-133 and miRNA-206 are not known, their consistent expression pattern and high conservation indicate that they are also likely to play roles in the development of bovine muscle tissues.

In addition, many differentially methylated genes related to muscle development were found in both fetal and adult bovine, including the key modulator of skeletal muscle differentiation, CRABP2 (ref. 68), and well-known genes related to the biosynthesis of myosin (MYL2)69. On the other hand, some well-known genes related to normal growth and development (EEF1A2, TUBA1B and DUSP1) were also observed70,71,72. We believe that the methylation of these genes might partially contribute to the bovine growth difference between the fetal and adult stages. However, the epigenetic effects of these genes on bovine growth still require further study.

The relationship between methylation and gene expression is complex, with high levels of gene expression often associated with low promoter methylation73 but elevated gene body methylation74, and the causal relationships have not yet been determined75. As seen in Figure 9, the position of the methylation in the promoter or gene body may influence gene expression. Methylation in the promoter blocks initiation, but methylation in the gene body does not block and might even stimulate transcription elongation76. While gene-body methylation can be seen to efficiently repress the initiation of intragenic transcription, the vast majority of methylated sites within genes are not associated with intragenic promoters75.

Skeletal muscles are composed of dense and oriented muscle fibers that are bundled into fascicles, which are further bundled together into muscles. Muscle fibers are sheathed by a perimysium that contains nerves and blood vessels77. Skeletal muscle is a heterogeneous tissue that can vary widely with respect to fiber composition, metabolic homeostasis and neural innervation, and it plays a critical role in controlling muscle mass and metabolic characteristics78,79. Skeletal muscle is well known to exhibit a high degree of plasticity depending on genetic background and environmental changes80,81. It was recently shown that skeletal muscle fibres represent one of the most abundant cell types in mammals82. Skeletal muscle growth and energy partitioning in animals are under complex genetic control. Muscle hierarchic architecture and heterogeneous cell composition have not yet been sufficiently investigated by either in vitro or in vivo studies83. As our knowledge in the field of epigenetics becomes more sophisticated, it is becoming appreciated that the regulators of epigenetic states include mediators of DNA methylation, microRNAs and modifiers of histones. Epigenetic regulation at the molecular level is much more varied and complex than was previously envisioned and probably reflects the dynamic interaction of cellular states with their environment, resulting in greater functional heterogeneity84. Taken together, the epigenomic regulation underlying the heterogeneous molecular basis of skeletal muscle at different stages of bovine growth, which contributes to the regulation of muscle growth-related genes, is still unclear. It will be necessary to understand the relationship between molecular networks in bovine muscles during development and their function to provide mechanistic insight into normal growth and development.

There have, however, been studies of regulators of epigenetic states in bovine skeletal muscle. In this study, the differentially methylated genes identified in the comparison between fetal and adult bovine were examined for enrichment of GO terms and biological pathways related to growth and metabolism. In beef cattle, the most represented terms in the biological process category are biological regulation, cellular process, metabolic process and multicellular organismal development. However, metabolic process, cellular metabolic process, catalytic activity and oxidoreductase activity are the major terms found in the biological process category for porcine85. In addition, cellular process, metabolic process and biological regulation, multicellular organismal process and developmental process are the main terms representing the biological process ontological category in amphioxus86. These results indicate that distantly related species are likely to have considerable differences in the general organization for each ontological gene category. In the present study, several important signaling pathways were found, including citrate cycle, axon guidance, tight junction, regulation of actin cytoskeleton, and vitamin digestion and absorption. The citric acid cycle is central to the regulation of energy homeostasis and cell metabolism87. Axon guidance refers to the process by which growing neural axons follow specific, predictable paths to reach their target locations88. Axon guidance represents a key stage during which axons extend to their correct targets during the formation of neuronal networks, and related molecules are thought to be widely expressed and involved in tumor development, angiogenesis and metastasis89. Differential methylation changes in this pathway were used as a focus to identify how epigenetic changes during aging could potentially relate to the well-known loss of skeletal muscle function with increasing age90. 
In addition, our analyses also found enrichment of a pathway related to cell junctions (tight junction). Previous research showed that the tight junction was involved in the regulation of cell growth and differentiation, while the adherens junction could limit cell growth91,92,93. The regulation of actin cytoskeleton participates in many fundamental processes including the regulation of cell shape, motility and adhesion. The remodeling of the actin cytoskeleton is dependent on actin binding proteins, which organize actin filaments into specific structures that allow them to perform various specialized functions94. These results suggest that the specialised contractile and metabolic functions of skeletal muscle depend on a large number of muscle growth-associated genes and proteins with extensive epigenetic modifications and on components that exist in highly complex molecular structures. Therefore, those five pathways were regarded as pathways potentially related to bovine growth at the fetal and adult stages in this study.

We have identified 77 negatively correlated genes in the promoter regions in longissimus dorsi muscle from fetal and adult Qinchuan bovine using deep sequencing technologies. This study expands the repertoire of bovine methylated genes and could initiate further study in the muscle development of cattle. In addition, the methylated genes expression patterns among nine tissues and DNA methylation modification on the promoter (2 kb upstream of a gene's TSS) level in bovine muscle tissue in beef cattle showed that most methylated genes are ubiquitously expressed, suggesting that these methylated genes may play a role in a broad range of biological processes in various tissues.

In our study, the differentially methylated genes identified within or between two-tail samples of the Chinese Qinchuan beef cattle breed in muscle tissues were potentially involved in bovine growth at the fetal and adult stages. We found that a total of 77 negatively correlated genes in the promoter regions between the FB and AB libraries might contribute to the regulation of bovine growth at the fetal and adult stages. In the promoter region, 12 genes were methylation down-regulated and expression up-regulated, and 65 were methylation up-regulated and expression down-regulated in the LDM during the adult bovine stage compared to the fetal period. Thus, more genes showed increased methylation with decreased expression than the reverse in the longissimus dorsi muscle of adult bovine compared to the fetal period. We believe that the differential methylation of these genes might partially contribute to the bovine growth difference between fetal and adult stages. However, the epigenetic effects of these genes on bovine growth still require further study.

Conclusions

Many studies have attempted to understand how DNA methylation and miRNAs regulate the expression of their target genes and many previous exploratory studies have been reported, but all of them focused on the effect of each mechanism on the expression of target genes. This study is the first genome-wide investigation of the combined regulation of gene expression by DNA methylation at the transcriptional level and miRNA regulation at the post-transcriptional level that takes advantage of recent deep-sequencing technologies. We also identified many novel candidate genes that were associated with muscle-related genes that require further experimental validation.

Our study is the first large-scale comparison of the high resolution DNA methylation landscapes for the LDM from fetal and adult Qinchuan cattle. The integrated analysis provided valuable data for future biomedical research and epigenomic and transcriptomic studies of cattle that may help uncover the molecular basis that underlies economic traits in cattle, which can be used to improve the efficiency of artificial selection and will contribute to the improvement of beef production.

Methods

Ethics statement

Animal care and the experiments were conducted according to the guidelines established by the Regulations for the Administration of Affairs Concerning Experimental Animals (Ministry of Science and Technology, China, 2004) and approved by the Institutional Animal Care and Use Committee (College of Animal Science and Technology, Northwest A&F University, China). Pregnant cows, newborns and adult bovine were raised at Shannxi Kingbull Animal Husbandry Co., Ltd. (Baoji, China). The animals were humanely killed as necessary to ameliorate suffering and were not fed the night before they were slaughtered.

Tissue collection

The experimental animals used in this study were from a well-known elite native breed of Chinese Qinchuan cattle. Nine tissue samples including the longissimus dorsi muscle (LDM), heart, liver, spleen, lung, kidney, stomach, small intestine and fat were collected from male individuals for RNA and DNA isolation within ten minutes after slaughter. Fetuses, newborns and adult bovine used for this study came from a common ancestor; the samples were collected from the same generation and the pedigrees of core breeding population animals were traced back three generations. The animals were weaned on average at 6 months of age and raised from weaning to slaughter on a diet of corn and corn silage. The animals were allowed access to feed and water ad libitum, lived under the same normal conditions and were humanely sacrificed as necessary to ameliorate suffering. The tissues were collected from the following three key stages of myogenesis and muscle maturation: fetal bovine group (FB, day 90 fetal bovine), newborn bovine group (NB, 3-day-old) and adult bovine group (AB, 2-year-old). Fetal age was estimated based on crown-rump length95. In each group, all fresh tissue samples from 3 individuals were collected and divided into 1.5 mL plastic centrifuge tubes (each sample weighing approximately 100 mg), snap frozen in liquid nitrogen and stored until RNA and DNA extraction.

Illumina methylated DNA immunoprecipitation sequencing (MeDIP–Seq)

Two DNA libraries were constructed, namely, the FB and AB DNA libraries. To decipher the bovine DNA methylome, we dissected three fetal and three adult bovine LDM from Chinese Qinchuan cattle. DNA from the LDM of the 3 individuals within each group (FB and AB) was mixed in equal amounts to generate the two pooled DNA libraries. We immunoprecipitated sheared genomic DNA with an antibody that specifically recognizes 5′-methylcytosine and sequenced the enriched methylated DNA with Illumina Genome Analyzer II (Illumina, BGI, Shenzhen China). For each sample, we incubated 4 mg of denatured DNA with 32 mg of anti-5-methylcytosine mouse monoclonal antibody (Calbiochem) in 400 mL of IP buffer (10 mM Tris-HCl, pH 7.5, 280 mM NaCl and 1 mM EDTA) at 4°C for 5.5 hours. Two MeDIP DNA libraries were prepared following a previously described protocol40. Ultra-high-throughput 50 bp paired-end sequencing was carried out using the Illumina HiSeq 2000 according to manufacturer's instructions (BGI, Shenzhen, China).

Illumina mRNA sequencing (mRNA–Seq)

Equal amounts of high-quality total RNA from bovine LDM within the FB and AB groups were pooled for the construction and sequencing of two cDNA libraries. Total RNA was isolated from each pooled sample using the Trizol reagent (Takara, Dalian, China) according to the manufacturer's protocol. Total RNA was treated with DNase I (Takara), cleaned with phenol-chloroform and precipitated with ethanol. The RNA quality and quantity were determined using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA). Following qPCR quantification, two cDNA libraries were generated and sequenced on the Illumina Hiseq 2000 machine by the Beijing Genomics Institute (BGI, Shenzhen, China) to generate paired-end 100 bp reads.

Quantitative real–time PCR (qPCR)

For validation and identification of muscle-related genes in Qinchuan cattle, qPCR analysis of mRNA expression was performed in fetal, newborn and adult LDM, heart, liver, spleen, lung, kidney, stomach, small intestine and fat. The same RNA extraction protocol was used for the nine different bovine tissue samples from individuals at the fetal bovine (FB), newborn bovine (NB) and adult bovine (AB) stages. RNase-free DNase I (Takara) was used for removal of genomic DNA from RNA samples used for qPCR analysis. cDNA was synthesised using the oligo (dT) and random 6-mer primers provided in the PrimeScript™ RT Master Mix kit (Takara, Dalian, China). The qPCR was performed using a standard SYBR® Premix Ex Taq™ II (Takara, Dalian, China) on the BioRad CFX96 Real-Time PCR Detection System (Bio-Rad, USA, Hercules, CA) according to the manufacturer's instructions. The primer sequences used for the qPCR are listed in Table S9. All measurements included a negative control (no cDNA template) and each RNA sample was analyzed in triplicate. Bovine ACTB and GAPDH were used as endogenous control genes, and the data were normalized to the geometric mean of the two. The relative expression levels of the target mRNAs were calculated using the 2^(−ΔΔCt) method.
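The 2^(−ΔΔCt) calculation with two reference genes can be illustrated with a minimal sketch. It assumes roughly 100% amplification efficiency, and the function name, the choice of calibrator sample and the Ct values are hypothetical rather than taken from the study. Averaging the reference Ct values arithmetically corresponds to normalizing to the geometric mean of the reference quantities.

```python
from statistics import mean


def relative_expression(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Relative expression of a target gene by the 2^(-ddCt) method.

    ct_target, ct_refs: Ct values in the sample of interest.
    ct_target_cal, ct_refs_cal: Ct values in the calibrator sample.
    The arithmetic mean of reference Ct values is equivalent to the
    geometric mean of the reference quantities.
    """
    d_ct_sample = ct_target - mean(ct_refs)
    d_ct_calibrator = ct_target_cal - mean(ct_refs_cal)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values; the two reference entries stand for ACTB and GAPDH.
fold = relative_expression(24.0, [18.0, 20.0], 26.0, [18.0, 20.0])
print(fold)
```

In practice the triplicate Ct values of each sample would be averaged before this step.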

Illumina small RNA sequencing (small RNA-Seq)

For association analysis of miRNA expression, DNA methylation and mRNA transcriptome expression levels, two miRNA libraries were constructed for use in small RNA high-throughput sequencing. Total RNAs from 3 fetal and 3 adult bovine LDM were pooled in equal amounts to produce two miRNA libraries for the FB and AB groups. Total RNAs were extracted from fetal and adult Chinese Qinchuan bovine LDM and subjected to quality control as described above for the mRNA-Seq and qPCR verification experiments. Equal amounts of high-quality RNA from each tissue were used for cDNA synthesis and sequencing. The bioinformatics pipeline for miRNA discovery was carried out as previously described96. The small RNA fragments (molecules between 10 and 40 nt) were isolated from the low molecular weight RNAs by 15% polyacrylamide gel electrophoresis (PAGE) and ligated with proprietary adaptors at the 5′ and 3′ termini (Illumina). The short RNAs were then converted to cDNA by qPCR and each small RNA library was sequenced individually using the Illumina Hiseq 2000 according to manufacturer's instructions (BGI, Shenzhen, China).

Bisulfite Sequencing Polymerase Chain Reaction (BSP)

Genomic DNA was extracted following standard procedures using a TIANamp Genomic DNA Kit (Tiangen, Beijing, China). Three separate bisulfite modification treatments were performed for each pooled DNA sample. Two micrograms of pooled DNA from the fetal and adult bovine groups was treated with sodium bisulfite using the EZ DNA Methylation Kit (Zymo Research, Irvine, CA, USA) according to the manufacturer's protocol, except that the conversion temperature was changed to 55°C. The modified DNA samples were diluted in 10 μL of distilled water and immediately used in BSP or stored at −80°C until PCR amplification.

To confirm the results from MeDIP-seq, six BSP primers were designed by the online MethPrimer software97, including three genes with upregulated methylation and consequential downregulation of expression (P1–P3) and three genes with downregulated methylation and consequential upregulation of expression (P4–P6). The sequences of the PCR primers used to amplify the targeted products are shown in Table S17. We used hot start DNA polymerase (Zymo Taq ™ Premix, Zymo Research) for BSP. PCR was performed in 50 μL of reaction volume, containing 200 ng/50 μL genomic DNA, 0.3 to 1 μM of each primer, Zymo Taq ™ Premix 25 μL. The PCR was performed with a DNA Engine Thermal Cycler (Bio-Rad) using the following program: 10 min at 95°C, followed by 45 cycles of denaturation for 30 s at 94°C, annealing at prescribed annealing temperature (AT) (Table S17) for 40 s; and primer extension at 72°C for 30 s, with a final extension at 72°C for 7 min. The PCR products were gel purified using a Gel Purification Kit (Sangon, Shanghai, China). The purified fragments were subcloned into the pGEM® T-easy vector (Promega, Madison, WI, USA). Different positive clones for each subject were randomly selected for sequencing (Sangon, Shanghai, China). Three independent amplification experiments were performed for these genes in each sample. We sequenced four clones from each independent set of amplification and cloning; hence, there were 12 clones for each primer set for each sample. The final sequence results were processed by the online software QUMA98.
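The per-gene methylation percentages reported from the clone sequences can be illustrated with a short sketch. It assumes that each clone has already been scored per CpG site, a step performed by QUMA in this study. The string encoding, the function name and the example clones are hypothetical.

```python
def methylation_percent(clones):
    """Percentage of methylated CpG sites across all sequenced clones.

    Each clone is a string with one character per CpG site:
    'M' = methylated (cytosine retained after bisulfite conversion),
    'U' = unmethylated (cytosine read as thymine).
    """
    total = sum(len(clone) for clone in clones)
    methylated = sum(clone.count("M") for clone in clones)
    if total == 0:
        raise ValueError("no CpG sites to score")
    return 100.0 * methylated / total


# Hypothetical data: 12 clones (3 amplifications x 4 clones), 5 CpG sites each.
clones = ["MMMUU"] * 6 + ["MUUUU"] * 6
print(methylation_percent(clones))
```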

Sequencing data analysis

Figure S14 summarizes the workflow used for the sequencing data analysis. Step 1: The raw data were obtained from the Illumina sequencing of the RNA and DNA conducted at BGI in Shenzhen, China. The raw sequencing data were processed by the Illumina base-calling pipeline. Step 2: The reads were filtered to remove those containing adaptors (3′ and 5′ adaptors) and unknown or low-quality bases to reduce the influence of sequencing errors32,33,99,100,101. Step 3: The remaining sequences (clean reads) were mapped to the latest bovine genome assembly (btau4.0) using the program SOAPaligner v2.21 (http://soap.genomics.org.cn) with no more than 2 bp mismatches34. Step 4: The uniquely mapped data were retained for read distribution analysis including the distribution in bovine chromosomes and the distribution in different components of the genome (such as promoters, 5′-UTRs, 3′-UTRs, exons, introns, intergenic regions, CpG islands (CGIs) and repeats). Clean reads can be mapped to the genome in the following three ways: uniquely mapped, multiply mapped and unmapped. All the following analyses were based on uniquely mapped reads. Gene information was downloaded from the public FTP site of Ensembl (ftp://ftp.ensembl.org/pub/release-67/fasta/bos_taurus/) and the region from the transcript start site to the transcript end site was defined as the gene body region. The CGIs were scanned by CpGPlot (https://gcg.gwdg.de/emboss/cpgplot.html) with the following criteria: length exceeding 200 bp, G + C content greater than 50% and a ratio of observed to expected CpG greater than 0.6 (ref. 102). Repeat annotations were obtained from the UCSC database (http://hgdownload.cse.ucsc.edu/goldenPath/bosTau4/database/refGene.txt.gz) and the analysis of read distributions on repeats was carried out by RepeatMasker (http://www.repeatmasker.org/). 
The MeDIP-Seq reads were aligned using Mapping and Assembly with Qualities (MAQ)103 and only the genome-wide methylation peak scanning was conducted using the Model-based Analysis of ChIP-Seq (MACS) V 1.4.2 (http://liulab.dfci.harvard.edu/MACS/)104. Step 5: The number of peaks in different components of the bovine genome was analyzed in our study. Moreover, the number of methylated peaks in the whole genome, called the total peak number, was also analyzed in each sample and a peak overlapping different components was counted only once.
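The three CpG island criteria given above (length, G + C content and observed-to-expected CpG ratio) can be illustrated with a minimal sketch. It evaluates a whole candidate sequence at once, whereas CpGPlot scans the sequence with a sliding window, so this is a simplification rather than a reimplementation of that tool. The expected CpG count is assumed to be (number of C × number of G)/length, and the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) of a sequence."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    observed = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed definition of the expected CpG count: (#C * #G) / length.
    expected = c * g / n if n else 0.0
    obs_exp = observed / expected if expected else 0.0
    return n, gc_fraction, obs_exp


def is_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three thresholds stated in the text to one candidate region."""
    n, gc_fraction, obs_exp = cpg_stats(seq)
    return n > min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


print(is_cpg_island("CG" * 150))  # CpG-rich 300 bp toy sequence
print(is_cpg_island("AT" * 150))  # no C or G at all
```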

Analysis of DNA methylation, mRNA and microRNA expression levels in the two libraries

(1) DNA methylation level analysis

The DNA methylation level was measured by the number of MeDIP reads mapped to the gene body and promoter because methylation in both regions affects gene expression. The formula used to calculate the methylation level is as follows: Methylation level = (Number of unique mapped reads in region × Read length)/Region length, where region represents the gene promoter or body. To avoid false positive methylation results, we removed genes whose coverage was lower than the effective chain depth. The effective chain depth representing the average coverage of MeDIP reads over the whole genome was calculated as follows: Uniquely mapped reads effective chain depth = (Number of uniquely mapped reads × Reads length)/Genome size. The significance threshold of the P value in multiple tests was set based on the false discovery rate (FDR). After multiple test correction, we used P ≤ 0.01 and coverage changes greater than 2-fold (log2Ratio ≥ 1) as the threshold to judge the significance of differentially methylated genes.
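The two coverage formulas and the filtering rule can be written out as a minimal sketch. The function names and the numbers in the example are hypothetical. Treating the fold-change criterion as symmetric (an absolute log2 ratio of at least 1) is an assumption, and the corrected P value is taken as an input because the statistical test itself is not specified here.

```python
import math


def methylation_level(unique_reads_in_region, read_length, region_length):
    """(Number of uniquely mapped reads in region x read length) / region length."""
    return unique_reads_in_region * read_length / region_length


def effective_depth(total_unique_reads, read_length, genome_size):
    """Average coverage of uniquely mapped MeDIP reads over the whole genome."""
    return total_unique_reads * read_length / genome_size


def passes_coverage_filter(level, depth):
    """Genes with coverage below the effective depth are removed."""
    return level >= depth


def is_differential(level_fb, level_ab, p_corrected, min_abs_log2=1.0, max_p=0.01):
    """Assumed symmetric criterion: |log2(AB/FB)| >= 1 and corrected P <= 0.01.

    Both levels must be greater than zero.
    """
    log2_ratio = math.log2(level_ab / level_fb)
    return abs(log2_ratio) >= min_abs_log2 and p_corrected <= max_p


# Hypothetical numbers: a 2 kb promoter covered by 80 unique 50 bp reads.
level = methylation_level(80, 50, 2000)
depth = effective_depth(40_000_000, 50, 2_500_000_000)
print(level, depth, passes_coverage_filter(level, depth))
```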

(2) Gene expression level analysis

The gene expression level was normalized by considering the RPKM (Reads per Kb per Million reads) value32, which was calculated based on the number of reads uniquely mapped to the genome. Formula (1) is as follows:

$$\mathrm{RPKM}(A) = \frac{10^{6}\,C}{N L / 10^{3}} \qquad (1)$$

where RPKM (A) is the expression of gene A, C is the number of reads that uniquely aligned to gene A, N is the total number of reads that uniquely aligned to all genes and L is the number of bases in the CDS of gene A. The RPKM method is able to eliminate the influence of gene lengths and sequencing discrepancies on the calculation of gene expression. Therefore, the calculated gene expression can be directly used for comparing the differences in gene expression among samples.
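The RPKM normalization can be illustrated with a minimal sketch that uses the symbols C, N and L as defined in the text. N is taken here as the total number of uniquely aligned reads, and the function name and example values are hypothetical.

```python
def rpkm(c, n, l):
    """Reads per kilobase per million mapped reads.

    c: reads uniquely aligned to the gene
    n: total number of uniquely aligned reads (assumed meaning of N)
    l: number of bases in the CDS of the gene
    """
    return (1e6 * c) / (n * l / 1e3)


# Hypothetical example: 500 reads on a 2 kb CDS, 10 million unique reads in total.
print(rpkm(500, 10_000_000, 2000))
```

Doubling the gene length or the sequencing depth while keeping C fixed halves the value, which is the length and depth correction described above.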

Differentially expressed genes and their corresponding P values were calculated based on the normalized expression105. The significance threshold of the P value in multiple tests was set based on the FDR. The fold changes (log2Ratio) were also estimated based on the normalized gene expression level in each sample. The differentially expressed genes were selected based on the expression profiles and the following criteria: the change in gene expression levels between FB and AB was at least 2-fold (log2Ratio ≥ 1) and the FDR was less than or equal to 0.001 (FDR ≤ 0.001).

(3) MicroRNA expression level analysis

To identify differentially expressed miRNAs, the expression levels of the miRNAs in the two samples (fetal and adult bovine LDM) were normalized to transcripts of the miRNA per million total miRNA transcripts. In cases where the number of transcripts of an miRNA was 0 in one of the two libraries, the 0 was changed to 0.01 for the comparative analysis; if the number of transcripts of an miRNA was lower than 1 in both libraries after normalization, the miRNA was discarded from the comparative analysis.

The fold-change and P value for each miRNA were calculated from the normalized expression in two steps.

Step 1: Normalize the expression of each miRNA in the two samples (fetal and adult bovine LDM) to transcripts per million: Normalized expression (NE) = (Actual miRNA count / Total count of clean reads) × 10⁶.

Step 2: Calculate the fold-change (log2Ratio) and P value from the normalized expression: Fold-change = log2(adult NE / fetal NE).
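The normalization and fold-change steps, together with the zero-replacement and low-expression rules stated earlier, can be sketched as below. The function names and read counts are hypothetical, and the order in which the two rules are applied (discard check first, then replacement of 0 by 0.01) is an assumption.

```python
import math
from typing import Optional


def normalized_expression(mirna_count: int, total_clean_reads: int) -> float:
    """NE = actual miRNA count / total count of clean reads * 10^6."""
    return mirna_count / total_clean_reads * 1e6


def mirna_log2_ratio(
    fetal_count: int, fetal_total: int, adult_count: int, adult_total: int
) -> Optional[float]:
    """Return log2(adult NE / fetal NE), or None if the miRNA is discarded."""
    fetal_ne = normalized_expression(fetal_count, fetal_total)
    adult_ne = normalized_expression(adult_count, adult_total)
    # Discard miRNAs whose normalized expression is below 1 in both libraries.
    if fetal_ne < 1 and adult_ne < 1:
        return None
    # Replace a zero in one library by 0.01 so the ratio is defined.
    if fetal_ne == 0:
        fetal_ne = 0.01
    if adult_ne == 0:
        adult_ne = 0.01
    return math.log2(adult_ne / fetal_ne)


# Hypothetical libraries of 10 million clean reads each.
print(mirna_log2_ratio(100, 10_000_000, 400, 10_000_000))  # about 2 (4-fold up)
print(mirna_log2_ratio(0, 10_000_000, 5, 10_000_000))      # None (discarded)
```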

P value formula (2):

$$p(y \mid x) = \left(\frac{N_2}{N_1}\right)^{y} \frac{(x+y)!}{x!\,y!\,\left(1+\frac{N_2}{N_1}\right)^{(x+y+1)}} \qquad (2)$$

where N1 and x are the total number of clean reads and the normalized expression level of a given miRNA in the fetal small RNA library, and N2 and y are the corresponding values in the adult library. After multiple test correction, we used P ≤ 0.05 and |log2Ratio| > 1 as the thresholds for significant differences in miRNA expression.
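For illustration only, the sketch below assumes that formula (2) is the Audic and Claverie conditional probability p(y | x) = (N2/N1)^y (x+y)! / (x! y! (1 + N2/N1)^(x+y+1)), which is commonly used to compare counts between two libraries. That identification is an assumption, as are the function names, the restriction to integer counts and the choice of the smaller one-sided tail as the P value.

```python
import math


def ac_prob(x: int, y: int, n1: int, n2: int) -> float:
    """p(y | x): probability of y in library 2 given x in library 1."""
    r = n2 / n1
    # Work in log space so that large factorials do not overflow.
    log_p = (
        y * math.log(r)
        + math.lgamma(x + y + 1)
        - math.lgamma(x + 1)
        - math.lgamma(y + 1)
        - (x + y + 1) * math.log(1 + r)
    )
    return math.exp(log_p)


def ac_pvalue(x: int, y: int, n1: int, n2: int) -> float:
    """Smaller of the two cumulative tails P(Y <= y | x) and P(Y >= y | x)."""
    lower = sum(ac_prob(x, k, n1, n2) for k in range(y + 1))
    upper = 1.0 - (lower - ac_prob(x, y, n1, n2))
    # Clamp to [0, 1] to absorb floating-point rounding in the subtraction.
    return min(1.0, max(0.0, min(lower, upper)))


# Hypothetical counts in two libraries of equal size.
print(ac_pvalue(10, 10, 1_000_000, 1_000_000))   # similar counts, large P
print(ac_pvalue(10, 100, 1_000_000, 1_000_000))  # 10-fold difference, tiny P
```

The resulting P values would still need the multiple test correction mentioned in the text before applying the P ≤ 0.05 threshold.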

Gene ontology (GO) annotation and the KEGG pathway

To further investigate the biological processes and biological functions associated with the differentially expressed genes, we performed gene ontology (GO) analysis and KEGG pathway analysis. Genes exhibiting more than 2-fold expression changes in different samples were analyzed for GO and KEGG pathway enrichment using the DAVID functional annotation tool (http://david.abcc.ncifcrf.gov/)106. The differentially expressed genes were classified into categories by cellular component, molecular function and biological process using GO annotation33,37. A hypergeometric test was applied to map all differentially expressed genes to terms in the GO database (http://www.geneontology.org/) and search for significantly enriched GO terms in differentially expressed genes compared to the genome background. The test formula (3) is as follows:

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}} \qquad (3)$$

where N is the number of all genes with GO annotations, n is the number of negatively correlated genes in N, M is the number of all genes annotated to a given GO term and m is the number of negatively correlated genes in M. The calculated P values were corrected using the Bonferroni correction, and a corrected P ≤ 0.05 was taken as the significance threshold.
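A minimal sketch of the hypergeometric enrichment test with Bonferroni correction follows, assuming the standard upper-tail form P = 1 - Σ_{i=0}^{m-1} C(M,i) C(N-M,n-i) / C(N,n). The function names and the toy numbers are hypothetical; exact integer arithmetic is used so that the subtraction from 1 loses no precision.

```python
from math import comb
from typing import List


def enrichment_pvalue(N: int, n: int, M: int, m: int) -> float:
    """P(at least m of the n selected genes carry the term).

    N -- all annotated genes; n -- selected genes among N
    M -- genes annotated to the term; m -- selected genes among M
    """
    total = comb(N, n)
    # Sum of the hypergeometric terms for i = 0 .. m-1, kept as exact integers.
    below = sum(comb(M, i) * comb(N - M, n - i) for i in range(m))
    return (total - below) / total


def bonferroni(pvalues: List[float]) -> List[float]:
    """Multiply each P value by the number of tests, capped at 1."""
    k = len(pvalues)
    return [min(1.0, p * k) for p in pvalues]


# Toy example: 10 annotated genes, 3 selected, 4 in the term, 2 of them selected.
p = enrichment_pvalue(N=10, n=3, M=4, m=2)
print(p)
print(bonferroni([p, 0.001, 0.5]))
```

A term would then be reported as enriched when its corrected P value is at or below 0.05.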

KEGG is the major public pathway-related database. Genes usually cooperate with one another to carry out their biological functions, so pathway-based analysis helps to clarify those functions. Pathway enrichment analysis identifies metabolic or signal transduction pathways that are significantly enriched in negatively correlated genes relative to the whole-genome background. The calculation formula is the same as that used for the GO analysis, where N is the number of all genes that have KEGG annotations, n is the number of negatively correlated genes in N, M is the number of all genes annotated to a specific pathway and m is the number of negatively correlated genes in M. The Q value is the FDR analog of the P value: the Q value of an individual hypothesis test is the minimum FDR at which the test may be called significant107,108. The calculated P values were corrected using the Bonferroni correction, and pathways with Q ≤ 0.05 were considered significantly enriched in differentially expressed genes.
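To make the idea of a Q value concrete, the sketch below computes Benjamini-Hochberg adjusted P values, which give, for each test, the smallest FDR level at which that test would be called significant under the BH procedure. This is a swapped-in, commonly used stand-in and not necessarily the Q value estimator of the cited references; the function name and the input P values are hypothetical.

```python
from typing import List


def bh_qvalues(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical pathway P values.
pathway_p = [0.001, 0.5, 0.02, 0.03]
q_values = bh_qvalues(pathway_p)
enriched = [i for i, q in enumerate(q_values) if q <= 0.05]
print(q_values)
print(enriched)
```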

Additional Information

Accession codes The high-throughput sequencing data have been deposited in NCBI's Gene Expression Omnibus (GEO) under GEO series accession numbers GSE46504 (MeDIP-seq data).