Introduction

Sheep colibacillosis is one of the most common bacterial diseases on large-scale sheep farms. The disease is traditionally controlled by antibiotic therapy, although this approach has several disadvantages. Detecting the expression of antagonistic genes in sheep colibacillosis provides information that may help elucidate the molecular mechanism underlying resistance to Escherichia coli. In 1966, Orskov et al.1 first reported the porcine E. coli adhesion antigen K88, which is an episome-determined antigen2. In addition, the morphology of the pilus situated on the surface of the bacteria has been examined by electron microscopy. To date, numerous animal-derived enterotoxigenic Escherichia coli (ETEC) pili have been identified, including K88, K99, 987P, F17 and F41, which are all vital virulence factors. Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs longer than 200 nucleotides. Several studies have revealed that lncRNAs are closely related to the development of human tumours, cardiovascular diseases, and metabolic diseases. Furthermore, there is growing evidence that lncRNAs play a significant regulatory role in anti-viral and other natural immune responses3,4,5,6,7. However, research on the function of lncRNAs in sheep, which are less well studied than humans, has mainly focused on the regulation of muscle growth, testicular development, hair follicle development, and other traits8,9,10. A few studies have recently focused on disease prevention and control in sheep11,12, including disease resistance, yet our understanding of the underlying molecular mechanism is limited. In the present study, the expression levels of lncRNAs in the spleens of two sheep phenotypes, antagonistic or sensitive to pili of E. coli F17, were assessed by RNA-seq, and key lncRNAs were investigated by target prediction and functional annotation analysis based on the cis mechanism, in which non-coding RNAs activate transcription and control the expression of adjacent mRNAs. The results were further verified by q-PCR. Our results provide a deeper understanding of sheep antagonism to E. coli F17 in terms of lncRNAs, facilitate the identification of additional functional genes that are antagonistic to E. coli F17, and provide a theoretical basis for solving key issues related to breeding indigenous disease-resistant sheep in China.

Results

Histological observation and comparison of the number of intestinal bacteria in the antagonistic and sensitive groups

Based on faecal morphology13, the experimental subjects were divided into the antagonistic group (12p, 13p, 14p) and the sensitive group (15p, 16p, 17p). In the sensitive group, the bacterial count ranged from 4.7 × 10^8 to 1.9 × 10^9, with a mean of 1.22 × 10^9, whereas in the antagonistic group it ranged from 5.1 × 10^6 to 9.0 × 10^7, with a mean of 3.37 × 10^7 (Table 1). Compared to the antagonistic group, the number of intestinal bacteria in the sensitive group was significantly higher (P < 0.05, Fig. 1). The jejunum mucosal tissues of lambs in the sensitive group were damaged, dull in colour, and lysed, and large lacunae could be observed at the submucosal layer. The intestinal villi had disintegrated and were highly vascularized (i.e., capillaries), and the intestinal mucosa showed severe damage, thus making it difficult to prepare histological sections for assessment (Fig. 2).

Table 1 Comparison of the number of intestinal bacteria between the antagonistic and sensitive lambs.
Figure 1

Difference analysis of the intestinal bacteria in antagonistic and sensitive lambs.

Figure 2

Histological assessment of the jejunum of the antagonistic (A1, A2) and sensitive (B1, B2) groups. (A1 and B1) are 200x microscope observations, (A2 and B2) are 400x microscope observations.

Summary of RNA sequencing of spleens in sheep

cDNA libraries of the lamb spleens from the antagonistic and sensitive groups were constructed. Sequencing was performed using the Illumina HiSeq 2500 platform. The antagonistic and sensitive groups generated a total of 354,943,820 and 370,616,990 raw reads, with GC contents of 48.33% and 49.67%, respectively. The valid reads among the clean reads were mapped to the O. aries v4.0 reference genome, and more than 73.5% of the reads were mapped to the genome. Fewer than 4.5% of the reads mapped to multiple locations of the reference sequence, and more than 70% of the reads were uniquely mapped. Approximately 35% of the reads mapped to the positive and negative strands of the genome. In addition, annotation analysis showed that the number of reads mapped to exonic regions (~60%) was higher than the number mapped to intergenic and intronic regions. These results indicate that the matching efficiency of our de novo assembly is high, and most reads mapped to the exonic region (Table 2).

Table 2 Read statistics of the reference genome.

Identification of transcripts in sheep spleens

After mapping to the reference sequence, we identified 1,988 lncRNAs and 38,843 mRNAs from 42,460 compiled transcripts. The length of the lncRNAs was mainly distributed within the range of 200 bp to 5,000 bp (Fig. 3a), and the average length was 2,124 bp. Additionally, the lncRNA types mainly include intergenic lncRNAs (character u) (Fig. 3b) and intronic lncRNAs (character i), containing 2 to 3 exons (Fig. 3c).

Figure 3

Summary of the lengths, types, and number of exons of the predicted lncRNAs.

Analysis and validation of DE transcripts

The expression levels of lncRNA and mRNA transcripts were estimated using the FPKM values. We found that the expression level of the lncRNA transcripts was relatively low (Fig. 4). A total of 14 upregulated and 20 downregulated DE lncRNAs and 370 upregulated and 333 downregulated DE mRNAs were screened under conditions of P < 0.05 and |log2 (fold change)| > 1 (Fig. 5). To further verify the reliability of our RNA-seq data, a total of 12 DE lncRNAs and DE mRNAs were randomly selected. Their relative expression levels in the antagonistic and sensitive lambs were confirmed by q-PCR (Fig. 6) and were found to coincide with our RNA-seq results (Fig. 7), thus indicating that our RNA-seq data were reliable. Our analyses also showed that high-throughput sequencing has the advantage of detecting genes that are expressed at very low levels (0 < FPKM < 1).
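The screening thresholds stated above (P < 0.05 and |log2 (fold change)| > 1) can be illustrated with a minimal sketch. The record layout, the transcript names, and the choice of which group forms the numerator of the fold change are assumptions made for illustration only; they are not taken from the study's pipeline.

```python
import math

def screen_de(records, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split transcripts into up- and downregulated lists.

    Each record is (transcript_id, fold_change, p_value). fold_change is a
    positive expression ratio between the two groups; which group is the
    numerator is an assumption of this sketch.
    """
    up, down = [], []
    for tid, fold_change, p in records:
        lfc = math.log2(fold_change)
        # Both criteria must hold: significance and effect size.
        if p < p_cutoff and abs(lfc) > lfc_cutoff:
            (up if lfc > 0 else down).append(tid)
    return up, down

# Hypothetical example values, not data from the study.
records = [
    ("lnc_A", 4.0, 0.01),   # log2FC = 2, significant: upregulated
    ("lnc_B", 0.2, 0.03),   # log2FC about -2.32, significant: downregulated
    ("lnc_C", 1.5, 0.001),  # significant, but fold change too small
    ("lnc_D", 8.0, 0.20),   # large fold change, but not significant
]
up, down = screen_de(records)
```

Note that a fold change of exactly 2 (log2 = 1) does not pass, because the threshold is a strict inequality.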

Figure 4

Expression patterns of lncRNA and mRNA transcripts. The Box-whisker Plot consists of five statistics: the minimum, the first quartile (25%), the median (50%), the third quartile (75%), and the maximum.

Figure 5

Differentially expressed lncRNAs and mRNAs between antagonistic and sensitive lambs. Color indicates the amount of expression of the gene. The darker the color, the greater the expression (red is up-regulated, green is down-regulated). Each row represents the expression of each gene in different samples, and each column represents the expression of all genes in each sample. The top tree shows the results of cluster analysis of different samples from different experimental groups, and the left tree shows the results of cluster analysis of different genes from different samples.

Figure 6

Relative expression levels of DE lncRNAs and mRNAs between antagonistic and sensitive lambs were confirmed by q-PCR. Note: "**" means highly significant correlation; "*" means significant correlation; "ns" or no superscript means no significant correlation. The same as below.

Figure 7

RNA-seq results of DE lncRNAs and mRNAs.

GO and KEGG Pathway enrichment analysis of DE lncRNAs

From the correspondence between lncRNA names and predicted function terms (Supplementary 1 and Supplementary 2), we selected the top 500 predictive relationships with the highest predictive reliability (sorted by p-value). The frequency of each function and the number of GO (or pathway) terms with function annotations were analysed to reflect the difference in the distribution of lncRNAs (Fig. 8).

Figure 8

Gene Ontology and KEGG Pathway enrichment analyses of DE lncRNAs.

Comparison of the DE lncRNAs with the entries in the GO database showed that a total of 34 lncRNAs could be annotated and classified into 302 functional subclasses. The number of DE lncRNAs in the top 30 functional subclasses is shown in Fig. 8. The number of DE lncRNAs was higher in the categories of protein binding (GO:0005515), nucleus (GO:0005634), poly(A) RNA-binding (GO:0044822), cytoplasm (GO:0005737), tissue remodelling (GO:0048771), regulation of endopeptidase activity (GO:0052548), 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase complex (GO:0043540), phosphatidylinositol phosphorylation (GO:0046854), fructose-2,6-bisphosphate 2-phosphatase activity (GO:0004331), and calcium-dependent phospholipase C activity (GO:0050429).

Comparison between the DE lncRNAs and the entries in the KEGG Pathway database indicated that a total of 34 lncRNAs could be annotated and classified into 149 KEGG pathways. The number of DE lncRNAs in the top 30 KEGG Pathways is shown in Fig. 8. The number of DE lncRNAs was relatively higher in the thyroid hormone signalling pathway (path:ko04919), spliceosome (path:ko03040), leukocyte transendothelial migration (path:ko04670), neurotrophin signalling pathway (path:ko04722), lysosome (path:ko04142), MAPK signalling pathway-yeast (path:ko04011), sphingolipid signalling pathway (path:ko04071), phagosome (path:ko04145), and oxidative phosphorylation (path:ko00190) categories.

LncRNAs and their adjacent coding genes

We searched for all the coding genes within the 100-kb flanking regions of the lncRNAs, and genes that were significantly co-expressed with the lncRNAs as indicated by Pearson correlation calculations were identified. These co-expressed genes that were adjacent to the lncRNAs were presumed to be regulated by lncRNAs. On this basis, we identified 6 genes that may be regulated by their associated lncRNAs (Table 3).

Table 3 DE lncRNAs and their co-expressed genes.

Discussion

Due to the rapid development of transcriptome analysis, lncRNAs have recently been considered as novel modulators of cell development14. The lncRNAs that we have identified in this study are mainly associated with cancer, such as prostate cancer15, gastric cancer16, lung cancer17 and breast cancer18, as well as reproduction19,20,21,22. However, investigations of lamb diarrhoea in relation to lncRNAs are limited. The Hu sheep is a unique breed with high fecundity and strong adaptability to warm-wet climates and can be kept indoors all year. This study has provided the first overview of lncRNAs in relation to diarrhoea in sheep, as well as an investigation into their possible roles in disease resistance.

Major economic losses in sheep farms are often due to diarrhoea. Because this study focused on the immune status of lambs and the spleen is the largest immune organ, we selected the spleen as the research object. We found that the expression level of lncRNAs was lower than that of mRNAs in lamb spleens (Fig. 4), which was in agreement with the results from sheep testicular tissues9, and the average lengths of lncRNAs and mRNAs in sheep were longer than those in pigs (1,713 bp and 1,983 bp, respectively)20. We searched for all the coding genes within the 100-kb flanking regions of the lncRNAs and identified intersecting genes that were significantly co-expressed with the lncRNAs as indicated by Pearson correlation calculations23. We also identified the following 6 genes as being co-expressed with the lncRNAs: myosin IG (MYO1G), translocase of inner mitochondrial membrane 29 (TIMM29), co-activator associated arginine methyltransferase 1 (CARM1), adhesion G protein-coupled receptor B1 (ADGRB1), septin 4 (SEPT4), and desumoylating isopeptidase 2 (DESI2).

MYO1G plays an important role in maintaining cell stiffness in B-cell lymphocytes. The deletion of the myo1g gene results in a reduction in cell stiffness, which in turn affects cell adhesion, proliferation, phagocytosis, and endocytosis in B-cell lymphocytes24. Investigations of TIMM29 are limited. TIMM29 has been identified as the first specific component of the mammalian TIMM22 protein complex and plays an important role in the assembly of the TIMM23 protein25,26. CARM1, a member of the protein arginine methyltransferase (PRMT) family, is an enzyme with a highly conserved domain with methyltransferase activity. CARM1 knockout mice die at birth27, indicating that CARM1 is essential to postnatal survival. It was later discovered that CARM1 inhibition promotes HIV-1 activation28. ADGRB1 is a member of the transmembrane protein-adhesion G protein coupled receptor (aGPCR) family, which is characterized by a conserved GAIN domain that has autologous proteolytic activity that can cleave the receptor near the first transmembrane domain. Studies have shown that the new N-terminal stalk, which is revealed by GAIN domain cleavage, can directly activate aGPCRs as a tethered agonist29. Septins are a highly conserved cytoskeletal family with GTPase activity. The tumour suppressor SEPT4 is a member of the septin family that can induce cancer cell apoptosis30. Mutations in the SEPT4 gene in mice can lead to disorders involving the annuli of spermatozoa31 and adjacent cortical structures, thereby causing low sperm motility, ultimately leading to infertility32. DESI2 is a pro-apoptotic gene; in vitro experiments have shown that its overexpression induces apoptosis in pancreatic cancer and other tumour cells, which can effectively inhibit the proliferation of some cancer cells. Gene therapy using DESI2 and IP10 significantly inhibits tumour growth and effectively prolongs the survival of tumour-bearing mice33,34.

A total of 703 mRNAs and 34 known lncRNAs were differentially expressed between the antagonistic and sensitive groups, including 14 upregulated lncRNAs and 20 downregulated lncRNAs. In addition, the present study identified 1,942 novel lncRNAs. We searched for all the coding genes within the 100-kb flanking regions of the lncRNAs and identified genes shared between the two experimental groups that were significantly co-expressed with lncRNAs as indicated by Pearson correlation calculations. We have determined that 6 genes may be regulated by their associated lncRNAs. To validate our RNA-seq results, q-PCR was performed to verify the expression levels of the 12 known lncRNAs and mRNAs. The final results coincided with our RNA-seq data.

GO is a bioinformatics tool that has been extensively utilized in studying the relationships among gene functions. GO analysis indicated that 16 out of the 34 DE lncRNAs were enriched in the protein binding (GO:0005515) category. Moreover, KEGG Pathway analysis showed that the sphingolipid signalling (path:ko04071), axon guidance (path:ko04360), and glycosylphosphatidylinositol (GPI)-anchor biosynthesis (path:ko00563) pathways may be important KEGG pathways of genes co-expressed with DE lncRNAs, and the related lncRNAs may be potentially involved in fimbriae adhesion to the intestinal mucosa. However, the role of these pathways in disease resistance remains largely unknown.

In this study, the expression profiles of lncRNAs in the spleens of lambs that were antagonistic or sensitive to diarrhoea were investigated to further understand the regulation of lncRNAs in sheep disease resistance. Several differentially expressed lncRNAs in lamb spleens between the antagonistic and sensitive groups were identified, and we found that 6 genes (MYO1G, TIMM29, CARM1, ADGRB1, SEPT4, and DESI2) may be regulated by their associated lncRNA. Our study may help elucidate the mechanism underlying resistance to diarrhoea in lambs. Further investigations of these sheep lncRNAs in relation to diarrhoea-resistance are warranted.

Methods

Ethics statement

The Institutional Animal Care and Use Committee (IACUC) of the government of Jiangsu Province (Permit Number 45) and the Ministry of Agriculture of China (Permit Number 39) approved the animal study proposal. All experimental procedures were conducted in strict compliance with the recommendations of the Guide for the Care and Use of Laboratory Animals of Jiangsu Province and of the Animal Care and Use Committee of the Chinese Ministry of Agriculture. All efforts were made to minimize animal suffering.

Experimental design and sample collection

The experimental sheep were purchased from Jiangsu Xilaiyuan Ecological Agriculture Co., Ltd. in December 2016. A total of 18 three-day-old lambs showing normal growth and approximately similar weight were randomly selected, and all the sheep were raised with segregation. To ensure their dietary requirements, the sheep were fed with 10% lamb milk powder (Table 4) prior to the experiment. Five-day-old lambs were fed 12.5% lamb milk powder and E. coli F17 bacteria liquid [4.6 × 10^8 colony-forming units (CFUs)·mL^-1], and had ad libitum access to drinking water. Stool features of the experimental lambs were recorded daily (Table 5). Lambs that exhibited diarrhoea for two days were classified as antagonistic and sensitive and were euthanized. The intestinal tissues were collected in 4% paraformaldehyde. The liver, spleen, duodenum, jejunum, and ileum of each lamb were collected and immediately frozen in liquid nitrogen until RNA extraction.

Table 4 Lamb Milk Powder Ingredients List (per 100 grams of milk powder).
Table 5 Bristol Stool Form Scale13.

HE staining

The jejunum tissue was washed with 0.9% normal saline and fixed in 4% paraformaldehyde for 48 h at room temperature and then used in histological analysis. Next, 7 μm-thick sections were stained with haematoxylin-eosin and the morphology of the jejunum epithelia was assessed under a microscope.

Library construction and sequencing

RNA was extracted from the spleen of three individuals from each group. A NanoDrop 2000 spectrophotometer and an Agilent 2100 Bioanalyzer were used for quality control of the extracted total RNAs (Annex 1). Ribosomal RNA was removed using a Ribo-Zero (TM) kit (Epicenter, Madison, WI, USA). Short fragments (approximately 200 bp in length) were obtained and used as templates for first-strand cDNA synthesis. Second-strand cDNA synthesis was performed using a buffer, dNTPs, RNase H, and DNA polymerase I. After PCR amplification and purification using the Qubit® dsDNA HS Assay Kit, the cDNA library was constructed using an NEBNext® Ultra™ RNA Library Preparation Kit. The cDNA library was sequenced on the Illumina HiSeq 2500 platform at Shanghai OE Biomedical Technology Co. (sequencing read length: 150 bp).

Identification of lncRNAs and mRNAs

The raw data were filtered to eliminate low-quality reads. Clean reads mapped to the reference genome (Ovis aries v4.0) were selected for de novo assembly. Coding and non-coding RNA candidates from the unknown transcripts were categorized using four coding potential analysis methods, CPC35, CNCI36, Pfam37, and PLEK38. The minimum length and the number of exons were set as thresholds, thereby filtering putative encoded RNAs, and transcripts containing two exons and longer than 200 nt were selected as candidate lncRNAs. Different types of lncRNAs were classified by cuffcompare, including intergenic lncRNAs (character u), intronic lncRNAs (character i), anti-sense lncRNAs (character x), and sense-overlapping lncRNAs (character o).
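The candidate selection described above can be sketched as a simple filter. This is only an illustration under stated assumptions: "two exons" is read here as "at least two exons", a transcript is kept only if none of the four coding-potential tools calls it coding, and the `Transcript` record and its field names are hypothetical rather than part of any of the cited tools.

```python
from dataclasses import dataclass

# cuffcompare class codes named in the text: intergenic (u), intronic (i),
# anti-sense (x), and sense-overlapping (o).
LNCRNA_CLASS_CODES = {"u", "i", "x", "o"}

@dataclass
class Transcript:
    tid: str            # transcript identifier (hypothetical)
    length_nt: int      # transcript length in nucleotides
    n_exons: int        # number of exons
    class_code: str     # cuffcompare class code
    coding_calls: int   # how many of the four tools flagged it as coding

def is_candidate_lncrna(t, min_length=200, min_exons=2):
    """Apply the length, exon-count, class-code, and coding-potential filters."""
    return (
        t.length_nt > min_length
        and t.n_exons >= min_exons          # assumption: "at least two exons"
        and t.class_code in LNCRNA_CLASS_CODES
        and t.coding_calls == 0             # assumption: all four tools agree
    )

# Hypothetical transcripts, not data from the study.
transcripts = [
    Transcript("T1", 2124, 2, "u", 0),   # passes every filter
    Transcript("T2", 180, 3, "i", 0),    # too short
    Transcript("T3", 900, 1, "u", 0),    # single exon
    Transcript("T4", 1500, 2, "x", 2),   # two tools predict coding potential
    Transcript("T5", 3000, 4, "=", 0),   # class code not in the lncRNA set
]
candidates = [t.tid for t in transcripts if is_candidate_lncrna(t)]
```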

Differential expression analysis

Because the fragments per kb per million reads (FPKM) method39 accounts for the effects of both sequencing depth and transcript length on the number of fragments, the FPKM value was used to estimate the expression levels of lncRNA and mRNA transcripts. We used DESeq40 to determine the number of DE genes and the FPKM values between the two groups. When RNA-seq data were used to compare the expression levels of the same transcript between the two samples, two criteria were applied: 1) fold change, which is the change in the expression level of the same transcript between the two samples; and 2) the p value or false discovery rate (FDR) (adjusted p-value). The FDR error control method was used to correct the p-values for multiple hypothesis testing41.
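As a worked illustration of the two quantities above, the sketch below computes FPKM from its standard definition and adjusts p-values for FDR. The text does not name the FDR procedure, so the Benjamini-Hochberg step-up method is used here as an assumption; the function names and input values are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical inputs: 500 fragments on a 2,000-bp transcript in a library
# of 25 million mapped fragments.
expression = fpkm(500, 2000, 25_000_000)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

The division by length and library size is what makes values comparable across transcripts of different lengths and across samples sequenced to different depths.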

GO and KEGG pathway analyses

After screening for differentially expressed transcripts, functional annotation was performed using GO enrichment analysis. The number of transcripts in each GO term was counted, and Fisher's exact test was used to assess statistical significance (p < 0.05). KEGG42, the main public database used in pathway analysis, was used to identify enriched pathways among the differentially expressed transcripts, again with Fisher's exact test to assess statistical significance (p < 0.05).
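The enrichment test can be sketched with the standard library alone. The sketch assumes a one-sided (over-representation) Fisher's exact test, which equals the upper tail of the hypergeometric distribution; the text does not state whether a one- or two-sided test was used, and the counts below are hypothetical.

```python
from math import comb

def enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test p-value for over-representation.

    N: annotated transcripts in the background
    K: background transcripts belonging to the term
    n: differentially expressed transcripts
    k: differentially expressed transcripts belonging to the term
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    total = comb(N, n)
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total

# Hypothetical counts: 3 of 4 DE transcripts fall in a term that covers
# 5 of 20 background transcripts.
p = enrichment_p(k=3, n=4, K=5, N=20)
is_enriched = p < 0.05
```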

Prediction of the target genes of DE lncRNAs

The target genes of DE lncRNAs were predicted by calculating the Pearson correlation coefficients and P values among multiple genes. The thresholds |correlation| ≥ 0.7 and P ≤ 0.05 were used to filter the transcripts43. The DE lncRNAs associated with adhesion were selected, and the target genes of all DE lncRNAs were predicted by cis-acting44.
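A minimal sketch of the cis prediction follows, combining the 100-kb flanking window mentioned in the Results with the |correlation| ≥ 0.7 threshold. The P ≤ 0.05 filter is deliberately left out to keep the example dependency-free, and the dictionary layout, gene names, and coordinates are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def within_flank(lnc, gene, flank=100_000):
    """True if the gene overlaps the lncRNA locus extended by `flank` bp."""
    return (
        lnc["chrom"] == gene["chrom"]
        and gene["start"] <= lnc["end"] + flank
        and gene["end"] >= lnc["start"] - flank
    )

def cis_targets(lnc, genes, r_cutoff=0.7):
    """Names of nearby genes whose expression correlates with the lncRNA."""
    return [
        g["name"]
        for g in genes
        if within_flank(lnc, g)
        and abs(pearson_r(lnc["expr"], g["expr"])) >= r_cutoff
    ]

# Hypothetical loci with expression across six samples.
lnc = {"chrom": "chr1", "start": 500_000, "end": 502_000,
       "expr": [1, 2, 3, 4, 5, 6]}
genes = [
    {"name": "gene_near", "chrom": "chr1", "start": 550_000, "end": 560_000,
     "expr": [2, 4, 6, 8, 10, 12]},    # nearby and co-expressed
    {"name": "gene_far", "chrom": "chr1", "start": 900_000, "end": 910_000,
     "expr": [2, 4, 6, 8, 10, 12]},    # co-expressed but outside the window
    {"name": "gene_uncorr", "chrom": "chr1", "start": 450_000, "end": 460_000,
     "expr": [3, 1, 3, 1, 3, 1]},      # nearby but weakly correlated
]
targets = cis_targets(lnc, genes)
```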

Verification of the expression level of DE lncRNAs

To verify whether the screened DE lncRNAs play a role in the process of antagonism, q-PCR was used to detect the expression levels of 12 DE lncRNAs and DE mRNAs in the lamb spleens between the antagonistic and sensitive groups. The relative expression of each RNA was normalized to that of GAPDH using the 2^−ΔΔCt method45, and the primers used in the amplification of the lncRNAs are shown in Table 6.
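The 2^−ΔΔCt calculation can be written out as a short worked example. The Ct values below are hypothetical, and treating one group as the calibrator (control) is an assumption of this sketch, since the text does not state which group served as the reference.

```python
def relative_expression(ct_target_test, ct_ref_test,
                        ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method.

    The reference gene (GAPDH in the text) normalizes each sample; the
    control sample then serves as the calibrator.
    """
    delta_test = ct_target_test - ct_ref_test   # dCt in the test sample
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl   # dCt in the control sample
    return 2 ** -(delta_test - delta_ctrl)

# Hypothetical Ct values: the target amplifies two cycles earlier in the
# test sample after GAPDH normalization, i.e. a four-fold higher level.
fold = relative_expression(
    ct_target_test=24.0, ct_ref_test=18.0,
    ct_target_ctrl=26.0, ct_ref_ctrl=18.0,
)
```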

Table 6 The primers for GAPDH, DE lncRNAs, and mRNAs.

Statistical analysis

All data were analysed by SPSS (version 20.0), and the relative expression levels of different transcripts were analysed by ANOVA. Tukey's test was used for multiple comparisons. Statistical significance was determined when p < 0.05. Each group contained three samples, and each experiment was repeated thrice.