Introduction

The proper regulation of gut activity is vital to homeostasis and survival. The digestive process of absorbing nutrients and releasing them into the bloodstream is achieved through a series of synchronized involuntary movements (motility) of gastrointestinal (GI) smooth muscle, which mixes food and propels the digested content through the GI tract1. Smooth muscle tissue comprises a diverse range of cellular subpopulations, each of which must be isolated and studied individually to elucidate its contribution to the function of the overall tissue. Motility in the GI tract is controlled by several types of cells, including smooth muscle cells (SMC), interstitial cells of Cajal (ICC), and PDGFRα+ cells (fibroblast-like cells), as well as the enteric nervous system (ENS)1. ICC generate spontaneous electrical slow waves2, the ENS generates complex rhythmic motor behavior3, and PDGFRα+ cells mediate enteric inhibitory responses4,5, all of which control SMC, the final effectors of muscle contraction and relaxation1. The three cell types, SMC, ICC, and PDGFRα+ cells (SIP cells), are electrically coupled via gap junctions and form an electrical syncytium that collectively regulates GI motility1. Developmental abnormalities and pathophysiological damage to these cells are directly linked to GI neuromuscular diseases such as Hirschsprung’s disease6, diabetic gastroenteropathy7, gastrointestinal stromal tumor8, intestinal fibrosis9, and chronic intestinal pseudo-obstruction10. All of these motility diseases are thought to arise from remodeling of the smooth muscle in the GI tract, leading to abnormal growth (hypertrophy or tumor), myopathy, or cell death.

Genome-scale expression profiles of specific cell types provide indispensable information regarding cellular identity and function. To access the genetic information of SMC, ICC, and PDGFRα+ cells within the small intestine and colon, we launched a Smooth Muscle Transcriptome Sequencing Project. For this project, we isolated primary jejunal and colonic SMC, ICC, and PDGFRα+ cells (mucosa and muscularis) from cell-specific GFP reporter mouse lines, and obtained a transcriptomic profile of each cell type and associated tissue11,12,13,14. In analyzing each cell type’s transcriptome, we identified new markers and signature genes for each cell type that are linked to cellular functions11,12,13,14.

To help explore this complex dataset, we built the Smooth Muscle Transcriptome Browser (SMTB). This graphical, web-based browser displays the comprehensive expression profile and genomic map of each cell type and associated tissue within the colon and jejunum. The browser is available online, hosted by the University of Nevada, Reno at https://med.unr.edu/physio/transcriptome. This resource provides genome-wide genetic references and expression levels, enabling insight into the genetic structure, expression profile, and isoforms of each gene expressed in key GI cell and tissue populations.

Results

The SMTB offers genome-wide genetic references and unique graphical images that can reveal new insights into the genetic structures, expression profiles, and isoforms of each gene expressed in key GI cell populations (SMC, ICC, and PDGFRα+ cells) and GI tissues (jejunum SM, colonic SM and mucosa) for functional studies.

Applications

  • Expression levels of various genes within GI tissues and GI cell types.

  • Expression levels, and numbers, of transcriptional gene variants in GI tissues and GI cell types.

  • Observing genomic structure (promoter, exons and introns) of transcriptional variants.

  • Primer design for RT-PCR or qPCR (designing primers to span exon to exon junctions in order to minimize genomic DNA amplification and to detect specific transcriptional variants).

  • Viewing splicing donor and acceptor sequence sites of transcriptional variants.

  • Obtaining cDNA sequences for transcriptional variants.

  • Finding open reading frames within transcriptional variants.
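
The primer-design application listed above can be illustrated with a minimal sketch. It assumes that the exon lengths of a variant have been read from its transcript detail page, and it checks whether a candidate primer spans an exon-exon junction in the cDNA. The function names, the exon lengths, and the `min_overhang` parameter are hypothetical choices for illustration and are not part of the SMTB.

```python
from itertools import accumulate


def junction_positions(exon_lengths):
    """Return 0-based cDNA offsets at which one exon ends and the next begins."""
    # Cumulative exon lengths mark the exon boundaries; the last value is the
    # end of the cDNA rather than a junction, so it is dropped.
    return list(accumulate(exon_lengths))[:-1]


def spans_junction(primer_start, primer_length, junctions, min_overhang=4):
    """True if the primer covers a junction with min_overhang bases on each side."""
    primer_end = primer_start + primer_length  # exclusive end, 0-based
    return any(
        primer_start + min_overhang <= j <= primer_end - min_overhang
        for j in junctions
    )


# Hypothetical exon lengths (bp), for illustration only.
exon_lengths = [120, 95, 210]
junctions = junction_positions(exon_lengths)
print(junctions)
print(spans_junction(110, 20, junctions))  # primer covering cDNA offsets 110-129
print(spans_junction(10, 20, junctions))   # primer lying entirely within exon 1
```

A primer that passes this check cannot anneal contiguously to genomic DNA, where the two exons are separated by an intron, which is the rationale given above for junction-spanning primers.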

User's guide

The SMTB is accessible at https://med.unr.edu/physio/transcriptome/smooth-muscle-transcriptome-browser.

  1.

    On the home page, click “Access the Smooth Muscle Transcriptome Browser” to open the browser.

  2.

    Go to “Select Tracks” as shown in Fig. 1a. There are two references of the mouse genome, mm9 (NCBI37, July 2007) and mm10 (GRCm38, Dec. 2011). Select one reference from the “Reference” section. As an example, mm9 was selected in Fig. 1a. Under the transcripts section, there are seven cell types (SMC Jejunum, ICC Jejunum, PDGFRαC Jejunum, SMC Colon, ICC Colon, PDGFRαC Colon, and PDGFRαC Mu Colon), three tissue types (SM Jejunum, SM Colon, and Mu Colon), and combined transcripts (GI All). Select the cell(s)/tissue(s) of interest. For example, SMC Colon and SMC Jejunum were selected (Fig. 1a). Once all selections have been made, click “Back to Browser”.

    Figure 1

    Smooth Muscle Transcriptome Browser built with Gbrowse 2.0. (a) “Select Tracks” tab showing the two selectable mouse reference genomes, mm9 and mm10, as well as the various selectable transcripts from each cell type and tissue. (b) The home screen of the “Browser” tab. (c) The search result for Acta2 in the “Browser” tab. The “Overview,” “Region,” and “Details” sections show the chromosomal location, a highlighted region marker (light blue), and a graphical representation of the Acta2 transcriptional variants expressed in colonic (green) and jejunal (yellow) SMC. A map of the reference gene is also shown under the transcripts (black).

  3.

    The browser itself contains four sections: “Search,” “Overview,” “Region,” and “Details” (Fig. 1b). Type a gene name in “Gene Name or Location” under the Search section (overwriting the chromosome location) and click “Search.” As an example, Acta2 (also known as alpha smooth muscle actin) was typed in and searched for (Fig. 1c). The browser retrieves and displays the genomic map of Acta2 and the transcriptional variants expressed in both colonic and jejunal SMC. The “Overview” and “Region” sections display the location of Acta2 on chromosome 19, with the zoomed-in location highlighted by a light blue bar in the “Region” section. This bar can be clicked and dragged to shift the displayed chromosomal location. In the “Details” section, the transcriptional maps of four isoform variants expressed in colonic and jejunal SMC are displayed. Exons in the variants are marked by green (colonic SMC) or yellow (jejunal SMC) boxes. For Acta2, the direction of the arrows on the line of each variant indicates that the cDNAs match the antisense strand. Among the four Acta2 transcriptional variants, TCONS_00138735 is the longest in both colonic and jejunal SMC and is the same as the reference transcript of Acta2 shown in black (Fig. 1c). Wider boxes on the line of the reference gene mark the coding cDNA region, while narrow boxes show the 5′ and 3′ non-coding regions.

  4.

    The genomic view can be zoomed in and out on any chromosomal location (Fig. 2). Fixed zoom ranges run from 100 bp to 1 Mbp (Fig. 2a). In addition, zoom levels can be changed in 10% increments using the “+” or “−” buttons in the “Scroll/Zoom” section just above the browser. Furthermore, the view can be moved freely to the left or right by clicking the scroll arrow buttons (for example, “<” to move left and “>” to move right), or by holding and dragging the light blue bar in either direction (Fig. 2b). Chromosomal location and distance can be displayed and measured by clicking the ruler and dragging it to the area of interest (Fig. 2b). Each displayed reference gene is linked to its Gene search results at NCBI; clicking it opens a list of related genes, each associated with a full gene report (Fig. 2b).

    Figure 2

    A map view of Acta2 transcripts in the genome. (a) Zooming capabilities range from 1 Mbp to 100 bp at the Acta2 locus (red arrow). (b) A genomic view of 1 Mbp around the Acta2 gene locus. This map view can be moved freely left (←) or right (→) in the genome using the scroll arrow buttons (such as “<” and “>”) in the “Scroll/Zoom” menu. Zooming out (↓) and in (↑) is controlled by “−” and “+”, respectively. “Flip” allows the user to switch between the sense-strand and antisense-strand views. Reference genes in the view are linked to NCBI (Acta2, encircled in blue, is linked to reference genes at NCBI). The ruler on the left can be moved and denotes the exact chromosomal location of the cursor.

  5.

    The genomic map image of Acta2 can be saved in “Snapshots” by clicking the “Save Snapshot” button just above the “Scroll/Zoom” section (Fig. 3a). The saved image can be reloaded, removed, or stored from the “Snapshots” tab above the browser (Fig. 3b). The stored image can also be downloaded (Fig. 3c).

    Figure 3

    Saving and reloading a map of Acta2. (a) The map of Acta2 to be saved as Acta2 by clicking “Save Snapshot”. (b) An image of the map view saved under the “Snapshots” tab. This view can be reloaded by “Load” or removed by clicking the trash can. (c) A saved image of Acta2. The reloaded view (a) shows four transcriptional variants found in “SMC Colon” and in “SMC Jejunum”. The longest variant is TCONS_00138735 and is indicated by a red arrow in (a).

  6.

    Each transcriptional variant on the map is linked to detailed data. The longest transcriptional variant of Acta2, TCONS_00138735, is used here as an example for analysis (Fig. 3a). Clicking on the structural display of TCONS_00138735 under the “Details” section opens a page containing a summary of exon numbers and locations, expression profiles, and the cDNA sequence (Fig. 4). Exon number, length, and associated chromosomal locations are summarized in Fig. 4a. Each exon contains a hyperlink to a map view of that exon. Expression levels (FPKM) of TCONS_00138735 in jejunal cells (JSMC, JICC, and JPαC), jejunal muscle tissue (JSM), colonic cells (CSMC, CICC, and CPαC), colonic muscle tissue (CSM), colonic mucosal PDGFRα+ cells (CMuPαC), and colonic mucosal tissue (CMu) are shown as histograms (Fig. 4b). This variant is dominantly expressed in JSMC and CSMC. It is also noticeably expressed in JPαC and CMuPαC. Total expression levels of all four transcriptional variants of Acta2 are shown in Fig. 4c. Figure 4d shows the DNA sequence of TCONS_00138735, containing 9 exons and 8 introns. The cDNA sequence can be downloaded as a .doc file by clicking the “Download cDNA Sequence” hyperlink above the displayed sequence. In addition, the cDNA can be further analyzed to search for open reading frames (ORFs).

    Figure 4

    The expression profile and exon/intron structure of Acta2 TCONS_00138735. (a) Summary of the exonic map of TCONS_00138735. The number of exons and respective chromosomal locations are shown. Each exon position (hyperlinked in blue) is linked to the map view. (b) Expression level (FPKM) of TCONS_00138735 from Acta2 in jejunal and colonic cells as well as tissues. The image can be downloaded as a .gif file and saved. (c) Total expression level (FPKM) and the number of all transcriptional variants of Acta2 found in jejunal and colonic cells and tissues. The image can be downloaded as a .gif file and saved. JSM, jejunal smooth muscle; JSMC, jejunal SMC; JICC, jejunal ICC; JPαC, jejunal PDGFRα+ cells; CSM, colonic smooth muscle; CSMC, colonic SMC; CICC, colonic ICC; CPαC, colonic PDGFRα+ cells; CMu, colonic mucosa; CMuPαC, colonic mucosal PDGFRα+ cells. (d) DNA sequence of TCONS_00138735. Exons are highlighted on the sequence in grey. Repeated sequences (also known as repetitive elements) are written in lowercase. The TCONS_00138735 cDNA sequence can be downloaded and saved as a .doc file, as indicated by a black arrow. It is also linked to the ORF Finder at NCBI by clicking “Search Open Reading Frames” (red arrow).

  7.

    Instructions for searching for ORFs are shown in Fig. 5. The Acta2 TCONS_00138735 cDNA is 1,780 bp in length. The cDNA sequence was copied, pasted, and submitted in the Enter Query Sequence field of the NCBI ORF Finder15 (Fig. 6a). The search produced 12 possible ORFs (ORF1–12, Fig. 6b). ORF1 is the longest, with 1,134 bp (377 amino acids), spanning nucleotide (nt) 71 to nt 1,204. The amino acid sequence can be further analyzed by protein BLAST (SMARTBLAST or BLAST) to search for homologous proteins.

    Figure 5

    The cDNA sequence of TCONS_00138735 and instructions on using the Open Reading Frame (ORF) Finder at NCBI.

    Figure 6

    Submission of TCONS_00138735 cDNA to ORF Finder and selection of a putative ORF. (a) A snapshot of TCONS_00138735 cDNA pasted in ORF Finder. (b) A snapshot of the ORF Viewer showing TCONS_00138735 ORFs. The longest ORF, ORF1, is indicated by an arrow. Translated ORF1 amino acid (aa) sequence and summary of all ORFs are shown.
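
The ORF search in step 7 can be sketched in a few lines of code. The sketch below is not the NCBI ORF Finder algorithm; it is a simplified forward-strand scan for ATG-to-stop reading frames, and the function names, the `min_length` parameter, and the toy sequence are assumptions made for illustration. The helper `protein_length` reproduces the arithmetic quoted for ORF1: an ORF spanning nt 71 to nt 1,204 is 1,134 bp long, which corresponds to 377 amino acids plus one stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(cdna, min_length=75):
    """Return (start, end, frame) for ATG-to-stop ORFs on the forward strand.

    start and end are 1-based inclusive positions; end includes the stop codon.
    """
    seq = cdna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_length:
                    orfs.append((start + 1, i + 3, frame + 1))
                start = None  # look for the next ORF in this frame
    return orfs


def longest_orf(cdna, min_length=75):
    """Return the longest ORF found by find_orfs, or None if there is none."""
    orfs = find_orfs(cdna, min_length)
    return max(orfs, key=lambda o: o[1] - o[0] + 1) if orfs else None


def protein_length(start, end):
    """Number of amino acids encoded by an ORF whose end includes the stop codon."""
    return (end - start + 1) // 3 - 1


# Coordinates of ORF1 as quoted in the text.
print(protein_length(71, 1204))

# Toy sequence for illustration only; it is not the Acta2 cDNA.
toy = "CC" + "ATG" + "GCT" * 30 + "TAA" + "GG"
print(longest_orf(toy))
```

To apply the sketch to real data, the downloaded cDNA text would be passed to `longest_orf` in place of the toy sequence; a complete search would also scan the reverse strand.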

Discussion

The Smooth Muscle Transcriptome Browser (SMTB) built in this study provides both the number and expression level of each transcriptional variant for all mapped genes expressed in SIP cells (SMC, ICC, and PDGFRα+ cells) within the mucosa and muscularis layers of the jejunum and colon. The cellular transcriptomes and expression profiles are provided along with the expression profiles of each cell’s respective associated tissue including the jejunal muscularis, colonic muscularis, and mucosa. This browser allows researchers to analyze each transcriptional variant in terms of gene structure (promoter region, exons, and introns), expression levels within each cell type and tissue, and open reading frames (coding protein) within each variant.

We previously built the UCSC Smooth Muscle Genome Browser (SMGB)11, which can interact with our SMC transcriptome, SRF (SMC-specific transcription factor) binding CArG boxes, and the ENCODE data contained within the UCSC Genome Browser16. Later, we added the transcriptomes of ICC and PDGFRα+ cells from both the muscularis and mucosa to our browser12,13,14. This interactive database can be used to analyze the genetic/epigenetic structure and regulation of individual genes expressed in SIP cells by integrating the abundant bioinformatics data in the UCSC Genome Database17. In this paper, we report the SMTB, which can search the expression profile of individual transcriptional variants of all mapped genes expressed in SIP cells, something that was not possible with any of our previously reported SIP cell transcriptome studies. The two browsers serve as complementary and powerful tools for functional studies of SIP cells.

As an example, the Acta2 gene was chosen for analysis using the SMGB and SMTB (Fig. S1). Both browsers can display the mouse mm9 genome showing the same region, chr19: 34,315,581–34,329,826 (14,246 bp), within which four transcriptional variants, V1–4, of the gene are expressed in JSMC and CSMC. Within the SMGB, SRF binding CArG boxes and ENCODE data (SRF, H3K27ac, Pol2, JunD, and c-Jun binding sites, as well as DNase I hypersensitivity) were selected (Fig. S1a). Both V1 and V2 start in proximal promoter regions that contain two SRF binding sites, which coincide with two conserved CArG boxes for each variant (CCATATAGGG and CCAAACAAGG for V1; CCATATTTAG and CCTAATTAGG for V2). The region around the SRF binding sites coincides with H3K27ac and Pol2 sites (Fig. S1a), suggesting that the gene is active and that V1 and V2 are transcribed by Pol2 while being regulated by SRF (due to the presence of SRF binding sites and CArG boxes) in intestinal SMC. In addition, a binding site for the transcription factor subunits JunD and c-Jun can be found within intron 7, which also matches a DNase I hypersensitive region (Fig. S1a), suggesting that the transcription of V1–4 could be regulated by the Jun family in JSMC and CSMC. The corresponding region of the Jun binding site (240 bp, chr19: 34,318,625–34,318,864) was located on the SMTB (Fig. S1b), and the 240 bp sequence was analyzed for the presence of Jun binding sites in the transcriptional regulatory element search database “PROMO”18. This search identified two binding sequences, TATGTCA and GAAGTCA, for JunD and c-Jun (Fig. S1b). Next, expression levels of the four variants of Acta2 were found on the SMTB. V1 and V2, likely regulated by SRF, show selectively high expression in JSMC and CSMC compared to colonic and jejunal ICC and PDGFRα+ cells, while V3 and V4 are dominantly expressed in CSMC and jejunal PDGFRα+ cells (Fig. S2a). V1, V2, and V4 are also expressed in mucosal PDGFRα+ cells at relatively low levels.
Each cDNA sequence for V1–4 was downloaded, and the longest open reading frame of each was obtained from the NCBI ORF Finder15. An alignment of their amino acid sequences is shown in Fig. S2b. V1, V2, and V3 encode the full-length protein of 377 amino acids, which contains six post-translationally modified residues at the N-terminus, as shown in UniProt19. V4 encodes an N-terminally truncated protein of 257 amino acids that lacks these six residues. As all of this integrated information shows, both the SMGB and SMTB are interactive browsers that provide genetic and epigenetic references that can be combined for further functional studies in intestinal SIP cells and tissues.
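
The coordinate arithmetic and motif search described in this example can be sketched as follows. This is only an illustration: PROMO uses its own matching procedure, whereas the sketch performs an exact-string scan on both strands, and the function names and the toy sequence are assumptions rather than data from the browsers. Only the coordinates and the two binding sequences are taken from the text.

```python
def region_length(start, end):
    """Length in bp of a 1-based, inclusive genomic interval."""
    return end - start + 1


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, and T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motifs(seq, motifs):
    """Return sorted (position, motif, strand) hits; position is 1-based on seq."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            pos = seq.find(pattern)
            while pos != -1:
                hits.append((pos + 1, motif, strand))
                pos = seq.find(pattern, pos + 1)
    return sorted(hits)


# Regions quoted in the text (mm9, chr19).
print(region_length(34_315_581, 34_329_826))  # Acta2 region
print(region_length(34_318_625, 34_318_864))  # Jun binding region

# Made-up sequence for illustration; it is NOT the real chr19 sequence.
toy = "GGCTATGTCAGGTTTGACTTCAA"
print(find_motifs(toy, ["TATGTCA", "GAAGTCA"]))
```

In practice, the 240 bp sequence retrieved from the SMTB would replace the toy string.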

While SMTB shares some features with previously published genome/transcriptome browsers20,21, none of those resources offers the full set of analyses that SMTB provides. SMTB graphically represents FPKM values and the number of transcripts at the whole-tissue (smooth muscle and mucosa) and individual SIP cell type (SMC, ICC, and PDGFRα+ cells) levels, and allows cDNA sequences to be downloaded for both the mm9 and mm10 murine genomes. Other browsers allow visualization of FPKM data, such as the well-designed and valuable Brain RNA-seq project21, but that browser uses a single genome reference (mm9), does not allow cDNA sequence download or visualization of alternatively spliced transcripts, and provides FPKM values only for individual cell types, not for the entire tissue. In contrast, a tissue reference genome such as the RETINAL genome20 contains transcriptomic information only from total retinal tissue rather than individual cell types, does not have graphical representations or cDNA sequence downloads, and uses a single genome reference. RNA-seq data from both the small and large intestine (colon) are also available through ENCODE (RNA-seq from LICR and UW, respectively) at the UCSC Genome Browser17. These RNA-seq data were obtained from whole small or large intestine tissue consisting of smooth muscle and mucosa, which are functionally distinct and contain many different cell types. In contrast, we separated jejunal and colonic smooth muscle from their respective mucosa, and mRNA from each was independently sequenced and deposited in the SMTB. Furthermore, while our transcriptome browser is specific to GI tissues and SIP cell types, it combines data and user-friendly capabilities that have not previously been available within a single browser.

For all our previously reported transcriptomes, we obtained and analyzed RNA-seq data from SIP cells of the murine jejunum and colon (Table 1), which unveiled a wealth of genetic information at the cellular level. This report is the first to compile and compare these transcriptomes. SIP cells express 15,192–17,172 known genes (Table 2), which account for 66–75% of all known mouse genes. In addition, these genes are transcribed into multiple transcriptional variants through alternative start sites and splicing. For example, the L-type Ca2+ channel Cacna1c is expressed as 22 different variants in jejunal and colonic SMC22. The average number of transcriptional variants per gene is 3 (Table 2). Most variants appear to be cell-specific. Further characterization of each variant is required to explore the localization of variants in subpopulations of SIP cells (e.g. longitudinal SMC, circular SMC, intermuscular ICC/PDGFRα+ cells, and intramuscular ICC/PDGFRα+ cells), as well as to further understand the functional role of variants (whether they are translated into protein or act as pseudo-substrates for microRNAs).

Table 1 List of intestinal tissues and cells used for transcriptome studies.
Table 2 Summary of transcriptome data obtained from intestinal tissues and cells.

The transcriptomes of SIP cells are very similar, sharing not only the genes that are expressed in all three types (around 93%), but also showing comparable expression levels for these gene transcripts. These similar transcriptomic profiles lend credence to the notion that these three cell types might have a shared developmental lineage arising from mesenchymal cells23. In contrast to their largely shared expression profiles, SIP cells also have ~1,500 genes that are unique to each cell within the syncytium. These cell type-specific genes may contribute to the individual phenotypic identity, and unique functioning, of each cell type in the SIP syncytium.

Through previous gene ontology (GO) term analysis of the predominantly expressed genes within SIP cells, we identified the main functional roles of each cell type. SMC genes showed unique GO terms related to muscle contraction including genes related to the cytoskeleton, actin binding, calcium ion binding, myosin complex, and smooth muscle contractile fibers11. ICC genes showed GO terms related to membrane excitability including membrane integrity, plasma membrane, metal ion binding, and transport13. PDGFRα+ cell genes revealed GO categories related to the function and structure of the extracellular matrix12,14.

The cell-specific markers identified in our previous transcriptome studies provide new tools to study SIP cells11,12,13, and our new SMTB allows for easy visualization and comparison of these data. We found that the most distinctive markers for detecting primary mature SMC are Cnn1, Mylk, Tpm2, Tpm1, Des, and Myh11 (ref. 11). SMC are phenotypically dynamic, with the ability to dedifferentiate back to a myofibroblast-like state under pathological conditions that cause SMC overgrowth and under in vitro conditions24. The six SMC markers should be used together to evaluate SMC phenotype in pathological or cultured conditions. We also identified the new markers Thbs4 and Hcn4, which could be used in ICC identification. Currently, and historically, ICC are identified and tagged through the use of KIT (CD117) antibodies. However, the expression of KIT is not consistent across various ICC phenotypic conditions25. For example, in some GI pathologies that cause digestive dysmotility, ICC lose their expression of KIT, making KIT antibodies a poor marker of ICC as they cannot be used to track ICC that undergo phenotypic alterations26. Our newly uncovered ICC-selective marker, THBS4, may allow for the tracking of ICC long after KIT expression is lost. Additionally, we found that expression of Cacna1g may be an identifying marker of hyperplastic PDGFRα+ cells in the context of various GI diseases (small bowel obstruction, colorectal cancer, Crohn’s disease, and diverticulitis)12. Lastly, we identified the new marker Adamdec1 in mucosal PDGFRα+ cells (unpublished data). Expression of Adamdec1 is induced in the DSS-induced colitis mouse model and in human tissue affected by Crohn’s disease (unpublished data).

Prior to the publication of our SIP cell transcriptomes, there were no transcriptome-level comparative tools available for analyzing cells within the SIP syncytium; comparisons required extracting individual data points from extremely large and generally opaque spreadsheets. The SMTB is a dynamic tool that allows quick, easy, and clear-cut access to genetic information for individual transcripts in individual SIP cells and their associated tissues, while allowing comparison between both transcripts and cells/tissues. This tool will bring insights into new potential pathways for future research in the vast field of GI smooth muscle biology.

Materials and Methods

Intestinal tissues and cells

Pooled total RNAs were isolated from gastrointestinal (GI) tissue (2 males and 2 females, 1–2 months old) or sorted cells (up to 20 males and 20 females, 1–2 months old) from GFP reporter mouse lines: SMC from Myh11-Cre-eGFP27, ICC from Kit-copGFP28, and PDGFRα+ cells from Pdgfra-eGFP29 (Table 1). All procedures involving animals and their care were performed in accordance with institutional, state and national guidelines. The animal protocol was approved by the Institutional Animal Care and Use Committee at the University of Nevada-Reno Animal Resources. RNA-seq libraries were generated and sequenced via Illumina HiSeq 2000 (Illumina, San Diego, CA) at LC Sciences (Houston, TX) as previously described11,12,13,14.

RNA-seq data analysis

Paired-end sequencing reads were isolated, processed, and analyzed as previously described11,12,13,14. An FPKM = 0.025 was selected as our cutoff expression value as it resulted in equal false positive and false negative ratios of reliability. Transcripts with FPKM values of less than 0.025 were considered to be 0. Sequencing reads were assembled and annotated onto a reference genome (UCSC mm9 or mm10) using TopHat v1.4.1 software.
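
The cutoff rule stated above can be written as a short sketch. Only the threshold of 0.025 comes from the text; the function name and the example values are hypothetical, and the sample labels simply reuse the abbreviations from Fig. 4.

```python
FPKM_CUTOFF = 0.025  # cutoff expression value stated in the text


def apply_cutoff(fpkm_by_sample, cutoff=FPKM_CUTOFF):
    """Return a copy in which FPKM values below the cutoff are set to 0."""
    return {
        sample: (value if value >= cutoff else 0.0)
        for sample, value in fpkm_by_sample.items()
    }


# Hypothetical FPKM values for one transcript, for illustration only.
example = {"JSMC": 812.4, "JICC": 0.02, "CSMC": 0.025}
print(apply_cutoff(example))
```

Values equal to the cutoff are retained, because the text treats only values of less than 0.025 as 0.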

Transcriptome

Transcriptomes obtained from RNA-seq data analyses are shown in Table 2 (refs. 11,12,13,14). Briefly, we obtained total reads of 136.6 M–238.2 M, 77–93% of which were mapped to the murine genome. The mapping identified 46,299–55,345 transcriptional isoform variants, which were aligned to 15,192–17,172 known genes. The average number of transcriptional variants per gene was 3.
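
As a rough consistency check on these figures, the ranges can be combined to bound the number of variants per gene. The text reports only the minimum and maximum across samples, so the pairing of transcript and gene counts within any one sample is unknown; the sketch below therefore computes the widest possible bounds rather than per-sample values, and the function name is hypothetical.

```python
# Ranges reported in the text (minimum and maximum across samples).
transcript_range = (46_299, 55_345)
gene_range = (15_192, 17_172)


def ratio_bounds(numerators, denominators):
    """Smallest and largest possible ratio given two (min, max) ranges."""
    low = min(numerators) / max(denominators)
    high = max(numerators) / min(denominators)
    return low, high


low, high = ratio_bounds(transcript_range, gene_range)
print(f"variants per gene: between {low:.2f} and {high:.2f}")
```

The resulting interval brackets the reported average of 3 variants per gene.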

Smooth Muscle Transcriptome Browser

Gbrowse 2.0 (ref. 30) was used to build the SMTB. Gbrowse is a web-based genome browser originally developed in 2002 for use with Wormbase31. It enables web-based graphical sequence displays and sequence annotation. The browser has been used for other public data sources such as Flybase32, SGD33, and SilkDB34. In the browser, data can be downloaded for all transcripts on transcript detail pages. The SMTB was created using the standard Gbrowse 2.0 install with custom modifications. In addition to configuration files, the modified Gbrowse files included CSS files and the CGI scripts “gbrowse” and “gbrowse_details”. Gbrowse 2.0 was installed on a Debian Linux 7.6 operating system. Transcript data files were converted to GFF3 format and then uploaded to the MySQL backend database included with the install. Images for each transcript, showing expression levels and the number of splice variants, were created with in-house software.
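
The GFF3 conversion step can be illustrated with a minimal sketch. This is not the conversion code used for the SMTB, which is not described here; it only shows how one transcript and its exons map onto the nine tab-separated GFF3 columns (seqid, source, type, start, end, score, strand, phase, attributes). The function name, the source label, the transcript identifier, and the coordinates are all hypothetical.

```python
def transcript_to_gff3(seqid, transcript_id, strand, exons, source="example"):
    """Return GFF3 lines for one transcript and its exons.

    exons is a list of (start, end) pairs, 1-based and inclusive.
    """
    exons = sorted(exons)
    start, end = exons[0][0], exons[-1][1]
    # Parent feature spanning the whole transcript.
    lines = ["\t".join([seqid, source, "mRNA", str(start), str(end),
                        ".", strand, ".", f"ID={transcript_id}"])]
    # One child feature per exon, linked to the transcript through Parent.
    for n, (s, e) in enumerate(exons, 1):
        attributes = f"ID={transcript_id}.exon{n};Parent={transcript_id}"
        lines.append("\t".join([seqid, source, "exon", str(s), str(e),
                                ".", strand, ".", attributes]))
    return lines


# Hypothetical identifier and coordinates, for illustration only.
print("##gff-version 3")
for line in transcript_to_gff3("chr19", "TX_EXAMPLE", "-", [(1500, 1650), (1000, 1200)]):
    print(line)
```

Score and phase are left as "." because the sketch carries no scoring information and emits no CDS features.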

Future Directions

We plan to expand the Smooth Muscle Transcriptome Sequencing Project to other GI cell types in mice and to human GI cell types, and to update the SMTB accordingly.