Background & Summary

In this study, we present next-generation RNA sequencing (RNA-seq) data of murine ventricular cardiomyocytes (CMCs). To date, only whole-heart RNA-seq data have been published1–3, in which a variety of cell types, such as fibroblasts, endothelial cells, and atrial and ventricular cardiomyocytes, are pooled. We endeavoured to provide RNA-seq data of isolated CMCs for several reasons. Firstly, since the pump function of the heart relies on proper CMC function, CMCs are the most thoroughly studied cardiac cell type. Researchers studying CMCs may benefit from CMC-specific RNA-seq data from which expression of genes of interest can be extracted. Secondly, because of the crucial role of ion channels in cardiac electrical excitability and arrhythmogenesis, researchers who study cardiac arrhythmias have debated the question of which ion channels are expressed in CMCs. However, existing ion channel expression data are low-throughput, often contradictory4–6, fragmented7, or assessed in the whole heart. The present work reveals the expression of more than 350 ion channel family members, including pore-forming and auxiliary subunits, in CMCs (see Fig. 1 and Tables 1, 2 and 3 (available online only)). We therefore believe that these data will be valuable for ion channel researchers attempting to resolve the ongoing debate.

Figure 1: Gene expression of ion channels in murine ventricular cardiomyocytes.

(a) Expression levels of voltage-gated ion channel genes: voltage-gated sodium channels (Na+; purple), voltage-gated calcium channels (Ca2+; blue), transient receptor potential cation channels (TRP; light blue), CatSper channels (CS; aqua), two-pore channels (2P; green), cyclic-nucleotide-regulated channels (cN; light green), calcium-activated potassium channels (KCa; ochre), voltage-gated potassium channels (K+; orange), inwardly rectifying potassium channels (Kir; red) and two-pore potassium channels (2PK; burgundy). (b) Expression levels of the ligand-gated purinergic receptor gene (PR; purple) and of ion channel genes from the “other” category: aquaporins (Aqp; blue), voltage-sensitive chloride channels (Cl-; light blue), calcium-activated chloride channels (CaCl-; green) and inositol trisphosphate receptors (IP3; light green). (c) Expression levels of further ion channel genes from the “other” category: ryanodine receptors (Ryr; orange), gap junction proteins (GJ; red) and chloride intracellular channels (icCl-; burgundy). All expression levels are average TPM values of WT samples (n=5). Shown are genes with more than 75 reads per gene (normalized for gene length, prior to conversion to TPM) from Tables 1, 2 and 3 (available online only).

Table 1 Expression of voltage-gated ion channels
Table 2 Expression of ligand-gated ion channels
Table 3 Expression of other ion channels

We have also included cardiac-specific knockout models of the ion channel regulators dystrophin, synapse-associated protein-97 (SAP97), and calmodulin-activated serine kinase (CASK). They interact with ion channels and modify their cell biological properties, such as membrane localization3,8–11. Notably, CASK provides a direct link between ion channel function and gene expression. It regulates transcription factors (TFs) in the nucleus, such as Tbr-1, and induces transcription of T-element-containing genes12. CASK also regulates TFs of the basic helix-loop-helix family, which bind E-box elements in promoter regions, by modulating the inhibitor of the DNA-binding-1 TF13. Additionally, CASK and SAP97 directly interact with each other11. For these reasons, we included CASK, SAP97, and dystrophin knockout mice to investigate whether these three proteins have a similar effect on gene expression, which may suggest their involvement in similar pathways. However, research beyond the scope of this paper would be needed to determine whether CASK-dependent TF regulation caused the differential expression that we observed.

To date, mutations in approximately 27 ion channel genes have been associated with cardiac arrhythmias, such as congenital short- and long-QT syndrome (SQTS and LQTS), Brugada syndrome (BrS), and conduction disorders (see http://omim.org)14–16. Notably, our ion channel expression data, as presented in Fig. 1 and Tables 1, 2 and 3 (available online only), reveal that several arrhythmia-associated ion channel genes are not expressed, or are only scarcely expressed, in murine ventricular CMCs (including Kcne2, Kcne3, Scn2b, and Scn3b). Although murine and human ion channel expression may differ, we are presently unaware of any available transcriptome of human CMCs17,18. We are also unable to either exclude or assess the effect of enzymatic isolation on the transcriptome. Finally, other cardiac cell types such as (myo)fibroblasts may express these ion channels and therefore may be important for arrhythmogenesis. Indeed, many ion channel genes that are not expressed in cardiomyocytes have been reported in murine whole-heart tissue2. These include Scn1a, Scn3b, 10 voltage-gated Ca2+ channels, 10 Kv channels, and four two-pore K+ channels. Conversely, all ion channel genes expressed in CMCs are also reported in whole-heart expression data.

In sum, this study presents RNA-seq data from wildtype murine ventricular CMCs, as well as from SAP97, CASK, and dystrophin knockouts and controls (see Fig. 2 for a schematic overview of study design). We performed differential gene expression analysis to compare the knockouts to their controls, and we extracted wildtype ion channel gene expression data (Tables 1, 2 and 3 (available online only), Fig. 1). We believe that these data will be valuable for researchers studying cardiomyocytes and ion channels to assess expression of genes of interest.

Figure 2: Experimental design and workflow.

(1) 22 mice with six different genetic backgrounds (CASK KO and control, SAP97 KO and control, and MDX and control) were used. fl+, first exon of gene is floxed; Cre+, Cre recombinase is expressed. (2) Cardiomyocytes were isolated on a Langendorff system and RNA was isolated with an FFPE Clear RNAready kit. (3) Libraries were constructed with 1 μg RNA per sample using a TruSeq Stranded Total RNA protocol and (4) sequenced on an Illumina HiSeq3000 machine. (5) Quality of the reads was assessed with FastQC, and (6) reads were mapped to the Mus musculus reference genome (GRCm38.83) with Tophat. (7) To assess sample variation within each group, we performed principal component analysis (PCA) (see Fig. 3). (8) Lastly, ion channel expression was determined.

Methods

Mouse models

All animal experiments conformed to the Guide to the Care and Use of Laboratory Animals (US National Institutes of Health, publication No. 85-23, revised 1996), were approved by the Cantonal Veterinary Administration, Bern, Switzerland, and complied with the Swiss Federal Animal Protection Law. Mice were kept on a 12-hour light/dark cycle, with lights on from 6:30 AM to 6:30 PM. To avoid the influence of circadian rhythm, mice were sacrificed between 10:00 AM and 1:00 PM. All mice were male and between 8 and 15 weeks of age.

αMHC-Cre

The cardiac-specific murine alpha-myosin heavy chain (αMHC) promoter drives the expression of Cre recombinase, which, in turn, can recombine LoxP sequences. The αMHC-Cre strain was generated as previously described19 and acquired from the Jackson Laboratory (stock #011038).

CASK and SAP97 knockout mice

CASK KO and SAP97/Dlg1 KO mice were generated as previously described9,20. Both the CASK and SAP97 mouse lines were on mixed backgrounds. The appropriate control mice were selected in accordance with the publications that characterized both mouse lines9,20. CASK control mice express Cre, but the first CASK exon is not floxed. SAP97 control mice are Cre-negative, and the first exon of the SAP97 gene is floxed.

Dystrophin knockout (MDX-5CV) mice

The MDX-5CV strain shows a total absence of the dystrophin protein. It was created as previously described21 and acquired from the Jackson Laboratory (stock #002379). MDX mice were on a pure Bl6/Ros background. Control mice were on a pure Bl6/J background, except for MDX_Ct5 and MDX_5, which were Bl6/Ros mice backcrossed three times on Bl6/J.

Cardiomyocyte isolation

Mice (n=3–5 per genotype, male, age 10–15 weeks) were heparinized (intraperitoneal injection of 100 μL heparin (5000 U/mL; Biochrom AG)) and killed by cervical dislocation. Hearts were excised, and the aortas were cannulated in ice-cold phosphate-buffered saline (PBS). Subsequently, hearts were perfused on a Langendorff system in a retrograde manner at 37 °C with 5 mL perfusion buffer (1.5 mL/min; in mM: 135 NaCl, 4 KCl, 1.2 NaH2PO4, 1.2 MgCl2, 10 HEPES, 11 glucose), followed by the application of type II collagenase (Worthington CLS2; 25 mL of 1 mg/mL in perfusion buffer with 50 μM CaCl2). Left and right ventricles were triturated in PBS to dissociate individual ventricular cardiomyocytes and then filtered through a 100 μm filter.

RNA extraction and sequencing

RNA-seq was performed by the Next Generation Sequencing Platform at the University of Bern. Total RNA was isolated from freshly dissociated cardiomyocytes with an FFPE Clear RNAready kit (AmpTec, Germany), which included a DNase treatment step. RNA quality was assessed with Qubit and Bioanalyzer, and RNA quantity was checked with Qubit.

To allow sequencing of long non-coding RNA (lncRNA), libraries were constructed with 1 μg RNA using the TruSeq Stranded Total RNA kit after Ribo-Zero Gold (Illumina) treatment for rRNA depletion. Library molecules with inserts <300 base pairs (bp) were removed. Paired-end libraries (2x150 bp) were sequenced on an Illumina HiSeq3000 machine.

RNA-seq data analysis

Between 17.5 and 56.4 million read pairs were obtained per sample, and the quality of the reads was assessed using FastQC v.0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Ribosomal RNA (rRNA) was removed by mapping the reads with Bowtie2 v.2.2.1 (ref. 22) to a collection of rRNA sequences (NR_003279.1, NR_003278.3 and NR_003280.2) downloaded from NCBI (www.ncbi.nlm.nih.gov). No quality trimming was required. The remaining reads were mapped to the Mus musculus reference genome (GRCm38.83) with Tophat v.2.0.13 (ref. 23). We used htseq-count v.0.6.1 (ref. 24) to count the number of reads overlapping with each gene, as specified in the Ensembl annotation (GRCm38.83). Detailed information about the genes, including the Entrez Gene ID, the MGI symbol and the description of the gene, was obtained using the Bioconductor package biomaRt v.2.26.1 (ref. 25).

Raw read counts were corrected for gene length, and TPM (transcripts per million) values were calculated to compare the expression levels among samples. Gene lengths for this step were retrieved from the Ensembl annotation (GRCm38.83) as the total length of all exons of a gene.
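The length correction and TPM conversion described above can be illustrated with a minimal sketch. It assumes per-gene raw counts and summed exon lengths are already available as dictionaries; the function name `tpm`, the gene names and the numbers are hypothetical and are not taken from the dataset.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts of one sample to TPM.

    counts:     dict mapping gene ID -> raw read count
    lengths_bp: dict mapping gene ID -> summed exon length in base pairs
    """
    # Step 1: correct for gene length (reads per kilobase of exon).
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    # Step 2: rescale so that all values in the sample sum to one million.
    scale = sum(rpk.values()) / 1e6
    if scale == 0:
        return {g: 0.0 for g in counts}
    return {g: v / scale for g, v in rpk.items()}


# Toy input (hypothetical genes and counts, for illustration only).
counts = {"geneA": 100, "geneB": 300, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}
values = tpm(counts, lengths)
```

In this toy input, geneB has three times the reads of geneA but is also three times as long, so both receive the same TPM value. Because each sample is rescaled to the same total, TPM values can be compared among samples.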

Principal component analysis (PCA) plots were generated in DESeq2 v.1.10.1 (ref. 26) (https://bioconductor.org/packages/release/bioc/html/DESeq2.html) using the 500 genes with the most variable expression across samples. A regularized log transformation was applied to the counts before performing the PCA.
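The gene-selection step that precedes the PCA can be sketched as follows. This is an illustration under stated assumptions, not the DESeq2 implementation: a plain log2(x + 1) transform stands in for the regularized log transformation, and the function name `top_variable_genes` and the toy data are hypothetical. The PCA itself would then be run on the selected, centred genes with a numerical library.

```python
import math
from statistics import pvariance


def top_variable_genes(counts, n_top=500):
    """Return the n_top gene IDs with the highest variance across samples.

    counts: dict mapping gene ID -> list of counts, one value per sample
    """
    # Assumption: log2(x + 1) is used here as a simple stand-in for the
    # regularized log transformation applied in DESeq2.
    logged = {g: [math.log2(v + 1) for v in vals] for g, vals in counts.items()}
    # Rank genes by their variance across samples, most variable first.
    ranked = sorted(logged, key=lambda g: pvariance(logged[g]), reverse=True)
    return ranked[:n_top]


# Toy input (hypothetical genes, four samples each).
toy_counts = {
    "flat": [10, 10, 10, 10],
    "mild": [10, 12, 9, 11],
    "strong": [1, 100, 2, 150],
}
selected = top_variable_genes(toy_counts, n_top=2)
```

Restricting the PCA to the most variable genes means that the plot reflects the largest differences between samples rather than the behaviour of a typical gene.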

Statistics

To assess differential gene expression between genotypes, a Wald test was performed with the Bioconductor package DESeq2 v.1.10.1 (ref. 26). We considered adjusted p values of up to 0.01, after Benjamini-Hochberg false discovery rate correction, to indicate a significant difference. Statistical tools used included DESeq2, R v.3.2.5 (https://cran.r-project.org), and biomaRt v.2.26.1 (www.biomart.org).
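The Benjamini-Hochberg adjustment and the 0.01 cut-off can be illustrated with a short sketch. It covers only the multiple-testing correction and does not reproduce the Wald test or any other DESeq2 step; the function name `benjamini_hochberg` and the p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p values (hypothetical, for illustration only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
padj = benjamini_hochberg(pvals)
significant = [p <= 0.01 for p in padj]
```

In this toy input the second gene has a raw p value below 0.01 but an adjusted value of 0.02, so only the first gene passes the threshold.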

Data Records

The data were submitted to NCBI Gene Expression Omnibus (GEO) (Data Citation 1). This GEO project contains raw data and TPM values from all samples, and differential gene expression analysis between knockout and control samples.

Technical Validation

RNA metrics

RNA-seq yielded 1.0 billion read pairs in total, with an average of 44.5 million read pairs per sample (standard deviation 8.4 million). The number of read pairs (in millions) was 306 for CASK KO and Ctrl, 268 for SAP97 KO and Ctrl, and 404 for MDX and Ctrl (see Table 4 for an overview of RNA-seq metrics, including mapping rates). One sample (MDX_1) yielded few reads and was therefore excluded from further analyses. The proportion of reads mapping to annotated exons ranged from 65 to 77%. Mapping, no-feature (2–13%), and ambiguous (11–23%) read pairs together accounted for 89–97% of the total number of RNA reads (Table 4). Read pairs covered 49,671 genes of the Mus musculus reference genome (GRCm38.83).

Table 4 RNA-seq raw data and mapping metrics.

Quality assessment

The quality of all samples was assessed with FastQC. Except for MDX_1, all samples were of high quality. Where applicable, a representative example (MDX_Ct1) is shown. Firstly, the insert size histogram (Fig. 3a) shows that the inferred insert size of each sample exceeded 150 base pairs, demonstrating that the sequencing was not contaminated by adapter sequences. Secondly, the GC content plot (Fig. 3c) ideally shows a roughly normal distribution centred around the average GC content of the genome, which varies between species. The peaks observed in Fig. 3c are likely caused by sequences that are detected at high copy numbers, and should not pose problems for downstream analyses. Furthermore, Phred scores (Fig. 3d) are well within the green area of the graph, indicating good base quality along the length of the reads. In addition, the gene coverage graph (Fig. 3e) of sample MDX_Ct1 shows that reads are distributed evenly along the length of the gene body. Because the gene coverage for all other samples is highly comparable to that of MDX_Ct1, only one example is shown. Lastly, the saturation report (Fig. 3f) represents the number of splice junctions detected using different subsets of the data from 5 to 100% of all reads. At sequencing depths sufficient to perform alternative splicing analysis, at least the red line, representing known junctions, should reach a plateau where adding more data does not substantially increase the number of detected junctions. Only MDX_1 does not reach this plateau.

Figure 3: Quality control.

(a) Histogram of inferred insert size for each sample, which represents the distance between the two reads of one RNA fragment. (b) Principal component analysis (PCA) plot used to assess variability of samples within and between groups. Plot of the first two axes from a PCA based on the 500 genes with the most variable expression across all samples except MDX_1. CASK control (red, n=3) and KO (green, n=3); MDX control (orange, n=5) and KO (blue, n=4); SAP97 control (grey, n=3) and KO (black, n=3). (c) Distribution of GC content of the reads for each sample. (d) Base quality (Phred scores) along the length of the reads in each FastQC file of MDX_Ct1 as a representative sample. The box plots are drawn as follows: red line, median; yellow box, range between upper and lower quartiles; whiskers, range between 10 and 90% quantiles. The blue line shows the mean quality. Y-axis represents quality scores across all bases. X-axis represents position in read (bp). (e) Gene body coverage. Distribution of reads along the length of the genes (5’-end on the left, 3’-end on the right). The image shown, from sample MDX_Ct1, is representative of all samples. (f) Saturation report, depicting the number of splice junctions detected using different subsets of the data from 5 to 100% of all reads. Red, known junctions based on the provided genome annotation; green, novel junctions; blue, all junctions. The red line reaches a plateau where adding more data does not increase the number of detected junctions, indicating that the sequencing depth suffices for performing alternative splicing analysis.

Gene expression variation of biological replicates

We performed principal component analysis (PCA) to assess whether samples from the same experimental group have similar gene expression profiles (Fig. 3b). Of note, samples within each sample group still show considerable variation. The mixed genetic background of most sample groups may explain this variation; only the MDX control mice are on a pure Bl6/J background. The variation seen in MDX control mice is likely due to a batch effect, as two rounds of samples were sequenced. However, the PCA plots are based only on the 500 genes with the most variable expression across samples; our genes of interest, including all ion channel genes, show similar expression levels throughout all samples.

Ion channel expression

Based on the list of ion channel genes from the HUGO Gene Nomenclature Committee (https://www.genenames.org/cgi-bin/genefamilies/set/177), we extracted ion channel expression, expressed as TPM, from WT mice (Tables 1, 2 and 3 (available online only), Fig. 1).
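A minimal sketch of this extraction step is given below, assuming per-sample TPM values are held in a dictionary keyed by gene symbol. The function name `mean_tpm_for_genes`, the gene names and the values are hypothetical placeholders, not data from this study.

```python
from statistics import mean


def mean_tpm_for_genes(tpm_by_gene, genes_of_interest):
    """Average TPM across samples for each listed gene present in the data.

    tpm_by_gene:       dict mapping gene symbol -> list of TPM values (one per sample)
    genes_of_interest: iterable of gene symbols, e.g. an ion channel gene list
    """
    # Genes on the list that are absent from the data are skipped.
    return {
        g: mean(tpm_by_gene[g])
        for g in genes_of_interest
        if g in tpm_by_gene
    }


# Toy input (hypothetical gene names and TPM values).
tpm_by_gene = {
    "GeneA": [1.0, 2.0, 3.0],
    "GeneB": [10.0, 10.0, 10.0],
    "GeneC": [0.0, 0.0, 0.0],
}
channel_list = ["GeneA", "GeneC", "GeneX"]
channel_means = mean_tpm_for_genes(tpm_by_gene, channel_list)
```

Genes that appear on the list but not in the expression table, such as the placeholder GeneX here, are simply omitted from the result.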

Additional information

How to cite this article: Chevalier, M. et al. Transcriptomic analyses of murine ventricular cardiomyocytes. Sci. Data 5:180170 doi: 10.1038/sdata.2018.170 (2018).
