Background & Summary

Aging is a complex breakdown in the processes that facilitate organismal homeostasis. Importantly, aging has been shown to broadly impact the landscape of genomic regulation across tissues, sexes, and species1,2. This includes not only differences in canonical gene expression, but also in the expression of transposable elements (TEs)2,3,4,5,6,7,8. TEs are mobile repetitive genetic elements that are typically silenced in young tissues but become de-repressed with age. By examining how gene and TE expression changes with age, we can better understand the processes driving the aging process.

An important variable to consider when conducting any type of aging research are the myriad effects of biological sex9. For example, longevity is sex-dimorphic in humans in which females consistently outlive males10. The same trend is common across many animal species and appears to hold for most mammals11,12. Sex also affects the risk of developing age-related diseases with men at higher risk of coronary artery disease and women at higher risk of Alzheimer’s disease13,14,15. Sex may also influence aging through differential activity of TEs16,17. Indeed, TE de-repression was shown to correlate with decreased lifespan in transgenic flies with different copy numbers of the TE-rich Y chromosome18.

An emerging powerful model to study aging in vertebrates is the African turquoise killifish Nothobranchius furzeri19,20,21,22,23,24,25,26. The turquoise killifish is the shortest-lived vertebrate that can be bred in captivity, with a naturally short lifespan of 4–6 months. Moreover, it is relatively inexpensive to maintain compared to other traditional vertebrate model organisms (e.g. mice). Accumulating studies are using RNA-seq in the turquoise killifish to understand the effects of aging and aging interventions on many different tissues27,28,29,30,31. However, most of these studies have either focused on a single tissue, a single sex, or used a non-reference strain of turquoise killifish (e.g. MZM-0410)27,28,29,30,31. In addition, these studies have also focused on genic transcription, leaving little known about how TE transcription is regulated with aging in this species.

Here, we generated ribosomal-RNA depleted bulk RNA-seq datasets from young (6-weeks-old) and old (16-weeks-old) male and female GRZ strain turquoise killifish brain, heart, muscle, and spleen (n = 4–5 per sex) (Fig. 1a, Supplemental Table S1). We found strong age effects and mild sex-dimorphism in all sampled tissues. We performed differential gene expression in each tissue to identify genes and TEs regulated by age or by sex and observed that age is a larger driver of gene expression differences than sex in these tissues and conditions. Furthermore, we showed that TEs are highly expressed across tissues, even in a healthy context, and upregulated with age in the brain. Lastly, as a proof-of-principle, we perform gene ontology (GO) analysis to demonstrate a common aging signature across multiple tissues in the turquoise killifish, characterized by increased immune/inflammatory gene expression, consistent with previous findings in other species.

Fig. 1
figure 1

Experimental design and analytical pipeline. The experimental design used to generate our RNA-seq dataset. Brain, heart, muscle, and spleen were dissected from sets of 5 young female, young male, old female, and old male GRZ strain killifish. RNA was extracted, depleted of ribosomal reads, and sequenced. After sequencing, reads were mapped to a turquoise killifish genome reference and counted with TETranscripts. The ratio of TE to gene reads was compared for each library, groups were contrasted for similarity using PCA analysis, differential gene expression was run using DESeq 2, and gene ontology analysis was run using clusterProfiler.

Methods

Fish husbandry & tissue collection

Breeding, embryo collection, hatching, and fish husbandry followed standard protocols32. Fish were reared in the fish facility at MPI age under §11TSchG animal housing license No. 576.1.36.6.G12/18 and euthanized (license MPIa_Anzeige_RB.16.005) by anesthetic overdose (600 mg/L MS222 in system water) administered by trained personnel. Fish were dissected to extract the brain, heart, liver, muscle, and spleen which were immediately flash-frozen in liquid nitrogen and stored at −80 °C until use.

RNA isolation

For RNA isolation, frozen tissues (30–50 mg) were placed in MP biomedicals lysis matrix D tubes (CAT#6913500) filled with 1 mL of Trizol reagent (Thermo-Fisher), then homogenized using Benchmark BeadBug 6. Total RNA was purified using Direct-zol RNA Miniprep Plus Kit (Zymo cat# R2072) following the manufacturer’s instructions. RNA quality was assessed using high sensitivity RNA screen tapes (Agilent cat# 5067–5579, 5067–5580) on Agilent Tapestation 4200 to obtain the RNA Integrity Number (RIN). Samples with a RIN score of <4 were discarded, which excluded 8/20 liver samples including all old male samples. Due to the high number of samples that did not pass QC, which would compromise our ability to measure some of the biological groups, we chose not to proceed with liver RNA-seq library preparation.

RNA-Seq library preparation and sequencing

We used 40 ng of total RNA, which was subjected to ribosomal-RNA depletion using the RiboGone™ - Mammalian kit (Clontech cat# 634847) according to the manufacturer’s protocol. Strand specific RNA-seq libraries were then constructed using the SMARTer Stranded RNA-seq Kit (Clontech), according to the manufacturer’s protocol. Libraries were quality controlled using high sensitivity D1000 screen tapes (Agilent cat# 5067–5585, 5067–5603) on Agilent Tapestation 4200 before multiplexing the libraries for sequencing. Some samples were lost at this stage, as no library could be recovered, i.e. muscle sample 7 (young female) and spleen sample 8 (young female). Libraries that passed all QC steps were sequenced as paired-end 150-bp reads on the HiSeq X Ten platform at Novogene Corporation (USA).

Bioinformatic analysis

Adapter trimming and quality control

Raw reads were trimmed of adapters and low-quality reads were filtered using fastp version 0.23.233 with parameters “--failed_out fail_reads.out --detected_adapter_for_pe”. Raw reads and filtered reads were then quality-checked with Fastqc version 0.11.934 under default parameters. Multiqc version 1.1535 was used to summarize the Fastqc reports.

Mapping and counting reads

Filtered reads were mapped to killifish reference genome (GCA_014300015.1)36 that was softmasked with RepeatMasker version 4.1.2-p137 with Nothobranchius furzeri TE sequences obtained from FishTEDB38 (as described in17), using STAR version 2.7.0e39 with parameters “--outFilterMultimapNmax 200 --outFilterIntronMotifs RemoveNoncanonicalUnannotated --alignEndsProtrude 10 ConcordantPair–limitGenomeGenerateRAM 60000000000 – outSAMtype BAM SortedByCoordinate”. Multiqc version 1.1535 was used to summarize the alignment reports generated by STAR. Gene and TE count matrices were generated against killifish reference gene annotation and the TE annotation using TEtranscripts version 2.2.140 with parameter “--sortedByPos”.

To determine the ratio of reads mapped to introns and exons, we used featureCounts version 2.0.441 to summarize the number of reads mapped to the exon level and gene level with the killifish reference gene annotation, respectively. The number of intronic reads was determined by subtracting the sum of exonic reads from the sum of reads mapped to gene features42.

Transposable element read ratio

To determine the ratio of reads contributed by TE regions, the Tetranscripts summarized count matrices were imported into R version 4.3.043. The sum of reads mapped to TE features was divided by the total sum of reads in each tissue samples respectively. Non-parametric Mann-Whitney rank test was used to determine whether there was a statistically significant difference in TE ratio grouped by sex and age with ggpubr version 0.6.044 and false discovery rate [FDR] was reported to correct for multiple testing.

Differential gene expression analysis & transcriptional read correlation

The TETranscripts summarized count matrices were imported into R version 4.3.0 and differential gene expression analysis was conducted using DESeq 2 version 1.40.145 with sex and age as modeling variables. Normalized count matrices, variance-stabilized count matrices and differential gene expression result matrices were generated. Full list of differential analysis result by sex and age are provided (Supplemental Table S2). Transcriptome-wide correlation of reads mapped to gene and TE features was determined by assessing the pair-wise Spearman rank correlation between each sample pair. We also used principal component analysis on the variance-stabilized count matrices to determine the overall separation of samples across tissue types, as a function of age and sex. TE features were further classified into LINEs, SINEs, LTRs, DNA TEs, unclear, and unknown as provided in FishTEDB38. Unclear and unknown categories were collapsed under one single unknown category. The numbers of differentially expressed TE by age and sex within each category were reported for each of the four tissues.

Variance partition analysis

To determine the amount of variance that could be explained by sex and age, the variance-stabilized count matrices were first split into TE and canonical gene count matrix. R package variancePartition version 1.30.046 was used to determine the amount of variance explained by sex and age in TE and canonical gene count matrices respectively.

Gene ontology analysis

To determine the biological pathways that were significantly altered in aging and pathways that were implicated in sex dimorphism, we performed GSEA (gene set enrichment analysis) GO analysis47. As described in Teefy et al.17, turquoise killifish protein sequences from GCA_014300015.1 were aligned to the Ensembl release 104 human protein database using BLASTP48 (NCBI BLAST version 2.13.0). The top human protein sequence for each turquoise killifish hit was retained using a minimal E-value cutoff of 10−3 and used for GSEA. Although this E-value threshold may seem lenient, it is accepted for the comparison of species as evolutionary distant as the turquoise killifish and humans30,49,50. The results of the differential gene expression analysis with respect to sex and age were used as inputs for GSEA for each tissue. Killifish reference gene annotations were substituted with human homolog when possible and genes without human homologs that were able to pass the BLASTP filter were discarded. When multiple genes map to the same human homolog, the log-two-fold change were averaged. Genes were then sorted in a decreasing order with respect to the log-two-fold change. GSEA was performed using the R package clusterProfiler version 4.8.151 and human gene ontology database org.Hs.eg.db version 3.17.052. GSEA was run using a minimum gene set of 25 terms and a maximum gene set of 5,000 terms using an FDR threshold of 5%. Full list of GSEA results by tissues with respect to age and sex are provided (Supplemental Table S3).

Data Records

Sequencing data was submitted to the Sequence Read Archive and is accessible through SRA accession SRP430823 (Transcriptional profiling of aging tissues from African turquoise killifish)53. Accession for individual samples is provided in Supplemental Table S1.

Technical Validation

Experimental design and quality control

We generated ribosomal RNA-depleted RNA-seq libraries from the brain, heart, muscle, and spleen from young (6-weeks-old) and old (16-weeks-old) male and female GRZ strain turquoise killifish starting from 5 fish per biological group (Fig. 1a,Supplemental Table S1). Importantly, each euthanized fish contributed all profiled tissues to minimize the number of subjects, as well as ultimately to potentially identify transcriptional signatures common to particular subjects across multiple tissues (i.e. brain, heart, muscle, and spleen samples from individual GRZ-AD_8240; Supplemental Table S1). Due to library construction failure, one young female muscle sample and one young female spleen sample were not sequenced (Supplemental Table S1).

We began to assess library quality by analyzing the number of reads in each library (Fig. 2a, Supplemental Table S4). Each library had roughly the same number of counts with no systemic bias towards any groups. Of note, the brain libraries consistently had the fewest total reads, although read counts were comparable across brain libraries. Next, we performed FastQC analysis using the MultiQC tool on each RNA-seq library to determine the mean quality scores for each sample (Fig. 2b). Quality scores for each library consistently had a Phred score > = 33 for the length of the read thereby, showing we generated high-quality RNA-seq libraries.

Fig. 2
figure 2

Quality control metrics for RNA-seq libraries of African turquoise killifish tissues. (a) Boxplot of raw log2 counts from each library. Count distributions are similar between all replicates indicating unbiased sample preparation. (b) Representative output from MulitQC showing the Phred score for each library across the length of the reads. Each sample shows high read quality with a Phred score that is >33 for the majority of the read length. (c) Barplot of percentage of uniquely mapped reads and multi-mapped reads from each library. Consistent total mapped percentage indicates high-quality mapping. (d) Correlation plot of all RNA-seq libraries. Libraries common to each tissue tend to correlate extremely tightly. The brain appears have the most distinct transcriptional profile of all tissues in our dataset.

After confirming high-quality libraries, we mapped each RNA-seq library to a recently published killifish genome version36 that was softmasked with turquoise killifish TE sequences from FishTEDB38. First, we measured the intronic/exonic read ratio using featureCounts and observed that roughly half of all reads were intronic and half were exonic. The brain and spleen had the highest ratio of intronic reads while the heart and muscle had the lowest (Supplemental Table S5). Importantly, overall mapped fractions (including both uniquely mapped and multi-mapped reads) were high and consistent across libraries (>90%), although some libraries had higher multi-mapping rates (Fig. 2c). Interestingly, multi-mapping reads are likely to stem from repetitive regions of the genome (including TEs), which represent a large portion of the African turquoise killifish genome54.

Next, to capture both gene and TE counts, we generated read counts for genes and TEs using TETranscripts, as in Teefy et al.17. After generating count matrices consisting of gene and TE counts, we normalized reads in DESeq 2 and created transcript expression correlation maps between libraries (Fig. 2c). We found that samples clustered tightly by tissue, consistent with strong expected tissue-specific transcript expression. To assess transcriptional similarity of various samples in each tissue, we performed principal component analysis (PCA) on each count matrices normalized with the Variance Stabilizing Transformation in DESeq 2 (Fig. 3a–d). In all tissues, transcript expression tended to segregate mostly by age, with a lesser secondary separation by sex. To quantify how much transcript expression variation in each tissue could be explained by age and sex, we used “variancePartition” (Fig. 3e–f). Interestingly, in each tissue, age accounted for more variance in gene expression than sex for both genes (Fig. 3e) and TEs (Fig. 3f).

Fig. 3
figure 3

PCA plots of African turquoise killifish tissue transcriptomes as a function of aging and sex. (ad) PCA plots of transcript expression highlighted by group (young female, young male, old female, old male) for (a) brain (b) heart (c) muscle (d) spleen. In each tissue besides the spleen, transcript expression segregates by age along PC1. In the spleen, samples still separate by age but primarily along PC2. (e) Variance in gene expression explained by age and sex in each tissue. In each tissue, age explains more of the variance in gene expression relative to sex. (f) Variance in TE expression explained by age and sex. In each tissue, age explains more of the variance in TE expression relative to sex.

Differential transcription by age and sex

To assess the quality and useability of the dataset, we next performed differential gene expression analysis using DESeq 2, starting by using only genes, and then with TEs (see below; Supplemental Table S2). We used a combined differential expression model with animal age and sex as modeling covariates. Using a significance threshold of FDR ≤ 5%, we identified substantial age-related gene expression changes with 3611, 4910, 5077, and 2195 differentially expressed genes in brain, heart, muscle, and spleen, respectively (Fig. 4a). These numbers are consistent with the number of differentially expressed genes with aging observed in previous transcriptomic studies of aging in this species with single tissues, single sex and/or in a non-standard strain17,31,55. In agreement with our PCA analysis, we find fewer genes with sex-dimorphic expression in each tissue with 0, 429, 30, and 13 differentially expressed genes between females and males in the brain, heart, muscle, and spleen, respectively (Fig. 4b).

Fig. 4
figure 4

Differential gene expression analysis of African turquoise killifish tissue transcriptomes as a function of aging and sex. (a) Strip plot showing the number of differentially expressed genes by age (FDR ≤ 5%) with the number of significantly differential genes in parentheses. In each tissue, thousands of genes are differentially expressed by age. The muscle was the tissue most affected by age at the transcriptional level with 5,077 differentially expressed genes. Brown denotes genes upregulated in old tissues, yellow denotes genes upregulated in young tissues, gray denotes non-significant differences in gene expression between ages. (b) Strip plot showing the number of differentially expressed genes by sex (FDR ≤ 5%) with the number of significantly differential genes in parentheses. Fewer genes are differentially expressed by sex than age in all tissues assayed. The heart had the most sex-dimorphic gene expression with 429 genes differentially expressed by sex. Pink denotes genes upregulated in female tissues, blue denotes genes upregulated in male tissues, gray signifies genes with no significant expression differences between sexes.

Next, we analyzed the differential expression of TEs in these tissues. We found that as a percentage of mapped reads in each library, reads mapping to TEs ranges varied strongly by tissue, with ~50% of all reads in brain libraries mapping to TEs and only <20% of reads mapping to TEs in muscle libraries (Fig. 5a). TEs were more differentially expressed by age rather than sex with 897, 706, 291, and 114 differentially expressed TEs in brain, heart, muscle, and spleen, respectively. Most tissues had an approximately equal proportion of up- and down-regulated TEs except the brain, which showed a strong bias for TE upregulation with age (Fig. 5b). Like genes, TEs had more limited sex-dimorphic expression compared to age-related expression with only 15, 14, 0, and 1 differentially expressed TEs between sexes in brain, heart, muscle, and spleen, respectively (Fig. 5c). Brains had the most amount of differentially expressed TEs by sex and by age (Fig. 5b,c). When TEs were segmented into their respective subfamilies, LINE TEs were the most upregulated TE family in both the aging brain and in female brains (Fig. 5d,e).

Fig. 5
figure 5

Transposable element quality control and differential expression analysis. (a) Boxplot depicting the relative proportions of counts attributed to genes and TEs in each library. The brain contains the most reads attributed to TEs by proportion with approximately half of all reads mapping to TEs while muscle has very few reads mapping to TEs. Significantly more reads map to TEs in old brains relative to young brains as measured by Wilcoxon test. Brown denotes TEs upregulated in old tissues, yellow denotes TEs upregulated in young tissues, gray denotes non-significant differences in gene expression between ages. (b) Strip plot of differentially expressed TEs by age (FDR ≤ 5%) with the number of significantly differential TEs in parentheses. There is a substantial bias for TE upregulation in old brains. (c) Strip plot of differentially expressed TEs by sex (FDR ≤5%) with the number of significantly differential TEs in parentheses. There are far fewer TEs differentially expressed by sex compared to age without an obvious bias towards any sex. Pink denotes TEs upregulated in female tissues, blue denotes TEs upregulated in male tissues, gray denotes non-significant differences in gene expression between sexes. (d) Barplot of differentially expressed TEs by age (FDR ≤ 5%) segmented by TE family type within each of the four tissues. Proportion of the bar above 0 indicates the number of TEs upregulated in old fish and proportion of the bar below 0 indicates the number of TEs upregulated in young fish. (E) Barplot of differentially expressed TEs by sex (FDR ≤ 5%) segmented by TE family type within each of the four tissues. Proportion of the bar above 0 indicates number of TEs upregulated in females and the proportion of the bar below 0 indicates the number of TEs upregulated in males.

Lastly, we performed gene set enrichment analysis (GSEA) using gene ontology (GO) functional categories (using homology mapping from human annotations), to determine whether our dataset was amenable to this type of analysis (Supplemental Table S3). GO enrichment analysis was performed in each tissue, to determine enrichment as a function of age (Fig. 6a) and as a function of sex (Fig. 6b). As reported in previous aging ‘omic’ studies across animal taxa2,56, at least one immune-related term was enriched in aged tissues compared to young tissues (Fig. 6a), consistent with the notion of “inflamm-aging”. Importantly, young muscle also showed an enrichment of cell-cycle gene transcription, which may reflect more active or abundant muscle stem cells. All tissues displayed enough transcriptional sex-dimorphism to have at least 5 significantly enriched GO terms per sex, except for the female spleen, which only showed increased interferon production relative to the male spleen (Fig. 6b).

Fig. 6
figure 6

Gene Ontology enrichment analysis across African turquoise killifish tissues as a function of sex and age. (a) Top GO terms enriched in aging in the brain, heart, muscle, and spleen. The top 5 GO terms that are enriched in aged tissue (top, red) and the top 5 terms that are enriched in young tissue (bottom, off-white) are shown and ordered by significance. All tissues have at least one term associated with immunity enriched with age (bold). (b) Top GO terms enriched in each sex in the brain, heart, muscle, and spleen. The top 5 GO terms enriched in females (top, pink) and top 5 GO terms in males (bottom, blue) are shown and ordered by significance. The spleen had the least female-biased sexual dimorphic gene expression with only one term significantly enriched (FDR ≤ 5%).

Usage Notes

This dataset can be used to find differences in gene and TE expression using age and sex as variables in any combination suitable to the user. In addition, to facilitate the exploration of this dataset, we have deployed a user-friendly searchable database of differential gene and TE expression results, with human homology information, that can be mined by the community (https://alanxu-usc.shinyapps.io/nf_interactive_db/). The dataset could also be deconvoluted using single-cell atlases to establish cell composition profiles and analyze how cell type frequencies change with age and sex in each tissue.

Since the dataset was generated using ribosomal RNA depletion rather than polyA enrichment, it should also be possible to analyze RNA species other than canonical mRNAs, including circRNAs57 transcribed by RNA pol III, which typically lack polyadenylation58,59.

Limitations of this dataset are that, like most aging -omic studies outside of consortia efforts, it uses only 2 timepoints, which limits ability to enable detection of specific changes at middle-age17. Future transcriptomic studies of female vs. male turquoise killifish focusing on specific tissues may benefit from increased time resolution. The study also only looks into limited somatic tissues, namely brain, heart, muscle and spleen. Future studies including additional somatic tissues will be useful to expand our knowledge of sex-differences in aging turquoise killifish tissues. In addition, TE quantification may be partially driven by TE-derived intronic reads that are retained by ribosomal RNA-depleted RNA-seq library preparation60. In effect, this dataset cannot distinguish between intronic-derived TEs and autonomous TEs, which are regulated in a different fashion, although both may contribute to biological changes. Nonetheless, this dataset is useful in determining the total amount and class of TE reads present in young and old tissues across sexes.