Introduction

Maintaining energy homeostasis is essential for cells and biological organisms to survive and thrive. Throughout most of human history, perturbations to energy metabolism were due to starvation that stunted growth and development1, 2, while in modern populations metabolic health is challenged by sedentary life style, excess adiposity and ageing3,4,5. There is evidence that both energy extremes involve the same cellular processes that maintain energy homeostasis6,7,8 and that these disruptions may be important drivers for common diseases such as diabetes and cancer9,10,11,12. Autophagy is one such process. It is responsible for recycling cellular materials into energy resources during periods of nutrient deprivation13,14,15, but it also has an important role in maintaining the optimal composition of cellular organelles during periods of abundance16. Importantly, autophagy is affected by ageing17 and impaired autophagy in the ageing brain, in particular, may be an important risk factor for Alzheimer’s and Parkinson’s diseases18,19,20,21. For these reasons, our long-term goal is to understand how autophagy and energy metabolism are regulated in human cells and to use this new fundamental knowledge towards new treatments for age-associated diseases.

Previous studies on autophagy regulation have revealed multiple pathways and genes22,23,24, of which mammalian target for rapamycin (mTOR) and transcription factor EB (TFEB) are the best characterized25, 26. A specific sequence, the coordinated lysosomal expression and regulation (CLEAR) motif seems to be the preferred DNA binding target for TFEB and its transcription factor family27 and it may represent a key mechanism by which external conditions (e.g. starvation) exert a cascade of adaptation through mTOR, TFEB and the promoters of downstream autophagy genes25. As the name implies, the CLEAR motif is present in the promoters of lysosomal genes. This is important because the lysosome is the end-terminal of autophagic cascades28 and it is responsible for the final degradation and recycling of materials including the two Alzheimer proteins, tau and amyloid-beta, that accumulate in the brains of affected individuals21, 29, 30. Lastly, we and others have identified genetic associations between autophagy and dementia18, 31, 32. These findings motivated us to explore the transcriptional responses associated with starvation-induced increase in autophagy and to investigate potential links between these responses and neuro-degenerative processes.

The aim of this study was to characterize how the transcriptome changes in response to starvation or mTOR inhibition in model systems where we also see responses in autophagy. We used genetically engineered human cells where we could confirm the changes in autophagic flux into the lysosome; this sets the experiments apart from previous work. Furthermore, we applied RNA sequencing at multiple time points and three cell lines to achieve robust systems-level understanding of which genes are reproducibly affected. The multi-faceted study design makes our study different from previous RNA-seq profiling experiments. Across the different cell line/treatment combinations, we report unexpected associations between differentially expressed genes and autophagic flux and characterize universal expression patterns and their predicted driver genes that overlap with neuro-degenerative disease processes.

Results

Overview of transcriptome responses

We collected RNA-seq data at baseline and after two interventions (mTOR inhibition or amino-acid starvation) in three monoclonal cell lines (HeLa, HEK 293 and SH-SY5Y). Initially, 12 differential expression analyses were conducted (2 time points × 2 treatments × 3 cell lines = 12 analyses). The nature of our experimental design resulted in thousands of DE genes for each experiment. We observed that there was a significant overlap of DE genes detected at the 15 h and 30 h time points for each experiment (mean 53.4%). Therefore, to stratify our results into a manageable dataset, we included only genes that not only were DE at both time points, but that were also directionally concordant in their expression over the time points (i.e. either both upregulated or both downregulated). For a single estimate of fold-change, we used the mean log2 fold change across both time points. Hence the final set of results comprised six separate DE listings (3 cell lines × 2 treatments × 1 combined time point). The resulting six lists of differentially expressed (DE) genes were used for further analyses and we refer to them as the six DE “experiments” throughout the text (Supplementary Tables S1S6).

A total of 16,506 genes were detectable in at least one cell line and 11,202 (67.9%) were detectable in every cell line (Fig. 1A). We observed 8914 DE genes due to starvation in at least one cell line, of which 456 (5.1%) were classified as DE genes in every cell line (Fig. 1B). We also observed 6,226 DE genes due to mTOR inhibition in at least one cell line; 285 (4.6%) of these were classified as DE genes in every cell line (Fig. 1C). We identified 5672 DE genes associated with starvation or mTOR inhibition that were up-regulated in at least one cell line (Fig. 1D, inconsistent DE genes that were significantly up-regulated in one cell line but significantly down-regulated in another were excluded). Of these, 1541 (27.2%) genes were shared by both treatments. Lastly, we identified 5,543 down-regulated genes of which 1741 (31.4%) were shared between treatments (Fig. 1E).

Figure 1
figure 1

Overview of differentially expressed (DE) genes. (A) Genes were considered detectable if there were > 1.5 counts per million in > 3 samples out of all samples from the same cell line. (B) Genes that were DE between starved and control samples in at least one cell line. (C) Genes that were DE between mTOR inhibited and control samples in at least one cell line. (D) We collected DE genes associated with starvation or mTOR inhibition that were up-regulated in at least one cell line (inconsistent DE genes that were significantly up-regulated in one cell line but significantly down-regulated in another were excluded). (E) Down-regulated DE genes associated with starvation or mTOR inhibition.

Differentially expressed genes

The patterns of DE signals are summarized in Fig. 2 and more detailed expression differences and replication for the highlighted genes are available in Supplementary Tables S1S6 and Supplementary Figs. S4S18. P-values were adjusted for FDR as described in “Materials and methods” (PFDR). For each plot, genes were ranked according to the maximum FDR rule: first, we determined the maximum PFDR for each gene across the relevant experiments. Genes were then sorted according to maximum PFDR. Lastly, genes that were significantly up-regulated in one experiment, but down-regulated in another were excluded to maintain directional concordance.

Figure 2
figure 2

Top 25 differentially expressed (DE) genes based on the maximum FDR rule. Genes mentioned in the main text are highlighted for easier visual localization. (A) Genes were sorted according to the maximum FDR-adjusted P-value across six experiments. Discordant genes that were significantly (PFDR < 5%) up-regulated in one and down-regulated in another experiment were excluded. (B) Genes were sorted according to the maximum PFDR across all starvation experiments. We also required that all starvation responses were directionally concordant and that the mean log2 fold change across mTOR experiments was in the opposite direction. (C) Genes were sorted according to the maximum PFDR across all mTOR inhibition experiments. We required that all mTOR inhibition responses were directionally concordant and that the mean log2 fold change across starvation experiments was in the opposite direction. (D) Genes were sorted according to the maximum PFDR across responses in the SH-SY5Y cells. Missing signals were set to zero log2 fold change in other cell lines. We also required that the mean log2 fold changes in other cells were in the opposite direction to SH-SY5Y responses.

The gene with the lowest maximum P-value across all experiments (Fig. 2A) was LETM1 Domain Containing 1 (LETMD1, PFDR ≤ 5.0 × 10−7, mean logFC = 1.3, involved in phagocytosis, Supplementary Fig. S4). The greatest increase in relative expression was observed for a cancer-associated lncRNA that may inhibit autophagy (Small Nucleolar RNA Host Gene 7 or SNHG7, PFDR ≤ 0.0041, mean logFC = 2.0, Supplementary Fig. S5). Genes that were down-regulated included a member of the PI3K family (Phosphoinositide-3-Kinase Regulatory Subunit 3, PIK3R3, PFDR ≤ 5.6 × 10−6, mean logFC =  − 1.6, Supplementary Fig. S6), an L-amino acid transporter (Solute Carrier Family 43 Member 2, SLC43A2, PFDR ≤ 0.0029, mean logFC =  − 1.7, Supplementary Fig. S7) and Aldolase Fructose-Bisphosphate C (ALDOC, PFDR ≤ 0.00015, mean logFC =  − 1.6, a glycolysis gene associated with Alzheimer’s disease33, Supplementary Fig. S8).

Top 25 genes ranked according to starvation response are shown in Fig. 2B. The smallest maximum P-value across starvation experiments was observed for Activity Regulated Cytoskeleton Associated Protein (ARC, PFDR ≤ 0.00074, mean logFC = 1.6, associated with memory and cognitive disorders34, Supplementary Fig. S9). The greatest increase in expression was observed for G Protein Subunit Beta 1 Like (GNB1L, PFDR ≤ 0.0074, mean logFC = 1.1, associated with neurological disorders35, Supplementary Fig. S10). The top down-regulated gene was a transcriptional co-repressor involved in photoreceptor degradation and possibly autism (Sterile Alpha Motif Domain Containing 11, SAMD11, PFDR ≤ 8.3 × 10−5, mean logFC =  − 1.936, Supplementary Fig. S11). Arrestin Domain Containing 3 (ARRDC3, PFDR ≤ 0.00097, mean logFC =  − 1.8, Supplementary Fig. S12) was also down-regulated and it is involved in endocytic recycling37 and lysosomal degradation of receptors38.

Genes that were specifically affected by mTOR inhibition are shown in Fig. 2C. Upregulated genes included CAMP Responsive Element Binding Protein 3 Like 4 (CREB3L4, FDR ≤ 0.0025, mean logFC = 0.86, a transcription factor involved in glucose and lipid metabolism, Supplementary Fig. S13). Of the down-regulated genes, Methionyl-TRNA Synthetase 1 (MARS) had the smallest FDR (FDR ≤ 1.3 × 10−8, mean logFC =  − 1.2, involved in alveolar disease, Supplementary Fig. S14). The greatest decrease in expression was observed for Solute Carrier Family 6 Member 9 (SLC6A9, FDR ≤ 3.1 × 10−5, mean logFC =  − 2.0, Supplementary Fig. S15) which is a glycine transporter associated with Alzheimer’s disease39. Of note, we observed an mTOR-specific pattern among the top 25 in HEK 293 and HeLa cells, but not in SH-SY5Y.

We isolated responses specific to SH-SY5Y in Fig. 2D by applying the maximum FDR rule to the two SH-SY5Y experiments only (see details in figure caption). Piwi Like RNA-Mediated Gene Silencing 2 (PIWIL2, a piRNA [Piwi-interacting RNA] regulator of autophagy and apoptosis40) exhibited the smallest maximum P-value (PFDR ≤ 4.4 × 10−6, mean logFC = 2.1, Supplementary Fig. S16). The top-ranked down-regulated gene was Collagen Type I Alpha 2 Chain (COL1A2, PFDR ≤ 1.4 × 10−10, mean logFC =  − 2.9, a structural component of collagen, Supplementary Fig. S17). Membrane Bound O-Acyltransferase Domain Containing 4 (MBOAT4), which stimulates autophagy41, was also among the most down-regulated genes (PFDR ≤ 5.2 × 10−10, mean logFC =  − 2.8, Supplementary Fig. S18).

Canonical pathways enriched for differentially expressed genes

In the previous section, we observed multiple genes associated with autophagy and neurodegeneration, however, a single DE gene in isolation does not reveal the wider biological consequences. To gain more robust insight into the biological impact, we investigated i) if genes from a known biological pathway were over-represented among DE genes and ii) if DE genes were likely to perturb a known pathway when considering the gene–gene interactions within the pathway. Selected results are depicted in Fig. 3 and full statistics are available in Supplementary Tables S7S18.

Figure 3
figure 3

Enrichment of differentially expressed genes in the Kyoto Encyclopedia of Genes and Genomes pathway repository. (A) Over-representation analysis of DE genes. Pathways that produced a significant signal (PFDR < 0.05) in at least one experiment are shown. (B) Normalized perturbation scores from Signaling Pathway Impact Analysis. A negative (positive) score implies that the aggregate impact of DE genes is likely to decrease (increase) the activity of a pathway. Pathways that were directionally concordant (all significant signals in the same direction) and that produced at least three significant signals (PFDR < 0.05) are included.

We identified 31 KEGG pathways over-representing DE genes in at least one of the six experiments (Fig. 3A). No pathway was significant in every experiment. The most consistent over-representation signals included Parkinson’s and Alzheimer’s diseases and Amyotrophic lateral sclerosis that were significant in four out of six experiments (highlighted in Fig. 3A). Significant signals were also observed for Metabolic pathways, Cell cycle, Biosynthesis of amino acids, Carbon metabolism and Fluid shear stress and atherosclerosis.

Perturbation tests revealed multiple pathways that were likely to be activated or inhibited by the changes in gene expression (Fig. 3B). We observed a directionally consistent activation of the KEGG Alzheimer’s disease pathway (perturbation z-scores between + 2.15 and + 7.63) and Ferroptosis (z-scores between + 1.81 and + 3.27) across all six experiments (highlighted in Fig. 3B). We also observed unexpected but consistent association for inhibition of the Autophagy pathway (z-scores between − 5.3 and − 1.58). HIF-1 signaling was predicted to be inhibited across all experiments. Multiple immune-system pathways such as Antigen processing and presentation were also predicted to be inhibited due to the DE patterns. Full results are available in Supplement Tables S13S18.

Transcription factor target gene sets enriched for differential expression

Genome-wide regulatory processes are not yet fully understood and may be missed by existing pathway definitions. For this reason, we repeated the over-representation analysis from the previous section but replaced the KEGG pathways with transcription factor target (TFT) gene sets (defined in “Materials and methods”).

We observed 30 TFT gene sets that were over-represented amongst DE genes in every experiment. Based on the literature, TFEB may be a key regulator of autophagy25 and, as expected, the predicted TFEB targets were over-represented amongst DE genes associated with autophagy also in our study (highlighted in Fig. 4A, Supplementary Tables S19S24). Noteworthy signals related to neurological health include the MORC Family CW-Type Zinc Finger 2 set (MORC2 is associated with multiple neurological conditions), the Senataxin set (SETX, also known as Amyotrophic lateral sclerosis 4 protein), the THAP Domain Containing 1 set (THAP1 is associated with the neurodevelopmental disease dystonia 6) and SPT16 Homolog set (SUPT16H is associated with neurodevelopmental problems).

Figure 4
figure 4

Enrichment of differentially expressed genes in transcription factor target (TFT) sets. (A) Over-representation analysis of DE genes. Pathways that produced a significant signal (PFDR < 0.05) in at least one experiment are shown. (B) DE enrichment within TFT sets in SH-SY5Y cells but not in other cells.

We identified 22 gene sets that were enriched for DE genes in both SH-SY5Y experiments but not in other cells (Fig. 4B); 20 belonged to the same E2F family that shared most of their target genes (note also E2F2 in Fig. 4A). The strongest signal was observed for the Hydroxysteroid 17-beta Dehydrogenase 8 (HSD17B8) gene set (PFDR ≤ 7.9 × 10−16). Full results are available in Supplementary Tables S19S24.

Transcription factors as mediators between differential expression and canonical pathways

To compare the pathway and TFT responses, we first identified shared genes between a KEGG pathway and a TFT set, and then calculated the perturbations scores for this shared subset of genes. We observed ten pairs of transcription factors and KEGG pathways that satisfied PFDR < 0.05 across every experiment—all of them were predicted to have increased activity due to the DE pattern and seven of them were pairings with KEGG neuro-degenerative diseases (Fig. 5A, full results in Supplementary Tables S25S30). Moreover, three signals involved the THAP1 gene set (paired with Alzheimer’s disease, ALS and viral infection) and four involved the SETX gene set (paired with Alzheimer’s, Parkinson’s, Huntington’s and ALS).

Figure 5
figure 5

Combined perturbation analysis of canonical pathways and TFT sets. First, we identified DE genes that were shared between a KEGG pathway and TFT sets. Then, we used Signaling Pathway Impact analyses to test if the shared genes would impact the activity of the KEGG pathway. Therefore, the perturbation scores are predictions on the potential regulatory effects differentially expressed transcription factor target genes will have on canonical pathways. (A) TFT-pathway pairs that showed directionally consistent and significant (PFDR < 0.05) perturbation scores across every experiment. (B) TFT-pathway pairs that showed directionally consistent and significant (PFDR < 0.05) perturbation scores in the two experiments on SH-SY5Y cells but no significant signals in other cells.

We also found 32 pairs of TFT sets and pathways that were specific to SH-SY5Y (Fig. 5B). Notably, the perturbation scores were mostly negative, which suggests that the DE of the TFT sets may result in the inhibition of these pathways. Exceptions included SETX and Dopaminergic synapse (PFDR ≤ 0.00021), and multiple pathways perturbed by the E2F family of transcription factors, such as E2F and Alzheimer’s disease (PFDR ≤ 0.0065).

Discussion

To investigate the transcriptional regulation of autophagy in living human cells, we induced autophagic flux by amino acid starvation and mTOR inhibition and investigated the transcriptional responses in HeLa, HEK 293 and SH-SY5Y cell lines. Increase in autophagic flux was confirmed by a tandem-fluorescent LC3 assay (tf-LC3) and gene expression was quantified by RNA sequencing. We found that the KEGG autophagy pathway was inhibited at 15 h and 30 h after treatment, while pathways associated with neuro-generative diseases were activated. In particular, our results suggest that transcription target genes assigned to SETX and E2F may represent important regulatory mediators that connect energy metabolism, autophagy and cellular stress with Alzheimer’s and Parkinson’s diseases.

Previous literature and the tf-LC3 assay used in this study show how autophagic flux increased in response to starvation or mTOR inhibition15, 26, 42. We chose the time points of 15 h and 30 h based on time-series experiments to capture the inflection and saturation points of the autophagy response. A recent report on the dynamics of autophagy suggests that autophagy responses with respect to vesicular flux start within 10 min of treatment and saturate by 15 h43. In the first phase, mTOR Complex 1 inhibition by rapamycin induces an increase in autophagosomes which represents the initial packaging of molecular cargo into vesicles. Next, the autophagosomes fuse with lysosomes to form autolysosomes. Lastly, the autolysosomes degrade and the contents are recycled. The authors found that these three stages reached a steady state by 15 h where the numbers of autophagosomes and autolysosomes stabilize. Other studies have also demonstrated autophagic flux is still supported at late time points such as 8 h and 24 h in mouse embryonic fibroblasts44, and 48 h in HeLa cells, as demonstrated by measurement of LC3-II with and without a lysosomal inhibitor drug45. Our results from the tf-LC3 assay, which tracks the proportion of LC3 within acidic autolysosomes, are compatible with these findings, although we observed stabilization at 30 h rather than 15 h in most cases.

Against our expectations, we observed significant inhibition of the KEGG Autophagy pathway and possible autophagy regulators such as ARRDC3 and PIKR3 in the RNA-seq data. Moreover, putative autophagy inhibitors LETMD1 and SNHG7 were up-regulated. Extrapolation of the dynamic autophagy process to transcriptional regulation may explain the negative DE we observed. We did not see substantial changes in gene expression at 1 h, which means that there was limited if any immediate transcriptional response associated with the initial increase in autophagosomes. On the other hand, by 15 h the transcriptome was responding to the nutrient deprivation, while the tf-LC3 assay was starting to level off. It is plausible that expressing autophagy genes at 15 h onward may become less of a priority for the cell and the relative expression of the pathway is subsequently decreased.

SETX was first discovered via ataxia-associated mutations in a human homolog of the yeast gene Sen146 and numerous additional mutations have since been reported that are associated with a rare type of ALS47, 48. Initial studies showed that SETX helps to remove unintended DNA-RNA hybrid molecules (R-loops) that would otherwise promote genomic instability49, 50. Interestingly, recent evidence indicates that SETX may be an important regulator of autophagy, especially with respect to the removal of stress granules that form when a cell is starved or under other types of environmental pressure49. For example, Richard et al. investigated a SETX knock-out51 and reported that “SETX depletion inhibits the progression of autophagy, leading to an accumulation of ubiquitinated proteins, decreased ability to clear protein aggregates, as well as mitochondrial defects” which describes most neuro-degenerative diseases with features of proteinopathy. In another study, Bennet et al. induced SETX over-expression that disrupted the cell cycle of HEK 293 cells and they concluded that neurons due to their long RNA transcripts (i.e. propensity for R-loops) may be particularly vulnerable if SETX expression is outside the optimal range52.

We observed up-regulation of genes that were implicated in Alzheimer’s and Parkinson’s disease, respectively, and predicted to be downstream targets of SETX (Fig. 5A). On the other hand, SETX itself was not differentially expressed, which could be the result of tightly controlled expression range or transient expression patterns that are characteristic of transcription factors. Given the generic nature of our transcriptome findings, further studies of SETX may benefit from expanding the focus from ataxia and ALS to other types of neuro-degenerative diseases and focusing on energy restricted cellular milieu that may be characteristic to an ageing brain.

The E2F family of transcription factors is implicated in the regulation of energy metabolism, adipose tissue, obesity and growth in general53,54,55,56 and our results from starved and mTOR-inhibited cells fit this picture well. The first family member, E2F1, is the most extensively studied. E2F1 binds with retinoblastoma protein to induce autophagy in cancer cells56 and, inversely, E2F1 knockout inhibits autophagy to increase brown fat formation54. In Drosophila, E2F1 enables the regulation of TOR Complex 1 independent of insulin or amino acid pathways55 and interacts with the cell cycle in a biphasic manner to promote organismal growth53. The E2F1 protein may be up-regulated in people with Down’s syndrome and amyloid-beta deposition57.

The E2F signals we observed are most likely explained as universal consequences of energy restriction across cell types. In the neuroblastoma cell line, the E2F-targeted portions of cancer pathways, thermogenesis, mTOR signaling, insulin signaling, and circadian entrainment were all inhibited (Fig. 5B), as one would expect based on the previous research on E2F1. On the other hand, ALS and Alzheimer’s disease pathways were predicted to be activated. Our data cannot reveal causal relationships, but we speculate that E2F transcription factors respond to age-associated metabolic dysfunction and may subsequently trigger neuronal apoptosis58, 59. There is evidence that inducing E2F1 and E2F2 may help maintain genomic stability in neurons under toxic conditions60 while other experiments showed that reducing E2F1 in mice improved the survival of dopaminergic neurons59. Given these complex and contradictory findings, additional research into the exact roles of each E2F family member in relation to human tissues is warranted.

The inclusion of three cell lines, three time points and two conditions provide statistical and biological robustness to our findings. HEK 293, HeLa and SH-SY5Y cells are established platforms for experimental studies and grow predictably in standard conditions, which helped us to maintain high consistency between cultures. On the other hand, these immortalized cells may differ substantially from human cells in situ and the interventions we chose are beyond the typical physiological stresses most cells would encounter. Hence, we caution against over-reaching conclusions about possible therapeutic targets among the top DE genes. These data should not be interpreted as causal evidence. Instead, these data should be interpreted as further evidence on the associations between energy metabolism, autophagy machinery and neuro-degenerative diseases, whereas the exact causal mechanisms may be highly dependent on the cell type or on an individual’s genetic profile.

The use of immortal cell lines allowed us to optimize monoclonal cultures that expressed the tf-LC3 construct. This was important to achieve a high signal-to-noise ratio for the fluorescence assay for autophagic flux. Furthermore, the technical quality and depth of the RNA-seq data were high and we used additional permutation tests to verify signals beyond the original pathway tools. For these reasons, we are confident that the analytical quality of the study is high. Of note, the RNA-seq data from this study are publicly available (Annotare accession code E-MTAB-12020) and will contribute to the existing collection of datasets on autophagy gene signatures24, 61.

In conclusion, we conducted an experimental study to characterize transcriptomic changes associated with autophagy in three human cell lines. Our setup was not optimized for neuro-degenerative diseases beyond the neuronal SH-SY5Y cells, yet to our surprise we identified an enrichment of differentially expressed genes in Alzheimer’s and Parkinson’s disease pathways that emerged from the RNA-seq data. This re-enforces the idea that autophagy and energy metabolism are intrinsically involved in these major human diseases, however, further mechanistic work is required to identify which parts of the transcriptomic response come from which pathways, and if the genes we identified can drive these responses.. Furthermore, we identified SETX and the E2F transcription factor family as potential mediators between transcriptional regulation of autophagy and neuro-degenerative conditions.

Materials and methods

Study design

The experimental part of the project comprised four subcomponents: (i) an assay for autophagic flux, (ii) the selection and culture of cell lines, (iii) time series design for mTOR inhibition and starvation, and (iv) final experiments for transcriptomic analysis. Firstly, we used a tandem fluorescent LC3 (tf-LC3) assay to measure autophagic flux. LC3 is a core protein component of the autophagosome membrane that is eventually incorporated into the lysosome at the end of the vesicular autophagy pipeline42. The tf-LC3, which comprises the LC3 protein fused to a red fluorescence protein and a pH-sensitive green fluorescence protein, is incorporated into the autophagosome in the same manner as the native LC3. Once the tf-LC3 proteins reach the acidic interior of the lysosome the pH-sensitive green fluorophore is quenched while the red fluorophore is unaffected. Therefore, the ratio between red and green fluorescence indicates the proportion of tf-LC3 in lysosomes versus total cellular tf-LC3, which we use as a proxy for autophagic flux.

Secondly, we selected three different cell lines to identify consistent and universal RNA expression changes associated with changes in autophagic flux. We chose Hela and HEK 293 cells due to their robust growth in cultures, as well as their common experimental use, which allows the dataset to be comparable to previous and future experiments. We chose SH-SY5Y due to their brain-tissue origin, which is important for neurobiological implications directly arising from the findings of this work. Each cell line of interest was transfected with lentiviral particles that contained the sequence for the tf-LC3 construct under the cytomegalovirus promoter62. Multiple monoclonal lines were cultured for each cell line, each of which had total red and green fluorescence quantified by flow cytometry, thus allowing for selection of clones most appropriate for quantifying autophagic flux in the proposed experiments.

Thirdly, we subjected the clones to mTOR inhibition using 1 μM of AZD8055 (Selleck Chemicals LLC, Houston TX, USA) and to amino-acid starvation using Earl’s balanced salt solution (EBSS; MSD, Kenilworth NJ, USA) to induce autophagy. Temporal curves of autophagic flux were determined by measuring the tf-LC3 red/green ratio at 1 h intervals (Supplementary Fig. S2). Based on the curves, we chose 1 h, 15 h and 30 h time points as the initial response, inflection point and saturation point of autophagic flux respectively. Three technical replicates were collected from every experimental arm. We observed no difference between the baseline and 1 h RNA profiles, thus only 15 h and 30 h time points were used for statistical analyses.

In addition to the main study, we included data from a pilot study of 36 samples we did to test the tf-LC3 assay (these data were used as an independent replication of top genes and are included in Supplement in Supplement Figs. S4S18). The technical details for the pilot study were the same, with the following exceptions. The pilot study did not include SH-SY5Y cells. The experiments included an extra comparison between the top and bottom 20% of untreated cells sorted according to their tf-LC3 ratio at baseline. The treatment vs. vehicle was evaluated at 24 h instead of 15 h or 30 h. RNA-seq data did not include non-coding variants.

Cell culture and materials

HeLa (Sigma Aldrich, St. Louis MO, USA) and HEK 293 (ATCC, Manassas, VA, USA) cell lines were cultured in Dulbecco’s Modified Eagle Medium (DMEM; Life Technologies, Thermo Fisher Scientific, Waltham MA, USA), while SH-SY5Y (ATCC, Manassas, VA, USA) cells were cultured in 1:1 DMEM:Ham’s F12 (MSD, Kenilworth NJ, USA). All three cell lines were maintained with 10% (v/v) foetal bovine serum (Life Technologies), and 5 mg/mL penicillin and streptomycin (MSD, Kenilworth NJ, USA) in a humidified atmosphere of 5% CO2 at 37 °C. For RNA profiling, T25 flasks were seeded with 1.24 × 106 cells from 80% confluent T75 flasks 24 h prior to the start of experiments. Both the treated and non-treated samples were seeded from the same flask. Parental clones without the tf-LC3 proteins were used to calibrate the flow cytometer before measuring autophagic flux.

RNA sequencing

Total RNA was extracted using the RNeasy Plus Mini Kit (QIAGEN, Hilden North Rhine-Westphalia, Germany) as per the manufacturer’s instructions (sample RNA concentration ≥ 15 ng/μL, ≤ 2809 ng/μL, median 361.5 ng/μL). The RNA library was prepared with indices and was sequenced on an Illumina NovaSeq 6000 S4 at 2 × 150 bp at the David R Gunn Genomics Suite in the South Australian Health and Medical Research Institute.

RNA data processing

The scripts that were used for the analyses are available at https://github.com/Wenjun-Liu/Induced_autophagy. Default parameter settings were used at each step unless otherwise indicated. Each sample was sequenced to a median of 132 million paired reads per sample. We applied a three-step protocol to process raw reads into gene-level expression estimates. Firstly, we used cutadapt version 1.1463 to trim away low quality bases, adapters and other non-useful sequences. Secondly, trimmed reads were aligned to the human genome assembly GRCh38.p13 from Ensembl Release 9864 using STAR v2.765. Thirdly, total read counts for each gene (i.e. gene-level expression estimates) were quantified using featureCounts from the Subread package version 1.5.266, with the setting 1 for fracOverlap and 10 for Q. Gene annotations were obtained from Ensembl Release 9864.

For each of the three steps, quality checks were performed using FastQC v0.11.7 (URL: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and ngsReports67. We observed no issues related to low sequencing quality, variable GC content or high adapter content across the set of libraries. We considered a gene detectable for a cell line if we observed > 1.5 counts per million in > 3 samples out of 15, representing all samples from a complete treatment arm. A total of 16,506 (24.3%) out of 67,946 annotated genes were detectable in at least one cell line and 11,202 (16.5%) genes were detectable in every cell line. Lastly, we applied the conditional quantile normalization method to mitigate remaining artefacts from GC content and gene length in preparation for the statistical analysis68.

Differential expression analysis

We identified differentially expressed (DE) genes between the treated and untreated cell lines by quasi-likelihood negative binomial generalised log-linear regression as implemented in edgeR69, 70. We defined DE that exceeded the range of ± 20% fold change as biologically meaningful71. We then used the quasi-likelihood F-test to calculate P-values. P-values were further adjusted by the Benjamini–Hochberg method of false discovery rates (PFDR) to account for multiple testing12.

We conducted 12 initial DE analyses where we compared the expression levels at 15 h and 30 h against the baseline at 0 h (3 cell lines × 2 treatments × 2 time points, Supplementary Fig. S1). Statistically significant genes (PFDR < 0.05) were then selected for further investigation from each DE analysis. Given the overlap between significant genes at 15 h and 30 h time points (mean 53.4% across cell lines and treatments), we included only those that showed significant DE in the same direction at both time points, as a strategy to focus on the most consistently changed genes. For a single estimate of fold-change, we used the mean log2 fold change across both time points. Hence the final set of results comprised six separate DE listings (3 cell lines × 2 treatments × 1 combined time point, Supplementary Fig. S1).

Pathway enrichment analysis

We investigated (i) if genes in pre-defined biological pathways were over-represented among DE genes and (ii) to what extent DE genes were likely to perturb a given pathway when considering the known functional relationships between the pathway members. Firstly, over-representation of DE genes was tested with using goseq72, including gene length as an offset term to account for any bias.

Secondly, we applied the Signaling Pathway Impact Analysis (SPIA) method to identify potentially perturbed pathways73. The SPIA adds to the results from goseq since it provides deeper functional insight into the consequences from altered gene expression. In SPIA, pathways are represented as networks of genes based on pathway topology and activating/inhibitory roles of individual genes. Perturbation is defined as the propagating effect from altering the expression of one or more genes within the network. Crucially, the SPIA algorithm predicts the accumulated perturbation effect from multiple DE genes and summarizes the total effect as a single numerical score. This perturbation score is directional: a negative score indicates down-regulation of a pathway, whereas a positive score indicates up-regulation. In this study, we used a novel permutation procedure to calculate the statistical significance of the perturbation score (Supplementary Fig. S3).

Canonical pathway definitions were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database74. We retrieved 312 KEGG pathways and converted them into an SPIA-compatible network format using the tool graphite75. Over-representation and perturbation tests were applied to each of the six DE listings, respectively. The threshold for significant over-representation was set at 5% FDR. The same threshold was also applied to define significant perturbation.

Transcription factor target genes

We retrieved 957 transcription factor (TFT) gene sets from the Molecular Signatures Database version 7.276, 77 where, for a specific transcription factor, the TFT gene set was determined according to the binding sites or promoter binding motifs in the target genes. TFT gene sets were analysed for over-representation of DE genes the same way as the KEGG pathways, however, as the TFT definitions do not include interaction information between the target genes, we developed a strategy to combine the TFT information with KEGG pathway topologies. First, we determined subsets of genes that were shared between a KEGG pathway and a TFT gene-set; these subsets represent potential mechanisms by which a transcription factor may regulate a KEGG pathway. To test the regulatory potential further, we applied SPIA the same way as before, but using the subset of genes within the KEGG pathway (that were also TFT genes). KEGG pathways with PFDR < 0.05 were considered to be significantly perturbed due to the given transcription factor.