Abstract
Hepatitis C virus (HCV) infection poses a significant public health challenge and often leads to long-term health complications and even death. Parkinson’s disease (PD) is a progressive neurodegenerative disorder with a proposed viral etiology. HCV infection and PD have been previously suggested to be related. This work aimed to identify potential biomarkers and pathways that may play a role in the joint development of PD and HCV infection. Using BioOptimatics (bioinformatics driven by mathematical global optimization), 22 publicly available microarray and RNAseq datasets for both diseases were analyzed, focusing on sex-specific differences. Our results revealed that 19 genes, including MT1H, MYOM2, and RPL18, exhibited significant changes in expression in both diseases. Pathway and network analyses stratified by sex indicated that these gene expression changes were enriched in processes related to immune response regulation in females and immune cell activation in males. These findings suggest a potential link between HCV infection and PD, highlighting the importance of further investigation into the underlying mechanisms and potential therapeutic targets involved.
Introduction
This work involved the joint study of hepatitis C virus (HCV) infection and Parkinson’s disease (PD). These two conditions have been previously suggested to have a certain degree of connection. According to a systematic review and meta-analysis presented in Ref.1, patients with chronic HCV infection had a greater risk of developing PD than participants without infection, and a subsequent meta-analysis2 suggested that patients with HCV infection may have an increased risk of developing PD. In addition, antiviral medication against HCV has been shown to reduce the incidence of PD in people with HCV infection, even though there is no effective preventative strategy for PD3,4.
The Flaviviridae family, genus Hepacivirus, is home to the small, enveloped, positive single-stranded RNA virus known as HCV. HCV infection can be a short-term condition for some people, but it progresses to chronic, long-lasting infection in more than 50% of those infected with the virus. In many cases, people with chronic hepatitis C do not feel unwell or have symptoms (WHO, 2022). Symptoms are frequently signs of advanced liver disease once they first arise (WHO, 2022). Among the extrahepatic effects of HCV, there are a variety of inflammatory and immune-mediated diseases. It has been suggested that HCV can enter the central nervous system, causing disruption of striatal dopaminergic neurotransmission and neuronal death5,6.
PD is a common age-related neurodegenerative disease. Over the past 25 years, the incidence of PD has doubled7. As reported by the World Health Organization (WHO), “more than 8.5 million people worldwide were estimated to have PD in 2019. PD resulted in 5.8 million years of disability-adjusted life lost in 2019, an increase of 81% since 2000, and was responsible for 329,000 global deaths in 2019, an increase of more than 100% since 2000” (WHO, 2022). PD, which is believed to be caused by complex interactions of numerous factors specific to an individual, has been linked to several hereditary and environmental factors. With the identification of 90 risk alleles over the last ten years, the number of genetic risk factors identified has significantly increased8. However, only approximately 20% of PD risk is explained by these known loci, leaving 80% of PD risk unexplained8. It has been suggested that bacterial or viral infection may be a risk factor for sporadic PD, and epidemiological and fundamental scientific data support this idea, although with varying degrees of consistency9. An example of a viral infection linked with PD development is influenza virus infection, which has been reported to be a risk factor for the development of parkinsonism through the generation of cytokines10,11.
The WHO also states that “50 million worldwide individuals had chronic HCV infection in 2022, and 1 million new cases occurred annually” (WHO, 2022). Given the widespread distribution of this virus, understanding the connections between HCV and PD is essential. This meta-analysis proposed that changes in gene expression caused by HCV infection could be related to the occurrence of PD; this topic was examined using BioOptimatics, a novel analysis perspective based on mathematical global optimization developed and advocated by our research group12,13,14. We aimed to identify the potential links, gene expression changes, and pathways common between PD and HCV infection using control-case microarray and RNA sequencing studies.
Results
To construct an integrative model, 22 transcriptomics studies available in the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra) and Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) were used, including 8 related to PD (6 microarrays and 2 RNAseq) and 14 related to HCV (12 microarrays and 2 RNAseq). This cohort consisted of a total of 1064 samples, with 612 condition and 452 control samples (details in Table S1 and Fig. S1). Additionally, 13 out of the 22 datasets provided clinical information to enable subsequent analysis stratified by sex (see Fig. S2), seven of which were related to PD and six of which were related to HCV infection.
We employed the multiple criteria optimization (MCO) algorithm for gene selection, which identifies genes based on chosen performance measures (PMs), typically involving the absolute difference between control and condition group medians or means and producing consecutive Pareto-efficient frontiers, resulting in a hierarchical organization of genes12,13,14. MCO is a deterministic procedure that selects genes with certainty and under global optimality conditions. This approach differs from the use of probability- and nonoptimality-driven heuristic methods traditionally used in bioinformatics and biostatistics. To differentiate methods with these characteristics, we used the term “BioOptimatics”.
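As an illustration only, the idea of ranking genes by consecutive Pareto-efficient frontiers can be sketched in a few lines of Python. This is not the authors' MCO tool: the two performance measures (absolute difference of medians and of means), the function names, and the toy gene labels and values are all assumptions made for the example.

```python
from statistics import median

def performance_measures(control, condition):
    """Two illustrative PMs for one gene: |difference of medians| and |difference of means|."""
    pm_median = abs(median(condition) - median(control))
    pm_mean = abs(sum(condition) / len(condition) - sum(control) / len(control))
    return (pm_median, pm_mean)

def dominates(a, b):
    """True if a is at least as large as b on every PM and strictly larger on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontiers(scores, n_frontiers):
    """Peel off consecutive Pareto-efficient frontiers (every PM is maximized).

    scores: dict gene -> tuple of PMs. Returns a list of gene lists, first frontier first.
    """
    remaining = dict(scores)
    frontiers = []
    while remaining and len(frontiers) < n_frontiers:
        # A gene is on the current frontier if no other remaining gene dominates it.
        frontier = [g for g, s in remaining.items()
                    if not any(dominates(t, s) for h, t in remaining.items() if h != g)]
        frontiers.append(sorted(frontier))
        for g in frontier:
            del remaining[g]
    return frontiers

# Hypothetical (control, condition) expression values for four made-up genes.
toy = {
    "geneA": ([1.0, 1.2, 0.9], [3.0, 3.4, 2.9]),
    "geneB": ([2.0, 2.1, 1.9], [2.2, 2.0, 2.1]),
    "geneC": ([5.0, 5.5, 4.5], [6.0, 9.5, 6.2]),
    "geneD": ([0.5, 0.4, 0.6], [0.5, 0.5, 0.6]),
}
scores = {g: performance_measures(c, d) for g, (c, d) in toy.items()}
frontiers = pareto_frontiers(scores, n_frontiers=2)
```

Because the procedure only compares performance measures, the same input always yields the same frontiers, which is the deterministic property emphasized above. In a meta-analysis, one performance measure per dataset could be supplied in the same tuple.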
MCO was applied to identify genes whose relative expression was strongly altered in four case studies: individual dataset analysis, individual dataset analysis stratified by sex, meta-analysis within each condition, and meta-analysis between conditions (see Fig. 1).
Individual dataset analysis
Each of the 22 datasets was analyzed individually with MCO, using 10 consecutive Pareto-efficient frontiers each time. These analyses generated lists with 367 genes in total whose expression changed the most in the 8 PD datasets and 418 in the 14 HCV datasets. There were 34 genes in the intersection of the two sets of lists, so a total of 751 genes (367 + 418 − 34) made up the solution across all the individual analyses (see Table S2). The relationship between PD and HCV outcomes is shown in a Venn diagram in Fig. S3.
Individual dataset analyses stratified by sex
To carry out the analysis stratified by sex, each dataset that contained sex information was split into four groups (defined by the intersection of sex and material type): female control, female disease, male control, and male disease. In this way, 28 subsets were generated from the original seven PD datasets (GSE7621, GSE18838, GSE19587, GSE57475, GSE99039, GSE128177, and GSE106608). However, of the six HCV datasets, only two (GSE10356 and GSE119117) had the sex information required to create the four subgroups; two had sex information for condition samples but not controls (GSE65123.mCD and GSE65123.pCD); and the other two had only male samples (GSE38542 and GSE140845). In summary, for HCV, there were 12 subgroups: two for females (female control vs. female disease), four for males (male control vs. male disease), two for controls (control female vs. control male), and four for condition (condition female vs. condition male). MCO was used to analyze each of the 40 subsets. The results from 10 consecutive Pareto-efficient frontiers for each condition were listed in the appropriate groups (female, male, control, or disease) and then compared using a Venn diagram to identify common gene solutions (see Table S2). The genes common to females and males for each condition are listed in Fig. 2.
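The stratification described above can be sketched as follows. This is a minimal Python illustration, not the authors' code: the sample records, the keys `id`, `sex`, and `status`, and the comparison labels are hypothetical. It shows why a dataset with complete sex information supports four comparisons, whereas a male-only dataset or one with sex recorded only for condition samples supports a single comparison.

```python
from collections import defaultdict

def split_by_sex_and_status(samples):
    """Group sample IDs by (sex, status); samples are dicts with hypothetical keys."""
    groups = defaultdict(list)
    for s in samples:
        groups[(s["sex"], s["status"])].append(s["id"])
    return dict(groups)

def available_comparisons(groups):
    """Return the two-group comparisons for which both groups are non-empty."""
    candidates = {
        "female: control vs disease": (("female", "control"), ("female", "disease")),
        "male: control vs disease": (("male", "control"), ("male", "disease")),
        "control: female vs male": (("female", "control"), ("male", "control")),
        "disease: female vs male": (("female", "disease"), ("male", "disease")),
    }
    return [name for name, (a, b) in candidates.items()
            if groups.get(a) and groups.get(b)]

# Made-up sample annotations for three kinds of dataset.
full = [
    {"id": "s1", "sex": "female", "status": "control"},
    {"id": "s2", "sex": "female", "status": "disease"},
    {"id": "s3", "sex": "male", "status": "control"},
    {"id": "s4", "sex": "male", "status": "disease"},
]
male_only = [s for s in full if s["sex"] == "male"]
disease_only = [s for s in full if s["status"] == "disease"]

full_comparisons = available_comparisons(split_by_sex_and_status(full))
male_comparisons = available_comparisons(split_by_sex_and_status(male_only))
disease_comparisons = available_comparisons(split_by_sex_and_status(disease_only))
```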
When the gene solutions from the individual analyses stratified by sex for PD and HCV were combined, a total of 1,074 genes were identified through MCO. Among these, 604 were associated with sex-related effects, meaning that they were exclusive to either the male or the female comparisons and not common to both, leaving 470 genes of interest for this work. These 470 genes exhibited expression changes attributed to the condition rather than sex-related differences. The complete list is shown in Table S3, with MT1H being the most frequently observed solution in all analyses. MT1H was associated with both PD and HCV conditions. In the MCO analysis conducted across sex-specific datasets, MT1H consistently emerged, appearing three times in male dataset analyses (twice for PD and once for HCV) and once in a female dataset analysis related to HCV.
The solution genes for the MCO analyses on sex-categorized groups were not included in the following sections to focus only on those changes that are intrinsic to the conditions and not affected by sex differences. The most frequently occurring expression changes when sex was considered in the comparisons between groups are shown in Fig. S4 (see complete list in Table S4). RPS4Y1, DDX3Y, HBB, EIF1AY, KDM5D, TXLNGY, USP9Y, ZFY, FTL, and UTY were found in at least five control/condition MCO analyses. Note that RPS4Y1, DDX3Y, and ZFY were present for PD and HCV.
Meta-analysis within each condition
In the following sections, the number of datasets used in an MCO analysis is referred to as the dimension number. The datasets for the same condition were grouped together to perform six meta-analyses. The meta-analyses were built by separating samples by sex and by tissue. Two analyses were performed using two datasets simultaneously and were identified as two-dimensional analyses (2D; two subsets) (male, GSE10356: GSE140845; male, GSE38542: GSE119117), and four analyses were performed using three datasets simultaneously and were identified as three-dimensional analyses (3D; three subsets) (female, GSE18838: GSE57475: GSE99039; male, GSE18838: GSE57475: GSE99039; female, GSE7621: GSE19587: GSE106608; male, GSE7621: GSE19587: GSE106608). To handle the results more effectively, only the first Pareto-efficient frontier was considered for each meta-analysis. Table S5 lists the genes identified by the meta-analysis after removing the solutions dependent on sex, and Fig. S5 illustrates the MCO results.
To evaluate PD and HCV conditions independently, the first three analyses of MCO cases—individual analysis, analysis by sex, and meta-analysis within conditions—were performed. Because they can serve as a link between the two diseases, the focus was directed to those genes whose expression changed the most under both conditions. The 19 genes (MT1H, MYOM2, S100A12, RPL18, IFIT1, SLC30A2, SAA1, LYZ, CEBPD, GPX3, SRGN, PGK1, SERPI1, FOS, GRN, GSTM1, CXCL1, KRT23, and EPB41L3) that were identified as common to both disorders are shown in Fig. 3 below (see Table S6).
Meta-analyses between conditions
The fourth and final case study used MCO meta-analysis to analyze the two diseases, PD and HCV, simultaneously. A four-dimensional (4D) meta-analysis of females and a five-dimensional (5D) meta-analysis of males were conducted (female, GSE18838: GSE57475: GSE99039: GSE119117; male, GSE18838: GSE57475: GSE99039: GSE38542: GSE119117). Table S7 contains the results, which include 51 genes for the female cohort and 57 genes for the male cohort. Ten genes in the intersection were used as the solution for these analyses (CCR7, JUND, SKIV2L, ALDH16A1, TSEN34, KRT23, PADI4, CORO1A, DYSF, and WAS) and are shown in Fig. 4.
Signaling pathways
The next phase of this work included multiple spanning tree (MST) analysis, in which the most correlated tree structure was found among a set of gene expression changes as a proxy for a signaling pathway. We used the gene sets previously identified for each of the eight MCO meta-analyses. Therefore, six MSTs from the meta-analysis within each condition (Fig. 5) and two MSTs from the meta-analysis between conditions were proposed (Figs. 6a and 7a). The weights for the most correlated gene pairs per analysis group are shown in Table S8. Relationships reported in the STRING (Search Tool for the Retrieval of Interacting Genes) database15 were compared with the corresponding proposed MST pathways. For the meta-analyses within each condition, both the MST and STRING pathways are shown in Fig. 5, along with their similarities. In MST1, changes in the expression of the FOSL2, FOS, MMP9, and PRKCD genes were strongly correlated, which was also supported by the findings of previous studies reported in STRING. Similarly, in MST2, changes in the expression of SLC4A1 and EPB42, as well as of CORO1A and ARHGDIB, were strongly correlated, as also reported by STRING. On the other hand, STRING revealed no relationships between the genes in sets 3, 4, 5, and 6. However, MST revealed important correlations between changes in the expression of TUBB3 and MT1H in group 4 and between changes in the expression of NCF2 and CLNS1A in group 6. Among these pairs, associations have been reported only for SLC4A1-EPB42 and CORO1A-ARHGDIB. The protein products of SLC4A1 and EPB42 are components of the human erythrocyte ankyrin-1 complex16. Problems with proteins forming the ankyrin-1 complex are associated with hereditary spherocytosis and other membranopathies that affect the erythrocyte membrane17.
The protein products of CORO1A and ARHGDIB were proposed to interact in a study of chronic chagasic cardiomyopathy as part of natural killer cell-mediated cytotoxicity18. The other strong correlations found by MST but not yet reported in the literature are interesting leads to be followed up in future work because they show a possible connection between routes in HCV and PD.
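One way to picture a "most correlated tree" is as a spanning tree that maximizes the total absolute correlation between connected genes. The Python sketch below makes that assumption explicitly and uses Kruskal's algorithm with the absolute Pearson correlation as the edge weight. The authors' actual weighting is not specified here and evidently differs (a weight of approximately 2.45 is reported for one pair, which exceeds the range of a correlation coefficient), so the gene labels, profiles, and weighting below are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def most_correlated_tree(profiles):
    """Kruskal's algorithm over edges sorted by decreasing |Pearson r|.

    profiles: dict gene -> list of expression changes (same length for all genes).
    Returns (gene_a, gene_b, |r|) edges that form a spanning tree.
    """
    parent = {g: g for g in profiles}

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    edges = sorted(
        ((abs(pearson(profiles[a], profiles[b])), a, b)
         for a, b in combinations(sorted(profiles), 2)),
        reverse=True,
    )
    tree = []
    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # adding this edge does not close a cycle
            parent[ra] = rb
            tree.append((a, b, w))
    return tree

# Made-up expression-change profiles for four hypothetical genes.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "g3": [5.0, 3.0, 4.0, 1.0, 2.0],
    "g4": [1.0, 3.0, 2.0, 5.0, 4.0],
}
tree = most_correlated_tree(profiles)
```

Using the absolute value means that a strongly negative correlation (here, between g3 and g4) counts as a strong link, which is one possible design choice rather than a documented feature of the authors' method.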
In the case of the two meta-analyses between conditions, MSTs and functional enrichment analysis were conducted. Gene enrichment analysis was carried out with the enrichGO function of the clusterProfiler package. When comparing the MST7 results with the literature-reported groups of neighboring genes in STRING, four connected groups of nodes were found in both analyses: (1) WAS, SPI1, PRKCD, CORO1A, and MYO1F; (2) TLR8, PIM2, LILRB1, IRF9, and STAT5A; (3) XRN2, PAK2, and HSP90AB1; and (4) GLUD1 and PDHB (Fig. 6b). Biological processes associated with the female datasets were evaluated. The top GO terms associated with females were related to negative regulation of interleukin-12 production, regulation of the innate immune response, regulation of protein stability, protein stabilization, and cell chemotaxis (Table S9 and Fig. 8a). Last, the MST8 identified three groups of genes that are also supported by the STRING database: (1) SLC25A44 and SEMA4A; (2) SELENBP1 and SLC4A1; and (3) CD4, NFKBIA, CORO1A, MAPK8IP1, CCR7, LY9, CLDND2, TSC1, AGTRAP, ITGAL, SIRPB2, NCF2, IL6R, STXBP3, PREX1, PLP2, WAS, IL4R, ILK, ITGB2, DYSF, GPI, STXBP2, LY6E, CLNS1A, SYTL1, AP1S2, MYO9B, FFAR2, SERPINB1, JUND, ABI3, MAP3K11, TRAPPC1, and STK4 (Fig. 7b). The top GO terms associated with MST8 included leukocyte activation involved in the immune response, cell activation involved in the immune response, lymphocyte activation involved in the immune response, T-cell differentiation, and T-cell activation involved in the immune response (Table S10 and Fig. 8b).
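GO over-representation analysis of the kind performed by enrichGO is based on the hypergeometric distribution. The following Python sketch shows that calculation for a single GO term under stated assumptions; the function name and the counts are hypothetical, and the multiple-testing correction that an enrichment tool would normally apply is omitted.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_term, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value for one term.

    Probability of seeing at least n_overlap annotated genes when n_selected
    genes are drawn without replacement from a universe of n_universe genes,
    n_term of which carry the annotation.
    """
    upper = min(n_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 background genes, 5 annotated to the term,
# 4 genes in the selected list, 3 of which are annotated.
p_value = hypergeom_enrichment_p(n_universe=20, n_term=5, n_selected=4, n_overlap=3)
```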
Discussion
After summarizing the analysis of the four case studies, several results were of interest (see Fig. S6; the complete list can be found in Table S11).
There are expression changes that are common to different dataset analyses for PD or HCV without overlapping between conditions. This result is expected because PD and HCV are two very different conditions. The presence of genes such as CFD, ARHGDIB, and MMP9 was confirmed in the present study’s PD analyses and coincides with the findings of previous studies reporting changes in the expression of these genes in PD and/or other neurodegenerative conditions19,20,21. Similarly, the recurrent HCV infection-related genes MT1M, SKIV2L, GLUD1, and IFIT1 coincided with previous studies that reported significant changes in their expression in hepatocellular carcinoma22,23,24,25. According to the analysis stratified by sex, some gene expression changes frequently occurred in both sexes, but there were also gene expression changes unique to each sex. The latter are important for characterizing each condition (PD or HCV) for each sex. For example, the IFIT1 gene was recurrently found in female HCV dataset analyses, the MGAM gene in female PD analyses, the RPS12 gene in male PD analyses, and the GPX3 gene in female PD and HCV datasets. Interestingly, IFIT1 codes for an antiviral protein and was detected in our female HCV analysis, which is consistent with the different immune responses between females and males26,27 and the faster clearance of HCV infection by females28. The MGAM gene encodes maltase-glucoamylase, a membrane enzyme that is proposed to play a role in energy uptake and the innate immune response29. These differences in expression changes could in part be responsible for the differences observed in PD incidence as well as in HCV infection and infection clearance by sex. Most PD patients are males, and it is estimated that the number of male patients is 1.5–2 times the number of female patients30. On the other hand, it has been reported that women clear HCV infection faster and that its prevalence is greater in males than in females (HCV Prevalence in 50 U.S. States and D.C. by Sex, Birth Cohort, and Race: 2013–2016). RPS12 encodes a specialized ribosomal protein that has been reported to be cell type specific and to change its expression in several cancers31,32. GPX3 encodes a protein involved in cell protection against oxidative damage, and our analyses revealed differential expression of this gene mostly in females, in agreement with reports of different oxidative stress responses according to sex33,34. These results provide interesting leads for future experiments that could follow the role of each of these genes in a possible increase (or decrease) in the risk of developing PD, and how these changes might help explain why HCV and PD affect males and females differently.
Several genes that were recurrent solutions for both the PD and HCV analyses, such as MT1H, MYOM2, RPL18, S100A12, IFIT1, KRT23, GPX3, and SLC30A2, are important for revealing a potential link between these two conditions.
Supporting Table S11 provides a summary of all MCO case studies. In total, nineteen gene expression changes were found to be common between PD and HCV; the top eight recurrent genes are presented in Table S12. The list was sorted by the total number of recurrences across the entire analysis, from largest to smallest, with MT1H being the most recurrent gene across all MCO analyses. These top eight genes were studied in depth to determine their mechanism and potential relationships between HCV and PD.
The expression of the MT1H, SLC30A2, KRT23, RPL18, IFIT1, and S100A12 genes is reportedly upregulated during HCV infection35,36,37,38,39,40, while the expression of MYOM2 and GPX3 is reportedly downregulated41,42. IFIT1 overexpression has been strongly correlated with cGAS/STING inflammatory pathway activation38. However, this pathway has been shown to be antagonized by RNA viruses such as HCV43. In this work, we propose that during initial HCV infection, the viral component can prevent the host immune response by antagonizing the cGAS-STING pathway; however, upon chronic infection, as is the case for 50% of HCV patients, host cells begin to overexpress IFIT1, which in turn activates the cGAS-STING pathway. This can be interpreted as a defense mechanism against infection, which eventually activates the host immune response. We propose that this immune response, in some cases, can lead to systemic inflammation that can include neuroinflammation and aid in HCV replication and success, ultimately causing neuronal death and the development of PD38. The protein encoded by RPL18, which has been shown to be overexpressed in HCV patients, is L18, a component of the ribosomal 60S subunit39,44,45. L18 has been found to facilitate the initiation of viral RNA translation46. HCV replication and success could be supported by the overexpression of RPL1846, which could lead to neuroinflammation and neuronal death in some cases, increasing the risk of developing PD. KRT23 was recently identified as a pro-HCV host factor gene37. Its expression levels are related to severe liver disease, as is chronic HCV infection37. This finding suggested that KRT23 aids in HCV replication and success, and we propose that under some conditions, the overexpression of KRT23 may cause neuronal inflammation and death and hence the development of PD. Patients with HCV have been found to overexpress MT1H and develop peripheral neuropathy35,47,48.
These conditions can cause sequential illnesses such as cryoglobulinemia and arteriosclerosis35,49. In these compounded illnesses, reactive oxygen species (ROS) accumulate in cells50. This stress contributes to endogenous neurotoxin generation, α-synuclein formation, and mitochondrial dysfunction, which lead to neuronal death and eventually to PD development35,51,52,53. The decreased expression of GPX3 is related to the mechanism of the MT1H product, which acts primarily as an antagonist of ROS to protect cells42,54. It is possible that the overexpression of SLC30A2, which has been identified in previous research as a neurotoxin biomarker, is due to the presence of systemic neurotoxins that accumulate during chronic HCV infection, and this accumulation could also be a pro-PD factor36. SLC30A2 dysregulation has also been associated with zinc ion imbalances, which in turn are correlated with arteriosclerosis and are known to facilitate α-synuclein aggregation, a PD hallmark55. HCV infection has been found to downregulate MYOM2 expression; this change in expression is associated with gene hypomethylation and correlated with HCV replication and success41. This contributes to neurodegeneration and PD development56. Although little research has been performed on S100A12 and its products, its overexpression has been reported to aid in the processes of excessive inflammation and oxidative stress40. This can be related to its role as a blood biomarker for coagulopathy in traumatic brain injury (TBI)57,58,59. S100A12 was also identified as a RAGE cell surface receptor ligand60. As such, overexpression of this gene leads to increased activity of the RAGE axis, leading to arteriosclerosis and the aggregation of α-synuclein, a known hallmark of PD development60. Therefore, this can correlate with HCV’s action on intrahepatic coagulation, a process that promotes thrombotic risk61,62.
Blood disorders related to distinct HCV-induced gene upregulation and PD could be related to the blood‒brain barrier and the role of HCV in systemic infection and inflammation63. These deregulation processes could affect the role of microglia and astrocytes in producing neuroinflammation, as they activate proinflammatory mechanisms after HCV infection, contributing to PD development64 (see Fig. 9).
It must be understood that the work presented here is bioinformatics-based in nature. Experimental validation of the suggested biological explanations should be pursued in the future.
Correlated gene expression changes by the MST analyses
MST analysis was performed using previous HCV and PD MCO solutions as well as concurrent MCO analysis. MST analysis revealed three highly correlated pairs of expression changes: EPB42, SLC4A1; SPI1, WAS; and PRKCD, CORO1A. These connections were replicated in the STRING database results. This convergence of evidence prompted a more in-depth investigation, as detailed below.
EPB42 and SLC4A1
PD is also correlated with dysregulation of peripheral tissues, such as that observed in blood proteins. Erythrocyte membrane protein band 4.2 (EPB42) and solute carrier family 4 member 1 (SLC4A1) are erythrocyte membrane proteins associated with heme production and iron metabolism. The EPB42 protein interacts directly with the cytoplasmic domain of the solute carrier family 4 member 1 (SLC4A1) protein. The most highly correlated gene pairs in our MST meta-analysis for male datasets were EPB42 and SLC4A1, with a weight factor of approximately 2.45 between these gene expression changes (see Table S8). Furthermore, other research groups have identified these genes as possible PD biomarkers. These genes are downregulated in PD patients with sporadic and familial LRRK2 gene mutations65. EPB42 has been identified as a biomarker for idiopathic PD66, and the integrity of SLC4A1 influences iron uptake in PD mouse models, neurodegeneration, and PD67. The scientific community has extensively recognized EPB42 and SLC4A1 as genes relevant to PD, and this study further demonstrated this correlation.
SPI1 and WAS
SPI1 (Spi-1 Proto-Oncogene) encodes the transcription factor protein PU.1, which is important for several cellular responses and can act as a lymphoid-specific enhancer. WAS encodes WASp, which is an actin nucleation-promoting factor and is vital for neuron growth, vesicle formation, and membrane deformation68.
SPI1 and WAS influence each other’s functions through modulation of the expression of Btk (Bruton’s tyrosine kinase). The Btk protein can phosphorylate WASp and alleviate the autoinhibitory conformation that eventually prevents actin polymerization69. Transcription factor PU.1 and WASp also act in conjunction to promote the B-lymphoid fate and are essential at the earliest stages for commitment to the lymphoid lineage70. PU.1 has been found to regulate specialized functions of microglia at the gene level71. WASp has also been linked to overactivation of microglia. The EVH1 domain of WASp has been found to be essential for modulating the microglial inflammatory response and neuronal cell killing and has been identified as a target for treating CNS inflammatory diseases caused by microglial activation72. Inflammation related to microglial cell activation contributes to the death of dopaminergic neurons, which characterizes the destructive processes of PD73.
PRKCD and CORO1A
CORO1A (coronin-1A) encodes an actin-binding protein, CORO1A, and has been associated with processes of the immune system and neuronal morphology74,75,76,77. CORO1A is also involved in axonal integrity74. Disturbed levels of CORO1A could signal low axonal integrity and poor synaptic communication53. CORO1A has been found to promote astrocyte reactivity and neurotoxicity, underlying factors in the progression of neurodegenerative diseases such as PD78. Additionally, CORO1A is overexpressed in extracellular vesicles in the plasma of amyotrophic lateral sclerosis (ALS) patients and has been proposed to be a biomarker for this disease. ALS, like PD, is a neurodegenerative disease characterized by neuronal loss79. Further research could help identify CORO1A as a potential PD biomarker. CORO1A has been found to be upregulated in HCV patients77. In terms of the immune system, CORO1A deficiency, which follows an autosomal recessive mode of inheritance, has been correlated with severe immunodeficiency80. The CORO1A product has been reported to facilitate pathogen evasion of the immune system during Mycobacterium tuberculosis host cell invasion76. The role of the CORO1A product in T-cell homeostasis has been widely reported, as this protein is a key factor in the human immune response80.
The protein kinase C delta (PRKCD) is a key protein involved in cell survival and death81. Oxidative stress guides cells, including neuronal cells, into the apoptosis pathway81. PRKCD is cleaved and activated by caspase 3 in this process as part of the apoptotic pathway81. PD is caused by dopaminergic cell death73. Apoptosis, as well as the influence of PRKCD, can contribute to PD development.
The changes in CORO1A and PRKCD gene expression were among the most strongly correlated changes according to the MST analysis. Our interpretation of that result is that these two gene products should share a pathway that could increase the risk of PD development. However, to the best of our knowledge, there are still no reports indicating a direct relationship between CORO1A and PRKCD dysregulation related to PD. This further shows the relevance of this paper’s findings in guiding future research. Nonetheless, like PRKCD, CORO1A has been found to be overexpressed in amyotrophic lateral sclerosis (ALS) patients, which causes an increase in oxidative stress and apoptosis82. Our results suggest that the relationship between these genes’ products might be related to the apoptotic pathway.
The results for the gene expression changes and the correlations between those changes present new leads for future studies. Some of the genes and connections found by our analyses have not been reported, but following the roles of the proteins that they encode, it makes sense that they are involved in the processes common to both HCV and PD, as well as processes that are not common but could relate to both conditions. Most of the gene solutions encode proteins involved in the immune response, suggesting that if HCV infection could increase the risk of developing PD, this could be due to immune response dysregulation.
Methods
Parkinson’s disease and hepatitis C virus data description
To construct an integrative model aimed at detecting a possible relationship between PD and HCV, we identified transcriptomic studies available before March 2021 in the NCBI Sequence Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra) and Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) publicly available datasets. The keywords “Parkinson’s disease” and “Hepatitis C” were used. Additionally, for the organism “Homo sapiens”, “expression profiling by array” or “expression profiling by high throughput sequencing” were selected as the study type, and a custom sample count range of 10 to 1000 was established. Additionally, through an exhaustive manual search, patient data containing case-control and sex information were included in the analysis. Hence, by combining the selected datasets, microarray data, and RNAseq data, 22 datasets were obtained, 8 from PD patients (6 microarrays, 2 RNAseq) and 14 from HCV patients (12 microarrays, 2 RNAseq). In total, 1064 samples were used for the analysis—612 condition samples and 452 control samples (see Table S1 and Fig. S1). Moreover, 13 out of 22 datasets included clinical information about sex. These datasets were crucial for the analysis by sex (see Fig. S2). The initial cohort was inspected for redundancies before the model was constructed (as described below).
Dataset preprocessing
To evaluate gene expression at the transcriptome level, we used both microarray and RNA sequencing (RNAseq) data. Microarray data were preprocessed primarily using the R package GEOquery83. The RNAseq data investigated here were analyzed with the Rsubread84 R package for mapping and for estimating expression levels and the popular edgeR85 package for organizing, filtering, and normalizing the data. A description of each of the two technologies’ specific steps is provided below.
Microarray download and preprocessing
To preprocess the 18 microarray datasets, the GEOquery83 package from R was used. We downloaded the expression matrices and platform information through the getGEO function, and then each probe was matched with its gene symbol. When several probes mapped to the same gene, their values were aggregated using the median. Next, we identified the condition samples, the control samples, and the sex information using the pData function on the objects returned by GEOquery to establish the matrices required by the MCO tool14. Each matrix includes the GEO accession number in the first column, the illness condition in the second column, and the expression values for each gene across each sample in the following columns.
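The probe-to-gene aggregation and the matrix layout described above can be sketched as follows. This is a Python illustration under stated assumptions, not the R code used in the study: the probe IDs, gene symbols, sample IDs, column names, and function names are hypothetical, and probes without a gene symbol are simply skipped.

```python
from collections import defaultdict
from statistics import median

def collapse_probes(probe_values, probe_to_gene):
    """Median-aggregate probe-level values into gene-level values, per sample.

    probe_values: dict probe_id -> list of expression values (one per sample).
    probe_to_gene: dict probe_id -> gene symbol; unmapped probes are skipped.
    """
    by_gene = defaultdict(list)
    for probe, values in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene:
            by_gene[gene].append(values)
    # zip(*rows) walks the samples; the median is taken across probes of one gene.
    return {g: [median(col) for col in zip(*rows)] for g, rows in by_gene.items()}

def build_mco_rows(sample_ids, conditions, gene_values):
    """One row per sample: accession, condition label, then gene values in sorted gene order."""
    genes = sorted(gene_values)
    header = ["geo_accession", "condition"] + genes
    rows = [[sid, cond] + [gene_values[g][i] for g in genes]
            for i, (sid, cond) in enumerate(zip(sample_ids, conditions))]
    return header, rows

# Made-up probe data for two samples.
probe_values = {
    "p1": [1.0, 4.0],
    "p2": [3.0, 2.0],
    "p3": [5.0, 6.0],
    "p4": [9.0, 9.0],  # no gene symbol, so it is dropped
}
probe_to_gene = {"p1": "GENE1", "p2": "GENE1", "p3": "GENE2"}

gene_values = collapse_probes(probe_values, probe_to_gene)
header, rows = build_mco_rows(["sampleA", "sampleB"], ["control", "disease"], gene_values)
```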
RNAseq workflow from raw data to differential expression and pathway analysis
RNA sequencing (RNAseq) is a key technique for gene expression profiling. Numerous RNAseq data analysis pipelines are currently available, each relying on a wide range of software tools that are often computationally intensive. We aligned and quantified the RNAseq data using Rsubread84, a Bioconductor package that offers high-throughput read alignment and read counting algorithms as R functions. Preprocessing procedures, such as filtering and normalization, were performed using edgeR85, a well-known R package, and differential expression and pathway analyses were performed using BioOptimatics12,13,14 (see Fig. S7).
This study used four RNAseq datasets (two for PD and two for HCV) that are available from the GEO repository as the series GSE128177, GSE106608, GSE119117, and GSE140845. Sequencing data in the form of Sequence Read Archive (SRA) files were obtained from the NCBI using the SRA Toolkit (https://github.com/ncbi/sra-tools) and the SRA Downloader program (https://github.com/s-andrews/sradownloader). A unique set of SRA accession numbers was associated with each sample. For each sample, we first downloaded the SRA runs and then used the SRA Downloader tool to create FASTQ files from the SRA files; FASTQ is the data format required for bulk RNAseq analysis. Because of the computational requirements of these data, the sequencing data for each sample were stored on a virtual machine (VM) with 30 GB of RAM provided by the Information Technology Center at the University of Puerto Rico at Mayagüez. The subsequent alignment and quantification steps were also performed on this VM. Once the FASTQ files were available, each read needed to be “mapped” to a reference genome, that is, assigned to the location in the reference genome that most closely matches the segment of the mRNA transcript captured by the read. We used three R functions from the Rsubread package for reading, mapping, and quantifying: buildindex, align, and featureCounts. All the functions return R objects.
The buildindex function generates an index of the reference genome. Because the same index file can be reused across projects, this is a one-time procedure for each reference genome. For our study, the reference genome assembly GRCh38.p13 (GENCODE Release 35) was downloaded from the GENCODE database (https://www.gencodegenes.org/human/release_35.html). For the human genome, constructing the entire index at single-base resolution takes approximately 50 min and results in a hash table of the target genome. Sequence reads were subsequently aligned to the reference genome using the align function, which accepts raw reads in FASTQ format and generates read alignments in BAM format. The align function carries out local read alignment and reports the largest mappable region for each read, soft-clipping the unmapped read bases. It works well with RNAseq data because of its distinctive seed-and-vote design. Quantification came next and was performed using the featureCounts function, which counts the number of reads or read pairs that overlap any supplied collection of features and returns a matrix of counts (see Fig. S8).
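To make the idea of counting reads that overlap features concrete, here is a toy Python sketch. It is not the featureCounts algorithm, which offers many options that are not modeled here; it only assumes inclusive coordinates and leaves reads that overlap more than one feature unassigned. The function name, feature names, and coordinates are hypothetical.

```python
def count_overlaps(reads, features):
    """Count reads per feature; reads overlapping zero or several features stay unassigned."""
    counts = {name: 0 for name in features}
    for chrom, start, end in reads:
        hits = [
            name
            for name, (f_chrom, f_start, f_end) in features.items()
            if chrom == f_chrom and start <= f_end and end >= f_start
        ]
        if len(hits) == 1:  # ambiguous or unmatched reads are not counted
            counts[hits[0]] += 1
    return counts

# Hypothetical features: name -> (chromosome, start, end), inclusive coordinates
features = {
    "GENE_A": ("chr1", 100, 200),
    "GENE_B": ("chr1", 180, 300),
    "GENE_C": ("chr2", 50, 150),
}
# Hypothetical aligned reads: (chromosome, start, end)
reads = [
    ("chr1", 90, 120),   # GENE_A only
    ("chr1", 150, 170),  # GENE_A only
    ("chr1", 190, 210),  # overlaps GENE_A and GENE_B, so it is left unassigned
    ("chr1", 250, 280),  # GENE_B only
    ("chr2", 10, 40),    # no feature
    ("chr2", 140, 160),  # GENE_C only
]

gene_counts = count_overlaps(reads, features)
```

Repeating this per sample and stacking the results column by column gives a genes-by-samples matrix of counts of the kind passed on to the preprocessing steps.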
Once the matrix of counts has been summarized at the gene level, preprocessing typically involves three steps: (1) organizing, (2) filtering, and (3) normalizing the data. For preprocessing, we employed the well-known edgeR85 package, available through the Bioconductor project. In the first step, sample-level information related to the experimental design is organized to match the columns of the count matrix. This information should include the experimental variables, both biological and technical, that could influence expression levels. In our case, we included disease status and sex.
We stored each dataset in a DGEList object, a simple list-based data structure created with the DGEList function of the edgeR R package. The resulting DGEList object contains a matrix of counts with n rows, each associated with a unique Entrez gene identifier (ID), and m columns, each corresponding to an individual sample in the experiment. The samples data frame of the DGEList object stores details about the batch (sequencing lane) and sample types. As shown in Fig. S9a, within x$samples, library sizes are computed automatically for each sample, and normalization factors are initially set to 1. Similarly, gene-level details related to the rows of the count matrix are kept in a second data frame, called genes, within the DGEList object. The org.Hs.eg.db86 package, a genome-wide annotation for humans, was used to retrieve the gene symbols associated with the Entrez gene IDs present in our dataset.
The second step, filtering, was necessary for downstream analysis because every dataset contained a mix of expressed and unexpressed genes. Genes that are expressed under one condition but not another are of interest, but some genes were not expressed in any of the samples. Genes that did not have enough reads in any sample were therefore excluded from the downstream analysis. The filterByExpr function of the edgeR package offers an automatic method for filtering genes while retaining as many genes as possible among those with worthwhile counts. By default, filterByExpr retains genes with ten or more read counts in a minimum number of samples, where this number is determined by the smallest group sample size (see Fig. S10).
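The filtering rule as described in the paragraph above can be sketched in a few lines of Python. This is a simplified illustration of that description only: the actual filterByExpr function in edgeR applies its threshold on a counts-per-million scale and uses additional criteria, which are not reproduced here. The function name, gene names, group labels, and counts are hypothetical.

```python
from collections import Counter

def filter_genes(counts, groups, min_count=10):
    """Keep genes with at least `min_count` reads in at least as many
    samples as the smallest group contains (simplified rule)."""
    min_samples = min(Counter(groups).values())
    return {
        gene: row
        for gene, row in counts.items()
        if sum(c >= min_count for c in row) >= min_samples
    }

# Hypothetical design: three condition samples and two controls
groups = ["condition", "condition", "condition", "control", "control"]
counts = {
    "GENE_A": [0, 0, 0, 0, 0],     # never expressed, dropped
    "GENE_B": [25, 30, 12, 0, 1],  # expressed in condition only, kept
    "GENE_C": [11, 3, 2, 0, 4],    # enough reads in one sample only, dropped
    "GENE_D": [0, 0, 0, 15, 20],   # expressed in controls only, kept
}

kept = filter_genes(counts, groups)
```

Tying the sample threshold to the smallest group is what allows a gene expressed under only one condition, such as GENE_B or GENE_D in this toy example, to survive the filter.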
The third preprocessing step, normalization, was necessary because external factors that are not of biological interest can affect the measured expression of a sample during preparation for sequencing. For instance, samples processed in the first batch of an experiment may show greater overall expression than samples processed in a second batch. It is assumed that the range and distribution of expression values should be comparable across all samples. Normalization removes systematic technical effects from the data so that technical bias has a minimal impact on the results. The calcNormFactors function in edgeR was used to apply the trimmed mean of M-values (TMM) normalization approach87. The resulting normalization factors were used as scaling factors for the library sizes. When working with DGEList objects, these factors are saved automatically in x$samples$norm.factors. For example, Fig. S9b shows the normalization factors used to scale the library sizes, and the boxplots in Fig. S11 show the sample expression distributions before and after normalization. Gene expression is rarely analyzed at the level of raw counts for differential expression and related analyses because libraries sequenced at a greater depth yield higher counts. Instead, it is standard practice to rescale raw counts to account for differences in library size. Transformations such as counts per million (CPM) and log2-counts per million (log-CPM) were applied in our analysis.
Before performing differential expression analysis, the datasets were examined to explore differences between libraries. We used two well-known visual methods: multidimensional scaling (MDS) and mean-difference (MD) plots. An MDS plot arranges the RNA samples in two dimensions so that clusters become visible; it serves as a quality control step and summarizes the overall variability in the expression patterns of the distinct samples. In Fig. S12a, the MDS plot is colored by sample group. MD plots allow the expression profiles of the samples to be examined in more detail (see Fig. S12b).
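As a rough illustration of the CPM and log-CPM transformations mentioned above, the following Python sketch scales raw counts by the effective library size (library size multiplied by the normalization factor). It is a simplified stand-in, not the edgeR implementation: the log transform here uses a plain pseudocount, whereas edgeR uses a prior-count scheme. Function names, gene names, and counts are hypothetical.

```python
import math

def counts_per_million(counts, norm_factors=None):
    """Scale raw counts by the effective library size (library size x normalization factor)."""
    lib_sizes = [sum(col) for col in zip(*counts.values())]
    if norm_factors is None:
        norm_factors = [1.0] * len(lib_sizes)  # no normalization applied
    effective = [size * factor for size, factor in zip(lib_sizes, norm_factors)]
    return {
        gene: [c * 1e6 / e for c, e in zip(row, effective)]
        for gene, row in counts.items()
    }

def log2_cpm(counts, norm_factors=None, pseudo=1.0):
    """log2(CPM + pseudo): a plain pseudocount, simpler than edgeR's prior count."""
    cpm_values = counts_per_million(counts, norm_factors)
    return {gene: [math.log2(v + pseudo) for v in row]
            for gene, row in cpm_values.items()}

# Hypothetical counts: the second sample was sequenced twice as deeply as the first
counts = {"GENE_A": [100, 200], "GENE_B": [300, 600], "GENE_C": [600, 1200]}

cpm = counts_per_million(counts)
log_cpm = log2_cpm(counts)
```

In this toy example the second library is exactly twice as deep as the first, so the raw counts differ by a factor of two while the CPM values are identical across the two samples.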
BioOptimatics methods: multiple criteria optimization (MCO) and minimum spanning tree (MST) for identifying potential biomarkers and pathways
Camacho-Cáceres et al.12 provided a thorough explanation of the steps involved in selecting the genes of interest with the MCO algorithm. Briefly, MCO selects the group of genes that offer the best compromises among the multiple performance measures (PMs) chosen for the analysis. Depending on how many PMs are applied, the problem is posed in a 2D, 3D, or higher-dimensional space. For instance, the individual analysis of an expression dataset typically uses two PMs (2D): the absolute difference between the medians and the absolute difference between the means of the control and condition groups. In a meta-analysis, each dataset contributes its own PM, namely the absolute difference between the medians of its control and condition groups. The dimension of the problem (2D, 3D, 4D, and so on) therefore depends on the number of datasets analyzed.
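The two performance measures used in an individual analysis can be written down directly. The Python sketch below is only an illustration of the definitions given above, not the code of the MCO tool; the function name and expression values are hypothetical.

```python
from statistics import mean, median

def performance_measures(control, condition):
    """Return (|difference of medians|, |difference of means|) for one gene."""
    return (
        abs(median(condition) - median(control)),
        abs(mean(condition) - mean(control)),
    )

# Hypothetical expression values of one gene in three controls and three cases
control = [1.0, 2.0, 3.0]
condition = [4.0, 6.0, 11.0]

pm = performance_measures(control, condition)
```

For a meta-analysis, the same idea would yield one median-based measure per dataset, so that a gene is described by as many coordinates as there are datasets.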
The set of solutions to the MCO problem is called the Pareto-efficient frontier, and MCO can build a hierarchy of genes organized in successive frontiers. An interactive MCO tool written in R was recently released (https://server-deiver.shinyapps.io/MCO_TURBO/)14. This tool generates lists of the genes with the largest changes in expression for individual dataset analyses, analyses stratified by sex, and meta-analyses of up to five datasets simultaneously. We used the MCO tool to identify the genes with the greatest changes in expression following the case-study design of Fig. 1, which included (1) individual dataset analysis, (2) analysis stratified by sex, (3) meta-analysis within each condition, and (4) meta-analysis between conditions.
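The notion of successive Pareto-efficient frontiers can be illustrated with a small Python sketch. It uses a straightforward pairwise dominance check on measures to be maximized, and it is an assumption-laden illustration rather than the formulation implemented in the MCO tool. The function names, gene labels, and scores are hypothetical.

```python
def dominates(a, b):
    """True if `a` is at least as large as `b` on every measure and strictly larger on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontiers(scores):
    """Peel off successive non-dominated sets until every gene is assigned to a frontier."""
    remaining = dict(scores)
    frontiers = []
    while remaining:
        front = [
            gene
            for gene, s in remaining.items()
            if not any(dominates(t, s) for other, t in remaining.items() if other != gene)
        ]
        frontiers.append(sorted(front))
        for gene in front:
            del remaining[gene]
    return frontiers

# Hypothetical genes scored by (|difference of medians|, |difference of means|)
scores = {
    "A": (3.0, 1.0),
    "B": (1.0, 3.0),
    "C": (2.0, 2.0),
    "D": (1.0, 1.0),
    "E": (2.5, 0.5),
    "F": (0.5, 0.5),
}

frontiers = pareto_frontiers(scores)
```

Here genes A, B, and C form the first frontier because none of them is beaten on both measures at once by another gene; removing them exposes the second frontier, and so on.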
MCO offers superior flexibility and adaptability compared to tools such as DESeq2 or edgeR for gene expression analysis. DESeq2 and edgeR use statistical methods that present certain disadvantages, such as the need for a minimum number of replicates, underlying assumptions of statistical models, and the requirement for users to have statistical and bioinformatics knowledge.
In contrast, MCO allows for customization according to specific research objectives and data characteristics. MCO supports meta-analyses, individual analyses, and sex-specific analyses, providing greater flexibility in experimental design. The selection of MCO is based on its ability to handle multiple criteria simultaneously, integrate diverse datasets, maintain coherence with preprocessing steps, and offer extensive customization and control in gene expression analysis.
Notably, MCO identifies the genes with the largest expression changes through Pareto optimality rather than through heuristic search, which is a substantial advantage in genomic research.
Individual analysis of each dataset, analysis by sex, and meta-analysis within/between conditions
The first step, the individual analysis of each dataset, can flag genes that are deregulated by sex but not by condition; these genes were considered false positives. This step was nevertheless necessary because of the small sample sizes and the lack of sex information in some datasets. To handle the false-positive genes, the individual analyses were still conducted, and these genes were then excluded from subsequent steps once the analysis stratified by sex had identified them.
For the analysis by sex, each dataset was divided into four subsets containing only females, only males, only controls, or only condition samples. We then carried out one MCO per subset, that is, four MCOs per dataset. For the female and male subsets, the absolute value of the difference in gene expression between control and condition samples was considered. For the control and condition subsets, the absolute value of the difference in gene expression between female and male samples was assessed.
Then, using a Venn diagram, we took as genes of interest those located solely in the female area, the male area, or the intersection of the female and male areas, but not in the control or condition areas. In this way, the genes whose relative expression was changed by disease but not by sex were identified. The genes found exclusively in women when comparing the condition versus the control may be related to female-specific biological mechanisms and particular responses to the condition. Similarly, the genes found exclusively in men in the same comparison may indicate male-specific biological responses to the condition. Finally, the differentially expressed genes (DEGs) shared by men and women respond to the condition similarly in both sexes and could therefore be considered general responses to the condition, regardless of sex.
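The Venn diagram logic above amounts to a few set operations. The following Python sketch assumes that each of the four MCO runs has already produced a set of selected genes; the function name and the placeholder gene labels are hypothetical and do not correspond to any result of the study.

```python
def split_by_sex(female, male, control, condition):
    """Separate disease-related genes into female-only, male-only, and shared sets,
    discarding genes that also differ between sexes within controls or within cases."""
    sex_driven = control | condition
    return {
        "female_only": (female - male) - sex_driven,
        "male_only": (male - female) - sex_driven,
        "both_sexes": (female & male) - sex_driven,
    }

# Hypothetical MCO outputs (placeholder gene labels)
female = {"g1", "g2", "g5"}   # control vs condition, females only
male = {"g1", "g3", "g6"}     # control vs condition, males only
control = {"g5", "g6"}        # female vs male, controls only
condition = {"g5"}            # female vs male, condition samples only

gene_sets = split_by_sex(female, male, control, condition)
```

In this toy example g5 and g6 are discarded because they also separate females from males, leaving g2 as female-specific, g3 as male-specific, and g1 as shared by both sexes.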
To design the meta-analyses within each condition, we considered the tissue and sex of the samples: only subsets of data from the same tissue and sex were combined. On this basis, we conducted six meta-analyses, two for HCV and four for PD. The meta-analyses between conditions (PD and HCV together) accounted for tissue and sex in the same way as the meta-analyses within each condition. Two blood meta-analyses were performed, one for men and one for women. The type of experiment (microarray or RNAseq) and the GEO accession subset selected for each analysis are listed in Table 1.
To determine how the expression changes of the genes identified by MCO in each meta-analysis were related, the relationship between each pair of genes was modeled using an MST. The MST is a network optimization method that proposes potential signaling pathways by connecting the identified genes along the most strongly correlated path. Our team previously implemented the MST13, and the code used to solve it is given in Algorithm S.1 of the supplementary information. We computed eight MSTs, one for each MCO meta-analysis, to identify potential biological signaling pathways by determining the structure of greatest correlation among the genes with large variations in expression. Based on the biological signaling pathways established by the MST, we were able to identify the pertinent biological links among the genes with the greatest changes in expression in PD and HCV patients.
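A correlation-based spanning tree of this kind can be sketched as follows. This Python example is not Algorithm S.1; it assumes Pearson correlation, uses 1 - |r| as the edge length so that strongly correlated genes are close, and solves the problem with Kruskal's algorithm. The function names, gene labels, and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long vectors (assumes non-zero variance)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_mst(expr):
    """Kruskal's algorithm on edge lengths 1 - |r| (assumed distance definition)."""
    genes = sorted(expr)
    edges = sorted(
        (1 - abs(pearson(expr[g], expr[h])), g, h) for g, h in combinations(genes, 2)
    )
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    tree = []
    for _, g, h in edges:
        root_g, root_h = find(g), find(h)
        if root_g != root_h:  # adding this edge does not create a cycle
            parent[root_g] = root_h
            tree.append((g, h))
    return tree

# Hypothetical expression profiles across five samples
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [1.1, 2.0, 2.9, 4.2, 5.0],  # tracks g1 closely
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "g4": [5.2, 1.1, 3.9, 2.3, 2.8],  # tracks g3 closely
}

tree = correlation_mst(expr)
```

A tree on n genes always has n - 1 edges, so the result keeps only the strongest chain of pairwise relationships needed to connect every selected gene.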
Data availability
The datasets analysed during the current study are available in the GEO repository. The persistent web links to each dataset are provided below:
Condition | GEO accession | Platform | Year | Technology | Link to GEO accession |
---|---|---|---|---|---|
HCV | GSE119117 | GPL16791 | 2018 | RNAseq | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE119117 |
HCV | GSE140845 | GPL16791 | 2019 | RNAseq | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE140845 |
HCV | GSE128726 | GPL21185 | 2019 | MicroArray | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE128726 |
HCV | GSE65123 | GPL17077 | 2016 | MicroArray | |
HCV | GSE65123 | GPL17077 | 2016 | MicroArray | |
HCV | GSE51941 | GPL6244 | 2015 | MicroArray | |
HCV | GSE40223 | GPL10558 | 2012 | MicroArray | |
HCV | GSE40184 | GPL96 | 2012 | MicroArray | |
HCV | GSE49954 | GPL10558 | 2013 | MicroArray | |
HCV | GSE49954 | GPL10558 | 2013 | MicroArray | |
HCV | GSE38226 | GPL6947 | 2012 | MicroArray | |
HCV | GSE38542 | GPL10976 | 2012 | MicroArray | |
HCV | GSE14323 | GPL571 | 2009 | MicroArray | |
HCV | GSE10356 | GPL5215 | 2008 | MicroArray | |
PD | GSE99039 | GPL570 | 2017 | MicroArray | |
PD | GSE72267 | GPL571 | 2015 | MicroArray | |
PD | GSE19587 | GPL571 | 2010 | MicroArray | |
PD | GSE7621 | GPL570 | 2007 | MicroArray | |
PD | GSE18838 | GPL5175 | 2010 | MicroArray | |
PD | GSE57475 | GPL6947 | 2015 | MicroArray | |
PD | GSE128177 | GPL24676 | 2020 | RNAseq | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE128177 |
PD | GSE106608 | GPL15433 | 2021 | RNAseq | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE106608 |
References
Yaow, C. Y. L. et al. Risk of Parkinson’s disease in hepatitis B and C populations: a systematic review and meta-analysis. J. Neural Transm. 1, 1. https://doi.org/10.1007/S00702-023-02705-7 (2023).
Wang, H. et al. Bacterial, viral, and fungal infection-related risk of Parkinson’s disease: Meta-analysis of cohort and case–control studies. Brain Behav. 10 (3), e01549. https://doi.org/10.1002/BRB3.1549 (2020).
Lin, W. Y. et al. Association of antiviral therapy with risk of Parkinson Disease in patients with chronic Hepatitis C virus infection. JAMA Neurol. 76 (9), 1019–1027. https://doi.org/10.1001/JAMANEUROL.2019.1368 (2019).
Su, T. et al. Antiviral therapy in patients with chronic hepatitis C is associated with a reduced risk of parkinsonism. Mov. Disord. https://doi.org/10.1002/mds.27848 (2019).
Wilkinson, J., Radkowski, M. & Laskus, T. Hepatitis C virus neuroinvasion: identification of infected cells. J. Virol. 83 (3), 1312–1319. https://doi.org/10.1128/JVI.01890-08 (2009).
Forton, D. M. et al. Evidence for a cerebral effect of the hepatitis C virus. Lancet 358(9275), 38–39. https://doi.org/10.1016/S0140-6736(00)05270-3 (2001).
Dorsey, E. R., Sherer, T., Okun, M. S. & Bloem, B. R. The emerging evidence of the Parkinson pandemic. J. Parkinsons Dis. 8, S3–S8. https://doi.org/10.3233/JPD-181474 (2018).
Nalls, M. A., Blauwendraat, C., Vallerga, C. L. & Heilbron, K. Identification of novel risk loci, causal insights, and heritable risk for Parkinson’s disease: a meta-genome wide association study. Physiol. Behav. 176(1), 139–148. Accessed: Oct. 01, 2022. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1474442219303205 (2016).
Smeyne, R. J., Noyce, A. J., Byrne, M., Savica, R. & Marras, C. Infection and risk of Parkinson’s Disease. J. Parkinsons Dis. 11 (1), 31–43. https://doi.org/10.3233/JPD-202279 (2021).
Wolfe, M. The molecular and cellular basis of neurodegenerative diseases: underlying mechanisms. Accessed: Oct. 01, 2022. [Online]. Available. https://books.google.com/books?hl (2018).
Harry, G. J. & Kraft, A. D. Neuroinflammation and microglia: considerations and approaches for neurotoxicity assessment. Expert Opin. Drug Metab. Toxicol. 4 (10), 1265–1277. https://doi.org/10.1517/17425255.4.10.1265 (2008).
Camacho-Cáceres, K. I. et al. Multiple criteria optimization joint analyses of microarray experiments in lung cancer: from existing microarray data to new knowledge. Cancer Med. 4 (12), 1884–1900. https://doi.org/10.1002/cam4.540 (2015).
Isaza, C. et al. Biological signaling pathways and potential mathematical network representations: biological discovery through optimization. Cancer Med. 7(5), 1875–1895. https://doi.org/10.1002/cam4.1301 (2018).
Narváez-Bandera, I., Suárez-Gómez, D., Isaza, C. E. & Cabrera-Ríos, M. Multiple criteria optimization (MCO): a gene selection deterministic tool in RStudio. PLoS One. 17 (1), e0262890. https://doi.org/10.1371/JOURNAL.PONE.0262890 (2022).
Szklarczyk, D. et al. The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 49, D605–D612. https://doi.org/10.1093/NAR/GKAA1074 (2021).
Vallese, F. et al. Architecture of the human erythrocyte ankyrin-1 complex. Nat. Struct. Mol. Biol. 29 (7), 706. https://doi.org/10.1038/S41594-022-00792-W (2022).
Yang, L., Shu, H., Zhou, M. & Gong, Y. Literature review on genotype–phenotype correlation in patients with hereditary spherocytosis. Clin. Genet. 102 (6), 474–482. https://doi.org/10.1111/CGE.14223 (2022).
Wu, J., Cao, J., Fan, Y., Li, C. & Hu, X. Comprehensive analysis of miRNA–mRNA regulatory network and potential drugs in chronic chagasic cardiomyopathy across human and mouse. BMC Med. Genomics 14 (1), 1–13. https://doi.org/10.1186/S12920-021-01134-3 (2021).
Kim, J., Lee, K., Jeon, Y. & Oh, J. Identification of genes related to Parkinson’s disease using expressed sequence tags. DNA Res. 13 (6), 275. Accessed: Mar. 12, 2022. [Online]. Available: https://academic.oup.com/dnaresearch/article-abstract/13/6/275/464485 (2006).
Loeffler, D. A., Camp, D. M. & Conant, S. B. Complement activation in the Parkinson’s disease substantia nigra: an immunocytochemical study. J. Neuroinflammation 3 (1), 1–8. https://doi.org/10.1186/1742-2094-3-29 (2006).
Liu, C. Z. et al. Correlation of matrix metalloproteinase 3 and matrix metalloproteinase 9 levels with nonmotor symptoms in patients with Parkinson’s disease. Front. Aging Neurosci. 14, 889257. https://doi.org/10.3389/FNAGI.2022.889257 (2022).
Park, H. et al. IL-29 is the Dominant type III Interferon produced by Hepatocytes during Acute Hepatitis C virus infection. https://doi.org/10.1002/hep.25897 (2012).
Zhang, S. et al. The effect and mechanism of metallothionein MT1M on hepatocellular carcinoma cell. Eur. Rev. Med. Pharmacol. Sci. europeanreview.org, Accessed: Oct. 02, 2023. [Online]. Available: http://www.europeanreview.org/wp/wp-content/uploads/695-701.pdf (2018).
Ye, Y., Yu, B., Wang, H. & Yi, F. Glutamine metabolic reprogramming in hepatocellular carcinoma. Front. Mol. Biosci. 10. https://doi.org/10.3389/FMOLB.2023.1242059 (2023).
Zhu, X. B. et al. Identifying and exploring the candidate susceptibility genes of cirrhosis using the multi-tissue transcriptome-wide Association study. Front. Genet. 13. https://doi.org/10.3389/FGENE.2022.878607 (2022).
Wang, Q. et al. Sex-specific circulating unconventional neutrophils determine immunological outcome of autoinflammatory Behçet’s uveitis. Cell. Discov. 10(1). https://doi.org/10.1038/S41421-024-00671-2 (2024).
Klein, S. L. & Flanagan, K. L. Sex differences in immune responses. Nat. Rev. Immunol. 16 (10), 626–638. https://doi.org/10.1038/NRI.2016.90 (2016).
Ruggieri, A., Gagliardi, M. C. & Anticoli, S. Sex-dependent outcome of hepatitis B and C viruses infections: synergy of sex hormones and immune responses? Front. Immunol. 9, 1. https://doi.org/10.3389/FIMMU.2018.02302 (2018).
Safran, M. et al. The GeneCards suite. Practical Guide to Life Science Databases, pp. 27–56. https://doi.org/10.1007/978-981-16-5812-9_2 (2021).
Willis, A. W. et al. Incidence of Parkinson disease in North America. NPJ Parkinsons Dis. 8, 1. https://doi.org/10.1038/S41531-022-00410-Y (2022).
Miller, S. C., MacDonald, C. C., Kellogg, M. K., Karamysheva, Z. N. & Karamyshev, A. L. Specialized ribosomes in Health and Disease. Int. J. Mol. Sci. 24(7), 1. https://doi.org/10.3390/IJMS24076334 (2023).
Panda, A. et al. Tissue-and development-stage–specific mRNA and heterogeneous CNV signatures of human ribosomal proteins in normal and cancer samples. Nucleic Acids Res. 48(13), 7079–7098. https://doi.org/10.1093/nar/gkaa485 (2020).
Kander, M. & Cui, Y. Gender difference in oxidative stress: a new look at the mechanisms for cardiovascular diseases. J. Cell. Mol. Med. 21 (5), 1024–1032. https://doi.org/10.1111/jcmm.13038 (2017).
Tower, J., Pomatto, L. C. D. & Davies, K. J. A. Sex differences in the response to oxidative and proteolytic stress. Redox Biol. 31, 1. https://doi.org/10.1016/J.REDOX.2020.101488 (2020).
Saadoun, D. et al. Role of Matrix metalloproteinases, Proinflammatory cytokines, and oxidative stress-derived molecules in Hepatitis C Virus-Associated mixed Cryoglobulinemia Vasculitis Neuropathy. Arthritis Rheum. 56 (4), 1315–1324. https://doi.org/10.1002/art.22456 (2007).
Tong, Z. B., Braisted, J., Chu, P. H. & Gerhold, D. The MT1G gene in LUHMES neurons is a sensitive biomarker of neurotoxicity. Neurotox. Res. 38 (4), 967–978. https://doi.org/10.1007/s12640-020-00272-3 (2020).
Kinast, V. et al. Identification of keratin 23 as a Hepatitis C Virus-Induced host factor in the Human Liver. Cells 8 (6), 610. https://doi.org/10.3390/CELLS8060610 (2019).
Hinkle, J. T. et al. STING mediates neurodegeneration and neuroinflammation in nigrostriatal α-synucleinopathy. Proc. Natl. Acad. Sci. U S A. 119 (15), e2118819119. https://doi.org/10.1073/PNAS.2118819119 (2022).
Duan, Z. et al. The association of ribosomal protein L18 with Newcastle disease virus matrix protein enhances viral translation and replication. 51(2), 129–140. https://doi.org/10.1080/03079457.2021.2013435 (2022).
Xie, J. et al. Inflammation and oxidative stress role of S100A12 as a potential diagnostic and therapeutic biomarker in Acute myocardial infarction. Oxid. Med. Cell. Longev. https://doi.org/10.1155/2022/2633123 (2022).
Li, H. et al. Identification and verification of ubiquitin D as a gene associated with hepatitis C virus-induced hepatocellular carcinoma. https://doi.org/10.1159/000525543 (2022).
Player, J. K., Riordan, S. M., Duncan, R. S. & Koulen, P. Analysis of Glaucoma associated genes in response to inflammation, an examination of a public data set derived from peripheral blood from patients with hepatitis C. Clin. Ophthalmol. 16, 2093. https://doi.org/10.2147/OPTH.S364739 (2022).
Webb, L. G. & Fernandez-Sesma, A. RNA viruses and the cGAS-STING pathway: reframing our understanding of innate immune sensing. Curr. Opin. Virol.53, 101206. https://doi.org/10.1016/J.COVIRO.2022.101206 (2022).
Neufeldt, C. J. et al. Hepatitis C Virus-Induced cytoplasmic organelles use the Nuclear Transport Machinery to establish an Environment Conducive to Virus Replication. PLoS Pathog. 9(10). https://doi.org/10.1371/JOURNAL.PPAT.1003744 (2013).
Barba, G. et al. Hepatitis C virus core protein shows a cytoplasmic localization and associates to cellular lipid storage droplets (1997). [Online]. Available: https://doi.org/www.pnas.org.
Dhar, D. et al. Human ribosomal protein L18a interacts with hepatitis C virus internal ribosome entry site. Arch. Virol. 151 (3), 509–524. https://doi.org/10.1007/s00705-005-0642-6 (2006).
Glaab, E. & Schneider, R. Comparative pathway and network analysis of brain transcriptome changes during adult aging and in Parkinson’s disease. Neurobiol. Dis. 74, 1–13. https://doi.org/10.1016/j.nbd.2014.11.002 (2015).
Moretti, R. et al. Hepatitis C Virus-Related Central and Peripheral Nervous System disorders. Brain Sci. 11 (12), 1569. https://doi.org/10.3390/BRAINSCI11121569 (2021).
Miric, D., Nahum, S., Jibidar, H. & Lezy-Mathieu, A. M. Vascular parkinsonism in an elderly woman with mixed cryoglobulinemia associated with hepatitis C infection. J. Am. Geriatr. Soc. 54 (11), 1798–1798. https://doi.org/10.1111/J.1532-5415.2006.00932.X (2006).
Kattoor, A. J., Pothineni, N. V. K., Palagiri, D. & Mehta, J. L. Oxidative stress in atherosclerosis. Curr. Atheroscler Rep. 19(11). https://doi.org/10.1007/S11883-017-0678-6 (2017).
Chang, K. H. & Chen, C. M. The role of oxidative stress in Parkinson’s disease. Antioxidants 9, 597. https://doi.org/10.3390/ANTIOX9070597 (2020).
Choi, M. L. et al. Pathological structural conversion of α-synuclein at the mitochondria induces neuronal toxicity. Nat. Neurosci.25 (9), 1134–1148. https://doi.org/10.1038/S41593-022-01140-3 (2022).
Subramaniam, S. R. & Chesselet, M. F. Mitochondrial dysfunction and oxidative stress in Parkinson’s disease. Prog Neurobiol. 106–107. https://doi.org/10.1016/J.PNEUROBIO.2013.04.004 (2013).
Ismail, S. A. et al. Study of glutathion peroxidase (GPX) enzyme level in patients with chronic hepatitis C virus. AAMJ 3(2) (2005).
Fan, Y. G. et al. From zinc homeostasis to disease progression: unveiling the neurodegenerative puzzle. Pharmacol. Res. 199. https://doi.org/10.1016/j.phrs.2023.107039 (2024).
Masliah, E., Dumaop, W., Galasko, D. & Desplats, P. Distinctive patterns of DNA methylation associated with Parkinson disease: identification of concordant epigenetic changes in brain and peripheral blood leukocytes. Epigenetics 8(10), 1030–1038. https://doi.org/10.4161/EPI.25865 (2013).
Vlachos, N., Lampros, M. G., Lianos, G. D., Voulgaris, S. & Alexiou, G. A. Blood biomarkers for predicting coagulopathy occurrence in patients with traumatic brain injury: a systematic review. 16(12), 935–945. https://doi.org/10.2217/BMM-2022-0294 (2022).
Delic, V., Beck, K. D., Pang, K. C. H. & Citron, B. A. Biological links between traumatic brain injury and Parkinson’s disease. Acta Neuropathol. Commun. 8(1), 1–16. https://doi.org/10.1186/S40478-020-00924-7 (2020).
Brett, B. L., Gardner, R. C., Godbout, J., Dams-O’Connor, K. & Keene, C. D. Traumatic brain injury and risk of neurodegenerative disorder. Biol. Psychiatry 91(5), 498–507. https://doi.org/10.1016/J.BIOPSYCH.2021.05.025 (2022).
Rojas, A., Lindner, C., Schneider, I., Gonzalez, I. & Uribarri, J. The RAGE axis: A relevant inflammatory hub in human diseases. Biomolecules 14(4). https://doi.org/10.3390/biom14040412 (2024).
González-Reimers, E. et al. Thrombin activation and liver inflammation in advanced hepatitis C virus infection. World J. Gastroenterol. 22 (18), 4427–4437. https://doi.org/10.3748/wjg.v22.i18.4427 (2016).
Pretorius, E., Page, M. J., Mbotwe, S. & Kell, D. B. Lipopolysaccharide-binding protein (LBP) can reverse the amyloid state of fibrin seen or induced in Parkinson’s disease. PLoS One. 13 (3), e0192121. https://doi.org/10.1371/JOURNAL.PONE.0192121 (2018).
Galea, I. The blood–brain barrier in systemic infection and inflammation. Cell. Mol. Immunol. 18(11), 2489–2501. https://doi.org/10.1038/s41423-021-00757-x (2021).
Kwon, H. S. & Koh, S. H. Neuroinflammation in neurodegenerative disorders: the roles of microglia and astrocytes. Transl. Neurodegen. 9(1), 1–12. https://doi.org/10.1186/S40035-020-00221-2 (2020).
Mutez, E. et al. Involvement of the immune system, endocytosis and EIF2 signaling in both genetically determined and sporadic forms of Parkinson’s disease. Neurobiol. Dis. 63, 165–170. https://doi.org/10.1016/j.nbd.2013.11.007 (2014).
Falchetti, M., Prediger, R. D. & Zanotto-Filho, A. Classification algorithms applied to blood-based transcriptome meta-analysis to predict idiopathic Parkinson’s disease. Comput. Biol. Med. 124, 103925. https://doi.org/10.1016/j.compbiomed.2020.103925 (2020).
Salazar, J. et al. Divalent metal transporter 1 (DMT1) contributes to neurodegeneration in animal models of Parkinson’s disease. Proc. Natl. Acad. Sci. U S A. 105 (47), 18578–18583. https://doi.org/10.1073/PNAS.0804373105 (2008).
Kramer, D. & Piper, H. B. C.-E. WASP family proteins: Molecular mechanisms and implications in human disease. Elsevier, Accessed: Oct. 02, 2023. [Online]. J. Cell Biol. Available: https://www.sciencedirect.com/science/article/pii/S0171933522000474.
Fernández-Calleja, V., Fernández-Nestosa, M. J., Hernández, P., Schvartzman, J. B. & Krimer, D. B. CRISPR/Cas9-mediated deletion of the Wiskott-Aldrich syndrome locus causes actin cytoskeleton disorganization in murine erythroleukemia cells. PeerJ 7(1). https://doi.org/10.7717/PEERJ.6284 (2019).
Lim, V. Y., Zehentmeier, S., Fistonich, C. & Pereira, J. P. A chemoattractant-guided Walk through Lymphopoiesis: from hematopoietic stem cells to mature B lymphocytes. Adv. Immunol. 134, 47–88. https://doi.org/10.1016/BS.AI.2017.02.001 (2017).
Satoh, J. I., Asahina, N., Kitano, S. & Kino, Y. Profile of ChIP-Seq-based PU.1/Spi1 target genes in Microglia. Gene Regul. Syst. Bio. 8, 127–139. https://doi.org/10.4137/GRSB.S19711 (2014).
Sato, M., Ogihara, K., Sawahata, R., Sekikawa, K. & Kitani, H. Impaired LPS-induced signaling in microglia overexpressing the Wiskott–Aldrich syndrome protein N-terminal domain. Int. Immunol. 19 (8), 901–911. https://doi.org/10.1093/INTIMM/DXM074 (2007).
Qian, L. & Flood, P. M. Microglia and Parkinson’s disease. Immunol. Res. 41 (3), 155–164. https://doi.org/10.1007/S12026-008-8018-0 (2008).
Sowell, R. A., Owen, J. B. & Allan Butterfield, D. Proteomics in animal models of Alzheimer’s and Parkinson’s diseases. Aging Res. Rev. 8 (1), 1–17. https://doi.org/10.1016/J.ARR.2008.07.003 (2009).
Lamontagne-Proulx, J. et al. Portrait of blood-derived extracellular vesicles in patients with Parkinson’s disease. Neurobiol. Dis. 124, 163–175. https://doi.org/10.1016/J.NBD.2018.11.002 (2019).
Martorella, M., Barford, K., Winckler, B. & Deppmann, C. D. Emergent role of coronin-1a in neuronal signaling. Vit. Horm. 104, 113–131. https://doi.org/10.1016/BS.VH.2016.10.002 (2017).
Montaldo, C. et al. Fibrogenic signals persist in DAA-treated HCV patients after sustained virological response. J. Hepatol. 75 (6), 1301–1311. https://doi.org/10.1016/J.JHEP.2021.07.003 (2021).
Pandey, H. S., Kapoor, R., Bindu & Seth, P. Coronin 1A facilitates calcium mobilization and promotes astrocyte reactivity in HIV-1 neuropathogenesis. FASEB Bioadv. 4(4), 254–272. https://doi.org/10.1096/FBA.2021-00109 (2022).
Xia, X., Wang, Y. & Zheng, J. C. Extracellular vesicles, from the pathogenesis to the therapy of neurodegenerative diseases. Transl. Neurodegener. 11(1). https://doi.org/10.1186/S40035-022-00330-0 (2022).
Schejter, Y. D., Mandola, A. & Reid, B. Coronin 1A deficiency identified by newborn screening for severe combined immunodeficiency. LymphoSign J. 6(1), 17–25. https://doi.org/10.14785/LYMPHOSIGN-2019-0001 (2019).
Kaul, S. et al. Tyrosine phosphorylation regulates the proteolytic activation of protein kinase Cdelta in dopaminergic neuronal cells. J. Biol. Chem. 280 (31), 28721–28730. https://doi.org/10.1074/JBC.M501092200 (2005).
Zhou, Q. et al. Increased expression of coronin-1a in amyotrophic lateral sclerosis: a potential diagnostic biomarker and therapeutic target. Front. Med. 16 (5), 723–735. https://doi.org/10.1007/S11684-021-0905-Y (2022).
Davis, S. & Meltzer, P. S. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics 23(14), 1846–1847. https://doi.org/10.1093/bioinformatics/btm254 (2007).
Liao, Y., Smyth, G. K. & Shi, W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 47(8), 1. https://doi.org/10.1093/nar/gkz114 (2019).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1), 139–140. https://doi.org/10.1093/BIOINFORMATICS/BTP616 (2010).
Carlson, M., Falcon, S., Pages, H. & Li, N. org.Hs.eg.db: Genome wide annotation for Human. R package version 3.2.3 (2019).
Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11(3), 1. https://doi.org/10.1186/GB-2010-11-3-R25 (2010).
Acknowledgements
We wish to acknowledge financial support to C.C. from the NIH NIGMS RISE program R25 GM127191.
Author information
Contributions
C.I. and M.C. conceived the idea and proposed the line of work. I.N. developed the analyses and wrote the initial manuscript. I.N. and D.S. designed the Shiny tool and the computational framework. C.I. and M.C. supervised the development, analysis, and conclusions of this work. All authors discussed the results and contributed to the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Narváez-Bandera, I., Suárez-Gómez, D., Castro-Rivera, C.D.M. et al. Hepatitis C virus infection and Parkinson’s disease: insights from a joint sex-stratified BioOptimatics meta-analysis. Sci Rep 14, 22838 (2024). https://doi.org/10.1038/s41598-024-73535-0
DOI: https://doi.org/10.1038/s41598-024-73535-0