Genome-wide pathway analysis implicates intracellular transmembrane protein transport in Alzheimer disease

Hong, Mun-Gwan; Alexeyenko, Andrey; Lambert, Jean-Charles; Amouyel, Philippe; Prince, Jonathan A

doi:10.1038/jhg.2010.92

Short Communication
Published: 29 July 2010

Mun-Gwan Hong¹, Andrey Alexeyenko¹, Jean-Charles Lambert^{2,3,4}, Philippe Amouyel^{2,3,4} & Jonathan A Prince¹

Journal of Human Genetics volume 55, pages 707–709 (2010)

Abstract

We developed and implemented software for the analysis of genome-wide association studies in the context of biological pathway enrichment and have here applied our algorithm to the study of Alzheimer disease (AD). Using genome-wide association data in a large French population, we observed a highly significant enrichment of genes involved in intracellular protein transmembrane transport, including several mitochondrial proteins and nucleoporins. An intriguing aspect of these findings is the implication that TOMM40, the channel-forming subunit of the translocase of the mitochondrial outer membrane complex, and a gene generally considered to be indiscernible from APOE because of linkage disequilibrium, may itself contribute to Alzheimer pathology. Results provide an indication that protein trafficking, in particular across the nuclear and mitochondrial membranes, may contribute to risk for AD.

Main

Genome-wide association studies are now abundant, with hundreds of newly identified single loci shown with a high degree of probability to influence a variety of traits and diseases.¹ However, for almost all tested traits, typically only 1–2 genes survive correction for multiple testing in a genome-wide context, leaving open the question of whether additional risk genes exist. An emerging approach to understanding these studies in a larger biological context is to explore the upper distribution of the most significant genes for an enrichment of certain classes of function. Because of the depth of annotation of the genome, the preferred way to do this is by means of the gene ontology (GO).² This is a relatively new approach, stemming from the application of GO-based analyses to gene expression data,³ but despite its promise only a handful of replicated cases of pathway enrichment have emerged.^{4, 5} One of the critical issues in enabling this strategy is to convert, with high fidelity, the single-nucleotide polymorphism (SNP) lists from genome-wide platforms to the list of the genes they represent. Toward this end, we developed a software program implemented in Perl that takes genome-wide SNP results as input (primarily from PLINK⁶), considers linkage disequilibrium (LD) across regions of significance and corrects for the inflation of significance due to gene length.⁵ In brief, our software automates the process of converting genome-wide SNP lists to gene lists, beginning with the retrieval of LD structures in analogous populations with denser genotyping data (that is, HapMap). When a group of markers is in high LD in HapMap (we use an r²>0.8 threshold), the markers are tied to a ‘proxy cluster’ that is treated as a single signal. 
Subsequently, each marker in the original SNP list with statistically significant evidence of association with a phenotype is evaluated to see (a) if it belongs to any proxy cluster and (b) if the marker itself or any marker in the cluster is located in a genic region. Any marker or cluster that overlaps a region extending across a gene is assigned as a signal indicating the possible association of that gene. To correct the multiple-testing problem that emerges due to multiple signals across a gene, the P-value for each gene is adjusted by multiplication of the lowest P-value of the assigned signals by the number of signals. An illustration of the algorithm can be found in our earlier paper.⁵ Here, we have applied this program to a genome-wide association study in a French Alzheimer disease (AD) case–control sample.
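
The conversion described above can be illustrated with a minimal Python sketch. It is not the authors' Perl program: the function names, the input layout (pairwise LD triples and a marker-to-gene mapping) and the cap of adjusted P-values at 1 are assumptions made for illustration. Markers joined by r² above the threshold form a proxy cluster, each cluster counts as one signal for every gene that any of its members overlaps, and the gene-level P-value is the lowest signal P-value multiplied by the number of signals.

```python
from collections import defaultdict


def proxy_clusters(ld_pairs, r2_threshold=0.8):
    """Return a find(marker) function mapping each marker to its cluster root.

    ld_pairs is an iterable of (marker_a, marker_b, r2) triples (hypothetical
    input format). Markers never seen in ld_pairs form singleton clusters.
    """
    parent = {}

    def find(marker):
        parent.setdefault(marker, marker)
        while parent[marker] != marker:
            parent[marker] = parent[parent[marker]]  # path compression
            marker = parent[marker]
        return marker

    for a, b, r2 in ld_pairs:
        root_a, root_b = find(a), find(b)
        if r2 > r2_threshold:
            parent[root_a] = root_b
    return find


def gene_pvalues(snp_p, snp_genes, ld_pairs, r2_threshold=0.8):
    """Gene-level P-values: lowest signal P multiplied by the signal count.

    snp_p maps genotyped markers to association P-values; snp_genes maps
    markers (genotyped or proxy) to the set of genes they overlap.
    """
    find = proxy_clusters(ld_pairs, r2_threshold)

    # One signal per cluster: keep the best P-value among genotyped members.
    cluster_p = {}
    for snp, p in snp_p.items():
        root = find(snp)
        cluster_p[root] = min(p, cluster_p.get(root, 1.0))

    # A cluster is assigned to every gene overlapped by any of its members.
    cluster_genes = defaultdict(set)
    for snp, genes in snp_genes.items():
        cluster_genes[find(snp)].update(genes)

    gene_signals = defaultdict(list)
    for root, p in cluster_p.items():
        for gene in cluster_genes.get(root, ()):
            gene_signals[gene].append(p)

    # Capping at 1.0 is an assumption, not stated in the text.
    return {g: min(1.0, min(ps) * len(ps)) for g, ps in gene_signals.items()}
```

In this sketch a gene covered by two independent signals has its best P-value doubled, whereas two markers in the same proxy cluster count only once, which is the behavior the text describes for limiting the advantage of long, densely genotyped genes.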

The genome-wide association study included 2032 AD cases and 5328 controls of French ancestry and was conducted on the Illumina 610 platform (Illumina, San Diego, CA, USA).⁷ Appropriate institutional review board permission was obtained for this study (see Lambert et al.⁷ for details). A total of 511 978 SNPs that passed quality control (genotypes were excluded that had call rates <98%, a minor allele frequency of 1% or less, or significant deviation from Hardy–Weinberg equilibrium at P<10⁻⁶) were parsed and converted into a list of 16 503 genes using our algorithm. We note that the most significant signal (P=2.3 × 10⁻¹³⁰) overlapped the TOMM40 gene, near APOE. Also notable is that within this set no genes, save those around the APOE locus, show genome-wide significance. The resultant list of genes, the marker with the highest significance assigned to each gene, the number of genetic markers used for gene-based correction and a list of genes indiscernible due to LD are provided as Supplementary Table 1. For enrichment analysis we used our software together with the public domain tools provided by both the DAVID bioinformatics platform⁸ and Genecodis.⁹ After adjustment for gene length, there were 1351 genes that were assigned a P-value of 0.05 or less, and these were tested for enrichment against the study base set of 16 503 genes. Importantly, testing the top genes against a default full-genome base set gives an anticipated highly significant (and incorrect) enrichment of multiple high-level GO categories, emphasizing the importance of using the gene lists that are actually represented on, for example, the Illumina 610 platform.
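
The quality-control thresholds quoted above can be written as a simple per-marker filter. The following Python sketch is illustrative only: the function name and the assumption that call rate, minor allele frequency and Hardy–Weinberg P-value are already available per marker are not taken from the original analysis.

```python
def passes_qc(call_rate, maf, hwe_p,
              min_call_rate=0.98, maf_cutoff=0.01, hwe_cutoff=1e-6):
    """Apply the three exclusion criteria stated in the text.

    A marker is excluded if its call rate is below 98%, if its minor allele
    frequency is 1% or less, or if the Hardy-Weinberg test gives P < 1e-6.
    """
    if call_rate < min_call_rate:
        return False
    if maf <= maf_cutoff:
        return False
    if hwe_p < hwe_cutoff:
        return False
    return True
```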

In this genome-wide data set, we observed a highly significant enrichment of genes annotated as being involved in the biological process of intracellular transmembrane protein transport (GO:0065002, P=7.2 × 10⁻⁶ based on a hypergeometric test, P<0.001 based on 1000 permutations). Both Genecodis and DAVID provided equivalent results (the P-value for this pathway with DAVID was slightly lower at 5.2 × 10⁻⁶). There were 18 genes that contributed to this significance, and we show those specific genes, as well as the best genetic marker and its associated P-value, in Table 1. Both DAVID and Genecodis use a hypergeometric test for significance estimation, and taking the Genecodis example, significance was derived from 18 of 1331 genes in the enriched list being associated with the protein transport term, versus 69 in the total of 16 283 annotated genes. We note that the genes contributing to the signal for protein transport are dispersed widely in terms of individual significance across the top 1351 genes, emphasizing the possible existence of true association signals beyond only the first few most significant genes. A common problem with analyses of this nature is the false appearance of enrichment due to chromosomal clustering of functionally related genes.⁵ For this particular analysis, all genes contributing to enrichment were located in distinct genomic loci (also shown in Table 1), with the closest genes being several megabases apart. However, there were also a few cases of ontology categories that could be dismissed because of positional clustering, the most prominent being ‘cytokine activity’ due to an enrichment of interferon genes that are located in tight genomic proximity (not shown).
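
The hypergeometric calculation behind these figures can be reproduced in a few lines of Python. This is a sketch of a plain upper-tail hypergeometric test using exact integer arithmetic; DAVID and Genecodis may apply additional adjustments, so the value it returns for the counts quoted above (18 of 1331 versus 69 of 16 283) should be read as a check on the order of magnitude of the reported P=7.2 × 10⁻⁶ and not as a reproduction of either tool. The function name is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the base set, K: base-set genes carrying the annotation,
    n: genes in the tested list, k: tested genes carrying the annotation.
    """
    total = comb(N, n)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(K, n) + 1)
    )
    # Python integer true division stays accurate for very large integers.
    return favourable / total


# Counts quoted in the text for the Genecodis example.
p_transport = hypergeom_upper_tail(k=18, n=1331, K=69, N=16283)
print(f"{p_transport:.2e}")
```

Changing N and K to the counts for a full-genome base set while keeping the same tested list shows directly how the choice of background alters the result, which is the point made above about using only genes represented on the genotyping platform.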

Table 1 Enriched genes in Alzheimer disease involved in intracellular transmembrane protein transport

To understand in more detail the relationships among the genes contributing to the protein transport signal, we used FunCoup,¹⁰ which enables connections to be visualized based on genomics and experimental data, such as protein–protein interactions and gene expression correlations. We were particularly interested in how the identified protein transport pathway might be related to the APOE locus, which contains four genes that cannot be readily discerned due to LD (APOE, TOMM40, PVRL2 and BCL3). We therefore tested these 4 genes in turn for network connectivity to the 18 genes identified by enrichment analysis. To evaluate statistical significance we developed our own custom algorithm based on a previously described randomization strategy.¹¹ The network was re-wired in such a way that the number of links for each node was preserved, although its network neighbors were shuffled. The real (that is, FunCoup-predicted) network was randomized 100 times. In FunCoup, each link is characterized by a confidence value termed the final Bayesian score, a sum of individual log likelihood ratios of the integrated data sets (51 sets from 7 eukaryotes) that informed on functional coupling. For the analysis, we selected network edges using a final Bayesian score threshold of 4.8 (natural logarithm), which defined a network of 14 899 genes connected with 709 343 links. After every randomization, connections between a gene of interest and a gene group were counted. These counts were used to calculate the mean and s.d. Together with the respective number of links in the real network, these values produced one-sided Z-scores that estimated significance. In this analysis, only TOMM40 was strongly connected to the set of 18 protein transport genes, with 4 direct links and 792 indirect links, that is, links through a third gene (P<10⁻⁴ and P<10⁻⁷, respectively). BCL3 had a single, much weaker link (based on subcellular colocalization) to NUP88, but this was not significant. 
From Figure 1 there is a clear division into two groupings, one containing members of the nucleoporin gene family and the other consisting of mitochondrial genes. These two groups are connected, albeit weakly, by interactions between NUP98, TIMM44 and TIMM17A. In the final network (Figure 1) most of the original 18 genes are represented, with 3 (Magmas, TNKS and C18orf55) not having any significant connections with the remaining 15 genes. Notably, Magmas and C18orf55 are mitochondrial genes, whereas TNKS is a nuclear pore protein.
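
The randomization test used for the connectivity analysis can be sketched as follows. This Python example is a simplified illustration, not the authors' custom algorithm: it uses repeated edge swaps in the style of Maslov and Sneppen¹¹ to preserve every node's number of links, counts only direct links (indirect links through a third gene are omitted), and the function names, the number of swaps per randomization and the handling of a zero s.d. are assumptions.

```python
import random
from statistics import mean, stdev


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization of an undirected, simple network."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # try both orientations of the second edge
        if a == d or c == b:
            continue  # would create a self-link
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # would create a duplicate link
        present.discard(frozenset((a, b)))
        present.discard(frozenset((c, d)))
        present.update((new1, new2))
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def count_links(edges, gene, group):
    """Number of direct links between one gene and a group of genes."""
    group = set(group)
    return sum(
        1 for a, b in edges
        if (a == gene and b in group) or (b == gene and a in group)
    )


def link_z_score(edges, gene, group, n_random=100, seed=0):
    """Z-score of the observed link count against rewired networks."""
    rng = random.Random(seed)
    observed = count_links(edges, gene, group)
    null = [
        count_links(rewire(edges, 10 * len(edges), rng), gene, group)
        for _ in range(n_random)
    ]
    mu, sd = mean(null), stdev(null)
    if sd == 0:  # assumption: degenerate null distribution
        return float("inf") if observed > mu else 0.0
    return (observed - mu) / sd
```

Because each swap replaces links (a, b) and (c, d) with (a, d) and (c, b), every gene keeps its original number of links, so a highly connected gene cannot appear significant merely because it has many partners.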

We used gene expression data to explore the relationships of TOMM40 and APOE to the known base set of genes that have been confirmed to lead to AD (PSEN1, PSEN2 and APP). For this, a human brain sample was used with gene expression level estimates for 14 077 transcripts in 193 individuals.¹² In testing TOMM40 and APOE against the base set, we observed very strong correlation of TOMM40 to PSEN2 (P=1.3 × 10⁻¹³, r²=0.24) and a weaker association of APOE to APP (P=4.5 × 10⁻⁷, r²=0.12). The other correlations were not significant at α=0.05.
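
As a rough consistency check on these correlations, the following Python sketch computes a correlation coefficient and the corresponding t-statistic. It assumes a Pearson correlation tested with n − 2 degrees of freedom, which the text does not state explicitly, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t-statistic for testing r = 0 with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Values quoted in the text for TOMM40 versus PSEN2: r^2 = 0.24, n = 193.
t_tomm40_psen2 = t_statistic(math.sqrt(0.24), 193)
```

Under these assumptions the quoted r²=0.24 in 193 individuals corresponds to a t-statistic between 7 and 8, which is in keeping with a very small P-value.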

This study marks one of the first attempts to explore genome-wide association data in AD in the context of pathway enrichment. The enriched pathway that we have uncovered provides an intriguing indication that dysfunction of intracellular protein trafficking may be a common biological theme in AD. Although there is little support in the literature for the involvement of nucleoporin genes in AD, there is more substantial evidence for the importance of the mitochondria. In this regard, recent evidence suggests that import of β-amyloid into mitochondria may underlie β-amyloid toxicity,^{13, 14} in line with a larger body of evidence linking mitochondrial function to AD.¹⁵ More importantly, this process is facilitated by the translocase of the mitochondrial outer membrane complex, illustrating the potential importance of TOMM40, itself the highest ranked gene in this GWAS and the only gene in the BCL3-PVRL2-TOMM40-APOE LD block that is significantly connected to the identified pathway. It is plausible that age-related susceptibility to β-amyloid is mediated by a decrease in mitochondrial function that occurs with advancing age.^{16, 17} Import of β-amyloid into the nucleus through nucleoporins may also be an avenue worth pursuing in functional studies. Although TOMM40 shows pathway connectivity, whereas APOE does not, we emphasize that we in no way make the claim that the association of the region to AD is mediated by TOMM40. Rather, the data indicate that TOMM40 may also have a role in the disease, and this is echoed in the strong correlation of TOMM40 to PSEN2 expression. In summary, our approach rests on the idea that the genetic architecture of complex traits is not dispersed over unrelated genes in the genome, but rather that the mutational events that ultimately underlie trait variance can occur in functionally related genes. 
While implicating intracellular protein transport in AD is a highlight of the present study, we also consider the success of identifying a significant pathway component to a complex disease an important validation of this strategy.

References

1. McCarthy, M. I., Abecasis, G. R., Cardon, L. R., Goldstein, D. B., Little, J., Ioannidis, J. P. et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat. Rev. Genet. 9, 356–369 (2008).
2. Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000).
3. Mootha, V. K., Lindgren, C. M., Eriksson, K. F., Subramanian, A., Sihag, S., Lehar, J. et al. PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat. Genet. 34, 267–273 (2003).
4. Srinivasan, B. S., Doostzadeh, J., Absalan, F., Mohandessi, S., Jalili, R., Bigdeli, S. et al. Whole genome survey of coding SNPs reveals a reproducible pathway determinant of Parkinson disease. Hum. Mutat. 30, 228–238 (2009).
5. Hong, M. G., Pawitan, Y., Magnusson, P. K. & Prince, J. A. Strategies and issues in the detection of pathway enrichment in genome-wide association studies. Hum. Genet. 126, 289–301 (2009).
6. Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
7. Lambert, J. C., Heath, S., Even, G., Campion, D., Sleegers, K., Hiltunen, M. et al. Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer's disease. Nat. Genet. 41, 1094–1099 (2009).
8. Huang da, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
9. Nogales-Cadenas, R., Carmona-Saez, P., Vazquez, M., Vicente, C., Yang, X., Tirado, F. et al. GeneCodis: interpreting gene lists through enrichment analysis and integration of diverse biological information. Nucleic Acids Res. 37, W317–322 (2009).
10. Alexeyenko, A. & Sonnhammer, E. L. Global networks of functional coupling in eukaryotes from comprehensive data integration. Genome Res. 19, 1107–1116 (2009).
11. Maslov, S. & Sneppen, K. Specificity and stability in topology of protein networks. Science 296, 910–913 (2002).
12. Myers, A. J., Gibbs, J. R., Webster, J. A., Rohrer, K., Zhao, A., Marlowe, L. et al. A survey of genetic human cortical gene expression. Nat. Genet. 39, 1494–1499 (2007).
13. Anandatheerthavarada, H. K., Biswas, G., Robin, M. A. & Avadhani, N. G. Mitochondrial targeting and a novel transmembrane arrest of Alzheimer's amyloid precursor protein impairs mitochondrial function in neuronal cells. J. Cell Biol. 161, 41–54 (2003).
14. Hansson Petersen, C. A., Alikhani, N., Behbahani, H., Wiehager, B., Pavlov, P. F., Alafuzoff, I. et al. The amyloid beta-peptide is imported into mitochondria via the TOM import machinery and localized to mitochondrial cristae. Proc. Natl Acad. Sci. USA 105, 13145–13150 (2008).
15. Moreira, P. I., Duarte, A. I., Santos, M. S., Rego, A. C. & Oliveira, C. R. An integrative view of the role of oxidative stress, mitochondria and insulin in Alzheimer's disease. J. Alzheimers Dis. 16, 741–761 (2009).
16. Balaban, R. S., Nemoto, S. & Finkel, T. Mitochondria, oxidants, and aging. Cell 120, 483–495 (2005).
17. Hong, M. G., Myers, A. J., Magnusson, P. K. & Prince, J. A. Transcriptome-wide assessment of human brain and lymphocyte senescence. PLoS One 3, e3024 (2008).

Acknowledgements

This work was supported by the Swedish Medical Research Council (Grant 2007-2722).

Author information

Authors and Affiliations

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
Mun-Gwan Hong, Andrey Alexeyenko & Jonathan A Prince
Inserm U744, Lille, France
Jean-Charles Lambert & Philippe Amouyel
Institut Pasteur de Lille, Lille, France
Jean-Charles Lambert & Philippe Amouyel
Université de Lille Nord de France, Lille, France
Jean-Charles Lambert & Philippe Amouyel

Corresponding author

Correspondence to Jonathan A Prince.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplementary Information accompanies the paper on the Journal of Human Genetics website.

Supplementary information

Supplementary Table 1 (PDF 1243 kb)

About this article

Cite this article

Hong, MG., Alexeyenko, A., Lambert, JC. et al. Genome-wide pathway analysis implicates intracellular transmembrane protein transport in Alzheimer disease. J Hum Genet 55, 707–709 (2010). https://doi.org/10.1038/jhg.2010.92

Received: 09 April 2010
Revised: 31 May 2010
Accepted: 30 June 2010
Published: 29 July 2010
Issue Date: October 2010
DOI: https://doi.org/10.1038/jhg.2010.92
