Nuclear and mitochondrial genetic variants associated with mitochondrial DNA copy number

Mitochondrial DNA copy number (mtDNA-CN) is a biomarker for mitochondrial dysfunction associated with several diseases. Previous genome-wide association studies (GWAS) have been performed to unravel underlying mechanisms of mtDNA-CN regulation. However, the identified gene regions explain only a small fraction of mtDNA-CN variability. Most of this data has been estimated from microarrays based on various pipelines. In the present study we aimed to (1) identify genetic loci for qPCR-measured mtDNA-CN from three studies (16,130 participants) using GWAS, (2) identify potential systematic differences between our qPCR derived mtDNA-CN measurements compared to the published microarray intensity-based estimates, and (3) disentangle the nuclear from mitochondrial regulation of the mtDNA-CN phenotype. We identified two genome-wide significant autosomal loci associated with qPCR-measured mtDNA-CN: at HBS1L (rs4895440, p = 3.39 × 10–13) and GSDMA (rs56030650, p = 4.85 × 10–08) genes. Moreover, 113/115 of the previously published SNPs identified by microarray-based analyses were significantly equivalent with our findings. In our study, the mitochondrial genome itself contributed only marginally to mtDNA-CN regulation as we only detected a single rare mitochondrial variant associated with mtDNA-CN. Furthermore, we incorporated mitochondrial haplogroups into our analyses to explore their potential impact on mtDNA-CN. However, our findings indicate that they do not exert any significant influence on our results.

Figure S3: Mitochondrial sub-compartments were analyzed for all genes with COMPARTMENTS resource [1] , integrating all sources of protein subcellular localization information (including manually curated literature, high-throughput microscopy-based screens, primary sequence predictions, and automatic text mining results), based on the meta-analysis top hits.Scores from https://download.jensenlab.org/human_compartment_integrated_full.tsvranking between 0.5 and 5 (highest confidence) limited to mitochondria related compartments.

E
Figure S5: Regional plots created with LocalZoom [2] showing a detailed picture of the genome-wide significant loci.shows the p-values from our mtDNA-CN GWAS (C) shows the p-values of the association on the expression levels on HBS1L respectively MYB obtained from the eQTLGen Consortium [3] .

Figure S8:
Results of colocalization analysis of genes around our GWAS lead SNP on chromosome 17 in whole blood.The purple square represents our GWAS top hit (rs56030650) in all plots.Results are shown for all SNPs in the region ±250kB surrounding rs56030650.For each gene, (A) shows the colocalization of mtDNA-CN GWAS and eQTL signals.(B) shows the p-values from our mtDNA-CN GWAS (C) shows the p-values of the association on the expression levels on the respective gene obtained from the eQTLGen Consortium [3] .

Figure S9:
Protein-protein interaction network analysis using the STRING database, based on the findings of our colocalization analysis.The eight nodes in the network fall into two clusters, with high interaction (PPI enrichment p-value: 1.75x10 -09 ).The node colors indicate connections with phenotypes curated by the Monarch Initiative [4] , with the highlighted entries in the list considered.shown on the x-axis, the frequency is represented on the y-axis.MitoImpute [5] with the Reference Panel v1 0.01 (MAF 1%) was used to infer missing mtDNA variants and final haplogroup estimation using HaploGrep 2 (version 2.4) 6 was performed in 16,121 samples.

Figure S11:
Mitochondrial DNA copy number distribution between the five haplogroup-clusters (R0, JT, UK, Other Europeans, Non-Europeans) for men and women separately.Medians of mitochondrial DNA copy number for each cluster are labelled.

Figure S1 :
Figure S1: Correlation plot (Pearson correlation coefficient) to test linear relation between mtDNA-CN and covariates in each study.Red indicates a positive correlation, while blue indicate negative correlation coefficients.

Figure S2 :
Figure S2: Venn Diagram illustrating differences in genotyped mitochondrial variants on the five used microarrays.In CHRIS, the Illumina OmniExpressExome and the OMNI 2.5Exome chip array, in the GCKD study, the OMNI 2.5Exome BeadChip and in the AugUR study, different versions of the Illumina Global Screening Array (v1/v3) were used for genotyping.

Figure S4 :
Figure S4: Manhattan plots illustrating meta-analyses of genome-wide association studies for mtDNA-CN from all three studies (GCKD, AugUR, CHRIS) for all additional adjustment models (A-E).The red line represents the threshold for genome-wide significance (p-value<5×10 −8 ).The x-axis gives the chromosomes, the y-axis shows the -log10 p-values of imputed SNPs.Abbreviations: RBC = red blood cell count, WBC = white blood cell count, PL = platelets count, PC = principal components.

Figure S6 :
Figure S5: Regional plots created with LocalZoom[2] showing a detailed picture of the genome-wide significant loci.The two regions associated with mtDNA-CN are shown: (a) the variant between HBS1L and the MYB gene, (b) the missense variant in the GSDMA gene.(a) and (b) show the main model adjusted for age, sex and the first four PCs.The linkage disequilibrium (LD) of each variant with the labelled lead SNP is indicated by color.The x-axis gives the chromosome position and below the genes within the plotted genomic region.The y-axes show the -log10 p-values of imputed SNPs as well as the recombination rate.

Figure S7 :
Figure S7: Results of colocalization analysis of HBS1L and MYB in whole blood.The purple square represents our GWAS top hit (rs4895440) on chromosome 6 in all plots.Results are shown for all SNPs in the region ±250kB surrounding rs4895440.For each gene, (A) shows the colocalization of mtDNA-CN GWAS and eQTL signals.(B)shows the p-values from our mtDNA-CN GWAS (C) shows the p-values of the association on the expression levels on HBS1L respectively MYB obtained from the eQTLGen Consortium[3] .

Figure S10 :
Figure S10: Distribution of haplogroups separated for each study cohort.Haplogroups and macrohaplogroups areshown on the x-axis, the frequency is represented on the y-axis.MitoImpute[5] with the Reference Panel v1 0.01 (MAF 1%) was used to infer missing mtDNA variants and final haplogroup estimation using HaploGrep 2 (version 2.4) 6 was performed in 16,121 samples.

Table S1 :
Overview on previous studies investigating genetic aspects of mtDNA copy number regulation from blood samples.

Table S2 :
Overview on ß-estimates and p-values of three top hits in all different adjustment models including an additional main model without GCKD to allow differentiation between power effects and influence on different adjustment variables.

Table S3 :
Allele frequencies and number of carriers of top hits for the main model (adjusted for age, sex, 4 PCs) for each study separately.
Abbreviations: Chr = Chromosome; Allele1 = effect allele; SE = standard deviation of ß-estimate (effect) * This variant was considered a technical artifact and therefore, excluded from further considerations.

Table S4 :
Results of eQTL and colocalization analysis in whole blood using data from the eQTLGen Consortium (n=31,684) 3 FDR: false-discovery rate a Selection criteria: genes in a window ±250 kB around the GWAS top-hit, FDR <0.05 in the association with expression b Numerical underflow appears with values <3.27×10 -310 and therefore, in the eQTL Consortium, all p-values below were set to this threshold.This might impact the calculations of the posterior probabilities of H3 and H4.c LD taken from LDlink (https://ldlink.nih.gov/) using all European populations H3: both mtDNA-CN and expression are associated with SNPs in the gene region, but with different causal variants H4: both mtDNA-CN and expression are associated and share a single causal variant

Table S7 :
Mediation effects of SNPs rs56030650, respectively rs48955440 on mtDNA-CN in the AugUR (a) and CHRIS study (b).