Understanding mechanisms of hepatocellular damage may lead to new treatments for liver disease, and genome-wide association studies (GWAS) of alanine aminotransferase (ALT) and aspartate aminotransferase (AST) serum activities have proven useful for investigating liver biology. Here we report 100 loci associating with both enzymes, using GWAS across 411,048 subjects in the UK Biobank. The rare missense variant SLC30A10 Thr95Ile (rs188273166) associates with the largest elevation of both enzymes, and this association replicates in the DiscovEHR study. SLC30A10 excretes manganese from the liver to the bile duct, and rare homozygous loss of function causes the syndrome hypermanganesemia with dystonia-1 (HMNDYT1) which involves cirrhosis. Consistent with hematological symptoms of hypermanganesemia, SLC30A10 Thr95Ile carriers have increased hematocrit and risk of iron deficiency anemia. Carriers also have increased risk of extrahepatic bile duct cancer. These results suggest that genetic variation in SLC30A10 adversely affects more individuals than patients with diagnosed HMNDYT1.
Liver disease remains an area of high unmet medical need, causing 3.5% of deaths worldwide, and the burden of liver disease is rising rapidly, driven mainly by increasing rates of nonalcoholic fatty liver disease (NAFLD)1,2. Better characterizing the genetic determinants of liver disease may lead to new therapies3. In addition, liver injury is a common side effect of drugs, and is a frequent reason that drugs fail to progress through the development pipeline; understanding the molecular mechanisms of liver injury can aid in preclinical drug evaluation to anticipate and avoid off-target effects4,5.
Circulating liver enzymes are sensitive biomarkers of liver injury; in particular, alanine aminotransferase (ALT) and aspartate aminotransferase (AST) are released into the circulation during damage to hepatocyte membranes6,7. The activities of circulating liver enzymes have been found to be highly heritable6,8,9,10,11,12, and variation even within the normal reference range is predictive of disease13. Accordingly, genome-wide association studies (GWAS) of activities of circulating liver enzymes across large population samples have proven powerful for understanding the molecular basis of liver disease6,14,15,16,17,18,19,20,21,22,23,24. Combined GWAS of ALT and AST have previously revealed genetic associations providing potential therapeutic targets for liver disease such as PNPLA325 and HSD17B1326. To further study the genetics of hepatocellular damage, here we perform GWAS on serum activities of ALT and AST in 411,048 subjects, meta-analyzed across four ancestry groups in the UK Biobank (UKBB). We find 100 loci associated with both enzymes and show that the strongest effect is a rare missense variant in SLC30A10.
Discovery of ALT- and AST-associated loci by GWAS
We performed a GWAS of ALT and AST in four sub-populations in the UKBB (demographic properties, Supplementary Table 1; sample sizes, number of variants tested, and λGC values, Supplementary Table 2; genome-wide significant associations, Supplementary Data 1; Manhattan and QQ plots for each enzyme and sub-population, Supplementary Figs. 1 and 2). After meta-analyzing across sub-populations to obtain a single set of genome-wide p-values for each enzyme (Manhattan plots, Fig. 1), we found 244 and 277 independent loci associating at p < 5 × 10−8 with ALT and AST, respectively, defined by lead single nucleotide polymorphisms (SNPs) or indels separated by at least 500 kilobases and pairwise linkage disequilibrium (LD) r2 less than 0.2. Enzyme activities were strongly associated with coding variants in the genes encoding the enzymes, representing strong protein quantitative trait loci in cis (cis-pQTLs). For example, rs147998249, a missense variant Val452Leu in GPT (glutamic-pyruvic transaminase) encoding ALT, strongly associates with ALT (p < 10−300) and rs11076256, a missense variant Gly188Ser in GOT2 (glutamic-oxaloacetic transaminase 2) encoding the mitochondrial isoform of AST, strongly associates with AST (p = 6.3 × 10−62). While these strong cis-pQTL effects validated our ability to detect direct genetic influences on ALT and AST, the aim of this study was to detect genetic determinants of liver health that have downstream effects on both ALT and AST due to hepatocellular damage; therefore we focused the remainder of our analyses only on the variants associated with serum activity of both enzymes (labeled with black text on Fig. 1).
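The locus definition used above (lead variants at least 500 kilobases apart and with pairwise LD r2 below 0.2) can be illustrated with a greedy pruning sketch. This is not the authors' pipeline: the function name, the data layout, the toy variant IDs, and the r2 lookup table are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Variant(NamedTuple):
    rsid: str
    chrom: str
    pos: int    # base-pair position
    p: float    # association p-value


def prune_leads(
    variants: List[Variant],
    r2: Dict[FrozenSet[str], float],
    p_max: float = 5e-8,
    min_dist: int = 500_000,
    max_r2: float = 0.2,
) -> List[Variant]:
    """Walk significant variants from smallest to largest p-value and keep a
    variant only if it is far from, and weakly linked to, every kept lead."""
    leads: List[Variant] = []
    significant = sorted((v for v in variants if v.p < p_max), key=lambda v: v.p)
    for v in significant:
        independent = True
        for lead in leads:
            near = v.chrom == lead.chrom and abs(v.pos - lead.pos) < min_dist
            # Pairs missing from the lookup are treated as unlinked (r2 = 0).
            linked = r2.get(frozenset((v.rsid, lead.rsid)), 0.0) >= max_r2
            if near or linked:
                independent = False
                break
        if independent:
            leads.append(v)
    return leads


# Hypothetical toy data, not taken from the study.
toy = [
    Variant("v1", "1", 1_000_000, 1e-30),
    Variant("v2", "1", 1_200_000, 1e-12),  # within 500 kb of v1
    Variant("v3", "1", 2_000_000, 1e-10),  # distant from v1 but in LD with it
    Variant("v4", "2", 1_000_000, 1e-9),   # different chromosome
    Variant("v5", "2", 5_000_000, 1e-3),   # not genome-wide significant
]
toy_r2 = {frozenset(("v1", "v3")): 0.6}
lead_ids = [v.rsid for v in prune_leads(toy, toy_r2)]
print(lead_ids)
```

In this toy input only v1 and v4 survive, because v2 fails the distance criterion, v3 fails the LD criterion, and v5 fails the significance threshold.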
Focusing only on loci with both ALT and AST GWAS signals (lead variants from either GWAS were identical or shared proxies with r2 ≥ 0.8), we found a total of 100 independent loci associated with both enzymes (Fig. 2, Supplementary Data 2). As expected, effect sizes on ALT and AST at these loci were highly correlated (r = 0.98), and at all 100 loci the direction of effect on ALT and AST was concordant. Of these 100 loci, six were coincident or in strong LD with a published ALT or AST variant in the EBI-NHGRI GWAS Catalog, and 15 were within 500 kb of a published ALT or AST variant; 33 of the loci harbored a missense or predicted protein-truncating variant; and of the remaining 67 entirely noncoding loci, 19 were coincident or in strong LD with the strongest eQTL for a gene in liver, muscle, or kidney suggesting that effects on gene expression may drive their associations with ALT and AST. A majority (70 of the 100 loci) were shared with a distinct published association in the GWAS Catalog, suggesting pleiotropy with other traits.
Comparing the effect sizes of all lead variants (Supplementary Data 2), the strongest estimated effect was from rs188273166, a rare (MAF in White British = 0.12%) missense variant (Thr95Ile) in SLC30A10, associated with a 5.9% increase in ALT (95% CI, 4.6% to 7.1%; p = 1.6 × 10−24) and a 4.2% increase in AST (95% CI, 3.4% to 5.0%; p = 4.9 × 10−31). Because Thr95Ile is coding and not strongly linked to any other variants, we considered it likely to be the causal variant driving the association at the SLC30A10 locus. SLC30A10 encodes a manganese efflux transporter (solute carrier family 30 member 10, also known as zinc transporter 10 or ZnT10)27,28. Loss-of-function mutations in SLC30A10 have been reported to cause a rare recessive syndrome, hypermanganesemia with dystonia 1 (HMNDYT1), characterized by cirrhosis, dystonia, parkinsonism, polycythemia, and hypermanganesemia28,29,30,31,32,33,34. The next strongest effect on either enzyme was rs28929474, a missense variant (the Pi-Z allele) in SERPINA1 (serpin family A member 1) which causes alpha-1 antitrypsin deficiency (AATD) in its homozygous state35, associated with a 2.5% increase in ALT (95% CI, 2.1% to 2.8%; p = 1.4 × 10−72) and a 1.3% increase in AST (95% CI, 1.1% to 1.5%; p = 3.2 × 10−50); AATD manifests with both lung and liver damage. The most statistically significant association with either enzyme was rs738409, a common missense variant (Ile148Met) in PNPLA3 (patatin like phospholipase domain containing 3) known to strongly increase risk of liver disease25; it is associated with a 2.2% increase in ALT (95% CI, 2.1% to 2.3%; p < 10−300) and a 1.3% increase in AST (95% CI, 1.3% to 1.4%; p < 10−300).
We observed significant heterogeneity in effects between sexes for both enzymes (Cochran’s Q test p < 0.05/100 for both enzymes) for three of the lead variants: rs9663238 at HKDC1 (stronger effects in women), rs28929474 at SERPINA1 (stronger effects in men), and rs1890426 at FRK (stronger effects in men) (Supplementary Data 3).
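As a sketch of the heterogeneity test named above, Cochran's Q for two sex-specific estimates can be computed from each estimate and its standard error. The effect sizes below are invented for illustration and are not values from Supplementary Data 3; only the significance threshold of 0.05/100 comes from the text.

```python
import math


def cochran_q_two_groups(beta_a: float, se_a: float, beta_b: float, se_b: float):
    """Cochran's Q for two estimates, with its p-value on 1 degree of freedom."""
    w_a, w_b = 1.0 / se_a**2, 1.0 / se_b**2
    pooled = (w_a * beta_a + w_b * beta_b) / (w_a + w_b)
    q = w_a * (beta_a - pooled) ** 2 + w_b * (beta_b - pooled) ** 2
    # Survival function of a chi-square with 1 df: P(X > q) = erfc(sqrt(q / 2)).
    p = math.erfc(math.sqrt(q / 2.0))
    return q, p


# Hypothetical effects: one sex shows a larger effect than the other.
q, p = cochran_q_two_groups(beta_a=0.030, se_a=0.004, beta_b=0.010, se_b=0.004)
threshold = 0.05 / 100  # Bonferroni correction for 100 lead variants
print(f"Q = {q:.2f}, p = {p:.2e}, heterogeneous = {p < threshold}")
```

With only two groups, Q reduces to the squared difference of the estimates divided by the sum of their variances, which is why a single degree of freedom applies.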
We tested the 100 lead variants from the ALT and AST GWAS analysis for association with a broad liver disease phenotype (ICD10 codes K70-77; 14,143 cases and 416,066 controls), meta-analyzing liver disease association results across all four sub-populations (Supplementary Data 2). Of the 100 lead variants, 28 variants associate with liver disease with p < 0.05. As expected, variants associated with an increase in ALT and AST tend to be associated with a proportional increase in liver disease risk (across all lead variants, Pearson correlation of betas r = 0.82 for both enzymes; Fig. 3). Liver disease is found more frequently in our sample of carriers of SLC30A10 Thr95Ile (rs188273166), consistent with the observed increase in ALT and AST (OR = 1.47); however, owing to the small sample size of carriers and liver disease cases, we are underpowered to confidently determine whether this high point estimate is due to chance (although the 95% CI from the PLINK analysis used to estimate effects does not include OR = 1, the p value from the SAIGE analysis, which more accurately controls for Type I error in highly unbalanced case-control studies, is 0.07; see “Methods”).
Because SLC30A10 Thr95Ile had the strongest effect on ALT and AST of all of our lead variants and has not been reported as being associated with any phenotypes in the literature, we centered the following analyses on better understanding its function.
Validation of SLC30A10 Thr95Ile genotype
Because rare variants are especially prone to errors in array genotyping36, we sought to validate the array genotype calls for SLC30A10 Thr95Ile in a subset of 301,473 individuals who had also been exome sequenced (Supplementary Table 3). The only individual homozygous for the minor (alternate) allele by array was confirmed by exome sequencing; no further homozygotes were identified. Of 702 individuals called as heterozygous for Thr95Ile by array data who had exome data available, 699 (99.6%) were confirmed heterozygous by exome sequencing, while three were called homozygous reference by exome sequencing, suggesting an error either in the array typing or exome sequencing for these three individuals. Overall, these results demonstrate high concordance between array and exome sequencing, implying highly reliable genotyping.
Magnitude of ALT and AST elevation in SLC30A10 Thr95Ile carriers
After establishing the association between SLC30A10 Thr95Ile and ALT and AST, we sought to further explore the relationship between genotype and enzyme activity levels to understand clinical relevance. Carriers of Thr95Ile had a mean ALT of 27.37 U/L vs 23.54 U/L for noncarriers, and a mean AST of 28.85 U/L vs 26.22 U/L for noncarriers. Counting individuals with both ALT and AST elevated above 40 U/L, a commonly-used value for the upper limit of normal (ULN)7, 5.6% of carriers vs 3.6% of noncarriers had both enzymes elevated at the time of their UK Biobank sample collection, an increased relative risk of 58% (Fisher’s p = 8.1 × 10−4) (Supplementary Data 4).
Drinking behavior in SLC30A10 Thr95Ile carriers
SLC30A10 Thr95Ile has not been reported as associating with drinking behavior by any of the available studies in the GWAS Catalog. We used the drinking questionnaire taken by UK Biobank participants to assess the drinking status of SLC30A10 Thr95Ile carriers at enrollment (current, former, or never drinkers). While the rate of current drinkers is higher among carriers vs. non-carriers in the entire biobank (93.7% vs. 91.7%, Fisher’s p = 0.019) (Supplementary Data 4), this association is highly confounded by genetic ancestry and country of birth (Supplementary Table 4). Limiting to the White British subpopulation and individuals born in England, the rate of current drinking is not detectably different among carriers (94.0% vs. 93.4%, Fisher’s p = 0.57) while the difference in the rate of individuals with elevation of both ALT and AST over the ULN remains significant (5.5% vs. 3.5%, Fisher’s p = 4.6 × 10−3) (Supplementary Data 4).
Replication of ALT and AST associations
The initial association of rs188273166 with ALT and AST was identified in the White British population. To replicate this association in independent cohorts, we first identified groups besides the White British subpopulation harboring the variant in the UKBB. The only two other populations with a substantial number of SLC30A10 Thr95Ile carriers were individuals identifying as Other White and as White Irish (Supplementary Table 4); we tested for association with ALT and AST in these subpopulations. We then tested the association in two independent cohorts from the DiscovEHR collaboration between the Regeneron Genetics Center and the Geisinger Health System37. Meta-analyzing the association results across these four groups (N = 132,992 and N = 131,646, respectively) confirmed the Thr95Ile association with increased ALT and AST (p = 6.5 × 10−5 and p = 5.4 × 10−6, respectively) (Supplementary Fig. 3, Supplementary Table 5). We also searched repositories of available complete summary statistics for ALT and AST GWAS and found two prior studies that reported associations21,38. Although these studies were underpowered to detect significant associations and were not reported in units that allowed their inclusion in our replication analysis, they were consistent with increases in both enzymes (Supplementary Table 6).
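The text here does not state how the four replication groups were combined. A common choice is a fixed-effect inverse-variance-weighted meta-analysis, and the sketch below assumes that scheme. The per-cohort effects and standard errors are invented placeholders, not values from Supplementary Table 5.

```python
import math
from typing import List, Tuple


def ivw_meta(betas: List[float], ses: List[float]) -> Tuple[float, float, float]:
    """Fixed-effect inverse-variance-weighted estimate, its SE, and a two-sided p."""
    weights = [1.0 / se**2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal p-value; erfc stays accurate for large |z|.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


# Hypothetical per-cohort effects (for example, on a log enzyme scale).
cohort_betas = [0.05, 0.03, 0.06, 0.04]
cohort_ses = [0.02, 0.03, 0.04, 0.02]
beta, se, p = ivw_meta(cohort_betas, cohort_ses)
print(f"pooled beta = {beta:.4f}, SE = {se:.4f}, p = {p:.2e}")
```

The pooled standard error is smaller than that of any single cohort, which is how several individually modest replication signals can combine into a clear one.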
Independence of SLC30A10 Thr95Ile from neighboring ALT and AST associations
Because we applied distance and LD pruning to the results of the genome-wide scan to arrive at a set of lead variants, it was unclear how many independent association signals existed at the SLC30A10 locus. Revisiting trans-ancestry association results in a window including 1 Mb flanking sequence upstream and downstream of SLC30A10 revealed 76 variants with genome-wide significant associations with both ALT and AST (Fig. 4). These 76 variants clustered into three loci: SLC30A10 (only Thr95Ile, rs188273166); MTARC1 (mitochondrial amidoxime reducing component 1, lead variant rs2642438 encoding missense Ala165Thr, previously reported to associate with liver disease and liver enzymes39, and six additional variants together spanning 68 kilobases); and LYPLAL1-ZC3H11B (intergenic region between lysophospholipase like 1 and zinc finger CCCH-type containing 11B, with array-genotyped variant rs6541227 and 67 imputed variants spanning 46 kilobases), a locus previously reported to associate with non-alcoholic fatty liver disease (NAFLD)40.
To test for independence between these three loci, we performed ALT and AST association tests for each of the three array-typed variants while including the genotype of either one or both of the others as covariates. Associations were similar in these conditional analyses, suggesting that each of these three associations is not confounded by linkage disequilibrium with the other regional association signals (Supplementary Table 7). Therefore, the SLC30A10 Thr95Ile association is statistically independent of the associations at neighboring loci. This statistical independence of the liver enzyme associations does not preclude a long-distance regulatory interaction between the three loci; for example, rs188273166, despite encoding an amino acid change in SLC30A10, could conceivably influence transcription of MTARC1, and rs6541227, despite being nearest to LYPLAL1 and ZC3H11B, may influence transcription of SLC30A10. However, these three variants are not detected as liver eQTLs for the genes at neighboring loci in published data41.
Linkage of Thr95Ile to GWAS variants at SLC30A10
A GWAS of circulating toxic metals42 discovered an association between a common intronic variant in SLC30A10 (rs1776029; MAF in White British, 19.5%) and blood manganese levels, where the reference allele—which is the minor allele—is associated with increased circulating manganese. We calculated linkage disequilibrium statistics between rs1776029 and Thr95Ile and found that the minor allele of Thr95Ile (A) was in almost perfect linkage with the minor allele of rs1776029 (A) (r2 = 0.005, D′ = 0.98); Thr95Ile (rs188273166) is 154 times more frequent among carriers of at least one copy of the minor allele of common variant rs1776029 (95% CI = 84–325; Fisher’s p < 2.2 × 10−16). These results suggest that the previously reported association of rs1776029 with circulating manganese may be partially or completely explained by linkage with Thr95Ile (Supplementary Table 8); however, genotypes of Thr95Ile in the manganese GWAS or manganese measurements in the UK Biobank would be needed in order to perform conditional analysis or directly measure association of Thr95Ile with serum manganese. We then systematically tested nearby variants reported in the GWAS Catalog for any phenotype for linkage to Thr95Ile, measured by high |D′|. Combining GWAS Catalog information and |D′| calculations, we find nearly perfect linkage (|D′| > 0.90) between rs188273166-A (rare missense Thr95Ile) with rs1776029-A (intronic), rs2275707-C (3′UTR), and rs884127-G (intronic), all within the gene body of SLC30A10 (Supplementary Data 5). In addition to increased blood Mn42, these three common alleles have been associated with decreased magnesium/calcium ratio in urine43, decreased mean corpuscular hemoglobin (MCH)44,45,46, increased red blood cell distribution width44,45,46, and increased heel bone mineral density (BMD)46,47,48,49. 
A recent study, not yet in the GWAS Catalog, reported an association between another common intronic variant in SLC30A10 (rs759359281; MAF in White British, 5.6%) and liver MRI-derived iron-corrected T1 measures (cT1)50. However, the reported cT1-increasing allele of rs759359281, which is the minor allele, is in complete linkage (D′ = 1) with the major allele of Thr95Ile (rs188273166); in other words, the cT1-increasing allele and the Thr95Ile liver disease risk allele occur on different haplotypes, suggesting that the mechanism of this reported cT1 association is independent of Thr95Ile.
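The combination of a very low r2 with a D′ near 1, as reported in this section, is expected whenever a rare allele sits almost entirely on the haplotype carrying a much more common allele. The sketch below computes both statistics from allele and haplotype frequencies. The allele frequencies are rounded from the MAFs quoted in the text (0.12% and 19.5%), while the haplotype frequencies are assumptions chosen to represent the two extreme cases, not measured values.

```python
from typing import Tuple


def ld_stats(p_a: float, p_b: float, p_ab: float) -> Tuple[float, float]:
    """Return (D', r2) for alleles A and B.

    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2).
    p_ab: frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


rare, common = 0.0012, 0.195

# Case 1 (assumed): every copy of the rare allele lies on the common-allele haplotype.
d_prime_same, r2_same = ld_stats(rare, common, p_ab=rare)

# Case 2 (assumed): the rare allele never occurs together with the common allele.
d_prime_opposite, r2_opposite = ld_stats(rare, common, p_ab=0.0)

print(f"same haplotype:     D' = {d_prime_same:.2f}, r2 = {r2_same:.4f}")
print(f"opposite haplotype: D' = {d_prime_opposite:.2f}, r2 = {r2_opposite:.6f}")
```

In the first case D′ reaches 1 while r2 stays near 0.005, because r2 is capped by the mismatch in allele frequencies. In the second case D′ is −1, the situation in which two alleles occur on different haplotypes.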
Phenome-wide associations of SLC30A10 Thr95Ile
To explore other phenotypes associated with SLC30A10 Thr95Ile, we tested for association with 135 quantitative traits and 4398 ICD10 diagnosis codes within the White British population (Supplementary Data 6 and 7). We were particularly interested in testing associations with phenotypes related to HMNDYT1, the known syndrome caused by homozygous loss of function of SLC30A10. Besides ALT and AST elevation, rs188273166 was associated with other indicators of hepatobiliary damage such as decreased HDL cholesterol and apolipoprotein A (ApoA)51, decreased albumin, and increased gamma glutamyltransferase (GGT). Other phenome-wide significant quantitative trait associations were increases in hemoglobin concentration and hematocrit (Table 1); increased hematocrit, or polycythemia, is a known symptom of HMNDYT1. Liver iron-corrected T1 by MRI (cT1), although only measured in seven carriers, was above the population median value in all seven (Supplementary Fig. 4).
The only phenome-wide significant associations of diagnoses with SLC30A10 Thr95Ile were C24.0, extrahepatic bile duct carcinoma, and C22.1, intrahepatic bile duct carcinoma. There are eight Thr95Ile carriers with each type of cancer, of whom six have both types, for a total of ten carriers (1% of the 1,001 total carriers in the White British population) with bile duct carcinoma. Strikingly, over 5% of individuals with extrahepatic bile duct carcinoma (8 in 148) carry Thr95Ile (Table 2, Supplementary Data 7).
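The carrier counts in this paragraph follow from simple inclusion-exclusion, which the short check below reproduces using only the numbers quoted in the text. The variable names are illustrative.

```python
# Counts quoted in the paragraph above.
extrahepatic_carriers = 8
intrahepatic_carriers = 8
carriers_with_both = 6
white_british_carriers = 1001
extrahepatic_cases = 148

# Inclusion-exclusion: carriers with both cancers must be counted only once.
carriers_with_any = extrahepatic_carriers + intrahepatic_carriers - carriers_with_both

share_of_carriers = carriers_with_any / white_british_carriers
carrier_share_of_cases = extrahepatic_carriers / extrahepatic_cases

print(f"carriers with any bile duct carcinoma: {carriers_with_any}")
print(f"share of all carriers: {share_of_carriers:.1%}")
print(f"carriers among extrahepatic cases: {carrier_share_of_cases:.1%}")
```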
Among hematological manifestations of HMNDYT1, iron deficiency anemia was enriched among carriers (OR = 1.5, 95% CI, 1.1–1.9; p = 4.0 × 10−3). Searching for neurological manifestations similar to HMNDYT1, we find no association with Parkinson’s disease or dystonia but note that, as with liver diseases, we are powered to exclude only strong effects because of the small case number for these traits (Supplementary Data 7). The top non-cancer hepatobiliary associations with SLC30A10 Thr95Ile were with K83.0, cholangitis; K83.1, obstruction of bile duct; and K81.0, acute cholecystitis. Because biliary diseases are risk factors for cholangiocarcinoma and co-occur with them in our data, we tested whether SLC30A10 Thr95Ile was still associated with these biliary diseases, and the other selected quantitative traits and diagnoses, after removing the 148 individuals with extrahepatic bile duct cancer (Supplementary Table 9, Supplementary Table 10). All of the associations remained significant except for intrahepatic bile duct carcinoma.
To test whether the association with extrahepatic bile duct cancer was driven by a nearby association, and to assess other risk variants and the potential for false positives given the extreme case-control imbalance, we performed a GWAS of the phenotype; remarkably, SLC30A10 Thr95Ile was the strongest association genome-wide, with minimal evidence for systematic inflation of p-values (λGC = 1.05; Fig. 5, Supplementary Fig. 5).
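The genomic inflation factor λGC quoted above is conventionally the median observed 1-df chi-square statistic divided by the median expected under the null (about 0.455). The sketch below assumes that definition and uses only the Python standard library. The p-values are a synthetic uniform grid standing in for a well-calibrated scan, not results from this study.

```python
from statistics import NormalDist, median
from typing import Iterable

_STD_NORMAL = NormalDist()


def genomic_inflation(p_values: Iterable[float]) -> float:
    """Lambda_GC from two-sided p-values, each assumed to lie in (0, 1]."""
    # A two-sided p-value maps to a 1-df chi-square via the squared normal quantile.
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = _STD_NORMAL.inv_cdf(0.75) ** 2  # about 0.455
    return median(chi2) / expected_median


# Synthetic, perfectly uniform p-values: the null expectation.
n = 1001
uniform_p = [(i + 0.5) / n for i in range(n)]
lam = genomic_inflation(uniform_p)
print(f"lambda_GC = {lam:.3f}")
```

Values close to 1, such as the 1.05 reported here, indicate little systematic inflation of test statistics. Values well above 1 would point to confounding or miscalibration.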
Replication of hematocrit association
A key result from the phenome-wide scan that was not related to hepatocellular damage was the association between SLC30A10 Thr95Ile and increased hematocrit. Polycythemia is a symptom of HMNDYT1 mechanistically related to manganese overload. We meta-analyzed hematocrit values from the Other White and White Irish populations, the DiscovEHR data, and a non-UKBB population (INTERVAL Study) from a published meta-analysis of hematocrit values45, and found that the association replicated (N = 179,689, p = 0.013; Supplementary Table 11).
SLC30A10 expression in liver cell subtypes
Across organs, SLC30A10 is transcribed at the highest level in liver according to data from the GTEx Project52. The association of Thr95Ile with bile duct cancer led us to query expression of SLC30A10 in specific cell types within the liver using data from three single-cell RNA sequencing studies of liver53,54,55. These data show very low expression of SLC30A10 message in individual cells, but all studies detect expression in both hepatocytes and cholangiocytes (Supplementary Table 12, Supplementary Fig. 7). Immunohistochemistry has established that SLC30A10 protein is present in hepatocytes and bile duct epithelial cells and localizes to the cholangiocyte plasma membrane, facing the lumen of the bile duct32.
Bioinformatic characterization of SLC30A10 Thr95Ile
To understand potential functional mechanisms of the Thr95Ile variant, we examined bioinformatic annotations of SLC30A10 Thr95Ile from a variety of databases. The UNIPROT database shows that Thr95Ile occurs in the third of six transmembrane domains and shares a domain with a variant known to cause HMNDYT1 (Supplementary Fig. 8). Several in silico algorithms predict that Thr95Ile is a damaging mutation. The CADD (Combined Annotation Dependent Depletion) algorithm, which combines a broad range of functional annotations, gives the variant a score of 23.9, placing it in the top 1% of deleteriousness scores for genome-wide potential variants. The algorithm SIFT, which uses sequence homology and physical properties of amino acids, predicts Thr95Ile as deleterious. The algorithm PolyPhen-2 gives Thr95Ile a HumDiv score of 0.996 (probably damaging), based on patterns of sequence divergence from close mammalian homologs, and a HumVar score of 0.900 (possibly damaging), based on similarity to known Mendelian mutations. Cross-species protein sequence alignment in PolyPhen-2 shows only threonine or serine at position 95 across animals. These properties suggest that Thr95Ile substitution ought to affect the function of the SLC30A10 protein.
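The statement that a CADD score of 23.9 falls in the top 1% can be checked under the assumption that the quoted value is a PHRED-like scaled score, in which the score equals −10·log10 of the variant's rank fraction among all possible substitutions. The function name below is illustrative.

```python
def cadd_top_fraction(scaled_score: float) -> float:
    """Fraction of potential variants ranked at least this deleterious,
    assuming PHRED-like scaling: score = -10 * log10(rank fraction)."""
    return 10 ** (-scaled_score / 10)


fraction = cadd_top_fraction(23.9)
print(f"scaled score 23.9 -> top {fraction:.2%} of potential variants")
```

Under this scaling a score of 20 marks the top 1% and a score of 30 the top 0.1%, so 23.9 sits within the top 1% and roughly in the top 0.4%.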
Characterization of SLC30A10 variants in vitro
To test the protein localization of SLC30A10 harboring Thr95Ile as well as other variants, we created constructs with Thr95Ile (rs188273166) and the HMNDYT1-causing variants Leu89Pro (rs281860284) and del105-107 (rs281860285) and transfected these constructs into HeLa cells. Immunofluorescence staining revealed membrane localization for wild-type (WT) SLC30A10 which was abolished by the two HMNDYT1 variants, consistent with previous reports which showed that the HMNDYT1 variant proteins are mislocalized in the endoplasmic reticulum (ER)56. In contrast, Thr95Ile showed membrane localization similar to WT, suggesting that Thr95Ile does not cause a deficit in protein trafficking to the membrane (Fig. 6).
Expanded genetic landscape of risk for hepatocellular damage
Our trans-ancestry GWAS of ALT and AST reveals a broad genetic landscape of loci that modulate risk of hepatocellular damage or other diseases that cause increases in circulating ALT and AST, bringing the number of loci known to associate with serum activities of both enzymes from 10 (currently in the GWAS Catalog) to 100. Two loci had been previously reported in majority-European ancestry GWAS of ALT and AST as associating with both enzymes: PNPLA314,15,18,19,20,57,58,59 and HSD17B1314,18,26; we detect ALT and AST signals at both of these loci. Broadening beyond majority-European ancestry GWAS, an additional eight loci have been previously identified: PANX118, ALDH217,19, CYP2A618,57, ABCB1118, ZNF82718, EFHD118, AGER-NOTCH418,60, and AKNA18; we replicate four of these in our trans-ancestry GWAS (PANX1, ZNF827, EFHD1, and AKNA). We are limited by the lack of diversity in the UK Biobank and expect that studies in more diverse populations will result in the discovery of new loci and alleles.
Among the loci are many that had been previously identified as risk loci for liver disease, but had never been explicitly associated through GWAS of both ALT and AST, such as SERPINA1 (associated with alpha-1 antitrypsin deficiency35), HFE (homeostatic iron regulator, associated with hemochromatosis61), and TM6SF2 (transmembrane 6 superfamily member 2, associated with NAFLD62,63,64). The MTARC1 lead variant was discovered in a GWAS of cirrhosis, and then found to associate with lower ALT and AST39. Others are known to associate with risk of gallstones (ABCG8, ANPEP, and HNF1B)65,66 or increased GGT (EPHA2, CDH6, DLG5, CD276, DYNLRB2, and NEDD4L)14,18. Consistent with the fact that ALT and AST elevation can be caused by kidney or muscle damage, we detect an association with ANO5 (anoctamin 5), which has been implicated in several autosomal recessive muscular dystrophy syndromes67,68, and several loci associated with expression of genes in muscle or kidney but not liver (SHMT1, BRD3, DLG5, EYA1, IFT80, IL32, EIF2AK4, and SLC2A4). We expect only a subset of the loci from this screen to be directly causally implicated in hepatocellular damage; many may predispose to a condition where liver damage is secondary or where enzyme elevation originates in kidney or muscle, an important limitation of this approach.
The significant sex heterogeneity we observe at the ALT- and AST-associated loci HKDC1, SERPINA1, and FRK warrants further investigation and is consistent with a prior study that found significant genotype-by-sex interactions in the genetic architecture of circulating liver enzymes6. HKDC1 has been associated with glucose metabolism and, notably, this effect is specific to pregnancy69,70.
SLC30A10 had not been identified in prior GWAS of circulating liver enzymes. Because SLC30A10 Thr95Ile is so rare, it is not surprising that these scans were underpowered to detect its large effect, owing to insufficient study size, lack of inclusion on the genotyping arrays used, or lack of power to impute its genotype. For example, in the UK Household Longitudinal Study (N = 5458 and N = 5321, respectively) effects were reported that were consistent with a strong effect size but were not statistically significant (Supplementary Table 6).
Properties of SLC30A10 Thr95Ile
The variant with the strongest predicted effect on ALT and AST, SLC30A10 Thr95Ile (rs188273166), is a rare variant carried by 1117 of the 487,327 array-genotyped participants in the UK Biobank. While Thr95Ile is found in some individuals of non-European ancestry, it is at much higher frequency in European-ancestry populations, with carrier frequency in our sample by UK country of birth ranging from a minimum of 1 in 479 people born in Wales to a maximum of 1 in 276 people born in Scotland (Supplementary Table 4). The increased frequency we see in European-ancestry populations is not merely due to those populations’ overrepresentation in the UK Biobank, but is also consistent with global allele frequency data cataloged in dbSNP71.
The Thr95Ile variant occurs in the third of six transmembrane domains of the SLC30A10 protein72, the same domain affected by a previously reported loss-of-function variant causing HMNDYT1 (hypermanganesemia with dystonia 1), Leu89Pro (rs281860284)56 (Fig. 6). In vitro, Leu89Pro abolishes trafficking of SLC30A10 to the membrane56, and another study pointed to a functional role of polar or charged residues in the transmembrane domains of SLC30A10 for manganese transport function73. Bioinformatic analysis suggests that Thr95Ile should impact protein function.
Our site-directed mutagenesis experiment of SLC30A10 shows that Thr95Ile, unlike reported HMNDYT1-causing variants, results in a protein that is properly trafficked to the cell membrane. Further biochemical studies will be required to investigate whether the Thr95Ile variant of SLC30A10 has reduced manganese efflux activity, or otherwise affects SLC30A10 stability, translation, or transcription.
Comparison of SLC30A10 Thr95Ile phenotypes to HMNDYT1 phenotypes
SLC30A10 (also known as ZNT10, and initially identified through sequence homology to zinc transporters27) encodes a cation diffusion facilitator expressed in hepatocytes, the bile duct epithelium, enterocytes, and neurons32 that is essential for excretion of manganese from the liver into the bile and intestine28,32. Homozygous loss-of-function of SLC30A10 was recently identified as the cause of the rare disease HMNDYT1, which in addition to hypermanganesemia and dystonia is characterized by liver cirrhosis, polycythemia, and Mn deposition in the brain29,30,31,32,33,34,56. Other hallmarks include iron depletion and hyperbilirubinemia. Mendelian disorders of SLC30A10 and the other hepatic Mn transporter genes SLC39A8 (solute carrier family 39 member 8, causing congenital disorder of glycosylation type IIn)74 and SLC39A14 (solute carrier family 39 member 14, implicated in hypermanganesemia with dystonia 2)75, along with experiments in transgenic mice76,77, have confirmed the critical role of each of these genes in maintaining whole-body manganese homeostasis78. Notably, while all three of the genes have Mendelian syndromes with neurological manifestations, only SLC30A10 deficiency (HMNDYT1) is known to be associated with liver disease78.
We detect two key aspects of HMNDYT1 (increased circulating liver enzymes and increased hematocrit) exceeding phenome-wide significance in heterozygous carriers of SLC30A10 Thr95Ile. Among other hepatic phenotypes that have been reported in HMNDYT1 cases, we also detect an association with anemia, but no evidence of hyperbilirubinemia. The neurological aspect of HMNDYT1, parkinsonism and dystonia, is not detectably enriched among Thr95Ile carriers; however, we have limited power and cannot exclude an enrichment. It is therefore intriguing to consider that carrier status of Thr95Ile may represent a very mild manifestation of HMNDYT1.
The quantitative trait with the largest effect associated with SLC30A10 Thr95Ile is liver MRI cT1 (+1.2 SD; 95% CI, +0.5 to +2.0; p = 0.0032). Liver MRI cT1 has been recently explored as a non-invasive diagnostic of steatohepatitis and fibrosis79,80. However, MRI T1 signal has also been used to detect manganese deposition in the brain, and the extent to which hepatic manganese overload could confound the association of liver cT1 with liver damage is unclear81.
Comparison of Thr95Ile phenotypes to SLC30A10 common variant phenotypes
Apart from rare variants in SLC30A10 causing HMNDYT1, Thr95Ile can also be compared to common variants in SLC30A10 that have been associated with phenotypes by GWAS (Fig. 7). We find that the minor allele of Thr95Ile is in almost complete linkage with a common intronic variant associated with increased blood manganese. Other GWAS variants in almost perfect linkage with Thr95Ile associate with decreased MCH, increased RBC distribution width, decreased magnesium/calcium ratio, and increased heel bone mineral density (BMD). Decreased MCH could reflect the anemia experienced by HMNDYT1 patients, caused by the closely linked homeostatic regulation of manganese and iron28. Increased BMD may reflect the protective role of manganese in bone maintenance82,83. Looking for the subset of these phenotypes available in our scan of Thr95Ile, we do find a nominally significant increase in BMD but no detectable difference in MCH or erythrocyte distribution width. By contrast, we find that a common intronic variant in SLC30A10 recently reported to associate with liver MRI cT150 is in complete linkage with the major allele of Thr95Ile, suggesting an independent genetic mechanism but also providing independent evidence of the role of SLC30A10 variants in liver health and/or hepatic manganese content.
The linked GWAS variants may be interpreted through two mechanistic hypotheses: first, the associations may all be causally driven by Thr95Ile carriers in the studies, which the GWAS variants tag; alternatively, the associations may be driven by effects of the common variants themselves, which are noncoding but may influence SLC30A10 (or another gene in cis) by modulating expression or post-transcriptional regulation; or some combination of both. To distinguish between these, measurements of Mn would need to be available to perform conditional analyses. If the GWAS variants have an effect independent of Thr95Ile, SLC30A10 still seems likely (although not certain) to be the causal gene at the locus, due to the similarity in phenotypes to HMNDYT1 and Thr95Ile. A putative regulatory mechanism could be through transcriptional or post-transcriptional regulatory elements, as the haplotype includes a variant (rs2275707) overlapping both the 3’-UTR of SLC30A10 and regions of H3K4me1 histone modifications (characteristic of enhancers) active only in brain and liver84.
Clinical relevance: manganese homeostasis in health and disease
Manganese (Mn) is a trace element required in the diet for normal development and function, serving as a cofactor and regulator for many enzymes. However, it can be toxic in high doses; because Mn(II) and Mn(III) can mimic other cations such as iron, manganese can interfere with systemic iron homeostasis and disrupt other biochemical processes85,86; at the cellular level, it is cytotoxic and poisons the mitochondria by interfering with the electron transport chain enzymes that rely on Fe-S clusters87. The hallmark of occupational exposure through inhalation is neurotoxicity manifesting as parkinsonism and dystonia (manganism, or Mn intoxication)85,86. Neurotoxicity is an aspect of the Mendelian syndromes caused by loss of function of all three of the hepatic manganese transporters; interestingly, GWAS has also identified a common missense variant in SLC39A8 as a risk factor for schizophrenia and many other diseases88,89; altered function of glycosyltransferases due to manganese overload in the neurons is a proposed mechanism for neurological manifestations of this variant90. Because manganese is excreted through the liver into the bile, increased circulating manganese secondary to liver damage may be a contributing factor to the neurological manifestations of chronic acquired hepatocerebral degeneration (CAHD)91,92,93. However, liver toxicity is not a hallmark of environmental or occupational exposure. Importantly, of the Mendelian syndromes of genes encoding manganese transporters, only SLC30A10 (causing HMNDYT1) involves hepatic symptoms78,94. Hepatotoxicity in HMNDYT1 is thought to be due to cytotoxic manganese overload within hepatocytes; polycythemia is thought to be caused by upregulation of erythropoietin by manganese; and iron deficiency anemia is thought to arise through systemic dysregulation of iron homeostasis by excess manganese94,95.
Our results suggest that polymorphism in SLC30A10 is a risk factor for manganese-induced hepatocellular damage, polycythemia, and iron deficiency anemia in a much broader population beyond the rare recessive syndrome HMNDYT1.
The association of SLC30A10 Thr95Ile with extrahepatic bile duct cancer was unexpected, as this disease has not been described in conjunction with HMNDYT1. Bile duct cancer (cholangiocarcinoma) is a rare disease (age-adjusted incidence of 1–3 per 100,000 per year); cirrhosis, viral hepatitis, primary sclerosing cholangitis, and parasitic fluke infection have been identified as risk factors96,97. It is unclear whether low levels of manganese in the bile, or high levels of manganese in the hepatocytes and bile duct epithelial cholangiocytes, could be directly carcinogenic; manganese-dependent superoxide dismutase (MnSOD, or SOD2) is a tumor suppressor98. A simpler possibility is that cytotoxic manganese overload in hepatocytes and cholangiocytes causes localized inflammation that predisposes to cancer through similar mechanisms as other hepatobiliary risk factors. We do detect an association with cholangitis, but the effect of this association is weaker than the association with cholangiocarcinoma. To our knowledge, SLC30A10 Thr95Ile would be the strongest genetic cholangiocarcinoma risk factor identified to date, being carried by 5% of the extrahepatic bile duct cancer cases in the White British subset of the biobank. Because both SLC30A10 Thr95Ile and extrahepatic bile duct cancer are exceedingly rare, validation of this association in either another very large biobank or in a cohort of cholangiocarcinoma patients will be necessary.
Clinical relevance: genome interpretation
Currently, SLC30A10 Thr95Ile (rs188273166) is listed as a variant of uncertain significance in the ClinVar database99. While the appropriate clinical management of carriers of SLC30A10 Thr95Ile is unclear and would require further studies to determine whether monitoring of hepatobiliary function is warranted, evidence from HMNDYT1 patients has demonstrated that chelation therapy combined with iron supplementation is effective at reversing the symptoms of SLC30A10 insufficiency100. Further studies will be needed to define whether other damaging missense variants or protein-truncating variants in SLC30A10, including the variants known to cause HMNDYT1, also predispose to liver disease in their heterozygous state. Because we only observe one homozygous carrier of SLC30A10 Thr95Ile in our data, further study will also be needed to understand the inheritance model of this association; we cannot determine whether risk in homozygotes is stronger than risk in heterozygotes, unlike cases of HMNDYT1 where identified cases have all experienced homozygous loss-of-function mutation.
More broadly, the case of SLC30A10 fits a pattern of recent discoveries showing that recessive Mendelian disease symptoms can manifest in heterozygous carriers of deleterious variants, blurring the distinction between recessive and dominant disease genes and bridging the gap between common and rare disease genetics101,102. These discoveries are possible only by combining massive, biobank-scale genotype and phenotype datasets such as the UK Biobank.
Sub-population definition and PC calculation
Sub-populations for analysis were obtained through a combination of self-reported ethnicity and genetic principal components. First, the White British population was defined using the categorization performed previously by the UK Biobank (Field 22006 value “Caucasian”); briefly, this analysis selected the individuals who identify as White British (Field 21000); performed a series of subject-level QC steps (removing subjects with high heterozygosity or a missing rate over 5%, subjects with genetic and self-reported sex discrepancies or putative sex chromosome aneuploidies, and subjects with second or first degree relatives or an excess of third-degree relatives); performed Bayesian outlier detection using the R package aberrant103 to remove ancestry outliers using principal components (PCs) 1 + 2, 3 + 4, and 5 + 6 (calculated from the global UK Biobank PCs stored in Field 22009); and selected a subset of variants in preparation for PCA by limiting to directly genotyped variants (info = 1) with missingness across individuals <2% and MAF > 1%, excluding regions of known long-range LD, and pruning to independent markers with pairwise LD < 0.1. Based on this procedure used by the UK Biobank to define the “White British” subset, we defined three additional populations, using other self-reported ancestry groups as starting points (Field 21000 values “Asian or Asian British”, “Black or Black British”, and “Chinese”). Principal components were estimated in PLINK using the unrelated subjects in each subgroup. We then projected all subjects onto the PCs. For the majority of downstream analyses (calculation of per-variant allele frequency and missingness thresholds, calculation of LD, and association analyses performed in PLINK), only the unrelated subset of subjects in each subpopulation was used.
The exception was association analyses performed in SAIGE104, a generalized mixed model method that allows inclusion of related individuals; for SAIGE, related individuals were retained in the subpopulations.
For validation in an independent subpopulation of the UK Biobank, two other self-reported ethnicity groups with a sufficient number of SLC30A10 Thr95Ile carriers were assembled, who were not included in “White British” (Field 21000 values “White” subgroup “Irish”, and “White” subgroup “Any other white background” or no reported subgroup).
Array genotype data for association analysis
Data were obtained from the UK Biobank through application 26041. Genotypes were obtained through array typing and imputation as described previously. For genome-wide association analysis, variants were filtered so that imputation quality score (INFO) was greater than 0.8. Genotype missingness, Hardy-Weinberg equilibrium (HWE), and minor allele frequency (MAF) were then each calculated across the unrelated subset of individuals in each of the four sub-populations. For each sub-population a set of variants for GWAS was then defined by filtering missingness across individuals less than 2%, HWE p-value > 10⁻¹², and MAF > 0.1%.
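The variant-level filters above can be summarized in a few lines of code. The following is a minimal sketch under stated assumptions, not the pipeline used in this study: the `VariantStats` container and both function names are hypothetical, and the per-variant statistics are assumed to have been computed beforehand on the unrelated subset of one sub-population.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantStats:
    """Hypothetical per-variant summary for one sub-population."""
    variant_id: str
    info: float         # imputation quality score (INFO)
    missingness: float  # fraction of unrelated subjects with a missing genotype
    hwe_p: float        # Hardy-Weinberg equilibrium p-value
    maf: float          # minor allele frequency


# Thresholds as stated in the text.
INFO_MIN = 0.8
MISSINGNESS_MAX = 0.02
HWE_P_MIN = 1e-12
MAF_MIN = 0.001


def passes_gwas_filters(v: VariantStats) -> bool:
    """Return True if the variant satisfies all four GWAS inclusion filters."""
    return (
        v.info > INFO_MIN
        and v.missingness < MISSINGNESS_MAX
        and v.hwe_p > HWE_P_MIN
        and v.maf > MAF_MIN
    )


def select_gwas_variants(variants):
    """Return the IDs of variants passing the filters, preserving input order."""
    return [v.variant_id for v in variants if passes_gwas_filters(v)]
```

Because missingness, HWE, and MAF are computed separately in each sub-population, a function like `select_gwas_variants` would be applied once per sub-population, so the tested variant set can differ between groups.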
For genome-wide analysis, blood biochemistry values were obtained for ALT (Field 30620) and AST (Field 30650) and log10 transformed, consistent with previous genetic studies14,105, resulting in an approximately normal distribution.
For phenome-wide analysis, ICD10 codes were obtained from inpatient hospital diagnoses (Field 41270), causes of death (Field 40001 and 40002), the cancer registry (Field 40006), and general practitioner (GP) clinical event records (Field 42040). A selection of 135 quantitative traits was obtained from other fields (Supplementary Data 6), encompassing anthropometric measurements, blood and urine biochemistry, smoking, exercise, sleep behavior, and liver MRI; all were inverse rank normalized using the RNOmni R package106. All quantitative traits and cancer registry diagnoses were downloaded from the UK Biobank Data Showcase on March 17, 2020. The GP clinical events, inpatient diagnoses, and death registry were available in more detail or in more recent updates than was available through the Data Showcase and were downloaded as separate tables; data for GP clinical records were downloaded on September 30, 2019, data from the death registry were downloaded on June 12, 2020, and data from hospital diagnoses were downloaded on July 15, 2020.
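For readers unfamiliar with inverse rank normalization, the transformation can be sketched as follows. This is an illustrative reimplementation and not the RNOmni code itself; the Blom offset k = 0.375 and the averaging of tied ranks are assumptions about the settings used, and the function names are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions are 0-based, ranks 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def rank_inverse_normal(values, k=0.375):
    """Map each value to the normal quantile of its offset fractional rank."""
    n = len(values)
    std_normal = NormalDist()
    return [
        std_normal.inv_cdf((rank - k) / (n - 2 * k + 1))
        for rank in average_ranks(values)
    ]
```

The transformed trait depends only on the ordering of the measurements, which makes association statistics robust to skewed distributions and outliers at the cost of an effect size expressed in standard deviation units rather than in the original units.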
Genome-wide association studies of ALT and AST
Because of the high level of relatedness in the UK Biobank participants107, to maximize power by retaining related individuals we used the SAIGE software package104 to perform generalized mixed model analysis for GWAS. A genetic relatedness matrix (GRM) was calculated for each sub-population with a set of 100,000 LD-pruned variants selected from across the allele frequency spectrum. SAIGE was run on the filtered imputed variant set in each sub-population using the following covariates: age at recruitment, sex, BMI, and the first 12 principal components of genetic ancestry (learned within each sub-population as described above). Manhattan plots and Q-Q plots were created using the qqman R package108. To report p-values, the association results for each enzyme were meta-analyzed across the four populations using the METAL software package109 with the default approach (using p-value and direction of effect, weighted according to sample size). To report effect sizes and standard errors, because the authors of the SAIGE method advise that parameter estimation may be poor especially for rare variants110, the PLINK software package v1.90111 was run on lead variants on the unrelated subsets of each subpopulation, and then the classical approach (using effect size estimates and standard errors) was used in METAL to meta-analyze the resulting betas and standard errors. All PLINK and SAIGE association tests were performed using the REVEAL/SciDB translational analytics platform from Paradigm4.
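The two meta-analysis schemes referred to above differ in what they combine. The sketch below illustrates the standard formulas for a sample-size-weighted Z-score meta-analysis and for a fixed-effects inverse-variance meta-analysis; it is not a substitute for METAL, and the function names and input layouts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def sample_size_weighted_meta(studies):
    """Combine (two_sided_p, direction, n) tuples; direction is +1 or -1.

    Returns the combined Z-score and its two-sided p-value.
    """
    numerator = 0.0
    sum_sq_weights = 0.0
    for p, direction, n in studies:
        # Convert the two-sided p-value to a signed Z-score.
        z_i = -direction * _STD_NORMAL.inv_cdf(p / 2)
        w_i = sqrt(n)
        numerator += w_i * z_i
        sum_sq_weights += w_i ** 2
    z = numerator / sqrt(sum_sq_weights)
    return z, 2 * _STD_NORMAL.cdf(-abs(z))


def inverse_variance_meta(studies):
    """Combine (beta, standard_error) tuples; returns pooled (beta, se)."""
    weights = [1 / se ** 2 for _, se in studies]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, studies)) / total
    return beta, sqrt(1 / total)
```

Using the first scheme for significance and the second for effect sizes mirrors the division of labor described in the text: p-values come from the mixed-model results that retain related individuals, whereas betas and standard errors come from regression on unrelated subjects.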
Identifying independent, linked association signals between the two GWAS
Meta-analysis results for each enzyme were LD clumped using the PLINK software package, v1.90111 with an r² threshold of 0.2 and a distance limit of 10 megabases, to group the results into approximately independent signals. LD calculations were made using genotypes of the White British sub-population because of their predominance in the overall sample. Lead variants (the variants with the most significant p-values) from these “r² > 0.2 LD blocks” were then searched for proxies using an r² threshold of 0.8 and a distance limit of 250 kilobases, resulting in “r² > 0.8 LD blocks” defining potentially causal variants at each locus. The “r² > 0.8 LD blocks” for the ALT results were then compared to the “r² > 0.8 LD blocks” for the AST results, and any cases where these blocks shared at least one variant between the two GWAS were treated as potentially colocalized association signals between the two GWAS. In these cases, a representative index variant was chosen to represent the results of both GWAS by choosing the index variant of the GWAS with the more significant p-value. Next, these putative colocalized association signals were distance pruned by iteratively removing neighboring index variants within 500 kilobases of each index variant with less significant p-values (the minimum p-value between the two GWAS was used for the distance pruning procedure). The Manhattan plot of METAL results with labeled colocalization signals was created using the CMplot R package112.
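The block-overlap and distance-pruning steps can be illustrated with a short sketch. It assumes that LD clumping and proxy search have already been run, so that each signal carries its set of r² > 0.8 proxies; the `Signal` container and the function names are hypothetical, and the greedy ordering by p-value is one reasonable reading of the iterative pruning described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """Hypothetical record for one approximately independent GWAS signal."""
    index_variant: str
    chrom: str
    pos: int
    p_value: float
    proxies: frozenset  # index variant plus its r2 > 0.8 proxies


def colocalized_signals(alt_signals, ast_signals):
    """Pair ALT and AST signals whose r2 > 0.8 blocks share a variant.

    Each pair is represented by the signal with the smaller p-value.
    """
    shared = []
    for alt in alt_signals:
        for ast in ast_signals:
            if alt.proxies & ast.proxies:
                shared.append(alt if alt.p_value <= ast.p_value else ast)
    return shared


def distance_prune(signals, window=500_000):
    """Greedily keep the most significant signals at least `window` apart."""
    kept = []
    for s in sorted(signals, key=lambda s: s.p_value):
        if all(s.chrom != k.chrom or abs(s.pos - k.pos) > window for k in kept):
            kept.append(s)
    return kept
```

Requiring a shared variant between strongly linked blocks is a simple heuristic for colocalization; it does not estimate a posterior probability of a shared causal variant as formal colocalization methods do.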
Annotation of associated loci and variants
Index variants and their corresponding strongly-linked (r² > 0.8) variants were annotated using the following resources: distance to closest protein-coding genes as defined by ENSEMBL v98 using the BEDTools package113; impact on protein-coding genes using the ENSEMBL Variant Effect Predictor (VEP) software package114 with the LOFTEE plugin to additionally predict protein-truncating variants115; eQTLs (only the most significant eQTL per gene-tissue combination) from GTEx v8 (obtained from the GTEx Portal) for liver, kidney cortex, and skeletal muscle116; a published meta-analysis of four liver eQTL studies41; the eQTLGen meta-analysis of blood eQTL studies117; and GWAS results from the NHGRI-EBI GWAS Catalog (r2020-01-27)118, filtered to associations with p < 5 × 10⁻⁸.
Association of ALT- and AST-associated loci with liver disease
Index variants were tested for association with any liver disease using ICD10 codes K70-K77 in inpatient hospital diagnoses, causes of death, and GP clinical event records, using SAIGE, with the same covariates used for the liver enzymes (age, sex, and genetic PCs 1-12) plus a covariate for each of the following: whether the subject was recruited in Scotland, whether the subject was recruited in Wales, and whether the subject had GP clinical event records available. Association results were meta-analyzed across the four sub-populations using METAL using the default method (combining p-values) to obtain the final p-value. To obtain effect sizes and standard errors, the same procedure was performed but using PLINK (on the unrelated subset of each population) and using the classical method in METAL (combining effects and standard errors).
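As an illustration of the case definition, the following sketch flags a subject as a liver disease case when any recorded ICD10 code falls in the K70-K77 range. The function names are hypothetical, and the assumption that codes may appear with or without a decimal point (for example "K74.6" or "K746") is ours.

```python
def is_liver_disease_code(icd10_code: str) -> bool:
    """Return True if the code's three-character category is K70 to K77."""
    code = icd10_code.strip().upper().replace(".", "")
    if len(code) < 3 or code[0] != "K" or not code[1:3].isdecimal():
        return False
    return 70 <= int(code[1:3]) <= 77


def has_liver_disease(codes) -> bool:
    """Return True if any code pooled across record sources is K70 to K77."""
    return any(is_liver_disease_code(c) for c in codes)
```

In such a scheme the codes from hospital diagnoses, death records, and GP records would be pooled per subject before calling `has_liver_disease`, which is why a covariate for GP record availability is needed to account for differential ascertainment.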
Sequencing-based validation of rs188273166 array genotyping
Whole exome sequencing was available for 301,473 of the 487,327 array-genotyped samples. DNA was extracted from whole blood and was prepared and sequenced by the Regeneron Genetics Center (RGC). A complete protocol has been described elsewhere119. Briefly, the xGen exome capture was used and reads were sequenced using the Illumina NovaSeq 6000 platform. Reads were aligned to the GRCh38 reference genome using BWA-mem120. Duplicate reads were identified and excluded using the Picard MarkDuplicates tool (Broad Institute)121. Variant calling of SNVs and indels was done using the WeCall variant caller (Genomics Plc.)122 to produce a GVCF file for each subject (GVCF files are files in the VCF Variant Call Format that also record non-variant regions of the genome). GVCF files were combined using the GLnexus joint calling tool123. Post-variant calling filtering was applied as described previously119.
Replication of SLC30A10 Thr95Ile associations
ALT and AST association tests were repeated as described for the genome-wide scans, using SAIGE and PLINK, in the “Other White” and “White Irish” populations, for the SLC30A10 Thr95Ile (rs188273166) variant. In the two DiscovEHR Geisinger Health System (GHS) cohorts, association tests were performed using BOLT124 with covariates for age, age squared, age × sex, sex, and the first ten principal components of genetic ancestry; ALT, AST, and hematocrit values were taken from the median of lab values available. Results were meta-analyzed across the four populations. A forest plot was created using the forestplot package in R125.
Testing linkage of SLC30A10 Thr95Ile to common GWAS variants
To test linkage of SLC30A10 Thr95Ile (rs188273166) to common GWAS variants, the GWAS Catalog was searched for all results where “Mapped Gene” was assigned to SLC30A10; because of the very relevant phenotype, blood Mn-associated variant rs1776029, an association that is not in the GWAS Catalog, was also included in the analysis, as well as cT1-associated variant rs759359281. LD calculations were performed in PLINK, using the White British unrelated subpopulation, between rs188273166 and the GWAS variants with the options --r2 dprime-signed in-phase with-freqs --ld-window 1000000 --ld-window-r2 0. For rs1776029, an additional Fisher’s exact test was performed to determine the confidence interval of the enrichment of rs188273166 on the rs1776029 haplotype. The linked alleles from PLINK were then used in conjunction with the effect allele from the reported papers to determine the direction of effect. The GWAS Atlas website46 was used (the PheWAS tool) to determine the direction of effect for the linked alleles from the original paper; in cases where the original paper from the GWAS Catalog did not report a direction of effect, other papers for the same phenotype and variant from GWAS Atlas were used to determine the direction of effect and cited accordingly (Supplementary Data 5). Reference epigenome information for the GWAS variants was obtained by searching for rs1776029 in HaploReg v4.1126.
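Because Thr95Ile is rare and the GWAS variants are common, r² between them is necessarily small even when the rare allele sits entirely on one common haplotype, which is why the signed D′ is the more informative statistic here. The sketch below illustrates both quantities from a 2 × 2 table of haplotype counts. It assumes phased haplotype counts are available, whereas PLINK estimates haplotype frequencies from unphased genotypes; the function name and the counts in the usage note are hypothetical.

```python
def ld_from_haplotype_counts(n_AB, n_Ab, n_aB, n_ab):
    """Return (r2, signed D') for alleles A/a and B/b from haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B  # raw disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both variants must be polymorphic")
    r2 = d * d / denom
    # D' rescales D by its maximum attainable magnitude given the
    # allele frequencies; the sign of D is retained.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return r2, d / d_max
```

For example, `ld_from_haplotype_counts(10, 0, 2990, 7000)` describes a rare allele found only on haplotypes carrying a common allele of frequency 0.3; D′ is 1 while r² is well below 0.01.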
Phenome-wide association study of SLC30A10 Thr95Ile
A phenome-wide association study of SLC30A10 Thr95Ile (rs188273166) was performed by running SAIGE and PLINK against a set of ICD10 diagnoses and quantitative traits, obtained as described above, and using the covariates described above for the test of association with liver disease. ICD10 diagnoses were filtered to include only those at a three-character (category), four-character (category plus one additional numeral), or “block” level that were frequent enough to test in both subpopulations and without significant collinearity with the sex, GP availability, or country of recruitment covariates: at least 100 diagnoses overall, and at least one diagnosis in each of the following subgroups, to avoid collinearity with covariates while running SAIGE: the with- and without-GP data subgroups, men, women, and each of the three recruitment countries. This resulted in 4397 ICD10 codes to test, serving as the multiple hypothesis burden.
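The diagnosis-level inclusion criteria and the resulting multiple-testing burden can be sketched as follows. The function names and the subgroup labels are hypothetical, and the family-wise error rate of 0.05 used for the Bonferroni threshold is an assumption, as the text states only the number of tests.

```python
def is_testable(total_cases, cases_by_subgroup, min_total=100):
    """Apply the stated filters: enough cases overall, and at least one
    case in every covariate-defined subgroup to avoid collinearity."""
    return total_cases >= min_total and all(
        n >= 1 for n in cases_by_subgroup.values()
    )


def bonferroni_threshold(n_tests, alpha=0.05):
    """Return the per-test significance threshold for n_tests hypotheses."""
    return alpha / n_tests


# Hypothetical case counts for one diagnosis across the seven subgroups.
example_counts = {
    "with_gp": 60, "without_gp": 90, "men": 80, "women": 70,
    "england": 120, "scotland": 20, "wales": 10,
}
phenome_wide_threshold = bonferroni_threshold(4397)
```

With 4397 tested codes and the assumed alpha of 0.05, `phenome_wide_threshold` is approximately 1.1 × 10⁻⁵.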
Bioinformatic analysis of SLC30A10 Thr95Ile
To visualize Thr95Ile on the protein sequence of SLC30A10, UNIPROT entry Q6XR72 (ZN10_HUMAN) was accessed72. In UNIPROT, natural variants causing HMNDYT132,34,56 and mutagenesis results56,73,127 were collated from the literature and highlighted. CADD score v1.5128 was downloaded from the authors’ website. SIFT score was obtained from the authors’ website using the “dbSNP rsIDs (SIFT4G predictions)” tool129. PolyPhen score and multiple species alignment were obtained from the authors’ website using the PolyPhen-2 tool130.
Immunofluorescence of SLC30A10 localization in cultured cells
HeLa cells (ATCC®, Manassas, VA) were grown in Eagle’s minimum essential medium (ATCC®, Manassas, VA) containing 10% fetal bovine serum (Gibco, Carlsbad, CA) at 37 °C and 5% CO2. All plasmid transfections were performed using Lipofectamine™ 2000 (Invitrogen, Carlsbad, CA) and Opti-MEM (Gibco, Grand Island, NY) according to the manufacturer’s specifications. FLAG-tagged SLC30A10 plasmid constructs designed with a linker sequence in pCMV6-AN-3DDK (Blue Heron Biotech, Bothell, WA) included wild type, del105-107, L89P, and T95I; an empty vector was used as one of the negative controls.
HeLa cells were grown on 8-chambered slides for 48 h post-transfection. IF procedures were performed at room temperature unless otherwise noted. HeLa cells were rinsed in dPBS (Gibco, Grand Island, NY), fixed with 4% paraformaldehyde (in water) (Electron Microscopy Sciences, Hatfield, PA) for 10 min, rinsed in 4 °C PBS (Invitrogen, Vilnius, LT), and permeabilized for 5 min with 0.1% Triton X-100 (Sigma-Aldrich, St. Louis, MO). After rinsing in PBS and blocking in 2% BSA (in PBS) (Jackson ImmunoResearch, West Grove, PA) for 30 min, the cells were stained with 2% BSA blocking solution containing monoclonal ANTI-FLAG® M2-FITC, Clone M2 (dilution 1:100; Sigma-Aldrich, St. Louis, MO) and Calnexin Monoclonal Antibody, Clone AF18 (dilution 1:100; Invitrogen, Carlsbad, CA). After three final washes in dPBS, mounting medium with DAPI (Vector Laboratories, Burlingame, CA) was added and sealed under a coverslip with nail polish. Images were captured with the REVOLVE Echo microscope at 20X magnification.
Ethics oversight for the UK Biobank is provided by an Ethics and Governance Council which obtained informed consent from all participants for health-related research. All research described was performed within the framework of Application 26041.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Complete summary statistics from the genome-wide association studies of ALT, AST, and extrahepatic bile duct cancer have been submitted to the NHGRI-EBI GWAS catalog [https://www.ebi.ac.uk/gwas/] with accession numbers GCST90013663, GCST90013664, and GCST90013662, respectively. Source data are provided with this paper. Individual-level genetic and phenotypic data from the UK Biobank are available to qualified researchers upon application [http://ukbiobank.ac.uk]. Individual-level genetic and phenotypic data from DiscovEHR are not available to outside researchers due to privacy restrictions.
Asrani, S. K., Devarbhavi, H., Eaton, J. & Kamath, P. S. Burden of liver diseases in the world. J. Hepatol. 70, 151–171 (2019).
Younossi, Z. M. et al. Changes in the prevalence of the most common causes of chronic liver diseases in the United States from 1988 to 2008. Clin. Gastroenterol. Hepatol. 9, 524–530.e1; quiz e60. (2011).
Plenge, R. M., Scolnick, E. M. & Altshuler, D. Validating therapeutic targets through human genetics. Nat. Rev. Drug Discov. 12, 581–594 (2013).
Stevens, J. L. & Baker, T. K. The future of drug safety testing: expanding the view and narrowing the focus. Drug Discov. Today 14, 162–167 (2009).
Deaton, A. M. et al. Rationalizing secondary pharmacology screening using human genetic and pharmacological evidence. Toxicol. Sci. 167, 593–603 (2019).
van Beek, J. H. et al. The genetic architecture of liver enzyme levels: GGT, ALT and AST. Behav. Genet. 43, 329–339 (2013).
Pratt, D. S. & Kaplan, M. M. Evaluation of abnormal liver-enzyme results in asymptomatic patients. N. Engl. J. Med. 342, 1266–1271 (2000).
Rahmioglu, N. et al. Epidemiology and genetic epidemiology of the liver function test proteins. PLoS ONE 4, e4435 (2009).
Pilia, G. et al. Heritability of cardiovascular and personality traits in 6,148 Sardinians. PLoS Genet. 2, e132 (2006).
Makkonen, J., Pietilainen, K. H., Rissanen, A., Kaprio, J. & Yki-Jarvinen, H. Genetic factors contribute to variation in serum alanine aminotransferase activity independent of obesity and alcohol: a study in monozygotic and dizygotic twins. J. Hepatol. 50, 1035–1042 (2009).
Nilsson, S. E., Read, S., Berg, S. & Johansson, B. Heritabilities for fifteen routine biochemical values: findings in 215 Swedish twin pairs 82 years of age or older. Scand. J. Clin. Lab. Invest. 69, 562–569 (2009).
Bathum, L. et al. Evidence for a substantial genetic influence on biochemical liver function tests: results from a population-based Danish twin study. Clin. Chem. 47, 81–87 (2001).
Targher, G. Elevated serum gamma-glutamyltransferase activity is associated with increased risk of mortality, incident type 2 diabetes, cardiovascular events, chronic kidney disease and cancer—a narrative review. Clin. Chem. Lab. Med. 48, 147–157 (2010).
Chambers, J. C. et al. Genome-wide association study identifies loci influencing concentrations of liver enzymes in plasma. Nat. Genet. 43, 1131–1138 (2011).
Young, K. A. et al. Genome-wide association study identifies loci for liver enzyme concentrations in Mexican Americans: The GUARDIAN Consortium. Obes. (Silver Spring) 27, 1331–1337 (2019).
Park, T. J. et al. Genome-wide association study of liver enzymes in Korean children. Genomics Inf. 11, 149–154 (2013).
Moon, S. et al. The Korea Biobank Array: design and identification of coding variants associated with blood biochemical traits. Sci. Rep. 9, 1382 (2019).
Kanai, M. et al. Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. Nat. Genet. 50, 390–400 (2018).
Kamatani, Y. et al. Genome-wide association study of hematological and biochemical traits in a Japanese population. Nat. Genet. 42, 210–215 (2010).
Kim, Y. J. et al. Large-scale genome-wide association studies in East Asians identify new genetic loci influencing metabolic traits. Nat. Genet. 43, 990–995 (2011).
Prins, B. P. et al. Genome-wide analysis of health-related biomarkers in the UK Household Longitudinal Study reveals novel associations. Sci. Rep. 7, 11008 (2017).
Namjou, B. et al. GWAS and enrichment analyses of non-alcoholic fatty liver disease identify new trait-associated genes and pathways across eMERGE Network. BMC Med. 17, 135 (2019).
Gurdasani, D. et al. Uganda genome resource enables insights into population history and genomic discovery in Africa. Cell 179, 984–1002.e1036 (2019).
Gilly, A. et al. Very low-depth whole-genome sequencing in complex trait association studies. Bioinformatics 35, 2555–2561 (2019).
Romeo, S. et al. Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver disease. Nat. Genet. 40, 1461–1465 (2008).
Abul-Husn, N. S. et al. A Protein-Truncating HSD17B13 Variant and Protection from Chronic Liver Disease. N. Engl. J. Med. 378, 1096–1106 (2018).
Seve, M., Chimienti, F., Devergnas, S. & Favier, A. In silico identification and expression of SLC30 family genes: an expressed sequence tag data mining strategy for the characterization of zinc transporters’ tissue expression. BMC Genomics 5, 32 (2004).
Tuschl, K. et al. Syndrome of hepatic cirrhosis, dystonia, polycythemia, and hypermanganesemia caused by mutations in SLC30A10, a manganese transporter in man. Am. J. Hum. Genet. 90, 457–466 (2012).
Brna, P., Gordon, K., Dooley, J. M. & Price, V. Manganese toxicity in a child with iron deficiency and polycythemia. J. Child Neurol. 26, 891–894 (2011).
Gospe, S. M. Jr. et al. Paraparesis, hypermanganesaemia, and polycythaemia: a novel presentation of cirrhosis. Arch. Dis. Child. 83, 439–442 (2000).
Lechpammer, M. et al. Pathology of inherited manganese transporter deficiency. Ann. Neurol. 75, 608–612 (2014).
Quadri, M. et al. Mutations in SLC30A10 cause parkinsonism and dystonia with hypermanganesemia, polycythemia, and chronic liver disease. Am. J. Hum. Genet. 90, 467–477 (2012).
Sahni, V. et al. Case report: a metabolic disorder presenting as pediatric manganism. Environ. Health Perspect. 115, 1776–1779 (2007).
Tuschl, K. et al. Hepatic cirrhosis, dystonia, polycythaemia and hypermanganesaemia—a new metabolic disorder. J. Inherit. Metab. Dis. 31, 151–163 (2008).
Brantly, M., Nukiwa, T. & Crystal, R. G. Molecular basis of alpha-1-antitrypsin deficiency. Am. J. Med. 84, 13–31 (1988).
Weedon, M. N. et al. Use of SNP chips to detect rare pathogenic variants: retrospective, population based diagnostic evaluation. BMJ 372, n214 (2021).
Dewey, F. E. et al. Distribution and clinical impact of functional variants in 50,726 whole-exome sequences from the DiscovEHR study. Science 354, aaf6814 (2016).
Accelerating Medicines Partnership. Common Metabolic Diseases Knowledge Portal, http://hugeamp.org/ (accessed December 2020).
Emdin, C. A. et al. A missense variant in Mitochondrial Amidoxime Reducing Component 1 gene and protection against liver disease. PLoS Genet. 16, e1008629 (2020).
Speliotes, E. K. et al. Genome-wide association analysis identifies variants associated with nonalcoholic fatty liver disease that have distinct effects on metabolic traits. PLoS Genet. 7, e1001324 (2011).
Strunz, T. et al. A mega-analysis of expression quantitative trait loci (eQTL) provides insight into the regulatory architecture of gene expression variation in liver. Sci. Rep. 8, 5865 (2018).
Ng, E. et al. Genome-wide association study of toxic metals and trace elements reveals novel associations. Hum. Mol. Genet. 24, 4739–4745 (2015).
Corre, T. et al. Common variants in CLDN14 are associated with differential excretion of magnesium over calcium in urine. Pflug. Arch. 469, 91–103 (2017).
Kichaev, G. et al. Leveraging polygenic functional enrichment to improve GWAS power. Am. J. Hum. Genet. 104, 65–75 (2019).
Astle, W. J. et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415–1429.e1419 (2016).
Tian, D. et al. GWAS Atlas: a curated resource of genome-wide variant-trait associations in plants and animals. Nucleic Acids Res. 48, D927–D932 (2020).
Kim, S. K. Identification of 613 new loci associated with heel bone mineral density and a polygenic risk score for bone mineral density, osteoporosis and fracture. PLoS ONE 13, e0200785 (2018).
Morris, J. A. et al. An atlas of genetic influences on osteoporosis in humans and mice. Nat. Genet. 51, 258–266 (2019).
Kemp, J. P. et al. Identification of 153 new loci associated with heel bone mineral density and functional involvement of GPC6 in osteoporosis. Nat. Genet. 49, 1468–1475 (2017).
Parisinos, C. A. et al. Genome-wide and Mendelian randomisation studies of liver MRI yield insights into the pathogenesis of steatohepatitis. J. Hepatol. https://doi.org/10.1016/j.jhep.2020.03.032 (2020).
Trieb, M. et al. Liver disease alters high-density lipoprotein composition, metabolism and function. Biochim. Biophys. Acta 1861, 630–638 (2016).
Aguet, F. et al. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Ramachandran, P. et al. Resolving the fibrotic niche of human liver cirrhosis at single-cell level. Nature 575, 512–518 (2019).
MacParland, S. A. et al. Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations. Nat. Commun. 9, 4383 (2018).
Aizarani, N. et al. A human liver cell atlas reveals heterogeneity and epithelial progenitors. Nature 572, 199–204 (2019).
Leyva-Illades, D. et al. SLC30A10 is a cell surface-localized manganese efflux transporter, and parkinsonism-causing mutations block its intracellular trafficking and efflux activity. J. Neurosci. 34, 14079–14095 (2014).
Yuan, X. et al. Population-based genome-wide association studies reveal six loci influencing plasma levels of liver enzymes. Am. J. Hum. Genet. 83, 520–528 (2008).
Liu, Y. et al. Genome-wide study links PNPLA3 variant with elevated hepatic transaminase after acute lymphoblastic leukemia therapy. Clin. Pharm. Ther. 102, 131–140 (2017).
Whitfield, J. B. et al. Biomarker and genomic risk factors for liver function test abnormality in hazardous drinkers. Alcohol Clin. Exp. Res. 43, 473–482 (2019).
Xu, C. F. et al. HLA-B*57:01 Confers susceptibility to pazopanib-associated liver injury in patients with cancer. Clin. Cancer Res. 22, 1371–1377 (2016).
Feder, J. N. et al. A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis. Nat. Genet. 13, 399–408 (1996).
Kozlitina, J. et al. Exome-wide association study identifies a TM6SF2 variant that confers susceptibility to nonalcoholic fatty liver disease. Nat. Genet. 46, 352–356 (2014).
Holmen, O. L. et al. Systematic evaluation of coding variation identifies a candidate causal variant in TM6SF2 influencing total cholesterol and myocardial infarction risk. Nat. Genet. 46, 345–351 (2014).
Liu, Y. L. et al. TM6SF2 rs58542926 influences hepatic fibrosis progression in patients with non-alcoholic fatty liver disease. Nat. Commun. 5, 4309 (2014).
Buch, S. et al. A genome-wide association scan identifies the hepatic cholesterol transporter ABCG8 as a susceptibility factor for human gallstone disease. Nat. Genet. 39, 995–999 (2007).
Ferkingstad, E. et al. Genome-wide association meta-analysis yields 20 loci associated with gallstone disease. Nat. Commun. 9, 5101 (2018).
Tsutsumi, S. et al. The novel gene encoding a putative transmembrane protein is mutated in gnathodiaphyseal dysplasia (GDD). Am. J. Hum. Genet. 74, 1255–1261 (2004).
Penttila, S. et al. Eight new mutations and the expanding phenotype variability in muscular dystrophy caused by ANO5. Neurology 78, 897–903 (2012).
Hayes, M. G. et al. Identification of HKDC1 and BACE2 as genes influencing glycemic traits during pregnancy through genome-wide association studies. Diabetes 62, 3282–3291 (2013).
Guo, C. et al. Coordinated regulatory variation associated with gestational hyperglycaemia regulates expression of the novel hexokinase HKDC1. Nat. Commun. 6, 6069 (2015).
Sherry, S. T., Ward, M. & Sirotkin, K. dbSNP-database for single nucleotide polymorphisms and other classes of minor genetic variation. Genome Res. 9, 677–679 (1999).
Arnold, L. M., Hirsch, I., Sanders, P., Ellis, A. & Hughes, B. Safety and efficacy of esreboxetine in patients with fibromyalgia: a fourteen-week, randomized, double-blind, placebo-controlled, multicenter clinical trial. Arthritis Rheum. 64, 2387–2397 (2012).
Zogzas, C. E., Aschner, M. & Mukhopadhyay, S. Structural elements in the transmembrane and cytoplasmic domains of the metal transporter SLC30A10 are required for its manganese efflux activity. J. Biol. Chem. 291, 15940–15957 (2016).
Park, J. H. et al. SLC39A8 Deficiency: a disorder of manganese transport and glycosylation. Am. J. Hum. Genet. 97, 894–903 (2015).
Tuschl, K. et al. Mutations in SLC39A14 disrupt manganese homeostasis and cause childhood-onset parkinsonism-dystonia. Nat. Commun. 7, 11601 (2016).
Scheiber, I. F., Wu, Y., Morgan, S. E. & Zhao, N. The intestinal metal transporter ZIP14 maintains systemic manganese homeostasis. J. Biol. Chem. 294, 9147–9160 (2019).
Mercadante, C. J. et al. Manganese transporter Slc30a10 controls physiological manganese excretion and toxicity. J. Clin. Invest. 129, 5442–5461 (2019).
Katz, N. & Rader, D. J. Manganese homeostasis: from rare single-gene disorders to complex phenotypes and diseases. J. Clin. Invest. 129, 5082–5085 (2019).
Pavlides, M. et al. Multiparametric magnetic resonance imaging for the assessment of non-alcoholic fatty liver disease severity. Liver Int. 37, 1065–1073 (2017).
Pavlides, M. et al. Multiparametric magnetic resonance imaging predicts clinical outcomes in patients with chronic liver disease. J. Hepatol. 64, 308–315 (2016).
Kim, Y. High signal intensities on T1-weighted MRI as a biomarker of exposure to manganese. Ind. Health 42, 111–115 (2004).
Bae, Y. J. & Kim, M. H. Manganese supplementation improves mineral density of the spine and femur and serum osteocalcin in rats. Biol. Trace Elem. Res. 124, 28–34 (2008).
Strause, L. G., Hegenauer, J., Saltman, P., Cone, R. & Resnick, D. Effects of long-term dietary manganese and copper deficiency on rat skeleton. J. Nutr. 116, 135–141 (1986).
Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Crossgrove, J. & Zheng, W. Manganese toxicity upon overexposure. NMR Biomed. 17, 544–553 (2004).
O’Neal, S. L. & Zheng, W. Manganese toxicity upon overexposure: a decade in review. Curr. Environ. Health Rep. 2, 315–328 (2015).
Chen, J. Y., Tsao, G. C., Zhao, Q. & Zheng, W. Differential cytotoxicity of Mn(II) and Mn(III): special reference to mitochondrial [Fe-S] containing enzymes. Toxicol. Appl. Pharmacol. 175, 160–168 (2001).
Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
Costas, J. The highly pleiotropic gene SLC39A8 as an opportunity to gain insight into the molecular pathogenesis of schizophrenia. Am. J. Med. Genet. B Neuropsychiatr. Genet. 177, 274–283 (2018).
Mealer, R. G. et al. The schizophrenia risk locus in SLC39A8 alters brain metal transport and plasma glycosylation. Sci. Rep. 10, 13162 (2020).
Krieger, D. et al. Manganese and chronic hepatic encephalopathy. Lancet 346, 270–274 (1995).
Rajoriya, N., Brahmania, M. & Feld, J. J. Implications of manganese in chronic acquired hepatocerebral degeneration. Ann. Hepatol. 18, 274–278 (2019).
Burkhard, P. R., Delavelle, J., Du Pasquier, R. & Spahr, L. Chronic parkinsonism associated with cirrhosis: a distinct subset of acquired hepatocerebral degeneration. Arch. Neurol. 60, 521–528 (2003).
Anagianni, S. & Tuschl, K. Genetic disorders of manganese metabolism. Curr. Neurol. Neurosci. Rep. 19, 33 (2019).
Ebert, B. L. & Bunn, H. F. Regulation of the erythropoietin gene. Blood 94, 1864–1877 (1999).
Tyson, G. L. & El-Serag, H. B. Risk factors for cholangiocarcinoma. Hepatology 54, 173–184 (2011).
Razumilava, N. & Gores, G. J. Cholangiocarcinoma. Lancet 383, 2168–2179 (2014).
Kim, A. Modulation of MnSOD in cancer: epidemiological and experimental evidence. Toxicol. Res. 26, 83–93 (2010).
Landrum, M. J. et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42, D980–D985 (2014).
Stamelou, M. & Bhatia, K. P. A new treatable genetic disorder of manganese metabolism causing dystonia-parkinsonism and cirrhosis: the “new” Wilson’s disease? Mov. Disord. 27, 962 (2012).
Bastarache, L. et al. Phenotype risk scores identify patients with unrecognized Mendelian disease patterns. Science 359, 1233–1239 (2018).
Hou, Y.-C. C. et al. Precision medicine advancements using whole genome sequencing, noninvasive whole body imaging, and functional diagnostics. bioRxiv, 497560. Preprint at https://doi.org/10.1101/497560 (2018).
Bellenguez, C. et al. A robust clustering algorithm for identifying problematic samples in genome-wide association studies. Bioinformatics 28, 134–135 (2012).
Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
Nioi, P. et al. Variant ASGR1 associated with a reduced risk of coronary artery disease. N. Engl. J. Med. 374, 2131–2141 (2016).
McCaw, Z. R., Lane, J. M., Saxena, R., Redline, S. & Lin, X. Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies. Biometrics 76, 1262–1272 (2020).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Turner, S. D. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. Journal of Open Source Software 3, 731 (2018).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Zhou, W. (2018) https://github.com/weizhouUMICH/SAIGE/issues/43.
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Yin, L. A high-quality drawing tool designed for Manhattan plot of genomic analysis, https://github.com/YinLiLin/R-CMplot (2018).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
GTEx Consortium et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
Võsa, U. et al. Unraveling the polygenic architecture of complex traits using blood eQTL metaanalysis. bioRxiv, 447367. Preprint at https://doi.org/10.1101/447367 (2018).
Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
Van Hout, C. V. et al. Exome sequencing and characterization of coding variation in 49,960 individuals in the UK Biobank. Nature 586, 749–756 (2020).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Lin, M. F. et al. GLnexus: joint variant calling for large cohort sequencing. bioRxiv, 343970. Preprint at https://doi.org/10.1101/343970 (2018).
Loh, P. R., Kichaev, G., Gazal, S., Schoech, A. P. & Price, A. L. Mixed-model association for biobank-scale datasets. Nat. Genet. 50, 906–908 (2018).
Gordon, M. & Lumley, T. forestplot: Advanced Forest Plot Using ‘grid’ Graphics. R package version 1 (2015).
Ward, L. D. & Kellis, M. HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res. 44, D877–D881 (2016).
Zhao, Y., Feresin, R. G., Falcon-Perez, J. M. & Salazar, G. Differential targeting of SLC30A10/ZnT10 heterodimers to endolysosomal compartments modulates EGF-induced MEK/ERK1/2 activity. Traffic 17, 267–288 (2016).
Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
Vaser, R., Adusumalli, S., Leng, S. N., Sikic, M. & Ng, P. C. SIFT missense predictions for genomes. Nat. Protoc. 11, 1–9 (2016).
Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
This research has been conducted using the UK Biobank resource, application number 26041. We thank the UK Biobank participants for their donations to this resource.
A.D., A.F.C., S.T., L.W., M.P., P.N., C.Q., H.C.T., G.H., and P.H. are employees of Alnylam Pharmaceuticals, Inc. L.L., N.V., M.F., and A.B. are employees of Regeneron Pharmaceuticals, Inc.
Peer review information: Nature Communications thanks Constantinos Parisinos, Stefan Schreiber, and the other, anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ward, L.D., Tu, HC., Quenneville, C.B. et al. GWAS of serum ALT and AST reveals an association of SLC30A10 Thr95Ile with hypermanganesemia symptoms. Nat Commun 12, 4571 (2021). https://doi.org/10.1038/s41467-021-24563-1