Liver cirrhosis is the 12th leading cause of death in the United States, with a total of 28,175 deaths reported in 2005 (ref. 1). Nearly half of these were classified as alcohol related. Individuals of Hispanic origin are disproportionately affected by chronic liver disease2. In 2007, cirrhosis was the fourth leading cause of death in Mexico and the second leading cause in adults aged 15–64. Only 10–15% of alcoholics develop cirrhosis, and although patterns of alcohol consumption are clearly important, they do not appear to fully account for the ethnic differences in cirrhosis incidence rates3.

Recently, Romeo et al.4 carried out a GWAS of nonsynonymous sequence variations in a population comprising Hispanics, African Americans and European Americans to identify genetic variants contributing to nonalcoholic fatty liver disease. They found strong evidence of association of an allele in PNPLA3 (rs738409: C>G, NP_079501.2:p.I148M) with increased hepatic fat levels (P = 5.9 × 10^−10), and the association remained highly significant (P = 7.0 × 10^−14) after adjustment for body mass index, diabetes status, ethanol use and genetic ancestry. The study also revealed a significant elevation in serum concentrations of alanine aminotransferase (ALT) in association with the rs738409[G] allele in Hispanics (P = 1.3 × 10^−5). Resequencing the region revealed another allele (rs6006460: G>T, NP_079501.2:p.S453I) that was independently associated with lower hepatic fat content in African Americans, the group at lowest risk of nonalcoholic fatty liver disease. An independent GWAS5 aiming to find genes influencing plasma levels of liver enzymes also showed evidence that variants in PNPLA3 were associated with plasma levels of ALT. This study gave additional support to the hypothesis that rs738409[G] may confer increased susceptibility to hepatic injury. These associations have recently been confirmed in additional cohorts6,7.

We set out to assess whether variation in PNPLA3 was also associated with clinically evident liver disease in Mestizo (mixed European and Native American ancestry) individuals from Mexico City with a history of substantial alcohol abuse. In addition to the reported nonsynonymous variants rs738409 and rs6006460, we assayed 15 common tagging SNPs from the PNPLA3 region, 291 SNPs for assessing global ancestry8, 16 ancestry-informative markers flanking the PNPLA3 region for assessing local ancestry9 (Supplementary Table 1) and 7 SNPs previously reported to be associated with cirrhosis in individuals with hepatitis C (ref. 10). We successfully genotyped 305 individuals with a history of alcohol dependence but apparently normal liver function (controls), 434 with intermediate alcoholic liver disease (ALD) and 482 with clinically evident alcoholic cirrhosis. Diagnoses were based on biochemical and clinical assessments supported by imaging, without histological confirmation (Supplementary Fig. 1 and Supplementary Methods).

The clinical characteristics of our samples are shown in Table 1. The control, ALD and cirrhosis diagnosis groups had significant differences (all P < 10^−10) in mean age, mean current alcohol intake and duration of alcohol consumption, and in mean global and local loadings from a principal component analysis (PCA) of the genotyping data (Supplementary Figs. 2, 3 and 4). Both the local and global mean individual proportions of Native American ancestry were significantly higher among the individuals with cirrhosis than the control individuals (P < 2.2 × 10^−16), consistent with the higher prevalence of cirrhosis in Hispanics compared to individuals of European or African ancestry2.

Table 1 Clinical characteristics of the genotyped subjects

We tested each SNP for association with clinically evident cirrhosis or ALD using likelihood ratio tests from a logistic regression analysis, adjusted for age, alcohol intake and duration, an interaction between age and duration, and global genetic ancestry estimated using PCA (Supplementary Table 2). The association analysis was carried out for three pairwise combinations of diagnosis status: cirrhosis versus control, cirrhosis versus ALD, and ALD versus control. We also performed tests controlling for local ancestry along the 9.4-Mb region flanking rs738409 estimated from PCA using the 16 ancestry-informative markers. SNP rs738409 was strongly associated with alcoholic cirrhosis (Table 2).
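The likelihood ratio test described above can be illustrated with a minimal sketch. It is not the study's analysis code: it fits a logistic model with only an intercept and an additive genotype term (the covariates and ancestry adjustment are omitted for brevity), and the function names and toy genotype counts are assumptions made up for illustration.

```python
import math

def fit_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the genotype term
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood, using a numerically stable log(1 + exp(eta))."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = b0 + b1 * x
        total += y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def likelihood_ratio_test(xs, ys):
    """Compare the genotype model with an intercept-only model (1 d.f.)."""
    p_null = sum(ys) / len(ys)
    b0_null = math.log(p_null / (1.0 - p_null))  # closed-form null MLE
    ll_null = log_likelihood(b0_null, 0.0, xs, ys)
    b0, b1 = fit_logistic(xs, ys)
    ll_full = log_likelihood(b0, b1, xs, ys)
    stat = 2.0 * (ll_full - ll_null)
    # Upper tail of a chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value, math.exp(b1)

# Toy data (hypothetical, not from the study): counts of individuals carrying
# 0, 1 or 2 copies of the risk allele, chosen so the per-allele odds ratio is 2.
control_counts = [40, 40, 20]
case_counts = [20, 40, 40]
toy_genotypes = []
toy_status = []
for status, counts in ((0, control_counts), (1, case_counts)):
    for copies, n in enumerate(counts):
        toy_genotypes += [copies] * n
        toy_status += [status] * n

stat, p_value, odds_ratio = likelihood_ratio_test(toy_genotypes, toy_status)
print(f"LRT statistic = {stat:.2f}, P = {p_value:.2g}, per-allele OR = {odds_ratio:.2f}")
```

In a covariate-adjusted analysis the same comparison applies, with the null model containing all covariates and the full model adding the genotype term.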

Table 2 Association test results for rs738409

In tests of cirrhosis versus control status, rs738409 showed strong association before and after controlling for global ancestry (P = 1.7 × 10^−10 and P = 1.9 × 10^−5, respectively) as well as after controlling for local ancestry (P = 4.7 × 10^−5). Test results for cirrhosis versus ALD, and for ALD versus control status, suggest that rs738409 has an intermediate effect on the ALD phenotype. We used the Akaike information criterion to compare four genetic models (2-d.f. general model and 1-d.f. additive, dominant and recessive risk-allele models) using logistic regression adjusting for covariates and the global individual ancestry. The most parsimonious model was an additive model for rs738409[G] (Supplementary Table 3). Logistic regression analysis suggested that the rs738409 sequence variation accounts for 49% of the observed ancestry-related difference in cirrhosis susceptibility. Further tests showed no association or interaction of rs738409 with other covariates, including age, alcohol intake and duration of intake. Matched analyses gave very similar estimates of the genotype effect of rs738409 (Supplementary Methods).
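As a rough illustration of the four genetic models compared above, the sketch below shows how a count of risk alleles can be coded under each model and how the Akaike information criterion (AIC = 2k − 2 ln L) ranks fitted models. The function names are assumptions, and the log-likelihood values are placeholders invented for the example, not results from this study.

```python
def encode_genotype(risk_allele_count, model):
    """Return the regression term(s) for 0, 1 or 2 copies of the risk allele."""
    g = risk_allele_count
    if g not in (0, 1, 2):
        raise ValueError("risk_allele_count must be 0, 1 or 2")
    if model == "additive":    # 1 d.f.: effect scales with allele count
        return [float(g)]
    if model == "dominant":    # 1 d.f.: one copy is enough
        return [1.0 if g >= 1 else 0.0]
    if model == "recessive":   # 1 d.f.: two copies are required
        return [1.0 if g == 2 else 0.0]
    if model == "general":     # 2 d.f.: separate heterozygote and homozygote terms
        return [1.0 if g == 1 else 0.0, 1.0 if g == 2 else 0.0]
    raise ValueError(f"unknown model: {model}")

def aic(log_likelihood, n_parameters):
    """Akaike information criterion; smaller values indicate a better trade-off."""
    return 2.0 * n_parameters - 2.0 * log_likelihood

def most_parsimonious(fits):
    """fits maps model name -> (maximized log-likelihood, number of parameters)."""
    return min(fits, key=lambda name: aic(*fits[name]))

# Placeholder fits (hypothetical values). Parameter counts here include the
# intercept plus the genotype term(s); shared covariates would add the same
# number of parameters to every model and so would not change the ranking.
example_fits = {
    "general": (-500.0, 3),
    "additive": (-500.2, 2),
    "dominant": (-503.0, 2),
    "recessive": (-504.5, 2),
}
best_model = most_parsimonious(example_fits)
print(best_model)
```

With these placeholder values the general model fits only marginally better than the additive model, so the extra degree of freedom is not justified and the additive model has the lowest AIC.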

Association test results for the tagging SNPs were generally consistent with their extent of linkage disequilibrium (LD) with rs738409 (Supplementary Fig. 5 and Table 2). SNP rs738408 is 3 base pairs away from rs738409 and these two SNPs are in nearly complete LD (r^2 = 0.99). Association tests for rs738408 gave results nearly identical to those for rs738409. When rs738409 was treated as causal by including its genotypes as a covariate in the regression model, the additional associations in the PNPLA3 region were eliminated. All common haplotypes containing the rs738409[G] allele were more common among individuals with cirrhosis than in the control group (Supplementary Table 4). The rare variant rs6006460[T] reported by Romeo et al.4 is also rare in our Mestizo population (minor allele frequency = 0.002), so we had no power to detect an effect of this variant. The minor T allele of rs6006460 was observed in both the cirrhosis and control groups. We did not observe significant associations (Supplementary Table 2) with any of the markers previously reported to be associated with cirrhosis in individuals with hepatitis C (ref. 10).
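For readers unfamiliar with the LD measure quoted above, the following sketch computes r^2 between two biallelic SNPs from haplotype and allele frequencies, using D = p_AB − p_A p_B and r^2 = D^2 / [p_A(1 − p_A) p_B(1 − p_B)]. The function name and the example frequencies are illustrative assumptions, not values from this study, and phased haplotype frequencies are assumed to be available.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom <= 0.0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom

# Hypothetical example: two alleles that always travel together on the same
# haplotype (complete LD) versus two alleles that assort independently.
complete_ld = ld_r_squared(p_ab=0.30, p_a=0.30, p_b=0.30)
no_ld = ld_r_squared(p_ab=0.06, p_a=0.20, p_b=0.30)
print(round(complete_ld, 6), round(no_ld, 6))
```

An r^2 near 1, as reported for rs738408 and rs738409, means the two SNPs carry almost the same information, which is why their association results are nearly identical.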

We further tested for an association of rs738409 with a prognostic measure of disease severity in individuals with cirrhosis, the Child-Pugh score11. We coded Child-Pugh classes as numeric scores from 1 to 3 in order of severity and fit the data using linear regression with the same covariates used in the binary outcome models and an additive genotype term. The high-risk G allele showed a suggestive association with increasing disease severity (Wald test, one-sided P = 0.05). The frequencies of rs738409[G] in individuals scored as Child-Pugh class A, B and C (with class C being the most severe) were 0.70, 0.75 and 0.77, respectively.
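The severity coding and allele-frequency calculation described above can be sketched as follows. This is a simplified illustration under stated assumptions: the regression here has a single additive genotype term and no covariates, the helper names are hypothetical, and the eight example individuals are invented.

```python
# Child-Pugh classes coded as numeric scores in order of severity.
CHILD_PUGH_SCORE = {"A": 1, "B": 2, "C": 3}

def risk_allele_frequency(genotypes):
    """Allele frequency from per-individual risk-allele counts (0, 1 or 2)."""
    return sum(genotypes) / (2 * len(genotypes))

def ols_slope(xs, ys):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical individuals: copies of the G allele and Child-Pugh class.
toy_genotypes = [2, 2, 1, 2, 1, 0, 2, 2]
toy_classes = ["C", "B", "A", "C", "B", "A", "B", "C"]
toy_scores = [CHILD_PUGH_SCORE[c] for c in toy_classes]

slope = ols_slope(toy_genotypes, toy_scores)
frequency = risk_allele_frequency(toy_genotypes)
print(f"score change per G allele = {slope}, G allele frequency = {frequency}")
```

A positive slope corresponds to higher severity scores with each additional copy of the risk allele, which is the direction tested by the one-sided Wald test.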

The biochemical function of the PNPLA3 protein is unclear, though it appears to have lipogenic transacetylase activity12. The gene is highly expressed in adipocytes and liver7, and the protein may have a role in energy homeostasis13. Obesity and diabetes are known risk factors for nonalcoholic as well as alcoholic liver disease, and expression of PNPLA3 is correlated with obesity6. However, studies associating rs738409 with liver dysfunction have not detected associations with systemic metabolic phenotypes, including obesity, plasma lipid levels or insulin resistance4,6. These and other results7 suggest that rs738409 is an independent risk factor for liver dysfunction. Our work supports a central role for altered lipid processing in liver pathogenesis14, but further investigation of the mechanism of PNPLA3 activity is warranted. As our study was limited to analyses of categorical diagnoses without detailed supporting quantitative phenotype data, additional work will be required to more precisely characterize the disease states associated with this polymorphism (see also Supplementary Note).

Our study extends the previously reported associations of rs738409 with subclinical nonalcoholic liver disease to clinically relevant diagnoses of alcoholic liver disease. This single variant accounts for a substantial share of the increased risk of cirrhosis associated with Hispanic ancestry. Hispanics with hepatitis C are also at substantially elevated risk for hepatic injury compared to individuals of other ethnicities15, and it will be important to determine whether rs738409 is associated in that context as well. The effect size of rs738409 is large for an association with complex disease in humans and may be the largest known genetic modifier for a disease that is a major cause of preventable death. For these reasons, this variant may be an attractive target for genetic screening to identify individuals at high risk of liver disease for more aggressive interventions.