Multiancestry exome sequencing reveals INHBE mutations associated with favorable fat distribution and protection from diabetes

Body fat distribution is a major, heritable risk factor for cardiometabolic disease, independent of overall adiposity. Using exome-sequencing in 618,375 individuals (including 160,058 non-Europeans) from the UK, Sweden and Mexico, we identify 16 genes associated with fat distribution at exome-wide significance. We show 6-fold larger effect for fat-distribution associated rare coding variants compared with fine-mapped common alleles, enrichment for genes expressed in adipose tissue and causal genes for partial lipodystrophies, and evidence of sex-dimorphism. We describe an association with favorable fat distribution (p = 1.8 × 10−09), favorable metabolic profile and protection from type 2 diabetes (~28% lower odds; p = 0.004) for heterozygous protein-truncating mutations in INHBE, which encodes a circulating growth factor of the activin family, highly and specifically expressed in hepatocytes. Our results suggest that inhibin βE is a liver-expressed negative regulator of adipose storage whose blockade may be beneficial in fat distribution-associated metabolic disease.


Contents
Two frameshift mutations in PLIN1 have been previously identified to be causal for FPLD type 4, which is characterized by selective lack of gluteofemoral and limb fat and an unfavorable metabolic profile (1). In our analysis, PLIN1 pLOF variants were associated with lower BMIadjusted WHR and larger hip circumference (Supplementary Data 27), indicative of a favorable fat distribution and opposite of what has been observed in FPLD type 4. The two frameshift variants in PLIN1 that cause FPLD type 4 (Leu404fs and Val398fs) occur at the C-terminal end of the protein, with long out-of-frame C-terminal tails with 158 and 166 aberrant amino acid residues (1), and both mutations also result in the synthesis of a mutated protein that is longer than wildtype perilipin 1 (2). Conversely, the vast majority of pLOF alleles observed in our data are predicted to truncate perilipin 1 earlier than the two FPLD variants (Supplementary Figure 21) and are likely to result in loss of a functional copy of the gene, leading to haploinsufficiency. Therefore, our human genetic analysis suggests that loss of a copy of PLIN1 is associated with favorable fat distribution in the general population, while the lipodystrophy phenotype observed in FPLD type 4 might be due to altered function of the C-terminal frameshift variants.

Supplementary Result 2. Genomic context analyses for rare variants at the INHBE locus.
We explored in depth the genomic context of the associations with BMI-adjusted WHR at the INHBE locus. We looked for fine-mapped common variant signals in a 1-Mb window around the INHBE gene and found no associated common variants that met genome-wide statistical significance in any of the analyzed populations (Supplementary Data 2 and Supplementary  Figure 7), indicative of the association being driven exclusively by rare variants.
In a backward-selection analysis of INHBE alleles contributing to the pLOF gene-burden association, we found that the association was primarily, but not exclusively driven by the c.299-1G>C splice variant (Supplementary Data 14). As c.299-1G>C is in linkage disequilibrium (LD; r 2 =0.89) with a rare Ser544Asn missense variant in the SLC26A10 pseudogene, we explored the possible role of the latter in the observed associations.
First, despite a lower allele frequency, INHBE c.299-1G>C had a stronger association with BMI-adjusted WHR than SLC26A10-Ser544Asn and was the lead rare coding variant at the locus (Supplementary Data 10). Second, the association for SLC26A10-Ser544Asn was eliminated when adjusting for INHBE c.299-1G>C genotype (Supplementary Data 10). Third, even when excluding all carriers of SLC26A10-Ser544Asn from the analysis, which (due to LD) meant the exclusion of the majority of INHBE pLOF carriers, there was still evidence of association for rare coding variants in INHBE (Supplementary Data 15). Consistent with the contribution of INHBE rare coding alleles beyond c.299-1G>C, we observed stronger associations for the overall burden of pLOF variants in INHBE than the splice variant alone for several WHR-related metabolic traits, including hip circumference, HbA1c, apolipoprotein B, high-density lipoprotein cholesterol, triglycerides and type 2 diabetes (Supplementary Data 11). Fourth, when excluding the Ser544Asn allele, rare coding variants in SLC26A10 were not associated with BMI-adjusted WHR (Supplementary Data 16). Fifth, SLC26A10 is a pseudogene that shows a high degree of tolerance to deleterious rare coding variation (https://gnomad.broadinstitute.org/, accessed January 18 th 2021) (3), with no reported evidence of protein expression in the Human Protein Atlas (https://www.proteinatlas.org/, accessed January 18 th 2021) (4). Thus, beyond the association results above, a missense allele in this pseudogene is less likely to impact a human phenotype than an experimentally validated loss-of-function allele like c.299-1G>C in a gene with evidence of protein expression as INHBE (https://www.proteinatlas.org/, accessed January 18 th 2021) (4). Taken together, these results implicate INHBE as the effector gene for rare variant associations at the locus.

Supplementary Result 3. Rare pLOF variants in INHBE are not associated with estimated bone mineral density or fracture risk, while rare coding variants in PPARG show an association.
Peroxisome proliferator activated receptor gamma (PPARG) agonists, a class of anti-diabetic drugs that enhance glycemic control by promoting peripheral fat storage (5), cause an increased risk of fractures (6), which has partly contributed to their limited clinical use. Consistently, we found an association with lower odds of fractures for rare pLOF plus deleterious missense variants in PPARG (Supplementary Data 28), which were also associated with higher BMI-adjusted WHR at exome-wide statistical significance in our study ( Table 1). In contrast, rare pLOF variants in INHBE were not associated with estimated bone mineral density or fracture risk (Supplementary Data 28).

Supplementary Result 4. Interplay of common and rare alleles in body fat distribution.
We sought to compare phenotypic impact of polygenic extremes and rare coding variants or other genotype combinations. We used protein-truncating or experimentally validated LOF variants in PPARG (Methods), the causal gene for FPLD type 3 and one of the 16 genes from our exome-wide analysis, as a benchmark for a Mendelian-like effect on fat distribution. Then, using GWAS results, we generated, selected, and validated a polygenic score comprising 500 common variants that maximized the variance explained in BMI-adjusted WHR (Methods; Supplementary Figure 16).
PPARG mutation carriers had 0.46 SD higher BMI-adjusted WHR (p=0.012; Table 2, Table  S23) and >4-fold higher odds of type 2 diabetes (per-allele odds ratio, 4.3; 95% CI, 1.9 to 9.6; p=3.4×10 -4 ; Table 2, Supplementary Data 23) compared to noncarriers; effect sizes which are consistent in magnitude to those of other Mendelian-disease causing mutations in population-based studies (7)(8)(9). In the same dataset, higher values of the 500-variant polygenic score for BMIadjusted WHR were associated with an unfavorable fat distribution and a higher diabetes risk, with a graded dose-response relationship (Supplementary Figures 17-19). Individuals in the top 1% of polygenic predisposition had similar average fat distribution to PPARG mutant carriers ( Table  2). Notably, being in the top 1% of the polygenic score is approximately 120-times more frequent than being the heterozygous carrier of a PPARG mutation in the cohorts we studied ( Table 2).
We observed similarly large phenotypic impact for other genotype combinations including rare alleles combined with high polygenic burden or multiple rare alleles (see Methods for genotype selection). Heterozygous carriers of ANKRD12 pLOF variants (the largest-effect WHR increasing association in our analysis; Table 1) who were also in the top quintile of the polygenic score had an average BMI-adjusted WHR of 0.77 SD units, a comparable phenotype to that of PPARG mutations, for a genotype combination twice as frequent (Supplementary Data 24). Rare "human knock-out" individuals who carried homozygous pLOF variants in PLIN4, encoding a coating protein of unilocular lipid droplets in adipocytes, had an average BMI-adjusted WHR of 0.52 SD units (Supplementary Data 24), again similar to that of PPARG mutation carriers. Interestingly, while the phenotypic impact on fat distribution of those genotypes was similar to that of PPARG mutations and carriers of those genotypes had higher prevalence of type 2 diabetes than the population prevalence (Supplementary Data 24), the prevalence of diabetes was highest in PPARG mutation carriers (33%; Supplementary Data 24) suggesting that perhaps the impact of PPARG mutations on diabetes is only partly mediated via fat distribution.
We made similar observations at the opposite polygenic extreme. Individuals in lower quantiles of the BMI-adjusted WHR polygenic score had more favorable fat distribution and lower prevalence of metabolic disease (Supplementary Figures 17-19), with individuals at the bottom 1% of the polygenic score having a favorable fat distribution (-0.67 SDs) and a risk of type 2 diabetes similar to that of INHBE pLOF carriers ( Table 2). Interestingly, individuals with rare pLOF variants in INHBE who were also in the bottom quintile of the polygenic score had a prevalence of diabetes of just 1% (as opposed to a pooled prevalence of 9.8% in the studied cohorts; Supplementary Data 24).
Our inferences remained the same when polygenic scores were derived using a previouslypublished GWAS training set of 142,762 people (10) (Supplementary Data 25, Supplementary  Figure 20).

SUPPLEMENTARY FIGURES Supplementary Figure 1. Genome-and exome wide associations with BMI-adjusted WHR.
Manhattan plots showing statistical strength of association (y-axis; plotted as -log10(p-value)) and chromosome position (x-axis). Panel A shows the multi-ancestry exome-wide analysis of gene burden genotypes; Panel B the multiancestry exome-wide analysis of single rare coding variants; Panel C shows the European Ancestry and Panel D shows the American Ancestry genome-wide association analysis of common imputed variants. P-values are from two-sided Z-tests from fixed-effect meta-analyses. For the exome-wide analyses, triangles pointing upwards (in yellow) or downwards (red) indicate associations with higher and lower BMI-adjusted WHR, respectively. Abbreviations: log, logarithm.

Supplementary Figure 2. Correlations of association estimates in sensitivity analyses using alternative adjustments.
Each panel shows associations for the 16 genes identified in the BMI-adjusted WHR discovery analysis, with BMI-adjusted WHR (x-axis; n=618,375) and with the following trait associations on the y-axis: A, WHR, not adjusted for BMI (n=619,298); B, WHR, adjusted for BMI and height (n=618,375); C, WHR, adjusted for BMI and estimated bone mineral density (n=440,770); D, WHR, non-linear adjusted for BMI and total body fat mass (n=444,107). Non-linear adjustment (panel D) was performed using a cubic spline. Error bars represent 95% confidence intervals around the beta coefficient. The red dotted line corresponds to a line with a slope of 1 and intercept of zero; the solid blue line represents the ordinary least squares regression line. P-values are based on two-sided Wald tests. Abbreviations: eBMD, estimated bone mineral density; SD, standard deviation; CI, confidence interval; WHR, waist to hip ratio; BMI, body mass index; UKB, UK Biobank.

Supplementary Figure 3. Associations with magnetic resonance imaging (MRI) derived visceral to gluteofemoral fat ratio for genes identified in the BMI-adjusted WHR discovery analysis.
The Figure shows associations with MRI-derived visceral to gluteofemoral fat ratio (VA/G fat ratio; y-axis; n=38,878; see Methods) and BMI-adjusted WHR (x-axis; n=618,375) for the 16 genes identified in the discovery analysis. Error bars represent 95% confidence intervals around the beta coefficient (black dots) in SD units of the outcome trait per-allele. The dotted line corresponds to the null effect of zero; the solid black line shows the Huber weighted robust regression line. P-value is from a two-sided Wald test. Abbreviations: SD, standard deviation; CI, confidence interval; WHR, waist to hip ratio; BMI, body mass index.

Supplementary Figure 4. Tissue expression enrichment in the exome-wide gene-burden analysis.
Tissue enrichment analysis was performed using both GTEx V8 gene expression data and geneburden association analysis results for BMI-adjusted WHR. Shown on the plot are tissues (y-axis) whose enhanced genes (see Methods) have a stronger association (x-axis) with BMI-adjusted WHR. Subcutaneous adipose tissue was preferentially sampled from the leg (11). P-values are from one-sided Wald tests. Only adipose subcutaneous tissue showed significant enrichment at α=0.001 (i.e. a Bonferroni correction for 48 tissues tested for enrichment). Abbreviations: p, Pvalue.

Supplementary Figure 5. Tissue expression for each of the 16 genes in this study.
We identified tissues of enriched expression for each of the 16 genes identified in our study. The figure below shows, for a given gene, the enriched tissue where that gene is most expressed (see Methods). Subcutaneous adipose tissue was preferentially sampled from the leg (11).

Supplementary Figure 6. Association of rare pLOF variants in INHBE with bioelectrical impedance body composition measures.
Association estimates are represented in red for fat mass or percentage phenotypes and in blue for lean mass or percentage phenotypes. Markers represent beta coefficients and error bars their 95% confidence intervals. Data are from the UKB cohort (n=445,122). Abbreviations: pLOF, predicted loss of function; AAF, alternative allele frequency; kg, Kilograms; CI, confidence intervals. In the top panel, the figure shows liver mRNA expression levels of INHBE in counts per million (CPM) in patients (n=2,032) with normal liver (control), steatosis of the liver (simple steatosis) and nonalcoholic steatohepatitis (NASH). The simple steatosis group showed higher expression of INHBE in the liver than the control group. The NASH group showed higher expression both when compared to the control and when compared to the simple steatosis groups. Box plots depict the median (horizontal bar), the 75 th and 25 th percentiles respectively (top and bottom bounds of each box), and the minimum and maximum CPM values for INHBE. In the bottom panel, the figure shows differential expression results comparing individuals in each nonalcoholic fatty liver disease (NAFLD) activity score (NAS) group compared with individuals with a score of zero. The numbers above each point correspond to the sample size in each group. All differences in INHBE expression between groups were statistically significant after correction for multiple testing. Markers represent the estimated percentage change and error bars their 95% confidence intervals. P-values are based on two-sided Wald tests.

Supplementary Figure 11. Correlation of hepatic INHBE mRNA expression with expression of FST, INHBA, and INHBB.
The figure shows the correlation (estimated using Pearson's correlation coefficient, with 95% CI in parenthesis) between liver mRNA expression levels of INHBE (shown on the y-axis), and liver expression of FST (encoding follistatin), and INHBA and INHBB (encoding the activin subunits; shown on the x-axis).

Supplementary Figure 13. Prevalence of type 2 diabetes by percentiles of fat distribution and BMI.
The figure below shows the relationships of BMI-adjusted WHR (green points) and BMI (orange points) with type 2 diabetes. Analyses were performed in European ancestry participants from the UKB. Abbreviations: BMI-adjusted WHR, waist-hip ratio adjusted for body mass index; BMI, body mass index; UKB, UK Biobank.

Supplementary Figure 14. Association of favorable fat distribution polygenic scores with DXA phenotypes.
Panel A shows associations with VA/G fat mass ratio (n=4,959), which was adjusted for BMI. Panel B shows associations with leg fat percentage (n=4,966), calculated using total body fat as the denominator. Panel C shows associations with leg fat mass (n=4,966). Associations of a polygenic score for lower BMI are also shown for comparison. Markers represent beta coefficients and error bars their 95% confidence intervals. P-values are based on two-sided Wald tests. Abbreviations: DXA, Dual-energy X-ray absorptiometry; CI, confidence interval; BMI, body mass index; WHR, waist-hip ratio; SD, standard deviation; P, p-value. Figure 15. Association of polygenic score for lower insulin resistance with liver phenotypes, type 2 diabetes and coronary artery disease risk.

Supplementary
Estimates are shown per 1 SD lower insulin resistance polygenic score. Markers represent beta coefficients (for quantitative traits) or odds ratios (for binary traits) and error bars their 95% confidence intervals. The upper panel shows associations with quantitative traits, while the lower panel associations with binary traits. P-values are based on two-sided Wald tests. Sample size per outcome phenotype: ALT,442,695;PDFF at MRI imaging,38,915;cT1 at MRI imaging,38,915; NAFLD activity score at liver biopsy, 3,572; nonalcoholic liver disease, 14,195 cases and 428,139 controls; cirrhosis, 4,063 cases and 428,139 controls; type 2 diabetes, 58,379 cases and 530,072 controls; coronary artery disease, 89,202 cases and 342,007 controls. Abbreviations: ALT, alanine aminotransferase; AST, aspartate aminotransferase; PDFF, proton density liver fat fraction; cT1, corrected T1; NAFLD, nonalcoholic fatty liver disease; BMI, body mass index; WHR, waist-hip ratio; P, P-value; CI, confidence intervals; SD, standard deviation.