Integration of human adipocyte chromosomal interactions with adipose gene expression prioritizes obesity-related genes from GWAS

Increased adiposity is a hallmark of obesity and overweight, which affect 2.2 billion people world-wide. Understanding the genetic and molecular mechanisms that underlie obesity-related phenotypes can help to improve treatment options and drug development. Here we perform promoter Capture Hi–C in human adipocytes to investigate interactions between gene promoters and distal elements as a transcription-regulating mechanism contributing to these phenotypes. We find that promoter-interacting elements in human adipocytes are enriched for adipose-related transcription factor motifs, such as PPARG and CEBPB, and contribute to heritability of cis-regulated gene expression. We further intersect these data with published genome-wide association studies for BMI and BMI-related metabolic traits to identify the genes that are under genetic cis regulation in human adipocytes via chromosomal interactions. This integrative genomics approach identifies four cis-eQTL-eGene relationships associated with BMI or obesity-related traits, including rs4776984 and MAP2K5, which we further confirm by EMSA, and highlights 38 additional candidate genes.


1.
The abstract contains some statements that need to be altered or qualified in my opinionrelated to my points above. Including "obesogenic loci".

2.
Introduction. The SNP based heritability of BMI may be about 21% but that is not the same as all common genetic variants, it is just about the SNPs and variation captured on that array.

3.
Because the results come before the methods, it is worth either simplifying some statements, or explaining more. For example it is not clear what the HindIII fragments are without going to the methods.

4.
Is the enrichment of PPARG TFBS in DHSs noteworthy after correction for multiple testing ? It only reaches =0.01 and there were 26 /332 TFBS's enriched? Similarly, the SNPs in adipocyte DHS explain 4.6% heritability compared to 0.23% expected, but does this account for the fact that DHSs will be closer to genes ?

5.
How does HOMER software work? Does it correct for the likely enrichment of DHS near genes and therefore TFBS ?

6.
In the eGenes section of the results, the narrative is hard to follow in places. For example, I am not clear where the 386K eQTLs come from when first mentioned. Would it not be easier on the reader to clump these into independent signals and talk about the 100s-1000s of independent eQTLs nto associated SNPs, which could include 100s in strong LD with the index (most strongly associated) variant?

7.
In the results section on "looping" and elsewhere in the text, I think the term "BMI genes" is being used too loosely. This is a very ambiguous term that could refer to a gene who's expression level is correlated with BMI, a gene lying in a functionally interesting region relevant to BMI (such as DHS) or a SNP associated with BMI. Please be clearer -see main comments.

8.
Whilst the MAP2K gene story is nice, I think it is too strong to say that this data confirms it as a "BMi gene". That would require a coding mutation in a human, or manipulation in an animal model. Note that the adipose tissue gene expression of most genes in the genome correlates with BMI (Emilsson et al ) Reviewer #2 (Remarks to the Author): Genome-wide association studies (GWASs) aim at deciphering the roles of variants such as single nucleotide polymorphims (SNPs) in complex diseases. However the SNPs identified by GWASs can only explain a small part of the genetic heritability of these diseases. Moreover it is difficult to identify the causal SNPs among the SNPs that are associated to the diseases due to high linkage disequilibrium (LD) among close SNPs. By integrating GWAS SNPs with eQTLs and long-range chromatin interactions, Pan et al. identified SNPs that might act through long-range contacts with the four genes : MAP2K5, ORMDL3, LACTB and ACADS. These genes are associated with body mass index (BMI) and other obesity phenotypes. For instance, they show that the SNP rs4776984 is in long-range contact with the gene MAP2K5, it is an eQTL of MAP2K5, and it is in almost perfect LD with a BMI GWAS SNP (rs16951275). Moreover they demonstrate that the SNP rs4776984 can increase to predicted binding for CTCF protein, and using electrophoretic mobility shift assays of nuclear protein extracts, they observed increased protein binding for the SNP allele compared to the reference allele.
My major comments are the following: 1) The article is hard to read and to understand, especially for the subsections "Characterization of the adipocyte chromosomal interactions" and "Chromosomal interactions explain heritability of gene expression". The article lacks clarity for multiple reasons. For instance, some terms are not explained such as "local gene expression", what does the term "local" refer to here? What is "partitioned LD Score Regression"? What is the aim of it? Another reason why the article lacks clarity is because the sentences are sometimes not well connected and some ideas are not introduced. Moreover, the authors must provide a schema to illustrate how data are integrated (Fig2/Sup Fig2 are really not enough). It is very difficult to understand clearly how data are integrated. Basically, the authors do multiple overlappings between sets such as eQTL SNPs, GWAS SNPs, LD SNPs, SNPs mapping to HindIII fragments, or eQTL genes, genes associated with BMI and genes mapping to HindIII fragments. The authors must plot multiple Venn diagrams (for instance) to represent the filtering procedures that were used to identify the candidate SNPs and genes that the authors identified.
2) In subsection "Chromosomal interactions explain heritability of gene expression", The authors show that DNase I hypersensitive sites (DHSs) in pCHI-C non-promoter fragments are significantly enriched in the local gene expression. They must compare it to the DHSs not in non-promoter fragments in order to demonstrate the importance of long-range interacting loci in the regulation of gene expression. Similarly, they show that the variants in DHSs in pCHI-C non-promoter fragments account for 4.6% of the heritability while accounty for 0.23% of the SNPs per local gene region. The authors must compare to the variants in the DHSs not in non-promoter fragments. The authors will obtain 4 different counts to compute an exact Fisher's test. Also, in subsection "Characterization of the adipocyte chromosomal interactions", the authors show that non-promoter fragments are enriched for histone marks. The authors must also demonstrate the enrichment for DHSs. Moreover the authors should compare the enrichments of DNA motifs in non-promoter fragment DHSs with other DHSs.
3) In subsection "Looping eGenes dissect novel GWAS genes for obesogenic traits", the authors found that the SNP rs4776984 showed an increased binding for CTCF, p300, RAD21 and SMC3. The authors must remove results for RAD21 and SMC2 because those proteins are part of the cohesin complex that cannot bind directly to the DNA. Instead cohesin is recruited by CTCF to the chromatin to form chromatin loops. Predicting the binding for CTCF is thus enough. Moreover, the authors can use two different tools: DeepSEA (http://deepsea.princeton.edu/) and gkmSVM (http://www.beerlab.org/gkmsvm/) to predict the impact of the SNP on DHS, CTCF binding and histone marks. In addition, the authors must do a ChIP of CTCF when doing the EMSA assay and show a supershift due to the antibody bound to CTCF.
4) The authors can validate the SNP rs4776984 if they demonstrate that it has a differential looping effect. For this purpose, the authors can do a 3C-qPCR experiment in one or more patient(s) carrying the SNP (rs4776984) and compared it to another/other patient(s) not carrying the reference allele. The 3C-qPCR should be designed to capture the long-range contact between rs4776984 and the promoter of MAP2K5.
5) The authors must provide more details about the pCHi-C experiment results they obtained, such as basic statistics. How many reads were mapped? How many interactions were identified as significant? Is there any replicate to estimate the reproducibility? Reviewer #3 (Remarks to the Author): This is an interesting manuscript where complementary approaches for global interrogation of functional elements in the human genome is used to define (likely) functional SNPs controlling adipose gene expression. In addition, expression of these genes is correlated with BMI. I have the following comments -It is not clear why GWAS of BMI, lipids, and metabolites in peripheral blood were used to define clinical implications of the findings. How important is adipose tissue, as compared to other organs, to determine metabolites in peripheral blood? The importance of primary disturbances in adipose tissue for controlling BMI is also unclear; most candidate genes for BMI from GWAS are primarily expressed in the brain. One reasonable hypothesis is that adipose gene expression is more important for body fat distribution and insulin resistance. Why where GWAS of these traits not investigated? - The authors draw too far-reaching conclusions as regards the metabolite loci when discussing them in relation to obesity. These are metabolic traits, but the link to obesity in humans has not been established.

-
The adipocyte pCHi-C DHS loci were overrepresented in adipocytes as compared to other cell types. But what about the corresponding eQTLs, are they specific to adipose tissue? If not, how to explain this?
- Table 2 is misleading as the number of genes per pathway is no more than 2.
-Are ORMDL3 or LACTB expressed in human adipocytes at the protein level? Do they have any function in human fat cells?

-
The causal link between adipose tissue gene expression and BMI is unclear. I do not mean that the authors need to define this relationship. However, they should be more cautious in their writing about obesity based on presented data.

Details: -
In the Introduction it is written that "deep clinical phenotype data" were used; which data is referred to? Are there really "deep clinical phenotype data" in this study?
-"Chromosomal interactions explain heritability of gene expression" -This paragraph is difficult to understand. I do not understand what is meant by heritability is partitioned into 52 categories?
the HindIII fragments so that counts per fragment are obtained, representing the number of times two regions were detected to be interacting.

Is the enrichment of PPARG TFBS in DHSs noteworthy after correction for multiple testing ?
It only reaches =0.01 and there were 26 /332 TFBS's enriched? Similarly, the SNPs in adipocyte DHS explain 4.6% heritability compared to 0.23% expected, but does this account for the fact that DHSs will be closer to genes ? Response: We would like to clarify regarding the multiple testing of the enrichment of the PPARG TFBS in DHSs that the Benjamini Hochberg adjusted p-value after correcting for multiple testing is 0.039, which remains significant. Regarding the heritability of local gene expression explained by DHSs in adipocyte promoter-interacting regions, the 0.23% refers to the percentage of genome-wide SNPs that are located within these promoter-interacting DHSs. This small percentage accounts for 4.6% of the heritability of local gene expression. This enrichment analysis does not take into account the fact that DHSs are closer to genes. However, the other functional annotations we analyzed with LD Score are also enriched near genes (e.g. 5' UTR, coding regions, TSS, etc.), and it is thus not likely that the closeness to the genes biases the estimate of how much these 0.23% of genome-wide SNPs explain of local gene expression.

How does HOMER software work? Does it correct for the likely enrichment of DHS near genes and therefore TFBS ?
Response: We thank the Reviewer for this comment and have revised the Results and Methods to clarify how the HOMER software works (pages 6, 29). Briefly, the HOMER software (Heinz et al. Mol Cell 2010) takes a set of input target and background regions and then calculates the number of times a TF motif is seen in the target and background sequences. Our target regions consisted of DHSs in promoter-interacting fragments in adipocytes that are not present in pCHi-C data produced in CD34 + blood cells. The background is the vice versa: DHSs in CD34 + promoter-interacting fragments that are not present in our adipocyte data. The expected number of times a TFBS should be seen is calculated assuming the sites are evenly distributed across the genome. The proportion of times the TFBS is seen in the target when compared to the background is then tested for significant enrichment in the target sequences using the cumulative binomial distribution. We did not correct for the likely enrichment of DHSs near genes because both datasets are interrogating regions that are interacting with gene promoters and we assume that the promoter-interacting fragments come from a similar distribution of distances to genes.
6. In the eGenes section of the results, the narrative is hard to follow in places. For example, I am not clear where the 386K eQTLs come from when first mentioned. Would it not be easier on the reader to clump these into independent signals and talk about the 100s-1000s of independent eQTLs nto associated SNPs, which could include 100s in strong LD with the index (most strongly associated) variant? Response: As suggested by the Reviewer, we have modified the Results to better describe our eQTL results (page 8). Specifically, we revised the Results and Supplementary Figure 2 to clearly show how we narrowed down to the 386k cis-eQTLs. Briefly, there are 487,679 SNPs, which were identified as cis-eQTLs in both METSIM and the GTEx adipose RNA-seq data. Among these 487,679 cis-eQTLs, the 386,068 cis-eQTLs have the same target gene and direction of effect in both METSIM and GTEx. We would also like to clarify that our design was to find the GWAS signals that regulate gene expression via chromosomal interactions as a looping cis-eQTL. Thus, we could not first do LD pruning of the cis-eQTL signals, and accordingly, we could not just focus on the strongest eQTL at each locus. We have further revised the Supplementary Figure 2 to clearly indicate the steps we used to overlap the eQTLs and pCHi-C data to fine-map our cis-eQTLs. 7. In the results section on "looping" and elsewhere in the text, I think the term "BMI genes" is being used too loosely. This is a very ambiguous term that could refer to a gene who's expression level is correlated with BMI, a gene lying in a functionally interesting region relevant to BMI (such as DHS) or a SNP associated with BMI. Please be clearer -see main comments.

Response:
We thank the Reviewer for this comment and have now omitted the ambiguous term "BMI genes" from the manuscript (pages 11 and 14).
8. Whilst the MAP2K gene story is nice, I think it is too strong to say that this data confirms it as a "BMi gene". That would require a coding mutation in a human, or manipulation in an animal model. Note that the adipose tissue gene expression of most genes in the genome correlates with BMI (Emilsson et al )

Response:
To address the Reviewer's concern, we have now revised the wording to state that we have identified one possible candidate gene (page 12) and mechanism underlying the GWAS BMI signal at the MAP2K5 locus. We have modified the section of Discussion on MAP2K5 (page 17) to better describe the link between MAP2K5 and adiposity. Briefly, MAP2K5 is a member of the ERK5 MAP kinase signaling cascade, and the importance of ERK5 signaling in adipose was previously demonstrated in mouse Erk5 knock-outs, which exhibit increased adiposity (Zhu et al. J Biol Chem 2014). This suggests that changes in ERK5 signaling in adipocytes could be relevant for human obesity. MAP2K5 is a strong and specific activator of ERK5 in the ERK5 MAP kinase signaling cascade (Kato et al. EMBO 1997), supporting further study of MAP2K5 in connection with increased adiposity.

Responses to Reviewer 2:
We thank the Reviewer for the helpful critique and comments of the manuscript and have addressed all of the issues that were raised. We hope that the revisions satisfactorily respond to all of the Reviewer's concerns.

Reviewer #2 (Remarks to the Author):
Genome-wide association studies (GWASs) aim at deciphering the roles of variants such as single nucleotide polymorphims (SNPs) in complex diseases. However the SNPs identified by GWASs can only explain a small part of the genetic heritability of these diseases. Moreover it is difficult to identify the causal SNPs among the SNPs that are associated to the diseases due to high linkage disequilibrium (LD) among close SNPs. By integrating GWAS SNPs with eQTLs and long-range chromatin interactions, Pan et al. identified SNPs that might act through longrange contacts with the four genes : MAP2K5, ORMDL3, LACTB and ACADS. These genes are associated with body mass index (BMI) and other obesity phenotypes. For instance, they show that the SNP rs4776984 is in long-range contact with the gene MAP2K5, it is an eQTL of MAP2K5, and it is in almost perfect LD with a BMI GWAS SNP (rs16951275). Moreover they demonstrate that the SNP rs4776984 can increase to predicted binding for CTCF protein,and using electrophoretic mobility shift assays of nuclear protein extracts, they observed increased protein binding for the SNP allele compared to the reference allele.
My major comments are the following: 1. The article is hard to read and to understand, especially for the subsections "Characterization of the adipocyte chromosomal interactions" and "Chromosomal interactions explain heritability of gene expression". The article lacks clarity for multiple reasons. For instance, some terms are not explained such as "local gene expression", what does the term "local" refer to here? What is "partitioned LD Score Regression"? What is the aim of it? Another reason why the article lacks clarity is because the sentences are sometimes not well connected and some ideas are not introduced. Moreover, the authors must provide a schema to illustrate how data are integrated (Fig2/Sup Fig2 are really not enough). It is very difficult to understand clearly how data are integrated. Basically, the authors do multiple overlappings between sets such as eQTL SNPs, GWAS SNPs, LD SNPs, SNPs mapping to HindIII fragments, or eQTL genes, genes associated with BMI and genes mapping to HindIII fragments. The authors must plot multiple Venn diagrams (for instance) to represent the filtering procedures that were used to identify the candidate SNPs and genes that the authors identified.
Response: We thank the Reviewer for these comments and have revised the manuscript throughout to clarify all of the issues that the Reviewer lists here. Accordingly, we revised the 2 paragraphs (pages 7-8); explained the terms "local" and "partitioned LD Score Regression" (pages 6-7); and clarified the integration of the data analysis and how we overlapped the data sets (the revised Supplementary Figure 2 and page 8).
2. In subsection "Chromosomal interactions explain heritability of gene expression", The authors show that DNase I hypersensitive sites (DHSs) in pCHI-C non-promoter fragments are significantly enriched in the local gene expression. They must compare it to the DHSs not in non-promoter fragments in order to demonstrate the importance of long-range interacting loci in the regulation of gene expression. Similarly, they show that the variants in DHSs in pCHI-C non-promoter fragments account for 4.6% of the heritability while accounty for 0.23% of the SNPs per local gene region. The authors must compare to the variants in the DHSs not in nonpromoter fragments. The authors will obtain 4 different counts to compute an exact Fisher's test. Also, in subsection "Characterization of the adipocyte chromosomal interactions", the authors show that non-promoter fragments are enriched for histone marks. The authors must also demonstrate the enrichment for DHSs. Moreover the authors should compare the enrichments of DNA motifs in non-promoter fragment DHSs with other DHSs. Response: As suggested by the Reviewer, we included a new functional category to our LD-score regression analysis to also compute the enrichment in local gene expression of SNPs located within DHSs while excluding the DHSs in the pCHi-C non-promoter fragments (revised figure 1). Noteworthy, only a very small fraction of DHSs is excluded in this new DHSs category when compared to the overall DHSs category that we showed previously in Figure 1.
As suggested by the Reviewer, we then compared the enrichment of the DHSs in pCHi-C nonpromoter fragments in the local gene expression (enrichment=20.3; SD+/-5.2), with the enrichment of this new DHS category that excludes the DHSs in the pCHi-C non-promoter fragments (enrichment=2.28; SD+/-0.2) (revised Figure 1).
As also suggested by the Reviewer, we compared the result that the 0.23% of the SNPs in DHSs in pCHi-C non-promoter fragments account for 4.6% of the heritability of adipose tissue gene expression in cis with the result that the 13.25% SNPs in the new DHS category, which excludes the DHSs in the pCHi-C non-promoter fragments, account for 30.21% of the heritability of adipose tissue gene expression in cis.
Taken together, these data show that the enrichment for heritability with the 0.23% SNPs in the DHSs in pCHi-C non-promoter fragments is ~10x higher when compared to the 13.25% SNPs in this new DHS category that excludes the DHSs in the pCHi-C non-promoter fragments, emphasizing their role in regulating local gene expression.
In our previous version of the manuscript, we demonstrated an enrichment of histone marks in the pCHi-C non-promoter fragments. As suggested by the Reviewer, we now also performed an enrichment analysis for the DHSs in the pCHi-C non-promoter fragments. We see a similar, significant enrichment for the DHSs in the pCHi-C non-promoter fragments as for the histone marks, and we revised the manuscript accordingly (page 5 and Supplementary Table 2).
We have also included our results from HOMER comparing DHSs in adipocyte promoter-interacting fragments to DHSs in all other non-promoter, non-promoter-interacting regions (page 6). The enriched TFBSs are largely the same as in our previous comparison between the DHSs in the promoter-interacting regions in adipocytes and those in CD34 + cells. The major difference is in the pvalues, which are much more significant in this new analysis that the Reviewer suggested, comparing the adipocyte promoter-interacting DHSs to all DHSs in non-promoter, non-interacting fragments. For example, the PPARG binding site is enriched in our adipocyte promoter-interacting DHSs when compared to the CD34 + promoter-interacting DHSs, with a p-value of 1×10 -2 , whereas in the comparison of adipocyte promoter-interacting DHSs to all other non-promoter, non-interacting DHSs in the genome, the PPARG binding site is enriched with a p-value of 1×10 -6 . We thank the Reviewer for this helpful clarification.
3. In subsection "Looping eGenes dissect novel GWAS genes for obesogenic traits", the authors found that the SNP rs4776984 showed an increased binding for CTCF, p300, RAD21 and SMC3. The authors must remove results for RAD21 and SMC2 because those proteins are part of the cohesin complex that cannot bind directly to the DNA. Instead cohesin is recruited by CTCF to the chromatin to form chromatin loops. Predicting the binding for CTCF is thus enough. Moreover, the authors can use two different tools: DeepSEA (http://deepsea.princeton.edu/) and gkmSVM (http://www.beerlab.org/gkmsvm/) to predict the impact of the SNP on DHS, CTCF binding and histone marks. In addition, the authors must do a ChIP of CTCF when doing the EMSA assay and show a supershift due to the antibody bound to CTCF.

Response:
We thank the Reviewer for this comment and have now removed the statement that RAD21 and SMC3 binding is predicted to be increased. We also thank the Reviewer for the suggestion to use the DeepSEA tool to predict the impact of SNP rs4776984 on DHS, CTCF binding, and histone marks. Using DeepSEA (deep learning-based sequence analyzer), we examined the allelic effect on protein binding of rs4776984 and the 15 other looping cis-eQTLs of MAP2K5. Of these 16 looping cis-eQTLs, two variants passed functional significance score<0.05 using DeepSEA. Of the two, our candidate functional eQTL SNP, rs4776984, resulted in the most significant functional score (2.36x10 -3 ) and was the only variant passing the functional significance score <0.01 among the 16 variants. Thus, the DeepSEA result further supports the differential TF binding at the variant site rs4776984 among all possible looping cis-eQTLs at the MAP2K5 locus. We thank the Reviewer for this helpful comment and have now included these DeepSEA results in the revised manuscript (page 11).
We also performed the supershift experiment using the CTCF antibody and adipocyte nuclear extract, and repeated this experiment three times. We did not observe a supershift in these EMSAs (new Supplementary Figure 5). We also performed the supershift experiment using another CTCF antibody (EMD Millipore 07-729), which resulted in the same negative finding (data not shown). However, a supershift experiment may remain negative even in the presence of true TF binding if a complex instead of a single TF alone is required for the TF binding. We also directly tested the CTCF protein for allele-specific binding at rs4776984 using EMSA in 3 replicate experiments, and did not find evidence of sole CTCF protein binding (new Supplementary Figure 6). These supershift and CTCF protein EMSA results are now included in the revised Results (page 12 and new Supplementary  Figures 5-6) and Methods (page 25). Even though we did not find evidence of CTCF binding at rs4776984, nevertheless our replicated EMSA experiments with adipocyte extract show a clear allelespecific binding of a protein at the rs4776984 site with the looping DHS and histone marks H3K27me3 and H3K9me3. As the DeepSEA tool identified multiple TFs binding the rs4776984 site in an allele-specific way (Supplementary Table 8), it will likely require testing of a large set of TFs in additional functional studies to identify the actual TF that binds this site. These studies are beyond the scope of the current study.
4. The authors can validate the SNP rs4776984 if they demonstrate that it has a differential looping effect. For this purpose, the authors can do a 3C-qPCR experiment in one or more patient(s) carrying the SNP (rs4776984) and compared it to another/other patient(s) not carrying the reference allele. The 3C-qPCR should be designed to capture the long-range contact between rs4776984 and the promoter of MAP2K5.

Response:
We agree with the Reviewer that this experiment would be valuable to show that differential looping occurs within the nucleus. However, it is not feasible to establish and perform the suggested experiments within the time frame of the revision process because these studies are technically challenging and time consuming. Furthermore, to produce the adipocyte promoter Capture Hi-C data, we used a commercial cell line, and we do not have access to human cell lines with the particular genotypes at the candidate SNP site rs4776984, and moreover establishing a protocol for patient iPSC generation and differentiation into adipocytes in our lab is very time-consuming and outside the scope of this study. Instead, for additional validation of the looping interactions at the 4 GWAS loci, we now provide new evidence from our independent pCHI-C experiment in HWA with two replicates. These new pCHI-C data replicate the chromosomal interactions at all 4 GWAS loci, and thus further confirm our original findings (page 13 and new Supplementary Table 9), demonstrating the robustness of our pCHI-C results. 5. The authors must provide more details about the pCHi-C experiment results they obtained, such as basic statistics. How many reads were mapped? How many interactions were identified as significant? Is there any replicate to estimate the reproducibility? Response: As suggested by the Reviewer, we have now included details about the pCHi-C data analysis in the revised Results (page 5 and Supplementary Table 1). As mentioned in our response to point 4 above, since the submission of this manuscript, we have repeated the pCHi-C experiments in the same cell line, HWA, and found that the same GWAS SNP interactions reported in the current study were also identified in this subsequent experiment. This new data thus provide additional important evidence, supporting the conclusion that the interactions we report in the current study are robust. We revised the Results (page 13) and made a new Supplementary Table 9 to include the new pCHi-C data.

Response to Reviewer 3:
We thank the Reviewer for the helpful critique and comments of the manuscript and have addressed all of the issues that were raised. We hope that the revisions satisfactorily respond to all of the Reviewer's concerns.

Reviewer #3 (Remarks to the Author):
This is an interesting manuscript where complementary approaches for global interrogation of functional elements in the human genome is used to define (likely) functional SNPs controlling adipose gene expression. In addition, expression of these genes is correlated with BMI. I have the following comments -It is not clear why GWAS of BMI, lipids, and metabolites in peripheral blood were used to define clinical implications of the findings. How important is adipose tissue, as compared to other organs, to determine metabolites in peripheral blood? The importance of primary disturbances in adipose tissue for controlling BMI is also unclear; most candidate genes for BMI from GWAS are primarily expressed in the brain. One reasonable hypothesis is that adipose gene expression is more important for body fat distribution and insulin resistance. Why where GWAS of these traits not investigated?
Response: We thank the Reviewer for the comment and have now revised the manuscript (page 9) to clearly state the known importance of adipose tissue in energy homeostasis of the human body. We agree with the Reviewer that other organs also play a role in altering the traits investigated in this study, and accordingly, we have now revised the manuscript to explain that the goal of the current study is to dissect the contribution of adipose/adipocyte functions to these traits (page 9). As suggested by the Reviewer, we also added to the manuscript a list of GWAS studies that we investigated, including a type 2 diabetes and a waist-hip-ratio (WHR) adjusted for BMI GWASs (Shungin et al. Nature 2015 and Fuchsberger et al. Nature 2016), which did not result in any overlaps with our adipocyte pCHi-C and adipose cis-eQTL data.
-The authors draw too far-reaching conclusions as regards the metabolite loci when discussing them in relation to obesity. These are metabolic traits, but the link to obesity in humans has not been established.

Response:
We thank the Reviewer for this comment. In addition to the changes noted in our response above, we have now included a rationale of our decision to use the metabolite GWAS data into the revised Results (page 9). We would also like to clarify that these metabolic traits may not have been shown to be causal for obesity yet, but their relationship to the obese state has been studied previously, as described in the Discussion. Regarding the LACTB locus, a robust correlation between the levels of the succinylcarnitine metabolite and BMI has been shown in two independent cohorts, KORA and TwinsUK, previously (Suhre et al. Nature 2011). Additional support for LACTB as a causal gene for obesity derives from functional studies using transgenic overexpression of Lactb in mice, resulting in an increase in the fat-mass-to-lean-mass ratio (Yang et al Nat. Genet. 2009 andChen et al. Nature 2008). These previous findings are discussed on pages 18-19 of the revised manuscript.
-The adipocyte pCHi-C DHS loci were overrepresented in adipocytes as compared to other cell types. But what about the corresponding eQTLs, are they specific to adipose tissue? If not, how to explain this? Response: We thank the Reviewer for the comment and would like to clarify that we observed that the TF motif enrichment within the DHSs (union of all publicly available DHS data) included TFs that are important for adipocyte function when compared to the DHSs in promoter-interacting fragments from a different cell type (CD34 + ). To address the Reviewer's comment about the tissue-specificity of the looping cis-eQTLs, we used the GTEx cis-eQTL data for the 44 tissues here for the Reviewer as follows: To investigate whether the cis-eQTLs identified by adipose looping interactions for the 576 eGenes are enriched in the subcutaneous adipose tissue, we selected the strongest looping cis-eQTL per gene and ranked the p-values of each looping cis-eQTL in the 44 GTEx tissues in ascending order (please see the Figure below). For these ranking comparisons, we used summary statistics from publicly available GTEx cis-eQTL data. If a cis-eQTL was not significant in a given tissue, its rank was set at the shared highest rank. The figure below shows that overall the cis-eQTLs in the subcutaneous adipose tissue rank the lowest when compared to the other 43 tissues in the GTEx cohort. Using a t-test, the rank of looping cis-eQTLs in the subcutaneous adipose tissue was significantly lower than the rank of cis-eQTLs in the tibial nerve tissue, which was the second smallest ranking tissue (p-value =1.27x10 -9 ) (please see the figure below). Similarly, the rank of the looping cis-eQTLs in the subcutaneous adipose tissue was also significantly lower than the ranks of the cis-eQTLs in the remaining 43 GTEx tissues (p-value < 2.2x10 -16 ). Taken together, these data suggest a potential enrichment of these looping cis-eQTLs in the subcutaneous adipose tissue when compared to the 43 other GTEx tissues. However, we recognize that we selected looping cis-eQTLs that are also significant cis-eQTLs in GTEx adipose tissue, which might inflate the enrichment signal. Therefore, another independent cohort is required to estimate the adipose-specificity of looping cis-eQTLs without bias. Since currently there is no other multitissue cohort available in addition to GTEx, future studies in independent multitissue cohorts are warranted to fully verify the adipose enrichment of these looping cis-eQTLs.
The box plots show the medians of the ranks of looping cis-eQTLs from our 576 eGenes in our study across all GTEx tissues. A significantly smaller rank of the cis-eQTLs was observed for the subcutaneous adipose tissue when compared to the tibial nerve tissue (p-value =1.27x10 -9 ), and overall smaller when compared to the ranks of the cis-eQTLs in the remaining 42 GTEx tissues (p-value < 2.2x10 -16 ).
- Table 2 is misleading as the number of genes per pathway is no more than 2.
Response: We thank the Reviewer for this comment and have added information on the genes within each pathway to demonstrate that the genes do not fully overlap. To not overstate the importance of this finding that is based on small numbers, we moved it to the Supplementary results (Supplementary Table 7).
-Are ORMDL3 or LACTB expressed in human adipocytes at the protein level? Do they have any function in human fat cells?
Response: To address the Reviewer's comment, we have further examined protein-level expression and function of these two proteins in adipose tissue and/or adipocytes. While ORMDL3 protein expression is not available in public databases (e.g. The Human Protein Atlas), Ormdl3 is expressed in mouse adipocytes differentiated from 3T3-L1 cells. Furthermore, its expression at the protein level is used in this previous study as a readout of ER stress in adipocytes (Kajimoto et al. 2014). As the function of the ORMDL3 protein has not been studied in human primary adipocytes, we are thus limited to various BMI-related transcriptomic studies, which consistently show a significant inverse correlation of ORMDL3 adipose expression with BMI (Heinonen et al. Diabetologia 2017;He et al. Sci Rep 2017), as reported in our study. We face a similar issue for LACTB, which has a limited number of studies on the endogenous protein function in adipocytes. A transgenic Lactb mouse model demonstrated an increased fat mass when compared to the WT mice; however the adipose tissue transcriptome was not studied. Nevertheless, as mentioned in our responses previously, functional studies using transgenic overexpression of Lactb in mice supports LACTB as a causal gene for obesity, because transgenic overexpression of Lactb in mice results in an increase in the fat-mass-tolean-mass ratio. We have revised our Discussion (page 18) to note this lack of evidence at the protein level for both of these proteins and call for further molecular studies to determine the function of ORMDL3 and LACTB in connection with obesity.
-The causal link between adipose tissue gene expression and BMI is unclear. I do not mean that the authors need to define this relationship. However, they should be more cautious in their writing about obesity based on presented data.

Response:
We thank the Reviewer for this comment and have carefully revised the manuscript to avoid stating that our results provide a causal link to BMI. Instead we now state that our study furthers the understanding of adipose tissue and adipocyte biology and their role in obesity-related human phenotypes (page 4, 8, 11, and 14).
Details: -In the Introduction it is written that "deep clinical phenotype data" were used; which data is referred to? Are there really "deep clinical phenotype data" in this study?
Response: To address the Reviewer's concern, we omitted this expression from the Introduction and instead clarified in the revised the Introduction and Results what exact metabolic clinical phenotypes we used in the METSIM cohort (pages 3 and 14).

-"Chromosomal interactions explain heritability of gene expression" -This paragraph is difficult to understand. I do not understand what is meant by heritability is partitioned into 52 categories?
Response: To address the Reviewer's comment, we rewrote this section to clearly describe how the LD Score Regression method was utilized in our study (page 7). Partitioning heritability to 52 categories is now explained as follows: To investigate whether the variants residing within the open chromatin in the adipocyte chromosomal looping regions are enriched for the heritability of cis expression regulation, we partitioned the heritability of cis regulation of human adipose gene expression into 52 functional categories using a modified partitioned LD Score regression method To assess the heritability enrichment by the variants in the chromosomal interactions detected by pCHi-C on a per-gene basis, we further modified the LD score method, as described in detail in the Methods. We have now revised the Results to clarify this part (page 7).

Reviewer #1 (Remarks to the Author):
This remains a good paper tackling the very hard job of finding causal alleles and genes in a relevant tissue. In general I still think the authors are overselling their results, especially in the title. Whilst very interesting , I don't think the coincidence of one of the SNPs in an LD block representing a cis eQTL with a HiC looping feature means the functional SNP has been found. Other SNPs could be causal for different reasons and it is still not made clear whether or not these SNPs are the strongest of those associated with gene expression AND the strongest of those associated with the GWAS trait.
I still find it hard to follow some sections. This is perhaps reflected in the abstract , where it is hard to get a clear and simple message about the paper.
For example, the results sectoin "identification of looping cis eQTL SNPs" is hard to follow. The section appears to be misstitled in that it appears to be about the identification of genes not SNPs. The sequence appears to be 1.

2.
Take intersect of 1 above and the hiC promoter interacting regions.

3.
Take 386k cis eQTLs consistent in direction in metsim and gtex. Is this a subset of 1 or 2 above ? I think 1.? And these represent how many independent cis eQTLs given many will be in v strong LD ? 4. these 386eQTLs are linked to 4332 genes. (Presumably the redundancy is mainly due to many tens of variants in LD per locus plus multiple signals per gene?)

5.
Take the 576 / 4332 genes that occur in hiC identified regions (but how does this compare to step 2 above ?)

6.
Take the 52 / 576 genes where their adipose tissue expression correlates with BMI . This figure of ten % correlation with BMI seems very low given the adipose expression of most genes correlates with BMI.

7.
Take the 42/ 52 genes where adipose tissue expression also correlates with BMI in another cohort -twins UK. I think step 2 is confusing and the whole section could be clearer . It might be easier to simply list in one sentence the criteria for gene selection before explaining more. For example steps 1 and 3 could be "the genes selected had to a) have eQTLs in two studies, metsim with consistent directoins in gtex", and steps 6 and 7 "b) have adipose expression correlated with BMI in the same direction in two studies". I still have a problem with the next section "dissect novel GWAS genes for BMI ". The term "for BMI" is vague and gives the impressoin we know that they alter BMI when altered in some way. I don't think the authors have proven this. There is one cis eQTL in this paper that would justify being called a BMI SNP and the authors have done a good job at highlighting MAP2k5 as a strong candidate at this locus. (Note that It would be worth stating which gene the GIANT consortium had labelled this locus as , presumably not MAP2K5 as the index SNP is in a nearby gene?). Given the GWAS sample sizes , especially for BMI and WHR, a variant with an r2 of 0.8 with the index gwas variant could be very unlikely to be the causal variant compared to tens if not hundreds at stronger r2 values. What are the r2 values with the gwas index snp ? The authors give this for MAP2k5 but not the other three.
To make the results clearer I suggest separate subheadings and paragraphs for each of the four GWAS genes discussed.
The final short paragraph of the results is not needed I feel. Once a gene is correlated with BMI it is not surprising it is going to be correlated with bmi related traits and doesn't help us decide if the cis e gene-bmi link is real or not, and the results section is long and quite hard to follow, when it should be the clearest and quickest part of the paper to understand.
Other points:

1.
It is still tough to get a clear and simple idea of the paper from the abstract. For example the second sentence is long and difficult to parse.

2.
At the top of page six in the results, how do the authors define "more likely to be functional"?

3.
You don't actually need to use the word "significant" anywhere. By stating a result you are implying it has reached a level of statistical confidence that makes it interesting , and significant gives the false impression it is definitely real.

4.
Some discussion points creep into the results. For example the phrase "making the likely target gene…. difficult to establish".

5.
In general fig 5 and table 1 don't help the reader much I feel. As mentioned, the adipose expression of most genes is correlated with BMI, and therefore many BMI traits.