A genome-wide association analysis identifies 16 novel susceptibility loci for carpal tunnel syndrome

Carpal tunnel syndrome (CTS) is a common and disabling condition of the hand caused by entrapment of the median nerve at the level of the wrist. It is the commonest entrapment neuropathy, with estimates of prevalence ranging between 5–10%. Here, we undertake a genome-wide association study (GWAS) of an entrapment neuropathy, using 12,312 CTS cases and 389,344 controls identified in UK Biobank. We discover 16 susceptibility loci for CTS with p < 5 × 10⁻⁸. We identify likely causal genes in the pathogenesis of CTS, including ADAMTS17, ADAMTS10 and EFEMP1, and using RNA sequencing demonstrate expression of these genes in surgically resected tenosynovium from CTS patients. We perform Mendelian randomisation and demonstrate a causal relationship between short stature and higher risk of CTS. We suggest that variants within genes implicated in growth and extracellular matrix architecture contribute to the genetic predisposition to CTS by altering the environment through which the median nerve transits.


Summary:
The manuscript submitted by Wiberg and colleagues describes the first ever genome-wide association study (GWAS) of carpal tunnel syndrome (CTS; N = 12,106 cases / 387,347 controls). Variants at 13 loci reached genome-wide significance (P < 5 × 10⁻⁸); however, these associations were not followed up in an independent replication study. The authors attempted to identify the most likely causal gene(s) underpinning each association signal by identifying missense coding variants, mining RegulomeDB and performing an eQTL study using GTEx expression data from several tissues. Three candidate genes were identified [i.e. ADAMTS17 and ADAMTS10 (missense coding variants) and EFEMP1 (RegulomeDB and eQTL)], and a complementary RNA-seq study confirmed that all three genes were expressed in the tenosynovium of CTS patients.

General comment:
A brief search of the literature suggests that the genetic architecture of CTS has not been thoroughly investigated using genome-wide methodologies. Consequently, Wiberg and colleagues are presented with an exciting opportunity to thoroughly investigate the genetic architecture of CTS. For example, computationally efficient tools such as genomic-relatedness-based restricted maximum-likelihood (Yang et al 2011, American Journal of Human Genetics) and LD Score regression (Bulik-Sullivan et al 2015, Nature Genetics) can be easily implemented to estimate the proportion of CTS risk (i.e. SNP heritability) explained by all genotyped/imputed genetic markers, using unrelated individuals from the UK-Biobank Study or summary results statistics from their CTS GWAS. Furthermore, recent modifications to the LD Score Regression method enable one to partition the SNP heritability across functional genomic categories and tissue types, providing valuable insights into the genetic architecture and the molecular mechanisms underlying the regulation of gene expression relevant to CTS (Lui et al 2017, American Journal of Human Genetics). Moreover, LD Score regression, as implemented in LD-HUB (Zheng et al 2017, Bioinformatics), can also be used to establish whether the genetic architecture influencing CTS is shared with other traits and diseases (e.g. obesity, diabetes and rheumatoid arthritis). Collectively, findings from these analyses could provide valuable insights into the genetic underpinnings of CTS, and could be used to inform future studies and identify putative risk factors of CTS that have a shared genetic basis and that could in theory be targeted for future CTS intervention.
Unfortunately, the study described by Wiberg and colleagues does not address these fundamental questions. I also have several major concerns with regards to the methods used in the study (described below). For these reasons I do not feel that the study is suitable for publication in the premier journal Nature Communications.
Major concerns:
1. The inclusion of disease status (namely: diabetes, rheumatoid arthritis, hypothyroidism and obesity) as covariates in the GWAS is not explicitly justified. I find this practice questionable, as the inclusion of these heritable covariates could bias the effect estimates of variants that exert pleiotropic effects on CTS and these four disease phenotypes. Furthermore, the adjustment of heritable covariates may also induce spurious associations through collider bias (Aschard et al 2015, American Journal of Human Genetics). Similarly, the exclusion of subjects with peripheral neuropathy could also induce a form of selection bias. It is generally good practice to always make the results from the minimally adjusted model available.
2. Correction for genomic inflation factor as estimated by λGC is considered to be overly conservative and could bias some downstream analyses. Consider using LD score regression to quantify the proportion of inflation due to polygenicity versus confounding before correcting for genomic inflation (Bulik-Sullivan et al 2015, Nature Genetics).
3. The authors state that the large sample size of their study mitigates the lack of a replication sample. I disagree wholeheartedly. Large sample sizes cannot rule out chance findings, and are unlikely to guard against artifacts that occur as a result of uncontrolled biases specific to one, but not a second independent replication sample.
4. The strategies used to identify causal genes underpinning each association signal could be improved substantially. Consider using more up to date methods such as Summary based Mendelian Randomization (
5. The authors define a list of CTS genes for gene set enrichment analysis by selecting the closest gene to each sentinel CTS association. This approach could be improved by performing a genome-wide gene-based test of association (e.g. MAGMA: de Leeuw et al 2015 PLOS Computational Biology) and running gene set enrichment analysis on the list of associated genes that meet the appropriate genome-wide significance threshold. This can be easily performed using FUMA (Watanabe et al 2017, Nature Communications).

Reviewers' comments:
Reviewer #1 (Remarks to the Author): This paper reports a genome-wide association study for carpal tunnel syndrome (CTS), the commonest entrapment neuropathy. The authors undertook the genome-wide association study (GWAS), using 12,106 CTS cases and 387,347 controls from the UK Biobank. They discovered 13 novel genome-wide significant loci for CTS, and identified likely causal genes in these loci. They also adjusted the analysis for the presence of selected systemic conditions, such as diabetes, obesity, rheumatoid arthritis, and hypothyroidism (which were suggested to be among risk factors for CTS). Also, a finding that on average, CTS patients are shorter in height than controls, is interesting. Importantly, using RNA sequencing from the surgically resected tenosynovium from CTS surgeries, the authors demonstrated differential expression of some of their identified genes.
Expression quantitative trait loci (eQTL) were assessed for the candidate causal variants; however, not in the tissue most relevant to CTS. In general, since the etiology of CTS involves entrapment of the median nerve at the level of the wrist, the suggestion that some of the identified genes, implicated in growth and extracellular matrix architecture, contribute to the genetic predisposition to CTS makes sense. This is a straightforward GWAS, performed by an experienced group, with large numbers in the discovery cohort. The manuscript is clearly and consistently written. However, there are several outstanding issues:
METHODS/RESULTS: Participants with diagnostic codes for peripheral neuropathies other than CTS were excluded from both samples of cases and controls. This might be unwarranted, and these participants can be analyzed with the CTS for sensitivity's sake (see p. 10, the case of HNPP disorder and the PMP22 gene).

We have taken this point into consideration, and our main analysis has now been performed by not excluding participants with diagnostic codes for peripheral neuropathies.
Using the weighted genetic risk score (wGRS) calculated from the same individuals who were included in the GWAS and applying to the discovery dataset: It would need an independent cohort to validate whether this wGRS is predictive of the disease's severity.
We are grateful to the reviewer for highlighting this. Our wGRS is constructed from our GWAS summary results and applied to the same individuals, and therefore it is entirely to be expected that the cases will have a higher wGRS than the controls. As such, we do not make the circular claim that this in itself is a significant finding (we merely state the scores for cases vs controls in order to illustrate the absolute difference in wGRS between these two groups). So as to make this clearer, we have now prefaced the relevant sentence with, "As expected, wGRS in CTS cases… (page 12, line 2)".
What is interesting, and of potential relevance to the issue of lack of replication, however, is that this score is associated with CTS severity within our cohort. We would expect that CTS patients who have undergone an operation have a phenotypically more severe form of CTS compared to those CTS patients who have not. And we find that those who have undergone an operation for CTS have a significantly higher wGRS than the latter group.

This provides us with some degree of internal validation of (1) the accuracy of our phenotyping methodology, and (2) the fact that the GWAS hits on which the wGRS is based are not spurious. In the absence of an independent cohort in which to test our wGRS, we believe it is fair to illustrate the differences between these two 'sub-cohorts' to show that wGRS appears to associate with disease severity; we do not make any claims beyond that.
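The way a weighted genetic risk score is built from GWAS effect sizes can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `weighted_grs`, the effect sizes and the allele counts are all hypothetical, and the sketch assumes the conventional definition (the sum, over SNPs, of the per-allele effect size multiplied by the number of risk alleles carried).

```python
def weighted_grs(dosages, weights):
    """Weighted genetic risk score for one individual.

    dosages: risk-allele counts (0-2, possibly fractional if imputed), one per SNP.
    weights: per-allele effect sizes (e.g. log odds ratios) from the GWAS,
             in the same SNP order as `dosages`.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same SNPs")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical effect sizes for three SNPs (not values from the manuscript).
example_weights = [0.10, 0.05, 0.20]

# Hypothetical individuals: risk-allele counts at the same three SNPs.
person_a = [2, 1, 0]
person_b = [0, 1, 2]

score_a = weighted_grs(person_a, example_weights)  # 2*0.10 + 1*0.05 + 0*0.20
score_b = weighted_grs(person_b, example_weights)  # 0*0.10 + 1*0.05 + 2*0.20
```

Because the weights are estimated in the same individuals to whom the score is applied, a case/control difference in the score is expected by construction, which is the circularity the response acknowledges.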
The authors conditioned on the disease status of diabetes, rheumatoid arthritis, hypothyroidism and obesity. It is unclear why Mendelian Randomization was not attempted instead. Why did the conditional analysis not include height as a covariate?

Regarding the question of why the conditional analysis did not include height as a covariate, firstly, if this reviewer is suggesting that we should not condition on diseases associated with CTS such as diabetes and rheumatoid arthritis, we would think that the same should apply for height. Secondly, conditioning for height in a CTS GWAS is somewhat akin to conditioning on BMI in a diabetes GWAS, and would tend to attenuate the genetic architecture of CTS that has arisen from pathways involving height. We also note that conditioning on anthropometric traits can lead to bias, as illustrated in a recent article (https://www.ncbi.nlm.nih.gov/pubmed/29520038).

In order to disentangle the intriguing relationship between height and CTS, we have taken the reviewer's suggestion of performing a Mendelian randomisation analysis, and we find compelling evidence that height is causally implicated in the aetiology of CTS.
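The response does not state which Mendelian randomisation estimator was used. As an illustration only, one widely used choice is the fixed-effect inverse-variance weighted (IVW) estimator, sketched below. The function name `ivw_estimate` and all numbers are hypothetical, and the sketch ignores sensitivity analyses (pleiotropy-robust estimators, heterogeneity tests) that a real analysis would normally include.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted (IVW) MR estimate.

    Each list has one entry per genetic instrument (SNP):
      beta_exposure: SNP effect on the exposure (e.g. height)
      beta_outcome:  SNP effect on the outcome (e.g. log odds of CTS)
      se_outcome:    standard error of beta_outcome
    Returns (causal_estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per SNP")
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three instruments (not study data).
bx = [0.1, 0.2, 0.4]        # effects on the exposure
by = [-0.05, -0.10, -0.20]  # effects on the outcome
se = [0.01, 0.02, 0.02]     # standard errors of the outcome effects

estimate, std_error = ivw_estimate(bx, by, se)
```

In this toy input every instrument has the same outcome-to-exposure ratio, so the pooled estimate equals that common ratio; a negative value would correspond to greater height lowering risk. Testing the reverse direction would reuse the same function with instruments selected for the other trait.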
OTHER COMMENTS: The cohorts used in this study were mostly composed of European-descent populations; a need to expand to multi-ethnic samples should be noted.

We have now mentioned the need to extend the study to non-European ancestry populations in the Discussion (page 20, line 23).
Clinical application of the knowledge this paper generated is not obvious.

We do not feel that our manuscript is particularly lacking in clinical applications compared to other GWAS papers, especially considering that this is the first ever GWAS performed in this disease. We illustrate that a genetic risk score correlates with CTS disease severity, and we provide new evidence of a causal relationship of height with CTS.

Furthermore, we also suggest miR-338-5p, which has been shown to inhibit the synthesis of EFEMP1 (a key gene implicated in our GWAS), to be a potential therapeutic target (page 18, line 13).
Finally, this new GWAS has uncovered three additional loci, and we have now implicated three genes in the TGF-β/Smad signaling pathway, a well-studied pathway in several diseases. We have dedicated a paragraph to discussing the relevance of these findings, and explain that the efficacy of corticosteroid injections into the carpal tunnel (a commonly performed procedure) could potentially be mediated through modulation of this pathway.

25, line 11).
It was shown that mutations in the gene SH3TC2, associated with Charcot-Marie-Tooth, confer susceptibility to neuropathy, including CTS. Did the authors try to test whether this gene is enriched in their analysis? (The same in regards to HNPP-associated PMP22 gene).
Similarly, there was a GWAS for Dupuytren Disease, which is a related fibrosis and might share etiology with CTS (Ng et al. Am J Hum Genet. 2017). The authors should have a look at the genetic correlation between these conditions.

We next examined SNPs that were genome-wide significant in the CTS GWAS and in the CPASSOC results, and found that none of these were remotely suggestive of association in the Dupuytren's GWAS (the minimum p-value was 0.0088 for rs4356642). Finally, we examined the set of SNPs that were genome-wide significant in the Dupuytren's GWAS and in the CPASSOC results: none of the SNPs were suggestive of significance at p < 1 × 10⁻⁵ in the CTS GWAS; three SNPs had very nominally "suggestive" p-values (ranging from p = 7.4 × 10⁻⁵ to p = 9.8 × 10⁻⁵). These three SNPs reside at the WNT7B locus on chromosome 22 that we reported in the Dupuytren's paper. However, given the very lax definition of a "suggestive" p-value for these SNPs, we are strongly inclined to disregard them as evidence of enrichment across the two phenotypes. We therefore conclude that there is currently no evidence of a shared genetic architecture between CTS and Dupuytren's Disease. We do not feel that this negative finding warrants reporting in this manuscript, but would be happy to include it as supplementary data should you or the editor feel it necessary.
Table 1 does not include "obesity".

We no longer condition on "obesity" in our GWAS and therefore do not provide this information.
Supplementary Figure 2: the set ranked immediately after "body height" among the top enriched sets of SNPs was "membranous glomerulonephritis"; "kidney disease" was also enriched. Do the authors have any comments/speculations why? (On the other hand, "rheumatic disease" and skeletal/bone conditions did not make it to the top; any idea why?)

We have used the SNP-based enrichment methods proposed by Reviewer #3 (FUMA); as such, we no longer provide the results of SNP-based enrichment using this method. As a general point, we are inherently sceptical of the value of the various computational tools available to map GWAS summary statistics onto disease ontologies in the context of CTS. The ontologies that exist in these computational tools are invariably subject to publication bias; whereas numerous GWAS have been conducted in e.g. autoimmune, cardiovascular and neuro-psychiatric diseases, our study is the first to investigate an entrapment neuropathy through genome-wide association. Therefore, we feel that many of the disease (and other) ontologies that are mapped onto our GWAS summary statistics must be interpreted with caution.
Reviewer #2 (Remarks to the Author): This is an extremely well written GWAS manuscript that reports several novel genome-wide significant signals at/close to compelling genes. The authors have a large CTS case cohort and a considerably larger control cohort, all derived from the UK Biobank.
All of the appropriate QC and analysis checks have been undertaken, whilst functional follow up has included public in-silico data as well as specific RNAseq data generated by the authors. The interpretation of the results is thoughtful and reasonable and is not at all exaggerated. There is one major issue that the authors freely acknowledge and do defend: the lack of replication. Replication is a mainstay of GWAS and its absence here means that some of the significant signals will represent type I error. It's not possible to predict how many.
Considering the mitigation provided and the current lack of genetic data for the disease and the potential insight provided by the signals, I'm minded to overlook this weakness.
I have no suggested changes to make regarding manuscript structure, presentation or clarity.

Moreover, LD Score regression, as implemented in LD-HUB (Zheng et al 2017, Bioinformatics), can also be used to establish whether the genetic architecture influencing CTS is shared with other traits and diseases (e.g. obesity, diabetes and rheumatoid arthritis). Collectively, findings from these analyses could provide valuable insights into the genetic underpinnings of CTS, and could be used to inform future studies and identify putative risk factors of CTS that have a shared genetic basis and that could in theory be targeted for future CTS intervention.

We have also used LD Score Regression to investigate genetic correlations between CTS and the traits mentioned above. We found statistically significant correlations with anthropometric measures such as BMI and height, whereas other traits such as type 2 diabetes and rheumatoid arthritis did not meet the Bonferroni-corrected significance threshold (Supplementary Table 7).
Unfortunately, the study described by Wiberg and colleagues does not address these fundamental questions. I also have several major concerns with regards to the methods used in the study (described below). For these reasons I do not feel that the study is suitable for publication in the premier journal Nature Communications.
Major concerns:
1. The inclusion of disease status (namely: diabetes, rheumatoid arthritis, hypothyroidism and obesity) as covariates in the GWAS is not explicitly justified. I find this practice questionable, as the inclusion of these heritable covariates could bias the effect estimates of variants that exert pleiotropic effects on CTS and these four disease phenotypes. Furthermore, the adjustment of heritable covariates may also induce spurious associations through collider bias (Aschard et al 2015, American Journal of Human Genetics). Similarly, the exclusion of subjects with peripheral neuropathy could also induce a form of selection bias. It is generally good practice to always make the results from the minimally adjusted model available.

As mentioned above in a reply to Reviewer #1, we have addressed this issue by (1) not excluding participants with a peripheral neuropathy diagnosis, and (2) not adjusting for the disease covariates. Thus, our GWAS now only conditions on GWAS platform and sex.
2. Correction for genomic inflation factor as estimated by λGC is considered to be overly conservative and could bias some downstream analyses. Consider using LD score regression to quantify the proportion of inflation due to polygenicity versus confounding before correcting for genomic inflation (Bulik-Sullivan et al 2015, Nature Genetics).
We have now calculated the LD score regression intercept, with an associated attenuation ratio (intercept = 1.015; attenuation ratio = 0.073), consistent with polygenicity and the large sample size.
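The relationship between the two quantities quoted above can be made explicit. The sketch below assumes the standard LD score regression definition of the attenuation ratio, (intercept − 1) / (mean χ² − 1); the response does not report the mean χ² statistic, so the value computed here is only what the two quoted (rounded) numbers imply under that assumption, not a reported result. The function names are hypothetical.

```python
def attenuation_ratio(intercept, mean_chi2):
    """Attenuation ratio: (intercept - 1) / (mean chi-square - 1)."""
    if mean_chi2 <= 1.0:
        raise ValueError("ratio is undefined when mean chi-square <= 1")
    return (intercept - 1.0) / (mean_chi2 - 1.0)


def implied_mean_chi2(intercept, ratio):
    """Invert the definition to recover the mean chi-square statistic."""
    return 1.0 + (intercept - 1.0) / ratio


# Values quoted in the response.
INTERCEPT = 1.015
RATIO = 0.073

# Mean chi-square implied by the two quoted values (an inference, not reported).
mean_chi2 = implied_mean_chi2(INTERCEPT, RATIO)
```

A ratio near zero indicates that little of the inflation in test statistics is attributable to confounding rather than polygenic signal, which is the reading the authors give.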
3. The authors state that the large sample size of their study mitigates the lack of a replication sample. I disagree wholeheartedly. Large sample sizes cannot rule out chance findings, and are unlikely to guard against artifacts that occur as a result of uncontrolled biases specific to one, but not a second independent replication sample.

We have used 25 genes mapped to our loci by FUMA and 17 genes mapped by MAGMA gene-based association, rather than the genes in closest proximity.

Summary-based Mendelian Randomisation (SMR) requires eQTL data, of which there are publicly available summary data for various tissue types. These include the 53 GTEx tissue types, and various eQTLs in whole blood. One obstacle that we come across time and again when performing these types of computational analyses on our CTS data is that CTS (and other entrapment neuropathies) have been hitherto relatively underinvestigated. As such, the tissue-specific data for these conditions simply do not exist.

The findings from our study strongly suggest that the tissues of interest (where CTS-predisposing risk genes are principally likely to act) are: (1) the tenosynovium within the carpal tunnel, and (2) bone (causing altered growth and skeletal proportions of the upper limb). It is notable that there are no GTEx eQTLs for synovium or bone. We have, as a result, resorted to using the GTEx eQTL data for "transformed fibroblasts", which are a constituent of tenosynovium; however, it goes without saying that transformed fibroblasts will be phenotypically different from the fibroblasts that are found in carpal tunnel tenosynovium in vivo.

5:1) and the surprisingly young age at diagnosis suggest that the EGCUT cohort may include a significant number of CTS cases during pregnancy, a common and usually transient phenomenon due to fluid retention in pregnancy, which is quite distinct in terms of pathophysiology from idiopathic CTS. As such, the genetic architecture of CTS during pregnancy is likely different to that in the general population, which could account for these differences. Moreover, there may be further misclassification based on over-reliance on self-reported questionnaire data for diagnosis.

Lack of power:
Our other major concern regarding the EGCUT data pertains to lack of statistical power.

We present these data to you in the interests of full transparency, and kindly ask that you exercise your editorial judgment in deciding whether our attempt at replication ought to be included in our manuscript (with caveats and limitations). We thank you again for

Reviewer #1 (Remarks to the Author):
The authors have provided appropriate responses to all the concerns this reviewer raised. There is one outstanding question though: Previous comment by reviewer 1: "Participants with diagnostic codes for peripheral neuropathies other than CTS were excluded …" (See also Rev. 2) The authors have addressed this issue by (1) not excluding participants with a peripheral neuropathy diagnosis, and (2) not adjusting for other covariates. Thus, it is expected that Table 1 should be changed a bit, which has not happened. Please comment.

Reviewer #2 (Remarks to the Author):
Thank you for the responses to reviewer's comments. These were detailed and thoughtful.
Reviewer #3 (Remarks to the Author):
Reviewer #3, residual concerns:
1.Author:
As mentioned above in a reply to Reviewer #1, we have addressed this issue by (1) not excluding participants with a peripheral neuropathy diagnosis, and (2) not adjusting for the disease covariates. Thus, our GWAS now only conditions on GWAS platform and sex.

Reviewer:
Could the authors please comment on why "year of birth" was included as a covariate in the regression model for replication, but not for discovery?

2.Author:
In order to disentangle the intriguing relationship between height and CTS, we have taken the reviewer's suggestion of performing a Mendelian randomisation analysis, and we find compelling evidence that height is causally implicated in the aetiology of CTS.

Reviewer:
Could the authors perform the bidirectional MR analysis, to rule out a possible causal effect of CTS on height?

3.Author:
Summary-based Mendelian Randomisation (SMR) requires eQTL data, of which there are publicly available summary data for various tissue types. These include the 53 GTEx tissue types, and various eQTLs in whole blood. One obstacle that we come across time and again when performing these types of computational analyses on our CTS data is that CTS (and other entrapment neuropathies) have been hitherto relatively underinvestigated. As such, the tissue-specific data for these conditions simply do not exist. The findings from our study strongly suggest that the tissues of interest (where CTS-predisposing risk genes are principally likely to act) are: (1) the tenosynovium within the carpal tunnel, and (2) bone (causing altered growth and skeletal proportions of the upper limb). It is notable that there are no GTEx eQTLs for synovium or bone. We have, as a result, resorted to using the GTEx eQTL data for "transformed fibroblasts", which are a constituent of tenosynovium; however, it goes without saying that transformed fibroblasts will be phenotypically different from the fibroblasts that are found in carpal tunnel tenosynovium in vivo.

Reviewer:
The authors investigate transformed fibroblasts with SMR, despite the gene-property analysis strongly implicating the tibial artery and coronary artery as tissues of interest (Suppl Figure 4). Although the link between these tissue types and CTS is not intuitive or immediately clear, it may be worthwhile to use SMR to potentially identify the plausibly causal gene(s) that contribute to these associations as they could implicate tangible disease pathways.

4.Author:
Replication in Estonian Cohort -Further to your decision regarding this manuscript on 5th of June 2018 and the issue of replication that was raised, we have made substantial efforts to try to replicate this GWAS by reaching out to several international biobanks. The vast majority of these potential collaborators either did not collect CTS data, or had very small cohorts (of no more than a few hundred cases). However, the Estonian Institute of Genomics at the University of Tartu (EGCUT) had a reasonably-sized cohort of CTS cases, and we therefore initiated a collaboration with Dr Reedik Mägi and his team to establish whether we could replicate our GWAS in EGCUT.

Reviewer:
In the interest of transparency, I feel strongly that the replication attempt be included in the manuscript together with the caveats and limitations. Furthermore, I have a few comments in respect to the replication effort:
1. The UK-Biobank Study does not appear to be representative of the UK population as discussed in the following paper (PMID:29040562). I would caution against relying on population based estimates of disease prevalence as they appear to suffer from ascertainment bias.
2. Please provide descriptive summary statistics relating to key variables from both cohorts (e.g. height, weight, age etc) so that the comparability of the discovery and replication cohorts can be evaluated.
3. Please perform LD score regression to evaluate the genetic similarity between CTS as defined in Biobank to that defined in EGCUT using GWAS summary statistics. This may help to evaluate the comparability of disease definitions, assuming the replication sample is sufficiently powered.
4. To complement point 3, it would be beneficial to create a genetic risk score in the EGCUT study (using the 16 GWAS associated CTS SNPs and their weights) and test for an association with CTS in EGCUT.
5. Arguably, the Bonferroni-corrected threshold of association may be too conservative, given that the attempt at replication is informed by the discovery GWAS. Could the authors supplement the table with power estimates using an alpha of 0.05, correcting for winner's curse if possible?
6. Since the release of the UK-Biobank Study, it is increasingly difficult to identify replication cohorts of sufficient sample size for robust replication. To partially address this issue, Yengo and colleagues (PMID:30124842) quantify replicability of their GWAS associations "jointly" by estimating the regression slope of SNP effect sizes estimated in the replication sample onto the SNP effect sizes (corrected for winner's curse effects) from their discovery. Perhaps the authors could attempt to evaluate their findings using this method, keeping in mind that it may not perform as well given that the current study identified 16 SNPs and not thousands as per the cited GIANT study.
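The effect of the significance threshold on replication power, raised in the point about Bonferroni correction, can be sketched with a normal approximation. This is an illustration under stated assumptions, not the analysis requested of the authors: the function name `power_two_sided`, the effect size and the standard error are hypothetical, the calculation treats the discovery effect as the true effect, and it applies no winner's curse correction.

```python
from statistics import NormalDist


def power_two_sided(effect, se, alpha):
    """Approximate power of a two-sided z-test for a true effect `effect`
    estimated with standard error `se`, at significance level `alpha`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    ncp = abs(effect) / se  # expected z-score under the alternative
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


N_SNPS = 16  # number of CTS-associated SNPs taken forward to replication

# Hypothetical replication-cohort values: log odds ratio and its standard error.
example_effect = 0.10
example_se = 0.04

power_nominal = power_two_sided(example_effect, example_se, alpha=0.05)
power_bonferroni = power_two_sided(example_effect, example_se, alpha=0.05 / N_SNPS)
```

For any non-zero effect the nominal threshold yields higher power than the Bonferroni-corrected one. Because discovery effect sizes tend to be overestimated, power computed without a winner's curse correction should be read as optimistic.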
The authors have provided appropriate responses to all the concerns this reviewer raised. There is one outstanding question though: Previous comment by reviewer 1: "Participants with diagnostic codes for peripheral neuropathies other than CTS were excluded …" (See also Rev. 2) The authors have addressed this issue by (1) not excluding participants with a peripheral neuropathy diagnosis, and (2) not adjusting for other covariates. Thus, it is expected that Table 1 should be changed a bit, which has not happened. Please comment.

If the reviewer is referring to Supplementary Table 1, we removed the diagnostic codes for all other diseases apart from CTS between our first and second submissions. The number of individuals with each of the diagnostic codes remained the same between our two previous manuscripts, as this table listed the numbers of individuals prior to QC. We realise that this is potentially confusing, so we have now modified the table to only display the number of individuals with each diagnostic code following sample QC.
Reviewer #2 (Remarks to the Author): Thank you for the responses to reviewer's comments. These were detailed and thoughtful.

We thank the reviewer for their comments and their re-review of our manuscript.
Reviewer #3 (Remarks to the Author):
Reviewer #3, residual concerns:
1.Author:
As mentioned above in a reply to Reviewer #1, we have addressed this issue by (1) not excluding participants with a peripheral neuropathy diagnosis, and (2) not adjusting for the disease covariates. Thus, our GWAS now only conditions on GWAS platform and sex.

Reviewer:
Could the authors please comment on why "year of birth" was included as a covariate in the regression model for replication, but not for discovery?

The Supplementary Materials now contain an entire section dedicated to our attempts at replication in the Estonian cohort. One of the tables shows the demographics of the UK Biobank vs EGCUT cohorts. The age difference in UK Biobank between cases and controls was a mere 2 years, and we therefore did not condition on age, especially in light of the issues that were raised by both reviewers 1 and 3 previously regarding conditioning on too many variables in our original discovery GWAS.
In contrast, in EGCUT, the age difference between cases and controls was 6 years in females and 7 years in males, hence the inclusion of age as a covariate in the replication GWAS. The EGCUT dataset is considerably different to the UK Biobank dataset, and our two groups have consequently developed different GWAS pipelines to reflect this.

2.Author:
In order to disentangle the intriguing relationship between height and CTS, we have taken the reviewer's suggestion of performing a Mendelian randomisation analysis, and we find compelling evidence that height is causally implicated in the aetiology of CTS.

Reviewer:
Could the authors perform the bidirectional MR analysis, to rule out a possible causal effect of CTS on height?

3.Author:
Summary-based Mendelian Randomisation (SMR) requires eQTL data, of which there are publicly available summary data for various tissue types. These include the 53 GTEx tissue types, and various eQTLs in whole blood. One obstacle that we come across time and again when performing these types of computational analyses on our CTS data is that CTS (and other entrapment neuropathies) have been hitherto relatively underinvestigated. As such, the tissue-specific data for these conditions simply do not exist. The findings from our study strongly suggest that the tissues of interest (where CTS-predisposing risk genes are principally likely to act) are: (1) the tenosynovium within the carpal tunnel, and (2) bone (causing altered growth and skeletal proportions of the upper limb). It is notable that there are no GTEx eQTLs for synovium or bone. We have, as a result, resorted to using the GTEx eQTL data for "transformed fibroblasts", which are a constituent of tenosynovium; however, it goes without saying that transformed fibroblasts will be phenotypically different from the fibroblasts that are found in carpal tunnel tenosynovium in vivo.

Reviewer:
The authors investigate transformed fibroblasts with SMR, despite the gene-property analysis strongly implicating the tibial artery and coronary artery as tissues of interest (Suppl Figure 4). Although the link between these tissue types and CTS is not intuitive or immediately clear, it may be worthwhile to use SMR to potentially identify the plausibly causal gene(s) that contribute to these associations as they could implicate tangible disease pathways.

As the reviewer acknowledges, the link between coronary artery, tibial artery, and carpal tunnel is highly tenuous from a physiological point of view. As we have explained, any association is likely caused by publication bias inherent in the databases selected for analysis. As such, we think that any analysis performed using these tissue types is likely to lead to false conclusions that would distract from the manuscript.
We had actually previously performed SMR for tibial artery and coronary artery, and did not find any additional genes implicated, and therefore did not include this in the manuscript.

4.Author:
Replication in Estonian Cohort -Further to your decision regarding this manuscript on 5th of June 2018 and the issue of replication that was raised, we have made substantial efforts to try to replicate this GWAS by reaching out to several international biobanks. The vast majority of these potential collaborators either did not collect CTS data, or had very small cohorts (of no more than a few hundred cases). However, the Estonian Institute of Genomics at the University of Tartu (EGCUT) had a reasonably-sized cohort of CTS cases, and we therefore initiated a collaboration with Dr Reedik Mägi and his team to establish whether we could replicate our GWAS in EGCUT.

Reviewer:
In the interest of transparency, I feel strongly that the replication attempt be included in the manuscript together with the caveats and limitations. Furthermore, I have a few comments in respect to the replication effort:

As we stated in our reply, we are willing to publish these data, and the editor has suggested that they be included in the supplementary materials.
1. The UK-Biobank Study does not appear to be representative of the UK population as discussed in the following paper (PMID:29040562). I would caution against relying on population based estimates of disease prevalence as they appear to suffer from ascertainment bias.
2. Please provide descriptive summary statistics relating to key variables from both cohorts (e.g. height, weight, age etc) so that the comparability of the discovery and replication cohorts can be evaluated.