Sex-specific Mendelian randomization study of genetically predicted insulin and cardiovascular events in the UK Biobank

Insulin drives growth and reproduction, which trade off against longevity. Genetically predicted insulin, i.e., insulin proxied by genetic variants, is positively associated with ischemic heart disease, but sex differences are unclear, despite different disease rates and reproductive strategies by sex. We used Mendelian randomization in 392,010 white British participants from the UK Biobank to assess the sex-specific role of genetically predicted insulin in myocardial infarction (MI) (14,442 cases, 77% men), angina (21,939 cases, 65% men) and heart failure (5537 cases, 71% men). Genetically predicted insulin was associated with MI (odds ratio (OR) 4.27 per pmol/L higher insulin, 95% confidence interval (CI) 1.60 to 11.3) and angina (OR 2.93, 95% CI 1.27 to 6.73) in men, but not in women (MI OR 0.80, 95% CI 0.23 to 2.84; angina OR 1.10, 95% CI 0.38 to 3.18). Patterns were similar for insulin resistance and heart failure. Mitigating the effects of insulin might address sex disparities in health.
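One common way to check whether two sex-specific estimates differ is a z-test on the difference in log odds ratios. The sketch below is an illustration only and is not stated to be the authors' method. It assumes that the male and female estimates are independent, that the reported intervals are symmetric 95% intervals on the log scale, and the function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate the standard error of a log odds ratio from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def sex_difference_z(or_men, ci_men, or_women, ci_women):
    """Two-sided z-test for a difference between two independent log odds ratios."""
    diff = math.log(or_men) - math.log(or_women)
    se_diff = math.sqrt(se_from_ci(*ci_men) ** 2 + se_from_ci(*ci_women) ** 2)
    z = diff / se_diff
    # Two-sided p-value under a standard normal reference distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# MI estimates as reported in the abstract (men vs. women)
z_mi, p_mi = sex_difference_z(4.27, (1.60, 11.3), 0.80, (0.23, 2.84))
print(f"z = {z_mi:.2f}, two-sided p = {p_mi:.3f}")
```

Because the intervals in both sexes are wide, a test like this has limited power, so a non-significant difference would not by itself rule out a sex-specific effect.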

Reviewer #2 (Remarks to the Author):
This is a study that makes wonderful use of the Mendelian Randomization method. I found that the study was convincing and that the authors did a fantastic job exploring genetic links between insulin and heart disease by sex using MR. In fact, having reviewed many MR papers in the past, this was one of the more convincing studies that I have reviewed. However, the limitations of the MR method, in particular violations of the exclusion restriction, make MR in general fraught with potentially violated assumptions and other biases. I would like to see the limitations section greatly expanded. Under each of the important limitations that the authors point out, how precisely might violations of assumptions or failure to adequately address limitations bias results? In general, I think this is needed in all MR work, but being concrete about important assumptions and biases builds a much more honest and straightforward scientific paper. I would also like to see the authors reference limitations throughout the paper as they describe results and hypotheses and how they will explore hypotheses, rather than waiting until the end. MR is not in any way a perfect solution to causality in genetics, and so it is important in improving the body of literature in this field that limitations are not simply hastily placed in the discussion of a paper. Other than this important methodological consideration, I believe the authors have done a fantastic job here and, once this issue is addressed, I would recommend publication of this paper.

Reviewer #3 (Remarks to the Author):
This is an article investigating the genetic association between predicted insulin/insulin resistance and myocardial infarction, angina, and heart failure. The authors used the Mendelian randomization method in the UK Biobank database. Genetically predicted insulin was associated with myocardial infarction in the overall participants and in the male subgroup. However, this association was not significant in the female subgroup. Regarding angina, predicted insulin had a significant association only in the male subgroup. This relationship was similar for BMI-adjusted insulin level and the insulin resistance genetic score. The authors confirm previously known associations of genetically predicted insulin level with myocardial infarction and angina. It is also stated that there is a sex-specific association, which is only prominent in men. The manuscript is overall well written, and the analysis has been done thoroughly by experienced investigators. However, I have the following questions and comments regarding this manuscript.
1. The association between genetically predicted insulin/insulin resistance is already reported in major journals. It is nice to see that this study is replicating previous results. However, it will be more interesting if the authors are able to provide novel insights into this relationship using one of the largest genetic association databases, the UK Biobank.
2. The sex specific effect is interesting. However, there is a large difference in sample size between men and women. Please comment on how this might have affected the results.
3. I wonder if diabetes patients are included in the analysis. In that case, is it possible that diabetes per se, and anti-diabetic medications might have affected the outcomes?
4. Similarly, is there a possibility of reverse causality?
5. I am curious why the authors selected reticulocyte count as one of the outcomes for the genetically predicted insulin level? What is the hypothesis underlying this investigation?
Below are some suggestions and comments:
1) The authors remove 5 SNVs due to observed genetic associations with other traits, and then perform MR analysis of the remaining 7 SNVs. In sensitivity analysis, the authors perform additional MR tests that account for pleiotropy on these 7 SNVs. The additional MR tests do not require prior removal of SNVs based on observed pleiotropy; therefore, it would be good to include all 12 SNVs in the additional MR tests as a sensitivity analysis.
Thank you very much for your comment. Please accept our apologies for being unclear. We did include a sensitivity analysis using all the 12 genetic variants for insulin (Supplemental Table 5 as shown below). We have amended the title to be clearer and more explicit. From: "Sensitivity analyses on the associations of genetically predicted insulin with myocardial infarction, angina and heart failure with potentially pleiotropic SNPs" To: "Sensitivity analyses showing the associations of genetically predicted insulin with myocardial infarction, angina and heart failure including all potentially pleiotropic SNPs".
2) It would be nice to show the actual genetic association results of the tested 7 SNVs in males and females separately, to see the SNV-specific effects.

Thank you very much for your comment. We have added the results in Supplemental Table 2. In results, paragraph 2, we added "7 SNPs were used (Supplemental Table 1 and Supplemental Table 2)".
3) Visualization of the association of genetically-predicted insulin and MI, angina and heart failure, can be shown with scatter plots of the effect size (with standard errors) of SNV on insulin vs. effect size of SNV on MI.
Thank you very much for your comment. We have added the scatter plot in Supplemental Figure 1, as follows:
Supplemental Figure 1. Scatter plot for genetically predicted insulin and myocardial infarction, angina and heart failure
In results, paragraph 5, we added "Genetically predicted insulin, BMI-adjusted insulin and insulin resistance score were all positively associated with MI overall (Table 1 and Supplemental Figure 1)".
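The slope in a scatter plot of SNV-outcome effects against SNV-exposure effects corresponds to an inverse-variance weighted (IVW) estimate when the line is forced through the origin. The following is a minimal sketch of a fixed-effect IVW calculation, not the authors' code. The function name and the three sets of summary statistics are hypothetical and are not taken from the manuscript.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted (IVW) MR estimate.

    Equivalent to a weighted regression of SNV-outcome effects on
    SNV-exposure effects through the origin, with weights 1 / se_outcome**2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error

# Hypothetical summary statistics for three variants (illustration only)
bx = [0.020, 0.030, 0.015]   # SNV effects on insulin
by = [0.010, 0.018, 0.006]   # SNV effects on the outcome (log odds)
se = [0.005, 0.006, 0.004]   # standard errors of the outcome effects

log_or, se_log_or = ivw_estimate(bx, by, se)
odds_ratio = math.exp(log_or)  # odds ratio per unit higher exposure
```

Running the same function on male-only and female-only outcome associations would give the sex-specific slopes that such a scatter plot displays.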
4) It would be interesting to show if there is an observational correlation between the serum insulin level with MI phenotype in UK biobank in males only, and females only. Although confounding and reverse causation are issues for observational analysis, this can provide support to do MR analysis.

Thank you very much for your helpful comment. It is a great idea and would be very interesting to compare the sex-specific associations in MR with those in conventional observational studies.
However, serum insulin is not currently available in the UK Biobank, so we cannot do this analysis at the moment. We really look forward to doing this analysis once the data is available in the future.

5) Please clarify which MR analyses used GWAS summary statistics and which MR tests used individual level genotypes and phenotypes.
Thank you very much for your comment. We have added clarification on the use of summary statistics and individual-level data. The revisions are as follows:
(Methods-Genetic associations with MI, angina and heart failure) "Genetic associations with MI, angina and heart failure were obtained using individual-level data in the UK Biobank (under the application #42468), with validation for MI using summary statistics from CARDIoGRAMplusC4D 1000 Genomes 1."
(Methods-Genetic associations with LDL-cholesterol and ApoB) "Genetic associations with LDL-cholesterol (as inverse normal transformed effect sizes), adjusted for age, age² and sex, were obtained from the Global Lipids Genetics Consortium Results summary statistics …"
(Methods-Genetic associations with blood pressure and reticulocyte count) "We obtained overall and sex-specific genetic associations with blood pressure and reticulocyte count using summary statistics from the UK Biobank, provided by Neale Lab (http://www.nealelab.is/uk-biobank/)..."
6) Following point 5, in those instances where you used two-sample MR based on GWAS summary statistics, please note that if GWAS summary statistics for both the exposure and outcome were obtained from one sample source, i.e., the UK Biobank only, then the causal estimate will be biased; see PMID 27625185. This should be discussed as a limitation in the discussion.
From: "In addition, the sample for genetic variants on insulin has no overlap with the UK Biobank. As such, any relationship of the genetic variants to unmeasured confounders is not expected to exist coincidently in the samples for insulin or insulin resistance and for the outcomes, due to the different data structures 2."
To: "In addition, the sample for genetic variants on insulin has no overlap with the UK Biobank. Two-sample MR is less biased than one-sample MR 3, because any relation of the genetic variants with unmeasured confounders is not expected to exist coincidently in both the sample providing genetic associations with insulin or insulin resistance and the sample providing genetic associations with the outcomes, due to the different data structures 2. If bias did occur due to weak instruments, it is often towards the null, whereas in one-sample MR the bias is towards the direction of the conventional observational studies 3."
7) The sample size for number of cases is much larger in men than in women. Can you perform power calculations to show specifically that the null results in women are not due to reduced statistical power?
Thank you very much for your comment. We have added power calculation in the methods and results as follows: In the methods, we added "Power calculations were conducted overall and by sex. MR studies require larger sample sizes than conventional observational studies, because the sample size needed for MR is the sample size for the conventional observational study divided by the variance in the exposure explained by the genetic predictors 4." In the results, we added "The replication for MI using a different study provides additional validation, and enabled us to test causality in a cost-efficient way 5."
8) It would be nice to obtain validation of the sex-specific results in another cohort. If this is not possible, it would be good to mention that additional replication in other cohorts is warranted to provide more support of this finding.
Thank you very much for your comment. It would be great to replicate in another cohort; however, we cannot find another cohort which can provide sex-specific genetic associations. We have added in the limitations "Validation of the sex-specific associations in another cohort is warranted." We have also added in the conclusions "Replication in other cohorts is needed."
Minor comments 9)

Reviewer #2 (Remarks to the Author):
This is a study that makes wonderful use of the Mendelian Randomization method. I found that the study was convincing and that the authors did a fantastic job exploring genetic links between insulin and heart disease by sex using MR. In fact, having reviewed many MR papers in the past, this was one of the more convincing studies that I have reviewed. However, the limitations of the MR method, in particular violations of the exclusion restriction, make MR in general fraught with potentially violated assumptions and other biases. I would like to see the limitations section greatly expanded. Under each of the important limitations that the authors point out, how precisely might violations of assumptions or failure to adequately address limitations bias results? In general, I think this is needed in all MR work, but being concrete about important assumptions and biases builds a much more honest and straightforward scientific paper. I would also like to see the authors reference limitations throughout the paper as they describe results and hypotheses and how they will explore hypotheses, rather than waiting until the end. MR is not in any way a perfect solution to causality in genetics, and so it is important in improving the body of literature in this field that limitations are not simply hastily placed in the discussion of a paper. Other than this important methodological consideration, I believe the authors have done a fantastic job here and, once this issue is addressed, I would recommend publication of this paper.
Thank you very much indeed for the positive comments. We have greatly expanded the limitations section to address the limitations in more detail and more precisely. As you suggested, we re-arranged the discussion and moved some of the limitations concerning power calculation to the methods and results. The re-arrangement is shown with track changes throughout the paper. The discussion has been expanded as follows: From: "First, MR is based on three assumptions, i.e., relevance, independence and exclusion restriction (no pleiotropy). We used genetic variants strongly associated with insulin and insulin resistance identified in large GWAS 6,7, as previously 8,9. We checked for associations with potential confounders, such as socioeconomic position and lifestyle in the UK Biobank.
In addition, the sample for genetic variants on insulin has no overlap with the UK Biobank. As such, any relationship of the genetic variants to unmeasured confounders is not expected to exist coincidently in the samples for insulin or insulin resistance and for the outcomes, due to the different data structures 2 …. To detect known potential pleiotropy we checked in three comprehensive curated databases."
To: "First, MR is based on three assumptions, i.e., the genetic variants are strongly related to the exposure, are not related to the exposure-outcome confounders, and the genetic variants are related to the outcomes only via influencing the exposure 10,11. To satisfy the first assumption, we used genetic variants strongly associated with insulin and insulin resistance identified in large GWAS 6,7, as previously 8,9. To satisfy the second assumption, we checked for associations with known exposure-outcome confounders, including socioeconomic position and lifestyle in the UK Biobank, where there was no association with these potential confounders. In addition, the sample for genetic variants on insulin has no overlap with the UK Biobank. Two-sample MR is less biased than one-sample MR 3, because any relation of the genetic variants with unmeasured confounders is not expected to exist coincidently in both the sample providing genetic associations with insulin or insulin resistance and the sample providing genetic associations with the outcomes, due to the different data structures 2. If bias did occur due to weak instruments, it is often towards the null, whereas in one-sample MR the bias is towards the direction of the conventional observational studies 3. …To test the assumption of pleiotropy, we checked for the known potential pleiotropy in three comprehensive curated databases."
From: "Fourth, our study could be affected by survivor bias (selection bias) 12, and by competing risk for specific causes of death that share risk factors."
To: "Fourth, our study could be affected by survivor bias (selection bias) 12, and by competing risk for specific causes of death that share risk factors. Specifically, the estimates for a potentially harmful exposure might be biased towards being less harmful if people with higher levels of exposures were already dead and not selected into the study, as in the obesity paradox 13."
Reviewer #3 (Remarks to the Author):
This is an article investigating the genetic association between predicted insulin/insulin resistance and myocardial infarction, angina, and heart failure. The authors used the Mendelian randomization method in the UK Biobank database. Genetically predicted insulin was associated with myocardial infarction in the overall participants and in the male subgroup. However, this association was not significant in the female subgroup. Regarding angina, predicted insulin had a significant association only in the male subgroup. This relationship was similar for BMI-adjusted insulin level and the insulin resistance genetic score. The authors confirm previously known associations of genetically predicted insulin level with myocardial infarction and angina. It is also stated that there is a sex-specific association, which is only prominent in men. The manuscript is overall well written, and the analysis has been done thoroughly by experienced investigators. However, I have the following questions and comments regarding this manuscript.
Thank you very much for your positive comment.
1. The association between genetically predicted insulin/insulin resistance is already reported in major journals. It is nice to see that this study is replicating previous results. However, it will be more interesting if the authors are able to provide novel insights to this relationship using the one of the largest genetic association databases of UK Biobank.
Thank you very much for your comment. Our study is consistent with a previous study on genetically predicted insulin and ischemic heart disease (IHD). Our study adds to the current evidence by showing the sex-specific associations of genetically predicted insulin and insulin resistance in subtypes of IHD, suggesting a sex-disparity in these associations. We have expanded the discussion as follows: (Discussion, paragraph 4) "Our study, together with previous evidence 14,15, suggests that insulin and insulin resistance have symbiotic roles that may both ultimately play a role in CVD. Our study adds to the current evidence by showing a sex-disparity in these associations."
2. The sex specific effect is interesting. However, there is a large difference in sample size between men and women. Please comment on how this might have affected the results.
Thank you very much for your comment. We agree there is a larger sample size in men than in women; however, the difference in sample size should only affect the precision of the estimates, rather than the magnitude of the point estimates or the direction of the associations. We have added power calculation in the methods and results as follows: In the methods, we added "Power calculations were conducted overall and by sex. MR studies require larger sample sizes than conventional observational studies, because the sample size needed for MR is the sample size for the conventional observational study divided by the variance in the exposure explained by the genetic predictors 4."
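The sample-size rule quoted above can be illustrated with a minimal sketch. The function name and the example numbers (an observational study of 1,000 participants and genetic predictors explaining 1% of the variance in the exposure) are hypothetical and are not taken from the manuscript.

```python
def mr_sample_size(n_observational, r2_exposure):
    """Approximate sample size needed for MR.

    Applies the rule of thumb quoted in the response: the sample size of the
    conventional observational study divided by the proportion of variance in
    the exposure explained by the genetic predictors (r2_exposure).
    """
    if not 0.0 < r2_exposure <= 1.0:
        raise ValueError("r2_exposure must be in the interval (0, 1]")
    return n_observational / r2_exposure

# Hypothetical example: 1% variance explained implies a 100-fold larger sample
n_needed = mr_sample_size(1000, 0.01)
```

Under this rule, a sex-stratified analysis with fewer cases in one sex needs the same inflation factor applied within each stratum, which is why precision can differ by sex even when the point estimates are unaffected.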
In the results, we added "The replication for MI using a different study provides additional validation, and enabled us to test causality in a cost-efficient way 5."
3. I wonder if diabetes patients are included in the analysis. In that case, is it possible that diabetes per se, and anti-diabetic medications might have affected the outcomes?
Thank you very much for your comment. We did not specifically exclude people with type 2 diabetes from the analysis. Diabetes or anti-diabetic medications might affect the outcomes but should not affect the genetic predictors for insulin or insulin resistance, so the associations of genetically predicted insulin or insulin resistance should not be confounded by diabetes or anti-diabetic medications. It is possible that adjusting for diabetes or anti-diabetic medications might improve the precision of the estimates. However, it is also possible that diabetes is a mediator of the association of insulin or insulin resistance with the outcomes, in which case accounting for diabetes (by adjustment or exclusion) would give the direct effect instead of the total effect and thereby introduce a bias. As such, we prefer not to adjust for or exclude by diabetes or diabetes medication status. We have expanded the discussion to explain this point as follows: "Seventh, some of the participants may have comorbidities such as type 2 diabetes and may be taking medications for these comorbidities. Comorbidities and their treatment may affect the cardiovascular outcomes, but should not affect the genetic predictors of exposures, so they are not confounders but their inclusion could improve the precision of the estimates. However, comorbidities could also be consequences of insulin and insulin resistance, so their consideration in the model would give the direct effects of insulin rather than the total effect sought, i.e., might create bias. As such, we did not account for comorbidities or their treatment by adjustment or restriction, so as to obtain unbiased, though possibly less precise, estimates."

4. Similarly, is there a possibility of reverse causality?
Thank you very much for your comment. Reverse causality, i.e., cardiovascular events leading to abnormal insulin or insulin resistance, is not a major concern in this study. People with cardiovascular events may change their lifestyle, which may be beneficial for lowering insulin resistance. However, this cannot explain the positive associations of insulin or insulin resistance with cardiovascular events in this study, because it cannot change the genetic predictors. Moreover, all SNPs are genome-wide significant for insulin or insulin resistance, and none of them is genome-wide significant for myocardial infarction (MI), angina or heart failure. We have expanded the discussion as follows: "Eighth, reverse causality may occur if people with cardiovascular events change their lifestyle thereby affecting insulin or insulin resistance. However, these changes would not affect genetically predicted insulin or insulin resistance. None of the genetic variants are genome-wide significant for cardiovascular events, so it is unlikely that they predict insulin or insulin resistance by affecting cardiovascular events."
5. I am curious why the authors selected reticulocyte count as one of the outcomes for the genetically predicted insulin level? What is the hypothesis underlying this investigation?
Thank you very much for your comments. Red blood cell traits have long been suspected to play a role in cardiovascular disease [16][17][18] , although it is not clear which specific trait is causal. The most recent evidence from an MR study published in Cell suggests reticulocytes are related to higher cardiovascular risk 19 , although more validation is needed. Based on the best evidence available, we used reticulocyte count as one of the outcomes. We have added further explanation on this point as follows: (Introduction-paragraph 3) From: "Here, we used MR to assess overall and sex-specific effects of insulin, and for completeness insulin resistance, on MI, angina, heart failure and their key risk factors (low density lipoprotein (LDL) cholesterol, apolipoprotein B (ApoB) 20 , blood pressure and reticulocyte count, a recently identified causal factor for CVD 19 ) using individual data in a large cohort, the UK Biobank 21 , or the largest available genome wide association study (GWAS)." To: "Here, we used MR to assess overall and sex-specific effects of insulin, and for completeness insulin resistance, on MI, angina, heart failure and their key risk factors (low density lipoprotein (LDL) cholesterol, apolipoprotein B (ApoB) 20 , and blood pressure) using individual data in a large cohort, the UK Biobank 21 , or the largest available genome wide association study (GWAS). Red blood cell attributes have long been suspected to be relevant to cardiovascular disease 18 , however, which trait matters is not well-established.