Interethnic analyses of blood pressure loci in populations of East Asian and European descent

Blood pressure (BP) is a major risk factor for cardiovascular disease and more than 200 genetic loci associated with BP are known. Here, we perform a multi-stage genome-wide association study for BP (max N = 289,038) principally in East Asians and meta-analysis in East Asians and Europeans. We report 19 new genetic loci and ancestry-specific BP variants, conforming to a common ancestry-specific variant association model. At 10 unique loci, distinct non-rare ancestry-specific variants colocalize within the same linkage disequilibrium block despite the significantly discordant effects for the proxy shared variants between the ethnic groups. The genome-wide transethnic correlation of causal-variant effect-sizes is 0.898 and 0.851 for systolic and diastolic BP, respectively. Some of the ancestry-specific association signals are also influenced by a selective sweep. Our results provide new evidence for the role of common ancestry-specific variants and natural selection in ethnic differences in complex traits such as BP.


1.
Can the authors indicate which imputation panel was used for the discovery analyses? This is not stated in the text. I note the information is in Suppl. Table 2.

2.
Sentinel variants in a combined meta-analysis of discovery and replication samples reaching P < 5×10⁻⁸ are reported as novel loci in this study; can this definition be stated in the first paragraph of the results so the study design is totally clear in the text alongside the supplementary figure?

3.
ST6 presents the results from eQTL analyses, but no text is provided on the exact method, nor any indication of whether the BP SNP coincides with the top eQTL for the gene indicated. From a brief look at the results I do not think many of the BP SNPs are in high LD (r2>0.8) with the top eSNP for some of the genes listed. The eQTL results should be carefully reviewed for presentation.

4.
The section entitled "Genetic correlation and power of GWAS" requires a better description of the motivation for this analysis with this dataset. The comment on estimating the SNP-based heritability of BP with other CV risk factors comes a little out of left field after the prior discussion of GWAS results. I was not able to follow in the written text the work you had done relating to the different ethnic groups.

5.
The power of GWAS for different sample sizes across ancestries again requires some motivation text for this analysis; these results are not so well integrated following on from describing the results from a GWAS and inter-ethnic heterogeneity.

6.
Note that the FTO association with BP has recently been reported in a genome-wide association meta-analysis incorporating gene-smoking interactions for BP-associated loci; this result should be commented upon and the reference included.

Reviewer #2 (Remarks to the Author):
Takeuchi et al. describe a blood pressure discovery GWAS using large Asian and European studies (130,777 Asians in discovery and 289,038 Europeans in replication) and the authors claim 19 novel BP loci. Additionally the authors find genetic heterogeneity between East Asians and Europeans, describe causal variant effect size correlations between Asians and Europeans, and provide an explanation for some of the ancestry-specific variants observed.
The subject is interesting, the text is well written, and the analysis approaches are valid. The conclusions are partly valid.

Strengths:
1) Detailed genetic analysis using a very large collection of Asian samples.
2) Data on the differences of the genetic origins of BP between Asians and Europeans.

Weaknesses:
1) This reviewer does not believe that the proof for the novelty of the 19 BP loci is sufficient. The cutoff for follow-up was very lenient (P < 1×10⁻⁵) and a large number of SNPs was taken forward. I would like to see how many SNPs replicate when correcting for the number of tests in replication. The text should inform the reader on the main steps of the analysis, without having to read the supplement. Main QC results such as GC and GC correction steps should be in the main text. The QQ plots should be presented without and with subtraction of the findings from the main SNPs.
2) The work on inter-ethnic heterogeneity, heritability, and ancestry-specific work has to make clearer (in the main text) which SNP-set was used for these analyses. What I would like to know is how many of the current BP SNPs have significant heterogeneity; this seems partially addressed in the section on ancestry-specific variants, but not for the other sections, and the number of previously identified BP SNPs is not transparently explained (e.g. with a supplementary table). The sections on the inter-ethnic work may be summarized with a single quantitative statement that should also be included in the abstract.
1. This paper initially describes a GWAS for BP phenotypes, expanding on this group's earlier work on BP discovery, and reports 19 new BP loci. The authors then go on to describe inter-ethnic heterogeneity of the signals at some BP loci (there are some novel observations here, although one of the genome-wide significant loci was the topic of some of this group's prior work). Up until this point the paper is easy to follow and the findings are easy to interpret, although the section on ancestry-specific SNP loci should follow next.
>> We have placed the section on 'Ancestry-specific SNP loci' at the back of the section on 'Interethnic heterogeneity of GWAS results' (Page 10) as the reviewer recommended.
2. The analyses to delve into ancestry-specific loci for BP and other traits are not so well described or easy to follow. The authors need to define their motivation for pursuing the additional analyses and across traits and make their observations more succinct, including the power analyses.
>> We have revised the part of 'Genetic correlation and power of GWAS' as follows. First, we have described our motivation for pursuing the additional analyses and across traits: (1) "In the present study, the availability of genome-wide association data from >100,000 individuals for both East Asians and Europeans separately motivated us to perform additional analyses of systematic, genome-wide interethnic comparison" (in the second paragraph, Page 11); (2) "To estimate the degree of interethnic overlap and non-overlap of blood pressure loci" and "Similar to Europeans, the recent progresses of GWAS in East Asians motivated us to investigate different sample sizes in preparation for much-larger trans-ancestry meta-analysis" (in the third paragraph, Page 11); and (3) "We extended the interethnic analyses to other complex traits such as plasma lipid level, anthropometric measurement and type 2 diabetes using published GWAS summary statistics of relatively large number of samples (Supplementary Table 14)" (in the fourth paragraph, Page 11). Also, we have summarized our observations in the power analysis in the additional sentence, "As the sample sizes in both ethnic groups become larger, we can expect a higher proportion of interethnic overlap; nevertheless, more than or nearly half of the genome-wide significant loci may not overlap between the ethnic groups for GWAS of the same sample size" (in the first paragraph, Page 12).
3. At the moment the paper is a bit of a mix of topics, it may work better to describe all the work on BP then compare to results from other traits and then make the statements on common-ancestry specific association model.
>> We have revised the corresponding results section such that we describe all the work on BP in the first place (from Page 7 to the third paragraph in Page 11), followed by comparison with results from other traits (from the fourth paragraph in Page 11) and then proceed with discussion of "a common ancestry-specific variant association model" plus selective sweep at ancestry-specific loci (Pages 12 & 13) in the Results section, as suggested by reviewer 1.

Minor comments
1. Can the authors indicate which imputation panel was used for the discovery analyses, this is not stated in the text. I note the information is in Suppl. Table 2.
5. The power of GWAS for different sample sizes across ancestries again requires some motivation text for this analysis -these results are not so well integrated following on from describing the results from a GWAS and inter-ethnic heterogeneity.
>> We have revised the part of 'Genetic correlation and power of GWAS' accordingly; that is, "Similar to Europeans, the recent progresses of GWAS in East Asians motivated us to investigate different sample sizes in preparation for much-larger transethnic meta-analysis" (in the third paragraph, Page 11). Please see our responses to reviewer 1's comment #2.
6. To note the FTO association with BP has recently been reported in a genome-wide association meta-analysis incorporating gene-smoking interactions for BP associated loci, this result should be commented upon and the reference included.
>> We have revised the corresponding part in the Results (the first paragraph in Page 9) and included the reference (#13, Sung YJ et al.).

Responses to reviewer 2's comments
Weaknesses: 1-1. This reviewer does not believe that the proof for the novelty of the 19 BP loci is sufficient.
The cutoff for follow-up was very lenient (P < 1×10⁻⁵) and a large number of SNPs was taken forward. I would like to see how many SNPs replicate when correcting for the number of tests in replication.
>> SNPs attaining P < 1×10⁻⁵ in stage 1 comprised 281 loci, of which 172 loci were previously unreported. Bonferroni's correction for 172 loci yields a significance level of 0.05/172 = 0.00029. Here, two loci (rs3853476 and rs10821808) identified in our meta-analysis combining two ethnic groups may be excluded because they showed significant (P < 0.01) BP association in the additional European replication stage alone. Among the 17 (19 minus 2) loci, 5 loci satisfy this threshold in the East Asian replication stage (stage 2), i.e., one-sided P < 0.00029, imposing concordant direction. However, we consider the other 12 loci to show statistically significant BP association as well, for the following reason.
For two-staged GWAS, there are generally two types of study design, namely, replication-based analysis and joint analysis [Nat Genet (2006) 38:209]. In the replication-based analysis, as described by the reviewer, evidence of replication is mainly concentrated on stage 2 results, using a significance level of 0.05/(the number of markers tested in stage 2). On the other hand, in the joint analysis, genome-wide significance is assessed by applying P < 5×10⁻⁸ to the combined results for both stages after meta-analysis.
In the above-cited article, joint analysis is recommended as it almost always results in increased power compared to replication-based analysis for the same type I error rate. We have thus adopted the joint analysis design in the current study. Several joint BP meta-analyses have examined whether the SNPs identified via GWAS can retain genome-wide significance (e.g., P < 5×10⁻⁸) after combining the additional stage samples with the discovery stage samples (e.g., Nat Genet 2016, ng.3667; Nature 2011, nature10405). There are also BP GWAS meta-analyses that have adopted an approach combining the two types of study design. For example, P < 5×10⁻⁸ in combined meta-analysis plus P < 0.05 or P < 0.01 in replication data with the same direction of effect were adopted (e.g., Nat Genet 2016, ng.3654; Nat Genet 2017, ng.3768). When we apply these combined criteria (P < 5×10⁻⁸ in combined meta-analysis plus P < 0.05 with the same direction of effect) to the 12 loci in our results, all but one locus (rs66658258) satisfy them. For rs66658258, although the strength of association in the East Asian replication stage is borderline significant (P=0.056) for MAP, it is nominally significant (P=0.034) for DBP, and P < 5×10⁻⁸ is attained in the combined samples for both BP traits (P=4.6×10⁻⁹ and 5.0×10⁻⁹ for MAP and DBP, respectively) (as shown in ST3 and ST5), satisfying the combined criteria.
Nevertheless, understanding the reviewer's concern, we have further looked up European GWAS results for the novel loci as shown in ST4 (a new table). The European GWAS results were available at 6 of the 17 loci that were identified via East Asian meta-analysis, as not all the corresponding sentinel SNPs are included in the HapMap SNPs with which the European ICBP data were imputed. In all cases, the direction of effect is consistent between the ethnic groups, and nominally significant association (P < 0.05) is detectable in Europeans for half (3) of the loci: P=0.0086 at rs17622152, P=0.012 at rs9303509 and P=3.9×10⁻⁴ at rs6021247.
Consequently, we consider that the joint analysis controls type I error and is well powered to detect true associations in the present study.
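The contrast drawn above between replication-based and joint analysis can be illustrated with a minimal Python sketch. It reproduces the Bonferroni arithmetic (0.05/172) from the response and compares the two decision rules for a single SNP. The effect sizes and standard errors are hypothetical values chosen for illustration, and the fixed-effects inverse-variance weighting is an assumption, since the response does not state which meta-analysis method was used.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # genome-wide significance level cited in the response


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def two_sided_p(beta, se):
    """Two-sided P value from a normal (Wald) test statistic."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def inverse_variance_meta(betas, ses):
    """Fixed-effects inverse-variance meta-analysis (assumed method)."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def replication_based_call(stage2_beta, stage2_se, stage1_beta, n_followed_up, alpha=0.05):
    """Stage 2 alone must pass Bonferroni, one-sided in the stage 1 direction."""
    z = stage2_beta / stage2_se
    if stage1_beta < 0:
        z = -z  # orient the test towards the direction seen in stage 1
    one_sided_p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return one_sided_p < bonferroni_threshold(alpha, n_followed_up)


def joint_call(betas, ses, alpha=GENOME_WIDE_ALPHA):
    """Combined stages must reach genome-wide significance."""
    beta, se = inverse_variance_meta(betas, ses)
    return two_sided_p(beta, se) < alpha


# Hypothetical SNP: (beta, standard error) in each stage.
stage1 = (0.30, 0.06)
stage2 = (0.25, 0.09)

print(f"Bonferroni threshold: {bonferroni_threshold(0.05, 172):.5f}")
print("replication-based:", replication_based_call(stage2[0], stage2[1], stage1[0], 172))
print("joint analysis:", joint_call([stage1[0], stage2[0]], [stage1[1], stage2[1]]))
```

For these hypothetical inputs the SNP fails the replication-based rule but passes the joint rule, which is the kind of case the two designs treat differently.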
1-2. The text should inform the reader on the main steps of the analysis, without having to read the supplement. Main QC results such as GC and GC correction steps should be in the main text. The QQ plots should be presented without and with subtraction of the findings from the main SNPs.
>> We have revised the corresponding part in the Results (last 4 lines, Page 7) and Supplementary Fig. 2.
2-1. The work on inter-ethnic heterogeneity, heritability, and ancestry-specific work has to make clearer (in the main text) which SNP-set was used for these analyses. What I would like to know is how many of the current BP SNPs have significant heterogeneity; this seems partially addressed in the section on ancestry-specific variants, but not for the other sections, and the number of previously identified BP SNPs is not transparently explained (e.g. with a supplementary table).
Table 13 and Supplementary Fig. 7), in which all the corresponding information and results are demonstrated.
2-2. The sections on the inter-ethnic work may be summarized with a single quantitative statement that should also be included in the abstract.
>> We have added to the abstract a sentence, "At 6 unique loci, distinct non-rare (or common) ancestry-specific variants co-localized within the same linkage disequilibrium block despite the significantly discordant direction of effects for the proxy shared variants between the ethnic groups".

Reviewer #1 (Remarks to the Author):
The authors have addressed all of my concerns satisfactorily, thank you. I don't think you need to state "motivated us" again in the paragraph entitled "Genetic correlation and power of GWAS"; this can be re-phrased a little.

Reviewer #2 (Remarks to the Author):
This is a revision of the previously reviewed paper by Takeuchi et al. The manuscript has two parts: de novo discovery and trans-ethnic work.
Part one still does not satisfy this reviewer: the cleaning steps are unsatisfactory (GC not applied, not indicated in the main text), and the definition of replication is not satisfactory.
Part two mainly concentrates on the newly discovered loci from part one for the first analyses, but this is, as I understand, only a small part of the overall number of known BP loci. The method for identifying causal variants is not satisfactory, although this is a central element for inter-ethnic comparisons.
This reviewer holds the view that the title chosen does not adequately describe the work and may even be misleading: "Interethnic comparability in blood pressure loci".

and 'Supplementary online material'), we highlighted the revised parts of the manuscript in yellow.
Responses to reviewer 1's comments
I don't think you need to state "motivated us" again in the paragraph entitled "Genetic correlation and power of GWAS"; this can be re-phrased a little.
Table 2). Since the LD Score regression intercept can account for polygenic effects and inflation due to large sample size (Bulik-Sullivan et al. Nat Genet 2015), we applied the LD Score regression intercept as a correction factor for cohorts with a sample size of >3,000 individuals. Genomic control λGC was used as a correction factor in the other studies." Here, LD Score regression is unsuitable for cohorts with a sample size <3,000 individuals (https://www.med.unc.edu/pgc/statgen/presentations/pgc_stat_bulik_2015.pdf).
http://www.nealelab.is.
We tested blood pressure associations in three independent data sets: discovery stage, replication stage and lookups, denoting the meta-analysis of discovery and replication stages as the "combined meta-analysis".
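The cohort-level correction rule quoted above (LD Score regression intercept for cohorts of more than 3,000 individuals, genomic control λGC otherwise) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the LD Score regression intercept is taken as a precomputed input rather than estimated here, and flooring the factor at 1.0 is a common convention assumed for this sketch, not something stated in the response.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def lambda_gc(chi2_stats):
    """Genomic control inflation factor: observed median chi-square over its expectation."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def choose_correction_factor(n_samples, chi2_stats, ldsc_intercept=None, min_n_for_ldsc=3000):
    """Pick the correction factor for one cohort following the quoted rule.

    ldsc_intercept is assumed to come from a separate LD Score regression run.
    """
    if n_samples > min_n_for_ldsc and ldsc_intercept is not None:
        factor = ldsc_intercept
    else:
        factor = lambda_gc(chi2_stats)
    # Assumed convention: never deflate the statistics (factor below 1 is set to 1).
    return max(factor, 1.0)


def apply_correction(chi2_stats, factor):
    """Divide each association chi-square statistic by the correction factor."""
    return [chi2 / factor for chi2 in chi2_stats]
```

A cohort of 5,000 individuals with a supplied intercept of 1.02 would be corrected by 1.02, whereas a cohort of 2,000 individuals would fall back to its own λGC.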

Table. Genotyping and correction factor in the stage 1 cohorts (See details in ST 2)
We have revised the definition of a validated association signal as follows.
"In the present study, an association signal was declared to be validated if it satisfied all four of the following criteria: (i) the sentinel SNP was genome-wide significant (P < 5 × 10 -8 ) in the combined meta-analysis for any of the five blood pressure traits; (ii) the sentinel SNP showed evidence of support (P < 0.05) in the replication stage alone for association with the most significantly associated blood pressure trait from the combined meta-analysis; (iii) the sentinel SNP showed further evidence of support (P < 0.05) in association results for either SBP or DBP of lookup variants (n = 18 in this study); and (iv) the sentinel SNP had concordant directions of effect across the discovery and replication stages and the lookups. " 3. Part two mainly concentrates on the newly discovered loci from part one for the first analyses, but this is, as I understand, only a small part of the overall number of known BP loci. The method for identifying causal variants is not satisfactory, although this is a central element for inter-ethnic comparisons.
>> As pointed out by the reviewer #2, we have revised the part of interethnic comparisons to a large extent. First of all, we have clearly described the tested BP loci in reference to total non-rare blood pressure loci that have been previously reported or newly identified in the present study, by addition of the corresponding paragraph in the main text (in the section entitled "Ancestry-specific SNP loci", Page 10), Supplementary  Fig. 7a and Supplementary Table 13).
Supplementary Fig. 7. Investigation of interethnic heterogeneity at blood pressure loci previously reported and newly identified.
By calibrating the proportion in this group-1 subset, we estimated the proportion of loci showing significant interethnic heterogeneity within the overall blood pressure loci tested (N = 446). The estimated proportion was 2.5% each in group 1 and group 2a (Supplementary Fig. 7b) as follows.
In the present study, interethnic heterogeneity of genetic impact on blood pressure was tested largely via two approaches, that is, for group-2a SNPs (46 + 2 SNPs) and group-1 SNPs (242 SNPs), respectively. For the first approach, we have described more detailed findings and explanations in the main text (the section entitled "Ancestry-specific SNP loci" in Page 10-12), Supplementary Figs. 8 and 9, and the online methods (the section entitled "Exploration of transethnic haplotype SNPs from ancestry-specific SNPs" in Page 29-30). For the second approach, we also have described more detailed findings and explanations in the main text (the section entitled "Interethnic heterogeneity at variants polymorphic in both ancestries" in Page 12-13), Supplementary Figs. 7, 10 and 11, and the online methods (the section entitled "Interethnic heterogeneity at non-rare variant loci" in Page 30-31, above-mentioned).
Fig. 1) with a GWAS (which consists of stages 1 and 2, N = up to 289,038) followed by a replication study (N = 516,972).
All genome-wide SNP markers (6.2 million SNPs) are characterized in a proportion of the GWAS samples, that is, stage 1 (N = 130,777), and results of stage 1 are used to select a proportion of the SNP markers for follow-up (a list of 13,003 SNPs in this study) on the remaining GWAS samples, that is, stage 2 (N = 53,008 for EAS alone and N = 158,261 for EAS+EUR). In this joint analysis strategy, test statistics from stages 1 and 2 are combined by meta-analysis, and a genome-wide significance level (P < 5 × 10⁻⁸ =
We have revised the part of 'Genome-wide association analyses and lookups for replication' in the Results (Pages 7 and 8), as well as Discussion (first 3 lines in Page 16) and the part of 'GWAS and replication meta-analyses' in the online methods accordingly. Also, we have revised all the related data and materials in Table 1,