Introduction

Walking is a simple and convenient form of exercise that is widely promoted for its benefit to physical fitness and overall health1. The public health recommendations for walking focus particularly on increasing the time spent walking and the number of steps walked, with walking at a faster pace receiving less emphasis2.

However, recent studies have observed a brisk habitual walking pace, self-reported through questionnaire or verbal interview, to be associated with reduced risk of a range of cardiorespiratory and cancer outcomes2,3. Most notably, self-reported habitual walking pace has been identified as one of the strongest predictors of all-cause mortality4, even when adjusting for the effects of established risk factors such as body mass index (BMI)5 and other lifestyle behaviours including smoking6.

Despite the strong associations of self-reported walking pace with health and survival, it is unclear whether these associations arise from common biological processes, including genetic predisposition, or whether there are causal effects of walking pace on health outcomes. These questions can be addressed with knowledge of the genetics of walking pace. To date, studies examining the genetic component of walking pace have analysed objectively measured gait speed, where speed is assessed by timing participants as they walk a distance of up to 8 m. These studies focussed on older adults, giving insight into the biological mechanisms underlying age-related diseases and physical mobility7,8. Genome-wide significant markers of objectively measured gait speed were not identified in these studies, which had a maximum sample size of 31,479.

To examine the genetic component of self-reported walking pace, we performed a genome-wide association study (GWAS) in UK Biobank, a prospective study of approximately 450,000 adults of European descent, in addition to approximately 50,000 participants of other ethnicities, aged between 40 and 69 years at baseline9. Participants self-reported their walking pace as “slow”, “steady/average” or “brisk”. We aimed to identify associated genetic variants and their possible function, quantify the genetic correlation of walking pace with other complex traits, and assess the potential of self-reported walking pace as a modifiable health-related exposure. Through these analyses we identify 70 genetic loci for self-reported walking pace and show that this trait shares its genetic architecture with other cardiometabolic risk factors, including educational attainment and cognitive outcomes. Using Mendelian randomisation (MR) we find evidence in favour of causal relationships between self-reported walking pace and several traits associated with mortality. This suggests that self-reported walking pace may indeed be a logical target of health interventions.

Results

GWAS of self-reported walking pace identifies 70 associated loci

We performed a GWAS of self-reported walking pace in 450,967 individuals of European ancestry from UK Biobank (full details in Methods). The phenotype was coded 0, 1 and 2 for self-reported slow, steady/average and brisk walking pace, and the characteristics of participants across these categories are summarised in Supplementary Data 1. We used a linear mixed model with covariates for age, sex, genotyping array and 20 principal components of ancestry implemented in BOLT-LMM v2.3.310. After quality control 10,061,374 imputed variants were analysed (Fig. 1). We identified 144 independent significant SNPs across 70 genomic loci (Table 1), indexed by 75 lead SNPs (Supplementary Data 2).

Fig. 1: Miami plot of self-reported walking pace GWAS results with and without adjustment for BMI.

The x-axis is ordered by chromosome and base position. On the y-axis the −log10(P-value) is shown, where P-values are from a Wald test in the BOLT-LMM mixed model test of association (N = 450,967 individuals). A genome-wide significance threshold of P < 5 × 10−8 is indicated by the red dotted line.

Table 1 Seventy independent loci associated with self-reported walking pace at genome-wide significance (P < 5 × 10−8).

We estimated an inflation in the test statistics (λGC = 1.597, mean χ2 = 1.767) but, similarly to other phenotypes analysed in UK Biobank11, the LD score intercept of 1.058 (s.e. = 0.0120) suggests that the inflation is largely due to polygenic signal and the large sample size rather than population substructure.
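
As a rough illustration of how inflation statistics of this kind are computed, the sketch below derives λGC and the mean χ2 from a list of association P-values. It assumes two-sided tests with 1 degree of freedom; the function name `inflation_stats` and the toy P-values are hypothetical and are not taken from the study.

```python
from statistics import NormalDist, mean, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_stats(p_values):
    """Return (lambda_GC, mean chi-square) for two-sided 1-df P-values."""
    norm = NormalDist()
    # A 1-df chi-square statistic is the square of the matching z-score.
    chi2 = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN, mean(chi2)


# Toy input (assumption): P-values spread evenly over (0, 1), as under the null.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lambda_gc, mean_chi2 = inflation_stats(null_p)
```

For null-like input both statistics sit close to 1. Values well above 1, as reported here, indicate inflation, and the LD score intercept is then used to separate polygenic signal from confounding.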

As there is a clear negative association between BMI and self-reported walking pace (Supplementary Data 1), we were concerned that the results may simply reflect genetic associations with BMI, which have been extensively described12. We therefore performed a sensitivity analysis by including BMI as a covariate in the model (Fig. 1). Of the 70 associated loci only 15 retained genome-wide significance following adjustment for BMI, whilst 45 loci in total maintained a suggestive significance level (P < 10−5). In addition, using LD score regression13 we observed a strong genetic correlation between self-reported walking pace with and without adjustment for BMI (rg = 0.83, s.e. = 0.0073), suggesting that much of the genetic component of walking pace is independent of BMI.

Post-GWAS annotation, gene-based analysis and tissue-enrichment analyses

A detailed annotation catalogue of candidate SNPs in the associated genomic loci is presented in Supplementary Data 3. Of the 70 independently associated genomic loci, 59 have previously reported suggestive associations for other traits and diseases (Supplementary Data 4). The strongest overlaps with self-reported walking pace include 28 shared loci with BMI, 20 loci associated with educational attainment and 13 loci associated with hand grip strength.

Using positional mapping and expression quantitative trait loci (eQTL) mapping, we identified a total of 535 genes associated with genome-wide significant SNPs (Supplementary Data 5). We also performed a genome-wide gene-based association study (GWGAS) that identified 255 genes associated with self-reported walking pace (Supplementary Data 6), of which 152 were implicated through positional or eQTL mapping.

The strongest self-reported walking pace signals were identified within SLC39A8 on chromosome 4, which has previously been associated with metabolic traits14, FTO on chromosome 16, strongly associated with fat mass and obesity15, and TCF4 on chromosome 18, linked to neurocognitive traits and psychiatric disease16. Of these, SLC39A8 and TCF4 remained genome-wide significant after adjustment for BMI, while the association of FTO was attenuated as expected but remained nominally associated (Supplementary Data 2).

To guard against the potential pleiotropic effects of adiposity-related SNPs in the gene analysis, we further assessed genes that remained prioritised following adjustment for BMI. Of the 152 genes implicated by both the gene mapping and gene-based analyses, 78 remained significantly associated with self-reported walking pace following BMI adjustment. These genes included GDF5, ACBD4, H1FX, PTPN9, FAM83C and UQCC1, which have previously been associated with height17,18,19,20; MMP24, NCOA6, PIGU, GSS and PLCD3, associated with lean body mass21,22; MAPT, TRPC4AP, DCAKD, GGT7 and PROCR, associated with heel bone mineral density23; and several genes linked to educational attainment24 and cognitive function25 (SDCCAG8, BTN3A2, TCF4, HIST1H4H, ABT1, TXNL1, NYAP2 and ZNF322).

We assessed whether tissue types from the GTEx database26 were enriched for differences in self-reported walking pace. Genes that were associated with self-reported walking pace had increased expression in the brain (P = 9.6 × 10−4) and pituitary (P = 3.1 × 10−6), with tissue-specific enrichments found in the cerebellar hemisphere (P = 5.4 × 10−7) and cerebellum (P = 2.1 × 10−6) (Supplementary Data 7).

Interpretable SNP-heritability estimates

To provide an interpretable heritability estimate for an ordinal outcome, we parameterised self-reported walking pace on the liability scale. Self-reported walking pace on the observed scale y takes the values 0, 1 and 2 with frequencies πj for the three ordered categories. The underlying latent variable l ~ N(0, 1) is related to the observed scale through thresholds t1 and t2 in the equation

$$y = 1\left\{ {l > t_1} \right\} + 1\left\{ {l > t_2} \right\}.$$

The heritabilities on the observed and liability scales are related using the result

$$h_l^2 = h_o^2\frac{V_o}{(z_1 + z_2)^2}$$
(1)

derived by Gianola27, where zj is the standard normal density at threshold tj and

$$V_o = \sum_{k=1}^{3} k^2\pi_k - \left( \sum_{k=1}^{3} k\pi_k \right)^2$$

(see Supplementary Note 1).

We used BOLT-REML28 adjusting for age, sex, genotyping array and 20 principal components to first estimate the SNP-heritability on the observed scale. Then, using Eq. (1) to convert between scales, we estimated the SNP-heritability for self-reported walking pace on the liability scale as 13.2% (s.e. = 0.21%). With BMI included as a covariate, the heritability is reduced to 8.9% (s.e. = 0.17%).
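
The conversion in Eq. (1) can be sketched numerically as follows. This is a minimal illustration rather than the authors' code: the function name `liability_h2` is hypothetical, and it assumes the thresholds t1 and t2 are obtained from the category frequencies through the standard normal quantile function.

```python
from statistics import NormalDist


def liability_h2(h2_observed, freqs):
    """Convert observed-scale heritability to the liability scale via Eq. (1).

    freqs holds the frequencies of the three ordered categories
    (slow, steady/average, brisk) and must sum to 1.
    """
    norm = NormalDist()
    p1, p2, _ = freqs
    # Thresholds on the N(0, 1) liability: P(l <= t1) = pi_1, P(l <= t2) = pi_1 + pi_2.
    t1 = norm.inv_cdf(p1)
    t2 = norm.inv_cdf(p1 + p2)
    # Standard normal density at each threshold.
    z1, z2 = norm.pdf(t1), norm.pdf(t2)
    # Variance of the category score; it is unchanged by shifting the coding,
    # so scores 1, 2, 3 and 0, 1, 2 give the same value.
    scores = (1, 2, 3)
    mean_k = sum(k * p for k, p in zip(scores, freqs))
    v_o = sum(k * k * p for k, p in zip(scores, freqs)) - mean_k ** 2
    return h2_observed * v_o / (z1 + z2) ** 2


# Illustrative frequencies only (assumption); they are not the UK Biobank values.
h2_l = liability_h2(0.10, (0.25, 0.50, 0.25))
```

With these symmetric toy frequencies the scaling factor Vo/(z1 + z2)2 is about 1.24, so the liability-scale estimate is somewhat larger than the observed-scale one.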

Genetic overlap with other traits and diseases

We assessed whether self-reported walking pace has a shared genetic basis with other complex traits, which may reflect common biological mechanisms or causal effects in either direction. We examined genetic correlations rg between self-reported walking pace and a range of 53 traits using LD score regression13. The traits were assorted into categories including anthropometric traits, cardiometabolic traits, cognition and educational attainment, and aging-related traits. We observed significant genetic correlations with 39 traits based on a Bonferroni corrected threshold (P < 9.4 × 10−4), with results summarised in Fig. 2 and Supplementary Data 8.

Fig. 2: Summary of significant genetic correlations between self-reported walking pace and other phenotypes.

rg, genetic correlation estimated by LD score regression. Horizontal bars represent 95% confidence intervals for the rg estimates. A Bonferroni threshold was used to test 53 phenotypes (P < 9.4 × 10−4). Complete results are shown in Supplementary Data 8.

The genetic architecture of self-reported walking pace overlaps highly with traits relating to adiposity (BMI, rg = −0.52, P = 4.7 × 10−179), education and cognition (years of schooling, rg = 0.51, P = 3.4 × 10−170; intelligence rg = 0.34, P = 3.1 × 10−72) and longevity (parentsʼ age at death, rg = 0.54, P = 3.9 × 10−12). Overall, traits related to cardiometabolic risk, lung function, psychiatric disease and muscular strength show genetic correlations with self-reported walking pace. The genetic correlations also support many of the phenotypic associations that have been observed across categories of walking pace in external cohorts29,30. Traits that remained genetically correlated with self-reported walking pace after adjusting for BMI included hand grip strength, measures of lung function such as forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV1), years of schooling, intelligence, insomnia and depressive symptoms. Genetic correlations with adiposity-related traits and glycemic traits were attenuated following adjustment for BMI.

Polygenic risk score association with all-cause mortality

We explored whether the strong associations that exist between self-reported walking pace and survival2 can be explained partly through genetic predisposition. Cox proportional hazard models were used to test the association of genetically predicted walking pace, estimated through polygenic risk scores (PRS) with a range of P-value thresholds, and all-cause mortality. We conducted our analyses using sex-stratified GWAS results for self-reported walking pace (see “Methods”) to control for sample overlap.

We observed a significant association between genetic variants associated with self-reported walking pace and all-cause mortality in males (PRS with P < 10−2; hazard ratio (HR) = 0.95; 95% CI: 0.92–0.97; P = 1.93 × 10−5) and females (PRS with P < 10−2; HR = 0.95; 95% CI: 0.92–0.98; P = 2.70 × 10−3) (Table 2). We performed further analyses to examine the possibility of BMI acting as a mediator of the associations. When we adjusted for BMI in the model, the association with all-cause mortality remained significant both in males (PRS with P < 10−2; HR = 0.96; 95% CI: 0.93–0.98; P = 4.40 × 10−4) and females (PRS with P < 10−2; HR = 0.95; 95% CI: 0.92–0.98; P = 2.24 × 10−3), which suggests the effect of the genetic variants on mortality is partly independent of BMI.

Table 2 Association between genetically determined self-reported walking pace and all-cause mortality, stratified by sex. PRS, polygenic risk score. Hazard ratios are per 1 standard deviation increased PRS.

Mendelian randomisation

We performed MR to test for credible causal associations between walking pace and genetically correlated traits. We tested 21 traits for causal relationships with self-reported walking pace at a Bonferroni significance threshold of P < 2.3 × 10−3, using only GWAS data from large scale, published studies of European ancestry that do not include participants from the UK Biobank cohort. The 75 lead SNPs for self-reported walking pace were used as genetic instruments within a two-sample MR, with walking pace as the exposure.

Genetically predicted self-reported walking pace was associated with a range of cardiometabolic, respiratory, psychiatric and sleep traits (Supplementary Data 9). An increase in genetically predicted walking pace was associated with lower BMI (βIVW = −1.37, PIVW = 6.7 × 10−12), lower risk of coronary artery disease (odds ratio (OR) = 0.34, PIVW = 3.1 × 10−8), higher HDL cholesterol levels (βIVW = 0.95, PIVW = 3.3 × 10−9) and higher FEV1 (βIVW = 0.35, PIVW = 2.9 × 10−5). We found no evidence of directional pleiotropy by testing the intercept of MR-Egger analysis (Supplementary Data 9).
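
For readers unfamiliar with the inverse-variance weighted (IVW) estimator behind the βIVW values, the sketch below shows the standard fixed-effect form computed from per-SNP summary statistics. The function name `ivw_estimate` and the three toy SNPs are hypothetical; the software and exact IVW variant used in the study are not specified in this section.

```python
from math import exp, sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect and its standard error."""
    # First-order weight for each SNP's ratio estimate: beta_x^2 / se_y^2.
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    # Per-SNP ratio (Wald) estimates of the causal effect.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    beta_ivw = sum(w * r for w, r in zip(weights, ratios)) / total
    return beta_ivw, sqrt(1.0 / total)


# Toy summary statistics (assumption): every SNP implies a ratio of -2.
bx = [0.02, 0.03, 0.05]     # SNP effects on the exposure
by = [-0.04, -0.06, -0.10]  # SNP effects on the outcome
se = [0.01, 0.01, 0.02]     # standard errors of the outcome effects
beta_ivw, se_ivw = ivw_estimate(bx, by, se)

# For a binary outcome analysed on the log-odds scale, the odds ratio is exp(beta).
odds_ratio = exp(beta_ivw)
```

Because all three toy SNPs give the same ratio, the pooled estimate equals that ratio. With real instruments the ratios differ, and heterogeneity between them is one signal of possible pleiotropy.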

To examine the potential pleiotropic effects of adiposity-related SNPs on the MR results, we conducted two sensitivity analyses accounting for the effects of BMI.

Firstly, largely similar results were found when we excluded 28 SNPs that were previously associated with BMI (Supplementary Data 10). Similar magnitude associations remained between increased genetically predicted walking pace and lower risk of coronary artery disease (OR = 0.37, PIVW = 1.5 × 10−5) and higher FEV1 (βIVW = 0.33, PIVW = 1.6 × 10−3), though a weaker effect was observed on lowering BMI (βIVW = −0.55, PIVW = 2.1 × 10−6). Associations were substantially weakened following the exclusion of adiposity-related SNPs between genetically predicted walking pace and glycemic traits such as fasting insulin, HOMA-IR (homeostasis model assessment of insulin resistance index) and type 2 diabetes, suggesting a contribution of pleiotropy that confounds the MR results in these cases.

Secondly, we included both self-reported walking pace and BMI in a multivariable MR (Supplementary Data 11). After the inclusion of BMI as an exposure, only the association between genetically predicted walking pace and waist-to-hip ratio remained significant. This may suggest that the observed associations of genetically predicted walking pace with lower cardiovascular risk and improved lung function are pleiotropically mediated through BMI. Alternatively, because the multivariable MR tests the direct causal effect of walking pace while holding BMI constant, the analysis may have limited power to detect such an effect when the causal effect of walking pace is substantially mediated through BMI.

Discussion

We present a GWAS of self-reported walking pace using data from 450,967 individuals of European ancestry in the UK Biobank cohort. We identified 70 independent genomic loci associated with self-reported walking pace, of which 59 have previously reported associations in published GWAS for other traits and diseases, and 11 are currently unique to self-reported walking pace.

We estimated the SNP-heritability of self-reported walking pace as 13.2% on the liability scale, indicating only a modest genetic component and suggesting that self-reported walking pace is largely modifiable. We showed that there are many significant genetic correlations with cardiometabolic traits and diseases, including BMI, coronary heart disease, type 2 diabetes and lipid levels, and with respiratory traits and other lifestyle behaviours such as sleep. These could be due either to causal associations between self-reported walking pace and those traits, in either direction, or to pleiotropic effects whereby genetic variants influence multiple phenotypes through possibly independent biological pathways31. We also showed that polygenic scores predicting self-reported walking pace are inversely associated with all-cause mortality risk, and that this association is independent of BMI. Future work examining the genetic relationship between walking pace and survival could focus on the biological mechanisms underlying these associations.

By performing MR analyses we provide evidence that a genetically elevated self-reported walking pace is linked to a lower cardiometabolic risk profile, suggesting that increasing walking pace could act as a beneficial intervention for a range of health outcomes. This is consistent with findings from randomised controlled trials in cardiovascular disease patients, which have shown that exercise-based interventions have beneficial effects on survival32. Our results suggest that such interventions may also be effective in the general population of adults. MR depends upon a number of assumptions to draw causal inferences, with many methods available to vary the required assumptions31. An exhaustive analysis of every causal association is beyond the scope of this study, but we have allowed for the impact of pleiotropy with MR-Egger and weighted median methods, and further sensitivity analyses to examine the effect of adiposity-related SNPs. As self-reported walking pace is a general indicator of an individual’s perceived health, there are likely to be many different biological and psychological mechanisms underlying it. The specific mechanisms are unclear though, as is the extent to which they might invalidate the MR results. By using a range of MR estimators, which depend on different, though related sets of assumptions, we can increase the reliability of our causal inferences. We believe that the ensemble of significant MR results across phenotypes, with effects in biologically plausible directions, is sufficient to conclude with confidence that increasing self-reported walking pace would cause certain aspects of health to improve, and thus is likely to be a suitable target for intervention. In addition, because the phenotype is a self-reported measure, our results may also support a causal link between positive self-perceptions of health and overall health status.

To better understand the relationship between self-reported walking pace and BMI we performed several sensitivity analyses. The high genetic correlation between self-reported walking pace with and without adjustment for BMI (rg = 0.83, s.e. = 0.0073) suggests a substantial component of the genetic architecture of self-reported walking pace is independent of BMI. This was supported by genetic correlations between self-reported walking pace and a range of complex traits and diseases that were largely robust to adjustment for BMI. In comparison with the genome-wide correlations, a more marked effect of BMI was noted at the genomic loci associated with self-reported walking pace. Only 15 of the 70 loci survived the adjustment for BMI at the genome-wide significance level, whilst 45 loci in total retained a suggestive level of P < 10−5. The attenuation of top hits may partly reflect a mediated effect of BMI on the causal pathway between genotype and self-reported walking pace. To explore this, we performed multivariable MR, which is a valid form of mediation analysis33. Following the inclusion of BMI as a secondary exposure alongside self-reported walking pace, we found that across a range of outcomes there was only weak evidence of a direct causal effect of self-reported walking pace (that is, an effect independent of BMI). One possible explanation for this finding, however, is the limited statistical power available to detect both direct and indirect causal effects in a multivariable MR setting.

We found that self-reported walking pace has a strong genetic overlap with increased years in education and greater intelligence. Hypotheses have been proposed to explain the association between walking pace and both educational and cognitive outcomes34. Firstly, educational attainment may be associated with positive lifestyle choices regarding physical activity and diet, and in addition, a higher education is associated with a greater ability to self-manage health such as by using health services effectively. The importance of walking pace as a measure of overall health status is further supported by previous evidence showing this phenotype is correlated highly with objective measures of physical fitness1. A faster walking pace may also reflect psychological factors relating to increased motivation and internal “drive”, which are plausibly linked to educational attainment and cognition. In addition, it has been observed that in old age there is a parallel decline of walking pace and cognition, and our results may provide some evidence of a genetic basis to this association. Future work could explore this further through joint analysis of walking pace and age-related neurological diseases associated with loss of cognition.

A strong genetic correlation was also observed between self-reported walking pace and hand grip strength, a proxy for overall muscle strength35. In addition, 13 genome-wide significant loci for hand grip strength overlapped with our 70 self-reported walking pace loci. Similar to walking pace, hand grip strength is known to decline with age, whilst increasing muscular strength has been shown to improve functional capacity36. These results indicate a shared genetic basis to the associations that both hand grip strength and walking pace display towards age-related phenotypes. There is however potential for pleiotropic effects that act through the same genetic variants on distinct biological pathways, and further work is needed on the biological mechanisms relating to the self-reported walking pace loci to understand their relevance to muscular strength.

Further work may also include bidirectional MR analyses and mediation analyses to understand the relative importance of walking pace and adiposity on health and survival outcomes. The release of detailed data acquired by accelerometer devices on a subset of participants37 presents further opportunities to compare self-reported walking pace with objective measures of physical activity at both a phenotypic and genotypic level.

Our analysis revealed challenges that are introduced when analysing an ordered categorical phenotype. Rather than the classical modelling approach of an ordinal logistic regression, we assigned weights to the ordered categories and used a linear mixed model. The linear scale makes strong assumptions about the distances between the categories of self-reported walking pace. Whilst recently developed ordinal logistic regression methods have been applied to non-imputed data at UK Biobank scale38, they are not yet computationally tractable on densely imputed GWAS datasets. Analysing ordered categorical variables on the linear scale proves problematic when interpreting SNP effect sizes, SNP-heritability and causal effect estimates in MR. We converted heritability estimates from the observed scale to the liability scale, which is more interpretable as it models self-reported walking pace as a continuous trait. This unobserved latent scale is not the actual walking pace, which can be measured under controlled conditions7, but reflects genetic and environmental factors that influence the self-reported category.

There are several limitations to note. First, the associated loci must be accepted tentatively until validated in an independent cohort. We were specifically interested in the self-reported phenotype owing to its ease of measurement, but while similar measures are available in some prospective cohorts, we were unable to obtain the relevant data during the course of this study. In particular, it is important to confirm the results in a separate demographic, since the UK Biobank participants are known to be healthier than the general population39. Second, self-reported walking pace is known to be a crude measure in comparison to objective assessments, which raises the possibility of misclassification bias40. In particular, it is thought that self-reported walking pace reflects both actual walking pace in daily life as well as a sense of self-rated health41,42. Nonetheless, previous studies have indicated a reasonably close association exists between self-reported and objectively measured usual walking pace43,44, and work by Murtagh et al.45 showed that issuing a simple instruction to walk “briskly” prompted more vigorous activity in participants across all fitness levels. Third, this work is limited in scope by the lack of questionnaire data on the specific context of the walking behaviour, as walking pace is known to differ across domains (e.g. exercise, travel, domestic, leisure)46.

Therefore, the genetic associations and possible causal effects we report here may not hold for more specific measures of gait. Nevertheless, the strong association of self-reported walking pace with health outcomes and mortality warrants study in its own right. Despite the inherent limitations described, our results highlight the value of studying subjective, self-reported measures of physical activity. We are able to utilise a simple measure of self-reported walking pace to infer that walking at a speed that is brisk in one’s own estimation has important benefits to health and longevity. Arguably this could provide the basis for health advice that is easier to understand and follow compared to walking at or above a precisely defined speed. Nevertheless, further investigation is needed into the generalisability of our findings to interventions aimed at increasing objectively assessed walking pace.

In conclusion, we have identified 70 genetic loci associated with self-reported walking pace and shown that its strong associations with cardiorespiratory and mortality outcomes are partly explained by genetic correlations. MR arguments augment the results of trials on cardiovascular patients32 to suggest that self-reported walking pace may be a beneficial target for intervention in the general population. Given that the measure is, by definition, easily assessed by individuals themselves, it may be entirely feasible to develop pragmatic interventions on walking pace that have beneficial effects on health.

Methods

Study population

The UK Biobank study is a large cohort of 501,726 British residents aged between 40 and 69 at recruitment. The participants attended assessment visits across 23 study centres in the UK, through which extensive phenotypic data were collected. Participants provided informed consent to participate, and the UK Biobank study has ethics approval from the North West National Research Ethics Committee (REC reference 11/NW/0274). This work has been conducted under UK Biobank application 33266.

Genotype, imputation and quality control

The initial genotyping, imputation and quality control were conducted centrally by the UK Biobank, and have been described in detail elsewhere9. Genotyping was performed using the UK BiLEVE Axiom Array and the UK Biobank Axiom arrays, with imputation to the Haplotype Reference Consortium panel47 which has approximately 96 million variants.

Phenotype

Self-reported walking pace was ascertained using the ACE touchscreen question “How would you describe your usual walking pace?” with response options of “slow”, “steady/average”, “brisk”, “None of the above” or “Prefer not to answer”. If the participant activated the “Help” button they were shown the message: “Slow pace is defined as less than 3 miles per hour. Steady average pace is defined as between 3-4 miles per hour. Fast pace is defined as more than 4 miles per hour.” We excluded participants whose answers were “None of the above” (n = 1,426) or “Prefer not to answer” (n = 519). The low numbers of these exclusions suggest minimal impact of any informative missingness. The responses “slow”, “steady/average” and “brisk” were coded as 0, 1 and 2 for our analyses.
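
The coding and exclusion rule described above amounts to a simple mapping, sketched below. The response strings follow the labels quoted in this section and may differ from the raw UK Biobank field encodings; the names `PACE_CODES` and `code_walking_pace` are hypothetical.

```python
# Ordinal coding used for the analyses: slow = 0, steady/average = 1, brisk = 2.
PACE_CODES = {"slow": 0, "steady/average": 1, "brisk": 2}


def code_walking_pace(responses):
    """Map responses to 0/1/2, dropping answers that were excluded
    ("None of the above" and "Prefer not to answer")."""
    return [PACE_CODES[r] for r in responses if r in PACE_CODES]


# Toy responses (assumption) showing both retained and excluded answers.
coded = code_walking_pace(
    ["brisk", "Prefer not to answer", "slow", "steady/average", "None of the above"]
)
```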

Genome-wide association analysis

Association analysis was carried out in a set of 450,967 individuals of European ancestry with non-missing phenotypes, where ancestry was defined by the K-means clustering of the first two principal components48. A linear mixed non-infinitesimal model for self-reported walking pace was implemented in BOLT-LMM v2.3.210 under an additive genetic model. The model included covariates for age, sex, genotyping array and the first 20 principal components of ancestry. We additionally carried out a sensitivity analysis to explore the effect of using BMI as a covariate. Following association analysis, only biallelic SNPs were retained with a minor allele frequency ≥0.005, imputation quality ≥0.60 and maximum per SNP missingness of 10%. In total, 10,061,374 variants were analysed. To estimate the linear mixed model parameters, further QC was performed to remove variants with a minor allele frequency <1% and deviation from Hardy-Weinberg equilibrium (P < 10−6).

Genomic risk loci were derived using the Functional Mapping and Annotation of genetic associations (FUMA) platform49. Independent significant SNPs were defined using a genome-wide significance threshold of P < 5 × 10−8, independent from each other at r2 < 0.6. Lead SNPs were further identified as a subset of the independent significant SNPs that are in linkage disequilibrium (LD) at r2 < 0.1. Genomic loci were defined by merging lead SNPs that are located within 250 kb of each other.
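
The final merging step can be pictured with the short sketch below, which groups lead SNPs on the same chromosome whenever consecutive positions lie within 250 kb of each other. It is a simplification for illustration only and does not reproduce FUMA's internal procedure, which also takes LD blocks into account; the function name `merge_into_loci` and the coordinates are hypothetical.

```python
def merge_into_loci(lead_snps, max_gap=250_000):
    """Group lead SNPs into loci.

    lead_snps is an iterable of (chromosome, base_position) tuples.
    Consecutive SNPs on one chromosome that are at most max_gap base
    pairs apart are placed in the same locus.
    """
    loci = []
    for chrom, pos in sorted(lead_snps):
        last = loci[-1] if loci else None
        if last and last["chrom"] == chrom and pos - last["end"] <= max_gap:
            # Extend the current locus to cover this lead SNP.
            last["end"] = pos
            last["n_lead"] += 1
        else:
            # Start a new locus.
            loci.append({"chrom": chrom, "start": pos, "end": pos, "n_lead": 1})
    return loci


# Made-up coordinates (assumption): two nearby SNPs and one distant SNP on
# chromosome 4, plus a single SNP on chromosome 16.
loci = merge_into_loci(
    [(4, 103_188_709), (16, 53_800_000), (4, 103_300_000), (4, 104_000_000)]
)
```

In this toy input the first two chromosome 4 SNPs fall into one locus, so four lead SNPs yield three loci. The same mechanism explains how 75 lead SNPs can index 70 loci.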

Interaction effects for the lead SNPs by sex were investigated by carrying out the BOLT-LMM analyses stratified by sex. The strata were ensured to be approximately independent by excluding individuals related to 3rd degree or above (kinship coefficient ≥0.044) using the software KING50. In each 3rd degree related pair, we retained the individual with the lower genotyping missingness rate.

The effect of confounding by population structure was estimated using the intercept of the LD score regression, which estimates the inflation in test statistics due to confounding of the association between walking pace and genotype13.

Sensitivity analysis

Because we used a linear model to test association with an ordinal categorical trait, we assessed the sensitivity of the results to different coding schemes of the self-reported walking pace phenotype, and compared statistical power when using an ordinal logistic and linear model. We partitioned the GWAS SNPs into 6 minor allele frequency bins, randomly selected 1000 SNPs from each bin, and compared the P-value of association for these SNPs under both the linear and ordinal logistic models (Supplementary Fig. 1). We additionally compared SNP effect sizes under both the linear and ordinal logistic models for the 75 independent lead SNPs (Supplementary Fig. 2). We used a sample of 373,414 unrelated individuals, such that no pair is related to 3rd degree or above, corresponding to a KING kinship coefficient50 of <0.044. We fitted both linear and ordinal logistic models with covariates for age, sex, genotyping array and 20 principal components using PLINK1.951 for the linear model and the Julia package OrdinalGWAS.jl38 for the ordinal logistic model.

Genetic correlations

The genetic correlations rg between self-reported walking pace and 53 traits were estimated using LD Score regression performed through the LDSC v1.0.1 software13. The set of traits includes anthropometric, cardiometabolic, educational, bone mineral density, aging and other categories for which summary statistic data were publicly available. Genetic correlations were tested for significance using a Bonferroni correction of P < 9.4 × 10−4.

Post-GWAS annotation and functional mapping

The functional annotation of SNPs associated with self-reported walking pace was carried out using FUMA49. Annotations include ANNOVAR categories, CADD scores, RegulomeDB scores and chromatin states. All candidate SNPs in the genomic risk loci (SNPs with r2 ≥ 0.6 with the lead SNPs and a suggestive significance level P < 5 × 10−5) were annotated.

Positional mapping and eQTL mapping were used to link self-reported walking pace genomic loci to genes. We used the prioritised genes from the positional and eQTL mapping to perform gene-set enrichment analysis against gene sets defined by traits in the GWAS catalogue. Additionally, gene-based analysis was performed with MAGMA through the FUMA platform52. MAGMA combines the P-values for SNPs within a gene to create gene-based P-values for 19,834 protein-coding genes. A Bonferroni-corrected threshold of P < 2.52 × 10−6 (0.05/19,834) was used to determine significantly associated genes. Finally, we used FUMA to perform tissue-enrichment analysis of 30 broad tissue types and 54 specific tissue types from the GTEx database26.

GWAS catalogue lookup

We identified SNPs with previously reported (P < 10−5) phenotypic associations in published GWAS in the NHGRI-EBI catalogue which overlap with SNPs in LD (r2 > 0.6) with the independent significant SNPs.

Polygenic risk score association with all-cause mortality

Cox proportional hazards models were used to investigate the association between genetically determined self-reported walking pace and all-cause mortality, using age as the time scale. Analyses were stratified by sex. For males there were 7049 all-cause mortality cases (n total = 186,015) and for females 4546 cases (n total = 223,646). To test for association with all-cause mortality in males, we computed genetic risk scores weighted by effect sizes estimated from the independent sample of females, and vice versa. The polygenic risk scores were constructed using PRSice v2.2.3 software53 for a range of P-value thresholds between 5 × 10−8 and 10−2, using approximately independent genetic markers obtained by clumping the SNPs with an r2 threshold of 0.1 and a window size of 250 kb. To examine the robustness of these associations to adiposity as a mediator, we included covariate adjustment for BMI.
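The scoring step can be sketched as a weighted sum of effect-allele dosages over the clumped SNPs that pass a given P-value threshold. This is an illustration only: the function name and dictionary inputs are hypothetical, clumping is assumed to have been done beforehand, and the scoring options in PRSice may normalise the sum differently.

```python
def polygenic_score(dosages, weights, pvalues, p_threshold):
    """Weighted allele score for one individual.

    dosages     : dict SNP ID -> effect-allele count (0 to 2)
    weights     : dict SNP ID -> effect size from the discovery sample
                  (e.g. females when scoring males)
    pvalues     : dict SNP ID -> discovery P-value
    p_threshold : SNPs with P below this value enter the score
    """
    score = 0.0
    for snp, beta in weights.items():
        if pvalues[snp] < p_threshold and snp in dosages:
            score += beta * dosages[snp]
    return score
```

Evaluating the function over a grid of thresholds between 5 × 10−8 and 10−2 gives one score per threshold for each individual, which can then enter the survival model as a covariate.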

Analyses were performed with Stata 16.0. Mortality status was obtained from the UK Biobank through the National Health Service (NHS) Information Centre and the NHS Central Register, Scotland with detailed information on the data linkage procedure available online.

MR analyses

To investigate whether walking pace has a causal effect on different outcomes, we performed two-sample MR analyses testing 21 traits identified in the genetic correlation analysis. We used only GWAS data from large-scale, previously published studies of European ancestry that do not include participants from the UK Biobank cohort. The inverse variance weighted approach was used as the primary method to infer causal effect estimates. The potential effect of pleiotropy was evaluated using the MR-Egger and weighted median estimate methods54,55. MR-Egger requires the InSIDE (Instrument Strength Independent of Direct Effect) assumption to hold, whilst the weighted median approach requires that no more than 50% of the weight comes from instruments that are invalid due to horizontal pleiotropy. The 75 independent lead SNPs were used as instrumental variables, using proxies in strong LD (r2 > 0.80) if the SNPs were unavailable in the outcome GWAS.
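The two regression-based estimators can be sketched from summary statistics as follows. The inverse variance weighted estimate is a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin; MR-Egger adds an intercept, which captures average directional pleiotropy. Function names and inputs are hypothetical, the standard error shown is the fixed-effect form, and this sketch does not reproduce the options of the R package used for the analyses.

```python
import math

def ivw_estimate(bx, by, se_y):
    """Inverse variance weighted estimate and fixed-effect standard error.

    bx, by : SNP-exposure and SNP-outcome effect estimates
    se_y   : standard errors of the SNP-outcome estimates
    """
    weights = [1.0 / s ** 2 for s in se_y]
    num = sum(w * x * y for w, x, y in zip(weights, bx, by))
    den = sum(w * x * x for w, x in zip(weights, bx))
    return num / den, math.sqrt(1.0 / den)

def mr_egger(bx, by, se_y):
    """MR-Egger intercept and slope via weighted least squares."""
    # Orient every SNP so that its exposure effect is positive.
    xs = [abs(x) for x in bx]
    ys = [y if x >= 0 else -y for x, y in zip(bx, by)]
    weights = [1.0 / s ** 2 for s in se_y]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

An MR-Egger intercept close to zero is consistent with balanced pleiotropy, in which case the slope and the inverse variance weighted estimate should broadly agree.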

We conducted further sensitivity analyses to explore the effect of pleiotropy due to BMI, as several of the SNP associations for self-reported walking pace are shared with BMI. Firstly, we repeated the MR analyses after excluding the 28 lead SNPs previously associated with BMI. Secondly, we performed multivariable MR by including both self-reported walking pace and BMI as exposures56. Estimates in this case correspond to the direct causal effect of walking pace with BMI held fixed. Summary statistics for BMI were obtained from the Genetic Investigation of Anthropometric Traits (GIANT) consortium12.
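Multivariable MR with two exposures can be sketched as a weighted regression of the SNP-outcome effects on both sets of SNP-exposure effects, with no intercept. The coefficient on walking pace is then the direct effect conditional on BMI. The function name and inputs are hypothetical, and the sketch returns point estimates only.

```python
def mvmr_two_exposures(bx1, bx2, by, se_y):
    """Weighted least squares of by on (bx1, bx2) through the origin.

    bx1, bx2 : SNP effects on exposure 1 (e.g. walking pace) and
               exposure 2 (e.g. BMI)
    by, se_y : SNP-outcome effects and their standard errors
    Returns (direct effect of exposure 1, direct effect of exposure 2).
    """
    weights = [1.0 / s ** 2 for s in se_y]
    a11 = sum(w * x1 * x1 for w, x1 in zip(weights, bx1))
    a22 = sum(w * x2 * x2 for w, x2 in zip(weights, bx2))
    a12 = sum(w * x1 * x2 for w, x1, x2 in zip(weights, bx1, bx2))
    c1 = sum(w * x1 * y for w, x1, y in zip(weights, bx1, by))
    c2 = sum(w * x2 * y for w, x2, y in zip(weights, bx2, by))
    det = a11 * a22 - a12 ** 2
    if det == 0:
        raise ValueError("exposure effects are collinear")
    # Solve the 2x2 normal equations by Cramer's rule.
    theta1 = (c1 * a22 - c2 * a12) / det
    theta2 = (a11 * c2 - a12 * c1) / det
    return theta1, theta2
```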

MR analyses were performed using the MendelianRandomization57 package in R.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.