Genetic associations with healthy ageing among Chinese adults

The genetic basis of overall healthy ageing, especially among East-Asian populations, is understudied. We conducted a genome-wide association study among 1,618 elderly Singapore Chinese participants (65 years or older) ascertained to have aged healthily and compared their genome-wide genotypes with those of 6,221 participants who did not age healthily after a 20-year follow-up. Two genetic variants were identified (P_Meta ≤ 2.59 × 10⁻⁸) as associated with healthy ageing, including one at the LRP1B locus, which was previously associated with healthy ageing in long-lived individuals without cognitive decline. Our study provides additional insights into the genetic basis of healthy ageing.

Human life expectancy has increased remarkably over the last two centuries; however, this has often not translated into concomitant improvements in healthy life expectancy 1. Genetic factors may influence longevity and survival to old age in good health (i.e. healthy ageing) 2. Genome-wide association studies (GWAS) for longevity have been conducted in both Europeans and Asians, but only a few common genetic loci, such as the APOE locus, have been robustly replicated 3,4. This may be due to heterogeneity in the definition of long-lived cases and non-cases, as well as ethnic-specific genetic variability.
Beyond lifespan, healthy ageing is a complex outcome that reflects overall quality of life as well as cognitive and physical function in the elderly population. Recent genetic studies of healthy ageing have primarily evaluated the incidence of major chronic diseases and defined healthy ageing as disease-free survival to old age 5-7. Fewer studies have evaluated the multiple facets of healthy ageing comprehensively, but those that have done so describe an important role for genetic factors in the ageing process. A recent example is the identification of the LRP1B locus in long-lived individuals, where it was associated with both good cognitive function and the absence of major chronic diseases 8. However, holistic evaluations of healthy ageing have largely been limited to studies of European ancestry populations 5,6,8, and related data are sparse among participants of East-Asian ancestry.
The Singapore Chinese Health Study (SCHS) is a population-based prospective cohort study of Chinese adults living in Singapore 9. Detailed information on the establishment of the cohort, genotyping methods and statistical analysis is available in the online supplementary documents. Seven aspects of healthy ageing in the SCHS were assessed by home visits at the third follow-up interview, conducted between 2014 and 2016, among participants who had survived to at least 65 years of age. Healthy ageing was defined as 1) absence of 10 major chronic diseases, 2) no evidence of cognitive impairment using education-specific cutoffs from the Singapore-modified Mini-Mental State Examination (MMSE), 3) no limitations in instrumental activities of daily living (IADL) defined using the Lawton IADL scale, 4) no major depression, defined as a score of less than 5 on the Geriatric Depression Scale (GDS), 5) good overall self-perceived health, 6) good physical functioning on self-report, and 7) no self-reported function-limiting pain. This seven-domain definition of healthy ageing has been widely used in the literature and has been applied previously in this Singapore Chinese cohort 10,11.
A total of 7,839 participants with genotyping data and with the information needed to derive healthy ageing status were included in this study. The mean age of study participants at healthy ageing measurement was 73.36 (SD = 5.85) years and 40.72% were men (Supplementary Table 1). The proportion of individuals who met the criteria for healthy ageing was approximately 20% in the present genetic study (Supplementary Table 1), which was similar to the proportion in the larger population that included those without genotyping data (N = 14,159) 10. Compared with those who did not age healthily, participants who attained healthy ageing at follow-up 3 had lower BMI and lower proportions of current smokers and daily drinkers at baseline, but higher diet quality (higher aMED scores) and a higher proportion with secondary education and above at baseline (Supplementary Table 2). The study population was divided into an SCHS-discovery dataset (1,489 healthy ageing cases and 5,721 controls) and an SCHS-replication dataset (129 healthy ageing cases and 500 controls) according to the timing of the genotyping and the version of the Illumina Global Screening Array 12. Association tests between common variants [minor allele frequency (MAF) ≥ 1%] and healthy ageing were conducted in the discovery and replication datasets, and the results were subsequently meta-analysed using the inverse-variance method in a fixed-effect model. For identified hits, we additionally evaluated associations with each of the seven components of healthy ageing in our dataset. Functional annotations of top hits were performed using Functional Mapping and Annotation (FUMA v1.3.6) 13.
Two variants were associated with healthy ageing in our study (Table 1, Supplementary Fig. 1). The first, rs138499810, an intergenic single nucleotide polymorphism (SNP) on chromosome 5, was associated with healthy ageing at genome-wide significance in the discovery dataset and was successfully validated in the replication dataset [T allele, odds ratio (OR) (95% confidence interval, CI) = 3.158 (2.148, 4.643), P_Meta = 4.94 × 10⁻⁹, Table 1, Supplementary Fig. 2a]. The association remained significant after adjustment for additional lifestyle-related variables, including BMI, dietary intake (aMED score), smoking status, alcohol consumption and education level [T allele, OR (95% CI) = 3.099 (2.110, 4.554), P_Meta = 8.21 × 10⁻⁹, Supplementary Table 3]. The frequency of the T allele of rs138499810 was 2% in the SCHS, which was similar to that in 1000 Genomes East-Asian reference populations (i.e. the CDX and CHB reference populations). In contrast, this SNP is monomorphic in most European ancestry populations, the exception being the 1000 Genomes Finnish reference population, in which the allele frequency is approximately 3% 14. Regional genes (XRCC4, VCAN and TMEM167A) within 10 kb of this variant were significantly enriched among loci identified in previous GWAS of biomarkers of osteoarthritis (Supplementary Fig. 3) 15. Chromatin interaction mapping indicated significant interactions between the intergenic region containing rs138499810 and the genes XRCC4, VCAN and TMEM167A in embryonic and mesenchymal stem cells (FDR between 3.37 × 10⁻⁷ and 1.99 × 10⁻¹⁹) 16 (Supplementary Table 4). XRCC4 functions together with DNA ligase IV and the DNA-dependent protein kinase in the repair of DNA double-strand breaks.
The efficiency of non-homologous end joining (NHEJ) repair declines with age, and restoration of XRCC4 improves NHEJ efficiency, suppresses the onset of stress-induced premature cellular senescence and mitigates ageing-related pathologies 17. Further studies are warranted to clarify potential functional links between rs138499810, XRCC4 and ageing.
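As a rough consistency check on the estimate reported above for rs138499810, a Wald z statistic and two-sided P value can be recovered from an odds ratio and its 95% confidence interval. The sketch below assumes a symmetric interval on the log-odds scale and a normal approximation; the function name `z_and_p_from_or` is hypothetical and is not part of the study's pipeline. Because the published estimates are rounded, the result should be close to, but not exactly, the reported P_Meta.

```python
import math


def z_and_p_from_or(or_est, ci_low, ci_high, z_crit=1.959964):
    """Recover a Wald z statistic and two-sided P from an OR and its 95% CI.

    Assumes the CI is symmetric on the log-odds scale (normal approximation).
    """
    beta = math.log(or_est)                                   # log odds ratio
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)  # SE of log OR
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))                      # two-sided normal tail
    return z, p


# Meta-analysis estimate reported in the text for the T allele of rs138499810
z, p = z_and_p_from_or(3.158, 2.148, 4.643)
print(f"z = {z:.2f}, P = {p:.2e}")
```

The same check can be applied to any of the other OR and CI pairs reported in the text.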
The second variant identified was rs117898573, an East-Asian-specific intronic variant at LRP1B on chromosome 2. The minor (T) allele had a frequency of 3%, and the OR for its association with healthy ageing was 2.433 (95% CI 1.779, 3.327; P_Meta = 2.59 × 10⁻⁸) (Table 1, Supplementary Fig. 2b). Although we were unable to robustly replicate this association in the replication dataset, the direction of effect in the replication dataset was consistent with that observed in the discovery analysis. Furthermore, the overall statistical significance still surpassed the genome-wide threshold in the meta-analysis of all data. The effect estimates also remained similar after inclusion of additional lifestyle-related covariates in the association model [T allele, OR (95% CI) = 2.327 (1.703, 3.179), P_Meta = 1.14 × 10⁻⁷, Supplementary Table 3]. The LRP1B locus has been associated with Alzheimer's disease 18, and haplotypes at LRP1B were previously reported to be associated with healthy ageing among long-lived individuals without cognitive decline 8. Hence, our results corroborate a role for the LRP1B locus in overall healthy ageing in the East-Asian population.
To determine whether the loci identified for overall healthy ageing were associated with the specific domains used to define healthy ageing status, we evaluated the associations between the identified SNPs and the individual components of healthy ageing in the SCHS dataset (Table 2).
Lastly, we evaluated whether previously reported loci for longevity were associated with healthy ageing in the SCHS cohort. Interestingly, the majority of the loci associated with increased longevity were not associated with healthy ageing in our study (Supplementary Table 5). This was consistent with previous reports 5 and suggests that most genetic determinants of longevity may be distinct from those involved in healthy ageing. Nevertheless, two longevity loci, rs17514846 and rs6108784, were nominally associated with healthy ageing in our study (P = 1.38 × 10⁻⁴ and P = 0.024, respectively; Supplementary Table 5). The A allele of rs17514846 and the C allele of rs6108784 were previously associated with decreased longevity 19, and these two alleles were associated with lower odds of healthy ageing in our study samples [OR (95% CI) = 0.834 (0.746, 0.932) for the A allele of rs17514846 and OR (95% CI) = 0.911 (0.841, 0.988) for the C allele of rs6108784, Supplementary Table 5].
Our study identified genetic variants for overall healthy ageing in an East-Asian Chinese population. However, several limitations need to be acknowledged. We were unable to significantly replicate the association of one variant, rs117898573, in our replication dataset, although the direction of effect at this SNP was consistent with the discovery data. This might be due to the relatively modest sample size of the replication dataset. Additionally, given the differences in definitions of healthy ageing, it would be challenging to evaluate the transferability of variants identified for related phenotypes, such as human healthspan defined by disease-free survival to old age, which have recently been reported in European ancestry samples 6. Furthermore, given that the variants identified in the present study were East-Asian specific, our data also highlight the importance of potential ethnic-specific mechanisms that may influence healthy ageing in the population.

Study cohort
The SCHS is a long-term population-based prospective cohort study focused on the dietary, genetic and environmental determinants of cancer and other chronic diseases in Singapore 9. In brief, 63,257 subjects aged between 45 and 74 years were recruited from April 1993 to December 1998. The subjects belonged to two major Chinese dialect groups in Singapore (the Hokkien and the Cantonese). A total of 28,439 participants donated blood specimens. At recruitment, all study subjects were interviewed in person by an interviewer using a structured questionnaire. After recruitment, the participants were re-contacted for follow-up every 5 to 6 years. The in-person interview at follow-up 3 was conducted from 2014 to 2016. Follow-up 3 focused on the measurement of ageing outcomes, and 17,107 surviving participants took part. The study was approved by the Institutional Review Board (IRB) of the National University of Singapore (NUS). Written informed consent was obtained from all study participants.

Assessment of healthy ageing
The definition of healthy ageing in the SCHS has been previously described 20-22. Healthy ageing addressed seven domains, namely no history of major chronic diseases, no impairment of cognitive function, no limitations in instrumental activities of daily living (IADL), no clinical depression at screening, good overall self-perceived health, good physical functioning and no function-limiting pain 10,23. The major chronic diseases included in our definition of healthy ageing were compiled from previous studies 20-22 and were asked about in the baseline or subsequent follow-up questionnaires; they comprised cancer, myocardial infarction, angina, heart failure, coronary artery bypass graft or angioplasty, stroke, diabetes, kidney failure, Parkinson's disease and chronic lung diseases. The other components of healthy ageing were assessed only at the follow-up 3 visit: cognitive function was evaluated using the Singapore-modified Mini-Mental State Examination (SM-MMSE), which had been validated in the Singapore population 24,25; IADL was assessed on the basis of the Lawton IADL scale; clinical depression was screened using the 15-item Geriatric Depression Scale (GDS-15), which had been validated in older Asians 26; overall self-perceived health was assessed by asking the question "In general, would you say your health is: excellent, very good, good, fair, poor?"; and physical function and function-limiting pain were assessed using the EuroQol Group's 5-domain questionnaire (EQ-5D) 27. Among participants who survived to at least age 65 years at the follow-up 3 visit, those who met all seven criteria were considered to have healthy ageing.
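The composite definition above can be sketched as a simple all-criteria rule. This is a minimal illustration only: the class and field names are hypothetical, the education-specific SM-MMSE cutoffs and the threshold for "good" self-perceived health are not given in this section and are therefore represented as pre-computed flags, and the GDS cutoff of less than 5 is taken from the definition given in the main text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgeingAssessment:
    """Hypothetical per-participant record at follow-up 3."""
    age_at_followup3: float
    n_major_chronic_diseases: int     # count among the 10 listed diseases
    cognitively_impaired: bool        # from education-specific SM-MMSE cutoffs
    has_iadl_limitation: bool         # Lawton IADL scale
    gds15_score: int                  # 15-item Geriatric Depression Scale
    good_self_perceived_health: bool  # cutoff on the 5-level item assumed upstream
    good_physical_function: bool      # EQ-5D, self-report
    has_function_limiting_pain: bool  # EQ-5D, self-report


def is_healthy_ageing(a: AgeingAssessment) -> Optional[bool]:
    """Return True/False for eligible participants, None if younger than 65."""
    if a.age_at_followup3 < 65:
        return None  # not eligible for classification
    return (
        a.n_major_chronic_diseases == 0
        and not a.cognitively_impaired
        and not a.has_iadl_limitation
        and a.gds15_score < 5
        and a.good_self_perceived_health
        and a.good_physical_function
        and not a.has_function_limiting_pain
    )


example = AgeingAssessment(72.0, 0, False, False, 2, True, True, False)
print(is_healthy_ageing(example))
```

Failing any single domain classifies a participant as a control, which is why only about one in five eligible participants met the definition in this cohort.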

Genotyping and Imputation
A total of 27,308 SCHS samples were genotyped on the Illumina Global Screening Array (GSA). GWAS genotyping for 25,273 SCHS participants was completed in 2018 using the Illumina Global Screening v1.0 and v2.0 arrays, and these data were used as the SCHS discovery dataset in the present study. An independent set of 2,035 SCHS participants was genotyped in 2020 using the Illumina Global Screening v2.0 array and was used in the present study as the SCHS replication dataset. The quality control (QC) procedures have been described in detail previously 28. A total of 7,839 participants who had complete information to derive healthy ageing status and who had genotype data were included in the current study. We imputed additional autosomal SNPs using a two-reference-panel approach that included both the cosmopolitan 1000 Genomes reference panel (Phase 3, 2,504 reference samples) 29 and 4,810 local Singapore Chinese (N = 2,780), Malay (N = 903) and Indian (N = 1,127) reference samples from a whole-genome sequencing (WGS) study of the Singapore populations 30. Alleles for all SNPs were coded to the forward strand and mapped to HG19. IMPUTE v2 31 was used to mutually impute variants specific to the 1000 Genomes and local Singaporean reference panels into each other to obtain a merged reference panel for imputing untyped variants in the study datasets. Imputation was performed in chunks of 1 Mb with a buffer of 500 kb, and the effective size of the reference population, Ne, was set to 20,000 as recommended 31. Frequencies of imputed common SNPs (MAF > 1%) were strongly correlated with those in the 1000 Genomes East Asian reference populations (EAS panels, R = 99.7%).
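The chunked imputation scheme described above (1 Mb core regions with a 500 kb buffer) can be illustrated with a short interval generator. This is a sketch under stated assumptions: the helper `imputation_chunks` is hypothetical, coordinates are treated as 1-based and inclusive, and the buffer is assumed to flank each chunk on both sides and to be truncated at the chromosome ends. It does not reproduce the actual IMPUTE v2 invocation used in the study.

```python
def imputation_chunks(chrom_length_bp, chunk_bp=1_000_000, buffer_bp=500_000):
    """Yield (core_start, core_end, buffered_start, buffered_end) tuples.

    Core intervals tile the chromosome without overlap; buffered intervals
    extend each core by buffer_bp on both sides, clipped to the chromosome.
    """
    start = 1
    while start <= chrom_length_bp:
        end = min(start + chunk_bp - 1, chrom_length_bp)
        buffered_start = max(1, start - buffer_bp)
        buffered_end = min(chrom_length_bp, end + buffer_bp)
        yield (start, end, buffered_start, buffered_end)
        start = end + 1


# Toy chromosome of 2.5 Mb (illustrative length, not a real chromosome)
for chunk in imputation_chunks(2_500_000):
    print(chunk)
```

Only genotypes imputed inside each core interval would be retained; the buffer supplies flanking haplotype context so that variants near chunk boundaries are imputed with comparable accuracy.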

Statistical method
Baseline clinical characteristics of the study subjects were compared between healthy ageing cases and controls. Normally distributed quantitative traits, including body mass index (BMI) and alternate Mediterranean Diet score (aMED score), were presented as mean ± SD (standard deviation), and differences in means between cases and controls were compared by t-test. Categorical variables were presented as numbers of individuals, and differences in their frequencies between groups were assessed by Pearson's χ² test. Genotype association analyses were performed assuming an additive genetic model, using the -method score option in SNPTEST (version 2) 32 to account for genotype uncertainty at imputed SNPs. To derive overall association values, META 33 (version 1.5) was used to combine association summary statistics from the discovery and replication datasets using the inverse-variance weighted method and assuming a fixed-effects model (-method 1). Heterogeneity of effects in the meta-analysed data was assessed using Cochran's Q, and a Cochran's Q P value (P_het) < 0.05 was considered to indicate significant heterogeneity. As lifestyle factors may affect the development of age-related diseases and health- and life-spans in the general population 34, a sensitivity analysis was performed for identified hits by including additional baseline covariates in the regression model, namely body mass index (BMI, kg/m²), dietary intake (aMED score), smoking status (never, ever and current), alcohol consumption (nondrinker/monthly drinker, weekly drinker and daily drinker) and education level (no formal education, primary school and secondary/A level/University education). We additionally evaluated genetic associations between each of the seven components of healthy ageing and the identified hits in our dataset. Functional annotations for identified variants were performed using Functional Mapping and Annotation (FUMA v1.3.6) 13.
Datasets used were based on the HG19 human assembly. Variant MAF and LD (r²) were precomputed using PLINK 35 and based on the EAS 1000 Genomes reference population in FUMA. All SNPs in LD (r² > 0.6 in the 1000 Genomes Phase 3 ASN panel) with lead SNPs from the study were identified. Regional genes (within 10 kb of the lead SNP) at the identified healthy ageing-associated loci were mapped and annotated using the SNP2GENE function. For chromatin interaction mapping, one end of each interaction was required to contain an identified variant (region 1) and the other end (region 2) to map to a gene promoter region (250 bp upstream and 500 bp downstream of the transcription start site by default). Only significant interactions (FDR < 0.05) with mapped genes were retained. Gene set enrichment was performed using the GENE2FUNC function in FUMA, which evaluates whether the regional genes reported in this study are enriched among putative GWAS catalogue genes from previous GWAS results. Gene set enrichments with P_Adj < 0.05 were considered significant.
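The fixed-effects, inverse-variance weighted combination and Cochran's Q heterogeneity test described in this section can be sketched as follows. This is a minimal standard-library illustration of the general technique, not the META (version 1.5) implementation; the function name is hypothetical, P_het is computed only for the two-dataset case (one degree of freedom), and the input log odds ratios and standard errors are made-up placeholder values rather than results from this study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis of log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum  # pooled log OR
    se = math.sqrt(1.0 / w_sum)                                # pooled SE
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))                       # two-sided P
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    # With two datasets, Q follows chi-square with 1 df under homogeneity,
    # whose upper tail equals erfc(sqrt(Q / 2)); other df are not handled here.
    p_het = math.erfc(math.sqrt(q / 2)) if len(betas) == 2 else None
    return {"beta": beta, "se": se, "or": math.exp(beta),
            "z": z, "p": p, "q": q, "p_het": p_het}


# Placeholder inputs (NOT study results): discovery and replication estimates
result = fixed_effect_meta(betas=[1.10, 1.40], ses=[0.21, 0.60])
print({k: round(v, 4) for k, v in result.items()})
```

Because the weights are the inverse variances, the larger discovery dataset dominates the pooled estimate, which is consistent with a meta-analysis result that can remain genome-wide significant even when the smaller replication dataset alone is underpowered.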