The role of genetics in neurodegenerative dementia: a large cohort study in South China

Neurodegenerative dementias are a group of diseases with highly heterogeneous pathology and complicated etiology, and the genetic components of different neurodegenerative dementias may overlap. Here, 1795 patients with neurodegenerative dementias from South China were enrolled, including 1592 with Alzheimer’s disease (AD), 110 with frontotemporal dementia (FTD), and 93 with dementia with Lewy bodies (DLB). Targeted gene sequencing was performed. According to the American College of Medical Genetics (ACMG) guidelines, 39 pathogenic/likely pathogenic (P/LP) variants were identified in 47 unrelated patients in 14 different genes, including PSEN1, PSEN2, APP, MAPT, GRN, CHCHD10, TBK1, VCP, HTRA1, OPTN, SQSTM1, SIGMAR1, and abnormal repeat expansions in C9orf72 and HTT. Overall, 33.3% (13/39) of the variants were novel. P/LP variants were seen in 2.2% (35/1592) and 10.9% (12/110) of AD and FTD cases, respectively, and the overall molecular diagnostic rate was 2.6%. PSEN1 was the most frequently mutated gene (46.8%, 22/47), followed by PSEN2 and APP. Additionally, the mean age at onset of patients with P/LP variants (51.4 years; range, 30 to 83 years) was ~10 years earlier than that of patients without P/LP variants (p < 0.05). This study provides insight into the genetic spectrum and clinical manifestations of neurodegenerative dementias in South China and further expands the existing repertoire of P/LP variants in known dementia-associated genes. It provides a new perspective for basic research on genetic pathogenesis and new guidance for clinical practice in neurodegenerative dementia.


INTRODUCTION
Neurodegenerative dementias are a group of clinically heterogeneous diseases with frequently overlapping symptoms, such as multi-cognitive impairments, behavioral changes, and movement deficits 1 . Alzheimer's disease (AD) is the most common dementia worldwide, accounting for 60-80% of all dementia cases 2 . Frontotemporal dementia (FTD) is the second most common cause of neurodegenerative dementia after AD in patients younger than 65 years, responsible for 10.2% of cases 3 , and dementia with Lewy bodies (DLB) has been reported to be the second most common dementia subtype in older people after AD, accounting for 7.5% of all dementia cases 4 . However, the etiology of neurodegenerative dementias remains obscure and is thought to involve a combination of ageing, environmental, and genetic factors.
Recently, substantial progress has been made regarding the molecular genetics of neurodegenerative dementias. PSEN1, PSEN2, and APP are recognized as three causative genes for familial AD (FAD) and together explain the genetic background of 5-10% of early onset AD (EOAD, younger than 65 years). The estimated mutation frequencies of PSEN1, APP, and PSEN2 in EOAD are 80%, 15%, and 5%, respectively 5 . Likewise, FTD is a genetically and pathologically heterogeneous disorder with a higher incidence of familial cases than AD. A genetic etiology has been revealed in ~30-50% of FTD patients with a positive family history 6,7 . At present, more than 10 genes are related to FTD, and MAPT, GRN, and C9orf72 are the most common, accounting for 60% of all cases of inherited FTD 3 . In contrast, the genetic architecture of DLB remains largely elusive 8 . To date, only three genes have been confirmed to be related to DLB: APOE, GBA, and SNCA. However, growing evidence supports that DLB has a strong and unique genetic component 9 .
Interestingly, previous studies have suggested a potential genetic overlap between AD, FTD, and DLB. Notably, PSEN1, the most common etiology of EOAD, has also been found in patients with FTD and DLB [10][11][12][13] . Similarly, mutations in MAPT, GRN, and C9orf72 have also been detected at lower frequencies in AD and DLB patients [14][15][16] . Homozygosity for APOE4, the strongest genetic risk factor for AD, has also been reported in several studies to increase the risk of FTD and DLB 17,18 . In addition, mutations in SNCA have been shown to result in a wide phenotypic spectrum of DLB, Parkinson's disease (PD), multiple system atrophy (MSA), and FTD [19][20][21] .
In this study, we comprehensively analyzed the mutational spectrum of known dementia-associated genes in patients with neurodegenerative dementias from the South Chinese population using integrated targeted gene sequencing analysis. First, we systematically identified pathogenic and likely pathogenic (P/LP) variants of known dementia-associated genes, including known and novel variants, and summarized and compared the mutation frequencies among patients with different clinical diagnoses. Second, we summarized the clinical manifestations of the patients carrying P/LP variants in PSEN1, PSEN2, APP, MAPT, GRN, C9orf72, CHCHD10, HTRA1, TBK1, OPTN, SQSTM1, VCP, SIGMAR1, and HTT, in an attempt to relate gene mutations to clinical phenotypes. Third, we compared the age at onset (AAO) of patients with and without P/LP variants, and of patients carrying variants in different genes, to depict the AAO spectrum for these dementia-associated genes in our population. Finally, we analyzed APOE genotypes (non-carriers or carriers of APOE4) in the AD cohort and compared them across AD subgroups. Our study provides a new perspective for further basic research on neurodegenerative dementia, especially its genetic pathogenesis, and may facilitate clinical prediction, diagnosis, and genetic counseling.
In this study, PSEN1 was the most frequently mutated gene: 19 P/LP missense mutations were identified in 22 patients, four of which were novel, namely c.451G>A, p.V151M; c.679A>C, p.I227L; c.1139A>G, p.K380R; and c.1369A>G, p.M457V. Seven patients carried PSEN2 P/LP variants, including six missense mutations at the same amino acid residue (M239) and one frameshift mutation. Two were novel: c.T716C, p.239M>T and c.1180delG, p.A394Pfs*8. All patients with the variants had a positive family history except for one who carried PSEN2 p.239M>T. Meanwhile, two APP missense mutations were identified in four FAD probands: c.2143G>A, p.V715M, and c.2149G>A, p.V717I (Table 2). The distribution of PSEN1/PSEN2/APP P/LP variants is shown in Fig. 2. Interestingly, all identified P/LP variants of PSEN1/PSEN2/APP were located in hydrophobic regions or in the endoproteolytic cleavage regions.
Additionally, we found two female AD patients who carried nonsense mutations in CHCHD10 (c.283C>T, p.Q95*) and HTRA1 (c.589C>T, p.R197*), respectively. The patient who carried CHCHD10 p.Q95* showed memory decline at 52 years and gradually developed language dysfunction, behavioral changes, bradykinesia, and depression. Brain MRI showed bilateral atrophy of the temporoparietal lobes and hippocampus, and cerebrospinal fluid (CSF) examination showed decreased Aβ42 levels and Aβ42/Aβ40 ratio, with increased phospho-tau (p-tau) and total tau (t-tau). In addition, Pittsburgh compound B (PiB)-PET showed diffuse amyloid deposition throughout the cerebral cortex. The patient who carried HTRA1 p.R197* mainly presented with typical forgetfulness of recent events and a decline in daily living ability at 49 years. Brain MRI showed multiple spot-like hyperintensities in the deep bilateral frontotemporal lobes and paraventricular region, with no microbleeds on susceptibility-weighted imaging. The level of Aβ42 in CSF was decreased and p-tau was increased, which supported the diagnosis of AD.
As for clinical characteristics, all patients carrying P/LP variants in the FTD cohort showed memory decline, 58.33% (7/12) had language impairment and mental and behavioral changes, and 25% (4/12) also had sensory and movement disorders. Interestingly, one patient showed personality changes and language impairment, as well as abnormal emotional responses, at baseline. In the fifth year after onset, she developed memory decline. Brain MRI showed bilateral frontal lobe atrophy, and bvFTD was initially considered. However, molecular testing revealed that she carried a heterozygous CAG repeat expansion in HTT, which supported the diagnosis of Huntington's disease (HD).
Mutational frequencies of all (left) and each (right) known cognitive impairment-associated genes in the different dementia cohorts. a AD cohort. b FTD cohort. c Entire cohort. Variants were classified as pathogenic or likely pathogenic according to the standards and guidelines of the ACMG. 'Pathogenic' means that the patients had pathogenic variants in known cognitive impairment disease-associated genes, and 'likely pathogenic' means that the patients had likely pathogenic variants in known cognitive impairment disease-associated genes. ACMG American College of Medical Genetics, AD Alzheimer's disease, FTD frontotemporal dementia.

Spectrum of age at onset
Moreover, the mean AAO was significantly younger in patients with P/LP variants in the AD cohort and the entire cohort, while no difference was found in the FTD cohort (Fig. 3a-c). Specifically, the mean AAO of patients with P/LP variants in the entire cohort was 51.4 ± 9.5 years, ~10 years younger than the mean AAO of those without P/LP variants (64.8 ± 10.7 years) (p < 0.001); 89.4% of the patients with P/LP variants were younger than 65 years at onset. Meanwhile, we analyzed the spectrum of AAO in patients with P/LP variants of different genes (genes with two or more mutations were included). The results showed that the mean AAO of subjects with P/LP variants of PSEN1, PSEN2, and APP (47.5 ± 10.7 years, 53.2 ± 8.1 years, and 46.5 ± 4.2 years, respectively) was significantly lower than that of non-carriers (p < 0.05) (Fig. 3d).
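As a rough arithmetic check of the entire-cohort comparison, the following sketch computes a Welch t statistic from the reported means and standard deviations alone. The group sizes (47 carriers and 1795 − 47 = 1748 non-carriers) are an assumption, since the text does not state how many patients had a recorded AAO, and the Welch test is used purely for illustration; it is not necessarily the test applied in the study. The function name `welch_t` is hypothetical.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance t test from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Reported means and SDs (years); the group sizes are assumed, not reported
t_stat, dof = welch_t(64.8, 10.7, 1748, 51.4, 9.5, 47)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Under these assumptions the statistic is about 9.5 on roughly 49 degrees of freedom, which is in line with the reported p < 0.001.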
In addition, we performed subgroup analyses by family history and by APOE4 status to compare AAO between the respective groups. The AAO of FAD patients was significantly younger than that of SAD patients (63.2 ± 11.4 and 64.9 ± 10.6 years, respectively, p = 0.005), while no significant difference was found between APOE4 carriers and non-carriers (p = 0.953) (Fig. 3e, f).
To identify the factors affecting AAO while accounting for confounders, we conducted multiple linear regression analysis with AAO as the dependent variable. With gender, disease duration, educational attainment, APOE genotype, family history of dementia, MMSE score, mutation status, and clinical diagnosis entered as independent variables, the model showed that MMSE score (B = −0.135, p < 0.001), disease duration (B = −0.421, p = 0.001), and mutation carrier status (B = −13.44, p < 0.001) were significantly associated with AAO.
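To make the meaning of the unstandardized coefficients B concrete, the sketch below fits a multiple linear regression by solving the normal equations on a small synthetic data set. Everything here is illustrative: the data are made up and noise-free, only three of the study's covariates are included, the intercept of 70 is arbitrary, and the helper names `solve_linear_system` and `fit_ols` are hypothetical. The reported B values are reused as the "true" coefficients only so that the recovered numbers are recognizable; the study itself used SPSS.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X'X) b = X'y."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic predictors: (MMSE score, disease duration in years, carrier status 0/1)
rows = [(20, 2, 0), (15, 4, 0), (10, 6, 1), (25, 1, 1),
        (18, 3, 0), (12, 5, 1), (22, 7, 0), (8, 2, 1)]
# Intercept is arbitrary; slopes reuse the reported B values for illustration
true_b = [70.0, -0.135, -0.421, -13.44]
y = [true_b[0] + sum(b * v for b, v in zip(true_b[1:], r)) for r in rows]

coefs = fit_ols(rows, y)
print([round(c, 3) for c in coefs])
```

Read this way, B = −13.44 for mutation status means that, with the other covariates held fixed, carriers are predicted to have an onset about 13 years earlier than non-carriers.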

Characteristics of APOE genotypes
As for the distributions of APOE genotypes, there was no significant difference across the AD, FTD, and DLB cohorts (p > 0.0166; Bonferroni corrected). Because APOE4 is the strongest genetic risk factor for AD, we further compared its distribution between EOAD and LOAD patients and between FAD and SAD patients, and found no significant difference in APOE4 frequency (EOAD vs LOAD: p = 0.501; FAD vs SAD: p = 0.153). Further subgroup analysis in the AD cohort showed that the proportion of APOE4-negative patients was higher than that of APOE4-positive patients (p < 0.001, p = 0.007, and p < 0.001, respectively) (Fig. 4a, b). In addition, we found no significant difference in the distribution of APOE4 between variant carriers and non-carriers in the AD cohort (p = 0.281). Meanwhile, a higher percentage of APOE4 (+) patients was found among patients with P/LP variants than among those without P/LP variants in the AD cohort (p < 0.001) (Fig. 4c).
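The Bonferroni-corrected threshold quoted for the cohort comparison can be reproduced with a few lines. This sketch assumes that the correction was applied to the three pairwise comparisons among the AD, FTD, and DLB cohorts, which is consistent with the stated threshold but is not spelled out in the text; the function name `bonferroni_alpha` is hypothetical.

```python
from itertools import combinations

def bonferroni_alpha(alpha, n_tests):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_tests

cohorts = ["AD", "FTD", "DLB"]
pairs = list(combinations(cohorts, 2))  # all pairwise cohort comparisons
threshold = bonferroni_alpha(0.05, len(pairs))
print(pairs, threshold)
```

With three comparisons the threshold is 0.05/3, which the text reports truncated as 0.0166.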

DISCUSSION
In this study, we determined the mutational spectrum of 36 known dementia-associated genes in patients clinically diagnosed with neurodegenerative dementia, including AD, FTD, and DLB, in a South China population sample using integrated targeted gene sequencing analysis. This is the first report of the distribution of gene mutations in patients with neurodegenerative dementias from South China. We observed that integrated gene analysis could be an effective tool for detecting potential genetic causes in neurodegenerative dementias with high genetic heterogeneity or overlapping phenotypic features.
Mutations in PSEN1 are the most common cause of EOAD, and PSEN1 was also the most frequently mutated gene in patients with FAD. To date, more than 300 mutations in PSEN1 have been identified to be associated with FAD. In this study, four novel variants were identified, which expanded the mutational spectrum of PSEN1. The AAO of PSEN1 mutation carriers in our study (47.5 years) was older than that reported by Ryan et al. (43.6 years) in 168 AD patients with PSEN1 mutations 22 , but younger than that reported by Jia et al. in a large FAD cohort from China (50.59 years) 23 . Of interest, in addition to the PSEN1 mutations mentioned above, we found an older female (83 years) in the DLB cohort carrying a novel mutation (M270L), with an APOE genotype of 3/4. Several algorithms predicted that the variant was not damaging, whereas the nearby mutations (R269G, R269H, and L271V) have been reported to be associated with FAD [24][25][26] . Whether the clinical phenotype of the patient is caused by the novel mutation or by the APOE genotype is unclear; we will perform functional studies to further characterize the variant. Meanwhile, regarding clinical phenotypes, PSEN1 mutation carriers often present with atypical cognitive symptoms and additional neurological features 22 . However, in this study, patients mainly presented with amnesia, language impairment, mental and behavioral changes, and movement disorders, which is one limitation of this study. This might have two explanations. First, some atypical symptoms might not occur at an early stage of the disease, and follow-up is necessary. Second, the mutation locations may lead to distinct phenotypes; for example, atypical cognitive presentations and pyramidal signs were seen more frequently in association with PSEN1 mutations involving exon 8 22 , suggesting that multiple factors could contribute to the phenotypic heterogeneity of PSEN1-related AD.
In contrast to PSEN1, only 18 pathogenic mutations within PSEN2 have been reported, most of which occurred in European and African populations. In this study, PSEN2 P/LP variants were identified in seven patients, six of whom were FAD cases. Previous studies showed that the AAO of PSEN2-associated cases varies widely, from 45 to 88 years, which is more than 10 years later than the mean AAO for PSEN1-related cases [27][28][29] and is consistent with our results. Interestingly, six patients with a substitution at PSEN2 amino acid residue 239 were identified, including M239V, M239I, and M239T. Among them, M239V has been reported in European populations to elevate Aβ42 levels and the Aβ42/Aβ40 ratio, and to exhibit a partial loss of function with respect to the C-terminal fragment-γ as well as a substantial decrease in Aβ40 levels [30][31][32] , but it has been absent from Asian cohorts so far. Our findings suggest that substitutions at this residue may be common causative variants in the South Chinese population. In addition, the clinical phenotypes of carriers of the M239V mutation varied widely. Our findings, together with previous reports, further suggest that phenotypic heterogeneity exists even at the same codon because of different amino acid substitutions 30,32,33 .
Fig. 3 AAO spectrum of known cognitive impairment disease-associated genes in AD, FTD, and entire cohorts. Comparison of AAO in all patients with P/LP variants of known cognitive impairment disease-associated genes and patients without P/LP variants in known cognitive impairment disease-associated genes. a AD cohort. b FTD cohort. c Entire cohort. The dashed red line refers to the mean AAO of patients with P/LP variants in the corresponding cohorts, whereas the dashed line refers to the mean AAO of patients without P/LP variants in known cognitive impairment disease-associated genes in the corresponding cohort. d Spectrum of AAO in patients with P/LP variants of each cognitive impairment disease-associated gene (only genes carried by two or more patients were included), in patients with and without P/LP variants of known cognitive impairment disease-associated genes. e Spectrum of AAO in FAD and SAD patients. f Spectrum of AAO in patients with and without APOE4. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001, ns no significance. AAO age at onset, AD Alzheimer's disease, FTD frontotemporal dementia, P/LP pathogenic or likely pathogenic, FAD familial Alzheimer's disease, SAD sporadic Alzheimer's disease.
In accordance with other populations, the common mutation sites of APP were residues 715 and 717. Taken together with our previous reports, our team has found four families carrying mutations at these sites 34 . Amino acid residues 715 and 717 are located near the γ-secretase cleavage site, and mutations at these sites may increase the hydrophobicity of the APP TM domain, anchoring the protein within the membrane and elevating the Aβ42/Aβ40 ratio 35,36 . Interestingly, patients with mutations at these sites often have non-memory symptoms and can be misdiagnosed with FTD, because the behavioral problems occur earlier than the memory deficits. Overall, among the PSEN1/PSEN2/APP P/LP variants found in LOAD patients, PSEN1 M457V and PSEN2 A394Pfs*8 were novel, and further functional validation is warranted.
In addition, we identified CHCHD10 and HTRA1 mutations in the AD cohort. CHCHD10 has been associated with a large spectrum of diseases, including FTD, ALS, AD, cerebellar ataxia, mitochondrial myopathy, late-onset spinal motor neuronopathy, and Charcot-Marie-Tooth disease type 2 [37][38][39][40] . Previously, we reported a late-onset AD patient with a CHCHD10 mutation 41 . Homozygous HTRA1 mutations are known to cause CARASIL 42 , while there is also evidence that heterozygous HTRA1 mutations, which might impair the HTRA1 activation cascade or prevent the formation of stable trimers, are related to autosomal dominant hereditary cerebral small vessel disease with delayed onset [43][44][45] . In this study, the female AD patient carrying the HTRA1 mutation presented with typical amnesia symptoms at 49 years without any other neurological or extra-neurological symptoms. Meanwhile, the Fazekas score of periventricular white matter hyperintensities was 1, the APOE genotype was 3/3, and the core CSF biomarkers showed A+T+N-. However, the effect of the heterozygous mutation on the pathophysiologic process of AD remains elusive, and further functional studies are needed. These results further indicate that the AD phenotype can be caused not only by mutations in PSEN1, PSEN2, and APP, but that variants in other genes might also cause AD-like symptoms. Further follow-up is necessary.
Notably, we identified double mutations in a 52-year-old female (PSEN2 p.M239I and MAPT p.R5H), but her daily living ability remained intact, and the double mutation did not accelerate the cognitive decline, further expanding the phenotype spectrum of the mutation and supporting the phenotypic heterogeneity among subjects carrying the same MAPT mutations. Further in vivo and in vitro studies are needed to determine the effect of MAPT and PSEN2 mutations on the pathology and pathogenesis of AD.
Many different genes are reported to cause FTD, of which MAPT, GRN, and C9orf72 are the three most common [46][47][48] . Apart from these three common genes, we also found variants in another seven genes, including CHCHD10, OPTN, SQSTM1, VCP, SIGMAR1, TBK1, and HTT [49][50][51][52] . In addition to genetics, the clinical phenotypes of FTD are also highly heterogeneous. In this study, we did not observe the classic phenotypes of VCP mutations, such as inclusion body myopathy with Paget's disease of the bone 53,54 ; we will follow up the patient to see how the symptoms evolve. Moreover, the initial misdiagnosis of the patient carrying a heterozygous CAG repeat expansion in HTT further indicates that the overlap of clinical phenotypes is one of the main reasons for the difficulty in diagnosing neurodegenerative diseases. Genetic analysis is an effective method to improve diagnostic certainty.
Additionally, in our study, the proportion of APOE4-positive cases did not differ significantly between FAD and SAD or between EOAD and LOAD, which is inconsistent with previous studies reporting that APOE4 exerts its maximal effect in EOAD 55,56 . Other genetic or environmental factors may play an important role in the onset and pathogenesis of AD.
This study represents a comprehensive and systematic screening of 36 dementia-associated genes in AD, FTD, and DLB patients from South China, although it has some limitations. First, we focused only on 36 known dementia-associated genes, not on susceptibility genes, risk loci, or new candidate genes, which may play important roles in neurodegenerative dementia. Second, we screened only neurodegenerative dementia patients, and no controls were assessed to compare background frequencies of the P/LP variants. Lastly, we did not perform functional experiments to validate the novel variants identified in this study.
In conclusion, we have conducted the most systematic survey of the mutational spectrum of neurodegenerative dementia patients in the South Chinese population, which further expands the mutational spectrum of dementia-related genes and provides evidence of genetic heterogeneity and possible overlap between phenotypes. Our results may prove beneficial for clinical prediction, diagnosis, and genetic counseling and may generate hypotheses for future basic research on the genetic pathogenesis of neurodegenerative dementia.
Fig. 4 The percentage of APOE4 in AD cohort. a The percentage of APOE4 in LOAD and EOAD. b The percentage of APOE4 in SAD and FAD. c The percentage of APOE4 in AD patients with and without P/LP variants. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001, ns no significance. LOAD late-onset Alzheimer's disease, EOAD early-onset Alzheimer's disease, SAD sporadic Alzheimer's disease, FAD familial Alzheimer's disease.

Study participants
A total of 1795 patients with neurodegenerative dementias, including 1592 with AD, 110 with FTD, and 93 with DLB, were recruited at the Xiangya Hospital, Central South University, between February 2004 and October 2020. All patients were unrelated probands. The demographic and clinical characteristics are summarized in Table 1. All subjects had been clinically diagnosed with AD, FTD, or DLB according to international guidelines. This study was approved by the Ethics Committee of Xiangya Hospital, Central South University, China. Written informed consent was obtained from each participant or guardian.
Moreover, the (GGGGCC)n repeats in C9orf72 and the (CAG)n repeats in HTT were analyzed in all individuals using previously reported repeat-primed polymerase chain reaction and capillary electrophoresis methods 61,62 .

Sanger sequencing
All P/LP variants were confirmed by PCR amplification and Sanger sequencing using Big Dye Terminator V3.1 on an ABI 3730xl DNA analyzer (Applied Biosystems, Foster City, USA). The DNA sequences were then analyzed using Sequencher software version 4.2. All primers were designed using Primer 5, and the primer sequences and PCR reaction conditions are listed in Supplementary Table 3. Variants of unknown significance identified in this study are shown in Supplementary Table 4. The study workflow is shown in Fig. 5.

Statistical analysis
Quantitative variables such as age at onset, age at diagnosis, disease duration, educational attainment, and cognitive assessment score are expressed as the mean ± SD. All data were tested for normality and homogeneity of variance using the Shapiro-Wilk test and the Levene test for equality of variances. Comparisons between two independent groups were performed using the t test or the Mann-Whitney U test. The χ² test and the Fisher exact test were used to analyze categorical data, such as the proportion of female patients, family history, the percentage of APOE4-positive or APOE4-negative patients, and the proportions of EOAD and LOAD patients and of P/LP variant carriers and non-carriers. Multiple linear regression analysis was performed to adjust for confounding factors and explore the factors affecting the AAO. All tests were two-tailed, and p < 0.05 was considered statistically significant. All analyses were performed using SPSS v.26 (IBM). Data were visualized using Prism 8 (GraphPad).
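For readers who want to see what the two categorical tests named above compute, the following is a minimal standard-library sketch for a 2 × 2 table. It is not the SPSS procedure used in the study; the function names are hypothetical, the χ² version omits the continuity correction, and the counts in the usage lines are invented for illustration rather than taken from the cohort.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail probability equals erfc(sqrt(stat / 2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact test: sum the probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts (not from the study): rows are two patient groups,
# columns are APOE4 carriers and non-carriers
stat, p_chi = chi_square_2x2(30, 70, 45, 55)
# A small hypothetical table, where an exact test is the usual choice
p_fisher = fisher_exact_2x2(3, 1, 1, 3)
print(f"chi2 = {stat:.2f}, p = {p_chi:.4f}; Fisher p = {p_fisher:.4f}")
```

A common convention, not stated in the paper, is to prefer the Fisher exact test when expected cell counts are small and the χ² test otherwise.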

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The raw sequencing data analyzed during this study have been deposited in the European Variation Archive under accession number PRJEB46658. All other data are available from the corresponding authors on reasonable request.