Introduction

Vascular cognitive impairment (VCI) is a term used for cognitive impairment associated with cerebrovascular disease [1]. Vascular dementia (VaD) is the most severe form of VCI and the second most common cause of dementia after Alzheimer’s disease (AD) [2]. An important cause of VCI is cerebral small vessel disease (CSVD), which consists of a heterogeneous group of pathological processes that affect the small vessels of the brain [3]. Most CSVD patients suffer from a sporadic disorder, but familial monogenic forms of the disorder have also been described [4]. CADASIL (cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy) is the most frequent subtype of familial CSVD and is caused by variants affecting function in the NOTCH3 gene [5]. CARASIL (cerebral autosomal recessive arteriopathy with subcortical infarcts and leukoencephalopathy) is an autosomal recessive CSVD caused by pathogenic HTRA1 gene variants [6], although an autosomal dominant form of the disease has also been identified [7]. Autosomal dominant COL4A1-related CSVD is usually caused by pathogenic glycine missense variants within the triple-helical domain of the COL4A1/COL4A2 collagen genes [8]. Multi-infarct dementia of Swedish type and PADMAL (pontine autosomal dominant microangiopathy and leukoencephalopathy) were recently found to be caused by variants in a predicted binding site for miR-29 microRNA located within the 3′UTR of the COL4A1 gene [9, 10]. These diseases differ from other COL4A1-related CSVD, and the variants found in both the Swedish multi-infarct dementia family and the PADMAL cases disrupt the same miR-29 binding site, leading to upregulation of COL4A1 [9, 10].

Despite the variants identified so far, many CSVD cases remain genetically unexplained even when they appear familial. In this study, we used whole-exome sequencing (WES) to study the genetic background of a cohort of 35 Finnish CSVD patients. We also investigated the prevalence of variants in the miR-29 binding site of COL4A1 in a cohort of 60 Finnish CSVD patients.

Subjects and methods

The study was approved by the Ethical Committee of the Hospital District of Southwest Finland. Approval for the use of patient DNA samples was obtained from the National Supervisory Authority for Welfare and Health (Valvira) and the Hospital District of Southwest Finland. A permit for access to medical records was obtained from the National Institute for Health and Welfare.

Patients

A cohort of Finnish patients with suspected CADASIL was selected from 365 patients referred for diagnostic testing for NOTCH3 in the Department of Medical Genetics of Turku University Hospital between 1998 and 2004. All patients had screened negative for the most common variants affecting function in NOTCH3 (p.Arg133Cys and p.Arg182Cys). Two of the patients had also screened negative for variants in NOTCH3 exons 3–8, 11, and 18–20, and one patient had screened negative for variants in NOTCH3 exons 3, 4, and 8 (NOTCH3 exons numbered consecutively from 1 to 33 according to NM_000435.2). Medical records of the cohort of 365 patients were reviewed to confirm the diagnosis or clinical phenotype. Characteristics of the whole Finnish cohort are summarized in Supplementary Table I. After examination of the medical records, 60 patients from the cohort of 365 were confirmed to have a diagnosis of VCI and were selected for sequence analysis of the miR-29 microRNA binding site in the 3′UTR of COL4A1 (Fig. 1). Of these 60 VCI patients, 35 were selected for whole-exome sequencing (Fig. 1). The inclusion criteria were the presence of VCI with white matter changes in magnetic resonance imaging, age at onset up to 75 years and/or family history of dementia or stroke. Family history was determined from the medical notes and was considered positive if the patient had at least one relative suffering from dementia or stroke. The inclusion criteria were used to select, from the cohort of 60 patients, the best candidates with adequate clinical information for investigating familial forms of VCI.

Fig. 1: Schematic presentation of the study describing the workflow of selection of patients and genetic examinations.

AAO age-at-onset, MRI magnetic resonance imaging, VCI vascular cognitive impairment, WES whole-exome sequencing.

Sanger sequencing of the miR-29 microRNA binding site in the 3′UTR of COL4A1

The miR-29 microRNA binding site in the 3′UTR of COL4A1 was sequenced in 60 of the samples studied. Sequencing was performed after PCR amplification with Applied Biosystems BigDye terminator version 3.1 sequencing chemistry in an ABI3730xl DNA analyzer (region sequenced: NG_011544.2(NM_001845.5):c.5001_*145). Primers are available upon request. Sequences were analysed using SeqScape Software (Applied Biosystems, Thermo Fisher Scientific, Waltham, MA, USA).

Whole-exome sequencing (WES)

Details of library preparation and data processing are shown in Supplemental Materials. The stroke-gene panels SGP1 and SGP2 compiled by Ilinca et al. [11] were utilized in the variant analysis. Variants located in 168 genes/loci known to be associated with monogenic causes of stroke [11] were extracted from the whole-exome data. Mitochondrial genes were excluded from this analysis. Variants were filtered out if they were located in a known genomic duplication region or if they did not pass variant quality score recalibration (VQSR). Variants included in subsequent analyses had a high or moderate impact annotation score, which excluded synonymous and intronic variants that were not located within splice sites. Variants reported at this stage had an allele frequency <1% in gnomAD (v2.1.1) and passed the QC filters described by Patel et al. [12]. In addition, applying the same QC steps, we searched for rare variants by evaluating all non-synonymous and splice site variants that were absent from gnomAD. We also used the Exomiser software (v11.0.0) to prioritize variants related to CADASIL (ORPHA:136). Exomiser aids in finding disease-causing variants from WES data by annotating, filtering, and prioritizing variants according to user-defined criteria. With Exomiser, autosomal dominant and recessive inheritance models were analyzed to compile a list of the three to four top-ranked candidate variants. Only variants with an allele frequency <1% in gnomAD were considered in the Exomiser analysis. The workflow of the WES data analysis is presented in Supplementary Fig. 1. The in silico prediction tools SIFT, PolyPhen2, MutationTaster, LRT, MutationAssessor, and CADD were used to predict variant pathogenicity. Only variants with a CADD score ≥10 were considered potentially pathogenic. Variants were classified according to the American College of Medical Genetics and Genomics (ACMG) criteria [13].
Possibly causative variants were submitted to ClinVar (submission ID: SUB7388577, accession numbers SCV001250686, SCV001250687, SCV001250688, SCV001250689, SCV001250690, SCV001250691, SCV001250692, SCV001250693, SCV001250694, SCV001250695, SCV001250696, SCV001250697, SCV001250698, SCV001250699, SCV001250700).
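
The panel-based filtering steps described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Variant` record, its field names, the impact labels, and the four-gene panel subset are hypothetical, and the example variants are invented toy values rather than study data. The sketch also assumes that a variant is excluded if it fails either the duplication-region check or the VQSR check, and that a variant absent from gnomAD counts as rare.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative subset only; the actual panels (SGP1 and SGP2) cover 168 genes/loci.
PANEL_GENES = {"NOTCH3", "HTRA1", "COL4A1", "COL4A2"}
MAX_GNOMAD_AF = 0.01                 # allele frequency must be below 1%
MIN_CADD = 10.0                      # CADD score must be at least 10
KEPT_IMPACTS = {"HIGH", "MODERATE"}  # hypothetical impact labels


@dataclass
class Variant:
    gene: str
    impact: str                  # annotation impact, e.g. "HIGH", "MODERATE", "LOW"
    gnomad_af: Optional[float]   # None when the variant is absent from gnomAD
    cadd: float
    passes_vqsr: bool
    in_duplication_region: bool


def is_candidate(v: Variant) -> bool:
    """Return True if the variant survives every filter in this sketch."""
    if v.gene not in PANEL_GENES:
        return False
    # Assumption: failing either quality check is enough to exclude the variant.
    if v.in_duplication_region or not v.passes_vqsr:
        return False
    if v.impact not in KEPT_IMPACTS:
        return False
    if v.gnomad_af is not None and v.gnomad_af >= MAX_GNOMAD_AF:
        return False
    return v.cadd >= MIN_CADD


# Invented toy variants, one per outcome of interest.
variants = [
    Variant("NOTCH3", "MODERATE", None, 25.0, True, False),  # absent from gnomAD, kept
    Variant("COL4A1", "MODERATE", 0.05, 22.0, True, False),  # too common
    Variant("HTRA1", "LOW", 0.0001, 12.0, True, False),      # impact too low
    Variant("COL4A2", "HIGH", 0.002, 8.0, True, False),      # CADD below 10
]
candidates = [v for v in variants if is_candidate(v)]
print([v.gene for v in candidates])  # prints ['NOTCH3']
```

Ordering the checks from cheapest to most specific keeps the function easy to read; in a real analysis these filters would typically be applied to annotated VCF records rather than to hand-built objects.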

Results

WES results

We used WES to identify the variants underlying CSVD in 35 Finnish patients. A positive family history was identified from the patient records for 46% (16/35) of the patients. Of the subjects, 54% (19/35) were women. Clinical characteristics of the patients studied by WES are summarized in Table 1.

Table 1 Characteristics of the 35 patients selected for WES.

Six of the patients (17%) carried variants possibly affecting function in NOTCH3, HTRA1, COL4A1, or COL4A2, which are genes known to be associated with CSVD (Table 2). In addition, seven of the patients (20%) carried variants possibly affecting function in genes associated with other neurological or stroke-related conditions (Table 2). All results of the analyses are presented in Supplementary Tables II–IV. Heterozygous NOTCH3 variants were identified in two patients: c.323G>A, p.(Cys108Tyr) in exon 3 and c.2149C>T, p.(Arg717Cys) in exon 14. Both are missense variants resulting in either the gain or the loss of a cysteine residue in the EGF-like repeats of the NOTCH3 protein, which is the most common type of variant causing CADASIL. The NOTCH3 variant c.323G>A, p.(Cys108Tyr) has been reported earlier in a CADASIL patient [14]. The patient carrying the c.323G>A variant had a phenotype consistent with CADASIL and a positive family history. The other NOTCH3 variant, c.2149C>T, p.(Arg717Cys), has not been reported before. It was detected in a VaD patient whose phenotype included multiple strokes, atherosclerosis, cardiomyopathy, and heart failure. The patient also had multiple vascular risk factors: diabetes, obesity, and smoking.

Table 2 Possibly causative variants identified by WES.

Furthermore, we identified a heterozygous HTRA1 variant c.961G>A, p.(Ala321Thr), which has been reported in a CARASIL patient who was compound heterozygous with another HTRA1 variant [15]. Homozygous or compound heterozygous variants affecting function in HTRA1 are known to cause CARASIL, a rare autosomal recessive CSVD [6], whereas heterozygous HTRA1 variants have been identified in autosomal dominant CSVD, which is characterized by delayed onset and the absence of the extra-neurological features typical of CARASIL [7, 16]. The VaD patient carrying the HTRA1 c.961G>A variant had a phenotype consistent with HTRA1-CSVD. The patient's age at onset was 70 years, and her phenotype included cerebral microangiopathy, lacunar infarcts, migraine with aura, and hypertension; she also suffered from Ménière’s disease. Her sibling had a similar phenotype. The patient was not recorded to have extra-neurological features. In addition to the HTRA1 variant, the patient carried the COL4A1 variant c.401C>T, p.(Pro134Leu), which was also identified in another patient in our study. We also detected two other collagen variants in two patients, COL4A1 c.2440G>A, p.(Gly814Arg) and COL4A2 c.4291C>T, p.(Arg1431Cys), both located in the triple-helical domain of the protein. The COL4A1 c.2440G>A, p.(Gly814Arg) variant was identified in a patient who also carried the PSEN2 variant c.53C>T, p.(Thr18Met). The patient had the youngest age at onset in the study cohort (17 years), and his phenotype included vascular leukoencephalopathy, multiple strokes, epilepsy, and psychiatric features. Variants affecting function in PSEN2 have been found in patients with early-onset AD [17]. The COL4A2 variant c.4291C>T, p.(Arg1431Cys) was identified in a CSVD patient whose phenotype included VCI, migraine, mild hearing impairment, and balance impairment.

One of the patients carried the APP missense variant c.1795G>A, p.(Glu599Lys), which has previously been reported in patients with Parkinson’s disease or dementia with Lewy bodies [18,19,20]. Variants affecting function in APP are a well-known cause of early-onset AD and cerebral amyloid angiopathy (CAA). Heterozygous variants CCM1 (KRIT1) c.1565T>C, p.(Ile522Thr) and ITM2B c.193C>T, p.(Leu65Phe) were identified in a VaD patient whose phenotype also included behavioral changes and hearing impairment. ITM2B loss-of-function variants resulting in lengthened protein products cause autosomal dominant CAA (Familial British and Danish dementia) [21, 22], but the ITM2B gene has also been linked to retinal dystrophy [23]. Variants in the KRIT1 (CCM1) and CCM2 genes cause autosomal dominant cerebral cavernous malformations, which are vascular anomalies in the brain [24,25,26]. We also identified a novel heterozygous CACNA1A variant, c.1348T>C, p.(Ser450Pro). CACNA1A is a gene associated with familial hemiplegic migraine, episodic ataxia type 2, and spinocerebellar ataxia type 6 [27]. The patient carrying the CACNA1A variant c.1348T>C, p.(Ser450Pro) suffered from migraine with aura, and her phenotype also included secondary parkinsonism and dysphagia. In addition, we detected a novel heterozygous variant c.115G>C, p.(Asp39His) in the TMEM106B gene. The TMEM106B gene has been identified as a risk factor for frontotemporal dementia (FTD), but it has also been linked to hypomyelinating leukodystrophy [28, 29].

Furthermore, we detected variants in C1R and NPPA, genes that are linked to stroke-related conditions. Pathogenic variants in the C1R gene are associated with autosomal dominant periodontal Ehlers–Danlos syndrome [30], a syndrome that may include vascular anomalies [31]. However, the heterozygous C1R variant c.336G>C, p.(Met112Ile) detected in our study is present in 0.2% of the Finnish population according to the gnomAD database, and its clinical significance is interpreted as both uncertain and likely benign in the ClinVar database. The patient carrying the C1R variant c.336G>C, p.(Met112Ile) also carried the CCM2 variant c.1346T>G, p.(Ile449Ser). Her phenotype included walking and balance impairment, hypercholesterolemia, diabetes, myocardial infarction, and coronary artery disease, and she had a positive family history of stroke. The NPPA gene is linked to familial atrial fibrillation [32, 33], which may cause cardioembolic stroke. In our study, the heterozygous NPPA variant c.377G>A, p.(Arg126Gln) was identified in a patient who suffered from angina pectoris.

Sanger sequencing of the miR-29 microRNA binding site in the 3′UTR of COL4A1

A total of 60 Finnish CSVD patients were screened for variants in the miR-29 microRNA binding site in the 3′UTR of COL4A1. Sanger sequencing did not reveal any variants at this site.

Discussion

Although VCI is very commonly found in subjects with dementia, research on the disease lags behind that on other dementing conditions. There are no common standards in studies of VCI and no universally accepted diagnostic criteria for the disease, which complicates the reproducibility of research in this area. Research on VCI has also lacked large, well-characterized patient cohorts. Even though monogenic forms of VCI are considered rare, the identification and characterization of these forms of disease may contribute considerably to the understanding of the molecular pathogenesis of dementing diseases. With this in mind, we investigated the genetics of VCI by studying a homogeneous Finnish cohort with well-defined clinical features, ascertained by individual review of medical records. Our study resulted in the detection of several variants possibly affecting function both in known CSVD genes and in genes linked to other neurological disorders or stroke-related conditions.

Six patients carried variants possibly affecting function in the known CSVD genes NOTCH3, COL4A1, COL4A2, and HTRA1, accounting for as many as 17% of all the patients. The relatively high proportion of these variants probably reflects the original selection of patients for CADASIL (NOTCH3) testing, and our selection criteria for exome sequencing might have further favored a CSVD-type phenotype. Even so, these results support pathogenic roles of variants in COL4A1, COL4A2, and HTRA1 in CSVD and VCI. This is in line with the recent study by Ilinca et al., in which variants in NOTCH3, COL4A1, and COL4A2 were found in a WES study of patients with a suspected monogenic form of stroke [34].

Interestingly, we also detected several variants in genes associated with other dementing or neurodegenerative disorders, which may indicate overlapping pathologies between these disorders. Detection of variants in the AD-linked genes APP and PSEN2 may represent a genetic connection of CSVD with AD pathology. Several studies have shown a relationship between CSVD and AD [35]. AD very often occurs concomitantly with vascular or other neurodegenerative pathology [36], but it is still unknown how the pathologies of AD and CSVD interact with each other [37]. One of the study subjects carried variants in both the CSVD-linked gene COL4A1 and the AD-linked gene PSEN2, so it is possible that both variants had a role in his disease, which started at an exceptionally early age (17 years). In this study, three patients carried more than one variant possibly affecting function that may have a role in the patient's disease, indicating a possible oligogenic cause of VCI. In addition to AD-linked genes, we observed variants possibly affecting function in genes linked to FTD and migraine. Although there are not many studies on the relationship between vascular impairment and FTD, an effect of vascular lesions in the pathogenesis of FTD has been suggested [38]. It is also possible that phenotypic similarities led to the detection of variants in genes linked to FTD and migraine in our study.

Distinguishing VCI from other forms of dementia and neurodegenerative diseases may be challenging, which highlights the importance of evaluating the clinical phenotype of the study subjects when studying a particular disease entity. In our study, the clinical information of the patients was obtained from the medical records, but the amount of available information varied between patients. A large proportion of the subjects were later diagnosed with a disease other than VCI, although CADASIL testing was originally performed (Supplementary Table I). Furthermore, less than half (46%) of the patients showed a positive family history, and the rest of the subjects possibly represent sporadic cases. Samples from the relatives of the patients were not available, and therefore we could not analyse the segregation of the detected variants. In addition, the cohort did not include any cases confirmed by neuropathological examination, which could have facilitated the diagnosis and characterization of patients.

Previous studies have shown that PADMAL and multi-infarct dementia of Swedish type are caused by variants in an untranslated region of COL4A1 [9, 10], but there is limited knowledge on the prevalence of these variants among CSVD patients in different populations. Here we screened the miR-29 microRNA binding site in the 3′UTR of COL4A1 in 60 CSVD patients of Finnish origin, but found no variants in our cohort. The small sample size and possible clinical heterogeneity of the cohort included in this analysis may explain the negative result. Despite these limitations, this analysis suggests that COL4A1 3′UTR variants are a very rare cause of CSVD and that they may be restricted to certain populations and/or clinical phenotypes. Further studies including larger sample sizes from different ethnicities are needed to fully reveal the role of COL4A1 3′UTR variants in the whole spectrum of CSVD.
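
To make the sample-size caveat concrete, the following Python sketch computes how high the carrier frequency could still be when no carriers are observed among 60 patients. This calculation is not part of the study. It assumes the patients can be treated as independent draws from a single population (a simple binomial model), and the function name `zero_event_upper_bound` is introduced here for illustration only.

```python
def zero_event_upper_bound(n: int, confidence: float = 0.95) -> float:
    """One-sided upper confidence bound on a proportion when 0 of n are carriers.

    Under a binomial model, the probability of observing zero carriers is
    (1 - p)**n. The bound is the value of p at which that probability
    drops to 1 - confidence.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


bound = zero_event_upper_bound(60)
print(f"{bound:.3f}")  # prints 0.049
```

Under these assumptions, a cohort of 60 with no observed variants is compatible with a carrier frequency of up to roughly 5% at the 95% confidence level, which is why larger cohorts are needed to estimate the true prevalence.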

Patients who remained negative may represent disorders that are inherited in a polygenic rather than a Mendelian manner. Two patients carried variants in genes associated with atrial fibrillation or Ehlers–Danlos syndrome, which are distinct from the other variants detected in genes linked to CSVD or other neurological disorders, but which could also have roles in the vascular phenotypes of the patients. Vascular risk factors, such as hypertension and type 2 diabetes, and environmental risk factors, such as smoking and alcohol consumption, also have a role in the pathogenesis of VCI [39]. Some of the patients may carry pathogenic intronic variants, copy number variants, repeat expansions, structural variants, or methylation changes that cannot be detected with WES. In addition, some of the patients may carry variants in novel genes that have not yet been found to be associated with VCI or other forms of neurodegeneration.

It should also be noted that the stroke-gene panel used in the variant analysis will need to be updated in future studies as more data on the genetic background of cerebrovascular phenotypes accumulate. The pathogenicity of the identified variants of uncertain significance should be confirmed with functional studies and larger data sets.

These data can help inform and guide genetic testing in familial VCI. Although there are no curative treatments available for VCI, identifying disease-causing variants may aid in making a precise diagnosis and provide information on the prognosis. A genetic diagnosis also provides the opportunity for diagnostic testing of other affected family members and predictive screening of unaffected relatives.

In conclusion, our results support pathogenic roles of variants in COL4A1, COL4A2, and HTRA1 in CSVD and VCI. The variants identified in genes linked with neurodegenerative diseases suggest that vascular pathogenic mechanisms are linked to neurodegenerative conditions. Although more research needs to be done to reveal how these variants cause disease, our study provides novel insights into the molecular basis of VCI.