Genome-wide study of longitudinal brain imaging measures of multiple sclerosis progression across six clinical trials

While the genetics of MS risk susceptibility are well-described, and recent progress has been made on the genetics of disease severity, the genetics of disease progression remain elusive. We therefore investigated the genetic determinants of MS progression on longitudinal brain MRI: change in brain volume (BV) and change in T2 lesion volume (T2LV), reflecting progressive tissue loss and increasing disease burden, respectively. We performed genome-wide association studies of change in BV (N = 3401) and change in T2LV (N = 3513) across six randomized clinical trials from Biogen and Roche/Genentech: ADVANCE, ASCEND, DECIDE, OPERA I & II, and ORATORIO. Analyses were adjusted for randomized treatment arm, age, sex, and ancestry. Results were pooled in a meta-analysis, and were evaluated for enrichment of MS risk variants. Variant colocalization and cell-specific expression analyses were performed using published cohorts. The strongest peaks were in PTPRD (rs77321193-C/A, p = 3.9 × 10–7) for BV change, and NEDD4L (rs11398377-GC/G, p = 9.3 × 10–8) for T2LV change. Evidence of colocalization was observed for NEDD4L, and both genes showed increased expression in neuronal and/or glial populations. No association between MS risk variants and MRI outcomes was observed. In this unique, precompetitive industry partnership, we report putative regions of interest in the neurodevelopmental gene PTPRD, and the ubiquitin ligase gene NEDD4L. These findings are distinct from known MS risk genetics, indicating an added role for genetic progression analyses and informing drug discovery.

Therefore, we leveraged longitudinal imaging data from a unique dataset of randomized controlled trials (RCTs) conducted by Biogen and Roche/Genentech in a precompetitive partnership to examine the genetics of MS progression.We investigated genetic correlates of two longitudinal, quantitative brain MRI outcomes: change in brain volume (BV), and change in T2 lesion volume (T2LV).BV change measures the overall impact of MS injury, including demyelination and neurodegeneration 11 .The rate of brain atrophy over time is a quantitative measure that correlates with worsening clinical disability 12,13 .T2LV increases with accumulating injury and early measures of T2LV may be predictive of long-term disability 14,15 .A deeper understanding of the genetic factors that impact the rate of MS progression could reveal novel candidate targets and biomarkers relevant to MS progressive biology.

Materials and methods
Standard protocol approvals, registrations, and patient consents.We performed GWAS of BV change (N = 3401) and T2LV change (N = 3513) across six RCTs: ADVANCE (trial registry number: NCT00906399), ASCEND (trial registry number: NCT01416181), DECIDE (trial registry number: NCT01064401), OPERA I & II (trial registry number: NCT01247324, NCT01412333), and ORATO-RIO (trial registry number: NCT01194570) (eTables 1, 2).All participants provided written informed consent that covered the scope of this research.Ethical approval was provided as per the original RCTs.This work adheres to the Declaration of Helsinki.
Image acquisition and trait estimation.T1-weighted, proton density-weighted and T2-weighted brain MRIs were collected in accordance with standardized imaging protocols for each trial.All trials used the same centralized MRI reading center (NeuroRx, Montreal, Canada).
BV change was measured between week 24 and the last time point available for each participant using the automated SIENA (Structural Image Evaluation, using Normalization, of Atrophy) method 16 .We used week 24 instead of baseline to account for potential rapid volume changes related to resolution of acute inflammation following treatment initiation (also known as pseudoatrophy) 12 .For placebo arm patients in ADVANCE, BV change was calculated from baseline to week 48 (when participants switched to active treatment).
To calculate T2LV, T2 lesions were segmented using a fully-automated method, followed by visual verification and adjustments, if needed.T2LV change was then estimated as the change in total T2LV between baseline and last time point available for each participant.For the placebo arm in ADVANCE, we used week 48 instead of the last time point to be consistent with the BV change estimation.
Within each RCT, changes in BV and T2LV were annualized to account for varying lengths of follow-up time between the trials.This was accomplished by dividing each individual change by the number of years between the first and last time point used in estimation.Annualized traits were transformed using rank-based inverse normal transformation to mitigate departures from normality (eFigures 1, 2).
Identity by descent (IBD) analysis was used to identify related individuals.Pairs of individuals with k0 > 0.4 were considered related, and one member from each pair was selected randomly for exclusion.We estimated ancestry separately in each RCT, and used principal component analysis (PCA; SmartPCA, Eigensoft v7.2.1) to identify outliers and calculate PCs.We excluded individuals who were 6 standard deviations from the top 10 PCs with a maximum of 10 outlier removal iterations (default).We estimated Tracy-Widom (TW) 17 statistics to determine the number of PCs to include as covariates, which were defined as PCs with TW p < 0.05 that accounted for ≥ 1% of variance (calculated by subtracting the largest eigenvalue with p ≥ 0.05, from the sum of the eigenvalues with p < 0.05, and then estimating relative contributing variance).
Genotypes were imputed to 1000G phase3v5 (phasing: ShapeIT v2.r790).After imputation, we excluded SNPs with R 2 < 0.30, MAF < 0.01 or in the pseudo-autosomal region of chromosome X.SNPs and samples at each QC step are tabulated in eTable 3.
OPERA I, OPERA II, and ORATORIO (Roche trials).DNA was extracted from blood and whole-genome sequenced (WGS, mean read depth 30x) (TruSeq Nano library prep, Illumina HiSeq) to generate 150 bp pairedend reads (Illumina, Inc [Foster City, CA]).Burrows-Wheeler Aligner (BWA) was used to map reads (hg38).Resulting alignments (bam files) were analyzed using GATK for base quality score recalibration (BQSR), indel realignment, duplicate removal, SNP/INDEL discovery and joint genotyping across samples according to GATK Best Practices 18 .Sites that did not pass GATK variant quality score recalibration (VQSR) filter were removed.Genotypes with quality score (GQ) < 20 were set to missing, followed by removal of sites with missing rate > 10%.To further improve the coverage of common variants, genotype imputation was performed using a haplotype reference panel of 55,929 individuals constructed using 27,166 individuals from Haplotype Reference Consortium, 2548 samples from 1000 Genomes project and 26,215 samples sequenced at Genentech.Imputation was CNS cell-type specific expression.To visualize single-nucleus gene expression of putative genes identified through GWAS meta-analysis by CNS cell type, we used CellxGene VIP (https:// cellx genev ip-ms.bxgen omics.com/).This is based on brain tissue from 12 MS patients 20 .

Comparison with MS risk SNPs.
We examined all autosomal MS risk GWAS loci 3 , including peak variants in the MHC, for potential association with change in BV and T2LV.We selected available overlapping variants from WGS and genotype data, and report association statistics for variants whose effects on BV change and T2LV change were consistent across the six RCTs (heterogeneity I 2 < 40).Additionally, we generated a polygenic risk score (PRS) using the non-MHC variants and weights from the IMSGC MS risk discovery GWAS, and tested the score for association with change in T2LV and change in BV 3 .
The most significant association with BV change was an intronic SNP in PTPRD (top SNP: rs77321193-C/A, p = 5.3 × 10 -7 ), which showed no evidence for heterogeneity across the trials (heterogeneity I 2 = 0, heterogeneity p = 0.96) (Fig. 1A, Table 2).The minor allele (C; MAF = 0.18) of rs77321193 was associated with a greater reduction in BV over time.Multiple SNPs in this region showed association with BV change with similar effect sizes and no evidence for heterogeneity in the meta-analysis (eTable 4, Figs.2A, 3A).We did not observe colocalization between the PTPRD peak and expression of PTPRD in the datasets we evaluated.However, analysis of cell-type specific gene expression in MS brain (Fig. 4) revealed that PTPRD was expressed in neurons, oligodendrocytes and oligodendrocyte precursor cells.The gene-based tests did not reveal significant association with PTPRD (p = 0.066), or other genes (top hit: TSPAN8, p = 8.08 × 10 -5 ) (eTable 5).
The strongest association with T2LV change was in a regulatory region near NEDD4L (top SNP: rs11398377-GC/G, p = 9.5 × 10 -8 ), which was consistent across the trials (heterogeneity I 2 = 51, heterogeneity p = 0.07) (Fig. 1B, Table 3).The C deletion in rs11398377 (MAF = 0.16) was associated with greater T2LV accumulation over time.Several SNPs in this region associated with T2LV change had similar effect sizes and did not show significant heterogeneity in the meta-analysis (eTable 6, Figs.2B, 3B).Interestingly, we found evidence for colocalization between the T2LV change peak (rs11398377) and NEDD4L expression in whole blood (colocalization pp4 = 0.78), suggestive of a common causal variant between the traits.Moreover, analysis of cell-specific gene expression in the MS brain showed that NEDD4L is expressed mainly in neurons, including inter-neurons and excitatory neurons (Fig. 4).Gene-based tests did not show significant associations with change in T2LV (top hit: WDR34, p = 2.0 × 10 -5 ), however NEDD4L was among the top hits (p = 6.94 × 10 -5 ) (eTable 7).
Comparison with the known MS risk GWAS 3 SNPs (197 non-MHC and autosomal; 161 and 162 SNPs met our inclusion criteria for comparison with BV change and T2LV change meta-analyses results, respectively) did not show evidence of overlap with BV change nor T2LV change GWAS hits (eTables 8, 9).We did not observe any significant associations (p < 1 × 10 -6 ) between MS susceptibility loci with BV change or T2LV change, and found no overlap with MS risk loci in the MHC region 3 .Conversely, neither the BV change peak in PTPRD nor the T2LV change peak in NEDD4L were associated with MS risk 3 , and no putative BV change or T2LV change peaks (p < 1 × 10 -6 ) colocalized with MS risk.Additionally, the MS risk PRS was not significantly associated with change in BV (p = 0.54) nor with change in T2LV (p = 0.89).

Discussion
To our knowledge, this is the first GWAS meta-analysis of longitudinal MRI measures across multiple RCTs to investigate the genetics of MS progression.We achieved this by leveraging a precompetitive industry partnership aimed at utilizing well-characterized MS cohorts to interrogate the genetics of disease progression in MS.
The top BV change SNP occurs in an intron of PTPRD, a protein tyrosine phosphatase receptor involved in cellular signaling, growth and differentiation 21 .Previous studies of cross-sectional BV and Multiple Sclerosis Severity Score (MSSS) have also reported sub-significant associations with variants in the PTPRD region independent from our BV change peak (cross sectional BV top SNP: top SNP: rs1953594, adjusted trend − logP = 4.3; MSSS top SNP: rs10977017, p = 1.02 × 10 -5 ) 5,6 .Additionally, variants mapped to PTPRD are associated with two sleep disorders that are enriched in PwMS: restless leg syndrome (top SNP rs1836229) 22,23 and insomnia (top SNP rs10761240) 24,25 .While we did not observe evidence for colocalization between the top BV change association and PTPRD expression, it is possible that this study is insufficiently powered to detect colocalization, or that the locus impacts mechanisms that are time-dependent, cell lineage-specific, or independent of PTPRD expression.
The top T2LV association is an indel SNP that lies in the regulatory region upstream of NEDD4L.NEDD4L (neural precursor cell expressed developmentally down-regulated 4-like) plays a role in axon guidance, neurite growth, synaptic transmission, and pain sensitivity 26,27 .The NEDD4L peak showed evidence for colocalization with expression of NEDD4L in blood (GTEx).The minor allele of this top SNP (rs11398377*C-deletion) was associated with greater T2LV accumulation and increased gene expression, indicating that downregulation of the gene may have therapeutic potential.Missense SNPs in NEDD4L cause abnormal fetal neurodevelopment and brain malformations such as periventricular nodular heterotopia 28 .NEDD4L encodes an E3 ubiquitin-protein ligase that regulates several membrane channels and transporters including epithelial (ENaC) 29 and voltage-gated (NaV) 30 sodium channels.Both ENaCs and NaVs have been implicated in demyelination and MS pathophysiology, and research is ongoing to investigate the role of sodium channel blockers in preventing axonal damage 31,32 .This link is supported by use of the channel blocker dalfampridine for improvement of walking in PwMS 33 .
While BV and T2LV change traits may be correlated 15 (eTable 10), they reflect complementary measures of MS progressive biology-BV reduction represents a global measure of brain tissue loss, while T2LV increase reflects immune-mediated injury in the CNS white matter.Both PTPRD and NEDD4L are strongly expressed in the CNS, particularly in neurons.PTPRD is also abundant in oligodendrocytes and oligodendrocyte precursor cells.Oligodendrocytes play an important role in MS because they myelinate neurons.NEDD4L is a paralog of NEDD4, which promotes ubiquitination and degradation of RAPGEF2, one of the few putative MS risk genes with enhanced single cell expression in CNS cells 34,35 .
The IMSGC recently published a large study of MS severity using the age-related MS severity score (ARMSS), and reported association in the DYSF-ZNF638 locus 4 .Additionally, a smaller study (N = 1813 individuals) examining median longitudinal ARMSS (l-ARMSS) and longitudinal MSSS (l-MSSS) did not find significant association 36 .The IMSGC DYSF-ZNF638 locus index variant (rs10191329) was not associated with change in BV (beta = − 0.004, p = 0.91) nor change in T2LV in our analysis (beta = 0.07, p = 0.04).This may be due to the greater power in the IMSGC analysis, but it is also possible that the variables we analyzed capture different aspects of the MS disease course.We chose to focus on objective imaging measures rather than clinical measures such as EDSS (on which the ARMSS variable is based) due to certain limitations in these measures.For instance, the EDSS score is numerical and nonlinear (ranging from 0 to 10) and is largely based on mobility, making it a poor proxy for progression.
Similar to the IMSGC MS severity study, we did not find overlap between MS risk loci and our peaks, nor did we observe association with an MS PRS 4 .This suggests that mechanisms driving MS susceptibility may differ from those modulating outcomes influencing disease progression.A prior study in ~ 500 MS patients found that the strongest genetic risk factor for MS (HLA-DRB1*15:01) was associated with reduction in brain parenchymal volume and higher T2 lesion load 37 , however these findings were not replicated in our study.
We acknowledge the following limitations in our study.All but two of our RCTs (ASCEND, ORATORIO) examined RRMS participants, and it is possible that the biological mechanisms underlying MS progression differ between early-stage and late-stage MS 38 .Moreover, the clinical trial participants may not be representative of the general MS population owing to strict inclusion and exclusion criteria used in participant selection of RCTs.Neither peak identified in this study reached genome-wide significance (p < 5 × 10 -8 ) or a stricter threshold of p < 2.5 × 10 -8 to account for multiple testing.Though powered to detect moderate associations with common SNPs, our ability to detect genome-wide associations would be improved with a larger sample size.Additional, well-characterized cohorts are needed for independent replication.Longitudinal MRI data was available for < 2 years, however this reflects a small proportion of the average MS patient's disease course, and longer follow-up would provide more accurate outcomes on BV and T2LV changes.Several of the trials had additional longitudinal data available from open label extension studies, however including these data would have introduced complexities such as variable lengths of follow-up, and differential handling of placebo and treatment arms when accounting for pseudoatrophy at treatment initiation.We therefore limited our analyses to data from the placebo-controlled period.While we endeavored to account for pseudoatrophy by re-baselining at 24 weeks, it is possible this did not fully capture all pseudoatrophy.We chose to re-baseline using week 24 measures to have a reasonable duration of MRI follow-up for GWAS.
This work highlights the utility of using clinical trial data for genetic analyses.RCTs have assigned treatment arms and systematic imaging collection, and genetic analyses of RCTs are uniquely positioned to leverage objective, quantitative measurements to advance our understanding of MS disease course.Furthermore, the MRI analysis methods used to quantify atrophy and T2LV were standardized within studies, and to the extent possible, across the studies used in the meta-analysis, because all studies utilized the same centralized MRI reading center (NeuroRx).Finally, our findings complement existing risk studies by identifying additional genetic factors related to different aspects of the disease.

Figure 2 .
Figure 2. Regional association plots of peak SNPs from meta-analysis of change in brain volume (BV) and T2 lesion volume (T2LV) GWAS studies.Regional association plots with negative log 10 P-values on primary y-axis, genetic recombination rate on secondary y-axis, and chromosome position (hg19) on x-axis.Each point corresponds to a single SNP analyzed in the GWAS.The correlation between the peak SNP (or proxy SNP) and the other SNPs in the region is reflected by the linkage disequilibrium (r 2 ) estimate, depicted in blue for least correlation (r 2 < 0.2) to red for most correlation (r 2 > 0.8).The plots were created using LocusZoom (http:// locus zoom.sph.umich.edu/), and depict the association of: (A) BV change with intronic SNP rs77321193 in PTPRD on chromosome 9; (B) T2LV change with regulatory SNP rs11398377 near NEDD4L on chromosome 18.Proxy SNP rs9955426 was used for plot B because rs11398377 was not present in the reference panel (1000G EUR).1000G EUR, 1000 Genomes European; BV, brain volume; GWAS, genome-wide association study; NEDD4L, neural precursor cell-expressed developmentally down-regulated 4-like; PTPRD, protein tyrosine phosphatase receptor type D; SNP, single nucleotide polymorphism; T2LV, T2 lesion volume.

Figure 3 .
Figure 3. Forest plots of peak SNPs from meta-analysis of change in brain volume (BV) and T2 lesion volume (T2LV) GWAS studies.Forest plots depicting trial-wise (black squares) and meta-analyzed (summary in blue diamond) association of: (A) peak variant rs77321193 on chromosome 9 with BV change (effect allele: C, heterogeneity I 2 < 0.01, p = 0.95); (B) Proxy SNP (for peak variant) rs9955426 on chromosome 18 with T2LV change (effect allele: C, heterogeneity I 2 = 46, p = 0.10).Effect sizes are reported as Beta ± SE on x-axis, sample sizes are reported as N along with trial names, and strength of associations are reported as P-values (P).Direction of association is indicated on color scale at the bottom of each plot.BV, brain volume; GWAS, genome-wide association study; NEDD4L, neural precursor cell-expressed developmentally down-regulated 4-like; PTPRD, protein tyrosine phosphatase receptor type D; SNP, single nucleotide polymorphism; T2LV, T2 lesion volume.