Assessing average somatic CAG repeat instability at the protein level

Sandwich ELISA-based methods use Abs that target the expanded polyglutamine (polyQ) tract to quantify mutant huntingtin (mHTT). Using Meso Scale Discovery (MSD) assay, the mHTT signal detected with MW1 Ab correlated with polyQ length and doubled with a difference of only 7 glutamine residues between equivalent amounts of purified mHTTexon1 proteins. Similar polyQ length-dependent effects on MSD signals were confirmed using endogenous full length mHTT from brains of Huntington’s disease (HD) knock-in (KI) mice. We used this avidity bias to devise a method to assess average CAG repeat instability at the protein level in a mixed population of HTT proteins present in tissues. Signal detected for average polyQ length quantification at the protein level by our method exhibited a strong correlation with average CAG repeat length at the genomic DNA level determined by PCR method in striatal tissue homogenates from HdhQ140 KI mice and in human HD postmortem cortex. This work establishes that CAG repeat instability in mutant HTT is reflected at the protein level.


Results
PolyQ length in mHTT affects its quantification by MSD assay using polyQ targeting Abs. The effect of polyQ length on the detection of mHTT by MSD assay was evaluated with a series of purified GST-FLAG-HTTexon1 fusion proteins containing polyQ lengths from Q19 to Q72 (Supplementary Fig. S1a). MSD is a method similar to ELISA except that electrochemiluminescence is used as the detection readout: electricity is applied to the plate electrodes, leading to light emission by electrochemiluminescent labels that are conjugated to detection antibodies. The monoclonal rabbit capture Ab EPR5526 was paired with different mouse monoclonal polyQ targeting detection Abs (MW1, 1C2 and 3B5H10) for mHTT assays (Fig. 1a). A rigorous protocol was developed to achieve the most accurate protein concentrations of the GST-FLAG-HTTexon1 proteins used in the assay (see Methods, Supplementary Figs. S1, S2 and Supplementary Table S1). Results showed that the intensity of MSD signal obtained with MW1 detection Ab increased with increasing polyQ length (Fig. 1b), confirming previously published results. In contrast, the MSD signal intensity seen with the mouse monoclonal MAB5492 detection Ab, a non-polyQ targeting Ab 36 (Fig. 1a), was solely dependent on protein concentration (Fig. 1c). If used with biological samples, the Ab pair EPR5526-MAB5492 will allow total HTT (WT and mutant) detection. When the slopes of the standard curves in the linear dynamic range obtained with MW1 were normalized by the slopes of the standard curves in the linear dynamic range obtained with MAB5492 (see Supplementary Data Set 1 for Method details), corresponding to a mHTT/Total HTT assay, a strong polyQ length correlation was observed (R² = 0.9971; Fig. 1d). Similar correlations were obtained with 1C2 and 3B5H10 detection Abs (R² > 0.98; Supplementary Fig. S3).
To quantify the Q-dependent rate of signal change observed with MW1 for polyQ lengths in the range of adult HD patients, we extrapolated, from the correlation in Fig. 1d, the mHTT signal fold increase for each additional glutamine residue in GST-FLAG-HTTexon1 protein at constant protein concentration. To this end, the mHTT signal predicted for GST-FLAG-HTTexon1 proteins from Q38 to Q62 was normalized by the MSD signal for GST-FLAG-HTTexon1-Q38. Results showed that the predicted mHTT signal with MW1 doubled with the addition of only 7 glutamine residues (Fig. 1e). These results suggest that the polyQ length-dependent bias has a significant effect on mHTT detection, even for CAG repeats in the HTT gene in the pathological range of most HD patients. The other polyQ targeting Abs, 1C2 and 3B5H10, also exhibited a polyQ length-dependent bias but to a much lower extent than MW1 (Supplementary Fig. S3).
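The normalization to Q38 described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' calculation: the function names are hypothetical, and `example_model` is an arbitrary linear calibration chosen only so that the signal doubles between Q38 and Q45; it does not reproduce the parameters fitted in Fig. 1d.

```python
from typing import Callable, Optional


def fold_increase(ratio_model: Callable[[float], float],
                  q: float, q_ref: float = 38.0) -> float:
    """Predicted signal at polyQ length q relative to q_ref,
    at equal protein concentration."""
    return ratio_model(q) / ratio_model(q_ref)


def example_model(q: float) -> float:
    # Hypothetical calibration in arbitrary units (assumption for illustration):
    # gives 7 at Q38 and 14 at Q45, i.e. a doubling over 7 residues.
    return q - 31.0


def residues_to_double(ratio_model: Callable[[float], float],
                       q_ref: int = 38, q_max: int = 62) -> Optional[int]:
    """Smallest number of added glutamines giving at least a 2-fold signal,
    or None if no doubling occurs up to q_max."""
    for extra in range(1, q_max - q_ref + 1):
        if fold_increase(ratio_model, q_ref + extra, q_ref) >= 2.0:
            return extra
    return None


doubling = residues_to_double(example_model)
```

Any fitted calibration function can be passed in place of `example_model`; the doubling interval then follows from the fit rather than from the placeholder.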
We next tested whether the polyQ length-dependent bias with MW1 detection Ab could be observed with the full length endogenous HTT protein, using homogenates from striatum of 6-month-old heterozygous HD-KI mice bearing different CAG repeat lengths in the HTT gene. Initially, MSD signal for mHTT was not observed to be polyQ length-dependent (Supplementary Fig. S4a). However, analysis of samples by western blot (WB) revealed that the amount of mHTT decreased with increasing polyQ length at a constant amount of total protein (Supplementary Fig. S4b). Normalization of MSD signal by the amount of mHTT quantified by WB confirmed the polyQ length-dependent correlation with MW1 detection Ab and full length endogenous HTT (R² > 0.99; Fig. 2). The correlation closely resembles that seen with purified GST-FLAG-HTTexon1 despite the different method of normalization, which supports the robustness of our finding. A similar polyQ length correlation was observed independently of the capture Ab used (monoclonal rabbit EPR5526, targeting the N-terminus of endogenous HTT protein, or monoclonal rabbit D7F7, targeting the middle region; Fig. 1a), confirming that only the avidity of MW1 detection Ab is involved (Fig. 2). Most strikingly, the polyQ length-dependent bias for full length endogenous HTT was observed over a very large polyQ length range (from Q44 to Q188). Altogether, these observations show an inherent bias in mHTT detection by sandwich ELISA-based assays, which can be quantified and thus corrected.
A novel method to evaluate polyQ length expansion in mHTT-containing tissues using MSD assay. We hypothesized that we could take advantage of the polyQ length-dependent bias observed in mHTT detection by MSD assay to design a novel method for quantification of average polyQ length in a biological sample, such as tissue lysates or human biofluids (Fig. 3). In essence, we asked whether CAG repeat instability could be assessed at the protein level. The premises were 1) that HTT protein exhibits a mosaicism of polyQ lengths in biological tissue prone to CAG repeat instability [37][38][39] and 2) that a population of HTT proteins with different polyQ lengths results in a detected signal similar to that of a single HTT protein with a polyQ length corresponding to the average polyQ length of the population. Briefly, the sample is analyzed twice by MSD assay: first, with a non-polyQ targeting detection Ab such as MAB5492 that allows quantification of total HTT (WT and mutant form; Fig. 3a,b), then with a polyQ targeting detection Ab that allows quantification of mHTT (Fig. 3c). Signal obtained in the linear dynamic range with the polyQ targeting detection Ab for a determined HTT concentration can be used to estimate the average polyQ length by a mathematical model (Fig. 3d and Methods). Even if polyQ-targeting Abs preferentially bind the expanded polyQ tract, they also interact, to a lower extent, with WT HTT. Similarly, Abs that do not target the polyQ tract interact with both WT and mHTT. Thus, our method, which relies on quantification of both WT and mHTT, provides information on the average polyQ length in total HTT proteins.
Figure 1. PolyQ length affects GST-FLAG-HTTexon1 quantification by MSD assay using polyQ targeting detection Ab. (a) Diagram shows antibody epitopes in human HTT protein (NCBI reference sequence: NP_002102.4). Calibration curve performance for GST-FLAG-HTTexon1 protein using MW1 (b) and MAB5492 (c) detection Abs. Curves were fitted with a four-parameter logistic regression model with 1/Y² weighting. Mean values ± SD (1σ) of duplicates of a single experiment are shown. (d) Plot of the ratio of the slopes determined from standard curves in the linear dynamic range for mHTT assay by total HTT assay as a function of polyQ length exhibits a strong correlation. Mean values ± propagated SD (1σ) of duplicates of a single experiment are shown. (e) Using the polyQ length-dependent correlations shown in (d), MSD signal fold increase as a function of polyQ length at constant amount of mHTT protein was extrapolated for mHTT assay. mHTT signal predicted for GST-FLAG-HTTexon1 proteins from Q38 to Q62 was normalized by the MSD signal for GST-FLAG-HTTexon1-Q38. PolyQ lengths ranging from Q38 to Q62 correspond to the polyQ length range seen in adult HD patients. GST: glutathione S-transferase; N17: HTT first 17 aa; PRD: proline-rich domain.
To mimic in vitro the polyQ length mosaicism in HTT protein from biological tissue prone to CAG instability, different amounts of GST-FLAG-HTTexon1 proteins with variable polyQ lengths were mixed (Tables 1-3). Using polyQ targeting detection Ab MW1 and a mix of GST-FLAG-HTTexon1 proteins with an average polyQ length of 48 residues (avg Q48a), there was a similar MSD dose response to the standard curve obtained with pure GST-FLAG-HTTexon1-Q48 (Fig. 4a), indicating that the same average polyQ length could be determined at different concentrations of total GST-FLAG-HTTexon1 protein. The graph in Fig. 4a also displays results obtained with pure GST-FLAG-HTTexon1-Q38 and -Q72 for comparison. The relative error of the average polyQ lengths determined experimentally did not exceed 13%, with the highest relative error at the lowest concentration tested (Table 1). We then generated the same average polyQ length by different protein mixings of GST-FLAG-HTTexon1 (avg Q48a, avg Q48b and avg Q48c; Table 2). The average polyQ length experimentally determined at a single concentration was constant for a similar average polyQ length obtained by using different protein mixings (Fig. 4b and Table 2), highlighting the robustness of our method. Finally, we generated 9 different average polyQ lengths from avg Q38 to avg Q58 with 2.5Q increments (Table 3). The different average polyQ lengths determined experimentally at a single concentration exhibit a strong linear correlation with theoretical average polyQ lengths (R² = 0.9829; Fig. 4c and Table 3). Intra-batch accuracy and precision for average polyQ length quantification were less than 13% of relative error and less than 4% of coefficient of variation for all conditions tested (Tables 1-3). Results obtained with other polyQ targeting Abs 1C2 and 3B5H10 were similar but with a lower accuracy (Supplementary Fig. S5 and Supplementary Tables S2-7). Altogether, these data validate the ability of our method to estimate the average polyQ length in a mix of HTT proteins with variable polyQ lengths, with the Ab pairs EPR5526-MW1 (for mHTT) and EPR5526-MAB5492 (for total HTT) being superior in accuracy.
Figure 3. Method for HTT polyQ length quantification. HTT proteins exhibit a mosaicism of polyQ lengths in biological tissue prone to CAG repeat instability. To quantify average polyQ length in HTT proteins, the biological sample is quantified twice by sandwich ELISA-based assay with two pairs of Abs: one that includes a detection Ab that does not target the polyQ tract (a) to quantify total HTT (b) and another one that has a polyQ targeting detection Ab (c). This information is used in a mathematical model to determine the average polyQ length in HTT proteins (d) when samples are tested in the linear dynamic range.
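The quantities used to validate the protein mixes (theoretical average polyQ length, relative error and coefficient of variation) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the average is taken as a molar-weighted mean, and the example mixes are invented to give an average of Q48; they are not the compositions listed in Tables 1-3.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def average_polyq(mixture: Dict[int, float]) -> float:
    """Molar-weighted mean polyQ length.
    mixture maps polyQ length -> molar amount (any consistent unit)."""
    total = sum(mixture.values())
    return sum(q * amount for q, amount in mixture.items()) / total


def relative_error_pct(measured: float, theoretical: float) -> float:
    """Accuracy: absolute deviation from the theoretical value, in percent."""
    return abs(measured - theoretical) / theoretical * 100.0


def cv_pct(replicates: Sequence[float]) -> float:
    """Precision: sample standard deviation over mean, in percent."""
    return stdev(replicates) / mean(replicates) * 100.0


# Hypothetical mixes (assumption for illustration) sharing one theoretical average
mix_a = {38: 1.0, 48: 1.0, 58: 1.0}
mix_b = {44: 7.0, 55: 4.0}
mix_c = {38: 12.0, 72: 5.0}
averages = [average_polyq(m) for m in (mix_a, mix_b, mix_c)]
```

Different compositions giving the same weighted mean correspond to the avg Q48a, avg Q48b and avg Q48c design, in which the measured average is expected to be independent of how the mix was assembled.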
Average polyQ length at the protein level correlates with average CAG repeat length at the DNA level in postmortem brain of HD mouse and HD patients. To establish whether our assay is suitable to measure the average polyQ length in endogenous HTT proteins from brain tissue, we examined striatum from homozygous HdhQ140 KI mice from different litters and of different ages (from 3.3 to 13 mo). Since this mouse model was previously shown to exhibit intergenerational CAG repeat changes 40 , we expected to detect a variation in average polyQ length in HTT between animals. To test this idea, data obtained by MSD assay were compared with the extent of CAG repeat instability measured in gDNA from the contralateral striatum using a PCR method adapted from Lee et al. 41 (see Methods for details). The MSD signal is normalized using the MSD signal ratio (MSD of mHTT/MSD of Total HTT; plotted on y-axis in the figure). The MSD signal for total HTT is solely dependent on protein concentration and does not depend on polyQ length. Results showed that MSD signal ratios (EPR5526-MW1/EPR5526-MAB5492) obtained from striatum of HD mice exhibited a strong correlation with average CAG repeat length determined by PCR (R² = 0.7929; Fig. 5a). Notably, the average CAG repeat length was determined from the contralateral striatum, which may have introduced some variation and could explain, at least in part, some outliers. Unfortunately, we could not interpolate the average polyQ length from these data because 1) the recombinant GST-FLAG-HTTexon1 proteins used as standards do not bear sufficiently long polyQ tracts and 2) it was reported that the same concentration of full length and truncated HTT proteins with similar polyQ lengths are detected with wide differences in intensity 31 . Even though we showed a polyQ length correlation with full length endogenous mHTT (Fig. 2), anchor points of this correlation are probably different from those obtained with GST-FLAG-HTTexon1.
Having established that MSD signal ratios (EPR5526-MW1/EPR5526-MAB5492) for endogenous HTT could be correlated with CAG repeat length in HD mice, we next focused on analysis of human postmortem HD brain. We analyzed lysates from postmortem cortex of 2 adult and 5 juvenile HD (JHD) patients. Protein and DNA analysis were done in the same sample lysate for all samples. As it was shown that exon 1 of HTT is produced via incomplete splicing of the HTT pre-mRNA in HD patient tissue 42 , we used an additional Abs' pair (D7F7-MAB2166) for total HTT quantification (WT + mutant form) that does not recognize the truncated form of HTT. The MSD signal ratios (mHTT/Total HTT) displayed a high correlation with average CAG repeat length determined by PCR for both Ab pairs used for normalization (R 2 > 0.9; Fig. 5b) and a strong parallelism between them. Among the samples tested for total HTT quantification with EPR5526-MAB5492, two non-affected individuals were not used for correlation because their signals were below background signal for the level of detection (data not shown).
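The R² values reported for the relationship between MSD signal ratio and average CAG repeat length can be illustrated with a short least-squares sketch. A straight-line model is assumed here for simplicity; the Methods state that several regression models were compared, so this is not necessarily the model behind Fig. 5. The function name is hypothetical and the input values are placeholders, not measurements from this study.

```python
from typing import Sequence, Tuple


def linear_fit_r2(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares line through (x, y).
    Returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Placeholder values for illustration only (not data from this study)
cag_lengths = [40.0, 50.0, 60.0, 70.0]
signal_ratios = [1.0, 1.5, 2.0, 2.5]
slope, intercept, r2 = linear_fit_r2(cag_lengths, signal_ratios)
```

With real data, the x values would be the PCR-derived average CAG repeat lengths and the y values the mHTT/Total HTT signal ratios measured in the same animals or lysates.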

Discussion
Currently, lowering mHTT is a major therapeutic strategy under investigation in many laboratories and in clinical trials for HD patients 43,44 , therefore accurate quantification using ultra-sensitive immunodetection methods is vital. mHTT can be preferentially distinguished from WT by polyQ targeting Abs 23-25 sensitive to expanded polyQ repeats containing more epitopes than normal polyQ tracts. The increased avidity of such Abs for longer polyQ tracts was recognized as a potential bias in mHTT quantification 19,20,22,31 . However, even if the levels of mHTT were associated with inherited CAG repeat length 19,[33][34][35] , polyQ length was considered as a minor contributor compared to mHTT protein concentration 20 . Previously, a series of purified truncated HTT proteins with different polyQ lengths was detected by TR-FRET assay and MW1 Ab 33 . The authors reported a 10-to 20-fold higher sensitivity for mHTT than WT HTT. They did not mention that with an increase of 7 glutamines, corresponding to the range of polyQs of HD patients tested in their study, they had a signal increase of ~40% for the same HTT protein concentration (see their Supplementary Fig. S1b). The analysis of large polyQ length series was not evaluated with sandwich-ELISA based methods currently used in clinical trial 20,31 . Here, we show that MSD signal detected with polyQ targeting Ab MW1 increases and, most of all, strongly correlates with polyQ length in purified N-terminal HTT fragments (Fig. 1d) and in endogenous full length HTT obtained from HD KI mice and human cortex (Figs. 2 and 5). Remarkably, this polyQ length dependent bias is evident for polyQ tracts that are in the range of adult onset HD patients as well as very large polyQ tracts (up to Q188). Our data suggest that even small polyQ length variations could lead to a large inaccuracy in mHTT quantification (Fig. 1e). 
When considering that somatic CAG repeat expansion occurs in HD brain 11,12 , the inaccuracy of mHTT quantification may be even greater.
Our findings raise questions about the reported increase in mHTT in CSF with disease progression using micro bead-based IP-flow cytometry and SMC assays 19,20,35 : is it solely due to mHTT concentration or might there be a contribution of CAG repeat instability? This is especially important if we consider that mHTT detected in CSF could preferentially come from dying cells exhibiting a very high level of instability. This issue is further complicated by findings that mHTT increases with disease progression in peripheral blood mononuclear cells (PBMC) but without significant difference in total HTT 33,34 . Initially, CAG repeat instability was proposed as a possible explanation for progressive increase in mHTT levels with no concomitant differences in total HTT level, but another likely explanation was a progressive accumulation of N-terminal fragments 33 . The latter explanation is challenged by a recent study showing no variation in N-terminal HTT level at different disease stages in PBMC 34 . The presence of CAG repeat instability is unlikely to influence the relative quantification of mHTT in current therapeutic silencing studies where a reduction in mHTT is measured as a change from baseline before treatment 21 , normalizing potential bias due to polyQ length difference between patients. Only CAG instability over the course of the longitudinal study could affect results.
Figure 5. (a) [...] Results were plotted as a function of average CAG repeat length determined by PCR method in DNA extracted from the contralateral striatum of each animal (see Methods, Quantification of average CAG repeat length). It is unclear why there is more variability (larger SDs) in raw MSD signals for samples between ~108 and 124 CAG repeats than for other samples. All samples were processed at the same time and in the same manner, so it is likely that variation may be from pipetting. (b) Homogenates prepared from postmortem cortex of HD patients were analyzed by MSD assay for average polyQ length quantification (MSD signal ratio for mHTT by total HTT). Results were plotted as a function of average CAG repeat length determined by PCR method from the same sample lysates (see Methods, Quantification of average CAG repeat length). Light blue sample was below the level of detection (background + 3 SD) for total HTT assay and was not used for correlation. Mean values ± propagated SD (1σ) of duplicates of a single experiment are shown. Please note that the MSD signal is normalized using the following MSD signal ratio (MSD of mHTT/MSD of Total HTT; plotted on y-axis in the figure). The MSD signal for total HTT is solely dependent on protein concentration and does not depend on polyQ length.
In this study, we exploited the biasing effect of polyQ sensitive Abs in mHTT detection to design a novel method to assess the average polyQ length in HTT in samples where there is a population of HTT proteins with different polyQ lengths as might be expected under conditions of CAG repeat instability (Fig. 3). Our method relies on the normalization of MSD signal detected with polyQ targeting Ab MW1 by the amount of total HTT, corresponding to the MSD signal detected with non-polyQ targeting Ab MAB5492. The method proved to be sensitive, accurate and robust when tested using purified GST-FLAG-HTTexon1 (Fig. 4). Moreover, polyQ length assessment at the protein level strongly correlated with CAG repeat length at the DNA level in postmortem brain lysates from HD mice and patients (Fig. 5). It should be noted that for comparison of MSD signal ratio (mHTT/Total HTT), all samples were tested under the same conditions (same amount of total protein) and the detected signals were in the linear dynamic range of detection. Signal detected in 2 non-affected individuals was in the background signal and could not be used for MSD signal ratio and comparison with samples from HD individuals.
Studies have shown that the level of CAG repeat instability is higher in cortex than in cerebellum 10,11 . We confirmed this observation at the DNA level with our human sample set (data not shown). However, we were unable to obtain detectable signals for total HTT from cerebellar lysates, preventing the calculation of the MSD signal ratios. HTT protein was previously detected at lower levels in human cerebellum than in cortex of the same HD postmortem brains 37,45 . In our study, samples of human cerebellum and cortex were taken from the same brains and were processed in the same way and at the same time. Thus, we think the amount of total HTT present in cerebellar tissue is below the level of detection in our assay rather than an issue of the quality of the postmortem tissues or protein lysates.
Our method of determining average polyQ length relies on the correlation between MSD signal ratio (EPR5526-MW1/EPR5526-MAB5492) and polyQ length in HTT proteins. Using a different immuno-assay (AlphaLISA), different polyQ Ab (3B5H10) and non-polyQ (MAB2166 and D7F7) Abs and different materials (cell culture lysate and purified full length HTT), Baldo et al. 46 reported that the ratio of mHTT/Total HTT signals increased with polyQ length. However, they did not perform further analysis. Our review of their data showed that similar to our findings, the ratio of their mHTT/Total HTT values shows a strong correlation with polyQ length (from their Fig. 4 and Fig. 5c; data not shown; R 2 > 0.99).
Gold standard methods for determining CAG repeat instability involve PCR amplification from "bulk" or multiple small pools of genomic DNA. The negative correlation between CAG repeat length and PCR amplification efficiency represents a significant pitfall for accurate quantification 47,48 . However, despite a likely underestimation of CAG instability, especially for the bulk method that cannot detect the rare large expansions, the results obtained with these two methods exhibit a strong correlation 41 . Data obtained by bulk PCR in our studies exhibited a strong correlation with detected average polyQ length in HTT. Thus, we present a new complementary method to PCR for evaluating average instability at the protein level. Though less informative than PCR because it provides only average polyQ lengths without size distribution, it may allow an evaluation of expansion in tissues where HTT proteins, but not HTT gene, can be detected (e.g. CSF).
Our method of predicting HTT protein average instability relies on quantification of both WT and mHTT; both alleles must be expressed equally to correlate with CAG repeat length. We have observed that the level of mHTT decreases with polyQ length in HD-KI mouse models (Supplementary Fig. S4b). However in human, some western blot studies have observed an increased level of mHTT compared to WT in both adult and JHD brains 37 or a lower amount of mHTT than WT solely in JHD brains 38 and fibroblasts 38,49 . These inconsistent results could be due to a variety of factors including small sample size, the type of sample (brain or cell lines), the extent of separation of WT and mHTT or broader migration of mHTT in SDS-PAGE, probably due to CAG repeat instability and polyQ length mosaicism. Recently, a novel mass spectrometry-based method was developed to quantify allele specific HTT protein levels using polymorphic variants 50 . From the 28 adult HD subject-derived lymphoblast cell lines tested, levels of mHTT protein were highly associated with levels of WT HTT and were not correlated with the expanded CAG repeat size. These results argue against the idea that there is a potential effect of CAG repeat length on HTT protein level at least in the adult onset range. Although the impact of CAG repeat size on HTT expression levels in human brains remains largely unsolved, especially for JHD, our data, showing that polyQ length quantification significantly correlates with CAG repeat size, argues that both WT and mHTT levels are equal.
Our method relies on immunodetection of HTT proteins and therefore is subject to technical issues common to this approach, such as matrix influences or interfering substances. A fragment of HTT protein, corresponding to the 1-573 N-terminal aa, was reported to produce a higher signal than the full length HTT protein at comparable concentrations 31 . Fodale et al. consider results as a best estimate rather than absolute for mHTT quantification 31 . We were unable to obtain a series of stable purified full length HTT proteins with increasing polyQ lengths to compare to the results obtained with GST-FLAG-HTTexon1. The presence of HTT fragments has been reported in HD brain [51][52][53] . Additionally, flanking regions of the polyQ tract, which were sites used for total HTT detection in our assay, may be affected by polyQ length as described by others 54,55 and may introduce a bias when determining Total HTT. It is noteworthy that the MSD signal ratios (mHTT/Total HTT) obtained from human cortical lysates with 2 different Abs' pairs-targeting flanking polyQ regions and more C-terminal domains in HTT-displayed a high correlation with average CAG repeat length and a strong parallelism between them (Fig. 5b), suggesting that the contribution of truncated forms of HTT and the impact of polyQ length on flanking regions, if any, is negligible.
We have shown that MSD signal ratio (EPR5526-MW1/EPR5526-MAB5492) followed a simple polyQ length correlation in the linear dynamic range of our assay (Fig. 1d). We analyzed HD brain samples in this range of detection. The constraint of linear dynamic range could be a problem for polyQ assessment in samples with very low concentrations of HTT. However, results obtained with GST-FLAG-HTTexon1 showed that the parameters of the four-parameter logistic regression are either constant (Bottom and HillSlope) or strongly polyQ length dependent (Top and EC50) (Supplementary Fig. S6), allowing us to predict a regression curve for any polyQ length as illustrated in Fig. 3. Such an improved model for polyQ length assessment should overcome the limitation of our current study.
Scientific Reports | (2019) 9:19152 | https://doi.org/10.1038/s41598-019-55202-x www.nature.com/scientificreports
Genome-wide association studies identified potential genetic modifiers involved in CAG repeat instability 15,17 , opening an area for future therapeutic intervention. Our study represents a proof of principle for CAG repeat quantification at the protein level and paves the way for further studies. Our method relies on the detection of mHTT and Total HTT, which have both been detected and quantified in patient CSF using the SMC assay 19,20 ; thus our assay potentially represents a way to study indirectly the extent of CAG repeat instability in vivo in the patient's central nervous system. The lower limit of quantification of our MSD sandwich ELISA-based assay for mHTT (picomolar range) is not sensitive enough for quantification of mHTT in clinical CSF samples from HD patients. The SMC assay is required to reach femtomolar sensitivity 20,31,35 . Quantification of average CAG instability by our method, adapted to the SMC assay, could more accurately predict age of disease onset 12 and be used in future clinical approaches that aim to reduce CAG repeat instability 14,56,57 .

Methods
Cloning. Plasmid vectors pGEX-6P-1 coding for GST-FLAG-HTTexon1 proteins with Q32, Q44 and Q55 were kindly provided by Erich Wanker 58 . DNA fragment coding for HTTexon 1 proteins with Q19 was available in-house 59 . DNA fragment coding for HTTexon 1 proteins with Q38 was a gift from Pamela Bjorkman 26 (Addgene plasmid #11514). DNA fragments coding for HTTexon 1 proteins with Q25 and Q72 were kindly provided by Boxun Lu 60 [...] and complete EDTA-free protease inhibitor cocktail (Roche). Bacteria were lysed by sonication for 2.5 min as follows: 3 s "on", 10 s "off" using Sonic dismembrator model 500 set at 40% and 1/8" probe (Thermo Fisher Scientific). After centrifugation at 14,000 g for 1 h, the soluble bacterial extract was loaded at gravity flow on 400 μL of Glutathione Sepharose 4B affinity chromatography resin (GE Healthcare Life Sciences) in a Poly-Prep chromatography column (Bio-Rad).
Resin was then washed with 10 volumes of lysis buffer, the first 5 volumes containing Triton detergent (0.5%) to improve the release of nonspecifically bound bacterial material. Finally, GST-FLAG-HTTexon1 proteins were sequentially eluted once with 100 μL then 5 times with 200 μL of elution buffer. Protein containing eluates (usually eluate 2 to 4) were diafiltrated by 5 washing out steps with dialysis buffer and Amicon Ultra-0.5 Centrifugal Filter Unit with Ultracel-3 membrane (MilliporeSigma). To avoid unnecessary losses upon freezing/thawing, protein stock concentrations were adjusted by diluting them in dialysis buffer and were stored at concentrations ranging from ~65 to ~100 μM. Comparison of concentrations before and after freeze/thawing showed negligible losses (<6%). To remove potential aggregates generated by the freezing/thawing process, thawed protein samples were centrifuged at 16,000 g and 4 °C for 5 min and the supernatant was collected. This centrifugation and supernatant collection step was performed twice. If used below 10 μM, bovine serum albumin (MilliporeSigma; #A2153) was added to proteins at 2 mg/mL to limit protein adsorption on pipette tips.
Determination of purified protein concentration. Protein concentration was measured using its specific molar attenuation coefficient, after absorption spectrum scanning between 220 and 350 nm with DS-11 spectrophotometer (Denovix). Molar attenuation coefficient was computed with ProtParam tool on ExPASy bioinformatics resource portal 61 . Purity of full-length GST-FLAG-HTTexon1 proteins ranged from 67 to 90% depending on protein batch: a protein of the same size (~28 kDa) copurified with all proteins produced (Supplementary Fig. S1a), in proportion that is pure CAG repeat length dependent (Supplementary Fig. S1b). This product corresponded to the molecular mass of GST-FLAG and was detected with EPR5526 (anti-hHTT aa 1-100) but not with MW1 Ab by western blotting (Supplementary Fig. S1c).
Altogether, these data suggest that 1) EPR5526 Ab targets HTT first 17 aa (N17), located N-terminally to the polyQ tract; 2) the 28 kDa species is composed of GST-FLAG-N17 and 3) only GST-FLAG-HTTexon1 protein can be detected by Ab pairs used for MSD assay. Quantification of protein concentration by absorbance at 280 nm, which measures absorbance of both GST-FLAG-HTTexon1 and GST-FLAG-N17 in solution, showed different results than Coomassie blue staining, which allowed a relative quantification of GST-FLAG-HTTexon1 (Supplementary Fig. S1d). To adjust protein concentration of GST-FLAG-HTTexon1 estimated by absorbance at 280 nm, correction factors for each batch of GST-FLAG-HTTexon1 protein were estimated based on relative quantification after SDS-PAGE (Supplementary Table S1).
MSD assay. Multi-Array 96-well standard plates (MSD) were coated overnight at 4 °C on a flat surface with 30 µL of D7F7 or EPR5526 capture Ab (2 µg/mL) in PBS pH 7.4 (Thermo Fisher Scientific). Plates were emptied and blocked with 150 µL of 3% bovine serum albumin (BSA) in PBS-Tween 0.05% pH 7.4 for 2 h at room temperature and 1,000 rpm on orbital microplate shaker (Scientific Industries). After 3 washes with 150 µL of washing buffer (PBS-Tween 0.05% pH 7.4), 30 µL of diluted samples were distributed into plates and incubated 1 h (for purified proteins) or 2 h (for biological samples) at room temperature and 1,000 rpm.
The amount of biological material tested was adjusted for each pair of Abs to obtain signal in the linear dynamic range of detection: ~10 μg of total protein of mouse derived material (for D7F7-MW1 and EPR5526-MW1 mHTT assays); ~140 μg of total protein of mouse derived material (for EPR5526-MAB5492 Total HTT assay); 6 μg of total protein of human derived material for EPR5526-MW1 mHTT assays and 50 or 80 μg of total protein of human derived material for D7F7-MAB2166 or EPR5526-MAB5492 Total HTT assays respectively. Plates were then washed 3 times with washing buffer and incubated with 30 µL of detection antibody and incubated for 1 h at room temperature and 1,000 rpm. Depending on the type of sample, different concentrations of detection Abs were used for optimal signal-to-noise ratio: MW1 (2 µg/mL); 3B5H10 (2 µg/mL); 1C2 (1:1,000 or 1:2,500); MAB5492 (1:5,000 or 1:20,000) and MAB2166 (1:10,000). After 3 washes with 150 µL of washing buffer, 30 µL of goat anti-mouse SulfoTag secondary Ab (2 µg/mL) were distributed into plates and incubated 1 h at room temperature and 1,000 rpm. After 3 washes with 150 µL of washing buffer, 150 µL of 2X Read Buffer T with surfactant (MSD) were distributed into plates before reading on QuickPlex SQ120 instrument (MSD) according to manufacturer's instructions.
Regression analysis of MSD data. Calibration curves of purified proteins were fitted with a four-parameter logistic regression model, 1/Y² weighting and least squares' method using Solver, a Microsoft Excel Office 365 software add-in program. Four-parameter logistic regression model is:

$$Y = \mathrm{Bottom} + \frac{x^{\mathrm{HillSlope}}\,(\mathrm{Top} - \mathrm{Bottom})}{x^{\mathrm{HillSlope}} + \mathrm{EC50}^{\mathrm{HillSlope}}} \qquad (1)$$

where Bottom and Top are plateaus of MSD signal; x is protein concentration; EC50 is the protein concentration that gives MSD signal half way between Bottom and Top; HillSlope is a factor representing the steepness of the standard curve. Slope of standard curves in the linear dynamic range were determined as shown in Supplementary Data Set 1. For other regression analysis, different models (linear, exponential, power, logarithmic and power) were tested with the least squares' method using Microsoft Excel Office 365 software. Regression models with highest R-squared value were selected.
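The four-parameter logistic model and the 1/Y² weighted least-squares objective described above can be sketched in plain Python. This is a minimal illustration, not the Excel Solver workbook used in the study: the function names are hypothetical, the parameter values in the example are arbitrary, and the objective is only evaluated here (a numerical minimizer would be needed to perform the actual fit).

```python
from typing import Sequence, Tuple


def four_pl(x: float, bottom: float, top: float,
            ec50: float, hill_slope: float) -> float:
    """Four-parameter logistic curve: MSD signal as a function of
    protein concentration x (x >= 0, hill_slope > 0 assumed)."""
    xh = x ** hill_slope
    return bottom + xh * (top - bottom) / (xh + ec50 ** hill_slope)


def weighted_sse(params: Tuple[float, float, float, float],
                 concentrations: Sequence[float],
                 signals: Sequence[float]) -> float:
    """Sum of squared residuals with 1/Y^2 weighting, where Y is the
    observed signal. This is the quantity a least-squares fit minimizes."""
    bottom, top, ec50, hill_slope = params
    return sum(
        (y - four_pl(x, bottom, top, ec50, hill_slope)) ** 2 / y ** 2
        for x, y in zip(concentrations, signals)
    )


# Arbitrary example parameters (assumption for illustration)
params = (100.0, 5000.0, 10.0, 1.5)
concs = [0.1, 1.0, 10.0, 100.0]
sim_signals = [four_pl(c, *params) for c in concs]
objective_at_truth = weighted_sse(params, concs, sim_signals)
```

The 1/Y² weighting makes each point contribute according to its relative rather than absolute deviation, which keeps the high-signal end of the curve from dominating the fit.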
Quantification of average polyQ length. When the amount of HTT is assessed in the linear dynamic range of our MSD assays, then:

$$\frac{\mathrm{MSD\ signal}_{\mathrm{MW1}}}{\mathrm{MSD\ signal}_{\mathrm{MAB5492}}} = \frac{\mathrm{Slope}_{\mathrm{MW1}}}{\mathrm{Slope}_{\mathrm{MAB5492}}} \qquad (2)$$

and we showed in Fig. 1d that:

$$\frac{\mathrm{Slope}_{\mathrm{MW1}}}{\mathrm{Slope}_{\mathrm{MAB5492}}} = f(\text{polyQ length}) \qquad (3)$$

Either the ratio of the MSD signal obtained with MW1 to the MSD signal obtained with MAB5492, or the ratio of the slope of the MSD signal obtained with MW1 to the slope of the MSD signal obtained with MAB5492, could be used to quantify average polyQ length in biological samples. For MSD signal ratios or ratios of slope of MSD signals, propagation of error was calculated by the equation:

$$\sigma_{A/B} = \frac{A}{B}\sqrt{\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2}$$

where A and B are the mean values of the numerator and denominator and σ_A and σ_B their standard deviations. PolyQ length was extrapolated from standard curve obtained by testing different concentrations of GST-FLAG-HTTexon1 with different polyQ lengths.
PCR assay. Genomic DNA was isolated from tissues for somatic instability analysis using the DNeasy Blood & Tissue Kit (Qiagen). The size of the HTT CAG repeat was determined using a PCR assay that amplifies the HTT CAG repeat. The forward primer was fluorescently labeled with 6-FAM (Applied Biosystems) and products were resolved using the ABI 3730xl DNA analyzer (Applied Biosystems) with GeneScan 500 LIZ as internal size standard (Applied Biosystems).
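The signal-ratio calculation described under "Quantification of average polyQ length" can be sketched as follows. Several assumptions are made and should be read as such: the error term uses the standard first-order propagation formula for a ratio of two independent measurements, the calibration between ratio and polyQ length is taken to be linear, and the function names and calibration coefficients are hypothetical rather than taken from the study.

```python
import math
from typing import Tuple


def ratio_with_sd(mhtt: float, sd_mhtt: float,
                  total: float, sd_total: float) -> Tuple[float, float]:
    """mHTT/Total HTT signal ratio and its propagated standard deviation
    (first-order formula, independent errors assumed)."""
    ratio = mhtt / total
    sd = ratio * math.sqrt((sd_mhtt / mhtt) ** 2 + (sd_total / total) ** 2)
    return ratio, sd


def polyq_from_ratio(ratio: float, slope: float, intercept: float) -> float:
    """Invert a hypothetical linear calibration: ratio = slope * Q + intercept."""
    return (ratio - intercept) / slope


# Placeholder signals and calibration coefficients (illustration only)
ratio, ratio_sd = ratio_with_sd(mhtt=200.0, sd_mhtt=20.0, total=100.0, sd_total=10.0)
estimated_q = polyq_from_ratio(ratio, slope=0.05, intercept=-0.4)
```

Both signals must come from the linear dynamic range of their respective assays for the ratio to be interpretable, as stated in the text.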
Quantification of average CAG repeat length. PCR amplification of trinucleotide repeats from tissue prone to CAG instability generates multiple PCR products, viewed using GeneMapper software as a cluster of peaks differing by a single CAG repeat unit 41 . The following steps were used to determine the average CAG repeat