Motivational, proteostatic and transcriptional deficits precede synapse loss, gliosis and neurodegeneration in the B6.HttQ111/+ model of Huntington’s disease

We investigated the appearance and progression of disease-relevant signs in the B6.HttQ111/+ mouse, a genetically precise model of the mutation that causes Huntington’s disease (HD). We find that B6.HttQ111/+ mice are healthy, show no overt signs of central or peripheral inflammation, and no gross motor impairment as late as 12 months of age. Behaviorally, we find that 4–9 month old B6.HttQ111/+ mice have normal activity levels and show no clear signs of anxiety or depression, but do show clear signs of reduced motivation. The neuronal density, neuronal size, synaptic density and number of glia is normal in B6.HttQ111/+ striatum, the most vulnerable brain region in HD, up to 12 months of age. Despite this preservation of the synaptic and cellular composition of the striatum, we observe clear progressive, striatal-specific transcriptional dysregulation and accumulation of neuronal intranuclear inclusions (NIIs). Simulation studies suggest these molecular endpoints are sufficiently robust for future preclinical studies, and that B6.HttQ111/+ mice are a useful tool for modeling disease-modifying or neuroprotective strategies for disease processes before the onset of overt phenotypes.

relationship between CAG size and pathogenesis in an otherwise isogenic system 5 . Recent molecular 6 and behavioral 7 analyses of an allelic series of HD knock-in mice reveal very discrete CAG-dependent disease signatures, suggesting these mice recapitulate a cardinal feature of human HD, namely the relationship between CAG tract length and the rapidity of disease progression 3,8 . Because knock-in models express Htt at endogenous levels from the endogenous locus they precisely mimic the genetics of human HD, however their use in preclinical studies has been limited because their overt neurological signs are very subtle compared to transgenic animals 9,10 . The bulk of preclinical research has therefore been conducted in transgenic models expressing either full-length or short fragments of mutant huntingtin, which display more phenotypes 11 .
The embracing of transgenic models by the field has largely been driven by the desire to model important signs and symptoms of HD, including progressive striatal atrophy and hyperkinetic motor disturbances. We consider that these phenomena emerge from a substrate of decades of progressive cellular and physiological dysfunction before manifesting in human patients as a set of clinical signs recognizable as "Huntington's disease" 12 . Targeting these late features of the disease process, including cell death and emergent motor symptoms, may enable the modification of late HD symptoms. However, this strategy doesn't allow targeting of the earliest disease changes that precede overt clinical disease features, which if clearly identified might ultimately provide paths to prevention and/or early-stage reversal. Finally, no HD mouse model develops chorea-like hyperkinetic movements, and mouse models present with modest striatal volume loss over the lifespan of the mouse 13,14 , while human disease results in the nearly complete atrophy of this critical structure 15 .
We and others acknowledge that knock-in mouse models of HD cannot fully recapitulate clinical symptoms that, in human mutation carriers, require approximately four decades of pathogenic processes to emerge and are followed by a further 15 years to progress to death 16 . However, these genetically faithful but subtle mouse models provide us the ability to study the earliest and subtlest changes caused by the CAG expansion of Htt -the very processes we wish to understand if we are to develop interventions to prevent the development of clinical HD onset, rather than addressing its symptoms only after significant damage to key processes has been incurred. Towards this end, we have here characterized the earliest observable changes in the murine B6.Htt Q111/+ (hereafter Htt Q111/+ ) model of the HD mutation 17 . Since their creation, these mice have been utilized to test specific hypotheses in a wide range of studies 18 , and a single unbiased phenotypic screen 10 , but have not been well characterized as a system for conducting preclinical studies. We therefore examined the incidence, progression and statistical power of a range of phenotypes in these mice, and find that they provide a powerful tool for studying hypotheses about the early disease events that stem from the HD CAG expansion mutation. We observe robust transcriptional, proteostatic and motivational phenotypes which progress from 3 to 12 months of age in Htt Q111/+ mice. Simulation studies suggest interventions that rescue specific signs by 10-25% are readily detectable in appropriately powered preclinical studies, making the Htt Q111/+ mouse a useful tool for testing disease modifying therapeutics in HD.

Results
General Animal Health and Inflammation. Consistent with previous reports of Htt Q111/+ mice on the CD1 and B6J strains 9,10 , no body weight changes were observed between 3 and 12 months of age (effect of genotype, F (1,94) = 3.1, p = 0.07, mean CAG length = 113, cohort summary in Table 1). Important plasma chemistry parameters, including protein, glucose and ion concentrations, as well as liver enzyme levels, were normal at 4 months of age in Htt Q111/+ mice 10 . Because we are interested in progressive phenotypes, we examined a suite of clinical chemistry parameters at 12 months of age -no changes were observed in any of these parameters in male Htt Q111/+ mice at this age (Table 2). Consistent with findings in younger mice, and coupled with body weight observations, these results suggest that up to 12 months of age Htt Q111/+ mice are grossly healthy. Subtle increases in concentrations of pro-inflammatory chemokines and cytokines have been observed in presymptomatic HD mutation carriers 19 and some (e.g. YAC128), but not all (e.g. BACHD), transgenic HD mice by 12 months of age 20 . To understand whether increased peripherally-detectable inflammation is occurring in Htt Q111/+ mice we quantified cytokines, chemokines and acute phase reactants in the plasma of 12-month-old, male Htt Q111/+ mice (multi-analyte profiling, Myriad RBM). We successfully quantified 16 of these molecules (Table 2: Eotaxin, EGF,  IP-10, IL-1β, IL-18, LIF, M-CSF-1, MDC, MIP-1α, MIP-3β, MCP-1, MCP-3, MCP-5, Thrombopoietin, TIMP-1, VEGF-A) but observed no changes in 12-month-old Htt Q111/+ mice. Other important pro-inflammatory molecules (FGF9, FGF-basic, GM-CSF, KC/GRO, IFN-γ, IL-1α, IL-2, −3, −5, −6, −7, −10, −11, −12p70, −17a, MIP1-β, OSM, SCF, and TNF-α) were also assayed, but fell below the indicated limit of detection in all mice studied (lower limit of quantification, LLoQ, provided in Table 2). These data suggest that peripherally-detectable immune activation is either not occurring in 12-month-old, male Htt Q111/+ mice, or is sufficiently subtle to be missed by these analyses.

Striatal Transcriptional Alterations.
Altered transcription is an early and prominent feature of HD pathology, and is observed in both animal models and samples from human patients 12 . Because we are interested in intervening at the earliest possible point in the pathogenic process, we investigated the time course of transcriptional changes in the central nervous system of Htt Q111/+ mice. As an initial, untargeted pilot study, . Abbreviations: epidermal growth factor (EFG), interleukin (IL), interferon gamma-induced protein (IP), monocyte chemoattractant protein (MCP), macrophage colonystimulating factor (M-CSF), macrophage-derived chemokine (MDC), macrophage inflammatory proteins (MIP), tissue inhibitor of metalloproteinase (TIMP), vascular endothelial growth factor-A (VEGF-A), alkaline phosphatase (ALP), gamma-glutamyl transferase (GGT), alanine aminotransferase (ALT), aspartate aminotransferase (ASP). Inflammation markers measured below LLoQ (LLoQ listed): Fibroblast growth factor (FGF)−9 (2.4 ng/mL), FGF-basic (9.6 ng/mL), granulocyte-macrophage colony-stimulating factor (15 pg/mL), keratinocyte chemoattractant/human growth-regulated oncogene (0.028 ng/mL), interferon gamma (182 pg/mL), IL-1 alpha (100 pg/mL), IL-2 (70 pg/mL), IL-3 (5.9 pg/mL), IL-4 (60 pg/mL), IL-5 (0.6 ng/mL), IL-6 (4.7 pg/mL), IL-7 (0.19 ng/mL), IL-10 (125 pg/mL), IL-11 (56 pg/mL), IL-12p70 (0.091 ng/mL), IL-17A (0.018 ng/mL), MIP-1 beta (142 pg/mL), MIP-2 (11 pg/mL), oncostatin-M (0.24 ng/mL), stem cell factor (652 pg/mL), tumor necrosis factor-alpha (0.082 ng/mL). N = 4 male Htt Q111/+ , 4 male Htt +/+ , subset of 12-month mice from Table 1). we examined striatal and cerebellar genome-wide transcriptional changes in 3-and 9-month-old Htt Q111/+ mice using mRNA sequencing (RNASeq). At 3 months of age, no robust transcriptional alterations were detected in the striatum of Htt Q111/+ mice compared to wild-type mice (Fig. 1a, 5 quantified transcripts -Lct, Zfp385c, Smpx, Dsp, Crlf1 -have an effect of genotype with a false discovery rate (FDR) of < 5%). By 9 months of age, 726 transcripts were changed in the striatum of Htt Q111/+ mice compared to wild-type mice (Fig. 1a, 245 up-regulated, 481 down-regulated at FDR < 5%). The cerebellum, despite robustly expressing mutant Htt 21-23 , is relatively spared from pathological tissue loss in human HD patients 24 and the YAC128 mouse model of HD 14 . We therefore compared the rate and scale of transcriptional dysregulation between the cerebellum and striatum of Htt Q111/+ mice at 3 and 9 months of age. There were very few significant effects of CAG expansion in Htt in the cerebellum at either age (Fig. 1a, 3 genotype-sensitive transcripts, 2 down-regulated, 1 up-regulated at FDR < 5%). These data confirm that, as in humans with HD and other animal models, progressive striatal transcriptional dysregulation is a feature of aging Htt Q111/+ mice. We next examined pathway enrichment of dysregulated transcripts using both hypergeometric 25 and gene set enrichment analysis (GSEA) 26 techniques. The hypergeometric test examines over-enrichment of differentially expressed genes in a specific pathway, while GSEA quantifies whether a set of pathway genes is over-represented at the top or bottom of an ordered list of genes -here, the full list of 9-month striatal transcripts, ordered by the fold-change in Htt Q111/+ mice compared to wild-type mice. Hypergeometric testing of 726 genotype-sensitive (FDR < 5%) striatal transcripts (from a gene universe of 19,031 robustly assayed genes) revealed significant enrichment of striatal genotype-sensitive transcripts in 8 pathways with an adjusted p-value < 0.05, notably in neuronal signaling pathways (Fig. 1b). GSEA revealed 15 gene sets significantly enriched (FDR < 0.1) at the high or low end of the list of striatal transcripts, ordered by their genotype expression ratio (Htt Q111/+ /Htt +/+ ; Fig. 1c). The impact of genotype on the expression level of each of the 19,031 transcripts assayed is indicated -statistical significance on the y-axis (-log 10 FDR), and fold-change on the x-axis (log fold-change, Htt Q111/+ /Htt +/+ ). Color indicates statistical significance -black indicates FDR > 0.1, red and blue indicate FDR < 0.1 for up-and downregulated genes in the Htt Q111/+ striatum, respectively. (b) Network diagram depicting the 8 reactome pathways with adjusted p-values less than 0.05 in a hypergeometric analysis, and the proportion of genes shared between each pathway. The size of each node corresponds to the overall gene set size (e.g. signaling by GPCR pathway includes 853 genes, Gα z signalling events 44), while the width of the edges corresponds to the Jaccard similarity coefficient for the pair of gene sets (the intersection of the two sets divided by its union). (c) Network diagram depicting the 15 reactome pathways whose genes are non-randomly distributed on the ordered list of striatal transcripts with an FDR < 0.1, and the proportion of genes shared between each pathway. Blue nodes indicate down-regulated pathways, while red nodes indicate upregulation in the 9 month old Htt Q111/+ striatum. Nineteen common pathways had nominal enrichment p-values < 0.05 in both GSEA and hypergeometric analyses, including Reactome pathways involved in neurotransmission and cellular signaling (e.g. transmission across chemical synapses, neuronal system, signaling by GPCR, and GPCR downstream signaling; Table S1), suggesting synaptic and signaling transcripts are altered at the transcriptional level in the 9-month-old Htt Q111/+ striatum.
To orthogonally validate the observed striatal transcriptional alterations and examine their relevance as useful preclinical tools in Htt Q111/+ , we quantified specific transcript levels in the striatum of 3-, 9-and 12month-old Htt Q111/+ mice using quantitative real-time polymerase chain reaction (QRT-PCR). As predicted by cross-sectional RNASeq data ( Fig. 1), transcript levels of the critical striatal signaling gene Ppp1r1b, that encodes the protein dopamine-and cAMP-regulated phosphoprotein, Mr 32 kDa (DARPP32) are normal in Htt Q111/+ mice at 3 months of age, but reduced at both 9 and 12 months of age (effect of genotype F (1,47) = 6.74, p = 0.01, Figure 2a). Similarly, we find that levels of a suite of additional neuronal signaling and synaptic genes (including: Cnr1, Drd2, Homer1, Pde10a, Penk, Scn4b) are reduced in aged (9-and 12-month-old) but not young (3-month-old) Htt Q111/+ mice (Fig. 2a, effect sizes for each target shown in Fig. 2b). Other transcripts increase over time in Htt Q111/+ and include genes involved in DNA damage (e.g. N4bp2) or immune response (e.g. Islr2, H60b, Fig. 2a). These results confirm the findings of our RNAseq study and suggest that levels of a range of transcripts are sensitive genotype markers in the aging Htt Q111/+ striatum.
Striatal Histology. We also considered a range of histological endpoints in the dorsolateral striatum of 3-, 9-and 12-month-old Htt Q111/+ mice. Compared to tissue-level analyses such as RNASeq and QRT-PCR, immunohistochemistry (IHC) enables the identification of specific cell types and analysis of subcellular localization. First, a trivial explanation for the observed transcriptional alterations in the striatum of the Htt Q111/+ mice is that the cellular composition or relative cell sizes within the striatum has changed. To examine this possibility, we counted putative neurons, astrocytes and microglia in the dorsolateral striatum (respectively: NeuN-, glial fibrillary acidic protein-and allograft inflammatory factor 1-immunoreactive cells). Between 3-12 months of age, the neuronal density in the dorsolateral striatum of Htt Q111/+ mice does not change (Fig. 3a, F (1, 100) = 0.03, p = 0.87), nor does the cross-sectional area of NeuN-immunoreactive soma in the dorsolateral striatum at 9 months (NeuN-immunoreactive cell size distributions shown in Fig. 3b, two-sample Kolmogorov-Smirnov (K-S) test D = 0.02, p = 0.9). We next quantified astrocytic and microglial density in the dorsolateral striatum of 12 month old Htt Q111/+ and Htt +/+ mice, finding that neither is increased in 12 month old Htt Q111/+ mice (Fig. 3d). Progressive loss of synaptophysin density has been reported in transgenic HD and other neurodegenerative disease mouse models 27 . We quantified synaptophysin density in the dorsolateral striatum and find that Htt Q111/+ and Htt +/+ mice have equivalent levels of synaptophysin staining from 3-12 months of age ( Fig. 3c, F (1, 93) = 3.71, p = 0.057). These results suggest that the robust transcriptional alterations we observe in the aging Htt Q111/+ striatum ( Fig. 1) are not due to altered numbers or size of neurons, loss of synaptic compartments and that overt gliosis is not a component of the phase of HD modeled by the Htt Q111/+ mice.
We also examined the appearance of aggregated striatal huntingtin using the MW8 antibody, which detects aggregated, but not diffuse, mutant huntingtin 28 , but to our knowledge has not yet been applied to study Htt Q111/+ mice. We restricted our analyses to neuronal cell bodies, immunoreactive for NeuN, a well-described pan-neuronal marker 29 . In the dorsolateral striatum of 3-month-old Htt Q111/+ mice there is virtually no neuronal MW8 immunoreactivity, but by 9 months of age we observe robust accumulation of both small punctate and large inclusion staining in neuronal nuclei (Fig. 4a). Between 9 and 12 months of age, increased levels of neuronal intranuclear inclusions (NII) in the Htt Q111/+ striatum are accompanied by a reduction in the total nuclear MW8 immunoreactivity. Total neuronal nuclear MW8 immunoreactivity increases from 3 to 9 months of age, before declining at 12-months of age (Kruskal-Wallis: H (5) = 80.7, p < 0.0001, Dunn: 3-month genotype p = 0.15, 9-month genotype p < 0.0001, 12-month genotype p < 0.0001; Fig. 4b). This rise and fall in total neuronal nuclear MW8 immunoreactivity is accompanied by a progressive increase in the percentage of cells with large NIIs, from 0% at 3-months to 13% at 9-months and 28% by 12 months of age (Kruskal-Wallis: H (5) = 87.9, p < 0.0001, Dunn: 3-month genotype p = 1, 9-month genotype p < 0.0001, 12-month genotype p < 0.0001, Fig. 4c). In parallel, the average size of striatal NIIs in Htt Q111/+ mice increases more than 40% from 1.3 ± 0.7 μm at 9 months to 1.8 ± 0.9 μm at 12 months of age (two-sample K-S test, D = 0.29, p < 0.0001). Impaired autophagy has been proposed to contribute to impaired proteostasis in HD 29,30 , and in the brain mHTT interacts directly with p62/ Sqstm1, an autophagic receptor protein important for selective macroautophagy 31 . We observed complete co-localization between p62 and MW8 immunoreactivity in striatal NII's in Htt Q111/+ mice (Fig. 4d) suggesting that, as has been observed in cell culture 31 , p62 is found in NIIs. We finally considered whether quantitative IHC for specific MSN targets is superior to QRT-PCR quantification. As a proof of concept, we quantified neuronal DARPP32 levels in corticostriatal sections. Consistent with mRNA reductions (Fig. 2), we observe reduced neuronal somatic DARPP32 levels in the Htt Q111/+ striatum ( Fig. 5; Kruskal-Wallis: H (5) = 19.6, p = 0.0015, Dunn: 3-month genotype p = 0.12, 9-month genotype p = 0.09, 12-month genotype p = 0.006). We find that, for DARPP32, quantification of IHC data is more variable than QRT-PCR when establishing reductions in the Htt Q111/+ striatum.
Behavioral Analyses. We cross-sectionally analyzed behavior in several cohorts of 3-9 month old Htt Q111/+ mice, including assays of motivation, anhedonia, depression, and anxiety (cohorts described in methods). We first measured reward seeking behavior in Htt Q111/+ mice using an operant fixed ratio 1 (FR1) behavioral assay to examine how many sweet rewards mice will perform for in a 5 minute block at 6 months of age. We found that Htt Q111/+ mice demonstrate reduced reward attainment relative to Htt +/+ mice ( Fig. 6a; Genotype: F (1,14) = 17.0, p = 0.001: Genotype x Reward Size: F (2,28) = 8.4, p = 0.001). In an attempt to discern hedonic from motivational explanations for this reduced reward attainment, we examined preference for either 2% or 4% sucrose in drinking water in a separate cohort of mice. We found no difference between 9 month old Htt +/+ and Htt Q111/+ mice (2%  a) Guided by transcriptional discovery with RNAseq, we quantified a number of transcripts and found that many synaptic and neuronal signalling transcripts were down as mice aged from 3-months to 9-and 12-months. As anticipated, we also found upregulated transcripts related to immune and DNA damage pathways (N = 60, subset of 5 mice per row from Table 1). Data are presented as boxplots. (b) Corresponding longitudinal effect sizes for each transcript (highlighted in colored bands above) are displayed in the same color below, with whiskers representing the 95% confidence interval range. Along the x-axis, transcripts are ordered by increasing effect size at 12-months, though robust effects are seen at both 9-and 12-months. Our results demonstrate that these transcripts are sensitive genotype markers that make suitable targets for assessing rescue in interventional trials using the Htt Q111/+ mouse model.
To explore the potential contributions of anxiety or depression to the observed reward attainment changes we conducted several tests of these phenotypes in a separate cohort of 9-month old Htt +/+ and Htt Q111/+ mice. First we conducted the Porsolt swim test, a measure of behavioral despair 30 , where we observe no differences between 9-month old Htt +/+ and Htt Q111/+ mice in the duration of time spent inactive during the task ( Fig. 6c; t(38) = 0.8, p = 0.9). Similarly, using the elevated plus maze task as a measure of anxiety 32 , we found that 9-month-old Htt +/+ and Htt Q111/ mice do not differ in the amount of time spent in the open arms ( Fig. 6d; t(37) = 0.7, p = 0.5), or in the percentage of total arm entries into the open arms, (t(37) = −1.3, p = 0.2). As a final measure of anxiety levels in Htt Q111/+ mice, we also employed the light/dark exploration task 32,33 , finding that Htt +/+ and Htt Q111/+ mice spend the same amount of time in the light and dark halves of the apparatus ( Fig. 6e; t(38) = 0.04, p = 0.9) and enter the light compartment of the maze a similar number of times (t(38) = −0.6, p = 0.5). Altogether, these data suggest that up to 9-months of age Htt Q111/+ mice do not display anxiety-related or depressive-like symptoms, and declines observed in reward seeking behavior may reflect presently undefined cognitive or motivational issues worthy of additional study.
Power Analysis for preclinical studies in Htt Q111/+ mice. To establish the utility of the natural history data collected here (QRT-PCR and IHC), we conducted several power analysis studies to understand whether these results would be useful in a preclinical trial setting. We considered a hypothetical 2 × 2 factorial experiment, with two genotypes (Htt +/+ vs. Htt Q111/+ ) and two treatment groups (baseline vs. a hypothetical treatment). We simulated data for Htt +/ and Htt Q111/+ mice in the 'treatment' condition by drawing values from a random normal distribution with the same mean, variance, and covariance as our real natural history data. Assuming a 50% rescue from a hypothetical intervention, we established the power of single molecular endpoints in a preclinical study with 10 animals per arm (4 arms -Htt +/+ and Htt Q111/+ mice in a treatment or control arm). By 12 months of age, several individual endpoints provide reasonable power to detect this 50% rescue, including reductions in striatal Scn4b mRNA levels and an increase in MW8-immunoreactive aggregate counts (Fig. 7a). We next considered whether combining information from multiple molecular endpoints would improve the power to detect a partial rescue. We trained an elastic net logistic regression model to classify Htt Q111/+ versus Htt +/+ mice using a weighted combination of the QRT-PCR (Scn4b, Drd1a, Cnr1, Darpp32, Homer1) and IHC (mHTT aggregates, mHTT area, DARPP32, cell counts, neuronal ratio) endpoints in the striatum. This model distinguished Htt Q111/+ vs. Htt +/+ mice with > 90% accuracy in training data from 9-and 12-month-old mice. The model assigns non-zero weights to 0 endpoints in 3-month-old mice, 9 in 9-month-old mice, and 9 in 12-month-old mice (regression coefficients for individual endpoints provided in Fig. 7d). With a sample size of 10, this multivariate model would have 80% power to detect a 35% rescue in 12-month-old mice or a 60% rescue in 9-month-old mice (Fig. 7b). With n = 20, the model had 80% power to detect a 25% rescue in 12-month-old mice or a 40% rescue in 9-month-old mice (Fig. 7c). These results suggest that combining information from multiple molecular endpoints can improve power to detect subtle effects of treatments. . Accumulation of both large neuronal-nuclear HTT aggregates, as well as small huntingtin nuclear speckles is present in Htt Q111/+ , but absent in Htt +/+ mice. (b) Total neuronal nuclear MW8 immunoreactivity increases from 3 to 9 months of age, before declining at 12 months of age (Kruskal-Wallis: H (5) = 80.7, p < 0.0001, Dunn: 3-month genotype p = 0.15, 9-month genotype p < 0.0001, 12-month genotype p < 0.0001; N = 108, breakdown in Table 1. (c) Using a neuronal (NeuN) mask with particle inclusion from 0.5-5 μm 2 shows a median of 10% of neurons contain large nuclear aggregates by 9-months, which increases to 27% at 12 months of age. The presence of NIIs provides a robust measure for huntingtin accumulation with an extremely robust effect size for genotype comparisons at 9-and 12-months (d = 1.6 and 6.7, respectively; error bars = 95% confidence interval). Kruskal-Wallis: H (5) = 87.9, p < 0.0001, Dunn: 3-month genotype p = 1, 9-month genotype p < 0.0001, 12-month genotype p < 0.0001; N = 108, breakdown

Discussion
We have here characterized the health and progression of phenotypes in the Htt Q111/+ mouse, which accurately model the zygosity and Htt expression level of human HD patients. We found that, in contrast to transgenic N-terminal HTT fragment models of the disease, which have rapidly progressing and terminal disease, the Htt Q111/+ mice were grossly healthy through 12 months of age in terms of body weight, plasma chemistry and both central and peripheral inflammation, extending previous observations of these mice 10 . Despite their generally healthy state in their first year of age, during this interval they presented striatal-specific, progressive, molecular changes consistent with those observed in other animal models of HD and indeed human mutation carriers, as well as specific behavioral phenotypes consistent with early HD. Using simulation studies, we demonstrated that these animals provide a powerful tool for experimental therapeutics targeting these early molecular changes.
In a recent review, Menalled and Brunner 11 demonstrated that less than 5% of preclinical studies of reviewed therapeutic agents in HD were tested in knock-in models of the disease. The R6 and N171 lines of mice, which transgenically express short fragments of mHtt, were much more commonly used, together comprising 72% of all preclinical studies examined. Translation from animal models of neurodegeneration to human clinical trials has been disappointing, with a large number of agents predicted to be useful in transgenic animals failing to provide clinical benefit in human HD, for example 11 . A major benefit of screening therapeutics in short fragment mouse models of HD is that they present robust behavioral phenotypes, including motor alterations, affective impairments and cognitive alterations 4 . Similar, though less pronounced, phenotypes have been observed in full-length YAC128 and BACHD mice 9 . Distinguishing central effects from peripheral ones is difficult in these animals because they either progressively lose (e.g. R6/2 34 and N171 34,35 ) or gain (YAC128 36 and BACHD 9 ) significant amounts of body weight in parallel with the development of progressive molecular and behavioral phenotypes. Large observational studies have suggested that HD patients may, on average, lose weight 9,37 , as observed in short fragment mouse models of HD. However, more recently, careful and well-powered studies of presymptomatic HD mutation carriers and early stage patients reveals they have normal body composition and plasma chemistry 38 , consistent with our observations in aging Htt Q111/+ mice (Table 1).
While presymptomatic HD mutation carriers do not show clinically obvious motor impairments, by definition, they do show clear sub-clinical cognitive impairments 39 and progressive affective alterations. Amongst affective changes in HD mutation carriers, apathy has emerged as the most widely observed 40 , and uniquely amongst neuropsychiatric symptoms, progresses in a continuous fashion with disease state 40,41 . Like presymptomatic HD mutation carriers, previous studies of Htt Q111/+ mice revealed normal gross motor behavior and grip strength through 12 months of age 9,10,42 . Recent work with q175 HD knock-in mice suggests they present subtle motivational alterations detected using mixed fixed-/progressive-ratio operant tasks 43 . Similarly, reduced executive function has been observed in Htt Q111/+ mice using delayed non-match to sample tasks 42 . The data presented here (Fig. 6) suggest that 9-month old Htt Q111/+ mice do not have enhanced anxiety or depression-like behaviors, but do show motivational deficits, most notably a reduced performance on a fixed-ratio 1 task, despite normal hedonic drive for sweet stimuli at this time point. These behavioral studies, in conjunction with the body weight and  Table 1). Data are presented as boxplots, effect size error bars = 95% confidence interval.
general health data presented here, suggest that Htt Q111/+ mice more closely resemble pre-symptomatic mutation carriers than do short fragment models of HD. Importantly, this suggests interventions targeting apathy, the single most prevalent and progressive psychiatric manifestation of HD 41,40 , can be tested in knock-in models of HD.
Transcriptional dysregulation in HD has long been noted in both human samples and samples taken from experimental models. Here, using mRNA sequencing techniques we provide quantitative evidence about the degree to which striatal transcriptional changes exceed transcriptional changes in a relatively spared tissue in HD, the cerebellum. Indeed, at 9 months of age, the striatum has more than 250-fold more differentially expressed transcripts than does the cerebellum at this age (Fig. 1A). This is consistent with the fact that the cerebellum is grossly spared from pathological volume change in both human patients 24 and full-length mouse models of HD 14 . A concern with transcriptional changes in pathologically affected tissue is that tissue architecture changes (in either cellular composition or relative cell size changes) may explain alterations in mRNA abundance in tissue level analyses. The evidence presented here suggests that robust transcriptional dysregulation in the striatum of Htt Q111/+ mice precedes any alterations in the density of striatal neurons, astrocytes, microglia or synapses. In fact, of the presented phenotypes, altered transcript levels are amongst the most powerful features for distinguishing Htt +/+ from Htt Q111/+ mice at 9-12 months of age (Fig. 7). These data confirm that striatal transcriptional dysregulation occurs in the absence of changes in cell number in early HD, occurring the absence of neurodegeneration, gliosis or overt loss of synapses. Also consistent with previous observations of Htt Q111/+ mice at younger ages 17 , we  observe progressive striatal-specific accumulation of neuronal intranuclear inclusions (NIIs) in Htt Q111/+ mice. Similar to aggregates found in human HD, these large inclusions are primarily restricted to neurons (we do not observe any glial NIIs), and co-stain with autophagic cargo marker p62 (Fig. 4). Our power analyses suggest that treatments that alter either mHTT levels, or improve proteostasis and thereby alter aggregate formation, can be robustly tested in vivo using the Htt Q111/+ mice with adequate power.
While we find that the Htt Q111/+ mice provide a number of useful reflections of human HD, there are also significant limitations to their truly modeling Huntington's disease. First, even healthy captive mice live only approximately 2 years 44 , while human HD generally has onset only after 30-50 years of relatively unaffected life 45 . This discrepancy in lifespan may explain why very long CAG repeats (here, approximately 113 - Table 1) are required to lead to robust molecular or behavioral changes. Repeats of this length would be predicted to cause juvenile onset HD in the human context, with fulminant behavioral and neuropathological consequences that are not observed in any mouse models described 46 . Alternatively, or in addition, the different neural circuity 47 , transcriptional and proteomics networks 6,48 of the human brain could explain the selective vulnerability of human neurons to mutant-huntingtin induced cell dysfunction and death.
In summary, we present data suggesting that the Htt Q111/+ mice are grossly healthy during their first year of life, yet present a range of molecular and behavioral alterations consistent with presymptomatic HD mutation carriers. Our power analyses revealed that these changes can provide sufficiently powered endpoints for preclinical studies of neuroprotective therapies for HD with a reasonable duration and number of subjects. In conjunction with other recently described cognitive and affective 10,42,43 behavioral phenotypes in these animals, we propose that knock-in models provide greater face validity, and thereby potential translatability, than transgenic models of HD for future preclinical studies.

Methods
Mice. We used 4 cohorts (detailed below) of mixed-gender B6.Htt Q111/+ mice for the described studies (Research Resource Identifier:IMSR\_JAX:003456). The creation of the Htt Q111/+ line has been described 5 . Experiments for cohort 1 were conducted in accordance to NIH Guide for the Care and Use of Laboratory Animals and was approved by the Massachusetts General Hospital (MGH) Subcommittee of Research Animal Care (SRAC). Experiments for cohort 2 and 3 were conducted in accordance to NIH Guide for the Care and Use of Laboratory Animals and approved by the Western Washington University (WWU) animal care and use committee. Experiments for cohort 4 were conducted in accordance to the UK Animals (Scientific) Act 1986, European Union Directive 2010/63/EU and approved by the Cardiff University local ethical review committee.

Cohort 1 (RNA sequencing).
The cohort used to generate the RNASeq data presented in Fig. 1 was generated at Massachusetts General Hospital (MGH). Cohorts of 58 total mixed-gender littermate mice (12 × 3-month and 16 × 9-month Htt Q111/+ ; 12 × 3-month and 16 × 9-month Htt +/+ ) mice were generated by crossing B6.Htt Q111/+ males with C57Bl/6 J females. The size of the Htt CAG tract ranged from 126 to 135 (mean 131), with no significant differences observed across the groups of this cohort. The mice were part of a larger study of the impact of dietary fat on metabolism 49 -the mice described in Fig. 1 were fed chow from weaning until 9 months of age with either 60% kcal/fat (Open Source Diet D12492) or 45% kcal/fat (Open Source Diet D1245). Striatal and cerebellar transcriptional effects of these diets were collapsed for the analysis presented in Fig. 1.

Cohort 2 (Molecular natural history).
The cohort used to generate the molecular natural history data, presented in Figs 2-5 and 7, was bred and maintained at the Jackson Labs (Bar Harbor, Maine) and shipped to the WWU vivarium approximately 2 weeks before the endpoints under investigation. Three groups of mice were sacrificed at 3 (90 ± 4 days), 9 (295 ± 2 days) or 12 (369 ± 5 days) months of age (Table 1).

Cohort 3 (Behavioral cohort, WWU).
The cohort used to collect the behavior data at WWU (Porsolt swim test, elevated plus maze, light/dark box, sucrose solution preference) were bred and maintained at Jackson Labs (Bar Harbor, Maine) and consisted 40 female mice (20 Htt +/+ ; 20 Htt Q111/+ ). Htt CAG tract ranged from 107 to 119 (mean 114). Mice were shipped to the WWU vivarium at 8 months of age and acclimated to reversed light cycle conditions (lights on from 12 am to 12 pm) for four weeks prior to testing at 9 months of age. Behavioral assays were conducted between 1:00 pm and 5:00 pm during the active phase. Behavioral cohort, Cardiff). The Cardiff cohort consisted of 16 mice (8 Htt +/+ ; 8 Htt Q111/+ ), bred in-house from founders originally obtained from Jackson Labs (Bar Harbor, Maine), was used for the FR1 and sucrose pellet preference experiments. The mice were housed under 12-hour light/dark cycle with free access to food and water outside of experimental periods. FR1 operant testing began when the mice were 4 months of age, followed by the sucrose pellet preference experiment at 6 months of age. All experimental procedures were initiated between 8 am and 9 am.

Cohort 4: (
Tissue isolation and sample processing. After a three hour fast, mice in the molecular natural history cohort were euthanized via IP sodium pentobarbital injection (Fatal Plus, Henry Schein). Whole blood was collected by cardiac puncture with EDTA (25 μM) and plasma was extracted. Mice were then transcardially perfused with phosphate buffered saline (PBS) to clear tissues of blood. Whole brains were extracted and placed in a Brain Slicer Matrix (Zivic). A midline longitudinal cut was made to separate hemispheres. Coronal cuts were made in the left hemisphere 1-and 4-mm posterior to the junction between the olfactory bulb and cortex, resulting in a 3-mm thick corticostriatal block of tissue ( Figure S2) that was formalin fixed (6-8 hours), paraffin embedded, sectioned (5-μm), and mounted onto glass slides for immunohistochemistry. The contralateral hemisphere was dissected into striatum, cortex, and cerebellum, and flash frozen for molecular analyses.
Plasma chemistry and immune profiling. Plasma clinical chemistry was measured using an AU2700 Chemistry Analyzer (Beckman Coulter) at Phoenix Central Laboratories (Mukilteo, WA, USA). A panel of 37 cytokines, chemokines and acute phase reactants was assessed using a multiplexed cytometric bead array immunoassay (Mouse InflammationMAP 1.0, Myriad RBM, Inc., Austin, TX, USA).

Library construction, RNA Sequencing and RNASeq analysis.
For the RNA Sequencing studies in Fig. 1, RNA was extracted using the RNeasy Lipid Tissue Mini Kit (Qiagen). Quality control for RNA was conducted using the Agilent 2100 Bioanalyzer with Agilent RNA 6000 Nano kit. Libraries for sequencing were constructed using the Illumina TruSeq RNA Sample Prep Kit and sequenced on a HiSeq 2000 (2 × 50 bp) to a read depth of 2.4 × 10 7 ± 6.2 × 10 6 per sample. The fastq files were aligned using the default parameters of SNAPR 50 (https://github.com/PriceLab/snapr) against the GRCh38 genome assembly, along with the transcriptome assembly gtf file from Ensembl, GRCh38.75. SNAPR generates read counts for both genes and transcripts simultaneously with alignment. All alignments were performed on Amazon EC2 c3.8xlarge instance using a Ubuntu14.04 base AMI. Differential gene expression was conducted using the edgeR package 51 , and pathway enrichment using HTSanalyzeR 52 in Bioconductor 53 .
Immunohistochemistry. Deparaffinization washes were as follows: 2 × 3 min xylenes, 3 min xylenes:ethanol (1:1), 3 min 95% ethanol, 3 min 80% ethanol, 3 min 70% ethanol, 3 min 50% ethanol, H 2 O rinse. Heat-mediated antigen retrieval: 20 min at 95 °C in citrate buffer (pH 6) or Tris-EDTA buffer (pH 9). Sections were blocked in 20% goat serum in PBS for 1 hour, followed by overnight incubation in primary antibody at 4 °C; GFAP-stained slides were blocked with a mouse-on-mouse kit (Vector). Following wash, secondary antibody incubation was 1 hour at room temperature. Slides were mounted and sealed with DAPI fluormount-G (Southern Biotech). AIF1 slides were prepared from free-floating 40 Image acquisition and analysis. For all immunolabeled sections, 12-bit images were acquired with an IX-81 laser-scanning confocal microscope with Fluoview 1000 software (Olympus) using a 60x/1.42 NA oil objective, with the exception of AIF/GFAP which used a 40x/1.30 NA oil objective. For each secondary antibody, acquisition parameters were set such that a brain section with no primary antibody emitted no fluorescent signal and settings were maintained for the entire set of sections. Z-stack numbers and thickness were kept consistent for each set of sections. Maximum z-projections were compiled using ImageJ 54 . NeuN and/or DAPI masks were automatically created using ImageJ default thresholds, and used to measure neuronal and/or nuclear expression of MW8, p62, and DARPP32 (Fig. 5B). AIF1 and GFAP were manually counted. Synaptophysin was quantified as percent area of image with synaptophysin immunoreactivity. Experimenters were blind to genotype for antibody application, image acquisition, and analysis.
Behavior. Elevated Plus Maze. The plus maze was made of white acrylic with two open arms (25x5 × 0.5; room lux ~475) and two closed arms (25x5 × 16 cm) extending from a central platform elevated 50 cm above the floor. Each session began by placing a mouse in the intersection of the arms with its head directed toward the closed arm located opposite of the experimenter and ended after 10 minutes of free exploration. Activity was video recorded and analyzed using EthoVision XT 8 (Noldus).
Light/Dark Exploration. The light/dark exploration box used in this experiment was constructed of acrylic plastic divided into a clear, brightly illuminated side (27 × 27 × 30 cm; room lux ~475) and a black, fully enclosed side (18 × 37 × 30 cm) separated by a 5.5 × 5.5 cm opening that permitted the animals to move freely between each compartment. Each trial began by placing a mouse in the illuminated section of the apparatus facing the entrance to the dark section, and ended after 10 minutes of free exploration. Activity was video recorded and analyzed using EthoVision XT 8 (Noldus).
Porsolt Swim Test. Mice were placed into a clear 8-quart bucket (28 cm tall, 22 cm in diameter) filled ¾ full with room temperature water. Each mouse was placed into the bucket for a single 6-minute session, and swimming activity was recorded using a camera mounted on a tripod oriented toward the side of the apparatus. Inactivity during the last 4 minutes of each session was assessed by two separate experimenters blind to genotype (interrater reliability = 0.89), and a single composite score was calculated for subsequent analyses.
Sucrose Consumption Test. Mice were given a sucrose eating challenge to determine whether free access to sucrose would produce different levels of consumption between the genotypes. For these sessions the mice were given 95% sucrose pellets (AIN-76A, TestDiet, Richmond, IN) for 10 hours per day for 4 days, and normal lab chow for 14 hours. The amount of sucrose and chow consumed was measured daily by weight of 5 mg sucrose pellets/chow consumed and expressed as g/kg of bodyweight.
Sucrose Preference Test. Two-bottle sucrose preference testing occurred in the WWU vivarium over eight days. On day 1, animals were single housed and provided access to two identical 50 mL dual bearing sipper tubes containing tap water. On day 2, one of the tap water tubes was replaced with a 2% sucrose solution, and mice were provided free access to both sipper tubes for 72 hours. On day 5, the sipper tube containing the 2% sucrose solution was replaced with a 4% sucrose solution, and mice were provided free access for another 72 hours. The location of the sipper tube containing sucrose solution relative to tap water was counterbalanced between experimental animals and rotated each day to avoid place preference. Every 24 hours, experimenters weighed both sipper tubes to determine consumption from each. The amount consumed from each tube was expressed as a percentage total consumption per day; average consumption for each sipper tube during each 72-hour period was compared between genotypes Operant Testing. Operant testing was conducted in 16 aluminium/steel 9-hole box (14 cm × 13.5 cm × 13.5 cm) operant chambers (Campden Instruments, UK). On the rear wall of each chamber situated 15 mm from the grid floor, a horizontal 9-hole light array was fixed that had 9 response holes (11 mm diameter and 2 mm apart) that contained lights at the rear and infra-red (IR) sensors at the front, such that nose poke responses to light stimuli could be detected with an IR beam break. For the present experiments only the central response hole (hole 5 of 9) was used with the other holes blocked. On the inner of the front wall a food magazine was placed that allowed the mouse to recover sweet liquid rewards (Yazoo strawberry milk, Campina Ltd, UK), delivered by peristaltic pump after a successful response to the light stimuli. Initial training consisted of non-contingent reward presentations to the magazine that were signalled by the illumination of a light in the magazine. Removal of the head from the magazine was detected by an IR beam across the magazine entrance which then reset the trail. This was followed by training the mice to respond with a nose poke to illuminated hole 5 on the light array. A response in lit hole 5 extinguished the light and illuminated the magazine light to signal reward delivery. On removal of the head from the magazine, the magazine light was extinguished and after a 2 s intertrial interval, a new trial began with the illumination of the stimulus light in hole 5 thereby continuing the fixed ratio 1 (FR1) schedule of reinforcement. Once all mice could successfully complete 30 trials in a 30 minute session, the reward seeking probes began. Nine 45 minute FR1 sessions were run using, 3 (x3) counterbalanced reward pump durations (40 ms, 60 ms, 100 ms), to determine sensitivity to different reward magnitudes with the key outcome measure being number of rewards obtained.
Statistical Analysis. Statistics were processed in R 3.2.3 55 . Data were tested for normality (Anderson-Darling test) 56 and homoscedasticity (Levene's test) 57 . If data met parametric assumptions, we fit linear models analysed by ANOVA. If data violated these parametric assumptions, Kruskal-Wallis test was used as an omnibus test and followed up with Dunn's test to determine whether pairwise differences between genotypes were significant with Bonferroni corrections for multiple post-hoc comparisons. Data presented in Figs 2-6 used boxplots -horizontal lines indicate 25th, 50th and 75th percentile, while the vertical whiskers indicate the range of data. Data falling outside 1.5 times the interquartile range are graphed as isolated points, but were not excluded from statistical analysis 58 . Simulated distributions for power analysis were constructed with the mvrnorm function in the MASS package 59,60 . The elastic net classifier was constructed using glmnet 59 . Endpoint QRT-PCR and IHC data and code for power analysis are available at https://github.com/seth-ament/hd_endpoints. Graphics were produced using ggplot2 61 and Illustrator (Adobe).