Huntington’s disease (HD) is a fatal, progressive neurological disorder caused by a coding cytosine-adenine-guanine (CAG) expansion in the huntingtin (HTT) gene, where repeat lengths above 40 result in full penetrance of the disease1. Patients with HD display a clinical triad of cognitive, psychiatric, and motor symptoms2. Although chorea is a prerequisite for formal clinical diagnosis3,4 and the most recognized HD related deficit5, cognitive and psychiatric symptoms often appear during the prodromal phase of the disease, as many as 10 years prior to the onset of motor dysfunction6,7. Cognitive symptoms in HD include various impairments in learning and memory, deficits in executive function8, and difficulty recognizing emotional states9,10, while the most commonly identified psychiatric manifestations include apathy, anxiety, depression, irritability, perseveration, and obsessive behaviors6,7,11.

The cognitive and behavioral aspects of HD contribute to significant declines in functional capacity12 and are often described as being the most burdensome symptoms to both HD patients and their families13. In the absence of effective disease modifying treatments, current therapeutic options for HD are focused on managing these symptoms14. As such, a great need exists to further study the cognitive and psychiatric components of HD, not only because of the great distress they cause to HD families, but also because understanding the progression of these symptoms will provide researchers with the opportunity to assess the efficacy of potential disease modifying therapies at the earliest time point possible. Unfortunately, given the importance of these phenotypes in HD patients’ quality of life, HD preclinical mouse studies do not typically include analysis of impaired cognition and altered affect15. One potential reason for this is the expense and complexity of the equipment required for traditional cognitive and affective assays, which limits the feasible number of animals in preclinical HD studies. This practical limitation of study size limits the power of HD preclinical studies which, as is common in neuroscience research16, is generally poor.

Amongst HD’s psychiatric manifestations, apathy has an extremely high point prevalence, coupled with a uniquely consistent relationship between severity and HD progression7,17. A recent clinical evaluation study of presymptomatic HD mutation carriers found striking increases in the incidence of apathy, even in subjects predicted to be more than 10 years from clinical onset18. Apathy, as a psychiatric symptom distinct from depression, has been operationalized to contain aspects of diminished motivation, reduced goal-directed behavior, lack of interest in new experiences, and diminished emotional responsivity19. In HD patients, apathy is correlated with functional capacity and cognition, but not depression, suggesting apathy is a distinctive component of the affective landscape of HD20.

Motivated by the importance of apathy to the lived experience of HD mutation carriers, we are interested in bringing analysis of apathy into HD preclinical studies. Traditional rodent experiments to test motivated behavior include the progressive-ratio (PR) operant task21, in which subjects are required to perform increasingly large numbers of nose-pokes or lever-presses to receive a reward. Commercial operant chambers used to assay PR responses in rodents can cost thousands of dollars each, limiting the number of animals (and thereby statistical power) of preclinical studies of apathy. To address this problem, we have modified a recently described open-source operant chamber - the “ROBucket”22 - based on the Arduino computing platform, with a total built cost of approximately $150/chamber. To validate the modified apparatus, we studied motivation in 9–11 month old B6.HttQ111/+ mice, a knock-in mouse model of the HD mutation23. Using commercially available tools we, and others, have previously observed motivational phenotypes in this model that precede motor or cognitive changes24,25,26,27. Using the open-source ROBucket, we confirm specific deficits in progressive, but not fixed, ratio tasks in B6.HttQ111/+ mice at this time point, consistent with relatively intact learning but impaired motivation.


Modification of existing operant chamber design

We first precisely recreated a recently reported open source operant chamber (“ROBucket”)22. On conducting pilot experiments, we found mice tended to interact with the housing and reward tubing fed through the bucket in the original design. To overcome this distraction, we redesigned the 3D-housing apparatus to be outside the mouse chamber (Fig. 1) - updated plans are available online at Studies performed with these modifications revealed that mice quickly learned to interact with the active well to receive 10 µl of 20% sucrose, and that the modified ROBuckets (mROBucket) accurately counted nose pokes, compared to direct observations.

Figure 1
figure 1

Modified ROBucket design and housing. (a,b) mROBucket with nose-poke housing mounted on the exterior of the bucket reduced time spent exploring the housing and sucrose tubing. (c,d) Construction of a 4 × 3 grid of isolation housing was used to keep each mROBucket independent and facilitate concurrent testing of adequately sized cohorts. Each housing chamber included a small fan for white noise and a viewport to monitor the mROBucket display throughout the experiment.

Normal fixed ratio performance in Htt Q111/+ mice

9-month-old male B6.Htt+/+ (n = 8) and B6.HttQ111/+ (n = 12) mice (hereafter Htt+/+ and HttQ111/+) were single-housed and food restricted over 2 weeks (target weight loss of 2%/day, final body weight ~85% free-feeding weight) before operant testing. We observed no impact of genotype on baseline body weight or the rate at which Htt+/+ and HttQ111/+ mice lost weight during food restriction (Supplemental Fig. 1). After body weight stabilized, mice were placed into mROBucket chambers for 1 hour sessions each day on a fixed ratio 1 (FR1) reinforcement schedule - i.e. one nose poke in the active well resulted in sucrose delivery in the reward well. There was a 1 second timeout after each active well response. Two criteria were required for progression to the next phase: a 3:1 preference for the active well versus the inactive well and 20 or more reinforcements for 3 consecutive days. All mice, with the exception of a single Htt+/+ mouse (Fig. 2, grey panel), quickly learned the task (Fig. 3, average time to FR1 criteria 7.5 ± 2.4 days).

Figure 2
figure 2

Normal per-mouse acquisition of FR1 task in 10-month-old HttQ111/+ mice. Shown for each mouse is the number of nose pokes per 1 hour session in the active (blue) and inactive (green) wells. Htt+/+ mice (n = 8) are graphed with solid lines, HttQ111/+ with dashed lines (n = 12). One mouse (grey highlight) was excluded for failing to pass the pre-defined FR1 acquisition criteria.

Figure 3
figure 3

Normal performance of FR1 task in 10-month-old male HttQ111/+ mice. No genotype effect is observed in the days to criteria (a). t(7.1) = 0.2, p = 0.8), active/inactive well ratio (b). t(16.9) = −0.9, p = 0.4) or average total nose pokes per session (c). t(16.5) = −0.1, p = 0.9). Horizontal lines in the boxes indicate 25th, 50th and 75th percentiles, while vertical lines indicate 1.5 times the interquartile range; outliers beyond these values are graphed as points.

We observed no effect of genotype on days to meet criteria, the active/inactive nose poke ratio, or total nose pokes per session on the FR1 task (Fig. 3). This suggests 9-month-old male HttQ111/+ mice are able to normally acquire this simple discrimination task.

After meeting both criteria for FR1, we next trained the mice on a fixed ratio 5 (FR5) task for 3 days to familiarize them with tasks requiring multiple nose pokes to achieve reward. There was a 1 second timeout after each active well response. Consistent with the FR1 task, 9-month-old male HttQ111/+ mice are able to normally acquire this simple discrimination task, with no differences observed between the ratio of active/inactive well responses or the total number of nose pokes per session (Fig. 4). This suggests fatigue and motor dysfunction do not prevent HttQ111/+ mice from making large numbers of accurate nose pokes in an hour-long session (average of 273.1 ± 11.7 nose pokes/session).

Figure 4
figure 4

Normal performance of FR5 task in 10-month-old HttQ111/+ mice. No genotype effect is observed in the active/inactive well ratio (a). t(16.6) = −1.3, p = 0.2) or average total nose pokes per session (b). t(6.3) = −1.4, p = 0.2); n = 7 Htt+/+, 12 HttQ111/+. Horizontal lines in the boxes indicate 25th, 50th and 75th percentiles, while vertical lines indicate 1.5 times the interquartile range; outliers beyond these values are graphed as points.

Reduced progressive ratio performance in Htt Q111/+ mice

Regardless of performance, mice were advanced to a progressive ratio (PR) task after 3 days of FR5 training. The PR task requires the animals to respond with an exponentially increasing number of sequential nose pokes in the active well to receive the reward (1, 2, 4, 6, 9, 12, etc., according to the formula R = ||5e(N*0.2)||−5)29. The final number of reinforcements achieved is referred to as the “breakpoint”21. We established the criterion for the PR task based on breakpoint stabilization, or less than 10% variation in the breakpoint in trials over 3 consecutive days. Both Htt+/+ and HttQ111/+ mice learned the PR task, with breakpoints stabilizing between 4–17 days.

We observed no genotype effect on the days to criterion (Fig. 5a) or average active/inactive ratio (Fig. 5b) during the PR task, however HttQ111/+ mice perform significantly fewer total nose pokes per session (44% reduction, Fig. 5c) and consequently receive fewer rewards per session (17% reduction). This results in a 37% reduction in the final stabilized breakpoint of HttQ111/+ mice compared to Htt+/+ mice (Fig. 5d). Post-hoc analysis suggests our experiment had 94.6% power to detect a breakpoint difference between genotypes (n = 7 Htt+/+ and 12 HttQ111/+; effect size, d, = 1.8, type 1 error probability = 0.05).

Figure 5
figure 5

Progressive ratio deficits in 10-month-old Htt+/+ and HttQ111/+ mice. No genotype effect is observed in (a) the final days to criterion (t(14.3) = 0.5, p = 0.6), or (b) active/inactive well ratio (t(15.8) = 0.3, p = 0.8) during the PR task. However, HttQ111/+ mice do have (c) reduced total nose pokes (t(7.8) = 3.0, p = 0.02), reduced rewards/session (t(12.1) = 3.0, p = 0.01) and a consequently (d) reduced final breakpoint (t(13.9) = 4.0, p = 0.002) during the PR task. *Indicates p < 0.05. Horizontal lines in the boxes indicate 25th, 50th and 75th percentiles, while vertical lines indicate 1.5 times the interquartile range; outliers beyond these values are graphed as points.

Replication in an independent cohort

We are interested in the improvement of preclinical trials in HD, particularly their reproducibility27,30,31,32. To determine whether the mROBucket apparatus and progressive ratio task are reproducible assays of motivated behavior, we repeated the experiment in a new cohort of similarly aged (10–11 months) male Htt+/+ (n = 12) and HttQ111/+ mice (n = 12). Consistent with the findings of our first cohort, we observe no effect of genotype on the total number of nose pokes in the FR1 task (Fig. 6a). Similarly, in the FR5 task, HttQ111/+ mice perform equivalently to Htt+/+ mice for total nose pokes (Fig. 6b). In the PR task, HttQ111/+ displayed reduced total pokes (t(17.1) = 3.9, < 0.001), earned fewer rewards (t(21.5) = 3.9, p < 0.001) and displayed a 46% reduction in final breakpoint (Fig. 6c). Post-hoc analysis of the replication cohort (n = 12 Htt+/+ and 12 HttQ111/+, effect size, d, = 1.6, type 1 error probability = 0.05) indicates we achieved 96.3% power.

Figure 6
figure 6

Independent replication of progressive ratio deficits in 10-month-old Htt+/+ and HttQ111/+ mice. (a) We observed no difference in the total nose pokes during the FR1 task (t(19.8) = 0.12, p = 0.9). (b) Similarly, in the FR5 task, HttQ111/+ mice perform equivalently to Htt+/+ mice (t(16.3) = 0.9, p = 0.4). (c) The replication cohort had a significantly reduced final breakpoint (t(16.2) = 3.9, p = 0.001) in the PR task. *indicates p < 0.05. Horizontal lines in the boxes indicate 25th, 50th and 75th percentiles, while vertical lines indicate 1.5 times the interquartile range; outliers beyond these values are graphed as points.


Apathy is a core feature of affective dysfunction in HD17, but analysis of apathy and other affective disturbances is rarely included in preclinical studies of HD33. Here, we demonstrate that an inexpensive open source apparatus can robustly and reproducibly detect motivational deficits in 9–11 month old HttQ111/+ mice. These motivational deficits precede overt motor, cognitive, or neurodegenerative changes in this model27, suggesting they may occur amongst the earliest changes associated with mutant huntingtin expression in vivo. A recent study from the Yhnell et al. tested HttQ111/+ mice in commercial operant boxes at 6-, 12-, and 18-months of age24, finding that HttQ111/+ mice showed no deficits in fixed ratio testing at 6- or 12-months, but had reduced breakpoint in progressive ratio at 12 months when delivering similar volumes of liquid reward to those used in our study (7.5 μL vs. 10 μL used in our study). While differences in the methodology for the training, progressive ratio task, and breakpoint determination between our study and the Yhnell et al. study limit direct comparison, our results confirm those found in the commercial apparatus: that there are no deficits in fixed ratio testing at a time point between the 6- and 12-month timepoints (9- and 10-mo for our independent cohorts, respectively), while we also confirm the significant progressive ratio deficits found in the commercial apparatus at 12-months in our slightly younger cohorts tested using open-source equipment.

The basal ganglia, and particularly the striatum, are the most strikingly impacted brain regions in Huntington’s disease, showing robust volume declines many years before clinical disease onset7,34. While no full-length HD mouse models experience similar pronounced neurodegeneration, a number of imaging30,35,36 and molecular analyses37 confirm that the caudoputamen is the most strikingly impacted brain region in mice expressing mutant Htt. In humans, focal ischemic basal ganglia damage is associated with a range of motivational deficits38, including notable deficits in incentive motivation - the process of activating specific behavioral responses based on predicted reward39. In stroke patients with basal ganglia lesions, deficits in incentive motivation occur in the absence of deficits in hedonic responses, consistent with reduced progressive ratio performance in HttQ111/+ mice (here, and24,27) at a time when sucrose preference tasks suggest no alterations in hedonic drive in these mice27. Changes in instrumental motivation may therefore serve as a translatable readout of basal ganglia dysfunction in mice. The deficits in instrumental motivation seen here may provide a behavioral tool for quantifying this dysfunction, however we caution that mouse behavioral phenotypes likely map imperfectly onto complex symptoms - such as apathy - in human HD patients.

Statistical power in neuroscience studies, and especially behavioral studies, is generally low16, leading to widespread calls for improvements in conducting and reporting of preclinical work40. While the use of operant chambers can improve preclinical studies by automating data collection of complex behavioral tasks, these benefits can be negated by the cost of the apparatus. Traditional operant chambers from commercial suppliers cost thousands of dollars, placing a practical limit on the number of mice that can be screened in parallel in a single lab and thereby limiting the power of these studies. Recent technological developments - including the rapid development of inexpensive additive manufacturing (i.e. “3D printing”) and low-cost open source computing platforms (e.g. Raspberry Pi and Arduino) - enable open source alternatives to commercial products. Further, they allow for rapid design and software modifications to assay behavior through multiple paradigms. These tools have been applied to video tracking of behavior41, integrated microscopy, temperature control, and optogenetics in small animal experiments42, as well as the rodent operant chambers used here22. These technologies render practical the fabrication of a large number of chambers (12 were used in the current study), so dozens of mice can be tested in parallel in a single lab, with a limited number of handlers.

These experiments confirm previous findings that motivational deficits occur before pronounced neurodegeneration in the HttQ111/+ model of HD. They also show that these deficits are assayable using inexpensive hardware and open source software tools, which should enable their more widespread utilization in preclinical studies in HD.



B6.HttQ111 mice, which have been previously described43, were originally obtained from JAX (Research Resource Identifier: IMSR\_JAX:003456) and bred and maintained at the Western Washington University vivarium. Mice were group housed until 9–10 months of age and given access to food and water ad libitum, until two weeks before testing was to begin. For genotyping, presence or absence of the mutant allele was determined by polymerase chain reaction of DNA using the primers CAG1 (5′-ATGAAGGCCTTCGAGTCCCTCAAGTCCTTC-3′)44 and HU3 (5′-GGCGGCTGAGGAAGCTGAGGA-3′)45. All experiments were conducted in accordance to the NIH Guide for the Care and Use of Laboratory Animals and approved by the Western Washington University animal care and use committee (protocol 16–007).


2-choice operant chambers were constructed according to the design from Devarakonda22 with modifications. Briefly, Arduino-controlled operant boxes deliver a liquid sucrose reward in response to nose-pokes with various reinforcement schedules. The apparatus here was modified to place the nose-poke photo-beam housing on the outer, rather than inner, wall of the operant chamber (see Fig. 1b). This eliminated time spent exploring the nose-poke housing and associated tubing, and resulted in cleaner acquisition of the FR/PR tasks. Isolation housing was designed and constructed as a grid of 3 × 4 chambers (35 × 42 × 42 cm each) with view ports in the front to visualize the response readings and vent fans in the back to produce white noise (Fig. 1d).

Behavioral Testing

Mice were single-housed and fasted over two weeks to reduce body weight to 85% of free feeding weight prior to testing and maintained at this weight for the duration of testing. For FR1 testing, mice learned to nose-poke on a fixed-ratio reinforcement schedule where a single nose-poke in the active well elicits delivery of a sucrose reward (10 μL, 20% sucrose), with a 1-second timeout after each active well press. Trial duration was 60 minutes or until the subject received 50 reinforcements, at which point mice were promptly removed. Acquisition criteria for the FR1 schedule were met when mice exhibited discrimination criteria of ≥3:1 for the active:inactive well and received ≥20 reinforcements for 3 consecutive days. Mice not meeting the FR1 acquisition criteria by 17 days were excluded from further testing (this occurred in 1 mouse from each cohort, or approximately 5% of mice). After meeting FR1 criteria, mice were moved to FR5 testing, where the reinforcement schedule requires five pokes to earn 1 reward, for three consecutive days.

For PR testing, mice were tested on a progressive ratio schedule of reinforcement where the number of nose pokes required to elicit reinforcement is calculated using the equation Reinforcements = ||5e(N*0.2)||−5, where N is equal to the number of sucrose solution reinforcements already earned plus 1. When mice earned the same number of rewards (±10%, or within 1 if less than 10 rewards are earned) for 3 consecutive days, they were considered stabilized at their “breakpoint.” For full methods and instructions, see published online methods at

Statistical analysis

All data were processed using R statistical software46. Welch’s t-tests were used to correct for unequal variances, linear mixed effects ANOVAs were run using the ‘nlme’ package47, effect size was calculated with the ‘’ package48, and power was calculated using the ‘pwr’ package49. Graphics were produced using ‘ggplot2’50 and Illustrator (Adobe).