## Introduction

Understanding the complexity of biological systems is a profound challenge for science and technology. In neuroscience, elucidating the principles governing the production of complex behavior in mammals from the vast tangle of connected neural circuits is one of the most difficult problems. Conventional methods such as those using head-fixed assays require tedious and labor-intensive brain recordings from single animals engaged in behavioral tasks over weeks to months. However, the specificities of these paradigms and their integration with the growing array of state-of-the-art brain physiological recording systems differ greatly among and within laboratories due to the variability introduced by the experimenter’s intervention. This lack of standardization generates inherent reproducibility issues and eliminates the possibility of large, sharable data sets that could significantly accelerate the pace of scientific discovery and validation. These problems have recently become apparent in mouse studies. Among mammals, the mouse offers the largest methodological toolbox for neural circuit research on behavior. Accordingly, researchers are training mice in complex behavioral assays with concurrent physiological recording and manipulation (e.g., multi-channel electrophysiology, imaging, optogenetics1, 2). However, training mice to learn complex behavioral tasks requires time-consuming species-specific training methods that stem from innate phenotypic and behavioral characteristics. Indeed, even within rodents, mice have unique characteristics including high sensitivity to experimenter biases3 and physiological stress from handling4. Several mouse behavioral systems have been reported that attempt to address experimental limitations associated with the short mouse life cycle and the decreasing yield of trained animals with the increasing complexity of the behavioral tasks5,6,7,8,9,10,11.

Among the proposed solutions, some rely on the experimenter’s intervention, e.g., for head fixation6, 7, 9, 11, while others use freely moving animals5, 10, 12, with a noteworthy recent report of a setup featuring self-head fixation for automated wide-field optical imaging8. From a research economics perspective, an ideal mouse system would feature self-head fixation for behavioral training and rapid exploration of a large space of complex behavioral parameters with minimal experimenter intervention, allow high-throughput automated training, have the capability to explore various sources of psychometric data, flexibly integrate multiple physiology recording/stimulation systems, and enable the efficient generation of large, sharable, and reproducible data sets to standardize procedures within and across laboratories.

We developed an experimental platform for mouse behavioral training, with full automation, voluntary head fixation, and high-throughput capacity. The platform is scalable and modular, allowing behavioral training based on diverse sensory modalities, and it readily integrates with virtually any physiology setup for neural circuit- and cellular-level analysis. Moreover, its remote accessibility and web-based design make it ideal for large-scale implementation. To demonstrate the suitability of the system for the integration of complex behavioral assays with physiology setups, we used the platform to train mice in two behavioral tasks, one visual and one auditory. The training was compatible with stable cellular-level imaging, as we demonstrated with two-photon GCaMP recordings in trained animals.

## Results

### Habituation system for head restraining

In both behavioral tasks, newborn mice were initially housed in an enriched environment. At the age of ~P45 they were implanted with a head-post and a round chamber, and individually housed in standard cages (Supplementary Table 1). After recovery, mice were put on a water-controlled regime (~1 mL/day) for about 1 week. Then a self-head-restraining device was introduced to the home cage (Fig. 1a; Supplementary Fig. 1) that is fully compatible with standard mouse cages and IVC racks (Fig. 1d). Mice learned that to obtain water they had to restrain their head-post, but without latching it (Supplementary Fig. 1). This step is key for future self-head fixation; if head-fixed without habituation, mice form a strong initial aversive association and then avoid the setup in following sessions, as typically observed in fear conditioning paradigms. During this habituation phase, the body weight of animals was monitored twice/day and mice were removed from training if their weight dropped below 70–75% of their original body weight (corrected by gender and age).

### Behavioral training 2: tone detection

To illustrate the use of our setup in tasks relying on different sensory modalities, we trained a group of mice in an auditory go/no-go task. We placed a speaker for auditory stimulation in front of the animal, ~10 cm from the mouse ears, and enclosed the setup in a sound isolation box to reduce ambient noise (Fig. 4; Methods). The transparent PVC tube connecting the cage to the main setup passed through an aperture on the side of the isolation box. Mice had to detect an 80 dB, 10 kHz pure tone presented as a 500 ms train of five pulses (5 ms duration each, at 10 Hz)15. This go stimulus was presented in 70% of the trials. In the remaining 30% of the trials, the mouse was exposed to an unmodulated ~50 dB background noise (no-go stimulus)16. Mice had a 2 s window from the end of the go stimulus to report the tone detection by rotating a small wheel at least 70° in either direction (Fig. 4a). Typically, each mouse had a preference for clockwise or counterclockwise rotations. In hit trials (Hit, go responses to go stimuli), mice were rewarded with water, and the following trial started after an ITI of randomized duration (5–10 s). In miss trials (Miss, no-go responses to go stimuli), mice received neither reward nor punishment, and a new trial started after a randomized ITI. Similarly, in correct rejection trials (CR, no-go responses to no-go stimuli), mice were neither rewarded nor punished, and a new trial started after an ITI. In false alarm trials (FA, go responses to no-go stimuli), mice were punished with an additional 10 s of ITI and shown a full-screen, square-wave checkerboard with 100% contrast (Fig. 4b). Mice performed two sessions per day and learned the task (d′ = 1.5; Fig. 4c) over 12.5 ± 3.5 days (s.d., n = 2). FA rates remained constant throughout training, with FA rates in naive mice higher than hit rates (P < 0.005, Wilcoxon signed rank test), possibly due to a startle reflex following the target sound presentation17,18,19,20.
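The trial logic described above can be summarized in a short sketch. This is only an illustration of the go/no-go contingencies as stated in the text, not the platform's control software; the function names (`classify_trial`, `trial_consequences`) and the returned dictionary keys are hypothetical.

```python
import random

# Values taken from the task description; names are illustrative only.
ITI_RANGE_S = (5.0, 10.0)    # randomized inter-trial interval
FA_TIMEOUT_S = 10.0          # additional ITI after a false alarm
WHEEL_THRESHOLD_DEG = 70.0   # minimum wheel rotation, in either direction


def classify_trial(is_go: bool, wheel_rotation_deg: float) -> str:
    """Label a trial as Hit, Miss, CR, or FA.

    A 'go' response is any rotation of at least 70 degrees, regardless of sign
    (clockwise or counterclockwise), within the response window.
    """
    responded = abs(wheel_rotation_deg) >= WHEEL_THRESHOLD_DEG
    if is_go:
        return "Hit" if responded else "Miss"
    return "FA" if responded else "CR"


def trial_consequences(outcome: str, rng: random.Random) -> dict:
    """Return reward, punishment, and ITI for a classified trial."""
    iti_s = rng.uniform(*ITI_RANGE_S)
    if outcome == "FA":
        iti_s += FA_TIMEOUT_S  # time-out punishment for false alarms
    return {
        "reward": outcome == "Hit",        # water only on hits
        "checkerboard": outcome == "FA",   # full-screen checkerboard on FA
        "iti_s": iti_s,
    }
```

For example, `classify_trial(False, 85.0)` yields `"FA"`, and the corresponding ITI returned by `trial_consequences` falls between 15 and 20 s.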

In summary, these two tasks show that our setup can train mice in sensory-based decision making tasks, across sensory modalities, for accurate psychophysical measurements and with full automation.

### Latching unit for physiology

To serve as a powerful tool for studying the neural basis of cognitive functions, a setup must not only enable behavioral training but also (1) be easily integrated with diverse customized physiology setups, and (2) preserve the animals’ behavioral performance in spite of integration specificities. To address the first point, we opted for a semi-automated solution, reasoning that full automation of physiology experiments is often unnecessary and sometimes even undesirable or unfeasible (e.g., when patching small cellular processes, repeatedly inserting fragile/bendable optic fibers, electrodes, silicon probes, etc.). Moreover, the bottleneck in a typical behavioral study is often the time required to train mice, while statistical power for the physiological experiments can typically be achieved with a handful of trained animals. Hence, we developed a semi-automated method relying on a unit that can be moved by the experimenter, which still had an automated self-latching mechanism and preserved the separation between the animal and the experimenter (Fig. 5a; Supplementary Fig. 6). Regarding the second point, we showed that by using such a unit, and by simply having input and output devices matching those used in the main training setup, mice readily adapted to the new environment (novel sounds, lights, and smells). As proof-of-principle, we demonstrated the integration of our setup with a two-photon microscope requiring cellular-level stability as the animal performed in the behavioral task.

### Two-photon imaging

We trained mice expressing GCaMP821 (Methods) in the orientation discrimination task described in Fig. 3. When mice reached a threshold performance level (75% correct discrimination), we connected the latching unit for physiology to the home cage. Except for the reduced length of the main corridor, the unit was almost identical to the one used during training (Supplementary Fig. 6), and mice self-latched as before. At the end of the latching stage, a set of four screws, tightened by the experimenter, allowed for stable locking of the head-plate (Supplementary Figs. 6, 7). The platform was then placed under the two-photon microscope, still preserving the separation between the animal and the experimenter (Fig. 5a). Input and output devices in the two-photon setup matched those used in the training setup. However, the loud sounds generated by the galvo and resonant scanners, together with the bulky equipment on top of the mouse head, were all novel elements. It took mice an average of 2.5 ± 1.5 sessions to return to the same performance level learned in the training setup (n = 2 mice, four sessions, and one session, respectively). Before commencing training, mice were imaged using standard methods for retinotopic mapping to identify V1 and higher visual areas22 (Fig. 5b; Methods). In typical two-photon imaging experiments, we recorded from a volume 850 × 850 × 3 μm³ of L2/3 neurons in the primary visual cortex (Fig. 5c). Using a common analysis for cell segmentation (Methods), we could identify ~200 neurons per volume. Using vascularization landmarks, we could image the same cells over days or weeks (Supplementary Fig. 7), and segregated their responses as a function of the animal’s choices (Fig. 5d) or stimulus orientations (Fig. 5e).
As a corollary of this cellular-level resolution, our semi-automated procedure can then be easily combined with a large variety of other imaging, optogenetic, and electrophysiology systems requiring a similar degree of stability of the neural target of interest. In summary, the training setup combined with the latching unit for physiology is a convenient compromise for the relatively effortless integration of automated behavioral training with a large diversity of physiology systems.

## Discussion

The traditional experimental approach to integrating an animal’s behavioral training with physiological recording has often resorted to lab-specific experimental configurations relying on the experimenter’s intervention, with the drawback of hindering within- and across-lab reproducibility. Here, in an effort to overcome these limitations, we describe an experimental platform for mouse behavioral training with high-throughput automation and voluntary head fixation that can integrate with diverse physiology systems, as we show with a two-photon microscope that requires cellular-level stability.

Training mice in these tasks provides important validation benchmarks for the platform’s performance. Considering the visual task, the learning yield was 67% (8 out of 12 mice) and the average training duration for peak discrimination performance was 59 ± 27 days (s.d., n = 8 mice). The task requires latching of the head-plate, daily monitoring of body weight, and incremental training procedures. If these steps were performed by a researcher training four mice/day in parallel (a feasible commitment for a single person), the “human cost” for one trained mouse would be ~15 h (i.e., 22 days, with two 20 min sessions/day). In contrast, assuming all our setups were used for this task at our current capacity of 48 animals/day, we could produce an average of one trained mouse every 2 days (i.e., 1.3 h of rig time). Even assuming that the presence of an experimenter could potentially accelerate the training thanks to a more rapid optimization of training parameters, it would still be difficult to achieve a 10-fold reduction in training duration as obtained with our setups. Furthermore, this reduction was achieved while satisfying two key requirements for this task: (1) head fixation for view angle stability across trials and eye tracking, and (2) control of the frequency and duration of the trials in those behavioral and physiology paradigms demanding session durations set by the experimenter rather than by the animal. In conclusion, we believe that our platform has many advantages to serve as an ideal system for the large-scale standardization of behavioral assays for facile integration with physiology systems.
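The cost comparison above can be reproduced with a few lines of arithmetic. The sketch below is one reading of the calculation that recovers the quoted figures; it assumes that the 40 min/day of session time is shared by the four mice trained in parallel and that the learning yield applies equally to manual and automated training. The function names are hypothetical.

```python
# Figures quoted in the text.
TRAINING_DAYS = 59          # average days to peak performance
LEARNING_YIELD = 8 / 12     # fraction of mice that learn the task
MINUTES_PER_DAY = 2 * 20    # two 20 min sessions per day
MICE_IN_PARALLEL = 4        # mice handled by one researcher
AUTOMATED_CAPACITY = 48     # animals/day across all setups


def manual_hours_per_trained_mouse() -> float:
    """Researcher hours invested per successfully trained mouse."""
    trained_mice = MICE_IN_PARALLEL * LEARNING_YIELD
    total_minutes = TRAINING_DAYS * MINUTES_PER_DAY
    return total_minutes / trained_mice / 60.0


def automated_days_per_trained_mouse() -> float:
    """Average interval between trained mice at full platform capacity."""
    trained_mice = AUTOMATED_CAPACITY * LEARNING_YIELD
    return TRAINING_DAYS / trained_mice


manual_h = manual_hours_per_trained_mouse()        # 14.75 h, i.e. ~15 h
auto_days = automated_days_per_trained_mouse()     # ~1.84 days, i.e. ~2 days
auto_h = auto_days * MINUTES_PER_DAY / 60.0        # ~1.2 h of session time
fold_reduction = manual_h / auto_h                 # 12, the ratio 48 / 4
```

Under these assumptions the fold reduction is simply the ratio of parallel capacities (48 versus 4 animals), which is consistent with the roughly 10-fold figure given in the text.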

## Methods

### Subjects

All procedures were reviewed and approved by the Animal Care and Use Committees of the RIKEN Brain Science Institute. Behavioral data for the visual task were collected from eight C57BL/6J male mice, and from two Tg mice (Thy1-GCaMP6f (GP5.5)) for the auditory task. The age of the animals typically ranged from 8 to 28 weeks from the beginning to the end of the experiments. Mice were housed under a 12–12 h light–dark cycle. No statistical methods were used to predetermine the total number of animals needed for this study. The experiments were not randomized. The investigators were not blinded to the animals’ allocation during the experiments and assessment of the outcome.

### Animal preparation for two-photon imaging

Implantation of a head-post and optical chamber. Animals were anesthetized with gas anesthesia (Isoflurane 1.5–2.5%; Pfizer) and injected with an antibiotic (Baytril, 0.5 ml, 2%; Bayer Yakuhin), a steroidal anti-inflammatory drug (Dexamethasone; Kyoritsu Seiyaku), an anti-edema agent (Glyceol, 100 µl, Chugai Pharmaceutical) to reduce swelling of the brain, and a painkiller (Lepetan, Otsuka Pharmaceutical). The scalp and periosteum were retracted, exposing the skull, then a 4 mm diameter trephination was made with a micro drill (Meisinger LLC). A 4 mm coverslip (120–170 µm thickness) was positioned in the center of the craniotomy in direct contact with the brain, topped by a 6 mm diameter coverslip with the same thickness. When needed, Gelfoam (Pfizer) was applied around the 4 mm coverslip to stop any bleeding. The 6 mm coverslip was fixed to the bone with cyanoacrylate glue (Aron Alpha, Toagosei). A round metal chamber (6.1 mm diameter) combined with a head-post was centered on the craniotomy and cemented to the bone with dental adhesive (Super-Bond C&B, Sun Medical), mixed with a black dye for improved light absorbance during imaging.

### Viral injections

A construct used to produce AAV expressing GCaMP8 (pAAV-CAG-GCaMP8) was made based on two plasmids, pAAV-CAG-GFP (#37825, Addgene, Cambridge, MA, USA, a gift from Edward Boyden, MIT, MA, USA) and pN1-GCaMP8 (a kind gift from Junichi Nakai and Masamichi Ohkura, University of Saitama, Saitama, Japan). Solutions including infectious AAV particles were made and purified using a standard method (Tsuneoka et al., 2015)23. For imaging experiments, we injected rAAV2/1-CAG-GCaMP8 solution (2 × 10¹² gc/ml, 500 nl) into the right visual cortex (AP, −3.3 mm; LM, 2.4 mm from the bregma) at a flow rate of ~50 nl/min using a Nanoject II (Drummond Scientific, Broomall, Pennsylvania, USA). Injection depth was 300–350 μm. After confirmation of fluorescent protein expression, we made a craniotomy (~4 mm diameter) centered on the injection point while keeping the dura intact and implanted a cover-glass window, as described above.

### Automated behavioral setup for voluntary head fixation

The setup has been developed in collaboration with O’ Hara & Co., Ltd. (Tokyo) and it is now commercially available via O’ Hara & Co., Ltd. (http://ohara-time.co.jp/). The latching unit for physiology has been produced by Micro Industries Co., Ltd. (Tokyo).

Implanted mice were housed individually in standard cages connected to the setups. Visual stimuli were presented at the center of an LCD monitor (33.6 cm × 59.8 cm, 1920 × 1080 pixels, PROLITE B2776HDS-B1, IIYAMA), placed 25 cm in front of the mice. The monitor covered ~40° × 100° of visual space including the whole binocular field. Stimuli were Gabor patches: static sine gratings (~30° in diameter, 0.08 cpd, with randomized spatial phase) windowed by a stationary two-dimensional Gaussian envelope. They were generated with custom code using the Psychtoolbox extension for Matlab. The stimulus diameter was defined as the 2σ of the Gaussian envelope. The initial version of the software was based on code from Burgess et al.6, and subsequently customized for this platform.
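As an illustration of the stimulus definition, the following sketch computes a Gabor patch from the parameters stated above (0.08 cpd, 2σ = 30°, random phase). It is not the Psychtoolbox code used in the study. The orientation convention, the function names, and the pixel density of 14 pixels/degree (a small-angle estimate at screen center from the stated monitor width, resolution, and viewing distance) are assumptions made for the example.

```python
import math
import random

SPATIAL_FREQ_CPD = 0.08   # cycles per degree, from the text
SIGMA_DEG = 15.0          # stimulus diameter (~30 deg) defined as 2 * sigma


def gabor(x_deg, y_deg, orientation_deg, phase_rad,
          cpd=SPATIAL_FREQ_CPD, sigma_deg=SIGMA_DEG):
    """Contrast in [-1, 1] of a Gabor at visual-field position (x, y) in degrees."""
    theta = math.radians(orientation_deg)
    # Position along the axis of luminance modulation (assumed convention).
    u = x_deg * math.cos(theta) + y_deg * math.sin(theta)
    carrier = math.sin(2.0 * math.pi * cpd * u + phase_rad)
    envelope = math.exp(-(x_deg ** 2 + y_deg ** 2) / (2.0 * sigma_deg ** 2))
    return carrier * envelope


def gabor_patch(size_px, px_per_deg, orientation_deg, phase_rad):
    """Square image (list of rows) of Gabor contrast values centered on the patch."""
    half = (size_px - 1) / 2.0
    return [
        [gabor((col - half) / px_per_deg, (half - row) / px_per_deg,
               orientation_deg, phase_rad)
         for col in range(size_px)]
        for row in range(size_px)
    ]


# Example: a small patch with a randomized spatial phase.
rng = random.Random(0)
patch = gabor_patch(size_px=64, px_per_deg=14.0, orientation_deg=45.0,
                    phase_rad=rng.uniform(0.0, 2.0 * math.pi))
```

The contrast values would still need to be mapped to monitor luminance around a mean gray level, a step the text does not specify.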

Most animals exhibited a bias at the beginning of the training. Hence, we used an adaptive corrective procedure in which the probability of the target stimulus being presented clockwise, P(C), or counterclockwise, P(CC) = 1 − P(C), was calculated to match the probability of previous CC–C choices: $$P(\mathrm{C}) = \frac{\sum_{i=1}^{N} \mathrm{cc}_i}{\sum_{i=1}^{N} \mathrm{cc}_i + \sum_{i=1}^{N} \mathrm{c}_i}$$ with $$\mathrm{c}_i$$ and $$\mathrm{cc}_i$$ taking values in {1, 0} for choosing or not choosing the corresponding rotation, and N the number of all previous trials. P(C) was updated every 10 trials.
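A minimal sketch of this corrective rule follows. It implements the stated formula, in which the probability of a clockwise target equals the fraction of previous counterclockwise choices, and refreshes it every 10 trials. The class and function names, the string labels `"C"`/`"CC"`, and the initial value of 0.5 before any choice is recorded are assumptions made for the example.

```python
import random


def probability_clockwise(choices):
    """P(C) = sum(cc_i) / (sum(cc_i) + sum(c_i)) over all previous trials."""
    n_c = sum(1 for choice in choices if choice == "C")
    n_cc = sum(1 for choice in choices if choice == "CC")
    if n_c + n_cc == 0:
        return 0.5  # assumption: start unbiased when there is no history
    return n_cc / (n_cc + n_c)


class BiasCorrector:
    """Tracks choices and updates the target probability every `update_every` trials."""

    def __init__(self, update_every=10):
        self.update_every = update_every
        self.choices = []
        self.p_c = 0.5

    def record_choice(self, choice):
        self.choices.append(choice)
        if len(self.choices) % self.update_every == 0:
            self.p_c = probability_clockwise(self.choices)

    def next_target(self, rng):
        """Draw the side of the next target stimulus."""
        return "C" if rng.random() < self.p_c else "CC"
```

With this rule, a mouse that chose clockwise in 8 of its first 10 trials would subsequently be shown clockwise targets with probability 0.2, so that persisting in the biased choice is rewarded less often.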

For imaging of GCaMP8 signals, the movement of the gratings on the screen produces strong visually evoked responses. Hence in training and in imaging experiments, to remove this component we introduced an open-loop (non-interactive) period (1.0 s) after the onset of the stimulus. During this period, a rotation of the wheel did not produce any stimulus movement. With training, animals learned to minimize wheel rotations during this period. To test for habituation under the two-photon microscope, we used two mice in addition to the eight reported in Fig. 3. These mice were trained in an earlier version of the same task, where wheel rotations induced L/R translations of the stimuli. This paradigm required a longer training period (101 ± 15 days, n = 2), than with c/cc rotations (26 ± 7 days, n = 8 mice).

For the auditory detection task, a speaker (DX25TG59-04, Tymphany) was added to the setup used for the visual discrimination task and placed 10 cm in front of the mouse head. Correct detections were rewarded with 4 µl of water. A performance index of stimulus detectability was calculated as $$d' = z(\mathrm{Hits\ fraction}) - z(\mathrm{FA\ fraction})$$, with z the inverse of the cumulative Gaussian function. The threshold of wheel rotation to signal an auditory stimulus detection was 70°. The whole setup was enclosed in a 50 × 50 × 45 cm box with sound isolation panels (Sound Guard W, Yahata-Neji).
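The detectability index can be computed directly from the hit and false alarm fractions. The sketch below uses the inverse cumulative Gaussian from the Python standard library; the function name is hypothetical, and the handling of fractions equal to exactly 0 or 1 (for which z is undefined) is not described in the text and is therefore left to the caller.

```python
from statistics import NormalDist


def d_prime(hit_fraction: float, fa_fraction: float) -> float:
    """d' = z(hit fraction) - z(FA fraction), z being the inverse normal CDF.

    Both fractions must lie strictly between 0 and 1; otherwise
    NormalDist.inv_cdf raises statistics.StatisticsError.
    """
    z = NormalDist().inv_cdf
    return z(hit_fraction) - z(fa_fraction)


# Example: 80% hits and 20% false alarms give d' of about 1.68.
example = d_prime(0.8, 0.2)
```

Equal hit and false alarm fractions give d′ = 0, meaning that the responses carry no information about the stimulus.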

### Behavioral data analysis

We fitted the animal’s probability of making a right side choice as a function of task difficulty using a psychometric function ψ (Wichmann and Hill; psignifit version 3 toolbox for MATLAB http://bootstrap-software.com/):

$$\psi(\epsilon;\, \alpha, \beta, \gamma, \lambda) = \gamma + (1 - \gamma - \lambda)\, F(\epsilon;\, \alpha, \beta),$$

where F is a cumulative Gaussian function, α and β are its mean and s.d., γ and λ are the left and right (L/R) lapsing rates, and ε is the signed trial easiness. Confidence intervals were computed via parametric bootstrapping (999 bootstraps). Time-out trials were excluded from the analysis.

We quantified the animals’ performance during learning (Fig. 3) by analyzing how the slope, lapse-rates, and bias of the psychometric function $$\psi$$ changed with training24.

$$\begin{aligned} \mathrm{Slope} &= \frac{1}{\beta \sqrt{2\pi}} \\ \mathrm{Lapsing\ rate} &= \psi(-45) - \psi(45) + 1 \\ \mathrm{Bias} &= \psi^{-1}(0.5) \end{aligned}$$
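To make these definitions concrete, the following sketch evaluates the psychometric function and the three derived measures for a given parameter set. It does not perform the fit, which in the study was done with the psignifit toolbox; the function names are hypothetical, and the closed-form inverse used for the bias is derived here from the definition of ψ.

```python
import math
from statistics import NormalDist


def psi(eps, alpha, beta, gamma, lam):
    """psi = gamma + (1 - gamma - lambda) * F(eps; alpha, beta)."""
    return gamma + (1.0 - gamma - lam) * NormalDist(alpha, beta).cdf(eps)


def slope(beta):
    """Slope of the cumulative Gaussian at its mean: 1 / (beta * sqrt(2 * pi))."""
    return 1.0 / (beta * math.sqrt(2.0 * math.pi))


def lapsing_rate(alpha, beta, gamma, lam):
    """psi(-45) - psi(45) + 1, which approaches gamma + lambda for a steep curve."""
    return psi(-45.0, alpha, beta, gamma, lam) - psi(45.0, alpha, beta, gamma, lam) + 1.0


def bias(alpha, beta, gamma, lam):
    """Easiness at which psi = 0.5, obtained by inverting psi analytically."""
    p = (0.5 - gamma) / (1.0 - gamma - lam)
    return NormalDist(alpha, beta).inv_cdf(p)


# Example with illustrative (not fitted) parameters.
params = dict(alpha=5.0, beta=10.0, gamma=0.05, lam=0.10)
b = bias(**params)
```

Evaluating `psi(b, **params)` returns 0.5, which confirms that the bias is the point of subjective equality. With asymmetric lapsing rates, as in this example, the bias differs from α.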

Goodness of fit was tested separately for each curve by computing the deviance (D) and correlation (r) within the 95% confidence interval14. With only two orientation conditions, the psychometric model is under-constrained; however, a (constrained) linear model produced similar results for bias and lapsing rate (defined as in the psychometric model), and for the slope parameter derived from the slope of the fitted line.

### Two-photon imaging

Imaging was performed using the two-photon imaging mode of the multiphoton confocal microscope (Model A1RMP, Nikon, Japan). The microscope was controlled by A1 software (Nikon). The objective was a ×10 air immersion lens (NA, 0.45; working distance, 4 mm; Nikon). The field of view (512 × 512 pixels) was 850 μm × 850 μm. GCaMP8 was excited at 920 nm and laser power was 10–25 mW. Images were acquired continuously at ~15 Hz frame rate using a resonant scanner. In every imaging session, a vascular image was captured at the surface of the cortex as a reference for imaging field location.

### Analysis of two-photon data

All the analyses except for neuronal segmentation were done using custom code written in Matlab. Spatial shifts (xy translation) due to movements of the mouse were initially corrected using the xy coordinates of the peak of the spatial cross-correlation between a reference frame (average of the initial 10 frames) and all the other frames (typically 1–2 × 10⁴ frames). A semi-automatic segmentation of regions of interest (ROIs) was then performed on the motion-corrected data using the Suite2P toolbox (https://github.com/cortex-lab/Suite2P). A neuropil region was also determined automatically for each ROI as a surrounding region (2–8 μm from each ROI) that does not include the soma of other ROIs. The averaged fluorescence signal of each ROI was corrected by the averaged signal of the corresponding neuropil region as $$F_{\mathrm{soma}} - 0.7 \times (F_{\mathrm{neuropil}} - \mathrm{median}(F_{\mathrm{neuropil}}))$$25. $$\mathrm{d}F/F_0$$ was calculated for each ROI, where $$F_0$$ was the average of 10 frames of the neuropil-corrected signal immediately before the onset of visual stimulation (baseline fluorescence). For orientation tuning curves (Fig. 5e), $$\mathrm{d}F/F_0$$ from 0 to 1 s after the onset of the visual stimulus was averaged for each stimulus orientation.
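The main steps of this pipeline can be sketched as follows. This is an illustrative Python version, not the Matlab code used in the study. The shift estimate uses a brute-force circular cross-correlation over a small search window (`max_shift`, a hypothetical parameter) rather than whatever implementation the authors used, and dF/F0 is assumed to follow the conventional definition (F − F0)/F0, which the text does not spell out.

```python
from statistics import mean, median

NEUROPIL_WEIGHT = 0.7    # from the text
BASELINE_FRAMES = 10     # frames before stimulus onset, from the text


def estimate_shift(reference, frame, max_shift=3):
    """Return the (dy, dx) at which the cross-correlation with the reference peaks.

    Images are lists of rows; shifts wrap around the edges (circular correlation).
    """
    h, w = len(reference), len(reference[0])
    best_score, best_shift = None, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = sum(reference[y][x] * frame[(y + dy) % h][(x + dx) % w]
                        for y in range(h) for x in range(w))
            if best_score is None or score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift


def neuropil_correct(f_soma, f_neuropil, weight=NEUROPIL_WEIGHT):
    """F_soma - 0.7 * (F_neuropil - median(F_neuropil)), frame by frame."""
    neuropil_median = median(f_neuropil)
    return [s - weight * (n - neuropil_median) for s, n in zip(f_soma, f_neuropil)]


def delta_f_over_f0(trace, stim_onset, n_baseline=BASELINE_FRAMES):
    """(F - F0) / F0, with F0 the mean of the frames just before stimulus onset."""
    if stim_onset < n_baseline:
        raise ValueError("not enough frames before stimulus onset")
    f0 = mean(trace[stim_onset - n_baseline:stim_onset])
    return [(f - f0) / f0 for f in trace]
```

In use, each frame would be translated back by the estimated shift before ROI signals are extracted, and the dF/F0 values in the 0–1 s window after stimulus onset would then be averaged per orientation to build the tuning curves.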

### Data availability

The technical-drawing data are available in DXF format from Dryad (https://datadryad.org/; doi:10.5061/dryad.1qv5t). All relevant data are available from the corresponding author upon request.