# Internal state dynamics shape brainwide activity and foraging behaviour

## Abstract

The brain has persistent internal states that can modulate every aspect of an animal’s mental experience1,2,3,4. In complex tasks such as foraging, the internal state is dynamic5,6,7,8. Caenorhabditis elegans alternate between local search and global dispersal5. Rodents and primates exhibit trade-offs between exploitation and exploration6,7. However, fundamental questions remain about how persistent states are maintained in the brain, which upstream networks drive state transitions and how state-encoding neurons exert neuromodulatory effects on sensory perception and decision-making to govern appropriate behaviour. Here, using tracking microscopy to monitor whole-brain neuronal activity at cellular resolution in freely moving zebrafish larvae9, we show that zebrafish spontaneously alternate between two persistent internal states during foraging for live prey (Paramecia). In the exploitation state, the animal inhibits locomotion and promotes hunting, generating small, localized trajectories. In the exploration state, the animal promotes locomotion and suppresses hunting, generating long-ranging trajectories that enhance spatial dispersion. We uncover a dorsal raphe subpopulation with persistent activity that robustly encodes the exploitation state. The exploitation-state-encoding neurons, together with a multimodal trigger network that is associated with state transitions, form a stochastically activated nonlinear dynamical system. The activity of this oscillatory network correlates with a global retuning of sensorimotor transformations during foraging that leads to marked changes in both the motivation to hunt for prey and the accuracy of motor sequences during hunting. This work reveals an important hidden variable that shapes the temporal structure of motivation and decision-making.

## Data availability

The data supporting the findings of this study are available from the corresponding authors upon request.

## Code availability

All software was written in Julia 0.6.3 and C++11 using the Julia package ecosystem (Optim.jl, PyPlot.jl, Images.jl, Cairo.jl, HDF5.jl, Lasso.jl, CUDAdrv.jl and CUDArt.jl). GPU processing was implemented for fish tracking, as well as online and offline image registration. Image registration and data analysis code was based on published work9,10,16,21,32,33, and is available upon request.

## References

1. Dayan, P. How to set the switches on this thing. Curr. Opin. Neurobiol. 22, 1068–1074 (2012).
2. Aston-Jones, G. & Cohen, J. D. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annu. Rev. Neurosci. 28, 403–450 (2005).
3. Fox, M. D. et al. The human brain is intrinsically organized into dynamic, anticorrelated functional networks. Proc. Natl Acad. Sci. USA 102, 9673–9678 (2005).
4. Nair, J. et al. Basal forebrain contributes to default mode network regulation. Proc. Natl Acad. Sci. USA 115, 1352–1357 (2018).
5. Flavell, S. W. et al. Serotonin and the neuropeptide PDF initiate and extend opposing behavioral states in C. elegans. Cell 154, 1023–1035 (2013).
6. Lottem, E. et al. Activation of serotonin neurons promotes active persistence in a probabilistic foraging task. Nat. Commun. 9, 1000 (2018).
7. Cohen, J. D., McClure, S. M. & Yu, A. J. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philos. Trans. R. Soc. B 362, 933–942 (2007).
8. Charnov, E. L. Optimal foraging, the marginal value theorem. Theor. Popul. Biol. 9, 129–136 (1976).
9. Kim, D. H. et al. Pan-neuronal calcium imaging with cellular resolution in freely swimming zebrafish. Nat. Methods 14, 1107–1114 (2017).
10. Marques, J. C., Lackner, S., Félix, R. & Orger, M. B. Structure of the zebrafish locomotor repertoire revealed with unsupervised behavioral clustering. Curr. Biol. 28, 181–195.e5 (2018).
11. Marques, J. C. & Orger, M. B. Clusterdv: a simple density-based clustering method that is robust, general and automatic. Bioinformatics 35, 2125–2132 (2019).
12. Bianco, I. H. & Engert, F. Visuomotor transformations underlying hunting behavior in zebrafish. Curr. Biol. 25, 831–846 (2015).
13. McElligott, M. B. & O’Malley, D. M. Prey tracking by larval zebrafish: axial kinematics and visual control. Brain Behav. Evol. 66, 177–196 (2005).
14. Borla, M. A., Palecek, B., Budick, S. & O’Malley, D. M. Prey capture by larval zebrafish: evidence for fine axial motor control. Brain Behav. Evol. 60, 207–229 (2002).
15. Freeman, J. et al. Mapping brain activity at scale with cluster computing. Nat. Methods 11, 941–950 (2014).
16. Friedrich, J. et al. in NIPS Workshop on Statistical Methods for Understanding Neural Systems https://pdfs.semanticscholar.org/e4ff/845a4b996482f4ef491fff4581a59d949800.pdf (2015).
17. Niessing, J. & Friedrich, R. W. Olfactory pattern classification by discrete neuronal network states. Nature 465, 47–52 (2010).
18. Granger, C. W. J. in A Companion to Theoretical Econometrics (ed. Baltagi, B. H.) Ch. 26 (Blackwell, 2007).
19. Phillips, P. C. B. Understanding spurious regressions in econometrics. J. Econom. 33, 311–340 (1986).
20. Lovett-Barron, M. et al. Ancestral circuits for the coordinated modulation of brain state. Cell 171, 1411–1423 (2017).
21. Marquart, G. D. et al. High-precision registration between zebrafish brain atlases using symmetric diffeomorphic normalization. Gigascience 6, 1–15 (2017).
22. Kastenhuber, E., Kratochwil, C. F., Ryu, S., Schweitzer, J. & Driever, W. Genetic dissection of dopaminergic and noradrenergic contributions to catecholaminergic tracts in early larval zebrafish. J. Comp. Neurol. 518, 439–458 (2010).
23. Randlett, O. et al. Whole-brain activity mapping onto a zebrafish brain atlas. Nat. Methods 12, 1039–1046 (2015).
24. Bianco, I. H. & Wilson, S. W. The habenular nuclei: a conserved asymmetric relay station in the vertebrate brain. Philos. Trans. R. Soc. B 364, 1005–1020 (2009).
25. Amo, R. et al. Identification of the zebrafish ventral habenula as a homolog of the mammalian lateral habenula. J. Neurosci. 30, 1566–1574 (2010).
26. Kalén, P., Karlson, M. & Wiklund, L. Possible excitatory amino acid afferents to nucleus raphe dorsalis of the rat investigated with retrograde wheat germ agglutinin and d-[3H]aspartate tracing. Brain Res. 360, 285–297 (1985).
27. Filosa, A., Barker, A. J., Dal Maschio, M. & Baier, H. Feeding state modulates behavioral choice and processing of prey stimuli in the zebrafish tectum. Neuron 90, 596–608 (2016).
28. Fox, M. D. & Raichle, M. E. Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging. Nat. Rev. Neurosci. 8, 700–711 (2007).
29. Avants, B., Tustison, N. & Song, G. Advanced Normalization Tools (ANTS). Insight J. 2, 1–35 (2009).
30. van der Maaten, L. & Hinton, G. E. Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
31. Sibson, R. SLINK: An optimally efficient algorithm for the single-link cluster method. Comput. J. 16, 30–34 (1973).
32. Insafutdinov, E., Pishchulin, L., Andres, B., Andriluka, M. & Schiele, B. in European Conference on Computer Vision 34–50 (Springer, 2016).
33. Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).
34. Nath, T. et al. Using DeepLabCut for 3D markerless pose estimation across species and behaviors. Nat. Protoc. 14, 2152–2176 (2019).
35. Rabiner, L. R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 257–286 (1989).
36. Lillesaar, C., Tannhäuser, B., Stigloher, C., Kremmer, E. & Bally-Cuif, L. The serotonergic phenotype is acquired by converging genetic mechanisms within the zebrafish central nervous system. Dev. Dyn. 236, 1072–1084 (2007).
37. Lillesaar, C., Stigloher, C., Tannhäuser, B., Wullimann, M. F. & Bally-Cuif, L. Axonal projections originating from raphe serotonergic neurons in the developing and adult zebrafish, Danio rerio, using transgenics to visualize raphe-specific pet1 expression. J. Comp. Neurol. 512, 158–182 (2009).
38. McLean, D. L. & Fetcho, J. R. Ontogeny and innervation patterns of dopaminergic, noradrenergic, and serotonergic neurons in larval zebrafish. J. Comp. Neurol. 480, 38–56 (2004).
39. Hong, E. et al. Cholinergic left-right asymmetry in the habenulo-interpeduncular pathway. Proc. Natl Acad. Sci. USA 110, 21171–21176 (2013).
40. Panula, P. et al. The comparative neuroanatomy and neurochemistry of zebrafish CNS systems of relevance to human neuropsychiatric diseases. Neurobiol. Dis. 40, 46–57 (2010).

## Acknowledgements

We thank M. Burns, M. Frank, S. Flavell, P. Dayan, K. Taute and G. Rainer for discussions and feedback; A. Mathis and M. W. Mathis for providing unpublished code, help with the implementation of DeepLabCut for tracking Paramecia and zebrafish, and for motivating us to use deep learning methods; and we thank M. Ahrens and H. Baier for Tg(elavl3:H2B-GCaMP6s) and I. Bianco for discussions about prey capture. This work was financially supported by the Rowland Institute at Harvard.

## Author information

J.M.L. and D.N.R. conceived the project on state-dependent modulation of foraging behaviour, identified behavioural states (for example, using HMM and PCA) and state-related and hunting-related neural populations, created the dynamical system model of motivational states and provided overall guidance on analysis of behaviour and neural activity. J.C.M. contributed to the conception of whole-brain imaging of hunting sequences using the tracking microscope, conducted all experiments, contributed to experimental design, classified individual movements into bout types, annotated and confirmed each hunting sequence and outcome, analysed state-dependent modulation of behaviour and contributed to identification and functional analysis of neuromodulatory populations. M.L. trained deep neural networks for annotation of Paramecia and fish body parts, conducted PCA of whole-brain neural activity, analysed the functional role of neuromodulatory populations (such as by registration of live and immunostained brain volumes) and identified state-related and hunting-related neural populations across animals (spatial P value analysis and k-means clustering). D.S. designed and performed all immunostaining experiments. J.M.L., D.N.R., J.C.M. and M.L. wrote the manuscript.

Correspondence to Drew N. Robson or Jennifer M. Li.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

### Peer review information

Nature thanks Mario De Bono, Yonatan Loewenstein, Ethan Scott and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

### Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Extended data figures and tables

### Extended Data Fig. 1 DNN pipeline for automated tracking of Paramecia and fish and density valley clustering identifies seven bout types performed by larvae during foraging.

a, Automated tracking of Paramecia and fish followed a four-step procedure (Methods). The rightmost panel shows the automated annotation of Paramecia position (yellow circles), eye position (dark blue, blue, yellow and red dots), yolk position (green dot) and brain position (light green dot). Fish heading (red line) and eye convergence index (angle in cyan) were calculated using eye, yolk and brain positions (Methods). b, Schematic of eye convergence index when the fish is not hunting (left) or hunting (right). c, Calculation of the eye convergence index. Top, angles of the left and right eyes are calculated from the two detected points of each eye. Bottom, eye convergence index is calculated as the difference of the individual eye angles. By setting a threshold for the eye convergence index, we define the start and end of each hunting sequence. d–i, The bout clusters that larvae executed during functional imaging with the tracking microscope (25 larvae in the presence of Paramecia and 7 in the absence of Paramecia) were identified by clusterdv (Methods). To estimate the similarity between bout clusters, pairwise ROC curves were calculated. d, Dimensionality reduction using t-distributed stochastic neighbour embedding (t-SNE). e, Evidence accumulation matrix of clusterdv assignments was constructed by repeating the t-SNE embedding 100 times. The rows and columns of the matrix were sorted by single-link clustering. f, Dendrogram of e. Green line marks the largest jump on the dendrogram and determines the number of bout types (seven) that were detected. g, First 2 PCs of 73 kinematic parameters that were calculated for each bout. Colours represent bout categories from f. h, Angle of caudal tail segment versus time of 50 randomly picked bouts of each type. Cyan lines are the average of all bouts in each category. i, Bout kinematic distributions were obtained by using all the bouts of each type. Colours represent bout types as in h.
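
The eye convergence calculation in panel c can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the sign convention of the index, the angle wrapping and the threshold value are all assumptions, and the actual tracking code was written in Julia and C++.

```python
import math

def eye_angle(p_a, p_b):
    """Orientation (degrees) of the line through the two tracked points of one eye."""
    return math.degrees(math.atan2(p_b[1] - p_a[1], p_b[0] - p_a[0]))

def wrap_deg(angle):
    """Wrap an angle to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def convergence_index(left_eye, right_eye):
    """Difference of the individual eye angles (sign convention is assumed)."""
    return wrap_deg(eye_angle(*left_eye) - eye_angle(*right_eye))

def hunting_sequences(index_trace, threshold):
    """Return (start, end) frame pairs where the index exceeds the threshold.

    The end frame is exclusive. The threshold is a hypothetical value
    supplied by the caller; the legend does not state the one used.
    """
    sequences, start = [], None
    for frame, value in enumerate(index_trace):
        if value > threshold and start is None:
            start = frame                      # eyes converge: sequence begins
        elif value <= threshold and start is not None:
            sequences.append((start, frame))   # eyes de-converge: sequence ends
            start = None
    if start is not None:                      # trace ended mid-sequence
        sequences.append((start, len(index_trace)))
    return sequences
```

Each eye is passed as a pair of (x, y) points, so `convergence_index(((0, 0), (1, 1)), ((0, 0), (1, -1)))` compares a left eye at 45° with a right eye at −45°.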

### Extended Data Fig. 2 Sequential Markov decision process of larval foraging behaviour, final hunting bout is modulated by foraging state, and movement rate and hunting probability are inversely related.

a–r, Markov decision process of larval foraging behaviour. A schematic of foraging behaviour as a Markov decision process is shown for all time (a), exploitation state (g) or exploration state (m). Each node represents a decision point in the hunting sequence, and each arrow has a state-dependent transition probability (grey number next to arrow). The ‘free behaviour’ and ‘pursuit’ nodes are composed of sequences of multiple bout types that may occur with distinct probabilities. b, c, h, i, n, o, Bout type probabilities for the free behaviour node (b, h, n), which occurs when the eyes are unconverged, and the pursuit node (c, i, o), which takes place when the eyes are converged and the fish is chasing prey. d–f, j–l, p–r, Bout type probabilities of the last bout of the pursuit sequence for each of the hunting outcomes: prey in mouth (d, j, p), abort (e, k, q) or miss (f, l, r). Once the prey is in the mouth, the fish can swallow (success) or spit it out (rejection). The colours in the box plots represent each observed bout type (legend, top right corner). Statistical analysis is presented in Supplementary Table 1. s, Movement type that the fish executes at the end of each hunting sequence during exploitation and exploration (n = 25 animals, 2,972 hunting attempts). t, Box plots of state-dependent differences in the probability of exploration-related movement types (top), probability of hunting-related movement types (middle) and probability of hunting outcomes (bottom). Probability of a movement type is defined as the fraction of all movements that are of a given type, and probability of hunting outcome is defined as the fraction of all hunting sequences that end with a given outcome.
Adjusted P values for state-dependent differences: routine turn (7.37 × 10⁻⁵), slow swim type 1 (0.092), slow swim type 2 (9.83 × 10⁻⁵), eye convergence (6.14 × 10⁻⁵), J-turn (8.60 × 10⁻⁵), approach swim (9.83 × 10⁻⁵), capture swim (7.63 × 10⁻⁵), high angle turn (1.0 × 10⁻⁵), suction (0.25), success (0.020), rejection (0.63) and miss (3.5 × 10⁻³). n = 25 animals, 2,972 hunting attempts and 88,814 movements. u, To more finely parse state-dependent behaviour, we further subdivide the dispersal data by fitting a 3-state HMM. This results in a state with very low movement rate (< 0.1 Hz, yellow) and spatial dispersal, a state with intermediate movement rate and spatial dispersal (red) and a state with very high movement rate and dispersal (blue). We find that the inverse relationship between movement rate and hunting probability is consistent regardless of whether we fit two (t) or three states (u). The lower the movement rate, the more strongly individual movements are biased towards hunting (n = 16 animals, 1,605 hunting attempts and 40,069 movements). s–u, Pairwise comparisons between the states were performed using a two-tailed Wilcoxon signed-rank test. Adjusted P values have been corrected for multiple comparisons by the Holm–Bonferroni method. *P < 0.05, **P < 0.01 and ***P < 0.001.
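
For readers unfamiliar with the Holm–Bonferroni correction named above, the following sketch shows the standard step-down adjustment. The function name and the example P values are illustrative only and are not taken from the study.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted P values in the original order.

    The k-th smallest raw P value (k = 0, 1, ...) is multiplied by (m - k),
    the result is capped at 1, and a running maximum keeps the adjusted
    values monotone in the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A comparison is then reported as significant when its adjusted value falls below the chosen level, for example 0.05.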

### Extended Data Fig. 3 Distribution of behavioural parameters across exploitation and exploration state.

a, Distribution of dispersal distance (left), and colour-coded (right) by whether the observation occurred during exploration (blue) or exploitation (red) state, as classified by the HMM (Methods). To facilitate comparison across animals, the dispersal distance was normalized to the range of each animal. b, Distribution of hunting probability across all time (left), during exploration state (blue) and during exploitation state (red). c, Distribution of probability for hunting, approach swim, J-turn, capture swim, routine turn, slow swim type I, slow swim type II and high-angle turn, plotted against dispersal distance, and colour-coded by whether the observation occurred during exploitation (red) or exploration (blue) state. d–f, PCA of foraging behaviour reveals a bimodal distribution of behavioural variables. PCA was performed using ten behavioural variables: rate of each movement type (approach swim, J-turn, capture swim, routine turn, slow swim type I, slow swim type II and high-angle turn), rate of all hunting sequences, rate of successful hunting sequences and rate of aborted hunting sequences (Methods). d, Distribution of the first PC (left), and colour-coded (right) by whether the observation was classified as exploration (blue) or exploitation (red) by the HMM. e, Variance explained by each PC. f, Distribution of all behavioural parameters projected into the first three PC dimensions, colour-coded by whether the observation occurred during exploitation (red) or exploration (blue) state. For a–f, n = 19 animals. g, Normalized state duration distributions for exploitation (top) and exploration (bottom) (n = 36 animals), divided into 20 bins. The state durations of each animal were normalized by the mean state duration of that animal (normalization was separately performed for exploitation state and exploration state), and then used to fit an exponential model by maximum likelihood (red and blue lines).
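
The normalization and exponential fit described in panel g can be illustrated with a short sketch. The function names and the example durations are hypothetical; only the per-animal normalization and the maximum-likelihood fit come from the legend.

```python
import math

def normalize_durations(durations):
    """Divide one animal's state durations by that animal's mean duration."""
    mean_duration = sum(durations) / len(durations)
    return [d / mean_duration for d in durations]

def fit_exponential_rate(samples):
    """Maximum-likelihood rate of an exponential model: n / sum(x)."""
    return len(samples) / sum(samples)

def exponential_pdf(x, rate):
    """Density of the fitted exponential model at x."""
    return rate * math.exp(-rate * x)

# Hypothetical exploitation-state durations (minutes) for two animals.
animal_a = [2.0, 4.0, 6.0]
animal_b = [1.0, 3.0]
pooled = normalize_durations(animal_a) + normalize_durations(animal_b)
rate = fit_exponential_rate(pooled)
```

Because each animal's normalized durations average to one, the pooled maximum-likelihood rate equals 1 by construction, so comparing the fitted curve with the histogram tests the exponential shape rather than the time scale.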

### Extended Data Fig. 4 Exploitation–exploration state transitions occur in the absence of prey and are not modulated by starvation.

a, Behavioural state transitions in two animals, each placed in an arena without Paramecia for over an hour. b, PCA trajectory based on whole-brain activity from a representative animal in the absence of prey, colour-coded by the activity of exploitation-state encoding neurons. c, Fraction of time in exploitation state in the absence (grey) and presence (purple) of Paramecia (25 animals in the presence of Paramecia and 7 animals in the absence of Paramecia). d, Box plots of state-dependent differences in movement types (n = 7 animals, 13,997 movement events). The numbers of capture and rejection events were both 0, because there was no prey. e, f, Comparison between starved and fed larvae. e, Fraction of time in exploitation state for the first 10 min (left), and the entire duration of the experiment (right). f, Probabilities by movement type for larvae that were fed (green) or starved for 24 h before the experiment (yellow) (n = 5 animals in fed condition, n = 9 animals in 24 h starved condition). c–f, Adjusted P value (above each plot) obtained by two-tailed Wilcoxon signed-rank test. Adjusted P values have been corrected for multiple comparisons by the Holm–Bonferroni method.

### Extended Data Fig. 5 Spatial P value analysis, spatial distribution of neurons encoding exploitation state, exploration state, and transition from exploration to exploitation, and sensorimotor networks activated during hunting.

a, The spatial P value analysis followed a two-step procedure. In step 1, ANTs registration was performed to transform the live fluorescent brain volumes of 17 animals into a common reference space defined by Z-Brain. In step 2, a spatial P value was calculated for each functionally classified neuron in each animal by examining its spatial colocalization across all 17 fish. For instance, to calculate the spatial P value of a functionally classified neuron in fish i (shown as red star), we calculated the shortest distance $$d^{r}$$ to any neuron in fish j that was classified into the same functional class to generate a same-class distance vector $$d^{r}_{i,1},\ldots,d^{r}_{i,j-1},d^{r}_{i,j},\ldots,d^{r}_{i,m}$$, in which m = 17. We then calculated the shortest distance $$d^{n}$$ between the neuron in fish i and a random size-matched population of neurons in fish j obtained by randomly sampling all neurons in fish j, with the number of neurons sampled equal to the number of neurons in the functional class in fish j. Therefore, a randomized distance vector $$d^{n}_{i,1},\ldots,d^{n}_{i,j-1},d^{n}_{i,j},\ldots,d^{n}_{i,m}$$ was calculated. To facilitate comparison of these distance vectors, we reduced each of them to a single scalar value by calculating the mean of the lowest 90% of the distances within the same-class distance vector and the random distance vector. We then repeated the random sampling process N times (N = 10,000 in our analysis) to construct a null distribution of the shortest distance between a functionally classified neuron in fish i and the randomly sampled neurons in other animals. We then fitted this null distribution with a normal distribution. The P value for this functionally classified neuron in fish i relative to all other animals was then calculated by evaluating $$d^{r}$$ on the basis of a normal distribution model. We reject the null hypothesis of random sampling if a functionally classified neuron in fish i has a spatial P value < 0.025.
b, The spatial density for each area in the Z-Brain map was calculated for neurons that encode exploitation (left), exploration (middle), and transition from exploration to exploitation (right). c–e, Brain regions with neural densities higher than two standard deviations (measured across all brain regions) for exploitation-state-encoding neurons (c), exploration-state-encoding neurons (d) and trigger neurons that encode the transition from exploration to exploitation (e) (n = 17 animals; only neurons with consistent spatial localization and spatial P value < 0.025 are considered for this analysis). f, Retinotopic map of prey detection neurons in the optic tectum. Left, anatomical position of neurons tuned to different prey angles (Methods). Right, angular tuning curves of optic tectum neurons classified by angle of peak prey response, in 30° increments. g, Location of all neurons time-locked to the onset of eye convergence. h, All neurons time-locked to successful prey ingestion. i, Retinotopic map for each prey detection angle depicted individually across eight maps. j, Motor networks that drive distinct movement types associated with pursuit and capture of prey. Motor networks were identified for all seven observed movement types. All panels depict neurons with spatially significant colocalization across animals (spatial P value < 0.025, n = 17 animals). Scale bars, 50 µm.
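
The resampling procedure in panel a can be sketched for a single neuron as follows. This is an illustration under stated assumptions, not the published analysis code: the function names, the one-sided (lower-tail) evaluation of the observed distance, the rounding used for the lowest 90% and the reduced number of resamples are all choices made here for brevity.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def shortest_distance(point, others):
    """Shortest Euclidean distance from one neuron to a set of neurons."""
    return min(math.dist(point, q) for q in others)

def trimmed_mean_lowest(values, fraction=0.9):
    """Mean of the lowest `fraction` of the values (at least one value)."""
    keep = max(1, int(len(values) * fraction))
    return mean(sorted(values)[:keep])

def spatial_p_value(neuron, same_class_by_fish, all_neurons_by_fish,
                    n_resamples=1000, rng=None):
    """Lower-tail P value of the observed same-class distance under a normal
    fit to a size-matched random-sampling null (tail choice is assumed).

    same_class_by_fish[j]  : same-class neuron positions in other fish j
    all_neurons_by_fish[j] : all neuron positions in other fish j
    """
    rng = rng or random.Random(0)
    observed = trimmed_mean_lowest(
        [shortest_distance(neuron, cls) for cls in same_class_by_fish])
    null = []
    for _ in range(n_resamples):
        distances = []
        for cls, pool in zip(same_class_by_fish, all_neurons_by_fish):
            sample = rng.sample(pool, len(cls))   # size-matched random draw
            distances.append(shortest_distance(neuron, sample))
        null.append(trimmed_mean_lowest(distances))
    model = NormalDist(mean(null), stdev(null))   # normal fit to the null
    return model.cdf(observed)
```

Positions are coordinate tuples in the common reference space. A neuron whose same-class neighbours in other animals lie much closer than randomly drawn neurons yields a small value, which is then compared with the 0.025 cut-off quoted in the legend.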

### Extended Data Fig. 6 PCA analysis of whole-brain neural activity across four fish.

Each row shows a different animal. The first column shows all neuronal centroids identified from NMF (Methods) for each animal. The second column shows PCA trajectories in the first three PCA dimensions. The third column shows the locations of the exploitation-state-encoding neurons for each animal. The fourth column shows whole-brain PCA trajectories colour-coded by the activity of exploitation-state-encoding neurons in each animal.

### Extended Data Fig. 7 Correlation between state duration and activity of dorsal raphe state neurons and exploration-state-encoding neurons, multimodal activation of the exploitation state, and stochastic model of trigger and state signals.

a, Top, peak activities of exploration state neurons are not correlated with the duration of exploration state. Bottom, peak activities of dorsal raphe exploitation state encoding neurons correlate with the duration of exploitation state. Activity is in units of z-scored fluorescence. b–d, The trigger network that is active at the transition from exploration to exploitation contains both modality-specific neurons (b, c) that are active only at either spontaneous transitions (b) or stimulus-evoked transitions (c), as well as cross-modal neurons that are activated at all state transitions regardless of whether the transition is stimulus-evoked or spontaneous (d). e, Schematic of the stochastic model. Spontaneous and stimulus-evoked trigger events are represented by delta functions with variable (integrated) amplitudes. The spontaneous trigger events arrive according to a Poisson process during exploration state, triggering the exit from exploration state after an exponentially distributed waiting time. Each delta function causes the trigger signal to undergo an impulsive rise, followed by rapid exponential decay. The state signal integrates the trigger signal, and then undergoes a slow linear decay back to baseline. Exploitation state corresponds to the time intervals when the state signal is elevated above baseline. Spontaneous trigger amplitudes are exponentially distributed, which leads to a corresponding amplitude in the state signal, and effectively determines the duration of the ensuing exploitation state due to the slow linear decay of the state signal. f, Two examples of recorded trigger and state signals during long exploitation states, together with corresponding signals in the fitted model. Event amplitudes $${A}_{i}^{{\rm{spont}}}$$ and times $${t}_{i}^{{\rm{spont}}}$$ are fit individually for each transition.
The shared model parameters (α, β, γ) were obtained by simultaneous least squares fitting of the equations for T(t) and S(t) to the dynamics of the population mean activity of trigger and exploitation state neurons during 14 exploitation states having a duration of at least 10 min (obtained from n = 17 fish). The joint fitting procedure yielded α = 1.5, β = 0.1 min⁻¹, and γ = 1.3 min⁻¹. These parameters, together with the fitted behavioural state distributions from Fig. 1d (obtained from n = 36 fish), comprise the conceptual model of the stochastic nonlinear dynamical system for exploration and exploitation.
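
The qualitative behaviour described in panel e can be explored with a small simulation. The equations for T(t) and S(t) are given in the Methods and are not reproduced in this legend, so the form used below is an assumption: the trigger signal decays exponentially at rate γ, and the state signal grows as α times the trigger while decaying linearly at rate β, clipped at baseline. The event rate and mean amplitude are illustrative values, not fitted ones.

```python
import math
import random

# Fitted values quoted in the legend (BETA and GAMMA in min^-1). How each
# parameter enters the equations is an ASSUMPTION of this sketch.
ALPHA, BETA, GAMMA = 1.5, 0.1, 1.3

def simulate(duration_min, dt=0.01, event_rate=0.2, mean_amplitude=1.0, seed=0):
    """Euler simulation of the assumed trigger/state dynamics.

    event_rate (events per minute) and mean_amplitude are hypothetical.
    Returns a list of (time, trigger, state) samples.
    """
    rng = random.Random(seed)
    trigger = state = 0.0
    trace = []
    for k in range(int(round(duration_min / dt))):
        # Spontaneous events arrive as a Poisson process during exploration
        # only (state signal at baseline), with exponential amplitudes.
        if state == 0.0 and rng.random() < event_rate * dt:
            trigger += rng.expovariate(1.0 / mean_amplitude)
        # State integrates the trigger and decays linearly, clipped at baseline.
        state = max(0.0, state + (ALPHA * trigger - BETA) * dt)
        # Trigger decays exponentially between events.
        trigger *= math.exp(-GAMMA * dt)
        trace.append(((k + 1) * dt, trigger, state))
    return trace

def exploitation_durations(trace, dt=0.01):
    """Durations (minutes) of intervals with the state signal above baseline."""
    durations, run = [], 0
    for _, _, state in trace:
        if state > 0.0:
            run += 1
        elif run:
            durations.append(run * dt)
            run = 0
    if run:
        durations.append(run * dt)
    return durations
```

Under these assumptions, a larger trigger amplitude produces a larger rise in the state signal and therefore a longer exploitation interval, which mirrors the relationship between amplitude and state duration described in the legend.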

### Extended Data Fig. 8 Validation of exploitation-state-encoding dorsal raphe neurons.

a, b, Exploitation-state-encoding neurons have stable correlation to state even after excluding activity around hunting events. a, To verify the hypothesis that spatially significant (spatial P value < 0.025) exploitation-state-encoding neurons encode a persistent state and not simply hunting events, we removed the neural activity around all hunting events across the entire experiment, with increasing time intervals ranging from 0.5 s to 10 s in steps of 0.5 s. b, State-encoding neurons remain stably correlated to state after removal of increasingly large time windows around hunting events. Orange, spatially significant (spatial P value < 0.025) dorsal raphe exploitation-state-encoding neurons after spatial P value significance test. Blue, all spatially significant (spatial P value < 0.025) exploitation-state-encoding neurons (n = 17 animals). Standard error of all exploitation-state-encoding neurons is < 0.0024, and standard error of dorsal raphe exploitation-state-encoding neurons is < 0.0041. c–f, Validation of exploitation-state-encoding dorsal raphe neurons by clustering analysis. As a control test, we also conducted the same analysis on another brain region, the inferior olive (g–j). c, g, Clustering analysis of dorsal raphe neurons (c) and inferior olive neurons (control) (g). d, h, Activity of the tightest-correlated cluster (lowest average intra-cluster distance) in the dorsal raphe (d) and inferior olive (h) of each animal aligned to state transition from exploration to exploitation. e, i, Cross-validation of exploitation-state-encoding activity across animals for dorsal raphe (e) and inferior olive neurons (i). Each element of the array represents the mean correlation coefficient Cij between the neural activity of the tightest-correlated dorsal raphe cluster of fish i and behaviour state of fish j. f, Statistical analysis between dorsal raphe neurons in on-diagonal elements (i = j) and neurons in off-diagonal elements ($$i\ne j$$).
The mean correlation coefficient $${C}_{ij}^{k}$$ for i = j was 0.468, whereas the mean correlation coefficient $${C}_{ij}^{k}$$ for $$i\ne j$$ was 0.024. A two-sample t-test analysis verified that there was a significant difference between these two groups (P < 10⁻⁸, n = 17 animals). j, The same statistical analysis showed there was no significant difference between neurons in on-diagonal elements and neurons in off-diagonal elements for the inferior olive (mean correlation coefficient $${C}_{ij}^{k}$$ for i = j was −0.047, mean correlation coefficient $${C}_{ij}^{k}$$ for $$i\ne j$$ was 0.023; P = 1, two-sample t-test).
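
The on-diagonal versus off-diagonal comparison in panels e, f, i and j can be sketched as follows. The function names are hypothetical, and the sketch assumes that the neural and behavioural traces have already been brought to a common length and alignment, which this legend does not describe. It reports the two group means only; the two-sample t-test itself is omitted.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cross_validation_matrix(cluster_activity, behaviour_state):
    """C[i][j]: correlation of fish i's cluster activity with fish j's state."""
    return [[pearson(activity, state) for state in behaviour_state]
            for activity in cluster_activity]

def diagonal_summary(matrix):
    """Mean of on-diagonal (i = j) and off-diagonal (i != j) elements."""
    n = len(matrix)
    on = [matrix[i][i] for i in range(n)]
    off = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(on) / len(on), sum(off) / len(off)
```

A population that encodes each animal's own state should give a high on-diagonal mean and an off-diagonal mean near zero, which is the pattern reported for the dorsal raphe and absent for the inferior olive.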

### Extended Data Fig. 9 Registration of live fluorescent brain volume to immunostained brain volume using ANTs.

a, Schematic of ANTs registration pipeline (Methods). b, The superimposed live fluorescent and immunostained brain volumes before registration, after global registration, and after both global and local registration. c, The immunostained brain volume (green), live fluorescent brain volume (red), and their superimposed volume for a region in the optic tectum. The white arrows highlight the precise overlap of a sparse subpopulation of neurons in the optic tectum. d, Correlation between live fluorescent and immunostained brain volumes before and after registration (Methods). Scale bars, 50 µm (b, c).

### Extended Data Fig. 10 Activity of neuromodulatory subpopulations as a function of internal state transitions and hunting events.

a–c, For each neuromodulatory population, we show an example trace of activity (left) and event-triggered average activity (right) aligned to state transition, successful ingestion of prey and/or hunting initiation. Blue dots, eye convergence or hunting initiation; red dots, prey ingestion. d–f, Left, spatial distribution of each neuromodulatory population. Neurons in five brain regions (optic tectum, raphe, locus coeruleus, cerebellum and hindbrain) are highlighted in different colours. Right, box plots showing, for each neuromodulatory population and brain region, the number of neurons that belong to each functional type. Scale bars, 50 µm (d–f). All plots report mean ± s.e.m.

## Supplementary information

### Supplementary Information

This file contains Supplementary Methods, including a detailed description of Deep Neural Network (DNN) pipeline for paramecia tracking, and Supplementary Table 1: Statistical analysis of Markov Decision Process of larval behaviour. Bout type probability distributions of the sequential Markov Decision Process (MDP) were compared using a Wilcoxon signed rank test for free behaviour, pursuit, prey in mouth, abort, and miss (bout type probabilities are shown in Extended Data Fig. 2). Values represent adjusted p-values of bout type comparisons for exploitation state (red), exploration state (blue), and all time (black). Adjusted p-values have been corrected for multiple comparisons by Holm-Bonferroni method (n = 25 animals, 2,972 hunting attempts, 88,814 movements).

### Supplementary Video 1

Paramecia and fish eye tracking. The paramecia and eye tracking results are shown for four different hunting sequences, corresponding to success, abort, miss, and reject. The yellow circle denotes the position of each DNN-tracked paramecium. The blue and red dots show the four DNN-tracked eye positions. The description in the top right corner, “eyes converged” (red) or “eyes un-converged” (cyan), indicates the eye convergence annotation determined by the eye tracking results. Video is shown 5× slower than real-time.

### Supplementary Video 2

Representative hunting sequence showing the identification and classification of each movement. The fish makes several discrete movement bouts during the hunting sequence. The bout type that the fish is executing is written in the top right corner of the video. The fish initiates hunting by converging its eyes and executing a left J-turn, but overshoots the paramecium. Immediately after, the fish executes a J-turn to the right and positions the prey in front of its mouth. The fish executes a capture swim, consumes the prey, and de-converges its eyes. Video is shown 20× slower than real-time.

### Supplementary Video 3

3D distribution of exploitation state neurons, exploration state neurons, and trigger neurons. The brain stacks are overlaid with dots depicting the locations of functionally classified neurons. Only neurons with spatially significant colocalization across animals are shown (spatial p-value < 0.025, n = 17 animals). Brain stacks are shown from dorsal to ventral.

### Supplementary Video 4

Immunostaining stack and corresponding functional types. Immunostained brain stacks for serotonergic, dopaminergic, and acetylcholinergic neurons. Left, antibody staining; right, antibody staining overlaid with dots depicting locations of neurons functionally classified as exploitation-state encoding (red), prey-detection (yellow), eye-convergence (cyan), and success (magenta). The brain stacks are shown from dorsal to ventral.

## Cite this article

Marques, J.C., Li, M., Schaak, D. et al. Internal state dynamics shape brainwide activity and foraging behaviour. Nature 577, 239–243 (2020). https://doi.org/10.1038/s41586-019-1858-z
