Disentangling the role of NAc D1 and D2 cells in hedonic eating

Overeating is driven by both the hedonic component (‘liking’) of food, and the motivation (‘wanting’) to eat it. The nucleus accumbens (NAc) is a key brain center implicated in these processes, but how distinct NAc cell populations encode ‘liking’ and ‘wanting’ to shape overconsumption remains unclear. Here, we probed the roles of NAc D1 and D2 cells in these processes using cell-specific recording and optogenetic manipulation in diverse behavioral paradigms that disentangle reward traits of ‘liking’ and ‘wanting’ related to food choice and overeating in healthy mice. Medial NAc shell D2 cells encoded experience-dependent development of ‘liking’, while D1 cells encoded innate ‘liking’ during the first food taste. Optogenetic control confirmed causal links of D1 and D2 cells to these aspects of ‘liking’. In relation to ‘wanting’, D1 and D2 cells encoded and promoted distinct aspects of food approach: D1 cells interpreted food cues while D2 cells also sustained food-visit-length that facilitates consumption. Finally, at the level of food choice, D1, but not D2, cell activity was sufficient to switch food preference, programming subsequent long-lasting overconsumption. By revealing complementary roles of D1 and D2 cells in consumption, these findings assign neural bases to ‘liking’ and ‘wanting’ in a unifying framework of D1 and D2 cell activity.


INTRODUCTION
Overweight and obesity, and eating disorders generally, have increased in recent decades [1,2]. Thus, it is crucial to better understand the factors that contribute to the development and onset of different eating disorders. In the modern obesogenic environment, palatability and availability of foods play a major role in the development of overeating conditions [3–8]. Food consumption is not only motivated by the homeostatic need to monitor, restore and maintain energy balance. Homeostatic signals can also be overridden, causing us to engage in nonhomeostatic eating that is driven by the sensory and/or hedonic properties of palatable foods with high fat and/or sugar contents. This 'hedonic' eating activates brain regions involved in reward, resulting in increased motivation to consume food, and increased energy consumption [9–11] despite a state of satiety [12].
The 'incentive salience theory' of reward has provided a valuable framework for more than a decade for investigating the role of food hedonics in the context of appetitive behavior in humans and animals [13,14]. This theory postulates that reward is not a unitary process, but comprises an affective pleasure component referred to as 'liking' and a non-affective motivational component referred to as 'wanting'. 'Liking' is the hedonic component that reflects the immediate experience of eating a pleasurable food [15]. 'Wanting' is the incentive motivation (reward-seeking) that can lead to increased appetite, food cravings and also to overconsumption of food [13,16,17]. Looking at these psychological traits in eating behaviors [18–20], overeating in some individuals might reflect the abnormal functioning of either the 'wanting' or 'liking' mechanisms, and there are indications from subjects with eating disorders that 'liking' and 'wanting' of obesogenic foods can also be dissociated [21]. Dopaminergic neurotransmission in the nucleus accumbens (NAc) plays a key role in behaviors related to natural rewards and drugs of abuse [22–26]. In the NAc, two important cell classes are medium spiny neurons expressing D1 dopamine receptors (D1 cells) and those expressing D2 dopamine receptors (D2 cells) [27]. The precise contribution of D1 and D2 cells to reward-associated behaviors is still unresolved, and establishing a causal relationship between activation of each neuronal subtype and its effect on motivated behaviors has proven challenging. D1 cell activation is canonically considered to be related to positive rewarding events leading to persistent reinforcement, while activation of D2 cells is thought to facilitate aversion [28–30]. However, the current literature is controversial regarding the functionality of these cells and their role in motivation-related behaviors [31–34]. For instance, if looking solely at hedonic eating, some studies report a negative correlation between NAc D1 cell activity and food consumption [35], while others report a positive relationship [36,37].
Therefore, increased understanding of which cells in the dopaminergic system underlie brain substrates of 'liking' and 'wanting' in healthy animals may open an avenue to comprehending the impact of food rewards on eating behaviors linked to the development and onset of obesity and other eating disorders. It may also provide insight into brain circuits and mechanisms underlying both hedonic and motivational components of reward, including mechanisms common to both natural rewards and drugs of abuse.
This paper aims to further understand the differential contribution of medial NAc shell D1 and D2 cells to different components of palatable food consumption, namely 'liking' (hedonia/perceiving food as pleasurable) and 'wanting' (incentive motivational value/desire/craving). We assess 'liking' and 'wanting' from different angles using multiple behavioral paradigms: (1) a free-consumption task to assess hedonic pleasure experienced from eating a palatable food (this focuses on 'liking' rather than 'wanting'), including lick microstructure analyses to provide a supplementary read-out of the 'liking' and 'wanting' components of hedonic eating; (2) a motivational effort-based task assessing 'wanting'; and (3) a hedonic-shifting task to evaluate how medial NAc shell cells drive choice by inducing either 'liking' or 'wanting'. Alongside those behavioral tests, we employed neuronal recording or optogenetic tools, and multivariate analysis. This enabled us to better understand the forces underlying eating behavior that may lead to overconsumption in healthy animals, and may ultimately lead to the onset of obesity or other eating disorders.

RESULTS
NAc D1 cell activity is transiently increased, while D2 cell activity is transiently increased then suppressed, during periods of free consumption of a palatable food reward
To probe the neural mechanisms underpinning reward consumption, we used in vivo fiber photometry to record fluorescence of cre-dependent GCaMP6s calcium indicator selectively expressed in D1 cells (in D1 receptor-cre mice) or D2 cells (in adenosine A2a receptor gene (Adora2a)-cre mice) [38] in the medial NAc shell (Fig. 1a). In the striatum, GCaMP6s fluorescence is a broad readout of neuronal activation likely including both spiking output and non-somatic calcium changes that presumably reflect dendritic synaptic inputs [39]. Here, as we were interested in overall D1 and D2 cell activity, this was not a limitation.
To determine how D1 and D2 cells represented solutions with different palatability, mice were first given free access to water from a bottle (Fig. S1a-c). Once they reliably consumed this, they were allowed to freely consume a highly palatable food (milkshake, Fig. 1b, c and S1c) over the course of five daily sessions, in the absence of nutrient deficit (to motivate fluid consumption, mice had access to 2% citric acid in water in their home cage water bottle, but they were not food deprived; see Methods).
We found that before licking, both D1 and D2 cells transiently increased their activity, which peaked just after licking onset. In both D1 and D2 cells, the peak was more pronounced in response to milkshake than water. This may reflect hedonic processes (Fig. 1b-d) rather than nutrient detection because a higher peak was observed in response to water than to mildly acidic water (Fig. S1d). However, after consumption onset, the activity patterns of the two populations differed; D1 cells returned to baseline, while D2 cell activity was suppressed below the baseline as the licking bout continued (Fig. 1d). Both cell types continued to respond to milkshake over subsequent sessions, but the D1 peak reduced over sessions, and the D2 suppression was exaggerated (Fig. 1e). Note that underlying spiking activity may occur before the peak of the photometry signal, given the slower temporal dynamics of GCaMP6s, so we cannot exclude that spiking activity peaks just before or during licking onset (see for example Supplementary Fig. 2a of [39]).
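As an illustration of how lick-aligned photometry responses like those described above can be extracted, the following is a minimal sketch, not the authors' pipeline. The function name, the 2 s baseline and 5 s post-event windows, and the baseline z-scoring are all assumptions made for the example; the actual preprocessing is described in the Methods.

```python
import numpy as np

def peri_event_traces(signal, fs, event_times, pre_s=2.0, post_s=5.0):
    """Cut windows around each event and z-score them to the pre-event baseline.

    signal      : 1-D photometry trace (arbitrary units)
    fs          : sampling rate in Hz
    event_times : event onsets in seconds (e.g. first lick of a bout)
    pre_s/post_s: hypothetical window lengths, not taken from the paper
    """
    signal = np.asarray(signal, dtype=float)
    n_pre = int(round(pre_s * fs))
    n_post = int(round(post_s * fs))
    rows = []
    for t in event_times:
        i = int(round(t * fs))
        # Skip events whose window would run off either end of the recording.
        if i - n_pre < 0 or i + n_post > signal.size:
            continue
        seg = signal[i - n_pre:i + n_post]
        base = seg[:n_pre]
        sd = base.std()
        centered = seg - base.mean()
        rows.append(centered / sd if sd > 0 else centered)
    return np.array(rows).reshape(-1, n_pre + n_post)
```

Averaging the returned rows gives a mean event-aligned response; its peak and trough can then be compared between conditions such as water and milkshake.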
Milkshake consumption within a session (total number of licks) was significantly higher than water, and this was consistent across sessions (Fig. S2a, b). This indicates more positive hedonic reactions to milkshake than water (Fig. S2b). However, the amount consumed provides an incomplete readout of ingestive behavior [40]. Therefore, to further probe the driving forces behind these changes in overall consumption, we investigated different aspects of free consumption of water vs. a palatable food (milkshake). In particular, we examined if it would be possible to differentiate between 'wanting' and 'liking' parameters [18]. As a complement, we therefore analyzed licking microstructure [41], since it provides several measures to evaluate pleasure and motivation during liquid-food consumption in rodents [40,42–45]. Indeed, rodents' licking patterns when freely consuming a solution vary, with bouts of different lengths and structure; analysis of this structure has often been used to investigate aspects underlying feeding behavior, including motivation and palatability [46,47]. Specifically, increases in the duration of single licks, lick rate [45], licking bout duration [44,48] and bout size (lick intensity determined by number of licks in a bout) [40,44,45], and decreases in inter-lick interval (ILI) have been suggested to represent increased hedonic impact and palatability ('liking'). Increases in bout number [44,46,48] and decreases in inter-bout interval represent increased incentive salience ('wanting') [40,44,49–52] (Fig. 1f). In addition to giving an indication of motivation, the number of bouts can also be affected by post-ingestive mechanisms [45]. For simplicity and based on previous literature [44,46,48], we only used licking bouts and not clusters that are occasionally described in other studies [40,45], as the two measurements largely reflect similar underlying processes [45].
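The bout-based measures listed above can be computed from lick onset times along the following lines. This is a hedged sketch: the function name and the 1 s pause used to separate bouts (`bout_gap`) are hypothetical choices for illustration, and the criterion actually used in the study is given in the Methods. Lick duration is not computed here because it also requires lick offset times.

```python
import numpy as np

def lick_microstructure(lick_onsets, bout_gap=1.0):
    """Summarise lick onsets (seconds) into bout-level measures.

    bout_gap is an assumed threshold: a pause longer than this ends a bout.
    'Liking'-type measures: bout size, bout duration, inter-lick interval (ILI).
    'Wanting'-type measures: bout number, inter-bout interval (IBI).
    """
    t = np.sort(np.asarray(lick_onsets, dtype=float))
    if t.size == 0:
        raise ValueError("no licks recorded")
    gaps = np.diff(t)
    breaks = np.flatnonzero(gaps > bout_gap)        # last lick of every bout except the final one
    starts = np.concatenate(([0], breaks + 1))      # index of first lick in each bout
    ends = np.concatenate((breaks, [t.size - 1]))   # index of last lick in each bout
    ili = gaps[gaps <= bout_gap]                    # within-bout intervals
    ibi = t[starts[1:]] - t[ends[:-1]]              # pauses between consecutive bouts
    return {
        "n_licks": int(t.size),
        "n_bouts": int(starts.size),
        "mean_bout_size": float(np.mean(ends - starts + 1)),
        "mean_bout_duration": float(np.mean(t[ends] - t[starts])),
        "mean_ili": float(ili.mean()) if ili.size else float("nan"),
        "mean_ibi": float(ibi.mean()) if ibi.size else float("nan"),
    }
```

For example, seven licks split by one long pause yield two bouts, one IBI, and five within-bout ILIs.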
We found that licking parameters reflected a greater 'liking' for milkshake over water; lick duration, bout duration and bout size increased significantly, while inter-lick interval (ILI) decreased significantly, between the water and first milkshake sessions (Figs. 1g and S2c), although lick frequency did not increase significantly (Fig. S2c). Therefore, as expected, animals displayed more positive hedonic reactions ('liking') to the milkshake than to water. Lick duration and lick frequency also kept increasing over milkshake sessions, suggesting an increased hedonic component of milkshake consumption in later sessions (Figs. 1g, left and S2c, right).
Bout number increased, and inter-bout interval (IBI) decreased significantly between water and the first milkshake session, suggesting that mice were more motivated ('wanted') to consume milkshake (Fig. 1h). IBI also kept decreasing across milkshake sessions, suggesting 'wanting' for milkshake increased over sessions (Fig. 1h, right). Interestingly, within sessions, 'liking' parameters were overall stable, while 'wanting' parameters indicated a decrease in 'wanting' between the first 2 min and last 2 min of the sessions (Fig. S3a, b), possibly indicating satiation mechanisms coming into play (Fig. S3c), decreasing motivation [45,53].
We then performed multiple regression analysis to investigate whether in vivo calcium activity of NAc D1 and D2 cells during consumption of milkshake across all 5 sessions could predict the level of 'liking' or 'wanting' of the palatable food consumed. This analysis can identify key features of neural activation that predict specific aspects of palatable food consumption. In D1 cells, the minimum and maximum signal values were significant predictors of bout duration ('liking'), explaining around 30% of the variability in bout duration (Figs. 2a and S4a, top right), with negative and positive correlations to 'liking', respectively: the Pearson r and R² change values (Fig. S4a) indicate the strength of the correlation, and the coefficients (Fig. 2a) indicate the direction (positive or negative relationship). We found that the higher the initial increase (and to a smaller extent, the lower the subsequent decrease), the longer the bout (Figs. 2a and S4a). In D2 cells, we found that the area under the curve and the minimum signal during the 3 sec following lick onset were significant predictors of bout duration, together explaining around 40% of the variability in bout duration (Figs. 2b and S4b, top right). More specifically, the more the D2 cell signal decreases, the longer the bouts are and the higher the 'liking' is (Fig. 2b). Interestingly, in a similar analysis of inter-bout interval, IBI ('wanting'; Fig. S5), the significant predictors in either D1 or D2 cell signals only explained a very small amount of IBI variability and were thus poor predictors overall.
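To make the logic of this analysis concrete, the sketch below extracts the kinds of signal features named above (minimum, maximum, and area under the curve in the 3 sec after lick onset) and fits an ordinary least-squares model whose R² gives the fraction of bout-duration variability explained. It is an illustration under assumptions: the function names are hypothetical, the area is a simple rectangle-rule sum, and plain OLS stands in for whatever regression procedure the authors actually used.

```python
import numpy as np

def bout_features(trace, fs, window_s=3.0):
    """Return [min, max, AUC] of a lick-aligned trace over the first window_s seconds."""
    n = int(round(window_s * fs))
    seg = np.asarray(trace, dtype=float)[:n]
    auc = seg.sum() / fs                      # rectangle-rule area under the curve
    return np.array([seg.min(), seg.max(), auc])

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2).

    coefficients[0] is the intercept; the sign of each remaining coefficient
    gives the direction of the relationship with the outcome.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - (resid ** 2).sum() / ss_tot
    return coef, r2
```

In use, each bout would contribute one feature row and one bout duration; an R² near 0.3 or 0.4 would correspond to the "around 30%" and "around 40%" of variability reported for D1 and D2 cells.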
Together, combining photometry recordings, lick microstructure and multiple regression analyses, we detected a dissociation between 'liking' and 'wanting', with the distinct signals in both D1 and D2 cells being more strongly correlated with 'liking' than 'wanting'. This was as we expected given the nature of the task (free access to a single palatable food). Specifically, the initial activation in D1 cells at licking onset and the sustained inhibition of D2 cell activity during eating were good predictors of 'liking' parameters. It is important to note, however, that prediction does not imply causation, which we next addressed.
NAc D2 cells increase consumption by increasing 'wanting' despite a decrease in 'liking'
We next employed optogenetics to investigate how the roles of medial NAc shell D1 and D2 cells differ during ad libitum palatable food consumption. Based on previous findings [28,54], we initially hypothesized that activation of NAc D1 cells would lead to increased consumption, whereas activation of NAc D2 cells, counteracting the natural suppression in activity described earlier, would decrease 'liking'. However, it would not necessarily affect overall intake, depending on its effect on the 'wanting' aspect of consumption. To test our predictions, we injected the optogenetic cre-dependent excitatory actuator ChrimsonR or a control GFP virus into the NAc of D1- or D2(A2a)-cre mice, and implanted optical fibers bilaterally in the NAc (Fig. 2c). Consistent with our photometry data and our initial prediction, optogenetic activation of D2 cells contingent on eating decreased 'liking' (increased ILI, Fig. 2e). Unexpectedly, it also increased 'wanting' (increased bout number, Fig. 2f), leading to increased overall consumption (increased number of licks, Fig. 2d). Conversely, contrary to our hypothesis, D1 cell activation did not impact overall consumption via effects on either 'liking' or 'wanting' parameters (Fig. 2d-f). This would mean that sustained activity of D1 cells beyond the naturally occurring transient increase described in Fig. 1d, e does not promote further 'liking', nor does it induce 'wanting'.
Next, we sought to test whether silencing D1 and D2 cells using the inhibitory opsin eNpHR3.0 could also influence consumption of a palatable food. We hypothesized that D1 cell silencing would induce a decrease in consumption, while D2 cell silencing would prolong consumption. However, silencing D1 cells did not lead to a significant decrease in overall consumption (Fig. 2d), but induced a significant increase in 'liking' (decreased ILI, Fig. 2e) with no significant effect on 'wanting' (Fig. 2f). Inhibitory postingestive mechanisms such as satiation, which increase as ingestion proceeds (Fig. S3) [48,55], may have had an influence.
Inhibition of D2 cells did not lead to any significant difference in lick or bout number compared to controls (Fig. 2d, f), suggesting that lowering D2 cell activity beyond the natural suppression observed during consumption (Fig. 1d, e) may not be possible and therefore may not yield any stronger consummatory effect.
Taken together, these results show that during ad libitum consumption, only D2 cell activation leads to an overall increase in eating, and this is mediated by an increase in 'wanting', despite a concomitant decrease in 'liking'. However, a limitation of the ad libitum paradigm is that it only examines baseline motivation and is not designed to evaluate motivation to work for a reward. Therefore, we next probed the role of D1 and D2 cells in a different motivational task, designed to interrogate 'wanting'.
NAc D1 and D2 cell activity increases with anticipation during a motivational task to retrieve palatable food, which is followed by a D2 cell activity decrease upon consumption onset
To probe 'wanting' further, we paired a food reward with a cue, which allowed us to investigate the attribution of 'incentive salience' to the cue. We exposed mice to a Pavlovian 'motivation to retrieve a reward' task in which they received food rewards announced by a sound cue, which they had to retrieve from a food magazine. This did not require the mice to develop learning strategies. This allowed us to monitor three parameters: the number of entries in the food magazine, the time to retrieve and consume the food after cue onset (latency), and the length of food-magazine entries (Fig. 3a, b).
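The three behavioral parameters of this task can be derived from event timestamps roughly as follows. This is a sketch under assumptions: the function name and input layout (sorted cue times plus paired magazine entry and exit times, all in seconds) are hypothetical, and a trial is taken here to run from one cue to the next, which may differ from the trial definition in the Methods.

```python
import numpy as np

def magazine_metrics(cue_times, entries, exits):
    """Number of magazine entries, per-trial latency, and mean visit length.

    cue_times : sorted cue onsets (s)
    entries   : sorted magazine entry times (s)
    exits     : matching exit times (s), same length as entries
    """
    cue_times = np.asarray(cue_times, dtype=float)
    entries = np.asarray(entries, dtype=float)
    exits = np.asarray(exits, dtype=float)
    latencies = []
    for i, cue in enumerate(cue_times):
        # Assumed trial window: from this cue until the next cue (or end of session).
        end = cue_times[i + 1] if i + 1 < cue_times.size else np.inf
        in_trial = entries[(entries >= cue) & (entries < end)]
        latencies.append(in_trial[0] - cue if in_trial.size else np.nan)
    visit_len = float(np.mean(exits - entries)) if entries.size else float("nan")
    return {
        "n_entries": int(entries.size),
        "latency": np.array(latencies),
        "mean_visit_length": visit_len,
    }
```

Trials without any magazine entry are given a latency of NaN so that they can be excluded or handled explicitly downstream.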
Photometry recordings showed sustained increased activity in D2 cells during the approach phase (after food delivery cue; Fig. 3d, right). D1 cells, however, only transiently increased their activity following food delivery cue (Fig. 3d, left). An important distinction in the behavior of D1 cell activity must be made: in contrast to the ad libitum paradigm where D1 cell signal increased prior to licking onset, here, in a motivational task, the increase in signal was specifically aligned to the reward delivery cue. During food magazine visits (considering only trials when cue delivery and food magazine visits were far enough apart, see Methods), the neural dynamics observed were similar to the first ad libitum study, with a sustained decrease in the activity of D2 cells (Fig. 3e).
As in the first experiment, we were interested to see if the natural D1 and D2 cell activity held predictive information about components of feeding behavior. Correlation analysis of 'wanting' behavioral parameters and D2 cell activity after a food delivery cue revealed that latency to enter the food magazine can be well predicted by the average D2 cell activity before the food magazine visit, and by the standard deviation (Std) of the pre-food magazine visit D2 cell signal (Fig. S6a; the higher the signal before visit, the shorter the latency). This correlation analysis also showed that the number of food magazine visits can be well predicted by the median D2 cell signal before the food magazine visit (Fig. S6b; the higher the signal pre-food magazine visit, the more visits occurred). The average length of a food magazine visit was also found to be well predicted by the minimum of D2 cell activity during consumption, and by the standard deviation of the signal before the visit (Fig. S6c; the lower the minimum, the longer the food magazine visit). D1 cell analysis yielded no significant correlations.
Therefore, latency and number of visits correlated with D2 cell signals before visits occurred, while the length of a visit was determined by the signals during the visit. These data also suggest that suppression of D2 cell activity during feeding favors consumption, which was investigated further in the next experiment.
Inhibition of D2 or D1 cells reduces motivation to retrieve a food reward, while activation of D1 cells, but not D2 cells, enhances motivation
Because both D1 and D2 cells were modulated during the motivation task (Fig. 3d, e), we aimed to causally test the effect of cell-type-specific activation and inhibition on motivated behavior ('wanting'). We optogenetically manipulated NAc D1 and D2 cells at either food reward delivery cue or at food magazine (FM) visit (first visit following a delivery cue) (Fig. 3a, c). D2 cell silencing at either food delivery cue or at food magazine visit significantly decreased the number of food magazine entries (Fig. 3f, g), in line with our regression analysis results showing that D2 cell signal before FM visit positively correlates with the number of visits (Fig. S6b). Inhibiting D2 cells after food delivery cue significantly decreased the average length of the subsequent food magazine visit (Fig. 3j), an effect not observed if inhibition was triggered after FM visit (Fig. 3k). As NAc D2 cell activity is naturally low after FM visits (Fig. 3e), further inhibition is indeed not expected to yield further effects. Thus, these results are in line with our photometry findings (Fig. 3d, right) suggesting a role for D2 cell activity in the anticipation of a food reward ('wanting'). The absence of effect of D2 cell activation in reward anticipation (Fig. 3f, h, j, right) may be due to a ceiling effect, as D2 cell activity is already high (Fig. 3d, right), and thus does not call into question the inhibition results suggesting a role for D2 cell activity in 'wanting'. Optogenetic activation of D1 cells at either food delivery cue or at food magazine visit significantly increased the number of FM entries (Fig. 3f, g). Moreover, silencing D1 cells during the food delivery cue increased the average latency to food magazine entry (Fig. 3h); this effect was not seen if inhibition was triggered at FM visit, showing the specificity of the effect (Fig. 3i). Combined with photometry findings (Fig. 3d, e, left), these findings about D1 cells suggest that their activity causes a higher incentive motivation to approach the palatable food, making the mice more focused on obtaining the reward and more reactive to the cue.
Together, our data indicate that both D1 and D2 cell activities during the food approach code and control incentive motivation or 'wanting'.
Modulation of NAc D1 but not D2 cells influences hedonic preference learning (hedonic shifting)
We next investigated whether pairing a flavored palatable food, constituting a conditioned stimulus (CS), with optogenetic stimulation (unconditioned stimulus, US) of either D1 or D2 cells could shift natural preference for one flavor to another. We hypothesized that optogenetic stimulation of either population would establish a learned preference for the least innately preferred flavor by increasing the hedonic value ('liking') of the CS (flavored food).
To assess innate preference, mice were first presented with two bottles, each containing a different flavor, but with identical nutritional composition ('pre-test'). The least preferred flavor was then coupled to optogenetic stimulation of either D1 or D2 cells during a daily conditioning session over several days, while the preferred flavor was presented unpaired with optogenetic stimulation in a separate daily session (Fig. 4a). Following conditioning, we performed a 'choice-test' to determine if the paired flavor was now able to trigger incentive motivation ('wanting') and become equally, or possibly more, attractive than the previously preferred, unpaired flavor. During conditioning, increased consumption from the paired flavor over the unpaired one would indicate a hedonic change ('liking'). During the choice test, with respect to the 'liking'/'wanting' hypothesis, disruption of a reward-oriented behavior by prior optogenetic manipulation of the NAc could, in principle, be due to a disruption of either 'liking' or 'wanting' or, presumably, a combination of both.
During optogenetic activation of D2 cells, consumption of the paired and unpaired flavors across the conditioning phase did not differ (Fig. 4c). Note that during this conditioning phase, each flavor was presented in separate sessions, hence the absence of difference between the amounts consumed, unlike the difference observed in the 'pre-test' (Fig. S8a, right). However, optogenetic stimulation of D1 cells led to a gradual increase in the consumption of the paired flavor, so that the mice consumed larger amounts of food in the late paired conditioning sessions compared to GFP controls (Fig. 4b). During the choice-test, an increase in consumption of the paired flavor compared to the pre-test levels was noted while a decrease was observed in the unpaired flavor, leading to significantly different 'choice-test − pre-test' deltas between the two flavors (Fig. 4d, left). In control GFP mice, Figs. 4d and S8a, b suggest that repeated exposure to the flavors reduces the initial difference in preference between flavors but does not reverse it.
In addition, we used lick microstructure analysis to assess potential changes in 'liking' and 'wanting' produced by flavor preference conditioning. During the conditioning sessions, stimulation of D1, but not D2 cells increased 'liking' (increased lick duration and decreased inter-lick duration) over time (Fig. 5a, c, left). This indicates an increased palatability of the flavor paired with D1 cell optostimulation, which is of relevance because conditioned increases in palatability have been shown to be important for driving overeating [56]. Moreover, inter-bout interval (IBI) decreased over conditioning sessions (Fig. 5g, left), indicating an increase in 'wanting', while motivation to consume the unpaired flavor decreased over sessions, as indicated by decreasing bout numbers and initially increasing IBI (Fig. 5e, g, middle). During the choice-test, when both flavors were once again presented simultaneously, with no optogenetic stimulation on either of the flavors at this stage, licking microstructure indicated an increased 'liking' of the paired flavor (higher lick duration, and lower ILI, Fig. 5a, c, right). These effects were not observable in the unpaired flavor. 'Wanting' was also increased for the paired flavor (increased bout number, decreased IBI, Fig. 5e, g, right) and decreased for the unpaired flavor (Fig. 5g, right).
These results show an overall preference for the flavor coupled with D1 cell stimulation, thus demonstrating an important role for D1 cells in promoting the consumption of preferred palatable foods.

DISCUSSION
Our study reveals distinct patterns of NAc D1 and D2 cell activity during different behavioral paradigms intended to assess different psychological processes of hedonic behaviors: hedonia ('liking' a food) and motivation ('wanting' the food). In a free palatable food consumption paradigm, the magnitude of natural D2 cell inhibition following consumption onset was found to be a good predictor for 'liking'. In contrast, D1 cell activity promotes 'liking' during free consumption. In the operant task, we showed that D1 cell activity after cue and right after eating onset drives the motivation to retrieve palatable food, while reduced activity of D2 cells after food magazine visits drives the motivation to remain in the magazine. Lastly, we show that D1 cell stimulation induced hedonic shifting in a Pavlovian task, leading to overconsumption of palatable food. This effect was specific to D1 cells, as stimulation of D2 cells during opto-paired food conditioning had no effect across conditioning days or during the choice test (see Fig. 6 for summary).

The role of NAc D1 and D2 cells during periods of free consumption of palatable food
The NAc is part of a neural circuit governing consumption of palatable food [57] and neuronal subpopulations in the NAc shell encode rewarding stimuli and palatability [58,59]. Pharmacological manipulations in rodents and in hyper-dopaminergic knockout mice showed that 'wanting' and 'liking' sucrose could be separated with an increase in 'wanting' but a constant decrease in 'liking' [60–63] and that 'wanting' was increased with prolonged sucrose consumption [64]. Here, using photometry and optogenetics, we showed a differential role between NAc D1 and D2 cells during free consumption of a palatable food. Activity of NAc D2 cells was linked to both 'liking' and 'wanting' but with opposing actions; high D2 cell activity signaled 'wanting', and low D2 cell activity enhanced 'liking'. Importantly, the increased activity in D2 cells was similar across milkshake sessions, whereas inhibition of this signaling following the onset of milkshake consumption was greater after repeated milkshake sessions (Fig. 1e). A multiple regression analysis indicated that this magnitude of inhibition (area under the curve) after consumption was a good predictor for 'liking' (bout duration, Figs. 2b and S4b). Previous findings showed that during consumption of sucrose, a decrease in firing rate in a subset of NAc neurons immediately before and during a lick bout is required to initiate and maintain consummatory behavior [65,66]. The present study supports the idea that sustained D2 cell depression maintains the actions resulting in palatable food consumption and hedonic effects [58,65–68]. We showed that D1 cells on the other hand promote 'liking' around consumption onset (the slower temporal dynamics of GCaMP6s do not allow us to exclude that increase in activity occurs just before, rather than just after, consumption onset, as discussed earlier). Our optogenetic manipulations in D1 cells suggest that D1 inhibition increases 'liking' (decreased ILI, Fig. 2e). Yet, similarly to a previous study in the ventrolateral striatum which found that D1 cell inhibition decreased lick rate without affecting inter-contact interval [69], it seems that lick duration, ILI and licking rate are differentially affected by changes in brain cell activity. ILI may not always be a good measure of 'liking' when satiation comes into play, as the resulting licking rate, which also depends on lick duration, is the main parameter that tends to drive overall consumption. D1 receptors have been shown to play a role in incentive learning [70], and so may be important during the initial phase of instrumental learning, when reward is novel and unpredictable [71] and in the primary response to drugs of abuse [72]. This may explain why D1 cell activity was highest during the first milkshake session (Fig. 1e). Supporting this interpretation, in a multiple regression analysis, the maximum D1 cell signal following lick onset was a significant predictor of bout duration; the larger the initial increase, the longer the licking bout (Figs. 2a and S4a). D1 cells would therefore encode 'liking' initially, before shifting to encoding 'wanting'. Thus the decrease in the peak amplitude of D1 cell activity after lick onset in the free-licking paradigm (Fig. 1e) would not indicate a decrease in 'liking', but a gradual shift of the role of D1 cell activity from 'liking' to 'wanting'.

[Figure legend, continued] ...GFP (p = 0.001). D2, one-way ANOVA (opsin: F(2,252) = 3.403, p < 0.035), post-hoc with Bonferroni correction eNpHR vs. GFP (p = 0.032). (h) As in (f) but for the latency to enter the FM following food delivery cue. Optogenetic stimulation was triggered at food delivery cue. D1, one-way ANOVA (opsin: F(2,264) = 9.820, p < 0.001), post-hoc with Bonferroni correction eNpHR vs. GFP (p = 0.003). D2, one-way ANOVA (opsin: F(2,246) = 5.158, p = 0.006), post-hoc tests N.S. (i) As in (h) but when the 10-sec optogenetic stimulation occurred at the first food magazine visit of each trial. D1: one-way ANOVA (opsin: N.S.). D2: one-way ANOVA (opsin: N.S.). (j) As in (f) but showing the average length of a food magazine visit. Optogenetic stimulation occurred at food delivery cue. D1, one-way ANOVA (opsin: N.S.). D2, one-way ANOVA (opsin: F(2,209) = 5.408, p = 0.005), post-hoc with Bonferroni correction eNpHR vs. GFP (p = 0.004). (k) As in (j) but when optogenetic stimulation occurs during 10 sec following the first FM visit of each trial. D1: one-way ANOVA (opsin: N.S.). D2: one-way ANOVA (opsin: N.S.). Number of mice for panels (f)-(k): see Table 1; each mouse had 15 trials. FM: Food magazine. p values reported on the figures as follows: *p ≤ 0.05, **p < 0.01, ***p < 0.001.
Previous reports show that NAc D1 cell activity increased during consumption, in line with our findings [36,37]. Yet, other studies show that, on the contrary, it decreases during consumption and that D1 cell inhibition prolongs feeding bouts [35], which may appear at first glance contradictory to our results. However, O'Connor and colleagues' findings [35] are in line with our multiple regression analysis showing that the larger the amplitude of the decrease in D1 cell activity (i.e., the lower the D1 cell activity) after the initial increase at licking onset, the longer the licking bout (Figs. 2a and S4a). If our initial peak was shifted to post-lick onset due to the slower dynamics of GCaMP6s compared to O'Connor et al.'s unit recordings, then our results agree, showing a decrease in medial NAc shell D1 cell activity during ongoing food reward consumption. Task design, e.g., acute vs. chronic exposure (as in O'Connor, Kremer [35]) may also account for some of the discrepancies (and indeed we saw a decrease in D1 activity after several sessions of exposure to a food reward, Fig. 1e, left).
The amplitude of increase following feeding onset diminishes with daily repeated exposure to the food reward (grey line and orange arrow). NAc D2 cells show a sustained increase in activity following a food-predicting cue and their activity drops following feeding onset; this drop becomes lower during the consumption bout and with repeated daily exposure to the food reward (orange arrow). b Hypothetical model of the mutual relationship between approaching ('wanting') and consuming ('liking') reward behaviors, and NAc D1 and D2 cell activity. NAc D1 cell activation induces both 'liking' and 'wanting' processes (yellow arrows), and D1 cells are naturally activated during behaviors associated with 'wanting' and/or 'liking' (grey arrows). NAc D2 cell activation, on the other hand, leads to increased 'wanting' and decreased 'liking' (yellow arrows). Behaviors associated with 'wanting', e.g., following a food-predicting cue and during food approach, lead to an increase in D2 cell activity, while behaviors associated with 'liking', e.g., during food consumption, lead to a sustained decrease in D2 cell activity in the NAc.

Contribution of NAc D1 and D2 cells to reward-associated cues
D1 and D2 cells displayed distinct temporal activity during food magazine visits in the motivational task; D1 cell activity increased after onset of the food delivery cue, whereas D2 cell activity was increased before, but suppressed after, food magazine entry (Fig. 3d, e). These results are in line with previous findings from drug-addiction literature showing that NAc D1 cells increase their activity before entering a drug-paired chamber, while D2 cells decrease their activity after [73]. Thus it may be that in our task, D1 cell activity drives the motivation to enter the food magazine, while reduced activity of D2 cells after food magazine visit onset would drive the motivation to remain in the magazine. This might suggest that D1, but not D2 cell signaling drives reward- and reinforcement-related behaviors. In particular, we show a causal role of temporally specific cue-elicited D1 cell signaling, which may further demonstrate the role of this cell type in motivation to retrieve rewards. Activation of D1 cells when animals entered the food magazine increased the number of food magazine visits. This may imply that the D1 cell signals guide animals to seek reward, whereas suppressing D1 cell activity at either food magazine or cue onset does not lead to any changes in the number of visits compared to the control group. However, inhibition of D1 cells at cue onset delayed approach. D1 cell activity may thus drive higher incentive motivation to reach the palatable food (reward) by increasing focus on the cue, and amplifying 'wanting' for a particular incentive target (the highly-caloric palatable food), in line with previous findings showing that NAc D1 cell excitation leads to a strong motivation to self-stimulate [74]. This could contribute to intense urges to indulge in those foods, leading to overeating. Attenuating or eliminating D1 cell signals may decrease the associative learning of environmental stimuli (reward-associated cue) with reward. Our experiments were conducted once the animals learnt the association between cue and reward and the neuronal response that was measured was to the reward-predictive cue [75,76]. Pharmacological studies investigating the role of NAc D1 and D2 receptors in response to drugs of abuse have also shown that both D1 and D2 receptors have a role in associative learning [77]. Dopamine responses to cues predicting drug availability can lead to drug seeking behavior and over time these responses to drug-associated cues are dampened [72,77]. Our findings regarding D1 cells in the context of associative learning are in line with previous publications demonstrating that D1-like receptors are essential in reward related learning including instrumental learning [78,79] as well as driving motivation to action [80]. More specifically, systemic administration as well as local infusion into the NAc of dopamine D1-like receptor antagonists have been shown to attenuate food-reinforced lever pressing and to dampen the rewarding effects of palatable food [81–85]. Furthermore, optogenetic activation of NAc D1 cells enhanced drug-induced conditioned place preference [28,86]. Pharmacological activation of NAc D1 receptors produced similar findings, showing that D1 cell activity drives incentive motivation to consume food and drugs of abuse [34,87,88].
On the other hand, D2 cell silencing at the time of food delivery cue significantly decreased the number of food magazine entries. This suggests a role for D2 cell activity in the anticipation of a reward ('wanting'). These effects on motivation, although inconsistent with some previous findings which suggested that D2 cells are associated with negative rather than positive valence events [1,28,54], were in line with other more recent findings which have shown that D2 cell activation increased motivation to obtain a reward in a progressive ratio task [26,89].
In our regression analyses, we included the standard deviation in the signals before licking/food magazine visit ('std pre') as a way to explore how neuronal activity variability relates to behavioral variability [90]. In the free-consumption task, we found that the greater the variability in the D1 cell activity signal before lick onset, the longer the licking bout. In the motivational task, we found that the greater the variability in the D2 cell activity signal before food magazine visit, the shorter the latency to visit the food magazine and the longer the food magazine visit length. Therefore, greater variability in the pre-consumption signal could reflect enhanced exploratory behavior (shorter latency) and stabilized future consummatory behavior (longer food magazine visit). Thus, in both D1 and D2 cells, variability in the signal before consumption onset was correlated with more stable consummatory behavior. It is interesting to note that a distinction has been proposed in NAc subregions, whereby the NAc shell would be implicated in 'preparatory' appetitive behavior, whereas the NAc core would play a role in 'consummatory' appetitive behavior [91]. Here, our results suggest that the medial NAc shell, via its D1 and D2 cell subpopulations, plays a role both in preparatory (approach) and consummatory (sensory/hedonic aspects) phases of reward consumption.

NAc D1 cell stimulation drives hedonic shifting and overconsumption of palatable food
In the hedonic-shifting choice test after the final optogenetic stimulation pairing, D1 but not D2 cell activity significantly increased preference for the flavor that was previously least preferred. This indicates that D1 cell optostimulation induced a hedonic preference via an association between the flavor and the rewarding effects of stimulation (Fig. 4d). This was not observed in control mice. In addition, during the conditioning sessions with D1 cell optostimulation, an increase in time spent consuming the opto-paired flavor was observed across sessions (Fig. 4b). This may reflect a hedonic experience which alters associative hedonic/'liking' processes by selectively enhancing D1 cell activity. This facilitates the association between flavor and optostimulation and promotes hedonic shifting that in turn leads to increased consumption of that flavor as observed in the choice test. This effect was specific for NAc D1 cells; stimulation of NAc D2 cells during conditioning had no effect either across conditioning days, or on the choice test.
In this experiment we were able to induce a learned shift in preference for a particular flavor when D1 cell optogenetic stimulation was coupled with a less preferred flavor. This is based on a Pavlovian association formed between NAc D1 cell activation (US) and the least preferred flavor (CS), which changed the taste preference to that flavor. This suggests that hedonic 'liking' of a flavor stimulus can be determined not only by flavor itself, but also by the relevant 'brain' state at the time of formation of Pavlovian associations, for example by a change in perceived palatability. In a similar manner, in the context of drug addiction, it was shown that NAc D1 cell activity preceding entry to the drug-paired compartment in a conditioned place preference test is needed for drug-associative learning to take place [28,73].
That our findings are not always in line with previous literature may also reflect the discordance between previous studies themselves, based on the precise NAc subregion targeted, which plays a role in the results obtained [31,92], the pattern (frequency, length) of the optogenetic modulation which also has an impact on the direction of the effects [33], and the experimental paradigm that also plays a crucial role depending on the cues and responses it entails [93]. It is also important to note that D1 and D2 cell populations are not homogeneous within themselves either. For instance, Yang et al. have nicely shown how D1 cell subpopulations in the medial NAc shell have different projection patterns to the medial VTA [31]. Another study showed that NAc D1 cells projecting to the ventral mesencephalon or the ventral pallidum were largely segregated, and the response of their efferents to rewarding and aversive stimuli was opposite [94]. Although focusing on the NAc core, another study showed how responses varied following cocaine sensitization among D1 cells, but also among D2 cells [95].
Previous studies have shown that NAc shell dopamine release during reward delivery, but not during the preceding cue, increases preference for the dopamine-release-associated reward [96]. This would mean that activating D1 cells and/or inhibiting D2 cells during reward delivery but not during cue delivery would possibly increase preference. This is in part consistent with our motivational task results, where D1 cell activation during consumption increased the number of FM visits. However, D2 cell inhibition during cue delivery (which should also arise from increased dopamine tone) led to fewer visits. This indicates that the independent activity of D1 and D2 cells is not equivalent to their joint activation and inhibition triggered by dopamine release. Indeed, when considering the hedonic shifting task, we found that activating NAc shell D1 cells led to a long-lasting shift in preference (i.e., beyond optogenetic-activation days), while Sackett, Moschak [96] have shown that when dopamine release is triggered in the NAc shell by activating dopamine terminals in this subregion, thus activating D1 cells while inhibiting D2 cells, the preference is temporary (i.e., only observed on optogenetic days). This would mean that NAc shell D2 cell silencing makes D1-cell-activity-induced preference labile and may thus be an important element to prevent future overconsumption. This is in line with the mapping we propose in Fig. S10 of NAc shell D1 and D2 cell activity onto the 'liking' and 'wanting' behavioral space based on our results. Dopamine release, by activating D1 cells and inhibiting D2 cells, would favor net 'liking' and little change in 'wanting', thus resulting in transient preference only. D1 cell activity alone, on the other hand, would favor both 'liking' and 'wanting' and thus allow for the long-lasting preference we observed (Fig. 4b). 'Wanting' without 'liking' (as would be the case with D2 cell activation alone) would itself not allow for change in preference (although it may affect acute consumption beyond preference shifting).
It was also shown that in the early stage of an associative learning task, NAc dopamine release occurred at reward delivery and not cue onset, but that at later stages of the task (when the animals have learnt the association) the release occurs at cue onset but not reward [97,98]. In the motivational task, our animals had learnt the association, and yet, we observe both D1 and D2 cell activity at cue onset, while dopamine release should lead to a D1 cell activity increase but a D2 cell activity decrease. Yet, other studies found transient increases of dopamine release in the NAc shell at both cue onset and reward delivery [99] and at reward consumption even after learning a lever-pressing task [100], which would be in line with our results: we observe a decrease in D2 cell activity at reward consumption, suggesting dopamine release. It may be that the exact structure of the task has an impact, and the NAc subregion plays a role as well (Day, Roitman [97] aimed for the lateral NAc core, while Beyene, Carelli [99], and we, aimed for the medial NAc shell). Finally, it was shown that in the absence of a predictor, dopamine release in the NAc stays locked to reward consumption [97,98], which is in line with our results for D1 and D2 cell activity in the free-consumption task (D1 cell activity is elevated while D2 cell activity is suppressed following consumption onset).
Clinical implications for the role of NAc D1 and D2 cell activity in 'liking' versus 'wanting' in overconsumption and eating disorders
Our findings have potential implications for human eating disorders. 'Hedonic eating' does not necessarily involve pleasure, but can also be driven by a motivation to eat which can be dissociated from the drive caused by a nutrient deficit [101,102]. It is already well known that in humans, high-calorie foods, especially sweet and fatty ones, promote overconsumption. Recent findings have shown a crucial role for NAc core D1 cells in the onset of obesity [103]. The incentive theory of 'liking'/'wanting' is significant in the study of food intake regulation, modelling associations between the function of the reward system and feeding or obesity. Studies have tried to show that different neural mechanisms are responsible for 'wanting' a certain food than for 'liking' it. For more than a decade, human studies of obesity have used 'liking' and 'wanting' to try to better understand how 'wanting' food can differ in some individuals, and override 'liking' to cause excessive cue-triggered 'wanting' (craving) that will drive overeating and lead to obesity [104–108]. The incentive sensitization theory, originally postulated for drug addiction [109], was then applied to eating disorders to try to explain how some individuals may experience 'wanting' to eat palatable foods [18,19]. Neuroimaging studies suggest that in humans diagnosed with obesity or binge-eating disorders, overeating is triggered by visual cues associated with palatable food [110–112]. For example, increased brain activity in response to palatable foods was positively correlated with self-reported craving rates or 'wanting' to eat compared to healthy controls [113–115]. This evidence is very similar to findings in individuals suffering from drug addiction [106,116–120].

CONCLUSION
To our knowledge, our study is the first to approach aspects of NAc-dependent reward consumption using such a multi-faceted approach. We incorporated observation, inhibition and excitation of NAc D1 and D2 cells, multiple behavioral paradigms investigating different aspects of 'liking' and 'wanting', including analysis of lick microstructure, and multivariate analysis. In summary, we show that NAc D1 cell activity plays a role in reinforced behaviors induced by palatable foods in an operant paradigm or a Pavlovian association test. However, in a palatable food free-access paradigm, D1 cells encode only the hedonic property of the palatable food. NAc D2 cells, on the other hand, have a role in the drive to retrieve food, showing elevated activity prior to eating onset in a motivational task. Once eating starts, suppressed D2 cell activity allows for prolonged eating episodes. This would support the idea that evolutionary processes led to these two cell populations that can independently and mutually influence hedonic eating. D1 cells are prone to lead to a maladaptive cycle of increased 'liking' and 'wanting', thus driving overeating, while D2 cells cannot reinforce both processes and are faced with a tradeoff between increasing 'wanting' by being more active or allowing 'liking' by remaining silent (Figs. 6 and S10). These findings, which temporally distinguish between the different psychological processes of food reward and reinforcement, support the incentive salience hypothesis [13,19]. They may assist the design of future studies and improve diagnostic criteria of addiction and eating disorders.

MATERIALS AND METHODS
Animals and housing conditions
All animal procedures were carried out in accordance with the Animal Welfare Ordinance (TSchV 455.1) of the Swiss Federal Food Safety and Veterinary Office and were approved by the Zurich Cantonal Veterinary Office. Mice were kept on a reversed 12-h/12-h light/dark cycle and provided with standard chow pellets ad libitum. In order to avoid food deprivation for behavioral testing, mice were provided with 2% citric acid in water ad libitum in their home cage [121]. As the transfer to citric acid water led to an initial mild weight loss (ca. 90% of baseline weight) that then receded, mice were transferred at least 5 days before starting experiments to allow for their weight to stabilize. All experiments were performed during the dark phase. Both adult male and female mice were used for the experiments (see Table 1) and mice were at least 7 weeks old before surgeries were performed. As no differences were found in the results between sexes, males and females were pooled (see Statistical Analysis section).
For stereotaxic surgeries (Kopf Instruments), a small craniotomy of 0.5 mm in diameter was performed above the injection sites using a microdrill. Injections were performed using a Nanoject III injector and glass capillaries (Süd-Laborbedarf Gaunting); 120 nL of virus were injected on each side at a rate of 1 nL/sec before optic fibers were slowly inserted and fixed to the skull using Super-Bond (Sun Medical Co. Ltd). Mice were allowed 6 weeks to recover before starting experiments.

Photometry recordings
Photometry recordings were performed using a camera-based bundle-imaging fiber photometry system (Doric Lenses) using interleaved illumination produced by two LEDs (2 excitation wavelengths: 405 nm and 465 nm / 2 cycles, resolution 1200 × 1200, effective sampling rate 20 Hz). Fluorescence produced by 405-nm excitation provided a real-time control for motion artifacts. Recordings were performed using the Doric Neuroscience Studio software.
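One common way to use a 405-nm control channel is to fit it linearly to the 465-nm channel and express the residual as ΔF/F. The Python sketch below illustrates that idea only; it is an assumption about a typical workflow, not necessarily the processing applied in this study, and the function names `fit_line` and `delta_f_over_f` are hypothetical.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def delta_f_over_f(signal_465: List[float], control_405: List[float]) -> List[float]:
    """Scale the 405-nm control to the 465-nm signal, then return (signal - fit) / fit.

    Assumption: both channels are sampled at the same time points and the
    fitted control never reaches zero.
    """
    slope, intercept = fit_line(control_405, signal_465)
    fitted = [slope * c + intercept for c in control_405]
    return [(s - f) / f for s, f in zip(signal_465, fitted)]
```

With this approach, fluctuations shared by both channels (for example, motion) are absorbed by the fit, while a change present only in the 465-nm channel remains as a deflection in ΔF/F.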

Optogenetics
Optogenetic activation of D1-ChrimsonR and D2-ChrimsonR cells was performed using a red laser (635 nm, Laserglow Technologies). Laser output consisted of 4-ms pulses at 20 Hz, driven by an Arduino board. Optogenetic inhibition of D1-eNpHR3.0 and D2-eNpHR3.0 cells was achieved with a yellow laser (589 nm, Laserglow Technologies). The laser delivered a constant light when inhibition was triggered and ramped down at the end of an inhibition bout to avoid neuronal activity rebound. D1-GFP and D2-GFP control mice were split evenly between opto-inhibitory and opto-excitatory stimulation patterns to exclude heating and ipsiversive effects of either pattern [122]. All mice were also tested in laser OFF conditions for within-subject control (Fig. S7).
Molecular Psychiatry (2023) 28:3531–3547
The lasers were connected to the bilateral fiber implants using a 1 × 2 fiber-optic rotary joint (Doric Lenses) and yielded an output power of ∼8 mW at the end of each fiber tip.Laser pulse onset parameters are described in each relevant experiment section.
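As a quick arithmetic check of the excitation pattern (4-ms pulses at 20 Hz), the Python sketch below lists pulse on/off times and the resulting duty cycle. The function names and the 10-s example duration are illustrative assumptions; the actual pulses were generated by the Arduino board, whose code is not reproduced here.

```python
from typing import List, Tuple


def pulse_train(duration_s: float, freq_hz: float = 20.0,
                pulse_width_s: float = 0.004) -> List[Tuple[float, float]]:
    """Return (on, off) times in seconds for a regular pulse train starting at t = 0."""
    period_s = 1.0 / freq_hz
    n_pulses = int(round(duration_s * freq_hz))
    return [(i * period_s, i * period_s + pulse_width_s) for i in range(n_pulses)]


def duty_cycle(freq_hz: float = 20.0, pulse_width_s: float = 0.004) -> float:
    """Fraction of time the light is on."""
    return freq_hz * pulse_width_s


# Hypothetical example: a 10-s stimulation epoch.
train = pulse_train(10.0)
print(len(train), "pulses, duty cycle", round(duty_cycle(), 3))
```

Under these parameters the light is on for 8% of the stimulation epoch, which is the quantity that matters when comparing pulsed excitation with the constant light used for inhibition.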
Virus location and location of fibers were checked in all mice from both the Photometry and Optogenetics cohorts using standard histology methods.

'Motivation to retrieve a reward' task
Pre-surgery screening. Prior to surgeries, mice from both the photometry and optogenetics cohorts were screened in this simple reward delivery task. Mice were habituated to the experimental chambers (Mouse Touch Screen Chambers installed in sound-proof chambers, Lafayette Instrument) for 20 min per day for two days. The task schedule was designed and controlled by ABET II software (Lafayette Instrument). On three consecutive days, mice could receive up to 15 milkshake deliveries (Energy milk, strawberry flavor, Emmi Schweiz AG; 20 μL/delivery which was consistently entirely eaten by all mice; mice had been previously habituated in the home cage) during a maximum of 30 min. The session ended after 15 deliveries or 30 min, whichever came first. The inter-trial interval varied pseudo-randomly, 30 ± 10 sec, counted from the moment a mouse exited the food magazine. Only those able to perform the task in 10 min or less were implanted. This was an indication that mice made the association between the cue and food delivery. The house light (white light) was on during the sessions. The tray light was turned on at each milkshake delivery and turned off when the mouse visited the food magazine. Milkshake delivery was associated with a sound cue. Consumption onset was defined as the first food magazine IR beam break following each delivery.
Photometry cohort. The same paradigm used for pre-surgery screening was run again on 3 consecutive days while photometry signals were being recorded (see Photometry Recordings section for details).
Optogenetics cohort. Here, the task structure was similar to the pre-surgery screening task, except mice performed the task over two days only. The first day was identical to the pre-screening task and served as a baseline. On the second day, the task was identical but 10 s of optogenetic stimulation either started from the milkshake delivery cue (D1-eNpHR3.0, D2-eNpHR3.0 and GFP controls) or started at food magazine visit onset (D1-ChrimsonR, D2-ChrimsonR, and GFP controls). After a wash-out period of a few weeks, this 2-day experiment was repeated, only this time mice which had received optogenetic stimulation during milkshake delivery now received it during the food magazine visit and vice versa. In these tasks the activation and inhibition patterns were the same as those described in the Optogenetics section, repeated continuously over the 10 sec that each stimulation lasted. Instead of lasers, light was delivered using a PlexBright optogenetic stimulation system (Plexon Inc.) consisting of an Optogenetic Controller (Plexon Inc.) and orange LEDs for activation (compact LED, 620 nm, Plexon Inc.) or lime LEDs for inhibition (compact LED, 550 nm, Plexon Inc.). The output at the end of each patchcord (200/230 μm, 0.5NA, Plexon Inc.) was ∼5 mW. The patterns were designed in and controlled by Radiant software (Plexon Inc.) which was itself triggered by ABET II software during the task.

Ad libitum consumption experiment
Mice were first habituated to spring-based tip bottles (EBECO) in the home cage using water. Habituation to the arenas (20 × 20 cm² arenas, Habitest modular mouse test cage, Coulbourn Instruments, USA) and patchcords (200 μm, low-autofluorescence, Doric Lenses) was performed on 2 consecutive days before the experiment started. The test cages were placed in closed, sound-attenuating chambers (Coulbourn Instruments, USA) and illuminated using infra-red light only. Licks were detected using a capacitive sensor (Capacitive Touch Breakout, Sparkfun) and recorded at 500 Hz throughout the sessions using a National Instruments board (USB-X Series DAQ, NI-6343) and custom-built LabView software (National Instruments). Weights consumed were recorded at the end of each session. Mice were not food deprived for these experiments.
Photometry cohort. On experimental days, mice were placed in the arenas for 20-min long sessions (one session/day for each mouse) with free access to one spring-based tip bottle (EBECO) containing water (days 1-3) or strawberry milkshake (days 4-8). Concomitant photometry signals were recorded as described in the Photometry Recordings section above. For behavioral parameters only, 6 additional mice (with no concomitant D1 or D2 cell photometry recording, but which were also tethered) were included (i.e., n = 3 × 6 = 18).
Optogenetics cohort. Over a 20-min session with access to milkshake, laser stimulation started (according to the patterns described in the Optogenetics section) every time a mouse started licking and continued for the duration of licking.

Hedonic shifting experiment
For this experiment only ChrimsonR mice (and their respective GFP controls) were used.Mice were habituated in the home-cage to two novel flavors of milkshake which were matched for their calorie and macronutrient contents: a chocolate-flavored and a vanilla-flavored milkshake ('Milk Choco Mountain' Protein, 'Vanilla Drive', respectively, Chiefs AG Schweiz).
Pre-test phase. This phase served to test the innate preference of mice for one flavor over the other. Testing was performed over two consecutive days with 20-min sessions in the same Coulbourn Instruments chambers as described in the Ad libitum consumption experiment section. Two spring-based tip bottles (EBECO), each with a different flavor of milkshake, were freely available. The positions of the bottles in the cage were counterbalanced across mice. Licking and amounts consumed were recorded as described in the Ad libitum consumption experiment section. The time spent licking at each bottle on the second day of the pre-test phase was used to determine which was each mouse's preferred flavor.
Conditioning phase. Two 10-min conditioning sessions were run, one in the morning, and one in the afternoon for 7 consecutive days. In each of these sessions only one flavor was available at a time, and the order of the sessions was swapped each day (see Fig. 4a). Licking at the mouse's initially-less-preferred flavor triggered optogenetic activation for as long as the mouse kept licking, using the patterns and equipment described in the Optogenetics and Ad libitum consumption experiment sections respectively.
Choice-test phase. 24 h after conditioning, mice were assessed for their preference again. During the choice-test, both flavors were presented simultaneously for 20 min with no optogenetic stimulation on either flavor.
Throughout the conditioning and choice-test phases, but not for the initial preference pre-test phase, mice were mildly food-deprived.

Data analysis
Licking microstructure analysis. Based on previous literature [44,45], licking behavior was analyzed by looking at the following parameters (see Fig. 1f):
Single lick durations, defined as an uninterrupted high signal.
Inter-lick intervals (ILI), defined as an uninterrupted low in the signal of up to 250 ms. An upper boundary was set in accordance with Johnson [45], to distinguish them from inter-bout intervals.
Number of licking bouts: a bout was defined as a licking episode of at least 500 ms, allowing for interruptions within a bout of up to 250 ms.
Licking bout duration: length of a given bout in seconds.
Licking bout size: number of individual licks in a given bout.
Inter-bout intervals (IBI), defined as interruptions in licking of more than 250 ms allowing for licking occurrences of up to 250 ms, i.e., a short lick was not considered a bout and thus does not interrupt a given IBI.
For the regression analysis (see below), licking bouts had a slightly more inclusive definition, starting with at least 100 ms of licking and allowing for pauses of up to 1 sec.
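The bout definitions above can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: licks are assumed to be available as (onset, offset) pairs in seconds, bout length is assumed to run from the first lick onset to the last lick offset, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

Lick = Tuple[float, float]  # (onset_s, offset_s) of one uninterrupted high signal


def segment_bouts(licks: List[Lick], max_gap_s: float = 0.25,
                  min_duration_s: float = 0.5) -> List[List[Lick]]:
    """Group licks into bouts: gaps of up to max_gap_s stay within a bout,
    and episodes shorter than min_duration_s are not counted as bouts."""
    bouts: List[List[Lick]] = []
    current: List[Lick] = []
    for lick in sorted(licks):
        # A pause longer than max_gap_s closes the current episode.
        if current and lick[0] - current[-1][1] > max_gap_s:
            bouts.append(current)
            current = []
        current.append(lick)
    if current:
        bouts.append(current)
    return [b for b in bouts if b[-1][1] - b[0][0] >= min_duration_s]


def bout_summary(bout: List[Lick]) -> Dict[str, object]:
    """Bout duration (s), bout size (number of licks) and inter-lick intervals (s)."""
    ilis = [nxt[0] - cur[1] for cur, nxt in zip(bout, bout[1:])]
    return {"duration_s": bout[-1][1] - bout[0][0], "size": len(bout), "ilis_s": ilis}
```

Calling `segment_bouts(licks)` applies the microstructure thresholds (500 ms, 250 ms), while `segment_bouts(licks, max_gap_s=1.0, min_duration_s=0.1)` approximates the more inclusive definition used for the regression analysis.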

Multiple regression analysis
Correlations between photometry signals and 'liking'/'wanting' parameters were performed using the SPSS (IBM Corp. Version 28) linear regression tool and following the regression method described in Field [123]. A subset of 8 parameters ("predictors") from the photometry signal around a licking bout onset was selected: maximum, minimum and area under the curve (AUC) in the 5 sec following lick onset; mean and median signal over the 3 sec before lick onset; mean and median signal over the 3 sec following lick onset; and standard deviation of the non-z-scored signal in the 5 sec preceding lick onset. First, it was evaluated whether these predictors were highly correlated between themselves: Pearson's r > 0.85, VIF > 10, tolerance < 0.1. A VIF value of 10 or above and a tolerance below 0.1 are the general thresholds beyond which one should consider a predictor as having linear relationships with other predictors [123–125]. An average VIF value that is much greater than 1 indicates that multicollinearity is likely to affect the regression results. If such values were observed in the data, then the expression of each predictor in the eigenvector basis was checked (namely the variance proportions for each predictor of its coefficients in the eigenvector basis). If two predictors had high variance proportions (> 0.5) for the same combination of dimensions, then only the predictor with the most significant correlation with the outcome variable was kept. Second, a linear regression was performed keeping only the non-collinear predictors. A final block-wise (hierarchical) regression was done keeping only the predictors that showed significant correlations with the outcome variable and/or model coefficients significantly different from 0.
This final block-wise approach, which started with the predictor yielding the highest significant standardized model parameter (β) and continued in decreasing order of βs, allowed us to evaluate from which predictor adding one more predictor to the model did not yield better predictive power (looking at the p value associated to the F-ratio change). Predictors whose addition to the model gave a significant F-ratio change (i.e., improved the model) are reported in Figs. 2a, b and S4-6. Case-wise diagnostics allowed us to confirm the adequacy of our dataset, namely that 95% of data points had standardized residuals within ±2, and around 1% of the sample at most had standardized residuals over 2.5. Assumptions of the model (absence of heteroscedasticity, linearity) were confirmed by plotting standardized residuals against standardized predicted values. Normal distribution was confirmed via the observed vs. predicted cumulative probability plots. The Durbin-Watson index was used to confirm that the model's residuals were independent (index should be as close to 2 as possible, and at least >1 and <3).
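To make the collinearity screen concrete, the Python sketch below computes VIF and tolerance for a set of predictors and flags those at or beyond the thresholds quoted above (VIF of 10 or above, tolerance below 0.1). The analysis in this study was run in SPSS; this standalone version is only an illustration, the function names are hypothetical, and it relies on the standard identity that each VIF is the corresponding diagonal element of the inverse of the predictor correlation matrix.

```python
from math import sqrt
from typing import Dict, List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix: List[List[float]]) -> List[List[float]]:
    """Gauss-Jordan inversion with partial pivoting (fails on singular input)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def collinearity_report(predictors: Dict[str, List[float]]) -> Dict[str, Dict[str, object]]:
    """VIF, tolerance (1 / VIF) and a collinearity flag for each predictor."""
    names = list(predictors)
    corr = [[pearson_r(predictors[a], predictors[b]) for b in names] for a in names]
    inv = invert(corr)
    report: Dict[str, Dict[str, object]] = {}
    for i, name in enumerate(names):
        vif = inv[i][i]
        tolerance = 1.0 / vif
        report[name] = {"vif": vif, "tolerance": tolerance,
                        "flagged": vif >= 10.0 or tolerance < 0.1}
    return report
```

Perfectly collinear predictors make the correlation matrix singular, so in practice one of each such pair would need to be removed before calling `collinearity_report`.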

'Motivation to retrieve a reward' task analysis
When analyzing photometry signals around food delivery, trials where the mouse entered the food magazine within 5 sec after food delivery cue were not included so that signals did not include food consumption, but only food anticipation (Fig. 3d). For analyses of signals around food magazine visits, trials for which the mouse exited the food magazine just after entering it were discarded (Fig. 3e). When calculating the average food magazine visit length, visits of less than 2 s were excluded to avoid skewing the mean to short values of very brief, and mostly non-consummatory, visits (Fig. 3j, k). To avoid a possible bias in the results if a mouse performed very few trials, here, we chose each trial performed (and not each mouse) as the experimental unit [126,127], akin to cells from one mouse being the unit in electrophysiology studies [128].
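The two numeric trial-selection rules above (5 sec after the cue, 2 s minimum visit) can be expressed as a short Python sketch. The `Trial` structure, its field names, and the treatment of values exactly at the thresholds are assumptions made for illustration; they are not taken from the original analysis scripts.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Trial:
    latency_s: float       # time from food delivery cue to food magazine entry
    visit_length_s: float  # duration of the food magazine visit


def anticipation_trials(trials: List[Trial], min_latency_s: float = 5.0) -> List[Trial]:
    """Keep trials whose cue-aligned window contains anticipation only,
    i.e. the mouse did not enter the magazine within min_latency_s of the cue."""
    return [t for t in trials if t.latency_s > min_latency_s]


def mean_visit_length(trials: List[Trial], min_visit_s: float = 2.0) -> float:
    """Average visit length after excluding visits shorter than min_visit_s.
    Raises statistics.StatisticsError if no visit passes the threshold."""
    kept = [t.visit_length_s for t in trials if t.visit_length_s >= min_visit_s]
    return mean(kept)
```

Because each trial, rather than each mouse, is the experimental unit here, the filters operate on a flat list of trials pooled across animals.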

Statistical analysis
Data were analyzed using Matlab R2019b (MathWorks). All statistical tests and descriptive statistics were performed using SPSS software (IBM Corp. Version 28). Sex was included as a factor in the analysis but did not reveal significant differences in any comparison. Therefore, since sex was not the primary independent variable of interest in this study, and given the lack of statistical differences, we pooled data from both sexes for all further analyses. Tests and their results are presented in the figure legends. Unless otherwise stated, data are presented as mean ± SEM, and a p value ≤ 0.05 was considered to indicate significance. In the motivation task (Fig. 3f-k), outlier trials (defined as points whose absolute z-score is >3.29) were excluded from the statistical analysis.
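For illustration, the outlier criterion and the mean ± SEM summary can be sketched in Python as below. This is a minimal sketch and not the SPSS workflow used here; computing z-scores with the sample standard deviation and applying the exclusion in a single pass are assumptions, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

Z_CUTOFF = 3.29  # absolute z-score threshold stated in the text


def exclude_outliers(values, cutoff=Z_CUTOFF):
    """Drop points whose absolute z-score exceeds the cutoff.
    z-scores use the sample standard deviation (assumption); the
    filter is applied once, not iteratively (assumption)."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        return list(values)  # no spread, so nothing can be an outlier
    return [v for v in values if abs((v - m) / s) <= cutoff]


def mean_sem(values):
    """Return (mean, standard error of the mean)."""
    return mean(values), stdev(values) / sqrt(len(values))
```

Note that with the sample standard deviation, an absolute z-score above 3.29 can only occur in samples of at least 13 points, so this criterion is only meaningful when applied to pooled trials rather than to small per-mouse samples.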

Fig. 4 Modulation of NAc D1 and not D2 cells influences hedonic preference learning (hedonic shifting). a Schematic of the experimental design for lick detection and closed-loop optogenetic activation on the paired flavor. Left, during the pre-test, mice had free access to both flavored bottles. Middle, mice were given "Laser ON" sessions (yellow box), where only the initially less preferred flavor was presented, and licking was paired with optostimulation. On the same day, the initially more preferred flavor was given and the laser was not triggered. The sequence of laser ON and OFF conditioning sessions was counterbalanced from day to day (bottom). Right, in the 'choice test', the two flavors were once again presented together, in the absence of optostimulation. b Time spent licking on the paired-flavor spout during conditioning sessions (Cond., yellow box) and last choice-test session (grey box) in D1-cre mice expressing ChrimsonR (ChR, red, n = 6) and GFP (blue, n = 3). One-sided independent-samples t-test: ChR vs. GFP (Cond. 7: t(7) = 2.160, p = 0.034; Choice-test: t(7) = 1.804, p = 0.057). c As in (b) but in D2(A2a)-cre mice (D2-ChR: n = 6, D2-GFP: n = 3). Mixed-design ANOVA: N.S. d Change in preference expressed as the difference (Δ) in time spent licking between the choice-test and the pre-test (choice-test − pre-test) on a given spout for the paired (filled bars) and unpaired (empty bars) flavors in D1-ChR (left, red) and D1-GFP (right, blue) mice. ChR, paired-samples t-test, paired vs. unpaired flavors: t(5) = 4.754, p = 0.003, one-sided. GFP: t(2) = 1.074, p = 0.198, one-sided. ChR: ChrimsonR, Cond.: Conditioning. P values reported on the figures as follows: *p ≤ 0.05, **p < 0.01.

Fig. 6 NAc D1 cells positively induce 'wanting' and 'liking' while D2 cells positively encode 'wanting' but negatively induce 'liking'. Summary of the findings from the different experimental paradigms to test the distinctive patterns of NAc D1 and D2 cell activity relative to distinct psychological processes of hedonic behaviors: 'liking' (hedonia) and 'wanting' (motivation). a Summary of D1 and D2 cell activity in the NAc around food approach (left, blue box) and following food consumption onset (right, turquoise box), and the effect of repeated exposure to the reward. NAc D1 cells show transient activation following a reward-predicting cue (left, blue dotted line) and following consumption onset (right, turquoise dotted line). The amplitude of increase following feeding onset diminishes with daily repeated exposure to the food reward (grey line and orange arrow). NAc D2 cells show a sustained increase in activity following a food-predicting cue and their activity drops following feeding onset; this drop becomes lower during the consumption bout and with repeated daily exposure to the food reward (orange arrow). b Hypothetical model of the mutual relationship between approaching ('wanting') and consuming ('liking') reward behaviors, and NAc D1 and D2 cell activity. NAc D1 cell activation induces both 'liking' and 'wanting' processes (yellow arrows), and D1 cells are naturally activated during behaviors associated with 'wanting' and/or 'liking' (grey arrows). NAc D2 cell activation, on the other hand, leads to increased 'wanting' and decreased 'liking' (yellow arrows). Behaviors associated with 'wanting', e.g., following a food-predicting cue and during food approach, lead to an increase in D2 cell activity, while behaviors associated with 'liking', e.g., during food consumption, lead to a sustained decrease in D2 cell activity in the NAc.

Table 1. Mice implanted in each group and line.