VTA dopamine neuron activity encodes social interaction and promotes reinforcement learning through social prediction error

Abstract

Social interactions are motivated behaviors that, in many species, facilitate learning. However, how the brain encodes the reinforcing properties of social interactions remains unclear. In this study, using in vivo recording in freely moving mice, we show that dopamine (DA) neurons of the ventral tegmental area (VTA) increase their activity during interactions with an unfamiliar conspecific and display heterogeneous responses. Using a social instrumental task, we then show that VTA DA neuron activity encodes social prediction error and drives social reinforcement learning. Thus, our findings suggest that VTA DA neurons are a neural substrate for a social learning signal that drives motivated behavior.
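The notion of a prediction error used above can be illustrated with a generic Rescorla-Wagner-style update. This is only a minimal sketch of the textbook learning rule, not the authors' analysis or model: the function name, the learning rate `alpha` and the trial counts are all hypothetical. Outcomes are coded as 1.0 when the expected social interaction occurs and 0.0 when it is omitted.

```python
def simulate_prediction_error(outcomes, alpha=0.2, v0=0.0):
    """Return (prediction_errors, values) for a sequence of trial outcomes.

    outcomes: 1.0 if the social interaction is obtained, 0.0 if it is omitted.
    alpha:    learning rate (hypothetical value, not taken from the paper).
    v0:       initial expected value.
    """
    value = v0
    errors, values = [], []
    for outcome in outcomes:
        delta = outcome - value   # prediction error: obtained minus expected
        value += alpha * delta    # expectation moves toward the outcome
        errors.append(delta)
        values.append(value)
    return errors, values


# Hypothetical schedule: 30 trials with interaction, then 5 omission trials.
outcomes = [1.0] * 30 + [0.0] * 5
errors, values = simulate_prediction_error(outcomes)
print(f"first trial with interaction: delta = {errors[0]:+.3f}")
print(f"last trial with interaction:  delta = {errors[29]:+.3f}")
print(f"first omission trial:         delta = {errors[30]:+.3f}")
```

Under these assumptions the error is large and positive when the interaction is unexpected, shrinks toward zero as it becomes predicted, and turns negative when a predicted interaction is omitted.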

Fig. 1: VTA DA neuron activity increases in social context.
Fig. 2: VTA DA neuron activity increases during social interactions with high heterogeneity response.
Fig. 3: VTA DA neuron activity adapts to social context through free repeated conspecific exposure.
Fig. 4: VTA DA neuron activity encodes active interaction value.
Fig. 5: Social instrumental task.
Fig. 6: VTA DA neuron activity encodes the positive value of social interaction during the SIT.
Fig. 7: VTA DA neurons encode a social prediction error: omission phase.
Fig. 8: Optogenetic inhibition of VTA DA neurons decreases subsequent social affiliation.

Data availability

Original data used in this study are available at https://doi.org/10.5281/zenodo.5564893. The dataset contains spiking activity of VTA DA neurons in mice and event timing during social free interaction and the social instrumental task, corresponding to Figs. 1–4, 6 and 7 and Extended Data Figs. 1–4, 6 and 7. Additional data supporting the findings are available upon reasonable request. Source data are provided with this paper.

Code availability

Innovative code used in this study is available at https://doi.org/10.5281/zenodo.5564893. Additional code supporting the findings is available upon reasonable request.

Acknowledgements

We would like to thank C. Lüscher, M. Mameli, P. Faure, J. Naudé and S. Bariselli for comments on the manuscript. We would also like to thank S. Pellat and L. Jourdain for technical support. This work is supported by the Swiss National Science Foundation (31003A_182326) and the NCCR Synapsy from the Swiss National Science Foundation. C.B. is also supported by the ERC Consolidator Grant (864552).

Author information

Contributions

C.S. and C.B. conceived the project. C.B., C.S. and B.G. wrote the manuscript. C.S. and B.G. performed the electrophysiological recordings and the behavioral experiments, with the help of B.R. and M.T. C.S. and B.G. performed all the analyses and statistics.

Corresponding author

Correspondence to Camilla Bellone.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Neuroscience thanks Melissa Sharpe and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Recording, photolabeling and classification of the VTA DA neurons.

(a) Picture of the microdrive implanted in DAT-Cre mice with recording electrodes and optic fiber. (b) Schema of the implantation of the microdrive in the VTA. (c) Representative coronal image of immunostaining against the tyrosine hydroxylase (TH) enzyme (red) performed on midbrain slices of adult DAT-Cre mice infected with AAV5-Ef1α-DIO-ChR2-eYFP (green) in the VTA. (d) Picture of an implanted mouse for freely behaving recording. (e) Top: Example trace of a VTA DA neuron following the optogenetic light stimulation protocol at 20 Hz. Bottom: Example of waveform similarity between light stimulation and no-light stimulation for a photolabeled VTA DA neuron. (f) Example PETH, centered on the light pulse (top), of a VTA DA neuron responding to the 20 Hz stimulation protocol, with the corresponding raster plot for all the trials (bottom). (g) Averaged PETH, centered on the light pulse, for all the VTA DA neurons following the optogenetic protocol at 5 Hz stimulation. (h) Averaged PETH, centered on the light pulse, for all the VTA DA neurons following the optogenetic protocol at 20 Hz stimulation. (i) Probability of a spike in the 10 ms following the beginning of the light pulse for the 5 Hz and 20 Hz protocols. Paired t-test, two-sided (t(16) = −8.2603). (j) Time course of all the photolabeled VTA DA neurons before, during and after the 20 Hz stimulation protocol. (k) Mean firing rate of the VTA DA neurons at baseline (without optogenetic stimulation) and during optogenetic stimulation at 20 Hz. Responders are represented in blue while non-responders to the light stimulation are in black (example of 2 neurons). (l) Waveform examples of different neurons after spike sorting for VTA-pDA (left), VTA-nonDA high-firing (middle) and VTA-nonDA low-firing (right) neurons. The red line represents the average of all the waveforms recorded during a session for one neuron.
(m) Firing probability density for 3 different neurons (VTA-pDA, VTA-nonDA high-firing and VTA-nonDA low-firing). This density is that of the log10 of the instantaneous frequency for a given neuron; it allows general features such as tonic and bursting activity to be extracted with the Robust Gaussian Surprise method. (n) Example traces for the 3 neuron types showing tonic, bursting and pause activity. (o) Left: Dimensionality reduction with UMAP (Uniform Manifold Approximation and Projection) based on features extracted from the firing pattern (see Methods), followed by EMGM (Expectation Maximization of Gaussian Mixture) clustering based on photolabeled VTA DA neurons. The neurons can be dopaminergic (VTA DA photolabeled), putative dopaminergic (VTA pDA) or putative non-dopaminergic (VTA non-pDA). Right: Pie charts show the inclusion or exclusion of previously identified neurons in the VTA DA cluster. n indicates the number of cells. Dots represent individual data points. All the data are shown as the mean ± s.e.m. (error bars).
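Two of the quantities named in this caption, the probability of a spike in the 10 ms after light-pulse onset (panel i) and the log10 of the instantaneous frequency (panel m), can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names and the example spike and pulse times are hypothetical, and the instantaneous frequency is assumed here to be the reciprocal of the inter-spike interval.

```python
import bisect
import math


def light_evoked_spike_probability(spike_times, pulse_onsets, window=0.010):
    """Fraction of light pulses followed by at least one spike within `window` s."""
    if not pulse_onsets:
        raise ValueError("at least one light pulse is required")
    spikes = sorted(spike_times)
    hits = 0
    for onset in pulse_onsets:
        # Index of the first spike at or after the pulse onset.
        i = bisect.bisect_left(spikes, onset)
        if i < len(spikes) and spikes[i] < onset + window:
            hits += 1
    return hits / len(pulse_onsets)


def log10_instantaneous_frequency(spike_times):
    """log10 of 1 / inter-spike interval (Hz) for consecutive spikes (times in s)."""
    spikes = sorted(spike_times)
    return [math.log10(1.0 / (b - a)) for a, b in zip(spikes, spikes[1:]) if b > a]


# Hypothetical data: four pulses of a 20 Hz train and one unit's spike times (s).
pulses = [0.00, 0.05, 0.10, 0.15]
spikes = [0.004, 0.056, 0.130, 0.153]
p_spike = light_evoked_spike_probability(spikes, pulses)
log_freq = log10_instantaneous_frequency(spikes)
print(p_spike, [round(x, 2) for x in log_freq])
```

A histogram of `log_freq` would give a density of the kind shown in panel m; the burst and pause detection itself (Robust Gaussian Surprise) is not reproduced here.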

Extended Data Fig. 2 The VTA DA activity increases at the initiation of interaction events and rearing behavior does not induce VTA DA changes.

(a) Examples of neurons responding to reciprocal, unilateral and passive contacts. Neurons 1 and 2 are from the same animal, while neurons 3 and 4 are from a second animal. These examples show the heterogeneity of neuronal activity across contact types within the same neuron or animal. (b) Normalized VTA DA activity as a function of the duration of bouts of reciprocal interaction, with 200-ms bins. The initiation of the interaction seems to induce the highest increase in VTA DA activity. Pearson's correlation coefficient, two-sided. (c) Distribution of the time to reach the peak of VTA DA activity during interactions (normalized as a percentage of bout duration). Mean = 27.25%, s.e.m. = 0.853%. (d) Normalized VTA DA activity as a function of the duration of bouts of unilateral interaction, with 200-ms bins. Pearson's correlation coefficient, two-sided. (e) Distribution of the time to reach the peak of VTA DA activity during interactions (normalized as a percentage of bout duration). Mean = 27.63%, s.e.m. = 1.057%. (f) Normalized VTA DA activity as a function of the duration of bouts of passive interaction, with 200-ms bins. Pearson's correlation coefficient, two-sided. (g) Distribution of the time to reach the peak of VTA DA activity during interactions (normalized as a percentage of bout duration). Mean = 21.47%, s.e.m. = 0.995%. (h) Schema of rearing behavior. (i) PETH of the normalized VTA DA activity centered on the rearing behavior, from 5 seconds before to 5 seconds after the rearing. The rearing behavior does not induce changes in VTA DA activity at the population level. N and n indicate the number of mice and cells, respectively. All the data are shown as the mean ± s.e.m. (error bars).
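A peri-event time histogram of the kind described in panel i can be sketched as below. This is a minimal, assumption-laden illustration: the window of 5 s on each side of the event comes from the caption, while the 200-ms bin width is borrowed from panels b, d and f, and the function name and example times are hypothetical. The output is a raw firing rate in Hz; the normalization used in the figure is not reproduced.

```python
def peth(spike_times, event_times, window=(-5.0, 5.0), bin_size=0.2):
    """Mean firing rate (Hz) in each time bin around the events.

    spike_times, event_times: times in seconds.
    window:   (start, stop) relative to each event, in seconds.
    bin_size: bin width in seconds (assumed value).
    """
    if not event_times:
        raise ValueError("at least one event is required")
    start, stop = window
    n_bins = int(round((stop - start) / bin_size))
    counts = [0] * n_bins
    for event in event_times:
        for t in spike_times:
            rel = t - event
            if start <= rel < stop:
                # Clamp to the last bin to guard against float rounding.
                idx = min(int((rel - start) / bin_size), n_bins - 1)
                counts[idx] += 1
    # Convert summed counts to a rate averaged over events.
    return [c / (len(event_times) * bin_size) for c in counts]


# Hypothetical example: two rearing events and three spikes shortly after them.
events = [10.0, 30.0]
spikes = [10.05, 10.1, 30.1]
rates = peth(spikes, events)
print(len(rates), max(rates))
```

The nested loop is quadratic in the number of spikes and events, which keeps the sketch short; a sorted-array search would be preferable for full recording sessions.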

Extended Data Fig. 3 Heterogeneity of VTA DA activity depending on the type of interaction and the trials.

(a) Activity changes of the same VTA DA neurons across interaction trials for reciprocal contacts (red/stripes: increasing; blue: decreasing; green: no activity change for a given interaction; grey: no activity change regardless of the type of interaction). Chi-square test between the ratios of response types in the 1st and 3rd trials (χ²(1) = 10.78). (b) Response heterogeneity of the VTA DA neurons during reciprocal interactions among the neurons responding to at least one trial between the 1st and 3rd trials. (c) Activity changes of the same VTA DA neurons across interaction trials for unilateral contacts. Chi-square test between the ratios of response types in the 1st and 3rd trials (χ²(1) = 0.4107). (d) Response heterogeneity of the VTA DA neurons during unilateral interactions among the neurons responding to at least one trial between the 1st and 3rd trials. (e) Activity changes of the same VTA DA neurons across interaction trials for passive contacts. Chi-square test between the ratios of response types in the 1st and 3rd trials (χ²(2) = 3.000). (f) Response heterogeneity of the VTA DA neurons during passive interactions among the neurons responding to at least one trial between the 1st and 3rd trials.
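For readers unfamiliar with the test named in this caption, the Pearson chi-square statistic for a table of response-type counts can be computed as in the sketch below. The counts are invented for illustration and are not taken from the paper, the function name is hypothetical, and no continuity correction is applied; whether the authors used one is not stated here.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    table: list of rows of observed counts; every row and column total
           must be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows = 1st and 3rd trial, columns = two response types.
counts = [[10, 20], [20, 10]]
chi2, dof = chi_square_statistic(counts)
print(f"chi2({dof}) = {chi2:.3f}")
```

The P value would then be read from the chi-square distribution with `dof` degrees of freedom, for example with a statistics library.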

Extended Data Fig. 4 Photolabeled VTA DA neurons activity increases during social interactions with high heterogeneity response.

(a) Difference in firing-rate variance between non-photolabeled and photolabeled VTA DA neurons. Two-sample F-test (F(15,43) = 0.6348). (b) VTA DA firing rate during baseline and social context. Paired t-test, two-sided (t(16) = −3.8210). (c) Time course of the normalized VTA DA neuron firing rate during baselines and social sessions (top) with the associated heatmap of each neuron recorded (bottom). VTA DA activity increases during social interaction and habituates across exposures. Repeated-measures (RM) one-way ANOVA (Time main effect: F(5,13) = 5.2503, P = 0.0004) followed by Bonferroni-Holm correction. (d) Left: Normalized VTA DA firing rate before and during reciprocal interaction during the 1st trial. Wilcoxon test, two-sided (W = 122). Right: Proportion of the different VTA DA neuron activity responses during reciprocal events (red, increasing; blue, decreasing; green, no activity change for a given interaction; grey, no activity change regardless of the type of interaction). (e) Left: Normalized VTA DA firing rate before and during unilateral interaction during the 1st trial. Paired t-test, two-sided (t(16) = 2.729). Right: Proportion of the different VTA DA neuron activity responses during unilateral events. (f) Left: Normalized VTA DA activity before and during passive interaction during the 1st trial. Wilcoxon test. Paired t-test, two-sided (t(16) = 2.301). Right: Proportion of the different VTA DA neuron activity responses during passive interaction. (g) Activity changes of the same VTA DA neurons depending on the type of interaction (reciprocal, unilateral or passive). Chi-square test, two-sided, between the ratios of response types for each interaction (Overall: χ²(2) = 55.93, P = 8.07 × 10⁻¹²; Reciprocal vs unilateral: χ²(1) = 24.42; Reciprocal vs passive: χ²(1) = 52.17; Unilateral vs passive: χ²(1) = 7.636). The ratio of activity responses does not differ between all the VTA DA neurons (Fig. 2) and the photolabeled neurons only.
Chi-square test, two-sided, between the ratios of response types for all VTA DA neurons (from Fig. 2i) and for photolabeled VTA DA neurons only, for each interaction (Reciprocal: χ²(2) = 2.298, P = 0.6091; Unilateral: χ²(2) = 3.399, P = 0.2345; Passive: χ²(3) = 5.484, P = 0.3112). (h) Response heterogeneity of VTA DA neurons between reciprocal, unilateral and passive interactions among the neurons responding to at least one type of interaction. The neurons show either different or similar responses to the different contacts. (i) Left: Normalized VTA DA firing rate before and during reciprocal interaction during the 3rd trial. Wilcoxon test, two-sided (W = 95). Right: Proportion of the different VTA DA neuron activity responses during reciprocal events. (j) Left: Normalized VTA DA firing rate before and during unilateral interaction during the 3rd trial. Wilcoxon test, two-sided (W = 83). Right: Proportion of the different VTA DA neuron activity responses during unilateral events. (k) Left: Normalized VTA DA firing rate before and during passive interaction during the 3rd trial. Paired t-test, two-sided (t(13) = 0.9183, P = 0.3752). Right: Proportion of the different VTA DA neuron activity responses during passive events. (l) Activity changes of the same VTA DA neurons depending on the type of interaction (reciprocal, unilateral or passive). Chi-square test, two-sided, between the ratios of response types for each interaction (Overall: χ²(4) = 4.250, P = 0.3525; Reciprocal vs unilateral: χ²(1) = 0.2857; Reciprocal vs passive: χ²(2) = 2.111; Unilateral vs passive: χ²(2) = 3.300). The ratio of activity responses does not differ between all the VTA DA neurons (Fig. 4) and the photolabeled neurons only. Chi-square test between the ratios of response types for all VTA DA neurons (from Fig. 4j) and for photolabeled VTA DA neurons only, for each interaction (Reciprocal: χ²(2) = 0.2520, P = 0.0559; Unilateral: χ²(2) = 0.3162, P = 0.1497; Passive: χ²(3) = 1.806, P = 0.3795).
(m) Response heterogeneity of VTA DA neurons between reciprocal, unilateral and passive interactions among the neurons responding to at least one type of interaction. The neurons show either different or similar responses to the different contacts. N and n indicate the number of mice and cells, respectively. For box plots: the center line represents the median, the bounds of the box the 25th to 75th percentile interval and the whiskers the minima and maxima. Dots represent individual data points. All the data, except box plots, are shown as the mean ± s.e.m. (error bars).

Extended Data Fig. 5 Extinction and reinstatement in the SIT.

(a) Left: Experimental time course of the procedures for the 1st cohort (same cohort as in Fig. 1). The DAT-Cre mice are first injected with an AAV5-DIO-ChR2 and implanted with recording electrodes in the VTA before performing the shaping and instrumental phases. Right: Proportion of learners and non-learners in the task for the 1st cohort. (b) Left: Experimental time course of the procedures for the 2nd cohort. The DAT-Cre mice are first injected with an AAV5-DIO-ChR2 and implanted with recording electrodes in the VTA before performing the shaping, instrumental, extinction and reinstatement phases of the SIT. Right: Proportion of learners and non-learners in the task for the 2nd cohort. (c) Number of lever presses across the days and the 4 different phases (shaping, instrumental, extinction and reinstatement) for the 2nd cohort. RM one-way ANOVA (Time main effect: F(79,6) = 4.72, P = 1.09 × 10⁻⁷). (d) Comparison of the number of lever presses between the shaping (D1-D5), instrumental (D21-D25), extinction (D51-D55) and reinstatement (D76-D80) phases. RM one-way ANOVA (Phases main effect: F(3,6) = 34.67, P = 3.51 × 10⁻⁵) followed by Bonferroni-Holm correction. (e) Number of transitions between the lever and social zones for the 2nd cohort. Friedman test, two-sided (χ²(7) = 176.5, P = 1.0 × 10⁻¹⁵). (f) Comparison of the number of transitions between the shaping (D1-D5), instrumental (D21-D25), extinction (D51-D55) and reinstatement (D76-D80) phases. RM one-way ANOVA (Phases main effect: F(3,6) = 18.69, P = 0.0001) followed by Bonferroni-Holm correction. (g) Peri-event time histogram (PETH) of the velocity for the shaping (D1-D5), instrumental (D21-D25), extinction (D51-D55) and reinstatement (D76-D80) phases. (h) Comparison of the maximum velocity during the transitions for all the different phases of the SIT. RM one-way ANOVA (Phases main effect: F(3,6) = 6.557, P = 0.0106) followed by Bonferroni-Holm correction.
(i) Ratio of the different transitions depending on their velocity across the sessions of the different phases: fast (transitions < 2 s), slow (2 s < transitions < 7 s), delayed (7 s < transitions < 12 s) and missed (transitions > 12 s). (j) Proportion of the different transitions between the shaping (D1-D5), instrumental (D21-D25), extinction (D51-D55) and reinstatement (D76-D80) phases. N indicates the number of mice. Dots represent individual data points. All the data are shown as the mean ± s.e.m. (error bars).
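The transition categories in panel i can be expressed as a small classifier. The thresholds of 2, 7 and 12 s come from the caption; everything else is an assumption made for this sketch, including the function names, the example durations and the handling of durations that fall exactly on a threshold, which the caption leaves unspecified (here 2 s counts as slow, 7 s as delayed and 12 s as delayed).

```python
from collections import Counter


def classify_transition(duration_s):
    """Assign a lever-to-social-zone transition to a category by its duration (s)."""
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    if duration_s < 2.0:
        return "fast"
    if duration_s < 7.0:
        return "slow"
    if duration_s <= 12.0:   # boundary handling is an assumption
        return "delayed"
    return "missed"


def transition_proportions(durations):
    """Proportion of each category among the given transition durations."""
    if not durations:
        raise ValueError("at least one transition is required")
    counts = Counter(classify_transition(d) for d in durations)
    return {label: counts[label] / len(durations)
            for label in ("fast", "slow", "delayed", "missed")}


# Hypothetical session with eight transitions (durations in seconds).
session = [1.2, 1.8, 3.5, 6.0, 9.0, 11.0, 15.0, 0.7]
proportions = transition_proportions(session)
print(proportions)
```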

Extended Data Fig. 6 VTA DA neurons activity centered to the entry in the interaction zone and the intermediate phase of the SIT.

(a) Top: PETH of one VTA DA neuron responding to the entry into the interaction zone during the shaping phase. Bottom: Associated raster plot of the neuron during a session of the shaping phase. (b) Top: PETH of the normalized VTA DA activity during the shaping phase, centered on the entry into the interaction zone. Bottom: Associated heatmap of each neuron recorded. (c) Top: PETH of one VTA DA neuron responding before the entry into the interaction zone during the instrumental phase. Bottom: Associated raster plot of the neuron during a session of the instrumental phase. (d) Top: PETH of the normalized VTA DA activity during the instrumental phase, centered on the entry into the interaction zone. Bottom: Associated heatmap of each neuron recorded. (e) Schema of the operant chamber during the intermediate phase. (f) Top: PETH of one VTA DA neuron responding during the interaction window and the lever press. Bottom: Associated raster plot of the neuron during a session of the intermediate phase. (g) Top: PETH of the normalized VTA DA activity during the intermediate phase, centered on the lever presses. Bottom: Associated heatmap of each neuron recorded. (h) Proportion of the VTA DA neurons increasing their activity during the lever press (left) and the interaction window (right). (i) Left: Firing rate of VTA DA neurons during baseline, the lever press and the interaction window. Friedman test, two-sided (χ²(54) = 26.04, P = 2.22 × 10⁻⁶) followed by Bonferroni-Holm correction. Right: Normalized VTA DA activity during baseline, the lever press and the interaction window. Friedman test, two-sided (χ²(54) = 23.37, P = 8.42 × 10⁻⁶) followed by Bonferroni-Holm correction. N and n indicate the number of mice and cells, respectively. For box plots: the center line represents the median, the bounds of the box the 25th to 75th percentile interval and the whiskers the minima and maxima. Dots represent individual data points. All the data, except box plots, are shown as the mean ± s.e.m. (error bars).

Extended Data Fig. 7 VTA DA neurons activity encodes social prediction error during the SIT: Error and correct trials.

(a) Schema of the operant chamber during error trials of the instrumental phase. (b) Top: PETH of one VTA DA neuron decreasing its activity during the interaction window and increasing it during the exit from the lever zone. Bottom: Associated raster plot of the neuron during a session of error trials. (c) Top: PETH of the normalized VTA DA activity during the error trials, centered on the exit from the lever zone. Bottom: Associated heatmap of each neuron recorded. (d) Proportion of the VTA DA neurons increasing their activity during the exit from the lever zone (left) and decreasing their activity following the transitions when the door is closed (right). (e) Left: Firing rate of VTA DA neurons during baseline, the exit from the lever zone and in the interaction zone. Friedman test, two-sided (χ²(47) = 21.07, P = 2.66 × 10⁻⁵) followed by Bonferroni-Holm correction. Right: Normalized VTA DA activity during baseline, the exit from the lever zone and in the interaction zone. Friedman test, two-sided (χ²(47) = 14.28, P = 0.0008) followed by Bonferroni-Holm correction. (f) Schema of the operant chamber during correct trials of the instrumental phase. (g) Top: PETH of one VTA DA neuron increasing its activity during the exit from the lever zone. Bottom: Associated raster plot of the neuron during a session of correct trials. (h) Top: PETH of the normalized VTA DA activity during the correct trials, centered on the exit from the lever zone. Bottom: Associated heatmap of each neuron recorded. (i) Proportion of the VTA DA neurons increasing their activity during the exit from the lever zone (left) and decreasing their activity following the transitions when the door is closed (right). (j) Left: Firing rate of VTA DA neurons during baseline, the exit from the lever zone and in the interaction zone. Friedman test, two-sided (χ²(45) = 28.77, P = 5.65 × 10⁻⁷) followed by Bonferroni-Holm correction. Right: Normalized VTA DA activity during baseline, the exit from the lever zone and in the interaction zone.
Friedman test (χ²(45) = 21.32, P = 2.35 × 10⁻⁵) followed by Bonferroni-Holm correction. N and n indicate the number of mice and cells, respectively. For box plots: the center line represents the median, the bounds of the box the 25th to 75th percentile interval and the whiskers the minima and maxima. Dots represent individual data points. All the data, except box plots, are shown as the mean ± s.e.m. (error bars).

Extended Data Fig. 8 Optogenetic inhibition of VTA DA neurons decreases time in the interaction zone.

(a) Number of visits to the interaction zone when the door is closed between D15 and D20. Unpaired t-test, two-sided (t(16) = 0.2808, P = 0.7824). (b) Number of visits to the interaction zone when the door is open between D15 and D20. Mann-Whitney U test, two-sided (U = 14). (c) Velocity in the operant chamber when the door is closed between D15 and D20. Unpaired t-test, two-sided (t(16) = 0.1528, P = 0.8804). (d) Velocity in the operant chamber when the door is open between D15 and D20. Unpaired t-test, two-sided (t(16) = 1.018, P = 0.3237). N indicates the number of mice. Dots represent individual data points. All the data are shown as the mean ± s.e.m. (error bars).

Supplementary information

Supplementary Information

Supplementary Tables 1 and 2

Reporting Summary

Source data

Source Data Fig. 1

Neuronal activity recording data

Source Data Fig. 2

Neuronal activity recording data and behavioral scoring with DeepLabCut

Source Data Fig. 3

Neuronal activity recording and behavioral data

Source Data Fig. 4

Neuronal activity recording data

Source Data Fig. 5

Behavioral data for the SIT

Source Data Fig. 6

Neuronal activity recording data from the shaping and instrumental phases

Source Data Fig. 7

Neuronal activity data for the omission phase

Source Data Fig. 8

Behavioral analyses for optogenetic manipulation in the SIT

Source Data Extended Data Fig. 1

Neuronal activity recording data from photolabeled experiments

Source Data Extended Data Fig. 2

Neuronal activity recording data from the free interaction task

Source Data Extended Data Fig. 4

Neuronal activity recording data for photolabeled neurons during only the free social interaction task

Source Data Extended Data Fig. 5

Behavioral analyses from the SIT (extinction phase)

Source Data Extended Data Fig. 6

Neuronal activity recording data from the shaping, instrumental and intermediate phases

Source Data Extended Data Fig. 7

Neuronal activity recording data from error and correct trials of the SIT

Source Data Extended Data Fig. 8

Behavioral analyses during optogenetic manipulation in the instrumental task

About this article

Cite this article

Solié, C., Girard, B., Righetti, B. et al. VTA dopamine neuron activity encodes social interaction and promotes reinforcement learning through social prediction error. Nat Neurosci 25, 86–97 (2022). https://doi.org/10.1038/s41593-021-00972-9

  • DOI: https://doi.org/10.1038/s41593-021-00972-9
