A long-standing notion is that kappa opioid receptors (KOPR) are part of negative valence brain systems that contribute to anhedonia and aversive motivated behaviors. Overactivation of KOPR systems has been posited to play a role in drug withdrawal aversion and the motivation to alleviate drug withdrawal [1]. Additionally, activation of KOPRs, like that of the other opioid receptors, produces analgesia. Their low abuse potential (due to their presumed aversive effects), coupled with their therapeutic potential, has made KOPR an appealing target for pain management and addiction treatment. Many studies corroborate the use of KOPR agonists and antagonists to improve negative affective states [2]. However, there is also literature indicating that KOPR modulation can instead produce anxiety, increase drug intake, or have no effect [3]. Despite this expansive literature, the hypothesis that KOPRs selectively contribute to negative valence processing and behavioral responses to aversive stimuli had not been directly tested until now. In their article published in this issue of Neuropsychopharmacology, Farahbakhsh et al. test this theory directly using positive and negative reinforcement procedures [4].

Based on a priori valence processing frameworks [5], the authors hypothesized that if KOPR activity mediates responses to aversive stimuli, antagonism would selectively impair negative reinforcement learning but have no effect on positive reinforcement learning. Using daily operant conditioning sessions with either positive (sucrose) or negative (footshock) reinforcement, the authors tested whether systemic KOPR antagonism with norBNI would affect the ability of mice to recognize visual light cues and perform the “correct” response to either receive the positive reinforcer or avoid the negative reinforcer. Surprisingly, KOPR antagonism enhanced positive reinforcement learning by increasing the speed and accuracy with which mice responded to sucrose-predictive cues compared to saline controls. There were no differences in maximum performance or total amount consumed between the saline and norBNI groups, suggesting that KOPRs do not directly modulate the reward value per se. Complementing the positive reinforcer results, KOPR antagonism also increased the rate of negative reinforcement learning.

Together, these findings suggest that KOPR modulates learning rate independent of the presumed valence of the unconditioned stimulus, and they indicate that the negative valence theory cannot explain KOPR’s role in behavioral modulation. To clarify how KOPR blockade affects reinforcement learning, the authors performed crossover experiments in which mice that had previously been trained with saline pretreatment were exposed to the task after norBNI pretreatment. They found that KOPR antagonism enhanced positive reinforcement learning only for novel contingencies; mice that had previously been trained and had reached acquisition criteria were unaffected by subsequent norBNI treatment. This result suggests that KOPR blockade affects the acquisition of learned behaviors rather than their expression.

The authors’ results challenge the canonical model of KOPR functioning solely as part of a negative valence system. Next, the authors reevaluated the role of the KOPR system and how it contributes to the modulation of affective and motivated states. They designed experiments to identify an alternative explanation for KOPR control of learning, one that is consistent with both their results and the previous literature. They proposed that KOPR blockade increases novelty exploration, which is an important component of learning.

To investigate this new hypothesis, the authors measured exploratory behavior in a novel environment and the rate of habituation. They found that mice treated with norBNI prior to testing showed increased locomotion, normalized to the first 5 min of exposure. This indicates a heightened exploratory drive, which resulted in a prolonged period of activity that did not return to baseline levels until 60 min. In contrast, saline controls returned to baseline activity levels within 15 min. However, when mice were placed into a familiar environment, there were no group differences in normalized locomotion or in the rate of locomotor decrease. To determine whether these novelty responses are related to the KOPR-sensitive associative reinforcement results described earlier, the authors used a task in which the operant reinforcers were novel cue lights. They trained mice on a fixed ratio reinforcement schedule in which responding was reinforced by illumination of three identical cue lights, with the duration, frequency, and pattern of the lights randomized for each presentation. Over several sessions, mice continued to respond and increased their rate of responding for the light reinforcement, indicating that they had not habituated to the cues or the reinforcer. The main finding in this operant task was that mice treated with norBNI earned more light cue reinforcers than those treated with saline.

To measure the motivation of the mice to respond for the novel sensory cues, the authors applied a behavioral economics approach in which the mice were required to respond, or “pay”, progressively more to receive the light cues. They found that norBNI-treated mice were willing to pay more, that is, to lever press more, to receive the novel cues than saline-treated controls. The authors posit that KOPR blockade specifically increases the intrinsic motivational value of novel stimuli. However, it may also be that KOPR blockade maintains the novelty of the cues and thus the motivation to pay for their delivery. Regardless, the overall results support KOPR’s proposed role as a modulator of novelty processing.

In conclusion, Farahbakhsh et al. provided experimental evidence that does not support the prevailing dogma that KOPRs are critical for negative valence, and further showed that the KOPR system is an important modulator of novelty processing. This expanded role for KOPR provides an opportunity to reconsider decades of published studies within a new framework. The authors’ results open many new experimental routes, such as examining how stimulation of KOPRs may alter novelty processing, or how distinct KOPR circuits in the brain regulate this form of reinforcement learning. These questions will be important to investigate: as more KOPR modulators make their way toward clinical use, their impact on processes other than negative valence will need to be carefully considered.