Introduction

In 2020, an estimated 138.5 million Americans aged 12 or older reported alcohol use in the past month, and 61.6 million reported a binge drinking episode (four or five drinks for women or men, respectively) in the past month1. Rates of alcohol consumption have remained relatively stable over the past decade, with some upward trends for women and older adults possibly amplified by the COVID-19 pandemic2,3. Illicit drug use is also rising: nearly one in five Americans reported illicit drug use of some kind in the past year, and 43.5 million Americans reported cannabis use in the past year1. These trends in the USA are generally mirrored around the world1,4,5. Although many use alcohol or illicit drugs without suffering notable clinical concerns, a minority use at high levels, resulting in diagnoses of substance-use disorder and a range of acute to chronic substance-related problems. Serious harms, such as alcohol-related mortality, overdose and cirrhosis, are on the rise6,7, with notable increases during the COVID-19 pandemic8, even as other causes of death are decreasing9. The public health impact of these harms is monumental and results in billions of dollars in financial costs each year10.

Myriad policies have been developed to reduce or eliminate the burden of substance use. Yet alcohol- and drug-related morbidity and mortality have increased over the past two decades7,11,12, suggesting that existing strategies are far from adequate. Robust and valid theories are needed to guide treatment and policy development. Specifically, it is incumbent upon psychological theories of addiction to explain the following: why people consume drugs; why some people use alcohol and drugs in a manner that contributes to health and social problems, while others are able to use moderately with minimal consequences; why many people are able to reduce patterns of alcohol and drug use, often without participating in formal treatment, while others experience chronic, escalating patterns of use; and why substance-related morbidity and mortality have increased over the past two decades and in particular over the past few years7,11,12.

In this Perspective, we provide an overview of a contemporary behavioural economic theoretical account of addiction, the reinforcer pathology model, which suggests that drug consumption is the result of overvaluation of smaller immediate rewards and drug-specific reinforcement. Next, we highlight limitations of the reinforcer pathology model and describe an extension, the contextualized reinforcer pathology model, that highlights the critical role of alternative reinforcers in addiction motivation. We then review empirical literature across the translational spectrum that supports this model. Finally, we review relevant literature on increases in addiction-related morbidity and mortality and addiction-related health disparities that might be understood and potentially ameliorated via contextualized reinforcer pathology.

Reinforcer pathology

A behavioural economic account of substance use refers to a set of empirical methods and models of decision-making that integrate microeconomic and operant learning theory principles to understand the decision-making processes and contextual features that influence substance consumption over time.

The most popular contemporary behavioural economic model, reinforcer pathology13,14, suggests that addiction is marked by between-individual differences in relative reinforcing value (high drug demand) and by a more general decision-making bias that overvalues smaller immediate rewards relative to larger delayed rewards (high delay discounting), both of which are treated as central aetiological risk factors. These two key concepts of the reinforcer pathology model are described below.

Drug demand

Across behavioural economic models, the reinforcing value (or ‘demand’ in behavioural economic terms) of substance use is measured by the level of consumption or the amount of behavioural (or monetary) output emitted to engage in the activity15. Real or hypothetical purchase tasks are usually used to measure drug or alcohol value. In a typical alcohol purchase task, individuals report how many drinks they would purchase during a hypothetical drinking scenario at each price in a series of escalating prices16. Responses at each (monetary) cost are plotted to create a demand curve, from which indices of drug value are derived (Fig. 1a). Human and laboratory animal research has consistently demonstrated that, within a closed economy (a choice economy with defined constraints on access to drugs and no access to any commodities outside the economy), response to a drug reinforcer decreases as the cost of acquiring the substance increases17 (Fig. 1a). Purchase tasks mirror (but are more cost- and time-effective than18,19) laboratory progressive ratio tasks in which animals or human participants have the opportunity to self-administer a substance as the cost for doing so (for example, the number of button presses required or the monetary expenditure) progressively increases20.

Fig. 1: Behavioural economic demand and delayed reward discounting.

a, Representation of a behavioural economic demand curve, which can be plotted using data from purchase tasks. As cost increases, consumption decreases. Demand indices that can be extracted from the data include intensity (consumption at zero cost), breakpoint (the price at which consumption is fully suppressed) and elasticity (the rate of change in consumption as a function of cost). b, Representation of the change in value of two rewards as a function of the delay to reward receipt. The reward from substance use is smaller, but the receipt is nearer in time, whereas the reward from an alternative is larger, but receipt is further delayed in time. When receipt of both rewards is distant in time, the value of the larger reward is greater. However, owing to the hyperbolic nature of delayed reward discounting, the value of the immediate reward increases at a greater rate as reward receipt becomes closer in time. Thus, an individual might experience a preference reversal, in which the value of the immediate reward surpasses the value of the delayed reward when receipt of the immediate reward is imminent.

Individual differences in the degree to which costs lead to a decrease in responding index between-person valuation of a drug or drug demand (Fig. 1a). Reinforcer pathology suggests that these individual differences reflect strength of motivation for the drug and should be correlated with levels of alcohol use and alcohol-related problems13,14. Indeed, indices of drug demand, in particular maximum consumption, expenditure level and sensitivity to changes in drug price, show robust correlations with consumption16,21,22,23,24, substance-use problems25,26, and substance-use disorder27. Demand indices are also robust prospective predictors of drinking behaviour even after controlling for past alcohol consumption28,29. In other words, measures that aggregate a series of hypothetical drinking decisions across escalating costs have predictive utility over and above measures of recent drinking practices.
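As a rough illustration of how demand indices can be read off purchase-task responses, the following minimal Python sketch computes three observed indices. The function name, the price list and the responses are hypothetical, the first price is assumed to be zero, and maximum expenditure is labelled `omax` (with its price `pmax`) purely as a naming convention for this sketch. Elasticity is deliberately left out because estimating it generally requires fitting a demand-curve model, which is beyond this example.

```python
from typing import Optional, Sequence


def demand_indices(prices: Sequence[float], drinks: Sequence[float]) -> dict:
    """Observed demand indices from one (hypothetical) purchase-task response set.

    prices: escalating price per drink; the first entry is assumed to be 0 (free)
    drinks: number of drinks the respondent would purchase at each price
    """
    if len(prices) != len(drinks) or len(prices) == 0:
        raise ValueError("prices and drinks must be non-empty and of equal length")

    # Intensity: consumption at zero cost (first price point).
    intensity = drinks[0]

    # Breakpoint: first price at which consumption is fully suppressed,
    # or None if the respondent never reports zero drinks.
    breakpoint_price: Optional[float] = next(
        (p for p, q in zip(prices, drinks) if q == 0), None
    )

    # Expenditure at each price; its maximum and the price where it occurs.
    spend = [p * q for p, q in zip(prices, drinks)]
    omax = max(spend)
    pmax = prices[spend.index(omax)]

    return {
        "intensity": intensity,
        "breakpoint": breakpoint_price,
        "omax": omax,
        "pmax": pmax,
    }


# Made-up responses for illustration only.
prices = [0.0, 1.0, 2.0, 4.0, 8.0, 16.0]
drinks = [6, 5, 4, 3, 1, 0]
indices = demand_indices(prices, drinks)
print(indices)
```

With these made-up responses, consumption falls as price rises, expenditure peaks at an intermediate price, and consumption is fully suppressed at the highest price.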

Delay discounting

Costs and benefits across choice options are unevenly distributed across time, such that some reinforcers, like drug use, have relatively greater immediate benefits (for example, intoxication, euphoria, social facilitation, anxiety reduction and withdrawal relief) and health and social costs that are substantially delayed (and probabilistic). By contrast, other reinforcers, such as earning a college degree, have relatively immediate costs (attending class, studying and paying tuition) and delayed (and probabilistic) rewards (satisfaction of earning good grades and graduating; higher-quality employment and salary). This critical temporal element to reward valuation for drugs versus alternatives is captured by delayed reward discounting, which is the relative preference for smaller, sooner rewards compared to larger, later rewards30. Delayed reward discounting describes how much the value of an activity or commodity decreases as a function of its temporal ‘distance’ from the current moment. Empirical research suggests that the subjective current value of delayed rewards decreases more steeply with initial delays, consistent with a hyperbolic decay function, rather than at a constant rate31 (Fig. 1b). One implication of hyperbolic discounting is that the preference for smaller immediate versus larger delayed rewards shifts dynamically as a function of time to reward availability, exhibiting steep devaluation at initial delays and shallower devaluation at further delays. Thus, humans and laboratory animals generally prefer larger, later rewards when reward receipt for both options is distal, but preference often reverses as the availability of the smaller, sooner reward becomes imminent31 (Fig. 1b).
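The preference reversal described above can be sketched numerically. The example below assumes the commonly used hyperbolic form V = A / (1 + kD), where A is the reward amount, D the delay and k a discounting parameter; the amounts, times and parameter values are invented for illustration and are not taken from any study. An exponential (constant-rate) discounter is included only as a contrast.

```python
import math

# Hypothetical rewards: a smaller reward available on day 10 and a
# larger reward available on day 20.
SS_AMOUNT, SS_TIME = 50.0, 10.0
LL_AMOUNT, LL_TIME = 100.0, 20.0


def hyperbolic_value(amount: float, delay: float, k: float = 0.5) -> float:
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount: float, delay: float, r: float = 0.05) -> float:
    """Constant-rate discounting: V = A * exp(-r * D)."""
    return amount * math.exp(-r * delay)


def preferred(now: float, discount) -> str:
    """Return which reward has the higher present value at time `now`."""
    ss = discount(SS_AMOUNT, SS_TIME - now)
    ll = discount(LL_AMOUNT, LL_TIME - now)
    return "smaller-sooner" if ss > ll else "larger-later"


for now in (0.0, 10.0):
    print(
        f"day {now:>4}: hyperbolic -> {preferred(now, hyperbolic_value)}, "
        f"exponential -> {preferred(now, exponential_value)}"
    )
```

Under these assumed parameters, the hyperbolic discounter prefers the larger, later reward when both are distant and switches to the smaller, sooner reward once it becomes imminent, whereas the exponential discounter keeps the same preference at both time points.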

Individuals vary in their time horizons for behavioural allocation, influencing the rate at which they devalue delayed rewards. Thus, utility maximization is relative to the temporal frame of reference. A local (shorter) time frame of reference typically compares discrete, independent choices (for example, should I drink alcohol tonight or study for my exam?) to maximize short-term utility (enjoyment from drinking and socializing). By contrast, a temporally extended global or molar frame of reference compares two choices on the basis of their anticipated value over the course of an extended pattern of behaviour that comprises many discrete choices that might accrue value exponentially over time32,33. For example, consider a series of discrete choices between watching TV and drinking alcohol versus exercising each night over the course of a month. An evening spent watching TV and drinking alcohol might have high immediate value that does not necessarily aggregate over time (whereas costs might aggregate). By contrast, exercise might lead to benefits that are not immediately evident after one discrete event but instead emerge after consistently engaging in a pattern of behaviour. This intertemporal choice dynamic is foundational to behavioural economics, including applications to substance-related harms33.

Importantly, discounting applies to all delayed rewards, and rates of discounting vary considerably across commodities34,35. Furthermore, steep discounting (a greater preference for smaller immediate rewards over larger delayed rewards) can be adaptive when it comes to securing reinforcers in dangerous or deprived environments in which delayed rewards are also highly uncertain (as in the idiom, ‘a bird in the hand is worth two in the bush’). Nevertheless, steep delay discounting might be an especially relevant decision-making bias that contributes to frequent drug use because positive drug effects (euphoria, enhanced focus, or reduced pain and anxiety) tend to occur immediately, whereas their costs or negative effects are generally delayed (ranging from hours for acute illness or hangover to years for health and social impacts). Additionally, in modern society, legal and illegal drug reward is often easy to obtain and imposes little upfront cost.

A central tenet of the contemporary reinforcer pathology model is that the temporal window of value allocation substantially determines the relative value of the reinforcers operating in that window13. In other words, higher substance value, and therefore persistent substance consumption even at high costs, can be attributed to a preference for immediate rewards because the value of more distal rewards diminishes very quickly as they fall outside a person’s time horizon. In turn, the reinforcer pathology model has led to intervention approaches focused on expanding the temporal horizon of decision making (that is, reducing delay discounting)36,37,38,39.

Alternative reinforcement

The contemporary reinforcer pathology model is subject to several underemphasized considerations that are critical for understanding choice behaviour. The reinforcer pathology model emphasizes that steep discounting contributes to preference for drug rewards relative to alternative rewards. However, it does not emphasize the environment, including the availability of drugs and the relative reinforcing efficacy of substance-free alternative reinforcers, as contributing factors to elevated demand or discounting, or as direct contributors to risk for harmful alcohol and drug use40. The contemporary reinforcer pathology approach emphasizes, and typically measures, drug reinforcing value and delay discounting with the assumption of all other things being equal, but the influence of alternative reinforcers frequently violates this assumption. Thus, studies in the addiction literature over the past two decades have focused on reductionistic accounts of absolute responding, closed economies, and the differences in between-person choice preferences for delayed and substance rewards. Consequently, these critical behavioural economic variables are now often considered static individual difference variables, which deviates from decades of research showing that the economy and choice context influence the relative value of a drug15,41,42,43.

Addiction might be better understood by simultaneously considering temporal discounting and drug-specific reinforcing value, alongside immediate and delayed costs and benefits of both the substance and alternatives over extended patterns of behaviour. Indeed, real-world decision making occurs in an open economy in which an individual can typically choose between two or more options in a choice context. When considering this broader choice context41,42, distal causal influences exerted by the characteristics of the choice economy emerge that cannot be described by models of proximal causation44 and emerge only through a molar analysis of behaviour42,45. Although the contemporary reinforcer pathology model acknowledges the importance of relative value and often compares the immediate value of drugs to the delayed value of some alternative, the central tenets explicitly ignore the distal causal influence of the choice environment30,41,42,45.

Indeed, a key strength of behavioural economics, and a factor that distinguishes it from other theories of addiction, is its explicit scaffolding to reconcile person-level and environmental factors (Fig. 2a). Other prominent theories of addiction emphasize factors within the person, be it through neurobiological46,47 or psychological mechanisms. By contrast, sociological48 and anthropological49 models emphasize environmental conditions over person-level factors. Although proponents of other person-level theories have begun to integrate environmental factors46, behavioural economics provides a robust conceptualization that quantitatively and intuitively accounts for both within-individual and environmental factors, making this theory ideally positioned to enhance addiction research, intervention and prevention. In a discrete choice context, a person’s intertemporal orientation, the constraints on the drug itself, and constraints on alternatives are all mutable environmental factors implicated as determinants of the likelihood of drug consumption. Over time, each of these forms distinct, predictable, aggregate patterns of behaviour that can be measured and used as individual difference variables (Fig. 2b).

Fig. 2: Situating behavioural economic theories of addiction.

a, Although most theories of addiction recognize diverse influences, disciplinary foci tend to be oriented toward person-level factors or environmental factors. b, Behavioural economics bridges the connection between environmental and within-individual determinants by framing behaviour within a discrete choice context that is heavily influenced by environmental factors; these discrete choice contexts are building blocks for patterns of behaviour over time, which aggregate into measurable individual difference variables.

In this section, we present an extension to the reinforcer pathology model, the contextualized reinforcer pathology model, which addresses the limitations described above and highlights the critical role of alternative reinforcers in addiction motivation. We then describe the matching law, which serves as a primary theoretical premise of alternative reinforcement.

Contextualized reinforcer pathology

Contextualized reinforcer pathology posits that drug value, and consequently the likelihood of drug consumption, is critically determined not only by temporal windows of value allocation, but also by the characteristics of environmental choice contexts (Fig. 3). The contextualized reinforcer pathology model is a molar theory of behaviour: behaviour is measured over extended temporal windows and diverse sets of constraints to characterize the most likely behavioural output over time45. Constraints can be anything that influences the value of the commodities in the choice context50. From this perspective, behaviour can be broadly explained through utility (value) maximization, in which choice outcomes maximize benefits and minimize costs over a specified and varying temporal window (that is, there is no assumption that humans or non-human laboratory animals maximize utility in an ultimate sense).

Fig. 3: A contextualized reinforcer pathology approach.

Two theoretical examples depicting the effects of environmental constraints on the value of alternatives and alcohol at three time points. Blue represents reinforcement from alternative activities and red represents reinforcement from alcohol-related activities. The left panel shows a scenario most likely to result in increasing levels of substance use. Initially, substance value is low, and the individual engages in many alternative activities. However, over time the environmental context places increasingly high constraints on alternatives (the local park shuts down, the individual cannot afford to go to college, the roads are bad for biking) and low constraints on alcohol (easily available from local store, cheap, social reinforcement from drinking). Consequently, alcohol value increases over time while engagement in, and availability of, alternatives decrease. The right panel shows a scenario that would result in stable or decreasing levels of alcohol use over time. Initially, substance value is low, and the individual engages in many alternative activities. Over time, constraints on alternatives remain low. As the individual enters emerging adulthood, they connect with friends through drinking, and therefore the value of alcohol rises slightly. However, the individual maximizes more global utility and continues to engage in available alternatives that effectively compete with the immediate rewarding effects of alcohol. Consequently, when the individual leaves college and drinking among friends declines, the individual’s drinking declines as well.

A key assumption of contextualized reinforcer pathology is that a drug’s reinforcing value is not an innate quality of a drug but is instead critically determined by characteristics of the choice environment. Although delayed reward discounting and behavioural economic demand have been operationalized as stable, individual difference variables, this is a feature of measurement; the stability in these constructs is due in part to the stability of the environmental choice context and (lack of) availability of alternatives in the instructional sets. Value is influenced by factors across varying temporal and environmental (spatial) frames in a way that requires explanations of distal causation (the level of public health in the environmental context influences individual drug value). In other words, a narrow spatial analysis might ignore the environment altogether and focus on within-individual or between-individual variables that predict alcohol use. Expanding the spatial analysis might reveal county-level differences in the availability of liquor stores and alternatives, such as parks and recreational opportunities, that might explain additional variance across populations in various counties. Expanding the spatial analysis further might reveal state-level differences (for example, in the legal status of cannabis) or cultural differences across nations in the acceptability of public alcohol consumption.

In the contemporary reinforcer pathology model, the pathology of overuse of a specific reinforcer resides in the internal decision-making processes of the individual, and the influence of the broader context is unaccounted for, whereas pathology in the contextualized reinforcer pathology model resides in the interaction between the person and the context. Studies have demonstrated that substance demand is malleable to numerous experimental manipulations, such as cue exposure (controlled exposure to substance-related environmental stimuli)51,52, opportunity cost (choosing the substance reinforcer at the expense of an alternative that also carries value)53,54, the social context55,56, and both pharmacological and psychosocial treatments57,58,59. Delay discounting is also influenced by context36,60, including through exposure to natural (as compared to man-made) environments61, shifts in the time to receipt of alternatives31, and manipulations targeting the temporal frame, such as episodic future thinking62,63,64. Indeed, although the effects of alternative reinforcers, demand, and delayed reward discounting are often studied in isolation, these factors may interact to influence behaviour during a discrete choice (Fig. 4).

Fig. 4: Interactions between substance cost and alternative reward.

The likelihood of using a substance is based on the cost of the substance, the delay to the receipt of the alternative reward, and the value of the alternative reward. Across both plots, substance use is most likely when the substance cost is low and when the delay to the alternative is high. As the cost of the substance increases, the likelihood of use decreases. Further, as the alternative reward receipt becomes closer in time, the likelihood of use decreases. However, when there is a high-value alternative available (an activity that generates a positive affective state or sense of accomplishment or alleviates an aversive state; bottom panel) the likelihood of use across all delays and substance costs is attenuated relative to when a low-value alternative (a non-stimulating or aversive activity; top panel) is available.

Moreover, in the contextualized reinforcer pathology model, pathology can be determined only within an individual’s functional context. Reinforcement learning is an adaptive process that occurs because it results in reward or alleviation of distress; the reinforced behaviour serves a function and drug behaviour is only ‘pathological’ when the behaviour leads to functional impairment in the short term (for example, accidents, hangovers or missing work) or long term (for example, declining health or social functioning).

The matching law

The importance of alternative reinforcement in decision making broadly, and drug use specifically, is grounded in the behavioural matching law65, a behavioural principle which states that the relative rate of responding approximates the relative rate of reinforcement at each alternative66 (Box 1). In an exemplar experiment66, pigeons were concurrently reinforced to peck two keys in an experimental chamber under independent variable-interval schedules of reinforcement. In other words, each reinforcer was delivered for the first key peck made after a specified amount of time had elapsed, and the time between reinforcers varied throughout the task. Across five experimental sessions, rates of responding corresponded almost perfectly with frequency of reinforcement. These findings illustrate that the relative response rate for each choice option approximates the relative rate of reinforcement from that option and that behaviour is allocated across options in proportion to the reinforcement each provides65,67. That is, consummatory behaviour matches the local reinforcement contingencies. Behavioural allocation consistent with the matching law is generally adaptive and does not imply a pathological pattern of responding but is instead a quantitative codification of the assumptions made about choice under different conditions68.
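The proportional relationship described above can be written as B1 / (B1 + B2) = R1 / (R1 + R2), where B is the response rate and R the reinforcement rate on each option. The short Python sketch below computes the allocation this strict form of the matching law predicts. The function name and the reinforcement rates are hypothetical, and the sketch ignores the bias and sensitivity terms used in generalized versions of the law.

```python
from typing import Dict


def matching_allocation(reinforcement_rates: Dict[str, float]) -> Dict[str, float]:
    """Predicted share of responding on each option under strict matching.

    Each option's share of behaviour equals its share of total reinforcement:
    B_i / sum(B) = R_i / sum(R).
    """
    total = sum(reinforcement_rates.values())
    if total <= 0:
        raise ValueError("total reinforcement must be positive")
    return {option: rate / total for option, rate in reinforcement_rates.items()}


# Hypothetical schedule: the left key yields twice as many reinforcers
# per hour as the right key.
predicted = matching_allocation({"left_key": 40.0, "right_key": 20.0})
for option, share in predicted.items():
    print(f"{option}: {share:.2f} of responses")
```

In this made-up schedule, the option that delivers two-thirds of the reinforcers is predicted to attract two-thirds of the responses.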

The matching law has two critical implications for theories of substance use. First, the response rate for various reinforcers might serve as a measure of reinforcement value. In line with this theoretical premise, applied human researchers have developed indices of reinforcement for humans based on time allocation and discretionary spending for alcohol and non-alcohol activities as well as activity enjoyment, with activities that are engaged in frequently and rated as subjectively enjoyable classified as highly reinforcing69.

Second, operant behaviour is a zero-sum outcome set dependent upon the available reinforcers and an individual’s response rate to each reinforcer. The introduction of any alternative eliciting a non-zero response rate will shift response rates, and therefore the reinforcement, for all other choices in the choice context65. This is consistent with other research showing that preference for drug reinforcers varies as a function of the availability of non-drug reinforcers70,71,72,73,74,75. To illustrate the zero-sum nature of reinforcement and the importance of alternatives, consider patterns of reinforcement for two individuals. Both individuals drink the same amount of alcohol per week, but person 1 allocates less time to other activities, such as work, family, and hobbies compared to person 2. Consequently, reinforcement from substance use is a larger percentage of the overall ‘reinforcement pie’ for person 1, and they are therefore more likely to have problems related to drinking and are more vulnerable to chronic alcohol-related harms.
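The two-person comparison can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that reinforcement from an activity is approximated by weekly hours multiplied by an enjoyment rating on a 0 to 4 scale; this scoring rule, the activity lists and all numbers are hypothetical rather than a description of any published index.

```python
from typing import Dict, Tuple

# activity -> (weekly hours, enjoyment rating on an assumed 0-4 scale)
Activities = Dict[str, Tuple[float, float]]


def alcohol_share(activities: Activities, substance_key: str = "alcohol") -> float:
    """Share of total reinforcement that comes from the substance-related activity."""
    scores = {name: hours * enjoy for name, (hours, enjoy) in activities.items()}
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("total reinforcement must be positive")
    return scores[substance_key] / total


# Both people drink the same amount, but person 2 has more alternatives.
person_1: Activities = {
    "alcohol": (10, 3),
    "work": (20, 1),
    "family": (5, 2),
}
person_2: Activities = {
    "alcohol": (10, 3),
    "work": (40, 1),
    "family": (15, 2),
    "hobbies": (10, 2),
}

share_1 = alcohol_share(person_1)
share_2 = alcohol_share(person_2)
print(f"person 1: {share_1:.2f}, person 2: {share_2:.2f}")
```

Although alcohol-related reinforcement is identical for both people in this toy example, it makes up a larger slice of the total for person 1 because fewer alternatives share the pie.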

These lines of research provide support for the premise that preference for a substance depends on the relative constraints on access to other available reinforcers in the choice context. Thus, drug consumption is in part an inverse function of access to alternative rewards. This conclusion contrasts with early work in laboratory animals and humans that consistently found that alcohol and other drugs often continue to be self-administered at high rates under schedules of reinforcement in which a reward is only provided after a specified number of responses (fixed ratio or progressive ratio schedules) and when no alternative reinforcers are available76,77. These results are consistent with the law of absolute responding, which suggests that as the amount of reinforcement for a given commodity increases, the amount of behaviour allocated toward that commodity increases hyperbolically66. However, these experiments lack ecological validity: people are rarely presented with only one choice in daily life. Indeed, substance consumption decreases when alternatives are concurrently available alongside substance self-administration78,79,80,81,82,83.
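One way to see why single-option experiments overstate responding is through a commonly cited hyperbolic form of the law of absolute responding, B = kR / (R + Re), where R is reinforcement from the target commodity, Re is reinforcement from all other sources and k is the maximum response rate. This equation is not given in the text above and is assumed here for illustration only; the parameter values are invented.

```python
def absolute_response_rate(r_target: float, r_other: float, k: float = 100.0) -> float:
    """Hyperbolic response rate: B = k * R / (R + Re).

    r_target: reinforcement rate from the target commodity (R)
    r_other:  reinforcement rate from all other sources (Re)
    k:        asymptotic maximum response rate (assumed value)
    """
    if r_target < 0 or r_other < 0 or (r_target + r_other) == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k * r_target / (r_target + r_other)


# Same drug reinforcement, but sparse versus rich alternative reinforcement.
sparse_alternatives = absolute_response_rate(r_target=30.0, r_other=10.0)
rich_alternatives = absolute_response_rate(r_target=30.0, r_other=90.0)
print(sparse_alternatives, rich_alternatives)
```

Holding drug reinforcement constant, responding for the drug in this sketch falls as reinforcement from other sources rises, which is the pattern the concurrent-availability studies describe.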

The matching law provides a framework for understanding how addiction might develop in certain contexts. For example, the ‘primrose path’ model, which developed from early experiments with the matching law84,85,86, suggests that addictive drugs have greater local utility than most competing alternatives, and therefore the addictive drug will almost always be selected when local utility is maximized (that is, when the individual uses a proximal frame of reference). However, characteristics of addictive drugs (such as tolerance and adverse physical and social effects associated with patterns of heavy use) reduce the value of both the drug itself and alternatives. Thus, as choices accumulate, the value of both options declines over time, but the value of the addictive drug remains higher when maximizing local utility. When an organism maximizes global utility, each successive choice is considered in the calculation of the potential value of the next choice option (choices and associated rewards are bundled together into an aggregated outcome). Most salutary alternative choices to drug consumption are distributed choices with immediate effort costs (for example, work or exercise) and delayed rewards (for example, affluence or health). Thus, maximization of global utility would result in a pattern of choices favouring the alternative reward. According to proponents of the primrose path model, an analysis of the available commodities is needed to understand their effect on future choice84,85,86, and addiction might be driven in part by stable, between-individual differences in the choice strategies dictated by either local or global frames of reference85. However, what is lacking in these analyses, as in the reinforcer pathology model, is the recognition that drug use will be influenced by the substantial between- and within-individual differences in constraints on access to drugs versus alternative reinforcers across environmental contexts (Fig. 3).

Contemporary accounts of reinforcer pathology have underemphasized the matching law and overemphasized individual differences in reinforcing value and delay discounting40,87. However, there is historical and theoretical precedent to assume a conceptual and nested interconnectedness between these three primary behavioural economic variables (Box 1), which each explain behaviour under increasingly specific conditions30,41,42,88.

Translational evidence

Next, we review translational evidence that supports and extends the fundamental matching law65 and shows that, across multiple levels of analysis, enhancing access to alternative rewards meaningfully affects engagement with substances over and above other necessary theoretical mechanisms of addiction. We begin with a discussion of basic non-human animal and human research and then discuss applied clinical translations in humans that demonstrate how increasing alternatives can be used as a treatment mechanism and intervention.

Experimental non-human animal laboratory research

One influential set of studies (known as ‘rat park’) provides a potent demonstration of the effect of the environment on drug administration behaviour. Specifically, experimenters tested the influence of social exposure as an alternative reinforcer competing with morphine89,90. Rats randomized to either isolation or an enriched social environment (with running wheels and other activities) were given access to morphine for 57 days. In the experimental sessions, they were allowed to make concurrent choices between morphine and water. Rats in the enriched social environment consumed less of the morphine solution compared to isolated rats89,90. This general finding that alternative reinforcers reduce drug self-administration in concurrent choice tasks among non-human laboratory animals has been replicated across substances78,91,92,93,94,95,96 and alternative reinforcers (such as food, sucrose and running wheels)78,91,93,97.

The effects of alternatives do not seem to be limited to experimental paradigms in which rats choose between two rewards simultaneously. In another study98, rats were trained to self-administer alcohol after only one lever press until stable responding was achieved, after which rats lever-pressed for alcohol or sucrose in alternating sessions. Rats reduced their responding to alcohol (that is, pressed the lever fewer times) after being introduced to sucrose, even in the sessions when sucrose was not available, suggesting that the effects of non-alcohol alternative reinforcers extend beyond the immediate choice context.

Other laboratory animal research suggests that the order in which the reinforcers become available might influence the impact of alternative reinforcers on drug self-administration. In one study97, rats had access to d-methamphetamine self-administration for 21 experimental sessions. Access to a running wheel (an alternative reinforcer) was also available during sessions 1–14 for a first group, during sessions 8–21 for a second group, and during sessions 15–21 for a third group. Rats in the first group self-administered less methamphetamine across the first fourteen sessions compared to the rats in the second and third groups. Self-administration in the second and third groups decreased when the running wheel was introduced in sessions 8 and 15, respectively. When rats in the first group lost access to the running wheel, self-administration increased, but to similar levels as self-administration in the second and third groups when the running wheel was available. These findings suggest that early life access to alternative reinforcers might be protective against later substance use, even in the context of alternative reinforcement scarcity. However, this finding has not yet been extended to humans.

Finally, animal work has integrated other behavioural economic variables such as delayed reward discounting into models of alternative reinforcement99. Experimenters trained rats on self-administration for an alternative reinforcer (60 seconds of social interaction with another rat) and for cocaine. Next, the rats were given choices between these two reinforcers over ten sessions. Across all sessions, rats showed a robust preference for social interaction over cocaine. Furthermore, increasing the delay between the lever press and receipt of the social reward, and the effort required to obtain the social reward, increased cocaine self-administration, and there were individual differences in sensitivity to delay and effort contingencies.

It is important to note that although the effects of alternative reinforcers in the laboratory are robust, they vary across studies and experimental paradigms. Moreover, there is some evidence that neurobiological differences might moderate the extent to which laboratory animals show reductions in drug use after an alternative is introduced78.

Experimental human research

Human laboratory studies are consistent with non-human animal laboratory studies and show that, within a discrete choice context, introducing alternatives reduces drug use and self-administration. Early studies in the 1970s that controlled all features of an individual’s environment in residential alcohol laboratories found that availability of an enriched environment contingent upon moderate drinking (for example, social interaction) led to reduced drinking74,100,101,102. In a seminal experimental study103, individuals who drank alcohol but were not in alcohol treatment were offered choices between alcohol and money. The amount of money available (either 2¢ or 10¢ per choice) and the delay between choosing money and receiving it (either no delay, a 2-week delay, or an 8-week delay) were manipulated103. When there was no delay, participants chose alcohol 42% of the time when the alternative was 2¢, but chose alcohol only 29% of the time when the alternative was 10¢. When the delay to monetary reward increased, preference for alcohol increased103. These findings replicate established laboratory animal findings in humans and suggest that human alcohol choice behaviour is partially dependent on the contingencies of the choice environment, such as alternative reinforcement and the delay to reward. These findings have been extended to other drugs, such as cocaine104,105, cannabis106 and heroin107,108.
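One way to see why delaying the monetary alternative shifts choice toward alcohol is to discount the value of the money by its delay. The sketch below assumes a hyperbolic discounting function, V = A / (1 + kD), a form widely used in the delay discounting literature. The discount rate k and the subjective value assigned to a drink are hypothetical numbers chosen for illustration, not estimates from the study, and the deterministic choice rule is a simplification of the probabilistic choices that participants actually made.

```python
def discounted_value(amount, delay_days, k):
    """Hyperbolic discounting (assumed form): V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def prefers_alcohol(drink_value, money_cents, delay_days, k=0.05):
    """True when the immediate drink is worth more than the delayed money.

    k = 0.05 per day is a hypothetical discount rate.
    """
    return drink_value > discounted_value(money_cents, delay_days, k)


DRINK_VALUE = 5.0  # hypothetical subjective value of one drink, in cents

# Delays mirror the study design: no delay, 2 weeks, 8 weeks.
# The alternative is the larger (10-cent) monetary option.
choices = {d: prefers_alcohol(DRINK_VALUE, 10, d) for d in (0, 14, 56)}
print(choices)
```

Under these assumed values, money is preferred at no delay and at the 2-week delay, and the preference reverses to alcohol at the 8-week delay, which is the qualitative pattern the study reports.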

However, people might choose to use drugs even when alternatives are available if the value of the drug is sufficiently high109,110. For example, in one study, participants chose between different doses of cocaine and a fixed amount of money (US$6.00)110. As the dose of cocaine increased, choices to consume cocaine increased. These findings have been replicated across drugs107,109,111,112,113,114 and are consistent with an inverse relationship between drug reinforcement and the value of the alternative reinforcement available.

Importantly, human laboratory studies have high experimental demands and limited ecological validity. Moreover, there are individual differences in the availability of and engagement in alternative reinforcement in the natural environment that are not captured by human laboratory studies. Thus, researchers have developed self-report measures modelled after the matching law that assess the amount of substance-free reinforcement relative to substance-related reinforcement in a person’s life over the course of a month. The most popular measures assess the amount of time spent engaged in the activity (rate of reward receipt)115,116 and the subjective enjoyment of the activity (strength of the reinforcer)117,118. These measures can be combined to quantify substance-free and substance-related reinforcement, which can then be used to compute a relative reinforcement ratio: substance-related reinforcement/(substance-free reinforcement + substance-related reinforcement). Resource allocation measures quantify relative reinforcement by examining the ratio of a single class of resource (for example, time or money) allocated to substance-related activities relative to resources allocated to other activities. Studies using these measures find that diminished alternative reinforcement is associated with greater alcohol use25,119,120, smoking121, cocaine use122, and more general illicit drug use123,124,125 in adolescents124,125, in emerging adults25,126 and in clinical populations127,128.
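The relative reinforcement ratio described above can be sketched in a few lines of Python. This is a minimal illustration rather than a published scoring procedure: the activity records, the 0–4 rating scales and the frequency × enjoyment cross-product used to combine the two ratings are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    frequency: int           # hypothetical 0-4 rating of time spent in the past month
    enjoyment: int           # hypothetical 0-4 rating of subjective enjoyment
    substance_related: bool  # True if the activity involved substance use


def reinforcement(activities, substance_related):
    """Sum frequency x enjoyment cross-products for one class of activity."""
    return sum(
        a.frequency * a.enjoyment
        for a in activities
        if a.substance_related == substance_related
    )


def relative_reinforcement_ratio(activities):
    """substance-related / (substance-free + substance-related)."""
    related = reinforcement(activities, True)
    free = reinforcement(activities, False)
    total = related + free
    if total == 0:
        return None  # undefined when no reinforcement is reported
    return related / total


# Hypothetical month of activity ratings for one person.
month = [
    Activity("drinking with friends", 3, 4, True),
    Activity("exercise", 2, 3, False),
    Activity("time with family", 3, 2, False),
]
ratio = relative_reinforcement_ratio(month)
print(ratio)
```

In this made-up example, substance-related reinforcement (12) equals substance-free reinforcement (6 + 6), so the ratio is 0.5. Values closer to 1 indicate that a larger share of total reinforcement comes from substance-related activities.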

Applied clinical research

Of the third of American adults who will meet criteria for lifetime alcohol-use disorder, less than 25% will seek treatment and 70% will improve without any formal substance-use treatment129. Increasing alternative reinforcement has been identified as a mechanism of successful change in substance use among individuals experiencing natural recovery and in randomized clinical trials for established interventions and treatment130,131,132,133. In studies of natural recovery from alcohol-use disorders, individuals who reported lower relative monetary expenditure towards savings versus alcohol in the year prior to an attempt to reduce drinking were less likely to successfully reduce or abstain from drinking134,135,136. Further, several studies have demonstrated that stable long-term recovery from alcohol-use disorder is more likely when there are improvements across life-health domains that probably indicate enhanced availability of non-drug rewards137,138. Positive long-term outcomes among alcohol treatment recipients are accompanied by improvements in health, life satisfaction and functioning in domains often adversely affected by problem drinking that probably motivated and reinforced recovery processes and outcomes139. Although improvement in these domains during recovery does not explicitly quantify or measure alternative reinforcement, such improvements are consistent with the behavioural economic perspective. Indeed, the term ‘recovery capital’ has been coined to reflect the importance of the accrued personal, social, financial and cultural substance-free resources that aid the journey to recovery140,141, and definitions of recovery increasingly account for holistic improvements across valued life domains, in addition to reductions in drug use.

There are also several efficacious addiction treatment approaches that attempt to reduce alcohol and drug use by increasing both the response cost associated with alcohol and drug use and access to and engagement in substance-free activities142. These intensive outpatient treatments explicitly attempt to reduce substance use by: regularly monitoring alcohol and drug use using objective verification methods; systematically increasing the response cost of alcohol use (social and tangible rewards are administered contingent upon verified abstinence); and systematically increasing the availability of rewarding alternatives that are incompatible with substance use143. In the community reinforcement approach144, the latter is achieved by providing family and vocational counselling that increase social support and facilitate occupational skill building to increase the number of rewarding options in the individual’s environment. Contingency management145,146,147,148,149 is another effective tool for reducing substance use, particularly in the short term147, by delivering abstinence-contingent monetary vouchers that can be used to purchase goods and services that can enhance substance-free rewards (such as movie tickets, sporting equipment or money for hobbies)150. Treatment effects for contingency management are stronger than those for cognitive behavioural therapy for substance-use disorder151. Contingency management has also been modified as an adjunct for other treatments152 and to increase treatment attendance, with positive effects153. Likewise, there is extensive evidence supporting the efficacy of the community reinforcement approach alone154, and the combination of contingency management and community reinforcement155.

Another approach, known as Life Enhancement Treatment for Substance Use (LETS Act), uses behavioural activation, a treatment for depression grounded in increasing response-contingent positive reinforcement156, to increase alternatives to substance use. LETS Act is a group treatment delivered over eight sessions that focuses on generating, scheduling, engaging in and recording value-driven substance-free behaviours that serve to increase daily positive reinforcement157. In a randomized clinical trial, patients in residential treatment for substance use who received LETS Act reported fewer negative consequences related to substance use and a greater likelihood of abstinence 12 months later158.

Finally, the Substance-Free Activity Session is a single-session intervention that integrates behavioural economic and motivational interviewing elements to reduce delay discounting and increase engagement in goal-directed and enjoyable activities that are consistent with long-term goals. This approach has been used to supplement standard brief alcohol- or drug-focused interventions with emerging adults who report binge drinking159,160,161 and adults in alcohol treatment162. Specific Substance-Free Activity Session elements include discussion of future goals, personalized feedback on recent time allocated to activities that are consistent with those goals compared to time spent drinking or using drugs, episodic future thinking, and personalized feedback on locally available substance-free activities that are consistent with goals and interests (for example, doing homework, spending time with family or friends, or learning an instrument). This treatment targets behaviours (and bundles of alternative reward) at varying temporal windows across different levels of substance-use severity. In one large multi-site trial, young adults participating in the Substance-Free Activity Session who reduced their drinking showed sustained increased reinforcement from substance-free activities at 16-month follow-up. Moreover, post-intervention reductions in alcohol use and alcohol-related problems were mediated by changes in proportionate reinforcement from substance-use activities relative to total reinforcement160.

Collectively, these preclinical and clinical lines of research show that insights from concurrent choice tasks translate to applied clinical settings, and consistently reveal that increasing the availability of valued alternative reinforcers reduces drug choices and promotes long-term changes in substance use. Moreover, increasing alternative reinforcement is an evidence-based target for treatments for individuals across the severity spectrum.

Implications for public health

Although not explicitly guided by behavioural economics, a great deal of public health data supports the premise that greater availability of alternative reinforcers reduces epidemiological risk of harmful substance use. Individuals who experience homelessness, poverty, unemployment and/or lower educational attainment bear a disproportionate burden of alcohol-related health and social consequences, including alcohol-related mortality163,164,165. Although other factors are certainly implicated, evidence supports the idea that behavioural economic variables, particularly economic deprivation and scarcity of opportunity (that is, an environment lacking alternative reinforcers), are partially responsible. Individuals from lower socioeconomic backgrounds are more likely to work and reside in environments with fewer alternative sources of reward and resources with which to cope with stress, a higher density of alcohol and cannabis retail outlets and illicit drugs166,167 (with a greater concentration in Black neighbourhoods)168, and aggressive alcohol advertising campaigns169,170,171. One large study of adolescents from the Los Angeles area found that the longitudinal association between lower parental socioeconomic status and increased risk for drug use is mediated by lower engagement in enjoyable substance-free activities172. Moreover, stress and poverty among adolescents and young adults are associated with greater delayed reward discounting, which might contribute to a preference for drug-related rewards173, and neural responses to motivational reward anticipation might be blunted among children living in neighbourhoods with greater deprivation of natural rewards174. Thus, the key within-individual variables featured in the reinforcer pathology model (elevated delayed reward discounting and drug reward valuation) are themselves influenced by contextual variables.

Economic deprivation and scarcity of opportunities (and as a result, an environment lacking alternative reinforcers) are particularly prevalent for Black populations in the USA, who are more vulnerable to the harms of drugs and alcohol (even after controlling for use)175. Two sets of policy initiatives might be particularly relevant for understanding drug and alcohol-related harms in this community from the contextualized reinforcer pathology perspective. First, Black communities were explicitly targeted through Federal Housing Administration policies in several ways, such as refusal to insure mortgages for Black applicants, racial restrictive covenants, racial zoning and public housing176. In many cases, these policies prevented Black families from building real estate equity as economic capital (a reality that insidiously persists today177), and forced Black communities into polluted industrial zones with reduced access to quality education and healthcare176. This perpetuation of poverty reduces access to enriched environments with alternative reinforcers, such as parks and other recreational facilities, that can effectively compete with immediate and robust drug reinforcers.

Second, the set of policies officially known as the War on Drugs targeted communities of colour by shifting drug control policy toward punitive law enforcement approaches178. The War on Drugs included policies that classified and outlawed a range of drugs (some of which were, at the time, beginning to demonstrate therapeutic and medical potential179), set high legal penalties for small possession offences of drugs primarily used in the Black community, reduced the number of community mental health centres and re-funnelled government spending toward law enforcement (resulting in the militarization of police)178, and intentionally spread misinformation about the harms of drugs and drug users. These policies resulted in high rates of felony incarceration among Black Americans, who are incarcerated at five times the rate of white Americans180. These policies converge to specifically stigmatize Black drug users and to decrease familial economic stability and limit access to high-paying jobs and other rewarding alternatives among Black populations. Consistent with behavioural economic theory, these reductions in access to alternative reward might contribute to drug use15,69. The historical economic deprivation and scarcity described above probably contribute to stress, reduced access to health care, and more interactions with law enforcement. These factors might, in turn, account for the fact that, despite lower overall drinking levels, Black Americans who do drink show greater relative levels of alcohol problems and alcohol-use disorder than does the rest of the population181 (Box 2).

Variability in rates of county-level drug-related mortality provides another excellent illustration of the public health implications of alternative reinforcement. Drug-related deaths are not equally distributed across the USA but are instead concentrated in certain regions of the country182. Drug overdose deaths in 2006–2015 were most likely to occur in Appalachia, Oklahoma, the northeastern USA and New Mexico, and less likely to occur in the midwestern and the southern states183. Alcohol overdose deaths were also high in the western USA, particularly among Native American populations183. Importantly, although drug supply, including from prescribed opiate pain killers and commercial alcohol outlets, is certainly a substantial factor, it does not fully explain the mortality in these counties. For example, counties with large Native American populations in New Mexico and Oklahoma had greater rates of drug overdose in 2006–2015 (ref. 182), even though these counties had comparable or lower rates of opioid overprescribing compared to surrounding counties184. These data suggest that although opioid prescribing rates are important, between-county variability might be further explained by economic and social characteristics associated with access to reward. For example, greater economic distress, housing distress (rent taking >30% of household income), and family distress are associated with higher drug-related mortality, whereas a higher number of religious establishments and a diversified economy are associated with lower drug-related mortality182.

Variability in rates of county-level drug-related mortality demonstrates the impact of USA policy on drug- and alcohol-related harms. The history of the colonization and genocide of the Native American population during western expansion in the USA, in addition to ongoing USA policies that continue to marginalize Native American people, has resulted in a systematic lack of educational and occupational opportunities, and, in many cases, the disintegration of traditional Native American culture185. The lack of opportunity and disintegration of culture diminish the opportunity to accumulate valuable alternative reinforcers that effectively compete with substance use, which explains, from the contextualized behavioural economic perspective, the high rates of drug and alcohol use and mortality among Native American populations186,187. In Appalachia and the northeastern USA, there has been a large decline in critical industries that previously supported the regions economically188. In many cases, this has led to a lack of availability of meaningful work and decreased financial resources with which to attain alternative substance-free reinforcement (such as hobbies, outdoor green spaces and travel/leisure) that is life-enhancing and which might make the difference to whether someone chooses to use drugs.

Fortunately, some public health evidence points to possible solutions. In the 1990s, Icelandic adolescents reported very high rates of substance misuse189. In response, a population-level prevention programme aimed at reducing substance misuse among adolescents and young adults was implemented190. This programme entailed increasing costs of substance use (for example, national media campaigns discouraging smoking; a positive peer influence campaign to discourage smoking; a national ban on all tobacco and alcohol advertising) while also increasing access to alternatives (for example, organized youth activities)191. Rates of substance use among Icelandic adolescents plummeted from 1997 to 2014, alongside increases in primary prevention factors such as parental monitoring and engagement in organized sports192. Because of this programme, Iceland was the only country among 36 European countries participating in the European School Survey Project on Alcohol and Other Drugs (ESPAD) that demonstrated consistent declines in substance use among adolescents193.

Collectively, these examples highlight how patterns at the population level are consistent with fundamental behavioural principles related to the importance of alternative reinforcers for reducing harmful substance use. A contextualized reinforcer pathology model that fully accounts for both within-individual and environmental contingencies might help to inform public health initiatives to reduce substance use.

Conclusions

The findings reviewed here demonstrate that, across levels of analysis, alternative reinforcement is inversely related to substance use and serves as a critical factor in maintaining motivation and as a mechanism of behaviour change that can be targeted in intervention and prevention. Importantly, alternative reinforcement fits within and extends beyond behavioural economic models of addiction. Specifically, according to a contextualized reinforcer pathology model, reductions in substance use can be attributed to a shift in the cost/benefit analysis driven by an increase in the cost of the substance, an increase in the value of alternatives, or a widening of the temporal window of value allocation, all of which have some impact on one another. For some individuals, a change in circumstances might reduce the value of a substance (for example, alcohol use might decline after leaving college owing to a reduction in the social reinforcement associated with alcohol use). Other individuals might reduce their substance use owing to the rising costs of use (for example, a spouse threatening divorce) that begin to outweigh the benefits derived from using. Still others might reduce substance use as they become increasingly involved with alternatives (such as jobs, families or exercise) that introduce an opportunity cost of use. Finally, consistent with the ‘primrose path’ model15, some individuals might have trouble reducing their substance use because the direct effects of drugs can lead to diminishing engagement in and availability of alternative reinforcement, progressively resulting in reward impoverishment that increases the likelihood of seeking substance reinforcement. These phenomena reflect both within-individual and between-individual level constructs operating in parallel to neuroadaptations that occur with persistent substance use in neurobiological models.

Emphasizing alternative reinforcers in behavioural economic models reframes choice models of addiction as contextual models. Decisions are made in a specific context, with the parameters of the context defined within continuums of time and space. This contextual approach shifts from a focus on individual difference variables (for example, absolute degree of delay discounting) to a focus on the process by which contextual variables influence within-individual variables (for example, demand and delayed reward discounting) over time. The value of a substance is not a fixed property, but rather is systematically influenced by contextual factors in the environment50,56,57 according to a temporal periodicity194. The value of a substance can also be modified through intervention approaches that target both the environmental context — specifically to increase the response cost associated with drug use and reduce the response cost associated with alternatives — and within-individual variables (such as demand and discounting).

A contextual approach emphasizes the impact of the environment on behaviour and clarifies that choice models should not be conflated with a moral model of addiction. The moral model (grounded in moral Puritanism) articulates the cause of substance use as a conscious, volitional choice emerging from within a person for immediate pleasure over more societally acceptable activities (even if the choice comes at a high cost). By contrast, the contextualized reinforcer pathology model suggests that the cause of substance use is a set of temporally extended external contingencies, such as the relative availability and response cost associated with drugs versus alternatives, that contributes to patterns of substance use over time. This bidirectional model also emphasizes that patterns of drug use affect both the choice context and the decision-making processes that contribute to addiction.

Finally, contextualized reinforcer pathology provides a theoretical framework for widespread implementation of prevention, clinical and public policy initiatives that increase the availability of alternative reinforcement across levels of analysis. At the individual level, clinical interventions that target alternative reinforcement demonstrate robust efficacy147,158,160. A critical next step is the effective dissemination of these interventions, and some, such as contingency management, have begun to be incorporated into mainstream treatment settings195. At the level of public health, the sociopolitical environment has facilitated increases in substance use through contemporary adverse economic conditions and historic trends of intentional isolation, exclusion from meaningful alternative reinforcement and occupational opportunity, and economic constraints. In both cases, the adverse impact on substance use can be explained through the framework of contextualized reinforcer pathology and leads to the public policy recommendation of supporting access to salutary and meaningful alternative reinforcers.