Introduction

Drug addiction is increasingly viewed as a progression through a series of transitions, from initial drug use (when a drug is taken voluntarily because of its hedonic effect) to loss of control over this behavior, such that it ultimately becomes compulsive1. Relapse to drug seeking/taking following detoxification and/or abstinence is a major impediment in the treatment of addiction2. Stimuli that predict drug availability or are associated with drug reward can acquire powerful motivational properties, which are believed to induce craving and physiological arousal as well as relapse to drug seeking/taking3, 4, 5, 6, 7.

Initial drug use is clearly a key first step in the development of drug abuse and addiction; it can also be an important contributor to later abuse by affecting individuals physiologically, psychologically, and socio-culturally8, 9. Converging evidence indicates that early experience of drug use can trigger enduring molecular and anatomical changes in the brain, which may contribute to subsequent physical or behavioral consequences. A single administration of cocaine alters synaptic plasticity in the ventral tegmental area for several days10. A single exposure to amphetamine or morphine is sufficient to induce long-term behavioral sensitization and associated neuroadaptations11, 12. Rats given a single cocaine self-administration experience exhibited drug seeking up to 1 year after abstinence13. Previous research used passive drug administration or intensive self-administration schedules to investigate the effects of initial drug use; outside the laboratory, however, recreational drug use in the early stage of addiction typically occurs at low frequency, and the voluntary activities involved in procuring a drug differ from those required to take it14, 15, 16. To capture these behavioral characteristics of initial drug use, we developed a modified heterogeneous two-chained paradigm to model the initial heroin experience.

In the present study, we aimed to determine whether animals could acquire two-chained operant skills for heroin infusion during a low-frequency, short-term training phase and whether the initial heroin experience could exert a robust, long-term influence on drug-seeking behavior after one month of abstinence.

Materials and methods

Animals

Male Sprague-Dawley rats weighing 200–250 g were obtained from the Laboratory Animal Center, Shanghai Institute of Materia Medica. Behavioral experiments started when the rats weighed 280–320 g. The rats were housed individually in a temperature-controlled (23±2 °C) and humidity-controlled (50%±5%) room with food and water available ad libitum, and were kept on a reverse 12 h light/12 h dark cycle. Handling (at least 1 min per rat per day) began immediately upon receipt of the animals.

Before the beginning of surgery, each animal was randomly assigned to one of two groups. All treatments of the rats were performed in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. The procedures were approved by the local Committee of Animal Use and Protection.

Drug

Heroin (diacetylmorphine hydrochloride) was obtained from the National Institute of Forensic Science (Beijing, China). Heroin was freshly dissolved in sterile 0.9% physiological saline at a concentration of 0.3 mg/mL. Drug or vehicle solution was delivered intravenously at a rate of 20 μL per second.
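As a rough illustration of how the stock concentration and pump rate above relate to the per-infusion dose of 120 μg/kg used in the training procedure, the following sketch computes an infusion volume and duration for a given body weight. It assumes that the infusion volume was scaled to each animal's body weight, which the text does not state explicitly; the function and variable names are hypothetical.

```python
# Sketch only: assumes infusion volume is scaled to body weight (not stated in the text).
DOSE_UG_PER_KG = 120.0      # per-infusion heroin dose from the training procedure
CONC_UG_PER_ML = 300.0      # 0.3 mg/mL stock solution
PUMP_RATE_UL_PER_S = 20.0   # intravenous delivery rate


def infusion_plan(body_weight_g: float) -> dict:
    """Return dose (ug), volume (uL) and pump run time (s) for one infusion."""
    dose_ug = DOSE_UG_PER_KG * body_weight_g / 1000.0
    volume_ul = dose_ug / CONC_UG_PER_ML * 1000.0
    duration_s = volume_ul / PUMP_RATE_UL_PER_S
    return {"dose_ug": dose_ug, "volume_ul": volume_ul, "duration_s": duration_s}


if __name__ == "__main__":
    # Body weights spanning the 280-320 g range at the start of the experiment.
    for weight_g in (280, 300, 320):
        plan = infusion_plan(weight_g)
        print(
            f"{weight_g} g: {plan['dose_ug']:.1f} ug, "
            f"{plan['volume_ul']:.0f} uL, {plan['duration_s']:.1f} s"
        )
```

Under these assumptions, a 300 g rat would receive 36 μg of heroin in 120 μL over about 6 s.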

Surgery

Intravenous catheterization

The catheters were constructed from guide cannulae, silicone tubing, dental cement and plugs. The rats were anesthetized with sodium pentobarbital (55 mg/kg) in combination with atropine (0.4 mg/kg). The catheters, previously sterilized in 70% alcohol, were implanted through the right jugular vein with the proximal end reaching the atrium; the catheter was then routed dorsally over the right shoulder and fixed between the scapulae. The catheters were flushed daily with 0.1 mL of an antibiotic solution (penicillin sodium, 800 000 units dissolved in 2 mL of sterile 0.9% saline) and thereafter daily with 0.1 mL of a heparin solution (50 units/mL in sterile 0.9% saline).
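The amounts delivered per catheter flush follow directly from the stated concentrations. The short sketch below works through that arithmetic; it assumes the powder adds no appreciable volume to the 2 mL of saline, and the helper name is hypothetical.

```python
def units_per_flush(total_units: float, solvent_ml: float, flush_ml: float) -> float:
    """Units delivered in one flush, ignoring any volume added by the powder."""
    concentration = total_units / solvent_ml  # units per mL
    return concentration * flush_ml


# Penicillin sodium: 800 000 units in 2 mL saline, 0.1 mL per flush.
penicillin_units = units_per_flush(800_000, 2.0, 0.1)

# Heparin: 50 units/mL, 0.1 mL per flush (equivalent to 50 units in 1 mL).
heparin_units = units_per_flush(50, 1.0, 0.1)

if __name__ == "__main__":
    print(f"Penicillin per flush: {penicillin_units:.0f} units")
    print(f"Heparin per flush: {heparin_units:.0f} units")
```

This gives about 40 000 units of penicillin and 5 units of heparin per 0.1 mL flush.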

Apparatus

The apparatus for two-chained operant training was a modified version of the Ettenberg runway17 (Figure 1A). It comprised a start box (23 cm×20 cm×44 cm), a straight alley (160 cm×12 cm×44 cm), a goal box (23 cm×20 cm×44 cm), a magnet track, a flow-through swivel and a drug delivery system. The start and goal boxes were each separated from the alley by a sliding door. Inside the start box, two identical nose pokes on opposite walls were designated as active or inactive. A blue light inside the nose poke could be illuminated to serve as the visual discriminative stimulus (DS). In addition, a tone generator located in the ceiling provided an auditory DS. The subject's location within the apparatus was detected by three infrared photodetectors embedded in the walls along the length of the runway. The experiments were controlled by an operant behavior software program from Anilab Software & Instrument Co, Ltd (China).

Figure 1

(A) Schematic of the two-chained procedure. (B) Timeline of the experimental procedures.

Experimental procedures

Two-chained operant training

Two groups of rats were used in this experiment. After at least 3 d of recovery from surgery, the rats were trained to self-administer heroin (120 μg/kg per infusion, iv, n=11) or saline (n=8) under a heterogeneous chained schedule of two trials per day. At the beginning of each trial, the rat was placed inside the start box with the cannula connected to the infusion system through a flow-through swivel. The trial started when the rat completed the required number of responses on the active nose poke, which triggered illumination of the blue light and a 2 s intermittent tone as the discriminative stimulus (DS) and opened the start box door. The animal was then free to travel along the length of the alley to the goal box. The crossing duration between the two doors (as detected by breaking the IR beams near the doors; see Figure 1A) was recorded as Travel Time. When the rat entered the goal box, as detected by a photodetector inside the box, the door was closed and an automatic iv infusion of heroin or vehicle was delivered. The latency to enter the goal box from the end of the straight alley was recorded as Enter Latency. The trial ended after the rat had stayed inside the goal box for 5 min. We defined active nose-poke responses in the start box (which gave the rat the opportunity to approach the location where the drug was available) as drug-seeking behavior, and entry into the goal box (which led directly to receipt of the drug) as drug-taking behavior.
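The two runway measures defined above can be expressed as differences between detector timestamps. The sketch below is illustrative only: the event names are hypothetical, the timestamps in the usage example are invented for demonstration, and it assumes that "the end of the straight alley" corresponds to the IR beam near the goal-box door.

```python
from dataclasses import dataclass


@dataclass
class TrialEvents:
    """Hypothetical timestamps (s) from the start of a trial."""
    start_door_beam_s: float  # IR beam near the start-box door is broken
    goal_door_beam_s: float   # IR beam near the goal-box door is broken
    goal_box_entry_s: float   # photodetector inside the goal box is triggered

    def __post_init__(self) -> None:
        # Events must occur in runway order for the measures to be meaningful.
        if not (self.start_door_beam_s <= self.goal_door_beam_s <= self.goal_box_entry_s):
            raise ValueError("detector events are out of order")


def travel_time(ev: TrialEvents) -> float:
    """Crossing duration between the two doors (Travel Time)."""
    return ev.goal_door_beam_s - ev.start_door_beam_s


def enter_latency(ev: TrialEvents) -> float:
    """Delay between reaching the end of the alley and entering the goal box."""
    return ev.goal_box_entry_s - ev.goal_door_beam_s


if __name__ == "__main__":
    ev = TrialEvents(start_door_beam_s=2.0, goal_door_beam_s=14.5, goal_box_entry_s=17.0)
    print(f"Travel Time: {travel_time(ev):.1f} s, Enter Latency: {enter_latency(ev):.1f} s")
```

Keeping the two measures separate reflects the distinction drawn in the procedure between traversing the alley and committing to goal-box entry.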

The animals were trained under a fixed-ratio, two-chained schedule: the rat needed to respond n times on the active nose poke to open the start door (FRn), and once it had traveled through the straight alley and entered the goal box, heroin or saline solution was delivered once. The variable n was set to 1 during the first four trials and then increased gradually from 2 to 5, so the schedule can be described as FR1(1), FR1(1), FR1(1), FR1(1), FR2(1), FR3(1), FR4(1), and FR5(1) (Figure 1B). The interval between the two trials of each day was 6 h.
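The training schedule described above can be laid out programmatically. The following is a minimal sketch that maps the eight trials onto four training days with two trials per day and a 6 h gap; the data layout and names are hypothetical and are not taken from the control software used in the study.

```python
FR_REQUIREMENTS = [1, 1, 1, 1, 2, 3, 4, 5]  # nose pokes needed to open the start door
TRIALS_PER_DAY = 2
INTER_TRIAL_INTERVAL_H = 6


def build_schedule() -> list:
    """Return one dict per trial with its day, slot, offset and FR label."""
    schedule = []
    for index, fr in enumerate(FR_REQUIREMENTS):
        day = index // TRIALS_PER_DAY + 1
        slot = index % TRIALS_PER_DAY + 1
        schedule.append(
            {
                "trial": index + 1,
                "day": day,
                "trial_in_day": slot,
                # Hours after the first trial of the same day.
                "offset_h": (slot - 1) * INTER_TRIAL_INTERVAL_H,
                "fr": fr,
                # FRn(1): n seeking responses, then one infusion on goal-box entry.
                "label": f"FR{fr}(1)",
            }
        )
    return schedule


if __name__ == "__main__":
    for row in build_schedule():
        print(
            f"Day {row['day']}, trial {row['trial_in_day']} "
            f"(+{row['offset_h']} h): {row['label']}"
        )
```

With these settings the FR5(1) trial falls on the second trial of day 4, matching the four days of training described before the abstinence phase.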

Abstinence phase

After four days of two-chained schedule training, all rats were kept in their individual home cages for a period of 1 month before reinstatement testing.

Contextual reinstatement test and extinction

Heroin-seeking behavior was tested under extinction conditions for 10 min in the start box with the start door closed. Responses on the active and inactive nose pokes were recorded but had no programmed consequences. The extinction tests were conducted twice a day for 3 d.

Discriminative stimuli reinstatement test

Heroin-seeking behavior was tested under the same conditions as the extinction test, except that each response on the active nose poke resulted in a 2 s contingent presentation of the light and tone discriminative stimuli.

Statistical analyses

First active response latency and the latency to meet the response requirement and open the start door were assessed using a two-way (group×trial) analysis of variance (ANOVA) with group as the between-subject factor. Data from the reinstatement tests were analyzed separately for responding on the active and inactive nose pokes. All results are expressed as mean±SEM, and comparisons between groups were made using one-way ANOVA.
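For readers who want to reproduce the descriptive statistics and the one-way comparison, the sketch below computes mean, SEM and the one-way ANOVA F statistic in plain Python. It is an illustration of the standard formulas, not the authors' analysis code; the function names are hypothetical and the numbers in the usage example are placeholders, not data from this study. Converting F to a P value requires the F distribution and is left to a statistics package.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sem(values):
    """Standard error of the mean, using the sample (n-1) standard deviation."""
    m = mean(values)
    variance = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return math.sqrt(variance) / math.sqrt(len(values))


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for two or more independent groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Placeholder response counts for two groups (not study data).
    group_a = [1.0, 2.0, 3.0]
    group_b = [4.0, 5.0, 6.0]
    f_stat, df_b, df_w = one_way_anova(group_a, group_b)
    print(f"group A: {mean(group_a):.2f} ± {sem(group_a):.2f}")
    print(f"group B: {mean(group_b):.2f} ± {sem(group_b):.2f}")
    print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

For two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic, so the two tests lead to the same conclusion.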

Results

Heterogeneous two-chained schedule training

The apparatus used in this study was adapted from Ettenberg's runway17 by modifying the start box, in which two nose pokes were introduced as the active and inactive response options (Figure 1A). Almost all animals in this experiment improved their first-chain performance over the 4 d of training, as measured by first active response latency (Figure 2A). A two-factor (group×trial) ANOVA revealed a significant main effect of trial (F7,171=9.929, P<0.01), but no significant main effect of group (F1,171=0.116, P>0.05) or group×trial interaction (F7,171=3.47, P>0.05). Analysis of the start door open latency also indicated a significant main effect of trial (F7,171=6.604, P<0.01) and no significant main effect of group (F1,171=0.896, P>0.05) or group×trial interaction (F7,171=0.366, P>0.05) (Figure 2B). The second-chain performance, alley running, has been used successfully as an operant to demonstrate the reinforcing effects of drugs of abuse in rodents18, 19, 20. In general, heroin administered intravenously in the goal box reinforced running, as measured by travel time and enter latency. Heroin-trained rats showed significantly shorter alley travel times (F1,18=11.23, P<0.01; Figure 2C) and goal-box entry latencies (F1,18=9.45, P<0.01; Figure 2D) than saline-trained rats.

Figure 2

Acquisition of the two-chained operation in the training phase. (A) Both groups of animals improved their performance in opening the start door by responding on the active nose poke, as measured by a reduction in the first active response latency, but there were no significant differences between the two groups at any trial. (B) Mean±SEM latency to open the start door during the training phase. (C) Mean±SEM travel time and (D) goal-box enter latency decreased significantly in the last training trial. cP<0.01 compared to the control (saline) group.

Context-induced reinstatement of drug-associated behavior

The results of context-induced reinstatement of heroin-seeking behavior are shown in Figure 3. A one-way ANOVA revealed that heroin-trained rats made significantly more active responses than saline-trained rats (F1,18=5.169, P<0.05). Analysis of inactive responding (a potential measure of general activity or nonspecific responding) revealed no significant difference between the two groups (F1,18=3.825, P>0.05). Thus, the elevated active responding reinstated by the first-chain environment was goal-directed and could not be attributed to an increase in general activity.

Figure 3

Context-induced drug-seeking behavior after one month of abstinence. Mean±SEM responses on the previously active (A) and inactive (B) nose pokes during the 10 min reinstatement test after one month of abstinence. bP<0.05 compared to the control (saline) group.

Discriminative stimuli induced reinstatement of drug-seeking behavior

After 3 d of extinction training, all rats showed low levels of nose-poke responding during the last extinction test (Figure 4A, 4B). There was no significant difference between the two groups (F1,18=0.180, P>0.05 and F1,18=0.391, P>0.05, respectively, for active and inactive nose poke responses). Figure 4 (C, D) illustrates the heroin-seeking behavior induced by discriminative stimuli. Heroin-trained rats emitted significantly more active responses compared with saline-trained rats (F1,18=10.324, P<0.01). Analyses of inactive responding revealed no significant effect of the group (F1,18=3.352, P>0.05).

Figure 4

Discriminative stimuli (DS)-induced drug-seeking behavior after a period of extinction. Mean±SEM responses on the previously active and inactive nose pokes in the last extinction session (A, B) and in the DS-induced reinstatement test (C, D). cP<0.01 compared to the control (saline) group.

Discussion

Earlier experience of drug use is associated with a subsequent increased risk of drug dependence problems. Although the long-term effects of addictive drugs have been extensively investigated with respect to electrophysiology, neurochemistry and behavior, the present study demonstrated that early experience of heroin use under the two-chained schedule continued to influence drug-seeking behavior after one month of abstinence. Briefly, animals in the two groups received similar two-chained schedule training, and there was no significant difference in first active response latency between the groups during training. After one month of abstinence, heroin-reinforced animals exposed to the context (the start box environment) or to contingent discriminative stimuli made more active responses than control animals, which had previously received saline in the goal box. These results indicate that a forced 30 d heroin-free period was not enough to erase the influence of the initial drug experience; conditioned stimuli could still effectively reinstate drug-seeking behavior after long-term abstinence or extinction. These findings are consistent with the hypothesis that an early-stage drug experience could contribute to the later development of drug addiction.

Drug seeking in the present animal model can be considered the sustained effort made to gain access to an addictive drug. This drug-seeking behavior could be reinstated robustly by exposure to the discriminative cues that had previously been associated with drug availability. Drug taking in this study refers to drug self-administration in which the drug is earned in the goal box after completion of a runway response. In humans, the activities involved in procuring a drug differ from those required to take it, and a motivational distinction between drug seeking and drug taking has been demonstrated in animal models15. We argue that the initial experience of heroin use can be modeled by a two-link chained schedule in which performance of a drug-associated response (active nose poke) provides the opportunity to perform a drug-taking response (running for heroin). A model of the initial heroin experience should also reflect its low frequency and voluntary nature. We adopted the nose-poke response to control the opening of the start door as the first chain because animals can learn to obtain food rewards with a nose-poke operant response, and rats acquire the contingency between nose poke and reinforcement quickly21. Although no reward was delivered in the first chain, our results showed that animals acquired this operant skill well over 4 d of training. The use of Ettenberg's runway as the second chain (taking chain) of the training schedule is key to the implementation of this two-chained paradigm. Previous work has indicated that the runway operant is a particularly appropriate tool for investigating the behavior of animals working for drugs, and that the runway response for heroin administration is easier for rats to acquire than other operant responses (eg, lever press)17. Here, we found that opiate-reinforced rats traveled the alley more quickly and showed less hesitation in entering the goal box than controls at the end of the training phase, which is consistent with previous results from runway procedures18, 20. Our results indicate that combining nose-poke and runway operant responses to model behavior in the initial phase of heroin use provides greater access to the motivational states underlying these behaviors and could be used to investigate the neurobiological mechanisms of seeking and taking behavior, respectively, during initial heroin use.

In heroin- or cocaine-free individuals, drug craving and relapse to drug seeking/taking can be triggered by exposure to the self-administered drug, by stimuli previously associated with drug seeking/taking, or by exposure to stressors16, 22, 23, 24. Over the past decade, the reinstatement model has been widely accepted for studying factors that affect relapse to heroin and cocaine seeking. To determine the effect of the initial heroin experience on drug-seeking behavior after prolonged abstinence, active responding reinstated by conditioned stimuli in the start box was compared between the heroin-reinforced and control groups. Our results showed that active responding was reinstated by the context (the first-chain-associated environment) after a one-month heroin-free period. In previous studies of conditioned place preference, animals' preference for a morphine- or cocaine-paired environment was maintained for more than 4 weeks25, 26. In humans, exposure to environmental contexts previously associated with drug seeking/taking often provokes relapse to drug use, even after a long drug-free period. Our results are consistent with previous reports that a long period of abstinence is not enough to prevent relapse induced by certain contexts and cues6, 27, 28, 29. In the present study, context-induced drug-seeking responses declined to low levels after six extinction sessions, supporting the idea that extinction reliably reduces drug-seeking behavior that has not yet become habitual or compulsive. However, it is also worth noting that extinction did not abolish the previously acquired knowledge but rather masked it temporarily30, and drug-seeking behavior could be reinstated in other ways, for example, by discriminative stimuli.

In summary, the results of the current study demonstrate that early experience of drug use is sufficient to influence drug-seeking behavior after one month of abstinence. Drug craving and relapse to drug seeking/taking may be attributed to complex molecular, neurochemical and even morphological adaptations in neurocircuits. The remodeling of reward, motivation, memory and control neurocircuits during initial drug use may result in a long-lasting salience value for the drug and its associated cues31, 32, 33, 34, 35, 36. Future studies will therefore be needed to investigate the molecular and neurocircuit mechanisms underlying these complex adaptations in order to understand how addiction develops.

Author contribution

Bin LU designed and performed the research and wrote the paper. Mu LI helped with surgery and animal training. Yuan-yuan HOU and Jie CHEN assisted with research. Zhi-qiang CHI provided consultation. Jing-gen LIU designed the research and revised the paper.