The neurochemical dopamine is the central player in all addictions. Dopamine comes up in every article on addiction, reward, motivation, or learning. Dopamine dysregulation is at the heart of porn addiction, cravings and withdrawal symptoms. Restoring normal dopamine function and sensitivity is a key to porn recovery.
This section contains both lay articles for the general public and research articles. If you are not an expert in addiction, I suggest starting with the lay articles, which are marked with an "L".
September 30, 2013 · by Talia Lerner
Dopamine neurons are some of the most studied, most sensationalized neurons out there. Lately, though, they’ve been going through a bit of an identity crisis. What is a dopamine neuron? Some interesting recent twists in dopamine research have definitively debunked the myth that dopamine neurons are all of a kind – and you should question any study that treats them as such.
There are many ways in which a dopamine neuron (defined as a neuron that releases the neurotransmitter dopamine) is not just a dopamine neuron. I’ll focus on three really cool ways here:
Before I describe these exciting new findings, though, let me give you the standard Neuroscience 101 introduction to dopamine neurons. This influential theory of dopamine neuron function comes to us from Wolfram Schultz and colleagues’ 1997 Science paper, “A Neural Substrate of Prediction and Reward.” It showed that dopamine neurons, which fire at some background rate, fire more in response to unpredicted, but not predicted, rewards. Additionally, if you’re expecting a reward and don’t get it, the dopamine neurons fire less. This finding led Schultz et al. to propose that dopamine neurons encode “reward prediction error.” That is, they tell you whether or not things are as good, better, or worse than you expected. Schultz et al. go on to state “The responses of these neurons are relatively homogeneous—different neurons respond in the same manner and different appetitive stimuli elicit similar neuronal responses. All responses occur in the majority of dopamine neurons (55 to 80%).”
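To make the idea of a prediction error concrete, here is a minimal sketch in Python. It is not taken from Schultz et al.; it assumes a simple Rescorla-Wagner style learner, and the function names, the learning rate of 0.5, and the reward values are all invented for illustration.

```python
# Illustrative sketch only: a Rescorla-Wagner style learner.
# All names and numbers here are assumptions, not values from Schultz et al.


def prediction_error(actual_reward, predicted_reward):
    """Positive if better than expected, zero if as expected, negative if worse."""
    return actual_reward - predicted_reward


def learn(rewards, learning_rate=0.5):
    """Update a reward prediction trial by trial; return the error on each trial."""
    predicted = 0.0  # the animal starts out expecting nothing
    errors = []
    for actual in rewards:
        delta = prediction_error(actual, predicted)
        errors.append(delta)
        predicted += learning_rate * delta  # nudge the prediction toward reality
    return errors


# Ten rewarded trials (reward = 1.0), then one trial where the reward is omitted.
errors = learn([1.0] * 10 + [0.0])
print("first reward (unpredicted):", errors[0])
print("tenth reward (predicted):  ", errors[9])
print("omitted reward:            ", errors[10])
```

Under these assumptions the error is large for the first, unpredicted reward, shrinks toward zero once the reward is predicted, and turns negative when an expected reward fails to arrive, mirroring the three firing patterns described above.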
The role of dopamine neurons as computers of reward prediction error remains a fascinating and worthy line of research, but if reward prediction error is ALL that dopamine neurons do, then what do we need 400,000-600,000 of them for?* Here’s a map of where the brain’s dopamine neurons are located (in a cross-section of a rodent brain):
Distribution of dopamine neuron cell groups A8-A16 in the adult rodent brain. Adapted from Björklund, A. & Dunnett, S. B. Dopamine neuron systems in the brain: an update. Trends in Neurosciences 30, 194–202 (2007).
*In humans. There are 160,000-320,000 in monkeys and just 20,000-45,000 in rodents.
Looking at this diagram, there already seem to be some gross anatomical distinctions among groups of dopamine neurons, which is why they are labeled A8-A16. There are also finer anatomical distinctions, which turn out to have not-so-subtle functional implications. In the first line of study I’ll focus on here, Lammel et al. set about distinguishing dopamine neurons in the ventral tegmental area (VTA, or A10 in the above picture) by their connectivity to other brain areas. Lammel et al. observed that there are at least two separable populations of dopamine neurons within the VTA. One population gets input signals from a brain area called the laterodorsal tegmentum and sends output signals to a brain area called the nucleus accumbens (call these LDT-dopamine-NAc neurons). The other population gets inputs from the lateral habenula and sends outputs to the prefrontal cortex (call these LHb-dopamine-PFC neurons). So what? Does the fact that these dopamine neurons are wired into different brain circuits matter at all for behavior? Lammel et al. showed that it does matter. When (using optogenetics!) they activated the inputs to LDT-dopamine-NAc neurons in mice, they found that the animals formed positive associations with the context in which they were stimulated. They chose to spend more time in the part of a box where they’d gotten the brain stimulation. In contrast, when Lammel et al. activated the inputs to LHb-dopamine-PFC neurons, the exact opposite was observed. Animals avoided a part of a box where they’d gotten the stimulation. In another study by the same group, when the mice naturally experienced something good or something bad, the strengths of these distinct circuits were modulated differentially. Mice given cocaine showed an increased strength of the LDT-dopamine-NAc pathway, but no change in the LHb-dopamine-PFC pathway. 
Mice given an irritant on their paw showed no change in the LDT-dopamine-NAc pathway, but an increased strength of the LHb-dopamine-PFC pathway.
Lammel et al.’s findings undermine Schultz et al.’s initial assertion that dopamine neurons are homogeneous. This revision likely occurred because of the increasing sensitivity of the tools available, which changed quite a bit from the 1990s to the 2010s. Newer and better tools, in combination with a little creativity, allowed Lammel et al. to distinguish subtleties that weren’t accessible to Schultz et al. In revealing these subtleties, Lammel et al. helped to demonstrate the hubris of believing that you’ve figured out a whole class of neurons because you see responses in 55-80% of a population, especially when you’re not entirely sure (or shouldn’t be) about the criteria you’ve used to define that population. (The question of defining dopamine neurons during in vivo neural recordings is a WHOLE other issue). All the credit in the world to Schultz et al. for lighting the fire of dopamine research, but it was more of a starting point than an end point.
Grouping neurons by the brain circuits they participate in makes a ton of sense if you’re trying to figure out how brain circuits work. But what if you’re trying to figure out the dopamine part of dopamine neurons? Most dopamine neuron research has assumed that when a dopamine neuron fires, it releases the neurotransmitter dopamine, a small molecule that looks like this:
In fact, that’s how we defined “dopamine neuron.” However, as is often the case in science, the situation turns out not to be so simple. In the second line of recent research I’ll discuss here, scientists showed evidence that dopamine neurons can co-release other neurotransmitter molecules, called glutamate and GABA, along with dopamine.
Actually, different subsets of dopamine neurons most likely predominantly co-release either glutamate or GABA. Studies by Hnasko et al. and Stuber et al. demonstrated that dopamine neurons in the VTA co-release glutamate. First, they noticed that many VTA dopamine neurons express a glutamate transporter called VGLUT2, a protein that packages glutamate for release from neurons. Did the presence of VGLUT2 mean that dopamine neurons were packaging glutamate in addition to dopamine? To look at this question, the scientists looked at the responses of neurons in the nucleus accumbens (one place that dopamine neurons send outputs to, see the discussion of Lammel et al. above) to dopamine neuron stimulation. Indeed, they observed fast, excitatory responses of nucleus accumbens neurons to stimulation of VTA dopamine neurons of a type that would be consistent with a glutamatergic rather than a dopaminergic response. These responses were blocked by antagonists of glutamate receptors but not by antagonists of dopamine receptors. Additionally, in mice genetically manipulated to lack VGLUT2 in dopamine neurons, no such responses were seen.
The co-release of glutamate may not occur in all dopamine neurons, though. As in Lammel et al.’s studies, connectivity matters. Stuber et al. observed that dopamine neurons in a neighboring area called the substantia nigra (A9), which sends outputs to the dorsal striatum, did not display evidence of glutamate release. That negative result is still controversial. Another group, Tritsch et al., did observe some evidence of glutamate release by substantia nigra dopamine neurons. Additionally, they demonstrated that these substantia nigra dopamine neurons also co-release yet another neurotransmitter: GABA. Oddly, however, substantia nigra dopamine neurons don’t express VGAT, the normal GABA transporter. Instead, Tritsch et al. found that VMAT, the vesicular transporter that loads dopamine into synaptic vesicles, can also transport GABA, packaging it for synaptic release along with dopamine. Tritsch et al.’s finding might generalize beyond substantia nigra dopamine neurons. As long as there’s some GABA around, anything expressing VMAT could potentially package and release that GABA as well. One key question that arises from Tritsch et al.’s study is exactly where and when the GABA in the substantia nigra is being synthesized. Nevertheless, it’s there.
The implications of glutamate and GABA co-release from dopamine neurons for the most part remain to be seen. The only reported behavioral effect is from the Hnasko et al. paper. They show that mice lacking VGLUT2 in dopamine neurons run around less in response to cocaine than normal mice. That’s it for now. If nothing else, it demonstrates how much more we have to learn about the phenomenon of transmitter co-release.
So far, we’ve seen that dopamine neurons can signal different things if they’re hooked up into different brain circuits, and that they can play their assigned role in a brain circuit at least in part using chemicals other than just dopamine. In the third line of research I’ll examine here, we’ll add yet another layer of really cool complexity to the picture: dopamine neurons can change the way that they participate in a brain circuit by changing whether or not they’re making and releasing dopamine at all. In this case, Dulcis et al. looked at a slightly different group of dopamine neurons from the ones I’ve been talking about until now, located in the hypothalamus. They noticed that the number of dopamine neurons in rats seemed to fluctuate with the length of “daylight” experienced by the rats. I put daylight in quotes because it’s not real daylight – just whether or not the lights are on in a very controlled laboratory setting. Most lab animals see 12 hours of light per day, but Dulcis et al. also tried just 5 hours per day or up to 19. Rats that experienced long days had fewer dopamine neurons in their hypothalamus, while rats that experienced short days had more. Upon further examination, they determined that the changes in the number of dopamine neurons in the different light conditions weren’t due to neurons dying and being born. The same neurons were actually there in all conditions, but they were switching their dopamine-ness on or off. It’s still unclear why light exposure causes these changes, or what the exact behavioral consequences are. Rats that had long days, and fewer dopamine neurons as a result, displayed depressive and anxious behaviors (keep in mind that rats are nocturnal and prefer the dark). So did rats whose hypothalamic dopamine neurons were killed with a toxin. 
However, if the dopamine neurons were killed with a toxin while the rats got 12 hours of light per day and then the rats were given only 5 hours of light per day, previously non-dopaminergic neurons were recruited to release dopamine and fewer depressive and anxious behaviors were observed. Pretty cool! And importantly, this work demonstrates that neurons we wouldn’t even have previously identified as dopamine neurons can transform under the right conditions. Some aspects of our brains are built to be stable, but many are changing all the time, allowing us to internalize and adapt to our experience.
After all these studies, what have we learned? To me, the big picture takeaway is that understanding the brain means appreciating complexity. To be slightly more specific, it means linking molecules and cells with circuits and behavior to provide definitions of biological entities that span modalities of study. No more grouping neurons solely by one neurotransmitter they can release. That grouping could sometimes still be relevant, but as we’ve seen in the above studies, not always. Thinking about redefining the group formerly known as dopamine neurons, we also need to take a look back at the decades of previous literature with the perspective that hindsight provides. It’s not that the data in older dopamine neuron studies are wrong, but the conclusions may not be quite what we thought they were. Which could be a good thing. Many arcane arguments about exactly what dopamine neurons encode may actually end up being settled by understanding that they do many different things in different contexts. Don’t be alarmed: it may seem confusing, but this is the very normal process of science maturing. Not only is the process normal, it is absolutely crucial. Scientists must constantly question and revise our definitions to reflect significant conceptual advances.
Definitions can be confusing. They can also be rather boring, and I worry that they too often drive people away from science. When I was a beginning biology student, I spent hours upon hours making flashcards to help myself memorize what seemed like endless definitions. I viewed it as a tedious but necessary initiation to the biologists’ club. Basically, although it kind of stunk, I told myself that I had to learn the vocabulary to be able to discuss higher order issues with working scientists. What I’ve come to appreciate as I’ve progressed further in my career is how nuanced those once seemingly black-and-white, right-or-wrong definitions are – how much subtlety and history is packed into them. Scientific definitions, like the definition of a dopamine neuron, don’t just provide a common language; they structure the very nature of our investigations. We require this structure in order to proceed with our experiments, but, as we do so, we also need to be aware of the ways in which these definitions can limit us. We compare defined groups to each other. We talk about group averages. So exactly which things are included in our groups can dramatically affect how our data look and what we decide they mean. Thus, we must always be aware of the biases inherent in our categorizations. Perhaps definitions are not so boring after all! Discussions of these caveats could spice up intro course material quite a bit while teaching students how to think like real scientists.
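Here is a small sketch in Python of how group membership can change what an average seems to say. The two subtypes and every number in it are hypothetical, invented purely for illustration; none of it comes from the studies discussed above.

```python
from statistics import mean

# Hypothetical changes in firing rate (spikes/s relative to baseline) after
# some stimulus. These values are made up for illustration, not real data.
responses = {
    "subtype_a": [-3.0, -2.5, -3.5, -3.0],  # neurons that pause
    "subtype_b": [3.0, 2.5, 3.5, 3.0],      # neurons that fire more
}


def pooled_mean(groups):
    """Average every neuron together, ignoring the subtype labels."""
    return mean(value for values in groups.values() for value in values)


def mean_by_group(groups):
    """Average within each subtype separately."""
    return {name: mean(values) for name, values in groups.items()}


pooled = pooled_mean(responses)
by_group = mean_by_group(responses)
print("pooled average:", pooled)
print("per-subtype averages:", by_group)
```

With these made-up numbers the pooled average suggests that the stimulus does nothing at all, while the per-subtype averages show two strong and opposite responses. Which picture you see depends entirely on how the groups were defined.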
The particular question of defining neuronal cell types actually turns out to be fairly timely. Just a couple weeks ago, the first interim report from the BRAIN initiative working group came out (see also Astra Bryant’s post on the topic). In it, nine high priority research areas for FY 2014 are outlined, the first of these being “generate a census of cell types.” The report recognizes the issues I’ve been discussing here:
There is not yet a consensus on what a neuronal type is, since a variety of factors including experience, connectivity and neuromodulators can diversify the molecular, electrical and structural properties of initially similar neurons. In some cases, there may not even be sharp boundaries separating subtypes from each other. Nonetheless, there is general agreement that types can be defined provisionally by invariant and generally intrinsic properties, and that this classification can provide a good starting point for a census. Thus, the census should begin with well-described large classes of neurons (e.g. excitatory pyramidal neurons of the cortex) and then proceed to finer categories within these classifications. This census would be taken with the knowledge that it will initially be incomplete, and will improve over iterations.
The answer to the question “What is a dopamine neuron?” isn’t quite forthcoming, but high-profile recognition of the question, and the funding that should follow it, is an important first step. Cheers to that.
Wednesday, July 3, 2013
In a brain that people love to describe as “awash with chemicals,” one chemical always seems to stand out. Dopamine: the molecule behind all our most sinful behaviors and secret cravings. Dopamine is love. Dopamine is lust. Dopamine is adultery. Dopamine is motivation. Dopamine is attention. Dopamine is feminism. Dopamine is addiction.
My, dopamine’s been busy.
Dopamine is the one neurotransmitter that everyone seems to know about. Vaughn Bell once called it the Kim Kardashian of molecules, but I don’t think that’s fair to dopamine. Suffice it to say, dopamine’s big. And every week or so, you’ll see a new article come out all about dopamine.
So is dopamine your cupcake addiction? Your gambling? Your alcoholism? Your sex life? The reality is dopamine has something to do with all of these. But it is none of them. Dopamine is a chemical in your body. That’s all. But that doesn’t make it simple.
What is dopamine? Dopamine is one of the chemical signals that pass information from one neuron to the next in the tiny spaces between them. When it is released from the first neuron, it floats into the space (the synapse) between the two neurons, and it bumps against receptors for it on the other side that then send a signal down the receiving neuron. That sounds very simple, but when you scale it up from a single pair of neurons to the vast networks in your brain, it quickly becomes complex. The effects of dopamine release depend on where it’s coming from, where the receiving neurons are going and what type of neurons they are, what receptors are binding the dopamine (there are five known types), and what role both the releasing and receiving neurons are playing.
And dopamine is busy! It’s involved in many different important pathways. But when most people talk about dopamine, particularly when they talk about motivation, addiction, attention, or lust, they are talking about the dopamine pathway known as the mesolimbic pathway, which starts with cells in the ventral tegmental area, buried deep in the middle of the brain, which send their projections out to places like the nucleus accumbens and the cortex. Increases in dopamine release in the nucleus accumbens occur in response to sex, drugs, and rock and roll. And dopamine signaling in this area is changed during the course of drug addiction. All abused drugs, from alcohol to cocaine to heroin, increase dopamine in this area in one way or another, and many people like to describe a spike in dopamine as “motivation” or “pleasure.” But that’s not quite it. Really, dopamine is signaling feedback for predicted rewards. If you, say, have learned to associate a cue (like a crack pipe) with a hit of crack, you will start getting increases in dopamine in the nucleus accumbens in response to the sight of the pipe, as your brain predicts the reward. But if you then don’t get your hit, well, then dopamine can decrease, and that’s not a good feeling. So you’d think that maybe dopamine predicts reward. But again, it gets more complex. For example, dopamine can increase in the nucleus accumbens in people with post-traumatic stress disorder when they are experiencing heightened vigilance and paranoia. So you might say, in this brain area at least, dopamine isn’t addiction or reward or fear. Instead, it’s what we call salience. Salience is more than attention: It’s a sign of something that needs to be paid attention to, something that stands out. This may be part of the mesolimbic role in attention deficit hyperactivity disorder and also a part of its role in addiction.
But dopamine itself? It’s not salience. It has far more roles in the brain to play. For example, dopamine plays a big role in starting movement, and the destruction of dopamine neurons in an area of the brain called the substantia nigra is what produces the symptoms of Parkinson’s disease. Dopamine also plays an important role as a hormone, inhibiting prolactin to stop the release of breast milk. Back in the mesolimbic pathway, dopamine can play a role in psychosis, and many antipsychotics for treatment of schizophrenia target dopamine. Dopamine is involved in the frontal cortex in executive functions like attention. In the rest of the body, dopamine is involved in nausea, in kidney function, and in heart function.
With all of these wonderful, interesting things that dopamine does, it gets my goat to see dopamine simplified to things like “attention” or “addiction.” After all, it’s so easy to say “dopamine is X” and call it a day. It’s comforting. You feel like you know the truth at some fundamental biological level, and that’s that. And there are always enough studies out there showing the role of dopamine in X to leave you convinced. But simplifying dopamine, or any chemical in the brain, down to a single action or result gives people a false picture of what it is and what it does. If you think that dopamine is motivation, then more must be better, right? Not necessarily! Because if dopamine is also “pleasure” or “high,” then too much is far too much of a good thing. If you think of dopamine as only being about pleasure or only being about attention, you’ll end up with a false idea of some of the problems involving dopamine, like drug addiction or attention deficit hyperactivity disorder, and you’ll end up with false ideas of how to fix them.
The other reason I don’t like the “dopamine is” craze is because the simplification takes away the wonder of dopamine. If you believe “dopamine is,” then you’d think that we’ve got it all figured out. You begin to wonder why we haven’t solved this addiction problem yet. Complexity means that the diseases associated with dopamine (or with any other chemical or part of the brain, for that matter) are often difficult to understand and even more difficult to treat.
By emphasizing dopamine’s complexity, it might feel like I’m taking away some of the glamour, the sexiness, of dopamine. But I don’t think so. The complexity of how a neurotransmitter behaves is what makes it wonderful. The simplicity of a single molecule and its receptors is what makes dopamine so flexible and what allows the resulting systems to be so complex. And it’s not just dopamine. While dopamine has just five receptor types, another neurotransmitter, serotonin, has 14 currently known and even more that are thought to exist. Other neurotransmitters have receptors with different subtypes, all expressed in different places, and each combination can produce a different result. There are many types of neurons, and they make billions and billions of connections. And all of this so you can walk, talk, eat, fall in love, get married, get divorced, get addicted to cocaine, and come out on top of your addiction some day. When you think of the sheer number of connections required simply for you to read and understand this sentence (from eyes to brain, to processing, to understanding, to movement as your fingers scroll down the page), you begin to feel a sense of awe. Our brain does all this, even while it makes us think about pepperoni pizza and what that text your crush sent really means. Complexity makes the brain the fascinating and mind-boggling thing that it is.
So dopamine has to do with addiction, whether to cupcakes or cocaine. It has to do with lust and love. It has to do with milk. It has to do with movement, motivation, attention, psychosis. Dopamine plays a role in all of these. But it is none of them, and we shouldn’t want it to be. Its complexity is what makes it great. It shows us what, with a single molecule, the brain can do.
TEHRAN (FNA)- When electrical pulses are applied to the ventral tegmental area of their brain, macaques presented with two images change their preference from one image to the other. The study by researchers Wim Vanduffel and John Arsenault (KU Leuven and Massachusetts General Hospital) is the first to confirm a causal link between activity in the ventral tegmental area and choice behaviour in primates.
The ventral tegmental area is located in the midbrain and helps regulate learning and reinforcement in the brain's reward system. It produces dopamine, a neurotransmitter that plays an important role in positive feelings, such as those that come with receiving a reward. "In this way, this small area of the brain provides learning signals," explains Professor Vanduffel. "If a reward is larger or smaller than expected, behavior is reinforced or discouraged accordingly."
This effect can be artificially induced: "In one experiment, we allowed macaques to choose multiple times between two images -- a star or a ball, for example. This told us which of the two visual stimuli they tended to naturally prefer. In a second experiment, we stimulated the ventral tegmental area with mild electrical currents whenever they chose the initially nonpreferred image. This quickly changed their preference. We were also able to manipulate their altered preference back to the original favorite."
The study, which will be published online in the journal Current Biology on 16 June, is the first to confirm a causal link between activity in the ventral tegmental area and choice behaviour in primates. "In scans we found that electrically stimulating this tiny brain area activated the brain's entire reward system, just as it does spontaneously when a reward is received. This has important implications for research into disorders relating to the brain's reward network, such as addiction or learning disabilities."
Could this method be used in the future to manipulate our choices? "Theoretically, yes. But the ventral tegmental area is very deep in the brain. At this point, stimulating it can only be done invasively, by surgically placing electrodes -- just as is currently done for deep brain stimulation to treat Parkinson's or depression. Once non-invasive methods -- light or ultrasound, for example -- can be applied with a sufficiently high level of precision, they could potentially be used for correcting defects in the reward system, such as addiction and learning disabilities."
Neuropsychopharmacology. 2015 Feb 18. doi: 10.1038/npp.2015.45.
There are approximately 1.6 million people who meet the criteria for cocaine addiction in the United States, and there are currently no FDA approved pharmacotherapies. Amphetamine-based dopamine releasing drugs have shown efficacy for reducing the motivation to self-administer cocaine and reducing intake in animals and humans. It is hypothesized that amphetamine acts as a replacement therapy for cocaine through elevation of extracellular dopamine levels.
Using voltammetry in brain slices, we tested the ability of a single amphetamine infusion in vivo to modulate dopamine release, uptake kinetics, and cocaine potency in cocaine naïve animals and after a history of cocaine self-administration (1.5 mg/kg/infusion, fixed-ratio 1, 40 injections/day x 5 days). Dopamine kinetics were measured 1 and 24 hours after amphetamine infusion (0.56 mg/kg, i.v.). Following cocaine self-administration, dopamine release, maximal rate of uptake (Vmax), and membrane-associated dopamine transporter (DAT) levels were reduced, and the DAT was less sensitive to cocaine. A single amphetamine infusion reduced Vmax and membrane DAT levels in cocaine naïve animals, but fully restored all aspects of dopamine terminal function in cocaine self-administering animals.
Here, for the first time, we demonstrate pharmacologically-induced, immediate rescue of deficits in dopamine nerve-terminal function in animals with a history of high dose cocaine self-administration. This observation supports the notion that the DAT expression and function can be modulated on a rapid time-scale and also suggests that the pharmacotherapeutic actions of amphetamine for cocaine addiction go beyond that of replacement therapy.
Proc Natl Acad Sci U S A. Apr 29, 2014; 111(17): 6455–6460.
Published online Apr 15, 2014. doi: 10.1073/pnas.1404323111
Dopamine (DA) neurons in the ventral tegmental area (VTA) react to aversive stimuli mostly by transient silencing. It remains unclear whether this reaction directly induces aversive responses in behaving mice. We examined this question by optogenetically controlling DA neurons in the VTA and found that the inactivation of DA neurons resulted in aversive response and learning. The nucleus accumbens (NAc), the major output nucleus of VTA DA neurons, was considered to be responsible for this response, so we examined which of the fundamental pathways in the NAc was critical to this behavior by using knockdown of D1 or D2 receptor, and found that the D2 receptor-specific pathway was crucial for this behavior.
Dopamine (DA) transmission from the ventral tegmental area (VTA) is critical for controlling both rewarding and aversive behaviors. The transient silencing of DA neurons is one of the responses to aversive stimuli, but its consequences and neural mechanisms regarding aversive responses and learning have largely remained elusive. Here, we report that optogenetic inactivation of VTA DA neurons promptly down-regulated DA levels and induced up-regulation of the neural activity in the nucleus accumbens (NAc) as evaluated by Fos expression. This optogenetic suppression of DA neuron firing immediately evoked aversive responses to the previously preferred dark room and led to aversive learning toward the optogenetically conditioned place. Importantly, this place aversion was abolished by knockdown of dopamine D2 receptors but not by that of D1 receptors in the NAc. Silencing of DA neurons in the VTA was thus indispensable for inducing aversive responses and learning through dopamine D2 receptors in the NAc.
The mesolimbic dopaminergic system not only plays a pivotal role in a wide range of motivation and learning (1–3), but its dysfunction has also been implicated in severe neuropsychiatric disorders as exemplified in Parkinson disease, schizophrenia, and drug addiction. Dopamine (DA) neurons in the ventral tegmental area (VTA) react to rewarding stimuli by phasic firing, and the main function of this firing is theorized to encode “the reward prediction error,” the difference in the value between the predicted reward and the actual reward (4). In contrast to the response to rewarding stimuli, their reactions to aversive stimuli are far from homogeneous; i.e., some DA neurons are activated in response to aversive stimuli, whereas most others react by transiently suppressing their firings (5–9). In fact, recent studies have revealed that optogenetic activation of GABAergic neurons and resultant inactivation of DA neurons suppress reward consumption and induce an aversive response (10, 11). However, it has largely remained elusive as to which mechanisms in the neural circuits are essential for the acquisition of aversive learning following the inactivation of DA neurons in the VTA and as to how behavioral responses are controlled toward suppressing reward consumption and inducing aversive behaviors.
Accumulated evidence has revealed that the motivational and cognitive learning in response to positive and negative stimuli is largely regulated by the neural circuits including the basal ganglia (12), which receive a large amount of the dopaminergic projection from the midbrain. In the striatum, two fundamental neural circuits are constituted by specified medium-sized spiny neurons (MSNs), each expressing a distinct type of DA receptor (13).
Although studies using the pharmacological strategies and reversible neurotransmission blocking (RNB) method have supported this mechanism of regulation in the NAc (15, 16), it has remained unknown whether the suppression of DA neuron firing is sufficient to promote the activity of the indirect pathway and subsequently induce the avoidance behavior. In this present study, we addressed this issue by selectively inactivating DA neurons in the VTA by optogenetically manipulating membrane-hyperpolarizing Arch protein (17) and explicitly demonstrated that the suppression of DA neurons in the VTA subsequently decreased DA levels in the NAc and induced aversive reaction and learning. Furthermore, we investigated the mechanisms of the regulation of this reaction and disclosed that this aversive reaction was specifically controlled by D2Rs in the NAc.
To selectively inactivate the firing of DA neurons, we injected a Cre-inducible adeno-associated viral construct encoding Arch-eGFP [AAV-double-floxed inverted open reading frame (DIO)-Arch] (17) unilaterally into the VTA of adult tyrosine hydroxylase (TH)-Cre mice (18) and wild-type (WT) littermates and placed an optical fiber above the VTA (Fig. S1 A and C). Two weeks after surgery, Arch-eGFP expression was detected only in the VTA (Fig. S1B). We tested the hyperpolarizing effect of the Arch protein by electrophysiological recording and measured the effect of optical stimulation of the VTA of TH-Cre mice injected with AAV-DIO-Arch. In vivo electrophysiological recordings from the VTA of anesthetized TH-Cre mice revealed that optical stimulation of putative DA neurons inhibited their firing (Fig. S2), indicating that the optical stimulation sufficiently hyperpolarized the membrane potential of Arch-expressing DA cells and thus inhibited their spontaneous firing.
By using these mice, we next examined whether the optical inactivation of DA neurons in the VTA could serve as an aversive signal for behavioral learning. Mice possess an innate tendency to prefer a dark environment (19). We designed a behavioral apparatus in which mice could freely explore the dark room and open bright space (Fig. 1A). After habituation, the WT mice stayed preferentially in the dark room either with or without optical stimulation in the dark room (Fig. S1D), ensuring that optical stimulation itself had no influence on their dark-room–preferring behavior. We scheduled the behavioral experiment of animals to test the effect of optical inactivation of DA neurons on their behavior (Fig. S1E). After habituation and pretest, mice were conditioned by optically stimulating the DA neurons in the VTA when they stayed in the dark room. Even during the first 5 min of conditioning, the TH-Cre mice stayed out of the previously preferred dark room and successively avoided the dark room throughout the conditioning (Fig. 1B). The TH-Cre mice did not reverse their avoidance against the dark room even though they received no optical stimulation at the posttest (Fig. 1C). These data indicate that hyperpolarization of DA neurons not only induced transient aversive behavior but also served as a signal for aversive learning against the dark room and also demonstrate that the inactivation of DA neurons played a causal role in both transient aversive behavior and prolonged aversive learning.
We next investigated whether the inactivation of DA neurons in the VTA actually modified the concentration of DA in its major targeting region, the NAc. We measured DA levels in the NAc by fast-scan cyclic voltammetry (FSCV) in anesthetized TH-Cre mice that had been injected with AAV-DIO-Arch into their VTA. DA levels in the NAc were promptly elevated by electrical stimulation of the VTA, and the evoked DA release was significantly reduced by simultaneous optical stimulation of the VTA (Fig. S3). We then tested whether optical stimulation of VTA could reduce the tonic DA level in the NAc. In the same experimental settings, we observed that the DA level in the NAc was transiently decreased by 20 s of optical stimulation of the VTA (Fig. 2), which is consistent with the reported FSCV reaction against the aversive stimuli (20). These data demonstrate that optical stimulation of the VTA was effective enough to inactivate the VTA DA neurons and to diminish the DA level in the NAc during the behavioral experiment.
The behavioral change caused by conditioned inactivation of DA neurons in the VTA suggested that optical stimulation directly altered neural activity and resulted in the shift of behavioral performance. So we next investigated the regions in which neural activity was elevated by the conditioned inactivation of DA neurons by examining the expression of Fos, an immediate early gene. Soon after the conditioning was performed in the dark-room test, mice were quickly processed to determine the amount of Fos expression by quantitative in situ hybridization analysis (Fig. 3 and Fig. S4). The NAc, the region that receives a large amount of dopaminergic projections from the VTA, showed a significantly increased amount of Fos expression in the TH-Cre mice (Fig. 3). This up-regulation was also detected in the contralateral side of optical stimulation, which was supposedly caused by a small amount of virus infection into that side. However, the up-regulation was much higher at the ipsilateral side than at the contralateral side of optical stimulation, suggesting that optical inactivation of DA neurons directly up-regulated the neural activity of the NAc. The increased Fos expression was also observed in other brain regions including the septum, periventricular regions of the striatum, basolateral amygdala (BLA), and lateral hypothalamus, but not in the lateral habenula or medial prefrontal cortex (mPFC; Fig. S4). These results indicate that the regions activated by optical inactivation of DA neurons were not restricted to the direct target regions of VTA DA neurons, but rather included the regions that could be indirectly activated in a neural circuit-dependent manner. This observation suggests that optical inactivation of DA neurons modified circuit-wide neuronal activity and could not only evoke an aversive reaction but also trigger several other brain functions such as anxiety, fear, and stress responses (21).
The majority of dopaminergic signals from the VTA are transmitted to the MSNs in the NAc through DA receptors, D1R and D2R. D1R is almost exclusively expressed in the substance P (coded by Tac1 gene)-expressing MSNs, and D2R is predominantly expressed in the enkephalin (coded by Penk gene)-expressing MSNs; each type of MSNs constitutes the direct and indirect pathways, respectively, in the NAc (3). As the affinity for DA is much higher for the D2R (nM order) than for the D1R (µM order) (22, 23), a reduction in DA levels is thought to result in the inactivation of Gi-coupled D2R but to have no appreciable effect on the D1R (3, 24), thereby up-regulating the neural activity specifically in the indirect pathway. Moreover, the Fos activation was more prominently observed in Penk- or Drd2 (D2R)-expressing cells than in the Tac1- or Drd1a (D1R)-expressing cells (Fig. S5). Based on these observations, we hypothesized that DA signaling through D2R could play a major role in the observed aversive conditioning.
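The affinity argument above can be illustrated with a one-site binding model, in which fractional occupancy equals [DA] / ([DA] + Kd). This is a hedged sketch, not an analysis from the study: the text gives only orders of magnitude (nM for D2R, µM for D1R), so the Kd values and the DA concentrations below are assumptions chosen for illustration.

```python
def fractional_occupancy(da_nm: float, kd_nm: float) -> float:
    """One-site binding: fraction of receptors bound at a given DA concentration."""
    return da_nm / (da_nm + kd_nm)


# ASSUMED values; the source states only "nM order" and "µM order".
KD_D2R_NM = 10.0       # hypothetical Kd for D2R
KD_D1R_NM = 1000.0     # hypothetical Kd for D1R
BASELINE_DA_NM = 20.0  # hypothetical tonic DA level
REDUCED_DA_NM = 5.0    # hypothetical DA level during DA neuron silencing

# Change in occupancy when DA falls from the baseline to the reduced level.
d2_change = (fractional_occupancy(REDUCED_DA_NM, KD_D2R_NM)
             - fractional_occupancy(BASELINE_DA_NM, KD_D2R_NM))
d1_change = (fractional_occupancy(REDUCED_DA_NM, KD_D1R_NM)
             - fractional_occupancy(BASELINE_DA_NM, KD_D1R_NM))
```

Under these assumed numbers, the drop in DA removes a large share of D2R occupancy while D1R occupancy, already low at baseline, barely changes, which is the qualitative point made in the text.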
To test this hypothesis, we performed the three-chamber conditioned place aversion (CPA) test (Fig. S6). We prepared a behavioral apparatus containing two chambers with virtually identical circumstances and one small corridor. This unbiased environmental condition in the CPA test enabled us to further examine whether the inactivation of VTA DA neurons is capable of inducing aversive reaction and learning, in addition to blocking the dark room preference. When animals were allowed to freely move around the entire apparatus, most of them stayed in two chambers without any typical behavioral difference at pretest. The optical conditioning was then performed by pairing optical stimulation with one fixed chamber. Even when either of the chambers was used for the conditioning, the TH-Cre mice persistently and significantly avoided staying in the optically conditioned chamber during the conditioning and at the posttest (Fig. S6 B–E). Statistical analysis validated a significant reduction in staying time of the TH-Cre mice in the optically conditioned chamber at the posttest compared with the staying time for the WT mice (Fig. S6F).
We then attempted to specify DA receptor subtypes involved in this aversive behavior by specifically suppressing each of the DA receptors in the NAc (Fig. 4 and Fig. S7). We designed and validated lentiviral vectors containing short hairpin RNA (shRNA) specific for each DA receptor with constitutive expression of mCherry. Three weeks after injecting the lentivirus into the NAc, robust expression of mCherry was localized in the NAc (Fig. 4B). The effective knockdown of mRNA expression of each receptor was confirmed by quantitative real-time PCR analysis (Fig. S7A). Measuring protein levels through Western blotting also revealed that injection of each of the lentiviruses selectively reduced its target protein product without affecting the expression of the other subtype of DA receptor (Fig. 4C and Fig. S7 B–G). The shD1R- and shD2R-expressing lentiviruses decreased their target protein level to 46.2 ± 1.1% and 38.4 ± 4.9%, respectively, compared with the level for the control virus (Fig. 4C). These results verified that the lentiviral vectors expressing shRNA specific for D1R and D2R selectively and sufficiently suppressed their target RNAs and down-regulated the amount of the respective protein products. We also confirmed that the virus-mediated expression of mCherry was not detected in the VTA, excluding the possibility that the lentivirus-mediated shRNA directly affected the VTA.
Using these lentiviruses containing shRNA, we tested which type of DA receptor was responsible for the aversive behavior induced by optogenetic inactivation of DA neurons. We injected shRNA-containing lentivirus or control lentivirus into the bilateral NAc together with AAV-DIO-Arch into the left VTA of the TH-Cre mice. The optical fiber was also inserted above the VTA (Fig. 4A). When the three-chamber CPA test was conducted at three weeks after surgery, the TH-Cre mice injected with lenti:shD1R-mCherry still showed explicit CPA against the optical stimulation-paired chamber comparable to that of the TH-Cre mice injected with the control lentivirus (lenti:mCherry). In contrast, the TH-Cre mice injected with lenti:shD2R-mCherry failed to show obvious CPA during conditioning (Fig. 4D). The exclusive learning deficit of the TH-Cre mice injected with lenti:shD2R-mCherry was further substantiated by analysis of aversive learning at the posttest (Fig. 4E). These results demonstrate that the aversive behavior to the place conditioned by the DA neuron inactivation was specifically evoked through D2R, and not through D1R, in the NAc.
In the striatum, studies have revealed that activation of the Gs-coupled D1R facilitates its firing, whereas activation of the Gi-coupled D2R results in suppressed firing efficiency (25). According to the specificity of DA receptor expression, phasic firings of DA neurons mainly activate the direct pathway through D1R, whereas a transient decrease in DA neuron firings predominantly promotes the indirect pathway competency through D2R (3, 26). Based on this mechanism of regulation, it has been proposed that silencing of DA neurons in response to aversive stimuli is mainly processed through the indirect pathway and results in aversive behavior (3). Recent studies have shown that blockade of the synaptic transmission of the indirect pathway impairs the acquisition of aversive behavior elicited by an electric shock (15) and that this impairment is caused by the inhibition of D2R-mediated signal transmission (16). In addition, the optogenetic up-regulation of D2R-expressing MSNs in the indirect pathway evokes behavioral avoidance (27). However, because DA neurons exhibit both enhanced and suppressed firings in response to aversive stimuli and because other shock-related sensory information is simultaneously processed in the brain, it still remains to be clarified whether silencing of DA neurons could directly trigger aversive reaction and learning, and whether this reaction is regulated through D2R-expressing MSNs in the indirect pathway.
In this study, we used optogenetic control of DA neuron firings in the two behavioral tests: the dark-room preference test and three-chamber CPA test. Our optogenetic manipulation showed efficient suppression of DA neuron firings in the VTA and down-regulation of DA levels in the NAc. Our precise optogenetic inactivation of DA neuron firings only during the period that the animals stayed in the conditioned chamber explicitly evoked an aversive reaction and learning, demonstrating that transient DA silencing directly caused passive avoidance behavior. Furthermore, this investigation has elucidated that D2R-mediated signal processing is a key determinant for the induction of this aversive reaction and learning.
Although our data demonstrated that D1R had no effect in the behavioral experiments to evoke the CPA, several studies have documented that phasic firing of DA neurons is required for fear responses and aversive learning (28, 29). This difference is due to the experimental setting; i.e., our optogenetic approach excluded the possibility of the signaling through activated DA neurons to evoke aversive behavior, indicating that inactivating DA neurons was sufficient to induce aversive behavior and learning. The function and signal processing of the activated DA firing evoked by aversive stimuli would have different contributions to aversive behaviors from those studied here and need to be clarified in the future.
DA neurons also project to various other regions including the mPFC, amygdala, and hippocampus. A recent study indicated that optogenetic activation of lateral habenula neurons projecting to DA neurons in the VTA is capable of inducing aversive behavior, and these DA neurons mainly and specifically target the mPFC (30), although their optogenetic conditioning was different from that in our current study, as their optogenetic stimulation was prolonged for a whole conditioning session. Because the dopaminergic input to the mPFC has been reported to be activated not only by aversive stimuli but also by chronic stress (31, 32), it is possible that their continuous activation of mPFC-projecting DA neurons would be perceived as signals from a highly stressful environment; and, as a result of the accumulation of stressful conditioning, the animals would show aversive behavior to the conditioned chamber. By contrast, we inhibited firing of DA neurons only while the animals were staying in the conditioned chamber. The results of our behavioral experiments using timing-matched conditioning indicated that a sudden suppression of DA signal would be perceived as a sudden aversive input, which resulted in their quick aversive response.
DA neurons also project to the amygdala, the region that largely contributes to the fear response. Indeed, the DA signaling to the amygdala has been implicated in the fear response and acquisition of fear memory (33, 34). In our study, labeling DA neurons in the VTA identified a set of DA neurons projecting to the BLA, but the extent of these projections was much lower than that projecting to the NAc. Although we could not exclude a subtle effect of amygdala-projected DA signaling on our observed aversive behavior, the main effect of our optogenetic inactivation of DA neurons should be on the NAc, because our experiments with specific knockdown of the D2R in the NAc dramatically diminished the aversive behavior. Future investigations addressing target-specific DA signaling are required to elucidate the effects of circuit-wide modification of DA neurons on the aversive stimuli and fear conditioning.
Tyrosine hydroxylase::IRES-Cre (TH-Cre) knock-in mice (EM:00254) (18) were obtained from the European Mouse Mutant Archive. All experimental animals had been backcrossed to the C57BL/6J strain for more than 10 generations. Mice were mated with the C57BL/6J WT mice and housed with a standard 12-h light/12-h dark cycle and given food and water ad libitum. Cre+ and Cre− mice from the same litters (3–6 mo of age) were used for the experiments. All animal experiments were approved by the animal committee of Osaka Bioscience Institute under the guidelines of animal experiments.
During all behavioral tests, mice were connected with an optical fiber and allowed to move around the entire apparatus. The movement of mice was monitored to confirm that they could move around without obstruction even when they were connected with an optical fiber on their heads. The position of a mouse was detected by a video camera suspended over the behavioral apparatus and analyzed by a custom-made program using LabVIEW software.
The custom-made behavioral apparatus used in the test was composed of a dark room (15 × 9.5 cm) and a bright open space (15 × 11 cm). The dark room had walls, a floor, and a roof, which were all colored in black and had an entrance (4.5 cm long) to the open bright space. The open bright space was shaped like an ellipse and had a metal grid floor and clear walls without a roof. Before the test, all mice were habituated for 10 min in the apparatus. The test consisted of three sessions: on the early half of day 1 (pretest: 5 min), mice were allowed to explore the entire apparatus. From the late half of day 1 to day 4 (conditioning: 35 min in total), mice received optical stimulation when they stayed in the dark room. On day 5, the dark-room preference was tested without optical stimulation (posttest: 5 min; Fig. S1E).
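The dark-room preference measured in this test can be summarized from video tracking as the time, or the fraction of frames, spent in the dark room. The sketch below is an assumption-based illustration and not the custom-made tracking program used in the study; the zone labels, the frame rate, and the function names are hypothetical.

```python
from typing import Sequence


def time_in_zone_s(zone_per_frame: Sequence[str], zone: str, frame_rate_hz: float) -> float:
    """Total time (s) spent in a zone, given one zone label per video frame."""
    frames_in_zone = sum(1 for label in zone_per_frame if label == zone)
    return frames_in_zone / frame_rate_hz


def zone_fraction(zone_per_frame: Sequence[str], zone: str) -> float:
    """Fraction of tracked frames spent in a zone."""
    if not zone_per_frame:
        raise ValueError("no tracked frames")
    return sum(1 for label in zone_per_frame if label == zone) / len(zone_per_frame)


# Hypothetical trace: 6 frames in the dark room, 4 in the bright space, at 2 frames/s.
frames = ["dark"] * 6 + ["bright"] * 4
dark_time_s = time_in_zone_s(frames, "dark", frame_rate_hz=2.0)
dark_fraction = zone_fraction(frames, "dark")
```

Comparing this fraction between the pretest and the posttest would give one simple measure of whether the dark-room preference was reversed.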
The custom-made three-chamber conditioned place preference/CPA apparatus used in the test was composed of two chambers (10 × 17 cm) and a connecting corridor. The test consisted of three sessions. Day 1 (pretest: 15 min): Mice were allowed to freely explore the entire apparatus. Mice that stayed 1.5 times longer in one chamber than in the other were excluded from the test. Days 2 and 3 (conditioning: 15 min each): Mice received optical stimulation when they stayed in the light-paired chamber. The selection of the light-paired chamber was counterbalanced. Day 4 (posttest: 15 min): The test was conducted under the same conditions as in the pretest (Fig. S6A).
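The pretest exclusion rule above can be sketched as a small check. This is an illustrative reading of the criterion, not the authors' code: it assumes "1.5 times longer" means that the longer stay is at least 1.5 times the shorter one, and the function name is hypothetical.

```python
def is_biased(time_a_s: float, time_b_s: float, ratio_threshold: float = 1.5) -> bool:
    """True if the mouse stayed at least `ratio_threshold` times longer in one chamber.

    ASSUMPTION: the boundary case (exactly 1.5 times) counts as biased.
    """
    longer = max(time_a_s, time_b_s)
    shorter = min(time_a_s, time_b_s)
    # Multiplying instead of dividing avoids a division by zero when one chamber was never visited.
    return longer > 0 and longer >= ratio_threshold * shorter


# Hypothetical pretest dwell times in seconds for the two chambers.
excluded = is_biased(300.0, 200.0)   # ratio of exactly 1.5
retained = is_biased(290.0, 210.0)   # ratio of about 1.38
```

The check is symmetric in the two chambers, so the order of the arguments does not matter.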
In the conditioning session, the optical stimulation was stopped for 30 s when the mice continuously stayed over 30 s in the dark room or the light-paired chamber to avoid overheating. Laser power was controlled to be approximately 5 mW at the tip of the optical fiber in all behavioral tests.
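The stimulation rule above can be sketched as a per-second gating function. This is a hedged reconstruction, not the authors' control program: it assumes a 1-s time step, that the pause begins once 30 consecutive seconds of stimulation have been delivered, and that the 30-s pause runs to completion even if the mouse leaves the chamber. All names are hypothetical.

```python
from typing import List, Sequence


def laser_schedule(in_zone: Sequence[bool], max_on_s: int = 30, pause_s: int = 30) -> List[bool]:
    """Per-second laser state (True = on) for a per-second chamber-occupancy trace."""
    laser: List[bool] = []
    on_run = 0       # consecutive seconds of stimulation delivered so far
    pause_left = 0   # seconds remaining in the current pause
    for inside in in_zone:
        if pause_left > 0:
            # Pause against overheating: laser stays off regardless of position.
            pause_left -= 1
            on_run = 0
            laser.append(False)
        elif inside:
            laser.append(True)
            on_run += 1
            if on_run >= max_on_s:
                pause_left = pause_s
                on_run = 0
        else:
            on_run = 0
            laser.append(False)
    return laser


# Hypothetical trace: the mouse stays in the paired chamber for 100 s.
schedule = laser_schedule([True] * 100)
```

With the mouse in the chamber for the whole trace, the schedule alternates between 30 s of stimulation and 30 s of pause.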
FSCV experiments were conducted by using the method described in previous studies (35–37). Mice were anesthetized with a ketamine/xylazine mixture as described in SI Materials and Methods and placed in a stereotaxic frame. An optical fiber used for stimulating Arch-expressing DA neurons was located close to the stimulating electrode. The stimulating optrode was then placed in the VTA (from bregma: anterior–posterior, −3.2 mm; lateral, 0.5 mm; and dorsal–ventral, 3.5 mm) and lowered at 0.25-mm intervals. A carbon-fiber microelectrode (300 µm in length) for voltammetric recording was lowered into the NAc (from bregma: anterior–posterior, 1.0 mm; lateral, 1.0 mm; and dorsal–ventral, 3.5 mm). Voltammetric measurements were made every 100 ms by applying a triangle waveform (−0.4 V to +1.3 V to −0.4 V versus Ag/AgCl, at 400 V/s) to the carbon-fiber microelectrode. A custom-made potentiostat was used for waveform isolation and current amplification. DA release was evoked by electrical stimulation of DA neurons by using 24-pulse stimulation (100 µA, 5 ms duration, 30 Hz). An optical stimulation of DA neurons (532 nm, ∼5 mW power at the fiber tip) was applied for 10 s starting 5 s before the onset of an electrical stimulation. Carbon-fiber microelectrodes were calibrated in a solution with known concentrations of DA (0.2 µM, 0.5 µM, and 1.0 µM). All voltammetry data were analyzed by custom-made programs using LabVIEW and MATLAB software. Reduction in DA levels by optical stimulation was resolved with principal component analysis, by using the template DA waveforms obtained from electrical VTA stimulations to separate dopamine signals (35, 36).
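The timing figures quoted above follow from simple arithmetic, sketched here together with a toy electrode calibration. This is an illustration under stated assumptions, not the authors' LabVIEW/MATLAB analysis: the calibration currents are invented, the linear through-origin fit is an assumed model, and all function names are hypothetical.

```python
from typing import Sequence

# Values quoted in the text.
V_MIN, V_MAX = -0.4, 1.3          # V versus Ag/AgCl
SCAN_RATE_V_PER_S = 400.0
SCAN_INTERVAL_S = 0.100           # one voltammetric measurement every 100 ms
N_PULSES, PULSE_RATE_HZ = 24, 30.0
STANDARDS_UM = [0.2, 0.5, 1.0]    # DA calibration concentrations


def triangle_scan_duration_s(v_min: float, v_max: float, rate_v_per_s: float) -> float:
    """Duration of one up-and-down triangle sweep."""
    return 2.0 * (v_max - v_min) / rate_v_per_s


def pulse_train_span_s(n_pulses: int, rate_hz: float) -> float:
    """Time from the onset of the first pulse to the onset of the last pulse."""
    return (n_pulses - 1) / rate_hz


def calibration_slope(conc_um: Sequence[float], current_na: Sequence[float]) -> float:
    """Least-squares slope through the origin (ASSUMED linear electrode response)."""
    return sum(c * i for c, i in zip(conc_um, current_na)) / sum(c * c for c in conc_um)


scan_ms = triangle_scan_duration_s(V_MIN, V_MAX, SCAN_RATE_V_PER_S) * 1e3
scans_per_second = 1.0 / SCAN_INTERVAL_S
train_s = pulse_train_span_s(N_PULSES, PULSE_RATE_HZ)

# INVENTED peak currents (nA) for the three standards, for illustration only.
HYPOTHETICAL_CURRENTS_NA = [2.0, 5.0, 10.0]
slope_na_per_um = calibration_slope(STANDARDS_UM, HYPOTHETICAL_CURRENTS_NA)
estimated_da_um = 3.0 / slope_na_per_um   # DA estimate for a hypothetical 3 nA signal
```

Each sweep therefore occupies only a small fraction of the 100-ms measurement interval, and the calibration slope converts a measured current into a DA concentration.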
Statistical analysis was conducted by using GraphPad PRISM 5.0 (GraphPad Software). Data were analyzed by repeated measures ANOVA (Figs. 1B and 4D and Fig. S6 D and E) or one-way ANOVA (Figs. 1C, 3D, and 4 C and E and Figs. S4 K–M, S6F, and S7A), and post hoc analyses were done by using the Bonferroni test. All marks/columns and bars represented the mean and ± SEM, respectively.
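The summary statistics named above can be sketched in a few lines. The study itself used GraphPad PRISM, so this is only an illustration of the mean ± SEM and of a Bonferroni adjustment, which is assumed here to multiply each p value by the number of comparisons; the data, p values, and function names are hypothetical.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate the SEM")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Bonferroni-adjusted p values: each p times the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical staying times (arbitrary units) and unadjusted p values.
group_mean, group_sem = mean_and_sem([1.0, 2.0, 3.0, 4.0, 5.0])
adjusted = bonferroni([0.01, 0.04, 0.5])
```

The ANOVA itself is not reproduced here; only the descriptive statistics and the multiple-comparison correction are shown.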
Other experimental procedures including virus preparation and injection, electrophysiological recording, and immunohistochemical and mRNA analysis are described in detail in SI Materials and Methods.
We thank E. Boyden for the Arch construct, R. Matsui for technical advice in lentivirus production and purification, and Y. Hayashi for technical advice in the programming of data analysis. This work was supported by Research Grants-in-Aid 22220005 (to S.N.), 23120011 (to S.Y. and S.N.), 24700339 (to T.D.), and 25871080 (to S.Y.) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan and a grant from Takeda Science Foundation (to S.N.).
The authors declare no conflict of interest.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1404323111/-/DCSupplemental.
COMMENTS: Detailed review of dopamine and the nucleus accumbens in reward and aversion.
The nucleus accumbens (NAc) is a critical element of the mesocorticolimbic system, a brain circuit implicated in reward and motivation. This basal forebrain structure receives dopamine (DA) input from the ventral tegmental area (VTA) and glutamate (GLU) input from regions including the prefrontal cortex (PFC), amygdala (AMG), and hippocampus (HIP). As such, it integrates inputs from limbic and cortical regions, linking motivation with action. The NAc has a well-established role in mediating the rewarding effects of drugs of abuse and natural rewards such as food and sexual behavior. However, accumulating pharmacological, molecular, and electrophysiological evidence has raised the possibility that it also plays an important (and sometimes underappreciated) role in mediating aversive states. Here we review evidence that rewarding and aversive states are encoded in the activity of NAc medium spiny GABAergic neurons, which account for the vast majority of the neurons in this region. While admittedly simple, this working hypothesis is testable using combinations of available and emerging technologies, including electrophysiology, genetic engineering, and functional brain imaging. A deeper understanding of the basic neurobiology of mood states will facilitate the development of well-tolerated medications that treat and prevent addiction and other conditions (e.g., mood disorders) associated with dysregulation of brain motivation systems.
The biological basis of mood-related states such as reward and aversion is not understood. Classical formulations of these states implicate the mesocorticolimbic system, comprising brain areas including the NAc, VTA, and PFC, in reward (Bozarth and Wise, 1981; Goeders and Smith, 1983; Wise and Rompré, 1989). Other brain areas, including the amygdala, periaqueductal gray, and the locus coeruleus, are often implicated in aversion (Aghajanian, 1978; Phillips and LePaine, 1980; Bozarth and Wise, 1983). However, the notion that certain brain areas narrowly and rigidly mediate reward or aversion is becoming archaic. The development of increasingly sophisticated tools and methodologies has enabled new approaches that provide evidence for effects that previously would have been difficult (if not impossible) to detect. As one example from our own work, we have found that a prominent neuroadaptation triggered in the NAc by exposure to drugs of abuse (activation of the transcription factor CREB) contributes to depressive-like and aversive states in rodents (for review, see Carlezon et al., 2005). Other work suggests that changes in the activity of dopaminergic neurons in the VTA (which provides inputs to the NAc that are integrated with glutamatergic inputs from areas such as the PFC, AMG, and HIP) can also encode both rewarding and aversive states (Liu et al., 2008).
In this review, we will focus on the role of the NAc in simple states of reward and aversion. The role of NAc activity in more complex states such as drug-craving and drug-seeking is beyond the scope of this review, since these states depend upon experience-dependent neuroadaptations and do not easily map onto basic conceptualizations of rewarding and aversive states. An improved understanding of the neurobiology of reward and aversion is critical to the treatment of complex disorders like addiction. This question is particularly important as the field utilizes accumulated knowledge from decades of research on drugs of abuse to move toward the rational design of treatments for addictive disorders. The requirement for new medications goes beyond the mere reduction of drug-craving, drug-seeking, or other addictive behaviors. To be an effective therapeutic, a medication must be tolerated by the addicted brain, or compliance (sometimes called adherence) will be poor. There are already examples of medications (e.g., naltrexone) that would appear on the basis of animal data to have extraordinary potential for reducing intake of alcohol and opiates—except that addicts often report aversive effects and discontinue treatment (Weiss et al., 2004). Methods to predict rewarding or aversive responses in normal and addicted brains would accelerate the pace of drug discovery, medication development, and recovery from addiction. Here we review evidence for the simple working hypothesis that rewarding and aversive states are encoded by the activity of NAc medium spiny GABAergic neurons.
The NAc comprises the ventral components of the striatum. It is widely accepted that there are two major functional components of the NAc, the core and the shell, which are characterized by differential inputs and outputs (see Zahm, 1999; Kelley, 2004; Surmeier et al., 2007). Recent formulations further divide these two components into additional subregions (including the cone and the intermediate zone of the NAc shell) (Todtenkopf and Stellar, 2000). As in the dorsal striatum, GABA-containing medium spiny neurons (MSNs) make up the vast majority (~90–95%) of cells in the NAc, with the remaining cells being cholinergic and GABAergic interneurons (Meredith, 1999). Striatal regions contain subpopulations of these MSNs: those of so-called “direct” and “indirect” pathways (Gerfen et al., 1990; Surmeier et al., 2007). The MSNs of the direct pathway predominantly co-express dopamine D1-like receptors and the endogenous opioid peptide dynorphin, and project directly back to the midbrain (substantia nigra/VTA). In contrast, the MSNs of the indirect pathway predominantly co-express dopamine D2-like receptors and the endogenous opioid peptide enkephalin, and project indirectly to the midbrain via areas including the ventral pallidum and the subthalamic nucleus. Traditional formulations posit that dopamine actions at D1-like receptors, which are coupled to the G-protein Gs (stimulatory) and associated with activation of adenylate cyclase, tend to excite the MSNs of the direct pathway (Albin et al., 1989; Surmeier et al., 2007). Elevated activity of these cells would be expected to provide increased GABAergic and dynorphin (an endogenous ligand at κ-opioid receptors) input to the mesolimbic system and negative feedback on midbrain dopamine cells. 
In contrast, dopamine actions at D2-like receptors, which are coupled to Gi (inhibitory) and associated with inhibition of adenylate cyclase, tend to inhibit the MSNs of the indirect pathway (Albin et al., 1989; Surmeier et al., 2007). Inhibition of these cells would be expected to reduce GABAergic and enkephalin (an endogenous ligand at δ-opioid receptors) input to the ventral pallidum, a region that normally inhibits subthalamic cells that activate inhibitory inputs to the thalamus. Through multiple synaptic connections, inhibition of the indirect pathway at the level of the NAc would ultimately activate the thalamus (see Kelley, 2004).
Like neurons throughout the brain, MSNs also express glutamate-sensitive AMPA and NMDA receptors. These receptors enable glutamate inputs from brain areas such as AMG, HIP, and deep (infralimbic) layers of the PFC (O’Donnell and Grace, 1995; Kelley et al., 2004; Grace et al., 2007) to activate NAc MSNs. Dopamine and glutamate inputs can influence one another: for example, stimulation of D1-like receptors can trigger phosphorylation of glutamate (AMPA and NMDA) receptor subunits, thereby regulating their surface expression and subunit composition (Snyder et al., 2000; Chao et al., 2002; Mangiavacchi et al., 2004; Chartoff et al., 2006; Hallett et al., 2006; Sun et al., 2008). Thus the NAc is involved in a complex integration of excitatory glutamate inputs, sometimes excitatory dopamine (D1-like) inputs, and sometimes inhibitory dopamine (D2-like) inputs. Considering that the VTA tends to have a uniform response (activation) to both rewarding (e.g., morphine; see Di Chiara and Imperato, 1988; Leone et al., 1991; Johnson and North, 1992) and aversive (Dunn, 1988; Herman et al., 1988; Kalivas and Duffy, 1989; McFarland et al., 2004) stimuli, the ability of the NAc to integrate these excitatory and inhibitory signals downstream of mesolimbic dopamine neurons likely plays a key role in attaching valence and regulating mood.
It is well accepted that the NAc plays a key role in reward. Theories about its role in motivation have been a critical element in our understanding of addiction (e.g., Bozarth and Wise, 1987; Rompré and Wise, 1989). There are three primary lines of evidence implicating the NAc in reward, involving pharmacological, molecular, and electrophysiological approaches.
It is well established that drugs of abuse (Di Chiara and Imperato, 1988) and natural rewards (Fibiger et al., 1992; Pfaus, 1999; Kelley, 2004) have the common action of elevating extracellular concentrations of dopamine in the NAc. Moreover, lesions of the NAc reduce the rewarding effects of stimulants and opiates (Roberts et al., 1980; Kelsey et al., 1989). Pharmacology studies in rats (e.g., Caine et al., 1999) and monkeys (e.g., Caine et al., 2000) suggest that D2-like receptor function plays a critical role in reward. However, it has been studies involving the direct microinfusion of drugs into this area that have provided the strongest evidence for its role in rewarding states. For example, rats will self-administer the dopamine releasing agent amphetamine directly into the NAc (Hoebel et al., 1983), demonstrating the reinforcing effects of elevating extracellular dopamine in this region. Rats will also self-administer the dopamine reuptake inhibitor cocaine into the NAc, although this effect is surprisingly weak in comparison to that reported with amphetamine (Carlezon et al., 1995). This observation has led to speculation that the rewarding effects of cocaine are mediated outside the NAc, in areas including the olfactory tubercle (Ikemoto, 2003). However, rats will avidly self-administer the dopamine reuptake inhibitor nomifensine into the NAc (Carlezon et al., 1995), suggesting that the local anesthetic properties of cocaine complicate studies in which the drug is applied directly to neurons. Co-infusion of the dopamine D2-selective antagonist sulpiride attenuates intracranial self-administration of nomifensine, demonstrating a key role for D2-like receptors in the rewarding effects of intra-NAc microinfusions of this drug.
When considered together with evidence from a variety of other studies (for review, see Rompré and Wise, 1989), these studies are entirely consistent with theories prevailing in the 1980s that dopamine actions in the NAc play a necessary and sufficient role in reward and motivation.
While there is little controversy that dopamine actions in the NAc are sufficient for reward, other work began to challenge the notion that they are necessary. For example, rats will self-administer morphine directly into the NAc (Olds, 1982), away from the trigger zone (the VTA) in which the drug acts to elevate extracellular dopamine in the NAc (Leone et al., 1991; Johnson and North, 1992). Considering that μ- and δ-opioid receptors are located directly on NAc MSNs (Mansour et al., 1995), these data were the first to suggest that reward can be triggered by events occurring in parallel with (or downstream of) those triggered by dopamine. Rats will also self-administer phencyclidine (PCP), a complex drug that is a dopamine reuptake inhibitor and a non-competitive NMDA antagonist, directly into the NAc (Carlezon and Wise, 1996). Two lines of evidence suggest that this effect is not dopamine-dependent. First, intracranial self-administration of PCP is not affected by co-infusion of the dopamine D2-selective antagonist sulpiride; and second, rats will self-administer other non-competitive (MK-801) or competitive (CPP) NMDA antagonists with no direct effects on dopamine systems directly into the NAc (Carlezon and Wise, 1996). These data provided early evidence that blockade of NMDA receptors in the NAc is sufficient for reward and, by extension, reward can be dopamine-independent. Blockade of NMDA receptors would be expected to produce an overall reduction in the excitability of NAc MSNs without affecting baseline excitatory input mediated by AMPA receptors (Uchimura et al., 1989; Pennartz et al., 1990). Importantly, rats also self-administered NMDA antagonists into deep layers of the PFC (Carlezon and Wise, 1996), which project directly to the NAc (see Kelley, 2004) and have been conceptualized as a part of an inhibitory (“STOP!”) motivational circuit (Childress, 2006).
When considered together, these studies provided two critical pieces of evidence that have played a prominent role in the formulation of our current working hypothesis: first, that dopamine-dependent reward is attenuated by blockade of D2-like receptors, which are inhibitory receptors expressed predominately in the NAc on the MSNs of the indirect pathway; and second, that events that would be expected to reduce the overall excitability of the NAc (e.g., stimulation of Gi-coupled opioid receptors, reduced stimulation of excitatory NMDA receptors, reduced excitatory input) are sufficient for reward. This interpretation led to the development of a model of reward in which the critical event is reduced activation of MSNs in the NAc (Carlezon and Wise, 1996).
Other pharmacological evidence supports this theory, and implicates calcium (Ca2+) and its second messenger functions. Activated NMDA receptors gate Ca2+, an intracellular signaling molecule that can affect membrane depolarization, neurotransmitter release, signal transduction, and gene regulation (see Carlezon and Nestler, 2002; Carlezon et al., 2005). Microinjection of the L-type Ca2+ antagonist diltiazem directly into the NAc increases the rewarding effects of cocaine (Chartoff et al., 2006). The mechanisms by which diltiazem-induced alterations in Ca2+ influx affect reward are unknown. One possibility is that blockade of Ca2+ influx through voltage-operated L-type channels reduces the firing rate of neurons within the ventral NAc (Cooper and White, 2000). It is important to note, however, that diltiazem alone was not rewarding, at least at the doses tested in these studies. This might indicate that baseline levels of Ca2+ influx via L-type channels within the NAc are normally low, and difficult to reduce further. A related possibility is that microinjection of diltiazem reduces aversive actions of cocaine that are mediated within the NAc, unmasking reward. For example, activity of the transcription factor cAMP response element binding protein (CREB) within the NAc is associated with aversive states and reductions in cocaine reward (Pliakas et al., 2001; Nestler and Carlezon, 2006). The activation of CREB depends on phosphorylation, which can occur via activation of L-type Ca2+ channels (Rajadhyaksha et al., 1999). Phosphorylated CREB can induce expression of dynorphin, a neuropeptide that might contribute to aversive states via activation of κ-opioid receptors in the NAc (for review, see Carlezon et al., 2005). The potential role of intra-NAc Ca2+ in regulating rewarding and aversive states is a common theme in our work that will be explained in greater detail below.
Mice lacking dopamine D2-like receptors have reduced sensitivity to the rewarding effects of cocaine (Welter et al., 2007). Ablation of D2-like receptors also reduces the rewarding effects of morphine (Maldonado et al., 1997), presumably by reducing the ability of the drug to stimulate dopamine via VTA mechanisms (Leone et al., 1991; Johnson and North, 1992), and of lateral hypothalamic brain stimulation (Elmer et al., 2005). One interpretation of these findings is that loss of D2-like receptors in the NAc reduces the ability of dopamine to inhibit the indirect pathway, a putative mechanism of reward. These findings, when combined with evidence that human addicts have reduced dopamine D2-like receptor binding in the NAc, suggest that this receptor plays an essential role in encoding reward (Volkow et al., 2007).
Other advances in molecular biology have enabled the detection of neuroadaptive responses to drugs of abuse and the ability to mimic such changes in discrete brain areas to examine their significance. One such change is in the expression of AMPA-type glutamate receptors, which are expressed ubiquitously in the brain and composed of various combinations of the receptor subunits GluR1-4 (Hollmann et al., 1991; Malinow and Malenka, 2002). Drugs of abuse can alter GluR expression in the NAc. For example, repeated intermittent exposure to cocaine elevates GluR1 expression in the NAc (Churchill et al., 1999). Furthermore, GluR2 expression is elevated in the NAc of mice engineered to express ΔFosB, a neuroadaptation linked with increased sensitivity to drugs of abuse (Kelz et al., 1999). Studies in which viral vectors were used to elevate GluR1 selectively in the NAc indicate that this neuroadaptation tends to make cocaine aversive in place conditioning tests, whereas elevated GluR2 in the NAc increases cocaine reward (Kelz et al., 1999). Potential explanations for this pattern of findings likely involve Ca2+ and its effect on neuronal activity and intracellular signaling. Increased GluR1 expression favors formation of GluR1-homomeric (or GluR1-GluR3 heteromeric) AMPARs, which are Ca2+-permeable (Hollmann et al., 1991; Malinow and Malenka, 2002). In contrast, GluR2 contains a motif that prevents Ca2+ influx; thus increased expression of GluR2 would favor formation of GluR2-containing Ca2+-impermeable AMPARs (and theoretically decrease the number of Ca2+-permeable AMPARs). Thus GluR2-containing AMPARs have physiological properties that render them functionally distinct from those lacking this subunit, particularly with respect to their interactions with Ca2+ (Fig. 1).
These early studies used place conditioning procedures, which generally require repeated exposure to drugs of abuse and presumably involve cycles of reward and aversion (withdrawal). More recent studies examined how alterations in GluR expression modeling those acquired through repeated drug exposure affect intracranial self-stimulation (ICSS), an operant task in which the magnitude of the reinforcer (brain stimulation reward) is precisely controlled (Wise, 1996). Elevated expression of GluR1 in NAc shell increases ICSS thresholds, whereas elevated GluR2 decreases them (Todtenkopf et al., 2006). The effect of GluR2 on ICSS is qualitatively similar to that caused by drugs of abuse (Wise, 1996), suggesting that it reflects increases in the rewarding impact of the stimulation. In contrast, the effect of GluR1 is qualitatively similar to that caused by prodepressive treatments including drug withdrawal (Markou et al., 1992) and κ-opioid receptor agonists (Pfeiffer et al., 1986; Wadenberg, 2003; Todtenkopf et al., 2004; Carlezon et al., 2006), suggesting that it reflects decreases in the rewarding impact of the stimulation. These findings indicate that elevated expression of GluR1 and of GluR2 in NAc shell have markedly different consequences for motivated behavior. Moreover, they confirm previous observations that elevated GluR1 and GluR2 expression in NAc shell have opposite effects in cocaine place conditioning studies (Kelz et al., 1999), and extend the generalizability of these effects to behaviors that are not motivated by drugs of abuse. Perhaps most importantly, they provide more evidence to implicate Ca2+ flux within the NAc in reduced reward or elevated aversion. Because Ca2+ plays a role in both neuronal depolarization and gene regulation, alterations in GluR expression and AMPAR subunit composition in NAc shell likely initiate physiological and molecular responses, which presumably interact to alter motivation.
Again, the mechanisms by which Ca2+ signal transduction might trigger genes involved in aversive states are described in detail below.
Several lines of electrophysiological investigation support the idea that decreases in NAc firing may be related to reward. First, rewarding stimuli produce NAc inhibitions in vivo. Second, neurobiological manipulations that specifically promote inhibition of NAc firing appear to enhance rewarding effects of stimuli. Third, the inhibition of NAc GABAergic MSNs can disinhibit downstream structures such as the ventral pallidum to produce signals related to the hedonic qualities of stimuli. Each of these lines of investigation will be addressed in turn. The most substantial line of investigation involves studies of NAc single-unit activity in rodent paradigms where a wide variety of drug and non-drug rewards are delivered. A consistent finding across these studies is that the most commonly observed pattern of firing modulation is a transient inhibition. This has been observed during self-administration of many different types of rewarding stimuli including cocaine (Peoples and West, 1996), heroin (Chang et al., 1997), ethanol (Janak et al., 1999), sucrose (Nicola et al., 2004), food (Carelli et al., 2000) and electrical stimulation of the medial forebrain bundle (Cheer et al., 2005). Though not as commonly investigated as self-administration paradigms, the inhibition-reward effect is also present in awake, behaving animals where rewards are delivered without requirement for an operant response (Roitman et al., 2005; Wheeler et al., 2008). These studies indicate that the transient inhibitions need not be directly related to motor output, but may be more directly tied to a rewarding or motivationally activated state. As ubiquitous as the NAc inhibition-reward relationship seems to be, however, there are counterexamples. 
For instance, Taha and Fields (2005) found that of those NAc neurons that appeared to encode palatability in a sucrose solution-drinking discrimination task, excitations outnumbered inhibitions, and the total number of such neurons was small (~10% of all neurons recorded). This discrepancy from what appears to be the typical NAc activity pattern highlights the need for techniques to identify the connectivity and biochemical composition of cells recorded in vivo. As these techniques become available, unique functional subclasses of NAc neurons will most likely be identified and a more detailed model of NAc function can be constructed.
How are the transient reward-related inhibitions of NAc firing generated? Because rewarding stimuli are known to produce transient elevations in extracellular dopamine, one straightforward hypothesis is that dopamine may be responsible. In fact, findings from in vitro and in vivo studies using iontophoretic application and other methods indicate that dopamine is capable of inhibiting NAc firing (reviewed in Nicola et al., 2000, 2004). Recent studies examining simultaneous dopamine electrochemical and single unit responses (the majority of which are inhibitions) in an ICSS paradigm indicate that these parameters show a high degree of concordance in the NAc shell (Cheer et al., 2007). On the other hand, it is now clear that dopamine can have marked excitatory effects as well as inhibitory effects in behaving animals (Nicola et al., 2000, 2004). In addition, while inactivating VTA to interfere with dopamine release in NAc blocks both the cue-induced excitations and inhibitions, it does not affect reward-related inhibitions themselves (Yun et al., 2004a). The combination of these findings suggests that while dopamine may contribute to reward-related inhibition of NAc firing, there must be other factors that can drive it as well. Although there has been much less investigation of other potential contributors, additional candidates include the release of acetylcholine and the activation of μ-opioid receptors in the NAc, both of which have been shown to occur under rewarding conditions (Trujillo et al., 1988; West et al., 1989; Mark et al., 1992; Imperato et al., 1992; Guix et al., 1992; Bodnak et al., 1995; Kelley et al., 1996) and both of which have the ability to inhibit NAc firing (McCarthy et al., 1977; Hakan et al., 1989; de Rover et al., 2002).
Another, newer line of electrophysiological evidence supporting the inhibition/reward hypothesis comes from experiments in which molecular genetic approaches have been used to manipulate the excitable properties of NAc neurons. The clearest example of this so far is viral-mediated overexpression of mCREB (dominant negative CREB), a repressor of CREB activity, in the NAc. This treatment was recently shown to cause decreases in the intrinsic excitability of NAc MSNs, as indicated by the fact that neurons recorded in the NAc exhibited fewer spikes in response to a given depolarizing current injection (Dong et al., 2006). As noted above, NAc mCREB overexpression is not only associated with enhanced rewarding effects of cocaine (Carlezon et al., 1998) but also with a decrease in depressive-like behavioral effects in the forced-swim task (Pliakas et al., 2001) and a learned-helplessness paradigm (Newton et al., 2002). The combination of these findings is consistent with the idea that conditions that facilitate a transition to lower firing rates in NAc neurons also facilitate reward processes and/or elevate mood.
On the other hand, deletion of the Cdk5 gene specifically in the NAc core region produced an enhanced cocaine reward phenotype (Benavides et al., 2007). This phenotype correlated with an increase in excitability in NAc MSNs. This contrasts with the mCREB effect, which was most robust when CREB function was inhibited in the shell region, rather than the core (Carlezon et al., 1998). Considered along with other evidence, these studies highlight the importance of distinguishing between inhibition of NAc activity in the shell region, which appears to be associated with reward, versus the core region, where it may not.
Finally, the hypothesis relating NAc inhibition to reward is supported by the study of the relationship between neural activity in NAc target structures and reward. Considering that NAc MSNs are GABAergic projection neurons, inhibition of firing in these cells should disinhibit target regions. As mentioned above, one structure that receives a dense projection from the NAc shell is the ventral pallidum. Elegant electrophysiological studies have demonstrated that elevated activity in ventral pallidal neurons can encode the hedonic impact of a stimulus (Tindell et al., 2004, 2006). For example, among neurons that responded to sucrose reward (between 30–40% of total recorded units), receipt of a sucrose reward produced a robust, transient increase in firing—an effect that persisted throughout training (Tindell et al., 2004). In a subsequent study, the investigators used a clever procedure to manipulate the hedonic value of a taste stimulus to assess whether activity in pallidal neurons would track this change (Tindell et al., 2006). Although hypertonic saline solutions are typically aversive taste stimuli, in salt-deprived humans or experimental animals their palatability is increased. Both behavioral measures of positive hedonic response (i.e. facial taste reactivity measures) and increases in pallidal neuron firing occurred in response to a hypertonic saline taste stimulus in sodium-deprived animals, but not in animals maintained on a normal diet. Thus, increased firing of pallidal neurons, downstream targets of NAc efferents, appears to encode a key feature of reward. Of course, it is possible that other inputs to pallidal neurons could contribute to these reward-related firing patterns. 
However, recent studies have indicated a strong relationship between the ability of μ-opioid receptor activation (a factor that is known to inhibit MSN firing) in discrete regions of the NAc shell to drive increases in behavioral response to a hedonic stimulus and its ability to activate c-fos in discrete regions of ventral pallidum (Smith et al., 2007). This apparently tight coupling between NAc and pallidal “hedonic hotspots” is an intriguing new phenomenon that is just beginning to be explored.
The fact that the NAc also plays a role in aversion is sometimes underappreciated. Pharmacological treatments have been used to demonstrate aversion after NAc manipulations. In addition, molecular approaches have demonstrated that exposure to drugs of abuse and stress cause common neuroadaptations that can trigger signs (including anhedonia and dysphoria) that characterize depressive illness (Nestler and Carlezon, 2006), which is often co-morbid with addiction and involves dysregulated motivation.
Some of the earliest evidence that the NAc plays a role in aversive states came from studies involving opioid receptor antagonists. Microinjections of a wide-spectrum opioid receptor antagonist (methylnaloxonium) into the NAc of opiate-dependent rats establish conditioned place aversions (Stinus et al., 1990). In opiate-dependent rats, precipitated withdrawal can induce immediate-early genes and transcription factors in the NAc (Gracy et al., 2001; Chartoff et al., 2006), suggesting activation of MSNs. Selective κ-opioid agonists, which mimic the effects of the endogenous κ-opioid ligand dynorphin, also produce aversive states. Microinjections of a κ-opioid agonist into the NAc cause conditioned place aversions (Bals-Kubik et al., 1993) and elevate ICSS thresholds (Chen et al., 2008). Inhibitory (Gi-coupled) κ-opioid receptors are localized on the terminals of VTA dopamine inputs to the NAc (Svingos et al., 1999), where they regulate local dopamine release. As such, they are often in apposition to μ- and δ-opioid receptors (Mansour et al., 1995), and their stimulation produces effects opposite to those of agonists at these other receptors in behavioral tests. Indeed, extracellular concentrations of dopamine are reduced in the NAc by systemic administration (DiChiara and Imperato, 1988; Carlezon et al., 2006) or local microinfusion of a κ-opioid agonist (Donzati et al., 1992; Spanagel et al., 1992). Decreased function of midbrain dopamine systems has been associated with depressive states including anhedonia in rodents (Wise, 1982) and dysphoria in humans (Mizrahi et al., 2007). Thus one path to aversion appears to be reduced dopamine input to the NAc, which would reduce the stimulation of inhibitory dopamine D2-like receptors that seem critical for reward (Carlezon and Wise, 1996).
Other studies appear to confirm an important role of dopamine D2-like receptors in suppressing aversive responses. Microinjections of a dopamine D2-like antagonist into the NAc of opiate-dependent rats precipitate signs of somatic opiate withdrawal (Harris and Aston-Jones, 1994). Although the motivational effects were not measured in this study, treatments that precipitate opiate withdrawal often cause aversive states more potently than they cause somatic signs of withdrawal (Gracy et al., 2001; Chartoff et al., 2006). Interestingly, however, microinjections of a dopamine D1-like agonist into the NAc also produce somatic signs of withdrawal in opiate-dependent rats. The data demonstrate that another path to aversion is increased stimulation of excitatory dopamine D1-like receptors in rats with opiate dependence-induced neuroadaptations in the NAc. Perhaps not surprisingly, one consequence of D1-like receptor stimulation in opiate-dependent rats is phosphorylation of GluR1 (Chartoff et al., 2006), which would lead to increased surface expression of AMPA receptors on the MSNs of the direct pathway.
Exposure to drugs of abuse (Turgeon et al., 1997) and stress (Pliakas et al., 2001) activate the transcription factor CREB in the NAc. Viral vector-induced elevation of CREB function in the NAc reduces the rewarding effects of drugs (Carlezon et al., 1998) and hypothalamic brain stimulation (Parsegian et al., 2006), indicating anhedonia-like effects. It also makes low doses of cocaine aversive (a putative sign of dysphoria), and increases immobility behavior in the forced swim test (a putative sign of “behavioral despair”) (Pliakas et al., 2001). Many of these effects can be attributed to CREB-regulated increases in dynorphin function (Carlezon et al., 1998). Indeed, κ-opioid receptor-selective agonists have effects that are qualitatively similar to those produced by elevated CREB function in the NAc, producing signs of anhedonia and dysphoria in reward models and increased immobility in the forced swim test (Bals-Kubik et al., 1993; Carlezon et al., 1998; Pliakas et al., 2001; Mague et al., 2003; Carlezon et al., 2006). In contrast, κ-selective antagonists produce an antidepressant-like phenotype resembling that seen in animals with disrupted CREB function in the NAc (Pliakas et al., 2001; Newton et al., 2002; Mague et al., 2003). These findings suggest that one biologically important consequence of drug- or stress-induced activation of CREB within the NAc is increased transcription of dynorphin, which triggers key signs of depression. Dynorphin effects are likely mediated via stimulation of κ-opioid receptors that act to inhibit neurotransmitter release from mesolimbic dopamine neurons, thereby reducing the activity of VTA neurons, as explained above. This path to aversion appears to be reduced dopamine input to the NAc, which would produce reductions in the stimulation of inhibitory dopamine D2-like receptors that seem critical for reward (Carlezon and Wise, 1996).
As explained below, there is also evidence that elevated expression of CREB in the NAc directly increases the excitability of MSNs (Dong et al., 2006) in addition to the loss of D2-regulated inhibition, raising the possibility that multiple effects contribute to the aversive responses.
Repeated exposure to drugs of abuse can elevate GluR1 expression in the NAc (Churchill et al., 1999). Viral vector-induced elevation of GluR1 in the NAc increases drug aversion in place conditioning studies, an “atypical” type of drug sensitization (i.e., heightened sensitivity to the aversive rather than the rewarding aspects of cocaine). This treatment also increases ICSS thresholds (Todtenkopf et al., 2006), indicating anhedonia-like and dysphoria-like effects. Interestingly, these motivational effects are virtually identical to those caused by elevated CREB function in the NAc. These similarities raise the possibility that both effects are part of the same larger process. In one possible scenario, drug exposure might trigger changes in the expression of GluR1 in the NAc, which would lead to local increases in the surface expression of Ca2+-permeable AMPA receptors, which would increase Ca2+ influx and activate CREB, leading to alterations in sodium channel expression that affect baseline and stimulated excitability of MSNs in the NAc (Carlezon and Nestler, 2002; Carlezon et al., 2005; Dong et al., 2006). Alternatively, early changes in CREB function might precede alterations in GluR1 expression. These relationships are currently under intensive study in several NIDA-funded laboratories, including our own.
Although there has been little electrophysiological investigation of the hypothesis that widespread excitation of NAc neurons encodes information about aversive stimuli, the available data essentially mirror those for rewarding stimuli. First, two recent studies using aversive taste stimuli both indicate that three times as many NAc neurons respond to the stimuli with clear excitations as with inhibitions (Roitman et al., 2005; Wheeler et al., 2008). Interestingly, these same studies find that units that respond to a sucrose or saccharin reward show the exact opposite profile: three times more cells with decreases in firing than with increases. In addition, when an initially rewarding saccharin stimulus was made aversive by pairing it with the opportunity to self-administer cocaine, the predominant firing pattern of NAc units that responded to the stimulus shifted from inhibition to excitation (Wheeler et al., 2008). Thus, these findings indicate not only that the NAc may encode aversive states through firing increases, but also that individual NAc neurons can track the hedonic valence of a stimulus by varying their firing-rate response to it.
Second, molecular genetic manipulations of synaptic and intrinsic membrane properties that increase the excitability of NAc neurons can shift the behavioral response to a stimulus from rewarding to aversive. For example, viral-mediated overexpression of CREB in NAc produces an increase in neuronal excitability in MSNs, as indicated by an increase in the number of spikes in response to a given depolarizing current pulse (Dong et al., 2006). Under these conditions of enhanced NAc excitability, animals exhibit a conditioned place aversion to cocaine, rather than the place preference response that control animals show to the same dose (Pliakas et al., 2001). In addition, they exhibit increased depressive-like behaviors in the forced swim test (Pliakas et al., 2001) and a learned helplessness paradigm (Newton et al., 2002). Another molecular manipulation that produces a similar behavioral phenotype is the overexpression of the AMPAR subunit GluR1 in NAc (Kelz et al., 1999; Todtenkopf et al., 2006). Although it has not yet been confirmed by electrophysiological study, this GluR1 overexpression is likely to produce an enhancement of synaptic excitability in NAc MSNs. Not only may this occur through the insertion of additional AMPARs in the membrane in general, but the abundance of GluR1 could potentially lead to the formation of GluR1 homomeric receptors, which are known to have a larger single-channel conductance (Swanson et al., 1997) and thus contribute even further to enhanced excitability.
Third, if NAc firing is elevated during aversive conditions, downstream targets should be suppressed via GABA release from MSNs during these conditions as well. Ventral pallidal unit recordings show very low firing rates following oral infusion of hypertonic saline, a taste stimulus that under normal physiological circumstances is aversive (Tindell et al., 2006). Although clearly more work with aversive stimuli of different modalities is needed before any firm conclusions can be drawn, the present data are consistent with the possibility that enhanced firing of NAc neurons during aversive conditions may suppress pallidal neuron firing as part of the process of encoding the unpleasant nature of a stimulus.
Based on the evidence described above, our working hypothesis is that rewarding stimuli reduce the activity of NAc MSNs, whereas aversive treatments increase the activity of these neurons. According to this model (Fig. 2), NAc neurons tonically inhibit reward-related processes. Under normal circumstances, excitatory influences mediated by glutamate actions at AMPA and NMDA receptors or dopamine actions at D1-like receptors are balanced by inhibitory dopamine actions at D2-like receptors. Treatments that would be expected to reduce activity in the NAc, including cocaine (Peoples et al., 2007), morphine (Olds, 1982), NMDA antagonists (Carlezon and Wise, 1996), L-type Ca2+ antagonists (Chartoff et al., 2006), palatable food (Wheeler et al., 2008) and expression of dominant-negative CREB (Dong et al., 2006), have reward-related effects because they reduce the inhibitory influence of the NAc on downstream reward pathways. In contrast, treatments that activate the NAc by amplifying glutamatergic inputs (e.g., elevated expression of GluR1; Todtenkopf et al., 2006), altering ion channel function (e.g., elevated expression of CREB; Dong et al., 2006), reducing inhibitory dopamine inputs to D2-like cells (e.g., κ-opioid receptor agonists), or blocking inhibitory μ- or δ-opioid receptors (West and Wise, 1988; Weiss, 2004) are perceived as aversive because they increase the inhibitory influence of the NAc on downstream reward pathways. Interestingly, stimuli such as drugs of abuse may induce homeostatic (or allostatic) neuroadaptations that persist beyond the treatment and cause baseline shifts in mood.
Such shifts may be useful in explaining co-morbidity of addiction and psychiatric illness (Kessler et al., 1997): repeated exposure to drugs that reduce the activity of NAc neurons might induce compensatory neuroadaptations that render the system more excitable during abstinence (leading to conditions characterized by anhedonia or dysphoria), whereas repeated exposure to stimuli (e.g., stress) that activate the NAc might induce compensatory neuroadaptations that render the system more susceptible to the inhibitory actions of drugs of abuse, increasing their appeal. This working hypothesis is testable through a variety of increasingly sophisticated approaches.
One caveat to the inhibition/reward hypothesis is that widespread and prolonged inhibition of NAc firing, as in inactivation or lesion studies, does not appear to produce rewarding effects (e.g. Yun et al., 2004b). This raises the possibility that it is not inhibition of the NAc, per se, that encodes reward but rather the transitions from normal basal firing rates to lower rates that occur when rewarding stimuli are present. Prolonged inhibition may degrade the dynamic information normally encoded in the transient depressions of NAc firing.
Electrophysiology-based tests of the predictions of this hypothesis fall into two basic categories. The first category involves manipulating an animal’s behavioral state to produce sustained changes in responsivity to rewarding stimuli followed by testing for electrophysiological correlates of this altered reward state. For example, the early withdrawal state from chronic exposure to psychostimulants is characterized by anhedonia and lack of responsiveness to natural rewarding stimuli. What would the inhibition/reward hypothesis predict about the electrophysiological status of NAc neurons during this state? The major prediction is that NAc neurons would exhibit decreases in the activity suppression normally produced by exposure to a rewarding stimulus (e.g. sucrose). To our knowledge, this has not yet been investigated. Possible mechanisms for such a decrease in inhibition, should it occur, might include overall increases in neuronal excitability produced by any combination of changes in intrinsic excitability (e.g. increased Na+ or Ca2+ currents, decreased K+ currents) or synaptic transmission (e.g. decreases in glutamatergic or increases in GABAergic transmission). On the other hand, the available data on NAc MSN excitability during early psychostimulant withdrawal suggest that it is actually decreased during this phase (Zhang et al., 1998; Hu et al., 2004; Dong et al., 2006; Kourrich et al., 2007). As noted above, it is possible that a prolonged depression in excitability may degrade reward-related information contained in transient firing inhibitions, perhaps by creating a “floor” effect and reducing the magnitude of these inhibitions. This possibility remains to be tested.
Considering the apparent link between NAc and ventral pallidum in reward encoding (see above), we would predict that any excitability changes produced by sustained modulation of an animal’s reward state might be particularly evident in striatopallidal/D2 neurons. Although studying the detailed physiological properties of these neurons has been difficult in the past, the recent development of a line of BAC transgenic mice that expresses GFP in these neurons (Gong et al., 2003; Lobo et al., 2006) has made it possible to visualize them in in vitro slice preparations, greatly facilitating the potential for physiological characterization of D2 cells.
The second category of electrophysiology-based tests involves using genetic engineering (see below) to alter the functional expression of key components of the cellular machinery for excitability or excitability modulation in NAc neurons. In theory, this could enable modulation of the inhibitions or excitations associated with reward or aversion, respectively, in NAc neurons. With this in mind, perhaps the most useful target molecules would be those that participate in activity-dependent modulation of neuronal excitability, rather than in maintaining basal firing rates. These targets would likely provide a better opportunity to modulate stimulus responsiveness than more general targets (e.g. Na+ channel subunits), thus enabling the evaluation of the inhibition/reward hypothesis. For example, the firing frequency of active neurons can be controlled by various ionic conductances that produce spike after-hyperpolarizations (AHPs). By targeting NAc neurons with genetic (or possibly even pharmacologic) manipulations aimed at the channels that produce AHPs, it may be possible to decrease the magnitude of aversion-related excitatory responses in these neurons and thus to test whether this physiological change correlates with reduced behavioral indices of aversion.
One of the most obvious pharmacological tests would be to determine if rats self-administer dopamine D2-like agonists directly into the NAc. Interestingly, previous work indicates that while rats self-administer combinations of D1-like and D2-like agonists into the NAc, they do not self-administer either drug component alone, at least at the doses tested (Ikemoto et al., 1997). While on the surface this finding might appear to invalidate our working hypothesis, electrophysiological evidence suggests that co-activation of D1 and D2 receptors on NAc neurons can, under some conditions, cause a reduction in their membrane excitability that is not seen in response to either agonist alone (O’Donnell and Grace, 1996). In addition, more work is needed to study the behavioral effects of intra-NAc microinfusions of GABA agonists; historically, this work has been hindered by the poor solubility of benzodiazepines, which are known to be addictive (Griffiths and Ator, 1980) despite their tendency to decrease dopamine function in the NAc (Wood, 1982; Finlay et al., 1992; Murai et al., 1994), and by the relatively small number of researchers who use brain microinjection procedures together with models of reward. Still other ways of testing our hypothesis would be to study the effects of manipulations in brain areas downstream of D2 receptor-containing MSNs. Again, early evidence suggests reward is encoded by activation of the ventral pallidum, a presumed consequence of inhibition of the MSNs of the indirect pathway (Tindell et al., 2006).
The development of genetic engineering techniques that enable the direction of inducible or conditional mutations to specific brain areas will be an important tool with which to test our hypotheses. Mice with constitutive deletion of GluRA (an alternative nomenclature for GluR1) show many alterations in sensitivity to drugs of abuse (Vekovischeva et al., 2001; Dong et al., 2004; Mead et al., 2005, 2007), some of which are consistent with our working hypothesis and some of which are not. The loss of GluR1 early in development could dramatically alter responsiveness to numerous types of stimuli, including drugs of abuse. In addition, these GluR1-mutant mice lack the protein throughout the brain, whereas the research reviewed here focuses on mechanisms that occur within the NAc. These points are especially important because loss of GluR1 in other brain regions would be expected to have dramatic, and sometimes very different, effects on drug abuse-related behaviors. As just one example, we have shown that modulation of GluR1 function in the VTA exerts the opposite effect on drug responses compared to modulation of GluR1 in the NAc (Carlezon et al., 1997; Kelz et al., 1999). The findings in GluR1-deficient mice are not inconsistent with the combined findings from the NAc and the VTA: constitutive GluR1 mutant mice are more sensitive to the stimulant effects of morphine (an effect that could be explained by the loss of GluR1 in the NAc), but they do not develop progressive increases in responsivity to morphine (an effect that could be explained by the loss of GluR1 in the VTA) when testing occurs under conditions that promote sensitization and involve additional brain regions.
Accordingly, one must be cautious in assigning spatial and temporal interpretations to data from constitutive knockout mice: the literature is becoming replete with examples of proteins that have dramatically different (and sometimes opposite) effects on behavior depending upon the brain regions under study (see Carlezon et al., 2005).
Preliminary studies indicate that mice with inducible expression of a dominant-negative form of CREB, a manipulation that reduces the excitability of NAc MSNs, are hypersensitive to the rewarding effects of cocaine while being insensitive to the aversive effects of a κ-opioid agonist (DiNieri et al., 2006). Although these findings are consistent with our working hypothesis, further studies (e.g., electrophysiology) might help to characterize the physiological basis of these effects. Regardless, an increased capacity to spatially and temporally control the expression of genes that regulate the excitability of NAc MSNs will enable progressively more sophisticated tests of our working hypothesis.
Functional brain imaging has the potential to revolutionize our understanding of the biological basis of rewarding and aversive mood states in animal models and, ultimately, people. Preliminary data from imaging studies involving alert non-human primates are providing early evidence in support of the working hypothesis described above. Intravenous administration of high doses of the κ-opioid agonist U69,593—which belongs to a class of drugs known to cause aversion in animals (Bals-Kubik et al., 1993; Carlezon et al., 2006) and dysphoria in humans (Pfeiffer et al., 1986; Wadenberg, 2003)—causes profound increases in blood-oxygen level-dependent (BOLD) functional MRI responses in the NAc (Fig. 3: from M.J. Kaufman, B. deB. Fredrick, S. S. Negus, unpublished observations; used with permission). To the extent that BOLD signal responses reflect synaptic activity, the positive BOLD response induced by U69,593 in the NAc is consistent with increased activity of MSNs, perhaps due to decreased dopamine input (DiChiara and Imperato, 1988; Carlezon et al., 2006). In contrast, positive BOLD signal responses are conspicuously absent in the NAc after treatment with an equipotent dose of fentanyl, a highly addictive μ-opioid agonist. While these fentanyl data do not indicate inhibition of the NAc per se, absence of BOLD activity in this region is not inconsistent with our working hypothesis. Clearly, additional pharmacological and electrophysiological studies are needed to characterize the meaning of these BOLD signal changes. The development of higher magnetic field strength systems is beginning to enable cutting-edge functional imaging and spectroscopy in rats and mice, opening the door to a more detailed understanding of BOLD signals and underlying brain function.
We propose a simple model of mood in which reward is encoded by reduced activity of NAc MSNs, whereas aversion is encoded by elevated activity of these same cells. Our model is supported by a preponderance of evidence already in the literature, although more rigorous tests are needed. It is also consistent with clinical studies indicating reduced numbers of inhibitory dopamine D2-like receptors in the NAc of drug addicts, which may decrease sensitivity to natural rewards and exacerbate the addiction cycle (Volkow et al., 2007). The continued development of molecular and brain imaging techniques is establishing a research environment that is conducive to the design of studies that have the power to confirm or refute this model. Regardless, a better understanding of the molecular basis of these mood states is perpetually important and relevant, particularly as accumulated knowledge from decades of research is used to develop innovative approaches that might be used to treat and prevent addiction and other conditions (e.g., mood disorders) associated with dysregulation of motivation.
Funded by the National Institute on Drug Abuse (NIDA) grants DA012736 (to WAC) and DA019666 (to MJT) and a McKnight-Land Grant professorship (to MJT). We thank M.J. Kaufman, B. deB. Fredrick, and S.S. Negus for permission to cite unpublished data from their brain imaging studies in monkeys.
COMMENTS: Explains how dopamine dysregulation can aggravate relapse, narrow users' interests and perturb decision-making, thus accounting for a wide range of addiction-related symptoms.
Department of Psychiatry, McGill University, 1033 Pine Avenue West, Montreal, Quebec, CANADA H3A 1A1.
In animal models, considerable evidence suggests that increased motivation to seek and ingest drugs of abuse is related to conditioned and sensitized activations of the mesolimbic dopamine (DA) system. Direct evidence for these phenomena in humans, though, is sparse. However, recent studies support the following.
First, the acute administration of drugs of abuse across pharmacological classes increases extracellular DA levels within the human ventral striatum.
Second, individual differences in the magnitude of this response correlate with rewarding effects of the drugs and the personality trait of novelty seeking.
Third, transiently diminishing DA transmission in humans decreases drug craving, the propensity to preferentially respond to reward-paired stimuli, and the ability to sustain responding for future drug reward.
Finally, very recent studies suggest that repeated exposure to stimulant drugs, either on the street or in the laboratory, can lead to conditioned and sensitized behavioral responses and DA release.
In contrast to these findings, though, in individuals with a long history of substance abuse, drug-induced DA release is decreased. This diminished DA release could reflect two different phenomena. First, it is possible that drug withdrawal-related decrements in DA cell function persist longer than previously suspected.
Second, drug-paired stimuli may gain marked conditioned control over the release of DA and the expression of sensitization, leading to reduced DA release when drug-related cues are absent.
Based on these observations, a two-factor hypothesis of the role of DA in drug abuse is proposed.
In the presence of drug cues, conditioned and sensitized DA release would occur, leading to focused drug-seeking behavior.
In comparison, in the absence of drug-related stimuli, DA function would be reduced, diminishing the ability of individuals to sustain goal-directed behavior and long-term objectives.
This conditioned control of the expression of sensitized DA release could aggravate susceptibility to relapse, narrow the range of interests and perturb decision-making, accounting for a wide range of addiction-related phenomena.
J Genet Syndr Gene Ther. 2013 February 10; 4(121): 1000121. doi: 10.4172/2157-7412.1000121
Having entered the genomics era with confidence in the future of medicine, including psychiatry, identifying the role of DNA and polymorphic associations with brain reward circuitry has led to a new understanding of all addictive behaviors. It is noteworthy that this strategy may provide treatment for the millions who are the victims of “Reward Deficiency Syndrome” (RDS), a genetic disorder of brain reward circuitry. This article will focus on drugs and food being mutually addictive, and the role of dopamine genetics and function in addictions, including the interaction of the dopamine transporter and sodium food. We will briefly review our concept that concerns the genetic antecedents of multiple addictions (RDS). Studies have also shown that evaluating a panel of established reward genes and polymorphisms enables the stratification of genetic risk to RDS. The panel is called the “Genetic Addiction Risk Score (GARS)”, and is a tool for the diagnosis of a genetic predisposition for RDS. The use of this test, as pointed out by others, would benefit the medical community by identifying at-risk individuals at a very early age. We encourage in-depth work in both animal and human models of addiction. We encourage further exploration of the neurogenetic correlates of the commonalities between food and drug addiction and endorse forward-thinking hypotheses like “The Salted Food Addiction Hypothesis”.
Dopamine (DA) is a neurotransmitter in the brain that controls feelings of wellbeing. This sense of wellbeing results from the interaction of DA and neurotransmitters such as serotonin, the opioids, and other brain chemicals. Low serotonin levels are associated with depression. High levels of the opioids (the brain’s opium) are also associated with a sense of wellbeing. Moreover, DA receptors, a class of G-protein coupled receptors (GPCRs), have been targeted for drug development for the treatment of neurological, psychiatric and ocular disorders. DA has been called the “anti-stress” and/or “pleasure” molecule, but this has been recently debated by Salamone and Correa and by Sinha.
Accordingly, we have argued [5-8] that nucleus accumbens (NAc) DA has a role in motivational processes, and that mesolimbic DA dysfunction may contribute to motivational symptoms of depression, features of substance abuse and other disorders. Although it has become traditional to label DA neurons as reward neurons, this is an overgeneralization, and it is necessary to consider how different aspects of motivation are affected by dopaminergic manipulations. For example, NAc DA is involved in Pavlovian processes, instrumental learning, appetitive-approach behavior, aversive motivation, behavioral activation processes, sustained task engagement and exertion of effort, although it does not mediate initial hunger, motivation to eat or appetite [3,5-7].
While it is true that NAc DA is involved in appetitive and aversive motivational processes, we argue that DA is also involved as an important mediator in primary food motivation or appetite, similar to drugs of abuse. A review of the literature provides a number of papers that show the importance of DA in food craving behavior and appetite mediation [6,7]. Gold has pioneered the concept of food addiction [5-8]. Avena et al. correctly argue that because addictive drugs activate the same neurological pathways that evolved to respond to natural rewards, addiction to food seems plausible. Moreover, sugar per se is noteworthy as a substance that releases opioids and DA and thus might be expected to have addictive potential. Specifically, neural adaptations include changes in DA and opioid receptor binding, enkephalin mRNA expression and DA and acetylcholine release in the NAc. The evidence supports the hypothesis that under certain circumstances rats can become sugar dependent.
The work of Wang et al. involving brain imaging studies in humans has implicated DA-modulated circuits in pathologic eating behavior(s). Their studies suggest that DA in the extracellular space of the striatum is increased by food cues, which is evidence that DA is potentially involved in the non-hedonic motivational properties of food. They also found that orbitofrontal cortex metabolism is increased by food cues, indicating that this region is associated with motivation for the mediation of food consumption. There is an observed reduction in striatal DA D2 receptor availability in obese subjects, similar to the reduction in drug-addicted subjects; thus obese subjects may be predisposed to use food to compensate temporarily for understimulated reward circuits. In essence, the powerful reinforcing effects of both food and drugs are in part mediated by abrupt DA increases in the mesolimbic brain reward centers. Volkow et al. point out that abrupt DA increases can override homeostatic control mechanisms in the brains of vulnerable individuals. Brain imaging studies have delineated the neurological dysfunction that generates the shared features of food and drug addictions. The cornerstone of the commonality in the root causes of addiction is impairment in the dopaminergic pathways that regulate the neuronal systems associated also with self-control, conditioning, stress reactivity, reward sensitivity and incentive motivation. Metabolism in prefrontal regions is involved in inhibitory control; in obese subjects the inability to limit food intake involves ghrelin and may be the result of decreased DA D2 receptors, which are associated with decreased prefrontal metabolism. The limbic and cortical regions involved with motivation, memory and self-control are activated by gastric stimulation in obese subjects and during drug craving in drug-addicted subjects.
An enhanced sensitivity to the sensory properties of food is suggested by increased metabolism in the somatosensory cortex of obese subjects. This enhanced sensitivity to food palatability coupled with reduced DA D2 receptors could make food the salient reinforcer for compulsive eating and obesity risk. These research results indicate that numerous brain circuits are disrupted in obesity and drug addiction and that the prevention and treatment of obesity may benefit from strategies that target improved DA function.
Lindblom et al. reported that dieting as a strategy to reduce body weight often fails because it causes food cravings, leading to binging and weight regain. They also agree that evidence from several lines of research suggests the presence of shared elements in the neural regulation of food and drug craving. Lindblom et al. quantified the expression of eight genes involved in DA signaling in brain regions related to the mesolimbic and nigrostriatal DA system in male rats subjected to chronic food restriction, using quantitative real-time polymerase chain reaction. They found that mRNA levels of tyrosine hydroxylase and the dopamine transporter in the ventral tegmental area were strongly increased by food restriction; concurrent DAT up-regulation at the protein level in the shell of the NAc was also observed via quantitative autoradiography. That these effects were observed after chronic rather than acute food restriction suggests that sensitization of the mesolimbic dopamine pathway may have occurred. Thus, sensitization, possibly due to increased clearance of extracellular dopamine from the NAc shell, may be one of the underlying causes for the food cravings that hinder dietary compliance. These findings are in agreement with earlier findings by Patterson et al., who demonstrated that direct intracerebroventricular infusion of insulin results in an increase in mRNA levels for the DA reuptake transporter DAT. In a 24- to 36-hour food deprivation study, in situ hybridization was used to assess DAT mRNA levels in food-deprived (hypoinsulinemic) rats. Levels in the ventral tegmental area/substantia nigra pars compacta were significantly decreased, suggesting that striatal DAT function can be moderated by nutritional status, fasting and insulin. Ifland et al. advanced the hypothesis that processed foods with high concentrations of sugar and other refined sweeteners, refined carbohydrates, fat, salt, and caffeine are addictive substances.
Other studies have evaluated salt as an important factor in food-seeking behavior. Roitman et al. point out that increased DA transmission in the NAc is correlated with motivated behaviors, including Na appetite. DA transmission is modulated by DAT and may play a role in motivated behaviors. In their in vivo studies, robust decreases in DA uptake via DAT in the rat NAc were correlated with Na appetite induced by Na depletion. Decreased DAT activity in the NAc was observed after in vitro aldosterone treatment. Thus, a reduction in DAT activity in the NAc may be the consequence of a direct action of aldosterone and may be a mechanism by which Na depletion induces increased NAc DA transmission during Na appetite. Increased NAc DA may be the motivating property for the Na-depleted rat. Further support for the role of salted food as a possible substance (food) of abuse has resulted in “The Salted Food Addiction Hypothesis” as proposed by Cocores and Gold. In a pilot study to determine whether salted foods act like a mild opiate agonist that drives overeating and weight gain, they found that an opiate-dependent group developed a 6.6% increase in weight during opiate withdrawal, showing a strong preference for salted food. Based on this and other literature, they suggest that salted food may be an addictive substance that stimulates opiate and DA receptors in the reward and pleasure center of the brain. Alternately, preference, hunger, urge, and craving for “tasty” salted food may be symptoms of opiate withdrawal and the opiate-like effect of salty food. Both salty foods and opiate withdrawal stimulate the Na appetite and result in increased calorie intake, overeating and disease related to obesity.
When synaptic DA stimulates DA receptors (D1–D5), individuals experience stress reduction and feelings of wellbeing. As mentioned earlier, the mesocorticolimbic dopaminergic pathway mediates reinforcement of both unnatural rewards and natural rewards. Natural rewards satisfy physiological drives such as hunger and reproduction, while unnatural rewards involve satisfaction of acquired, learned pleasures: hedonic sensations like those derived from drugs, alcohol, gambling and other risk-taking behaviors [8,20,21].
One notable DA gene is the DRD2 gene, which is responsible for the synthesis of DA D2 receptors. The allelic form of the DRD2 gene (A1 versus A2) dictates the number of receptors at post-junctional sites and hypodopaminergic function [23,24]. A paucity of DA receptors predisposes individuals to seek any substance or behavior that stimulates the dopaminergic system [25-27].
The DRD2 gene and DA have long been associated with reward in spite of controversy [3,4]. Although the Taq1 A1 allele of the DRD2 gene has been associated with many neuropsychiatric disorders and initially with severe alcoholism, it is also associated with other substance and process addictions, as well as Tourette’s Syndrome, high novelty-seeking behaviors, Attention Deficit Hyperactivity Disorder (ADHD) and, in children and adults, co-morbid antisocial personality disorder symptoms.
While this article will focus on drugs and food being mutually addictive, and the role of DA genetics and function in addictions, for completeness we will briefly review our concept that concerns the genetic antecedents of multiple addictions. “Reward Deficiency Syndrome” (RDS) was first described in 1996 as a theoretical genetic predictor of compulsive, addictive and impulsive behaviors with the realization that the DRD2 A1 genetic variant is associated with these behaviors [29-32]. RDS involves the pleasure or reward mechanisms that rely on DA. Behaviors or conditions that are the consequence of DA resistance or depletion are manifestations of RDS. An individual’s biochemical reward deficiency can be mild, the result of overindulgence or stress, or more severe, the result of a DA deficiency based on genetic makeup. RDS or anti-reward pathways help to explain how certain genetic anomalies can give rise to complex aberrant behavior. There may be a common neurobiology, neuro-circuitry and neuroanatomy for a number of psychiatric disorders and multiple addictions. It is well known that drugs of abuse, alcohol, sex, food, gambling and aggressive thrills, indeed most positive reinforcers, cause activation and neuronal release of brain DA and can decrease negative feelings. Abnormal cravings are linked to low DA function. Here is an example of how complex behaviors can be produced by specific genetic antecedents. A deficiency of, for example, the D2 receptors, a consequence of having the A1 variant of the DRD2 gene, may predispose individuals to a high risk for cravings that can be satisfied by multiple addictive, impulsive, and compulsive behaviors. This deficiency could be compounded if the individual had another polymorphism in, for example, the DAT gene that resulted in excessive removal of DA from the synapse. In addition, the use of substances and aberrant behaviors also depletes DA.
Thus, RDS can be manifest in severe or mild forms that are a consequence of a biochemical inability to derive reward from ordinary, everyday activities. Although many genes and polymorphisms predispose individuals to abnormal DA function, carriers of the Taq1 A1 allele of the DRD2 gene lack enough DA receptor sites to achieve adequate DA sensitivity. This DA deficit in the reward site of the brain can result in unhealthy appetites and craving. In essence, they seek substances like alcohol, opiates, cocaine, nicotine and glucose, and behaviors, even abnormally aggressive behaviors, that are known to activate dopaminergic pathways and cause preferential release of DA at the NAc. There is now evidence that, rather than the NAc, the anterior cingulate cortex may be involved in operant, effort-based decision making [35-37] and may be a site of relapse.
Impairment of the DRD2 gene or of other DA receptor genes, such as DRD1, involved in homeostasis and so-called normal brain function, could ultimately lead to neuropsychiatric disorders including aberrant drug and food seeking behavior. Prenatal drug abuse in the pregnant female has been shown to have profound effects on the neurochemical state of offspring. These include ethanol, cannabis, heroin, cocaine, and drug abuse in general. Most recently, Novak et al. provided strong evidence that abnormal development of striatal neurons is part of the pathology underlying major psychiatric illnesses. The authors identified an underdeveloped gene network (early) in rat that lacks important striatal receptor pathways (signaling). At two postnatal weeks the network is down-regulated and replaced by a network of mature genes expressing striatal-specific genes, including the DA D1 and D2 receptors, providing these neurons with their functional identity and phenotypic characteristics. Thus, this developmental switch, in both the rat and human, has the potential to be a point of susceptibility to disruption of growth by environmental factors such as overindulgence in foods, like salt, and drug abuse.
The DA transporter (also DA active transporter, DAT, SLC6A3) is a membrane-spanning protein that pumps the neurotransmitter DA out of the synapse back into the cytosol, from which other known transporters sequester DA and norepinephrine into neuronal vesicles for storage and subsequent release.
The DAT protein is encoded by a gene located on human chromosome 5; it is about 64 kbp long and consists of 15 coding exons. Specifically, the DAT gene (SLC6A3 or DAT1) is localized to chromosome 5p15.3. Moreover, there is a VNTR polymorphism within the 3′ non-coding region of DAT1. A genetic polymorphism in the DAT gene that affects the amount of protein expressed is evidence for an association between DA-related disorders and DAT. It is well established that DAT is the primary mechanism that clears DA from synapses, except in the prefrontal cortex, where DA reuptake involves the norepinephrine transporter [46,47]. DAT terminates the DA signal by removing the DA from the synaptic cleft and depositing it into surrounding cells. Importantly, several aspects of reward and cognition are functions of DA, and DAT facilitates regulation of DA signaling.
It is noteworthy that DAT is an integral membrane protein and is considered a symporter and a co-transporter moving DA from the synaptic cleft across the phospholipid cell membrane by coupling its movement to the movement of Na ions down the electrochemical gradient (facilitated diffusion) and into the cell.
Moreover, DAT function requires the sequential binding and co-transport of two Na ions and one chloride ion with the DA substrate. The driving force for DAT-mediated DA reuptake is the ion concentration gradient generated by the plasma membrane Na+/K+ ATPase.
Sonders et al. evaluated the widely accepted model for monoamine transporter function. They found that normal monoamine transporter function requires set rules. For example, Na ions must bind to the extracellular domain of the transporter before DA can bind. Once DA binds, the protein undergoes a conformational change, which allows both Na and DA to unbind on the intracellular side of the membrane. A number of electrophysiological studies have confirmed that DAT, like other monoamine transporters, transports one molecule of neurotransmitter across the membrane with one or two Na ions. Negatively charged chloride ions are required to prevent a buildup of positive charge. These studies used radiolabeled DA and have also shown that the transport rate and direction are totally dependent on the Na gradient.
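The charge bookkeeping implied by this stoichiometry can be sketched in a few lines. This is only an illustration under stated assumptions, not a model from the cited studies: it takes the ratio quoted above (two Na ions and one chloride ion per DA molecule) and additionally assumes that DA is transported as a monovalent cation, which the text does not state. The function and variable names are hypothetical.

```python
# Net charge moved into the cell per DAT transport cycle, using the
# stoichiometry quoted in the text: 2 Na+ and 1 Cl- per DA molecule.
# ASSUMPTION (not from the article): DA is counted as a monovalent cation,
# because its amine group is mostly protonated at physiological pH.

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per elementary charge

# species -> (particles per cycle, charge per particle in elementary units)
STOICHIOMETRY = {
    "Na+": (2, +1),
    "Cl-": (1, -1),
    "DA+": (1, +1),  # assumed protonated dopamine
}


def net_charge_per_cycle(stoichiometry):
    """Sum count * valence over all co-transported species."""
    return sum(count * valence for count, valence in stoichiometry.values())


def transport_current_amps(cycles_per_second, stoichiometry=STOICHIOMETRY):
    """Inward current carried by a given number of transport cycles per second."""
    return cycles_per_second * net_charge_per_cycle(stoichiometry) * ELEMENTARY_CHARGE_C


if __name__ == "__main__":
    print(net_charge_per_cycle(STOICHIOMETRY))  # 2

    # Dropping chloride from the cycle raises the net positive charge,
    # which is the sense in which Cl- limits charge buildup.
    without_chloride = {k: v for k, v in STOICHIOMETRY.items() if k != "Cl-"}
    print(net_charge_per_cycle(without_chloride))  # 3
```

Under these assumptions each cycle still carries net positive charge inward, so chloride reduces rather than abolishes the charge movement. That is one way to read the dependence of transport on membrane potential discussed in this section.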
Since it is well known that many drugs of abuse cause the release of neuronal DA, DAT may have a role in this effect. Because of the tight coupling of the membrane potential and the Na gradient, activity-induced changes in membrane polarity can dramatically influence transport rates. In addition, the transporter may contribute to DA release when the neuron depolarizes. In essence, as pointed out by Vandenbergh et al., the DAT protein regulates DA-mediated neurotransmission by rapidly accumulating DA that has been released into the synapse.
The DAT membrane topology was initially theoretical, determined based on hydrophobic sequence analysis and similarity to the GABA transporter. The initial prediction of Kilty et al. of a large extracellular loop between the third and fourth of twelve transmembrane domains was confirmed by Vaughan and Kuhar, who used proteases, which digest proteins into smaller fragments, and glycosylation, which occurs only on extracellular loops, to verify most aspects of DAT structure.
DAT has been found in regions of the brain where there is dopaminergic circuitry; these areas include the mesocortical, mesolimbic, and nigrostriatal pathways. The nuclei that make up these pathways have distinct patterns of expression. DAT was not detected within any synaptic cleft, which suggests that striatal DA reuptake occurs outside of the synaptic active zones after DA has diffused from the synaptic cleft.
Two alleles, the 9-repeat (9R) and 10-repeat (10R) VNTR, can increase the risk for RDS behaviors. The presence of the 9R VNTR has been associated with alcoholism and Substance Use Disorder. It has been shown to augment transcription of the DAT protein, resulting in enhanced clearance of synaptic DA and, in turn, a reduction in DA and in DA activation of postsynaptic neurons. The tandem repeats of the DAT have been associated with reward sensitivity and high risk for Attention Deficit Hyperactivity Disorder (ADHD) in both children and adults [59,60]. The 10-repeat allele has a small but significant association with hyperactivity-impulsivity (HI) symptoms.
Support for the impulsive nature of individuals possessing variants of dopaminergic and other neurotransmitter genes (e.g. DRD2, DRD3, DRD4, DAT1, COMT, MAO-A, SLC6A4, Mu, GABAB) is derived from a number of important studies illustrating the genetic risk for drug-seeking behaviors based on association and linkage studies implicating these alleles as risk antecedents that have an impact in the mesocorticolimbic system (Table 1). Our laboratory, in conjunction with LifeGen, Inc. and Dominion Diagnostics, Inc., is carrying out research involving twelve select centers across the United States to validate the first ever patented genetic test to determine a patient’s genetic risk for RDS, called the Genetic Addiction Risk Score™ (GARS).
The authors appreciate the expert editorial input from Margaret A. Madigan and Paula J. Edge. We appreciate the comments by Eric R. Braverman, Raquel Lohmann, Joan Borsten, B.W Downs, Roger L. Waite, Mary Hauser, John Femino, David E Smith, and Thomas Simpatico. Marlene Oscar-Berman is the recipient of grants from the National Institutes of Health, NIAAA RO1-AA07112 and K05-AA00219 and the Medical Research Service of the US Department of Veterans Affairs. We also acknowledge the case report input of Karen Hurley, Executive Director of National Institute of Holistic Addiction studies, North Miami Beach Florida. In part, this article was supported by a grant awarded to Path foundation NY from Life Extension Foundation.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of Interest: Kenneth Blum, PhD, holds a number of US and foreign patents related to diagnosis and treatment of RDS, which have been exclusively licensed to LifeGen, Inc., Lederach, PA. Dominion Diagnostics, LLC, North Kingstown, Rhode Island, along with LifeGen, Inc., are actively involved in the commercial development of GARS. John Giordano is also a partner in LifeGen, Inc. There are no other conflicts of interest, and all authors read and approved the manuscript.
Int J Neuropsychopharmacol. 2015 Apr 9. pii: pyv041. doi: 10.1093/ijnp/pyv041.
Background: Sensation-seeking is a trait that constitutes an important vulnerability factor for a variety of psychopathologies with high social cost. However, little is understood either about the mechanisms underlying motivation for intense sensory experiences or their neuropharmacological modulation in humans.
Methods: Here, we first evaluate a novel paradigm to investigate sensation-seeking in humans. This test probes the extent to which participants choose either to avoid or self-administer an intense tactile stimulus (mild electric stimulation) orthogonal to performance on a simple economic decision-making task. Next we investigate in a different set of participants whether this behavior is sensitive to manipulation of dopamine D2 receptors using a within-subjects, placebo-controlled, double-blind design.
Results: In both samples, individuals with higher self-reported sensation-seeking chose a greater proportion of mild electric stimulation-associated stimuli, even when this involved sacrifice of monetary gain. Computational modelling analysis determined that people who assigned an additional positive economic value to mild electric stimulation-associated stimuli exhibited speeding of responses when choosing these stimuli. In contrast, those who assigned a negative value exhibited slowed responses. These findings are consistent with involvement of low-level, approach-avoidance processes. Furthermore, the D2 antagonist haloperidol selectively decreased the additional economic value assigned to mild electric stimulation-associated stimuli in individuals who showed approach reactions to these stimuli under normal conditions (behavioral high-sensation seekers).
Conclusions: These findings provide the first direct evidence of sensation-seeking behavior being driven by an approach-avoidance–like mechanism, modulated by dopamine, in humans. They provide a framework for investigation of psychopathologies for which extreme sensation-seeking constitutes a vulnerability factor.
Sensation-seeking is a personality trait concerned with motivation for “intense, unusual and unpredictable” sensory experiences (Zuckerman, 1994) that constitutes an important and well-conceptualised individual difference (Roberti, 2004). Engagement in various sensation-seeking–type activities (eg, recreational drug consumption, risky driving and sexual behaviors) covaries across both adults and adolescents (Carmody et al., 1985; King et al., 2012). In addition, questionnaire-based measures of sensation-seeking personality have high heritability estimates (40–60%; Koopmans et al., 1995; Stoel et al., 2006) with rank order differences in scores remaining highly stable over time (Terracciano et al., 2011).
Extreme sensation-seeking has been implicated in a variety of psychopathologies with high social cost, including substance and gambling addictions (Zuckerman, 1994; Roberti, 2004; Perry et al., 2011). Among individuals with substance use disorders, higher sensation-seeking score is associated with earlier age of onset, increased polysubstance use, more severe functional impairment, and poorer overall treatment outcome (Ball et al., 1994; Staiger et al., 2007; Lackner et al., 2013). Identification of mechanisms underlying human sensation-seeking is therefore likely to have high clinical relevance.
Investigations of animal models of sensation-seeking have implicated variation in striatal dopamine function, particularly at D2 types (D2/D3/D4) dopamine receptors, in mediating individual preferences for novel or sensory stimulation-inducing choice options (Bardo et al., 1996; Blanchard et al., 2009; Shin et al., 2010). As the efficacy of striatal dopaminergic transmission is considered to be involved in the vigor of approach behaviors in response to salient stimuli (Ikemoto, 2007; Robbins and Everitt, 2007), one theoretical account proposes that the core basis for individual differences in sensation-seeking is in the differential activation of dopaminergic approach-withdrawal mechanisms in response to novel and intense stimuli (Zuckerman, 1990).
Consistent with this view, genetic and PET evidence has implicated differences in function at D2-type receptors in individual differences in human sensation-seeking (eg, Hamidovic et al., 2009; Gjedde et al., 2010). Crucially, however, lack of behavioral paradigms analogous to those in the preclinical literature has meant that it has not been possible to test the approach-avoidance hypothesis directly in humans. Such an approach has previously proved highly fruitful with respect to other facets of impulsivity (Winstanley, 2011; Jupp and Dalley, 2014).
Here, we first tested a novel instrumental task of human sensation-seeking–like behavior that involved the opportunity to self-administer mild (but nonpainful) electric stimulation (MES) during performance of an economic decision-making task. This task was designed to be analogous to a recent operant sensation-seeking paradigm developed for rodents (Olsen and Winder, 2009). We next used a within-subjects design to investigate the effects of the D2 dopamine receptor antagonist haloperidol on task performance in a different sample of healthy volunteers. We predicted that: (1) individuals high in trait sensation-seeking would assign a positive economic value to the opportunity to experience such an “intense and unusual” sensory stimulus; (2) this preference would be reflected in an approach-like speeded relative response time for these stimuli; and (3) such “behavioral sensation-seeking” would be disrupted by antagonism at D2 receptors, depending on baseline sensation-seeking performance (Norbury et al., 2013).
Forty-five healthy participants (28 female), mean age 24.3 (SD 3.55), were recruited via internet advertisements (for further demographic information, see Table 1). This sample size was chosen to allow us to detect a moderate-strength relationship between task performance and self-reported sensation-seeking trait on the basis of previous findings that correlations between behavioral and questionnaire measures of other facets of impulsive behavior are modest in strength (correlation coefficients up to 0.40; eg, Helmers et al., 1995; Mitchell, 1999). An a priori power calculation determined that a sample size of 44 would be necessary to detect a correlation coefficient of 0.40 at a conventional power of 80% and alpha of 0.05. Exclusion criteria consisted of any current or past neurological or psychiatric illness, or head injury. All participants provided written informed consent and the study was approved by the University College London Ethics Committee.
Table 1. Demographic Information for Participants
| Measure | Study 1 | Study 2 |
|---|---|---|
| n (female) | 45 (28) | 28 (0) |
| Age (years) | 24.3 (3.55) | 22.3 (2.74) |
| Years of education | 16.1 (3.1) | - |
| Raven’s 12-APM score | - | 9.1 (2.5) |
| SSS-V-R total score (range) | 261 (46) (162–352) | - |
| UPPS SS score (range) | - | 23.2 (5.8) (18–47) |
| Alcohol (drinks per week) | 3.7 (4.5) | 5.9 (8.7) |
| Tobacco (cigarettes per week) | 4.1 (10.2) | 8.4 (18.3) |
| Other drug use (n) | | |
| Stimulant use (ever) | 2 | 4 |
| Gambling behavior (n) | | |
| Several times per year | 5 | 3 |
| Several times per month | 1 | 7 |
| Weekly or more | 0 | 1 |
Abbreviations: Raven’s 12-APM=Raven’s Advanced Progressive Matrices non-verbal IQ test (12-item version); SSS-V-R, Sensation-Seeking Scale version V (revised); UPPS SS, UPPS impulsivity scale sensation-seeking subscale score.
Other demographic scores refer to behavior during the last 12 months. Unless otherwise specified, figures represent mean (SD) for each group.
Participants completed a novel sensation-seeking task designed to probe the precise economic value (positive or negative) they assigned to the opportunity to receive an “intense” sensory stimulus (MES). In the first part of the task (acquisition phase), they simply learned the point values associated with various different abstract visual stimuli (conditioned stimuli [CSs]). Eight different fractals were used as CSs, with 2 of them assigned to each of 4 possible point values (25, 50, 75, or 100 points). In every trial, fractals were presented as pairs, restricted to consist of either adjacent or equal point value stimuli, yielding 10 different trial types (Figure 1).
Figure 1. Sensation-seeking task. In the first part of the task (acquisition phase), participants were presented with a series of forced choice decisions between pairs of abstract fractal images. There were 8 different fractal stimuli (conditioned stimuli [CSs]) with 2 different CSs assigned to each of 4 possible point values (25, 50, 75, or 100 points; with which choice option a particular fractal represented randomised for every participant). Choice pairs were restricted to consist of either adjacent or equal points value stimuli, yielding 10 trial types. The acquisition phase of the task continued for a minimum of 80 trials until participants reached a criterion level of performance, ≥80% higher points value choices over the last 10 trials where a higher points value choice was possible. After this learning stage was completed, participants progressed to the second part of the task (test phase). For the test phase, participants were instructed that all stimuli were associated with the same points value as before but that some of the stimuli were now associated with the chance of receiving a mild electric stimulus (MES) to their nondominant hand (the magnitude of the MES was individually calibrated to be “stimulating but not painful” prior to starting the task). Specifically, one-half of the stimuli became designated as CS+s (chance of MES) and the other half CS-s (no chance of MES) in such a way that trials fell into 1 of 3 types: those where the CS+ was the lower points option, those where the CS+ was the higher points option, and, crucially, those where the CS+ and CS- stimuli were of equal points value. To increase the salience of the tactile stimulus, receipt of the electrical stimulation was probabilistic in both occurrence and timing. The probability of receiving the MES given selection of a CS+ stimulus was 0.75, with the onset of the MES occurring randomly during a 2500-ms inter-stimulus interval (ISI), throughout which participants were presented with a blank screen.
The acquisition phase continued for a minimum of 80 trials until participants reached a criterion level of performance (choosing the fractal associated with the higher points value on 80% or greater of trials where this was possible, over the last ten trials). After this learning stage was completed, participants progressed to the second part of the task (test phase).
In the test phase, one-half of the choice stimuli became additionally associated with the chance of receiving a nonpainful MES to the hand. These fractals will henceforth be referred to as CS+s (for full details, see Figure 1). The other fractals were not associated with electrical stimulation and so are referred to as CS-. For each points value, one of the associated fractals became CS+ (chance of MES), while the other was CS- (no chance of MES). This yielded 3 trial types: those where CS+ was the lower points option, those where CS+ was the higher points option, and, crucially, those where the CS+ and CS- stimuli were of equal points value.
Participants thus continued to make choices between fractal pairs, with the only difference being that now one-half of the choice options were associated with the chance of receiving the MES, including, importantly, on trials where both fractals were of the same points value. The key experimental question was whether some participants’ choices would be biased towards selecting the CS+ stimuli when it was of equal points value to, or even less than, the CS-. The degree of bias in participants’ choice towards or away from CS+ stimuli, with respect to the relative points value of the CS+ option, thus allowed precise calculation of the economic value (positive or negative) each participant assigned to the opportunity to receive the additional intense sensory stimulus (see Computational modelling analysis).
Participants completed 100 test phase trials (10 per trial type) and were told they would be paid a cash bonus at the end that depended on the total number of points accrued. To increase the salience of the tactile stimulus, receipt of MES was probabilistic in both occurrence and timing. The probability of receiving the MES given selection of a CS+ stimulus was 0.75, with the onset of MES occurring randomly during a 2500-ms inter-stimulus interval.
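The trial-type arithmetic above can be checked with a short sketch. It assumes that every test-phase trial pairs one CS+ with one CS- and that "adjacent" means point values one 25-point step apart; the function names are illustrative and do not come from the study's own code.

```python
from collections import Counter
from itertools import product

POINTS = (25, 50, 75, 100)  # point values used in the task


def test_phase_trial_types(points=POINTS, step=25):
    """List (cs_plus_points, cs_minus_points) pairs with equal or adjacent values."""
    return [(p, m) for p, m in product(points, repeat=2) if abs(p - m) <= step]


def classify(cs_plus, cs_minus):
    """Label a trial by the relative points value of the CS+ option."""
    if cs_plus < cs_minus:
        return "CS+ lower"
    if cs_plus > cs_minus:
        return "CS+ higher"
    return "equal"


def count_by_category(trial_types):
    """Count how many trial types fall into each of the 3 categories."""
    return dict(Counter(classify(p, m) for p, m in trial_types))


if __name__ == "__main__":
    types = test_phase_trial_types()
    print(len(types), count_by_category(types))
```

Under these assumptions there are 10 trial types (3 with the CS+ as the lower option, 3 with it as the higher option, 4 with equal values), so 10 repetitions of each give the 100 test trials, 40 of which are equal-points trials.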
Before initiating the task, participants rated their preference for each of the fractals to be used in the paradigm on a computerized visual analogue scale (VAS) (ranging from “like” to “dislike”). This measure was repeated for a second time following completion of the acquisition phase (ie, after learning the points value associated with each CS), and for a third time at the end of the experiment (ie, following introduction of the MESs). For details of apparatus and stimulation parameters used to deliver the MES, see Supplementary Information.
Following consent and task instructions, the amplitude of the electrical stimulation was calibrated individually for each participant via a standardized work-up procedure. Specifically, participants received a series of single stimulation pulses, starting at a very low amplitude (0.5 mA; generally reported by participants as being only just detectable) and gradually increasing in current strength until the stimulation was rated as 6 out of 10 on a VAS ranging from 0 (just detectable) to 10 (painful or unpleasant), a level at which participants endorsed a description of the sensation as being “stimulating but not painful.” This procedure was repeated twice for each participant to ensure consistency.
Participants also completed several self-report measures: a revised measure of the Sensation-Seeking Scale version V (Zuckerman, 1994; Gray and Wilson, 2007); a measure of hedonic tone, the Snaith-Hamilton Anhedonia Scale (Snaith et al., 1995); and the trait scale of the State-Trait Anxiety Inventory (Spielberger et al., 1970). The latter 2 measures were included to test the possibility that individual differences in MES preference may be related to trait anxiety or current state (an)hedonia rather than being driven by sensation-seeking personality per se. Demographic information regarding years of education, cigarette and alcohol consumption, recreational drug use, and frequency of engagement in gambling-related activities was also collected.
For test phase data, it was assumed that a choice between 2 CSs, A and B (where A is the CS+ stimulus and B is the CS-), could be represented as:
$$V_A = R_A + \theta, \qquad V_B = R_B$$
where R_X is the point value of stimulus X, θ is the additional value (in points) assigned to the opportunity to receive the MES (positive or negative), and V_X represents the overall value of each option.
This model was then fitted across all test phase choice data for each participant via a sigmoidal choice (softmax) function:
$$P(A) = \frac{1}{1 + e^{-\beta (V_A - V_B)}}$$
Values of the free parameters θ and β (the softmax temperature parameter, a measure of choice stochasticity) were fitted to the data on a subject-by-subject basis using log likelihood maximization.
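As a rough illustration of this fitting step, the sketch below assumes the CS+ value is its points plus θ, the CS- value is its points alone, and choice follows a logistic rule with slope β. The grid search is a stand-in for whatever optimiser the authors used in Matlab, and the simulated data, grid ranges, and function names are all hypothetical.

```python
import math
import random

POINTS = (25, 50, 75, 100)


def p_choose_cs_plus(r_plus, r_minus, theta, beta):
    """Probability of choosing the CS+ under the assumed logistic (softmax) rule."""
    v_plus = r_plus + theta  # CS+ value: points plus the value assigned to the MES
    v_minus = r_minus        # CS- value: points only
    return 1.0 / (1.0 + math.exp(-beta * (v_plus - v_minus)))


def neg_log_likelihood(trials, theta, beta):
    """trials is a list of (r_plus, r_minus, chose_plus) tuples."""
    nll = 0.0
    for r_plus, r_minus, chose_plus in trials:
        p = p_choose_cs_plus(r_plus, r_minus, theta, beta)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
        nll -= math.log(p if chose_plus else 1.0 - p)
    return nll


def fit_theta_beta(trials):
    """Coarse grid search for the maximum-likelihood (theta, beta) pair."""
    theta_grid = range(-40, 41, 2)                # hypothetical search range, in points
    beta_grid = [0.02 * k for k in range(1, 16)]  # hypothetical search range
    candidates = ((t, b) for t in theta_grid for b in beta_grid)
    return min(candidates, key=lambda tb: neg_log_likelihood(trials, tb[0], tb[1]))


def simulate(theta, beta, n_per_type, rng):
    """Simulate choices on trial types with equal or adjacent point values."""
    trials = []
    for r_plus in POINTS:
        for r_minus in POINTS:
            if abs(r_plus - r_minus) <= 25:
                for _ in range(n_per_type):
                    p = p_choose_cs_plus(r_plus, r_minus, theta, beta)
                    trials.append((r_plus, r_minus, rng.random() < p))
    return trials


if __name__ == "__main__":
    data = simulate(theta=10, beta=0.1, n_per_type=10, rng=random.Random(0))
    print(fit_theta_beta(data))
```

With only 100 trials per participant the recovered θ is noisy, which is one reason the paper checks parameter reliability across sessions rather than relying on a single fit.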
Overall, participants chose the MES-associated stimulus (CS+) on 20.4% (SD 17.6) of the trials where these represented the lower points option, 68.9% (24.8) of the trials where they were the higher points option, and 45.2% (19.9) of trials where CS+ and CS- stimuli were equal in points value. There was a significant effect of trial type on proportionate choice of CS+ stimuli (F(2,88)=157.29, P<.001). Posthoc t tests revealed that overall participants chose the CS+ option significantly less frequently on lower point trials than equal point trials, and significantly more often on higher point trials than equal point trials (t(44)=-11.997, P<.001; t(44)=-8.102, P<.001, respectively).
Importantly, there was substantial variation in preference for the MES-associated option on trials where CS+ and CS- options were equal in points value. Mean proportionate choice of CS+ stimuli ranged from 7.5% to 92.5% (Figure 2A; relative CS+ value of 0). An estimate of significantly biased choice on these trials can be made by sampling the binomial distribution; for 40 trials and an alpha of 0.05, this threshold is approximately 26/40 (0.65) for significantly high choice and 13/40 (0.35) for significantly low choice. Based on these thresholds, 8/45 (or 18%) of participants chose a significantly high proportion of CS+ stimuli, in other words, significantly sought the MES, and 13/45 (29%) of participants significantly avoided the CS+ options.
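One way such thresholds could be derived is from the exact binomial tails under chance responding (p = 0.5). The sketch below assumes a one-sided criterion of alpha = 0.05 in each tail, which the text does not state; a two-sided criterion would move each cut-off by about one count, so the output should be read as close to, not necessarily identical with, the reported values.

```python
from math import comb


def upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def lower_tail(n, k, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def one_sided_thresholds(n, alpha=0.05, p=0.5):
    """Return (low, high): largest count that is significantly low, smallest that is significantly high."""
    high = next(k for k in range(n + 1) if upper_tail(n, k, p) <= alpha)
    low = max(k for k in range(n + 1) if lower_tail(n, k, p) <= alpha)
    return low, high


if __name__ == "__main__":
    low, high = one_sided_thresholds(40)
    print(low, low / 40, high, high / 40)
```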
Figure 2. Interindividual variation in task performance. (A) Individual psychometric functions for probability of choosing the mild electric stimulation (MES; CS+ or MES-associated) option as a function of its relative points (monetary) value, generated for each participant from choice data across all trial types (black circles indicate actual proportionate choice for each trial type). The left/right translation of each function represents the influence of MES value (or θ) on choice, with the gradient of the function determined by the softmax temperature parameter β (a measure of the stochasticity of participants’ choice). A leftward shift in the function reflects a positive effect of opportunity for intense tactile stimulation on choice, that is, greater choice of the MES-associated options than would be expected from points-based choice alone. (B) The value an individual assigned to the opportunity to receive the MES (θ) strongly predicted their difference in choice reaction times (RTs) to CS+ vs CS- stimuli (median RTCS+ – median RTCS-; r=-0.690, P<.001). The opportunity for extra sensory stimulation slowed choice of these options in participants for whom it was aversive (low proportionate choice of the CS+; bottom right quadrant), but sped the choice in participants for whom it was appetitive (high choice of the CS+; top left quadrant, orange shading). Black dashed lines indicate 95% confidence intervals. n=45.
Consistently high choice of MES-associated stimuli was observed in a subset of participants even on trial types where the CS+ was the lower points value option, that is, where choosing it involved sacrifice of economic value (Figure 2A, relative CS+ value of -25).
To test whether participants’ choice of the MES-associated stimuli varied significantly during the course of the task (ie, whether preference changed with decreasing stimulus novelty), test phase trials were binned into 4 sections. A repeated-measures ANOVA with the within-subjects factor of time (4 levels) found no evidence for a main effect of time-on-task on proportionate choice of CS+ stimuli across all subjects (P>.1). Overall choice of CS+ stimuli was also unrelated to number of trials taken to reach criterion performance or proportion of correct responses (higher point value choice on trials where this was possible) during the acquisition phase (P>.1), suggesting that preference for MES-associated stimuli was not associated with the learning of the points values during the first part of the task. MES preference was also not related to current amplitude (P>.1).
The computational modelling analysis describing the value (in points) that participants assigned to the opportunity to receive the MES (θ) provided a good account of task performance (for details, see Supplementary Information). Figure 2A shows individual psychometric curves for probability of choosing the MES-associated option (CS+) as a function of its relative points (monetary) value, generated by fitting the model to choice data across all trial types for each participant.
Individual θ values were strongly negatively correlated with difference in choice reaction time (RT) for CS+ vs CS- stimuli (r=-0.690, P<.001) (Figure 2B). Specifically, participants who chose a greater proportion of MES-associated stimuli were faster to choose these stimuli (suggestive of conditioned approach). In contrast, participants who tended to avoid CS+ stimuli were slower to choose them (suggestive of conditioned suppression) (Pearce, 1997). This was not a time-on-task effect (eg, due to a tendency to decrease both mean RT and choice of the CS+ over the course of the task), as this relationship remained strongly significant when considering trials from only the latter half of the test phase (first half of trials r=-0.692, second half of trials r=-0.625, both P<.001).
Individual θ values were significantly positively related to self-reported sensation-seeking score, such that participants who reported higher trait sensation-seeking assigned a greater value to opportunity to receive the MES (r=0.325, P=.043) (Figure 3A).
Figure 3. Relationship between task performance and self-report measures. (A) Total self-reported sensation-seeking score was significantly positively related to the value participants assigned to opportunity to receive mild electric stimulation (MES) (r=0.325, P<.05). (B) There was a positive relationship between value assigned to receipt of the intense sensory stimulation (θ) and mean change in visual analogue scale (VAS) “liking” rating of MES-associated (CS+) stimuli following the introduction of the additional electrical stimulation (r=0.368, P<.05). Dashed lines indicate 95% confidence intervals. n=45.
Theta value was unrelated to trait anxiety, self-reported hedonic tone, current amplitude, or years of education (all P>.1). Nonparametric tests were used to relate task performance to self-reported alcohol and tobacco use, as these data were substantially positively skewed. Independent-samples median tests revealed that individuals who assigned a positive value to the opportunity to receive the MES (ie, θ>0, n =17) smoked significantly more cigarettes per week (Fisher’s P=.006) and showed a nonsignificant trend towards consuming more alcoholic drinks per week (P=.098) than individuals who tended to avoid the MES (ie, θ<0, n=28) (mean cigarettes per week 6.7±10.4 vs 2.5±9.9; mean drinks per week 4.2±3.9 vs 3.4±4.9). There was no significant difference in mean θ value between individuals who did vs did not (n=15 vs n=30) report any recreational substance use other than alcohol or tobacco during the past 12 months (independent samples t test, P>.1) (Table 1). There was no difference in mean θ value between male and female participants (independent samples t test, P>.1).
MES value (θ) was also significantly positively related to mean change in VAS “liking” rating for CS+ stimuli following introduction of the MES (ie, between rating sessions 2 and 3; r=0.368, P=.013) (Figure 3B). Participants who assigned positive MES values tended to increase their liking rating of MES-associated stimuli, while participants with negative values tended to decrease their ratings.
Values of the model parameter indexing choice stochasticity (β; a measure of the extent to which participants’ choice was influenced by the difference in value between the 2 options) were unrelated to both self-reported sensation-seeking trait and θ values (P>.1), suggesting that higher sensation-seeking or MES-seeking individuals were not any less value-driven in their choice behavior than their lower sensation-seeking counterparts.
Participants were 30 healthy males, mean age 22.3 (SD 2.74) (Table 1). Potential effects of haloperidol in female volunteers who might be pregnant precluded use of the drug in women in this study. Sample size (n=30) was based on the strength of the MES value/RT effect relationship we observed in Study 1. It was calculated that a sample of 29 participants should allow us to replicate (and detect any effects of haloperidol on) a true effect size of r=0.50 at a power of 80% and an alpha of 0.05. Exclusion criteria consisted of any current major illness, current or historic incident of psychiatric illness, and/or history of head injury. All subjects gave informed written consent and the study was approved by the University College London Ethics Committee.
The study was carried out according to a within-subjects, double-blind, placebo-controlled design. On the first session, participants gave informed consent and completed the sensation-seeking task in order to reduce the impact of any practice effects on performance across the subsequent 2 sessions (under placebo or drug). They then completed the UPPS impulsivity questionnaire (Whiteside and Lynam, 2001), which has a sensation-seeking subscale and 3 other factor analysis-derived impulsivity subscales. This measure was chosen to evaluate the selectivity of the relationship between task performance and sensation-seeking compared with other kinds of impulsivity. The sensation-seeking subscale of the UPPS is predominantly derived from items of the SSS-V, and therefore scores on the 2 measures intercorrelate highly (Whiteside and Lynam, 2001). A standardized, nonverbal measure of mental ability was also administered (Raven’s 12-item Advanced Progressive Matrices; Pearson Education, 2010).
On the second and third sessions, participants arrived in the morning and were administered either 2.5mg haloperidol or a placebo (drug and placebo were indistinguishable). A dose of 2.5mg haloperidol was chosen in order to be greater than that given in a previous study where inconsistent drug effects were observed (2mg; Frank and O’Reilly, 2006), but less than that used in other behavioral studies where significant negative effects of haloperidol on mood or affect were detected (3mg; Zack and Poulos, 2007; Liem-Moolenaar et al., 2010). Testing commenced 2.5 hours after ingestion of the tablet in order to allow drug plasma levels to reach maximum concentration (Midha et al., 1989; Nordström et al., 1992).
Following this uptake period, participants completed VAS measures of mood, affect, potential physical side effects, and knowledge of the drug/placebo manipulation. The Addiction Research Centre Inventory of psychoactive drug effects (ARCI; Martin et al., 1971) was also administered, as this previously has been shown to be sensitive to haloperidol (Ramaekers et al., 1999). Participants further completed 1 of 2 equivalent forms of the letter-digit substitution test (LDST; van der Elst et al., 2006), a simple pencil-and-paper test of general psychomotor and cognitive performance. Arterial heart rate and blood pressure were monitored pre- and post-drug administration.
The sensation-seeking task was as described for Study 1. For this study, participants completed an additional set of VAS ratings at the end of the task to test learning of CS+/CS- (MES-associated vs non-MES–associated) contingencies. For each CS, participants rated how strongly they believed choosing that stimulus had been associated with the chance of receiving electrical stimulation (“no chance of shock” to “chance of shock”). The individualized work-up procedure was repeated on every session to ensure that subjective intensity (as opposed to actual current amplitude) was matched across sessions. Drug/placebo order was counterbalanced across subjects, with a minimum of a 1-week washout period between the 2 test sessions (the mean time between visits was 18 days).
Computational modelling analysis of the sensation-seeking task was as described for Study 1. A repeated-measures ANOVA with the within-subjects factor of drug (haloperidol vs placebo), and the between-subjects factor of drug order (first vs second test session) was used to analyze key dependent variables from test session data. Specifically, these were participant-determined current amplitude, modelling parameters describing MES value (θ) and choice stochasticity (β), mean choice RT, and individual RT effect (median RTCS+ – median RTCS-). All reported simple effects analyses are via pairwise comparison with the Bonferroni adjustment for multiple comparisons.
Measures of general and subjective drug effects (VAS, ARCI, LDST scores, and cardiovascular measures) were compared between test sessions via paired-sample t tests. One participant was unable to attend for a final test session and so his data were excluded from the analysis. Another participant failed to reach criterion level performance in the acquisition stage of the task on both test sessions, and so his data were also excluded, yielding a final n of 28.
All statistical analyses were carried out in SPSS 19.0 (IBM Corp., Armonk, NY), except the computational modelling analysis, which was implemented in Matlab R2011b (Mathworks, Inc., Sherborn, MA).
The main findings of Study 1 were replicated in the baseline session data from our second sample of participants (significant relationships in the expected directions between θ values and both individual RT effect and self-reported sensation-seeking) (Supplementary Figure 1). A concordance analysis between data from baseline and placebo sessions also indicated fair-to-good reliability of estimates of θ value across sessions (see Supplementary Information), supporting the validity of our use of a repeated-measures design.
When considering data from the 2 test (drug/placebo) sessions, overall, participants again chose the shock-associated stimulus (CS+) significantly more often on higher points than equal points trials, and on equal compared with lower points trials, on both placebo and drug sessions (main effect of trial type: F(2,54)=138.54, ηp²=0.837, P<.001; difference between types all P<.001). Mean (±SD) choice on placebo was 0.806±0.19, 0.398±0.17, and 0.126±0.13, respectively, for these trial types, while on haloperidol it was 0.744±0.19, 0.399±0.15, and 0.158±0.15.
There were no significant overall effects of haloperidol treatment on current amplitude, points value assigned to the MES (θ), choice stochasticity (β), mean RT, or relative RT for MES vs non-MES–associated stimuli (all P>.1). Drug order (active preparation on first vs second test session) was not a significant between-subjects factor for any of the dependent variables (P>.1), and there was no overall drug*drug order interaction (P>.1). Therefore, drug order was discarded from the model for subsequent analyses to maximize sensitivity.
The strong relationship between the points value participants assigned to receipt of the MES and relative choice RT for MES-associated vs non-MES–associated stimuli observed in Study 1 was replicated in the second sample under placebo conditions (r=-0.602, P=.001), but, intriguingly, not under haloperidol (r=-0.199, P>.1) (Figure 4A).
Figure 4. Effects of haloperidol on the value assigned to intense sensory stimulation. (A) In a second sample of healthy volunteers, the value assigned to intense sensory stimulation (mild electric stimulation [MES]) was significantly related to relative choice reaction time (RT) for MES vs non-MES–associated stimuli on placebo (r=-0.602, P=.001), but not under haloperidol (P>.1; significant decrease in regression coefficient, P<.05). Dashed lines indicate 95% confidence intervals. (B) If subjects were divided into those who approached (showed speeded relative RTs towards, n=8) and those who avoided (showed slowed relative RTs towards, n=20) the opportunity for the intense sensory stimulation under placebo, there was a significant interaction between sensation-seeking group and effect of drug (P<.01). Haloperidol decreased the economic value assigned to the MES only in those participants who exhibited approach reactions towards MES-associated stimuli under normal conditions (high-sensation seekers [HSS]; cf low-sensation seekers [LSS]). Error bars represent SEM. **P=.01, ns P>.10, drug vs placebo. n=28.
A posthoc analysis revealed that there was indeed a significant attenuation of this relationship under haloperidol (Fisher r-to-Z transformed Pearson-Filon test for decrease in correlation coefficient; Z=-1.735, P=.041, 1-tailed; Raghunathan et al., 1996). Thus, haloperidol treatment appeared to abolish the approach-avoidance effect with respect to relative preference for the intense sensory stimulus. Similarly, although self-reported sensation-seeking score was significantly and selectively positively correlated with MES value (θ) on placebo (r=0.391, P=.040; all other UPPS impulsivity subscale scores unrelated to MES preference, P>.1), this was not the case under haloperidol (r=-0.127, P>.1; Steiger’s Z for significant difference in correlation coefficient between drug conditions=2.25, P=.024; Steiger, 1980).
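The correlation comparisons above rest on the Fisher r-to-Z transformation. The sketch below shows only that transformation and the approximate confidence interval it gives for a single correlation, using the reported r values and the sample size of 28 from the figure caption. It does not implement the Pearson-Filon or Steiger tests themselves, which also need cross-correlations between the measures that are not reported here.

```python
import math

def fisher_z(r):
    """Fisher r-to-z transform (inverse hyperbolic tangent of r)."""
    return math.atanh(r)

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson r from n pairs."""
    z = fisher_z(r)
    se = 1.0 / math.sqrt(n - 3)  # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

n = 28  # sample size given in the figure caption
for label, r in [("placebo", -0.602), ("haloperidol", -0.199)]:
    lower, upper = fisher_ci(r, n)
    print(f"{label}: r={r:.3f}, z={fisher_z(r):.3f}, "
          f"approx. 95% CI [{lower:.3f}, {upper:.3f}]")
```

On this approximation the placebo interval excludes zero while the haloperidol interval includes it, consistent with the reported P values, although overlapping single-sample intervals are not a substitute for the dependent-correlation tests the authors used.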
Based on the above finding, in conjunction with our previous observation that the effects of a D2-ergic drug may depend on baseline sensation-seeking (Norbury et al., 2013), a further analysis was conducted to check for baseline-dependent drug effects that may have been masked in the group-level analysis. To discover what was driving the attenuation of the RT effect under drug, participants were grouped according to whether they showed conditioned approach (speeded RT to CS+ vs CS- stimuli, ie, individual RT effect <0, n=8) or conditioned suppression (slowed RT to CS+ vs CS- stimuli, ie, individual RT effect >0, n=20) of their responses towards the intense sensory stimulation under placebo conditions.
When this approach or avoid grouping was added to the model as a between-subjects factor, there was a significant interaction between drug treatment and group on value assigned to the MES (significant drug*group interaction on θ value; F 1,26=10.64, ƞ p2=0.290, P=.003; interaction on β value P>.1). Simple effects analysis revealed a significant decrease in MES value in the approach group on haloperidol vs placebo (F 1,26=7.97, ƞ p2 =0.235, P=.009). By contrast, there was no effect of drug on MES value in the avoidance group (P>.1) (Figure 4B). Thus, haloperidol appeared to selectively attenuate MES value in individuals who exhibited approach behavior towards the intense sensory stimulus under baseline conditions.
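The reported effect sizes can be cross-checked against the F statistics, because partial eta squared is recoverable from F and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error). A short sketch of that check, using only values reported in this section (the function name is illustrative):

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (label, F, df_effect, df_error, reported partial eta squared)
reported = [
    ("trial type main effect", 138.54, 2, 54, 0.837),
    ("drug*group interaction on theta", 10.64, 1, 26, 0.290),
    ("simple effect of drug, approach group", 7.97, 1, 26, 0.235),
]

for label, f_value, df_effect, df_error, eta_reported in reported:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: computed {eta:.3f}, reported {eta_reported:.3f}")
```

The computed values agree with the reported ones to three decimal places.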
Approach and avoid groups did not differ in age, weight, estimated IQ, or self-determined current intensity (independent samples t tests, all P>.1), but did differ in UPPS sensation-seeking score (t 26=2.261, P=.032, significantly higher mean score in the approach group; 40.9±8.1 vs 32.9±8.5). Similarly to Study 1, independent-samples median tests revealed that individuals in the approach group smoked significantly more cigarettes per week than the avoid group (Fisher’s P=.022) and showed a nonsignificant trend towards greater weekly alcohol consumption (P=.096; mean cigarettes per week 20±25 vs 3.9±13; mean drinks per week 12±13 vs 3.5±3.9).
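As a rough check on the group comparison of sensation-seeking scores, the t statistic can be approximated from the summary statistics alone. This sketch assumes that the ± values are standard deviations and that an equal-variance (pooled) test was used, which is consistent with the reported 26 degrees of freedom but is not stated explicitly.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance independent-samples t statistic from summary data."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df

# Approach group (n=8) vs avoid group (n=20), UPPS sensation-seeking score.
# Assumption: 8.1 and 8.5 are standard deviations.
t_value, df = pooled_t(40.9, 8.1, 8, 32.9, 8.5, 20)
print(f"t({df}) = {t_value:.2f}")
```

The result comes out close to the reported t 26=2.261; small differences are expected because the means and spreads are rounded in the text.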
The effect of haloperidol on θ value (difference in value between drug and placebo sessions) was unrelated to age, weight, estimated IQ, drug effect on overall mood or alertness VAS ratings, drug effect on the sedation or dysphoria scales of the ARCI, or drug effect on general psychomotor function (LDST score; all P>.1). There was also no significant relationship between effect of drug on θ value and number of alcoholic drinks consumed or cigarettes smoked in an average week (Spearman’s ρ<0.25, P>.1). Subjects who had/had not (n=10 vs n=18) (Table 1) engaged in any recreational drug use other than alcohol or tobacco during the last 12 months did not differ in the effect of haloperidol on θ value (independent samples t test, P>.1).
The above findings could not be explained by generic effects of drug treatment. Overall, there were no significant effects of haloperidol on VAS ratings of mood, affect, or potential physical side effects (16 scales, all P>.1) (for details, see Supplementary Table 1). There was also no effect of haloperidol on any ARCI subscale score (MBG euphoria, PCAG sedation, LSD dysphoric and psychotomimetic effects, BG and A stimulant-like effects scales, all P>.1) or cardiovascular measures (blood pressure and heart rate, P>.1). There was no effect of drug treatment upon participant ratings of whether they believed they were on the drug or placebo session (P>.1). Finally, there was no effect of haloperidol on general psychomotor function as indexed by LDST performance (P>.1).
Finally, we examined the hypothesis that the observed effects of haloperidol could be due to differences in learning between drug and placebo sessions. We found no effect of haloperidol on number of trials required to reach criterion performance in the first phase of the task (P>.1). Participants’ mean “shock knowledge” ratings for CS+ and CS- stimuli (ratings on a VAS ranging from chance of shock [+300] to no chance of shock [-300]) were entered into a repeated-measures model with the within-subjects factors of drug (haloperidol vs placebo) and CS type (CS+ vs CS-), revealing a significant main effect of CS type (F1,27=74.56, ƞ p2=0.734, P<.001; mean [± SEM] rating of CS+ stimuli 146±18.2, mean rating of CS- stimuli -150±19.1), but no effect of drug treatment (P>.1) or drug*CS type interaction (P>.1) on explicit knowledge of MES associations. When approach vs avoid group was added to the model as a between-subjects factor, there was no difference between groups in the effect of drug on shock knowledge ratings (drug*group, P>.1), or the effect of drug depending on CS type (drug*CS type*group, P=.09).
We examined how the opportunity to experience an intense sensory stimulus (MES) influenced behavior during a simple economic decision-making task, and, subsequently, how this behavioral index of sensation-seeking was affected by the D2 dopamine receptor antagonist haloperidol. Above chance choice of stimuli associated with intense tactile stimulation occurred reliably in some participants, even when this choice involved the sacrifice of monetary gain. This finding is consistent with the intense sensory stimulation being considered to be appetitive in these individuals. In support of this interpretation, participants who chose a greater proportion of MES-associated stimuli had higher self-reported sensation-seeking scores, increased their “liking” ratings of these stimuli following the introduction of the MESs, and assigned a positive economic value to the opportunity to receive the additional sensory stimulation in a well-fitting computational model of task performance.
Importantly, there was a highly significant relationship between preference for the intense sensory stimulus and choice RTs, consistent with the notion that the MES had motivational significance to participants. In both samples, participants who chose a greater proportion of MES-associated stimuli showed a relative speeding of their responses when choosing these stimuli, with the opposite effect observed in people who tended to avoid them. In conjunction with previous observations that individuals generally show speeded response times for appetitive stimuli but are slower to approach aversive stimuli (Crockett et al., 2009; Wright et al., 2012), this suggests that the opportunity for intense sensory stimulation influenced participants’ choice via an approach-avoidance–like mechanism.
Critically, this effect was not evident under the influence of a D2 receptor antagonist. This was due to a selective decrease in the economic value assigned to receipt of the intense sensory stimulus in participants who exhibited decreased relative RTs towards (or displayed approach reactions to) the MES under placebo conditions (behavioral high-sensation seekers).
The results presented here are in line with a broader background of evidence from both humans and animals that relates trait sensation-seeking to variation in dopaminergic neurotransmission, particularly in striatal regions (Hamidovic et al., 2009; Olsen and Winder, 2009; Shin et al., 2010; Gjedde et al., 2010; Norbury et al., 2013). A combination of evidence from genetic and PET radioligand displacement studies suggests that individuals higher in sensation-seeking personality may have both higher endogenous dopamine levels and greater dopaminergic responses to cues of upcoming reward in the striatum (Riccardi et al., 2006; Gjedde et al., 2010; O’Sullivan et al., 2011). According to one influential model of the role of dopamine in striatal function (Frank, 2005), in the normal state this may contribute to increased inhibition of “NoGo” (action inhibition) pathway neurons via increased stimulation of inhibitory postsynaptic D2 receptors. This in turn would result in greater overall thalamic disinhibition or “Go” bias (favoring action expression) in high-sensation seekers, particularly in the presence of reward cues.
Haloperidol is a silent D2 receptor antagonist (blocks endogenous dopamine signalling via D2 receptors; Cosi et al., 2006), and D2 antagonists have previously been shown to preferentially affect striatal function (Kuroki et al., 1999; Honey et al., 2003). Therefore, it is possible that under haloperidol, the responses of higher sensation seekers may be normalized (increase in resemblance to lower sensation seekers) by allowing increased NoGo pathway output. This would explain our finding of a selective decrease in appetitive reactions to the intense sensory stimulation in the higher sensation-seeking (approach group) individuals.
Our finding of a significant effect of haloperidol on choice, in the absence of any influence on learning, is consistent with recent work suggesting that D2 antagonists may have strong effects on choice of reward-predicting stimuli while leaving learning intact (Eisenegger et al., 2014). However, it is important to note that the putative mechanism suggested above assumes a predominantly postsynaptic effect of haloperidol (Frank and O’Reilly, 2006). Despite our attempt to ensure significant postsynaptic receptor binding by use of a greater dose than the previously cited study (where mixed pre- and postsynaptic D2-ergic effects were thought to be observed), we can provide no direct evidence of this. Further, inferences regarding the brain regions involved in our findings are speculative and would need to be tested in further work, for example involving functional imaging.
The studies presented here have some limitations. First, as sensation-seeking behaviors in the real world can take many different forms, it might appear surprising that use of a single, tactile sensory stimulus (MES) is able to sufficiently capture sensation-seeking behavior in all individuals. However, our findings are consistent with a previous study reporting distinct physiological response profiles to electric shock in low and high self-reported sensation-seekers (De Pascalis et al., 2007). We would not seek to claim that performance on our task captures all of sensation-seeking personality, as this is a complex multidimensional trait, but it may tap operational sensation-seeking–like behavior in at least a subset of high-sensation–seeking individuals, thereby allowing us to probe underlying neural mechanisms in the laboratory (eg, with pharmacological manipulations). In analogous fashion, there is some evidence that apparently dissimilar animal operationalizations of sensation-seeking behavior may tap at least partially overlapping neural circuitry (eg, Parkitna et al., 2013).
Crucially, in both our studies, choice of MES-associated stimuli was found to correlate selectively with total self-reported sensation-seeking scores, which probe multiple classes of sensation-seeking–type behaviors. Although these relationships were of only moderate strength, it should be noted that these findings are at the higher end of the range of those generally found between behavioral and questionnaire measures of impulsive behavior (Helmers et al., 1995; Mitchell, 1999). We also found some evidence of greater recreational substance consumption amongst individuals who assigned a positive value towards opportunity to experience the MES, indicating that task performance may relate to real-life engagement in sensation-seeking behaviors.
Second, as our drug finding is based on a significant decrease in value in one (previously higher mean value) subgroup, an alternative explanation of our findings from Study 2 is that this simply represents a regression to the mean effect. However, against this interpretation, we found evidence of fair-to-good reliability of θ values generated from the same participants across multiple sessions of our novel paradigm (Supplementary Information).
Furthermore, the subgrouping for Study 2 is based on individual difference in relative choice RTs rather than θ values per se (although the 2 are significantly correlated). We also used our estimate of RT effect from the second or third testing session (placebo session) to group participants, a strategy that has previously been argued to help guard against regression to the mean effects (Barnett et al., 2005). Taken together, we would contend that these factors argue against a purely trivial effect of haloperidol on MES value in the approach or high-sensation–seeking individuals.
Third, although haloperidol is considered to be a selective D2 receptor antagonist (it binds >15 times more strongly to D2 than D1 receptors in rat and human cloned cells; Arnt and Skarsfeldt, 1998), it has also been shown to have modest affinity for the α-1 adrenoreceptor and the serotonin 2A receptor in postmortem human brains (Richelson and Souder, 2000). Therefore, we cannot be completely certain about the mechanism underlying our drug effects. As haloperidol has previously been reported to induce high levels of brain D2 receptor occupancy at relatively low oral doses (60–70% at 3mg and 53–74% at 2mg; Nordström et al., 1992; Kapur et al., 1997), we are confident that the dose used in our study (2.5mg) was sufficient to antagonize central D2 receptors in our participants. Another potential limitation is the possibility that the behavioral effects we observed are due to some general effect of haloperidol treatment, for example, increased negative affect in some participants. However, the effect of drug on MES value was unrelated to differences in mood, affect, sedation or dysphoria ratings, or our measure of general psychomotor function between drug and placebo sessions.
In summary, the novel paradigm introduced here appears to tap a dimension of willingness to self-administer intense and unusual sensory stimulation, together with associated behavioral invigoration. For participants who choose to approach rather than avoid this kind of stimulation, we propose that it is intrinsically rewarding and that, similar to analogous findings from the animal literature, this appetitive response involves the D2 receptor dopamine system. These findings may aid investigation of various psychopathologies for which more extreme sensation-seeking scores constitute a vulnerability factor.
J.P.R. is a consultant for Cambridge Cognition and has participated as a paid speaker in a media advisory board for Lundbeck. All other authors have no financial interests to disclose.
This work was supported by the Wellcome Trust (award 098282 to M.H.) and the UK Medical Research Council.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
(1994) Sensation seeking, substance abuse, and psychopathology in treatment-seeking and community cocaine abusers. J Consult 62:1053–1057.
(2005) Regression to the mean: what it is and how to deal with it. Int J Epidemiol 34:215–220.
(2009) Reconciling the role of serotonin in behavioral inhibition and aversion: acute tryptophan depletion abolishes punishment-induced inhibition in humans. J Neurosci 29:11993–11999.
(2010) Inverted-U-shaped correlation between dopamine receptor availability in striatum and sensation seeking. Proc Natl Acad Sci 107:3870–3875.
(2007) A detailed analysis of the reliability and validity of the sensation seeking scale in a UK sample. Personal Individ Differ 42:641–651.
(2009) Evaluation of genetic variability in the dopamine receptor D2 in relation to behavioral inhibition and impulsivity/sensation seeking: an exploratory study with d-amphetamine in healthy participants. Exp Clin Psychopharmacol 17:374–383.
(1995) Assessment of measures of impulsivity in healthy male volunteers. Personal Individ Differ 19:927–935.
(2003) Dopaminergic drug effects on physiological connectivity in a human cortico‐striato‐thalamic system. Brain 126:1767–1781.
(1999) Effects of antipsychotic drugs on extracellular dopamine levels in rat medial prefrontal cortex and nucleus accumbens. J Pharmacol Exp Ther 288:774–781.
(2013) Differences in big five personality traits between alcohol and polydrug abusers: implications for treatment in the therapeutic community. Int J Ment Health Addict 11:682–692.
(2010) Psychomotor and cognitive effects of a single oral dose of talnetant (SB223412) in healthy volunteers compared with placebo or haloperidol. J Psychopharmacol (Oxf) 24:73–82.
(2013) Dopamine modulates risk-taking as a function of baseline sensation-seeking trait. J Neurosci 33:12982–12986.
(2011) Cue-induced striatal dopamine release in Parkinson’s disease-associated impulsive-compulsive behaviors. Brain 134:969–978.
(2013) Novelty-seeking behaviors and the escalation of alcohol drinking after abstinence in mice are controlled by metabotropic glutamate receptor 5 on neurons expressing dopamine D1 receptors. Biol Psychiatry 73:263–270.
(1997) Instrumental conditioning. In: Animal learning and cognition: an introduction. 2nd edition. Hove, East Sussex: Psychology Press.
(1999) Psychomotor, cognitive, extrapyramidal, and affective functions of healthy volunteers during treatment with an atypical (amisulpride) and a classic (haloperidol) antipsychotic. J Clin Psychopharmacol 19:209–221.
(1995) A scale for the assessment of hedonic tone: the Snaith-Hamilton Pleasure Scale. Br J Psychiatry 167:99–103.
(1970) The state-trait anxiety inventory: test manual for form X. Palo Alto, CA: Consulting Psychologists Press.
(2011) Meta-analysis of genome-wide association studies identifies common variants in CTNNA2 associated with excitement-seeking. Transl Psychiatry 1:e49.
(2001) The Five Factor Model and impulsivity: using a structural model of personality to understand impulsivity. Personal Individ Differ 30:669–689.
(2012) Approach–avoidance processes contribute to dissociable impacts of risk and loss on choice. J Neurosci 32:7009–7020.
(1994) Behavioral expressions and biosocial bases of sensation seeking. Cambridge University Press.
Front Neural Circuits. 2013 Oct 11;7:152.
Molecular Neurobiology Laboratory, Department of Life Sciences, Korea University Seoul, South Korea.
Dopamine (DA) regulates emotional and motivational behavior through the mesolimbic dopaminergic pathway. Changes in DA mesolimbic neurotransmission have been found to modify behavioral responses to various environmental stimuli associated with reward behaviors. Psychostimulants, drugs of abuse, and natural reward such as food can cause substantial synaptic modifications to the mesolimbic DA system. Recent studies using optogenetics and DREADDs, together with neuron-specific or circuit-specific genetic manipulations have improved our understanding of DA signaling in the reward circuit, and provided a means to identify the neural substrates of complex behaviors such as drug addiction and eating disorders. This review focuses on the role of the DA system in drug addiction and food motivation, with an overview of the role of D1 and D2 receptors in the control of reward-associated behaviors.
dopamine, dopamine receptor, drug addiction, food reward, reward circuit
Dopamine (DA) is the predominant catecholamine neurotransmitter in the brain, and is synthesized by mesencephalic neurons in the substantia nigra (SN) and ventral tegmental area (VTA). DA neurons originate in these nuclei and project to the striatum, cortex, limbic system and hypothalamus. Through these pathways, DA affects many physiological functions, such as the control of coordinated movements and hormone secretion, as well as motivated and emotional behaviors (Hornykiewicz, 1966; Beaulieu and Gainetdinov, 2011; Tritsch and Sabatini, 2012).
Regulation of the DA system in reward-related behaviors has received a great deal of attention because of the serious consequences of dysfunction in this circuit, such as drug addiction and food reward linked obesity, which are both major public health issues. It is now well accepted that following repeated exposure to addictive substances, adaptive changes occur at the molecular and cellular level in the DA mesolimbic pathway, which is responsible for regulating motivational behavior and for the organization of emotional and contextual behaviors (Nestler and Carlezon, 2006; Steketee and Kalivas, 2011). These modifications to the mesolimbic pathway are thought to lead to drug dependence, which is a chronic, relapsing disorder in which compulsive drug-seeking and drug-taking behaviors persist despite serious negative consequences (Thomas et al., 2008).
Recent findings suggest that glutamatergic and GABAergic synaptic networks in the limbic system are also affected by drugs of abuse, and that this can alter the behavioral effects of addictive drugs (Schmidt and Pierce, 2010; Lüscher and Malenka, 2011). Considerable evidence now suggests that substantial synaptic modifications of the mesolimbic DA system are associated not only with the rewarding effects of psychostimulants and other drugs of abuse, but also with the rewarding effects of natural reward, such as food; however, the mechanism by which drugs of abuse modify synaptic strength in this circuit remains elusive. In fact, DA reward signaling seems extremely complex, and is also implicated in learning and conditioning processes, as evidenced, for example, by studies revealing a DAergic response coding a prediction error in behavioral learning (Wise, 2004; Schultz, 2007, 2012), thus suggesting a need for a fine dissection at a circuit level to properly understand these motivated reward-related behaviors. Recent studies using optogenetics and neuron-specific or circuit-specific genetic manipulations are now allowing a better understanding of DA signaling in the reward circuit.
In this review, I will provide a short summary of DA signaling in reward-related behaviors, with an overview of recent studies on cocaine-addiction behaviors as well as some on food reward in the context of the role of D1 and D2 receptors in regulating these behaviors.
Dopamine interacts with membrane receptors belonging to the family of seven transmembrane domain G-protein coupled receptors, with activation leading to the formation of second messengers, and the activation or repression of specific signaling pathways. To date, five different subtypes of DA receptors have been cloned from different species. Based on their structural and pharmacological properties, a general subdivision into two groups has been made: the D1-like receptors, which stimulate intracellular cAMP levels, comprising D1 (Dearry et al., 1990; Zhou et al., 1990) and D5 (Grandy et al., 1991; Sunahara et al., 1991), and the D2-like receptors, which inhibit intracellular cAMP levels, comprising D2 (Bunzow et al., 1988; Dal Toso et al., 1989), D3 (Sokoloff et al., 1990), and D4 (Van Tol et al., 1991) receptors.
D1 and D2 receptors are the most abundantly expressed DA receptors in the brain. The D2 receptor has two isoforms generated by alternative splicing of the same gene (Dal Toso et al., 1989; Montmayeur et al., 1991). These isoforms, named D2L and D2S, are identical except for an insert of 29 amino acids present in the putative third intracellular loop of D2L, an intracellular domain thought to play a role in coupling this class of receptor to specific second messengers.
D2 receptors are localized presynaptically, as revealed by D2 receptor immunoreactivity, mRNA, and binding sites present in DA neurons throughout the midbrain (Sesack et al., 1994), with a lower level of D2 receptor expression in the VTA than in the SN (Haber et al., 1995). These D2-type autoreceptors represent either somatodendritic autoreceptors, known to dampen neuronal excitability (Lacey et al., 1987, 1988; Chiodo and Kapatos, 1992), or terminal autoreceptors, which mostly decrease DA synthesis and packaging (Onali et al., 1988; Pothos et al., 1998), but also inhibit impulse-dependent DA release (Cass and Zahniser, 1991; Kennedy et al., 1992; Congar et al., 2002). Therefore, the principal role of these autoreceptors is the inhibition and modulation of overall DA neurotransmission; however, it has been suggested that in the embryonic stage, the D2-type autoreceptor could have a different function in DA neuronal development (Kim et al., 2006, 2008; Yoon et al., 2011; Yoon and Baik, 2013). Thus, the cellular and molecular role of these presynaptic D2 receptors needs to be explored further. The expression of D3, D4, and D5 receptors in the brain is considerably more restricted and weaker than that of either D1 or D2 receptors.
There is some difference in the affinity of DA for D1-like receptors and D2-like receptors, mostly reported on the basis of receptor-ligand binding assay studies using heterologously expressed DA receptors in cell lines. For example, D2-like receptors seem to have a 10- to 100-fold greater affinity for DA than the D1-like family, with the D1 receptor reported to have the lowest affinity for DA (Beaulieu and Gainetdinov, 2011; Tritsch and Sabatini, 2012). These differences suggest a differential role for the two receptors given that DA neurons can have two different patterns of DA release, “tonic” or “phasic” based on their firing properties (Grace et al., 2007). It has been suggested that low-frequency, irregular firing of DA neurons tonically generates a low basal level of extracellular DA (Grace et al., 2007), while burst firing, or “phasic” activity is crucially dependent on afferent input, and is believed to be the functionally relevant signal sent to postsynaptic sites to indicate reward and modulate goal-directed behavior (Berridge and Robinson, 1998; Schultz, 2007; Grace et al., 2007). Therefore, bursting activity of DA neurons, leading to a transient increase in the DA level, is thought to be a key component of the reward circuitry (Overton and Clark, 1997; Schultz, 2007). Consequently, the D1 receptor, which is known as the low-affinity DA receptor, is thought to be preferentially activated by the transient, high concentrations of DA mediated by phasic bursts of DA neurons (Goto and Grace, 2005; Grace et al., 2007). In contrast, it is hypothesized that D2-like receptors, which are known to have a high affinity for DA, can detect the lower levels of tonic DA release (Goto et al., 2007). 
However, given that measurements of receptor affinity rely on ligand binding assays from heterologously expressed DA receptors, and do not reflect the receptor’s coupling capacity to downstream signaling cascades, it is difficult to infer whether D2-like receptors are preferentially activated by basal extracellular levels of DA in vivo. Thus, it remains to be elucidated how these two different receptors participate in different patterns of DA neuronal activity in vivo.
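The affinity argument can be illustrated with simple one-site equilibrium binding, in which fractional occupancy is [DA] / ([DA] + Kd). The sketch below is only an illustration: the Kd values and dopamine concentrations are hypothetical round numbers chosen to reflect a 100-fold affinity difference, not measured values, and, as noted above, occupancy in a binding model says nothing about coupling to downstream signaling in vivo.

```python
def fractional_occupancy(dopamine_nm, kd_nm):
    """One-site equilibrium binding: fraction of receptors bound."""
    return dopamine_nm / (dopamine_nm + kd_nm)

# Hypothetical values, for illustration only (not from the source).
KD_D2_NM = 10.0       # high-affinity receptor
KD_D1_NM = 1000.0     # low-affinity receptor, 100-fold higher Kd
TONIC_DA_NM = 10.0    # assumed low basal dopamine level
PHASIC_DA_NM = 1000.0 # assumed transient level during burst firing

for state, dopamine in [("tonic", TONIC_DA_NM), ("phasic", PHASIC_DA_NM)]:
    d1 = fractional_occupancy(dopamine, KD_D1_NM)
    d2 = fractional_occupancy(dopamine, KD_D2_NM)
    print(f"{state}: D1 occupancy {d1:.2f}, D2 occupancy {d2:.2f}")
```

With these assumed numbers, the high-affinity receptor is already substantially occupied at the tonic level, whereas the low-affinity receptor is appreciably occupied only at the phasic level, which is the intuition behind the tonic/phasic hypothesis.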
The D1- and D2-like receptor classes differ functionally in the intracellular signaling pathways they modulate. The D1-like receptors, including D1 and D5, are coupled to heterotrimeric G-proteins that include the G proteins Gαs and Gαolf, with activation leading to increased adenylyl cyclase (AC) activity, and increased cyclic adenosine monophosphate (cAMP) production. This pathway induces the activation of protein kinase A (PKA), resulting in the phosphorylation of variable substrates and the induction of immediate early gene expression, as well as the modulation of numerous ion channels. In contrast, D2-class DA receptors (D2, D3, and D4) are coupled to Gαi and Gαo proteins, and negatively regulate the production of cAMP, resulting in decreased PKA activity, activation of K+ channels, and the modulation of numerous other ion channels (Kebabian and Greengard, 1971; Kebabian and Calne, 1979; Missale et al., 1998; Beaulieu and Gainetdinov, 2011).
One of the best-studied substrates of PKA is the DA- and cAMP-regulated phosphoprotein, Mr ~32,000 (DARPP-32), which is an inhibitor of protein phosphatase, and is predominantly expressed in medium spiny neurons (MSNs) of the striatum (Hemmings et al., 1984a). It appears that DARPP-32 acts as an integrator involved in the modulation of cell signaling in response to DA in striatal neurons. It has been demonstrated that phosphorylation of DARPP-32 at threonine 34 by PKA activates the inhibitory function of DARPP-32 over protein phosphatase 1 (PP1; Hemmings et al., 1984a,b). In D1 receptor-expressing striatal neurons, D1 receptor stimulation results in an increased phosphorylation of DARPP-32 in response to PKA activation, while stimulation of D2 receptors in D2 receptor-expressing neurons reduces the phosphorylation of DARPP-32 at threonine 34, presumably as a consequence of reduced PKA activation (Bateup et al., 2008). However, it appears that a cAMP-independent pathway also participates in the D2-receptor-mediated regulation of DARPP-32, given that threonine 34 is dephosphorylated by the calmodulin-dependent protein phosphatase 2B (PP2B; also known as calcineurin), which is activated by increased intracellular Ca2+ following D2 receptor activation (Nishi et al., 1997). These findings suggest that DA exerts a bidirectional control on the state of phosphorylation of DARPP-32, a DA-centered signaling molecule. Therefore, one can imagine that overall, under DA tone, these signaling pathways mediated by the two classes of receptors can influence neuronal excitability, and consequently synaptic plasticity, in terms of their synaptic networks in the brain, given that their precise signaling varies depending on the cell type and brain region in which they are expressed (Beaulieu and Gainetdinov, 2011; Girault, 2012).
In the case of D2 receptors, the situation is further complicated, as D2 receptors are alternatively spliced, giving rise to isoforms with distinct physiological properties and subcellular localizations. The large isoform appears to be expressed dominantly in all brain regions, although the exact ratio of the two isoforms can vary (Montmayeur et al., 1991). In fact, the phenotype of D2 receptor total knockout (KO) mice was found to be quite different from that of D2L KO mice (Baik et al., 1995; Usiello et al., 2000), indicating that the two isoforms might have different functions in vivo. Recent results from Moyer et al. (2011) support a differential in vivo function of the D2 isoforms in human brain, showing a role of two variants of D2 receptor gene with intronic single-nucleotide polymorphisms (SNPs) in D2 receptor alternative splicing, and a genetic association between these SNPs and cocaine abuse in Caucasians (Moyer et al., 2011; Gorwood et al., 2012).
One signaling pathway of particular interest in neurons is the mitogen-activated protein kinases, extracellular-signal regulated kinases (ERK), which are activated by D1 and D2 receptors. It is now widely accepted that ERK activation contributes to different physiological responses in neurons, such as cell death and development, as well as synaptic plasticity, and that modulating ERK activity in the CNS can result in different neurophysiological responses (Chang and Karin, 2001; Sweatt, 2004; Thomas and Huganir, 2004). Additionally, ERK activation can be regulated by various neurotransmitter systems, a process that can be complex but is finely tuned depending on the differential regulation of the signaling pathways mediated by the various neurotransmitters. Therefore, it is interesting to see what the physiological output of ERK signaling upon DA stimulation through these receptors would be.
Results obtained from heterologous cell culture systems suggest that both D1- and D2-class DA receptors can regulate ERK1 and 2 (Choi et al., 1999; Beom et al., 2004; Chen et al., 2004; Kim et al., 2004; Wang et al., 2005). D1 receptor-mediated ERK signaling involves an interaction with the NMDA glutamate receptor (Valjent et al., 2000, 2005), which has been mostly described in the striatum. D1 receptor stimulation is not able to mediate ERK phosphorylation by itself, but rather requires endogenous glutamate (Pascoli et al., 2011). With D1 receptor activation, activated PKA can mediate the phosphorylation of DARPP-32 at its Thr-34, as mentioned above. Phosphorylated DARPP-32 can act as a potent inhibitor of the protein phosphatase PP-1, which dephosphorylates another phosphatase, the striatal-enriched tyrosine phosphatase (STEP). Dephosphorylation of STEP activates its phosphatase activity, thus allowing STEP to dephosphorylate ERK (Paul et al., 2003). DARPP-32 also acts upstream of ERK, possibly by inhibiting PP-1, preventing PP-1 from dephosphorylating MEK, the upstream kinase of ERK (Valjent et al., 2005). Thus, D1 receptor activation acts to increase ERK phosphorylation by preventing its dephosphorylation by STEP, but also by preventing the dephosphorylation of the upstream kinase of ERK. In addition, the cross talk between D1 and NMDA receptors contributes to ERK activation. For example, a recent study showed that stimulation of D1 receptors increases calcium influx through NMDA receptors, a process that involves phosphorylation of the NMDA receptor NR2B subunit by a Src-family tyrosine kinase (Pascoli et al., 2011). This increased calcium influx activates a number of signaling pathways, including calcium and calmodulin-dependent kinase II, which can activate ERK via the Ras-Raf-MEK cascade (Fasano et al., 2009; Shiflett and Balleine, 2011; Girault, 2012).
Consequently, D1 receptor-mediated ERK activation employs a complex regulation by phosphatases and kinases in addition to the cross talk with glutamate receptor signaling (Figure 1).
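The sign logic of the D1 receptor cascade described above can be checked with a small qualitative sketch. The following Python example is only an illustration, not a kinetic model: it encodes each step from the text as a signed edge (+1 for "activates", -1 for "inhibits") and multiplies the signs along every route from the D1 receptor to phosphorylated ERK. The node names (`D1R`, `DARPP32_pT34`, `ERK_p`, and so on) and the helper functions are hypothetical labels chosen for this sketch, and the NMDA receptor cross talk is left out.

```python
# Qualitative sign-propagation sketch of the D1 receptor -> ERK cascade.
# Edge signs are a simplified reading of the text: +1 = activates, -1 = inhibits.
# Node names are illustrative labels, not identifiers from any database.
EDGES = {
    ("D1R", "PKA"): +1,            # D1 stimulation activates PKA
    ("PKA", "DARPP32_pT34"): +1,   # PKA phosphorylates DARPP-32 at Thr-34
    ("DARPP32_pT34", "PP1"): -1,   # phospho-DARPP-32 inhibits PP-1
    ("PP1", "STEP"): +1,           # PP-1 dephosphorylates (activates) STEP
    ("STEP", "ERK_p"): -1,         # active STEP dephosphorylates ERK
    ("PP1", "MEK_p"): -1,          # PP-1 dephosphorylates MEK
    ("MEK_p", "ERK_p"): +1,        # phosphorylated MEK phosphorylates ERK
}


def all_paths(start, end, seen=()):
    """Return every acyclic path from start to end as a list of node names."""
    if start == end:
        return [[end]]
    paths = []
    for src, dst in EDGES:
        if src == start and dst not in seen:
            for tail in all_paths(dst, end, seen + (start,)):
                paths.append([start] + tail)
    return paths


def path_sign(path):
    """Multiply edge signs along a path: +1 = net activation, -1 = net inhibition."""
    sign = 1
    for src, dst in zip(path, path[1:]):
        sign *= EDGES[(src, dst)]
    return sign


if __name__ == "__main__":
    for path in all_paths("D1R", "ERK_p"):
        print(" -> ".join(path), "| net sign:", path_sign(path))
```

Both routes (via STEP and via MEK) come out with a net positive sign, which matches the statement that D1 receptor activation increases ERK phosphorylation by two complementary mechanisms. A sign graph like this says nothing about magnitude or timing, so it should be read only as a consistency check on the verbal description.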
D2 receptor-mediated ERK activation has been reported in heterologous cell culture systems (Luo et al., 1998; Welsh et al., 1998; Choi et al., 1999). D2 receptor-mediated ERK activation was found to be dependent on Gαi protein coupling, and it appears that it requires the transactivation of receptor tyrosine kinase, which activates downstream signaling to finally activate ERK (Choi et al., 1999; Kim et al., 2004; Wang et al., 2005; Yoon et al., 2011; Yoon and Baik, 2013). Arrestin has also been suggested to contribute to D2 receptor-mediated ERK activation (Beom et al., 2004; Kim et al., 2004), which can activate MAPK signaling by mobilizing clathrin-mediated endocytosis in a β-arrestin/dynamin-dependent manner (Kim et al., 2004). A further possibility of D2 receptors coupling to Gq proteins cannot be ruled out; in this case, Gq protein-mediated PKC activation could also induce ERK activation (Choi et al., 1999; Figure 2).
In view of the physiological role of this DA receptor-mediated ERK signaling, it has been shown that in mesencephalic neurons, DA activates ERK signaling via mesencephalic D2 receptors, which in turn activates transcription factors such as Nurr1, a transcription factor critical for the development of DA neurons (Kim et al., 2006). Furthermore, our recent work demonstrated that STEP or Wnt5a can be involved in this regulation, by interacting with D2 receptors (Kim et al., 2008; Yoon et al., 2011). In light of these findings, it remains an intriguing question whether this signaling plays a role in DA neurotransmission in the adult brain.
However, in the dorsal striatum, administration of the typical anti-psychotic D2-class receptor antagonist haloperidol stimulated the phosphorylation of ERK1/2, while the atypical anti-psychotic clozapine, which is also a D2-class antagonist, reduced ERK1/2 phosphorylation, showing that haloperidol and clozapine induce distinct patterns of phosphorylation in the dorsal striatum (Pozzi et al., 2003). Thus, the physiological relevance of this D2 receptor-mediated ERK signaling remains an open issue.
Taken together, it is evident that D1 and D2 receptors induce ERK activation via distinct mechanisms, and one can imagine that activation of these receptors can have different consequences, depending on the location and physiological status of the neurons expressing them.
The role of D1 and D2 receptors in reward-related behaviors has been investigated pharmacologically using subtype-specific agonists and antagonists, as well as by the analysis of receptor gene KO mice. Recent progress in optogenetics and the use of viral vectors with different genetic manipulations now allow a refined examination of the functional importance of these receptors in vivo (Table 1).
Exposure to a psychostimulant such as cocaine induces a progressive and enduring enhancement in the locomotor stimulant effect of subsequent administration, a phenomenon known as sensitization (Robinson and Berridge, 1993; Vanderschuren and Kalivas, 2000; Kalivas and Volkow, 2005; Steketee and Kalivas, 2011). The process of behavioral sensitization includes two distinct phases; initiation and expression. The initiation phase refers to the period during which the increased behavioral response following daily cocaine administration is associated with an increase in extracellular DA concentration. Behavioral sensitization continues to increase after the cessation of cocaine administration, and this procedure produces long-lasting sensitization, known as the expression of sensitization (Vanderschuren and Kalivas, 2000; Thomas et al., 2001; Steketee and Kalivas, 2011). The expression phase is characterized by a persistent drug hyper-responsiveness after cessation of the drug, which is associated with a cascade of neuroadaptation (Kalivas and Duffy, 1990; Robinson and Berridge, 1993). While this phenomenon has been studied mostly in experimental animals, the neuronal plasticity underlying behavioral sensitization is believed to reflect the neuroadaptations that contribute to compulsive drug cravings in humans (Robinson and Berridge, 1993; Kalivas et al., 1998). It has been suggested that the mesolimbic DA system from the VTA to the nucleus accumbens (NAc) and prefrontal cortex is an important mediator of these plastic changes, in association with the glutamatergic circuitry (Robinson and Berridge, 1993; Kalivas et al., 1998; Vanderschuren and Kalivas, 2000).
Animals behaviorally sensitized to cocaine, amphetamine, nicotine, or morphine (Kalivas and Duffy, 1990; Parsons and Justice, 1993) show enhanced DA release in the NAc in response to drug exposure. In addition to changes in neurotransmitter release, DA binding to its receptors plays a key role in behavioral sensitization (Steketee and Kalivas, 2011). For example, the enhanced excitability of VTA DA neurons that occurs with repeated cocaine exposure is associated with decreased D2 autoreceptor sensitivity (White and Wang, 1984; Henry et al., 1989). In addition, repeated intra-VTA injections of low doses of the D2 antagonist eticlopride, which is presumably autoreceptor-selective, enhanced subsequent responses to amphetamine (Tanabe et al., 2004).
A number of studies have shown that D1 and D2 DA receptors are differentially involved in cocaine-induced changes in locomotor activity. For example, initial studies employing pharmacological approaches have shown that mice or rats pre-treated with the D1 receptor antagonist SCH 23390 showed an attenuated locomotor response to acute cocaine challenge, while the D2 receptor antagonists haloperidol, and raclopride had no such effect (Cabib et al., 1991; Ushijima et al., 1995; Hummel and Unterwald, 2002). These results suggest different roles of DA receptor subtypes in the modulation of the stimulant effects of cocaine on locomotion. However, with regards to the behavioral sensitization induced by repetitive injections of cocaine, it has been reported that systemic administration of the D1 receptor antagonist SCH23390, or of the D2 receptor antagonists sulpiride, YM-09151-2 or eticlopride, does not affect the induction of cocaine sensitization (Kuribara and Uchihashi, 1993; Mattingly et al., 1994; Steketee, 1998; White et al., 1998; Vanderschuren and Kalivas, 2000).
The effects of direct intra-accumbens administration of SCH23390 on cocaine-induced locomotion, sniffing, and conditioned place preference (CPP) were investigated in rats, and these studies showed that the stimulation of D1-like receptors in the NAc is necessary for cocaine-CPP, but not for cocaine-induced locomotion (Baker et al., 1998; Neisewander et al., 1998). The direct intra-accumbens infusion of the D2/D3 receptor antagonist sulpiride in rats demonstrated that blockade of D2 receptors reverses the acute cocaine-induced locomotion (Neisewander et al., 1995; Baker et al., 1996), but these studies did not examine the effect on cocaine-induced behavioral sensitization. Interestingly, it has been reported that injection of the D2 receptor agonist quinpirole into the intra-medial prefrontal cortex blocked the initiation and attenuated the expression of cocaine-induced behavioral sensitization (Beyer and Steketee, 2002).
D1 receptor null mice have been examined in the context of addictive behaviors, and initial studies revealed that D1 receptor mutant mice failed to exhibit the psychomotor stimulant effect of cocaine on motor and stereotyped behaviors compared to their wild-type littermates (Xu et al., 1994; Drago et al., 1996). However, it appears that D1 receptor KO abolishes the acute locomotor response to cocaine, but does not fully prevent locomotor sensitization to cocaine at all doses (Karlsson et al., 2008), demonstrating that genetic KO of D1 receptors is not sufficient to fully block cocaine sensitization under all conditions.
In D2 receptor KO mice, with reduced general locomotor activity, the cocaine-induced motor activity level is low compared to WT mice, but these animals were similar in terms of the ability to induce cocaine-mediated behavioral sensitization, or cocaine-seeking behaviors with a slight decrease in sensitivity (Chausmer et al., 2002; Welter et al., 2007; Sim et al., 2013). Depletion of D2 receptors in the NAc by infusion of a lentiviral vector with a shRNA against the D2 receptor did not affect basal locomotor activity, nor cocaine-induced behavioral sensitization, but conferred stress-induced inhibition of the expression of cocaine-induced behavioral sensitization (Sim et al., 2013). These findings, together with previous reports, strongly suggest that blockade of D2 receptors in the NAc does not prevent cocaine-mediated behavioral sensitization, and that D2 receptors in the NAc play a distinct role in the regulation of synaptic modification triggered by stress and drug addiction.
Recent studies using genetically engineered mice that express Cre recombinase in a cell-type-specific manner revealed some role of D1 or D2 receptor-expressing MSNs in cocaine-addictive behaviors. For example, loss of DARPP-32 in D2 receptor-expressing cells resulted in an enhanced acute locomotor response to cocaine (Bateup, 2010). Hikida and co-workers used AAV vectors to express tetracycline-repressive transcription factor (tTa) using substance P (for D1-expressing MSNs) or enkephalin (for D2-expressing MSNs) promoters (Hikida et al., 2010). These vectors were injected into the NAc of mice, in which tetanus toxin light chain (TN) was controlled by the tetracycline-responsive element, to selectively abolish synaptic transmission in each MSN subtype. Reversible inactivation of D1/D2 receptor-expressing MSNs with the tetanus toxin (Hikida et al., 2010) revealed the predominant roles of the D1 receptor-expressing cells in reward learning and cocaine sensitization, but there was no change in sensitization caused by the inactivation of D2 receptor-expressing cells. Using DREADD (designer receptors exclusively activated by designer drugs) strategies, with viral-mediated expression of an engineered GPCR (Gi/o-coupled human muscarinic M4 DREADD receptor, hM4D) that is activated by an otherwise pharmacologically inert ligand, Ferguson et al. (2011) showed that the activation of striatal D2 receptor-expressing neurons facilitated the development of amphetamine-induced sensitization. However, the optogenetic activation of D2 receptor-expressing cells in the NAc induced no change in cocaine-induced behavioral sensitization (Lobo, 2010).
Optogenetic inactivation of D1 receptor-expressing MSNs using the light-activated chloride pump, halorhodopsin eNpHR3.0 (enhanced Natronomonas pharaonis halorhodopsin 3.0), during cocaine exposure resulted in an attenuation of cocaine-induced locomotor sensitization (Chandra et al., 2013). Furthermore, the conditional reconstruction of functional D1 receptor signaling in subregions of the NAc in D1 receptor KO mice showed that D1 receptor expression in the core region of the NAc, but not the shell, mediated D1 receptor-dependent cocaine sensitization (Gore and Zweifel, 2013). These findings suggest that DA mechanisms critically mediate cocaine-induced behavioral sensitization, with distinct roles for D1 and D2 receptors, although the precise contribution of D1 and D2 receptors and their downstream signaling pathways remains to be determined.
The CPP paradigm is a commonly used preclinical behavioral test with a classical (Pavlovian) conditioning model. During the training phase of CPP, one distinct context is paired with drug injections, while another context is paired with vehicle injections (Thomas et al., 2008). During a subsequent drug-free CPP test, the animal chooses between the drug- and the vehicle-paired contexts. An increased preference for the drug context serves as a measure of the drug’s Pavlovian reinforcing effects (Thomas et al., 2008).
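As a worked illustration of how preference in the drug-free test is typically quantified, the short Python sketch below computes two simple measures from the time an animal spends in each context. The scoring conventions (a difference score in seconds and a preference fraction), the function names, and the example durations are all assumptions made for this illustration; individual studies cited here may score CPP differently, for example relative to a pre-conditioning baseline.

```python
def cpp_difference_score(drug_paired_s: float, vehicle_paired_s: float) -> float:
    """Time in the drug-paired context minus time in the vehicle-paired context (seconds).

    A positive value indicates a preference for the drug-paired context.
    """
    return drug_paired_s - vehicle_paired_s


def cpp_preference_fraction(drug_paired_s: float, vehicle_paired_s: float) -> float:
    """Fraction of the scored test time spent in the drug-paired context (0 to 1)."""
    total = drug_paired_s + vehicle_paired_s
    if total <= 0:
        raise ValueError("total time in the two contexts must be positive")
    return drug_paired_s / total


if __name__ == "__main__":
    # Hypothetical 15-minute test: 540 s in the drug-paired side, 360 s in the vehicle side.
    print(cpp_difference_score(540.0, 360.0))
    print(cpp_preference_fraction(540.0, 360.0))
```

With these made-up durations the difference score is positive and the preference fraction is above 0.5, which is what an increased preference for the drug context would look like under this convention.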
Although it has been previously reported that both systemic and intra-accumbens administration of the D1 receptor antagonist SCH23390 prevented cocaine CPP (Cervo and Samanin, 1995; Baker et al., 1998), D1 receptor mutant mice have been reported to demonstrate normal responses to the rewarding effects of cocaine in the CPP paradigm (Miner et al., 1995; Karasinska et al., 2005). Regarding the role of D2 receptors in CPP, there is considerable consensus in the literature that D2-like antagonists fail to influence place preference induced by cocaine (Spyraki et al., 1982; Shippenberg and Heidbreder, 1995; Cervo and Samanin, 1995; Nazarian et al., 2004). Consistent with these pharmacological studies, D2 receptor KO mice displayed a comparable CPP score to WT mice (Welter et al., 2007; Sim et al., 2013). Furthermore, D2L-/- mice developed a CPP to cocaine as did WT mice (Smith et al., 2002).
Recently, the effect of a conditional presynaptic KO of D2 receptors on addictive behaviors has been reported, and this study demonstrated that mice lacking D2 autoreceptors displayed cocaine supersensitivity, exhibited increased place preference for cocaine, as well as enhanced motivation for food reward, perhaps owing to the absence of presynaptic inhibition by autoreceptors that further elevates extracellular DA and maximizes the stimulation of postsynaptic DA receptors (Bello et al., 2011).
Results obtained from a different line of investigation showed that when D1-expressing MSNs are selectively activated by optogenetics, D1-Cre mice expressing DIO-AAV-ChR2-EYFP in the NAc displayed a significant increase in cocaine/blue-light preference compared to the control group (Lobo, 2010). In contrast, D2-Cre mice expressing DIO-AAV-ChR2-EYFP exhibited a significant attenuation of cocaine/blue-light preference relative to controls (Lobo, 2010), implicating a role for the activation of D1-expressing MSNs in enhancing the rewarding effects of cocaine, with activation of D2-expressing MSNs antagonizing the cocaine reward effect. Inhibition of D1-expressing MSNs with the tetanus toxin (Hikida et al., 2010) resulted in a diminished cocaine CPP, while no alterations to cocaine CPP after abolishing synaptic transmission in D2-expressing MSNs were observed (Hikida et al., 2010). Therefore, these data using optogenetics and cell-type-specific inactivation of neurons implicate opposing roles of D1- and D2-expressing MSNs in CPP, with D1 receptor-expressing MSNs implicated in promoting reward responses to psychostimulants, and D2 receptor-expressing MSNs dampening these behaviors (Lobo and Nestler, 2011).
Cocaine self-administration is an operant model in which laboratory animals lever press (or nose poke) for drug injections. The “self-administration” behavioral paradigm serves as an animal behavioral model of the human pathology of addiction (Thomas et al., 2008). It has been reported that selective lesion of DA terminals with 6-hydroxy DA (6-OHDA), or with the neurotoxin kainic acid in the NAc significantly attenuates cocaine self-administration, supporting the hypothesis that the reinforcing effects of cocaine are dependent upon mesolimbic DA (Pettit et al., 1984; Zito et al., 1985; Caine and Koob, 1994). Consistent with these findings, in vivo microdialysis studies demonstrate that accumbal extrasynaptic DA levels are enhanced during cocaine self-administration in both the rat (Hurd et al., 1989; Pettit and Justice, 1989) and monkey (Czoty et al., 2000). Collectively, these findings suggest that enhanced DA transmission in the NAc plays a crucial role in cocaine self-administration behavior.
DA receptor antagonists and agonists modulate cocaine self-administration, showing a dose-dependent biphasic effect. For example, selective antagonists for both D1 (Woolverton, 1986; Britton et al., 1991; Hubner and Moreton, 1991; Vanover et al., 1991; Caine and Koob, 1994) and D2 (Woolverton, 1986; Britton et al., 1991; Hubner and Moreton, 1991; Caine and Koob, 1994) receptors increase cocaine self-administration in response to lower doses of antagonist, but decrease self-administration in response to higher doses. This modulation appears to be specific when injected into the NAc but not the caudate nucleus, indicating a distinct role of NAc DA receptors in cocaine self-administration behaviors.
Later, using D1 and D2 receptor null mice, the involvement of these receptors in cocaine self-administration was examined. Interestingly, despite the observation of normal cocaine CPP in D1 receptor KO mice, cocaine self-administration was eliminated in these mice (Caine et al., 2007). In D2 receptor KO mice however, self-administration of low to moderate doses of cocaine was unaffected, while self-administration of moderate to high doses of cocaine was actually increased (Caine et al., 2002). Recently, Alvarez and co-workers reported that synaptic strengthening onto D2-expressing MSNs in the NAc occurs in mice with a history of intravenous cocaine self-administration (Bock et al., 2013). Inhibition of D2-MSNs using a chemicogenetic approach enhanced the motivation to obtain cocaine, while optogenetic activation of D2-MSNs suppressed cocaine self-administration, suggesting that recruitment of D2-MSNs in the NAc functions to restrain cocaine self-administration (Bock et al., 2013).
Studies investigating the reinstatement of cocaine-seeking behavior revealed that the administration of D2 receptor agonists reinstates cocaine-seeking behavior (Self et al., 1996; De Vries et al., 1999, 2002; Spealman et al., 1999; Khroyan et al., 2000; Fuchs et al., 2002). Consistent with these findings, D2 receptor antagonists attenuate cocaine priming-induced drug-seeking behavior (Spealman et al., 1999; Khroyan et al., 2000), while pre-treatment with a D2-like agonist prior to a priming injection of cocaine potentiated the behavior (Self et al., 1996; Fuchs et al., 2002). However, it appears that D1-like receptor agonists do not reinstate cocaine-seeking behavior (Self et al., 1996; De Vries et al., 1999; Spealman et al., 1999; Khroyan et al., 2000). In fact, systemically administered D1-like agonists and antagonists both attenuate the drug-seeking behavior induced by a priming cocaine injection (Self et al., 1996; Norman et al., 1999; Spealman et al., 1999; Khroyan et al., 2000, 2003), showing a differential involvement of D1 and D2 receptors in priming-induced reinstatement of cocaine seeking.
Results from our laboratory indicate that in the absence of D2 receptors, cocaine-induced reinstatement was not affected (Sim et al., 2013). It is suggested that the reinstatement of drug-seeking behavior can also be precipitated by re-exposure to cocaine-associated stimuli or stressors (Shaham et al., 2003). When this possibility was tested, results from our laboratory found that while stress potentiates the cocaine-induced reinstatement in WT mice, stress suppressed the cocaine-induced reinstatement in the D2 receptor mutant animals, suggesting an unexplored role of D2 receptors in the regulation of synaptic modification triggered by stress and drug addiction (Sim et al., 2013).
Food and food-related cues can activate different brain circuits involved in reward, including the NAc, hippocampus, amygdala and/or pre-frontal cortex and midbrain (Palmiter, 2007; Kenny, 2011). It is believed that the mesolimbic DA system promotes the learning of associations between natural reward and the environments in which they are found; thus, food and water, or cues that predict them, promote rapid firing of DA neurons, and facilitate behaviors directed toward acquisition of the reward (Palmiter, 2007). Indeed DA-deficient mice show a loss of motivation to feed (Zhou and Palmiter, 1995), while D1 receptor null mice exhibit retarded growth and low survival after weaning; this phenotype can be rescued by providing KO mice with easy access to a palatable food, suggesting that the absence of D1 receptor is more related to a motor deficit (Drago et al., 1994; Xu et al., 1994). In contrast, D2 receptor KO mice show reduced food intake and body weight along with an increased basal energy expenditure level compared to their wild type littermates (Kim et al., 2010). Therefore, it is difficult to delineate the exact role of the DA system and of the receptor subtypes in food reward. Nevertheless, most human studies indicate the importance of the D2 receptor in the regulation of food reward in association with obesity.
Increasing evidence suggests that variations in DA receptors and DA release play a role in overeating and obesity, especially in association with striatal D2 receptor function and expression (Stice et al., 2011; Salamone and Correa, 2013). In animal studies, it has been shown that feeding increases the extracellular DA concentration in the NAc (Bassareo and Di Chiara, 1997), in a similar manner to drugs of abuse. However, in contrast to its effect on behaviors related to drug addiction, NAc DA depletion alone does not alter feeding behavior (Salamone et al., 1993). It appears that the pharmacological blockade of D1 and D2 receptors in the NAc affects motor behavior, amount and duration of feeding, but it does not reduce the amount of food consumed (Baldo et al., 2002). Interestingly, recent data from mice showed that binge eating was ameliorated by the acute administration of unilateral NAc shell deep brain stimulation, and this effect was mediated in part by activation of the D2 receptor, while deep brain stimulation of the dorsal striatum had no influence on this behavior (Halpern et al., 2013). However, it has been reported that when exposed to the same high-fat diet, mice with a lower density of D2 receptors in the putamen exhibit more weight gain than mice with a higher density of D2 receptors in the same region (Huang et al., 2006). This study compared DAT and D2 receptor densities in chronic, high-fat diet-induced obese, obese-resistant and low-fat-fed control mice, and found that D2 receptor density was significantly lower in the rostral part of caudate putamen in chronic high-fat diet-induced obese mice compared to obese-resistant and low-fat-fed control mice (Huang et al., 2006). This low level of D2 receptor may be associated with altered DA release, and it has also been reported that consumption of a high-fat, high-sugar diet leads to the downregulation of D2 receptors (Small et al., 2003) and reduced DA turnover (Davis et al., 2008).
In human studies, obese people and drug addicts both tend to show reduced expression of D2 receptors in striatal areas, and imaging studies have demonstrated that similar brain areas are activated by food- and drug-related cues (Wang et al., 2009). Positron emission tomography (PET) studies suggest that the availability of D2 receptors was decreased in obese individuals in proportion to their body mass index (Wang et al., 2001), thus suggesting that DA deficiency in obese individuals may perpetuate pathological eating as a means of compensating for the decreased activation of DA-mediated reward circuits. Volkow and co-workers also reported that obese versus lean adults show less striatal D2 receptor binding, and that this was positively correlated with metabolism in the dorsolateral prefrontal, medial orbitofrontal, anterior cingulate gyrus and somatosensory cortices (Volkow et al., 2008). This observation led to a discussion over whether decreases in striatal D2 receptors could contribute to overeating via the modulation of striatal prefrontal pathways that participate in inhibitory control and salience attribution, and whether the association between striatal D2 receptors and metabolism in the somatosensory cortices (regions that process palatability) could underlie one of the mechanisms through which DA regulates the reinforcing properties of food (Volkow et al., 2008).
Stice and co-workers used functional magnetic resonance imaging (fMRI) to show that individuals may overeat to compensate for a hypofunctioning dorsal striatum, particularly those with genetic polymorphisms of an A1 allele of the TaqIA in D2 receptor (DRD2/ANKK1) gene, which is associated with lower striatal D2 receptor density and attenuated striatal DA signaling (Stice et al., 2008a,b). These observations indicate that individuals who show blunted striatal activation during food intake are at risk for obesity, particularly those also at genetic risk for compromised DA signaling in brain regions implicated in food reward (Stice et al., 2008a, 2011). However, recent data showed that obese adults with or without binge eating disorder had a distinct genetic polymorphism of the TaqIA D2 receptor (DRD2/ANKK1) gene (Davis et al., 2012); therefore, it is plausible that similar brain DA systems are disrupted in both food motivation and drug addiction, even though it is not yet clear what these DA receptor data represent from the functional perspective of DA neurotransmission in brain.
As in obese people, low D2 receptor availability is associated with chronic cocaine abuse in humans (Volkow et al., 1993; Martinez et al., 2004). In contrast, overexpression of D2 receptors reduces the self-administration of alcohol in rats (Thanos et al., 2001). In humans, a higher-than-normal D2 receptor availability in non-alcoholic members of alcoholic families was reported (Volkow et al., 2006; Gorwood et al., 2012), supporting the hypothesis that low levels of D2 receptors may be associated with an increased risk of addictive disorders. Therefore, it is possible that in the brains of both obese individuals and chronic drug abusers, there are low basal DA concentrations, and periodic exaggerated DA release associated with either food or drug intake, along with low expression, or dysfunctional D2 receptors.
Dopamine receptor expression levels in other areas of the brain may also be important. For example, Fetissov et al. (2002) observed that obese Zucker rats, which display a feeding pattern consisting of large meal size and small meal number, have a comparatively low level of D2 receptor expression in the ventromedial hypothalamus (VMH). Interestingly, in their study, when a selective D2 receptor antagonist, sulpiride was injected into the VMH of obese and lean rats, a hyperphagic response was elicited only in the obese rats, suggesting that by aggravating the already low level of D2 receptors, it was possible to increase food intake. This low D2 receptor expression may cause an exaggerated DA release in obese rats during food ingestion and a reduced satiety feedback effect of DA, which would facilitate DA release into the brain areas “craving” for DA (Fetissov et al., 2002).
Recently, in an elegant study conducted by Johnson and Kenny (2010), it was observed that animals provided with a “cafeteria diet” consisting of a selection of highly palatable energy-dense food gained weight, demonstrating compulsive eating behavior. In addition to their excessive adiposity and compulsive-like eating, cafeteria diet rats also had decreased D2 receptor expression in the striatum. Surprisingly, lentivirus-mediated knockdown of striatal D2 receptors rapidly accelerated the development of addiction-like reward deficits, and the onset of compulsive-like food-seeking behavior in rats with extended access to palatable high-fat food (Johnson and Kenny, 2010), again indicating that common hedonic mechanisms may underlie obesity and drug addiction. However, our own laboratory found somewhat unexpected results showing that D2 KO mice have a lean phenotype with enhanced hypothalamic leptin signaling compared to WT mice (Kim et al., 2010). Therefore, we cannot rule out that the D2 receptor plays a role in the homeostatic regulation of metabolism in association with a regulator of energy homeostasis such as leptin, in addition to its role in food motivation behavior. An animal model with a genetically manipulated conditional restriction of the D2 receptor in leptin receptor-expressing cells for example, or other reward-related neuronal cells, together with neural integrative tools, could potentially elucidate the role of the DA system via D2 receptors in food reward and the homeostatic regulation of food intake.
Increasing evidence indicates that homeostatic regulators of food intake, such as leptin, insulin, and ghrelin, control and interact with the reward circuit of food intake, and thus regulate behavioral aspects of food intake and conditioning to food stimuli behaviors (Abizaid et al., 2006; Fulton et al., 2006; Hommel et al., 2006; Baicy et al., 2007; Farooqi et al., 2007; Palmiter, 2007; Konner et al., 2011; Volkow et al., 2011). Recent findings reveal that hormones implicated in regulating energy homeostasis also impinge directly on DA neurons; for example, leptin and insulin directly inhibit DA neurons, while ghrelin activates them (Palmiter, 2007; Kenny, 2011).
Hommel and co-workers demonstrated that VTA DA neurons express leptin receptor mRNA, and respond to leptin with the activation of an intracellular JAK-STAT (Janus kinase-signal transducer and activator of transcription) pathway, which is the major pathway involved in leptin receptor downstream signaling, as well as a reduction in the firing rate of DA neurons (Hommel et al., 2006). This study showed that direct administration of leptin to the VTA caused decreased food intake, while long-term RNAi-mediated knockdown of leptin receptors in the VTA led to increased food intake, locomotor activity, and sensitivity to highly palatable food. These data support a critical role for VTA leptin receptorsin regulating feeding behavior, and provide functional evidence for the direct action of a peripheral metabolic signal on VTA DA neurons. These results are consistent with the idea that leptin signaling in the VTA normally suppresses DA signaling, and consequently decreases both food intake and locomotor activity. This suggests a physiological role for leptin signaling in the VTA, although the authors did not demonstrate that the effect of the virus injection on feeding was correlated directly with increased DA signaling (Hommel et al., 2006).
Fulton and co-workers also investigated the functional significance of leptin action in VTA DA neurons, to expand understanding of the multiple actions of leptin in the DA reward circuit (Fulton et al., 2006). Using double-label immunohistochemistry, they observed increased STAT3 phosphorylation in the VTA following peripheral leptin administration. These pSTAT3-positive neurons colocalized with DA neurons, and to a lesser extent with markers for GABA neurons. Retrograde neuronal tracing from the NAc revealed colocalization of the tracer with pSTAT3, indicating that a subset of VTA DA neurons expressing leptin receptors project to the NAc. When they assessed leptin function in the VTA, they found that ob/ob mice had a diminished locomotor response to amphetamine, and lacked locomotor sensitization to repeated amphetamine injections, with both defects being reversed by leptin infusion, thus indicating that the mesoaccumbens DA pathway, critical to integrating motivated behavior, also responds to this adipose-derived signal (Fulton et al., 2006). These lines of evidence importantly suggested the action of leptin in the DA reward system. However, given that physiological levels of leptin receptor expression appear to be very low in the midbrain, normal circulating leptin levels seem to have little effect on leptin receptor signaling within the VTA. Thus, whether leptin can exert a significant inhibitory effect on DA neuron activity in vivo through its receptors in the VTA remains questionable (Palmiter, 2007).
There are also human studies showing that leptin can indeed control rewarding responses. Farooqi and co-workers reported that patients with congenital leptin deficiency displayed activation of DA mesolimbic targets (Farooqi et al., 2007). In the leptin-deficient state, images of well-liked foods engendered a greater wanting response, even when the subject has just been fed, while after leptin treatment, well-liked food images engendered this response only in the fasted state, an effect consistent with the response in control subjects. Leptin reduces activation in the NAc-caudate, and mesolimbic activation (Farooqi et al., 2007). Thus, this study suggests that leptin diminished the rewarding responses to food, acting on the DA system (Farooqi et al., 2007; Volkow et al., 2011). Another fMRI study by Baicy et al., also performed with patients with congenital leptin deficiency, showed that during viewing of food-related stimuli, leptin replacement reduced neural activation in brain regions linked to hunger (the insula, parietal and temporal cortex), while enhancing activation in regions linked to inhibition and satiety (the prefrontal cortex; Baicy et al., 2007). Therefore, it appears that leptin acts on neural circuits involved in hunger and satiety with inhibitory control.
Another peptide hormone, ghrelin, which is produced in the stomach and pancreas, is known to increase appetite and food intake (Abizaid et al., 2006). The ghrelin receptor growth hormone secretagogue 1 receptor (GHSR) is present in hypothalamic centers as well as in the VTA. Abizaid and co-workers showed that in mice and rats, ghrelin bound to neurons of the VTA, where it triggered increased DA neuronal activity, synapse formation, and DA turnover in the NAc, in a GHSR-dependent manner. In addition, they demonstrated that direct VTA administration of ghrelin also triggered feeding behavior, while intra-VTA delivery of a selective GHSR antagonist blocked the orexigenic effect of circulating ghrelin, and blunted rebound feeding following fasting, suggesting that the DA reward circuitry is targeted by ghrelin to influence motivation for food (Abizaid et al., 2006).
Insulin, which is one of the key hormones involved in the regulation of glucose metabolism, and inhibits feeding, has been shown to also regulate the DA system in the brain. Insulin receptors are expressed in brain regions that are rich in DA neurons, such as the striatum and midbrain (Zahniser et al., 1984; Figlewicz et al., 2003), suggesting a functional interaction between the insulin and DA systems. Indeed, it has been shown that insulin acts on DA neurons, and infusion of insulin into the VTA decreases food intake in rats (Figlewicz et al., 2008; Bruijnzeel et al., 2011). Recent studies on the selective deletion of insulin receptors in midbrain DA neurons in mice demonstrated that this manipulation results in increased body weight, increased fat mass, and hyperphagia (Konner et al., 2011). While insulin acutely stimulated firing frequency in 50% of dopaminergic VTA/SN neurons, this response was abolished in those mice with the insulin receptor selectively deleted in DA neurons. Interestingly, in these mice, D2 receptor expression in the VTA was decreased compared to control mice. Moreover, these mice exhibited an altered response to cocaine under food-restricted conditions (Konner et al., 2011). Another recent report indicates that insulin can induce long-term depression (LTD) of mouse excitatory synapses onto VTA DA neurons (Labouèbe et al., 2013). Furthermore, after a sweetened high-fat meal, which elevates endogenous insulin levels, insulin-induced LTD is occluded. Finally, insulin in the VTA reduces food anticipatory behavior in mice, and CPP for food in rats. This study raises an interesting issue about how insulin can modulate reward circuitry, and suggests a new type of insulin-induced synaptic plasticity on VTA DA neurons (Labouèbe et al., 2013).
This review has focused on the role of the DA system, mainly concentrating on the roles of D1 and D2 receptors in reward-related behaviors, including addiction and food motivation. However, it is well known that the DA system in this reward circuit is finely modulated by glutamatergic, GABAergic, and other neurotransmitter systems, which form specific circuits to encode the neuronal correlates of behaviors. Recent breakthroughs in optogenetic tools to alter neuronal firing and function with light, as well as DREADDs, together with genetic manipulation of specific neuronal cells or circuits, are now allowing us to refine our insight into reward circuits in addiction, and the hedonic value of food intake. There is no doubt that these lines of investigation have provided a foundation for the future direction of studies of the neurocircuitry of the DA system in these behaviors. Future studies could include expanded manipulations of important signaling molecules, for example, signaling molecules implicated in the D1 and D2 receptor signaling cascades, to explore the impact of these molecules on the induction and expression of specific reward behaviors. Given that these two receptors employ distinct signaling pathways, in terms of their respective G protein coupling, as well as in the activation of common signaling molecules such as ERK, the differential distribution of receptors, as well as of their downstream signaling molecules, may result in a different type of physiological response. Additionally, with this conceptual and technical evolution in the study of the DA system in behaviors, this research will have important implications in the clinical investigation of related neurological disorders and psychiatric diseases. Therefore, our continuing efforts to identify and characterize the organization and modification of DA synaptic functions in both animals and humans will contribute to the elucidation of neural circuits underlying the pathophysiology of drug addiction and eating disorders.
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP, the Ministry of Science, ICT & Future Planning; No. 2011-0015678, No. 2012-0005303), and by a grant of the Korean Health Technology R&D Project (A111776) from the Ministry of Health & Welfare, Republic of Korea.
Drug-induced dopamine dysregulation can cause reckless sexual behavior. Does this have relevance for those who dysregulate dopamine with heavy porn use?
Parkinsonism Relat Disord. 2011 May;17(4):260-4. doi: 10.1016/j.parkreldis.2011.01.009.
Hassan A, Bower JH, Kumar N, Matsumoto JY, Fealey RD, Josephs KA, Ahlskog JE. Parkinsonism Relat Disord. 2011 Feb 8; Department of Neurology, Mayo Clinic, Rochester, MN 55905, USA.
BACKGROUND: Compulsive behaviors provoked by dopamine agonists often go undetected in clinical series, especially if not specifically inquired about.
AIM: To determine the frequency of compulsive behaviors in a Parkinson's disease (PD) clinic where agonist-treated patients were routinely asked about such aberrant behaviors.
METHODS: We utilized the Mayo Health Science Research database to ascertain all PD patients taking a dopamine agonist over a two-year period (2007-2009). All were seen by a Mayo-Rochester Movement Disorders Staff specialist who routinely inquired about behavior compulsions.
RESULTS: Of 321 PD patients taking an agonist, 69 (22%) experienced compulsive behaviors, and 50/321 (16%) were pathologic. However, when the analysis was restricted to patients taking agonist doses that were at least minimally therapeutic, pathological behaviors were documented in 24%. The subtypes were: gambling (25; 36%), hypersexuality (24; 35%), compulsive spending/shopping (18; 26%), binge eating (12; 17%), compulsive hobbying (8; 12%) and compulsive computer use (6; 9%). The vast majority of affected cases (94%) were concurrently taking carbidopa/levodopa. Among those with adequate followup, behaviors completely or partly resolved when the dopamine agonist dose was reduced or ceased.
CONCLUSIONS: Dopamine agonist treatment of PD carries a substantial risk of pathological behaviors. These occurred in 16% of agonist-treated patients; however, when assessing patients whose dose was at least minimally in the therapeutic range, the frequency jumped to 24%. Pathological gambling and hypersexuality were most common. Carbidopa/levodopa therapy taken concurrently with a dopamine agonist appeared to be an important risk factor.
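The frequencies in the RESULTS paragraph above use two different denominators, which is easy to miss on a first read. The short Python sketch below recomputes them from the counts given in the abstract; the variable and function names are illustrative only and are not part of the study. It suggests that the overall rates are fractions of all 321 agonist-treated patients, while the subtype percentages are fractions of the 69 patients with any compulsive behavior. At one decimal place, 69/321 comes to 21.5%, which the abstract reports as 22%.

```python
# Counts copied from the RESULTS paragraph; all names here are illustrative.
TOTAL_ON_AGONIST = 321   # PD patients taking a dopamine agonist
ANY_COMPULSIVE = 69      # patients with any compulsive behavior
PATHOLOGIC = 50          # patients whose behaviors were rated pathologic

# Number of affected patients reported for each behavior subtype.
SUBTYPES = {
    "gambling": 25,
    "hypersexuality": 24,
    "compulsive spending/shopping": 18,
    "binge eating": 12,
    "compulsive hobbying": 8,
    "compulsive computer use": 6,
}


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    # Overall rates are taken over all agonist-treated patients.
    print("any compulsive behavior:", pct(ANY_COMPULSIVE, TOTAL_ON_AGONIST), "%")
    print("pathologic behavior:", pct(PATHOLOGIC, TOTAL_ON_AGONIST), "%")

    # Subtype rates are taken over the patients with any compulsive behavior.
    for name, count in SUBTYPES.items():
        print(f"{name}: {pct(count, ANY_COMPULSIVE)} %")

    # The subtype counts add up to more than the number of affected patients.
    print("sum of subtype counts:", sum(SUBTYPES.values()))
```

The subtype counts sum to 93, more than the 69 affected patients, which would be expected if some patients showed more than one compulsive behavior; the abstract itself does not state this, so treat it as an inference from the numbers.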
The anhedonia hypothesis – that brain dopamine plays a critical role in the subjective pleasure associated with positive rewards – was intended to draw the attention of psychiatrists to the growing evidence that dopamine plays a critical role in the objective reinforcement and incentive motivation associated with food and water, brain stimulation reward, and psychomotor stimulant and opiate reward. The hypothesis called to attention the apparent paradox that neuroleptics, drugs used to treat a condition involving anhedonia (schizophrenia), attenuated in laboratory animals the positive reinforcement that we normally associate with pleasure. The hypothesis held only brief interest for psychiatrists, who pointed out that the animal studies reflected acute actions of neuroleptics whereas the treatment of schizophrenia appears to result from neuroadaptations to chronic neuroleptic administration, and that it is the positive symptoms of schizophrenia that neuroleptics alleviate, rather than the negative symptoms that include anhedonia. Perhaps for these reasons, the hypothesis has had minimal impact in the psychiatric literature. Despite its limited heuristic value for the understanding of schizophrenia, however, the anhedonia hypothesis has had major impact on biological theories of reinforcement, motivation, and addiction. Brain dopamine plays a very important role in reinforcement of response habits, conditioned preferences, and synaptic plasticity in cellular models of learning and memory. The notion that dopamine plays a dominant role in reinforcement is fundamental to the psychomotor stimulant theory of addiction, to most neuroadaptation theories of addiction, and to current theories of conditioned reinforcement and reward prediction. Properly understood, it is also fundamental to recent theories of incentive motivation.
The anhedonia hypothesis of neuroleptic action (Wise, 1982) was, from its inception (Wise et al., 1978), a corollary of broader hypotheses, the dopamine hypotheses of reward (Wise, 1978) or reinforcement (Fibiger, 1978). The dopamine hypotheses were themselves deviations from an earlier catecholaminergic theory, the noradrenergic theory of reward (Stein, 1968). The present review sketches the background, initial response, and current status of the inter-related dopamine hypotheses: the dopamine hypothesis of reward, the dopamine hypothesis of reinforcement, and the anhedonia hypothesis of neuroleptic action.
The notion that animal behavior is controlled by reward and punishment is certainly older than recorded history (Plato attributed it to his older brother). The notion that an identifiable brain mechanism subserves this function was anchored firmly to biological fact by the finding of Olds and Milner (1954) that rats will work for electrical stimulation of some but not other regions of the forebrain. This led to the postulation by Olds (1956) of "pleasure centers" in the lateral hypothalamus and related brain regions. Brain stimulation studies by Sem-Jacobsen (1959) and Heath (1963) confirmed that humans would work for such stimulation and found it pleasurable (Heath, 1972). Olds (Olds and Olds, 1963) mapped much of the rat brain for reward sites, and even as his title phrase "pleasure centers" (Olds, 1956) was capturing the minds of a generation of students he was thinking not about isolated centers so much as about interconnected circuit elements (Olds, 1956; 1959; Olds and Olds, 1965). Olds (1956) assumed these to be specialized circuits that "would be excited by satisfaction of the basic drives – hunger, sex, thirst and so forth."
The first hints of what neurotransmitters might carry reward-related signals in the brain came from pharmacological studies. Olds and Travis (1960) and Stein (1962) found that the tranquilizers reserpine and chlorpromazine dramatically attenuated intracranial self-stimulation, while the stimulant amphetamine potentiated it. Imipramine potentiated the effects of amphetamine (Stein, 1962). Reserpine was known to deplete brain noradrenaline, chlorpromazine was known to block noradrenergic receptors, amphetamine was known to be a noradrenaline releaser, and imipramine was known to block noradrenergic reuptake. Largely on the basis of these facts and the location of reward sites in relation to noradrenergic cells and fibers, Stein (1968) proposed that reward function was mediated by a noradrenergic pathway originating in the brainstem (interestingly, Stein initially identified the A10 cell group, which turned out to comprise dopaminergic rather than noradrenergic neurons, as the primary origin of this system). Pursuing his hypothesis, C.D. Wise and Stein (1969; 1970) found that inhibition of dopamine-β-hydroxylase – the enzyme that converts dopamine to norepinephrine – abolished self-stimulation and eliminated the rate-enhancing action of amphetamine; intraventricular administration of l-norepinephrine reinstated self-stimulation and restored the ability of amphetamine to facilitate it.
At the time of initial formulation of the noradrenergic theory of reward, dopamine was known as a noradrenergic precursor but not as a transmitter in its own right. At about this time, however, Carlsson et al. (1958) suggested that dopamine might be a neurotransmitter in its own right. The discovery that noradrenaline and dopamine have different distributions in the nervous system (Carlsson, 1959; Carlsson and Hillarp, 1962) appeared to confirm this assumption, and reward sites in the region of the dopamine-containing cells of the midbrain led Crow and others to suggest that the two catecholamine transmitters in forebrain circuitry – noradrenaline and dopamine – might each subserve reward function (Crow, 1972; Crow et al., 1972; Phillips and Fibiger, 1973; German and Bowden, 1974).
Evidence that eventually ruled out a major role for norepinephrine in brain stimulation and addictive drug reward began to accumulate from two sources: pharmacology and anatomy. The pharmacological issue was whether selective noradrenergic blockers or depletions disrupted reward function itself or merely impaired the performance capacity of the animals. For example, Roll (1970) reported that noradrenergic synthesis inhibition disrupted self-stimulation by making animals sleepy; waking them restored the behavior for a time, until the animals lapsed into sleep again (Roll, 1970). Noradrenergic receptor antagonists clearly disrupted intracranial self-stimulation in ways suggestive of debilitation rather than loss of sensitivity to reward (Fouriezos et al., 1978; Franklin, 1978). Also, noradrenergic antagonists failed to disrupt intravenous (IV) self-administration of amphetamine (Yokel and Wise, 1975; 1976; Risner and Jones, 1976) or cocaine (de Wit and Wise, 1977; Risner and Jones, 1980). Further, lesions of the noradrenergic fibers of the dorsal bundle failed to disrupt self-stimulation with stimulating electrodes near the locus coeruleus, where the bundle originates, or in the lateral hypothalamus, through which the bundle projects (Corbett et al., 1977). Finally, careful mapping of the region of the locus coeruleus and the trajectory of the dorsal noradrenergic bundle fibers that originate there revealed that positive reward sites in these regions did not correspond to the precise location of histochemically confirmed noradrenergic elements (Corbett and Wise, 1979).
On the other hand, as selective antagonists for dopamine receptors became available, evidence began to accumulate that dopamine receptor blockade disrupted self-stimulation in ways that implied a devaluation of reward rather than an impairment of performance capacity. There was considerable early concern that the effect of dopamine antagonists – neuroleptics – was primarily motor impairment (Fibiger et al., 1976). Our first study in this area was not subject to this interpretation because performance in our task was enhanced rather than disrupted by neuroleptics. In our study rats were trained to lever-press for IV injections of amphetamine, a drug that causes release of each of the four monoamine neurotransmitters – norepinephrine, epinephrine, dopamine, and serotonin. We trained animals to self-administer IV amphetamine and challenged with selective antagonists for adrenergic or dopaminergic receptors. Animals treated with low and moderate doses of selective dopamine antagonists simply increased their responding (as do animals tested with lower than normal amphetamine doses), while animals treated with high doses increased responding in the first hour or two but responded intermittently thereafter (as do animals tested with saline substituted for amphetamine) (Yokel and Wise, 1975; 1976). Similar effects were seen in rats lever-pressing for cocaine (de Wit and Wise, 1977). Very different effects were seen with selective noradrenergic antagonists; these drugs decreased responding from the very start of the session and did not lead to further decreases as the animals earned and experienced the drug in this condition (Yokel and Wise, 1975; 1976; de Wit and Wise, 1977). The increases in responding for drug reward could clearly not be attributed to performance impairment. 
The findings were interpreted as reflecting a reduction of the rewarding efficacy of amphetamine and cocaine, such that the duration of reward from a given injection was reduced by dopaminergic, but not noradrenergic, antagonists.
In parallel with our pharmacological studies of psychomotor stimulant reward, we carried out pharmacological studies of brain stimulation reward. Here, however, dopamine antagonists, like reward-reduction, reduced rather than increased lever-pressing. The reasons that neuroleptics decrease responding for brain stimulation and increase responding for psychomotor stimulants are interesting and are now understood (Lepore and Franklin, 1992), but at the time decreased responding was suggested to reflect parkinsonian side-effects of dopaminergic impairment (Fibiger et al., 1976). The time-course of our finding appeared to rule out this explanation. We tracked the time-course of responding in well-trained animals that were pre-treated with the dopamine antagonists pimozide or butaclamol. We found that the animals responded normally in the initial minutes of each session, when they would have expected normal reward from the prior reinforcement history, but they slowed or ceased responding, depending on the neuroleptic dose, as did animals unexpectedly tested under conditions of reduced reward (Fouriezos and Wise, 1976; Fouriezos et al., 1978). Animals pretreated with the noradrenergic antagonist phenoxybenzamine, in contrast, showed depressed lever-pressing from the very start of the session and they did not slow further as they earned and experienced the rewarding stimulation. Performance was poor in the phenoxybenzamine-treated animals, but it did not worsen as the animals gained experience with the reward while under the influence of the drug.
That dopaminergic but not noradrenergic antagonists impaired the ability of reward to sustain motivated responding was confirmed in animals tested in a discrete-trial runway test. Here, the animals ran a two-meter alleyway from a start box to a goal box where they could lever-press, on each of 10 trials per day, for 15 half-second trains of brain stimulation reward. After several days of training the animals were tested after neuroleptic pretreatment. Over the course of 10 trials in the neuroleptic condition, the animals stopped leaving the start box immediately when the door was opened, stopped running quickly and directly to the goal box, and stopped lever-pressing for the stimulation. Importantly, however, the consummatory response – earning the stimulation once they reached the goal box – deteriorated before the instrumental responses – leaving the start box and running the alleyway – deteriorated. The animals left the start box with normal latency for the first 8 trials, ran normally for only the first 7 trials, and lever-pressed at normal rates for only the first 6 trials of the neuroleptic test session. Thus the animals showed signs of disappointment in the reward – indicated by the decreased responding in the goal box – before they showed any lack of motivation – indicated by approach responding.
These self-stimulation findings were again incompatible with the possibility that our neuroleptic doses were simply causing motor deficits. The animals showed normal capacity at the beginning of sessions, and continued to run the alleyway at peak speed until after they showed signs of disappointment with the reward in the goal box. Moreover, in the lever-pressing experiments the neuroleptic-treated animals sometimes leaped out of their open-topped test chambers and balanced precariously on the edge of the plywood walls; thus the animals still had good motor strength and coordination (Fouriezos, 1985). Moreover, neuroleptic-treated animals that ceased responding after a few minutes did not do so because of exhaustion; they re-initiated normal responding when presented with reward-predictive environmental stimuli (Fouriezos and Wise, 1976; Franklin and McCoy, 1979). Moreover, after extinguishing one learned response for brain stimulation reward, neuroleptic-treated rats will initiate, with normal response strength, an alternative, previously learned, instrumental response for the same reward (they then go through progressive extinction of the second response: Gallistel et al., 1982). Finally, moderate reward-attenuating doses of neuroleptics do not impose a lowered response ceiling as do changes in performance demands (Edmonds and Gallistel, 1974); rather they merely increase the amount of stimulation (reward) necessary to motivate responding at the normal maximum rates (Gallistel and Karras, 1984). These pharmacological findings suggested that whatever collateral deficits they may cause, neuroleptic drugs devalue the effectiveness of brain stimulation and psychomotor stimulant rewards.
In parallel with our pharmacological studies, we initiated anatomical mapping studies with two advantages over earlier approaches. First, we used a moveable electrode (Wise, 1976) so that we could test several stimulation sites within each animal. In each animal, then, we had anatomical controls: ineffective stimulation sites above or below loci where stimulation was rewarding. Electrode movements of 1/8 mm were often sufficient to take an electrode tip from a site where stimulation was not rewarding to a site where it was, or vice versa. This allowed us to identify the dorsal-ventral boundaries of the reward circuitry within a vertical electrode penetration in each animal. Second, we took advantage of a new histochemical method (Bloom and Battenberg, 1976) to identify the boundaries of the catecholamine systems in the same histological material that showed the electrode track. Previous studies had relied on single electrode sites in each animal and on comparisons between nissl-stained histological sections and line drawings showing the locations of catecholamine systems. Our mapping studies showed that the boundaries of the effective zones of stimulation did not correspond to the boundaries of noradrenergic cell groups or fiber bundles (Corbett and Wise, 1979) and did correspond to the boundaries of the dopamine cell groups in the ventral tegmental area and substantia nigra pars compacta (Corbett and Wise, 1980) and pars lateralis (Wise, 1981). While subsequent work has raised the question of whether rewarding stimulation activates high-threshold catecholamine systems directly or rather activates their low-threshold input fibers (Gallistel et al., 1981; Bielajew and Shizgal, 1986; Yeomans et al., 1988), the mapping studies tended to focus attention on dopamine rather than norepinephrine systems as substrates of reward.
The term "anhedonia" was first introduced in relation to studies of food reward (Wise et al., 1978). Here again, we found that when well-trained animals were first tested under moderate doses of the dopamine antagonist pimozide, they initiated responding normally for food reward. Indeed, pimozide-pretreated animals responded as much (at 0.5 mg/kg) or almost as much (at 1.0 mg/kg) the first day under pimozide treatment as they did when food was given in the absence of pimozide. When retrained for two days and then tested a second time under pimozide, however, they again responded normally in the early portion of their 45-min sessions but stopped responding earlier than normal and their total responding for this second session was significantly lower than on a drug-free day or on their first pimozide-test day. When retrained and tested a third and fourth time under pimozide, the animals still initiated responding normally but ceased responding progressively earlier. Normal responding in the first few minutes of each session confirmed that the doses of pimozide were not simply debilitating the animals; decreased responding after tasting the food in the pimozide condition suggested that the rewarding (response-sustaining) effect of food was devalued when the dopamine system was blocked.
In this study, a comparison group was trained the same way, but these animals were simply not rewarded on the four "test" days when the experimental groups were pretreated with pimozide. Just as the pimozide-treated animals lever-pressed the normal 200 times for food pellets on the first day, so did the non-rewarded animals lever-press the normal 200 times despite the absence of the normal food reward. On successive days of testing, however, lever-pressing in the non-rewarded group dropped to 100, 50, and 25 responses, showing the expected decrease in resistance to extinction that paralleled the pattern seen in the pimozide-treated animals. A similar pattern across successive tests is seen when animals trained under deprivation are tested several times under conditions of satiety; the first time tested the animals respond for and eat food that was freely available before or during the test. Like the habit-driven lever-pressing in our pimozide-treated or non-rewarded animals, the habit-driven eating under satiety decreases progressively with repeated testing. Morgan (1974) termed the progressive deterioration of responding under satiety "resistance to satiation," calling attention to the parallel with resistance to extinction. In all three conditions – responding under neuroleptics, responding under non-reward, and responding under satiety – the behavior is driven by a response habit that decays if not supported by normal reinforcement. In our experiment, an additional comparison group established that there was no sequential debilitating effect of repeated testing with pimozide, a drug with a long half-life and subject to sequestration by fat. The animals of this group received pimozide in their home cages but were not tested on the first three "test days"; they were allowed to lever-press for food only after the fourth of their series of pimozide injections. 
These animals responded avidly for food after their fourth pimozide treatment, just like animals that were given the opportunity to lever-press for food the first time they were treated with pimozide. Thus responding in Test 4 depended not just on having had pimozide in the past, but on having tasted food under pimozide conditions in the past. Something about the memory of food experience under pimozide – not just of pimozide alone – caused the progressively earlier response cessation seen when pimozide tests were repeated. The fact that pimozide-pretreated animals responded avidly for food until after they had tasted it in the pimozide condition led us to postulate that the food was not as enjoyable under the pimozide condition. The essential feature of what appeared to be a devaluation of reward under pimozide had been captured earlier in a remark of George Fouriezos in connection with our brain stimulation experiments: "Pimozide takes the jolts out of the volts."
The formal statement of the anhedonia hypothesis appeared a few years after the food reward studies in a journal that published peer commentaries along with review papers (Wise, 1982). Two thirds of the initial commentaries either contested the hypothesis or proposed an alternative to it (Wise, 1990). For the most part, the primary arguments against the original hypothesis appealed to motor or other performance deficits (Freed and Zec, 1982; Koob, 1982; Gramling et al., 1984; Ahlenius, 1985). These were arguments addressed to the finding that neuroleptics caused decreased performance for food or brain stimulation reward but did not, for the most part, address the fact that neuroleptics disrupted maintenance rather than initiation of responding. They also failed to address the fact that when neuroleptic-treated animals stopped responding their responding could be reinstated by exposing them to previously conditioned reward-predictive stimuli (Fouriezos and Wise, 1976; Franklin and McCoy, 1979). Nor could these arguments be reconciled with the fact that such reinstated responding itself underwent apparent extinction. Finally, they did not address the fact that neuroleptics caused compensatory increases in lever-pressing for amphetamine and cocaine reward (Yokel and Wise, 1975; 1976; de Wit and Wise, 1977).
The most critical evidence against a motor hypothesis was elaborated before the formal statement of the anhedonia hypothesis. The paper (Wise et al., 1978) is still steadily cited, but is probably rarely now read in the original. The original findings are summarized above, but they continue to escape the attention of most remaining proponents of motor hypotheses (or other hypotheses of debilitation); for this reason the original paper is still worth reading. The critical findings are that moderate doses of neuroleptics only severely attenuate responding for food after the animal has had experience with that food while under the influence of the neuroleptic. If the animal has had experience with the neuroleptic in the absence of food, its subsequent effect on responding for food is minimal; however, after having had experience with the food under the influence of the neuroleptic, the effect of the neuroleptic becomes progressively stronger. Similar effects are seen when the only instrumental responses required of the animal are those of picking up the food, chewing it, and swallowing (Wise and Colle, 1984; Wise and Raptis, 1986).
Several of the criticisms of the anhedonia hypothesis have been more semantic than substantial. While agreeing that the effects of neuroleptics cannot be explained as simple motor debilitation, several authors have suggested other names for the condition. Katz (1982) termed it "hedonic arousal"; Liebman (1982) termed it "neuroleptothesia"; Rech (1982) termed it "neurolepsis" or "blunting of emotional reactivity"; Kornetsky (1985) termed it a problem of "motivational arousal"; and Koob (1982) begged the question by calling it a "higher order" motor problem. The various criticisms addressed differentially the anhedonia hypothesis, the reinforcement hypothesis, and the reward hypothesis.
The anhedonia hypothesis was really a corollary of the hypothesis that dopamine was important for objectively measured reward function. The initial statement of the hypothesis was that the neuroleptic pimozide "appears to selectively blunt the rewarding impact of food and other hedonic stimuli" (Wise, 1978). It was not really an hypothesis about subjectively experienced anhedonia but rather an hypothesis about objectively measured reward function. The first time the hypothesis was actually labeled the "anhedonia hypothesis" (Wise, 1982), it was stated thusly: "the most subtle and interesting effect of neuroleptics is a selective attenuation of motivational arousal that is (a) critical for goal-directed behavior, (b) normally induced by reinforcers and associated environmental stimuli, and (c) normally accompanied by the subjective experience of pleasure." The hypothesis linked dopamine function explicitly to motivational arousal and reinforcement – the two fundamental properties of rewards – and implied only a partial correlation with the subjective experience of the pleasure that "usually" accompanies positive reinforcement.
The suggestion that dopamine might be important for pleasure itself came in part from the subjective reports of patients (Healy, 1989) or normal subjects (Hollister et al., 1960; Bellmaker and Wald, 1977) given neuroleptic treatments. The dysphoria caused by neuroleptics is quite consistent with the suggestion that they attenuate the normal pleasures of life. Also consistent with this view was the finding that drugs like cocaine and amphetamine – drugs that are presumed to be addictive at least in part because of the euphoria they cause (Bijerot, 1980) – increase extracellular dopamine levels (vanRossum et al., 1962; Axelrod, 1970; Carlsson, 1970). The neuroleptic pimozide, a competitive antagonist at dopamine receptors (and the neuroleptic used in our animal studies), had been reported to decrease the euphoria induced by IV amphetamine in humans (Jönsson et al., 1971; Gunne et al., 1972).
The ability of neuroleptics to block the subjective effects of euphoria has been questioned on the basis of clinical reports of continued amphetamine and cocaine abuse in neuroleptic-treated schizophrenic patients and on the basis of more recent studies of subjective effects in neuroleptic-treated normal humans. The clinical observations are difficult to interpret because of compensatory adaptations to chronic dopamine receptor blockade and because of variability in drug intake, neuroleptic dose, and compliance with treatment during periods of stimulant use. The more recent controlled studies of the effects of pimozide on amphetamine euphoria (Brauer and de Wit, 1996; 1997) are also problematic. First, there are issues of pimozide dose: the high dose of the early investigators was 20 mg (Jönsson et al., 1971; Gunne et al., 1972), whereas, because of concern about extrapyramidal side-effects, the high dose in the more recent studies was 8 mg. More troublesome are the differences in amphetamine treatment between the original and the more recent studies. In the original studies, 200 mg of amphetamine was given intravenously to regular amphetamine users; in the more recent studies, 10 or 20 mg was given to normal volunteers by mouth in capsules. One must wonder if normal volunteers are feeling and rating the same euphoria from their 20 mg capsules as is felt by chronic amphetamine users after their 200 mg IV injection (Grace, 2000; Volkow and Swanson, 2003).
The notion that neuroleptics attenuate the pleasure of food reward has also been challenged on the basis of rat studies (Treit and Berridge, 1990; Pecina et al., 1997). Here the challenge was based on the taste-reactivity test, putatively a test of the hedonic impact of sweet taste (Berridge, 2000). The test has been used to challenge directly the hypothesis that "pimozide and other dopamine antagonists produce anhedonia, a specific reduction of the capacity for sensory pleasure" (Pecina et al., 1997, p. 801). This challenge is, however, subject to serious caveats: "When using taste reactivity as a measure of 'liking' or hedonic impact it is important to be clear about a potential confusion. Use of terms such as 'like' and 'dislike' does not necessarily imply that taste reactivity patterns reflect a subjective experience of pleasure produced by a food" (Berridge, 2000, p. 192, emphasis as in the original), and that "We will place 'liking' and 'wanting' in quotation marks because our use differs in an important way from the ordinary use of these words. By their ordinary meaning, these words typically refer to the subjective experience of conscious pleasure or conscious desire" (Berridge and Robinson, 1998, p. 313). The taste reactivity test seems unlikely to directly measure the subjective pleasure of food, as "normal" taste reactivity in this paradigm is seen in decorticate rats (Grill and Norgren, 1978) and similar reactions are seen in anencephalic children (Steiner, 1973). Thus it appears that the initial interpretation of the taste reactivity test (Berridge and Grill, 1984) was correct: the test measures the fixed action patterns of food ingestion or rejection – more a part of swallowing than of smiling – reflecting hedonic impact only insomuch as it reflects the positive or negative valence of the fluid injected into the passive animal's mouth.
The anhedonia hypothesis was based on the observation that a variety of rewards failed to sustain normal levels of instrumental behavior in well-trained but neuroleptic-treated animals. This was not taken as evidence of neuroleptic-induced anhedonia, but rather evidence of neuroleptic-induced attenuation of positive reinforcement. Under neuroleptic treatment animals showed normal initiation but progressive decrements in responding both within and across repeated trials, and these decrements paralleled in pattern, if not in degree, the similar decrements seen in animals that were simply allowed to respond under conditions of non-reward (Wise et al., 1978). Moreover, naïve rats were found not to learn to lever-press normally for food if they were pretreated with neuroleptic for their training sessions (Wise and Schwartz, 1981). Thus the habit-forming effect of food is severely attenuated by dopamine blockade. These findings have not been challenged but have rather been replicated by critics of what has come to be labeled the anhedonia hypothesis (Tombaugh et al., 1979; Mason et al., 1980), who have argued that under their conditions neuroleptics cause performance deficits above and beyond clear deficits in reinforcement. Given the fact that neuroleptics block all dopamine systems, some of which are thought to be involved in motor function, this was not surprising or contested (Wise, 1985).
Clear similarities between the effects of non-reward and the effects of reward under neuroleptic treatment are further illustrated by two much more subtle paradigms. The first is a partial reinforcement paradigm. It is well established that animals respond more under extinction conditions if they are trained not to expect a reward for every response they make. That animals respond more in extinction if they have been trained under intermittent reinforcement is known as the partial reinforcement extinction effect (Robbins, 1971). Ettenberg and Camp found partial reinforcement extinction effects with neuroleptic challenges of food- and water-trained response habits. They tested animals in extinction of a runway task after training in each of three conditions. Food- or water-deprived animals were trained, one trial per day, to run 155 cm in a straight alley runway for food (Ettenberg and Camp, 1986b) or water (Ettenberg and Camp, 1986a) reward. One group was trained under a "continuous" reinforcement schedule; that is, they received their designated reward on each of the 30 days of training. A second group was trained under partial reinforcement; they received their designated reward on only 20 of the 30 training days; on 10 days randomly spaced in the training period, the animals found no food or water when they arrived at the goal box. The third group received food or water on every trial but were periodically treated with the neuroleptic haloperidol; on 10 of their training trials they found food or water in the goal box, but, having been pretreated with haloperidol on those days, they experienced the food or water under conditions of dopamine receptor blockade. The consequences of these training regimens were assessed in 22 subsequent daily "extinction" trials in which each group was allowed to run but received no reward in the goal box. All animals ran progressively slower as the extinction trials continued. 
However, the performance of animals that had been trained under continuous reinforcement conditions deteriorated much more rapidly from day to day than did that of animals that had been trained under partial reinforcement conditions. The animals that had been trained under "partial" haloperidol conditions also persevered more than the animals with the continuous reinforcement training; the intermittent haloperidol animals had start-box latencies and running times that were identical to those of the animals trained under partial reinforcement. That is, the animals pretreated with haloperidol on 1/3 of their training days performed in extinction as if they had experienced no reward on 1/3 of their training days. There is no possibility of a debilitation confound here, first because the performance of the haloperidol-treated animals was better than that of the control animals and second because haloperidol was not given on the test days, only on some of the training days.
The second subtle paradigm is a two-lever drug discrimination paradigm. Here the animals are trained to continue responding on one of two levers as long as that lever yields food reward, and to shift to the other lever when no longer rewarded. With low doses of haloperidol, animals inexplicably shift to the wrong lever as if they had earned no food with their initial lever-press (Colpaert et al., 2007). That is, haloperidol-treated rats that earned food on their initial lever-press behaved like normal rats that failed to earn food on their initial lever-press. This was not a reflection of some form of haloperidol-induced motor deficit, because the evidence that food was not rewarding under haloperidol involved not the absence of a response but rather the initiation of a response: a response on the second lever.
Thus it is increasingly clear that, whatever else they do, neuroleptics decrease the reinforcing efficacy of a range of normally positive rewards.
The most recent challenge to the anhedonia hypothesis comes from theorists who argue that the primary motivational deficit caused by neuroleptics is a deficit in the drive or motivation to find or earn reward rather than the reinforcement that accompanies the receipt of reward (Berridge and Robinson, 1998; Salamone and Correa, 2002; Robinson et al., 2005; Baldo and Kelley, 2007). The suggestion that dopamine plays an important role in motivational arousal was, in fact, stressed more strongly in the original statement of the anhedonia hypothesis than was anhedonia itself: "the most subtle and interesting effect of neuroleptics is a selective attenuation of motivational arousal which is (a) critical for goal-directed behavior…" (Wise, 1982). That elevations of extracellular dopamine can motivate learned behavior sequences is perhaps best illustrated by the "priming" effect that is seen when free reward is given to an animal that is temporarily not responding in an instrumental task (Howarth and Deutsch, 1962; Pickens and Harris, 1968). This effect is best illustrated by drug-induced reinstatement of responding in animals that have undergone repeated extinction trials (Stretch and Gerber, 1973; de Wit and Stewart, 1983). One of the most powerful stimuli for reinstatement of responding in animals that have extinguished a cocaine-seeking or a heroin-seeking habit is an unearned injection of the dopamine agonist bromocriptine (Wise et al., 1990). The inclusion of motivational arousal is the main feature that differentiates the dopamine hypothesis of reward from the narrower dopamine hypothesis of reinforcement (Wise, 1989; 2004).
While there is ample evidence that dopamine can amplify or augment motivational arousal, there is equally ample evidence that neuroleptic drugs do not block the normal motivational arousal that is provided for a well-trained animal by reward-predictive cues in the environment. As discussed above, neuroleptic-treated animals tend to initiate response habits normally. Such animals start but do not normally continue to lever-press, run, or eat in operant chambers, runways, or free-feeding tests. When given in a discrete-trial runway task, haloperidol-treated animals run normally during the trial when the haloperidol is given; their motivational deficit only appears the next day, when the haloperidol has been metabolized and all that is left of the treatment is the memory of the treatment trial (McFarland and Ettenberg, 1995; 1998). The start-box cues fail to trigger running down the runway for food or heroin not on the day when the animals are under the influence of haloperidol, but on the next day when they only remember what the reward was like on the haloperidol day. So the motivational arousal of the animal on the day it gets haloperidol treatment is not compromised by the treatment; rather it must be the memory of a degraded reward that discourages the animal the day after the treatment trial. This is the most salient message from studies of the effects of neuroleptics on instrumental behavior in the range of tasks; neuroleptics at appropriate doses do not interfere with the ability of learned stimuli to instigate motivated behavior until after the stimuli have begun to lose the ability to maintain that behavior because of experience of the reward in the neuroleptic condition (Fouriezos and Wise, 1976; Fouriezos et al., 1978; Wise et al., 1978; Wise and Raptis, 1986; McFarland and Ettenberg, 1995; 1998).
This is not to say that dopamine is completely irrelevant to motivated behavior, only that the surges of phasic dopamine that are triggered by reward-predictors (Schultz, 1998) are, for the moment, unnecessary for the normal motivation of animals with an uncompromised reinforcement history. Well-trained animals respond out of habit, and do so even under conditions of dopamine receptor blockade. If brain dopamine is completely depleted, however, there are very dramatic effects on motivated behavior (Ungerstedt, 1971; Stricker and Zigmond, 1974). This is evident from studies of mutant mice that do not synthesize dopamine; these animals, like animals with experimental dopamine depletions, fail to move unless aroused by pain or stress, a dopamine agonist, or the dopamine-independent stimulant caffeine (Robinson et al., 2005). Thus minimal levels of functional dopamine are necessary for all normal behavior; dopamine-depleted animals, like dopamine-depleted parkinsonian patients (Hornykiewicz, 1979), are almost completely inactive unless stressed (Zigmond and Stricker, 1989). Among the primary deficits associated with dopamine depletion are aphagia and adipsia, which have motivational as well as motor components (Teitelbaum and Epstein, 1962; Ungerstedt, 1971; Stricker and Zigmond, 1974). Reward-blocking doses of neuroleptics, however, fail to produce the profound catalepsy that is caused by profound dopamine depletion.
The dopamine terminal field that has received most attention with respect to reward function is nucleus accumbens. Attention was drawn to nucleus accumbens first because lesions of this but not other catecholamine systems disrupted cocaine self-administration (Roberts et al., 1977). Further attention was generated by the suggestions that nucleus accumbens septi should be considered a limbic extension of the striatum, rather than an extension of the septum (Nauta et al., 1978a,b) and that it is an interface between the limbic system – conceptually linked to functions of motivation and emotion – and the extrapyramidal motor system (Mogenson et al., 1980). Studies of opiate reward also suggested that it is the mesolimbic dopamine system – the system projecting primarily from the ventral tegmental area to the nucleus accumbens – that is associated with reward function. Morphine in the ventral tegmental area was found to activate (Gysling and Wang, 1983; Matthews and German, 1984), by disinhibiting them (Johnson and North, 1992), dopaminergic neurons, and microinjections of morphine in this region potentiated brain stimulation reward (Broekkamp et al., 1976), produced conditioned place preferences (Phillips and LePiane, 1980), and were self-administered in their own right (Bozarth and Wise, 1981).
One challenge to the dopamine hypotheses thus arose from the finding that nucleus accumbens lesions failed to disrupt all instrumental behavior (Salamone et al., 1997). Aside from the problem that it is almost impossible to lesion nucleus accumbens selectively and, at the same time, completely, there are other reasons to assume that nucleus accumbens lesions should not eliminate all of dopamine's motivational actions. First, cocaine is directly self-administered not only into nucleus accumbens (Carlezon et al., 1995; Ikemoto, 2003), but also – and more avidly – into the medial prefrontal cortex (Goeders and Smith, 1983; Goeders et al., 1986) and olfactory tubercle (Ikemoto, 2003). Intravenous cocaine reward is attenuated not only by microinjections of a D1 antagonist into the ventral tegmental area (Ranaldi and Wise, 2001) but also by similar injections into the substantia nigra (Quinlan et al., 2004). Finally, post-trial dopamine release in the dorsal striatum enhances consolidation of learning and memory (White and Viaud, 1991), and dopamine blockade in the dorsal striatum impairs long-term potentiation (a cellular model of learning and memory) in this region (Centonze et al., 2001). Potentiation of memory consolidation is, in essence, the substance of reinforcement (Landauer, 1969) and dopamine appears to potentiate memory consolidation in the dorsal striatum and a variety of other structures (White, 1989; Wise, 2004).
Thus, for a variety of reasons, the dopamine hypothesis should not be reduced to a nucleus accumbens hypothesis. Nucleus accumbens is but one of the dopamine terminal fields implicated in reward function.
While evidence has steadily accumulated for an important role of dopamine in reward function, a role we originally summarized loosely as "motivational arousal," our understanding of the precise nature of this function continues to develop in subtlety and complexity. Four issues, in addition to variations on the old motor hypothesis, have arisen in the recent literature.
One suggestion, offered as a direct challenge to the anhedonia hypothesis and the dopamine hypothesis of reward (Salamone et al., 1994; 1997; 2005) is that what neuroleptics reduce is not motivation or reinforcement but rather the animal's willingness to exert effort (Salamone et al., 2003). This suggestion is merely semantic. The willingness to exert effort is the essence of what we mean by motivation or drive, the first element in the initial three-part statement of the anhedonia hypothesis (Wise, 1982).
Studies of mutant mice lacking dopamine in dopaminergic neurons (but retaining it in noradrenergic neurons) show that brain dopamine is not absolutely necessary for food-rewarded instrumental learning. If given caffeine to arouse them, dopamine-deficient mice can learn to choose the correct arm of a T-maze for food reward (Robinson et al., 2005). This implicates dopamine in the motivational arousal that is lacking in dopamine-deficient mice that are not treated with caffeine, and indicates that dopamine is not essential to – though it normally contributes greatly to – the rewarding effects of food. It is interesting to note, however, that caffeine – required if the mutant mice are to behave at all without dopamine – also restores the feeding response that is lost after neurotoxic lesions of dopamine neurons in adult animals (Stricker et al., 1977). The mechanism of the caffeine effects is not fully understood, but caffeine affects the same medium-sized spiny striatal neurons that are the normal neuronal targets of dopaminergic fibers of the nigro-striatal and meso-limbic dopamine systems. It acts there as a phosphodiesterase inhibitor that increases intracellular cyclic AMP (Greengard, 1976) and as an adenosine receptor antagonist (Snyder et al., 1981). Moreover, the adenosine receptors that are blocked by caffeine normally form heteromers with dopamine receptors and affect the intracellular response to the effects of dopamine at those receptors (Ferre et al., 1997; Schiffmann et al., 2007). The complex interactions of dopamine and adenosine receptors in the striatum raise the possibility that caffeine enables learning in dopamine-deficient mice by substituting for dopamine in a shared or overlapping intracellular signaling cascade.
Schultz and colleagues have shown that the ventral tegmental dopamine neurons implicated in reward function respond not only to food reward itself but, as a result of experience, to predictors of food reward (Romo and Schultz, 1990; Ljungberg et al., 1992). As the animal learns that an environmental stimulus predicts food reward, the 200 millisecond burst of dopaminergic nerve firing that was initially triggered by food presentation itself becomes linked, instead, to the food-predictive stimulus that precedes it. If the food-predictive stimulus predicts food on only a fraction of the trials, then the dopaminergic neurons burst, to a lesser extent, in response to both the predictor and to the food; the stronger the probability of prediction, the stronger the response to the predictor and the weaker the response to the food presentation.
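The shift of the dopamine burst from the food to its predictor can be sketched with a simple error-driven learning rule. The following is only an illustration under stated assumptions, not an analysis from the studies cited above: it assumes a Rescorla-Wagner style update, a reward of size 1, and a hypothetical function name (`simulate_cue_learning`) and learning rate chosen for the example. The learned value of the cue stands in for the burst to the predictor, and the unpredicted remainder of the reward stands in for the burst to the food when it arrives.

```python
import random


def simulate_cue_learning(p_reward, n_trials=2000, alpha=0.01, seed=0):
    """Illustrative sketch: learn the value of a cue that predicts food
    with probability p_reward, using an error-driven update.

    Returns (cue_response, reward_response), where
      cue_response    = learned value of the cue (burst to the predictor)
      reward_response = 1 - value (burst to food on trials where food arrives)
    All parameter values here are assumptions made for the example.
    """
    rng = random.Random(seed)
    value = 0.0  # the cue predicts nothing before training
    for _ in range(n_trials):
        reward = 1.0 if rng.random() < p_reward else 0.0
        prediction_error = reward - value
        value += alpha * prediction_error  # move the prediction toward the outcome
    return value, 1.0 - value


if __name__ == "__main__":
    for p in (0.25, 0.5, 1.0):
        cue, food = simulate_cue_learning(p)
        print(f"p(food | cue) = {p:.2f}  cue response = {cue:.2f}  "
              f"food response = {food:.2f}")
```

In this sketch a fully reliable cue ends up carrying almost all of the response and the food itself almost none, while a cue that predicts food on only some trials leaves a partial response to both, which is the qualitative pattern described above.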
The fact that the dopaminergic neurons cease to respond to food itself and respond instead to food predictors raises the issue of whether the taste of food is not itself merely a reward predictor (Wise, 2002). Some tastes appear to be unconditioned reinforcers from birth (Steiner, 1974), but others gain motivational significance through the association of their taste with their post-ingestional consequences (Sclafani and Ackroff, 1994).
The concept of "reinforcement" is a concept of "stamping in" of associations (Thorndike, 1898). Whether the association is between a conditioned and an unconditioned stimulus (Pavlov, 1928), a stimulus and a response (Thorndike, 1911), or a response and an outcome (Skinner, 1937), reinforcement refers to the strengthening of an association through experience. Another way to look at it is that reinforcement is a process that enhances consolidation of the memory trace for the association (Landauer, 1969). Studies of post-trial dopaminergic activation suggest that dopamine serves to enhance or reinforce the memory trace for recently experienced events and associations, and that it does so in a variety of dopamine terminal fields (White and Milner, 1992). Several lines of evidence (Reynolds et al., 2001; Wise, 2004; Hyman et al., 2006; Wickens et al., 2007) now implicate a modulatory role for dopamine in cellular models of learning and memory that is consistent with the view that dopamine plays an important role in reinforcement.
While variations of the anhedonia hypothesis or the dopamine hypotheses of reward or reinforcement continue to appear, the hypothesis as originally stated still captures the scope of the involvement of dopamine in motivational theory. Normal levels of brain dopamine are important for normal motivation, while phasic elevations of dopamine play an important role in the reinforcement that establishes response habits and stamps in the association between rewards and reward-predicting stimuli. Subjective pleasure is the normal correlate of the rewarding events that cause phasic dopamine elevations, but stressful events can also cause dopamine elevations; thus pleasure is not a necessary correlate of dopamine elevations or even reinforcement itself (Kelleher and Morse, 1968).
COMMENTS: Addicted brains not only show decreased sensitivity to dopamine, but also release less dopamine in response to stimuli.
Nora D. Volkow, MD; Joanna S. Fowler, PhD; Gene-Jack Wang, MD; James M. Swanson, PhD; Frank Telang, MD
Arch Neurol. 2007;64(11):1575-1579.
Imaging studies have provided new insights on the role of dopamine (DA) in drug abuse and addiction in the human brain. These studies have shown that the reinforcing effects of drugs of abuse in human beings are contingent not just on DA increases per se in the striatum (including the nucleus accumbens) but on the rate of DA increases. The faster the increases, the more intense the reinforcing effects. They have also shown that elevated levels of DA in the dorsal striatum are involved in the motivation to procure the drug when the addicted subject is exposed to stimuli associated with the drug (conditioned stimuli). In contrast, long-term drug use seems to be associated with decreased DA function, as evidenced by reductions in D2 DA receptors and DA release in the striatum in addicted subjects. Moreover, the reductions in D2 DA receptors in the striatum are associated with reduced activity of the orbitofrontal cortex (region involved with salience attribution and motivation and with compulsive behaviors) and of the cingulate gyrus (region involved with inhibitory control and impulsivity), which implicates deregulation of frontal regions by DA in the loss of control and compulsive drug intake that characterizes addiction. Because DA cells fire in response to salient stimuli and facilitate conditioned learning, their activation by drugs will be experienced as highly salient, driving the motivation to take the drug and further strengthening conditioned learning and producing automatic behaviors (compulsions and habits).
Dopamine (DA) is the neurotransmitter that has been classically associated with the reinforcing effects of drugs of abuse and may have a key role in triggering the neurobiological changes associated with addiction. This notion reflects the fact that all of the drugs of abuse increase the extracellular concentration of DA in the nucleus accumbens. Increases in DA levels have an important role in coding reward and prediction of reward, in the motivational drive to procure the reward, and in facilitating learning.1 It is also believed that DA codes not just for reward but for saliency, which, in addition to reward, includes aversive, novel, and unexpected stimuli. The diversity of DA effects is likely translated by the specific brain regions (limbic, cortical, and striatal) it modulates.
Herein, we summarize findings from imaging studies that used positron emission tomography (PET) to investigate the role of DA in the reinforcing effects of drugs, the long-term brain changes in drug-addicted subjects, and the vulnerability to addiction. Though most of the PET studies on addiction have focused on DA, it is clear that drug-induced adaptations in other neurotransmitters (ie, glutamate, γ-aminobutyric acid, opioids, and cannabinoids) are also involved, but the lack of radioligands has limited their investigation.
The effects of short-term drug exposure on extracellular DA concentrations in the human brain can be studied using PET and D2 DA receptor radioactive ligands that are sensitive to competition with endogenous DA, such as raclopride labeled with carbon 11 (11C). The relationship between the effects of drugs on DA and their reinforcing properties in the human brain (assessed by self-reports of “high” and “euphoria”) was studied for the stimulant drugs methylphenidate and amphetamine. Methylphenidate, like cocaine, increases DA by blocking DA transporters, whereas amphetamine, like methamphetamine, increases DA by releasing it from the terminal via DA transporters. Intravenous methylphenidate (0.5 mg/kg) and amphetamine (0.3 mg/kg) increased the extracellular concentration of DA in the striatum, and these increases were associated with increases in self-reports of high and euphoria.2 In contrast, when given orally, methylphenidate (0.75-1 mg/kg) also increased DA but was not perceived as reinforcing.3 Because intravenous administration leads to fast DA changes, whereas oral administration increases DA slowly, the failure to observe the high with oral methylphenidate likely reflects its slow pharmacokinetics.
Indeed, the speed at which drugs of abuse enter the brain is recognized as affecting their reinforcing effects.4 This association has also been shown in PET studies that evaluated the pharmacokinetics of cocaine (using [11C]cocaine) and methylphenidate (using [11C]methylphenidate) in the human brain, documenting that it was the fast uptake of the drug into the brain but not the brain concentration per se that was associated with getting high.5 The dependency of the reinforcing effects of drugs on brain pharmacokinetic properties suggests a possible association with phasic DA cell firing (fast-burst firing at frequencies >30 Hz), which also leads to fast changes in DA concentration and whose function is to highlight the saliency of stimuli.6 This is in contrast to tonic DA cell firing (slow firing at frequencies around 5 Hz), which maintains baseline steady-state DA levels and whose function is to set the overall responsiveness of the DA system. This led us to speculate that drugs of abuse induce changes in DA concentration that mimic but exceed those produced by phasic DA cell firing.
Synaptic increases in DA concentration occur during drug intoxication in both addicted and nonaddicted subjects. However, a compulsive drive to continue drug taking when exposed to the drug is not triggered in all subjects. Inasmuch as it is the loss of control and the compulsive drug taking that characterizes addiction, the short-term drug-induced DA level increase alone cannot explain this condition. Because drug addiction requires long-term drug administration, we suggest that in vulnerable individuals (because of genetic, developmental, or environmental factors), addiction is related to the repeated perturbation of DA-regulated brain circuits involved with reward/saliency, motivation/drive, inhibitory control/executive function, and memory/conditioning. Herein, we discuss findings from imaging studies on the nature of these changes.
Many radioactive tracers have been used to assess changes in targets involved in DA neurotransmission (Table 1). Using [18F]N-methylspiroperidol or [11C]raclopride, we and others have shown that subjects with a wide variety of drug addictions (cocaine, heroin, alcohol, and methamphetamine) have significant reductions in D2 DA receptor availability in the striatum (including the ventral striatum) that persist months after protracted detoxification (reviewed in Volkow et al2). We have also revealed evidence of decreased DA cell activity in cocaine abusers. Specifically, we showed that the striatal increases in DA level induced by intravenous methylphenidate (assessed with [11C]raclopride) in cocaine abusers were substantially blunted when compared with DA level increases in control subjects (50% lower).7 Because DA concentration increases induced by methylphenidate are dependent on DA release, a function of DA cell firing, we speculated that this difference likely reflects decreased DA cell activity in the cocaine abusers. Similar findings have been reported in alcohol abusers.8
These brain-imaging studies suggest 2 abnormalities in addicted subjects that would result in decreased output of DA circuits related to reward; that is, decreases in D2 DA receptors and decreases in DA release in the striatum (including the nucleus accumbens). Each would contribute to the decreased sensitivity in addicted subjects to natural reinforcers. Because drugs are much more potent than natural reinforcers at stimulating DA-regulated reward circuits, we postulated that drugs are still able to activate these down-regulated reward circuits. The decreased sensitivity of reward circuits would lead to decreased interest in day-to-day environmental stimuli, possibly predisposing subjects to seek drug stimulation as a means to temporarily activate these reward circuits. This may underlie the transition from taking drugs to feel high to taking them to feel normal.
Preclinical studies have demonstrated a prominent role of DA in motivation that seems to be mediated in part via a DA-regulated circuit involving the orbitofrontal cortex (OFC) and the anterior cingulate gyrus (CG).9 In imaging studies in human subjects using the radioactive tracer fludeoxyglucose F 18, we and others have shown decreased activity in the OFC and CG in different classes of addicted subjects (reviewed in Volkow et al2). Moreover, in both cocaine- and methamphetamine-addicted subjects, we have shown that the reduced activity in the OFC and CG is associated with decreased availability of D2 DA receptors in the striatum (reviewed in Volkow et al7) (Figure). Because the OFC and CG participate in the assignment of value to reinforcers as a function of context, their disruption in the abuser could interfere with their ability to change the saliency value of the drug as a function of alternative reinforcers, becoming the main drive motivating behavior. In contrast to the pattern of decreased OFC and CG activity when drug-free, addicted subjects show increased activation in these regions when presented with the drug or drug-related stimuli, consistent with the enhanced saliency values of drugs or drug reinforcers in these subjects. Moreover, the enhanced activation of the OFC and CG was associated with the intensity of desire for the drug. This has led us to speculate that the hypermetabolism in the OFC and CG triggered by drugs or drug cues underlies the compulsive drug intake, just as it underlies the compulsive behaviors in patients with obsessive-compulsive disorders.10 This dual effect of disruption of the OFC-CG brain circuit is consistent with the behavior of the drug addict, whose compulsion to take the drug overrides competing cognitive-based tendencies not to take the drug; just as in patients with obsessive-compulsive disorders, the compulsion persists despite cognitive attempts to stop the behaviors.
Figure. A, Images of D2 dopamine receptors (raclopride labeled with carbon 11) and of brain glucose metabolism (fludeoxyglucose), which is used as an indicator of brain function, in a control subject and a cocaine abuser. Cocaine abusers have lower D2 dopamine receptor availability in the striatum and lower metabolism in the orbitofrontal cortex (OFC) than do control subjects. B, Correlations between D2 dopamine (DA) receptors and OFC metabolism in detoxified cocaine abusers and detoxified methamphetamine abusers. Note that the subjects with the lowest measures of D2 DA receptor availability have the lowest metabolism in the OFC.
The CG and the OFC are also involved with inhibitory control, which led us to postulate that disrupted DA modulation of the OFC and CG also contributes to the loss of control over drug intake by drug-addicted subjects.10 Inhibitory control is also dependent on the dorsolateral prefrontal cortex, which is also affected in addiction (reviewed in Volkow et al2). Abnormalities in the dorsolateral prefrontal cortex are expected to affect processes involved in executive control including impairments in self-monitoring and behavior control, which have an important role in the cognitive changes that perpetuate drug self-administration.10
Circuits underlying memory and learning, including conditioned-incentive learning, habit learning, and declarative memory (reviewed in Vanderschuren and Everitt11), have been proposed to be involved in drug addiction. The effects of drugs on memory systems suggest ways that neutral stimuli can acquire reinforcing properties and motivational salience, that is, through conditioned-incentive learning. In research on relapse, it has been important to understand why drug-addicted subjects experience an intense desire for the drug when exposed to places where they have taken the drug, to persons with whom previous drug use occurred, and to paraphernalia used to administer the drug. This is clinically relevant because exposure to conditioned cues (stimuli associated with the drug) is a key contributor to relapse. Because DA is involved with prediction of reward (reviewed in Schultz9), we hypothesized that DA might underlie conditioned responses that trigger craving. Studies in laboratory animals support this hypothesis: when neutral stimuli are paired with a drug, they will, with repeated associations, acquire the ability to increase DA in the nucleus accumbens and dorsal striatum, becoming conditioned cues. Furthermore, these neurochemical responses are associated with drug-seeking behavior (reviewed in Vanderschuren and Everitt11). In human beings, PET studies with [11C]raclopride recently confirmed this hypothesis by showing that, in cocaine abusers, drug cues (cocaine-cue video of scenes of subjects taking cocaine) substantially increased DA in the dorsal striatum and that these increases were associated with cocaine craving.12,13 Because the dorsal striatum is implicated in habit learning, this association likely reflects the strengthening of habits as chronicity of addiction progresses. This suggests that a basic neurobiologic disruption in addiction might be a DA-triggered conditioned response that results in habits leading to compulsive drug consumption.
It is likely that these conditioned responses reflect adaptations in corticostriatal glutamatergic pathways that regulate DA release (reviewed in Vanderschuren and Everitt11).
A challenging question in the neurobiology of drug abuse is why some individuals are more vulnerable than others to becoming addicted to drugs. Imaging studies suggest that preexisting differences in DA circuits may be one mechanism underlying the variability in responsiveness to drugs of abuse. Specifically, baseline measures of striatal D2 DA receptors in nonaddicted subjects have been shown to predict subjective responses to the reinforcing effects of intravenous methylphenidate treatment; individuals describing the experience as pleasant had substantially lower levels of D2 DA receptors compared with those describing methylphenidate as unpleasant (reviewed in Volkow et al7). This suggests that the relationship between DA levels and reinforcing responses follows an inverted U-shaped curve: too little is not optimal for reinforcement but too much is aversive. Thus, high D2 DA receptor levels could protect against drug self-administration. Support for this was provided by preclinical studies that showed that up-regulation of D2 DA receptors in the nucleus accumbens dramatically reduced alcohol intake in animals previously trained to self-administer alcohol14 and by clinical studies showing that subjects who, despite having a dense family history of alcoholism, were not alcoholics had substantially higher D2 DA receptors in the striatum compared with individuals without such family histories.15 In these subjects, the higher the D2 DA receptors, the higher the metabolism in the OFC and CG. Thus, we postulate that high levels of D2 DA receptors may protect against alcoholism by modulating frontal circuits involved in salience attribution and inhibitory control.
Imaging studies have corroborated the role of DA in the reinforcing effects of drugs of abuse in human beings and have extended traditional views of DA involvement in drug addiction. These findings suggest multicomponent strategies for the treatment of drug addiction that include strategies to (1) decrease the reward value of the drug of choice and increase the reward value of nondrug reinforcers, (2) weaken conditioned drug behaviors, (3) weaken the motivational drive to take the drug, and (4) strengthen frontal inhibitory and executive control (Table 2).
Correspondence: Nora D. Volkow, MD, National Institute on Drug Abuse, 6001 Executive Blvd, Room 5274-MSC 9581, Bethesda, MD 20892.
Accepted for Publication: January 17, 2007.
Author Contributions: Study concept and design: Volkow. Acquisition of data: Volkow, Wang, Swanson, and Telang. Analysis and interpretation of data: Volkow, Fowler, Wang, and Telang. Drafting of the manuscript: Volkow and Swanson. Critical revision of the manuscript for important intellectual content: Volkow, Fowler, Wang, Swanson, and Telang. Statistical analysis: Volkow. Obtained funding: Volkow, Fowler, and Wang. Administrative, technical, and material support: Volkow, Fowler, Wang, and Telang. Study supervision: Volkow, Wang, and Telang.
Financial Disclosure: None reported.
Funding/Support: This study was supported in part by the intramural program of the National Institute on Alcohol Abuse and Alcoholism; grants DA 06891, DA 09490, DA 06278, and AA 09481 from the National Institutes of Health; and the US Department of Energy, Office of Biological and Environmental Research.
Neuron. Author manuscript; available in PMC Dec 9, 2011.
Midbrain dopamine neurons are well known for their strong responses to rewards and their critical role in positive motivation. It has become increasingly clear, however, that dopamine neurons also transmit signals related to salient but non-rewarding experiences such as aversive and alerting events. Here we review recent advances in understanding the reward and non-reward functions of dopamine. Based on these data, we propose that dopamine neurons come in multiple types that are connected with distinct brain networks and have distinct roles in motivational control. Some dopamine neurons encode motivational value, supporting brain networks for seeking, evaluation, and value learning. Others encode motivational salience, supporting brain networks for orienting, cognition, and general motivation. Both types of dopamine neurons are augmented by an alerting signal involved in rapid detection of potentially important sensory cues. We hypothesize that these dopaminergic pathways for value, salience, and alerting cooperate to support adaptive behavior.
The neurotransmitter dopamine (DA) has a crucial role in motivational control – in learning what things in the world are good and bad, and in choosing actions to gain the good things and avoid the bad things. The major sources of DA in the cerebral cortex and in most subcortical areas are the DA-releasing neurons of the ventral midbrain, located in the substantia nigra pars compacta (SNc) and ventral tegmental area (VTA) (Bjorklund and Dunnett, 2007). These neurons transmit DA in two modes, ‘tonic’ and ‘phasic’ (Grace, 1991; Grace et al., 2007). In their tonic mode DA neurons maintain a steady, baseline level of DA in downstream neural structures that is vital for enabling the normal functions of neural circuits (Schultz, 2007). In their phasic mode DA neurons sharply increase or decrease their firing rates for 100–500 milliseconds, causing large changes in DA concentrations in downstream structures lasting for several seconds (Schultz, 1998; Schultz, 2007).
These phasic DA responses are triggered by many types of rewards and reward-related sensory cues (Schultz, 1998) and are ideally positioned to fulfill DA’s roles in motivational control, including its roles as a teaching signal that underlies reinforcement learning (Schultz et al., 1997; Wise, 2005) and as an incentive signal that promotes immediate reward seeking (Berridge and Robinson, 1998). As a result, these phasic DA reward signals have taken on a prominent role in theories about the functions of cortical and subcortical circuits and have become the subject of intense neuroscience research. In the first part of this review we will introduce the conventional theory of phasic DA reward signals and will review recent advances in understanding their nature and their control over neural processing and behavior.
In contrast to the accepted role of DA in reward processing, there has been considerable debate over the role of phasic DA activity in processing non-rewarding events. Some theories suggest that DA neuron phasic responses primarily encode reward-related events (Schultz, 1998; Ungless, 2004; Schultz, 2007), while others suggest that DA neurons transmit additional non-reward signals related to surprising, novel, salient, and even aversive experiences (Redgrave et al., 1999; Horvitz, 2000; Di Chiara, 2002; Joseph et al., 2003; Pezze and Feldon, 2004; Lisman and Grace, 2005; Redgrave and Gurney, 2006). In the second part of this review we will discuss a series of studies that have put these theories to the test and have revealed much about the nature of non-reward signals in DA neurons. In particular, these studies provide evidence that DA neurons are more diverse than previously thought. Rather than encoding a single homogeneous motivational signal, DA neurons come in multiple types that encode reward and non-reward events in different manners. This poses a problem for general theories which seek to identify dopamine with a single neural signal or motivational mechanism.
To remedy this dilemma, in the final part of this review we propose a new hypothesis to explain the presence of multiple types of DA neurons, the nature of their neural signals, and their integration into distinct brain networks for motivational control. Our basic proposal is as follows. One type of DA neuron encodes motivational value and is excited by rewarding events and inhibited by aversive events. These neurons support brain systems for seeking goals, evaluating outcomes, and value learning. A second type of DA neuron encodes motivational salience and is excited by both rewarding and aversive events. These neurons support brain systems for orienting, cognitive processing, and motivational drive. In addition to their value and salience-coding activity, both types of DA neurons also transmit an alerting signal, triggered by unexpected sensory cues of high potential importance. Together, we hypothesize that these value, salience, and alerting signals cooperate to coordinate downstream brain structures and control motivated behavior.
Dopamine has long been known to be important for reinforcement and motivation of actions. Drugs that interfere with DA transmission interfere with reinforcement learning, while manipulations that enhance DA transmission, such as brain stimulation and addictive drugs, often act as reinforcers (Wise, 2004). DA transmission is crucial for creating a state of motivation to seek rewards (Berridge and Robinson, 1998; Salamone et al., 2007) and for establishing memories of cue-reward associations (Dalley et al., 2005). DA release is not necessary for all forms of reward learning and may not always be ‘liked’ in the sense of causing pleasure, but it is critical for causing goals to become ‘wanted’ in the sense of motivating actions to achieve them (Berridge and Robinson, 1998; Palmiter, 2008).
One hypothesis about how dopamine supports reinforcement learning is that it adjusts the strength of synaptic connections between neurons. The most straightforward version of this hypothesis is that dopamine controls synaptic plasticity according to a modified Hebbian rule that can be roughly stated as “neurons that fire together wire together, as long as they get a burst of dopamine”. In other words, if cell A activates cell B, and cell B causes a behavioral action which results in a reward, then dopamine would be released and the A→B connection would be reinforced (Montague et al., 1996; Schultz, 1998). This mechanism would allow an organism to learn the optimal choice of actions to gain rewards, given sufficient trial-and-error experience. Consistent with this hypothesis, dopamine has a potent influence on synaptic plasticity in numerous brain regions (Surmeier et al., 2010; Goto et al., 2010; Molina-Luna et al., 2009; Marowsky et al., 2005; Lisman and Grace, 2005). In some cases dopamine enables synaptic plasticity along the lines of the Hebbian rule described above, in a manner that is correlated with reward-seeking behavior (Reynolds et al., 2001). In addition to its effects on long-term synaptic plasticity, dopamine can also exert immediate control over neural circuits by modulating neural spiking activity and synaptic connections between neurons (Surmeier et al., 2007; Robbins and Arnsten, 2009), in some cases doing so in a manner that would promote immediate reward-seeking actions (Frank, 2005).
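The modified Hebbian rule quoted above can be illustrated with a minimal sketch. The function name, the multiplicative form, and the learning rate are assumptions made for illustration only; they are not taken from the cited studies, which describe far richer biophysics.

```python
def gated_hebbian_update(weight, pre, post, dopamine, learning_rate=0.1):
    """Toy three-factor rule (an assumption for illustration):
    the A->B weight changes only when presynaptic activity,
    postsynaptic activity, and a dopamine burst coincide."""
    return weight + learning_rate * pre * post * dopamine


w_ab = 0.5

# A and B fire together and a dopamine burst follows: the connection strengthens.
w_rewarded = gated_hebbian_update(w_ab, pre=1.0, post=1.0, dopamine=1.0)

# The same co-activity without dopamine leaves the connection unchanged.
w_unrewarded = gated_hebbian_update(w_ab, pre=1.0, post=1.0, dopamine=0.0)

# A dopamine burst while cell B is silent also leaves it unchanged.
w_no_post = gated_hebbian_update(w_ab, pre=1.0, post=0.0, dopamine=1.0)

print(w_rewarded, w_unrewarded, w_no_post)
```

In this sketch, dopamine acts purely as a gate: co-activity alone is not enough to change the synapse, which is the sense of "as long as they get a burst of dopamine".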
In order to motivate actions that lead to rewards, dopamine should be released during rewarding experiences. Indeed, most DA neurons are strongly activated by unexpected primary rewards such as food and water, often producing phasic ‘bursts’ of activity (Schultz, 1998) (phasic excitations including multiple spikes (Grace and Bunney, 1983)). However, the pioneering studies of Wolfram Schultz showed that these DA neuron responses are not triggered by reward consumption per se. Instead they resemble a ‘reward prediction error’, reporting the difference between the reward that is received and the reward that was predicted to occur (Schultz et al., 1997) (Figure 1A). Thus, if a reward is larger than predicted, DA neurons are strongly excited (positive prediction error, Figure 1E, red); if a reward is smaller than predicted or fails to occur at its appointed time, DA neurons are phasically inhibited (negative prediction error, Figure 1E, blue); and if a reward is cued in advance so that its size is fully predictable, DA neurons have little or no response (zero prediction error, Figure 1C, black). The same principle holds for DA responses to sensory cues that provide new information about future rewards. DA neurons are excited when a cue indicates an increase in future reward value (Figure 1C, red), inhibited when a cue indicates a decrease in future reward value (Figure 1C, blue), and generally have little response to cues that convey no new reward information (Figure 1E, black). These DA responses resemble a specific type of reward prediction error called the temporal difference error or “TD error”, which has been proposed to act as a reinforcement signal for learning the value of actions and environmental states (Houk et al., 1995; Montague et al., 1996; Schultz et al., 1997). 
Computational models using a TD-like reinforcement signal can explain many aspects of reinforcement learning in humans, animals, and DA neurons themselves (Sutton and Barto, 1981; Waelti et al., 2001; Montague and Berns, 2002; Dayan and Niv, 2008).
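To make the temporal difference idea concrete, the following is a minimal sketch of a two-state trial (cue, then delay, then outcome). The state names, learning rate, and discount factor are hypothetical choices, and the model is a simplification of the TD models cited above rather than a reproduction of any of them.

```python
def run_trial(values, reward, alpha=0.2, gamma=1.0, learn=True):
    """Return the TD errors at cue onset and at outcome time for one trial."""
    # Cue onset is unpredictable, so the pre-cue state is assumed to have value 0.
    delta_cue = gamma * values["cue"] - 0.0
    # The transition from cue to delay delivers no reward.
    delta_delay = gamma * values["delay"] - values["cue"]
    # The trial ends at the outcome, so there is no future value to add.
    delta_outcome = reward - values["delay"]
    if learn:
        values["cue"] += alpha * delta_delay
        values["delay"] += alpha * delta_outcome
    return delta_cue, delta_outcome


values = {"cue": 0.0, "delay": 0.0}

# First pairing: the reward is unexpected, so the error occurs at reward time.
first = run_trial(values, reward=1.0)

# Repeated cue-reward pairings.
for _ in range(200):
    run_trial(values, reward=1.0)

# After learning: the error has moved to the cue; the predicted reward evokes ~0.
trained = run_trial(values, reward=1.0, learn=False)

# Omitting the predicted reward yields a negative error at the expected time.
omitted = run_trial(values, reward=0.0, learn=False)

print(first, trained, omitted)
```

The three cases correspond to the positive, zero, and negative prediction errors described in the text: an unexpected reward, a fully predicted reward, and a predicted reward that fails to occur.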
An impressive array of experiments have shown that DA signals represent reward predictions in a manner that closely matches behavioral preferences, including the preference for large rewards over small ones (Tobler et al., 2005), probable rewards over improbable ones (Fiorillo et al., 2003; Satoh et al., 2003; Morris et al., 2004) and immediate rewards over delayed ones (Roesch et al., 2007; Fiorillo et al., 2008; Kobayashi and Schultz, 2008). There is even evidence that DA neurons in humans encode the reward value of money (Zaghloul et al., 2009). Furthermore, DA signals emerge during learning with a similar timecourse to behavioral measures of reward prediction (Hollerman and Schultz, 1998; Satoh et al., 2003; Takikawa et al., 2004; Day et al., 2007) and are correlated with subjective measures of reward preference (Morris et al., 2006). These findings have established DA neurons as one of the best understood and most replicated examples of reward coding in the brain. As a result, recent studies have subjected DA neurons to intense scrutiny to discover how they generate reward predictions and how their signals act on downstream structures to control behavior.
Recent advances in understanding DA reward signals come from considering three broad questions: How do DA neurons learn reward predictions? How accurate are their predictions? And just what do they treat as rewarding?
How do DA neurons learn reward predictions? Classic theories suggest that reward predictions are learned through a gradual reinforcement process requiring repeated stimulus-reward pairings (Rescorla and Wagner, 1972; Montague et al., 1996). Each time stimulus A is followed by an unexpected reward, the estimated value of A is increased. Recent data, however, show that DA neurons go beyond simple stimulus-reward learning and make predictions based on sophisticated beliefs about the structure of the world. DA neurons can predict rewards correctly even in unconventional environments where rewards paired with a stimulus cause a decrease in the value of that stimulus (Satoh et al., 2003; Nakahara et al., 2004; Bromberg-Martin et al., 2010c) or cause a change in the value of an entirely different stimulus (Bromberg-Martin et al., 2010b). DA neurons can also adapt their reward signals based on higher-order statistics of the reward distribution, such as scaling prediction error signals based on their expected variance (Tobler et al., 2005) and ‘spontaneously recovering’ their responses to extinguished reward cues (Pan et al., 2008). All of these phenomena form a remarkable parallel to similar effects seen in sensory and motor adaptation (Braun et al., 2010; Fairhall et al., 2001; Shadmehr et al., 2010), suggesting that they may reflect a general neural mechanism for predictive learning.
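The classic gradual account can be sketched as a single-stimulus Rescorla-Wagner style update, in which the value of stimulus A moves a fixed fraction of the way toward the received reward on each pairing. The learning rate of 0.5 is an arbitrary assumption chosen to keep the arithmetic easy to follow.

```python
def rescorla_wagner(value, reward, learning_rate=0.5):
    """Move the value estimate toward the reward by a fraction of the error."""
    prediction_error = reward - value
    return value + learning_rate * prediction_error


value_a = 0.0
history = []
for _ in range(3):
    # Stimulus A is followed by a reward of size 1 on each pairing.
    value_a = rescorla_wagner(value_a, reward=1.0)
    history.append(value_a)

print(history)  # [0.5, 0.75, 0.875]
```

Each pairing raises the value of A by a smaller step as the reward becomes less unexpected. A rule of this form has no way to lower the value of A after a rewarded trial, which is why the unconventional environments described above are informative.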
How accurate are DA reward predictions? Recent studies have shown that DA neurons faithfully adjust their reward signals to account for three sources of prediction uncertainty. First, humans and animals suffer from internal timing noise that prevents them from making reliable predictions about long cue-reward time intervals (Gallistel and Gibbon, 2000). Thus, if cue-reward delays are short (1–2 seconds) timing predictions are accurate and reward delivery triggers little DA response, but for longer cue-reward delays timing predictions become less reliable and rewards evoke clear DA bursts (Kobayashi and Schultz, 2008; Fiorillo et al., 2008). Second, many cues in everyday life are imprecise, specifying a broad distribution of reward delivery times. DA neurons again reflect this form of timing uncertainty: they are progressively inhibited during variable reward delays, as though signaling increasingly negative reward prediction errors at each moment the reward fails to appear (Fiorillo et al., 2008; Bromberg-Martin et al., 2010a; Nomoto et al., 2010). Finally, many cues are perceptually complex, requiring detailed inspection to reach a firm conclusion about their reward value. In such situations DA reward signals occur at long latencies and in a gradual fashion, appearing to reflect the gradual flow of perceptual information as the stimulus value is decoded (Nomoto et al., 2010).
Just what events do DA neurons treat as rewarding? Conventional theories of reward learning suggest that DA neurons assign value based on the expected amount of future primary reward (Montague et al., 1996). Yet even when the rate of primary reward is held constant, humans and animals often express an additional preference for predictability – seeking environments where each reward’s size, probability, and timing can be known in advance (Daly, 1992; Chew and Ho, 1994; Ahlbrecht and Weber, 1996). A recent study in monkeys found that DA neurons signal this preference (Bromberg-Martin and Hikosaka, 2009). Monkeys expressed a strong preference to view informative visual cues that would allow them to predict the size of a future reward, rather than uninformative cues that provided no new information. In parallel, DA neurons were excited by the opportunity to view the informative cues in a manner that was correlated with the animal’s behavioral preference (Figure 1B,D). This suggests that DA neurons not only motivate actions to gain rewards but also motivate actions to make accurate predictions about those rewards, in order to ensure that rewards can be properly anticipated and prepared for in advance.
Taken together, these findings show that DA reward prediction error signals are sensitive to sophisticated factors that inform human and animal reward predictions, including adaptation to high-order reward statistics, reward uncertainty, and preferences for predictive information.
DA reward responses occur in synchronous phasic bursts (Joshua et al., 2009b), a response pattern that shapes DA release in target structures (Gonon, 1988; Zhang et al., 2009; Tsai et al., 2009). It has long been theorized that these phasic bursts influence learning and motivation in a distinct manner from tonic DA activity (Grace, 1991; Grace et al., 2007; Schultz, 2007; Lapish et al., 2007). Recently developed technology has made it possible to confirm this hypothesis by controlling DA neuron activity with fine spatial and temporal precision. Optogenetic stimulation of VTA DA neurons induces a strong conditioned place preference which only occurs when stimulation is applied in a bursting pattern (Tsai et al., 2009). Conversely, genetic knockout of NMDA receptors from DA neurons, which impairs bursting while leaving tonic activity largely intact, causes a selective impairment in specific forms of reward learning (Zweifel et al., 2009; Parker et al., 2010) (although note that this knockout also impairs DA neuron synaptic plasticity (Zweifel et al., 2008)). DA bursts may enhance reward learning by reconfiguring local neural circuits. Notably, reward-predictive DA bursts are sent to specific regions of the nucleus accumbens, and these regions have especially high levels of reward-predictive neural activity (Cheer et al., 2007; Owesson-White et al., 2009).
Compared to phasic bursts, less is known about the importance of phasic pauses in spiking activity for negative reward prediction errors. These pauses cause smaller changes in spike rate, are less modulated by reward expectation (Bayer and Glimcher, 2005; Joshua et al., 2009a; Nomoto et al., 2010), and may have smaller effects on learning (Rutledge et al., 2009). However, certain types of negative prediction error learning require the VTA (Takahashi et al., 2009), suggesting that phasic pauses may still be decoded by downstream structures.
Since bursts and pauses cause very different patterns of DA release, they are likely to influence downstream structures through distinct mechanisms. There is recent evidence for this hypothesis in one major target of DA neurons, the dorsal striatum. Dorsal striatum projection neurons come in two types which express different DA receptors. One type expresses D1 receptors and projects to the basal ganglia ‘direct pathway’ to facilitate body movements; the second type expresses D2 receptors and projects to the ‘indirect pathway’ to suppress body movements (Figure 2) (Albin et al., 1989; Gerfen et al., 1990; Kravitz et al., 2010; Hikida et al., 2010). Based on the properties of these pathways and receptors, it has been theorized that DA bursts produce conditions of high DA, activate D1 receptors, and cause the direct pathway to select high-value movements (Figure 2A), whereas DA pauses produce conditions of low DA, inhibit D2 receptors, and cause the indirect pathway to suppress low-value movements (Figure 2B) (Frank, 2005; Hikosaka, 2007). Consistent with this hypothesis, high DA receptor activation promotes potentiation of cortico-striatal synapses onto the direct pathway (Shen et al., 2008) and learning from positive outcomes (Frank et al., 2004; Voon et al., 2010), while striatal D1 receptor blockade selectively impairs movements to rewarded targets (Nakamura and Hikosaka, 2006). In an analogous manner, low DA receptor activation promotes potentiation of cortico-striatal synapses onto the indirect pathway (Shen et al., 2008) and learning from negative outcomes (Frank et al., 2004; Voon et al., 2010), while striatal D2 receptor blockade selectively suppresses movements to non-rewarded targets (Nakamura and Hikosaka, 2006). 
This division of D1 and D2 receptor functions in motivational control explains many of the effects of DA-related genes on human behavior (Ullsperger, 2010; Frank and Fossella, 2010) and may extend beyond the dorsal striatum, as there is evidence for a similar division of labor in the ventral striatum (Grace et al., 2007; Lobo et al., 2010).
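The theorized division of labor between the two pathways can be caricatured in a few lines. This is a toy sketch only: the names `go` and `nogo`, the linear updates, and the subtraction used to combine the pathways are illustrative assumptions, not the published models of Frank (2005) or Hikosaka (2007).

```python
def update_pathways(go, nogo, prediction_error, rate=0.5):
    """Toy opponent learning for one movement.

    go   : strength of cortico-striatal input to D1 (direct pathway) neurons
    nogo : strength of cortico-striatal input to D2 (indirect pathway) neurons
    """
    if prediction_error > 0:
        # DA burst (high DA): potentiate the direct pathway.
        go += rate * prediction_error
    elif prediction_error < 0:
        # DA pause (low DA): potentiate the indirect pathway.
        nogo += rate * -prediction_error
    return go, nogo


def action_drive(go, nogo):
    """Net tendency to make the movement: facilitation minus suppression."""
    return go - nogo


# A movement followed by a better-than-expected outcome is facilitated.
rewarded = update_pathways(0.0, 0.0, prediction_error=1.0)

# A movement followed by a worse-than-expected outcome is suppressed.
unrewarded = update_pathways(0.0, 0.0, prediction_error=-1.0)

print(action_drive(*rewarded), action_drive(*unrewarded))  # 0.5 -0.5
```

Keeping two separate non-negative weights, rather than one signed value, mirrors the idea that bursts and pauses act through distinct receptors and pathways.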
While the above scheme paints a simple picture of phasic DA control of behavior through its effects on the striatum, the full picture is much more complex. DA influences reward-related behavior by acting on many brain regions including the prefrontal cortex (Hitchcott et al., 2007), rhinal cortex (Liu et al., 2004), hippocampus (Packard and White, 1991; Grecksch and Matties, 1981) and amygdala (Phillips et al., 2010). The effects of DA are likely to differ widely between these regions due to variations in the density of DA innervation, DA transporters, metabolic enzymes, autoreceptors, receptors, and receptor coupling to intracellular signaling pathways (Neve et al., 2004; Bentivoglio and Morelli, 2005; Frank and Fossella, 2010). Furthermore, at least in the VTA, DA neurons can have different cellular properties depending on their projection targets (Lammel et al., 2008; Margolis et al., 2008), and some have the remarkable ability to transmit glutamate as well as dopamine (Descarries et al., 2008; Chuhma et al., 2009; Hnasko et al., 2010; Tecuapetla et al., 2010; Stuber et al., 2010; Birgner et al., 2010). Thus, the full extent of DA neuron control over neural processing is only beginning to be revealed.
Thus far we have discussed the role of DA neurons in reward-related behavior, founded upon dopamine responses resembling reward prediction errors. It has become increasingly clear, however, that DA neurons phasically respond to several types of events that are not intrinsically rewarding and are not cues to future rewards, and that these non-reward signals have an important role in motivational processing. These non-reward events can be grouped into two broad categories, aversive and alerting, which we will discuss in detail below. Aversive events include intrinsically undesirable stimuli (such as air puffs, bitter tastes, electrical shocks, and other unpleasant sensations) and sensory cues that have gained aversive properties through association with these events. Alerting events are unexpected sensory cues of high potential importance, which generally trigger immediate reactions to determine their meaning.
A neuron’s response to aversive events provides a crucial test of its functions in motivational control (Schultz, 1998; Berridge and Robinson, 1998; Redgrave et al., 1999; Horvitz, 2000; Joseph et al., 2003). In many respects we treat rewarding and aversive events in opposite manners, reflecting their opposite motivational value. We seek rewards and assign them positive value, while we avoid aversive events and assign them negative value. In other respects we treat rewarding and aversive events in similar manners, reflecting their similar motivational salience [FOOTNOTE1]. Both rewarding and aversive events trigger orienting of attention, cognitive processing, and increases in general motivation.
Which of these functions do DA neurons support? It has long been known that stressful and aversive experiences cause large changes in DA concentrations in downstream brain structures, and that behavioral reactions to these experiences are dramatically altered by DA agonists, antagonists, and lesions (Salamone, 1994; Di Chiara, 2002; Pezze and Feldon, 2004; Young et al., 2005). These studies have produced a striking diversity of results, however (Levita et al., 2002; Di Chiara, 2002; Young et al., 2005). Many studies are consistent with DA neurons encoding motivational salience. They report that aversive events increase DA levels and that behavioral aversion is supported by high levels of DA transmission (Salamone, 1994; Joseph et al., 2003; Ventura et al., 2007; Barr et al., 2009; Fadok et al., 2009) including phasic DA bursts (Zweifel et al., 2009). But other studies are more consistent with DA neurons encoding motivational value. They report that aversive events reduce DA levels and that behavioral aversion is supported by low levels of DA transmission (Mark et al., 1991; Shippenberg et al., 1991; Liu et al., 2008; Roitman et al., 2008). In many cases these mixed results have been found in single studies, indicating that aversive experiences cause different patterns of DA release in different brain structures (Thierry et al., 1976; Besson and Louilot, 1995; Ventura et al., 2001; Jeanblanc et al., 2002; Bassareo et al., 2002; Pascucci et al., 2007), and that DA-related drugs can produce a mixture of neural and behavioral effects similar to those caused by both rewarding and aversive experiences (Ettenberg, 2004; Wheeler et al., 2008).
This diversity of DA release patterns and functions is difficult to reconcile with the idea that DA neurons transmit a uniform motivational signal to all brain structures. These diverse responses could be explained, however, if DA neurons are themselves diverse – composed of multiple neural populations that support different aspects of aversive processing. This view is supported by neural recording studies in anesthetized animals. These studies have shown that noxious stimuli evoke excitation in some DA neurons but inhibition in other DA neurons (Chiodo et al., 1980; Maeda and Mogenson, 1982; Schultz and Romo, 1987; Mantz et al., 1989; Gao et al., 1990; Coizet et al., 2006). Importantly, both excitatory and inhibitory responses occur in neurons confirmed to be dopaminergic using juxtacellular labeling (Brischoux et al., 2009) (Figure 3). A similar diversity of aversive responses occurs during active behavior. Different groups of DA neurons are phasically excited or inhibited by aversive events including noxious stimulation of the skin (Kiyatkin, 1988a; Kiyatkin, 1988b), sensory cues predicting aversive shocks (Guarraci and Kapp, 1999), aversive airpuffs (Matsumoto and Hikosaka, 2009b), and sensory cues predicting aversive airpuffs (Matsumoto and Hikosaka, 2009b; Joshua et al., 2009a). Furthermore, when two DA neurons are recorded simultaneously, their aversive responses generally have little trial-to-trial correlation with each other (Joshua et al., 2009b), suggesting that aversive responses are not coordinated across the DA population as a whole.
To understand the functions of these diverse aversive responses, we need to know how they are combined with reward responses to generate a meaningful motivational signal. A recent study investigated this topic and revealed that DA neurons are divided into multiple populations with distinct motivational signals (Matsumoto and Hikosaka, 2009b). One population is excited by rewarding events and inhibited by aversive events, as though encoding motivational value (Figure 4A). A second population is excited by both rewarding and aversive events in similar manners, as though encoding motivational salience (Figure 4B). In both of these populations many neurons are sensitive to reward and aversive predictions: they respond when rewarding events are more rewarding than predicted and when aversive events are more aversive than predicted (Matsumoto and Hikosaka, 2009b). This shows that their aversive responses are truly caused by predictions about aversive events, ruling out the possibility that they could be caused by non-specific factors such as raw sensory input or generalized associations with reward (Schultz, 2010). These two populations differ, however, in the detailed nature of their predictive code. Motivational value coding DA neurons encode an accurate prediction error signal, including strong inhibition by omission of rewards and mild excitation by omission of aversive events (Figure 4A, right). In contrast, motivational salience coding DA neurons respond when salient events are present but not when they are absent (Figure 4B, right), consistent with theoretical notions of arousal (Lang and Davis, 2006) [FOOTNOTE2]. Evidence for these two DA neuron populations has been observed even when neural activity has been examined in an averaged manner. Thus, studies targeting different parts of the DA system found phasic DA signals encoding aversive events with inhibition (Roitman et al., 2008), similar to coding of motivational value, or with excitation (Joshua et al., 2008; Anstrom et al., 2009), similar to coding of motivational salience.
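The contrast between the two populations can be illustrated with a toy model. The following Python sketch is not taken from any of the cited studies. The function names, the signed scale (positive for rewarding events, negative for aversive events, zero for an omitted event) and the response units are assumptions made purely for illustration. The toy value signal is also symmetric, whereas the recorded neurons show strong inhibition to reward omission but only mild excitation to omission of aversive events.

```python
def value_coding_response(outcome: float, predicted: float) -> float:
    """Toy signed prediction error (hypothetical scale).

    Positive return value = excitation, negative = inhibition.
    outcome / predicted: > 0 rewarding, < 0 aversive, 0 = no event.
    """
    return outcome - predicted


def salience_coding_response(outcome: float, predicted: float) -> float:
    """Toy unsigned response (hypothetical scale).

    Responds only when a salient event is actually present, and more
    strongly when the event is more intense than predicted, regardless
    of whether it is rewarding or aversive.
    """
    if outcome == 0:
        return 0.0  # omitted events evoke no response in this sketch
    return max(0.0, abs(outcome) - abs(predicted))


# (outcome, predicted) pairs for four illustrative trial types
trials = {
    "unpredicted reward": (1.0, 0.0),
    "unpredicted aversive event": (-1.0, 0.0),
    "omitted reward": (0.0, 1.0),
    "omitted aversive event": (0.0, -1.0),
}

for name, (outcome, predicted) in trials.items():
    v = value_coding_response(outcome, predicted)
    s = salience_coding_response(outcome, predicted)
    print(f"{name}: value-coding {v:+.1f}, salience-coding {s:+.1f}")
```

In this sketch the two model neurons agree on unpredicted rewards but diverge on aversive events and on omissions, which is the pattern the text uses to tell the populations apart.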
These recent findings might appear to contradict an early report that DA neurons respond preferentially to reward cues rather than aversive cues (Mirenowicz and Schultz, 1996). When examined closely, however, even that study is fully consistent with DA value and salience coding. In that study reward cues led to reward outcomes with high probability (>90%) while aversive cues led to aversive outcomes with low probability (<10%). Hence value and salience-coding DA neurons would have little response to the aversive cues, accurately encoding their low level of aversiveness.
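The argument can be made concrete with a little arithmetic. The sketch below assumes illustrative probabilities of 0.9 and 0.1 (the study is described above only as using >90% and <10%) and outcomes of unit magnitude; the function and variable names are hypothetical.

```python
def expected_value(probability: float, outcome_value: float) -> float:
    """Expected motivational value of a cue: outcome value weighted by its probability."""
    return probability * outcome_value


# Assumed numbers for illustration only
reward_cue = expected_value(0.9, 1.0)     # reward follows on about 90% of trials
aversive_cue = expected_value(0.1, -1.0)  # aversive outcome follows on about 10% of trials

# A value-coding neuron would track the signed expectation,
# a salience-coding neuron its magnitude.
print("signed expectation:", reward_cue, aversive_cue)
print("magnitude:", abs(reward_cue), abs(aversive_cue))
```

Under these assumed numbers the aversive cue carries roughly one ninth of the expected magnitude of the reward cue, so both a value code and a salience code would predict only a small response to it.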
Taken together, the above findings indicate that DA neurons are divided into multiple populations suitable for distinct roles in motivational control. Motivational value coding DA neurons fit well with current theories of dopamine neurons and reward processing (Schultz et al., 1997; Berridge and Robinson, 1998; Wise, 2004). These neurons encode a complete prediction error signal and encode rewarding and aversive events in opposite directions. Thus these neurons provide an appropriate instructive signal for seeking, evaluation, and value learning (Figure 5). If a stimulus causes value coding DA neurons to be excited then we should approach it, assign it high value, and learn actions to seek it again in the future. If a stimulus causes value coding DA neurons to be inhibited then we should avoid it, assign it low value, and learn actions to avoid it again in the future.
In contrast, motivational salience coding DA neurons fit well with theories of dopamine neurons and processing of salient events (Redgrave et al., 1999; Horvitz, 2000; Joseph et al., 2003; Kapur, 2003). These neurons are excited by both rewarding and aversive events and have weaker responses to neutral events, providing an appropriate instructive signal for neural circuitry to learn to detect, predict, and respond to situations of high importance. Here we will consider three such brain systems (Figure 5). First, neural circuits for visual and attentional orienting are calibrated to discover information about all types of events, both rewarding and aversive. For instance, both reward and aversive cues attract orienting reactions more effectively than neutral cues (Lang and Davis, 2006; Matsumoto and Hikosaka, 2009b; Austin and Duka, 2010). Second, both rewarding and aversive situations engage neural systems for cognitive control and action selection - we need to engage working memory to hold information in mind, conflict resolution to decide upon a course of action, and long-term memory to remember the resulting outcome (Bradley et al., 1992; Botvinick et al., 2001; Savine et al., 2010). Third, both rewarding and aversive situations require an increase in general motivation to energize actions and to ensure that they are executed properly. Indeed, DA neurons are critical in motivating effort to achieve high-value goals and in translating knowledge of task demands into reliable motor performance (Berridge and Robinson, 1998; Mazzoni et al., 2007; Niv et al., 2007; Salamone et al., 2007).
In addition to their signals encoding motivational value and salience, the majority of DA neurons also have burst responses to several types of sensory events that are not directly associated with rewarding or aversive experiences. These responses have been theorized to depend on a number of neural and psychological factors, including direct sensory input, surprise, novelty, arousal, attention, salience, generalization, and pseudo-conditioning (Schultz, 1998; Redgrave et al., 1999; Horvitz, 2000; Lisman and Grace, 2005; Redgrave and Gurney, 2006; Joshua et al., 2009a; Schultz, 2010).
Here we will attempt to synthesize these ideas and account for these DA responses in terms of a single underlying signal, an alerting signal (Figure 5). The term ‘alerting’ was used by Schultz (Schultz, 1998) as a general term for events that attract attention. Here we will use it in a more specific sense. By an alerting event, we mean an unexpected sensory cue that captures attention based on a rapid assessment of its potential importance, using simple features such as its location, size, and sensory modality. Such alerting events often trigger immediate behavioral reactions to investigate them and determine their precise meaning. Thus DA alerting signals typically occur at short latencies, are based on the rough features of a stimulus, and are best correlated with immediate reactions such as orienting reactions (Schultz and Romo, 1990; Joshua et al., 2009a; Schultz, 2010). This is in contrast to other motivational signals in DA neurons which typically occur at longer latencies, take into account the precise identity of the stimulus, and are best correlated with considered behavioral actions such as decisions to approach or avoid (Schultz and Romo, 1990; Joshua et al., 2009a; Schultz, 2010).
DA alerting responses can be triggered by surprising sensory events such as unexpected light flashes and auditory clicks, which evoke prominent burst excitations in 60–90% of DA neurons throughout the SNc and VTA (Strecker and Jacobs, 1985; Horvitz et al., 1997; Horvitz, 2000) (Figure 6A). These alerting responses seem to reflect the degree to which the stimulus is surprising and captures attention; they are reduced if a stimulus occurs at predictable times, if attention is engaged elsewhere, or during sleep (Schultz, 1998; Takikawa et al., 2004; Strecker and Jacobs, 1985; Steinfels et al., 1983). For instance, an unexpected clicking sound evokes a prominent DA burst when a cat is in a passive state of quiet waking, but has no effect when the cat is engaged in attention-demanding activities such as hunting a rat, feeding, grooming, being petted by the experimenter, and so on (Strecker and Jacobs, 1985) (Figure 6A). Similarly, DA burst responses are triggered by sensory events that are physically weak but are alerting due to their novelty (Ljungberg et al., 1992; Schultz, 1998). These responses habituate as the novel stimulus becomes familiar, in parallel with the habituation of orienting reactions (Figure 6B). Consistent with these findings, surprising and novel events evoke DA release in downstream structures (Lisman and Grace, 2005) and activate DA-related brain circuits in a manner that shapes reward processing (Zink et al., 2003; Davidson et al., 2004; Duzel et al., 2010).
DA alerting responses are also triggered by unexpected sensory cues that have the potential to provide new information about motivationally salient events. As expected for a short-latency alerting signal, these responses are rather non-selective: they are triggered by any stimulus that merely resembles a motivationally salient cue, even if the resemblance is very slight (a phenomenon called generalization) (Schultz, 1998). As a result, DA neurons often respond to a stimulus with a mixture of two signals: a fast alerting signal encoding the fact that the stimulus is potentially important, and a second signal encoding its actual rewarding or aversive meaning (Schultz and Romo, 1990; Waelti et al., 2001; Tobler et al., 2003; Day et al., 2007; Kobayashi and Schultz, 2008; Fiorillo et al., 2008; Nomoto et al., 2010) (see (Kakade and Dayan, 2002; Joshua et al., 2009a; Schultz, 2010) for review). An example can be seen in a set of motivational salience coding DA neurons shown in Figure 6C (Bromberg-Martin et al., 2010a). These neurons were excited by reward and aversive cues, but they were also excited by a neutral cue. The neutral cue had never been paired with motivational outcomes, but did have a (very slight) physical resemblance to the reward and aversive cues.
These alerting responses seem closely tied to a sensory cue’s ability to trigger orienting reactions to examine it further and discover its meaning. This can be seen in three notable properties. First, alerting responses only occur for sensory cues that have to be examined to determine their meaning, not for intrinsically rewarding or aversive events such as delivery of juice or airpuffs (Schultz, 2010). Second, alerting responses only occur when a cue is potentially important and has the ability to trigger orienting reactions, not when the cue is irrelevant to the task at hand and fails to trigger orienting reactions (Schultz and Romo, 1990). Third, alerting responses are enhanced in situations when cues would trigger an abrupt shift of attention – when they appear at an unexpected time or away from the center of gaze (Bromberg-Martin et al., 2010a). Thus when motivational cues are presented with unpredictable timing they trigger immediate orienting reactions and a generalized DA alerting response – excitation by all cues including neutral cues (Figure 6C, black). But if their timing is made predictable – for example, by forewarning the subjects with a “trial start cue” presented one second before the cues appear – the cues no longer evoke an alerting response (Figure 6D, gray). Instead, the alerting response shifts to the trial start cue – the first event of the trial that has unpredictable timing and evokes orienting reactions (Figure 6D, black).
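The shift of the alerting response can be summarized as a simple rule: within a trial, the response attaches to the first sensory cue whose timing cannot be predicted. The Python sketch below is only a schematic restatement of that rule; the event names, the tuple layout and the function name are hypothetical and not taken from the experiments.

```python
from typing import List, Optional, Tuple

# Each cue is (name, timing_is_predictable); this layout is an assumption of the sketch.
Cue = Tuple[str, bool]


def alerting_cue(trial_cues: List[Cue]) -> Optional[str]:
    """Return the first cue in the trial with unpredictable timing, if any."""
    for name, timing_is_predictable in trial_cues:
        if not timing_is_predictable:
            return name
    return None


# Motivational cue appears at an unpredictable time
without_start_cue = [("motivational cue", False)]

# A trial start cue forewarns the subject, so the motivational cue's timing is predictable
with_start_cue = [("trial start cue", False), ("motivational cue", True)]

print(alerting_cue(without_start_cue))
print(alerting_cue(with_start_cue))
```

The sketch covers only sensory cues, in line with the first property above that intrinsically rewarding or aversive outcomes do not evoke alerting responses.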
What is the underlying mechanism that generates DA neuron alerting signals? One hypothesis is that alerting responses are simply conventional reward prediction error signals that occur at short latencies, encoding the expected reward value of a stimulus before it has been fully discriminated (Kakade and Dayan, 2002). More recent evidence, however, suggests that alerting signals can be generated by a distinct mechanism from conventional DA reward signals (Satoh et al., 2003; Bayer and Glimcher, 2005; Bromberg-Martin et al., 2010a; Bromberg-Martin et al., 2010c; Nomoto et al., 2010). Most strikingly, the alerting response to the trial start cue is not restricted to rewarding tasks; it can have equal strength during an aversive task in which no rewards are delivered (Figure 6C,D, bottom, “aversive task”). This occurs even though conventional DA reward signals in the same neurons correctly signal that the rewarding task has a much higher expected value than the aversive task (Bromberg-Martin et al., 2010a). These alerting signals are not purely a form of value coding or purely a form of salience coding, because they occur in the majority of both motivational value and salience coding DA neurons (Bromberg-Martin et al., 2010a). A second dissociation can be seen in the way that DA neurons predict future rewards based on the memory of past reward outcomes (Satoh et al., 2003; Bayer and Glimcher, 2005). Whereas conventional DA reward signals are controlled by a long-timescale memory trace optimized for accurate reward prediction, alerting responses to the trial start cue are controlled by a separate memory trace resembling that seen in immediate orienting reactions (Bromberg-Martin et al., 2010c). A third dissociation can be seen in the way that these signals are distributed across the DA neuron population. Whereas conventional DA reward signals are strongest in the ventromedial SNc, alerting responses to the trial start cue (and to other unexpectedly timed cues) are broadcast throughout the SNc (Nomoto et al., 2010).
In contrast to these dissociations from conventional reward signals, DA alerting signals are correlated with the speed of orienting and approach responses to the alerting event (Satoh et al., 2003; Bromberg-Martin et al., 2010a; Bromberg-Martin et al., 2010c). This suggests that alerting signals are generated by a neural process that motivates fast reactions to investigate potentially important events. At the present time, unfortunately, relatively little is known about precisely what events this process treats as ‘important’. For example, are alerting responses equally sensitive to rewarding and aversive events? Alerting responses are known to occur for stimuli that resemble reward cues or that resemble both reward and aversive cues (e.g. by sharing the same sensory modality). But it is not yet known whether alerting responses occur for stimuli that solely resemble aversive cues.
As we have seen, alerting signals are likely to be generated by a distinct mechanism from motivational value and salience signals. However, alerting signals are sent to both motivational value and salience coding DA neurons, and therefore are likely to regulate brain processing and behavior in a similar manner to value and salience signals (Figure 5).
Alerting signals sent to motivational salience coding DA neurons would support orienting of attention to the alerting stimulus, engagement of cognitive resources to discover its meaning and decide on a plan for action, and increase motivation levels to implement this plan efficiently (Figure 5). These effects could occur through immediate effects on neural processing or by reinforcing actions which led to detection of the alerting event. This functional role fits well with the correlation between DA alerting responses and fast behavioral reactions to the alerting stimulus, and with theories that short-latency DA neuron responses are involved in orienting of attention, arousal, enhancement of cognitive processing, and immediate behavioral reactions (Redgrave et al., 1999; Horvitz, 2000; Joseph et al., 2003; Lisman and Grace, 2005; Redgrave and Gurney, 2006; Joshua et al., 2009a).
The presence of alerting signals in motivational value coding DA neurons is more difficult to explain. These neurons transmit motivational value signals that are ideal for seeking, evaluation of outcomes, and value learning; yet they can also be excited by alerting events such as unexpected clicking sounds and the onset of aversive trials. According to our hypothesized pathway (Figure 5), this would cause alerting events to be assigned positive value and to be sought after in a manner similar to rewards! While surprising at first glance, there is reason to suspect that alerting events can be treated as positive goals. Alerting signals provide the first warning that a potentially important event is about to occur, and hence provide the first opportunity to take action to control that event. If alerting cues are available, motivationally salient events can be detected, predicted, and prepared for in advance; if alerting cues are absent, motivationally salient events always occur as an unexpected surprise. Indeed, humans and animals often express a preference for environments where rewarding, aversive, and even motivationally neutral sensory events can be observed and predicted in advance (Badia et al., 1979; Herry et al., 2007; Daly, 1992; Chew and Ho, 1994) and many DA neurons signal the behavioral preference to view reward-predictive information (Bromberg-Martin and Hikosaka, 2009). DA alerting signals may support these preferences by assigning positive value to environments where potentially important sensory cues can be anticipated in advance.
Thus far we have divided DA neurons into two types which encode motivational value and motivational salience and are suitable for distinct roles in motivational control (Figure 5). How does this conceptual scheme map onto neural pathways in the brain? Here we propose a hypothesis about the anatomical locations of these neurons, their projections to downstream structures, and the sources of their motivational signals (Figures 6, 7).
A recent study mapped the locations of DA reward and aversive signals in the lateral midbrain including the SNc and lateralmost part of the VTA (Matsumoto and Hikosaka, 2009b). Motivational value and motivational salience signals were distributed across this region in an anatomical gradient. Motivational value signals were found more commonly in neurons in the ventromedial SNc and lateral VTA, while motivational salience signals were found more commonly in neurons in the dorsolateral SNc (Figure 7B). This is consistent with reports that DA reward value coding is strongest in the ventromedial SNc (Nomoto et al., 2010) while aversive excitations tend to be strongest more laterally (Mirenowicz and Schultz, 1996). Other studies have explored the more medial midbrain. These studies found a mixture of excitatory and inhibitory aversive responses with no significant difference in their locations, although with a trend for aversive excitations to be located more ventrally (Guarraci and Kapp, 1999; Brischoux et al., 2009) (Figure 7C).
According to our hypothesis, motivational value coding DA neurons should project to brain regions involved in approach and avoidance actions, evaluation of outcomes, and value learning (Figure 5). Indeed, the ventromedial SNc and VTA project to the ventromedial prefrontal cortex (Williams and Goldman-Rakic, 1998) including the orbitofrontal cortex (OFC) (Porrino and Goldman-Rakic, 1982) (Figure 7A). The OFC has been consistently implicated in value coding in functional imaging studies (Anderson et al., 2003; Small et al., 2003; Jensen et al., 2007; Litt et al., 2010) and single neuron recordings (Morrison and Salzman, 2009; Roesch and Olson, 2004). The OFC is thought to evaluate choice options (Padoa-Schioppa, 2007; Kable and Glimcher, 2009), encode outcome expectations (Schoenbaum et al., 2009), and update these expectations during learning (Walton et al., 2010). Furthermore, the OFC is involved in learning from negative reward prediction errors (Takahashi et al., 2009) which are strongest in value-coding DA neurons (Figure 4).
In addition, the medial portions of the dopaminergic midbrain project to the ventral striatum including the nucleus accumbens shell (NAc shell) (Haber et al., 2000) (Figure 7A). A recent study demonstrated that the NAc shell receives phasic DA signals encoding the motivational value of taste outcomes (Roitman et al., 2008). These signals are likely to cause value learning because direct infusion of DA drugs into the NAc shell is strongly reinforcing (Ikemoto, 2010) while treatments that reduce DA input to the shell can induce aversions (Liu et al., 2008). One caveat is that studies of NAc shell DA release over long timescales (minutes) have produced mixed results, some consistent with value coding and others with salience coding (e.g. (Bassareo et al., 2002; Ventura et al., 2007)). This suggests that value signals may be restricted to specific locations within the NAc shell. Notably, different regions of the NAc shell are specialized for controlling appetitive and aversive behavior (Reynolds and Berridge, 2002), which both require input from DA neurons (Faure et al., 2008).
Finally, DA neurons throughout the extent of the SNc send heavy projections to the dorsal striatum (Haber et al., 2000), suggesting that the dorsal striatum may receive both motivational value and salience coding DA signals (Figure 7A). Motivational value coding DA neurons would provide an ideal instructive signal for striatal circuitry involved in value learning, such as learning of stimulus-response habits (Faure et al., 2005; Yin and Knowlton, 2006; Balleine and O'Doherty, 2010). When these DA neurons burst, they would engage the direct pathway to learn to gain reward outcomes; when they pause, they would engage the indirect pathway to learn to avoid aversive outcomes (Figure 2). Indeed, there is recent evidence that the striatal pathways follow exactly this division of labor for reward and aversive processing (Hikida et al., 2010). It is still unknown, however, how neurons in these pathways respond to rewarding and aversive events during behavior. At least in the dorsal striatum as a whole, a subset of neurons respond to certain rewarding and aversive events in distinct manners (Ravel et al., 2003; Yamada et al., 2004, 2007; Joshua et al., 2008).
According to our hypothesis, motivational salience coding DA neurons should project to brain regions involved in orienting, cognitive processing, and general motivation (Figure 5). Indeed, DA neurons in the dorsolateral midbrain send projections to dorsal and lateral frontal cortex (Williams and Goldman-Rakic, 1998) (Figure 7A), a region which has been implicated in cognitive functions such as attentional search, working memory, cognitive control, and decision making between motivational outcomes (Williams and Castner, 2006; Lee and Seo, 2007; Wise, 2008; Kable and Glimcher, 2009; Wallis and Kennerley, 2010). Dorsolateral prefrontal cognitive functions are tightly regulated by DA levels (Robbins and Arnsten, 2009) and are theorized to depend on phasic DA neuron activation (Cohen et al., 2002; Lapish et al., 2007). Notably, a subset of lateral prefrontal neurons respond to both rewarding and aversive visual cues, and the great majority respond in the same direction resembling coding of motivational salience (Kobayashi et al., 2006). Furthermore, the activity of these neurons is correlated with behavioral success at performing working memory tasks (Kobayashi et al., 2006). Although this dorsolateral DA→dorsolateral frontal cortex pathway appears to be specific to primates (Williams and Goldman-Rakic, 1998), a functionally similar pathway may exist in other species. In particular, many of the cognitive functions of the primate dorsolateral prefrontal cortex are performed by the rodent medial prefrontal cortex (Uylings et al., 2003), and there is evidence that this region receives DA motivational salience signals and controls salience-related behavior (Mantz et al., 1989; Di Chiara, 2002; Joseph et al., 2003; Ventura et al., 2007; Ventura et al., 2008).
Given the evidence that the VTA contains both salience and value coding neurons and that value coding signals are sent to the NAc shell, salience signals might be sent to the NAc core (Figure 7A). Indeed, the NAc core (but not shell) is crucial for enabling motivation to overcome response costs such as physical effort; for performance of set-shifting tasks requiring cognitive flexibility; and for enabling reward cues to cause an enhancement of general motivation (Ghods-Sharifi and Floresco, 2010; Floresco et al., 2006; Hall et al., 2001; Cardinal, 2006). Consistent with coding of motivational salience, the NAc core receives phasic bursts of DA during both rewarding experiences (Day et al., 2007) and aversive experiences (Anstrom et al., 2009).
Finally, as discussed above, some salience coding DA neurons may project to the dorsal striatum (Figure 7A). While some regions of the dorsal striatum are involved in functions related to learning action values, the dorsal striatum is also involved in functions that should be engaged for all salient events, such as orienting, attention, working memory, and general motivation (Hikosaka et al., 2000; Klingberg, 2010; Palmiter, 2008). Indeed, a subset of dorsal striatal neurons are more strongly responsive to rewarding and aversive events than to neutral events (Ravel et al., 1999; Blazquez et al., 2002; Yamada et al., 2004, 2007), although their causal role in motivated behavior is not yet known.
A recent series of studies suggests that DA neurons receive motivational value signals from a small nucleus in the epithalamus, the lateral habenula (LHb) (Hikosaka, 2010) (Figure 8). The LHb exerts potent negative control over DA neurons: LHb stimulation inhibits DA neurons at short latencies (Christoph et al., 1986) and can regulate learning in an opposite manner to VTA stimulation (Shumake et al., 2010). Consistent with a negative control signal, many LHb neurons have mirror-inverted phasic responses to DA neurons: LHb neurons are inhibited by positive reward prediction errors and excited by negative reward prediction errors (Matsumoto and Hikosaka, 2007, 2009a; Bromberg-Martin et al., 2010a; Bromberg-Martin et al., 2010c). In several cases these signals occur at shorter latencies in the LHb, consistent with LHb → DA transmission (Matsumoto and Hikosaka, 2007; Bromberg-Martin et al., 2010a).
The LHb is capable of controlling DA neurons throughout the midbrain, but several lines of evidence suggest that it exerts preferential control over motivational value coding DA neurons. First, LHb neurons encode motivational value in a manner closely mirroring value-coding DA neurons – they encode both positive and negative reward prediction errors and respond in opposite directions to rewarding and aversive events (Matsumoto and Hikosaka, 2009a; Bromberg-Martin et al., 2010a). Second, LHb stimulation has its most potent effects on DA neurons whose properties are consistent with value coding, including inhibition by no-reward cues and anatomical location in the ventromedial SNc (Matsumoto and Hikosaka, 2007, 2009b). Third, lesions to the LHb impair DA neuron inhibitory responses to aversive events, suggesting a causal role for the LHb in generating DA value signals (Gao et al., 1990).
The LHb is part of a more extensive neural pathway by which DA neurons can be controlled by the basal ganglia (Figure 8). The LHb receives signals resembling reward prediction errors through a projection from a population of neurons located around the globus pallidus border (GPb) (Hong and Hikosaka, 2008). Once these signals reach the LHb they are likely to be sent to DA neurons through a disynaptic pathway in which the LHb excites midbrain GABA neurons that in turn inhibit DA neurons (Ji and Shepard, 2007; Omelchenko et al., 2009; Brinschwitz et al., 2010). This could occur through LHb projections to interneurons in the VTA and to an adjacent GABA-ergic nucleus called the rostromedial tegmental nucleus (RMTg) (Jhou et al., 2009b) (also called the ‘caudal tail of VTA’ (Kaufling et al., 2009)). Notably, RMTg neurons have response properties similar to LHb neurons, encode motivational value, and have a heavy inhibitory projection to dopaminergic midbrain (Jhou et al., 2009a). Thus, the complete basal ganglia pathway to send motivational value signals to DA neurons may be GPb→LHb→RMTg→DA (Hikosaka, 2010).
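The net sign of the LHb's influence on DA neurons follows from multiplying the signs of the connections along the route. The sketch below is a hedged illustration: it encodes only the two connections whose signs are stated above (the LHb excites RMTg GABA neurons, which inhibit DA neurons), and the data structure and function name are hypothetical.

```python
from typing import Dict, List, Tuple

# +1 = excitatory connection, -1 = inhibitory connection
SYNAPSE_SIGN: Dict[Tuple[str, str], int] = {
    ("LHb", "RMTg"): +1,  # LHb excites midbrain GABA neurons
    ("RMTg", "DA"): -1,   # RMTg GABA neurons inhibit DA neurons
}


def net_effect(path: List[str]) -> int:
    """Multiply connection signs along a path; returns +1 (net excitation) or -1 (net inhibition)."""
    sign = 1
    for pre, post in zip(path, path[1:]):
        sign *= SYNAPSE_SIGN[(pre, post)]
    return sign


lhb_to_da = net_effect(["LHb", "RMTg", "DA"])
print("Net effect of LHb activity on DA neurons:", lhb_to_da)
```

A net sign of -1 matches the mirror-inverted relationship described earlier: when LHb neurons are excited by a negative reward prediction error, DA neurons on this pathway would be inhibited.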
An important question for future research is whether motivational value signals are channeled solely through the LHb or whether they are carried by multiple input pathways. Notably, DA inhibitions by aversive footshocks are controlled by activity in the mesopontine parabrachial nucleus (PBN) (Coizet et al., 2010) (Figure 8). This nucleus contains neurons that receive direct input from the spinal cord encoding noxious sensations and could inhibit DA neurons through excitatory projections to the RMTg (Coizet et al., 2010; Gauriau and Bernard, 2002). This suggests that the LHb sends DA neurons motivational value signals for both rewarding and aversive cues and outcomes while the PBN provides a component of the value signal specifically related to aversive outcomes.
Less is known about the source of motivational salience signals in DA neurons. One intriguing candidate is the central nucleus of the amygdala (CeA) which has been consistently implicated in orienting, attention, and general motivational responses during both rewarding and aversive events (Holland and Gallagher, 1999; Baxter and Murray, 2002; Merali et al., 2003; Balleine and Killcross, 2006) (Figure 8). The CeA and other amygdala nuclei contain many neurons whose signals are consistent with motivational salience: they signal rewarding and aversive events in the same direction, are enhanced when events occur unexpectedly, and are correlated with behavioral measures of arousal (Nishijo et al., 1988; Belova et al., 2007; Shabel and Janak, 2009). These signals may be sent to DA neurons because the CeA has descending projections to the brainstem that carry rewarding and aversive information (Lee et al., 2005; Pascoe and Kapp, 1985) and the CeA is necessary for DA release during reward-related events (Phillips et al., 2003a). Furthermore, the CeA participates with DA neurons in pathways consistent with our proposed anatomical and functional networks for motivational salience. A pathway including the CeA, SNc, and dorsal striatum is necessary for learned orienting to food cues (Han et al., 1997; Lee et al., 2005; El-Amamy and Holland, 2007). Consistent with our division of salience vs. value signals, this pathway is needed for learning to orient to food cues but not for learning to approach food outcomes (Han et al., 1997). A second pathway, including the CeA, SNc, VTA, and NAc core, is necessary for reward cues to cause an increase in general motivation to perform reward-seeking actions (Hall et al., 2001; Corbit and Balleine, 2005; El-Amamy and Holland, 2007).
In addition to the CeA, DA neurons could receive motivational salience signals from other sources such as salience-coding neurons in the basal forebrain (Lin and Nicolelis, 2008; Richardson and DeLong, 1991) and neurons in the PBN (Coizet et al., 2010), although these pathways remain to be investigated.
There are several good candidates for providing DA neurons with alerting signals. Perhaps the most attractive candidate is the superior colliculus (SC), a midbrain nucleus that receives short-latency sensory input from multiple sensory modalities and controls orienting reactions and attention (Redgrave and Gurney, 2006) (Figure 8). The SC has a direct projection to the SNc and VTA (May et al., 2009; Comoli et al., 2003). In anesthetized animals the SC is a vital conduit for short-latency visual signals to reach DA neurons and trigger DA release in downstream structures (Comoli et al., 2003; Dommett et al., 2005). The SC-DA pathway is best suited to convey alerting signals rather than reward and aversion signals, as SC neurons have little response to reward delivery and have only a mild influence over DA aversive responses (Coizet et al., 2006). This suggests a sequence of events in which SC neurons (1) detect a stimulus, (2) select it as potentially important, (3) trigger an orienting reaction to examine the stimulus, and (4) simultaneously trigger a DA alerting response which causes a burst of DA in downstream structures (Redgrave and Gurney, 2006).
A second candidate for sending alerting signals to DA neurons is the LHb (Figure 8). Notably, the unexpected onset of a trial start cue inhibits many LHb neurons in an inverse manner to the DA neuron alerting signal, and this response occurs at shorter latency in the LHb, consistent with an LHb→DA direction of transmission (Bromberg-Martin et al., 2010a; Bromberg-Martin et al., 2010c). We have also anecdotally observed that LHb neurons are commonly inhibited by unexpected visual images and sounds in an inverse manner to DA excitations (M.M., E.S.B.-M., and O.H., unpublished observations), although this awaits a more systematic investigation.
Finally, a third candidate for sending alerting signals to DA neurons is the pedunculopontine tegmental nucleus (PPTg), which projects to both the SNc and VTA and is involved in motivational processing (Winn, 2006) (Figure 8). The PPTg is important for enabling VTA DA neuron bursts (Grace et al., 2007) including burst responses to reward cues (Pan and Hyland, 2005). Consistent with an alerting signal, PPTg neurons have short-latency responses to multiple sensory modalities and are active during orienting reactions (Winn, 2006). There is evidence that PPTg sensory responses are influenced by reward value and by requirements for immediate action (Dormont et al., 1998; Okada et al., 2009) (but see (Pan and Hyland, 2005)). Some PPTg neurons also respond to rewarding or aversive outcomes themselves (Dormont et al., 1998; Kobayashi et al., 2002; Ivlieva and Timofeeva, 2003b, a). It will be important to test whether the signals the PPTg sends to DA neurons are related specifically to alerting or whether they contain other motivational signals such as value and salience.
We have reviewed the nature of reward, aversive, and alerting signals in DA neurons, and have proposed a hypothesis about the underlying neural pathways and their roles in motivated behavior. We consider this to be a working hypothesis, a guide for future theories and research that will bring us to a more complete understanding. Here we will highlight several areas where further investigation is needed to reveal deeper complexities.
At the present time, our understanding of the neural pathways underlying DA signals is at an early stage. Therefore, we have attempted to infer the sources and destinations of value and salience coding DA signals largely based on indirect measures such as the neural response properties and functional roles of different brain areas. It will be important to put these candidate pathways to a direct test and to discover their detailed properties, aided by recently developed tools that allow DA transmission to be monitored (Robinson et al., 2008) and controlled (Tsai et al., 2009; Tecuapetla et al., 2010; Stuber et al., 2010) with high spatial and temporal precision. As noted above, several of these candidate structures have a topographic organization, suggesting that their communication with DA neurons might be topographic as well. The neural sources of phasic DA signals may also be more complex than the simple feedforward pathways we have proposed, since the neural structures that communicate with DA neurons are densely interconnected (Geisler and Zahm, 2005) and DA neurons can communicate with each other within the midbrain (Ford et al., 2010).
We have focused on a selected set of DA neuron connections, but DA neurons receive functional input from many additional structures including the subthalamic nucleus, laterodorsal tegmental nucleus, bed nucleus of the stria terminalis, prefrontal cortex, ventral pallidum, and lateral hypothalamus (Grace et al., 2007; Shimo and Wichmann, 2009; Jalabert et al., 2009). Notably, lateral hypothalamus orexin neurons project to DA neurons, are activated by rewarding rather than aversive events, and trigger drug-seeking behavior (Harris and Aston-Jones, 2006), suggesting a possible role in value-related functions. DA neurons also send projections to many additional structures including the hypothalamus, hippocampus, amygdala, habenula, and a great many cortical areas. Notably, the anterior cingulate cortex (ACC) has been proposed to receive reward prediction error signals from DA neurons (Holroyd and Coles, 2002) and contains neurons with activity positively related to motivational value (Koyama et al., 1998). Yet ACC activation is also linked to aversive processing (Vogt, 2005; Johansen and Fields, 2004). These ACC functions might be supported by a mixture of DA motivational value and salience signals, which will be important to test in future study. Indeed, neural signals related to reward prediction errors have been reported in several areas including the medial prefrontal cortex (Matsumoto et al., 2007; Seo and Lee, 2007), orbitofrontal cortex (Sul et al., 2010) (but see (Takahashi et al., 2009; Kennerley and Wallis, 2009)), and dorsal striatum (Kim et al., 2009; Oyama et al., 2010), and their causal relationship to DA neuron activity remains to be discovered.
We have described motivational events with a simple dichotomy, classifying them as ‘rewarding’ or ‘aversive’. Yet these categories contain great variety. An aversive illness is gradual, prolonged, and caused by internal events; an aversive airpuff is fast, brief, and caused by the external world. These situations demand very different behavioral responses, which are likely to be supported by different neural systems. Furthermore, although we have focused our discussion on two types of DA neurons with signals resembling motivational value and salience, a close examination shows that DA neurons are not limited to this strict dichotomy. As indicated by our notion of an anatomical gradient, some DA neurons transmit mixtures of both salience-like and value-like signals; still other DA neurons respond to rewarding but not aversive events (Matsumoto and Hikosaka, 2009b; Bromberg-Martin et al., 2010a). These considerations suggest that some DA neurons may not encode motivational events along our intuitive axis of ‘good’ vs. ‘bad’ and may instead be specialized to support specific forms of adaptive behavior.
Even in the realm of rewards, there is evidence that DA neurons transmit different reward signals to different brain regions (Bassareo and Di Chiara, 1999; Ito et al., 2000; Stefani and Moghaddam, 2006; Wightman et al., 2007; Aragona et al., 2009). Diverse responses reported in the SNc and VTA include neurons that: respond only to the start of a trial (Roesch et al., 2007), perhaps encoding a pure alerting signal; respond differently to visual and auditory modalities (Strecker and Jacobs, 1985), perhaps receiving input from different SC and PPTg neurons; respond to the first or last event in a sequence (Ravel and Richmond, 2006; Jin and Costa, 2010); have sustained activation by risky rewards (Fiorillo et al., 2003); or are activated during body movements (Schultz, 1986; Kiyatkin, 1988a; Puryear et al., 2010; Jin and Costa, 2010) (see also (Phillips et al., 2003b; Stuber et al., 2005)). While each of these response patterns has only been reported in a minority of studies or neurons, these data suggest that DA neurons could potentially be divided into a much larger number of functionally distinct populations.
A final and important consideration is that present recording studies in behaving animals do not yet provide fully conclusive measurements of DA neuron activity, because these studies have only been able to distinguish between DA and non-DA neurons using indirect methods, based on neural properties such as firing rate, spike waveform, and sensitivity to D2 receptor agonists (Grace and Bunney, 1983; Schultz, 1986). These techniques appear to identify DA neurons reliably within the SNc, as indicated by several lines of evidence including comparison of intracellular and extracellular methods, juxtacellular recordings, and the effects of DA-specific lesions (Grace and Bunney, 1983; Grace et al., 2007; Brown et al., 2009). However, recent studies indicate that these techniques may be less reliable in the VTA, where DA and non-DA neurons have a wider variety of cellular properties (Margolis et al., 2006; Margolis et al., 2008; Lammel et al., 2008; Brischoux et al., 2009). Even direct measurements of DA concentrations in downstream structures do not provide conclusive evidence of DA neuron spiking activity, because DA concentrations may be controlled by additional factors such as glutamatergic activation of DA axon terminals (Cheramy et al., 1991) and rapid changes in the activity of DA transporters (Zahniser and Sorkin, 2004). To perform fully conclusive measurements of DA neuron activity during active behavior, it will be necessary to use new recording techniques, such as combining extracellular recording with optogenetic stimulation (Jin and Costa, 2010).
An influential concept of midbrain DA neurons has been that they transmit a uniform motivational signal to all downstream structures. Here we have reviewed evidence that DA signals are more diverse than commonly thought. Rather than encoding a uniform signal, DA neurons come in multiple types that send distinct motivational messages about rewarding and non-rewarding events. Even single DA neurons do not appear to transmit single motivational signals. Instead, DA neurons transmit mixtures of multiple signals generated by distinct neural processes. Some reflect detailed predictions about rewarding and aversive experiences, while others reflect fast responses to events of high potential importance.
In addition, we have proposed a hypothesis about the nature of these diverse DA signals, the neural networks that generate them, and their influence on downstream brain structures and on motivated behavior. Our proposal can be seen as a synthesis of previous theories. Many previous theories have attempted to identify DA neurons with a single motivational process such as seeking of valued goals, engaging motivationally salient situations, or reacting to alerting changes in the environment. In our view, DA neurons receive signals related to all three of these processes. Yet rather than distilling these signals into a uniform message, we have proposed that DA neurons transmit these signals to distinct brain structures in order to support distinct neural systems for motivated cognition and behavior. Some DA neurons support brain systems that assign motivational value, promoting actions to seek rewarding events, avoid aversive events, and ensure that alerting events can be predicted and prepared for in advance. Other DA neurons support brain systems that are engaged by motivational salience, including orienting to detect potentially important events, cognitive processing to choose a response and to remember its consequences, and motivation to persist in pursuit of an optimal outcome. We hope that this proposal helps lead us to a more refined understanding of DA functions in the brain, in which DA neurons tailor their signals to support multiple neural networks with distinct roles in motivational control.
This work was supported by the intramural research program at the National Eye Institute. We also thank Amy Arnsten for valuable discussions.
Footnote 1. By motivational salience we mean a quantity that is high for both rewarding and aversive events and is low for motivationally neutral (non-rewarding and non-aversive) events. This is similar to the definition given by Berridge and Robinson (1998). Note that motivational salience is distinct from other notions of salience used in neuroscience, such as incentive salience (which applies only to desirable events; Berridge and Robinson, 1998) and perceptual salience (which applies to motivationally neutral events such as moving objects and colored lights; Bisley and Goldberg, 2010).
Footnote 2. Note that the signals of motivational salience coding DA neurons are distinct from the classic notions of “associability” and “change in associability” that have been proposed to regulate the rate of reinforcement learning (e.g. Pearce and Hall, 1980). Such theories state that animals learn (and adjust learning rates) from both positive and negative prediction errors. Although these DA neurons may contribute to learning from positive prediction errors, during which they can have a strong response (e.g. to unexpected reward delivery), they may not contribute to learning from negative prediction errors, during which they can have little or no response (e.g. to unexpected reward omission) (Fig. 4B).
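The distinctions drawn in the two footnotes can be made concrete with a small toy sketch. It is an illustration under stated assumptions, not a model from the studies reviewed here: outcomes are reduced to a single signed number (positive for reward, negative for aversive, zero for neutral), motivational value is taken as that signed number, and motivational salience as its absolute value. The associability update is a simplified, commonly used Pearce-Hall-style rule in which associability tracks the unsigned prediction error. The function names and the parameter `gamma` are hypothetical.

```python
# Toy illustration only. The functions, the scalar coding of outcomes, and
# the parameter gamma are assumptions, not taken from the reviewed studies.

def motivational_value(outcome: float) -> float:
    """Signed quantity: positive for rewards, negative for aversive events."""
    return outcome


def motivational_salience(outcome: float) -> float:
    """Unsigned quantity: high for rewarding and aversive events, low for neutral."""
    return abs(outcome)


def update_associability(alpha: float, actual: float, expected: float,
                         gamma: float = 0.5) -> float:
    """Simplified Pearce-Hall-style update.

    Associability moves toward the unsigned prediction error, so positive
    and negative errors of equal size change it by the same amount.
    """
    return gamma * abs(actual - expected) + (1.0 - gamma) * alpha


if __name__ == "__main__":
    events = [("juice", 1.0), ("airpuff", -1.0), ("neutral tone", 0.0)]
    for name, outcome in events:
        print(name, motivational_value(outcome), motivational_salience(outcome))

    # Unexpected reward delivery and unexpected reward omission.
    print(update_associability(0.0, actual=1.0, expected=0.0))
    print(update_associability(0.0, actual=0.0, expected=1.0))
```

In this sketch, reward delivery and reward omission raise associability equally. That symmetry is the point of contrast in Footnote 2: the salience coding DA neurons described there can respond strongly to unexpected reward delivery but show little or no response to unexpected reward omission.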
J Neurosci. 2014 Oct 22;34(43):14349-64. doi: 10.1523/JNEUROSCI.3492-14.2014.
Approach to reward is a fundamental adaptive behavior, disruption of which is a core symptom of addiction and depression. Nucleus accumbens (NAc) dopamine is required for reward-predictive cues to activate vigorous reward seeking, but the underlying neural mechanism is unknown. Reward-predictive cues elicit both dopamine release in the NAc and excitations and inhibitions in NAc neurons.
However, a direct link has not been established between dopamine receptor activation, NAc cue-evoked neuronal activity, and reward-seeking behavior. Here, we use a novel microelectrode array that enables simultaneous recording of neuronal firing and local dopamine receptor antagonist injection. We demonstrate that, in the NAc of rats performing a discriminative stimulus task for sucrose reward, blockade of either D1 or D2 receptors selectively attenuates excitation, but not inhibition, evoked by reward-predictive cues.
Furthermore, we establish that this dopamine-dependent signal is necessary for reward-seeking behavior. These results demonstrate a neural mechanism by which NAc dopamine invigorates environmentally cued reward-seeking behavior.
The dopamine projection from the ventral tegmental area (VTA) to the NAc is an essential component of the neural circuit that promotes reward-seeking behavior (Nicola, 2007). If NAc dopamine function is reduced experimentally, animals are less likely to exert effort to obtain reward (Salamone and Correa, 2012) and often fail to respond to reward-predictive cues (Di Ciano et al., 2001; Yun et al., 2004; Nicola, 2007, 2010; Saunders and Robinson, 2012). These deficits are due to impairment of a specific component of reward seeking: the latency to initiate approach behavior is increased, whereas the speed of approach, the ability to find the goal and perform the necessary operant behavior required to earn reward, and the ability to consume reward are unaffected (Nicola, 2010). Dopamine must promote approach by influencing the activity of NAc neurons, but the nature of this influence remains unclear. Large proportions of NAc neurons are excited or inhibited by reward-predictive cues (Nicola et al., 2004a; Roitman et al., 2005; Ambroggi et al., 2008, 2011; McGinty et al., 2013), and the excitations begin before onset of cued approach behavior and predict the latency to initiate locomotion (McGinty et al., 2013). Therefore, this activity has the characteristics required of a dopamine-dependent signal that promotes cued approach, but whether it does so is unknown.
Neurons in two structures that send glutamatergic afferents to the NAc, the BLA and dorsal medial PFC (Brog et al., 1993), are excited by reward-predictive cues (Schoenbaum et al., 1998; Ambroggi et al., 2008), and reversible inactivation of either of these structures (Ambroggi et al., 2008; Ishikawa et al., 2008) or of the VTA (Yun et al., 2004) reduces the magnitude of cue-evoked excitations in the NAc. These observations sugges