The neurochemical dopamine is the central player in all addictions. Dopamine appears in every article on addiction, reward, motivation, or learning. Dopamine dysregulation is at the heart of porn addiction, cravings and withdrawal symptoms. Restoring normal dopamine function and sensitivity is a key to porn recovery.
This section contains both lay articles for the general public and research articles. If you are not an expert in addiction, I suggest starting with the lay articles, which are marked with an "L".
September 30, 2013 · by Talia Lerner
Dopamine neurons are some of the most studied, most sensationalized neurons out there. Lately, though, they’ve been going through a bit of an identity crisis. What is a dopamine neuron? Some interesting recent twists in dopamine research have definitively debunked the myth that dopamine neurons are all of a kind – and you should question any study that treats them as such.
There are many ways in which a dopamine neuron (defined as a neuron that releases the neurotransmitter dopamine) is not just a dopamine neuron. I’ll focus on three really cool ways here:
Before I describe these exciting new findings, though, let me give you the standard Neuroscience 101 introduction to dopamine neurons. This influential theory of dopamine neuron function comes to us from Wolfram Schultz and colleagues’ 1997 Science paper, “A Neural Substrate of Prediction and Reward.” It showed that dopamine neurons, which fire at some background rate, fire more in response to unpredicted, but not predicted, rewards. Additionally, if you’re expecting a reward and don’t get it, the dopamine neurons fire less. This finding led Schultz et al. to propose that dopamine neurons encode “reward prediction error.” That is, they tell you whether or not things are as good, better, or worse than you expected. Schultz et al. go on to state “The responses of these neurons are relatively homogeneous—different neurons respond in the same manner and different appetitive stimuli elicit similar neuronal responses. All responses occur in the majority of dopamine neurons (55 to 80%).”
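To make the idea of "reward prediction error" concrete, here is a minimal sketch. It assumes a simple Rescorla-Wagner-style learner, which is a common textbook simplification and not the analysis used by Schultz et al.; the function names, the learning rate, and the reward values are all illustrative assumptions.

```python
# Minimal sketch of a reward prediction error (RPE), assuming a
# Rescorla-Wagner-style learner. All names and parameters are illustrative.

def prediction_error(reward, expected):
    """RPE: positive if the outcome beats expectation, negative if it falls short."""
    return reward - expected

def learn_expectation(rewards, learning_rate=0.2, expected=0.0):
    """Update the expectation trial by trial; return (final expectation, errors)."""
    errors = []
    for reward in rewards:
        delta = prediction_error(reward, expected)
        errors.append(delta)
        expected += learning_rate * delta
    return expected, errors

# A reward of 1.0 delivered on 30 consecutive trials.
expected, errors = learn_expectation([1.0] * 30)

first_error = errors[0]   # unpredicted reward: large positive error
late_error = errors[-1]   # well-predicted reward: error close to zero
omission_error = prediction_error(0.0, expected)  # expected reward withheld: negative error

print(first_error, late_error, omission_error)
```

In this toy model the error term plays the role of the change in firing relative to the background rate: large for a surprising reward, near zero once the reward is predicted, and negative when a predicted reward is withheld.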
The role of dopamine neurons as computers of reward prediction error remains a fascinating and worthy line of research, but if reward prediction error is ALL that dopamine neurons do, then what do we need 400,000-600,000 of them for?* Here’s a map of where the brain’s dopamine neurons are located (in a cross-section of a rodent brain):
Distribution of dopamine neuron cell groups A8-A16 in the adult rodent brain. Adapted from Björklund, A. & Dunnett, S. B. Dopamine neuron systems in the brain: an update. Trends in Neurosciences 30, 194–202 (2007).
*In humans. There are 160,000-320,000 in monkeys and just 20,000-45,000 in rodents.
Looking at this diagram, there already seem to be some gross anatomical distinctions among groups of dopamine neurons, which is why they are labeled A8-A16. There are also finer anatomical distinctions, which turn out to have not-so-subtle functional implications. In the first line of study I’ll focus on here, Lammel et al. set about distinguishing dopamine neurons in the ventral tegmental area (VTA, or A10 in the above picture) by their connectivity to other brain areas. Lammel et al. observed that there are at least two separable populations of dopamine neurons within the VTA. One population gets input signals from a brain area called the laterodorsal tegmentum and sends output signals to a brain area called the nucleus accumbens (call these LDT-dopamine-NAc neurons). The other population gets inputs from the lateral habenula and sends outputs to the prefrontal cortex (call these LHb-dopamine-PFC neurons). So what? Does the fact that these dopamine neurons are wired into different brain circuits matter at all for behavior? Lammel et al. showed that it does matter. When (using optogenetics!) they activated the inputs to LDT-dopamine-NAc neurons in mice, they found that the animals formed positive associations with the context in which they were stimulated. They chose to spend more time in the part of a box where they’d gotten the brain stimulation. In contrast, when Lammel et al. activated the inputs to LHb-dopamine-PFC neurons, the exact opposite was observed. Animals avoided a part of a box where they’d gotten the stimulation. In another study by the same group, when the mice naturally experienced something good or something bad, the strengths of these distinct circuits were modulated differentially. Mice given cocaine showed an increased strength of the LDT-dopamine-NAc pathway, but no change in the LHb-dopamine-PFC pathway. 
Mice given an irritant on their paw showed no change in the LDT-dopamine-NAc pathway, but an increased strength of the LHb-dopamine-PFC pathway.
Lammel et al.’s findings undermine Schultz et al.’s initial assertion that dopamine neurons are homogeneous. This revision likely occurred because of the increasing sensitivity of the tools available, which changed quite a bit from the 1990s to the 2010s. Newer and better tools, in combination with a little creativity, allowed Lammel et al. to distinguish subtleties that weren’t accessible to Schultz et al. In revealing these subtleties, Lammel et al. helped to demonstrate the hubris of believing that you’ve figured out a whole class of neurons because you see responses in 55-80% of a population, especially when you’re not entirely sure (or shouldn’t be) about the criteria you’ve used to define that population. (The question of defining dopamine neurons during in vivo neural recordings is a WHOLE other issue.) All the credit in the world to Schultz et al. for lighting the fire of dopamine research, but it was more of a starting point than an end point.
Grouping neurons by the brain circuits they participate in makes a ton of sense if you’re trying to figure out how brain circuits work. But what if you’re trying to figure out the dopamine part of dopamine neurons? Most dopamine neuron research has assumed that when a dopamine neuron fires, it releases the neurotransmitter dopamine, a small molecule that looks like this:
In fact, that’s how we defined “dopamine neuron.” However, as is often the case in science, the situation turns out not to be so simple. In the second line of recent research I’ll discuss here, scientists showed evidence that dopamine neurons can co-release other neurotransmitter molecules, called glutamate and GABA, along with dopamine.
Actually, different subsets of dopamine neurons most likely predominantly co-release either glutamate or GABA. Studies by Hnasko et al. and Stuber et al. demonstrated that dopamine neurons in the VTA co-release glutamate. First, they noticed that many VTA dopamine neurons express a glutamate transporter called VGLUT2, a protein that packages glutamate for release from neurons. Did the presence of VGLUT2 mean that dopamine neurons were packaging glutamate in addition to dopamine? To look at this question, the scientists looked at the responses of neurons in the nucleus accumbens (one place that dopamine neurons send outputs to, see the discussion of Lammel et al. above) to dopamine neuron stimulation. Indeed, they observed fast, excitatory responses of nucleus accumbens neurons to stimulation of VTA dopamine neurons of a type that would be consistent with a glutamatergic rather than a dopaminergic response. These responses were blocked by antagonists of glutamate receptors but not by antagonists of dopamine receptors. Additionally, in mice genetically manipulated to lack VGLUT2 in dopamine neurons, no such responses were seen.
The co-release of glutamate may not occur in all dopamine neurons though. As in Lammel et al.’s studies, connectivity matters. Stuber et al. observed that dopamine neurons in a neighboring area called the substantia nigra (A9), which sends outputs to the dorsal striatum, did not display evidence of glutamate release. That negative result is still controversial. Another group, Tritsch et al., did observe some evidence of glutamate release by substantia nigra dopamine neurons. Additionally, they demonstrated that these substantia nigra dopamine neurons also co-release yet another neurotransmitter: GABA. Oddly, however, substantia nigra dopamine neurons don’t express VGAT, the normal GABA transporter. Instead, Tritsch et al. found that VMAT, the vesicular transporter that packages dopamine, can also transport GABA, packaging it for synaptic release along with dopamine. Tritsch et al.’s finding might generalize beyond substantia nigra dopamine neurons. As long as there’s some GABA around, anything expressing VMAT could potentially package and release that GABA as well. One key question that arises from Tritsch et al.’s study is exactly where and when the GABA in the substantia nigra is being synthesized. Nevertheless, it’s there.
The implications of glutamate and GABA co-release from dopamine neurons for the most part remain to be seen. The only reported behavioral effect is from the Hnasko et al. paper. They show that mice lacking VGLUT2 in dopamine neurons run around less in response to cocaine than normal mice. That’s it for now. If nothing else, it demonstrates how much more we have to learn about the phenomenon of transmitter co-release.
So far, we’ve seen that dopamine neurons can signal different things if they’re hooked up into different brain circuits, and that they can play their assigned role in a brain circuit at least in part using chemicals other than just dopamine. In the third line of research I’ll examine here, we’ll add yet another layer of really cool complexity to the picture: dopamine neurons can change the way that they participate in a brain circuit by changing whether or not they’re making and releasing dopamine at all. In this case, Dulcis et al. looked at a slightly different group of dopamine neurons from the ones I’ve been talking about until now, located in the hypothalamus. They noticed that the number of dopamine neurons in rats seemed to fluctuate with the length of “daylight” experienced by the rats. I put daylight in quotes because it’s not real daylight – just whether or not the lights are on in a very controlled laboratory setting. Most lab animals see 12 hours of light per day, but Dulcis et al. also tried just 5 hours per day or up to 19. Rats that experienced long days had fewer dopamine neurons in their hypothalamus, while rats that experienced short days had more. Upon further examination, they determined that the changes in the number of dopamine neurons in the different light conditions weren’t due to neurons dying and being born. The same neurons were actually there in all conditions, but they were switching their dopamine-ness on or off. It’s still unclear why light exposure causes these changes, or what the exact behavioral consequences are. Rats that had long days, and fewer dopamine neurons as a result, displayed depressive and anxious behaviors (keep in mind that rats are nocturnal and prefer the dark). So did rats whose hypothalamic dopamine neurons were killed with a toxin. 
However, if the dopamine neurons were killed with a toxin while the rats got 12 hours of light per day and then the rats were given only 5 hours of light per day, previously non-dopaminergic neurons were recruited to release dopamine and fewer depressive and anxious behaviors were observed. Pretty cool! And importantly, this work demonstrates that neurons we wouldn’t even have previously identified as dopamine neurons can transform under the right conditions. Some aspects of our brains are built to be stable, but many are changing all the time, allowing us to internalize and adapt to our experience.
After all these studies, what have we learned? To me, the big picture takeaway is that understanding the brain means appreciating complexity. To be slightly more specific, it means linking molecules and cells with circuits and behavior to provide definitions of biological entities that span modalities of study. No more grouping neurons solely by one neurotransmitter they can release. That grouping could sometimes still be relevant, but as we’ve seen in the above studies, not always. Thinking about redefining the group formerly known as dopamine neurons, we also need to take a look back at the decades of previous literature with the perspective that hindsight provides. It’s not that the data in older dopamine neuron studies are wrong, but the conclusions may not be quite what we thought they were. Which could be a good thing. Many arcane arguments about exactly what dopamine neurons encode may actually end up being settled by understanding that they do many different things in different contexts. Don’t be alarmed: it may seem confusing, but this is the very normal process of science maturing. Not only is the process normal, it is absolutely crucial. Scientists must constantly question and revise our definitions to reflect significant conceptual advances.
Definitions can be confusing. They can also be rather boring, and I worry that they too often drive people away from science. When I was a beginning biology student, I spent hours upon hours making flashcards to help myself memorize what seemed like endless definitions. I viewed it as a tedious but necessary initiation to the biologists’ club. Basically, although it kind of stunk, I told myself that I had to learn the vocabulary to be able to discuss higher order issues with working scientists. What I’ve come to appreciate as I’ve progressed further in my career is how nuanced those once seemingly black-and-white, right-or-wrong definitions are – how much subtlety and history is packed into them. Scientific definitions, like the definition of a dopamine neuron, don’t just provide a common language; they structure the very nature of our investigations. We require this structure in order to proceed with our experiments, but, as we do so, we also need to be aware of the ways in which these definitions can limit us. We compare defined groups to each other. We talk about group averages. So exactly which things are included in our groups can dramatically affect how our data look and what we decide they mean. Thus, we must always be aware of the biases inherent in our categorizations. Perhaps definitions are not so boring after all! Discussions of these caveats could spice up intro course material quite a bit while teaching students how to think like real scientists.
The particular question of defining neuronal cell types actually turns out to be fairly timely. Just a couple weeks ago, the first interim report from the BRAIN initiative working group came out (see also Astra Bryant’s post on the topic). In it, nine high priority research areas for FY 2014 are outlined, the first of these being “generate a census of cell types.” The report recognizes the issues I’ve been discussing here:
There is not yet a consensus on what a neuronal type is, since a variety of factors including experience, connectivity and neuromodulators can diversify the molecular, electrical and structural properties of initially similar neurons. In some cases, there may not even be sharp boundaries separating subtypes from each other. Nonetheless, there is general agreement that types can be defined provisionally by invariant and generally intrinsic properties, and that this classification can provide a good starting point for a census. Thus, the census should begin with well-described large classes of neurons (e.g. excitatory pyramidal neurons of the cortex) and then proceed to finer categories within these classifications. This census would be taken with the knowledge that it will initially be incomplete, and will improve over iterations.
The answer to the question “What is a dopamine neuron?” isn’t quite forthcoming, but high-profile recognition of the question, and the funding that should follow it, is an important first step. Cheers to that.
Wednesday, July 3, 2013
In a brain that people love to describe as “awash with chemicals,” one chemical always seems to stand out. Dopamine: the molecule behind all our most sinful behaviors and secret cravings. Dopamine is love. Dopamine is lust. Dopamine is adultery. Dopamine is motivation. Dopamine is attention. Dopamine is feminism. Dopamine is addiction.
My, dopamine’s been busy.
Dopamine is the one neurotransmitter that everyone seems to know about. Vaughan Bell once called it the Kim Kardashian of molecules, but I don’t think that’s fair to dopamine. Suffice it to say, dopamine’s big. And every week or so, you’ll see a new article come out all about dopamine.
So is dopamine your cupcake addiction? Your gambling? Your alcoholism? Your sex life? The reality is dopamine has something to do with all of these. But it is none of them. Dopamine is a chemical in your body. That’s all. But that doesn’t make it simple.
What is dopamine? Dopamine is one of the chemical signals that pass information from one neuron to the next in the tiny spaces between them. When it is released from the first neuron, it floats into the space (the synapse) between the two neurons, and it bumps against receptors for it on the other side that then send a signal down the receiving neuron. That sounds very simple, but when you scale it up from a single pair of neurons to the vast networks in your brain, it quickly becomes complex. The effects of dopamine release depend on where it’s coming from, where the receiving neurons are going and what type of neurons they are, what receptors are binding the dopamine (there are five known types), and what role both the releasing and receiving neurons are playing.
And dopamine is busy! It’s involved in many different important pathways. But when most people talk about dopamine, particularly when they talk about motivation, addiction, attention, or lust, they are talking about the dopamine pathway known as the mesolimbic pathway, which starts with cells in the ventral tegmental area, buried deep in the middle of the brain, which send their projections out to places like the nucleus accumbens and the cortex. Increases in dopamine release in the nucleus accumbens occur in response to sex, drugs, and rock and roll. And dopamine signaling in this area is changed during the course of drug addiction. All abused drugs, from alcohol to cocaine to heroin, increase dopamine in this area in one way or another, and many people like to describe a spike in dopamine as “motivation” or “pleasure.” But that’s not quite it. Really, dopamine is signaling feedback for predicted rewards. If you, say, have learned to associate a cue (like a crack pipe) with a hit of crack, you will start getting increases in dopamine in the nucleus accumbens in response to the sight of the pipe, as your brain predicts the reward. But if you then don’t get your hit, well, then dopamine can decrease, and that’s not a good feeling. So you’d think that maybe dopamine predicts reward. But again, it gets more complex. For example, dopamine can increase in the nucleus accumbens in people with post-traumatic stress disorder when they are experiencing heightened vigilance and paranoia. So you might say, in this brain area at least, dopamine isn’t addiction or reward or fear. Instead, it’s what we call salience. Salience is more than attention: It’s a sign of something that needs to be paid attention to, something that stands out. This may be part of the mesolimbic role in attention deficit hyperactivity disorder and also a part of its role in addiction.
But dopamine itself? It’s not salience. It has far more roles in the brain to play. For example, dopamine plays a big role in starting movement, and the destruction of dopamine neurons in an area of the brain called the substantia nigra is what produces the symptoms of Parkinson’s disease. Dopamine also plays an important role as a hormone, inhibiting prolactin to stop the release of breast milk. Back in the mesolimbic pathway, dopamine can play a role in psychosis, and many antipsychotics for treatment of schizophrenia target dopamine. Dopamine is involved in the frontal cortex in executive functions like attention. In the rest of the body, dopamine is involved in nausea, in kidney function, and in heart function.
With all of these wonderful, interesting things that dopamine does, it gets my goat to see dopamine simplified to things like “attention” or “addiction.” After all, it’s so easy to say “dopamine is X” and call it a day. It’s comforting. You feel like you know the truth at some fundamental biological level, and that’s that. And there are always enough studies out there showing the role of dopamine in X to leave you convinced. But simplifying dopamine, or any chemical in the brain, down to a single action or result gives people a false picture of what it is and what it does. If you think that dopamine is motivation, then more must be better, right? Not necessarily! Because if dopamine is also “pleasure” or “high,” then too much is far too much of a good thing. If you think of dopamine as only being about pleasure or only being about attention, you’ll end up with a false idea of some of the problems involving dopamine, like drug addiction or attention deficit hyperactivity disorder, and you’ll end up with false ideas of how to fix them.
The other reason I don’t like the “dopamine is” craze is because the simplification takes away the wonder of dopamine. If you believe “dopamine is,” then you’d think that we’ve got it all figured out. You begin to wonder why we haven’t solved this addiction problem yet. Complexity means that the diseases associated with dopamine (or with any other chemical or part of the brain, for that matter) are often difficult to understand and even more difficult to treat.
By emphasizing dopamine’s complexity, it might feel like I’m taking away some of the glamour, the sexiness, of dopamine. But I don’t think so. The complexity of how a neurotransmitter behaves is what makes it wonderful. The simplicity of a single molecule and its receptors is what makes dopamine so flexible and what allows the resulting systems to be so complex. And it’s not just dopamine. While dopamine has just five receptor types, another neurotransmitter, serotonin, has 14 currently known and even more that are thought to exist. Other neurotransmitters have receptors with different subtypes, all expressed in different places, where each combination can produce a different result. There are many types of neurons, and they make billions and billions of connections. And all of this so you can walk, talk, eat, fall in love, get married, get divorced, get addicted to cocaine, and come out on top of your addiction some day. When you think of the sheer number of connections required simply for you to read and understand this sentence (from eyes to brain, to processing, to understanding, to movement as your fingers scroll down the page), you begin to feel a sense of awe. Our brain does all this, even while it makes us think about pepperoni pizza and what that text your crush sent really means. Complexity makes the brain the fascinating and mind-boggling thing that it is.
So dopamine has to do with addiction, whether to cupcakes or cocaine. It has to do with lust and love. It has to do with milk. It has to do with movement, motivation, attention, psychosis. Dopamine plays a role in all of these. But it is none of them, and we shouldn’t want it to be. Its complexity is what makes it great. It shows us what, with a single molecule, the brain can do.
TEHRAN (FNA)- When electrical pulses are applied to the ventral tegmental area of their brain, macaques presented with two images change their preference from one image to the other.
The study, by researchers Wim Vanduffel and John Arsenault (KU Leuven and Massachusetts General Hospital), is the first to confirm a causal link between activity in the ventral tegmental area and choice behaviour in primates.
The ventral tegmental area is located in the midbrain and helps regulate learning and reinforcement in the brain's reward system. It produces dopamine, a neurotransmitter that plays an important role in positive feelings, such as receiving a reward. "In this way, this small area of the brain provides learning signals," explains Professor Vanduffel. "If a reward is larger or smaller than expected, behavior is reinforced or discouraged accordingly."
This effect can be artificially induced: "In one experiment, we allowed macaques to choose multiple times between two images -- a star or a ball, for example. This told us which of the two visual stimuli they tended to naturally prefer. In a second experiment, we stimulated the ventral tegmental area with mild electrical currents whenever they chose the initially nonpreferred image. This quickly changed their preference. We were also able to manipulate their altered preference back to the original favorite."
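The logic of this preference-reversal experiment can be illustrated with a toy value-learning model. This is only a sketch under strong assumptions and is not the researchers' analysis: it assumes that stimulation acts like a reward of 1.0, that an unstimulated choice yields 0, and that choices follow a softmax rule. The image names, starting values, and all parameters are hypothetical.

```python
import math

def choice_prob(value_a, value_b, temperature=0.5):
    """Softmax probability of choosing option A over option B."""
    return 1.0 / (1.0 + math.exp(-(value_a - value_b) / temperature))

def run_block(values, stimulated, trials=50, learning_rate=0.2, stim_reward=1.0):
    """Return updated values after a block in which choosing `stimulated`
    is paired with stimulation (assumed to act like a reward)."""
    values = dict(values)
    other = next(name for name in values if name != stimulated)
    for _ in range(trials):
        p_stim = choice_prob(values[stimulated], values[other])
        # Expected (probability-weighted) updates keep the sketch deterministic.
        values[stimulated] += learning_rate * p_stim * (stim_reward - values[stimulated])
        values[other] += learning_rate * (1.0 - p_stim) * (0.0 - values[other])
    return values

# Hypothetical starting preference: the star is favored over the ball.
baseline = {"star": 0.6, "ball": 0.2}

after_stim_ball = run_block(baseline, stimulated="ball")         # pair stimulation with the nonpreferred image
after_stim_star = run_block(after_stim_ball, stimulated="star")  # then reverse the pairing

print(choice_prob(baseline["ball"], baseline["star"]))
print(choice_prob(after_stim_ball["ball"], after_stim_ball["star"]))
print(choice_prob(after_stim_star["ball"], after_stim_star["star"]))
```

In this toy model the probability of picking the ball starts below one half, rises above one half after the first block, and falls back below one half after the pairing is reversed, mirroring the switch and switch-back described in the quote.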
The study, which will be published online in the journal Current Biology on 16 June, is the first to confirm a causal link between activity in the ventral tegmental area and choice behaviour in primates. "In scans we found that electrically stimulating this tiny brain area activated the brain's entire reward system, just as it does spontaneously when a reward is received. This has important implications for research into disorders relating to the brain's reward network, such as addiction or learning disabilities."
Could this method be used in the future to manipulate our choices? "Theoretically, yes. But the ventral tegmental area is very deep in the brain. At this point, stimulating it can only be done invasively, by surgically placing electrodes -- just as is currently done for deep brain stimulation to treat Parkinson's or depression. Once non-invasive methods -- light or ultrasound, for example -- can be applied with a sufficiently high level of precision, they could potentially be used for correcting defects in the reward system, such as addiction and learning disabilities."
COMMENTS: Detailed review of dopamine and the nucleus accumbens in reward and aversion.
The nucleus accumbens (NAc) is a critical element of the mesocorticolimbic system, a brain circuit implicated in reward and motivation. This basal forebrain structure receives dopamine (DA) input from the ventral tegmental area (VTA) and glutamate (GLU) input from regions including the prefrontal cortex (PFC), amygdala (AMG), and hippocampus (HIP). As such, it integrates inputs from limbic and cortical regions, linking motivation with action. The NAc has a well-established role in mediating the rewarding effects of drugs of abuse and natural rewards such as food and sexual behavior. However, accumulating pharmacological, molecular, and electrophysiological evidence has raised the possibility that it also plays an important (and sometimes underappreciated) role in mediating aversive states. Here we review evidence that rewarding and aversive states are encoded in the activity of NAc medium spiny GABAergic neurons, which account for the vast majority of the neurons in this region. While admittedly simple, this working hypothesis is testable using combinations of available and emerging technologies, including electrophysiology, genetic engineering, and functional brain imaging. A deeper understanding of the basic neurobiology of mood states will facilitate the development of well-tolerated medications that treat and prevent addiction and other conditions (e.g., mood disorders) associated with dysregulation of brain motivation systems.
The biological basis of mood-related states such as reward and aversion is not understood. Classical formulations of these states implicate the mesocorticolimbic system, comprising brain areas including the NAc, VTA, and PFC, in reward (Bozarth and Wise, 1981; Goeders and Smith, 1983; Wise and Rompré, 1989). Other brain areas, including the amygdala, periaqueductal gray, and the locus coeruleus, are often implicated in aversion (Aghajanian, 1978; Phillips and LePaine, 1980; Bozarth and Wise, 1983). However, the notion that certain brain areas narrowly and rigidly mediate reward or aversion is becoming archaic. The development of increasingly sophisticated tools and methodologies has enabled new approaches that provide evidence for effects that previously would have been difficult (if not impossible) to detect. As one example from our own work, we have found that a prominent neuroadaptation triggered in the NAc by exposure to drugs of abuse (activation of the transcription factor CREB) contributes to depressive-like and aversive states in rodents (for review, see Carlezon et al., 2005). Other work suggests that changes in the activity of dopaminergic neurons in the VTA, which provides inputs to the NAc that are integrated with glutamatergic inputs from areas such as the PFC, AMG, and HIP, can also encode both rewarding and aversive states (Liu et al., 2008).
In this review, we will focus on the role of the NAc in simple states of reward and aversion. The role of NAc activity in more complex states such as drug-craving and drug-seeking is beyond the scope of this review, since these states depend upon experience-dependent neuroadaptations and do not easily map onto basic conceptualizations of rewarding and aversive states. An improved understanding of the neurobiology of reward and aversion is critical to the treatment of complex disorders like addiction. This question is particularly important as the field utilizes accumulated knowledge from decades of research on drugs of abuse to move toward the rational design of treatments for addictive disorders. The requirement for new medications goes beyond the mere reduction of drug-craving, drug-seeking, or other addictive behaviors. To be an effective therapeutic, a medication must be tolerated by the addicted brain, or compliance (sometimes called adherence) will be poor. There are already examples of medications (e.g., naltrexone) that would appear on the basis of animal data to have extraordinary potential for reducing intake of alcohol and opiates—except that addicts often report aversive effects and discontinue treatment (Weiss et al., 2004). Methods to predict rewarding or aversive responses in normal and addicted brains would accelerate the pace of drug discovery, medication development, and recovery from addiction. Here we review evidence for the simple working hypothesis that rewarding and aversive states are encoded by the activity of NAc medium spiny GABAergic neurons.
The NAc comprises the ventral components of the striatum. It is widely accepted that there are two major functional components of the NAc, the core and the shell, which are characterized by differential inputs and outputs (see Zahm, 1999; Kelley, 2004; Surmeier et al., 2007). Recent formulations further divide these two components into additional subregions (including the cone and the intermediate zone of the NAc shell) (Todtenkopf and Stellar, 2000). As in the dorsal striatum, GABA-containing medium spiny neurons (MSNs) make up the vast majority (~90–95%) of cells in the NAc, with the remaining cells being cholinergic and GABAergic interneurons (Meredith, 1999). Striatal regions contain subpopulations of these MSNs: those of so-called “direct” and “indirect” pathways (Gerfen et al., 1990; Surmeier et al., 2007). The MSNs of the direct pathway predominantly co-express dopamine D1-like receptors and the endogenous opioid peptide dynorphin, and project directly back to the midbrain (substantia nigra/VTA). In contrast, the MSNs of the indirect pathway predominantly co-express dopamine D2-like receptors and the endogenous opioid peptide enkephalin, and project indirectly to the midbrain via areas including the ventral pallidum and the subthalamic nucleus. Traditional formulations posit that dopamine actions at D1-like receptors, which are coupled to the G-protein Gs (stimulatory) and associated with activation of adenylate cyclase, tend to excite the MSNs of the direct pathway (Albin et al., 1989; Surmeier et al., 2007). Elevated activity of these cells would be expected to provide increased GABAergic and dynorphin (an endogenous ligand at κ-opioid receptors) input to the mesolimbic system and negative feedback on midbrain dopamine cells. 
In contrast, dopamine actions at D2-like receptors, which are coupled to Gi (inhibitory) and associated with inhibition of adenylate cyclase, tend to inhibit the MSNs of the indirect pathway (Albin et al., 1989; Surmeier et al., 2007). Inhibition of these cells would be expected to reduce GABAergic and enkephalin (an endogenous ligand at δ-opioid receptors) input to the ventral pallidum, a region that normally inhibits subthalamic cells that activate inhibitory inputs to the thalamus. Through multiple synaptic connections, inhibition of the indirect pathway at the level of the NAc would ultimately activate the thalamus (see Kelley, 2004).
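The chain of synaptic signs in the indirect pathway can be checked with a small bookkeeping sketch. This is only a qualitative illustration: the node labels are taken from the description above, the `propagate` helper and the +1/-1 coding are assumptions introduced here, and magnitudes, timing, and convergent inputs are ignored.

```python
# Toy sign-propagation sketch of the indirect pathway described above.
# Each link is (source, target, sign): -1 = inhibitory, +1 = excitatory.
# The link list and the helper are illustrative assumptions, not a circuit model.
INDIRECT_PATHWAY = [
    ("NAc indirect MSNs", "ventral pallidum", -1),
    ("ventral pallidum", "subthalamic nucleus", -1),
    ("subthalamic nucleus", "inhibitory output to thalamus", +1),
    ("inhibitory output to thalamus", "thalamus", -1),
]


def propagate(initial_change, pathway):
    """Return the qualitative activity change (+1 up, -1 down) at each node."""
    changes = {pathway[0][0]: initial_change}
    current = initial_change
    for _source, target, sign in pathway:
        # An inhibitory link flips the direction of change; an excitatory link keeps it.
        current = current * sign
        changes[target] = current
    return changes


# Dopamine acting at D2-like receptors inhibits indirect-pathway MSNs (-1).
effects = propagate(-1, INDIRECT_PATHWAY)
for node, change in effects.items():
    print(f"{node}: {'increased' if change > 0 else 'decreased'}")
```

Under these assumptions an odd number of sign flips separates the MSNs from the thalamus, which is why inhibiting the indirect pathway at the NAc ends in thalamic activation.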
Like neurons throughout the brain, MSNs also express glutamate-sensitive AMPA and NMDA receptors. These receptors enable glutamate inputs from brain areas such as AMG, HIP, and deep (infralimbic) layers of the PFC (O’Donnell and Grace, 1995; Kelley et al., 2004; Grace et al., 2007) to activate NAc MSNs. Dopamine and glutamate inputs can influence one another: for example, stimulation of D1-like receptors can trigger phosphorylation of glutamate (AMPA and NMDA) receptor subunits, thereby regulating their surface expression and subunit composition (Snyder et al., 2000; Chao et al., 2002; Mangiavacchi et al., 2004; Chartoff et al., 2006; Hallett et al., 2006; Sun et al., 2008). Thus the NAc is involved in a complex integration of excitatory glutamate inputs, sometimes excitatory dopamine (D1-like) inputs, and sometimes inhibitory dopamine (D2-like) inputs. Considering that the VTA tends to have a uniform response (activation) to both rewarding (e.g., morphine; see Di Chiara and Imperato, 1988; Leone et al., 1991; Johnson and North, 1992) and aversive (Dunn, 1988; Herman et al., 1988; Kalivas and Duffy, 1989; McFarland et al., 2004) stimuli, the ability of the NAc to integrate these excitatory and inhibitory signals downstream of mesolimbic dopamine neurons likely plays a key role in attaching valence and regulating mood.
It is well accepted that the NAc plays a key role in reward. Theories about its role in motivation have been a critical element in our understanding of addiction (e.g., Bozarth and Wise, 1987; Rompré and Wise, 1989). There are three primary lines of evidence implicating the NAc in reward, involving pharmacological, molecular, and electrophysiological approaches.
It is well established that drugs of abuse (Di Chiara and Imperato, 1988) and natural rewards (Fibiger et al., 1992; Pfaus, 1999; Kelley, 2004) have the common action of elevating extracellular concentrations of dopamine in the NAc. Moreover, lesions of the NAc reduce the rewarding effects of stimulants and opiates (Roberts et al., 1980; Kelsey et al., 1989). Pharmacology studies in rats (e.g., Caine et al., 1999) and monkeys (e.g., Caine et al., 2000) suggest that D2-like receptor function plays a critical role in reward. However, it has been studies involving the direct microinfusion of drugs into this area that have provided the strongest evidence for its role in rewarding states. For example, rats will self-administer the dopamine releasing agent amphetamine directly into the NAc (Hoebel et al., 1983), demonstrating the reinforcing effects of elevating extracellular dopamine in this region. Rats will also self-administer the dopamine reuptake inhibitor cocaine into the NAc, although this effect is surprisingly weak in comparison to that reported with amphetamine (Carlezon et al., 1995). This observation has led to speculation that the rewarding effects of cocaine are mediated outside the NAc, in areas including the olfactory tubercle (Ikemoto, 2003). However, rats will avidly self-administer the dopamine reuptake inhibitor nomifensine into the NAc (Carlezon et al., 1995), suggesting that the local anesthetic properties of cocaine complicate studies in which the drug is applied directly to neurons. Co-infusion of the dopamine D2-selective antagonist sulpiride attenuates intracranial self-administration of nomifensine, demonstrating a key role for D2-like receptors in the rewarding effects of intra-NAc microinfusions of this drug.
When considered together with evidence from a variety of other studies (for review, see Rompré and Wise, 1989), these studies are entirely consistent with theories prevailing in the 1980s that dopamine actions in the NAc play a necessary and sufficient role in reward and motivation.
While there is little controversy that dopamine actions in the NAc are sufficient for reward, other work began to challenge the notion that they are necessary. For example, rats will self-administer morphine directly into the NAc (Olds, 1982), away from the trigger zone (the VTA) in which the drug acts to elevate extracellular dopamine in the NAc (Leone et al., 1991; Johnson and North, 1992). Considering that μ- and δ-opioid receptors are located directly on NAc MSNs (Mansour et al., 1995), these data were the first to suggest that reward can be triggered by events occurring in parallel with (or downstream of) those triggered by dopamine. Rats will also self-administer phencyclidine (PCP), a complex drug that is a dopamine reuptake inhibitor and a non-competitive NMDA antagonist, directly into the NAc (Carlezon and Wise, 1996). Two lines of evidence suggest that this effect is not dopamine-dependent. First, intracranial self-administration of PCP is not affected by co-infusion of the dopamine D2-selective antagonist sulpiride; and second, rats will self-administer other non-competitive (MK-801) or competitive (CPP) NMDA antagonists with no direct effects on dopamine systems directly into the NAc (Carlezon and Wise, 1996). These data provided early evidence that blockade of NMDA receptors in the NAc is sufficient for reward and, by extension, that reward can be dopamine-independent. Blockade of NMDA receptors would be expected to produce an overall reduction in the excitability of NAc MSNs without affecting baseline excitatory input mediated by AMPA receptors (Uchimura et al., 1989; Pennartz et al., 1990). Importantly, rats also self-administered NMDA antagonists into deep layers of the PFC (Carlezon and Wise, 1996), which project directly to the NAc (see Kelley, 2004) and have been conceptualized as a part of an inhibitory (“STOP!”) motivational circuit (Childress, 2006).
When considered together, these studies provided two critical pieces of evidence that have played a prominent role in the formulation of our current working hypothesis: first, that dopamine-dependent reward is attenuated by blockade of D2-like receptors, which are inhibitory receptors expressed predominately in the NAc on the MSNs of the indirect pathway; and second, that events that would be expected to reduce the overall excitability of the NAc (e.g., stimulation of Gi-coupled opioid receptors, reduced stimulation of excitatory NMDA receptors, reduced excitatory input) are sufficient for reward. This interpretation led to the development of a model of reward in which the critical event is reduced activation of MSNs in the NAc (Carlezon and Wise, 1996).
Other pharmacological evidence supports this theory, and implicates calcium (Ca2+) and its second messenger functions. Activated NMDA receptors gate Ca2+, an intracellular signaling molecule that can affect membrane depolarization, neurotransmitter release, signal transduction, and gene regulation (see Carlezon and Nestler, 2002; Carlezon et al., 2005). Microinjection of the L-type Ca2+ antagonist diltiazem directly into the NAc increases the rewarding effects of cocaine (Chartoff et al., 2006). The mechanisms by which diltiazem-induced alterations in Ca2+ influx affect reward are unknown. One possibility is that blockade of Ca2+ influx through voltage-operated L-type channels reduces the firing rate of neurons within the ventral NAc (Cooper and White, 2000). It is important to note, however, that diltiazem alone was not rewarding, at least at the doses tested in these studies. This might indicate that baseline levels of Ca2+ influx via L-type channels within the NAc are normally low, and difficult to reduce further. A related possibility is that microinjection of diltiazem reduces aversive actions of cocaine that are mediated within the NAc, unmasking reward. For example, activity of the transcription factor cAMP response element binding protein (CREB) within the NAc is associated with aversive states and reductions in cocaine reward (Pliakas et al., 2001; Nestler and Carlezon, 2006). The activation of CREB depends on phosphorylation, which can occur via activation of L-type Ca2+ channels (Rajadhyaksha et al., 1999). Phosphorylated CREB can induce expression of dynorphin, a neuropeptide that might contribute to aversive states via activation of κ-opioid receptors in the NAc (for review, see Carlezon et al., 2005). The potential role of intra-NAc Ca2+ in regulating rewarding and aversive states is a common theme in our work that will be explained in greater detail below.
Mice lacking dopamine D2-like receptors have reduced sensitivity to the rewarding effects of cocaine (Welter et al., 2007). Ablation of D2-like receptors also reduces the rewarding effects of morphine (Maldonado et al., 1997), presumably by reducing the ability of the drug to stimulate dopamine via VTA mechanisms (Leone et al., 1991; Johnson and North, 1992), and of lateral hypothalamic brain stimulation (Elmer et al., 2005). One interpretation of these findings is that loss of D2-like receptors in the NAc reduces the ability of dopamine to inhibit the indirect pathway, a putative mechanism of reward. These findings, when combined with evidence that human addicts have reduced dopamine D2-like receptor binding in the NAc, suggest that this receptor plays an essential role in encoding reward (Volkow et al., 2007).
Other advances in molecular biology have enabled the detection of neuroadaptive responses to drugs of abuse and the ability to mimic such changes in discrete brain areas to examine their significance. One such change is in the expression of AMPA-type glutamate receptors, which are expressed ubiquitously in the brain and composed of various combinations of the receptor subunits GluR1-4 (Hollmann et al., 1991; Malinow and Malenka, 2002). Drugs of abuse can alter GluR expression in the NAc. For example, repeated intermittent exposure to cocaine elevates GluR1 expression in the NAc (Churchill et al., 1999). Furthermore, GluR2 expression is elevated in the NAc of mice engineered to express ΔFosB, a neuroadaptation linked with increased sensitivity to drugs of abuse (Kelz et al., 1999). Studies in which viral vectors were used to elevate GluR1 selectively in the NAc indicate that this neuroadaptation tends to make cocaine aversive in place conditioning tests, whereas elevated GluR2 in the NAc increases cocaine reward (Kelz et al., 1999). Potential explanations for this pattern of findings likely involve Ca2+ and its effect on neuronal activity and intracellular signaling. Increased GluR1 expression favors formation of GluR1-homomeric (or GluR1-GluR3 heteromeric) AMPARs, which are Ca2+-permeable (Hollmann et al., 1991; Malinow and Malenka, 2002). In contrast, GluR2 contains a motif that prevents Ca2+ influx; thus increased expression of GluR2 would favor formation of GluR2-containing Ca2+-impermeable AMPARs (and theoretically decrease the number of Ca2+-permeable AMPARs). Thus GluR2-containing AMPARs have physiological properties that render them functionally distinct from those lacking this subunit, particularly with respect to their interactions with Ca2+ (Fig. 1).
These early studies used place conditioning, which generally requires repeated exposure to drugs of abuse and presumably involves cycles of reward and aversion (withdrawal). More recent studies examined how alterations in GluR expression modeling those acquired through repeated drug exposure affect intracranial self-stimulation (ICSS), an operant task in which the magnitude of the reinforcer (brain stimulation reward) is precisely controlled (Wise, 1996). Elevated expression of GluR1 in NAc shell increases ICSS thresholds, whereas elevated GluR2 decreases them (Todtenkopf et al., 2006). The effect of GluR2 on ICSS is qualitatively similar to that caused by drugs of abuse (Wise, 1996), suggesting that it reflects increases in the rewarding impact of the stimulation. In contrast, the effect of GluR1 is qualitatively similar to that caused by prodepressive treatments including drug withdrawal (Markou et al., 1992) and κ-opioid receptor agonists (Pfeiffer et al., 1986; Wadenberg, 2003; Todtenkopf et al., 2004; Carlezon et al., 2006), suggesting that it reflects decreases in the rewarding impact of the stimulation. These findings indicate that elevated expression of GluR1 and of GluR2 in NAc shell have markedly different consequences for motivated behavior. Moreover, they confirm previous observations that elevated GluR1 and GluR2 expression in NAc shell have opposite effects in cocaine place conditioning studies (Kelz et al., 1999), and extend the generalizability of these effects to behaviors that are not motivated by drugs of abuse. Perhaps most importantly, they provide more evidence to implicate Ca2+ flux within the NAc in reduced reward or elevated aversion. Because Ca2+ plays a role in both neuronal depolarization and gene regulation, alterations in GluR expression and AMPAR subunit composition in NAc shell likely initiate physiological and molecular responses, which presumably interact to alter motivation.
Again, the mechanisms by which Ca2+ signal transduction might trigger genes involved in aversive states are described in detail below.
Several lines of electrophysiological investigation support the idea that decreases in NAc firing may be related to reward. First, rewarding stimuli produce NAc inhibitions in vivo. Second, neurobiological manipulations that specifically promote inhibition of NAc firing appear to enhance rewarding effects of stimuli. Third, the inhibition of NAc GABAergic MSNs can disinhibit downstream structures such as the ventral pallidum to produce signals related to the hedonic qualities of stimuli. Each of these lines of investigation will be addressed in turn. The most substantial line of investigation involves studies of NAc single-unit activity in rodent paradigms where a wide variety of drug and non-drug rewards are delivered. A consistent finding across these studies is that the most commonly observed pattern of firing modulation is a transient inhibition. This has been observed during self-administration of many different types of rewarding stimuli including cocaine (Peoples and West, 1996), heroin (Chang et al., 1997), ethanol (Janak et al., 1999), sucrose (Nicola et al., 2004), food (Carelli et al., 2000) and electrical stimulation of the medial forebrain bundle (Cheer et al., 2005). Though not as commonly investigated as self-administration paradigms, the inhibition-reward effect is also present in awake, behaving animals where rewards are delivered without requirement for an operant response (Roitman et al., 2005; Wheeler et al., 2008). These studies indicate that the transient inhibitions need not be directly related to motor output, but may be more directly tied to a rewarding or motivationally activated state. As ubiquitous as the NAc inhibition-reward relationship seems to be, however, there are counterexamples. 
For instance, Taha and Fields (2005) found that of those NAc neurons that appeared to encode palatability in a sucrose solution-drinking discrimination task, excitations outnumbered inhibitions, and the total number of such neurons was small (~10% of all neurons recorded). This discrepancy from what appears to be the typical NAc activity pattern highlights the need for techniques to identify the connectivity and biochemical composition of cells recorded in vivo. As these techniques become available, unique functional subclasses of NAc neurons will most likely be identified and a more detailed model of NAc function can be constructed.
How are the transient reward-related inhibitions of NAc firing generated? Because rewarding stimuli are known to produce transient elevations in extracellular dopamine, one straightforward hypothesis is that dopamine may be responsible. In fact, findings from in vitro and in vivo studies using iontophoretic application and other methods indicate that dopamine is capable of inhibiting NAc firing (reviewed in Nicola et al., 2000, 2004). Recent studies examining simultaneous dopamine electrochemical and single unit responses (the majority of which are inhibitions) in an ICSS paradigm indicate that these parameters show a high degree of concordance in the NAc shell (Cheer et al., 2007). On the other hand, it is now clear that dopamine can have marked excitatory effects as well as inhibitory effects in behaving animals (Nicola et al., 2000, 2004). In addition, while inactivating VTA to interfere with dopamine release in NAc blocks both the cue-induced excitations and inhibitions, it does not affect reward-related inhibitions themselves (Yun et al., 2004a). The combination of these findings suggests that while dopamine may contribute to reward-related inhibition of NAc firing, there must be other factors that can drive it as well. Although there has been much less investigation of other potential contributors, additional candidates include the release of acetylcholine and the activation of μ-opioid receptors in the NAc, both of which have been shown to occur under rewarding conditions (Trujillo et al., 1988; West et al., 1989; Mark et al., 1992; Imperato et al., 1992; Guix et al., 1992; Bodnak et al., 1995; Kelley et al., 1996) and both of which have the ability to inhibit NAc firing (McCarthy et al., 1977; Hakan et al., 1989; de Rover et al., 2002).
Another newer line of electrophysiological evidence supporting the inhibition/reward hypothesis comes from experiments in which molecular genetics approaches have been used to manipulate the excitable properties of NAc neurons. The clearest example of this so far is for viral-mediated overexpression of mCREB (dominant negative CREB), a repressor of CREB activity, in the NAc. This treatment was recently shown to cause decreases in the intrinsic excitability of NAc MSNs, as indicated by the fact that neurons recorded in the NAc exhibited fewer spikes in response to a given depolarizing current injection (Dong et al., 2006). As noted above, NAc mCREB overexpression is not only associated with enhanced rewarding effects of cocaine (Carlezon et al., 1998) but also with a decrease in depressive-like behavioral effects in the forced-swim task (Pliakas et al., 2001) and a learned-helplessness paradigm (Newton et al., 2002). The combination of these findings is consistent with the idea that conditions that facilitate a transition to lower firing rates in NAc neurons also facilitate reward processes and/or elevate mood.
On the other hand, deletion of the Cdk5 gene specifically in the NAc core region produced an enhanced cocaine reward phenotype (Benavides et al., 2007). This phenotype correlated with an increase in excitability in NAc MSNs. This contrasts with the mCREB effect, which was most robust when CREB function was inhibited in the shell region, rather than the core (Carlezon et al., 1998). Considered along with other evidence, these studies highlight the importance of distinguishing between inhibition of NAc activity in the shell region, which appears to be associated with reward, versus the core region, where it may not.
Finally, the hypothesis relating NAc inhibition to reward is supported by the study of the relationship between neural activity in NAc target structures and reward. Considering that NAc MSNs are GABAergic projection neurons, inhibition of firing in these cells should disinhibit target regions. As mentioned above, one structure that receives a dense projection from the NAc shell is the ventral pallidum. Elegant electrophysiological studies have demonstrated that elevated activity in ventral pallidal neurons can encode the hedonic impact of a stimulus (Tindell et al., 2004, 2006). For example, among neurons that responded to sucrose reward (between 30–40% of total recorded units), receipt of a sucrose reward produced a robust, transient increase in firing—an effect that persisted throughout training (Tindell et al., 2004). In a subsequent study, the investigators used a clever procedure to manipulate the hedonic value of a taste stimulus to assess whether activity in pallidal neurons would track this change (Tindell et al., 2006). Although hypertonic saline solutions are typically aversive taste stimuli, in salt-deprived humans or experimental animals their palatability is increased. Both behavioral measures of positive hedonic response (i.e. facial taste reactivity measures) and increases in pallidal neuron firing occurred in response to a hypertonic saline taste stimulus in sodium-deprived animals, but not in animals maintained on a normal diet. Thus, increased firing of pallidal neurons, downstream targets of NAc efferents, appears to encode a key feature of reward. Of course, it is possible that other inputs to pallidal neurons could contribute to these reward-related firing patterns. 
However, recent studies have indicated a strong relationship between the ability of μ-opioid receptor activation (a factor which is known to inhibit MSN firing) in discrete regions of the NAc shell to drive increases in behavioral response to a hedonic stimulus and its ability to activate c-fos in discrete regions of ventral pallidum (Smith et al., 2007). This apparently tight coupling between NAc and pallidal “hedonic hotspots” is an intriguing new phenomenon that is just beginning to be explored.
The fact that the NAc also plays a role in aversion is sometimes underappreciated. Pharmacological treatments have been used to demonstrate aversion after NAc manipulations. In addition, molecular approaches have demonstrated that exposure to drugs of abuse and stress cause common neuroadaptations that can trigger signs (including anhedonia, dysphoria) that characterize depressive illness (Nestler and Carlezon, 2006), which is often co-morbid with addiction and involves dysregulated motivation.
Some of the earliest evidence that the NAc plays a role in aversive states came from studies involving opioid receptor antagonists. Microinjections of a wide-spectrum opioid receptor antagonist (methylnaloxonium) into the NAc of opiate-dependent rats establish conditioned place aversions (Stinus et al., 1990). In opiate-dependent rats, precipitated withdrawal can induce immediate-early genes and transcription factors in the NAc (Gracy et al., 2001; Chartoff et al., 2006), suggesting activation of MSNs. Selective κ-opioid agonists, which mimic the effects of the endogenous κ-opioid ligand dynorphin, also produce aversive states. Microinjections of a κ-opioid agonist into the NAc cause conditioned place aversions (Bals-Kubik et al., 1993) and elevate ICSS thresholds (Chen et al., 2008). Inhibitory (Gi-coupled) κ-opioid receptors are localized on the terminals of VTA dopamine inputs to the NAc (Svingos et al., 1999), where they regulate local dopamine release. As such, they are often in apposition to μ- and δ-opioid receptors (Mansour et al., 1995), and their stimulation produces effects opposite to those of agonists at these other receptors in behavioral tests. Indeed, extracellular concentrations of dopamine are reduced in the NAc by systemic administration (Di Chiara and Imperato, 1988; Carlezon et al., 2006) or local microinfusions of κ-opioid agonist (Donzati et al., 1992; Spanagel et al., 1992). Decreased function of midbrain dopamine systems has been associated with depressive states including anhedonia in rodents (Wise, 1982) and dysphoria in humans (Mizrahi et al., 2007). Thus one path to aversion appears to be reduced dopamine input to the NAc, which would reduce the stimulation of inhibitory dopamine D2-like receptors that seem critical for reward (Carlezon and Wise, 1996).
Other studies appear to confirm an important role of dopamine D2-like receptors in suppressing aversive responses. Microinjections of a dopamine D2-like antagonist into the NAc of opiate-dependent rats precipitate signs of somatic opiate withdrawal (Harris and Aston-Jones, 1994). Although the motivational effects were not measured in this study, treatments that precipitate opiate withdrawal often cause aversive states more potently than they cause somatic signs of withdrawal (Gracy et al., 2001; Chartoff et al., 2006). Interestingly, however, microinjections of a dopamine D1-like agonist into the NAc also produce somatic signs of withdrawal in opiate-dependent rats. The data demonstrate that another path to aversion is increased stimulation of excitatory dopamine D1-like receptors in rats with opiate dependence-induced neuroadaptations in the NAc. Perhaps not surprisingly, one consequence of D1-like receptor stimulation in opiate-dependent rats is phosphorylation of GluR1 (Chartoff et al., 2006), which would lead to increased surface expression of AMPA receptors on the MSNs of the direct pathway.
Exposure to drugs of abuse (Turgeon et al., 1997) and stress (Pliakas et al., 2001) activate the transcription factor CREB in the NAc. Viral vector-induced elevation of CREB function in the NAc reduces the rewarding effects of drugs (Carlezon et al., 1998) and hypothalamic brain stimulation (Parsegian et al., 2006), indicating anhedonia-like effects. It also makes low doses of cocaine aversive (a putative sign of dysphoria), and increases immobility behavior in the forced swim test (a putative sign of “behavioral despair”) (Pliakas et al., 2001). Many of these effects can be attributed to CREB-regulated increases in dynorphin function (Carlezon et al., 1998). Indeed, κ-opioid receptor-selective agonists have effects that are qualitatively similar to those produced by elevated CREB function in the NAc, producing signs of anhedonia and dysphoria in reward models and increased immobility in the forced swim test (Bals-Kubik et al., 1993; Carlezon et al., 1998; Pliakas et al., 2001; Mague et al., 2003; Carlezon et al., 2006). In contrast, κ-selective antagonists produce an antidepressant-like phenotype resembling that seen in animals with disrupted CREB function in the NAc (Pliakas et al., 2001; Newton et al., 2002; Mague et al., 2003). These findings suggest that one biologically important consequence of drug- or stress-induced activation of CREB within the NAc is increased transcription of dynorphin, which triggers key signs of depression. Dynorphin effects are likely mediated via stimulation of κ-opioid receptors that act to inhibit neurotransmitter release from mesolimbic dopamine neurons, thereby reducing the activity of VTA neurons, as explained above. This path to aversion appears to be reduced dopamine input to the NAc, which would produce reductions in the stimulation of inhibitory dopamine D2-like receptors that seem critical for reward (Carlezon and Wise, 1996).
As explained below, there is also evidence that elevated expression of CREB in the NAc directly increases the excitability of MSNs (Dong et al., 2006) in addition to the loss of D2-regulated inhibition, raising the possibility that multiple effects contribute to the aversive responses.
Repeated exposure to drugs of abuse can elevate GluR1 expression in the NAc (Churchill et al., 1999). Viral vector-induced elevation of GluR1 in the NAc increases drug aversion in place conditioning studies, an “atypical” type of drug sensitization (i.e., heightened sensitivity to the aversive rather than the rewarding aspects of cocaine). This treatment also increases ICSS thresholds (Todtenkopf et al., 2006), indicating anhedonia-like and dysphoria-like effects. Interestingly, these motivational effects are virtually identical to those caused by elevated CREB function in the NAc. These similarities raise the possibility that both effects are part of the same larger process. In one possible scenario, drug exposure might trigger changes in the expression of GluR1 in the NAc, which would lead to local increases in the surface expression of Ca2+-permeable AMPA receptors, which would increase Ca2+ influx and activate CREB, leading to alterations in sodium channel expression that affect baseline and stimulated excitability of MSNs in the NAc (Carlezon and Nestler, 2002; Carlezon et al., 2005; Dong et al., 2006). Alternatively, early changes in CREB function might precede alterations in GluR1 expression. These relationships are currently under intensive study in several NIDA-funded laboratories, including our own.
Although there has been little electrophysiological investigation of the hypothesis that widespread excitation of NAc neurons encodes information about aversive stimuli, the available data essentially mirror those for rewarding stimuli. First, two recent studies using aversive taste stimuli both indicate that three times as many NAc neurons respond to the stimuli with clear excitations as with inhibitions (Roitman et al., 2005; Wheeler et al., 2008). Interestingly, these same studies find that units that respond to a sucrose or saccharin reward show the exact opposite profile: three times more cells with decreases in firing than with increases. In addition, when an initially rewarding saccharin stimulus was made aversive by pairing it with the opportunity to self-administer cocaine, the predominant firing pattern of NAc units that responded to the stimulus shifted from inhibition to excitation (Wheeler et al., 2008). Thus, these findings suggest not only that the NAc may encode aversive states as increases in firing, but also that individual NAc neurons can track the hedonic valence of a stimulus by varying their firing-rate response to it.
Second, molecular genetic manipulations of synaptic and intrinsic membrane properties that increase the excitability of NAc neurons can shift the behavioral response to a stimulus from rewarding to aversive. For example, viral-mediated overexpression of CREB in NAc produces an increase in neuronal excitability in MSNs as indicated by an increase in the number of spikes in response to a given depolarizing current pulse (Dong et al., 2006). Under these conditions of enhanced NAc excitability, animals exhibit a conditioned place aversion to cocaine, rather than the place preference response that control animals show to the same dose (Pliakas et al., 2001). In addition, they exhibit increased depressive-like behaviors in the forced swim test (Pliakas et al., 2001) and a learned helplessness paradigm (Newton et al., 2002). Another molecular manipulation that produces a similar behavioral phenotype is the overexpression of the AMPAR subunit GluR1 in NAc (Kelz et al., 1999; Todtenkopf et al., 2006). Although it has not yet been confirmed by electrophysiological study, this GluR1 overexpression is likely to produce an enhancement of synaptic excitability in NAc MSNs. Not only may this occur through the insertion of additional AMPARs in the membrane in general, but the abundance of GluR1 could potentially lead to the formation of GluR1 homomeric receptors, which are known to have a larger single-channel conductance (Swanson et al., 1997) and thus contribute even further to enhanced excitability.
Third, if NAc firing is elevated during aversive conditions, downstream targets should be suppressed via GABA release from MSNs during these conditions as well. Ventral pallidal unit recordings show very low firing rates following oral infusion of hypertonic saline, a taste stimulus that under normal physiological circumstances is aversive (Tindell et al., 2006). Although clearly more work with aversive stimuli of different modalities is needed to draw any firm conclusions, the present data are consistent with the possibility that enhanced firing of NAc neurons during aversive conditions may suppress pallidal neuron firing as part of the process of encoding the unpleasant nature of a stimulus.
Based on the evidence described above, our working hypothesis is that rewarding stimuli reduce the activity of NAc MSNs, whereas aversive treatments increase the activity of these neurons. According to this model (Fig. 2), NAc neurons tonically inhibit reward-related processes. Under normal circumstances, excitatory influences mediated by glutamate actions at AMPA and NMDA receptors or dopamine actions at D1-like receptors are balanced by inhibitory dopamine actions at D2-like receptors. Treatments that would be expected to reduce activity in the NAc, including cocaine (Peoples et al., 2007), morphine (Olds et al., 1982), NMDA antagonists (Carlezon et al., 1996), L-type Ca2+ antagonists (Chartoff et al., 2006), palatable food (Wheeler et al., 2008) and expression of dominant-negative CREB (Dong et al., 2006), have reward-related effects because they reduce the inhibitory influence of the NAc on downstream reward pathways. In contrast, treatments that activate the NAc by amplifying glutamatergic inputs (e.g., elevated expression of GluR1; Todtenkopf et al., 2006), altering ion channel function (e.g., elevated expression of CREB; Dong et al., 2006), reducing inhibitory dopamine inputs to D2-like cells (e.g., κ-opioid receptor agonists), or blocking inhibitory μ- or δ-opioid receptors (West and Wise, 1988; Weiss, 2004) are perceived as aversive because they increase the inhibitory influence of the NAc on downstream reward pathways. Interestingly, stimuli such as drugs of abuse may induce homeostatic (or allostatic) neuroadaptations that persist beyond the treatment and cause baseline shifts in mood.
Such shifts may be useful in explaining co-morbidity of addiction and psychiatric illness (Kessler et al., 1997): repeated exposure to drugs that reduce the activity of NAc neurons might induce compensatory neuroadaptations that render the system more excitable during abstinence (leading to conditions characterized by anhedonia or dysphoria), whereas repeated exposure to stimuli (e.g., stress) that activate the NAc might induce compensatory neuroadaptations that render the system more susceptible to the inhibitory actions of drugs of abuse, increasing their appeal. This working hypothesis is testable through a variety of increasingly sophisticated approaches.
One caveat to the inhibition/reward hypothesis is that widespread and prolonged inhibition of NAc firing, as in inactivation or lesion studies, does not appear to produce rewarding effects (e.g. Yun et al., 2004b). This raises the possibility that it is not inhibition of the NAc, per se, that encodes reward but rather the transitions from normal basal firing rates to lower rates that occur when rewarding stimuli are present. Prolonged inhibition may degrade the dynamic information normally encoded in the transient depressions of NAc firing.
Electrophysiology-based tests of the predictions of this hypothesis fall into two basic categories. The first category involves manipulating an animal’s behavioral state to produce sustained changes in responsivity to rewarding stimuli followed by testing for electrophysiological correlates of this altered reward state. For example, the early withdrawal state from chronic exposure to psychostimulants is characterized by anhedonia and lack of responsiveness to natural rewarding stimuli. What would the inhibition/reward hypothesis predict about the electrophysiological status of NAc neurons during this state? The major prediction is that NAc neurons would exhibit decreases in the activity suppression normally produced by exposure to a rewarding stimulus (e.g. sucrose). To our knowledge, this has not yet been investigated. Possible mechanisms for such a decrease in inhibition, should it occur, might include overall increases in neuronal excitability produced by any combination of changes in intrinsic excitability (e.g. increased Na+ or Ca2+ currents, decreased K+ currents) or synaptic transmission (e.g. decreases in glutamatergic or increases in GABAergic transmission). On the other hand, the available data on NAc MSN excitability during early psychostimulant withdrawal suggest that it is actually decreased during this phase (Zhang et al., 1998; Hu et al., 2004; Dong et al., 2006; Kourrich et al., 2007). As noted above, it is possible that a prolonged depression in excitability may degrade reward-related information contained in transient firing inhibitions, perhaps by creating a “floor” effect and reducing the magnitude of these inhibitions. This possibility remains to be tested.
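The proposed "floor" effect can be made concrete with a toy calculation. The sketch below is only an illustration of the reasoning in the preceding paragraph: the function name, the firing rates, and the floor value are all hypothetical and are not taken from any of the cited recordings.

```python
def transient_inhibition(baseline_hz, reward_drop_hz, floor_hz=0.0):
    """Size of the firing-rate dip produced by a rewarding stimulus.

    Toy model: the reward tries to lower firing by `reward_drop_hz`,
    but the rate cannot fall below `floor_hz`. All numbers are hypothetical.
    """
    rate_during_reward = max(floor_hz, baseline_hz - reward_drop_hz)
    return baseline_hz - rate_during_reward


# Hypothetical "normal" state: 5 Hz basal firing, reward removes 3 Hz.
normal = transient_inhibition(baseline_hz=5.0, reward_drop_hz=3.0, floor_hz=1.0)

# Hypothetical withdrawal state: basal firing already depressed to 2 Hz.
withdrawal = transient_inhibition(baseline_hz=2.0, reward_drop_hz=3.0, floor_hz=1.0)

print(normal, withdrawal)  # 3.0 1.0
```

Under these assumed numbers the same rewarding stimulus yields a smaller transient dip when basal firing is already low, which is the sense in which a prolonged depression in excitability could reduce the reward-related signal.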
Considering the apparent link between NAc and ventral pallidum in reward encoding (see above), we would predict that any excitability changes produced by sustained modulation of an animal’s reward state might be particularly evident in striatopallidal/D2 neurons. Although studying the detailed physiological properties of these neurons has been difficult in the past, the recent development of a line of BAC transgenic mice that expresses GFP in these neurons (Gong et al., 2003; Lobo et al., 2006) has made it possible to visualize them in in vitro slice preparations, greatly facilitating the potential for physiological characterization of D2 cells.
The second category of electrophysiology-based tests involves using genetic engineering (see below) to alter the functional expression of key components of the cellular machinery for excitability or excitability modulation in NAc neurons. In theory, this could enable modulation of the inhibitions or excitations associated with reward or aversion, respectively, in NAc neurons. With this in mind, perhaps the most useful target molecules would be those that participate in activity-dependent modulation of neuronal excitability, rather than in maintaining basal firing rates. These targets would likely provide a better opportunity to modulate stimulus responsiveness than more general targets (e.g. Na+ channel subunits), thus enabling the evaluation of the inhibition/reward hypothesis. For example, the firing frequency of active neurons can be controlled by various ionic conductances that produce spike after-hyperpolarizations (AHPs). By targeting NAc neurons with genetic (or possibly even pharmacologic) manipulation aimed at the channels that produce AHPs, it may be possible to decrease the magnitude of aversion-related excitatory responses in these neurons and thus to test whether this physiological change correlates with reduced behavioral indices of aversion.
One of the most obvious pharmacological tests would be to determine if rats self-administer dopamine D2-like agonists directly into the NAc. Interestingly, previous work indicates that while rats self-administer combinations of D1-like and D2-like agonists into the NAc, they do not self-administer either drug component alone, at least at the doses tested (Ikemoto et al., 1997). While on the surface this finding might appear to invalidate our working hypothesis, electrophysiological evidence suggests that co-activation of D1 and D2 receptors on NAc neurons can, under some conditions, cause a reduction in their membrane excitability that is not seen in response to either agonist alone (O’Donnell and Grace, 1996). In addition, more work is needed to study the behavioral effects of intra-NAc microinfusions of GABA agonists; historically, this work has been hindered by the poor solubility of benzodiazepines, which are known to be addictive (Griffiths and Ator, 1980) despite their tendency to decrease dopamine function in the NAc (Wood, 1982; Finlay et al., 1992; Murai et al., 1994), and by the relatively small number of researchers who use brain microinjection procedures together with models of reward. Still other ways of testing our hypothesis would be to study the effects of manipulations in brain areas downstream of D2 receptor-containing MSNs. Again, early evidence suggests reward is encoded by activation of the ventral pallidum, a presumed consequence of inhibition of the MSNs of the indirect pathway (Tindell et al., 2006).
The development of genetic engineering techniques that enable the direction of inducible or conditional mutations to specific brain areas will be an important tool with which to test our hypotheses. Mice with constitutive deletion of GluRA (an alternative nomenclature for GluR1) show many alterations in sensitivity to drugs of abuse (Vekovischeva et al., 2001; Dong et al., 2004; Mead et al., 2005, 2007), some of which are consistent with our working hypothesis and some of which are not. The loss of GluR1 early in development could dramatically alter responsiveness to numerous types of stimuli, including drugs of abuse. In addition, these GluR1-mutant mice lack the protein throughout the brain, whereas the research reviewed here focuses on mechanisms that occur within NAc. These points are especially important because loss of GluR1 in other brain regions would be expected to have dramatic, and sometimes very different, effects on drug abuse-related behaviors. As just one example, we have shown that modulation of GluR1 function in the VTA exerts the opposite effect on drug responses compared to modulation of GluR1 in the NAc (Carlezon et al., 1997; Kelz et al., 1999). The findings in GluR1-deficient mice are not inconsistent with the combined findings from the NAc and the VTA: constitutive GluR1 mutant mice are more sensitive to the stimulant effects of morphine (an effect that could be explained by the loss of GluR1 in the NAc), but they do not develop progressive increases in responsivity to morphine (an effect that could be explained by the loss of GluR1 in the VTA) when testing occurs under conditions that promote sensitization and involve additional brain regions.
Accordingly, one must be cautious in assigning spatial and temporal interpretations to data from constitutive knockout mice: the literature is becoming replete with examples of proteins that have dramatically different (and sometimes opposite) effects on behavior depending upon the brain regions under study (see Carlezon et al., 2005).
Preliminary studies indicate that mice with inducible expression of a dominant-negative form of CREB (a manipulation that reduces the excitability of NAc MSNs) are hypersensitive to the rewarding effects of cocaine while being insensitive to the aversive effects of a κ-opioid agonist (DiNieri et al., 2006). Although these findings are consistent with our working hypothesis, further studies (e.g., electrophysiology) might help to characterize the physiological basis of these effects. Regardless, an increased capacity to spatially and temporally control the expression of genes that regulate the excitability of NAc MSNs will enable progressively more sophisticated tests of our working hypothesis.
Functional brain imaging has the potential to revolutionize our understanding of the biological basis of rewarding and aversive mood states in animal models and, ultimately, people. Preliminary data from imaging studies involving alert non-human primates are providing early evidence in support of the working hypothesis described above. Intravenous administration of high doses of the κ-opioid agonist U69,593—which belongs to a class of drugs known to cause aversion in animals (Bals-Kubik et al., 1993; Carlezon et al., 2006) and dysphoria in humans (Pfeiffer et al., 1986; Wadenberg, 2003)—causes profound increases in blood-oxygen level-dependent (BOLD) functional MRI responses in the NAc (Fig. 3: from M.J. Kaufman, B. deB. Fredrick, S. S. Negus, unpublished observations; used with permission). To the extent that BOLD signal responses reflect synaptic activity, the positive BOLD response induced by U69,593 in the NAc is consistent with increased activity of MSNs, perhaps due to decreased dopamine input (DiChiara and Imperato, 1988; Carlezon et al., 2006). In contrast, positive BOLD signal responses are conspicuously absent in the NAc after treatment with an equipotent dose of fentanyl, a highly addictive μ-opioid agonist. While these fentanyl data do not indicate inhibition of the NAc per se, absence of BOLD activity in this region is not inconsistent with our working hypothesis. Clearly, additional pharmacological and electrophysiological studies are needed to characterize the meaning of these BOLD signal changes. The development of higher magnetic field strength systems is beginning to enable cutting-edge functional imaging and spectroscopy in rats and mice, opening the door to a more detailed understanding of BOLD signals and underlying brain function.
We propose a simple model of mood in which reward is encoded by reduced activity of NAc MSNs, whereas aversion is encoded by elevated activity of these same cells. Our model is supported by a preponderance of evidence already in the literature, although more rigorous tests are needed. It is also consistent with clinical studies indicating reduced numbers of inhibitory dopamine D2-like receptors in the NAc of drug addicts, which may decrease sensitivity to natural rewards and exacerbate the addiction cycle (Volkow et al., 2007). The continued development of molecular and brain imaging techniques is establishing a research environment that is conducive to the design of studies that have the power to confirm or refute this model. Regardless, a better understanding of the molecular basis of these mood states is perpetually important and relevant, particularly as accumulated knowledge from decades of research is used to develop innovative approaches that might be used to treat and prevent addiction and other conditions (e.g., mood disorders) associated with dysregulation of motivation.
Funded by the National Institute on Drug Abuse (NIDA) grants DA012736 (to WAC) and DA019666 (to MJT) and a McKnight-Land Grant professorship (to MJT). We thank M.J. Kaufman, B. deB. Fredrick, and S.S. Negus for permission to cite unpublished data from their brain imaging studies in monkeys.
COMMENTS: Explains how dopamine dysregulation can aggravate relapse, narrow users' interests and perturb decision-making, thus accounting for a wide range of addiction-related symptoms.
Department of Psychiatry, McGill University, 1033 Pine Avenue West, Montreal, Quebec, Canada H3A 1A1.
In animal models, considerable evidence suggests that increased motivation to seek and ingest drugs of abuse is related to conditioned and sensitized activations of the mesolimbic dopamine (DA) system. Direct evidence for these phenomena in humans, though, is sparse. However, recent studies support the following.
First, the acute administration of drugs of abuse across pharmacological classes increases extracellular DA levels within the human ventral striatum.
Second, individual differences in the magnitude of this response correlate with rewarding effects of the drugs and the personality trait of novelty seeking.
Third, transiently diminishing DA transmission in humans decreases drug craving, the propensity to preferentially respond to reward-paired stimuli, and the ability to sustain responding for future drug reward.
Finally, very recent studies suggest that repeated exposure to stimulant drugs, either on the street or in the laboratory, can lead to conditioned and sensitized behavioral responses and DA release.
In contrast to these findings, though, in individuals with a long history of substance abuse, drug-induced DA release is decreased. This diminished DA release could reflect two different phenomena. First, it is possible that drug withdrawal related decrements in DA cell function persist longer than previously suspected.
Second, drug-paired stimuli may gain marked conditioned control over the release of DA and the expression of sensitization leading to reduced DA release when drug-related cues are absent.
Based on these observations a two-factor hypothesis of the role of DA in drug abuse is proposed.
In the presence of drug cues, conditioned and sensitized DA release would occur leading to focused drug-seeking behavior.
In comparison, in the absence of drug-related stimuli DA function would be reduced, diminishing the ability of individuals to sustain goal-directed behavior and long-term objectives.
This conditioned control of the expression of sensitized DA release could aggravate susceptibility to relapse, narrow the range of interests and perturb decision-making, accounting for a wide range of addiction related phenomena.
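The two-factor hypothesis can be summarized as a simple conditional rule. The sketch below is a toy restatement only, not a model from the paper: the function name and every parameter value (baseline, gain, deficit) are hypothetical placeholders chosen to show the direction of each effect.

```python
def relative_da_release(cue_present, baseline=1.0,
                        sensitized_gain=2.0, off_cue_deficit=0.5):
    """Toy relative DA release under the two-factor hypothesis.

    Factor 1: drug cues present -> conditioned, sensitized release (above baseline).
    Factor 2: drug cues absent  -> reduced DA function (below baseline).
    All parameter values are hypothetical.
    """
    if cue_present:
        return baseline * sensitized_gain
    return baseline * off_cue_deficit


with_cues = relative_da_release(cue_present=True)
without_cues = relative_da_release(cue_present=False)
print(with_cues, without_cues)  # 2.0 0.5
```

In this reading, the same individual shows elevated release when cues are present and depressed release otherwise, which is how the hypothesis accounts for both focused drug seeking and a reduced ability to sustain other goal-directed behavior.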
J Genet Syndr Gene Ther. 2013 February 10; 4(121): 1000121. doi: 10.4172/2157-7412.1000121
Having entered the genomics era with confidence in the future of medicine, including psychiatry, identifying the role of DNA and polymorphic associations with brain reward circuitry has led to a new understanding of all addictive behaviors. It is noteworthy that this strategy may provide treatment for the millions who are the victims of “Reward Deficiency Syndrome” (RDS), a genetic disorder of brain reward circuitry. This article will focus on drugs and food being mutually addictive, and the role of dopamine genetics and function in addictions, including the interaction of the dopamine transporter and sodium food. We will briefly review our concept that concerns the genetic antecedents of multiple addictions (RDS). Studies have also shown that evaluating a panel of established reward genes and polymorphisms enables the stratification of genetic risk to RDS. The panel is called the “Genetic Addiction Risk Score (GARS)”, and is a tool for the diagnosis of a genetic predisposition for RDS. The use of this test, as pointed out by others, would benefit the medical community by identifying at-risk individuals at a very early age. We encourage in-depth work in both animal and human models of addiction. We encourage further exploration of the neurogenetic correlates of the commonalities between food and drug addiction and endorse forward-thinking hypotheses like “The Salted Food Addiction Hypothesis”.
Dopamine (DA) is a neurotransmitter in the brain that controls feelings of wellbeing. This sense of wellbeing results from the interaction of DA and neurotransmitters such as serotonin, the opioids, and other brain chemicals. Low serotonin levels are associated with depression. High levels of the opioids (the brain’s opium) are also associated with a sense of wellbeing. Moreover, DA receptors, a class of G-protein coupled receptors (GPCRs), have been targeted for drug development for the treatment of neurological, psychiatric and ocular disorders. DA has been called the “anti-stress” and/or “pleasure” molecule, but this has been recently debated by Salamone and Correa and by Sinha.
Accordingly, we have argued [5-8] that nucleus accumbens (NAc) DA has a role in motivational processes, and that mesolimbic DA dysfunction may contribute to motivational symptoms of depression, features of substance abuse and other disorders. Although it has become traditional to label DA neurons as reward neurons, this is an overgeneralization, and it is necessary to consider how different aspects of motivation are affected by dopaminergic manipulations. For example, NAc DA is involved in Pavlovian processes, instrumental learning, appetitive-approach behavior, aversive motivation, behavioral activation processes, sustained task engagement and exertion of effort, although it does not mediate initial hunger, motivation to eat or appetite [3,5-7].
While it is true that NAc DA is involved in appetitive and aversive motivational processes, we argue that DA is also involved as an important mediator in primary food motivation or appetite, similar to drugs of abuse. A review of the literature provides a number of papers that show the importance of DA in food craving behavior and appetite mediation [6,7]. Gold has pioneered the concept of food addiction [5-8]. Avena et al. correctly argue that because addictive drugs activate the same neurological pathways that evolved to respond to natural rewards, addiction to food seems plausible. Moreover, sugar per se is noteworthy as a substance that releases opioids and DA and thus might be expected to have addictive potential. Specifically, neural adaptations include changes in DA and opioid receptor binding, enkephalin mRNA expression and DA and acetylcholine release in the NAc. The evidence supports the hypothesis that under certain circumstances rats can become sugar dependent.
The work of Wang et al. involving brain imaging studies in humans has implicated DA-modulated circuits in pathologic eating behavior(s). Their studies suggest that DA in the extracellular space of the striatum is increased by food cues, which is evidence that DA is potentially involved in the non-hedonic motivational properties of food. They also found that orbitofrontal cortex metabolism is increased by food cues, indicating that this region is associated with motivation for the mediation of food consumption. There is an observed reduction in striatal DA D2 receptor availability in obese subjects, similar to the reduction in drug-addicted subjects; thus obese subjects may be predisposed to use food to compensate temporarily for under-stimulated reward circuits. In essence, the powerful reinforcing effects of both food and drugs are in part mediated by abrupt DA increases in the mesolimbic brain reward centers. Volkow et al. point out that abrupt DA increases can override homeostatic control mechanisms in the brains of vulnerable individuals. Brain imaging studies have delineated the neurological dysfunction that generates the shared features of food and drug addictions. The cornerstone of the commonality of the root causes of addiction is impairment in the dopaminergic pathways that regulate the neuronal systems associated also with self-control, conditioning, stress reactivity, reward sensitivity and incentive motivation. Metabolism in prefrontal regions is involved in inhibitory control; in obese subjects the inability to limit food intake involves ghrelin and may be the result of decreased DA D2 receptors, which are associated with decreased prefrontal metabolism. The limbic and cortical regions involved with motivation, memory and self-control are activated by gastric stimulation in obese subjects and during drug craving in drug-addicted subjects.
An enhanced sensitivity to the sensory properties of food is suggested by increased metabolism in the somatosensory cortex of obese subjects. This enhanced sensitivity to food palatability coupled with reduced DA D2 receptors could make food the salient reinforcer for compulsive eating and obesity risk. These research results indicate that numerous brain circuits are disrupted in obesity and drug addiction and that the prevention and treatment of obesity may benefit from strategies that target improved DA function.
Lindblom et al. reported that dieting as a strategy to reduce body weight often fails because it causes food cravings leading to binging and weight regain. They also agree that evidence from several lines of research suggests the presence of shared elements in the neural regulation of food and drug craving. Lindblom et al. quantified the expression of eight genes involved in DA signaling in brain regions related to the mesolimbic and nigrostriatal DA system in male rats subjected to chronic food restriction, using quantitative real-time polymerase chain reaction. They found that mRNA levels of tyrosine hydroxylase and the dopamine transporter in the ventral tegmental area were strongly increased by food restriction, and concurrent DAT up-regulation at the protein level in the shell of the NAc was also observed via quantitative autoradiography. That these effects were observed after chronic rather than acute food restriction suggests that sensitization of the mesolimbic dopamine pathway may have occurred. Thus, sensitization, possibly due to increased clearance of extracellular dopamine from the NAc shell, may be one of the underlying causes for the food cravings that hinder dietary compliance. These findings are in agreement with earlier findings by Patterson et al., who demonstrated that direct intracerebroventricular infusion of insulin results in an increase in mRNA levels for the DA reuptake transporter DAT. In a 24- to 36-hour food deprivation study, in situ hybridization was used to assess DAT mRNA levels in food-deprived (hypoinsulinemic) rats. Levels in the ventral tegmental area/substantia nigra pars compacta were significantly decreased, suggesting that moderation of striatal DAT function can be effected by nutritional status, fasting and insulin. Ifland et al. advanced the hypothesis that processed foods with high concentrations of sugar and other refined sweeteners, refined carbohydrates, fat, salt, and caffeine are addictive substances.
Other studies have evaluated salt as an important factor in food-seeking behavior. Roitman et al. point out that increased DA transmission in the NAc is correlated with motivated behaviors, including Na appetite. DA transmission is modulated by DAT and may play a role in motivated behaviors. In their in vivo studies, robust decreases in DA uptake via DAT in the rat NAc were correlated with Na appetite induced by Na depletion. Decreased DAT activity in the NAc was observed after in vitro aldosterone treatment. Thus, a reduction in DAT activity in the NAc may be the consequence of a direct action of aldosterone and may be a mechanism by which Na depletion induces increased NAc DA transmission during Na appetite. Increased NAc DA may be the motivating property for the Na-depleted rat. Further support for the role of salted food as a possible substance (food) of abuse has resulted in “The Salted Food Addiction Hypothesis” as proposed by Cocores and Gold. In a pilot study to determine if salted foods act like a mild opiate agonist that drives overeating and weight gain, they found that an opiate-dependent group developed a 6.6% increase in weight during opiate withdrawal, showing a strong preference for salted food. Based on this and other literature, they suggest that salted food may be an addictive substance that stimulates opiate and DA receptors in the reward and pleasure center of the brain. Alternately, preference, hunger, urge, and craving for “tasty” salted food may be symptoms of opiate withdrawal and the opiate-like effect of salty food. Both salty foods and opiate withdrawal stimulate the Na appetite and result in increased calorie intake, overeating and disease related to obesity.
When synaptic DA stimulates DA receptors (D1–D5), individuals experience stress reduction and feelings of wellbeing. As mentioned earlier, the mesocorticolimbic dopaminergic pathway mediates reinforcement of both unnatural rewards and natural rewards. Natural rewards involve satisfaction of physiological drives such as hunger and reproduction, while unnatural rewards involve satisfaction of acquired, learned pleasures: hedonic sensations like those derived from drugs, alcohol, gambling and other risk-taking behaviors [8,20,21].
One notable DA gene is the DRD2 gene, which is responsible for the synthesis of DA D2 receptors. The allelic form of the DRD2 gene (A1 versus A2) dictates the number of receptors at post-junctional sites and hypodopaminergic function [23,24]. A paucity of DA receptors predisposes individuals to seek any substance or behavior that stimulates the dopaminergic system [25-27].
The DRD2 gene and DA have long been associated with reward in spite of controversy [3,4]. Although the Taq1 A1 allele of the DRD2 gene has been associated with many neuropsychiatric disorders and initially with severe alcoholism, it is also associated with other substance and process addictions, as well as Tourette’s Syndrome, high novelty-seeking behaviors, Attention Deficit Hyperactivity Disorder (ADHD), and, in children and adults, with co-morbid antisocial personality disorder symptoms.
While this article will focus on drugs and food being mutually addictive, and the role of DA genetics and function in addictions, for completeness we will briefly review our concept that concerns the genetic antecedents of multiple addictions. “Reward Deficiency Syndrome” (RDS) was first described in 1996 as a theoretical genetic predictor of compulsive, addictive and impulsive behaviors with the realization that the DRD2 A1 genetic variant is associated with these behaviors [29-32]. RDS involves the pleasure or reward mechanisms that rely on DA. Behaviors or conditions that are the consequence of DA resistance or depletion are manifestations of RDS. An individual’s biochemical reward deficiency can be mild, the result of overindulgence or stress, or more severe, the result of a DA deficiency based on genetic makeup. RDS or anti-reward pathways help to explain how certain genetic anomalies can give rise to complex aberrant behavior. There may be a common neurobiology, neuro-circuitry and neuroanatomy for a number of psychiatric disorders and multiple addictions. It is well known that drugs of abuse, alcohol, sex, food, gambling and aggressive thrills, indeed most positive reinforcers, cause activation and neuronal release of brain DA and can decrease negative feelings. Abnormal cravings are linked to low DA function. Here is an example of how complex behaviors can be produced by specific genetic antecedents. A deficiency of, for example, the D2 receptors, a consequence of having the A1 variant of the DRD2 gene, may predispose individuals to a high risk for cravings that can be satisfied by multiple addictive, impulsive, and compulsive behaviors. This deficiency could be compounded if the individual had another polymorphism in, for example, the DAT gene that resulted in excessive removal of DA from the synapse. In addition, the use of substances and aberrant behaviors also depletes DA.
Thus, RDS can be manifest in severe or mild forms that are a consequence of a biochemical inability to derive reward from ordinary, everyday activities. Although many genes and polymorphisms predispose individuals to abnormal DA function, carriers of the Taq1 A1 allele of the DRD2 gene lack enough DA receptor sites to achieve adequate DA sensitivity. This DA deficit in the reward site of the brain can result in unhealthy appetites and craving. In essence, they seek substances like alcohol, opiates, cocaine, nicotine and glucose, and behaviors, even abnormally aggressive behaviors, that are known to activate dopaminergic pathways and cause preferential release of DA at the NAc. There is now evidence that, rather than the NAc, the anterior cingulate cortex may be involved in operant, effort-based decision making [35-37] and may be a site of relapse.
Impairment of the DRD2 gene or of other DA receptor genes, such as DRD1, involved in homeostasis and so-called normal brain function, could ultimately lead to neuropsychiatric disorders including aberrant drug and food seeking behavior. Prenatal drug abuse in the pregnant female has been shown to have profound effects on the neurochemical state of offspring. These include ethanol; cannabis; heroin; cocaine; and drug abuse in general. Most recently, Novak et al. provided strong evidence showing that abnormal development of striatal neurons is part of the pathology underlying major psychiatric illnesses. The authors identified an underdeveloped gene network (early) in rat that lacks important striatal receptor pathways (signaling). At two postnatal weeks the network is down-regulated and replaced by a network of mature genes expressing striatal-specific genes, including the DA D1 and D2 receptors, and providing these neurons with their functional identity and phenotypic characteristics. Thus, this developmental switch, in both the rat and human, has the potential to be a point of susceptibility to disruption of growth by environmental factors such as an overindulgence in foods, like salt, and drug abuse.
The DA transporter (also DA active transporter, DAT, SLC6A3) is a membrane–spanning protein that pumps the neurotransmitter DA out of the synapse back into cytosol from which other known transporters sequester DA and norepinephrine into neuronal vesicles for later storage and subsequent release .
The DAT protein is encoded by a gene located on human chromosome 5 it is about 64 kbp long and consists of 15 coding exon. Specifically, the DAT gene (SLC6A3 or DAT1) is localized to chromosome 5p15.3. Moreover, there is a VNTR polymorphism within the 3′ non-coding region of DAT1. A genetic polymorphism in the DAT gene which effects the amount of protein expressed is evidence for an association between and DA related disorders and DAT . It is well established that DAT is the primary mechanism which clears DA from synapses, except in the prefrontal cortex where DA reuptake involves norepinephrine [46,47]. DAT terminates the DA signal by removing the DA from the synaptic cleft and depositing it into surrounding cells. Importantly, several aspects of reward and cognition are functions of DA and DAT facilitates regulation of DA signaling .
It is noteworthy that DAT is an integral membrane protein and is considered a symporter and a co-transporter moving DA from the synaptic cleft across the phospholipid cell membrane by coupling its movement to the movement of Na ions down the electrochemical gradient (facilitated diffusion) and into the cell.
Moreover, DAT function requires the sequential binding and co-transport of two Na ions and one chloride ion with the DA substrate. The driving force for DAT-mediated DA reuptake is the ion concentration gradient generated by the plasma membrane Na+/K+ ATPase.
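The stoichiometry just described implies that each transport cycle carries net charge into the cell. The short Python sketch below does only that bookkeeping. The ion counts come from the text; treating DA as a +1 cation (its amine group protonated at physiological pH) is an assumption added here for illustration, and the names are hypothetical.

```python
# Net charge moved into the cell per DAT transport cycle.
# Stoichiometry (2 Na+, 1 Cl-, 1 DA) is taken from the text above.
# Assumption: DA is counted as a +1 cation (protonated amine).
CHARGE = {"Na": +1, "Cl": -1, "DA": +1}
STOICHIOMETRY = {"Na": 2, "Cl": 1, "DA": 1}


def net_charge_per_cycle(stoichiometry, charge):
    """Sum of (count * charge) over all co-transported species."""
    return sum(n * charge[species] for species, n in stoichiometry.items())


net = net_charge_per_cycle(STOICHIOMETRY, CHARGE)
print(net)  # 2
```

Under these assumptions the cycle is electrogenic (two net positive charges move inward), which is one way to see why transport can be sensitive to membrane potential. Changing the Na count to 1 in `STOICHIOMETRY` gives the alternative stoichiometry some studies report.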
Sonders et al. evaluated the role of the widely accepted model for monoamine transporter function. They found that normal monoamine transporter function requires set rules. For example, Na ions must bind to the extracellular domain of the transporter before DA can bind. Once DA binds, the protein undergoes a conformational change, which allows both Na and DA to unbind on the intracellular side of the membrane. A number of electrophysiological studies have confirmed that DAT transports one molecule of neurotransmitter across the membrane with one or two Na ions like other monoamine transporters. Negatively charged chloride ions are required to prevent a buildup of positive charge. These studies used radioactive-labeled DA and have also shown that the transport rate and direction are totally dependent on the Na gradient.
Since it is well known that many drugs of abuse cause the release of neuronal DA, DAT may have a role in this effect. Because of the tight coupling of the membrane potential and the Na gradient, activity-induced changes in membrane polarity can dramatically influence transport rates. In addition, the transporter may contribute to DA release when the neuron depolarizes. In essence, as pointed out by Vandenbergh et al., the DAT protein regulates DA-mediated neurotransmission by rapidly accumulating DA that has been released into the synapse.
The DAT membrane topology was initially theoretical, determined based on hydrophobic sequence analysis and similarity to the GABA transporter. The initial prediction by Kilty et al. of a large extracellular loop between the third and fourth of twelve transmembrane domains was confirmed by Vaughan and Kuhar, who used proteases, which digest proteins into smaller fragments, and glycosylation, which occurs only on extracellular loops, to verify most aspects of DAT structure.
DAT has been found in regions of the brain where there is dopaminergic circuitry; these areas include the mesocortical, mesolimbic, and nigrostriatal pathways. The nuclei that make up these pathways have distinct patterns of expression. DAT was not detected within any synaptic cleft, which suggests that striatal DA reuptake occurs outside of the synaptic active zones after DA has diffused from the synaptic cleft.
Two alleles of the VNTR, the 9-repeat (9R) and the 10-repeat (10R), can increase the risk for RDS behaviors. The presence of the 9R VNTR has been associated with alcoholism and Substance Use Disorder. It has been shown to augment transcription of the DAT protein, resulting in enhanced clearance of synaptic DA and thus a reduction in DA and in DA activation of postsynaptic neurons. The tandem repeats of the DAT have been associated with reward sensitivity and high risk for Attention Deficit Hyperactivity Disorder (ADHD) in both children and adults [59,60]. The 10-repeat allele has a small but significant association with hyperactivity-impulsivity (HI) symptoms.
Support for the impulsive nature of individuals possessing dopaminergic gene variants and variants of other neurotransmitter genes (e.g. DRD2, DRD3, DRD4, DAT1, COMT, MAO-A, SLC6A4, Mu, GABAB) is derived from a number of important studies illustrating the genetic risk for drug-seeking behaviors based on association and linkage studies implicating these alleles as risk antecedents that have an impact in the mesocorticolimbic system (Table 1). Our laboratory, in conjunction with LifeGen, Inc. and Dominion Diagnostics, Inc., is carrying out research involving twelve select centers across the United States to validate the first ever patented genetic test to determine a patient’s genetic risk for RDS, called Genetic Addiction Risk Score™ (GARS).
The authors appreciate the expert editorial input from Margaret A. Madigan and Paula J. Edge. We appreciate the comments by Eric R. Braverman, Raquel Lohmann, Joan Borsten, B.W. Downs, Roger L. Waite, Mary Hauser, John Femino, David E. Smith, and Thomas Simpatico. Marlene Oscar-Berman is the recipient of grants from the National Institutes of Health, NIAAA RO1-AA07112 and K05-AA00219, and the Medical Research Service of the US Department of Veterans Affairs. We also acknowledge the case report input of Karen Hurley, Executive Director of National Institute of Holistic Addiction Studies, North Miami Beach, Florida. This article was supported in part by a grant awarded to Path Foundation NY from Life Extension Foundation.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Conflict of Interest: Kenneth Blum, PhD, holds a number of US and foreign patents related to diagnosis and treatment of RDS, which have been exclusively licensed to LifeGen, Inc., Lederach, PA. Dominion Diagnostics, LLC, North Kingstown, Rhode Island, along with LifeGen, Inc., are actively involved in the commercial development of GARS. John Giordano is also a partner in LifeGen, Inc. There are no other conflicts of interest and all authors read and approved the manuscript.
Front Neural Circuits. 2013 Oct 11;7:152.
Molecular Neurobiology Laboratory, Department of Life Sciences, Korea University Seoul, South Korea.
Dopamine (DA) regulates emotional and motivational behavior through the mesolimbic dopaminergic pathway. Changes in DA mesolimbic neurotransmission have been found to modify behavioral responses to various environmental stimuli associated with reward behaviors. Psychostimulants, drugs of abuse, and natural reward such as food can cause substantial synaptic modifications to the mesolimbic DA system. Recent studies using optogenetics and DREADDs, together with neuron-specific or circuit-specific genetic manipulations have improved our understanding of DA signaling in the reward circuit, and provided a means to identify the neural substrates of complex behaviors such as drug addiction and eating disorders. This review focuses on the role of the DA system in drug addiction and food motivation, with an overview of the role of D1 and D2 receptors in the control of reward-associated behaviors.
dopamine, dopamine receptor, drug addiction, food reward, reward circuit
Dopamine (DA) is the predominant catecholamine neurotransmitter in the brain, and is synthesized by mesencephalic neurons in the substantia nigra (SN) and ventral tegmental area (VTA). DA neurons originate in these nuclei and project to the striatum, cortex, limbic system and hypothalamus. Through these pathways, DA affects many physiological functions, such as the control of coordinated movements and hormone secretion, as well as motivated and emotional behaviors (Hornykiewicz, 1966; Beaulieu and Gainetdinov, 2011; Tritsch and Sabatini, 2012).
Regulation of the DA system in reward-related behaviors has received a great deal of attention because of the serious consequences of dysfunction in this circuit, such as drug addiction and food reward linked obesity, which are both major public health issues. It is now well accepted that following repeated exposure to addictive substances, adaptive changes occur at the molecular and cellular level in the DA mesolimbic pathway, which is responsible for regulating motivational behavior and for the organization of emotional and contextual behaviors (Nestler and Carlezon, 2006; Steketee and Kalivas, 2011). These modifications to the mesolimbic pathway are thought to lead to drug dependence, which is a chronic, relapsing disorder in which compulsive drug-seeking and drug-taking behaviors persist despite serious negative consequences (Thomas et al., 2008).
Recent findings suggest that glutamatergic and GABAergic synaptic networks in the limbic system are also affected by drugs of abuse, and that this can alter the behavioral effects of addictive drugs (Schmidt and Pierce, 2010; Lüscher and Malenka, 2011). Considerable evidence now suggests that substantial synaptic modifications of the mesolimbic DA system are associated not only with the rewarding effects of psychostimulants and other drugs of abuse, but also with the rewarding effects of natural reward, such as food; however, the mechanism by which drugs of abuse modify synaptic strength in this circuit remains elusive. In fact, DA reward signaling seems extremely complex, and is also implicated in learning and conditioning processes, as evidenced by studies revealing a DAergic response coding a prediction error in behavioral learning, for example (Wise, 2004; Schultz, 2007, 2012), thus suggesting a need for a fine dissection at a circuit level to properly understand these motivated reward-related behaviors. Recent studies using optogenetics and neuron-specific or circuit-specific genetic manipulations are now allowing a better understanding of DA signaling in the reward circuit.
In this review, I will provide a short summary of DA signaling in reward-related behaviors, with an overview of recent studies on cocaine-addiction behaviors as well as some on food reward in the context of the role of D1 and D2 receptors in regulating these behaviors.
Dopamine interacts with membrane receptors belonging to the family of seven transmembrane domain G-protein coupled receptors, with activation leading to the formation of second messengers, and the activation or repression of specific signaling pathways. To date, five different subtypes of DA receptors have been cloned from different species. Based on their structural and pharmacological properties, a general subdivision into two groups has been made: the D1-like receptors, which stimulate intracellular cAMP levels, comprising D1 (Dearry et al., 1990; Zhou et al., 1990) and D5 (Grandy et al., 1991; Sunahara et al., 1991), and the D2-like receptors, which inhibit intracellular cAMP levels, comprising D2 (Bunzow et al., 1988; Dal Toso et al., 1989), D3 (Sokoloff et al., 1990), and D4 (Van Tol et al., 1991) receptors.
D1 and D2 receptors are the most abundantly expressed DA receptors in the brain. The D2 receptor has two isoforms generated by alternative splicing of the same gene (Dal Toso et al., 1989; Montmayeur et al., 1991). These isoforms, named D2L and D2S, are identical except for an insert of 29 amino acids present in the putative third intracellular loop of D2L, an intracellular domain thought to play a role in coupling this class of receptor to specific second messengers.
D2 receptors are localized presynaptically, as revealed by D2 receptor immunoreactivity, mRNA, and binding sites present in DA neurons throughout the midbrain (Sesack et al., 1994), with a lower level of D2 receptor expression in the VTA than in the SN (Haber et al., 1995). These D2-type autoreceptors represent either somatodendritic autoreceptors, known to dampen neuronal excitability (Lacey et al., 1987, 1988; Chiodo and Kapatos, 1992), or terminal autoreceptors, which mostly decrease DA synthesis and packaging (Onali et al., 1988; Pothos et al., 1998), but also inhibit impulse-dependent DA release (Cass and Zahniser, 1991; Kennedy et al., 1992; Congar et al., 2002). Therefore, the principal role of these autoreceptors is the inhibition and modulation of overall DA neurotransmission; however, it has been suggested that in the embryonic stage, the D2-type autoreceptor could have a different function in DA neuronal development (Kim et al., 2006, 2008; Yoon et al., 2011; Yoon and Baik, 2013). Thus, the cellular and molecular role of these presynaptic D2 receptors needs to be explored further. The expression of D3, D4, and D5 receptors in the brain is considerably more restricted and weaker than that of either D1 or D2 receptors.
There is some difference in the affinity of DA for D1-like receptors and D2-like receptors, mostly reported on the basis of receptor-ligand binding assay studies using heterologously expressed DA receptors in cell lines. For example, D2-like receptors seem to have a 10- to 100-fold greater affinity for DA than the D1-like family, with the D1 receptor reported to have the lowest affinity for DA (Beaulieu and Gainetdinov, 2011; Tritsch and Sabatini, 2012). These differences suggest a differential role for the two receptors given that DA neurons can have two different patterns of DA release, “tonic” or “phasic” based on their firing properties (Grace et al., 2007). It has been suggested that low-frequency, irregular firing of DA neurons tonically generates a low basal level of extracellular DA (Grace et al., 2007), while burst firing, or “phasic” activity is crucially dependent on afferent input, and is believed to be the functionally relevant signal sent to postsynaptic sites to indicate reward and modulate goal-directed behavior (Berridge and Robinson, 1998; Schultz, 2007; Grace et al., 2007). Therefore, bursting activity of DA neurons, leading to a transient increase in the DA level, is thought to be a key component of the reward circuitry (Overton and Clark, 1997; Schultz, 2007). Consequently, the D1 receptor, which is known as the low-affinity DA receptor, is thought to be preferentially activated by the transient, high concentrations of DA mediated by phasic bursts of DA neurons (Goto and Grace, 2005; Grace et al., 2007). In contrast, it is hypothesized that D2-like receptors, which are known to have a high affinity for DA, can detect the lower levels of tonic DA release (Goto et al., 2007). 
However, given that measurements of receptor affinity rely on ligand binding assays from heterologously expressed DA receptors, and do not reflect the receptor’s coupling capacity to downstream signaling cascades, it is difficult to infer whether D2-like receptors are preferentially activated by basal extracellular levels of DA in vivo. Thus, it remains to be elucidated how these two different receptors participate in different patterns of DA neuronal activity in vivo.
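To make the affinity argument concrete, the sketch below applies the standard single-site (Hill-Langmuir) occupancy relation, occupancy = [DA] / ([DA] + Kd), to a high-affinity and a low-affinity receptor. All numbers are hypothetical, chosen only to represent a 100-fold affinity difference and a large tonic-to-phasic concentration change; they are not measured values from the studies cited above. As the preceding caveat makes clear, binding occupancy says nothing about downstream coupling in vivo.

```python
def fractional_occupancy(da_nM, kd_nM):
    """Hill-Langmuir occupancy for a single binding site: [DA] / ([DA] + Kd)."""
    return da_nM / (da_nM + kd_nM)


# Hypothetical values, for illustration only.
KD_D2_NM = 10.0        # assumed high-affinity (D2-like) dissociation constant
KD_D1_NM = 1000.0      # assumed low-affinity (D1-like), 100-fold higher Kd
TONIC_DA_NM = 10.0     # assumed basal extracellular DA
PHASIC_DA_NM = 1000.0  # assumed transient peak during burst firing

for label, da in (("tonic", TONIC_DA_NM), ("phasic", PHASIC_DA_NM)):
    d1 = fractional_occupancy(da, KD_D1_NM)
    d2 = fractional_occupancy(da, KD_D2_NM)
    print(f"{label}: D1 {d1:.2f}, D2 {d2:.2f}")
```

With these assumed numbers, the low-affinity receptor is almost unoccupied at the tonic level and reaches half occupancy only at the phasic peak, while the high-affinity receptor is already half occupied at the tonic level and close to saturation during a burst. This is the pattern the tonic/phasic hypothesis relies on.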
The D1- and D2-like receptor classes differ functionally in the intracellular signaling pathways they modulate. The D1-like receptors, including D1 and D5, are coupled to heterotrimeric G-proteins that include the G proteins Gαs and Gαolf, with activation leading to increased adenylyl cyclase (AC) activity, and increased cyclic adenosine monophosphate (cAMP) production. This pathway induces the activation of protein kinase A (PKA), resulting in the phosphorylation of variable substrates and the induction of immediate early gene expression, as well as the modulation of numerous ion channels. In contrast, D2-class DA receptors (D2, D3, and D4) are coupled to Gαi and Gαo proteins, and negatively regulate the production of cAMP, resulting in decreased PKA activity, activation of K+ channels, and the modulation of numerous other ion channels (Kebabian and Greengard, 1971; Kebabian and Calne, 1979; Missale et al., 1998; Beaulieu and Gainetdinov, 2011).
One of the best-studied substrates of PKA is the DA- and cAMP-regulated phosphoprotein, Mr ~32,000 (DARPP-32), which is an inhibitor of protein phosphatase, and is predominantly expressed in medium spiny neurons (MSNs) of the striatum (Hemmings et al., 1984a). It appears that DARPP-32 acts as an integrator involved in the modulation of cell signaling in response to DA in striatal neurons. It has been demonstrated that phosphorylation of DARPP-32 at threonine 34 by PKA activates the inhibitory function of DARPP-32 over protein phosphatase 1 (PP1; Hemmings et al., 1984a,b). In D1 receptor-expressing striatal neurons, D1 receptor stimulation results in an increased phosphorylation of DARPP-32 in response to PKA activation, while stimulation of D2 receptors in D2 receptor-expressing neurons reduces the phosphorylation of DARPP-32 at threonine 34, presumably as a consequence of reduced PKA activation (Bateup et al., 2008). However, it appears that a cAMP-independent pathway also participates in the D2-receptor-mediated regulation of DARPP-32, given that threonine 34 is dephosphorylated by the calmodulin-dependent protein phosphatase 2B (PP2B; also known as calcineurin), which is activated by increased intracellular Ca2+ following D2 receptor activation (Nishi et al., 1997). These findings suggest that DA exerts a bidirectional control on the state of phosphorylation of DARPP-32, a DA-centered signaling molecule. Therefore, one can imagine that overall, under DA tone, these signaling pathways mediated by the two classes of receptors can influence neuronal excitability, and consequently synaptic plasticity, in terms of their synaptic networks in the brain, given that their precise signaling varies depending on the cell type and brain region in which they are expressed (Beaulieu and Gainetdinov, 2011; Girault, 2012).
In the case of D2 receptors, the situation is further complicated, as D2 receptors are alternatively spliced, giving rise to isoforms with distinct physiological properties and subcellular localizations. The large isoform appears to be expressed dominantly in all brain regions, although the exact ratio of the two isoforms can vary (Montmayeur et al., 1991). In fact, the phenotype of D2 receptor total knockout (KO) mice was found to be quite different from that of D2L KO mice (Baik et al., 1995; Usiello et al., 2000), indicating that the two isoforms might have different functions in vivo. Recent results from Moyer et al. (2011) support a differential in vivo function of the D2 isoforms in human brain, showing a role of two variants of the D2 receptor gene with intronic single-nucleotide polymorphisms (SNPs) in D2 receptor alternative splicing, and a genetic association between these SNPs and cocaine abuse in Caucasians (Moyer et al., 2011; Gorwood et al., 2012).
One signaling pathway of particular interest in neurons is the mitogen-activated protein kinases, extracellular-signal regulated kinases (ERK), which are activated by D1 and D2 receptors. It is now widely accepted that ERK activation contributes to different physiological responses in neurons, such as cell death and development, as well as synaptic plasticity, and that modulating ERK activity in the CNS can result in different neurophysiological responses (Chang and Karin, 2001; Sweatt, 2004; Thomas and Huganir, 2004). Additionally, ERK activation can be regulated by various neurotransmitter systems, a process that can be complex but is finely tuned depending on the differential regulation of the signaling pathways mediated by the various neurotransmitters. Therefore, it is interesting to see what the physiological output of ERK signaling upon DA stimulation through these receptors would be.
Results obtained from heterologous cell culture systems suggest that both D1- and D2-class DA receptors can regulate ERK1 and 2 (Choi et al., 1999; Beom et al., 2004; Chen et al., 2004; Kim et al., 2004; Wang et al., 2005). D1 receptor-mediated ERK signaling involves an interaction with the NMDA glutamate receptor (Valjent et al., 2000, 2005), which has been mostly described in the striatum. D1 receptor stimulation is not able to mediate ERK phosphorylation by itself, but rather requires endogenous glutamate (Pascoli et al., 2011). With D1 receptor activation, activated PKA can mediate the phosphorylation of DARPP-32 at its Thr-34, as mentioned above. Phosphorylated DARPP-32 can act as a potent inhibitor of the protein phosphatase PP-1, which dephosphorylates another phosphatase, the striatal-enriched tyrosine phosphatase (STEP). Dephosphorylation of STEP activates its phosphatase activity, thus allowing STEP to dephosphorylate ERK (Paul et al., 2003). DARPP-32 also acts upstream of ERK, possibly by inhibiting PP-1, preventing PP-1 from dephosphorylating MEK, the upstream kinase of ERK (Valjent et al., 2005). Thus, D1 receptor activation acts to increase ERK phosphorylation by preventing its dephosphorylation by STEP, but also by preventing the dephosphorylation of the upstream kinase of ERK. In addition, the cross talk between D1 and NMDA receptors contributes to ERK activation. For example, a recent study showed that stimulation of D1 receptors increases calcium influx through NMDA receptors, a process that involves phosphorylation of the NMDA receptor NR2B subunit by a Src-family tyrosine kinase (Pascoli et al., 2011). This increased calcium influx activates a number of signaling pathways, including calcium and calmodulin-dependent kinase II, which can activate ERK via the Ras-Raf-MEK cascade (Fasano et al., 2009; Shiflett and Balleine, 2011; Girault, 2012).
Consequently, D1 receptor-mediated ERK activation employs a complex regulation by phosphatases and kinases in addition to the cross talk with glutamate receptor signaling (Figure 1).
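Because this pathway works through a chain of inhibitions (DARPP-32 inhibits PP-1, which would otherwise activate STEP and inactivate MEK), it can help to trace it as simple on/off logic. The sketch below is a deliberately crude Boolean reading of the steps described above, not a kinetic model. Each node is reduced to active or inactive, and the function and variable names are hypothetical.

```python
def erk_phosphorylated(d1_active: bool, glutamate_present: bool) -> bool:
    """Boolean sketch of D1 receptor-mediated ERK regulation in striatal neurons."""
    pka_active = d1_active                 # D1 -> cAMP -> PKA
    darpp32_p_thr34 = pka_active           # PKA phosphorylates DARPP-32 at Thr-34
    pp1_active = not darpp32_p_thr34       # phospho-DARPP-32 inhibits PP-1
    step_active = pp1_active               # PP-1 dephosphorylates, and so activates, STEP
    # Simplification: MEK is driven by NMDA/Ca2+ signaling (glutamate) and is
    # switched off whenever PP-1 is free to dephosphorylate it.
    mek_active = glutamate_present and not pp1_active
    # ERK stays phosphorylated only if MEK is on and STEP is not removing the phosphate.
    return mek_active and not step_active


for d1 in (False, True):
    for glu in (False, True):
        print(f"D1={d1!s:5} glutamate={glu!s:5} -> ERK-P={erk_phosphorylated(d1, glu)}")
```

In this simplified reading, ERK is phosphorylated only when D1 stimulation and glutamate input coincide, consistent with the statement that D1 stimulation alone is not sufficient.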
D2 receptor-mediated ERK activation has been reported in heterologous cell culture systems (Luo et al., 1998; Welsh et al., 1998; Choi et al., 1999). D2 receptor-mediated ERK activation was found to be dependent on Gαi protein coupling, and it appears that it requires the transactivation of receptor tyrosine kinase, which activates downstream signaling to finally activate ERK (Choi et al., 1999; Kim et al., 2004; Wang et al., 2005; Yoon et al., 2011; Yoon and Baik, 2013). Arrestin has also been suggested to contribute to D2 receptor-mediated ERK activation (Beom et al., 2004; Kim et al., 2004), which can activate MAPK signaling by mobilizing clathrin-mediated endocytosis in a β-arrestin/dynamin-dependent manner (Kim et al., 2004). A further possibility of D2 receptors coupling to Gq proteins cannot be ruled out; in this case, Gq protein-mediated PKC activation could also induce ERK activation (Choi et al., 1999; Figure 2).
In view of the physiological role of this DA receptor-mediated ERK signaling, it has been shown that in mesencephalic neurons, DA activates ERK signaling via mesencephalic D2 receptors, which in turn activates transcription factors such as Nurr1, a transcription factor critical for the development of DA neurons (Kim et al., 2006). Furthermore, our recent work demonstrated that STEP or Wnt5a can be involved in this regulation, by interacting with D2 receptors (Kim et al., 2008; Yoon et al., 2011). In light of these findings, it is an intriguing question whether this signaling can play a role in DA neurotransmission in the adult brain.
However, in the dorsal striatum, administration of the typical anti-psychotic D2-class receptor antagonist haloperidol stimulated the phosphorylation of ERK1/2, while the atypical anti-psychotic clozapine, which is also a D2-class antagonist, reduced ERK1/2 phosphorylation, showing that haloperidol and clozapine induce distinct patterns of phosphorylation in the dorsal striatum (Pozzi et al., 2003). Thus, the physiological relevance of this D2 receptor-mediated ERK signaling remains an open issue.
Taken together, it is evident that D1 and D2 receptors induce ERK activation via distinct mechanisms, and one can imagine that activation of these receptors can have different consequences, depending on the location and physiological status of the neurons expressing them.
The role of D1 and D2 receptors in reward-related behaviors has been investigated pharmacologically using subtype-specific agonists and antagonists, as well as by the analysis of receptor gene KO mice. Recent progress in optogenetics and the use of viral vectors with different genetic manipulations now allow a refined examination of the functional importance of these receptors in vivo (Table 1).
Exposure to a psychostimulant such as cocaine induces a progressive and enduring enhancement in the locomotor stimulant effect of subsequent administration, a phenomenon known as sensitization (Robinson and Berridge, 1993; Vanderschuren and Kalivas, 2000; Kalivas and Volkow, 2005; Steketee and Kalivas, 2011). The process of behavioral sensitization includes two distinct phases; initiation and expression. The initiation phase refers to the period during which the increased behavioral response following daily cocaine administration is associated with an increase in extracellular DA concentration. Behavioral sensitization continues to increase after the cessation of cocaine administration, and this procedure produces long-lasting sensitization, known as the expression of sensitization (Vanderschuren and Kalivas, 2000; Thomas et al., 2001; Steketee and Kalivas, 2011). The expression phase is characterized by a persistent drug hyper-responsiveness after cessation of the drug, which is associated with a cascade of neuroadaptation (Kalivas and Duffy, 1990; Robinson and Berridge, 1993). While this phenomenon has been studied mostly in experimental animals, the neuronal plasticity underlying behavioral sensitization is believed to reflect the neuroadaptations that contribute to compulsive drug cravings in humans (Robinson and Berridge, 1993; Kalivas et al., 1998). It has been suggested that the mesolimbic DA system from the VTA to the nucleus accumbens (NAc) and prefrontal cortex is an important mediator of these plastic changes, in association with the glutamatergic circuitry (Robinson and Berridge, 1993; Kalivas et al., 1998; Vanderschuren and Kalivas, 2000).
Animals behaviorally sensitized to cocaine, amphetamine, nicotine, or morphine (Kalivas and Duffy, 1990; Parsons and Justice, 1993) show enhanced DA release in the NAc in response to drug exposure. In addition to changes in neurotransmitter release, DA binding to its receptors plays a key role in behavioral sensitization (Steketee and Kalivas, 2011). For example, the enhanced excitability of VTA DA neurons that occurs with repeated cocaine exposure is associated with decreased D2 autoreceptor sensitivity (White and Wang, 1984; Henry et al., 1989). In addition, repeated intra-VTA injections of low doses of the D2 antagonist eticlopride, which is presumably autoreceptor-selective, enhanced subsequent responses to amphetamine (Tanabe et al., 2004).
A number of studies have shown that D1 and D2 DA receptors are differentially involved in cocaine-induced changes in locomotor activity. For example, initial studies employing pharmacological approaches have shown that mice or rats pre-treated with the D1 receptor antagonist SCH23390 showed an attenuated locomotor response to acute cocaine challenge, while the D2 receptor antagonists haloperidol and raclopride had no such effect (Cabib et al., 1991; Ushijima et al., 1995; Hummel and Unterwald, 2002). These results suggest different roles of DA receptor subtypes in the modulation of the stimulant effects of cocaine on locomotion. However, with regard to the behavioral sensitization induced by repetitive injections of cocaine, it has been reported that systemic administration of the D1 receptor antagonist SCH23390, or of the D2 receptor antagonists sulpiride, YM-09151-2 or eticlopride, does not affect the induction of cocaine sensitization (Kuribara and Uchihashi, 1993; Mattingly et al., 1994; Steketee, 1998; White et al., 1998; Vanderschuren and Kalivas, 2000).
The effects of direct intra-accumbens administration of SCH23390 on cocaine-induced locomotion, sniffing, and conditioned place preference (CPP) were investigated in rats, and these studies showed that the stimulation of D1-like receptors in the NAc is necessary for cocaine-CPP, but not for cocaine-induced locomotion (Baker et al., 1998; Neisewander et al., 1998). The direct intra-accumbens infusion of the D2/D3 receptor antagonist sulpiride in rats demonstrated that blockade of D2 receptors reverses the acute cocaine-induced locomotion (Neisewander et al., 1995; Baker et al., 1996), but these studies did not examine the effect on cocaine-induced behavioral sensitization. Interestingly, it has been reported that injection of the D2 receptor agonist quinpirole into the medial prefrontal cortex blocked the initiation and attenuated the expression of cocaine-induced behavioral sensitization (Beyer and Steketee, 2002).
D1 receptor null mice have been examined in the context of addictive behaviors, and initial studies revealed that D1 receptor mutant mice failed to exhibit the psychomotor stimulant effect of cocaine on motor and stereotyped behaviors compared to their wild-type littermates (Xu et al., 1994; Drago et al., 1996). However, it appears that D1 receptor KO abolishes the acute locomotor response to cocaine, but does not fully prevent locomotor sensitization to cocaine at all doses (Karlsson et al., 2008), demonstrating that genetic KO of D1 receptors is not sufficient to fully block cocaine sensitization under all conditions.
In D2 receptor KO mice, which have reduced general locomotor activity, the cocaine-induced motor activity level is low compared to WT mice, but these animals were similar to WT mice in their ability to develop cocaine-mediated behavioral sensitization or cocaine-seeking behaviors, with a slight decrease in sensitivity (Chausmer et al., 2002; Welter et al., 2007; Sim et al., 2013). Depletion of D2 receptors in the NAc by infusion of a lentiviral vector with a shRNA against the D2 receptor did not affect basal locomotor activity, nor cocaine-induced behavioral sensitization, but conferred stress-induced inhibition of the expression of cocaine-induced behavioral sensitization (Sim et al., 2013). These findings, together with previous reports, strongly suggest that blockade of D2 receptors in the NAc does not prevent cocaine-mediated behavioral sensitization, and that D2 receptors in the NAc play a distinct role in the regulation of synaptic modification triggered by stress and drug addiction.
Recent studies using genetically engineered mice that express Cre recombinase in a cell-type-specific manner revealed some role of D1 or D2 receptor-expressing MSNs in cocaine-addictive behaviors. For example, loss of DARPP-32 in D2 receptor-expressing cells resulted in an enhanced acute locomotor response to cocaine (Bateup, 2010). Hikida and co-workers used AAV vectors to express tetracycline-repressive transcription factor (tTa) using substance P (for D1-expressing MSNs) or enkephalin (for D2-expressing MSNs) promoters (Hikida et al., 2010). These vectors were injected into the NAc of mice, in which tetanus toxin light chain (TN) was controlled by the tetracycline-responsive element, to selectively abolish synaptic transmission in each MSN subtype. Reversible inactivation of D1/D2 receptor-expressing MSNs with the tetanus toxin (Hikida et al., 2010) revealed the predominant roles of the D1 receptor-expressing cells in reward learning and cocaine sensitization, but there was no change in sensitization caused by the inactivation of D2 receptor-expressing cells. Using DREADD (designer receptors exclusively activated by designer drugs) strategies, with viral-mediated expression of an engineered GPCR (Gi/o-coupled human muscarinic M4DREADD receptor, hM4D) that is activated by an otherwise pharmacologically inert ligand, Ferguson et al. (2011) showed that the activation of striatal D2 receptor-expressing neurons facilitated the development of amphetamine-induced sensitization. However, the optogenetic activation of D2 receptor-expressing cells in the NAc induced no change in cocaine-induced behavioral sensitization (Lobo, 2010).
Optogenetic inactivation of D1 receptor-expressing MSNs using the light-activated chloride pump halorhodopsin eNpHR3.0 (enhanced Natronomonas pharaonis halorhodopsin 3.0) during cocaine exposure resulted in an attenuation of cocaine-induced locomotor sensitization (Chandra et al., 2013). Furthermore, the conditional reconstruction of functional D1 receptor signaling in subregions of the NAc in D1 receptor KO mice showed that D1 receptor expression in the core region of the NAc, but not the shell, mediated D1 receptor-dependent cocaine sensitization (Gore and Zweifel, 2013). These findings suggest that DA mechanisms critically mediate cocaine-induced behavioral sensitization, with distinct roles for D1 and D2 receptors, although the precise contribution of D1 and D2 receptors and their downstream signaling pathways remains to be determined.
The CPP paradigm is a commonly used preclinical behavioral test with a classical (Pavlovian) conditioning model. During the training phase of CPP, one distinct context is paired with drug injections, while another context is paired with vehicle injections (Thomas et al., 2008). During a subsequent drug-free CPP test, the animal chooses between the drug- and the vehicle-paired contexts. An increased preference for the drug context serves as a measure of the drug’s Pavlovian reinforcing effects (Thomas et al., 2008).
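The preference measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how any of the cited studies scored CPP, scoring conventions differ between laboratories, and the names `CppTest`, `cpp_score` and `preference_fraction` are hypothetical. The sketch assumes the only data recorded are the seconds an animal spent in each context during the drug-free test.

```python
from dataclasses import dataclass


@dataclass
class CppTest:
    """Seconds spent in each context during the drug-free test (hypothetical record)."""
    drug_paired_s: float
    vehicle_paired_s: float


def cpp_score(test: CppTest) -> float:
    """Difference score: positive values indicate preference for the drug-paired context."""
    return test.drug_paired_s - test.vehicle_paired_s


def preference_fraction(test: CppTest) -> float:
    """Share of the time in the two conditioning contexts spent on the drug-paired side."""
    total = test.drug_paired_s + test.vehicle_paired_s
    if total <= 0:
        raise ValueError("the animal must spend some time in at least one context")
    return test.drug_paired_s / total


# Made-up example values, not data from any cited study.
example = CppTest(drug_paired_s=540.0, vehicle_paired_s=360.0)
print(cpp_score(example), preference_fraction(example))
```

Under this convention, a score near zero (or a fraction near 0.5) would indicate no preference, and comparing scores between genotypes or treatment groups corresponds to the comparisons reported in the studies discussed here.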
Although it has been previously reported that both systemic and intra-accumbens administration of the D1 receptor antagonist SCH23390 prevented cocaine CPP (Cervo and Samanin, 1995; Baker et al., 1998), D1 receptor mutant mice have been reported to demonstrate normal responses to the rewarding effects of cocaine in the CPP paradigm (Miner et al., 1995; Karasinska et al., 2005). Regarding the role of D2 receptors in CPP, there is considerable consensus in the literature that D2-like antagonists fail to influence place preference induced by cocaine (Spyraki et al., 1982; Shippenberg and Heidbreder, 1995; Cervo and Samanin, 1995; Nazarian et al., 2004). Consistent with these pharmacological studies, D2 receptor KO mice displayed a comparable CPP score to WT mice (Welter et al., 2007; Sim et al., 2013). Furthermore, D2L-/- mice developed a CPP to cocaine as did WT mice (Smith et al., 2002).
Recently, the effect of a conditional presynaptic KO of D2 receptors on addictive behaviors has been reported, and this study demonstrated that mice lacking D2 autoreceptors displayed cocaine supersensitivity, exhibited increased place preference for cocaine, as well as enhanced motivation for food reward, perhaps owing to the absence of presynaptic inhibition by autoreceptors that further elevates extracellular DA and maximizes the stimulation of postsynaptic DA receptors (Bello et al., 2011).
Results obtained from a different line of investigation showed that when D1-expressing MSNs are selectively activated by optogenetics, D1-Cre mice expressing DIO-AAV-ChR2-EYFP in the NAc displayed a significant increase in cocaine/blue-light preference compared to the control group (Lobo, 2010). In contrast, D2-Cre mice expressing DIO-AAV-ChR2-EYFP exhibited a significant attenuation of cocaine/blue-light preference relative to controls (Lobo, 2010), implicating a role for the activation of D1-expressing MSNs in enhancing the rewarding effects of cocaine, with activation of D2-expressing MSNs antagonizing the cocaine reward effect. Inhibition of D1-expressing MSNs with the tetanus toxin (Hikida et al., 2010) resulted in a diminished cocaine CPP, while no alterations to cocaine CPP after abolishing synaptic transmission in D2-expressing MSNs were observed (Hikida et al., 2010). Therefore, these data using optogenetics and cell-type-specific inactivation of neurons implicate opposing roles of D1- and D2-expressing MSNs in CPP, with D1 receptor-expressing MSNs implicated in promoting reward responses to psychostimulants, and D2 receptor-expressing MSNs dampening these behaviors (Lobo and Nestler, 2011).
Cocaine self-administration is an operant model in which laboratory animals lever press (or nose poke) for drug injections. The “self-administration” behavioral paradigm serves as an animal behavioral model of the human pathology of addiction (Thomas et al., 2008). It has been reported that selective lesion of DA terminals with 6-hydroxy DA (6-OHDA), or with the neurotoxin kainic acid in the NAc significantly attenuates cocaine self-administration, supporting the hypothesis that the reinforcing effects of cocaine are dependent upon mesolimbic DA (Pettit et al., 1984; Zito et al., 1985; Caine and Koob, 1994). Consistent with these findings, in vivo microdialysis studies demonstrate that accumbal extrasynaptic DA levels are enhanced during cocaine self-administration in both the rat (Hurd et al., 1989; Pettit and Justice, 1989) and monkey (Czoty et al., 2000). Collectively, these findings suggest that enhanced DA transmission in the NAc plays a crucial role in cocaine self-administration behavior.
DA receptor antagonists and agonists modulate cocaine self-administration, showing a dose-dependent biphasic effect. For example, selective antagonists for both D1 (Woolverton, 1986; Britton et al., 1991; Hubner and Moreton, 1991; Vanover et al., 1991; Caine and Koob, 1994) and D2 (Woolverton, 1986; Britton et al., 1991; Hubner and Moreton, 1991; Caine and Koob, 1994) receptors increase cocaine self-administration in response to lower doses of antagonist, but decrease self-administration in response to higher doses. This modulation appears to be specific when injected into the NAc but not the caudate nucleus, indicating a distinct role of NAc DA receptors in cocaine self-administration behaviors.
Later, using D1 and D2 receptor null mice, the involvement of these receptors in cocaine self-administration was examined. Interestingly, despite the observation of normal cocaine CPP in D1 receptor KO mice, cocaine self-administration was eliminated in these mice (Caine et al., 2007). In D2 receptor KO mice, however, self-administration of low to moderate doses of cocaine was unaffected, while self-administration of moderate to high doses of cocaine was actually increased (Caine et al., 2002). Recently, Alvarez and co-workers reported that synaptic strengthening onto D2-expressing MSNs in the NAc occurs in mice with a history of intravenous cocaine self-administration (Bock et al., 2013). Inhibition of D2-MSNs using a chemicogenetic approach enhanced the motivation to obtain cocaine, while optogenetic activation of D2-MSNs suppressed cocaine self-administration, suggesting that recruitment of D2-MSNs in the NAc functions to restrain cocaine self-administration (Bock et al., 2013).
Studies investigating the reinstatement of cocaine-seeking behavior revealed that the administration of D2 receptor agonists reinstates cocaine-seeking behavior (Self et al., 1996; De Vries et al., 1999, 2002; Spealman et al., 1999; Khroyan et al., 2000; Fuchs et al., 2002). Consistent with these findings, D2 receptor antagonists attenuate cocaine priming-induced drug-seeking behavior (Spealman et al., 1999; Khroyan et al., 2000), while pre-treatment with a D2-like agonist prior to a priming injection of cocaine potentiated the behavior (Self et al., 1996; Fuchs et al., 2002). However, it appears that D1-like receptor agonists do not reinstate cocaine-seeking behavior (Self et al., 1996; De Vries et al., 1999; Spealman et al., 1999; Khroyan et al., 2000). In fact, systemically administered D1-like agonists and antagonists both attenuate the drug-seeking behavior induced by a priming cocaine injection (Self et al., 1996; Norman et al., 1999; Spealman et al., 1999; Khroyan et al., 2000, 2003), showing a differential involvement of D1 and D2 receptors in priming-induced reinstatement of cocaine seeking.
Results from our laboratory indicate that in the absence of D2 receptors, cocaine-induced reinstatement was not affected (Sim et al., 2013). It is suggested that the reinstatement of drug-seeking behavior can also be precipitated by re-exposure to cocaine-associated stimuli or stressors (Shaham et al., 2003). When this possibility was tested, results from our laboratory found that while stress potentiates the cocaine-induced reinstatement in WT mice, stress suppressed the cocaine-induced reinstatement in the D2 receptor mutant animals, suggesting an unexplored role of D2 receptors in the regulation of synaptic modification triggered by stress and drug addiction (Sim et al., 2013).
Food and food-related cues can activate different brain circuits involved in reward, including the NAc, hippocampus, amygdala and/or pre-frontal cortex and midbrain (Palmiter, 2007; Kenny, 2011). It is believed that the mesolimbic DA system promotes the learning of associations between natural reward and the environments in which they are found; thus, food and water, or cues that predict them, promote rapid firing of DA neurons, and facilitate behaviors directed toward acquisition of the reward (Palmiter, 2007). Indeed DA-deficient mice show a loss of motivation to feed (Zhou and Palmiter, 1995), while D1 receptor null mice exhibit retarded growth and low survival after weaning; this phenotype can be rescued by providing KO mice with easy access to a palatable food, suggesting that the absence of D1 receptor is more related to a motor deficit (Drago et al., 1994; Xu et al., 1994). In contrast, D2 receptor KO mice show reduced food intake and body weight along with an increased basal energy expenditure level compared to their wild type littermates (Kim et al., 2010). Therefore, it is difficult to delineate the exact role of the DA system and of the receptor subtypes in food reward. Nevertheless, most human studies indicate the importance of the D2 receptor in the regulation of food reward in association with obesity.
Increasing evidence suggests that variations in DA receptors and DA release play a role in overeating and obesity, especially in association with striatal D2 receptor function and expression (Stice et al., 2011; Salamone and Correa, 2013). In animal studies, it has been shown that feeding increases the extracellular DA concentration in the NAc (Bassareo and Di Chiara, 1997), in a similar manner to drugs of abuse. However, in contrast to its effect on behaviors related to drug addiction, NAc DA depletion alone does not alter feeding behavior (Salamone et al., 1993). It appears that the pharmacological blockade of D1 and D2 receptors in the NAc affects motor behavior, amount and duration of feeding, but it does not reduce the amount of food consumed (Baldo et al., 2002). Interestingly, recent data in mice showed that binge eating was ameliorated by the acute administration of unilateral NAc shell deep brain stimulation, and this effect was mediated in part by activation of the D2 receptor, while deep brain stimulation of the dorsal striatum had no influence on this behavior (Halpern et al., 2013). However, it has been reported that when exposed to the same high-fat diet, mice with a lower density of D2 receptors in the putamen exhibit more weight gain than mice with a higher density of D2 receptors in the same region (Huang et al., 2006). This study compared DAT and D2 receptor densities in chronic, high-fat diet-induced obese, obese-resistant and low-fat-fed control mice, and found that D2 receptor density was significantly lower in the rostral part of caudate putamen in chronic high-fat diet-induced obese mice compared to obese-resistant and low-fat-fed control mice (Huang et al., 2006). This low level of D2 receptor may be associated with altered DA release, and it has also been reported that consumption of a high-fat, high-sugar diet leads to the downregulation of D2 receptors (Small et al., 2003) and reduced DA turnover (Davis et al., 2008).
In human studies, obese people and drug addicts both tend to show reduced expression of D2 receptors in striatal areas, and imaging studies have demonstrated that similar brain areas are activated by food- and drug-related cues (Wang et al., 2009). Positron emission tomography (PET) studies suggest that the availability of D2 receptors was decreased in obese individuals in proportion to their body mass index (Wang et al., 2001), thus suggesting that DA deficiency in obese individuals may perpetuate pathological eating as a means of compensating for the decreased activation of DA-mediated reward circuits. Volkow and co-workers also reported that obese versus lean adults show less striatal D2 receptor binding, and that this was positively correlated with metabolism in the dorsolateral prefrontal, medial orbitofrontal, anterior cingulate gyrus and somatosensory cortices (Volkow et al., 2008). This observation led to a discussion over whether decreases in striatal D2 receptors could contribute to overeating via the modulation of striatal prefrontal pathways that participate in inhibitory control and salience attribution, and whether the association between striatal D2 receptors and metabolism in the somatosensory cortices (regions that process palatability) could underlie one of the mechanisms through which DA regulates the reinforcing properties of food (Volkow et al., 2008).
Stice and co-workers used functional magnetic resonance imaging (fMRI) to show that individuals may overeat to compensate for a hypofunctioning dorsal striatum, particularly those with genetic polymorphisms of an A1 allele of the TaqIA in D2 receptor (DRD2/ANKK1) gene, which is associated with lower striatal D2 receptor density and attenuated striatal DA signaling (Stice et al., 2008a,b). These observations indicate that individuals who show blunted striatal activation during food intake are at risk for obesity, particularly those also at genetic risk for compromised DA signaling in brain regions implicated in food reward (Stice et al., 2008a, 2011). However, recent data showed that obese adults with or without binge eating disorder had a distinct genetic polymorphism of the TaqIA D2 receptor (DRD2/ANKK1) gene (Davis et al., 2012); therefore, it is plausible that similar brain DA systems are disrupted in both food motivation and drug addiction, even though it is not yet clear what these DA receptor data represent from the functional perspective of DA neurotransmission in brain.
As in obese people, low D2 receptor availability is associated with chronic cocaine abuse in humans (Volkow et al., 1993; Martinez et al., 2004). In contrast, overexpression of D2 receptors reduces the self-administration of alcohol in rats (Thanos et al., 2001). In humans, a higher-than-normal D2 receptor availability in non-alcoholic members of alcoholic families was reported (Volkow et al., 2006; Gorwood et al., 2012), supporting the hypothesis that low levels of D2 receptors may be associated with an increased risk of addictive disorders. Therefore, it is possible that in the brains of both obese individuals and chronic drug abusers, there are low basal DA concentrations, and periodic exaggerated DA release associated with either food or drug intake, along with low expression, or dysfunctional D2 receptors.
Dopamine receptor expression levels in other areas of the brain may also be important. For example, Fetissov et al. (2002) observed that obese Zucker rats, which display a feeding pattern consisting of large meal size and small meal number, have a comparatively low level of D2 receptor expression in the ventromedial hypothalamus (VMH). Interestingly, in their study, when a selective D2 receptor antagonist, sulpiride was injected into the VMH of obese and lean rats, a hyperphagic response was elicited only in the obese rats, suggesting that by aggravating the already low level of D2 receptors, it was possible to increase food intake. This low D2 receptor expression may cause an exaggerated DA release in obese rats during food ingestion and a reduced satiety feedback effect of DA, which would facilitate DA release into the brain areas “craving” for DA (Fetissov et al., 2002).
Recently, in an elegant study conducted by Johnson and Kenny (2010), it was observed that animals provided with a “cafeteria diet” consisting of a selection of highly palatable energy-dense food gained weight, demonstrating compulsive eating behavior. In addition to their excessive adiposity and compulsive-like eating, cafeteria diet rats also had decreased D2 receptor expression in the striatum. Surprisingly, lentivirus-mediated knockdown of striatal D2 receptors rapidly accelerated the development of addiction-like reward deficits, and the onset of compulsive-like food-seeking behavior in rats with extended access to palatable high-fat food (Johnson and Kenny, 2010), again indicating that common hedonic mechanisms may underlie obesity and drug addiction. However, our own laboratory found somewhat unexpected results showing that D2 KO mice have a lean phenotype with enhanced hypothalamic leptin signaling compared to WT mice (Kim et al., 2010). Therefore, we cannot rule out that the D2 receptor plays a role in the homeostatic regulation of metabolism in association with a regulator of energy homeostasis such as leptin, in addition to its role in food motivation behavior. An animal model with a genetically manipulated conditional restriction of the D2 receptor in leptin receptor-expressing cells, for example, or other reward-related neuronal cells, together with neural integrative tools, could potentially elucidate the role of the DA system via D2 receptors in food reward and the homeostatic regulation of food intake.
Increasing evidence indicates that homeostatic regulators of food intake, such as leptin, insulin, and ghrelin, control and interact with the reward circuit of food intake, and thus regulate behavioral aspects of food intake and conditioning to food stimuli behaviors (Abizaid et al., 2006; Fulton et al., 2006; Hommel et al., 2006; Baicy et al., 2007; Farooqi et al., 2007; Palmiter, 2007; Konner et al., 2011; Volkow et al., 2011). Recent findings reveal that hormones implicated in regulating energy homeostasis also impinge directly on DA neurons; for example, leptin and insulin directly inhibit DA neurons, while ghrelin activates them (Palmiter, 2007; Kenny, 2011).
Hommel and co-workers demonstrated that VTA DA neurons express leptin receptor mRNA, and respond to leptin with the activation of an intracellular JAK-STAT (Janus kinase-signal transducer and activator of transcription) pathway, which is the major pathway involved in leptin receptor downstream signaling, as well as a reduction in the firing rate of DA neurons (Hommel et al., 2006). This study showed that direct administration of leptin to the VTA caused decreased food intake, while long-term RNAi-mediated knockdown of leptin receptors in the VTA led to increased food intake, locomotor activity, and sensitivity to highly palatable food. These data support a critical role for VTA leptin receptors in regulating feeding behavior, and provide functional evidence for the direct action of a peripheral metabolic signal on VTA DA neurons. These results are consistent with the idea that leptin signaling in the VTA normally suppresses DA signaling, and consequently decreases both food intake and locomotor activity. This suggests a physiological role for leptin signaling in the VTA, although the authors did not demonstrate that the effect of the virus injection on feeding was correlated directly with increased DA signaling (Hommel et al., 2006).
Fulton and co-workers also investigated the functional significance of leptin action in VTA DA neurons, to expand understanding of the multiple actions of leptin in the DA reward circuit (Fulton et al., 2006). Using double-label immunohistochemistry, they observed increased STAT3 phosphorylation in the VTA following peripheral leptin administration. These pSTAT3-positive neurons colocalized with DA neurons, and to a lesser extent with markers for GABA neurons. Retrograde neuronal tracing from the NAc revealed colocalization of the tracer with pSTAT3, indicating that a subset of VTA DA neurons expressing leptin receptors project to the NAc. When they assessed leptin function in the VTA, they found that ob/ob mice had a diminished locomotor response to amphetamine, and lacked locomotor sensitization to repeated amphetamine injections, with both defects being reversed by leptin infusion, thus indicating that the mesoaccumbens DA pathway, critical to integrating motivated behavior, also responds to this adipose-derived signal (Fulton et al., 2006). These lines of evidence importantly suggested the action of leptin in the DA reward system. However, given that physiological levels of leptin receptor expression appear to be very low in the midbrain, normal circulating leptin levels seem to have little effect on leptin receptor signaling within the VTA. Thus, whether leptin can exert a significant effect in vivo to inhibit DA neuron activity through its receptors in the VTA remains questionable (Palmiter, 2007).
There are also human studies showing that leptin can indeed control rewarding responses. Farooqi and co-workers reported that patients with congenital leptin deficiency displayed activation of DA mesolimbic targets (Farooqi et al., 2007). In the leptin-deficient state, images of well-liked foods engendered a greater wanting response, even when the subject has just been fed, while after leptin treatment, well-liked food images engendered this response only in the fasted state, an effect consistent with the response in control subjects. Leptin reduces activation in the NAc-caudate, and mesolimbic activation (Farooqi et al., 2007). Thus, this study suggests that leptin diminished the rewarding responses to food, acting on the DA system (Farooqi et al., 2007; Volkow et al., 2011). Another fMRI study by Baicy et al., also performed with patients with congenital leptin deficiency, showed that during viewing of food-related stimuli, leptin replacement reduced neural activation in brain regions linked to hunger (the insula, parietal and temporal cortex), while enhancing activation in regions linked to inhibition and satiety (the prefrontal cortex; Baicy et al., 2007). Therefore, it appears that leptin acts on neural circuits involved in hunger and satiety with inhibitory control.
Another peptide hormone, ghrelin, which is produced in the stomach and pancreas, is known to increase appetite and food intake (Abizaid et al., 2006). The ghrelin receptor, growth hormone secretagogue 1 receptor (GHSR), is present in hypothalamic centers as well as in the VTA. Abizaid and co-workers showed that in mice and rats, ghrelin bound to neurons of the VTA, where it triggered increased DA neuronal activity, synapse formation, and DA turnover in the NAc, in a GHSR-dependent manner. In addition, they demonstrated that direct VTA administration of ghrelin also triggered feeding behavior, while intra-VTA delivery of a selective GHSR antagonist blocked the orexigenic effect of circulating ghrelin, and blunted rebound feeding following fasting, suggesting that the DA reward circuitry is targeted by ghrelin to influence motivation for food (Abizaid et al., 2006).
Insulin, which is one of the key hormones involved in the regulation of glucose metabolism, and inhibits feeding, has been shown to also regulate the DA system in the brain. Insulin receptors are expressed in brain regions that are rich in DA neurons, such as the striatum and midbrain (Zahniser et al., 1984; Figlewicz et al., 2003), suggesting a functional interaction between the insulin and DA systems. Indeed, it has been shown that insulin acts on DA neurons, and infusion of insulin into the VTA decreases food intake in rats (Figlewicz et al., 2008; Bruijnzeel et al., 2011). Recent studies on the selective deletion of insulin receptors in midbrain DA neurons in mice demonstrated that this manipulation results in increased body weight, increased fat mass, and hyperphagia (Konner et al., 2011). While insulin acutely stimulated firing frequency in 50% of dopaminergic VTA/SN neurons, this response was abolished in those mice with the insulin receptor selectively deleted in DA neurons. Interestingly, in these mice, D2 receptor expression in the VTA was decreased compared to control mice. Moreover, these mice exhibited an altered response to cocaine under food-restricted conditions (Konner et al., 2011). Another recent report indicates that insulin can induce long-term depression (LTD) of mouse excitatory synapses onto VTA DA neurons (Labouèbe et al., 2013). Furthermore, after a sweetened high-fat meal, which elevates endogenous insulin levels, insulin-induced LTD is occluded. Finally, insulin in the VTA reduces food anticipatory behavior in mice, and CPP for food in rats. This study raises an interesting issue about how insulin can modulate reward circuitry, and suggests a new type of insulin-induced synaptic plasticity on VTA DA neurons (Labouèbe et al., 2013).
This review has focused on the role of the DA system, mainly concentrating on the roles of D1 and D2 receptors in reward-related behaviors, including addiction and food motivation. However, it is well known that the DA system in this reward circuit is finely modulated by glutamatergic, GABAergic, and other neurotransmitter systems, which form specific circuits to encode the neuronal correlates of behaviors. Recent breakthroughs in optogenetic tools to alter neuronal firing and function with light, as well as DREADDs, together with genetic manipulation of specific neuronal cells or circuits, are now allowing us to refine our insight into reward circuits in addiction, and the hedonic value of food intake. There is no doubt that these lines of investigation have provided a foundation for the future direction of our study of the neurocircuitry of the DA system in these behaviors. Future studies could include enlarged manipulations of important signaling molecules, for example, signaling molecules implicated in the D1 and D2 receptor signaling cascades, to explore the impact of these molecules on the induction and expression of specific reward behaviors. Given that these two receptors employ distinct signaling pathways, in terms of their respective G protein coupling, as well as in the activation of common signaling molecules such as ERK, the differential distribution of receptors, as well as of their downstream signaling molecules, may result in a different type of physiological response. Additionally, with this conceptual and technical evolution in the study of the DA system in behaviors, this research will have important implications in the clinical investigation of related neurological disorders and psychiatric diseases. Therefore, our continuing efforts to identify and characterize the organization and modification of DA synaptic functions in both animals and humans will contribute to the elucidation of neural circuits underlying the pathophysiology of drug addiction and eating disorders.
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP; No. 2011-0015678, No. 2012-0005303), MSIP: the Ministry of Science, ICT & Future Planning, and by a grant of the Korean Health Technology R&D Project (A111776) from the Ministry of Health & Welfare, Republic of Korea.
Drug-induced dopamine dysregulation can cause reckless sexual behavior. Does this have relevance for those who dysregulate dopamine with heavy porn use?
Parkinsonism Relat Disord. 2011 May;17(4):260-4. doi: 10.1016/j.parkreldis.2011.01.009.
Hassan A, Bower JH, Kumar N, Matsumoto JY, Fealey RD, Josephs KA, Ahlskog JE. Parkinsonism Relat Disord. 2011 Feb 8; Department of Neurology, Mayo Clinic, Rochester, MN 55905, USA.
BACKGROUND: Compulsive behaviors provoked by dopamine agonists often go undetected in clinical series, especially if not specifically inquired about.
AIM: To determine the frequency of compulsive behaviors in a Parkinson's disease (PD) clinic where agonist-treated patients were routinely asked about such aberrant behaviors.
METHODS: We utilized the Mayo Health Science Research database to ascertain all PD patients taking a dopamine agonist over a two year period (2007-2009). All were seen by a Mayo-Rochester Movement Disorders Staff specialist who routinely inquired about behavior compulsions.
RESULTS: Of 321 PD patients taking an agonist, 69 (22%) experienced compulsive behaviors, and 50/321 (16%) were pathologic. However, when the analysis was restricted to patients taking agonist doses that were at least minimally therapeutic, pathological behaviors were documented in 24%. The subtypes were: gambling (25; 36%), hypersexuality (24; 35%), compulsive spending/shopping (18; 26%), binge eating (12; 17%), compulsive hobbying (8; 12%) and compulsive computer use (6; 9%). The vast majority of affected cases (94%) were concurrently taking carbidopa/levodopa. Among those with adequate followup, behaviors completely or partly resolved when the dopamine agonist dose was reduced or ceased.
CONCLUSIONS: Dopamine agonist treatment of PD carries a substantial risk of pathological behaviors. These occurred in 16% of agonist-treated patients; however, when assessing patients whose dose was at least minimally in the therapeutic range, the frequency jumped to 24%. Pathological gambling and hypersexuality were most common. Carbidopa/levodopa therapy taken concurrently with a dopamine agonist appeared to be an important risk factor.
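The frequencies in the RESULTS above can be checked with a little arithmetic. The Python sketch below is only an illustration: the counts are copied from the abstract, while the helper `pct` and the variable names are hypothetical, and the assumption that the subtype percentages use the 69 patients with compulsive behaviors as their denominator is inferred from the reported figures rather than stated by the authors.

```python
def pct(count: int, total: int) -> int:
    """Percentage of total, rounded to the nearest whole number."""
    return round(100 * count / total)


AGONIST_TREATED = 321   # PD patients taking a dopamine agonist
COMPULSIVE = 69         # patients with any compulsive behavior
PATHOLOGIC = 50         # patients whose behaviors were pathologic

# Subtype counts as listed in the abstract.
SUBTYPES = {
    "gambling": 25,
    "hypersexuality": 24,
    "compulsive spending/shopping": 18,
    "binge eating": 12,
    "compulsive hobbying": 8,
    "compulsive computer use": 6,
}

# Assumed denominator: the 69 patients with compulsive behaviors.
subtype_pct = {name: pct(n, COMPULSIVE) for name, n in SUBTYPES.items()}
pathologic_pct = pct(PATHOLOGIC, AGONIST_TREATED)

# The subtype counts add up to more than 69, so some patients
# must have shown more than one type of compulsive behavior.
total_subtype_reports = sum(SUBTYPES.values())

print(subtype_pct)
print(pathologic_pct, total_subtype_reports)
```

With the 69-patient denominator, the rounded subtype shares match the percentages quoted in the abstract, which supports that reading. The subtype counts total 93, so the categories overlap and the percentages are not expected to sum to 100.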
The anhedonia hypothesis – that brain dopamine plays a critical role in the subjective pleasure associated with positive rewards – was intended to draw the attention of psychiatrists to the growing evidence that dopamine plays a critical role in the objective reinforcement and incentive motivation associated with food and water, brain stimulation reward, and psychomotor stimulant and opiate reward. The hypothesis called to attention the apparent paradox that neuroleptics, drugs used to treat a condition involving anhedonia (schizophrenia), attenuated in laboratory animals the positive reinforcement that we normally associate with pleasure. The hypothesis held only brief interest for psychiatrists, who pointed out that the animal studies reflected acute actions of neuroleptics whereas the treatment of schizophrenia appears to result from neuroadaptations to chronic neuroleptic administration, and that it is the positive symptoms of schizophrenia that neuroleptics alleviate, rather than the negative symptoms that include anhedonia. Perhaps for these reasons, the hypothesis has had minimal impact in the psychiatric literature. Despite its limited heuristic value for the understanding of schizophrenia, however, the anhedonia hypothesis has had major impact on biological theories of reinforcement, motivation, and addiction. Brain dopamine plays a very important role in reinforcement of response habits, conditioned preferences, and synaptic plasticity in cellular models of learning and memory. The notion that dopamine plays a dominant role in reinforcement is fundamental to the psychomotor stimulant theory of addiction, to most neuroadaptation theories of addiction, and to current theories of conditioned reinforcement and reward prediction. Properly understood, it is also fundamental to recent theories of incentive motivation.
The anhedonia hypothesis of neuroleptic action (Wise, 1982) was, from its inception (Wise et al., 1978), a corollary of broader hypotheses, the dopamine hypotheses of reward (Wise, 1978) or reinforcement (Fibiger, 1978). The dopamine hypotheses were themselves deviations from an earlier catecholaminergic theory, the noradrenergic theory of reward (Stein, 1968). The present review sketches the background, initial response, and current status of the inter-related dopamine hypotheses: the dopamine hypothesis of reward, the dopamine hypothesis of reinforcement, and the anhedonia hypothesis of neuroleptic action.
The notion that animal behavior is controlled by reward and punishment is certainly older than recorded history (Plato attributed it to his older brother). The notion that an identifiable brain mechanism subserves this function was anchored firmly to biological fact by the finding of Olds and Milner (1954) that rats will work for electrical stimulation of some but not other regions of the forebrain. This led to the postulation by Olds (1956) of "pleasure centers" in the lateral hypothalamus and related brain regions. Brain stimulation studies by Sem-Jacobsen (1959) and Heath (1963) confirmed that humans would work for such stimulation and found it pleasurable (Heath, 1972). Olds (Olds and Olds, 1963) mapped much of the rat brain for reward sites, and even as his title phrase "pleasure centers" (Olds, 1956) was capturing the minds of a generation of students he was thinking not about isolated centers so much as about interconnected circuit elements (Olds, 1956; 1959; Olds and Olds, 1965). Olds (1956) assumed these to be specialized circuits that "would be excited by satisfaction of the basic drives – hunger, sex, thirst and so forth."
The first hints of what neurotransmitters might carry reward-related signals in the brain came from pharmacological studies. Olds and Travis (1960) and Stein (1962) found that the tranquilizers reserpine and chlorpromazine dramatically attenuated intracranial self-stimulation, while the stimulant amphetamine potentiated it. Imipramine potentiated the effects of amphetamine (Stein, 1962). Reserpine was known to deplete brain noradrenaline, chlorpromazine was known to block noradrenergic receptors, amphetamine was known to be a noradrenaline releaser, and imipramine was known to block noradrenergic reuptake. Largely on the basis of these facts and the location of reward sites in relation to noradrenergic cells and fibers, Stein (1968) proposed that reward function was mediated by a noradrenergic pathway originating in the brainstem (interestingly, Stein initially identified the A10 cell group, which turned out to comprise dopaminergic rather than noradrenergic neurons, as the primary origin of this system). Pursuing his hypothesis, C.D. Wise and Stein (1969; 1970) found that inhibition of dopamine-β-hydroxylase – the enzyme that converts dopamine to norepinephrine – abolished self-stimulation and eliminated the rate-enhancing action of amphetamine; intraventricular administration of l-norepinephrine reinstated self-stimulation and restored the ability of amphetamine to facilitate it.
At the time of initial formulation of the noradrenergic theory of reward, dopamine was known as a noradrenergic precursor but not as a transmitter in its own right. At about this time, however, Carlsson et al. (1958) suggested that dopamine might be a neurotransmitter in its own right. The discovery that noradrenaline and dopamine have different distributions in the nervous system (Carlsson, 1959; Carlsson and Hillarp, 1962) appeared to confirm this assumption, and reward sites in the region of the dopamine-containing cells of the midbrain led Crow and others to suggest that the two catecholamine transmitters in forebrain circuitry – noradrenaline and dopamine – might each subserve reward function (Crow, 1972; Crow et al., 1972; Phillips and Fibiger, 1973; German and Bowden, 1974).
Evidence that eventually ruled out a major role for norepinephrine in brain stimulation and addictive drug reward began to accumulate from two sources: pharmacology and anatomy. The pharmacological issue was whether selective noradrenergic blockers or depletions disrupted reward function itself or merely impaired the performance capacity of the animals. For example, Roll (1970) reported that noradrenergic synthesis inhibition disrupted self-stimulation by making animals sleepy; waking them restored the behavior for a time, until the animals lapsed into sleep again (Roll, 1970). Noradrenergic receptor antagonists clearly disrupted intracranial self-stimulation in ways suggestive of debilitation rather than loss of sensitivity to reward (Fouriezos et al., 1978; Franklin, 1978). Also, noradrenergic antagonists failed to disrupt intravenous (IV) self-administration of amphetamine (Yokel and Wise, 1975; 1976; Risner and Jones, 1976) or cocaine (de Wit and Wise, 1977; Risner and Jones, 1980). Further, lesions of the noradrenergic fibers of the dorsal bundle failed to disrupt self-stimulation with stimulating electrodes near the locus coeruleus, where the bundle originates, or in the lateral hypothalamus, through which the bundle projects (Corbett et al., 1977). Finally, careful mapping of the region of the locus coeruleus and the trajectory of the dorsal noradrenergic bundle fibers that originate there revealed that positive reward sites in these regions did not correspond to the precise location of histochemically confirmed noradrenergic elements (Corbett and Wise, 1979).
On the other hand, as selective antagonists for dopamine receptors became available, evidence began to accumulate that dopamine receptor blockade disrupted self-stimulation in ways that implied a devaluation of reward rather than an impairment of performance capacity. There was considerable early concern that the effect of dopamine antagonists – neuroleptics – was primarily motor impairment (Fibiger et al., 1976). Our first study in this area was not subject to this interpretation because performance in our task was enhanced rather than disrupted by neuroleptics. In our study rats were trained to lever-press for IV injections of amphetamine, a drug that causes release of each of the four monoamine neurotransmitters – norepinephrine, epinephrine, dopamine, and serotonin. We trained animals to self-administer IV amphetamine and challenged with selective antagonists for adrenergic or dopaminergic receptors. Animals treated with low and moderate doses of selective dopamine antagonists simply increased their responding (as do animals tested with lower than normal amphetamine doses), while animals treated with high doses increased responding in the first hour or two but responded intermittently thereafter (as do animals tested with saline substituted for amphetamine) (Yokel and Wise, 1975; 1976). Similar effects were seen in rats lever-pressing for cocaine (de Wit and Wise, 1977). Very different effects were seen with selective noradrenergic antagonists; these drugs decreased responding from the very start of the session and did not lead to further decreases as the animals earned and experienced the drug in this condition (Yokel and Wise, 1975; 1976; de Wit and Wise, 1977). The increases in responding for drug reward could clearly not be attributed to performance impairment. 
The findings were interpreted as reflecting a reduction of the rewarding efficacy of amphetamine and cocaine, such that the duration of reward from a given injection was reduced by dopaminergic, but not noradrenergic, antagonists.
In parallel with our pharmacological studies of psychomotor stimulant reward, we carried out pharmacological studies of brain stimulation reward. Here, however, dopamine antagonists, like reward-reduction, reduced rather than increased lever-pressing. The reasons that neuroleptics decrease responding for brain stimulation and increase responding for psychomotor stimulants are interesting and are now understood (Lepore and Franklin, 1992), but at the time decreased responding was suggested to reflect parkinsonian side-effects of dopaminergic impairment (Fibiger et al., 1976). The time-course of our finding appeared to rule out this explanation. We tracked the time-course of responding in well-trained animals that were pre-treated with the dopamine antagonists pimozide or butaclamol. We found that the animals responded normally in the initial minutes of each session, when they would have expected normal reward from the prior reinforcement history, but they slowed or ceased responding, depending on the neuroleptic dose, as did animals unexpectedly tested under conditions of reduced reward (Fouriezos and Wise, 1976; Fouriezos et al., 1978). Animals pretreated with the noradrenergic antagonist phenoxybenzamine, in contrast, showed depressed lever-pressing from the very start of the session and they did not slow further as they earned and experienced the rewarding stimulation. Performance was poor in the phenoxybenzamine-treated animals, but it did not worsen as the animals gained experience with the reward while under the influence of the drug.
That dopaminergic but not noradrenergic antagonists impaired the ability of reward to sustain motivated responding was confirmed in animals tested in a discrete-trial runway test. Here, the animals ran a two-meter alleyway from a start box to a goal box where they could lever-press, on each of 10 trials per day, for 15 half-second trains of brain stimulation reward. After several days of training the animals were tested after neuroleptic pretreatment. Over the course of 10 trials in the neuroleptic condition, the animals stopped leaving the start box immediately when the door was opened, stopped running quickly and directly to the goal box, and stopped lever-pressing for the stimulation. Importantly, however, the consummatory response – earning the stimulation once they reached the goal box – deteriorated before the instrumental responses – leaving the start box and running the alleyway – deteriorated. The animals left the start box with normal latency for the first 8 trials, ran normally for only the first 7 trials, and lever-pressed at normal rates for only the first 6 trials of the neuroleptic test session. Thus the animals showed signs of disappointment in the reward – indicated by the decreased responding in the goal box – before they showed any lack of motivation as indicated by approach responding.
These self-stimulation findings were again incompatible with the possibility that our neuroleptic doses were simply causing motor deficits. The animals showed normal capacity at the beginning of sessions, and continued to run the alleyway at peak speed until after they showed signs of disappointment with the reward in the goal box. Moreover, in the lever-pressing experiments the neuroleptic-treated animals sometimes leaped out of their open-topped test chambers and balanced precariously on the edge of the plywood walls; thus the animals still had good motor strength and coordination (Fouriezos, 1985). Moreover, neuroleptic-treated animals that ceased responding after a few minutes did not do so because of exhaustion; they re-initiated normal responding when presented reward-predictive environmental stimuli (Fouriezos and Wise, 1976; Franklin and McCoy, 1979). Moreover, after extinguishing one learned response for brain stimulation reward, neuroleptic-treated rats will initiate, with normal response strength, an alternative, previously learned, instrumental response for the same reward (they then go through progressive extinction of the second response: Gallistel et al., 1982). Finally, moderate reward-attenuating doses of neuroleptics do not impose a lowered response ceiling as do changes in performance demands (Edmonds and Gallistel, 1974); rather they merely increase the amount of stimulation (reward) necessary to motivate responding at the normal maximum rates (Gallistel and Karras, 1984). These pharmacological findings suggested that whatever collateral deficits they may cause, neuroleptic drugs devalue the effectiveness of brain stimulation and psychomotor stimulant rewards.
In parallel with our pharmacological studies, we initiated anatomical mapping studies with two advantages over earlier approaches. First, we used a moveable electrode (Wise, 1976) so that we could test several stimulation sites within each animal. In each animal, then, we had anatomical controls: ineffective stimulation sites above or below loci where stimulation was rewarding. Electrode movements of 1/8 mm were often sufficient to take an electrode tip from a site where stimulation was not rewarding to a site where it was, or vice versa. This allowed us to identify the dorsal-ventral boundaries of the reward circuitry within a vertical electrode penetration in each animal. Second, we took advantage of a new histochemical method (Bloom and Battenberg, 1976) to identify the boundaries of the catecholamine systems in the same histological material that showed the electrode track. Previous studies had relied on single electrode sites in each animal and on comparisons between Nissl-stained histological sections and line drawings showing the locations of catecholamine systems. Our mapping studies showed that the boundaries of the effective zones of stimulation did not correspond to the boundaries of noradrenergic cell groups or fiber bundles (Corbett and Wise, 1979) and did correspond to the boundaries of the dopamine cell groups in the ventral tegmental area and substantia nigra pars compacta (Corbett and Wise, 1980) and pars lateralis (Wise, 1981). While subsequent work has raised the question of whether rewarding stimulation activates high-threshold catecholamine systems directly or rather activates their low-threshold input fibers (Gallistel et al., 1981; Bielajew and Shizgal, 1986; Yeomans et al., 1988), the mapping studies tended to focus attention on dopamine rather than norepinephrine systems as substrates of reward.
The term "anhedonia" was first introduced in relation to studies of food reward (Wise et al., 1978). Here again, we found that when well-trained animals were first tested under moderate doses of the dopamine antagonist pimozide, they initiated responding normally for food reward. Indeed, pimozide-pretreated animals responded as much (at 0.5 mg/kg) or almost as much (at 1.0 mg/kg) the first day under pimozide treatment as they did when food was given in the absence of pimozide. When retrained for two days and then tested a second time under pimozide, however, they again responded normally in the early portion of their 45-min sessions but stopped responding earlier than normal and their total responding for this second session was significantly lower than on a drug-free day or on their first pimozide-test day. When retrained and tested a third and fourth time under pimozide, the animals still initiated responding normally but ceased responding progressively earlier. Normal responding in the first few minutes of each session confirmed that the doses of pimozide were not simply debilitating the animals; decreased responding after tasting the food in the pimozide condition suggested that the rewarding (response-sustaining) effect of food was devalued when the dopamine system was blocked.
In this study, a comparison group was trained the same way, but these animals were simply not rewarded on the four "test" days when the experimental groups were pretreated with pimozide. Just as the pimozide-treated animals lever-pressed the normal 200 times for food pellets on the first day, so did the non-rewarded animals lever-press the normal 200 times despite the absence of the normal food reward. On successive days of testing, however, lever-pressing in the non-rewarded group dropped to 100, 50, and 25 responses, showing the expected decrease in resistance to extinction that paralleled the pattern seen in the pimozide-treated animals. A similar pattern across successive tests is seen when animals trained under deprivation are tested several times under conditions of satiety; the first time tested the animals respond for and eat food that was freely available before or during the test. Like the habit-driven lever-pressing in our pimozide-treated or non-rewarded animals, the habit-driven eating under satiety decreases progressively with repeated testing. Morgan (1974) termed the progressive deterioration of responding under satiety "resistance to satiation," calling attention to the parallel with resistance to extinction. In all three conditions – responding under neuroleptics, responding under non-reward, and responding under satiety – the behavior is driven by a response habit that decays if not supported by normal reinforcement. In our experiment, an additional comparison group established that there was no sequential debilitating effect of repeated testing with pimozide, a drug with a long half-life and subject to sequestration by fat. The animals of this group received pimozide in their home cages but were not tested on the first three "test days"; they were allowed to lever-press for food only after the fourth of their series of pimozide injections. 
These animals responded avidly for food after their fourth pimozide treatment, just like animals that were given the opportunity to lever-press for food the first time they were treated with pimozide. Thus responding in Test 4 depended not just on having had pimozide in the past, but on having tasted food under pimozide conditions in the past. Something about the memory of food experience under pimozide – not just of pimozide alone – caused the progressively earlier response cessation seen when pimozide tests were repeated. The fact that pimozide-pretreated animals responded avidly for food until after they had tasted it in the pimozide condition led us to postulate that the food was not as enjoyable under the pimozide condition. The essential feature of what appeared to be a devaluation of reward under pimozide had been captured earlier in a remark of George Fouriezos in connection with our brain stimulation experiments: "Pimozide takes the jolts out of the volts."
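The session totals reported above for the non-rewarded comparison group (200, 100, 50, and 25 responses) can be checked with a few lines of arithmetic. The following is only an illustrative sketch: it takes the figures exactly as stated in the text, and the function and variable names are hypothetical, not part of the original study.

```python
# Session totals for the non-rewarded comparison group, as stated in the text.
# Treating these figures as exact is an assumption made for illustration.
responses = [200, 100, 50, 25]


def successive_ratios(counts):
    """Return each session's total as a fraction of the previous session's total."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]


# Ratio of each test session to the one before it.
ratios = successive_ratios(responses)

# Each session expressed as a percentage of the first test session.
percent_of_first = [100 * r / responses[0] for r in responses]

for day, (count, pct) in enumerate(zip(responses, percent_of_first), start=1):
    print(f"Test {day}: {count} responses ({pct:.1f}% of Test 1)")
print("Session-to-session ratios:", ratios)
```

On these figures, responding roughly halves from each test to the next, which is the progressive loss of resistance to extinction that the pimozide-treated animals paralleled in pattern.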
The formal statement of the anhedonia hypothesis appeared a few years after the food reward studies in a journal that published peer commentaries along with review papers (Wise, 1982). Two thirds of the initial commentaries either contested the hypothesis or proposed an alternative to it (Wise, 1990). For the most part, the primary arguments against the original hypothesis appealed to motor or other performance deficits (Freed and Zec, 1982; Koob, 1982; Gramling et al., 1984; Ahlenius, 1985). These were arguments addressed to the finding that neuroleptics caused decreased performance for food or brain stimulation reward but did not, for the most part, address the fact that neuroleptics disrupted maintenance rather than initiation of responding. They also failed to address the fact that when neuroleptic-treated animals stopped responding their responding could be reinstated by exposing them to previously conditioned reward-predictive stimuli (Fouriezos and Wise, 1976; Franklin and McCoy, 1979). Nor could these arguments be reconciled with the fact that such reinstated responding itself underwent apparent extinction. Finally, they did not address the fact that neuroleptics caused compensatory increases in lever-pressing for amphetamine and cocaine reward (Yokel and Wise, 1975; 1976; de Wit and Wise, 1977).
The most critical evidence against a motor hypothesis was elaborated before the formal statement of the anhedonia hypothesis. The paper (Wise et al., 1978) is still steadily cited, but is probably now rarely read in the original. The original findings are summarized above, but they continue to escape the attention of most remaining proponents of motor hypotheses (or other hypotheses of debilitation); for this reason the original paper is still worth reading. The critical findings are that moderate doses of neuroleptics severely attenuate responding for food only after the animal has had experience with that food while under the influence of the neuroleptic. If the animal has had experience with the neuroleptic in the absence of food, its subsequent effect on responding for food is minimal; however, after having had experience with the food under the influence of the neuroleptic, the effect of the neuroleptic becomes progressively stronger. Similar effects are seen when the only instrumental responses required of the animal are those of picking up the food, chewing it, and swallowing (Wise and Colle, 1984; Wise and Raptis, 1986).
Several of the criticisms of the anhedonia hypothesis have been more semantic than substantial. While agreeing that the effects of neuroleptics cannot be explained as simple motor debilitation, several authors have suggested other names for the condition. Katz (1982) termed it "hedonic arousal"; Liebman (1982) termed it "neuroleptothesia"; Rech (1982) termed it "neurolepsis" or "blunting of emotional reactivity"; Kornetsky (1985) termed it a problem of "motivational arousal"; and Koob (1982) begged the question by calling it a "higher order" motor problem. The various criticisms addressed differentially the anhedonia hypothesis, the reinforcement hypothesis, and the reward hypothesis.
The anhedonia hypothesis was really a corollary of the hypothesis that dopamine was important for objectively measured reward function. The initial statement of the hypothesis was that the neuroleptic pimozide "appears to selectively blunt the rewarding impact of food and other hedonic stimuli" (Wise, 1978). It was not really an hypothesis about subjectively experienced anhedonia but rather an hypothesis about objectively measured reward function. The first time the hypothesis was actually labeled the "anhedonia hypothesis" (Wise, 1982), it was stated thus: "the most subtle and interesting effect of neuroleptics is a selective attenuation of motivational arousal that is (a) critical for goal-directed behavior, (b) normally induced by reinforcers and associated environmental stimuli, and (c) normally accompanied by the subjective experience of pleasure." The hypothesis linked dopamine function explicitly to motivational arousal and reinforcement – the two fundamental properties of rewards – and implied only a partial correlation with the subjective experience of the pleasure that "usually" accompanies positive reinforcement.
The suggestion that dopamine might be important for pleasure itself came in part from the subjective reports of patients (Healy, 1989) or normal subjects (Hollister et al., 1960; Bellmaker and Wald, 1977) given neuroleptic treatments. The dysphoria caused by neuroleptics is quite consistent with the suggestion that they attenuate the normal pleasures of life. Consistent with this view was the fact that drugs like cocaine and amphetamine – drugs that are presumed to be addictive at least in part because of the euphoria they cause (Bijerot, 1980) – increase extracellular dopamine levels (vanRossum et al., 1962; Axelrod, 1970; Carlsson, 1970). The neuroleptic pimozide, a competitive antagonist at dopamine receptors (and the neuroleptic used in our animal studies), had been reported to decrease the euphoria induced by IV amphetamine in humans (Jönsson et al., 1971; Gunne et al., 1972).
The ability of neuroleptics to block the subjective effects of euphoria has been questioned on the basis of clinical reports of continued amphetamine and cocaine abuse in neuroleptic-treated schizophrenic patients and on the basis of more recent studies on the subjective effects of neuroleptic-treated normal humans. The clinical observations are difficult to interpret because of compensatory adaptations to chronic dopamine receptor blockade and because of variability in drug intake, neuroleptic dose, and compliance with treatment during periods of stimulant use. The more recent controlled studies of the effects of pimozide on amphetamine euphoria (Brauer and de Wit, 1996; 1997) are also problematic. First, there are issues of pimozide dose: the high dose of the early investigators was 20 mg (Jönsson et al., 1971; Gunne et al., 1972), whereas, because of concern about extrapyramidal side-effects, the high dose in the more recent studies was 8 mg. More troublesome are the differences in amphetamine treatment between the original and the more recent studies. In the original studies, 200 mg of amphetamine was given intravenously to regular amphetamine users; in the more recent studies, 10 or 20 mg was given to normal volunteers by mouth in capsules. One must wonder if normal volunteers are feeling and rating the same euphoria from their 20 mg capsules as is felt by chronic amphetamine users after their 200 mg IV injection (Grace, 2000; Volkow and Swanson, 2003).
The notion that neuroleptics attenuate the pleasure of food reward has also been challenged on the basis of rat studies (Treit and Berridge, 1990; Pecina et al., 1997). Here the challenge was based on the taste-reactivity test, putatively a test of the hedonic impact of sweet taste (Berridge, 2000). The test has been used to challenge directly the hypothesis that "pimozide and other dopamine antagonists produce anhedonia, a specific reduction of the capacity for sensory pleasure" (Pecina et al., 1997, p. 801). This challenge is, however, subject to serious caveats: "When using taste reactivity as a measure of 'liking' or hedonic impact it is important to be clear about a potential confusion. Use of terms such as 'like' and 'dislike' does not necessarily imply that taste reactivity patterns reflect a subjective experience of pleasure produced by a food" (Berridge, 2000, p. 192, emphasis as in the original), and that "We will place 'liking' and 'wanting' in quotation marks because our use differs in an important way from the ordinary use of these words. By their ordinary meaning, these words typically refer to the subjective experience of conscious pleasure or conscious desire" (Berridge and Robinson, 1998, p. 313). The taste reactivity test seems unlikely to directly measure the subjective pleasure of food, as "normal" taste reactivity in this paradigm is seen in decorticate rats (Grill and Norgren, 1978) and similar reactions are seen in anencephalic children (Steiner, 1973). Thus it appears that the initial interpretation of the taste reactivity test (Berridge and Grill, 1984) was correct: the test measures the fixed action patterns of food ingestion or rejection – more a part of swallowing than of smiling – reflecting hedonic impact only insomuch as it reflects the positive or negative valence of the fluid injected into the passive animal's mouth.
The anhedonia hypothesis was based on the observation that a variety of rewards failed to sustain normal levels of instrumental behavior in well-trained but neuroleptic-treated animals. This was not taken as evidence of neuroleptic-induced anhedonia, but rather evidence of neuroleptic-induced attenuation of positive reinforcement. Under neuroleptic treatment animals showed normal initiation but progressive decrements in responding both within and across repeated trials, and these decrements paralleled in pattern, if not in degree, the similar decrements seen in animals that were simply allowed to respond under conditions of non-reward (Wise et al., 1978). Moreover, naïve rats were found not to learn to lever-press normally for food if they were pretreated with neuroleptic for their training sessions (Wise and Schwartz, 1981). Thus the habit-forming effect of food is severely attenuated by dopamine blockade. These findings have not been challenged but have rather been replicated by critics of what has come to be labeled the anhedonia hypothesis (Tombaugh et al., 1979; Mason et al., 1980), who have argued that under their conditions neuroleptics cause performance deficits above and beyond clear deficits in reinforcement. Given the fact that neuroleptics block all dopamine systems, some of which are thought to be involved in motor function, this was not surprising or contested (Wise, 1985).
Clear similarities between the effects of non-reward and the effects of reward under neuroleptic treatment are further illustrated by two much more subtle paradigms. The first is a partial reinforcement paradigm. It is well established that animals respond more under extinction conditions if they are trained not to expect a reward for every response they make. That animals respond more in extinction if they have been trained under intermittent reinforcement is known as the partial reinforcement extinction effect (Robbins, 1971). Ettenberg and Camp found partial reinforcement extinction effects with neuroleptic challenges of food- and water-trained response habits. They tested animals in extinction of a runway task after training in each of three conditions. Food- or water-deprived animals were trained, one trial per day, to run 155 cm in a straight alley runway for food (Ettenberg and Camp, 1986b) or water (Ettenberg and Camp, 1986a) reward. One group was trained under a "continuous" reinforcement schedule; that is, they received their designated reward on each of the 30 days of training. A second group was trained under partial reinforcement; they received their designated reward on only 20 of the 30 training days; on 10 days randomly spaced in the training period, the animals found no food or water when they arrived at the goal box. The third group received food or water on every trial but were periodically treated with the neuroleptic haloperidol; on 10 of their training trials they found food or water in the goal box, but, having been pretreated with haloperidol on those days, they experienced the food or water under conditions of dopamine receptor blockade. The consequences of these training regimens were assessed in 22 subsequent daily "extinction" trials in which each group was allowed to run but received no reward in the goal box. All animals ran progressively slower as the extinction trials continued. 
However, the performance of animals that had been trained under continuous reinforcement conditions deteriorated much more rapidly from day to day than did that of animals that had been trained under partial reinforcement conditions. The animals that had been trained under "partial" haloperidol conditions also persevered more than the animals with the continuous reinforcement training; the intermittent haloperidol animals had start-box latencies and running times that were identical to those of the animals trained under partial reinforcement. That is, the animals pretreated with haloperidol on 1/3 of their training days performed in extinction as if they had experienced no reward on 1/3 of their training days. There is no possibility of a debilitation confound here, first because the performance of the haloperidol-treated animals was better than that of the control animals and second because haloperidol was not given on the test days, only on some of the training days.
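The "1/3" equivalence in the Ettenberg and Camp design can be laid out explicitly. The sketch below only restates the schedule described above (30 training days, 10 non-reward days or 10 haloperidol days); the dictionary layout and the function name are hypothetical, and counting a haloperidol day as a "degraded" reward day is the interpretation under discussion, not an established fact.

```python
from fractions import Fraction

# Number of daily training trials per group, as described in the text.
TRAINING_DAYS = 30

# Days on which the normal reward experience was absent (no reward) or,
# on the interpretation being illustrated, degraded (reward under haloperidol).
groups = {
    "continuous": {"no_reward_days": 0, "haloperidol_days": 0},
    "partial": {"no_reward_days": 10, "haloperidol_days": 0},
    "intermittent_haloperidol": {"no_reward_days": 0, "haloperidol_days": 10},
}


def degraded_fraction(group):
    """Fraction of training days without a normal reward experience."""
    degraded = group["no_reward_days"] + group["haloperidol_days"]
    return Fraction(degraded, TRAINING_DAYS)


for name, schedule in groups.items():
    print(f"{name}: {degraded_fraction(schedule)} of training days")
```

If haloperidol days count as non-reward days, the partial and intermittent-haloperidol groups have the same schedule on paper, which matches their identical start-box latencies and running times in extinction.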
The second subtle paradigm is a two-lever drug discrimination paradigm. Here the animals are trained to continue responding on one of two levers as long as that lever yields food reward, and to shift to the other lever when no longer rewarded. With low doses of haloperidol, animals inexplicably shift to the wrong lever as if they had earned no food with their initial lever-press (Colpaert et al., 2007). That is, haloperidol-treated rats that earned food on their initial lever-press behaved like normal rats that failed to earn food on their initial lever-press. This was not a reflection of some form of haloperidol-induced motor deficit, because the evidence that food was not rewarding under haloperidol involved not the absence of a response but rather the initiation of a response: a response on the second lever.
Thus it is increasingly clear that, whatever else they do, neuroleptics decrease the reinforcing efficacy of a range of normally positive rewards.
The most recent challenge to the anhedonia hypothesis comes from theorists who argue that the primary motivational deficit caused by neuroleptics is a deficit in the drive or motivation to find or earn reward rather than the reinforcement that accompanies the receipt of reward (Berridge and Robinson, 1998; Salamone and Correa, 2002; Robinson et al., 2005; Baldo and Kelley, 2007). The suggestion that dopamine plays an important role in motivational arousal was, in fact, stressed more strongly in the original statement of the anhedonia hypothesis than was anhedonia itself: "the most subtle and interesting effect of neuroleptics is a selective attenuation of motivational arousal which is (a) critical for goal-directed behavior…" (Wise, 1982). That elevations of extracellular dopamine can motivate learned behavior sequences is perhaps best illustrated by the "priming" effect that is seen when free reward is given to an animal that is temporarily not responding in an instrumental task (Howarth and Deutsch, 1962; Pickens and Harris, 1968). This effect is best illustrated by drug-induced reinstatement of responding in animals that have undergone repeated extinction trials (Stretch and Gerber, 1973; de Wit and Stewart, 1983). One of the most powerful stimuli for reinstatement of responding in animals that have extinguished a cocaine-seeking or a heroin-seeking habit is an unearned injection of the dopamine agonist bromocriptine (Wise et al., 1990). The inclusion of motivational arousal is the main feature that differentiates the dopamine hypothesis of reward from the narrower dopamine hypothesis of reinforcement (Wise, 1989; 2004).
While there is ample evidence that dopamine can amplify or augment motivational arousal, there is equally ample evidence that neuroleptic drugs do not block the normal motivational arousal that is provided for a well-trained animal by reward-predictive cues in the environment. As discussed above, neuroleptic-treated animals tend to initiate response habits normally. Such animals start but do not normally continue to lever-press, run, or eat in operant chambers, runways, or free-feeding tests. When given in a discrete-trial runway task, haloperidol-treated animals run normally during the trial when the haloperidol is given; their motivational deficit only appears the next day, when the haloperidol has been metabolized and all that is left of the treatment is the memory of the treatment trial (McFarland and Ettenberg, 1995; 1998). The start-box cues fail to trigger running down the runway for food or heroin not on the day when the animals are under the influence of haloperidol, but on the next day when they only remember what the reward was like on the haloperidol day. So the motivational arousal of the animal on the day it gets haloperidol treatment is not compromised by the treatment; rather it must be the memory of a degraded reward that discourages the animal the day after the treatment trial. This is the most salient message from studies of the effects of neuroleptics on instrumental behavior in a range of tasks: neuroleptics at appropriate doses do not interfere with the ability of learned stimuli to instigate motivated behavior until after the stimuli have begun to lose the ability to maintain that behavior because of experience of the reward in the neuroleptic condition (Fouriezos and Wise, 1976; Fouriezos et al., 1978; Wise et al., 1978; Wise and Raptis, 1986; McFarland and Ettenberg, 1995; 1998).
This is not to say that dopamine is completely irrelevant to motivated behavior, only that the surges of phasic dopamine that are triggered by reward-predictors (Schultz, 1998) are, for the moment, unnecessary for the normal motivation of animals with an uncompromised reinforcement history. Well-trained animals respond out of habit, and do so even under conditions of dopamine receptor blockade. If brain dopamine is completely depleted, however, there are very dramatic effects on motivated behavior (Ungerstedt, 1971; Stricker and Zigmond, 1974). This is evident from studies of mutant mice that do not synthesize dopamine; these animals, like animals with experimental dopamine depletions, fail to move unless aroused by pain or stress, a dopamine agonist, or the dopamine-independent stimulant caffeine (Robinson et al., 2005). Thus minimal levels of functional dopamine are necessary for all normal behavior; dopamine-depleted animals, like dopamine-depleted parkinsonian patients (Hornykiewicz, 1979), are almost completely inactive unless stressed (Zigmond and Stricker, 1989). Among the primary deficits associated with dopamine depletion are aphagia and adipsia, which have motivational as well as motor components (Teitelbaum and Epstein, 1962; Ungerstedt, 1971; Stricker and Zigmond, 1974). Reward-blocking doses of neuroleptics, however, fail to produce the profound catalepsy that is caused by profound dopamine depletion.
The dopamine terminal field that has received most attention with respect to reward function is nucleus accumbens. Attention was drawn to nucleus accumbens first because lesions of this but not other catecholamine systems disrupted cocaine self-administration (Roberts et al., 1977). Further attention was generated by the suggestions that nucleus accumbens septi should be considered a limbic extension of the striatum, rather than an extension of the septum (Nauta et al., 1978a,b) and that it is an interface between the limbic system – conceptually linked to functions of motivation and emotion – and the extrapyramidal motor system (Mogenson et al., 1980). Studies of opiate reward also suggested that it is the mesolimbic dopamine system – the system projecting primarily from the ventral tegmental area to the nucleus accumbens – that is associated with reward function. Morphine in the ventral tegmental area was found to activate (Gysling and Wang, 1983; Matthews and German, 1984), by disinhibiting them (Johnson and North, 1992), dopaminergic neurons, and microinjections of morphine in this region potentiated brain stimulation reward (Broekkamp et al., 1976), produced conditioned place preferences (Phillips and LePiane, 1980), and were self-administered in their own right (Bozarth and Wise, 1981).
One challenge to the dopamine hypotheses thus arose from the finding that nucleus accumbens lesions failed to disrupt all instrumental behavior (Salamone et al., 1997). Aside from the problem that it is almost impossible to lesion nucleus accumbens selectively and, at the same time, completely, there are other reasons to assume that nucleus accumbens lesions should not eliminate all of dopamine's motivational actions. First, cocaine is directly self-administered not only into nucleus accumbens (Carlezon et al., 1995; Ikemoto, 2003), but also – and more avidly – into the medial prefrontal cortex (Goeders and Smith, 1983; Goeders et al., 1986) and olfactory tubercle (Ikemoto, 2003). Intravenous cocaine reward is attenuated not only by microinjections of a D1 antagonist into the ventral tegmental area (Ranaldi and Wise, 2001) but also by similar injections into the substantia nigra (Quinlan et al., 2004). Finally, post-trial dopamine release in the dorsal striatum enhances consolidation of learning and memory (White and Viaud, 1991), and dopamine blockade in the dorsal striatum impairs long-term potentiation (a cellular model of learning and memory) in this region (Centonze et al., 2001). Potentiation of memory consolidation is, in essence, the substance of reinforcement (Landauer, 1969) and dopamine appears to potentiate memory consolidation in the dorsal striatum and a variety of other structures (White, 1989; Wise, 2004).
Thus, for a variety of reasons, the dopamine hypothesis should not be reduced to a nucleus accumbens hypothesis. Nucleus accumbens is but one of the dopamine terminal fields implicated in reward function.
While evidence has steadily accumulated for an important role of dopamine in reward function – a role we originally summarized loosely as "motivational arousal" – our understanding of the precise nature of this function continues to develop in subtlety and complexity. Four issues, in addition to variations on the old motor hypothesis, have arisen in the recent literature.
One suggestion, offered as a direct challenge to the anhedonia hypothesis and the dopamine hypothesis of reward (Salamone et al., 1994; 1997; 2005) is that what neuroleptics reduce is not motivation or reinforcement but rather the animal's willingness to exert effort (Salamone et al., 2003). This suggestion is merely semantic. The willingness to exert effort is the essence of what we mean by motivation or drive, the first element in the initial three-part statement of the anhedonia hypothesis (Wise, 1982).
Studies of mutant mice lacking dopamine in dopaminergic neurons (but retaining it in noradrenergic neurons) show that brain dopamine is not absolutely necessary for food-rewarded instrumental learning. If given caffeine to arouse them, dopamine-deficient mice can learn to choose the correct arm of a T-maze for food reward (Robinson et al., 2005). This implicates dopamine in the motivational arousal that is lacking in dopamine-deficient mice that are not treated with caffeine, and indicates that dopamine is not essential to – though it normally contributes greatly to – the rewarding effects of food. It is interesting to note, however, that caffeine – required if the mutant mice are to behave at all without dopamine – also restores the feeding response that is lost after neurotoxic lesions of dopamine neurons in adult animals (Stricker et al., 1977). The mechanism of the caffeine effects is not fully understood, but caffeine affects the same medium-sized spiny striatal neurons that are the normal neuronal targets of dopaminergic fibers of the nigro-striatal and meso-limbic dopamine systems. It acts there as a phosphodiesterase inhibitor that increases intracellular cyclic AMP (Greengard, 1976) and as an adenosine receptor antagonist (Snyder et al., 1981). Moreover, the adenosine receptors that are blocked by caffeine normally form heteromers with dopamine receptors and affect the intracellular response to the effects of dopamine at those receptors (Ferre et al., 1997; Schiffmann et al., 2007). The complex interactions of dopamine and adenosine receptors in the striatum raise the possibility that caffeine enables learning in dopamine-deficient mice by substituting for dopamine in a shared or overlapping intracellular signaling cascade.
Schultz and colleagues have shown that the ventral tegmental dopamine neurons implicated in reward function respond not only to food reward itself but, as a result of experience, to predictors of food reward (Romo and Schultz, 1990; Ljungberg et al., 1992). As the animal learns that an environmental stimulus predicts food reward, the 200 millisecond burst of dopaminergic nerve firing that was initially triggered by food presentation itself becomes linked, instead, to the food-predictive stimulus that precedes it. If the food-predictive stimulus predicts food on only a fraction of the trials, then the dopaminergic neurons burst, to a lesser extent, in response to both the predictor and to the food; the stronger the probability of prediction, the stronger the response to the predictor and the weaker the response to the food presentation.
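The transfer of the dopamine burst from the food to its predictor can be illustrated with a simple prediction-error learning rule. The Python sketch below is not taken from the studies cited above. It assumes a Rescorla-Wagner style update, a reward magnitude of 1, and hypothetical names and parameters (`train_cue`, `alpha`, `n_trials`, `seed`). It is meant only to show how a learned prediction could produce a cue response that grows with reward probability while the response to the food itself shrinks.

```python
import random


def train_cue(p_reward, n_trials=2000, alpha=0.05, seed=0):
    """Simulate cue-food pairings; return mean responses over the last 500 trials.

    Returns (cue_response, food_response): the modeled burst to the cue and
    the modeled burst to food on rewarded trials. All values are arbitrary units.
    """
    rng = random.Random(seed)
    v = 0.0  # learned prediction attached to the cue (starts at zero)
    cue_responses, food_responses = [], []
    for trial in range(n_trials):
        rewarded = rng.random() < p_reward
        reward = 1.0 if rewarded else 0.0
        error = reward - v  # prediction error when the outcome arrives
        if trial >= n_trials - 500:
            # Assumption: the cue itself arrives unannounced, so the response
            # to it is modeled as the prediction it carries.
            cue_responses.append(v)
            if rewarded:
                food_responses.append(error)
        v += alpha * error  # move the prediction toward the outcome
    cue_mean = sum(cue_responses) / len(cue_responses)
    food_mean = sum(food_responses) / len(food_responses) if food_responses else 0.0
    return cue_mean, food_mean


if __name__ == "__main__":
    for p in (0.25, 0.5, 0.75, 1.0):
        cue, food = train_cue(p)
        print(f"p={p:.2f}  cue response={cue:.2f}  food response={food:.2f}")
```

In this toy model the learned prediction settles near the reward probability, so a fully reliable predictor takes over nearly the whole response and a partial predictor splits it with the food.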
The fact that the dopaminergic neurons cease to respond to food itself and respond instead to food predictors raises the issue of whether the taste of food is not itself merely a reward predictor (Wise, 2002). Some tastes appear to be unconditioned reinforcers from birth (Steiner, 1974), but others gain motivational significance through the association of their taste with their post-ingestional consequences (Sclafani and Ackroff, 1994).
The concept of "reinforcement" is a concept of "stamping in" of associations (Thorndike, 1898). Whether the association is between a conditioned and an unconditioned stimulus (Pavlov, 1928), a stimulus and a response (Thorndike, 1911), or a response and an outcome (Skinner, 1937), reinforcement refers to the strengthening of an association through experience. Another way to look at it is that reinforcement is a process that enhances consolidation of the memory trace for the association (Landauer, 1969). Studies of post-trial dopaminergic activation suggest that dopamine serves to enhance or reinforce the memory trace for recently experienced events and associations, and that it does so in a variety of dopamine terminal fields (White and Milner, 1992). Several lines of evidence (Reynolds et al., 2001; Wise, 2004; Hyman et al., 2006; Wickens et al., 2007) now implicate a modulatory role for dopamine in cellular models of learning and memory that is consistent with the view that dopamine plays an important role in reinforcement.
While variations of the anhedonia hypothesis or the dopamine hypotheses of reward or reinforcement continue to appear, the hypothesis as originally stated still captures the scope of the involvement of dopamine in motivational theory. Normal levels of brain dopamine are important for normal motivation, while phasic elevations of dopamine play an important role in the reinforcement that establishes response habits and stamps in the association between rewards and reward-predicting stimuli. Subjective pleasure is the normal correlate of the rewarding events that cause phasic dopamine elevations, but stressful events can also cause dopamine elevations; thus pleasure is not a necessary correlate of dopamine elevations or even reinforcement itself (Kelleher and Morse, 1968).
COMMENTS: Addicted brains not only suffer from decreased sensitivity to dopamine, but also less dopamine released in response to stimuli.
Nora D. Volkow, MD; Joanna S. Fowler, PhD; Gene-Jack Wang, MD; James M. Swanson, PhD; Frank Telang, MD
Arch Neurol. 2007;64(11):1575-1579.
Imaging studies have provided new insights on the role of dopamine (DA) in drug abuse and addiction in the human brain. These studies have shown that the reinforcing effects of drugs of abuse in human beings are contingent not just on DA increases per se in the striatum (including the nucleus accumbens) but on the rate of DA increases. The faster the increases, the more intense the reinforcing effects. They have also shown that elevated levels of DA in the dorsal striatum are involved in the motivation to procure the drug when the addicted subject is exposed to stimuli associated with the drug (conditioned stimuli). In contrast, long-term drug use seems to be associated with decreased DA function, as evidenced by reductions in D2 DA receptors and DA release in the striatum in addicted subjects. Moreover, the reductions in D2 DA receptors in the striatum are associated with reduced activity of the orbitofrontal cortex (region involved with salience attribution and motivation and with compulsive behaviors) and of the cingulate gyrus (region involved with inhibitory control and impulsivity), which implicates deregulation of frontal regions by DA in the loss of control and compulsive drug intake that characterizes addiction. Because DA cells fire in response to salient stimuli and facilitate conditioned learning, their activation by drugs will be experienced as highly salient, driving the motivation to take the drug and further strengthening conditioned learning and producing automatic behaviors (compulsions and habits).
Dopamine (DA) is the neurotransmitter that has been classically associated with the reinforcing effects of drugs of abuse and may have a key role in triggering the neurobiological changes associated with addiction. This notion reflects the fact that all of the drugs of abuse increase the extracellular concentration of DA in the nucleus accumbens. Increases in DA levels have an important role in coding reward and prediction of reward, in the motivational drive to procure the reward, and in facilitating learning.1 It is also believed that DA codes not just for reward but for saliency, which, in addition to reward, includes aversive, novel, and unexpected stimuli. The diversity of DA effects is likely translated by the specific brain regions (limbic, cortical, and striatal) it modulates.
Herein, we summarize findings from imaging studies that used positron emission tomography (PET) to investigate the role of DA in the reinforcing effects of drugs, the long-term brain changes in drug-addicted subjects, and the vulnerability to addiction. Though most of the PET studies on addiction have focused on DA, it is clear that drug-induced adaptations in other neurotransmitters (ie, glutamate, γ-aminobutyric acid, opioids, and cannabinoids) are also involved, but the lack of radioligands has limited their investigation.
The effects of short-term drug exposure on extracellular DA concentrations in the human brain can be studied using PET and D2 DA receptor radioactive ligands that are sensitive to competition with endogenous DA, such as raclopride labeled with carbon 11 (11C). The relationship between the effects of drugs on DA and their reinforcing properties in the human brain (assessed by self-reports of “high” and “euphoria”) was studied for the stimulant drugs methylphenidate and amphetamine. Methylphenidate, like cocaine, increases DA by blocking DA transporters, whereas amphetamine, like methamphetamine, increases DA by releasing it from the terminal via DA transporters. Intravenous methylphenidate (0.5 mg/kg) and amphetamine (0.3 mg/kg) increased the extracellular concentration of DA in the striatum, and these increases were associated with increases in self-reports of high and euphoria.2 In contrast, when given orally, methylphenidate (0.75-1 mg/kg) also increased DA but was not perceived as reinforcing.3 Because intravenous administration leads to fast DA changes, whereas oral administration increases DA slowly, the failure to observe the high with oral methylphenidate likely reflects its slow pharmacokinetics.
Indeed, the speed at which drugs of abuse enter the brain is recognized as affecting their reinforcing effects.4 This association has also been shown in PET studies that evaluated the pharmacokinetics of cocaine (using [11C]cocaine) and methylphenidate (using [11C]methylphenidate) in the human brain, documenting that it was the fast uptake of the drug into the brain but not the brain concentration per se that was associated with getting high.5 The dependency of the reinforcing effects of drugs on brain pharmacokinetic properties suggests a possible association with phasic DA cell firing (fast-burst firing at frequencies >30 Hz), which also leads to fast changes in DA concentration and whose function is to highlight the saliency of stimuli.6 This is in contrast to tonic DA cell firing (slow firing at frequencies around 5 Hz), which maintains baseline steady-state DA levels and whose function is to set the overall responsiveness of the DA system. This led us to speculate that drugs of abuse induce changes in DA concentration that mimic but exceed those produced by phasic DA cell firing.
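The distinction between how fast a drug reaches the brain and how much of it arrives can be made concrete with a toy kinetic model. The Python sketch below is an illustration only and is not drawn from the PET studies cited above. It assumes a one-compartment model with first-order absorption and elimination, and the function names, rate constants, and units are all hypothetical. It compares two routes delivering the same dose, one absorbed quickly and one slowly.

```python
import math


def concentration(t, dose, ka, ke):
    """One-compartment model: first-order absorption (ka) and elimination (ke).

    Assumes ka != ke. Units are arbitrary.
    """
    return dose * ka / (ka - ke) * (math.exp(-ke * t) - math.exp(-ka * t))


def peak_and_max_rise(dose, ka, ke, n_steps=2000, dt=0.1):
    """Return (peak concentration, steepest rate of rise) on a time grid."""
    conc = [concentration(i * dt, dose, ka, ke) for i in range(n_steps + 1)]
    peak = max(conc)
    max_rise = max((conc[i + 1] - conc[i]) / dt for i in range(n_steps))
    return peak, max_rise


if __name__ == "__main__":
    # Hypothetical rate constants: same dose and elimination, different absorption.
    fast_peak, fast_rise = peak_and_max_rise(dose=1.0, ka=1.0, ke=0.01)
    slow_peak, slow_rise = peak_and_max_rise(dose=1.0, ka=0.05, ke=0.01)
    print(f"fast route: peak={fast_peak:.2f}, max rise={fast_rise:.3f}")
    print(f"slow route: peak={slow_peak:.2f}, max rise={slow_rise:.3f}")
```

Under these assumed constants the two routes differ far more in their steepest rate of rise than in their peak level, which is the qualitative pattern the text associates with the presence or absence of a reported high.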
Synaptic increases in DA concentration occur during drug intoxication in both addicted and nonaddicted subjects. However, a compulsive drive to continue drug taking when exposed to the drug is not triggered in all subjects. Inasmuch as it is the loss of control and the compulsive drug taking that characterizes addiction, the short-term drug-induced DA level increase alone cannot explain this condition. Because drug addiction requires long-term drug administration, we suggest that in vulnerable individuals (because of genetic, developmental, or environmental factors), addiction is related to the repeated perturbation of DA-regulated brain circuits involved with reward/saliency, motivation/drive, inhibitory control/executive function, and memory/conditioning. Herein, we discuss findings from imaging studies on the nature of these changes.
Many radioactive tracers have been used to assess changes in targets involved in DA neurotransmission (Table 1). Using [18F]N-methylspiroperidol or [11C]raclopride, we and others have shown that subjects with a wide variety of drug addictions (cocaine, heroin, alcohol, and methamphetamine) have significant reductions in D2 DA receptor availability in the striatum (including the ventral striatum) that persist months after protracted detoxification (reviewed in Volkow et al2). We have also revealed evidence of decreased DA cell activity in cocaine abusers. Specifically, we showed that the striatal increases in DA level induced by intravenous methylphenidate (assessed with [11C]raclopride) in cocaine abusers were substantially blunted when compared with DA level increases in control subjects (50% lower).7 Because DA concentration increases induced by methylphenidate are dependent on DA release, a function of DA cell firing, we speculated that this difference likely reflects decreased DA cell activity in the cocaine abusers. Similar findings have been reported in alcohol abusers.8
These brain-imaging studies suggest 2 abnormalities in addicted subjects that would result in decreased output of DA circuits related to reward; that is, decreases in D2 DA receptors and decreases in DA release in the striatum (including the nucleus accumbens). Each would contribute to the decreased sensitivity in addicted subjects to natural reinforcers. Because drugs are much more potent at stimulating DA-regulated reward circuits than natural reinforcers, we postulated that drugs are still able to activate these down-regulated reward circuits. The decreased sensitivity of reward circuits would lead to decreased interest in day-to-day environmental stimuli, possibly predisposing subjects to seek drug stimulation as a means to temporarily activate these reward circuits underlying the transition from taking drugs to feel high to taking them to feel normal.
Preclinical studies have demonstrated a prominent role of DA in motivation that seems to be mediated in part via a DA-regulated circuit involving the orbitofrontal cortex (OFC) and the anterior cingulate gyrus (CG).9 In imaging studies in human subjects using the radioactive tracer fludeoxyglucose F 18, we and others have shown decreased activity in the OFC and CG in different classes of addicted subjects (reviewed in Volkow et al2). Moreover, in both cocaine- and methamphetamine-addicted subjects, we have shown that the reduced activity in the OFC and CG is associated with decreased availability of D2 DA receptors in the striatum (reviewed in Volkow et al7) (Figure). Because the OFC and CG participate in the assignment of value to reinforcers as a function of context, their disruption in the abuser could interfere with their ability to change the saliency value of the drug as a function of alternative reinforcers, becoming the main drive motivating behavior. In contrast to the pattern of decreased OFC and CG activity when drug-free, addicted subjects show increased activation in these regions when presented with the drug or drug-related stimuli, consistent with the enhanced saliency values of drugs or drug reinforcers in these subjects. Moreover, the enhanced activation of the OFC and CG was associated with the intensity of desire for the drug. This has led us to speculate that the hypermetabolism in the OFC and CG triggered by drugs or drug cues underlies the compulsive drug intake, just as it underlies the compulsive behaviors in patients with obsessive-compulsive disorders.10 This dual effect of disruption of the OFC-CG brain circuit is consistent with the behavior of the drug addict, whose compulsion to take the drug overrides competing cognitive-based tendencies not to take the drug; just as in patients with obsessive-compulsive disorders, the compulsion persists despite cognitive attempts to stop the behaviors.
A, Images of D2 dopamine receptors (raclopride labeled with carbon 11) and of brain glucose metabolism (fludeoxyglucose), which is used as an indicator of brain function in a control subject and a cocaine abuser. Cocaine abusers have lower D2 dopamine receptor availability in the striatum and lower metabolism in the orbitofrontal cortex (OFC) than do control subjects. B, Correlations between D2 dopamine (DA) receptors and orbitofrontal cortex (OFC) metabolism in detoxified cocaine abusers and detoxified methamphetamine abusers. Note that the subjects with the lowest measures of D2 DA receptor availability have the lowest metabolism in the OFC.
The CG and the OFC are also involved with inhibitory control, which led us to postulate that disrupted DA modulation of the OFC and CG also contributes to the loss of control over drug intake by drug-addicted subjects.10 Inhibitory control is also dependent on the dorsolateral prefrontal cortex, which is also affected in addiction (reviewed in Volkow et al2). Abnormalities in the dorsolateral prefrontal cortex are expected to affect processes involved in executive control including impairments in self-monitoring and behavior control, which have an important role in the cognitive changes that perpetuate drug self-administration.10
Circuits underlying memory and learning, including conditioned-incentive learning, habit learning, and declarative memory (reviewed in Vanderschuren and Everitt11), have been proposed to be involved in drug addiction. The effects of drugs on memory systems suggest ways that neutral stimuli can acquire reinforcing properties and motivational salience, that is, through conditioned-incentive learning. In research on relapse, it has been important to understand why drug-addicted subjects experience an intense desire for the drug when exposed to places where they have taken the drug, to persons with whom previous drug use occurred, and to paraphernalia used to administer the drug. This is clinically relevant because exposure to conditioned cues (stimuli associated with the drug) is a key contributor to relapse. Because DA is involved with prediction of reward (reviewed in Schultz9), we hypothesized that DA might underlie conditioned responses that trigger craving. Studies in laboratory animals support this hypothesis: when neutral stimuli are paired with a drug, they will, with repeated associations, acquire the ability to increase DA in the nucleus accumbens and dorsal striatum, becoming conditioned cues. Furthermore, these neurochemical responses are associated with drug-seeking behavior (reviewed in Vanderschuren and Everitt11). In human beings, PET studies with [11C]raclopride recently confirmed this hypothesis by showing that, in cocaine abusers, drug cues (cocaine-cue video of scenes of subjects taking cocaine) substantially increased DA in the dorsal striatum and that these increases were associated with cocaine craving.12,13 Because the dorsal striatum is implicated in habit learning, this association likely reflects the strengthening of habits as chronicity of addiction progresses. This suggests that a basic neurobiologic disruption in addiction might be a DA-triggered conditioned response that results in habits leading to compulsive drug consumption.
It is likely that these conditioned responses reflect adaptations in corticostriatal glutamatergic pathways that regulate DA release (reviewed in Vanderschuren and Everitt11).
A challenging question in the neurobiology of drug abuse is why some individuals are more vulnerable than others to becoming addicted to drugs. Imaging studies suggest that preexisting differences in DA circuits may be one mechanism underlying the variability in responsiveness to drugs of abuse. Specifically, baseline measures of striatal D2 DA receptors in nonaddicted subjects have been shown to predict subjective responses to the reinforcing effects of intravenous methylphenidate treatment; individuals describing the experience as pleasant had substantially lower levels of D2 DA receptors compared with those describing methylphenidate as unpleasant (reviewed in Volkow et al7). This suggests that the relationship between DA levels and reinforcing responses follows an inverted U-shaped curve: too little is not optimal for reinforcement but too much is aversive. Thus, high D2 DA receptor levels could protect against drug self-administration. Support for this was provided by preclinical studies that showed that up-regulation of D2 DA receptors in the nucleus accumbens dramatically reduced alcohol intake in animals previously trained to self-administer alcohol14 and by clinical studies showing that subjects who, despite having a dense family history of alcoholism, were not alcoholics had substantially higher D2 DA receptors in the striatum compared with individuals without such family histories.15 In these subjects, the higher the D2 DA receptors, the higher the metabolism in the OFC and CG. Thus, we postulate that high levels of D2 DA receptors may protect against alcoholism by modulating frontal circuits involved in salience attribution and inhibitory control.
Imaging studies have corroborated the role of DA in the reinforcing effects of drugs of abuse in human beings and have extended traditional views of DA involvement in drug addiction. These findings suggest multicomponent strategies for the treatment of drug addiction that include strategies to (1) decrease the reward value of the drug of choice and increase the reward value of nondrug reinforcers, (2) weaken conditioned drug behaviors, (3) weaken the motivational drive to take the drug, and (4) strengthen frontal inhibitory and executive control (Table 2).
Correspondence: Nora D. Volkow, MD, National Institute on Drug Abuse, 6001 Executive Blvd, Room 5274-MSC 9581, Bethesda, MD 20892 (email@example.com).
Accepted for Publication: January 17, 2007.
Author Contributions: Study concept and design: Volkow. Acquisition of data: Volkow, Wang, Swanson, and Telang. Analysis and interpretation of data: Volkow, Fowler, Wang, and Telang. Drafting of the manuscript: Volkow and Swanson. Critical revision of the manuscript for important intellectual content: Volkow, Fowler, Wang, Swanson, and Telang. Statistical analysis: Volkow. Obtained funding: Volkow, Fowler, and Wang. Administrative, technical, and material support: Volkow, Fowler, Wang, and Telang. Study supervision: Volkow, Wang, and Telang.
Financial Disclosure: None reported.
Funding/Support: This study was supported in part by the intramural program of the National Institute on Alcohol Abuse and Alcoholism; grants DA 06891, DA 09490, DA 06278, and AA 09481 from the National Institutes of Health; and the US Department of Energy, Office of Biological and Environmental Research.
Neuron. Author manuscript; available in PMC Dec 9, 2011.
Midbrain dopamine neurons are well known for their strong responses to rewards and their critical role in positive motivation. It has become increasingly clear, however, that dopamine neurons also transmit signals related to salient but non-rewarding experiences such as aversive and alerting events. Here we review recent advances in understanding the reward and non-reward functions of dopamine. Based on this data, we propose that dopamine neurons come in multiple types that are connected with distinct brain networks and have distinct roles in motivational control. Some dopamine neurons encode motivational value, supporting brain networks for seeking, evaluation, and value learning. Others encode motivational salience, supporting brain networks for orienting, cognition, and general motivation. Both types of dopamine neurons are augmented by an alerting signal involved in rapid detection of potentially important sensory cues. We hypothesize that these dopaminergic pathways for value, salience, and alerting cooperate to support adaptive behavior.
The neurotransmitter dopamine (DA) has a crucial role in motivational control – in learning what things in the world are good and bad, and in choosing actions to gain the good things and avoid the bad things. The major sources of DA in the cerebral cortex and in most subcortical areas are the DA-releasing neurons of the ventral midbrain, located in the substantia nigra pars compacta (SNc) and ventral tegmental area (VTA) (Bjorklund and Dunnett, 2007). These neurons transmit DA in two modes, ‘tonic’ and ‘phasic’ (Grace, 1991; Grace et al., 2007). In their tonic mode DA neurons maintain a steady, baseline level of DA in downstream neural structures that is vital for enabling the normal functions of neural circuits (Schultz, 2007). In their phasic mode DA neurons sharply increase or decrease their firing rates for 100–500 milliseconds, causing large changes in DA concentrations in downstream structures lasting for several seconds (Schultz, 1998; Schultz, 2007).
These phasic DA responses are triggered by many types of rewards and reward-related sensory cues (Schultz, 1998) and are ideally positioned to fulfill DA’s roles in motivational control, including its roles as a teaching signal that underlies reinforcement learning (Schultz et al., 1997; Wise, 2005) and as an incentive signal that promotes immediate reward seeking (Berridge and Robinson, 1998). As a result, these phasic DA reward signals have taken on a prominent role in theories about the functions of cortical and subcortical circuits and have become the subject of intense neuroscience research. In the first part of this review we will introduce the conventional theory of phasic DA reward signals and will review recent advances in understanding their nature and their control over neural processing and behavior.
In contrast to the accepted role of DA in reward processing, there has been considerable debate over the role of phasic DA activity in processing non-rewarding events. Some theories suggest that DA neuron phasic responses primarily encode reward-related events (Schultz, 1998; Ungless, 2004; Schultz, 2007), while others suggest that DA neurons transmit additional non-reward signals related to surprising, novel, salient, and even aversive experiences (Redgrave et al., 1999; Horvitz, 2000; Di Chiara, 2002; Joseph et al., 2003; Pezze and Feldon, 2004; Lisman and Grace, 2005; Redgrave and Gurney, 2006). In the second part of this review we will discuss a series of studies that have put these theories to the test and have revealed much about the nature of non-reward signals in DA neurons. In particular, these studies provide evidence that DA neurons are more diverse than previously thought. Rather than encoding a single homogeneous motivational signal, DA neurons come in multiple types that encode reward and non-reward events in different manners. This poses a problem for general theories which seek to identify dopamine with a single neural signal or motivational mechanism.
To resolve this dilemma, in the final part of this review we propose a new hypothesis to explain the presence of multiple types of DA neurons, the nature of their neural signals, and their integration into distinct brain networks for motivational control. Our basic proposal is as follows. One type of DA neuron encodes motivational value, being excited by rewarding events and inhibited by aversive events. These neurons support brain systems for seeking goals, evaluating outcomes, and value learning. A second type of DA neuron encodes motivational salience, being excited by both rewarding and aversive events. These neurons support brain systems for orienting, cognitive processing, and motivational drive. In addition to their value and salience-coding activity, both types of DA neurons also transmit an alerting signal, triggered by unexpected sensory cues of high potential importance. Together, we hypothesize that these value, salience, and alerting signals cooperate to coordinate downstream brain structures and control motivated behavior.
Dopamine has long been known to be important for reinforcement and motivation of actions. Drugs that interfere with DA transmission interfere with reinforcement learning, while manipulations that enhance DA transmission, such as brain stimulation and addictive drugs, often act as reinforcers (Wise, 2004). DA transmission is crucial for creating a state of motivation to seek rewards (Berridge and Robinson, 1998; Salamone et al., 2007) and for establishing memories of cue-reward associations (Dalley et al., 2005). DA release is not necessary for all forms of reward learning and may not always be ‘liked’ in the sense of causing pleasure, but it is critical for causing goals to become ‘wanted’ in the sense of motivating actions to achieve them (Berridge and Robinson, 1998; Palmiter, 2008).
One hypothesis about how dopamine supports reinforcement learning is that it adjusts the strength of synaptic connections between neurons. The most straightforward version of this hypothesis is that dopamine controls synaptic plasticity according to a modified Hebbian rule that can be roughly stated as “neurons that fire together wire together, as long as they get a burst of dopamine”. In other words, if cell A activates cell B, and cell B causes a behavioral action which results in a reward, then dopamine would be released and the A→B connection would be reinforced (Montague et al., 1996; Schultz, 1998). This mechanism would allow an organism to learn the optimal choice of actions to gain rewards, given sufficient trial-and-error experience. Consistent with this hypothesis, dopamine has a potent influence on synaptic plasticity in numerous brain regions (Surmeier et al., 2010; Goto et al., 2010; Molina-Luna et al., 2009; Marowsky et al., 2005; Lisman and Grace, 2005). In some cases dopamine enables synaptic plasticity along the lines of the Hebbian rule described above, in a manner that is correlated with reward-seeking behavior (Reynolds et al., 2001). In addition to its effects on long-term synaptic plasticity, dopamine can also exert immediate control over neural circuits by modulating neural spiking activity and synaptic connections between neurons (Surmeier et al., 2007; Robbins and Arnsten, 2009), in some cases doing so in a manner that would promote immediate reward-seeking actions (Frank, 2005).
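The modified Hebbian rule described above can be sketched in a few lines of Python. This is an illustrative toy, not a model taken from the cited studies: the function name `three_factor_update`, the binary activity flags, and the learning rate of 0.1 are all assumptions chosen for clarity.

```python
def three_factor_update(weight, pre_active, post_active, dopamine_burst,
                        learning_rate=0.1):
    """Return the updated A->B weight under a dopamine-gated Hebbian rule.

    Hypothetical toy rule: the weight grows only when cell A (pre) and
    cell B (post) were both active AND a dopamine burst followed.
    """
    coactive = 1.0 if (pre_active and post_active) else 0.0
    gate = 1.0 if dopamine_burst else 0.0
    return weight + learning_rate * coactive * gate


w = 0.5
# A fired, B fired, action was rewarded: connection is reinforced.
w_rewarded = three_factor_update(w, True, True, dopamine_burst=True)
# A fired, B fired, but no reward followed: no change.
w_unrewarded = three_factor_update(w, True, True, dopamine_burst=False)
# Dopamine burst without A-B coactivity: no change.
w_no_coactivity = three_factor_update(w, True, False, dopamine_burst=True)
```

Only the first case changes the weight, which is the sense in which dopamine acts as a third factor alongside pre- and postsynaptic activity.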
In order to motivate actions that lead to rewards, dopamine should be released during rewarding experiences. Indeed, most DA neurons are strongly activated by unexpected primary rewards such as food and water, often producing phasic ‘bursts’ of activity (Schultz, 1998) (phasic excitations including multiple spikes (Grace and Bunney, 1983)). However, the pioneering studies of Wolfram Schultz showed that these DA neuron responses are not triggered by reward consumption per se. Instead they resemble a ‘reward prediction error’, reporting the difference between the reward that is received and the reward that was predicted to occur (Schultz et al., 1997) (Figure 1A). Thus, if a reward is larger than predicted, DA neurons are strongly excited (positive prediction error, Figure 1E, red); if a reward is smaller than predicted or fails to occur at its appointed time, DA neurons are phasically inhibited (negative prediction error, Figure 1E, blue); and if a reward is cued in advance so that its size is fully predictable, DA neurons have little or no response (zero prediction error, Figure 1C, black). The same principle holds for DA responses to sensory cues that provide new information about future rewards. DA neurons are excited when a cue indicates an increase in future reward value (Figure 1C, red), inhibited when a cue indicates a decrease in future reward value (Figure 1C, blue), and generally have little response to cues that convey no new reward information (Figure 1E, black). These DA responses resemble a specific type of reward prediction error called the temporal difference error or “TD error”, which has been proposed to act as a reinforcement signal for learning the value of actions and environmental states (Houk et al., 1995; Montague et al., 1996; Schultz et al., 1997). 
Computational models using a TD-like reinforcement signal can explain many aspects of reinforcement learning in humans, animals, and DA neurons themselves (Sutton and Barto, 1981; Waelti et al., 2001; Montague and Berns, 2002; Dayan and Niv, 2008).
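To make the TD error concrete, the following minimal sketch simulates a cue followed three time steps later by a reward. It is a textbook-style TD(0) illustration under stated assumptions, not the specific model of any cited paper: the cue arrives at an unpredictable time (so the pre-cue baseline has value 0), and the names `run_trial`, `alpha`, and `gamma` are hypothetical.

```python
def run_trial(values, reward, alpha=0.2, gamma=1.0, learn=True):
    """Run one cue->reward trial and return the list of TD errors.

    values[i] is the learned value of the i-th time step after cue onset.
    Entry 0 of the result is the error at cue onset; the last entry is
    the error at the time the reward is expected.
    """
    n = len(values)
    # Cue onset: transition from an unpredictive baseline (value 0).
    deltas = [gamma * values[0] - 0.0]
    for t in range(n):
        next_value = values[t + 1] if t + 1 < n else 0.0  # trial ends after reward
        r = reward if t == n - 1 else 0.0                  # reward on the last step
        delta = r + gamma * next_value - values[t]         # TD error
        deltas.append(delta)
        if learn:
            values[t] += alpha * delta
    return deltas


values = [0.0, 0.0, 0.0]                 # three time steps between cue and reward
first = run_trial(values, reward=1.0)    # naive animal: error occurs at the reward
for _ in range(500):                     # repeated cue-reward pairings
    run_trial(values, reward=1.0)
trained = run_trial(values, reward=1.0, learn=False)  # error has moved to the cue
omitted = run_trial(values, reward=0.0, learn=False)  # omission: negative error
```

In this toy setting the error starts at the time of reward, transfers to the cue after training, and becomes negative at the expected reward time when the reward is withheld, mirroring the three response patterns described above.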
An impressive array of experiments has shown that DA signals represent reward predictions in a manner that closely matches behavioral preferences, including the preference for large rewards over small ones (Tobler et al., 2005), probable rewards over improbable ones (Fiorillo et al., 2003; Satoh et al., 2003; Morris et al., 2004), and immediate rewards over delayed ones (Roesch et al., 2007; Fiorillo et al., 2008; Kobayashi and Schultz, 2008). There is even evidence that DA neurons in humans encode the reward value of money (Zaghloul et al., 2009). Furthermore, DA signals emerge during learning with a similar timecourse to behavioral measures of reward prediction (Hollerman and Schultz, 1998; Satoh et al., 2003; Takikawa et al., 2004; Day et al., 2007) and are correlated with subjective measures of reward preference (Morris et al., 2006). These findings have established DA neurons as one of the best understood and most replicated examples of reward coding in the brain. As a result, recent studies have subjected DA neurons to intense scrutiny to discover how they generate reward predictions and how their signals act on downstream structures to control behavior.
Recent advances in understanding DA reward signals come from considering three broad questions: How do DA neurons learn reward predictions? How accurate are their predictions? And just what do they treat as rewarding?
How do DA neurons learn reward predictions? Classic theories suggest that reward predictions are learned through a gradual reinforcement process requiring repeated stimulus-reward pairings (Rescorla and Wagner, 1972; Montague et al., 1996). Each time stimulus A is followed by an unexpected reward, the estimated value of A is increased. Recent data, however, shows that DA neurons go beyond simple stimulus-reward learning and make predictions based on sophisticated beliefs about the structure of the world. DA neurons can predict rewards correctly even in unconventional environments where rewards paired with a stimulus cause a decrease in the value of that stimulus (Satoh et al., 2003; Nakahara et al., 2004; Bromberg-Martin et al., 2010c) or cause a change in the value of an entirely different stimulus (Bromberg-Martin et al., 2010b). DA neurons can also adapt their reward signals based on higher-order statistics of the reward distribution, such as scaling prediction error signals based on their expected variance (Tobler et al., 2005) and ‘spontaneously recovering’ their responses to extinguished reward cues (Pan et al., 2008). All of these phenomena form a remarkable parallel to similar effects seen in sensory and motor adaptation (Braun et al., 2010; Fairhall et al., 2001; Shadmehr et al., 2010), suggesting that they may reflect a general neural mechanism for predictive learning.
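For reference, the classic gradual learning process mentioned at the start of this paragraph can be sketched as a single-stimulus error-correcting update in the spirit of Rescorla and Wagner (1972). This is a simplified illustration: the function name `rescorla_wagner` and the learning rate of 0.1 are assumptions, and the sketch deliberately omits the more sophisticated, structure-based predictions discussed above.

```python
def rescorla_wagner(rewards, alpha=0.1, v0=0.0):
    """Track the estimated value of stimulus A across repeated pairings.

    On each trial the estimate moves a fraction alpha of the way toward
    the reward actually received (a hypothetical learning rate).
    """
    v = v0
    history = []
    for r in rewards:
        v += alpha * (r - v)   # unexpected reward (r > v) raises the estimate
        history.append(v)
    return history


# Fifty pairings of stimulus A with a reward of size 1.
history = rescorla_wagner([1.0] * 50)
```

Each unexpected reward raises the estimate by a shrinking amount, so the value of A approaches the reward size only gradually. A rule of this kind, taken alone, cannot produce the cases cited above in which a reward lowers the value of its paired stimulus or changes the value of a different stimulus.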
How accurate are DA reward predictions? Recent studies have shown that DA neurons faithfully adjust their reward signals to account for three sources of prediction uncertainty. First, humans and animals suffer from internal timing noise that prevents them from making reliable predictions about long cue-reward time intervals (Gallistel and Gibbon, 2000). Thus, if cue-reward delays are short (1–2 seconds) timing predictions are accurate and reward delivery triggers little DA response, but for longer cue-reward delays timing predictions become less reliable and rewards evoke clear DA bursts (Kobayashi and Schultz, 2008; Fiorillo et al., 2008). Second, many cues in everyday life are imprecise, specifying a broad distribution of reward delivery times. DA neurons again reflect this form of timing uncertainty: they are progressively inhibited during variable reward delays, as though signaling increasingly negative reward prediction errors at each moment the reward fails to appear (Fiorillo et al., 2008; Bromberg-Martin et al., 2010a; Nomoto et al., 2010). Finally, many cues are perceptually complex, requiring detailed inspection to reach a firm conclusion about their reward value. In such situations DA reward signals occur at long latencies and in a gradual fashion, appearing to reflect the gradual flow of perceptual information as the stimulus value is decoded (Nomoto et al., 2010).
Just what events do DA neurons treat as rewarding? Conventional theories of reward learning suggest that DA neurons assign value based on the expected amount of future primary reward (Montague et al., 1996). Yet even when the rate of primary reward is held constant, humans and animals often express an additional preference for predictability – seeking environments where each reward’s size, probability, and timing can be known in advance (Daly, 1992; Chew and Ho, 1994; Ahlbrecht and Weber, 1996). A recent study in monkeys found that DA neurons signal this preference (Bromberg-Martin and Hikosaka, 2009). Monkeys expressed a strong preference to view informative visual cues that would allow them to predict the size of a future reward, rather than uninformative cues that provided no new information. In parallel, DA neurons were excited by the opportunity to view the informative cues in a manner that was correlated with the animal’s behavioral preference (Figure 1B,D). This suggests that DA neurons not only motivate actions to gain rewards but also motivate actions to make accurate predictions about those rewards, in order to ensure that rewards can be properly anticipated and prepared for in advance.
Taken together, these findings show that DA reward prediction error signals are sensitive to sophisticated factors that inform human and animal reward predictions, including adaptation to high-order reward statistics, reward uncertainty, and preferences for predictive information.
DA reward responses occur in synchronous phasic bursts (Joshua et al., 2009b), a response pattern that shapes DA release in target structures (Gonon, 1988; Zhang et al., 2009; Tsai et al., 2009). It has long been theorized that these phasic bursts influence learning and motivation in a distinct manner from tonic DA activity (Grace, 1991; Grace et al., 2007; Schultz, 2007; Lapish et al., 2007). Recently developed technology has made it possible to confirm this hypothesis by controlling DA neuron activity with fine spatial and temporal precision. Optogenetic stimulation of VTA DA neurons induces a strong conditioned place preference which only occurs when stimulation is applied in a bursting pattern (Tsai et al., 2009). Conversely, genetic knockout of NMDA receptors from DA neurons, which impairs bursting while leaving tonic activity largely intact, causes a selective impairment in specific forms of reward learning (Zweifel et al., 2009; Parker et al., 2010) (although note that this knockout also impairs DA neuron synaptic plasticity (Zweifel et al., 2008)). DA bursts may enhance reward learning by reconfiguring local neural circuits. Notably, reward-predictive DA bursts are sent to specific regions of the nucleus accumbens, and these regions have especially high levels of reward-predictive neural activity (Cheer et al., 2007; Owesson-White et al., 2009).
Compared to phasic bursts, less is known about the importance of phasic pauses in spiking activity for negative reward prediction errors. These pauses cause smaller changes in spike rate, are less modulated by reward expectation (Bayer and Glimcher, 2005; Joshua et al., 2009a; Nomoto et al., 2010), and may have smaller effects on learning (Rutledge et al., 2009). However, certain types of negative prediction error learning require the VTA (Takahashi et al., 2009), suggesting that phasic pauses may still be decoded by downstream structures.
Since bursts and pauses cause very different patterns of DA release, they are likely to influence downstream structures through distinct mechanisms. There is recent evidence for this hypothesis in one major target of DA neurons, the dorsal striatum. Dorsal striatum projection neurons come in two types which express different DA receptors. One type expresses D1 receptors and projects to the basal ganglia ‘direct pathway’ to facilitate body movements; the second type expresses D2 receptors and projects to the ‘indirect pathway’ to suppress body movements (Figure 2) (Albin et al., 1989; Gerfen et al., 1990; Kravitz et al., 2010; Hikida et al., 2010). Based on the properties of these pathways and receptors, it has been theorized that DA bursts produce conditions of high DA, activate D1 receptors, and cause the direct pathway to select high-value movements (Figure 2A), whereas DA pauses produce conditions of low DA, inhibit D2 receptors, and cause the indirect pathway to suppress low-value movements (Figure 2B) (Frank, 2005; Hikosaka, 2007). Consistent with this hypothesis, high DA receptor activation promotes potentiation of cortico-striatal synapses onto the direct pathway (Shen et al., 2008) and learning from positive outcomes (Frank et al., 2004; Voon et al., 2010), while striatal D1 receptor blockade selectively impairs movements to rewarded targets (Nakamura and Hikosaka, 2006). In an analogous manner, low DA receptor activation promotes potentiation of cortico-striatal synapses onto the indirect pathway (Shen et al., 2008) and learning from negative outcomes (Frank et al., 2004; Voon et al., 2010), while striatal D2 receptor blockade selectively suppresses movements to non-rewarded targets (Nakamura and Hikosaka, 2006). 
This division of D1 and D2 receptor functions in motivational control explains many of the effects of DA-related genes on human behavior (Ullsperger, 2010; Frank and Fossella, 2010) and may extend beyond the dorsal striatum, as there is evidence for a similar division of labor in the ventral striatum (Grace et al., 2007; Lobo et al., 2010).
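The theorized division of labor between bursts and pauses can be illustrated with a toy learner that keeps separate direct-pathway ('go') and indirect-pathway ('no-go') weights. This sketch is loosely inspired by the framework of Frank (2005) but is not that model: the names `train_go_nogo`, `go`, and `nogo`, the learning rate, and the initial prediction of 0.5 are all assumptions for illustration.

```python
def train_go_nogo(outcomes, n_trials=100, alpha=0.1, initial_prediction=0.5):
    """Learn separate 'go' and 'no-go' weights from prediction errors.

    outcomes maps each action to the reward it yields (1.0 or 0.0).
    Positive errors (DA bursts) strengthen the direct-pathway weight;
    negative errors (DA pauses) strengthen the indirect-pathway weight.
    """
    go = {a: 0.0 for a in outcomes}
    nogo = {a: 0.0 for a in outcomes}
    prediction = {a: initial_prediction for a in outcomes}
    for _ in range(n_trials):
        for action, reward in outcomes.items():
            delta = reward - prediction[action]
            prediction[action] += alpha * delta
            if delta > 0:
                go[action] += alpha * delta        # burst: facilitate the movement
            elif delta < 0:
                nogo[action] += alpha * -delta     # pause: suppress the movement
    return go, nogo


go, nogo = train_go_nogo({"rewarded_target": 1.0, "unrewarded_target": 0.0})
# Net tendency to move toward each target.
net = {a: go[a] - nogo[a] for a in go}
```

After training, the rewarded target has a positive net weight driven entirely by the 'go' pathway, and the unrewarded target has a negative net weight driven entirely by the 'no-go' pathway.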
While the above scheme paints a simple picture of phasic DA control of behavior through its effects on the striatum, the full picture is much more complex. DA influences reward-related behavior by acting on many brain regions including the prefrontal cortex (Hitchcott et al., 2007), rhinal cortex (Liu et al., 2004), hippocampus (Packard and White, 1991; Grecksch and Matties, 1981) and amygdala (Phillips et al., 2010). The effects of DA are likely to differ widely between these regions due to variations in the density of DA innervation, DA transporters, metabolic enzymes, autoreceptors, receptors, and receptor coupling to intracellular signaling pathways (Neve et al., 2004; Bentivoglio and Morelli, 2005; Frank and Fossella, 2010). Furthermore, at least in the VTA, DA neurons can have different cellular properties depending on their projection targets (Lammel et al., 2008; Margolis et al., 2008), and some have the remarkable ability to transmit glutamate as well as dopamine (Descarries et al., 2008; Chuhma et al., 2009; Hnasko et al., 2010; Tecuapetla et al., 2010; Stuber et al., 2010; Birgner et al., 2010). Thus, the full extent of DA neuron control over neural processing is only beginning to be revealed.
Thus far we have discussed the role of DA neurons in reward-related behavior, founded upon dopamine responses resembling reward prediction errors. It has become increasingly clear, however, that DA neurons phasically respond to several types of events that are not intrinsically rewarding and are not cues to future rewards, and that these non-reward signals have an important role in motivational processing. These non-reward events can be grouped into two broad categories, aversive and alerting, which we will discuss in detail below. Aversive events include intrinsically undesirable stimuli (such as air puffs, bitter tastes, electrical shocks, and other unpleasant sensations) and sensory cues that have gained aversive properties through association with these events. Alerting events are unexpected sensory cues of high potential importance, which generally trigger immediate reactions to determine their meaning.
A neuron’s response to aversive events provides a crucial test of its functions in motivational control (Schultz, 1998; Berridge and Robinson, 1998; Redgrave et al., 1999; Horvitz, 2000; Joseph et al., 2003). In many respects we treat rewarding and aversive events in opposite manners, reflecting their opposite motivational value. We seek rewards and assign them positive value, while we avoid aversive events and assign them negative value. In other respects we treat rewarding and aversive events in similar manners, reflecting their similar motivational salience [FOOTNOTE1]. Both rewarding and aversive events trigger orienting of attention, cognitive processing, and increases in general motivation.
Which of these functions do DA neurons support? It has long been known that stressful and aversive experiences cause large changes in DA concentrations in downstream brain structures, and that behavioral reactions to these experiences are dramatically altered by DA agonists, antagonists, and lesions (Salamone, 1994; Di Chiara, 2002; Pezze and Feldon, 2004; Young et al., 2005). These studies have produced a striking diversity of results, however (Levita et al., 2002; Di Chiara, 2002; Young et al., 2005). Many studies are consistent with DA neurons encoding motivational salience. They report that aversive events increase DA levels and that behavioral aversion is supported by high levels of DA transmission (Salamone, 1994; Joseph et al., 2003; Ventura et al., 2007; Barr et al., 2009; Fadok et al., 2009) including phasic DA bursts (Zweifel et al., 2009). But other studies are more consistent with DA neurons encoding motivational value. They report that aversive events reduce DA levels and that behavioral aversion is supported by low levels of DA transmission (Mark et al., 1991; Shippenberg et al., 1991; Liu et al., 2008; Roitman et al., 2008). In many cases these mixed results have been found in single studies, indicating that aversive experiences cause different patterns of DA release in different brain structures (Thierry et al., 1976; Besson and Louilot, 1995; Ventura et al., 2001; Jeanblanc et al., 2002; Bassareo et al., 2002; Pascucci et al., 2007), and that DA-related drugs can produce a mixture of neural and behavioral effects similar to those caused by both rewarding and aversive experiences (Ettenberg, 2004; Wheeler et al., 2008).
This diversity of DA release patterns and functions is difficult to reconcile with the idea that DA neurons transmit a uniform motivational signal to all brain structures. These diverse responses could be explained, however, if DA neurons are themselves diverse – composed of multiple neural populations that support different aspects of aversive processing. This view is supported by neural recording studies in anesthetized animals. These studies have shown that noxious stimuli evoke excitation in some DA neurons but inhibition in other DA neurons (Chiodo et al., 1980; Maeda and Mogenson, 1982; Schultz and Romo, 1987; Mantz et al., 1989; Gao et al., 1990; Coizet et al., 2006). Importantly, both excitatory and inhibitory responses occur in neurons confirmed to be dopaminergic using juxtacellular labeling (Brischoux et al., 2009) (Figure 3). A similar diversity of aversive responses occurs during active behavior. Different groups of DA neurons are phasically excited or inhibited by aversive events including noxious stimulation of the skin (Kiyatkin, 1988a; Kiyatkin, 1988b), sensory cues predicting aversive shocks (Guarraci and Kapp, 1999), aversive airpuffs (Matsumoto and Hikosaka, 2009b), and sensory cues predicting aversive airpuffs (Matsumoto and Hikosaka, 2009b; Joshua et al., 2009a). Furthermore, when two DA neurons are recorded simultaneously, their aversive responses generally have little trial-to-trial correlation with each other (Joshua et al., 2009b), suggesting that aversive responses are not coordinated across the DA population as a whole.
To understand the functions of these diverse aversive responses, we need to know how they are combined with reward responses to generate a meaningful motivational signal. A recent study investigated this topic and revealed that DA neurons are divided into multiple populations with distinct motivational signals (Matsumoto and Hikosaka, 2009b). One population is excited by rewarding events and inhibited by aversive events, as though encoding motivational value (Figure 4A). A second population is excited by both rewarding and aversive events in similar manners, as though encoding motivational salience (Figure 4B). In both of these populations many neurons are sensitive to reward and aversive predictions: they respond when rewarding events are more rewarding than predicted and when aversive events are more aversive than predicted (Matsumoto and Hikosaka, 2009b). This shows that their aversive responses are truly caused by predictions about aversive events, ruling out the possibility that they could be caused by non-specific factors such as raw sensory input or generalized associations with reward (Schultz, 2010). These two populations differ, however, in the detailed nature of their predictive code. Motivational value coding DA neurons encode an accurate prediction error signal, including strong inhibition by omission of rewards and mild excitation by omission of aversive events (Figure 4A, right). In contrast, motivational salience coding DA neurons respond when salient events are present but not when they are absent (Figure 4B, right), consistent with theoretical notions of arousal (Lang and Davis, 2006) [FOOTNOTE2]. Evidence for these two DA neuron populations has been observed even when neural activity has been examined in an averaged manner. 
Thus, studies targeting different parts of the DA system found phasic DA signals encoding aversive events with inhibition (Roitman et al., 2008), similar to coding of motivational value, or with excitation (Joshua et al., 2008; Anstrom et al., 2009), similar to coding of motivational salience.
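The difference between the two predictive codes can be summarized in a short sketch. The formulas below are illustrative assumptions rather than fitted models: outcomes are coded as +1 (reward), -1 (aversive event), or 0 (nothing), `value_coding` is a signed prediction error, and `salience_coding` responds only when an event is more salient than predicted. The sketch ignores response magnitudes, such as the weaker excitation reported for omitted aversive events.

```python
def value_coding(outcome, predicted):
    """Signed prediction error: excited by better-than-predicted outcomes,
    inhibited by worse-than-predicted outcomes (hypothetical formula)."""
    return outcome - predicted


def salience_coding(outcome, predicted):
    """Responds when an event is present and more salient than predicted,
    regardless of its sign; silent when the event is absent (hypothetical)."""
    return max(0.0, abs(outcome) - abs(predicted))


events = {
    "unexpected reward": (1.0, 0.0),
    "unexpected airpuff": (-1.0, 0.0),
    "reward omitted": (0.0, 1.0),
    "airpuff omitted": (0.0, -1.0),
}
# Map each event to (value-coding response, salience-coding response).
responses = {
    name: (value_coding(o, p), salience_coding(o, p))
    for name, (o, p) in events.items()
}
```

The two codes agree for unexpected rewards, disagree in sign for unexpected aversive events, and differ for omissions, where only the value code responds.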
These recent findings might appear to contradict an early report that DA neurons respond preferentially to reward cues rather than aversive cues (Mirenowicz and Schultz, 1996). When examined closely, however, even that study is fully consistent with DA value and salience coding. In that study reward cues led to reward outcomes with high probability (>90%) while aversive cues led to aversive outcomes with low probability (<10%). Hence value and salience-coding DA neurons would have little response to the aversive cues, accurately encoding their low level of aversiveness.
Taken together, the above findings indicate that DA neurons are divided into multiple populations suitable for distinct roles in motivational control. Motivational value coding DA neurons fit well with current theories of dopamine neurons and reward processing (Schultz et al., 1997; Berridge and Robinson, 1998; Wise, 2004). These neurons encode a complete prediction error signal and encode rewarding and aversive events in opposite directions. Thus these neurons provide an appropriate instructive signal for seeking, evaluation, and value learning (Figure 5). If a stimulus causes value coding DA neurons to be excited then we should approach it, assign it high value, and learn actions to seek it again in the future. If a stimulus causes value coding DA neurons to be inhibited then we should avoid it, assign it low value, and learn actions to avoid it again in the future.
In contrast, motivational salience coding DA neurons fit well with theories of dopamine neurons and processing of salient events (Redgrave et al., 1999; Horvitz, 2000; Joseph et al., 2003; Kapur, 2003). These neurons are excited by both rewarding and aversive events and have weaker responses to neutral events, providing an appropriate instructive signal for neural circuitry to learn to detect, predict, and respond to situations of high importance. Here we will consider three such brain systems (Figure 5). First, neural circuits for visual and attentional orienting are calibrated to discover information about all types of events, both rewarding and aversive. For instance, both reward and aversive cues attract orienting reactions more effectively than neutral cues (Lang and Davis, 2006; Matsumoto and Hikosaka, 2009b; Austin and Duka, 2010). Second, both rewarding and aversive situations engage neural systems for cognitive control and action selection: we need to engage working memory to hold information in mind, conflict resolution to decide upon a course of action, and long-term memory to remember the resulting outcome (Bradley et al., 1992; Botvinick et al., 2001; Savine et al., 2010). Third, both rewarding and aversive situations require an increase in general motivation to energize actions and to ensure that they are executed properly. Indeed, DA neurons are critical in motivating effort to achieve high-value goals and in translating knowledge of task demands into reliable motor performance (Berridge and Robinson, 1998; Mazzoni et al., 2007; Niv et al., 2007; Salamone et al., 2007).
In addition to their signals encoding motivational value and salience, the majority of DA neurons also have burst responses to several types of sensory events that are not directly associated with rewarding or aversive experiences. These responses have been theorized to depend on a number of neural and psychological factors, including direct sensory input, surprise, novelty, arousal, attention, salience, generalization, and pseudo-conditioning (Schultz, 1998; Redgrave et al., 1999; Horvitz, 2000; Lisman and Grace, 2005; Redgrave and Gurney, 2006; Joshua et al., 2009a; Schultz, 2010).
Here we will attempt to synthesize these ideas and account for these DA responses in terms of a single underlying signal, an alerting signal (Figure 5). The term ‘alerting’ was used by Schultz (Schultz, 1998) as a general term for events that attract attention. Here we will use it in a more specific sense. By an alerting event, we mean an unexpected sensory cue that captures attention based on a rapid assessment of its potential importance, using simple features such as its location, size, and sensory modality. Such alerting events often trigger immediate behavioral reactions to investigate them and determine their precise meaning. Thus DA alerting signals typically occur at short latencies, are based on the rough features of a stimulus, and are best correlated with immediate reactions such as orienting reactions (Schultz and Romo, 1990; Joshua et al., 2009a; Schultz, 2010). This is in contrast to other motivational signals in DA neurons which typically occur at longer latencies, take into account the precise identity of the stimulus, and are best correlated with considered behavioral actions such as decisions to approach or avoid (Schultz and Romo, 1990; Joshua et al., 2009a; Schultz, 2010).
DA alerting responses can be triggered by surprising sensory events such as unexpected light flashes and auditory clicks, which evoke prominent burst excitations in 60–90% of DA neurons throughout the SNc and VTA (Strecker and Jacobs, 1985; Horvitz et al., 1997; Horvitz, 2000) (Figure 6A). These alerting responses seem to reflect the degree to which the stimulus is surprising and captures attention; they are reduced if a stimulus occurs at predictable times, if attention is engaged elsewhere, or during sleep (Schultz, 1998; Takikawa et al., 2004; Strecker and Jacobs, 1985; Steinfels et al., 1983). For instance, an unexpected clicking sound evokes a prominent DA burst when a cat is in a passive state of quiet waking, but has no effect when the cat is engaged in attention-demanding activities such as hunting a rat, feeding, grooming, being petted by the experimenter, and so on (Strecker and Jacobs, 1985) (Figure 6A). Similarly, DA burst responses are triggered by sensory events that are physically weak but are alerting due to their novelty (Ljungberg et al., 1992; Schultz, 1998). These responses habituate as the novel stimulus becomes familiar, in parallel with the habituation of orienting reactions (Figure 6B). Consistent with these findings, surprising and novel events evoke DA release in downstream structures (Lisman and Grace, 2005) and activate DA-related brain circuits in a manner that shapes reward processing (Zink et al., 2003; Davidson et al., 2004; Duzel et al., 2010).
DA alerting responses are also triggered by unexpected sensory cues that have the potential to provide new information about motivationally salient events. As expected for a short-latency alerting signal, these responses are rather non-selective: they are triggered by any stimulus that merely resembles a motivationally salient cue, even if the resemblance is very slight (a phenomenon called generalization) (Schultz, 1998). As a result, DA neurons often respond to a stimulus with a mixture of two signals: a fast alerting signal encoding the fact that the stimulus is potentially important, and a second signal encoding its actual rewarding or aversive meaning (Schultz and Romo, 1990; Waelti et al., 2001; Tobler et al., 2003; Day et al., 2007; Kobayashi and Schultz, 2008; Fiorillo et al., 2008; Nomoto et al., 2010) (see (Kakade and Dayan, 2002; Joshua et al., 2009a; Schultz, 2010) for review). An example can be seen in a set of motivational salience coding DA neurons shown in Figure 6C (Bromberg-Martin et al., 2010a). These neurons were excited by reward and aversive cues, but they were also excited by a neutral cue. The neutral cue had never been paired with motivational outcomes, but did have a (very slight) physical resemblance to the reward and aversive cues.
These alerting responses seem closely tied to a sensory cue’s ability to trigger orienting reactions to examine it further and discover its meaning. This can be seen in three notable properties. First, alerting responses only occur for sensory cues that have to be examined to determine their meaning, not for intrinsically rewarding or aversive events such as delivery of juice or airpuffs (Schultz, 2010). Second, alerting responses only occur when a cue is potentially important and has the ability to trigger orienting reactions, not when the cue is irrelevant to the task at hand and fails to trigger orienting reactions (Schultz and Romo, 1990). Third, alerting responses are enhanced in situations when cues would trigger an abrupt shift of attention – when they appear at an unexpected time or away from the center of gaze (Bromberg-Martin et al., 2010a). Thus when motivational cues are presented with unpredictable timing they trigger immediate orienting reactions and a generalized DA alerting response – excitation by all cues including neutral cues (Figure 6C, black). But if their timing is made predictable – for example, by forewarning the subjects with a “trial start cue” presented one second before the cues appear – the cues no longer evoke an alerting response (Figure 6D, gray). Instead, the alerting response shifts to the trial start cue – the first event of the trial that has unpredictable timing and evokes orienting reactions (Figure 6D, black).
What is the underlying mechanism that generates DA neuron alerting signals? One hypothesis is that alerting responses are simply conventional reward prediction error signals that occur at short latencies, encoding the expected reward value of a stimulus before it has been fully discriminated (Kakade and Dayan, 2002). More recent evidence, however, suggests that alerting signals can be generated by a distinct mechanism from conventional DA reward signals (Satoh et al., 2003; Bayer and Glimcher, 2005; Bromberg-Martin et al., 2010a; Bromberg-Martin et al., 2010c; Nomoto et al., 2010). Most strikingly, the alerting response to the trial start cue is not restricted to rewarding tasks; it can have equal strength during an aversive task in which no rewards are delivered (Figure 6C,D, bottom, “aversive task”). This occurs even though conventional DA reward signals in the same neurons correctly signal that the rewarding task has a much higher expected value than the aversive task (Bromberg-Martin et al., 2010a). These alerting signals are not purely a form of value coding or purely a form of salience coding, because they occur in the majority of both motivational value and salience coding DA neurons (Bromberg-Martin et al., 2010a). A second dissociation can be seen in the way that DA neurons predict future rewards based on the memory of past reward outcomes (Satoh et al., 2003; Bayer and Glimcher, 2005). Whereas conventional DA reward signals are controlled by a long-timescale memory trace optimized for accurate reward prediction, alerting responses to the trial start cue are controlled by a separate memory trace resembling that seen in immediate orienting reactions (Bromberg-Martin et al., 2010c). A third dissociation can be seen in the way that these signals are distributed across the DA neuron population. Whereas conventional DA reward signals are strongest in the ventromedial SNc, alerting responses to the trial start cue (and to other unexpectedly timed cues) are broadcast throughout the SNc (Nomoto et al., 2010).
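The second dissociation above, between two memory traces of past reward outcomes, can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the exponentially weighted average, the function name `memory_trace`, the decay values, and the outcome sequence are all hypothetical and are not taken from the cited studies. The sketch also assumes that the trace governing alerting responses has a shorter timescale than the reward-prediction trace, which the text above does not state explicitly.

```python
def memory_trace(outcomes, decay):
    """Exponentially weighted average of past outcomes (illustrative model).

    `decay` near 1 gives a long-timescale trace; `decay` near 0 gives a
    trace dominated by the most recent trial. Both values are hypothetical.
    """
    trace = 0.0
    for outcome in outcomes:
        trace = decay * trace + (1.0 - decay) * outcome
    return trace


# Thirty rewarded trials followed by one unrewarded trial (made-up data).
past_outcomes = [1.0] * 30 + [0.0]

long_trace = memory_trace(past_outcomes, decay=0.9)   # changes slowly
short_trace = memory_trace(past_outcomes, decay=0.2)  # tracks the last trial

print(f"long-timescale trace:  {long_trace:.2f}")
print(f"short-timescale trace: {short_trace:.2f}")
```

After a single unrewarded trial, the long-timescale trace remains close to the long-run reward rate, while the short-timescale trace falls sharply. In this toy setting, two signals driven by the two traces would therefore diverge even though both are computed from the same outcome history.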
In contrast to these dissociations from conventional reward signals, DA alerting signals are correlated with the speed of orienting and approach responses to the alerting event (Satoh et al., 2003; Bromberg-Martin et al., 2010a; Bromberg-Martin et al., 2010c). This suggests that alerting signals are generated by a neural process that motivates fast reactions to investigate potentially important events. At the present time, unfortunately, relatively little is known about precisely what events this process treats as ‘important’. For example, are alerting responses equally sensitive to rewarding and aversive events? Alerting responses are known to occur for stimuli that resemble reward cues or that resemble both reward and aversive cues (e.g. by sharing the same sensory modality). But it is not yet known whether alerting responses occur for stimuli that solely resemble aversive cues.
As we have seen, alerting signals are likely to be generated by a distinct mechanism from motivational value and salience signals. However, alerting signals are sent to both motivational value and salience coding DA neurons, and therefore are likely to regulate brain processing and behavior in a similar manner to value and salience signals (Figure 5).
Alerting signals sent to motivational salience coding DA neurons would support orienting of attention to the alerting stimulus, engagement of cognitive resources to discover its meaning and decide on a plan for action, and increase motivation levels to implement this plan efficiently (Figure 5). These effects could occur through immediate effects on neural processing or by reinforcing actions which led to detection of the alerting event. This functional role fits well with the correlation between DA alerting responses and fast behavioral reactions to the alerting stimulus, and with theories that short-latency DA neuron responses are involved in orienting of attention, arousal, enhancement of cognitive processing, and immediate behavioral reactions (Redgrave et al., 1999; Horvitz, 2000; Joseph et al., 2003; Lisman and Grace, 2005; Redgrave and Gurney, 2006; Joshua et al., 2009a).
The presence of alerting signals in motivational value coding DA neurons is more difficult to explain. These neurons transmit motivational value signals that are ideal for seeking, evaluation of outcomes, and value learning; yet they can also be excited by alerting events such as unexpected clicking sounds and the onset of aversive trials. According to our hypothesized pathway (Figure 5), this would cause alerting events to be assigned positive value and to be sought after in a manner similar to rewards! While surprising at first glance, there is reason to suspect that alerting events can be treated as positive goals. Alerting signals provide the first warning that a potentially important event is about to occur, and hence provide the first opportunity to take action to control that event. If alerting cues are available, motivationally salient events can be detected, predicted, and prepared for in advance; if alerting cues are absent, motivationally salient events always occur as an unexpected surprise. Indeed, humans and animals often express a preference for environments where rewarding, aversive, and even motivationally neutral sensory events can be observed and predicted in advance (Badia et al., 1979; Herry et al., 2007; Daly, 1992; Chew and Ho, 1994) and many DA neurons signal the behavioral preference to view reward-predictive information (Bromberg-Martin and Hikosaka, 2009). DA alerting signals may support these preferences by assigning positive value to environments where potentially important sensory cues can be anticipated in advance.
Thus far we have divided DA neurons into two types which encode motivational value and motivational salience and are suitable for distinct roles in motivational control (Figure 5). How does this conceptual scheme map onto neural pathways in the brain? Here we propose a hypothesis about the anatomical locations of these neurons, their projections to downstream structures, and the sources of their motivational signals (Figures 6, 7).
A recent study mapped the locations of DA reward and aversive signals in the lateral midbrain including the SNc and lateralmost part of the VTA (Matsumoto and Hikosaka, 2009b). Motivational value and motivational salience signals were distributed across this region in an anatomical gradient. Motivational value signals were found more commonly in neurons in the ventromedial SNc and lateral VTA, while motivational salience signals were found more commonly in neurons in the dorsolateral SNc (Figure 7B). This is consistent with reports that DA reward value coding is strongest in the ventromedial SNc (Nomoto et al., 2010) while aversive excitations tend to be strongest more laterally (Mirenowicz and Schultz, 1996). Other studies have explored the more medial midbrain. These studies found a mixture of excitatory and inhibitory aversive responses with no significant difference in their locations, although with a trend for aversive excitations to be located more ventrally (Guarraci and Kapp, 1999; Brischoux et al., 2009) (Figure 7C).
According to our hypothesis, motivational value coding DA neurons should project to brain regions involved in approach and avoidance actions, evaluation of outcomes, and value learning (Figure 5). Indeed, the ventromedial SNc and VTA project to the ventromedial prefrontal cortex (Williams and Goldman-Rakic, 1998) including the orbitofrontal cortex (OFC) (Porrino and Goldman-Rakic, 1982) (Figure 7A). The OFC has been consistently implicated in value coding in functional imaging studies (Anderson et al., 2003; Small et al., 2003; Jensen et al., 2007; Litt et al., 2010) and single neuron recordings (Morrison and Salzman, 2009; Roesch and Olson, 2004). The OFC is thought to evaluate choice options (Padoa-Schioppa, 2007; Kable and Glimcher, 2009), encode outcome expectations (Schoenbaum et al., 2009), and update these expectations during learning (Walton et al., 2010). Furthermore, the OFC is involved in learning from negative reward prediction errors (Takahashi et al., 2009) which are strongest in value-coding DA neurons (Figure 4).
In addition, the medial portions of the dopaminergic midbrain project to the ventral striatum including the nucleus accumbens shell (NAc shell) (Haber et al., 2000) (Figure 7A). A recent study demonstrated that the NAc shell receives phasic DA signals encoding the motivational value of taste outcomes (Roitman et al., 2008). These signals are likely to cause value learning because direct infusion of DA drugs into the NAc shell is strongly reinforcing (Ikemoto, 2010) while treatments that reduce DA input to the shell can induce aversions (Liu et al., 2008). One caveat is that studies of NAc shell DA release over long timescales (minutes) have produced mixed results, some consistent with value coding and others with salience coding (e.g. (Bassareo et al., 2002; Ventura et al., 2007)). This suggests that value signals may be restricted to specific locations within the NAc shell. Notably, different regions of the NAc shell are specialized for controlling appetitive and aversive behavior (Reynolds and Berridge, 2002), which both require input from DA neurons (Faure et al., 2008).
Finally, DA neurons throughout the extent of the SNc send heavy projections to the dorsal striatum (Haber et al., 2000), suggesting that the dorsal striatum may receive both motivational value and salience coding DA signals (Figure 7A). Motivational value coding DA neurons would provide an ideal instructive signal for striatal circuitry involved in value learning, such as learning of stimulus-response habits (Faure et al., 2005; Yin and Knowlton, 2006; Balleine and O'Doherty, 2010). When these DA neurons burst, they would engage the direct pathway to learn to gain reward outcomes; when they pause, they would engage the indirect pathway to learn to avoid aversive outcomes (Figure 2). Indeed, there is recent evidence that the striatal pathways follow exactly this division of labor for reward and aversive processing (Hikida et al., 2010). It is still unknown, however, how neurons in these pathways respond to rewarding and aversive events during behavior. At least in the dorsal striatum as a whole, a subset of neurons respond to certain rewarding and aversive events in distinct manners (Ravel et al., 2003; Yamada et al., 2004, 2007; Joshua et al., 2008).
According to our hypothesis, motivational salience coding DA neurons should project to brain regions involved in orienting, cognitive processing, and general motivation (Figure 5). Indeed, DA neurons in the dorsolateral midbrain send projections to dorsal and lateral frontal cortex (Williams and Goldman-Rakic, 1998) (Figure 7A), a region which has been implicated in cognitive functions such as attentional search, working memory, cognitive control, and decision making between motivational outcomes (Williams and Castner, 2006; Lee and Seo, 2007; Wise, 2008; Kable and Glimcher, 2009; Wallis and Kennerley, 2010). Dorsolateral prefrontal cognitive functions are tightly regulated by DA levels (Robbins and Arnsten, 2009) and are theorized to depend on phasic DA neuron activation (Cohen et al., 2002; Lapish et al., 2007). Notably, a subset of lateral prefrontal neurons respond to both rewarding and aversive visual cues, and the great majority respond in the same direction resembling coding of motivational salience (Kobayashi et al., 2006). Furthermore, the activity of these neurons is correlated with behavioral success at performing working memory tasks (Kobayashi et al., 2006). Although this dorsolateral DA→dorsolateral frontal cortex pathway appears to be specific to primates (Williams and Goldman-Rakic, 1998), a functionally similar pathway may exist in other species. In particular, many of the cognitive functions of the primate dorsolateral prefrontal cortex are performed by the rodent medial prefrontal cortex (Uylings et al., 2003), and there is evidence that this region receives DA motivational salience signals and controls salience-related behavior (Mantz et al., 1989; Di Chiara, 2002; Joseph et al., 2003; Ventura et al., 2007; Ventura et al., 2008).
Given the evidence that the VTA contains both salience and value coding neurons and that value coding signals are sent to the NAc shell, salience signals might be sent to the NAc core (Figure 7A). Indeed, the NAc core (but not shell) is crucial for enabling motivation to overcome response costs such as physical effort; for performance of set-shifting tasks requiring cognitive flexibility; and for enabling reward cues to cause an enhancement of general motivation (Ghods-Sharifi and Floresco, 2010; Floresco et al., 2006; Hall et al., 2001; Cardinal, 2006). Consistent with coding of motivational salience, the NAc core receives phasic bursts of DA during both rewarding experiences (Day et al., 2007) and aversive experiences (Anstrom et al., 2009).
Finally, as discussed above, some salience coding DA neurons may project to the dorsal striatum (Figure 7A). While some regions of the dorsal striatum are involved in functions related to learning action values, the dorsal striatum is also involved in functions that should be engaged for all salient events, such as orienting, attention, working memory, and general motivation (Hikosaka et al., 2000; Klingberg, 2010; Palmiter, 2008). Indeed, a subset of dorsal striatal neurons are more strongly responsive to rewarding and aversive events than to neutral events (Ravel et al., 1999; Blazquez et al., 2002; Yamada et al., 2004, 2007), although their causal role in motivated behavior is not yet known.
A recent series of studies suggests that DA neurons receive motivational value signals from a small nucleus in the epithalamus, the lateral habenula (LHb) (Hikosaka, 2010) (Figure 8). The LHb exerts potent negative control over DA neurons: LHb stimulation inhibits DA neurons at short latencies (Christoph et al., 1986) and can regulate learning in an opposite manner to VTA stimulation (Shumake et al., 2010). Consistent with a negative control signal, many LHb neurons have mirror-inverted phasic responses to DA neurons: LHb neurons are inhibited by positive reward prediction errors and excited by negative reward prediction errors (Matsumoto and Hikosaka, 2007, 2009a; Bromberg-Martin et al., 2010a; Bromberg-Martin et al., 2010c). In several cases these signals occur at shorter latencies in the LHb, consistent with LHb → DA transmission (Matsumoto and Hikosaka, 2007; Bromberg-Martin et al., 2010a).
The LHb is capable of controlling DA neurons throughout the midbrain, but several lines of evidence suggest that it exerts preferential control over motivational value coding DA neurons. First, LHb neurons encode motivational value in a manner closely mirroring value-coding DA neurons – they encode both positive and negative reward prediction errors and respond in opposite directions to rewarding and aversive events (Matsumoto and Hikosaka, 2009a; Bromberg-Martin et al., 2010a). Second, LHb stimulation has its most potent effects on DA neurons whose properties are consistent with value coding, including inhibition by no-reward cues and anatomical location in the ventromedial SNc (Matsumoto and Hikosaka, 2007, 2009b). Third, lesions to the LHb impair DA neuron inhibitory responses to aversive events, suggesting a causal role for the LHb in generating DA value signals (Gao et al., 1990).
The LHb is part of a more extensive neural pathway by which DA neurons can be controlled by the basal ganglia (Figure 8). The LHb receives signals resembling reward prediction errors through a projection from a population of neurons located around the globus pallidus border (GPb) (Hong and Hikosaka, 2008). Once these signals reach the LHb they are likely to be sent to DA neurons through a disynaptic pathway in which the LHb excites midbrain GABA neurons that in turn inhibit DA neurons (Ji and Shepard, 2007; Omelchenko et al., 2009; Brinschwitz et al., 2010). This could occur through LHb projections to interneurons in the VTA and to an adjacent GABA-ergic nucleus called the rostromedial tegmental nucleus (RMTg) (Jhou et al., 2009b) (also called the ‘caudal tail of VTA’ (Kaufling et al., 2009)). Notably, RMTg neurons have response properties similar to LHb neurons, encode motivational value, and have a heavy inhibitory projection to dopaminergic midbrain (Jhou et al., 2009a). Thus, the complete basal ganglia pathway to send motivational value signals to DA neurons may be GPb→LHb→RMTg→DA (Hikosaka, 2010).
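The sign logic of the LHb→RMTg→DA portion of this pathway can be sketched as a chain of excitatory and inhibitory links. This is a deliberately simplified illustration, not a model from the cited work: the unit-gain linear responses, the names `PATHWAY`, `lhb_response`, and `da_response`, and the numeric inputs are assumptions. The GPb→LHb link is left out because the text above does not specify its sign.

```python
# Connection signs in the hypothesized chain: +1 excitatory, -1 inhibitory.
PATHWAY = [
    ("LHb", "RMTg", +1),  # LHb excites midbrain GABA neurons
    ("RMTg", "DA", -1),   # RMTg GABA neurons inhibit DA neurons
]


def lhb_response(reward_prediction_error):
    """Toy LHb neuron: inhibited by positive errors, excited by negative ones."""
    return -reward_prediction_error


def da_response(reward_prediction_error):
    """Propagate the LHb response through each link (unit gain is assumed)."""
    signal = lhb_response(reward_prediction_error)
    for _source, _target, sign in PATHWAY:
        signal *= sign
    return signal


for rpe in (1.0, -0.5):
    print(f"RPE {rpe:+.1f}: LHb {lhb_response(rpe):+.1f}, DA {da_response(rpe):+.1f}")
```

Because the chain contains exactly one inhibitory link, the DA response has the opposite sign to the LHb response. That matches the mirror-inverted relationship between LHb and DA neurons described in the text.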
An important question for future research is whether motivational value signals are channeled solely through the LHb or whether they are carried by multiple input pathways. Notably, DA inhibitions by aversive footshocks are controlled by activity in the mesopontine parabrachial nucleus (PBN) (Coizet et al., 2010) (Figure 8). This nucleus contains neurons that receive direct input from the spinal cord encoding noxious sensations and could inhibit DA neurons through excitatory projections to the RMTg (Coizet et al., 2010; Gauriau and Bernard, 2002). This suggests that the LHb sends DA neurons motivational value signals for both rewarding and aversive cues and outcomes while the PBN provides a component of the value signal specifically related to aversive outcomes.
Less is known about the source of motivational salience signals in DA neurons. One intriguing candidate is the central nucleus of the amygdala (CeA) which has been consistently implicated in orienting, attention, and general motivational responses during both rewarding and aversive events (Holland and Gallagher, 1999; Baxter and Murray, 2002; Merali et al., 2003; Balleine and Killcross, 2006) (Figure 8). The CeA and other amygdala nuclei contain many neurons whose signals are consistent with motivational salience: they signal rewarding and aversive events in the same direction, are enhanced when events occur unexpectedly, and are correlated with behavioral measures of arousal (Nishijo et al., 1988; Belova et al., 2007; Shabel and Janak, 2009). These signals may be sent to DA neurons because the CeA has descending projections to the brainstem that carry rewarding and aversive information (Lee et al., 2005; Pascoe and Kapp, 1985) and the CeA is necessary for DA release during reward-related events (Phillips et al., 2003a). Furthermore, the CeA participates with DA neurons in pathways consistent with our proposed anatomical and functional networks for motivational salience. A pathway including the CeA, SNc, and dorsal striatum is necessary for learned orienting to food cues (Han et al., 1997; Lee et al., 2005; El-Amamy and Holland, 2007). Consistent with our division of salience vs. value signals, this pathway is needed for learning to orient to food cues but not for learning to approach food outcomes (Han et al., 1997). A second pathway, including the CeA, SNc, VTA, and NAc core, is necessary for reward cues to cause an increase in general motivation to perform reward-seeking actions (Hall et al., 2001; Corbit and Balleine, 2005; El-Amamy and Holland, 2007).
In addition to the CeA, DA neurons could receive motivational salience signals from other sources such as salience-coding neurons in the basal forebrain (Lin and Nicolelis, 2008; Richardson and DeLong, 1991) and neurons in the PBN (Coizet et al., 2010), although these pathways remain to be investigated.
There are several good candidates for providing DA neurons with alerting signals. Perhaps the most attractive candidate is the superior colliculus (SC), a midbrain nucleus that receives short-latency sensory input from multiple sensory modalities and controls orienting reactions and attention (Redgrave and Gurney, 2006) (Figure 8). The SC has a direct projection to the SNc and VTA (May et al., 2009; Comoli et al., 2003). In anesthetized animals the SC is a vital conduit for short-latency visual signals to reach DA neurons and trigger DA release in downstream structures (Comoli et al., 2003; Dommett et al., 2005). The SC-DA pathway is best suited to convey alerting signals rather than reward and aversion signals, as SC neurons have little response to reward delivery and have only a mild influence over DA aversive responses (Coizet et al., 2006). This suggests a sequence of events in which SC neurons (1) detect a stimulus, (2) select it as potentially important, (3) trigger an orienting reaction to examine the stimulus, and (4) simultaneously trigger a DA alerting response which causes a burst of DA in downstream structures (Redgrave and Gurney, 2006).
A second candidate for sending alerting signals to DA neurons is the LHb (Figure 8). Notably, the unexpected onset of a trial start cue inhibits many LHb neurons in an inverse manner to the DA neuron alerting signal, and this response occurs at shorter latency in the LHb consistent with a LHb→DA direction of transmission (Bromberg-Martin et al., 2010a; Bromberg-Martin et al., 2010c). We have also anecdotally observed that LHb neurons are commonly inhibited by unexpected visual images and sounds in an inverse manner to DA excitations (M.M., E.S.B.-M., and O.H., unpublished observations) although this awaits a more systematic investigation.
Finally, a third candidate for sending alerting signals to DA neurons is the pedunculopontine tegmental nucleus (PPTg), which projects to both the SNc and VTA and is involved in motivational processing (Winn, 2006) (Figure 8). The PPTg is important for enabling VTA DA neuron bursts (Grace et al., 2007) including burst responses to reward cues (Pan and Hyland, 2005). Consistent with an alerting signal, PPTg neurons have short-latency responses to multiple sensory modalities and are active during orienting reactions (Winn, 2006). There is evidence that PPTg sensory responses are influenced by reward value and by requirements for immediate action (Dormont et al., 1998; Okada et al., 2009) (but see (Pan and Hyland, 2005)). Some PPTg neurons also respond to rewarding or aversive outcomes themselves (Dormont et al., 1998; Kobayashi et al., 2002; Ivlieva and Timofeeva, 2003b, a). It will be important to test whether the signals the PPTg sends to DA neurons are related specifically to alerting or whether they contain other motivational signals such as value and salience.
We have reviewed the nature of reward, aversive, and alerting signals in DA neurons, and have proposed a hypothesis about the underlying neural pathways and their roles in motivated behavior. We consider this to be a working hypothesis, a guide for future theories and research that will bring us to a more complete understanding. Here we will highlight several areas where further investigation is needed to reveal deeper complexities.
At the present time, our understanding of the neural pathways underlying DA signals is at an early stage. Therefore, we have attempted to infer the sources and destinations of value and salience coding DA signals largely based on indirect measures such as the neural response properties and functional roles of different brain areas. It will be important to put these candidate pathways to a direct test and to discover their detailed properties, aided by recently developed tools that allow DA transmission to be monitored (Robinson et al., 2008) and controlled (Tsai et al., 2009; Tecuapetla et al., 2010; Stuber et al., 2010) with high spatial and temporal precision. As noted above, several of these candidate structures have a topographic organization, suggesting that their communication with DA neurons might be topographic as well. The neural sources of phasic DA signals may also be more complex than the simple feedforward pathways we have proposed, since the neural structures that communicate with DA neurons are densely interconnected (Geisler and Zahm, 2005) and DA neurons can communicate with each other within the midbrain (Ford et al., 2010).
We have focused on a selected set of DA neuron connections, but DA neurons receive functional input from many additional structures including the subthalamic nucleus, laterodorsal tegmental nucleus, bed nucleus of the stria terminalis, prefrontal cortex, ventral pallidum, and lateral hypothalamus (Grace et al., 2007; Shimo and Wichmann, 2009; Jalabert et al., 2009). Notably, lateral hypothalamus orexin neurons project to DA neurons, are activated by rewarding rather than aversive events, and trigger drug-seeking behavior (Harris and Aston-Jones, 2006), suggesting a possible role in value-related functions. DA neurons also send projections to many additional structures including the hypothalamus, hippocampus, amygdala, habenula, and a great many cortical areas. Notably, the anterior cingulate cortex (ACC) has been proposed to receive reward prediction error signals from DA neurons (Holroyd and Coles, 2002) and contains neurons with activity positively related to motivational value (Koyama et al., 1998). Yet ACC activation is also linked to aversive processing (Vogt, 2005; Johansen and Fields, 2004). These ACC functions might be supported by a mixture of DA motivational value and salience signals, which will be important to test in future study. Indeed, neural signals related to reward prediction errors have been reported in several areas including the medial prefrontal cortex (Matsumoto et al., 2007; Seo and Lee, 2007), orbitofrontal cortex (Sul et al., 2010) (but see (Takahashi et al., 2009; Kennerley and Wallis, 2009)), and dorsal striatum (Kim et al., 2009; Oyama et al., 2010), and their causal relationship to DA neuron activity remains to be discovered.
We have described motivational events with a simple dichotomy, classifying them as ‘rewarding’ or ‘aversive’. Yet these categories contain great variety. An aversive illness is gradual, prolonged, and caused by internal events; an aversive airpuff is fast, brief, and caused by the external world. These situations demand very different behavioral responses which are likely to be supported by different neural systems. Furthermore, although we have focused our discussion on two types of DA neurons with signals resembling motivational value and salience, a close examination shows that DA neurons are not limited to this strict dichotomy. As indicated by our notion of an anatomical gradient, some DA neurons transmit mixtures of both salience-like and value-like signals; still other DA neurons respond to rewarding but not aversive events (Matsumoto and Hikosaka, 2009b; Bromberg-Martin et al., 2010a). These considerations suggest that some DA neurons may not encode motivational events along our intuitive axis of ‘good’ vs. ‘bad’ and may instead be specialized to support specific forms of adaptive behavior.
Even in the realm of rewards, there is evidence that DA neurons transmit different reward signals to different brain regions (Bassareo and Di Chiara, 1999; Ito et al., 2000; Stefani and Moghaddam, 2006; Wightman et al., 2007; Aragona et al., 2009). Diverse responses reported in the SNc and VTA include neurons that: respond only to the start of a trial (Roesch et al., 2007), perhaps encoding a pure alerting signal; respond differently to visual and auditory modalities (Strecker and Jacobs, 1985), perhaps receiving input from different SC and PPTg neurons; respond to the first or last event in a sequence (Ravel and Richmond, 2006; Jin and Costa, 2010); have sustained activation by risky rewards (Fiorillo et al., 2003); or are activated during body movements (Schultz, 1986; Kiyatkin, 1988a; Puryear et al., 2010; Jin and Costa, 2010) (see also (Phillips et al., 2003b; Stuber et al., 2005)). While each of these response patterns has only been reported in a minority of studies or neurons, these data suggest that DA neurons could potentially be divided into a much larger number of functionally distinct populations.
A final and important consideration is that present recording studies in behaving animals do not yet provide fully conclusive measurements of DA neuron activity, because these studies have only been able to distinguish between DA and non-DA neurons using indirect methods, based on neural properties such as firing rate, spike waveform, and sensitivity to D2 receptor agonists (Grace and Bunney, 1983; Schultz, 1986). These techniques appear to identify DA neurons reliably within the SNc, indicated by several lines of evidence including comparison of intracellular and extracellular methods, juxtacellular recordings, and the effects of DA-specific lesions (Grace and Bunney, 1983; Grace et al., 2007; Brown et al., 2009). However, recent studies indicate that these techniques may be less reliable in the VTA, where DA and non-DA neurons have a wider variety of cellular properties (Margolis et al., 2006; Margolis et al., 2008; Lammel et al., 2008; Brischoux et al., 2009). Even direct measurements of DA concentrations in downstream structures do not provide conclusive evidence of DA neuron spiking activity, because DA concentrations may be controlled by additional factors such as glutamatergic activation of DA axon terminals (Cheramy et al., 1991) and rapid changes in the activity of DA transporters (Zahniser and Sorkin, 2004). To perform fully conclusive measurements of DA neuron activity during active behavior it will be necessary to use new recording techniques, such as combining extracellular recording with optogenetic stimulation (Jin and Costa, 2010).
An influential concept of midbrain DA neurons has been that they transmit a uniform motivational signal to all downstream structures. Here we have reviewed evidence that DA signals are more diverse than commonly thought. Rather than encoding a uniform signal, DA neurons come in multiple types that send distinct motivational messages about rewarding and non-rewarding events. Even single DA neurons do not appear to transmit single motivational signals. Instead, DA neurons transmit mixtures of multiple signals generated by distinct neural processes. Some reflect detailed predictions about rewarding and aversive experiences, while others reflect fast responses to events of high potential importance.
In addition, we have proposed a hypothesis about the nature of these diverse DA signals, the neural networks that generate them, and their influence on downstream brain structures and on motivated behavior. Our proposal can be seen as a synthesis of previous theories. Many previous theories have attempted to identify DA neurons with a single motivational process such as seeking of valued goals, engaging motivationally salient situations, or reacting to alerting changes in the environment. In our view, DA neurons receive signals related to all three of these processes. Yet rather than distilling these signals into a uniform message, we have proposed that DA neurons transmit these signals to distinct brain structures in order to support distinct neural systems for motivated cognition and behavior. Some DA neurons support brain systems that assign motivational value, promoting actions to seek rewarding events, avoid aversive events, and ensure that alerting events can be predicted and prepared for in advance. Other DA neurons support brain systems that are engaged by motivational salience, including orienting to detect potentially important events, cognitive processing to choose a response and to remember its consequences, and motivation to persist in pursuit of an optimal outcome. We hope that this proposal helps lead us to a more refined understanding of DA functions in the brain, in which DA neurons tailor their signals to support multiple neural networks with distinct roles in motivational control.
This work was supported by the intramural research program at the National Eye Institute. We also thank Amy Arnsten for valuable discussions.
Footnote 1: By motivational salience we mean a quantity that is high for both rewarding and aversive events and is low for motivationally neutral (non-rewarding and non-aversive) events. This is similar to the definition given by Berridge and Robinson (1998). Note that motivational salience is distinct from other notions of salience used in neuroscience, such as incentive salience (which applies only to desirable events; Berridge and Robinson, 1998) and perceptual salience (which applies to motivationally neutral events such as moving objects and colored lights; Bisley and Goldberg, 2010).
Footnote 2: Note that motivational salience-coding DA neuron signals are distinct from the classic notions of “associability” and “change in associability” that have been proposed to regulate the rate of reinforcement learning (e.g., Pearce and Hall, 1980). Such theories state that animals learn (and adjust learning rates) from both positive and negative prediction errors. Although these DA neurons may contribute to learning from positive prediction errors, during which they can have a strong response (e.g. to unexpected reward delivery), they may not contribute to learning from negative prediction errors, during which they can have little or no response (e.g. to unexpected reward omission) (Fig. 4B).
J Neurosci. 2014 Oct 22;34(43):14349-64. doi: 10.1523/JNEUROSCI.3492-14.2014.
Approach to reward is a fundamental adaptive behavior, disruption of which is a core symptom of addiction and depression. Nucleus accumbens (NAc) dopamine is required for reward-predictive cues to activate vigorous reward seeking, but the underlying neural mechanism is unknown. Reward-predictive cues elicit both dopamine release in the NAc and excitations and inhibitions in NAc neurons.
However, a direct link has not been established between dopamine receptor activation, NAc cue-evoked neuronal activity, and reward-seeking behavior. Here, we use a novel microelectrode array that enables simultaneous recording of neuronal firing and local dopamine receptor antagonist injection. We demonstrate that, in the NAc of rats performing a discriminative stimulus task for sucrose reward, blockade of either D1 or D2 receptors selectively attenuates excitation, but not inhibition, evoked by reward-predictive cues.
Furthermore, we establish that this dopamine-dependent signal is necessary for reward-seeking behavior. These results demonstrate a neural mechanism by which NAc dopamine invigorates environmentally cued reward-seeking behavior.
Dept. of Psychology, Univ. of Illinois at Chicago, Chicago, IL 60607.
Adolescence may be a period of vulnerability to drug addiction. In rats, elevated firing activity of ventral tegmental area (VTA) dopamine neurons predicts enhanced addiction liability. Our aim was to determine if dopamine neurons are more active in adolescents than in adults and to examine mechanisms underlying any age-related difference. VTA dopamine neurons fired faster in adolescents than in adults as measured with in vivo extracellular recordings. Dopamine neuron firing can be divided into nonbursting (single spikes) and bursting activity (clusters of high-frequency spikes). Nonbursting activity was higher in adolescents compared with adults. Frequency of burst events did not differ between ages, but bursts were longer in adolescents than in adults. Elevated dopamine neuron firing in adolescent rats was also observed in cell-attached recordings in ex vivo brain slices. Using whole cell recordings, we found that passive and active membrane properties were similar across ages. Hyperpolarization-activated cation currents and small-conductance calcium-activated potassium channel currents were also comparable across ages. We found no difference in dopamine D2-class autoreceptor function across ages, although the high baseline firing in adolescents resulted in autoreceptor activation being less effective at silencing neurons. Finally, AMPA receptor-mediated spontaneous excitatory postsynaptic currents occurred at lower frequency in adolescents; GABA(A) receptor-mediated spontaneous inhibitory postsynaptic currents occurred at both lower frequency and smaller amplitude in adolescents. In conclusion, VTA dopamine neurons fire faster in adolescence, potentially because GABA tone increases as rats reach adulthood. This elevation of firing rate during adolescence is consistent with it representing a vulnerable period for developing drug addiction.
Kay M. Tye, Julie J. Mirzabekov, Melissa R. Warden, Emily A. Ferenczi, Hsing-Chen Tsai, Joel Finkelstein, Sung-Yon Kim, Avishek Adhikari, Kimberly R. Thompson, Aaron S. Andalman, Lisa A. Gunaydin, Ilana B. Witten & Karl Deisseroth
Major depression is characterized by diverse debilitating symptoms that include hopelessness and anhedonia [1]. Dopamine neurons involved in reward and motivation [2-9] are among many neural populations that have been hypothesized to be relevant [10], and certain antidepressant treatments, including medications and brain stimulation therapies, can influence the complex dopamine system. Until now it has not been possible to test this hypothesis directly, even in animal models, as existing therapeutic interventions are unable to specifically target dopamine neurons. Here we investigated directly the causal contributions of defined dopamine neurons to multidimensional depression-like phenotypes induced by chronic mild stress, by integrating behavioural, pharmacological, optogenetic and electrophysiological methods in freely moving rodents. We found that bidirectional control (inhibition or excitation) of specified midbrain dopamine neurons immediately and bidirectionally modulates (induces or relieves) multiple independent depression symptoms caused by chronic stress. By probing the circuit implementation of these effects, we observed that optogenetic recruitment of these dopamine neurons potently alters the neural encoding of depression-related behaviours in the downstream nucleus accumbens of freely moving rodents, suggesting that processes affecting depression symptoms may involve alterations in the neural encoding of action in limbic circuitry.
Dysfunction in dopamine signaling profoundly changes the activity level of about 2,000 genes in the brain's prefrontal cortex and may be an underlying cause of certain complex neuropsychiatric disorders, such as schizophrenia, according to UC Irvine scientists.
This epigenetic alteration of gene activity in brain cells that receive this neurotransmitter showed for the first time that dopamine deficiencies can affect a variety of behavioral and physiological functions regulated in the prefrontal cortex.
The study, led by Emiliana Borrelli, a UCI professor of microbiology & molecular genetics, appears online in the journal Molecular Psychiatry.
"Our work presents new leads to understanding neuropsychiatric disorders," Borrelli said. "Genes previously linked to schizophrenia seem to be dependent on the controlled release of dopamine at specific locations in the brain. Interestingly, this study shows that altered dopamine levels can modify gene activity through epigenetic mechanisms despite the absence of genetic mutations of the DNA."
Dopamine is a neurotransmitter that acts within certain brain circuitries to help manage functions ranging from movement to emotion. Changes in the dopaminergic system are correlated with cognitive, motor, hormonal and emotional impairment. Excesses in dopamine signaling, for example, have been identified as a trigger for neuropsychiatric disorder symptoms.
Borrelli and her team wanted to understand what would happen if dopamine signaling was hindered. To do this, they used mice that lacked dopamine receptors in midbrain neurons, which radically affected regulated dopamine synthesis and release.
The researchers discovered that this receptor mutation profoundly altered gene expression in neurons receiving dopamine at distal sites in the brain, specifically in the prefrontal cortex. Borrelli said they observed a remarkable decrease in expression levels of some 2,000 genes in this area, coupled with a widespread increase in modifications of basic DNA proteins called histones – particularly those associated with reduced gene activity.
Borrelli further noted that the dopamine receptor-induced reprogramming led to psychotic-like behaviors in the mutant mice and that prolonged treatment with a dopamine activator restored regular signaling, pointing to one possible therapeutic approach.
The researchers are continuing their work to gain more insights into the genes altered by this dysfunctional dopamine signaling.
Comments: Reducing dopamine impaired decision-making. Researchers noticed short-sightedness and difficulties resisting short-term reward despite adverse long-term consequences. It looks like lowering dopamine, or lowering dopamine receptors, creates "addict brain."
Psychopharmacology (Berl). 2006 Oct;188(2):228-35. Epub 2006 Aug 17.
Sevy S, Hassoun Y, Bechara A, Yechiam E, Napolitano B, Burdick K, Delman H, Malhotra A. Department of Psychiatry and Behavioral Sciences, Albert Einstein College of Medicine, Yeshiva University, Bronx, and Psychiatry Research Department, The Zucker Hillside Hospital, Glen Oaks, NY, USA.
INTRODUCTION: Converging evidence from animal and human studies suggests that addiction is associated with dopaminergic dysfunction in brain reward circuits. So far, it is unclear which aspects of addictive behavior are related to dopaminergic dysfunction.
DISCUSSION: We hypothesize that a decrease in dopaminergic activity impairs emotion-based decision-making. To test this hypothesis, we investigated the effects of a decrease in dopaminergic activity on the performance of an emotion-based decision-making task, the Iowa gambling task (IGT), in 11 healthy human subjects.
MATERIALS AND METHODS: We used a double-blind, placebo-controlled, within-subject design to examine the effect of a mixture containing the branched-chain amino acids (BCAA) valine, isoleucine and leucine on prolactin, IGT performance, perceptual competency and visual aspects of visuospatial working memory, visual attention and working memory, and verbal memory. The expectancy-valence model was used to determine the relative contributions of distinct IGT components (attention to past outcomes, relative weight of wins and losses, and choice strategies) in the decision-making process.
OBSERVATIONS AND RESULTS: Compared to placebo, the BCAA mixture increased prolactin levels and impaired IGT performance. BCAA administration interfered with a particular component process of decision-making related to attention to more recent events as compared to more distant events. There were no differences between placebo and BCAA conditions for other aspects of cognition.
Our results suggest a direct link between reduced dopaminergic activity and poor emotion-based decision-making characterized by shortsightedness, and thus difficulty resisting short-term reward despite long-term negative consequences. These findings have implications for behavioral and pharmacological interventions targeting impaired emotion-based decision-making in addictive disorders.
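The three components the abstract attributes to the expectancy-valence model (attention to past outcomes, relative weight of wins and losses, and choice strategy) can be illustrated with a minimal sketch. This is not the study's fitting code: the function names, the parameter names (`w`, `a`, `c`), and the payoff values are assumptions chosen for illustration, and the equations follow one commonly described form of the model.

```python
import math

def valence(win, loss, w):
    """Weighted payoff of one card. w (0..1) is the relative weight on losses."""
    return (1.0 - w) * win - w * abs(loss)

def update_expectancy(expectancy, v, a):
    """Delta-rule update. a near 1 means recent outcomes dominate;
    a near 0 means distant outcomes are still remembered."""
    return expectancy + a * (v - expectancy)

def choice_probabilities(expectancies, trial, c):
    """Softmax over deck expectancies; c controls how choice
    consistency changes with trial number (the choice strategy)."""
    theta = (trial / 10.0) ** c
    top = max(expectancies)  # subtract the max for numerical stability
    exps = [math.exp(theta * (e - top)) for e in expectancies]
    total = sum(exps)
    return [x / total for x in exps]

def expectancy_after_demo(a, w=0.5):
    """Expectancy for one hypothetical 'risky' deck: three wins, then a big loss."""
    outcomes = [(100, 0), (100, 0), (100, 0), (100, -1250)]  # illustrative payoffs
    expectancy = 0.0
    for win, loss in outcomes:
        expectancy = update_expectancy(expectancy, valence(win, loss, w), a)
    return expectancy

if __name__ == "__main__":
    print(expectancy_after_demo(a=1.0))  # only the most recent outcome counts
    print(expectancy_after_demo(a=0.1))  # earlier wins still soften the loss
    print(choice_probabilities([10.0, -5.0], trial=20, c=1.0))
```

In this sketch, a higher `a` makes the expectancy track only the latest card, which is one way to picture the abstract's finding that the BCAA mixture shifted attention toward more recent events.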
Front Integr Neurosci. 2014 Mar 4;8:21. doi: 10.3389/fnint.2014.00021. eCollection 2014.
The presentation of novel, remarkable, and unpredictable tastes increases dopamine (DA) transmission in different DA terminal areas such as the nucleus accumbens (NAc) shell and core and the medial prefrontal cortex (mPFC), as estimated by in vivo microdialysis studies in rats. This effect undergoes adaptive regulation, as there is a decrease in DA responsiveness after a single pre-exposure to the same taste. This phenomenon, termed habituation, has been described as peculiar to NAc shell but not to NAc core and mPFC DA transmission. On this basis, it has been proposed that mPFC DA codes for generic motivational stimulus value and, together with the NAc core DA, is more consistent with a role in the expression of motivation. Conversely, NAc shell DA is specifically activated by unfamiliar or novel taste stimuli and rewards, and might serve to associate the sensory properties of the rewarding stimulus with its biological effect (Bassareo et al., 2002; Di Chiara et al., 2004). Notably, habituation of the DA response to intraoral sweet or bitter tastes is not associated with a reduction in hedonic or aversive taste reactions, thus indicating that habituation is unrelated to satiety-induced hedonic devaluation and that it is not influenced by DA alteration or depletion. This mini-review describes specific circumstances of disruption of the habituation of NAc shell DA responsiveness (De Luca et al., 2011; Bimpisidis et al., 2013). In particular, we observed an abolishment of NAc shell DA habituation to chocolate (sweet taste) by morphine sensitization and mPFC 6-hydroxy-dopamine hydrochloride (6-OHDA) lesion. Moreover, morphine sensitization was associated with the appearance of the habituation in the mPFC, and with an increased and delayed response of NAc core DA to taste in naive rats, but not in pre-exposed animals.
The results here described shed light on the mechanism of the habituation phenomenon of mesolimbic and mesocortical DA transmission, and its putative role as a marker of cortical dysfunction in specific conditions such as addiction.
Primary motivational states, both positive and negative, are often ruled by the activity of dopamine (DA) neurons in the ventral tegmental area (VTA) and their terminal targets, such as the nucleus accumbens (NAc) and the medial prefrontal cortex (mPFC). In these terminal regions, DA responds to appetitive or aversive stimuli differently depending on specific factors such as stimulus valence, stimulus sensory modality, specific DA neuron subpopulations, different terminal areas studied, and the techniques used for the detection of DA (e.g., microdialysis vs voltammetry; Fibiger and Phillips, 1988; Di Chiara, 1995; Westerink, 1995; Berridge and Robinson, 1998; Schultz, 1998; Redgrave et al., 1999; Di Chiara et al., 2004; Aragona et al., 2009; Lammel et al., 2012; McCutcheon et al., 2012).
The direct correlation between motivational stimulus valence and its effect on the responsiveness of DA transmission has been extensively appreciated by in vivo brain microdialysis studies in three different DA terminal areas: NAc shell, NAc core, and mPFC (Bassareo and Di Chiara, 1999; Bassareo et al., 2002). Particularly, it has been observed that the exposure to natural rewards (e.g., highly palatable food) and to salient food taste stimuli (sweet and bitter) increases DA transmission in NAc shell and core and in mPFC of non-food-deprived rats. In NAc shell, but not in NAc core or in mPFC, this response undergoes adaptive regulation after a single pre-exposure to the same taste/food. This response reduces following a recurrent stimulus, and is termed habituation (Thompson and Spencer, 1966; Cohen et al., 1997; Rankin et al., 2009). In NAc shell, habituation to natural rewards is taste specific, and it is reversed by food deprivation of the animals and modified by the presentation of cues associated with the stimulus (Bassareo and Di Chiara, 1999). These observations demonstrate that NAc shell DA is activated by unfamiliar appetitive taste stimuli while DA in the mPFC codes for generic motivational value independently of stimulus valence. Additionally, this underlines the role of NAc shell DA and its habituation in associative learning (Bassareo et al., 2002; Di Chiara et al., 2004).
In contrast, habituation of DA response is not present after repeated exposure to drugs of abuse (e.g., nicotine, opiates, psychostimulants, cannabinoids), which preferentially stimulate DA transmission in NAc shell as compared to NAc core (Pontieri et al., 1995, 1996; Tanda et al., 1997). However, the use of in vivo voltammetry by other labs showed opposite and specific sub-regional changes in DA concentration in response to both cued and unconditioned appetitive stimuli or after cocaine (Aragona et al., 2009; Brown et al., 2011; Badrinarayan et al., 2012).
This review describes experimental evidence for the disruption of habituation of NAc shell DA responsiveness to motivational stimuli in vivo, and the specific circumstances that could contribute to these significant changes. The data discussed here highlight the role of DA in both learning and hedonic processes.
Morphine administration increases DA transmission in the mesolimbic system, as estimated by in vivo brain microdialysis (Di Chiara and Imperato, 1988; Pontieri et al., 1996). Specific experimental protocols of repeated exposure to morphine produced sensitization.
The effect of morphine sensitization on the habituation of the responsiveness of DA transmission to a single pre-exposure to novel, remarkable and unpredictable taste stimuli has been evaluated (De Luca et al., 2011). In order to induce behavioral and biochemical sensitization, a protocol conceived by Cadoni and Di Chiara (1999) has been used. Rats received increasing doses of morphine (10, 20, 40 mg/kg s.c.) or saline twice a day for three consecutive days. After 15 days of withdrawal, rats were administered a precise amount of appetitive sweet chocolate solution through an intraoral cannula (1 ml/5 min, i.o.) during the microdialysis session for NAc shell, core and mPFC dialysate DA analysis.
Our main finding was that opiate sensitization and chocolate pre-exposure exert a differential influence on the response of DA transmission, depending on the specific subdivision of the mesocorticolimbic DA system. Figure 1 shows the effect of morphine sensitization on the response of NAc shell and core and mPFC DA levels to intraoral sweet chocolate in naive and chocolate pre-exposed rats. We reported that pre-exposure to chocolate produced opposite changes in DA transmission in the mPFC and in the NAc shell (De Luca et al., 2011). In fact, unexpected appearance of habituation in mPFC DA responsiveness to taste stimuli was accompanied by a loss of habituation in NAc shell. Moreover, morphine sensitization was associated with an increased and delayed (50–110 min after chocolate) response of NAc core DA to taste in naive rats while an immediate increase of DA was observed in pre-exposed animals. Similar results were obtained with an aversive stimulus (De Luca et al., 2011). Moreover, although sensitization to morphine is associated with long-term changes in mesolimbic and mesocortical DA responsiveness to taste stimuli, changes in behavioral taste reactivity are lacking. The latter evidence supports the hypothesis that taste-hedonia does not depend on DA (Berridge and Robinson, 1998), thus the increase of DA transmission in these brain regions could arise from the motivational and not from the sensory or hedonic properties of the taste (Bassareo and Di Chiara, 1999; Bassareo et al., 2002).
All of the DA terminal regions studied displayed changes in the habituation (i.e., abolishment vs appearance), which might result in an increased incentive arousal and learning. Notably, the habituation of mPFC DA responsiveness to chocolate releases NAc shell DA from inhibition, thereby abolishing the single-trial habituation of DA. Under this condition, repeated approaches toward a motivational stimulus might be facilitated.
In intact brain, mPFC DA prominently regulates the activity of subcortical DA areas involved in reward and motivation through a complex interaction of many different sub-regions inside the PFC (Murase et al., 1993; Taber and Fibiger, 1995; Kennerley and Walton, 2011). Such control is modulated by DA receptors in the mPFC (Louilot et al., 1989; Jaskiw et al., 1991; Vezina et al., 1991; Lacroix et al., 2000). mPFC DA functions are engaged in cognitive processes (Seamans and Yang, 2004), regulation of emotions (Sullivan, 2004), working memory (Khan and Muly, 2011), and executive functions such as motor planning, inhibitory response control and sustained attention (Fibiger and Phillips, 1988; Granon et al., 2000; Robbins, 2002).
We recently studied the effect of mPFC 6-OHDA lesion on NAc shell and core DA responsiveness to chocolate in naive and chocolate pre-exposed rats. 6-OHDA bilateral infusions in the mPFC modify the responsiveness of NAc DA to gustatory stimuli administered by an intraoral catheter. As shown in Figure 2, we observed that in NAc shell of naive subjects the lesion did not change the DA response to intraoral chocolate. However, the lesion of mPFC DA terminals produced an elevated, delayed, and prolonged increase of DA in NAc core in response to an appetitive taste stimulus. In pre-exposed subjects, the lesion did not affect NAc core DA responsiveness to chocolate while it abolished one-trial habituation of NAc shell DA response to sweet taste. After DA terminal lesions, no effect on either hedonic taste score or motor activity has been observed (Bimpisidis et al., 2013).
These observations might suggest that the mPFC DA inhibitory control of DA responsiveness in subcortical striatal areas is different depending on the ventral striatum sub-region studied. Moreover, different sub-regions within the mPFC (e.g., prelimbic, infralimbic) have different projections to different compartments of the NAc. Accordingly, in the NAc shell, which is mostly innervated by the infralimbic area, the cortical-subcortical relationship might work in an opposite manner to that in NAc core.
This is consistent with the different responsiveness of NAc shell and core DA to discrete stimuli and conditions (Di Chiara et al., 2004; Di Chiara and Bassareo, 2007; Aragona et al., 2009; Corbit and Balleine, 2011; Cacciapaglia et al., 2012).
The experimental results described here may help explain, in part, why traumatic PFC injury often facilitates development of drug use disorders (Delmonico et al., 1998). Accordingly, disruption of PFC functions appears following both traumatic conditions (Bechara and Van Der Linden, 2005) and history of drug addiction (Van den Oever et al., 2010; Goldstein and Volkow, 2011). Our data also suggest a correlation between the NAc DA responsiveness to repeated exposure to a motivational stimulus and the control of its activity by the mPFC DA. This assigns the mPFC a crucial role in subcortical dysfunction, which may occur in different stages of drug addiction. Other studies show the direct involvement of mPFC in addiction (Schenk et al., 1991; Weissenborn et al., 1997; Bolla et al., 2003), drug seeking, craving and relapse, which are related to drugs taken either by humans or animals (Kalivas and Volkow, 2005).
Remarkably, we found similarities between the effect of repeated morphine exposure and selective mPFC DA terminal lesions on DA transmission in response to motivational taste stimuli both in NAc shell and in NAc core. However, this correlation seems to exist only after prolonged administration of drugs of abuse, as a single drug exposure did not modify the habituation in NAc shell (De Luca et al., 2012). Moreover, the absence of any relationship between DA habituation and taste reactivity (Berridge, 2000; Bassareo et al., 2002; De Luca et al., 2012) has been validated.
In summary, the specific conditions leading to the abolishment of habituation illustrated in this work clarify the meaning of the habituation phenomenon of mesolimbic and mesocortical DA transmission. Habituation is usually present in NAc shell, but not in NAc core or mPFC, and it is ruled by intact DA transmission within the mPFC. However, the appearance of habituation in the mPFC could be considered as a marker of mPFC dysfunction in its ability to inhibit crucial subcortical functions. This may result in excessive motivation for inappropriate actions originating from a clear loss of impulse control. Finally, yet importantly, NAc DA habituation may be considered per se as a marker of drug dependence and its liability.
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was supported by grant from the Fondazione Banco di Sardegna, and by RAS LR 7, 2007. The author would like to thank Ms. Tonka Ivanisevic for the help with the preparation of the manuscript.
Comments: One of the latest and best reviews of dopamine's role in addiction. Volkow is one of the premier experts in addiction, and the current head of NIDA.
Volkow ND, Fowler JS, Wang GJ, Baler R, Telang F.
Neuropharmacology. 2009;56 Suppl 1:3-8.
Dopamine is involved in drug reinforcement but its role in addiction is less clear. Here we describe PET imaging studies that investigate dopamine’s involvement in drug abuse in the human brain. In humans the reinforcing effects of drugs are associated with large and fast increases in extracellular dopamine, which mimic those induced by physiological dopamine cell firing but are more intense and protracted.
Since dopamine cells fire in response to salient stimuli, supraphysiological activation by drugs is experienced as highly salient (driving attention, arousal, conditioned learning and motivation) and with repeated drug use may raise the thresholds required for dopamine cell activation and signaling. Indeed, imaging studies show that drug abusers have marked decreases in dopamine D2 receptors and in dopamine release. This decrease in dopamine function is associated with reduced regional activity in orbitofrontal cortex (involved in salience attribution; its disruption results in compulsive behaviors), cingulate gyrus (involved in inhibitory control; its disruption results in impulsivity) and dorsolateral prefrontal cortex (involved in executive function; its disruption results in impaired regulation of intentional actions). In parallel, conditioning triggered by drugs leads to enhanced dopamine signaling when exposed to conditioned cues, which then drives the motivation to procure the drug in part by activation of prefrontal and striatal regions. These findings implicate deficits in dopamine activity, linked with prefrontal and striatal deregulation, in the loss of control and compulsive drug intake that results when the addicted person takes the drugs or is exposed to conditioned cues. The decreased dopamine function in addicted individuals also reduces their sensitivity to natural reinforcers. Therapeutic interventions aimed at restoring brain dopaminergic tone and activity of cortical projection regions could improve prefrontal function, enhance inhibitory control and interfere with impulsivity and compulsive drug administration while helping to motivate the addicted person to engage in non-drug related behaviors.
Keywords: Positron emission tomography, Orbitofrontal cortex, Cingulate gyrus, Dorsolateral prefrontal cortex, Dopamine D2 receptors, Reward, Predisposition, Salience, Raclopride, Fluoro-deoxyglucose
Drugs of abuse trigger large increases in extracellular dopamine (DA) in limbic regions (including nucleus accumbens; NAc) (Di Chiara and Imperato, 1988; Koob and Bloom, 1988), which are associated with their reinforcing effects. These effects mimic but surpass the DA increases secondary to phasic DA cell firing that play a physiological role in coding for saliency and reward (Schultz et al., 2000). Though some animal studies have questioned the extent to which DA increases in NAc are associated with reward (Drevets et al., 2001; Day et al., 2007), human imaging studies have shown that drug-induced increases in DA in the striatum (including the ventral striatum, where the NAc is located) are associated with subjective descriptors of reward (high, euphoria) (Volkow et al., 1996a; Drevets et al., 2001). Nevertheless, it is also evident that the firing rate of DA cells encodes not just reward (Tobler et al., 2007) and expectancy of reward (Volkow et al., 2003b) but also the saliency of a given event or stimulus (Rolls et al., 1984; Williams et al., 1993; Horvitz, 2000; Zink et al., 2003). The saliency of an event is driven either by its unexpectedness, its novelty, its conditioned expectations or its reinforcing effects (positive as well as negative) (Volkow et al., 2003, 2006b). Firing of DA cells concomitant with the use of the drug will also facilitate the consolidation of memory traces connected to the drug. These, in turn, will trigger DA cell firing with future exposure to stimuli associated with the drug (in expectation of the reward) (Waelti et al., 2001). Because of DA’s role in motivation, the DA increases associated with drug-cues or the drug itself are also likely to modulate the motivation to procure the reward (McClure et al., 2003).
The increase in knowledge regarding the multiple roles of DA in the reinforcement processes has led to more complex models of drug addiction. It is currently believed that drugs are reinforcing not just because they are pleasurable but because, by increasing DA, they are being processed as salient stimuli that will inherently motivate the procurement of more drug (regardless of whether the drug is consciously perceived as pleasurable or not).
Brain imaging techniques have contributed greatly to this new understanding. They have allowed us to measure neurochemical and metabolic processes in the living human brain (Volkow et al., 1997a), to investigate the nature of the changes in DA induced by drugs of abuse and their behavioral relevance, and to study the plastic changes in brain DA activity and its functional consequences in drug addicted subjects. This paper provides an updated review of relevant findings.
2. Drug-induced dopamine increases in the human brain and in reinforcement
The use of positron emission tomography (PET) and specific D2 DA receptor radioligands (e.g., [11C]raclopride, [18F]N-methylspiroperidol) has proven invaluable for the study of the relationships between a drug’s ability to modulate DA and its reinforcing (i.e., euphorigenic, high-inducing, drug-liking) effects in the human brain. The approach has been used effectively to assess the effects of stimulant drugs (i.e., methylphenidate, amphetamine, cocaine) as well as those of nicotine (Barrett et al., 2004; Brody et al., 2004; Montgomery et al., 2007; Takahashi et al., 2007). Both the intravenous administration of methylphenidate (0.5 mg/kg), which like cocaine, increases DA by blocking DA transporters (DAT) as well as that of amphetamine (0.3 mg/kg), which like methamphetamine, increases DA by releasing it from the terminal via DAT, increase extracellular DA concentration in striatum and such increases are associated with self-reports of “high” and “euphoria” (Hemby et al., 1997; Villemagne et al., 1999). Interestingly, orally administered methylphenidate (0.75–1 mg/kg) also increased DA but is not typically perceived as reinforcing (Chait, 1994; Volkow et al., 2001b). Since intravenous administration leads to fast DA changes whereas oral administration increases DA slowly, the failure to observe the “high” with oral methylphenidate—or amphetamine (Stoops et al., 2007)—is likely to reflect the slower pharmacokinetics (Parasrampuria et al., 2007). Indeed, the speed at which drugs of abuse enter the brain has been recognized as a key parameter affecting its reinforcing effects (Balster and Schuster, 1973; Volkow et al., 1995, 2000). Not surprisingly, the DA increases in ventral striatum induced after smoking, which has similarly very fast rate of brain uptake, are also associated with its reinforcing effects (Brody et al., 2004).
This link between fast brain uptake (leading to fast DA changes) and the reinforcing properties of a given drug suggests the involvement of phasic DA firing. The fast bursts (>30 Hz) generated by phasic release result in abrupt fluctuations in DA levels that help to highlight the saliency of a stimulus (Grace, 2000). Such a mechanism stands in contrast to tonic DA cell firing (with slower frequencies of around 5 Hz), which is responsible for maintaining the baseline steady-state DA levels that set the DA system’s responsiveness threshold. Therefore, we have proposed that drugs of abuse manage to induce changes in DA concentration that mimic, but greatly exceed, those produced by physiologic phasic DA cell firing. On the other hand, the oral administration of stimulant drugs, which is the route used for therapeutic purposes, is likely to induce slow DA changes that resemble those associated with tonic DA cell firing (Volkow and Swanson, 2003). Because stimulant drugs block DATs, which are the main mechanism for DA removal (Williams and Galli, 2006), they could, even when given orally, increase the reinforcing value of other reinforcers (natural or drug rewards) (Volkow et al., 2001b). Similarly, nicotine, which facilitates DA cell firing, also enhances the reinforcing value of stimuli with which it is paired. In the latter case, the combination of nicotine with the natural reward becomes inextricably linked to its reinforcing effects.
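A toy calculation can show why burst firing produces more abrupt DA fluctuations than tonic firing. The sketch below assumes that each spike releases a fixed amount of DA and that DA is cleared by first-order decay between spikes. Both are simplifying assumptions made here for illustration. Only the firing rates (about 5 Hz tonic, 30 Hz burst) come from the text; the function name, release amount, clearance rate and spike counts are hypothetical.

```python
import math


def peak_dopamine(rate_hz, n_spikes, release_per_spike=1.0, clearance_per_s=5.0):
    """Peak extracellular DA (arbitrary units) for a regular spike train.

    Each spike adds release_per_spike; between spikes the level decays
    exponentially with rate clearance_per_s (a hypothetical stand-in for
    transporter-mediated removal).
    """
    interval_s = 1.0 / rate_hz
    decay = math.exp(-clearance_per_s * interval_s)
    level = 0.0
    peak = 0.0
    for _ in range(n_spikes):
        level = level * decay + release_per_spike
        peak = max(peak, level)
    return peak


# Ten seconds of tonic firing at 5 Hz versus a single 0.2 s burst at 30 Hz.
tonic_peak = peak_dopamine(rate_hz=5.0, n_spikes=50)
burst_peak = peak_dopamine(rate_hz=30.0, n_spikes=6)
print(f"tonic peak: {tonic_peak:.2f}, burst peak: {burst_peak:.2f}")
```

Under these assumptions a brief burst drives DA well above the plateau maintained by tonic firing, because spikes arrive faster than DA can be cleared.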
3. Role of dopamine in the long-term effects of drugs of abuse on DA in the human brain: involvement in addiction
Synaptic increases in DA occur during drug intoxication in both addicted and non-addicted subjects (Di Chiara and Imperato, 1988; Koob and Bloom, 1988). However, only a minority of exposed subjects (the actual proportion being a function of the type of drug used) ever develops a compulsive drive to continue taking the drug (Schuh et al., 1996). This indicates that the acute drug-induced DA increase alone cannot explain the ensuing development of addiction. Because drug addiction requires chronic drug administration, it is likely to be rooted, in vulnerable individuals, in the repeated perturbation of the DA system, triggering neuro-adaptations in reward/saliency, motivation/drive, inhibitory control/executive function and memory/conditioning circuits, all of which are modulated by dopaminergic pathways (Volkow et al., 2003a).
Consistent with this line of thought, there is mounting evidence that exposure to stimulants, nicotine, or opiates produces persistent adaptive changes in the structure of dendrites and dendritic spines on cells in key areas of the brain with roles in motivation, reward, judgment, and the inhibitory control of behavior (Robinson and Kolb, 2004). For example, chronic adaptations in DA receptor signaling may trigger compensatory glutamate receptor responses with the potential to affect synaptic plasticity (Wolf et al., 2003). The fact that DA (Wolf et al., 2003; Liu et al., 2005), but also glutamate, GABA, and other neurotransmitters, are all highly versatile modulators of synaptic plasticity, draws a direct path connecting the effects of drugs of abuse with the adaptive alterations, not only in the reward center but also in many other circuits, through the strengthening, formation, and elimination of synapses.
Multiple radiotracers have been used to detect and measure these types of changes in targets within the DA network in the human brain (Table 1). Using [18F]N-methylspiroperidol or [11C]raclopride, we and others (Martinez et al., 2004, 2005, 2007) have shown that subjects addicted to a wide variety of drugs (cocaine, heroin, alcohol, and methamphetamine) exhibit significant reductions in D2 DA receptor availability in the striatum (including ventral striatum) that persist months after protracted detoxification (Volkow et al., 2007a). Similar findings were also recently reported in nicotine-dependent subjects (Fehr et al., 2008).
Table 1 Summary of PET findings comparing various targets involved in DA neurotransmission between substance abusers and control subjects for which statistically significant differences between the groups were identified
It is also relevant to point out in this context that the striatal increases in DA induced by intravenous methylphenidate or intravenous amphetamine (assessed with [11C]raclopride) in cocaine abusers and alcoholics are at least 50% lower than in control subjects (Volkow et al., 1997b; Martinez et al., 2007). Since DA increases induced by methylphenidate are dependent on DA release—a function of DA cell firing—it is reasonable to hypothesize that the difference likely reflects decreased DA cell activity in these drug abusers.
It is important to keep in mind that the results of PET studies done with [11C]raclopride, which is sensitive to competition with endogenous DA, merely reflect the vacant D2 DA receptors available to bind the tracer. Thus, any reduction in D2 DA receptor availability as measured with [11C]raclopride could reflect decreases in levels of D2 DA receptors, increases in DA release (competing with [11C]raclopride for binding to the D2 receptors) in striatum (including NAc), or both. However, the fact that cocaine abusers given i.v. methylphenidate (MP) showed blunted reductions in specific binding (indicative of decreased DA release) indicates that in cocaine abusers there is both a reduction in the levels of D2 receptors and a decrease in DA release in striatum. Each would contribute to the decreased sensitivity in addicted subjects to natural reinforcers (Volkow et al., 2002b). Because drugs are much more potent at stimulating DA-regulated reward circuits than natural reinforcers, drugs would still be able to activate the depressed reward circuits. This decreased sensitivity, on the other hand, would result in a reduced interest in environmental stimuli, possibly predisposing subjects to seek drug stimulation as a means to temporarily activate these reward circuits. As time progresses, the chronic nature of this behavior may explain the transition from taking drugs in order to feel “high” to taking them just to feel normal.
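The ambiguity of a single [11C]raclopride scan, and the way a stimulant challenge helps to resolve it, can be sketched with the usual simple competitive-binding approximation, in which tracer binding is proportional to Bmax / (Kd (1 + DA/Ki)). This is an illustration under that assumption only. The function names and every numeric value are hypothetical and are not taken from the studies cited.

```python
def binding_potential(bmax, kd, dopamine, ki):
    """Apparent tracer binding under simple competitive binding.

    bmax: receptor density; kd: tracer affinity constant; dopamine:
    endogenous DA level; ki: DA affinity constant (all arbitrary units).
    """
    return bmax / (kd * (1.0 + dopamine / ki))


def percent_displacement(bp_baseline, bp_challenge):
    """Percent reduction in binding after a DA-increasing challenge."""
    return 100.0 * (bp_baseline - bp_challenge) / bp_baseline


KD, KI = 1.0, 1.0  # hypothetical affinity constants

# A single baseline scan cannot separate these two explanations:
control = binding_potential(30.0, KD, dopamine=1.0, ki=KI)
fewer_receptors = binding_potential(15.0, KD, dopamine=1.0, ki=KI)
more_dopamine = binding_potential(30.0, KD, dopamine=3.0, ki=KI)

# A challenge scan adds information: in this model the displacement depends
# on how much DA rises, not on receptor density.
control_disp = percent_displacement(
    control, binding_potential(30.0, KD, dopamine=2.0, ki=KI))
blunted_disp = percent_displacement(
    fewer_receptors, binding_potential(15.0, KD, dopamine=1.5, ki=KI))
print(f"baseline binding: {control:.1f}, {fewer_receptors:.1f}, {more_dopamine:.1f}")
print(f"displacement: control {control_disp:.1f}%, blunted {blunted_disp:.1f}%")
```

In this model, halving receptor density and tripling baseline DA give the same baseline binding, whereas a smaller challenge-induced displacement points to a smaller DA increase. That is why low baseline binding together with blunted displacement is read as evidence for both fewer receptors and reduced DA release.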
What are the metabolic and functional correlates of such long-term drug-induced perturbation in dopaminergic balance? Using the PET radiotracer [18F]fluoro-deoxyglucose (FDG), which measures regional brain glucose metabolism, we and others have shown decreased activity in orbitofrontal cortex (OFC), cingulate gyrus (CG) and dorsolateral prefrontal cortex (DLPFC) in addicted subjects (alcoholics, cocaine abusers, marihuana abusers) (London et al., 1990; Galynker et al., 2000; Ersche et al., 2006; Volkow et al., 2007a). Moreover, in cocaine (Volkow and Fowler, 2000) and methamphetamine (Volkow et al., 2001a) addicted subjects and in alcoholics (Volkow et al., 2007d), we have shown that the reduced activity in OFC, CG and DLPFC is associated with decreased availability of D2 DA receptors in striatum (see Fig. 1 for cocaine and methamphetamine results). Since the OFC, CG and DLPFC are involved with inhibitory control (Goldstein and Volkow, 2002) and with emotional processing (Phan et al., 2002), we had postulated that their abnormal regulation by DA in addicted subjects could underlie their loss of control over drug intake and their poor emotional self-regulation. Indeed, in alcoholics, reductions in D2 DA receptor availability in ventral striatum have been shown to be associated with alcohol craving severity and with greater cue-induced activation of the medial prefrontal cortex and anterior CG, as assessed with fMRI (Heinz et al., 2004). In addition, because damage to the OFC results in perseverative behaviors (Rolls, 2000), and because in humans impairments in OFC and CG are associated with obsessive compulsive behaviors (Saxena et al., 2002), we have also postulated that DA impairment of these regions could underlie the compulsive drug intake that characterizes addiction (Volkow et al., 2005).
Fig. 1 (A) Normalized volume distribution of [11C]raclopride binding in the striatum of cocaine and methamphetamine abusers and non-drug-abusing comparison subjects. (B) Correlation of DA receptor availability (Bmax/Kd) in the striatum with measures of metabolic activity in the orbitofrontal cortex (OFC) in cocaine (closed diamonds) and methamphetamine (open diamonds) abusers. Modified with permission based on Volkow et al. (1993, 2001a).
However, the association could also be interpreted to indicate that impaired activity in prefrontal regions could put individuals at risk for drug abuse and that only then the repeated drug use could result in the downregulation of D2 DA receptors.
DA also modulates the activity of the hippocampus, amygdala and dorsal striatum, which are regions implicated in memory, conditioning, and habit formation (Volkow et al., 2002a). Moreover, adaptations in these regions have been documented in preclinical models of drug abuse (Kauer and Malenka, 2007). Indeed, there is increasing recognition of the relevance and likely involvement of memory and learning mechanisms in drug addiction (Vanderschuren and Everitt, 2005). The effects of drugs of abuse on memory systems suggest ways that neutral stimuli can acquire reinforcing properties and motivational salience—that is, through conditioned-incentive learning. In research on relapse, it has been very important to understand why drug addicted subjects experience an intense desire for the drug when exposed to places where they have taken the drug, to people with whom prior drug use had occurred, and to paraphernalia used to administer the drug. This is clinically relevant since exposure to conditioned cues (stimuli that had become strongly linked to the drug experience) is a key contributor to relapse. Since DA is involved with prediction of reward (Schultz, 2002), DA has been predicted to underlie the conditioned responses that trigger craving. Preclinical studies support this hypothesis: when neutral stimuli are paired with a drug, animals will—with repeated associations—acquire the ability to increase DA in NAc and dorsal striatum when exposed to the now conditioned cue. Predictably, these neurochemical responses have been found to be associated with drug-seeking behavior (Vanderschuren and Everitt, 2005).
In humans, PET studies with [11C]raclopride recently confirmed this hypothesis by showing that in cocaine abusers drug cues (cocaine-cue video of scenes of subjects taking cocaine) significantly increased DA in dorsal striatum, and that these increases were also associated with cocaine craving (Volkow et al., 2006c; Wong et al., 2006) in a cue-dependent fashion (Volkow et al., 2008). Because the dorsal striatum is implicated in habit learning, this association is likely to reflect the strengthening of habits as chronicity of addiction develops. This suggests that the DA-triggered conditioned responses that form first habits and then compulsive drug consumption may reflect a fundamental neurobiological perturbation in addiction. It is likely that these conditioned responses involve adaptations in cortico-striatal glutamatergic pathways that regulate DA release (Vanderschuren and Everitt, 2005).
To assess whether cue-induced DA increases reflect a primary or a secondary response to the cue, a recent imaging study in cocaine-addicted subjects evaluated the effects of increasing DA (achieved by oral administration of methylphenidate), with and without the cue, in an attempt to determine whether DA increases by themselves could induce craving. The results of the study revealed a clear dissociation between oral methylphenidate-induced DA increases and cue-associated cravings (Volkow et al., 2008), suggesting that cue-induced DA increases are not the primary effectors but rather reflect downstream stimulation of DA cells (cortico-striatal glutamatergic pathways that regulate DA release; Kalivas and Volkow, 2005). This observation further illuminates the subtle effects of DA firing rate upon addiction circuitry, for the failure of methylphenidate-induced DA increases to induce craving in this paradigm could be explained by the slow nature of the DA increases. On the other hand, fast DA changes as triggered by phasic DA cell firing (as a secondary response to the activation of descending pathways) may underlie the successful induction of cravings with exposure to a cue. It is worth highlighting that Martinez et al. reported a negative correlation between the DA increases induced by intravenous amphetamine in cocaine abusers and their choice of cocaine over money when tested on a separate paradigm (Martinez et al., 2007). That is, the subjects who showed the lower DA increases when given amphetamine were the ones more likely to select cocaine over a monetary reinforcer. Because in their studies they also reported reduced DA increases in cocaine abusers when compared with controls, this could indicate that cocaine abusers with the most severe decreases in brain dopaminergic activity are the ones more likely to choose cocaine over other reinforcers.
4. DA and vulnerability to drug abuse
Understanding why some individuals are more vulnerable to becoming addicted to drugs than others remains one of the most challenging questions in drug abuse research. In healthy non-drug abusing controls we showed that D2 DA receptor availability in the striatum modulated their subjective responses to the stimulant drug methylphenidate. Subjects describing the experience as pleasant had significantly lower levels of receptors compared with those describing methylphenidate as unpleasant (Volkow et al., 1999, 2002c). This suggests that the relationship between DA levels and reinforcing responses follows an inverted u-shaped curve: too little is suboptimal for reinforcement while too much may become aversive. Thus, high D2 DA receptor levels could protect against drug self administration. Support for this is provided by preclinical studies, which showed that higher levels of D2 DA receptors in NAc significantly reduced alcohol intake in animals previously trained to self-administer alcohol (Thanos et al., 2001) and the tendency of group-housed cynomolgus macaques to self-administer cocaine (Morgan et al., 2002), and by clinical studies showing that subjects who despite having a dense family history of alcoholism were not alcoholics had significantly higher D2 DA receptors in striatum than individuals without such family histories (Volkow et al., 2006a). The higher the D2 DA receptors in these subjects, the higher their metabolism in OFC and CG. Thus we can postulate that high levels of D2 DA receptors may protect against alcoholism by modulating frontal circuits involved in salience attribution and inhibitory control.
On the other end of the spectrum, we have found evidence of depressed dopamine activity in specific brain regions of adults with ADHD compared to controls. Deficiencies were seen at the level of both D2 DA receptors and DA release in the caudate (Volkow et al., 2007b) and in the ventral striatum (Volkow et al., 2007c). And, consistent with the current model, the depressed DA phenotype was associated with higher scores on self-reports of methylphenidate liking (Volkow et al., 2007b). Interestingly, if left untreated, individuals with ADHD have a high risk for substance abuse disorders (Elkins et al., 2007).
Finally, sex differences in addictive disorders have been observed repeatedly, and it would be reasonable to ask whether imaging studies could substantiate the preclinical evidence suggesting such differences are due in part to striatal DA system differences and/or whether they result from differences in activity of prefrontal regions (Koch et al., 2007). Indeed, recent studies have documented sexually dimorphic patterns of amphetamine-induced striatal DA release (Munro et al., 2006; Riccardi et al., 2006) that could impact substance abuse vulnerability differently in men and women; although the data do not permit at this point a clear cut conclusion as to whether men or women display greater DA responses. It is also likely that the patterns will be sensitive to experimental conditions, such as context, age and stage of menstrual cycle.
When combined, these observations provide critical insight into the striatal DA system’s contribution to addiction vulnerability, to the emergence of frequent psychiatric comorbid pairings, and to the observed sexually dimorphic patterns of substance abuse.
5. Treatment implications
Imaging studies have corroborated the role of DA in the reinforcing effects of drugs of abuse in humans and have extended traditional views of DA involvement in drug addiction. These findings suggest multiprong strategies for the treatment of drug addiction that should attempt to (a) decrease the reward value of the drug of choice and increase the reward value of non-drug reinforcers; (b) weaken conditioned drug behaviors, and the motivational drive to take the drug; and (c) strengthen frontal inhibitory and executive control. Not discussed in this review is the critical involvement of circuits that regulate emotions and response to stress (Koob and Le Moal, 1997) as well as those responsible for interoceptive perception of needs and desires (Gray and Critchley, 2007), which are also potential targets for therapeutic interventions.
References
1. Balster RL, Schuster CR. Fixed-interval schedule of cocaine reinforcement: effect of dose and infusion duration. J. Exp. Anal. Behav. 1973;20:119–129. [PMC free article] [PubMed]
2. Barrett SP, Boileau I, Okker J, Pihl RO, Dagher A. The hedonic response to cigarette smoking is proportional to dopamine release in the human striatum as measured by positron emission tomography and [11C]raclopride. Synapse. 2004;54:65–71. [PubMed]
3. Brody AL, Olmstead RE, London ED, et al. Smoking-induced ventral striatum dopamine release. Am. J. Psychiatry. 2004;161:1211–1218. [PubMed]
4. Chait LD. Reinforcing and subjective effects of methylphenidate in humans. Behav. Pharmacol. 1994;5:281–288. [PubMed]
5. Chang L, Alicata D, Ernst T, Volkow N. Structural and metabolic brain changes in the striatum associated with methamphetamine abuse. Addiction. 2007;102 Suppl. 1:16–32. [PubMed]
6. Day JJ, Roitman MF, Wightman RM, Carelli RM. Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens. Nat. Neurosci. 2007;10:1020–1028. [PubMed]
7. Di Chiara G, Imperato A. Drugs abused by humans preferentially increase synaptic dopamine concentrations in the mesolimbic system of freely moving rats. Proc. Natl. Acad. Sci. U.S.A. 1988;85:5274–5278. [PMC free article] [PubMed]
8. Drevets WC, Gautier C, Price JC, et al. Amphetamine-induced dopamine release in human ventral striatum correlates with euphoria. Biol. Psychiatry. 2001;49:81–96. [PubMed]
9. Elkins IJ, McGue M, Iacono WG. Prospective effects of attention-deficit/hyperactivity disorder, conduct disorder, and sex on adolescent substance use and abuse. Arch. Gen. Psychiatry. 2007;64:1145–1152. [PubMed]
10. Ersche KD, Fletcher PC, Roiser JP, et al. Differences in orbitofrontal activation during decision-making between methadone-maintained opiate users, heroin users and healthy volunteers. Psychopharmacology (Berl.) 2006;188:364–373. [PMC free article] [PubMed]
11. Fehr C, Yakushev I, Hohmann N, et al. Association of low striatal dopamine D2 receptor availability with nicotine dependence similar to that seen with other drugs of abuse. Am. J. Psychiatry. 2008;165:507–514. [PubMed]
12. Fowler JS, Logan J, Wang GJ, Volkow ND. Monoamine oxidase and cigarette smoking. Neurotoxicology. 2003;24:75–82. [PubMed]
13. Galynker II, Watras-Ganz S, Miner C, et al. Cerebral metabolism in opiate-dependent subjects: effects of methadone maintenance. Mt. Sinai J. Med. 2000;67:381–387. [PubMed]
14. Goldstein RZ, Volkow ND. Drug addiction and its underlying neurobiological basis: neuroimaging evidence for the involvement of the frontal cortex. Am. J. Psychiatry. 2002;159:1642–1652. [PMC free article] [PubMed]
15. Grace AA. The tonic/phasic model of dopamine system regulation and its implications for understanding alcohol and psychostimulant craving. Addiction. 2000;95 Suppl. 2:S119–S128. [PubMed]
16. Gray MA, Critchley HD. Interoceptive basis to craving. Neuron. 2007;54:183–186. [PMC free article] [PubMed]
17. Heinz A, Siessmeier T, Wrase J, et al. Correlation between dopamine D(2) receptors in the ventral striatum and central processing of alcohol cues and craving. Am. J. Psychiatry. 2004;161:1783–1789. [PubMed]
18. Heinz A, Siessmeier T, Wrase J, et al. Correlation of alcohol craving with striatal dopamine synthesis capacity and D2/3 receptor availability: a combined [18F]DOPA and [18F]DMFP PET study in detoxified alcoholic patients. Am. J. Psychiatry. 2005;162:1515–1520. [PubMed]
19. Hemby SE, Johnson BA, Dworkin SI. Neurobiological Basis of Drug Reinforcement. Philadelphia: Lippincott-Raven; 1997.
20. Hietala J, West C, Syvalahti E, et al. Striatal D2 dopamine receptor binding characteristics in vivo in patients with alcohol dependence. Psychopharmacology (Berl.) 1994;116:285–290. [PubMed]
21. Horvitz JC. Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience. 2000;96:651–656. [PubMed]
22. Kalivas PW, Volkow ND. The neural basis of addiction: a pathology of motivation and choice. Am. J. Psychiatry. 2005;162:1403–1413. [PubMed]
23. Kauer JA, Malenka RC. Synaptic plasticity and addiction. Nat. Rev. Neurosci. 2007;8:844–858. [PubMed]
24. Koch K, Pauly K, Kellermann T, et al. Gender differences in the cognitive control of emotion: An fMRI study. Neuropsychologia. 2007;45:2744–2754. [PubMed]
25. Koob GF, Bloom FE. Cellular and molecular mechanisms of drug dependence. Science. 1988;242:715–723. [PubMed]
26. Koob GF, Le Moal M. Drug abuse: hedonic homeostatic dysregulation. Science. 1997;278:52–58. [PubMed]
27. Laine TP, Ahonen A, Torniainen P, et al. Dopamine transporters increase in human brain after alcohol withdrawal. Mol. Psychiatry. 1999;4:189–191. 104–105. [PubMed]
28. Liu QS, Pu L, Poo MM. Repeated cocaine exposure in vivo facilitates LTP induction in midbrain dopamine neurons. Nature. 2005;437:1027–1031. [PMC free article] [PubMed]
29. London ED, Cascella NG, Wong DF, et al. Cocaine-induced reduction of glucose utilization in human brain. A study using positron emission tomography and [fluorine 18]-fluorodeoxyglucose. Arch. Gen. Psychiatry. 1990;47:567–574. [PubMed]
30. Malison RT, Best SE, van Dyck CH, et al. Elevated striatal dopamine transporters during acute cocaine abstinence as measured by [123I] beta-CIT SPECT. Am. J. Psychiatry. 1998;155:832–834. [PubMed]
31. Martinez D, Broft A, Foltin RW, et al. Cocaine dependence and D2 receptor availability in the functional subdivisions of the striatum: relationship with cocaine-seeking behavior. Neuropsychopharmacology. 2004;29:1190–1202. [PubMed]
32. Martinez D, Gil R, Slifstein M, et al. Alcohol dependence is associated with blunted dopamine transmission in the ventral striatum. Biol. Psychiatry. 2005;58:779–786. [PubMed]
33. Martinez D, Narendran R, Foltin RW, et al. Amphetamine-induced dopamine release: markedly blunted in cocaine dependence and predictive of the choice to self-administer cocaine. Am. J. Psychiatry. 2007;164:622–629. [PubMed]
34. McClure SM, Daw ND, Montague PR. A computational substrate for incentive salience. Trends Neurosci. 2003;26:423–428. [PubMed]
35. Montgomery AJ, Lingford-Hughes AR, Egerton A, Nutt DJ, Grasby PM. The effect of nicotine on striatal dopamine release in man: A [11C]raclopride PET study. Synapse. 2007;61:637–645. [PubMed]
36. Morgan D, Grant KA, Gage HD, et al. Social dominance in monkeys: dopamine D2 receptors and cocaine self-administration. Nat. Neurosci. 2002;5:169–174. [PubMed]
37. Munro CA, McCaul ME, Wong DF, et al. Sex differences in striatal dopamine release in healthy adults. Biol. Psychiatry. 2006;59:966–974. [PubMed]
38. Parasrampuria DA, Schoedel KA, Schuller R, et al. Assessment of pharmacokinetics and pharmacodynamic effects related to abuse potential of a unique oral osmotic-controlled extended-release methylphenidate formulation in humans. J. Clin. Pharmacol. 2007;47:1476–1488. [PubMed]
39. Phan KL, Wager T, Taylor SF, Liberzon I. Functional neuroanatomy of emotion: a meta-analysis of emotion activation studies in PET and fMRI. Neuroimage. 2002;16:331–348. [PubMed]
40. Riccardi P, Zald D, Li R, et al. Sex differences in amphetamine-induced displacement of [(18)F]fallypride in striatal and extrastriatal regions: a PET study. Am. J. Psychiatry. 2006;163:1639–1641. [PubMed]
41. Robinson TE, Kolb B. Structural plasticity associated with exposure to drugs of abuse. Neuropharmacology. 2004;47 Suppl. 1:33–46. [PubMed]
42. Rolls ET. The orbitofrontal cortex and reward. Cereb. Cortex. 2000;10:284–294. [PubMed]
43. Rolls ET, Thorpe SJ, Boytim M, Szabo I, Perrett DI. Responses of striatal neurons in the behaving monkey. 3. Effects of iontophoretically applied dopamine on normal responsiveness. Neuroscience. 1984;12:1201–1212. [PubMed]
44. Saxena S, Brody AL, Ho ML, et al. Differential cerebral metabolic changes with paroxetine treatment of obsessive-compulsive disorder vs major depression. Arch. Gen. Psychiatry. 2002;59:250–261. [PubMed]
45. Schlaepfer TE, Pearlson GD, Wong DF, Marenco S, Dannals RF. PET study of competition between intravenous cocaine and [11C]raclopride at dopamine receptors in human subjects. Am. J. Psychiatry. 1997;154:1209–1213. [PubMed]
46. Schuh LM, Schuh KJ, Henningfield JE. Pharmacologic Determinants of Tobacco Dependence. Am. J. Ther. 1996;3:335–341. [PubMed]
47. Schultz W. Getting formal with dopamine and reward. Neuron. 2002;36:241–263. [PubMed]
48. Schultz W, Tremblay L, Hollerman JR. Reward processing in primate orbitofrontal cortex and basal ganglia. Cereb. Cortex. 2000;10:272–284. [PubMed]
49. Sevy S, Smith GS, Ma Y, et al. Cerebral glucose metabolism and D(2)/D(3) receptor availability in young adults with cannabis dependence measured with positron emission tomography. Psychopharmacology (Berl.) 2008;197:549–556. [PubMed]
50. Stoops WW, Vansickel AR, Lile JA, Rush CR. Acute d-amphetamine pretreatment does not alter stimulant self-administration in humans. Pharmacol. Biochem. Behav. 2007;87:20–29. [PMC free article] [PubMed]
51. Takahashi H, Fujimura Y, Hayashi M, et al. Enhanced dopamine release by nicotine in cigarette smokers: a double-blind, randomized, placebo-controlled pilot study. Int. J. Neuropsychopharmacol. 2007:1–5. [PubMed]
52. Thanos PK, Volkow ND, Freimuth P, et al. Overexpression of dopamine D2 receptors reduces alcohol self-administration. J. Neurochem. 2001;78:1094–1103. [PubMed]
53. Tobler PN, O’Doherty JP, Dolan RJ, Schultz W. Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. J. Neurophysiol. 2007;97:1621–1632. [PMC free article] [PubMed]
54. Vanderschuren LJ, Everitt BJ. Behavioral and neural mechanisms of compulsive drug seeking. Eur. J. Pharmacol. 2005;526:77–88. [PubMed]
55. Villemagne VL, Wong DF, Yokoi F, et al. GBR12909 attenuates amphetamine-induced striatal dopamine release as measured by [(11)C]raclopride continuous infusion PET scans. Synapse. 1999;33:268–273. [PubMed]
56. Volkow ND, Fowler JS. Addiction, a disease of compulsion and drive: involvement of the orbitofrontal cortex. Cereb. Cortex. 2000;10:318–325. [PubMed]
57. Volkow ND, Swanson JM. Variables that affect the clinical use and abuse of methylphenidate in the treatment of ADHD. Am. J. Psychiatry. 2003;160:1909–1918. [PubMed]
58. Volkow ND, Fowler JS, Wang GJ, et al. Decreased dopamine D2 receptor availability is associated with reduced frontal metabolism in cocaine abusers. Synapse. 1993;14:169–177. [PubMed]
59. Volkow ND, Ding YS, Fowler JS, et al. Is methylphenidate like cocaine? Studies on their pharmacokinetics and distribution in the human brain. Arch. Gen. Psychiatry. 1995;52:456–463. [PubMed]
60. Volkow ND, Wang GJ, Fowler JS, et al. Relationship between psychostimulant-induced “high” and dopamine transporter occupancy. Proc. Natl. Acad. Sci. U.S.A. 1996a;93:10388–10392. [PMC free article] [PubMed]
61. Volkow ND, Wang GJ, Fowler JS, et al. Cocaine uptake is decreased in the brain of detoxified cocaine abusers. Neuropsychopharmacology. 1996b;14:159–168. [PubMed]
62. Volkow ND, Wang GJ, Fowler JS, et al. Decreases in dopamine receptors but not in dopamine transporters in alcoholics. Alcohol Clin. Exp. Res. 1996c;20:1594–1598. [PubMed]
63. Volkow ND, Rosen B, Farde L. Imaging the living human brain: magnetic resonance imaging and positron emission tomography. Proc. Natl. Acad. Sci. U.S.A. 1997a;94:2787–2788. [PMC free article] [PubMed]
64. Volkow ND, Wang GJ, Fowler JS, et al. Decreased striatal dopaminergic responsiveness in detoxified cocaine-dependent subjects. Nature. 1997b;386:830–833. [PubMed]
65. Volkow ND, Wang GJ, Fowler JS, et al. Prediction of reinforcing responses to psychostimulants in humans by brain dopamine D2 receptor levels. Am. J. Psychiatry. 1999;156:1440–1443. [PubMed]
66. Volkow ND, Wang GJ, Fischman MW, et al. Effects of route of administration on cocaine induced dopamine transporter blockade in the human brain. Life Sci. 2000;67:1507–1515. [PubMed]
67. Volkow ND, Chang L, Wang GJ, et al. Low level of brain dopamine D2 receptors in methamphetamine abusers: association with metabolism in the orbitofrontal cortex. Am. J. Psychiatry. 2001a;158:2015–2021. [PubMed]
68. Volkow ND, Wang G, Fowler JS, et al. Therapeutic doses of oral methylphenidate significantly increase extracellular dopamine in the human brain. J. Neurosci. 2001b;21:RC121. [PubMed]
69. Volkow ND, Fowler JS, Wang GJ, Goldstein RZ. Role of dopamine, the frontal cortex and memory circuits in drug addiction: insight from imaging studies. Neurobiol. Learn. Mem. 2002a;78:610–624. [PubMed]
70. Volkow ND, Fowler JS, Wang GJ. Role of dopamine in drug reinforcement and addiction in humans: results from imaging studies. Behav. Pharmacol. 2002b;13:355–366. [PubMed]
71. Volkow ND, Wang GJ, Fowler JS, et al. Brain DA D2 receptors predict reinforcing effects of stimulants in humans: replication study. Synapse. 2002c;46:79–82. [PubMed]
72. Volkow ND, Wang GJ, Maynard L, et al. Effects of alcohol detoxification on dopamine D2 receptors in alcoholics: a preliminary study. Psychiatry Res. 2002d;116:163–172. [PubMed]
73. Volkow ND, Fowler JS, Wang GJ. The addicted human brain: insights from imaging studies. J. Clin. Invest. 2003a;111:1444–1451. [PMC free article] [PubMed]
74. Volkow ND, Wang GJ, Ma Y, et al. Expectation enhances the regional brain metabolic and the reinforcing effects of stimulants in cocaine abusers. J. Neurosci. 2003b;23:11461–11468. [PubMed]
75. Volkow ND, Wang GJ, Ma Y, et al. Activation of orbital and medial prefrontal cortex by methylphenidate in cocaine-addicted subjects but not in controls: relevance to addiction. J. Neurosci. 2005;25:3932–3939. [PubMed]
76. Volkow ND, Wang GJ, Begleiter H, et al. High levels of dopamine D2 receptors in unaffected members of alcoholic families: possible protective factors. Arch. Gen. Psychiatry. 2006a;63:999–1008. [PubMed]
77. Volkow ND, Wang GJ, Ma Y, et al. Effects of expectation on the brain metabolic responses to methylphenidate and to its placebo in non-drug abusing subjects. Neuroimage. 2006b;32:1782–1792. [PubMed]
78. Volkow ND, Wang GJ, Telang F, et al. Cocaine cues and dopamine in dorsal striatum: mechanism of craving in cocaine addiction. J. Neurosci. 2006c;26:6583–6588. [PubMed]
79. Volkow ND, Fowler JS, Wang GJ, Swanson JM, Telang F. Dopamine in drug abuse and addiction: results of imaging studies and treatment implications. Arch. Neurol. 2007a;64:1575–1579. [PubMed]
80. Volkow ND, Wang GJ, Newcorn J, et al. Depressed dopamine activity in caudate and preliminary evidence of limbic involvement in adults with attention-deficit/hyperactivity disorder. Arch. Gen. Psychiatry. 2007b;64:932–940. [PubMed]
81. Volkow ND, Wang GJ, Newcorn J, et al. Brain dopamine transporter levels in treatment and drug naive adults with ADHD. Neuroimage. 2007c;34:1182–1190. [PubMed]
82. Volkow ND, Wang GJ, Telang F, et al. Profound decreases in dopamine release in striatum in detoxified alcoholics: possible orbitofrontal involvement. J. Neurosci. 2007d;27:12700–12706. [PubMed]
83. Volkow ND, Wang GJ, Telang F, et al. Dopamine increases in striatum do not elicit craving in cocaine abusers unless they are coupled with cocaine cues. Neuroimage. 2008;39:1266–1273. [PubMed]
84. Waelti P, Dickinson A, Schultz W. Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001;412:43–48. [PubMed]
85. Wang GJ, Volkow ND, Fowler JS, et al. Dopamine D2 receptor availability in opiate-dependent subjects before and after naloxone-precipitated withdrawal. Neuropsychopharmacology. 1997;16:174–182. [PubMed]
86. Williams JM, Galli A. The dopamine transporter: a vigilant border control for psychostimulant action. Handb. Exp. Pharmacol. 2006:215–232. [PubMed]
87. Williams GV, Rolls ET, Leonard CM, Stern C. Neuronal responses in the ventral striatum of the behaving macaque. Behav. Brain Res. 1993;55:243–252. [PubMed]
88. Wolf ME, Mangiavacchi S, Sun X. Mechanisms by which dopamine receptors may influence synaptic plasticity. Ann. N.Y. Acad. Sci. 2003;1003:241–249. [PubMed]
89. Wong DF, Kuwabara H, Schretlen DJ, et al. Increased occupancy of dopamine receptors in human striatum during cue-elicited cocaine craving. Neuropsychopharmacology. 2006;31:2716–2727. [PubMed]
90. Wu JC, Bell K, Najafi A, et al. Decreasing striatal 6-FDOPA uptake with increasing duration of cocaine withdrawal. Neuropsychopharmacology. 1997;17:402–409. [PubMed]
91. Yang YK, Yao WJ, Yeh TL, et al. Decreased dopamine transporter availability in male smokers—a dual isotope SPECT study. Prog Neuropsychopharmacol Biol. Psychiatry. 2008;32:274–279. [PubMed]
92. Zink CF, Pagnoni G, Martin ME, Dhamala M, Berns GS. Human striatal response to salient nonrewarding stimuli. J. Neurosci. 2003;23:8092–8097. [PubMed]
Despite explicitly wanting to quit, long-term addicts find themselves powerless to resist drugs, even though they know that drug-taking may be a harmful course of action. Such inconsistency between the explicit knowledge of negative consequences and the compulsive behavioral patterns represents a cognitive/behavioral conflict that is a central characteristic of addiction. Neurobiologically, differential cue-induced activity in distinct striatal subregions, as well as the dopamine connectivity spiraling from ventral striatal regions to the dorsal regions, play critical roles in compulsive drug seeking. However, the functional mechanism that integrates these neuropharmacological observations with the above-mentioned cognitive/behavioral conflict is unknown. Here we provide a formal computational explanation for the drug-induced cognitive inconsistency that is apparent in the addicts' “self-described mistake”. We show that addictive drugs gradually produce a motivational bias toward drug-seeking at low-level habitual decision processes, despite the low abstract cognitive valuation of this behavior. This pathology emerges within the hierarchical reinforcement learning framework when chronic exposure to the drug pharmacologically produces pathologically persistent phasic dopamine signals. Thereby the drug hijacks the dopaminergic spirals that cascade the reinforcement signals down the ventro-dorsal cortico-striatal hierarchy. Neurobiologically, our theory accounts for rapid development of drug cue-elicited dopamine efflux in the ventral striatum and a delayed response in the dorsal striatum. Our theory also shows how this response pattern depends critically on the dopamine spiraling circuitry. Behaviorally, our framework explains gradual insensitivity of drug-seeking to drug-associated punishments, the blocking phenomenon for drug outcomes, and the persistent preference for drugs over natural rewards by addicts.
The model suggests testable predictions and beyond that, sets the stage for a view of addiction as a pathology of hierarchical decision-making processes. This view is complementary to the traditional interpretation of addiction as interaction between habitual and goal-directed decision systems.
Citation: Keramati M, Gutkin B (2013) Imbalanced Decision Hierarchy in Addicts Emerging from Drug-Hijacked Dopamine Spiraling Circuit. PLoS ONE 8(4): e61489. doi:10.1371/journal.pone.0061489
Editor: Allan V. Kalueff, Tulane University Medical School, United States of America
Received: January 4, 2013; Accepted: March 10, 2013; Published: April 24, 2013
Copyright: © 2013 Keramati, Gutkin. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study was supported by funding from Frontiers du Vivant, the French MESR, CNRS, INSERM, ANR, ENP and NERF. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
“We admitted we were powerless over our addiction, that our lives had become unmanageable” states the very first tenet of the Narcotics Anonymous 12-step program. This spotlights how powerless addicts find themselves when it comes to resisting drugs, despite knowing that drug-taking is a wrong course of action. In fact, the hallmark of addiction is compulsive seeking of the drugs even at the cost of evident adverse consequences. A signature of such pathological behavior becomes evident in controlled experiments where addicts exhibit a characteristic “self-described mistake”: an inconsistency between the potent behavioral response toward drug-associated choices and the relatively low subjective value that the addict reports for the drug. When combined with the loss of inhibitory cognitive control over behavior, after protracted exposure to drugs, this divergence between the cognitive plans and the consolidated habits may result in a transition from casual to compulsive drug-seeking behavior.
The loss of cognitive control and self-described mistake have so far eluded a principled explanation by formal models of addiction. Previous computational theories of drug addiction, mostly posed within the reinforcement learning framework, view addiction as a pathological state of the habit learning (stimulus-response) system. The central hypothesis behind all those models is that the pharmacological effect of drugs on dopamine signaling, supposedly carrying a stimulus-response teaching signal, results in gradual over-reinforcement of such associations. This effect in turn leads to compulsive drug-seeking habits. While this reduced view of addiction has captured some aspects of the phenomenon, a growing consensus in the addiction literature indicates that multiple learning systems are involved in the pathology. Only such a more complex picture, one that includes the brain's cognitive as well as low-level habitual processes, can explain the variety of addiction-like behaviors.
In this paper, we adopt a hierarchical reinforcement learning approach where decisions are represented at different levels of abstraction, in a cognitive-to-motor hierarchy. We assume that a cascade of dopamine-dependent learning signals links levels of the hierarchy together. We further assume that drugs of abuse pharmacologically hijack the communication mechanism between levels of abstraction. Based on these assumptions, we show that the reported cognitive dissonance in addicts emerges within the hierarchical reinforcement learning framework when chronic drug-exposure disrupts value-learning across the decision hierarchy. This disruption results in a pathological over-valuation of drug choices at low-level habitual processes and hence drives habitual drug-seeking behavior. We then demonstrate that “disliked” but compulsive drug-seeking can be explained as drug-hijacked low-level habitual processes dominating behavior, while healthy cognitive systems at the top representational levels lose control over behavior. Furthermore, we demonstrate that the proposed model can account for recent evidence on rapid vs. delayed development of drug cue-elicited dopamine efflux in the ventral vs. dorsal striatum, respectively, as well as the dependence of this pattern on dopamine spiraling circuitry.
In concordance with a rich cognitive psychology literature, our hierarchical reinforcement learning framework assumes that an abstract cognitive plan like “brewing tea” can be broken into a sequence of lower-level actions: boiling water, putting tea in the pot, etc. Such decomposition proceeds down to concrete motor-level responses at the lowest level of the hierarchy (Figure 1A). Neurobiologically, the different levels of decision hierarchy from cognitive to motor levels are represented along the rostro-caudal axis of the cortico-basal ganglia (BG) circuit. This circuit is composed of several parallel closed loops between the frontal cortex and the basal ganglia (Figure 1B). Whereas the anterior loops underlie more abstract representation of actions, the caudal loops, consisting of sensory-motor cortex and dorsolateral striatum, encode low-level habits.
Figure 1. Hierarchical organization of behavior and the cortico-BG circuit.
A, An example of a decision hierarchy for two alternative choices: drug vs. food. Each course of action is represented at different levels of abstraction, supposedly encoded at different cortico-BG loops. Seeking each of the two types of reward might be followed by a punishment of magnitude 16. B, Glutamatergic connections from different prefrontal areas project to striatal subregions and then project back to the PFC through the pallidum and thalamus, forming several parallel loops. Through the striato-nigro-striatal dopamine network, the ventral regions of the striatum influence the more dorsal regions. vmPFC, ventral medial prefrontal cortex; OFC, orbital frontal cortex; dACC, dorsal anterior cingulate cortex; SMC, sensory-motor cortex; VTA, ventral tegmental area; SNc, substantia nigra pars compacta. Figure 1B modified from ref 21.
Within this circuitry, the phasic activity of midbrain dopamine (DA) neurons projecting to the striatum signals the error between predicted and received rewards, thereby carrying stimulus-response reinforcing information. These DAergic projections form a cascading serial connectivity linking the more ventral regions of the striatum to progressively more dorsal regions through the so-called “spiraling” connections (Figure 1B). Functionally, such feed-forward organization connecting the rostral to caudal cortico-BG loops allows directed coupling from coarse to fine representations. Accordingly, the DA spirals are hypothesized to provide a neurobiological substrate for the progressive tuning of the reward prediction error by the higher levels of the hierarchy (encoding the abstract knowledge about the value of behavioral options). This error is then utilized for updating action-values at more detailed levels. In other words, the DA spirals allow for the abstract cognitive levels of valuation to guide the learning in the more detailed action-valuation processes.
In terms of the computational theory of reinforcement learning (RL), the agent (in our case a person or an animal) learns to make informed action-choices by updating its prior estimated value, $Q(s_t, a_t)$, for each state-action pair, $(s_t, a_t)$, when a reward $r_t$ is received by the agent at time $t$ as a result of performing an action $a_t$ in the contextual state (stimulus) $s_t$. The value $Q(s_t, a_t)$ is updated by computing the reward prediction error signal. This signal not only depends on the instantaneously received reward ($r_t$), but also on the value of the new state the agent ends up in, after that action has been performed. Denoted by $V(s_{t+1})$, this temporally-advanced value-function represents the sum of future rewards the animal expects to receive from the resultant state, $s_{t+1}$, onward. The prediction error can be computed by the following equation:
$$\delta_t = r_t + V(s_{t+1}) - Q(s_t, a_t) \qquad (1)$$
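As a minimal sketch of this prediction error, assuming the undiscounted form "reward plus value of the next state minus the prior estimate", the three classic cases (unexpected reward, fully predicted reward, omitted reward) can be computed directly. The function name and the numeric values are illustrative only and do not come from the paper.

```python
def prediction_error(reward, next_state_value, action_value):
    """Temporal-difference error: delta = r + V(s') - Q(s, a)."""
    return reward + next_state_value - action_value


# Illustrative cases with a terminal next state (V(s') = 0).
unexpected = prediction_error(reward=1.0, next_state_value=0.0, action_value=0.0)
predicted = prediction_error(reward=1.0, next_state_value=0.0, action_value=1.0)
omitted = prediction_error(reward=0.0, next_state_value=0.0, action_value=1.0)

print(unexpected, predicted, omitted)
```

A positive error corresponds to an outcome better than expected, zero to an outcome exactly as expected, and a negative error to an outcome worse than expected.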
Intuitively, the prediction error signal computes the discrepancy between the expected and the realized rewarding value of an action. In a hierarchical decision structure, however, rather than learning the $Q$-values independently at different levels, more abstract levels can tune the teaching signal computed at lower levels. Since higher levels of the hierarchy represent a more abstract representation of environmental contingencies, learning occurs faster in those levels. This is due to the relative low-dimensionality of the abstract representation of behavior: an action plan can be represented as a single step (one dimension) at the top level of the hierarchy and as multiple detailed actions (multiple dimensions) at the lower levels of the hierarchy. The top-level value of this action-plan would be learned quickly as compared to the detailed levels, where the reward errors would need to back-propagate through all the detailed action-steps. Thus, tuning the lower-level values by the value information from the higher levels can speed up the convergence of these values. One statistically efficient way of doing so is to suppose that for computing the prediction error signal at the $n$-th level of abstraction, $\delta^n_t$, the temporally-advanced value function, $V^{n+1}(s_{t+1})$, comes from one higher level of abstraction, $n+1$:
$$\delta^n_t = r^n_t + V^{n+1}(s_{t+1}) - Q^n(s_t, a^n_t) \qquad (2)$$
To preserve optimality, equation 2 can be used for computing the prediction error only when the last constituent primitive action of an abstract option is performed (see Figure S1 in File S1). In other cases, value-learning at different levels occurs independently, as in equation 1. In both cases, the teaching signal is then used for updating the prior values at the corresponding level:
$$Q^n(s_t, a^n_t) \leftarrow Q^n(s_t, a^n_t) + \alpha\,\delta^n_t \qquad (3)$$
where $\alpha$ is the learning rate. This form of inter-level information-sharing is biologically plausible since it reflects the spiraling structure of the DA circuitry, carrying the information down the hierarchy in the ventro-dorsal direction. At the same time, being guided by more abstract levels significantly accelerates learning, alleviating the high-dimensionality of value learning at detailed levels.
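The interplay between the hierarchical teaching signal and the value update can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the lower level borrows the value of the next state from the level above, and its estimate is nudged by a fraction (the learning rate) of the resulting error. The reward of 2.0, the borrowed value of 8.0 and the learning rate of 0.1 are arbitrary illustrative numbers.

```python
def hierarchical_error(reward, higher_level_next_value, own_value):
    """Error at one level, with the next-state value supplied by the level above."""
    return reward + higher_level_next_value - own_value


def update(value, delta, alpha):
    """Move the prior estimate a fraction alpha of the way along the error."""
    return value + alpha * delta


q_low = 0.0   # prior estimate at the detailed level
alpha = 0.1   # learning rate (illustrative)
for _ in range(200):
    delta = hierarchical_error(reward=2.0, higher_level_next_value=8.0, own_value=q_low)
    q_low = update(q_low, delta, alpha)

print(round(q_low, 6))
```

With a fixed reward and a fixed higher-level value, the lower-level estimate settles at their sum, which is the point where the error vanishes.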
In this paper we show that the interaction between a modified version of the hierarchical model outlined above and the specific pharmacological effects of drugs of abuse on the dopaminergic system can capture addiction-related data at radically different scales of analysis: behavioral and circuit-level neurobiological. First, the new model brings about a possible cogent explanation for several intriguing behavioral aspects associated with addiction to drugs (e.g. the self-described mistake). Second, we can account for a wide range of evidence regarding the dynamics of the drug-evoked dopamine release.
We modify that model as follows. We make the model more efficient in terms of working memory capacity by replacing $V^{n+1}(s_{t+1})$ with $Q^{n+1}(s_t, a^{n+1}_t) - r^{n+1}_t$ in equation 2, since the two values converge to the same steady level (see Figure S2 in File S1, for computational and neurobiological basis):
$$\delta^n_t = r^n_t + Q^{n+1}(s_t, a^{n+1}_t) - r^{n+1}_t - Q^n(s_t, a^n_t) \qquad (4)$$
Here, $a^{n+1}_t$ is the relatively abstract option and $a^n_t$ is the last primitive action in the behavioral sequence that fulfills this option. Similarly, $r^{n+1}_t$ is the rewarding value of $a^{n+1}_t$, which includes $r^n_t$ (the rewarding value of $a^n_t$).
Crucially, the various drugs abused by humans share a fundamental property of pharmacologically increasing dopamine concentration within the striatum. Accordingly, we incorporate this pharmacological effect of the drug by adding a positive bias, $D$, to the prediction error signal carried by dopamine neurons (see Figure S3 in File S1, for computational and neurobiological basis):
$$\delta^n_t = r^n_t + Q^{n+1}(s_t, a^{n+1}_t) - r^{n+1}_t - Q^n(s_t, a^n_t) + D \qquad (5)$$
Here $D$ captures the direct pharmacological effect of the drug on the DA system, and $r^n_t$ is its reinforcing value due to the euphorigenic effects (see File S1 for supplementary information).
While equations 3 and 5 together define the computational mechanism to update the values in our model, we also hypothesize that an uncertainty-based competition mechanism determines the level of abstraction that controls behavior. This is inspired by a mechanism previously proposed for arbitration between the habitual and goal-directed systems. In this respect, at each decision point, only the level of abstraction with the highest certainty in estimating the value of choices controls behavior. Once this level has made the decision to act, all the lower levels of the hierarchy will be deployed by this dominant level to implement the selected action as a sequence of primitive motor responses (see File S1 for supplementary information; Figure S4 in File S1; Figure S5 in File S1). Upon receiving the reward feedback from the environment, the values at all the levels are updated. This uncertainty-based arbitration mechanism predicts that as abstract processes are more flexible, they have superior value-approximation capability during the early stages of learning and thus control behavior at these stages. However, since the abstract levels use a coarse representation of the environment (e.g. due to containing a relatively small number of basis functions), their ultimate value approximation capability is not as precise as that of detailed levels. In other words, after extensive training the uncertainty associated with the estimated values is lower for the lower levels of the hierarchy than for the upper levels. Thus, with progressive learning, the lower levels of the hierarchy take over the control of action selection, as their uncertainty decreases gradually. This is in agreement with several lines of evidence showing a progressive dominance of the dorsal over the ventral striatum in the control over drug-seeking (as well as seeking natural rewards).
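The arbitration rule described here can be illustrated with a toy sketch. The exponential uncertainty curve, the floors and the decay rates below are hypothetical choices made only for illustration (the paper's actual uncertainty computation is in its supplementary File S1 and is not reproduced here). The abstract level is assumed to reduce its uncertainty quickly but to plateau at a higher floor; the detailed level is assumed to learn slowly but to end up more precise.

```python
import math

# Hypothetical parameters: fast but coarse abstract level, slow but precise detailed level.
LEVELS = {
    "abstract": {"floor": 0.30, "decay_rate": 0.20},
    "detailed": {"floor": 0.05, "decay_rate": 0.02},
}


def uncertainty(trial, floor, decay_rate, initial=1.0):
    """Assumed form: uncertainty decays exponentially toward a level-specific floor."""
    return floor + (initial - floor) * math.exp(-decay_rate * trial)


def controlling_level(trial):
    """The level with the lowest uncertainty (highest certainty) controls behavior."""
    return min(LEVELS, key=lambda name: uncertainty(trial, **LEVELS[name]))


early = controlling_level(5)
late = controlling_level(300)
# First trial at which control passes to the detailed level.
switch_trial = next(t for t in range(1000) if controlling_level(t) == "detailed")

print(early, late, switch_trial)
```

Under these assumptions, control starts with the abstract level and passes to the detailed level once the slower learner's uncertainty drops below the abstract level's floor.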
In contrast to the previous reinforcement learning-based computational models of addiction, which are based on a single-decision-system approach, our account is built upon a multiple-interacting-systems framework. As a result, although the form of modeling the drug's effect on the prediction error signal in our model is similar to the previous ones, it results in fundamentally different consequences. The drug-induced transient dopamine increase boosts the immediate prediction error at each level of the hierarchy and as a result, entrains a bias, $D$, on the transfer of knowledge from one level of abstraction to the next, along the coarse-to-fine direction of the hierarchy. This bias causes the asymptotic value of drug-seeking at a given level to be $D$ units higher than that of one more abstract layer (Figure 2B). The accumulation of these discrepancies along the rostro-caudal axis progressively induces significant differences in the value of drug-seeking behaviors between the top and bottom extremes of the hierarchy. Thus, even when followed by a strong punishment, the value of drug-associated behavior remains positive at the low-level motor loops, while it becomes negative at cognitive levels. In other words, the model predicts that accumulation of drug effect over DA spirals increases drug-seeking value at motor-level habits to such high amplitude that even a strong natural punishment will not be able to decrease it sufficiently. We suggest that this explains the inconsistency between cognitive and low-level evaluation of drug-related behaviors in addicts. In other words, we propose that compulsive drug seeking and the significantly reduced elasticity to associated costs stem from the pharmacological effect of the drug hijacking the dopamine-dependent mechanism that transfers the information among the levels of decision hierarchy.
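The accumulation argument can be written out as a short worked derivation. It assumes that the biased prediction error at level $n$ has the form $r^n + Q^{n+1} - r^{n+1} - Q^n + D$, that both levels register the same reward when the last primitive action of an option is performed ($r^n = r^{n+1}$), and that learning has converged so the error is zero. Numbering the levels from $n = 1$ (most detailed) to $n = N$ (most abstract) is a convention chosen here for illustration.

```latex
\begin{aligned}
0 = \delta^{n} &= r^{n} + Q^{n+1}_{\infty} - r^{n+1} - Q^{n}_{\infty} + D \\
\Rightarrow\quad Q^{n}_{\infty} &= Q^{n+1}_{\infty} + D
  \qquad \text{(using } r^{n} = r^{n+1}\text{)} \\
\Rightarrow\quad Q^{1}_{\infty} &= Q^{N}_{\infty} + (N-1)\,D
\end{aligned}
```

Under these assumptions, a punishment that drives the most abstract value $Q^{N}_{\infty}$ below zero leaves the most detailed value $Q^{1}_{\infty}$ positive whenever $(N-1)\,D$ exceeds the magnitude of that negative value. With $D = 0$ the chain collapses and all levels share the same asymptotic value.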
Figure 2. Motivation for food vs. drug at different levels of abstraction (simulation results).
In the first 150 trials, where no punishment follows the reward, the value of seeking natural rewards at all levels converges to 10 (A). For the case of drug, however, the direct pharmacological effect of the drug ($D$, set to a fixed positive value) results in the asymptotic value at each level being $D$ units higher than that of one higher level of abstraction (B). Thus, when followed by punishment, whereas cognitive loops correctly assign a negative value to the drug-seeking choice, motor-level loops find drug-seeking desirable (positive value). The curves in this figure show the evolution of values in “one” simulated animal and thus, no statistical analysis was applicable.
While drugs, in our model, result in imbalanced valuation across levels, the value of natural rewards converges to the same value across all levels, due to lack of a direct pharmacological effect on the DA signaling mechanism ($D = 0$). Consequently, neither inconsistency nor overvaluation at detailed levels will be observed for the case of natural rewards (Figure 2A). Overvaluation of drug-seeking responses at lower levels of the hierarchy should result in abnormal preference of drugs over natural rewards, and over-engagement in drug-related activities.
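The contrast between drug and natural reward can be reproduced with a small simulation sketch. It is a simplification under explicit assumptions, not the authors' implementation: four levels, a single-step task in which every level registers the same outcome, an additive bias applied at every level including the top one, and hypothetical learning rates that shrink toward the detailed end. The reward of 10 and the punishment of 16 follow the figure captions; the bias value of 3.0 is illustrative.

```python
def asymptotic_values(outcome, drug_bias, alphas=(0.2, 0.1, 0.05, 0.02), n_trials=2000):
    """Values per level after long training; index 0 is the most abstract level."""
    q = [0.0] * len(alphas)
    for _ in range(n_trials):
        prev = list(q)  # every level learns from the pre-update values
        # Top level: ordinary error with a terminal next state, plus the bias.
        q[0] = prev[0] + alphas[0] * (outcome + drug_bias - prev[0])
        # Lower levels: the two reward terms cancel, leaving the value
        # handed down from the level above, minus the own value, plus the bias.
        for n in range(1, len(q)):
            q[n] = prev[n] + alphas[n] * (prev[n - 1] - prev[n] + drug_bias)
    return [round(v, 3) for v in q]


REWARD, PUNISHMENT = 10.0, 16.0   # magnitudes taken from the figure captions
D = 3.0                           # illustrative bias, not the paper's value

food_punished = asymptotic_values(REWARD - PUNISHMENT, drug_bias=0.0)
drug_punished = asymptotic_values(REWARD - PUNISHMENT, drug_bias=D)

print("food:", food_punished)
print("drug:", drug_punished)
```

In this sketch the punished natural reward ends up equally negative at every level, whereas the punished drug is negative at the most abstract level and positive at the most detailed one, each level sitting one bias unit above its more abstract neighbor.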
Neurobiologically, the differential roles of the striatal subregions in the acquisition and expression of drug-seeking behavior have taken center stage in addiction research. Converging evidence from different lines of research suggests that the behavioral transition from recreational to compulsive drug use reflects a neurobiological shift of valuation from the ventral to the dorsolateral striatum, corresponding to a shift from cognitive to detailed levels in our model. Consistent with our model, the DA spiraling network connecting the ventral to progressively more dorsal regions of the striatum has been shown to play a pivotal role in this transition.
In a key recent study, Willuhn et al. assessed the pattern of dopamine release in response to drug-associated cues in the ventral and dorsolateral striatum of rats during three weeks of experiencing cocaine. Using fast-scan cyclic voltammetry, they made the critical observation that cue-induced DA efflux in the ventral striatum emerges even after very limited training. In contrast, the dorsolateral striatum showed cue-triggered DA efflux only after extensive training, and the development of this release pattern disappeared when the ventral striatum was lesioned in the ipsilateral hemisphere.
Since the temporal resolution of fast-scan voltammetry captures subsecond fluctuations in concentration, the observed pattern of DA efflux should be attributed to “phasic” DA signaling and thus, to the prediction error signal, according to the RL theory of dopamine. According to RL theory, the prediction error signal upon observing an unexpected stimulus is equal to the rewarding value that the stimulus predicts. Therefore, cue-induced DA release is equivalent to the value predicted by that cue.
In this respect, our hierarchical framework provides a formal explanation for the differential pattern of ventral versus dorsal striatal DA efflux reported by Willuhn et al. The value predicted by the drug-associated cue at the abstract cognitive levels of the hierarchy increases rapidly at the very early stages of training (Figure 2B), due to low-dimensionality of the learning problem at high levels of abstraction. As a result, our model shows that the cue-induced DA efflux should be observed in the ventral striatum even after limited training (Figure 3). At the more detailed levels of representation, however, the learning process is slow (Figure 2B), due to high-dimensionality of the problem space, as well as dependency of learning on more abstract levels through DA spirals. Consequently, cue-induced DA efflux in the dorsolateral striatum should develop gradually and become observable only after extensive training (Figure 3).
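The rapid-versus-delayed time course can be sketched with a two-level chain, treating the cue-predicted value as a stand-in for cue-evoked dopamine efflux. This is an illustration under assumptions: the "ventral" level learns directly from the outcome with a high learning rate, the "dorsal" level learns from the value handed down by the ventral level with a low learning rate, and the reward, bias and learning rates are hypothetical numbers.

```python
def cue_value_trace(n_trials, reward=10.0, drug_bias=3.0, alphas=(0.3, 0.03)):
    """Per-trial (ventral, dorsal) cue-predicted values for a two-level chain."""
    ventral, dorsal = 0.0, 0.0
    trace = []
    for _ in range(n_trials):
        # Both updates use the values from before this trial.
        new_ventral = ventral + alphas[0] * (reward + drug_bias - ventral)
        new_dorsal = dorsal + alphas[1] * (ventral - dorsal + drug_bias)
        ventral, dorsal = new_ventral, new_dorsal
        trace.append((ventral, dorsal))
    return trace


trace = cue_value_trace(300)
ventral_limited, dorsal_limited = trace[9]        # after 10 trials
ventral_extended, dorsal_extended = trace[-1]     # after 300 trials

# Asymptotes under these assumptions: reward + bias, and one bias unit above that.
VENTRAL_MAX, DORSAL_MAX = 13.0, 16.0
print(ventral_limited / VENTRAL_MAX, dorsal_limited / DORSAL_MAX)
print(ventral_extended / VENTRAL_MAX, dorsal_extended / DORSAL_MAX)
```

After limited training the ventral value is already close to its asymptote while the dorsal value has covered only a small fraction of its range; after extended training both are near their asymptotes.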
Figure 3. Dopamine efflux at different striatal subregions in response to drug-associated cues (simulation results).
In line with experimental data, the model shows (left column) that in response to drug-associated cues, there will be dopamine efflux in the ventral striatum, after limited and extensive training. In more dorsolateral subregions, however, cue-elicited DA efflux will develop gradually during the course of learning. The model predicts (second column from right) that this delayed development of cue-elicited DA efflux in dorsal striatum depends on the DA-dependent serial connectivity that links the ventral to the dorsal striatum. That is, as a result of disconnecting the DA spirals, whereas cue-elicited DA response remains intact in the ventral striatum, it significantly decreases in the dorsolateral striatum. Moreover, the model predicts (third column from right) similar results for cue-induced DA efflux in dorsolateral striatum for the case of lesioned ventral striatum. Finally, if after extensive drug-cue pairing in intact animals, a punishment follows drug, the model predicts (right column) that drug-related cue results in inhibition of the ventral leg of DA spirals, even after limited training. In more dorsal regions, however, DA efflux decreases slowly during learning, but will remain positive, even after extensive drug-punishment pairing. The data presented in this figure are obtained from “one” simulated animal and thus, no statistical analysis was applicable.
Furthermore, our model explains the evidence from the same study that such delayed development of cue-elicited DA efflux in the dorsolateral striatum depends on the ventral striatum (Figure 3). In our model, a simulated unilateral lesion of the ventral striatum (the abstract valuation level in the model) significantly decreases the drug cue-predicted value at detailed levels in the ipsilateral hemisphere and thus, significantly decreases the level of cue-induced DA efflux. In order to model lesion of the ventral striatum, we simply fix the value of all stimuli at the highest level of the hierarchy to zero.
Similarly, our model predicts that the development of phasic DA signaling in the dorsolateral striatum depends on the integrity of the DA spiraling circuit (Figure 3). In fact, a disconnection in the DA spiraling circuit in our model cuts the communication across levels of abstraction, which in turn, prevents accumulation of the drug-induced bias on the reinforcement signal, along the levels of decision hierarchy. To model the disconnection in the DA-dependent serial circuitry of ventral to dorsal striatum, we clamp each level of abstraction to compute the prediction error signal locally (as in equation 3), without receiving the value of the temporally advanced state from the immediately higher level of abstraction.
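Both manipulations can be sketched in the same toy chain. The code below is a simplified illustration, not the authors' simulation: a lesion is modeled by pinning the most abstract value to zero, and a disconnection by letting every level compute its error locally from the outcome instead of from the value handed down by the level above. The four levels, the single-step task, the learning rates and the bias of 3.0 are assumptions made for illustration.

```python
def dorsal_cue_value(condition, reward=10.0, drug_bias=3.0,
                     alphas=(0.2, 0.1, 0.05, 0.02), n_trials=2000):
    """Asymptotic cue-predicted value at the most detailed (dorsolateral) level."""
    if condition not in ("intact", "lesion", "disconnected"):
        raise ValueError(f"unknown condition: {condition}")
    q = [0.0] * len(alphas)  # index 0 = ventral / most abstract level
    for _ in range(n_trials):
        prev = list(q)
        if condition == "lesion":
            q[0] = 0.0  # ventral value fixed to zero
        else:
            q[0] = prev[0] + alphas[0] * (reward + drug_bias - prev[0])
        for n in range(1, len(q)):
            if condition == "disconnected":
                # Local error: no value arrives from the level above.
                delta = reward + drug_bias - prev[n]
            else:
                delta = prev[n - 1] - prev[n] + drug_bias
            q[n] = prev[n] + alphas[n] * delta
    return round(q[-1], 3)


intact = dorsal_cue_value("intact")
lesioned = dorsal_cue_value("lesion")
disconnected = dorsal_cue_value("disconnected")
print(intact, lesioned, disconnected)
```

In this sketch the dorsolateral value is largest in the intact chain, where the bias accumulates level by level, and is clearly reduced under either manipulation, which is the qualitative pattern the text describes.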
Furthermore, the model predicts that the pattern of cue-elicited DA efflux will change if, after extensive training with cocaine and cocaine-associated cues as in the above experiment, one starts to pair the cocaine delivery with a strong punishment. We predict that the DA efflux in response to the cocaine-associated cue should rapidly decrease below baseline in the ventral striatum. In the dorsolateral striatum, however, cue-induced DA release should stay above baseline (Figure 3) with a possible delayed partial decrease. This indicates assigning positive subjective value to the drug stimulus at detailed levels, despite negative (below baseline) values at cognitive levels. It is noteworthy that this prediction depends on the assumption that punishment is treated by the brain simply as a negative reward. This assumption is somewhat controversial: it is supported by some experimental studies, yet disputed by others. Except for this prediction, other aspects of the model do not depend on whether punishment is encoded by dopamine or by another signaling system.
The training regimen used by Willuhn et al. is not sufficiently extended to produce compulsive drug-seeking behavior, characterized by insensitivity to drug-associated punishments. Thus, a key question to be answered is how the delayed development of the cue-induced DA response in the DLS relates to the late development of compulsive responding. According to our model, compulsive behavior requires not only the excessive valuation of drug choice at low levels of the hierarchy, but also the transfer of control over behavior from the abstract cognitive to the low-level habitual processes. The time scales of these two processes are only partly dependent on each other: the over-valuation process depends on the prediction error signal, while the transfer of behavioral control also depends on the relative uncertainties in value-estimation. Hence, the over-valuation of drug-associated cues at low levels of the hierarchy can precede the shift of control over behavior from the top to the bottom of the hierarchy. The exact time scales of the two processes depend on the learning rate and the noise inherent at the different levels, respectively (see File S1 for supplementary information). In other words, it is likely that the cue-induced dopamine efflux in the DLS may develop significantly before the compulsive drug-seeking is behaviorally manifested.
Behaviorally, in our model, if punishment is paired with drug at the early stages of voluntary drug use, the abstract value of drug-seeking response becomes negative rapidly. Assuming that drug-seeking is controlled by abstract levels during these early stages, negative abstract evaluation of drug choice makes the subject unwilling to experience that course of action any longer. This will prevent consolidation of strong low-level preference toward drugs over time. Thus, the model explains elasticity of drug choices to costs during the early stages of drug consumption, but not after chronic use. Consistently, animal models of addiction show that insensitivity of drug-seeking responses to harmful consequences associated with drug develops only after prolonged drug self-administration, but not limited drug use. In contrast to our theory, earlier computational models of addiction are in direct contradiction with this body of evidence, since they predict that adverse behavioral outcomes that immediately follow drug use have no motivational effect even at the very early stages of experiencing drugs (see File S1 for supplementary information).
Our model further accounts for the occurrence of the blocking effect for drug outcomes. Blocking is a conditioning phenomenon where prior pairing of a stimulus A with an outcome blocks formation of association between a different stimulus B and that outcome in a subsequent training phase, where both A and B are presented before the delivery of the outcome. Results of simulating our model in a Pavlovian experimental design (see File S1 for supplementary information on the Pavlovian version of the model) show that for both natural rewards and drugs, when the estimated value at a certain level of the hierarchy reaches its steady state (rather than growing unboundedly), no further learning occurs at that level, since the prediction error signal has decreased to zero (Figure 4). Thus, associating a new stimulus with the already-predicted reward will be blocked. Behavioral evidence showing a blocking effect associated with both drug and natural reinforcers has been used as a major argument to criticize the previously proposed dopamine-based computational model of addiction. Here we showed that focusing on the hierarchical nature of representations and dorsal-ventral spiraling dopamine loop organization can in fact account for the blocking data, thereby circumventing this criticism (see File S1 for supplementary information).
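The logic of blocking under a bounded, additively biased error can be sketched at a single level. This is a Rescorla-Wagner-style simplification chosen for illustration and is not the Pavlovian version of the model from File S1: stimuli A and B contribute additively to the prediction, the error is the outcome plus the bias minus the summed prediction, and the learning rate, trial counts and bias value are hypothetical.

```python
def blocking_experiment(phase1_trials, phase2_trials=200, reward=10.0,
                        drug_bias=0.0, alpha=0.1):
    """Return the learned values (v_a, v_b) after A-alone, then A+B training."""
    v_a, v_b = 0.0, 0.0
    for _ in range(phase1_trials):           # phase 1: stimulus A alone
        delta = reward + drug_bias - v_a
        v_a += alpha * delta
    for _ in range(phase2_trials):           # phase 2: A and B presented together
        delta = reward + drug_bias - (v_a + v_b)
        v_a += alpha * delta
        v_b += alpha * delta
    return v_a, v_b


# Extensive initial training: A fully predicts the outcome, so B learns nothing.
_, b_food_extensive = blocking_experiment(phase1_trials=500, drug_bias=0.0)
_, b_drug_extensive = blocking_experiment(phase1_trials=500, drug_bias=3.0)
# Moderate initial training: an error remains, so B still acquires value.
_, b_drug_moderate = blocking_experiment(phase1_trials=5, drug_bias=3.0)

print(b_food_extensive, b_drug_extensive, b_drug_moderate)
```

Because the biased error still converges to zero once the prediction reaches the outcome plus the bias, the second stimulus is blocked for drug and food alike after extensive training, and blocking is incomplete when the first phase is short.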
Figure 4. Blocking effect for natural vs. drug rewards.
The model predicts that blocking occurs for natural rewards (A) and drugs (B), only if the initial training period is “extensive”, such that the first stimulus fully predicts the value of the outcome. After “moderate” training, cognitive levels that are more flexible fully predict the values and thus, block further learning. However, learning is still active in low-level processes when the second training phase (simultaneous presentation of both stimuli) starts. Thus, our model predicts that moderate initial training in a blocking experiment with natural rewards will also result in cognitive/behavioral inconsistency. The data presented in this figure are obtained from “one” simulated animal and thus, no statistical analysis was applicable.
As mentioned before, several lines of evidence show a progressive dominance of the dorsal over the ventral striatum in the control over behavior during the course of learning. Interpreted against the background of this evidence, the imbalanced drug-seeking valuation across the hierarchy also explains addicts' unsuccessful efforts to cut down drug use after prolonged experience with the drug, when control over drug-related choices has shifted from cognitive to low-level habitual processes. This supremacy of drug-dominated processes naturally leads to behavioral inelasticity to drug-associated costs (compulsive drug-seeking), likely accompanied by self-described mistake. For the case of natural rewards, however, our model predicts that even though behavioral inelasticity increases over the course of learning, no valuation inconsistency develops across the levels of the hierarchy, and so punishments associated with the reward will eventually inhibit reward-seeking.
Our model focuses on the evaluation of actions in a “presumably given” decision hierarchy, and leaves aside how the abstract options and their corresponding low-level subroutines are initially discovered during development. Discovering the decision hierarchy is proposed to be a bottom-up process, accomplished by chunking together sequences of low-level actions and constructing more abstract options. This process, supposedly undergoing a shift from the dorsal to the ventral striatum, runs in the opposite direction to the competition mechanism proposed here for taking control over behavior.
The growing body of evidence on the differential role of different striatal subregions in addiction is usually interpreted in the framework of the habitual vs. goal-directed dichotomy. The hierarchical decision-making approach we use here is complementary to such dual-system accounts. Whereas the dual-process approach deals with different algorithms (model-free vs. model-based) for solving a single problem, the hierarchical RL framework focuses on different representations of the same problem at different levels of temporal abstraction. In theory, either a habitual or a goal-directed algorithm can solve each of these different representations of the problem. In our model, the accumulation of drug-induced biases over DA spirals occurs in a setting where the value-estimation algorithm is model-free (habit learning). However, this does not rule out the existence of model-based systems working at the top levels of the hierarchy. One can simply incorporate the PFC-dependent goal-directed valuation and decision system into the model by assuming that actions at the highest levels of abstraction are evaluated by a goal-directed system. While such a complication does not change the nature of the results presented in this manuscript, the additional flexibility it offers in explaining other aspects of addiction is left to future studies. In fact, in our model, irrespective of whether a goal-directed system exists or not, the discrepancy in the asymptotic value of drug-seeking between the two extremes of the hierarchy grows with the number of decision levels governed by the “habitual” process.
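The last claim, that the value discrepancy between the two extremes of the hierarchy grows with the number of habitual levels, can be illustrated with a toy cascade. This is a deliberately simplified sketch and not the model's actual update rule (equation S9 in File S1): it assumes a one-step task, assumes that each level takes the value held one level above as its learning target, and assumes that the drug adds a constant bias to every prediction error. The names `asymptotic_values` and `drug_bias` and all numerical settings are hypothetical.

```python
def asymptotic_values(n_levels, reward=1.0, drug_bias=0.0, alpha=0.1, n_trials=2000):
    """Learn the value of one action at every level; index 0 is the most abstract level.

    Assumptions (illustrative only): one-step task, model-free updates, and a
    constant additive drug bias on each level's prediction error.
    """
    q = [0.0] * n_levels
    for _ in range(n_trials):
        for level in range(n_levels):
            # The top level learns from the reward itself; every lower level
            # uses the value held one level above as its learning target.
            target = reward if level == 0 else q[level - 1]
            delta = target - q[level] + drug_bias  # biased prediction error
            q[level] += alpha * delta
    return q


natural = asymptotic_values(n_levels=5)               # no bias: natural reward
drug = asymptotic_values(n_levels=5, drug_bias=0.2)   # biased: drug reward

print("natural reward:", [round(v, 3) for v in natural])
print("drug reward:   ", [round(v, 3) for v in drug])
print("top-to-bottom gap for drug:", round(drug[-1] - drug[0], 3))
```

Under these assumptions every level settles at a bounded value, all levels agree for the natural reward, and for the drug each additional level adds one more copy of the bias, so the gap between the most abstract and the most detailed level scales with the depth of the hierarchy.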
In the light of our theory, relapse can be viewed as the revival of dormant motor-level maladaptive habits after a period of dominance of cognitive levels. In fact, one can imagine that as a result of cognitive therapy (in human addicts) or forced extinction (in animal models of abstinence), the high value of drug-seeking at the detailed level of the hierarchy is not extinguished, but becomes dormant due to a shift of control back to cognitive levels. Since drug-related behavior is sensitive to adverse consequences at abstract levels, drug-seeking can be avoided as long as high-level cognitive processes dominate control of behavior. One can even speculate that the popular 12-step programs (e.g. Alcoholics Anonymous, Narcotics Anonymous, etc.) work in part by explicitly requiring the participants to admit the inconsistency of their drug-related lifestyle, thereby empowering the abstract cognitive levels to exert explicit control over their behavior. Stressful conditions or re-exposure to the drug (priming) can be thought of as risk factors that weaken the dominance of abstract levels over behavior, which can result in the re-emergence of drug-seeking responses (due to the latent high non-cognitive values).
In summary, we propose a coherent account of several apparently disparate phenomena characteristic of drug addiction. Our model provides a normative account of data on the differential roles of the ventral vs. dorsal striatal circuits in drug-seeking acquisition and habit performance, as well as the selective role of feed-forward DA connectivity for effects of drug versus natural reinforcers. Most importantly, we show how the drug-induced pathology in ventral-to-dorsal DA signals, which trickle motivational information down the cognitive representation hierarchy, could lead to discordance between addicts' abstract attitudes toward drug-seeking and what they actually do. Obviously, our model does not, and is not meant to, give a complete account of drug addiction. Explaining other unexplained aspects of addiction requires incorporating many other brain systems that are demonstrated to be affected by drugs of abuse. How to incorporate such systems within the formal computational network remains a topic for further investigation.
Figure S1, A sample decision hierarchy with five levels of abstraction.
Figure S2, The corresponding neural circuit for the three discussed value learning algorithms in a hierarchical decision structure. A, Using a simple TD-learning algorithm (equation S7), the prediction error signal in each level of abstraction is computed independently from other levels. B, In the model proposed by Haruno and Kawato (4) (equation S8), the value of the temporally-advanced state comes from one higher level of abstraction. C, In our model (equation S9), the value of the temporally-advanced state is substituted with a combination of the reward and Q-value of the performed action at a higher level of abstraction.
Figure S3, Our model predicts different sites of action of drugs on the reward-learning circuit: sites 1 to 3. Drugs affecting sites 4 to 6, in contrast, will not result in the behavioral and neurobiological patterns produced by simulation of the model for drugs, but will produce results similar to the case of natural rewards.
Figure S4, The task used for simulating the uncertainty-based competition mechanism among the levels of the hierarchy for taking control over behavior.
Figure S5, Simulation result, showing gradual shift of control over behavior from higher to lower levels of the hierarchy. Q(s,a) and U(s,a) show the estimated value and uncertainty of the state-action pairs, respectively.
Conceived and designed the experiments: MK. Performed the experiments: MK. Analyzed the data: MK BG. Contributed reagents/materials/analysis tools: MK. Wrote the paper: MK BG.