Monday, 1 June 2015

Research Briefing: reward-guided working memory

George Wallis

In almost any situation, there are hundreds of things (or ‘stimuli’) that could attract our attention – just count the number of objects you can see from where you are now. In order to get on with life and avoid total mental chaos, we have to be extremely selective about what we process – most stimuli are essentially ignored. It is a long-established finding that the number of things we can hold in mind (or hold in ‘working memory’) is really very small – about 2-8, depending on the experiment we run. Clearly we possess powerful mechanisms that let us filter in only certain stimuli. How we select what gets into working memory is a much-studied topic in psychology, and psychologists often run experiments in which they present ‘cues’ (e.g. arrows) that tell people which items in the experiment to ‘select’. This is the experimental equivalent of pointing out something with your finger.
Harry Styles of One Direction giving an attentional cue to the crowd

However, most of the time, this isn’t how we select what gets into working memory: in the real world people aren’t on hand to continuously tell us what to pay attention to. One real-world factor psychologists think may be important in determining whether an item gets into memory is its ‘reward value’. For example, a twenty-pound note is more likely to grab our attention than a piece of scrap paper, even if they are about the same size and appearance. Our paper, recently published in Visual Cognition (Wallis, Stokes, Arnold, & Nobre, 2015), describes the results of two experiments in which we looked at how reward value affects the likelihood that a stimulus will get into working memory.

Anderson and colleagues performed experiments showing that items displayed in colours that the experimenters had previously associated with a high monetary reward ‘grab’ attention as people look around a visual scene for a particular target, slowing them down slightly (e.g. Anderson, Laurent, & Yantis, 2011). We adapted their experiment to look specifically at memory, presenting four nonsense-shapes briefly and then, a few seconds later, testing how well people remembered which shapes had been presented. Before running the memory task we associated some of the shapes with high reward and some with low reward.

An experimental trial from our shapes experiment

When we asked people to remember four shapes, some of which were worth more than others, they didn’t remember the high-value items any better than the low-value items. This was a surprise! On the basis of the paper by Anderson and colleagues, we expected the high-value items would be better remembered – after all, they ought to grab attention. However, we did find a curious effect. If all of the shapes in a display were high value (a ‘high value trial’), then any one of them was remembered better than if all the shapes were low value (a ‘low value trial’). More curious yet, if half the shapes were high value and half were low value, the two kinds of shape were remembered about equally well – a little less well than when all the items were high value, and a little better than when all the items were low value.

We reasoned that this could have been because people simply made more effort when the shapes in the memory array were higher in value, on average, and so they did a bit better. However, a more specific (and interesting) explanation was also possible. We know that a chemical in the brain, dopamine, is involved in processing reward – in studies on monkeys, where the activity of dopamine neurons is recorded directly, experimenters see ‘pulses’ of dopamine release when rewarded items are presented. We also know that the prefrontal cortex (PFC) – the part of the brain thought to be most important in controlling working memory – is ‘soaked’ in dopamine: dopamine is released throughout the PFC. Some have suggested that the dopamine pulses ‘open the gate’ to working memory, and the more dopamine released at a given time, the wider the gate is opened (Braver & Cohen, 2000).
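To make the ‘gate’ idea concrete, here is a toy sketch of how a single shared gate could produce the pattern we saw when all the shapes appeared at once. It is only an illustration under stated assumptions: the function name `recall_chance` and both numbers are made up for the example and are not estimates from our data.

```python
# Toy "shared gate" model for a display where all items appear at once.
# Every number below is an illustrative assumption, not a value from the study.

BASELINE = 50       # assumed % chance of remembering an item with the gate at rest
GAIN_PER_HIGH = 5   # assumed extra % for each high-value item on screen


def recall_chance(display):
    """All items share one gate, so each kind of item gets the same chance."""
    gate = BASELINE + GAIN_PER_HIGH * display.count("high")
    return {value: gate for value in sorted(set(display))}


print(recall_chance(["high"] * 4))                    # high value trial
print(recall_chance(["high", "high", "low", "low"]))  # mixed trial
print(recall_chance(["low"] * 4))                     # low value trial
```

With these made-up numbers, the mixed display falls between the all-high and all-low displays, and high- and low-value shapes within it do equally well, which is the qualitative pattern described above. The same pattern would also follow from a general effort account, so this sketch does not tell the two apart.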

We couldn’t record the firing of dopamine neurons in our volunteers, so we couldn’t test this possibility directly. However, it made us wonder – what would happen if instead of showing our memory items all at the same time, we presented them very quickly one after the other, in a row? If subjects simply made more effort on those trials where they had encountered a higher value item, then we would still expect all of the items we showed to benefit from this. However, we know that dopamine pulses are only a fraction of a second long (about a third of a second). If we presented each item for about this length of time, one after the other, then a ‘reward pulse’ might be able to ‘pick out’ the high reward item, and not the other less valuable items. So, we ran the experiment, with a few adaptations: rather than shapes, we used coloured lines, presented one after the other, and asked people to remember the orientation of the lines. Certain colours were given high or low reward values.

An experimental trial from our second, sequential experiment
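The two explanations make different predictions for a sequential trial, and a toy sketch can spell them out. The function names `effort_account` and `pulse_account`, the example sequence, and the numbers are all hypothetical, chosen only to show the logic.

```python
# Toy comparison of the two accounts for a trial with items shown one after another.
# All numbers are illustrative assumptions, not values from the study.

BASELINE = 50   # assumed % chance of encoding an item with no boost
BOOST = 20      # assumed size of the reward-related boost, in percentage points


def effort_account(values):
    """General effort: a high-value item lifts encoding of every item in the trial."""
    boost = BOOST if "high" in values else 0
    return [BASELINE + boost for _ in values]


def pulse_account(values):
    """Brief pulse: only the item on screen during the pulse gets the boost."""
    return [BASELINE + (BOOST if value == "high" else 0) for value in values]


# A hypothetical sequence of four items, presented one at a time.
trial = ["low", "high", "low", "low"]

print("effort:", effort_account(trial))  # effort: [70, 70, 70, 70]
print("pulse: ", pulse_account(trial))   # pulse:  [50, 70, 50, 50]
```

Under the effort account every item in the sequence benefits, while under the pulse account only the high-value item does.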

We found that, indeed, only the high-value item in this experiment was more likely to be encoded, and not its near-neighbours. This doesn’t prove that dopamine pulses are responsible – we didn’t measure our volunteers’ dopamine neurons – but it does suggest that the reward effect is quite tightly localized in time: a ‘pulse’ tied to an item, not a more general ‘making an effort’ effect. This was an intriguing finding and it opens up several questions. Firstly, is this effect really down to dopamine, as we speculate? To find out, we’d need to see what dopamine neurons are doing while people perform the task. Interestingly, there is some evidence that the diameter of people’s pupils responds rapidly to dopamine release, so maybe measuring pupil diameter would be a way of getting some more evidence without having to get inside the brain.

Secondly – what’s the point of this rather weird-seeming ‘pulse’ mechanism? And why would it be useful – as in our first experiment – for unrewarded items to get ‘caught by the pulse’? Our speculative answer to this question is that our experiment was unnatural – we asked our volunteers to keep staring at the centre of the screen and flashed up the shapes all together, just for a moment. They had no chance to move their eyes (indeed we deliberately tried to prevent that!). However, in more natural settings, our eyes constantly flit around the scene. This is pretty hard to notice in yourself, but watch a friend’s eyes for a while (without freaking them out too much) – they jump from place to place continually, ‘fixating’ first this object, then that. In fact they move about 3 or 4 times per second, jumping from looking at one item to another. If our putative working-memory-updating pulses were ‘tied’ to these fixations, this might provide a mechanism by which the more rewarded items in the scene were more likely to enter memory.
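A quick back-of-the-envelope check shows why this speculation is at least plausible on timing grounds. The snippet below uses only the rough figures quoted in this post (a pulse of about a third of a second, and 3 or 4 eye movements per second); the helper name `fixation_ms` is made up for the example, and the comparison itself is our own rough arithmetic rather than a measured result.

```python
# Rough timing arithmetic using the approximate figures quoted in the post.

PULSE_MS = 1000 / 3  # "about a third of a second" per dopamine pulse


def fixation_ms(fixations_per_second):
    """Average time available for each fixation, in milliseconds."""
    return 1000 / fixations_per_second


for rate in (3, 4):
    print(f"{rate} fixations per second: about {fixation_ms(rate):.0f} ms each "
          f"(pulse: about {PULSE_MS:.0f} ms)")
```

On these figures a single fixation lasts roughly as long as a single pulse, so one pulse per fixation is not ruled out by the timing alone.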


Anderson, B. A., Laurent, P. A., & Yantis, S. (2011). Value-driven attentional capture. Proceedings of the National Academy of Sciences, 108(25), 10367–10371. doi:10.1073/pnas.1104047108
Braver, T. S., & Cohen, J. D. (2000). On the control of control: The role of dopamine in regulating prefrontal function and working memory. Control of Cognitive Processes: Attention and Performance XVIII, 713–737.

Wallis, G., Stokes, M. G., Arnold, C., & Nobre, A. C. (2015). Reward boosts working memory encoding over a brief temporal window. Visual Cognition, 23(1-2), 291–312. doi:10.1080/13506285.2015.1013168