Our analysis enabled us to study the entire time course of cortical processes underlying decision making, outcome evaluation, and learning (i.e., updating) value representations. Upon stimulus presentation, retrieval of learnt values activates cortical value representations
reflected in early midfrontal EEG activity. Decision certainty is reflected in P3b-like parietal EEG activity around response latency, and mapping of the selected action to the motor response is reflected in lateralized activity from (pre)motor cortices (Figure S5C). After feedback, outcomes are initially processed separately depending on whether their consequences are real or fictive, presumably in order to convert feedback information into a common value currency allowing for efficient learning of stimulus values. Then the information about necessary value updates converges on common parietal P3b-like activity modulated by whether the action was successful or not. Given the probabilistic nature of the instrumental learning task, several parameters need to be used to weight the impact of single-trial outcomes. Over the course of multiple trials, learning rate indicates the learning success and downweights the single-feedback information at later learning stages. Moreover, when a choice is made with high certainty, perseveration of this behavior is favorable. This means that already at the time of the response (and thus before
feedback), high certainty might be used to strengthen the current value representation, thereby shielding it from potentially misleading feedback. Interestingly, the stimulus- and feedback-locked late parietal P3b-like activity is consistent with the notion of certainty- and learning-rate-weighted value strengthening and updates at different time points: high response certainty, which should be associated with re-encoding (strengthening) of the stimulus value to assure perseveration,
is associated with high stimulus-locked P3b amplitudes. In contrast, after feedback, high learning rates and unfavorable outcomes commonly give rise to high feedback-locked P3b amplitudes, presumably reflecting value updating and storage, thereby increasing the likelihood of changing future choice behavior. To put it briefly, lower stimulus-related P3b and higher feedback-related P3b amplitudes should be associated with an increased likelihood of switching choice on the next encounter with the same stimulus. This notion that stimulus- and feedback-related P3b amplitudes have opposite relations to switch behavior was tested at electrode Pz, which was identified via a conjunction analysis of all relevant stimulus- and feedback-locked effects in the P3b time window (Figure 4D). A discrimination threshold was iteratively estimated in one randomly chosen half of the trials and then used to predict switching in the other half.
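The split-half logic of this test can be illustrated with a minimal sketch. It is not the procedure used in the study: the function name, the single per-trial score (assumed here to be feedback-locked minus stimulus-locked P3b amplitude at Pz), the grid search over observed scores as a stand-in for the iterative estimation, and the accuracy criterion are all assumptions made for illustration.

```python
import numpy as np

def split_half_switch_prediction(score, switched, seed=0):
    """Estimate a threshold on one random half of trials, test it on the other.

    score    : per-trial scalar (hypothetical: feedback-locked minus
               stimulus-locked P3b amplitude at Pz)
    switched : per-trial boolean, True if the choice was switched at the
               next encounter with the same stimulus
    """
    score = np.asarray(score, dtype=float)
    switched = np.asarray(switched, dtype=bool)

    # Randomly split the trials into an estimation half and a test half.
    rng = np.random.default_rng(seed)
    order = rng.permutation(score.size)
    half = score.size // 2
    train, test = order[:half], order[half:]

    # Assumed estimation scheme: iterate over the observed scores as
    # candidate thresholds and keep the one with the best training accuracy.
    best_thr, best_acc = None, -1.0
    for thr in np.unique(score[train]):
        acc = np.mean((score[train] >= thr) == switched[train])
        if acc > best_acc:
            best_thr, best_acc = thr, acc

    # Apply the fixed threshold to the held-out half.
    test_acc = np.mean((score[test] >= best_thr) == switched[test])
    return float(best_thr), float(test_acc)
```

Because the threshold is fixed before the second half is examined, the accuracy on that half is an out-of-sample estimate of how well the amplitudes predict switching. Repeating the random split with different seeds would give a distribution of such estimates.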