Predictive feedback, early sensory representations and fast responses to predicted stimuli depend on NMDA receptors
Using high-density EEG, Bayesian modelling and machine learning in an audio-visual delayed match-to-sample task, the study shows that frontal alpha activity and predictive beta feedback pre-activate sensory templates and account for trial-by-trial reaction times. Low-dose ketamine, an NMDA receptor blocker—but not dexmedetomidine—disrupted these behavioural and neural predictive signatures, indicating that NMDA receptors are necessary for predictive feedback and fast responses to expected stimuli.
Authors
- Afrasiabi, M.
- Austerweil, J. L.
- Casey, C.
Published
Abstract
Learned associations between stimuli allow us to model the world and make predictions, crucial for efficient behavior; e.g., hearing a siren, we expect to see an ambulance and quickly make way. While there are theoretical and computational frameworks for prediction, the circuit and receptor-level mechanisms are unclear. Using high-density EEG, Bayesian modeling and machine learning, we show that inferred “causal” relationships between stimuli and frontal alpha activity account for reaction times (a proxy for predictions) on a trial-by-trial basis in an audio-visual delayed match-to-sample task which elicited predictions. Predictive beta feedback activated sensory representations in advance of predicted stimuli. Low-dose ketamine, an NMDA receptor blocker, but not the control drug dexmedetomidine, perturbed behavioral indices of predictions, their representation in higher-order cortex, feedback to posterior cortex and pre-activation of sensory templates in higher-order sensory cortex. This study suggests predictions depend on alpha activity in higher-order cortex, beta feedback and NMDA receptors, and ketamine blocks access to learned predictive information.
Research Summary of 'Predictive feedback, early sensory representations and fast responses to predicted stimuli depend on NMDA receptors'
Introduction
Predictive coding frameworks propose that the brain actively generates top-down predictions from higher-order cortical levels and compares them with incoming sensory evidence, with mismatches producing error signals that update internal models. Although theoretical work has implicated N-methyl-D-aspartate receptors (NMDARs) in mediating feedback predictions—because NMDARs are enriched in cortical layers associated with feedback and modulate higher-order excitability—direct experimental evidence linking NMDAR function to prediction generation, representation and feedback is limited. Classic oddball paradigms report ketamine-related changes in mismatch negativity, but those paradigms conflate prediction, sensory evidence and error signalling and lack a clear trial-by-trial behavioural readout of prediction.
Methods
Behavioural modelling employed hierarchical drift-diffusion models (HDDM) to relate trial-by-trial measures of causal strength (ΔP, causal power, causal support) and transitional probability to the bias parameter (z) that determines the starting point of evidence accumulation. Models were compared with Deviance Information Criterion (DIC). EEG–behaviour links were tested by including single-trial RF alpha power as a predictor of bias in HDDM. Multivariate decoding used a support vector machine (SVM) on band-limited power features (theta, alpha, beta, gamma) from the four clusters to classify HP/MP/NP trials, and a time-generalised recurrent neural network (RNN) to decode visual identities (V1–V3) from raw EEG, assessing pre-stimulus activation of sensory templates. Source localisation used DICS beamforming and AAL ROIs; non-parametric spectral Granger causality in source space tested feedback from right frontal regions to right inferior temporal cortex, with frequency-specific analyses emphasising the beta band (15–30 Hz). Statistical tests used linear mixed-effects models with contrast analysis for prediction (linear contrast HP,MP,NP: -1,0,1) and drug effects, cluster permutation tests for ERPs and repeated-measures ANOVA or permutation approaches where appropriate.
Results
Pre-stimulus sensory templates and feedback: Time-generalised RNN decoding of visual identities from RF/RC/LC/OC clusters revealed significant above-chance classification during the delay period prior to image onset in the pre-drug baseline and under DEX, consistent with pre-activation of sensory templates. The right central (RC) cluster contributed most to this pre-stimulus decoding when training on data 100 ms post-stimulus and testing across the trial: RC contribution peaked 476 ms before visual onset in baseline, 344 ms in DEX, and only 52 ms before onset under ketamine (ANOVA F(20)=2.9, P=0.011), indicating substantially delayed or weakened pre-activation under ketamine. Source-level analyses identified right-hemisphere frontal and temporal ROIs showing delay-period alpha increases versus baseline; non-parametric spectral Granger causality from averaged right frontal ROIs to right inferior temporal cortex in the beta band correlated with predictive strength (ANOVA, F(1,30.46)=11.47, P=0.0001). Ketamine disrupted this relationship: a prediction × drug interaction showed that predictive value no longer correlated with beta-band frontal→inferior temporal Granger causality under ketamine (ANOVA interaction F(1,24.98)=2.61, P=0.04), whereas DEX preserved the graded feedback.
Discussion
The authors acknowledge uncertainties and boundaries of interpretation: some reported frequency effects in predictive processing vary across tasks and timing, and the present work focuses on a specific audio–visual delayed paradigm with right-hemisphere sources for these particular stimuli. They also note converging evidence from macaque studies but do not claim to identify laminar mechanisms directly in humans. Overall, the study supports a role for frontal alpha, beta-mediated feedback and NMDARs in forming, representing and using predictions, and highlights that NMDAR blockade with ketamine can selectively perturb those predictive mechanisms while leaving basic feedforward sensory responses intact.
METHODS
We investigated what information in the trial history subjects base predictions on; in other words, how do subjects learn the predictive relationship between stimuli. We tested two possibilities: (i) did subjects generate predictions by keeping track of simple co-occurrences of each sound-image pair, i.e., were they basing predictions on correlations; or (ii) did subjects not only track the occurrence of each image with its paired sound but also unpaired sounds, i.e., were they basing predictions on "causation". Here, the term causation is used in a statistical sense, where subjects can infer a causal relationship between the initial sound and the image that closely follows in time. To clarify the difference between correlation and causation in this context, let us consider a hypothetical example in which you would like to test whether your new automatic sprinkler system worked overnight. In the morning, you walk outside and see that the grass is wet. Hence, you might think that the sprinkler operated overnight. In this case, your inference is based on the correlation of two events. However, after learning from the weather report that it rained last night, you lose confidence that your sprinkler watered the lawn, i.e., the wet grass may be due to the rain instead. Here, the presence of one cause (rain) casts doubt on the other (sprinkler) and thus helps us develop a more appropriate cause-effect relationship between events (unlike simple correlation). To relate this back to our task, when subjects track the occurrence of each image with its paired sound but also unpaired sounds, it is similar to checking the weather report to deduce if the sprinkler worked properly. Inferring a cause-effect structure between stimuli helps subjects eliminate weak/conditional relations between particular sounds and images. To measure correlation (option i above), we calculated the transitional probability, i.e., how often a particular image follows the sound only.
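The contrast between the two options can be sketched with running counts over the trial history. This is a minimal illustration, not the authors' calculation: the class and method names are hypothetical, ΔP is taken as P(image | sound) minus P(image | other sounds, pooled), and causal power is assumed to follow the standard generative form ΔP / (1 − P(image | other sounds)). The paper's own table and methods give the definitions actually used.

```python
from collections import defaultdict


class ContingencyTracker:
    """Running sound -> image counts, updated once per trial (hypothetical helper)."""

    def __init__(self):
        # counts[sound][image] = number of trials on which `image` followed `sound`
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, sound, image):
        self.counts[sound][image] += 1

    def p_given_sound(self, sound, image):
        """Transitional probability P(image | sound): uses the paired sound only."""
        row = self.counts.get(sound, {})
        total = sum(row.values())
        return row.get(image, 0) / total if total else 0.0

    def p_given_other_sounds(self, sound, image):
        """P(image | any other sound): the extra information used by option (ii)."""
        hits = total = 0
        for other, row in self.counts.items():
            if other != sound:
                hits += row.get(image, 0)
                total += sum(row.values())
        return hits / total if total else 0.0

    def delta_p(self, sound, image):
        return self.p_given_sound(sound, image) - self.p_given_other_sounds(sound, image)

    def causal_power(self, sound, image):
        base = self.p_given_other_sounds(sound, image)
        if base >= 1.0:
            return 0.0  # undefined when the image always appears anyway; placeholder value
        return self.delta_p(sound, image) / (1.0 - base)


# Made-up trial history: V1 follows A1 on 8 of 10 trials and A2 on 2 of 10 trials.
tracker = ContingencyTracker()
history = [("A1", "V1")] * 8 + [("A1", "V2")] * 2 + [("A2", "V1")] * 2 + [("A2", "V2")] * 8
for sound, image in history:
    tracker.update(sound, image)

tp = tracker.p_given_sound("A1", "V1")   # transitional probability
dp = tracker.delta_p("A1", "V1")         # contingency, discounts the unpaired sound
cp = tracker.causal_power("A1", "V1")    # contingency rescaled by the remaining headroom
print(f"transitional probability={tp:.2f}, delta P={dp:.2f}, causal power={cp:.2f}")
```

Because the counts are updated after every trial, all three values can be read out trial by trial, which is the property the analysis relies on.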
To measure causation (option ii), we calculated the causal power of the sound-image association, which is the amount of evidence that a sound "causes" a particular image, as opposed to a random different cause (Table and Methods show calculation details; we also calculated two other measures of "causation", ΔP and causal support, and observed similar results). We updated the transitional probability and causal power estimates each trial, to account for the additional information available, i.e., the transitional probability/causal power value was the same for each stimulus at the start of the pre-drug baseline, but these values eventually systematically differed between stimuli as more trials were performed, reflecting the accumulating information from the trial history (Fig 1F shows the causal power differentiating all stimuli earlier than transitional probability). We next determined whether transitional probability and/or causal power can account for the behavioral results (RTs). To this end, we modeled subjects' decision-making process using a drift-diffusion model (DDM). Evidence accumulates (drift process) from a starting point to one of two boundaries. Here, the boundaries represent the two possible outcomes for match trials only (correct and incorrect). The drift process stops when it reaches a boundary, indicating the choice, and the time taken to reach the boundary represents the RT for the trial (Fig). The starting point of the drift process, here modeled from image onset, may be biased towards one of the boundaries, and it is determined by a bias parameter, z. This parameter represents the predictive value of the sound, which can be based on the transitional probability or causal power. A drift process that starts with a larger bias will reach the decision boundary quicker, resulting in a faster RT, i.e., more predictive sounds generate a larger bias and faster RT.
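The link between starting-point bias and RT described above can be illustrated with a small simulation. This is a sketch only: the drift, boundary, noise and time-step values are arbitrary assumptions, non-decision time is omitted, and the function names are hypothetical. It is not the hierarchical Bayesian model fitted in the study.

```python
import random


def simulate_ddm_trial(z, rng, drift=1.0, boundary=2.0, noise=1.0, dt=0.005, max_t=10.0):
    """Simulate one drift-diffusion trial.

    z is the relative starting point (0 = lower boundary, 1 = upper boundary).
    Returns (hit_upper, decision_time_in_seconds). A trial that times out
    without reaching the upper boundary is reported as hit_upper=False.
    """
    x = z * boundary
    t = 0.0
    step_sd = noise * dt ** 0.5
    while 0.0 < x < boundary and t < max_t:
        x += drift * dt + rng.gauss(0.0, step_sd)  # deterministic drift plus Gaussian noise
        t += dt
    return x >= boundary, t


def mean_correct_rt(z, n_trials=400, seed=0):
    """Mean decision time over trials that end at the upper ("correct") boundary."""
    rng = random.Random(seed)
    rts = []
    for _ in range(n_trials):
        hit_upper, t = simulate_ddm_trial(z, rng)
        if hit_upper:
            rts.append(t)
    return sum(rts) / len(rts)


rt_unbiased = mean_correct_rt(z=0.5)  # starting point midway between the boundaries
rt_biased = mean_correct_rt(z=0.7)    # starting point shifted towards the correct boundary
print(f"mean correct RT: z=0.5 -> {rt_unbiased:.3f} s, z=0.7 -> {rt_biased:.3f} s")
```

Shifting the starting point towards the correct boundary shortens the distance the process has to travel, so the simulated correct responses are faster on average.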
Thus, whichever of transitional probability or causal power (through the bias parameter, z) yields better correspondence with subjects' RTs will be the better indicator of the information subjects used to generate predictions. To test this, we used hierarchical Bayesian parameter estimation (HDDM), which calculates the posterior probability density of the diffusion parameters generating the RTs for the entire group of subjects simultaneously, while allowing for individual differences. We estimated the regression coefficients to determine the relationship between trial-to-trial transitional probability/causal power and biases estimated from the posterior predictive distribution. In other words, we calculated the bias for each trial that best predicted the RT. However, for each trial, the bias was constrained to depend on the transitional probability or causal power (equation, Fig). Hence, for each trial we calculated the relationship (regression coefficient β1) between the bias and transitional probability/causal power that best predicted RT. Specifically, we estimated the posterior probability density of the regression coefficient (β1; Fig) to determine the relationship between the bias and either transitional probability or causal power. We found causal power (deviance information criterion, DIC=-3197) predicted RTs better than transitional probabilities (DIC=-1795), i.e., causal power better captured the basis of prediction generation (option ii above). Further, bias was positively correlated with causal power (P{β1>0}=0.04; Fig). This suggests that subjects based predictions on trial-by-trial updates of inferred "causal" relationships between sounds and images, rather than just correlations. The HDDM also provides a framework to model drug effects. Thus, we repeated the above analysis of subjects' behavior under ketamine and under DEX.
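As an illustration of the constraint described above, the trial-wise bias can be written as a regression on the trial-wise predictor. The linear form below is an assumption made for clarity; the equation used in the original paper (including any link function that keeps z between 0 and 1) is not reproduced in this summary. The symbols β0 (intercept), β1 (regression coefficient) and x_t (predictor on trial t) are the notation adopted here.

```latex
z_t = \beta_0 + \beta_1 \, x_t ,
\qquad
x_t \in \{\text{transitional probability}_t,\ \text{causal power}_t\}
```

Under this form, a positive β1 means that more predictive sounds start the drift process closer to the decision boundary, whereas β1 = 0 gives every sound the same bias β0.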
If ketamine prevents predictive information from conferring a behavioral advantage, all sounds will generate similar biases; i.e., there will be no correlation between the bias and the predictive value of sounds, so β1 will be zero. Indeed, under ketamine, β1 was not different from zero (P{β1>0}=0. […] To answer this, we used the HDDM to analyze the first 30 trials for each subject after recovery (translating to approximately 10 trials for each sound cue, for every subject). We found that, only for the former (option (a)), bias positively correlated with causal power (P{β1>0}=0.03; Fig). This suggests that ketamine did not produce a loss of previously learned predictive information, but rather ketamine prevented access to the predictive information.
RESULTS
Predictions improved RTs. Subjects initially learned paired associations (A1-V1, A2-V2, A3-V3) between three sounds (A1, A2, A3) and three images (V1, V2, V3) through trial-and-error. During learning, each sound and image had equal probability (33%) of appearing in any given trial, preventing subjects from developing any differential predictions due to stimulus frequency. Thus, during this learning phase, the presence of any given sound does not predict the occurrence of any future image. Following the presentation of both stimuli, subjects reported if the sound and image were in fact paired, i.e., whether or not they matched. To manipulate subjects' predictions during subsequent testing, we varied the probability of an image appearing after its associated sound. This probability was different for each sound: 85% chance of V1 after A1; 50% chance of V2 after A2; and 33% chance of V3 after A3 (Fig). Thus, A1 was highly predictive (HP), A2 was moderately predictive (MP), and A3 was not match predictive (NP). We hypothesized that increasing the predictive value of the sound would allow subjects to better predict the upcoming image, enabling quicker responses (HP<MP<NP) in match trials. However, if predictions are mediated by NMDARs, then ketamine should disrupt predictions; i.e., subjects administered a sub-anesthetic dose of ketamine should be unable to exploit the differential predictive value of each sound, thus preventing faster RTs. If these effects are specific to NMDAR manipulation, then the control drug DEX should still allow faster RTs to predicted stimuli. To investigate these hypotheses (and to restrict multiple tests on the same dataset), we ran a linear mixed effects (LME) model.
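The testing-phase design described above can be sketched as a trial generator. The match probabilities (85%, 50%, 33%) come from the text; everything else is an assumption made for illustration, in particular that the three sounds are equally likely during testing and that a non-matching image is drawn uniformly from the two unpaired images. The names `MATCH_PROB`, `PAIRED_IMAGE` and `generate_trial` are hypothetical.

```python
import random

# Probability that the paired image follows each sound during testing (from the text).
MATCH_PROB = {"A1": 0.85, "A2": 0.50, "A3": 0.33}   # HP, MP, NP
PAIRED_IMAGE = {"A1": "V1", "A2": "V2", "A3": "V3"}


def generate_trial(rng):
    """Return (sound, image, is_match) for one simulated testing trial."""
    sound = rng.choice(sorted(MATCH_PROB))           # assumption: sounds equally likely
    paired = PAIRED_IMAGE[sound]
    if rng.random() < MATCH_PROB[sound]:
        image = paired
    else:
        # assumption: non-matching image chosen uniformly from the other two images
        image = rng.choice([v for v in PAIRED_IMAGE.values() if v != paired])
    return sound, image, image == paired


def empirical_match_rates(n_trials, seed=0):
    """Fraction of match trials per sound over a simulated session."""
    rng = random.Random(seed)
    shown = {s: 0 for s in MATCH_PROB}
    matched = {s: 0 for s in MATCH_PROB}
    for _ in range(n_trials):
        sound, _image, is_match = generate_trial(rng)
        shown[sound] += 1
        matched[sound] += is_match
    return {s: matched[s] / shown[s] for s in MATCH_PROB}


rates = empirical_match_rates(9000)
for sound in sorted(rates):
    print(f"{sound}: target {MATCH_PROB[sound]:.2f}, simulated {rates[sound]:.2f}")
```

Feeding such simulated sequences into a trial-by-trial estimator is one way to see how quickly the three sounds become distinguishable from the trial history.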
We used the sounds' predictive value (HP, MP, NP) and drug condition (before drug, under ketamine, under DEX, after recovery from drug) as independent variables and RT as our dependent variable (RT ~ sounds' predictive value + drug condition + sounds' predictive value x drug condition). We applied a contrast analysis strategy to model our independent variables. Our study conformed to the guidelines set out by Abelson and Prentice with regard to contrast analysis; i.e., we included the contrast of interest along with paired, orthogonal contrasts. (Contrasts of interest only explain a part of the total variation between groups. We included orthogonal contrasts to explain the residual variance. According to Abelson and Prentice, an analysis of the residual variance, i.e., orthogonal contrasts, is important since one may miss systematic patterns in the data if one only tests the contrast of interest. They suggested that finding a significant contrast of interest and a non-significant orthogonal contrast confirms the data support the hypothesis.) To test our hypothesis that a greater predictive value of sounds allows subjects to respond faster, we used a linear contrast (HP, MP, NP: -1, 0, 1) as our contrast of interest, and a quadratic contrast (HP, MP, NP: -1, 2, -1) as the orthogonal contrast in the analysis. Further, to test our hypothesis that only ketamine prevents these faster responses, we used a four-level contrast (under ketamine, before drug, under DEX, after recovery: 3, -1, -1, -1) as our contrast of interest, and two orthogonal contrasts (under ketamine, before drug, under DEX, after recovery: 0, 0, -1, 1 and 0, 2, -1, -1) in the analysis. A significant main effect of prediction will confirm predictive sounds produce faster responses. A significant interaction effect of prediction and drug condition will confirm that ketamine disrupts subjects' ability to exploit predictive sounds to respond faster.
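The contrast logic for the prediction factor can be checked with a few lines of arithmetic. This is a minimal sketch: the helper names are hypothetical and the mean RTs are made-up numbers chosen to show a purely linear trend, not data from the study.

```python
def dot(u, v):
    """Inner product of two equal-length weight vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_contrast(weights):
    """A valid contrast has weights that sum to zero."""
    return sum(weights) == 0


def contrast_score(weights, condition_means):
    """Weighted sum of condition means; a value near 0 means no such pattern in the means."""
    return dot(weights, condition_means)


# Contrast weights for (HP, MP, NP), as given in the text.
linear = [-1, 0, 1]       # contrast of interest: RT increases from HP to NP
quadratic = [-1, 2, -1]   # orthogonal contrast: MP deviates from the HP/NP midpoint

orthogonal = dot(linear, quadratic) == 0

# Hypothetical mean RTs in seconds for HP, MP, NP (illustrative only).
example_rt = [0.60, 0.65, 0.70]
linear_score = contrast_score(linear, example_rt)
quadratic_score = contrast_score(quadratic, example_rt)
print(f"orthogonal={orthogonal}, linear={linear_score:.3f}, quadratic={quadratic_score:.3f}")
```

With evenly spaced means, the linear contrast captures the whole trend and the quadratic contrast is essentially zero, which is the pattern (significant contrast of interest, non-significant orthogonal contrast) that the Abelson and Prentice strategy treats as support for the hypothesis.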
We found a significant main effect of sounds' predictive value in our LME model (ANOVA, F(1, 21.07)=14.14, P=0.001; orthogonal contrasts non-significant). RTs were faster when sounds had greater predictive value (Fig). This result was further validated in parallel psychophysics experiments, where we controlled for possible match bias using randomly interleaved "inversion trials" (in which subjects simply indicated whether greebles were inverted) to minimize the expectation of match trials (ANOVA, F(1, 21.92)=18.71, P=0.0001, orthogonal contrasts non-significant, Fig). Furthermore, we found effects on RTs could not be explained by speed-accuracy trade-offs, as subjects were most accurate for HP, followed by MP and NP sounds (ANOVA, F(1, 21.66)=14.42, P=0.001, orthogonal contrasts non-significant, Fig).
Ketamine blocked fast RTs to predictive sounds. Importantly, we found a significant interaction of sounds' predictive value and drug condition (ANOVA, F(1,16.42)=5.51, P=0.03). The interaction effect confirmed that, under ketamine, the linear correlation between the predictive value of sounds and RT was diminished. This effect was specific to NMDAR manipulation, as DEX did not disrupt the ability of subjects to exploit the differential predictive value of the sounds. Under DEX, the linear correlation between the predictive value of sounds and RT was intact (Fig 1E), similar to the pre-drug baseline condition. These pharmacological effects were not due to low accuracy, as subjects' average accuracy was similar across all three conditions (77.8% under ketamine, 85.7% without ketamine, and 81.0% under DEX; ANOVA, F(1, 23.932)=1.36, P=0.22). Neither were the effects due to the level of sedation, as subjects were more alert under ketamine than under dexmedetomidine (average modified observer's assessment of alertness/sedation (OAA/S) score of 4.85 under ketamine compared to 3.33 under DEX, on a scale from 5, awake, to 1, unresponsive; unpaired t-test, P=0.003).
Significant interaction effects in our LME model also confirmed that the linear correlation of RTs with predictive strength returns after recovery from ketamine (2-4 hours after ending ketamine administration, depending upon subject's recovery; Fig). Overall, our results demonstrate that subjects used predictive information to enhance behavioral performance and the NMDAR-blocker ketamine prevented this behavioral advantage.
CONCLUSION
Our results show NMDAR-mediated, circuit-level mechanisms of prediction and its behavioral effects. Frontal cortex represented predictions and, starting prior to image onset, transmitted them to posterior cortex in the beta band, to activate a sensory representation of the predicted image. Stronger predictions enabled faster responses, and reflected causal power, i.e., inferred "causal" relationships between sounds and images. In contrast, ketamine prevented fast responses to predictive stimuli, and prevented subjects from using the strength of causal power to generate predictions. At the circuit level, ketamine disrupted predictions by reducing frontal alpha power to the same low level prior to all images (likely indicating reduced SNR), leading to undifferentiated feedback and perturbed pre-stimulus sensory activations. Overall, this suggests that NMDARs normally sharpen representations of predictions in frontal and posterior cortex, to enable predictive coding (PC). The data are less supportive of the classical view of perception, with its emphasis on feedforward processing to reconstruct images, because under that view one might have expected little systematic difference in behavioral and neural measures for different predictive conditions. The initial predictive auditory stimulus will activate auditory pathways, leading to the generation of a prediction of the subsequent visual image by a higher-order, multi-modal area. Complex auditory stimuli, like the trisyllabic greeble names in our task, are represented in auditory lateral belt and parabelt cortex. Belt and parabelt regions are connected with a number of multimodal areas in superior temporal and prefrontal cortex, where there are multisensory neurons responding to both vocalizations and images. Recent work suggests that early sensory fusion occurs in temporal or parietal multimodal areas, whereas more flexible weighting and integration of sensory signals for adaptive behavior occurs in frontal multimodal areas.
Our results are consistent with this, but go further by showing that the frontal multimodal areas are the source of predictive multimodal signals, which, in addition to the enhancement of sensorimotor processing shown here, may be useful for communication and language processing. Interestingly, we found that the frontal source and posterior target of predictive information (the latter showing pre-stimulus sensory activations of the predicted greeble) were lateralized to the right hemisphere. This is consistent with previous studies suggesting a possible right hemispheric bias for the processing of greebles, and possibly faces more generally. There have been two broad approaches for understanding how we learn relationships between stimuli: associative and causal approaches. Classical theories like the Rescorla-Wagner model propose learning as the association between a cue and an outcome. Causal models of learning, on the other hand, propose that we learn relationships between latent unobservable "causes" and observable stimuli (both cues and outcomes). In other words, causal learning models have put forward the idea of "clustering", where observations (related to both cue and outcome) are clustered together according to their hypothetical latent causes. In line with previous work, we found that subjects' RTs were best predicted by a causal model (causal power) and not an associative model (transitional probability). Our causal learning results have further implications for models of sequential learning. Previous work on causal learning focused on summarized data contingency. Our findings support trial-by-trial learning from sequential data as proposed by a recent modelling study. Additionally, using HDDM we found that both RF alpha power and causal power correlated with RT on a trial-by-trial basis. This points towards a neural readout of causal inference in humans.
Although beta oscillations have been proposed to maintain the current brain state, there is growing evidence of beta activity playing a more dynamic role. It has been proposed that beta oscillations are suitable for endogenous re-activation of cortical representations, to facilitate task-relevant activity patterns and cognitive demands. In line with this, we found beta band predictive feedback from right frontal to inferior temporal cortices reactivates greeble representations in posterior cortex prior to image onset. Spitzer and Haegens speculate that beta band activity is well suited to be a 'transit' between alpha frequency (generally associated with cortical excitation/inhibition) and gamma frequency (generally associated with population spiking and active stimulus coding) activity. Our finding of frontal cortical alpha power coding for the predictive value of sound cues, followed by frontal influence on temporal areas (shown to have greeble representations) at beta frequencies prior to visual-evoked activity, supports beta's role as a 'transit' band. Intracortical laminar recordings in animal studies of sensory and attentional processing suggest that feedforward signaling operates at gamma frequencies, whereas feedback signaling operates at lower frequencies. Consistent with this, previous work on predictive processing using univariate measures has reported the involvement of various lower frequencies, including theta, alpha and beta bands. Although there are varying reports of the direction of alpha power changes in predictive processing, this may be due to "when" or "what" is being predicted, differences in task structure (e.g., analyzing the stimulus or delay period) and differences between brain regions. Further considering connectivity, a recent macaque study reported greater alpha and beta feedback from prefrontal to visual cortex during the presentation of more predictive visual stimuli.
Van Pelt et al. also found strongest top-down feedback connectivity in the beta band while subjects viewed videos of predictable events. Our finding of frontal cortex causally influencing posterior cortex in the beta band according to predictions extends this finding to the delay period in the absence of sensory stimulation, as well as provides support for PC models that incorporate a key role for oscillatory activity more generally. NMDARs are located in both superficial and deep cortical layers, potentially allowing NMDARs to modulate the representation of predictions in deep layers, as proposed in certain PC models, and predictive feedback signaling to superficial and/or deep layers. NMDAR blockade in humans has been shown to modulate frontal cortical excitability, and our decoding analyses suggest that this NMDAR-related change in excitability reduces the SNR of prediction representations. This is consistent with macaque experiments showing that NMDAR blockade reduces the SNR in frontal cortex during the working memory period of an antisaccade task. In our study, NMDAR blockade also perturbed predictive feedback, consistent with macaque experiments showing NMDARs contribute to feedback signaling. Taken together, these results suggest NMDARs influence both the representation of predictions in higher-order areas and predictive signaling to lower-order areas, which impacts the formation of pre-stimulus templates. Ketamine at sub-hypnotic doses perturbed feedback connectivity from frontal to more posterior cortex, but not evoked activity in sensory cortices, during which subjects could still perceive and accurately respond to audio-visual stimuli. This raises questions about the requirement of frontal feedback integrity for consciousness. Further, it has been proposed that generative models create virtual realities that support conscious experience.
That subjects' predictions in our study could be disrupted without impairing consciousness imposes constraints on PC as a theory of consciousness.