Fig. 6 | Nature Communications

Fig. 6

From: Deviation from the matching law reflects an optimal strategy involving learning over multiple timescales

Model-independent analysis of integration over long timescales: the longest measurable integration timescale (LMIT). a Schematic of the analysis. From the plots describing matching behavior (see Fig. 1b), we fit a line to the data points and measure its slope (undermatching) and its intercept at x = 0.5 (color choice bias). We also estimate the session-by-session color reward imbalance, defined as the imbalance between the rewards obtained from the two color targets. Using lagged correlation analysis, we then estimate how far back in time the color reward imbalance can influence the color choice bias; we refer to this span as the longest measurable integration timescale (LMIT). Although the LMIT is a model-independent measure, it is related to our model parameter wSlow. We introduce the LMIT to show that it co-varies with the undermatching slope over time, providing direct evidence for the link between undermatching behavior and reward history integration over very long timescales.
b Precise definition of the LMIT and its dependence on the relative weight wSlow of the slow integrator of the model (simulated data). The lagged correlation between color choice bias and color reward imbalance is estimated from data simulated with the model in Fig. 2a, using different relative weights wSlow of the slow integrator. The correlation decays as the time lag between the color choice bias and the color reward imbalance increases. The correlation points are fitted with a line by weighted linear regression (dotted lines), and the LMIT is defined as the lag at which the fitted line crosses zero correlation (red filled circles). Each correlation is the mean of over 50 simulations, and the error bars indicate the standard deviations. The model was simulated over the same reward schedule experienced by Monkey F. The timescales are assumed to be τFast = 5 trials and τSlow = 1000 trials, though the precise choice of timescales is not essential for the results.
c The LMIT estimated through simulations vs the weight of the long time constant in the model. The estimated LMIT shows a clear positive correlation with the relative weight wSlow of the long timescale used in the model simulations. The LMIT is expressed in trials, converted from sessions using the mean session size. This strong monotonic relationship between wSlow and the LMIT suggests that the LMIT estimate can serve as a proxy for the wSlow estimate when analyzing experimental data.
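
The LMIT procedure described in panels b and c can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`lagged_correlation`, `lmit_from_correlations`, `sessions_to_trials`) are hypothetical, and the weighting scheme (each correlation point weighted by the inverse of its standard deviation) is an assumption, since the caption only states that a weighted linear regression was used.

```python
import numpy as np


def lagged_correlation(imbalance, bias, max_lag):
    """Pearson correlation between reward imbalance in session s
    and choice bias in session s + lag, for lag = 0..max_lag."""
    imbalance = np.asarray(imbalance, dtype=float)
    bias = np.asarray(bias, dtype=float)
    corrs = []
    for lag in range(max_lag + 1):
        x = imbalance[: len(imbalance) - lag]  # earlier sessions
        y = bias[lag:]                         # sessions `lag` later
        corrs.append(np.corrcoef(x, y)[0, 1])
    return np.array(corrs)


def lmit_from_correlations(lags, mean_corr, sd_corr):
    """Fit a weighted line to correlation vs lag and return the lag
    at which the fitted line crosses zero correlation."""
    # Assumption: weight each point by 1/sd (np.polyfit applies the
    # weights to the unsquared residuals).
    weights = 1.0 / np.asarray(sd_corr, dtype=float)
    slope, intercept = np.polyfit(lags, mean_corr, 1, w=weights)
    return -intercept / slope


def sessions_to_trials(lmit_sessions, trials_per_session):
    """Convert an LMIT in sessions to trials using the mean session size."""
    return lmit_sessions * np.mean(trials_per_session)


# Toy numbers for illustration only (not data from the paper).
lags = np.array([0, 1, 2, 3])
mean_corr = 0.6 - 0.2 * lags       # correlation decaying linearly with lag
sd_corr = np.full(4, 0.1)          # equal uncertainty on every point
lmit = lmit_from_correlations(lags, mean_corr, sd_corr)
print(lmit, sessions_to_trials(lmit, [900, 1100]))
```

In practice the mean and standard deviation at each lag would come from repeated simulations (or from the experimental sessions), and the zero crossing is only meaningful when the fitted slope is negative.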
