key: cord-0272285-29nw937u
authors: Wittkuhn, Lennart; Krippner, Lena M.; Schuck, Nicolas W.
title: Statistical learning of successor representations is related to on-task replay
date: 2022-02-02
journal: bioRxiv
DOI: 10.1101/2022.02.02.478787
sha: b09f7e01018e79a2591fec95f5a9448a97f47c8d
doc_id: 272285
cord_uid: 29nw937u

Humans automatically infer higher-order relationships between events in the environment from their statistical co-occurrence, often without conscious awareness. Neural replay of task representations, which has been described as sampling from a learned transition structure of the environment, is a candidate mechanism by which the brain could use or even learn such relational information in the service of adaptive behavior. Human participants viewed sequences of images that followed probabilistic transitions determined by ring-like graph structures. Behavioral modeling revealed that participants acquired multi-step transition knowledge through gradual updating of an internal successor representation (SR) model, although half of participants did not indicate any knowledge about the sequential task structure. To investigate neural replay, we analyzed dynamics of multivariate functional magnetic resonance imaging (fMRI) patterns during short pauses from the ongoing statistical learning task. Evidence for sequential replay consistent with the probabilistic task structure was found in occipito-temporal and sensorimotor cortices during short on-task intervals. These findings indicate that implicit learning of higher-order relationships establishes an internal SR-based map of the task, and is accompanied by cortical on-task replay.

The representation of structural knowledge in the brain in the form of a so-called cognitive map has been a topic of great interest. A common assumption is that a cognitive map provides the basis for flexible learning, inference, and generalization (Tolman, 1948; Wilson et al., 2014; Schuck et al., 2016; Behrens

experienced both graphs as well as the change between them (Fig. 2c). 12 participants started in the unidirectional condition and transitioned to the bidirectional graph (uni - bi), while 27 participants experienced the reverse order (bi - uni).

The relationships among the six task stimuli depicted as a ring-like graph structure (left). In the unidirectional graph (middle), stimuli frequently transitioned to the clockwise neighboring node (pij = pAB = 0.7), never to the counterclockwise neighboring node (pAF = 0.0), and only occasionally to the three other nodes (pAC = pAD = pAE = 0.1). In the bidirectional graph (right), stimuli were equally likely to transition to the clockwise or counterclockwise neighboring node (pAB = pAF = 0.35) and only occasionally transitioned to the three other nodes (pAC = pAD = pAE = 0.1). Transition probabilities are highlighted for node A only, but apply equally to all other nodes. Arrows indicate possible transitions, colors indicate transition probabilities (for a legend, see panel b). (b) Transition matrices of the unidirectional (left) and bidirectional (right) graph structures. Each matrix depicts the probability (colors) of transitioning from the stimulus at the previous trial t − 1 (x-axis) to the current stimulus at trial t (y-axis). (c) Within-participant order of the two graph structures across the five runs of the graph learning task. n = 12 participants first experienced the unidirectional, then the bidirectional graph structure (uni - bi; top horizontal panel) while n = 27 participants experienced the reverse order (bi - uni; bottom horizontal panel).
In both groups of participants, the graph structure was changed without prior announcement halfway through the third task run. Numbers indicate approximate run duration in minutes (min). Colors indicate graph condition (uni vs. bi; see legend).

(d) Visualization of the relative magnitude of the outcome variable (e.g., behavioral responses or classifier probabilities; y-axis) for specific transitions between the nodes (x-axis) and the two graph structures (uni vs. bi; horizontal panels) under the three assumptions (vertical panels), (1) that there is no difference between transitions (null hypothesis), (2) that response times are only influenced by the one-step transition probabilities between the nodes (colors), or (3) that response times are influenced by the multi-step relationships between nodes in the graph structure (here indicated by node distance). An effect of unidirectional graph structure would be evident in a linear relationship between node distance and the outcome variable, whereas a bidirectional graph structure would be reflected in a U-shaped relationship between node distance and independent measures (possibly inverted, depending on the measure). The stimulus material (individual images of a bear, a dromedary, a deer, an eagle, an elephant and a fox) shown in (a) and (b) was taken from a set of colored and shaded images commissioned by Rossion and Pourtois (2004), which are loosely based on images from the original Snodgrass and Vanderwart set (Snodgrass and Vanderwart, 1980). The images are freely available from the internet at https://sites.google.com/andrew.cmu.edu/tarrlab/resources/tarrlab-stimuli under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license (CC BY-NC-SA 3.0; for details, see https://creativecommons.org/licenses/by-nc-sa/3.0/). Stimulus images courtesy of Michael J. Tarr, Carnegie Mellon University (for details, see http://www.tarrlab.org/).
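The two ring-like graph structures described in this caption can be written down as transition matrices. The following is a minimal sketch, not the authors' code: the function name `ring_transition_matrix`, the node labels, and the row convention (rows index the current stimulus, columns the next stimulus) are assumptions made for illustration. Note that panel (b) of the figure uses the transposed layout (previous trial on the x-axis). The absence of self-transitions follows from the listed probabilities already summing to one over the five other nodes.

```python
NODES = "ABCDEF"  # hypothetical labels for the six task stimuli


def ring_transition_matrix(p_cw, p_ccw, p_other=0.1, n=6):
    """Return T with T[i][j] = P(next = j | current = i) on a ring of n nodes."""
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            step = (j - i) % n  # clockwise distance from node i to node j
            if step == 0:
                T[i][j] = 0.0      # a stimulus never follows itself
            elif step == 1:
                T[i][j] = p_cw     # clockwise neighbor
            elif step == n - 1:
                T[i][j] = p_ccw    # counterclockwise neighbor
            else:
                T[i][j] = p_other  # the three remaining nodes
    return T


# Unidirectional graph: pAB = 0.7, pAF = 0.0, pAC = pAD = pAE = 0.1
T_UNI = ring_transition_matrix(p_cw=0.7, p_ccw=0.0)
# Bidirectional graph: pAB = pAF = 0.35, pAC = pAD = pAE = 0.1
T_BI = ring_transition_matrix(p_cw=0.35, p_ccw=0.35)
```

Each row of both matrices sums to one, so either matrix could be used directly to sample a stimulus sequence.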

A first analysis revealed that participants reacted faster and more accurately to transitions with

To investigate if these analyses would differ between the two graph structures (uni vs. bi) and the two graph orders (uni - bi vs. bi - uni), we split the data according to these two factors and repeated a similar analysis of LME models (for details, see Methods). These analyses again showed that models based on a non-zero γ parameter achieved better fits, confirming that participants learned higher-order relationships among the nodes in the graph structure from experiencing sequences of transitions in the task (Fig. 3e). Interestingly, data from the first graph structure were fit best by the same γ parameter

Finally, we assessed whether participants were able to express knowledge of the sequential ordering of stimuli and graph structures explicitly during a post-task questionnaire. Asked whether they had noticed any sequential ordering of the stimuli in the preceding graph task, n = 19 participants replied "yes" and n = 20 replied "no" (Fig. 3f). Of those participants who noticed sequential ordering (n = 19), almost all (18 out of 19) indicated that they had noticed ordering within the first three runs of the task (Fig. 3g), and more than half of those participants (11 out of 19) indicated that they had noticed ordering during the third task run, i.e., the run during which the graph structure was changed. Thus, sequential ordering of task stimuli remained at least partially implicit in half of the sample, and the change in the sequential order halfway through the third run of graph trials seemed to be one potential cause for the conscious realization of sequential structure. Participants were also asked to rate the transition probabilities of all pairwise sequential combinations of the six task stimuli (30 ratings in total). Interestingly, participants on average reported probability ratings that reflected bidirectional graph structure. Probabilities of transitions to clockwise and counterclockwise neighboring nodes were rated higher than rarer transitions to intermediate nodes, regardless of the order in which participants had experienced the two graph structures immediately before the questionnaire (Fig. 3h).
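To make the notion of a gradually updated SR with predictive horizon γ concrete, the sketch below applies a standard temporal-difference update to one SR row per observed transition and reads out Shannon surprise from the normalized row. This is an illustration under stated assumptions rather than the model specification from the Methods: the learning rate `alpha`, the uniform initialization, the use of the successor state in the one-hot term, and all function names are hypothetical choices.

```python
import math


def td_update_sr(M, s, s_next, alpha=0.1, gamma=0.3):
    """Apply one TD update to SR row M[s] after observing s -> s_next."""
    n = len(M)
    # Target: one-hot on the observed successor plus its discounted SR row.
    target = [(1.0 if j == s_next else 0.0) + gamma * M[s_next][j] for j in range(n)]
    M[s] = [M[s][j] + alpha * (target[j] - M[s][j]) for j in range(n)]


def learn_sr(sequence, n_states, alpha=0.1, gamma=0.3):
    """Learn an SR matrix from a sequence of state indices (uniform start; assumption)."""
    M = [[1.0 / n_states] * n_states for _ in range(n_states)]
    for s, s_next in zip(sequence, sequence[1:]):
        td_update_sr(M, s, s_next, alpha, gamma)
    return M


def shannon_surprise(M, s, s_next):
    """Surprise in bits of s_next under the row-normalized SR of state s."""
    p = M[s][s_next] / sum(M[s])
    return -math.log2(p)


# Toy example: a strictly clockwise sequence A, B, C, D, E, F, A, ...
M = learn_sr(list(range(6)) * 50, n_states=6)
print(shannon_surprise(M, 0, 1), shannon_surprise(M, 0, 2), shannon_surprise(M, 0, 3))
```

With γ > 0, surprise in this toy example grows with the number of steps separating two nodes, even for transitions that were never observed directly. That graded structure is what distinguishes a multi-step model from one based on one-step transition probabilities alone.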

Figure 3: Behavioral responses are modulated by transition probabilities and graph structure. (a) Behavioral accuracy (y-axis) following transitions with low (pij = 0.1) and high probability (x-axis; pij = 0.7 and pij = 0.35 in the uni and bi conditions, respectively) for both graph structures (panels). Colors as in Fig. 2d. The horizontal dashed lines indicate the chance level (16.67%). (b) Log response time (y-axis) following transitions with low (pij = 0.1) and high probability (x-axis; pij = 0.7 and pij = 0.35 in the uni and bi conditions, respectively) for both graph structures (panels). Colors as in panel (a) and Fig. 2d. (c) Log response times (y-axis) as a function of uni- or bidirectional (u | b) node distance (x-axis) in data from the two graph structures (colors / panels). (d) AIC scores (y-axis) for LME models fit to participants' log response time data using Shannon surprise based on SRs with varying predictive horizons (the discounting parameter γ; x-axis) as the predictor variable. (e) AIC scores (y-axis) for LME models fit to participants' log response time data using Shannon information based on SRs with varying predictive horizons (the discounting parameter γ; x-axis) as the predictor variable, separated by graph order (uni - bi vs. bi - uni; horizontal panels) and graph condition (uni vs. bi; panel colors). (f) Number of participants (y-axis) indicating whether they had noticed any sequential ordering during the graph task ("yes" or "no", x-axis). (g) Number of those participants (y-axis) who had detected sequential ordering indicating in which of the five runs of the graph task (x-axis) they had first noticed sequential ordering. (h) Ratings of pairwise transition probabilities (in %; y-axis) as a function of node distance / transition probability, separately for both graph orderings (uni - bi vs. bi - uni; panels). Boxplots in (a), (b), (c), and (h) indicate the median and IQR.
The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge (where IQR is the interquartile range, or distance between the first and third quartiles). The lower whisker extends from the hinge to the smallest value at most 1.5 * IQR from the hinge. The diamond shapes in (a), (b), (c), and (h) show the sample mean. Error bars and shaded areas in (a), (b), (c), and (h) indicate ±1 SEM. Each dot in (a), (b), (c), and (h) corresponds to averaged data from one participant. Vertical lines in (d) and (e) mark the lowest AIC score. All statistics have been derived from data of n = 39 human participants who participated in one experiment.

We next asked whether learning of map-like graph representations was accompanied by on-task replay.

First, we trained logistic regression classifiers on fMRI signals related to stimulus and response onsets in correct recall trials (one-versus-rest training; for details, see Methods; cf. Wittkuhn and Schuck, 2021). Separate classifiers were trained on data from gray-matter-restricted anatomical regions of interest (ROIs) of (a) occipito-temporal cortex and (b) pre- and postcentral gyri, which reflect visual object processing (cf. Haxby et al., 2001) and sensorimotor activity (e.g., Kolasinski et al., 2016), respectively. In each case, a single repetition time (TR) per trial corresponding either to the onset of the visual stimulus, or to participants' motor response was chosen (accounting for hemodynamic lag, time points were shifted by roughly 4 s; for details, see Methods). Note that the order of displayed animals in recall trials was random, and image displays and motor responses were separated by SRIs and ITIs of 2500 ms to reduce temporal autocorrelation (cf. Dale, 1999; Wittkuhn and Schuck, 2021).

The trained classifiers successfully distinguished between the six animals. Leave-one-run-out classification accuracy was M = 63.08% in occipito-temporal data (SD = 12.57 Fig. 4a). We also tested whether the classifiers successfully generalized from session 1 (eight recall runs) to session 2 (one recall run), and found no evidence for diminished cross-session decoding, compared to within-session, F(8.00, 655.00) = 0.95, p = 0.48 (for details see Methods). Next, we examined the sensitivity of the classifiers to pattern activation time courses by applying them to fifteen TRs following event onsets in recall trials (cf. Wittkuhn and Schuck, 2021). This analysis showed that the estimated normalized classification probability of the true stimulus class given the data peaked at the fourth TR as expected (Fig. 4b), where the probability of the true event was significantly higher than the mean probability of all other events at that time point (

To address our main questions concerning on-task neural replay, we applied the classifiers to data from the graph trials that included 10 s on-task intervals (ITIs) with only a fixation on screen (120 trials per participant in total; 24 trials per run; 4 trials per stimulus per run; 10 s correspond to 8 TRs). We expected that participants would replay anticipated upcoming events or recently experienced event sequences during these on-task intervals, and that such replay would be evident in the ordering of classification probabilities. Crucially, classifier probabilities should reflect participants' knowledge of one-step transitions, but also their map-like representations that enabled them to form multi-step expectations, as described above. For example, in unidirectional graph trials image A was followed by image B with a higher probability than the other images. Therefore, the probability of decoding image B during an on-task interval following image A should be higher than the classifier probabilities of the other four possible next images (see Fig. 2a). In addition, although images C, D, and E had equal one-step transition probabilities, we expected the corresponding classifier probabilities to be ordered so as to reflect the multi-step SR model described above. Following our previous work (Wittkuhn and Schuck, 2021), we also assumed that the ordering during the earlier phase of the on-

animal displayed in the current trial was higher compared to all other classes (Fig. 4c), and rising and falling slowly as observed in recall trials (Fig. 4d, Fig. 5a

To investigate replay of experienced or anticipated stimulus sequences, we modeled classifier probabilities of non-displayed stimuli with LME models. LME models contained predictors that reflected node distance, i.e., how likely each stimulus was to appear soon, given either a unidirectional (linear node distance) or bidirectional graph (quadratic node distance, see above). Because linear and quadratic predictors were collinear, corresponding LME models were run separately. Each model included fixed effects of ROIs (occipito-temporal vs. sensorimotor) and ITI phase (early vs. late).
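The node distance underlying these predictors can be illustrated with a few lines of code. This is a hedged sketch: the exact numerical coding of the linear and quadratic predictors is given in the Methods and is not reproduced here. The functions `clockwise_distance` and `ring_distance` are hypothetical helpers that only show the ordering each graph implies for the five non-displayed stimuli.

```python
def clockwise_distance(current, other, n=6):
    """Number of clockwise steps from current to other (unidirectional graph)."""
    return (other - current) % n


def ring_distance(current, other, n=6):
    """Shortest number of steps in either direction (bidirectional graph)."""
    d = (other - current) % n
    return min(d, n - d)


current = 0               # node A is displayed on the current trial
others = [1, 2, 3, 4, 5]  # nodes B to F, in clockwise order

# Linear ordering implied by the unidirectional graph.
linear = [clockwise_distance(current, j) for j in others]
# U-shaped ordering implied by the bidirectional graph.
u_shaped = [ring_distance(current, j) for j in others]
print(linear, u_shaped)
```

In the unidirectional case the distances increase monotonically around the ring, whereas in the bidirectional case both neighbors of the current node are equally close and the opposite node is furthest away.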

Figure 4: Classification accuracy and probabilistic classifier time courses on recall and graph trials. (a) Cross-validated classification accuracy (in %) in decoding the six unique visual objects in occipito-temporal data ("vis") and six unique motor responses in sensorimotor cortex data ("mot") during task performance. Chance level is at 16.67% (horizontal dashed line). (b) Time courses (in TRs from stimulus onset; x-axis) of probabilistic classification evidence (in %; y-axis) for the event on the current recall trial (black) compared to all other events (gray), separately for both ROIs (panels). (c) Mean classifier probability (in %; y-axis) for the event that occurred on the current graph trial (black color), shortly before the onset of the on-task interval, compared to all other events (gray color), averaged across all TRs in the on-task interval, separately for each ROI (panels). (d) Time courses (in TRs from on-task interval onset; x-axis) of mean probabilistic classification evidence (in %; y-axis) in graph trials for the event that occurred on the current trial (black) and all other events (gray). Each line in (b) and (c) represents one participant. Classifier probabilities in (b), (c), and (d) were normalized across 15 TRs. The chance level therefore is at 100/15 = 6.67% (horizontal dashed line). Gray rectangles in (d) indicate the on-task interval (TRs 1-8). The light and dark gray areas in (d) indicate early (TRs 1-4) and late (TRs 5-8) phases, respectively. Boxplots in (a) and (c) indicate the median and IQR. The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge (where IQR is the interquartile range, or distance between the first and third quartiles). The lower whisker extends from the hinge to the smallest value at most 1.5 * IQR from the hinge.
The diamond shapes in (a) and (c) show the sample mean. Error bars and shaded areas indicate ±1 SEM. Each dot corresponds to averaged data from one participant. All statistics have been derived from data of n = 39 human participants who participated in one experiment.
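The boxplot conventions described in this caption (hinges at the quartiles, whiskers reaching to the most extreme value within 1.5 * IQR of the hinge) can be computed as in the following sketch. The quartile interpolation rule is an assumption: the caption does not state which quantile definition was used, and the `"inclusive"` method of Python's `statistics.quantiles` is chosen here only for illustration. The function name `boxplot_stats` is hypothetical.

```python
import statistics


def boxplot_stats(values):
    """Return hinges, median, and whisker ends following the 1.5 * IQR rule."""
    data = sorted(values)
    # Quartiles with linear interpolation between data points (assumed rule).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the most extreme observations inside the fences.
        "lower_whisker": min(v for v in data if v >= lower_fence),
        "upper_whisker": max(v for v in data if v <= upper_fence),
    }


stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(stats)
```

In this toy data set the value 100 lies beyond the upper fence, so the upper whisker stops at 9 and 100 would be drawn as an outlying point.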

Considering data from runs in which stimulus transitions were governed by the unidirectional graph, an LME model containing the linear node distance predictor indicated a three-way interaction between node distance, ROI and phase, F(1.00, 852.00) = 7.21, p = 0.007. Post-hoc tests revealed an effect of node distance on classifier probabilities in unidirectional data in both ROIs in the early phase (TRs 1-4) of the ITIs, F(1.00, 810.00) ≥ 78.18, ps < 0.001, akin to backward replay of recently experienced stimuli.

Effects in the late phase (TRs 5-8) failed to reach significance, ps ≤ 0.11 (Fig. 5c). Considering

with our expectations about on-task multi-step replay. Although linear and quadratic node distance predictors were collinear and therefore difficult to disentangle, we next tried to assess the specificity of the above effects by testing the linear (unidirectional) node distance on bidirectional data and the quadratic (bidirectional) node distance on unidirectional data. When a linear predictor was used in an LME model of bidirectional data, only a main effect of phase (early vs. late) was observed, F(1.00, 852.00) = 11.55, p < 0.001, but no main effect of the linear predictor, F(1.00, 852.00) = 0.27, p = 0.60, or any interactions among the predictor variables, ps ≤ 0.09. Importantly, direct model comparison revealed that the linear model fit better in the unidirectional graph condition and the early phase of the ITI (see Fig. S6a-b). Using the quadratic predictor in the analysis of unidirectional data, we observed a three-way interaction between bidirectional node distance, the ROI, and the phase,

(TRs 1-4) of the ITIs, F(1.00, 810.00) ≥ 5.56, ps < 0.02 (Fig. 5c). Yet, model comparison again showed that the quadratic model fit better in the bidirectional graph condition in both TR phases (differences in AICs were between −31.02 and 162.03, see Fig. S6a-b). Hence, these analyses confirmed that the observed classifier ordering was specific to the currently experienced graph.

The above analysis assumed that replayed sequences would always follow the most likely transitions (assuming a fixed ordering of replay sequences according to the multi-step graph structure). Yet, replay might correspond more closely to a mental simulation of several possible sequences that are generated from a mental model. Consistent with this idea, the distribution of the observed sequential orders of classifier probabilities indicated a wide variety of replayed sequences (Fig. 5d, distribution over the entire ITI of 8 TRs). We next quantified how likely each possible sequential ordering of 5-item sequences was, based on the transition probabilities estimated by the SR model described above (γ was set to 0.3 in order to approximate the mean level of planning depth we had estimated based on the behavioral data, see above). To model measurement noise in the observed relative to the predicted sequences, we employed a hidden Markov model (HMM) with structured emission probabilities (for details, see Methods). This revealed that during the unidirectional runs, the frequency with which we observed a sequence in brain data during the on-task pauses was strongly related to the probability of that sequence given the unidirectional graph structure (occipito-temporal ROI: r = .51, p < 0.001; motor ROI: r = .35, p < 0.001; Fig. 5e). Unexpectedly, this was not the case for the bidirectional runs (p = 0.21 and p = 0.50, respectively; Fig. 5e).
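A simplified way to assign a probability to each of the 120 possible orderings is sketched below. It is not the HMM with structured emission probabilities used in the analysis. As a stated simplification, each ordering is scored as a first-order chain over row-normalized SR values (γ = 0.3) and the scores are renormalized across orderings. The truncated power series used to approximate the SR, the `horizon` parameter, and all function names are assumptions made for illustration.

```python
from itertools import permutations


def ring_matrix(p_cw, p_ccw, p_other=0.1, n=6):
    """Row-stochastic one-step matrix for a ring of n nodes (no self-transitions)."""
    T = [[p_other] * n for _ in range(n)]
    for i in range(n):
        T[i][i] = 0.0
        T[i][(i + 1) % n] = p_cw
        T[i][(i - 1) % n] = p_ccw
    return T


def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def normalized_sr(T, gamma=0.3, horizon=50):
    """Truncated series sum_k gamma^k T^(k+1), with rows rescaled to sum to 1."""
    power = T
    M = [row[:] for row in T]
    for k in range(1, horizon + 1):
        power = matmul(power, T)
        M = [[m + gamma ** k * p for m, p in zip(m_row, p_row)]
             for m_row, p_row in zip(M, power)]
    return [[v / sum(row) for v in row] for row in M]


def sequence_probabilities(P, current):
    """Score every ordering of the non-displayed nodes and renormalize."""
    others = [j for j in range(len(P)) if j != current]
    raw = {}
    for seq in permutations(others):
        prob, prev = 1.0, current
        for node in seq:
            prob *= P[prev][node]  # chain of SR-based transition probabilities
            prev = node
        raw[seq] = prob
    total = sum(raw.values())
    return {seq: p / total for seq, p in raw.items()}


P_UNI = normalized_sr(ring_matrix(0.7, 0.0))  # unidirectional graph
probs = sequence_probabilities(P_UNI, current=0)
best = max(probs, key=probs.get)
print(len(probs), best)
```

Under these assumptions the strictly clockwise ordering receives the highest probability in the unidirectional graph, while all other orderings keep non-zero probability because the SR assigns some mass to every node.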

We then sought to characterize the time courses of evidence for replay of sequences most likely to occur when mentally simulating a given sequence in the two graph structures. To this end, we

in earlier compared to later TRs, but was present in both occipito-temporal and motor ROIs and followed a similar dynamic with respect to early and late phases of the ITI in both ROIs.

Time courses (in TRs from ITI onset; x-axis) of mean probabilistic classification evidence (in %; y-axis) for each of the five classes that were not presented on the current trial, colored by node distance in the two graph structures (vertical panels) for both anatomical ROIs (horizontal panels). (c) Mean probabilistic classification evidence (in %; y-axis) for each node distance (colors) in the unidirectional (left vertical panel) and bidirectional (right vertical panel) graph structures averaged across TRs in the early (TRs 1-4) or late (TRs 5-8) phase (x-axis) for data in the occipito-temporal (top horizontal panels) and motor (bottom horizontal panels) ROIs. (d) Relative frequencies (y-axis) of all 120 permutations of probability-ordered 5-item sequences within each TR observed during on-task intervals, separately for both graph structures (vertical panels) and anatomical ROIs (horizontal panels). The horizontal gray line indicates the expected frequency if all sequences occurred equally often (1/120 = 0.008). Colors indicate sequence ordering from forward (e.g., 12345; dark blue) to backward (e.g., 54321; light blue) sequences. (e) Correlations (Pearson's r) between the predicted sequence probability and the observed sequence frequency (120 5-item sequences per correlation), separately for both graph structures (vertical panels) and anatomical ROIs (horizontal panels). Each dot represents one 5-item sequence. (f) Regression slopes (y-axis) relating classifier probabilities to sequential positions for both graph structures (vertical panels) and anatomical ROIs (horizontal panels).
Sequential orderings were determined based on a hidden Markov model (HMM) identifying the most likely sequences based on the two graph structures (colors). Positive and negative slopes indicate forward and backward sequentiality, respectively (cf. Wittkuhn and Schuck, 2021). (g) Mean classifier probabilities averaged across all TRs in the early and late phase (x-axis) of the ITIs, separately for both graph structures (vertical panels) and anatomical ROIs (horizontal panels). Each dot in (c) and (g) corresponds to averaged data from one participant. Error bars in (c), (d), and (g) and shaded areas in (a), (b), and (f) represent ±1 SEM. Gray rectangles in (a), (b), and (d) indicate the on-task interval (TRs 1-8). The light and dark gray areas in (a), (b), and (f) indicate early (TRs 1-4) and late (TRs 5-8) interval phases, respectively. 1 TR in (a), (b), and (f) = 1.25 s. All statistics have been derived from data of n = 39 human participants who participated in one experiment.
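The regression slopes in panel (f) summarize, for a single TR, how classifier probabilities change across the positions of a sequence. A minimal ordinary-least-squares sketch is given below. The choice of regressing probability on position, the position coding from 1 to 5, and the example values are assumptions for illustration. How the sign of the raw slope maps onto forward versus backward sequentiality depends on the coding used in the original analysis (cf. Wittkuhn and Schuck, 2021).

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


positions = [1, 2, 3, 4, 5]                      # sequential position of each item
probabilities = [0.30, 0.25, 0.20, 0.15, 0.10]   # made-up classifier probabilities at one TR
slope = ols_slope(positions, probabilities)
print(slope)
```

Computing one such slope per TR yields a time course whose sign changes indicate when the ordering of classifier probabilities reverses within the on-task interval.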

We present results showing on-task cortical replay of future sequences simulated from a mental model

2021). Through model comparisons between SR models that differed in their discounting parameter γ, i.e., their predictive horizon, we found that behavior overall was best explained by a medium-deep predictive horizon corresponding to γ = 0.3 (note that any model with γ > 0 suggests that participants formed predictive representations). When we separated the analyses by graph condition and graph order, we found that during learning of the first graph structure, planning depth was deeper, as indicated by a predictive horizon of γ = 0.55, irrespective of whether transition structure was governed by the uni- or bidirectional graph condition. This finding suggests that, upon entering a novel environment with sequential events, humans might integrate multi-step transition probabilities to a medium depth that is independent of the specific structure of the environment. Interestingly, after the transition structure changed to the second graph structure halfway through the task, this also seemed to influence the predictive horizon in a manner that was dependent on the order in which the two graphs were experienced. In participants who first learned the unidirectional and then the bidirectional graph, the best fitting model was based on an SR with a higher discount parameter of

were only told that short pauses may occur during the task, but they were not informed about the purpose of these pauses, and could not predict when the pauses would occur. It therefore seems likely that neural representations during on-task pauses reflect ongoing task representations similar to theta sequences in rodents.

One important aspect of our work is that we focused on cortical replay of predictive representations (i.e., from B to A), even though this transition actually never occurs during the task.

One remaining challenge for future research is to better understand the sequentiality of replay. We have previously shown that, at the level of classifier probabilities, sequences of neural events first elicit forward followed by backward sequentiality relative to the true sequence of events due to the dynamics

In conclusion, our results provide insights into how the human brain forms predictive representations of the structural relationships in the environment from continuous experience and samples sequences from these internal cognitive maps during on-task replay.

animals that could be expected in a public zoo. Specifically, the images depicted a bear, a dromedary, a deer, an eagle, an elephant, a fox, a giraffe, a goat, a gorilla, a kangaroo, a leopard, a lion, an ostrich, an owl, a peacock, a penguin, a raccoon, a rhinoceros, a seal, a skunk, a swan, a tiger, a turtle, and a zebra (in alphabetical order). For each participant, six task stimuli were randomly selected from the set of 24 animal images and each image was randomly assigned to one of six response buttons. This randomization ensured that any potential systematic differences between the stimuli (e.g., familiarity,

After participants entered the MRI scanner during the first study session and completed an anatomical T1-weighted (T1w) scan and a 5 min fMRI resting-state scan, they read the task instructions while lying inside the MRI scanner (for an illustration of the study procedure, see Fig. S1). Participants were asked to read all task instructions carefully (for the verbatim instructions, see Boxes S1 to S15).

They were further instructed to clarify any potential questions with the study instructor right away and to lie as still and relaxed as possible for the entire duration of the MRI scanning procedure. As part of the instructions, participants were presented with a cover story in order to increase motivation and engagement (see Box S1). Participants were told to see themselves in the role of a zookeeper in training whose main task is to ensure that all animals are in the correct cages. In all task conditions, participants were asked to always keep their fingers on the response buttons to be able to respond as quickly and as accurately as possible. The full task instructions can be found in the supplementary information (SI), translated to English (see SI, starting on page 7, Boxes S1 to S15) from the original in German (see SI, page 11).

condition, participants were told to see themselves in the role of a zookeeper in training in a public zoo whose task is to learn which animal belongs in which cage (see Box S1). During each trial, participants saw six black cages at the bottom of the screen with each cage belonging to one of the six animals.

On each trial, an animal appeared above one of the six cages. Participants were tasked to press the response button for that cage as fast and accurately as possible and actively remember the cage where the animal belonged (see Box S3 and Box S4). The task instructions emphasized that it would be very important for participants to actively remember which animal belonged in which cage and that they would have the chance to earn a higher bonus if they learned the assignment and responded accurately (see Box S5).

In total, participants completed 30 trials of the training condition. Across all trials, the pairwise ordering of stimuli was set to be balanced, with each pairwise sequential combination of stimuli presented exactly once, i.e., with n = 6 stimuli, this resulted in n * (n − 1) = 6 * (6 − 1) = 30 trials. In this sense, the stimulus order was drawn from a graph with all nodes connected to each other and an equal probability of pij = 0.2 of transitioning from one node to any other node in the graph. This pairwise balancing of sequential combinations was used to ensure that participants would not learn any particular sequential order among the stimuli. Note that this procedure only controlled for sequential order between pairs of consecutive stimuli but not higher-order sequential ordering of two steps or more.
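The counting argument behind the 30 training trials can be checked with a short enumeration. The sketch only lists the ordered stimulus pairs and the implied uniform transition probability. It does not reproduce how the trial sequence itself was generated, and the variable names are hypothetical.

```python
from itertools import permutations

N_STIMULI = 6
stimuli = list(range(N_STIMULI))  # hypothetical indices for the six animals

# Every ordered pair (previous, current) of distinct stimuli, each occurring once.
ordered_pairs = list(permutations(stimuli, 2))

# In a fully connected graph without self-transitions, each of the
# n - 1 other nodes is an equally likely successor.
p_transition = 1 / (N_STIMULI - 1)

print(len(ordered_pairs), p_transition)
```

Each stimulus appears as the first element of exactly five pairs and as the second element of exactly five pairs, which is the sense in which the pairwise ordering is balanced.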

On the first trial of the training condition, participants first saw a small black fixation cross that

(for an illustration of the study procedure, see Fig. S1). The recall condition of the task mainly served two purposes: First, the recall condition was used to further train participants on the associations between animal stimuli and response keys. Second, the recall condition was designed to elicit object-specific neural activation patterns of the presented visual animal stimuli and the following motor response. The resulting neural activation patterns were later used to train the probabilistic classifiers.

The cover story of the instructions told participants that they would be tested on how well they had learned the association between animals and response keys during the training phase (see Box S6).

In total, participants completed nine runs of the recall condition. Eight runs were completed during session 1 and an additional ninth run was completed at the beginning of session 2 in order to remind participants about the S-R mappings (for an illustration of the study procedure, see Fig. S1). Each

and fixate a white fixation cross that was presented on a black background. Acquiring fMRI resting-state data before participants had any exposure to the task allowed us to record a resting-state period that was guaranteed to be free of any task-related neural activation or reactivation. Following this pre-task resting-state scan, participants read the task instructions inside the MRI scanner and were able to clarify any questions with the study instructor via the intercom system. Participants then performed the training phase of the task (for details, see the section "Training trials" on page 21) while undergoing acquisition of functional MRI data. The training phase took circa 2 min to complete.

Following the training phase, participants performed eight runs of the recall phase of the task of circa 6 min each while fMRI data were recorded. Before participants left the scanner, field maps were acquired.

Session 2. At the beginning of the second session, participants first completed the questionnaire for MRI eligibility and the questionnaire on COVID-19 symptoms before entering the MRI scanner again.

As in the first session, the second MRI session started with the acquisition of a short localizer sequence and a T1w sequence followed by the orientation of the FOV for the functional acquisitions and the Advanced Shimming. Participants were asked to rest calmly and keep their eyes closed during this period. Next, during the first functional sequence of the second study session, participants performed a ninth run of the recall phase of the task in order to remind them about the correct response buttons associated with each of the six stimuli. We then acquired functional resting-state scans of 3 min each and functional task scans of 10 min each in an interleaved fashion, starting with a resting-state scan.

During the acquisition of functional resting-state data, participants were asked to rest calmly and fixate a small white cross on a black background that was presented on the screen. During each of the functional task scans, participants performed the graph learning phase of the task (for details, see section "Graph trials" on page 24). Importantly, half-way through the third block of the main task, the graph structure was changed without prior announcement towards the second graph structure. After the sixth resting-state acquisition, field maps were acquired and participants left the MRI scanner.

For the functional scans, whole-brain images were acquired using a segmented k-space and steady-state T2*-weighted multi-band (MB) echo-planar imaging (EPI) single-echo gradient sequence that is [...] and motor brain regions. The same sequence parameters were used for all acquisitions of fMRI data.

For each functional task run, the task began after the acquisition of the first four volumes (i.e., after 5.00 s) to avoid partial saturation effects and allow for scanner equilibrium.

The first MRI session included nine functional task runs in total (for the study procedure, see [...] functional volumes were acquired. We also recorded two functional runs of resting-state fMRI data, one before and one after the task runs. Each resting-state run was about 5 min in length, during which 233 functional volumes were acquired.

The second MRI session included six functional task runs in total (for the study procedure, see Fig. S1). After participants entered the MRI scanner, they completed a ninth run of the recall task.

As before, this run of the recall task was also about 6 min in length, during which 320 functional [...] is a long-term support (LTS) release, offering long-term support and maintenance for four years.

Preprocessing of anatomical MRI data using fMRIPrep. A total of two T1w images were found within the input BIDS data set, one from each study session. All of them were corrected for inten- [...] brightness threshold set to 75% of the median value of each run and a mask constituting the mean [...] Second, we assessed decoding performance on recall trials across the two experimental sessions.

The large majority of fMRI data that was used to train the classifiers was collected in session 1 (eight of nine runs of the recall task), but the trained classifiers were mainly applied to fMRI data from session 2 (i.e., on-task intervals during graph trials). At the beginning of the second experimental session, participants completed another run of the recall task (i.e., a ninth run; for the study procedure, see Fig. S1). This additional task run mainly served the two purposes of (1) reminding participants about the correct S-R mapping that they had learned in session 1, and (2) investigating the ability of the classifiers to correctly decode fMRI patterns in session 2 when they were only trained on session 1 data. This second aspect is crucial, as the main focus of investigation is the potential reactivation of neural task representations in session 2 fMRI data. Thus, it is important to demonstrate that this ability is not influenced by losses in decoding performance due to decoding across session boundaries.

In order to test cross-session decoding, we thus trained the classifiers on all eight runs of the recall condition in session 1 and tested their decoding performance on the ninth run of the recall condition in session 2. Classifiers trained on data from all nine runs of the recall task were subsequently applied to data from on-task intervals in graph trials in session 2. For the classification analyses in on-task intervals of the graph task, classifiers were trained on the peak activation patterns from all correct recall trials (including session 1 and session 2 data) and then tested on all TRs corresponding to the graph task ITIs.

For the anatomical ROI of motor cortex, we selected the labels of the left and right gyrus precentralis as well as gyrus postcentralis. The labels of each ROI are listed in Table 1. Only gray-matter voxels were included in the generation of the masks as BOLD signal from non-gray-matter voxels cannot be generally interpreted as neural activity (Kunz et al., 2018). Note, however, that due to the whole-brain smoothing performed during preprocessing, voxel activation from brain regions outside the anatomical mask but within the sphere of the smoothing kernel might have entered the anatomical mask (thus, in principle, also including signal from surrounding non-gray-matter voxels).

Table 1: Labels used to index brain regions to create participant-specific anatomical masks of selected ROIs based on Freesurfer's recon-all labels (Dale et al., 1999).

All statistical analyses were run inside a Docker software container or, if analyses were executed on [...] separately for both unidirectional ($p_{ij} = 0.7$ vs. $p_{ij} = 0.1$) and bidirectional ($p_{ij} = 0.35$ vs. $p_{ij} = 0.1$) data. Effect sizes (Cohen's d) were calculated by dividing the mean difference of the paired samples by the standard deviation of the difference (Cohen, 1988) and p-values were adjusted for multiple comparisons across both graph conditions and response variables using the Bonferroni correction (Bonferroni, 1936).
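The effect size and correction described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names and the response-time values are hypothetical, the sample standard deviation (n - 1) is assumed, and the Bonferroni adjustment is written in its usual form (multiply by the number of tests, cap at 1).

```python
from statistics import mean, stdev


def cohens_d_paired(x, y):
    """Cohen's d for paired samples: mean difference / SD of the differences."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / stdev(diffs)  # stdev uses n - 1 (assumption)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical per-participant mean response times (ms), for illustration only.
rt_low_prob = [520.0, 540.0, 510.0, 560.0]   # e.g. transitions with p_ij = 0.1
rt_high_prob = [500.0, 510.0, 500.0, 530.0]  # e.g. transitions with p_ij = 0.7

d = cohens_d_paired(rt_low_prob, rt_high_prob)
adjusted = bonferroni([0.01, 0.2, 0.03, 0.5])  # hypothetical raw p-values
```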

In order to examine the effect of node distance on response times in graph trials, we conducted separate LME models for data from the unidirectional and bidirectional graph structures. For LME models of response time in unidirectional data, we included a linear predictor variable of node distance (assuming a linear increase of response time with node distance; see Fig. 2d top right) as well as random intercepts and slopes for each participant. The linear predictor variable was coded such that the node distance linearly increased from −2 to +2 in steps of 1, modeling the hypothesized increase of response time with node distance from 1 to 5 (centered on the node distance of 3). For LME models of response time in bidirectional data, we included a quadratic predictor variable of node distance (assuming an inverted U-shaped relationship between node distance and response time; see Fig. 2d bottom right) as well as by-participant random intercepts and slopes. The quadratic predictor variable of node distance was obtained by squaring the linear predictor variable. We also conducted separate LME models that did not include data of the most frequent transitions in both the unidirectional and bidirectional data, but were otherwise specified in the same fashion.
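The coding of the two predictors can be made concrete with a short sketch. The function name is hypothetical; only the mapping (node distances 1 to 5, centered on 3, then squared) is taken from the description above.

```python
def node_distance_predictors(node_distance):
    """Map a node distance (1..5) to the centered linear and quadratic predictors."""
    if node_distance not in range(1, 6):
        raise ValueError("node distance must be an integer from 1 to 5")
    linear = node_distance - 3   # 1..5 -> -2..+2, centered on distance 3
    quadratic = linear ** 2      # squared linear predictor: 4, 1, 0, 1, 4
    return linear, quadratic


predictors = {d: node_distance_predictors(d) for d in range(1, 6)}
```

Because the squared predictor is smallest at the central node distance, an inverted U-shaped response-time profile would show up as a negative coefficient on this predictor.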

Behavioral modeling based on the successor representation. We modeled successor represen- [...]

whereby $\mathbf{1}_{s_{t+1}}$ is a zero vector with a 1 in the $s_{t+1}$-th position, and $M^{t}_{s_t,*}$ is the row corresponding to stimulus $s_t$ of matrix $M$. The learning rate α was arbitrarily set to a fixed value of 0.1, and the discount parameter γ was varied in increments of 0.05 from 0 to 0.95, as described in the main text.

This meant that the SR matrix would change throughout the task to reflect the experienced transitions of each participant, first reflecting the random transitions experienced during the training and recall trials, then adapting to the first experienced graph structure and later to the second graph structure.
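The trial-wise updating described here can be illustrated with the temporal-difference rule commonly used for successor representations. The update equation itself is not preserved in this text, so the form below is an assumption built from the quantities defined above (the one-hot vector for $s_{t+1}$, the row of $M$ for $s_t$, α and γ). The function name and the zero initialization are also hypothetical.

```python
def sr_td_update(M, s_t, s_next, gamma, alpha=0.1):
    """Update row s_t of the SR matrix M in place after observing s_t -> s_next.

    Assumed TD form: M[s_t] += alpha * (one_hot(s_next) + gamma * M[s_next] - M[s_t])
    """
    n = len(M)
    one_hot = [1.0 if k == s_next else 0.0 for k in range(n)]
    row_t = M[s_t][:]        # copies, so the update is also valid if s_t == s_next
    row_next = M[s_next][:]
    for k in range(n):
        td_error = one_hot[k] + gamma * row_next[k] - row_t[k]
        M[s_t][k] = row_t[k] + alpha * td_error
    return M


n_stimuli = 6
M = [[0.0] * n_stimuli for _ in range(n_stimuli)]  # assumed zero initialization
gammas = [round(0.05 * k, 2) for k in range(20)]   # 0.00, 0.05, ..., 0.95

# One observed transition from stimulus 0 to stimulus 1.
sr_td_update(M, s_t=0, s_next=1, gamma=gammas[0])
```

Running the update over a participant's full trial sequence, once per value in `gammas`, would yield one evolving SR matrix per discount level.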

In order to relate the SR models to participants' response times, we calculated how surprising each transition in the graph learning task was, assuming participants' expectations were based on the current SR on the given trial, $M^t$. To this end, we normalized $M^t$ to sum to 1, and then calculated the Shannon information (Shannon, 1948) for each trial, reflecting how surprising the just observed transition from stimulus $i$ to $j$ was given the history of previous transitions up to time point $t$:

$$I_t(i, j) = -\log_2 \tilde{m}^{t}_{i,j}$$

where $\tilde{m}^{t}_{i,j}$ is the normalized $(i, j)$-th entry of SR matrix $M^t$. Using the base-2 logarithm allowed us to express the units of information in bits (binary digits) and the negative sign ensured that the information measure was always positive or zero.
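A minimal sketch of this surprise measure follows. The function name and the example matrix are hypothetical, and the code assumes that the normalization is applied to the row of the current stimulus; the text only states that $M^t$ was normalized to sum to 1.

```python
import math


def shannon_surprise(M, i, j):
    """Surprise (in bits) of observing the transition i -> j under SR matrix M."""
    row_sum = sum(M[i])
    m_norm = M[i][j] / row_sum   # assumed: row i is normalized to sum to 1
    return -math.log2(m_norm)    # undefined (raises ValueError) if the entry is 0


# Hypothetical 3-state SR matrix, for illustration only.
M_example = [
    [0.1, 0.7, 0.2],
    [0.5, 0.25, 0.25],
    [1.0, 1.0, 2.0],
]

surprise_likely = shannon_surprise(M_example, 1, 0)    # normalized entry 0.5
surprise_unlikely = shannon_surprise(M_example, 1, 1)  # normalized entry 0.25
```

Halving the normalized SR entry adds exactly one bit of surprise, which is why the base-2 logarithm gives the measure a direct interpretation.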

The final step in our analysis was to estimate LME models that tested how strongly this trial-wise measure of SR-based surprise was related to participants' response times in the graph learning task, for each level of the discount parameter γ. LME models therefore included fixed effects of the SR-based Shannon surprise, in addition to factors of task run, graph order (uni - bi vs. bi - uni) and graph structure (uni vs. bi) of the current run, as well as by-participant random intercepts and slopes.

Separate LME models were conducted for each level of γ, and model comparison of the twenty models was performed using AIC, as reported in the main text. To independently investigate the effects of graph condition (uni vs. bi) and graph order (uni - bi vs. bi - uni), we analyzed separate LME models [...] in session 1 (Fig. 4a). The mean decoding accuracy scores of all participants were then compared to the chance baseline of 100%/6 = 16.67% using a one-sided one-sample t-test, testing the a-priori hypothesis that mean classification accuracy would be higher than the chance baseline. The effect size (Cohen's d) was calculated as the difference between the mean of accuracy scores and the chance baseline, divided by the standard deviation of the data (Cohen, 1988). These calculations were performed separately for each ROI and the resulting p-values were adjusted for multiple comparisons using Bonferroni correction (Bonferroni, 1936).

We also examined the effect of task run on classification accuracy in recall trials. To this end, we conducted an LME model including the task run as the main fixed effect of interest as well as by-participant random intercepts and slopes (Fig. 4c). We then assessed whether performance was above the chance level for all nine task runs and conducted nine separate one-sided one-sample t-tests separately per ROI, testing the a-priori hypothesis that mean decoding accuracy would be higher than the 16.67% chance level in each task run. All p-values were adjusted for 18 multiple comparisons (across nine runs and two ROIs) using the Bonferroni correction (Bonferroni, 1936). [...] we used the Bonferroni correction method (Bonferroni, 1936) to adjust for multiple comparisons of two observations. In the main text, we report the results for the peak in classification probability of the true class, corresponding to the fourth TR after stimulus onset.
The effect size (Cohen's d) was calculated as the difference between the means of the probabilities of the current versus all other stimuli, divided by the standard deviation of the difference (Cohen, 1988). [...] data, predictor variables were switched accordingly, but otherwise the LME models were conducted as before.

Finally, we also directly compared the fits of a linear and quadratic model for each graph condition, ROI, and interval phase and quantified the model comparison using AIC.

Predicting sequence probability during on-task intervals. We computed how likely it was to observe each 5-item sequence of stimuli under the assumption that participants were internally sampling from an SR model of the unidirectional or bidirectional graph structure. This was done in two steps.

First, we computed an ideal SR representation based on the true transition probabilities for each graph structure. Specifically, we defined the true transition function $T$, as given by a graph, such that each entry $t_{ij}$ reflected the true probability of transitioning from image $i$ to $j$. Following the main ideas of the SR, we then calculated the long-term visitation probabilities as the time-discounted 5-step probabilities following the Chapman-Kolmogorov Equation:

The discount rate γ was set to 0.3. We used five steps since more steps make little practical difference given the exponential discounting. The theoretical sequence probabilities for a given sequence $s$ were then computed as the product of probabilities for all pairwise transitions $(i, j)$ in the sequence,
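The first step can be sketched as follows. The two equations are not preserved in this text, so the code makes labeled assumptions: the discounted SR is taken to be the sum of $\gamma^{k-1} T^{k}$ for $k = 1, \dots, 5$ without any further normalization, and a sequence's probability is the product of the SR entries for its consecutive pairs. All function names are hypothetical. The transition probabilities are those given for the two ring graphs (0.7 / 0.0 / 0.1 and 0.35 / 0.35 / 0.1).

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ring_transition_matrix(p_cw, p_ccw, p_other=0.1, n=6):
    """True transition matrix T of a ring graph with n nodes (no self-transitions)."""
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for step in range(1, n):
            j = (i + step) % n
            if step == 1:
                T[i][j] = p_cw       # clockwise neighbor
            elif step == n - 1:
                T[i][j] = p_ccw      # counterclockwise neighbor
            else:
                T[i][j] = p_other    # the three remaining nodes
    return T


def discounted_sr(T, gamma=0.3, n_steps=5):
    """Assumed form: sum over k = 1..n_steps of gamma**(k - 1) * T**k."""
    n = len(T)
    sr = [[0.0] * n for _ in range(n)]
    T_power = [row[:] for row in T]  # T**1
    for k in range(1, n_steps + 1):
        weight = gamma ** (k - 1)
        for i in range(n):
            for j in range(n):
                sr[i][j] += weight * T_power[i][j]
        T_power = matmul(T_power, T)
    return sr


def sequence_probability(sr, sequence):
    """Product of SR entries over all consecutive pairs (i, j) in the sequence."""
    prob = 1.0
    for i, j in zip(sequence, sequence[1:]):
        prob *= sr[i][j]
    return prob


T_uni = ring_transition_matrix(p_cw=0.7, p_ccw=0.0)
T_bi = ring_transition_matrix(p_cw=0.35, p_ccw=0.35)
sr_uni = discounted_sr(T_uni)

# Starting node 0 followed by the five other nodes, clockwise vs. counterclockwise.
p_forward = sequence_probability(sr_uni, [0, 1, 2, 3, 4, 5])
p_backward = sequence_probability(sr_uni, [0, 5, 4, 3, 2, 1])
```

Under these assumptions, the clockwise ordering receives a higher value than the counterclockwise ordering in the unidirectional graph, as one would expect from the transition structure.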

Second, we approximated how likely it was to observe a sequence in the fMRI signal, given a particular sequence event in the brain. Our previous work has investigated which sequences are [...] and (2) how likely it was to observe a sequence in the fMRI signal, given a specific sequence has been reactivated in the brain. To obtain our final estimates, we multiplied these probabilities for each sequence. This yielded the total probability to observe each sequence, assuming a true sequence distribution that results from sampling from the SR model, and a noise model that relates true to observed sequences.

To examine the relationship between predicted sequences based on this approach and observed sequences in fMRI during on-task intervals, we ordered the classes by their classifier probabilities within each TR (removing the class of the stimulus shown on the current trial) to obtain the observed frequencies for each of the possible 120 5-item sequences across all TRs of the on-task intervals during the graph learning task, separately for each participant, ROI and graph condition. The resulting distribution indicated how often classifier probabilities within TRs were ordered according to the 120 sequential 5-item combinations. This distribution was then averaged across participants for each of the 120 sequences and correlated with the sequence probability based on the HMM approach described above, separately for each ROI and graph condition (using Pearson's correlation across 120 data points).
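Counting the observed orderings can be sketched as below. The function names, class labels and probability values are hypothetical, and ordering from the highest to the lowest classifier probability is an assumption, since the text does not state the direction.

```python
from collections import Counter
from itertools import permutations


def observed_sequence(probabilities, current_class):
    """Order the remaining classes from highest to lowest classifier probability."""
    others = [c for c in probabilities if c != current_class]
    return tuple(sorted(others, key=lambda c: probabilities[c], reverse=True))


def sequence_frequencies(tr_probabilities, current_class):
    """Count how often each of the 5! = 120 orderings occurs across TRs."""
    classes = [c for c in tr_probabilities[0] if c != current_class]
    counts = Counter({seq: 0 for seq in permutations(classes)})
    for probabilities in tr_probabilities:
        counts[observed_sequence(probabilities, current_class)] += 1
    return counts


# Hypothetical classifier probabilities for two TRs of one on-task interval.
tr_probabilities = [
    {"A": 0.40, "B": 0.20, "C": 0.15, "D": 0.12, "E": 0.08, "F": 0.05},
    {"A": 0.30, "B": 0.05, "C": 0.10, "D": 0.15, "E": 0.18, "F": 0.22},
]
counts = sequence_frequencies(tr_probabilities, current_class="A")
```

The resulting 120 counts per participant, ROI and graph condition are the kind of distribution that would then be averaged and correlated with the predicted sequence probabilities.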

Calculating the TR-wise sequentiality metric. To analyze evidence for sequential replay during on-task intervals in graph trials, we calculated a sequentiality metric quantified by the slope of a linear regression between the classifier probabilities and each of the 5! = 120 possible sequential orderings of a 5-item sequence in each TR, similar to our previous work (Wittkuhn and Schuck, 2021). We next separated the regression slope data based on how likely the permuted sequences were given the transition probabilities of the two graph structures in our experiment. To determine the probabilities of each possible sequential ordering of the 5-item sequences, we used the HMM approach described above to obtain the probability of all the 5! = 120 sequences, assuming a particular starting position (i.e., the event on the current trial). Next, we ranked the permuted sequences according to their probability [...] the least likely sequences based on the graph structure. We then separated the ranked sequences into quintiles, i.e., five groups of ranked sequences from the least likely to the most likely 20%. Finally, we averaged the regression slopes separately for both ROIs, the two graph structures and the early and late TRs and compared the average slope against zero (the assumption of no sequentiality [...]

Figure S1: Study procedure. (a) Session 1 started with a 5 min resting-state scan before participants read the task instructions and completed the training condition of the task. Participants then completed eight runs of the recall condition of ca. 6 min each before another 5 min resting-state scan was recorded. (b) Session 2 started with another run of the recall condition of ca. 6 min. Participants then completed all five runs of the graph learning task of about 10 min each which were interleaved with six resting-state scans of 3 min each.
Both experimental sessions started with a short localizer scan and a T1w anatomical scan and ended with the acquisition of fieldmaps. During these scans and additional preparations by the study staff (e.g., orientation of the FOV) participants were asked to keep their eyes closed. Numbers inside the rectangles indicate approximate duration of each step in minutes (mins). Colors indicate participants' task (see legend).

Figure S2: Behavioral accuracy and response times per task run in training, recall, and graph trials. Mean behavioral accuracy (in %; y-axis) per task run of the study (x-axis) in (a) training trials, (b) recall trials in session 1, (c) recall trials in session 2, and (d) graph trials in session 2. (e) Mean log response time (y-axis) per task run of the study (x-axis) in graph trials. The chance level (gray dashed line) is at 16.67%. Each dot corresponds to averaged data from one participant. Colored lines connect data across runs for each participant. Boxplots indicate the median and IQR. The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge (where IQR is the interquartile range, i.e., the distance between the first and third quartiles). The lower whisker extends from the hinge to the smallest value at most 1.5 * IQR from the hinge. The diamond shapes show the sample mean. Error bars and shaded areas indicate ±1 SEM. All statistics have been derived from data of n = 39 human participants who participated in one experiment.

Figure S3: Behavioral responses across task runs. (a) Log response times (y-axis) as a function of node distance (x-axis) in the graph structure (colors) for each task run (vertical panels) and graph order (uni - bi vs. bi - uni; horizontal panels).
(b) Proportion of errors (in %; y-axis; relative to the total number of trials per node distance and run) as a function of node distance (x-axis) in the graph structure (colors) for each task run (vertical panels) and graph order (uni - bi vs. bi - uni; horizontal panels). Boxplots indicate the median and IQR. The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge (where IQR is the interquartile range, i.e., the distance between the first and third quartiles). The lower whisker extends from the hinge to the smallest value at most 1.5 * IQR from the hinge. The diamond shapes show the sample mean. Each dot corresponds to averaged data from one participant. Error bars and shaded areas represent ±1 SEM. All statistics have been derived from data of n = 39 human participants who participated in one experiment.

Figure S4: Classifier probabilities in long ITIs of graph trials. Time courses (in TRs from the onset of the ITIs; x-axis) of classifier probabilities (in %; y-axis) per class (colors; see legend) and run (vertical panels). Substantial delayed and extended increases in classifier probability were found for the class that occurred on a given trial (horizontal panels) in both occipito-temporal brain regions (a) and motor and somatosensory cortex (b), peaking around the fourth TR following ITI onset, as expected given that the classifiers were trained on the fourth TR from event onset in fMRI data from recall trials. Each line represents averaged data across all trials of all participants. All shaded areas represent ±1 SEM. Gray rectangles indicate the long ITI (TRs 1-8). All statistics have been derived from data of n = 39 human participants who participated in one experiment.

Figure S6: Model comparison of LME models with linear vs. quadratic predictor of classifier probabilities in ITIs of graph trials.
(a) Difference in AIC values for LME models including a linear vs. a quadratic predictor for mean classifier probabilities for the two TR phases (early vs. late), the two graph conditions (uni vs. bi; vertical panels) and the two ROIs (occipito-temporal vs. motor; horizontal panels). Positive values indicate a better fit of the LME model with the linear predictor and negative values indicate a better fit of the LME model with the quadratic predictor. (b) Table of AIC values of LME models with linear and quadratic predictor (and their difference) for all combinations of ROI, graph condition, and TR phase. All statistics have been derived from data of n = 39 human participants who participated in one experiment with two sessions.

Box S1: Screen 1 of instructions for the training condition in session 1

Welcome to the study - Session 1! Please read the following information carefully. If you have any questions, you can clarify them right away with the study instructor. Please lie as still and relaxed as possible for the entire time.

Press any key to continue.

Box S2: Screen 2 of instructions for the training condition in session 1

Your task:

You are a zookeeper in training and have to make sure that all animals are in the right cages.

First you will learn in a training which animal belongs in which cage. We will now explain to you exactly how this task works.

Press any key to continue.

Box S3: Screen 3 of instructions for the training condition in session 1

Training (Part 1)

You want to become a zookeeper and start your training today. First you will learn which animal belongs in which cage. You will see six cages at the bottom of the screen. Each of the six cages belongs to one of six animals. You will select a cage with the appropriate response key. Please keep your ring, middle and index fingers on the response keys the entire time so that you can answer as quickly and accurately as possible.

Press any key to continue.

Box S4: Screen 4 of instructions for the training condition in session 1

During the training, the animals appear above their cages. Press the key for that cage as fast as you can and remember the cage where the animal belongs. Please press the correct button within 1 second. Please answer as quickly and accurately as possible. You will receive feedback if your answer was correct, incorrect or too slow. The correct cage will appear in green and the incorrect cage will appear in red.

Press any key to continue.

Box S5: Screen 5 of instructions for the training condition in session 1

It is very important that you actively remember which animal belongs in which cage. You will get a higher bonus if you remember the correct assignment. The better you remember which animal belongs in which cage, the more money you earn! You will now complete one pass of this task, which will take approximately 2 minutes.

Press any key to continue.

Box S6: Screen 1 of instructions for the recall condition in session 1

Training (part 2)

We will now check how well you have learned the assignment of the animals to their cages.

The animals will now appear in the center of the screen. You are asked to remember the correct cage for each animal, and then press the correct key as quickly as possible.

Press any key to continue.

Box S7: Screen 2 of instructions for the recall condition in session 1

This time you respond only after the animal is shown. In each round, the animal will appear first in the center of the screen. Then please try to actively imagine the correct combination of animal, cage and response key. After that, a small cross will appear for a short moment. Then the cages appear and you can respond as quickly and accurately as possible. Please respond as soon as the cages appear, not earlier.

Press any key to continue.

Box S8: Screen 3 of instructions for the recall condition in session 1

You again have 1 second to respond. Please respond again as quickly and accurately as possible.

You will get feedback again if your response was wrong or too slow. If your response was correct, you will continue directly with the next round without feedback. You will now complete 8 passes of this task, each taking about 6 minutes. In between the rounds you will be given the opportunity to take a break.

Press any key to continue.

Box S9: Screen 1 of instructions for the recall condition in session 2

Welcome to the study - Session 2!

We will check again if you can remember the assignment of the animals to their cages. The animals will appear in the center of the screen again. You are asked to remember again the correct cage for each animal and press the correct key as quickly as possible.

Press any key to continue.

Box S10: Screen 2 of instructions for the recall condition in session 2

You answer again only after the animal has been shown. In each round, the animal appears first in the center of the screen. Then please try to actively imagine the correct combination of animal, cage and answer key. After that, a small cross will first appear for a short moment.

Then the cages appear and you can answer as quickly and accurately as possible. Please respond as soon as the cages appear, not earlier.

Press any key to continue.

Box S11: Screen 3 of instructions for the recall condition in session 2

You again have 1 second to respond. Please respond again as quickly and accurately as possible.

You will get feedback again if your response was wrong or too slow. If your answer was correct, you will proceed directly to the next round without feedback. You will now complete a run-through of this task, which will again take approximately 6 minutes. After the round you will be given the opportunity to take a break.

Press any key to continue.

Box S12: Screen 1 of instructions for the graph condition in session 2

You have finished the recall run! Well done! You are now welcome to take a short break and also close your eyes. Please continue to lie still and relaxed. When you are ready, you can continue with the instructions for the main task.

Press any key to continue.

Box S13: Screen 2 of instructions for the graph condition in session 2

Congratulations, you are now a trained zookeeper! Attention: Sometimes the animals break out of their cages! Your task is to bring the animals back to the right cages. When you see an animal on the screen, press the right button as fast as possible to bring the animal back to the right cage. This time you will not get any feedback if your answer was right or wrong. The more animals you put in the correct cages, the more bonus you get at the end of the trial!

The main task consists of 5 runs, each taking about 10 minutes to complete.

Press any key to continue.

Box S14: Screen 3 of instructions for the graph condition in session 2

You have again 1 second to respond. In the main task, you again respond immediately when you see an animal on the screen. Again, please respond as quickly and accurately as possible.

Between the rounds you will again see a cross for a moment. Sometimes the cross will be shown a little shorter and sometimes a little longer. It is best to stay ready the whole time so that you can respond as quickly as possible to the next animal.

Press any key to continue.

Replay comes of age

Reverse replay of behavioural sequences in hippocampal place cells during the awake state

Hippocampal theta sequences

A map of abstract relational knowledge in the human hippocampal-entorhinal cortex. eLife, 6

Learned spatiotemporal sequence recognition and prediction in primary visual cortex

The successor representation and temporal context

Linking pattern completion in the hippocampus to predictive coding in visual cortex

Formalizing planning and information search in naturalistic decision-making

Improved optimization for the robust and accurate linear registration and motion correction of brain images

Coordinated memory replay in the visual cortex and hippocampus during sleep

Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point

Network constraints on learnability of probabilistic motor sequences

Human hippocampal theta oscillations reflect sequential dependencies during spatial planning

Local patterns to Yaroslav Halchenko

PyBIDS: Python tools for BIDS datasets

Adaptive learning is structure learning in time

Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm

Please read the following information carefully. If you have any questions, you can clarify them right away with the study instructor. Please lie as still and relaxed as possible for the entire time.

Press any key to continue.

Screen 2 of instructions for the training condition in session 1

[...] training and have to make sure that all animals are in the right cages. First you will learn in a training which animal belongs in which cage. We will now explain to you exactly [...]

Press any key to continue.

Screen 3 of instructions for the training condition in session 1

You will select a cage with the appropriate response key. Please keep your ring, middle and index fingers on the response keys the entire time [...]

Press any key to continue.

Box S19: Screen 4 of instructions for the training condition in session 1

During the training, the animals appear above their cages. Press the key for that cage as fast as you can and remember the cage [...]

You will receive feedback if your answer was correct, incorrect or too slow. The correct cage will appear in green and the incorrect cage will appear in red.

Box S20: Screen 5 of instructions for the training condition in session 1

You will get a higher bonus if you remember the correct assignment. The better you remember which animal belongs in which cage, the more money you earn! You will now complete one run of this task [...]

Press any key to continue.

Box S21: Screen 1 of instructions for the recall condition in session 1

We will now check how well you have learned the assignment of the animals to their cages. The animals will now appear in the center of the screen.

Press any key to continue.

Screen 2 of instructions for the recall condition in session 1

This time you respond only after the animal has been shown.

After that, a small cross will first appear for a short moment. Then the cages appear and you can respond as quickly and accurately as possible. Please respond only once the cages appear [...]

Press any key to continue.

Screen 3 of instructions for the recall condition in session 1

You will get feedback again if your response was wrong or too slow. If your response was correct, you will continue directly with the next round without feedback. You will now complete 8 runs of this task, each taking about 6 minutes.

Press any key to continue.

Screen 1 of instructions for the recall condition in session 2 Willkommen zur Studie -Sitzung 2!

We will check once more whether you remember the assignment of the animals to their cages. The animals will again appear in the center of the screen.

Press any key to continue.

Box S25: Screen 2 of instructions for the recall condition in session 2

Then please try to actively imagine the correct combination of animal, cage, and response key. Afterwards, a small cross will first appear for a brief moment. Then the cages will appear and you can respond as quickly and accurately as possible. Please respond only once the cages appear.

Press any key to continue.

Box S26: Screen 3 of instructions for the recall condition in session 2

You will again receive feedback if your response was incorrect or too slow. If your response was correct, the next round will start immediately without feedback. You will now complete one run of this task, which will again last approximately 6 minutes.

Press any key to continue.

Box S27: Screen 1 of instructions for the graph condition in session 2

You have completed the recall run! Well done! You are welcome to take a short break now, and you may also close your eyes. Please continue to lie still and relaxed. When you are ready, you can continue with the instructions for the main task.

Press any key to continue.

Box S28: Screen 2 of instructions for the graph condition in session 2

When you see an animal on the screen, press the correct key as quickly as possible to bring the animal back to the correct cage. This time, you will not receive feedback on whether your response was correct or incorrect. The more animals you bring to the correct cages

Press any key to continue.

Box S29: Screen 3 of instructions for the graph condition in session 2

Respond again immediately when you see an animal on the screen. Please respond again as quickly and accurately as possible. Between the individual rounds, you will again see a cross for a moment. Sometimes the cross will be shown for a slightly shorter time and sometimes for a slightly longer time. It is best to stay ready the whole time.

Press any key to continue.

Box S30: Screen 4 of instructions for the graph condition in session 2

After all the work as a zookeeper you also need rest. Before, between and after the main task we will take some measurements during which you should just lie still. During these rest periods, please keep your eyes open and look at a cross the entire time. Blinking briefly is perfectly fine.

The background of the screen will be dark during the resting phases. Please continue to lie very still and relaxed and continue to try to move as little as possible. Please try to stay awake the entire time.

Please wait for the study instructor.

The authors declare no competing interests.

Supplementary Figure S5: Classifier probabilities during graph trials are modulated by node distance in the graph structure. Classifier probabilities (in %; y-axis) as a function of the distance between the nodes in the unidirectional (first line) and bidirectional (second line) graph structure, averaged across TRs in the early (TRs 1-4) or late (TRs 5-8) phase (horizontal panels) of the long ITIs of the five runs (vertical panels) in graph trials, for data in the occipito-temporal (a, b) and motor cortex (c, d) ROIs. Each dot corresponds to data averaged across participants. Error bars represent ±1 SEM. All statistics have been derived from data of n = 39 human participants who participated in one experiment.
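The node distance referred to in the caption of Supplementary Figure S5 can be illustrated with a minimal sketch. It assumes that distance is counted as the number of steps along the ring of six stimuli: clockwise only in the unidirectional graph, and the shorter of the two directions in the bidirectional graph. The names `ring_distance` and `N_NODES` are hypothetical and do not come from the authors' analysis code, and the exact distance definition used for the figure may differ.

```python
N_NODES = 6  # six task stimuli (A-F) arranged on a ring


def ring_distance(src, dst, bidirectional, n_nodes=N_NODES):
    """Steps from node index src to node index dst along the ring.

    Assumption: in the unidirectional graph only clockwise steps count;
    in the bidirectional graph the shorter direction is taken.
    """
    clockwise = (dst - src) % n_nodes
    if not bidirectional:
        return clockwise
    counterclockwise = (src - dst) % n_nodes
    return min(clockwise, counterclockwise)


nodes = "ABCDEF"
# Distances from node A (index 0) to every other node
uni = {nodes[j]: ring_distance(0, j, bidirectional=False) for j in range(1, N_NODES)}
bi = {nodes[j]: ring_distance(0, j, bidirectional=True) for j in range(1, N_NODES)}
print(uni)  # {'B': 1, 'C': 2, 'D': 3, 'E': 4, 'F': 5}
print(bi)   # {'B': 1, 'C': 2, 'D': 3, 'E': 2, 'F': 1}
```

Under these assumptions, distances range from 1 to 5 in the unidirectional graph and from 1 to 3 in the bidirectional graph, because the ring can be traversed in either direction in the latter.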