key: cord-0262637-0o3tftjx authors: Young, Megan E.; Spencer-Salmon, Camille; Mosher, Clayton; Tamang, Sarita; Rajan, Kanaka; Rudebeck, Peter H. title: Temporally-specific sequences of neural activity across interconnected corticolimbic structures during reward anticipation date: 2020-12-18 journal: bioRxiv DOI: 10.1101/2020.12.17.423162 sha: febd4404187eb35fcc2ba9dc4b46b844d2fea604 doc_id: 262637 cord_uid: 0o3tftjx Functional neuroimaging studies indicate that interconnected parts of the subcallosal anterior cingulate cortex (ACC), striatum, and amygdala play a fundamental role in affect in health and disease. Yet, while the patterns of neural activity engaged in striatum and amygdala during affective processing are well established, especially during reward anticipation, very little is known about subcallosal ACC. Here we recorded neural activity in non-human primate subcallosal ACC and compared this to interconnected parts of basolateral amygdala and rostromedial striatum while macaque monkeys performed reward-based tasks. Applying multiple analysis approaches, we found that neurons in subcallosal ACC and rostromedial striatum preferentially signal anticipated reward using short bursts of activity that form temporally-specific sequences. By contrast, basolateral amygdala uses a mixture of both sequential and more sustained activity to signal anticipated reward. Thus, sequential patterns of neural activity across populations of neurons are engaged in affect, especially in subcallosal ACC. HIGHLIGHTS Sustained elevated firing signals anticipated reward in amygdala Temporally specific sequences of activity in subcallosal ACC signal anticipated reward Neurons exhibit more punctate encoding when tasks become more complex Temporal sequences of neural activity signal different anticipated rewards in BLA Corticolimbic areas are central to neural circuit-based accounts of affective processing (Price and Drevets, 2012) . Neural activity within human amygdala, striatum, and subcallosal anterior cingulate cortex (subcallosal ACC) is related to processing of affective stimuli and experiences, both rewarding and aversive (Lindquist et al., 2012) . Damage to or dysfunction within each of these areas in humans is associated with marked changes in personality and affect (Adolphs et al., 1995 (Adolphs et al., , 1996 . Of these areas, there has been a particular focus on subcallosal ACC because of its role in pathological changes of affect. Both hyper-and hypoactivity within subcallosal ACC are biomarkers of depression (Mayberg et al., 1999; Siegle et al., 2012) and activity normalizes following successful pharmacotherapy (Mayberg et al., 2000) or cognitive behavioral therapy treatment (Dunlop et al., 2017) . In schizophrenia, disrupted anticipatory reward responses within subcallosal ACC as well as striatum and amygdala are related to the severity of anhedonia Barch, 2010, 2012) . This indicates a role for this network of areas in modulating affective responses in anticipation of positive events such as rewards. Consistent with this, lesions (Rudebeck et al., 2014) or pharmacological overactivation of subcallosal ACC (Alexander et al., 2019) disrupt autonomic arousal in anticipation of rewards in non-human primates. Thus, across both humans and non-human primates, the circuit linking subcallosal ACC, amygdala and striatum is necessary for affective behavior and specifically responding in anticipation of reward. Despite this central role for subcallosal ACC in affective behavior, little is known about how neurons within primate subcallosal ACC signal affective stimuli and events. Unlike striatum and amygdala where the patterns of neural activity related to impending rewards and punishments are well established (Paton et al., 2006; Schultz et al., 1993) , few neurophysiological investigations of primate subcallosal ACC are available. Further, the findings are equivocal: neurons in subcallosal ACC weakly encode impeding reward (Monosov and Hikosaka, 2012) , more robustly encode reward or are more strongly modulated by changes between sleep and wakefulness (Gabbott and Rolls, 2013) . Notably, a mechanistic account for the role of subcallosal ACC in sustaining arousal in anticipation of rewards is lacking. One possibility is that sustained activity in subcallosal ACC drives arousal through interaction with hypothalamus and other areas that directly control bodily states Ongur et al., 1998) . Another possibility is that subcallosal ACC provides a dynamic and temporally specific signal to downstream areas such as striatum that is then used to sustain arousal in anticipation of reward. To arbitrate between these two potential alternatives, we recorded neural activity within subcallosal ACC, basolateral amygdala (BLA), and the part of the striatum where both of these areas project, rostromedial striatum (Cho et al., 2013; Haber et al., 2006) . Recordings were made in two macaque monkeys performing Pavlovian trace-conditioning and instrumental choice tasks for fluid rewards. The Pavlovian trace-conditioning task was adapted from one previously shown to be sensitive to lesions of subcallosal ACC (Rudebeck et al., 2014) . Here we report that: 1) neurons in BLA, but not subcallosal ACC or rostromedial striatum, signal anticipated reward through sustained activity and 2) individual neurons in subcallosal ACC as well as BLA and rostromedial striatum all exhibit temporally specific sequences of activity to signal impending reward. We trained two macaque monkeys (monkeys H and D) on a modified version of the Pavlovian trace-conditioning task previously shown to be sensitive to lesions of subcallosal ACC ( Figure 1A ). First, monkeys were extensively trained to maintain gaze on a centrally presented spot for 2.8 -3 seconds for three small drops of fluid. Then a Pavlovian trace conditioning procedure was superimposed on this fixation task. On 75% of fixation trials, stimuli of equal luminance associated with subsequent delivery of either juice (CS+ juice ), water (CS+ water ) or nothing (CS-) were presented for a second with equal probability shortly after fixation was acquired. If CS+ juice or CS+ water stimuli were presented a large drop of the corresponding reward (0.5 ml) was delivered 0.5-0.6 seconds after stimulus offset, respectively. Presentations of the CS-were associated with no reward being delivered. In the remaining 25% of trials, no CS was presented. To probe responses to unsignaled rewards, on 20% of these unsignaled trials (5% of total trials), a reward was delivered 2.1-2.7 seconds after fixation onset. These unsignaled reward trials are not included in the following analyses. While subjects performed the task, we monitored both heart rate and pupil size to look for correlates of sustained arousal in anticipation of reward. Both subjects exhibited sustained changes in autonomic arousal in anticipation of the fluid rewards that they would receive. In both monkeys, heart rate and pupil size responses were sustained in anticipation of reward ( Figure 1C and Supplemental Figure S1 ). The pattern of autonomic response differed between heart rate and pupil size measures and was unique to each subject. Heart rate was elevated to both rewardpredicting cues in in monkey D compared to CS-or neutral ( Figure 1C , left), whereas it was elevated for CS-and water predicting cues in monkey H (Figure 1C , right). In keeping with previous reports (Cash-Padgett et al., 2018) , pupil size often was constricted in anticipation of receiving juice relative to water and CS-but dilated during the period where monkeys received the reward (Supplemental Figure 1 ). In the Pavlovian task, monkeys were required to maintain gaze on a centrally located red spot for 2.8-3.0 seconds to receive three small drops of juice reward. In 75% of trials, one of three conditioned stimuli, either CS+ juice , CS+ water or CS-, were presented for 1 s (CS period) before a 0.4-0.5 second trace interval. If either the CS+ juice or CS+ water were presented then 0.5 ml of the corresponding fluid reward, either juice or water was delivered at the end of the trace interval. B) In the instrumental task, monkeys were required to maintain gaze on a centrally located spot. Then two stimuli were presented to the left and right for a second. Stimuli were the same as those presented in the Pavlovian task and had the same corresponding reward assignments. The red spot was then extinguished, and this was the cue for monkeys to make their choice by making a saccade to either stimulus. After maintaining fixation on the chosen stimulus for a brief period the corresponding reward was delivered. C) Heart rate as a function of time (percent change, mean ± SEM) for monkeys D and H on neutral, CS+ juice , CS+ water or CS-trials. Insets show the mean heart rate during the period after stimulus onset (mean ± SEM). Shaded regions or error bars show SEM. D) Choice behavior during the instrumental task. Choice latencies (left) and proportion of choices (right) during the three different trial types. Error bars show SEM. To confirm that the autonomic responses exhibited by our subjects were related to anticipated reward and that the subjects had learned the CS-outcome relationship, the same two monkeys were trained to perform an instrumental choice task ( Figure 1B) . The instrumental task used the same stimuli-reward pairings presented during the Pavlovian task and the two tasks were predominantly run one after each other in each recording session. In the instrumental choice task, both subjects chose the stimulus associated with juice over either water or nothing (CS-trials) on more than 95% of trials (Fig. 1D) . A similar preference was apparent on trials where they could only choose between the stimulus associated with water and CS-. The choice response latency, the amount of time from the go signal to the selection of one of the stimuli, was also modulated by subjects' preferences. Monkeys were faster to respond on trials where a CS+ juice stimulus was presented ( Figure 1D , effect of reward, F(1,3)=13.44, p<0.00001). Thus, monkeys exhibited differential patterns of sustained autonomic arousal to the different stimuli, but exhibited choices and response latencies that were highly consistent and which matched the motivational significance of the stimuli. Neural activity in subcallosal ACC, BLA, and rostromedial striatum during Pavlovian trace conditioning task To look for neural correlates of reward anticipation, we recorded the activity of 656 single neurons and multi-unit responses in subcallosal ACC (n=222), BLA (n=220), and rostromedial striatum (n=214) while monkeys performed the Pavlovian trace-conditioning task. Full details of recordings by area are presented in Table 1 . In all three areas the firing rate of many neurons varied according to the potential for fluid rewards. For example, the activity of the subcallosal ACC neuron in Figure 2A exhibited maximal firing to the conditioned stimulus associated with receiving the juice reward and progressively lower firing to the stimuli predicting water and no reward respectively. There was no change in firing rate when conditioned stimuli were absent on neutral trials. Neurons in BLA and rostromedial striatum similarly discriminated between the stimuli presented and the potential reward type (Figure 2A , middle and right). To quantify encoding differences between areas we conducted a sliding ANOVA analysis using a 200 ms window on a period of 3.0 seconds starting 500 ms before stimulus presentation and ending 2500 ms after stimulus presentation. This period was divided into four epochs, a baseline period (-500 -0 ms), Stimulus 1 st half (0 -500 ms), Stimulus 2 nd half (500 -1000 ms), and trace (1000 -1500 ms). Neurons were classified as encoding a task variable if they met the threshold for significance during the stimulus or trace intervals (see Experimental Procedures). What we describe below is the combination of both single and multiunit responses as there were no differences between analyses conducted on either dataset (Supplementary Figure S2 ). Figure 2: Neural activity in subcallosal ACC, BLA and rostromedial striatum during the Pavlovian task. A) Spike density functions and raster plots depicting the activity of neurons recorded within subcallosal ACC (left), BLA (middle) and rostromedial (rm) striatum (right). Neurons in subcallosal ACC and rostromedial striatum exhibit the highest firing rate to stimuli associated with receiving juice reward and progressively lower firing to the other stimuli. Note that activity returns to baseline before the trace interval in both. The neuron in BLA (middle) exhibits higher and sustained firing to the stimuli associated with juice and water. The color code shows the different trial types. B) Percent of neurons in subcallosal ACC, BLA and rostromedial striatum classified by a sliding ANOVA as encoding a the different task conditions during either the baseline period (1.0 s before the onset of stimulus), the stimulus period 1 st half (0-0.5 s after the onset of the stimulus) or the stimulus period 2 nd half ( 0.5-1 s after the onset of stimulus). C) Time course of stimulus-reward encoding in subcallosal ACC (blue), BLA (black) and rostromedial striatum (green) following the presentation of the conditioned stimuli. Red and blue dots at the top indicate significant differences in the proportion of neurons between areas (p<0.0167, Gaussian approximation test with false discovery rate correction). D) Location of recorded neurons in the subcallosal ACC (top, sagittal view), rostromedial striatum (middle, coronal view) and BLA (bottom, coronal view). Insets show the location of electrodes on T1-weighted MRIs targeting each of the three structures. Each dot represents a neuron. Red and green dots denote neurons classified as encoding trial type during the stimulus or trace interval, respectively. Similarly, data are collapsed across both subjects as again there were few differences between them (Supplemental Figure S3 ). Following presentation of the stimuli, a greater proportion of neurons in BLA encoded the different task conditions than those in either subcallosal ACC or rostromedial striatum ( Figure 2B and C). Not only were BLA neurons more likely to encode the potential for reward after the stimuli were presented, but they also did so with a faster latency (Kolmogorov-Smirnov test, p<0.001 both comparisons), potentially reflecting the bottom-up flow of information across the three areas recorded. During the trace period when animals remembered if a reward-predicting stimulus had been presented, more BLA neurons encoded impending reward delivery than in either subcallosal ACC and rostromedial striatum (Figures 2B and C) . Note that in Figure 2C the proportion of subcallosal ACC neurons encoding the task conditions rarely exceeds chance when plotted over time, whereas in Figure 2B the proportion is above chance. This indicates that, when pooled across time subcalloal ACC neurons encode reward anticipation, but they do not do so tonically or at a specific time after stimulus onset as is seen in BLA. and counts of the number of significant bins (top) for subcallosal ACC (left), BLA (middle), rostromedial striatum (right) relative to stimulus and reward onset. In the plots of PEV, neurons are sorted according to when they were classified as being significantly modulated. Lighter or 'hotter' colors are associated with higher explained variance. Because of the variable trace interval, data are temporally realigned for the reward period and there is a period which intentionally left blank/dark blue between 1500-1600ms. To address our two competing hypotheses concerning the mechanisms engaged in subcallosal ACC during reward anticipation we next looked at whether encoding was more sustained or dynamic across the three recorded areas. In contrast to neurons in BLA and rostromedial striatum, neurons in subcallosal ACC rarely used a sustained scheme to encode anticipated reward (Figure 3, bottom) . Indeed, when we quantified the length of anticipated reward encoding across the three areas, BLA neurons exhibited longer encoding in anticipation of reward than either subcallosal ACC or rostromedial striatum (Kruskall-Wallis test on number of significant bins in stimulus and trace period, c 2 >17.4, p<0.001, Figure 3 , top). There was no difference between the length of encoding in subcallosal ACC and rostromedial striatum (p>0.1). Structural MRI scans with recording electrodes in place ( Figure 2D ) and reconstructions based on these scans confirmed that recording locations were well distributed within subcallosal ACC. Thus, our data indicate that there is a hierarchy of sustained encoding for reward across the three brain areas recorded; BLA exhibits the most and subcallosal ACC exhibits the least sustained activity in anticipation of reward. Taken together, this pattern argues against the hypothesis that subcallosal ACC uses sustained activity to encode impending reward. The absence of sustained encoding in subcallosal ACC in the Pavlovian task could theoretically be because such encoding is only present in settings that require instrumental actions to obtain reward. Encoding of both reward and stimulus location in subcallosal ACC during instrumental tasks suggests that action contingency could be a key factor (Strait et al., 2016) . To address this possibility, we recorded the activity of 421 single neuron and multiunit responses in subcallosal ACC (n=109), BLA (n=144), and rostromedial striatum (n=168) while monkeys performed the instrumental choice task where they chose between pairs of stimuli presented in the Pavlovian trace conditioning task on each trial. A full breakdown of this information is provided in Table 2 . When monkeys had to choose between options and make eye movements to receive the reward, the firing rate of many neurons across the three recorded areas varied with the identity of the chosen reward, either juice or water (Figure 4A , top). Few neurons encoded the movement direction that monkeys would make or the interaction between chosen reward and movement direction (Supplementary information, Fig S4) . Similar to the Pavlovian task, neurons in BLA signaled the chosen reward earlier and to a greater extent than both subcallosal ACC and rostromedial striatum (Figure 4B & C) . Again, the length of encoding within BLA was greater than in both subcallosal ACC and rostromedial striatum (Supplementary Figure S5) . In summary, when monkeys had to make instrumental actions to receive rewards, neurons in subcallosal ACC again did not use sustained encoding to signal impending reward. Thus, the pattern of activity seen across both tasks in subcallosal ACC appears to refute the sustained encoding hypothesis. A) Spike density functions and raster plots depicting the activity of neurons recorded within subcallosal ACC (left), BLA (middle) and rostromedial (rm) striatum (right). The neuron in subcallosal ACC exhibits higher firing when the stimulus associated with water will be chosen. Neurons in rostromedial striatum and BLA exhibit the highest firing rate when juice will be chosen. The color code shows the different trial types. B) Percent of neurons in subcallosal ACC, BLA and rostromedial striatum classified by a sliding ANOVA as encoding a the trial types during either the baseline period (1.0 s before the onset of the stimuli), the stimulus period 1 st half (0-0.5 s after the onset of the stimulus) or the stimulus period 2 nd half ( 0.5-1 s after the onset of the stimulus). C) Time course of encoding in subcallosal ACC (cyan), BLA (red) and rostromedial striatum (black) following the presentation of the conditioned stimuli. Red and blue dots at the top indicate significant differences in the proportion of neurons between areas (p<0.0167, Gaussian approximation test with false discovery rate correction). Rostromedial striatum and subcallosal ACC comparison not shown as all p's >0.05. D) Location of recorded neurons in the subcallosal ACC (top, sagittal view), rostromedial striatum (middle, coronal view) and BLA (bottom, coronal view). Insets show the location of electrodes on T1-weighted MRIs targeting each of the three structures. Each dot represents a neuron. Red and green dots denote neurons classified as encoding trial type during the first and second parts of the stimulus period and reward period. If neurons in subcallosal ACC are not using a sustained encoding scheme to signal impending rewards, what is the nature of the mechanism engaged? Mounting evidence indicates that sequential patterns of activity within a population of neurons may be a fundamental principle of signaling information in both cortical and subcortical structures (Crowe et al., 2010; Harvey et al., 2012; Pastalkova et al., 2008) . Qualitatively, neurons in subcallosal ACC exhibit temporally punctate bursts of encoding that tile the time interval from stimulus onset to the end of the trace period when visualized and sorted at the population level (Figure 3) . We thus sought to determine whether these patterns of activity in subcallosal ACC as well as BLA and rostromedial striatum form temporally specific sequences. First, to confirm that the sequential patterns of encoding are present in raw firing patterns and not simply in the explained variance, we computed the difference in firing rate between each neuron's "preferred" and "non-preferred" condition in the Pavlovian task ( Figure 5A ). This approach is analogous to defining a receptive field for each neuron (e.g. Crowe et al., 2010) . Aligning these differences in activity by the center of mass revealed that the majority of neurons in subcallosal ACC and rostromedial striatum exhibited punctate changes in activity, a key feature of neuronal sequences (Rajan et al., 2016) . By contrast, neurons in BLA predominantly exhibited sustained changes in firing that were longer in duration than the other two areas ( Figure 5B , effect of group, p<0.011). If neurons are part of a temporal sequence, then individual neurons should have specific times at which they signal impending reward, and this timing should be stable irrespective of which trials are analyzed. Thus, we next looked at whether neurons across the three areas had specific time points at which they encoded impending reward and how robust these timepoints were. To ensure statistical power, we selected all neurons that were classified as encoding impending reward in the Pavlovian task that had been recorded for at least 70 completed trials. We then randomly sampled fifty percent of these trials and determined the point of maximal encoding using the previously described sliding ANOVA (subcallosal ACC: n=35; BLA: n = 94; rostromedial striatum: n=64). The distribution of maximal encoding time points for each neuron for 500 such subsamples in subcallosal ACC, BLA and rostromedial striatum is shown in Figure 5C . Neurons predominantly had specific times at which they encoded different trial types. Extending this analysis, for each neuron we next compared whether maximum encoding times for one half of the trials drawn at random were different to the other half of trials and repeated this process 500 times. For 91% of neurons there was no difference in maximum encoding times between the two sets of trials (subcallosal ACC 2/35=5.71%, BLA 8/94=8.51%, rostromedial striatum 1/64=1.56%). Mean firing rate difference between preferred and anti-preferred trial type for neurons classified as encoding the different trial types in subcallosal ACC, BLA and rostromedial (rm) striatum. Each line represents a single neuron. Firing rate differences are center of mass aligned. Lighter or 'hotter' colors are associated with higher differences in firing rate. Solid colored lines are the mean across the population of neurons in each area. B) Mean of the firing rate difference between preferred and anti-preferred trial types for subcallosal ACC (blue), BLA (red) and rostromedial striatum (yellow). C) Density of maximum significant encoding times across 500 runs for individual neurons classified as encoding the different trial types in subcallosal ACC (left), BLA (middle), and rostromedial striatum (right). Each colored line represents a single neuron and neurons were sorted according to the time of maximum density. D) Distribution of matching indexes for subcallosal ACC (left), BLA (middle), and rostromedial striatum (right) across the 500 runs. Shuffled data are in grey and actual data are in corresponding colors. Taken together, these analyses suggest that individual neurons in all areas encoded reward with a robust preference for a particular moment in time -a pre-requisite for sequences as the organizing principle of neural encoding. To further test the hypothesis that neurons, especially those in subcallosal ACC, are part of a temporal sequence, we computed a matching index (MI) adapted from (Ji and Wilson, 2007) for each area. Briefly, we used the time of maximum encoding of each neuron and numbered each neuron according to its position in the sequence compared to all others. This numbering can be used to compare two sequences of activity in the same population of neurons, by determining whether each possible pair of neurons appears in the same or opposite order between the two sequences. The resulting matching index, which is equal to the difference between the number of Each line represents a single neuron and neurons were sorted according to their maximum density. B) Length of encoding time by area and task for all neuron classified as significantly encoding in the Pavlovian and instrumental tasks. Encoding time is defined as the period of time over the threshold for classification. Darker and lighter colors signify the neuron recorded in Pavlovian and instrumental tasks, respectively. C) Length of encoding time in the Pavlovian task plotted against the instrumental task for neurons recorded in both tasks. Diagonal line represents unity. Individual points denote single neurons and are colored according to brain area. pairs in the same and opposite orders divided by the total number of pairs, is equal to 1 for a perfect match, -1 for a perfect replay of the sequence in opposite order, and 0 for a perfect mixture between same and opposite orderings. This analysis revealed that neurons in all three areas exhibited stable ordering that was above chance ( Figure 5D , subcallosal ACC, MI=0.2196, p<0.035; BLA MI=0.3064, p<0.0004; rostromedial striatum MI=0.2188, p<0.005). These patterns were consistent across monkeys (Supplemental Figure S6 ). Because we recorded across multiple days it is possible that cells in each area encode at a specific time point each day and that we are therefore sorting this across-day variance into a sequence. To address this, we looked for individual recording sessions where more than two neurons in each area were recorded simultaneously. Figure 6A shows the diverse time points of maximal encoding of 6, 5 and 3 simultaneously recorded neurons from single recording sessions in subcallosal ACC, BLA and rostromedial striatum, respectively. Thus, within all areas recorded, most notably subcallosal ACC, neurons exhibited different points of maximal encoding and this was temporally specific, even within a single session. Finally, a recent modeling paper hypothesized that increasing task complexity should be associated with a shift to more sequential-like encoding (Orhan and Ma, 2019) . We therefore compared the amount of time neurons in all areas encoded the different conditions across the two tasks ( Figures 6B and C) . At the level of the whole population of neurons across all areas, there was a trend for a reduction in the amount of statistically significant encoding time from the Pavlovian to the instrumental task (effect of task, F(1,2)=3.06, p=0.08). This difference was most apparent in the BLA (task by area interaction, F(2,446)=5, p<0.0071) likely as their activations changed from more persistent to sequential in nature. When we only considered neurons that were recorded in both tasks, the same effect was seen, as there was a reduction in the length of encoding from the Pavlovian to the instrumental task ( Figure 6C , t-test on difference in length of encoding, p<0.001). Taken together, these analyses conducted on neurons classified as encoding the different trial types indicate that neurons in all three recorded areas exhibit the features of temporallyspecific neural sequences. All of the foregone analyses identifying temporally-specific sequences of neural activity were conducted on neurons previously classified as encoding the different trial types in the Pavlovian task. If sequential patterns of activity are present, however, then it is possible that a greater proportion of neurons are engaged in sequences, as it is the timing, not the number of spikes that signal different aspects of the task. Thus, we next investigated whether distinct neuronal sequences could be identified in pseudopopulations of all of the recorded neurons, not just those classified as encoding the different task conditions. Further, we sought to discern whether unique sequences of activity existed for different trial types. To do this, we applied an unsupervised computational approach, seqNMF, that has been validated in data from rodents and song birds to extract recurring neural sequences from the activity of populations of neurons (Figure 7 , Mackevicius et al., 2019) . Unlike principal component analysis and clustering, which are typically Figure 7 : Schematic of factor identification in neural activity data using seqNMF. An example set of neural activity recorded from 124 neurons in BLA of monkey D (top right). Data are from the 6 trials from each of the three different cued trial types, CS+ juice (red), CS+ water (yellow), and CS-(purple). Convolutional NMF models this neural activity data as a neuron by time matrix over the sum of K matrices. Here each K matrix is a single trial. Matrices constructed for each trial are the product of two components: a non-negative matrix Wk of dimension N by L that stores a sequential pattern of the N neurons at L time lags. An additional vector which stores the temporal loadings, Hk, denotes the times when each factor is present in the data as well as the relative amplitude of that factor. SeqNMF is a refinement of convolutional NMF as it optimizes a penalty term, λ that discards redundant factors in the data identified by convolutional NMF (Mackevicius et al., 2019) . λ is optimized by determining the point at which correlation and reconstruction costs cross (denoted by dotted line where reed and blue curves cross). In this example, optimizing λ and setting it to 1 times the cross over point leads to factor 3 being discarded as this factor is correlated with factor 1. When applied to the data recorded in BLA, seqNMF identified two significant factors corresponding to sequences. Factor reconstructions are shown (bottom right). limited to modeling synchronous activity, this technique can model extended spatiotemporal patterns of activity (putative sequences, referred to as factors), and is more scalable than existing techniques aimed at capturing cross-correlations between pairs of neurons. SeqNMF is an extension of convolutional non-negative matrix factorization (convNMF, Smaragdis, 2006) and prevents redundant factors from being extracted from the data by adding a cross orthogonality cost term, λ (λ optimization, Figure 7) . Fitting seqNMF to neural activity time-series data outputs n unique factors and the amplitude of those individual factors when they are detected, referred to as temporal loadings. We applied seqNMF to the firing rates of neurons from 0 to 1500 ms after stimulus onset that had at least 20 instances for each of the CS+ juice , CS+ water , and CS-trials. Analyses were conducted separately for each subject to control for individual differences. This meant that a total of 115, 124, and 100 neurons in monkey D and 59, 68, and 83 in monkey H from subcallosal ACC, BLA, and rostromedial striatum were combined into pseudopopulations and entered into this analysis, respectively. Critically, neurons were not selected based on whether they had previously been classified as encoding the different trial types. Spike trains were smoothed with a 150ms half Gaussian and parameters were optimized to extract unique and reproducible factors over 20 training and testing runs. Control analyses were conducted on spikes from the last 1500 ms of the ITI and on temporally shuffled data using identical parameters to those applied to within trial segments. Using seqNMF we identified unique factors related to temporally-specific sequences of neural activity in all three areas during the stimulus and trace periods of the task (Figures 8A, D, E, and I). Importantly, factors were very rarely identified within the ITI (unfilled bars, Figures 8 B, F, and J) or when the spike times were shuffled (Supplemental Figure S7) . The amplitude of each factor, as measured by the temporal loadings, and the number of factors identified were, however, different between subcallosal ACC, BLA, and rostromedial striatum indicating that sequences signaled different aspects of the task. A single factor could be reliably discerned from the neurons recorded in subcallosal ACC in both monkeys, and in monkey D where the most neurons were available this factor was closely associated with the anticipation of juice delivery (Figure 8A and B). Figure 8A (left, Factor 1) shows a single factor identified in the activity of neurons in subcallosal ACC from monkey D (left, Factor 1). Reconstruction of pseudoensemble spiking activity for this factor on six held-out CS+ juice , CS+ water , and CS-trials are shown on the right, with the time resolved temporal loadings shown above each trial (green density plots, Figure 8A , right). In monkey D, this factor had the highest amplitude on CS+ juice trials and was equal on CS-and CS+ water trials (mean temporal loadings, effect of trial type, F(2,38)=152.1, p<0.0001, post hoc tests, CS+ juice > CS+ water or CS-, p<0.002, Figure 8C , right). For monkey H, there was no difference between the temporal loading for the individual trial types (effect of trial type, F(2,32)=0.74, p>0.48, Figure 8C , left). In BLA, two unique factors were identified in the neurons recorded in monkey D ( Figures 8D-H) . As can be seen in the reconstructions of individual trials, the first factor was seen on all trial types and was highly reproducible (Figure 8D and G). For this factor the amplitude was equivalent on CS+ juice , CS+ water , and CS-trials (effect of trial type, F(2,44)=0.62, p>0.5, Figure 8H ) indicating that it was not associated with the expected outcome (juice, water, or nothing), but instead simply registered the appearance of a stimulus. Reconstruction of the second factor shows that it was predominantly associated with CS+ juice trials and was largely absent on CS water or CStrials ( Figure 8E , effect of trial type, F(2,44)=145.4, p<0.0001, Figure H , right). This is similar to what was seen in subcallosal ACC, where the amplitude of the single sequence in monkey D was highest on CS+ juice trials ( Figure 8A and C). In the neurons recorded from BLA in monkey H a single factor was most often recovered ( Figures 8F-G, left) . Unlike monkey D, the temporal loadings for this factor did not discriminate between the different trial types (effect of trial type, F(2,56)=1.24, p<0.3, Figures 8F-H, right) . This difference between monkeys may be related to the number of neurons available for analysis; approximately half the number were available from monkey H as compared to monkey D (124 versus 68). Note that there were no differences in the proportion of BLA neurons encoding the different trial types between monkeys D and H (Supplemental Figure S3 ). This indicates that it was not simply the case that neurons in monkey H did not discriminate between the different conditions. In summary, the factors identified in BLA indicate that there are two distinct sequences of neural activity and the second factor identified in monkey D signals impending juice delivery. In rostromedial striatum, where equivalent numbers of neurons were available for analysis a single highly reproducible factor was observed in monkeys D and H (Figures 8I-J) . Reconstruction of this factor on spiking data shows that it was closely associated with anticipation of reward delivery (Figure 8I, left) . Temporal loadings for this factor were higher on rewarded CS+ juice and CS+ water trials compared to CS-trials (effect of trial type, both Fs>8.78, p<0.0001, posthoc comparison, CS+ juice /CS+ water >CS-, Figure 8K) . In monkey H, the temporal loadings for this factor not only differentiated between rewarded and unrewarded trials, but also discriminated between CS+ juice and CS+ water (CS+ juice versus CS+ water , X 2 =5.16, p<0.05). This pattern potentially indicates that this factor scaled with the motivational significance of the available rewards. The same analysis in monkey D failed to reach statistical significance (CS+ juice versus CS+ water , X 2 <1, p>0.3). Thus, the factor identified in rostromedial striatum discriminates rewarded from unrewarded conditions in both monkeys. In summary, factors corresponding to sequences of neural activity were identified in pseudopopulations of neurons from subcallosal ACC, BLA and rostromedial striatum using unsupervised computational methods. These analyses suggest that: 1) all recorded neurons, not just those classified as encoding the different trial types, contribute to task-specific neural sequences through the specific timing of their activity, and 2) unique sequences of activity were engaged to signal different aspects of anticipated reward. This was especially true in the subcallosal ACC and BLA of monkey D where specific sequences of neural activity were related to anticipated juice delivery. Corticolimbic structures are essential for autonomic and behavioral responses in anticipation of receiving reward. Despite its central role in modulating reward anticipation Rudebeck et al., 2014) , the mechanisms engaged in one part of this network, subcallosal ACC have been hard to discern. Here we recorded activity within subcallosal ACC as well as interconnected parts of amygdala and rostromedial striatum in order to gain a circuit level understanding of how these areas signal reward anticipation. We specifically looked for whether activity in subcallosal ACC either exhibited persistent or more dynamic signals related to anticipated reward. We found that neurons in BLA, but not subcallosal ACC or rostromedial striatum, exhibited sustained encoding in anticipation of reward in two different behavioral tasks (Figures 2-4) . By contrast, reward-related activity of single neurons in subcallosal ACC as well as rostromedial striatum was primarily characterized by temporally-specific punctate bursts of activity prior to reward delivery. When visualized at the population level, these bursts form a temporally-specific sequence that tile the time until reward is delivered (Figures 3, 5 and 6) . Similar sequential patterns of activity were also seen in BLA, but they were intermixed with sustained activity (Figures 3 and 5) . Population-level sequential patterns of activity scaled with the complexity of the task being performed, in keeping with predictions from recent theoretical accounts (Figure 6 , Orhan and Ma, 2019) . To further characterize the neural sequences in all of the areas recorded, we first used a previously developed matching index (Ji and Wilson, 2007) and then an unsupervised computational approach, seqNMF. During trials where predictive stimuli were presented, single factors corresponding to sequences were recovered from the activity of the whole population of neurons recorded from subcallosal ACC and rostromedial striatum (Figure 8) . In the subject with the most neurons available for analysis, two distinct factors were recovered in BLA, one associated with the time of visual presentation of any stimulus and one closely associated with CS+ juice trials. Notably, in the same subject the factor identified in subcallosal ACC was also closely associated with CS+ juice trials. In summary, our data reveal that temporally-specific sequences of activity encode reward anticipation across corticolimbic structures. However, whereas both sustained and sequential patterns of encoding were evident in BLA, subcallosal ACC and rostromedial striatum almost exclusively used sequential patterns of encoding. Evidence from human neuroimaging studies in healthy individuals (Harrison et al., 2009; Mayberg et al., 1999) , people with psychiatric disorders (Dowd and Barch, 2012; Keedwell et al., 2005) , anatomy (Ghashghaei et al., 2007) , as well as studies in non-human primates Rudebeck et al., 2014) all point to a central role for subcallosal ACC in modulating affect and specifically reward expectation. Despite this seemingly central role for subcallosal ACC in affective responses, prior neurophysiology studies looking for signals of anticipated reward in macaque subcallosal ACC have reported relatively weak encoding (Monosov and Hikosaka, 2012) , by comparison to amygdala (Paton et al., 2006) , striatum (Schultz et al., 1993) , and other parts of medial frontal cortex . Our data indicate that one reason that these previous studies failed to find strong encoding of impending reward is that they were looking for signals time-locked to reward-predicting stimuli and/or persistent encoding. Indeed, taking the same analysis approach of time-locking activity to the onset of predictive stimuli we similarly found that the proportion of neurons in subcallosal ACC signaling impending reward was substantially lower and less sustained in nature than in BLA (Figures 2-4) . Instead, neurons in subcallosal ACC appear to encode impending reward at specific times before the reward is delivered using punctate bursts of activity. Across recorded neurons, these bursts tile the period until reward is delivered (Figure 5) , a pattern that would have been obscured by standard analytical approaches that average activity across neurons in a population (for example see Figure 2C ). reward is signaled using both sustained and sequential encoding schemes (Enel et al., 2020; Kimmel et al., 2020) . This is qualitatively similar to what we found in BLA, where both encoding schemes were intermingled. By contrast, in subcallosal ACC as well as rostromedial striatum sequential encoding was more prominent. The existence of separable encoding schemes is theorized to play different roles in signaling future events. Persistent encoding states could be required to organize behaviors in anticipation of events providing a fixed point for other areas to sample from (Perich and Rajan, 2020) . By contrast, sequential encoding could be essential for precisely timing events of interest such as rewards or punishments as well as providing a more efficient way to provide population level-signals to downstream areas (Mello et al., 2015) . That subcallosal ACC only exhibits one of these schemes further indicates that sustained and sequential encoding might be generated by different functional interactions between brain areas, a point we take up below. Sequential patterns of activity across a population of neurons are hypothesized to be an evolutionarily conserved motif for signaling information in neural circuits (Hahnloser et al., 2002) . While temporally-specific sequences of neural activity have been most closely associated with hippocampus (for example, MacDonald et al., 2011; Pastalkova et al., 2008) and other medial temporal lobe structures (Vaz et al., 2020) , task-related sequences of neural activity have also been characterized in other cortical and subcortical areas (Crowe et al., 2010; Harvey et al., 2012; Mello et al., 2015) . Our results extend these findings to macaque frontal cortex and limbic system (Figures 5, 6, and 8) , further bolstering evidence that neural sequences are a fundamental encoding scheme in the brain. By characterizing sequences using an unbiased computational method, seqNMF, we were able to identify sequential patterns of activity across the whole population of neurons in the three recorded areas, not just those classified as encoding the different task conditions. In rostromedial striatum we identified a single unique factor, corresponding to a sequence in both subjects. Notably, the mean temporal loadings, the amplitude of this factor, discriminated between rewarded and non-rewarded trials (Figure 8 ) and in monkey H it discriminated all three conditions ( Figure 8K ). This pattern of effect means that the amplitude of this factor scales with the motivational significance of the available rewards as measured by reaction times in the instrumental task ( Figure 1D) . Prior work identified sequences of neural activity in striatum related to movement and timing behavior (Mello et al., 2015) . In the Pavlovian trace-conditioning task used here, subjects likely use task events such as stimulus on and offset to predict when rewards will be delivered. Thus, our findings potentially indicate that the motivational significance of anticipated rewards scales the magnitude of the sequences in rostromedial striatum. In BLA, we identified two distinct factors in monkey D, but only one in monkey H. The first factor appears to be related to the presentation of visual stimuli in the task, as the amplitude was no different between the different conditions in both monkeys. The second factor, however, was observed primarily on trials where juice would be delivered. Previous reports, as well as the present results have found that the magnitude of activity of single neurons in BLA signal different types of rewards or punishments (Paton et al., 2006; Schoenbaum et al., 1998) . Our findings indicate these single neuron correlates are paralleled by population level sequential patterns of activity that are also engaged by the specific reward outcomes that will follow a predictive stimulus. Whether these sequences are truly related to predictions about individual types of reward or simply reflect the animals' preferred outcome is beyond the scope of the present experiments as we only tested two rewards. The difference between the number of factors that could be recovered from BLA pseudopopulations in the two subjects is notable, but not without precedent. Prior work that applied seqNMF to simultaneously recorded populations of neurons from hippocampus found marked differences in the number of factors identified between subjects that had been trained on the same task (Mackevicius et al., 2019; Pastalkova et al., 2008) . Divergence in the number of factors recovered from BLA in each subject could be caused by underlying differences in the time points at which the neurons available for analysis signal impending reward as opposed to this encoding not being present in the activity of neurons analyzed. Indeed, encoding of the different task conditions was robustly observed in BLA neurons in both subjects (Supplementary Figure 3) , but many fewer neurons were available for analysis in monkey H. Alternatively, it could reflect individual differences in subjective responses to the anticipated rewards; such differences were evident in the different autonomic responses exhibited by the two subjects ( Figure 1C) . In subcallosal ACC, a single factor was identified in both subjects using seqNMF. In monkey D where the most data were available for analysis the amplitude of the factor discriminated between CS+ juice trials and the other trial types ( Figure 8C) . A similar pattern was not observed in monkey H, although we note that this could be due to fewer neurons being available for analysis. Alternatively, the divergence in the timing and amplitude of the sequences observed in subcallosal ACC between monkeys D and H could help to explain the individual differences in autonomic responses to the stimuli that were observed in the Pavlovian task ( Figure 1C ). Confirming the direct relationship between sequential patterns of activity in subcallosal ACC and autonomic responding will, however, require simultaneous monitoring of neural sequences in subcallosal ACC and autonomic activity, a level of resolution not available here as it necessitates simultaneous recording of large populations of neurons. Why temporally-specific sequential patterns of activity in subcallosal ACC are engaged to signal impending reward in both Pavlovian and instrumental tasks is unclear. Their presence in both tasks indicates that they are not simply caused by a narrow set of experimental parameters or task settings. Instead, they are likely caused by specific circuit-level interactions. Unlike other parts of frontal cortex, subcallosal ACC receives a dense monosynaptic input from hippocampus (Aggleton et al., 2015; Barbas and Blatt, 1995) . Rostromedial striatum and BLA similarly receive direct input from hippocampus (Aggleton, 1986; Friedman et al., 2002) . As previously noted, neural sequences are prominent in hippocampus during spatial navigation and more recently have been observed during associative learning (Taxidis et al., 2020) . In non-human primates, crosstalk between hippocampus and subcallosal ACC is essential for adaptive patterns of emotional responding Zeredo et al., 2019) , potentially indicating that interaction with hippocampus is what drives sequential patterns of activity in subcallosal ACC as well as BLA and striatum. Determining how this specific neural mechanism contributes to affective regulation in more complex task settings will likely be essential determining how cortico-amygdala-striatal circuits contribute to psychiatric disorders such as anxiety and depression. Two experimentally naïve adult rhesus macaques (Macaca mulatta), one female (Monkey D) and one male (Monkey H), served as subjects. They were 5.6 and 11.0 kg and 16 and 11 years old, respectively, at the beginning of training. Animals were pair housed when possible, kept on a 12-h light-dark cycle, tested during the light part of the day, and had access to food 24 hours a day. Throughout training and testing each monkey's access to water was controlled for 6 days per week. All procedures were reviewed and approved by the Icahn School of Medicine Animal Care and Use Committee. Monkeys were trained to perform Pavlovian and instrumental visually-guided tasks for fluid rewards. Visual stimuli were presented on a 19-inch monitor screen located 56 cm in front of the monkey's head. During all training and testing sessions monkeys sat in a custom primate chair with their heads restrained. Choices on the screen were indicated by gaze location. Eye position and pupil size were monitored and acquired at 90 frames per second with a camera-based infrared oculometer (PC-60, Arrington Research, Scottsdale, AZ). Heart rate was monitored by recording electrocardiogram (ECG) using a dedicated Biopac recording amplifier (ECG100C Goleta, CA) with the matching IPS100c power supply and MAC110c leads. Fluid rewards were delivered to the monkey's mouths using a custom-made pressurized juice delivery system (Mitz, 2005) controlled by a solenoid. All trial events, reward delivery, and timing were controlled using the open source behavioral control software Monkey Logic (https://monkeylogic.nimh.nih.gov). Through successive approximation, subjects were trained to maintain their gaze on a centrally-located red spot to receive a fluid reward. The delay to reward was then increased until both subjects were reliably fixating for 4 seconds. A Pavlovian trace conditioning procedure was subsequently superimposed within this fixation task. Subjects were trained to initiate a Pavlovian trial by fixating on a red fixation point superimposed on a centrally-located gray square for 800-1000 ms. Then, one of four trial types was randomly presented with equal frequency. On CS+ juice , CS+ water and CS-trials, a conditioned stimulus predicting juice (0.5 ml), water (0.5 ml) or no reward was presented behind the fixation spot for 1000 ms in the center of the screen, followed by a 500-600 ms trace interval in which the CS again was replaced by the neutral gray square. After the trace interval, the predicted Pavlovian reward was delivered. On neutral trials, instead of a conditioned stimulus, the neutral gray square remained onscreen after fixation and through the rest of the trial. In one-fifth of such trials, an unsignaled juice reward (0.5 ml) was delivered 1300-1700 ms after fixation. Finally, on all trials, a set of three small (0.1ml) juice rewards was delivered as a reward for successful trial completion 2000 ms after stimulus onset (300-700 ms after Pavlovian reward depending on trial type and trace interval length). Intertrial intervals varied from 2500-4000 ms. Breaking fixation at any point during a trial triggered a timeout interval of 4000 ms before the start of the next trial. Conditioned stimuli varied between subjects and consisted of luminance-matched gray shapes, covering 1.10° of visual angle for monkey D and 2.45° for monkey H. Subjects also performed an instrumental choice task. Here they made choices between pairs of stimuli from the Pavlovian task ( Figure 1B) . Subjects initiated a trial by fixating on a central red spot for 500 ms, after which two of the three stimuli (drawn by random selection) were shown simultaneously to the left and right side of the fixation spot. All three stimuli were shown with equal frequency with equal probability on each side of the screen. After a variable offer period ranging from 600-2200 ms, the fixation spot turned off, indicating that the monkey had 700 ms to select one of the stimuli by making a saccade and holding fixation on the chosen stimulus for 50 ms. The chosen reward (0.5 ml of juice or water or no reward) was then delivered. Failure to hold fixation when instructed or failure to choose a stimulus within 700 ms of the fixation spot being turned off led to a 4000 ms timeout interval before the next trial. Stimuli for the instrumental task used the same shapes that each subject learned in the Pavlovian task and covered 2.45° of visual angle for both subjects. All surgical procedures were conducted in a dedicated operating room under aseptic conditions. Anesthesia was induced with ketamine hydrochloride (10 mg/kg, i.m.) and maintained with isoflurane (1.0-3.0%, to effect). Monkeys received isotonic fluids via an intravenous drip. We continuously monitored the animal's heart rate, respiration rate, blood pressure, expired CO2 and body temperature. Monkeys were treated with dexamethasone sodium phosphate (0.4 mg/kg, i.m.) and cefazolin antibiotic (15 mg/kg, i.m.) for one day before and for one week after surgery. At the conclusion of surgery and for two additional days, animals received ketoprofen analgesic (10-15 mg/kg, i.m.); ibuprofen (100 mg) was administered for five additional days. Each monkey was implanted with a titanium head restraint device and then, in a separate surgery, a plastic recording chamber (27 x 36 mm) was placed over the exposed cranium of the left frontal lobe. The head restraint device and chambers were fixed to the cranium with either titanium screws alone or titanium screws plus a small amount of dental acrylic. Just prior to recording the cranium overlying the recording targets was removed under general anesthesia. ECG was recorded using surface electrodes (Kendall Medi-Trace 530 Series Foam Electrodes, Covidien, Mansfield, MA) placed on the back of the neck. Placements are varied from day-to-day to avoid irritating a single patch of skin. ECG signals were then low-pass filtered (4 pole Bessel) at 360 Hz, then digitized at 1000 samples/s and recorded on the neurophysiology recording amplifier along with pupil size measurements. Both ECG and pupil size were then processed to remove artifacts (periods of high noise where R-peaks could not be discerned or blinks) using previously validated methods (Mitz et al., 2017) . After processing all ECG and pupil signals were visually inspected for quality and sessions where there were prolonged periods of noise, signal drop out, or amplifier saturation were excluded from further analysis. This screening process meant that there were 78 and 66 sessions for monkeys H and D available for analysis, respectively. Potentials from single neurons were isolated with tungsten microelectrodes (FHC, Inc. or Alpha Omega, 0.5-1.5 M at 1 KHz) or 16-channel multi-contact linear arrays (Neuronexus Vector array) advanced by an 8-channel micromanipulator (NAN instruments, Nazareth, Israel) attached to the recording chamber. Spikes from putative single neurons were isolated online using a Plexon Multichannel Acquisition Processor and later verified with Plexon Off Line Sorter on the basis of principal-component analysis, visually differentiated waveforms, and interspike intervals. Subcallosal ACC recordings were made on the medial surface of the brain ventral corpus callosum ( Figure 2D ). All recordings in subcallosal ACC were between the anterior tip of the corpus callosum and the point where structural MRIs were unable to distinguish the medial cortex from the striatum corresponding to roughly the most anterior point of the septum. This corresponds to between 31.5 to 26.5 mm anterior to the interaural plane. Neurons in basolateral amygdala were primarily recorded in basal and lateral nuclei at least 4 mm ventral to the first neurons that could be discerned after crossing the anterior commissure. Neurons in amygdala were recorded between 22 and 18.5 mm anterior to the interaural plane. Recordings in striatum were made in the rostro-medial segment corresponding to the zone where subcallosal and basal amygdala projections overlap (Cho et al., 2013; Haber et al., 2006) . Neurons in striatum were recorded between 28 to 24 mm anterior to the interaural plane. Recording sites were verified by T1-weighted MRI imaging of electrodes at selected locations in subcallosal ACC, basolateral amygdala and striatum after recordings had been completed ( Figure 2D ). Neurons were initially isolated before monkeys were engaged in any task. However, in some cases neurons were isolated during the Pavlovian task and these neurons were then recorded in the instrumental version of the task. Other than the quality of isolation, there were no selection criteria for neurons. Autonomic and behavioral analysis: To control for across session difference in baseline pupil size and heart rate, data for each session were processed using custom software to remove artifacts and blinks (Mitz et al., 2017) and then normalized using a z-score prior to analyses (z = x-μ / σ). Pupil dilation was that baseline corrected to a 500-ms period extending 250 ms before to 250ms after the onset of CS. This baseline procedure meant that pupil size had the maximum time to stabilize before the presentation of conditioned stimuli (earliest responses are typically after 250ms). Autonomics data were analyzed separately for each monkey using mixed-effects ANOVA models with trial type as a main effect and session as a random effect. For the analysis of instrumental choices and response times, choices were registered when monkeys made a saccade to one of the two stimuli presented on the screen. For each session, the mean number of choices for each condition was computed and analyzed. Choice response latencies for each trial type were computed by taking the amount of time from the go signal to the selection of one of the stimuli (Figure 1) . Latencies for each trial type were averaged across sessions and analyzed separately for each monkey using a mixed effects ANOVA with trial type as a main effect and session as a random effect. Neurophysiology analysis: For analysis of neurophysiology data, only neurons that were stably recorded for at least 35 trials were analyzed. Task conditions with too few trials for reliable analysis were not included. Specifically, for the Pavlovian task, the unexpected reward condition was not analyzed as these made up only 5% of total trials. In the instrumental choice task, trials where the CS-was chosen were not included in the analysis as 180/421 units had no CS-choice trials. Overall there was only a mean of 2.18 trials with CS-chosen per session. Task-related neurons were identified by fitting a sliding ANOVA model to the firing rates of each neuron in 201ms bins advancing in 10ms increments. The model included a single parameter of trial type (4 levels) for the Pavlovian task, while the instrumental task was analyzed with a model including chosen stimulus (2 levels, water or juice), response (2 levels, left or right side), and their interaction. Neurons were considered to have significant encoding if the sliding ANOVA returned a p<0.007 for 3 consecutive time bins. These significance criteria were chosen to keep false classification close to 2% or lower for all ANOVA analyses. To compare between the proportion of neurons classified in each area we used the Gaussian approximation test on proportions. The threshold for significance, p=0.0167 was adjusted using a false discovery rate correction procedure for time-series data (Hochberg and Benjamini, 1990) . Duration of encoding of conditioned stimuli in each area was compared using a Kruskall-Wallis test on the number of significant bins during the combined stimulus and trace period in the Pavlovian task and the stimulus period in the instrumental task. To compare the duration and shape of the patterns of neuronal responses to conditioned stimuli in subcallosal ACC, BLA and rostromedial striatum, we first established the "preferred" and "anti-preferred" conditions for each neuron using the change in z-scored firing rate from baseline after stimulus presentation. First, we calculated a single mean firing rate for each neuron for each condition during the stimulus on (instrumental) or stimulus on and trace (Pavlovian) periods. Then we determined which condition had the maximum (preferred) firing rate and which condition had the minimum (anti-preferred) firing rate for each neuron. Second, we calculated the difference in firing rates for each neuron between the preferred and anti-preferred conditions. Taking the mean firing rates from each condition across trials for 10ms bins (as above), we Z-scored the mean firing rates for the preferred trials and the anti-preferred trials to the baseline activity 500ms before stimulus on. Then the sequence of Z-scored mean firing rates for the anti-preferred condition were subtracted from the Z-scored mean firing rates for the preferred condition to obtain a sequence of preferred vs. anti-preferred firing rate differences for each neuron. Third, we identified the longest segment of differences in Z-scored mean firing rates that was greater than a threshold of 3. These segments were aligned around their centers of mass and averaged for each area to get a mean response shape for each area. Finally, a Kolmogorov-Smirnov (K-S) test was performed to show that the distribution of differences in mean firing rate around the peak firing rate bin differed between brain regions. In order to assess the stability of peak encoding times for each neuron, a 50% subsample of trials was taken randomly for each neuron 500 times and an ANOVA run on each subsample as above. The time bin with the highest percent explained variance was recorded for each subsample. This analysis included only neurons with significant reward encoding in the entire trial set that were stably recorded for a minimum of 70 trials, to ensure a minimum of 35 trials in each subsample. As a measure of the stability of this peak encoding point, we compared the distribution of maximum encoding points for all sampled 50 percent of trials to that of the remaining 50 percent for all 500 samples for each neuron. Distributions of maximum encoding points were compared using a K-S test for each neuron. To further test whether neurons are encoding in a particular sequence in relation to each other, we adapted a form of the matching index analysis used in (Ji and Wilson, 2007) using the sets of peak encoding points from 50% subsamples found above. For each area, the set of peak encoding time points for each neuron were assembled as a pseudopopulation of peak encoding times, and these were compared to the set of peak encoding times for the remaining 50% of trials, for 500 sets of two pseudopopulations in total. For each of these pseudopopulations, the peak encoding time for each neuron was used to place them in order, and the order of peak times calculated in the subsampled 50% of trials was compared to those from the remaining 50%. The matching index was computed for each of the 500 subsamples by comparing whether the peak encoding time for each neuron was in the same or a different order as compared to every other neuron between the subsampled and remaining 50% of trials. The matching index MI is given as MI = (m -n) / (m + n) where m is the number of pairs that are in the same order and n is the number of pairs of neurons that were in a different order. The overall matching index for an area was calculated as the mean of the matching indices from each of the 500 subsamples. In addition, to confirm that results from the matching index analysis were not simply related to neuron population size, we repeated the matching index analysis 500 times using subsamples of 35 neurons from BLA and rostromedial striatum, the size of the smallest population (subcallosal ACC). Here subsampling in BLA and rostromedial striatum did not appreciably alter the results. To assess the significance of the calculated matching indices, we calculated the distribution of indices that would be calculated from randomized data for comparison against the null hypothesis that the order of peak encoding times is randomly arranged. We generated a set of random numbers the length of the number of neurons analyzed in each area and calculated the matching indices for 1,000,000 random iterations. The p value for each experimental matching index was then calculated as the probability that the matching index would be generated from randomized data of that length. SeqNMF: We characterized the sequences in the population neural activity recorded from subcallosal ACC, rostromedial striatum and BLA data using an unsupervised machine learning approach that applies non-negative matrix factorization (NMF) to neural data, seqNMF (Mackevicius et al., 2019) . For this analysis, we used spiking activity on trials where stimuli were presented (CS+ juice , CS+ water and CS-) starting from when the stimuli were presented to the end of the trace intervals in the Pavlovian task (1500ms total). We included all neurons, irrespective of whether they had been classified as encoding the trial types in the Pavlovian task, that had at least 20 trials for each of the CS+ juice , CS+ water , and CS-trials (60 trials total). This yielded 115, 100, and 124 total neurons from subcallosal ACC, rostromedial striatum, and BLA respectively in subject D, and 59, 83, and 68 respectively in subject H. Zero padding was added between trial segments to prevent detection of spurious factors across trial segment boundaries. Neural activity was smoothed with a 150ms half-Gaussian window. Smoothing with a 50ms half-Gaussian was also conducted, yielding largely similar results. SeqNMF differs from other forms of non-negative matrix factorization, such as convolutional NMF (Smaragdis, 2006) in its inclusion of a penalty term in the cost function called the x-ortho penalty, the magnitude of which is modulated by the hyperparameter, λ. Higher values of λ suppress all but one factor to zero amplitude, leading to large reconstruction error if more than one ground-truth sequence exists, while an excessively low λ may cause seqNMF to produce multiple factors to explain different instances of the same sequence. As a result, determining the true number of ground-truth sequences depends on appropriate selection of λ. We implemented the method described in Mackevicius et al., (2019) to select an appropriate λ: we ran 20 iterations of seqMNF at each of a range of values of λ, obtained the average reconstruction error and correlation cost terms at each λ, and chose λ near the crossover point λ0 which yielded an approximate balance between the two, either λ =1 λ0, or λ =1.5 λ0. In the event that most runs returned a single sequence, we also compared our sequences extracted from optimized seqNMF with sequences extracted by setting λ=0 (i.e. non-penalized convolutional NMF). If a similar result was obtained with λ=0, this value was used. As a result of this optimization procedure, λ was set to 0, 0, and 1 for subcallosal ACC, rostromedial striatum, and BLA, respectively. Once we had chosen an appropriate λ for each area, we re-ran seqNMF for 20 iterations at the selected λ. In order to assess the significance of extracted factors, we employed a 75%/25% training/testing split balanced to have an equal number of trials for each of the three trial types for both the training set and testing set. As a result, seqNMF was trained on 45 trials and tested on 15 trials for each region. For each iteration, we test for factor significance using the held-out test dataset by comparing the skewness of the distribution of overlaps (W'X, where W = factor tensor and X=data) between each factor and the held-out data with the null case. The overlap of a factor with data will be high at timepoints during which the sequence occurs, leading to a high skewness of the W'X distribution. From these 20 iterations, we also obtained 1) the temporal loadings for each factor across the time epoch of interest, 2) an average value for reconstruction power (% variance in the data explained by a given factorization), and 3) the distribution for the number of significant factors found. To compare within and between each factor for each trial type, we conducted mixed-effect ANOVA on the mean of the temporal loadings (H) across the runs for each factor separately with trial type as a main effect and run as a random effect. Control analyses for seqNMF were conducted using identical parameters to those optimized for the stimulus and trace interval. First, seqNMF was conducted on the last 1500 ms of the ITI of each trial. This analysis thus maintained the trial progression and spiking statistics but lacked the trial event structure. Second, to maintain the trial events in the data, we shuffled the spike times of all neurons on all trials, then smoothed the data, and reran seqNMF. For both analyses we compared the number of significantly extracted factors to those recovered from nonshuffled data during the stimulus and trace intervals (Figures 8 and S7 ). In the Pavlovian task, monkeys were required to maintain gaze on a centrally located red spot for 2.8-3.0 seconds to receive three small drops of juice reward. In 75% of trials, one of three conditioned stimuli, either CS+ juice , CS+ water or CS-, were presented for 1 s (CS period) before a 0.4-0.5 second trace interval. If either the CS+ juice or CS+ water were presented then 0.5 ml of the corresponding fluid reward, either juice or water was delivered at the end of the trace interval. B) In the instrumental task, monkeys were required to maintain gaze on a centrally located spot. Then two stimuli were presented to the left and right for a second. Stimuli were the same as those presented in the Pavlovian task and had the same corresponding reward assignments. The red spot was then extinguished, and this was the cue for monkeys to make their choice by making a saccade to either stimulus. After maintaining fixation on the chosen stimulus for a brief period the corresponding reward was delivered. C) Heart rate as a function of time (percent change, mean ± SEM) for monkeys D and H on neutral, CS+ juice , CS+ water or CS-trials. Insets show the mean heart rate during the period after stimulus onset (mean ± SEM). Shaded regions or error bars show SEM. Because of the variable trace interval, data are temporally realigned for the reward period and there is a period which intentionally left blank/dark blue between 1500-1600ms. are from the 6 trials from each of the three different cued trial types, CS+ juice (red), CS+ water (yellow), and CS-(purple). Convolutional NMF models this neural activity data as a neuron by time matrix over the sum of K matrices. Here each K matrix is a single trial. Matrices constructed for each trial are the product of two components: a non-negative matrix Wk of dimension N by L that stores a sequential pattern of the N neurons at L time lags. An additional vector which stores the temporal loadings, Hk, denotes the times when each factor is present in the data as well as the relative amplitude of that factor. SeqNMF is a refinement of convolutional NMF as it optimizes a penalty term, λ that discards redundant factors in the data identified by convolutional NMF (Mackevicius et al., 2019) . λ is optimized by determining the point at which correlation and reconstruction costs cross (denoted by dotted line where reed and blue curves cross). In this example, optimizing λ and setting it to 1 times the cross over point leads to factor 3 being discarded as this factor is correlated with factor 1. When applied to the data recorded in BLA, seqNMF identified two significant factors corresponding to sequences. Factor reconstructions are shown (bottom right). Fear and the human amygdala Cortical systems for the recognition of emotion in facial expressions A description of the amygdalo-hippocampal interconnections in the macaque monkey Complementary Patterns of Direct Amygdala and Hippocampal Projections to the Macaque Prefrontal Cortex Fractionating Blunted Reward Processing Characteristic of Anhedonia by Over-Activating Primate Subgenual Anterior Cingulate Cortex Prefrontal cortical projections to longitudinal columns in the midbrain periaqueductal gray in macaque monkeys Correlates of economic decisions in the dorsal and subgenual anterior cingulate cortices Topographically specific hippocampal projections target functionally distinct prefrontal areas in the rhesus monkey Opposing pupil responses to offered and anticipated reward values Cortico-amygdala-striatal circuits are organized as hierarchical subsystems through the primate amygdala Rapid sequences of population activity patterns dynamically encode task-critical spatial information in parietal cortex Anhedonia and emotional experience in schizophrenia: neural and behavioral indicators Pavlovian reward prediction and receipt in schizophrenia: relationship to anhedonia Functional Connectivity of the Subcallosal Cingulate Cortex And Differential Outcomes to Treatment With Cognitive-Behavioral Therapy or Antidepressant Medication for Major Depressive Disorder Stable and dynamic representations of value in the prefrontal cortex Comparison of hippocampal, amygdala, and perirhinal projections to the nucleus accumbens: combined anterograde and retrograde tracing study in the Macaque brain Increased neuronal firing in resting and sleep in areas of the macaque medial prefrontal cortex Sequence of information processing for emotions based on the anatomic dialogue between prefrontal cortex and amygdala Reward-related cortical inputs define a large striatal region in primates that interface with associative cortical connections, providing a substrate for incentive-based learning An ultra-sparse code underlies the generation of neural sequences in a songbird Inflammation causes mood changes through alterations in subgenual cingulate activity and mesolimbic connectivity Choice-specific sequences in parietal cortex during a virtual-navigation decision task More powerful procedures for multiple significance testing Coordinated memory replay in the visual cortex and hippocampus during sleep The neural correlates of anhedonia in major depressive disorder Value and choice as separable and stable representations in orbitofrontal cortex The brain basis of emotion: a meta-analytic review Hippocampal "time cells" bridge the gap in memory for discontiguous events Unsupervised discovery of temporal sequences in high-dimensional datasets Reciprocal limbic-cortical function and negative mood: converging PET findings in depression and normal sadness Regional metabolic effects of fluoxetine in major depression: serial changes and relationship to clinical response A scalable population code for time in the striatum Using pupil size and heart rate to infer affective states during behavioral neurophysiology and neuropsychology experiments Regionally distinct processing of rewards and punishments by the primate ventromedial prefrontal cortex Prefrontal cortical projections to the hypothalamus in macaque monkeys A diverse range of factors affect the nature of neural representations underlying short-term memory Internally Generated Cell Assembly Sequences in the Rat Hippocampus The primate amygdala represents the positive and negative value of visual stimuli during learning Rethinking brain-wide interactions through multi-region "network of networks" models Neural circuits underlying the pathophysiology of mood disorders Recurrent Network Models of Sequence Generation and Memory A role for primate subgenual cingulate cortex in sustaining autonomic arousal Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning Reward-related activity in the monkey striatum and substantia nigra Toward clinically useful neuroimaging in depression treatment: prognostic utility of subgenual cingulate activity for determining depression outcome in cognitive therapy across studies, scanners, and patient characteristics Convolutive speech bases and their application to supervised speech separation Neuronal selectivity for spatial positions of offers and choices in five reward regions Replay of cortical spiking sequences during human memory retrieval Hippocampal Interaction With Area 25, but not Area 32, Regulates Marmoset Approach-Avoidance Behavior Glutamate Within the Marmoset Anterior Hippocampus Interacts with Area 25 to Regulate the Behavioral and Cardiovascular Correlates of High-Trait Anxiety