title: Dynamic influences on static measures of metacognition
authors: Desender, Kobe; Vermeylen, Luc; Verguts, Tom
date: 2020-10-29
journal: bioRxiv
DOI: 10.1101/2020.10.29.360453

Humans differ in their capability to judge the accuracy of their own choices via confidence judgments. Signal detection theory has been used to quantify the extent to which confidence tracks accuracy via M-ratio, often referred to as metacognitive efficiency. This measure, however, is static in that it does not consider the dynamics of decision making. This could be problematic because humans may shift their level of response caution to alter the tradeoff between speed and accuracy. Such shifts could induce unaccounted-for sources of variation in the assessment of metacognition. Evidence accumulation frameworks, by contrast, treat decision making, including the computation of confidence, as a dynamic process unfolding over time. We draw on evidence accumulation frameworks to examine the influence of response caution on metacognition. Simulation results demonstrate that response caution influences M-ratio. We then tested and confirmed that this was also the case in human participants who were explicitly instructed to focus on either speed or accuracy. We next demonstrated that this association between M-ratio and response caution was also present in an experiment without any reference to speed, and replicated the latter finding in an independent dataset. In contrast, when the data were analyzed with a novel dynamic measure of metacognition, which we refer to as v-ratio, there was no effect of speed-accuracy tradeoff in any of the three studies. These findings have important implications for research on metacognition, such as questions about domain-generality, individual differences in metacognition, and its neural correlates.

Table 1. Correlation table of the parameters from the model simulation. Note: *** p < .001.

Experiment 1: Explicit speed-accuracy instructions affect static but not dynamic measures of confidence

Next, we tested these model predictions in an experiment with human participants. We recruited 36 human participants who performed a task that has been widely used in the study of evidence accumulation models: discrimination of the net motion direction in dynamic random dot displays 21. Participants were asked to decide whether a subset of dots was moving coherently towards the left or the right side of the screen (see Figure 2A). The percentage of dots moving coherently towards the left or right side of the screen, which controls decision difficulty, was held constant throughout the experiment at 20%. After their choice, and a blank screen, participants indicated their level of confidence using a continuous slider. Critically, in each block participants received the instruction either to focus on accuracy ("try to decide as accurately as possible") or to focus on speed ("try to decide as fast as possible"). Consistent with the instructions, participants were faster in the speed condition than in the accuracy condition, M_speed = 727 ms versus M_accuracy = 1014 ms, t(35) = 4.47, p < .001, and numerically more accurate in the accuracy condition than in the speed condition, M_accuracy = 75.6% versus M_speed = 73.8%, t(35) = 1.63, p = .111.
Participants were also more confident in the accuracy condition than in the speed condition, M_accuracy = 70 versus M_speed = 67, t(35) = 3.57, p = .001 (see Figure 2D).

To shed further light on the underlying cognitive processes, we fitted these data with the evidence accumulation model described in Figure 1A. The basic architecture of our model was a drift diffusion model (DDM), in which noisy perceptual evidence accumulates over time until a decision boundary is reached; afterwards, evidence continues to accumulate for a specified amount of time 19. In addition to drift rate, decision boundary, and non-decision time, our model featured a free parameter controlling the strength of post-decision evidence accumulation (v-ratio, the ratio between post-decision drift rate and drift rate) and two further parameters controlling the mapping from p(correct) onto the confidence scale (see Methods). Overall, our model fitted the data well: it captured the distributional properties of both reaction times and decision confidence (see Figure 2C). As a first sanity check, we confirmed that decision boundaries differed between the two instruction conditions, M_speed = 1.40 versus M_accuracy = 1.77, t(35) = 4.60, p < .001, suggesting that participants changed their decision boundaries as instructed. Non-decision time was also shorter in the speed condition than in the accuracy condition, M_speed = 309 ms versus M_accuracy = 390 ms, t(35) = 3.19, p = .003. Drift rates did not differ between the instruction conditions, p = .368. There was a small but significant difference between the two instruction conditions in the two additional parameters controlling the idiosyncratic mapping between p(correct) and the confidence scale, reflecting that confidence judgments in the accuracy condition were slightly higher, t(35) = 2.506, p = .017, and less variable, t(35) = 2.206, p = .034, than in the speed condition.

We next focused on metacognitive accuracy in both conditions (see Figure 2B). In line with the model simulations, M-ratio was significantly affected by the speed-accuracy tradeoff instructions, M_speed = 0.84 versus M_accuracy = 0.66, t(35) = 2.26, p = .030. Moreover, beyond these between-condition differences, we observed significant negative correlations between M-ratio and decision boundary both in the accuracy condition, r(34) = -.36, p = .030, and in the speed condition, r(34) = -.53, p < .001. Consistent with the notion that metacognitive accuracy should not be affected by differences in decision boundary, v-ratio did not differ between the instruction conditions, p = .938.

Experiment 2: Spontaneous differences in response caution relate to static but not dynamic measures of metacognitive accuracy

Although Experiment 1 provides direct evidence that changes in decision boundary affect M-ratio, it remains unclear to what extent this is also an issue in experiments without speed stress. Notably, in many metacognition experiments participants do not receive the instruction to respond as fast as possible. Nevertheless, it remains possible that participants implicitly settle on a certain level of response caution.
For example, a participant who is eager to finish the experiment quickly might adopt a lower decision boundary than a participant who is determined to perform the experiment as accurately as possible, leading to natural across-subject variation in decision boundaries. To examine this possibility, in Experiment 2 we analyzed data from an experiment in which participants (N = 63) did not receive any specific instructions concerning speed or accuracy. Participants decided which of two boxes contained more dots, and afterwards indicated their level of confidence on a continuous scale (see Figure 3A). The same evidence accumulation model as before was used to fit these data, and again the model captured both the reaction time and the decision confidence distributions (Figure 3B). Consistent with our model simulations, model fits showed a positive correlation between M-ratio and v-ratio, r(61) = .21, p = .092, although this correlation was not statistically significant (Figure 3C). However, we again observed that M-ratio correlated negatively with the fitted decision boundary, r(61) = -.34, p = .006, whereas v-ratio did not, r(61) = -.19, p = .129.

Figure 3. A. Note that participants did not receive any instructions concerning speed or accuracy. B. Distribution of reaction times and confidence for Experiment 2, using the same conventions as in Figure 2. C. The data of Experiment 2 showed a non-significant positive relation between M-ratio and v-ratio (r = .21). F. The data of Experiment 3 showed a significant positive relation between M-ratio and v-ratio (r = .38) and a significant negative correlation between M-ratio and decision boundary (r = -.18), but no correlation between v-ratio and decision boundary (r = -.04).

Experiment 3: Replication in an independent dataset

To assess the robustness of our findings, in Experiment 3 we aimed to replicate our analysis in an independent dataset with high experimental power. To achieve this, we searched the Confidence Database 32 for studies with high power (N > 100) in which a two-choice response time (2CRT) task was performed with separate confidence ratings given on a continuous scale. Moreover, because our fitting procedure was not designed for multiple levels of difficulty, we focused on studies with a single difficulty level. We identified one study that satisfied all these constraints (Figure 3D; Prieto, Reyes & Silva, under review). Their task was highly similar to the one reported above, and their high experimental power (N = 204) assured a very sensitive test of our claims. Consistent with the previous analysis, model fits on this independent dataset showed a positive and statistically significant correlation between M-ratio and v-ratio, r(202) = .38, p < .001, suggesting that both variables capture shared variance reflecting metacognitive accuracy (see Figure 3F). We again observed that M-ratio correlated negatively with the fitted decision boundary, r(202) = -.18, p = .009, whereas no relation with decision boundary was found for v-ratio, r(202) = .04, p = .535.

Discussion

The study of metacognitive accuracy has grown quickly in recent years. Crucial to this endeavor is a method to objectively quantify the extent to which participants are able to detect their own mistakes, regardless of decision strategy.
We here report that a commonly used static measure of metacognitive accuracy (M-ratio) depends strongly on the decision boundary, reflecting decision strategy, that is set for decision making. This was the case in simulation results, in an experiment that explicitly manipulated the tradeoff between speed and accuracy, and in two datasets in which participants received no instructions concerning speed or accuracy. We propose an alternative, dynamic measure of metacognitive accuracy (v-ratio) that does not depend on the decision boundary.

Caution is warranted with static measures of metacognition

The most important consequence of the current findings is that researchers should be cautious when interpreting static measures of metacognitive accuracy, such as M-ratio. In the following, we discuss several examples where our finding has important implications. In the last decade, considerable work has investigated to what extent the metacognitive evaluation of choices is a domain-general process. These studies typically require participants to perform different kinds of tasks, and then examine correlations in accuracy and in metacognitive accuracy between these tasks 3,11-14,33. For example, Mazancieux and colleagues 11 asked participants to perform an episodic memory task, a semantic memory task, a visual perception task, and a working memory task. In each task, participants rated their level of confidence after each decision. The results showed that whereas correlations between accuracy on these different tasks were limited, there was substantial covariance in metacognitive accuracy across domains. Because participants in this study received no time limit to respond, it remains unclear whether this finding reflects a domain-general metacognitive monitor, or instead a domain-general level of response caution that caused these measures to correlate. Another popular line of investigation has been to unravel the neural signatures supporting metacognitive accuracy 13,14,34-36. For example, McCurdy et al. observed that both visual and memory metacognitive accuracy correlated with precuneus volume, potentially pointing towards a role of the precuneus in both types of metacognition. It remains unclear, however, to what extent differences in response caution might be responsible for this association. Although differences in response caution are usually related to the pre-SMA and anterior cingulate 24,25, there is some suggestive evidence linking the precuneus to response caution 37. Therefore, it is important that future studies on the neural correlates of metacognition rule out the possibility that their findings are driven by response caution. Finally, our study has important consequences for investigations of differences in metacognitive accuracy between specific (e.g., clinical) groups. For example, Folke and colleagues 17 reported that M-ratio was reduced in a group of bilinguals compared to a matched group of monolinguals. Interestingly, they also observed that bilinguals had, on average, shorter reaction times than monolinguals, but this effect was unrelated to the group difference in M-ratio.
Because these authors did not formally model their data using evidence accumulation models, however, it remains unclear whether this RT difference results from a difference in decision boundary and, if so, to what extent it explains the observed group difference in M-ratio. In a similar vein, individual differences in M-ratio have been linked to psychiatric symptom dimensions, and more specifically to a symptom dimension related to depression and anxiety 5. At the same time, individual differences in response caution are known to relate to a personality trait known as need for closure 38. Given that need for closure is, in turn, related to anxiety and depression 39, it remains possible that M-ratio is only indirectly related to these psychiatric symptoms, via response caution.

The potential of dynamic measures of metacognition

To control for potential influences of response caution on measures of metacognitive accuracy, one approach could be to estimate the decision boundary and examine whether the relation between metacognitive accuracy and the variable of interest remains when controlling for the boundary (e.g., using mediation analysis). A more direct approach, however, is to estimate metacognitive accuracy within a dynamic framework, thus taking differences in response caution into account. In the current work, we proposed v-ratio (the ratio between post-decision drift rate and drift rate) as such a dynamic measure of metacognitive accuracy, following the observation that post-decision drift rate indexes how accurate confidence judgments are 19,20. In both simulations and empirical data, we observed a positive relation between v-ratio and M-ratio, suggesting that they capture shared variance. Critically, v-ratio was not correlated with decision boundary, suggesting that it is unaffected by differences in response caution. Thus, our dynamic measure of metacognition holds promise as a novel approach to quantify metacognitive accuracy while taking into account the dynamics of decision making.

In our approach, we allowed the drift rate and the post-decision drift rate to dissociate. This proposal is in line with the view of metacognition as a second-order process, whereby dissociations between confidence and accuracy can arise because of noise or bias at each level 40-42. However, when formulating post-decision drift rate as a continuation of evidence accumulation, it remains underspecified exactly which evidence the post-decision accumulation process is based on. It has been suggested that participants can continue to accumulate evidence that was still in the processing pipeline (e.g., in a sensory buffer) even after a choice was made 30,43. However, it is unlikely that this is the only source, particularly in tasks without much speed stress. Another possibility is that, during the post-decision process, participants resample the stimulus from short-term memory 44. Because memory is subject to decay, dissociations between the post-decision drift rate and the drift rate can arise.
Other sources of discrepancy might be contradictory information quickly dissipating from memory 45, which should lower metacognitive accuracy, or better assessment of encoding strength given more time 46, which should increase metacognitive accuracy.

To sum up, we provided evidence from simulations and empirical data that a common static measure of metacognition, M-ratio, is confounded with response caution. We proposed an alternative measure of metacognition based on a dynamic framework, v-ratio, which is insensitive to variations in caution and may thus be suitable for studying how metacognitive accuracy varies across subjects and conditions.

Methods

Computational model

Data were simulated for 100 observers with 500 trials each. For each simulated observer, we randomly selected a value for the drift rate (uniform distribution between 0 and 2.5), the decision boundary (uniform distribution between .5 and 3), the non-decision time (uniform distribution between .2 and .6), and the v-ratio (uniform distribution between 0 and 1.5; see below for details). To estimate meta-d', data are needed for both possible stimuli (i.e., to estimate bias); therefore, for half of the trials we multiplied the drift rate by -1. Finally, we fixed the values of the starting point (z = .5), the within-trial noise (σ = 1), and the post-decision processing time (1 s).

Fitting procedure

We coded an extension of the drift diffusion model (DDM) that simultaneously fits choices, reaction times, and decision confidence. The standard DDM is a popular variant of sequential sampling models of two-choice tasks. We used a random walk approximation, implemented with the Rcpp R package for speed 47, in which noisy sensory evidence starts at z*a, where 0 and a are the lower and upper boundaries, respectively, and z quantifies bias in the starting point (z = .5 means no bias). At each time interval τ, a displacement Δ in the integrated evidence occurred according to equation (1):

Δ = v·τ + σ·√τ·ε, with ε ~ N(0, 1)  (1)

The strength of evidence accumulation is controlled by the drift rate v; within-trial variability σ was fixed to 1. The random walk process continued until the accumulated evidence crossed either 0 or a. After the boundary crossing, evidence continued to accumulate for a duration determined by the participant-specific median confidence reaction time. Importantly, consistent with the signal detection theoretic notion that primary and secondary evidence can dissociate, we allowed for dissociations between the drift rate governing the choice and the post-decision drift rate. For compatibility with the M-ratio framework, we quantified metacognitive accuracy as the ratio between the post-decision drift rate, v_post, and the drift rate, as shown in equation (2):

v-ratio = v_post / v  (2)

As a consequence, when v-ratio = 1, the post-decision drift rate and the drift rate are identical; when v-ratio = .5, the magnitude of the post-decision drift rate is half the magnitude of the drift rate.
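To make the generative process concrete, the following R sketch samples observer parameters from the uniform ranges given above and simulates trials with a simple Euler discretization of equation (1), followed by 1 s of post-decision accumulation at the rate implied by equation (2). The function and object names are ours, not the authors' code, and the confidence-scale mapping described below is omitted.

```r
# Sketch of the model simulation: random-walk DDM plus post-decision accumulation.
simulate_trial <- function(v, a, ter, v_ratio, z = .5, sigma = 1,
                           dt = 1e-2, post_time = 1) {
  e <- z * a                               # evidence starts at z * a
  t <- 0
  # Pre-decision accumulation, equation (1): run until evidence crosses 0 or a
  while (e > 0 && e < a) {
    e <- e + v * dt + sigma * sqrt(dt) * rnorm(1)
    t <- t + dt
  }
  choice <- as.numeric(e >= a)             # 1 = upper boundary, 0 = lower boundary
  # Post-decision accumulation, equation (2): drift rate scaled by v-ratio
  n_post <- round(post_time / dt)
  e <- e + sum(v_ratio * v * dt + sigma * sqrt(dt) * rnorm(n_post))
  c(rt = t + ter, choice = choice, evidence = e)
}

# 100 observers with 500 trials each, parameters drawn from the ranges above;
# on half of the trials the drift rate is multiplied by -1.
set.seed(1)
n_obs <- 100; n_trials <- 500
pars <- data.frame(v = runif(n_obs, 0, 2.5), a = runif(n_obs, .5, 3),
                   ter = runif(n_obs, .2, .6), v_ratio = runif(n_obs, 0, 1.5))
sim <- lapply(seq_len(n_obs), function(i) {
  stim <- rep(c(1, -1), length.out = n_trials)
  out <- t(sapply(stim, function(s)
    simulate_trial(s * pars$v[i], pars$a[i], pars$ter[i], pars$v_ratio[i])))
  data.frame(out, stim = stim, observer = i)
})
```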
To calculate decision confidence, we first quantified for each trial the probability of a correct response given the evidence, elapsed time, and choice. The heat map representing p(correct | e, t, X), shown in Figure 1A, was constructed by means of 300,000 random walks without absorbing bounds, with drift rates sampled from a uniform distribution between zero and ten. This ensured sufficient data points across the relevant part of the heat map. Subsequently, the average accuracy was calculated for each (response time, evidence, choice) combination, based on all trials that had a data point for that combination. Smoothing was achieved by aggregating over evidence windows of .01 and τ windows of 3. Next, to take into account idiosyncratic mappings of p(correct | e, t, X) onto the confidence scale used in the experiment, we added two extra free parameters that controlled the mean (M) and the width (SD) of the confidence estimates, as shown in equation (3):

confidence = M + SD·Φ⁻¹(p(correct | e, t, X))  (3)

where Φ⁻¹ denotes the quantile function of the standard normal distribution.

Although the empirical confidence distributions appeared approximately normally distributed, there was an over-representation of confidence values at the boundaries (1 and 100 in Experiment 1; 1 and 6 in Experiments 2 and 3) and in the middle of the scale (50 in Experiment 1, 3.5 in Experiment 2). Most likely, this resulted from the use of verbal labels placed at exactly these values. To account for the frequency peaks at the endpoints of the scale, we relabeled predicted confidence values that exceeded the endpoints as the corresponding endpoint (e.g., in Experiment 1, a predicted confidence value of 120 was relabeled as 100), which naturally accounted for those peaks. To account for the peaks in the center of the scale, we assumed that confidence ratings around the center were pulled towards the center value. Specifically, we relabeled P% of the trials around the midpoint as the midpoint (e.g., in Experiment 1, P = 10% implies that the 10% of the data closest to 50 were relabeled as 50). Note that P was not a free parameter; instead, its value was set to the participant-specific proportion observed in the empirical data. Note also that the main conclusions reported in this manuscript concerning the relation between M-ratio, decision boundary, and post-decision drift rate remain the same in a reduced model without P, and in a reduced model without P, M, and SD. Because these reduced models did not capture the confidence distributions very well, however, we report only the findings of the full model.
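The heat map and scale mapping can be illustrated with the following R sketch. The bin sizes follow the text; the number of walks is reduced from 300,000 to keep memory modest, the normal-quantile form mirrors equation (3) above, and all function and object names are our own rather than the authors' code.

```r
# p(correct | e, t, X) lookup table from bound-free random walks: drift
# magnitudes uniform in [0, 10] with a random stimulus direction, averaging
# accuracy within (time, evidence) cells.
set.seed(1)
n_walks <- 30000; n_steps <- 300; dt <- 1e-2; sigma <- 1
direction <- sample(c(-1, 1), n_walks, replace = TRUE)
v <- runif(n_walks, 0, 10) * direction
noise <- matrix(rnorm(n_walks * n_steps, 0, sigma * sqrt(dt)), n_walks)
paths <- t(apply(v * dt + noise, 1, cumsum))        # evidence paths, no bounds

cells <- data.frame(
  t_bin = rep(ceiling(seq_len(n_steps) / 3), each = n_walks),  # tau windows of 3
  e_bin = round(as.vector(paths), 2),                          # evidence windows of .01
  correct = rep(direction == 1, times = n_steps)   # upper-bound choice correct iff drift > 0
)
heat <- aggregate(correct ~ t_bin + e_bin, cells, mean)  # p(correct | e, t, X = upper)

# Equation (3): map p(correct) through the quantile function of a normal with
# mean M and width SD, then clip predictions to the scale endpoints.
predict_confidence <- function(p, M, SD, lo = 1, hi = 100) {
  pmin(pmax(qnorm(p, mean = M, sd = SD), lo), hi)
}

# Midpoint relabeling: the P% of predictions closest to the scale midpoint
# are relabeled as the midpoint (P is taken from the data, not fitted).
relabel_midpoint <- function(conf, P, mid = 50) {
  idx <- order(abs(conf - mid))[seq_len(round(P * length(conf)))]
  conf[idx] <- mid
  conf
}
```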
To estimate these six parameters (v, a, Ter, v-ratio, M, and SD) from choices, reaction times, and decision confidence, we implemented quantile optimization. Specifically, we computed the proportion of trials in the quantiles .1, .3, .5, .7, and .9, for both reaction times and confidence, separately for corrects and errors (maintaining the probability mass of corrects and errors, respectively). We then used differential evolution optimization, as implemented in the DEoptim R package 48, to estimate the six parameters by minimizing the chi-square error function shown in equation (4):

χ² = Σ_i (oRT_i - pRT_i)²/pRT_i + Σ_i (oCJ_i - pCJ_i)²/pCJ_i  (4)

with oRT_i and pRT_i corresponding to the proportion of observed and predicted responses in reaction time quantile i, calculated separately for corrects and errors, and oCJ_i and pCJ_i reflecting their counterparts for confidence judgments. We set the time step τ to 1e-2. Model fitting was done separately for each participant. Note that in Experiment 3 there was no clear peak in the middle of the scale, so P was fixed to 0 in that experiment.
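A hedged sketch of this objective in R follows. The simulate_dataset() helper (a hypothetical wrapper around the trial simulator plus the confidence mapping above), the small constant guarding the denominator, the parameter bounds, and the data object are our assumptions, not the authors' code.

```r
library(DEoptim)

# Quantile-based chi-square error (equation 4): compare observed and predicted
# proportions per RT and confidence quantile bin, separately for corrects and
# errors, weighting each response type by its probability mass.
chisq_err <- function(theta, obs) {
  pred <- simulate_dataset(theta, n = 5000)     # hypothetical model simulator
  qs <- c(.1, .3, .5, .7, .9)
  err <- 0
  for (acc in c(0, 1)) {                        # errors and corrects separately
    o <- obs[obs$correct == acc, ]
    p <- pred[pred$correct == acc, ]
    if (nrow(o) == 0 || nrow(p) == 0) next
    for (var in c("rt", "confidence")) {
      cuts <- quantile(o[[var]], probs = qs)    # observed quantile cut-points
      # proportions per bin, scaled by the response type's probability mass
      oP <- diff(c(0, qs, 1)) * nrow(o) / nrow(obs)
      pP <- diff(c(0, ecdf(p[[var]])(cuts), 1)) * nrow(p) / nrow(pred)
      err <- err + sum((oP - pP)^2 / pmax(pP, 1e-5))
    }
  }
  err
}

# Differential evolution over the six parameters; the bounds are illustrative.
fit <- DEoptim(chisq_err, obs = one_participant_data,
               lower = c(v = 0, a = .5, ter = .1, v_ratio = 0, M = 0, SD = 1),
               upper = c(v = 5, a = 4, ter = 1, v_ratio = 2, M = 100, SD = 50))
```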
Parameter recovery

To ensure that our model was able to recover its parameters, we here report parameter recovery. To assess recovery with a sensible set of parameter combinations, we took the fitted parameters of Experiment 1 (N = 36), simulated data from these parameters with varying numbers of trials, and tested whether our model could recover the original values. As a sanity check, we first simulated a large number of trials (25,000 per participant), which, as expected, provided excellent recovery for all six parameters, rs > .97. We then repeated this process with only 200 trials per participant, the trial count of Experiment 2 (Experiments 1 and 3 both had higher trial counts). Recovery for v-ratio was still very good, r = .85.

Experiment 1

Participants

Forty healthy participants (18 males) took part in Experiment 1 in return for course credit (mean age = 19.82 years, range 18 to 30). All reported normal or corrected-to-normal vision. Two participants were excluded because they required more than 10 repetitions of one of the training blocks (see below), and two participants were excluded because their accuracy, averaged per block and compared against chance level with a one-sample t-test, was not significantly above chance. The final sample thus comprised thirty-six participants. All participants provided informed consent, and all procedures adhered to the general ethical protocol of the ethics committee of the Faculty of Psychology and Educational Sciences of Ghent University.

Stimuli and apparatus

The data for Experiment 1 were collected in an online study, due to COVID-19. Participants were allowed to take part in the experiment only if they used an external mouse. Choices were given with the keyboard, and decision confidence was indicated with the mouse. Stimuli in Experiment 1 consisted of 50 randomly moving white dots (radius: 2 pixels) drawn in a circular aperture on a black background, centered on the fixation point. Dots disappeared and reappeared every 5 frames. The speed of dot movement (the number of pixel lengths a dot moved on each frame) was a function of the screen resolution (screen width in pixels / 650).

Task procedure

Each trial started with the presentation of a fixation cross for 1000 ms. Above and below this fixation cross, specific instructions were provided concerning the required strategy. In accuracy blocks, the instruction was to respond as accurately as possible; in speed blocks, the instruction was to respond as fast as possible. The order of this block-wise manipulation was counterbalanced across participants. Next, randomly moving dots were shown on the screen until a response was made or the response deadline of 5000 ms was reached. On each trial, 20% of the dots moved coherently towards the left or the right side of the screen, with an equal number of leftward and rightward movement trials in each block. Participants were instructed to decide whether the majority of dots was moving towards the left or the right side of the screen by pressing "E" or "T", respectively, with their left hand. After their response, a blank screen was shown for 500 ms, followed by the presentation of a continuous confidence scale. Below the scale, the labels "Sure error", "guess", and "sure correct" were shown at the far left, the center, and the far right, respectively. After clicking the confidence scale, participants had to click a centrally presented "Continue" button (below the confidence scale), which ensured that the position of the mouse was central and identical on each trial.

The main part of Experiment 1 consisted of 10 blocks of 60 trials, half of which came from the accuracy instruction condition and half from the speed instruction condition. The experiment started with 24 practice trials during which participants only discriminated random dot motion at 50% coherence; no confidence judgments were given. This block was repeated until participants achieved 85% accuracy (mean = 2 blocks). Next, participants completed another 24 practice trials, with the only difference that the coherence was decreased to 20% (mean = 1.05 blocks). Once they achieved 60% accuracy, participants performed a final training block of 24 trials during which they both practiced dot discrimination and indicated their level of confidence (mean = 1.05 blocks).

Experiment 2

Full experimental details are described in Drescher et al. 49. On each trial, participants were presented with two white circles (5.1° diameter) on a black background, horizontally next to each other with a distance of 17.8° between the midpoints. Fixation crosses were shown for 1 s in each circle, followed by dot clouds in each circle for 700 ms. The dots had a diameter of 0.4°. The dot positions within the boxes, as well as the position of the box containing more dots, were randomly selected on each trial. Participants indicated which circle contained more dots by pressing "S" or "L" on a keyboard. Then, the question "correct or false?" appeared on the screen, together with a continuous confidence rating bar with the labels "Sure false", "No idea", and "Sure correct". Participants moved a cursor with the same keys as before and confirmed their confidence judgment with the enter key. No time limit was imposed on either the primary choice or the confidence rating. Subjects received several practice trials (10 without confidence rating, 14 with confidence rating) before completing eight experimental blocks of 25 trials.