key: cord-0619529-ovk6241d authors: Aquino-López, Marco A; Sanderson, Nicole K.; Blaauw, Maarten; Sanchez-Cabeza, Joan-Albert; Ruiz-Fernandez, Ana Carolina; Christen, J Andrés title: A simulation study to compare 210Pb dating data analyses date: 2020-12-12 journal: nan DOI: nan sha: 28f39ac2191f016aab46811cb065fdf65d414a8c doc_id: 619529 cord_uid: ovk6241d

The increasing interest in understanding anthropogenic impacts on the environment has led to a considerable number of studies focusing on sedimentary records for the last $\sim$100-200 years. Dating this period is often complicated by the poor resolution and large errors associated with radiocarbon ($^{14}$C) ages, radiocarbon being the most popular dating technique. To improve age-depth model resolution for the recent period, sediment dating with lead-210 ($^{210}$Pb) is widely used, as it provides absolute and continuous dates for the last $\sim$100-150 years. The $^{210}$Pb dating method has traditionally relied on the Constant Rate of Supply (CRS, also known as Constant Flux - CF) model, which uses the radioactive decay equation as an age-depth relationship, resulting in a restrictive model to approximate dates. In this work, we compare the classical approach to $^{210}$Pb dating (CRS) and its Bayesian alternative (\textit{Plum}). To do so, we created simulated $^{210}$Pb profiles following three different sedimentation processes, complying with the assumptions imposed by the CRS model, and analysed them using both approaches. Results indicate that the CRS model does not capture the true values even with a high dating resolution for the sediment, nor does its accuracy improve as more information is available. On the other hand, the Bayesian alternative (\textit{Plum}) provides consistently more accurate results even with few samples, and its accuracy and precision constantly improve as more information is available.
[...], CIC (Constant Initial Concentration; Goldberg, 1963; Crozaz et al., 1964; Robbins, 1978) and CRS (Constant Rate of Supply; Appleby and Oldfield, 1978; Robbins, 1978). [...] is available, such as external independent dating markers (e.g. $^{137}$Cs profiles), laminated sediments, tephras, or contaminated layers (known sedimentary events) (e.g. Appleby, 1998, 2001, 2008).

A recent inter-laboratory model comparison experiment (Barsanti et al., 2020) presented concerning results. Two measured $^{210}$Pb data sets were sent to 14 laboratories around the world with varying degrees of expertise in the $^{210}$Pb dating method. Each laboratory was asked to provide a chronology from the same data. Each laboratory applied its preferred model; in most cases this was the CRS model. The experiment resulted in a wide range of chronologies regardless of the model used, with chronologies differing even when the same model and data set were used. The authors reinforced the need to use independent time markers (independent dating sources) to validate and "anchor" the chronologies, as suggested previously by Smith (2001). This comparison clearly shows the impact that user decisions and expert adaptations or revisions have on the resulting chronologies. To replicate or update any given chronology, such user decisions become extremely important. The raw data sets are also required; unfortunately, both the raw data and the user's decisions are rarely reported.

Recently, Aquino-López et al. (2018) presented an alternative to these classical models by introducing Plum, a Bayesian approach to $^{210}$Pb dating. This model treats every data point as originating from a forward model that includes both the sedimentation process and the radioactive decay process.
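As a hedged illustration of what such a forward model looks like, and assuming the form used in Aquino-López et al. (2018), the expected measurement for a section combines a supported term with the excess supply that has decayed while the section was buried. The decay constant $\lambda$ and the omission of any density scaling are assumptions of this sketch, not statements taken from the present text:

```latex
\mathbb{E}\left[y_i \mid t\right]
  = A^S_i + \frac{\Phi_i}{\lambda}
    \left( e^{-\lambda\, t(x_i - \delta)} - e^{-\lambda\, t(x_i)} \right)
```

Here $A^S_i$ denotes the supported $^{210}$Pb, $\Phi_i$ the supply of excess $^{210}$Pb, and $t(x)$ the age at depth $x$. The first exponential refers to the top of the section and the second to its bottom, so the difference is the fraction of the supply that remains in that section.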
Plum also assumes a constant rate of supply to the sediment, similar to the CRS model (this assumption can be relaxed at the cost of computational power). Another important difference between the CRS and Plum is that the latter incorporates the supported $^{210}$Pb, which naturally forms in the sediment and is normally treated as a nuisance variable. Plum assumes that there exists an (unknown) age-depth function $t(x)$ that relates depth $x$ to calendar age $t(x)$. Conditional on $t(x)$, a model is assumed for the measured $^{210}$Pb, $y_i$, between depths $x_i - \delta$ and $x_i$, in which $A^S_i$ is the supported $^{210}$Pb in the sample and $\Phi_i$ the supply of excess $^{210}$Pb to the sediment. The age-depth model $t(x)$ is a piece-wise linear model constrained by prior information on the sediment's accumulation rates (Blaauw and Christen, 2011); see Aquino-López et al. (2018) for details. This treatment of the data allows for formal statistical inference on a well-defined model with specific parameters. To infer the parameters of the model, a Bayesian approach is used. This differs from the CRS model, which does not provide a formal statistical inference. The CRS model uses the decay equation to obtain an age-depth function, resulting in a more restrictive age-depth model. It only deals with the excess $^{210}$Pb, the estimated supported $^{210}$Pb having been removed before modelling.

Plum has been shown to provide accurate results with a realistic precision in different case scenarios (Aquino-López et al., 2018), both in simulations and for real cores. Under optimal dating conditions, Plum and the CRS model have been shown to provide similar results (Aquino-López et al., 2020), with Plum providing more realistic uncertainties with minimal user interaction. Blaauw et al. (2018) presented a comparison between classical and Bayesian age-depth model construction, both for real and simulated $^{14}$C-dated cores.
They concluded that Bayesian age-depth models provide more accurate results and more realistic uncertainties under a wide range of scenarios. The objective of the present study is to test whether the results obtained by Blaauw et al. (2018), concerning the accuracy and precision of the Bayesian approach, are maintained in a more complex modelling situation, such as the construction of $^{210}$Pb-based age-depth models. To do so, we compare $^{210}$Pb dates and uncertainties from the widely applied CRS model (by far the most popular age-depth model for $^{210}$Pb) against Plum using simulated cores, i.e. sedimentation "scenarios". We also aim to observe the learning process of each of the models and to estimate the amount of information needed to obtain a reasonable chronology with each model.

Given that the CRS model has had several revisions, the choice of which can considerably affect model outputs as shown by Barsanti et al. (2020), we decided to apply the original version of the equations provided by Appleby (2001), with its suggested error propagation calculation; we will call this version of the CRS the "classical implementation of the CRS" (CI-CRS). While we acknowledge that this implementation may be the less suitable one in some particular cases, and that expert knowledge can greatly improve the precision and accuracy of the model, this choice reduces the bias that any particular implementation would have on our results.

The paper is organized as follows: Section 2 sets out the tools we use for the model comparison, describing the simulations of the three different scenarios. Section 3 describes the comparison for both the overall chronologies and for single depths. Section 4 shows the impact of expert revisions. Lastly, Section 5 presents the conclusions and discussion of the results obtained in Section 3.

In order to observe the accuracy and precision of any chronology, a known true age-depth function is required. Blaauw et al.
(2018) presented a methodology for simulating radiocarbon dates and their uncertainties, while Aquino-López et al. (2018) presented an approach for simulating $^{210}$Pb data given an age-depth function $t(x)$. It is important to note that these simulations follow the equations presented by Appleby and Oldfield (1978) and Robbins (1978), guaranteeing that the CRS assumptions are met. By using the approach presented by Aquino-López et al. (2018) for simulating $^{210}$Pb data and the structure of uncertainty quantification presented by Blaauw et al. (2018), reliable simulated $^{210}$Pb data can be obtained.

Three different scenarios (see Table 2.1) were chosen to simulate sedimentation processes, each with its own age-depth function and parameters. These scenarios were selected because they provide three key challenges for the models. Scenario 1 presents an age-depth function that results from increasing sedimentation and less compaction towards the present (surface); this is quite common for recent sediments. Scenario 2 presents a challenging core structure, as the function has a drastic and rapid shift in sediment accumulation around 15 cm depth. Lastly, Scenario 3 presents a cyclic and periodic change in accumulation rates.

Using the age-depth functions and parameters defined in Table 2.1, we obtain the $^{210}$Pb activity, or concentration, at any given depth or interval by integrating the age-depth curve for that interval. Although these concentrations may be interpreted as error-free measurements (see Figure 2), we also simulated the $^{210}$Pb activity uncertainty, following a methodology similar to that of Blaauw et al. (2018). This methodology was chosen because it introduces different sources of uncertainty related to different steps of the measurement process. Other uncertainty quantification methodologies could be used; as long as the same methodology and uncertainty are provided to both models, the comparison remains valid.
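The integration step described above can be sketched as follows. This is a minimal illustration, assuming the expected activity in a section takes the form given in Aquino-López et al. (2018); the age-depth function, the supply and supported values, the half-life used for the decay constant, and all function names are hypothetical and are not the scenarios of Table 2.1. Density scaling is left out, so the units are schematic.

```python
import math

# Approximate 210Pb decay constant (1/yr), assuming a half-life of about 22.3 yr.
LAMBDA = math.log(2) / 22.3


def age_depth(x):
    """Hypothetical age-depth function t(x) in years for depth x in cm.

    Illustrative only; it is not one of the scenarios in Table 2.1.
    """
    return 0.1 * x**2 + 2.0 * x


def section_activity(x, delta, supply, supported):
    """Expected total 210Pb in the section [x - delta, x).

    supply    -- assumed constant supply of excess 210Pb (Phi)
    supported -- assumed constant supported 210Pb (A^S)
    """
    decay_top = math.exp(-LAMBDA * age_depth(x - delta))
    decay_bottom = math.exp(-LAMBDA * age_depth(x))
    return supported + supply / LAMBDA * (decay_top - decay_bottom)


# Error-free profile for 1 cm sections from 0 to 30 cm.
depths = list(range(1, 31))
profile = [section_activity(x, 1.0, supply=100.0, supported=20.0) for x in depths]
```

Because the section terms telescope, the excess values in `profile` sum to the supply divided by the decay constant, times one minus the decay factor at the bottom of the core, which gives a quick consistency check on any simulated profile.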
Let $C_{\bar{x}}$ be the true $^{210}$Pb concentration in the interval $\bar{x} = [x-\delta, x)$, given the age-depth function $t(x)$ and parameters $\Phi$ and $A^S$ in each scenario. To simulate disturbances in the material, we can introduce scatter centred around the true value, $\theta \sim N(C_{\bar{x}}, y_{scat}^2)$, where $y_{scat}^2$ is the amount of scatter for this variable (in this case $y_{scat}^2 = 10$). Now, to replicate outliers, a shift from the true value ($x_{shift}$) is defined, which occurs with a probability $p_{out}$. This results in a new variable $\theta'$, which equals $\theta$ displaced by $x_{shift}$ with probability $p_{out}$ and equals $\theta$ otherwise. Finally, to simulate the uncertainty provided by the laboratory, we can define the simulated measurements as $y(\theta') \sim N(\theta', \sigma_R^2)$, where $\sigma_R$ is the standard deviation reported by the laboratory. $\sigma_R$ is defined as $\sigma_R = \max(\sigma_{min}, \mu(\theta')\,\varepsilon\,y_{scat})$, where $\sigma_{min}$ is the minimum standard deviation assigned to a measurement. This variable differs between laboratories; we use a default value of 1 Bq/kg. Finally, $\varepsilon$ is the analytical uncertainty (default 0.01) and $y_{scat}$ an error multiplier (default 1.5). The default parameters were set in accordance with Blaauw et al. (2018).

For this study we created a data set for each of the three simulations by integrating in intervals of $\delta = 1$ cm, for depths from 0 to 30 cm, where radioactive equilibrium was guaranteed (Aquino-López et al., 2018). The complete simulated $^{210}$Pb data sets can be found in the Supplementary Material 7.

In order to create a comparison with minimal user interaction, each model was run automatically. In the case of Plum, default settings were used. As the CI-CRS model assumes that background (supported) $^{210}$Pb has been reached, and in order to reduce user manipulation, we decided to fix the last sample (30 cm depth) for every case. This step not only guarantees the consistent application of the CI-CRS model, it also provides the model with a single bottom-most depth to be removed, as is common practice when using the CI-CRS model.
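The three measurement-noise steps described above (scatter, outliers, reported laboratory uncertainty) can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the text gives no defaults for $p_{out}$ or $x_{shift}$, so `p_out` and `shift` are hypothetical placeholders; the shift is assumed to be additive; $\mu(\theta')$ is read as the magnitude of $\theta'$; and because the text uses $y_{scat}$ both for the scatter (10) and for the error multiplier (1.5), the sketch separates them into `scatter_var` and `err_mult`.

```python
import math
import random


def simulate_measurement(true_conc, rng, scatter_var=10.0, p_out=0.05,
                         shift=50.0, sigma_min=1.0, eps=0.01, err_mult=1.5):
    """Return (measured value, reported sd) for one section.

    p_out and shift are hypothetical placeholders; the text gives no
    defaults for them. The shift is assumed to be additive.
    """
    # Step 1: scatter around the true concentration.
    theta = rng.gauss(true_conc, math.sqrt(scatter_var))
    # Step 2: occasional outlier, applied with probability p_out.
    if rng.random() < p_out:
        theta += shift
    # Step 3: laboratory uncertainty, never smaller than sigma_min.
    sigma_r = max(sigma_min, abs(theta) * eps * err_mult)
    measured = rng.gauss(theta, sigma_r)
    return measured, sigma_r


rng = random.Random(42)  # fixed seed so the example is reproducible
y, sd = simulate_measurement(200.0, rng)
```

Passing the random generator explicitly keeps each simulated core reproducible, which matters when the same noisy data set has to be handed to both models.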
Because of this, Plum's resulting chronology always reaches 30 cm, as by default 1 cm sections are used for every simulation. Conversely, as the CI-CRS model only models the excess $^{210}$Pb (the total $^{210}$Pb minus the supported $^{210}$Pb), when excess activities at depth fall below zero the chronology is only calculated up to that depth. Plum deals with the supported $^{210}$Pb variable automatically, as part of the inference. In order to provide the best possible estimate for this variable, a constant level of supported $^{210}$Pb was assumed for both models. For the CI-CRS model, the mean of the supported $^{210}$Pb measurements was calculated and then subtracted from the total $^{210}$Pb to obtain the excess $^{210}$Pb, as is common practice when using the CI-CRS model.

In order to provide an objective comparison, the offset from the true age-depth model (in yr), the length of the 95% intervals (in yr) and the normalized offset were calculated (the normalized offset indicates the distance of modelled ages from the true value given the model's own uncertainty). The main discussion will revolve around the normalized offset, as it provides an intuitive measure of the accuracy of a model by taking into account the levels of uncertainty provided by each model.

To allow for a reasonable comparison between models, and to evaluate the effect that different amounts of information may have on the accuracy and precision of $^{210}$Pb models, we used our three simulated data sets (see Supplementary Material 7). For these simulated cores, samples were randomly selected given a percentage of information (e.g. for 20% information, a data set with 6 random 1-cm samples, out of a possible total of 30 1-cm samples, is created) in order to create a sub-dataset, which was then used to create a chronology. In total, 100 of these sub-datasets were created for each information percentage from 10% to 95% at 5% intervals (i.e., 10%, 15%, 20%, ..., 95%).
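The CI-CRS preprocessing described above (subtracting the mean supported $^{210}$Pb and stopping where the excess is no longer positive) feeds the classical CRS age equation, $t(x) = \lambda^{-1}\ln(A(0)/A(x))$, where $A(x)$ is the excess inventory below depth $x$. The sketch below is a simplified illustration, not the CI-CRS implementation used in the study: it omits the error propagation of Appleby (2001), and the function name and the `mass_per_area` argument (dry mass per unit area of each section) are hypothetical.

```python
import math

# Approximate 210Pb decay constant (1/yr), assuming a half-life of about 22.3 yr.
LAMBDA = math.log(2) / 22.3


def crs_ages(total, supported_measurements, mass_per_area):
    """Ages (yr) at the bottom of each section, except the deepest one used.

    total                  -- total 210Pb per section, ordered from the surface
    supported_measurements -- supported 210Pb measurements (their mean is used)
    mass_per_area          -- hypothetical dry mass per unit area of each section
    """
    supported = sum(supported_measurements) / len(supported_measurements)

    # Keep sections until the excess stops being positive.
    excess = []
    for value in total:
        e = value - supported
        if e <= 0:
            break
        excess.append(e)

    inventory = [e * m for e, m in zip(excess, mass_per_area)]

    # Inventory remaining below the bottom of each section, summed upwards.
    below = []
    running = 0.0
    for section in reversed(inventory):
        below.append(running)
        running += section
    below.reverse()
    total_inventory = running

    # The deepest section has nothing below it, so it receives no age.
    return [math.log(total_inventory / b) / LAMBDA for b in below[:-1]]
```

The deepest retained section has no inventory beneath it and therefore no finite age, which is consistent with removing a single bottom-most depth as described in the text.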
The complete dataset was also used (i.e. a 100% information percentage). Once a dataset was created, both the CRS model and Plum were applied. Both sets of outputs were then compared against the true known age values; see Figure 3. Figure 3 shows a single "snapshot", an example of the comparison between the $^{210}$Pb models and the true values.

As we are dealing with a total of n = 5333 simulations, in order to evaluate the overall precision and accuracy of both models we calculated the mean offset from the true age-depth model (in yr), the mean length of the 95% intervals (in yr), and the mean normalized offset, which indicates the distance of modelled ages from the true value given the model's own uncertainty at each depth. Figure 4 shows results similar to those presented by Blaauw et al. (2018). The classical model (CI-CRS) at first appears to provide similar results (similar offsets) to the Bayesian alternative (Plum), but at higher estimated precision, if we consider only the length of the 95% interval. It is important to note that the CI-CRS model's offset improves as more information is available. However, when the offset and the length of the interval are considered together, the results are not favourable to the CI-CRS. To have a more realistic representation of how the models capture the true age-depth relationship, we should observe the normalized offset. This variable shows the degree to which, on average, the models contain the truth within their uncertainty intervals (normalized to one standard deviation). Any model with a normalized offset larger than two (two standard deviations) is incapable of capturing the true ages within its uncertainty intervals. This means that, while the CI-CRS estimates smaller uncertainties and its ages improve as more data are available, it does so at the cost of its accuracy, and the improvements are not sufficient to capture the true age.
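The sub-sampling design and the normalized offset used in this comparison can be sketched as follows. The function names are hypothetical, and two details are assumptions because the text does not specify them: fractional sample counts are rounded half up, and the deepest section is always kept (reflecting the decision to fix the 30 cm sample). The conversion of a 95% interval to a standard deviation by dividing its length by 4 follows the convention the paper states for Plum.

```python
import math
import random


def subsample_depths(n_sections, percent, rng, keep_bottom=True):
    """Pick section bottoms (1..n_sections) for a given information percentage."""
    # Assumed rounding rule: round half up, with at least one section.
    k = max(1, math.floor(n_sections * percent / 100 + 0.5))
    if keep_bottom:
        # Assumption: the deepest section is always part of the sub-dataset.
        chosen = rng.sample(range(1, n_sections), k - 1) + [n_sections]
    else:
        chosen = rng.sample(range(1, n_sections + 1), k)
    return sorted(chosen)


def sd_from_95_interval(lower, upper):
    """Standard deviation implied by a 95% interval (length divided by 4)."""
    return (upper - lower) / 4.0


def normalized_offset(modelled_age, true_age, sd):
    """Distance between modelled and true age in units of the model's own sd."""
    return abs(modelled_age - true_age) / sd


percentages = list(range(10, 100, 5))  # 10%, 15%, ..., 95%
sample = subsample_depths(30, 20, random.Random(0))  # six 1-cm sections
```

A normalized offset of at most 2 then corresponds to the true age lying within roughly two standard deviations of the modelled age, which is the threshold used in the discussion of the results.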
It also appears that the length of the 95% interval and the offset are not affected by how much information is provided to the CRS model. On the other hand, Plum seems to provide increasingly accurate results as more information is added to the model. This again coincides with the results outlined by Blaauw et al. (2018). When we observe the regular offset (not normalized), we find that Plum provides a smaller offset than the CI-CRS model; this, in combination with slightly larger (more realistic) modelled uncertainties, results in more consistently accurate age-depth models that are capable of capturing the true values within their uncertainty intervals. This result supports the claim that Plum provides more realistic uncertainties compared to those obtained by the CI-CRS. Another important statistic to take into account is that 87.86% (4686/5333) of Plum's runs remain within 2 standard deviations, as opposed to 7.48% (399/5333) for the CI-CRS model. Furthermore, only 0.54% (29/5333) of the CI-CRS model runs remain within 1 standard deviation, which is the most commonly reported interval when reporting CI-CRS results. We can also observe a clear structure in the way that Plum increases its accuracy and precision to obtain a better chronology as more information is available, whereas the CI-CRS model does not appear to improve its ability to capture the true value with additional data.

These results are valid for the overall chronology (the mean offset, interval and normalized offset of the overall chronology). In order to evaluate whether certain models are better at predicting ages in certain sections of the sediment cores, we have to look at the normalized offset at every depth.

[...] improving their offset as more information is available. Middle panel B) shows the 95% confidence intervals.
It is clear from this panel that the uncertainty provided by Plum is much larger for low percentages of information and constantly improves as more data are available, whereas the length of the intervals provided by the CI-CRS appears to stay constant regardless of the available information. Bottom panel C) shows the normalized offsets, presenting the distance between the modelled age and the true age divided by the standard deviation (in the case of Plum, the length of the 95% interval divided by 4). This panel presents a worrying situation where the CI-CRS model's calculated standard deviation (on average) is incapable of capturing the true age. On the other hand, Plum's credible intervals almost always capture the true age, even when little information is available.

Since the late 1970s, when the CRS method was first introduced (Appleby and Oldfield, 1978; Robbins, 1978), the CRS has undergone several improvements. Some of these improvements rely on independent dates, other isotopes or techniques, and/or require user manipulation to "force" the method to agree with these independent dates. One recent improvement, which does not require user manipulation or independent dates, is the comprehensive explanation, with expert notes, on the practical use of the CRS model by

This research focuses on exploring the uncertainty and precision of the most commonly used $^{210}$Pb dating methods (CI-CRS and Plum). By using different scenarios, three different simulations were created. These simulations were then sub-sampled at different percentages of information in order to observe the effects that different sample sizes have on the resulting chronology. This experiment provided an objective comparison of the accuracy and precision of both methods. The experiment was evaluated at two different levels. For the first level, the overall accuracy and precision of the method were evaluated.
The mean of the offset, the length of the 95% confidence and credible intervals, and the normalized offset were measured. The second level focused on the ability of each model to capture the true value within its credible or confidence interval, and the normalized offset of every scenario per depth was calculated. These two comparisons provided a good picture of the difference in precision and accuracy between the methods.

From the overall accuracy (see Figure 4) it is clear that both the CI-CRS model and Plum reduce their offset as more data become available, with the Bayesian method providing, on average, a smaller offset regardless of the sample size. On the subject of precision, the Bayesian method provides much larger uncertainties when small sample sizes are used. It is only with 60% or more of the information that the lengths of the intervals become comparable. This is a consequence of the linear interpolation between data points used by the CRS method, in contrast to the Bayesian approach (Plum), which uses a proper statistical inference. As previously discussed by Aquino-López et al. (2020), the larger uncertainties provided by Plum are more realistic, and this experiment confirms this. Further evidence that these uncertainties are more sensible is that the length of the credible intervals becomes smaller as more data become available. On the other hand, the length of the confidence intervals provided by the classical model (CI-CRS) remains almost constant at any sample size. Lastly, the normalized offset, which shows the capability of a model to capture the true values within its intervals, shows that the classical model (CI-CRS) is on average incapable of capturing the true values within its 95% confidence interval.
These results are especially worrisome considering that the $^{210}$Pb dating community rarely reports 95% confidence intervals and instead tends to report only 68% confidence intervals (one standard deviation intervals). On the other hand, Plum's normalized offsets always remain ≤ 2, therefore guaranteeing that on average the true value is captured within its 95% credible intervals, even with small sample sizes. Plum's normalized offsets constantly improve and reach stability at information percentages of 50% or more. These experiments show that the Bayesian method, on average, provides more reliable results.

Because the normalized offset shows the capability of capturing the true value within a model's intervals, this variable can be used to conclude whether any given method is better at estimating a certain time period. Figure 5 presents the performance of both the CI-CRS model and Plum for every simulated scenario. The normalized offsets of many of the CI-CRS chronologies are > 2 throughout the whole chronology, meaning that the model does not have a period of time for which it is more precise. Moreover, the CI-CRS does not exhibit a clear learning pattern; its normalized offset appears to be indifferent to the amount of information available. Even high information percentages give normalized offsets > 2, in some cases closer to 4 for scenarios 2 and 3. Plum, on the other hand, shows a structure in which more data are reflected in improved models in scenarios 1 and 3. It is only at low levels of information that Plum's normalized offset is > 2. Scenario 2, however, presents a case where Plum is incapable of capturing the true value for depths deeper than 15 cm, and where the model appears to provide worse results as more data become available.
This may be of concern if we do not recognize that this scenario is unrealistic, as it presents an extreme change in accumulation around 15 cm, which coincides with the depth at which the normalized offset becomes > 2. It is also important to acknowledge that this experiment was performed using default settings. In a real-world scenario the user typically has some prior knowledge of the sedimentation process at the site of interest, which could be incorporated as prior information in the model to improve the resulting chronology.

The results obtained by this experiment appear to persist even in the case of the revised version of the CRS model (R-CRS). The R-CRS model appears to improve the offset, but this improvement appears to be nullified by the smaller uncertainties presented by Sanchez-Cabeza et al. (2014). The question of which version of the CRS provides the best result is beyond the scope of this research and depends on expert application of the model. Nevertheless, it is important to note that the offsets of the CI-CRS and R-CRS are reasonably small in certain sections of the sediment, but the uncertainty quantification of both methods is overly optimistic.

In conclusion, the use of Bayesian age-depth models is preferred for the consistent construction of sediment chronologies, not only for radiocarbon-based chronologies as presented by Blaauw et al. (2018) but also in the more complex case of $^{210}$Pb, as demonstrated by this research. While the classical approach provides reasonable results regarding the offset, the uncertainty quantification in these methods needs improvement, as they do not rely on a proper statistical structure. In a real-world scenario it is impossible to measure the true offset of a method, and therefore a proper uncertainty quantification becomes extremely important. These results support the recommendations presented by Smith (2001) and Barsanti et al.
(2020) that the CRS method, or any dating methodology, should be validated using independent dating markers. Lastly, it is important to highlight the benefits of the Bayesian methods. From both Blaauw et al. (2018) and the present work, it is shown that Bayesian methods constantly improve as more data are added, and that the uncertainty associated with the method is realistic and coherent with the amount of information available. This leads to chronologies that are capable of capturing the true age within their credible intervals. The ability to capture the true value within the credible intervals becomes important when the problem is associated with decision-making processes, as it provides a more realistic picture of the available knowledge of the process. Given that $^{210}$Pb dating is now widely used in pollution, environmental and climate change studies, which potentially have a high impact on both policy making and public perception, realistic age estimates and uncertainties become extremely important.

Three decades of dating recent sediments by fallout radionuclides: a review. The Holocene
The calculation of lead-210 dates assuming a constant rate of supply of unsupported 210Pb to the sediment
Dating recent sediments by Pb-210: Problems and solutions
Chronostratigraphic techniques in recent sediments. Tracking Environmental Change Using Lake Sediments: Basin Analysis, Coring, and Chronological Techniques
Bayesian analysis of 210Pb dating
Comparing classical and Bayesian 210Pb dating models in human-impacted aquatic environments
Challenges and limitations of the 210Pb sediment dating method: Results from an IAEA modelling interlaboratory comparison exercise
Calculation and uncertainty analysis of 210Pb dates for PIRLA project lake sediment cores
Flexible paleoclimate age-depth models using an autoregressive gamma process
Double the dates and go for Bayes - impacts of model choice, dating density and quality on chronologies
Antarctic snow chronology with Pb210
Geochronology with Pb-210. Radioactive Dating
Guidelines for reporting and archiving 210Pb sediment chronologies to improve fidelity and extend data lifecycle
Geochemical and geophysical applications of radioactive lead. The biogeochemistry of lead in the environment
Determination of recent sedimentation rates in Lake Michigan using Pb-210 and Cs-137
210Pb sediment radiochronology: An integrated formulation and classification of dating models
Monte Carlo uncertainty calculation of 210Pb chronologies and accumulation rates of sediments and peat bogs
Why should we believe 210Pb sediment geochronologies

The authors are partially funded by CONACYT CB-2016-01-284451 and COVID19 312772 grants and a RDCOMM grant. The corresponding author is funded by CONACYT through the postdoctoral residence program with CVU 489201.