key: cord-0262777-j3zb8qcy authors: Standing, Matthew R.; Triaud, Amaury H.M.J.; Faria, Joao P.; Martin, David V.; Boisse, Isabelle; Correia, Alexandre C.M.; Deleuil, Magali; Dransfield, Georgina; Gillon, Michael; H'ebrard, Guillaume; Hellier, Coel; Kunovac, Vedad; Maxted, Pierre F.L.; Mardling, Rosemary; Santerne, Alexandre; Sairam, Lalitha; Udry, St'ephane title: BEBOP II: Sensitivity to sub-Saturn circumbinary planets using radial-velocities date: 2021-12-10 journal: nan DOI: 10.1093/mnras/stac113 sha: ead7c735a757c043ff5da3984e468ea99aa95f9a doc_id: 262777 cord_uid: j3zb8qcy BEBOP is a radial-velocity survey that monitors a sample of single-lined eclipsing binaries, in search of circumbinary planets by using high-resolution spectrographs. Here, we describe and test the methods we use to identify planetary signals within the BEBOP data, and establish how we quantify our sensitivity to circumbinary planets by producing detection limits. This process is made easier and more robust by using a diffusive nested sampler. In the process of testing our methods, we notice that contrary to popular wisdom, assuming circular orbits in calculating detection limits for a radial velocity survey provides over-optimistic detection limits by up to $40%$ in semi-amplitude with implications for all radial-velocity surveys. We perform example analyses using three BEBOP targets from our Southern HARPS survey. We demonstrate for the first time a repeated ability to reach a residual root mean squared scatter of $3~rm m.s^{-1}$ (after removing the binary signal), and find we are sensitive to circumbinary planets with masses down to that of Neptune and Saturn, for orbital periods up to $1000~rm days$. Circumbinary planets are planets which orbit around both stars of a central binary system. Long postulated (e.g. Borucki & Summers 1984; Schneider 1994) , most unambiguous discoveries have only been made in the past decade thanks to the transit method. In total, 14 circumbinary planets have been identified orbiting 12 main-sequence eclipsing binaries. The Kepler space telescope discovered 12 of these planets orbiting 10 binaries, (Doyle et al. 2011; Welsh et al. 2012; Orosz et al. 2012a,b; Schwamb et al. 2013; Kostov et al. 2013 Kostov et al. , 2014 Welsh et al. 2015; Kostov et al. 2016; Orosz et al. 2019; Socia et al. 2020) , and since two have been discovered in TESS ★ E-mail: mxs1263@bham.ac.uk photometry: EBLM J0608-68 b/TOI-1338 b (Kostov et al. 2020 ) and TIC 17290098 b (Kostov et al. 2021) . Despite these successes, circumbinary configurations remain largely elusive on account of their longer orbital periods. Martin (2018) and references therein, show how most circumbinary planets found to date lie just outside of the circumbinary stability limit, as described by Holman & Wiegert (1999) , which incidentally places them often close or within the habitable-zones of their parent stars. Binary-driven orbital precession means that most circumbinary planets find themselves in transiting configurations only ≈ 25% of the time (Martin 2017) , while their presence would be visible 100% of the time in radial velocity measurements. With sufficient precision, radial-velocities have the potential to be more efficient at detecting circumbinary planets than the transit method, particularly since the Doppler method is also much less sensitive to orbital inclination and period (e.g. . Despite these advantages, no circumbinary planet has been discovered by the Doppler method so far. Only one has been detected in follow-up, a recent recovery of Kepler-16 b by our project (Triaud et al. in prep) . To unlock more information on circumbinary formation mechanisms the demographics of circumbinary planets need to increase, along with accurate physical and orbital parameters. To this end, Konacki et al. (2009) launched a Doppler survey in an attempt to detect circumbinary planets with radial velocities alone, with 'The Attempt To Observe Outer-planets In Non-single-stellar Environments' (TATOOINE) survey. Unfortunately, TATOOINE was unable to discover any planetary companions during their survey. This is likely due to the type of binary stars that were observed. TATOOINE observed double-lined (SB2) binary star systems. With spectra from both stars visible within a spectrograph, complex deconvolution methods need to be applied to recover the individual components' velocities with high precision and accuracy. Konacki et al. (2009) found their best binary target was HD 9939 ( = 7). Their ten radial velocity measurements on this target from Keck HIRES have uncertainties of 1 − 4 m s −1 , but they find a residual root mean squared (RMS) scatter of 7 m s −1 . This RMS is calculated after the data have been fitted for binary Keplerian models, which we refer to as residual RMS for the remainder of this paper. To achieve this level of RMS scatter on an SB2 is impressive, though combining all their data on this target yielded a 99% confidence detection limit of ∼ 1 M Jup (Konacki et al. 2009 ). Results on other targets typically yield residual RMS in excess of 10 m s −1 In the end, Konacki et al. (2010) recommend that single-lined binaries might be the next step forward, a step that we took. introduced our Binaries Escorted By Orbiting Planets (BEBOP) survey. BEBOP currently only targets proven single-lined (SB1) eclipsing binary targets, where only the spectrum of the primary star, typically an F or G dwarf, is visible. We note here, that whilst the binary mass ratios of the BEBOP and TATOOINE samples are biased towards low and high values, respectively, showed that circumbinary planets exist around all mass ratios with no discernable preference. Our binary sample was identified while confirming transiting hot Jupiters with the Wide Angle Search for Planets (WASP) survey (Pollacco et al. 2006; Triaud et al. 2013; Triaud et al. 2017 ). First we used the CORALIE spectrograph, on the 1.2 m Euler telescope at La Silla, Chile , and have now extended the survey to HARPS, at the ESO 3.6 m telescope, also in Chile (Pepe et al. 2002a) , as well as with SOPHIE, at the OHP 193 cm telescope in France (Perruchot et al. 2008) . In this paper we detail and test our detection and observational protocols for the BEBOP survey, which have improved since . To fit our data, we now use a diffusive nested sampler called kima (Faria et al. 2018 ). Contrary to our previous method, we now measure evidence for the presence of a circumbinary planet, accounting for Ockham's razor, and can marginalise our results over as-of-yet undetected planets. With kima, we investigate our detection sensitivity to circumbinary planets, and we use it to produce robust detection limits. We also test our detection protocol by injecting circumbinary planets into our HARPS data and retrieve them with kima. We show that, after removing the binary signal, we repeatedly achieve a detection limit for circumbinary planets at masses as low as Neptune's, paving our way to actual detections. This paper is organised as follows. In Sect. 2 we describe our observing strategy for the BEBOP survey along with statistics of data gathered to date. Second in Sect. 3 we provide a brief overview of the kima diffusive nested sampling package, along with justification of its use within the BEBOP survey. In Sect. 4 the data analysis and simulation methods used are described. In Sect. 5 we display the results of our detection limit and simulation recovery analysis, and discuss the effect of the results on the BEBOP survey before concluding in Sect. 6. The current BEBOP sample consists of a total of 113 single-lined eclipsing binaries which do not exhibit strong stellar activity, or the presence of a tertiary star. 54 in the Southern hemisphere, with HARPS, and 59 in the Northern hemisphere, with SOPHIE; observations still ongoing in each. The Southern sample was established from the EBLM programme, which identified low-mass eclipsing binaries from WASP false-positives (Triaud et al. 2013; Triaud et al. 2017; Swayne et al. 2021) . The data we use in this paper is exclusively from our Southern sample, for which we have now accumulated over 1200 HARPS spectra, an average ∼ 23 individual radial-velocity measurements per target. The southern data are obtained from a preliminary proof-of-concept run (Prog.ID 099.C-0138) and a 78 night programme seeking planetary candidates that commenced between April 2018 and March 2020 (Prog.ID 1101.C-0721). BEBOP has been awarded a further large programme (Prog.ID 106.212H), which began in September 2020 to confirm a number of candidate planetary signals, the results of which will be published in future work. All systems are observed as homogeneously in time as is possible, with 1800s exposures, at an average cadence of one measurement every ≈ 6 nights except when a system is no longer visible. All measurements are taken at airmass < 1.6. We attempt to cover as much of the yearly visibility as is feasible. All measurements are taken using the O _AB mode of HARPS, which places the B fibre on the sky rather than on a simultaneous Fabry-Pérot calibration lamp, with the A fibre placed on our science target (Pepe et al. 2002a) . Whilst the O _AB mode prevents us from reaching the most accurate mode of HARPS, this gives us a chance to subtract moonlight contamination from our spectra, which can introduce a secondary spectral component, something our sample is designed to avoid. Since most our systems have V magnitudes between 9 and 12, photon noise rarely reaches down to the 1 m s −1 long-term stability of the instrument (Lovis & Pepe 2007; Mayor et al. 2009 ), removing the need for simultaneous calibration. Calibration of the instrument is done at the start of night using Fabry-Pérot and Th-Ar calibration lamps as is now standard on HARPS (Coffinet et al. 2019) . The data is analysed on a bi-monthly basis for quality control (for instance verifying whether the residual residual RMS around the binary solution is low enough to allow exoplanet detections). Exoplanet candidate identification are done only once a year. We chose on purpose to perform those candidate searches rarely in order to avoid falling into observer's bias. A homogeneously observed dataset also provides more robust detection limits. On account of the rather long orbital periods expected for circumbinary planets (50+ days; Martin 2018), we are only now reaching the ability to confirm exoplanetary candidates. A similar procedure is performed with SOPHIE, on the Northern sample. It will be more specifically described once SOPHIE data are presented for publication. Observations are obtained using the HARPS instrument on the ESO 3.6m telescope situated at the La Silla observatory in Chile (Mayor et al. 2003) . Reduction of the spectroscopic data was carried out by the HARPS pipeline (Lovis & Pepe 2007) . A Cross Correlation Function (CCF; Baranne et al. 1996; Pepe et al. 2002b ) is created by comparing the spectra obtained by the spectrograph with a G2 or K5 template mask spectra (depending on the spectral type of the primary star in question). We remind here, that our targets are chosen such that the spectra of the secondary star is too faint to be detected. Typically our primary stars are > 4 magnitudes brighter than the secondaries. This allows us to safely ignore the secondary CCF and treat each system as a single-star system (spectroscopically speaking; ). The correlation is evaluated at 0.5 km s −1 intervals and the CCF indicates with its lowest point the radial velocity at which the mask corresponds most closely with the targets' spectra. The CCF is fit with an inverted Gaussian profile and its mean recorded as the radial-velocity. Various shape metrics are obtained from the CCF itself, such as the span of the inverse of the mean bisector slope (bisector span) and the Full Width at Half Maximum (FWHM). These are often used as indicators of stellar activity (Queloz et al. 2001; Santos et al. 2002) but mainly they track the quality of our observation and whether any radial-velocity displacement is caused by a change in the shape of the CCF, or by a translation of the CCF, which is what we are after. Eleven systems (9 on HARPS and 2 with SOPHIE) with particularly strong anti-correlation have been dropped from the observing schedule. A anti-correlation between bisector span and RV is likely caused by starspots on the surface of the target creating parasitic signals (e.g. Queloz et al. 2001) . Prior to fitting our radial-velocities we clean any obvious outliers from our data. First, we exclude any observations mistakenly obtained on a star other than our target. This is often seen as a significant difference in the spectrum's signal-to-noise and in the FWHM. Second, we remove any measurement likely affected by the Rossiter-McLaughlin effect (e.g. Triaud 2018). To find which measurements are affected, we fit a Keplerian binary model to the radial-velocities, and from the fit parameters compute when eclipses would happen. In a third step, we examine the distribution of all bisector span and FWHM measurements of a given system, and exclude measurements that are further from the mean by more than 3 . We also perform target per target visual inspections of these metrics and flag systems where the bisector span and/or the FWHM appear to show a long term trend. For this paper we select three systems from within the full BEBOP South sample. We choose our Southern sample for this exercise as we have obtained more high precision data with a longer baseline than what is currently available in BEBOP North. We select these three from the sample for several reasons. The first is that no planetary or stellar activity signal are currently visible. Second, we select amongst the systems with the lowest radialvelocity uncertainty, and lowest RMS scatter, after having removed the contribution of the secondary star to the radial-velocities of the primary (that we refer to as residual RMS). We do this to specifically demonstrate BEBOP's ability to recover circumbinary planets with signals of a few m s −1 . Of our 41 Southern targets with more than the average of 23 spectra we have identified 17 with a residual RMS < 10 m s −1 , corresponding to roughly 42% of the sample. While 27 of these targets (66%) have an RMS < 17 m s −1 . Three systems are selected from within these, which we now describe: 1) EBLM J0310-31 (J0310-31 thereafter) has been part of the preliminary survey for BEBOP on HARPS, and of the first extensive observing campaign. This is the target for which the largest number of spectra has been obtained. In total, 65 RV data points are available, obtained between 2017-07-09 and 2020-01-02. These measurements also have the lowest mean uncertainty of the survey RV = 2.13 m s −1 . J0310-31's residual RMS = 3.16 m s −1 , which is one of the smallest we obtain so far. All these make J0310-31 the ideal target to test our procedures to compute detection limits, as well as perform injection-recovery tests on. See figure 1 for a phased plot of our RV data for J0310-31, along with our best fit model and residuals. 2) EBLM J1928-38 (J1928-38 for short). Its residual RMS = 2.96 m s −1 , better than J0310-31, close to its mean photon noise uncertainty RV = 2.67 m s −1 . Only 25 measurements have been obtained on J1928-38 so far, which makes it more representative of the current state of the survey than J0310-31 is. These were collected between 2018-06-04 and 2019-09-14. 3) EBLM J1540-09 (henceforth J1540-09) has RV = 3.86 m s −1 , and RMS = 3.75 m s −1 , for 41 available spectra, observed from 2017-04-20 to 2020-03-06. These targets were analysed using the kima RV analysis package (see next section). Their parameters can be found in Table 1 . All data can be accessed in Tables A5 to Table A7 . All measurements for these three systems are used, with no outliers found within those three datasets. For the analysis of the radial velocities and the calculation of detection limits, we use the kima package presented in Faria et al. (2018) . The code models the RV timeseries with a sum of Keplerian functions from p orbiting planets, estimating the posterior distributions for all the orbital parameters. To sample from the joint posterior distribution, kima uses the Diffusive Nested Sampling (DNS) algorithm from Brewer et al. (2011) . Together with posterior samples, DNS provides an estimate for the marginal likelihood, or evidence, of the model, which can be used for model comparison (e.g. Brewer 2014; Feroz et al. 2011) . Fixing the values of p in sequence, we can use the ratio of the evidences to compare models with different number of planets. In addition, since DNS can be used in a trans-dimensional setting (Brewer & Donovan 2015) , the number of planets in the system p itself can be a free parameter in the analysis, and its posterior distribution can be estimated together with that of the orbital parameters. The posterior probability for each p value then allows for the same model comparison, with the advantage of being obtained from a single run of the algorithm. The DNS algorithm samples from a mixture of distributions which is not directly the posterior. A total number of samples s from this target distribution results in a smaller number of effective samples eff from the posterior distribution. We obtain eff > 20, 000 effective samples for each model, which is more than enough to accurately characterize the posterior. Keplerian parameters are estimated from the effective posterior samples via the clustering algorithm HDBSCAN (McInnes et al. 2017) . First, crossing orbits are removed from the posterior samples. Those are proposed Keplerian orbits that cross with each other, or those with an eccentricity which would cause their orbit to cross into the instability region of the binary and are therefore unphysical (e.g. Dvorak et al. 1989; Holman & Wiegert 1999; Doolin & Blundell 2011; Mardling 2013) . HDBSCAN is applied on the remaining posterior samples, highlighting dense regions in parameter space. The cluster corresponding to the Keplerian signal is plotted with the Corner package (Foreman-Mackey 2016). Parameters are determined as the 50 th percentile of the cluster, with 1 uncertainties estimated from the 14 th and 84 th percentiles. To decide between competing models that fit our data we use the Bayes factor. The Bayes factor (BF, here onwards) is the ratio of the Bayesian evidence of the two competing models, and provides a measure of the support in favour of one over the other (Robert E. Kass; Adrian E. Raftery 1995; Trotta 2008) . Table 2 adapted from Trotta (2008) shows how the BF is measured and introduces the Jeffreys' scale (Jeffreys 1961 ) as a measure of evidence strength. To guide our identification of planetary candidates we follow the same Jeffreys' scale, but use custom thresholds that affect what response we have to the system. Like Trotta (2008) , we use BF = 12 as a threshold for 'moderate' evidence. To identify planetary candidates worthy of additional follow-up (to apply for additional telescope time for instance), we use BF > 6. Regularly, 3 marks a detection in Astronomy. To make sure we are over that value, we place a threshold at BF = 35. Any system reaching that level continues to be observed as any other, but we analyse its data more regularly than once a year. To visually track increasing evidence for a planetary signal, we place an additional division at BF = 70. Finally we place our upper threshold BF = 140, after which observations cease to be blind and we start to target specific epochs that represent poorly sampled orbital phases, or alternate solutions (eccentricity, 2 × p , etc). We use the thresholds described here in our insertion/recovery exercise below. Contrary to a regular planetary system, a circumbinary SB1 system is dominated by the reflex motion caused by the secondary star (typically tens of km s −1 ) and extremely well constrained by the data. In addition, we know there cannot be a planet at an orbital period 4 bin due to the instability region (Dvorak et al. 1989; Holman & Wiegert 1999; Doolin & Blundell 2011; Mardling 2013) 1 . In contrast, we have no idea whether a planet is within the system, or what its parameters might be. By default kima applies a common prior for all the orbiting objects (secondary, or planet) in a given model, which is not well adapted to our case. Particularly, there is no doubt that there is a secondary star in the system, and as such no need for the nested sampler to test that hypothesis. In this context, kima can instead use the so-called known object mode, where a specific set of priors can be placed on the binary properties, and another set of priors is used to explore the presence/parameters of putative planets. At each step kima then fits for the binary (the known object) and any additional Keplerian signal present in the data. Using this mode, the nested sampler only computes evidence for ≥ 1 Keplerian function, therefore only testing for the presence of circumbinary planets (≥ 0 planets). A typical kima run with 200, 000 saves, resulting in eff ≈ 25, 000 − 35, 000 posterior samples (depending on data-set) takes ≈ 180 minutes on a standard laptop computer. Here, we describe our methods, which are used similarly on all the systems we monitor in the survey. First, we give a simple parameterisation of the model. Then, we detail and justify the priors we use within kima (see also Table 3 ). We then demonstrate using kima to produce robust and fully Bayesian detection limits. Following that, we perform a more traditional insertion-recovery test of static Keplerian signals to compare with our kima detection limits. We then test the validity of injecting static Keplerian signals in comparison to N-body simulations. Finally, we discuss how our process compares with previous methods for detection limits, both for single and binary stars. With kima we assume independent, static Keplerian orbits for the binary and any circumbinary planets. This neglects the fact that the orbits of the planet and binary will evolve through three-body interactions (e.g. Martin & Triaud 2014; Kostov et al. 2014 ). however found such interactions to be negligible with respect to the CORALIE BEBOP survey. We will briefly verify if this assumption still holds with our more precise HARPS data in Sect. 4.4, (53) where we test the injection and retrieval of N-body simulated RV signals. The Keplerian models we fit to the data are defined by the following parameters. For the binary we have bin (the orbital period), A (the semi-amplitude of the primary star, caused by the secondary), bin (the binary's orbital eccentricity), bin (the argument of periastron) and bin (the starting phase of the orbit). We can calculate the time of periastron passage from these parameters using peri = 0 − ( × )/(2 ), where 0 is the chosen epoch. The number of planets in the system is referred to as p . Planetary parameters are defined like for the binary, but written with a subscript p, e.g. p for a planet's orbital period. In addition, we fit for the systemic velocity , and for a jitter term, jit that rescales the uncertainties on the data. An increase in this parameter is penalised when computing the likelihood. We can also fit for offsets in data between instruments, but this not necessary for the three systems we analyse here. We use priors similar to those laid out in Faria et al. (2020) , but adapted to circumbinary planets. As described in Sect. 3, we treat the signal produced by the secondary star as a known object, taken to be obviously present in the data and place tight priors on its parameters. We only compute Bayesian evidence for any additional signals to the inner binary. This is an advantage of using kima, both the orbit of the binary and any additional signal are fit to the data simultaneously, each posterior sample obtained has a corresponding binary fit. Table 3 shows the prior distributions utilised in our analysis. The secondary's orbital parameters prior distributions are based on values from an initial fit. We take the mode of each binary parameter and set to the prior to uniformly explore a range around that value. That range is determined from fitting all of the binaries in our sample. For instance, we explore ±0.01 day around the orbital period, and ±10 m s −1 around the semi-amplitude (Table 3) . While these might seem tight, the typical precision obtained on bin and A is 100× less than the prior ranges we set (see Table 1 ). As such the binary priors we chose are wide enough to enclose the true parameters and their uncertainties. To search for additional signals besides the binary (i.e. circumbinary planets), we use a uniform distribution for , , , and p , as there is no reason to favour any particular value within these parameter spaces. For planetary eccentricities, we utilise a Kumaraswamy distribution (Kumaraswamy 1980) , using values for our shape parameters as = 0.867 and = 3.03 as justified in Kipping (2013), closely resembling the Beta distribution described there. A Kumaraswamy distribution favours lower values but still permits the exploration of higher eccentricities when the data allows. The most sensitive parameter to sample when determining robust detection limits is planetary semi-amplitudes p . For this reason, we use a transformed Log-Uniform (Jeffreys) prior. We cap the upper range, choosing 100 m s −1 (any signal greater than this would be obvious in our data). Thanks to various tests, we noticed that setting a specific lower limit to p influenced our ability to recover planets and estimate robust detection limits. When we set the lower limit too high (e.g. 1 m s −1 ) it excludes any signals < 1 m s −1 in strength that could lie formally undetected in our data, but effectively ignoring any contribution they may have on other signals and parameters. If instead we set the limit too low (e.g. 0.01 m s −1 ), the sampler does Notes: p denotes the Number of Planetary Keplerian signals to fit to the data. bin and bin denote the Period and eccentricity of the binary respectively. A denotes the semi-amplitude of the primary star, caused by the secondary. U denotes a uniform prior with an upper and lower limit, LU is a log-uniform (Jeffreys) prior with upper and lower limits, MLU is a modified log-uniform prior with a knee and upper limit, and K is a Kumaraswamy prior (Kumaraswamy 1980 ) which takes two shape parameters. not explore the higher p sufficiently (since they become relatively less likely). This can bias the detection limit to lower p due to low number statistics in the posterior, producing over-confident detection limits. Instead of setting a lower value we insert a knee, to give less priority to signals below a set value. Gregory (2005) advises using a knee at 1 m s −1 , which was 1/10 of their typical RV uncertainty. As our survey utilises RV data from both SOPHIE, HARPS and ESPRESSO, our uncertainties approach 1 m s −1 , we set our knee to 0.1 m s −1 . For values of 0 < p < 0.1 m s −1 the distribution acts as a uniform prior, whereas for p ≥ 0.1 m s −1 the priors follow the usual Jeffreys prior (see Gregory 2005) . For all other parameters we employ Log-Uniform (Jeffreys) priors since the range of values can span several orders of magnitudes. This goes for jit and p . The lower limit for the p prior is set to 4 × bin , following the instability region (see Sect. 3). This is a slightly conservative estimate for the instability region. For completeness, we explored the effect of placing a lower limit at p < bin . Results were similar, with the only effect being that posterior samples were dispersed over a larger parameter space resulting in a coarser posterior. We place the upper limit at p = 1 × 10 3 day, roughly the timespan of our longest observed systems: J0310-31. For simplicity and to allow for a fair comparison, this limit is used for all three systems. We increase the maximum limit to 2 × 10 4 day to generate detection limits, as described in Sect. 4.3, and witness how the detectable semiamplitude increases with increasing orbital period after exceeding the time-span of the data. When seeking planetary candidates, we prefer using a maximum period of 1 × 10 3 day, which saves computational power and ensures a finer sampling of the posterior. The presence of a planetary signal with p > 1 × 10 3 day in our data would be identifiable as an over-density of posterior samples at the upper limit of this prior. In which case we adjust this limit as required. The limit on p will extended as we accumulate more data. A unique feature of a diffusive nested sampler such as kima is that when forced to explore higher p than is formally detected, the sampler will produce a map of all signals that are compatible with the data. Since those proposed signals remain formally undetected, the posterior naturally produces a detection threshold. In our case this is made simpler by kima's known object mode (see Sect. 3), and the fact that no planets have been detected in those three systems thus far. We describe in Sect. 5.3 how we calculate a detection limit when a planetary signal is already present in the data. Later, we compare our results to a more traditional insertionrecovery test, the method of choice to assess detectability and measure occurrence rates in radial-velocities surveys (Cumming et al. 2008; Konacki et al. 2009; Bonfils et al. 2013; Rosenthal et al. 2021; Sabotta et al. 2021) . Using a diffusive nested sampler like kima also provides immediate advantages over insertion-recovery test. kima samples all orbital parameters in a fine manner. Particularly, this means that all orbital phases, all eccentricities and all periastra are sampled when typically only a small number of phases are tested, and that eccentricities are nearly always forced to zero. Detection limits obtained with kima marginalise over more orbital parameters than is computationally feasible with insertion-recovery tests, and kima assumes Keplerian signals, rather than sinusoidals (see next section). To create a detection limit, we run kima with priors as in Table 3 except that we fix p = 1. kima marginalises over all allowed parameter space, searching for any possible Keplerian signals producing a reasonable likelihood, in the process ruling out Keplerians that would otherwise have been detected. A key to producing a robust detection limit is to produce a well sampled posterior. For each system we compute three runs, ensuring > 20, 000 samples are obtained in each. To gather consistent results we set the number of saves in kima to 200, 000. The resulting ( p , p ) posterior's density is plotted in Figures 2 and 3 as a greyscale hexbin, with the detection limit being the top envelope. To compute the detection limit, we separate the posterior in logspaced bins in p , and within each bin we evaluate the maximum 99 th percentile of the p distribution. This is done with the caveat that ≥ 2 posterior samples must lie above the chosen sample in a given bin. We do this to prevent the detection limit from being affected by small number statistics in individual bins. We calculate a detection limit for each of the three runs we perform on each system, which we show with faded blue lines in Figures 2 and 3 . We also draw the detection limit obtained when combining all posterior samples from the three runs into a single sample, with a solid blue line. Proceeding this way allows us to produce a mean detection limit and to obtain a visual estimate of the uncertainty of that limit for each bin. Overall, all runs are compatible with one another. We stop computing the detection limit for all p exceeding the first p bin where the number of samples in the top 10% of p prior is larger than three, which is where the posterior is affected by the upper limit set for the p prior. The seminal (single-star) radial-velocity surveys of Cumming et al. (2008) ; Mayor et al. (2011); Howard et al. (2010) used injection and recovery methods as a means of determining injection limits. First, all known planets are removed from the data. Then a sinusoid of varying amplitude, period and phase is used to model a putative exoplanet, and applied to the data. Following, that, a periodogram of the data (typically a Generalised Lomb-Scargle) is computed. If it produces a peak below a certain false alarm probability (typically < 1%) near the injected period, the simulated planet is considered "detected". By reproducing this procedure over a grid of p and p inserted signal, it is possible find out for which values the planet is no-longer detectable. Martin et al. (2019) followed this approach for the CORALIE BEBOP survey, with one key difference that the planet signal was injected to a radial velocity data set where the binary signal had already been removed. Whilst this traditional approach may appear similar to ours, in practice they are distinct. When we use our Bayesian approach with kima to produce detection limits we answer the question "what is compatible with the data?", whereas with an insertion-recovery test, we ascertain "can this specific signal be found?". Here we create injection-retrieval tests for our three targets. There are two purposes of this. First, we investigate compatibility between our Bayesian kima-derived limit and the more traditional approach of injecting static Keplerian signals. Second, we follow the approach of to inject not just static sinusoids but an N-body derived signal using rebound, including all dynamical interactions between the planet and binary, to test the validity of the previous step. We add one circumbinary planet signal to the original data, and do so for a number of planetary semi-amplitudes p and orbital periods p . kima is then used to recover any Keplerian signals in the modified data-set with the resulting Bayes factors providing a measure of signal recovery success. The choice of p and p at which we inject simulated signals is informed by the data itself. For p , values are chosen as multiples of the residual RMS of each system. Values used can be found for each system in Tables A1-A4. For p there are three periods we test for all three targets. The first is 5.9× bin , which is similar to known circumbinary systems (Martin 2018 ) and slightly offset from the often unstable 6:1 mean motion resonance (Quarles et al. 2018 ). We also test 12 × bin and 486 days (365 × 4/3, to avoid but be close to the 1 yr alias). In some cases, we also simulate specific orbital periods, coinciding with features of the detection limits described in Sect. 4.3. Each inserted signal has an eccentricity p = 0.01 (small but non-zero), and argument of periastron p = 0. Then, we run kima as described previously, using priors as in Table 3 , including a uniform prior on p = (0, 3), reproducing how we conduct an open search of the system. Bayes factors are computed from each run and results are plotted in Figures 2 and 3 as coloured points. Their values can also be found in Tables A1 to A4. The binary and putative circumbinary planets for the three chosen periods are then simulated for the J0310-31 system using the rebound N-body code (Rein & Liu 2012) , with the IAS15 integrator (Everhart 1985; Rein & Spiegel 2015) . First we remove the best fit binary model from J0310-31's observed radial-velocity measurements, producing residuals. Then we use rebound to simulate a radial-velocity model of the binary and putative planet at each epoch of observation. We add this model to the residuals. Doing this preserves the scatter, uncertainties and observational cadence obtained in reality. kima is ran once again on each N-body simulated dataset, and Bayes factors computed and compared to those resulting from the injected static Keplerian signals.We discuss the result of this test in Sect. 5.3. All Bayes factor values are plotted in Fig. 2 as coloured triangles offset to the right, and can also be found in table A2. We begin by providing updated parameters on the three binary systems investigated. Then, we discuss the calculated detection limits for each binary system, along with how they compare to our Keplerian injection and recovery tests. Following this we discuss and present our results from the comparison between Keplerian and N-body simulation injection and recovery. We discuss how nested sampling provides more robust detection limits. Particularly, we describe how traditional insertion-recovery exercises over-estimate their ability to retrieve planets by assuming zero eccentricity. Table 1 shows updated orbital parameters from our most recent data for the systems investigated in this work. We provide 1 uncertainties for each parameter, in parenthesis. Uncertainties are symmetric and varying only in the second significant figure, where the largest of the two values were taken. jit is the only parameter that demonstrated an asymmetric distribution (tending to zero), where we consequently provide asymmetric uncertainties. Values agree with to within 1 − 2 for all parameters other than bin and correspondingly peri . Improvements in parameter precision of around 70% are be attributed to the increased number of high-precision measurements acquired using the HARPS spectrograph. The colour of these points represent the Bayes Factor (BF) discussed in Sect. 3, and measure the probability of recovery for the injected Keplerian signals. Every sample in the posterior used to compute detection limits involves a free fit of the binary as a well as a proposed planetary signal, producing a detection limit that marginalises over the binary parameters, an essential element for BEBOP. From these plots, it is evident that different runs of kima produce consistent detection limits with one another. The only inconsistencies are due to a low number of samples within a particular bin, a situation easily resolved by increasing the number of saved posterior samples. We also note that the posterior below the limit is uniformly sampled across parameter space, an indication of the reliability of the method we followed. The detection limit plots also show many features that are regularly seen in detection limits for exoplanets using the radial-velocity method. For all three systems we see an increased density of posterior samples, and a consequently raised detection limit near 0.5 and 1 yr orbital periods, corresponding to yearly aliases, caused by seasonal gaps in observations. In addition, we observe that our detection capability is broadly horizontal for p below the timespan of our data. Then, it increases linearly in log until it hits the upper limit of the p prior. For J0310-31, we reach a mean p detection of 1.49 m s −1 , for J1540-09, we get 2.06 m s −1 , and 2.15 m s −1 for J1928-38. For these three systems, we outperform any produced by the TATOOINE survey (Konacki et al. 2010 ) despite our systems being fainter by an average four magnitudes, validating our choice to monitor single-lined binaries to seek circumbinary exoplanets. These detection limits can be translated into masses, which are shown by additional lines in Figures 2 and 3 . Typically, we are sensitive to planets with masses between Neptune's and Saturn's for orbital periods between 50 to 1000 days respectively, a milestone. Injection-recovery test results are consistent with the detection limits obtained with kima at almost all orbital periods tested on each system. The largest deviation can be seen in Fig. 2 for J0310-31 at 74.59 d. Where the detection limit is approximately 2 m s −1 lower than the recovered injected signal strength. This is the only location where such a deviation is observed. Figure 4 is an example of how the value of the Bayes Factor evolves with the number of posterior samples during a kima run. Our example is J0310-31 data, with an injected planetary signal with p = 227.57 d, and p = 3.16 m s −1 . The uncertainty on the Bayes Factor (blue area) is estimated assuming a multinomial distribution. We use this figure as an indicator for fit convergence, i.e once the line asymptotes on a BF value the fit has converged. We also check how often we recover the Keplerian signal we insert. We find that orbital parameters of injected signals detected with BF > 35 are typically recovered to within 3 or 10% of their original values, except those with periods close to one or half a year. This is true for injected Keplerian signals, as well as the injected signals. The chosen phase of the injected Keplerian signal could in principle cause the discrepancy between the methods seen at p = 74.59 day for J0310-31. Figure 5 shows the same posterior samples as those in Fig. 2 but split into five different phase bins, with a detection limit calculated for each. Doing this is simple with kima, we simply select all posterior sample within a specific phase bin and reproduce the method in Sect 4.3. As can be seen in the figure, varying phase has little effect on the detection limit, which is the case for each target investigated. This consistency seen between phase bins further demonstrates how robust and consistent our detection limits are. This also highlights the usefulness and importance of using nested samplers such as kima to compute detectability curves. Results from systems simulated using rebound are shown in Fig. 2 , as coloured triangles offset to the right from similar, but Keplerian-only simulations. The only evident deviation between the two simulations Figure 6 . Plot of residual radial velocity amplitude as a function of time, after removing a Keplerian binary signal from N-body simulated J0310-31 radial-velocity dataset. Simulated data are computed with rebound including the binary and an orbiting circumbinary planet with p = 74.59 d and p = 6.32 m s −1 . To reveal the Newtonian perturbation, we plot one point for each night over the entire duration of our dataset. The blue line depicts 1.3× RMS within bins 45 day wide. We use this line as a visual guide to the amplitude of the effect. The shaded orange region illustrates the RMS scatter obtained on the data we collected on J0310-31. The bow-tie shape of the residuals is caused by apsidal motion of the binary ( bin ), caused by the planet, which a Keplerian model does not include. occurs at the shortest period investigated (74.59 day) here the Bayes factor for the rebound simulated data is higher for p = 3.95 m s −1 . The agreement in those results (see Table A2 ) confirms that assuming Keplerian functions has no obvious detriment to circumbinary planet detection within our current data just like for any planetary system so far, and as we had already seen in with more imprecise CORALIE data. This simply means that assuming a static Keplerian signal is sufficient for discovery. However, dynamical fits are known to be useful to constrain the physical and orbital parameters of circumstellar planetary systems (e.g. GJ 876; Correia et al. 2010) , and would likely be the case for circumbinary systems, as they have been for HD 202206 (Correia et al. 2005; Couetdic et al. 2010) , a system comparable to a circumbinary configuration. To better understand the non-Keplerian signal, we visualise the amplitude of the Newtonian interactions in Fig. 6 . We use rebound to simulate nightly RV data, for p = 6.32 m s −1 at p = 74.59 days (the shortest period and highest mass planetary signal simulated in this work, producing the largest Newtonian perturbation), over the timespan of our data on J0310-31. Figure 6 depicts the residuals after fitting and removing the Keplerian binary and planetary signals from the N-body simulated dataset. The residuals show that N-body model diverges from the Keplerian model due to apsidal precession ( bin ). However, for the timespan of our current data that divergence remains comparable to the residuals' RMS. Assuming Newtonian effects are non-negligible will likely cease to be valid for longer time-series. Radial velocity measurements of binary stars are also affected by weaker effects, such as tidal distortion, gravitational redshift, transverse Doppler and light time travel effects (Zucker & Alexander 2007; Konacki et al. 2010; Arras et al. 2012; Sybilski et al. 2013) . kima assumes purely Keplerian functions, ignoring these effects, here we explore what impact this may have on our fit, and particularly on our ability to retrieve planets. We calculate the magnitude of tidal distortion on our RV measure-ments as in Arras et al. (2012) , and find their amplitude is < 0.7 m s −1 for the three systems we investigate in this paper, an insignificant contribution to the observed RV scatter. We do note however, that for the shortest period binaries in the BEBOP survey, the magnitude of this effect becomes significant (with respect to the RMS of the systems) and should be accounted for in any further analysis. Computing equations from Zucker & Alexander (2007) for the parameters of each of our three systems, we find the magnitude of transverse Doppler effects to be < 2.2 m s −1 , and the peak to peak variation < 1.6 m s −1 . Similarly, we find the magnitude and peak to peak variation of light time travel effects is < 2.7 m s −1 . The only relativistic effect found to be greater than the RMS on these systems is that of gravitational redshift, with a maximum amplitude of 13.9 m s −1 but a peak to peak variation of < 6.5 m s −1 . To better asses the impact of these effects on our work, we simulate a purely Keplerian binary signal for J0310-31 using the same cadence as our observations were obtained in, and add these post-Newtonian effects. We then fit the data using kima. Figure 7a shows residuals after removing the best fit Keplerian model of the binary. The residuals have an amplitude of ≈ 0.1 m s −1 and a signal phasing with the binary (as expected). This shows that most of the post-Newtonian effects are absorbed by the Keplerian fit. We now use kima once more, to "search for a planet" in these residuals, in order to study how the algorithm behaves in the presence of additional coherent signals. The posterior of that fit is plotted in figure 7b , and demonstrates the periodicity of these remaining signals at harmonic periods of the binary with low semi-amplitude. Individually these effects boast significant amplitudes, but as their functional forms resemble, and phase, with a Keplerian, only small differences are produced: Our simulations show that the majority of the post-Newtonian signals are absorbed into our fit. They are absorbed primarily into the systemic velocity, and lead to a small underestimation of the secondary's semi-amplitude. For J0310-31, ignoring post-Newtonian effects in this experiment creates an underestimated A ≈ 4 m s −1 , corresponding to a 0.01% deviation. We conclude that these effects do not affect the detection limits of this paper greatly, and can be safely ignored for our purposes. Detection limits thus far are all generated from our current data on systems purposefully chosen with no candidate planetary signals. The ultimate goal of the BEBOP survey is to detect RV signals of circumbinary planets in our data. Once a system is found to have a planetary signal, a detection limit is required to rule out the presence of additional companions above our calculated mass limit. To this end, we calculate the detection limit for J0310-31 with our N-body simulated data-set containing an injected 4.74m s −1 , 486 d signal. A signal of this strength is easily detected in the data (see Tab. A2 for its Bayes factor value). We identify the injected signal as in Sect. 4.4, with p varying uniformly from 0 to 3, as is standard for the survey. Figure 8 shows the resulting posterior samples, along with a green dot illustrating the semi-amplitude and orbital period of the injected signal. To obtain a recovered detection limit we subtract the Keplerian signal corresponding to the posterior model with the highest likelihood from the simulated data. Once subtracted, we run kima again as in Sect. 4.3 and calculate the red recovered detection limit from the resulting posterior samples. The blue line seen in Fig. 8 is taken from Fig. 2 for reference, and the recovered red line closely resembles the original blue detection limit. We note that while this approach is not strictly Bayesian, it recovers the correct detection limit, and is expected to affect only a few systems (with planets) within the survey. Thanks to our use of kima to produce detection limits we are able to explore the effect of assuming p = 0 on detection limits for exoplanet radial-velocity surveys. Most often detection limits are produced with multiple insertionrecovery tests. To increase efficiency, and remain computationally tractable, a number of assumptions are made. Circular orbits are generally assumed in the inserted signals used to calculate detection limits because this removes two dimensions: p and p (e.g. Cumming et al. 2008; Zechmeister et al. 2009; Howard et al. 2010; Bonfils et al. 2011; Mayor et al. 2011; Bonfils et al. 2013; Sabotta et al. 2021) 2 . In addition, a short number of discrete planet phases p are usually sampled (e.g. 10; Martin et al. 2019 or 12; Zechmeister et al. 2009; Bonfils et al. 2011 Bonfils et al. , 2013 . Commonly when orbits are assumed circular, investigators invoke Endl et al. (2002) as a justification. Endl et al. (2002) investigated the effect of eccentricity on their detection limit for HR 4979, using insertion and recovery. They test five planetary orbital periods p , each with one corresponding semi-amplitude p , and vary p over 9 values and p over 4, creating 180 ( p , p , p , p ) signals 3 . They only test one phase angle p . At the time, producing these simulations represented a significant computational effort, however nowadays this appears as a rather coarse grid. From this exercise, Endl et al. (2002) conclude that eccentricity can affect detection limit calculations, but find that their detection limits assuming p = 0 are valid for p 365 days, so long as the simulated Keplerian has p 0.5. More recently Cumming & Dragomir (2010) also concluded that assuming circular orbits provides a good agreement with upper semi-amplitude limits for p 0.5 when recovering signals with a periodogram. These particular results have been invoked to justify the assumption of circular orbits ever since. Interestingly, for p ≤ 365 days (Endl et al. 2002 ) find their detection limit is only valid if the inserted signal has < 0.3. However this recommendation has not been followed, and assuming circular orbits for short period is prevalent throughout the literature. From our analysis, we can show how eccentricity affects the detection limit over the entire period range. Our kima runs contain > 70, 000 posterior samples that naturally explore all orbital parameters. Figure 9 shows the same detection limit for EBLM J0310-31, as calculated before, in blue using all posterior samples. Alongside we plot a red detection limit produced in exactly the same way, except using only samples with p < 0.1, and green with p < 0.5 (the limit stated by Endl et al. 2002; Cumming & Dragomir 2010) . For the three systems explored in this work, the red detection limit is systematically lower, by an average of 24.7% or ∼ 1 m s −1 for periods < 1000 day. The green detection limit is also systematically lower by an average of 13.3% or ∼ 0.6 m s −1 for the same periods. Figure 10 demonstrates how the two lower detection limits differ from the original for J0310-31 with period. We note an increase in divergence from the original detection limit at p exceeding the timespan of the data (> 1000 day here), to a maximum of 39.9% at ∼ 4000 days. This maximum difference corresponds to 3.5 m s −1 (∼ 120 M ⊕ )! Our results are consistent with Wittenmyer et al. (2006) who are able to exclude planets with masses 2 M Jup by assuming p = 0, and masses 4 M Jup if instead they assume p = 0.6 at a given p ∼ 4300 days, a 50% difference, comparable to our 40%. Here we remind the reader that we assume a Kumaraswamy distribution as our prior on p , which already favours low p , as is seen in observations (Kipping 2013). Even though circumbinary planets discovered thus far are known to have relatively low eccentricities ( < 0.15 Martin 2018), we strongly caution against assuming circu-2 Circularity is assumed for recovery since most use a periodogram. With kima the full Keplerian is used for exploration and recovery 3 strangely, Endl et al. (2002) only mention 110 signals lar orbits when calculating detection limits in any survey, particularly at long orbital periods. Fixing p = 0 when calculating detection limits over-estimates the success of the program by a significant amount (up to 120 M ⊕ or 0.4 M Jup in our case). This exercise demonstrates the superiority of a diffusive nested sampler in establishing robust detection limits, which are crucial to infer the sensitivity of radial-velocity surveys, and consequently, the occurrence rates of exoplanets. We analyse high-precision radial-velocities obtained as part of a large scale, ongoing, radial-velocity survey that seeks circumbinary planets using radial-velocities obtained with HARPS, ESPRESSO and SOPHIE, in both hemispheres, on single-lined eclipsing binaries. This survey is called BEBOP (Binaries Escorted By Orbiting Planets). We then detail an observing and Bayesian analysis protocol and test it on data collected for three single-lined binaries within the survey. Our analysis shows for the first time, a repeated ability to detect circumbinary planets with masses between Neptune's and Saturn's, for orbital periods within 1000 days with as few as 25 spectra in the span of a year. Figure 11 displays a mass vs period plot of the detection limits from this work (blue) along with confirmed exoplanets (grey circles; NASA exoplanet archive 4 ), transiting circumbinary planets (magenta diamonds), and Solar System planets for comparison. This figure demonstrates our ability to detect sub-Saturn mass circumbinary planets at periods up to 1000 days. Our data are able to detect a large fraction of currently known systems. We note though that many circumbinary planet detections made with eclipse timing variations are upper limits only. We also present a method to compute detection limits on radialvelocity data by using a diffusive nested sampler, without the need to assume circular orbits as is the norm. We then show that this method is superior than the usual injection-recovery tests. Assuming circular orbits when determining detection limits generates over-optimistic detection limits by an average of ∼ 1 m s −1 (24.7%) at periods < 1000 day, and up to 120 M ⊕ at periods > 1000 day. We therefore strongly caution against assuming circular orbits when calculating detection limits, and suggest kima as a way for exoplanet surveys in general to extract accurate sensitivity limits and occurrence rates with fewer assumptions. Thanks to the protocols and tests described in this paper, the BE-BOP survey is now ready to produce circumbinary planet candidates following Bayesian evidence, and able to compute occurrence rates that can be compared to those already established from photometric methods (Armstrong et al. 2014; Martin & Triaud 2014 ). Producing occurrence rates and upper limits on the occurrence of circumbinary planets will be done by simply combining all kima produced detection limits. An efficient integrator that uses Gauss-Radau spacings VizieR Online Data Catalog Theory of probability Populations of Planets in Multiple Star Systems The Messenger Mass vs period plot showing the three detection limits from this work as blue lines in comparison to confirmed exoplanets as grey circles, and transiting circumbinary planets as pink diamonds. Solar System planets are depicted as yellow dots for reference. Circumbinary planets in order of increasing period Kepler-38 b, Kepler-35 b, Kepler-64 b, Kepler-1661 b, Kepler-47 d, TIC-1729 b The Messenger Ground-based and Airborne Instrumentation for Astronomy II The Rossiter-McLaughlin Effect in Exoplanet Research Contemporary Physics We would like to thank the staff at ESO's La Silla observatory for their hard work and assistance throughout this project, especially during the COVID pandemic. We also thank all of the observers who took part in the HARPS timeshare and were instrumental in collecting data for this projects. We particularly thank X. Dumusque and F. Bouchy for their work organising the timeshare.This research received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement n • 803193/BEBOP) and from the Leverhulme Trust (research project n • RPG-2018-418). MRS would like to acknowledge the support of the UK Science and Technology Facilities Council (STFC). We thank H. Rein for helpful discussion regarding this work. This work made use of the Astropy, numpy, pandas, scipy, corner, and matplotlib packages. We also use of label_line.py by Luke Zoltan Kelley. The French group acknowledges financial support from the French Programme National de Planétologie (PNP, INSU). ACMC acknowledges support by CFisUC projects (UIDB/04564/2020 and UIDP/04564/2020), GRAVITY (PTDC/FIS-AST/7002/2020), ENGAGE SKA (POCI-01-0145-FEDER-022217), and PHOBOS (POCI-01-0145-FEDER-029932), funded by COMPETE 2020 and FCT, Portugal. The data we used is available for download at the ESO public archive 5 . Search for Prog.ID 099.C-0138 and 1101.C-0721. Here we present the results of our insertion/recovery tests for each of our three target systems in tables A1 to A4, Numbers in the tables are Bayes factors calculated from our kima runs. We also present our RV data on the systems in tables A5 to A7. This paper has been typeset from a T E X/L A T E X file prepared by the author.