key: cord-0308842-di4aj5v6
authors: Leenaars, C.; Teerenstra, S.; Meijboom, F.; Bleich, A.
title: Predicting animal to human translation: A proof of concept study using qualitative comparative analysis
date: 2022-02-01
journal: nan
DOI: 10.1101/2022.01.31.22270227
sha: 76745affae9411dd0530cc408471de5daaec03be
doc_id: 308842
cord_uid: di4aj5v6

Drug development suffers from high attrition rates; promising drug candidates fail in clinical trials. Low animal-to-human translation may impact attrition. We previously summarised published translational success rates, which varied from 0% to 100%. Based on analyses of individual factors, we could not predict translational success.

Several approaches exist to analyse effects of combinations of potential predictors on an outcome. In biomedical research, regression analysis (RGA) is common. However, with RGA it is challenging to analyse multiple interactions and specific configurations (≈ combinations) of variables, which could be highly relevant to translation.

Qualitative comparative analysis (QCA) is an approach based on set theory and Boolean algebra. It was successfully used to identify specific configurations of factors predicting an outcome in other fields. We reanalysed the data from our preceding review with a QCA. This QCA resulted in the following formula for successful translation:

~Old*~Intervention*~Large*MultSpec*Quantitative

This means that, within the analysed dataset, the combination of relative recency (~ means not; >1999), analyses at event or study level (not at intervention level), n<75, inclusion of more than one species and quantitative (instead of binary) analyses always resulted in successful translation (>85%). Other combinations of factors showed less consistent or negative results. An RGA on the same data did not identify any of the included variables as significant contributors.
While these data were not collected with the QCA in mind, they illustrate that the approach is viable and relevant for this research field. QCA seems a highly promising approach to furthering our knowledge of animal-to-human translation and decreasing attrition rates in drug development.

While the debate on the relevance and acceptability of animal experimentation remains polarized [1-4], animal experiments are still hard to avoid in the process of bringing new drugs to the market. However, the predictive value of animal experiments has limits, and poor translation from animal experiments to humans may contribute to the high attrition rates in drug development [5]. Explaining attrition can contribute to more efficient drug development, which is one of the reasons why we analyse translational success. Another is animal welfare; we cannot defend using animals for translational experiments that do not provide relevant information.

While the most common approach to evaluating translation is mechanistic and qualitative, we started focussing on quantitative studies in a scoping review of reviews [6].
In that review, we observed translational success rates from 0% to 100% (median: 64%; interquartile range (IQR): 44-79%). To identify factors contributing to translational success, we visualised these data by several potentially predictive factors, which are explained further below. Relevant for this paper are: definition type (binary vs. continuous definitions of translation), unit of analysis (whether the results were analysed at the event, intervention or study level), species, the number of included observations (events, interventions or studies), and the year of publication. There was no apparent relationship between any of these individual factors and the percentage of translational success. However, the effects of combinations of these potential factors on translational success could still be relevant. We thus performed additional analyses on our previously collected data, which are described in this paper.

The following five factors were further analysed based on their theoretical relevance: publication age, unit of analysis, analysis size, inclusion of multiple species and type of definition for translation (binary vs. continuous). Publication age was included because the state of science and the quality of animal models are thought to improve over time, and because animal-to-human translation has received more attention in the last decade, which may result in improvements.

The unit of analysis was previously extracted as a categorical variable with 3 possible values: event, intervention or study. Events were mainly specific adverse events observed in animals, humans or both, where the observed translation (e.g. the percentage of adverse events observed in both animals and humans) depends on study size and chance; larger studies have a larger chance of picking up rare events [7].
Interventions were mostly specific drugs, where certain groups of interventions may translate better than others [8], and the observed translational success rate depends on the sampling, e.g. which group of drugs was analysed. Analyses at the level of individual studies show translation as results that corresponded between animals and humans, which makes the observed translational success rate depend on multiple factors, including the population and experimental design of the compared studies [9].

The number of included observations was counted at the level of the unit of analysis, and could thus reflect a number of events, interventions or studies. It was included in the current analyses as a proxy for power, as underpowered studies can result in erroneous conclusions, which may impact translation [10]. While some authors argue that species differences introduce uncertainties that seriously limit the validity of animal studies [11], investigating multiple species can at least theoretically improve translation, as successful transfer across a first species barrier may be predictive of crossing a second.

The definition type for animal-to-human translation could be binary, i.e., there was successful translation or there was not, or continuous, which could refer to a percentage of success, a correlation coefficient between animal and human data, a percentage overlap in confidence intervals, etc. Binary definitions can of course be expressed as percentages of success, but the type of definition may impact the observed translational success.

As multiple roads lead to Rome, multiple combinations of these factors may lead to translational success. In scientific terms, there may be causal complexity, comprising equifinality (i.e., there are multiple routes to success) [12] and conjunctural causation (i.e., combinations of factors may be involved instead of individual factors) [13].
Besides, causation may be asymmetrical [14]; while the presence of a factor may contribute to success, its absence does not necessarily result in failure (and vice versa). Thus, we have a configural research question: "Which factors, individually or in combination, are necessary or sufficient for successful animal-to-human translation?"

medRxiv preprint; doi: https://doi.org/10.1101/2022.01.31.22270227; this version posted February 1, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

To analyse effects of multiple potential predictors on an outcome, regression analysis (RGA) is common. However, RGA is not well suited to research questions comprising multiple interactions and specific configurations (≈ combinations) of predictors. Qualitative comparative analysis (QCA) is an approach developed for configural research questions [15]. It is based on set theory and Boolean algebra. QCA is increasingly used to identify specific configurations of factors predicting an outcome in other fields [16, 17]. We reanalysed the data from our preceding review with a crisp-set QCA (csQCA) [18]. To test the added value of this QCA approach, we compared it with a classical RGA.

Data collection and selection
We reanalysed the data published in our systematic scoping review [6] for this study. This preceding scoping review was an umbrella review of reviews that addressed animal-to-human translation quantitatively, comparing the results of studies including at least 2 species, one of which was human. Data were extracted from the included publications to Microsoft Excel.
When an included paper described multiple studies or analyses on different data, all those compliant with the inclusion criteria were included as a separate "case" in our analyses. When the original authors did not express translation as a percentage, but provided the data needed to do so, we calculated the percentage and added it to the respective case.

From these already published data, we selected the following factors as theoretically relevant for further combined analyses (as explained in the introduction): definition type (binary vs. continuous), unit of analysis (event, intervention or study), species, the number of included observations and the year of publication. Cases with missing data for any of the analysed factors were excluded from the analyses (numbers are mentioned in the results).

We calibrated all data to create so-called crisp sets as described in table 1. Data were calibrated on theoretical grounds and based on expert opinions from within our network. For example, cut-offs for old publication age at around the start of the century were based on the use of the internet becoming increasingly common in research. Also, more than 75 observations is considered a large study in the qualitative field. Data distributions were known at the time of calibration, and informed set definitions to the extent that (near) empty sets were consciously prevented.
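The calibration step can be illustrated with a short sketch. The Python snippet below is not the authors' code (their analyses were run in R); it is a minimal illustration that applies the cut-offs of table 1 to one case. The input field names (`year`, `unit`, `k`, `n_species`, `definition`, `success_pct`) and the example values are hypothetical. Cases with intermediate translational success (45% to 80%) get no crisp outcome and are returned as `None`.

```python
from typing import Optional


def calibrate(case: dict) -> Optional[dict]:
    """Turn one raw case into crisp-set memberships (1 = in the set, 0 = not).

    Cut-offs follow table 1; the input field names are hypothetical.
    """
    pct = case["success_pct"]
    if pct > 80:
        outcome = 1          # clear translational success
    elif pct < 45:
        outcome = 0          # clear translational failure
    else:
        return None          # intermediate success: no crisp outcome

    return {
        "Old": int(case["year"] < 2000),
        "Intervention": int(case["unit"] == "intervention"),
        "Large": int(case["k"] > 75),
        "MultSpec": int(case["n_species"] >= 2),
        "Quantitative": int(case["definition"] == "continuous"),
        "Translation": outcome,
    }


# Hypothetical example case (values invented for illustration only).
example = {
    "year": 2012,
    "unit": "study",
    "k": 22,
    "n_species": 2,
    "definition": "continuous",
    "success_pct": 90.0,
}
print(calibrate(example))
```

Returning `None` for intermediate outcomes mirrors the selection of only clearly successful or clearly unsuccessful cases for the csQCA, while the same cases could still enter a regression with their original percentage.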
Table 1: Data calibration for QCA
Set | In the set (1) | Not in the set (0)
Old | Publication date <2000 | Publication date ≥2000
Intervention | "Interventions" were the unit of analysis | "Events" or "studies" were the unit of analysis
Large | k>75 | k≤75
MultSpec | At least two animal species were analysed | At most one animal species was analysed
Quantitative | Translation was calculated in a continuous manner | Translation was defined in a binary manner
Translation (outcome) | Success: >80% | Failure: <45%

Units of analysis in the included reviews could be interventions, publications & studies, and particular (e.g., adverse) events. We distinguished observations at the intervention level from those at the event and study level, as the latter two are both chance processes, while at the intervention level we can imagine a clear distinction between, e.g., compounds that do translate well and those that do not.

[...] accepted, but they were not used to inform logical minimisation (in the QCA package's truthTable function: incl.cut = 1, n.cut = 1, pri.cut = 0). Also, because we were aware that relevant factors were missing from this proof of concept study, we did not analyse coverage of the solutions; i.e., which part of the cases could be explained with the final formula.

Regression analysis
All cases with complete data were included for the RGA. The following variables were included in the RGA: definition type (binary vs.
continuous), unit of analysis (event, intervention or study), multiple species, the number of included observations and the year of publication. Compared to the QCA, we included more data in the RGA. Because we did not have to dichotomise data into crisp sets, we included all cases with full data, including those with translational success from 45% to 80%, with the original percentage as the outcome. The variables for the number of included observations and the year of publication were also included as numbers instead of being dichotomised. The variables definition type, unit of analysis and multiple species were included as binary variables, exactly as in the QCA.

Regression analysis was performed with the lm function from R's basic stats package [19]. We tested a single model, including all variables included in the QCA individually. To provide a comparison with the QCA outcome, we further added an interaction term for the variables "MultSpec" and "Quantitative".

In our original review, we included 232 cases from 121 references. From these original cases, the 104 without missing data but with clearly successful or clearly unsuccessful translation were included in the QCA. Of these, 50 showed successful translation and 54 did not.

The different observed set configurations with the outcomes are summarised in a truth table (table 3). The truth table shows that 9 configurations had inconsistent (both successful and unsuccessful [...]

This means that the combination of relative recency (~ means not), analyses at event or study level, n<75, inclusion of more than one species and quantitative analyses resulted in successful translation (>85%).
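To make the truth-table step and the reading of the formula concrete, here is a minimal Python sketch. It is not the QCA package used by the authors; it only shows, under stated assumptions, how calibrated cases can be grouped by configuration, how a row is labelled when full consistency is required (comparable in spirit to incl.cut = 1), and how the success formula is evaluated for one case. The toy cases are invented for illustration and do not reproduce the study data.

```python
from collections import defaultdict

CONDITIONS = ("Old", "Intervention", "Large", "MultSpec", "Quantitative")


def truth_table(cases):
    """Group calibrated cases by configuration and label each row."""
    rows = defaultdict(list)
    for case in cases:
        config = tuple(case[c] for c in CONDITIONS)
        rows[config].append(case["Translation"])

    table = {}
    for config, outcomes in rows.items():
        incl = sum(outcomes) / len(outcomes)  # share of successful cases in the row
        if incl == 1.0:
            label = "success"
        elif incl == 0.0:
            label = "failure"
        else:
            label = "contradictory"           # both outcomes observed
        table[config] = {"n": len(outcomes), "incl": incl, "label": label}
    return table


def success_formula(case):
    """~Old*~Intervention*~Large*MultSpec*Quantitative ('~' = not, '*' = and)."""
    return (not case["Old"] and not case["Intervention"] and not case["Large"]
            and bool(case["MultSpec"]) and bool(case["Quantitative"]))


def make_case(old, intervention, large, multspec, quantitative, translation):
    values = (old, intervention, large, multspec, quantitative)
    return {**dict(zip(CONDITIONS, values)), "Translation": translation}


# Invented toy cases, for illustration only.
toy_cases = [
    make_case(0, 0, 0, 1, 1, 1),
    make_case(0, 0, 0, 1, 1, 1),
    make_case(1, 0, 0, 0, 0, 0),
    make_case(0, 1, 1, 0, 0, 1),
    make_case(0, 1, 1, 0, 0, 0),
]

for config, row in truth_table(toy_cases).items():
    print(config, row)
```

In this toy data, the configuration matching the success formula is fully consistent, while the configuration with one successful and one unsuccessful case is flagged as contradictory and would not inform minimisation.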
Further evaluation of the two cases consistent with this formula shows that they were both derived from the same publication; they are two meta-analyses (for different outcomes) including both animal and human data (further described in the discussion). The results from both meta-analyses showed a high degree of overlap between the animal and the human data.

We initiated these analyses as a first exploration and proof of principle of the QCA method in meta-research of animal-to-human translation. QCAs have successfully been performed on data from systematic literature reviews in other fields [16, 17, 23, 24]. However, to the best of our knowledge, we are the first to perform a QCA with animal metadata, and to use it to analyse animal-to-human translation.

Our QCA resulted in a preliminary success formula for translational success at the meta-level; recent small reviews with analyses at the event or study level including more than one species and using a quantitative definition of translation were consistent with successful translation. While the effect sizes and directions of the RGA were consistent with these results, hence supportive of the QCA, the RGA did not identify any of the variables, nor the interaction term, as statistically significant. This shows the strength of the QCA approach.

Cases consistent with the QCA-derived formulae
The formula for translational success was based on 2 meta-analyses, which both came from the same paper [25].
The authors performed an in-depth systematic review on guided tissue regeneration for periodontal infrabony lesions. They included 13 human and 9 animal papers, with varying study quality scores. The approach in their paper can be considered exemplary in synthesising animal and human data; combining them into a sub-grouped meta-analysis of percentages of bone filling, allowing for cross-species comparisons.

The formula for translational failure was based on 4 cases from 2 papers including newer studies [26, 27], combined with 4 papers combining into a single term for older studies [28-31]. To start with the newer studies: Olson et al. [27] described large analyses of animal studies in dogs, primates, rats, mice and guinea pigs. The authors were fairly optimistic in describing that 71% of human adverse events were somehow predicted in an animal model, but they also detailed low concordance rates in toxicity. Musther et al. [26] described correlational analyses of oral bioavailability, and concluded that bioavailability in animals is not predictive of that in humans. They provided separate data for mice (30 compounds), rats (122 compounds) and dogs (125 compounds), which were individually included in our analyses. Their monkey data (41 compounds) were included in the RGA, but excluded from the QCA because of an intermediate translational success rate.

To continue with the older studies: Litchfield [28] concluded that many serious side effects that can occur when a drug is given to humans were not predictable from observations on dogs or rats.
The rat data were included in the QCA as clear translational failure; the dog data were only included in the RGA because of intermediate translational success. Steinberg & Schlesselman [31] compared the effects of pancreatitis therapeutics between 13 human studies and 25 animal studies in dogs, pigs, rats and guinea pigs, with low correspondence between the results.

Schein co-authored two publications in 1973 that both described multiple analyses included in our RGA, most with translational success rates between 45 and 80%. In one publication Schein and Anderson carefully concluded that combining data from multiple species could reduce false negatives for prediction of human adverse events, but one of their data sets reflected translation below 45% and was included in our QCA [29]. In the other publication, Schein et al. concluded that animal models can predict a substantial part of the adverse events occurring in clinical use [30], but again, translation was low in one of their data sets which we included in the QCA.

The term in the formula for translational failure that combines the 4 configurations listing these older studies (Old*~Large*~Quantitative) effectively illustrates the concept of logical minimisation, and thereby the potential of the QCA method.

The formula we present here for translational success is restricted to smaller studies, and two of the three terms in our formula for translational failure cover large studies. While translational success being related to smaller studies may seem counterintuitive, we would like to mention that the familiarity with the data and the individual cases may be better with smaller studies, which might benefit the quality of the work, which, in specific configurations, could positively affect translational success.
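The logical minimisation mentioned above for the term Old*~Large*~Quantitative can be sketched in a few lines of Python. This is a simplified illustration of pairwise Boolean reduction (the merging step of a Quine-McCluskey style procedure), not the algorithm of the QCA package, and it omits the later step of selecting a minimal cover. It assumes that the four older-study configurations differ only in Intervention and MultSpec, which is what the reduced term implies but is not listed explicitly in the text.

```python
from itertools import combinations

CONDITIONS = ("Old", "Intervention", "Large", "MultSpec", "Quantitative")


def merge(a, b):
    """Merge two terms that differ in exactly one condition.

    The differing condition is replaced by '-' (irrelevant for the outcome).
    Returns None when the terms cannot be merged.
    """
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1 or "-" in (a[diff[0]], b[diff[0]]):
        return None
    i = diff[0]
    return a[:i] + ("-",) + a[i + 1:]


def minimise(terms):
    """Repeat pairwise merging until no term can be reduced further."""
    terms = set(terms)
    while True:
        merged, used = set(), set()
        for a, b in combinations(list(terms), 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        if not merged:
            return terms
        terms = merged | (terms - used)


def to_formula(term):
    """Write a term in the notation used in the text ('~' = not, '*' = and)."""
    parts = []
    for name, value in zip(CONDITIONS, term):
        if value == 1:
            parts.append(name)
        elif value == 0:
            parts.append("~" + name)
    return "*".join(parts)


# Assumed configurations: Old = 1, Large = 0, Quantitative = 0,
# with Intervention and MultSpec taking all four combinations.
older_failures = [
    (1, 0, 0, 0, 0),
    (1, 0, 0, 1, 0),
    (1, 1, 0, 0, 0),
    (1, 1, 0, 1, 0),
]

print([to_formula(t) for t in minimise(older_failures)])
```

Under this assumption, the four configurations reduce to the single term Old*~Large*~Quantitative, because Intervention and MultSpec take every possible value without changing the outcome.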
The data presented here were not collected with QCA in mind, and are therefore not optimised for this approach. However, they still resulted in preliminary formulae consistent with translational success and failure, based on relatively few consistent configurations. We expected contradictory lines in the truth table; configurations that were not consistent with the outcome. We hypothesise that this is mainly due to not all relevant factors being included in this QCA.

Future studies should gather data for more factors, but also on a larger number of cases to fill the logical remainders. A 2-step approach could be considered; a first large QCA could comprise multiple factors relating to the meta-level. A second QCA could be restricted to the successful configurations from the first, and address factors at the primary study level. With more cases included, and less concern about logical remainders, multivalue QCAs (mv-QCAs) [32] may well be preferable. Concerning the currently included factors, it would be relevant to distinguish studies at the event and the study level instead of grouping them outside the set of studies at the intervention level. While it may seem like an attractive idea to include a factor for individual species, the resulting truth table would become very large and have many logical remainders for the less frequently used species. However, a category distinguishing, e.g., rodents, non-human primates and other mammals could be viable for future work.

QCAs can also be applied to other types of data than literature [14, 15, 18], which may make other types of data and variables accessible for analyses.
E.g., individual compounds or targets can be defined as cases in a QCA, or communication and consideration of all available data in experimental design can be added as factors. There are indications that animal data are insufficiently considered in the design of human trials [33, 34], which might partially explain translational failure. Data could be gathered from multiple sources, including investigators' brochures, ethics applications and patent registrations.

Implications and conclusion
While our results are not conclusive and need confirmation, analysing multiple species in combination with analysing translational success quantitatively may be the optimal approach for future studies aiming to translate to humans. Analysing animal-to-human translation quantitatively as a percentage of correspondence instead of making simplified binary yes/no distinctions fairly reflects the available data. While we do not encourage increasing the number of animal studies overall, if a study aiming at translation is considered to be necessary, we may need to get used to the idea of testing more than one species.

In this paper, we present the first QCAs addressing translational success and failure rates. While the data were not collected with this method in mind, we show that the approach is viable, relevant and promising. Further knowledge on animal-to-human translation may help to improve efficiency in research and drug development, and to focus animal studies on where they are predictive.
Advancing nonclinical innovation and safety in pharmaceutical testing
How the COVID-19 pandemic highlights the necessity of animal research
Animal Experimentation: Working Towards a Paradigm Change
The role of 'public opinion' in the UK animal research debate
Can the pharmaceutical industry reduce attrition rates?
Animal to human translation: a systematic scoping review of reported concordance rates. Epub 2019/07/17
Adverse events and vaccination-the lack of power and predictability of infrequent events in pre-licensure study
Prediction of human drug clearance from animal data: application of the rule of exponents and 'fu Corrected Intercept Method' (FCIM)
A Systematic Review Comparing Experimental Design of Animal and Human Methotrexate Efficacy Studies for Rheumatoid Arthritis: Lessons for the Translational Value of Animal Studies. Animals (Basel)
Why most published research findings are false
Is it possible to overcome issues of external validity in preclinical animal research? Why most animal models are bound to fail
General systems theory: Applications for organization and management
Conjunctural causation in comparative case-oriented research
Between complexity and generalization: Addressing evaluation challenges with QCA
Using qualitative comparative analysis to study causal complexity
The use of Qualitative Comparative Analysis (QCA) to address causality in complex systems: a systematic review of research on public health interventions
An overview of qualitative comparative analysis: A bibliometric analysis
Not Yet Fuzzy? Assessing the Potentials and Pitfalls
R_Core_Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing
Read Excel Files
dplyr: A Grammar of Data Manipulation
Weight management programmes: Re-analysis of a systematic review to identify pathways to effectiveness
Comparison of treatment effects of guided tissue regeneration on infrabony lesions between animal and human studies: a systematic review and meta-analysis
Rostami Hodjegan A. Animal versus human oral drug bioavailability: do they correlate
Concordance of the toxicity of pharmaceuticals in humans and in animals
Forecasting drug effects in man from studies in laboratory animals
The efficacy of animal studies in predicting clinical toxicity of cancer chemotherapeutic drugs
Qualitative aspects of drug toxicity in prediction from laboratory animals to man
Treatment of acute pancreatitis. Comparison of animal and human studies
A Reassessment of the (Putative) Pitfalls of Multi-value QCA
Investigator brochures for phase I/II trials lack information on the robustness of preclinical safety studies
Preclinical efficacy studies in investigator brochures: Do they enable risk-benefit assessment?