key: cord-0806439-woa7cbxf authors: McAndrew, Thomas; Cambeiro, Juan; Besiroglu, Tamay title: Aggregating human judgment probabilistic predictions of the safety, efficacy, and timing of a COVID-19 vaccine date: 2022-02-28 journal: Vaccine DOI: 10.1016/j.vaccine.2022.02.054 sha: 0b62dc7b9195e56511642833cc3902b39ff16559 doc_id: 806439 cord_uid: woa7cbxf

Safe, efficacious vaccines were developed to reduce the transmission of SARS-CoV-2 during the COVID-19 pandemic. But in the middle of 2020, vaccine effectiveness, safety, and the timeline for when a vaccine would be approved and distributed to the public were uncertain. To support public health decision making, we solicited trained forecasters and experts in vaccinology and infectious disease to provide monthly probabilistic predictions from July to September of 2020 of the efficacy, safety, timing, and delivery of a COVID-19 vaccine. We found that, despite sparse historical data, a linear pool (a combination of human judgment probabilistic predictions) can quantify the uncertainty in clinical significance and timing of a potential vaccine. The linear pool underestimated how fast a therapy would show a survival benefit and the high efficacy of approved COVID-19 vaccines. However, the linear pool did make an accurate prediction for when a vaccine would be approved by the FDA. Compared to individual forecasters, the linear pool was consistently above the median of the most accurate forecasts. A linear pool is a fast and versatile method to build probabilistic predictions of a developing vaccine that is robust to poor individual predictions. Though experts and trained forecasters did underestimate the speed of development and the high efficacy of a SARS-CoV-2 vaccine, linear pool predictions can improve situational awareness for public health officials and make the risks, rewards, and timing of a vaccine clearer to the public.
(Dated: February 9, 2022)

health officials, and has the potential to contribute to decisions related to vaccine research and development. For the public, predictions of the efficacy and safety of a vaccine may lessen vaccine hesitancy [72] [73] [74] [75].
For public health officials, predictions of the time to approval and time to manufacture a vaccine serve as valuable input in supply chain management, including logistics planning, inventory management, and material requirements planning [76] [77] [78]. For those in vaccine research and development, probabilistic predictions may help determine which vaccine platforms and pathways should be pursued before others [79] [80] [81] [82].

To the best of our knowledge, this work is the first to generate ensemble probabilistic predictions from expert and generalist forecasters on COVID-19 vaccine development and share these with the public and public health decision-makers from June 2020 through September 2020, before the first approved COVID-

Subject matter experts (SMEs) and trained forecasters (definition below) participated in four surveys from June 15th, 2020 to August 30th, 2020. SMEs and forecasters were asked to predict aspects of safety, efficacy, and delivery of a COVID-19 vaccine (see [35] to view all four summary reports, questions asked of forecasters, and collected prediction data used in this work; names and affiliations of forecasters who chose to volunteer this information are provided at the end of each summary report).

Subject matter experts and trained forecasters were solicited by sending personal emails (see Supp. S7 for a template email sent to subject matter experts). Solicitation started on June 3, 2020 and ended on July 8, 2020.

The first two weeks of each month (the 1st to the 14th) were used to develop questions that could address changing information about a COVID-19 vaccine.

Forecasters received a set of questions on the 15th of each month and from the 15th to the 25th could submit predictions using the Metaculus platform [83]. Forecasters made a first prediction and could revise it as many times as they wished between the 15th and 25th.
To reduce anchoring bias [84], between the 15th and the 20th forecasters made predictions without knowledge of other forecasters' predictive densities. From the 20th to the 25th a community predictive density, an equally weighted combination of predictive densities from subject matter experts and trained forecasters, was available to forecasters.

We defined a subject matter expert as someone with training in the fields of molecular and cellular biology, microbiology, virology, biochemistry, and infectious diseases, who has several years of experience in vaccine, antiviral, and/or biological research related to infectious agents, and who kept up-to-date with vaccine and antiviral research specifically focused on SARS-CoV-2/COVID-19. Subject matter experts were trained in biological sciences but were not required to have had prior experience making accurate, calibrated probabilistic predictions. Subject matter experts were recruited by asking for volunteers who were members of

II.4. Forecasting platform

Forecasters submitted predictive densities by accessing the Metaculus platform (Metaculus), an online forecasting platform that allows users to submit predictive densities and comments related to a proposed question. When using the platform, a forecaster can submit an initial prediction for a question and choose to revise their original prediction as many times as they wish. Metaculus stores individual predictions and comments for each question, and when the ground truth for a question is available the platform scores individual predictions and keeps a history of each forecaster's average score across all questions for which they have submitted predictions. The Metaculus platform allows participants to visualize their proposed predictive density both as a probability density function and as a cumulative distribution function.
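As a concrete picture of a predictive density viewed both as a probability density and as a cumulative distribution function, the sketch below evaluates a weighted mixture of logistic distributions, the form the platform's input tool uses (described later in this section). It is a minimal illustration, not the platform's implementation: the function names, the (weight, location, scale) parameterization, and the example numbers are all assumptions made here. The last two helpers compare upper tail probabilities of a logistic and a normal distribution at the same number of standard deviations.

```python
import math


def logistic_pdf(x, mu, s):
    """Density of a logistic distribution with location mu and scale s > 0."""
    z = abs(x - mu) / s          # the density is symmetric around mu
    e = math.exp(-z)             # exp of a non-positive number cannot overflow
    return e / (s * (1.0 + e) ** 2)


def logistic_cdf(x, mu, s):
    """Cumulative distribution function of the same logistic distribution."""
    z = (x - mu) / s
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def validate(components):
    """Check that (weight, mu, s) triples form a convex combination."""
    if not 1 <= len(components) <= 5:
        raise ValueError("expected between one and five components")
    if any(w < 0 or s <= 0 for w, _, s in components):
        raise ValueError("weights must be non-negative and scales positive")
    if abs(sum(w for w, _, _ in components) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")


def mixture_pdf(x, components):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * logistic_pdf(x, mu, s) for w, mu, s in components)


def mixture_cdf(x, components):
    """Mixture CDF: weighted sum of the component CDFs."""
    return sum(w * logistic_cdf(x, mu, s) for w, mu, s in components)


def logistic_upper_tail(k):
    """P(X > mean + k standard deviations) for any logistic distribution."""
    # A logistic with scale s has standard deviation s * pi / sqrt(3).
    return 1.0 / (1.0 + math.exp(k * math.pi / math.sqrt(3.0)))


def normal_upper_tail(k):
    """P(X > mean + k standard deviations) for any normal distribution."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


# Hypothetical two-component prediction for a quantity on a 0-100 scale.
example = [(0.7, 60.0, 5.0), (0.3, 85.0, 4.0)]
validate(example)
for x in (50.0, 60.0, 85.0):
    print(x, round(mixture_pdf(x, example), 4), round(mixture_cdf(x, example), 4))

# Heavier tails: three standard deviations out, the logistic keeps more mass.
print(logistic_upper_tail(3.0) > normal_upper_tail(3.0))
```

Writing the mixture CDF as the weighted sum of component CDFs keeps the two views consistent by construction, so a quantile read from the CDF matches the area under the plotted density.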
A private subdomain was created on Metaculus that allowed only subject matter experts and select trained forecasters to submit predictions and comments on these COVID-19 vaccine questions. Forecasters were encouraged, but not required, to answer all questions. When a user accesses Metaculus they are presented with a list of questions for which they can submit probabilistic predictions. For each question, the forecaster was presented background information about the specific question, including resources judged by the authors to be relevant and informative. This information was meant to be a starting point for forecasters to begin building a prediction, and forecasters were not constrained to use only the background information provided. Each question also contained a detailed statement of the resolution criteria: the criteria that describe, as precisely as possible, how the ground truth would be determined.

The online interface that was used by forecasters and subject matter experts presented, in order: textual background information, the question asked of the forecaster, and how the truth would be determined for the question. When available, the forecaster could observe the current and past community median, 25th, and 75th quantile predictions. Below the textual information forecasters were given a tool to form a predictive density as a weighted mixture of up to five logistic distributions (see Suppl. Fig. S6 for an example interface for forecasters and the following link to interact with a specific instance of the platform: https://covid.metaculus.com/questions/4822/when-will-a-sars-cov-2-vaccine-candidate-be-approved-for-use-in-the-united-states-or-european-union/). The logistic distribution resembles a normal distribution but has heavier tails.

Forecasters were notified by email that new questions were available on the forecasting platform.
The email contained summary information about the questions presented to forecasters and a line list of each question. Questions were formatted as hyperlinks that directed forecasters to the corresponding question on the forecasting platform. Forecasters were given the dates when predictions could be submitted and the dates when the real-time and past community predictions could be observed. The email also reiterated where forecasters could find reports on past questions and that questions and feedback could be sent to the authors (an example email is presented in Supp. S8).

Subject matter experts were not required to attend any formal training on how to form probabilistic predictions or on best practices in forecasting. However, the research team provided both an interactive tutorial (https://covid.metaculus.com/tutorials/Tutorial/) and a video tutorial (https://drive.google.com/file/d/1sYLif02wimQRi4alufU58YW6 a3ekuFP/view) to introduce the forecasting platform to subject matter experts and familiarize them with probabilistic prediction.

Experts and trained forecasters submitted a probabilistic density as a convex combination of up to five logistic distributions. Specifically, the ensemble probabilistic prediction $f_m$ for the $m$th forecaster is a weighted sum of up to five logistic densities, each with its own parameters $\mu$ and $\sigma$, with weights $\alpha_1, \alpha_2, \cdots, \alpha_5$.

When generating a prediction, participants specify at minimum one logistic distribution by moving a slider that corresponds to $\mu$ and compressing or expanding the same slider, which corresponds to $\sigma$. If a participant decides to add a second (or 3rd, 4th, or 5th) logistic distribution, they can click "add a component" for a second slider allowing the participant to "shift and scale" this additional logistic distribution. Two "weight" sliders also appear under each "shift and scale" slider that allow the participant to control the weights ($\alpha_1, \alpha_2, \cdots, \alpha_5$) of each individual logistic distribution. i.e.
$\pi_m = 1/M$, as we had little a priori reason to assign differential weights to participants.

We chose to score forecasts using the logarithmic (log) score [89, 90]. The log score assigns the logarithm of the density value corresponding to the eventual true value ($t$) of a target of interest:

$$\text{log score}(f) = \log f(t),$$

where $f$ is the predictive density submitted by an individual forecaster or linear pool. Log scores take values from negative to positive infinity. The worst possible log score a forecaster can receive is negative infinity (earned when the density value assigned to the actual outcome value is zero), the log score is greater than zero when the density assigned to the truth is greater than one, and the best possible score is positive infinity (earned when the density assigned to the actual outcome approaches positive infinity).

The log score is a proper scoring rule. If we assume a data generating process that follows a distribution $F$, then the expected score under a proper scoring rule is optimized when a forecaster submits the true density, the density of $F$, over potential values of a target, disincentivizing a forecaster from submitting a predictive density that does not accurately represent their true uncertainty over potential outcomes [42, 91].

Logarithmic scores were also transformed to scaled ranks (see S8.2 of the supplement). Given a set of $N$ log scores, the scaled rank assigns a value of $1/N$ to the smallest log score, a value of $2/N$ to the second smallest log score, and so on, assigning a value of 1 to the highest log score.

We refer to the log score generated by individual forecaster $i$ who made a prediction about question $q$ as $L_{i,q}$ in the following section. The model is a mixed effects regression of $L_{i,q}$ with fixed effects for $E$ and $C$ and a random intercept $\beta_q$ for each question, where $E$ is a binary variable that identifies whether a forecaster was an expert ($E = 1$) or not ($E = 0$), $C$ is a binary variable that indicates whether a forecast came from a linear pool ($C = 1$) or not ($C = 0$),
and $\beta_q$ is a normally distributed random intercept, with standard deviation $\sigma_q$, that accounts for the tendency for log scores to be clustered within each question. All statistical hypothesis tests are two-sided, and a p-value less than 0.05 is considered statistically significant.

The reported efficacy for a vaccine that uses a non-replicating viral platform was 66.9% and for a vaccine that uses a DNA/RNA platform was 95% [31, 33]. The linear pool mode prediction of efficacy made in July 2020 was 60% for a vaccine produced using a non-replicating viral platform and 65% for a vaccine produced using a DNA/RNA platform. The probability assigned to an efficacy below 50% was 0.35 for a non-replicating viral platform and 0.30 for a DNA/RNA platform. Though there is no reported efficacy for a vaccine using an inactivated virus or protein sub-unit platform, the linear pool mode prediction was 75% for an inactivated virus platform and 77% for a protein sub-unit platform (Fig. 1C). The probability assigned to an efficacy below 50% was

The linear pool median prediction for when a COVID-19 therapy would show a survival benefit was later than the truth. Though there is no ground truth on survival benefits by vaccine platform, linear pool predictions for safety did differ by platform.

We asked trained forecasters and experts six questions, one in June, two in July, one in August, and two in

The linear pool median prediction for when a SARS-CoV-2 vaccine would be approved in the EU or the US was later than the truth, and the median prediction of approval time in the US became more accurate as the time shrank between when the prediction was made and when the ground truth was available.

We asked trained forecasters and experts six questions, one in June, one in July, two in August, and two in September related to the timing of approval of a COVID-19 vaccine (Fig.
3)

The linear pool median prediction of when an approved vaccine would be administered to 100K people was later than the truth.

We asked trained forecasters and experts three questions, one in June and two in August, related to the speed to produce and administer a vaccine after approval (Fig. 4), and received 53 predictions (17 predictions per question on average). The date that the first SARS-CoV-2 vaccine to be approved and administered to more than 100K people (Table I).

show a survival benefit. Forecasts from the crowd, though they provided information, were also often broad. 80% confidence intervals for questions that asked the crowd to assign probabilities over dates spanned years.

and decision makers, care must be taken to assess environmental changes that could bias human judgment.

Linear pool prediction scores were not statistically different from individual prediction scores. A linear pool was never the highest scoring prediction; however, a linear pool was also never the lowest scoring prediction. This work cannot show that statistical aggregation in a noisy environment, like the COVID-19 vaccine landscape, is more accurate than individual predictions, but there is evidence that statistical aggregation may guard against inaccurate individual predictions.

In this work we are limited by the small number of questions we could ask forecasters, the number of questions that could be compared to the truth, and difficulties associated with human judgment.

Compared to computational models, forecasters must spend a significant amount of time and energy to answer questions, which limits the number of questions we can ask. Each question included background information that forecasters could use as a starting point for a more detailed analysis and to ease the cognitive burden of prediction. However, adding background information could cause biases such as anchoring.
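The scoring pipeline discussed above (an equally weighted linear pool with weights 1/M, the log score of a density at the true value, and the scaled rank of a set of log scores) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the three forecasters, their single-component logistic densities, and the outcome value are hypothetical, the function names are chosen here, and ties in the scaled rank are broken by input order because the text does not say how ties were handled.

```python
import math


def logistic_pdf(x, mu, s):
    """Density of a logistic distribution with location mu and scale s > 0."""
    e = math.exp(-abs(x - mu) / s)
    return e / (s * (1.0 + e) ** 2)


def linear_pool(densities, weights=None):
    """Return the pooled density sum_m w_m * f_m(x); equal weights 1/M by default."""
    m = len(densities)
    if weights is None:
        weights = [1.0 / m] * m
    return lambda x: sum(w * f(x) for w, f in zip(weights, densities))


def log_score(density, truth):
    """Logarithm of the density assigned to the true value (-inf if it is zero)."""
    p = density(truth)
    return math.log(p) if p > 0 else float("-inf")


def scaled_ranks(scores):
    """Assign 1/N to the smallest score, 2/N to the next, ..., 1 to the largest."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ties keep input order
    ranks = [0.0] * n
    for position, i in enumerate(order, start=1):
        ranks[i] = position / n
    return ranks


# Three hypothetical forecasters predicting a quantity on a 0-100 scale.
forecasters = [
    lambda x: logistic_pdf(x, 60.0, 8.0),
    lambda x: logistic_pdf(x, 75.0, 10.0),
    lambda x: logistic_pdf(x, 92.0, 3.0),
]
truth = 95.0  # hypothetical outcome used only for illustration

individual_scores = [log_score(f, truth) for f in forecasters]
pool_score = log_score(linear_pool(forecasters), truth)

all_scores = individual_scores + [pool_score]
print([round(s, 3) for s in all_scores])
print(scaled_ranks(all_scores))
```

The pooled density at the true value is a weighted average of the individual densities, so it lies between the smallest and largest of them, and because the logarithm is increasing the pool's log score lies between the worst and best log scores of the forecasters it was built from. That is one reason a pool of all forecasters cannot be the single worst prediction on a question, and it can match the best only when every member assigns the same density to the truth.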
The cognitive burden on forecasters can also be observed in the small percent of textual comments made by forecasters. Forecasters were able to submit comments along with quantitative predictions, but a small

prediction. An ongoing challenge in human judgment forecasting is to solicit a prediction, data used to make the prediction, and a rationale.

TABLE I: Table of coefficients, 95% confidence intervals, and p-values for a mixed effects model with log score as the dependent variable, three fixed categorical factors to identify log scores from trained forecasters (the reference category), subject matter experts, or linear pool models, and a random intercept by question. There were 234 log scores (observations) from fifteen separate questions that were used to fit this model. The marginal $R^2$ for this model, or the variance explained by the three fixed categorical variables, is 0.005, and the conditional $R^2$, the variance explained by the fixed effects and the random intercept for each question, is 0.43. Trained forecasters have the highest average log score, followed by linear pool models and subject matter experts. However, there is not enough evidence to conclude that these differences are statistically significant.

(A.) Linear pool predictive percentiles made in June and in July 2020 for the date when a SARS-CoV-2 vaccine will be approved for use in the US or European Union (EU). (B.) Linear pool predictive percentiles for the date a SARS-CoV-2 vaccine will be approved for use in the US or EU through a standard approval process (blue) or an emergency use authorization (red), and (C.) linear pool predictive percentiles for the date a SARS-CoV-2 vaccine will be approved for use specifically in the US through a standard approval process (blue) or an emergency use authorization (red).
The linear pool median predictions made in June and July for when a SARS-CoV-2 candidate would be approved in the US or EU were many months later than the truth (May, 2021 and April, 2021 vs Dec., 2020). Linear pool median predictions of the date of emergency and standard approval of a SARS-CoV-2 vaccine in the US or EU were less accurate than predictions of approval dates for the US only. Environmental cues, the time between when the forecast was made and the truth, or how the question was asked may have impacted predictive accuracy.

Dear ,

We read your work on with great interest, and invite you to join a collaborative group of select experts. We are building expert consensus predictions about the development of SARS-CoV-2 vaccines and COVID-19 therapeutics each month with a small group of researchers involved in the study of novel therapeutics/vaccines. Experts are surveyed about their predictions about the future development of SARS-CoV-2 vaccines and COVID-19 therapeutics, and the results are being aggregated into a consensus.

We feel your skill set defines you as an expert in this field. Would you participate, alongside other experts, in our survey project to forecast the development of SARS-CoV-2 vaccines and COVID-19 therapeutics? Your anonymized predictions will contribute to an expert consensus made available to the public, and sent to the CDC to provide support for public health decision making.

Our main goal is to provide public health officials probabilistic predictions from experts on the research and development of vaccines and therapeutics. We have made the results from our first survey available here.

Our past work, in collaboration with Thomas McAndrew at the University of Massachusetts at Amherst, focused on forecasting the early trajectory of COVID-19 in the United States.
Forecasts and predictions generated by 41 experts in the modeling of infectious disease were featured in outlets such as Science, FiveThirtyEight, and The Economist, and were also sent to the CDC to support the US COVID-19 response.

If you are available to participate in at least one survey, please respond to this email, preferably by . We expect to administer the next monthly survey on . We would be thrilled to welcome you, if only just for one round.

An expert consensus can produce forecasts on a diverse range of vaccine and therapeutic solutions that computational and statistical methods cannot, and we feel your expertise will make impactful and meaningful contributions. Please reach out to us with any questions.

Sincerely,

FIG. S7: A template email used to solicit forecasting participation from subject matter experts in molecular and cellular biology, microbiology, virology, biochemistry, and infectious disease who have had several years of experience studying vaccine, antiviral, or biological research related to infectious agents.

The fourth COVID-19 Countermeasures session has just opened!

This month's survey focuses on the differences between a SARS-CoV-2 vaccine approved in the US via a normal approval process and an emergency approval process, as well as on the recent contradictory statements in the US between the White House, FDA, and CDC on vaccine timelines and mask wearing. Finally, for the first time we are asking experts and trained forecasters to provide us with purely text-based responses (Q6 and Q7), a unique aspect of this work that computational models cannot provide.

The goal of this set of predictions is to support public health decision making, provide best estimates that allow the public to make informed decisions, and address current controversies between the White House, FDA, and CDC.

Here are all of the questions:

As always, if you have any questions or feedback feel free to reply to this email. Thank you for your participation.
Those of you who have participated in previous sessions can find our first three reports here.

How many SARS-CoV-2 vaccine candidates will be in human trials as of 1 August 2020? 26
When will a SARS-CoV-2 vaccine candidate be approved for use in the United States or European Union? Dec. 21, 2020
When will a SARS-CoV-2 vaccine candidate demonstrate >70% efficacy? Dec. 10, 2020
When will a COVID-19 therapeutic or therapeutics cocktail show a statistically significant survival benefit for the treatment group in an n>200 RCT? July 17th, 2020
What will be the efficacy of the first US- or EU-approved SARS-CoV-2 vaccine based on a non-replicating viral vector platform?
What will be the efficacy of the first US- or EU-approved SARS-CoV-2 vaccine based on a DNA or RNA platform? 95%
When will a SARS-CoV-2 vaccine candidate be approved for use in the United States or European Union? Dec. 21, 2020
When will a SARS-CoV-2 vaccine candidate be approved for use in the US or EU through a normal approval process? Dec. 21, 2020
When will a SARS-CoV-2 vaccine candidate be approved for use in the US or EU through an emergency approval process? Dec. 10th, 2020
What will be the efficacy of the first US- or EU-approved SARS-CoV-2 vaccine candidate approved through a normal approval process? 95%
What will be the efficacy of the first US- or EU-approved SARS-CoV-2 vaccine candidate approved through an emergency approval process?
When will a SARS-CoV-2 vaccine candidate be approved for use in the US through a normal approval process? Aug. 23, 2021
When will a SARS-CoV-2 vaccine candidate be approved for use in the US through an emergency approval process? Dec. 10th, 2020
What will be the efficacy ratio of the first SARS-CoV-2 vaccine candidate approved on an emergency basis (numerator) compared to the first approved through a normal process (denominator)? 1.0

FIG.
S9: Log scores for individual trained forecasters (red circles) and a linear pool of trained forecasters (red diamond), individual subject matter experts (blue circles) and an expert linear pool (blue diamond), and a linear pool of both trained forecasters and experts (black X) for questions related to the efficacy and timing of approval of a vaccine. Individuals, and so linear pools, received higher average log scores for questions related to whether a vaccine would demonstrate an efficacy above 70% and the efficacy of a vaccine that uses a non-replicating platform when compared to questions related to the efficacy of the first vaccine approved. Individuals and linear pools received log scores for questions related to the timing of approval of a vaccine that were similar to those for questions related to vaccine efficacy. None of the linear pool predictions received the highest log score for any one question; however, the trained forecaster plus expert linear pool also never received the lowest log score. Aggregating trained forecasters and subject matter experts has the potential to guard against an individual forecast with poor accuracy.

FIG. S10: Log scores for individual trained forecasters (red circles) and a linear pool of trained forecasters (red diamond), individual subject matter experts (blue circles) and an expert linear pool (blue diamond), and a linear pool of both trained forecasters and experts (black X) for questions related to safety, the number of vaccine candidates, and the estimated efficacy of a vaccine approved under emergency authorization divided by that of a vaccine approved under a standard process. Linear pool predictions again never attained the highest or the lowest log score among all forecasters.

FIG. S11: The log score across all questions for individuals and the same three linear pool predictions.
Though linear pool predictions were able to guard against poorly performing individual predictions, there was little difference in predictive performance between a linear pool of trained forecasters, a linear pool of experts, or a linear pool of both.

figure S12 and figure S13 ).

FIG. S12: Scaled ranks for individual trained forecasters (red circles), subject matter experts (blue circles), and three linear pool distributions: (i) a linear pool of trained forecasters (red diamond), (ii) a linear pool of subject matter experts (blue square), and (iii) the linear pool of both trained forecasters and experts (black X) for 5 questions related to efficacy and 7 questions related to the timing of vaccine approval with ground truth.

FIG. S13: Scaled ranks for individual trained forecasters (red circles), subject matter experts (blue circles), and three linear pool distributions: (i) a linear pool of trained forecasters (red diamond), (ii) a linear pool of subject matter experts (blue square), and (iii) the linear pool of both trained forecasters and experts (black X) for questions related to the safety of a vaccine, number of vaccine candidates, and the difference in efficacy between a vaccine approved under an emergency process versus standard process with ground truth.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Probabilistic forecasting.
Annual Review of Statistics and Its Application
A general framework for forecast verification
Combining expert opinions
Combining forecasts: A review and annotated bibliography
A collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the United States
Adaptively stacking ensembles for influenza forecasting
Expert Judgement in Risk and Decision Analysis
Aggregating predictions from experts: A review of statistical methods, experiments, and applications
Probability forecasts and their combination: A research perspective
Psychological strategies for winning a geopolitical forecasting tournament
Improving forecasting performance by exploiting expert knowledge: Evidence from Guangzhou port
Predictive assessment of fish health and fish kills in the Neuse River estuary using elicited expert judgment
Use of probabilistic expert judgment in uncertainty analysis of carcinogenic potency. Regulatory Toxicology and
Estimating the strength of expert judgement: the case of US mortality forecasts
Characterization of precipitation through copulas and expert judgement for risk assessment of infrastructure
Analysis of expert judgment in a hail forecasting experiment. Weather and Forecasting
Health risk assessment for nanoparticles: A case for using expert judgment
The value of performance weights and discussion in aggregated expert judgments
Combining statistical and judgmental forecasts via a web-based tourism demand forecasting system
Probabilistic risk analysis in subsurface hydrology
Structured expert elicitation about Listeria monocytogenes cross-contamination in the environment of retail deli operations in the United States
A structured expert judgment study for a model of Campylobacter transmission during broiler-chicken processing
A human judgment approach to epidemiological forecasting
Ensemble forecast of human West Nile virus cases and mosquito infection rates
Malaria early warnings based on seasonal climate forecasts from multi-model ensembles
World Health Organization estimates of the relative contributions of food to the burden of disease due to selected foodborne hazards: a structured expert elicitation
Evaluation of a performance-based expert elicitation: WHO global attribution of foodborne diseases
Attribution of global foodborne disease to specific foods: Findings from a World Health Organization structured expert elicitation
Attribution of illnesses transmitted by food and water to comprehensive transmission pathways using structured expert judgment, United States. Emerging Infectious Diseases
Ecological theory to enhance infectious disease control and public health policy. Frontiers in Ecology and the Environment
Health behavior theory for public health: Principles, foundations, and applications
Contact structure, mobility, environmental impact and behaviour: the importance of social forces to infectious disease dynamics and disease ecology
Covid-19 vaccine intentions in the United States, a social-ecological framework. Vaccine
Vaccinating the UK against covid-19
A study of forecasting practices in supply chain management
Supply chain forecasting: Theory, practice, their gap and the future
Real options based analysis of optimal pharmaceutical research and development portfolios
Uncertain mean-variance and mean-semivariance models for optimal project selection and scheduling. Knowledge-Based Systems
A robust R&D project portfolio optimization model for pharmaceutical contract research organizations
A comprehensive framework for project selection problem under uncertainty and real-world constraints
Judgment under uncertainty: Heuristics and biases. Science
Identifying and cultivating superforecasters as a method of improving probabilistic predictions
Superforecasting: The art and science of prediction. Random House
The psychology of intelligence analysis: Drivers of prediction accuracy in world politics
Allocating the weights in the linear opinion pool
Rational decisions
Combining dynamical and statistical ensembles
Probabilistic forecasts, calibration and sharpness
The good judgment project: A large scale test of different methods of combining expert predictions
Safety and efficacy of the ChAdOx1 nCoV-19 vaccine (AZD1222) against SARS-CoV-2: an interim analysis of four randomised controlled trials in Brazil, South Africa, and the UK
Dexamethasone in hospitalized patients with covid-19
EMA recommends first covid-19 vaccine for authorisation in the EU
FDA approves first covid-19 vaccine
Covid-19 vaccine doses administered
Estimation of clinical trial success rates and related parameters

1. When will a SARS-CoV-2 vaccine candidate be approved for use in the US through a normal approval process?
2. When will a SARS-CoV-2 vaccine candidate be approved for use in the US through an emergency approval process?
3.
What will be the efficacy ratio of the first SARS-CoV-2 vaccine candidate approved on an emergency basis SAEs) being attributed within one year to the first SARS-CoV-2 vaccine approved in the US through a normal approval process? 5. What is the probability of at least ten serious adverse events (SAEs) being attributed within one year to the first SARS-CoV-2 vaccine approved in the US through an emergency approval process? 6. What will be the most common adverse event, serious or not serious, of the first US-approved SARS-CoV-2 vaccine? 7. How long after the approval of a SARS-CoV-2 vaccine of at least 50% efficacy would you continue to recommend the general public wear masks? What percentage of the US population would have to be vaccinated for your view to change? We very much encourage you to share your reasoning and analyses in the comments with other experts From now until 25 September, the community prediction will be hidden. Subsequently from 25 September until 30 When will a SARS-CoV-2 vaccine candidate be approved for use in the US or EU through a normal approval process? When will a SARS-CoV-2 vaccine candidate be approved for use in the US or EU through an emergency approval process? What will be the efficacy of the first US-or EU-approved SARS-CoV-2 vaccine candidate approved through a normal approval process? What will be the efficacy of the first US-or EU-approved SARS-CoV-2 vaccine candidate approved through an emergency approval process? How many weeks after approval will the first 100 million doses of the first US-or EU-approved SARS-CoV-2 vaccine candidate based on a DNA or RNA platform be manufactured? How soon after approval will the first 100 million doses of the first US-or EU-approved SARS-CoV-2 vaccine candidate based on a non-replicating viral vector platform be manufactured? When will an orally administered SARS-CoV-2 antiviral show a statistically significant survival benefit for the treatment group in an n>200 RCT? 
What will be the SARS-CoV-2 infectivity of children relative to adults when schools are open?

The accuracy of a linear pool of trained forecasters plus experts was in between the accuracy of a linear pool generated from trained forecasters and the accuracy of a linear pool generated from experts, except for a single question where the accuracy of a subset of individual experts was very poor (Fig. S9 and Fig [...] Across all fifteen questions where the truth could be determined, the log scores for a linear pool of trained forecasters plus experts had a smaller interquartile range when compared to individual forecasters, though median scores were similar between individuals and all three linear pool distributions [...] and for individual subject matter experts was 1 [...] The linear pool of trained forecasters and experts generated a higher average log score than both the linear pool of trained forecasters and the linear pool of experts for 1/6 questions [...] The average log score for an expert linear pool was higher compared to a linear pool of trained forecasters when asked for the date a vaccine will be approved in the US/EU, the date a vaccine will be approved through an emergency authorization [...] The 25th and 75th percentiles for log scores [...] matter experts (Table I); however, these results did not meet statistical significance. Ninety-five percent confidence intervals around the difference in log scores between subject matter experts and trained forecasters, and between linear pool models and trained forecasters, were large. We do not have enough data on forecast accuracy to conclude statistical significance at a type I error of 5%.
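To make the comparison between individual forecasters and a linear pool concrete, the following is a minimal Python sketch rather than the procedure used in this study. It assumes that each forecast is a probability vector over the same discrete bins and that the pool uses equal weights. The function names (`linear_pool`, `log_score`) and the three example forecasts are hypothetical and chosen only for illustration.

```python
import math


def linear_pool(forecasts, weights=None):
    """Weighted average of probability vectors defined over the same bins."""
    n = len(forecasts)
    if weights is None:
        # Assumption for this sketch: every forecaster gets equal weight.
        weights = [1.0 / n] * n
    n_bins = len(forecasts[0])
    return [
        sum(w * f[k] for w, f in zip(weights, forecasts))
        for k in range(n_bins)
    ]


def log_score(forecast, true_bin):
    """Natural log of the probability assigned to the bin that occurred."""
    p = forecast[true_bin]
    # A forecast that assigns zero probability to the truth scores -infinity.
    return math.log(p) if p > 0 else float("-inf")


# Hypothetical forecasts from three forecasters over three outcome bins.
forecasts = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.0, 0.1, 0.9],
]
true_bin = 0  # hypothetical index of the bin containing the true outcome

pool = linear_pool(forecasts)
individual_scores = [log_score(f, true_bin) for f in forecasts]
pool_score = log_score(pool, true_bin)

print("individual log scores:", individual_scores)
print("linear pool log score:", pool_score)
```

In this toy example one forecaster assigns zero probability to the true outcome and so receives a log score of negative infinity, while the pool still assigns the truth positive probability and scores above the median individual. This is one way to read the claim that a linear pool is robust to poor individual predictions.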