key: cord-0946402-g5qh2dcp
authors: Tzou, Philip L.; Tao, Kaiming; Kosakovsky Pond, Sergei L.; Shafer, Robert W.
title: Coronavirus Resistance Database (CoV-RDB): SARS-CoV-2 susceptibility to monoclonal antibodies, convalescent plasma, and plasma from vaccinated persons
date: 2021-11-24
journal: bioRxiv
DOI: 10.1101/2021.11.24.469823
sha: 0626eb60184af6ea23f510d5af8caac391144d3f
doc_id: 946402
cord_uid: g5qh2dcp

As novel SARS-CoV-2 variants with different patterns of spike mutations have emerged, the susceptibility of these variants to neutralization by antibodies has been rapidly assessed. However, neutralization data are generated using different approaches and are scattered across different publications, making it difficult for these data to be located and synthesized. The Stanford Coronavirus Resistance Database (CoV-RDB; https://covdb.stanford.edu) is designed to house comprehensively curated published data on the neutralizing susceptibility of SARS-CoV-2 variants and spike mutations to monoclonal antibodies (mAbs), convalescent plasma (CP), and vaccinee plasma (VP). As of October 2021, CoV-RDB contains 186 publications including 64 (34%) containing 7,328 neutralizing mAb susceptibility results, 96 (52%) containing 11,390 neutralizing CP susceptibility results, and 125 (68%) containing 20,872 neutralizing VP results. The database also records which spike mutations are selected during in vitro passage of SARS-CoV-2 in the presence of mAbs and which emerge in persons receiving mAbs as treatment. The CoV-RDB interface interactively displays neutralizing susceptibility data at different levels of granularity by filtering and/or aggregating query results according to one or more experimental conditions.
The CoV-RDB website provides a companion sequence analysis program that outputs information about mutations present in a submitted sequence and that also assists users in determining the appropriate mutation-detection thresholds for identifying non-consensus amino acids. The most recent data underlying the CoV-RDB can be downloaded in their entirety from a GitHub repository in a documented machine-readable format.

[...] and specific epitope, i.e., the list of amino acids within 4.5 angstroms of the mAb paratope, according to structural data obtained from the Protein Data Bank (PDB).

Table 3 [...] Phase III 7K90 RBM II 4

Footnotes: (1) RBM (receptor binding motif) is the part of the receptor binding domain (RBD) that contains the spike ACE2-binding residues. RBM class 1 mAbs bind epitopes dominated by ACE2-binding residues and as a result bind solely when the RBD is in the up/open position. RBM class 2 mAbs have a smaller ACE2-binding footprint and can often bind the RBD in the down/closed position. RBD core mAbs target a surface-accessible part of the RBD that is separate from the RBM. (2) The publications and results include those describing both in vitro selection and neutralization experiments.

In CoV-RDB, CP are characterized by the sequence of the infecting variant, severity of illness, and time since infection. Table 5 lists the numbers of experimental results in CoV-RDB according to these characteristics and the SARS-CoV-2 variant or mutation(s) tested for neutralization. Overall, 11,390 neutralization experiments were performed using CP samples (96 studies), including 84 studies that provided data for individual samples and 12 studies that provided only aggregate data.

The mAb, CP, and VP susceptibility query result tables contain between 10 and 13 column headers (Fig 3C-3E).
The mAb table contains column headers indicating the reference and location of the data within each reference (e.g., figure or table); assay type (e.g., pseudotyped virus); mAb tested; variant tested; IC50 in ng/ml; and fold-reduced susceptibility compared with a control virus (that is also present in the table; Fig 3C).

The VP table contains column headers that indicate the reference and location of the data within the reference, type of assay used, vaccine received, number of immunizations, number of months since immunization, whether the sample was obtained from a vaccinated person with confirmed prior infection, variant tested, geometric mean neutralizing titer, and median fold reduction in titer compared with the control virus (Fig 3D and Fig 4C).

The CP table contains column headers that indicate the reference and location of the data within the reference, assay type, lineage of the virus that infected the person from whom the CP was obtained, number of months between infection and the plasma sample, variant tested, geometric mean neutralizing titer, and median fold reduction in titer compared with the control virus (Fig 3E).

Each table also contains two additional columns: "# Results" and "Data Availability". The number of results indicates the number of neutralizing experiments. The data availability column contains a spreadsheet icon if individual data are available or an "X" if data are only available in aggregate (i.e., as a geometric mean titer or a median fold reduction in susceptibility). Clicking on the spreadsheet icon in the "Data Availability" column copies the contents of that study to the user's clipboard. The "Download CSV" tab at the top of each [...]

The table dimension check boxes enable users to aggregate the data in the table by deselecting one or more dimensions.
For example, if the user deselects the "Reference", "Assay", and "Control" variant dimensions, the data are summarized with six rows that provide the median fold-reduction in susceptibility to BNT162b-associated VP for the Delta variant according to the number of vaccinations received, the time since vaccination, and whether the VP was obtained from a previously infected person (Fig 5B).

Fig 6A-6C show three parts of the output generated when either a FASTQ sequence or CodFreq file is submitted to the sequence analysis program: sequence summary (Fig 6A), sequence quality assessment (Fig 6B), and mutation list (Fig 6C). The mutation comments and neutralization susceptibility data, which are also included in the output, are not shown. The sequence summary section (Fig 6A) lists the genes that underwent sequencing, the median read depth, and the Pango lineage. This section also contains three dropdown boxes that help users select the appropriate thresholds for identifying sub-consensus mutations. The read depth and mutation detection thresholds set the minimum number and proportion of reads required for a mutation to be considered viral in origin rather than a sequencing or PCR artifact. The nucleotide mixture threshold allows users to select a threshold that minimizes the number of nucleotide ambiguities present in the sequence.

CoV-RDB can be downloaded in its entirety without restrictions. This is accomplished using a dual database pipeline that combines the full-fledged Postgres database system, which enforces relational data integrity, with the simplicity of the SQLite database system, which enables users to download and query the database without the overhead of accessing a host server. By making the database fully available to all users, we aim to encourage data sharing and the editing of the underlying CSV files by the authors of published studies.
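To make the benefit of the SQLite export concrete, the following minimal sketch shows how a locally downloaded database file could be queried without contacting a server. The table name `vp_results`, its columns, and every data value are hypothetical placeholders invented for illustration; they are not the actual CoV-RDB schema or results. An in-memory database stands in for the downloaded file so that the example is self-contained.

```python
import sqlite3

# NOTE: the schema and all rows below are illustrative assumptions,
# not the real CoV-RDB schema or real neutralization results.
def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for a downloaded SQLite file."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE vp_results (
               reference TEXT,
               vaccine TEXT,
               variant TEXT,
               num_doses INTEGER,
               fold_reduction REAL)"""
    )
    rows = [
        ("Study A", "BNT162b2", "Delta", 2, 2.5),
        ("Study B", "BNT162b2", "Delta", 2, 4.0),
        ("Study B", "BNT162b2", "Beta", 2, 8.0),
        ("Study C", "mRNA-1273", "Delta", 2, 3.0),
    ]
    conn.executemany("INSERT INTO vp_results VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def fold_reductions(conn: sqlite3.Connection, vaccine: str, variant: str):
    """Return (reference, fold_reduction) pairs for one vaccine and variant."""
    cur = conn.execute(
        "SELECT reference, fold_reduction FROM vp_results "
        "WHERE vaccine = ? AND variant = ? ORDER BY reference",
        (vaccine, variant),
    )
    return cur.fetchall()


conn = build_demo_db()
delta_rows = fold_reductions(conn, "BNT162b2", "Delta")
print(delta_rows)
```

With a real download, `sqlite3.connect(":memory:")` would be replaced by a connection to the downloaded file, and the query would need to be adapted to the documented schema.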
CoV-RDB neutralizing susceptibility query output can be considered multidimensional tables in which the rightmost columns contain numerical results (e.g., titers and fold-reductions in susceptibility) while the leftmost columns contain experimental conditions. The experimental conditions are explanatory variables that either directly influence neutralizing susceptibility (e.g., specific vaccine, time since vaccination, and SARS-CoV-2 variant) or have a more subtle effect on susceptibility (e.g., type of neutralizing assay and control virus). The CoV-RDB query interface enables users to explore the query results at different levels of granularity by filtering or aggregating them according to one or more experimental conditions without making additional calls to the web server.

The sequence analysis program shares many features with the Sierra HIV Drug Resistance [...]

[...] Variants / Mutations" dropdown box. The upper right summarizes the data returned by the query, which in this case includes 875 results from 12 publications (B). The summary distinguishes between results for which only aggregate (mean or median) data are provided and results for which individual data are provided. The section below the header shows the header and first few rows of the table entitled "Vaccinee Plasma Susceptibility Data [...]

Therefore, a database devoted to protective immunity should ideally contain both laboratory and epidemiological data. The main obstacle to expanding CoV-RDB to also include epidemiological studies of vaccine efficacy is that such studies are much more complex than those reporting in vitro neutralizing data. For example, vaccine efficacy data depend not just on the vaccine, the variant, and the time since vaccination but also on the study design and the age and immune status of the study population.
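The view of query output as a multidimensional table, with experimental conditions as dimensions and fold reductions as values, can be illustrated with a short sketch. It assumes that aggregation amounts to grouping rows by the dimensions that remain selected and taking the median of the row-level values; the actual interface may weight or combine results differently. The field names and all values are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical query result: the leftmost fields are experimental
# conditions (dimensions); "fold" is the numerical result.
# All values are invented for illustration.
ROWS = [
    {"reference": "Study A", "assay": "pseudovirus", "doses": 2, "months": 1, "fold": 2.0},
    {"reference": "Study B", "assay": "authentic virus", "doses": 2, "months": 1, "fold": 4.0},
    {"reference": "Study C", "assay": "pseudovirus", "doses": 2, "months": 6, "fold": 3.0},
    {"reference": "Study C", "assay": "pseudovirus", "doses": 2, "months": 1, "fold": 6.0},
]


def filter_rows(rows, **conditions):
    """Keep only rows whose fields match every given condition."""
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]


def aggregate(rows, keep, value="fold"):
    """Group rows by the dimensions in `keep`; return the median value per group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[dim] for dim in keep)
        groups[key].append(row[value])
    return {key: median(vals) for key, vals in groups.items()}


# Deselecting "reference" and "assay" leaves "doses" and "months".
summary = aggregate(ROWS, keep=("doses", "months"))
pseudovirus_only = aggregate(filter_rows(ROWS, assay="pseudovirus"), keep=("months",))
print(summary)
print(pseudovirus_only)
```

Because both operations work on rows already held in memory, this kind of exploration needs no further requests to a server, which matches the behavior described for the query interface.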
Moreover [...]

These assays appear to correlate strongly with plasmid and conventional neutralization test results, although they do not assess the effects of NTD-binding and other non-RBD antibodies that synergistically inhibit virus replication [19]. In addition, the use of an international external standard for calibration such as the one developed by the WHO will increase concordance across different assays and will be reported using international units rather than as an IC50 (for mAbs) or a plasma [...]

We have also added the comprehensive deep mutational scanning data published by the Bloom laboratory to the database [...]

Commercial total binding assays do not differentiate between binding and neutralizing antibodies. They also do not measure binding or neutralization of multiple variants but rather assess binding to pre-variant spike proteins. However, total binding assays often display moderately strong correlations with neutralizing assays [29,35,36], and activity against specific variants may eventually be assessed using variant-specific reagents. We may eventually add such data to CoV-RDB if they will provide insights that [...]

[...] non-human primates, hamsters, and mice), we have not included data from animal model challenge studies as such studies would require extensive modifications to our current database schema. Therefore, we will continue to monitor these studies and consider adding top-line data from these studies to alert database users to the existence of these studies without rigorously representing study details.
Finally, a similar approach will be considered for studies of vaccine efficacy [...]

The biological and clinical significance of emerging SARS-CoV-2 variants
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
Structural basis of receptor recognition by SARS-CoV-2
Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies
Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding [...]
Human immunodeficiency virus reverse transcriptase and protease sequence database
Analysis of unusual and signature APOBEC-mutations in HIV-1 pol next-generation sequences
Neutralizing antibody levels are highly predictive of immune protection from symptomatic SARS-CoV-2 infection
Evidence for antibody as a protective correlate for COVID-19 vaccines
Correlates of protection against SARS-CoV-2 in rhesus macaques
Antibodies to [...]
15. Li H. Minimap2: pairwise alignment for nucleotide sequences
[...] SARS-CoV-2) Variants Induced by Natural Infection or Vaccination: A Systematic Review and Pooled Analysis
Calibration of Two Validated SARS-CoV-2 Pseudovirus Neutralization Assays for COVID-19 Vaccine Evaluation
A SARS-CoV-2 variant elicits an antibody response with a shifted immunodominance hierarchy
Evaluation of Cell-Based and Surrogate SARS-CoV-2 Neutralization Assays
WHO International Standard for anti-SARS-CoV-2 immunoglobulin. The Lancet
Effect of Delta variant on viral burden and vaccine effectiveness against new SARS-CoV-2 infections in the UK
Efficacy of ChAdOx1 nCoV-19 (AZD1222) vaccine against SARS-CoV-2 lineages circulating in Brazil
Waning Immune Humoral Response to BNT162b2 Covid-19 Vaccine over 6 Months
Effectiveness of mRNA-1273 against Delta, Mu, and other emerging variants
Correlates of protection against symptomatic and asymptomatic SARS-CoV-2 infection
A simple protein-based surrogate neutralization assay for SARS-CoV-2
A SARS-CoV-2 surrogate virus neutralization test based on antibody-mediated blockage of ACE2-spike protein-protein interaction
Antibodies elicited by mRNA-1273 vaccination bind more broadly to the receptor binding domain than do those from SARS-CoV-2 infection
Mapping mutations to the SARS-CoV-2 RBD that escape binding by different classes of antibodies
Prevalence of neutralising antibodies against SARS-CoV-2 in acute infection and convalescence: A systematic review and meta-analysis

[...] Publications that appear to have data pertinent to SARS-CoV-2 variants and their susceptibility to mAbs, convalescent plasma (CP), and vaccinee plasma (VP) are downloaded to a Zotero reference database folder to enable full-text review and data curation. Extracted data are exported into a set of linked CSV files in an open-source GitHub repository [...]

Effects of spike receptor binding domain (RBD) mutations on monoclonal antibody (mAb) binding and neutralization. The X-axis shows the escape fraction as determined using a deep mutational scanning (DMS) assay that measures the binding of various mAbs to RBD variants produced in yeast. The Y-axis shows the results of in vitro neutralization using the same mAbs and spike proteins with the same RBD mutations as those used in the DMS assay [...]

[...] The table here shows the results from three codons (spike positions 500 to 502).
The observation that many codons shown in this (and other parts of the same file which are not shown) represent sequencing or PCR artifact [...]

S4 Fig. Functions of the SARS-CoV-2 sequence analysis program. The program supports three types of input: a list of spike mutations; [...] CoV-2 genome; and one or more FASTQ sequences. However, because a FASTQ sequence can take several minutes to analyze, users are advised to first convert FASTQ sequences to a codon frequency file through an auxiliary program. If a list of spike mutations is submitted, the program returns comments about notable mutations and summary tables reporting the susceptibility of viruses with these mutations to mAbs, CP, and VP. If a FASTA sequence is submitted, the program returns the preceding information plus a list of the SARS-CoV-2 genes, the amino acid mutations in the sequence, and the sequence's PANGO lineage. If a FASTQ sequence or codon frequency table is submitted, the program provides the preceding information and the read coverage for each position along the genome.
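The role of read-depth and mutation-detection thresholds in separating true sub-consensus mutations from sequencing or PCR artifacts can be sketched as follows. This is a simplified illustration, not the program's actual algorithm: the `CodonCount` record, the function name, the default threshold values, and all read counts are assumptions made for this example, and a real codon frequency file may be structured differently.

```python
from dataclasses import dataclass


@dataclass
class CodonCount:
    """One row of a hypothetical codon frequency table (spike gene)."""
    position: int     # amino acid position
    amino_acid: str   # amino acid encoded by the observed codon
    reads: int        # reads supporting this codon
    total: int        # read depth at this position


def call_mutations(counts, reference, min_depth=100, min_fraction=0.05):
    """Report non-reference amino acids that pass both thresholds.

    min_depth:    minimum read depth at the position (assumed default)
    min_fraction: minimum proportion of reads with the variant (assumed default)
    """
    calls = []
    for c in counts:
        ref_aa = reference.get(c.position)
        if ref_aa is None or c.amino_acid == ref_aa:
            continue  # unknown position or reference amino acid
        if c.total < min_depth:
            continue  # too few reads to trust any call at this position
        if c.reads / c.total >= min_fraction:
            calls.append(f"{ref_aa}{c.position}{c.amino_acid}")
    return calls


# Illustrative reference residues and invented read counts.
REFERENCE = {500: "T", 501: "N", 502: "G"}
COUNTS = [
    CodonCount(500, "T", 995, 1000),
    CodonCount(500, "A", 5, 1000),    # 0.5% of reads: below min_fraction
    CodonCount(501, "N", 50, 1000),
    CodonCount(501, "Y", 950, 1000),  # 95% of reads: called
    CodonCount(502, "G", 40, 50),
    CodonCount(502, "S", 10, 50),     # read depth 50: below min_depth
]

mutations = call_mutations(COUNTS, REFERENCE)
print(mutations)
```

Lowering `min_fraction` reports more low-frequency variants at the cost of admitting more artifacts, which is the trade-off the threshold dropdown boxes let users explore.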