key: cord-0817335-0umvwqbg
authors: dos Santos, M. C.; Sousa, E. C.; Ferreira, J. A.; Silva, S. P.; Souza, M. P.; Cardoso, J. F.; Silva, A. M.; Barbagelata, L. S.; Chagas, W. D.; Ferreira, J. L.; Souza, E. M.; Vilaca, P. L.; Alves, J. C.; Abreu, M. C.; Lobo, P. S.; Santos, F. S.; Lima, A. A.; Bragagnolo, C. M.; Soares, L. S.; Almeida, P. S.; Oliveira, D. S.; Amorim, C. K.; Costa, I. B.; Teixeira, D. M.; Penha, E. T.; Bezerra, D. A.; Siqueira, J. A.; Tavares, F. N.; Freitas, F. B.; Rodrigues, J. T.; Mazaro, J.; Costa, A. S.; Cavalcante, M. S.; Silva, M. S.; Silva, I. A.; Borges, G. A.; Lima, L. G.; Ferreira, H. L.; Livorati, M. T
title: MOLECULAR EPIDEMIOLOGY TO UNDERSTAND THE SARS-CoV-2 EMERGENCE IN THE BRAZILIAN AMAZON REGION
date: 2020-09-07
journal: nan
DOI: 10.1101/2020.09.04.20184523
sha: 5ca257513832a2b752f9d3da48fdd81fb60badc1
doc_id: 817335
cord_uid: 0umvwqbg

The COVID-19 pandemic has had a major public health impact in Brazil, as has been observed worldwide. Within Brazil, the Amazon Region contributed a large number of COVID-19 cases, especially at the beginning of the circulation of SARS-CoV-2 in the country. Thus, we describe the epidemiological profile of COVID-19 and the genetic diversity of SARS-CoV-2 strains circulating in the Amazon Region. We observed extensive spread of the virus in this region of Brazil. The data on sex, age and symptoms presented by the investigated individuals were similar to what has been observed worldwide. The genomic analysis of the viruses revealed important amino acid changes, including D614G in the Spike protein and I33T in the ORF6 protein; the latter was found in strains originating in Brazil. The phylogenetic analyses demonstrated the circulation of lineages B.1 and B.1.1, whose circulation in Brazil has been previously reported. Our data reveal the molecular epidemiology of SARS-CoV-2 in the Amazon Region.
These findings also reinforce the importance of continuous genomic surveillance of this virus, with the aim of providing accurate and updated data to understand and map the transmission network of this agent in order to support operational decisions in public health.

medRxiv preprint doi: https://doi.org/10.1101/2020.09.04.20184523; this version posted September 7, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

cases at the beginning of the pandemic in Brazil with high rates of occupancy in intensive care units (ICU) and deaths. Most persons with COVID-19 experience mild to moderate respiratory symptoms and recover 6. On the other hand, individuals with underlying medical conditions, such as cardiovascular disease, diabetes, chronic respiratory diseases and cancer, are more likely to become severely ill and possibly need intensive care 6,7.

In addition to epidemiological information, the SARS-CoV-2 genomic data, as well as evolution datasets to quantify the impact of non-pharmaceutical interventions
The phylogenetic analysis reveals that isolates from the present study cluster in
Among the alterations in the Spike protein, which plays a role in binding to the human ACE2 receptor and is also the main antigenic target, we found the D614G substitution, which is described as a factor that antigenically favors the virus, giving it a higher capacity for infection 33, and which has been used as a genetic marker for strains of the B-

of SARS-CoV-2 in the Amazon region. Thus, genomic surveillance must be carried out continuously to provide accurate, high-quality data for understanding where this virus emerged from and for mapping the transmission network to improve operational decisions in public health.

bootstraps was used as statistical support, using GTR as the nucleotide substitution model. The genomes obtained were compared to the reference strain (NC_045512) by an in-house Python script that compares each base of the entire genome and produces a mutation list.
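The base-by-base comparison against the reference strain described above can be illustrated with a short sketch. This is not the authors' in-house script, which is not reproduced in the text; it is a minimal example under stated assumptions. It assumes the sample genome has already been aligned to the reference so that both strings have the same length and each column corresponds to one reference position, and it assumes that ambiguous bases (`N`) and gaps (`-`) should be skipped rather than reported. The names `list_mutations` and `format_mutation` are hypothetical.

```python
from typing import List, Tuple

# (1-based position, reference base, sample base)
Mutation = Tuple[int, str, str]


def list_mutations(reference: str, sample: str, skip: str = "N-") -> List[Mutation]:
    """Compare two aligned sequences base by base and return the differences.

    Assumption: both sequences come from the same alignment, so they have
    equal length and column i refers to the same genome position in both.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    mutations: List[Mutation] = []
    pairs = zip(reference.upper(), sample.upper())
    for pos, (ref_base, alt_base) in enumerate(pairs, start=1):
        if ref_base == alt_base:
            continue
        # Assumption: ambiguous bases and gaps are not counted as mutations.
        if ref_base in skip or alt_base in skip:
            continue
        mutations.append((pos, ref_base, alt_base))
    return mutations


def format_mutation(mutation: Mutation) -> str:
    """Render a mutation in the common <ref><position><alt> form, e.g. A5G."""
    pos, ref_base, alt_base = mutation
    return f"{ref_base}{pos}{alt_base}"


if __name__ == "__main__":
    # Toy sequences for illustration only, not real SARS-CoV-2 data.
    toy_reference = "ACGTACGTAC"
    toy_sample = "ACGTGCGNAC"
    for mutation in list_mutations(toy_reference, toy_sample):
        print(format_mutation(mutation))
```

With real data, the two strings would be the NC_045512 row and one sample row taken from the same multiple sequence alignment. If the alignment introduces gaps into the reference, the reported positions are alignment columns and would need to be converted back to reference coordinates.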
The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Nat. Microbiol. 5, 536-544 (2020)
Clinical features of coronavirus disease in a cohort of patients with disability due to spinal cord injury
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
Nonstructural Proteins in the Pathogenesis of SARS-CoV-2
Tracking changes in SARS-CoV-2 Spike: evidence that D614G increases infectivity of the COVID-19 virus
Global initiative on sharing all influenza data - from vision to reality
SARS-CoV-2 mutations and where to find them: An in silico perspective of structural changes and antigenicity of the Spike protein
Genotyping coronavirus SARS-CoV-2: methods and implications
Phylogenetic analysis of the first four SARS-CoV-2 cases in
Genomic surveillance of SARS-CoV-2 reveals community transmission of a major lineage during the early pandemic phase in Brazil
Evolution and epidemic spread of SARS-CoV-2 in Brazil
Pattern of early human-to-human transmission of Wuhan
Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant
Comparative genetic analysis of the novel coronavirus (2019-nCoV/SARS-CoV-2) receptor ACE2 in different populations
R: A language and environment for statistical computing
Elegant Graphics for Data Analysis
Read Rectangular Text Data
M. fmsb: Functions for Medical Statistics Book with some
The Split-Apply-Combine Strategy for Data Analysis
Scale Functions for Visualization
viridis: Default Color Maps from 'matplotlib'
Bob Rudis. hrbrthemes: Additional Themes, Theme Components and Utilities for 'ggplot2'
Trimmomatic: A flexible trimmer for Illumina sequence data
FastQC: a quality control tool for high throughput sequence data
MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph
Ultrafast and memory-efficient alignment of short DNA sequences to the human genome
Fast and sensitive protein alignment using DIAMOND
Parallelization of the MAFFT multiple sequence alignment program
RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies