key: cord-0743725-vvg07kei authors: Pancer, Katarzyna; Milewska, Aleksandra; Owczarek, Katarzyna; Dabrowska, Agnieszka; Branicki, Wojciech; Sanak, Marek; Pyrc, Krzysztof title: The SARS-CoV-2 ORF10 is not essential in vitro or in vivo in humans date: 2020-08-29 journal: bioRxiv DOI: 10.1101/2020.08.29.257360 sha: 67dadaf40029deaa8ba9bf93cad35a95b34a8117 doc_id: 743725 cord_uid: vvg07kei
SARS-CoV-2 genome annotation revealed the presence of 10 open reading frames (ORFs), of which the last one (ORF10) is positioned downstream of the N gene. It is a hypothetical gene, which was speculated to encode a 38 aa protein. This hypothetical protein does not share sequence similarity with any other known protein and cannot be associated with a function. While a role for ORF10 has been proposed, there is growing evidence that ORF10 is not a coding region. Here, we identified SARS-CoV-2 variants in which the ORF10 gene was prematurely terminated. The disease was not attenuated, and the transmissibility between humans was not hampered. In vitro, too, these strains replicated similarly to related viruses with intact ORF10. Altogether, based on clinical observation and laboratory analyses, it appears that the ORF10 protein is not essential in humans. This observation further proves that ORF10 should not be treated as a protein-coding gene, and the genome annotations should be amended.
Coronaviruses are mammalian and avian RNA viruses with large genomes of ~30,000 bases, which encode proteins that are required for virus replication, modulate the immune responses, and form the scaffold of progeny virions 1. The spatial distribution of the open reading frames (ORFs) is similar across the taxa. The 1a/1ab ORF starts near the 5' terminus and is the only ORF that may be translated directly from the genomic RNA, giving rise to the non-structural proteins that re-shape the cellular microenvironment and initiate the replication process.
Downstream lie a number of ORFs encoding the structural proteins (HE, S, M, E, N), interspersed with genes encoding accessory proteins that vary in number and position 1. SARS-CoV-2 genome annotation revealed the presence of 10 ORFs, of which the last one (ORF10) is positioned downstream of the N gene 2. It is a hypothetical, 117 nt-long ORF, which was speculated to encode a 38 aa protein 2, 3. Bioinformatic analyses revealed that this hypothetical protein does not share sequence similarity with any other known protein, and the predicted structure cannot be associated with a function. Nonetheless, it was speculated that the ORF10 protein may play a role in the immunogenicity of SARS-CoV-2 or may modulate its virulence. On the other hand, there is growing evidence that ORF10 is not a coding region. Jungreis et al. analyzed the region in different Sarbecoviruses and found that ORF10 is intact only in a minority of cases, in the closest relatives of SARS-CoV-2. The evidence for the presence of subgenomic mRNAs corresponding to ORF10 is limited 4,5. Here, we identified two patients infected with SARS-CoV-2 in which the ORF10 gene was prematurely terminated with a stop codon. The disease was not attenuated, and the transmissibility was not hampered. Isolation of these viruses in cell culture showed that, also in vitro, these strains replicated similarly to related viruses with intact ORF10. Altogether, based on clinical observation and laboratory analyses, it appears that the ORF10 protein is not essential in humans.
A viral DNA/RNA kit (A&A Biotechnology, Poland) was used for nucleic acid isolation from cell culture supernatants. RNA was isolated according to the manufacturer's instructions. cDNA samples were prepared with a high-capacity cDNA reverse transcription kit (Thermo Fisher Scientific, Poland), according to the manufacturer's instructions.
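Viral copy numbers in this study were derived from serially diluted N-gene standards. The following is a minimal sketch of how such a standard curve is commonly fitted and inverted; it is not the authors' analysis pipeline, which is not described in the text. The dilution series, the Cq values, and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_standard_curve(copies, cq):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    slope, intercept = np.polyfit(np.log10(copies), cq, 1)
    return slope, intercept


def copies_from_cq(cq, slope, intercept):
    """Invert the fitted curve to estimate copy number from Cq values."""
    return 10 ** ((np.asarray(cq, dtype=float) - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series (copies per reaction) and Cq readings.
standard_copies = np.array([1e7, 1e6, 1e5, 1e4, 1e3, 1e2])
standard_cq = np.array([14.1, 17.5, 20.8, 24.2, 27.5, 30.9])

slope, intercept = fit_standard_curve(standard_copies, standard_cq)
efficiency = amplification_efficiency(slope)

# Hypothetical unknown sample with Cq = 22.0.
sample_copies = copies_from_cq([22.0], slope, intercept)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, efficiency={efficiency:.2f}")
print(f"estimated copies for Cq 22.0: {sample_copies[0]:.3g}")
```

A slope close to -3.32 corresponds to near-perfect doubling per cycle, which is why the efficiency is usually reported alongside the curve.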
Viral RNA was quantified using quantitative PCR (qPCR; CFX96 Touch real-time PCR). To assess the copy number for the N gene, standards were prepared and serially diluted.
The first SARS-CoV-2 infected patient was identified in Poland on the 4th of March 2020; subsequent monitoring of the genetic drift of the virus was then initiated, allowing for the characterization of circulating viruses. The phylogenetic analysis led to the conclusion that the diversity of the virus is similar to that observed worldwide 6. Based on the collected data, one may safely assume that the virus with the disrupted ORF10 was infectious and pathogenic in humans. The identical change in two patients proves that it did not result from intra-patient genetic drift and that the virus transmissibility was not affected. To further characterize the phenotype of the virus, available clinical samples were overlaid on fully confluent Vero E6 cells. At the same time, parallel cultures were inoculated with the closely related PL_P31 and PL_P38 isolates (see Figure 1). In all four cases, a characteristic cytopathic effect (CPE) appeared 72 h post-inoculation. The media samples were collected daily, and total RNA was isolated. RT-qPCR was carried out, and the virus yields are presented in Figure 2. No difference in replication dynamics was observed between strains carrying the nonsense mutation in ORF10 and strains with intact ORF10. In conclusion, results obtained from the cell culture, sequencing, and clinical data show that the stop codon positioned at two-thirds of the protein did not affect virus fitness. This observation further supports the thesis that ORF10 should not be treated as a protein-coding gene, and the genome annotations should be altered 4. On the other hand, ORF10 is relatively conserved, suggesting the importance of this region, e.g., due to secondary RNA structures.
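The effect of a premature stop codon on a 117 nt ORF can be illustrated with a short sketch. The sequence below is a toy placeholder, not the real ORF10 sequence, and the mutated codon index (25) is a hypothetical choice that truncates the product at roughly two-thirds of its length; the exact position of the mutation is not given in this text. All function names are likewise illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete in-frame codons, starting at position 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop_index(seq):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


def encoded_length(seq):
    """Number of amino acids encoded before the first in-frame stop codon."""
    index = first_stop_index(seq)
    return len(codons(seq)) if index is None else index


def introduce_stop(seq, codon_index, stop="TAA"):
    """Replace the codon at codon_index with a stop codon (nonsense mutation)."""
    start = 3 * codon_index
    return seq[:start] + stop + seq[start + 3:]


# Toy 117 nt ORF (NOT the real ORF10): start codon, 37 sense codons, stop codon.
toy_orf = "ATG" + "GCT" * 37 + "TAG"

# Hypothetical nonsense mutation at codon index 25.
mutant_orf = introduce_stop(toy_orf, 25)

intact_aa = encoded_length(toy_orf)
mutant_aa = encoded_length(mutant_orf)

print(f"ORF length: {len(toy_orf)} nt, {len(codons(toy_orf))} codons")
print(f"intact product: {intact_aa} aa, truncated product: {mutant_aa} aa")
print(f"fraction retained: {mutant_aa / intact_aa:.2f}")
```

The arithmetic matches the annotation quoted in the introduction: 117 nt gives 39 codons, of which 38 are sense codons and the last is the stop.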
A new coronavirus associated with human respiratory disease in China
The coding capacity of SARS-CoV-2. biorxiv repository
The Architecture of SARS-CoV-2 Transcriptome
Characterisation of the transcriptome and proteome of SARS-CoV-2 using direct RNA sequencing and tandem mass spectrometry reveals evidence for a cell passage induced in-frame deletion in the spike glycoprotein that removes the furin-like cleavage site
Geographic and temporal distribution of SARS-CoV-2 clades in the WHO European Region from
APOBEC3-mediated restriction of RNA virus replication
Genome structure and transcriptional regulation of human coronavirus NL63
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
Nextstrain: real-time tracking of pathogen evolution
Maximum-likelihood phylodynamic analysis
The funders had no role in study design, data collection, and analysis, decision to publish, or preparation of the manuscript. The authors declare no competing financial interests.