key: cord-0942115-f2i0r6rg
authors: Van Campenhout, Claude; De Mendonça, Ricardo; Alexiou, Barbara; De Clercq, Sarah; Racu, Marie-Lucie; Royer-Chardon, Claire; Rusu, Stefan; Van Eycken, Marie; Artesi, Maria; Durkin, Keith; Mardulyn, Patrick; Bours, Vincent; Decaestecker, Christine; Remmelink, Myriam; Salmon, Isabelle; D’Haene, Nicky
title: SARS-CoV-2 genome sequencing from post-mortem formalin-fixed, paraffin-embedded lung tissues
date: 2021-06-18
journal: J Mol Diagn
DOI: 10.1016/j.jmoldx.2021.05.016
sha: 331df5176f1c5fd4ebbbdb293cc19450b4df26b0
doc_id: 942115
cord_uid: f2i0r6rg

Implementation of SARS-CoV-2 testing in the daily practice of pathology laboratories requires adapting procedures to formalin-fixed and paraffin-embedded (FFPE) samples. So far, one study has reported the feasibility of SARS-CoV-2 genome sequencing on FFPE tissues, with only one contributory case out of two. The present study aimed to optimize SARS-CoV-2 genome sequencing using the Ion AmpliSeq SARS-CoV-2 Panel on 22 FFPE lung tissues from 16 deceased COVID-19 patients. SARS-CoV-2 was detected in all FFPE blocks using real-time RT-qPCR targeting the E gene, with crossing point (Cp) values ranging from 16.02 to 34.16. Sequencing was considered contributory (i.e. with a uniformity >55%) for 17 FFPE blocks. Adapting the number of target amplification PCR cycles to the RT-qPCR Cp value optimized the sequencing quality for the contributory blocks: 20 PCR cycles for blocks with a Cp value <28 and 25 PCR cycles for blocks with a Cp value between 28 and 30. The majority of blocks with a Cp value >30 were non-contributory. Comparison of matched frozen and FFPE tissues revealed discordance for only three FFPE blocks, all with a Cp value >28. Variant identification and clade classification were possible for 13 patients.
The present study validates SARS-CoV-2 genome sequencing on FFPE blocks and opens the possibility of exploring correlations between virus genotype and histopathological lesions.

Virus sequencing can be achieved by Sanger sequencing [10, 11] and/or next-generation sequencing (NGS) [10, 12]. NGS is now well established in pathology laboratories for the detection of cancer-related molecular alterations using FFPE tissues [13, 14]. Targeted NGS panels allow tumor molecular profiles to be identified from small quantities of nucleic acids extracted from FFPE blocks. However, only a few studies have reported the use of NGS to detect pathogens in FFPE blocks [7, 15-17]. Sekulic et al. showed the feasibility of SARS-CoV-2 sequencing on FFPE blocks, but only one case out of two was contributory [7]. The present study aimed to optimize SARS-CoV-2 genome sequencing using NGS on 22 post-mortem FFPE tissues.

Lung samples were collected from the first 16 patients with confirmed COVID-19 (positive RT-qPCR assay on nasopharyngeal swab and/or BAL) who died at Hôpital Erasme (Brussels, Belgium) from March 13, 2020 onward and who had a positive SARS-CoV-2 E gene RT-qPCR on lung FFPE blocks (see below). The study protocol was approved by the local ethics committee (P2020/218

The detection of SARS-CoV-2 in the nucleic acid extracts was performed by RT-qPCR. A one-step RT-qPCR assay specific for amplification of the SARS-CoV-2 E gene was adapted from the protocol described by Corman et al. [18] 1:1000 was used as a positive control, and a clinical sample obtained from a patient autopsied before the pandemic was used as a negative control in each analysis.

For library construction, 10 ng of RNA (5 ng and 1 ng for testing robustness) were reverse-transcribed

Sequencing was performed using the S5 Gene Studio instrument (ThermoFisher Scientific). SARS-CoV-2 whole-genome sequencing using Oxford Nanopore technology was performed as previously described [19].
Raw sequencing data were analyzed using Torrent Suite software (v5.12; ThermoFisher Scientific). Sequencing metrics were analyzed using the coverage analysis plug-in. For fresh samples, the manufacturer (ThermoFisher Scientific) recommends obtaining 1M reads per sample and reports a uniformity >85%. The following sequencing quality classification was used: optimal if the number of mapped reads was >1,000,000 and the uniformity was >90%; suboptimal if the number of mapped reads was between 500,000 and 1,000,000 and/or the uniformity was between 80% and 90%; poor if the number of mapped reads was <500,000 and/or the uniformity was between 55% and 80%; and non-contributory if the uniformity was <55%. The sequencing fragments were assembled using the Iterative Refinement Meta-Assembler (IRMA) [20]. Alignment to the SARS-CoV-2 genome reference and variant detection were performed using the

In order to select the optimal library preparation protocol, uniformities, numbers of mapped reads and coverages were analyzed for each block and considered as independent. To evaluate the sequencing performance under the selected PCR condition, the numbers of variants (total and with an allele frequency (AF) >90%) were also analyzed for each block and considered as independent. The Mann-Whitney test was applied to compare two independent groups of ranked data. The Friedman test was applied to compare multiple dependent groups. Spearman correlation analysis was used to analyze the relationship between the RT-qPCR Cp values and uniformities. Statistical analyses were performed using Statistica 7.1 (Statsoft, Tulsa, OK, USA).

Table S1). Globally, no significant differences were observed in terms of sequencing metrics (number of mapped reads and coverage) with increased numbers of PCR cycles.
Only the uniformity appeared higher at 20 PCR cycles (median: 95.72%) than at 25 and 30 PCR cycles (medians: 92% and 88%, respectively; Friedman test: p = 0.04) (Supplemental Table S1). Next, the analyses were refined according to the RT-qPCR Cp values (Figure 1). and 30C), as confirmed by the negative Spearman correlations (Figure 2). In particular, of the seven blocks with an RT-qPCR Cp value >30, five showed a uniformity lower than 55% under all the tested conditions. These five blocks were considered non-contributory; 17 blocks were thus considered contributory. These 17 contributory blocks came from 13 patients (including four patients with two blocks tested).

For the 17 contributory blocks, the aim was to establish the best PCR condition for sequencing performance and variant analyses. Among the sequencing quality criteria, uniformity was selected as the most important factor because it reflects the homogeneity of the coverage distribution. The PCR condition with the highest uniformity was thus selected. If several conditions had similar uniformities (±3%), the condition with the highest number of mapped reads was selected. If several conditions had similar uniformities (±3%) and similar numbers of mapped reads (±20%), the condition with the fewest PCR cycles was preferred (Supplemental Table S1). This led to the selection of 20 PCR cycles for blocks with an RT-qPCR Cp value <28 and 25 PCR cycles for blocks with an RT-qPCR Cp value between 28 and 30. It was not possible to establish rules for blocks with an RT-qPCR Cp value >30, the majority of them being non-contributory (Supplemental Table S1). After adapting the number of target amplification PCR cycles to the RT-qPCR Cp values for the 17 contributory blocks, the median number of mapped reads was 1,642,150 (min-max: 305,249-2,094,563) and the median uniformity was 95.9% (min-max: 81-98%).
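The quality classes and the cycle-selection rules described above can be summarized in a short sketch. It is only an illustration under stated assumptions, not the authors' analysis code: the function and class names are hypothetical, the handling of values falling exactly on a threshold (55%, 80%, 90%, 500,000 or 1,000,000 reads, Cp 28 or 30) is not specified in the text and is chosen arbitrarily here, "±3%" is read as percentage points of uniformity, and "±20%" is read as relative to the highest number of mapped reads.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    """Metrics for one block at one PCR condition (hypothetical container)."""
    pcr_cycles: int
    mapped_reads: int
    uniformity: float  # percent


def classify_quality(mapped_reads: int, uniformity: float) -> str:
    """Quality class from the thresholds in the text (boundary handling assumed)."""
    if uniformity < 55:
        return "non-contributory"
    if mapped_reads < 500_000 or uniformity < 80:
        return "poor"
    if mapped_reads > 1_000_000 and uniformity > 90:
        return "optimal"
    return "suboptimal"


def cycles_for_cp(cp: float) -> Optional[int]:
    """Target amplification cycles chosen from the RT-qPCR Cp value."""
    if cp < 28:
        return 20
    if cp <= 30:  # assumption: 28 and 30 both fall in the 25-cycle group
        return 25
    return None  # no rule could be established for Cp > 30


def select_condition(runs: List[Run]) -> Run:
    """Pick the preferred PCR condition among the runs of one block."""
    # 1. Keep conditions whose uniformity is within 3 points of the best one.
    best_uniformity = max(r.uniformity for r in runs)
    similar_u = [r for r in runs if best_uniformity - r.uniformity <= 3.0]
    # 2. Among those, keep conditions within 20% of the highest mapped reads.
    best_reads = max(r.mapped_reads for r in similar_u)
    similar_reads = [r for r in similar_u if r.mapped_reads >= 0.8 * best_reads]
    # 3. Prefer the fewest PCR cycles among the remaining conditions.
    return min(similar_reads, key=lambda r: r.pcr_cycles)


# Illustrative values only; these are not measurements from the study.
example_runs = [
    Run(pcr_cycles=20, mapped_reads=1_500_000, uniformity=95.0),
    Run(pcr_cycles=25, mapped_reads=1_600_000, uniformity=93.5),
    Run(pcr_cycles=30, mapped_reads=1_000_000, uniformity=88.0),
]
chosen = select_condition(example_runs)
print(chosen.pcr_cycles, classify_quality(chosen.mapped_reads, chosen.uniformity))
```

In this reading, uniformity acts as the first filter, the number of mapped reads as the second, and the number of cycles only breaks ties, which mirrors the order of priority given in the text.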
As shown in Table 1, the sequencing quality was considered optimal for ten blocks (see Material

Moreover, increasing the number of PCR cycles led to a higher number of variants with an AF <0.9, which can reflect sequencing artifacts.

The presence of hemorrhage and/or necrosis in the 22 FFPE blocks was evaluated in order to determine whether histological features can affect sequencing performance and quality (Supplemental Table S4). Hemorrhage was observed in eight blocks and necrosis in four blocks. The sequencing quality was more often optimal when neither hemorrhage nor necrosis was present (7/11 blocks with optimal sequencing) than when hemorrhage and/or necrosis was present (3/11 blocks with optimal sequencing).

To examine the impact of formalin fixation (a well-known cause of RNA damage-induced changes and sequencing artifacts), the same library preparation (with the number of PCR amplification cycles adapted to the RT-qPCR Cp values) and sequencing protocols were used on matched frozen tissues. Using this method on the 22 frozen tissues, sequencing quality was considered optimal for 11 (median number of mapped reads of 1,549,686, median coverage of 9,971 and median uniformity of 97%). Sequencing quality was considered suboptimal for seven frozen tissues (median number of mapped reads of 1,256,654, median coverage of 5,375 and median uniformity of 90%). Finally, sequencing quality was considered poor for three frozen tissues and non-contributory for one (Table 1). For the six suboptimal FFPE blocks, sequencing of the matched frozen tissues was optimal for two, suboptimal for two and poor for the remaining two. For block 9-1, whose sequencing quality was poor from FFPE, the quality was suboptimal from frozen tissue.
The five FFPE blocks categorized as non-contributory showed various results when the frozen tissue was sequenced: one optimal, two suboptimal, one poor and one non-contributory.

For all optimal FFPE blocks (ten out of ten), the same variants with an AF >90% were detected in the matched frozen tissues. For the six suboptimal FFPE blocks, comparison of the variant caller plug-in results between FFPE and matched frozen tissue revealed additional variants for four FFPE blocks (5-1, 9-2, 12-2, 16), a missing variant for one FFPE block (12-1) and the same profile for one (8-1) (Table 1). However, IGV verification showed that the profile was concordant between FFPE and matched frozen tissue for blocks 5-1 and 12-1. In summary, the comparison of matched frozen and FFPE tissues identified three discordant blocks (9-2, 12-2, 16), with additional variants in the FFPE blocks that were absent from the matched frozen tissue. All three FFPE blocks were characterized by suboptimal sequencing quality and by an RT-qPCR Cp value >28.

Regarding the four patients with two different lung lobes tested, two patients (8 and 11) presented the same variant profile in both lobes. Discordances were observed between lobes for patients 9 and 12, but comparison with matched frozen tissues revealed that the additional variants observed in one lobe were related to sequencing artifacts.

Regarding recurrent variants reported in the literature, all patients harbored the C241T, C3037T, C14408T and A23403G nucleotide variants (Figure 4 and Supplemental Appendix S1). Distinct variant profiles were identified across the patients (Tables 2-3). According to the GISAID definitions (see Material and Methods), clade G was assigned to seven patients, clade GR to four and clade GH to two. It should be noted that for four patients (9, 11, 12 and 16), some genomic positions could not be assessed because the AF was around 40-60%.
Using the Nextstrain classification, eight patients were classified as clade 20A (due to the C14408T and A23403G variants), one as clade 20B (due to the

To confirm the variants identified using the Ion Torrent sequencing platform, the SARS-CoV-2 genomes from the frozen tissues matching the 17 contributory FFPE blocks were also sequenced using Oxford Nanopore technology. Sequences were obtained for all tissues except one (block 2). Variants reported in Supplemental Table S2 could be confirmed using this third-generation sequencing platform.

To evaluate the robustness of the SARS-CoV-2 genotyping on FFPE blocks, the technique was

Key nucleotides for GISAID clade classification are indicated. Sequences for block 6 are not included in the alignment as they are much shorter than the others and do not align sufficiently well to the other sequences to give useful information.

Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet
A new coronavirus associated with human respiratory disease in China
Data, disease and diplomacy: GISAID's innovative contribution to global health
Unspecific post-mortem findings despite multiorgan viral spread in COVID-19 patients
Pathological study of the 2019 novel coronavirus disease (COVID-19) through postmortem core biopsies
Molecular Detection of SARS-CoV-2 Infection in FFPE Samples and Histopathologic Findings in Fatal SARS-CoV-2 Cases
Autopsy Findings and Venous Thromboembolism in Patients With COVID-19
Postmortem examination of COVID19 patients reveals diffuse alveolar damage with severe capillary congestion and variegated findings of lungs and other organs suggesting vascular dysfunction
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
Isolation and Full-Length Genome Characterization of SARS-CoV-2 from COVID-19 Cases in Northern Italy
Clinical Validation of Targeted Next Generation Sequencing for Colon and Lung Cancers
Design and Validation of a Gene-Targeted, Next-Generation Sequencing Panel for Routine Diagnosis in Gliomas. Cancers (Basel)
Virus characterization and discovery in formalin-fixed paraffin-embedded tissues
Hybrid capture and next-generation sequencing identify viral integration sites from formalin-fixed, paraffin-embedded tissue
High-throughput RNA sequencing of a formalin-fixed, paraffin-embedded autopsy lung tissue sample from the 1918 influenza pandemic
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
A Recurrent Mutation at Position 26340 of SARS-CoV-2 Is Associated with Failure of the E Gene Quantitative Reverse Transcription-PCR Utilized in a Commercial Dual-Target Diagnostic Assay
Viral deep sequencing needs an adaptive approach: IRMA, the iterative refinement meta-assembler
Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant.
Version 2 Genotyping coronavirus SARS-CoV-2: methods and implications
On Behalf Of Iss Covid-Study Group: Whole genome and phylogenetic analysis of two SARS-CoV-2 strains isolated in Italy in
The establishment of reference sequence for SARS-CoV-2 and variation analysis
Integrative genomics viewer
MUSCLE: a multiple sequence alignment method with reduced time and space complexity
Nextstrain: real-time tracking of pathogen evolution
Nextstrain build for novel coronavirus SARS-CoV
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
Evidence for camel-to-human transmission of MERS coronavirus
RT-PCR for SARS-CoV-2: quantitative versus qualitative
Population genomics of intrapatient HIV-1 evolution
Sources of PCR-induced distortions in high-throughput sequencing data sets
Intra- and interpatient evolution of enterovirus D68 analyzed by whole-genome deep sequencing
Direct RNA sequencing on nanopore arrays redefines the transcriptional complexity of a viral pathogen
Direct RNA nanopore sequencing of full-length coronavirus genomes provides novel insights into structural variants and enables modification analysis
Koopmans M; Dutch-Covid-19 response team. Rapid SARS-CoV-2 whole-genome sequencing and analysis for informed public health decision-making in the Netherlands
Same-day genomic and epigenomic diagnosis of brain tumors using real-time nanopore sequencing
Guidelines for Validation of Next-Generation Sequencing-Based Oncology Panels: A Joint Consensus Recommendation of the Association for Molecular Pathology and College of American Pathologists