key: cord-0857508-l4qqt4hx authors: Bal, A.; Simon, B.; Destras, G.; Chalvignac, R.; Semanas, Q.; Oblette, A.; Queromes, G.; Fanget, R.; Regue, H.; Morfin, F.; Valette, M.; Lina, B.; Josset, L. title: Detection and prevalence of SARS-CoV-2 co-infections during the Omicron variant circulation, France, December 2021 - February 2022 date: 2022-03-27 journal: nan DOI: 10.1101/2022.03.24.22272871 sha: 1bba7d7f60671d72d8a3f5bf642704e3a91752a5 doc_id: 857508 cord_uid: l4qqt4hx

From December 2021 to February 2022, an intense and unprecedented co-circulation of SARS-CoV-2 variants with high genetic diversity raised the question of possible co-infections between variants and how to detect them. Using 11 mixes of Delta:Omicron isolates at different ratios, we evaluated the performance of 4 different sets of primers used for whole-genome sequencing and we developed an unbiased bioinformatics method which can detect all co-infections irrespective of the SARS-CoV-2 lineages involved. Applying this method to 21,387 samples collected between weeks 49-2021 and 08-2022 through random genomic surveillance in France, we detected 53 co-infections between different lineages. The prevalences of Delta/Omicron (BA.1) co-infections and of co-infections between Omicron lineages BA.1 and BA.2 were estimated at 0.18% and 0.26%, respectively. Among 6,242 hospitalized patients, the intensive care unit (ICU) admission rates were 1.64%, 4.81% and 15.38% in Omicron, Delta and Delta/Omicron patients, respectively. No BA.1/BA.2 co-infections were reported among ICU-admitted patients. Although SARS-CoV-2 co-infections were rare in this study, their proper detection is crucial to evaluate their clinical impact and the risk of the emergence of potential recombinants.

Since the first SARS-CoV-2 genome was published in January 2020, five variants of concern (VOC), characterized by increased transmissibility and/or immune escape capacity, have circulated worldwide [1].
Omicron, the most recent VOC to date, was first detected in South Africa [2] in November 2021 and displaced the previously circulating Delta VOC by February 2022. While the Delta and Omicron variants share a few mutations along their genomes, Omicron is characterized by a large number of specific mutations, especially in the S gene [3]. In France, the fifth wave of the SARS-CoV-2 epidemic was characterized by a sustained co-circulation of the Delta and Omicron (lineage BA.1) variants from November 2021 to January 2022. Lineage BA.2 was first detected in France in late December 2021 and its proportion has since increased linearly [4]. Thus, the unprecedented sustained co-circulation of genetically divergent lineages observed from November 2021 to February 2022 may have favored co-infections, with a risk of subsequent recombination events. A few cases of Delta and Omicron variant co-infections have been reported recently, but the prevalence of SARS-CoV-2 co-infections, including BA.1/BA.2 co-infections, has not been systematically assessed on a large data set [5] [6] [7]. As Omicron and Delta have been associated with distinct specific variants [10, 11]; ii) possible sample contamination during the sequencing process, which requires independent validation on duplicate extracts [12, 13]; and iii) the lack of unbiased and validated bioinformatics methods able to systematically detect co-infections. Previously published reports on SARS-CoV-2 Delta/Omicron co-infections were based on manually curated lists of divergent mutations and visual examination of their relative frequencies along the genome [5] [6] [7]. De novo assembly methods have also been described to assemble the different viral genomes present in one sample [14], but they are computationally intensive. Herein, we used different mixed ratios of Delta and Omicron cell culture isolates in order to assess the performance of four different primer sets for the detection of Delta/Omicron co-infections.
A co-infection score was determined to flag probable co-infections. The prevalences of Delta/Omicron and BA.1/BA.2 co-infections were then estimated on a large data set of sequences obtained from random surveillance of out-patients and systematic sequencing of hospitalized patients.

The Delta and Omicron variants were isolated in cell culture from nasopharyngeal swabs (NPS). Following interim biosafety guidelines established by WHO, NPS were inoculated on confluent Vero E6 TMPRSS2 cells with DMEM supplemented with 2% penicillin-streptomycin, 1% L-glutamine, 2% G418 and 2% inactivated fetal bovine serum. Plates were incubated at 37 °C with 5% CO2 for 48 h. The cytopathic effects were monitored daily; samples were harvested when positive. Viral isolates were quantified using RT-PCR [15] and sequenced to confirm the lineage and the absence of low-frequency diversity. The Delta and Omicron isolates were then diluted to reach similar viral loads (Ct = 19) and mixed using different Delta:Omicron ratios: 0:100, 10:90, 20:80, 30:70, 40:60, 50:50, 60:40, 70:30, 80:20, 90:10 and 100:0. After nucleic acid extraction was performed in duplicate, all RNA extracts were diluted ten-fold and stored in several aliquots under the same conditions (frozen at -80 °C). Thus, all extracts were subjected to one freeze-thaw cycle for all sequencing methods.

The routine SARS-CoV-2 sequencing protocol in our laboratory is based on COVIDSeq-Test™ (Illumina, San Diego, USA) using Artic V4 or V4.1 primers as they became available.
The samples sequenced at the National Reference Center (NRC) of Respiratory Viruses of Hospices Civils de Lyon (HCL) and selected for this study were i) samples from systematic sequencing of hospitalized patients in the Lyon area (university hospital of Lyon, HCL) and from HCL healthcare workers; and ii) samples from random sequencing performed during the weekly Flash survey conducted by the EMERGEN consortium (French consortium for the genomic surveillance of emerging pathogens). The Flash surveys are nationwide surveys in which all private and public diagnostic laboratories in France are asked to provide the NRC and other sequencing centers with a fraction of the positive samples from one day per week; this fraction ranges from 25% to 100% according to the number of positive cases detected at the national level [4, 16]. The prevalence of SARS-CoV-2 co-infections was estimated on both HCL samples and Flash samples sequenced by the NRC of HCL. To assess the clinical presentations of co-infected patients, three groups were selected: out-patients of Flash surveys, hospitalized patients of Flash surveys and HCL, and healthcare workers of HCL, excluding follow-up samples.

Reads were processed using the in-house bioinformatic pipeline seqmet (available at https://github.com/genepii/seqmet, software versions provided in Table S1). Paired reads were trimmed with cutadapt to remove sequencing adapters and low-quality ends, keeping only reads longer than 30 bp. Alignment to the SARS-CoV-2 reference genome MN908947 was performed with minimap2. Mapped reads were processed to remove duplicates tagged by picard, then realigned with abra2 to improve indel detection sensitivity and finally clipped with samtools ampliconclip to remove read ends containing primer sequences. Variants present at frequencies of 5% or above were called using freebayes, then decomposed and normalized with vt and filtered with bcftools to eliminate false positives.
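The 5% frequency threshold applied at the variant-calling step can be illustrated with a short Python sketch. It is not part of seqmet, where this step is handled by freebayes, vt and bcftools as described above; the record layout and the names `Call`, `allele_frequency` and `filter_calls` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Call:
    """Simplified variant record (hypothetical layout, not a VCF parser)."""
    pos: int        # 1-based position on the reference genome
    change: str     # e.g. "C>T"
    alt_reads: int  # reads supporting the alternate allele
    depth: int      # total reads covering the position


def allele_frequency(call: Call) -> float:
    """Fraction of reads supporting the alternate allele (0.0 if uncovered)."""
    return call.alt_reads / call.depth if call.depth > 0 else 0.0


def filter_calls(calls: List[Call], min_af: float = 0.05) -> List[Call]:
    """Keep calls whose allele frequency is at or above `min_af`."""
    return [c for c in calls if allele_frequency(c) >= min_af]
```

A low threshold such as 5% matters for co-infection detection because variants of a secondary lineage are, by definition, present as minor variants.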
To detect co-infections, the resulting vcf files were compared to a lineage variant database using a detection script, both developed internally for this purpose. The database consists of vcf files listing the variants found in 50% or more of 100 randomly selected sequences for a given pangolin lineage in the full GISAID dataset available (extracted on 02 February 2022). The database is available at https://github.com/genepii/seqmet. For each lineage, the co-infection detection script searches for any major or minor variant matching the expected variants of the putative main lineage and then searches for any minor variant matching any other lineage, excluding variants in common with the main lineage. This approach provides the putative main and secondary lineages contained in a sequenced sample, along with a ratio of observed to expected variants in each case. Variants were considered expected when their position was covered by at least 100 reads. Secondary lineages, considered indicative of a putative co-infection, required at least 6 lineage-specific variants to be called. For Delta:Omicron mixes, vcf files were also analyzed using the curated list of clade-defining mutations of Delta and Omicron (lineage BA.1) as previously published [5, 6]. This list is based on https://covariants.org/variants as of 11/02/2022, excluding 21846 C>T (S: T95I), which is present in 40% of Delta variants.

medRxiv preprint doi: https://doi.org/10.1101/2022.03.24.22272871; this version posted March 27, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Continuous variables are presented as means ± standard deviations (SD) or medians with interquartile range (IQR) and were compared using non-parametric Kruskal-Wallis or Mann-Whitney tests. Proportions were compared using the chi-squared or Fisher's exact test, as appropriate.
A p value of <0.05 was regarded as statistically significant. Statistical analyses were conducted using R software, version 4.0.5 (R Foundation for Statistical Computing).

Samples used in this study were collected as part of an approved ongoing surveillance

The GISAID accession numbers of the Delta and Omicron virus isolates used for experimental mixes are EPI_ISL_11171170 and EPI_ISL_11171169, respectively. Sequencing data of the Delta:Omicron mixes were deposited in the SRA database under accession PRJNA817870, and dehosted sequencing data of NPS with co-infections were deposited under accession PRJNA817806.

To simulate co-infections, Delta (B.

Four sets of primers (Artic V4 and V4.1, Midnight V1 and V2) were used in duplicate on extracts to test the impact of PCR amplification prior to sequencing on co-infection characterization. All mixes were sequenced to 1 M paired-end reads, yielding SARS-CoV-2 genome coverage >98% with a median coverage of 2276X (IQR = 315X) (Table S2). The primer sets were evaluated using a previously published method based on a curated list of mutations specific to Delta and to Omicron derived from CoVariants [5] [6] [7] (Table S2 and Fig 1). More than 90% of the Delta-specific mutations were found in all mixes with the 4 primer sets. In contrast, the detection rate of Omicron-specific mutations ranged from 27% in the 90:10 mix using Midnight primers (V1 and V2) to >78% for mixes with an expected frequency of Omicron above 30% (Fig 1A and Table S1). Medians of the allele frequencies of covered specific mutations were used to estimate viral frequency. Relations
between measured and expected frequency were not linear (Fig 1B). Over-estimation of Delta was observed for all mixes with all primer sets, especially in mixes with an expected frequency of Delta under 30% and sequenced with Midnight primer sets. Measured frequencies of Delta for the 10:90 mix were 30-33% with Midnight V1 and V2, and 21-25% with Artic V4 and V4.1. Importantly, consensus sequence calling based on the majority rule resulted in artefactual chimeric Delta-Omicron sequences for several mixes, with different patterns depending on the primers used for amplification (Fig S1). Chimeric sequences were observed with the four primer sets in all duplicates only for the 20:80 mix (Fig 1C). With Artic primer sets, chimeric sequences were characterized by Omicron sequences bearing the S:L452R and M:I82T mutations, and additional Delta-specific mutations with increasing Delta concentration. With Midnight primers, chimeric sequences were characterized by Omicron sequences with the 3' end of the genome belonging to Delta (starting from nt 27638 or 27874). Altogether, the Artic V4.1 primers were the least biased for Delta/Omicron co-infection detection and relative frequency estimation, but all primer sets could lead to artefactual chimeric sequences, highlighting the importance of proper co-infection detection.

Independently of this specific set of mutations, an agnostic approach was developed to detect co-infections regardless of the lineages present in the sample (Fig 2). This approach is based on the identification of a potential secondary lineage, after excluding variants shared with the main lineage. A secondary lineage is identified only if at least 6 specific mutations are present. Two ratios are calculated: the main lineage mutation ratio and the secondary lineage mutation ratio, each quantifying the fraction of present mutations among covered specific mutations.
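The logic of this agnostic approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the seqmet implementation: variants are reduced to `(position, change)` tuples, the lineage database is a plain dictionary, and the names `mutation_ratio`, `detect_secondary`, `min_depth` and `min_specific` are hypothetical. The thresholds (100 reads for a covered position, 6 lineage-specific variants) are those stated in the text; returning the candidate with the highest ratio when several lineages qualify is an assumption, as the selection rule is not described here.

```python
from typing import Dict, Optional, Set, Tuple

Variant = Tuple[int, str]  # (genome position, change), e.g. (27638, "T>C")


def mutation_ratio(expected: Set[Variant], called: Set[Variant],
                   depth: Dict[int, int],
                   min_depth: int = 100) -> Tuple[int, int, float]:
    """Return (observed, covered, ratio) for a set of expected variants.

    Only expected variants at positions covered by at least `min_depth`
    reads count toward the denominator.
    """
    covered = {v for v in expected if depth.get(v[0], 0) >= min_depth}
    observed = covered & called
    ratio = len(observed) / len(covered) if covered else 0.0
    return len(observed), len(covered), ratio


def detect_secondary(main: str, called_minor: Set[Variant],
                     database: Dict[str, Set[Variant]],
                     depth: Dict[int, int],
                     min_specific: int = 6) -> Optional[Tuple[str, float]]:
    """Return (lineage, ratio) for the best secondary lineage, or None."""
    best: Optional[Tuple[str, float]] = None
    for lineage, variants in database.items():
        if lineage == main:
            continue
        # Exclude variants shared with the main lineage.
        specific = variants - database[main]
        observed, _, ratio = mutation_ratio(specific, called_minor, depth)
        # Assumption: among qualifying lineages, keep the highest ratio.
        if observed >= min_specific and (best is None or ratio > best[1]):
            best = (lineage, ratio)
    return best
```

In this sketch, the main lineage mutation ratio would be obtained by calling `mutation_ratio(database[main], called, depth)` on all called variants (major and minor), while `detect_secondary` only considers minor variants.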
With the co-infection detection script applied to the experimental mixes, co-infection was successfully identified in all mixes, except for the pure Delta (100:0) and Omicron (0:100) isolates for which only Delta (lineage B.1.617.2) and Omicron (lineage BA.1) were identified as the main lineages, respectively. B.1.617.2 and BA.1 were identified as the main and secondary lineages, respectively, in all mixes with an expected frequency of Delta above 40%, independent of the primer sets (Fig 2 and Table S1). Main lineage mutation ratios were above 0.9 for all mixes (Fig 2A). Secondary lineage mutation ratios were between 0.216 and 0.941 (Fig 2B). The lowest ratios were found for the 90:10 mix, with only 0.23, 0.25, 0.39 and 0.55 of BA.1-specific mutations found using Midnight V2, Midnight V1, Artic V4 and Artic V4.1, respectively. Altogether, the results of the unbiased co-infection detection script were consistent with the curated list approach, with a better detection of the secondary lineage with Artic primers.

Between December 6th 2021 (week 49-2021) and February 27th 2022 (week 08-2022), 23,242 samples were sequenced using Artic V4 or V4.1 primer sets as they became available. In total, whole-genome sequences (coverage >90%) were obtained for 21,387 samples collected from Flash surveys (n=16,220) and from HCL and peripheral hospitals (n=5,167). Among the 21,387 samples, 64 samples (0.30%) had a secondary lineage identified with positive ratios (Fig 3). To rule out potential contamination during initial sequencing, all 64 samples were re-extracted and sequenced in duplicate.
In total, 53 samples had a positive secondary lineage mutation ratio in duplicate: 28 samples were identified as a Delta/Omicron (BA.1) co-infection; 1 sample was identified as a Delta/Omicron (BA.2) co-infection; 24 samples were identified as a co-infection between 2 different Omicron lineages (BA.1 and BA.2) (Fig 4). All co-infections were confirmed by visual examination of vcf plots using the lists of specific mutations from our lineage variant database (Fig S2, Fig S3 and Fig S4). Uniform frequencies of lineage-specific mutations along the genome were observed in each sample, except for two samples ("021228537801" among the Delta/BA.1 co-infections and "722000801801" among the BA.1/BA.2 co-infections). In sample 021228537801, frequencies of Delta- and BA.1-specific mutations were reversed between the first and second half of the genome (Fig S2). In sample 722000801801, frequencies of BA.1- and BA.2-specific mutations were reversed between the first quarter and the rest of the genome (Fig S4). Delta/Omicron (BA.1) co-infections were detected between weeks 50-2021 and 04-2022 (Fig 4). Considering only the period of Delta and Omicron (BA.1) co-circulation at relative

To assess the impact of co-infections on clinical presentations, demographic features including age and sex were reported for 13,187 out-patients, 6,242 hospitalized patients, and 803 healthcare workers (Table 1). In the three groups, no significant difference was noted between BA.1 and BA.2 infections regarding median age or proportion of men (p>0.05).
Therefore, BA.1 and BA.2 cases were grouped into Omicron cases for further analysis. Delta cases were significantly older than Omicron cases among out-patients (p=0.003) and among hospitalized patients (p<0.001). The proportion of men was also significantly higher among Delta cases for hospitalized patients (p<0.001). No difference regarding age or sex was found between Delta and Omicron cases for healthcare workers (p>0.05). Among the three groups, no significant difference in age or sex was found for Delta/Omicron or BA.1/BA.

In conclusion, our findings emphasize the importance of using appropriate experimental and bioinformatic methods for the comprehensive identification of SARS-CoV-2 co-infections. Although these events are rare, SARS-CoV-2 co-infections need to be properly identified as they can lead to the emergence of new variants after a recombination event.

Lineage 20I/501Y.V1 (B.1.1.7, variant of concern 202012/01) in France, January to
We would like to thank all the members of the GenEPII sequencing platform who contributed to this investigation. We also thank all the laboratories, clinicians and patients involved in this work.

This work was carried out within the framework of the French consortium on surveillance and research on infections with emerging pathogens via microbial genomics (consortium relatif à la surveillance et à la recherche sur les infections à pathogènes EMERgents via la GENomique microbienne EMERGEN; https://www.santepubliquefrance.fr/dossiers/coronavirus-covid-19/consortium-emergen)
Santé publique France, the French national public health agency.
Caisse nationale d'assurance maladie (Cnam), the national health insurance funds.

Table S1: Versions of tools used in the seqmet bio-informatic pipeline.
show which mutations are called in the consensus sequence based on the majority rule.

Tracking SARS-CoV-2 variants
Rapid epidemic expansion of the SARS-CoV-2 Omicron variant in southern Africa
Enquêtes Flash : évaluation de la circulation des variants du SARS-CoV-2 en France

T>A; 24503 C>T; 25000 C>T; 25469 C>T; 25584 C>T; 26270 C>T; 26530 A>G; 26577 C>G; 26709 G>A; 26767 T>C; 27259 A>C; 27638 T>C; 27752 C>T; 27807 C>T; 27874 C>T; 28247 AGATTTC>A; 28270 TA>T; 28311 C>T