key: cord-0877874-honsyrkh authors: Pablos, Isabel; Machado, Yoan; Ramos de Jesus, Hugo C.; Mohamud, Yasir; Kappelhoff, Reinhild; Lindskog, Cecilia; Vlok, Marli; Bell, Peter A.; Butler, Georgina S.; Grin, Peter M.; Cao, Quynh T.; Nguyen, Jenny P.; Solis, Nestor; Abbina, Srinivas; Rut, Wioletta; Vederas, John C.; Szekely, Laszlo; Szakos, Attila; Drag, Marcin; Kizhakkedathu, Jayachandran N.; Mossman, Karen; Hirota, Jeremy A.; Jan, Eric; Luo, Honglin; Banerjee, Arinjay; Overall, Christopher M. title: Mechanistic Insights into COVID-19 by Global Analysis of the SARS-CoV-2 3CLpro Substrate Degradome date: 2021-10-09 journal: Cell Rep DOI: 10.1016/j.celrep.2021.109892 sha: 0661678ca7a4348a9e2fc5b1a39b0e4a8a4aa6dd doc_id: 877874 cord_uid: honsyrkh The main viral protease (3CLpro) is indispensable for SARS-CoV-2 replication. We delineate the human protein substrate landscape of 3CLpro by TAILS substrate-targeted N-terminomics. We identify >100 substrates in human lung and kidney cells supported by analyses of SARS-CoV-2-infected cells. Enzyme kinetics and molecular docking simulations of 3CLpro engaging substrates reveal how noncanonical cleavage sites, which diverge from SARS-CoV, guide substrate specificity. Cleaving the interactors of essential effector proteins, effectively stranding them from their binding partners, amplifies the consequences of proteolysis. We show that 3CLpro targets the Hippo pathway, including inactivation of MAP4K5, and key effectors of transcription, mRNA processing, and translation. We demonstrate that Spike glycoprotein directly binds galectin-8, with galectin-8 cleavage disengaging CALCOCO2/NDP52 to decouple antiviral-autophagy. Indeed, in post-mortem COVID-19 lung samples, NDP52 rarely colocalizes with galectin-8, unlike healthy lung cells. The 3CLpro substrate degradome establishes a foundational substrate atlas to accelerate exploration of SARS-CoV-2 pathology and drug design. 
The current understanding of how SARS-CoV-2 overwhelms the host cell machinery and escapes antiviral defenses is far from complete. Viruses have evolved an ability to maximize a small genome; thus, their proteins are pleiotropic and multifunctional. As multitasking proteins present challenges for drug development (Butler and Overall, 2009), deciphering the pleiotropic roles of viral proteins in host cells will inform the identification of novel drug targets for SARS-CoV-2 and other beta-coronaviruses. Within the two polyproteins encoded by SARS-CoV-2 reside two essential proteases for replication (Kim et al., 2020).
Non-structural protein-5 (NSP5) encodes the main protease, 3-chymotrypsin-like protease (3CLpro) (Dai et al., 2020) and NSP3 encodes papain-like protease (Shin et al., 2020). 3CLpro is a validated drug target that releases 16 NSPs by cleaving at eleven L/FQ↓(S/A/G/N) sites for viral replication complex assembly. In addition, host cell protein cleavage by viral proteases is a critical component of viral pathogenicity (Jagdeo et al., 2018), including diverting cellular processes to viral replication, defeating antiviral responses and immune response modulation. However, determining the repertoire and diversity of proteolytic cell targets is a long-standing challenge, and the pathobiological mechanisms driven by 3CLpro in COVID-19 remain elusive. Substrate cleavage requires that the amino acids flanking the scissile bond on the proximal non-prime (P) side and the distal prime (P') side fit the protease S and S' subsites, respectively. Medicinal chemistry classically focuses on the P-side interface to increase drug potency. However, knowledge of human cellular target proteins would improve the characterization of P'-recognition subsites to guide drug development and decipher infection pathways to understand and predict outcomes of 3CLpro-inhibitor drug therapy of [...]

Many large-scale analyses of the SARS-CoV-2 infected-cell transcriptome (Stukalov et al., 2021), proteome (Stukalov et al., 2021), phosphoproteome and interactomes (Gordon et al., 2020; Stukalov et al., 2021) have been described. With only 14 substrates reported in SARS-CoV-2 infection (Meyer et al., 2021; Moustaqil et al., 2021), the 3CLpro human substrate repertoire, also known as the degradome (López-Otín and Overall, 2002), is not well understood. Thus, the opaque contribution of 3CLpro to the overwhelming of host cell machinery remains understudied.
We addressed this challenge by employing state-of-the-art substrate-targeted proteomics and substrate winnowing analyses to comprehensively profile the human host cell substrates of 3CLpro. Here, we expanded the 3CLpro substrate landscape to over 100 substrates and 58 additional high-confidence candidate substrates. In exploring the consequences of 3CLpro cleavage events, we demonstrate the direct binding of galectin-8 to Spike S1 glycoprotein and find that this complex is disrupted upon cleavage to impact antiviral-autophagy, also known as xenophagy. Cleavage of four Hippo signalling proteins, including [...]

We profiled the substrate repertoire of 3CLpro in human cell proteomes by Terminal Amine Isotopic Labelling of Substrates (TAILS) (Kleifeld et al., 2010), a powerful method to selectively purify neo-N-terminal peptides corresponding to substrate P'-cleavage products (Figure 1A; Tables 1 and S1-S6). We analyzed 3CLpro cleavages in native proteome extracts from human embryonic kidney [...] substrates.

For definitive identification as a 3CLpro substrate, we required further high-stringency conditions to be met. Heavy-labelled neo-N-termini had to be present solely as a "heavy singleton" without the corresponding isotopic light counterpart from control samples. For confident identification as a biologically relevant cleavage site, these neo-N-termini had to be identified in ≥ 2/3 independent HEK-293 or ≥ 7/9 independent BEAS-2B cell experiments. Combining the HEK-293 and BEAS-2B datasets, we quantified 1,649 labelled N-termini, including 955 neo-N-termini (Figure 1B; Tables S6A and S6B). Thereby, we identified 292 3CLpro-cleaved neo-N-termini in 229 proteins (Figures 1C, S1D and S1H; Table S6A).
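The isotope-label and replicate criteria above can be sketched as a simple filter. This is a minimal illustration rather than the authors' pipeline: the `Observation` record layout, the field names and the function name `passes_winnowing` are hypothetical, and only the thresholds (≥ 2 of 3 HEK-293 or ≥ 7 of 9 BEAS-2B experiments, heavy signal without a light counterpart) are taken from the text.

```python
from dataclasses import dataclass

# Replicate thresholds quoted in the text: >= 2 of 3 HEK-293 experiments
# or >= 7 of 9 BEAS-2B experiments.
MIN_REPLICATES = {"HEK-293": 2, "BEAS-2B": 7}


@dataclass(frozen=True)
class Observation:
    """One neo-N-terminus seen in one experiment (hypothetical record layout)."""
    site: str        # identifier of the cleavage site, e.g. "YAP1:Q133"
    cell_line: str   # "HEK-293" or "BEAS-2B"
    experiment: int  # index of the independent experiment
    heavy: bool      # detected in the heavy-labelled (3CLpro-treated) channel
    light: bool      # detected in the light-labelled (control) channel


def passes_winnowing(observations):
    """Return the sorted sites seen as heavy singletons in enough replicates."""
    replicates = {}
    for obs in observations:
        # A "heavy singleton" has a heavy signal and no light counterpart.
        if obs.heavy and not obs.light:
            key = (obs.site, obs.cell_line)
            replicates.setdefault(key, set()).add(obs.experiment)
    return sorted({
        site
        for (site, cell_line), experiments in replicates.items()
        # Unknown cell lines never pass (threshold defaults to infinity).
        if len(experiments) >= MIN_REPLICATES.get(cell_line, float("inf"))
    })
```

A site passes if it meets the threshold in either cell line; sites that also show a light signal are never counted, whatever their replicate support.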
The sequence logo of the 292 cleavage sites in native cellular proteins is consistent with the 3CLpro cleavage specificities in the viral polyprotein (Scott et al., 2021), and natural and non-natural amino acid peptide substrates (Rut et al., 2021) (vide infra). Notably, the 'other' 663 neo-N-termini winnowed out were found not to start after the SARS-CoV-3CLpro consensus P1-Gln (Figure 1C; Table S6B).

Finally, to select only bona fide substrates, we generated a position-specific scoring matrix (PSSM) using the normalized relative frequency of amino acids in positions P4-P4' of the 292 sites deemed to be 3CLpro cut-sites (Figure 1C). We then calculated a score for the P4-P4' sequence of all 955 neo-N-termini to measure similarity relative to the PSSM and selected the 3CLpro sites scoring higher than the 90th percentile of the non-3CLpro cleavage sites (n = 171). All MS/MS spectra of these neo-N-terminal peptides were then manually inspected. Spectra from ragged protein ends, spectra showing poor fragmentation or noise, and four other sites not validated by synthetic peptide cleavage (STAR Methods) were excluded (n = 69, Table S1).

We conclude that 3CLpro targets at least 101 human substrates at 102 sites (Table 1) that could not be disproven by our substrate winnowing strategy, including 34 proteins identified in both cell lines (Figure 1D), 28 of which were found in all twelve or 11/12 independent experiments. Adding further weight to our analyses, 38 of the 167 cut sites we found in Table 1 and Table S1 were independently reported in a proteomics dataset brief (Koudelka et al., 2021), using in vitro N-terminomics in lung epithelial carcinoma cells (H441) and human pulmonary microvascular endothelial cells. However, no further biochemical or physiological validation was performed. In addition, Meyer et al.
(2021) very recently reported cleavage of NUP107 (Table 1) at position Gln35 in SARS-CoV-2 infected A549-ACE2 cells and GOLGA3 at Gln365 (Table S1) and ATAD2 at Gln949 (Table 1), which we also found. In their study, GOLGA3 cleavage was elegantly validated in 3CLpro transfected cells, whereas NUP107 and ATAD2 cleavages were attributed to 3CLpro based on the cleavage logo but without direct evidence. Likewise, our data validate the cut site at position Gln444 of TAB1 (Table 1) that Moustaqil et al. (2021) inferred from the electrophoretic migration of TAB1 proteolytic fragments and 3CLpro cleavage specificity.

We quantified the relative protein abundance of 45 substrates identified from a total of 2,767 quantified proteins in interferon-treated BEAS-2B cells (STAR Methods). Only galectin-8 protein expression increased in response to type I interferons, whereas YAP1 and VAT1 decreased (Figures S1I and S1J; Table S7). Hence, ISGs are not a significant substrate class of 3CLpro. Overall, 3CLpro cleaves cellular substrates involved in three main processes: i) RNA splicing, processing, activation, and metabolism, ii) translation, and iii) cell cycle control (Figure 1E; Table S8), affording insight into the processes of cellular subjugation utilized by SARS-CoV-2.

Structure-activity relationships of canonical vs. noncanonical 3CLpro cut-sites

Using MALDI-TOF-MS, we calculated the apparent (app) specificity constant, app(kcat/KM), of 3CLpro for synthetic peptides spanning P4-P4' of all cleavage sites in the 34 common substrates identified in HEK-293 and BEAS-2B cells (Figure 1D). In addition, we assayed cleavage-site peptides from 12 candidate substrates with compelling biology. 3CLpro cleaved all the 34 common peptides and 9/12 from the candidates (Figures 2A, 2B and S2A).
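How an apparent specificity constant can be obtained from a peptide cleavage assay is sketched below. The text does not state the calculation used, so this assumes the common pseudo-first-order treatment in which the substrate concentration is far below KM, giving [S]/[S]0 = exp(-(kcat/KM)[E]t). The function name and arguments are hypothetical, and reading the fraction of intact peptide from MALDI-TOF peak intensities is likewise an assumption.

```python
import math


def apparent_specificity_constant(fraction_remaining, enzyme_molar, time_s):
    """Estimate app(kcat/KM) in M^-1 s^-1 from a single time point.

    Assumes [S] << KM, so substrate decays as exp(-(kcat/KM) * [E] * t).
    fraction_remaining: intact substrate at time t divided by substrate at t = 0.
    enzyme_molar: active enzyme concentration in mol/L.
    time_s: incubation time in seconds.
    """
    if not 0.0 < fraction_remaining <= 1.0:
        raise ValueError("fraction_remaining must be in (0, 1]")
    if enzyme_molar <= 0 or time_s <= 0:
        raise ValueError("enzyme_molar and time_s must be positive")
    return -math.log(fraction_remaining) / (enzyme_molar * time_s)
```

Under this assumption, a peptide that is not cleaved at all (fraction remaining of 1) yields a constant of zero, and the estimate is only meaningful while measurable intact substrate remains.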
The app(kcat/KM) of 3CLpro cleaved peptides was consistent with the 3CLpro preferences for small amino acids in P1', glutamine in P1, and leucine in P2 (Figure 1C), but with surprising yet unequivocal exceptions. The presence in P1 of methionine (T22D2, MAP4K5) or histidine (RBM15, MCM4) did not block cleavage (Figure 2B). Although no previous reports identify the noncanonical Met at P1 in substrates, we also found the same neo-N-terminal peptide for MAP4K5 by data mining the proteomic dataset report of Koudelka et al. (2021), which had not been designated a candidate substrate as it lacked the P1-Gln.

We also demonstrate similarities and divergence at P2 from the dominant leucine specificity (Figure 1C) previously reported in the SARS-CoV-2 polyprotein (Scott et al., 2021), peptides (Rut et al., 2021), and monkey and human proteins (Koudelka et al., 2021; Meyer et al., 2021). In the polyprotein, P2-Val and P2-Phe each occur once. We too found valine (CREB, site 2) and phenylalanine (SRRM2), as well as methionine (MCM4) and alanine (CLCB) at P2, which we validated (Figure 2B). Additionally, we establish the occurrence of isoleucine (RS21), glutamine (SF3B2, NACAM), and proline (IF4G1, PTBP1 2nd site) in P2, which were previously unreported. The noncanonical P2 residues impaired catalytic efficiency but did not block cleavage. We frequently found glutamine and valine at P3 (e.g. GOLGA2 and CREB1, respectively) and, at P4, valine and eight instances of proline (including NUP107 and FYCO1, respectively). However, the most significant difference between the specificity logos is the prime-side specificity profile C-terminal to P1', which has been largely overlooked in the other studies of SARS-CoV-2 3CLpro. Thus, the kinetics analyses confirm the cleavage specificity divergence we found by sequence analysis of cleaved native human proteins (Figure 1C).
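The position-wise preferences compared here are the information that the position-specific scoring matrix (PSSM) of the winnowing step encodes. A minimal sketch of such scoring follows. The text says only that the PSSM used normalized relative amino acid frequencies at P4-P4' and that sites scoring above the 90th percentile of non-3CLpro sites were kept, so the log-odds form, uniform background, pseudocount, nearest-rank percentile and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sites, pseudocount=1.0):
    """Log-odds PSSM from aligned P4-P4' sequences (uniform background assumed)."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(sites) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for position in range(len(sites[0])):
        counts = Counter(site[position] for site in sites)
        pssm.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, site):
    """Sum the per-position log-odds for one P4-P4' sequence."""
    return sum(column[aa] for column, aa in zip(pssm, site))


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def select_sites(candidates, non_protease_sites, pssm, q=90):
    """Keep candidates scoring above the q-th percentile of non-protease sites."""
    cutoff = percentile([score(pssm, s) for s in non_protease_sites], q)
    return [s for s in candidates if score(pssm, s) > cutoff]
```

For example, a matrix built from the cut-site sequences quoted in the text, such as ASLQLGAV (YAP1) and TILQYAQT (CREB1 and ATF1), scores those sequences higher than an unrelated 8-mer.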
These unexpected findings are fundamental to inform drug development and derive from an approach that does not require manual searches based on assumed cleavage site preferences that miss such deviations.

Several structural analyses reported the P-side interactions of peptides or inhibitors with 3CLpro (Vuong et al., 2020; Zhang et al., 2020). However, to our knowledge, only one paper described a P'-side sequence engaged in the 3CLpro-S' interface, but the autocatalytic NSP5 P1'-P3' sequence (Ser-Ala-Val) fits poorly. Indeed, none of the 101 human substrates display this sequence. Reasoning that human substrate complexes with 3CLpro would reveal biologically relevant structure-activity relationships, we modelled the binding complex of the 3CLpro dimer/cleavage-site peptide of seven human substrates by high-resolution peptide-protein docking. All models displayed highly negative I_sc (Rosetta interface score) values, indicating a favourable 3CLpro and peptide interaction (Figures 2C-2I), and for the P-side interactions, our models resembled published structures. Hydrogen-bond lengths were within 3.5 Å (Kajander et al., 2000), and best-fit models varied due to molecular dynamics.

Even when the P-sequence is optimal, cleavage was affected by the fit of residues in subsites on the P'-side. The most prominent site is P1', since the S1' subsite cannot typically accommodate bulky residues due to steric hindrance imposed by the Thr25, Leu27, and His41 side-chains. The consensus P1'-Ala/Ser/Gly each fit optimally in S1'. Nevertheless, some substrates are efficiently cleaved despite relatively bulky side-chains at P1', e.g., LQ78↓N in IMA4 and LQ133↓L in YAP1 (Figure 2B).
In IMA4, the P1'-Asn points towards S3', where the side-chain amide group is within hydrogen-bonding distance of Thr25 (3.2 Å), His41 (2.9 Å), Cys44 (1.6 Å), and Ser46 (3.3 Å) (Figure 2E). Thus, S3' is dynamic, accommodating residues from other P'-side positions. In RBM15, the P1'-P4' residues form β-sheet-like hydrogen bonds with Thr24-26 (Figure 2G), contributing significantly to the best P'-side fit, i.e., lowest I_sc = -39.65, of the modelled substrates. [...] (Table S1), i.e., ~10% of substrates. In this case, the main-chain oxygen atom of the P1-His or Met accepts hydrogen bonds from the main-chain nitrogen of Gly143, Ser144 and Cys145 to promote cleavage at the Leu-Met↓Ser and Leu-His↓Ser sites. These noncanonical P1 residues and the dynamic occupancy of S3' were unexpected and can be leveraged for 3CLpro/inhibitor drug development and predictions of off-targets in treatment.

3CLpro cleaves RPAP1 and PTBP1, altering PTBP1 subcellular localization

The subversion of transcription and translation machinery is a recognized strategy to co-opt host cells for optimal viral replication (Walsh and Mohr, 2011). Indeed, the three major gene sets enriched with 3CLpro substrates are proteins involved in these processes (Figure 1E; Table S3). We further characterized two substrates. RNA polymerase II-associated protein 1 (RPAP1) is crucial for optimal RNA polymerase II activity: by binding a protein known as Mediator, RPAP1 couples RNA polymerase II to enhancer elements to elevate transcription (Lynch et al., 2018). Polypyrimidine tract binding protein (PTBP1) binds mRNA and is essential for the sequential phases of viral translation and replication (Florez Paola et al., 2005).
RPAP1, cleaved in N = 12/12 experiments, is one of the best substrates for 3CLpro with an app kcat/KM > 1.5 × 10³ M⁻¹ s⁻¹, and PTBP1 was identified in N = 3/3 HEK-293 cell experiments (Figure 3A). In time-course 3CLpro in vitro cleavage assays, we observed loss of both substrates coincident with sequential cleavage-product generation at molecular weights predicted from the cut-site locations (Figures 3B and S3). Catalytically inactive mutant 3CLpro-C145A or incorporation of a 3CLpro inhibitor, GC376 (Vuong et al., 2020), confirmed 3CLpro cleavage of the substrates. Edman sequencing validated the RPAP1 and PTBP1 neo-N-termini identified by TAILS and identified other cleavage sites, which we supported by peptide cleavage kinetics assays (Figure 3C). [...] This unusual cleavage sequence, i.e., P2-Pro followed by P1-Gln, is spliced out from isoform-1. Since cleavage at the shared site will remove the nuclear localization sequence from the N-terminus of all PTBP1 isoforms (Figure 3A), we examined whether SARS-CoV-2 infection altered the nuclear localization of PTBP1, as previously reported for other coronaviruses (Sola et al., 2011). In uninfected Vero E6 cells, PTBP1 was exclusively located in the nucleus (Figure 3G) with a nuclear to cytosolic ratio of 1.9 (Figure 3H). However, upon SARS-CoV-2 infection, PTBP1 translocated to the cytoplasm with a nuclear/cytosol ratio of 0.3 at 48 hpi (N = 5, n > 50 cells, Figures 3G, 3H and S5). Frequently, the same microscopy fields evidenced nuclear-to-cytosol transit of PTBP1 in infected cells but not in nearby uninfected bystander cells, which is more evident at high magnification (Figures 3G and S5D). Moreover, we showed that IMA4, which is involved in cargo recognition, and TPR and NUP107, which are integral parts of the nuclear pore ring, are all substrates of 3CLpro (Figures S2C-S2E and S4B).
These substrates provide evidence for potential mechanisms in the targeted shutdown of nucleocytoplasmic transport by SARS-CoV-2, a viral strategy to repress host cell translation (Caly et al., 2015).

In picornavirus, RNAi-silencing reveals that full-length PTBP1 negatively regulates viral RNA transcription (Florez Paola et al., 2005). Hence, PTBP1 cleavages may relieve an inhibitory effect on SARS-CoV-2 replication. Alternately, poliovirus 3CDpro reportedly cuts PTBP1 and blocks IRES-dependent protein synthesis, switching from viral translation to replication (Back et al., 2002). Notably, knockdown of RPAP1 results in broad reductions in transcription and leads to cell dedifferentiation (Lynch et al., 2018), which is often a feature of viral infection but is poorly understood. Thus, the fragmentation of RPAP1 by 3CLpro, which we hypothesize phenocopies RPAP1 silencing, together with direct cleavage of RNA polymerase I (Table 1) [...]

The Hippo signalling pathway, which regulates cell morphology, mechanotransduction, tissue growth and regeneration, is not a generally recognized target of viral proteolytic attack (Yalamanchili et al., 1997). Nevertheless, TAILS identified three substrates integral to Hippo signalling: YAP1, CREB1 and ATF1, with a fourth, MAP4K5, involved in the regulation of Hippo/EGFR crosstalk. The phosphorylation of YAP1 by LATS1/2, a downstream phosphorylation target of the MAP4K family, prevents nuclear translocation and transcriptional activity of YAP1 (Rausch and Hansen, 2020). MAP4K5 contains ten Leu-Gln instances with at least three optimal sequences for 3CLpro cleavage, yet none were cut in 9/9 independent BEAS-2B analyses. Instead, TAILS identified a noncanonical SKLM456↓SENT cleavage site between the kinase and the regulatory citron homology domains in all experiments (Figure 4A). P1-Met was previously unknown to be susceptible to 3CLpro.
Therefore, we verified the TAILS cut-site by cleaving the corresponding P4-P4' synthetic peptide (Figure 4B). Edman sequencing confirmed that product-2 of cleaved recombinant MAP4K5 protein was from scission at Met456↓Ser, with immunoblotting showing the N-terminal origin of product-1 (Figure 4C). Hence, in addition to glutamine and histidine, 3CLpro accommodates methionine in P1 (Figure 2H), which must now be considered integral to its specificity profile.

After activation by upstream signals, including the Hippo pathway, CREB1 dimerizes with ATF1 to form a competent transcription factor that binds the cAMP-responsive element to promote expression of anti-apoptotic and cell proliferation genes (Persengiev and Green, 2003). Moreover, the Hippo signalling pathway cross-talks with Wnt, Notch, the EGF receptor ERBB4, and the TGFβ pathway through SMAD1 and SMAD7 (Dupont et al., 2011). [...] (Figure 2I) (Rut et al., 2021). Edman sequencing confirmed cleavage at VQ243↓A and revealed a 2nd site at TILQ223↓YAQT (product-2, Figure 4C). We mined the TAILS data and found proteomic evidence for this site (n = 2/12, HEK-TAILS2_ACN, MS/MS #46,629). Hence, our stringent substrate winnowing criteria identified substrates with great confidence but at the expense of underestimating substrate numbers. TAILS also identified the identical site, TILQ151↓YAQT, in ATF1 (Table 1), which we confirmed by peptide cleavage (Figure S2A).

YAP1, MAP4K5 and CREB1 in primary HAECs (N = 5) were cleaved by 3CLpro, but not inactive 3CLpro-C145A (Figure 4D), with cleavage of CREB1 and YAP1 dimers also evident (Figure S4A). In Vero E6 cells infected with SARS-CoV-2, we identified reductions in endogenous YAP (Figure 4E) and MAP4K5 (Figure 4F).
MAP4K5 cleavage products were at the expected apparent molecular weights (Figure 4F) and consistent with the MAP4K5 cleavage products shown in primary human airway epithelial cells (Figure 4D). In SARS-CoV-2 infection of a second cell type, human Calu-3 cells, antibodies to CREB1 did not show evident changes of the full-length band, consistent with the low cleavage rate of the synthetic peptide and the recombinant protein shown by in vitro cleavage assays (Figures 4B and 4C). Higher molecular weight bands (Figure 4G) with similar-size products were observed in bronchial epithelium after cleavage by 3CLpro (Figures S4B and S4C). However, the antibody specificities were not optimal for more definitive conclusions in these cells. We measured kinase activity of MAP4K5 and found that cleavage separation of the Ser/Thr-kinase domain from the citron homology domain by 3CLpro halted kinase activity (Figure 4F). Thus, 3CLpro redundantly targets the transcription arm of the Hippo pathway.

Phosphorylation of Ser381 targets YAP1 for proteasomal degradation, whereas phospho-Ser127 triggers YAP1 binding to 14-3-3, which sequesters YAP1 in the cytosol, preventing transit to the nucleus as a transcriptional coactivator (Rausch and Hansen, 2020). YAP1 cleavage at ASLQ133↓LGAV was observed in 9/9 independent BEAS-2B TAILS experiments, which we confirmed by peptide cleavage kinetic analyses (Figure 2B). Scission at Gln133 could prevent Ser127 phosphorylation, 14-3-3 binding and hence nuclear translocation. Truncation of YAP1 at Gln133 generates a C-terminal fragment homologous to the transcriptionally inactive isoform-4 of YAP1, which efficiently inhibits IRF3 translocation and innate antiviral responses (Wang et al., 2017).
Thus, the redundant inactivation of YAP1 by removal of the YAP1 Ser127 kinase-activation sequence/14-3-3 binding site, the inactivation of an upstream regulator kinase, MAP4K5, together with two downstream transcription factor targets, CREB1 and ATF1, strongly implicate the importance of repressing Hippo-regulated gene transcription and TBK1 activity for optimal SARS-CoV-2 infection.

Diverse 3CLpro targets in viral subjugation of the cell in COVID-19

We validated substrates from other pathways relevant to the viral hijacking of the cell. These include EIF3 (Figure 2A), which blocks binding of SARS-CoV-2 NSP1 to the 40S ribosomal subunit (Lapointe et al., 2021); and FAS-associated factor 1 (FAF1) (Figures 4H and 4J), a positive regulator of type-I interferon signalling (Kim et al., 2017). Insulin receptor substrate 2 (IRS2) (Figure S4A), a key phosphorylation target of the insulin receptor (Guo et al., 2006), was also cleaved, as were two integral components of nuclear pore transport: nuclear pore complex [...] (Cheng et al., 2016), and galectin-8 (Figure 5) (Wang et al., 2020).

Galectins are essential in host defence by directly interacting with pathogens and regulating the immune response (Wang et al., 2020). Galectin-8 was the only 3CLpro substrate elevated by type I interferons, consistent with an antiviral role (Figures 5A and S1I; Table S6). Proteolysis of [...] (Figures 5D and S6C). Significant interactions also occur on the P'-side, mainly by Thr21, Thr24 and Thr26. The 3CLpro protomer-2 further stabilizes the 3CLpro/galectin-8 complex by hydrogen bonds between Cys300 (2.7 Å) and Ser301 (3.5 Å) of 3CLpro to Thr168 and Glu169, respectively, of galectin-8.

Galectin-8 binds glycans on the cell surface (Carlsson et al., 2007) and has hemagglutination activity due to its bivalent carbohydrate-binding capacity.
We found that 3CLpro cleavage disrupts glycan binding: separation of CRD1 from CRD2 by 3CLpro prevented hemagglutination of human erythrocytes (Figures S6D and S6E) and surface adhesion of Jurkat T cells (Figure S6F). In addition, proteolysis of endogenous galectin-8 by 3CLpro, but not inactive 3CLpro-C145A, was observed in primary HAECs (Figures 5E and S6G) and Calu-3 cells infected with SARS-CoV-2 (Figure 5F).

On permeabilized endosomes, intracellular galectin-8 detects exposed glycans normally on the cell exterior, leading to cell resistance to infection, e.g. by S. Typhimurium (Thurston et al., 2012) and picornavirus (Staring et al., 2017). Upon exposure of alpha-2,3-sialylated and 3'-sulfated glycans to the cytosol, e.g., on membrane damage, galectin-8 recruits an autophagy adaptor, CALCOCO2/nuclear dot protein-52-kDa (NDP52), which binds microtubule-associated protein-1 light chain-3 (MAP1LC3). MAP1LC3-coated autophagosomes are then targeted for lysosomal degradation (Mohamud and Luo, 2019). We hypothesized that in SARS-CoV-2 infection, galectin-8 senses the highly glycosylated Spike S1 protein and activates antiviral xenophagy, reducing SARS-CoV-2 infection. Of significance for viral entry and potential escape from xenophagy, we demonstrated direct binding of galectin-8 to immobilized Spike S1 protein and Spike S1 to immobilized galectin-8 (Figure S6H). This protein complex was broken up following 3CLpro cleavage of galectin-8 (Figures 5G and S6J). Decisively, a competitive inhibitor of galectin glycan-binding sites, thiodigalactoside, blocked binding (Figures S6I and S6J), confirming this previously unknown direct interaction between galectin-8 and Spike S1 glycans, which 3CLpro disrupts.

To determine the potential for galectin-8 acting as a cell sensor for SARS-CoV-2, we confirmed glycan-independent NDP52 binding to galectin-8 (Figure S6K).
By immunoprecipitation with anti-FLAG antibody, we confirmed NDP52 binds the C domain of galectin-8 generated after 3CLpro cleavage (Figures 5H and S6L), which we also showed by ELISA (Figure 5I) as previously reported by Li et al. (2013). As controls, we found that NDP52 and Spike S1 were not susceptible to 3CLpro cleavage (Figure S6M). We assembled the trimeric complex comprised of NDP52 bound to galectin-8 bound to immobilized Spike S1 protein. Using this complex, we showed that upon galectin-8 cleavage, the indirect tethering of NDP52 to Spike S1 was lost (Figure 5J).

To model the effect of 3CLpro cleavage of galectin-8 on autophagy, we transfected HEK-293 cells with galectin-8 or the 3CLpro-cleavage analogues FLAG-N-Gal8 (1-158) and FLAG-C-Gal8 (159-317). Upon disruption of endosomal/lysosomal integrity by osmotic shock in HEK-293 cells, we observed that transfected FLAG-tagged galectin-8 was recruited to damaged vesicles and formed puncta that colocalized with NDP52 (Figure 5K). In contrast, transfected FLAG-tagged cleavage-fragment analogues failed to form puncta or colocalize with NDP52 in HEK-293 cells (Figure 5K).

Analysis of human lung autopsy samples from post-mortem COVID-19 patients (N = 4) was [...] Despite the slightly weaker staining, this did not affect the colocalization analysis as only cells showing intact nuclei with DAPI staining present were counted. The difference in the expression pattern of NDP52 and galectin-8 was both substantial and consistent for each patient and field of view (n = 30). That is, there was virtually a complete overlap between the two proteins in normal lung (> 95% colocalization) versus in the patient samples, where only 5% of galectin-8 colocalized with NDP52 (Figures 5L and 5M).
Hence, our results showing direct binding of galectin-8 to Spike S1 protein and the C-domain of galectin-8 to NDP52 suggest an antiviral autophagy mechanism that SARS-CoV-2 3CLpro counteracts by cleavage of galectin-8 and FYCO1.

Protein-protein interaction landscape of 3CLpro host cell substrates

In addition to direct cleavage of essential host proteins, we reasoned that 3CLpro proteolytic activity could hijack the cellular machinery by indirectly modifying the function of substrate-interacting proteins. To explore this, we constructed a protein-protein interaction network using the 101 substrates of the 3CLpro degradome as seeds (Figure 6A, red circles). We retrieved 2,202 human proteins from the IMEx/IntAct database having rigorous experimental evidence for direct interactions or physical associations (Figure 6A). Among the interactors are sixteen proteins from Table S6 classified as "candidate" substrates (Figure 6A, orange circles; Table S2 [...] (Casey et al., 2018; Chanda et al., 2003); and iii) NDP52 and PICK1, required for antiviral-autophagy and endosome maturation.

The most significantly enriched protein complexes in the 3CLpro substrate interactome are the spliceosome, the PA700-20S-PA28 proteasome, the EIF3 complex, the anti-HDAC2 complex, and the TNF-α/NF-κB signalling complex (Figure 6B). These complexes are consistent with the functional categories of 3CLpro substrates (Figure 1E) and the cellular processes impacted by stranded proteins. Finally, we show that 26 viral proteins connect to 74 substrates, either by direct interactions (n = 16) in the virus/human-substrate interactome or via a shared interacting partner (Figure 6C). Notably, the substrate PTBP1 is the most connected with seven viral protein [...]

[...] flexibility of SARS-CoV-2 3CLpro in depth.
We show that 3CLpro is a pleiotropic viral factor that proteolytically processes over one hundred host cell proteins involved in essential cellular processes. Proteolytic processing is fundamentally different from degradation to completion via lysosomes and the ubiquitin-proteasome system. We demonstrated pertinent biological effects of processing with examples of altered protein function and subcellular localization after 3CLpro cleavage. Unlike viral competition for cellular resources, which is reversible, 3CLpro proteolytic processing of host cell substrates is irreversible. Thus, the targeted sculpting of the host cell proteome by viral proteases is one of the few direct ways that a virus, with a limited genome, can subvert the cell to enhance replication and infection while rapidly defeating antiviral defenses. Moreover, the effects of 3CLpro proteolysis reverberate through the cell by cleaving interactors of what we term "stranded" proteins that are not cut, effectively isolating essential cofactors and impairing their function or disassembling protein complexes.
We demonstrate that galectin-8, the only ISG we found targeted by 3CLpro, loses the ability to recruit the autophagy adapter NDP52 to damaged endosomes upon cleavage by 3CLpro. We further showed that galectin-8 functions as an intracellular sensor for SARS-CoV-2-loaded endosomes by recognizing the glycans decorating Spike S1. We suggest that proteolytic (Zhang et al., 2017). Notably, we also found that the TBK1 activators, TAB1 and TTC4, are 3CLpro substrates, as is FAF1, which also upregulates type-I interferon signalling. Thus, the inactivation of anti-apoptotic and cell proliferation proteins by cleavage-deregulation of the Hippo signalling pathway deserves further study in SARS-CoV-2 infection. The functional YAP/TAZ dimer interacts with, regulates, and is regulated by plasma membrane structures.
Therefore, deregulation of the Hippo pathway that relays cell shape and plasma membrane status should contribute to the dramatically altered cell morphology in SARS-CoV-2 infected cells and in the lungs of COVID-19 patients.
Due to the structural similarity between SARS-CoV and SARS-CoV-2 3CLpro, it is generally assumed that both enzymes behave with similar substrate preferences and kinetics. Most attention has been devoted to studying non-prime side interactions for drug development. In contrast, our study highlights the role of the substrate prime side and shows that SARS-CoV-2 3CLpro can cleave noncanonical sequences after methionine and histidine. We empirically showed cleavage occurs even with a bulky aliphatic residue in P1'. This can only occur after a significant conformational rearrangement of the substrate cleft, which has implications for the rational design of inhibitor drugs. The mechanistic insight gained from the over 100 substrates we discovered (with the promise of more by mining our data resource) and further exploration of the entire substrate degradome provide a foundational resource for the scientific community.
With many opposing cell mechanisms at play to favor viral translation and viral replication, targeting essential host proteins by 3CLpro with precision temporal-spatial localization over a range of cleavage rates may synchronize the wave of events in the COVID-19 cellular coup d'état. Thus, our study strengthens the case for 3CLpro inhibition as an attractive therapeutic option to not only block viral polyprotein processing and assembly of the replication complex but also synergistically restore protective antiviral intracellular defense pathways.
Our atlas of 101 substrates and the additional 68 candidate substrates provides rational start points for further investigations of the pathobiology of SARS-CoV-2 infection leading to COVID-19 triggered by 3CLpro cleavage of these host proteins. The cleaved substrate neo-termini in our atlas will help assess on-target drug efficacy in vivo. Moreover, clinical translation to detect cleaved substrate neo-N-termini, which more precisely reflect disease stage than the levels of the protein or transcript alone, is a precise diagnostic strategy for infection surveillance of SARS-CoV-2 and future coronavirus outbreaks that infect humans, which is just a matter of time.
Limitations of the study
Like all proteomics analyses, TAILS relies on mass spectrometry with inherent limitations in LC-MS/MS peptide identification and mass spectrometer sensitivity. These contribute to missing low abundance peptides, short or very long peptides, rare peptides from low abundance proteins, and some membrane proteins. However, in TAILS, short semi-tryptic neo-N-terminal peptides resulting from proteolytic cleavage are often lengthened somewhat because, in our workflow, lysine residues are blocked by dimethylation and trypsin cannot cut after dimethylated lysine. In addition, the polymer enrichment of neo-N-terminal peptides can amplify the detection of low abundance peptides, and we also accurately identified peptides >30 amino acids in length (e.g. Figure 3A).
To generate the most accurate atlas of substrates possible, we employed rigorous substrate winnowing criteria. This means that, although we identified many substrates with high confidence, we likely did not include some bona fide cleavage events. These can be data mined and followed up in subsequent studies, especially when data from emerging studies can be cross- (Table 1).
(E) Substrate Reactome gene set enrichment by hypergeometric distribution followed by FDR correction. Node radius designates gene enrichment; line widths are proportional to the overlap of shared substrates between connected nodes sharing ≥ 20% genes. See also Figure S1 and Tables S1-S8.
After bioinformatics analysis and substrate winnowing, n = 102 cut sites in n = 101 human protein substrates of 3CLpro were confidently identified. Fields marked as "" or "" indicate in which of the N = 12 independent cell experiments the cleaved neo-N-terminal P' peptide was found by TAILS LC-MS/MS with an FDR ≤ 0.01 at the peptide level. For protein identification, the TAILS and preTAILS shotgun proteomic analyses were combined in each experiment, with an FDR ≤ 0.01 at the protein level. Cleaved neo-N-terminal peptides of substrates that were reproducibly identified in ≥ 2/3 HEK-293 or ≥ 7/9 BEAS-2B experiments were further substrate winnowed by sequence distance score calculation and manual inspection of all MS/MS spectra in order to be considered bona fide substrates. *, Amino acid sequence of the cleavage site and P1' amino acid position identified from the neo-N-terminal peptide. ↓, scissile bond. †, Cleavage site confirmed by MALDI-TOF MS analysis of 3CLpro enzyme kinetics of P4-P4' spanning peptide cleavage (+). ‡, SRRM2 has two cleavage sites identified in the same protein, one in 12/12 experiments, the other in 9/9 BEAS-2B experiments. §, MCM4 was identified with a sequence distance score below the 10th percentile, but the P4-P4' synthetic peptide was cleaved in MALDI-TOF MS analysis. ¶, Substrate found in 2/3 HEK-293 cell experiments only (n = 4) or ≤ 6/9 BEAS-2B cell experiments only (n = 3), but with other compelling evidence or biology, including peptide cleavage in MALDI-TOF MS analysis, to be designated as a substrate.
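The reproducibility criterion stated in the table footnote can be written as a small filter. The following Python sketch is a minimal illustration, not the authors' pipeline: the record type, field names, and site labels are hypothetical, and it covers only the counting rule (≥ 2/3 HEK-293 or ≥ 7/9 BEAS-2B experiments), not the subsequent sequence distance scoring, manual MS/MS inspection, or the footnoted exceptions.

```python
from dataclasses import dataclass

HEK293_TOTAL = 3   # HEK-293 experiments reported in the table footnote
BEAS2B_TOTAL = 9   # BEAS-2B experiments reported in the table footnote


@dataclass(frozen=True)
class SiteObservation:
    """Hypothetical record of how often a neo-N-terminal peptide was detected."""
    site: str          # illustrative label only
    hek293_hits: int   # number of HEK-293 experiments with the peptide
    beas2b_hits: int   # number of BEAS-2B experiments with the peptide


def passes_reproducibility(obs: SiteObservation,
                           hek_min: int = 2,
                           beas_min: int = 7) -> bool:
    """Return True if the site meets >= 2/3 HEK-293 or >= 7/9 BEAS-2B."""
    if not 0 <= obs.hek293_hits <= HEK293_TOTAL:
        raise ValueError("hek293_hits must be between 0 and 3")
    if not 0 <= obs.beas2b_hits <= BEAS2B_TOTAL:
        raise ValueError("beas2b_hits must be between 0 and 9")
    return obs.hek293_hits >= hek_min or obs.beas2b_hits >= beas_min


def reproducible_sites(observations):
    """Keep only the observations that pass the counting rule."""
    return [obs for obs in observations if passes_reproducibility(obs)]
```

Sites that fail this rule are not necessarily rejected in the study: the ¶ footnote describes substrates retained on other evidence, which a purely count-based filter like this one would miss.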
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Christopher Overall (chris.overall@ubc.ca). This study generated 49 new synthetic 14-mer peptides spanning substrate P4-P4' cleavage sites suitable for MALDI-TOF-MS analysis, which are available from the lead contact. This study generated eukaryotic cell expression DNA constructs in plasmids for FLAG-tagged full-length human galectin-8 and FLAG-tagged 3CLpro cleavage-fragment analogues of human galectin-8 designated N-galectin-8 (1-158) and C-galectin-8 (159-317), which are available from the lead contact. This study generated C-terminal-tagged recombinant wild type (active) and inactive mutant 3CLpro-C145A plasmids, which have been deposited to Addgene: pET21b(+)_SARS-CoV-2_3CLpro-Q306A (Addgene ID 177334) and pET21b(+)_SARS-CoV-2_3CLpro-C145A-Q306A (Addgene ID 177335). The mass spectrometry proteomics data are available via ProteomeXchange with identifiers PXD026797 (username: reviewer_pxd026797@ebi.ac.uk and password: R2ujONYL) and PXD026815 (username: reviewer_pxd026815@ebi.ac.uk and password: 6t5zSsdI). Interactive versions of the PPI networks presented in Figures 6A and 6C are available online in the NDEx repository (https://public.ndexbio.org/#/network/1b9f868d-d391-11eb-b666-0ac135e8bacf?accesskey=f186d2601af5583f94dbd3e32898225e761d0bd5aa7bb885dbe5a8927a6a3962) and (https://public.ndexbio.org/#/network/195436fa-d391-11eb-b666-0ac135e8bacf?accesskey=8d7377ab2f26a9ca4b13276a71e55864cead1543752cd4737d9c803cb4fd6540). All flags and command lines used to generate the structural models reported in this study are available in this paper's supplemental information. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Primary human airway epithelial cells (HAECs) collection from five donors (one female, 57 years old and four males, 37, 47, 61, 71 years old) was approved by the University of Hamilton Integrated Research Ethics Board (HiREB) under protocol HiREB-5099-T. HAECs were cultured using PneumaCult-Ex Plus Media (STEMCELL Technologies). HAECs were cultured at 37°C and 5% CO 2 . (Banerjee et al., 2020) . A fresh vial of virus stocks was used for each experiment to avoid repeated freeze-thawing. SARS-CoV-2 infections of monkey Vero E6 (RRID: CVCL_0574) cells were performed in the University of British Columbia (UBC) BSL3 facility (FINDER) following the Public Health Agency of Canada and UBC FINDER regulations (UBC BSL3 Permit # B20-0099 to EJ). SARS COV-2/Canada/VIDO-01/2020 was kindly provided by Dr. S. Mubareka (Sunnybrook Research Institute, Toronto, ON, Canada). Human blood (~10 ml) was collected from a healthy volunteer donor (male, 24 years old) at the UBC Centre for Blood Research (UBC Human Ethics number: H06-00047) in a Vacutainer (BD) containing sodium citrate. Human tissue samples were collected and handled following Swedish laws and regulations. Normal lung samples (N = 3) were obtained from the Clinical Pathology Department, Uppsala University Hospital, Sweden and collected within the Uppsala Biobank organization. The samples were anonymized for personal identity by following the approval and advisory report from the Uppsala Ethical Review Board (Ref # 2002 (Ref # -577, 2005 . The tissue samples representing one female 54 years old (F54) and two males 15 and 45 years old (M15 and M45) were collected based on hematoxylin-eosin (H&E) stained tissue sections showing representative normal lung histology and quality-controlled by a certified pathologist. 
COVID-19 lung tissue samples (N = 4) were collected during clinical autopsies to establish the precise cause of death at the Department of Clinical Pathology/Cytology, Karolinska University Hospital, Huddinge, Stockholm, Sweden, described previously (Szekely et al., 2021). The Swedish Ethical Review Authority approved the study under the registration numbers DNR 2020-02446 and 2020-04339. Samples from four individuals were used (age 64, 97, 60 and 31), corresponding to cases 1, 8, 9 and 11 with the patient characteristics and clinical parameters described in detail (Szekely et al., 2021).
The DNA sequence of SARS-CoV-2 main protease (3CLpro, NSP5; YP_009725301.1 (protein ID), NC_045512.2 (whole SARS-CoV-2 genome)) was synthesized and cloned into the expression vector pET-21b(+) using NdeI and BamHI restriction sites. During synthesis, a second NdeI cleavage site in the original sequence was deleted by silent mutation (GenScript). For efficient expression and purification, a Gln306Ala mutation was introduced, eliminating the C-terminal 3CLpro autoproteolytic cleavage site (FQ306↓G). This site was followed by a 3x Gly flexible linker, the Factor Xa cleavage site, 2x Gly linker, 3x FLAG-tag, 2x Gly linker, Myc-tag, 2x Gly linker, and 6x His-tag. The catalytically inactive 3CLpro expression plasmid was constructed by mutating the codon for the catalytic cysteine 145 to alanine (3CLpro-C145A). Active and inactive proteases were expressed in E. coli BL21(DE3)pLysS (Thermo Fisher Scientific). Bacteria were grown at 37°C until expression was induced with 0.4 mM IPTG, after which the cultures were grown at room temperature (RT) for ~20 h. Bacterial pellets were collected by centrifugation at 5,000g for 20 min and lysed with lysis buffer (300 mM NaCl, 10 mM imidazole, 1 mM DTT, 50 mM Tris-HCl, pH 7.4).
Purification of 3CLpro and 3CLpro-C145A was performed by immobilized metal affinity chromatography using a 1 ml HisTrap HP column (Cytiva). A continuous gradient up to 250 mM imidazole eluted the recombinant proteins on an ÄKTAexplorer (Amersham Pharmacia Biotech, now Cytiva). Protein fractions were pooled and dialyzed against assay buffer (150 mM NaCl, 2 mM DTT, 1 mM EDTA, 0.05% Brij 35, 50 mM Tris-HCl, pH 7.2), snap-frozen in liquid N2 and stored at -80°C until use. The activity of the purified protease was confirmed using the specific quenched fluorescence peptide (Ac-Abu-Tle-Leu-Gln-ACC) at 20 µM as described (Rut et al., 2021) and measured at λex 320 nm and λem 460 nm using a POLARstar optima (BMG LABTECH) microplate reader.
HEK-293, BEAS-2B and HAECs were maintained as described. To induce interferon-stimulated gene proteins that may be 3CLpro substrates, the BEAS-2B cells were cultured in DMEM/F12 with 10% (v/v) FBS and treated with 10^4 U/ml carrier-free IFN-α2a or IFN-β1a (PBL Assay Science), or medium (control) for 18 h.
Cells were harvested, and lysates were prepared under native conditions taking the necessary steps to reduce any cellular proteolytic activity. All steps were performed on ice. First, cells were washed twice with phosphate-buffered saline (PBS) and lifted with Versene buffer (0.5 mM EDTA, PBS). The cell pellet was washed twice with 150 mM NaCl, 50 mM Tris-HCl, pH 7.2 by centrifugation at 300g for 10 min. Protease inhibitor cocktail (bimake.com) and 5 mM N-ethylmaleimide (NEM) (Sigma) were added to the cell pellet in hypotonic lysis buffer (2 mM MgCl2, 1 µl/ml Benzonase (Sigma), 50 mM Tris-HCl, pH 7.2). For lysis, cells were pushed through a 27-gauge needle for 10 cycles and rested on ice for 1 h with agitation every 10 min. Cell lysates were flash-frozen in liquid N2 and stored at -80°C.
For 3CL pro substrate profiling, the cell lysates were thawed on ice, ultrasonicated (3 cycles, 20 s, power 3) (Sonic Dismembrator Model 100, Fisher Scientific) and clarified by centrifugation at 400g for 10 min. Buffer exchange to Brij-free 3CL pro assay buffer was performed 3x in a 0.5-ml Amicon Ultra with a 3-kDa cutoff (Millipore). 3CL pro or inactive 3CL pro -C145A (control) at 2.5 µM were incubated with 500 µg native cell protein in their respective 0.5-ml Amicon Ultra filter cartridges, 37°C for 18 h ( Figure 1A, panel a) . Quenched fluorescent peptide assays (Rut et al., 2021) of the samples before and after incubation confirmed 3CL pro activity. The incubated samples were then analyzed by Terminal Amine Isotopic Labeling of Substrates (TAILS) and preTAILS shotgun proteomics (Figure 1 ) using a modified protocol from that described before (Kleifeld et al., 2011) . One volume (~140 µl) of 8% (w/v) SDS, 20 mM DTT, 100 mM HEPES, pH 8.0 was added to the samples in the Amicon filters used for digestion and incubated for 1 h at 37°C. Samples were centrifuged (12,000g, 10 min), washed twice with wash buffer (20 mM HEPES, pH 7.0), and cysteines were alkylated by adding one volume of 40 mM NEM in wash buffer followed by a 30-min incubation at RT. After adding additional 10 mM DTT for 10 min at RT, the samples were concentrated by ultrafiltration to ~100 µl and transferred to Lo-bind Eppendorf tubes. Amicon filters were rinsed 2x with 50 µl of wash buffer and added to the samples (150 µl) for precipitation with methanol/chloroform/H 2 O (4:1:3) (Wessel and Flügge, 1984) . Protein precipitates were collected and resuspended in 50 µl of 4% (w/v) SDS, 50 mM HEPES, pH 6.8. 
All protein N-termini, i.e., neo-N-termini generated by 3CL pro cleavage and natural protein starts, were isotopically labelled at the protein level using 2.5 µl of 1 M heavy [+34 Da] (for 3CL pro ) or light [+28 Da] (for 3CL pro -C145A) formaldehyde and 2.5 µl of 500 mM NaCNBH 3 , for 4 h, 42°C ( Figure 1A, panel b) . Excess formaldehyde was quenched with 5 µl 1 M Tris, pH 6.8 for 1 h. Then, samples were pooled and cleaned by methanol/chloroform/H 2 O (4:1:3) precipitation, resuspended in 200 µl of 20 mM HEPES, pH 8.0. The labelled protein was then digested with MS grade trypsin protease (Thermo Fisher Scientific), 1:50 enzyme:protein (w/w) overnight, 37°C ( Figure 1A, panel c) . For preTAILS, 20 µl of the peptide digest was desalted using C18 StageTips, lyophilized, and stored at -20°C until liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis. The remaining sample was pH-adjusted to 6.5 with HCl. N-terminal peptides were enriched by depleting the tryptic peptides via covalent coupling to our in-house synthesized HPG-ALD 100K polymer (available via UBC Flintbox, bit.ly/3iHPs8P) ( Figure 1A , panel c), 5:1 (w:w; polymer:peptide) in the presence of 30 mM NaCNBH 3 for 4 h, 42°C. N-terminal blocked peptides were retrieved by ultrafiltration in 10-kDa filters by centrifugation, desalted using C18 StageTips, lyophilized, and stored at -20°C until LC-MS/MS ( Figure 1A, panel d) . Data-dependent acquisition was performed using UHPLC (Easy nLC-1000, Thermo-Fisher Scientific) coupled to an Impact II Q-TOF mass spectrometer (Bruker-Daltonics) with a CaptiveSpray ionization interface equipped with a NanoBooster. Peptide samples (1 μg) were injected onto a 75 μm × 300 mm analytical column (packed in house) with ReproSil-Pur C18 1.8 μm stationary phase (Dr. Maisch GmbH). 
Peptides were eluted using a 120-min curved gradient at 250 nl/min from 5% to 24% buffer B (99.9% acetonitrile, 0.1% formic acid), then increased to 34% over 10 min, further increased to 95% buffer B over 5 min and finally held at 95% for 10 min. CaptiveSpray source voltage was set to 1,250 V, the mass spectrometer was operated in positive ion polarity mode, and precursor ions were detected from 150 to 2,250 m/z. MS/MS spectra were acquired using a Top12 selection method with an intensity-adjusted MS/MS summation time (duty cycle 1.3-1.8 s). Acquired precursors were excluded for 14 s before a new acquisition (Compass oTOF control 1.9, Bruker). Samples were measured twice, once using acetonitrile and once using methanol as the dopant in the NanoBooster.
A total of 49 14-mer peptides with the sequence AA(X1-X8)YAYR, with X1-X8 being the P4-P4' sequence of 46 3CLpro cleavage sites identified by TAILS plus 3 cut sites identified by Edman sequencing, were synthesized (GenScript). Peptides (25 µM) were diluted in 3CLpro assay buffer and incubated with 3CLpro (1:20 molar ratio, E:S) in a 25-µl final volume at 37°C in a humidified chamber for 5, 15, 30, 60, 120, and 240 min. At the indicated time points, 0.5 µl of the enzymatic reaction was deposited on a MALDI plate pre-spotted with CHCA matrix. Then, 0.5 µl of CHCA matrix was immediately added, and the plate was air-dried. The samples were desalted by immersing the whole plate in an ice-cold 0.1% formic acid bath. After air-drying, samples were measured in positive ion mode in a MALDI-TOF/TOF 4700 Proteomics Analyzer (Applied Biosystems). MALDI spectra were analyzed using Applied Biosystems Data Explorer, version 4.5.
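As an illustration of the peptide design and of how a time course of this kind can be reduced to a rate constant, the following Python sketch builds a 14-mer from a P4-P4' sequence, lists the two expected cleavage products, and estimates an apparent kcat/KM. It is a sketch under stated assumptions, not the analysis used in the study: it assumes pseudo-first-order decay of the intact peptide, ln(S_t/S_0) = -(kcat/KM)[E]t; it assumes 25 µM is the final peptide concentration, so that a 1:20 E:S ratio corresponds to 1.25 µM enzyme; and it assumes the fraction of intact peptide has already been derived from the peak areas. All function names and the example time course are hypothetical.

```python
import math

N_FLANK = "AA"    # fixed N-terminal residues of the AA(X1-X8)YAYR design
C_FLANK = "YAYR"  # fixed C-terminal residues of the design


def build_peptide(p4_to_p4prime: str) -> str:
    """Return the 14-mer AA(X1-X8)YAYR for an 8-residue P4-P4' sequence."""
    if len(p4_to_p4prime) != 8:
        raise ValueError("P4-P4' sequence must contain exactly 8 residues")
    return N_FLANK + p4_to_p4prime + C_FLANK


def cleavage_products(peptide: str) -> tuple[str, str]:
    """Split the 14-mer at the scissile bond between P1 and P1'."""
    cut = len(N_FLANK) + 4  # P4, P3, P2 and P1 precede the scissile bond
    return peptide[:cut], peptide[cut:]


def apparent_kcat_over_km(times_s, fraction_intact, enzyme_molar):
    """Estimate apparent kcat/KM in M^-1 s^-1 (assumed first-order model).

    Fits -ln(fraction_intact) = k_obs * t by least squares through the
    origin, then divides k_obs by the enzyme concentration.
    """
    y = [-math.log(f) for f in fraction_intact]
    k_obs = sum(t * v for t, v in zip(times_s, y)) / sum(t * t for t in times_s)
    return k_obs / enzyme_molar


if __name__ == "__main__":
    peptide = build_peptide("AALQAVNS")  # example P4-P4' input
    print(peptide, cleavage_products(peptide))
    enzyme = 25e-6 / 20  # assumed: 25 uM peptide at 1:20 E:S gives 1.25 uM
    times = [300, 900, 1800, 3600, 7200, 14400]  # 5 to 240 min, in seconds
    fractions = [math.exp(-1000.0 * enzyme * t) for t in times]  # synthetic
    print(apparent_kcat_over_km(times, fractions, enzyme))
```

Fitting through the origin uses all time points rather than a single half-life estimate; with noisy MALDI peak areas, such a value is better suited to ranking substrates than to reporting absolute constants.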
Estimations of apparent (app) kcat/KM (Starr and Overall, 2009) were done under the assumption of a first-order reaction. The other important assumption is that the peak areas of the substrate and product fragments in MALDI-TOF MS spectra are directly proportional to their relative abundance. As this is not always necessarily true, we limited the scope of the (app) kcat/KM calculations to rank the substrates in bins of four according to degradation rate.
Peptide-protein docking
Peptide-3CLpro molecular docking simulations were performed using the Rosetta FlexPepDock ab-initio protocol (Raveh et al., 2011) implemented within the Rosetta software suite. First, the crystal structure of 3CLpro (PDB: 6XHU) was prepared for docking calculations by running the Rosetta relax application using the flags listed in the Supplementary text. The starting backbone conformations of the peptides (RPAP1: ARLQAMAP; IMA4: AILQNATS; PTBP1: AALQAVNS; RBM15: SRLHSYSS; MAP4K5: SKLMSENT; CREB1: VVVQAASG) were created as preliminary extended structures, truncated at both N- and C-termini, using the BuildPeptide Rosetta application. For each peptide, a fragment library of trimer and pentamer backbones was generated from known structures available in the PDB, based on the target sequence similarity and its predicted secondary structure. Briefly, FlexPepDock ab-initio simulations started from the extended peptide structure placed 15 Å from the 3CLpro active site. A total of 50,000 models were then generated through a fast low-resolution step, in which the side-chains are represented as a single centroid sphere, followed by a high-resolution step that uses a full-atom energy function that enables complete flexibility for all peptide and receptor side-chains (Alford et al., 2017). A flat harmonic function was used to penalize models when the Euclidean distance between Cys 145 Sγ and Cα of P1 exceeds 4 Å. The 500 lowest-scoring models, based on Rosetta total energy, were selected.
The model with the most significant structural similarity within this subset, given by the root-mean-square deviation (RMSD), was chosen as the representative model.
An initial full-length 3D model of galectin-8 was built by comparative modelling using the RosettaCM protocol and PDB 4FQZA as the template structure. Before docking simulations, we generated structural ensembles with backbone conformational variations for both the 3CLpro dimer (PDB 6XHU) and galectin-8 top-ranked full-length models using Normal Mode Analysis, with perturbation steps of 1 Å, through RosettaScripts (Fleishman et al., 2011). The ensembles were used as input structures for the docking simulation between 3CLpro and galectin-8 using the RosettaDock algorithm implemented in the Rosetta macromolecular modelling suite. A constraint was applied to penalize models having the Cys 145 Sγ atom of 3CLpro and Gln 158 Cα of galectin-8 spaced by more than 4 Å. A total of 33,500 models were generated, and the decoy with the greatest structural similarity within the 500 lowest-scoring models was selected as a representative model.
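The selection of a representative model from the lowest-scoring decoys can be sketched as follows. This Python example is an interpretation rather than the authors' procedure: it reads "greatest structural similarity" as the lowest mean pairwise RMSD to the other selected decoys (a medoid), it assumes all models are already in a common coordinate frame so that no superposition is performed, and the data layout (name, score, coordinates) and function names are hypothetical. It does not call Rosetta.

```python
import math


def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points, no superposition."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def representative_model(models, n_lowest=500):
    """Return the name of the medoid among the n_lowest lowest-scoring models.

    models: iterable of (name, score, coords) tuples, where a lower score
    is better and coords is a list of (x, y, z) tuples.
    """
    subset = sorted(models, key=lambda m: m[1])[:n_lowest]
    if not subset:
        raise ValueError("no models supplied")
    if len(subset) == 1:
        return subset[0][0]
    best_name, best_mean = None, math.inf
    for i, (name, _, coords) in enumerate(subset):
        # Mean RMSD of this model to every other model in the subset
        total = sum(rmsd(coords, other[2])
                    for j, other in enumerate(subset) if j != i)
        mean = total / (len(subset) - 1)
        if mean < best_mean:
            best_name, best_mean = name, mean
    return best_name
```

For 500 decoys this evaluates roughly 250,000 pairwise RMSDs, which is manageable in plain Python for short peptides; clustering tools would be a reasonable alternative for larger sets.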
The recombinant proteins assayed were: RNA polymerase II-associated protein 1 (RPAP1), partial 6x His-tagged (1-351 aa, BC000246, Proteintech); polypyrimidine-tract binding protein 1 (PTBP1), 6x His-tagged (1-557 aa, NP_002810.1, Aviva Systems Biology); mitogen-activated protein kinase kinase kinase kinase 5 (MAP4K5), GST/6x His-tagged (1-846 aa, NP_006566.2, Sino Biological); cyclic AMP-responsive element-binding protein 1 (CREB1), 6x His-tagged (1-327 aa, NM_004379, Origene); galectin-8 (LGALS8) (1-317 aa, AAF19370.1, Sino Biological); SARS-CoV-2 Spike S1, 6x His-tagged recombinant protein (16-685 aa, YP_009724390.1, Sino Biological); calcium-binding and coiled-coil domain-containing protein 2 (CALCOCO2/NDP52), GST-tagged recombinant protein (1-446 aa, NP_005822.1, Abnova); importin subunit alpha-4 (IMA4), partial 6x His-tagged (3-220 aa, NP_002258.2, Aviva Systems Biology). Recombinant proteins were incubated with 3CLpro, 3CLpro-C145A, and 3CLpro inhibited with the specific 3CLpro inhibitor GC376 at 1 µM (Vuong et al., 2020), at a molar ratio of 1:5 mol/mol, E:S, in time course assays (0.25, 0.5, 1, 2, 4 and 16 h) at 37°C. Protein cleavage was confirmed by SDS-PAGE followed by Imperial protein staining (Thermo Fisher Scientific). Edman sequencing was used to identify the N-terminal sequence of cleaved proteins using an ABI 494 Protein Sequencer (Tufts University Core Facility, Boston, MA, USA) as previously described. The apparent molecular weights of cleaved protein fragments on SDS gels were calculated using GelAnalyzer version 19.1 (www.gelanalyzer.com by Istvan Lazar Jr., Ph.D. and Istvan Lazar Sr., Ph.D., C.Sc.).
The kinase activity of intact or cleaved (Δ) recombinant MAP4K5 (500 ng) was measured at 1:2 serial dilutions in duplicate using the Universal Kinase Activity kit (R&D Systems, EA004) as per the manufacturer's instructions.
Native pig myelin basic protein (Abcam, ab64311) at 5 mM was the acceptor substrate. The ATP consumption (nmol of phosphate) was measured on a POLARstar optima (BMG LABTECH) at 620 nm. Statistical significance was calculated by comparing the area under the curve with a Student's t-test in Prism version 9.0.0 (121) (GraphPad).
HAEC lysates were incubated with 3CLpro or 3CLpro-C145A as described above. Vero E6 cells were seeded at ~0.1 x 10^6 in 12-well plates and cultured as described above. SARS-CoV-2 was adsorbed to the cells at an MOI of 0.1 for 60 min in Opti-MEM at 37°C, washed with PBS, pH 7.4, and then incubated with complete DMEM for 24 and 48 h (N = 4 for each time point). Cells treated with DMEM alone were considered as controls (N = 3, mock). For immunoblot characterization of PTBP1 cleavage at 24 and 48 hpi with SARS-CoV-2, the cells were washed 3x with PBS before lysis in 1x Halt Protease Inhibitor Cocktail (Thermo Fisher Scientific) and RIPA buffer (150 mM NaCl, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS, 25 mM Tris-HCl pH 7.4). Calu-3 cells were seeded at ~0.7 x 10^6 cells/T-25 cm^2 flask and infected with SARS-CoV-2/SB3 at a multiplicity of infection (MOI) of 0.1 or 1.0, or mock-infected as controls. Cell lysates were collected at 24 and 48 hpi in the presence of Halt Protease and Phosphatase Inhibitor Cocktail (Thermo Fisher Scientific) using 1x lysis buffer (2% SDS (w/v), 10% glycerol (v/v), and 1% β-mercaptoethanol (v/v), 160 mM Tris-HCl, pH 6.8) and boiled for 10 min.
Samples of HAEC, Vero E6 and Calu-3 cell lysates (20 µg) were electrophoresed on 12% or 4-12% gradient NuPAGE Bis-Tris 1.0 mm Mini protein gels at constant 200 V, or 3-8% gradient NuPAGE Tris-Acetate 1.0 mm gels at 150 V (Invitrogen). Proteins were transferred to PVDF membranes (Immobilon-FL, Millipore-Sigma).
After blocking with Intercept (PBS) Protein-free Blocking Buffer (Li-COR) for 1 h membranes were incubated with the primary antibodies listed below in blocking buffer, 0.2% Tween 20 overnight at 4°C. Membranes were then washed 3x with PBST buffer (1x phosphate-buffered saline, 0.1% Tween 20) and incubated with secondary antibodies (listed below) in blocking buffer, 0.2% Tween and 0.01%SDS for 1 h at RT. Membranes were washed 3x with PBST buffer, rinsed with water and imaged on an Odyssey-Classic infrared imager (application software 3.0.30, Li-COR). For densitometric analyses of the immunoblots, we used the Image Studio Software version 5.2.5. Fold change was calculated relative to the corresponding loading control bands, and statistical analyses were performed with Prism version 9.0.0 (121) and one-way ANOVA followed by Dunnett's multiple comparison test (GraphPad). The predicted molecular weight of protein bands was calculated using ProtParam, ExPASy. The primary antibodies and dilutions used were: mouse monoclonal anti-SARS-CoV-2 nucleocapsid antibody (1:1,000, Invitrogen, MA5-29981, RRID: AB_2785780); rabbit anti-SARS-CoV-1 3CL pro antibody (1:2000, Rockland, 200-401-A51, RRID: AB_828457); rabbit polyclonal anti-RPAP1 antibody (1:1,000, Proteintech, 15138-1-AP, RRID: AB_2301137); mouse monoclonal anti-PTBP1 antibody (1:500, Biolegend, 630101, 3H7, RRID: AB_2171285); rabbit polyclonal anti-MAP4K5 antibody (1:1,000, Cusabio, CSB-PA013440DSR2HU, RRID: AB_2892084); rabbit polyclonal anti-CREB1 antibody (1:1,000, Abclonal, A11989, RRID: AB_2758916); rabbit polyclonal anti-YAP1 antibody (1:1,000, Abclonal, A11430, RRID: AB_2758556); rabbit polyclonal anti-FYCO1 antibody (1:1,000, Cusabio, CSB-PA866262LA01HU, RRID: AB_2892085); rabbit polyclonal anti-FAF1 antibody (1:1,000, Abclonal, A2921, RRID: AB_2764739); goat polyclonal anti-Gal8 antibody (1:400, R&D Systems, AF1305, RRID: AB_2137229); rabbit polyclonal anti-KPNA3 (IMA4) antibody (1:1,000, Abclonal, A8347, RRID: 
AB_2770124); rabbit polyclonal anti-NUP107 antibody (1:1,000, Abclonal, A13110, RRID: AB_2759959); mouse monoclonal anti-IRS2 antibody (1:300, R&D Systems, MAB6347, 676415, RRID: AB_10992928); mouse monoclonal anti-FLAG M2 antibody (1:10,000, Sigma, F3165, RRID: AB_259529); mouse monoclonal anti-β-tubulin antibody (1:2000, AbLab, 21-0018-00, clone BT7R); mouse monoclonal anti-β-actin antibody (1:1,000, Abcam, ab8226, RRID: AB_306371); rabbit monoclonal anti-β-actin antibody (1:200, Abcam, ab115777, RRID: AB_10899528). The secondary antibodies and dilutions used were: IRDye 800CW goat anti-rabbit (1:10,000, Li-COR, 926-32211, RRID: AB_621843); Alexa Fluor Plus 800 goat anti-mouse (1:20,000, Invitrogen, A32730, RRID: AB_2633279); Alexa Fluor 680 goat anti-rabbit (1:10,000, Invitrogen, A21109, RRID: AB_2535758); Alexa Fluor 680 goat anti-mouse (1:10,000, Invitrogen, A21057, RRID: AB_2535723); and Alexa Fluor 680 donkey anti-goat (1:10,000, Invitrogen, A21084, RRID: AB_2535741).
Vero E6 cells (0.1 x 10^6) were seeded in 12-well plates containing coverslips and cultured as described above. Vero E6 cells were infected with SARS-CoV-2 as described above (MOI 0.1, N = 5) for 48 h. Cells treated with DMEM alone were considered as controls (N = 5). Immunostaining and confocal image acquisition were performed at Wax-it Histological Services, Vancouver, BC, Canada. Coverslips with the attached cells were washed once with PBS followed by fixation and virus inactivation with 4% paraformaldehyde for 30 min with three subsequent PBS washes. Briefly, coverslips were blocked with Wax-it blocking solution for 1 h. Primary antibodies, mouse monoclonal anti-PTBP1 antibody (1:66, Biolegend, 630101, 3H7, RRID: AB_2171285) and rabbit monoclonal anti-SARS-CoV-2 Spike S1 antibody (1:500, Sino Biological, 40150-R007, RRID: AB_2827979), were incubated overnight at 4°C.
After 3 washes with Wax-it washing solution, the coverslips were incubated with secondary antibodies, goat anti-rabbit Alexa 488 (1:500, Invitrogen, A11034, RRID: AB_2576217) and goat anti-mouse Alexa 546 (1:500, Invitrogen, A11030, RRID: AB_2534089), for 1 h at RT. Coverslips were mounted with ProLong Gold mounting media with DAPI (provided by Wax-it) and imaged by confocal microscopy. One image (63x magnification) per coverslip (mock, N = 5 and infected, N = 5) was acquired.
Whole blood was centrifuged at 1,000g for 5 min to separate red blood cells (RBC) from plasma. After washing with PBS 4x, 50 µl of 4% (v/v) RBCs were mixed in a 96-U-shaped well plate with 50 µl of serial-diluted intact or cleaved (Δ) galectin-8 (50 to 3.1 µg/ml) at RT. 3CLpro at the same concentration used to cleave galectin-8 (0.26 µM) was used as a control. The plate was incubated for 1 h at RT and photographed.
Jurkat T cells were seeded in an 8-well chamber slide (Lab-Tek II, 154534, NalgeNunc) at 2 x 10^5 cells/well in 150 µl RPMI serum-free medium for 2 h at 37°C in 5% CO2. Intact or cleaved (Δ) galectin-8 at 0.4 µM (n = 2) was added in RPMI serum-free medium for 1 h at 37°C. 3CLpro at the same concentration used to generate ΔGal8 (0.08 µM) and RPMI serum-free medium (n = 2) were used as controls. Non-adherent cells were collected, washed twice with PBS and counted using a Scepter handheld automated cell counter (Millipore-Sigma) with a 60-µm sensor. Adherent cells were washed with PBS and fixed with 4% formaldehyde in PBS at RT for 10 min. Then, the cells were washed with PBS and permeabilized with 0.1% Triton X-100 for 5 min at RT. Staining of F-actin and nuclei was performed in 200 µl PBS with 5 µl Alexa Fluor 594 phalloidin (Invitrogen, A12381, RRID: AB_2315633) and DAPI (1:1,000, Invitrogen, D3571) for 30 min at RT in the dark. The slide was washed twice with PBS and mounted with ProLong Gold antifade mounting media (Invitrogen, P36930).
Images were acquired with 20x and 100x objective lenses on a Leica DMRA2 microscope (Leica) from 2 randomly selected fields of view. Adherent cells were counted using 20x images via ImageJ 1.53c. Statistical analyses were performed with Prism version 9.0.0.121 using one-way ANOVA followed by Tukey's multiple comparisons test (GraphPad).
Galectin-8 binding to SARS-CoV-2 Spike S1 glycoprotein was assessed by a sandwich ELISA (n = 2). Recombinant galectin-8 (1-317 aa, AAF19370.1, Sino Biological) or recombinant SARS-CoV-2 Spike S1-6x His-tagged protein (16-685 aa, YP_009724390.1, Sino Biological) at 5 µg/ml were coated on ELISA plate wells at 4°C, overnight. Blocking was with 1% β-casein (Sigma) for 1 h at RT. Next, 0.2 µM SARS-CoV-2 Spike S1 glycoprotein or 0.2 µM galectin-8 were added to the galectin-8 or SARS-CoV-2 Spike S1 glycoprotein-coated wells, respectively, for 2 h at RT. To detect the bound protein, mouse monoclonal anti-His-tag antibody (1:1,000, Cedarlane Labs, CLH101AP) (for Spike S1 glycoprotein) or goat polyclonal anti-Gal8 antibody (1:200, R&D Systems, AF1305, RRID: AB_2137229) and rabbit polyclonal anti-C-Gal8 antibody (1:500, Thermo Fisher Scientific, PA5-19729, RRID: AB_10984508) (for galectin-8) were added as appropriate and incubated for 1 h at RT. Detection was either by goat anti-mouse IgG (H+L)-HRP conjugated (1:1,000, Bio-Rad, 170-6516, RRID: AB_11125547), rabbit anti-goat IgG (H+L)-HRP conjugated (1:1,000, Bio-Rad, 172-1034, RRID: AB_11125144) or goat anti-rabbit IgG (H+L)-HRP conjugated (1:1,000, Bio-Rad, 172-1019, RRID: AB_11125143) for 1 h at RT. The colorimetric assay was developed using the peroxidase substrate o-phenylenediamine dihydrochloride tablets for 20 min at RT. Four washes with PBST buffer followed every step of the binding assay. The colorimetric signal at OD 450 nm was measured on a SpectraMax 384 Plus (Molecular Devices) spectrophotometer plate reader.
Immobilized SARS-CoV-2 Spike S1 glycoprotein binding to 1 µM of galectin-8 was repeated in the presence of 20 mM of the competitive inhibitor thiodigalactoside (TDG, Sigma, SML2310) or ΔGal8 at 0.5 µM (1:2 serial dilutions) (n = 2). For galectin-8 detection, two different antibodies were used (n = 2 with anti-Gal8 antibody and n = 3 with anti-C-Gal8 antibody). Binding of immobilized CALCOCO2/NDP52 (5 µg/ml) to 0.3 µM of intact Gal8, intact Gal8 pre-incubated with TDG, or cleaved ΔGal8 (n = 3) was studied by ELISA as described above. Rabbit polyclonal anti-C-Gal8 antibody (1:500, Thermo Fisher Scientific, PA5-19729, RRID: AB_10984508) followed by goat anti-rabbit IgG (H+L)-HRP conjugated (1:1,000, Bio-Rad, 172-1019, RRID: AB_11125143) detected the bound galectin-8. Finally, the trimeric interaction between galectin-8, SARS-CoV-2 Spike S1 glycoprotein and CALCOCO2/NDP52 was studied (n = 2). Recombinant SARS-CoV-2 Spike S1 glycoprotein was coated at 5 µg/ml followed by incubation with recombinant intact or ΔGal8 at 1 µM (1:2 serial dilutions) for 2 h at RT. After 3 PBS washes, 0.2 µM of recombinant CALCOCO2/NDP52 was added and incubated for 2 h at RT. Bound CALCOCO2/NDP52, as the third member of the complex, was detected with anti-CALCOCO2/NDP52 antibody. Statistical analyses were performed with Prism version 9.0.0.121 (GraphPad). Student's t-test was used to assess statistical significance between two groups, and two-way ANOVA followed by Šídák's multiple comparisons test was used when the effect of two variables across different groups was analysed.
HeLa cells were co-transfected with 1 µg of GFP-NDP52 (Dr. Richard Youle at the National Institute of Neurological Disorders and Stroke, USA) and WT-Gal8-FLAG or C-Gal8 (159-317)-FLAG plasmid constructs, which were engineered and synthesized from the galectin-8 ORF clone OHu23472 (GenScript), and were cultured as described above.
HeLa cells were harvested in FLAG lysis buffer (150 mM NaCl, 1.0% Triton-X-100, 50 mM Tris-HCl pH 7.4, 1 mM EDTA and proteinase inhibitor cocktail). Protein complexes were immunoprecipitated for 16 h with Anti-FLAG M2 Affinity Gel (Sigma, A2220) before washing 3 times. Samples were eluted with 2x SDS sample buffer, separated on a 12% polyacrylamide gel and transferred to nitrocellulose membrane for western blotting as described above. The following primary antibodies were used: mouse monoclonal anti-FLAG M2 antibody (1:1,000, Sigma, F3165, RRID: AB_259529), mouse monoclonal anti-CALCOCO2/NDP52 (1:1,000, Santa Cruz Biotechnology, sc-376540, F-6, RRID: AB_11150487), rabbit monoclonal anti-GFP (D5.1) (1:1,000, Cell Signaling Technology, 2956, RRID: AB_1196615) and rabbit monoclonal anti-Gal8 (EPR4857) (1:1,000, Abcam, ab109519, RRID: AB_10861755). Secondary antibodies used: goat anti-mouse IgG (H+L)-HRP conjugated (1:5,000, Thermo Fisher Scientific, 31430, RRID: AB_228307) and goat anti-rabbit IgG (H+L)-HRP conjugated (1:3,000, Cell Signaling Technology, 7074, RRID: AB_2099233).
Sterile damage to cell vesicles in the puncta assay was performed as previously described (Thurston et al., 2012). Briefly, HEK-293 cells were seeded in an 8-well chambered coverglass (Thermo Fisher Scientific, 155411), incubated in DMEM 10% FBS for 16 h and transfected with either FLAG-tagged galectin-8 or 3CLpro-cleavage analogues of galectin-8 using Lipofectamine 2000 (Invitrogen, 11668019). The FLAG-tagged N-Gal8 (1-158) and C-Gal8 (159-317)-FLAG-tag plasmid constructs were engineered and synthesized from the galectin-8 ORF clone OHu23472 (GenScript). After recovery for 24 h in culture medium, the cells were exposed for 10 min to hypertonic medium (0.5 M sucrose (Calbiochem, 8510) and 15% polyethylene glycol (Sigma, P-3265) in PBS (Sigma, D8537)). Cells were rinsed twice with PBS and incubated in 60% PBS for 3 min followed by a 20-min recovery period in complete DMEM +10% FBS.
The assay was terminated by fixing the osmotically-shocked cells for 15 min in 4% methanol-free paraformaldehyde (Thermo Fisher Scientific, 28909). Cells were rinsed with 100 mM glycine/PBS solution for 15 min and subsequently permeabilized with a 3-min incubation in 0.1% Triton X-100. The fixed and permeabilized cells were blocked for 1 h with 3% bovine serum albumin (Sigma, A7030), followed by an overnight 4°C incubation with primary rabbit monoclonal anti-FLAG (1:1,000, Cell Signaling Technology, 14793S, RRID: AB_2572291) and mouse monoclonal anti-CALCOCO2/NDP52 (1:1,000, Santa Cruz Biotechnology, sc-376540, F-6, RRID: AB_11150487). After 15-min washes with PBS, cells were incubated with fluorescent secondary antibodies, Alexa Fluor Plus 488 goat anti-rabbit (1:1,000, Invitrogen, A32731, RRID: AB_2633280) and Alexa Fluor Plus 647 goat anti-mouse (1:1,000, Invitrogen, A32728, RRID: AB_2633277) for 1 h. After 15-min final washes, the coverslips were mounted using Fluoroshield with DAPI (Sigma-Aldrich, F6057). Confocal images were captured with a 63x objective lens (Zeiss LSM 880 Inverted Confocal Microscope) from 5 randomly selected fields (n > 30 cells), and the percentage of cells positive for CALCOCO2/NDP52/Gal8 puncta was manually quantified. Statistical analyses were performed with Prism version 9.0.0.121 and one-way ANOVA followed by Tukey's multiple comparison test (GraphPad).
Normal and COVID-19 lung tissue samples were formalin-fixed and paraffin-embedded, followed by generation of tissue microarrays (TMAs). TMAs containing 1-mm cores were generated essentially as previously described (Kampf et al., 2012; Uhlén et al., 2015), using a TMArrayer (Pathology Devices) and the Beecher Instruments Manual Tissue Arrayer MTA-1 (Estigen OÜ).
One core each from two of the normal lung samples (F54 and M45) was included in the TMA with the COVID-19 lung samples, thus serving as controls to confirm that there were no staining reproducibility issues between sections. The M15 sample was kept as a full block for staining as a large section, ensuring no regional difference in the staining pattern. From each of the COVID-19 lung samples, two representative cores of ten different lung areas were sampled, i.e. in total, n = 20 lung TMA cores for each COVID-19 patient. The two cores from each area represented different regions of the corresponding tissue blocks. If available, regions with different tissue morphology were selected, e.g. areas heavily affected by the disease versus areas with more normal histology. The TMA block and the M15 lung-tissue block were cut in 4-μm thick sections using waterfall microtomes (Thermo Fisher Scientific, Microm HM 355S), dried at RT overnight and baked at 50°C for 12-24 h before multiplex fluorescence immunohistochemistry. The sections were deparaffinized in xylene, hydrated in graded alcohols and blocked for endogenous peroxidase in 30% hydrogen peroxide diluted 1:100 in 95% ethanol, final concentration 0.3%. For antigen retrieval, a Decloaking chamber (Biocare Medical) was used. Slides were immersed and boiled in Antigen Retrieval Buffer (PT Module Buffer 1, 100x Citrate Buffer, pH 6, TA-250-PM1X, Thermo Fisher Scientific) for 4 min at 125°C and then allowed to cool to 90°C. For antibody validation, all antibodies were first tested with brightfield immunohistochemistry (IHC) on a test TMA containing 20 different normal tissue types. Staining intensity across the tested tissues was compared with mRNA expression levels, in line with the orthogonal approach following guidelines of the International Working Group of Antibody Validation (IWGAV) (Sivertsson et al., 2020; Uhlen et al., 2016).
IHC was performed essentially as previously described in detail (Kampf et al., 2012). After evaluating the IHC results and determining the optimal antibody dilution, multiplex fluorescence IHC was performed based on a 3-plex Opal strategy. Thus, antibodies were added one at a time at RT, and insoluble Opal reagents connected to different fluorophores were used for visualization, followed by heating and inactivation of the previous antibody for each staining cycle. Primary antibodies specific for galectin-8 (1:15, Atlas Antibodies AB, HPA030491, RRID: AB_10602345), CALCOCO2/NDP52 (1:400, Atlas Antibodies AB, HPA022989, RRID: AB_1845914), and SARS-CoV-2 Spike S1 glycoprotein (1:500, Sino Biological, 40150-R007, RRID: AB_2827979) were diluted in UltraAb Diluent (Thermo Fisher Scientific). The primary antibodies and secondary HRP polymer were incubated for 30 min each, followed by 10-min development with the Opal FP1500001KT reagents Opal 650 (Cy5/magenta, Spike S1 glycoprotein, first staining cycle), Opal 520 (FITC/green, galectin-8, second staining cycle) and Opal 570 (Cy3/red, NDP52, third staining cycle). All incubations were followed by rinsing in wash buffer (Thermo Fisher Scientific). Inactivation between each cycle was performed in Antigen Retrieval Buffer (Thermo Fisher Scientific) by heating the slides to 90°C for 20 min using a Decloaking chamber and then allowing them to cool slowly to 80°C. Slides were incubated with DAPI (1:1,000, Invitrogen) for 5 min and mounted using ProLong Glass Antifade Mounting Media (Life Technologies, 2157948). Slides from both normal and COVID-19 lung tissue samples were stained simultaneously to avoid bias between runs. Digital fluorescent images were obtained using a Zeiss Axio Scan.Z1 System equipped with a Zeiss Colibri 7, Type RGB-UV fluorescence light source.
Exposure times and visualization parameters were set for normal lung using cell types with known positivity based on the initial IHC results and adjusted for each channel to obtain distinct signals with minimal autofluorescence. All parameters were kept consistent between the M15 normal lung large section and the TMA. H&E images of the same sections were obtained after the fluorescence image acquisition by removing the coverslips, immersing the slides in Antigen Retrieval Buffer (Thermo Fisher Scientific) at 40°C overnight, and staining with hematoxylin (Mayers Htx Plus, Histolab 01825) and eosin (Bio-Optica, 05-10003/L). The H&E slides were coverslipped using PERTEX (Histolab) as mounting medium and scanned with an Aperio AT2 slide scanner (Aperio) using a 40x objective.
Data were analyzed within Prism version 9.0.0.121 (GraphPad Software Inc., San Diego, CA). The specific statistical tests used for each experiment are detailed in the figure legends and the method details section above. All N (independent biological experiments) and n (intra-experimental independent replicates) values are reported in the results for the data presented.
All MS/MS data were analyzed using Byonic (Protein Metrics, San Carlos, CA USA; version PMI-Byonic-Com:v3.8.13). Byonic was set to search the uniprot_human database (UP000005640_9606) that included the 3CLpro constructs we expressed and common contaminants. An initial limited search was performed using Preview (v3.8.13) to determine m/z errors and derive recalibration parameters for precursor and fragment ions. The main search parameters were: semi-specific N-ragged ArgC, maximum of 2 missed cleavages; mass tolerance was set to 20 ppm for precursor and fragment ions; fragmentation type, QTOF/HCD; and precursors and fragments were recalibrated from Preview.
NEM (+125.0477 Da) at Cys and dimethyl light (+28.0313 Da) at Lys were set as fixed modifications; heavy-labelled lysine was set as a +6.0318 Da variable modification over the dimethyl-light. Peptide N-terminal dimethyl light (+28.0313 Da), dimethyl heavy (+34.0631 Da), pyroglutamic acid (-17.0265 Da), Met oxidation (+15.9949 Da), and acetylation (+42.0106 Da) of protein N-terminus were set as variable modifications. The peptide score cut off was set to automatic, and the protein score cut off was set to a 1% False Discovery Rate (FDR) or 20 hits in the reverse database, whichever was reached last. Scaffold (version Scaffold_4.11.0, Proteome Software Inc., Portland, OR, USA) validated the MS/MS-based peptide and protein identifications. Peptide identifications were accepted if established at greater than 99.0% probability to achieve an FDR < 1.0% by the Scaffold Local FDR algorithm. Protein identifications were accepted if they could be established at greater than 99.0% probability to achieve an FDR < 1.0%. Protein probabilities were assigned by the Protein Prophet algorithm (Nesvizhskii et al., 2003). Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony. N-termini were annotated using our in-house program TopFinder 4.1 (https://topfind.clip.msl.ubc.ca). Retention time alignment and MS1-level quantification of all identified peptides were performed via Skyline (v 20.1.0.155). Only quantitative values with an idotp ≥ 0.85 were considered. Fold changes between heavy and light forms of the peptide were obtained by dividing their respective MS1 peak areas. For singleton heavy peptides, the peak area was considered as fold-change. The inverse of the MS1 area was taken as the fold-change for a singleton light.
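The heavy/light fold-change rules stated above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `passes_idotp` and `heavy_light_fold_change` are hypothetical, peak areas are assumed to have been exported from Skyline already, and a form that was not quantified is represented as `None`. It is not the authors' script.

```python
from typing import Optional

IDOTP_MIN = 0.85  # threshold stated in the text


def passes_idotp(idotp: float) -> bool:
    """Keep only quantitative values with idotp >= 0.85."""
    return idotp >= IDOTP_MIN


def heavy_light_fold_change(heavy_area: Optional[float],
                            light_area: Optional[float]) -> Optional[float]:
    """Hypothetical helper: heavy/light fold-change from MS1 peak areas.

    None means the form was not quantified (assumed representation).
    """
    if heavy_area is not None and light_area is not None:
        return heavy_area / light_area   # both forms: ratio of MS1 areas
    if heavy_area is not None:
        return heavy_area                # singleton heavy: area taken as fold-change
    if light_area is not None:
        return 1.0 / light_area          # singleton light: inverse of the area
    return None                          # nothing quantified
```

For example, `heavy_light_fold_change(200.0, 50.0)` gives a ratio of 4.0, whereas a singleton heavy peptide simply returns its own peak area.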
To interrogate whether 3CLpro cleaves substrates regulated by type I interferons (IFN-α and IFN-β), we compared the relative protein abundance in type I interferon-stimulated BEAS-2B cells (N = 6) versus unstimulated control cells (N = 3). To do so, we used the MS1 intensity of the respective preTAILS runs acquired with methanol in the nanoBooster. MS1 quantification and statistical analysis were performed using the default settings in the MSstats tool integrated into Skyline. Statistical significance was determined by multiple sample t-test, and adjusted p-values were obtained using Benjamini-Hochberg correction. To validate the treatment strategy, we compared the 49 proteins upregulated by IFN-α and IFN-β with known ISGs and found all were previously reported, including STAT-1, IFIT2, IFIT3, OAS, MX1, amongst others.
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (Perez-Riverol et al., 2019) partner repository with the dataset identifiers PXD026797 and 10.6019/PXD026797 for HEK-293, and PXD026815 and 10.6019/PXD026815 for BEAS-2B.
In our experimental design, the neo-N-termini generated from 3CLpro activity must necessarily and exclusively be labelled with dimethyl heavy (+34 Da). Therefore, we considered as possible 3CLpro substrates only singleton heavy peptides that were identified in ≥ 2/3 HEK-293 or ≥ 7/9 BEAS-2B independent biological experiments. A poor-quality or false identification of a light form of the peptide across the whole experiment was sufficient to disqualify the protein as a substrate. In addition, we excluded all peptides that could be explained from residual labelling of tryptic peptides, Met1 removal, or protein N-terminal ragging.
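The replicate criterion for candidate substrates described above can be written as a small filter. The following is a minimal sketch, assuming each peptide is summarised by the set of experiment indices in which its heavy form was identified and by a flag recording whether any light form was seen; the names `MIN_EXPERIMENTS` and `is_candidate_neo_n_terminus` are hypothetical. The further exclusions (residual tryptic labelling, Met1 removal, N-terminal ragging) are not modelled here.

```python
from typing import Dict, Set

# Minimum number of independent biological experiments, as stated in the text:
# >= 2 of 3 for HEK-293 and >= 7 of 9 for BEAS-2B.
MIN_EXPERIMENTS: Dict[str, int] = {"HEK-293": 2, "BEAS-2B": 7}


def is_candidate_neo_n_terminus(heavy_experiments: Dict[str, Set[int]],
                                light_seen: bool) -> bool:
    """Hypothetical filter for singleton heavy peptides.

    heavy_experiments maps a cell line to the experiment indices in which
    the heavy-labelled peptide was identified (assumed data layout).
    light_seen is True if a light form was identified anywhere in the
    experiment, which disqualifies the peptide.
    """
    if light_seen:
        return False
    return any(
        len(heavy_experiments.get(cell_line, set())) >= minimum
        for cell_line, minimum in MIN_EXPERIMENTS.items()
    )
```

A peptide seen as heavy in two HEK-293 experiments and never as light passes, while one seen in only six BEAS-2B experiments does not.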
To further increase the confidence of the 3CLpro substrates, a score was derived (using a custom script) that compared the sequence of each identified cleavage site to the normalized relative frequency of amino acids in positions P4-P4' of all cleavage sites meeting all the criteria described above. To define the confidently identified cleavage sites generated by 3CLpro, the score of the 90th percentile of non-confidently identified cleavage sites was used as the minimum cutoff. Finally, all MS/MS spectra of the winnowed neo-N-terminal peptides were manually inspected, and any displaying poor fragmentation, noise, or ragged termini were discarded and not considered further. The iceLogos presented were generated using iceLogo (https://iomics.ugent.be/icelogoserver/) (Colaert et al., 2009).
Peptide-protein docking
The following flags and command lines were used in the peptide-protein docking simulations:
Prepacking the complex to remove potential internal clashes and guarantee a uniform conformational background in non-interface regions.
$ROSETTA_BIN/FlexPepDocking.mpi.linuxgccrelease @prepack_flags
prepack_flags:
-s 3CL_YAP1_a.pdb
-ex1
-ex2aro
-database $ROSETTA_DB_Path
-scorefile prepack.score.sc
-flexpep_score_only
-flexpep_prepack
-nstruct 1
-out:path:pdb output
-out:path:score output
-use_truncated_termini
-flexPepDocking:receptor_chain A
-flexPepDocking:peptide_chain D
Model generation beginning from the prepacked structure.
$ROSETTA_BIN/FlexPepDocking.mpi.linuxgccrelease @abinitio_flags
abinitio_flags:
-s input/3CL_YAP1_ppk.pdb
-lowres_abinitio
-pep_refine
-flexpep_score_only
-ex1
-ex2aro
-use_truncated_termini
-frag3 input/frags/frags.3mers.offset
-flexPepDocking:frag5 input/frags/frags.5mers.offset
-flexPepDocking:frag5_weight 0.25
-constraints:cst_weight 2
-constraints:cst_fa_file input/constraint_file
-constraints:cst_file input/constraints_file
-constraints:cst_fa_weight 2
-score:weights ref2015_cst
-out:path:pdb output
-out:file:silent output/SARS_mono_Isa_peptide1_silent.out
-out:file:scorefile output/score_mono_Isa_peptide1.sc
-nstruct 50000
-flexPepDocking:receptor_chain A
-flexPepDocking:peptide_chain D
Given below is the constraint file content and the flat harmonic function used to favor models where the Euclidean distance between Ser145 Sγ and Cα of P1 is less than 4 Å.
where rosetta.xml is: