key: cord-0945573-xlfjg5lf authors: Ayyappan, Vinay; Wat, Ricky; Barber, Calvin; Vivelo, Christina A; Gauch, Kathryn; Visanpattanasin, Pat; Cook, Garth; Sazeides, Christos; Leung, Anthony K L title: ADPriboDB 2.0: an updated database of ADP-ribosylated proteins date: 2020-11-02 journal: Nucleic Acids Res DOI: 10.1093/nar/gkaa941 sha: ee493d7dac74d67dacc9774cff7a604ad499bbbe doc_id: 945573 cord_uid: xlfjg5lf ADP-ribosylation is a protein modification responsible for biological processes such as DNA repair, RNA regulation, cell cycle and biomolecular condensate formation. Dysregulation of ADP-ribosylation is implicated in cancer, neurodegeneration and viral infection. We developed ADPriboDB (adpribodb.leunglab.org) to facilitate studies in uncovering insights into the mechanisms and biological significance of ADP-ribosylation. ADPriboDB 2.0 serves as a one-stop repository comprising 48 346 entries and 9097 ADP-ribosylated proteins, of which 6708 were newly identified since the original database release. In this updated version, we provide information regarding the sites of ADP-ribosylation in 32 946 entries. The wealth of information allows us to interrogate existing databases or newly available data. For example, we found that ADP-ribosylated substrates are significantly associated with the recently identified human protein interaction networks associated with SARS-CoV-2, which encodes a conserved protein domain called macrodomain that binds and removes ADP-ribosylation. In addition, we create a new interactive tool to visualize the local context of ADP-ribosylation, such as structural and functional features as well as other post-translational modifications (e.g. phosphorylation, methylation and ubiquitination). This information provides opportunities to explore the biology of ADP-ribosylation and generate new hypotheses for experimental testing. 
ADP-ribosylation is a reversible post-translational modification defined by the addition of one [mono(ADP-ribosyl)ation] or multiple [poly(ADP-ribosyl)ation] ADP-ribose moieties onto amino acid side chains with nucleophilic oxygen, nitrogen or sulfur (1, 2). ADP-ribose groups can be transferred by enzymatically active members of the family of 17 ADP-ribosyltransferases (commonly known as PARPs) as well as other enzymes, such as bacterial toxins and sirtuins (3-5). ADP-ribosylation has been implicated in fundamental biological processes (e.g. DNA damage repair, gene regulation and cell signaling) (6-8) as well as disease-related processes (e.g. microbial pathogenesis, carcinogenesis and inflammation) (9-11). While ADP-ribosylation was discovered in 1963 (12), studies of ADP-ribosylation have recently begun to benefit from the rapid development in proteomics technologies that identify ADP-ribosylated sites and characterize the substrate specificity of PARPs and other enzymes associated with ADP-ribosylation (13). In light of these developments, we created ADPriboDB in 2015 (14) to provide a one-stop informatics portal about ADP-ribosylated substrates across the proteomes from different species, similar to databases like dbPTM (15, 16) and PhosphoSite (17) that provide information about other post-translational modifications such as ubiquitination and phosphorylation. In addition, ADPriboDB provides data entries pertaining to PARP inhibitor effects on the ADP-ribosylated proteome, aiming to bridge basic scientific knowledge about ADP-ribosylation with information that physicians can use to garner insights on clinical benefits and side effects. Since its creation, the database has received over 702 500 hits and 13 300 unique visitors.
* To whom correspondence should be addressed.
Tel: +1 410 502 8939; Fax: +1 410 955 2926; Email: anthony.leung@jhu.edu † The authors wish it to be known that, in their opinion, the second and third authors should be regarded as equal contribution. © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
ADPriboDB 2.0 includes an expanded library of data, newly added site information, and an improved user interface (Table 1). ADPriboDB 2.0 also integrates other post-translational modification sites and features a web interface that allows users to visualize protein domains and other features at and proximal to the modification sites.
As in ADPriboDB, published papers were curated from PubMed, accessed via the following search terms: 'PARylation' or 'PARsylation' or 'poly(ADP-ribosyl)ation' or 'poly-adp-ribosylation' or 'poly(ADP-ribosylation)' or 'PARylated' or 'PARsylated' or 'poly(ADP)ribosylation' or 'MARylation' or 'MARsylation' or 'mono(ADPribosyl)ation' or 'mono-adp-ribosylation' or 'mono(ADPribosylation)' or 'MARylated' or 'MARsylated' or 'mono(ADP)ribosylated.' Search results for these terms were restricted to July 2015 through May 2019, because the initial release covered January 1975 through June 2015 (14). The literature search for ADPriboDB 2.0 also included the terms 'ADPr,' 'ADP-ribose' and 'ADPribosylation', searched from January 1975 through May 2019. Information regarding protein identifiers as well as experimental and modification details was collected and assessed for inclusion in ADPriboDB by two independent curators.
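The literature search described above can be sketched as a pair of boolean queries. This is a minimal illustration only: the search terms and date windows are taken from the text, while the function name `build_query`, the `[Date - Publication]` field tag and the `YYYY/MM` date format are assumptions based on common PubMed conventions, not the curators' actual query strings.

```python
# Search terms are quoted from the text; the function name, field tag and
# date format are illustrative assumptions, not the curators' exact query.
UPDATE_TERMS = [
    "PARylation", "PARsylation", "poly(ADP-ribosyl)ation",
    "poly-adp-ribosylation", "poly(ADP-ribosylation)", "PARylated",
    "PARsylated", "poly(ADP)ribosylation", "MARylation", "MARsylation",
    "mono(ADPribosyl)ation", "mono-adp-ribosylation",
    "mono(ADPribosylation)", "MARylated", "MARsylated",
    "mono(ADP)ribosylated",
]
ADDED_TERMS = ["ADPr", "ADP-ribose", "ADPribosylation"]


def build_query(terms, start, end):
    """Join quoted terms with OR and restrict to a publication-date range."""
    term_clause = " OR ".join(f'"{term}"' for term in terms)
    date_clause = (
        f'("{start}"[Date - Publication] : "{end}"[Date - Publication])'
    )
    return f"({term_clause}) AND {date_clause}"


# Original terms: only the window not covered by the initial release.
update_query = build_query(UPDATE_TERMS, "2015/07", "2019/05")
# Newly added terms: the full window, since they were never searched before.
added_query = build_query(ADDED_TERMS, "1975/01", "2019/05")
```

Keeping the two term lists separate mirrors the two date windows in the text: the original terms only need the period after the first release, whereas the added terms have to be searched over the whole period.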
Since 2010, research regarding ADP-ribosylation has accelerated, and the identification of ADP-ribosylated substrates in particular has risen (Figure 1A). In light of these trends, the latest iteration of ADPriboDB comprises 48 346 entries, including 9097 unique proteins across 610 papers, more than tripling the size of the initial release of the database. The database now includes data from 41 species, up from 28 species. ADPriboDB 2.0 has more than doubled the proportion of its entries bearing information regarding the enzyme responsible for the ADP-ribosylation. Reflecting a greater availability of technology (e.g. analog-sensitive PARP approaches), 65% of database entries include information on enzyme-substrate specificity (Table 2).
To enhance the speed with which queried database pages are loaded, we have used the open-source PHP framework Xataface to compress and save all requested pages to a cache. These caches are generated for each individual user. This feature allows frequently visited pages to be served from the cache, thereby speeding up information loading and display.
In ADPriboDB 2.0, a growing number of entries were derived from proteomics studies, which account for ∼8000 unique proteins (Figure 1B). Approximately 75% of ADP-ribosylated proteins were identified in at least two publications (Figure 1C). Analysis of these independently verified ADP-ribosylated substrates via EnrichR (18) revealed a significant enrichment of gene sets associated with diverse biological processes, including DNA metabolism, RNA processing, protein targeting and viral-related processes (Figure 1D and Supplementary Data S1). This expanded database therefore provides greater flexibility and depth to analyses of ADP-ribosylation, its function and its clinical implications.
As an example, we used ADPriboDB to explore the biology of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
SARS-CoV-2 encodes a protein domain (called macrodomain) that is conserved in all coronaviruses (19, 20), and this macrodomain binds and hydrolyzes ADP-ribose from substrates (21, 22), as in other viral macrodomains. Given the conserved nature of the macrodomain and the critical importance of its enzymatic activity in the virulence of other coronaviruses (19, 20), ADP-ribosylation is likely highly regulated in SARS-CoV-2 infection. Consistent with this hypothesis, human protein interactors of SARS-CoV-2 viral proteins (23) are statistically enriched with ADP-ribosylated substrates (P = 0.0096, Chi-square test with continuity correction) (Supplementary Data S2). No statistically significant association was observed between SARS-CoV-2 protein interactors and the methylated proteome (P = 0.1977, Chi-square test with continuity correction). Notably, this statistical enrichment with ADP-ribosylated substrates was present even though the SARS-CoV-2 protein interaction map did not include the association with nonstructural protein 3 (nsP3), which possesses the macrodomain (23). These ADP-ribosylated substrates associated with SARS-CoV-2 proteins were enriched with gene ontologies, including transfer RNA and ribosomal RNA regulation (Figure 1E and Supplementary Data S2).
For the last 6 years, evolving proteomics techniques have allowed the identification of sites of ADP-ribosylation for functional analyses (13, 24). During curation of database entries, information regarding modified sites and/or peptide sequences identified by mass spectrometry was included when available. Sequence information was then aligned to protein sequences deposited in the UniProt database (25). Currently, ∼67% of ADPriboDB 2.0 entries include information regarding ADP-ribosylation sites, with a total of 14 839 unique sites on a range of amino acids (Figure 1F). For each database entry, users can view the site and sequence information provided by the publication.
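The enrichment comparison reported for the SARS-CoV-2 interactors uses a Chi-square test with continuity correction. The sketch below shows one common form of that test for a 2x2 table (Yates' correction, one degree of freedom). It is an assumption that the analysis was arranged as a 2x2 table of this kind; the function name and the counts in the usage lines are hypothetical placeholders, not the values behind the reported P values.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square test with Yates' continuity correction for a 2x2 table.

    Table layout (an assumed arrangement, for illustration):
                        ADP-ribosylated   not ADP-ribosylated
        interactor             a                  b
        non-interactor         c                  d

    Returns (chi2, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be non-zero")
    # Yates' correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2.0, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2/2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical placeholder counts, NOT the data from the paper.
chi2, p_value = yates_chi_square(10, 20, 30, 40)
print(f"chi2 = {chi2:.4f}, P = {p_value:.4f}")
```

The closed-form P value avoids an external statistics dependency; a library routine for contingency tables with the correction enabled should give the same statistic for a 2x2 table.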
Users can also find out whether the same sites were identified in other publications or whether there are other ADP-ribosylation sites from the same proteins. The new site information enables users to identify potential crosstalk between ADP-ribosylation and other post-translational modifications. ADPriboDB 2.0 currently features sites of phosphorylation, methylation and ubiquitination, sourced from dbPTM (15, 16). To facilitate correlation analyses, ADPriboDB 2.0 includes a new Shiny-based tool to visualize the local structural and functional context of ADP-ribosylation sites, such as other post-translational modification sites [e.g. phosphorylation, as has been studied (26-28)], protein domains, as well as regions of protein-protein interaction. Vertical black lines are drawn to denote the location of ADP-ribosylation sites. For example, a high frequency of Axin1 ADP-ribosylation was found to localize around the motif that binds tankyrases, i.e. PARP5a and PARP5b (Figure 2A). To help identify ADP-ribosylation sites of high confidence, the number of publications that identified a given site is indicated (Figure 2B). In addition, users can toggle zoom settings to visualize crowded regions at a single amino acid level (Figure 2C). For example, S195 in NPM1 within the nuclear localization signal was identified four times, and this site can also be phosphorylated (Figure 2C). Taken together, these additional features enable users to prioritize sites for further mechanistic analyses.
ADPriboDB 2.0 was updated in response to the growing volume of data regarding novel substrates and sites of ADP-ribosylation. We anticipate that the wealth of site information will stimulate new mechanistic studies, especially with the advent of new chemical approaches to generate model ADP-ribosylated peptides and methods to label ADP-ribose (29, 30).
Given that the database now includes a considerable amount of site and sequence information, we hope the database will encourage the development of novel algorithms for motif finding, site prediction and correlation analyses with various post-translational modifications (31, 32). With the development of chemical inhibitors and genetic ablation technologies (such as CRISPR) that can target specific PARPs, we anticipate future generations of ADPriboDB will include more information regarding substrate-enzyme specificity. As ADP-ribosylhydrolases also exhibit amino acid specificity (33), future entries may include information on the enzymes that add and remove ADP-ribosylation from specific sites. Emergent studies indicate that cellular processes are regulated not only by the balance between the synthesis and removal of ADP-ribosylation, but also by maintaining the appropriate forms of ADP-ribosylation (34). Although mono(ADP-ribosyl)ation and poly(ADP-ribosyl)ation cannot be distinguished by the current proteomics methods (13), future entries will focus on annotating this critical chain length information, especially when appropriate technologies are mature (30, 35).
Nucleic Acids Research, 2021, Vol. 49, Database issue D265
ADPriboDB is free and publicly available at http://adpribodb.leunglab.org/.
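As a rough illustration of the kind of correlation analysis that the site information could support, the sketch below counts how many distinct publications report each ADP-ribosylation site and lists other modifications within a small residue window. The function names, the record layout and the window size are assumptions made for this example, and all positions and publication labels are invented; this is not the database's own code or schema.

```python
from collections import Counter


def publication_counts(records):
    """Count distinct publications per site.

    records: iterable of (site_position, publication_id) pairs.
    Duplicate pairs are collapsed so each publication counts once per site.
    """
    unique_pairs = set(records)
    return Counter(position for position, _ in unique_pairs)


def nearby_modifications(adpr_sites, other_ptms, window=5):
    """Map each ADP-ribosylation site to other PTMs within +/- window residues.

    other_ptms: iterable of (position, ptm_type) pairs.
    """
    other_ptms = list(other_ptms)
    result = {}
    for site in sorted(set(adpr_sites)):
        hits = [
            (position, ptm_type)
            for position, ptm_type in other_ptms
            if abs(position - site) <= window
        ]
        result[site] = sorted(hits)
    return result


# Invented example data for a single hypothetical protein.
records = [(10, "pubA"), (10, "pubB"), (10, "pubA"), (40, "pubC")]
ptms = [(10, "phosphorylation"), (12, "ubiquitination"), (90, "methylation")]

counts = publication_counts(records)
neighbours = nearby_modifications(counts.keys(), ptms, window=5)
```

Sites supported by several publications and flanked by other modifications would be natural first candidates for follow-up, in the spirit of the prioritization described in the text.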
PARPs and ADP-ribosylation: recent advances linking molecular functions to biological outcomes
ADP-ribosylation: new facets of an ancient modification
Sirtuins as regulators of metabolism and healthspan
Toward a unified nomenclature for mammalian ADP-ribosyltransferases
Physiological relevance of the endogenous mono(ADP-ribosyl)ation of cellular proteins
Nuclear ADP-Ribosylation and Its Role in Chromatin Plasticity, Cell Differentiation, and Epigenetics
The multifaceted roles of PARP1 in DNA repair and chromatin remodelling
PARPs and ADP-ribosylation in RNA biology: from RNA expression and processing to protein translation and proteostasis
Novel bacterial ADP-ribosylating toxins: structure and function
Poly-ADP ribosylation in DNA damage response and cancer therapy
The impact of PARPs and ADP-ribosylation on inflammation and host-pathogen interactions
Nicotinamide mononucleotide activation of new DNA-dependent polyadenylic acid synthesizing nuclear enzyme
The promise of proteomics for the study of ADP-ribosylation
ADPriboDB: The database of ADP-ribosylated proteins
dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins
dbPTM in 2019: exploring disease association and cross-talk of post-translational modifications
PhosphoSitePlus, 2014: mutations, PTM Recalibr
Enrichr: a comprehensive gene set enrichment analysis web server 2016 update
Macrodomain ADP-ribosylhydrolase and the pathogenesis of infectious diseases
The Viral Macrodomain Counters Host Antiviral ADP-Ribosylation
Molecular Basis for ADP-Ribose Binding to the Mac1 Domain of SARS-CoV-2 nsp3
The SARS-CoV-2 conserved macrodomain is a highly efficient ADP-ribosylhydrolase
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
Proteomic Analysis of the Downstream Signaling Network of PARP1
UniProt: a worldwide hub of protein knowledge
Systems-wide Analysis of Serine ADP-Ribosylation Reveals Widespread Occurrence and Site-Specific Overlap with Phosphorylation
Chemical genetic discovery of PARP targets reveals a role for PARP-1 in transcription elongation
Chemical Tools to Study Protein ADP-Ribosylation
ELTA: Enzymatic Labeling of Terminal ADP-Ribose
PTMscape: an open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes
SUMOgo: Prediction of sumoylation sites on lysines by motif screening models and the effects of various post-translational modifications
(ADP-ribosyl)hydrolases: structure, function, and biology
HPF1 completes the PARP active site for DNA damage-induced ADP-ribosylation
Quantification of cellular poly(ADP-ribosyl)ation by stable isotope dilution mass spectrometry reveals tissue- and drug-dependent stress response dynamics
We thank the Leung lab members for critical comments on the manuscript and the database for improving its usability.
Supplementary Data are available at NAR Online.