key: cord-0428311-upk3b09x authors: Wang, Jiyao; Youkharibache, Philippe; Marchler-Bauer, Aron; Lanczycki, Christopher; Zhang, Dachuan; Lu, Shennan; Madej, Thomas; Marchler, Gabriele H.; Cheng, Tiejun; Chong, Li Chuin; Zhao, Sarah; Yang, Kevin; Lin, Jack; Cheng, Zhiyu; Dunn, Rachel; Malkaram, Sridhar Acharya; Tai, Chin-Hsien; Enoma, David; Busby, Ben; Johnson, Nicholas; Tabaro, Francesco; Song, Guangfeng; Ge, Yuchen title: iCn3D: from Web-based 3D Viewer to Structure Analysis Tool in Batch Mode date: 2021-09-11 journal: bioRxiv DOI: 10.1101/2021.09.10.459868 sha: f00cb2ae9cee9a5d0987283a25b30ba1381eec53 doc_id: 428311 cord_uid: upk3b09x
iCn3D was originally released as a web-based 3D viewer, which allows users to create a custom view in a life-long, shortened URL to share with colleagues. Recently, iCn3D was converted to use JavaScript classes and can now be used as a library to write Node.js scripts. Any interactive feature in iCn3D can be converted to a Node.js script to run in batch mode on a large data set. Currently the following Node.js script examples are available at https://github.com/ncbi/icn3d/tree/master/icn3dnode: ligand-protein interaction, protein-protein interaction, change of interactions due to residue mutations, DelPhi electrostatic potential, and solvent accessible surface area. iCn3D PNG images can also be exported in batch mode using a Python script. Other recent features of iCn3D include the alignment of multiple chains from different structures, realignment, dynamic symmetry calculation for any subsets, 2D cartoons at different levels, and interactive contact maps. iCn3D can also be used in Jupyter Notebook as described at https://pypi.org/project/icn3dpy.
Since the release of GLmol and iview [1], WebGL-based structure viewers (e.g., 3Dmol [2], NGL [3; 4], Aquaria [5], iCn3D [6], LiteMol [7], and Mol* [8]) have been widely adopted in the structural biology field [9].
These web-based viewers have several advantages over standalone viewers (e.g., RasMol [10], VMD [11], Cn3D [12], Jmol [13], Chimera [14; 15], and PyMOL [16]). First, users can view 3D structures directly in web browsers without installation. Second, users' custom views can be shared with colleagues in a shortened link (or an image) with all data and command history. Third, these viewers are written in open-source JavaScript code and their features can be easily shared with each other. iCn3D (I-see-in-3D) has all these features plus its own unique feature: all interactive structure analyses in iCn3D can be converted to analyses in batch mode using Node.js scripts, which call functions in iCn3D. This feature has been available since we modernized the code of iCn3D version 3 to use JavaScript classes.
iCn3D was first released in 2016 as a 3D structure viewer with synchronized 1D, 2D and 3D views and different annotations. The custom view can be shared in a life-long, shortened URL. Since then, we introduced interaction analyses [17] and converted three pieces of software into RESTful APIs. First, DelPhi [18] was licensed from Columbia University and was used to calculate electrostatic potentials for nucleotides, proteins, and ligands in iCn3D. Second, scap [19] was kindly provided by the Barry Honig group at Columbia University and was used to predict the side chain change due to mutation. Third, SymD [20] was kindly provided by Chin-Hsien Tai at the National Cancer Institute and was used to calculate protein symmetries dynamically [21]. We also added other features including the alignment of multiple chains from different structures, realignment, dynamic symmetry calculation for any subsets, 2D cartoons at different levels, and interactive contact maps. iCn3D can also be used in Jupyter Notebook.
The C++ code of DelPhi [18] was licensed from Columbia University and converted to a RESTful API "delphi.cgi" at the National Center for Biotechnology Information (NCBI). To qualitatively illustrate the electrostatic potential and speed up the calculation, the linear Poisson-Boltzmann equation was solved. DelPhiPKa [22] is used to add hydrogens and partial charges to proteins and nucleotides. Open Babel [23] is used to add hydrogens to ligands. Antechamber [24] is used to add partial charges to ligands with the Gasteiger charge calculation method. The default grid size (n=65) and default salt concentration (0.15 M) can be changed. The pH is set at 7.0. The parameters of all RESTful APIs are described at https://www.ncbi.nlm.nih.gov/Structure/icn3d/icn3d.html#restfulapi.
The C++ code of Jackal/scap [19] was provided by the Barry Honig group and converted to a RESTful API "scap.cgi" at NCBI. The parameter "min" was set as "1". The scap program first iteratively samples all side-chain rotamers until convergence. The lowest-energy conformation among all the conformations starting from the initial 3 conformations is then minimized by refining the side-chain conformation with 2-degree rotations on each bond in the side chain to search for lower-energy conformations around the rotamer. The large rotamer library with the file "side_large_rotamer_format.txt" was used.
The C++ code of SymD [20] was provided by the Chin-Hsien Tai and Byungkook Lee group and converted to a RESTful API "symd.cgi" at NCBI. The maximum allowed number of atoms is 30,000. A symmetry is considered found if the Z score is greater than 8. Only Cn and helical symmetries are considered.
Before iCn3D version 3, two JavaScript objects "iCn3DUI" and "iCn3D" were used to contain all functions. In iCn3D version 3, we switched from ECMAScript 2009 (ES5) to ECMAScript 2015 (ES6) and used JavaScript classes.
Each class is in a separate file and can access the classes "iCn3DUI" and "iCn3D", which can access almost all functions. The class structure is described at https://www.ncbi.nlm.nih.gov/Structure/icn3d/icn3d.html#classstructure. We also updated Three.js from version 103 to version 128, which uses JavaScript classes as well. Since the changes to both iCn3D and Three.js are significant, we kept the previous library names untouched and used new library names for Three.js and iCn3D. The updated embedding method is described at https://www.ncbi.nlm.nih.gov/Structure/icn3d/icn3d.html#HowToUse. All JavaScript classes are bundled with rollup (https://rollupjs.org/guide/en/). Then gulp (https://gulpjs.com) was used to set up all files for release on the iCn3D GitHub page (https://github.com/ncbi/icn3d) and in the icn3d npm package (https://www.npmjs.com/package/icn3d).
iCn3D shows six kinds of interactions between molecular structures: contact, hydrogen bond, salt bridge or ionic interaction, halogen bond, π-cation interaction, and π-stacking. Users can generate the interactions shown in Figure 1 in three steps: click the menu "Analysis > Interactions", select the two sets chain 6M0J_A and 6M0J_E in the popup window, and click "2D Interaction Network". Figure 1A shows the interacting residues between ACE2 (in pink) and SARS-CoV-2 spike protein (in blue) using the PDB structure 6M0J. The green, cyan, and gray dotted lines show the hydrogen bonds, salt bridges, and contacts, respectively. Figure 1B shows the 2D interaction network, where the nodes represent residues and the lines represent interactions. The interactions include multiple hydrogen bonds, two salt bridges and many contacts. The custom view can be reproduced with a life-long sharable link https://structure.ncbi.nlm.nih.gov/icn3d/share.html?oLwZzGL59izJeEBVA as described in Figure 1.
This interactive analysis can be converted to a Node.js script https://github.com/ncbi/icn3d/blob/master/icn3dnode/epitope.js by using the npm package icn3d (https://www.npmjs.com/package/icn3d). Users can simply run the command "node epitope.js 6M0J E A" in a command line to get the interacting residues between chain E (spike protein) and chain A (ACE2) of the PDB structure 6M0J.
iCn3D can show side chain mutations in 3D views. The side chain prediction was done with a RESTful API based on scap [19]. Users can show the change of interactions due to mutations by clicking the menu "Analysis > Mutation", inputting the mutation (such as "6M0J_E_501_Y" to indicate that the mutation is at position 501 of the E chain of PDB 6M0J and the mutant residue is Tyr), and clicking "Interactions". Users can press the key "a" to toggle between the wild type and the mutant. Figure 2A shows the mutant's 3D structure in the stick style. The spike protein is in blue and ACE2 is in pink. Figure 2B shows that the mutant Y501 (bottom panel) introduces one π-cation interaction (red line) and one π-stacking interaction (blue line). The view can be reproduced with the sharable link as described in Figure 2. The interactive analysis can be converted to a Node.js script https://github.com/ncbi/icn3d/blob/master/icn3dnode/interaction2.js. Users can run the command "node interaction2.js 6M0J E 501 Y" to get the change of interactions with counts.
iCn3D can also show electrostatic potential for structures. The electrostatic potential was calculated by a RESTful service based on the software DelPhi [18], which is licensed from Columbia University. Figure 3A shows the binding of the PH domain of phospholipase C Delta to IP3 (or a membrane containing PIP2). The binding region contains multiple positively charged residues (in blue). Figure 3B shows the electrostatic potential on the surface of the PH domain. Blue and red indicate +75 and -75 mV, respectively.
The binding region clearly has a large positive electrostatic potential. The custom view of Figure 3B can be reproduced with a sharable link: https://structure.ncbi.nlm.nih.gov/icn3d/share.html?8WqQV4wir5oFx7aM9. The DelPhi electrostatic potential on the surface of chain A can be exported from a Node.js script https://github.com/ncbi/icn3d/blob/master/icn3dnode/delphipot.js with the command "node delphipot.js 1MAI A". Users can also choose to show an equipotential map (Figure 3C) instead of the surface potential. The blue and red meshes show the +25 and -25 mV equipotential profiles.
iCn3D can show not only the pre-calculated symmetry from PDB, but also calculate symmetry for any selected residues. The calculation is done by a RESTful API based on the software SymD [20]. Figure 4B shows the structure of PDB 3HUJ. Even though the whole structure has no symmetry, its subset of residues 1-178 in chain A has C2 symmetry as shown in Figure 4A. The symmetric parts are colored in red (same residue) or blue (different residues). The feature is available in the menu "Analysis > Symmetry > from SymD (Dynamic)". The custom view can be reproduced in a sharable link as described in Figure 4.
iCn3D can align multiple chains from different structures. Figure 5 shows an example with two structures. The master structure is the first PDB structure 1HHO. All other structures are aligned to it. Figure 5A shows the alignment with full chains using the VAST algorithm [25], based purely on geometric criteria. Figure 5B shows the alignment to a subset of residues 10-50 in the master structure. A sequence alignment to the set of residues is done, followed by the structure superposition of the coordinates of these residues. Figure 5C shows the alignment of predefined residues in each structure. The coordinates of the predefined residues are used for structure superposition. The sharable links are shown in Figure 5.
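The sequence-alignment step that precedes superposition in Figure 5B can be illustrated with a short sketch. The code below is a generic Needleman-Wunsch global alignment with an arbitrary scoring scheme (match +1, mismatch -1, gap -1). It is shown only to make concrete how aligned residue pairs are obtained before their coordinates are superposed; the alignment method and scoring that iCn3D actually uses are not described in this text, and the function name `aligned_pairs` is hypothetical.

```python
from typing import List, Tuple


def aligned_pairs(seq_a: str, seq_b: str, match: int = 1,
                  mismatch: int = -1, gap: int = -1) -> List[Tuple[int, int]]:
    """Return 0-based (index_in_a, index_in_b) pairs of aligned residues.

    Generic Needleman-Wunsch global alignment; the scoring values are
    illustrative assumptions, not iCn3D parameters.
    """
    n, m = len(seq_a), len(seq_b)
    # score[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align the two residues
                              score[i - 1][j] + gap,      # gap in seq_b
                              score[i][j - 1] + gap)      # gap in seq_a

    # Trace back from the bottom-right corner, preferring the diagonal move.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    # The residue D in the first sequence has no partner in the second.
    print(aligned_pairs("ACDEFG", "ACEFG"))
```

The returned index pairs identify which residues would supply the coordinates for the superposition step; the superposition itself (a least-squares fit of the paired coordinates) is not shown here.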
Users can also load multiple PDB files and use the "File > Realign Selection" feature to align multiple structures. The "on Sequence Alignment" option is similar to Figure 5B and uses a sequence alignment. The "Residue by Residue" option is similar to Figure 5C and uses predefined residues.
Previously, iCn3D only showed "2D Interaction" for structure data from NCBI with the URL parameter "mmdbid=". We added "2D Cartoon" for any data source, such as data from PDB with the URL parameter "mmtfid=". Figure 6A shows the AlphaFold predicted structure for the UniProt ID A0A061AD48. iCn3D uses the URL parameter "afid=A0A061AD48" to get the AlphaFold predicted structure data from EMBL-EBI [26]. The high confidence parts are colored in blue and the low confidence parts are colored in yellow and orange. Figure 6B shows the 2D cartoon of the two domains in this protein: Nek and ATS1. The principal axes and size of the domains are used to determine the orientation and size of the ovals, respectively. The nodes can be dragged to modify the view of the cartoon and clicked to be selected in 3D. Figure 6C shows the selected Nek domain. Figure 6D shows the 2D cartoon at the secondary structure level for the Nek domain, obtained by clicking the menu "Analysis > 2D Cartoon > Helix/Sheet Level". The helices and sheets are projected from 3D to 2D, and thus the 2D view is dynamic, depending on the orientation of the 3D view. The helices and sheets can be dragged to rearrange the view and clicked to be selected in 3D. For a structure with multiple chains, the 2D cartoon at the chain level is useful too. The sharable link is shown in Figure 6.
iCn3D can also display interactive 2D maps for any subsets. Figure 7 shows the contact map of the AlphaFold structure with UniProt ID A0A061AD48. The contacts within 8 angstroms of C-beta atoms are shown. The map has a scale of 0.04. If the scale is set as 1, the X- and Y-axes show residues as nodes with residue names and colors.
The residue nodes and the contact points can be clicked to show the residues or contacts. This feature is available in the menu "Analysis > Contact Map". It can be dynamically applied to multiple structures or a subset of the structure. The sharable link is shown in Figure 7.
Since we modernized iCn3D with JavaScript classes and released the npm package icn3d, scripts can be used to analyze a large set of structures in batch mode. A few Node.js scripts have been shown above. More examples are listed at https://github.com/ncbi/icn3d/tree/master/icn3dnode. We use the Node.js script "ligand.js" as an example to show the flow of the code. First, install the npm libraries "icn3d", "three", "jsdom", and "jquery". Then set up two variables used in all iCn3D functions: "me" is an instance of the iCn3DUI class, and "ic" is an instance of the iCn3D class. The class structure is described at https://www.ncbi.nlm.nih.gov/Structure/icn3d/icn3d.html#classstructure. Next, retrieve the coordinate data from NCBI. Finally, call the iCn3D function "ic.showInterCls.pickCustomSphere_base" to get the residues interacting with the ligand. Users can now use the command "node ligand.js [PDB ID] [three-letter ligand residue name]" to calculate the interaction between a ligand and a protein.
We released icn3dpy at https://pypi.org/project/icn3dpy/ to make iCn3D work in Jupyter Notebook. Users just need to install icn3dpy with the command "pip install icn3dpy" and install one extension with the command "jupyter labextension install jupyterlab_3dmol". After launching Jupyter Notebook, three lines of script will launch the same view as in Figure 1 . selection; add residue number labels; set label scale 2.0|||{"factor":"0.6758","mouseChange":{"x":"0.000","y":"0.000"},"quaternion":{"_x":"-0.6336","_y":"0.3127","_z":"-0.3564","_w":"0.6114"}}')". Finally type "view" to see the interactive 3D view and 2D interaction network similar to Figure 1.
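Returning to the batch-mode Node.js scripts, a run over many mutations can be driven from a short wrapper. The following is a minimal Python sketch and is not part of the iCn3D distribution: it parses mutation specifiers of the form "6M0J_E_501_Y" and builds the corresponding "node interaction2.js 6M0J E 501 Y" commands. The names `Mutation`, `parse_mutation`, `build_command`, and `run_batch` are hypothetical. The sketch assumes that Node.js is installed, that interaction2.js and its npm dependencies are in the working directory, and that residue positions are plain integers without insertion codes.

```python
import shutil
import subprocess
from typing import Dict, List, NamedTuple, Optional


class Mutation(NamedTuple):
    """One point mutation: structure ID, chain, residue number, mutant residue."""
    pdb_id: str
    chain: str
    position: int
    mutant: str


def parse_mutation(spec: str) -> Mutation:
    """Parse a specifier such as '6M0J_E_501_Y' into its four fields."""
    parts = spec.strip().split("_")
    if len(parts) != 4:
        raise ValueError(f"expected PDBID_CHAIN_POSITION_RESIDUE, got {spec!r}")
    pdb_id, chain, position, mutant = parts
    # Chain IDs are case-sensitive, so only the PDB ID and residue are upper-cased.
    return Mutation(pdb_id.upper(), chain, int(position), mutant.upper())


def build_command(m: Mutation, script: str = "interaction2.js") -> List[str]:
    """Build the argument list for 'node interaction2.js PDBID CHAIN POS RES'."""
    return ["node", script, m.pdb_id, m.chain, str(m.position), m.mutant]


def run_batch(specs: List[str]) -> Dict[str, Optional[str]]:
    """Run the script once per specifier; map each specifier to its stdout.

    A failed run (non-zero exit code) is recorded as None. The format of the
    script's output is not assumed here, so it is returned unparsed.
    """
    if shutil.which("node") is None:
        raise RuntimeError("Node.js ('node') was not found on PATH")
    results: Dict[str, Optional[str]] = {}
    for spec in specs:
        cmd = build_command(parse_mutation(spec))
        done = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[spec] = done.stdout if done.returncode == 0 else None
    return results


if __name__ == "__main__":
    # Only prints the command; call run_batch([...]) to actually execute it.
    print(" ".join(build_command(parse_mutation("6M0J_E_501_Y"))))
```

Passing the command as an argument list rather than a shell string avoids quoting problems when identifiers come from an input file.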
With the release of more and more AlphaFold predicted structures and experimentally determined structures, it is crucial to be able to do structure analysis in batch mode. The iCn3D-based Node.js scripts fully leverage the power of iCn3D and can reproduce any feature in iCn3D except graphical displays, which can be handled with Python code. The Python code to export 3D images from iCn3D is shown at https://github.com/ncbi/icn3d/blob/master/icn3dnode/batch_export_png.py.
Users can save their work in iCn3D by saving iCn3D PNG images with the menu "File > Save File > iCn3D PNG Image". These images contain not only the custom views but also all data and command histories. They can be loaded back into iCn3D with the menu "File > Open File > iCn3D PNG Image". They can also be retrieved from the web to reproduce the saved views, e.g.,
Our future plans include linking iCn3D with multiple sequence alignment tools, allowing iCn3D to assemble different parts of structures, adopting WebGL2 in iCn3D, and more.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
2D cartoon at the secondary structure level. Helices are solid cylinders and labeled as "H" plus the first residue number. Sheets are empty cylinders and labeled as "S" plus the first residue number. The sharable link for panels A and B is: https://structure.ncbi.nlm.nih.gov/icn3d/share.html?R8Wtf9hDsdo6sH5Z9
Figure 7. Contact map of the AlphaFold structure with UniProt ID A0A061AD48. The distance threshold is 8 angstroms between C-beta atoms. The sharable link is https://structure.ncbi.nlm.nih.gov/icn3d/share.html?vTc6gkbCenTyLzt4A.
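The contact criterion used for the contact map in Figure 7 can be made concrete with a minimal Python sketch that is independent of iCn3D's own implementation: two residues are counted as in contact when their C-beta atoms are within 8 angstroms of each other. The function name `contact_map` and the input format (one C-beta coordinate per residue, in angstroms) are assumptions for illustration. How residues without a C-beta atom, such as glycine, are treated is not described in this text, so the caller is assumed to supply a substitute coordinate for them.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def contact_map(cb_coords: Sequence[Coord],
                threshold: float = 8.0) -> List[Tuple[int, int]]:
    """Return 0-based residue index pairs (i, j), i < j, within the threshold.

    cb_coords holds one C-beta coordinate per residue, in angstroms.
    """
    contacts = []
    n = len(cb_coords)
    for i in range(n):
        for j in range(i + 1, n):
            # Euclidean distance between the two C-beta positions
            if math.dist(cb_coords[i], cb_coords[j]) <= threshold:
                contacts.append((i, j))
    return contacts


if __name__ == "__main__":
    # Four toy residues placed along the x-axis (not real structure data).
    toy = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
    print(contact_map(toy))  # [(0, 1), (1, 2)]
```

The all-pairs loop is quadratic in the number of residues, which is adequate for a single chain; a spatial grid or k-d tree would be the usual choice for large assemblies.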
[1] iview: an interactive WebGL visualizer for protein-ligand complex
[2] 3Dmol.js: molecular visualization with WebGL
[3] Web-based molecular graphics for large complexes
[4] NGL Viewer: a web application for molecular visualization
[5] Aquaria: simplifying discovery and insight from protein structures
[6] iCn3D, a web-based 3D viewer for sharing 1D/2D/3D representations of biomolecular structures
[7] LiteMol suite: interactive web-based visualization of large-scale macromolecular structure data
[8] Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures
[9] Bringing Molecular Dynamics Simulation Data into View
[10] RASMOL: biomolecular graphics for all
[11] VMD: visual molecular dynamics
[12] Cn3D: sequence and structure views for Entrez
[13] The use of the free, open-source program Jmol to generate an interactive web site to teach molecular symmetry
[14] Meeting modern challenges in visualization and analysis
[15] UCSF Chimera - A visualization system for exploratory research and analysis
[16] The PyMOL Molecular Graphics System
[17] Using iCn3D and the World Wide Web for structure-based collaborative research: Analyzing molecular interactions at the root of
[18] A Rapid Finite-Difference Algorithm, Utilizing Successive over-Relaxation to Solve the Poisson-Boltzmann Equation
[19] Extending the accuracy limits of prediction for side-chain conformations
[20] SymD webserver: a platform for detecting internally symmetric protein structures
[21] Topological and Structural Plasticity of the Single Ig Fold and the Double Ig Fold Present in CD19
[22] DelPhiPKa web server: predicting pKa of proteins, RNAs and DNAs
[23] Open Babel: An open chemical toolbox
[24] Development and testing of a general amber force field
[25] Surprising similarities in structure comparison
[26] Highly accurate protein structure prediction with AlphaFold
[27] Cloudy, increasingly FAIR; revisiting the FAIR Data guiding principles for the European Open Science Cloud
[28] Interoperability and FAIRness through a novel combination of Web technologies
We would like to specially thank Barry Honig for licensing DelPhi and providing scap to us; Emil Alexov for his very helpful suggestions about using DelPhi; David Koes for his help in using iCn3D in Jupyter Notebook; Ravinder Abrol, Raul Cachau, Todd Smith and Sandra Porter for their feedback and suggestions during the development of iCn3D; and Ravinder Abrol and Allissa Dillman for organizing several codeathons related to iCn3D.