key: cord-0018858-6ay299zs
authors: Sehnal, David; Bittrich, Sebastian; Deshpande, Mandar; Svobodová, Radka; Berka, Karel; Bazgier, Václav; Velankar, Sameer; Burley, Stephen K; Koča, Jaroslav; Rose, Alexander S
title: Mol* Viewer: modern web app for 3D visualization and analysis of large biomolecular structures
date: 2021-05-06
journal: Nucleic Acids Res
DOI: 10.1093/nar/gkab314
sha: 78c9b35e48e5ae7d4d165ba50d747cf07c28828d
doc_id: 18858
cord_uid: 6ay299zs

Large biomolecular structures are being determined experimentally on a daily basis using established techniques such as crystallography and electron microscopy. In addition, emerging integrative or hybrid methods (I/HM) are producing structural models of huge macromolecular machines and assemblies, sometimes containing hundreds of millions of non-hydrogen atoms. The performance requirements for visualization and analysis tools delivering these data are increasing rapidly. Significant progress in developing online, web-native three-dimensional (3D) visualization tools was previously accomplished with the introduction of the LiteMol suite and NGL Viewer. Thereafter, Mol* development was jointly initiated by PDBe and RCSB PDB to combine and build on the strengths of LiteMol (developed by PDBe) and NGL (developed by RCSB PDB). The web-native Mol* Viewer enables 3D visualization and streaming of macromolecular coordinate and experimental data, together with capabilities for displaying structure quality, functional, or biological context annotations. High-performance graphics and data management allow users to simultaneously visualize up to hundreds of (superimposed) protein structures, stream molecular dynamics simulation trajectories, render cell-level models, or display huge I/HM structures. It is the primary 3D structure viewer used by PDBe and RCSB PDB. It can be easily integrated into third-party services. Mol* Viewer is open source and freely available at https://molstar.org/.
Nucleic Acids Research, 2021, Vol. 49, Web Server issue

Experimental methods to determine the three-dimensional (3D) structures of biomolecules are continuously improving and produce models of molecular complexes at various resolutions that span multiple scales and can be dynamic throughout an experiment. By combining data from complementary experimental techniques including macromolecular crystallography (MX), nuclear magnetic resonance (NMR), 3D electron microscopy (3DEM), small-angle scattering (SAS), chemical crosslinking, or integrative/hybrid methods (I/HM), 3D structural models of large macromolecular systems can be determined (1, 2). Such models include large macromolecular machines (3), dynamic assemblies, membrane organization (4), genome architecture (5, 6) or even whole organelles (7). Access to, visualization of, and analysis of these structures are a central part of structural biology and structural bioinformatics. However, as macromolecular data sets grow ever larger and more complex, it becomes challenging to create software tools to access, visualize, and manipulate them.

Web platforms, both mobile and desktop, have become an increasingly popular and essential tool for performing these tasks. Embracing advances in web browser technology provides the means for creating scalable molecular graphics and analysis tools with near-instant access to any available data. Web-based tools are platform-independent and require little or no local software installation, making them available to virtually everyone in both the scientific and non-scientific community and reaching an audience larger than ever before. Moreover, these technologies (most notably JavaScript, HTML and WebGL (https://www.khronos.org/webgl/)) and their surrounding ecosystem (including NPM (https://www.npmjs.com/), Node.js (https://nodejs.org/), TypeScript (https://www.typescriptlang.org/) and GitHub (https://github.com/)) offer good support for the development of modular libraries and components.
In summary, the web provides a unique opportunity to develop a common library and a set of tools for accessing, analysing, and visualizing macromolecular data. Here, we introduce Mol* (/'mol-star/) Viewer, part of the Mol* open source project (8) developed by an international group of contributors and supported by RCSB Protein Data Bank (RCSB PDB) (9) and Protein Data Bank in Europe (PDBe) (10) with the goal of developing a common library and tools for web-based molecular graphics and analyses. The Mol* project includes modules for data storage, in-memory representation, a query language, UI state management, visualization, and tools for efficient data access in a collaborative ecosystem. Collaborative development also helps with anticipating and keeping up with developments in structural biology and related fields, and it supports long-term sustainability.

The purpose of Mol* Viewer is to enable web-based molecular visualization and analyses by providing a common library for the rapid and efficient development of tools and services for the structural biology/bioinformatics community. Examples include showing experimental/validation-related data for macromolecular models; displaying various annotations for macromolecular models that provide biological context, including SCOP (11), Pfam (12) and UniProt (13); creating visually interesting, engaging, and interactive educational materials; or visualizing results from structural bioinformatics or computer-aided drug design efforts. The project builds on the code and knowledge the authors gained from developing web-based molecular viewers, analysis tools, and compressed file formats, including the LiteMol Suite (14), the NGL Viewer (15, 16), PatternQuery (17), MMTF (18, 19) and BinaryCIF (20). Mol* Viewer is developed as an open source project and hosted on GitHub (https://github.com/molstar). Mol* Viewer can visualize markedly larger molecular systems than other currently available web visualization tools.
Owing to built-in BinaryCIF (20) decompression support and advanced techniques for model and volume/experimental data streaming (14), even large structures can be rendered interactively over limited bandwidth. As such, Mol* Viewer is able to render many types of large systems, including ribosomes, virus capsids, collections of superimposed macromolecules (e.g. a comparison of individual members of the same protein family), or MD simulation systems. Additionally, it is able to visualize mesoscopic models such as cellPACK models, as illustrated in Figure 1. Mol* inherits many LiteMol suite and NGL Viewer features, as described in their respective articles. Here, we highlight some of its new and improved features:
• Advanced User Interface: access to many capabilities of the underlying Mol* library for creating complete molecular scenes with custom visuals and colourings.
• High-quality rendering: advanced rendering options for beautiful images and improved perception of details, such as lighting (matte, metallic, glossy, plastic), outlines, fog/depth cue, and ambient occlusion 'shadows' which darken crevices occluded from ambient light (Figure 2).
• Sequence view and molecular component focus tools: integrated sequence view and component (e.g. ligand or polymer) selection menu to help with navigating the structure and making selections.
• Alignment of molecules: sequence-guided pairwise alignment of two or more structures and ligand alignment by manual selection of corresponding atoms.
• Measurements and labels: geometric measurements (distances, angles, dihedrals) and their labelling.
• Session state saving: save the current molecular scene to reload it later.
• Animation export: video export of molecular dynamics simulations, rotating molecules, and 3D state transitions (such as zooming to a binding site).
• Screenshots: custom-sized, high-resolution, anti-aliased screenshots with preview, automatic cropping, and transparent background support.
• Built-in data loading and annotations: includes support for loading data from RCSB PDB and PDBe MX and 3DEM density servers, wwPDB validation reports, and RCSB PDB assembly symmetry.

Mol* Viewer is currently able to load and visualize many file formats with 3D structures:
• Structures (21): PDBx/mmCIF (https://mmcif.wwpdb.org/pdbx-mmcif-home-page.html) (including BinaryCIF encoded), pdb (ftp://ftp.wwpdb.org/pub/pdb/doc/format_descriptions/Format_v32_letter.pdf), pdbqt (http://autodock.scripps.edu/faqs-help/faq/what-is-the-format-of-a-pdbqt-file), gro (https://manual.gromacs.org/documentation/2018/user-guide/file-formats.html#gro), sdf (https://discover.3ds.com/sites/default/files/2020-08/biovia_ctfileformats_2020.pdf), mol (https://discover.3ds.com/sites/default/files/2020-08/biovia_ctfileformats_2020.pdf), mol2 (http://chemyang.ccnu.edu.cn/ccb/server/AIMMS/mol2.pdf), CIF-core (small molecule/crystallographic cif) (https://www.iucr.org/resources/cif/dictionaries/cif_core).
• Volumes: ccp4/mrc/map (22)

Adding support for additional formats is usually a straightforward process.

TypeScript (https://www.typescriptlang.org/) was used for the development of the web application, WebGL (https://www.khronos.org/webgl/) for hardware-accelerated 3D rendering, and the standards of the open web platform (https://www.w3.org/standards/) for the whole tool. The React framework (https://reactjs.org/) was used to implement the application's UI.

Mol* Viewer offers a wide range of visualization capabilities required by current structural biology. It can show one structure, a few structures, or a large set of structures (e.g. a whole protein family). The structures can be static or dynamic (e.g. a molecular dynamics trajectory). It can visualize large mesoscopic models (e.g. Genome3D (6), cellPACK models (23)), hybrid models (e.g.
from PDB-Dev (1)), and protein assemblies, as well as residues at atomic resolution. All the levels of detail can be seamlessly navigated within one Mol* session (provided the data are available). Various visualization models of structure coordinates can be applied, specifically: surfaces and volumes (Gaussian surface, Gaussian volume, molecular surface, etc.), secondary structure (e.g. cartoon, ribbon), ligands (labels, glycan 3D-SNFG symbols (24), etc.), and atoms (balls and sticks, lines, points, etc.). These visualization models can be coloured not only according to many types of atom properties (including occupancy, uncertainty, etc.), residue properties (including hydrophobicity and accessible surface area) and chain properties, but also based on annotations (e.g. quality criteria annotations loaded from wwPDB validation reports (25)). Mol* also supports the rendering of electron densities and cryo-EM maps. Further molecular characteristics, for example molecular orbitals, non-covalent interactions, and membrane orientation, can be shown. Figure 3 shows the user interface of the Mol* Viewer and Figure 4 points to interactive practical demonstrations of Mol* Viewer's capabilities.

The quality and applicability of Mol* Viewer are demonstrated by its integration into a large number of scientific tools and databases. Mol* Viewer was integrated into PDBe and RCSB PDB as the primary 3D viewer, where it is actively used by thousands of users daily. Moreover, Mol* Viewer was incorporated into many other resources, including PDBe-KB (26), PDB-Dev, EMDataResource (27), PED (28), MobiDB (29) and HARP (30).

Mol* Viewer is a powerful web application for the visualization and analysis of molecular data. Its visualization capabilities far exceed those of other currently available web visualization tools.
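
As a minimal illustration of colouring by a numeric property such as occupancy, a per-atom value can be mapped linearly onto a two-colour scale. The sketch below is not the Mol* implementation and does not use the Mol* API; the names `RGB`, `lerpColor` and `colorByProperty` are hypothetical, and the white-to-blue scale is an arbitrary choice made for the example.

```typescript
// Illustrative sketch only: all names are hypothetical, not part of Mol*.
type RGB = [number, number, number];

/** Clamp x to the closed interval [lo, hi]. */
function clamp(x: number, lo: number, hi: number): number {
    return Math.min(hi, Math.max(lo, x));
}

/** Linearly interpolate between two RGB colours; t is clamped to [0, 1]. */
function lerpColor(low: RGB, high: RGB, t: number): RGB {
    const u = clamp(t, 0, 1);
    const mix = (a: number, b: number): number => Math.round(a + (b - a) * u);
    return [mix(low[0], high[0]), mix(low[1], high[1]), mix(low[2], high[2])];
}

/** Map each property value in [min, max] to a colour on a two-colour scale. */
function colorByProperty(
    values: number[], min: number, max: number, low: RGB, high: RGB
): RGB[] {
    const span = max - min;
    // A degenerate range (min === max) maps every value to the low colour.
    return values.map(v => lerpColor(low, high, span === 0 ? 0 : (v - min) / span));
}

// Example: occupancies of four atoms, coloured from white (0) to blue (1).
// The value 1.2 lies outside the range and is clamped to the high colour.
const occupancies: number[] = [1.0, 0.5, 0.0, 1.2];
const colors: RGB[] = colorByProperty(occupancies, 0, 1, [255, 255, 255], [0, 0, 255]);
console.log(colors);
```

The same mapping applies to any per-atom or per-residue scalar (for example uncertainty or accessible surface area) once a value range is chosen; a production viewer would typically offer multi-stop colour scales rather than the two-colour scale assumed here.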
Mol* Viewer's speed and robustness allow the fast and intuitive visualization of molecular data ranging from atomistic models from PDB or MD simulations up to hybrid models with hundreds of thousands of residues, mesoscale cellPACK models with tens of millions of atoms, or 3D genome data. Furthermore, Mol* Viewer offers advanced selection and superimposition functionalities. It also offers a rich set of visualization models and colouring types. Last but not least, Mol* Viewer can save the current visualization state so that it can be reloaded later. Mol* Viewer can be used from its webpage https://molstar.org/ or can be integrated into other web applications. Its source code is available on GitHub at https://github.com/molstar.

Mol* Viewer is available for free at https://molstar.org/ under the MIT license, a permissive open source license, to facilitate code sharing and collaboration. Mol* Viewer can be readily embedded into any scientific web application.

Data for Figure 4C are available at https://doi.org/10.6084/m9.figshare.12040257.v1 and (31). Data for Figure 4D are available at (32). Data for Figure 4E are available at (33).

1. PDB-Dev: a prototype system for depositing integrative/hybrid structural models
2. Outcome of the first wwPDB hybrid/integrative methods task force workshop. Structure
3. Structural insights into mammalian mitochondrial translation elongation catalyzed by mtEFG1
4. How nanoscale protein interactions determine the mesoscale dynamic organisation of bacterial outer membrane proteins
5. Scaling molecular dynamics beyond 100,000 processor cores for large-scale biophysical simulations
6. Genome3D: a viewer-model framework for integrating and visualizing multi-scale epigenomic information within a three-dimensional genome
7. Atoms to phenotypes: molecular design principles of cellular energy metabolism
8. Mol*: Towards a common library and tools for web molecular graphics
9. The Protein Data Bank
10. PDBe: towards reusable data delivery infrastructure at protein data bank in Europe
11. The SCOP database in 2020: expanded classification of representative family and superfamily domains of known protein structures
12. Pfam: The protein families database in 2021
13. UniProt: the universal protein knowledgebase in 2021
14. LiteMol suite: interactive web-based visualization of large-scale macromolecular structure data
15. NGL Viewer: a web application for molecular visualization
16. Web-based molecular graphics for large complexes
17. PatternQuery: web application for fast detection of biomacromolecular structural patterns in the entire Protein Data Bank
18. MMTF--an efficient file format for the transmission, visualization, and analysis of macromolecular structures
19. Towards an efficient compression of 3D coordinates of macromolecular structures
20. BinaryCIF and CIFTools--lightweight, efficient and extensible macromolecular data management
21. Chapter 10 The PDB format, mmCIF formats, and other data formats
22. MRC2014: extensions to the MRC format header for electron cryo-microscopy and tomography
23. cellPACK: a virtual mesoscope to model and visualize structural systems biology
24. Rapidly display glycan symbols in 3D structures: 3D-SNFG in LiteMol
25. Protein Data Bank: the single global archive for 3D macromolecular structure data
26. PDBe-KB: a community-driven resource for structural and functional annotations
27. EMDataBank unified data resource for 3DEM
28. PED in 2021: a major update of the protein ensemble database for intrinsically disordered proteins
29. MobiDB: intrinsically disordered proteins in 2021
30. HARP: a database of structural impacts of systematic missense mutations in drug targets of Mycobacterium leprae
31. HTMD: high-throughput molecular dynamics for molecular discovery
32. A multiscale coarse-grained model of the SARS-CoV-2 virion
33. Tethered agonist exposure in intact adhesion/class B2 GPCRs through intrinsic structural flexibility of the GAIN domain
34. A robust and accurate tight-binding quantum chemical method for structures, vibrational frequencies, and noncovalent interactions of large molecular systems parametrized for all spd-block elements (Z = 1-86)
35. entos: a quantum molecular simulation package

We wish to thank Dr Ludovic Autin for his contributions to the cellPACK support, Aron Kovacs for improvements in 3D rendering, Dr Daniel Smith for help with molecular orbitals rendering, and especially Dr Shuchismita Dutta for providing the basis of the User documentation. We gratefully acknowledge contributions from past and present members of the RCSB PDB and PDBe and our other wwPDB partners.