Cambridge Structural Database (WebCSD)

Matt Hayward
STEM Librarian
University of Texas at San Antonio
matt.hayward@utsa.edu

The Cambridge Structural Database (CSD) is a chemistry resource compiled and distributed by the Cambridge Crystallographic Data Centre (CCDC). CSD contains a highly detailed and complete record of all published organic and metal-organic small-molecule crystal structures, and it is considered the authoritative source for finding and sharing structural chemistry data (Groom et al. 2016). WebCSD, the online implementation of CSD, is freely available on the internet, although a subscription and an individual account are required for advanced searching. The CSD Software System, which includes ConQuest, IsoStar, Mercury, PreQuest, Mogul, and the Python API, is available by annual subscription.

CCDC is a non-profit charitable organization started in 1965 by the Organic Chemistry Department at the University of Cambridge (Groom & Allen 2014). CSD is a vast repository of experimentally determined small-molecule crystallography data and structures. The database is continually updated, with new structures visible within moments of user deposition. As of this writing, the database contains over 970,000 entries.

The CSD Software System is intended for in-depth, comprehensive crystallographic searching by advanced users with expert knowledge of crystallography, such as crystallographers, structural chemists, and the drug design community (Thomas et al. 2010). This review therefore focuses on WebCSD, the web implementation, which is aimed at medicinal and pharmaceutical chemists but is more likely to be used by librarians and students. WebCSD is also an excellent tool for chemical education; however, that use has been reviewed in chemical education journals (Battle & Allen 2012; Battle et al. 2011; Battle et al. 2010) and falls outside the scope of this publication.

Searching CSD

CSD offers several search options, arranged in a tabbed frame. Each tab represents a different search type, with its own options for searching and refining results. The search types are Simple, Structure, Unit Cell, and Formula.

Simple Search – default landing page, search by:

Figure 1. Simple search

Structure Search – users can draw chemical structures within a Java applet to search for an exact structure, substructure, or similarity. Common rings and elements can be selected from the toolbar. A periodic table and the hand-drawing tools are available for others.

Figure 2. Structure search

Unit Cell Search – search by lattice centring (e.g., primitive, rhombohedral, a-, b-, c-, face- or body-centered) as well as cell lengths and angles.

Figure 3. Unit Cell search
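
To make the unit cell parameters concrete, the following is a minimal sketch of how six cell parameters (three lengths and three angles) relate to the cell volume, and how a query cell might be compared with a candidate cell. The volume formula is the standard one for a general (triclinic) cell. The function names and the tolerance values are illustrative assumptions only; they are not taken from WebCSD, whose actual matching rules are not described here.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Return the cell volume from lengths (angstroms) and angles (degrees).

    Uses the general triclinic formula:
    V = abc * sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2 cos(alpha) cos(beta) cos(gamma))
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    term = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    if term <= 0:
        raise ValueError("angles do not describe a valid unit cell")
    return a * b * c * math.sqrt(term)


def cells_match(query, candidate, length_tol_pct=1.5, angle_tol_deg=1.0):
    """Compare two cells given as (a, b, c, alpha, beta, gamma) tuples.

    The tolerances are hypothetical defaults chosen for illustration,
    not values used by WebCSD.
    """
    lengths_ok = all(
        abs(q - c) <= q * length_tol_pct / 100.0
        for q, c in zip(query[:3], candidate[:3])
    )
    angles_ok = all(
        abs(q - c) <= angle_tol_deg
        for q, c in zip(query[3:], candidate[3:])
    )
    return lengths_ok and angles_ok


if __name__ == "__main__":
    cubic = (10.0, 10.0, 10.0, 90.0, 90.0, 90.0)
    print(round(unit_cell_volume(*cubic), 3))
    print(cells_match(cubic, (10.1, 10.0, 10.0, 90.0, 90.5, 90.0)))
```

A sketch like this can help students see why small differences in reported cell lengths, for example from data collected at different temperatures, can still describe the same structure.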

Formula Search – search by molecular formula components (e.g., C5 H6 O2)

Figure 4. Formula search
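
As an illustration of what searching by formula components involves, the sketch below parses a space-separated formula such as C5 H6 O2 into element counts and checks whether an entry's formula has the requested count for each element in a query. The function names are hypothetical, and the matching rule (exact counts for the listed elements, other elements ignored) is an assumption made for the example; it is not a description of how WebCSD itself evaluates formula queries.

```python
import re
from collections import Counter

# One component: an element symbol followed by an optional count, e.g. "C5" or "Cl".
_COMPONENT = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(text):
    """Parse a space-separated formula such as 'C5 H6 O2' into a dict of counts."""
    counts = Counter()
    for token in text.split():
        match = _COMPONENT.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized formula component: {token!r}")
        symbol, number = match.groups()
        counts[symbol] += int(number) if number else 1
    return dict(counts)


def matches_components(entry_formula, query_formula):
    """Return True if the entry has exactly the queried count for each queried element.

    Elements not named in the query are ignored (an assumption for illustration).
    """
    entry = parse_formula(entry_formula)
    query = parse_formula(query_formula)
    return all(entry.get(element, 0) == count for element, count in query.items())


if __name__ == "__main__":
    print(parse_formula("C5 H6 O2"))
    print(matches_components("C5 H6 O2", "C5 O2"))
```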

Search Results

Following a query, the user is presented with a list of reference codes for structure or substructure matches (depending on query type and user selection) and the individual record for the first item in the list.

Figure 5. Search Results

Individual record – each retrieved record consists of several sections: reference code, compound name, 3-dimensional structure, chemical diagram, additional details, data citation, associated publication(s), and other chemical, crystal, and experimental details.

Figure 6. Individual record

Technical Information & Limitations

An annually renewed CSD subscription includes an unlimited-use license with user authentication based on IP address, as well as local installation and updates for the CSD Software System. This authentication allows secure local searching on an on-site server, integration with in-house databases, and proxy connections for off-campus users. For records that include the originally published article, the DOI can be linked to library holdings. These articles are listed under the “Associated publications” section, a label that confuses some first-time users, who infer from the wording that the listed items are related publications other than the original structure determination.
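
For librarians who want to route the DOI from an associated publication through a campus proxy, the following minimal sketch builds a resolver link and a proxied link from a DOI string. The `https://doi.org/` resolver is the standard DOI resolver; the proxy hostname is a hypothetical placeholder, and the `login?url=` prefix form is an assumption that should be replaced with whatever starting-point URL the local proxy actually uses.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"
# Hypothetical proxy prefix; substitute the institution's own starting-point URL.
PROXY_PREFIX = "https://login.proxy.example.edu/login?url="


def doi_url(doi):
    """Return a resolver URL for a DOI given bare or with a common prefix."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the slash after the DOI prefix.
    return DOI_RESOLVER + quote(doi, safe="/")


def proxied_url(doi, proxy_prefix=PROXY_PREFIX):
    """Return the resolver URL wrapped in the (hypothetical) proxy prefix."""
    return proxy_prefix + doi_url(doi)


if __name__ == "__main__":
    print(doi_url("10.1107/S2052520616003954"))
    print(proxied_url("doi: 10.1107/S2052520616003954"))
```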

While WebCSD can be accessed from any device, the structure search and 3D structure applets can occasionally be finicky. Most problems that users encounter with WebCSD can be resolved by updating Java, switching browsers (Firefox seems to work best), or clearing the cache and cookies. Off-campus access is occasionally interrupted, especially during heavy-use periods such as when class assignments that require WebCSD are due. CCDC recommends that users create their own (free) CCDC accounts using the License Site Number and License Confirmation Code, which can be requested from the subject librarian. Although the CCDC support team responds quickly when issues arise, the time-zone difference sometimes delays responses to US-based institutions.

Other resources offer comparable features, such as structure and compound searching, 3D visualization, and structural properties, but CCDC provides the most comprehensive coverage of organic crystal structures, along with innovative search techniques and thorough experimental and physical details. It is important to note that while CSD does contain all published organic and metal-organic small-molecule crystal structures, as of 2015 it was estimated that only about 15% of determined structures were published (Groom et al. 2016). Additionally, CSD does not include the following: inorganic structures, proteins, high-molecular-weight compounds, polypeptides and polysaccharides consisting of more than 24 units, or oligonucleotides.

Patrons searching for those structures should consider the Inorganic Crystal Structure Database (ICSD), the NRCC Metals Crystallographic Data File (CRYSTMET), the Protein Data Bank (PDB), or the ICDD NIST Crystal Data File. While many databases offer some features of CSD, such as chemical structure searching, no other available resource offers the full search capabilities or comprehensive records afforded by CSD.

For more information contact:
The Cambridge Crystallographic Data Centre
https://www.ccdc.cam.ac.uk/solutions/csd-system/components/csd/
12 Union Road, Cambridge, CB2 1EZ, United Kingdom.
Phone: +44 (0)1223 336408, Fax: +44 (0)1223 336033.

References

Battle, G., & Allen, F. 2012. Learning about Intermolecular Interactions from the Cambridge Structural Database. Journal of Chemical Education 89(1): 38. DOI: 10.1021/ed200139t.

Battle, G., Allen, F., & Ferrence, G. 2011. Teaching Three-Dimensional Structural Chemistry Using Crystal Structure Databases. 4. Examples of Discovery-Based Learning Using the Complete Cambridge Structural Database. Journal of Chemical Education 88(7): 891. DOI: 10.1021/ed1011025.

Battle, G. M., Ferrence, G. M., & Allen, F. H. 2010. Applications of the Cambridge Structural Database in chemical education. Journal of Applied Crystallography 43(5‐2): 1208-1223. DOI: 10.1107/S0021889810024155.

Groom, C. R., & Allen, F. H. 2014. The Cambridge Structural Database in Retrospect and Prospect. Angewandte Chemie International Edition 53(3): 662-671. DOI: 10.1002/anie.201306438.

Groom, C. R., Bruno, I. J., Lightfoot, M. P., & Ward, S. C. 2016. The Cambridge Structural Database. Acta Crystallographica Section B 72(2): 171-179. DOI: 10.1107/S2052520616003954.

Thomas, I. R., Bruno, I. J., Cole, J. C., Macrae, C. F., Pidcock, E., & Wood, P. A. 2010. WebCSD: the online portal to the Cambridge Structural Database. Journal of Applied Crystallography 43(2): 362-366. DOI: 10.1107/S0021889810000452.





Issues in Science and Technology Librarianship No. 92, Fall 2019. DOI: 10.29173/istl28