Rarely Analyzed: The Relationship between Digital and Physical Rare Books Collections
ARTICLE
Allison McCormack and Rachel Wittmann
INFORMATION TECHNOLOGY AND LIBRARIES | JUNE 2022
https://doi.org/10.6017/ital.v41i2.13415

Allison McCormack (allie.mccormack@utah.edu) is the Original Cataloger for Special Collections, University of Utah. Rachel Wittmann (rachel.wittmann@utah.edu) is the Digital Curation Librarian, University of Utah. © 2022.

ABSTRACT
The relationship between physical and digitized rare books can be complex and, at times, nebulous. When building a digital library, should showcasing a representative slice of the physical collection be the goal? Should stakeholders focus on preservation, high-use items, or other concerns? To explore these conundrums, a special collections librarian and a digital services librarian performed a comparative analysis of their library’s physical and digital rare books collections. After exporting MARC metadata for the rare books from their ILS, the librarians examined the place of publication, publication date, and broad subject range of the collection. They used this data to create a variety of visualizations with the freely available data visualization tool Tableau Public. Next, the authors downloaded the rare books metadata from the digital library and created a parallel set of data visualizations. Were the geographic, temporal, and subject scopes of the digital library similar to those of the physical rare books collection? If not, what accounted for the differences? The implications of these and other findings will be explored.

INTRODUCTION
As of August 2019, the Special Collections Division of the University of Utah J. Willard Marriott Library held over 256,000 printed works and archival collections.
Approximately 22% of the collection, or just over 55,000 works, belongs to the Rare Books Department (https://lib.utah.edu/collections/rarebooks/), which contains not only books but also serials, maps, manuscripts, ephemera, and other formats. The collection covers over 4,000 years of human history, with its earliest piece, a cuneiform tablet, dating to the mid-twenty-third century BCE; contains works from nearly 100 different countries; and represents a wide variety of topics, including the exploration and settlement of the American West and the history of the book. The Rare Books Department, a subset of Special Collections, specifically seeks to document the history of written human communication and actively collects historical items to enhance teaching and research at the University of Utah.

The Marriott Library has been adding digitized works from the Rare Books Department to its Digital Library (https://collections.lib.utah.edu/) for over 25 years. Approximately 780 works, or 1.42% of the rare books collection, have been digitized to date. However, no formal collection development plan was ever written, and rare books were selected for digitization by both curators and patrons. Unfortunately, the reason a particular item was digitized is not recorded in the system: it is unclear whether age, research value, physical condition, a desire to bring forward underrepresented stories, or a combination of these and other factors influenced the decision to digitize a rare book. This piecemeal approach to digital library collection development, while not uncommon, made it difficult for library staff and patrons to determine the relationship between the digital and physical collections of rare books.
It also presented challenges when library staff attempted to communicate the scope and intent of the Digital Library to patrons, who assumed that the digitized items were representative of the overall collection. Given their expertise in library metadata, the authors decided to analyze both traditional library catalog records and digital library records for the rare books collection and explore whether the digital collection was proportionally representative of the physical collection or if it differed in geographic, temporal, or subject scope in a meaningful way. They then created a series of data visualizations to better communicate information about the library’s rare books holdings.

LITERATURE REVIEW
While much has been written about methods and criteria for selecting special collections items to be digitized and the effects of digitization on collection accessibility, few authors have discussed the relationships between digital collections and the physical collections from which they were sourced.
In their highly detailed treatise on selection strategies for digitization, Ooghe and Moreels identify representativity, a method that “aims for a final selection that provides a representative view of the original collections,” as one of 25 selection criteria for digitization projects.1 However, Alexandra Mills notes that “without a thorough understanding of the institution and collections, it is impossible to create truly representative collections.”2 Because many digitization initiatives are undertaken in response to user requests, preservation concerns, or the availability of project-based funding, it is likely that most libraries do not plan for their digital collections to be representative of their overall special collections holdings. As Peter Michel states, the digital collections at the University of Nevada, Las Vegas, were explicitly built with popular history and popular culture in mind and were never intended to be “surrogates of the collection.”3 Bradley Daigle of the University of Virginia explained that digitization could be undertaken to alleviate preservation concerns, respond to defined research needs, or to brand certain online content, but this approach could give the mistaken impression “that only the important materials are digitized.”4

Despite the gaps in the literature, having an explicit collection development policy is still considered paramount; indeed, it is the very first principle listed in the National Information Standards Organization (NISO)’s framework for building “good” digital collections.5 To investigate this type of documentation further, a Google search was employed using the search term “digital collection development policy site:edu”.
This yielded 10 publicly accessible digital collection development policies from academic libraries in the United States:6
• Amherst College Library (https://www.amherst.edu/library/services/digital/digitalcolldev)
• Emerson College Archives and Special Collections (https://www.emerson.edu/policies/digital-collections-development-policy)
• Colorado State University Libraries (https://lib.colostate.edu/digital-collection-development-policy/)
• Florida Atlantic University Digital Library (https://library.fau.edu/policy/digital-library-collection-development-policy)
• Georgetown University Library (https://www.library.georgetown.edu/digital-project-policy)
• Northern Illinois University Digital Library (https://digital.lib.niu.edu/policy/collection-development-policy)
• Oregon Health and Sciences University Digital Collections (https://www.ohsu.edu/library/ohsu-digital-collections-development-policy)
• University of North Texas University Libraries (https://library.unt.edu/policies/collection-development-digital-collections/)
• Wesleyan University Digital Library (https://digitalcollections.wesleyan.edu/about/what-we-collect)
• Williams College Special Collections (https://specialcollections.williams.edu/collection-development-policies/digital-collections/)

In reviewing the sample of 10 universities’ digital collection development policies, homogeneous content becomes apparent. Almost all of the policies include a mission statement, scope, and selection criteria for potential digital collection items. All policies include criteria that physical materials should meet in order to qualify for digitization. The most common criteria for digitization are that materials be rare or unique, high-use, fragile, important to institutional or regional history, and/or supportive of campus curriculum or faculty research. In addition, a requirement for clearance to publish materials online is ubiquitous among the policies. Materials eligible for online display must either be in the public domain or have intellectual property rights held by the institution, and materials currently under copyright must receive permission from the copyright holder.

A measured approach to digitization qualification has been employed by the University of North Texas (UNT) Libraries’ Digital Collections and the Northern Illinois University Digital Library (NIUDL). UNT Libraries’ Digital Collections policy lists levels of criteria that materials must meet in order to be digitized and included in the digital library; to qualify for digitization, all criteria on level one must be met, while only one criterion from level two is needed. NIUDL includes a Priority Factor Rubric, which pairs criteria categories with a numerical scale that has a maximum of 35 points, with a higher value signifying an elevated priority.

Six of the 10 policies include prioritizing materials that support diversity and inclusion missions on campus. Amherst College has leveraged its digital collection development policy to include content that would increase perspectives of underrepresented groups within the digital collections and traditionally underrepresented groups more broadly.
NIUDL includes marginalized groups as a collection priority area in order to “deepen public understanding of the histories of people of color and other communities and populations whose work, experiences, and perspectives have been insufficiently recognized or unattended” and lists over 20 such groups.

The collection candidate’s relationship to other collections is outlined in four of the 10 policies. Georgetown University requires that “the materials form a coherent collection, fill gaps in existing collections, or complement existing collection strengths.” Amherst College evaluates whether digitization would “enhance public awareness of Archives’ Collection strengths.”

Another function of a digital collection development policy is to inform the public about the scope and provenance of the contents of a digital library. The UNT Digital Collection Policy includes a section outlining the content contributors, including partners, which can be beneficial for large-scale digital libraries that host collections from multiple partners. UNT is also exemplary in defining collection curators and their responsibilities while underscoring that this role is likely to change over time and is not tied to an individual. With no written digital collection development policy regarding special collections at the Marriott Library, the authors would first have to analyze both the physical and digital special collections before determining what factors may have influenced the digitization of these materials.

Libraries are gathering massive amounts of data, ranging from the metadata of their varied collections to patron usage statistics of both physical and digital collections.
Interpretation of the ever-growing accumulation of data can quickly become complex. By visualizing data, we are able to interpret large and often messy sets of data while processing multiple aspects of the data concurrently. For example, the Ohio State University (OSU) Libraries used Tableau Desktop to combine data from various departments in order to better manage and explore information.7 Tableau was OSU’s data visualization software of choice due to its ease of use and accessibility, and the program was also used to create dashboards that blend data from various sources for real-time visualizations.

BIBLIOGRAPHIC METADATA CLEANUP
To understand the Marriott Library’s collections, one must first understand the relevant metadata, which for the Rare Books Department is in the Machine-Readable Cataloging (MARC) format. A common criticism of MARC, the format used in traditional library cataloging, is that the schema is highly regulated and, at times, redundant. However, for the purposes of this project, those qualities proved to be a boon. An older, uncorrected record in the Digital Library might list London as the place of publication for a particular book, but it was not immediately apparent whether that referred to London, England; London, Ontario; or London, Ohio.
A MARC record, by contrast, would not only list a book’s city of publication in the 260 or 264 field but would also contain a two- or three-letter code in the 008 field that specified the country, US state, Canadian province or territory, or Australian state or territory in which it was published. For this reason, the authors decided to base their analysis on MARC record data from the physical collection instead of the Dublin Core metadata used in the Digital Library.

In order to tease out the relationships between the digital and physical collections, each of the approximately 55,000 rare books bibliographic records stored in Alma, the Marriott Library’s cloud-based library services platform, would have to have a common set of data points that could be compared. For the purposes of this analysis, the authors chose to investigate the place of publication and the subject of each work.

Despite the relative rigidity of MARC metadata, some of the Alma records lacked country of publication data in the 008 field. These records were not incorrect, but merely outdated: some had been copied directly from paper catalog cards when the library first transitioned to a computer-based cataloging system, while others were created using different metadata standards. Approximately 6,000 rare books records either completely lacked a country code in the 008 field or had data that could be enhanced by, for example, replacing a code for the United States with a code for a particular state. Instead of editing all 6,000 records by hand, the cataloger wrote several metadata normalization rules in Alma to automatically correct the most obvious errors. Records that listed Chicago as the place of publication were assigned the MARC geographic code for Illinois, while those that were published in Lugduni Batavorum, the Latin designation for Leiden, were given the geographic code for the Netherlands.
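The logic of such rules can be sketched in a few lines of Python. This is an illustration only, not Alma's normalization rule syntax and not the cataloger's actual rules: the rule table, the list of ambiguous city names, the set of values treated as improvable, and the function names are all assumptions made for the example. The codes `ilu` (Illinois) and `ne` (Netherlands) are the MARC country codes for the two places named above, and the country code is assumed to have been read from positions 15-17 of the 008 field.

```python
"""Sketch of rule-based enrichment of the MARC 008 country code."""

# Hypothetical rule table: normalized place of publication -> MARC country code.
PLACE_RULES = {
    "chicago": "ilu",            # Illinois
    "lugduni batavorum": "ne",   # Latin name for Leiden -> Netherlands
}

# Hypothetical list of city names too ambiguous to resolve automatically.
AMBIGUOUS_PLACES = {"cambridge", "london"}

# Assumption: values treated as missing or improvable. Blank, "xx" (no place
# or unknown), "|||" (no attempt to code), and "xxu" (United States with no
# state specified).
IMPROVABLE_CODES = {"", "xx", "|||", "xxu"}


def normalize_place(raw_place):
    """Lowercase a 260/264 $a value and drop brackets and ISBD punctuation."""
    return raw_place.strip(" []:;,.?").lower()


def enrich_country_code(code_008, place):
    """Return (code, status) for one record.

    status is "kept" when the existing code is already specific,
    "updated" when a rule supplied a better code, and "needs review"
    when a person has to look at the record.
    """
    current = code_008.strip().lower()
    if current not in IMPROVABLE_CODES:
        return current, "kept"
    key = normalize_place(place)
    if key in AMBIGUOUS_PLACES or key not in PLACE_RULES:
        return current, "needs review"
    return PLACE_RULES[key], "updated"


if __name__ == "__main__":
    samples = [("xxu", "Chicago :"), ("   ", "Lugduni Batavorum,"),
               ("enk", "London"), ("xx ", "[Cambridge]")]
    for code, place in samples:
        print(repr(code), repr(place), "->", enrich_country_code(code, place))
```

Records that come back as "needs review" correspond to the cases a rule cannot settle, such as an ambiguous city name or an unknown place of publication.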
However, 3,000 records could not be enhanced in this manner, either because their place of publication was an ambiguous city name like Cambridge or because the place of publication was listed as unknown. The cataloger examined each record individually and was ultimately unable to assign a MARC geographic code to 1,682 records, most of which were Arabic manuscripts or advertising pamphlets that simply did not list a place of publication or creation. While these records would be excluded from the place of publication analysis, they could be mined for data on other topics. With the MARC records as complete as possible, the metadata was exported from Alma into an Excel spreadsheet and given to the metadata librarian for further manipulation.

METADATA TRANSFORMATION & VISUALIZATION CREATION
The next phase involved transforming the raw metadata from MARC codes into the human-readable data needed to produce data visualizations. Once the physical rare books’ bibliographic metadata was updated in Alma, it was exported as a comma-separated values file. The raw data export produced a massive spreadsheet containing over 50,000 MARC records. These records included two- and three-letter location codes for the place of publication from the Library of Congress MARC Code List for Countries. Two-letter codes are used for most countries, while three-letter codes are used for states within the United States, provinces within Canada, and the constituent countries of the United Kingdom. While this additional level of location data was available for books from the United Kingdom and Canada, it was decided to review the collection at a country level for consistency and map display. Books from the United States, however, were analyzed on a state level, because the research concerns an American institution.
Using a list correlating these codes to the location name provided by the Library of Congress (https://www.loc.gov/marc/countries/countries_code.html), a VLOOKUP formula was used in Microsoft Excel to add the location names to the MARC records. The VLOOKUP formula pulls in data from one table to another as long as the two tables have one data field in common. In this exercise, both tables of data contained the Library of Congress location codes; therefore, the LC location codes were used to add the location names to the table containing the MARC metadata. Once the location names were added, some additional quality control steps were required, as LC location names that included outdated country names posed problems for mapping the data to current country names and boundaries. For example, the codes for East Germany and West Berlin were merged into the one representing contemporary Germany. For countries that have since been dissolved and divided into multiple countries, e.g., the USSR and Czechoslovakia, the records were manually checked for city names and then assigned to the current country. Once this process was completed, the results showed the rare books were published in 97 countries and all 50 US states, as well as the District of Columbia.

Examining the subject content of the rare books physical collection was another aspect of analysis for this project. In contemplating this analysis, using the LC Subject Heading field was considered; however, the faceting of LC Subject Headings and the structure of the exported data posed too many issues for a rather simple analysis. Instead, the Library of Congress call number was used to extract high-level LC classification information for each work by separating the first two letters of the call numbers included in the exported MARC metadata, which indicated LC class and subclass.
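For readers who prefer a script to a spreadsheet, the two steps just described can be approximated in Python. The sketch below is not the authors' workflow, which used Excel: a dictionary lookup stands in for VLOOKUP, the name tables are small illustrative subsets with simplified display names, and the function names are hypothetical. The legacy codes shown (`ge` for East Germany and `wb` for West Berlin, both folded into `gw` for Germany) are given as understood from the MARC Code List for Countries and should be checked against the current list before reuse.

```python
"""Sketch: code-to-name lookup and LC class extraction with plain Python."""
import re

# Illustrative subset of the code -> display name list (names simplified).
PLACE_NAMES = {
    "gw": "Germany",
    "fr": "France",
    "ne": "Netherlands",
    "ilu": "Illinois",
    "nyu": "New York",
    "utu": "Utah",
}

# Discontinued codes folded into their present-day equivalent.
LEGACY_CODES = {"ge": "gw", "wb": "gw"}

# Illustrative subset of LC class and subclass captions.
LC_NAMES = {
    "P": "Language and Literature",
    "PS": "American literature",
    "Q": "Science",
    "QA": "Mathematics",
    "Z": "Bibliography. Library Science. Information Resources (General)",
}


def lookup_place(code):
    """Return the display name for a country code, or None if unmatched.

    None plays the role of the #N/A result that VLOOKUP gives for a
    value missing from the lookup table.
    """
    cleaned = code.strip().lower()
    cleaned = LEGACY_CODES.get(cleaned, cleaned)
    return PLACE_NAMES.get(cleaned)


def lc_class(call_number):
    """Return (class, subclass) letters from the start of an LC call number.

    The class is the first letter; the subclass is the first one or two
    letters, following the two-letter approach described in the text.
    Returns None when the call number does not begin with a letter.
    """
    match = re.match(r"\s*([A-Za-z]{1,3})", call_number)
    if match is None:
        return None
    letters = match.group(1).upper()
    return letters[0], letters[:2]


if __name__ == "__main__":
    print(lookup_place("wb "))
    print(lc_class("PS3515 .E37 O4"))
    print(LC_NAMES.get(lc_class("QA31 .E8")[1]))
```

Keeping unmatched codes as `None` rather than guessing a name makes it easy to list the records that still need manual checking, such as those carrying codes for dissolved countries.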
To add the LC class and subclass names to these letters, a VLOOKUP formula was used again to match the letter codes to the list of LC classification categories. Once classification categories were added to the 55,000 records, works from all 21 LC master classes and 190 subclasses were represented in the rare books collection.

In addition to the physical rare books collection held at the Marriott Library, a selection of this collection has been digitized and is accessible in the Marriott Digital Library. The Rare Books digital collection (https://collections.lib.utah.edu/search?facet_setname_s=uum_rbc) comprises 780 works, although this number includes unique records for individual volumes within a series and therefore is not directly comparable to the MARC metadata, which contains one record for a series. For example, the Silver Reef Miner, a newspaper “devoted to the mining interests of Southern Utah” published during the late nineteenth century, has 30 individual volumes in the Digital Library, but these are represented in just one MARC record. In order to compare the digital collection to the physical collection, the datasets would need to have consistent data for comparison, namely place of publication and LC classification derived from call numbers. The digital collection metadata is in the Dublin Core schema, which does not include all of the metadata found in the MARC metadata, nor does it use the same format. While there is a Dublin Core spatial element used to capture geographic data on what the item is about, this does not always align neatly with the location of an item’s publication.
For example, Reise in das innere Nord-America in den Jahren 1832 bis 1834 (2 volumes) is a book printed in Germany that documents an expedition to North America in 1832–1834 and includes illustrations of Native American people by the Swiss artist Karl Bodmer. For these volumes, the appropriate Dublin Core spatial data would include the specific regions of North America to which the expedition traveled; the MARC 26X field, however, contains Koblenz, Germany, the city where the volumes were published. Call number data was included for many digitized works, but not in a consistent format. In order to use the same data to compare the physical rare books collection to the digital one, the digital collection metadata was updated with the corrected call numbers found in the MARC metadata. Another improvement to the digital collection metadata was the addition of Metadata Management System (MMS) IDs, unique numerical identifiers that aid in locating a record in the Alma system. When the rare books’ descriptive metadata was originally converted to Dublin Core during the digitization process, some titles and call numbers were changed and no longer matched those of their physical counterparts. The inclusion of the MMS ID provides a consistent identifier between the physical and digital collections.

When selecting data visualization software, being able to create a map of the places where books in the rare books collection were published was a priority. Considering the goal of creating an easily replicable workflow for other libraries, the authors sought a freely accessible program that, unlike Esri’s ArcGIS software, did not require advanced geospatial skills. Tableau is a data visualization software package available in both a public and a desktop version. The Tableau Desktop version requires a subscription fee, while Tableau Public is free to use.
For the purposes of this study, Tableau Public offered free access and mapping features that can be used without any geospatial knowledge.

ANALYSIS
Creating a variety of data visualizations made information about the Rare Books physical and digital collections more apparent than it would be from merely browsing entries in a spreadsheet. For example, there are numerous geographic disparities between the two collections of rare materials, as shown in the American states in which works from the collections were published. While books from all 50 states are found in the physical collection (fig. 1), only 18 states are represented in the Digital Library (fig. 2). In both collections, New York is the state in which the highest number of books were published. As New York City has long been a major publishing center in the United States, the authors were not surprised by this. However, the subsequent states were quite different: California and Utah ranked second and third for the physical collection, while Massachusetts and Pennsylvania claimed those spots for the Digital Library. The authors believe several factors might influence this discrepancy. First, works can only be added to the Digital Library if they are no longer in copyright, and states with longer histories of European-American settlement are more likely to have published books that are now out of copyright. Furthermore, these older books are more likely to be in a fragile condition and therefore may have been digitized to decrease the amount of physical handling to which they are subjected.

Figure 1. Marriott Library physical rare books by US state.

Figure 2. Marriott Library digital rare books by US state.

There are other discrepancies when comparing the country of publication between the physical (fig. 3) and digital collections (fig. 4).
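The comparisons in this section rest on a simple calculation: each place's share of each collection, and the gap between the two shares. The authors produced their figures in Tableau Public; the Python sketch below shows only the underlying arithmetic. The function names and the toy records in the usage example are hypothetical and do not reproduce the library's data.

```python
"""Sketch: compare category shares between two collections."""
from collections import Counter


def shares(values):
    """Return each category's percentage share of a list of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


def compare_shares(physical, digital):
    """Return rows of (category, physical %, digital %, difference).

    The difference is digital minus physical, so a positive number means
    the category is over-represented online. Rows are ordered by the
    size of the gap, largest first.
    """
    physical_shares = shares(physical)
    digital_shares = shares(digital)
    rows = []
    for category in sorted(set(physical_shares) | set(digital_shares)):
        p = physical_shares.get(category, 0.0)
        d = digital_shares.get(category, 0.0)
        rows.append((category, p, d, d - p))
    return sorted(rows, key=lambda row: abs(row[3]), reverse=True)


if __name__ == "__main__":
    # Toy data for illustration only, not the Marriott Library's records.
    physical = ["United States"] * 6 + ["Egypt"] * 2 + ["Germany"] * 2
    digital = ["United States"] + ["Germany"] * 3 + ["France"]
    for category, p, d, gap in compare_shares(physical, digital):
        print(f"{category:15} physical {p:5.1f}%  digital {d:5.1f}%  gap {gap:+6.1f}")
```

A category that appears in only one collection still gets a row, with a share of zero on the other side, which is how a place present in the stacks but absent online would show up.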
While 61% of the physical rare books were published in the United States, only 20% of the digitized works were published in this country. The authors expected to see Egypt rank highly in the physical collection, as many of the rare books were purchased by former University of Utah professor Dr. Aziz Atiya to support the Middle East Center for research he founded; similarly high in rank, Britain, Germany, France, and Italy were all major centers of the printing and publishing trade in early modern Europe. However, there is strong geographic bias in the digital collection, as only North America, Western Europe, and one African country are represented online. Copyright may again play a factor, as the earliest books from non-Western countries in the collection often date to the twentieth century, but a Eurocentric or other bias cannot immediately be discounted. While the physical collection contains many more European imprints than imprints from the Global South, it is much more diverse than the digital collection.

Figure 3. Marriott Library physical rare books by country.

Figure 4. Marriott Library digital rare books by country.

Analyzing the subjects represented in the collection proved somewhat challenging. Due to the nature and structure of Library of Congress Subject Headings, which attempt to mirror natural language and may be composed of “strings” of phrases to represent complex topics, no Tableau Public visualization could be created that effectively grouped similar content areas together without looking quite fragmented.
Instead, the authors based their analysis of subjects on Library of Congress classification numbers (i.e., call numbers) assigned to works, which, though not exact, can be understood as distillations of the subject of a work.8 Once again there were considerable differences between the physical and digital rare books collections (fig. 5). As in many generalized special collections, literature and history make up significant portions of the physical collection. However, works on bibliography, or the study of books and book history, comprise a notable percentage of the collection. Many of these are modern works on book history and special collections librarianship and therefore cannot be digitized due to copyright law. Nearly 9% of the digital collection is on the sciences, though these works comprise only 3% of the physical collection. While this portion of the holdings may be relatively small, it contains many scientific high points such as Vesalius’ De Humani Corporis Fabrica, early printings of ancient mathematical texts, and the journals of major scientific societies, which may have been digitized both for physical preservation and because of high interest on the part of students and faculty on campus.

Figure 5. Percentage of rare books physical and digital collections by Library of Congress class.

NEXT STEPS
Now that the first phase of the project is complete, the authors would like to conduct additional analyses. First, they plan to compare the usage statistics of the digital rare books collection to the circulation statistics of the physical collection. This method of inquiry was not possible at the start of the project, as circulation information for the rare books was previously not tracked in the integrated library system. Now that rare books are checked out to patrons for use in the Special Collections Reading Room, this data can be quickly pulled from Alma.
Once there is a year’s worth of circulation data for the rare books unhindered by the changes necessitated by the coronavirus pandemic, the authors will compare the usage statistics of the digital collection for the same time period. Do patrons in the reading room look at similar materials to online patrons, or are their interests vastly different? Are some rare books used so frequently that they would benefit from the added physical security that digitization brings?

The authors also plan to pull annual usage statistics for the digitized rare books and share them with Special Collections Division leadership. Online patrons are still library patrons, and the division can use the viewing data to show the national and international reach of the collection. Relatedly, the authors will investigate the Digital Library usage data in more depth. Do patrons from Utah, the United States, and the world look at similar materials, or are there geographic divides among the online patrons? Do countries that are home to a majority of the University’s international student body have higher viewership numbers?

Finally, the authors wish to convene a group of stakeholders to create a formal collection development plan for the rare books component of the Digital Library. Given the library’s limited resources, it is imperative that digitization be done thoughtfully and systematically. There is a good rationale for creating a digital collection that is representative of the physical rare books collection as well as one that highlights certain collection areas. Both material fragility and the modern scholarly emphasis on highlighting the stories of people of color, women, and other underrepresented groups in library collections provide strong counterarguments to making digital libraries strictly representative of their physical counterparts.
Since informal conversations with patrons of the Marriott Library revealed that they assumed the Digital Library was representative of the collection overall, it is imperative that this assumption be either confirmed or disclaimed in a publicly viewable statement.

In the case of the Rare Books Department, the authors are in favor of a focused, rather than representative, collection development policy. Firstly, many of the books in the collection are under copyright and therefore cannot be digitized, while other materials like reference sources for rare books librarians will be of limited interest to the general public. Furthermore, complex items such as artists’ books are often poor candidates for digitization, as they may have movable components that cannot be captured accurately in a still photograph. As for what should be included online, the authors fully support equity, diversity, and inclusion efforts at the University of Utah and would like to see the Digital Library highlight materials from marginalized groups whenever possible. Usage statistics from the physical and digital collections, when they become available, should also inform the collection development policy to encourage traffic to the Digital Library. Whatever is ultimately decided, however, the clarity a written policy provides will help streamline decision-making and ultimately help both library staff and patrons understand and search within the Digital Library much more effectively.

ENDNOTES
1 Bart Ooghe and Dries Moreels, “Analysing Selection for Digitisation: Current Practices and Common Incentives,” D-Lib Magazine 15, no. 9/10 (2009), https://doi.org/10.1045/september2009-ooghe.
2 Alexandra Mills, “User Impact on Selection, Digitization, and the Development of Digital Special Collections,” New Review of Academic Librarianship 21, no. 2 (2015): 166, https://doi.org/10.1080/13614533.2015.1042117.
3 Peter Michel, “Digitizing Special Collections: To Boldly Go Where We’ve Been Before,” Library Hi Tech 23, no. 3 (2005): 382, https://doi.org/10.1108/07378830510621793.
4 Bradley J. Daigle, “The Digital Transformation of Special Collections,” Journal of Library Administration 52, no. 3–4 (2012): 253, https://doi.org/10.1080/01930826.2012.684504.
5 NISO Framework Working Group, A Framework of Guidance for Building Good Digital Collections (2007), https://www.imls.gov/sites/default/files/publications/documents/framework3.pdf.
6 The URLs in the following list were accurate as of March 2, 2022.
7 Sarah Anne Murphy, “Data Visualization and Rapid Analytics: Applying Tableau Desktop to Support Library Decision-Making,” Journal of Web Librarianship 7, no. 4 (2013): 465–76, https://doi.org/10.1080/19322909.2013.825148.
8 Readers who do not work with MARC metadata may not be familiar with how Library of Congress call numbers are assigned. Created in 1891, the classification system is based on 21 classes designated by a single letter; subclasses add one or two letters to the initial class. Catalogers must choose which one of the classes to assign to a particular work. The subject headings may guide a cataloger towards a certain class, but there is not a 1:1 relationship between subject headings and call number classes.