Recommended Best Practices for Digital Image Capture of Musical Scores

Jenn Riley
Ichiro Fujinaga

The authors
Jenn Riley is Digital Media Specialist at Digital Library Program, Indiana University, Bloomington, Indiana, USA.
Ichiro Fujinaga is Assistant Professor of Music Technology at Faculty of Music, McGill University, Montréal, Québec, Canada.

Keywords
Best practice, digitization, musical scores

Abstract
Musical scores, as complex visual articles with small details, are difficult to digitally capture and deliver well. All capture decisions should be made with a clear idea of the purpose of the resulting digital images and must be flexible enough to fulfill unanticipated future uses. Best practices for detail and color capture are presented for creating an archival image containing all relevant data from the print source, based on commonly defined purposes of digital capture. Options and recommendations for file formats for archival storage, web delivery and printing of musical materials are presented.

Introduction
Libraries and archives embarking on digital imaging projects today have a great deal more guidance for decision-making than they did just a few years ago. Standards and best practices for many types of originals have emerged, from the early NARA (National Archives and Records Administration) Guidelines (Puglia and Roginski, 1998), Cornell University’s Digital Imaging for Libraries and Archives (Kenney and Chapman, 1996), its successor Moving Theory into Practice (Kenney and Rieger, 2000), and the Library of Congress’ documentation for the American Memory project (Fleischhauer, 1998; Library of Congress, 2000) to the Arts and Humanities Data Service’s Guides to Good Practice series (Arts and Humanities Data Service, 2002) and the NINCH (National Initiative for a Networked Cultural Heritage) Guidelines (NINCH, 2002).
These standards and best practices documents take a wide variety of approaches, from prescriptive lists of appropriate resolutions and bit depths for various formats to explanations of decision-making processes to determine specifications individually for each item to be digitized. They cover many formats of originals, but tend to focus on photographic and printed textual materials. Much of the information in these guidelines can be transferred to the digital capture of musical scores. However, musical notation has a much greater need for accurate detail capture. Staff and ledger lines, dots and bars are all very small details, and any loss of detail results in a significant loss of meaning. This paper will present some best practice guidelines for decision-making for digital image capture of musical scores.

Defining the purpose of scanning
Before any decisions regarding capture specifications can be made, the purpose of the imaging project must be clearly defined. Is the musical score important as a historical artifact, or is only the musical content within worth preserving? Manuscripts, rare materials, and those with annotations by a collector are examples of scores that would require artifactual treatment. Mass-printed publications now in poor condition may be candidates for content-only capture. As the capture of paper watermarks has been covered elsewhere (Edge, 2001; Kenney and Rieger, 2000; Stewart, Scharf and Arney, 1995; Wenger et al., 1995), it will not be covered here. Note that not all materials are good candidates for digital imaging: rare and fragile materials might best be captured for preservation on medium-format color film, such as Ilford’s Ilfochrome Micrographic, which has an estimated 300-year life expectancy.

While we cannot anticipate all future uses of our digital images, our digitization decisions must be made to ensure that master images are flexible enough for a variety of uses.
For musical scores, master images should at the very least support the creation of derivative versions for web delivery, printing and Optical Music Recognition (OMR).

Master file specifications

Resolution
Scanning resolution must be set to capture all important detail from the original. One method is to derive the minimum scanning resolution from the stroke width of the smallest detail (Kenney and Rieger, 2000, pp. 46-47). For musical notation, this smallest detail is generally the white space between beams (see Figure 1). While Kenney and Rieger advocate capturing the smallest detail with 2 pixels for adequate reproduction of the stroke with a grayscale scan, 3 pixels per detail are required for successful OMR with the forthcoming Gamera software (MacMillan et al., 2001).

Figure 1. An example of very small spacing between beams (scanned at 600dpi). An online version of this image is available at

However, details in musical notation are consistently smaller than 1mm and are difficult to measure accurately without specialized equipment. Also, since printing sizes of musical notation are not consistent between different publications, this method would have to be applied individually to each piece of music to be scanned. Because of these problems, for most projects it would be more appropriate to simply capture all images at the same resolution.

Our tests have found that 600dpi is a sufficient resolution to capture all significant detail for most musical notation, as seen in Figure 2, where the 600dpi scan more adequately renders the ledger line and the sharp sign. This resolution will capture detail as small as 0.005in (0.127mm) with the required 3 pixels. For larger printed notation, 300dpi may be sufficient. Our preliminary studies show that resolutions above 600dpi generally do not offer much advantage for the purpose of web viewing, printing, or OMR.
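The relationship between scanning resolution and the smallest detail captured can be checked with a short calculation. The following is a minimal sketch in Python rather than part of any tool described in this paper; the function names are hypothetical, and the only inputs are the figures quoted above (3 pixels per detail, 600dpi) together with the standard 25.4mm per inch.

```python
MM_PER_INCH = 25.4


def required_dpi(detail_inches: float, pixels_per_detail: int = 3) -> float:
    """Minimum resolution that covers a detail of the given width
    with the requested number of pixels."""
    if detail_inches <= 0:
        raise ValueError("detail width must be positive")
    return pixels_per_detail / detail_inches


def smallest_detail_inches(dpi: float, pixels_per_detail: int = 3) -> float:
    """Width of the smallest detail that a scan at `dpi` covers
    with the requested number of pixels."""
    if dpi <= 0:
        raise ValueError("resolution must be positive")
    return pixels_per_detail / dpi


if __name__ == "__main__":
    # Compare the resolutions discussed in the text.
    for dpi in (300, 600):
        inches = smallest_detail_inches(dpi)
        print(f"{dpi}dpi: {inches:.4f}in ({inches * MM_PER_INCH:.3f}mm)")
```

Passing `pixels_per_detail=2` gives the looser grayscale criterion attributed to Kenney and Rieger; the default of 3 reflects the OMR requirement.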
The lack of benefit above 600dpi holds even for miniature scores, as shown in Figure 3, where there is an improvement from 300dpi to 600dpi but no clear improvement in the 1200dpi or 1600dpi scans. Grayscale versions of these sample images and some others can be found at .

Figure 2. Small detail scanned at a) 300dpi, b) 600dpi.

Figure 3. Miniature score scanned at a) 300dpi, b) 600dpi, c) 1200dpi, d) 1600dpi.

Color Reproduction and Bit Depth
Musical notation must be captured in grayscale, as 1-bit (bitonal) scanning is generally not adequate to capture all important detail. (See for a comparison of bitonal and grayscale detail reproduction.) If color is used on the page in a meaningful way, such as on sheet music covers, color scanning should be used. Grayscale and color scanning should both use at least 8 bits per channel, and higher bit depths may be appropriate for some uses. In order to preserve this full color range, any image manipulations done according to the guidelines below should be performed in the scanning software at the time of capture, not after capture with an image-editing application. It is important to understand that image manipulations done to the master file, including straightening, reduce the amount of data present in the master image. They should be done only to achieve the goal of the master image: to reproduce an artifact or to maximize capture of its musical content.

Before doing any image adjustments, the imaging system must be set up properly to ensure that the scanner accurately sees the color of the printed original and the image displayed on the monitor accurately represents the data in the image. The scanner and monitor should both be characterized and managed via International Color Consortium (ICC) profiles. Operating-system level color management exists both for Macintosh in the form of ColorSync and for Windows 98, 2000, ME, and XP in the form of Image Color Management (ICM).
Locally created ICC profiles for each device, such as those created with Monaco Systems’ software, are preferable to generic profiles for a specific model of scanner or monitor.

Once a system is properly calibrated, it should capture reasonably color-accurate versions of the original printed materials. If the purpose of the imaging project is to capture the artifact as it exists today, no corrections should be made to the master images. Every effort should be made to ensure pages are straight during capture, as rotating them in image-editing software can result in a loss of detail. If capture of the musical content rather than visual content has been determined as the purpose of the scan, the contrast between the musical notation and background of the page should be maximized. A well-contrasted page will have completely filled-in note heads, solid staff lines, and clean white space between staff lines when viewed at 100% magnification in image-editing software.

Master File Formats
Uncompressed TIFF is generally suggested as the most appropriate file format for master files (Fleischhauer, 1998; Puglia and Roginski, 1998). However, TIFF is not a true, but instead a de facto, standard. The PNG (Portable Network Graphics) format may be an emerging replacement for TIFF for this purpose (Roelofs, 1999). PNG has the technical capabilities to store all relevant information captured according to these guidelines. It can use lossless compression, and produces significantly smaller files than uncompressed TIFF files and various JPEG lossless compression schemes (Santa-Cruz et al., 2000). Most archival imaging projects, however, still use TIFF as the master file format, and it may be some time before it is clear whether the digital library community as a whole accepts PNG as a master file format.

Storage of Master Files
Proper storage of master files is perhaps the most difficult aspect of managing a digital imaging project.
One possible system allowing for multiple copies of master and derivative files on a variety of media is described at . However, even basic configurations such as this one will not be available to many smaller institutions that embark on digital projects without sufficient technical support. Storage of master files on optical media such as CD-R and DVD-R is a short-term solution and should be supplemented by efforts to increase access to long-term data storage.

Web Delivery

File Formats
Regardless of whether master files are captured as artifacts or just for content, methods for delivering the images via the web are the same. At first glance, there appears to be an extremely large number of file format options for web delivery. Using an open format is not as important for delivery images as it is for master images. However, some choices are better than others, and the final decision regarding web delivery format should take into account three major considerations: availability of web viewers for the format, support for multi-page images, and file size. Table 1 sorts possible delivery formats according to the first two of these criteria.

Table 1. Comparison of some web-deliverable image formats

Availability of web viewer    | Multi-page support: Yes | Multi-page support: No
All graphical web browsers    | (none)                  | JPEG, GIF, PNG*
Requires common plug-in       | PDF                     | (none)
Requires uncommon plug-in     | TIFF, DjVu              | JPEG2000

*PNG has a reputation for poor web browser support, but current support problems are with advanced PNG functionality, namely, alpha transparency. Simple page images in PNG format will display properly in recent versions of all major web browsers, including Netscape and Internet Explorer versions 4 and above, on all major platforms. Usage of pre-version 4 browsers is now at less than 1%, according to Jupitermedia’s The Counter.

Unfortunately, there is not a multi-page image format with native support in the mainstream graphical web browsers.
While tools for viewing PDF files are fairly widespread and easily available, the PDF format was not designed for efficient compression of scanned images. Converting score images to PDF at acceptable resolutions for screen viewing and printing, even for short pieces, will generally result in prohibitively large file sizes. Other formats, such as DjVu (Bottou et al., 1998) and JPEG2000 (Santa-Cruz et al., 2000), hold promise for more efficient web delivery in the future but are not currently widespread enough to be appropriate for delivery to a wide audience. Instead, single-page image formats should be used together with some sort of “page-turning” mechanism in the user interface. To accomplish this, metadata describing the structural relationship of page images to one another must be stored, for example, in the METS schema, and used to generate HTML code for navigation within the score. The choice between JPEG, GIF, and PNG is affected partially by file size, which is a function of the pixel dimensions of the display image.

Pixel Dimensions
The size of score images for screen display depends on the size and type of your original and the characteristics of your users. Most standards and best practices for web delivery of digital images focus on determining a fixed set of pixel dimensions for images, balancing the amount of detail presented with the need to fit an image on a user’s screen. However, for musical notation, the readability of the page and the level of detail presented are essential, and thus are more important than making an entire score page visible at a glance. Downsizing master score image files to 100-200 dpi from their original page size should result in screen-readable images from most sizes of originals, as seen at . As these images show, for all but the smallest printed notation, web-deliverable images can be created that show all necessary detail without requiring horizontal scrolling.
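The pixel dimensions implied by a given delivery resolution follow directly from the physical page size. The sketch below, in Python, is only an illustration of that arithmetic: the function names are hypothetical, the 9 x 12 inch page is an assumed example size, and the 600dpi master resolution is taken from the recommendation earlier in this paper.

```python
def derivative_size(width_in: float, height_in: float, target_dpi: int) -> tuple[int, int]:
    """Pixel dimensions (width, height) of a derivative image for a page of
    the given physical size, delivered at target_dpi."""
    return round(width_in * target_dpi), round(height_in * target_dpi)


def scale_factor(master_dpi: int, target_dpi: int) -> float:
    """Linear factor by which the master image must be downsized."""
    if master_dpi <= 0:
        raise ValueError("master resolution must be positive")
    return target_dpi / master_dpi


if __name__ == "__main__":
    # Assumed example: a 9 x 12 inch page scanned at 600dpi.
    for dpi in (100, 150, 200):
        width, height = derivative_size(9, 12, dpi)
        factor = scale_factor(600, dpi)
        print(f"{dpi}dpi: {width} x {height} pixels (scale {factor:.3f})")
```

Comparing the resulting height with the vertical resolution of the displays a collection's users are expected to have gives a rough sense of how much scrolling a page will need.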
Vertical scrolling, however, will still be required at many screen resolutions.

At these sizes, there is very little visual difference between grayscale JPEG, GIF, and PNG files of musical score pages. JPEG files are preferable to GIF files for two reasons. First, we have found that for grayscale notation pages, JPEG images at medium-high to high quality tend to be smaller than GIF files and do not show obvious compression artifacts at these sizes. Scores with large printing can be compressed more heavily, down to what many define as “medium” quality (e.g. 50% in utilities such as ImageMagick and GraphicConverter, or level 6 in Adobe Photoshop). Second, for color images, GIF files are unsuitable because the GIF format is limited to an 8-bit palette, which can result in unacceptable color-shifting. PNG offers an advantage over JPEG in that it can use lossless compression. We have found PNG files for web delivery of scores to be smaller than high-quality JPEGs but larger than medium-high quality JPEGs. Some average file sizes for the different formats can be found in Table 2.

Table 2. Representative file sizes for web-deliverable images from 9” x 12” original

Format                     | 200dpi | 150dpi | 100dpi
GIF                        | 598K   | 389K   | 216K
PNG                        | 500K   | 326K   | 180K
JPEG high quality          | 647K   | 421K   | 280K
JPEG medium high quality   | 411K   | 268K   | 137K
JPEG medium quality        | 332K   | 215K   | 111K

For some collections it may be appropriate to provide thumbnail-sized images for browsing. While thumbnails of notation pages would generally not be very useful, thumbnail browsing of sheet music covers may be desirable. Images downsized to 5-25 dpi from their original page size should produce thumbnail-sized images. The compression method should be either JPEG (medium to high quality) or PNG.

Processing Filters
While no image processing should be done on the master files, it may be appropriate when creating derivatives for web display in order to increase their readability.
Depending on the size and quality of the original, sharpening, deskewing and thresholding filters may be appropriate for use when creating web-deliverable images of musical scores.

Printing
Printing is a much greater need for digitized musical score collections than for many other formats. While it may not be important to be able to print colored covers or pages from original manuscripts, score pages intended for practice or performance will need print capability. While the exact best file format for print versions of score images may vary between user populations, score images for printing on laser printers are generally best presented as bitonal files at 250–400dpi, depending on the original print size (see examples at ). At lower resolutions, bitonal PNG files on average are smaller, while at higher resolutions, Group 4 compressed TIFF files on average are smaller, as shown in Table 3.

Table 3. PNG and Group 4 compressed TIFF file size comparison for bitonal images.

Resolution | PNG    | TIFF (Group 4)
800dpi     | 329 KB | 192 KB
400dpi     | 183 KB | 146 KB
250dpi     | 90 KB  | 96 KB
200dpi     | 64 KB  | 71 KB
100dpi     | 25 KB  | 38 KB

Files intended for printing must be easily downloaded by users. The TIFF format allows multi-page files, which would eliminate the need for bundling single image files using a utility like ZIP for Windows or TAR for Unix-based systems. However, many TIFF viewers cannot display multi-page TIFF files.

Conclusion
Digital imaging standards and best practices can be applied to the digitization of musical scores, when used with a full understanding of the decision-making processes behind their recommendations. A well-designed digital imaging process with appropriate quality control mechanisms can result in flexible master files from which successful OMR can be done, and web-viewable and print-quality images can be created.

Acknowledgments
Ichiro Fujinaga would like to thank Michael Droettboom, Karl MacMillan, and Asha Srinivasan for their help in the preparation of this paper.
This research is funded in part by the NSF’s DLI-2 initiative (#981743), IMLS National Leadership Grant, and support from the Levy Family.

References
Arts and Humanities Data Service (2002), “Guides to Good Practice in the Creation and Use of Digital Resources”, Available: http://www.ahds.ac.uk/guides.htm (Accessed: 2002, December 18).

Bottou, L., Haffner, P., Howard, P.G., Simard, P., Bengio, Y. and Le Cun, Y. (1998), “High quality document image compression with DjVu”, Journal of Electronic Imaging, vol. 7, no. 3, pp. 410-425.

Edge, D. (2001), “The digital imaging of watermarks”, Computing in Musicology, vol. 12, pp. 261-274.

Fleischhauer, C. (1998, July 13), “Digital formats for content reproductions”, (Library of Congress), Available: http://memory.loc.gov/ammem/formats.html (Accessed: 2002, December 20).

Kenney, A. and Chapman, S. (1996), Digital Imaging for Libraries and Archives, Cornell University Library, Ithaca, NY.

Kenney, A. and Rieger, O. (2000), Moving Theory into Practice, Research Libraries Group, Mountain View, California.

Library of Congress (2000, April 14), “Building Digital Collections: Technical Information and Background Papers”, Available: http://memory.loc.gov/ammem/ftpfiles.html (Accessed: 2002, December 17).

MacMillan, K., Droettboom, M. and Fujinaga, I. (2001), “Gamera: A structured document recognition application development environment”, Proceedings of the 2nd Annual International Symposium on Music Information Retrieval, Oct. 15-17 2001, Bloomington, Indiana, pp. 15-16.

NINCH (2002, October), “NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials”, Available: http://www.nyu.edu/its/humanities/ninchguide/index.html (Accessed: 2002, December 17).

Puglia, S. and Roginski, B. (1998, January), “NARA guidelines for digitizing archival materials for electronic access”, (National Archives and Records Administration), Available: http://www.archives.gov/research_room/arc/arc_info/guidelines_for_digitizing_archival_materials.html (Accessed: 2002, December 15).

Roelofs, G. (1999), PNG: The Definitive Guide, O’Reilly, Sebastopol, CA.

Santa-Cruz, D., Ebrahimi, T., Askelöf, J., Larsson, M. and Christopoulos, C.A. (2000), “JPEG 2000 still image coding versus other standards”, in Proceedings of the SPIE’s 45th annual meeting, Applications of Digital Image Processing XXIII, vol. 4115, pp. 446-454.

Stewart, D., Scharf, R.A. and Arney, J.S. (1995), “Techniques for digital image capture of watermarks”, Journal of Imaging Science and Technology, vol. 39, no. 3, pp. 261-267.

Wenger, E., Karnaukhov, V., Haidinger, A. and Merzlyakov, N. (1995), “Image analysis for dating of old manuscript”, in Chin, R.T., Ip, H.S., Naiman, A.C. and Pong, T.-C. (Eds.), Image Analysis Applications and Computer Graphics, Lecture Notes in Computer Science 1024, Springer, Berlin.