USA STANDARD FOR A FORMAT FOR BIBLIOGRAPHIC INFORMATION INTERCHANGE ON MAGNETIC TAPE

Journal of Library Automation Vol. 2/2 June, 1969

The Chairman of the United States of America Standards Institute, Sectional Committee Z39, Library Work and Documentation, has approved publication of the following draft "USA Standard for a Format for Bibliographic Information Interchange on Magnetic Tape" to hasten availability of this fundamental contribution to bibliographic standardization.

Two important implementations follow the Standard. Part B of Appendix I is "Preliminary Guidelines for the Library of Congress, National Library of Medicine, and National Agricultural Library Implementation of the Proposed American Standard for a Format for Bibliographic Information Interchange on Magnetic Tape as Applied to Records Representing Monographic Materials in Textual Printed Form (Books)," more succinctly known as MARC II. Part C is a Committee working paper entitled "Preliminary Committee on Scientific and Technical Information (COSATI) Guidelines for Implementation of the USA Standard."

0. INTRODUCTION

0.1 This introduction is not part of the proposed standard but is included to facilitate its use.

0.2 This standard defines a format which is intended for the interchange of bibliographic records on magnetic tape. It has not been designed as a record format for retention within the files of any specific organization. Nor has it been the intent of the subcommittee to define the content of individual records. Rather it has attempted to describe a generalized structure which can be used to transmit between systems records describing all forms of material capable of bibliographic descriptions, as well as related records, such as authority records for authors and subject headings.

0.3 In designing the format the subcommittee has tried to achieve the goals listed below. It recognizes, however, that the goals were not completely compatible and that various trade-offs were required.
(a) Hospitality: hospitality to all kinds of bibliographic information should be provided;
(b) Hardware independence: a format which can be used with a variety of digital computers should be defined;
(c) Uniformity of structure: the structure of all machine records should be basically identical and include such control information as may be required to specify unique characteristics. For any given class of records the components of the format may have specific meanings and unique characteristics;
(d) Data manipulation: the methods of recording and identifying data should provide for maximum manipulability leading to ease of conversion to other formats for various uses.

0.4 The standard includes the concept that the bibliographic unit may be described independently or in relation to other bibliographic units. Many relationships exist, including: the hierarchical, in which the bibliographic unit contains, or is contained in, another bibliographic unit, e.g., a monograph in a series; the equivalent, e.g., a work and its translation; and the sequential, e.g., a serial which appeared under a succession of titles. The standard provides for bibliographic records which describe one or more related bibliographic units, and provides for coding the relationships among them. Appendix II describes a proposed method for implementing this concept.

0.5 Preliminary guidelines for implementing the standard by two different groups of users are provided in Appendix I. These guidelines are not part of the standard but are included to illustrate the use of the format.

0.6 Explanatory material which is not part of the standard but which will assist in its interpretation or implementation appears in brackets.

0.7 The appendices accompanying this standard are not part of the standard.
0.8 The development of this standard was made possible partially by support received from the National Science Foundation and the Council on Library Resources. Personnel of the USASI Committee Z39 at the time the Committee approved the standard were Dr. Jerrold Orne, Chairman; Mr. James Wood, Vice-Chairman; and Mr. Harold Oatfield, Secretary.

The Subcommittee on Machine Input Records, which is directly responsible for this standard, had the following personnel:

Mrs. Henriette D. Avram, Chairman, Assistant Coordinator of Information Systems, Information Systems Office, Library of Congress, Washington, D.C. 20540

Mrs. Pauline A. Atherton, School of Library Science, Syracuse University, 308 Carnegie Library, Syracuse, New York 13210

Mr. Arthur R. Blum, American Institute of Physics, 335 East 45th Street, New York, New York 10017

Mr. Lawrence F. Buckland, President, Inforonics, Inc., 806 Massachusetts Avenue, Cambridge, Massachusetts 02139

Miss Ann T. Curran, Inforonics, Inc., 806 Massachusetts Avenue, Cambridge, Massachusetts 02139

Mr. Kay D. Guiles, Information Systems Office, Library of Congress, Washington, D.C. 20540

Mr. Frederick G. Kilgour, Director, Ohio College Library Center, 1314 Kinnear Road, Columbus, Ohio 43212

Mr. Abraham I. Lebowitz, Assistant to the Director, National Agricultural Library, U.S. Department of Agriculture, Washington, D.C. 20250

Mrs. Phyllis B. Steckler, R. R. Bowker Company, 1180 Avenue of the Americas, New York, New York 10036

1. GLOSSARY

It has been considered unnecessary to define terms in common use. Terms which have a special meaning in the standard or which might be ambiguous are defined below.

BASE ADDRESS OF DATA. A data element whose value is equal to the character position of the character following the field terminator of the directory, where the specified origin is the first character of the leader.
[Example: If the directory contains two (2) entries, the first character position of data will be 49, and therefore the base address of data equals 49.]

BASIC CHARACTER. A character occurring in columns 2, 3, 6 or 7 of the Standard Code as defined in USAS X3.4-1967 Code for Information Interchange, p. 6. [The basic character set is included as part of the illustration on page 82 of Appendix I, columns 2, 3, 6 and 7.]

BIBLIOGRAPHIC INFORMATION INTERCHANGE FORMAT. A format for the exchange, rather than the local processing, of bibliographic records. (The terms "bibliographic information interchange format," "information interchange format," and "interchange format" are used interchangeably in this standard.)

BIBLIOGRAPHIC LEVEL. A data element which, in conjunction with the data element "type-of-record," specifies the characteristics and describes the components of the bibliographic record. [See Appendix I for an illustration of an application of this data element.]

BIBLIOGRAPHIC RECORD. A collection of fields, including a leader, directory, and bibliographic data, describing one or more bibliographic units treated as one logical entity.

BIBLIOGRAPHIC UNIT. A defined body of recorded information and the artifact on which it is recorded, e.g., a book, chapter of a book, map, cuneiform tablet, digital magnetic tape file, song (sheet music), and song (phonograph record). A bibliographic unit may be part of a larger bibliographic unit (e.g., the chapter as part of a book, which in turn is part of a series). [It is assumed that the originators of bibliographic information and/or bibliographic descriptions follow a set of rules or guidelines which define, for the originating source, what is to be treated as a bibliographic unit.] A single author or subject heading authority record is also a bibliographic unit.

CHARACTER. See INTERNAL CHARACTER.

COMMUNICATIONS FORMAT. See BIBLIOGRAPHIC INFORMATION INTERCHANGE FORMAT.

CONTROL FIELD.
A variable field which supplies parameters which may be required in the processing of the bibliographic record.

CONTROL NUMBER. An alphanumeric symbol uniquely associated with a bibliographic record assigned by the organization creating the bibliographic record.

DATA ELEMENT. A defined unit of information within a system.

DATA ELEMENT IDENTIFIER. A code consisting of one or more basic characters used to identify individual data elements within a variable field. If and when data element identifiers are used, each occurrence must be immediately preceded by a delimiter, and each data element identifier must immediately precede the data element it identifies. The length (in characters) of the data element identifier must be uniform for each field of a given record. [In effect, a delimiter and data element identifier are combined to form a symbol used to initiate and identify data elements within a variable field. The use of the concept of data element identifiers is optional and provides a means of explicitly identifying data elements, even though in some instances there may be a redundancy of identification (e.g., if a variable field consists of only one data element, presumably the tag alone would provide sufficient identification).]

DATA FIELD. A variable field containing bibliographic or other data not intended to supply parameters for the processing of the bibliographic record.

DELIMITER. A character which serves as an initiator, a separator, or a terminator of individual data elements within a variable field. [Whether a delimiter is used to initiate, to separate, or to terminate, is dependent upon a specific system.]

DELIMITER (OR DELIMITER PLUS DATA ELEMENT IDENTIFIER) COUNT.
A data element whose value is the length (in characters) of the delimiter (or, if data element identifiers are used, the length (in characters) of the delimiter plus data element identifier) used within the record.

DIRECTORY. An index to the location of the variable fields (control and data) within a bibliographic record. The directory consists of entries.

ENTRY. A fixed field within the directory which contains information about a variable field.

ENTRY MAP. A data element which is used to indicate the structure of the entries in the directory.

EXTERNAL CHARACTER. A graphic symbol which may be represented by one or a series of two or more internal characters. [The external character "space" is always represented by an internal character.]

FIELD. A defined character string which may contain one or more data elements. See also CONTROL FIELD; DELIMITER; ENTRY; FIXED FIELD; INDICATOR; VARIABLE FIELD.

FIELD TERMINATOR (FT). A character used to terminate a variable field within a bibliographic record. The last variable field is terminated by a record terminator and not a field terminator.

FILE. A set of related records denoted by a single name.

FIXED FIELD. One in which every occurrence of the field has a length of the same fixed value regardless of changes in the contents of the field from occurrence to occurrence.

FORMAT. See STRUCTURE.

FT. See FIELD TERMINATOR.

INDICATOR. A data element associated with a data field which supplies additional information about the associated data field.

INDICATOR COUNT. A data element whose value is the length (in characters) of the indicator(s) which appears as the first data element in each variable data field. The length (in characters) of the indicator(s) must be uniform for each field of a given record. (A length of zero (0) is permitted.)

INFORMATION INTERCHANGE FORMAT. See BIBLIOGRAPHIC INFORMATION INTERCHANGE FORMAT.

INTERCHANGE FORMAT.
See BIBLIOGRAPHIC INFORMATION INTERCHANGE FORMAT.

INTERNAL CHARACTER. A pattern of bits of a predetermined length (depending on the system) treated as a meaningful unit. (The terms "internal character" and "character" are used interchangeably in this standard.)

LEADER. A fixed field which occurs at the beginning of each bibliographic record which provides parameters for the processing of the record.

PADDING CHARACTER. A character used to fill areas in fixed fields which contain no data. [See paragraph A.2.1.4 of Appendix I.]

PRIMARY BIBLIOGRAPHIC UNIT. That bibliographic unit whose physical and bibliographic characteristics determine the type-of-record and bibliographic level.

RECORD. See BIBLIOGRAPHIC RECORD.

RECORD LENGTH. A data element whose value is equal to the length (in characters) of the bibliographic record including the record terminator.

RECORD TERMINATOR (RT). A character used to terminate each record.

RT. See RECORD TERMINATOR.

STATUS. A data element which indicates the relation of the bibliographic record to a file (e.g., new, updated, etc.).

STRUCTURE. The framework of fixed and variable fields within the bibliographic record.

SUBRECORD. A group of fields within a bibliographic record which may be treated as a logical entity. [When a bibliographic record describes more than one bibliographic unit, the descriptions of individual bibliographic units may be treated as subrecords.]

TAG. A series of characters used to specify the name or label of an associated variable field.

TYPE-OF-RECORD. A data element which, in association with the data element "bibliographic level," indicates the form of the bibliographic description provided for the primary bibliographic unit.
[It is assumed that the person providing the bibliographic description, on the basis of predefined criteria, will determine the treatment a given item is to receive; i.e., whether the item is to be treated as a book, a journal article, a map, a picture, an abstract, a bibliographical footnote, etc. If a given item consists of parts which, if they occurred independently, would be accorded different bibliographic descriptions, the choice of treatment selected is assumed to be the most appropriate. Frequently occurring combinations may be accorded their own treatments, e.g., collections of drawings with accompanying text. For each established form of bibliographic description, there will be a record format whose components are defined by the "type-of-record" data element. Among these components are the length of the fixed fields, the tagging scheme employed, and the definition of the data elements. If the interchange format is used for the interchange of records of a type for which "bibliographic description" is not a parameter, e.g., authority records, this data element may be redefined. See Appendix I for an illustration of an application of this data element.]

VARIABLE FIELD. One in which the length of an occurrence of the field is determined by the length (in characters) required to contain the data stored in that occurrence. The length may vary from one occurrence to the next.

2. PURPOSE AND SCOPE

2.1 This standard defines a format for the interchange of bibliographic and related [authority files, subject heading lists, etc.] records.

2.2 This standard does not define a record format for retention within the files of any specific organization.

2.3 This standard does not necessarily define the content of individual records. It does describe a generalized structure which can be used for the interchange of records describing various forms of bibliographic material.
2.4 This standard assumes the utilization of the following USASI Standards and Proposed Standards:
(a) USAS X3.22-1967 Recorded Magnetic Tape for Information Interchange (800 CPI, NRZI)
(b) USAS X3.4-1967 Code for Information Interchange
(c) Proposed Standard X3.2/552 Magnetic Tape Labels and File Structure

3. BIBLIOGRAPHIC INFORMATION INTERCHANGE FORMAT

3.1 Schematic Representation

The interchange format is schematically represented below:

Leader | Directory | FT | Control Number | FT | Other Control Fields (If Present) | FT | Data Field | FT | Data Field n | RT

3.2 Leader

3.2.1 Schematic Representation

The leader is schematically represented below:

Character positions    Data element
0-4                    Record length
5                      Status
6                      Type of record
7                      Bibliographic level
8-9                    Reserved for future use
10                     Indicator count
11                     Delimiter (or delimiter plus data element identifier) count
12-16                  Base address of data
17-19                  Reserved for use by user systems
20-23                  Entry map

3.2.2 Record Length

The record length is a 5-digit decimal number equal to the bibliographic record length. This number will include its own five characters and the record terminator. The record length will always be present in character positions 0-4 of the record. In the interchange format the bibliographic record has a maximum length of 99,999 characters.

3.2.3 Status

A data element in character position 5 consisting of 1 basic character.*

3.2.4 Type-of-Record

A data element in character position 6 consisting of 1 basic character.*

3.2.5 Bibliographic Level

A data element in character position 7 consisting of 1 basic character.*

3.2.6 Indicator Count

A data element in character position 10 consisting of 1 decimal digit* equal to the length (in characters) of the indicator(s) which appears as the first data element of each variable data field. If indicators are not used, this field is set to zero (0). (See 3.4.2.1)

* See Appendix I for an illustration of an application of this data element.

3.2.7 Delimiter (or Delimiter Plus Data Element Identifier) Count

A data element in character position 11 consisting of 1 decimal digit equal to the length (in characters) of the delimiter (or, if data element identifiers are used, the length (in characters) of the delimiter plus data element identifier) used within the record. If a delimiter is not used, this field is set to zero (0). If a delimiter alone (i.e., without data element identifiers) is used, this field is set to one (1).

3.2.8 Base Address of Data

A data element in character positions 12-16 consisting of 5 decimal digits and equal to the combined length (in characters) of the leader and directory (including the field terminator at the end of the directory).

3.2.9 Entry Map

(See 3.3.1 for the description of entries.)

Structure of each entry in the directory:

Tag | Length of Field | Starting Character Position

Entry map:

m | n | 0 | 0

m = length (in characters) of the "length of field" portion of each entry in the directory
n = length (in characters) of the "starting character position" portion of each entry in the directory
0 = undefined; available for future use

The entry map is a data element in character positions 20-23 consisting of 4 decimal digits. Each decimal digit recorded corresponds sequentially to each portion of the entry, except for the portion allotted to the tag. Character position 20 in the entry map indicates the length (in characters) of the "length of field" portion of each entry in the directory; character position 21 indicates the length (in characters) of the "starting character position" portion of each entry. If one of these does not occur, the relevant character position in the entry map is set to zero. Character positions 22 and 23 are undefined and are available for future use.

[Since bibliographic data is usually variable in length, the structure of an entry in the directory will usually follow the pattern "tag, length of field, starting character position." The inclusion of an entry map provides flexibility for those users who wish to structure the entry in the directory differently, either by including (in addition to tag, length of field, and starting character position) other data elements not defined in this standard or by excluding those that have been defined. However, any restructuring of the entry by a user will have to be done within the general limitations imposed by the standard (see 3.3.1). The use of the entry map can be illustrated as follows:
(1) An entry map set to 4500 would define the characteristics of a directory in which each entry consisted of a 3-digit tag (not expressed in the entry map), a 4-digit length of field, and a 5-digit starting character position.
(2) An entry map set to 0500 would define the characteristics of a directory in which each entry consisted of a 3-digit tag, no length of field data element, and a 5-digit starting character position.
See Appendix I for an illustration of an actual application of the concept of an entry map.]

3.3 Directory

The directory consists of a series of fixed fields (hereinafter referred to as entries). The directory ends with a field terminator. The directory must contain at least one entry for each subsequent variable field (control and data). [In the case of very long fields additional entries may be required. See 3.3.1.3.]

3.3.1 Entries

Each entry consists of 12 characters. Each entry must contain, at the very least, a tag and length of field, or a tag and starting character position, and must correspond, unambiguously, to a specific variable length data or control field. The tag, length of field, and starting character position must, whenever they occur, be in that sequence.
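[Illustration, not part of the standard: the leader layout of 3.2 and the entry structure of 3.3.1 can be read mechanically, as the minimal Python sketch below suggests. The names `Leader`, `Entry`, `parse_leader`, and `parse_directory` are hypothetical. The field terminator is assumed to be the ASCII code hex 1E, which this text does not state. How the unused positions of a 12-character entry are filled under an entry map such as 0500 is also not stated here, so the sketch simply ignores them.]

```python
from typing import List, NamedTuple, Optional

FT = "\x1e"          # ASSUMPTION: field terminator code; not given in this text
LEADER_LENGTH = 24   # leader occupies character positions 0-23 (3.2.1)
ENTRY_LENGTH = 12    # "Each entry consists of 12 characters" (3.3.1)
TAG_LENGTH = 3       # the tag consists of 3 basic characters (3.3.1.1)


class Leader(NamedTuple):
    record_length: int             # positions 0-4
    status: str                    # position 5
    type_of_record: str            # position 6
    bibliographic_level: str       # position 7
    indicator_count: int           # position 10
    delimiter_count: int           # position 11
    base_address: int              # positions 12-16
    length_of_field_digits: int    # entry map, position 20 ("m")
    starting_position_digits: int  # entry map, position 21 ("n")


class Entry(NamedTuple):
    tag: str
    length_of_field: Optional[int]    # None when the entry map gives m = 0
    starting_position: Optional[int]  # None when the entry map gives n = 0


def parse_leader(record: str) -> Leader:
    """Read the fixed-position data elements of the 24-character leader."""
    if len(record) < LEADER_LENGTH:
        raise ValueError("record is shorter than the 24-character leader")
    return Leader(
        record_length=int(record[0:5]),
        status=record[5],
        type_of_record=record[6],
        bibliographic_level=record[7],
        indicator_count=int(record[10]),
        delimiter_count=int(record[11]),
        base_address=int(record[12:17]),
        length_of_field_digits=int(record[20]),
        starting_position_digits=int(record[21]),
    )


def parse_directory(record: str, leader: Leader) -> List[Entry]:
    """Split the directory into entries, guided by the entry map."""
    end = leader.base_address - 1  # position of the directory's field terminator
    if end >= len(record) or record[end] != FT:
        raise ValueError("directory does not end with a field terminator")
    directory = record[LEADER_LENGTH:end]
    if len(directory) % ENTRY_LENGTH != 0:
        raise ValueError("directory is not a whole number of entries")
    m = leader.length_of_field_digits
    n = leader.starting_position_digits
    if TAG_LENGTH + m + n > ENTRY_LENGTH:
        raise ValueError("entry map does not fit a 12-character entry")
    entries = []
    for offset in range(0, len(directory), ENTRY_LENGTH):
        raw = directory[offset:offset + ENTRY_LENGTH]
        length = int(raw[TAG_LENGTH:TAG_LENGTH + m]) if m else None
        start = int(raw[TAG_LENGTH + m:TAG_LENGTH + m + n]) if n else None
        entries.append(Entry(raw[:TAG_LENGTH], length, start))
    return entries
```

[With an entry map of 4500 each entry is read as a 3-character tag, a 4-digit length of field, and a 5-digit starting character position, matching example (1) in 3.2.9.]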
3.3.1.1 Tag

The tag is a data element consisting of 3 basic characters.

3.3.1.2 Tags for Control Fields

Tags 001-009 are reserved for control fields as shown:

001        Control number
002        Reserved for Subrecord directory, if any*
003        Reserved for Subrecord relationship, if any*
004-009    Reserved for use by user systems

3.3.1.3 Length of Field

The length of field in the entry is the length (in characters) of the variable field to which it corresponds. The length of field includes the indicator(s) and field terminator. It is expressed as a decimal number. If the length of a variable field exceeds the maximum length expressible as a decimal number in the length of field portion of the entry, two or more entries (called a "subset" for the purposes of this explanation) will be used to define the location and extent of such a field. Since all the entries in the subset of entries reference the same variable field, they will contain the same tag. The length of field in each entry of the subset, except the last entry in the subset, will be set to 0 to indicate that the length of field is equal to the maximum length expressible and that there is additional information for the same field in the next entry in the record directory. The length of field for all entries in the subset subsequent to the first will refer to the length (in characters) of the overflow data. [This convention cannot be followed if the structure of the entry does not contain a length of field.]

3.3.1.4 Starting Character Position

The starting character position is the character position of the first character in the variable field (which may be an indicator or data; see 3.4.2) referenced by the entry. It is given relative to the base address of data (i.e., the first character of the first variable field following the directory is numbered 0).
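[Illustration, not part of the standard: the arithmetic of 3.3.1.3 and 3.3.1.4 can be sketched in Python as follows. The function names are hypothetical. The sketch reads "overflow data" as the characters remaining after each maximum-length portion, and reads the maximum length expressible in an m-digit field as the largest m-digit decimal number; both readings are assumptions.]

```python
from typing import List


def max_length(m: int) -> int:
    """Largest value expressible in an m-digit decimal length of field (assumed reading)."""
    if m < 1:
        raise ValueError("the entry must contain a length of field (m >= 1)")
    return 10 ** m - 1


def absolute_position(base_address: int, starting_position: int) -> int:
    """Position counted from the first character of the leader (3.3.1.4)."""
    return base_address + starting_position


def split_length(total: int, m: int) -> List[int]:
    """Length-of-field values for the subset of entries describing one field.

    Every entry except the last carries 0, standing for the maximum
    expressible length; the last entry carries the remaining length.
    """
    if total < 1:
        raise ValueError("a variable field holds at least its terminator")
    limit = max_length(m)
    lengths = []
    while total > limit:
        lengths.append(0)
        total -= limit
    lengths.append(total)
    return lengths


def total_length(lengths: List[int], m: int) -> int:
    """Recover the full field length from a subset of length-of-field values."""
    limit = max_length(m)
    return sum(limit if value == 0 else value for value in lengths)
```

[For example, under a 4-digit length of field a 12,000-character field would be described by two entries with the same tag, carrying lengths 0 and 2001, since 9999 + 2001 = 12000. A field whose entry gives starting character position 0 in a record whose base address of data is 49 begins at character position 49 of the record.]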
3.3.2 Sequence of Entries

The entries in the directory may be recorded in any sequence (i.e., they need not be in the same sequence as the corresponding variable fields) except that the entry for tags 001-009 must always be first and in ascending numeric sequence. [Note that specific systems may use the sequence of entries in the directory to convey semantic information.]

* Note to 3.3.1.2: Appendix II illustrates a possible method of handling subrecords within a bibliographic record. This is not part of the Standard.

3.4 Variable Fields

3.4.1 General

Following the leader and directory, the bibliographic record consists of variable fields. (Although the directory is technically a variable field, the following paragraphs do not apply to it.)

3.4.2 Structure of Variable Fields

Each variable field consists of indicator(s) (if used), a delimiter (if used), a data element identifier (if used), data, and a field terminator, as shown. Control fields do not contain indicators, delimiters, or data element identifiers.

INDICATOR(S)* | DELIMITER* | DATA ELEMENT IDENTIFIER* | DATA | FIELD TERMINATOR

* Except control fields

3.4.2.1 Indicator

The indicator is the first data element in each variable field. The length (in characters) of the indicator(s), which may be 0 (i.e., no indicator is present), is recorded in the indicator count in the leader. All variable fields, except control fields, in the same record have the same length (in characters) for an indicator(s).

3.4.3 Sequence of Variable Fields

The variable fields, except for the control fields associated with tags 001-009, need not occur in the same sequence as the corresponding directory entries. The control fields which occur must be first and in ascending numeric sequence.

3.4.4 Control Fields

The variable fields associated with tags 001-009 are control fields. Control fields do not contain indicators, delimiters, or data element identifiers.
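[Illustration, not part of the standard: a variable data field laid out as in 3.4.2 can be taken apart with the indicator count and the delimiter count from the leader. In the Python sketch below the function name `parse_data_field` is hypothetical; the delimiter and field terminator are assumed to be the ASCII codes hex 1F and hex 1E, which this text does not state; and the delimiter is assumed to act as an initiator, which the glossary leaves to the specific system.]

```python
from typing import List, Tuple

DELIMITER = "\x1f"  # ASSUMPTION: delimiter code; not given in this text
FT = "\x1e"         # ASSUMPTION: field terminator code; not given in this text


def parse_data_field(
    field: str, indicator_count: int, delimiter_count: int
) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (indicators, [(data element identifier, data), ...]).

    indicator_count and delimiter_count are the leader values of 3.2.6
    and 3.2.7. The identifier is "" when no identifier is in use.
    """
    if field.endswith(FT):
        field = field[:-1]  # drop the field terminator
    indicators = field[:indicator_count]
    body = field[indicator_count:]
    if delimiter_count == 0:
        # No delimiter in use: the whole body is treated as one data element.
        return indicators, [("", body)]
    identifier_length = delimiter_count - 1  # the count includes the delimiter
    parts = body.split(DELIMITER)
    elements = []
    if parts[0]:
        # Data found before the first delimiter is kept without an identifier.
        elements.append(("", parts[0]))
    for part in parts[1:]:
        elements.append((part[:identifier_length], part[identifier_length:]))
    return indicators, elements
```

[Control fields would not be passed through such a routine, since they contain no indicators, delimiters, or data element identifiers.]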
3.4.4.1 Control Number Field

This field contains the control number, consisting of basic characters. This field must always occur once, and only once, in each bibliographic record, and must immediately follow the directory.

3.5 Variable Data Fields

3.5.1 General

The remainder of the bibliographic record consists of variable data fields. There are no restrictions on the number, length, or content of the variable data fields other than those already stated or implied (e.g., those based on the limitations of the total record length).

3.5.2 Multiple Data Elements

Multiple data elements within fields may be fixed or variable and may be identified by position, by the use of a delimiter alone, or by the use of a delimiter plus data element identifier(s) as the case may be.
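[Illustration, not part of the standard: the pieces described in 3.1 through 3.5 can be assembled into a record as in the Python sketch below, which shows how the record length and the base address of data follow from the directory and field lengths. The function name `build_record` is hypothetical. The sketch assumes an entry map of 4500, spaces in the reserved leader positions, hex 1E and hex 1D as field and record terminators, and that the length of field of the last variable field counts its record terminator; none of these choices is stated in this text.]

```python
from typing import List, Tuple

FT = "\x1e"          # ASSUMPTION: field terminator code
RT = "\x1d"          # ASSUMPTION: record terminator code
LEADER_LENGTH = 24   # character positions 0-23 (3.2.1)


def build_record(
    fields: List[Tuple[str, str]],
    status: str = " ",
    type_of_record: str = " ",
    bibliographic_level: str = " ",
    indicator_count: int = 0,
    delimiter_count: int = 0,
) -> str:
    """Assemble leader, directory, and variable fields (entry map 4500).

    `fields` is a list of (tag, content) pairs in the order they are to
    appear; the control number field (tag 001) must come first (3.4.4.1).
    """
    if not fields or fields[0][0] != "001":
        raise ValueError("the control number field (tag 001) must come first")
    directory = ""
    data = ""
    for index, (tag, content) in enumerate(fields):
        if len(tag) != 3:
            raise ValueError("a tag consists of 3 characters")
        is_last = index == len(fields) - 1
        # The last variable field ends with RT rather than FT (see glossary).
        field = content + (RT if is_last else FT)
        if len(field) > 9999:
            raise ValueError("long fields need the subset convention of 3.3.1.3")
        # Entry: 3-character tag, 4-digit length, 5-digit starting position.
        directory += f"{tag}{len(field):04d}{len(data):05d}"
        data += field
    directory += FT
    base_address = LEADER_LENGTH + len(directory)
    record_length = base_address + len(data)
    if record_length > 99999:
        raise ValueError("record exceeds the 99,999-character maximum (3.2.2)")
    leader = (
        f"{record_length:05d}{status}{type_of_record}{bibliographic_level}"
        f"  {indicator_count}{delimiter_count}{base_address:05d}   4500"
    )
    if len(leader) != LEADER_LENGTH:
        raise ValueError("leader data elements must be single characters")
    return leader + directory + data
```

[A record built from two fields has a 24-character leader, two 12-character entries, and one field terminator ahead of the data, so its base address of data is 49, in agreement with the example given in the glossary under BASE ADDRESS OF DATA.]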