Rationale & Sound Practices

Rationale

Preservation readiness for metadata refers to the process of ensuring that metadata is properly preserved along with the item or collection it describes. This matters particularly for digital newspapers, which are often compound objects with complex relationships, so maintaining robust connections between the metadata and the content is essential. If this information about the objects is lost, it may no longer be possible to reassemble them and use them in meaningful ways.

In this guidance document, content curators will find practical advice regarding:

  1. General strategies for packaging metadata for digital newspapers
  2. Responsible approaches for exporting metadata from any existing repository systems where it may be held
  3. Strategies for navigating the features of such repository systems
  4. Tips for managing the outputs from all such activities

Please note: this section does not provide advice on collection-level descriptive metadata creation. Institutions needing advice on that topic should refer to the host organizations for the various standards and schemas that may apply. This section does address the importance of creating administrative, technical, and structural metadata for long-term preservation purposes.

Sound Practices

Packaging metadata requires knowledge of three core factors: 1) the metadata an institution currently has for its newspaper content; 2) where this metadata is stored; and 3) how this metadata is related to the objects/collections it describes. Each of these factors is considered below.

1. What metadata does an institution have?

Institutions produce metadata to aid with collection description, discovery, and archival management. Multiple schemas (e.g., Dublin Core, METS, PREMIS, MIX, and MODS, among others) are often used to record this important descriptive, technical, administrative, and structural information for digital newspaper files and collections. Institutional practices vary widely in terms of schemas used over time, and also in the levels of completion or conformance to these schemas and their attendant profiles. Understanding and documenting what metadata schemas have been used for different collections and items over time is the first step in readying this metadata to be preserved with the content it describes.
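One way to begin documenting which schemas are in use is to scan existing XML metadata files for the namespaces their elements declare. The following is a minimal Python sketch under stated assumptions: the function name `detect_schemas`, the `KNOWN_NAMESPACES` table, and the sample record are illustrative and not part of any institution's tooling. The table lists only commonly published namespace URIs and would need to be extended for local profiles or other schema versions.

```python
import xml.etree.ElementTree as ET

# Commonly published namespace URIs (illustrative, not exhaustive).
KNOWN_NAMESPACES = {
    "http://www.loc.gov/METS/": "METS",
    "http://www.loc.gov/mods/v3": "MODS",
    "http://www.loc.gov/premis/v3": "PREMIS",
    "info:lc/xmlns/premis-v2": "PREMIS",
    "http://www.loc.gov/mix/v20": "MIX",
    "http://purl.org/dc/elements/1.1/": "Dublin Core",
}


def detect_schemas(xml_text):
    """Return sorted names of known schemas whose namespaces are used by elements."""
    found = set()
    for element in ET.fromstring(xml_text).iter():
        tag = element.tag
        # ElementTree writes namespaced tags as "{namespace-uri}localname".
        if isinstance(tag, str) and tag.startswith("{"):
            namespace = tag[1:].split("}", 1)[0]
            if namespace in KNOWN_NAMESPACES:
                found.add(KNOWN_NAMESPACES[namespace])
    return sorted(found)


# Hypothetical METS record wrapping a MODS description.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="DMD1"><mets:mdWrap MDTYPE="MODS"><mets:xmlData>
    <mods:mods><mods:titleInfo>
      <mods:title>Example Gazette</mods:title>
    </mods:titleInfo></mods:mods>
  </mets:xmlData></mets:mdWrap></mets:dmdSec>
</mets:mets>"""

print(detect_schemas(SAMPLE))  # ['METS', 'MODS']
```

Running such a scan over each collection and recording the results by date gives a rough picture of schema use over time. It does not measure completeness or conformance to a profile, which would require schema validation.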

2. Where is metadata stored?

The storage and maintenance of this metadata likewise vary widely. Most often, the metadata is stored either alongside the collections it describes (e.g., in a repository system as some type of associated file) or embedded with the objects/collections it describes (e.g., via METS or METS-ALTO packaging, or in file headers). Sometimes, metadata may also be held in a collection database that describes many digital collections held by an institution (e.g., one catalog that describes all of the digital holdings of an institution). An institution should document these locations within its digital newspaper inventory to ensure that curators know where the metadata resides (see Section 1: “Inventorying Digital Newspapers for Preservation”).
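As an illustration of how these locations might be recorded in an inventory, the sketch below models one inventory row per collection and writes the rows out as CSV. Everything here is an assumption made for the example: the `MetadataLocation` record, its field names, the three storage-type labels, and the sample paths are hypothetical and would be adapted to an institution's own inventory format.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

# Hypothetical labels for the three storage situations described above.
STORAGE_TYPES = {"alongside", "embedded", "central_database"}


@dataclass
class MetadataLocation:
    collection: str
    storage_type: str   # one of STORAGE_TYPES
    location: str       # path, URL, or database name
    schemas: str        # e.g. "METS; MODS"

    def __post_init__(self):
        if self.storage_type not in STORAGE_TYPES:
            raise ValueError(f"unknown storage type: {self.storage_type}")


def inventory_to_csv(entries):
    """Serialize inventory entries to CSV text with a header row."""
    buffer = io.StringIO()
    names = [f.name for f in fields(MetadataLocation)]
    writer = csv.DictWriter(buffer, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Hypothetical inventory entries.
entries = [
    MetadataLocation("Example Gazette", "embedded",
                     "/newspapers/gazette/", "METS; ALTO"),
    MetadataLocation("Example Herald", "central_database",
                     "digital_collections_catalog", "Dublin Core"),
]
print(inventory_to_csv(entries))
```

Restricting the storage type to a small controlled list keeps inventory entries comparable across collections. A free-text notes field could be added for cases that do not fit.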

3. How is metadata related to the objects/collections?

There is a wide range of practices for metadata association and linkages. At the lower end of the spectrum, some institutions document these relationships by maintaining a metadata spreadsheet or database that includes keys or unique identifiers that correspond to each collection title or collection item. Some institutions rely on their repository systems (e.g., DSpace, Fedora, Olive ActivePaper, or ArchivalWare) to structure both their collections and their metadata according to the repository software’s default means for associating the records with the objects. Still others package the object with its metadata or refer to it externally using METS or another packaging standard.
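Where a spreadsheet of identifiers is the only link between metadata and objects, it can be worth checking that the link holds in both directions before packaging. The Python sketch below is one possible check, not a prescribed method: the function name `check_linkage`, the `identifier` column name, and the sample identifiers are assumptions made for illustration.

```python
import csv
import io


def check_linkage(metadata_csv, object_ids, key="identifier"):
    """Compare identifiers in a metadata spreadsheet with identifiers of stored objects."""
    reader = csv.DictReader(io.StringIO(metadata_csv))
    described = {row[key].strip() for row in reader if row.get(key)}
    stored = set(object_ids)
    return {
        # Rows that describe something no longer (or not yet) in storage.
        "metadata_without_object": sorted(described - stored),
        # Objects that would be preserved with no description at all.
        "object_without_metadata": sorted(stored - described),
    }


# Hypothetical spreadsheet export and list of stored object identifiers.
SPREADSHEET = """identifier,title,issue_date
gazette_1901-01-05,Example Gazette,1901-01-05
gazette_1901-01-12,Example Gazette,1901-01-12
"""
STORED = ["gazette_1901-01-05", "gazette_1901-01-19"]

report = check_linkage(SPREADSHEET, STORED)
print(report["metadata_without_object"])   # ['gazette_1901-01-12']
print(report["object_without_metadata"])   # ['gazette_1901-01-19']
```

In practice the stored identifiers would be derived from file or directory names, or exported from the repository system. Either list being non-empty signals a relationship that should be repaired or documented before the content is packaged.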

Understanding the range of local metadata schemas, locations, and relationships is the first step in readying the metadata for preservation along with the objects/collections it describes.