Organizing digital newspaper content is a process through which an institution assesses, documents, and sometimes refines its file naming and folder usage conventions.
As described in previous sections, an institution’s newspaper content is often created or acquired by many different parties over a long span of time. Different collections within an institution’s holdings may therefore conform to different file-naming and folder conventions. Documenting these conventions clearly, normalizing the disparate collections under a unified schema, or both, enables future curators (and users) to retrieve, validate, and if necessary, reconstitute these collections. As such, organizing digital newspapers is an important step in the preservation readiness process.
This process of organizing digital newspaper content builds upon curatorial work described in earlier sections of these Guidelines, specifically:
- Inventorying how much digital newspaper content an institution is managing and where it is located;
- Identifying the range of file formats and performing any necessary normalizations or migrations;
- Exporting and consolidating metadata for all collection(s); and
- Producing checksum manifests for this content.
Once these preliminary activities are complete, curators can effectively organize their digital newspaper content for archival storage.
Sound practices for organizing digital news content primarily include the following:
- Rectifying any file-naming conventions that put content at risk of non-renderability;
- Documenting effectively the range of file and folder naming practices and conventions represented in an institution’s collections; and
- Storing this documentation with the content it describes.
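As a rough illustration of the first practice above, the following Python sketch walks a collection and flags file and folder names that may cause trouble when content is moved between systems. These Guidelines do not define a specific test for risky names, so the criteria here are assumptions chosen for illustration: characters outside the POSIX portable filename character set, a leading hyphen, and names longer than 255 bytes. The function names are hypothetical, and an institution would substitute its own rules.

```python
import os
import re
from pathlib import Path

# Assumption: a "safe" name uses only the POSIX portable filename characters.
PORTABLE_NAME = re.compile(r"[A-Za-z0-9._-]+")
# Assumption: 255 bytes is a common per-name limit on many file systems.
MAX_NAME_BYTES = 255


def name_problems(name: str) -> list[str]:
    """Return the reasons a single file or folder name may be risky."""
    problems = []
    if not PORTABLE_NAME.fullmatch(name):
        problems.append("characters outside A-Z a-z 0-9 . _ -")
    if name.startswith("-"):
        problems.append("leading hyphen")
    if len(name.encode("utf-8")) > MAX_NAME_BYTES:
        problems.append("longer than 255 bytes")
    return problems


def scan_collection(root: str) -> dict[str, list[str]]:
    """Map each risky path (relative to root) to its list of problems."""
    report = {}
    root_path = Path(root)
    for dirpath, dirnames, filenames in os.walk(root_path):
        for name in dirnames + filenames:
            problems = name_problems(name)
            if problems:
                rel = (Path(dirpath) / name).relative_to(root_path)
                report[rel.as_posix()] = problems
    return report
```

A report of this kind only identifies candidates for remediation. Any renaming should be recorded and checked against the checksum manifests and metadata produced in earlier steps, since those typically refer to files by name.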
Even institutions with limited resources and disparate practices will be able to provide a brief summary of each digital news collection’s internal conventions. Such a summary helps future curators and users understand how each collection is structured and how to render its content. Institutions with higher resource levels may also analyze and streamline their conventions and practices across digital newspaper collections.
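A brief summary of this kind can be started from the content itself. The sketch below is one possible approach, not a prescribed tool: it counts files by extension, records how deep the folder structure goes, and writes the result to a text file stored in the collection's top-level folder. The function names and the file name `NAMING_CONVENTIONS.txt` are hypothetical.

```python
import os
from collections import Counter
from pathlib import Path


def summarize_collection(root: str, ignore: str = "") -> str:
    """Build a plain-text summary of the file and folder layout under root.

    Files named `ignore` are skipped so that an earlier summary file
    is not counted as part of the collection.
    """
    root_path = Path(root)
    extensions = Counter()
    max_depth = 0
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root_path):
        # Depth of this folder below the collection root (root itself is 0).
        depth = len(Path(dirpath).relative_to(root_path).parts)
        for name in filenames:
            if name == ignore:
                continue
            file_count += 1
            extensions[Path(name).suffix.lower() or "(none)"] += 1
            max_depth = max(max_depth, depth)
    lines = [
        f"Collection root: {root_path.name}",
        f"Files: {file_count}",
        f"Deepest folder level containing files: {max_depth}",
        "Files by extension:",
    ]
    for ext, count in sorted(extensions.items()):
        lines.append(f"  {ext}: {count}")
    return "\n".join(lines) + "\n"


def write_summary(root: str, filename: str = "NAMING_CONVENTIONS.txt") -> Path:
    """Write the summary into the collection's top-level folder."""
    target = Path(root) / filename
    target.write_text(summarize_collection(root, ignore=filename), encoding="utf-8")
    return target
```

The generated text covers only what can be counted automatically. A curator would still add a short prose explanation of what each part of a file name and each folder level means, since that is the information future users are least able to reconstruct on their own.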
The goal is to arrive at a documented and uniform approach that contains clear guidelines for file names and folder/sub-folder usage. Where a single approach is not practical, a small set of approaches with clearly designated use-cases may serve instead (e.g., one approach for legacy digitized content, another for recent digitization efforts, and a third for born-digital content). Collection managers should coordinate with their technical staff members (or those of any repository service provider) throughout any remediation and convention setting, so that any change to existing conventions is understood and accounted for in the repository software environments used for access and/or preservation purposes.