Recovering Digital Newspapers from Preservation

The Guidelines have emphasized, particularly in the last sections (Section 5: “Organizing Digital Newspapers for Preservation” and Section 6: “Packaging Digital Newspapers for Preservation”), that the time and work associated with recovery depend largely upon the work an institution does to logically structure, document, and package its digital newspaper collections. Loss or corruption scenarios may involve a single file, a small number of files, or whole collections. The institution, as data provider, needs to know how and where to issue a request for preserved copies of its data, whether in coordination with local archival management or with an external preservation service provider. There may be specific request channels and protocols that need to be followed. The institutional owner of the data should hold all the identifying information that corresponds to the stored data so that files can be correctly identified and retrieved in a timely way.

Though the Reference Model for an Open Archival Information System (OAIS) primarily discusses this stage of activity in terms of Access and Dissemination Information Packages (DIPs) as they relate to end-user access requests, the model is still helpful for understanding a request for data from an archive as a bounded and segregated activity, one that should use resources not necessarily shared with those reserved for archival management. In other words, server or storage resources should be assigned to receive copies of AIPs, or of objects within AIPs, so that this data can be retrieved for recovery purposes. This might be an FTP-enabled server, an external hard drive, or some other portable media for shipping and delivery. Negotiating the media and delivery mechanism before facing an actual recovery scenario is important because it can determine what sorts of support need to be in place at the receiving institution.

The institutional owner of this digital newspaper content should include within a DIP everything it needs to verify the correctness of the file(s) and their bit integrity. These Guidelines have consistently aimed to facilitate this objective by advocating documentary and metadata approaches that disclose filenames and their linkages to metadata and checksum information. In the event of a full collection recovery, this set of activities should be less cumbersome if metadata has been properly stored along with the collection files, together with any globally unique identifiers (GUIDs) that the local repository system uses to re-integrate collection content. The owning institution may want to test such recoveries as part of its initial archival storage efforts to ensure a seamless integration between archival management and recovery. Recovered preservation copies of digital newspaper data can then be used to produce any needed derivative access copies, if these have also been compromised.
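As an illustration of the bit-integrity check described above, the following is a minimal sketch in Python, not a prescribed procedure. It assumes that the recovered package carries a plain-text manifest in which each line holds a SHA-256 checksum followed by a file path relative to the package root; that layout, the file names, and the function names `sha256_of` and `verify_recovered_files` are hypothetical and would need to be adapted to whatever checksum algorithm and manifest format an institution actually used when packaging its content.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_recovered_files(manifest_path, recovered_root):
    """Compare recovered files against a checksum manifest.

    Assumed manifest format (hypothetical): one entry per line,
    "<sha256 hex digest> <relative path>", separated by whitespace.
    Returns a dict of relative paths grouped as ok, missing, or mismatched.
    """
    results = {"ok": [], "missing": [], "mismatched": []}
    root = Path(recovered_root)
    for line in Path(manifest_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        expected, rel_path = line.split(maxsplit=1)
        target = root / rel_path
        if not target.is_file():
            results["missing"].append(rel_path)
        elif sha256_of(target) == expected.lower():
            results["ok"].append(rel_path)
        else:
            results["mismatched"].append(rel_path)
    return results
```

An institution testing a recovery might call `verify_recovered_files("manifest-sha256.txt", "recovered/")` and treat any entry reported as missing or mismatched as grounds for a fresh request to the archive or preservation service provider.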