University Library, University of Illinois at Urbana-Champaign

IDEALS

This guide has information on how to deposit and access IDEALS materials for individuals and collection administrators.

Digital Preservation Support Policy

Committed to building and maintaining collections for the use of students, faculty, scholars, and the public long into the future, the University of Illinois at Urbana-Champaign assumes an obligation to ensure long-term access to the materials deposited into IDEALS and their intellectual content, but also acknowledges the inherent challenges involved in preserving digital content.

To this end, the IDEALS Digital Preservation Support Policy defines the categories of preservation support available and provides specific information about where different file formats fit within these categories. This policy is subject to change as new and emerging technologies impact our ability to preserve deposited content.

Please note that IDEALS content is now being preserved in our Medusa repository, and is governed by the digital preservation support policies found there.

Background

Our ability to preserve digital objects deposited in IDEALS is dependent, among other things, on whether the file format used:

  • Is openly documented (more preservable) or proprietary (less preservable);
  • Is supported by a range of software platforms (more preservable) or by only one (less preservable);
  • Is widely adopted (more preservable) or has low use (less preservable);
  • Uses lossless data compression (more preservable) or lossy data compression (less preservable); and
  • Contains embedded files or embedded programs/scripts, like macros (less preservable).

All digital objects deposited to IDEALS will receive basic, "bit-level" preservation. Basic preservation means that IDEALS will preserve the viability of the original object through:

  • ensuring that the bitstream (the 1s and 0s that make up the digital file) remains exactly the same over time;
  • assigning a persistent, permanent identifier;
  • creating preservation metadata;
  • maintaining onsite and offsite backup copies;
  • performing regular virus and file corruption checks; and
  • performing periodic refreshments by copying files to new storage media.
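
Checking that a bitstream "remains exactly the same over time" is commonly done with a fixity check: a checksum is recorded at deposit and recomputed later for comparison. The following is a minimal sketch in Python, assuming SHA-256 as the checksum. This policy does not state which algorithm or tooling IDEALS uses, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large deposits do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """True if the file's current digest matches the digest recorded at deposit.

    Assumption: the recorded digest is stored with the preservation metadata.
    """
    return sha256_of(path) == recorded_digest
```

A mismatch would indicate that at least one bit has changed, in which case the file could be restored from one of the backup copies.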

Basic preservation does not ensure that a digital object can be opened by a computer program or understood by a human in the future. For example, suppose that in 2006 a faculty member deposits a conference presentation in the Microsoft PowerPoint format (.ppt), a proprietary format. In 2030, a graduate student would like to view that presentation, but Microsoft PowerPoint, the software program used to open and read .ppt files, was discontinued in 2020. Old versions of the program are difficult to find and, because the .ppt file format was never publicly documented, no other software programs exist to open the file. Even though the original digital object (the conference presentation in .ppt) is still technically viable, it is no longer renderable (able to be opened by a computer program), and thus not understandable by the graduate student in 2030.

Therefore, for digital objects that meet certain criteria (see below), IDEALS will strive to preserve not only the viability of the object but also the renderability and the understandability of the content of the digital object, as well as the original file itself. In the case of some objects in proprietary formats, this will mean that in addition to the original digital object, IDEALS will also save a copy of the object transformed into a file format that is more preservable than the original. For example, the conference presentation in .ppt might also be saved as a .pdf/a object (an open, publicly documented standard). The .pdf/a object is a more preservable format than the .ppt format. What may be lost is the full functionality of the original digital object. For example, the graduate student in our example may not be able to view the conference presentation as a slide show as the Microsoft PowerPoint software program allows. However, the content of the conference presentation will be preserved.

IDEALS also recognizes that in some cases an access copy of a digital object is necessary due to the proprietary nature or cost of the software used to render it. For example, a Microsoft Word document is reliant on the Microsoft Word program to render it; IDEALS will also provide a .pdf version of the document because .pdf readers are freely and readily available. In some cases, the access copy and the preservable copy may be one and the same (a .pdf/a version, for example).

Categories of Preservation Support

IDEALS categorizes digital objects into three categories of preservation support. These categories are defined below. Any format not yet reviewed and evaluated by IDEALS will receive Category 3 support on deposit. A different category may be assigned after format review takes place.

Category 1 - Highest Confidence - Full Support

Description

Most confidence in ability to provide long-term preservation of content and functionality.
Highest level of preservation support, in an effort to maintain the viability, renderability, and understandability as well as the functionality of the original digital object.

Criteria

  • Is in a format that is publicly documented (example: .xml);
  • Is in a format that is widely adopted (example: .xhtml);
  • Is in a format that may be rendered by multiple software packages (example: .txt);
  • Is in a format that uses lossless or no data compression (example: uncompressed TIFF files); and
  • Contains no embedded files or dynamic content (example: .txt).

Actions

  • Monitor file format for changes that might warrant transformation or reassessment;
  • Migrate the document to a successive format when necessary;
  • Basic preservation including:
    • bitstream maintenance;
    • persistent, permanent identifier;
    • preservation metadata;
    • onsite and offsite backup copies;
    • regular virus and file corruption checks;
    • periodic refreshments to new storage media.

Examples

  • Plain text document in Unicode
  • An uncompressed TIFF image

Category 2 - Moderate Confidence - Intermediate Support

Description

Moderate confidence level in ability to provide long-term preservation of the content of the file.
Intermediate level of preservation support, in an effort to maintain the viability, renderability, and understandability (but not the functionality) of the original digital object.

Criteria

  • Is in a format that is publicly documented;
    • AND is in a format that has lossy data compression (example: Ogg Vorbis);
    • OR is in a version of a format that has been deprecated in favor of a later version (example: HTML 3.0).

OR

  • Is in a proprietary format;
  • Is in a format that is widely adopted; and
  • Is in a format that is of enough public and/or commercial interest that tools are likely to be available to migrate them to successor formats.

NOTE: Files with embedded content (for example, a PowerPoint (.ppt) with an AVI video file (.avi) inserted into it) are more preservable if the files are deposited as separate files within the same item in IDEALS. If the content remains embedded, it will likely not remain intact when the file is transformed to a more preservable format.

NOTE: Files with dynamic content (for example, an Excel spreadsheet (.xls) with dynamic functions - even simple ones!) are more preservable if the dynamic content is either documented (for example, a note in an Excel spreadsheet explaining the functions that are included) or the document is saved as a static document (for example, a cell in an Excel spreadsheet that is the sum of a column is saved as the sum, not the function of adding the multiple cells).

Actions

  • Monitor file format for changes that might warrant transformation or reassessment;
  • When possible, transform to a format that preserves the content and, when possible, the formatting and style of the original, but not necessarily the functionality;
  • Basic preservation of original object including:
    • bitstream maintenance;
    • persistent, permanent identifier;
    • preservation metadata;
    • onsite and offsite backup copies;
    • regular virus and file corruption checks;
    • periodic refreshments to new storage media.

Examples

  • Microsoft Word document (proprietary format)
  • A compressed TIFF file

Category 3 - Low Confidence - Basic Preservation Only

Description

Low confidence level in ability to provide long-term preservation of the content of the file.
Basic level of preservation support, in an effort to maintain the viability of the original digital object only.

Criteria

  • Is in a proprietary format;
  • Is in a format about which little information is publicly available;
  • Is in a format that is not widely adopted;
  • Is in a format with lossy data compression;
  • Is supported by a single or very few software platforms; and/or
  • Is in a format that does not meet the criteria for any of Categories 1-2.

Actions

  • Basic preservation of original object only, including:
    • bitstream maintenance;
    • persistent, permanent identifier;
    • preservation metadata;
    • onsite and offsite backup copies;
    • regular virus and file corruption checks;
    • periodic refreshments to new storage media.

Examples

  • Kodak Photo CD format (.pcd)
  • Windows Media Video (.wmv)
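
Taken together, the criteria for the three categories can be read as a decision procedure. The sketch below is one possible reading in Python, not the review process IDEALS actually follows: the `FormatProfile` fields and the `support_category` function are hypothetical names, and excluding deprecated versions from Category 1 is an inference from the HTML 3.0 example under Category 2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormatProfile:
    """Hypothetical summary of a reviewed file format (illustrative fields)."""
    publicly_documented: bool
    widely_adopted: bool
    multiple_renderers: bool
    lossy_compression: bool
    embedded_or_dynamic_content: bool
    deprecated_version: bool = False
    migration_tools_likely: bool = False


def support_category(profile: Optional[FormatProfile]) -> int:
    """Return 1, 2, or 3 under one literal reading of the criteria above."""
    # Formats not yet reviewed receive Category 3 support on deposit.
    if profile is None:
        return 3

    # Category 1: every criterion must hold.
    if (profile.publicly_documented
            and profile.widely_adopted
            and profile.multiple_renderers
            and not profile.lossy_compression
            and not profile.embedded_or_dynamic_content
            and not profile.deprecated_version):  # assumption, see lead-in
        return 1

    # Category 2, first branch: open format that is lossy or deprecated.
    open_but_limited = profile.publicly_documented and (
        profile.lossy_compression or profile.deprecated_version
    )
    # Category 2, second branch: proprietary, but widely adopted and migratable.
    proprietary_but_migratable = (
        not profile.publicly_documented
        and profile.widely_adopted
        and profile.migration_tools_likely
    )
    if open_but_limited or proprietary_but_migratable:
        return 2

    # Category 3: anything that does not meet the criteria for Categories 1-2.
    return 3
```

Under this reading, `support_category(None)` returns 3, which matches the rule that formats not yet reviewed receive Category 3 support until a format review takes place.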

Table of Preservation Actions

| Preservation Action | Category 1 | Category 2 | Category 3 |
|---|---|---|---|
| Provision of persistent identifier for object and/or its metadata | X | X | X |
| Creation of preservation metadata | X | X | X |
| Secure storage and backup | X | X | X |
| Regular fixity checks | X | X | X |
| Regular virus checks | X | X | X |
| Periodic refreshment to new storage media | X | X | X |
| Storage of original digital object | X | X | X |
| Transformation to a more preservable format | N/A | X | |
| Strategic monitoring of format for changes | X | X | |
| Migration to successive format upon obsolescence | X | | |