Storage and Management - Local History Digitisation Manual
Once the digital files have been created, they need to be stored and managed in a way that makes them available for use via the access mechanism decided on for the project and ensures their preservation for the period of access required. Two main issues to be considered are:
Unlike physical media, digital media cannot be accessed without computer hardware and software. Digital data is also vulnerable to loss if the magnetic or optical media on which it is stored become corrupted. An appropriate data management strategy is therefore critical to the ongoing viability of a digital collection.
It is important to consider the IT infrastructure which is available to assist with the digitisation project. If in-house IT staff and equipment are to be utilised, they may require additional funding to cover the increased workload and/or equipment requirements. Additional training may also be necessary in specific areas of expertise connected with digitisation. If IT support is to be outsourced, this must be properly costed and documented to ensure that both sides have the same understanding of the level of support which is to be provided. IT problems can often become larger or more complex than expected and parties on each side of an agreement should be aware of how they will deal with this if it should occur.
The cost of storage and management of digital data is highly dependent on the IT infrastructure used, the volume of data to be stored and the level of data protection required. Approaches which leverage off existing arrangements for storage of data - such as the library's integrated library management system (current systems often include a module for storage of image data) or a Local Council's document storage system - are likely to be more cost-effective than something instituted to serve the digitising operation alone. A central approach to data storage, whether outsourced to a commercial organisation or managed collaboratively, is also likely to create economies of scale for small-scale digital collections. It is important to obtain professional IT advice on the appropriate storage requirements for the project.
Documented file naming protocols are imperative for any digitisation project. The digital files which are created will need to be named so that they can be meaningfully linked to the appropriate metadata and access mechanisms, and so that they can be located easily for future management. A naming protocol which is meaningful to humans may be appropriate in a small collection where files will be managed manually. However, in collections of tens or hundreds of thousands of items, files are more likely to be subject to some form of automated management, and file names which are meaningful to the computer system may therefore be more appropriate. This decision will be specific to each project. If interoperability is a desirable outcome, it may also depend on the protocols being used in other systems.
Whatever protocol is chosen, it should be clearly documented. If a mechanism for automatic generation is used, for example using existing thesauruses, there are likely to be fewer misnamed (and therefore possibly permanently lost) files. The State Library of Victoria uses an 8-character file name in which the first 2 alphabetical characters refer to the collection and the final 6 numerical characters are a sequential number. Consideration should be given to the effect of using leading zeros before sequential numbers if later items may need to be inserted into the sequence for any reason.
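As an illustration only, the 8-character pattern described above (2 letters for the collection, 6 digits for a sequential number) could be generated and checked automatically along the following lines. The function names, the lower-casing of the prefix and the example collection code "ph" are assumptions made for this sketch; they are not taken from the State Library of Victoria's own system.

```python
import re


def make_file_name(collection_code: str, sequence_number: int) -> str:
    """Build an 8-character name: 2 letters for the collection, 6 digits for the sequence."""
    if not re.fullmatch(r"[A-Za-z]{2}", collection_code):
        raise ValueError("collection code must be exactly 2 letters")
    if not 0 <= sequence_number <= 999_999:
        raise ValueError("sequence number must fit in 6 digits")
    # Zero-padding keeps every name the same length, so names sort in numeric order.
    return f"{collection_code.lower()}{sequence_number:06d}"


def parse_file_name(name: str) -> tuple[str, int]:
    """Split a name back into (collection code, sequence number), rejecting misnamed files."""
    match = re.fullmatch(r"([a-z]{2})([0-9]{6})", name)
    if match is None:
        raise ValueError(f"not a valid file name: {name!r}")
    return match.group(1), int(match.group(2))


print(make_file_name("ph", 42))     # ph000042
print(parse_file_name("ph000042"))  # ('ph', 42)
```

Because the numbers are strictly sequential, a scheme like this leaves no room to insert a new item between ph000042 and ph000043. That is the kind of consequence which should be weighed before the protocol is fixed.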
There are a number of options for storage of digital files; however, the main storage options are:
Digital files may require a large amount of computer disk storage space, particularly if high quality master files are being stored. A photograph may take up tens of megabytes of storage space if it is in high resolution uncompressed TIFF format. However the same photograph may only take up a few hundred kilobytes if it is in lower resolution JPEG format. In a collection of hundreds or thousands of digital items, storage can be a substantial cost. However the cost of disk storage is constantly decreasing and tape or CD-ROM storage may offer an alternative for material where instant access is not required. Storage can also be outsourced to companies who specialise in data storage.
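The storage figures above can be checked with simple arithmetic: an uncompressed image needs roughly width in pixels × height in pixels × bytes per pixel. The sketch below assumes a 10 × 8 inch photograph scanned at 600 dpi in 24-bit colour. These scanning parameters are illustrative assumptions, not recommendations from this manual, and the result ignores the small overhead of the file header.

```python
def uncompressed_size_bytes(width_in: float, height_in: float,
                            dpi: int, bits_per_pixel: int = 24) -> int:
    """Approximate size of an uncompressed raster image, ignoring file headers."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


one_master = uncompressed_size_bytes(10, 8, 600)
print(one_master)                 # 86400000 bytes, about 86 MB per master file
print(one_master * 1000 / 1e9)    # 86.4 GB for a collection of 1,000 such masters
```

Running the same calculation for the intended scanning resolution and collection size gives a first estimate of disk requirements to take to an IT adviser.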
This is often the default option for digital projects, as local servers can easily be made available and web site managers are familiar with the storage of image files in file structures on these servers. For a small number of files this may be an appropriate option. The system requires some manual management of the files and, if this is undertaken by an administrator unfamiliar with the specific files, there is a risk of data being lost or damaged.
Proprietary image databases
There are a number of image databases on the market now, many of which will store a wide range of digital file types. They may also offer additional functionality such as version control of changes made to files, multiple format storage or automatic generation of lower resolution copies of images, workflow management, user access authentication and cataloguing. The database architectures underlying these proprietary systems will be one of three basic types. These are: 'flat-table' (in which data is held as a series of single 'card file' type entries); 'relational' (where data is held in tables and relationships are specified between the data); or 'object-oriented' (the next generation, where data is represented as a complex object which can have attributes and operations performed upon it). The software systems available will therefore range from very simple databases which may allow in-house customisation, to professional digital object management systems. Some databases which appear simple and easy to use, may not have the flexibility required for a more complex project. Any choice of database will depend upon the specific requirements of the project or collection. The issue should not be decided without seeking professional IT advice.
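To make the 'relational' type more concrete, the following minimal sketch uses Python's built-in sqlite3 module to hold item descriptions in one table and the stored copies of each item in another, linked by file name. The table names, column names, titles and file locations are all invented for illustration; a real image database would have a much richer schema.

```python
import sqlite3


def build_catalogue(conn: sqlite3.Connection) -> None:
    """Create two related tables: one row per item, one row per stored copy."""
    conn.execute(
        "CREATE TABLE items (file_name TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE copies ("
        " file_name TEXT NOT NULL REFERENCES items(file_name),"
        " format TEXT NOT NULL,"
        " location TEXT NOT NULL)"
    )


def copies_for(conn: sqlite3.Connection, file_name: str) -> list:
    """Return (format, location) for every stored copy of one item."""
    rows = conn.execute(
        "SELECT format, location FROM copies WHERE file_name = ? ORDER BY format",
        (file_name,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")  # throwaway in-memory database for the example
build_catalogue(conn)
conn.execute("INSERT INTO items VALUES (?, ?)",
             ("ph000042", "Main street, looking north"))  # invented sample record
conn.executemany(
    "INSERT INTO copies VALUES (?, ?, ?)",
    [("ph000042", "TIFF", "masters/ph000042.tif"),
     ("ph000042", "JPEG", "access/ph000042.jpg")],
)
print(copies_for(conn, "ph000042"))
```

Here one item is related to two copies, a high-resolution master and a lower-resolution access copy, without the descriptive information being repeated for each file. A flat-table system would typically need a separate full entry for each copy.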
Complex digital object management systems
Many large libraries and museums have developed complex software and hardware architectures to house digital files and associated metadata. This is a very specialised area and generally well beyond the resources of smaller organisations. However examination of such systems can provide valuable information on developing technologies such as SGML/XML which may be used more widely in the future.
Data management can also be undertaken by an external agency specialising in data management. Data can be stored off-line on either magnetic (tape or disk) or optical (CD-ROM) storage and only delivered to users on request. Alternatively it can be stored on a server computer which is online and allows 24 hour 'live' remote access. Even if data is stored by an external agency, interface design can make it appear to users that it is all being delivered from the one collection.
For further information on image management systems see Kenney & Rieger, 119.
However the data is stored, it is imperative that an appropriate data management strategy is in place to ensure ongoing access to the items. This should include appropriate and robust back-up systems including off-site storage of one copy at all times.
Long-term storage of digital files is fraught with difficulty due to the rapid changes in hardware and software. There are three possible solutions currently advocated for ensuring long-term access to digital data:
Increasingly the third option is being seen as the most realistic. (Lee, 146) It is also important to refresh data regularly by copying files onto new storage media, as magnetic storage media (tape and disk) have a finite life span. For long-term management, however, it will also be necessary to have a data migration plan in place. This will need to ensure that, at appropriate times, data is transferred to new file formats which can be accessed by current hardware and software. This reduces the risk of data being held in outdated file formats and becoming unreadable at some stage in the future.
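Refreshing is only useful if the new copy is identical to the old one. One common way to confirm this is to compare a checksum of the file before and after copying. The sketch below uses only the Python standard library. The function names are hypothetical, and the use of SHA-256 checksums is an assumption of this example rather than a practice prescribed by this manual.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(source: Path, destination: Path) -> str:
    """Copy a file to new storage and confirm the copy matches the original."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copies the contents and file timestamps
    original_checksum = sha256_of(source)
    if sha256_of(destination) != original_checksum:
        raise OSError(f"copy of {source} does not match the original")
    return original_checksum
```

Recording the returned checksum alongside the file's metadata would allow later refresh cycles to detect a copy that has been silently corrupted in storage.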
Consideration should be given to both security of the actual data, including proper back-up and data migration strategies, and security of the intellectual property contained within the data.
To protect the intellectual property in the data it may be necessary to restrict access in some way and/or to embed security information into the file itself. A number of cultural organisations have placed visible or invisible watermarks into their digital images in order to protect against unauthorised further use of the image. A number of companies sell watermarking technologies and this is a rapidly changing field. For an example of the use of watermarks, see the Australian War Memorial site which includes visible watermarks on many of its photographic images in order to prevent unauthorised reproduction. (see examples at http://www.awm.gov.au)
Watermarking technologies are 'far from stable'; they can be expensive to implement and may not be totally effective. (Lee, 144)
It may also be appropriate to restrict access to the collection using password protection or some other form of access authorisation. Any decision made in this area will be highly dependent upon both copyright restrictions and management or business plans for the collection. It is likely to be necessary to seek professional IT assistance to establish the appropriate security strategies for your project.
There is still considerable debate regarding the appropriateness of digital technologies for long-term preservation. Although digitisation may assist with the preservation of originals by reducing the use of fragile physical items, there is considerable risk that digital information itself will become inaccessible in a relatively short space of time due to degradation of storage media and/or upgrading of hardware, software and file formats.
Loss of access to material can be caused by the degradation of the storage media, loss of functionality of the access devices, loss of manipulation and presentation functions, or dysfunction in the documentation chains. The challenge in the preservation of digital material is to retain the essence of the original material by maintaining its function, content and context. (see http://www.swin.edu.au/afi/digital%20preservation.pdf)
A digitisation project should not consider preservation as one of its goals unless there has been considerable planning and research into the long-term viability of all technical and management processes put in place. Digital preservation should not be attempted without expert advice from both archival and technical experts.