A plan for storage of research data
A plan for the storage of research data is essential for both the short and the long term. The short-term plan covers how to store data during research, whereas the long-term plan identifies how and where to store data for archiving and future reuse after research activities end.
Which storage media?
The accessibility of data depends on the quality of the storage media and the availability of the relevant data-reading equipment. An Amstrad floppy disc may still work perfectly 20 years after it was made, but the lack of working machines means that the data on this disc may not be easily recoverable.
Optical media, such as CDs and DVDs, are vulnerable to damage from poor handling and from changes in temperature, relative humidity, air quality and lighting conditions. Data files should be copied to new media every two to five years after they were first created.
It is also good practice to check the data files on these discs at regular intervals. The National Preservation Office has published guidelines on Caring for CDs and DVDs.
Magnetic media, such as hard drives or tapes, are also subject to physical degradation and should be regularly migrated to fresh media.
File-sharing services, such as Google Docs, OneDrive and Dropbox, may not be suitable for confidential data.
We recommend that any storage strategy, even for a short-term project, should involve at least two different forms of storage, for example a hard drive and a DVD. Whichever form is chosen, the integrity of the data should be checked periodically.
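One common way to check data integrity is to record a checksum for every file when the data are first stored, then recompute and compare the checksums at each later check or after each copy to new media. The following is a minimal sketch of that approach in Python, not a prescribed procedure. The function names, the choice of SHA-256 and the idea of keeping one manifest per storage location are our own assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_problems(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose contents have changed."""
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {name}")
    return problems
```

A manifest built from the original files can be saved (for example as a JSON file) and used to check each copy: an empty list from find_problems means every recorded file is present and unchanged in that location.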
How to store confidential, sensitive and personal data?
Storage of data that are considered confidential or sensitive may need to be addressed during consent procedures, to inform the people to whom the data belong how and why the data will be stored.
Legally, data that contain personal information must be treated with more care than data that do not, as dictated by the Data Protection Act 2018 and the UK General Data Protection Regulation, which state that personal data should only be accessible to authorised persons.
Personal information can be removed from data files and stored separately under more stringent security measures. Any digital files or folders that contain sensitive information should be encrypted. For more information, consult our guidance on data security.
Signed consent forms or other non-digital records may contain identifying information and should be stored separately from data files, although an anonymous ID system can help link the two sets of materials together if required.
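As an illustration of separating personal information from data files and linking the two with an anonymous ID, the sketch below splits each record into an identifying part and a research part that share a randomly generated participant ID. It is a simplified example under stated assumptions: the field names ("name", "email"), the ID format and the function name are hypothetical, and removing direct identifiers does not by itself make data anonymous, because indirect identifiers may remain.

```python
import secrets

# Hypothetical field names; replace with the identifying fields in your own data.
IDENTIFYING_FIELDS = ("name", "email")


def split_records(records, identifying_fields=IDENTIFYING_FIELDS):
    """Split records into a key table (identifiers) and de-identified data.

    Each record receives a random ID that carries no personal information.
    The key table should be stored separately under stricter security.
    """
    key_table = []
    research_data = []
    used_ids = set()
    for record in records:
        participant_id = "P" + secrets.token_hex(4)
        while participant_id in used_ids:  # regenerate on the rare collision
            participant_id = "P" + secrets.token_hex(4)
        used_ids.add(participant_id)

        identifiers = {k: v for k, v in record.items() if k in identifying_fields}
        remainder = {k: v for k, v in record.items() if k not in identifying_fields}
        key_table.append({"participant_id": participant_id, **identifiers})
        research_data.append({"participant_id": participant_id, **remainder})
    return key_table, research_data
```

The same ID can be written on the corresponding paper consent form, so that the form and the data file can be matched again if required without either one revealing the other's contents.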
Which physical conditions for storing data?
Areas and rooms designated for storage of digital or non-digital data should be suitable for this purpose. The conditions under which data are stored significantly affect their longevity.
A personal computer is more likely to suffer from a fatal crash in a stiflingly hot office than in a temperature-controlled environment. The same applies to optical and magnetic storage media. The storage rooms should be structurally sound, free from the risk of flood and as far away as possible from the risk of fire.
Printed materials and photographs are subject to degradation from sunlight and acid. High-quality media should be used for preparing paper-based materials for storage, or for copies of originals. Examples include using acid-free paper, folders and boxes, and non-rust paperclips rather than staples.
What kind of documentation or metadata to accompany stored data?
Comprehensive and accurate documentation is essential for informed and accurate use (and reuse) of data at any time in the future. Our section on data documentation provides detailed advice.
Making digital versions of paper documentation in PDF/A format is recommended for long-term storage.
Which file formats for long-term storage?
Our guide to recommended file formats for data preservation gives advice on the best file formats for various data types. The best formats for long-term storage are generally non-proprietary formats or formats based on open standards.
Some proprietary formats, like Microsoft’s Rich Text Format and Excel, are widely used and likely to be accessible for a reasonable length of time.