Backup

Protect against accidental or malicious data loss

Making backups of files is an essential element of research data management which ensures that original data files can be restored from backup copies, should they get damaged or go missing.

Regular backups help protect against accidental or malicious data loss due to:

  • human error
  • hardware failure
  • software or media faults
  • virus infection or malicious hacking
  • power failure

The form of backup procedure required for a project will depend on local circumstances, the perceived value of the data and the levels of risk of losing data you are prepared to take. Carrying out an informal risk analysis can provide a good indication of backup needs.

Your risk analysis for planning a backup strategy for your research data

Is there any backup provision already in place?

Find out if your institution has an operational backup policy. Most universities have one for files held on a university network space. In most cases, these policies do not cover your local drive; you must back up this drive manually if you use it for data storage. If you are not happy with the robustness of the solution, you should carry out an independent backup of critical files.

Which systems to back up?

You need a strategy for all systems where data are held, including portable computers and devices, non-network computers and home-based computers.

It will be important to identify which information on these systems should be backed up. This could be all, some or just the parts that have changed. If your institution does not provide any system backup, you may need to take full responsibility for all your own backups.
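As an illustration of backing up "just the parts that have changed", the following is a minimal Python sketch that lists files modified since the previous backup. The function name `files_changed_since` and the use of file modification times are assumptions made for this example; dedicated backup tools may detect changes differently.

```python
from pathlib import Path


def files_changed_since(root, last_backup_time):
    """Return files under `root` modified after `last_backup_time`.

    `last_backup_time` is a POSIX timestamp (seconds since the epoch).
    Hypothetical helper: it relies on modification times only.
    """
    changed = []
    for path in Path(root).rglob("*"):
        # Skip directories; compare each file's modification time.
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            changed.append(path)
    return sorted(changed)
```

Passing `0` as the timestamp selects every file, which corresponds to a full backup.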

What file formats to use?

Backups of master copies should ideally be in file formats that are suitable for long-term digital preservation, i.e. open or standard formats as opposed to proprietary ones.

How often should I back up my data?

Consider how often you make changes to your data, and how much changed data you are prepared to lose between backups. Consider backing up after each change to a data file or at regular intervals, such as daily or weekly.
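Once you have chosen an interval, a simple check can flag when a backup is due. The sketch below assumes you record the time of the last backup yourself; the function name `backup_overdue` is hypothetical.

```python
from datetime import datetime, timedelta


def backup_overdue(last_backup, max_interval, now=None):
    """Return True if more than `max_interval` has passed since `last_backup`."""
    now = now or datetime.now()
    return now - last_backup > max_interval


# Example: a daily backup policy, with the last backup two days ago.
two_days_ago = datetime.now() - timedelta(days=2)
print(backup_overdue(two_days_ago, timedelta(days=1)))
```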

How many copies should be made?

Most backup policies recommend keeping at least three copies of the data, with at least one stored offsite.

Where should I store my backups?

Depending on the form of backup and the risks associated with data loss, it is usually most convenient to keep backup files on a networked hard drive. For critical data that are not available elsewhere, we recommend that you also adopt offline storage on optical media, removable hard drives or magnetic tapes.

If you are backing up many small data files on a daily basis, copying them to a recordable DVD probably suffices. If you are making backups of very large quantities of data from a networked hard drive, a removable hard drive or even magnetic tape is probably more convenient. Never rely on pen drives as backup media.

Physical media should be safely stored in another location. Most manufacturers provide recommendations for the best storage conditions of physical media.

Backups that contain personal data require encryption and should be stored securely. For more guidance on backing up personal data see the question below on ‘How about backing up personal data?’.

How should I organise my backups?

If you are making your own backups on removable media, make sure they are well-labelled, indicating the content and date/time, and well-organised. Without some management, achieving the ultimate aim of restoring lost data may prove difficult.
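One way to keep backups consistently labelled is to build the content and date/time into the file name. The following is a minimal sketch using Python's standard library; the naming pattern and the helper names `backup_label` and `make_labelled_archive` are assumptions for illustration, not a prescribed convention.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_label(project, when=None):
    """Build a label such as 'survey_backup_2024-03-05_143000'."""
    when = when or datetime.now()
    return f"{project}_backup_{when.strftime('%Y-%m-%d_%H%M%S')}"


def make_labelled_archive(source_dir, dest_dir, project, when=None):
    """Zip `source_dir` into `dest_dir` under a dated, descriptive name.

    `dest_dir` should be outside `source_dir` so the archive
    does not try to include itself.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    base = dest / backup_label(project, when)
    # shutil.make_archive appends the '.zip' extension and returns the path.
    return Path(shutil.make_archive(str(base), "zip", root_dir=source_dir))
```

Because the date is written year first, the archives sort chronologically in a file listing.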

Are there any tools I can use to help with backing up?

It is good practice to use an automated backup process for frequently used and critical data files. Windows and macOS both have built-in backup tools, File History and Time Machine respectively, which make backing up as simple as a few clicks.

How about backing up personal data?

Where data contain personal information, care must be taken to create only the minimal number of copies needed, for example, a master file and one backup copy, which is encrypted and securely stored. Otherwise, there could be a proliferation of data files containing personal information, which will be harder to securely destroy at the end of a project.

How can I verify and validate backup files?

It is important that you verify and validate backup files regularly by fully restoring them to another location and comparing them with the original. Backup copies can be checked for completeness and integrity, for example by checking the file size, date and MD5 checksum value. It is also worth considering how long the backed-up data should be retained and whether any data retention policies apply to them.
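The comparison of a restored copy with the original can be sketched in Python as follows. This is a minimal example, assuming the backup has already been restored to a separate folder; the function names `md5_checksum` and `compare_restored` are hypothetical. It checks file presence, size and MD5 checksum, but not file dates.

```python
import hashlib
from pathlib import Path


def md5_checksum(path, chunk_size=1024 * 1024):
    """Return the MD5 checksum of a file as a hex string, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restored(original_dir, restored_dir):
    """List (relative path, problem) pairs for files that fail the check."""
    original_dir = Path(original_dir)
    restored_dir = Path(restored_dir)
    problems = []
    for original in sorted(original_dir.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append((str(relative), "missing"))
        elif original.stat().st_size != restored.stat().st_size:
            problems.append((str(relative), "size differs"))
        elif md5_checksum(original) != md5_checksum(restored):
            problems.append((str(relative), "checksum differs"))
    return problems
```

An empty list means every original file was found in the restored copy with a matching size and checksum. MD5 is adequate for detecting accidental corruption; if deliberate tampering is a concern, `hashlib.sha256` could be swapped in.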