Ensuring quality of data at all stages

Quality control of data is an integral part of all research and takes place at several stages: during data collection, during data entry or digitisation, and during data checking. It is vital to develop suitable procedures before data gathering starts.

Data collection

During data collection researchers must ensure that the data recorded reflect the actual facts, responses, observations and events. The quality of data collection methods used strongly influences data quality. Documenting in detail how data are collected provides evidence of such quality.

Quality control measures during data collection may include:

  • Calibration of instruments to check the precision, bias and/or scale of measurement.
  • Taking multiple measurements, observations or samples.
  • Checking the truth of the record with an expert.
  • Using standardised methods and protocols for capturing observations, alongside recording forms with clear instructions.
  • Using computer-assisted interview software to standardise interviews, verify response consistency, route and customise questions so that only appropriate questions are asked, confirm responses against previous answers where appropriate, and detect inadmissible responses.
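
As an illustration of the routing and admissibility checks that computer-assisted interview software performs, the following is a minimal sketch only. The question names (employed, hours_per_week) and their rules are hypothetical and do not come from any particular interviewing package.

```python
from typing import Optional

# Hypothetical admissible responses for two illustrative questions.
ADMISSIBLE = {
    "employed": {"yes", "no"},
    "hours_per_week": range(1, 169),  # whole hours, 1 to 168 in a week
}


def next_question(answers: dict) -> Optional[str]:
    """Route the interview so that only appropriate questions are asked."""
    if "employed" not in answers:
        return "employed"
    # Ask about working hours only if the respondent said they are employed.
    if answers["employed"] == "yes" and "hours_per_week" not in answers:
        return "hours_per_week"
    return None  # no further questions apply


def is_admissible(question: str, response) -> bool:
    """Detect inadmissible responses at the moment they are entered."""
    return response in ADMISSIBLE[question]
```

In this sketch a respondent who answers "no" to the first question is never asked about working hours, and a response of 200 hours per week would be rejected when it is entered rather than discovered later during checking.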

Digitisation and data entry

When data are digitised, transcribed, entered into a database or spreadsheet, or coded, quality is ensured by standardised and consistent procedures for data entry with clear instructions. This may include:

  • Setting up validation rules or input masks in data entry software.
  • Using data entry screens.
  • Using controlled vocabularies, code lists and choice lists to minimise manual data entry.
  • Detailed labelling of variable and record names to avoid confusion.
  • Designing a purpose-built database structure to organise data and data files.
  • Accompanying notes and documentation about the data.
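
Validation rules, input masks and code lists can be sketched in a few lines. The example below is one possible approach, not a prescribed method. The field names, the identifier mask (two capital letters followed by four digits), the region code list and the age range are all assumptions made for illustration.

```python
import re

# Hypothetical code list and input mask for illustration.
REGION_CODES = {"N", "S", "E", "W"}
ID_MASK = re.compile(r"[A-Z]{2}\d{4}")  # e.g. "AB1234"


def validate_record(record: dict) -> list:
    """Return a list of problems found in one entered record (empty if none)."""
    errors = []
    # Input mask: the identifier must match the expected pattern exactly.
    if not ID_MASK.fullmatch(str(record.get("respondent_id", ""))):
        errors.append("respondent_id does not match mask AA9999")
    # Code list: only values from the controlled vocabulary are accepted.
    if record.get("region") not in REGION_CODES:
        errors.append("region is not in the code list")
    # Validation rule: age must be a whole number in a plausible range.
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age must be a whole number from 0 to 120")
    return errors
```

Rejecting or flagging a record at the point of entry, as here, is generally cheaper than tracing the same error back to its source after the data have been analysed.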

Data checking

During data checking, data are edited, cleaned, verified, cross-checked and validated. Checking typically involves both automated and manual procedures. This may include:

  • Double-checking the coding of observations or responses, and checking for out-of-range values.
  • Checking data completeness.
  • Adding variable and value labels where appropriate.
  • Verifying random samples of the digital data against the original data.
  • Double entry of data.
  • Statistical analyses, such as frequencies, means, ranges or clustering to detect errors and anomalous values.
  • Correcting errors made during transcription.
  • Peer review.
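
Several of these checks lend themselves to automation. The sketch below shows one way to compare two independent entries of the same data, list missing values and flag out-of-range values. It assumes that each record is a dictionary with a unique key field. The function names and the field name id are hypothetical.

```python
def compare_double_entry(first, second, key="id"):
    """Return (key value, field, first value, second value) for each disagreement."""
    second_by_key = {row[key]: row for row in second}
    mismatches = []
    for row in first:
        other = second_by_key.get(row[key])
        if other is None:
            # The record was entered once but not a second time.
            mismatches.append((row[key], None, "present", "missing"))
            continue
        for field, value in row.items():
            if other.get(field) != value:
                mismatches.append((row[key], field, value, other.get(field)))
    return mismatches


def missing_values(rows, required):
    """Completeness check: return (row index, field) for each empty required field."""
    return [
        (index, field)
        for index, row in enumerate(rows)
        for field in required
        if row.get(field) in (None, "")
    ]


def out_of_range(rows, field, low, high):
    """Return the rows whose value for the field lies outside low..high."""
    return [
        row for row in rows
        if row.get(field) is not None and not low <= row[field] <= high
    ]
```

Automated checks of this kind only flag candidate errors. Deciding which of two conflicting entries is correct still requires going back to the original record.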

Data authenticity

Because digital information can be copied or altered so easily, it is important to be able to demonstrate the authenticity of data and to be able to prevent unauthorised access to data that may potentially lead to unauthorised changes.

Best practice to ensure authenticity is to:

  • Keep a single master file of data.
  • Assign responsibility for master files to a single project team member.
  • Regulate write access to master versions of data files.
  • Record all changes to master files.
  • Maintain old master files in case later ones contain errors.
  • Archive copies of master files at regular intervals.
  • Develop a formal procedure for the destruction of master files.
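
One possible way to record changes to a master file and to demonstrate later that it has not been altered is to log a checksum with every change. The sketch below uses SHA-256 from the Python standard library. The tab-separated log layout and the function names are assumptions for illustration, not an established standard.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_change(master_path, log_path, author, note):
    """Append one line recording who changed the master file, when, and its checksum."""
    checksum = sha256_of(master_path)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(
            f"{stamp}\t{author}\t{Path(master_path).name}\t{checksum}\t{note}\n"
        )
    return checksum
```

Recomputing the checksum of an archived copy and comparing it with the logged value shows whether the file has changed since that log entry was written. It does not by itself prevent unauthorised changes, so write access still needs to be regulated.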

Adding value

Researchers can add significant value to their datasets by including additional variables or parameters that widen the possible applications.

Including standard parameters or generic derived variables in data files may substantially increase the potential reuse value of a dataset and provide new avenues for research.

For example, geo-referencing data may allow other researchers to more easily add value to data and apply the data in geographical information systems. Equally, sharing field notes from an interviewing project can help enrich the research context.

Quality assurance of recorded interviews

The quality of interview data gathered by means of recorded interviews depends on both the skill of the interviewer and the quality of the audio-visual equipment. Initially a researcher should think about:

  • The level of sound or picture quality needed.
  • The budget available for equipment and related consumables.
  • How quickly the technology being used will become redundant.
  • Whether consent is in place to allow fullest use of recordings.
  • How the data created will be used.
  • Whether data or information not allowed by consent can be excluded from recording.
  • Whether the equipment will be simple to operate in the field.

Taking steps to create audio recordings of good quality increases their usefulness. Good quality sound recordings help to prevent mis-transcription and reduce the chance of sections of an interview remaining untranscribed due to poor sound quality.

Although some recording equipment can be expensive, it is a good investment if it is to be used time and again during a project, or even again on future projects.