Study-level documentation

Providing information on the research context and design

Study-level documentation for a data collection or dataset provides high-level information on the research context and design, the data collection methods used, any data preparations and manipulations, plus summaries of findings based on the data.

Good study-level data documentation should include information on the following:

  • Purpose for which data was created: Project history, aims, objectives and hypotheses, investigators, funders and publications from the data.
  • Content of the dataset: Kind of data (e.g. interviews, images, survey responses), the structure of data files, with the number of cases, records, files and variables, as well as any relationships among such items.
  • How the data was collected: Data collection methodology and protocols, sampling design, sample structure and representation, workflows, instruments used, hardware and software used, data scale and resolution, digitisation or transcription methods used, secondary data sources used.
  • Who collected the data, where and when: Data collectors, geographic coverage of the data collection, date of data collection (temporal coverage).
  • How data was processed: Tools, instruments, procedures, hardware/software or protocols used to edit, clean, code and classify the data.
  • Any modifications done to the data: Anonymisation and, for time series or longitudinal surveys, any changes made to methodology, variable content, question text, variable labelling, measurements or sampling.
  • Quality assurance procedures: Data validation, checking, proofing, cleaning and other quality assurance procedures carried out, such as checking for equipment and transcription errors, calibration procedures, data capture resolution and repetitions, or editing, proofing or quality control of materials.
  • How data can be accessed: Where the dataset is available from, persistent identifier, access conditions, use conditions, licence, copyright, information on data confidentiality, citation.
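
The checklist above can also be kept in machine-readable form, which makes gaps easy to spot before a dataset is deposited. The following is a minimal sketch only, not a UK Data Service schema: the class name, the field names and the helper function are all hypothetical, chosen to mirror the eight headings in the list.

```python
from dataclasses import dataclass, fields


@dataclass
class StudyDocumentation:
    """Hypothetical record with one free-text field per checklist heading."""
    purpose: str = ""
    content: str = ""
    collection_method: str = ""
    collectors_place_time: str = ""
    processing: str = ""
    modifications: str = ""
    quality_assurance: str = ""
    access: str = ""


def missing_sections(doc: StudyDocumentation) -> list[str]:
    """Return the names of sections that are empty or whitespace only."""
    return [f.name for f in fields(doc) if not getattr(doc, f.name).strip()]


# Illustrative values only; they do not describe a real study.
example = StudyDocumentation(
    purpose="Project aims, investigators and funder.",
    content="Survey responses: one file, with counts of cases and variables.",
    access="Persistent identifier, licence and citation.",
)
print(missing_sections(example))
```

Running the sketch lists the headings that still need text, in checklist order. In practice each field would hold or point to fuller material, such as a methodology report or a codebook.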

Example

Much of this information is usually already included in publications, final reports to funders, working papers and lab books. At the UK Data Service, all useful sources of information provided by the data producer are gathered and combined into a bookmarked PDF/A user guide that accompanies a data collection.

An example of online study-level documentation for a data collection is available in our data catalogue. It includes questionnaires and user notes, a technical report with coding frames and weighting information, and methodological information.