Generating Synthetic Data for Statistical Disclosure Control

16 Oct 2017 - 17 Oct 2017 12:00 am
Training
Data skills
Other

This short course will provide a detailed overview of the synthetic data approach to statistical disclosure control, covering all of its important aspects. Starting with a short introduction to data confidentiality in general and synthetic data in particular, the workshop will discuss the different approaches to generating synthetic datasets in detail. Possible modelling strategies and analytical validity evaluations will be assessed, and potential measures to quantify the remaining risk of disclosure will be presented.

To give participants hands-on experience, the course will include practical sessions using R, in which the students generate and evaluate synthetic data based on real data examples.

Target Audience

The course intends to summarize the state of the art in synthetic data. The main focus will be on practical implementation rather than on the motivation of the underlying statistical theory. Participants may be academic researchers or practitioners from statistical agencies working in the area of data confidentiality and data access. Basic knowledge of R is expected. Some background in Bayesian statistics is helpful but not obligatory.

 

Further course details can be found here.

More information regarding our courses can be found here.

Podcasts for some of our previous courses can be found here.

Course Leader: Dr Jörg Drechsler (IAB)

Course Outline:

The course covers:

  • the fully synthetic data approach
  • the partially synthetic data approach
  • modelling strategies for generating synthetic data
  • data utility evaluations
  • disclosure risk assessment

By the end of the course, participants will:

  • have a practical understanding of the concept of synthetic data 
  • be able to judge in which situations the approach could be useful
  • know how to generate synthetic data from their own data
  • have a number of tools available to evaluate the analytical validity of the synthetic datasets
  • know how to assess the disclosure risk of the generated data