Introduction to synthetic data for longitudinal data managers
Jools Kasmire invites you to learn about synthetic data.
Format
Interactive presentations followed by a live coding demonstration via Zoom.
Overview
Are you a longitudinal data manager, or do you work alongside data managers in a longitudinal study? We invite you to this interactive workshop, designed to introduce data managers and their colleagues to the fundamentals of synthetic data. The event opens with a presentation introducing our ongoing project, 'Skills Development for Managing Longitudinal Data for Sharing,' part of the Population Research UK initiative, funded by the Economic and Social Research Council and the Medical Research Council. This project aims to upskill data managers, giving them the tools and knowledge needed to manage longitudinal data effectively, with the goal of increasing data sharing.
Workshop details
This workshop is divided into four parts:
- Introduction to the Skills Development for Managing Longitudinal Data for Sharing project
We will begin with a brief presentation on our ongoing project. This session will focus on the importance of this project for the Longitudinal Population Studies community, highlighting the resources and training we are developing to support the effective management and sharing of longitudinal data.
- Introduction to Synthetic Data
We will then cover the basic concepts of synthetic data, show examples, and discuss why understanding synthetic data is crucial. Key questions addressed will include:
- What is synthetic data?
- What types of synthetic data are there?
- For what purposes can synthetic data be used?
- How can synthetic data be generated?
We will also explore the specific benefits of synthetic data for longitudinal and biomedical studies, discussing how it can enhance data management, sharing, and research efficiency.
- Live Coding Demonstration
After a short break, the workshop will transition to a live coding session using Jupyter notebooks. We’ll demonstrate several methods in Python for generating various forms of synthetic data.
- Q&A
We will conclude the workshop with an open Q&A session, offering participants the opportunity to engage directly with the content and ask questions related to the presentation or live coding demonstration.
Presenters: Jools Kasmire, Cristina Magder and Hina Zahid.
Resources
Participants will have access to the slide decks, Jupyter notebooks, and other materials via a GitHub repository. Additionally, a recording of the workshop will be available after the event.
Prerequisites
While there are no formal prerequisites for attendance, those who wish to follow along with the coding demonstration will need:
• Access to a computer with Python installed, or to an online Python environment.
• Basic Python competency (e.g., ability to load packages, read in data, assign variables, and write data to a .csv file).
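As a rough self-check of the skill level described above, the sketch below loads packages, reads in data, assigns variables, and writes CSV output. It is not taken from the workshop materials: the toy data, the column names, and the helper functions (`read_rows`, `shuffle_column`, `write_rows`) are all hypothetical, and shuffling a column is only a very crude illustration of altering data, not one of the methods the presenters will demonstrate.

```python
import csv
import io
import random

# Hypothetical toy data standing in for a real file (assumption, not workshop data).
RAW_CSV = """participant_id,wave,age
1,1,34
2,1,51
3,1,47
"""


def read_rows(text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def shuffle_column(rows, column, seed=0):
    """Return copies of the rows with the values of one column shuffled.

    This breaks the link between that column and the rest of each record.
    It is a toy illustration only and offers no privacy guarantee.
    """
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


def write_rows(rows):
    """Write the rows as CSV text (an in-memory stand-in for a .csv file)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = read_rows(RAW_CSV)
synthetic = shuffle_column(rows, "age", seed=42)
output = write_rows(synthetic)
print(output)
```

To write to disk instead, the same `csv.DictWriter` calls work on a file opened with `open("some_file.csv", "w", newline="")`. If each step here is easy to follow, you should be comfortable following the live coding session.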
Any questions regarding this workshop can be sent to comms@ukdataservice.ac.uk.