Building synthetic population data – new research available from the UK Data Service

Research from Understanding Society has been used to create a dataset that mimics the adult population in England, Scotland, and Wales, allowing researchers to model the potential impact of health interventions and policies. This dataset is now available in the UK Data Service’s collection.

What is the project about?

We know that the conditions in which people are born, grow up, live and work can affect their health. We also know that these social determinants of health, such as income, good quality homes, education, or work, are not distributed equally in society, which leads to health inequalities. Preventing ill health related to social and economic conditions should therefore be a priority for all policy sectors affecting the economy, welfare, housing, education, and employment.

The SIPHER Consortium is a major project funded by the UK Prevention Research Partnership, which aims to tackle some of these public health policy challenges. SIPHER brings together university researchers, local, regional and government policymakers, and groups working in public health. The consortium is working on new insights on the causes and consequences of poor health and building evaluation and data tools for researchers and policymakers.

The Understanding Society research played a vital role in helping the SIPHER Consortium to complete this project.

What is synthetic data?

The interactions between social, economic and health conditions are complex and can involve data that can be disclosive. This can make it difficult or even impossible for researchers to access these data. To make it easier to study the dynamic relationships between social determinants and health outcomes, SIPHER has now created a synthetic population, based on Understanding Society and UK Census data.

Using spatial microsimulation, the SIPHER Synthetic Population dataset captures attributes available in the UK Census, such as age, sex and ethnicity, plus additional information captured in Understanding Society. The dataset reflects the characteristics seen in the survey data, but with the distribution, scale and geographical coverage of the UK Census. Capturing small areas, the dataset provides a “digital twin” of the adult population in England, Scotland, and Wales. It’s important to note that the individuals in the dataset are “synthetic” people, not “real” ones.
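
The article does not say which spatial microsimulation algorithm SIPHER used. Purely as an illustration of the general idea, the sketch below uses iterative proportional fitting, one common reweighting technique, to adjust a handful of survey records until their weighted totals match census-style counts for a single area. The records, category labels, counts and the function name `ipf_weights` are all made up for this example and are not taken from the SIPHER dataset or its code.

```python
from collections import defaultdict


def ipf_weights(survey, constraints, iterations=50):
    """Reweight survey records so weighted totals match area counts.

    survey:      list of dicts, one per (hypothetical) survey respondent
    constraints: {variable: {category: target count for the area}}
    """
    weights = [1.0] * len(survey)
    for _ in range(iterations):
        for variable, targets in constraints.items():
            # Current weighted total for each category of this variable.
            totals = defaultdict(float)
            for person, weight in zip(survey, weights):
                totals[person[variable]] += weight
            # Rescale each record so its category total hits the target.
            for i, person in enumerate(survey):
                category = person[variable]
                if totals[category] > 0:
                    weights[i] *= targets[category] / totals[category]
    return weights


# Illustrative survey records: census-style attributes plus one
# survey-only attribute (good_health) that the census margins lack.
survey = [
    {"age": "16-44", "sex": "F", "good_health": True},
    {"age": "16-44", "sex": "M", "good_health": True},
    {"age": "45+", "sex": "F", "good_health": False},
    {"age": "45+", "sex": "M", "good_health": True},
]

# Illustrative census counts for one small area (both sum to 100).
constraints = {
    "age": {"16-44": 60, "45+": 40},
    "sex": {"F": 55, "M": 45},
}

weights = ipf_weights(survey, constraints)

# Weighted estimate of a survey-only attribute for the area.
good_health_estimate = sum(
    w for person, w in zip(survey, weights) if person["good_health"]
)
```

Each weight can be read as the number of synthetic adults in the area that a survey record stands for, which is how survey-only attributes such as health can be estimated at census geography. Repeating the fit for every small area would give a full synthetic population under these assumptions.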

What can synthetic data be used for?

The SIPHER Synthetic Population can help researchers and policymakers to swiftly fill important data gaps. In particular, the dataset can be used in simulation models when testing the likely effects of different policy options. Here, the dataset can reveal how policy interventions might affect population subgroups and areas across Great Britain differently.

Nik Lomax, SIPHER Consortium Co-Lead and Professor of Population Geography at the University of Leeds, talks to us on our Experts’ Expert podcast about how the process of creating this synthetic dataset works.

In his second podcast with us, he tells us about the overall purpose and objectives of the SIPHER Consortium.

To further improve access to this resource, SIPHER has developed an interactive dashboard. This tool allows exploration of an aggregated version of the synthetic population without any coding or data preparation. Its ‘click and explore’ format enables comparison of areas of interest, creation of bespoke detailed area profiles, development of customised data visualisations, and downloading of the aggregate data used.

Accessing the SIPHER Synthetic Population

Researchers can access the dataset via the UK Data Service curated collection:

The full replication package and supplementary materials can be accessed via the UK Data Service ReShare repository:

You can also explore the Dashboard and learn more about the dataset on the SIPHER website.

For support with the interpretation of results, to provide feedback and/or to discuss project ideas and applications, please direct enquiries marked ‘SIPHER Synthetic Population – Dashboard’ to sipher@glasgow.ac.uk