Webinar: Web-scraping for Social Science Research: Websites as a Source of Data

23 Apr 2020 2:00 pm - 3:00 pm
Online
Training
Data skills
Other
Vast swathes of our social interactions and personal behaviours are now conducted online and/or captured digitally. In addition to common sources such as social media/network platforms and text corpora, websites and online databases contain rich information of relevance to social science research. Thus, computational methods for collecting data from the web are an increasingly important component of a social scientist’s toolkit.
This free webinar, organised by the UK Data Service, is the second in a series of three on how to collect data from the web using computational methods. Specifically, it sets out the value, logic and process of capturing data stored on websites. Presented by Dr Diarmuid McDonnell of the UK Data Service, it will cover the step-by-step process of collecting data from a web page and will provide sample code written in the popular Python programming language. It will demonstrate web-scraping techniques for capturing real-time information on the Covid-19 pandemic, as well as data for the presenter's own research specialism (charitable organisations).
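To give a flavour of the step-by-step process described above, here is a minimal sketch in Python. It is not the webinar's own sample code: it uses only the standard library rather than any particular scraping package, and the names `TableParser`, `fetch_html`, `extract_rows` and `SAMPLE_HTML`, along with the charity figures in the sample page, are hypothetical and made up for illustration. The three steps are: request the page, parse its HTML, and extract the values of interest into a structured form.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being read, if any
        self._cell = None   # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def fetch_html(url):
    """Step 1: request a web page and return its HTML as text."""
    with urlopen(url, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_rows(html):
    """Steps 2 and 3: parse the HTML and extract the table rows."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# A made-up page standing in for a real website (illustrative data only).
SAMPLE_HTML = """
<table>
  <tr><th>Charity</th><th>Income</th></tr>
  <tr><td>Example Trust</td><td>1200</td></tr>
  <tr><td>Sample Foundation</td><td>3400</td></tr>
</table>
"""

rows = extract_rows(SAMPLE_HTML)
header, records = rows[0], rows[1:]
print(header)
print(records)
```

For a live page, `extract_rows(fetch_html(url))` would chain the steps together, assuming the site's terms of use permit automated collection and the page stores its data in an HTML table.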
Webinar one, on 27 March, will provide an example of a published piece of social research that utilised web-scraping techniques to generate a novel, linked administrative dataset to evaluate a regulatory intervention.
Webinar three, on 30 April, will demonstrate how to use application programming interfaces (APIs) to download data from the web, again using the Python programming language.
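By way of contrast with scraping a web page, the following sketch shows roughly what downloading data through an API can look like in Python. It is an illustration under stated assumptions, not material from the webinar: the endpoint `https://api.example.org/charities`, its query parameters and the shape of the JSON response are all hypothetical, as are the helper names `build_url`, `fetch_json` and `to_records`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_url(base_url, params):
    """Combine an endpoint and its query parameters into a request URL."""
    return f"{base_url}?{urlencode(params)}"


def fetch_json(url):
    """Request the URL and decode the JSON body of the response."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def to_records(payload, fields):
    """Keep only the chosen fields from each item in the response."""
    return [{field: item.get(field) for field in fields}
            for item in payload["results"]]


# Hypothetical endpoint and parameters, for illustration only.
url = build_url("https://api.example.org/charities", {"q": "health", "page": 1})

# A made-up response body standing in for what such an API might return.
sample_payload = json.loads(
    '{"results": [{"name": "Example Trust", "income": 1200, "region": "North"}]}'
)
records = to_records(sample_payload, ["name", "income"])
print(url)
print(records)
```

Unlike scraped HTML, an API response is already structured, so the work shifts from locating values in markup to selecting the fields that are needed.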
There is also a parallel webinar series focusing on getting, storing and manipulating data that illustrates a variety of complementary techniques for collecting data from the web.