Working with Reddit data
This free workshop will introduce the fundamentals of natural language processing (NLP) and sentiment analysis, using Reddit as a case study for social media research. We will explore the ethical and legal considerations of working with publicly available online data and discuss best practices for responsible and reproducible research.
The session will provide a step-by-step walk-through of how to preprocess and explore Reddit data using Python. No prior programming experience is required: we will start with core concepts before moving into more technical demonstrations.
The session will conclude with a Q&A segment, allowing time for questions and discussion.
Optional hands-on participation: Attendees who wish to follow the data scraping exercise are encouraged to set up a validated Reddit account. Basic familiarity with Python will be helpful but is not essential.
This workshop is aimed at beginners and anyone interested in analysing social media data, and it will start with non-technical concepts. Accessing data from Reddit is a technical process that usually requires Python, but a synthetic dataset will be provided for attendees to explore. Python experience is not required, although some familiarity with it is recommended.
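For a flavour of what preprocessing text data in Python can look like, here is a minimal sketch that uses only the standard library. The sample posts, the `clean_text` helper, and the small stop-word list are illustrative assumptions and are not taken from the workshop materials or the synthetic dataset.

```python
import re
from collections import Counter

# Hypothetical sample posts, standing in for real or synthetic Reddit data.
posts = [
    "Loving the new update!! Check https://example.com for details",
    "The update is terrible... the app keeps crashing :(",
]

# A deliberately tiny stop-word list (an assumption for illustration only).
STOP_WORDS = {"the", "is", "for", "a", "an", "and", "of", "to"}


def clean_text(text):
    """Lowercase the text, strip URLs and non-letters, and drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return [token for token in text.split() if token not in STOP_WORDS]


# Tokenise every post, then count how often each word appears overall.
tokens = [clean_text(post) for post in posts]
word_counts = Counter(token for doc in tokens for token in doc)
print(word_counts.most_common(3))
```

Cleaned tokens like these are a common starting point for word counts or sentiment scoring, though real analyses typically rely on dedicated NLP libraries rather than hand-written rules.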
Timetable
13:00: Web-scraping and sentiment analysis presentation
13:30: Web-scraping demo (Python)
14:00: Python analytics
Presenter: Joe Allen
This event will be livestreamed on the UK Data Service YouTube channel, but the chat will be disabled. To ask questions and interact, register for and attend the Zoom event.
Recordings of UK Data Service events are made available on our YouTube channel and, together with the slides, on our past events pages soon after the event has taken place.
Event resources
Slides, notebooks, and demo data (GitHub)
Recording (YouTube)