Software and tools
The importance of software for social science
Software is critical for handling the large or complex datasets available to researchers, and for manipulating and analysing them. It is also important for producing graphics of data, mapping and data mining.
The suite of training resources below relates to software tools used by social science data researchers.
Key software tools
R
R is a free, user-developed environment for advanced statistical computing. It is increasingly used in the academic world for teaching purposes.
Resources
- Guide to using R to analyse key UK surveys
- Mapping 2011 Census Microdata using R (PDF)
- Data Skills Module: Exploring crime surveys with R (Beta version).
SPSS
A software package for Windows, SPSS can be used to produce graphics of data as well as to carry out other data analysis.
Resources
Stata
Stata is a statistical software package for data analysis. You can use Stata by pointing and clicking, or by using the command syntax.
The software supports complex analysis and, because it is programmable, developers and users continue to add new features.
Resources
- What is Stata 12.1? (PDF)
QGIS
QGIS is an open source mapping package which can be downloaded for free – it has a good range of functionality and is straightforward to use.
Resources
Python
Python is a general-purpose programming language that data scientists use to collect, clean and analyse data. It is often chosen because it is flexible and suitable for handling large datasets.
Resources
- These training materials include webinar recordings, slides, and sample Python code for core social science research tasks.
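As a minimal sketch of the collect, clean and analyse workflow described above, the following uses only the Python standard library. The survey records, field names and values are invented for illustration and do not come from any UK Data Service dataset or from the training materials.

```python
import csv
import io
from statistics import mean

# Hypothetical survey extract; every value here is invented for illustration.
RAW = """respondent_id,age,region
1,34,North
2,,South
3,51,north
4,abc,South
5,28,South
"""


def clean(rows):
    """Keep rows with a valid integer age and tidy the region labels."""
    cleaned = []
    for row in rows:
        age = row["age"].strip()
        if not age.isdigit():
            continue  # drop missing or non-numeric ages
        cleaned.append({"age": int(age), "region": row["region"].strip().title()})
    return cleaned


def mean_age_by_region(rows):
    """Group the cleaned rows by region and average the ages."""
    groups = {}
    for row in rows:
        groups.setdefault(row["region"], []).append(row["age"])
    return {region: mean(ages) for region, ages in groups.items()}


records = clean(csv.DictReader(io.StringIO(RAW)))
summary = mean_age_by_region(records)
print(summary)
```

Dropping invalid rows is only one possible cleaning choice; depending on the analysis, it may be preferable to keep them and flag the values as missing.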
Nesstar
Nesstar enables you to search, browse, visualise, analyse and download a selected range of different kinds of social and economic data, from survey data to multidimensional tables.
Resources
- Access Nesstar
- Video tutorial: Using Nesstar to download a subset of data
- Video tutorial: How to weight survey data in Nesstar
- GESIS Online Study Catalogue
- The European Social Survey (ESS)
- More information on researching survey data.
InFuse
InFuse is an open standards structure developed by the UK Data Service to provide easy access to aggregate data from the UK 2011 and 2001 censuses.
Resources
Casweb
Casweb is an interface that provides access to census data from 1971 to 2001.
Resources
GeoConvert
GeoConvert is a tool that makes it easy to match UK postcodes and census geographies, and to convert data between them.
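The idea of converting data between geographies can be sketched as a lookup-table aggregation. This is not GeoConvert's own code or interface; the postcodes, area codes and counts below are placeholders invented for illustration, and the sketch assumes each postcode falls entirely within one target area.

```python
from collections import defaultdict

# Hypothetical lookup from postcode to a larger census area (placeholder codes).
POSTCODE_TO_AREA = {
    "PC1": "AREA_A",
    "PC2": "AREA_A",
    "PC3": "AREA_B",
}

# Hypothetical counts recorded at postcode level.
COUNTS_BY_POSTCODE = {"PC1": 10, "PC2": 5, "PC3": 7, "PC9": 2}


def convert(counts, lookup):
    """Sum postcode-level counts into areas; report postcodes with no match."""
    totals = defaultdict(int)
    unmatched = []
    for postcode, count in counts.items():
        area = lookup.get(postcode)
        if area is None:
            unmatched.append(postcode)  # no entry in the lookup table
            continue
        totals[area] += count
    return dict(totals), unmatched


area_totals, unmatched = convert(COUNTS_BY_POSTCODE, POSTCODE_TO_AREA)
print(area_totals, unmatched)
```

Reporting unmatched postcodes separately, rather than silently dropping them, makes it easier to check how much data was lost in the conversion.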
Resources
Boundary data tools
Through its Census Support service, the UK Data Service provides a selection of tools to enable you to easily analyse census boundary data.
Resources
QualiBank
QualiBank is the UK Data Service’s search and browse interface for qualitative data objects. It lets you search the content of text files such as interviews, essays, open-ended questions and reports.
Resources
- Access QualiBank
- Video guide: Citing QualiBank data.
Other computational social science software
Installing Spark on a Windows PC
Apache Spark is an open source parallel processing framework that enables users to run large-scale data analytics applications across clustered computers.
View our guide Installing Spark on a Windows PC (PDF).
Obtaining and downloading the HDP Sandbox
Hortonworks is a commercial company which specialises in data platforms based on open source software for big data, in particular Hadoop. HDP stands for the Hortonworks Data Platform, an implementation of a Hadoop cluster together with a range of associated big data products that run in the Hadoop environment.
View our guide Obtaining and downloading the HDP Sandbox (PDF).
Loading data into HDFS
This short guide provides detailed instructions on how to load a dataset from a PC into a Hadoop system.
View our guide Loading data into HDFS (PDF).
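As a rough illustration of the kind of step involved, the sketch below builds the standard `hdfs dfs -mkdir -p` and `hdfs dfs -put` commands from Python. It assumes the `hdfs` command-line client is installed and on the PATH; the file and directory names are hypothetical, and the guide itself may use a different method, such as a web interface.

```python
import shlex
import subprocess


def build_upload_commands(local_path, hdfs_dir):
    """Return the two hdfs CLI commands needed to upload one local file."""
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],  # create target directory if absent
        ["hdfs", "dfs", "-put", local_path, hdfs_dir],  # copy the local file into HDFS
    ]


def upload(local_path, hdfs_dir, dry_run=True):
    """Print the commands; run them only when dry_run is False."""
    for cmd in build_upload_commands(local_path, hdfs_dir):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Hypothetical paths, shown as a dry run so nothing is executed.
upload("survey.csv", "/user/researcher/data")
```

Calling `upload(..., dry_run=False)` on a machine with a configured Hadoop client would execute the commands and raise an error if either one fails.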
HiveQL example queries
This workbook contains some practical exercises for researchers and data analysts who want to run simple queries using Apache Hive.
View our guide HiveQL example queries (PDF).
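To give a flavour of the simple queries involved, the sketch below runs a basic `SELECT ... GROUP BY` aggregation. The statement uses only syntax common to HiveQL and standard SQL, but it is executed here with Python's built-in `sqlite3` module as a stand-in, since running it against Hive needs a working cluster. The table, columns and rows are invented and are not taken from the workbook.

```python
import sqlite3

# A simple aggregate query; this syntax is also accepted by Hive.
QUERY = """
SELECT region, COUNT(*) AS n, AVG(income) AS mean_income
FROM households
GROUP BY region
ORDER BY region
"""

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a Hive table
conn.execute("CREATE TABLE households (region TEXT, income REAL)")
conn.executemany(
    "INSERT INTO households VALUES (?, ?)",
    [("North", 20000.0), ("North", 30000.0), ("South", 25000.0)],
)

results = conn.execute(QUERY).fetchall()
for row in results:
    print(row)
conn.close()
```

On a real Hive installation the same query text would be submitted through a Hive client rather than `sqlite3`, and the table would usually be defined over files already loaded into HDFS.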