The public and commercial sectors are continuously generating large amounts of data that can provide a powerful discovery tool for researchers, enabling them to gain valuable insight.
However, the arrival of ‘big data’ has changed social scientists’ expectations and brought technological and infrastructure challenges for data services and repositories, in terms of the volume and complexity of the data and the rights attached to it. The UK Data Service now needs to ingest sizeable streams of real-time data and enable exploration and linkage of a variety of data assets. The UK Data Service has been operating since 1967, when ‘data storage technology’ meant punch cards. We’ve adapted to magnetic tape, floppy disks and modern online databases – and users have moved from examining printed statistical tables to downloading survey data files, and to exploring and visualising data over the web.
These new challenges mean that researchers need support to make the most of these data for knowledge exchange and impact, through both access to data and capacity building. Moreover, repositories need to review their existing architecture and infrastructure to be ready for the changes in the data landscape.
As part of Phase 2 of the Economic and Social Research Council's (ESRC) Big Data Network (BDN2), the UK Data Service established a dedicated Big Data Network Support team to advise the other Big Data Network research centres (the Urban Big Data Centre, the ESRC Business and Local Government Data Research Centre and the Consumer Data Research Centre) on data licensing, governance and trusted access. We ran many training events and webinars, and developed user guides and capacity building case studies.
Importantly, the Big Data Network Support (BDNS) award allowed the UK Data Service to research and develop an open source solution for hosting big data. We focussed first on the field of household energy research, and developed blueprints and a test instance of an Apache Hadoop system to manage smart meter data, known as Data Service as a Platform (DSaaP). This is being used as the technology for a ground-breaking EPSRC-funded project led by UCL, the Smart Energy Research Lab (SERL).
Principal Investigator: Nathan Cunningham, UK Data Service
Team: Darren Bell, Deirdre Lungley, Chris Park, Sarah King-Hele, Peter Smythe, Margerita Ceraldo, Herve L'hours, Louise Corti
Funder: Economic and Social Research Council (ESRC)
Dates: January 2014 - September 2017
Alongside our training and capacity building programme, our big data work produced some useful guides and case studies: