Dr Steve McEachern recently attended the International Science Council General Assembly and Global Knowledge Dialogue to discuss how the social sciences sector can work together more closely to overcome some of its biggest hurdles.
During his presentation at the event, held in Muscat, Oman, he covered a number of key areas that need further development in the sector, including:
- Governance
- Preservation and protection of data and metadata
- Protection versus utility
- Access and sharing
- Interoperability
Steve, who also attended in his role as the treasurer of CODATA (the Committee on Data of the International Science Council), added that much of his day-to-day work involves thinking about how his team can improve access to high-quality data and reproducibility in the social sciences.
Opening up access while maintaining data sensitivity
“We are always asking the question: How can you get data to where it needs to be, and do the things you need it to do as a researcher? As a researcher myself, my background is in labour relations, and I’ve more recently been involved with public opinion and public attitudes. So, there are lots of questions around sensitive issues that I constantly need to deal with.
“On the whole, the social sciences sector has addressed the findability problem very well. In the 1960s, for example, we figured out this was a good thing to do because we were interested in comparing problems between different places. We decided back then we could do this by sharing data between locations and countries, and that was the origin of the Data Archives.
“The hard part for us is always related to the sensitivity question. We ask things about people, but they generally don’t want information about themselves to be shared. However, in terms of protection vs utility, you have to be able to do both.”
Choosing the right options
He went on to explain that this conversation is intrinsically linked with the processes of governance, data management, preservation and protection of data, and the efficient use of metadata. He said this is why there is so much focus on data being accessible rather than open.
“For a social scientist, often using open data is not really an optimal solution. You have to be able to recognise the importance of Open Science, traded off against the importance of maintaining the rights of the individuals being studied. It’s a trade we have to constantly manage. However, nobody will be able to tell us the right answer – we have to think about what are the best options we can use here.”
In terms of governance, Steve drew on his experience as the Director of the Australian Data Archive when he dealt with Indigenous Data Governance. He said his work in this area provided a good example of the options available on how to handle this kind of data.
Over the years, Steve has also found that the same themes of data access recur from one country to the next. Many of the legal and ethical frameworks that now exist started in the UK before moving over to Australia and then New Zealand, so the principles have become standardised between institutions in different countries, which makes things easier.

Data comes alive with documentation
“We have layers of governance in place now and accessibility is partly about governing the access models that you have. But how do you make sure you have the right information once you have the data you want to use? Metadata is so crucial to all of this. Data without documentation, particularly in the social sciences, is potentially useless.
“For example, I’ve got data going back to the 1830s, which I can still use for long periods of time. But once the documentation and metadata goes, the data itself is just a set of numbers or words. A table is a table, but if I don’t know what the rows and columns mean, there is not much I can do with it.
“Humans can interpret things very quickly and easily. But for machine interoperability, the fine-grained detail, which is essentially the metadata, needs to be able to operate on something – a suitable platform.”
He added that the standardisation of metadata is very important moving forward, but that the social sciences have already done well in this area.
“A number of standards have emerged in recent years that we can take advantage of. Many interoperability standards now exist for a lot of collections that have evolved from major international organisations. Some have origins in economic statistics over the last 50 years, since World War II.
“But a lot of the governance models focus on the trade-off between protection and sharing. They examine what the risk is to individuals of making the data available, versus what the utility is of having the data accessible for research and analytical purposes.
“This is the trade-off in processes that most governments are having to deal with now. So your model for governance essentially needs to be able to deal with both of these things. You also need appropriate protections and review processes in place to balance out the risks.
“But what do we need to do to achieve this? Essentially, to run those systems, we need a combination of technical, administrative and government services in place. Plus, increasingly, we need to support both humans and machines to work together even better in the future, to access the data in the best way.”
In a forthcoming follow-up article, Steve will share his views on how humans and machines can work together better in the future.
The Global Knowledge Dialogue also had sessions covering an interesting mix of topics, including the transformations of science due to Artificial Intelligence (AI), Open Science and research assessment, the role of science advice, and specific sessions on international collaboration in relation to ocean science, polar science and other fields.
The CODATA session on “Data policy and skills in a rapidly changing world” attracted a large and very engaged group of participants.
The keynote presentation from Mercè Crosas covered the major transformations of science including the growth in interdisciplinary research, the challenges and opportunities of AI and the implications of this for data stewardship and data science.
The presentation generated a lot of discussion and agreement with the core contention that the increasing use of AI requires higher quality data and metadata.

Further presentations from the CODATA team covered the outlines of a policy response (Simon Hodson, CODATA Executive Director), as well as examples from the social sciences (Steve McEachern) and chemistry (Richard Hartshorn, CODATA Vice-President and Professor of Chemistry, University of Canterbury, Christchurch).
Finally, Shaily Gandhi, ISC Fellow, former lead of the CODATA Connect Early Career Group and Senior Postdoc at ITU Linz, discussed which data-related skills researchers need to be ready to manage these new challenges and opportunities.
The slides from all presentations may be viewed here and the recording is available from the ISC YouTube channel.
CODATA attended in force and its contingent comprised:
- Mercè Crosas, CODATA President and Director of Computational Social Science at the Barcelona Supercomputing Center;
- Richard Hartshorn, CODATA Vice-President and Professor of Chemistry, University of Canterbury, Christchurch;
- Daisy Selematsela, CODATA Vice-President and University Librarian, University of the Witwatersrand, Johannesburg;
- Steve McEachern, CODATA Treasurer and Director of the UK Data Service;
- Shaily Gandhi, ISC Fellow and former lead of the CODATA Connect Early Career Group and Senior Postdoc at ITU Linz;
- Christopher Zhu, new co-chair of CODATA Connect (sponsored by ISC), and graduate student at Imperial College London; and
- Simon Hodson, CODATA Executive Director.