The Digital Culture and Communication section of ECREA
The rise of big and open data has led to an increase in the use and availability of publicly accessible databases and FOI requests, and to a rapid expansion of tools and software for analysis and visualisation. Emerging with the promises of transparency and governmental/corporate accountability, open data encouraged many groups who conduct investigative research, such as journalists, third sector researchers and advocacy groups, to seek new data skills. However, despite the recognition of the need for these skills, as data becomes ubiquitous and manual analysis becomes more complex, those without the literacy or the capacity (i.e., time, money, resources, confidence) become increasingly excluded from the processes of engaging critically with such data. In this sense, we loosely define critical data practices as work that explicitly confronts the risks and ethics associated with processes of data gathering, cleaning, analysis, visualisation and storytelling. Such critical data practices also often use data to tell narratives that contest or challenge the status quo. Research on critical data practices remains, at present, limited primarily to studies of major (and often for-profit) news organisations (Gray et al. 2012) and journalism education (Berret and Phillips 2016). This leaves the challenges, barriers and opportunities experienced by smaller, non-profit organisations, from community journalism to third sector advocacy groups, overlooked and under-examined.
The scarce research that does exist on the barriers to skill acquisition shows that limited time and resources are the predominant impediments to the advancement of data skills, but it does not elaborate on these factors. A range of provisions for skill development currently exists to try to tackle these challenges, including online resources, introductory classes and software training. However, both our own experience of training and the research evidence show that these provisions alone are not enough to tackle the issues in skill acquisition (Stoneman 2017). Even where people are able to attend training classes online and offline, skill development and retention are difficult without regular engagement. As with learning a language, skills get lost without the opportunity to practise them in everyday use. Trainers find that the same people attend their introductory classes each year (Stoneman 2017). In addition to the skill-based barriers, there are continued barriers to access inherent in the inaccessible methods used for publishing data, but these are not the focus of this particular study, which is about skill acquisition for working with data, not the quality of the data itself.
Nevertheless, the inaccessibility of much data, along with the barriers to using it, is creating a ‘data divide’ that leaves power in the hands of a select few (Gurstein 2011). If the promised benefits of the rise of big and open data are to be realised in truly democratic ways, the ability to find meaningful stories in datasets must be a skill made available to wider populations, particularly those marginalised from mainstream narratives, who are often those most directly affected. With this in mind, we have designed the Data Skills Survey to achieve a more comprehensive understanding of the barriers and challenges to the uptake of data skills for investigative research (data gathering, analysis, visualisation and storytelling) in order to improve data training and skills delivery.
The Data Skills Survey investigates current data practices and barriers to acquiring more data skills. First, basic demographic information relating to role, age, frequency of data use, and size of team and organisation was collected to look for variability across individuals and workplace cultures. For the questions regarding different data tools, response options included ‘know of but haven’t used’, ‘have never heard of’ and ‘not relevant’ in order to assess gaps between interest in learning, access to learning, and engagement with learning specific data skills. We also included open-ended questions to solicit additional information on tools people are using or want to be able to use. Ranking questions were designed to capture what is meant by time and resource barriers, in an effort to gauge priorities and the degree of effect of particular obstacles. The survey was distributed following a non-probability sampling approach. Initial distribution began on the network forums of the CIJ and Cardiff University. Using snowball sampling, journalist and NGO participants from our partner networks were then asked, in a debrief note at the end of the survey, to circulate the survey further in their own networks. As this is a niche target population (community journalists and third sector organisations that do investigative research) whose total size cannot be known, non-probability sampling was chosen as the most appropriate method.
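As a rough illustration of how the per-tool response options described above could be tallied, the following Python sketch computes the share of respondents choosing each option for a single tool. It is not the analysis procedure used in the study. The function name `response_shares`, the option labels "use regularly" and "have used", and the example answers are all assumptions made up for illustration; only the three quoted options come from the survey description.

```python
from collections import Counter

# Response options for one tool. The last three are quoted from the survey
# description; the first two labels are assumed for illustration only.
OPTIONS = [
    "use regularly",             # assumed label
    "have used",                 # assumed label
    "know of but haven't used",
    "have never heard of",
    "not relevant",
]


def response_shares(responses):
    """Return the percentage of respondents choosing each option for one tool."""
    counts = Counter(responses)
    total = len(responses)
    if total == 0:
        # No answers recorded: report zero for every option.
        return {option: 0.0 for option in OPTIONS}
    return {option: 100.0 * counts.get(option, 0) / total for option in OPTIONS}


# Made-up answers for one hypothetical tool. This is NOT survey data.
example = [
    "use regularly",
    "know of but haven't used",
    "have never heard of",
    "have never heard of",
    "not relevant",
]

shares = response_shares(example)
for option, share in shares.items():
    print(f"{option}: {share:.0f}%")
```

Comparing the share for an option such as "have never heard of" across tools is one simple way to separate lack of awareness from lack of use, which is the distinction the response options were designed to capture.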
Findings and Limitations
Our preliminary findings from the 93 survey respondents show that the number of people using data regularly is far higher than the number who regularly use, or are proficient in, any of the data tools. This is evidence of engagement with data, and of the desire for engagement, without the requisite skills. Regular use of any given data tool rarely exceeds 15%, and in most cases does not even reach it. Not knowing what is out there emerges as one of the biggest problems in obtaining data skills. Only software with a business orientation and high brand recognition, such as Excel, Google Sheets and Adobe Illustrator, emerges as well known.
In line with the existing research, not having enough time emerges as one of the biggest barriers. However, we argue that the time barrier is a symptom of various physical and psychological barriers that result in any effort to gain data skills being dismissed as time-consuming. A lack of support infrastructures and of prioritisation in workplaces are examples of this. To help analyse these findings around data skill acquisition, we make use of the COM-B system for behavioural change intervention (Michie et al., 2011). The COM-B model suggests that behaviour can be changed by addressing the barriers to capability, opportunity and motivation. In our preliminary findings, motivation does not emerge as a problem, but various barriers to opportunity and capability are present. Some of these are shown in the table below.
To conclude, it is clear that opportunity and capability barriers to data skills acquisition exist and that more research is needed to plan sustainable, reiterative interventions that can increase the use of data in investigative research. Going forward, it is important to note that a current limitation of this research is that 39% of the respondents were academics, which skewed the results towards academia. We suspect this is due to the limited distribution channels and the heavily academic audience of listservs, which needs to be addressed in a larger project. In addition, from survey results alone it is difficult to tell whether something is rated as a low barrier because of expertise or because of a lack of engagement. We will address this by conducting qualitative follow-ups in future.