It is meant to be the key to great investigative journalism but what is the big deal about data journalism and how is it done? Melina Meletakos finds out.
Data journalism is the much-touted new media trend that represents a convergence of the research, design and programming fields. But how many South African journalists are using this new toolkit to tell engaging stories?
“Disappointingly, not many,” says Adi Eyal, the director of Code for South Africa. Part of this non-profit organisation’s focus is the promotion of data-driven journalism through newsroom training. These skills involve digging deep into data sets, filtering relevant information, finding ways of visualising it and turning it into a narrative.
“The concept of data journalism in South Africa is very naive at the moment,” says Eyal. “Journalists tend to equate it with visualisations and infographics but it’s also about crunching numbers.”
Through his work at media houses, Eyal has seen some of the challenges in getting a local data journalism culture off the ground. In one newsroom, he says, an editor was eager for a journalist to learn data-mining skills but rarely gave the person the time needed to focus on data projects. Code for South Africa also worked closely with another publication, assisting them with free data-journalism tools. But when it came to integrating these tools into their workflow system, they were unable to do so.
Eyal says the number of journalists who attend Hacks/Hackers events, which are meetings that bring together techies and journalists, has also been disappointing.
“It’s as if they realise that it’s an interesting thing but they aren’t prioritising it,” he says.
Athandiwe Saba, a journalist at City Press, agrees, saying what we are seeing a lot of is “data porn”, where journalists are putting out fancy graphs but aren’t mining data to tell stories. Besides reporting on general stories for City Press, Saba has also groomed herself as a data journalism specialist by taking a number of courses to sharpen her skills in the field. These include a short data journalism course at the University of the Witwatersrand with The New York Times’s Ron Nixon, an online certificate course with Canvas Network and a web application certificate course with City Varsity. She was chosen by Wits to attend a data-journalism course in Baltimore in the US in 2014.
“With the knowledge I have garnered in the past three years I can trawl for numbers from different institutions and make sense of them, adding that edge to every story,” she says. “But data journalism is not only about the inclusion of a few numbers. It’s also about using tools like Google Refine to understand the data you are working with. With these tools I have been able to analyse census data, election data, municipal data and pass-rate figures for matrics.”
Saba says she generally works on mainstream stories on a day-to-day basis but she recently decided to make a point of fighting for the space and time to do big data projects.
“The more I push for it, and the more I explain why I need more time for a particular story, the easier it will become,” she says.
The consequence of not investing in data journalism, says Eyal, is missing out on the competitive advantage it offers media houses that do.
“If they don’t do it, then traffic will go to one of their competitors. It also strengthens what the investigative journalism publications can do because it allows them to dig deeper and be one step ahead,” he explains.
Harry Dugmore, director of the Discovery Centre for Health Journalism at Rhodes University, says journalists and editors have to find a balance because “the old ways of doing journalism are simply not cutting it any more”. Dugmore, who is also in charge of digital media and journalism at the university, says consumers have too many other things grabbing their attention.
“The question newsrooms need to ask themselves is how do they grab some of that attention back in an era of multimedia? People want more and to stick to journalism’s traditional strengths only is not a sensible model,” he explains.
Another reason local data journalism is still in its infancy is that publicly funded data is closely guarded and difficult to access. Eyal says government officials often inadvertently refer to it as “our data” because they feel that it belongs to them.
“We are, however, seeing a loosening of data. I also think we complain too much about not enough data being accessible. There is a lot out there. I think people are actually just lazy and it’s far easier to complain,” he says.
Dugmore says government departments need to be encouraged to make data available in machine-readable formats, instead of PDFs, so that data-mining software can be used on it. Media houses, however, can also generate their own state systems, he says. “This can be done by asking readers to send an SMS answering a specific question.”
Despite a lot of inactivity, there are a few inspired data projects being produced by local media hacks. 24.com, which is part of Media24, recently launched a loadshedding website and app called GridWatch, which allows users to search for the loadshedding timetable for any area in South Africa. The site launched on 6 February and, less than a week later, had already garnered 2.35 million page views and 381 000 users, 24.com’s head of product development, Cathryn Reece, told Grubstreet.
Speaking to The Media, Reece said the project took extensive research and planning to determine whether a national aggregator was possible as schedule formats and sources are varied and unreliable.
“Once we started talking to Eskom about their database – which they have very kindly and efficiently provided us, and will be providing us each week – we figured we at least had one structured data format, so we could press on with the others. We had our first conversation with them mid-December and have been working on the database and site-build since early January,” she says.
Reece and her team were also behind News24’s award-winning elections app in 2014, which provided users with breaking news, multimedia, user-generated content, voter information and a live-results map. The data for this project was easier to source, says Reece.
“We only had to deal with one supplier, the IEC, who very graciously agreed to send us their previous results sets and to provide the live results database to us in real time over the course of the results phase. We built our own database, result roll-up logic, maps and web services to build our products on top of, so we were able to minimise dependencies and risk at most turns.”
Does Reece think we’ll see more data-driven media products in the future?
“I hope so. We have a lot more data journalism minds in the business now than ever before, so I’m very excited to see what we can come up with next,” she says. “I think it takes a lot of careful study to understand when a data source is a story, or an infographic or a widget or a website, and so far we’ve only really done the big things, not the little ones. So, I’m excited to see how we can vary our solutions to what is still essentially a requirement to tell stories – only with data instead of words – to make it more exciting and engaging for our users.”