How Covid-19 took hold in South Africa
Country/area: South Africa
Organisation: Bhekisisa.org, News24.com
Organisation size: Small
Publication date: 28 Sep 2020
Credit: Laura Grant, Alastair Otter, Gemma Gatticchi, Mia Malan
Media Hack Collective and the Bhekisisa Centre for Health Journalism partnered to create an interactive story about how the SARS-CoV-2 virus got to South Africa and how it travelled throughout the country, using maps, a narrative and visualisations. To do this we had to build our own dataset from the daily numbers released by the government in PDFs. We started when the first case was reported in March 2020 and visualised this data in a dashboard. The scrolly story was a way to use 7 months of data in a narrative format. We also made the data publicly available.
The scrolly story, How Covid-19 took hold in South Africa, was published by Bhekisisa and by News24, South Africa’s largest news website. It appeared shortly after the country emerged from its first peak in cases and was the culmination of 7 months of work collecting South African Covid-19 data and visualising it, in a format we tried to make accessible to the general public, on the Coronavirus in South Africa dashboard. The dashboard had over a million unique visitors by the end of 2020, many of whom returned regularly, often daily. We know this not only from Google Analytics data but also because we received hundreds of emails from visitors with comments and suggestions, and many people shared their own chart ideas. Screen grabs from the dashboard are regularly shared on social media. We started a newsletter, which quickly grew to more than 9,000 subscribers. Through the newsletter we gave people updates on Covid-19 with additional data analysis and charts, as well as links to stories written by Bhekisisa. Thanks to these newsletter subscribers, when we launched a crowdsourcing campaign to help us cover some of the costs of collecting the data, we raised our target funds in 30 hours – not the 30 days we had allocated.
Bhekisisa commissioned the scrolly story to turn the data into a data-driven journalism feature that shows people, through maps and visualisations, how the virus had spread through different parts of the country at different times and how superspreader events had played a role. The data that we made publicly available was used by other media companies, such as Al Jazeera and the BBC, as well as by some academics and other people interested in doing their own visualisations.
Google Sheets – data collection. Data is collected daily in a number of different spreadsheets because this can easily be done by helpers without specialised skills. The bulk of the national and provincial data is taken from PDFs published daily by the South African Department of Health. Other data, such as district-level data, is gathered from infographics that the provincial health authorities publish, usually on social media. This data is sometimes erratic and often contains typos, but there is no other publicly accessible source of district-level data.
R and RStudio – used to clean and analyse the data.
SQL – data is pulled from the spreadsheets into a SQL database.
D3 – visualisations are done with D3, on the dashboard as well as in the scrolly story.
HTML, JS and CSS – the scrolly story is custom-built.
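To give a sense of what cleaning hand-collected district-level data can involve, here is a minimal sketch. It is not our actual pipeline, which used R and RStudio; it is written in Python purely for illustration. The district list, the function names and the 0.8 similarity cutoff are all assumptions made for this example.

```python
import difflib

# Hypothetical list of canonical district names (an assumption for this sketch).
KNOWN_DISTRICTS = ["City of Cape Town", "Cape Winelands", "Garden Route", "West Coast"]


def normalise_district(raw, known=KNOWN_DISTRICTS, cutoff=0.8):
    """Map a possibly misspelt district name to the closest known name, or None."""
    cleaned = " ".join(raw.split())  # collapse stray whitespace
    lookup = {name.lower(): name for name in known}
    matches = difflib.get_close_matches(
        cleaned.lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


def parse_count(raw):
    """Turn a hand-typed count such as '1 234' or '1,234' into an int, or None."""
    digits = raw.replace(",", "").replace(" ", "").replace("\u00a0", "")
    return int(digits) if digits.isdigit() else None


def daily_new_cases(cumulative):
    """Convert cumulative totals into day-on-day increases.

    A drop in a cumulative total is treated as a probable typo and
    returned as None so that a person can check it against the source.
    """
    increases = []
    for previous, current in zip(cumulative, cumulative[1:]):
        diff = current - previous
        increases.append(diff if diff >= 0 else None)
    return increases
```

Returning None rather than guessing a correction keeps a person in the loop: a name that matches nothing, or a cumulative total that goes down, is flagged for checking against the original infographic instead of being silently changed.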
What was the hardest part of this project?
The hardest part was manually collecting the 7 months’ worth of data on which the project was based.
What can others learn from this project?
Taking the trouble, over a long period of time, to collect data that isn’t produced in a machine-readable format is worth the effort, because in the end you have a unique dataset that can be turned into a variety of outputs, such as a dashboard and a data-driven feature. Create visualisations that make data easy to understand: this helps people follow how the Covid-19 pandemic is progressing, which in turn can help reduce fear and combat misinformation.