Throughout the year, the Guardian has been tracking the spread of the disease both in the UK and around the world. This coverage has concentrated on two tracking pages. The UK page tracks the number of daily cases, deaths, hospitalisations and – now – vaccine doses, as well as the geographical spread of case rates and restrictions. The global page tracks the latest case and death rates per country, analysing trends both over time and geographically. These are designed to present the latest data in the clearest way possible so that our readers can understand the current state of the
Our trackers were some of the team’s most-visited pieces of data journalism in 2020, with millions of both regular readers and new audiences engaging with the content. We regularly hear from readers who have used the pages, either simply complimenting the clarity of the information or making suggestions for future improvements. As this is a live event with both the state of the data and the focus of the story constantly changing, we regularly integrate these suggestions into our thinking.
Because we track the number of people who engage with the interactives on both pages, we can see that thousands of members of our community use these pages as a resource to get the latest information on their area – whether that’s the latest figures on cases in their local authority, or details of the latest lockdown restrictions that apply to them. In a rapidly changing and unprecedented pandemic, such localised, free and easy-to-use information has proved essential for people to understand how the situation affects their lives.
To make the self-updating graphics for the UK tracker, we used an AWS Lambda function that fetches the coronavirus numbers from the government API on a daily schedule. The Lambda then cuts the government data down to exactly what we need and writes it as a file to S3. On the frontend, when the user loads the page, the S3 file is fetched and run through D3 templates to create the maps and charts.
On the backend of the global tracker, we previously used AWS EC2 and now use AWS Lambda to pull the data. D3 was used to build the maps and charts of the global tracker. We also developed the pages in AMP-friendly formats and integrated tracking for the interactives with Ophan, the Guardian’s in-house analytics platform.
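As a rough illustration of the "cut down and write" step described above, here is a minimal Python sketch. It is not the Guardian's actual code: the field names in KEEP_FIELDS, the shape of the API records, and the function names trim_records and run_daily_job are all assumptions made for the example. The network fetch and the S3 write are passed in as plain callables so that the trimming logic can be run and tested without AWS credentials.

```python
import json

# Hypothetical field names: the real government API schema is not
# described in the text, so these are placeholders for illustration.
KEEP_FIELDS = ("date", "areaCode", "newCases", "newDeaths")


def trim_records(records, keep=KEEP_FIELDS):
    """Keep only the fields the charts need and drop rows without a date."""
    trimmed = []
    for row in records:
        if not row.get("date"):
            continue  # a row with no date cannot be plotted on a time axis
        trimmed.append({key: row.get(key) for key in keep})
    # ISO dates sort correctly as strings; oldest first suits line charts.
    return sorted(trimmed, key=lambda r: r["date"])


def run_daily_job(fetch, upload):
    """Fetch raw records, trim them, and hand compact JSON to the uploader.

    `fetch` returns a list of dicts; `upload` receives the JSON string.
    In a real Lambda these would wrap an HTTP request and an S3 write.
    """
    body = json.dumps(trim_records(fetch()), separators=(",", ":"))
    upload(body)
    return body
```

In a deployed function, upload might wrap a call such as boto3's S3 client put_object with a bucket and key of your choosing, and the frontend would then fetch that one small file rather than the full API response.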
The individual components have been designed so that, aside from the tracker pages themselves, they can be embedded in any Guardian journalism, bringing clear up-to-date data visualisation to our readers even if they’re not on the tracker pages.
What was the hardest part of this project?
All the numbers related to coronavirus are less simple than they initially appear. Case rates depend upon test rates. And not only do death rates depend upon case rates, but they are also subject to different definitions of a Covid-related death. One of the biggest challenges with this piece was selecting correct and suitable metrics on a quick deadline and then communicating them clearly and effectively to readers who are anxious to understand the latest coronavirus news.
Also, the coronavirus pandemic changes on a daily basis – with the stories, metrics and emphasis shifting continually. This means that the team behind our trackers has to be highly adaptive and responsive. Problems have been triggered by a variety of things: a country changing its counting methods for cases or deaths; a new variant dramatically changing the picture; or a new metric, such as vaccine doses, being introduced by a new source, for which we need to quickly create a data pipeline and visualisation. As a result, the pages require a lot of maintenance. The most recent example came as countries started approving vaccine rollouts: we had to find new sources for vaccination data – with Our World in Data coming to the forefront of up-to-date sources – and add three new components to the two trackers to show how this new part of the story is progressing.
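To make the point about case rates and test rates concrete, the following sketch compares two metrics for the same area. The function names and the numbers are invented for illustration and are not real figures or the Guardian's methodology.

```python
def rate_per_100k(count, population):
    """Scale a raw count to a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


def positivity(cases, tests):
    """Share of tests that came back positive, or None if no tests ran."""
    if tests <= 0:
        return None
    return cases / tests


# Two made-up areas with the same population but different testing levels.
areas = {
    "Area A": {"cases": 200, "tests": 1_000, "population": 100_000},
    "Area B": {"cases": 400, "tests": 8_000, "population": 100_000},
}

for name, a in areas.items():
    print(
        name,
        "cases per 100k:", rate_per_100k(a["cases"], a["population"]),
        "positivity:", positivity(a["cases"], a["tests"]),
    )
```

In this invented example Area B reports twice the case rate of Area A, but it also ran eight times as many tests and has a much lower share of positive results, which is why a case rate on its own can mislead.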
What can others learn from this project?
Simple, utility-driven data journalism is a useful resource for readers, and these tracker pages are worth the time and resources they take in both initial production and continual maintenance. The team has collectively put weeks of work into these tracker pages, and the rewards have been worth it. We can see from the number of people reading the pages, as well as the amount of time they spend engaging with the content, that our community – and those beyond – value the work we put into these pages.
This once again proves the power of simple visuals such as bar charts, line charts and maps in telling readers exactly what they need to know in the biggest ongoing story we’ve ever covered – and that makes them one of the most rewarding, valuable and useful pieces of data-driven content that we can produce.
Coronavirus is an exceptionally long-running story. But as data journalists tackle more systemic and long-duration topics, it’s critical to build the tech behind these pipelines and visualisations like an app that has to work for a year, not like a one-off with a shelf-life of several weeks.
That means investing the time up front to bring in tech best practices, making the code easy to maintain and change. And just as we strive to be visually accessible, we also need to be technically accessible: the graphics should work on old mobiles as well as on new computers.
Making that technical investment allows data journalists to add value beyond the 24-hour news cycle.