Rivers used as ‘open sewers’

Category: Best data-driven reporting (small and large newsrooms)

Country/area: United Kingdom

Organisation: BBC England Online

Organisation size: Big

Publication date: 22/08/2019

Credit: Daniel Wainwright; Paul Bradshaw

Project description:

An exclusive data-driven investigation revealed the pollution blighting rivers across the UK and led the Environment Agency to say it was reassessing its long-term targets. Our analysis found around eight in ten waterways were not in good health and failed to meet environmental standards, prompting the WWF charity to say they were being treated as “open sewers”. We also showed that, despite this challenge, the government was still predicting that three quarters would be in good health by 2027. Commenting on the findings, the Green Party co-leader Jonathan Bartley said the state of our rivers was “totally unacceptable”.

Impact reached:

The Environment Agency said it would now review the targets “based on what can realistically be achieved”.

The research led to coverage across BBC News platforms, with environmental charities, campaigners and academics highlighting the issue of river pollution. It showed the UK was a long way from meeting targets it had set itself as part of Europe-wide efforts to improve the health of rivers.

WWF media manager Lis Speight said: “Without such investigative journalism, important stories that lie buried in dense data would simply not be told. The team behind this piece were meticulous, really diving down into the numbers and spending time to really understand this issue and to get their facts straight.”

The BBC’s Head of UK News Richard Burgess said: “This was an important piece of public interest reporting on a topic that needed investigation and discussion. Through the data journalism that went into analysing, visualising and presenting the figures our programmes were able to tell compelling stories about the challenges facing England’s waterways that would otherwise have remained hidden in plain sight.”

We invited questions from readers using an online form and answered those questions, inserting responses into the piece.

The questions showed we had got people thinking about the waters they swim in and what they could do to make a difference, whether that was reporting a pollution incident or joining a group or charity that would work to improve the ecological health of waters.

Our dataset was shared with TV and radio journalists across the BBC so that they could identify stretches of river in poor health and use them for filming and location two-ways.

Techniques/technologies used:

We worked with open data to show the ecological health of rivers at the last published check alongside their predicted health status for 2027. The comparison showed that many rivers had a very long way to go to reach that goal and that it was unlikely to be met.
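The comparison between the latest status and the 2027 prediction can be sketched as below. This is a minimal Python illustration, not the team's R analysis; the function name share_good, the set of labels treated as "good or better" and the toy values are all assumptions made for the example and are not taken from the Environment Agency data.

```python
# Labels assumed to count as "good health" (hypothetical, for illustration).
GOOD_OR_BETTER = {"good", "high"}


def share_good(statuses):
    """Return the fraction of water bodies rated good or better."""
    if not statuses:
        raise ValueError("no statuses supplied")
    good = sum(1 for s in statuses if s.strip().lower() in GOOD_OR_BETTER)
    return good / len(statuses)


# Invented toy values, one entry per water body; not real figures.
latest = ["Moderate", "Poor", "Good", "Moderate", "Bad"]
predicted_2027 = ["Good", "Good", "Good", "Moderate", "Good"]

print(f"Good at last check: {share_good(latest):.0%}")
print(f"Predicted good in 2027: {share_good(predicted_2027):.0%}")
```

Putting the two shares side by side is what makes the gap between the current position and the target visible.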

We then broke the figures down by overall waterbody for rivers.

This meant the local data could be used to help regional broadcast colleagues find potential sites for filming.

Each river basin district’s data were stored in almost a dozen different zip files on the Environment Agency website. Paul Bradshaw used R (in RStudio) to merge these together and filter them so that we were focussing specifically on rivers, their status at previous checks and their predicted status in 2027.
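The merge-and-filter step might look something like the sketch below. It is written in Python as an illustration and is not the R code the team used, which is in the project's notebooks. The function name merge_river_rows, the column name water_body_type and the value "River" are assumptions; the real files may use different headers.

```python
import csv
import io
import zipfile


def merge_river_rows(zip_paths, type_column="water_body_type", keep_type="River"):
    """Read every CSV inside each zip file and keep only the river rows."""
    merged = []
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                if not member.lower().endswith(".csv"):
                    continue  # skip any non-CSV files in the archive
                with archive.open(member) as raw:
                    # utf-8-sig tolerates a byte-order mark at the start of a file
                    text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                    for row in csv.DictReader(text):
                        if row.get(type_column) == keep_type:
                            # record where each row came from, for checking later
                            row["source_file"] = f"{zip_path}:{member}"
                            merged.append(row)
    return merged
```

Keeping a source_file value on every row is a design choice for this sketch: it lets a reporter trace any figure back to the archive it came from.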

Daniel Wainwright visualised these using bbplot, an R package based on ggplot2 and developed by the BBC’s Visual Journalism team to create bespoke charts. He also created a map in R showing the spread of healthy and unhealthy rivers, using an unusual shapefile for England that showed the river basin districts that crossed borders into Scotland and Wales.

We also shared our data with colleagues in simple spreadsheets, writing formulae that would take the numbers and turn them into sentences explaining what they meant. These were used to assist broadcast colleagues in finding potential sites for filming.

We invited questions from readers using Hearken, a tool that gives readers a chance to ask anything they want to know about the story.

What was the hardest part of this project?

We took Environment Agency data and broke it down as locally as possible to show the scale of the challenge that England faces in restoring its natural river waters to a state where they are not suffering from the effects of sewage, industry and farming.

There was an enormous amount of data available about the health of water across England but it was not in a format that the general audience would necessarily understand or find accessible.

Making the data understandable and relatable to the audience was the biggest hurdle.

Firstly, we were using geographies that many people wouldn’t recognise – river basin districts. These did not follow more familiar geographies such as local authorities or government regions.

We had to ensure we showed people on a map where their area was and how it compared to others for river water quality.

It also meant writing recognisable place names into the story to explain where we were referring to.

The same had to be done to help our own colleagues find the areas that were relevant to them.

We therefore used the data to highlight examples of rivers that had consistently been found not to be in “good” health. Using formula-generated sentences in an Excel spreadsheet, we wrote simple narrative explanations of where each river was and in how many of the previous four years it had failed to achieve a good rating.
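The idea behind those formula-generated sentences can be sketched as follows. This is a Python stand-in for the spreadsheet formulae, not the formulae themselves; the function name describe_river, the sentence wording, the labels treated as a pass and the example river name are all assumptions made for illustration.

```python
# Labels assumed to count as achieving a good rating (hypothetical).
GOOD_OR_BETTER = {"good", "high"}


def describe_river(name, district, statuses):
    """Turn one river's yearly ratings into a plain-English sentence."""
    total = len(statuses)
    failures = sum(
        1 for status in statuses if status.strip().lower() not in GOOD_OR_BETTER
    )
    if failures == 0:
        return (
            f"{name}, in the {district} river basin district, achieved a good "
            f"rating in each of the previous {total} years."
        )
    return (
        f"{name}, in the {district} river basin district, failed to achieve "
        f"a good rating in {failures} of the previous {total} years."
    )


# Invented example values; "Example Brook" is not a real entry in the data.
print(describe_river("Example Brook", "Humber", ["Moderate", "Poor", "Good", "Moderate"]))
```

Because the sentence is built from the same columns for every row, each regional team gets a readable line about its own rivers without anyone writing them by hand.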


What can others learn from this project?

Others can learn technically from the project’s processes for finding, combining, analysing and communicating data. These are shared in RMarkdown notebooks in the project’s GitHub repository (https://github.com/BBC-Data-Unit/river-quality).

Sharing the river basin shapefiles also makes it easier for other journalists to visualise environmental information.

The story also shows how data can be used in broadcast journalism to identify sites for filming and organisations to approach for interview. More broadly, it demonstrates how a data project can lead to a wide range of stories across multiple platforms and areas, engaging multiple audiences in an issue that affects them all.

The use of simple automation internally (personalised formula-generated sentences for colleagues to understand how the data relates to their areas) is an area of innovation we have not seen elsewhere.

Project links: