In May, The Times launched its “Clean Air for All” campaign, aiming to shine a light on Britain’s air pollution crisis and demand that the Government update the Clean Air Act.
We built the first-ever school league table for air pollution by combining 20m by 20m point data from the Greater London Authority with shapefiles for every school in the capital. We used this to calculate a pollution score for every school.
The results revealed that every school in London breached the WHO annual mean guideline of 10µg/m³. Our tool puts pollution data in context for schoolchildren and their parents.
Our analysis revealed that 6,500 schools educating 2.6 million children are in areas where fine particles in the air exceed the World Health Organisation’s recommended limit. Our findings led the campaign and featured on the front page of the paper on its first day.
In response to our story, the prime minister, Boris Johnson, pledged a “successor to the 1956 Clean Air Act” inspired by The Times campaign. While it is still unclear what the new environment bill will look like, this newspaper continues to campaign for measures to improve air quality in 2020. Our findings also prompted reaction from environment campaigner Chris Packham, UNICEF and London mayor Sadiq Khan, who joined The Times in calling for an updated environment bill.
More than half of the 3,200 schools in our database were searched for at least once in the interactive published with the story, and the tool was received positively by readers.
The majority of our data analysis was performed in R while the tool itself was built using React. We downloaded open data on air pollution from the Greater London Authority (GLA) for 2013, 2020, 2025 and 2030. The GLA published an updated release while we were working on the story and we were able to add data for 2016 too.
We asked the Ordnance Survey for mapping data for schools: these were shapefiles of the land owned by every school in the country, which we used to represent each school’s grounds. We buffered these by 10m to include the roads around schools.
To work out the average, maximum and minimum air pollution level at each school, we used point-in-polygon analysis (“over” from the R package sp) to assign each pollution data point to the school whose buffered grounds contained it, and then summarised the points grouped by school.
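The idea behind this step can be sketched in a few lines. The original analysis was done in R with sp; the version below is a Python stand-in that uses a simple ray-casting test instead of “over”. The school names, coordinates and pollution values are made up for illustration, and real school boundaries would be far more detailed than these squares.

```python
from statistics import mean


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def summarise_schools(points, schools):
    """Group pollution points by the school polygon containing them.

    points: list of (x, y, value); schools: dict of name -> polygon.
    Returns mean, max, min and point count for each school with data.
    """
    summary = {}
    for name, polygon in schools.items():
        values = [v for x, y, v in points if point_in_polygon(x, y, polygon)]
        if values:
            summary[name] = {
                "mean": mean(values),
                "max": max(values),
                "min": min(values),
                "n": len(values),
            }
    return summary


# Hypothetical data: two square "school grounds" and a handful of grid points.
schools = {
    "School A": [(0, 0), (40, 0), (40, 40), (0, 40)],
    "School B": [(100, 100), (140, 100), (140, 140), (100, 140)],
}
points = [
    (10, 10, 12.0), (30, 10, 14.0), (10, 30, 13.0), (30, 30, 15.0),
    (110, 110, 11.0), (130, 130, 11.5),
    (200, 200, 9.0),  # falls outside every school and is ignored
]

summary = summarise_schools(points, schools)
print(summary["School A"])
```

With a 20m grid, a small school may contain only a handful of points, which is one reason buffering the grounds before this step is helpful.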
Our school shapefiles did not include any identifying information about the school other than the UKPRN (a unique property number) and the name. We matched UKPRNs to a scrape of Edubase, a database of school details. Where we were unable to find a match, we matched on school names, and finally wrote a script to help us match the last few schools manually. This allowed us to calculate the number of pupils attending schools over the WHO limit.
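A matching cascade of this kind might look like the sketch below. It is written in Python rather than the R we used, and the field names (“id”, “name”, “pupils”, “mean”), the name-cleaning rule and all of the records are hypothetical. It tries an exact identifier match first, falls back to a normalised name match, and sets aside anything left over for manual review.

```python
WHO_LIMIT = 10.0  # µg/m³, the limit quoted in this article


def normalise(name):
    """Lower-case a name, drop punctuation and collapse whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def match_schools(shapes, details):
    """Match shape records to school details by id, then by name.

    Returns (matched, unmatched): matched is a list of (shape, detail)
    pairs, unmatched is a list of shapes needing manual review.
    """
    by_id = {d["id"]: d for d in details}
    by_name = {normalise(d["name"]): d for d in details}
    matched, unmatched = [], []
    for shape in shapes:
        record = by_id.get(shape["id"])
        if record is None:
            # Fall back to the cleaned-up school name.
            record = by_name.get(normalise(shape["name"]))
        if record is None:
            unmatched.append(shape)
        else:
            matched.append((shape, record))
    return matched, unmatched


# Hypothetical records: "mean" is a school's average pollution level.
shapes = [
    {"id": "1001", "name": "Oak Lane Primary School", "mean": 12.4},
    {"id": None, "name": "St. Mary's School", "mean": 9.1},
    {"id": "9999", "name": "Riverside Academy", "mean": 11.0},
    {"id": None, "name": "Unknown Annex", "mean": 13.0},
]
details = [
    {"id": "1001", "name": "Oak Lane Primary", "pupils": 420},
    {"id": "2002", "name": "St Marys School", "pupils": 310},
    {"id": "3003", "name": "Riverside Academy", "pupils": 950},
]

matched, unmatched = match_schools(shapes, details)

# Pupils at matched schools whose average level exceeds the limit.
pupils_over_limit = sum(
    detail["pupils"] for shape, detail in matched if shape["mean"] > WHO_LIMIT
)
print(len(matched), len(unmatched), pupils_over_limit)
```

Name matching is the riskiest step, since two schools can share a name, so in practice it is worth checking name matches against a second field such as postcode before trusting them.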
Our tool also identifies the ten nearest schools to the user’s selected school. To work out where these were we used a nearest neighbour algorithm, again in R.
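As an illustration of the nearest-neighbour step, here is a brute-force sketch in Python (our own version was written in R). It assumes each school is reduced to a single point in a projected coordinate system measured in metres, so that straight-line distance is meaningful. The school names and coordinates are invented.

```python
import heapq
import math


def nearest_schools(target, schools, k=10):
    """Return the names of the k schools closest to the target school.

    schools: dict of name -> (x, y) in a projected system (metres).
    Distances are straight-line; the target itself is excluded.
    """
    origin = schools[target]
    distances = (
        (math.dist(origin, xy), name)
        for name, xy in schools.items()
        if name != target
    )
    # nsmallest avoids sorting the full list when only k results are needed.
    return [name for _, name in heapq.nsmallest(k, distances)]


# Hypothetical schools spaced 100m apart along a line.
schools = {f"S{i}": (i * 100.0, 0.0) for i in range(13)}

print(nearest_schools("S0", schools, k=3))
```

Comparing every pair of schools is fine for a few thousand points. For much larger datasets a spatial index such as a k-d tree would be the more usual choice.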
What was the hardest part of this project?
We faced challenges in finding mapping data of a high enough resolution to make the tool useful, as air quality can change over small distances. After a lot of research we decided to limit our analysis to London, where we could guarantee that 20m by 20m gridded data would provide us with a reasonable level of detail.
Communicating a technical topic effectively was also a challenge. The problem of air pollution is both important and complex: the units of measurement need explaining, the weight given to different pollutants requires thought, and determining whether the results are good or bad when the full health implications are unknown is complicated.
We feel this project is innovative as it’s the first time a school-by-school breakdown of air quality in London has ever been published. The techniques used to perform the spatial analysis are a first for The Times and it’s a method of finding stories we plan to use more often in the future.
What can others learn from this project?
Take as much time considering the look, feel and message of your output as you spend on the analysis itself. However involved and complicated the data analysis behind your findings is, it won’t be useful unless you can communicate those findings effectively.
When dealing with a technical subject, it helps to take the time to workshop a tool with information designers and then test it on people outside of the project. We put the league table on a test link and sent it to the wider newsroom asking for feedback. This proved to be such a useful exercise that it became an important part of the process in the projects that followed.