To combat climate change, cities need to limit urban sprawl and increase population density. To measure how much Canadian cities have grown over the last 20 years, we trained a machine learning algorithm to detect urban areas in satellite imagery, then applied it to thousands of satellite pictures covering 47,000 square kilometres. By overlaying Canadian census data, we were then able to measure the impact of this growth in urbanized land on the number of cars on the road.
Because the project let users look up their own city, it reached a wide audience and was widely shared on social media. CBC/Radio-Canada is Canada’s national broadcaster and has regional stations across the country. The data for each city was shared in advance with local reporters, who produced TV and radio stories on top of the national ones. Urban sprawl is an important topic in many municipalities, but studies about this phenomenon are often limited and private. This data-driven project brought new numbers, with one consistent methodology for the whole country, allowing comparison between cities and forcing mayors and municipal councils to consider the problem more seriously.
The satellite imagery comes from Landsat 7. The satellite was launched in 1999 and is still active today, allowing us to study urban sprawl over more than 20 years with the same image source. For each available year, we built a simple composite from the images taken between May 1 and November 1 to obtain cloud-free pictures.
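To illustrate the idea behind a seasonal composite, here is a minimal sketch in plain Python. It is not our production code: it assumes a cloud mask is already available for every image, and it takes the per-pixel median of the cloud-free observations, which is only one common way to build a composite. The function name, the tiny 2x2 scenes and all the values are made up for the example.

```python
from statistics import median


def median_composite(images, cloud_masks):
    """Per-pixel median over a season of images, skipping cloudy observations.

    images: list of 2D lists (rows x cols) of reflectance values.
    cloud_masks: same shape as images; True marks a cloudy pixel.
    Returns a 2D list; a pixel is None if every observation was cloudy.
    """
    rows, cols = len(images[0]), len(images[0][0])
    composite = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Keep only the observations of this pixel that are not cloudy.
            clear = [
                img[r][c]
                for img, mask in zip(images, cloud_masks)
                if not mask[r][c]
            ]
            row.append(median(clear) if clear else None)
        composite.append(row)
    return composite


# Three hypothetical 2x2 scenes of the same area (made-up values).
scenes = [
    [[0.10, 0.90], [0.30, 0.20]],
    [[0.12, 0.22], [0.95, 0.21]],
    [[0.11, 0.20], [0.31, 0.93]],
]
# Matching cloud masks: the very bright values above are flagged as clouds.
clouds = [
    [[False, True], [False, False]],
    [[False, False], [True, False]],
    [[False, False], [False, True]],
]
composite = median_composite(scenes, clouds)
```

In this toy case, each bright cloudy value is dropped and every pixel in the composite ends up with a plausible ground reflectance.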
To classify each pixel as an urban or non-urban area, we trained Random Forest models with Canada’s 2015 Land Cover maps produced by Natural Resources Canada. After training, the models classified each pixel in the selected cities between 1999 and 2021. This procedure was done in Google Earth Engine.
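The sketch below shows, in plain Python, the general principle behind this kind of classifier: many small trees, each trained on a random resample of the labelled pixels and a random feature, vote on every pixel. It is a toy version under strong assumptions, not the Earth Engine models we used. Each "tree" is a single split (a stump), the two features stand in for hypothetical red and near-infrared reflectance, and the training values are invented.

```python
import random


def fit_stump(samples, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest samples."""
    best = None
    for threshold in sorted({s[feature] for s in samples}):
        for urban_above in (True, False):
            errors = sum(
                ((s[feature] >= threshold) == urban_above) != bool(y)
                for s, y in zip(samples, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, urban_above)
    return best[1:]


def predict_stump(stump, sample):
    """Return 1 (urban) or 0 (non-urban) for one sample."""
    feature, threshold, urban_above = stump
    return int((sample[feature] >= threshold) == urban_above)


def fit_forest(samples, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap resample and one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(samples)) for _ in samples]  # bootstrap resample
        feature = rng.randrange(len(samples[0]))  # random feature for this tree
        forest.append(
            fit_stump([samples[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest


def predict_forest(forest, sample):
    """Majority vote of all the trees: 1 = urban, 0 = non-urban."""
    votes = sum(predict_stump(stump, sample) for stump in forest)
    return int(votes * 2 > len(forest))


# Hypothetical training pixels: (red, near-infrared) reflectance, made-up values.
training_pixels = [
    (0.30, 0.12), (0.32, 0.15), (0.34, 0.18), (0.36, 0.14),  # urban
    (0.04, 0.40), (0.05, 0.44), (0.06, 0.48), (0.08, 0.42),  # vegetation
]
training_labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(training_pixels, training_labels)
urban_guess = predict_forest(forest, (0.40, 0.10))       # bright, little vegetation
vegetation_guess = predict_forest(forest, (0.03, 0.50))  # dark red, strong infrared
```

A real Random Forest grows much deeper trees and works on many more bands and training pixels, but the voting logic is the same.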
Once each pixel was classified, we did a spatial join between the pixels and the Canadian Census dissemination areas to infer information on the residents from the latest census year (2016). We also used Statistics Canada population estimates from 2001 to 2021.
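As an illustration of what a spatial join does, here is a minimal sketch in plain Python. It assumes each pixel is represented by the coordinates of its centre in metres, that dissemination areas are simple polygons, and that a pixel covers 30 m by 30 m (the nominal resolution of Landsat's multispectral bands). The function names, zone identifiers and coordinates are hypothetical. The published notebook remains the reference for what we actually did.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon: list of (x, y) vertices, in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def urban_area_by_zone(pixels, zones, pixel_size_m=30):
    """Sum the area (in km2) of urban pixels whose centre falls in each zone.

    pixels: iterable of (x, y, is_urban) tuples.
    zones: dict mapping a zone id to its polygon.
    """
    counts = Counter()
    for x, y, is_urban in pixels:
        if not is_urban:
            continue
        for zone_id, polygon in zones.items():
            if point_in_polygon(x, y, polygon):
                counts[zone_id] += 1
                break  # a pixel centre belongs to one zone at most
    pixel_km2 = pixel_size_m ** 2 / 1_000_000
    return {zone_id: counts[zone_id] * pixel_km2 for zone_id in zones}


# Two hypothetical, adjacent dissemination areas (coordinates in metres).
zones = {
    "DA_A": [(0, 0), (60, 0), (60, 60), (0, 60)],
    "DA_B": [(60, 0), (120, 0), (120, 60), (60, 60)],
}
# Classified pixels: (x, y, is_urban), made-up values.
pixels = [
    (15, 15, True), (45, 15, True), (15, 45, False), (45, 45, True),
    (75, 15, True), (105, 45, False),
]
urban_km2 = urban_area_by_zone(pixels, zones)
```

Once every pixel is attached to a dissemination area, the urbanized area of each zone can be compared with the census figures for its residents.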
For the publication, we created 3D maps and 3D scenes with Three.js. The charts were coded with D3.js.
We published our detailed analysis with all of our code along with the main story: https://ici.radio-canada.ca/info/codesource/code-ouvert/2022/03/etalement-urbain/analysis.nb.html.
Context about the project:
There’s a lot of hype around artificial intelligence. However, finding a way to use machine learning algorithms from a journalistic perspective is difficult. As journalists, we deal with facts, but these algorithms always carry some uncertainty. It took us several months to gain enough confidence in our results to publish them.
The resulting story was worth it. From a data journalist’s point of view, it was an extraordinary experience, and we want to keep exploring this approach. The algorithms are getting easier to use, and computing power is getting cheaper: the potential for future data-driven investigations powered by AI is real.
For this story, our models classified 1.7 billion pixels in total. Can you imagine doing that manually?
What can other journalists learn from this project?
CBC/Radio-Canada wanted to show Canadians the extent of urban sprawl in their cities. By using satellite imagery and machine learning algorithms, any journalist on the planet can answer the same questions for their own country. We published all of our code and methodology, and we hope it will be useful to other newsrooms worldwide.