2020
Analysing political bias in Google’s news algorithm
Category: Best data-driven reporting (small and large newsrooms)
Country/area: United Kingdom
Organisation: The Economist
Organisation size: Big
Publication date: 6 Jun 2019

Credit: James Tozer, Dan Rosenheck, Matt McLean, Evan Hensleigh
Project description:
This article aimed to measure whether the algorithm that Google uses to serve news articles contains an ideological bias. To quantify this, I scraped Google's search results for a selection of 31 keywords on each day in 2018. I then collected data on variables that Google claims to use in its algorithm (such as how trustworthy and reputable a news source is). We built a statistical model to replicate the algorithm, predicting how often each news source should show up, and found no evidence that Google was disproportionately favouring publications on the left or the right.
Impact reached:
This was the most popular article that The Economist’s data team published in 2019 on its Graphic Detail page, with over 350,000 page views on our website. I also tweeted a lengthy thread about how I put the analysis and article together, which generated more than 1.5 million impressions. (A link to the Twitter thread is included in this submission.)
Others have published analyses of potential bias in online search engines. But I believe that this is the first study to attempt to explicitly replicate Google's algorithm by building a statistical model.
Techniques/technologies used:
I built the scraper that collected news results from Google in Python, using the Selenium package to control my browser. I used the same approach to collect data from Meltwater, a media-tracking tool, on how often each publication had mentioned each keyword in its coverage.
For data about how much Americans trust each news source, The Economist asked YouGov, with whom we regularly conduct polls, to pose this question to 1,500 respondents.
The statistical model was built in R, using a logistic regression to predict how often we would expect each publisher to show up in Google’s news tab for each keyword.
The static charts for the article were created with ggplot2 in R, then styled in Adobe Illustrator.
The interactives were created using the React and D3 libraries.
What was the hardest part of this project?
There were two tricky aspects of this project.
The first difficulty was scraping the data from Google. Google is very good at identifying bots, so I could only scrape 20 pages' worth of results at a time before closing my incognito browser and starting again. It is also extremely hard to convince Google that you are somewhere other than your actual location. For the experiment to work, we had to make sure that the search engine was not tailoring its results to our office in London. We ended up using a VPN server in a swing district in Kansas, to represent a politically neutral part of America. But even after switching on the VPN, Google could still locate my computer in London (which we checked by opening Google Maps within the incognito browser). Only after turning off location services on all my devices could we apparently convince Google that our browser belonged to a new user, with no search history, based in Kansas.
The second difficulty was building a statistical model that could replicate the variables used in Google's news algorithm. We read through the documentation that the company gives to its human "search quality evaluators", which suggests various measures of "expertise", "authoritativeness" and "trustworthiness". We then tried to quantify some of these concepts, using variables such as Pulitzer prizes, ratings by online fact-checkers, polls of the general public, and the age of each publication. Gathering these figures was time-consuming but an interesting challenge.
What can others learn from this project?
I think the main lesson from this project is not necessarily the article's conclusion: it is entirely possible that different researchers, using a different set of criteria and variables, would have reached a different finding.
Instead, I hope that this is a useful example of how journalists can apply scrutiny to aspects of technology (such as proprietary algorithms) that shape our lives significantly but offer little transparency. Scraping is not the only way to do this, but it does allow data journalists to gather a reasonable sample of data in a systematic way.