2020 Shortlist
Facebook Political Ad Collector – joint submission (The Globe and Mail and Quartz)
Category: Open data
Country/area: Canada
Organisation: The Globe and Mail, Quartz
Organisation size: Big
Publication date: 6 Mar 2019

Credit: Tom Cardoso, Jeremy B. Merrill, Steve Mickeler
Project description:
The Facebook Political Ad Collector is an open-source crowdsourcing project that monitors political advertising on Facebook. It asks readers to install a browser extension that collects ads from their feeds, submitting them to a database for analysis by journalists in newsrooms around the world. Machine learning is used to classify whether or not each ad is political. Originally built by ProPublica in 2017, the tool was taken over by The Globe and Mail and rewritten in 2019 ahead of Canada’s federal election. It has been used extensively by The Globe and Quartz for daily stories and major investigations.
Impact reached:
This international journalistic collaboration is no flash in the pan. After the spotlight left Facebook following the U.S. presidential election, American and Canadian journalists persevered through Facebook’s technological hurdles to expose wrongdoing by Facebook and its advertisers.
In Canada, reporters used the ad collector to stay on top of candidates’ and parties’ messaging during the hectic federal election campaign, and as a tool to identify third-party advertisers. The project also led to a story revealing that political parties were uploading voters’ email addresses to Facebook for targeting purposes, a possible violation of federal privacy laws. It was also used to determine that ads bought by Elections Canada, the country’s federal elections authority, were being automatically directed at particular groups of people – often young men – by Facebook’s algorithm, meaning the agency was effectively advertising to the electorate in a biased manner.
In the U.S., the project has led to two investigations into the shady advertising sold by Facebook. The first found that the banking industry used a targeting technique called “Lookalike Audiences” that experts suggest may be illegal and discriminatory. The second examined Metals.com, a precious metals retailer that’s now under investigation for securities violations. It sells highly marked-up silver coins to conservative retirees, sometimes wiping out half of their savings the moment they make a purchase. We revealed that it found its customers through millions of dollars in ads from numerous fake grassroots groups with names like “Retired Republicans” and “Sean Hannity Viewers”. Several seniors told us they found the investigation just in time to keep them from investing; local police also said it helped in their ongoing investigations.
The extension has been downloaded tens of thousands of times since it was first released in 2017, and nearly 8,000 people currently participate in the project.
Techniques/technologies used:
The project consists of a constellation of apps and services written in at least five languages (JavaScript, Python, Ruby, Rust, and bash), all running in unison to provide a back-end that journalists can use to report on political advertising on Facebook.
There’s the front-end ad collector, a browser extension for the Firefox and Chrome browsers built in JavaScript and Node.js, which monitors a participant’s Facebook feed for ads. There’s also a back-end, running a combination of Rust and Ruby on Rails applications backed by a PostgreSQL database to receive ads submitted from installed extensions and serve up the interface journalists use to filter and search for political ads. Finally, a machine learning algorithm known as a Naïve Bayes classifier, written in Python, is used to assign a political likelihood to each ad based on its content. Shell scripts written in bash tie the whole system together and provide automatic database archiving in CSV format for analysis by journalists.
The entire project is open-source and available on GitHub.
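To illustrate the classification step described above, here is a minimal sketch of a Naïve Bayes text classifier in Python. It is not the project's actual code: the class name, the tokenizer, the smoothing choice and the six training ads are all assumptions invented for illustration, and a real model would be trained on far more labelled ads.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict_proba(self, text):
        """Return a dict mapping each class to its estimated probability."""
        # Words never seen in training carry no evidence, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)  # class prior
            for t in tokens:
                # Add-one (Laplace) smoothing avoids zero probabilities.
                likelihood = (self.word_counts[c][t] + 1) / (total_words + len(self.vocab))
                score += math.log(likelihood)
            log_scores[c] = score
        # Convert log scores to probabilities in a numerically stable way.
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {c: v / norm for c, v in exp_scores.items()}


# Hypothetical training ads, made up for this example only.
texts = [
    "Vote for our candidate on election day",
    "Tell your MP to protect public health care",
    "Our party will cut taxes for families",
    "Huge sale on running shoes this weekend",
    "Try our new pizza with free delivery",
    "Download the game and play free today",
]
labels = ["political"] * 3 + ["other"] * 3

model = NaiveBayes().fit(texts, labels)
scores = model.predict_proba("Vote for the party that will cut taxes")
print(round(scores["political"], 3))
```

The "political" value returned by predict_proba plays the role of the political likelihood mentioned above: a score between 0 and 1 that journalists could use to filter or rank ads.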
What was the hardest part of this project?
Facebook does not like that we are crowdsourcing the political ads and targeting parameters our readers see on its website.
As a result, they’ve been aggressive in using technical tricks to change how ads are displayed in an effort to fool our participants’ browser extensions, reducing the transparency of political advertising on their platform. In the past, Facebook has made parts of its ad HTML invisible, hidden bits of the ad text or images in CSS, broken the word “Sponsored” into nine different HTML entities (one per letter), or even added extra letters to that text, turning it into “SpSonSsoSredSSS” in an effort to defeat our collector.
They’ve even added code that appears targeted only at us, using a JavaScript hack to make clicking the “Why am I seeing this?” button impossible for our extension. It seems Facebook would rather journalists not have access to political advertising targeting data.
But every time Facebook’s developers make a change, we find a solution and tweak the ad collector code in turn. It’s a technical and arcane game of tennis between the ad collector team and engineers in Menlo Park, played methodically over many months.
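One way to counter the “SpSonSsoSredSSS” trick described above is to read only the text a person would actually see. The sketch below is a hypothetical illustration in Python rather than the extension's real JavaScript. It assumes the decoy letters sit in elements hidden with an inline "display: none" style, which is our assumption and not something the actual markup is known to do, and the sample markup and function names are invented.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class VisibleTextParser(HTMLParser):
    """Collects text that is not inside an element styled display:none."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one flag per open element: is it hidden?
        self.hidden_depth = 0  # number of enclosing hidden elements
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        hidden = "display:none" in style
        self.stack.append(hidden)
        if hidden:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        if self.stack.pop():
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth == 0:
            self.parts.append(data)


def visible_text(html):
    """Return the text of an HTML fragment, skipping hidden elements."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def is_sponsored_label(html):
    """True if the visible text of the fragment reads exactly 'Sponsored'."""
    return visible_text(html).strip() == "Sponsored"


# Hypothetical markup: one span per letter, decoy letters hidden inline.
fragments = [
    ("S", False), ("p", False), ("S", True), ("o", False), ("n", False),
    ("S", True), ("s", False), ("o", False), ("S", True), ("r", False),
    ("e", False), ("d", False), ("S", True), ("S", True), ("S", True),
]
sample = "".join(
    f'<span style="display: none">{ch}</span>' if hidden else f"<span>{ch}</span>"
    for ch, hidden in fragments
)
raw_text = "".join(ch for ch, _ in fragments)  # what a naive scraper would read

print(raw_text, "->", visible_text(sample))
```

A check like this only survives until the markup changes again. Hiding letters through a stylesheet class instead of an inline style, for example, would defeat it, which is why the collector needs constant tweaking.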
To hold the company accountable, we have also written a story each time there’s a breaking change. The most recent came during the Canadian federal election in October, when we went from collecting targeting information on nearly 90% of ads to collecting the information just 16% of the time.
What can others learn from this project?
The Facebook Political Ad Collector has set a new standard for global, multi-newsroom collaboration on a deeply technical project – one with huge consequences for global democracy. Since being built by ProPublica in 2017, the project has added dozens of participant media organizations in Canada, the United States, Italy, Switzerland, Australia, Germany, the Netherlands, Denmark, Belgium, Mexico, Latvia and beyond. Coordinating across so many newsrooms can be difficult logistically, such as when we had to figure out why ads weren’t being collected in Italy, or when we tweaked the browser extension’s code to recognize ads for readers using Facebook in Burmese. Keeping everyone up to date took serious work.
But the effort was worth it. The project enabled organizations across the world to hold Facebook accountable for the political ads bought on its platform. The international nature of the project brought us strength, too, as Facebook’s efforts to thwart us earned quiet counterpressure from transparency advocates in several world capitals.
Project links:
github.com/globeandmail/facebook-political-ads/
www.theglobeandmail.com/politics/article-politics-briefing-who-is-seeing-political-ads-on-facebook/
qz.com/1751030/facebook-ads-lured-seniors-into-giving-savings-to-metals-com/
qz.com/1733345/the-fight-against-discriminatory-financial-ads-on-facebook/
qz.com/1537686/facebook-blocks-propublica-and-mozillas-ad-transparency-tools/