2022 Shortlist
Prediction: Bias
Country/area: United States
Organisation: The Markup, Gizmodo
Organisation size: Small
Publication date: 02/12/2021

Credit: Aaron Sankin, Surya Mattu, Annie Gilbertson, Dhruv Mehrotra, Dell Cameron, Daniel Lempres, Josh Lash, Evelyn Larrubia, Andrew Couts, Angie Waller, Joel Eastwood
Biography:
Aaron Sankin reports on how technology can be used to harm marginalized people. He focuses on platform governance and online extremism, which he previously covered for the Center for Investigative Reporting.
Surya Mattu builds tools to tell stories about how algorithmic systems perpetuate systemic biases. He previously worked at Gizmodo and ProPublica, where he was part of the “Machine Bias” team, a finalist for a Pulitzer Prize.
Annie Gilbertson is an investigative reporter and audio journalist based in Los Angeles.
Dhruv Mehrotra is a data reporter with Reveal. Additional reporting by Dell Cameron, Daniel Lempres, and Josh Lash.
Project description:
A Markup/Gizmodo collaboration, this investigation is based on more than eight million previously secret crime predictions that software developer PredPol left unsecured on the web. We conducted the first-ever independent analysis of actual PredPol crime predictions and found that they fell most heavily on low-income, Black, and Latino neighborhoods, while mostly sparing richer, White areas. Experts had feared the software was replicating police bias, but our unprecedented access to data allowed us to prove it. We also discovered that the company’s founders were aware of the inequities and developed a possible tweak, but the company didn’t change its algorithm.
Impact reached:
Published as the year came to a close, the investigation was well received by activists and academics studying policing technology, who said it provides needed transparency into the notoriously opaque universe of cop tech.
“No one has done the work you guys are doing, which is looking at the data,” said Andrew Ferguson, a law professor at American University who is a national expert on predictive policing. “This isn’t a continuation of research. This is actually the first time anyone has done this, which is striking because people have been paying hundreds of thousands of dollars for this technology for a decade.”
Immediately after publication, Sen. Ron Wyden’s office asked the reporting team for a private briefing on the findings, often a first step to legislation or other action.
Techniques/technologies used:
This investigation began when Gizmodo investigative data journalist Dhruv Mehrotra used a tool he built to search law enforcement agencies’ websites and typed in “PredPol.” A page on the LAPD’s website linked to an unsecured server containing the mother lode: millions of crime predictions PredPol had delivered to dozens of law enforcement agencies across the country over several years.
Dhruv downloaded the data, then partnered with The Markup’s Surya Mattu to analyze it. They converted more than eight million predictions stored in 42,000 individual files (each prediction a small red box drawn on a street map) into geolocation coordinates, then joined them to demographic information from the U.S. Census Bureau and public housing locations from HUD. Then the really hard work began: What methods would we use? What thresholds? What about errors in Census data? Or the fact that we can’t get demographic information for areas as small as the prediction boxes? Getting the disparate impact analysis right and bulletproofing it took months.
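As a rough illustration of that conversion-and-join step (not the team’s actual code), a spatial join in Python with geopandas might look like the minimal sketch below. The file paths and column names are hypothetical placeholders; GEOID is just the standard Census block-group identifier.

```python
# A minimal, hypothetical sketch of the join described above: prediction
# boxes (already parsed into corner coordinates) are attached to census
# block-group demographics. File paths and column names are placeholders.
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# One row per prediction, with the bounding box of the small red square.
preds = pd.read_csv("predictions.csv")  # min_lon, min_lat, max_lon, max_lat, agency, date
preds["geometry"] = [
    box(r.min_lon, r.min_lat, r.max_lon, r.max_lat) for r in preds.itertuples()
]
pred_gdf = gpd.GeoDataFrame(preds, geometry="geometry", crs="EPSG:4326")

# Census block groups with demographic columns, e.g. from TIGER/Line + ACS.
block_groups = gpd.read_file("census_block_groups.shp").to_crs("EPSG:4326")

# Attach demographics to every prediction box that intersects a block group.
joined = gpd.sjoin(pred_gdf, block_groups, how="left", predicate="intersects")

# Predictions per block group: where did the boxes concentrate?
counts = joined.groupby("GEOID").size().rename("n_predictions")
print(counts.sort_values(ascending=False).head())
```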
We used Python scripts and Jupyter notebooks to build the data sets for analysis. We used Kepler.GL and Observable Notebooks to build interactive maps that visualized the prediction data. We carried out our analysis in R, relying on the R Targets package to build a deterministic data pipeline that could be easily audited. And we used R Markdown to build data sheets that contained the findings for individual jurisdictions as well as maps showing where the predictions occurred. We also created choropleth and grid density maps using mapping software, showing predictions in their geographic context, which was invaluable for reporting.
The team filed more than 140 public records requests with 43 agencies, requesting data about stops, arrests, and use-of-force incidents. Surya and Dhruv wrote custom software to determine which incidents occurred in prediction locations.
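A brute-force version of that matching step could look like the hypothetical sketch below, which tests whether each incident’s coordinates fall inside any prediction box; the files and column names are placeholders, not the team’s actual software.

```python
# Hypothetical sketch: flag incidents (arrests, stops, uses of force) that
# occurred inside any prediction box. Column names are placeholders.
import pandas as pd
from shapely.geometry import Point, box

incidents = pd.read_csv("arrests_stops_force.csv")  # lon, lat, incident_type, ...
preds = pd.read_csv("predictions.csv")              # min_lon, min_lat, max_lon, max_lat

pred_boxes = [
    box(r.min_lon, r.min_lat, r.max_lon, r.max_lat) for r in preds.itertuples()
]

def in_any_prediction(lon: float, lat: float) -> bool:
    """True if the point falls inside at least one prediction box."""
    pt = Point(lon, lat)
    return any(b.contains(pt) for b in pred_boxes)

incidents["in_prediction_area"] = [
    in_any_prediction(lon, lat) for lon, lat in zip(incidents["lon"], incidents["lat"])
]
print(incidents["in_prediction_area"].mean())  # share of incidents in predicted areas
```

At the scale described here (hundreds of thousands of incidents against millions of boxes), a real pipeline would lean on a spatial index or a geopandas spatial join rather than this nested loop, but the logic is the same.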
What was the hardest part of this project?
One huge problem was confirming the legitimacy of the data, since it came from unsecured cloud storage rather than an official source. That involved dogging dozens of police and sheriff’s departments and local officials, searching public contracts, and scouring media reports.
Connecting the findings to real-world actions by police on the street was hampered by the agencies, which almost universally refused to share data on how officers responded to the crime predictions. So we filed more than 140 public records requests for data on arrests, stops, and uses of force. Most departments denied our requests, but we collected, standardized, and examined more than 600,000 arrests, stops, and uses of force from those that complied.
The reporting was particularly challenging under COVID-19 travel restrictions. Reporters called hundreds of arrestees and spoke to defense attorneys and prosecutors; none were aware that crime prediction software may have played a role in their cases. In many cities, advocates weren’t even aware that crime prediction software was being used at all. The team also interviewed cops, academics, policing experts, and local officials, and was eventually able to visit some of the communities affected by police departments’ use of PredPol.
The data analysis required a lot of prototyping and iteration. We used regressions to look for correlations between predictions and arrests, carried out an exposure analysis to determine who was most likely to be exposed to policing because of the predictions, and calculated the demographic composition of neighborhoods based on how many predictions they received. Each approach led to similar conclusions. To ensure the report’s findings were bulletproof, we asked subject-matter experts on predictive policing and researchers from Stanford, Columbia, the University of Pennsylvania, Oxford, and Human Rights Watch to review our methodology before publication.
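As a simplified illustration of one of those approaches, the demographic-composition comparison could be sketched as follows, assuming a table of census block groups with population counts and a per-block-group prediction tally; every file and column name here is a placeholder rather than the team’s actual schema.

```python
# Hypothetical sketch: compare the demographic makeup of block groups that
# received predictions (and a prediction-weighted version) to the
# jurisdiction-wide baseline. All column names are placeholders.
import pandas as pd

bg = pd.read_csv("block_groups_with_predictions.csv")
# columns: GEOID, total_pop, black_pop, latino_pop, white_pop, n_predictions

groups = ["black_pop", "latino_pop", "white_pop"]

def composition(df: pd.DataFrame) -> pd.Series:
    """Share of residents in each group across the given block groups."""
    return df[groups].sum() / df["total_pop"].sum()

baseline = composition(bg)                           # whole jurisdiction
targeted = composition(bg[bg["n_predictions"] > 0])  # areas that got predictions

# Prediction-weighted exposure: weight each block group's residents by how
# many predictions it received.
w = bg["n_predictions"]
weighted = bg[groups].mul(w, axis=0).sum() / (bg["total_pop"] * w).sum()

print(pd.DataFrame({
    "jurisdiction": baseline,
    "predicted_areas": targeted,
    "prediction_weighted": weighted,
}))
```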
What can others learn from this project?
First and foremost: Persistence pays. This data came not from a public records request—it would have been denied—but rather from unsecured cloud storage that an industrious reporter found by digging around. Sometimes the back door is the only door.
Second, it shows the importance of reporting out the “why” behind the data findings. The data analysis revealed that the software targeted neighborhoods disproportionately inhabited by people of color and poor people. That’s a strong finding, yet those who believe algorithms are the cure for bias would question how it is possible. So we looked into it for them, finding a bevy of academic and government reports showing that most crime is not reported, and that people of color and those living below the poverty line are more likely to file police reports when they’ve been victimized than White people or middle-class and rich people, who are more likely to handle the situation another way.
Lastly, transparency: We published an in-depth methodology showing precisely how we conducted our analysis and posted all of our data on GitHub, which allows anyone to download it and conduct their own research or reporting. As we in the media continue to battle a confused public’s contention that mainstream news is “fake” or biased, it’s never been more important to show your work, even your limitations. We do it every time at The Markup, and it’s heartening to see other newsrooms beginning to publish more robust methodologies as well.