2023 Shortlist

The International Consortium of Investigative Journalists (ICIJ)

Entry type: Portfolio

Country/area: United States

Publishing organisation: The International Consortium of Investigative Journalists

Organisation size: Big

Cover letter:

For nearly a decade, data and technology have been central to ICIJ’s cross-border investigations. Over the years, the team has developed tools that have made collaborations around the world possible and allowed journalists to share and explore millions of records securely. In parallel, the organization has made data accessible to journalists on all continents and has led analyses that help explore systemic global problems.

ICIJ started expanding its data work in 2013 during the Offshore Leaks project. Since then, the team has grown and worked on tool development, data analysis and research to address specific journalistic needs. The team has used open-source technologies to make large sets of files explorable and to share them securely for projects like the Panama Papers, Paradise Papers and Pandora Papers.

ICIJ’s tech team developed an open-source research tool known as Datashare that has been used since 2019 for various cross-border investigations. More recently, ICIJ has also developed another open-source tool called Prophecies to help with fact-checking and validation efforts.

ICIJ believes that data can be a key connector for journalists around the world and makes it accessible to everyone working on a project. With that access, journalists bring their best local knowledge and help connect the dots across borders. To facilitate this, ICIJ has also used graph database tools (Neo4j and Linkurious) as part of the reporting process.

To coordinate its efforts and stay connected, ICIJ and media partners use a communication tool known as the Global I-Hub. This platform was inspired by the experience of the Offshore Leaks collaboration and an idea brought by ICIJ member Giannina Segnini.

ICIJ’s data and research team has used various approaches to address complex data problems. These include programming languages, machine learning, Microsoft Excel and Google Sheets, as well as manual work. The team has worked on data validation, research, data gathering, data structuring, cleaning, analysis and fact-checking. This work has served journalists around the world collaborating on different projects. It has helped them explore topics systematically and has driven some of the reporting efforts.

During its projects, ICIJ has also provided training to help partners understand how best to use its technologies for research.

To gather data, journalists around the world have filed thousands of freedom of information requests in projects such as the Implant Files, and have explored public records. In investigations that started with a leak, connecting the data with public records has been a central part of the team’s work, followed by traditional investigative reporting done across borders.

ICIJ has also developed products to make data accessible to the audience, such as the Offshore Leaks database and the Medical Devices Database. The Offshore Leaks database contains information on more than 810,000 offshore entities from five different leaks.

ICIJ has been inspired by hundreds of partners around the world. Le Monde, Premières Lignes, Süddeutsche Zeitung, BuzzFeed News and The Guardian have shared files with ICIJ and media partners that have triggered various global collaborations. SVT and Quartz AI Studio have also contributed to data efforts.

In 2014, ICIJ established its first data unit, made up of three people: Mar Cabra, Matthew Caruana Galizia and Rigoberto Carvajal. The team has expanded since then.

In 2022 the ICIJ team working on data, research, technology and product included: Agustin Armendariz, Whitney Awanayah, Hamish Boland-Rudder, Jelena Cosic, Antonio Cucho Gamboa, Caroline Desprat, Emilia Díaz-Struck, Miguel Fiandor Gutiérrez, Marie Gillier, Jorge González, Karrie Kehoe, Javier Ladrón de Guevara, Soline Ledésert, Carolina Verónica López Cotán, Asraa Mustufa, Miriam Pensack, Delphine Reuter, Pierre Romera, Nicole Sadek, Bruno Thomas, Maxime Vanza Lutonda, Margot Williams.

Description of portfolio:

2022 brought different data challenges to ICIJ. Throughout the year, projects carried out in partnership with journalists and media organizations from around the world involved working with data sets of different scales, as well as data gathering and analysis efforts. Some started with leaks, others with public records. The team also structured data and shared records and analysis with journalists to help with the reporting process, using different tools and approaches.

Between January and May 2022, ICIJ incorporated data on the people and companies behind more than 11,000 offshore companies, foundations and trusts from the Pandora Papers into ICIJ’s Offshore Leaks Database. To do so, the team ran checks and validation on the data, which was originally extracted from 11.9 million records using a variety of methods including machine learning, programming languages and manual extraction. The Offshore Leaks Database now has data on more than 810,000 offshore entities from five different leaks, with links to more than 200 countries and territories.

Together with the publication of the data, ICIJ conducted an analysis of the presence of Russian oligarchs and politicians in the Pandora Papers, which was published as part of the Russia Archive project.

Additionally, between February and December 2022, ICIJ worked on the Ericsson List, which revealed that the Sweden-based multinational sought permission from Islamic State terrorists to work in an ISIS-controlled city in Iraq and paid to smuggle equipment into ISIS areas on a route known as the “Speedway.” ICIJ’s data team reviewed all the leaked documents to consolidate details of payments made by Ericsson and of what constituted a breach of its own code of conduct, complementing on-the-ground reporting. The team also reviewed public data on deferred prosecution agreements of the kind that Ericsson had negotiated while much of the corruption uncovered in Iraq was underway. ICIJ discovered that countries around the world had been adopting such U.S.-style corporate leniency deals even as they had come under increasing criticism in the U.S. itself.

In July 2022, ICIJ, The Guardian and media partners published the Uber Files. The project started with more than 124,000 leaked records. Using Apache Tika, Python, Google Sheets and Microsoft Excel, ICIJ organized information on meetings and reviewed public records on lobbying and declarations of meeting schedules to investigate how Uber stormed into markets around the world and how it deployed a phalanx of lobbyists to court prominent world leaders to influence legislation and help it avoid taxes. ICIJ had to parse countries’ varying lobbying and meeting-disclosure records, accounting for variations in the type of data available and its quality. It was key to confirm whether scheduled meetings referred to in the records had actually taken place, and connecting the data with public records was central to doing so.

Datashare, an open-source research platform developed by ICIJ’s tech team, was central to allowing journalists to search and review the documents securely. As some of the emails were in French, the team developed a feature that translated the text into English.

Between November and December 2022, ICIJ, ProPublica and media partners published Shadow Diplomats, which identified at least 500 current and former honorary consuls accused of crimes or embroiled in controversies before, during or after their appointment, including some caught exploiting their status for personal gain. ICIJ spent months requesting, collating and analyzing data on honorary consuls from all around the world, and created a first-of-its-kind index to analyze the transparency of countries and their honorary consul appointments.

The 2022 projects have had an impact in different countries, including the opening of investigations, regulatory reviews and taxi driver protests, among other things.

Project links: