Best data-driven reporting (large newsrooms) – year 2020

Winner: The Troika Laundromat

Organisation: OCCRP, The Guardian – UK, Süddeutsche Zeitung – Germany, Newstapa – South Korea, El Periodico – Spain, Global Witness and 17 other partners who can be viewed here.

Credit: Coordinators: Paul Radu, Sarunas Cerniauskas. Reporters: Olesya Shmagun, Dmitry Velikovsky, Alesya Marohovskaya, Jason Shea, Jonny Wrate, Atanas Tchobanov, Ani Hovhannisyan, Irina Dolinina, Roman Shleynov, Alisa Kustikova, Edik Baghdasaryan, Vlad Lavrov

Jury’s comment: In a field of strong entries, the substantial effort, investment and not inconsiderable risk involved in piecing this story together were among the factors the jury appreciated in selecting the Troika Laundromat, by the Organized Crime and Corruption Reporting Project (OCCRP), as the winner in this category. This far-reaching investigation touched almost 3,000 companies across 15 countries and as many banks, unveiling more than €26 billion in transfers tracked over a seven-year period (2006-2013), with the main purpose of ‘channeling money out of Russia.’ The security and scrutiny undertaken for a project of this size is evident, with real consequences for political leaders. The showcasing of detail in networks, locations and personalities embellished an already strong entry. In places this project read part thriller, part blockbuster, part spy movie. Do yourself a favour and dive in.

Organisation size: Big

Publication date: 4 Mar 2019

Project description: We exposed a complex financial system that allowed Russian oligarchs and politicians in the highest echelons of power to secretly invest their ill-gotten millions, launder money, evade taxes, acquire shares in state-owned companies, buy real estate in Russia and abroad, and much more. The Troika Laundromat was designed to hide the people behind these transactions and was discovered by OCCRP and its partners through careful data analysis and thorough investigative work in one of the largest releases of banking information, involving some 1.3 million leaked transactions from 238,000 companies.

Impact: First published in March 2019, with stories being added on an ongoing basis, the Troika Laundromat had an immediate and widespread impact. Raiffeisen, Citibank, Danske Bank, Nordea Bank, Swedbank, Credit Agricole, and Deutsche Bank were all seemingly implicated, and two banks deeply involved in the Laundromat, Raiffeisen in Austria and Nordea in Finland, saw their shares tumble. Twenty-one members of the European Parliament demanded sanctions against bankers whose financial institutions were involved in the money-laundering scheme. They also called for an “EU-wide anti-money laundering supervisory authority.” At the same time, the Parliamentary Assembly of the Council of Europe (PACE) called for swift and substantial action to strengthen anti-money laundering provisions and improve international cooperation in the fight against laundromats. The investigation triggered a major political crisis for the president of Cyprus, as we revealed that a law firm he established and co-owned, and in which he was a partner at the time, was arranging business deals linked to a friend of Russian President Vladimir Putin, the infamous Magnitsky scandal, and a network of companies used in various financial crimes. It also ignited investigations into some of Russia’s most powerful politicians, including an investigation in Spain into the property owned by the family of Sergei Chemezov, the president of Russia’s main state-owned technology conglomerate, Rostec Corporation, and a former partner of Vladimir Putin during their KGB days in Dresden, East Germany. More recently, Sweden’s SEB bank was revealed to be caught up in the Laundromat when leaked data raised questions about its dealings with non-resident clients. Overall, the Troika Laundromat put the European banking system under increased scrutiny and is now cited in the European institutions as a main reason to clean up the European financial system.

Techniques/technologies: We received the data in various formats, including PDFs, Excel files and CSVs. We built our own virtual banking database, code-named SPINCYCLE. After grouping the source data by column layout and format, we were left with 68 different structures. For each structure, we built an individual Python parsing script that fed data into the SPINCYCLE database. In the database, we organized the transactions so the data would link up. We used a proprietary IBAN API to pull details on banks that were missing in the data. For monetary values, we performed currency conversion at the rate in effect at the time of each transaction, so we linked SPINCYCLE to an online table of historic exchange rates. We also tagged the accounts for which we had received information so that we could look at the overall flow of funds from the money laundering system. A neural net, trained using data from company registries and the Panama Papers, helped us pick out the names of 22,000 individuals from the 250,000 parties involved in the money laundering system. To make the data available to our members, we provided a web-based SQL interface. Later, we added a full-text search index based on Elasticsearch, which could be searched using Kibana as an interface. We also used Aleph, our home-grown open source data analysis engine. On the landing page we aimed to present an overview of the whole network with a chord diagram and a dashboard that sets the model for the whole exploration: a large graphic on top followed by a dashboard with the key points. For the data visualization section we used the client-side Quasar Framework over Vue.js and D3.js for the graphs, all designed in Adobe Creative Suite. The collaboration took place via the OCCRP secured wiki and Signal.
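
To make two of the steps above more concrete, here is a minimal Python sketch of grouping source files by their column layout and of converting an amount at the rate for the transaction date. It is an illustration only, not the SPINCYCLE code: the function names, the sample rate table and the choice of euros as the target currency are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from datetime import date
from decimal import Decimal


def header_signature(csv_text):
    """Return the normalised column names of a CSV as a hashable tuple."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return tuple(col.strip().lower() for col in header)


def group_by_structure(files):
    """Map each distinct column layout to the names of the files sharing it.

    `files` is a dict of file name -> CSV text. Each resulting group would
    then get its own dedicated parsing script.
    """
    groups = defaultdict(list)
    for name, text in files.items():
        groups[header_signature(text)].append(name)
    return dict(groups)


# Hypothetical historic rates: (currency, day) -> euros per unit of currency.
SAMPLE_RATES = {
    ("USD", date(2010, 5, 3)): Decimal("0.75"),
    ("RUB", date(2010, 5, 3)): Decimal("0.026"),
}


def to_eur(amount, currency, day, rates=SAMPLE_RATES):
    """Convert an amount to euros using the rate for the transaction date."""
    if currency == "EUR":
        return amount
    # A missing rate raises KeyError rather than silently guessing a value.
    return amount * rates[(currency, day)]
```

Grouping on a normalised header means files that differ only in capitalisation or stray spaces share one parser, which is one plausible way a large set of files collapses into a manageable number of structures.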

The hardest part of this project: The Troika Laundromat was born out of data work done on a large set of very dry banking transactions. We had to look for patterns in order to identify and isolate transactions that stemmed from what we later defined as the Troika Laundromat (TL). You can think of the TL as a Tor-like service meant to anonymize banking transactions. We had to look for the error, for the bad link, in order to identify who was the organizer and who were the users of the system. We finally found out through careful data analysis that the bankers putting this together made a small but fatal mistake: they used only three of their offshore companies to make payments to formation agents in order to set up dozens of other offshore companies that were themselves involved in transacting billions of dollars. These payments, which were only in the hundreds of dollars each, were of course lost in a sea of millions of much larger transactions, so we had to find them and realize that they were part of a pattern. The whole Troika Laundromat came into focus after this realization. Another hard part of this particular project was the security of the team’s members. The people we reported on were very powerful in their own countries and across borders, and we had to ensure that communication with reporters in Russia, Armenia and other places was always done via secure channels. Last but not least, the fact-checking had to be done across borders and across documents and audio in many languages, so it took quite a bit of time and effort to make sure we had things right.
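
The kind of pattern described above, a handful of payers making many small payments to formation agents, can be sketched in a few lines of Python. This is a hedged illustration and not the team's actual method: the record fields (`payer`, `payee`, `amount`), the list of known agents and both thresholds are hypothetical.

```python
from collections import Counter
from decimal import Decimal


def recurring_small_payers(transactions, agents,
                           max_amount=Decimal("1000"), min_payments=3):
    """Find payers that repeatedly send small sums to formation agents.

    `transactions` is an iterable of dicts with 'payer', 'payee' and
    'amount' keys; `agents` is a set of known formation-agent names.
    Returns a dict of payer -> number of qualifying payments.
    """
    counts = Counter(
        t["payer"]
        for t in transactions
        if t["payee"] in agents and t["amount"] <= max_amount
    )
    return {payer: n for payer, n in counts.items() if n >= min_payments}
```

A filter like this only narrows the search. Deciding which of the flagged payers actually matter still takes the reporting and document work described in the entry.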

What can others learn from this project: We learned, once again, that it is the combination of deep data analysis and traditional footwork that makes good investigative journalism. It is the ability to zoom in and out between the data and the reality in the field that can find you the hidden gems. We had a data scientist working with the investigative teams, and this cooperation proved to be a recipe for success. We also ensured that journalists had multiple entry points to the data, tailored to their technical abilities. The secured wiki where we shared our findings had a section where we described in detail how the information could be accessed through different systems. This was also a place where advanced journalists shared their ready-made formulas so that others could apply them to their own data of interest. We had also learned in previous projects, and applied here, that the data scientist and our data journalists need to be available via Signal to new arrivals in the collaborative team and be ready to explain how the systems work, what we have already found in the data, and so on. This made their integration much easier and improved efficiency, as the new journalists in the project did not have to start from scratch. Another important lesson that we drew is that it is not just cooperation across countries and between very smart reporters that makes a good project: cooperation across leaks can also give you a fuller picture. In addition to the new leaked files, reporters on the Troika Laundromat used documents from previous ICIJ investigations, including Offshore Leaks, Panama Papers and Paradise Papers. It’s crucial that at some point we unify all these datasets, as there are many untold stories in the current gaps between them.

Project links: