As countries across Europe fought the COVID-19 pandemic and rushed to obtain critical supplies — such as PPE and ventilators — many countries suspended their usual public procurement rules, resulting in billions of euros of spending largely hidden from the public. Along with media partners in 37 countries, OCCRP followed the money and collected information from over 37,800 COVID-19 related tenders and contracts worth over 21 billion euros (U.S. $24.9 billion). The data gives an unprecedented view of just what Europe’s governments have been spending their billions on — and where things may have gone astray.
The project was the first collaborative effort by European journalists to pull together COVID-19 procurement data. It revealed that the majority of the continent’s known procurement deals were signed as direct awards without tender, and that large, established companies were the major beneficiaries of the deals. The data also showed massive discrepancies in prices paid for key items such as FFP2 masks, and identified countries, such as Ukraine and Czechia, where the prices paid varied most wildly. The project also comprehensively documented which countries have been the most transparent about sharing data, and which have remained “black holes” for COVID-19 procurement information.
After publication, the authors were invited to speak at various events, ranging from journalism seminars and procurement specialist calls to the annual Transparency International Anti-Corruption Conference. On these occasions, colleagues could access the data, and grievances about the lack of transparency over irregular public spending (with little competition) were shared with expert audiences. One of the authors, Adriana Homolova, was also approached by the Dutch government to help design a more transparent system for publishing public contracts, a project the government already had in the pipeline. Adriana is still being approached to speak about the data at universities and with procurement and open data researchers. The data collected will also be integrated into OpenContracting’s “Emergency Procurement Explorer.”
We used Python and Jupyter Notebooks to write scraping scripts for various public procurement websites and for data cleaning. This data was then fed into a Google Spreadsheet that was openly editable by anybody from the team and in the end was also used for data publication. Data visualisations for the publication were done with Flourish. Aleph was used to upload and search contracts and documents that were not public before.
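To give a flavour of the data cleaning involved, here is a minimal sketch of one typical step: turning scraped amount strings, which arrive with different currency labels and thousands/decimal separators, into numbers. This is an illustration rather than the project's actual code. The function name `parse_amount` and the sample strings are hypothetical, and the rule for telling a thousands separator from a decimal one is a heuristic that would need checking against each source.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_amount(raw: str) -> Optional[Decimal]:
    """Convert a scraped amount string to a Decimal, or None if unparseable.

    Hypothetical helper for illustration only.
    """
    # Keep digits and separators; drops currency symbols, codes and spaces.
    cleaned = re.sub(r"[^\d.,]", "", raw)
    if not re.search(r"\d", cleaned):
        return None

    last_dot, last_comma = cleaned.rfind("."), cleaned.rfind(",")
    if last_dot >= 0 and last_comma >= 0:
        # Both separators present: assume the later one marks the decimals.
        decimal_sep = "." if last_dot > last_comma else ","
    elif last_dot >= 0 or last_comma >= 0:
        sep = "." if last_dot >= 0 else ","
        digits_after = len(cleaned) - cleaned.rfind(sep) - 1
        # Heuristic (an assumption): a repeated separator, or one followed
        # by exactly three digits, is treated as thousands grouping.
        is_grouping = cleaned.count(sep) > 1 or digits_after == 3
        decimal_sep = None if is_grouping else sep
    else:
        decimal_sep = None

    if decimal_sep:
        whole, _, frac = cleaned.rpartition(decimal_sep)
    else:
        whole, frac = cleaned, ""
    whole = re.sub(r"[.,]", "", whole)  # strip grouping separators

    try:
        return Decimal(f"{whole or '0'}.{frac or '0'}")
    except InvalidOperation:
        return None


# Made-up sample strings in the styles different portals might use.
for text in ["1.234.567,89 EUR", "€ 12,500", "0.45", "n/a"]:
    print(repr(text), "->", parse_amount(text))
```

Using `Decimal` rather than `float` avoids rounding surprises when amounts are later summed. Values the heuristic cannot parse come back as `None` so they can be reviewed by hand in the shared spreadsheet instead of being silently guessed.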
What was the hardest part of this project?
The hardest part of this project was its sheer size. It involved coordinating dozens of individual journalists, all balancing the demands of their own newsrooms while wrestling with a massive amount of data. Journalists stayed in regular contact via the Signal app, and over a period of months collaborated on a strategy to obtain and process extremely large and diverse data, regularly updating their findings in central spreadsheets and a wiki. This could easily have collapsed into chaos, but a core team worked diligently to make sure all data was entered properly and the team was kept abreast of updates. The project managed to keep going thanks to a spirit of public service and constant communication and feedback from all involved. As for the data, the hardest part was standardizing very different data sources (official websites in many countries, FOIA requests, data leaks, TED data, etc.) into one coherent data set. Unfortunately, this also made it nearly impossible to keep the data set properly up to date.
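The standardization problem described above can be sketched as a per-source column mapping onto one shared schema. This is a simplified illustration under stated assumptions, not the pipeline we actually ran: the source names, column names and common fields below are all hypothetical, and the sample record is invented.

```python
from typing import Any, Dict

# Hypothetical shared schema for the combined data set.
COMMON_FIELDS = ["country", "buyer", "supplier", "value_eur", "source"]

# Hypothetical mappings: source column name -> common field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "ted_export": {
        "country_code": "country",
        "authority": "buyer",
        "winner": "supplier",
        "award_value_eur": "value_eur",
    },
    "national_portal": {
        "stat": "country",
        "zadavatel": "buyer",
        "dodavatel": "supplier",
        "cena_eur": "value_eur",
    },
}


def harmonise(record: Dict[str, Any], source: str) -> Dict[str, Any]:
    """Map one raw record onto the common schema.

    Columns with no mapping are dropped, and common fields the source
    does not provide stay None so the gaps remain visible.
    """
    mapping = FIELD_MAPS[source]
    row: Dict[str, Any] = {field: None for field in COMMON_FIELDS}
    for column, value in record.items():
        target = mapping.get(column)
        if target is not None:
            row[target] = value
    row["source"] = source  # keep provenance for later fact-checking
    return row


# Invented example record, for illustration only.
raw = {"stat": "CZ", "zadavatel": "Example Ministry", "cena_eur": 1000, "note": "x"}
print(harmonise(raw, "national_portal"))
```

Keeping the mappings as plain data means a new source can be added by writing one more dictionary rather than more code. It does not solve the harder part, which is that each source needs re-scraping and re-checking to stay current.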
What can others learn from this project?
This one-of-a-kind trove of data gives journalists a starting point to further investigate government spending in their own countries. Compiling this data in one place allowed reporters to compare government spending and ask questions that held their governments to account: Why did mask prices fluctuate so wildly from country to country? Why isn’t my government releasing this data when others are? OCCRP encouraged journalists to investigate COVID-19 related records in their countries and offered original source documents and assistance.
Information about government procurement is notoriously arcane for reporters to analyze: not only is the data itself incredibly complex in its structure (a tabular version of awards from TED, the European procurement system, might have 150 columns), but it also requires a deep understanding of the legal and political process by which public institutions spend billions of euros. The pandemic compelled governments to deviate from their legal rules in ways that still needed to be proportional to the severity of the emergency. Holding power to account in this context requires a tight integration of data analysis and policy analysis.
The crisis put a special spotlight on the procurement process. Now that we have learned how to analyse EU procurement data under those conditions, we hope we can also encourage more reporters to dig into the data.