This is the first time that Infobae, the Spanish-language news website with the highest traffic among Spanish-language newspapers, has carried out a project of this magnitude and with these characteristics: searching for data, making systematic requests for it, promoting open data, and sharing open data obtained from closed data.
The project focused on transparency, the potential enrichment of public officials and legislators over time, and their income from public funds. It was carried out during 2020 by the Data Unit. The Unit was created two years ago and is composed of two journalists and a programmer.
Several of these revelations had never been published before, as the patrimony and public salaries of the governors, and their reserved expenses, had always been treated as state secrets.
From the beginning of the project, our objectives were:
– Exhaustive survey through official sites.
– Determine which information to obtain through Public Information Access requests and/or through personal requests.
– Easy-to-navigate graphics to make the content more accessible (which is why we used Flourish, since we consider the tool friendly for our users).
– Stories of social value for the audience and political impact.
– In line with the concept of open journalism, the end of each article explained how the information was processed and shared the downloadable spreadsheet in an open format.
Description of portfolio:
The axes of the project were:
Patrimony of the governors of the 24 provinces, of 257 national deputies, and of 72 national senators, based on the analysis of their affidavits of assets
Public salaries of the governors of the 24 provinces of the country
Public salaries of mayors of the main province of the country, Buenos Aires.
The output was 19 journalistic pieces published in 2020, which are listed here with their links: https://docs.google.com/spreadsheets/d/1ayYWNLdS8SqHbpApp4TQTCiUaszRFaCYfUjk8K-Vu1c/edit#gid=0
The initiative included the following process:
Web detection of information in open and closed formats
Detection of sites with absence of information
Download of PDFs from the official website of the Anticorruption Office
Opening closed data using web scraping techniques and the Tabula tool
Placing requests for information in cases of non-existent or outdated information
Personal requests to the actors involved and traditional journalistic reporting to complete and cross-check the information
Data extraction with an OCR tool in cases where this was possible
Manual data loading in cases where the information was delivered on paper and OCR did not work
Data mining in Python, carried out with the developer
Building a ranking to compare patrimonies
Interactive data visualization using Flourish
Building the story after data mining and displaying the information in different types of charts
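The ranking step in the list above can be illustrated with a short Python sketch. It is not the Data Unit's actual code: the column names (`name`, `patrimony`), the sample rows, and the assumption that amounts are written with dots for thousands and a comma for decimals are all hypothetical, chosen only to show the idea of turning a cleaned spreadsheet into a comparable ranking.

```python
import csv
import io
from decimal import Decimal


def parse_amount(text):
    """Convert an amount such as '1.234.567,89' to a Decimal.

    Assumes dots separate thousands and a comma marks decimals
    (an assumption about the spreadsheet, not a documented fact).
    """
    cleaned = text.strip().replace("$", "").replace(" ", "")
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return Decimal(cleaned)


def rank_by_patrimony(rows):
    """Return (position, name, amount) tuples, highest declared amount first."""
    parsed = [(row["name"], parse_amount(row["patrimony"])) for row in rows]
    parsed.sort(key=lambda item: item[1], reverse=True)
    return [(pos, name, amount) for pos, (name, amount) in enumerate(parsed, start=1)]


# Hypothetical sample data standing in for the cleaned spreadsheet.
SAMPLE_CSV = """name,patrimony
Official A,"1.250.000,50"
Official B,"18.400.000,00"
Official C,"320.000,00"
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
ranking = rank_by_patrimony(rows)
for position, name, amount in ranking:
    print(position, name, amount)
```

Using `Decimal` rather than floating point keeps the declared amounts exact, which matters when the figures are published alongside the downloadable spreadsheet.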
The main impact was that, for the first time, some of the governors of the different provinces agreed to make their patrimony public, even though the legislation allowed them to keep it secret. They did so after Infobae's request, to avoid being exposed as hiding their patrimony. In fact, several admitted that they were falling short in terms of transparency, and that they were working or were going to work on better standards in that regard.
In the case of legislators, some who had not submitted their affidavits of assets did so after being exposed in the articles.
Another impact came from the repercussions and the audience reaction that these articles generated on social networks and in other media. The impact was even greater in the provinces, where barriers to information have historically been stronger.
The third impact has to do with the web traffic the pieces generated: on average, each one drew over 300,000 page views.