Spain empty of ministers: 80% of the provinces do not appear on their agendas

Country/area: Spain

Organisation: El Confidencial

Organisation size: Big

Publication date: 07/11/2021

Credit: Fernando Anido, María Zuil, Darío Ojeda, Marta Ley


Fernando Anido is a journalist and developer. He completed a Master’s degree in Investigative, Data and Visualization Journalism and then took a vocational training (TVET) course to specialize in programming. He did all the scraping for the project.

María Zuil is a reporter and data journalist. She completed a Master’s degree in Investigative, Data and Visualization Journalism and has been at El Confidencial since 2016. Before that, she spent two years in Berlin working for the news agency EFE and as a freelancer.

Darío Ojeda is a journalist specialized in data. He has been working at El Confidencial for 8 years, most of that time in the sports section. In 2019 he began training in data and is currently doing a Master’s in Data Journalism and Visualization.

Marta Ley is a data journalist. She completed a Master’s degree in Investigative, Data and Visualization Journalism and worked at El Mundo for 6 years. Since October 2020 she has led the data desk at El Confidencial.

Project description:

El Confidencial analyzed all the records in the ministers’ agendas (meetings, institutional events, trips…) over the last six years. From the information collected (24,236 recorded events), we determined the location of each event at the provincial level. This analysis shows which territories receive more attention and which receive less when the ministries organize their agendas: 80% of the provinces practically do not appear.

In addition, we were able to observe differences between governing parties in how they plan official visits abroad, which media outlets they choose for interviews and the times at which events are held.

Impact reached:

The Spanish territorial model is based on a system of autonomous regions in which all have, on paper, the same importance. However, this system shows inequalities when it comes to national policy. Madrid, the autonomous region where the country’s capital is located, accounts for most of the presence of politicians at this level.

Thanks to this data journalism project, we documented this reality for the first time, with conclusions that show the mismatch. On the day of publication, the main story was the newspaper’s lead item, and five related pieces were published alongside it as a block, including an interview with a former minister and an opinion article. El Confidencial was fully committed to achieving the greatest possible impact, and the reception was good. The block received about 85,000 visits, a good figure considering that it was behind a paywall and published on a weekend.

In fact, in the days before El Confidencial published this information, the government had raised the possibility of moving some ministries outside the autonomous region of Madrid. The subject is also one of today’s key political issues. In the last national elections, Teruel Existe, a party that campaigns to correct territorial imbalances, won representation in parliament for the first time. And ahead of the next elections, parties with similar demands are already proliferating in what is called “emptied Spain.”

The analysis also serves to highlight another issue: the agendas of the ministries are not complete. They do not show all of each minister’s activity, only the activities that each cabinet wishes to communicate.

Techniques/technologies used:

The first step was to scrape the information from the La Moncloa website and structure it as a table, for which we used Python. The URLs did not always follow the same pattern, so we used Selenium to simulate the behavior of a person clicking through the site.
Once we had the data in CSV format, we made a first approximation of the location by searching each event description for names of cities or countries and storing them in an additional column, again using Python. However, a place is often mentioned in a description without the event actually taking place there.
For this reason, it was necessary to review each record and check its location. This was the step that took the longest (about two weeks to clean the data) and it adds great value to this work. We will explain later why this work was so important.
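
The first-pass location step described above could be sketched roughly as follows. This is a minimal illustration rather than the code used in the project: the `PLACES` list, the function names and the `description`, `candidate_places` and `needs_review` fields are assumptions made for the example, and a real gazetteer would cover every province, city and country of interest.

```python
import re

# Hypothetical gazetteer: a tiny sample of place names for illustration only.
PLACES = ["Madrid", "Barcelona", "Teruel", "Sevilla", "Francia"]

# Map lower-cased names back to their canonical spelling.
CANONICAL = {p.lower(): p for p in PLACES}

# One case-insensitive pattern with word boundaries, longest names first.
PLACE_RE = re.compile(
    r"\b("
    + "|".join(re.escape(p) for p in sorted(PLACES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def candidate_places(description):
    """Return the distinct place names mentioned, in order of appearance."""
    found = []
    for match in PLACE_RE.finditer(description):
        name = CANONICAL[match.group(1).lower()]
        if name not in found:
            found.append(name)
    return found


def add_candidate_column(rows):
    """Add the candidate places to each event row (a dict with a 'description').

    Every row is flagged for manual review, because a mention of a place
    does not mean the event was held there.
    """
    return [
        {
            **row,
            "candidate_places": "; ".join(candidate_places(row["description"])),
            "needs_review": True,
        }
        for row in rows
    ]
```

For a description such as "Reunión con el alcalde de Teruel en Madrid", `candidate_places` returns both Teruel and Madrid, which is exactly the kind of ambiguity that made the record-by-record review necessary.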

What was the hardest part of this project?

As we said, it was necessary to review each record and check its location one by one. Thanks to this, we gained a very complete overview of what types of events are recorded in these agendas and which ones are left out, something very important for understanding our data.
In addition, extracting the location of each event from the agenda was not enough. To improve the analysis, we decided to assign a separate category to events held at the headquarters of the ministries, since all of them are located in Madrid. In this way, our main conclusions remove this centralist effect and, even so, centralism remains evident. For the first time, we contribute data on the public activity of the ministries to the debate.
We also established categories to distinguish media interviews and telematic events during the Covid-19 pandemic. This allowed us to draw conclusions about the differences between the media chosen by each cabinet, depending on who is in government.
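
As a rough sketch of how such categories can be used, the example below computes each province’s share of events with and without the categorized events. It assumes the centralist effect is removed by excluding headquarters events from the count; the category labels, field names and function name are hypothetical and do not come from the project.

```python
from collections import Counter

# Hypothetical category labels; the exact names used in the project are not known.
HEADQUARTERS = "ministry_headquarters"
INTERVIEW = "media_interview"
TELEMATIC = "telematic"
IN_PERSON = "in_person"


def province_shares(events, exclude=(HEADQUARTERS, INTERVIEW, TELEMATIC)):
    """Share of events per province, ignoring the excluded categories.

    Each event is a dict with a 'category' and a 'province' (None when
    the event has no physical location).
    """
    counts = Counter(
        e["province"]
        for e in events
        if e["category"] not in exclude and e.get("province")
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {province: n / total for province, n in counts.items()}
```

Calling `province_shares(events, exclude=())` gives the raw distribution, while the default call leaves out headquarters, interview and telematic events, so the two results can be compared to see how much of Madrid’s weight comes from the ministries simply being based there.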
The analysis is therefore useful for seeing which territories are given more importance when communicating the government’s action, but it is not possible to go much further. The specific individuals or bodies that ministers meet with, for example, are not fully listed. All of this is clearly explained in our analysis so that the reader understands how to interpret it, and it also highlights the lack of transparency on a matter of entirely public management.

What can others learn from this project?

One of the most important things we learned from this project was the importance of making decisions before building our own original database. In particular, it was important to agree in advance on the categories we were going to need. After several days working with this data, we understood that its quality depended on a careful review of all the information. Only when we were sure that our data were completely clean were we able to start the analysis.

In addition, we understood that the information needed to produce good journalism is often already published. It takes a journalistic eye and a way of working based on data analysis to extract exclusive and original stories from databases of your own creation.

Review and teamwork were also very valuable. In addition to data journalists and developers, we involved other journalists in the newsroom who specialize in political news, in order to give the reader the best possible context and approaches.

We also understood the importance of knowing well the origin of the data you are working with, so as not to publish erroneous conclusions. Knowing that what is published is a selection of what the ministers do was very important, so we investigated how that selection was made: it is based on communication strategies.

And this brings us to the last lesson learned, although it was not a surprise: the level of transparency about the activity of ministers in Spain is very low and, although improving it has been a political promise, that promise has not been fulfilled.

Project links: