Erasmus a dos velocidades

Country/area: Spain

Organisation: El Confidencial, Osservatorio Balcani e Caucaso Transeuropa

Organisation size: Big

Publication date: 28/12/2021

Credit: Darío Ojeda, María Zuil, Ornaldo Gjergji


María Zuil is a reporter and data journalist. She holds a Master’s degree in Investigative, Data and Visualization Journalism and has been at El Confidencial since 2016. Before that, she spent two years in Berlin working for the news agency EFE and as a freelancer.

Darío Ojeda is a journalist specializing in data. He has been working at El Confidencial for 8 years, most of the time in the sports section. In 2019 he began training in data journalism and is currently pursuing a Master’s in Data Journalism and Visualization.

Ornaldo Gjergji is a data and policy analyst working at Osservatorio Balcani e Caucaso Transeuropa and the European Data Network. He specializes in philosophy and international relations: ethics, human rights and armed conflicts.

Project description:

‘Erasmus a dos velocidades’ is a collaborative project between El Confidencial and the Osservatorio Balcani e Caucaso Transeuropa. The articles aim to show the flows in the Erasmus program and the inequalities between the countries taking part in it.

Although it is a very popular academic program, Erasmus has, since its origins, failed to be accessible to people from different cultural and economic backgrounds. Grant amounts vary widely by country, region, and even university, which can discriminate among European students and has social consequences such as poorer labor-market integration, among others.

Impact reached:

The project was published a few days ago at El Confidencial and reached more than 120,000 viewers across the two articles published. Our partners, Osservatorio Balcani e Caucaso Transeuropa (OBCT), are publishing in the following days, so it is too soon to know the full impact it will have.

However, we consider it a pioneering project, since this topic has not been covered in this way before by a mass-media outlet. The visualizations of student flows by country, region, academic field and economic level are accessible for the very first time in a newspaper, so everyone can explore and play with the data.

This is of interest to all present and future students and also to the European public, since inequality within the European framework is a problem that affects all of us. Besides, the second part, focused on the differences between cities in terms of daily cost, is also attractive and useful to any reader.

But our main goal is to call for a change in this program. The EU authorities are already aware of the program’s inequalities, and a well-informed society can push for a change in the policies made at the European Parliament. The main impact we hope for from ‘Erasmus a dos velocidades’ is to influence the program’s next editions and budgets to make it more inclusive.

So this project aims to be a tool that gives people the information they need to ask for a better program, whose objective should be to bring the regions closer together and build a stronger and more competitive European Union.

Techniques/technologies used:

The data analysis for this project has been carried out using the programming language R and RStudio, mainly relying on the tidyverse, sf, and osmdata libraries.

After an initial data exploration and cleaning, the Erasmus cities were geolocated, since there were no coordinates in the original dataset. Because city names were not entered in a standardized way, geolocation was done in one of three ways: by matching the cities’ names in the original dataset against the EUROSTAT dataset on European Local Administrative Units; by geolocating the cities using OpenStreetMap’s Nominatim; or manually, when the first two methods did not return valid coordinates.
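The three lookup routes described above can be sketched as a simple cascade. This is a minimal illustration in Python, not the project’s R code: the function names, the `lau_index` and `manual` dictionaries, and the `geocoder` callable (a stand-in for a Nominatim request) are all hypothetical, and trying the routes in a fixed order is an assumption.

```python
import unicodedata
from typing import Callable, Dict, Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)


def normalize(name: str) -> str:
    """Lower-case, strip accents and collapse whitespace so spelling variants match."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def geolocate(
    city: str,
    lau_index: Dict[str, Coord],
    geocoder: Callable[[str], Optional[Coord]],
    manual: Dict[str, Coord],
) -> Tuple[Optional[Coord], str]:
    """Return (coordinates, method) using the first route that yields a result."""
    key = normalize(city)
    # 1. Exact match on the normalized name in a table built from the LAU dataset.
    if key in lau_index:
        return lau_index[key], "lau"
    # 2. Ask a geocoding service (hypothetical callable wrapping e.g. Nominatim).
    found = geocoder(city)
    if found is not None:
        return found, "geocoder"
    # 3. Fall back to a hand-maintained table of corrections.
    if key in manual:
        return manual[key], "manual"
    return None, "unresolved"
```

Returning the method alongside the coordinates makes it easy to count how many cities each route resolved and to list the ones that still need manual work.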

Finally, once we had the coordinates of the cities, these were geographically matched with the geometries of the European NUTS2 regions in order to cross-reference Erasmus mobilities with economic development and produce the CSV files used by the design team for the visualizations.
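Matching a point to the region that contains it is a point-in-polygon test. The sketch below shows the idea in plain Python with the ray-casting method; it is an illustration only, since the project relied on R’s sf package for this step. It assumes each region is a single ring without holes, whereas real NUTS2 geometries are multipolygons, and the region codes and coordinates in the usage example are invented.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)
Ring = List[Point]           # polygon outline; last vertex connects back to the first


def point_in_ring(point: Point, ring: Ring) -> bool:
    """Ray casting: count how many edges a horizontal ray from the point crosses."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the ray's y coordinate can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_region(point: Point, regions: Dict[str, Ring]) -> Optional[str]:
    """Return the code of the first region containing the point, or None."""
    for code, ring in regions.items():
        if point_in_ring(point, ring):
            return code
    return None


# Toy example with two adjacent squares (hypothetical region codes).
toy_regions = {
    "R1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "R2": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
print(assign_region((1, 1), toy_regions))
print(assign_region((3, 1), toy_regions))
```

A spatial library also handles holes, multipart regions, map projections and spatial indexing, which matter at the scale of the full dataset.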

The d3.js library was used for the visualizations in the articles and, specifically, the PixiJS library for the map, which is animated on canvas. Both are open source. In addition, the page’s functionality (such as scrollytelling) was developed in plain JavaScript.

The data used by the Sankey diagrams was converted from CSV to JSON format. These technologies were chosen to reduce the page’s load weight. d3.js is a very powerful library, well established in the interactive visualization development community. For its part, PixiJS makes it easier to work with canvas elements, something we needed because of the number of elements on the map. Both libraries, combined with data processing, JavaScript, and HTML and CSS (Sass, specifically), allowed us to develop what the design team had in mind for this project. In addition, Microsoft Excel, Google Sheets and LibreOffice were used to analyze the data in spreadsheets.
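As an illustration of the CSV-to-JSON step for the Sankey diagrams, the sketch below turns a table of flows into a nodes-and-links structure of the kind Sankey layouts commonly consume. The column names (`origin`, `destination`, `students`), the output shape and the sample figures are assumptions for the example, not the project’s actual files.

```python
import csv
import io
import json
from typing import Dict, List, Tuple


def flows_csv_to_sankey(csv_text: str) -> dict:
    """Build {"nodes": [...], "links": [...]} from rows of origin,destination,students."""
    nodes: List[dict] = []
    index: Dict[Tuple[str, str], int] = {}
    links: List[dict] = []

    def node_id(side: str, name: str) -> int:
        # Keep sending and receiving sides as separate nodes so the graph has no cycles.
        key = (side, name)
        if key not in index:
            index[key] = len(nodes)
            nodes.append({"name": name, "side": side})
        return index[key]

    for row in csv.DictReader(io.StringIO(csv_text)):
        links.append({
            "source": node_id("origin", row["origin"]),
            "target": node_id("destination", row["destination"]),
            "value": int(row["students"]),
        })
    return {"nodes": nodes, "links": links}


# Made-up numbers, for illustration only.
sample = "origin,destination,students\nSpain,Italy,10\nItaly,Spain,7\n"
print(json.dumps(flows_csv_to_sankey(sample), indent=2))
```

Doing this conversion ahead of time means the browser receives a structure it can use directly, instead of parsing and reshaping a CSV on every page load.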

What was the hardest part of this project?

First, from the journalistic point of view, the initial problem was to analyze a dataset of more than 17 million data points in order to obtain qualitative information with which to develop our hypotheses. For that, we used statistical programming and computational techniques, working on the raw data for several days.

But the information on the flows was not enough by itself, so the big challenge was to see where the gaps were. That is why we looked for a methodology that could associate the cities with the European regions to which they belong, in order to cross-reference the Erasmus flows with regional economic data, categorize the regions into the three official levels of economic development, and compare them so as to draw inferences about inequalities and how they underpin the Erasmus programme.
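The categorization step can be sketched as a threshold rule on regional GDP per capita expressed as a percentage of the EU average. The cut-offs below (75 and 90) follow the EU cohesion policy categories for 2014-2020; whether the project used these exact values, and how it treated values that fall exactly on a boundary, is an assumption, so both cut-offs are parameters.

```python
def development_level(
    gdp_pct_of_eu_avg: float,
    less_developed_below: float = 75.0,  # assumed cut-off
    more_developed_from: float = 90.0,   # assumed cut-off
) -> str:
    """Classify a region by GDP per capita as a percentage of the EU average."""
    if gdp_pct_of_eu_avg < less_developed_below:
        return "less developed"
    if gdp_pct_of_eu_avg < more_developed_from:
        return "transition"
    return "more developed"
```

Once every NUTS2 region carries one of these labels, mobilities can be grouped by the level of the sending and receiving region and compared.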

In addition, to bring the story closer to the public, we wanted students to tell their own experiences, both Erasmus participants and those who could not afford the program. To find these testimonials, we used an online survey as a tool to obtain new data.

Besides that, tackling a project of this magnitude has also been a challenge for the journalists involved. Until now we had little experience leading work across several newsrooms, and this project has taught us more about the workflows and communication that this type of story requires.

What can others learn from this project?

On the one hand, data analysis is essential for bringing complex issues closer to a general audience, and together with the design team we have created tools that are useful for both readers and colleagues. We would love this project to help journalists in other countries apply the ‘Erasmus a dos velocidades’ approach to their own context. That is why we are going to release the database in the coming days, once it has also been published by OBCT and by any interested members of the European Data Network, of which we are both part.

Besides, the design of the visualizations and the tools used to shape this project can also serve as inspiration for colleagues with a more technical background, such as data journalism developers and designers, when building their own articles.

Furthermore, European affairs are often among the least known topics in newsrooms, which are usually far from the corridors of Brussels. We believe that this story is an example of how these issues can be taken beyond mere European politics to look at how policies affect citizens.

Project links: