2020
The danger of being young in Mexico
Category: Best data-driven reporting (small and large newsrooms)
Country/area: Mexico
Organisation: El Universal
Organisation size: Big
Publication date: 8 Dec 2019

Credit: Jetzabel Daniela Guazo, Iñigo Arredondo, Daniel Gómez, Melissa Amezcua, Valente Rosas, Miguel Garnica
Project description:
Two of every ten people killed in Mexico between 2007 and 2018 were young Mexicans (15 to 24 years old); 59,779 of them were murdered in this period alone. But the problem is even bigger: in three years (2007 to 2011) the homicide rate for this population tripled, reaching an alarming figure: 29 murders per 100,000 young people. This was taken as proof of extreme violence and was considered scandalous, but the new trends show violence that is surpassing every existing record.
According to the experts, the main reasons for this are corruption and the lack of justice in Mexico.
Impact reached:
In Mexican media it is very difficult to create a multimedia product, but with this microsite we built a complete product that engages users and explains a difficult phenomenon in different ways. In addition, the analysis helps readers understand what happened in each state of Mexico; this type of product is not common in Mexican media.
We were mentioned by the Global Investigative Journalism Network (GIJN) community as one of the best data journalism investigations published during that week (August 12, 2019), both for the visualizations and for the analysis of the homicides of young people in Mexico. The work was also republished by more than five online media outlets in the country.
Techniques/technologies used:
We used Python to clean and analyze the data. Two libraries were especially important for the analysis: Pandas and NumPy.
Matplotlib and Seaborn were used to create charts that let us explore different patterns that were important for the story.
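As an illustration of this kind of analysis, the following is a minimal sketch of how a homicide rate per 100,000 young people could be computed with Pandas. It is not the team's actual code: the column names, the state labels and all the numbers are hypothetical and are not Inegi figures.

```python
import pandas as pd

# Hypothetical counts for illustration only; these are NOT real Inegi figures.
records = pd.DataFrame({
    "state": ["A", "A", "B", "B"],
    "year": [2007, 2011, 2007, 2011],
    "homicides_15_24": [50, 150, 20, 30],
    "population_15_24": [500_000, 500_000, 400_000, 400_000],
})


def add_rate_per_100k(df):
    """Return a copy of df with a homicide rate per 100,000 people."""
    out = df.copy()
    out["rate_per_100k"] = (
        out["homicides_15_24"] * 100_000 / out["population_15_24"]
    )
    return out


rates = add_rate_per_100k(records)

# Wide table: one row per state, one column per year, ready for plotting.
wide = rates.pivot(index="state", columns="year", values="rate_per_100k")
print(wide)
```

A wide table like this can be passed to Matplotlib or Seaborn to draw one line per state and look for patterns over time.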
The data visualization was created with the JavaScript library D3, which helps us bind the data to the graphics. The “graph-scroll” library was used to add extra interactions that complete the narrative and aid comprehension of the visualization through scrollytelling.
The goal of this visualization was to show, at first glance, the homicide rate trends for two age groups, and also to create a ranking for each year. We did not use a map because it would have limited the temporal dimension and would not have been useful for comparing the two age groups. That is the main reason we created a set of separate time series and then added a transition that orders the states by their homicide rate in a given year.
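The per-year ordering described above can be sketched as follows. This is only an illustration in Python of the data step behind such a ranking, not the D3 code used in the microsite; the state names, the rates and the helper `rank_by_year` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rates per 100,000 as (state, year, rate); not real figures.
rates = [
    ("State A", 2010, 12.0),
    ("State B", 2010, 30.5),
    ("State C", 2010, 8.2),
    ("State A", 2011, 25.0),
    ("State B", 2011, 21.0),
    ("State C", 2011, 9.1),
]


def rank_by_year(rows):
    """Return {year: [states ordered from highest to lowest rate]}."""
    by_year = defaultdict(list)
    for state, year, rate in rows:
        by_year[year].append((state, rate))
    return {
        year: [
            state
            for state, _ in sorted(pairs, key=lambda p: p[1], reverse=True)
        ]
        for year, pairs in by_year.items()
    }


ranking = rank_by_year(rates)
print(ranking[2010])
print(ranking[2011])
```

In a chart, the position of each state in the list for the selected year would determine where its time series is drawn after the transition.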
For the creative process, the first step was a sketch on paper; then we built a static version in Canvas using the p5.js library with real data. However, the curves of the time series did not render well and the transitions were very difficult and slow; that was the main reason to build the graphic in SVG with the JavaScript library D3, one of the most powerful tools for visualizing data.
What was the hardest part of this project?
The objective of this project was to show the danger of being young in Mexico and to establish the probability of being killed in different states of the country. To try to understand the dynamics of a phenomenon as complex as homicides in Mexico, you need data that is as disaggregated as possible (temporally and spatially) and that covers enough years to show real trends in the problem.
The big issue is that the data exists only across different databases in Mexico. Inegi, the agency in charge of providing the main socio-economic indicators of the country, is the most credible source for this topic since its figures come from death certificates, but all of its data arrives a year late. This means that the latest homicide data available were for 2018, and the 2019 data would be available only at the end of 2020. On the other hand, the Secretariado Ejecutivo, an agency of the federal government, has up-to-date homicide data for the country, but those numbers do not detail the age of each victim. In these databases, age is reported in different ranges, and if we wanted to obtain the number of victims rather than the number of investigation files, we had to limit our analysis to 2015 onward, which would not have allowed a temporal analysis showing real patterns over time. We determined that the most accurate source for this project was Inegi because in its data we could find patterns for each county; that was a goal of the investigation and it would determine the interviews.
Another difficult part is that most of the country is very dangerous for conducting interviews and reporting on violence, so we had to select very carefully the counties that were visited. We wanted to show the magnitude of
What can others learn from this project?
One of the things that made this project stronger was having a multidisciplinary team. Having specialized people for each of the topics in the project was essential.
Understand that the issue of homicides cannot be analyzed lightly. We need a methodology that supports each of the findings, and we need to know from the beginning what we want from the data. From the start, we wanted to show the danger of being young in Mexico and that young people were (and still are) the population most affected by violence in the country. Without data to support our hypothesis, we would have been left with only anecdotes and testimonies; in this case, the analysis of the databases gave us the best starting point.
Be clear, transparent and accurate with the data. And work with people who have different skills.