The streets of the women
Entry type: Single project
Publishing organisation: Newtral.es
Organisation size: Small
Publication date: 2022-06-27
Authors: Javier Beltrán, Paula Boira, Abraham Carrera, Cristina Pita da Veiga, and Irene Larraz. With the collaboration of: Carla Pina, Clara Pérez, Gonzalo Alejándrez, Lucía Díaz, María García Arenales, María Pascual, Marta Casáis, Adrián Bono, and Uxía Carral.
Newtral Data is the data visualization and analysis area of Newtral.es. We are a pioneering team in Spain in working with artificial intelligence and in creating algorithms, with journalists and engineers working together to design tools focused on journalism. The team is made up of journalists, engineers, programmers, and graphic designers who turn data into stories through different visual narratives, supported by graphics, illustrations, infographics, and other approaches to covering current affairs. To do so, we use technology to automate some scraping and data analysis processes.
At Newtral.es we analyzed the names of 39,251 streets in 69 Spanish cities to check how many of them commemorate a woman. But we did not do it alone: we developed an algorithm to reduce the enormous work that identifying, reviewing, and manually annotating every street in the 69 cities, using OpenStreetMap data, would have involved. We therefore opted for a hybrid working mode in which the algorithm, supervised by the ‘human’ team, automatically labeled the gender of the streets and retrieved information from open sources such as Wikipedia.
The project made it possible to measure, for the first time, the extent of gender inequality in our own streets, whose names serve as a sign of recognition for prestigious people in the country. To produce this special, Newtral drew inspiration from other projects in this field, such as those developed by Geochicas and the Mapping Diversity project. The technology team developed a crawler (an automated program that crawls and extracts content from the web) that goes through the cities studied and obtains both the street names and their geometries, so that the bot can decide whether each name refers to a person and what that person's gender is.
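As an illustration of the crawling step, here is a minimal Python sketch of how street names and geometries could be requested from OpenStreetMap through the Overpass API and then grouped by name. This is an assumption made for illustration only: the text does not say which endpoint or query the Newtral crawler used, and the function names `build_overpass_query` and `extract_streets`, as well as the sample data, are hypothetical.

```python
def build_overpass_query(city_name, timeout_s=60):
    """Build an Overpass QL query for all named streets inside a city.

    Assumption: the city is matched by its OSM "name" tag on an
    administrative boundary; a real crawler may need a stricter filter.
    """
    return (
        f'[out:json][timeout:{timeout_s}];\n'
        f'area["name"="{city_name}"]["boundary"="administrative"]->.city;\n'
        'way(area.city)["highway"]["name"];\n'
        'out geom;'
    )


def extract_streets(overpass_json):
    """Group way geometries by street name from an Overpass JSON response.

    Returns a dict mapping each street name to a list of segments, where
    each segment is a list of (lat, lon) tuples.
    """
    streets = {}
    for element in overpass_json.get("elements", []):
        if element.get("type") != "way":
            continue  # only ways carry street geometries here
        name = element.get("tags", {}).get("name")
        if not name:
            continue  # skip unnamed ways
        segment = [(pt["lat"], pt["lon"]) for pt in element.get("geometry", [])]
        streets.setdefault(name, []).append(segment)
    return streets


# Made-up response fragment in the shape Overpass returns with "out geom".
sample_response = {
    "elements": [
        {
            "type": "way",
            "id": 1,
            "tags": {"name": "Calle de Ejemplo", "highway": "residential"},
            "geometry": [{"lat": 40.0, "lon": -3.0}, {"lat": 40.1, "lon": -3.1}],
        },
        {"type": "node", "id": 2},
    ]
}

streets = extract_streets(sample_response)
```

A street is often split into several ways in OpenStreetMap, which is why the sketch collects a list of segments per name instead of a single geometry.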
This required creating a classifier based on artificial intelligence (AI) that separates street names and identifies those that refer to male or female figures. The process has two steps. The first is based on rules and dictionaries: the gender is assigned automatically if the street name contains a reference to a profession or to an aristocratic and/or religious title that makes it possible to determine the gender (e.g., Father Damien street). The second is based on our artificial intelligence model, which, given the name of a street, determines whether it refers to a person and what that person's gender is (e.g., Méndez Álvaro street).
The AI model was trained in two phases. In the first, it was trained on a set of unrevised data associating street names and genders in different Spanish and Latin American cities. In the second, it was retrained on data reviewed by the Newtral.es team of journalists for more than 20 Spanish cities. The accuracy of this second algorithm in determining the gender of a street was above 93% for the cities analyzed.
Finally, for the streets that refer to a person, a further automated process searched Wikidata for information of interest about each one. For the visualization, two entry points were chosen: a scrollytelling piece that highlights some of the data in a journey through the cities and their profiles, and an interactive map accompanied by a text analyzing the data.
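The rule-based first step of the classifier can be sketched in Python as follows. This is a minimal sketch, not Newtral's implementation: the title dictionaries below are small hypothetical examples (the project's actual rules and dictionaries are not given in this text), and the function names `normalize` and `rule_based_gender` are invented for illustration.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny dictionaries of Spanish titles.
# The project's real lists are not published here.
FEMALE_TITLES = {"santa", "reina", "doctora", "madre", "sor",
                 "condesa", "duquesa", "princesa", "maestra"}
MALE_TITLES = {"san", "santo", "rey", "doctor", "padre", "fray",
               "conde", "duque", "principe", "general", "obispo"}


def normalize(text):
    """Lowercase the text and strip accents so 'Príncipe' matches 'principe'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def rule_based_gender(street_name):
    """Return 'female', 'male', or None when no title in the name decides it.

    None means the name would be passed on to the second step (the model).
    """
    tokens = re.findall(r"[a-z]+", normalize(street_name))
    for token in tokens:
        if token in FEMALE_TITLES:
            return "female"
        if token in MALE_TITLES:
            return "male"
    return None


examples = {
    name: rule_based_gender(name)
    for name in [
        "Calle del Padre Damián",
        "Avenida de la Reina Victoria",
        "Calle de Méndez Álvaro",
    ]
}
```

In this sketch, "Calle del Padre Damián" is resolved by the title "padre", while "Calle de Méndez Álvaro" contains no title and returns `None`, so it would fall through to the model. A dictionary lookup like this can also misfire on place names that begin with a title, which is one reason human review of the output remains useful.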
Context about the project:
Almudena Grandes is the latest great female figure to enter the Spanish street map, after the Madrid City Council agreed last November 27 to name a public street after her following her death. The writer thus joins the small group of women who have managed to be represented in the public space: streets named after women account for barely 15%. In some cities representation is even lower, as in Las Palmas de Gran Canaria, Melilla and Tarragona, where it does not reach 8%. The large gender imbalance is compounded by the background of the women who are represented, as most of them come from a religious or monarchical background. We wanted to expose this.
What can other journalists learn from this project?
The project is a useful example of how to process a huge amount of data with the aid of an algorithm. The algorithm helped us analyze the street names of 69 cities to check how many streets commemorate a woman, how many are dedicated to religious, fictional or mythological figures, and how many of the people honored have a Wikipedia page, and thus to evaluate their significance. Cross-referencing these sources gave us really interesting material to work with at a journalistic level. Presenting part of these results as a scrollytelling piece was a challenge, but a necessary one to tell the story better.