2022
Mapping Diversity
Country/area: Italy
Organisation: Osservatorio Balcani Caucaso Transeuropa, Sheldon Studio, European Data Journalism Network
Organisation size: Big
Publication date: 14/7/2021

Credit: Giorgio Comai, Alice Corona, Lorenzo Ferrari, Ornaldo Gjergji, Chiara Sighele
Biography: Giorgio Comai – data analyst,
Alice Corona – data journalist,
Lorenzo Ferrari – editorial coordinator,
Ornaldo Gjergji – data analyst,
Chiara Sighele – project coordinator.
Project description:
Our cities’ street names are not harmless urban elements, useful only for orienting ourselves in the places we pass through: they have a strong symbolic power. They are the result of decision-making processes linked to the legitimisation of the past, and to the construction of a collective historical memory.
Mapping Diversity visualises gender inequalities among people who have streets named after them in regional capitals in Italy. Thanks to the project, it is possible to see not only how many streets are dedicated to women, but also where they are located and who these personalities were.
Impact reached:
Mapping Diversity enriches a wider discourse on gender inequalities in Italy. Although there have been improvements in the last few years, Italy is still a country where men occupy the vast majority of top positions in the key areas of society.
Mapping Diversity raised a critical issue in the public debate in Italy. It was extensively mentioned by national newspapers (la Repubblica, il Post, Linkiesta) and magazines (Elle Decor, Cosmopolitan), raising awareness of how rarely female historical figures are recognised in Italian cities. After publication we also began intensive exchanges with national NGOs dealing with gender equity and female representation.
The project also won the Italian Data Journalism award presented by the Festival Glocal Varese.
Techniques/technologies used:
The analysis was carried out using R and RStudio, specifically the tidyverse and osmdata packages. For this project we developed an R package, “tidywikidataR”, to provide a consistent way of retrieving data from Wikidata in tidy formats.
The data was collected from open and crowdsourced sources. Street names were extracted from OpenStreetMap and then filtered, cleaned and matched with instances contained in Wikidata. Data collected automatically was then double-checked manually through an ad hoc interface developed for the purpose. The visual part was designed so as to let users explore the map of each city, with pop-ups showing a short profile of the women represented.
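The cleaning and matching step described above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the project, and it is not the project's actual code: the prefix list, the function names and the small `SAMPLE_WIKIDATA` dictionary (a stand-in for a real Wikidata lookup) are all assumptions made for illustration.

```python
import re
from collections import Counter

# Common Italian street-type prefixes (illustrative and incomplete).
STREET_TYPES = r"^(via|viale|vicolo|piazza|piazzale|corso|largo|lungomare)\s+"


def extract_person_name(street_name):
    """Strip the street-type prefix, e.g. 'Via Grazia Deledda' -> 'Grazia Deledda'."""
    return re.sub(STREET_TYPES, "", street_name.strip(), flags=re.IGNORECASE)


# Hypothetical stand-in for a Wikidata lookup: person name -> gender label.
SAMPLE_WIKIDATA = {
    "Grazia Deledda": "female",
    "Giuseppe Garibaldi": "male",
    "Dante Alighieri": "male",
}


def count_by_gender(street_names, lookup):
    """Count streets per gender label; names with no match are 'unmatched'."""
    counts = Counter()
    for street in street_names:
        name = extract_person_name(street)
        counts[lookup.get(name, "unmatched")] += 1
    return counts


streets = [
    "Via Grazia Deledda",
    "Piazza Giuseppe Garibaldi",
    "Corso Dante Alighieri",
    "Via Roma",  # not named after a person in the sample lookup
]
counts = count_by_gender(streets, SAMPLE_WIKIDATA)
print(dict(counts))
```

Streets that fall into the "unmatched" group are the ones that would need a closer look, since they may be named after places, dates or people missing from the lookup.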
Although the project currently focuses on the gender gap, we plan to cover more cities, European countries and topics. The established methodology can lead to further analysis, such as: which roles, segments of the population and historical figures do we honour, and which ones do we ignore? To what extent are foreigners, people of colour, etc. celebrated in our urban spaces? And to what extent are problematic figures celebrated instead (e.g. people associated with racism, colonialism, dictatorships, etc.)?
What was the hardest part of this project?
The hardest part of this project has been dealing with the data. Because OpenStreetMap and Wikidata are crowdsourced sources, matching toponyms and odonyms with Wikidata entries has not always been smooth, especially where streets were named after local personalities with no Wikipedia pages to retrieve the data from.
This made it necessary for us to manually check most of the associations in order to fine-tune our analysis. This has been highly time-consuming, but it ensured a level of quality that automatic data analysis alone could not have achieved.
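One way to organise such a manual check is to compare each cleaned street name with the label of its candidate Wikidata entry and send weak matches to a reviewer. The Python sketch below is a hypothetical illustration of that idea, not a description of the interface used in the project: the function name, the `threshold` value and the use of `difflib` string similarity are all assumptions.

```python
from difflib import SequenceMatcher


def triage_match(street_person, candidate_label, threshold=0.9):
    """Return 'auto' when the two names are near-identical, else 'review'.

    The threshold is an arbitrary illustrative value, not one taken
    from the project.
    """
    similarity = SequenceMatcher(
        None, street_person.lower(), candidate_label.lower()
    ).ratio()
    return "auto" if similarity >= threshold else "review"


# Identical names are accepted automatically.
print(triage_match("Grazia Deledda", "Grazia Deledda"))
# A clearly different candidate is sent to manual review.
print(triage_match("Roma", "Giuseppe Garibaldi"))
```

String similarity alone cannot resolve homonyms or abbreviated names, which is one reason a human check remains necessary.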
What can others learn from this project?
What we learned ourselves: that open data has great potential for culturally transformative projects. There are, of course, obstacles that hinder the use of this data. Since we had to develop our own R package to complete this project, we also wrote an extensive guide on how to use it, so that others can harness the wealth of data that is open and free, but not always easy to access.