#LaGrandeVacance (The Great Vacancy)
Category: Best data-driven reporting (small and large newsrooms)
Organisation: La Marseillaise, Marsactu, Médiapart, Le Ravi
Organisation size: Small
Publication date: 29/10/2019
Credit: David Coquille, Louise Fessard, Benoit Gilles, Jean-François Poupelin, Marius Rivière, Julien Vinzent
#LaGrandeVacance (The Great Vacancy) is a joint investigation by four newsrooms based in Marseille, France (La Marseillaise, Le Ravi, Marsactu and Médiapart) into how the City of Marseille let dozens of its own buildings rot away. Using 5,424 real-estate transactions carried out by the city administration since 2003 (public data retrieved from real-estate registration records) and visual checks, we concluded that at least 68 buildings in the city centre, owned by the city administration, were empty and had not been renovated at all. We also uncovered several dubious transactions involving local entrepreneurs or politicians.
Published simultaneously in four Marseille media outlets on October 29, 2019, Operation #LaGrandeVacance had some media impact. Several national newspapers and radio stations relayed it. The municipality, which had stayed silent for many months, finally had to react. At a press conference held a few days before November 5, 2019, one year after the collapse of two buildings on Rue d’Aubagne, the mayor’s office had to admit that its own real-estate holdings were in poor condition and that unsanitary housing was not only the fault of private owners.
As is often the case, the mayor, Jean-Claude Gaudin, tried to minimize the phenomenon and insisted that he had done everything he could to improve the situation.
Moreover, a few months after our investigation, several unscrupulous landlords who were renting out slum housing were arrested, for the first time since the Rue d’Aubagne tragedy.
We first used Google Sheets to capture the administrative formalities detailing the purchases and sales of properties by the City of Marseille and its satellite bodies. We tried running text recognition on PDFs of the scanned documents, but the result was imprecise, so we had to enter them one by one in a Google Sheets document. However, the formalities did not include an address, only cadastral data. We had to develop a script (Python / Selenium WebDriver) to scrape the addresses corresponding to these cadastral references one by one (see how it worked on this Youtube video or in the article in link 4). This allowed us to avoid long and tedious manual input. We also created a list of nearly 1,000 buildings subject to a dangerous structure notice or included in a public-led renovation project, sometimes using Tabula to read PDFs. We used OutWit Hub to extract and list the officers behind the companies involved in the transactions, and OpenRefine to clean the data (addresses, names…). Finally, we used RStudio to carry out the data analysis. Once the data had been entered and checked, we used Google Street View to look at the addresses of the damaged properties and, in particular, to compare photos taken in previous years. We then supplemented this work with physical site visits.
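The address lookup described above could be organised roughly as follows. This is only a minimal sketch, not the script we ran: the `resolve_addresses` and `normalise_reference` names, the reference format and the sample addresses are all invented for illustration, and the browser-driven lookup is replaced by a stand-in function so the example runs without a network connection.

```python
from typing import Callable, Dict, Iterable, Optional


def normalise_reference(raw: str) -> str:
    """Collapse whitespace and upper-case a reference so duplicates match."""
    return " ".join(raw.split()).upper()


def resolve_addresses(
    references: Iterable[str],
    lookup: Callable[[str], Optional[str]],
) -> Dict[str, Optional[str]]:
    """Call `lookup` once per distinct reference; return reference -> address."""
    resolved: Dict[str, Optional[str]] = {}
    for raw in references:
        ref = normalise_reference(raw)
        if not ref or ref in resolved:
            continue  # skip blanks and references already looked up
        try:
            resolved[ref] = lookup(ref)
        except Exception:
            # Keep going on failure; None flags the row for manual checking.
            resolved[ref] = None
    return resolved


# Stand-in for the browser-driven lookup. These references and addresses
# are made up; a real lookup would query the cadastral website instead.
_FAKE_SITE = {"803 A 0012": "1 rue Exemple", "803 A 0045": "7 place Exemple"}


def fake_lookup(ref: str) -> Optional[str]:
    return _FAKE_SITE.get(ref)


result = resolve_addresses(
    ["803 a 0012", "803 A  0012", "803 A 0045", "999 Z 0001"],
    fake_lookup,
)
```

Passing the lookup in as a function keeps the slow browser step separate from the bookkeeping, so each reference is requested only once and unresolved rows stay visible instead of being silently dropped.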
Most of our communication took place by email and then in a Slack workspace created for the occasion.
What was the hardest part of this project?
Manually entering more than 5,000 administrative formalities was clearly the most challenging part of this investigation. We received the paperwork as printed paper, so we had to scan it and convert it to PDF. Even then, the text recognition did not work properly. We therefore had to enter the data by hand (name of the buyer, name of the seller, date of the deed of sale, cadastral reference, etc.) into our Google Sheets. Each team member was responsible for entering a range of formalities (from 1 to 250, from 250 to 500, etc.). After this lengthy input, we had to check once again that the data entered was free of errors. To do this, each person was responsible for checking the entries made by a colleague.
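The split of data entry and cross-checking described above can be sketched as a small planning helper. This is an illustration under assumptions, not the way we actually assigned the work: the `plan_batches` function, the round-robin rule and the placeholder names "A", "B" and "C" are all hypothetical. Only the total of 5,424 transactions and the batch size of 250 come from the text.

```python
from typing import List, NamedTuple


class Batch(NamedTuple):
    first: int        # number of the first formality in the batch
    last: int         # number of the last formality in the batch
    entered_by: str   # person who types the batch in
    checked_by: str   # different person who verifies it


def plan_batches(total: int, batch_size: int, team: List[str]) -> List[Batch]:
    """Split formalities 1..total into batches; each is checked by a colleague."""
    if len(team) < 2:
        raise ValueError("cross-checking needs at least two people")
    batches = []
    for index, first in enumerate(range(1, total + 1, batch_size)):
        last = min(first + batch_size - 1, total)
        enterer = team[index % len(team)]
        # The next person in the list checks, so nobody checks their own entry.
        checker = team[(index + 1) % len(team)]
        batches.append(Batch(first, last, enterer, checker))
    return batches


plan = plan_batches(total=5424, batch_size=250, team=["A", "B", "C"])
```

With these inputs the plan contains 22 batches, the last one covering formalities 5251 to 5424.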
Once the data was verified, we had to cross-reference it using pivot tables to isolate only the properties belonging to the city, compare purchase and sale prices, identify sellers who had made a profit, etc. Next, we had to trace the chronology of each building to find out whether the town hall was still the owner at the time of our investigation.
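The ownership chronology and the price comparison described above can be sketched in a few lines. This is a simplified illustration, not our actual analysis (which relied on pivot tables and RStudio): the `Deed` record, the helper names, the "Ville de Marseille" label and every deed in the sample are invented for the example.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, NamedTuple, Tuple

CITY = "Ville de Marseille"  # assumed label for the city as buyer or seller


class Deed(NamedTuple):
    parcel: str   # cadastral reference
    signed: date  # date of the deed of sale
    seller: str
    buyer: str
    price: float


def histories(deeds: Iterable[Deed]) -> Dict[str, List[Deed]]:
    """Group deeds by parcel, each list sorted from oldest to newest."""
    grouped: Dict[str, List[Deed]] = defaultdict(list)
    for deed in deeds:
        grouped[deed.parcel].append(deed)
    return {p: sorted(items, key=lambda d: d.signed) for p, items in grouped.items()}


def city_still_owns(history: List[Deed]) -> bool:
    """True when the most recent deed on the parcel names the city as buyer."""
    return bool(history) and history[-1].buyer == CITY


def resale_margins(history: List[Deed]) -> List[Tuple[str, float]]:
    """(owner, sale price minus purchase price) for each buy-then-sell pair."""
    margins = []
    for bought, sold in zip(history, history[1:]):
        if bought.buyer == sold.seller:
            margins.append((sold.seller, sold.price - bought.price))
    return margins


# Invented deeds, for illustration only.
sample = [
    Deed("P2", date(2012, 3, 1), "Buyer Z", "Buyer W", 120_000.0),
    Deed("P1", date(2005, 6, 15), "Owner X", CITY, 100_000.0),
    Deed("P2", date(2004, 1, 10), "Owner Y", CITY, 80_000.0),
    Deed("P2", date(2010, 9, 30), CITY, "Buyer Z", 50_000.0),
]
by_parcel = histories(sample)
owned_now = sorted(p for p, h in by_parcel.items() if city_still_owns(h))
```

In this made-up sample the city still owns parcel P1, while on P2 it sold at a loss to a buyer who later resold at a profit. The sketch assumes every deed for a parcel is present and that no two deeds on the same parcel share a date; gaps in the records would need manual review.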
This work was then complemented by a field survey to see the buildings and interview the neighbours, in order to find out how long a building had been in this state, when the last works were carried out, etc.
What can others learn from this project?
This work is unprecedented in the history of the Marseille press. It is the first collaborative investigation between several local media outlets. It could never have been carried out without the commitment of Nourredine Abouakil, a housing rights activist, who got hold of these administrative formalities and distributed them to several journalists. He then worked to bring these outlets together to use the data. Given the mass of data, we felt it was necessary to work together. No single outlet could have processed the data on its own.
The collapse of two buildings on Rue d’Aubagne on 5 November 2018, in which eight people died, revealed the extent of substandard and unsanitary housing in Marseille. The media outlets we represent, which compete with each other the rest of the year, felt that the subject was too important to pursue separately and that it required this unprecedented alliance.
We have developed a method and a tool that could be replicated in other cities, at least in France, where housing policy related to public ownership is an issue.
PS: Sorry for any English mistakes or awkward or outright incomprehensible sentences.
The original articles are in Links 1 to 6. English translations of the articles are in a Drive folder (Link 7 below).