The Mil Vidas (1,000 Lives) project has existed since 2011. It is published every year, whenever the 1,000th homicide of the year happens in Salvador, and is based on daily reports from the Public Security Office. The project was a DJA finalist in its first edition. In 2018, without announcement, the Office removed all daily reports from its website, keeping only the last month or so. In 2019, we opened the whole database, with data on more than 17,000 homicide victims since 2011. So, thanks only to our nine years of diligent data journalism work, this data is publicly available again.
The project had over 10,000 pageviews in 6 months but, beyond the audience, its major impact was that we were the only ones able to republish data that the Government, overnight and without announcement, decided to remove from its Public Security Office website. And we did it in an even better way: while they were still online, the Government’s daily reports did not provide the data in a structured way. Each report was published on a different webpage. We, on the other hand, make the entire database available in CSV.
Besides, because of our diligent work, the Government was notified by the Prosecutor’s Office for violating the Brazilian FOIA (LAI). That happened because when the Office removed the reports from its website, in early September 2018, we had not yet collected the data for August. It was the only month missing from the entire historical series since 2011, so we decided to wait for the response to our request before opening the complete database.
But our request was rejected at every instance of the Government. After our last appeal was denied, we reported the violation to the Prosecutor’s Office. This process took months, and only after the Prosecutor’s Office notified the Government was the information finally sent. We were thus able to open the data in July 2019 and have kept the CSV updated ever since.
Although it is “only” a CSV, it is the result of nine years of continuous work. Today, because of the Office’s policy of removing this data, the database is available only because of Correio’s work. We have had feedback from professors who are now able to carry out their research. Because of its granularity, it is an extremely important database for any study of violence in Salvador, covering an entire decade of data.
The entire project is based on data published daily by the Public Security Office since 2011, at this link: http://www.ssp.ba.gov.br/modules/consultas_externas/index.php?cod=5.
In the early years of the project, the data was collected manually, and only the first 1,000 homicides of each year were collected, since our stories were always based on those. The project was then left on stand-by until the following year. As the project advanced, we wanted to start making broader analyses of each whole year’s data. Of course, going back manually over many years of daily reports would have been very hard. So a former reporter at Correio, Thiago Freire, set up a scraper and managed to scrape all the earlier data, starting from the first report in 2011, thus completing the previous years.
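To give an idea of what scraping a daily report into structured rows can look like, here is a minimal sketch in Python. It is not the scraper Thiago Freire wrote, and it does not reflect the real layout of the Office’s pages: the sample HTML, the column names, and the `parse_report` helper are all made up for illustration, assuming each report contains a simple table with one row per victim.

```python
from html.parser import HTMLParser


class ReportTableParser(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_report(html, report_date):
    """Turn one report page into a list of dicts, one per table row.

    The first table row is treated as the header (an assumption).
    """
    parser = ReportTableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    keys = [name.lower() for name in header]
    return [dict(zip(keys, row), report_date=report_date) for row in body]


# Hypothetical report page with placeholder values, not real data.
SAMPLE = """
<table>
  <tr><th>Victim</th><th>Neighborhood</th></tr>
  <tr><td>Person A</td><td>District  X</td></tr>
  <tr><td>Person B</td><td>District Y</td></tr>
</table>
"""

records = parse_report(SAMPLE, "2011-01-01")
```

Running a function like this over every archived report page, one date at a time, is what would turn years of separate webpages into a single table.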
From that moment on, we have kept the data completely up to date. We actually still do it manually, because we need to access the reports for our daily coverage of violence in the city. When we access them, we collect the data and update our database for the 1,000 Lives project. Then, monthly, we also update the CSV on the project’s page.
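The monthly update can be pictured as merging newly collected records into the existing ones without duplicating anything, and then writing the CSV again. The sketch below is only an illustration of that idea: our own update is done by hand, and the column names, the `KEY_FIELDS` used to identify a record, and the placeholder rows are assumptions, not the real schema.

```python
import csv
import io

# Hypothetical columns that together identify one record.
KEY_FIELDS = ("report_date", "victim")


def merge_records(existing, new, key_fields=KEY_FIELDS):
    """Return the existing rows plus any new rows whose key is not yet present."""
    seen = {tuple(row[k] for k in key_fields) for row in existing}
    merged = list(existing)
    for row in new:
        key = tuple(row[k] for k in key_fields)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged


def to_csv(rows, fieldnames):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder rows, not real data.
existing = [
    {"report_date": "2019-07-01", "victim": "Person A", "neighborhood": "District X"},
]
new = [
    {"report_date": "2019-07-01", "victim": "Person A", "neighborhood": "District X"},
    {"report_date": "2019-07-02", "victim": "Person B", "neighborhood": "District Y"},
]

merged = merge_records(existing, new)
csv_text = to_csv(merged, ["report_date", "victim", "neighborhood"])
```

Keying on a fixed set of columns means a report that is collected twice does not inflate the count, which matters for a project built around reaching the 1,000th victim.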
What was the hardest part of this project?
I believe the great merit of this project is not necessarily having opened the data, since this is already a widespread good practice, nor the volume or type of data that was opened. Nor is it a project with complex technologies or shiny data visualizations. Being a mid-size newsroom in a poor region of a developing country, we don’t have many resources (to give an idea, we don’t even have any web designers or developers in the newsroom).
Nevertheless, we strongly believe the project is unique, and this uniqueness lies in the fact that we were able to open the data only because, for nine years, we worked constantly to update this database. We, and only we, have kept this database up to date for so many years. And only because of our work is this data still available after the Security Office decided, overnight, to remove it, in an anti-democratic decision that goes against Brazilian legislation. So, although opening data is common in other projects, what is unique in this work is the perseverance of our data journalism, which allowed a quick and efficient response to an illegal decision by the government. We believe this process and its result can be a great reference for data journalists all over the world.
What can others learn from this project?
As already mentioned, the project may have little to contribute in terms of technology or data viz, but we believe it has a lot to contribute, and can be a global reference, on the importance of persevering in building a database, in order to guard against policy changes and hold authorities accountable.
When working with data, we can sometimes take for granted that historical data will always be there when we want to use it. So we don’t keep our own files, because we assume we can always scrape data from the past. However, our project shows that this is not always true, especially in countries or communities where the culture of open data is not yet consolidated.
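One practical way to act on this lesson is to save a raw copy of every page at the moment it is accessed. The following is a minimal sketch under our own assumptions: `archive_snapshot` is a hypothetical helper, the file naming scheme is arbitrary, and the page contents are placeholders. It stores each copy under its report date plus a short content hash, so a page that is later edited is kept alongside the earlier version instead of replacing it.

```python
import hashlib
import tempfile
from pathlib import Path


def archive_snapshot(archive_dir, report_date, content):
    """Store raw bytes as <archive_dir>/<report_date>-<digest>.html.

    Returns (path, created); created is False when an identical copy
    was already archived, so nothing is written twice.
    """
    digest = hashlib.sha256(content).hexdigest()[:12]
    folder = Path(archive_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{report_date}-{digest}.html"
    if path.exists():
        return path, False
    path.write_bytes(content)
    return path, True


# Demonstration in a temporary folder, with placeholder page contents.
with tempfile.TemporaryDirectory() as tmp:
    first = archive_snapshot(tmp, "2018-08-31", b"<html>report</html>")
    again = archive_snapshot(tmp, "2018-08-31", b"<html>report</html>")
    changed = archive_snapshot(tmp, "2018-08-31", b"<html>edited report</html>")
    stored = sorted(p.name for p in Path(tmp).iterdir())
```

With copies like these on disk, the structured database can always be rebuilt or checked, even if the original pages disappear.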
Another great lesson from the project is the importance of persevering not only on a personal level but also on a team level. Note that 1,000 Lives, already a nine-year project (and counting), was created by a specific journalist (Juan Torres), but year after year the project (and especially the culture of keeping the data updated) has been passed on to other journalists (reporters and editors), who keep it running.
Besides, the project also makes a good case for insisting on FOIA requests and using every instance of appeal to obtain public data.