To publish, for the first time, the list of more than 35,000 previously unregistered assets that the church registered in Spain between 1998 and 2015, the Government used an unstructured 3,000-page PDF. In just one week, elDiario.es extracted, harmonized, verified and analyzed all the records to locate and classify each registered property. Visualizations let readers grasp, at a glance, the size and location of these assets. The project also includes a searchable table that opens up to the audience the data the Government had published in a non-reusable format.
elDiario.es was the first, and almost the only, media outlet to publish the data on the church's registrations in Spain in a reusable way, a task that the Government had made very difficult through its choice of publication format. The work was highly valued by our subscribers and other readers, and was even cited by other media.
The project was later used by the newspaper's local editions to explain the details for their respective regions, and even to start their own stories about the municipalities with the most unregistered properties.
We used ABBYY FineReader to convert the PDF file to spreadsheet format. We then used Google Sheets to collaboratively rearrange the data and build a reusable database. During this process we verified every record, one by one.
What was the hardest part of this project?
Transforming the Government's PDF into a complete and reusable database was the most labor-intensive part of the project. The elDiario.es data team spent hours reviewing and reorganizing all the records one by one. For each property, the information on the municipality, the bishopric, the type of property and whether or not it was intended for religious activities had to be validated.
All this information was originally published in a PDF with non-uniform tables, with combined cells, and with details that were sometimes duplicated and sometimes omitted. Thanks to the verification and reordering work, we produced a transparent and open dataset and made it available to the public, who could then run specific searches and calculations.
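As an illustration of the kind of clean-up described above, here is a minimal Python sketch, not the team's actual workflow (which was done by hand in a shared spreadsheet). It assumes the extracted rows are already dictionaries, that combined cells arrive as blanks under the first row of a group, and that the column names (`municipality`, `bishopric`, `property_type`, `religious_use`) are hypothetical. The sample rows are invented.

```python
# Hypothetical column names; the Government's PDF may label them differently.
FIELDS = ("municipality", "bishopric", "property_type", "religious_use")


def fill_merged_cells(rows, carry=("municipality", "bishopric")):
    """Copy the last seen value into cells left blank by a combined cell."""
    last = {}
    filled = []
    for row in rows:
        new_row = dict(row)
        for field in carry:
            value = (new_row.get(field) or "").strip()
            if value:
                last[field] = value          # remember the latest real value
            else:
                value = last.get(field, "")  # reuse it for blank cells
            new_row[field] = value
        filled.append(new_row)
    return filled


def drop_duplicates(rows):
    """Keep the first occurrence of each record, ignoring case and spacing."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(" ".join(str(row.get(f, "")).split()).lower() for f in FIELDS)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Invented sample rows, for illustration only.
sample = [
    {"municipality": "Town A", "bishopric": "Diocese X", "property_type": "Chapel", "religious_use": "yes"},
    {"municipality": "", "bishopric": "", "property_type": "Plot", "religious_use": "no"},
    {"municipality": "", "bishopric": "", "property_type": "plot ", "religious_use": "no"},
]
cleaned = drop_duplicates(fill_merged_cells(sample))
```

A pass like this can only propose candidates: a blank cell may be genuinely missing information rather than part of a combined cell, and two identical rows may be two distinct properties, which is why a record-by-record check remains necessary.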
What can others learn from this project?
Often, the fact that information is public does not mean it is accessible. It is also a journalist's task to publish data in a format that citizens can use to form their opinions and make their decisions.
In this case, although the Government had published the church's registration list, at elDiario.es we decided to dedicate our resources to converting that information into a reusable database and interpreting it. In this way, readers could easily see how the registration of these properties affected their town.
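To show what a reusable database makes possible, the sketch below looks up one town and counts its properties by type. It is an assumption-laden example rather than the code behind the published table: the field names `municipality` and `property_type`, the helper names and the sample records are all hypothetical.

```python
from collections import Counter
import unicodedata


def normalize(text):
    """Lowercase and strip accents so a search for 'ejemplo' matches 'Ejémplo'."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_accents.strip().lower()


def search_by_town(records, town):
    """Return every record whose municipality matches the searched town."""
    wanted = normalize(town)
    return [r for r in records if normalize(r["municipality"]) == wanted]


def count_by_type(records):
    """Count records per property type."""
    return Counter(r["property_type"] for r in records)


# Invented sample records, for illustration only.
records = [
    {"municipality": "San Ejémplo", "property_type": "church"},
    {"municipality": "San Ejémplo", "property_type": "plot"},
    {"municipality": "Otro Pueblo", "property_type": "plot"},
]

in_town = search_by_town(records, "san ejemplo")
by_type = count_by_type(in_town)
```

Neither operation is practical against a 3,000-page PDF, but both become a few lines of code once each property is a structured record.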