2022
Mexico: a thirsty country, where there is plenty of water for the junk drinks industry
Country/area: Mexico
Organisation: Laboratorio de periodismo y opinion publica
Organisation size: Small
Publication date: 31/01/2021

Credit: Kennia Velázquez, Alonso Merino, Arnoldo Cuéllar, Einnar Espinosa, Emilio Jiménez, Juan Jose Plascencia, Nicolas Aranda, Miguel Cabrera
Biography:
The project was carried out by the Laboratory of Journalism and Public Opinion, a media outlet in Guanajuato, Mexico, formed in March 2019. It is made up of 12 people: veteran journalists, young journalists in training, and a visual production and web development team.
Some of its members have collaborated with international media such as the Washington Post and Latin American media such as Connectas and OjoPublico.
In just two years, our work has been recognized as a finalist in the 2020 Gabo and Roche Foundation Health Journalism Award and with the 2021 Inter-American Press Association Award for Journalistic Excellence.
Project description:
In Mexico, the ultra-processed food industry uses 133 billion liters of water. In addition to causing chronic diseases, this industry causes serious damage to the environment. We built a database and a map to locate the water wells, and we found that some companies exceed the limit they are allowed.
Impact reached:
The report was republished by many media outlets. We were interviewed on TV shows and invited to several universities to talk about it. For the first time, the business chambers spoke about the water consumption of these industries. The public was shocked to learn how much water is used to make a bottle of Coca-Cola, and the subject was widely discussed on social networks. Activists used the information to create campaigns promoting the sustainable use of water.
Techniques/technologies used:
Tools for scraping, parsing and processing data:
- Python
- Pandas
- BeautifulSoup4
- NumPy
- Selenium
- MongoEngine
- MongoDB
Tools used on the website:
- Aqueduct: https://www.wri.org/aqueduct via query parameters in an iframe
- Angular
- Mapbox.js
- RxJs
- NgRx
- Kubernetes
First, we had to crawl the whole database contained in https://www.wri.org/aqueduct, as the online explorer at the time wasn’t usable for research work. The website had two components: a registry listing all the concessions that matched a query, and a details page for each concession Id. The first was an ASP.NET website with a secured REST API that we couldn’t use outside the browser, so we wrote a crawler that drove automated Chrome instances via Selenium to interact with the ASP.NET query component and collect the Id of each concession. Because the website crashed and hung quite often, we chose MongoDB as the crawler’s database engine and rewrote the script so it could run on multiple threads and recover from the constant crashes while avoiding duplicated or missing data. Luckily, the details page worked via path query parameters. Once we had collected all the Ids, we could use a much simpler Python script built on requests, BeautifulSoup4 (bs4) and pandas to collect and process the details of each concession.
With MongoDB, we could easily query and extract subsets of the data, which our team of journalists then used for research.
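As an illustration of the kind of subset a reporter might ask for, the sketch below builds a MongoDB aggregation pipeline that totals the permitted volume per permit holder. The field names (`use`, `holder`, `annual_volume_m3`), the collection name and the threshold are all assumptions; the entry does not describe the actual schema or queries.

```python
def top_holders_pipeline(use_type, min_volume_m3, limit=10):
    """Build an aggregation pipeline (a plain list of stage dicts).

    All field names here are hypothetical, chosen only for illustration.
    """
    return [
        # Keep only permits of the requested use type above a volume threshold.
        {"$match": {"use": use_type, "annual_volume_m3": {"$gte": min_volume_m3}}},
        # Add up the volume and count the permits held by each holder.
        {"$group": {
            "_id": "$holder",
            "total_m3": {"$sum": "$annual_volume_m3"},
            "permits": {"$sum": 1},
        }},
        # Largest holders first.
        {"$sort": {"total_m3": -1}},
        {"$limit": limit},
    ]


pipeline = top_holders_pipeline("INDUSTRIAL", 100_000)
# With pymongo, this could be run as: db.concessions.aggregate(pipeline)
# ("concessions" is an assumed collection name).
```

Building the pipeline as plain data keeps it easy to review and reuse for different use types or thresholds before it touches the database.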
The website was built on Angular v11. We exported the final dataset as JSON and used NgRx as the state manager for the data explorer. Using RxJs, we built a reactive dashboard that filters the data and shows each concession's details, its associated brand, its geographical position on a Mapbox.js map, and the water stress level of the region from WRI’s Aqueduct platform.
Currently the website runs containerized on a K3s Kubernetes cluster for high availability.
What was the hardest part of this project?
Companies use different registered business names to request water concessions, so once the database was extracted we had to review the hundreds of thousands of permits one by one to verify each holder's line of business, since not all of them have permits for industry. We resorted to public trade records to identify them, and in some cases we phoned the companies to ask directly which brand they belong to.
The database provided by the National Water Commission is not entirely transparent, so we hope that our database will help citizens locate these companies.
What can others learn from this project?
That official information, even when it is public, has a hidden layer that must be explored, and that stories with impact can be told from it.
Project links:
pozoschatarra.poplab.mx/explorer
docs.google.com/document/d/11HNSCgeNE6lKPBfE8wi7b5ZUQsfYPfIh9GXVqa1yiJo/edit