2021
Covid19CubaData
Country/area: Cuba
Organisation: Postdata.club, Juventud Técnica, School of Math and Computer Science – Havana University
Organisation size: Small
Publication date: 23 Mar 2020

Credit: Yudivián Almeida, Leynier Gutiérrez, Ernesto Guerra, Saimi Reyes, Iramis Alonso, Claudia Alemañy, Carlos Bermudez, Roberto Marti, Hian Cañizares, Frank Sadan, Sandy Pérez, Gabriel López, Carlos C. Caballero, Norlan Capote, Maikel Llamaret, Luis Correa
Project description:
Covid19CubaData is the reference platform in Cuba for data and information on the Covid19 epidemic in the country. It is the only one that structures the data, visualizes it and releases it as open data, all following a collaborative and open model. It was conceived as a platform that, starting from the data and the main website, would evolve over time; it now also includes an Android app, a Telegram bot and several mirror websites. It has been used by media, government, institutions and researchers, as well as by other reference data sources on the epidemic.
Impact reached:
The Covid19CubaData platform is the national and international reference for Covid19 data and information in Cuba. The health authorities and the government of the country have used it regularly for monitoring and decision-making throughout the epidemic. The President of the country, in an article published in the Journal of the Cuban Academy of Sciences, recognized the platform (the only result mentioned that does not come from a governmental or institutional initiative) as one of the main results and tools in the fight against the epidemic in the country. The Minister of Public Health, in his report to Parliament, also mentioned the platform as one of the most useful tools. The United Nations (UN) included the platform in a global compendium of digital initiatives on Covid19. The Pan American Health Organization (PAHO) lists it as a data source for the epidemic in Cuba. The National Office of Statistics and Information of Cuba, the source of the country’s official statistics, also refers to it as a data source, as do the Cuban Academy of Sciences and many other institutions and ministries. It is a source of data and information for media and institutions inside and outside Cuba. Scientists and researchers use the data released by the platform in their forecasts and analyses of Covid, and the platform is cited in multiple articles and scientific publications. The project has also been the subject of several press and television reports, inside and outside Cuba. The Android application was the 5th most downloaded in the country during 2020, counting applications of any kind, and it received an award as the best application in Cuba related to Covid19.
Techniques/technologies used:
The project was conceived as a collaborative platform, at both the software and the data level. To that end, its infrastructure was built on GitHub, which served as the repository for code and data, as the web server (via GitHub Pages) and as the task automation tool (via GitHub Actions). On this base the data was structured: figures taken from the textual reports published daily by the country’s authorities were recorded as JSON documents that are filled in every day. From these documents, different databases, visualizations and tools were created. International data sources were automatically integrated and correlated with the national ones (using GitHub Actions), and data obtained from other national sources was also integrated, analyzed and visualized. With all this, the website, an Android application and a Telegram bot were built. Multiple languages, technologies, formats and paradigms were used: HTML, CSS, JavaScript, jQuery, C3.js, D3.js, Leaflet, DataTables, Flutter, Android, Telegram, Python, Docker, JSON, CSV, Excel, OpenOffice, Bootstrap.
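As a rough illustration of the structuring step described above, the following Python sketch records one day's figures in a JSON document and exports the same data as CSV. It is not the project's actual code or schema: the field names (`date`, `new_cases`, `deaths`, `recovered`), the helper functions and the sample numbers are all assumptions made up for this example.

```python
import csv
import io
import json

# Hypothetical column names; the real project's schema may differ.
FIELDS = ["date", "new_cases", "deaths", "recovered"]


def build_day_record(date, new_cases, deaths, recovered):
    """Return one day's figures as a plain dict (hypothetical schema)."""
    return {
        "date": date,
        "new_cases": new_cases,
        "deaths": deaths,
        "recovered": recovered,
    }


def add_day(dataset, record):
    """Store the record under its date, so each day is filled in once."""
    dataset.setdefault("days", {})[record["date"]] = record
    return dataset


def to_json(dataset):
    """Serialize the whole dataset as a readable JSON document."""
    return json.dumps(dataset, ensure_ascii=False, indent=2, sort_keys=True)


def to_csv(dataset):
    """Export the same data as CSV, one row per day in date order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for date in sorted(dataset.get("days", {})):
        writer.writerow(dataset["days"][date])
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented sample values, not real figures.
    data = add_day({}, build_day_record("2020-03-23", 5, 1, 0))
    data = add_day(data, build_day_record("2020-03-24", 8, 0, 1))
    print(to_json(data))
    print(to_csv(data))
```

Keeping one structured source and deriving every other format from it is one simple way to release the same data in several interoperable forms without re-entering it by hand.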
What was the hardest part of this project?
In Cuba there is no tradition of open data. For the first time in history, data related to an epidemic was being reported daily by the health authorities. However, this data was published as ordinary textual reports, which did not allow it to be used in an automated and interoperable way. That was the first challenge: to structure the data every day so that it could be released in an interoperable form and in different formats. The next was to use this data to build a platform that not only released the data but also offered accessible information on the technical aspects of the epidemic. Then we had to reach the widest possible audience by making the platform accessible and visible across multiple formats (web, mobile and messaging applications). On top of this came the need to react quickly: given the nature of the data and the way it is reported, we had to respond promptly with new developments, new data structures or searches for new information. Finally, all this had to be done under quarantine conditions, with a team that could not work in the same place.
To meet these challenges, the Postdata.club team invited two other partners, who agreed to take part from the beginning and who could contribute to the project both in journalism and data curation and in development: the magazine Juventud Técnica and Havana University (students of the Data Journalism course that Postdata.club teaches there). In addition, since it was published as an open source project, several developers joined as collaborators. Postdata.club coordinated this team through Telegram groups. Thus the project has been carried out remotely, publishing every day and developing new versions, without the members of its team having met in person even once.
What can others learn from this project?
– The importance of open and interoperable data. The data that we structured and released was the basis for the development of multiple projects, research, collaborations, and publications. Projects such as OurWorldInData, from which we collect data to correlate with the country's own, in turn took data from us for their own developments, creating a cycle of mutual contribution.
– The importance of developing an open project (both code and data). In this way, several people contributed to the project, either by reviewing the code, creating mirror sites to reach more people, or taking the code and data for other developments.
– The importance of collaboration. If this project had not been conceived as a collaboration from the beginning, it would have been difficult to achieve the impact and position it has in Cuba and beyond. This was made possible by bringing together different people from different fields in pursuit of a social and communicative objective.
– The development of a community and its feedback. Data is the essence of the project, and it is structured and built in different ways, from automatic to manual. This makes it necessary for the process to be fully transparent and for the data to be verifiable by everyone. In this way, the community contributed not only new ideas but also active help in curating the data and detecting inconsistencies and errors.
– The possibilities of geographically distributed newsrooms and teams with diverse expertise, enabled by different digital tools for communication and development.
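The kind of inconsistency detection mentioned in the point on community feedback can be partly automated. The Python sketch below shows one possible check, under assumed field names (`date`, `new_cases`, `total_cases`) and with invented sample values: it verifies that each day's reported cumulative total equals the running sum of daily new cases. It is an illustration only, not a description of the checks the project actually runs.

```python
def find_inconsistencies(days):
    """Return (date, message) pairs for days whose figures do not add up.

    `days` is a list of dicts with the hypothetical keys 'date',
    'new_cases' and 'total_cases'. The series is assumed to start at zero.
    """
    problems = []
    running_total = 0
    for day in sorted(days, key=lambda d: d["date"]):
        if day["new_cases"] < 0:
            problems.append((day["date"], "negative new_cases"))
        running_total += day["new_cases"]
        if day["total_cases"] != running_total:
            problems.append(
                (
                    day["date"],
                    f"total_cases is {day['total_cases']}, expected {running_total}",
                )
            )
            # Resynchronize so a single error is reported only once.
            running_total = day["total_cases"]
    return problems


if __name__ == "__main__":
    # Invented sample values, not real figures.
    sample = [
        {"date": "2020-03-23", "new_cases": 5, "total_cases": 5},
        {"date": "2020-03-24", "new_cases": 8, "total_cases": 14},
        {"date": "2020-03-25", "new_cases": 2, "total_cases": 16},
    ]
    for date, message in find_inconsistencies(sample):
        print(date, message)
```

A check like this only flags candidates for review; deciding which figure is wrong still requires going back to the original report.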