In early February 2020, we built this interactive website to inform local audiences about the COVID-19 situation in Singapore, based on the daily reports released by the Ministry of Health. As the situation evolved, we updated the data every day until June 2nd. Drawing on the substantial details released about new cases, we created communicative data visualizations for different stages of the epidemic and different scales of data, to help picture the relationships between cases within a cluster, how clusters are inter-related and activity-based, and how cases are distributed across hospitals.
This website is the most popular interactive news and data visualization story in Zaobao’s interactive news section, with a total of 892,278 page views recorded in Google Analytics. Compared with daily breaking news or news archives, the accumulated data allows us to present more comprehensive insights along both the temporal dimension (i.e. the trend) and the spatial dimension (i.e. the cluster tracking on the map). The digital interactive format provides a more engaging and exploratory experience, as the audience can use filters and search to navigate the site according to their own interests. It also served as an additional tool for journalists to study and explore potential news angles.
The case data came from the daily situation reports released by the Ministry of Health as PDFs. We manually recorded the information in multiple spreadsheets. For individual cases, this included the key dates, age, gender, nationality, clinics and hospitals visited, related cases and/or the cluster the case belonged to, and related news reported by our newsroom. We also recorded the identified clusters and the counts of deaths and discharges. In addition, we maintained a list of important measures announced by the government, such as border closures, travel restrictions, and lockdown measures. In the beginning, we also visualized the close contact tracking and the testing situation, until the data was no longer available.
In the later period, part of the data processing was done in RStudio. This included generating attributes that were reused across the spreadsheets and some Chinese translations.
Data visualizations were created using D3. The cluster map was based on OneMap, the local map provider, and Leaflet.js.
What was the hardest part of this project?
Besides the data collection and translation, we had to keep adjusting the content and visualizations to match the audience’s interests, which changed over time as the COVID-19 situation developed unexpectedly. Eventually, we built multiple customized data visualizations to provide an engaging and exploratory experience for the audience.
For example, in the initial stage, daily new cases remained largely in the tens. Rather than the overall trend typically represented by a line chart, local audiences were more interested in the details of each case. We created a journey visualization that shows key dates such as the confirmation date and the quarantine date. The audience can click to expand the journey and read more information, such as the dates of visits to clinics and related news.
As the number of discharged cases increased, the audience became more interested in the length and place of hospital stays. We built a two-sided bar chart that shows, for each case, the number of days between the onset of symptoms and confirmation, and the length of the subsequent hospital stay. The audience can sort by case ID or by number of days, group by hospital, gender, country of origin, or age group, and filter by a particular cluster or group. This visualization is intended to help answer questions such as whether cases in a certain age group are more vulnerable, and whether some hospitals handle cases better than others.
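The data preparation behind such a two-sided bar chart can be sketched as follows. This is a minimal illustration in Python, not the project's actual code (the site itself was built with D3): the `Case` fields, the `bar_lengths` and `prepare` helpers, and the sample records are all hypothetical placeholders, and the hospital stay is assumed to run from the confirmation date to the discharge date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Case:
    # Hypothetical record layout; the real spreadsheets held more fields.
    case_id: int
    hospital: str
    cluster: Optional[str]
    onset: date        # onset of symptoms
    confirmed: date    # date the case was confirmed
    discharged: date   # date the case was discharged


def bar_lengths(case: Case) -> tuple[int, int]:
    """Return (days from onset to confirmation, days from confirmation to discharge).

    The first value would be drawn on one side of the axis, the second on the other.
    """
    return (case.confirmed - case.onset).days, (case.discharged - case.confirmed).days


def prepare(cases, cluster=None, sort_by="case_id", group_by=None):
    """Filter by cluster, sort, and optionally group cases for display."""
    rows = [c for c in cases if cluster is None or c.cluster == cluster]
    if sort_by == "case_id":
        rows.sort(key=lambda c: c.case_id)
    else:  # sort_by == "days": longest total duration first
        rows.sort(key=lambda c: sum(bar_lengths(c)), reverse=True)
    if group_by is None:
        return {"all": rows}
    groups: dict = {}
    for c in rows:
        groups.setdefault(getattr(c, group_by), []).append(c)
    return groups


# Invented placeholder records, not real case data.
cases = [
    Case(1, "Hospital A", "Cluster X", date(2020, 2, 1), date(2020, 2, 4), date(2020, 2, 20)),
    Case(2, "Hospital B", None, date(2020, 2, 3), date(2020, 2, 5), date(2020, 2, 12)),
    Case(3, "Hospital A", "Cluster X", date(2020, 2, 2), date(2020, 2, 9), date(2020, 2, 15)),
]

print(bar_lengths(cases[0]))  # (3, 16)
print([c.case_id for c in prepare(cases, sort_by="days")["all"]])  # [1, 3, 2]
```

Keeping filtering, sorting, and grouping in one preparation step means the chart only has to draw whatever rows it receives, which is one plausible way to support the interactive controls described above.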
However, once the number of new cases crossed 500 per day, the government stopped releasing details of individual cases. We then implemented a new visualization summarizing the median length of stay of the known cases, along with a cluster tracking map that highlights newly discovered clusters and shows the geographical distribution of clusters by plotting them as tiny dots on the local map.
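A median summary of this kind can be sketched in a few lines. The example below is only an illustration under assumptions: the `median_stay` helper, the grouping by hospital, and the numbers are hypothetical, and the source does not state how the medians were grouped or computed.

```python
from statistics import median


def median_stay(stays_by_group: dict) -> dict:
    """Return the median number of days per group, skipping groups with no known cases."""
    return {group: median(days) for group, days in stays_by_group.items() if days}


# Invented placeholder durations in days, not real case data.
sample = {
    "Hospital A": [16, 6, 10],
    "Hospital B": [7, 9],
    "Hospital C": [],  # no cases with known details
}

print(median_stay(sample))  # {'Hospital A': 10, 'Hospital B': 8.0}
```

The median is less sensitive than the mean to a few unusually long stays, which makes it a reasonable summary when only part of the case details is known.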
What can others learn from this project?
The Singapore government was among the few globally that released substantial details of new cases. This gave us an opportunity to explore communicative visualizations for different stages of the epidemic and different scales of data.
This also highlights the need for customized visualizations in journalism and underlines the importance of newsrooms adapting accordingly.
This project exposes both data literacy and visualization literacy issues. In terms of data literacy, the audience might not realize that the question they have can be answered by playing with the filters. In terms of visualization literacy, we found that audience members differ in their capacity for navigating and understanding a visualization. For instance, they might be unable or reluctant to read a visualization that is not a bar chart or line chart, or they might be used to thinking of a visualization as static, as in print, and so fail to hover or click. These issues call not only for design considerations, but also for data visualization education for the journalists who produce the content.