Behind 1,541 asymptomatic cases: a varying definition and a statistical puzzle

Country/area: China

Organisation: Shanghai Observer

Organisation size: Big

Publication date: 2 Apr 2020

Credit: Shuyao Xiao, Jun Cao, Chunjie You

Project description:

In late March, asymptomatic transmissions were increasingly being detected across mainland China. However, the current number of asymptomatic cases was unknown, because these cases were excluded from the confirmed-case count.

In the story, we compared the changes to the definition of asymptomatic cases across six versions of the CDC's Guidance for COVID-19 Prevention, Control and Diagnosis and Management, issued since January 22. We also collected and structured data on the ratio of asymptomatic cases from public information at the provincial level.

The ratios ranged from 1.83% to 37.84%, which indicated that the CDC's ambiguous definition led to contradictions and negligence in how local health departments tracked asymptomatic cases.
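As an illustration only, the comparison of provincial ratios could be sketched in Python as below. The entry does not say how the ratios were calculated, so the denominator (asymptomatic plus confirmed cases) is an assumption, and the region names and counts are hypothetical placeholders, not figures from the story.

```python
from dataclasses import dataclass


@dataclass
class RegionCounts:
    """Case counts for one region (hypothetical structure)."""
    name: str
    asymptomatic: int
    confirmed: int


def asymptomatic_share(region: RegionCounts) -> float:
    """Return asymptomatic cases as a percentage of all positive cases.

    Assumption: the denominator is asymptomatic + confirmed cases.
    A health department using a different denominator would report
    a different ratio from the same counts.
    """
    total = region.asymptomatic + region.confirmed
    if total == 0:
        raise ValueError(f"no cases recorded for {region.name}")
    return 100.0 * region.asymptomatic / total


# Placeholder counts, invented purely to show the calculation.
regions = [
    RegionCounts("Region A", asymptomatic=10, confirmed=90),
    RegionCounts("Region B", asymptomatic=30, confirmed=70),
]

shares = {r.name: round(asymptomatic_share(r), 2) for r in regions}
spread = max(shares.values()) - min(shares.values())

for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
print(f"Spread between highest and lowest: {spread:.2f} percentage points")
```

Keeping the raw counts next to each ratio makes it easier to tell whether a gap between regions reflects the epidemic itself or a difference in how cases were counted.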

Impact reached:

The project was the first data-driven story to raise the challenge of tracking asymptomatic cases. It was published in early April, when most provinces refused to release this data to the public.

We structured data on the ratio of asymptomatic cases in different areas of China and the world, as well as the content of the different versions of the CDC guidance. Once the CDC published the total number of asymptomatic cases, we could check the accuracy of the data and show where the cases were.

Later, data on asymptomatic cases was included in COVID-19 case reports and published daily at both the national and local levels.

Techniques/technologies used:

We collected data from news reports, bulletins, medical journals and other public information. We used Illustrator to visualize how the definition texts varied and how asymptomatic cases were excluded from confirmed cases, and to compare ratios across regions.

What was the hardest part of this project?

In March and April, before the project, there was no structured data on asymptomatic cases. In China, asymptomatic cases were not counted as confirmed cases even when nucleic acid tests were positive, which resulted in a lack of tracking.

There is no big data or fancy interaction in this project, but it points out the contradictions between China's definition of asymptomatic cases and the WHO's, and it reveals that the different statistical methods used by local health departments can lead to huge differences in the data.

The undisclosed information had concealed disease transmission for a long time. This data-driven project shows the importance of data transparency and accuracy in controlling the disease.

What can others learn from this project?

Look for connections between pieces of unstructured data, including texts. Reveal hidden problems through data analysis.

Project links: