2021

5000 Fact-checking Reports Analysis: COVID-19 Disinformation War

Country/area: Taiwan

Organisation: READr

Organisation size: Small

Publication date: 7 Jul 2020

Credit: Yu-Ju Lee, Yi-Chian Chen, Kai-Wen Hsiung, Wen-Yen Chen, Hsin-Chan Chieh, Meg Wu, Yu-Chung Cheng’s team at National Chengchi University

Project description:

READr analyzed more than 5,000 fact-checking reports to reveal the danger of COVID-19 disinformation. Using data visualization, we show how the spread of disinformation trended over the previous five months.

Impact reached:

This is the first complete investigative report in the world to analyze disinformation about COVID-19. We analyzed more than 5,000 fact-checking reports written by fact-checking organizations around the world to identify the characteristics of the disinformation spreading in different countries. We found that, in addition to the virus itself, disinformation caused serious harm to countries. Even Taiwan, whose epidemic prevention appeared very successful, was affected by disinformation. We also checked the influence of these false messages against tens of millions of tweets from Twitter and found that disinformation in the “good news” category was more likely to spread, in contrast to sensational disinformation.

Techniques/technologies used:

For the data analysis, we used web crawlers to collect the fact-checking reports of the IFCN (International Fact-Checking Network) and used R to analyze the data. Manually classifying the disinformation was the most time-consuming part of the project. We used word segmentation and TF-IDF weighting to help reporters find disinformation of the same type. We also used some Google Sheets functions, such as translation, to reduce the time spent on manual classification.
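The idea of using TF-IDF to surface similar claims can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code used in the project. The function names, the smoothed IDF formula, the English-only tokenizer, and the sample claims are all assumptions made up for the example; text in other languages would need a proper word segmenter first.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (English-only assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * smoothed idf."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Number of documents each term appears in
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(docs, index):
    """Rank the other documents by similarity to docs[index], most similar first."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[index], v), i)
        for i, v in enumerate(vectors)
        if i != index
    ]
    return [i for _, i in sorted(scores, reverse=True)]


# Made-up example claims, for illustration only
claims = [
    "Drinking hot water kills the coronavirus",
    "Hot water and garlic kill the coronavirus",
    "5G towers spread the virus",
]
ranking = most_similar(claims, 0)
```

Here `ranking` lists the other claims in order of similarity to the first one, so a reporter could review the closest matches together instead of reading every report in isolation.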

This project was built with Vue.js and D3.js to implement a scroll-telling story with a clustered bubble chart. Scroll-telling is a well-known story format that suits chart-based stories, because the chart stays fixed in the main position of the web page and each section of the story has only one main position. It was also easy to implement in 2020, because we could simply use the “sticky” CSS position value. Moreover, with the d3-force module of D3.js, we could easily connect the project's data to a visualization with clustered bubbles and animations. Together, these features create a better experience for reading a chart-based, scroll-telling story.

What was the hardest part of this project?

The epidemic was still developing, and there was hardly any academic research to draw on (only a few weeks before our report was published, the Reuters Institute for the Study of Journalism released a study of 225 disinformation samples). The newsroom therefore had to take on the responsibility of research itself, which is exactly what data journalism should do.

More than 5,000 fact-checking reports add up to a great deal of content. We used visualization to give users an overview of the topic. Behind it lay a considerable amount of work digesting the information and translating it into visuals, so that users can easily absorb the knowledge by scrolling through the page.

What can others learn from this project?

Even if the topic is new and academic research, or even interviewees, are lacking, as long as there is data you can discover treasures in it.

Project links:

www.readr.tw/project/covid19-disinformation-vis/en