Best news application – year 2020
Winner: HOT DISINFO FROM RUSSIA (Topic radar)
Credit: Nadiia Romanenko, Nadja Kelm, Anatoliy Bondarenko, Yuliia Dukach
Jury’s comment: Disinformation can play an important role in international politics, and more so when there is limited public awareness about the interference. The jury is delighted to find an app developed to address that in Ukraine. The tool tracks the content and intensity of Russian disinformation narratives and manipulative information in online media, and shows an overall dynamic as a result. As the first of its kind for Russian and Ukrainian languages, it allows user engagement in different ways, visually as an interactive dashboard, analytically through weekly posts, and functionally by offering a browser add-on to help individual citizens identify manipulative content. The project shows exactly what a great news app should do, which is to empower users to find their own narrative and make their own judgement within a larger dataset, and it is addressing some of the most critical challenges for journalism today.
Organisation size: Small
Publication date: 7 Aug 2019
Project description: TEXTY developed a data acquisition and analysis platform and a dashboard tool, https://topic-radar.texty.org.ua, which shows the overall dynamics of Russian disinformation topics in manipulative news. We run NLP on thousands of news items per week to detect manipulative ones and group them by topics and meta-topics for display on an interactive dashboard. We also publish weekly reviews (21 so far) based on the results of the analysis. In addition, we developed the “Fakecrunch” add-on (for Chrome and Firefox) on the same platform. It automatically alerts users to manipulative content and can be used to collect suggestions about possible low-quality, fake or manipulative news items.
Impact: The project aims to track the content and intensity of Russian disinformation narratives and manipulative information in online media. It raises awareness among government bodies, civil society organizations, journalists and experts of the major disinformation themes that Russia is pushing in any given week. Just one example: Dmytro Kuleba, Deputy Prime Minister of Ukraine, mentioned this project as an illustration of the huge volume of Russian disinformation flowing into Ukraine. This quantitative approach lets us survey the vast propaganda landscape from the top down, zoom in on details, and track topics across different periods of time. Since May 2019, 21 weekly reviews based on the project have been published. Each review illustrated the key manipulation narratives that our application identified. Average audience engagement for each publication on texty.org.ua was about 8,000 users. Other media outlets, as well as some bloggers and influencers, shared our reviews. We also received positive feedback and mentions of this news application from international experts, for example Andreas Umland (Germany) and Lenka Vichova (Czech Republic). In the words of Maciej Piotrowski, from Instytut Wolności in Warsaw, Poland: “Useful information. Sometimes we share it in our materials in Instytut Wolności, sometimes used for analysis. Longtime tracking is useful to see the full picture.” After many requests for additional features, we decided to develop version 2 of the application. It will be published in April 2020 (approximate date), and we have frozen data updates until the new version arrives.
Techniques/technologies: Data was downloaded from sites’ RSS feeds or from links on their Facebook pages. Preprocessed data about news items was stored in PostgreSQL. Each text was prepared for analysis: tokenized (divided into language units, that is, words and punctuation marks) and lemmatized for topic modeling. Custom Python scripts were used to obtain (Scrapy), process and store the data. Each news item was then evaluated by an improved version of our manipulative news classifier (a ULMFiT-based model for Russian and Ukrainian, created by TEXTY back in 2018 and programmed in PyTorch/fast.ai). This model is available from our GitHub. It estimates the likelihood that a news item contains emotional manipulation and/or false argumentation. The selected manipulative news items, about 3,000 per week on average, were broken down into topics by automatic topic modeling (NMF algorithm). We edited the resulting news clusters manually: we combined similar topics and discarded irrelevant or overly general clusters. Each subtopic in our news application is also illustrated by a sample of headlines from the news items that belong to it, so that new readers know what it is about.
The hardest part of this project: To the best of our knowledge, this is the first such tool and complete pipeline for the Russian and Ukrainian languages. The main challenge was to retrieve accurate topics and track them over time. Topic modelling was done with NMF, an unsupervised clustering method. Its results are less accurate than those of supervised learning, where the model is trained on human labels. But we cannot train a topic classifier, since we do not know all the topics in the news in advance and cannot easily update a supervised model when the news agenda changes. So we have to keep using the unsupervised NMF solution. Topics for the week are reviewed by analysts and also corrected by rules that fix possible errors of unsupervised topic modelling. The large amount of manual work is the hard part of this project. Because we detect topics in weekly samples of news, we have to aggregate them for the dashboard in order to track topics over longer periods. We addressed this challenge with hierarchical NMF, that is, by clustering the weekly clusters. Meta-topics in the dashboard were first produced by clustering and then reviewed by analysts, so that each weekly topic relates to one meta-topic on the dashboard. Aggregation of clusters from different models is not well studied, and a large part of it is done manually.
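One way to picture "clustering the weekly clusters" is a second NMF pass over the weekly topics themselves. The sketch below is an assumption about how such an aggregation could look, not the project's implementation: it uses scikit-learn, the function name assign_meta_topics is hypothetical, and the term weights are invented. Each weekly topic is represented by its term weights, all weeks are aligned to a shared vocabulary, and every weekly topic is assigned to exactly one meta-topic.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction import DictVectorizer
from sklearn.preprocessing import normalize


def assign_meta_topics(weekly_topics, n_meta_topics=2, seed=0):
    """Assign each weekly topic to one meta-topic (hypothetical helper).

    weekly_topics: list of (week_label, {term: weight}) pairs, one per weekly topic.
    Returns a list with one meta-topic index per weekly topic.
    """
    # Align topics from different weekly models to one shared vocabulary.
    vectorizer = DictVectorizer(sparse=False)
    matrix = vectorizer.fit_transform([weights for _, weights in weekly_topics])

    # L2-normalize rows so that weeks with larger raw weights do not dominate.
    matrix = normalize(matrix)

    # Second-level NMF: rows are weekly topics, components are meta-topics.
    model = NMF(n_components=n_meta_topics, init="nndsvd", random_state=seed, max_iter=500)
    membership = model.fit_transform(matrix)

    # Each weekly topic relates to the meta-topic with the largest weight.
    return membership.argmax(axis=1).tolist()


# Invented weekly topics (assumption: real input would come from the weekly NMF models).
weekly_topics = [
    ("week 1", {"gas": 0.9, "transit": 0.5}),
    ("week 1", {"church": 0.8, "council": 0.6}),
    ("week 2", {"gas": 0.7, "price": 0.6}),
    ("week 2", {"church": 0.9, "vote": 0.4}),
]

for (week, _), meta in zip(weekly_topics, assign_meta_topics(weekly_topics)):
    print(week, "-> meta-topic", meta)
```

As the entry notes, automatic assignments of this kind would still need analyst review before they reach the dashboard.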
What can others learn from this project: Long-term tracking of disinformation makes it possible to see which topics matter most to the Russian authorities, who irritates them most, and what they plan to do in Ukraine in the future. One of our analysts' conclusions is that there is an entire array of manipulative news from Russia that can logically be combined under the umbrella name “Failed state” (in relation to Ukraine). The purpose of this campaign is obvious: it aims to create an image of Ukraine as a non-state, an artificial state entity that arose against historical logic. We see the dashboard as a usable tool for further research by analysts, and the Fakecrunch add-on as a usable tool for online readers in their everyday “life”. Other journalists gained a source for their own reporting. The general public gained an evidence-based tool for media literacy and for self-control on social media. Lenka Vichova, Czech Republic: “Many of these messages enter not only the information field of Ukraine, but also to Czech and Slovak media sphere. So it is core to know and be prepared. I use your reviews, when working on my own analytical articles and also in comments for Czech and Slovak media.”