Ideological GPS – Analysis of the political debate on Twitter

Category: Innovation (small and large newsrooms)

Country/area: Brazil

Organisation: Folha de São Paulo

Organisation size: Big

Publication date: 5 Jun 2019

Credit: Daniel Mariani, Fábio Takahashi, Thiago Almeida, Simon Ducroquet

Project description:

One thousand influencer accounts (politicians, media outlets, commentators) on Twitter and 1.7 million users were positioned on the ideological spectrum according to whom they followed or were followed by. This classification was used to detect virtual bubbles, or echo chambers, in which users of a given ideology tend to share content only from people who are ideologically close to them. In a second step, the project also made it possible to analyze how different political sides react to events that are trending on social media.

Impact reached:

In 2019 Ideological GPS had one of the highest audiences among all content produced by Folha de S. Paulo (one of the main media outlets in Brazil), both in terms of page views and retention time.

The article boosted interest in the debate on ideology, as measured by Google Trends. Searches for the word “ideology” reached their annual peak on the platform after publication. To date, the project appears in 4 of the 5 related queries when searching for the term on the platform in Brazil.

Study groups from two of the most prestigious universities in the country (Universidade de São Paulo and Fundação Getúlio Vargas) invited the authors to debates with academic experts on the subject. 

Techniques/technologies used:

To store the data we used a MySQL database. To extract the data we used the rtweet package in the R programming language. Accounts that probably belonged to bots were identified, for later removal, with the tweetbotornot package in R. The ideological position of each account was calculated through a correspondence analysis using the ca package. All other analyses were done in R and the visualizations were made in D3.
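To illustrate the correspondence analysis step, here is a minimal sketch in Python. It is not the project's code, which used the ca package in R. The function name first_ca_axis, the anchor_col parameter and the toy follow matrix are hypothetical, and the sketch computes only the first dimension, by power iteration, assuming every user follows at least one account and every account has at least one follower.

```python
import math

def first_ca_axis(matrix, anchor_col=0, iterations=200):
    """First correspondence-analysis dimension of a users x accounts 0/1 matrix.

    Returns (row_scores, col_scores, inertia), with scores in principal
    coordinates. The sign is arbitrary in CA, so it is fixed here by forcing
    column `anchor_col` (a hypothetical reference account) to be negative.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    total = float(sum(sum(row) for row in matrix))
    r = [sum(row) / total for row in matrix]  # row masses
    c = [sum(matrix[i][j] for i in range(n_rows)) / total
         for j in range(n_cols)]              # column masses

    # Standardized residuals: S = D_r^(-1/2) (P - r c^T) D_c^(-1/2)
    S = [[(matrix[i][j] / total - r[i] * c[j]) / math.sqrt(r[i] * c[j])
          for j in range(n_cols)] for i in range(n_rows)]

    def s_times(v):
        return [sum(S[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]

    def st_times(u):
        return [sum(S[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]

    # Power iteration on S^T S converges to the leading right singular vector.
    v = [float(j + 1) for j in range(n_cols)]
    for _ in range(iterations):
        v = st_times(s_times(v))
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]

    u = s_times(v)
    sigma = math.sqrt(sum(x * x for x in u))  # leading singular value
    u = [x / sigma for x in u]

    if v[anchor_col] > 0:  # orient the axis using the anchor account
        u = [-x for x in u]
        v = [-x for x in v]

    row_scores = [sigma * u[i] / math.sqrt(r[i]) for i in range(n_rows)]
    col_scores = [sigma * v[j] / math.sqrt(c[j]) for j in range(n_cols)]
    return row_scores, col_scores, sigma ** 2

# Made-up data: 5 users (rows) x 4 influencer accounts (columns), 1 = follows.
# Columns 0-1 play the role of one political side, columns 2-3 the other.
follows = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 0],  # follows one account from each side
]
row_scores, col_scores, inertia = first_ca_axis(follows)
```

In this toy matrix the two groups of accounts land on opposite sides of the axis, and the user who follows one account from each side lands in the middle. With real data a full implementation, such as the ca package the project used, would be the more appropriate tool.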

What was the hardest part of this project?

In this project we:

  1. Identified Twitter accounts that belonged to deputies, senators, governors, mayors, presidential candidates, ministers of state, media outlets and influencers;

  2. Created a structure capable of extracting and storing the data of millions of Twitter users who followed those accounts;

  3. Excluded accounts tagged as bots;

  4. Created a method to identify and delete inactive accounts;

  5. Calculated an ideological position for each user from the follower information;

  6. Validated this position using data from other sources (votes in plenary sessions and percentage of votes for president by city);

  7. Extracted posts from these accounts related to recent political events;

  8. Analysed the data to show how it demonstrates the existence of virtual ideological bubbles.

After this first publication we used the data generated to continuously analyse how people with different ideologies react to trending political news.

The data created in this project was also used to investigate which emojis are more or less likely to appear in the profile descriptions of politically left or right users.
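The source does not say how "more or less likely" was measured. One common option is a smoothed log-odds ratio, sketched below in Python as an assumption rather than the project's method. The function name emoji_log_odds, the alpha smoothing constant, the emojis and the profile texts are all made up for illustration.

```python
import math
from collections import Counter

def emoji_log_odds(left_profiles, right_profiles, emojis, alpha=0.5):
    """Smoothed log-odds ratio (right vs. left) that a profile contains each emoji.

    Positive values mean the emoji is more typical of right-side profiles,
    negative values of left-side profiles. `alpha` avoids division by zero.
    """
    def profile_counts(profiles):
        counts = Counter()
        for text in profiles:
            for emoji in emojis:
                if emoji in text:  # count each profile at most once per emoji
                    counts[emoji] += 1
        return counts

    left = profile_counts(left_profiles)
    right = profile_counts(right_profiles)
    n_left, n_right = len(left_profiles), len(right_profiles)

    scores = {}
    for emoji in emojis:
        odds_right = (right[emoji] + alpha) / (n_right - right[emoji] + alpha)
        odds_left = (left[emoji] + alpha) / (n_left - left[emoji] + alpha)
        scores[emoji] = math.log(odds_right / odds_left)
    return scores

# Invented profile descriptions; the emojis carry no real political meaning.
left_profiles = ["teacher 🎈", "🎈 reader 📚", "cyclist"]
right_profiles = ["farmer 🌵", "🌵 runner 📚", "engineer 🌵"]
scores = emoji_log_odds(left_profiles, right_profiles, ["🌵", "🎈", "📚"])
```

In practice the left and right groups would come from the ideological positions calculated earlier, for example by splitting users at a chosen cut-off on the axis.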

What can others learn from this project?

This project addresses a challenge the press routinely faces: categorizing political views. It started from reading the academic article "Tweeting From Left to Right: Is Online Political Communication More Than an Echo Chamber?" by Pablo Barberá and collaborators. Adapting and implementing the methodology proposed in the paper provided the basis for an objective analysis of how politically left and right users use social media.

Thus, from a simple piece of data (who follows whom on Twitter) we can extract the political affinity of a user. This can be used as a source for other insights, such as the formation of virtual bubbles and the analysis of how different ideological groups react to events.

At a time when different political sides are fighting for control of narratives on social media, being able to organize and identify where certain speeches come from, where they gain strength and how they are interpreted by different political sides is of paramount importance.

Project links: