How migration has shaped the World Cup

Entry type: Single project

Country/area: United States

Publishing organisation: Vox

Organisation size: Big

Publication date: 2022-12-08

Language: English

Authors: Stephen Osserman and Youyou Zhou
Story editor: Caroline Houck
Visuals editor: Dion Lee
Copy editing and fact checking: Elizabeth Crane, Kim Eggleston, Tanya Pai, Caitlin PenzeyMoog


Stephen Osserman is a data scientist and visualization engineer with over a decade of experience working with political and civic data. Youyou Zhou is a data journalist who currently works as the senior data editor at [Vox](https://vox.com).

Project description:

The 2022 Qatar World Cup has the highest share of foreign-born players in the event’s history. We collected data on more than 800 foreign players across the history of the World Cup, from 1930 to 2022, to show that migration has shaped the tournament since the very beginning. The story also explains how players have come to represent countries other than their birthplaces over time, and the role of FIFA in the nationalization of the sport.

Impact reached:

The internationalization of soccer is a story that comes up with each World Cup, when the misalignment of allegiance and representation and racist comments toward players often become topics of discussion. While some might think there was an earlier, simpler time when each country had its own pure form of soccer without foreign players, our data analysis found that having foreign players on World Cup teams isn’t a recent phenomenon. In fact, migrant players have shaped national soccer teams throughout the history of the World Cup.

The World Cup isn’t just a sporting event for soccer fans. It attracts a global audience, soccer fans or not. The event creates an opportunity, especially in a year when Morocco made history as the first African team to reach the semifinals, for that global audience to rethink who can represent a country and to see that soccer players are in fact another form of labor migrants. We got an overwhelming number of shares and positive responses through social media and email. One reader wrote to us saying: “Probably the first time I’ve been interested in a World Cup story.” The story was also well received by the researchers studying international migration whose work helped explain the data in the story.

Techniques/technologies used:

We collected the data on foreign-born players at the World Cup between 1930 and 2022 by scraping squad rosters on Wikipedia using R and Python.
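As a rough illustration of the scraping step, the sketch below pulls the rows of an HTML table into a list of records using only Python's standard library. It is not the project's actual scraper: the sample table, its column names, and the function and class names are invented placeholders, and Wikipedia's real roster markup differs.

```python
from html.parser import HTMLParser


class RosterTableParser(HTMLParser):
    """Collect the text of every cell, row by row, from HTML tables."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text inside nested tags (links, spans) still belongs to the cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_roster(html):
    """Return one dict per table row, keyed by the header row's labels."""
    parser = RosterTableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Placeholder markup: made-up players and places, not real roster data.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Birthplace</th></tr>
  <tr><td>Player A</td><td>City X, <a href="#">Country One</a></td></tr>
  <tr><td>Player B</td><td>City Y, Country Two</td></tr>
</table>
"""

roster = parse_roster(SAMPLE_HTML)
```

In practice the HTML would come from a downloaded page rather than an inline string, and a real scraper would need to handle merged cells, footnotes, and layouts that vary between tournaments.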

We then cleaned the data by handling national boundary changes, country name changes, and naming inconsistencies and misspellings on a case-by-case basis as explained [here](https://docs.google.com/document/d/1b8ansQNC9kxLJbk6LNYxFqXYOxCjVN40rz74qgM0CkU/edit).
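A cleaning pass of this kind can be sketched as a lookup from known variants to one canonical name. The alias table below is illustrative only: the two entries are historical names of World Cup participants, but whether the project mapped them this way is an assumption, and the function names are hypothetical. The linked methodology document is the authority on the real case-by-case decisions.

```python
import unicodedata

# Illustrative aliases only; the project's real mapping is in its methodology notes.
ALIASES = {
    "zaire": "DR Congo",
    "dutch east indies": "Indonesia",
}


def normalize_key(name):
    """Build a comparison key: no accents, single spaces, case-insensitive."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.split()).casefold()


def canonical_country(name):
    """Map a raw country string to its canonical name, if an alias is known."""
    tidy = " ".join(name.split())
    return ALIASES.get(normalize_key(name), tidy)
```

For example, `canonical_country("  Zaire ")` returns `"DR Congo"`, while a name with no alias, such as `"Brazil"`, comes back unchanged apart from whitespace cleanup.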

We then cross-referenced the resulting data with a database on foreign-born players at the World Cup (1930-2018) compiled by Gijs van Campenhout, professor of geography at Utrecht University, which enabled us to identify places where national boundary changes or country name changes justified marking a player as “foreign-born” or not “foreign-born,” to the best of our ability. We also interviewed Gijs and other researchers studying the field to validate our approach and data.
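The cross-referencing step amounts to comparing two datasets record by record and listing every disagreement for manual review. The sketch below assumes each dataset has been reduced to a dictionary keyed by a hypothetical `(year, team, player)` tuple with a boolean foreign-born flag. That structure and the sample records are placeholders, not the format of either real database.

```python
def find_discrepancies(scraped, reference):
    """Compare two {(year, team, player): is_foreign_born} mappings.

    Returns three sorted lists of keys: records whose flags disagree,
    records found only in `scraped`, and records found only in `reference`.
    """
    shared = scraped.keys() & reference.keys()
    conflicts = sorted(key for key in shared if scraped[key] != reference[key])
    only_scraped = sorted(scraped.keys() - reference.keys())
    only_reference = sorted(reference.keys() - scraped.keys())
    return conflicts, only_scraped, only_reference


# Placeholder records, invented for illustration.
scraped = {
    (1930, "Team One", "Player A"): True,
    (1930, "Team One", "Player B"): False,
    (2022, "Team Two", "Player C"): True,
}
reference = {
    (1930, "Team One", "Player A"): True,
    (1930, "Team One", "Player B"): True,
}

conflicts, only_scraped, only_reference = find_discrepancies(scraped, reference)
```

Here `conflicts` holds the Player B record, which the two sources flag differently, and `only_scraped` holds the 2022 record, which falls outside the reference data. Each such case would then be checked by hand.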

After compiling the database for the project, we analyzed the data in R, tested various chart forms and interactive elements with canvas and SVG in Observable notebooks with mostly d3, and compiled the final project in a Rails app.
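The headline measure, the share of foreign-born players at each tournament, is a grouped proportion. The authors did this analysis in R; the Python sketch below is only an equivalent illustration. It treats a player as foreign-born when the birth country differs from the team represented, which is a simplification of the project's case-by-case rules, and the field names and sample records are hypothetical.

```python
from collections import defaultdict


def foreign_born_share(players):
    """Return {year: fraction of players whose birth country differs from their team}."""
    totals = defaultdict(int)
    foreign = defaultdict(int)
    for player in players:
        totals[player["year"]] += 1
        if player["birth_country"] != player["team"]:
            foreign[player["year"]] += 1
    return {year: foreign[year] / totals[year] for year in sorted(totals)}


# Placeholder records, invented for illustration.
players = [
    {"year": 1930, "team": "Country One", "birth_country": "Country One"},
    {"year": 1930, "team": "Country One", "birth_country": "Country Two"},
    {"year": 2022, "team": "Country Two", "birth_country": "Country Two"},
    {"year": 2022, "team": "Country Two", "birth_country": "Country Two"},
    {"year": 2022, "team": "Country Two", "birth_country": "Country Two"},
    {"year": 2022, "team": "Country Two", "birth_country": "Country One"},
]

shares = foreign_born_share(players)
```

With these placeholder records, `shares` is `{1930: 0.5, 2022: 0.25}`.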

Context about the project:

This year’s event took place in Qatar, a country with very strict nationalization rules and a bad reputation for its treatment of migrant laborers, yet close to 40 percent of the players on its national team were not born there. That creates an interesting tension and an urgent need to understand how these facts connect. This story tries to connect the dots. The loosening of nationalization rules in reaction to globalization and FIFA’s opposition to the denationalization of the sport are two strong forces that have shaped the event. The 2022 World Cup is a window into that complex and intertwined history.

The most time-consuming part of the project was making sure we could stand by the data. For one, it was scraped from Wikipedia. We were also cautious from the very beginning about the definition of foreign-born, and about whether it was an inclusive metric for talking about migration at the World Cup. To address the first issue, we meticulously cross-referenced our data with an external database on the same topic, compiled by migration scholars using a different methodology. We checked every case where the two sources disagreed to make sure the difference was justifiable. The second issue isn’t solvable, but we interviewed experts on migration and noted the limitations of using “foreign-born” as a metric for migration in the last section of the story.

While many newsrooms wrote stories about the large number of foreign-born players at the Qatar 2022 World Cup, coverage of 2022 alone makes it feel like a recent phenomenon. What we tried to do with the story was to put the 2022 event in the context of the entire history of the tournament and to show people that having foreign players on the World Cup stage is nothing special. Our view on who can represent a country may not live up to the reality.

What can other journalists learn from this project?

There are several things worth mentioning about this project:

1. It is a story published on the topic of the World Cup, but what it really talks about is global migration, identity, and nationality. The international event created an opportunity to talk about these topics, and using a news event this way is a useful technique for journalists focused on a specific coverage area. It also makes this story stand out from other World Cup coverage.
2. The project tried to answer the question: Have there been more foreign players at the World Cup in recent years? Whether the answer is yes or no, what does it say about globalization, migration, and capitalism? The question prompted us to compile the database, and the database showed us the complex history of migration at the World Cup. The mindset of using datasets to answer a question, and of creating a database when none exists, is useful for data stories in all coverage areas.
3. Reporting helps data stories go beyond the datasets. Although we started with the database of foreign-born players, what we learned from interviewing experts during our reporting gave us the story’s direction and insights into the data that we could not have reached on our own.

Project links: