China’s data superpower

Country/area: Japan

Organisation: Nikkei

Organisation size: Big

Publication date: 24 Nov 2020

Credit: Toru Tsunashima

Project description:

This was a series of stories on China’s influence in the digital world. We wanted to understand how China uses its power to censor cross-border data flows and fragment the internet. We analyzed cross-border data flows and found that China has overtaken the U.S. in data volume and now handles twice as much as the U.S. We also used positioning satellite data to investigate China’s influence in space. According to our analysis, BeiDou, the Chinese counterpart to GPS, has broader global coverage than the U.S. original.

Impact reached:

We published the articles in both Japanese and English. Both versions drew high engagement from readers, especially those outside Japan. Our reporting was cited in many fields, including by defense officials, and the Financial Times, which belongs to our group, also published the article on its site.

Techniques/technologies used:

To illustrate the fragmentation of the data world, we compared user activity on the code-sharing site GitHub with activity on its Chinese rival, Gitee. We queried both sites’ APIs with Python and analyzed the results. Fearing they could lose access to GitHub, some programmers are gradually building their networks on Gitee.
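A minimal sketch of what such a comparison could look like follows. The entry does not say which endpoints or metrics were used, so the public per-user events feeds, the `fetch_events` and `events_per_month` helper names, and the choice of monthly event counts as the activity measure are all assumptions made for illustration.

```python
import json
import urllib.request
from collections import Counter

# Assumed endpoints: the public per-user events feed on each site.
GITHUB_EVENTS = "https://api.github.com/users/{user}/events/public"
GITEE_EVENTS = "https://gitee.com/api/v5/users/{user}/events/public"


def fetch_events(url_template, user):
    """Download one page of public events for a user as a list of dicts."""
    request = urllib.request.Request(
        url_template.format(user=user),
        headers={"User-Agent": "activity-comparison-sketch"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def events_per_month(events):
    """Count events by calendar month, keyed as 'YYYY-MM'."""
    # Assumption: each event carries an ISO 8601 "created_at" timestamp,
    # so its first seven characters identify the month.
    return Counter(e["created_at"][:7] for e in events if e.get("created_at"))
```

Calling `events_per_month(fetch_events(GITHUB_EVENTS, user))` and the same with `GITEE_EVENTS` for a given account would give two monthly series that can be compared side by side. Both services limit unauthenticated requests, so a real collection run would need authentication and pagination.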

For the article on China’s satellites, we scraped data from a visualization tool offered by Trimble, a U.S. receiver maker. Trimble’s GNSS (Global Navigation Satellite System) planning tool is very helpful for understanding where satellites orbit. We used Selenium with Python to collect the number of observable satellites over the capital city of each country in the world.
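The collection step might be sketched roughly as below. This is not the code used for the project: the page URL and CSS selector are left as parameters because the entry does not describe the tool's markup, the helper names are hypothetical, and the assumption that satellites are labeled with RINEX-style prefixes (G for GPS, R for GLONASS, E for Galileo, C for BeiDou) would need checking against the actual page.

```python
from collections import Counter

# Assumption: satellite labels look like "G05" or "C21" (RINEX-style prefixes).
PREFIXES = {"G": "GPS", "R": "GLONASS", "E": "Galileo", "C": "BeiDou"}


def count_by_constellation(labels):
    """Tally satellite labels by constellation, ignoring unknown prefixes."""
    counts = Counter()
    for label in labels:
        name = PREFIXES.get(label.strip()[:1].upper())
        if name:
            counts[name] += 1
    return counts


def scrape_satellite_labels(page_url, selector, wait_seconds=20):
    """Open a page in Chrome and return the text of elements matching selector."""
    # Imported here so the counting helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(page_url)
        # Wait until the page's scripts have rendered the satellite list.
        WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, selector))
        )
        elements = driver.find_elements(By.CSS_SELECTOR, selector)
        return [element.text for element in elements]
    finally:
        driver.quit()
```

Running `count_by_constellation(scrape_satellite_labels(url, selector))` once per capital, after setting the tool's location to that city's coordinates, would yield a per-city table of observable satellites for each system.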

What was the hardest part of this project?

Nikkei obtained figures on the amount of data entering and leaving each country from two sources: the International Telecommunication Union (ITU), a United Nations specialized agency, and the U.S.-based data analytics company TeleGeography.

Both datasets have strengths and weaknesses. The ITU data focuses narrowly on the bandwidth actually used for communications, but it does not identify the partner country. TeleGeography’s data breaks communications down by partner country, but because it includes bandwidth that is not in use, it inevitably differs somewhat from the actual state of communications.

After much difficult discussion, we decided to use the ITU figures to compare total cross-border data volumes by country and TeleGeography’s statistics to calculate each country’s relative composition of communications by partner country.
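The division of labor between the two sources can be illustrated with a short sketch. The function names and all numbers are placeholders, not figures from either dataset, and the optional scaling step is an assumption about how the two sources could be combined rather than a description of what was published.

```python
def partner_shares(partner_bandwidth):
    """Turn a per-partner bandwidth breakdown into fractions that sum to 1."""
    total = sum(partner_bandwidth.values())
    if total <= 0:
        raise ValueError("breakdown must contain positive bandwidth")
    return {partner: value / total for partner, value in partner_bandwidth.items()}


def scale_to_total(used_bandwidth, shares):
    """Apportion a country's measured total across partners by share."""
    return {partner: used_bandwidth * share for partner, share in shares.items()}


# Placeholder numbers only: a TeleGeography-style breakdown for one country
# and an ITU-style total of bandwidth actually in use.
breakdown = {"Partner A": 30, "Partner B": 10}
shares = partner_shares(breakdown)
estimate = scale_to_total(200, shares)
```

Using shares rather than raw values keeps the unused capacity in the breakdown from inflating the totals, which is why the totals come from one source and the composition from the other.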

What can others learn from this project?

It is difficult to grasp China’s influence in the digital world from ordinary data. But China cannot block data outside its borders. By using open-source tools, we were able to reveal China’s secret.

Project links: