Many in Hong Kong have picked up on a new vehemence in the government’s language since the protests of 2019. Compared with an earlier, more staid tone, it became peppered with phrases emblematic of the Chinese Communist Party’s approach to information, governing, and national security. But unlike the number of arrests under the national security law imposed in June, or convictions related to the protests, it would seem impossible to quantify what some have perceived as a linguistic cooptation accompanying Beijing’s political crackdown on the city’s demand for true political representation. While other journalists had picked up on Quartz reporter
The piece was devoured by Quartz readers around the world, as it offered a unique example of how life in Hong Kong has been changed by the Chinese government’s crackdown in the city. It validated readers’ concerns and suspicions in a new and compelling way.
To collect all of the press releases from the Hong Kong government, we first used the command-line tool wget to recursively crawl and download the press releases listed on the website. We then used R to extract the text of each release from the downloaded HTML files, separate the releases by language, and filter out duplicates. We continued in R, using the tidytext library to determine the most common and fastest-rising single-word, two-word, and three-word phrases. Finally, we used Datawrapper to create charts of our findings.
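For readers who want to try a similar analysis, the phrase-counting step can be sketched roughly as follows. This is not the code used for the piece, which was written in R with tidytext; it is a minimal Python stand-in using only the standard library. The function names, the tiny sample data, the 2019 split year, and the definition of "fastest rising" (the change in a phrase's share of all phrases between an earlier and a later period) are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a space-joined phrase."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def dedupe(releases):
    """Drop releases whose whitespace-normalized text was already seen."""
    seen = set()
    unique = []
    for year, text in releases:
        key = " ".join(text.split())
        if key not in seen:
            seen.add(key)
            unique.append((year, text))
    return unique


def fastest_rising(releases, n, split_year, top=5):
    """Rank n-word phrases by the increase in their share of all phrases.

    Assumption: "rising" means share in releases from split_year onward
    minus share in releases before split_year.
    """
    before, after = Counter(), Counter()
    for year, text in dedupe(releases):
        target = after if year >= split_year else before
        target.update(ngrams(tokenize(text), n))
    total_before = sum(before.values()) or 1  # avoid division by zero
    total_after = sum(after.values()) or 1
    change = {
        phrase: after[phrase] / total_after - before[phrase] / total_before
        for phrase in set(before) | set(after)
    }
    return sorted(change.items(), key=lambda item: item[1], reverse=True)[:top]


# Invented sample data (year, text); not taken from real press releases.
sample_releases = [
    (2015, "The government will improve public housing supply."),
    (2015, "The government will improve public housing supply."),  # duplicate
    (2016, "The government will review public transport fares."),
    (2020, "The government will safeguard national security."),
    (2020, "National security is the government priority."),
]

for phrase, delta in fastest_rising(sample_releases, n=2, split_year=2019):
    print(f"{phrase}: {delta:+.3f}")
```

Comparing shares rather than raw counts keeps a period with more releases from dominating the ranking. Calling the same function with n=1 or n=3 would cover the single-word and three-word cases.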
What was the hardest part of this project?
With assistance from Quartz Things editor David Yanofsky, Mary and Dan compiled and analyzed 165,000 statements issued by the Hong Kong government and its departments in the past decade, to see exactly how often certain coded phrases favored by the Communist Party were showing up. The results took several weeks to discern, as they tabulated instances of everything from a’s and the’s to two- and three-word formulations. There was an abundance of data, and within it was the trend we were looking for.
What can others learn from this project?
As Yuliya Komska and her co-authors note in Linguistic Disobedience, which is cited in the piece, “True, language critique did not forestall Nazism or authoritarian Communism”—but one way to resist official efforts to shape a people’s thinking is to study those words closely. When words matter, it’s vital to be able to show convincingly how they’re being used, by whom and against whom, and for what purpose. We know journalists around the world value this sentiment; we are proud to show them an example of how they can report out their findings to the public.