As President Tsai Ing-wen’s second four-year term got under way on May 20, 2020, we produced a visual presentation of the top issues on her agenda. Alongside her priorities, we used charts and graphs to show those of her predecessors, based on an analysis of all 15 presidential inaugural speeches since 1948 as well as the press releases issued during Tsai’s first four years in office, from 2016 to 2020.
Through a series of visualizations and charts, we presented the keywords of all 15 presidential inaugural speeches from 1948 through 2020 and the main issues at the top of President Tsai’s agenda during her first four-year term.
We did not seek to interpret the meanings of individual keywords. Instead, we wanted readers to see for themselves whether word choices differed, so we presented both the frequency of each term and the connections among those terms. We also marked off the years when Taiwan was under martial law, from 1949 to 1987, from the years after it was lifted, and distinguished the periods when the president was popularly elected from those when the president was not. The project prompted much discussion after it was posted online, with readers offering their own interpretations and viewpoints. Some political scholars said that analyses like this were highly valuable and could be of great interest to think tanks based overseas.
We used CkipTagger, an open-source Chinese language processor developed by Academia Sinica, Taiwan’s top research institution, for word segmentation. Once the words had been segmented and sorted by part of speech, we selected nouns (with related postpositions and quantifiers removed), verbs and adjectives for statistical analysis and subsequent visualization.
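The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: it assumes the text has already been segmented and tagged, and the sample tokens are hand-tagged for illustration rather than real tagger output. The tag names follow the CKIP convention (Na, Nb, Nc for nouns, Nf for quantifiers, Ng for postpositions, V-prefixed tags for verbs, A for adjectives), but the exact set of tags kept here, and the helper names `is_content_word` and `keyword_frequencies`, are assumptions.

```python
from collections import Counter

# CKIP-style part-of-speech tags. Which tags to keep is an assumption
# made for this sketch, not the project's documented configuration.
NOUN_TAGS = {"Na", "Nb", "Nc"}   # common, proper and place nouns
DROPPED_TAGS = {"Nf", "Ng"}      # quantifiers (measure words) and postpositions
ADJECTIVE_TAGS = {"A"}


def is_content_word(pos):
    """Return True for nouns, verbs and adjectives; False for everything else."""
    if pos in DROPPED_TAGS:
        return False
    return pos in NOUN_TAGS or pos in ADJECTIVE_TAGS or pos.startswith("V")


def keyword_frequencies(tagged_tokens):
    """Count how often each kept word appears in a list of (word, pos) pairs."""
    return Counter(word for word, pos in tagged_tokens if is_content_word(pos))


# Hand-tagged toy sample (hypothetical; not real tagger output).
sample = [
    ("民主", "Na"), ("的", "DE"), ("台灣", "Nc"), ("堅持", "VJ"), ("民主", "Na"),
    ("三", "Neu"), ("個", "Nf"), ("改革", "Na"), ("之後", "Ng"),
]

freqs = keyword_frequencies(sample)
print(freqs.most_common(3))
```

In this toy sample the particle, the numeral, the quantifier and the postposition are discarded, and only the remaining nouns and the verb are counted.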
Keywords related to government policies were computed using word2vec, an open-source machine learning tool from Google, which projected the terms into a vector space and calculated their distances from each other. The closer two keywords are in that space, the more closely they are connected. These distances were combined with the word frequency data, and an editor then compiled the final list of keywords, complete with each word's usage frequency and its proximity to the others.
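To make the idea of proximity in a vector space concrete, here is a minimal sketch that ranks keywords by cosine similarity, a measure commonly used with word2vec vectors. It does not train a word2vec model: the three-dimensional vectors, the English keywords and the helper `nearest_keywords` are all invented for illustration, and the choice of cosine similarity is an assumption about how distance was measured.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values near 1 mean closely related."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_keywords(target, vectors, top_n=2):
    """Rank every other keyword by its similarity to the target keyword."""
    scores = [
        (word, cosine_similarity(vectors[target], vec))
        for word, vec in vectors.items()
        if word != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Invented toy vectors; a trained model would typically use far more dimensions.
toy_vectors = {
    "economy": [0.9, 0.1, 0.2],
    "trade": [0.8, 0.2, 0.1],
    "defense": [0.1, 0.9, 0.3],
    "security": [0.2, 0.8, 0.4],
}

for word, score in nearest_keywords("economy", toy_vectors):
    print(word, round(score, 3))
```

With these toy vectors, "trade" ranks as the nearest neighbor of "economy" and "security" as the nearest neighbor of "defense", which is the kind of relationship the visualization was meant to surface.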
What was the hardest part of this project?
To our knowledge, no other Taiwanese media outlet had explored written texts with this method before. Our project marked the first attempt by a Taiwanese media organization not only to consider word frequency but also to select by part of speech, in order to systematically digitize a huge number of texts and present the results visually in a way that is easy to comprehend.
What can others learn from this project?
For this project, artificial intelligence rather than experts was used to analyze the texts. With the help of such analyses, a reporter could write a more comprehensive article by identifying the differences among the speeches of successive presidents. Likewise, a thorough exploration of the press releases issued on behalf of President Tsai over the previous four years could help a reporter compare her second-term policies with what she said before.