We built an election forecasting model to assess the uncertainty in political polling and project the outcome of Germany’s parliamentary election, and followed the election through to its ultimate result.
The model helped us frame the election and put its swings and surges in context. Our readers could follow not only the polls but also how they were likely to change as the election neared, and get a preview of the inevitable coalition discussions that followed the vote.
Germany’s multi-party system is complex, and the model was able to cut through some of its complexity. It could reveal, for example, which parties were likely to gain or lose at others’ expense. In this way, it helped inform the rest of our coverage, giving our reporters a real-time sense of how the dynamics of the election changed as the campaign progressed.
The election model is a dynamic Bayesian polling model that combines economic and political “fundamentals” with polling data to assess the strength of each party and predict election results. It was trained on elections from 1953 to 2017 and written in R and Stan. We ran the model every day to incorporate new polling data as they came out.
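To illustrate the general idea of blending a fundamentals-based expectation with polling data, here is a minimal sketch in Python. It is not the model itself, which was written in R and Stan and is far richer. It assumes both sources can be summarised as normal distributions for a single party's vote share, and it weights each by its precision. The names `Estimate` and `combine` and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A normal summary of one party's vote share, in percentage points."""
    mean: float
    sd: float


def combine(prior: Estimate, polls: Estimate) -> Estimate:
    """Precision-weighted (normal-normal) update of a prior with a poll average.

    Assumption for illustration only: both inputs are independent and normal.
    """
    prior_precision = 1.0 / prior.sd ** 2
    poll_precision = 1.0 / polls.sd ** 2
    total_precision = prior_precision + poll_precision
    mean = (prior.mean * prior_precision + polls.mean * poll_precision) / total_precision
    return Estimate(mean=mean, sd=total_precision ** -0.5)


# Hypothetical inputs: a vague fundamentals prior and a tighter poll average.
fundamentals = Estimate(mean=30.0, sd=4.0)
poll_average = Estimate(mean=25.0, sd=2.0)
posterior = combine(fundamentals, poll_average)
print(f"{posterior.mean:.1f} +/- {posterior.sd:.2f}")
```

Because the poll average here is more precise than the prior, the combined estimate sits closer to the polls. In a dynamic model the balance would shift as election day approaches and polls become more informative.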
The model used past election results to form a Bayesian “prior” of how the election might turn out. These were not only point predictions: The Economist’s data team tries to pay extra attention to uncertainty, so we allowed the model to have fat tails and explore many more outcomes than traditional handicapping allows. The model also used techniques from machine learning and optimisation to calibrate our polling averages so that they neither over- nor under-fit the historical polling data; we did not want to give a false impression of movement just to drive clicks to the site.
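As a rough illustration of what “fat tails” means in practice, the sketch below compares simulated polling errors drawn from a normal distribution with errors drawn from a Student-t distribution. The choice of Student-t, the four degrees of freedom and the two-point error scale are assumptions made for this example; the text above does not say which distribution the model used.

```python
import math
import random


def draw_student_t(rng: random.Random, scale: float, df: float) -> float:
    """Draw a scaled Student-t variate as normal / sqrt(chi-squared / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-squared with df degrees of freedom
    return scale * z / math.sqrt(chi2 / df)


def tail_share(draws: list[float], threshold: float) -> float:
    """Share of simulated errors larger in absolute size than the threshold."""
    return sum(abs(d) > threshold for d in draws) / len(draws)


rng = random.Random(2021)  # fixed seed so the comparison is repeatable
n_draws = 20_000
scale = 2.0  # hypothetical typical polling error, in percentage points

normal_errors = [rng.gauss(0.0, scale) for _ in range(n_draws)]
fat_errors = [draw_student_t(rng, scale, df=4.0) for _ in range(n_draws)]

# How often does each distribution produce an error beyond three scale units?
normal_tail = tail_share(normal_errors, 3 * scale)
fat_tail = tail_share(fat_errors, 3 * scale)
print(f"normal: {normal_tail:.4f}  fat-tailed: {fat_tail:.4f}")
```

The fat-tailed simulation produces large misses many times more often than the normal one, which is the sense in which such a model explores more extreme outcomes.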
We used D3 and Svelte to generate the charts, and used a workflow with Google Docs and ArchieML to keep the text of the interactives easy for our journalists to edit.
We introduced new chart types (new for The Economist, at least) for these data, including our “fuzzy bars”, which serve the function of a histogram in a fraction of the space. We expect to reuse them in other forecasting models.
In a first for The Economist, we published the forecast page in both English and German.
What was the hardest part of this project?
From a design perspective, correctly communicating the uncertainty in the model was key. Historically, readers have often read too much into prediction models, and we took pains to counter that. The fuzzy bars mentioned above let us show a range of probabilities without putting too much emphasis on a central prediction that readers might take as an emphatic projection. We also refrained from giving exact probabilities for various outcomes, instead giving vaguer, but still accurate, estimates like “about a one in three chance”, which were less likely to be over-interpreted.
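One way to produce that kind of deliberately rounded wording is to snap a probability to the nearest of a small set of familiar fractions. The sketch below is an assumption about how this could be done, not a description of the code behind the forecast page. The candidate fractions and the function name `describe_chance` are hypothetical.

```python
# Hand-picked fractions that read naturally in a sentence (an assumption).
CANDIDATES = [
    (1, 20), (1, 10), (1, 5), (1, 4), (1, 3), (2, 5), (1, 2),
    (3, 5), (2, 3), (3, 4), (4, 5), (9, 10), (19, 20),
]

WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
    6: "six", 7: "seven", 8: "eight", 9: "nine", 10: "ten",
}


def number_word(n: int) -> str:
    """Spell out numbers up to ten and use figures above that."""
    return WORDS.get(n, str(n))


def describe_chance(probability: float) -> str:
    """Return a rounded phrase for the candidate fraction nearest the probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    numerator, denominator = min(
        CANDIDATES, key=lambda f: abs(probability - f[0] / f[1])
    )
    return f"about a {number_word(numerator)} in {number_word(denominator)} chance"


print(describe_chance(0.34))
print(describe_chance(0.97))
```

Restricting the output to a short list of fractions trades precision for wording that readers are less likely to over-interpret.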
What can others learn from this project?
Predictive models can add to and deepen readers’ and journalists’ understanding of an election, so long as care is taken to ensure they don’t give the impression of overconfidence. Embedding models and data analysis into reporting teams produces better coverage of data-heavy topics like elections, as well as better data-led interactives and graphics.