The Economist’s Democratic Primaries project aggregates national and state-level polling of America’s Democratic Primaries from a number of firms. Additionally, it uses microdata from the weekly Economist/YouGov poll to explore the demographics of each candidate’s support in detail. It breaks down candidates’ support by race, age and education, and also looks at which candidates have scope to expand their support beyond their core supporters.
The project conveniently aggregates a lot of information about the state of the race, and it has helped to shape our coverage of the race for the Democratic nomination. Because we’re able to work with the microdata of a weekly tracking poll, we’ve been able to take a relatively nuanced look at the nature of Democratic candidates’ support, and how and where it has ebbed and flowed over time.
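The actual Economist/YouGov microdata and its schema are not public, so as a rough illustration of the kind of demographic breakdown described above, here is a sketch of computing each candidate's weighted share of support within a demographic group. All column names (`candidate`, `age_group`, `weight`) and the respondent rows are hypothetical stand-ins, not the real data.

```python
# Illustrative sketch only: the Economist/YouGov microdata and its column
# names are not public; "candidate", "age_group" and "weight" here are
# hypothetical stand-ins for respondent-level poll data.
import pandas as pd

def support_by_group(df, group_col, weight_col="weight"):
    """Weighted share of each candidate's support within each demographic group."""
    totals = df.groupby(group_col)[weight_col].sum()
    shares = (
        df.groupby([group_col, "candidate"])[weight_col].sum()
        .div(totals, level=group_col)  # normalise within each group
    )
    return shares.unstack(fill_value=0.0)

# Toy respondent-level rows standing in for poll microdata
microdata = pd.DataFrame({
    "candidate": ["Biden", "Sanders", "Biden", "Warren", "Sanders", "Biden"],
    "age_group": ["65+", "18-29", "65+", "30-44", "18-29", "45-64"],
    "weight":    [1.2, 0.8, 1.0, 1.1, 0.9, 1.0],
})

shares = support_by_group(microdata, "age_group")
```

With survey weights applied per respondent, the same pattern extends to race, education, or any other crosstab in the data.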
It has proved popular with readers and pundits alike. It has been viewed half a million times since its launch, and has been cited by reporters from other media outlets, such as FiveThirtyEight, the Huffington Post, and the Washington Post.
We aggregate polls at both the state and national level to build a polling average. We use an in-house method that combines Bayesian statistics, hierarchical Dirichlet regression and splines (akin to a kernel smoother or LOESS) to draw trend lines through the data over time. Our custom method allows us to control for firm-level biases in ways that other polling aggregators cannot, as well as employing a hierarchical structure in the underlying regression that interpolates state-level aggregates based on movement in the national data. We’ve also used betting markets to give a sense of the likelihood of any given candidate winning the nomination.
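The full model is fitted in Stan and is not reproduced here; as a much-simplified sketch of the smoothing component, the snippet below runs a Gaussian kernel smoother over dated poll results, after crudely demeaning per-firm "house effects". This is not The Economist's method (no Bayesian inference, no Dirichlet regression, no hierarchy), and the poll numbers are toy data.

```python
# Minimal sketch of the trend-line idea: a Gaussian kernel smoother over
# dated poll results, with a crude firm ("house") effect subtracted first.
# NOT the Economist's actual model, which is a hierarchical Bayesian
# Dirichlet regression fitted in Stan; toy data throughout.
import numpy as np

def kernel_smooth(days, values, grid, bandwidth=7.0):
    """Weighted average of poll values, weights decaying with distance in days."""
    out = np.empty_like(grid, dtype=float)
    for i, t in enumerate(grid):
        w = np.exp(-0.5 * ((days - t) / bandwidth) ** 2)
        out[i] = np.sum(w * values) / np.sum(w)
    return out

def demean_house_effects(firms, values):
    """Subtract each firm's average deviation from the overall mean."""
    overall = values.mean()
    adjusted = values.copy()
    for firm in set(firms):
        mask = firms == firm
        adjusted[mask] -= values[mask].mean() - overall
    return adjusted

# Toy polls: day of fieldwork, polling firm, a candidate's share (%)
days = np.array([0, 3, 5, 10, 12, 15, 20], dtype=float)
firms = np.array(["A", "B", "A", "B", "A", "B", "A"])
share = np.array([26.0, 30.0, 27.0, 31.0, 28.0, 32.0, 29.0])

adjusted = demean_house_effects(firms, share)
trend = kernel_smooth(days, adjusted, grid=np.linspace(0, 20, 5))
```

In the toy data, firm B consistently reads about four points higher than firm A; demeaning removes that gap before smoothing, which is the basic intuition behind controlling for firm-level bias.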
We’ve used R to aggregate the data, Stan to model it and ggplot to analyze it and produce sketches of our charts. We then created a bespoke website for the project using Next.js and D3, backed by a Google Doc and ArchieML to allow easy editing of the website’s text content.
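ArchieML lets editors write structured content as plain "key: value" lines in a Google Doc, which the site then parses into data. As a toy illustration of that flat subset only (real ArchieML also supports arrays, objects and multi-line values, and should be parsed with the official libraries), a minimal parser might look like this; the document text is invented for the example.

```python
# Toy parser for the flat "key: value" subset of ArchieML. Real ArchieML
# (arrays, objects, multi-line values) needs the official parser libraries;
# the sample document below is invented for illustration.
def parse_flat_archieml(text):
    data = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            if key and " " not in key:  # ArchieML keys contain no spaces
                data[key] = value.strip()
    return data

doc = """headline: The Democratic primaries in charts
updated: 2020-02-12
Ordinary prose in the doc is ignored by this toy parser.
"""
content = parse_flat_archieml(doc)
```

The appeal of the approach is that editors change the headline or body text in a familiar Google Doc, and the site rebuilds without a developer touching the code.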
We’ve used a number of techniques to convey information, to delineate different types of data clearly, and to show uncertainty where appropriate. Sometimes, as with the principal trend chart, we show uncertainty only on interaction. In other charts, such as the “beeswarm” charts on the candidate page, we omitted uncertainty so that we could show the whole field of candidates and position each one within the swarm.
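The charts themselves were built with D3, where beeswarm layouts are typically produced with a force simulation; as a language-agnostic sketch of the underlying idea, the greedy algorithm below keeps each point at its true value on one axis and nudges it along the other axis to the first offset where it no longer overlaps an already-placed point. This is an illustration of the technique, not the site's actual layout code.

```python
# Greedy beeswarm placement sketch: each point keeps its value as x and is
# nudged along y to the nearest non-overlapping offset. Production charts
# (e.g. D3's force simulation) do this more smoothly; this just shows the idea.
def beeswarm(values, radius=1.0):
    placed = []  # (x, y) positions already laid down
    for x in sorted(values):
        offset = 0
        while True:
            signs = (1, -1) if offset else (1,)
            for sign in signs:
                y = sign * offset * radius
                # Accept this spot if no placed point is within one diameter
                if all((x - px) ** 2 + (y - py) ** 2 >= (2 * radius) ** 2
                       for px, py in placed):
                    break
            else:
                offset += 1  # both sides blocked: move one step further out
                continue
            placed.append((x, y))
            break
    return placed

# Hypothetical polling shares for five candidates
positions = beeswarm([24.0, 24.5, 24.6, 24.6, 30.0])
```

Because every dot gets its own non-overlapping position, the chart can show the entire field of candidates at once, which is why uncertainty bands were dropped from those views.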
What was the hardest part of this project?
Setting up the site itself was probably the hardest part. In the past, we would have structured a project like this as an article and updated it manually. However, since we had so much data to present and explore in this case, we instead built a separate “microsite”, running on AWS and Google Docs, and automated as much of the process of running and updating it as we could. We turned this work into a template for creating complex, data-driven microsites, which we used again a few months later for our coverage of the UK election in December.
What can others learn from this project?
We were one of the first news sites to launch a Democratic Primaries polling page, so we were able to inform readers and other journalists about the state of the race from August 2019. We also have access to microdata from YouGov that are not publicly available, and have presented them in an easily digestible way, allowing readers to quickly find an angle on the story.