As the coronavirus swept across the world, it mutated into several major groups, or strains, as it adapted to its human hosts. Mapping and understanding those changes to the virus is crucial to developing strategies to combat the COVID-19 disease it causes.
Reuters analysed more than 185,000 genome samples from the world's largest database of novel coronavirus genome sequences to show how the global dominance of major strains shifted over time.
Although it is not uncommon for a virus to mutate, few had described how and when this virus changed. Reuters Graphics delved into the laboratory samples to show how later strains differed from the one that first emerged in Wuhan. We received a lot of attention on social media because of the granularity of the graphics we published and the way we shaped them into a narrative explaining the virus's evolution.
We used Matplotlib and D3 to visualise the datasets.
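As a rough illustration of the kind of calculation behind a chart of shifting strain dominance, the sketch below counts samples per strain in each time period and converts the counts to shares, then draws a stacked area chart with Matplotlib. It is not the code Reuters used: the function names, the `(period, strain)` record layout, the strain labels "A" and "B", and the toy records are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def strain_shares(samples):
    """Return {period: {strain: share}} from (period, strain) records.

    The record layout is a hypothetical simplification of a genome
    database export, not the real schema.
    """
    counts = defaultdict(Counter)
    for period, strain in samples:
        counts[period][strain] += 1

    shares = {}
    for period in sorted(counts):  # chronological for ISO-style labels
        total = sum(counts[period].values())
        shares[period] = {s: n / total for s, n in counts[period].items()}
    return shares


def plot_shares(shares, path="strain_shares.png"):
    """Draw a stacked area chart of strain shares and save it to `path`."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    periods = list(shares)
    strains = sorted({s for by_strain in shares.values() for s in by_strain})
    # One row per strain, one column per period; missing strains count as 0.
    series = [[shares[p].get(s, 0.0) for p in periods] for s in strains]

    fig, ax = plt.subplots()
    ax.stackplot(range(len(periods)), series, labels=strains)
    ax.set_xticks(range(len(periods)))
    ax.set_xticklabels(periods)
    ax.set_ylabel("Share of sequenced samples")
    ax.legend(loc="upper left")
    fig.savefig(path)
    plt.close(fig)


# Made-up toy records, purely for illustration.
samples = [
    ("2020-01", "A"), ("2020-01", "A"), ("2020-01", "B"),
    ("2020-02", "A"), ("2020-02", "B"), ("2020-02", "B"), ("2020-02", "B"),
]
shares = strain_shares(samples)
```

Calling `plot_shares(shares)` would write the chart to a PNG file. Working with shares rather than raw counts keeps periods with very different sampling volumes comparable.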
The visualisation demonstrated how complicated these mutations were, showing in granular detail how the virus evolved and where each mutation originated.
What was the hardest part of this project?
The full genome samples contained a myriad of details that we had to decipher and understand. By reading and sifting through the data, we developed a working knowledge of how these strains mutated.
What can others learn from this project?
The dataset was enormously complicated, and given how monumental the task was, it would have been easy to give up. But members of our team persisted with their analysis and were able to report how the strains had evolved. It was emblematic of good journalistic rigour in a digital form.