Reuters analyzed how the ‘world’s largest vaccination campaign’ failed young Indians – leaving some 600 million people aged between 18 and 45 years scrambling to find COVID-19 shots – and delivered an in-depth, data-rich but visually accessible piece that shed light on an important but under-examined facet of the ongoing pandemic.
Reuters’ analysis was among the most extensive available at the time of the failings in the vaccination rollout to young adults in India. It catalogued a host of failures to plan and to react quickly that once again left the country’s poorest the most disadvantaged.
The project involved extensive data collection from multiple sources, analysis, data visualisation, web design and development. The vaccination datasets were scraped using Node.js scripts and automatically collected daily using GitHub Actions. Data analysis was mostly done using a mix of MS Excel and Observable notebooks, which allowed for quick visualisations and insight. The various visualisations were first drafted on paper and then laid out with the rest of the page design in Adobe Illustrator. The interactive, scroll-controlled charts were coded in D3.js, while the static ones were made using tools like RawGraphs and styled in Illustrator.
What was the hardest part of this project?
The most difficult part of the project was collecting data from the official CoWIN portal. There were three datasets. The first two covered doses administered weekly at the national and district levels, broken down by age group and dose number. The national-level data, along with demographic data from the United Nations Population Division, powers the streamgraph-bar chart combo. The district-level data was used in a separate analysis of the inequity in vaccination rates between urban and rural districts. This information drew our attention to the “privilege gap” caused by vaccine shortages and prices.
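As a rough illustration of how weekly dose counts and population figures can be combined into a per-capita measure, here is a minimal Python sketch (the team's own scripts were written in Node.js). The field names, age-group labels and numbers are hypothetical placeholders, not the CoWIN schema or real figures.

```python
from collections import defaultdict

def coverage_per_100(weekly_doses, population):
    """Cumulative first doses per 100 people for each age group."""
    totals = defaultdict(int)
    for row in weekly_doses:
        if row["dose"] == 1:  # count first doses only
            totals[row["age_group"]] += row["count"]
    return {
        group: round(100 * totals[group] / people, 1)
        for group, people in population.items()
    }

# Toy inputs for illustration only (hypothetical fields, made-up numbers).
weekly_doses = [
    {"week": "W1", "age_group": "18-44", "dose": 1, "count": 50},
    {"week": "W2", "age_group": "18-44", "dose": 1, "count": 70},
    {"week": "W1", "age_group": "45+", "dose": 1, "count": 200},
    {"week": "W1", "age_group": "45+", "dose": 2, "count": 90},
]
population = {"18-44": 1000, "45+": 500}

print(coverage_per_100(weekly_doses, population))
# {'18-44': 12.0, '45+': 40.0}
```

Dividing by population rather than comparing raw dose counts is what lets age groups of very different sizes be compared on the same chart.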
The third dataset – availability of vaccination slots by age groups – was the crux of the project and the hardest to collect. The booking portal included a public API with details of district vaccination centres and booking availability by age group. The data was real-time; there was no record of the previous day or coming days, as slots would be made available for a small window during the day. This data had to be scraped at the end of each day.
More than 78,000 vaccination centres were found between May 15 and June 4, 2021, from 753 districts in 36 states and union territories in India. The scraper took hours to finish each day and had to be constantly monitored.
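Because the portal kept no history, each day's scrape had to be stored as its own dated snapshot. The Python sketch below shows one way that bookkeeping could look once the centre records have been fetched (the team used Node.js, and the fetching itself is left out here). The record shape, field names and numbers are assumptions made for illustration, not the actual API response.

```python
from datetime import date

def summarise_snapshot(centres):
    """Total the open slots per age group across all centres in one scrape."""
    totals = {}
    for centre in centres:
        for session in centre["sessions"]:
            group = session["age_group"]
            totals[group] = totals.get(group, 0) + session["available_slots"]
    return totals

def append_snapshot(history, day, centres):
    """Store the day's summary under its ISO date, since the live feed keeps no history."""
    history[day.isoformat()] = summarise_snapshot(centres)
    return history

# Toy records for illustration only (hypothetical shape, made-up numbers).
centres = [
    {"name": "Centre A", "sessions": [
        {"age_group": "18+", "available_slots": 5},
        {"age_group": "45+", "available_slots": 20},
    ]},
    {"name": "Centre B", "sessions": [
        {"age_group": "45+", "available_slots": 10},
    ]},
]

history = append_snapshot({}, date(2021, 5, 15), centres)
print(history)
# {'2021-05-15': {'18+': 5, '45+': 30}}
```

Keying each summary by date means a missed day shows up as a visible gap rather than being silently overwritten by the next run.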
This dataset was reshaped in different ways in an Observable notebook, and from it emerged the analysis of vaccination booking slots and of vaccine pricing at private centres. This work helped us tell an important story about the faltering vaccination drive and the weak start to vaccinating young people at a time when the second COVID-19 wave was raging through the nation.
What can others learn from this project?
Though Reuters started collecting the data in mid-May, it took weeks to understand its nuances, and doing so required reaching out to a community of others working with India’s COVID-19 data to pool experiences and insights.
Within the project, we chose some atypical dataviz techniques that we found to be better alternatives for the narrative we needed to tell. The chart comparing daily vaccinations among countries is a heatmap, where it would traditionally be a line chart. The heatmap, however, makes it easier to see at a glance how countries sped up or plateaued in their vaccination rates. The streamgraph was chosen to walk the reader through the vaccination drive and show how the pace varied over that time, before leaving the reader with a column chart at the end that shows the outcome of the journey so far.
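For readers who want to try the heatmap approach, the main data step is pivoting long-format records into a grid with one row per country and one column per day. The Python sketch below is a minimal illustration of that reshaping, not the code behind the published chart; the field names, country labels and values are hypothetical.

```python
def to_heatmap_matrix(records):
    """Pivot long-format records into a country-by-day grid; None marks missing days."""
    countries = sorted({r["country"] for r in records})
    days = sorted({r["day"] for r in records})
    lookup = {(r["country"], r["day"]): r["value"] for r in records}
    matrix = [[lookup.get((c, d)) for d in days] for c in countries]
    return countries, days, matrix

# Toy records for illustration only (hypothetical fields, made-up values).
records = [
    {"country": "A", "day": "2021-05-01", "value": 0.2},
    {"country": "A", "day": "2021-05-02", "value": 0.3},
    {"country": "B", "day": "2021-05-02", "value": 0.1},
]

countries, days, matrix = to_heatmap_matrix(records)
print(countries)  # ['A', 'B']
print(days)       # ['2021-05-01', '2021-05-02']
print(matrix)     # [[0.2, 0.3], [None, 0.1]]
```

Each cell can then be mapped to a colour, so a country that speeds up reads as a row that darkens from left to right, while a plateau reads as a run of similar shades.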