Since the beginning of the pandemic, I have been reporting that India’s official death toll from covid is likely a severe underestimate, both on account of historical registration issues and on account of an overly strict definition of a covid death that kept the numbers artificially low. By first building a case for, and then accessing, confidential all-cause mortality data in India, I was able to provide the first estimates of true covid mortality in India. I reported on excess mortality in the city of Chennai and the states of Madhya Pradesh, Andhra Pradesh, Tamil Nadu and Kerala, and found that in
First, my reporting produced the first estimates of excess mortality in India and, by extension, the first estimates of missed covid mortality. The reporting was immediately picked up by other news organisations in India and abroad, and the impact was magnified by the fact that my articles appeared not only in English, but also in Hindi (Dainik Bhaskar) and Tamil (IndiaSpend Tamil). I have had the opportunity to speak about my reporting on excess mortality at multiple fora, including the UN World Data Forum. Second, and this is something that I’m particularly proud of, I was able to set off a domino effect by putting all of my methodology and data in the public domain and by throwing out an open invitation to journalists across India (no matter what language they worked in) to contact me if they had all-cause mortality data that I could guide them in fashioning into a story. Journalists across Indian newsrooms began reporting on excess mortality in their cities and states, and between us we have now produced data for 18 Indian states, or over 700 million people. Lastly, since all of my data is on GitHub, researchers from across the world have had access to it to do scientific work on covid mortality in India. Multiple papers have been written using the dataset, and it forms part of global repositories. Where there would have been a large India-shaped hole in our understanding of global mortality from covid, reporting by me and then by other Indian journalists was able to fill this gap. Additionally, dozens of bereaved people from across the country reached out to me to say that my work was helping honour the memory of the people they had lost, when official statistics were erasing them. I think that
I used Excel for data analysis and Google Sheets, Infogram and Datawrapper for charts. This was not a data journalism project that used much technology, and I had to rely chiefly on my investigative, journalistic, analytical and narrative skills.
What was the hardest part of this project?
I am going to go out on a limb here and say that too much of data journalism now rewards visualisation rather than journalism. Visualisations are difficult to pull off for independent journalists and those in resource-constrained settings, yet these settings are also where some of the most vital journalism is being done. This is not a project that used advanced tech or visualisation tools, but a project that required all of the best skills of investigative reporting to tell a data story. The first step was identifying why this data was needed, by specifically pointing out what India’s official covid death toll was missing (instead of making over-broad generalisations and voicing suspicions about data suppression). Analytical articles like the ones I wrote for IndiaSpend were vital to create an understanding of what all-cause mortality data could do. Then, accessing this data, as I did for the pieces in The Hindu, Dainik Bhaskar, Scroll and IndiaSpend, required developing sources who could provide confidential data. After I wrote the articles, I wrote an op-ed for The Hindu explaining both the deficiencies and the advantages of using all-cause mortality data, and helping put what I had found in context. The hardest part of this work has been the consistent refusal of the Indian government to part with data. All of the all-cause mortality data had to be accessed through confidential sources, which had to be developed at the level of each city and state government. The National Health Mission administrative data that I used to estimate rural India’s excess mortality was pulled off the government’s website while I was trying to use it (and has not been updated since I wrote the article). As the government continues to stonewall all attempts, putting this data out was all the more essential to know the true impact of covid
What can others learn from this project?
Hopefully, other single-woman teams like mine can learn that being unable to produce impressive-looking data journalism doesn’t mean that you can’t produce data journalism that can change the world for the better. I used nothing more impressive than Microsoft Excel and had to do all of my reporting solo, but was, I hope, able to produce journalism that has altered how the world sees covid in India, push back against government stonewalling, and grant some dignity, even if it is in statistics, to the millions who died of covid but went unremarked and uncounted.
I hope the domino effect that this project had can also help other solo journalists like me realise that we are not alone, and if we put our data and methodologies out in public, others can reach out to us, learn from us, and carry on the mission of speaking truth to power through data.
Finally, I hope that this project gives other journalists who are operating in environments of government suppression the courage to develop alternative sources and push back against the denial of data.