2022 Shortlist

Data Gaps

Country/area: India

Organisation: IndiaSpend

Organisation size: Small

Publication date: 15/02/2021

Credit: Shreehari Paliath, Prachi Salve, Shreya Khaitan, Karthik Madhavapeddi, Madhur Singh, Anoo Bhuyan, Shreya Raman, Pranab R Choudhury, Gokulananda Nandan, Archita Raghu, Snigdhendu Bhattacharya, Rukmini S


Shreehari Paliath, Prachi Salve – Journalists, IndiaSpend
Shreya Khaitan, Karthik Madhavapeddi – Senior Editors, IndiaSpend
Vishal Bhargav – Producer, IndiaSpend
Gulal Salil – Former Graphic Journalist, IndiaSpend
Madhur Singh – Former Managing Editor, IndiaSpend
Anoo Bhuyan – Former Health Reporter, IndiaSpend
Shreya Raman – Former Data Reporter, IndiaSpend
Gokulananda Nandan, Archita Raghu – Former Interns, IndiaSpend
Pranab R Choudhury – Founder and Coordinator, NRMC Centre for Land Governance, Bhubaneswar
Snigdhendu Bhattacharya – Author and Journalist
Rukmini S – Data Journalist

Project description:

Our Data Gaps series is our effort to highlight the numbers that are not measured, or not shared publicly, and to question why. By shining a light on known and unknown gaps in public data, we hope to explain how these gaps make government policy less effective and less inclusive, while limiting transparency and independent critique.

Impact reached:

As data journalists, we routinely run into problems with data availability, accessibility, and reliability.

It has long been considered a truism in data circles that ‘if you can’t measure it, you can’t fix it’. Today, as the Indian government relies increasingly on data to design, implement and evaluate its policies and programmes, the use of incomplete or unrepresentative data can perpetuate inequity, and even create new forms of it.

By shining a light on known and unknown gaps in public data, we sought to explain how these gaps make government policies less effective and less inclusive while limiting transparency and independent critique. This ongoing series has sparked many conversations about the need for better, open, and accessible data in India’s public sphere. Many stories have been republished in several national and provincial publications and cited in academic journals (examples below).

In 2021, we wrote about the lack of data related to Covid-19 in India and the resulting underestimation of deaths, and about the lack of data on transgender persons, Dalit Christians and Muslims, migrant workers, and women’s land ownership.

Other impacts (Citations)

Devesh Kapur, Neelanjan Sircar and Milan Vaishnav; Johns Hopkins University School of Advanced International Studies, Washington; October 2021; journal article (international): Gender, Social Change and Urbanisation in Four North Indian Clusters

cited: https://www.indiaspend.com/how-official-data-miss-details-on-half-of-indias-citizens

Sharan Bhavnani, Prashant Narang, and Jayana Bedi; Centre for Civil Society; October 2021; research paper (national): Rights, Restrictions, and the Rule of Law: COVID-19 and Women Street Vendors

cited: https://www.indiaspend.com/how-better-data-could-help-prevent-custodial-deaths/ 


Techniques/technologies used:

The nature of the series limited the use of technology, and most stories relied on boots-on-the-ground reportage. We went through reports going back to Indian Independence, and examined smaller surveys and reports commissioned by the government. We also filed several requests under India’s Right to Information (RTI) Act, but many of these were met with a standard response: “No such data are maintained.”

For instance, the story on Dalit Christians and Muslims had to depend on a report commissioned by the government in 2008, because more recent data, which should have been available through the 2011 national census, were not. We relied on experts and members of the community to give us access to reports and insights on the data gaps.

We used OCR tools to extract information from some of the scanned documents, while many handwritten documents had to be studied manually. To analyse the data that were available, we used Google Sheets and Tableau.


What was the hardest part of this project?

Many of the issues covered in the series, including caste, human rights, and welfare entitlements, are politically contentious. As a result, open data were lacking and government departments were reluctant to provide information. Over several months, we filed several RTI applications and appealed against the non-disclosure of information.

We looked at decades-old data, spoke to several experts to understand the issues, and used their work to shine a light on inadequate information.


What can others learn from this project?

As countries around the world restrict timely access to good-quality data, this series can serve as a guide for journalists everywhere on how to highlight the gaps and initiate public conversations that push towards a truly open data culture.

We hope that such conversations in Indian journalism will urge governments to make more data freely available, which in turn will help an engaged electorate participate in evidence-based policymaking.


Project links: