Life in Metro was a 10-part visual data journalism series by Mint that examined different aspects of city life in the six largest urban agglomerations of India. The goal was to shed light on the cities themselves (for example, which areas are better off than others) and to compare the cities with each other. From mobility to migration, green spaces to housing, each part of the series used data to map and chart the components of city life, making it a first-of-its-kind project in India.
The project generated significant buzz both inside our newsroom and outside it. Academics and journalists reached out to ask for some of the data, while others were intrigued by how we developed the maps. Our final rankings of the cities gained significant traction on social media (Twitter, LinkedIn and Reddit).
Like several data journalism projects in this newsroom and elsewhere, this project relied largely on a standard toolkit: Python, MS Excel, QGIS, Adobe Illustrator and Google Earth Engine. All scraping was done in Python. Most of the analysis was then done in MS Excel and Python. All maps were generated in QGIS and styled in Adobe Illustrator for print.
What makes the series different is not the tools but how they were used to build our stories. The mandate was to use data that was publicly available. This came with two challenges: data in India is often not publicly available, and when it is, it is often restricted to a single city. The lack of relevant official data at the city level meant we had to improvise. Road speeds for 150 major roads were extracted from Google Maps, without an API. Neighbourhood-level population data was generated from the World Population Dataset. Similarly, green cover was estimated using Google Earth Engine. The data was also not geocoded. We used publicly available shapefiles (and even generated new ones) to take advantage of census datasets for our stories on migration and public facilities, producing never-before-seen neighbourhood-level maps of major Indian cities.
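To illustrate the idea of deriving neighbourhood-level population from a gridded dataset, here is a minimal sketch in pure Python. It is not the code used for the series, which did this kind of work in QGIS and Python. It assumes the gridded data has been reduced to cell centres with a population count each, and that neighbourhood boundaries are simple polygons. All names and numbers here (point_in_polygon, population_by_neighbourhood, "Ward A", "Ward B") are hypothetical and chosen only for the example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's y value can be crossed
        # by a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def population_by_neighbourhood(
    cells: List[Tuple[Point, float]],
    neighbourhoods: Dict[str, Polygon],
) -> Dict[str, float]:
    """Sum the population of every grid cell whose centre falls in a polygon."""
    totals = {name: 0.0 for name in neighbourhoods}
    for centre, population in cells:
        for name, polygon in neighbourhoods.items():
            if point_in_polygon(centre, polygon):
                totals[name] += population
                break  # assume neighbourhoods do not overlap
    return totals


# Made-up example data: two square neighbourhoods side by side.
neighbourhoods = {
    "Ward A": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "Ward B": [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
}
# Each cell is (centre coordinates, population count).
cells = [
    ((0.5, 0.5), 120.0),
    ((1.5, 1.5), 80.0),
    ((2.5, 0.5), 200.0),
    ((3.5, 1.5), 50.0),
    ((5.0, 5.0), 999.0),  # outside both neighbourhoods, so ignored
]

totals = population_by_neighbourhood(cells, neighbourhoods)
print(totals)
```

In practice the same aggregation is usually done with a zonal statistics tool in GIS software, which also handles cells that straddle a boundary. The sketch assigns each cell wholly to the polygon containing its centre, which is a simplification.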
What was the hardest part of this project?
One challenge of producing a daily page is that our small team of seven often has to juggle multiple stories at the same time, along with other responsibilities such as producing the page for print. This 10-part series was largely anchored by two members of our team (Sriharsha Devulapalli and Vishnu Padmanabhan), with some data help from a consultant (HowIndiaLives). For each story in the series, we had a lead time of five days, from data sourcing to print. This meant getting the data, making sure it was error-free, brainstorming a story, writing it and generating highly detailed graphics, and then repeating this process every week for two and a half months. We can confidently say that consistently producing high-quality visual data stories at this pace is unheard of in an Indian newsroom.
What can others learn from this project?
Despite India’s burgeoning newspaper industry, there remain very few large venues for visual print journalism in the country. Plain Facts, by virtue of being a daily data journalism page, has largely pioneered visual data journalism over the past year. This series, among all the stories produced at Plain Facts, remains a good model for doing high-quality, rigorous and accurate visual data journalism within short turnaround times.