For this exclusive analysis, I cross-referenced a database of 250,000 Airbnb listings across Great Britain with official housing stock figures to identify places where the ratio of listings to homes was disproportionately high. Static and interactive graphics in the story showed readers the national picture and let them explore the figures for their local area.
The analysis was then followed up with on-the-ground reporting from hotspots in England and Scotland, which revealed that the high proportion of short-term rentals drives up property prices, disrupts local communities and often leads young locals to leave the areas in question.
The main article made the front page of the print paper and was viewed more than 200,000 times online. Following publication, I was invited onto two national radio shows to discuss the findings of the analysis. The broadcasters also invited property owners and people living near Airbnb hotspots to call in to the shows, which led to some very interesting debates.
Our investigation also sparked renewed political calls for more regulation of short-term rentals in the UK. Prompted by our exclusive story, Green MP Caroline Lucas said local councils should be given the power to impose a 90-day cap on homes let out on Airbnb and similar platforms.
Our story was also picked up by local media in Scotland, where we identified the highest Airbnb rates in the country.
Listings data was provided by the activist project Inside Airbnb, which regularly scrapes the Airbnb website.
The figures were then cross-referenced with official housing stock figures at a very small geographic level: so-called Middle Layer Super Output Areas in England and Wales, and Intermediate Zones in Scotland. This allowed me to calculate the rate of Airbnbs at a neighbourhood level, using QGIS and Node.js.
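The rate calculation described above can be sketched in Node.js roughly as follows. This is an illustrative sketch rather than the code used for the story: it assumes each listing has already been tagged with the code of the area it falls in (for example through a spatial join in QGIS), and the names listingRates, areaCode and dwellingsByArea are hypothetical.

```javascript
// Illustrative sketch only; function and field names are assumptions.
// listings: array of objects with an `areaCode` property, assigned beforehand
//           (for example by a point-in-polygon join in QGIS).
// dwellingsByArea: plain object mapping area code -> number of homes.
function listingRates(listings, dwellingsByArea) {
  // Count listings per area.
  const counts = new Map();
  for (const { areaCode } of listings) {
    counts.set(areaCode, (counts.get(areaCode) ?? 0) + 1);
  }

  return Object.entries(dwellingsByArea)
    // Skip areas with no recorded homes to avoid dividing by zero.
    .filter(([, dwellings]) => dwellings > 0)
    .map(([areaCode, dwellings]) => {
      const n = counts.get(areaCode) ?? 0;
      // Listings per 1,000 homes, so areas of different sizes are comparable.
      return { areaCode, listings: n, dwellings, per1000: (n * 1000) / dwellings };
    })
    // Highest rates first.
    .sort((a, b) => b.per1000 - a.per1000);
}
```

Expressing the result per 1,000 homes rather than as a raw count is what lets a small neighbourhood with many listings rank above a large one with slightly more.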
I also ran various quality checks on the data. For example, Airbnb says that the location shown for each listing on its website may be offset from the true location by up to a few hundred metres for privacy reasons. To see whether this could distort my analysis, I randomly shifted each listing by a few hundred metres myself and ran my statistics again. This didn’t materially change the local figures or the list of worst-affected areas.
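A robustness check of this kind could look something like the sketch below. It is not the original code: the function names, the lat and lon field names and the flat-earth conversion of roughly 111,320 metres per degree are all assumptions made for illustration. After jittering, the listings would be reassigned to areas and the rates recalculated before comparing the two rankings.

```javascript
// Illustrative sketch only; names and the metres-per-degree figure are assumptions.
const METRES_PER_DEGREE = 111320; // rough approximation, adequate for small offsets

// Move a listing by a random distance (up to maxMetres) in a random direction.
// `random` can be swapped for a seeded generator to make runs repeatable.
function jitter(listing, maxMetres, random = Math.random) {
  // The square root spreads points evenly over the disc instead of
  // clustering them near the centre.
  const distance = maxMetres * Math.sqrt(random());
  const bearing = 2 * Math.PI * random();
  const north = distance * Math.cos(bearing);
  const east = distance * Math.sin(bearing);
  const latRadians = (listing.lat * Math.PI) / 180;
  return {
    ...listing,
    lat: listing.lat + north / METRES_PER_DEGREE,
    // A degree of longitude shrinks with latitude, hence the cosine.
    lon: listing.lon + east / (METRES_PER_DEGREE * Math.cos(latRadians)),
  };
}

// Count how many of the top-n areas by rate appear in both rankings.
// Each ranking is an array of { areaCode, per1000 } objects.
function topOverlap(before, after, n) {
  const top = (rates) =>
    rates
      .slice()
      .sort((a, b) => b.per1000 - a.per1000)
      .slice(0, n)
      .map((r) => r.areaCode);
  const afterTop = new Set(top(after));
  return top(before).filter((code) => afterTop.has(code)).length;
}
```

If the overlap stays at or close to n across repeated runs, the location offset is unlikely to be driving the list of hotspots.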
Static maps were created in QGIS and Illustrator and the interactive embed was created on the basis of QGIS exports. The interactive embed also features a postcode lookup (try inputting NW54DE, for example).
What was the hardest part of this project?
As mentioned above, one of the most involved parts of the project was checking the vast source dataset for all kinds of potential flaws. These challenges ranged from defining ‘active’ Airbnb listings to confirming the geographic accuracy of our analysis. Another challenge was merging separate housing figures for England, Scotland and Wales.
Overall I believe the project should be selected because it was the first time anyone conducted this kind of investigation at a national level, because we approached the source data with a healthy amount of scepticism and worked hard to rule out potential problems in the headline figures, and because we followed up the data analysis with powerful on-the-ground reporting.
Finally, a lot of work went into designing both static and interactive graphics that would enrich the storytelling experience. I ended up with a very straightforward static map that could go near the top of the piece, while the interactive ‘postcode finder’ embed was designed to be fairly small in its collapsed form, making it easily skippable for people who just want to read the national story. This concept of initially following a journalist-driven angle before allowing the reader to explore the data interactively is sometimes referred to as the ‘martini glass’ format.
What can others learn from this project?
One of the main takeaways from this project for me was how powerful the combination of data analysis and on-the-ground reporting really is.
Sometimes traditional journalists fall into the trap of talking to a handful of people and trying to extrapolate a wider ‘trend’ from very anecdotal reports. In data journalism we sometimes have the opposite problem — we analyse a large national dataset and write extensively about places or population groups that we haven’t actually visited or spoken to first-hand.
This project was a great example of combining the two steps. After identifying (and double-checking) the top 10 Airbnb hotspots in the country, news editors assigned reporters based outside London to visit two of the worst-affected places on the list: Edinburgh and north Devon. The article on north Devon in particular painted a very nuanced picture of the local situation and communicated the scale of the issue far better than a purely data-driven article written from our London office ever could.