An inventory crunch is making life impossible for home buyers. Our interactive map can help you track the availability of houses for sale near you.
Entry type: Single project
Country/area: United States
Publishing organisation: MarketWatch
Organisation size: Big
Publication date: 2022-05-12
Authors: Reporters: Eleanor Laise, Jacob Passy
Reporter and developer: Katie Marriner
Editor: Nathan Vardi
The MarketWatch enterprise team produces engaging deep-dive features, investigations, data-driven stories and interactives that make an impact and give readers insight into the ever-evolving business and financial world.
MarketWatch analyzed more than five years of monthly data from Realtor.com to drive reporting on the housing inventory crisis and to create a tool for users to search by county how many homes are available for sale in a given month, the median home price and the five-year change in these indicators.
The nation’s housing inventory shortage was one of the biggest economic and personal finance stories of 2022. The plunging number of homes available for sale caused housing prices to soar and complicated efforts of policymakers fighting inflation. To help readers understand the implications on a personal level, MarketWatch reporters turned to the data.
MarketWatch analyzed five years of raw monthly housing inventory statistics provided by Realtor.com for more than 3,000 U.S. counties. The analysis drove MarketWatch’s reporting on the issue by identifying areas where the shortages were most acute, making sure to account for variations and noise in the data based on county size. MarketWatch created a unique interactive so that readers could easily search 2,296 counties by looking them up via a search bar or clicking on a map to see the five-year inventory change along with median list-pricing data. The tool updates monthly.
By breaking down inventory figures by county, this project contextualizes the crisis for readers in their own markets. While the crisis is national, each county has its own challenges based on different variables and factors. For potential homebuyers seeking a move to another county, understanding how that area’s landscape has changed over five years is key to understanding that housing market.
Other publications followed with their own stories on the issue using similar timeframes and data sources. Months after MarketWatch launched its housing inventory project, Fortune (https://fortune.com/2022/07/06/housing-market-correction-is-hitting-your-local-housing-market-as-told-by-one-interactive-map-home-prices/) published a U.S. housing inventory map that looked very similar to ours. But it had much more limited data and utility, covering only large metros as opposed to thousands of counties. It was also not easily searchable. The New York Times (https://www.nytimes.com/2022/07/14/upshot/housing-shortage-us.html) reported that the housing inventory crisis was no longer a “coastal issue,” an issue that our reporting and analysis had revealed a month earlier.
The data that drives the reporting and the interactive is gathered and organized into a repository of JSON files with a reusable Jupyter Notebook. After each new month of data from Realtor.com is published, the Notebook is run to update the JSON files in a GitHub repository. Jenkins then deploys those files to public data storage, where the interactive can access them.
The same Notebook also creates a CSV file and uploads it to a Google Spreadsheet. The file is formatted consistently so that months can be compared directly, which drives reporting by identifying which counties and metrics to focus on for our next stories. Those involved with the project can sift through the data to find the counties where the housing inventory crisis is most acute and to review other metrics that are useful in reporting, such as pending and active sales, median price per square foot, and average home price.
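The monthly update described above centers on one calculation: the five-year change in listings for each county. The following is a minimal sketch of that step, not the project's actual Notebook code. The row keys (`fips`, `month`, `total_listings`) and both function names are hypothetical, and months are assumed to be `YYYY-MM` strings.

```python
from collections import defaultdict


def shift_month(month, delta):
    """Shift a 'YYYY-MM' string by delta months (delta may be negative)."""
    year, mon = map(int, month.split("-"))
    index = year * 12 + (mon - 1) + delta
    return f"{index // 12:04d}-{index % 12 + 1:02d}"


def five_year_change(rows, latest_month, months_back=60):
    """Return {fips: percent change in total listings} over months_back months.

    rows is an iterable of dicts with the hypothetical keys
    'fips', 'month' and 'total_listings'.
    """
    base_month = shift_month(latest_month, -months_back)

    # Index every county's monthly values by month.
    by_county = defaultdict(dict)
    for row in rows:
        by_county[row["fips"]][row["month"]] = row["total_listings"]

    changes = {}
    for fips, series in by_county.items():
        base = series.get(base_month)
        latest = series.get(latest_month)
        # Skip counties with a missing or zero baseline, or no latest value.
        if base and latest is not None:
            changes[fips] = round((latest - base) / base * 100, 1)
    return changes
```

A county that had 200 listings in April 2017 and 80 in April 2022 would come out at a change of -60 percent. Counties without data for either month are left out rather than reported with a misleading value.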
The front end was developed using Svelte to create modularized components that are easy to rearrange and extract. The visualization library D3.js renders the choropleth maps from TopoJSON files stored server-side. These TopoJSON files are created with a repeatable bash script using the topojson-client library, which merges geographic data with the values (housing inventory change) that map onto a color scale.
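To illustrate how a value such as housing inventory change can map onto a color scale, here is a small sketch of a threshold-style scale. It is written in Python for illustration only; the interactive itself uses D3.js, and the source does not say which scale type, break points or colors it uses. The thresholds, the palette and the `color_for` function are all assumptions.

```python
from bisect import bisect_right

# Hypothetical break points (percent change) and a five-step palette.
# A value below -50 gets the first color; 25 or above gets the last.
THRESHOLDS = [-50, -25, 0, 25]
COLORS = ["#b2182b", "#ef8a62", "#fddbc7", "#d1e5f0", "#67a9cf"]
NO_DATA_COLOR = "#cccccc"


def color_for(change_pct):
    """Return the fill color for a county's five-year inventory change."""
    if change_pct is None:
        return NO_DATA_COLOR
    # bisect_right counts how many thresholds are <= the value,
    # which is the index of the matching color bucket.
    return COLORS[bisect_right(THRESHOLDS, change_pct)]
```

With N thresholds there are N + 1 colors, so every possible value falls into exactly one bucket, and counties without data get a neutral gray.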
Context about the project:
As with any data-centered story, compromises have to be made, and it has to be acknowledged that data alone cannot fully represent a situation. It is important to explain to audiences which metrics, data transformations, time period and granularity were chosen, and what these elements can and cannot say together. Realtor.com supplies many columns of data. In particular, there are three separate data points for listings (total, active and pending), each of which gives separate insight into a housing market. The decision to present total listings was a deliberate one. We wanted to capture the whole picture: the change in active listings shows surplus, while pending listings show actual buying activity. Because these two are more volatile, and because we were seeking to understand both together over a longer period of time, we used total listings as our core metric and the other two as supplemental metrics for our reporting.
Without a team of developers, compromises had to be made on how the interactive performs and is presented. Despite that, three separate views were created, each of which serves a vital part in telling the story of the housing inventory crisis. Each piece (the searchable map with detailed metrics by county, the time-lapse map over five years and the top-10 lists) was thoughtfully constructed with performance, relevance and necessity in mind. The only developer for the interactives is primarily a front-end developer, with limited time and resources to safely and reliably implement a database connection or build an API, so a creative solution had to be found to handle the large amount of data this project requires.
The technological compromise was to create a folder structure of multiple static JSON files, stored publicly, in place of a database queried through an API. JSON files with detailed data by county are divided into one file per state. This prevents the browser from requesting data for more than 2,000 counties at once: a state file is requested only when a county within that file is searched for or the state is clicked on. To speed up later county searches within that state, the data is kept client-side so the same cross-site request does not have to be made twice.
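The one-file-per-state layout and the load-once behavior can be sketched as follows. This is an illustration in Python of the pattern only; in the project the loading happens in the browser, and the record keys (`state`, `fips`), the file naming and the `StateCache` class are hypothetical.

```python
import json
from pathlib import Path


def write_state_files(counties, out_dir):
    """Group county records by state and write one JSON file per state."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    by_state = {}
    for county in counties:
        by_state.setdefault(county["state"], []).append(county)

    for state, records in by_state.items():
        (out_dir / f"{state}.json").write_text(json.dumps(records))
    return sorted(by_state)


class StateCache:
    """Load a state's file on first use and keep it in memory afterwards."""

    def __init__(self, data_dir):
        self.data_dir = Path(data_dir)
        self._loaded = {}
        self.reads = 0  # counts file reads, to show that caching works

    def counties(self, state):
        if state not in self._loaded:
            self.reads += 1
            text = (self.data_dir / f"{state}.json").read_text()
            self._loaded[state] = json.loads(text)
        return self._loaded[state]

    def find(self, state, fips):
        """Return the record for one county, or None if it is not listed."""
        return next((c for c in self.counties(state) if c["fips"] == fips), None)
```

Looking up two counties in the same state triggers a single read of that state's file. The trade-off is that the first lookup in a state transfers every county in it, which stays manageable because no single state holds more than a few hundred counties.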
What can other journalists learn from this project?
While technological compromises had to be made with this project, these compromises can teach data and technology journalists that meaningful and robust work can still be accomplished quickly and securely. Building the interactives for this project centered on the product development practice of the “minimum viable product” (MVP): finding a balance between usability, readability and performance in order to publish quickly and often. This is a useful way to think about deadline-oriented projects.
Coding journalistic products on deadline forces a simple approach, which leads to longer-term viability. Storing static files on GitHub or other third-party sources reduces the reliance on internal databases and products that can quickly become outdated and neglected due to a product team’s constraints. Less maintenance and less reliance on systems that could be decommissioned in future years mean that a project is more likely to survive multiple updates to websites and systems and to be preserved on the internet for longer.