Dying for Care tapped death reports, resident demographics and ownership records to identify the nursing home chain whose own records revealed the highest death rate during COVID-19’s worst months, employing statistical regression to control for circumstances beyond operators’ control. The series then dug deeper into an issue highlighted in the first story: short staffing. Using federal timesheet data for each home, reporter Jayme Fraser cross-referenced enforcement data to identify facilities that missed government staffing benchmarks yet were never cited for it. As a whole, through data reporting and shoe-leather work, the series revealed appalling impunity for abysmal care.
Within 24 hours of the first story’s publication, the White House reached out. The Biden administration had a proposal for nursing home reforms in the works, and staffers wanted to incorporate our findings. Real estate investment trusts like the one that owns Trilogy, the focus of our first story, had not been on their radar; our reporting provided insight into the financial drivers behind low staffing levels. The stories also informed the administration’s effort this year to tackle problems at the chainwide level for the first time.
Congress reacted swiftly to USA TODAY’s investigation as well. U.S. Rep. Bobby Rush called for congressional hearings into the operations of “profiteering, cold-hearted” nursing home companies during the pandemic. In September, Congress held that hearing, digging into complex ownership structures at the nation’s largest operators, some backed by REITs, and highlighting the plight of overworked, low-paid staff.
David Grabowski, a leading nursing home researcher at Harvard and an adviser to Congress on Medicare, said USA TODAY’s investigation had impact even before it published. Because of our conversations with him, he brought REITs to the attention of the National Academies group studying possible reforms. As a result, REITs were included in the group’s final 600-page report.
Reporter Jayme Fraser performed analysis in R and occasionally shared data with the larger reporting team in Excel or Google Sheets. Collaborator Jeff Kelly Lowenstein, a freelancer and professor at Grand Valley State University, did work in SPSS and Microsoft Access.
The federal government developed a methodology to determine which nursing homes outperformed expectations on COVID-19 infections and deaths, awarding bonuses to top scorers. Reporter Jayme Fraser adapted this methodology, using Poisson regression to control for resident characteristics. The chain with the highest raw death rate based on its own reports, Trilogy Health Services, also ranked highest under the regression method. (Identifying chain ownership was its own task, which included using Selenium to scrape archived versions of chain websites for property lists.)
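The core of such a regression is modeling each home’s death count as Poisson-distributed, with resident-days as the exposure and resident characteristics as covariates; a home’s observed deaths can then be compared with what the model expects. The sketch below is a minimal, hypothetical illustration in Python (the actual analysis was done in R, with far richer covariates): it fits a one-covariate Poisson model by Newton’s method, where an invented “acuity index” stands in for the resident-health controls.

```python
import math

# Hypothetical facility records: (deaths, resident_days, acuity_index).
# The acuity index is an invented stand-in for resident-health controls.
facilities = [
    (2, 9000, 0.8), (5, 12000, 1.3), (1, 7000, 0.6),
    (9, 15000, 1.8), (3, 11000, 1.0), (7, 10000, 1.6),
]

def fit_poisson(data, iters=25):
    """Fit log(mu) = b0 + b1*acuity + log(resident_days) by Newton's method."""
    total_y = sum(d for d, _, _ in data)
    total_t = sum(t for _, t, _ in data)
    b0, b1 = math.log(total_y / total_t), 0.0   # start at the overall rate
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for deaths, days, x in data:
            mu = math.exp(b0 + b1 * x) * days   # model's expected deaths
            g0 += deaths - mu                   # gradient terms
            g1 += (deaths - mu) * x
            h00 += mu                           # Hessian terms
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det       # 2x2 Newton step
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_poisson(facilities)

# Observed-over-expected ratios: above 1 means more deaths than peers
# with similar resident acuity would predict.
ratios = [d / (math.exp(b0 + b1 * x) * t) for d, t, x in facilities]
```

In practice one would reach for a statistics library rather than hand-rolled Newton steps; the point is only that the adjusted comparison falls out of the fitted expected counts.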
The regression produced a score for every U.S. nursing home, enabling us to create a unique A-to-F grading system for how each facility weathered the stress test of COVID-19’s peak. Chris Amico of USA TODAY’s Storytelling Studio used Svelte and Mapbox to turn the results into a high-utility consumer search tool.
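One simple way to turn regression scores into letter grades is percentile binning. The Python sketch below is purely illustrative: the quintile cutoffs and the example ratios are assumptions, not USA TODAY’s published breakpoints.

```python
# Hypothetical observed/expected death ratios, one per facility.
ratios = [0.4, 0.5, 0.7, 0.9, 1.0, 1.1, 1.3, 1.6, 1.8, 2.2]

def grade(ratio, all_ratios):
    """A = best fifth of homes (lowest ratios), F = worst fifth."""
    rank = sum(r <= ratio for r in all_ratios) / len(all_ratios)
    for cutoff, letter in [(0.2, "A"), (0.4, "B"), (0.6, "C"), (0.8, "D")]:
        if rank <= cutoff:
            return letter
    return "F"

grades = {r: grade(r, ratios) for r in ratios}
```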
Separately, to identify how many facilities with substandard staffing are punished for it, we drew on the federal Centers for Medicare and Medicaid Services’ “STRIVE” index for each nursing home. STRIVE indicates the number of caregiver hours expected at each facility based on resident need. By marrying these numbers with timesheet data (the Payroll-Based Journal database from CMS), we identified facilities whose staffing fell short of the government’s own expectations. Joining this dataset with enforcement data showed which subset received citations. The answer: not many.
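The shape of those joins can be sketched in a few lines. The toy Python below uses invented provider IDs and values (the real work used full CMS extracts): match expected hours to actual hours per provider, flag shortfalls, then subtract the providers who were cited.

```python
# Invented example data keyed by CMS provider ID (values are illustrative).
strive = {"015009": 3.6, "015010": 3.2, "015011": 4.1}  # expected hrs/resident-day
pbj    = {"015009": 3.1, "015010": 3.4, "015011": 2.8}  # actual hrs from timesheets
cited  = {"015011"}                                     # staffing citations

# Homes whose actual staffing fell short of the STRIVE expectation...
understaffed = {pid for pid in strive if pid in pbj and pbj[pid] < strive[pid]}
# ...and, of those, the ones regulators never cited.
uncited = understaffed - cited
```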
Graphics specialists Ramon Padilla and Carlie Procell delivered most of the data results for each story as “scrollytelling” features, a frequently used USA TODAY technique. These employed Adobe Illustrator, After Effects and Svelte to animate charts and images as their own narrative.
Context about the project:
We presented Trilogy Health Services with detailed findings weeks before our planned publication. There was no response until the last minute, when company officials said they’d erroneously over-reported COVID-19 deaths to the federal government. We asked them exactly which numbers they had reported incorrectly, at which facilities, and why it happened. They responded with an Excel file of aggregate data and an emailed list of four broad categories of errors Trilogy said it made. Our run date was postponed so we could explore their response.
Company officials declined to give us copies of the data Trilogy planned to resubmit. They also declined to provide anonymized examples of their misreporting. All they offered was a new set of death totals by state and by week, in contrast to our facility-level weekly death counts. That made it extremely hard to verify Trilogy’s assertion that 40% of its reported deaths shouldn’t count.
Absent documentation for each death or group of deaths Trilogy proposed to eliminate, we decided to see what our results would look like if we took Trilogy’s revised figures at face value. We found Trilogy would remain significantly above the industry average for the 2020-21 winter surge. In the end, we published the numbers Trilogy originally reported, the company’s assertions about its own errors and our best estimate of what they would mean if true.
When Trilogy later submitted its new numbers to the government, it was the biggest COVID-19 death revision by any operator during the pandemic. The feds told us they planned to review it, and we requested documentation of that review via FOIA. Months later, we are still awaiting answers.
Developing the project and targeting reporting efforts required stitching together more than a dozen data sources. We had to geocode and match addresses, align myriad time units, keep track of various reporting requirements, untangle differing definitions of COVID cases and staff titles, and consider which data points would be most reliable amid a global pandemic.
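Matching facility addresses across more than a dozen sources usually means normalizing them first. Here is a minimal, hypothetical Python normalizer; the abbreviation table is a toy subset, and the project’s actual matching rules were more involved.

```python
import re

# Toy abbreviation table for illustration only.
ABBREV = {"street": "st", "avenue": "ave", "road": "rd",
          "north": "n", "south": "s", "suite": "ste", "drive": "dr"}

def normalize(addr):
    """Lowercase, strip punctuation, and standardize common abbreviations
    so the same facility matches across differently formatted datasets."""
    addr = re.sub(r"[^\w\s]", " ", addr.lower())
    return " ".join(ABBREV.get(tok, tok) for tok in addr.split())
```

With this, “1200 North Main Street, Suite 4” and “1200 N. Main St Ste 4” collapse to the same key, so records from different agencies can be joined on it.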
Notably, our dataset on the REITs collecting rent from the nation’s largest nursing home operators required extensive public records digging to build a spreadsheet from scratch. Because it was unique, it enabled us to answer questions regulators couldn’t about the pandemic. While the feds focused on individual facilities, we were able to analyze outcomes at the level where decisions are often made: operating company or owner. This work identified the need for greater scrutiny of the real estate landlords involved in the industry.
Advanced statistical techniques added authority to our reporting and cut off the most common defenses of poor outcomes (resident health, regional pandemic outbreaks). After publication, David Zimmerman, a retired researcher who led the design of the Centers for Medicare and Medicaid Services’ quality indicators for nursing homes, wrote: “I am so impressed with the work that you folks have done in the two articles that I’ve read and the various supplemental things you had in each of the articles. I haven’t seen anything like this ever, actually, but particularly the methodology.”
What can other journalists learn from this project?
One lesson is that government tools can be repurposed. The feds created a formula that could highlight the best performers among nursing homes facing down COVID-19, treating everyone fairly by controlling for local differences. Reporter Letitia Stein recognized that this technique could also highlight the worst players. It was a terrific insight.
Similarly, reporter Jayme Fraser reused a federal formula designed to calculate Medicare reimbursements — an approach used in a study by a Harvard researcher. The formula shows how many employees the government expects to be hired at each nursing home, giving us a number that could be compared with actual staffing. It’s a difficult-to-refute benchmark: the government’s own expectations. We were able to show that regulators allowed 96% of facilities to miss this benchmark and get away with it.
Whenever possible, a second set of eyes on the data helps. USA TODAY’s Data Team uses a second data reporter to review the lead data reporter’s analytical code in each major project. In this case, Aleszu Bajak was able to replicate Jayme Fraser’s work independently and run alternative models that gave additional confidence in her results. Data reporter Kevin Crowe reviewed her code on the second major story focused on understaffing.