TuitionTracker is a tool that projects the net costs for prospective students enrolling in college and allows them to compare the value and outcomes of 3,891 colleges against one another. Unlike other web apps offering historical data to students, TuitionTracker uses a financial projection technique to estimate the tuition to be charged by each institution in the coming year and the expected net cost (which is often much lower). This means that students enrolling in 2020-21 would know the cost of tuition in 2020-21, not be forced to imagine the price based on what it cost in 2017-18.
The data produced for this tool went on to power a number of impactful stories. The Hechinger Report produced two of these stories, based on issues relating to the cost of American universities. The first story, reported and written by Emmanuel Felton, focused on Nicholls State University and how that Louisiana college tried to offset cuts in state funding by increasing tuition on low-income students and their families. This became the basis for an on-air story from Soledad O’Brien for her nationally televised show Matter of Fact.
The second piece, reported and written by Pete D’Amato, examined skyrocketing tuition prices at elite universities and how the “sticker shock” scares off low-income and minority students from applying, despite the fact that they are likely to receive institutional funding that drops their attendance costs close to zero. Focusing on the University of Chicago and the 2025 milestone when the school is expected to break $100,000 for a year of attendance, D’Amato traced rising tuition prices back to certain accounting tweaks in the 1980s at some small liberal arts schools, which created the byzantine and opaque process of paying for college that we have now. The University of Chicago milestone spurred a piece on the cost of college in The Atlantic, and data on expected cost increases generated state-level stories for public radio stations in Iowa and Vermont.
Importantly, the tool itself has been used by many students, as well as the parents and guardians of college-bound students, to learn about the disparity between advertised tuition and the real cost of attendance. It allows comparisons for five different levels of family income. Counselors working with high schoolers have contacted The Hechinger Report to say they recommend the tool to prospective college students.
The web application itself is built on an earlier version, created in partnership with the Education Writers Association and The Dallas Morning News. Various components were overhauled or developed from scratch for the new version.
Generating the data that powers the tool was the most difficult and sensitive part of development. Over a period of months, Pete D’Amato worked with federal education data in the R statistical language to determine the best predictors of year-on-year changes in college costs. Using regression analysis, he compared colleges’ historical net prices to a range of variables, from advertised tuition to the incomes of enrolled students to U.S. gross domestic product. In the end, he determined that the statistic most closely correlated with net price was advertised tuition, and so he pegged the actual cost to the estimated tuition for the upcoming school year. That tuition estimate was generated from each institution’s compound annual growth rate over the past ten years.
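The growth-rate projection described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual R code: the function names are hypothetical, and the step that ties net price to tuition assumes the most recent ratio of net price to advertised tuition simply carries forward, which is only one possible reading of "pegged."

```python
def cagr(first_value, last_value, periods):
    """Compound annual growth rate between two values `periods` years apart."""
    if first_value <= 0 or periods <= 0:
        raise ValueError("first_value and periods must be positive")
    return (last_value / first_value) ** (1 / periods) - 1


def project_tuition(history, years_ahead=1):
    """Project advertised tuition forward from a list of annual values.

    `history` is ordered oldest to newest, one value per year, so eleven
    values span ten years of growth.
    """
    if len(history) < 2:
        raise ValueError("need at least two years of tuition data")
    rate = cagr(history[0], history[-1], len(history) - 1)
    return history[-1] * (1 + rate) ** years_ahead


def project_net_price(history, latest_net_price, years_ahead=1):
    """Estimate net price from projected tuition.

    ASSUMPTION (not from the source): the latest net-price-to-tuition
    ratio is held constant over the projection period.
    """
    ratio = latest_net_price / history[-1]
    return ratio * project_tuition(history, years_ahead)
```

Because federal data lags the enrollment year (the 2017-18 figures versus a 2020-21 enrollment, for example), a caller would likely pass a `years_ahead` greater than one; the right value depends on which year the latest data covers.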
The tool uses Python to extract all the data from a master CSV file and write it into a separate JSON file for each institution. The data is then read from the JSON files using D3.js and displayed using D3 or HTML with CSS animations. A small component, the comparison tool, which allows users to compare multiple schools by graduation rate or by actual cost, is built in React.js and bundled using Webpack.
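The CSV-to-JSON step might look something like the sketch below. It assumes the master file holds one row per institution and that a column uniquely identifies each school; the column name `unitid`, the function name and the output file naming are all hypothetical, since the write-up does not describe the file layout.

```python
import csv
import json
from pathlib import Path


def split_csv_to_json(master_csv, out_dir, id_column="unitid"):
    """Write one JSON file per row of the master CSV, named by institution ID.

    ASSUMPTION: one row per institution; `id_column` is a hypothetical name.
    Values are kept as strings exactly as they appear in the CSV.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    with open(master_csv, newline="", encoding="utf-8") as source:
        for row in csv.DictReader(source):
            school_id = row[id_column].strip()
            if not school_id:
                continue  # skip rows with no identifier
            target = out_path / f"{school_id}.json"
            with open(target, "w", encoding="utf-8") as out:
                json.dump(row, out, indent=2)
            written.append(target)
    return written
```

Splitting the data this way means the browser only has to fetch the one small file for the school a user is viewing, rather than the full dataset for every institution.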
TuitionTracker includes data from 3,891 American colleges and universities.
What was the hardest part of this project?
The most difficult part of this project was figuring out how to project costs for college-bound students accurately from the available historical data. It was important to spend the time working with R to decide which variables were most closely correlated, then to test the formula used to project prices against fresh data.
What can others learn from this project?
As people demand more transparency about institutions such as colleges, and more data becomes available to the wider public, journalists should strive to present this data in the most digestible way for audiences. This requires thinking about the situational context of users and the way that they consume and interpret data. It is not sufficient to show audiences historical data and hope they will be able to apply it to the future. In the college affordability example, a parent who is presented with historical data on college costs without instructions on how to project them forward may take the topline data point (the price for the last year in which data was released) and assume this is the price that will be advertised to them and their child, when this figure is likely to be thousands or tens of thousands of dollars off the mark. Adding to its usefulness for consumers, the tool also allows users to compare graduation rates, acceptance rates and the percentage of students who pay the full sticker price at any of the 3,891 colleges.
Journalists creating data visualization tools that are consumer-focused need to develop skills to better interpret such data, whether through machine learning, statistical analysis or other methods, in order to present a final product that requires less guesswork on the part of users and allows for more accurate estimation.