This project used exclusive income and spending data for every school in Australia to show for the first time exactly how big the divide between rich and poor schools has grown.
By ranking schools according to income, then analysing each school’s capital spend over five years, we were able to illustrate the enormous wealth gap between public and private schools.
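As a rough illustration of the ranking step described above, the Python sketch below totals income and capital spending per school over a chosen window and then ranks schools by income. The records, school names, dollar figures and the choice to rank on total (rather than average) income are all assumptions made for illustration, not the method or data used in the project.

```python
from collections import defaultdict

# Hypothetical records: (school, sector, year, gross_income, capital_spend).
# Every name and figure here is invented for illustration.
records = [
    ("School A", "Independent", 2013, 30_000_000, 5_000_000),
    ("School A", "Independent", 2014, 32_000_000, 7_000_000),
    ("School B", "Government", 2013, 8_000_000, 200_000),
    ("School B", "Government", 2014, 8_500_000, 0),
    ("School C", "Catholic", 2013, 12_000_000, 1_000_000),
    ("School C", "Catholic", 2014, 12_500_000, 500_000),
]


def rank_by_income_with_capital(records, years):
    """Rank schools by total income within `years`, alongside total capital spend."""
    income = defaultdict(int)
    capital = defaultdict(int)
    for school, _sector, year, gross_income, capital_spend in records:
        if year in years:
            income[school] += gross_income
            capital[school] += capital_spend
    # Highest-income school gets rank 1.
    ordered = sorted(income, key=lambda school: income[school], reverse=True)
    return [
        (rank, school, income[school], capital[school])
        for rank, school in enumerate(ordered, start=1)
    ]


# A five-year window (2013 to 2017 inclusive); the sample only covers two of those years.
ranked = rank_by_income_with_capital(records, range(2013, 2018))
for rank, school, total_income, total_capital in ranked:
    print(rank, school, total_income, total_capital)
```

Keeping the income ranking and the capital totals in the same row makes it straightforward to compare capital spending between the top and bottom of the income scale.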
We combined this first-of-its-kind analysis with powerful visual storytelling techniques to explicitly connect the data with the reality most readers directly experience via schools in their neighbourhoods.
The topic of school resourcing is widely seen as dry, complicated and impenetrable, with both “sides” of the debate tending to rely on well-worn, emotive arguments rather than facts and figures. This project injected credible facts and figures into one of Australia’s most protracted and polarising debates. The combination of ground-breaking data analysis and compelling visuals allowed the story to achieve a degree of cut-through rarely managed by stories about this topic.
At the core of this investigation is My School data, one of Australia’s most tightly held datasets on school resourcing. Numerous organisations have tried to obtain this data, and requests have only ever been granted to a handful of researchers under strict conditions not to disseminate it further. ABC News is the first media outlet to successfully scrape the full dataset. To verify the data, we worked closely with a number of academic researchers who were familiar with it.
Our analysis is the first of its kind, and the most comprehensive and detailed to date. It compares the individual finances of nearly every school in Australia, and then analyses this data to reveal the powerful connection between sector (Government, Catholic and Independent) and school funding. Due to distorted spending patterns pre-2012, this was the first year such an analysis was possible.
This story put a closely guarded government dataset into the hands of the public, allowing them to “drill down” into the data and see, for the first time, exactly how their school’s income and expenditure compares. This stirred vital debate about transparency and accountability in school funding. As one expert says in the story: “The Federal Government had much of this information but it had never been made public before… And now you can see how little gets spent on government schools compared to non-government
For the opening visualisation, we combed annual reports, school newsletters, fundraising brochures, P&C bulletins, campaign videos, architectural plans and development applications for dozens of schools.
The technical challenge of visualising 8,500 dots – particularly for our large mobile phone readership – required creative design and development responses. In particular, we chose not to allow users to interact with the visualisation, instead curating a “guided” visualisation that highlighted specific schools that are either well-recognised or representative in the dataset.
We used the online survey tool Screendoor to crowdsource data on the capital funding needs of schools across Australia. Data included photos and personal accounts from dozens of school principals, P&C committees, teachers, students, former students and parents. This proved invaluable for reaching out to under-resourced schools, which are often prevented from speaking to media by the education departments or non-government sector authorities.
We also built a searchable database to allow users to drill down into the data and find “personalised” information for their school. This included an interactive map that allowed users to compare public and private funding and capital expenditure for any given school and its neighbours across the nine-year dataset.
Data was scraped using Chrome browser extension webscraper.io. Data organisation, cleaning and blending was done in Excel and Tableau Prep. Analysis and “proof of concept” visualisations were done in Tableau Desktop.
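The cleaning itself was done in Excel and Tableau Prep, so the following is only a Python sketch of one typical step: turning scraped dollar strings into numbers before analysis. The column names, the `$1,250,000` formatting and the use of `-` for missing values are assumptions about what a scraped export might look like, not a description of the actual My School pages.

```python
import csv
import io


def parse_dollars(text):
    """Convert a string such as '$1,250,000' to an int; return None if blank or '-'."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    if cleaned in ("", "-"):
        return None  # treat as missing rather than as zero
    return int(cleaned)


# Hypothetical scraped export; school names and figures are invented.
sample = io.StringIO(
    "school,year,capital_expenditure\n"
    'School A,2016,"$1,250,000"\n'
    "School B,2016,-\n"
)

rows = [
    {**row, "capital_expenditure": parse_dollars(row["capital_expenditure"])}
    for row in csv.DictReader(sample)
]
print(rows)
```

Keeping missing values as `None` rather than `0` avoids understating a school's spending when a figure simply was not reported.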
What was the hardest part of this project?
The first challenge of this project was collecting the data. The second was finding the most rigorous way to use it. The third – and most difficult challenge – was working out how to transform a vast and detailed (but also potentially “dry” and boring) financial dataset into an arresting story that would cut across the deep ideological divide on school funding.
Our visual language had two main aims:
- To help readers make the explicit connection between the data and their local schools.
- To show how each of the 8,500 schools fits into the bigger picture, a story previously untold because the only data previously available were summary statistics.
We initially built the visualisation in SVG using D3.js but, because of the large number of schools represented, we opted to flatten the bubbles in the visualisation to ease the load on browsers. This ultimately led to a more streamlined, guided experience.
The intention of opening with a monolithic ‘beeswarm’ scroll was to evoke a sense of scroll fatigue for the user, physically articulating the staggering gap between the top- and lower-income schools.
To counteract the considerable length of the visualisation, we highlighted breadcrumbs of information along the way, showcasing promotional material, crowdsourced photos, GIFs, 3D fly-throughs, time-lapse videos and satellite imagery. This was accompanied by an overview of the school’s spending, including income, capital expenditure, and government capital expenditure.
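For readers unfamiliar with the ‘beeswarm’ layout mentioned above, the Python sketch below shows one simple way such a layout can be computed: each dot sits at a vertical position set by its value and is nudged sideways until it no longer overlaps the dots already placed. This is a generic illustration under assumed parameters (`radius`, `scale`, and the half-radius search step are all hypothetical), not the D3.js code used in the published piece.

```python
import math


def beeswarm(values, radius=1.0, scale=1.0):
    """Return (x, y) circle centres, in ascending order of value, with no overlaps.

    y is value * scale; x is the free horizontal offset closest to the centre line.
    """
    placed = []
    step = radius / 2  # granularity of the sideways search (an arbitrary choice)
    for value in sorted(values):
        y = value * scale
        k = 0
        while True:
            # Try the centre line first, then +step, -step, +2*step, -2*step, ...
            candidates = [0.0] if k == 0 else [k * step, -k * step]
            spot = next(
                (
                    x
                    for x in candidates
                    if all(
                        math.hypot(x - px, y - py) >= 2 * radius
                        for px, py in placed
                    )
                ),
                None,
            )
            if spot is not None:
                placed.append((spot, y))
                break
            k += 1
    return placed


# Three dots with the same value spread sideways; an outlier stays on the centre line.
points = beeswarm([1, 1, 1, 5], radius=1.0)
print(points)
```

This brute-force version checks every placed dot on each attempt, so it is only practical for small inputs; a layout for thousands of dots would normally need a spatial index or a precomputed, flattened rendering.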
The final product was a unique story that could not have been adequately told in any other way.
What can others learn from this project?
This story showcases how data-driven techniques can be applied across the entire spectrum of news-gathering and storytelling to propel a ground-breaking investigation.
The skills and techniques others could potentially learn from this project include:
- Making data explorable, so users can find information that is relevant and interesting to their personal situation
- Connecting the personal and the impersonal: that is, how to link “impersonal” data to the personal (and often emotional) reality of users’ lives and choices (such as where to send your child to school)
- Deciding when interactivity serves the audience (and when to abandon it)
- The pros and cons of crowdsourcing data
- Scraping, organising and verifying large datasets
- Visualising a complex, detailed or technical analysis for a general audience
- Creative approaches to handling vast amounts of data
- Reaching users where they are (in our case, optimising all our content for users on mobile phones, who now make up more than half our entire audience)