I am a data journalist based in Birmingham, UK, working for the BBC England Data Unit.
After graduating with a master’s in journalism from Cardiff University in 2016, I worked in investigative documentaries for ITV Studios, as a producer for various BBC radio stations and as a digital journalist for BBC News online, before joining the unit in January 2019.
Over the last year I have thrown myself into innovative and original data-driven investigations and upskilled myself in coding, while retaining the mindset of a local reporter who prioritises connecting his work with the everyday lives of people across the country.
I have a lot of freedom in the range of stories I can cover within our relatively small team of three, but have had to quickly adapt and embrace new skills such as wrangling and visualising data using code – as we do everything between us.
My main task is to create original, data-driven stories by researching a topic through traditional tip-offs or by finding an interesting source of data. I’m then responsible for cleaning the data, pitching the story, writing it and visualising the data. A keystone of this is finding compelling case studies from a diverse range of people – to show our readers why the data we’ve gathered affects them and their lives.
This is by far the most important aspect of every piece I write. While I believe a data-driven story without a human voice can still be compelling and technically impressive, the personal aspect brings it under the noses of everyday readers, where it can be a better force for change.
I’m also often tasked with reacting to data released into the public domain – usually by public bodies such as the government or the NHS – and finding an original take on any story that emerges, to make it more than just an update to statistics we may have seen before.
The technical part of my job I find most satisfying is finding ways around the challenges of cleaning and structuring data, to build original datasets. For this I use a variety of tools which I’m constantly updating and changing as I learn.
Every week I will find a new use for a package in R, or a way to use a combination of Excel functions to do a complex task. I also find using this technology improves my workflow in breaking data stories, by preparing scripts for visualisation to allow swift publication.
The projects I’ve attached show how I’ve applied my skills to a range of different subjects, all driven by issues faced by people in the UK in everyday life. I’ve also shown a mix of original journalism and ‘on the day’ stories, which required investigating prior to release to establish whether there was a newer, better public interest story behind the numbers. The technical skills required have ranged from data wrangling, analysis and visualisation in R to advanced formulas in Excel.
For the vast majority of submissions I was responsible for all aspects of the story. However, all were sub-edited by our team’s senior journalist, Dan Wainwright, who has helped me develop immeasurably over the past year, and by Paul Bradshaw, who offered invaluable advice and suggestions on ways to overcome any issues whenever I was overwhelmed by technical challenges.
My submissions are a small selection of pieces from January to October 2019.
Description of portfolio:
In July 2019 I published my original investigation into the effectiveness of the public’s ability to appeal “lenient” criminal sentences in the UK – I found many appeals were being discarded automatically because there were huge limits to the types of crimes that were eligible to be reviewed.
I acquired the data through the Freedom of Information Act, asking how many sentences had been appealed over five years and what the results had been; this was provided to me in an Excel spreadsheet.
To get the headline figures, I had to sort large blocks of text which were not uniform but meant the same thing – for example, mis-spelt offence names that were skewing the overall figures. To get over this hurdle, I merged the data for each year into one spreadsheet using R’s rbind function, then ran the results through OpenRefine to clean up these descriptions.
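The merge-then-clean workflow can be sketched as follows – in Python for illustration, standing in for the R rbind step and OpenRefine's clustering; the offence names, figures and correction map below are invented examples, not the real FOI data:

```python
# Invented example data: one "sheet" of appeal records per year.
yearly_sheets = {
    2014: [{"offence": "Robbery", "appeals": 12}],
    2015: [{"offence": "robery", "appeals": 9}],   # misspelling skews totals
    2016: [{"offence": "ROBBERY ", "appeals": 15}],
}

# Step 1: "rbind" -- stack every year's rows into one combined table.
combined = []
for year, rows in yearly_sheets.items():
    for row in rows:
        combined.append({**row, "year": year})

# Step 2: normalise the descriptions. A manual correction map stands in
# here for OpenRefine's interactive cluster-and-merge feature.
corrections = {"robery": "robbery"}

def normalise(name):
    key = name.strip().lower()
    return corrections.get(key, key)

totals = {}
for row in combined:
    offence = normalise(row["offence"])
    totals[offence] = totals.get(offence, 0) + row["appeals"]

print(totals)  # the three variant spellings now count as one offence
```

Without the normalisation step, the three spellings would be counted as three separate offences, which is exactly the skew described above.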
However, the main technical issue with one aspect of this story was extracting the lengths of each criminal sentence before and after it was reviewed, when the raw number was embedded in a block of text in each cell of the spreadsheet.
To do this, I had to create a set of functions to locate where the number began in each cell, and then a further set of functions, nested within each other, to extract it.
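The same locate-then-extract logic can be sketched with a regular expression – shown in Python for illustration, in place of the nested spreadsheet functions; the cell text below is an invented example, not the real FOI data:

```python
import re

# Invented examples of free-text cells with sentence lengths embedded.
cells = [
    "Sentence increased from 4 years to 6 years on review",
    "Original term of 18 months upheld",
]

def extract_terms(text):
    """Return every (number, unit) pair found in the text."""
    return re.findall(r"(\d+)\s*(years?|months?)", text)

for cell in cells:
    print(extract_terms(cell))
```

A spreadsheet achieves the equivalent by nesting text functions (finding the position of the digits, then slicing them out), which is considerably more fiddly than a single pattern.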
The story was published in July, and in September the government announced an expansion of the scheme, to include more offences.
Every year the UK government reveals how many families have been fined for taking their children out of school for holidays during term time. However, the shock story to emerge this time was that the numbers had nearly doubled in a year.
I was able to ascertain the significance of the story well ahead of the release, by finding and approaching Jon Platt, a father who had lost a historic court case in the previous year for taking his child on holiday during term time. Jon had conducted his own research in his local area, and we were able to obtain enough information for other parts of the country to suggest there had been a surge in fines. This led me on to other parents, who formed the heart of the story.
On the day the data was released, we had the bones of the story ready to go and were able to turn the data around quickly, using R to analyse the relevant spreadsheet – this was then visualised as an interactive map using Carto, as well as in some simpler charts.
An example of data journalism and news that our audiences can use in their day-to-day lives, this piece on the cost of school uniform was a collaboration between myself and a colleague in national TV.
I incorporated the results of a scraper written by my colleague Paul Bradshaw, which counted how many Facebook groups had been created relating to school uniform swaps in the UK. We also used one of the groups we approached as an example in the piece.
We combined this with data from UK government surveys, independent surveys and information I scraped from supermarket websites, to look at the cost of school uniforms and how it varied between primary and secondary school clothes, and to discuss possible ways for families to save money.
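The supermarket-scraping step can be sketched as below – a minimal Python example using only the standard library's HTML parser. The markup, class names and prices are invented stand-ins; a real scraper would fetch live pages and match each retailer's actual page structure:

```python
from html.parser import HTMLParser

# Invented product-listing markup standing in for a supermarket page.
SAMPLE_HTML = """
<div class="product"><span class="name">Polo shirt 2-pack</span>
<span class="price">£3.50</span></div>
<div class="product"><span class="name">Pleated skirt</span>
<span class="price">£5.00</span></div>
"""

class PriceParser(HTMLParser):
    """Collects {"name": ..., "price": ...} dicts from product listings."""

    def __init__(self):
        super().__init__()
        self.field = None   # which field the current text belongs to
        self.items = []

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls == "name":               # each name span starts a new item
            self.items.append({"name": "", "price": ""})
        if cls in ("name", "price"):
            self.field = cls

    def handle_data(self, data):
        if self.field and self.items:
            self.items[-1][self.field] += data.strip()

    def handle_endtag(self, tag):
        self.field = None

parser = PriceParser()
parser.feed(SAMPLE_HTML)
print(parser.items)
```

Once the names and prices are in a uniform structure like this, they can be compared across retailers and against the survey figures.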