BBC Shared Data Unit

Entry type: Portfolio

Country/area: United Kingdom

Publishing organisation: BBC Shared Data Unit

Organisation size: Big

Cover letter:

The BBC Shared Data Unit has continued to promote and support data journalism at a regional level across the UK, whilst making national and local headlines with a series of public interest data investigations. The team’s collaborative model sees it share its data journalism with more than 1,000 local news outlets across television, radio, print and digital. The team’s work led to at least 280 print or online stories among our local news partners last year.

During the year, the team has also led free week-long masterclasses in investigative journalism for regional journalists, and hosted drop-in sessions over Zoom for journalists wanting to learn key technical skills. Participants develop the skills to source, clean, analyse and visualise datasets for their audiences, and explore more advanced techniques such as how to maximise their use of Freedom of Information laws, how to scrutinise financial data from Companies House accounts and how to use advanced internet searches. In June, we hosted our first-ever two-day coding course for journalists.

In May, we hosted a hack day inviting regional journalists to explore Twitter data, alongside MPs, academics and data scientists. Later in the year, we brought together data journalists from across the UK at the Data Journalism UK conference, which we co-hosted with Birmingham City University.

A commitment to transparency and the open data movement is at the heart of our team and our output. We have demonstrated how adopting open data principles enhances the trust in our reporting, builds a personal relationship with audiences and helps to engage them in the process of journalism. We demonstrate our commitment to transparency through sharing source data, methods and code for the majority of our projects.

A key part of the team’s editorial remit is to find news stories that are hiding in plain sight, using data journalism techniques to explore the impact of local and national government policies on the lives of individuals.

The team has demonstrated a wide range of technical skills across its portfolio. Techniques include accessing open data sources, using Freedom of Information laws, scraping data from websites and building our own databases. What matters is finding the right data to bring important information into the public record.

Description of portfolio:

In January, we revealed how NHS dentistry in the UK was “hanging by a thread”, with some patients facing two-year waits for routine check-ups.

First, we built a scraper in Python to find out how many practices in England were accepting NHS patients. More than 75% of all dental practices in England had out-of-date information – hinting at the scale of the problem.

Next, we turned to the annual dental workforce statistics. A time series – created with the programming language R – revealed an 8% drop in the number of NHS dentists last year. Breaking the data down by clinical commissioning group showed areas that had lost more than a quarter of their NHS dentists in the last year.
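The breakdown by area described above can be sketched as follows. This is an illustration only, not the team's code: the original analysis was done in R, whereas this sketch uses plain Python, and the area names, period labels and headcounts are all invented. It computes the percentage change in headcount for each area between two periods and flags areas that lost more than a quarter of their dentists.

```python
# Hypothetical headcounts of NHS dentists by area for two periods.
# Every name and figure here is invented for illustration.
headcounts = {
    "Area A": {"previous": 200, "latest": 140},
    "Area B": {"previous": 150, "latest": 147},
    "Area C": {"previous": 80, "latest": 84},
}


def percent_change(previous, current):
    """Return the percentage change from previous to current."""
    return (current - previous) * 100 / previous


def changes_by_area(data, start, end):
    """Map each area to its percentage change between two periods."""
    return {
        area: percent_change(periods[start], periods[end])
        for area, periods in data.items()
    }


def areas_losing_more_than(data, start, end, threshold_pct):
    """List areas whose headcount fell by more than threshold_pct percent."""
    changes = changes_by_area(data, start, end)
    return sorted(
        area for area, change in changes.items() if change < -threshold_pct
    )


changes = changes_by_area(headcounts, "previous", "latest")
flagged = areas_losing_more_than(headcounts, "previous", "latest", 25)
print(flagged)
```

With real workforce statistics, the same calculation would run over every clinical commissioning group rather than three made-up areas.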

We found stories of people using metal files and superglue on their own teeth, people in pain for over a year and dentists pushed to financial and mental breaking point.

Four days after publication, £50 million was released by Westminster for NHS dentistry.

In February 2022, we reported that around half of police employees who had committed gross misconduct since the formation of the latest police watchdog had not been dismissed.

In May 2022, we partnered with BBC Newsbeat to report on the gender imbalance among headline performers at UK music festivals.

We compiled and analysed a spreadsheet of the headline acts for 50 music festivals.

In August 2022, we reported how the share of homegrown doctors and nurses joining England’s NHS had reached its lowest point in seven years.

We analysed workforce data provided by NHS Digital showing the nationalities of joiners, leavers and staff in post in England’s NHS from 2015 to 2021.

We also submitted 27 FOI requests, covering every health board in Scotland and Wales and every health and social care trust in Northern Ireland.

In October 2022, we revealed a postcode lottery in terms of the amount of money NHS Clinical Commissioning Groups (CCGs) spend per woman aged 45-60 on common hormone replacement therapy (HRT) treatments.

In November, we collaborated with BBC Learning & Identity to look at the rise in the number of five- and six-year-olds who need speech and language support at school, finding that it had risen by 10% in England in the last year.

An R script was written in a Python notebook to download data, filter it, pivot it by local authority and year, and calculate year-on-year changes.
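A minimal sketch of the filter, pivot and year-on-year steps is shown here. It is not the team's script: the original was written in R, this version uses plain Python, the download step is left out, and the authority names, the "SLC" need label and every number are hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format rows: (local_authority, year, primary_need, pupils).
# All values are invented for illustration.
rows = [
    ("Authority A", 2021, "SLC", 100),
    ("Authority A", 2022, "SLC", 110),
    ("Authority A", 2022, "Other", 50),
    ("Authority B", 2021, "SLC", 200),
    ("Authority B", 2022, "SLC", 190),
]


def pivot_by_authority_and_year(records, need):
    """Keep rows for one type of need and pivot them to authority -> year -> pupils."""
    table = defaultdict(dict)
    for authority, year, primary_need, pupils in records:
        if primary_need == need:
            table[authority][year] = table[authority].get(year, 0) + pupils
    return dict(table)


def year_on_year_change(table):
    """Return the percentage change between each pair of consecutive years."""
    changes = {}
    for authority, by_year in table.items():
        years = sorted(by_year)
        changes[authority] = {
            later: (by_year[later] - by_year[earlier]) * 100 / by_year[earlier]
            for earlier, later in zip(years, years[1:])
        }
    return changes


pivot = pivot_by_authority_and_year(rows, "SLC")
yoy = year_on_year_change(pivot)
print(yoy)
```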

A second script downloaded data on pupil numbers, combined it with the figures on special educational needs (SEN), and divided the SEN figures by total pupil numbers to get a proportion.

This allowed us to test whether an increase in speech, language and communication (SLC) needs might simply be due to an increase in pupils (it was not).

A third script downloaded data on the numbers of pupils for whom English was a second language (ESL), to test whether an increase in speech, language and communication needs might be due to an increase in ESL pupils (it was not).
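The logic of these checks can be illustrated with a short sketch. It assumes invented totals and uses plain Python rather than the R used in the original analysis. If the rise in SLC needs were simply due to more pupils, the rate per 100 pupils would stay flat or fall while the raw count rose.

```python
# Hypothetical totals for two consecutive years; all figures are invented.
slc_pupils = {"previous": 1000, "latest": 1100}
all_pupils = {"previous": 50000, "latest": 50500}


def rate_per_100(counts, totals):
    """Express each period's count as a rate per 100 pupils."""
    return {period: counts[period] * 100 / totals[period] for period in counts}


def percent_change(previous, current):
    """Return the percentage change from previous to current."""
    return (current - previous) * 100 / previous


rates = rate_per_100(slc_pupils, all_pupils)
count_growth = percent_change(slc_pupils["previous"], slc_pupils["latest"])
pupil_growth = percent_change(all_pupils["previous"], all_pupils["latest"])
rate_growth = percent_change(rates["previous"], rates["latest"])

# If the rate itself has not risen, growth in pupil numbers accounts for the increase.
explained_by_pupil_growth = rate_growth <= 0
print(count_growth, pupil_growth, round(rate_growth, 2), explained_by_pupil_growth)
```

A comparable check for the ESL question would set the growth in SLC needs against the growth in ESL pupil numbers in the same way.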

Also in November, we reported how thousands of disabled people had had their benefits paused during extended hospital stays under a rule which charities said penalised the most vulnerable.

Later in the month, we shared an analysis of almost 3 million tweets mentioning politicians’ Twitter accounts. More than 3,000 tweets that a Google algorithm classified as “toxic” were sent to UK Members of Parliament every day. The project used “big data” tools and artificial intelligence to analyse the data.

In December 2022, we looked at the patchwork provision of holiday food vouchers being offered to families with children in receipt of Free School Meals.

Project links: