The number of people injured or killed in traffic accidents in the Netherlands is rising. The number of casualties is twice as high as the government’s target. This needs to change, say experts, who argue that the places where the risk of an accident is highest need to be tackled. But which places are those? Data journalists at RTL Nieuws investigated this through extensive data analysis. By linking accidents, traffic volume and street lengths, they calculated the risk of an accident for almost 10,000 residential areas.
The Dutch government has been struggling with road safety for years. Targets to reduce the number of people killed or injured in accidents are far from being met. Moreover, figures from the Dutch Central Bureau of Statistics reveal that 82 percent of Dutch citizens experience traffic problems in their own area at some point. Almost 1 in 3 experience severe nuisance from traffic.
Experts speaking to RTL Nieuws are unanimous in their verdict: the trends are worrying, and the numbers need to go down. But how? According to the experts, it is important to identify the places with the highest risk of accidents and target those locations. With that in mind, the data journalists of RTL Nieuws set to work. The result is a large investigation in which data from various sources were linked to get a clear picture of the risk of a serious accident in almost 10,000 areas in the Netherlands. They looked not only at the number of accidents, but also at traffic volume. Ten serious accidents in a busy area pose less of a risk than the same number in a quiet area.
This is the first time anyone has looked at traffic safety in residential areas in the Netherlands in this way. A professor of Transport Policy at the TU Delft followed the investigation. The results were presented in a TV broadcast and various online articles, enabling the public to see the score of the area where they live. RTL Nieuws received a lot of reactions after publication, both from citizens who recognized themselves in the results and from policy makers, traffic experts and university researchers who wish to use the results to address dangerous places.
The great thing about data journalism is that you can process things in a hundred different ways with a variety of tools and techniques. Many of those tools and techniques were used in this investigation. To collect and organise the data on the locations of tens of thousands of traffic accidents, we used R and Excel. We obtained data on road length and traffic intensity from the databases of TomTom as 12 separate GeoJSON files. The data had to be split this way because the annual intensity and road lengths of hundreds of thousands of road sections across the entire Dutch road network are too extensive to retrieve in one go. The files were organised in R.
Subsequently, the data from the 12 files were combined in QGIS, and both the locations of accidents and the number of accidents per location were added. These layers of data were overlaid on a map of Dutch residential areas from the Dutch Central Bureau of Statistics. On that basis, the number of accidents, the road lengths and the traffic intensity were calculated per area.
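The per-area totals described here can be sketched in a few lines of Python. This is only an illustration, not the QGIS workflow we used: it assumes every accident and road section has already been tagged with an area code, and the field names (area, count, length_km, vehicles_per_year) as well as the figures are made up for the example.

```python
from collections import defaultdict


def aggregate_per_area(accidents, road_segments):
    """Sum accidents, road length and vehicle-kilometres per area code."""
    totals = defaultdict(
        lambda: {"accidents": 0, "road_km": 0.0, "vehicle_km": 0.0}
    )
    for accident in accidents:
        totals[accident["area"]]["accidents"] += accident["count"]
    for segment in road_segments:
        area = totals[segment["area"]]
        area["road_km"] += segment["length_km"]
        # Exposure: vehicles per year times the length they travel on this section.
        area["vehicle_km"] += segment["length_km"] * segment["vehicles_per_year"]
    return dict(totals)


# Made-up records, standing in for the output of the spatial join.
accidents = [
    {"area": "A", "count": 2},
    {"area": "A", "count": 1},
    {"area": "B", "count": 4},
]
road_segments = [
    {"area": "A", "length_km": 0.5, "vehicles_per_year": 100_000},
    {"area": "A", "length_km": 1.5, "vehicles_per_year": 20_000},
    {"area": "B", "length_km": 2.0, "vehicles_per_year": 400_000},
]

per_area = aggregate_per_area(accidents, road_segments)
for code, values in sorted(per_area.items()):
    print(code, values)
```

Multiplying length by intensity per road section, rather than averaging the intensity over an area, keeps short busy streets from being drowned out by long quiet ones.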
We then calculated the risk of accidents per area, according to an existing scientific methodology.
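To give an idea of what such a calculation looks like, here is a minimal sketch. It assumes a common exposure-based measure, accidents per million vehicle-kilometres; this is an assumption for illustration and not necessarily the exact formula of the methodology we followed. The function name and the figures are hypothetical.

```python
def accident_risk(accidents, vehicle_km, per=1_000_000):
    """Return accidents per `per` vehicle-kilometres, or None without traffic data."""
    if vehicle_km <= 0:
        return None  # no exposure known, so no meaningful risk figure
    return accidents * per / vehicle_km


# The same ten serious accidents, set against very different traffic volumes.
busy_area = accident_risk(10, 50_000_000)   # busy area: 50 million vehicle-km
quiet_area = accident_risk(10, 2_000_000)   # quiet area: 2 million vehicle-km

print(f"busy area:  {busy_area:.2f} accidents per million vehicle-km")
print(f"quiet area: {quiet_area:.2f} accidents per million vehicle-km")
```

With these made-up numbers the quiet area comes out 25 times riskier than the busy one, even though the accident count is identical.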
To present the data in a way that is comprehensible and attractive to the public, we converted the risk of accidents into a star rating based on the deviation from the average. The data were mapped per area with the aid of Flourish tools. In addition, using all the data and an editing robot that we developed ourselves, ADAM (Automatic Data Article Machine), an article was written with the relevant results for the specific areas in all 2,500 municipalities in the Netherlands. A Python script was written to achieve this.
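A conversion of this kind could look roughly as follows. The cut-off values, the unweighted average, the area names and the template sentence are all assumptions made up for this sketch; they are not the thresholds we published, and the sentence function is not ADAM itself.

```python
from statistics import mean

# Hypothetical cut-offs: risk as a fraction of the average, mapped to stars.
THRESHOLDS = [(0.5, 5), (0.8, 4), (1.2, 3), (1.5, 2)]


def star_rating(risk, average_risk):
    """Map an area's risk to 1-5 stars by its ratio to the average risk."""
    if average_risk <= 0:
        raise ValueError("average_risk must be positive")
    ratio = risk / average_risk
    for upper_bound, stars in THRESHOLDS:
        if ratio <= upper_bound:
            return stars
    return 1  # far above average: lowest score


def summary_sentence(area_name, stars):
    """Fill a fixed template, the basic idea behind an automated article."""
    return f"{area_name} scores {stars} out of 5 stars for traffic safety."


# Made-up risk figures for three fictional areas.
risks = {"Centrum": 0.4, "Noord": 1.0, "Zuid": 1.6}
average = mean(risks.values())
ratings = {name: star_rating(risk, average) for name, risk in risks.items()}

for name, stars in ratings.items():
    print(summary_sentence(name, stars))
```

Working with the ratio to the average, rather than the raw risk figure, means the rating says how an area compares to the rest of the country, which is easier for readers to interpret.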
What was the hardest part of this project?
An investigation such as this is based on many different choices. Some are small and some are big, but every choice ultimately affects the end result. With every step we took during this project, we asked ourselves whether it was the right one. We were constantly updating and refining. Take, for example, the fact that national highways cut through Dutch residential areas without being connected to them. The traffic intensity on those roads is very high, while they have no impact on traffic safety in the residential area itself. We therefore decided not to take national highways into account in our investigation, in order to create the clearest possible picture of all areas.
We are well aware that we are data journalists, not traffic experts. That is why we sought out Bert van Wee, professor of Transport Policy at the TU Delft, at an early stage. He served as back-up when decisions had to be made. This way, we could present and test the choices we made and the formulas we used, as well as our calculations and interpretations.
We should also mention that we cooperated with a third party. Not every company is eager to share its data with a news organisation. For our investigation, however, it was crucial to be able to use valid data on traffic volume. City councils sometimes carry out research on traffic intensity and those data are available, but they are limited to the specific locations that were assessed. The floating car data from TomTom navigation systems cover the whole country and also include intensity data for less busy roads in towns and villages. Thanks to a very detailed plan, and after months of communication, we were finally able to convince TomTom to share their data with us.
What can others learn from this project?
Cooperate with other data journalists, but also with experts. Together you can do more than on your own. By cooperating you reduce the risk of mistakes and strengthen the investigation. Also, give your plan time to take shape in your head. As investigative journalists, we are able to spend more time than others on a particular subject. That is a privilege, but it should be handled wisely. Make sure to reach out to other parties in time and have a good story to tell if you wish to use their data.
The sheer amount of data that is sometimes available can make you lose sight of what is important. Make sure you conduct a thorough preliminary investigation. What is the problem? How bad is it? What is known about it already? What do experts say? What exactly do I want to know? These are all questions you need to ask yourself before you even start gathering, analysing or visualising data. They will help you throughout your investigation.