Exploratory Data Analysis of Road Accidents in France - 2021

Viktoriia Untilova
Python in Plain English
5 min read · Jan 6, 2023


In this short article, I share the results of my Exploratory Data Analysis (EDA), done with Python and Pandas, of the data set containing full information about road accidents in France in 2021.

Photo by Karsten Würth on Unsplash

This is a great chance to combine Python programming and data science, discover a little bit of France and, of course, remind ourselves to be careful on the road!

In this EDA we will answer the following questions:

  • How many accidents occurred in each French department?
  • Is there a link between the day of the week and accident frequency?
  • What percentage of the people involved were killed? How many of them were pedestrians?
  • What is the age distribution of the people involved?
  • Where did the most dramatic accidents, those with more than two deaths, happen?
  • What percentage of the people involved in an accident survived?
  • What do the accident locations look like on a map?

Data description

The data was taken from the French Government's open data platform. The data set contains information on the places, users, vehicles and general characteristics of the incidents. My Jupyter notebook is available on GitHub in case you want to follow the technical part or perform some additional calculations.
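Since the information is spread over several tables, a typical first step is to join them on a shared accident identifier. The snippet below is only a minimal sketch on toy data: the column names `Num_Acc`, `dep` and `grav` are assumptions about the schema, not something confirmed in this article, and the real tables would be read from the downloaded files rather than typed in.

```python
import pandas as pd

# Toy stand-ins for two of the tables (general characteristics and users).
# Column names are assumed for illustration only.
characteristics = pd.DataFrame({
    "Num_Acc": [1, 2, 3],          # accident identifier (assumed name)
    "dep": ["75", "13", "69"],     # department code (assumed name)
})
users = pd.DataFrame({
    "Num_Acc": [1, 1, 2, 3, 3, 3],
    "grav": [1, 4, 2, 1, 3, 4],    # injury severity code (assumed name)
})

# One row per user, enriched with the characteristics of their accident.
# validate= raises an error if an accident id is duplicated on the right side.
users_full = users.merge(
    characteristics, on="Num_Acc", how="left", validate="many_to_one"
)
print(users_full)
```

A left join keeps every user even if the matching accident record is missing, which makes gaps in the data visible as NaN values instead of silently dropping rows.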

Exploratory Data Analysis

Let’s start by exploring the characteristics of the incidents to see the number of accidents per French department. Metropolitan France is divided into 13 regions, which are in turn divided into departments. There are 96 departments, excluding the overseas territories. [1]

Figure 1 depicts the incident frequency. The highest numbers of accidents were recorded in Paris, Seine-Saint-Denis, Bouches-du-Rhône, Val-de-Marne, Rhône and Hauts-de-Seine. The capitals of these departments are, respectively, Paris, Bobigny, Marseille, Créteil, Lyon and Nanterre. It is worth mentioning that, among these six, only Bouches-du-Rhône (Marseille) and Rhône (Lyon) lie outside the Paris area; the other four are Paris itself and the departments adjacent to it!

Figure 1. Number of accidents per French department in 2021
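A count like the one in Figure 1 can be obtained with a single `value_counts` call. This is a hedged sketch on toy data, assuming one row per accident and a department column named `dep` (an assumed name); the actual notebook may proceed differently.

```python
import pandas as pd

# Toy data: one row per accident, with an assumed department column `dep`.
accidents = pd.DataFrame({
    "Num_Acc": [1, 2, 3, 4, 5, 6],
    "dep": ["75", "75", "75", "93", "93", "13"],
})

# value_counts sorts in descending order, so the busiest departments come first.
per_department = accidents["dep"].value_counts()
print(per_department.head(2))
```

Calling `per_department.plot(kind="bar")` would then give a bar chart similar in spirit to Figure 1.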

Let’s analyze the departments of Paris, Marseille, Lyon and Bordeaux with regard to the day of the week on which the accidents occurred.

Figure 2. Number of accidents per department grouped by the day of the week for Paris, Marseille (Bouches-du-Rhône), Lyon (Rhône) and Bordeaux (Gironde)

Regardless of the department, we observe a very slight peak of accidents on Friday, a slight decrease until Sunday and a clear increase on Monday.
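The weekday grouping behind Figure 2 can be sketched as follows. The example assumes the date is stored in separate year, month and day columns named `an`, `mois` and `jour` (assumed names), and uses four made-up accidents purely for illustration.

```python
import pandas as pd

# Toy accident records; all column names are assumptions.
accidents = pd.DataFrame({
    "dep": ["75", "75", "13", "13"],
    "an": [2021, 2021, 2021, 2021],
    "mois": [1, 1, 1, 1],
    "jour": [1, 8, 4, 3],
})

# Assemble a proper date from the year/month/day columns, then name the weekday.
date_parts = accidents[["an", "mois", "jour"]].rename(
    columns={"an": "year", "mois": "month", "jour": "day"}
)
accidents["weekday"] = pd.to_datetime(date_parts).dt.day_name()

# Departments in rows, weekdays in columns, in calendar order.
weekday_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]
by_weekday = (
    pd.crosstab(accidents["dep"], accidents["weekday"])
    .reindex(columns=weekday_order, fill_value=0)
)
print(by_weekday)
```

Reindexing the columns keeps the days in calendar order and shows days without accidents as zeros, which makes the rows directly comparable between departments.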

As for the main conditions, most of the accidents happened on roads with two-way traffic, on flat and straight sections and in normal weather conditions.

Figure 3. Main characteristics: circulation regime (a), weather conditions (b), road type (c) and trip reason (d)

In France there are several types of roads: Municipal (Communal), Departmental, National, Controlled Access, European Roads and Highways.

By length, most roads are communal roads, followed by departmental roads with about half that length. The total length of highways is about 30 times (!) shorter than that of departmental roads. This may feel counterintuitive, but highways indeed represent only ~1.05 % of the total road network: of the 1.1 million kilometers of roads in France in 2018, only 11 670 kilometers were highways. [2] So it is no wonder that most of the accidents happened on communal and departmental roads. However, despite the difference in length, both account for an approximately equal number of cases.
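The highway share can be checked with the two lengths quoted above. Because the total is rounded to 1.1 million kilometers, the result lands close to, but not exactly on, the ~1.05 % figure.

```python
# Lengths quoted in the text (2018 figures from reference [2]).
total_km = 1_100_000   # rounded total length of the road network
highway_km = 11_670    # total length of highways

highway_share = highway_km / total_km * 100
print(f"Highways: {highway_share:.2f} % of the network")  # prints about 1.06 %
```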

Now, we will switch to the users involved in road incidents.

In 2021, 3 219 people died in road accidents, 459 of whom were pedestrians. Fortunately, the deaths represent only 2.5 % of all the people involved in a road accident. Figure 4 depicts the age distribution of the people involved. The left panel shows the age distribution for the whole data set. It is right-skewed, which might be explained by the inattention of young drivers and pedestrians. The number of people involved in accidents clearly decreases with age.
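Using only the figures quoted above, a quick calculation gives the pedestrian share among the deaths and a rough estimate of how many people were involved in total. The second value is only approximate, because it is derived from the rounded 2.5 %.

```python
# Figures quoted in the text.
deaths_total = 3219
deaths_pedestrians = 459
death_share_pct = 2.5   # deaths as a percentage of everyone involved (rounded)

# Share of pedestrians among the people killed.
pedestrian_share = deaths_pedestrians / deaths_total * 100

# Rough estimate of the total number of people involved, implied by the 2.5 %.
implied_total_involved = deaths_total / (death_share_pct / 100)

print(f"Pedestrians among deaths: {pedestrian_share:.1f} %")
print(f"People involved (approx.): {implied_total_involved:,.0f}")
```

So roughly one in seven people killed on the road was a pedestrian.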

Figure 4. Age distribution for the whole data set (left) and for the accidents with at least one death (right)

The distribution for fatal accidents (Figure 4, yellow) has a slightly different shape. Between 30 and 70 years, there is a kind of plateau with a more or less constant incident rate. Below 20 years, the curve follows the shape of the overall distribution. It turns out that 31.0 % were under 30 years old!
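An age distribution of this kind is usually derived from the year of birth. The sketch below uses toy data and assumes a birth-year column named `an_nais` (an assumed name); it computes the age in 2021, the share of people under 30 and ten-year bins like those behind a histogram.

```python
import pandas as pd

# Toy data: year of birth of six people (assumed column name `an_nais`).
users = pd.DataFrame({"an_nais": [2003, 1995, 1980, 1960, 1999, 1950]})

# Age at the time of the accident, taking 2021 as the reference year.
users["age"] = 2021 - users["an_nais"]

# Share of people under 30, in percent.
share_under_30 = (users["age"] < 30).mean() * 100

# Ten-year age bins [0, 10), [10, 20), ... for a histogram-like table.
bins = list(range(0, 101, 10))
age_hist = pd.cut(users["age"], bins=bins, right=False).value_counts().sort_index()

print(f"Under 30: {share_under_30:.1f} %")
print(age_hist)
```

Applying the same steps to a subset restricted to fatal accidents would give the counterpart of the right panel.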

The 1036 locations where fatal incidents happened are visualized below: those with at least one death are shown in red and those with more than two deaths, corresponding to 151 locations, in blue. It is noteworthy that the latter happened predominantly during the day or on roads without street lighting!

Figure 5. Geographical locations of accidents with at least 1 death (red) and more than 2 deaths (blue)
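The two groups of points in Figure 5 can be selected by counting the deaths per accident. This is a minimal sketch on toy data; the column names `Num_Acc`, `grav`, `lat` and `long`, and the use of code 2 for "killed", are assumptions made for illustration.

```python
import pandas as pd

# Toy user table: severity code 2 is assumed to mean "killed".
users = pd.DataFrame({
    "Num_Acc": [1, 1, 1, 2, 3, 3],
    "grav":    [2, 2, 2, 2, 1, 4],
})
# Toy accident locations (assumed column names, made-up coordinates).
locations = pd.DataFrame({
    "Num_Acc": [1, 2, 3],
    "lat": [48.86, 43.30, 45.76],
    "long": [2.35, 5.37, 4.84],
})

# Number of deaths per accident.
deaths = users["grav"].eq(2).groupby(users["Num_Acc"]).sum()

# Accidents with at least one death, and those with more than two deaths.
fatal_ids = deaths[deaths >= 1].index
multi_ids = deaths[deaths > 2].index

fatal_points = locations[locations["Num_Acc"].isin(fatal_ids)]
multi_points = locations[locations["Num_Acc"].isin(multi_ids)]
print(fatal_points)
print(multi_points)
```

The two resulting tables of coordinates can then be passed to the mapping or plotting library of your choice, one color per group.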

Let’s calculate the percentage of road users who survived an accident, injured or not. They represent 97.5 % of those involved, compared with the 2.5 % who passed away!

The good news is that 54.8 % of those involved were injured but survived, and 42.7 % were free from any injuries!
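Reading the figures above as 54.8 % injured survivors plus 42.7 % unharmed, they add up to the 97.5 % survival share. A breakdown of this kind can be sketched with `value_counts(normalize=True)`; the severity column `grav` and its codes (1 unharmed, 2 killed, 3 hospitalized, 4 lightly injured) are assumptions here, and the ten toy rows are made up.

```python
import pandas as pd

# Toy data: one severity code per person involved (assumed coding).
users = pd.DataFrame({"grav": [1, 1, 1, 1, 4, 4, 3, 3, 3, 2]})
labels = {1: "unharmed", 2: "killed", 3: "hospitalized", 4: "lightly injured"}

# Percentage of people in each severity class.
severity_pct = users["grav"].map(labels).value_counts(normalize=True).mul(100)

# Survivors are everyone who was not killed; injured survivors are the two injury classes.
survived_pct = 100 - severity_pct["killed"]
injured_survived_pct = severity_pct["hospitalized"] + severity_pct["lightly injured"]

print(severity_pct.round(1))
print(f"Survived: {survived_pct:.1f} %, of which injured: {injured_survived_pct:.1f} %")
```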

I hope you enjoyed this descriptive analysis! Feel free to leave comments below, subscribe and/or check out my Jupyter notebook to explore the data set further!

References:

[1] - https://www.france-pub.com/departements.php

[2] - https://www.statista.com/statistics/759437/length-roads-by-road-type-france/
