Analysis of historical road accident data supporting autonomous vehicle control strategies


Sándor Szénási1,2

1 Faculty of Economics and Informatics, J. Selye University, Komárno, Slovakia
2 John von Neumann Faculty of Informatics, Óbuda University, Budapest, Hungary

ABSTRACT
It is expected that most accidents occurring due to human mistakes will be eliminated
by autonomous vehicles. Their control is based on real-time data obtained from
the various sensors, processed by sophisticated algorithms and the operation of
actuators. However, it is worth noting that this process flow cannot handle
unexpected accident situations like a child running out in front of the vehicle or an
unexpectedly slippery road surface. A comprehensive analysis of historical accident
data can help to forecast these situations. For example, it is possible to localize
areas of the public road network, where the number of accidents related to careless
pedestrians or bad road surface conditions is significantly higher than expected.
This information can help the control of the autonomous vehicle to prepare for
dangerous situations long before the real-time sensors provide any related
information. This manuscript presents a data-mining method working on the already
existing road accident database records to find the black spots of the road network.
As a next step, a further statistical approach is used to find the significant risk
factors of these zones. The results can be built into the control strategy of self-driven
cars to prepare them for these situations and to decrease the probability of
further incidents. The evaluation part of this paper shows that the
robustness of the proposed method is similar to the already existing black spot
searching algorithms. However, it provides additional information about the main
accident patterns.

Subjects Autonomous Systems, Data Mining and Machine Learning, Spatial and Geographic
Information Systems
Keywords Data mining, DBSCAN, Road accident, Statistics, Autonomous vehicle, Road safety

INTRODUCTION
Human drivers have many disadvantages compared to autonomous vehicles (slower
reaction time, inattentiveness, variable physical condition) (Kertesz & Felde, 2020).
Nevertheless, they can often perform better (Chatterjee et al., 2002) in some unexpected
situations like a child running out in front of the vehicle because, beyond the information
gained in real time, they may have specific knowledge about a given location (linked
to the previous example, the human driver may know that there is a playground without
a fence near the road; therefore, the appearance of a child is not unexpected). Drivers
also have some incomplete but useful historical knowledge about accidents and they can
build this information into their driving behavior. If they know that there were several

How to cite this article Szénási S. 2021. Analysis of historical road accident data supporting autonomous vehicle control strategies. PeerJ
Comput. Sci. 7:e399 DOI 10.7717/peerj-cs.399

Submitted 13 October 2020
Accepted 28 January 2021
Published 23 February 2021

Corresponding author
Sándor Szénási,
szenasi.sandor@nik.uni-obuda.hu

Academic editor
Chintan Amrit

Additional Information and
Declarations can be found on
page 22

DOI 10.7717/peerj-cs.399

Copyright
2021 Szénási

Distributed under
Creative Commons CC-BY 4.0



pedestrian collisions somewhere, they will decrease their speed and try to be more attentive
even without any real-time warning signal. Thanks to this behavior, they can prepare for and
avoid some types of accidents, which would not be possible without this historical data.
Another example might be a road section that is usually extremely slippery on rainy
days. Real-time sensors can detect slipping only when it is too late to avoid the
consequences. Historical accident data can help to prepare the car for these
unexpected situations.

We propose the following consecutive steps to integrate historical data into the control
algorithm for autonomous devices:

1. Localize accident black spots in an already existing accident database, using statistical or
data-mining methods;

2. Determine the common reasons for these accidents with statistical analysis or pattern
matching;

3. Specify the necessary preventive steps to decrease the probability of further accidents.

This article mainly focuses on the first two steps because the third one largely depends
on the limits and equipment of the self-driven car. For example, in the case of dangerous
areas, is it possible to increase the power of the lights to make the car more visible? Or, in
the case of a high chance of pedestrian accidents, is it possible to increase the volume of
the artificial engine sound to discourage careless road crossing? Can the car change the
suspension settings to prepare for potentially dangerous road sections? The scope of this
paper is the development of the theoretical background to support these preliminary
protection activities.

The appropriate preliminary actions may significantly decrease the number and severity
of road accidents. For example, Carsten & Tate (2005) present a model for the relationship
between changes in vehicle speed and the number of accidents. This model (based on
the national injury database of Great Britain and built to predict the effects of speed on
road accidents) shows that for each 1 km/h change in mean speed, the best estimate of
the change in accident risk is 3%. Accordingly, it is worth making assumptions
about the dangerous areas and adapting the control of the autonomous cars to these
predictions.

BACKGROUND
Black spot identification
Black spot management (identification, analysis, and treatment of black spots in the
public road network) is one of the most important tasks of road safety engineers.
The identification of these extremely hazardous sections is the first step to prevent further
accidents or to reduce their severity. It is a heavily researched area, and
there are several theoretical methods for this purpose.

Although black spot management has a long tradition in traffic engineering, there is no
generally accepted definition of road accident black spots (also known as hot spots);
the official definition varies by country. It follows that the method used to find these
hazardous locations also varies by country. For example, by the definition of the
Hungarian government, outside built-up area black spots are defined as road sections
no longer than 100 meters where the number of accidents during the last three years is
at least 3. According to this, road safety engineers use simple threshold-based methods
(for example, the traditional sliding window technique) to find these areas. Switzerland
uses a significantly different definition as black spots are sections of the road network
(or intersections) where the number of accidents is “well above” the number of accidents
at comparable sites. The key difference is the term “comparable sites” because these
advanced comparative methods do not classify each road segment in isolation but
compare it to similar areas.

There are some general attributes of accident black spots to overcome the
conceptual confusion. These are usually well-defined sections or intersections of the
public road network, where road accidents are historically concentrated (Elvik, 2008;
Delorme & Lassarre, 2014; Murray, White & Ison, 2012; Montella et al., 2013;
Hegyi, Borsos & Koren, 2017). Nowadays, road accidents are monitored by the
governments and all data about accidents are stored in large, reliable and partially
public databases (without any personal information about the participants).
Much data about the road network is also available (road layout, speed limits, tables,
etc.). As a result, road safety engineers can use several procedures from various fields
(statistics, data mining, pattern recognition) to localize accident black spots in these
databases.

It is a common assumption that the number of accidents is significantly higher at
these locations compared to other sections of the road network. However, this alone is
neither a necessary nor sufficient condition. The variation of the average yearly accident
count of road sections is relatively high compared to the number of accidents. Because of
this, the regression to the mean effect can distort the historical data. A given section
with more accidents than average is not necessarily an accident black spot. The converse is
also true, as there may be true black spots with relatively few accidents for a given year.
Although this deficiency is theoretically proven, most black spot identification
methods are still based on the accident numbers of the last few years, simply because this is
the best place to start a detailed analysis.

Nevertheless, it is always worth keeping in mind that these locations are just black spot
candidates, and further examination is needed to make the right decision concerning them.
The best way to do this is via a detailed scene investigation, but it is very expensive and
time-consuming. Another theoretical approach can be the analysis of accident data to find
some irregular patterns and identify one or more risk factors causing these accidents.
Without these, it is possible that the higher frequency of accidents is purely coincidental at
a given location and time.

To localize potential accident black spots, the most traditional procedure is the sliding
window method (Lee & Lee, 2013; Elvik, 2008; Geurts et al., 2006). The input parameters of

the process are the section length and a threshold value. The method is based on the
following:

1. Divide the selected road into small uniform sized sections;

2. Count the number of accidents that have occurred in the last few years for each section;

3. Flag the segments where this number is higher than a given threshold as potential black
spots.
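
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming that each accident is given as a position in meters measured along a single road; the function name, the sample positions, and the parameter values are hypothetical and do not come from any official procedure.

```python
from collections import Counter

def sliding_window_candidates(positions_m, section_length_m, threshold):
    """Return (start_m, end_m, count) for fixed, non-overlapping sections
    whose accident count is higher than the threshold."""
    # Steps 1 and 2: assign every accident to a uniform section and count them.
    counts = Counter(int(p // section_length_m) for p in positions_m)
    # Step 3: flag the sections where the count exceeds the threshold.
    return [
        (idx * section_length_m, (idx + 1) * section_length_m, n)
        for idx, n in sorted(counts.items())
        if n > threshold
    ]

# Hypothetical accident positions (meters from the start of the road).
accidents = [120, 150, 180, 950, 1010, 1040, 1060, 1090, 2500]
candidates = sliding_window_candidates(accidents, section_length_m=100, threshold=2)
```

With these sample values, the sections 100-200 m and 1000-1100 m are flagged. A variant with overlapping windows would advance the window start by a step smaller than `section_length_m` instead of using fixed section indices.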

There are many variants of the traditional sliding window method
(Anderson, 2009; Szénási & Jankó, 2007). A potential alternative is to use a variable window
length. One of its advantages is that it is unnecessary to choose a single window length; it is
sufficient to give a minimal and a maximal value. The method can try several window
lengths to find the largest black spots possible. Due to this modification, it can find small
local black spots and larger ones too. The traditional sliding window method uses non-
overlapping segments, but it is also possible to slide the window with smaller steps than the
window size. This leads to a more sensitive method, which can find more black spot
candidates. However, it is also necessary to manage the overlapping black spots
(considering these as one big cluster, or multiple distinct ones). It is worth mentioning that
the method has some additional advantages: it has very low computational demand
(compared to the alternatives) and is based only on the road accident database.

The sliding window method is one of the first widely used procedures; therefore,
it is based on the traditional road number + section number positioning system
(for example, the accident location is Road 66, 12+450 kilometer+meter). This traditional
positioning system was the only real alternative in the past. However, in the last decades,
the spread of GPS technology has made it possible to collect the spatial coordinates of
accidents. This step has several benefits (faster and more accurate localization) but also
requires rethinking the already existing methods. It is possible to extend the sliding
window method to a two-dimensional procedure, but this is not widely used. It is preferable
to seek out more applicable methods fitting the spatial systems given by the
GPS coordinates.

In this field, Kernel Density Estimation (KDE) methods are among the most popular
spatial data analysis techniques (Bíl, Andrášik & Janoška, 2013; Flahaut et al., 2003;
Anderson, 2009; Yu et al., 2014; Toran & Moridpour, 2015). These have been employed in
many research projects to analyze road accidents. KDE methods have the advantages
of simple implementation and easy understanding. They also naturally
handle the noise of the data (caused by the inaccuracy of GPS devices). In general, KDE is
used to estimate the probability density function of a random variable. From
the safety experts’ point of view, the result of the KDE method is the accident density
estimation at a given reference point. The procedure has several parameters, like the search
radius distance from the reference point (bandwidth or kernel size) and the kernel function.
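
As an illustration of how the bandwidth and the kernel function enter the estimate, the following minimal sketch evaluates a two-dimensional KDE at one reference point. It assumes a Gaussian kernel and planar coordinates in meters; both are choices made for this example, not parameters taken from the cited studies, and the function name is hypothetical.

```python
import math

def gaussian_kde_2d(reference, accidents, bandwidth_m):
    """Estimate the accident density at `reference` (x, y in meters)
    using a 2D Gaussian kernel with the given bandwidth."""
    rx, ry = reference
    total = 0.0
    for x, y in accidents:
        # Squared distance scaled by the bandwidth (search radius).
        u2 = ((x - rx) ** 2 + (y - ry) ** 2) / bandwidth_m ** 2
        # Standard bivariate Gaussian kernel value.
        total += math.exp(-0.5 * u2) / (2.0 * math.pi)
    return total / (len(accidents) * bandwidth_m ** 2)

# Hypothetical accident coordinates: three close together, one far away.
accidents = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0), (400.0, 400.0)]
near = gaussian_kde_2d((2.0, 2.0), accidents, bandwidth_m=50.0)
far = gaussian_kde_2d((200.0, 200.0), accidents, bandwidth_m=50.0)
```

In this example `near` is larger than `far`, because three of the four accidents lie within a fraction of the bandwidth of the first reference point. A larger bandwidth smooths the surface and hides small local concentrations.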

Several researchers recommend the use of empirical Bayesian methods combining
the benefits of the predicted and historical accident frequencies. These models usually
analyze the distributions of the already existing historical data from several aspects, and

give predictions about the expected accident state. In the Empirical Bayesian method, the
existing historical accident count and the expected accident count predicted by the model
are combined as a weighted sum (Ghadi & Török, 2019). Consequently, this process
requires an accurate accident prediction model.
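
The weighted combination can be written as a one-line function. The sketch below is an assumption-laden illustration: the text above does not say how the weight is chosen, so it is simply passed in as a parameter between 0 and 1, and the function name and sample numbers are hypothetical.

```python
def empirical_bayes_estimate(observed, predicted, weight):
    """Combine the observed accident count with the model prediction.

    `weight` is the share given to the model prediction (0..1); how it is
    derived is left open here and is an assumption of this sketch.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * predicted + (1.0 - weight) * observed

# Hypothetical site: 6 observed accidents, 2.0 predicted by the model.
estimate = empirical_bayes_estimate(observed=6, predicted=2.0, weight=0.25)
```

With these values the estimate is 5.0, which lies between the observed and the predicted counts and is pulled toward the prediction as the weight grows.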

Another group of already available methods is based on clustering techniques.
These procedures are from the field of data mining, where clustering is one of the widely
used unsupervised learning methods. In this context, a cluster is a group of items,
which are similar to each other and differ from items outside the cluster. Accidents
with similar attributes (where properties can be the location and/or another risk factor(s))
can be considered as one cluster, using this concept in the field of black spot searching.
Most studies use the basic K-means clustering method (Mauro, De Luca & Dell’Acqua,
2013), but there are also some fuzzy-based C-means solutions.

As already mentioned, the results of the methods described above are just a set of black
spot candidates. Each candidate needs further analysis to decide whether it is a
real accident black spot and whether it requires any action.
This is the point where our research turns away from traditional road safety management
work (identification and elimination of black spots). Based on the collected clusters,
road safety engineers must select the black spot candidates having the largest safety
potential, which is based on the prediction of the effect of the best available preventive
action (cost of the local improvement activity compared to the expected benefits in the
number and severity of further accidents). From the perspective of autonomous car
control, the role of this safety potential is not essential. The self-driven car has no options
to solve road safety problems. The only important information is the existence of accident
black spots and the potential safety mechanisms, which may help to avoid further crashes.
As a second difference, from the road safety engineers’ point of view, it is not necessary
that the accidents of a given black spot have common characteristics. The hot spot
definition of this paper assumes that accidents of a given cluster have similar attributes
because this pattern will be the basis of the preventive actions.

The localization of accident black spot candidates is a heavily researched area and there
are several fully-automated methods to find these. Nevertheless, the further automatic
pattern analysis of these is not as well developed. This phase usually needs a great deal
of manual work by human road safety experts (they must travel to the scene and
investigate the environment to support their decisions about recommended actions).
Although this process can be supported by some general rules, it is mostly done manually
using the pattern matching capability of the human mind. To make this method applicable
to self-driven cars, it is necessary to fully automate it.

Accordingly, this paper focuses on helping autonomous vehicles to
take the appropriate preventive actions to avoid accidents:

• Localize black spot candidates using a historical accident database;
• Make assumptions about the common risk factors and patterns of these accidents;
• According to these preliminary results, the autonomous device will know where the
dangerous areas are and what preventive actions to take.


Automated accident prevention
Autonomous vehicles will have several ways to avoid accidents; therefore, automated
accident prevention is a hot, widely researched topic. Nevertheless, most papers deal with options existing only in
the far future when autonomous devices will be a part of a densely connected network
without any human interference. Real-world implementations are far from this point, but
some technologies already exist, although they are not closely related to autonomous
vehicles. Currently implemented accident prevention systems are built into traditional
cars as braking assistants, etc. However, it is worth considering these because such
methods will be the predecessors of the future techniques applicable to self-driven vehicles.

The two main classes of accident prevention systems are passive and active methods.
Passive systems send warnings to the driver but do not perform
any active operations. In contrast, active methods are allowed to perform
interventions (braking, steering, etc.) to avoid accidents. It seems obvious that these
prevention systems have a large positive impact on accident prevention, and it has
already been shown by Jermakian (2011) that passive methods have significant benefits.
More than one million vehicle crashes are prevented in the USA each year.
As Harper, Hendrickson & Samaras (2016) showed, the cost-benefit ratio of these
systems is also positive.

Brake assist systems are one of the most researched active systems, where the potential
benefits are the lower risk of injury, and the less serious injuries of the pedestrians
(Rosén et al., 2010). Current forward-looking crash avoidance systems are usually
continuously scanning the space in front of the vehicle using various devices (camera,
radar, LIDAR, etc.). If any of these detects an unexpected vehicle or pedestrian, the brake
assistant system takes the appropriate (preliminary) actions, which can be the enforcement
of the braking system or direct autonomous emergency braking. Bálint, Fagerlind &
Kullgren (2013) presented very promising results with a test-based methodology for the
assessment of braking and pre-crash warning systems. These systems typically use only
the real-time information given by the vehicle sensors without any knowledge extracted
from historical accident data.

Run-time crash prediction models are also related to the topic of this paper. Hossain
et al. (2019) presented a comprehensive comparison and review of existing real-time
crash prediction models. The basic assumption of these systems is that the probability of a
crash situation within a short time window is predictable by the current environmental
parameters measured by the sensors. Therefore, most of the already existing methods
use only the acquired sensor data to make real-time decisions about potential crash
situations. According to this assumption, authors do not use the already existing accident
databases as an input to fine-tune the system’s predictions.

The work of Lenard, Badea-Romero & Danton (2014) is closer to the research presented
in this paper. They analyzed the common accident scenarios to support the development
of autonomous emergency braking protocols. Based on the hierarchical ascending
method in two British accident databases filtered by some previously defined conditions
(they use only the urban pedestrian accidents that occurred in daylight and with fine

weather conditions), attributes of the most common accident scenarios were presented.
Their paper defines the major accident scenarios and classifies all existing pedestrian
accidents into one of these categories. The results of this research would be useful in the
training phase of a self-driven vehicle to introduce all possible scenarios to the algorithm.

Nitsche et al. (2017) have a similar objective: they propose a novel data analysis
method to detect pre-crash situations at various (T- and four-legged) intersections.
The purpose of this work is also to support the safety tests of autonomous devices.
They clustered accident data into several distinct partitions with the well-known
k-medoids procedure. Based on these clusters, an association rules algorithm was applied
to each cluster to specify the driving scenarios. The input was a crash database from
the UK (containing one thousand junction crashes). The result of the paper contains
thirteen crash clusters, describing the main pre-accident situations.

MATERIALS AND METHODS
Black spot candidate localization
Density-based spatial clustering of applications with noise
For the black spot candidate localization step, the Density-Based Spatial Clustering of
Applications with Noise (DBSCAN) algorithm was used. It is not widely used in the field of
road safety engineering; however, it is one of the most efficient density-based clustering
methods from the field of data mining. The main objective of density-based clustering
tasks is the following: the density of elements within a cluster must be significantly higher
than between separate clusters. This principle distinguishes the two distinct classes of
elements: items inside a cluster and the outliers (elements outside of any cluster).

According to the road safety task, elements are the accidents in the public road network.
These are identified by spatial GPS coordinates and have several additional attributes
(time, accident nature, etc.). The general DBSCAN method needs a definition for distance
calculation between two elements. In the case of road accidents, the Euclidean distance
between the two GPS coordinates was used (black spots are usually spread over a small
area; therefore, it is a good estimation of the real road network distances).
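
The text does not state how the GPS coordinates are converted to meters before the Euclidean distance is taken. One possible way, shown here purely as an assumption, is a local equirectangular projection, which is accurate enough over the few hundred meters that a black spot typically covers. The function name and the Earth radius constant are choices of this sketch.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch

def planar_distance_m(lat1, lon1, lat2, lon2):
    """Approximate Euclidean distance in meters between two nearby GPS points
    given in decimal degrees (equirectangular approximation)."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    # East-west offset shrinks with the cosine of the latitude.
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_M
    # North-south offset is proportional to the latitude difference.
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_M
    return math.hypot(dx, dy)

# Two hypothetical accident locations 0.001 degrees of latitude apart.
d = planar_distance_m(47.500, 19.040, 47.501, 19.040)
```

For accidents that are far apart, a great-circle formula would be more appropriate, but such pairs are never within the ε radius anyway.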

The DBSCAN method requires two additional parameters:

• ε: a radius type variable (meters);
• MinPts: the lower limit for the number of accidents in a cluster (accidents).

The main definitions of the DBSCAN algorithm are as follows:

• the ε environment of a given element x is the space within the ε radius of x;
• x is an internal element if the ε environment of x contains at least MinPts
elements (including x);
• x is directly densely reachable from y if x is in the ε environment of y and y is an
internal element;
• x is densely reachable from y if it is accessible through a chain of directly densely
reachable elements from y;

• all points not densely reachable from an internal element are outliers;
• if x is an internal element, then it forms a cluster together with all elements densely
reachable from x.

The objective of the process is to find clusters of accidents in the public road network in
which all elements are densely connected, and no further expansion is possible. The steps
to achieve this are as follows:

1. Select one internal element from the accident database as the starting point. This will be
the first point of the cluster.

2. Extend the cluster with all directly densely reachable elements from any point of the
cluster recursively.

3. If it is not possible to extend the cluster with additional points, the cluster can be
considered as final (it contains all the densely reachable items from the starting point).
If this cluster meets the prerequisites for a black spot candidate, it is stored in the
result set.

4. Repeat steps 1–3 for all internal elements of the database.

The result of the presented procedure is a set of black spot candidates.
The prerequisites of Step 3 can be one or more of the following:

• The number of accidents should be more than a given threshold.
• The accident density of the given area should be more than a given threshold.
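
The procedure above can be sketched as a compact, unoptimized DBSCAN in plain Python. This is an illustration under stated assumptions, not the implementation used in the paper: accidents are given as planar (x, y) coordinates in meters, neighbors are found by a brute-force scan, and only the accident-count prerequisite is applied. The names `region_query`, `dbscan`, and `black_spot_candidates` are hypothetical.

```python
import math

def region_query(points, i, eps):
    """Indices of all points in the eps environment of points[i] (including i)."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points) if math.hypot(x - xi, y - yi) <= eps]

def dbscan(points, eps, min_pts):
    """Return the clusters as sorted lists of point indices."""
    labels = [None] * len(points)  # None = unvisited, -1 = outlier, >= 0 = cluster id
    clusters = []
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional outlier; a later cluster may still claim it
            continue
        cluster_id = len(clusters)  # step 1: an internal element starts a cluster
        labels[i] = cluster_id
        members = [i]
        queue = list(neighbors)
        while queue:  # step 2: extend with directly densely reachable elements
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # reachable, but not an internal element
                members.append(j)
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            members.append(j)
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)
        clusters.append(sorted(members))  # step 3: no further extension possible
    return clusters

def black_spot_candidates(points, eps, min_pts, min_accidents):
    """Keep only clusters with more than `min_accidents` accidents."""
    return [c for c in dbscan(points, eps, min_pts) if len(c) > min_accidents]

# Hypothetical accident coordinates in meters.
accidents = [(0, 0), (10, 0), (0, 10), (10, 10), (500, 500),
             (1000, 1000), (1010, 1000), (1000, 1010)]
clusters = dbscan(accidents, eps=20.0, min_pts=3)
candidates = black_spot_candidates(accidents, eps=20.0, min_pts=3, min_accidents=3)
```

In this sample, two clusters are found and the isolated accident at (500, 500) is an outlier; only the four-accident cluster passes the count prerequisite. For a real accident database, a spatial index would replace the brute-force neighbor scan.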

The proposed method has several advantages over the traditional methods. Unlike
the sliding window algorithm, which analyzes only the accidents of a given road section,
the DBSCAN is a spatial algorithm managing all accidents of the database together.
This difference is substantial in the case of junctions where the accidents of the
same junction are assigned to different road numbers. This can be especially critical
in the case of built-up areas and traffic roundabouts, where the number of connected roads
is high.

Determination of accident density
One of the benefits of the traditional sliding window method is that it is easy to
interpret for human experts. The number of accidents in a given road section is a very
informative number. It is also easy to calculate some derived values, like the accident
density, which is the number of accidents divided by the length of the road section.
This divisor is often extended with the traffic rate or the time period length values. In the
case of spatial black spot localization techniques, the definition of road accident density
is more complex. These methods are not based on road sections, so division by the section
length is not applicable. It is necessary to calculate the area of the black spot to use it as a
divisor somehow.

This article proposes a novel method to calculate the area of the region spanned by the
black spot accidents. It finds the smallest boundary convex polygon containing all

accidents of a given cluster. The density of the black spot will be the number of
accidents divided by the area of this polygon. The area is calculated by the Gauss’ area
formula Eq. (1).

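$$
a(C) = \frac{\left| \sum_{i=1}^{n-1} x_i y_{i+1} + x_n y_1 - \sum_{i=1}^{n-1} x_{i+1} y_i - x_1 y_n \right|}{2}
= \frac{\left| x_1 y_2 + x_2 y_3 + \ldots + x_{n-1} y_n + x_n y_1 - x_2 y_1 - x_3 y_2 - \ldots - x_n y_{n-1} - x_1 y_n \right|}{2} \tag{1}
$$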

where

• a(C): the area of the C polygon (cluster);
• n: the number of vertices of the polygon;
• $(x_i, y_i)$: the two-dimensional coordinates of the i-th vertex of the C polygon (where
i ∈ {1, 2, …, n}).

If the number of accidents is less than three, the proposed area concept is not applicable.
However, clusters with one or two accidents are usually not considered as black spot
candidates. Therefore, this is not a real limitation. In the case of clusters with more than
two accidents, the accident density is calculated as Eq. (2).

$$
\rho(C) = \frac{|C|}{a(C)} \tag{2}
$$

where

• ρ(C): the accident density of the C cluster;
• |C|: the number of accidents in the C cluster.
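
Equations (1) and (2) translate directly into code. The following minimal sketch assumes that the polygon vertices are already given in order around the boundary, as planar coordinates in meters; the function names and the sample polygon are hypothetical.

```python
def polygon_area(vertices):
    """Gauss area formula, Eq. (1), for a polygon given as ordered (x, y) vertices."""
    n = len(vertices)
    if n < 3:
        raise ValueError("at least three vertices are required")
    twice_area = 0.0
    for i in range(n):
        x_i, y_i = vertices[i]
        # The modulo wraps around, so the last vertex is paired with the first one.
        x_next, y_next = vertices[(i + 1) % n]
        twice_area += x_i * y_next - x_next * y_i
    return abs(twice_area) / 2.0

def accident_density(accident_count, vertices):
    """Accident density, Eq. (2): number of accidents divided by the polygon area."""
    return accident_count / polygon_area(vertices)

# Hypothetical cluster: 5 accidents spanning a 10 m x 10 m square (clockwise order).
square = [(0, 0), (0, 10), (10, 10), (10, 0)]
area = polygon_area(square)
density = accident_density(5, square)
```

Here the area is 100 square meters and the density is 0.05 accidents per square meter. Because of the absolute value in Eq. (1), the result does not change if the same vertices are listed in the opposite direction, but they must still follow the boundary in sequence.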
The formula requires the sequence of corner coordinates of the polygon in a given order
(in this case, a clockwise direction). The traditional DBSCAN algorithm incrementally
builds the cluster from a starting point, and the result is an unordered set of accidents. Consequently,
there should be an additional step to give the corner points in the appropriate order.
It is possible to do this after the DBSCAN finishes, but it is also possible to extend the
DBSCAN method with the following steps:

• In the case of the first (P1) and second (P2) items, the concept of “polygon” cannot
be interpreted. Hence, these are automatically marked as corner points of the
polygon.
• With the third point (P3), the items already form a polygon. The P3 point must be on the
right side of the vector $\overrightarrow{P_1P_2}$, which can be checked using a scalar product, to
ensure the clockwise direction requested by the Gauss formula. If this is not the case, it is
necessary to change the order of P1 and P2. After that step, P1, P2 and P3 will be the
corner points of the polygon in a clockwise direction.

• For every additional point (P4, P5, . . ., Pn), it must be checked whether the additional point is
inside the actual boundary convex polygon or not. It is possible to check whether the new
point ($P_{new}$) is on the right side of the boundary vector or not. If it is true for every
vector, the point must be inside the polygon (or on the border). Therefore, it is not
necessary to modify the shape. If the new point is on the left side of any boundary
vector, then it is outside the boundary convex polygon. There must be a sequence
of one or more consecutive vectors breaking the rule. Let k and l be the first and last
vectors of this sequence. It is possible to substitute the $P_{k-1}, P_k, P_{k+1}, \ldots, P_{l-1}, P_l, P_{l+1}$
part of the boundary vector list with $P_{k-1}, P_{new}, P_{l+1}$. Because of the convexity of the
original polygon, the $P_{k-1}, P_{new}, P_{l+1}$ triangle contains all the $P_k, P_{k+1}, \ldots, P_{l-1}, P_l$
points, and the transformation also ensures the convexity of the new polygon and the
clockwise direction of the corner points. Three figures about this process have been
attached to the article in the Supplemental File “DBSCAN images”.

It is possible to calculate the black spot area and the accident density of a given cluster
using the previous method.
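
The incremental construction of the boundary convex polygon (the convex hull) can be sketched as follows. This is one possible reading of the steps above, not the author's code: it assumes planar coordinates, assumes that the first three points are not collinear, and uses the sign of a two-dimensional cross product as the side test. Edge i is taken to run from vertex i to vertex i + 1, which differs slightly from the index convention of the text. All function names are hypothetical.

```python
def side(a, b, p):
    """Positive if p is on the left of the vector a->b, negative if on the right."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def add_point(hull, p):
    """Extend a clockwise convex polygon `hull` with the new point p."""
    n = len(hull)
    # An edge "breaks the rule" if the new point lies on its left side.
    violated = [side(hull[i], hull[(i + 1) % n], p) > 0 for i in range(n)]
    if not any(violated):
        return hull  # inside the polygon or on its border: nothing to change
    # First violated edge of the consecutive run (its predecessor is not violated).
    start = next(i for i in range(n) if violated[i] and not violated[i - 1])
    end = start
    while violated[(end + 1) % n]:
        end = (end + 1) % n
    # Keep the vertices from the end of the run around to its start,
    # then insert the new point between them.
    kept = []
    i = (end + 1) % n
    while True:
        kept.append(hull[i])
        if i == start:
            break
        i = (i + 1) % n
    kept.append(p)
    return kept

def build_hull(points):
    """Build the clockwise boundary convex polygon point by point."""
    p1, p2, p3 = points[:3]
    if side(p1, p2, p3) > 0:  # P3 is on the left: swap P1 and P2
        p1, p2 = p2, p1
    hull = [p1, p2, p3]
    for p in points[3:]:
        hull = add_point(hull, p)
    return hull

# Hypothetical accident coordinates in the order DBSCAN would deliver them.
hull = build_hull([(0, 0), (10, 0), (0, 10), (10, 10), (5, 5), (20, 5)])
```

In this sample the point (5, 5) is discarded as interior, while (10, 10) and (20, 5) each extend the polygon. The resulting vertex list is already in clockwise order, so it can be passed directly to the area formula of Eq. (1).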

Analysis of black spot candidates
The result of the various black spot localization algorithms (sliding window, clustering,
etc.) is a list of potential hot spots. However, having some accidents in a cluster does
not mean that the hazard of accidents is significantly higher here. It is usually accepted
by researchers that the number of accidents in a given area (section) of the road network
follows a Poisson distribution. A special feature of road accident distribution is that the
number of accidents is relatively low (compared to the size of the road network), and
the variance is high. Therefore, the volatility of the accident number is very high, which
means that a given cluster where the number of accidents is above the average is not
necessarily a hot spot. The list given by the previous methods needs further examination to
find the real hazardous sites.
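
To see why a count above the average is weak evidence on its own, it helps to compute how often a Poisson-distributed count reaches a given value purely by chance. The sketch below is only an illustration of that reasoning, not the statistical test used later in the paper; the expected values are hypothetical.

```python
import math

def poisson_tail(k, expected):
    """P(X >= k) for X following a Poisson distribution with mean `expected`."""
    cdf = sum(math.exp(-expected) * expected ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf

# Hypothetical sites: probability of seeing at least 3 or at least 5 accidents
# by chance when only 1 accident is expected.
p_three = poisson_tail(3, 1.0)
p_five = poisson_tail(5, 1.0)
```

With an expected count of 1, three or more accidents occur by chance at roughly 8% of comparable sites, so across a large road network many such sites appear without any underlying risk factor. Five or more accidents are far rarer, at well below 1%.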

At this point, the methodology of this paper significantly differs from the work of
road safety engineers. Their objective is to find hazardous sites and take the appropriate
actions to decrease the probability of further accidents. They must select the sites
having the largest safety potential, where the most cost-effective actions can be taken to
decrease the number and severity of accidents. It is a very complex procedure based on
the data of historical accidents, the expected number of accidents, the environmental
conditions and the cost/expected benefits of different safety actions. Contrary to this,
the objective of a self-driven car is not the elimination of road safety problems.
As an ordinary participant in traffic, it has no means to make the road network
better. Nevertheless, as a passive participant, it should be able to localize the
problematic areas, analyze these, and take the necessary preliminary steps to avoid further
accidents.

Another difference between the methods of these fields is that, from the perspective of
road safety engineers, the accidents of a given black spot need not have any
special patterns or common characteristics. For the self-driven car, the localization of
high-risk areas where the number of accidents is significantly higher than expected is
not enough, because this fact alone does not help it take the appropriate preliminary steps.
This is why this paper focuses on the identification of accident reasons.

The result of this further investigation can be one of the following:

• If it is not possible to identify any unexpected pattern in the accident attributes, then the
cluster cannot be considered an accident black spot. The high number of accidents is
just a coincidence, and there are no suggestions to avoid further crashes.

• In contrast, if there is a special pattern in the accident attributes, then knowledge of this
cluster has the potential to decrease the probability of further crashes. The common
reasons for these similar accidents may be related to the road network, weather, lighting
conditions or human errors (drivers and pedestrians).

In the second case, the knowledge of this special pattern (the common reasons for
accidents in the same cluster) can be essential. Presumably, it makes it possible to
avoid accidents caused by the car itself. For example, if the accident database shows
that the number of accidents caused by a slippery road is significantly higher than
expected in a given area, the self-driven car should decrease its speed or change its
trajectory to reduce the probability of this event. It is also worth noting that
preliminary actions can be very useful to decrease the probability of accidents caused
by other drivers or pedestrians. For example, if the historical accident data show
that the number of accidents caused by pedestrians is higher than expected, then
the self-driven car could proactively try to reduce this risk by giving some type
of visual or auditory warning or by decreasing its speed.

Deducing the environmental reasons for accidents

Accident databases usually contain a taxonomy of accident types. These are usually
structured classes of specific events and reasons, and scene investigators must classify
each accident into one of these categories, which provides very important statistical
information. This method has several limitations because an accident rarely
originates from one specific reason. Usually, multiple reasons, forming a complex
structure, cause an accident. For example, the investigator codes the accident as a
“catching-up accident”, but this does not give any information about why the accident
occurred. It is also typical that most of the accidents in the Hungarian road network are
attributed to “incorrect choice of speed”. However, speeding alone was evidently not the
only triggering reason for these accidents; there must have been other factors (although
speeding undeniably amplifies the effects of other factors and makes certain
accidents unavoidable).

Based on these experiences, this paper does not try to assign all accidents to mutually
exclusive accident reason classes. Instead, the proposed method defines several
potential accident reasons, which are not mutually exclusive. These factors can be
complementary and can have different weights and roles in the occurrence of the accident.
Only the reasons with potential preventive operations are discussed because these carry
valuable information for the self-driven car.

The proposed method is based on the following consecutive steps:

1. All known accidents are analyzed by all possible accident reasons, and a score value is
assigned to the accident showing how much the accident is affected by a given factor.

2. The distribution of these score values is approximated by the examination of all known
accidents.

3. Based on the result of the previously presented DBSCAN algorithm, the distribution of
these score values is also calculated for each black spot candidate.

4. The distributions for all accidents and a given black spot are compared. If the
distribution of a given factor differs significantly (in the positive direction), the
cluster is marked as a hazardous area for the given factor.

The independent accident reason factors, like “slippery road”, “bad visibility”, “careless
pedestrians”, etc. are defined as R1, R2,…, RN, where N is the number of these factors. As
discussed previously, these reasons are not stored directly in the database but can be inferred
from the general attributes of accidents. A scoring table is used for this purpose: the
weights of the i-th accident factor (1 ≤ i ≤ N) are stored as $W^{i}$, where
$W^{i}_{\mathit{attr}=\mathit{value}}$ gives the score for the Ri accident reason when the
attr attribute equals value.

Accordingly, the cumulative score of the Ri reason for the x accident is given by Eq. (3):

$$S_i(x) = \sum_{\forall \mathit{attr} \in A(x)} W^{i}_{\mathit{attr} = x.\mathit{attr}} \tag{3}$$

where Si(x) is the score value of the Ri reason for accident x. The x.y corresponds to the
value of the specific y attribute of the x accident, and A(x) contains all the available known
attributes of x.

The same value can be calculated not just for a single accident but also for all
accidents of a black spot candidate. The Hi(C) set contains the Si(x) score values of all x
accidents in the C cluster, as given by Eq. (4):

$$H_i(C) = \{ S_i(x) \mid x \in C \} \tag{4}$$
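
A minimal Python sketch of Eqs. (3) and (4) is given below for illustration. The nested dictionary layout, the function names and the sample accident records are assumptions of this sketch; the non-zero weights are the ones quoted for the slippery road factor in the "Scoring factors" section, with attribute codes copied as printed there.

```python
# Weight table for one reason factor (R1, slippery road):
# {attribute name: {attribute value: weight}}.
# Layout and field names are assumptions; weights are those quoted in the paper.
W1 = {
    "roadsrf": {4: 1.0},        # road surface "4-oily, slippery"
    "wthr": {4: 0.3, 1: 0.0},   # weather codes as printed with the weights
    "accnat": {32: 0.2},        # accident nature code as printed with the weight
}


def score(accident, weights):
    """Eq. (3): sum the weights matched by the accident's known attributes."""
    total = 0.0
    for attr, value in accident.items():
        # Attributes or values without an entry contribute zero.
        total += weights.get(attr, {}).get(value, 0.0)
    return total


def cluster_scores(cluster, weights):
    """Eq. (4): the score of every accident in a cluster."""
    return [score(x, weights) for x in cluster]


# Hypothetical accident records of one black spot candidate.
cluster = [
    {"roadsrf": 4, "wthr": 4, "accnat": 32, "driver_age": 41},
    {"roadsrf": 1, "wthr": 1, "accnat": 10},
]
print(cluster_scores(cluster, W1))
```

The scores are kept in a list rather than a Python set so that repeated values are preserved, which matters when the mean and variance are computed later.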

Distribution of accident scores
As a further step, it is necessary to determine whether there is any significant reason
indicating that the C set is a real hot spot. For a well-founded decision, all the
accidents in the database must be analyzed to determine the main characteristics of the
distributions of all R reasons. Based on these results, the distribution of the Hi(C)
values for the examined C hot spot candidate can be compared with the reference Ĥi
values for the whole accident database (D) for a given Ri reason, Eq. (5).

$$\hat{H}_i = \{ S_i(x) \mid x \in D \} \tag{5}$$

If the distributions of Hi(C) and Ĥi are the same, it can be assumed that the Ri reason has
no significant role in the accumulation of accidents. Otherwise, if these distributions differ
such that the Ri score values are higher in Hi(C) than in Ĥi, there may be a
causal relationship between them.

Hypothesis tests can show whether the mean value of a given accident reason score (Ri) in a
given cluster is higher than the same mean for all accidents in the database. The
alternative hypothesis states that the mean score of the cluster minus the mean score of
the whole population is greater than zero, Eq. (7). The null hypothesis covers all other
possible outcomes, Eq. (6).

$$H_0: \mu_C - \mu_D \le 0 \tag{6}$$

$$H_1: \mu_C - \mu_D > 0 \tag{7}$$

where:

• μC is the mean score value for the black spot candidate;
• μD is the mean score value for all accidents in the database (full population).

This article proposes the application of Welch’s t-test, a two-sample location test
used to test the hypothesis that the means of two populations are equal (like the popular
Student’s t-test, but Welch’s test is more reliable when the sample sizes differ
significantly and the variances are unequal). Welch’s test assumes that both
populations are normally distributed. Nevertheless, with moderately large
samples and a one-tailed test, t-tests are relatively robust to moderate
violations of the normality assumption. In this case, the populations are large enough
(the full population contains thousands of accidents, and black spots also contain several
accidents), and the one-tailed test is the appropriate method because
we are looking for clusters where the mean is significantly higher than in the entire
population. Ahad & Yahaya (2014) show that Welch’s test can produce Type I errors when
the variances of the two populations differ and the distributions are non-normal. In this case,
the variances are similar, and Type I errors are acceptable (some identified black spot
candidates may not be real black spots).

According to Welch’s method, the t statistic is given by Eq. (8).

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{v_1}{n_1} + \frac{v_2}{n_2}}} \tag{8}$$

where:

• x̄1 is the mean of the first sample;
• x̄2 is the mean of the second sample;
• v1 is the variance of the first sample;
• v2 is the variance of the second sample;
• n1 is the size of the first sample;
• n2 is the size of the second sample.

The degrees of freedom (ν) are calculated by Eq. (9):

$$\nu = \frac{\left(\frac{v_1}{n_1} + \frac{v_2}{n_2}\right)^2}{\frac{v_1^2}{n_1^2 (n_1 - 1)} + \frac{v_2^2}{n_2^2 (n_2 - 1)}} \tag{9}$$

Based on the previously calculated t and ν values, the t-distribution can be used to
determine the probability (P). The one-tailed test is applied because it answers the
question of whether the mean of the cluster is significantly higher than the mean of the
entire population. Based on P and a previously defined level of significance (α), the null
hypothesis is either rejected or not.

In the case of rejection, it can be assumed that the examined accident reason is related to
the accidents as one of the possible causal factors. If the null hypothesis cannot be rejected,
there is no evidence for this.
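
The test procedure can be sketched in Python as follows. This is an illustrative sketch, not the implementation used for the evaluation: the function names are assumptions, and the upper tail of the t-distribution is obtained here by numerical (Simpson) integration of the density so that only the standard library is needed. The sample values are taken from the Results section (the population parameters and the first row of Table 1), with the cluster treated as the first sample.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, dof) following Eqs. (8) and (9)."""
    a, b = var1 / n1, var2 / n2
    t = (mean1 - mean2) / math.sqrt(a + b)
    dof = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t, dof


def t_pdf(x, dof):
    """Density of Student's t-distribution (dof may be fractional)."""
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))
    return math.exp(log_c - (dof + 1) / 2 * math.log1p(x * x / dof))


def t_upper_tail(t, dof, steps=2000):
    """One-tailed P(T > t) by Simpson integration (steps must be even)."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, dof, steps)
    h = t / steps
    area = t_pdf(0.0, dof) + t_pdf(t, dof)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 0.5 - area * h / 3


# Cluster (first sample) versus the whole database (second sample).
t, dof = welch_t(0.75, 0.0857, 8, 0.2438, 0.1115, 3256)
p = t_upper_tail(t, dof)
alpha = 0.05
print(f"t = {t:.3f}, dof = {dof:.2f}, p = {p:.6f}, reject H0: {p < alpha}")
```

The printed probability can be compared with the Prob. column of Table 1 for the same black spot.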

Scoring factors
The practical evaluation presented by this paper focuses on one specific accident reason
(N = 1): the slippery road condition factor (R1).

The accident database used contains more than two hundred fields in four categories:

• general accident attributes (date and time, location, nature, etc.);
• general environmental attributes (weather conditions, visibility, etc.);
• data about participants (was it a vehicle or pedestrian, speed, direction, etc.);
• data about injured persons (age of the injured person, etc.).

Weighting tables have been developed to estimate the effect of a given accident reason
factor on the occurrence of the accident. Focusing on the slippery road condition accident
factor, the following three types of accident properties can be distinguished:

• Some fields directly contain information about the examined factor. In this case,
the “Road surface” property (abbreviated as roadsrf) of an accident has an option of
“4-oily, slippery”. This is taken as the basis for further weights; the score value of
this attribute is 1.0 ($W^{1}_{\mathit{roadsrf}=4} = 1.0$), showing that the accident is
highly affected by the slippery road condition factor. It is worth noting that a
binary decision about the examined factor based on this value alone would not be
effective, because there are other values (“3-snowy”, “5-another staining”) having
similar effects. This is reflected in the weight values.

• In some cases, there are no such direct fields, but information about a given factor
can be deduced from the existing data. For example, in the case of the
slippery road condition factor, the weather conditions (wthr property in the database)
can help this process. In these cases, the score values assigned to the different weather
conditions estimate how much the given factor affected the
occurrence of the accident. In the case of snowing (“6-snowy”), it would be higher
($W^{1}_{\mathit{wthr}=4} = 0.3$) than for ideal conditions like “1-sunny”
($W^{1}_{\mathit{wthr}=1} = 0$). It is also considered that in the case of accident nature
“31-Slipping, carving, overturning on the road”, the slippery road factor influenced the
results ($W^{1}_{\mathit{accnat}=32} = 0.2$).

• The last group contains the fields without any relation to the examined factor. For
example, fields like “Age of the driver” do not affect the results. The weights for all values
of these fields are consequently zero.

The Supplemental File “Scoring tables” contains the given weight values for the affected
fields. Weight values are based on a comprehensive literature review from the fields of
road safety and road friction measurements (Wallman & Åström, 2001; Andersson et al.,
2007; Sokolovskij, 2010; Colomb, Duthon & Laukkanen, 2017). However, some of the
values are affected by the subjective experience of the authors. Further research is
needed to determine the most efficient weights.

RESULTS
Accident database
This paper uses the official road accident database of Hungary, where data regarding
accidents with personal injury are collected by the police. After some conversion and
corrections, this dataset is handled by the Central Statistics Department. The completeness
of the database is ensured by legislation, and participants of public road accidents with
personal injury are obliged to report them to the police. A police officer starts the data
collection on the spot by recording the most relevant data about the location and the main
attributes of the accidents (participants, casualties, etc.). After 30 days, it is possible to
refine the final injury level for all participants. After that finalization step, the Central
Statistics Department collects and rechecks all records. Road safety engineers and
researchers can use this database for their work.

The evaluation part of this paper is based on the accidents of this database from
1 January 2011 to 31 December 2018. It contains 128,767 accidents with personal injury
classified into three categories: fatal, serious and slight. There are no accidents in the
database without personal injury. Because of the high number of accidents and the high
computational demand of the clustering algorithm, this paper deals with two counties of
Hungary: the accidents of “Győr-Moson-Sopron” county were used to find the optimal
parameters of the algorithm, and “Heves” county was used as a control dataset.

DBSCAN clustering
The input database for the clustering was the personal injury accidents of a given county of
Hungary (“Győr-Moson-Sopron” county). This experiment was performed twice, on two
consecutive time intervals, to measure the robustness of the method. The examined t
interval contains the accidents that occurred between 1 January 2011 and 31 December 2014,
and the t̂ validation interval was 1 January 2015–1 December 2018. The number of
accidents was 3,256 in the t interval (the D set contains these accidents) and 3,011 in the
t̂ interval (the D̂ set contains these accidents).

In the hot spot search phase, the following DBSCAN parameters were used:

• ε value: 100 m;
• minimum accident count: five accidents;
• minimum accident density: 0.0001 accident/m².

The result of this raw DBSCAN clustering method was 165 black spot candidates in the t
interval and 152 black spot candidates in the t̂ interval.
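
For readers unfamiliar with the algorithm, a compact pure-Python DBSCAN is sketched below with the ε and minimum count values listed above. Several points are assumptions of this sketch rather than statements about the implementation used in the paper: accident positions are taken to be planar coordinates in metres, the minimum accident count is used directly as the DBSCAN core-point threshold, the neighbour search is a naive quadratic scan, and the minimum density filter is omitted.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one cluster label per point (-1 marks noise)."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Naive O(n) scan; the point itself counts as its own neighbour.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins, is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


# Hypothetical accident positions in metres: five close together, one far away.
accidents = [(0, 0), (30, 0), (60, 0), (0, 40), (30, 40), (1000, 1000)]
print(dbscan(accidents, eps=100.0, min_pts=5))   # [0, 0, 0, 0, 0, -1]
```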

Statistical test
Unlike traditional black spot searching methods, the next step is not the calculation of
some safety potential index but the determination of the different accident reason
factors using the scoring method presented earlier. Considering the R1 slippery road
condition factor, the S1(x) value is calculated for all x accidents. Most of these are not
related to a slippery road surface, so their S1 value is 0.

As a prerequisite for the Welch test, a population of S1(y) values is generated, where y
stands for all accidents in the database. The main parameters of this sample are:

• number of items (n1): 3,256
• mean (x̄1): 0.2438
• variance (v1): 0.1115

The same parameters can be calculated for each black spot candidate by iterating over
all of them. The Welch test was then applied to compare each black spot candidate with
the whole population. According to the Welch test, the Student distribution with these
parameters and the given level of significance (α = 0.05) can be used to decide whether
to reject the null hypothesis.

Table 1 shows the black spot candidates of the t interval where the null hypothesis
was rejected because the mean of the R1 score for the given black spot candidate was
significantly higher than the expected average. It can be assumed that these black spots are
affected by the examined R1 factor.

Figure 1 shows the environment and the accidents of the first black spot from this list.
As the satellite image shows, it is part of a long straight road; consequently, there is
no apparent reason for the autonomous car to decrease its speed. Table 2 contains
detailed information about the accidents from the historical database. As is visible, a high
number of accidents are affected by one or more slippery road-related attributes. This pattern
significantly differs from the expectations; hence, there should be some environmental
issues at this location. The examination and elimination of these reasons is the task of road
safety experts (Orosz et al., 2015). Nevertheless, until then, it is worth taking preventive
steps to decrease the chance of further accidents. The autonomous vehicle should adapt its
control to this situation (speed reduction, using a safer trajectory, etc.).

Table 1 Accident black spots where the null hypothesis was rejected.

| # | Location | Count | Mean | Variance | Prob. |
|---|---|---|---|---|---|
| 1 | LAT 47.6301/LON 16.7333 | 8 | 0.75 | 0.0857 | 0.000878 |
| 2 | LAT 47.5956/LON 17.5872 | 11 | 0.55 | 0.0887 | 0.003629 |
| 3 | LAT 47.3866/LON 17.8659 | 5 | 1.12 | 0.2820 | 0.010502 |
| 4 | LAT 47.5708/LON 17.5790 | 6 | 0.56 | 0.1307 | 0.040157 |

Figure 1 Road accidents of the black spot located at LAT 47.6301/LON 16.7333. Map Data ©2021
Google, Satellite Images ©2021 CNES/Airbus, Geoimage Austria, Maxar Technologies.
Full-size DOI: 10.7717/peerj-cs.399/fig-1

Table 2 Accidents of the black spot located at LAT 47.6301 / LON 16.7333.

| Time | Latitude | Longitude | Outcome | Surface | Weather | Accident nature |
|---|---|---|---|---|---|---|
| 2011.02.03 16:05 | 47.6302 | 16.7327 | Light | Wet | Sunny | Track leaving |
| 2011.05.06 17:35 | 47.6298 | 16.7340 | Hard | Normal | Sunny | Track leaving |
| 2011.06.26 10:24 | 47.6300 | 16.7338 | Light | Wet | Rainy | Track leaving |
| 2011.06.26 10:28 | 47.6300 | 16.7334 | Hard | Wet | Rainy | Track leaving |
| 2011.07.21 9:10 | 47.6302 | 16.7330 | Hard | Wet | Rainy | Track leaving |
| 2013.06.24 17:50 | 47.6298 | 16.7340 | Light | Wet | Overcast | Frontal crash |
| 2014.01.09 12:45 | 47.6301 | 16.7330 | Light | Wet | Sunny | Slipping, carving |
| 2014.01.20 10:45 | 47.6303 | 16.7325 | Hard | Wet | Overcast | Track leaving |

DISCUSSION
There is no generally accepted method for the evaluation of black spots because
there is no exact definition for them. Based on real-world accident data, there is no
list of real black spots. So, the widely accepted confusion table-based methods are not
usable here (assigning the clusters into true-positive, false-positive, true-negative, false-
negative classes and calculating the common measurements like accuracy, recall, etc.).
Therefore, it is necessary to evaluate these results based on the general characteristics of
these locations. The accident density of black spots is significantly higher than the average;
however, this is a necessary but not sufficient condition for validity. Because of the
high volatility of accidents, the regression to the mean effect can distort the results. It is a
well-known statistical phenomenon that roads with a high number of road accidents
in a particular period are likely to have fewer during the subsequent period just because of
the random fluctuations in crash numbers. In the case of real black spots, the high number
of accidents is permanent. Thus, a good evaluation technique is to check the
number of accidents in the subsequent validation time interval inside the clusters
identified in the t interval.

There are specific tests for this purpose, introduced by Cheng & Washington (2005) and used
in various articles (Montella, 2010): site consistency tests, method consistency tests, and the
total rank differences test. Since these were developed for black spot searching methods
based on road intervals, it was necessary to adapt them to use spatial coordinates and black
spot regions. The input series for all tests were the results of the previous black spot
identification process, where

• Ci is the i-th cluster identified in the D database (1 ≤ i ≤ n, where n is the number of
identified black spots in the t interval);

• Ĉi is the i-th cluster identified in the D̂ database (1 ≤ i ≤ n̂, where n̂ is the number of
identified black spots in the t̂ interval).

Site consistency test
This test assumes that any site identified as a black spot in the t time period should also
reveal high risk in the subsequent t̂ time period.

Let π(C) be the convex boundary polygon of the C cluster given by the algorithm presented
earlier, and let Π be the union of these regions identified in the t time period, Eq. (10).

$$\Pi = \bigcup_{i=1}^{n} \pi(C_i) \tag{10}$$

As the next step, we collect all accidents of the subsequent t̂ time period that are
inside the clusters identified in the prior t time period. The T1 attribute shows the number
of these accidents divided by the total area of these clusters. Thus, this is the
accident density of these clusters in the subsequent time period, Eq. (11).

$$T_1 = \frac{|\{x \in \hat{D} \mid x \text{ inside } \Pi\}|}{\sum_{i=1}^{n} a(C_i)} \tag{11}$$

Accident reason factor consistency test
As this paper goes further by revealing the accident reason factors, it is also worth
checking whether the accidents in the t̂ time period inside the region identified in the
t time period have the same attributes. This leads to the introduction of the T′1 value,
which shows the average score value for these accidents, Eq. (12).

$$T'_1 = \frac{\sum_{\forall x \in \hat{D}} \begin{cases} S_1(x), & \text{if } x \text{ inside } \Pi \\ 0, & \text{else} \end{cases}}{|\{x \in \hat{D} \mid x \text{ inside } \Pi\}|} \tag{12}$$
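
The two measures of Eqs. (11) and (12) can be sketched together in Python. This is a hedged illustration: the record layout (a "pos" entry holding planar coordinates in metres), the function names and the sample data are assumptions, the polygon areas are passed in precomputed, and the containment test relies on the polygons being convex.

```python
def inside_convex(point, polygon):
    """True if point lies in (or on the edge of) a convex polygon."""
    px, py = point
    sign = 0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        # The cross product tells on which side of the edge the point lies.
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False   # the side changed: the point is outside
    return True


def consistency(accidents, polygons, areas, score):
    """Return (T1, T1') for the validation-period accidents.

    accidents: list of dicts with a "pos" (x, y) entry in metres
    polygons:  convex boundary polygons found in the earlier period
    areas:     area of each polygon in square metres
    score:     function giving S1(x) for an accident
    """
    hits = [x for x in accidents
            if any(inside_convex(x["pos"], p) for p in polygons)]
    t1 = len(hits) / sum(areas)
    t1_prime = sum(score(x) for x in hits) / len(hits) if hits else 0.0
    return t1, t1_prime


# Hypothetical data: one 100 m x 100 m cluster and three later accidents.
square = [(0, 0), (0, 100), (100, 100), (100, 0)]
later = [
    {"pos": (50, 50), "s1": 1.0},
    {"pos": (10, 90), "s1": 0.2},
    {"pos": (500, 500), "s1": 1.5},
]
t1, t1_prime = consistency(later, [square], [10000.0], lambda x: x["s1"])
print(t1, t1_prime)
```

With these made-up inputs, two of the three accidents fall inside the cluster, so T1 is 2 / 10000 accidents per square metre and T′1 is the mean of their two scores.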

Method consistency test
It can also be assumed that a black spot area identified in the t time period will also be
identified as a black spot in the subsequent t̂ time period. A given black spot searching
method can be considered consistent if the number of black spots identified in both
periods is large, while the number of black spots identified in only one of the examined
periods is small. Eq. (13) can be used to calculate this method consistency:

$$T_2 = \frac{|\{C_1, C_2, \ldots, C_n\} \cap \{\hat{C}_1, \hat{C}_2, \ldots, \hat{C}_{\hat{n}}\}|}{|\{C_1, C_2, \ldots, C_n\} \,\triangle\, \{\hat{C}_1, \hat{C}_2, \ldots, \hat{C}_{\hat{n}}\}|} \tag{13}$$

where T2 is the ratio of the number of clusters existing in both search results to the
number of clusters given by only the search in the t or only in the t̂ time period (△ stands
for the symmetric difference of sets). A pair of clusters from the t and t̂ periods is
considered identical if the distance between them is less than 300 m.
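
One possible way to evaluate Eq. (13) is sketched below in Python. The text does not specify how the distance between two clusters is measured or how ambiguous pairs are resolved, so this sketch assumes that each cluster is represented by a single centre point in metres and that pairs are formed greedily by nearest centre; both choices, and all names, are assumptions.

```python
import math


def match_clusters(first, second, max_dist=300.0):
    """Greedily pair cluster centres closer than max_dist metres."""
    unused = list(range(len(second)))
    pairs = []
    for i, centre in enumerate(first):
        best = min(unused, key=lambda j: math.dist(centre, second[j]),
                   default=None)
        if best is not None and math.dist(centre, second[best]) < max_dist:
            pairs.append((i, best))
            unused.remove(best)   # each later cluster is matched at most once
    return pairs


def method_consistency(first, second, max_dist=300.0):
    """Eq. (13): clusters found in both periods / clusters found in only one."""
    both = len(match_clusters(first, second, max_dist))
    only_one = (len(first) - both) + (len(second) - both)
    return math.inf if only_one == 0 else both / only_one


# Hypothetical cluster centres (metres) for the two periods.
period_t = [(0, 0), (1000, 0), (5000, 5000)]
period_t_hat = [(100, 50), (1200, 0), (9000, 9000), (20000, 0)]
print(method_consistency(period_t, period_t_hat))   # 2 matched / 3 unmatched
```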

Rank difference test
The rank difference test is based on black spots identified in both the t and t̂ periods.
The black spots of both periods are sorted by accident density, and the rank difference test
shows the difference in the positions of the same cluster in the two lists. The smaller the
value, the more consistent the examined method is, because the sequence of clusters is
similar. A large value shows that the examined method was able to identify the same
black spots in both intervals but with a different relative severity.

Let O and Ô be the sequences of black spots identified in both periods (both sequences
contain the items of the $\{C_1, C_2, \ldots, C_n\} \cap \{\hat{C}_1, \hat{C}_2, \ldots, \hat{C}_{\hat{n}}\}$
set), ordered by accident density in the t time period (O) and in the t̂ time period (Ô).
T3 shows the rank difference of the examined method, Eq. (14). Obviously, |O| = |Ô|.

$$T_3 = \frac{\sum_{c \in O} |\mathrm{Rank}(c, O) - \mathrm{Rank}(c, \hat{O})|}{|O|} \tag{14}$$

where Rank(x, Y) is the rank of the x black spot in the Y sequence.
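
Eq. (14) can be illustrated with a short Python sketch. The dictionary-based input, the identifiers and the descending sort order (densest black spot first) are assumptions of this sketch; the densities are made-up values.

```python
def rank_difference(density_t, density_t_hat):
    """Eq. (14): mean absolute rank change of black spots found in both periods.

    Both arguments map a black spot identifier to its accident density
    in the respective period; only identifiers present in both are used.
    """
    common = density_t.keys() & density_t_hat.keys()
    if not common:
        return 0.0

    def ranks(density):
        # Rank 1 is the densest black spot (assumed ordering).
        ordered = sorted(common, key=lambda c: density[c], reverse=True)
        return {c: position for position, c in enumerate(ordered, start=1)}

    rank_t, rank_t_hat = ranks(density_t), ranks(density_t_hat)
    return sum(abs(rank_t[c] - rank_t_hat[c]) for c in common) / len(common)


# Hypothetical densities (accidents per square metre) in the two periods.
d_t = {"A": 0.004, "B": 0.003, "C": 0.001}
d_t_hat = {"A": 0.002, "B": 0.005, "C": 0.001, "D": 0.009}
print(rank_difference(d_t, d_t_hat))
```

In this made-up example, A and B swap places while C keeps its rank, so the total rank change of 2 is averaged over the three common black spots.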

Evaluation results
First, the proposed method was compared to the traditional Sliding Window method (SW)
using dynamic window length. The minimal window length parameter was 250 m, the

minimal accident number was 5, and the minimal accident density was 0.01 accidents/m.
As a further step, the novel method was also compared to the raw DBSCAN based
clustering (without the accident factor scoring). The parameters of this were the same as
presented above. The proposed method is presented in the comparison under the
DARF (DBSCAN with Accident Reason Factor determination) name.

Table 3 shows the overall results for “Győr-Moson-Sopron” county. As visible, the
number of black spots recognized by the DARF method is significantly lower than that of its
alternatives. This was expected because the SW and DBSCAN methods list all clusters where
the accident density is higher than a given threshold. In contrast, the DARF method
returns only black spots affected by the R1 accident factor. The difference between the
SW and DBSCAN is also significant and is caused by the fact that the SW uses road name +
road section positioning, which is not available in built-up areas. In comparison, the
DBSCAN method is based on GPS coordinates and can find the black spots of municipal
roads (which is one advantage of this approach).

The T1 result is similar in the case of the DBSCAN and DARF methods, and it is
significantly lower in the case of SW. The T2 results are almost the same for all algorithms.
The third general metric shows that the proposed method performs very well on the
rank difference test. However, it is worth noting that the number of black spots is
significantly lower in this case, which can be an advantage.

The T′1 metric shows the real strength of the proposed method. As expected, black spots
identified by the SW and DBSCAN contain a mixture of various accidents. Consequently,
the average of the R1 score is close to the mean of the population (0.2159 and 0.1922
compared to 0.2438). In contrast, the average score of the accidents of the t̂ time
interval located inside the clusters found from the data of the t interval is 0.62, which is
significantly higher than the average.

These results confirm that the proposed method has very similar characteristics to the
already existing methods. The slightly lower T2 value shows that, as a raw black spot
searching algorithm, it is not as robust as the alternatives. Nonetheless, the T′1 result shows
that it is satisfactory for our purpose. It can localize areas where the expected number of
accidents with given accident reasons is significantly higher than the average.

Table 3 Results of the comparison of the SW, DBSCAN, and DARF methods based on the road
slippery condition. Precision is the ratio of the number of confirmed black spots (identified in both
intervals) and the number of all black spots (identified at least in one of the intervals). Results are based
on the personal injury accidents that occurred in “Győr-Moson-Sopron” county.

| Value | SW | DBSCAN | DARF |
|---|---|---|---|
| BS identified in both t and t̂ | 67 | 129 | 4 |
| BS identified in t but not in t̂ | 8 | 36 | 2 |
| BS identified in t̂ but not in t | 20 | 23 | 0 |
| Precision | 41.36% | 40.69% | 40.00% |
| T1 test result (accidents/m) | 0.0094 | 0.0435 | 0.0447 |
| T′1 test result | 0.2159 | 0.1922 | 0.6200 |
| T2 test result | 0.5447 | 0.5223 | 0.5000 |
| T3 test result | 3.8765 | 5.9054 | 0.2000 |

Table 4 shows the same values for another county (“Heves”) as a control dataset to
check the robustness of the method. As visible, the main characteristics of the results are
very similar. In this case, the T1 and T3 results are better compared to the alternatives.
The T′1 value is slightly lower than in the first county but still significantly higher than
the population average.

CONCLUSIONS
This work presents a novel, fully automated method for informing autonomous vehicles
about potential road risk factors. The method is based on the DBSCAN data-mining
algorithm, which can localize black spot candidates where the number of accidents is
greater than expected. It has several advantages over the traditional sliding window method,
especially in built-up areas and for accidents that occurred at junctions.

Beyond the traditional road safety engineering work, an additional processing step was
also introduced, making assumptions about the main accident reasons. All possible
reasons (slippery road, pedestrian issues, etc.) should be checked one by one, assigning
score values to all accidents. The proposed method considers the distribution of these score
values for the full population (all accidents of the given county) and each black spot
candidate. Using hypothesis tests (one-tailed Welch test), it is possible to select clusters in
which the mean of the score values is significantly higher than the expected value
(calculated by statistical methods based on the entire accident database). These can be
considered as black spots affected by the given factor.

The output of this process is a sequence of risky locations on the public road network
and a prediction concerning the accident reasons. These could form the basis of further
research suggesting automatic preventive steps for autonomous vehicles. This dataset can
be useful in the route planning phase (try to avoid black spots) and in the traveling phase
(take preventive steps when approaching dangerous locations) (Alonso et al., 2016).
This knowledge could help decrease the number and seriousness of public road accidents.

Table 4 Results of the comparison of the SW, DBSCAN, and DARF methods based on the road
slippery condition. Precision is the ratio of the number of confirmed black spots (identified in both
intervals) and the number of all black spots (identified at least in one of the intervals). Results are based
on the personal injury accidents that occurred in “Heves” county.

| Value | SW | DBSCAN | DARF |
|---|---|---|---|
| BS identified in both t and t̂ | 25 | 38 | 4 |
| BS identified in t but not in t̂ | 9 | 12 | 0 |
| BS identified in t̂ but not in t | 16 | 26 | 3 |
| Precision | 33.33% | 33.33% | 36.36% |
| T1 test result (accidents/m) | 0.0074 | 0.0323 | 0.0732 |
| T′1 test result | 0.2148 | 0.2286 | 0.5778 |
| T2 test result | 0.3333 | 0.3333 | 0.4000 |
| T3 test result | 1.8667 | 1.4912 | 0.0000 |

As a limitation, it is worth noting that the proposed method may produce false
positive alarms. Fortunately, these results are used by autonomous vehicles; therefore, the
consequences are usually minor inconveniences (decreasing the speed, etc.), in contrast to
traditional road safety investigations, where manual revision is essential. It is
also worth noting that our method is based only on local historical data, which results in
problems typical of traditional statistical black spot searching methods (high variation
compared to the expected value). It would be worth developing a hybrid method based on
the Empirical Bayes method, which achieves superior control for random variation.

The next step of this research project will be the development of these preventive steps.
The previously acquired information should be built into the control of the self-driven
vehicle to fine-tune its strategy of movement and avoid predictable risky situations.
For example, if the presented method predicts a high probability of pedestrian accidents, the
car should increase the volume of its engine sound; in the case of a high chance of frontal
accidents, it is worth increasing the power of the headlights; and, obviously, decreasing
the speed near any of the dangerous locations may decrease the severity of most
accidents. Building an expert system that gives similar advice based on the historical data
should be the next step of this project.
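
The simplest form of such an expert system is a rule table that maps predicted risk factors to preventive actions. The following sketch encodes only the three examples given in the paragraph above; the factor labels, action labels, and function name are hypothetical placeholders, since the actual rule base is left for future work.

```python
# Hypothetical rule table: predicted risk factor -> preventive actions.
# The two entries mirror the examples in the text; the labels are illustrative.
PREVENTIVE_RULES = {
    "pedestrian": ["increase_engine_sound"],
    "frontal": ["increase_headlight_power"],
}

# Applied near every dangerous location, regardless of the risk factor.
DEFAULT_ACTIONS = ["decrease_speed"]


def preventive_actions(risk_factors):
    """Return the de-duplicated actions for the risk factors of one black spot."""
    actions = list(DEFAULT_ACTIONS)
    for factor in risk_factors:
        for action in PREVENTIVE_RULES.get(factor, []):
            if action not in actions:
                actions.append(action)
    return actions
```

Unknown risk factors fall back to the default action only, so the table can be extended gradually as new rules are validated.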

Another direction of further development is to make the method more sensitive to
real-time environmental conditions. For example, if the autonomous car has to plan a
route at night in wet weather, then it should pay more attention to historical accidents that
occurred under similar conditions. This also confirms the need for simple and fully
automatic algorithms, so that fast recalculation is possible whenever the conditions
change.
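
A minimal sketch of this idea is to weight each historical accident by how closely its recorded conditions match the current ones before counting. The attribute names (`light`, `weather`), the `base_weight` of 0.25, and both function names are assumptions chosen for illustration; the paper does not prescribe a weighting scheme.

```python
def condition_similarity(accident, current, attributes=("light", "weather")):
    """Fraction of the listed attributes on which the accident matches current conditions."""
    matches = sum(1 for a in attributes if accident.get(a) == current.get(a))
    return matches / len(attributes)


def weighted_accident_count(accidents, current, base_weight=0.25):
    """Sum of accident weights; each weight rises from base_weight to 1.0 with similarity."""
    total = 0.0
    for accident in accidents:
        sim = condition_similarity(accident, current)
        total += base_weight + (1.0 - base_weight) * sim
    return total
```

With this scheme, an accident that happened at night on a wet road counts fully when the car plans a night route in wet weather, while a daytime dry-road accident still contributes, but only with the base weight.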

As a further development, an artificial-intelligence-based approach should be
used to extend the database and address the problems raised by the limitations of the dataset.

ACKNOWLEDGEMENTS
The authors would like to thank Domokos Jankó for his support and novel ideas about the
topic. Rest in peace, our friend.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
The research presented in this paper was carried out as part of the EFOP-3.6.2-16-2017-00016
project in the framework of the New Széchenyi Plan. The completion of this project
is funded by the European Union and co-financed by the European Social Fund. The
funders had no role in study design, data collection and analysis, decision to publish, or
preparation of the manuscript.

Grant Disclosures
The following grant information was disclosed by the authors:
New Széchenyi Plan: EFOP-3.6.2-16-2017-00016.
European Union and European Social Fund.

Competing Interests
Sándor Szénási is an Academic Editor for PeerJ.

Author Contributions
• Sándor Szénási conceived and designed the experiments, performed the experiments,
analyzed the data, performed the computation work, prepared figures and/or tables,
authored or reviewed drafts of the paper, and approved the final draft.

Data Availability
The following information was supplied regarding data availability:

Data and code are available in the Supplemental Files.

Supplemental Information
Supplemental information for this article can be found online at
http://dx.doi.org/10.7717/peerj-cs.399#supplemental-information.

REFERENCES
Ahad NA, Yahaya SSS. 2014. Sensitivity analysis of Welch’s t-test. AIP Conference Proceedings
1605(February 2015):888–893.

Alonso F, Alonso M, Esteban C, Useche SA. 2016. Knowledge of the concepts of black spot,
grey spot and high accident concentration sections among drivers. Science Publishing Group
1(4):39–46.

Anderson TK. 2009. Kernel density estimation and K-means clustering to profile road accident
hotspots. Accident Analysis & Prevention 41(3):359–364 DOI 10.1016/j.aap.2008.12.014.

Andersson M, Bruzelius F, Casselgren J, Gäfvert M, Hjort M, Hultén J, Håbring F, Klomp M,
Olsson G, Sjödahl M, Svendenius J, Woxneryd S, Wälivaara B. 2007. Road friction estimation
IVSS project report. Available at https://research.chalmers.se/en/publication/101026.

Bálint A, Fagerlind H, Kullgren A. 2013. A test-based method for the assessment of pre-crash
warning and braking systems. Accident Analysis & Prevention 59:192–199
DOI 10.1016/j.aap.2013.05.021.

Bíl M, Andrášik R, Janoška Z. 2013. Identification of hazardous road locations of traffic accidents
by means of kernel density estimation and cluster significance evaluation. Accident Analysis &
Prevention 55(3):265–273 DOI 10.1016/j.aap.2013.03.003.

Carsten OM, Tate FN. 2005. Intelligent speed adaptation: accident savings and cost-benefit
analysis. Accident Analysis & Prevention 37(3):407–416 DOI 10.1016/j.aap.2004.02.007.

Chatterjee K, Hounsell NB, Firmin PE, Bonsall PW. 2002. Driver response to variable message
sign information in London. Transportation Research Part C: Emerging Technologies
10(2):149–169 DOI 10.1016/S0968-090X(01)00008-0.

Cheng W, Washington SP. 2005. Experimental evaluation of hotspot identification methods.
Accident Analysis & Prevention 37(5):870–881 DOI 10.1016/j.aap.2005.04.015.

Colomb M, Duthon P, Laukkanen S. 2017. Characteristics of adverse weather conditions. In:
DENSE. Brussels: CER.

Delorme R, Lassarre S. 2014. A new theory of complexity for safety research—the case of the
long-lasting gap in road safety outcomes between France and Great Britain. Safety Science
70:488–503 DOI 10.1016/j.ssci.2014.06.015.

Elvik R. 2008. A survey of operational definitions of hazardous road locations in some European
countries. Accident Analysis & Prevention 40(6):1830–1835 DOI 10.1016/j.aap.2008.08.001.

Flahaut B, Mouchart M, Martin ES, Thomas I. 2003. The local spatial autocorrelation and the
kernel method for identifying black zones. Accident Analysis & Prevention 35(6):991–1004
DOI 10.1016/S0001-4575(02)00107-0.

Geurts K, Wets G, Brijs T, Vanhoof K, Karlis D. 2006. Ranking and selecting dangerous crash
locations: correcting for the number of passengers and Bayesian ranking plots. Journal of Safety
Research 37(1):83–91 DOI 10.1016/j.jsr.2005.10.020.

Ghadi M, Török Á. 2019. A comparative analysis of black spot identification methods and road
accident segmentation methods. Accident Analysis & Prevention 128:1–7
DOI 10.1016/j.aap.2019.03.002.

Harper CD, Hendrickson CT, Samaras C. 2016. Cost and benefit estimates of partially-automated
vehicle collision avoidance technologies. Accident Analysis & Prevention 95:104–115
DOI 10.1016/j.aap.2016.06.017.

Hegyi P, Borsos A, Koren C. 2017. Searching possible accident black spot locations with accident
analysis and GIS software based on GPS coordinates. Pollack Periodica 12(3):129–140
DOI 10.1556/606.2017.12.3.12.

Hossain M, Abdel-Aty M, Quddus MA, Muromachi Y, Sadeek SN. 2019. Real-time crash
prediction models: state-of-the-art, design pathways and ubiquitous requirements.
Accident Analysis & Prevention 124:66–84 DOI 10.1016/j.aap.2018.12.022.

Jermakian JS. 2011. Crash avoidance potential of four passenger vehicle technologies.
Accident Analysis & Prevention 43(3):732–740 DOI 10.1016/j.aap.2010.10.020.

Kertesz G, Felde I. 2020. One-shot re-identification using image projections in deep triplet
convolutional network. In: SOSE, 2020—IEEE 15th International Conference of System of
Systems Engineering, Proceedings. Piscataway: IEEE, 597–601.

Lee S, Lee Y. 2013. Calculation method for sliding-window length: a traffic accident frequency case
study. Eastern Asia Society for Transportation Studies 9:1–13.

Lenard J, Badea-Romero A, Danton R. 2014. Typical pedestrian accident scenarios for the
development of autonomous emergency braking test protocols. Accident Analysis & Prevention
73(4):73–80 DOI 10.1016/j.aap.2014.08.012.

Mauro R, De Luca M, Dell’Acqua G. 2013. Using a k-means clustering algorithm to examine
patterns of vehicle crashes in before-after analysis. Modern Applied Science 7(10):11–19.

Montella A. 2010. A comparative analysis of hotspot identification methods. Accident Analysis &
Prevention 42(2):571–581 DOI 10.1016/j.aap.2009.09.025.

Montella A, Andreassen D, Tarko AP, Turner S, Mauriello F, Imbriani LL, Romero MA. 2013.
Crash databases in Australasia, the European Union, and the United States. Transportation
Research Record: Journal of the Transportation Research Board 2386(1):128–136
DOI 10.3141/2386-15.

Murray W, White J, Ison S. 2012. Work-related road safety: a case study of Roche Australia.
Safety Science 50(1):129–137 DOI 10.1016/j.ssci.2011.07.012.

Nitsche P, Thomas P, Stuetz R, Welsh R. 2017. Pre-crash scenarios at road junctions: a clustering
method for car crash data. Accident Analysis & Prevention 107:137–151
DOI 10.1016/j.aap.2017.07.011.

Orosz G, Mocsári T, Borsos A, Koren C. 2015. Evaluation of low-cost safety measures on the
Hungarian national road network. In: Proceedings of the XXVth World Road Congress, Seoul:
World Road Association, 1–11.

Rosén E, Källhammer JE, Eriksson D, Nentwich M, Fredriksson R, Smith K. 2010. Pedestrian
injury mitigation by autonomous braking. Accident Analysis & Prevention 42(6):1949–1957
DOI 10.1016/j.aap.2010.05.018.

Sokolovskij E. 2010. Automobile braking and traction characteristics on the different road
surfaces. Transport 22(4):275–278 DOI 10.3846/16484142.2007.9638141.

Szénási S, Jankó D. 2007. Internet-based decision-support system in the field of traffic safety on
public road networks. In: 6th European Transport Conference. Budapest, 131–136.

Toran A, Moridpour S. 2015. Identifying crash black spots in Melbourne road network using
kernel density estimation in GIS. In: Road Safety and Simulation.

Wallman C-G, Åström H. 2001. Friction measurement methods and the correlation between road
friction and traffic safety: a literature review. Available at
https://books.google.hu/books?id=VL9BHQAACAAJ.

Yu H, Liu P, Chen J, Wang H. 2014. Comparative analysis of the spatial analysis methods for
hotspot identification. Accident Analysis & Prevention 66(2083):80–88
DOI 10.1016/j.aap.2014.01.017.
