key: cord-0954826-iz5n3svz authors: Zhu, Liping; Wang, Bingyao; Yan, Yihan; Guo, Shuang; Tian, Gangyi title: A novel traffic accident detection method with comprehensive traffic flow features extraction date: 2022-04-28 journal: Signal Image Video Process DOI: 10.1007/s11760-022-02233-z sha: dde5f836aacd17dd129944db020f192a91a6a5c4 doc_id: 954826 cord_uid: iz5n3svz

With the rapid increase in the number of automobiles, traffic accidents are becoming more frequent. This creates a great need for effective traffic anomaly detection algorithms. Existing methods focus on inferring abnormalities directly from traffic flow and fall short in the extraction and representation of traffic flow features. In this paper, we propose three new traffic flow features, namely road congestion, traffic intensity, and traffic state instability, for more comprehensive traffic status representation and anomaly detection. Residual analysis, quadratic discrimination, and multi-resolution wavelet analysis are integrated to extract these features, which are then applied to the downstream task of traffic anomaly detection. Experimental results reveal that accident identification based on the proposed features is more effective than identification based on the raw traffic flow, which may provide an alternative approach for further applications and studies.

In modern transport systems, various detectors provide actionable information in critical situations, which enables us to automatically discover abnormalities in the traffic stream in time. However, due to the complexity and variability of mass traffic behavior, it is difficult to identify abnormal traffic events directly from raw observed flow measures. Therefore, more sophisticated approaches that extract features with clear meaning and effective representation are necessary for the automatic analysis of traffic flow data.
The detection of abnormal traffic accidents has already been realized with machine learning algorithms, artificial intelligence [27], and deep learning [10] related technologies. Xia [28] proposed an unsupervised method based on the sparse topic model to capture motion patterns and detect anomalies in traffic surveillance. Elizabeth Hou [12] addressed the problem of detecting anomalous activity in traffic networks where the network is not directly observed, based on a Bayesian hierarchical model. Ronald D. Hagan [9] presented a case study on the analysis of New York City taxi traffic using the compound analytics framework. Silva Nuno [24] used PCA to analyze the attribute complexity of traffic features. Cuadra-Sanchez [4] focused on longitudinal traffic analysis, namely detecting sudden peak changes. Takahiro Kudo [14] detected traffic anomalies for every period of measured traffic via PCA. Youcef Djenouri [6] reviewed the use of outlier detection approaches in urban traffic analysis. Rupam Deb [5] presented a correlation-based imputation method to improve the quality of traffic flow data. Shu-Bin Li [15] realized accident detection by taking into account the traffic ratio at the entrances and crossways. Seyed Hessam-Allah [11] provided a novel rule-based method to predict traffic accident severity according to the user's preferences. Recently, Zheng Zhao [31] discussed a novel traffic forecast model based on the long short-term memory (LSTM) network. Meanwhile, Mehrannia and BagiSiamese, et al. [18] also investigated the deep representation of loop detector data using LSTM for automatic detection of freeway accidents. To deal with scenarios where only small datasets are available for training, Sabour and Rao, et al. [22] further developed the Siamese neural network-based DeepFlow to automatically analyze traffic flow data.

Fig. 1 A visual statement of the vehicle collision accident: the normal condition (top) vs. the abnormal condition (bottom) after the accident
Meanwhile, XGBoost [21], ensemble support vector machines [26], isolation forest [17], and other machine learning algorithms [23] have also been applied to detect abnormal traffic status from flow data.

In this study, we focus on the feature extraction of traffic flow data for more effective identification of abnormal traffic events. Here, the term abnormal traffic event refers to a point in time at which the traffic system behaves abnormally and differs significantly from its previous normal behavior. This can be caused by natural factors (short-term heavy rainfall, for example) or by human factors such as traffic accidents. Fig. 1 shows the abnormal scene of a vehicle collision. Since traffic flow parameters often show a significant trend under normal circumstances, an abnormal vehicle collision has a significant impact on traffic parameters [7, 20]. However, the occurrence of traffic accidents is real-time, complex, and sporadic. Meanwhile, the relationships between these accidents and their reflection in raw traffic flow parameters are hard to summarize as clear rules. This brings great difficulties to the automatic anomaly detection of traffic status.

Based on the I80-E highway traffic flow data and accident records provided by the US PeMS system [30], we propose three new traffic flow features, namely road congestion, traffic intensity, and traffic state instability, for more comprehensive traffic status representation and anomaly detection. Our study follows the flowchart shown in Fig. 2.

Road congestion refers to the traffic phenomenon caused by a surge of vehicles. The McMaster algorithm is based on a catastrophe-theory model of highway traffic states [29]. However, when applying the McMaster algorithm to the I80-E highway data, we find three limitations in the algorithm: (1) The critical ... Therefore, we develop a new method to eliminate the above limitations.
Fig. 3 shows the traffic flow data of detector S400430, where the X-axis represents the flow, the Y-axis the occupancy, and the color the speed. When the speed is higher than 50, the road is in the non-congestion state, and the flow is linearly related to the occupancy. In contrast, in the state of road congestion (speed lower than 50), the flow and occupancy no longer satisfy this simple linear relationship, and the points form several clusters away from the normal linear trend. Based on these characteristics of road congestion, we design road congestion identification features using the Quadratic Discriminant Analysis (QDA) algorithm [3].

We utilize the following linear basis function model to map the flow $X_{flow}$ to the occupancy $X_{occ}$ for the non-congestion state:

$$X_{occ} = a + b\,X_{flow}$$

where the constant $a$ represents the vertical intercept, and $b$ represents the slope of the linear basis function. Then, the residual $E_{occ}$ between the predicted occupancy $X_{occ}$ and the ground truth $X^{*}_{occ}$ is adopted to measure the deviation of the road state from this linear trend:

$$E_{occ} = X^{*}_{occ} - X_{occ}$$

The non-congestion occupancy of each point can thus be estimated by the linear relationship, and the observations are roughly labeled by the 3σ criterion [25] according to their residual with respect to the real occupancy. Substituting the flow $X_{flow}$ of each point into the above linear relationship, we find that $E_{occ}$ approximately follows a Gaussian distribution with a mean of 0. Residuals of points in the uncongested state stay within the normal range, whereas residuals in the congested state are markedly larger. According to the 3σ criterion, a point with $E_{occ} > 3\sigma$ can be considered to be in a road congestion state, where $\sigma$ is the standard deviation of $E_{occ}$. Finally, the QDA algorithm is used for supervised learning to predict the congestion probability.
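The labeling steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the speed threshold of 50 is taken from the text, σ is estimated from the free-flow residuals only (an assumption), and a one-dimensional QDA on the residual stands in for the QDA model, whose actual input variables are not stated in the text.

```python
import math
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rough_labels(flow, occ, speed, speed_limit=50.0):
    """Fit the free-flow line, then flag points whose residual exceeds 3 sigma."""
    free = [i for i, v in enumerate(speed) if v > speed_limit]
    a, b = fit_line([flow[i] for i in free], [occ[i] for i in free])
    resid = [o - (a + b * f) for f, o in zip(flow, occ)]  # E_occ per point
    # Assumption: sigma is taken from free-flow residuals only.
    sigma = pstdev([resid[i] for i in free])
    return resid, [r > 3.0 * sigma for r in resid]


def fit_qda_1d(values, labels):
    """Per-class mean, standard deviation and prior (both classes must occur)."""
    params = {}
    for cls in (False, True):
        vals = [v for v, lab in zip(values, labels) if lab == cls]
        params[cls] = (mean(vals), max(pstdev(vals), 1e-9), len(vals) / len(values))
    return params


def log_gaussian(x, mu, sd):
    return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def congestion_probability(x, params):
    """Posterior probability of the congested class for residual x."""
    score = {c: math.log(p) + log_gaussian(x, mu, sd)
             for c, (mu, sd, p) in params.items()}
    diff = score[False] - score[True]
    if diff > 700.0:  # exp() would overflow; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))
```

Because each class keeps its own variance, the decision boundary is quadratic in the residual, which is what distinguishes QDA from LDA. The rough 3σ labels serve only as training targets; the posterior gives the graded congestion degree.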
The conclusion based on the 3σ criterion is only a rough binary judgment of whether the road is congested or not. To obtain more accurate conclusions, the QDA algorithm [3] is used to build a classification prediction model. Applying the model to the detector data yields the congestion probability of each point, which indicates the degree of road congestion.

Traffic intensity refers to the number of vehicles detected by a detector per unit time. Since traffic patterns differ between working days and holidays, the traffic data of Saturdays, Sundays, and legal holidays in the United States are marked for further analysis. Then, short-term historical flow data are collected to forecast the trend values for each day. The median of the non-holiday traffic intensity in the preceding two weeks (about 10 days) is used as the trend value for working days. Meanwhile, the median of the holiday traffic intensity is used as the trend value for holidays.

We take the following approach to extract the traffic intensity anomaly features. Firstly, the differences between the actual and predicted occupancy and speed are calculated for each moment. Let $X_{occ}$ and $X_{speed}$ represent the actual values of occupancy and speed at a certain time, and let $T_{occ}$ and $T_{speed}$ indicate their corresponding predicted values. The differences are:

$$D_{occ} = X_{occ} - T_{occ}, \qquad D_{speed} = X_{speed} - T_{speed}$$

When there is no anomaly, $D_{occ}$ and $D_{speed}$ tend to 0. When abnormalities occur and traffic increases, $D_{occ}$ will be far greater than 0, and $D_{speed}$ will be far less than 0. In contrast, if traffic decreases, $D_{occ}$ will be far less than 0, and $D_{speed}$ will be far greater than 0. Then, the cumulative probability functions of $D_{speed}$ and $D_{occ}$ are estimated under a normal distribution assumption. As can be observed in Fig. 4, the empirical distributions of $D_{occ}$ and $D_{speed}$ of each detector are quite similar to the probability density curves of the corresponding normal distributions.
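As a hedged illustration of the trend and residual steps above, the sketch below computes a per-time-slot median trend from a few historical days of the same day type, the residuals between actual and trend values, and the value of a zero-mean normal cumulative probability function at a residual. The function names and the list-of-lists data layout are assumptions made for this example only.

```python
import math
from statistics import median


def median_trend(history):
    """history: daily series of equal length, all from the same day type
    (working day or holiday). Returns the per-slot median as the trend."""
    return [median(day[i] for day in history) for i in range(len(history[0]))]


def residuals(actual, trend):
    """D = X - T for each time slot."""
    return [x - t for x, t in zip(actual, trend)]


def normal_cdf(d, sd):
    """Cumulative probability of N(0, sd^2) evaluated at d."""
    return 0.5 * (1.0 + math.erf(d / (sd * math.sqrt(2.0))))
```

A residual near 0 maps to a cumulative probability near 0.5, while a strongly positive or negative residual maps close to 1 or 0. This removes the detector-specific magnitude of the raw residual.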
To further validate this observation, we performed the Kolmogorov-Smirnov normality test [16] for $D_{occ}$ and $D_{speed}$; the results are presented in Table 1. They confirm with high confidence that $D_{occ}$ and $D_{speed}$ on each detector follow normal distributions with expectation 0 and their respective sample variances. Therefore, the abnormal traffic intensity features $P_{more}$ and $P_{less}$ can be calculated from the cumulative probability functions of the approximated normal distributions. According to the definition of the cumulative probability function of the normal distribution, the cumulative probability functions of $D_{occ}$ and $D_{speed}$ are:

$$F_{occ}(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\delta_{occ}} \exp\left(-\frac{u^{2}}{2\delta_{occ}^{2}}\right) du, \qquad F_{speed}(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\delta_{speed}} \exp\left(-\frac{u^{2}}{2\delta_{speed}^{2}}\right) du$$

where $\delta_{occ}^{2}$ and $\delta_{speed}^{2}$ are the sample variances of $D_{occ}$ and $D_{speed}$, respectively. The abnormal traffic intensity features $P_{more}$ and $P_{less}$ are then defined from these cumulative probabilities, and both take values in (0,1). A high $P_{more}$ value indicates an abnormal increase in traffic intensity, while a high $P_{less}$ value indicates an abnormal decrease. Compared with directly using $D_{occ}$ and $D_{speed}$ to represent abnormal traffic intensity, $P_{more}$ and $P_{less}$ effectively eliminate the magnitude differences of $D_{occ}$ and $D_{speed}$ among different detectors.

We also develop a wavelet analysis-based approach to extract features that represent traffic state instability. This approach is applied to the series of the three raw observed traffic parameters (flow, speed, and occupancy), and the corresponding local activity and fluctuation intensity features are obtained to represent the overall and local variations of traffic flows. The traffic state instability features are extracted within the multi-resolution wavelet analysis framework. It is necessary to eliminate local fluctuations in the flow data that are independent of changes in the overall trend.
Therefore, we calculate the overall trend of traffic flow through a frequency-domain smoothing algorithm based on the discrete binary wavelet transform [19]. In our approach, the Daubechies wavelet basis is applied. From the Daubechies wavelet function $\psi(t)$ and scale function $\varphi(t)$, the dilated and translated basis functions are, in the standard multi-resolution form:

$$\varphi_{j,k}(t) = 2^{-j/2}\,\varphi(2^{-j}t - k), \qquad \psi_{j,k}(t) = 2^{-j/2}\,\psi(2^{-j}t - k)$$

Then, the trend $f_a^{j}(t)$ and detail $f_d^{j}(t)$ on the $j$-th scale can be constructed step by step:

$$f_a^{j}(t) = \sum_{k} c_{j,k}\,\varphi_{j,k}(t), \qquad f_d^{j}(t) = \sum_{k} d_{j,k}\,\psi_{j,k}(t)$$

In the above definition, $c_{j,k}$ is the scale expansion coefficient and $d_{j,k}$ is the wavelet expansion coefficient. Setting $J$ as an arbitrary scale, traffic flow data can be reconstructed by:

$$f(t) = f_a^{J}(t) + \sum_{j=1}^{J} f_d^{j}(t)$$

The multi-resolution wavelet algorithm can be applied for band-pass filtering of traffic flows. We first decompose the raw signal $f(t)$ with a $J$-level multi-resolution wavelet transform, which yields the scale expansion coefficients $c_{j,k}$ and the wavelet expansion coefficients $d_{j,k}$ at all levels. By thresholding $c_{j,k}$ and $d_{j,k}$, the coefficients of components outside the band-pass region are set to 0. Then, the modified coefficients $c_{j,k}$ and $d_{j,k}$ are used for wavelet reconstruction to obtain the overall trends of traffic flows.

Following this multi-resolution wavelet filtering method, an 8-level Daubechies wavelet transform is performed on the road occupancy data of the detectors from low to high levels, while the high-frequency part of each level is eliminated step by step. We retain wavelet levels 4 to 8 to smooth the series of flow, speed, and occupancy, which means the time-domain resolution of the estimated overall trends is about half an hour. Finally, we calculate the mean absolute difference (fluctuation intensity) between the traffic flow data and the overall trend within a given range to indicate the local fluctuation. Traffic flow data are obtained from the PeMS system every 5 minutes. At the current time point, 13 points (about one hour) are taken forward and backward, respectively, to calculate the local activity and fluctuation intensity.
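The trend extraction and fluctuation intensity described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. It uses the Haar wavelet (the simplest member of the Daubechies family) because the text does not state the order of the Daubechies basis. It requires the signal length to be divisible by 2 to the power of `levels`. It also truncates the averaging window at the series boundaries. All function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out


def wavelet_trend(signal, levels, drop_levels):
    """Decompose `levels` times, zero the detail coefficients of the
    `drop_levels` finest levels, and reconstruct the smoothed trend."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)  # details[0] holds the finest (highest-frequency) level
    for j in range(drop_levels):
        details[j] = [0.0] * len(details[j])
    for d in reversed(details):  # rebuild from the coarsest level upward
        approx = haar_inverse_step(approx, d)
    return approx


def fluctuation_intensity(values, trend, t, half_window=13):
    """Mean absolute difference between data and trend around index t."""
    lo, hi = max(0, t - half_window), min(len(values), t + half_window + 1)
    return sum(abs(values[i] - trend[i]) for i in range(lo, hi)) / (hi - lo)
```

With `levels=8` and `drop_levels=3`, the sketch keeps levels 4 to 8 as in the text. Dropping the three finest levels of a 5-minute series with the Haar basis corresponds to averaging over blocks of 8 samples, that is, 40 minutes.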
Let the value of the traffic flow data at time $t$ be $X_t$ and the corresponding overall trend value be $T_t$. The local activity $A_t$ and fluctuation intensity $F_t$ at that time are computed over the window $W_t$ of 13 points before and after $t$. The fluctuation intensity is the mean absolute difference described above:

$$F_t = \frac{1}{|W_t|} \sum_{i \in W_t} \left| X_i - T_i \right|$$

In this way, six values can be calculated from the three raw observed traffic parameters. We name them the local activity and fluctuation intensity of flow, speed, and occupancy, respectively.

Our experiments are based on the traffic data of the I80-E highway in 2016 from the US PeMS system [30]. The spatiotemporal attributes (location, date, and time) of the traffic detectors and their three kinds of series data, namely speed, flow, and occupancy, are collected for feature extraction. Records involving injuries are selected to form the positive sample set for vehicle collision accidents. Meanwhile, 5000 negative samples are randomly collected from normal periods without any traffic event records (the workflow for negative sample collection is presented in Fig. 5). These 6524 samples form the experimental dataset for the following study.

Then, we use the proposed methods (refer to Sect. 3) to extract 60 input features. First, 48 upstream and downstream state features are extracted. These traffic accident identification features are extracted from the series data of the upstream and downstream detectors, within 15 minutes before and after the vehicle collision records. They are all basic mean statistics of the raw traffic flows and the extracted feature flows. The meanings and definitions of the 48-dimensional upstream and downstream features are listed in Table 2. Then, 12 California algorithm features are extracted following the California algorithm [13]:

$$OCCRDF = \frac{OCC_{S1,t} - OCC_{S2,t}}{OCC_{S1,t}}$$

In the above definition, $S1$ represents the upstream detector and $S2$ the downstream detector.
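The relative spatial difference above is simple to compute for paired upstream and downstream readings. The following is a minimal sketch. The function name is hypothetical, and returning `None` when the upstream reading is zero is an assumption of this example, since the text does not say how that case is handled.

```python
def relative_spatial_difference(upstream, downstream):
    """(S1 - S2) / S1 for each pair of simultaneous readings.

    upstream:   readings of detector S1 at successive times
    downstream: readings of detector S2 at the same times
    Returns None where the upstream reading is zero (ratio undefined).
    """
    return [(u - d) / u if u != 0 else None
            for u, d in zip(upstream, downstream)]
```

A value close to 0 means both detectors see similar conditions. A large positive value means the upstream reading is much higher than the downstream one, the pattern expected for occupancy when an incident lies between the two detectors.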
We simply extend the California algorithm to the other traffic flow parameters (not only the occupancy data); the names and definitions of these features are presented in Table 3. Based on the collected 6524 samples and 60 input features, we treat the detection of vehicle collision accidents as a binary classification task and learn classification models with machine learning algorithms for accident detection.

1) Feature Selection: To overcome the "curse of dimensionality", the importance of the 60 features is evaluated by the MDA (Mean Decrease in Accuracy) method of random forest [8]. Its core idea is to investigate the influence of a random disturbance of each feature on the prediction accuracy [1], which is evaluated on the OOB (out-of-bag) test sets through ensemble learning.

2) Algorithm Parameters Tuning: Six classification algorithms are applied, namely linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), neural network (NNET), classification support vector machine (CSVM), classification and regression trees (CART), and classification random forest (CRF). These algorithms take the 9 selected features as inputs and directly output whether a vehicle collision accident happens at the given moment. Within the normal parameter range of each algorithm, we use grid search [2] to evaluate the generalization accuracy of the model under various parameter combinations and then determine the optimal combination. The best parameter combination of each algorithm and its generalization accuracy are shown in Table 4.

We use 20-fold cross-validation to validate the generalization performance of the established models. The results are shown in Table 5, with the best performance highlighted in bold. The accuracy of the CRF model is 0.911, which is significantly higher than that of the other models.
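The idea behind MDA, measuring how much accuracy drops when one feature is randomly disturbed, can be sketched in a model-agnostic way. The code below is an illustration only. It shuffles a feature column on a fixed evaluation set, whereas the random forest MDA described above evaluates each tree on its own out-of-bag samples. The function names and the list-of-lists row format are assumptions of this example.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, feature, repeats=10, seed=0):
    """Mean accuracy drop after shuffling column `feature` across rows."""
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with only the chosen feature replaced.
        shuffled = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(rows, column)]
        drops.append(base - accuracy(predict, shuffled, labels))
    return sum(drops) / repeats
```

Features whose shuffling leaves the accuracy unchanged carry no information for the classifier and are candidates for removal. Features with a large mean drop are the ones worth keeping.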
The prediction accuracy and precision of CRF are both above 0.85. Because of the serious imbalance of the sample data, the recall of CRF is only 0.655. However, it is still significantly higher than that of the other models. Therefore, the CRF algorithm is finally selected for modeling. Then, the final vehicle collision detection model is constructed with all 6524 samples.

We further applied the proposed method to the traffic data of the I80-E highway in 2017. Table 6 shows the performance evaluation results of various baseline models [4, 17, 22, 31], with the better performance in each pair highlighted in bold. CRF is short for the supervised random forest approach used in the previous section; SPC represents the sudden peak change-based method proposed by Cuadra-Sanchez [4]; iForest is the isolation forest-based unsupervised abnormal analysis approach, which has been applied in [17] and [22]; LSTM is the supervised LSTM-based method [31]; and the idea of DeepFlow [22] is also applied to further improve the performance of LSTM-based methods. The subscript RAW indicates that an algorithm is applied to the raw observed traffic flow data, while RTS indicates that it is applied to the extracted features.

The comprehensive features clearly improve the detection of vehicle collision accidents. Besides, the performance of the CRF$_{RTS}$ model is even slightly better than that of LSTM$_{RAW}$. Meanwhile, LSTM$_{RTS}$ and DeepFlow$_{RTS}$, which are based on the extracted feature flows, are also more accurate than LSTM$_{RAW}$ and DeepFlow$_{RAW}$. These results further confirm the advantages of the proposed feature extraction methods. In fact, the feature selection approach in Sect. 3 also reveals the effectiveness of the comprehensive features.
The above application and comparison confirm the advantage of the extracted comprehensive features for identifying abnormal traffic events. In this section, we further discuss why the proposed flow features are superior to the raw traffic flow data.

(1) Due to differences in road conditions and traffic environments at their locations, as well as in the sensors themselves, traffic flows measured by different detectors differ significantly in magnitude and response patterns, which means their distributions are diverse. However, statistical abnormality identification assumes that all the samples involved obey the same distribution. The extracted flow features adaptively eliminate the distribution differences between detectors by statistically modeling the flow data of each detector separately. For example, based on the flow patterns of each detector, the road congestion features use QDA to automatically model and convert the raw traffic flow data into congestion probability values, which approximately obey the same Bernoulli distribution. Meanwhile, the cumulative probability functions (with parameters estimated from the flow data of each detector separately) also effectively eliminate the magnitude differences of $D_{occ}$ and $D_{speed}$ among different detectors. Through our extraction approach, irrelevant responses due to detectors or their locations are suppressed, while features associated with abnormal traffic states are retained. This greatly reduces the difficulty and improves the robustness of the subsequent traffic event identification.

(2) The proposed flow features effectively highlight the responses of abnormal traffic states. What the extracted features have in common is that they all reflect the deviation of current states from normal states.
For example, the road congestion features automatically identify the occ-flow relationship for normal non-congestion situations and indicate the degree of road congestion by the deviation from this normal relationship. Traffic intensity features take the median daily trends of working days and holidays as the respective normal states and measure the abnormal degree of current states by statistical residual modeling. Meanwhile, traffic state instability features separate the trend (normal status) from the local details (abnormal variation) through multi-resolution wavelet analysis, which also highlights the possibly abnormal fluctuation intensity of the raw flow data. Since traffic accidents often cause traffic flows to deviate from their normal status, flow features that highlight abnormal responses of traffic flows effectively improve the identification of traffic events.

However, the proposed traffic flow feature extraction method still has its limitations. The studied traffic data are observed from an ideal closed highway system without level crossings. This enables us to assume that the road traffic is only affected by the upstream and downstream traffic. For a more complicated traffic system, especially a road system with many intersections, changes in traffic status can be influenced by more factors. In this case, it is not certain whether an anomalous response in the traffic flow data is caused by traffic events on the road in question, and the accuracy of the proposed method can be expected to drop considerably. Moreover, all the proposed feature extraction methods adopt a data-driven approach. Even though this lets the extracted features adapt to specific traffic data, it also means that the established feature extraction and accident detection models can only be applied to the studied road.
If the traffic system or traffic patterns change significantly, the whole workflow needs to be run again to update the models. For example, due to the impact of COVID-19 on how often people travel, traffic conditions on the I80-E freeway may change dramatically, and the validity of a model learned from previous data can no longer be guaranteed.

In this manuscript, we proposed three new traffic flow features, namely road congestion, traffic intensity, and traffic state instability, for more comprehensive traffic status representation. Based on the I80-E highway traffic flow data provided by the US PeMS system, we illustrated the use of the extracted traffic flow features for vehicle collision accident detection. Comparative experiments reveal that the proposed comprehensive traffic features can effectively improve the performance of abnormal traffic event identification and are worth further application. However, the proposed traffic flow feature extraction method still has its limitations. Whether the proposed method works well for a more complex traffic system needs further investigation. Meanwhile, all the proposed feature extraction methods adopt a data-driven approach, which assumes that the traffic system and traffic patterns do not change significantly. Therefore, how to adaptively update the established models for dynamically changing traffic systems is still worth more study.
Empirical comparison of tree ensemble variable importance measures
Random search for hyper-parameter optimization
Performance analysis of LDA, QDA and KNN algorithms in left-right limb movement classification from EEG data
Proposal of a new information theory-based technique based on traffic anomaly detection analysis
A correlation based imputation method for incomplete traffic accident data
A survey on urban traffic anomalies detection algorithms
Impact analysis of accidents on the traffic flow based on massive floating car data
Correlation and variable importance in random forests
Classification and anomaly detection in traffic patterns of New York City taxis: A case study in compound analytics
Deep learning
Traffic accident severity prediction using a novel multi-objective genetic algorithm
Anomaly detection in partially observed traffic networks
Comparison of fuzzy-wavelet radial basis function neural network freeway incident detection model with California algorithm
PCA-based robust anomaly detection using periodic traffic behavior
Incident detection method of expressway based on traffic flow simulation model
Evaluating Kolmogorov's distribution
Detecting anomalous driving behavior using neural networks
Deep representation of imbalanced spatio-temporal traffic flow data for traffic accident detection
From sensors data to urban traffic flow analysis
Feature recognition of urban road traffic accidents based on GA-XGBoost in the context of big data
DeepFlow: Abnormal traffic flow detection using Siamese networks
A machine learning based framework for IoT device identification and abnormal traffic detection
Road anomalies detection system evaluation
Modern applied statistics with S-PLUS
An improved selective ensemble learning method for highway traffic flow state identification
Defining artificial intelligence. SMPTE Motion Imag
Anomaly detection in traffic surveillance with sparse topic model
Highway traffic accident prediction based on SVR trained by genetic algorithm
Real-time flux and density estimation of freeway traffic with decentralized speed data
LSTM network: a deep learning approach for short-term traffic forecast