DRIVER SLEEPINESS DETECTION ALGORITHM BASED ON RELEVANCE VECTOR MACHINE

Driver sleepiness is one of the most important causes of traffic accidents. Efficient and stable algorithms are crucial for distinguishing the non-fatigue state from the fatigue state. The relevance vector machine (RVM), a leading-edge detection approach, can meet this requirement and represents a potential solution for fatigue state detection. To accurately and effectively identify the driver's fatigue state and reduce the number of traffic accidents caused by driver sleepiness, this paper considers the degree of the driver's mouth opening and the eye state as multi-source related variables and establishes a classification of fatigue and non-fatigue states based on the related literature and investigation. On this basis, an RVM model for automatic detection of the fatigue state is proposed. Twenty male respondents participated in the data collection process


Introduction
Driver sleepiness is one of the main causes of traffic accidents and severely threatens road traffic safety. The risk of an accident in the fatigue state is four to six times higher than in the non-fatigue state (Gonçalves et al., 2015a). In China, there were 244 937 road crashes in 2018, resulting in 63 194 fatalities and 258 532 injuries. Fifteen percent of these crashes were reported to be directly or indirectly associated with driver drowsiness (Wang, Ma, & Wei, 2019a). In Europe, the average prevalence of falling asleep at the wheel in 2014 and 2015 was 17%. Among respondents who fell asleep while driving, the median prevalence of sleep-related accidents was 7.0% (Gonçalves et al., 2015b). In the USA, driver sleepiness has caused roughly 1550 fatalities and 40 000 nonfatal injuries every year since 2007 (Wang, Ma, & Li, 2018). The factors causing traffic accidents include human, vehicle, road, and environmental factors. Among them, human factors are the most critical (Ning & Feng, 2006). Moreover, among human driving behaviors, driver sleepiness is the most difficult to detect even though it has a high incidence rate. Driver sleepiness, or drowsy driving, occurs when, after a long period of driving, the driver experiences psychological and physiological function disorder that significantly decreases driving ability. According to statistics, about 89% of the world's safety accidents are caused by traffic accidents, and more than 80% of the casualties are caused by traffic accidents. Therefore, it is necessary to prevent driver sleepiness. However, because driver behavior and physiological state during sleepiness differ between individuals (Xu, Pei, & Wang, 2016), sleepiness is more difficult to measure with quantitative physiological indexes than drunk driving is. In addition, driver sleepiness is difficult to supervise through legislative enforcement.
Hence, the driver's fatigue state detection has become a hotspot in the driver sleepiness research field. Vehicle fatigue pre-warning equipment is considered to be the most important tool for preventing fatigued driving (Zhao, Xu, Sun, Pan, & Li, 2018b), and the automatic driver sleepiness discrimination algorithm is its core element.
Driver sleepiness detection methods can be roughly divided into subjective and objective methods. Subjective detection methods judge the fatigue state based on the driver's feedback, whereas objective detection methods detect the fatigue state based on the driver's physiological data or driving behavior. Objective detection can be divided into intrusive and non-intrusive detection (Sun, Zhang, Peeta, He, & Li, 2017). Intrusive detection includes the electroencephalogram (EEG) and electrocardiogram (ECG). Intrusive detection indicators can accurately reflect the driver's fatigue state, but the detection process can interfere with driving, which makes them impractical. Non-intrusive detection mainly detects the fatigue state from facial indicators, including the eyes and mouth, and does so without interfering with driving. Compared to intrusive detection, non-intrusive detection is safer and more convenient, and it has been widely used. Therefore, in this work, the eyes and mouth are selected as non-intrusive indicators for detection. The purpose of this paper is to study a method for early warning of driver sleepiness. In order to identify the driver's fatigue state accurately and efficiently, this paper proposes a multi-index fusion discrimination algorithm based on face recognition. The face is recognized by a convolutional neural network (CNN), and the PERCLOS value and mouth opening degree are used as detection indexes; the fatigue state is judged by the index fusion discrimination algorithm based on the relevance vector machine (RVM). The experimental results show that the accuracy of the proposed algorithm is higher than 90%, and an almost ideal fatigue discrimination effect is obtained.
The paper is organized as follows. In Section 1, a review of the related research is presented. In Section 2, the index recognition method and the RVM algorithm are introduced. In Section 3, the method for fatigue state identification is introduced, and the experimental results are presented. In Section 4, the conclusions are summarized.

Index selection
Much research has been conducted on the selection of driver sleepiness detection indicators. These indicators can be divided into intrusive and non-intrusive. Intrusive indicators include the electroencephalogram (EEG) index, the electrocardiogram (ECG) index, and many others. Non-intrusive indicators, which have less impact on driving, include driving behavior indicators (Aghaei et al., 2016), eye movement indicators (Mandal, Li, Wang, & Lin, 2017), and others. Zhao et al. (2018a) studied driver sleepiness detection based on driving behavior. Ahlstrom et al. (2013) studied driver sleepiness detection based on eye indicators, which included eye opening, blink frequency, and other indicators. PERCLOS (Percentage of Eyelid Closure Over the Pupil Over Time) refers to the percentage of a given time interval during which the eyes are closed. PERCLOS is currently the most effective indicator for detecting driver sleepiness based on eye data (Junaedi & Akbar, 2018). Deng and Wu (2019) accurately judged the driver's fatigue state by analyzing the state of the driver's eyes and mouth. Because non-intrusive indicator measurement does not interfere with normal driving (He, Li, Fan, & Fei, 2016), non-intrusive detection represents a good way to detect the fatigue state of drivers and can also accurately reflect the driver's state. Intrusive measurement indicators (e.g., the EEG and ECG indexes) have been proven to be effective and can guarantee the recognition accuracy of the detection algorithm to the utmost extent. Jung, Shin, and Chung (2014) conducted extensive research on driver sleepiness detection based on the ECG. Ronen, Oron-Gilad, and Gershon (2014) studied driver sleepiness detection based on heart rate, and the results showed that this indicator could accurately reflect the driver's state.
Although these intrusive measurement indicators correlate highly with the fatigue state, they interfere more with driving (Zhao et al., 2018a), and the costs of the required equipment are extremely high. In order to balance interference with driving against the accuracy of the driver's state judgment, an efficient and stable algorithm adopting non-intrusive measurement indicators is proposed to enhance the correlation between the fatigue state and the non-intrusive measurement indicators and achieve accurate detection of the fatigue state. Therefore, this paper selects eye state and mouth opening degree as two non-intrusive detection indexes for the driver's state analysis.

Face recognition
Image recognition technology, especially face recognition technology, has always been a focus of machine learning research. Image recognition methods can be divided into geometry-based recognition methods, 3D face recognition methods, support vector machine (SVM) based face recognition methods, and CNN-based recognition methods. The geometry-based recognition methods are based on the geometric features of a human face. Osia and Bourlai designed a face recognition device that can realize face recognition under complex conditions by geometric normalization of the input face image and detection of facial features (Osia & Bourlai, 2014). The 3D face recognition methods are based on 3D face data. Chai et al. established 3D face recognition to overcome the difficulty of face recognition under different illumination levels (Chai, Shan, Qing, Chen, & Gao, 2006); Kun, Lin, Li, and Liang (2012) proposed a 3D face recognition method based on the cascade classifier algorithm, which improved on the performance of other related algorithms. In the field of SVM-based face recognition, Osuna, Freund, and Girosi (1997) pioneered the SVM-based face recognition algorithm. The main idea of this method is to map the low-dimensional features to high-dimensional features and then transform the recognition problem into a classification problem. Wang and Zhou (2006) used two different kernel function classifiers based on the SVM to classify and recognize a training sample set composed of 280 pictures of 40 people; the two classifiers achieved classification error rates of 9.83% and 10.08%, and the classification effect was good. The CNN-based recognition methods are the most common recognition methods at present. Rehman, Tu, Huang, and Yang (2016) proposed unsupervised learning methods based on CNN and sparse filtering, which can effectively distinguish image features.
The test results of face recognition showed that image recognition and detection performance improved significantly. Because CNNs can take the original image directly as the input and thus avoid complex image pre-processing, they are widely used in the field of image recognition.

Discriminant algorithm
With the development of artificial intelligence, machine learning has come to be considered the best method for driver sleepiness detection (Martensson, Keelan, & Ahlstrom, 2018). The analysis of driver sleepiness discrimination algorithms shows that, among all types of fatigue detection algorithms, and especially compared to schemes that monitor a single index, fatigue detection based on multi-source measurement indexes and artificial intelligence technology has demonstrated higher accuracy and stability. Namely, compared to a single detection index, multiple detection indexes can more accurately reflect the fatigue state of drivers, which is why they have been widely used in the development of automatic driver sleepiness detection methods. At present, artificial intelligence-based methods mainly include the dynamic Bayesian network (He, Li, Fan, & Fei, 2015), neural networks (Yan, Coenen, & Zhang, 2015), RVM, and SVM (Yeo, Li, Shen, & Wilder-Smith, 2009). The RVM (Caesarendra, Widodo, & Yang, 2010) is based on automatic relevance determination, which can remove irrelevant data points and reduce the impact of random volatility of the data on the fatigue state discrimination. Compared to the SVM, the RVM has the advantages of shorter test time and the ability to use arbitrary kernel functions.
Based on the results of the related studies presented above, this paper selects the mouth opening degree index and eye state as indicators and proposes a driver drowsiness recognition algorithm based on the multi-source heterogeneous data. The proposed algorithm uses multiple indicators to identify driver sleepiness, improves the accuracy of fatigue state recognition, and provides a scientific theoretical basis for the development of early warning of dangerous driving conditions.

Methods
In this section, the proposed index recognition and fatigue state identification algorithm is introduced, and CNN, Canny operator, Hough transform, and RVM are presented.

Parameter setting

Index selection
The characteristics and performance of 11 common detection indicators, obtained by analyzing a large number of driver sleepiness detection studies, are listed in Table 1. Although detection indexes based on the driver's physiological parameters can guarantee the accuracy of driver sleepiness detection, these parameters are difficult to collect, and the collection process has a great impact on the driver, so they are not used. This paper aims to select detection indicators that interfere with driver behaviour as little as possible. Based on the criteria of accuracy and cost, eye state and mouth opening and closing degree have been determined to be the most appropriate detection indicators. In addition, these two indicators are also the most intuitive indicators for judging whether the driver is tired.
Fatigue state division
PERCLOS refers to the percentage of eye closure time within a specific time, and it is calculated by:
$$f = \frac{t_3 - t_2}{t_4 - t_1} \times 100\% \tag{1}$$
where $f$ denotes the eye closure degree, i.e., the PERCLOS value (%), which represents the percentage of eye-closing time in a certain unit of time; $t_1$ denotes the time from the maximum opening of the eye to 20% eye closing; $t_2$ denotes the time from the maximum opening of the eye to 80% eye closing; $t_3$ denotes the time from the maximum opening of the eye to total eye closing and then to 20% eye opening; $t_4$ denotes the time from the maximum opening of the eye to total eye closing and then to 80% eye opening, as shown in Figure 1.
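As a small worked example of the timing-based definition, the sketch below evaluates PERCLOS from the four time stamps, assuming the widely used form f = (t3 - t2) / (t4 - t1) × 100%. The function name `perclos_from_times` and the sample values are hypothetical and serve only as an illustration.

```python
def perclos_from_times(t1: float, t2: float, t3: float, t4: float) -> float:
    """Return PERCLOS in percent from the four eyelid time stamps.

    Assumes f = (t3 - t2) / (t4 - t1) * 100, with all times measured
    from the moment of maximum eye opening.
    """
    if not (t1 <= t2 <= t3 <= t4) or t4 == t1:
        raise ValueError("expected t1 <= t2 <= t3 <= t4 with t4 > t1")
    return (t3 - t2) / (t4 - t1) * 100.0


if __name__ == "__main__":
    # Illustrative blink: the eye is at least 80% closed from 0.125 s to 0.375 s.
    print(perclos_from_times(0.0, 0.125, 0.375, 0.5))
```

With these illustrative values the eye is mostly closed for half of the closing and reopening cycle, so the function returns 50.0.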
Since eye opening and closing is a continuous process, the method adopted in this paper is to evaluate the state of the eyes every 60 s. The number of frames of the driver detected in 60 s is denoted as $A$, and the number of frames in which the driver's eyes are closed during the same period is denoted as $F$; then, the PERCLOS value $f$ can be expressed as the ratio of $F$ to $A$, as given by Eq. (2):
$$f = \frac{F}{A} \tag{2}$$
A large number of studies have shown that in the PERCLOS algorithm, when the threshold value of $f$ is 0.4, the driver's fatigue state can be well detected. When $f > 0.4$, the driver is in the fatigue state (Tian & Ji, 2019).
Yawning is the most characteristic indicator of the fatigue state: the mouth opens to a certain extent and stays in that state for a period of time. According to the feature points of the face, a two-dimensional coordinate system of the driver's face is constructed to obtain the characteristic data of the driver's mouth. Since mouth size differs between people, judging yawning directly from the change in the absolute size of the mouth lacks universality and accuracy. Therefore, the mouth opening degree $\alpha$ is defined by Eq. (3) and is taken as one of the criteria of the proposed fatigue discrimination algorithm:
$$\alpha = \frac{H}{L} \tag{3}$$
In Eq. (3), $H$ denotes the height of the inner ring of the mouth and $L$ denotes the distance between the two corners of the mouth, as shown in Figure 2.
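The two indexes can be computed per 60 s window with a few lines of code. The sketch below is only an illustration: it assumes that each frame has already been classified as eyes closed or eyes open and that H and L have already been measured from the mouth landmarks; the function names are hypothetical.

```python
from typing import Sequence


def perclos_from_frames(eye_closed: Sequence[bool]) -> float:
    """PERCLOS f = F / A for one window: F closed-eye frames out of A frames."""
    total = len(eye_closed)
    if total == 0:
        raise ValueError("window contains no frames")
    closed = sum(1 for is_closed in eye_closed if is_closed)
    return closed / total


def mouth_opening_degree(height: float, width: float) -> float:
    """Mouth opening degree alpha = H / L (inner-lip height over corner distance)."""
    if width <= 0:
        raise ValueError("mouth width L must be positive")
    return height / width


if __name__ == "__main__":
    frames = [True] * 9 + [False] * 11        # 9 closed-eye frames out of 20
    print(perclos_from_frames(frames))         # ratio F / A for the window
    print(mouth_opening_degree(21.0, 30.0))    # H = 21 px, L = 30 px
```

Because the mouth opening degree is a ratio, it does not depend on the absolute size of the mouth in the image, which is the reason given above for preferring it over raw mouth dimensions.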
According to Deng and Wu (2019), when a driver yawns, the mouth opening degree is greater than 0.6, with a maximum of about 1.2, while the opening degree varies between 0.4 and 0.5 under normal speaking conditions. The mouth opening degree for different states is shown in Figure 3. In this work, the threshold value of the mouth opening degree is set at 0.6. When the maximum value of the mouth opening degree of a driver in a period of 60 s exceeds 0.6, the driver is judged to be in the fatigue state. In this paper, mouth opening degree and eye state are used to distinguish the driver's state. The data taken in the experiment need to be labelled in advance. The classification of the fatigue state is shown in Table 2, and according to the values presented in this table, the pre-classification should be carried out in the data collection process.
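The pre-classification of one 60 s window can be sketched as a threshold rule. The thresholds 0.4 and 0.6 come from the text; how the two criteria are combined is not stated explicitly, so the `require_both` switch below is an assumption made for illustration, as are the function and constant names. The comparison operators follow Table 2 (PERCLOS greater than or equal to 0.4, mouth opening degree greater than 0.6).

```python
PERCLOS_THRESHOLD = 0.4  # eye-state threshold stated in the text
MOUTH_THRESHOLD = 0.6    # mouth-opening threshold stated in the text


def label_window(perclos: float, max_mouth_opening: float,
                 require_both: bool = False) -> int:
    """Return 1 (fatigue) or 0 (non-fatigue) for one 60 s window.

    require_both is a hypothetical switch: False treats either criterion
    as sufficient, True demands that both criteria are met.
    """
    eyes_fatigued = perclos >= PERCLOS_THRESHOLD
    mouth_fatigued = max_mouth_opening > MOUTH_THRESHOLD
    if require_both:
        fatigued = eyes_fatigued and mouth_fatigued
    else:
        fatigued = eyes_fatigued or mouth_fatigued
    return 1 if fatigued else 0


if __name__ == "__main__":
    print(label_window(0.45, 0.30))                     # eyes only
    print(label_window(0.45, 0.30, require_both=True))  # stricter rule
```

Keeping the combination rule as a parameter makes it easy to check how sensitive the labels are to that choice before any classifier is trained on them.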

Index identification

CNN-based Face Recognition
A CNN has a hierarchical structure whose main layers are the input layer, convolutional layer, pooling layer, fully connected layer, and output layer. A CNN usually includes two or more convolutional and pooling layers. At the end of the CNN, a feed-forward neural network with a fully connected layer is adopted. Backpropagation is the most commonly used neural network training method.
Table 2. Classification of the fatigue state
State | Label | Criteria
Non-fatigue | 0 | The PERCLOS value is less than 0.4. The mouth opening degree is less than 0.6.
Fatigue | 1 | The PERCLOS value is greater than or equal to 0.4. The mouth opening degree is greater than 0.6.

The specific technical indicators and calculation steps of face recognition are as follows. First, the input image is pre-processed to obtain an image with the size of 32 × 32 pixels, which is then fed to the network input, and six 5 × 5 convolution kernels are used to generate the C1 layer with six feature maps. After 2 × 2 pooling, the output of the C1 layer is used as the input of the S1 layer to obtain six corresponding feature maps. Next, these six feature maps of the S1 layer are convolved with 12 convolution kernels of size 5 × 5 in the second convolution. The resulting 12 feature maps form the C2 layer, whose output is fed to a 2 × 2 filter for pooling. The output of the filter is fed to the S2 layer, and after being processed by the S2 layer, the pooled image features are fully connected and normalized, and the obtained outputs are fed to the input of the softmax classifier. The softmax classifier classifies the samples and outputs the classification results, which completes the face recognition process. The structure of the CNN-based face recognition model is shown in Figure 4.
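To make the layer sizes concrete, the following sketch traces the side length of the feature maps through the described C1, S1, C2, and S2 layers. It assumes stride-1 convolutions without padding and non-overlapping 2 × 2 pooling, neither of which is stated in the text; the helper names are hypothetical.

```python
from typing import List, Tuple


def conv_output(size: int, kernel: int) -> int:
    """Side length after a stride-1 convolution without padding (assumed)."""
    return size - kernel + 1


def pool_output(size: int, window: int) -> int:
    """Side length after non-overlapping pooling (assumed)."""
    return size // window


def feature_map_sizes(input_size: int = 32) -> List[Tuple[str, int, int]]:
    """Return (layer name, number of maps, side length) for C1, S1, C2, S2."""
    layers = []
    size = input_size
    for index, maps in enumerate((6, 12), start=1):
        size = conv_output(size, 5)            # 5 x 5 convolution kernels
        layers.append((f"C{index}", maps, size))
        size = pool_output(size, 2)            # 2 x 2 pooling
        layers.append((f"S{index}", maps, size))
    return layers


def flattened_length(layers: List[Tuple[str, int, int]]) -> int:
    """Number of values passed to the fully connected layer."""
    _, maps, size = layers[-1]
    return maps * size * size


if __name__ == "__main__":
    sizes = feature_map_sizes()
    for name, maps, size in sizes:
        print(f"{name}: {maps} maps of {size} x {size}")
    print("fully connected input length:", flattened_length(sizes))
```

Under these assumptions the 32 × 32 input shrinks to 28, 14, 10, and finally 5 pixels per side, so the fully connected layer receives 12 × 5 × 5 = 300 values.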

Eye and mouth parameter recognition method
Using the collected face image, first, the face region is trimmed, and the gray image is extracted by the Canny operator, as shown in Figure 5. Next, the fast Hough transform is performed using the gray gradient information of the boundary pixels. By using the prior knowledge and the method for gradient calculation of a gray image, the gray transformation direction of each edge point is obtained. Finally, recognition of the eye and mouth contours is realized. The recognition effect is shown in Figure 6. The specific Hough transformation (Wang, Wang, & Li, 2019b) process is as follows.
Suppose the standard equation of a circle is given by:
$$(x - a)^2 + (y - b)^2 = r^2 \tag{4}$$
where $a$ and $b$ represent the abscissa and ordinate values of the circle center, respectively, and $r$ denotes the circle radius. Then, the polar equations of the circle can be expressed as:
$$x = a + r\cos\theta \tag{5}$$
$$y = b + r\sin\theta \tag{6}$$
Next, by eliminating $r$ from Eqs. (5) and (6), we get:
$$b = a\tan\theta - x\tan\theta + y \tag{7}$$
For a given edge point $(x, y)$ with edge angle $\theta$, Eq. (7) is a linear equation in $a$ and $b$. Each edge point therefore increments the cells of the accumulator array $M(a, b)$ that satisfy Eq. (7) for its edge angle $\theta$.
The whole algorithm includes the following steps:
Step 1: Quantization of the parameter spaces a and b;
Step 2: Initialization of the accumulator array M(a, b) to zero;
Step 3: Calculation of the gradient value and gradient angle;
Step 4: For each edge point satisfying the edge angle, updating the accumulator array M(a, b) by Eq. (7);
Step 5: The position where the accumulator has a local maximum represents the circle center.
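The steps above can be sketched in plain Python. This is a minimal illustration rather than the implementation used in the paper: it assumes the edge points and their gradient angles are already available (for example from an edge detector), votes along the line b = a tan θ - x tan θ + y, and switches to the equivalent form solved for a when the line is steep so that the tangent does not blow up. The function name and the synthetic test circle are hypothetical.

```python
import math
from typing import Iterable, Tuple


def hough_circle_center(edges: Iterable[Tuple[float, float, float]],
                        width: int, height: int) -> Tuple[int, int]:
    """Estimate a circle center (a, b) from edge points (x, y, theta).

    theta is the gradient angle at the edge point, which for a circle
    points along the line through the center.
    """
    # Steps 1 and 2: integer grid for (a, b), accumulator set to zero.
    acc = [[0] * width for _ in range(height)]   # indexed as acc[b][a]
    for x, y, theta in edges:                    # Step 3 is assumed done upstream
        s, c = math.sin(theta), math.cos(theta)
        # Step 4: vote for every cell on the line through (x, y) at angle theta.
        if abs(c) >= abs(s):
            t = s / c
            for a in range(width):
                b = round(a * t - x * t + y)     # line equation solved for b
                if 0 <= b < height:
                    acc[b][a] += 1
        else:
            k = c / s
            for b in range(height):
                a = round(x + (b - y) * k)       # same line, solved for a
                if 0 <= a < width:
                    acc[b][a] += 1
    # Step 5: the accumulator maximum marks the estimated center.
    _, best_a, best_b = max((acc[b][a], a, b)
                            for b in range(height) for a in range(width))
    return best_a, best_b


if __name__ == "__main__":
    # Synthetic circle: center (20, 15), radius 8, one edge point every 10 degrees.
    points = [(20 + 8 * math.cos(math.radians(d)),
               15 + 8 * math.sin(math.radians(d)),
               math.radians(d)) for d in range(0, 360, 10)]
    print(hough_circle_center(points, 40, 30))
```

Because every voting line passes through the true center, that cell collects one vote per edge point, while neighbouring cells collect only the votes of lines that happen to pass close by.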

Construction of the driver sleepiness state classification model
In order to realize the driver sleepiness state recognition, according to the driver's state definitions given in Table 2, training samples $x_n$ [$n = 1, 2, \ldots, N$; $x \in (\sigma_h, \sigma_r)$] are used as the input, the discriminating results $t_n$ ($n = 1, 2, \ldots, N$; $t \in \{0, 1\}$) are used as the output, and the RVM-based discriminant given by Eq. (8) is constructed to distinguish the non-fatigue state from the fatigue state:
$$t_n = y(x_n; w) + \varepsilon_n \tag{8}$$
In Eq. (8), $w$ denotes the weight vector and $\varepsilon_n$ represents the Gaussian noise, $\varepsilon_n \sim N(0, \sigma^2)$, with variance $\sigma^2$. Therefore, $p(t_n \mid x) = N(t_n \mid y(x_n), \sigma^2)$ is a Gaussian distribution, and $y(x_n)$ is determined by the kernel functions $\varphi_i \equiv K(x, x_i)$:
$$y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \tag{9}$$
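The discriminant can be illustrated with a short sketch, assuming the usual RVM form in which y is a bias term plus a weighted sum of kernel values between the input and the retained training points. The Gaussian kernel form exp(-||x - z||² / (2ℓ²)), the function names, and the numbers in the example are assumptions for illustration only.

```python
import math
from typing import Sequence


def gauss_kernel(x: Sequence[float], z: Sequence[float],
                 length_scale: float = 1.0) -> float:
    """Gaussian kernel exp(-||x - z||^2 / (2 * length_scale^2)) (assumed form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * length_scale ** 2))


def rvm_output(x: Sequence[float],
               relevance_vectors: Sequence[Sequence[float]],
               weights: Sequence[float],
               bias: float,
               length_scale: float = 1.0) -> float:
    """Evaluate y(x; w) = w0 + sum_i w_i * K(x, x_i)."""
    return bias + sum(w * gauss_kernel(x, rv, length_scale)
                      for w, rv in zip(weights, relevance_vectors))


if __name__ == "__main__":
    # Hypothetical inputs: (PERCLOS, mouth opening degree) pairs.
    vectors = [(0.45, 0.70), (0.10, 0.30)]
    print(rvm_output((0.42, 0.65), vectors, weights=[2.0, -2.0], bias=0.0))
```

Only training points with non-zero weights contribute to the sum, which is why a sparse weight vector makes evaluation at test time cheap.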
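In order to classify the non-fatigue and fatigue states of a driver, the following classification function is constructed:
$$p(t \mid w) = \prod_{n=1}^{N} \sigma\{y(x_n; w)\}^{t_n} \left[1 - \sigma\{y(x_n; w)\}\right]^{1 - t_n} \tag{10}$$
In Eq. (10), $\sigma(\cdot)$ denotes the logistic sigmoid function, $t = (t_1, \ldots, t_N)^T$, and $w = (w_0, \ldots, w_N)^T$. Since there is no noise variance in Eq. (10), noise does not need to be considered in the classification problem.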
In the fatigue state classification process, the Laplace approximation (Fang, 1993) is needed to obtain an approximate solution for p(w | t, α) or the marginal distribution p(w | α), so the classification process amounts to solving for p(w | t, α) or p(w | α). The approximation procedure based on the Laplace method is as follows.
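1. Since $p(w \mid t, \alpha) \propto p(t \mid w)\,p(w \mid \alpha)$, when the value of $\alpha$ is fixed, finding the most probable weights $w_{MP}$ is equivalent to finding the minimum value of Eq. (11); thus, $w_{MP}$ can be solved by using the least-squares iteration process:
$$-\log\{p(t \mid w)\,p(w \mid \alpha)\} = -\sum_{n=1}^{N}\left[t_n \log y_n + (1 - t_n)\log(1 - y_n)\right] + \frac{1}{2} w^T A w \tag{11}$$
where $y_n = \sigma\{y(x_n; w)\}$, and $A = \mathrm{diag}(\alpha_0, \ldots, \alpha_N)$ denotes a diagonal matrix.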
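The search for the most probable weights at fixed α can be sketched as a Newton (iteratively reweighted least squares) iteration on the penalized log-likelihood. The code below is an illustrative sketch under stated assumptions, not the paper's implementation: the design matrix `phi` is assumed to hold one row of kernel values (plus a leading 1 for the bias) per training sample, α is kept fixed, and the hyperparameter re-estimation that a full RVM performs is omitted. All names are hypothetical.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Numerically stable logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def find_w_mp(phi: List[List[float]], t: List[int], alpha: List[float],
              iterations: int = 25) -> List[float]:
    """Newton iteration for the weights that maximize the posterior at fixed alpha."""
    m = len(alpha)
    w = [0.0] * m
    for _ in range(iterations):
        y = [sigmoid(sum(p * wi for p, wi in zip(row, w))) for row in phi]
        # Gradient of the log posterior: Phi^T (t - y) - A w.
        grad = [sum(row[j] * (tn - yn) for row, tn, yn in zip(phi, t, y))
                - alpha[j] * w[j] for j in range(m)]
        # Negative Hessian: Phi^T B Phi + A with B = diag(y_n (1 - y_n)).
        hess = [[sum(row[j] * yn * (1.0 - yn) * row[k] for row, yn in zip(phi, y))
                 + (alpha[j] if j == k else 0.0) for k in range(m)]
                for j in range(m)]
        step = solve(hess, grad)
        w = [wi + si for wi, si in zip(w, step)]
    return w


if __name__ == "__main__":
    # Hypothetical design matrix: a bias column plus one feature per sample.
    phi = [[1.0, 0.1], [1.0, 0.2], [1.0, 0.5], [1.0, 0.6]]
    print(find_w_mp(phi, t=[0, 0, 1, 1], alpha=[1.0, 1.0]))
```

The positive entries of α keep the matrix in each Newton step positive definite, so the linear solve is well posed even when the classes are linearly separable.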

RVM-based driver sleepiness recognition algorithm
According to the construction of the driving sleepiness state classification model, the specific steps of the RVM-based driver sleepiness state recognition algorithm are as follows. The flowchart of this algorithm is displayed in Figure 7.
Step 1: Fatigue state division: By using the fatigue state division method, the data collected in each unit time interval of 60 s are marked as either non-fatigue or fatigue state. When the mouth opening degree and eye state index reach the values defined by the fatigue standard, the driver's state is marked as the fatigue state; otherwise, it is marked as the non-fatigue state.
Step 2: The output $t_i \in \{0, 1\}$ represents the driver's state; if it is equal to zero, the driver's state is considered as non-fatigue; otherwise, it is considered as fatigue; $x_i = [\sigma_{hi}, \sigma_{ri}]$ represents the input parameters.
Step 3: The training and test data matrices are determined; namely, the dataset is divided into the training dataset T 1 and test dataset T 2 .
Step 4: Different kernel functions K and different kernel function parameters are used to map feature parameters to high dimensions.
Step 8: The convergence scale is checked; if the convergence scale is not reached, it is necessary to go back to Step 6.
Step 9: Fatigue state classification training: 500 samples from the training dataset T 1 (half of non-fatigue and half of fatigue) are fed to the RVM model to train it for the fatigue state classification.
Step 10: Fatigue state classification test: the test dataset T 2 is fed to the trained RVM model and the classification result of the fatigue state is statistically analyzed.

Acquisition of driver sleepiness parameters
The data on the physiological characteristics of the drivers were obtained by the CLT-353 simulation driving platform. The driving device had the same operation mode as a real car and was equipped with a force feedback system, which could simulate the real driving environment. In the experiment, the driver's mouth opening degree data and the eye state index data were obtained by an infrared anti-shake camera.
Twenty males participated in the data collection process. They were between 23 and 33 years old and had between three and six years of driving experience. Before data collection, the participants had at least 12 hours of high-quality sleep in order to ensure that their mental state was good before the experiment. The simulation scene represented a two-way four-lane circular highway with a length of 20 km. The traffic density was 10%, the weather was fine, the drivers drove for four hours without overtaking, the speed was kept below 80 km/h, and there was no rest during the four hours of the test. The data were collected by the image sensor, which was placed in front of the driver's face. During the driving process, the image sensor transmitted the collected data through the wireless transmission module to the cc2530 control board, which was the main control board; finally, the collected data were stored on the computer disk. When the mouth opening degree and eye state index reached the values defined by the fatigue standard, the preceding data were marked as non-fatigue, the subsequent data were marked as fatigue, and the driver's current state was judged to be the fatigue state.

Construction of driver sleepiness state recognition RVM classifier
Based on the driver sleepiness parameters and the fatigue state division standard, the mouth opening degree and eye state indexes corresponding to the non-fatigue and fatigue states were obtained. According to the algorithm steps, 500 samples of the driving data (half non-fatigue and half fatigue) were used as the training data of the RVM-based driver sleepiness state recognition algorithm, and 1000 samples (half non-fatigue and half fatigue) were used as the test data. The computer had 4 GB of memory and an Intel i5 processor with a main frequency of 2.5 GHz. The Matlab software environment was first used to train the RVM classifier on the training dataset and then to test it on the test data. The RVM classifiers were trained using different kernel functions and kernel function length scale parameters. The kernel function was selected from the Gauss kernel, the Laplace kernel, the Spline kernel, and the Cauchy kernel. The kernel function length scale parameters were 1.0, 0.5, and 0.1, and the maximal number of iterations was set to 5000. It should be noted that methods based on a single index and on multiple indexes were used in the comparison. The PERCLOS discrimination and the opening degree discrimination were the methods based on a single index, and the fusion discrimination of PERCLOS and opening degree was the method based on multiple indexes.
In the driver sleepiness test experiment, the accuracy of the discrimination result was calculated by:
$$A = \frac{S - FP}{S} \times 100\%$$
where $FP$ denotes the number of misidentified samples, $S$ denotes the total number of test samples, and $A$ denotes the achieved recognition accuracy.
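As a quick worked example, the accuracy can be computed from the number of misidentified samples and the size of the test set, assuming the form A = (S - FP) / S × 100%. The function name and the error count used below are hypothetical and are not results from the paper.

```python
def recognition_accuracy(errors: int, total: int) -> float:
    """Return the recognition accuracy in percent: (S - FP) / S * 100."""
    if total <= 0 or not 0 <= errors <= total:
        raise ValueError("expected 0 <= errors <= total and total > 0")
    return (total - errors) * 100.0 / total


if __name__ == "__main__":
    # Illustrative only: 80 misidentified samples out of 1000 test samples.
    print(recognition_accuracy(80, 1000))
```

With 1000 test samples, every misidentified sample lowers the accuracy by 0.1 percentage points.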

Results and analysis
The experimental results are shown in Table 3 and displayed in Figure 8. Based on the results presented in Table 3, it can be concluded that (1) the average classification accuracy of most RVM classifiers was higher than 90%. The Cauchy kernel function achieved the highest accuracy rate of 92.07% and the lowest accuracy rate of 90.77%, indicating that the analysis of mouth opening degree and eye state could effectively identify the fatigue state. Thus, the proposed algorithm achieved good recognition performance, and the selected mouth opening degree and eye state index were closely related to the fatigue state. This also confirms that the research results of Baronti, Lenzi, Roncella, and Saletti (2009) and Park (2011) are correct. (2) The correctness order of the ... Figure 8, in the optimal classification of the different RVM classifiers, the number of non-zero parameters (relevance vectors, RVs) was small: four for the Gauss kernel function, nine for the Laplace kernel function, two for the Spline kernel function, and four for the Cauchy kernel function. By discarding the non-relevance vectors, the amount of kernel function computation corresponding to those vectors was also reduced, i.e., the fatigue state recognition algorithm was quicker. Thus, the other sample points had only a small influence on the classification that identified the fatigue state, and the proposed recognition algorithm demonstrated high robustness.
The performances of the single-index methods and the fusion-index method were compared; the comparison results are shown in Table 4.
As shown in Table 4, the accuracy rates of the single-index models were lower than those of the fusion-index model. Driver fatigue during driving often shows itself in several ways, such as yawning or closing the eyes, so when only a single indicator is used to judge driver sleepiness, the judgment can often be incorrect. In addition, when only a single indicator is satisfied, the state is not necessarily the fatigue state. For instance, when a driver occasionally opens his mouth widely and only a single indicator is used, the driver's state can be judged as the fatigue state, which is not correct. However, when the fusion index is used, the system considers several aspects, resulting in a more accurate judgment of the driver's state.

Discussion and conclusions
Driver sleepiness has always been one of the most important factors causing traffic accidents, and it is related to the driving habits of the drivers, the road environment, and other related factors. Driver sleepiness, or driver drowsiness, is a gradual process, so it is difficult to supervise it in the same way as drunk driving. In this paper, mouth opening degree and eye state index are used as detection indicators. The driver's state detection algorithm has been developed using the driving data of 20 male drivers collected on a driving simulation platform. The image sensor was used for data collection, and the RVM discriminant model was used to identify the fatigue state. The proposed method uses non-contact detection indexes, so it can distinguish the fatigue state without interfering with driving.
The experimental results show that the average classification accuracy of the proposed algorithm is higher than 90% under different kernel functions, so the proposed algorithm demonstrates good recognition performance and can distinguish the driver's state quickly; the obtained results also confirm that mouth opening degree and eye state index are closely related to the fatigue state. The comparison of single-index and multi-index discrimination methods shows that the multi-index identification method has higher accuracy and can reflect the driver's fatigue state more accurately. Besides, the proposed algorithm provides a scientific theoretical basis for the development of fatigue state warning methods.
In future, fatigue driving detection should be conducted using some additional indicators, including the eye movement index, vehicle state, and others. Also, the influence of the environmental parameters, such as temperature change, illumination change, weather conditions, different road structures, and traffic density, on the fatigue state should be considered.
Nevertheless, there is still room for further development and improvement of the proposed method. For instance, a larger number of people of different ages, both men and women, should participate in the data collection process. Besides, the impacts of objective environmental factors, including the temperature and driving environment, on driving sleepiness should also be considered in future tests.