A DISTRACTED DRIVING DISCRIMINATION METHOD BASED ON THE FACIAL FEATURE TRIANGLE AND BAYESIAN NETWORK

Distracted driving is one of the main causes of road crashes. Therefore, effectively distinguishing distracted driving behaviour and its category is key to reducing the incidence of road crashes. To identify distracted driving behaviour accurately and effectively, this paper uses the head posture as a relevant variable and realises the classification of distracted driving behaviour based on the relevant literature and investigation.


Introduction
The road crash causing factors include human, vehicle, road, and environmental factors. Among the human factors, distracted driving behaviour has been considered the leading cause of road crashes (Née et al., 2019; Kidd & Chaudhary, 2019). Distracted driving generally refers to the phenomenon where a driver's attention is directed to activities that are not related to driving, resulting in a decline in driving operation ability (Pope et al., 2017; Craig et al., 2021), and distracted driving behaviour has a direct impact on the probability of collision (Shaaban et al., 2020). Due to the insufficient accumulation of distraction category data, it is challenging to prevent distracted driving behaviour effectively, and thus it is difficult to formulate relevant regulations to reduce it (Nevin et al., 2017). Therefore, accurately distinguishing and classifying drivers' distracted driving behaviour, effectively preventing such behaviour, and thereby avoiding the road crashes it causes has been an urgent problem in the field of distracted driving research (Tung & Khattak, 2015). Since distracted driving is difficult to measure using quantitative physiological indicators and there are no unified laws and regulations to supervise and restrict this type of driving behaviour, developing an accurate discrimination method for distracted driving behaviour could be an effective way to prevent it (Wei et al., 2021).
Parameter acquisition is the primary problem to solve when studying distracted driving behaviour, and selecting appropriate indicators is a precondition of parameter acquisition. Parameter acquisition methods can be roughly divided into invasive and non-invasive (Sun et al., 2017).
The invasive detection methods use the electroencephalogram (EEG) and electrocardiogram (ECG). Although invasive indices are highly correlated with distracted driving, the contact between the equipment and a driver during the detection process greatly affects the driving state, which makes them unsuitable for on-board equipment (Wang et al., 2015). The non-invasive detection methods consider the vehicle driving characteristics and external environmental parameters. These indicators can be detected without affecting the driver, but their detection accuracy has been difficult to improve (Siddiqui et al., 2021). With the development of face recognition technology, facial feature detection based on face recognition not only remains non-invasive to drivers but also offers the theoretical potential to improve detection accuracy because its parameters are closely correlated with distracted driving features (Lei et al., 2017). Therefore, this paper selects the eye and mouth parameters as detection indices because they can effectively represent distracted driving characteristics, and constructs a distracted driving discrimination method based on the facial feature triangle.
Related research has shown that the most common distracted driving behaviours include talking to passengers, smoking, and using mobile phones (Parr et al., 2016). Drivers' distracted driving behaviour categories present different risk levels (Fu et al., 2022), and during these behaviours, a driver's head posture changes significantly. In this study, the yaw, pitch, and roll angles are used to quantify the head posture, and the head posture is used to judge distracted driving behaviour. The Bayesian network model is employed to fuse prior knowledge and actual data and learn causality (Chen et al., 2018). The Bayesian network has been selected due to its advantages of stable classification efficiency and insensitivity to missing data, and the judgment of distracted driving behaviour is a typical classification and discrimination problem. Therefore, this paper proposes a distracting action recognition method based on the Bayesian network.
The contributions of this study are mainly reflected in two aspects. First, this paper establishes a head pose estimation method using facial feature triangles to estimate a driver's head pose accurately and effectively. Second, this paper uses the Bayesian network to establish a distracted driving behaviour discrimination model, which can effectively and accurately distinguish and classify distracted driving behaviour, so as to accumulate distracted category data and prevent distracted driving.
The rest of this paper is organised as follows. In Section 2, the current research and related studies are reviewed. Section 3 presents the index selection method, facial feature point recognition algorithm, and distracted driving discrimination algorithm. The head pose estimation method is introduced in Section 4. The proposed distracted driving category judgment method is explained in Section 5. The analysis of experimental and prediction results is given in Section 6. The conclusions and future research areas are presented in Section 7.

Literature review
In this study, a facial feature triangle is constructed by using the information on the eyes and mouth of a driver to reflect the driver's head posture, and then the distracted driving state is distinguished by the Bayesian network.

Head pose estimation method
In recent years, many studies have been conducted on drivers' head posture detection. Fice et al. (2018) studied the driver's head deflection angle and duration under normal conditions and found that, compared with the stationary state, during vehicle movement the driver's head attitude deflection amplitude was smaller and its duration was shorter. Zhao et al. (2020) used the head posture as the evaluation parameter of distracted driving; their experimental results show that the distracted driving state can be effectively judged according to head posture. Yan et al. (2022) found that different combinations of head, hand, and object positions constitute complex categories of driver posture, which exhibit continuity, diversity, superposition, similarity, transition, and interaction. Teyfouri et al. (2021) designed a fatigue warning system based on the driver's neck position and blinking frequency, exploiting the fact that head drooping during sleep drives a change in the neck position. He et al. (2015) established an evaluation model of driver fatigue using EEG data and the driver's head nodding angle as detection indicators; the results showed that this method could effectively prevent driver fatigue during driving. Since the distracted driving state can thus be distinguished from the head state, this paper selects the head posture as an evaluation index of the driving state.
Among the non-invasive detection indicators, the eyes and mouth, which are relatively stable facial features, can be used as feature points to construct a facial feature triangle that reflects the driver's head posture (Ghimire et al., 2017). On this basis, this paper uses three parameters, namely the yaw, pitch, and roll angles, to quantify the head deflection and improve the accuracy of head pose estimation.

Facial feature point recognition algorithm
Based on the increasingly mature image classification and detection technology, the recognition methods for a driver's distracted driving face can be mainly divided into traditional computer vision (CV) algorithms and deep learning-based algorithms. Driver distraction detection based on the traditional CV algorithms extracts image features using the scale invariant feature transform (SIFT), histogram of oriented gradients (HOG), and other feature operators (Zhang, Tang, & He, 2019), then combines them with a support vector machine (SVM) to establish a classification model. However, traditional CV algorithms have the disadvantages of high requirements for the environment, a narrow application range, numerous parameters, and a large amount of calculation. Convolutional neural networks (CNNs) have been proven to be the most effective technology to achieve high precision in face recognition. With the rapid development of deep learning, CNNs have been applied to many computer vision tasks, such as image recognition and target detection (Ding & Tao, 2018). The test results of face recognition show that the performance of image recognition and detection can be significantly improved using CNNs (Hu et al., 2019). The comparison of different recognition algorithms shows that aggregating handcrafted and deep CNN features can compensate for the deficiencies of deep learning and achieve higher accuracy (Alkinani et al., 2022). Recognition methods based on deep learning have therefore attracted great research attention in recent years. By using an instrument panel camera to record the driving process and a pretrained neural network model to detect and recognise the captured images, extracting the local neighbourhood texture information of a grey image, it has been found that the mouth and eyes contribute the most to facial expression and that deep learning-based algorithms can obtain a high recognition rate (Zhang & Hua, 2015).
Therefore, this study uses a deep learning-based method to recognise a driver's face and mark the facial feature points.
In this study, the eyes and mouth are selected as facial feature points. At present, three types of methods have been commonly used for human eye positioning:
1. Feature-based methods, such as the projection method, which have a fast processing speed but are greatly affected by face pose transformations;
2. Shape-based methods, such as the template matching method, which can achieve accurate positioning but usually involve a large amount of calculation and have insufficient real-time performance;
3. Appearance-based methods, such as the AdaBoost algorithm, which have strong robustness but often require a large number of training samples.
This paper combines the projection method and the Hough transform to detect the eyeball circle and locate the eyeball accurately, overcoming the shortcomings of each method alone. The combined algorithm improves both the accuracy and the speed of human eye positioning. In addition, based on the eye positions, the mouth is located according to the distribution characteristics of the facial organs.

Distracted driving behaviour discrimination method
The in-depth study of distracted driving has shown that the parameter curves of a distracting action often have similar morphological characteristics. Therefore, similar parameter curves can be classified, and then the description characteristics can be obtained according to the characteristics of the parameter curves and the distracting action. By using multiple description features to express and describe the knowledge about a distracting action, the morphological features of the parameter curve can be recognised, and then the distracting action can be determined. There are four main types of distracted driving: cognitive, visual, auditory (Babić et al., 2021; van der Zwaag et al., 2012; Warren & Micha, 2011; Catalina et al., 2020), and manual distraction (Zhang & Hua, 2015). A Bayesian network can learn causal relationships; therefore, it is an ideal model for fusing prior knowledge and data and for reasoning under incomplete and uncertain information. It can learn from practice and optimise the network structure and parameters (Yang et al., 2010; Fasanmade et al., 2020). The Bayesian network has many advantages in classification: it uses a graphical method to describe the relationships between data, which is simple and efficient; it handles incomplete datasets well; and it can model the causal relationships between variables. In addition, combining Bayesian statistics can make full use of domain knowledge and sample data (Ruitao, 2021). Therefore, in this study, a distracting action recognition method based on the Bayesian network is proposed. First, the parameters related to a distracting action in distracted driving data are divided into multiple morphological feature classes using the time series hierarchical clustering method based on the DTW (dynamic time warping) distance (Wan et al., 2017).
Next, the descriptive characteristics of each parameter curve are determined by the method based on the statistical dependency analysis to distinguish various parameter sequences. Then, a Bayesian network for distracting action recognition is constructed by fusing multiple description features (Liang & Lee, 2014). Finally, the recognition of distracting actions is realised by the Bayesian network reasoning.

Feature point recognition of human eyes and mouth

CNN-based face recognition
The CNN model used in this paper consists of two convolution layers and two maximum pooling layers. To reduce the number of parameters and the amount of calculation while improving the accuracy of face recognition, the proposed CNN model uses 5×1 and 1×5 convolution kernels instead of a 5×5 convolution kernel. After being processed by the fully connected layer, the input is sent to the output layer, and the softmax function is used as the classification function. The CNN model structure and calculation steps are shown in Figure 1.
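The saving from the factorised kernels can be checked with a quick parameter count; the channel sizes below are illustrative assumptions, not values from the paper:

```python
def conv_params(kernel_h, kernel_w, in_ch, out_ch, bias=True):
    """Number of learnable parameters in a single convolution layer."""
    p = kernel_h * kernel_w * in_ch * out_ch
    return p + (out_ch if bias else 0)

# illustrative channel sizes (not from the paper)
in_ch = out_ch = 32
full = conv_params(5, 5, in_ch, out_ch)                                  # one 5x5 layer
factored = conv_params(5, 1, in_ch, out_ch) + conv_params(1, 5, out_ch, out_ch)
print(full, factored)  # the 5x1 + 1x5 pair needs far fewer weights
```

The 5×5 kernel costs 25 weights per channel pair, while the 5×1 and 1×5 pair costs only 10, which is where the reduction in calculation comes from.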

Eye feature point positioning based on Hough transform
In a face image, first, the face area is cropped. Then, the human eyes are roughly located through two integral projections, and the edge of the roughly located eye grey image is extracted after enhancement processing; this edge processing yields the boundary of the eyeball region. The idea of using the Hough transform is as follows. First, two points m and n on a circle, with coordinates (x_m, y_m) and (x_n, y_n), respectively, are recorded. The centre of the circle must lie on the perpendicular bisector of the line connecting the two points, which is defined by:

y − (y_m + y_n)/2 = −((x_n − x_m)/(y_n − y_m)) · (x − (x_m + x_n)/2)

Then, a value of one is added to the accumulated value of all points on this line in the parameter space. Finally, the point with the largest accumulated value in the parameter space is extracted. This point represents the centre of the detected circle. The detected centre is the position of the pupil, which denotes the position of the eye feature point. The positions of the eye feature points are marked as B and C, as shown in Figure 2.
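The voting idea can be sketched as a minimal accumulator over candidate centres. The fixed radius and synthetic edge points below are assumptions for illustration; a real detector would also search over a range of radii:

```python
import numpy as np

def hough_circle_center(edge_points, radius, shape):
    """Each edge point votes for every centre at the given radius;
    the accumulator maximum is taken as the pupil centre."""
    acc = np.zeros(shape, dtype=np.int32)
    thetas = np.linspace(0, 2 * np.pi, 180, endpoint=False)
    for (y, x) in edge_points:
        cy = np.round(y - radius * np.sin(thetas)).astype(int)
        cx = np.round(x - radius * np.cos(thetas)).astype(int)
        ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        np.add.at(acc, (cy[ok], cx[ok]), 1)  # add votes, handling duplicates
    return np.unravel_index(acc.argmax(), acc.shape)

# synthetic eyeball edge: circle of radius 8 centred at (30, 40)
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
pts = [(30 + 8 * np.sin(a), 40 + 8 * np.cos(a)) for a in t]
print(hough_circle_center(pts, 8, (64, 64)))  # close to (30, 40)
```

Votes from all edge points intersect at the true centre, so the accumulator peak coincides with the pupil position even when some edge pixels are missing.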

Mouth positioning
Once the eyes have been located, the mouth location can be obtained from the eye positions. According to the distribution characteristics of the facial organs, the mouth area can be roughly delimited, as shown in Figure 3(a). Denoting the distance between the two eyes as a, the size of the mouth area is 0.4a × 1.3a, and this area is located at a distance of a below the eyes. The experimental analysis has shown that, in the HSI colour space, the H component can accurately distinguish skin colour from lip colour. Therefore, in the H component, after morphological image processing, the mouth area is enclosed by a rectangular frame, and the centre of the rectangle is taken as the centre of gravity of the mouth. The centre of the mouth represents the feature point of the mouth, which is marked as A. The mouth detection process is shown in Figure 3(b).
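The proportion rule above translates into a small helper; the coordinate convention (x to the right, y downwards, eye centres as input) is an assumption for illustration:

```python
import math

def mouth_region(eye_left, eye_right):
    """Rough mouth search window from the two eye centres (x, y),
    using the stated proportions: a 0.4a x 1.3a box whose top edge
    lies a below the eye line, centred horizontally between the eyes."""
    a = math.dist(eye_left, eye_right)       # inter-eye distance
    cx = (eye_left[0] + eye_right[0]) / 2    # horizontal midpoint
    ey = (eye_left[1] + eye_right[1]) / 2    # eye-line height (y grows downwards)
    top = ey + a                             # box starts a below the eyes
    return (cx - 0.65 * a, top, cx + 0.65 * a, top + 0.4 * a)  # (x0, y0, x1, y1)

print(mouth_region((40, 50), (60, 50)))
```

Note that the vertical centre of this box sits 1.2a below the eye line, which matches the relation used later in the pitch analysis.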

Facial feature triangle-based driving attitude estimation
In the head movement analysis, the eyes and mouth with a stable facial movement are selected as feature points to construct the facial feature triangle. Using the geometric changes in the facial feature triangle, the yaw, pitch, and roll angles of the head can be determined and used as parameters to infer the head posture. The changes in the yaw, pitch, and roll angles can be used to judge whether the driver is distracted.
Attitude parameter estimation mainly refers to calculating the deflection angle of the head relative to the three coordinate axes, namely, yaw, pitch, and roll, as shown in Figure 4. The vertical direction of a driver's head image is set as the z-axis, the horizontal direction is set as the x-axis, and the direction perpendicular to the driver's image is set as the y-axis.
In a video image, the relative positions of the eyes and mouth change with the head posture and show certain geometric features. Therefore, as long as the positions of the eyes and mouth are located, the head posture can be preliminarily estimated.
Assume the coordinates of feature points A, B, and C in the pixel coordinate system are denoted by (x_1, y_1), (x_2, y_2), and (x_3, y_3), respectively. According to the geometric characteristics of the human face, when the head is facing the camera, ΔABC is an isosceles triangle, and when the head posture changes, the geometry of ΔABC in the image also changes. Therefore, the current head posture can be judged by analysing the geometric features of ΔABC. From the three vertex coordinates of the facial feature triangle, the three side lengths of ΔABC are obtained as follows:

a = √((x_2 − x_3)² + (y_2 − y_3)²)
b = √((x_1 − x_3)² + (y_1 − y_3)²)
c = √((x_1 − x_2)² + (y_1 − y_2)²)

The three sides a, b, and c of ΔABC correspond to the opposite angles ∠A, ∠B, and ∠C, respectively, which are obtained by the law of cosines:

∠A = arccos((b² + c² − a²)/(2bc))
∠B = arccos((a² + c² − b²)/(2ac))
∠C = arccos((a² + b² − c²)/(2ab))

The height corresponding to side a of ΔABC is given by Heron's formula:

h = (2/a)·√(p(p − a)(p − b)(p − c)), where p = (a + b + c)/2.
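These relations can be sketched directly from the vertex coordinates; the point ordering (A is the mouth, B and C the eyes) follows the text:

```python
import math

def triangle_features(A, B, C):
    """Side lengths, interior angles (law of cosines) and the height
    on side a (via Heron's formula) of the facial feature triangle."""
    a = math.dist(B, C)   # side opposite A (between the eyes)
    b = math.dist(A, C)   # side opposite B
    c = math.dist(A, B)   # side opposite C
    ang_A = math.acos((b * b + c * c - a * a) / (2 * b * c))
    ang_B = math.acos((a * a + c * c - b * b) / (2 * a * c))
    ang_C = math.acos((a * a + b * b - c * c) / (2 * a * b))
    p = (a + b + c) / 2                              # semi-perimeter
    area = math.sqrt(p * (p - a) * (p - b) * (p - c))  # Heron's formula
    h = 2 * area / a                                 # height on side a
    return a, b, c, ang_A, ang_B, ang_C, h
```

For a frontal face the triangle is isosceles (b = c), so any departure of ∠B from ∠C signals a head rotation, which is exactly what the yaw analysis below exploits.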

Yaw attitude analysis
Yaw indicates that a driver's head rotates around the z-axis, as shown in Figure 5(a). As shown in Figure 5(b), in the video image (i.e., the x-z plane), ΔA_pB_pC_p is no longer an isosceles triangle, and its properties change. Further, as also presented in Figure 5(b), the top angle on the side towards which the head rotates becomes larger. For instance, if ∠C > ∠B, the head rotates to the right, and if ∠B > ∠C, the head rotates to the left. For the convenience of calculation, the two triangles are projected onto the x-y plane, as shown in Figure 5(c).
Set the rotation angle as α; then, it holds that:

cos α = a / a′

where a denotes the distance between the eyes after turning the head, and a′ is the distance between the eyes in the frontal image of the head.
When turning the head, the side lengths of the feature triangle change, but the corresponding height h stays unchanged. According to the positional relationship between the mouth and eyes presented in Figure 5(a), it holds that h = 1.2a′. Therefore, the yaw angle can be written as:

α = arccos(1.2a / h)

Pitch attitude analysis
Pitching refers to the rotation of the head around the x-axis, as shown in Figure 6(a). In a video image (i.e., the x-z plane), as shown in Figure 6(b), ΔABC and ΔA′B′C′ both remain isosceles triangles, but the height h of the characteristic triangle changes. Set the pitch angle as β. For the convenience of calculation, the two triangles are projected onto the y-z plane, as shown in Figure 6(c). It can be written as:

cos β = h / h′

where h denotes the height of the triangle in the image after nodding, and h′ is the height of the triangle in the frontal image of the head. When nodding, although the coordinate positions of the eyes in the image change, the relative distance between the eyes remains unchanged, i.e., a remains unchanged. According to the previous analysis, for the frontal face image, the centre of the mouth is located at 1.2a below the eyes, so it can be written as h′ = 1.2a. Therefore, it holds that:

β = arccos(h / (1.2a))

Roll attitude analysis
Roll means that the head rotates around the y-axis, as shown in Figure 7(a). In a video image (i.e., the x-z plane), as shown in Figure 7(b), ΔA_pB_pC_p and ΔA′_pB′_pC′_p remain isosceles triangles, and the distance between the two eyes remains unchanged. The only change in the image after the head rotation around the y-axis is that the height h of ΔA_pB_pC_p is no longer parallel to the z-axis but forms a certain angle with it, which is denoted by γ. For the convenience of calculation, the two triangles are projected onto the x-z plane, as shown in Figure 7(c).
Considering the actual situation, the motion range of the head swing is small. Define the motion range in the image pixel coordinate system as x_3 < x_1 < x_2. Then, if y_2 > y_3, the head deviates to the left, and the deflection angle γ is given by:

γ = arctan((y_2 − y_3)/(x_2 − x_3))

If y_2 < y_3, the head deviates to the right, and the deflection angle γ is given by:

γ = arctan((y_3 − y_2)/(x_2 − x_3))
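Putting the three analyses together, a minimal pose sketch might look as follows. The formulas are reconstructions of the geometric relations described above; the clipping against 1.0 is an added guard against measurement noise, not part of the paper's method:

```python
import math

def head_pose(A, B, C, a_front):
    """Yaw, pitch, roll (radians) from feature points A (mouth) and
    B, C (eyes), given a_front, the inter-eye distance in the frontal
    image. Sign conventions for roll are illustrative."""
    a = math.dist(B, C)                      # current inter-eye distance
    b, c = math.dist(A, C), math.dist(A, B)
    p = (a + b + c) / 2
    h = 2 * math.sqrt(p * (p - a) * (p - b) * (p - c)) / a  # height on BC
    yaw = math.acos(min(1.0, a / a_front))        # eye distance shrinks with yaw
    pitch = math.acos(min(1.0, h / (1.2 * a)))    # triangle height shrinks with pitch
    roll = math.atan2(B[1] - C[1], B[0] - C[0])   # tilt of the eye line
    return yaw, pitch, roll

# frontal face: eyes 20 px apart, mouth 24 px (= 1.2a) below the eye line
print(head_pose((50, 74), (60, 50), (40, 50), 20))  # all three angles ~ 0
```

Large sustained values of any of the three angles are then the cue used in the later sections to flag a distracted head posture.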

Morphological and description feature node construction
Descriptive and morphological feature nodes refer to the first and second-layer nodes in the Bayesian network model, respectively. These nodes are the key to the Bayesian network construction. They include the classification of parameter sequences, as well as the discretization and selection of descriptive features.

Parameter sequence classification based on hierarchical clustering
This paper adopts agglomerative hierarchical clustering: it starts from single distracted driving behaviour samples, then merges smaller clusters of samples and, finally, forms a distracted driving behaviour category containing all samples. In this study, the relevant parameter sequences of simulated distracted driving behaviours are clustered based on the DTW distance. According to the clustering situation, a distance threshold is set for classification, and multi-class parameter sequences are obtained as morphological feature nodes. Fifteen typical distracted driving behaviours, including visual, cognitive, operational, and auditory distractions, are considered, as shown in Table 1.
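The DTW distance that drives the clustering can be sketched with the classic dynamic-programming recurrence:

```python
import math

def dtw_distance(s, t):
    """Dynamic time warping distance between two 1-D parameter
    sequences (e.g. yaw-angle time series of two behaviour samples)."""
    n, m = len(s), len(t)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])
            # best of match, insertion, deletion
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

print(dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0]))  # 0.0: same shape, shifted
```

Because DTW aligns sequences elastically in time, two executions of the same distracting action at different speeds still end up close together, which is what makes it a suitable distance for the hierarchical clustering step.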

Description features and their discretization
According to the specific analysis of the distracted driving behaviour and related parameters, the description characteristics of a parameter sequence are determined. The description characteristics and their descriptions are shown in Table 2, and their discretization results are shown in Table 3. For continuous quantities, the heuristic Gaussian cloud algorithm is used to mine the qualitative concept of each description feature of distracted driving behaviour, and the intervals are divided at the intersections of the expectation curves. The obtained intervals are used for node discretization of the description features of distracted driving behaviour. The change in a continuous quantity corresponds to the overall changing trend of a parameter, without considering small fluctuations in the data.

Node selection of description feature class based on statistical dependency analysis
The dependency relationship is usually determined by the mutual information or conditional mutual information of two nodes, which indicates the correlation degree of the two nodes. The dependency relationship between the feature nodes and morphological feature nodes can be analysed and described by statistical or information theory methods.
Let us assume that X and Y represent two nodes, taking the values x and y, respectively. Then, the mutual information between nodes X and Y is given by:

I(X, Y) = Σ_x Σ_y P(x, y) log [P(x, y) / (P(x)P(y))]     (5)

The greater the value of I(X, Y), the stronger the dependency between nodes X and Y, i.e., the greater the correlation between them. If I(X, Y) is less than a set threshold ξ, nodes X and Y are considered independent. Therefore, Equation (5) is usually used for the conditional independence test of nodes.
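For discrete nodes this test can be sketched directly from a joint distribution table (logarithm base 2 here, so the result is in bits):

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum over x, y of P(x,y) * log2(P(x,y) / (P(x)P(y)))
    for a joint distribution given as a 2-D table of probabilities."""
    px = [sum(row) for row in joint]            # marginal of X
    py = [sum(col) for col in zip(*joint)]      # marginal of Y
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:                          # 0 * log 0 treated as 0
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# perfectly dependent binary nodes -> 1 bit; independent -> 0 bits
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))
```

In the node selection step, a description feature whose mutual information with its morphological feature node falls below the threshold ξ would be dropped from the network.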
The statistical dependency analysis is used to calculate the mutual information I between a description feature class node and a morphological feature class node of the corresponding parameter. By setting the threshold ξ, the description feature class node with a strong dependency is selected to construct the Bayesian network.
The calculation results of the mutual information between the description feature nodes of the Bayesian network and their corresponding morphological feature nodes are shown in Table 4. In this study, the threshold is set as ξ = 0.5 according to the actual demand. The description feature nodes with mutual information I > ξ are selected, and the selection results are shown in Table 5. The Bayesian network model used in the proposed driving behaviour recognition method includes three layers, namely the description feature layer, the morphological feature layer, and the driving behaviour layer, as shown in Figure 8. The description feature layer is the first layer, and it includes the description features of each head pose parameter. The morphological feature layer is the second layer, and it indicates the morphological characteristics of each parameter sequence, including analogue quantity information and switching quantity information. The driving behaviour layer is the third layer, and it represents the various distracted driving behaviours.

Identification process
The driving behaviour recognition process based on the Bayesian network includes two main steps, the Bayesian network learning and Bayesian network reasoning, as shown in Figure 9.  The Bayesian network reasoning is based on the conditional probability calculation when the network model is known. Through the analysis and application of known conditions, the target node probability is calculated. The change characteristics of each parameter are extracted from the driving data to be identified and then discretized and input into the Bayesian network. The probability obtained in the Bayesian learning process is used for a priori probability calculation to infer the probability that the input data belongs to the corresponding action.
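The reasoning step can be illustrated with a naive-Bayes-style sketch. The priors, behaviour names, and conditional probabilities below are invented for illustration and are not the paper's learned CPTs:

```python
def posterior(prior, likelihood, evidence):
    """Combine class priors learned in training with per-feature
    conditional probabilities for the observed (discretized) evidence,
    then normalise to a probability over behaviours."""
    scores = {}
    for action, p in prior.items():
        for feat, value in evidence.items():
            p *= likelihood[action][feat][value]
        scores[action] = p
    z = sum(scores.values())
    return {a: s / z for a, s in scores.items()}

# hypothetical two-behaviour example with one discretized feature
prior = {"phone": 0.5, "radio": 0.5}
likelihood = {
    "phone": {"yaw": {"large": 0.8, "small": 0.2}},
    "radio": {"yaw": {"large": 0.3, "small": 0.7}},
}
post = posterior(prior, likelihood, {"yaw": "large"})
print(max(post, key=post.get))  # the behaviour with the largest posterior wins
```

As in the paper's recognition rule, the behaviour node with the largest posterior probability is taken as the recognition result.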

Case study and discussion
The characteristic physiological data of the driver were obtained using the CLT-353 simulation driving platform. The driving device had the same operation mode as the real car and was equipped with a force feedback system, which could simulate the real driving environment. In the experiment, the triangle index data of the driver's facial features were obtained by an infrared anti-shake camera.
To test the actual detection accuracy of the proposed method, 20 volunteers of different genders and with different driving experiences were selected to participate in the test. Since there was no significant difference in the relationship between executive functions and distracted driving behaviours, the volunteers were classified only into the age groups of 18-30 (5 male and 3 female), 30-45 (3 male and 3 female), and 45-70 (4 male and 2 female). The volunteers were assigned numbers from 1 to 20, and the simulation test was performed on the above-mentioned driving simulator.

Facial feature triangle-based head pose detection
To test the accuracy of the facial feature triangle used for head pose estimation, the following tests were conducted. The camera was placed at a fixed position G, and the seat was fixed in front of the camera; the intersection between the front of the camera and the top of the seat was denoted as point O. Points A, B, C and A′, B′, C′ were marked on the left and right sides of the camera, such that the angles between OA, OB, OC, OA′, OB′, OC′ and OG were 20°, 30°, 40°, 20°, 30°, and 40°, respectively. Similarly, points D, E, F and D′, E′, F′ were marked at angles of 20°, 30°, and 40° above and below the camera, and strip marks were placed at 20°, 30°, and 40° to the left and right of the longitudinal axis. During the test, the volunteers only turned their heads, while pupil, sight, and body remained unchanged, and looked at each mark in turn. For each volunteer, data of 100 rounds were collected, summarised, and sorted, and the average error was calculated by:

X = (1/n) Σ_{i=1}^{n} |x_i − ε|

where X represents the average error of a data reading, x_i represents the i-th reading at the same position, ε represents the angle marked by the fixation mark point, and n = 100 is the number of readings. The head pose estimation was performed using the proposed facial feature triangle algorithm, and the estimated angle was compared with the target angle of gaze to evaluate the accuracy of the facial feature triangle-based head pose estimation. Integrating all experimental results, the absolute value of the angle error was within 3°. The test results for different angles are shown in Figure 10. As shown in Figure 10, the head pose estimation result was in line with the actual situation, and high accuracy with a small error range was achieved. Unlike traditional attitude estimation, the proposed algorithm does not require information on the initial attitude, is easy to implement, and has a faster calculation speed.
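The error metric amounts to a mean absolute deviation; the readings below are made-up values for illustration:

```python
def average_error(readings, epsilon):
    """Mean absolute deviation of the estimated angles from the
    marked fixation angle epsilon."""
    return sum(abs(x - epsilon) for x in readings) / len(readings)

# hypothetical estimates for a mark placed at 30 degrees
print(average_error([29.1, 30.6, 31.2, 28.8], 30.0))
```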

Facial feature triangle-based distracted driving detection
To test the distracted driving detection accuracy of the proposed algorithm, the following experiments were conducted. Volunteers were asked to drive normally and to perform some distracted driving behaviours randomly during the driving process, such as looking at a mobile phone, lighting a cigarette, eating food, and reaching for objects. The video images of the drivers' driving state were collected and edited, with each video lasting 3-5 s. Human observers judged the driver's driving state to determine whether the driver was in a distracted driving state and, if so, which kind. For each driver, 100 segments were selected, and the proposed algorithm was used to classify each segment. A correct driving state discrimination was denoted by "1" and a wrong one by "0". The sum M_n of the judgment results and the discrimination accuracy were obtained by:

M_n = Σ_{i=1}^{100} r_i,    P_n = M_n / 100

where r_i is the judgment result of the i-th segment, P_n is the discrimination accuracy of driver n, and M_n is the sum of the judgment results of driver n.
The detection accuracy of 20 drivers is shown in Figure 11.  As shown in Figure 11, the distraction discrimination algorithm based on the facial feature triangle had high recognition accuracy, which could provide a guarantee for the smooth and accurate operation of the system and ensure the feasibility of subsequent research.

Bayesian network-based distracted driving behaviour judgment
To verify the accuracy and effectiveness of the proposed Bayesian network-based judgment method of the distracted driving category, simulation tests were conducted using the driving simulator. In the early Bayesian network parameter learning stage, 10 volunteers were randomly selected from the 20 volunteers to simulate 15 typical distracted driving behaviours. Each behaviour included 10 groups of video data per volunteer, so a total of 1500 groups of sample videos were collected in the experiment. The Bayesian network model was constructed using the Netica software developed by the Norsys company in Canada. For each action, 100 groups of data were preprocessed, and feature sequences were extracted. The feature class nodes were discretized, and the parameters were optimised with the EM algorithm to construct the conditional probability tables (CPTs) and realise the parameter learning of the Bayesian network. Taking the yaw angle model as an example, after the CPTs were applied, the Bayesian network yaw angle model for distracting action recognition was obtained, as shown in Figure 12.
The driving data of another 10 drivers were selected as test samples. Each driver performed 10 distracted driving actions randomly in the test; 100 groups of test samples were collected and used as the input of the Bayesian network model. According to the edge probability of class nodes and the conditional probability of characteristic nodes determined by the learning process of the Bayesian network, the Bayesian reasoning was performed. The probability of class nodes of distracted driving behaviour was used as a basis for judging the distracted driving action type; namely, the distracted driving action corresponding to the place with the largest probability was considered the recognition result.
The results of the first and fourth groups of test samples are shown in Figure 13. As displayed in Figure 13, the probabilities of behaviours J and F were the largest, so the action recognition results for the distracted driving behaviour of the first and fourth groups of test samples were judged to be action J (using a phone) and action F (adjusting instruments such as the radio, air conditioning, or navigation), respectively. In the experiment, the distracted driving behaviour was correctly judged for 91 of the 100 groups of test samples, and only 9 groups were judged incorrectly; thus, a discrimination accuracy of 91% was achieved. Therefore, the proposed solution can achieve high discrimination accuracy of distracted driving behaviour.

Discussion
This experiment verified the feasibility of distinguishing distracted driving based on facial feature triangles. In addition, owing to the Bayesian network's stable classification efficiency and insensitivity to missing data compared with other classification models, the proposed approach performed well in judging distracted driving behaviour. However, this experiment also has some limitations. For example, the collected sample size is relatively small, and the participants' ages were not divided in detail; these issues form the scope of our future research.

Conclusions
Distracted driving has always been one of the crucial factors causing road crashes. This driving behaviour is related to drivers' driving habits, the road environment, and other related factors. When analysing a driver's distracted driving, this paper takes the head posture as a detection index. Based on the real driving data of 20 drivers, a driver distraction discrimination and classification model is developed. An image sensor is used for data acquisition, and the Bayesian network is used to identify the distracted driving behaviour category. The proposed method adopts a non-contact detection index, which can distinguish and classify distracted driving behaviour without affecting the driver's normal driving. The experimental results show that the proposed method has good recognition performance and can identify and classify a driver's distracted state accurately and rapidly. The results confirm that the head posture parameters are closely related to the distracted driving state. Compared with other methods, the Bayesian network has the characteristics of stable efficiency and insensitivity to missing data, so it can more accurately distinguish the distracted driving behaviour category. The proposed method can provide a reliable data basis for standardising driver behaviour, thus helping prevent distracted driving and supporting the definition of relevant laws and regulations.
In the future, the proposed method framework could be improved. Namely, for small or difficult targets in images or videos of the human head, the weights in the proposed method could be adjusted appropriately to improve the overall recognition performance. In addition, depth images could be used to estimate the head posture. The datasets used in this study consist of RGB images. Although such training data are easy to obtain and suitable for extensive research, RGB-D images could provide richer feature information and more dimensional information in the feature training process. Namely, RGB-D images have more advantages than ordinary images under extreme conditions and can meet strict accuracy requirements.