Drivers use active gaze to monitor waypoints during automated driving

Automated vehicles (AVs) will change the role of the driver, from actively controlling the vehicle to primarily monitoring it. Removing the driver from the control loop could fundamentally change the way that drivers sample visual information from the scene, and in particular, alter the gaze patterns generated when under AV control. To better understand how automation affects gaze patterns, this experiment used tightly controlled experimental conditions with a series of transitions from ‘Manual’ control to ‘Automated’ vehicle control. Automated trials were produced using either a ‘Replay’ of the driver’s own steering trajectories or standard ‘Stock’ trials that were identical for all participants. Gaze patterns produced during Manual and Automated conditions were recorded and compared. Overall, the gaze patterns across conditions were very similar, but detailed analysis shows that drivers looked slightly further ahead (increased gaze time headway) during Automation, with only small differences between Stock and Replay trials. A novel mixture modelling method decomposed gaze patterns into two distinct categories and revealed that the gaze time headway increased during Automation. Further analyses revealed that while there was a general shift to look further ahead (and fixate the bend entry earlier) when under automated vehicle control, similar waypoint-tracking gaze patterns were produced during Manual driving and Automation. The consistency of gaze patterns across driving modes suggests that active-gaze models (developed for manual driving) might be useful for monitoring driver engagement during Automated driving, with deviations in gaze behaviour from what would be expected during manual control potentially indicating that a driver is not closely monitoring the automated system.

$\frac{1}{n}\sum_{i=1}^{n} w_{i,k}$). The time headway probability density is the sum of the weighted mixture regression model densities. Each linear regression model is specified with parameters $\theta_k = (\alpha_k, \beta_k, \sigma_k)$. The first regression model is the primary GF model ($\theta_G$), with a fixed slope ($\beta$) of zero but freely varying intercept ($\alpha$) and standard deviation ($\sigma$). Any additional regression model for a point of special interest has a fixed slope of $-1$ (Manuscript Figure 4D). Since the fitting process is susceptible to outliers and gaze data is generally noisy, we also add a noise cluster ($\theta_N$) to pick up outlying data that cannot be confidently associated with the main gaze clusters. The noise cluster has fixed parameters: a slope of zero, an intercept fixed to the grand mean ($\bar{x}$) of the data, and a standard deviation of twice the grand standard deviation ($\bar{\sigma}$) of the data. If one specifies the number of salient points as $S$, the parameters are as follows: $\theta_G = (\alpha_G, 0, \sigma_G)$; $\theta_{E_s} = (\alpha_{E_s}, -1, \sigma_{E_s})$ for $s = 1, \dots, S$; and $\theta_N = (\bar{x}, 0, 2\bar{\sigma})$.
The model is fitted using the Expectation-Maximisation algorithm [1]. The steps are as follows. If the component parameters ($\theta_k$) are known, then calculating the mixture proportions ($\omega_k$) is relatively simple. Taking $n$ as the number of observations, calculating the log-likelihood of the mixture model yields the normalised weights for each observation, $(w_{i1}, \dots, w_{iK})$. This is the Expectation step. Once these weights are known, the weighted averages and standard deviations that maximise the likelihood of the data, given the current weights (expectation), can be computed. This is the Maximisation step. By iterating between estimating weights and estimating parameters, the likelihood does not decrease on any step.
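The displayed equations for this section are not reproduced in the text, so the following is a reconstruction from the prose rather than the authors' own typesetting. It assumes notation not given above: $x_i$ is the gaze time headway of observation $i$, $t_i$ is its time into the trial, and $K = S + 2$ counts the GF cluster, the $S$ salient-point clusters, and the noise cluster. The parameter updates in the last two lines apply only to clusters with free intercepts and standard deviations; the noise cluster's parameters stay fixed.

```latex
\begin{align}
p(x_i \mid t_i) &= \sum_{k=1}^{K} \omega_k \,
    \mathcal{N}\!\left(x_i \mid \alpha_k + \beta_k t_i,\ \sigma_k^2\right), \\
\log L &= \sum_{i=1}^{n} \log \sum_{k=1}^{K} \omega_k \,
    \mathcal{N}\!\left(x_i \mid \alpha_k + \beta_k t_i,\ \sigma_k^2\right), \\
w_{i,k} &= \frac{\omega_k \, \mathcal{N}\!\left(x_i \mid \alpha_k + \beta_k t_i,\ \sigma_k^2\right)}
               {\sum_{j=1}^{K} \omega_j \, \mathcal{N}\!\left(x_i \mid \alpha_j + \beta_j t_i,\ \sigma_j^2\right)}, \\
\omega_k &= \frac{1}{n} \sum_{i=1}^{n} w_{i,k}, \\
\alpha_k &= \frac{\sum_{i=1}^{n} w_{i,k}\,(x_i - \beta_k t_i)}{\sum_{i=1}^{n} w_{i,k}}, \qquad
\sigma_k^2 = \frac{\sum_{i=1}^{n} w_{i,k}\,(x_i - \alpha_k - \beta_k t_i)^2}{\sum_{i=1}^{n} w_{i,k}}.
\end{align}
```

Because each slope $\beta_k$ is fixed, the weighted regression in the Maximisation step reduces to a weighted mean and a weighted standard deviation of the slope-corrected residuals.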
If appropriate initial values are set, the algorithm converges to the posterior mode. The current implementation of the model had two gaze clusters (the GF cluster and the bend entry cluster) and a fixed noise cluster, and therefore four free parameters (two intercepts, $\alpha_G$ and $\alpha_E$, and two standard deviations, $\sigma_G$ and $\sigma_E$). Since the final fit is sensitive to initial values, we adopt a sparse grid of initial values and select the fit with the highest likelihood. The initial value of the GF intercept ($\alpha_G$) was fixed at $\bar{x}$. The initial values for $\alpha_E$, $\sigma_G$, and $\sigma_E$ were gridded around the intercept corresponding to the point of highest gaze density in the pooled dataset (for $\alpha_E$) and around $\bar{\sigma}$ (for $\sigma_G$ and $\sigma_E$). The gridding process resulted in 27 separate fits.
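The fitting procedure can be illustrated with a minimal Python sketch. This is not the authors' implementation (which is in the repository cited as [2]); the function names, the three-values-per-parameter grid (chosen so that 3 × 3 × 3 = 27), the grid spacing, the convergence tolerance, and the simulated data are all assumptions made for illustration.

```python
import itertools
import numpy as np

SLOPES = np.array([0.0, -1.0, 0.0])  # GF, EF, noise: slopes are fixed


def normal_pdf(x, mean, sd):
    """Normal density, evaluated elementwise with broadcasting."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def e_step(t, x, alpha, sigma, omega):
    """Return normalised weights w (n x 3) and the mixture log-likelihood."""
    means = alpha + SLOPES * t[:, None]
    dens = omega * normal_pdf(x[:, None], means, sigma)
    total = dens.sum(axis=1)
    return dens / total[:, None], np.log(total).sum()


def fit_em(t, x, alpha_g0, alpha_e0, sigma_g0, sigma_e0, n_iter=500, tol=1e-8):
    """Run EM from one set of initial values; noise parameters stay fixed."""
    alpha = np.array([alpha_g0, alpha_e0, x.mean()])
    sigma = np.array([sigma_g0, sigma_e0, 2.0 * x.std()])
    omega = np.full(3, 1.0 / 3.0)
    w, loglik = e_step(t, x, alpha, sigma, omega)
    for _ in range(n_iter):
        # M-step: mixture proportions, then free intercepts and SDs (GF, EF).
        omega = w.mean(axis=0)
        for k in (0, 1):
            resid = x - SLOPES[k] * t  # remove the fixed slope
            wk = w[:, k] / w[:, k].sum()
            alpha[k] = np.sum(wk * resid)
            # Small floor on the SD (an assumption) guards against collapse.
            sigma[k] = max(np.sqrt(np.sum(wk * (resid - alpha[k]) ** 2)), 1e-3)
        # E-step: weights and log-likelihood under the updated parameters.
        w, new_loglik = e_step(t, x, alpha, sigma, omega)
        converged = new_loglik - loglik < tol
        loglik = new_loglik
        if converged:
            break
    return {"alpha": alpha, "sigma": sigma, "omega": omega, "loglik": loglik}


def grid_fit(t, x, alpha_e_centre, alpha_step=0.5, sd_scales=(0.5, 1.0, 1.5)):
    """Fit from a 3 x 3 x 3 grid of initial values; keep the best fit."""
    sd = x.std()
    alpha_e_grid = [alpha_e_centre + d * alpha_step for d in (-1, 0, 1)]
    sd_grid = [s * sd for s in sd_scales]
    fits = [
        fit_em(t, x, x.mean(), a_e, s_g, s_e)
        for a_e, s_g, s_e in itertools.product(alpha_e_grid, sd_grid, sd_grid)
    ]
    return max(fits, key=lambda f: f["loglik"]), fits


# Simulated example: a GF cluster at 2 s and an EF cluster with intercept 6 s.
rng = np.random.default_rng(0)
t_gf = rng.uniform(0.0, 2.5, 420)
t_ef = rng.uniform(0.0, 2.5, 180)
x_gf = rng.normal(2.0, 0.3, 420)
x_ef = 6.0 - t_ef + rng.normal(0.0, 0.2, 180)
t = np.concatenate([t_gf, t_ef])
x = np.concatenate([x_gf, x_ef])

best, fits = grid_fit(t, x, alpha_e_centre=6.0)
print(len(fits), best["alpha"][:2], best["omega"])
```

In this sketch `alpha_e_centre` stands in for the intercept at the point of highest gaze density in the pooled dataset, which would need to be estimated separately.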
To retain the trial-by-trial structure when sampling from the model, each trial's time headway density is estimated using the regression weights, $(w_{i1}, \dots, w_{iK})$, for each gaze observation in that trial. These trial densities are then averaged across trials to get a participant's average time headway density. From these average densities the mean, standard deviation, and cluster weights are calculated. Individual fits for every condition can be viewed in the supplementary materials. The mixture model implementation can be found in an online repository [2].
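One way to read the trial-wise averaging is sketched below in Python. It is an interpretation under stated assumptions, not the authors' code: each observation is assumed to contribute its component densities in proportion to its fitted weights, densities are evaluated on a common uniform grid, and the function names and the `trials` structure (a list of dicts with keys `t` and `w`) are hypothetical.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Normal density, evaluated elementwise with broadcasting."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def trial_density(grid, t, w, alpha, slopes, sigma):
    """Time headway density for one trial, evaluated on `grid`.

    t: (n,) time into trial; w: (n, K) fitted weights for each observation.
    """
    means = alpha + slopes * t[:, None]                   # shape (n, K)
    comp = normal_pdf(grid[:, None, None], means, sigma)  # shape (G, n, K)
    return (w * comp).sum(axis=(1, 2)) / w.sum()


def participant_density(grid, trials, alpha, slopes, sigma):
    """Average the per-trial densities so each trial counts equally."""
    per_trial = [
        trial_density(grid, tr["t"], tr["w"], alpha, slopes, sigma)
        for tr in trials
    ]
    return np.mean(per_trial, axis=0)


def density_mean_sd(grid, dens):
    """Mean and SD of a density on a uniform grid (Riemann sum)."""
    dx = grid[1] - grid[0]
    mean = np.sum(grid * dens) * dx
    sd = np.sqrt(np.sum((grid - mean) ** 2 * dens) * dx)
    return mean, sd


# Two illustrative trials with made-up weights for GF and EF clusters.
alpha = np.array([2.0, 6.0])
slopes = np.array([0.0, -1.0])
sigma = np.array([0.5, 0.4])
grid = np.linspace(-4.0, 10.0, 1401)
trials = [
    {"t": np.array([0.0, 0.5, 1.0]), "w": np.array([[1.0, 0.0]] * 3)},
    {"t": np.array([0.5, 1.0]), "w": np.array([[0.2, 0.8], [0.1, 0.9]])},
]
dens = participant_density(grid, trials, alpha, slopes, sigma)
print(density_mean_sd(grid, dens))
```

Averaging per-trial densities, rather than pooling all observations, keeps trials with many gaze samples from dominating the participant-level density.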

SI Appendix 2. Bayes Model Details
This appendix describes the details of the distributional models used for inference. For all the models used in this paper, each participant has a single observation per condition (a median or a mean). In all cases, the spread of participant means is approximated by a normal distribution. We wish to model one factor (Driving Mode) with three levels (Manual, Auto-Replay, Auto-Stock). Treating Manual as the reference condition (i.e. the model's intercept; $\beta_M$) means that the coefficients for Auto-Replay ($\beta_R$) and Auto-Stock ($\beta_S$) are modelled as deflections from the Manual condition, controlled by binary variables ($R$, $S$) that denote the presence of the condition.
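Written out, the dummy-coded linear predictor described above can be sketched as follows. The symbols $y_j$ (a participant-level observation), $\mu_j$, and $\sigma$ are assumed notation, and the sketch omits the priors and any participant-level terms, which are not reproduced here.

```latex
\begin{align}
y_j &\sim \mathcal{N}(\mu_j, \sigma), \\
\mu_j &= \beta_M + \beta_R R_j + \beta_S S_j, \qquad R_j, S_j \in \{0, 1\}.
\end{align}
```

Under this coding a Manual observation has $\mu_j = \beta_M$, an Auto-Replay observation has $\mu_j = \beta_M + \beta_R$, and an Auto-Stock observation has $\mu_j = \beta_M + \beta_S$.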
Weakly informative priors are specified based on previous literature (the inferences remained unchanged when we tested a range of priors). A single distributional model form is used for all time headway inferences, whereas the tracking duration model was fitted with a log-normal distribution. All models were fitted in R using the package brms [3], with 1200 iterations split across four chains.
Entry Fixation Placements
After model fitting, for each participant the Entry Fixation (EF) cluster has an intercept, in units of time along the midline, and a standard deviation. Upon inspection of the individual fits, it seems that participants with a high EF cluster standard deviation often did not clearly fixate on a single point (e.g. participant 5, SI Fig 8). For these participants, the EF categorisation did well at separating gaze which diverged from the primary GF cluster, but did poorly at estimating a single point of fixation. Therefore, when estimating where along the track participants looked, on average, each observation (intercept) is weighted by its precision ($1/\sigma^2$) to avoid poor fits biasing the population estimate.
A) The entry fixation placement for each participant, for each driving mode, placed along the midline reference. The precision of each EF cluster is indicated by the dot size and transparency (highly certain observations are small and bold). The weighted means are also shown. B) Bayesian weighted linear regression was performed, with the posterior estimates of the mean intercepts for each condition shown. C) The posterior contrasts between each driving mode condition. Any differences are estimated to be very small (< 0.09 s, or < 71 cm), so we recommend that the EF fixation placement be considered practically equivalent across conditions.
Fitted Cluster Weights
The fitted cluster weights across driving modes, for each participant. For each participant the gaze probability associated with the noise cluster has been removed (see Table 1), with the remaining gaze probability normalised across entry fixation and guiding fixation clusters. The black lines denote the driving mode mean.
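The precision weighting of the entry fixation placements and the removal of the noise cluster from the fitted weights both come down to simple arithmetic. The Python sketch below shows one way to compute them; the function names and example values are hypothetical, and it covers only the weighted mean and the renormalisation, not the Bayesian weighted regression.

```python
import numpy as np


def precision_weighted_mean(intercepts, sds):
    """Mean of EF intercepts, each weighted by its precision 1 / sd^2."""
    intercepts = np.asarray(intercepts, dtype=float)
    precision = 1.0 / np.asarray(sds, dtype=float) ** 2
    return np.sum(precision * intercepts) / np.sum(precision)


def renormalise_without_noise(w_gf, w_ef):
    """Drop the noise cluster and rescale GF and EF weights to sum to one."""
    total = w_gf + w_ef
    return w_gf / total, w_ef / total


# Made-up values: the second participant has a tighter EF cluster, so its
# intercept pulls the weighted mean more strongly than the first.
placement = precision_weighted_mean([1.0, 3.0], [1.0, 0.5])
gf_share, ef_share = renormalise_without_noise(0.6, 0.2)
print(placement, gf_share, ef_share)
```

With precision weights, a participant whose EF cluster has twice the standard deviation contributes a quarter of the weight, which is how poorly localised fits are kept from biasing the population estimate.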

Individual Fits
Listed below are the individual mixture model fits for each driving mode, each titled with the participant number and the driving mode. The caption for all figures is as follows: A) Gaze data, with gaze time headway (TH) along the ordinate and Time into Trial along the abscissa. Gaze fixation data are shaded according to the probabilities of belonging to Guiding Fixations (GF; blue), Entry Fixations (EF; red) or Noise (grey) clusters. B) The regression lines and standard deviations of each model, from which a weighted sample is taken. C) Smoothed average cluster weights across the track. D) Raw gaze TH density (solid line) overlaid with the fitted gaze TH density (dashed line). E) The fitted gaze TH density decomposed into mixtures of GF (blue) and EF (red).