Synchronization of passes in event and spatiotemporal soccer data

The majority of soccer analysis studies investigate specific scenarios by applying computational techniques to either spatiotemporal position data (movement of players and the ball on the pitch) or event data (relating to significant situations during a match). Yet only a few applications analyze both data sources jointly, despite the advantages of such an approach. One possible reason is a non-systematic error in the event data, which causes a temporal misalignment of the two data sources. To address this problem, we propose a solution that combines the SwiftEvent online algorithm (Gensler and Sick in Pattern Anal Appl 21:543–562, 2018) with a subsequent refinement step that corrects pass timestamps by exploiting the statistical properties of passes in the position data. We evaluate our proposed algorithm on ground-truth pass labels of four top-flight soccer matches from the 2014/15 season. Results show that the percentage of passes within half a second of the ground truth increases from 14% to 70%, while our algorithm also detects localization errors (noise) in the position data. A comparison with other models shows that our algorithm is superior to baseline models and comparable to a deep learning pass detection method (while requiring significantly less data). Hence, our proposed lightweight framework offers a viable solution that enables groups with limited access to (recent) data sources to effectively synchronize passes in the event and position data.

Time-series event detection algorithms relate to data-mining techniques and thus focus on detecting specific patterns in the time-series. Approaches in this area can be distinguished by the methods they use, which divide into classical statistical and more recent machine learning approaches [28]. A comparison of these methods in the field of time-series forecasting demonstrated that statistical methods outperform machine learning approaches in the majority of cases [28]. Although statistical methods in the field of event detection are well examined (cf. change point detection) [29], the similarities to other pattern recognition tasks have prompted various studies on the application of supervised and unsupervised machine learning algorithms [30]. Accordingly, the majority of recent approaches in time-series event detection are based on machine learning techniques, while only a few studies investigate the performance of statistical methods.
Soccer event detection has been implemented successfully by various approaches that apply statistical and machine learning methods to video data [31–40], audio data [31,41,42], and data gathered from social networks [43]. The focus of these approaches lies on automatically creating a summary of the match, which is of high interest for viewer-based applications, e.g., the creation of highlight videos. Consequently, research in recent years has mainly addressed the detection of infrequently occurring important events, for which a precise determination of the exact frame is less important.
In detail, multiple studies targeted the detection of goals as the most important event in soccer. A heuristic method focusing on the intensity of video and audio data was introduced [31]. Two independent event detection algorithms on video data were used to perform the real-time detection of goals and near misses [34]. Furthermore, the detection of probable scoring opportunities from video data was proposed [39]. Additionally, a variety of studies summarize the match by detecting more detailed events beyond goals. The application of complex machine learning techniques to detect goals, shots, corner kicks, and cards was proposed [33]. Similar studies also address the recognition of replays in broadcast videos to detect different event types using support vector machines [32], convolutional neural networks [35], multiple instance learning [38,41], trajectory-based deep convolutional descriptors [36], or a combination of support vector machines and neural networks [40]. Audio data was also used in various approaches [31,41,42] where, for instance, the sentiment of the commentators was analyzed along with situation-specific sounds to detect highlights. Another modality has been proposed by Van Oorschot et al. [43], who scraped social network posts commenting on the match to recognize important events.
In contrast, only a few studies focus on the detection of events that occur with high frequency (passes, shots, tackles). An approach similar to the previously presented studies applied Bayesian networks to broadcast videos [37] and was able to further detect non-highlight events. The automatic detection of ball possession of individual players based on video data was implemented using object detection methods and deep learning [44]. A similar study applied long short-term memory neural networks to video data to automatically detect time spans in which a pass is played [13]. In contrast, a frame-accurate annotation of atomic pass events without a duration was conducted by a machine learning approach using self-attention on both video and position data [14]. The latter study detected shots, receptions, and passes from the respective data without additional information about the individual player or event. However, since this information can be obtained from, e.g., the event data, an application that uses this information to synchronize shots in the event data with the position data has already been proposed [7]. A computationally less demanding approach for event detection has been proposed by Vidal et al. [15]; however, their algorithm requires high-quality, exact position data, which restricts its applicability in cases where such data is not available.
In this work, we extend the previously presented study to the synchronization of passes. To this end, we first perform a general pass event detection using the SwiftEvent algorithm [1] by conducting a feature extraction and probabilistic classification. Subsequently, we apply the detection algorithm to refine the imprecise pass annotation from the event data. We evaluate the performance of the algorithm by performing a 4-fold cross-validation on our dataset and show that our proposed methods strongly improve the degree of synchronization of the passes in event and position data. In contrast to previously proposed machine learning algorithms, our approach requires a very small amount of data and is able to detect localization errors in the position data. Thus, this work addresses research groups with limited availability of recent position data who aim to perform a joint analysis of passes in the spatiotemporal position and event data.

Pass event synchronization using time-series analysis
In this section, we describe the proposed methods to improve the synchronization of pass events in soccer games using position and event data. First, we define the problem and present our input data. Second, we discuss relevant information to detect passes in the position data. For the synchronization of shots, a previous algorithm proposed using player-ball distance and ball acceleration [7,45]. Consequently, we adopt this procedure for our approach and present the computation of these signals from the position data in Section "Computation of player-ball distance and ball acceleration time-series", the segmentation into time-series windows in Section "Segmentation of player-ball distance and ball acceleration time-series", and post-processing steps in Section "Post-processing".
Concerning the methods for the detection of passes, we need to deal with the novelty of the task and the lack of research in this specific domain. As pointed out in Section "Introduction", the highly competitive nature of professional soccer prevents the majority of applications from accessing large datasets. Therefore, we want to use different methods than the previously proposed large-scale machine learning algorithms [14] and decide on a lightweight framework.
Regarding the specific design of features, we cannot rely on a feature space that has already been evaluated. A previous approach [7] used features originating from an additional spatial annotation in the event data, which, however, is not always included in the event data. Thus, we cannot simply adopt the utilized feature space. To

Problem definition
Given the position data, the event annotation data (consisting of imprecise temporal annotations for passes in the match from a data provider), and the expert annotations (consisting of precise temporal annotations of passes in the match), our objective is (1) to establish a general pass event detection from the expert annotation and position data and (2) to apply it to the imprecise event annotation for passes to estimate the exact time when the ball left the foot. We repeat this procedure for all passes in the match to ultimately improve the degree of synchronization between these data sources.
Positional data: The spatiotemporal data is automatically captured by specific camera systems [5] or positioning systems [6] and contains trajectories of the players and the ball. We regard it as temporally exact and define it as the synchronization target for the refinement of the event data. For a specific match, we obtain positions in two coordinates for all $R$ players on the pitch and the ball, captured with a sampling frequency of $f_s = 25\,\mathrm{Hz}$. We define the position of an individual player $r = 1, \ldots, R$ at frame $k = 1, \ldots, K$ as a 2-tuple (an ordered pair) $m^r(k) = (m^r_x(k), m^r_y(k))$. Here, $K$ refers to the total number of samples during the match, while $m^r_x(k)$ and $m^r_y(k)$ denote the x- and y-coordinate of player $r$ at frame $k$, respectively. Furthermore, we define the player position vector $\mathbf{m}^r = [m^r(1), \ldots, m^r(K)]^T \in \mathbb{R}^K$ with the $K$ player position tuples over the course of the match as elements. Analogously, we introduce the ball position vector $\mathbf{b} = [b(1), \ldots, b(K)]^T \in \mathbb{R}^K$ with $b(k) = (b_x(k), b_y(k))$, where $b_x(k)$ denotes the x-coordinate and $b_y(k)$ denotes the y-coordinate of the ball. Finally, we combine the individual player positions for players $r = 1, \ldots, R$ in the aggregated player positions matrix $\mathbf{M} = [\mathbf{m}^1, \ldots, \mathbf{m}^R] \in \mathbb{R}^{K \times R}$. The entire spatiotemporal position data of the match is thus captured in the matrix $\mathbf{M}$ and the vector $\mathbf{b}$. We account for substitutions by appending the player-ball distance of the in-sub to the distances of the respective out-sub, and for red cards or injuries by appending zeros to the suspended player's column in $\mathbf{M}$.
Event Data and Error Analysis: Event data for soccer matches can be obtained from various providers (e.g., [8,9]). It is typically captured by human annotators, and its annotation can therefore be temporally imprecise. Here, two different types of errors are encountered: (1) a systematic error in the absolute timestamps of the position and the event data, and (2) a non-systematic error which varies largely between passes [7]. While the compensation of the latter is the primary task of our approach, we also need to account for the systematic error. This error can be caused by various circumstances, e.g., the human event data annotator watching the match from a broadcast video via either terrestrial or satellite transmission with the respective transmission delays. We compensate for the systematic error as previously proposed [7] by using relative timestamps in both data sources, each computed as the temporal difference between the timestamp and the kickoff. In this way, we examine the pass annotation for four matches in professional European soccer for which we obtain video and position data along with the event data. From the latter, we gather the relative timestamps as well as information about the passing player. We group all $N_u$ pass annotations $l^u_1, \ldots, l^u_{N_u}$ during a match and denote them as imprecise pass labels $L^u = \{l^u_1, \ldots, l^u_{N_u}\}$. The information about timestamp and player is then stored as a 2-tuple $l^u_i = (k^u_i, r^u_i)$, where $k^u_i \in \{1, \ldots, K\}$ is the timestamp of the pass and $r^u_i \in \{1, \ldots, R\}$ is the passing player.
Annotation of Precise Expert Pass Labels: To further investigate the non-systematic error, we carefully acquire precise pass labels for the four matches through annotations by a domain expert. The expert determines the exact point in time at which the ball left the foot of the respective passing player by analyzing the video frame by frame, as illustrated in Fig. 2. For this purpose, we implement a custom application that projects the annotated passes from the video onto the position data. We refer to the total of $N_p$ manual annotations as expert pass labels $L^p = \{l^p_1, \ldots, l^p_{N_p}\}$ with $l^p_i = (k^p_i, r^p_i)$, analogously to above. In this way, we are able to sample the non-systematic error between the expert pass labels and the imprecise pass labels. An example of the systematic error is displayed in Fig. 3 and is used in Section "Comparison to baselines and state of the art" for the design of a statistical baseline method which addresses the systematic error of delayed pass annotations in the original event data.
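The compensation of the systematic error through kickoff-relative timestamps can be illustrated with a short sketch. The function name `to_relative_frame`, the use of seconds as the time unit, and the rounding to the nearest frame are assumptions made for this illustration and are not taken from the original implementation.

```python
def to_relative_frame(timestamp_s, kickoff_s, f_s=25.0):
    """Map an absolute timestamp (seconds) to a 1-based frame index k
    measured relative to the kickoff of the same data source.

    Assumption: timestamps are given in seconds and are rounded to the
    nearest frame of the f_s = 25 Hz position data.
    """
    if timestamp_s < kickoff_s:
        raise ValueError("timestamp lies before kickoff")
    return int(round((timestamp_s - kickoff_s) * f_s)) + 1


# Hypothetical values: each source is referenced to its own kickoff,
# so a constant clock offset between the two sources cancels out.
event_frame = to_relative_frame(3605.2, kickoff_s=3600.0)
tracking_frame = to_relative_frame(12.2, kickoff_s=7.0)
```

Only the constant offset is removed in this way; the pass-specific, non-systematic error remains and is the subject of the refinement step.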

Computation of player-ball distance and ball acceleration time-series
As already introduced in Section "Problem definition", we describe the position data of a match in terms of the player positions matrix $\mathbf{M} \in \mathbb{R}^{K \times R}$ and the ball position vector $\mathbf{b} \in \mathbb{R}^K$, where the single entries $m^r(k)$ and $b(k)$ represent 2-tuples with the two-dimensional field position of player $r = 1, \ldots, R$ and the ball, respectively. A previous approach detects shots in the position data using player-ball distance and ball acceleration [7]. We therefore aim to exploit these quantities in the detection of passes and present the computation of the player-ball distance and ball acceleration time-series from the position data. We compute the distance of player $r$ to the ball at frame $k = 1, \ldots, K$, i.e., $d^r(k) = \lVert m^r(k) - b(k) \rVert$ with $\lVert \cdot \rVert$ being the Euclidean norm. Furthermore, we introduce the ball acceleration $a(k)$ that approximates the real ball acceleration at frame $k$. To compute $a(k)$, we take the first derivative of the ball velocity in a single finite-difference step, where the interval between adjacent points is one.
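The two signals can be sketched in a few lines of Python. The distance follows the definition of $d^r(k)$ given above. For the acceleration, the sketch assumes a second-order central difference of the ball position with unit frame spacing and takes its magnitude; this is one plausible reading of "the first derivative of the ball velocity in one step" and not necessarily the exact formula used by the authors. Function names and the zero-padding at the edges are likewise assumptions.

```python
import math


def player_ball_distance(player_xy, ball_xy):
    """Euclidean player-ball distance d_r(k) for every frame k."""
    return [math.hypot(px - bx, py - by)
            for (px, py), (bx, by) in zip(player_xy, ball_xy)]


def ball_acceleration(ball_xy):
    """Approximate ball acceleration a(k) per frame.

    Assumption: magnitude of the second-order central difference
    b(k+1) - 2*b(k) + b(k-1) with unit spacing between frames.
    The first and last frame have no two-sided neighbourhood and
    are set to 0.0 in this sketch.
    """
    n_frames = len(ball_xy)
    acc = [0.0] * n_frames
    for k in range(1, n_frames - 1):
        ax = ball_xy[k + 1][0] - 2.0 * ball_xy[k][0] + ball_xy[k - 1][0]
        ay = ball_xy[k + 1][1] - 2.0 * ball_xy[k][1] + ball_xy[k - 1][1]
        acc[k] = math.hypot(ax, ay)
    return acc


# Toy trajectories (hypothetical values, arbitrary length units).
ball = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
player = [(3.0, 4.0)] * len(ball)
distances = player_ball_distance(player, ball)
accelerations = ball_acceleration(ball)
```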

Segmentation of player-ball distance and ball acceleration time-series
The previous section presented the computation of player-ball distance and ball acceleration time-series from the position data of a match. In this section, we discuss the segmentation of the previously computed time-series. We choose a sliding window approach to separate the time-series into windows (with fixed window length $N$) while also allowing for an overlap between windows (according to the chosen window shift $S$).
The goal of this is to divide the information time-series into smaller parts describing short periods of the match. To achieve this, we individually segment the time-series to obtain player-ball distance windows and ball […] assign a negative window label $l^r_w = 0$. Please note that this process might require a tolerance (analogous to the event detection zone [1]) around the window center $k^{n_c}_w$ if the window shift $S$ is greater than $1/f_s$, with $f_s$ denoting the sampling frequency (see Section "Problem definition").
Finally, we emphasize that the procedure described in this section could easily be extended to an online approach, which is technically realized by a fixed time-series window in which the last data point of the window is iteratively replaced by a new data point. Accordingly, we choose a window shift of $S = 1/f_s$, which corresponds to updating a fixed time-series window by a single frame at each iteration.
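The sliding-window segmentation can be illustrated as follows. This is a minimal sketch in which window length and shift are given in samples rather than seconds (at 25 Hz, a shift of one sample corresponds to $S = 1/f_s$) and frame indices are 0-based. The function names, the choice of `n // 2` as window center, and the rule that a window is labeled positive when a pass frame falls within `tolerance` samples of its center are assumptions made for illustration.

```python
def sliding_windows(series, n, s=1):
    """Split a time-series into overlapping windows.

    n: window length in samples, s: window shift in samples.
    Returns a list of (center_index, window) pairs.
    """
    if n < 1 or s < 1:
        raise ValueError("window length and shift must be positive")
    return [(start + n // 2, series[start:start + n])
            for start in range(0, len(series) - n + 1, s)]


def window_labels(centers, pass_frames, tolerance=0):
    """Label a window 1 if a pass frame lies within `tolerance`
    samples of its center, else 0 (assumed labeling rule)."""
    return [int(any(abs(c - p) <= tolerance for p in pass_frames))
            for c in centers]


series = list(range(10))              # stand-in for d_r(k) or a(k)
windows = sliding_windows(series, n=5, s=1)
centers = [c for c, _ in windows]
labels = window_labels(centers, pass_frames=[4])
```

With a shift larger than one sample, a pass frame may fall between two window centers, which is why a non-zero `tolerance` becomes necessary in that case.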

Post-processing
The spatiotemporal position data captured automatically by video tracking systems, although temporally precise, can be partly unstable and inaccurate, which has already been reported [7] and especially affects the ball position data. For example, the ball could be invisible to (some cameras in) the multi-camera system due to occlusions, e.g., by the crowd when a lofted pass with a high trajectory is played. In such cases, the ball position is often assigned to a nearby player position until it becomes visible to the tracking system again. However, this process of visually retrieving the ball can take some time even after the ball has technically become visible again to the video tracking system.
As a consequence of this kind of incorrect localization, individual values of player-ball distance and ball acceleration are erroneous. We partially compensate for this in our post-processing step by applying a low-pass filter with cutoff frequency $f_c$. If the cutoff is chosen appropriately, the low-pass filter smooths the time-series by excluding high frequencies from the signal (see example in Fig. 1) while preserving its envelope. Since these frequencies are likely to originate from undesired artifacts (or noise) in the tracking process, this procedure can enhance signal quality. Admittedly, this does not apply to long periods of incorrect localization. Thus, we address the detection of passes with faulty position data in Section "Pass event refinement and outlier detection".
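The effect of such a smoothing step can be sketched with a simple filter. The text does not specify the filter design, so the example below assumes a first-order (RC-type) recursive low-pass purely for illustration; the function name `low_pass`, the parameter values, and the toy signal are hypothetical.

```python
import math


def low_pass(signal, f_c, f_s=25.0):
    """First-order recursive low-pass filter.

    Assumption: the filter type is not given in the text; a discrete
    RC low-pass with smoothing factor alpha = dt / (RC + dt) is used
    here, where RC = 1 / (2 * pi * f_c) and dt = 1 / f_s.
    """
    if not signal:
        return []
    dt = 1.0 / f_s
    rc = 1.0 / (2.0 * math.pi * f_c)
    alpha = dt / (rc + dt)
    out = [signal[0]]
    for x in signal[1:]:
        out.append(out[-1] + alpha * (x - out[-1]))
    return out


# A single-frame spike, e.g. from a brief ball localization error.
noisy = [0.0, 0.0, 10.0, 0.0, 0.0]
smoothed = low_pass(noisy, f_c=2.0)
```

A causal first-order filter of this kind delays the signal slightly, so a zero-phase design may be preferable when frame-accurate timing matters.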

Feature extraction
Since, in general, the size of the previously presented time-series window matrix $Y^r$ depends on the segmentation parameters $N$ and $S$ (see Section "Segmentation of player-ball distance and ball acceleration time-series"), it can become very large (cf. Section "Dataset"). Considering memory and computing time limitations, we thus regard it as favorable to reduce the amount of data by extracting descriptive features from the individual time-series windows $Y^r_w$. Concerning the specific design of the features, we need to deal with the novelty of the task and the lack of research in this specific domain. As pointed out at the beginning of this section, we cannot simply adopt the utilized feature space. Therefore, we separately investigated a broad range of suitable features for the characterization of player-ball distance and ball acceleration. Among other descriptive features, we choose the minimum and maximum values and their respective positions within a time-series window, the value at the window center, the curvature mean, and the curvature at the window center. Here, the curvature is approximated as the instantaneous frame difference, e.g., for a time-series $y(k)$ at sample $k$ as $[y(k + 1) - y(k - 1)]/2$. Moreover, we separately conduct a polynomial approximation for player-ball distance and ball acceleration within the time-series windows and use the obtained weights of this approximation as features. We illustrate the extraction process for a representative choice of features in Fig. 5 and refer to the Appendix for a complete list of examined features and a detailed explanation of the polynomial approximation procedure.
The extraction of individual descriptive features from the time-series windows $Y^r_w$ allows the introduction of a feature representation vector $f^r_w \in \mathbb{R}^D$, which comprises a combination of $D$ descriptive features for either player-ball distance or ball acceleration (see Fig. 5, right). We performed an extensive evaluation of various combinations of features (listed in the Appendix) and decided on three distinct feature configurations (see Table 1) that achieved the highest performance in explorative experiments while being manageable in their complexity. Another finding from these experiments is that such lightweight feature configurations perform similarly to more complex configurations (i.e., using all listed features from the Appendix). This agrees with the SwiftEvent algorithm, which works on low-dimensional feature spaces to enable computation in real time. This is an important aspect of our algorithm regarding its real-world applicability to different tasks (see Section "Introduction"). For the sake of conciseness, we thus decide to report detailed experimental results in Section "Evaluation" only for the three configurations from Table 1.

Pass event detection
We design an event detection algorithm for passes as a binary classification problem which unfolds based on the extracted feature representations (see Section "Feature extraction") of the individual time-series windows (see Section "Segmentation of player-ball distance and ball acceleration time-series").
Due to the absence of large-scale datasets labeled for precise pass events [9], we want to perform the pass event detection with a lightweight probabilistic framework that requires only a small amount of training data. Moreover, to ensure applicability, our approach is also required to run in real time. These requirements are fulfilled by the SwiftEvent algorithm, which has been proposed for feature-based supervised event detection in time-series [1]. Thus, we adopt the general workflow of the algorithm while refining it for the given pass event detection problem.
Our player-ball distance and ball acceleration time-series are given within windows $k_w$ with $w = 1, \ldots, W$ and players $r = 1, \ldots, R$. The occurrence of events is indicated by window labels $l^r_w$ (see Section "Segmentation of player-ball distance and ball acceleration time-series"), and we extract feature representations $f^r_w \in \mathbb{R}^D$. We follow the SwiftEvent algorithm [1], which assumes that the features are normally distributed random variables, i.e., $f^r_w \sim \mathcal{N}(f \mid \mu, \Sigma)$ [1], and compute the suggested Mahalanobis distances $\Delta_{\Sigma_0}(f^r_w)$ and $\Delta_{\Sigma_1}(f^r_w)$ to the centers of the distributions for the two window label classes $l^r_w = 0$ and $l^r_w = 1$. Similar to SwiftEvent, these distances serve as criteria in the detection of an event within window $k_w$.
However, in contrast to Gensler and Sick [1], we do not evaluate learned thresholds. Instead, we propose a novel method to compute probabilities for the present pass event detection problem, where we expect the number of labels $l^r_w = 0$ in the binary classification task to significantly exceed the number of labels $l^r_w = 1$. Accordingly, we define the pass event probability $P(f^r_w)$ for $w = 1, \ldots, W$ as

$$P(f^r_w) = \begin{cases} \tilde{P}(f^r_w), & \text{if } \Delta_{\Sigma_1}(f^r_w) < \Delta_{\Sigma_0}(f^r_w), \\ 0, & \text{otherwise,} \end{cases}$$

such that the Mahalanobis distance to the window label class $l^r_w = 1$ needs to be lower than the distance to the window label class $l^r_w = 0$ before a non-zero prediction is made. Moreover, to obtain probability values in the range $[0, 1]$, we define the normalized probability $\tilde{P}(f^r_w)$ as

$$\tilde{P}(f^r_w) = \frac{\mathcal{N}(f^r_w \mid \mu_1, \Sigma_1)}{\mathcal{N}(\mu_1 \mid \mu_1, \Sigma_1)},$$

i.e., the probability of the normally distributed random variable divided by the probability at the distribution mean $\mu_1$, which is computed from all feature representations with window label class $l^r_w = 1$ as

$$\mu_1 = \frac{1}{W_1} \sum_{w:\, l^r_w = 1} f^r_w.$$

Here, $W_1$ represents the number of windows with a positive window label $l^r_w = 1$ for player $r$.
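The decision rule described above can be sketched as follows. The sketch assumes that the normalized probability is the class-1 Gaussian density divided by its value at the mean, which reduces to $\exp(-\Delta_{\Sigma_1}^2 / 2)$. For brevity it also assumes a diagonal covariance matrix, so that the Mahalanobis distance needs no matrix inversion; all function names and the variance floor are hypothetical choices, not the authors' implementation.

```python
import math


def fit_gaussian(features):
    """Per-dimension mean and variance of a list of feature vectors.

    Assumption: diagonal covariance (independent feature dimensions).
    """
    n, d = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    var = [max(sum((f[j] - mean[j]) ** 2 for f in features) / n, 1e-12)
           for j in range(d)]
    return mean, var


def mahalanobis(f, mean, var):
    """Mahalanobis distance for a diagonal covariance matrix."""
    return math.sqrt(sum((x - m) ** 2 / v for x, m, v in zip(f, mean, var)))


def pass_probability(f, neg_params, pos_params):
    """Zero unless f is closer to the pass class than to the non-pass
    class; otherwise the class-1 density normalized by its peak value."""
    d0 = mahalanobis(f, *neg_params)
    d1 = mahalanobis(f, *pos_params)
    if d1 >= d0:
        return 0.0
    return math.exp(-0.5 * d1 ** 2)


# Hypothetical one-dimensional training features.
pos_params = fit_gaussian([[0.0], [2.0]])     # windows labeled 1
neg_params = fit_gaussian([[10.0], [12.0]])   # windows labeled 0
p_at_mean = pass_probability([1.0], neg_params, pos_params)
p_near = pass_probability([2.0], neg_params, pos_params)
p_far = pass_probability([11.0], neg_params, pos_params)
```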

Pass event refinement and outlier detection
To refine the existing imprecise pass labels from the event data source, we use the previously discussed general pass event detection (see Section "Pass event detection") to implement an informed maximum a posteriori (MAP) estimation. To this end, we inspect the pass event probabilities in the neighbourhood of a given individual imprecise pass label $l^u_i = (k^u_i, r^u_i)$ with $r^u_i = r$ and search for the feature representation $f^r_w$ with the highest pass event probability $P(f^r_w)$ (described in "Introduction" Section) within a generally defined search interval. This interval is chosen by closely inspecting the sampled temporal error of the imprecise pass labels (see Fig. 3) such that the majority of the total errors lie within it.
We initialize the refinement process by searching for the time-series window $Y^r_{w_u}$ at window $k_{w_u}$ whose window center $k^{n_c}_{w_u}$ is closest to $k^u_i$. We then examine the set of windows $\mathcal{K}^s_i = \{k_{w_v}, \ldots, k_{w_o}\}$, where $k^{n_c}_{w_v}$ and $k^{n_c}_{w_o}$ are the smallest and largest window centers that still lie within the search interval, respectively. Subsequently, we apply our event detection algorithm to the time-series windows $Y^r_w$ to compute pass event probabilities $P(f^r_w)$ for all feature representations $f^r_w$ that originate from windows $k_w \in \mathcal{K}^s_i$, and select the representation with the highest probability:

$$f^r_{w,\mathrm{opt}} = \underset{f^r_w:\; k_w \in \mathcal{K}^s_i}{\arg\max}\; P(f^r_w), \qquad (1)$$

$$P(f^r_{w,\mathrm{opt}}) = \max_{k_w \in \mathcal{K}^s_i} P(f^r_w). \qquad (2)$$

In consequence, we regard the window center $k^{n_c}_{w,\mathrm{opt}}$ of the refined time-series window $Y^r_{w,\mathrm{opt}}$ as the frame-wise exact refined pass label.
Finally, we use the previously computed maximum pass event probability $P(f^r_{w,\mathrm{opt}})$ as in (2) to approach the detection of outliers. We refer to an expert pass label (annotated from the video data) as an outlier if its underlying position data contains a large amount of noise. This is caused, for instance, by the previously mentioned poor localization of the ball (see Section "Post-processing") and manifests itself in unrealistic behavior of player-ball distance and ball acceleration at the expert pass label, e.g., a large ball distance of the passing player (see Fig. 8) and low ball acceleration (see Fig. 2). Since such unrealistic behavior also correlates with a low maximum pass event probability $P(f^r_{w,\mathrm{opt}})$, we are able to detect outliers by regarding $P(f^r_{w,\mathrm{opt}})$ as a confidence score and introducing a detection threshold $\tau$ in the refinement process. If $P(f^r_{w,\mathrm{opt}})$ does not exceed this threshold, the imprecise pass label is not refined and is recognized as an outlier. In this way, the algorithm is capable of detecting localization errors in the underlying position data at the expert pass labels according to the specific requirements of particular applications.
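The refinement and outlier logic described in this section can be summarized in a short sketch. It assumes that the pass event probabilities of the passing player are already available per window-center frame, that the search interval is expressed in frames relative to the imprecise label (the default of −150 to +20 frames corresponds to an interval of [−6 s, 0.8 s] at 25 Hz), and that the threshold value is chosen freely; the function name and the return format are hypothetical.

```python
def refine_pass_label(k_u, prob_by_frame, search=(-150, 20), tau=0.5):
    """Refine one imprecise pass label.

    k_u:            frame of the imprecise pass label
    prob_by_frame:  dict {window-center frame: pass event probability}
                    for the passing player (assumed precomputed)
    search:         search interval in frames relative to k_u
    tau:            detection threshold (illustrative value)

    Returns (frame, probability, is_outlier). If the highest
    probability does not exceed tau, the label is left unchanged
    and flagged as an outlier.
    """
    lo, hi = k_u + search[0], k_u + search[1]
    candidates = {k: p for k, p in prob_by_frame.items() if lo <= k <= hi}
    if not candidates:
        return k_u, 0.0, True
    k_opt = max(candidates, key=candidates.get)
    p_opt = candidates[k_opt]
    if p_opt <= tau:
        return k_u, p_opt, True
    return k_opt, p_opt, False


# Hypothetical probabilities around an imprecise label at frame 1000.
probs = {900: 0.2, 950: 0.9, 1100: 0.99}
refined = refine_pass_label(1000, probs)
strict = refine_pass_label(1000, probs, tau=0.95)
```

Frame 1100 has the highest probability overall but lies outside the search interval, so it is never considered as a candidate.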

Evaluation
In this section, we outline different experiments to evaluate the performance of our proposed algorithm and the impact of its components. We first describe the dataset and introduce a prerequisite for computing errors between imprecise pass labels and expert pass labels. We then discuss metrics to assess the pass refinement and noise removal. Based on these metrics, we select optimal parameters for the algorithm and, finally, compare the optimal configurations with three baselines.

Dataset
Our dataset comprises four matches from European top-flight soccer from the 2014/15 season with a total frame count of 280,187 positional data points. We present the errors of the event data annotation for the four matches in Fig. 6. In total, $N_u = 2404$ imprecise pass labels and $N_p = 2552$ annotated expert pass labels (see Section "Problem definition") were obtained for those matches. As a result, 2552 time-series windows $Y^r_w$ labeled for a valid pass event ($l^r_w = 1$) were extracted. The remaining time-series windows are labeled with a negative window label, i.e., $l^r_w = 0$. To evaluate the algorithm, we perform four-fold cross-validation by splitting the dataset into training data comprising three matches and test data containing the remaining match.
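The four-fold protocol amounts to a leave-one-match-out split, which the following sketch illustrates. The match identifiers are placeholders and the function name is an assumption.

```python
def leave_one_match_out(match_ids):
    """Return one (training matches, test match) pair per match."""
    splits = []
    for test_match in match_ids:
        train = [m for m in match_ids if m != test_match]
        splits.append((train, test_match))
    return splits


# Placeholder identifiers for the four matches of the dataset.
splits = leave_one_match_out(["match_1", "match_2", "match_3", "match_4"])
```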

Training
For training, we extract a total of $W_{\mathrm{train}}$ time-series windows $Y^r_w$ with window labels $l^r_w$ (see Sections "Computation of player-ball distance and ball acceleration time-series"-"Post-processing") from three of the four matches. From the time-series windows, we compute feature representations $f^r_w$, $w = 1, \ldots, W_{\mathrm{train}}$, as described in Section "Feature extraction". Training is completed by using the representations and their assigned window labels to estimate the parameters of the probabilistic distributions in order to construct the naïve Bayes classifier for the pass event detection.

Testing
As previously mentioned, the line between a pass and another action (e.g., a shot or a tackle) is not well established. Therefore, it is often difficult to decide on a definitive number of ground truth passes in a game. Accordingly, the number of imprecise pass labels and expert pass labels in our dataset differs (see Section "Dataset"). While this is not problematic for the training of the algorithm, it poses a problem for the evaluation step, as detection errors cannot be computed by simply subtracting the passes according to their order of appearance. The straightforward way to approach this issue is to assign each detected pass its nearest expert pass label (Nearest Neighbour Matching), which was previously applied [14]. However, it has been argued that this assignment can introduce a positive bias to the evaluation results that originates from possible many-to-one mappings [46]. Since our proposed algorithm provides a refinement step for existing pass event annotations, a more suitable way to calculate meaningful error metrics is given by one-to-one mappings between expert and imprecise pass labels. Therefore, we use the previously introduced Sequence Consistency Matching (SCM) [46] to obtain a one-to-one mapping of expert pass labels and imprecise pass labels for evaluation. While this method enables the comparison of different numbers of passes from two data sources, we likely exclude some difficult examples from the evaluation. Thus, the results obtained in the testing procedure rather present an upper bound for the results in practice. In detail, SCM is performed by regarding sequences of active play and projecting the chronological order of two annotations within a sequence onto each other if and only if the number of annotations within the sequence matches [46].
We apply SCM using player identities from the imprecise pass labels and expert pass labels. However, if there is a mismatch of pass labels within a sequence, it is difficult to assign an imprecise pass label to its corresponding expert pass label. Thus, we decided to exclude the passes within these sequences from the evaluation. As a result, 1,690 consistent passes were extracted for all four games of our dataset. Depending on the split in the cross-validation, the corresponding consistent passes of the test game are used for evaluation. The time of a pass event was carefully labeled at frame level (Section "Introduction"), resulting in expert pass labels. This was achieved by regarding the distance of the ball to the foot of a player in the video data. Thus, it is expected that the player-ball distance in the position data is very small as well. However, we identified that in some cases the calculated distance between the passing player and the ball can be very large. We conclude that for these outliers (see Section "Pass event refinement and outlier detection"), the position data contains a significant amount of noise (see Section "Post-processing"). Therefore, we can evaluate the quality of the outlier detection by measuring this distance for all detected outliers. Moreover, this allows us to describe the total amount of removed noise in the outlier detection by a combination of the OLPD and NOL metrics. Please note that we accordingly report the aggregated NOL value and the mean OLPD value for the four splits.
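The sequence consistency matching described above can be sketched as follows. The sketch assumes that both annotation sources have already been grouped by sequence of active play and sorted chronologically, and it interprets "using player identities" as requiring the passing players to agree position by position; this interpretation, the data layout, and the function name are assumptions and may differ from the procedure in [46].

```python
def sequence_consistency_matching(imprecise_by_seq, expert_by_seq):
    """One-to-one matching of pass labels within sequences of play.

    Both arguments map a sequence id to a chronologically ordered
    list of (frame, player) labels. A sequence contributes pairs only
    if both sources contain the same number of passes and (assumed)
    the passing players agree in order; otherwise it is skipped.
    """
    pairs = []
    for seq_id, imprecise in imprecise_by_seq.items():
        expert = expert_by_seq.get(seq_id, [])
        if len(imprecise) != len(expert):
            continue  # inconsistent pass count: exclude sequence
        if any(u[1] != p[1] for u, p in zip(imprecise, expert)):
            continue  # inconsistent player identities: exclude sequence
        pairs.extend(zip(imprecise, expert))
    return pairs


# Hypothetical labels as (frame, player id) per sequence of play.
imprecise = {"s1": [(100, 7), (160, 9)], "s2": [(300, 4)], "s3": [(500, 2)]}
expert = {"s1": [(90, 7), (150, 9)], "s2": [(280, 4), (295, 5)], "s3": [(490, 3)]}
matched = sequence_consistency_matching(imprecise, expert)
```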

Parameter selection
In this section we aim to find the optimal parameters for different choices of segmentation, post-processing, and feature configurations for our proposed algorithm.We therefore perform a comparison of the synchronization performance for specific ranges of possible parameters, summarized in Table 2.Moreover, we also inspect the impact of the outlier detection for different tolerance thresholds by examining the position data quality of the detected outliers as well as of the remaining samples.
Regarding the segmentation process we observe three different window lengths of N = 1 s, N = 1.8 s, and N = 2.6 s.Here, we find that N = 1 s is the minimum applicable window length which still contains the structural characteristics of a pass event (see Fig. 5, left).In contrast, we keep the minimum window shift of S = 0.04 s (1 frame, regarding sampling frequency f s = 25 Hz) constant as this relates to an online approach (see Section "Seg- mentation of player-ball distance and ball acceleration time-series").
The post-processing parameters are examined by considering low-pass filter cutoff frequencies of f_c = 12.5 Hz, f_c = 25 Hz, and f_c = 37.5 Hz alongside a scenario without post-processing. For the influence of the feature space, we choose the three defined configurations FR 1-FR 3 (see Section "Feature extraction"). Finally, we select a constant search interval around the imprecise pass labels of [−6 s, 0.8 s] since it contains over 97% of the frame errors occurring in the used event data (see Fig. 3). Appropriately combining the different possible parameter choices yields 12 configurations, for which we display the obtained results in Table 3.
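As a rough sketch of how the fixed search interval enters the refinement, the following example picks the frame with the highest pass event probability inside [−6 s, 0.8 s] around an imprecise label. The function `refine_label`, the dictionary representation of the probabilities, and all numbers are assumptions made for illustration only.

```python
FS_HZ = 25  # sampling frequency of the position data


def refine_label(probabilities, imprecise_frame, interval_s=(-6.0, 0.8)):
    """Return (refined_frame, confidence) for one imprecise pass label.

    `probabilities` maps a frame index to a pass event probability
    (hypothetical representation chosen for this sketch).
    """
    lo = imprecise_frame + round(interval_s[0] * FS_HZ)  # 150 frames earlier
    hi = imprecise_frame + round(interval_s[1] * FS_HZ)  # 20 frames later
    candidates = [f for f in range(lo, hi + 1) if f in probabilities]
    best = max(candidates, key=lambda f: probabilities[f])
    return best, probabilities[best]


# Dummy probabilities with one clear peak 70 frames before the imprecise label.
probs = {f: 0.1 for f in range(1000, 1200)}
probs[1080] = 0.9
print(refine_label(probs, imprecise_frame=1150))
```

The asymmetric interval reflects that the imprecise labels are mostly annotated too late, so most of the search effort is spent before the annotated time.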
General Findings: The conducted experiments examine the general performance of the algorithm in the refinement of imprecise pass labels with respect to the influence of the selected parameters. Among the examined algorithm configurations, there are only relatively small differences in the presented metrics. This indicates the general robustness of the system with respect to the investigated parameters.
Segmentation: Concerning the examined segmentation parameter, we find that the shortest applicable window length N = 1 s (see "Parameter Selection" Section) performs best for all examined feature representations. Moreover, the results for all feature representations FR 1-FR 3 continually decrease when increasing the window length to N = 1.8 s and N = 2.6 s. This indicates that the window length N = 1 s still contains the defining characteristics of pass events, while larger window lengths add redundant information which has a negative effect on the pass event detection.
Post-processing: The application of a low-pass filter generally has a positive effect on the examined metrics, but only if the cutoff frequency f_c is chosen appropriately. Here, the lowest value of f_c = 12.5 Hz leads to inferior results compared to no post-processing in all metrics and for all feature representations. In contrast, the higher cutoff frequencies f_c = 25 Hz and f_c = 37.5 Hz have a largely positive effect on the obtained results, while the achieved benefit varies across FR 1-FR 3. Reasons for this can be found in the higher degree of smoothing which simultaneously increases with the cutoff frequency. This illustrates that the post-processing is able to remove a certain amount of noise from the time-series, which allows for a better generalization of pass events. Yet, for FR 3, the application of the low-pass filter with cutoff frequency f_c = 37.5 Hz leads to a small decrease of SE compared to no post-processing. Consequently, the features of the more complex representation FR 3 are able to tolerate noise in the time-series windows to some degree. This indicates a dependence between parameters within the algorithm configuration and encourages fine-tuning of the low-pass filter when applying the approach to a given configuration and dataset in practice.
Comparison of Feature Representations: From the obtained results, we determine the optimal segmentation and post-processing parameters for each examined feature representation and denote them FR 1*-FR 3*, respectively. Based on the lowest TD (and using EX as a tiebreaker), we determine the superior configurations of N = 1 s with f_c = 25 Hz for FR 1 and N = 1 s with f_c = 37.5 Hz for FR 2 and FR 3.
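The selection rule (lowest TD, with EX as a tiebreaker) can be written as a lexicographic sort key. The sketch below uses made-up metric values, not results from Table 3, and assumes that a higher EX is better.

```python
# Hypothetical results for three candidate configurations of one feature
# representation; the numbers are placeholders, not values from Table 3.
results = [
    {"window_s": 1.0, "cutoff_hz": 25.0, "TD": 0.30, "EX": 6.1},
    {"window_s": 1.0, "cutoff_hz": 37.5, "TD": 0.30, "EX": 6.7},
    {"window_s": 1.8, "cutoff_hz": 37.5, "TD": 0.35, "EX": 7.0},
]


def select_best(candidates):
    """Pick the lowest TD; among ties, prefer the highest EX (assumed better)."""
    return min(candidates, key=lambda r: (r["TD"], -r["EX"]))


best = select_best(results)
print(best["window_s"], best["cutoff_hz"])
```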

Impact of the Outlier Detection:
The outlier detection was investigated separately for the superior algorithm configurations FR 1*-FR 3*. Examining different detection thresholds τ ∈ [0, 1], we compute the previously defined measures NOL and SE for each value. To assess the amount of noise in the position data for the detected outliers (see Section "Pass event refinement and outlier detection"), we additionally compute representative values of OLPD for a number of selected detection thresholds. The results for the outlier detection are displayed in Fig. 7. Additionally, we present qualitative examples of the outlier detection in Fig. 8.
For all examined configurations, we observe a simultaneous increase of NOL and SE, along with a decrease of OLPD, as the detection threshold τ increases. This illustrates the general capability of the maximal pass event probability in the search interval, P(f^r_{w,opt}), to serve as a conclusive confidence score regarding the certainty of the refinement decision.
Another indication for this is given by the OLPD metrics. In the entire dataset, the mean player-ball distance of all passes is 5.13 m. This value is exceeded by the OLPD values for the majority of detection thresholds for all feature configurations. Moreover, the OLPD values at detection thresholds τ ≈ 0, namely 32.22 m (FR 1*), 34.94 m (FR 2*), and 22.61 m (FR 3*), largely surpass the mean. Therefore, we state that the algorithm is capable of detecting localization errors in the position data at the expert pass label. Increasing the detection threshold causes a simultaneous decrease of OLPD among all examined feature configurations. This demonstrates a high correlation between the computed pass event probabilities and the amount of noise in the underlying position data. In general, the results indicate that the outlier detection is a valuable extension to our proposed algorithm since it allows for highly precise fine-tuning between the quantity and quality of the obtained annotated data. However, the concrete decision on an optimal configuration and detection threshold highly depends on the requirements of a possible application. Accordingly, we recommend three configurations of the outlier detection and provide a comparison with the respective superior configurations without outlier detection in Table 4.
For an application with high quality constraints on the utilized position data and a large amount of available data, we recommend OL max comprising FR 3* with τ = 0.675. Here, we achieve a large improvement of SE, from 70.34% without outlier detection to 88.52% with outlier detection. The 1114 detected outliers have a mean player-ball distance of 6.48 m, and the remaining 576 passes have a mean player-ball distance of 2.09 m, which indicates the low amount of noise in the underlying position data.
In contrast, for an application with strong limitations regarding the amount of available data, we recommend performing the outlier detection with configuration OL min comprising FR 3* and τ = 0.025. This configuration improves the initial SE of 70.34% without outlier detection to 76.14%, while only 151 outliers with an OLPD of 25.15 m are detected.
As a compromise between the presented strategies, we propose the outlier detection OL opt comprising FR 3* and τ = 0.275 since it reliably detects 427 outliers with an OLPD of 12.32 m and achieves an SE of 81.22%. However, please note that an application may also perform outlier detection as a preprocessing step with one feature configuration and perform the actual pass refinement with another.
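The trade-off controlled by τ can be made concrete with a small sketch. It assumes that a pass is treated as an outlier when its confidence score (the maximal pass event probability) falls below τ, and that SE counts passes within 0.5 s of the expert label, as suggested by the abstract. The function `outlier_report`, the dictionary keys, and the sample values are hypothetical.

```python
def outlier_report(passes, tau, se_tolerance_s=0.5):
    """Return (NOL, SE, OLPD) for a detection threshold tau.

    Each pass is a dict with hypothetical keys:
      confidence  - maximal pass event probability in the search interval
      error_s     - refined time minus expert time, in seconds
      ball_dist_m - player-ball distance at the expert pass label, in metres
    """
    kept = [p for p in passes if p["confidence"] >= tau]
    outliers = [p for p in passes if p["confidence"] < tau]
    nol = len(outliers)
    # SE: share of the remaining passes within the tolerance (assumed 0.5 s).
    se = (sum(abs(p["error_s"]) <= se_tolerance_s for p in kept) / len(kept)
          if kept else float("nan"))
    # OLPD: mean player-ball distance of the removed passes.
    olpd = (sum(p["ball_dist_m"] for p in outliers) / nol
            if nol else float("nan"))
    return nol, se, olpd


# Made-up passes for illustration only.
passes = [
    {"confidence": 0.90, "error_s": 0.04, "ball_dist_m": 1.2},
    {"confidence": 0.70, "error_s": 0.40, "ball_dist_m": 2.0},
    {"confidence": 0.30, "error_s": 1.50, "ball_dist_m": 9.0},
    {"confidence": 0.01, "error_s": 3.00, "ball_dist_m": 30.0},
]
for tau in (0.025, 0.275, 0.675):
    print(tau, outlier_report(passes, tau))
```

Sweeping τ in this way mirrors the choice between OL min, OL opt, and OL max: a larger threshold removes more passes and leaves a cleaner but smaller set.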

Comparison to baselines and state of the art
In the following section, we evaluate our proposed solution against meaningful baselines extracted from the original event data as well as a recent state-of-the-art approach for group activity detection 14. The "Baselines" Section presents the different baselines in more detail. Finally, the results are presented and discussed in the "Results and discussion" Section.

Baselines
Four baselines are examined in the scope of this experiment.
Imprecise pass labels: As a first baseline, we consider the imprecise pass labels from the event data. However, the comparison of the expert pass labels to these imprecise pass labels displayed in Fig. 3 reveals that the majority of passes are annotated after the corresponding expert pass label. This originates from the highly challenging real-time annotation process. The individual annotators usually capture a current event while simultaneously looking out for the following events in the match. Consequently, a rather reactive annotation scheme emerges, as passes can only be anticipated in certain rare cases. In contrast, during the annotation of the expert pass labels, our annotator was able to pause and navigate the video for frame-accurate pass annotations without further restrictions regarding the duration of the process (see Section "Problem definition").
Statistical Baseline: To counteract the issues of imprecise pass labels, we suggest a statistical baseline that accounts for the average delay of the underlying annotation. Since different annotators are typically responsible for creating event data for soccer matches, the error can depend on their individual characteristics and behaviors. Thus, we compute the mean temporal distance between imprecise pass labels and expert pass labels separately for each match and half in our dataset. Based on this value, we perform a statistical refinement of the imprecise pass labels by correcting each pass annotation by the respective mean temporal distance. With respect to Fig. 6, this intuitively corresponds to a shift of the origin of the x-axis to the mean of the respective match (and half, not displayed in the figure). The corrected annotations are then computed for each half and subsequently aggregated to define the statistical baseline.
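A minimal sketch of this mean-shift correction is given below. The function `statistical_baseline`, the dictionary keys, and the timestamps are hypothetical and chosen only to illustrate the grouping by match and half.

```python
from collections import defaultdict


def statistical_baseline(labels):
    """Shift each imprecise label by the mean delay of its match and half.

    Each label is a dict with hypothetical keys: match, half,
    imprecise_s (event data time) and expert_s (expert label time).
    """
    delays = defaultdict(list)
    for lab in labels:
        delays[(lab["match"], lab["half"])].append(
            lab["imprecise_s"] - lab["expert_s"])
    mean_delay = {key: sum(v) / len(v) for key, v in delays.items()}
    return [lab["imprecise_s"] - mean_delay[(lab["match"], lab["half"])]
            for lab in labels]


# Made-up timestamps: both passes in the first half are annotated too late.
labels = [
    {"match": "A", "half": 1, "imprecise_s": 12.0, "expert_s": 10.0},
    {"match": "A", "half": 1, "imprecise_s": 21.0, "expert_s": 20.0},
    {"match": "A", "half": 2, "imprecise_s": 33.0, "expert_s": 30.0},
]
print(statistical_baseline(labels))
```

The correction removes the systematic part of the delay but cannot recover the pass-specific part, which is why individual errors remain after the shift.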
Classifier Baseline: To evaluate the classifier used in the pass event classification (see Section "Pass event detection"), we design a baseline that differs from the proposed solution only in the classifier. Therefore, we adopt the previously presented pipeline (signal and feature extraction, pass event probability computation, pass event refinement) as well as the training and test procedure (see "Dataset" Section). Moreover, to obtain comparable results, we decide on feature representation FR 3 (see Section "Feature extraction") since it is the highest performing feature configuration. However, in contrast to our presented method, we compute the pass event probabilities using a simple logistic regression classifier 48. Here, we increase the number of iterations such that the algorithm converges and balance the class weights to account for the highly imbalanced classification problem. Thus, a comparison with this baseline enables an isolated evaluation of the adapted SwiftEvent algorithm and its impact on the pass event refinement. Ultimately, we can also use this baseline to assess the performance of the proposed classifier in the outlier detection. Therefore, we present two additional configurations of the classifier baseline: one with minimal outlier detection (OL min) that preserves the majority of data and one with optimal outlier detection (OL opt) that achieves the best performance. The selection of these configurations is performed analogously to the procedure in the "Parameter Selection" Section.
State-of-the-Art Baseline: Due to the novelty of the task, there exists no strictly similar pass synchronization concept in the related work against which we can compare our results. Moreover, datasets and approaches are typically not publicly available. Nevertheless, we still aim to compare our method with a (task-related) state-of-the-art model.
As described in Section "Introduction", the detection of passes defined as the duration in which the ball travels between two players has been proposed 13. However, due to the difference in our task, a comparison with this method is not possible. In a more general sense, a recent approach 14 can be discussed to some extent with regard to the results reported in this paper. The authors consider pass event detection as part of a general group activity detection framework. Yet, the observed problem differs from our approach in two main aspects. Firstly, the detection of events from the positional data is executed without additional information about the time, the involved players, or the type of the event. Secondly, the algorithm is not limited to the detection of passes but also detects shots and receptions in the match.
A prerequisite for this highly general approach is the initial problem of detecting an event in the first place. This problem is addressed by the implementation of a non-maximum suppression (NMS) procedure which only allows for a single prediction within a specified NMS window length. By preventing multiple detections within the window, this also addresses a problem regarding the assignment of detected passes, detailed in Section "Dataset". Thus, the possibly induced positive bias can be partially accounted for by the NMS procedure, but only if the utilized NMS window length (which is not reported) is chosen large enough.
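To illustrate the idea of allowing only a single prediction per window, the following sketch implements a generic greedy temporal NMS. It is not the implementation of the cited approach 14, whose details and window length are not reported; the function `temporal_nms` and the example detections are hypothetical.

```python
def temporal_nms(detections, window_frames):
    """Greedy one-dimensional non-maximum suppression (generic variant).

    `detections` is a list of (frame, score) tuples. Detections are visited
    from highest to lowest score and kept only if no already kept detection
    lies closer than `window_frames`.
    """
    kept = []
    for frame, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(abs(frame - kept_frame) >= window_frames
               for kept_frame, _ in kept):
            kept.append((frame, score))
    return sorted(kept)


# Made-up detections: two clusters of nearby candidate frames.
detections = [(100, 0.9), (105, 0.8), (140, 0.6), (150, 0.7)]
print(temporal_nms(detections, window_frames=25))
```

A window that is too short leaves several detections per pass, whereas a window that is too long may suppress genuine passes that follow each other quickly.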

Results and discussion
In the following section, we report the results from the experiments comparing the optimal algorithm configurations to the introduced baselines. We evaluate the comparison to the event-based baselines (in particular the statistical baseline) and to the classifier baseline independently and focus on the implications of both experiments. Finally, we assess the comparison of the proposed method to the state-of-the-art pass detection algorithm 14. Here, we pay special attention to the fact that we adopted the reported metrics to compare the two algorithms but did not focus on optimizing them.
The results for the optimal configurations of our proposed solution as well as the baseline approaches are reported in Table 5. In addition to the proposed metrics in Section "Metrics", we report results for ME (medium errors), Q50 (50% quantile), and Q95 (95% quantile) 14 to allow a comparison to the state of the art. ME is given by the fraction of passes whose refined pass label has a temporal distance of at most 0.96 s (24 frames) to the corresponding expert pass label. Q50 and Q95 denote the temporal distances that are larger than or equal to 50% and 95% of all errors, respectively.
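The three adopted metrics can be computed as in the following sketch. The nearest-rank quantile definition is an assumption, since the interpolation scheme is not specified here; the function names and the error values are hypothetical.

```python
import math


def medium_errors(errors_s, tolerance_s=0.96):
    """ME: fraction of passes with an absolute error of at most 0.96 s."""
    return sum(abs(e) <= tolerance_s for e in errors_s) / len(errors_s)


def error_quantile(errors_s, q):
    """Nearest-rank quantile of the absolute errors (assumed definition):
    the smallest error that is >= a fraction q of all errors."""
    ordered = sorted(abs(e) for e in errors_s)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Made-up signed errors (refined minus expert time) in seconds.
errors = [0.0, 0.04, -0.08, 0.12, 0.2, -0.4, 0.6, 1.0, -1.2, 2.0]
print(medium_errors(errors), error_quantile(errors, 0.5),
      error_quantile(errors, 0.95))
```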
Besides, we provide a visualization of error occurrences in Fig. 9 and indicate qualitative results for representative imprecise pass labels in Fig. 8.
Comparison to Baselines based on Event Data: The results reveal that the statistical baseline (Baseline B) outperforms the original imprecise pass labels (Baseline A). In turn, we find that the examined superior configurations of the proposed algorithm outperform the statistical baseline by a significant margin. In terms of TD, the results improve by up to 0.3 s compared to the baseline. The other metrics also improve: STD from 0.24 s to as low as 0.16 s, EX from 3.02% to as much as 6.75%, and SE from 13.55% to as much as 70.83%. These values indicate a strong benefit of our proposed algorithm compared to (independent) statistical synchronization approaches.
Comparison to the Classifier Baseline: The classifier baseline (Baseline C) adopts the same workflow, features, and parameters as our configuration FR 3* and is thus used to independently evaluate the role of our presented classifier in the framework. While the classifier baseline outperforms FR 1* and FR 2* in terms of TD, the differences in the feature configurations do not allow for a conclusive comparison of these results. More compelling is the comparison to FR 3*, which uses the same features. Here, we find that our proposed algorithm outperforms the classifier baseline in six out of seven metrics. Moreover, the slight advantage of the baseline in the ME metric is somewhat limited by the inferior results for Q50 and Q95.
This superiority of our proposed solution becomes clearer when investigating the baseline performance with outlier detection applied (BL C-OL opt). While both algorithms have a similar number of detected outliers, the algorithm configurations with outlier detection outperform the baseline in all evaluation metrics. These results indicate that our proposed classifier produces a more fine-grained confidence score which is better suited for the noise removal in position data.
An especially clear advantage of our proposed solution over the baseline is the performance in metrics that relate to the frame-wise accuracy of the detected passes (EX, SE). This is an important aspect for a broad range of applications such as visual action recognition tasks, which often demand highly exact labels. Thus, we find that our adopted SwiftEvent 1 classifier is a valuable component in the pass synchronization framework and that its high specificity regarding the classification task benefits the results.
Comparison to the State of the Art: In general, the state-of-the-art pass detection method 14 is difficult to compare with our approach. Since neither the utilized data nor the source code is publicly available, we cannot employ a common dataset. Therefore, for our proposed algorithm, we report results obtained by performing a four-fold cross-validation where we used three matches for training and one match for testing. In contrast, the state-of-the-art machine learning algorithm 14 was trained on 64 matches and tested on five matches. Moreover, the positional dataset from the 2018/2019 season is likely to contain significantly less noise than our dataset from 2014/2015.
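The four-fold protocol with three training matches and one test match corresponds to a leave-one-match-out split, which can be sketched as follows. The generator `leave_one_match_out` and the match identifiers are hypothetical.

```python
def leave_one_match_out(match_ids):
    """Yield (training matches, test match) pairs, one per match."""
    for test_match in match_ids:
        train = [m for m in match_ids if m != test_match]
        yield train, test_match


# Placeholder identifiers for the four matches of the dataset.
matches = ["match_1", "match_2", "match_3", "match_4"]
splits = list(leave_one_match_out(matches))
for train, test in splits:
    print("train:", train, "test:", test)
```

Splitting by match rather than by individual pass keeps all passes of the test match unseen during training.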
Nevertheless, we adopt the evaluation metrics ME, Q50, and Q95 presented by the authors to enable a discussion of the performance of our proposed algorithm in relation to the state of the art 14. However, please note that while we adopt these metrics from Sanford et al. 14, we did not optimize the parameter selection of our optimal algorithm configurations for them (see "Parameter Selection" Section). We present the reported measures and selected superior configurations of our approach in Table 5.
The obtained results reveal a general superiority of our proposed algorithm. While the results for FR 3* are inferior to the state of the art in the ME metric, applying the outlier detection in FR 3-OL opt outperforms this ME result. Moreover, compared to the state-of-the-art quantile Q50 of 0.2 s, all examined variants of our algorithm achieve largely superior results. This is somewhat remarkable given that our data is expected to be significantly less precise. Regarding the state-of-the-art Q95 value of 0.48 s, we find that while the standard configurations show inferior results (likely caused by the larger amount of noise in the data), performing the outlier detection also leads to a superior result of 0.28 s.
Thus, we generally find that our proposed algorithm is capable of achieving a comparable amount of medium errors (ME) while simultaneously obtaining a superior frame-exact accuracy, documented by the Q50 and Q95 values. Moreover, our probabilistic approach was trained on a significantly smaller and less recent position dataset. Finally, we emphasize that the state of the art did not describe or discuss any kind of localization errors in the position data 14.
Table 5. Results of the different baseline approaches: imprecise pass labels (Baseline A), statistical baseline (Baseline B), classifier baseline (Baseline C), classifier baseline with outlier detection (BL C-OL opt), and Baseline D (Sanford et al. 14), as well as of different configurations of our proposed algorithm. Bold: Please note that results for Baseline D are reported for another dataset but are displayed here for comparison.

Conclusions
Tracking technology has many applications in soccer and other sports domains. Yet, for more sophisticated analysis of team and player behaviors, companies 8 provide (professional) clubs with additional match and player events such as shots and passes. However, in this work, we have empirically shown that this event data is often temporally imprecise. To counteract this issue and allow for exact pass annotations, this work has presented a novel framework for pass event refinement based on existing event data. In a first step, features for player-ball distance and ball acceleration, obtained from the spatiotemporal position data, were extracted to construct a classifier for general pass event detection that is based on SwiftEvent 1. Subsequently, this classifier was employed to refine the existing pass events from the event data to fit the expert annotation. In this process, the classifier generates a confidence score which we further applied for the detection of localization errors, i.e., an inaccurate location of the ball, in the position data.
Experimental results have shown the superiority of the proposed solution in terms of the temporal accuracy of refined pass events compared to the annotations of existing event data and to a statistical baseline. This statistical baseline addresses the systematic error of delayed pass annotations in the original event data. Furthermore, we have shown that replacing our classifier with a logistic regression causes a significant decrease in performance, especially in the localization of errors. This supports the choice of our adaptation of SwiftEvent 1 as the optimal classifier in our framework.
An in-depth analysis of the various system parameters was conducted and has shown the robustness of the system as well as the efficiency of an outlier detection that removes unreliable positional data points. Parameter settings of varying complexity were investigated, and the results have demonstrated that a lightweight solution can already improve the temporal accuracy of passes drastically. Due to the absence of a public test benchmark and a common evaluation protocol, the overall performance of our proposed solution was discussed in relation to results from another, more complex state-of-the-art approach for pass detection. Better performance was observed for all evaluation metrics.
In the future, we expect a more significant improvement when utilizing data from video tracking systems or positioning systems (including three-dimensional ball position). Finally, we believe that, given a sufficiently large sample size of manual annotations, our algorithm may be adapted to other events in the event data (e.g., shots and tackles) to enable the synchronization with the position data.

Figure 1. The diagram illustrates the workflow of the proposed algorithm. On the left, the input data and its involvement in the respective steps of the algorithm are displayed. On the right, an exemplary sequence of player-ball distance (cyan curve, in m) and ball acceleration (orange curve, in m/s²) is presented for different steps of the algorithm. The bottom two plots on the right include the respective expert pass label (green line), the imprecise pass label (red line), and the refined pass label (black line, quantitative), which is computed from the algorithm's pass event probabilities P(f^r_w) (black curve).
Figure 2. Visualization of a representative pass in the video data: (a) frame before pass, (b) annotated pass, (c) frame after pass. Precise expert pass labels were acquired by domain experts that have specified the exact frame of each pass in a football match.

Figure 4. Visualization of the construction of a single time-series window Y r w for a single player r from the player-ball-distance (cyan) and ball acceleration (orange) windows d r w and a r which are computed from the position data within the window k w with window length N.

Figure 5. Left: Exemplary sequence for a single player-ball distance (cyan) and ball acceleration (orange). Right: Separate extraction of the selected feature representations from the segment.

Figure 7. Results of the outlier detection in terms of small errors (SE) versus detected number of outliers (DNO) for detection thresholds τ in the range [0, 1]. Values of the mean outlier player-ball distance (OLPD), which correlate with the imprecision of the removed outliers, are indicated at representative thresholds.

Figure 8. Qualitative results for the outlier detection of the algorithm, with displayed quantities similar to Fig. 10.

Figure 10. Qualitative results of the algorithm for representative passes with displayed player-ball distance (cyan curve, in m) and ball acceleration (orange curve, in m/s²) along with the respective expert pass label (green line), the imprecise pass label (red line), and the refined pass label (black line, quantitative), which is computed from the algorithm's pass event probabilities P(f^r_w) (black curve) in the search interval.

Table 2. Overview of varied parameters in the algorithm (see Table 1 for used features).

Table 3. Results in the introduced metrics from Section "Metrics" for different variants of window length, low-pass cutoff frequency, and feature representation. Superior algorithm configurations (*) and results for each metric are printed in bold.

Table 4. Comparison of the recommended outlier detection with the respective superior algorithm configurations.