Physically-interpretable classification of biological network dynamics for complex collective motions

Understanding biological network dynamics is a fundamental issue in various scientific and engineering fields. Network theory is capable of revealing the relationship between elements and their propagation; however, for complex collective motions, the network properties often transiently and complexly change. A fundamental question addressed here pertains to the classification of collective motion network based on physically-interpretable dynamical properties. Here we apply a data-driven spectral analysis called graph dynamic mode decomposition, which obtains the dynamical properties for collective motion classification. Using a ballgame as an example, we classified the strategic collective motions in different global behaviours and discovered that, in addition to the physical properties, the contextual node information was critical for classification. Furthermore, we discovered the label-specific stronger spectra in the relationship among the nearest agents, providing physical and semantic interpretations. Our approach contributes to the understanding of principles of biological complex network dynamics from the perspective of nonlinear dynamical systems.


Introduction
Complex systems are modelled as a collection of the discrete elements that non-linearly interact.Dynamic processes for such complex systems are of great interest in a variety of scientific fields, such as sociology [1,2], epidemiology [3,4], neuroscience [5,6] and physics [7,8].These systems are often represented as networked systems using graphs, which have several problems such as in classification [9,10], prediction [11,12] and control [13,14].Among various networked systems, dynamic networks of collective motions [15,16,17], in which nodes and links are agents and their relationships (e.g. based on their positions), pose several challenges.These challenges arise in changes of the relation among agents (i.e. the structure of a graph) in transient and complex ways, with the exception of moderate changes such as diffusion or contagion [12].In the network dynamics of collective motions, if the governing equation is given such as mathematical models (e.g.cellular automata [18] and coupled dynamical systems [19]) and controlled systems [20], these are often modeled as graph dynamical systems (GDSs).However, for many biological collective motions [21,22], the relation among agents transiently and complexly changes according to different situations, with the exception of the simply modelled collective motions [23,24].These biological network dynamics can be regarded as the network dynamics or GDSs of collective motions represented by graph sequence data.To address this problem, conventional approaches have basically computed the properties of a graph in each temporal snapshot [25,5] or in a temporal sliding window [26] of the sequence data.However, for the understanding of network dynamics, these approaches are difficult to directly extract the dynamical properties, i.e. physically-interpretable information about the dynamics such as frequencies with decay/growth rate and the corresponding spatial (network) structures.The fundamental question addressed here is how complex collective motion networks should be classified based on the physically-interpretable dynamical properties.
The motivation of this paper is to understand network dynamics (or underlying global dynamics of GDSs) of collective motions by directly extracting the dynamical properties of the network in a data-driven manner.As a method of describing nonlinear dynamical systems with a global mode by the direct extraction of dynamical properties, operator-theoretic approaches have attracted attention in fields such as applied mathematics, physics and machine learning.
One of the approaches is based on the composition operator (often referred to as the Koopman operator [27,28]), which defines the time evolution of observation functions in a function space, rather than directly defines the time evolution in a state space from a classical and popular view of the analysis.The advantage of using the operator-theoretic approach is to lift the analysis of nonlinear dynamical systems to a linear (but infinite-dimensional) regime, which is more amenable to subsequent analysis.Among several estimation methods, one of the most popular algorithms for spectral analysis of the Koopman operator is dynamic mode decomposition (DMD), which was originally developed in fluid physics [29,30].DMD has been successfully applied in many real-world problems, such as analyses of epidemiology and electrocorticography [31,32].Among several variants of DMDs (for details, see Material and Methods), Graph DMD [33] can extract and visualise the underlying low-dimensional global dynamics of GDSs with structures among observables from data.However, for more general complex collective motions including strategic human groups, agents flexibly change the rules of behaviour according to the situation; thus, the computation of a feature for the classifier using the obtained spectra cannot be uniquely determined.Therefore, in this paper, we computed Graph DMD for each temporal sliding window and the feature for the classifier reflecting the essential (e.g.dynamical) information for multiple types of collective motions.Previous other approaches have extracted linear dynamics in complex networks [34] and physical interaction networks [35].
Recent neural network approaches automatically extract graph (spatial) and temporal information and perform classification or prediction [36,37].In contrast to these approaches, our methodological contribution is to provide a classification method by directly extracting physically-interpretable dynamical properties for nonlinear GDSs in a data-driven manner.Additionally, we propose a more straightforward formulation for Graph DMD in an observed data space than the previous one [33].
One of the goals of this data-driven or equation-free method [38] is characterising complex collective motions that emerge from complex rules, rather than motions that emerge from fewer rules such as biological rhythms (e.g. a day and year) and simple rules (e.g.attraction and repulsion).Organised human tasks in small groups such as navigation [21] and ballgame teams [22] provide excellent examples of complex dynamics and pose challenges in many research fields because of their switching and overlapping hierarchical subsystems [22].In these cases, the focusing networks are often small (e.g. less than 30 nodes) but spatio-temporally complex due to the highly-strategic (or contextual) behaviours.
Thus, we hypothesised that the node information (i.e.individual agent) is more critical for the understanding the subtle yet significant differences between motions [22] rather than the network properties such as graph spectrum [39].
Previous works [40,38] have predictively classified the time-varying interactions into two group outcomes (i.e.scored or unscored) while reflecting the timevarying interactions among attackers, defenders and the ball.However, since the previous method utilising so-called kernel methods decompose the modes in infinite functional space to acquire high expressiveness [41], it is difficult to physically interpret or visualise the decomposed modes in the observed data space.Thus, the scientific contribution of this study is to provide a method for interpreting the discovered network structures corresponding to the extracted dynamical properties.Additionally, from the viewpoint of practical applications, supervisors such as coaches and researchers who analyse collective motions of agents (e.g.humans, other animals and objects), must exert great effort to observe the motions and label them to specific formations or objectives.The practical advantage of our approach is thus to automatically classify such complex collective motions to aid further analyses such as discovering network structures or dynamical properties.
As already mentioned, the purpose of this study is to understand the network dynamics for complex collective motions via physically-interpretable classification using data-driven spectral analysis of GDSs known as Graph DMD.To this end, we first present the theoretical framework and results of applying graph DMD to actual ballgame data.Then, to validate our approach, we present the classification results of multiple recognition tasks for globally different collective behaviours.Again, we hypothesise that, in addition to physical information, the contextual node information is critical for the classification.We investigated this hypothesis by comparing various existing classification methods.Finally, we visualise and physically and semantically interpret the network structures (called Graph DMD modes) that correspond to extracted dynamics properties.

Graph DMD framework.
Here, we briefly review Koopman spectral analysis, which is the underlying theory for various DMDs, and then describe the Graph DMD framework.First, we consider a nonlinear dynamical system: x t+1 = f (x t ), where x t is the state vector in the state space M ⊂ R p with time index t ∈ T := N 0 .The Koopman operator, which we denote by K, is a linear operator acting on a scalar observable function g : M → C defined by where g • f denotes the composition of g with f [27].That is, it maps g to the new function g • f .We assume that K has only discrete spectra.Then, it generally performs an eigenvalue decomposition: Kϕ j (x) = λ j ϕ j (x), where λ j ∈ C is the j -th eigenvalue (called the Koopman eigenvalue) and ϕ j is the corresponding eigenfunction (called the Koopman eigenfunction).We denote the concatenation of scalar functions as g := [g 1 , . . ., g d ] T .If each g i lies within the space spanned by the eigenfunction ϕ j , we can expand the vector-valued g in terms of these eigenfunctions as g(x) = ∞ j=1 ϕ j (x)ψ j , where ψ j is a set of vector coefficients called the Koopman modes.Through the iterative applications of K, the following equation is obtained: Therefore, λ j characterises the time evolution of the corresponding Koopman mode ψ j , i.e. the phase of λ j determines its frequency and the magnitude determines the growth rate of its dynamics.
Among several possible methods to compute the above modal decomposition from data, DMD [29,30] is the most popular algorithm, which estimates an approximations of the decomposition in Eq. (2).For the details of basic DMD and its variants, see Material and Methods.Among several variants of DMDs, Graph DMD [33] can extract and visualise the underlying low-dimensional global dynamics of GDSs with structures among observables from data.Then, we briefly introduce Graph DMD framework, described in Figs.1a and b.Here, we consider an autonomous discrete-time weighted and undirected GDS defined as where V = {V T is a concatenated vector-valued observable function (i.e.g c : M → R md ).
A t ∈ R m×m is an adjacency matrix, whose component a i,j,t represents the weight on the edge between V i and V j at each time t.For example, the weight represents the relation (e.g. a function of distance) between moving agents in multi-agent systems.
For the formulation of Graph DMD, we propose a more straightforward formulation with dependent structure among observables than the previous formulation based on vector-valued reproducing kernel Hilbert spaces (RKHSs) [33].
Regarding the details of the analysis, the connection with Graph DMD (such as the relation between A t and g c ) and the reason why it can visualise the relation between elements, see Materials and Methods.For the practical implementation of the spectral decomposition, a modified tensor-based DMD [33,42], which is a generalised DMD for the application to tensor data, is applied to the adjacency matrix series.
In Graph DMD, a sequence of the matrices A t or an order-3 tensor A :,:,t for t = 0, . . ., τ is given, where colons are used to indicate all elements.In a similar way of basic DMD procedure (see Materials and Methods), we define tensors X , Y such that X :,:,t = A :,:,t and Y :,:,t = A :,:,t+1 for t = 0, . . ., τ − 1.Instead of singular value decomposition (SVD) in basic DMD, Graph DMD utilises tensortrain decomposition [43] to maintain the tensor structure of input data.Here, we compute M ∈ R m 2 ×r2 , Σ ∈ R r2×r2 and N ∈ R r2×τ by matricising after tensor-train decomposition of X (r 2 is a tensor-train rank in the decomposition and Σ is a full-rank diagonal matrix; for further understanding, see the previous paper [33]).Similarly, we compute P ∈ R m 2 ×s2 and Q ∈ R s2×τ by matricising after tensor-train decomposition of Y, where s 2 is a tensor-train rank.Note that this is similar to SVD in the matrix form, but SVD and this matrisation after tensor-train decomposition with maintaining the tensor structure are completely different.After that, in a similar way of basic DMD procedure, we then define a , where M * is the Hermitian transpose of M and N † is the pseudo-inverse of N .Thereafter, we perform eigendecomposition of F and obtain eigenvectors w j and eigenvalues λj for j = 1, . . ., p.The latter is the estimated Koopman eigenvalues called Graph DMD eigenvalues.
Finally, we obtain the spatial coefficients Z j ∈ C m×m by matricising z j = (1/ λj ) • P Q • N † Σ −1 • w j in the inputted adjacency matrix form, which are called Graph DMD modes.In summary, a sequence of adjacency matrices is decomposed into spatial coefficients and temporal dynamics (Fig. 1b): where b j,0 works as an initial value described in Materials and Methods (we also describe other detailed procedures there).
Examples of Graph DMD for ballgame data.
Next, we show a representative example of Graph DMD result in Fig. 2.
Here we used player-tracking data from actual basketball games.Position data was composed of the horizontal Cartesian positions of every player and the ball on the court, recorded at 25 frames per second.We analysed an attack-segment defined as the period that begins when all players enter the attacking side of the court and ends before a shot.The input of Graph DMD is adjacency matrix series A t visualised in Fig. 1a and its elements are: where i and j indicate players and a ring (i.e. a goal as geometric information), • 2 is the Euclidean norm and σ is a coefficient for adjusting the value of A i,j,t , which represents a proximity with contextual meaning in this case (for details, see Materials and Methods).Furthermore, since the order of i is not uniquely determined in general, we sorted the adjacency matrix in order of nearest attackers and defenders from the ball in each time stamp, to provide the order of player to the semantic information.Here, we denote the first to the fifth nearest attackers and defenders to the ball as A1, ..., A5 (i = 1, . . ., 5) and D1 , . . ., D5 (i = 6, . . ., 10), respectively (for the ring, i = 11).Also, we denote the relationship between two players, e.g.A1 and D1 as A1-D1.
Firstly, we obtained DMD eigenvalues via Graph DMD.As illustrated in Fig. 2b, all eigenvalues appear to be on the unit circle.Eigenvalue λj is transformed into temporal frequency ω j such that ω j = ln( λj )/2π∆t, where ∆t is a time interval of the discrete time system (i.e.∆t = 1/25).Then, for each eigenvalue, we obtained the time dynamics in Fig. 2c-e and Graph DMD modes in Fig. 2f-h.
For the sake of clarity, only three pairs of the dynamics and modes showing the smallest frequency in this order are presented.Graph DMD extracted the dynamics with approximately one, two and three cycles.For the Graph DMD modes, we visualised the strength of the spectrum (i.e.coefficient) for each time dynamics.Note that DMD computes complex-valued DMD modes and time dynamics, but here we only present the absolute value and real parts, respectively.In Supplementary Text 2, we describe the selection of parameters for Graph DMD (e.g.tolerance, temporal window size and cutoff frequency) and quantitatively validated the applicability of Graph DMD to the sport data in terms of the reconstruction error.
Classification in various global collective behaviours.
Next, we indicate that our methods had better classification performance than most of existing methods in Table 1 and Fig. 3.As validation, we performed two classification tasks with different global collective behaviours: the team-defence (zone or person-to-person defence) and team-offence (offence with and without screen-play) recognition tasks.By definition, zone and person-toperson defence are exclusive (i.e.players guard their area and opponent players, respectively).However, they are actually mixed based on different situations whereas experienced people can distinguish them.A screen-play is the basic and minimal strategic cooperative play in basketball [44], in which an attacker stands on the course of defence player like a 'screen' and prevents the defender from defending another attacker in a legal way.Additional details are provided in Materials and Methods.
After Graph DMD, we performed classification using three different approaches.The first two approaches create feature vectors and the third automatically extracts the features via a neural network approach.The first approach is simply to vectorise the Graph DMD modes (denoted GDMD spectrum) to directly reflect the node (i.e.player) information.For the team-defence recognition task, we used the elements of the modes regarding the relations among defenders, those among attackers and defenders and those among defenders and ring (as geometric information) as a feature vector.For the team-offence recognition task, we used those among attackers and among attackers and defenders (without geometric information).Selection and elimination were based on prior knowledge of the sport.The final two approaches computed graph features using existing methods: the second is graph Laplacian eigenvalues [39] as a baseline feature of a graph [45] (denoted GDMD Laplacian) and the third is deep graph convolutional neural networks [46] (denoted GDMD GCN) as a recent promising method for classifying the graph data.The second approach extracts the graph topology without node information, and the third one retains more node information and learns the graph topology.
Due to the highly-strategic (or contextual) behaviours in this experiment, we hypothesised that, in addition to physical information, the contextual node information (i.e. the nodes themselves in the first approach) is more important for the classification than the graph features (i.e. the second and third approaches).
For all classification methods with the exception of the neural network approaches, we adopted logistic regression as a simple linear binary classification model.
As comparable methods for classifying this data, we adopted four methods.
The first is a method using vectorised basic DMD modes [47] (denoted DMD spectrum) as a baseline of DMD approaches.The second is the Koopman spectral kernel [40] using DMD with reproducing kernels [41] as the existing method for classifying the collective motion dynamics [38] (denoted KDMD spectral kernel).
For these two methods, the input data must be a matrix; thus we did not utilise the structure of the graph sequence data.The third is a hand-crafted feature as a simple baseline method, consisting of the vectorised temporal average, maximum and minimum values of the elements of the input adjacency matrix series.Fourth is an advanced neural network approach for classifying graph sequence data called spatio-temporal GCN [36] as a recent promising end-to-end method.More information on the selection and the details of these methods are provided in Supplementary Text 1.
We investigated classification performance in Table 1 in terms of accuracy, the area under the curve (AUC) based on receiver operating characteristic (ROC) curve in Fig. 3a and b, and F-measure, which is the trade-off between recall and precision (the curve is shown in Fig. 3c and d).Overall, the classification performance in GDMD spectrum was better than those in most of the remaining methods.In the statistical evaluation, there were significant differences in all classification performance and tasks (p > 3.63 × 10 −8 ) using analysis of variance (ANOVA) or Friedman test.In the following evaluations, we indicate the posthoc comparison results.Accuracy in GDMD spectrum was significantly higher than that of other methods (r > 0.39 and p < 0.03) for both recognition tasks, with the exception of GDMD Laplacian and DMD spectrum in the team-defence recognition (r > 0.16 and p > 0.05).AUC and F-measure in GDMD spectrum were also higher than those of the other methods (r > 0.40 and p < 0.028) for both recognition tasks, with the exception of hand-crafted feature in the team-defence recognition (both AUC and F-measure: r > 0.34 and p > 0.05).
As a feature extraction method, Graph DMD was a better method than most of the other methods, and especially simple Graph DMD spectrum was better than the graph features in the collective motions.
Analysis of visualised Graph DMD modes.
Since our method can decompose the data into temporal dynamics and spatial coherent structures in the observed data space, we can interpret the decomposed modes that contribute to the classification results.Here we show the averaged spectra for each label and classification task to identify the trends (note that the  For the team-defence recognition task, a strong spectrum for zone defence in  and Supplementary Fig. 2, respectively. For the team-offence recognition task, a strong spectrum for offence with screen-play in Fig. 4c was observed in the relationship between the nearest attackers; however, it was not observed for offence without screen-play Fig. 4d (the boxplot is shown in Figs.5d and e  2 and Supplementary Fig. 2. With these methods, we can interpret the types of interaction between agents (and environments) that contributes to focusing dynamic properties (i.e.frequency in this case) with physical meaning.In Discussion section, we discuss further semantic interpretation.

Discussion
The objective of this paper was to understand the network dynamics for complex collective motions via physically-interpretable classification using datadriven spectral analysis of GDSs called Graph DMD.In this section, we discuss the comparison with the conventional approaches, semantically interpret the visualised Graph DMD modes, describe the limitation and the future perspectives For all statistical results, see Supplementary Tables 1 and 2.
for our methods.We then present our conclusions.
Using ballgame data as an example, we classified the strategic collective motions in the team-defence and offence recognition tasks more successfully than most of the existing methods.The results of the comparison with graph features (GDMD Laplacian and GCN) suggest that the collective motions in this case can be classified with the node information (nearest to the ball) rather than the graph features.The results of other DMD methods (DMD spectrum and KDMD spectral kernel) suggest that both the reflecting graph structure and the use of appropriate sliding windows were important for recognition.Furthermore, in comparison with KDMD spectral kernel [40] (i.e.decomposition in a feature space), the proposed method has an advantage in physically and semantically interpreting the DMD modes in the observed data space.Regarding an end-toend neural network approach (i.e.spatio-temporal GCN), the model for motion capture data (i.e.joint location data for a few people) must be customised.
However, each agent is not physically connected in this case; thus, the positional relationship (i.e.constraint condition) dynamically changes in contrast to the motion capture data.In this paper, we used a simple setting with minimal correction; however, further adaptation to complex collective motions is required (e.g. more appropriate customisation of the network and adequate learning such as using a larger amount of data).For hand-crafted features, we compute simple features to demonstrate the validity of our methods in this paper.The use of a more customised framework [44] which includes the detection of a few related people might improve specific classification performance.However, in this paper, we proposed a more generalised classification framework for global dynamics of collective motions, which successfully performed two different recognition tasks (team-defence and offence) by selecting only the elements of the Graph DMD modes as feature vectors.Furthermore, our method can obtain the dynamical properties of the collective motion network.
The proposed method can obtain physically-interpretable dynamical properties of network dynamics and can perform classification even for complex collective motions.For example, in Fig. 5, we found the label-specific stronger spectra in two recognition tasks, in which the relationships among the nearest attackers or defenders, those among attackers and defenders far from the ball and those between defenders and the ring, which were oscillated in low-frequency modes.Semantically, in zone defence, defenders adjusted their positions more according to a teammate (Fig. 5a), the attacker with the ball (Fig. 5b: especially in the distant case from the ball) and less according to the geometric reference point (Fig. 5c) than the defenders in person-to-person defence.For team-offence with screen-play, attackers adjusted their positions more according to a teammate (Fig. 5d) and the distant defender from the ball (Fig. 5e) than the attackers in team-offence without screen-play.Generally, our approach can be applied to the analysis of complex global dynamics in groups of living organisms or artificial agents, which currently eludes mathematical formulation.The human group in a sport used in this study can be considered as an examples in which communication is likely to be measured in a physical space [38].Therefore, by measuring the outputs of communication and assuming that there are underlying dynamics behind the obtained data, our approach can handle various means of communication between agents.In practice, in various sports teams or other human communities, supervisors (e.g.experimenters, coaches and teachers) spend considerable amounts of time analysing the collective motions in their domain.Application of a system, such as the one presented here, can lead to the creation of useful plans that are currently derived only from their implicit experience.
However, there are some limitations to this study.One is the validation of whether our approach can extract true dynamical properties if used as an equation-free method, which cannot confirm the true dynamical properties (e.g. frequencies with growth/decay rate), as well as in the previous works [32,31].
Originally, DMD demonstrates its strength for the dynamical systems which can be mathematically defined [41,48] or of which solutions are empirically known [29,30].We instead validated our approach using classification performance, a qualitative evaluation with specific knowledge of the sport domain (e.g.lowfrequency band) and the reconstruction error (see Materials and Methods).
However, a general quantitative validation method for the unknown dynamics is further required.Another is to reflect the more local interaction dynamics such as local competitive and cooperative play by the attackers and defenders [49,50,51,22], which can provides more practical information in the sport domain.Although the purpose of Graph DMD is to extract the underlying global dynamics of GDSs and we can obtain the interpretable local spectra in the DMD modes, there are other approaches for extracting the more specific local dynamics.Even when using only players' location data such as in this study, more specific methods reflecting local competition can be applied to more practical application such as score prediction [40,38] and prediction of a player to obtain the ball after shot [52].
In conclusion, we applied a data-driven spectral analysis called Graph DMD, which obtains physically-interpretable dynamical properties of network dynamics for the classification even for complex collective motions.We classified the motions in different global behaviours and discovered that, in addition to the physical properties, the contextual node information was critical for classification.
Furthermore, we discovered the label-specific stronger spectra in the relationship between the nearest agents, which provided physical and semantic interpretation.
Our approach contributes to the understanding of complex networks involving collective motions from the perspective of nonlinear GDSs.

Methods
Positional data in a ballgame.
The positional data of players and the ball (25 frames per second) were obtained from actual men's Asian international level practical games held in 2015 and preprocessed by STATS SportVU system (Northbrook, IL, USA).We obtained the consent to use it for research.We analysed 220 min of play (in 4 days) in which the two teams scored 746 points (386 vs. 360).For each day, players performed one and a half games (i.e. 60 min) except for 1 day (only one game).The positional data contained the XY position of each player and the XYZ coordinates of the ball on the court (All-court: 28 × 15m; half-court: 14 × 15 m).After the following data segmentation, we obtained 319 attack-segments.

Data segmentation.
Prior to data segmentation, we used a custom-made automatic individual play-detection system to detect shots using positional data similar to that used in previous studies [22,38,44].We analysed an attack-segment defined as the period that begins when all players enter the attacking side of the court and ends before a shot (we analysed only the attack-segment finishing with a shot).Then, the authors, who have experience in playing and coaching basketball, manually labelled all attack-segments into a zone defence, a person-to-person defence as the team-defence recognition task and offence with and without screen-plays as the team-offence recognition task.
Person-to-person defence is a team-defence strategy in which defenders basically guard their predetermined attackers.In contrast, zone defence is another team-defence strategy in which defenders basically guard their predetermined areas.Although zone and person-to-person defence are exclusive by definition, they are actually mixed according to different situations.For example, against an attacker's penetration with the ball to the ring, the defenders in person-to-person defence also guard the area near the ring; in contrast, against an attacker's shot, the defenders in zone defence also guard the attacker with the ball.Despite this ambiguity, people with the experience of games can distinguish the two team-defence strategies by observing the defensive motion without the ball.
Screen-play is the basic and minimal strategic cooperative play in basketball [44], in which an attacker stands on the course of defence player like a 'screen' and prevents the defender from defending another attacker in a legal way.We labelled attack-segments into an offence with (at least including a screen-play) and without screen-play.Note that we focused on the global network dynamics of collective motions, thus we did not segment and label screen-plays themselves (i.e. in a spatio-temporally accurate sense such as in the previous study [44]).
We randomly created a validation dataset (30 attack-segments) for the selection of parameters regarding Graph DMD, and a test dataset (289 attacksegments) for the subsequent analyses.
Creating adjacency matrix series.
Next, we compute the adjacency matrix sequence in the attack-segment for inputting to the subsequent graph DMD.Here, using the Gaussian kernel in Eq. ( 5), we converted the distances between all individuals and the geometric reference position (i.e. the ring) into weights of adjacency matrix series, which indicate 1 if close to each other and 0 if far from each other.In this way, we can obtain more stable time series than those directly using the distances.The numerator in the exponential in Eq. ( 5) represents the distance between all individuals and the ring.Since the order of i cannot be uniquely determined in general, we must define the order based on reasonable grounds.For example, the order of playing positions (e.g.guard, forward and centre) or jersey numbers does not provide fair comparison among attack-segments because the players themselves frequently changes throughout attack-segments.A simple way to determine the order is based on the distance between players and a reference point (e.g.ball or ring).However, if the order is simply determined by proximity to the reference point, there may arise a problem that, for example, the distance between the third and fifth players is shorter than that between those (the third and fifth) and the fourth players.Thus, first, we simply set i = 1, 2 for the two attackers in order from the attacker closer to the ball.Then, we set i = 3 for the closest attacker in terms of the sum of the distance to the above two attackers.In this way, we set i = 4, 5 for the closest attacker based on the sum of the distance to the three and four attackers, respectively.Similarly, we set i = 6, . . ., 10 for the five defenders and set i = 11 for the ring position to take the geometric information into consideration.
Next, denominator 2σ in the exponential in Eq. ( 5) is a coefficient for adjusting the value of A i,j,t .In this paper, we set σ for the relationship between players that satisfies A i,j,t = 0.5 when y i,t −y j,t 2 = 1.5 m (i.e.σ = 1.5 2 /2log2), which is considered to be a meaningful proximity between players (e.g. 1 m is near and 2 m is far in terms of the contact of two players).Also, we set σ for the relation between players and the ring that satisfies A i,j,t = 0.5 when y i,t − y 11,t 2 = 6 m (i.e.σ = 6 2 /2log2).In this way, a sequence of matrix A t with A i,j,t as an element at time t is created as a sequence of adjacent matrices of the graph.In this paper, we used this adjacency matrix series for both team-defence and offence recognition tasks.
Selection of an appropriate representation of the data is a fundamental problem in pattern recognition.Time-series data is challenging to design features for because of difficulties in reflecting the data structure (including time length).
A simple method involves using the representative values (e.g.average and maximum values).However, for time series data, which cannot be distinguished by these representative values, it is necessary to extract dynamic properties of the time series.One of the effective methods is DMD [29,30] as described below.
Basic DMD and its variants.
Here, we first describe the basic DMD procedure and briefly explain the variants of DMDs including Graph DMD.Among several possible methods to compute the spectral decomposition from data, DMD [29,30] is the most popular algorithm, which estimates an approximations of the decomposition in Eq.
(2).Consider a finite-length observation sequence y 0 , y 1 , . . ., y τ (∈ C m ), where basically approximates it by calculating the eigendecomposition of matrix F = Y X † , where X † is the pseudo-inverse of X.The matrix F may be intractable to analyse directly when the dimension is large.Therefore, in the popular implementation of DMD called exact DMD [47], a rank-reduced representation where * is the conjugate transpose.Thereafter, we perform eigendecomposition of F ∈ C p×p to obtain the set of the eigenvalues λj and eigenvectors w j .Then, we estimate the Koopman modes in Eq. ( 2): which is called DMD modes.Time dynamics of jth mode is defined as λt j b j,0 , where b j,0 = ψ † j y 0 and ψ † j is the j-th row of the pseudo-inverse of ].In summary, obtained time series data is decomposed into spatial coefficients and temporal dynamics: y t ≈ p j=1 ψ j λt j b j,0 .Theoretically, for DMD to compute the decomposition Eq. ( 2), each observable function g i should lie within the space spanned by the Koopman eigenfunction ϕ j , i.e. the data should be rich enough to approximate the eigenfunctions.However, in basic DMD algorithms naively using the obtained data such as exact DMD, the above assumption is not satisfied such as when the data dimension is too small to approximate the eigenfunctions.Thus, there are several algorithmic variants of DMDs to overcome the problem of the original DMD such as a formulation in RKHSs [41], in a multitask framework [54], and using a neural network [48,55].
The previous Graph DMD [33] utilises the structure of graph sequences (i.e. a tensor structure of adjacency matrix series) by a formulation in a vector-valued RKHS and an implementation using tensor-train decomposition.Although the previous Graph DMD is based on a general formulation in a vector-valued RKHS, g G (x) = ∞ j=1 ϕ j (x)ψ j , where ψ j is a set of vector coefficients.Through the iterative applications of K G , the following equation is obtained: For the practical implementation of the spectral decomposition of the linear operator, it is needed to project data onto directions that are effective in capturing the properties of data such as in exact DMD [47], DMD with reproducing kernels [41] and tensor-based DMDs [42,33].Here, we compute the projection onto some orthogonal directions based on tensor-train decomposition [43], in the same ways of modified tensor-based DMD [33].In other words, Graph DMD assumes that the data are sufficiently rich and thus a set of the orthogonal direction gives a good approximation of the representation with the eigenfunctions of K G .Concretely, for an implementation of the above analysis, we first regard the given or calculated matrices as a realisation of the structure of matrix-valued observable G(x t ) at each t.We denote the realised matrices (i.e.adjacency matrix) as A t ∈ R m×m for t = 0, . . ., τ .Second, we need to compute the projection onto orthogonal directions M (see below and Results) and obtain DMD solution F ∈ R p×p .Then, after eigendecomposition of F , we compute DMD eigenvalues λj and DMD modes z j ∈ C m 2 for j = 1, . . ., p. Finally, we obtain Graph DMD modes Z j ∈ C m×m by matricising z j .
It should be noted that, although we vectorise G(x t ) in this formulation, the order of the vector is unique; thus it reflects the dependent structure of the matrix-valued observable, G(x t ).In the implementation, we utilise tensor-train decomposition to obtain the matrix M = XN † Σ −1 , where X = [vec(A 0 ), . . ., vec(A τ −1 )] (see Results).Although A t is also vectorised, unlike SVD, M reflects order-3 tensor structure based on tensor-train decomposition.
Therefore, both Koopman spectral analysis with dependent structure among observables and Graph DMD can be calculated without breaking the dependent structure.
frequency) using a validation dataset.Here we describe the summary (for the details, see Supplementary Text 2).First, we set Graph DMD tolerance ε (i.e. the tolerance in the successive SVD in tensor-train decomposition) to 1.0 × 10 −5 based on the result in Supplementary Fig. 1a.For the sliding temporal windows, we set the window size to 50 frames (2 s) including overlaps of 25 frames (1 s) based on the result in Supplementary Fig. 1b.Regarding the cutoff frequency, we set the cutoff frequency to 2 Hz based on the result in Supplementary Fig. 1c.
Next, we describe the detailed computation of the feature vectors in classification using the Graph DMD modes.
Feature vectors using Graph DMD modes.
We first adjusted the computed DMD modes.One is a normalisation within the DMD modes such that the maximum absolute value of the elements is 1 because the amplitude of the modes depends on that of the time dynamics.
Second is symmetrisation.Since the computed Graph DMD modes were different from symmetric matrices in a precise sense (although the ideal modes are symmetric matrices), we added the modes to the transposed modes and divided them by two.Finally, to evaluate the global complex behaviours, we averaged the DMD modes for all valid sliding windows.

Classification.
After Graph DMD, we perform classification in three different approaches.
For details, see Supplementary Text 1.The first two approaches create feature vectors and the third automatically extracts the features via a neural network approach.The final two approaches computed graph features using existing methods: the second is graph Laplacian eigenvalues [39] as a baseline feature of a graph [45] (denoted GDMD Laplacian) and the third is deep graph convolutional neural networks [46] (denoted GDMD GCN) as a recent promising method for classifying the graph data.For all classification methods with the exception of the neural network approaches, we adopted logistic regression as a simple linear binary classification model.
As comparable methods for classifying graph sequence data, we adopted four methods.The first is a method using vectorised basic DMD modes [47] as a baseline of DMD approaches.The second is the Koopman spectral kernel [40] using DMD with reproducing kernels [41] as the existing method [38] for classifying the collective motion dynamics.The third is a hand-crafted feature as a simple baseline method.Fourth is an advanced neural network approach for classifying graph sequence data called spatio-temporal GCN [36] as a recent promising end-to-end method.For the detailed setups, see Supplementary Text 1.

Statistical analysis.
For an accurate evaluation of classification performance from the multiple viewpoints, in addition to accuracy, AUC and F-measure were calculated to compare the classification performance.AUC is based on the ROC curve, which is generated by plotting a cumulative distribution function of the true-positive rate with respect to the false-positive rate.Then, AUC takes various decision thresholds into consideration.Next, recall and precision rate were computed for F-measure.Recall rate is defined as the ratio of the sum of true positives and true negatives to the number of true positives (the true-positive rate), and the precision rate is defined as the ratio of the sum of true positives and true negatives to false positives.The trade-off curve between recall and precision was created using the cumulative distribution function.To evaluate the trade-off, the F-measure was calculated as follows: (2× precision rate × recall rate) / (precision rate + recall rate).
For investigating the robustness of the classification results, we repeated test sessions for logistic regression five times using different test sets in analogy to fivefold cross-validation (i.e.classified 25 times).To compare the various methods, since the hypothesis of homogeneity of variances between methods was rejected with Leveneï£¡fs test, the Kruskal-Wallis nonparametric tests were performed.
As the post-hoc comparison, Wilcoxon rank sum test with Bonferroni correction was used within the factor where a significant effect in Kruskal-Wallis test was found.Since comparisons were only performed between GDMD spectrum and others based on our hypothesis, p-value was multiplied by six.We used r values as the effect size for Wilcoxon rank sum test.
We adopted logistic regression for simply investigating the effect of each element of the feature vector.We calculated the odds ratio (related to effect size), its p-value and 95% confidence interval.For all statistical calculations, p < 0.05 was considered significant.All statistical analyses were performed using the MATLAB 2018a Statistics and Machine Learning Toolbox (The MathWorks, Inc., MA, USA).
learn the global node topology by sorting a graph's node in a consistent order called SortPooling layer, so that traditional neural networks can be trained on the graphs.This GCN is closely related to some type of graph kernels based on structure propagation, especially the Weisfeiler-Lehman subtree kernel [58] and propagation kernel [59].The previous work [46] showed that the GCN achieved highly competitive classification performance with various graph kernels and neural network methods.Thus, we used this method as a recent promising method to classify the graph data.We used the averaged Graph DMD modes as an input adjacency matrix and default parameters in open source code https://github.com/muhanzhang/DGCNN.
For all classification methods except for the neural network approaches, logistic regression was adopted as a linear binary classification model.For the neural network approaches [46,36], we used the default softmax layer as classifiers.

Classification using other methods
As comparable methods to classify graph sequence data, we adopted four methods.The first is a method using vectorised basic DMD [47] modes (denoted as DMD spectrum) as a baseline of DMD approaches (the selection of the elements was the same as GDMD spectrum).The second is the Koopman spectral kernel [40] using DMD with reproducing kernels [41] as the existing method [38] to classify the collective motion dynamics (denoted as KDMD spectral kernel).
In the two methods, input data should be matrix thus we reshaped the input adjacency matrix series to the matrix in which rows and columns were temporal stamps and vectorised adjacency matrix, respectively.Koopman spectral kernels generalised a kernel [60] between dynamical systems to nonlinear dynamical systems.Among the above kernels, we used Koopman kernel of principal angle between the subspaces of the estimated Koopman mode, showing the best discriminative performance [40] using the Koopman modes given by DMD with reproducing kernels.Regarding DMD with reproducing kernels [41], we adopted the Gaussian kernel and the kernel width was set as the median of the distances from data.
The third is a hand-crafted feature as a simple baseline method, consisted of vectorised temporal average, maximum and minimum values of the elements of input adjacency matrix series.Fourth is an advanced neural network approach to classify graph sequence data called spatio-temporal GCN [36] as a recent promising end-to-end method.Spatio-temporal GCN is a graph-based neural network for action recognition by modeling dynamic skeletons with the joints as graph nodes and natural connectivities in both human body structures and time as graph edges.That is, the input of the original work is the joint coordinate vectors on the graph nodes.Multiple layers of spatio-temporal graph convolution operations will generate higher-level feature maps on the graph.
We used the sequence of adjacency matrices as the input and basically used default parameters in opensource code https://github.com/yysijie/st-gcn,except for the below parameters.To adjust the model to the collective motion in this study, since indices of players were based on their distance from the ball and teammates, the neighbour edges were defined as the nearest attackers and defenders, and the ring and players.Moreover, for higher-level spatio-temporal graph convolution, we adopted spatial configuration partitioning, which shows the best action recognition performance among various partitioning strategy [36].
The strategy divides the neighbour set into three subsets: the root node itself, centripetal group and centrifugal group.This enables us to semantically higherlevel spatio-temporal graph convolution.
Text 2. Selection of Graph DMD parameters.
Here, we describe the selection of parameters regarding Graph DMD (the Graph DMD tolerance, the temporal window size and the cutoff frequency) and quantitatively validated the applicability of Graph DMD to the sport data in terms of reconstruction error using a validation dataset.
We used the absolute reconstruction error defined as (1/m 2 τ ) τ −1 t=0 m i=1 m j=1 A i,j,t − Âi,j,t , where Âi,j,t is a reconstructed element corresponding to A i,j,t .In Graph DMD, Graph DMD tolerance ε (i.e. the tolerance in the successive SVD in tensor-train decomposition) is critical for the data reconstruction.We set ε = 1.0 × 10 −5 because Supplementary Fig. 1a shows that the reconstruction error in this tolerance is lower than other values.
Next, we performed Graph DMD in sliding temporal windows, because complex collective motions often transiently change their rules of motions.For basic DMD, researchers used sliding windows applied to cortical electroencephalogram data [32].However, there is a trade-off between the reconstruction error and the meaningful dynamical information in the time interval.If the window size is too small, the reconstruction error may be small but the extracted information may be useless.If too large, the extracted information may reflect meaningful information but the reconstruction error may be too large.Supplementary Fig. 1b shows that when the window size is 60 frames, the reconstruction error greatly increased.Thus, we set the window size to 50 frames (2 s) including overlaps of 25 frames (1 s).
In addition, a cutoff frequency is also a important parameter.In this study, meaningful motion frequency is considered to be in a low-frequency band (e.g.under 2 Hz) rather than a high-frequency band (e.g. over 2 Hz), when considering the distinction among team-defense or offense motions.Thus, obtained data was first low-pass filtered at the cutoff frequency.Moreover, since DMDs are not guaranteed to extract the dynamics within the frequency band, we averaged Graph DMD modes within the temporal frequency band using DMD eigenvalues.The selection of the cutoff frequency also has a trade-off between 38 the reconstruction error and meaningful dynamical information.Supplementary Fig. 1c shows that when the cutoff frequency was 3 Hz, the reconstruction error greatly increased.Thus, we set the cutoff frequency to 2 Hz.

Figure 1 :
Figure 1: Schematic diagram of Graph DMD and classification.(a) Obtained data in three temporal frames.The colours in each edge represent the values as a function of the distance between agents; they approach 0 if far and 1 if near (further details are provided in the main text).The input of Graph DMD is adjacency matrix series At.(b) Graph DMD decomposes the input At into the sum of the product of graph (spatial) modes and temporal dynamics.To analyse the time-varying dynamics, Graph DMD is performed for each temporal sliding window.(c) After Graph DMD, features for classification are computed in different approaches.One is simply to vectorise the Graph DMD modes (i.e.spectrum).The second is to compute graph features using existing methods.Other approaches are described in the main text.(d) Using feature vectors and labels, a classification model is trained, validated and tested.

Figure 2 :
Figure 2: A representative example of Graph DMD.(a) An example of obtained data (components of adjacency matrix series).(b) Graph DMD eigenvalues (black circles) computed by Graph DMD in a complex plane.(c-e) Time dynamics and (f-h) visualised Graph DMD modes for the dynamics with the lowest frequencies.
location of collective motion networks shown in Fig. 2 change with time; thus for an accurate understanding, we averaged and visualised them in an adjacency matrix form).

Fig. 4a was
Fig. 4a was observed in the relationship among the nearest defenders (expanded in the figure) but it was not observed in person-to-person defence Fig. 4b.The variability of the feature vectors is also presented via boxplots in Figs.5a-c.In examining the individual contribution, statistical analysis using logistic regression indicates that the relationships among defenders (D1-D2 and D4-D5: odds ratio > 31.81 and p < 0.023; D1-D3 was also significant but too small), those among attackers and defenders (A1-D4, A1-D5, A2-D5, A4-D1, and A5-D5: odds ratio > 0.02 and p < 0.023) and those among defenders and ring (ring-D1, D2 and D4: odds ratio > 0.002 and p < 0.049) significantly explained the labels of team-defences.Note that this analysis separately investigated the individual contribution as well as the above classifications.For more detailed statistical results and boxplots of other elements of DMD modes, see Supplementary Table

Figure 3 :
Figure 3: Classification results of seven methods in two recognition tasks.Receiver operating characteristic (ROC) curves in (a) the team-defence and (b) team-offence recognition task are shown.Also, trade-off curves between recall and precision in (c) the team-defence and (d) team-offence recognition task are shown.The seven methods are given in the legend of (a) and the text.

Figure 4 :
Figure 4: Averaged Graph DMD modes.Averaged Graph DMD modes for label 1 (a and c) and 2 (b and d) in the team-defence (a and b) and offence (c and d) recognition tasks are shown.We denote the first to fifth nearest attacker and defender as A1 , . . ., A5 and D1 , . . ., D5, respectively.R indicates the ring (goal).A strong spectrum for (a) zone defence was observed in the relationship among the nearest defenders; however, it was not observed in (b) person-to-person defence.Similarly, a strong spectrum for (c) offence with screen-play was observed in the relationship between the nearest attackers; however, it was not observed for (d) offence without screen-play.

Figure 5 :
Figure 5: Boxplots of feature vectors as elements of Graph DMD modes.Feature vectors as elements of Graph DMD modes for label 1 (red) and 2 (green) regarding the feature vectors used in the team-defence (a-c) and offence (d and e) recognition tasks are shown (label 1: zone defence and offence with screen-play; label 2: person-to-person defence and offence without screen-play).For the team-defence recognition task, the elements of GDMD modes (a) among defenders, (b) those among attackers and defenders and (c) those among defenders and the ring were selected.For the team-offence recognition task, those (d) among attackers and (e) those among attackers and defenders were selected.Notations are same as Fig. 4. Horizontal bars indicate that the relationship is significantly explained by logistic regression.
1, . . ., V m } and E = {E 1 , . . ., E l } are the vertex and edge sets of a graph, respectively, fixed at each time t ∈ T := N 0 .x t is the state vector in a state space M ⊂ R p for the GDS and f : M → M is a (typically, nonlinear) state-transition function (again, x t+1 = f (x t )).y t = [y T 1,t , . . ., y T m,t ] T are concatenated observed values, where y i,t ∈ R d are observed values for a vertex i = 1, . . ., m.They are given by y t := g c (x t ), where g

Table 1 :
Classification performance of seven methods for two recognition tasks.
Accuracy (Acc), area under the curve (AUC) based on receiver operating characteristic (ROC) curve and F-measure are indicated.Overall, classification performances in GDMD spectrum were higher than those in most of remaining methods.

Table S1 :
Results of logistic regression in team-defense recognition task.

Table S2 :
Results of logistic regression in team-offense recognition task.