Social aspects of collision avoidance: a detailed analysis of two-person groups and individual pedestrians

Pedestrian groups are commonly found in crowds but research on their social aspects is comparatively lacking. To fill that void in literature, we study the dynamics of collision avoidance between pedestrian groups (in particular dyads) and individual pedestrians in an ecological environment, focusing in particular on (i) how such avoidance depends on the group’s social relation (e.g. colleagues, couples, friends or families) and (ii) its intensity of social interaction (indicated by conversation, gaze exchange, gestures etc). By analyzing relative collision avoidance in the “center of mass” frame, we were able to quantify how much groups and individuals avoid each other with respect to the aforementioned properties of the group. A mathematical representation using a potential energy function is proposed to model avoidance and it is shown to provide a fair approximation to the empirical observations. We also studied the probability that the individuals disrupt the group by “passing through it” (termed as intrusion). We analyzed the dependence of the parameters of the avoidance model and of the probability of intrusion on groups’ social relation and intensity of interaction. We confirmed that the stronger social bonding or interaction intensity is, the more prominent collision avoidance turns out. We also confirmed that the probability of intrusion is a decreasing function of interaction intensity and strength of social bonding. Our results suggest that such variability should be accounted for in models and crowd management in general. Namely, public spaces with strongly bonded groups (e.g. a family-oriented amusement park) may require a different approach compared to public spaces with loosely bonded groups (e.g. a business-oriented trade fair).

To plan safe buildings [34][35][36][37][38] and manage crowds in busy public places (e.g. transportation hubs) and events [39][40][41], it is important to consider group behavior in crowd simulators and monitoring/predicting tools [42][43][44][45]. This includes taking into account factors such as the structure of different types of groups, their social relations 46, and cultural differences 47,48 for more accurate simulations. It is also crucial to model not only the internal dynamics of the group, but also their reaction to the external environment, such as the presence of other pedestrians, while keeping in mind the aforementioned characteristics of each group. Furthermore, it is essential to consider the impact of the group on other pedestrians and how their behavior may be influenced by the presence of the group.
The emergence of autonomous navigating agents in industries like smart vehicles, assistive robots, and drones has led to a significant increase in research attention towards collision avoidance in recent years. Human beings are naturally good at avoiding each other, and the famous Shibuya crossing in Tokyo is a good illustration of our ability to avoid collisions, even in crowded environments. Consequently, researchers have tried to model this ability in various ways [49][50][51]. Early on, in the Social Force Model 52, a repulsive force between particles (representing pedestrians) was used to account for collision avoidance. Such models have yielded good results when used in tandem with path-finding algorithms to navigate autonomous agents 53. More recent research has delved into the specific scenario of pairwise avoidance during face-to-face encounters among pedestrians, using real-life trajectory data 26,27,54. These studies measured the deviation of pedestrians from their expected undisturbed trajectories when encountering others (1vs1 and 1vsN scenarios) and compared the results to a Langevin-like physics model. They notably showed that interaction with multiple incoming pedestrians is better described using a non-linear superposition of short-ranged contact avoidance forces.
However, the majority of the studies mentioned above concentrate solely on the collective behavior of groups, i.e. the dynamics that drive the group to move as one cohesive unit. Mathematically oriented works on group dynamics may introduce the concept of a "group potential" and examine it by assuming that interactions with pedestrians outside the group can be approximated as white noise 6 or as an external "mean-field" potential 14 on average. Similarly, observation-based works 7,19,21,22 often describe group properties using probability density functions defined by an average process that neglects the specifics of the environment.
On the other hand, when introducing group behavior into a microscopic simulator, it is essential to incorporate specific rules that describe interactions between the group and the environment, particularly with surrounding pedestrians. Introducing in-group dynamical rules is a logical starting point, and these can simply be added to the collision-avoidance rules used for individuals. For instance, if a two-person group encounters a single-walking pedestrian in an acceleration-based or "Social Force" model 52,55,56, the acceleration terms of the group's pedestrians can be obtained by summing the individual-individual collision avoidance term and those resulting from in-group interaction 57. The behavior of the lone pedestrian can be modeled by adding the two avoidance terms with respect to the pedestrians in the group. In other words, collision avoidance is treated as a one-to-one behavior, group dynamics are regarded as an exclusively in-group phenomenon, and the overall dynamics is the sum of these distinct components. This is the (classical) linear superposition principle 58, which is typically assumed (and verified) in mechanics and simplifies physical models and their dynamics to a great extent. However, this principle does not necessarily apply to pedestrian dynamics. For instance, when a single walking pedestrian encounters a two-person group walking side by side, particularly if they are socially interacting, he or she may choose to avoid walking through them, even if it seems like the best choice from a pure collision-avoidance perspective. This intrusion decision would be likely, should the two pedestrians be perceived as independent.
In this work, we investigate a relatively unexplored aspect of pedestrian behavior and crowd dynamics: collision avoidance of (or against) groups, and in particular, its dependence on the groups' "social attributes", which refer specifically to social relation and intensity of interaction.
In doing that, we use two data sets of pedestrian trajectories, including annotations of groups' social attributes, to investigate the nature of individual-group collision avoidance. Moreover, we focus particularly on groups composed of 2 people (i.e. dyads), since they are much more common than larger ones and they actually constitute their fundamental building block together with triads (i.e. for easy navigation and social interaction, large groups break into sub-groups of 2 or 3 people) 5. In addition, larger groups (of 3 or more people) may require a categorization of pair-wise social relations or interactions, which may be very complex to formulate or generalize. In that respect, we use the word group to refer simply to dyads. As groups' counterpart in collision avoidance, we focus on individuals, which is a term we use to refer to people not belonging to a group.

Data sets
In this study, we used two data sets, namely the ATC data set and DIAMOR data set, both of which are reviewed and approved for studies involving human participants by the ATR ethics board 6,7, are publicly available and contain trajectories derived from range data [59][60][61]. From these trajectories, we computed the normalized cumulative density maps of the experiment environments shown in Figure 1.
The data sets are annotated based on video footage for different social attributes of groups. Specifically, the ATC data set is annotated from the viewpoint of social relations, whereas the DIAMOR data set is annotated from the viewpoint of the intensity of interaction.
For the ATC data set, possible options for social relation are couples, colleagues, family and friends, which are determined through the domain-based approach of Bugental 62 and correspond to the domains of mating, coalitional, attachment and reciprocal, respectively. This annotation process yields the values presented in Table 1-(a).
For the DIAMOR data set, the intensity of interaction is evaluated at 4 degrees, 0 representing no-interaction and 1, 2, and 3 representing weak, mild and strong interaction, respectively. This annotation process yields the values presented in Table 1-(b). Note that in order not to bias the coders' assessment, we only defined the number of interaction levels as 4, but we did not give any guidelines on what can be considered as weak, mild or strong interaction 63. Instead, we let the coders grasp a feeling about different intensities of interaction through a free-viewing task (i.e. letting them watch the videos for 3 hours before giving any labels; see Supplementary Information Section 1 for further details).

Approach
Provided that the group and the individual do not perform any collision avoidance, we can expect their (relative) motion to be approximated by a straight line. We are aware that this assumption implicitly requires the environment to be sufficiently straight and wide (e.g. like DIAMOR, see Figure 1-(b) and the discussion in Supplementary Information Section 7) and is valid up to a reasonable range (i.e. over a few meters). Namely, in environments with complex geometries (curved or with many obstacles, intersections etc.), the pedestrians need to deviate as part of their interaction with the boundaries. Similarly, over long distances, they will eventually meet some walls, or divert towards different goals, which bends their relative motion. Nevertheless, in a sufficiently straight corridor and on a scale of a few meters, we can expect it to be a good approximation. This trivial assumption can be considered to serve as a hypothesis opposite to what we actually anticipate. Based on such a hypothesis, the deviation of (relative) motion from a straight line can be attributed to group-individual collision avoidance. Specifically, by measuring this deviation with respect to different social relations or intensities of interaction (of the group), we may understand how such group attributes are reflected in collision avoidance.
This formulation presents a striking resemblance to one of the fundamental problems of physics, namely, the "scattering problem", where a "particle" (blue ball in Figure 2-(a)) is shot on a "target" (green ball), and its deviation from the straight line motion is used to study the interaction potential. In the original scattering problem (see Figure 2), this deviation is assessed by accounting for the straight-line distance $b$ (called the impact parameter) and the closest approach $r_{\min}$, which is derived from the scattering angle $\theta$, as an accurate measurement of the particles' location is very difficult. By repeating the experiment with different impact parameters and estimating the corresponding closest approach $r_{\min}$, one can get an approximation for the potential acting on the particle.
Figure 2. (a) Illustration of the scattering problem in physics. A mobile particle (in blue) is projected toward a fixed particle (in green). The impact parameter, $b$, is the straight-line distance between the particles, and $r_{\min}$ is the closest approach. The particle is deflected with an angle $\theta$. (b) A typical pedestrian avoidance situation in the group-centered reference frame. The individual enters the vicinity of the group (gray region) at time $t'$ with velocity $\mathbf{v}_i$. The straight-line distance from the individual to the group is denoted by $r_b$. At time $t_c$, the individual is closest to the group at a distance of $r_0$.
In this study, we establish a simple duality relation between the above-mentioned problem and our group-individual collision avoidance scenario. Namely, the impact parameter $b$ is replaced with a straight-line distance $r_b$, and the closest approach can simply be measured as the shortest distance $r_0$ between pairs of trajectory data points of the group and the individual. In Section Observables we elaborate in detail on how we define these observable quantities.
Using this approach from the physical sciences to describe human behavior obviously represents a strong approximation, not only because human behavior is too complex to be modeled through simple physical forces, but also because it completely ignores the effect of the environment, which, in physical parlance, is equivalent to a strong and non-uniform external force. Nevertheless, as we will see, this approach still allows us to grasp the fundamentals of the collision dynamics between groups and individuals and to quantify the interaction. At this point, it is also worth stressing that the model proposed in Section Modeling is aimed at assessing the effect of different social attributes in a qualitative way, rather than reproducing human behavior quantitatively.

Data preparation and transformation
We first carry out a data preparation step by (i) removing atypical/non-characterizing motion (waiting, running etc.), (ii) representing the group as a single unit (its geometrical center) and (iii) focusing on frontal encounters of groups and individuals, for which we expect the pedestrians to be able to judge the social attributes of the incoming party.
Subsequently, we transform the trajectories of the group $g$ and the individual $i$ to a reference frame that is co-moving with the group. Namely, at each time instant (i) the positions of the group and the individual $\mathbf{r}_{g,i}$ are translated such that the group (center of mass) is positioned at the origin and (ii) their velocities $\mathbf{v}_{g,i}$ are rotated such that the velocity of the group is directed towards $x^+$. Finally, the velocities $\mathbf{v}_{g,i}$ are translated by $-\mathbf{v}_g$, rendering the group immobile. The main purpose of this transformation is to provide an easier visualization of relative position in 2D, which represents the position of the individual with respect to the group center, while having the group motion as a preferential direction. On the other hand, most of our analysis is based on the absolute value of the relative distance between the group center and the individual, which is rotationally invariant and independent of frame choice.
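The transformation described above can be sketched for a single time instant as follows. This is only an illustrative sketch: the function name `to_group_frame` and the tuple-based data layout are assumptions, and the relative position is rotated by the same angle as the velocities so that the group's motion direction coincides with the positive x axis.

```python
import math

def to_group_frame(r_g, v_g, r_i, v_i):
    """Express the individual's state in a frame co-moving with the group.

    r_g, v_g: (x, y) position and velocity of the group center.
    r_i, v_i: (x, y) position and velocity of the individual.
    Returns (relative_position, relative_velocity) as (x, y) tuples.
    """
    # (i) Translate positions so that the group center sits at the origin.
    dx, dy = r_i[0] - r_g[0], r_i[1] - r_g[1]
    # Subtract the group velocity, which renders the group immobile.
    dvx, dvy = v_i[0] - v_g[0], v_i[1] - v_g[1]
    # (ii) Rotate by minus the group heading so that v_g points along +x.
    heading = math.atan2(v_g[1], v_g[0])
    c, s = math.cos(-heading), math.sin(-heading)

    def rotate(x, y):
        return (c * x - s * y, s * x + c * y)

    return rotate(dx, dy), rotate(dvx, dvy)
```

For example, a group walking along +y and an individual 2 m ahead of it walking in the opposite direction are mapped to a relative position on the positive x axis and a relative velocity along the negative x axis.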

Relative distance r
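Similar to Corbetta et al. 27, our analysis is based on $\mathbf{r}$, the relative position between the group center and the individual,
$$\mathbf{r} = \mathbf{r}_i - \mathbf{r}_g. \tag{1}$$
Its time derivative is the relative velocity $\mathbf{v}$,
$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \mathbf{v}_i - \mathbf{v}_g. \tag{2}$$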

The absolute value (norm) of $\mathbf{r}$ is simply denoted as $r$.

Straight-line distance $r_b$
The straight-line distance $r_b$ is computed as the shortest distance from the origin (i.e. translated position of the group) to the line which passes through the point at which the individual enters a pre-defined vicinity around the group, termed the window of observation. This refers to the area in the group-centered reference frame from $-W$ to $W$ meters both along the x and y axes (i.e. along the group's motion direction and the direction orthogonal to that). Empirically, a $W$ of 4 m is seen to contain the most significant part of the group-individual collision avoidance (see Figure 2-(b) for an illustration; refer to previous literature 64,65 and Supplementary Information Section 3 for details on the choice of $W$). Let $t'$ be the time instant at which the individual enters the window of observation and let $\mathbf{r}(t')$ be its relative position at that instant. According to the hypothesis mentioned in Section Methods, provided that there is no collision avoidance, the individual will follow a path starting at $\mathbf{r}(t')$ and move along its velocity vector at that instant, $\mathbf{v}(t')$. In this case, the straight-line distance $r_b$ can be computed as the shortest distance between this line and the origin (i.e. translated position of the group),
$$r_b = \frac{|\mathbf{r}(t') \times \mathbf{v}(t')|}{|\mathbf{v}(t')|}. \tag{3}$$
In the analysis, in order to alleviate the impact of orientation noise on the velocity of the individual, we averaged its velocity vector over 4 time instants (before $t'$) and used this mean velocity in Equation 3 instead of $\mathbf{v}(t')$.
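As a minimal sketch of this computation, the point-to-line distance can be evaluated with the 2D cross product, using a velocity averaged over the last few samples before entry. The function names and the list-of-tuples layout are illustrative assumptions, not the authors' implementation.

```python
import math

def mean_velocity(velocities):
    """Average a list of (vx, vy) samples, e.g. the 4 instants before entry."""
    n = len(velocities)
    return (sum(v[0] for v in velocities) / n,
            sum(v[1] for v in velocities) / n)

def straight_line_distance(r_entry, v_entry):
    """Shortest distance from the origin (group center) to the line that
    passes through r_entry with direction v_entry."""
    speed = math.hypot(v_entry[0], v_entry[1])
    if speed == 0.0:
        raise ValueError("direction of motion is undefined for zero velocity")
    # Magnitude of the 2D cross product r x v, divided by |v|.
    cross = r_entry[0] * v_entry[1] - r_entry[1] * v_entry[0]
    return abs(cross) / speed
```

For instance, an individual entering the window at (4, 1) and moving along the negative x direction would pass the group center at a distance of 1 if it kept moving straight.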

Observed minimum distance $r_0$
The observed (i.e. actual) minimum distance $r_0$ between the group and the individual is simply
$$r_0 = r(t_c) = \min_t r(t), \tag{4}$$
where $t_c$ is the time instant at which the individual is closest to the group. The time steps at which pedestrian positions are recorded are obviously discrete. Nevertheless, in order to have a more accurate estimation of $r_0$, one can also interpolate $\mathbf{r}(t)$ between two consecutive time steps $t_k$ and $t_{k+1}$ by using the velocity vector at time $t_k$,
$$\mathbf{r}(t) = \mathbf{r}(t_k) + \mathbf{v}(t_k)\,(t - t_k), \qquad t_k \le t \le t_{k+1}. \tag{5}$$
This procedure allows detecting minimum distances not only exactly at sampling instants, but also at intermediate time points between consecutive samples, which yields a much more accurate estimation of $r_0$ (refer to Supplementary Information Section 4 for details).
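One possible way to implement the interpolated minimum is sketched below. Within each sampling interval the relative position is extrapolated linearly with the velocity at the start of the interval, and the time of closest approach on that segment is found in closed form and clipped to the interval. The function name, the uniform sampling period `dt` and the choice not to extrapolate beyond the last sample are assumptions made for illustration.

```python
import math

def min_distance(positions, velocities, dt):
    """Estimate the minimum of |r(t)| from sampled relative positions and
    velocities (lists of (x, y) tuples) with uniform sampling period dt."""
    best = math.inf
    last = len(positions) - 1
    for k, ((x, y), (vx, vy)) in enumerate(zip(positions, velocities)):
        speed_sq = vx * vx + vy * vy
        tau = 0.0
        if k < last and speed_sq > 0.0:
            # Minimizer of |r_k + v_k * tau|, clipped to the interval [0, dt].
            tau = -(x * vx + y * vy) / speed_sq
            tau = min(max(tau, 0.0), dt)
        best = min(best, math.hypot(x + vx * tau, y + vy * tau))
    return best
```

With two samples at (1, 1) and (-1, 1) and a relative velocity of (-2, 0), both samples lie at a distance of about 1.41, whereas the interpolated closest approach is 1.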

Scaled distances
Groups' interpersonal distance is shown to depend on their social relation and interaction intensity 20,21. Thus, we represent the distances defined above in two ways: in a group-independent way (in meters) and in a group-dependent way (in which the unit of distance is the average interpersonal distance of dyads with the given social bonding 19). In the text, we denote distances measured in meters with the normal font (e.g. $r$) and scaled distances measured in interpersonal distance units with a bar (e.g. $\bar{r}$). Since we observed that results concerning scaled values are in general easier to interpret, in the main text we mostly report those (for further details, refer to Supplementary Information Section 5).

Analysis of collision avoidance
As mentioned in Section Methods, our study of the collision avoidance dynamics between groups and individuals is fundamentally based on examining the relation between $r_b$ and $r_0$. The relation between these two observables, although defined in a slightly different way, has been studied in a similar way for 1-1 encounters by Corbetta et al. 27. In what follows we define two different methods to analyze this relation and then propose a method to model it.

Empirical relation between $r_b$ and $r_0$ and its statistical analysis
To examine the distribution of $r_b$ versus $r_0$, the values of $\bar{r}_b$ are quantized into bins of 0.5 units and, for each bin, the average and standard error of the corresponding values of $\bar{r}_0$ are computed. The choice of 0.5 as bin size was primarily driven by empirical observations. For certain combinations of distance $r_b$ (or $\bar{r}_b$) and bonding (social relation or interaction level), setting a smaller bin size results in having bins with little or no data. Conversely, using a larger bin size decreases the resolution. In that respect, we consider a bin size of 0.5 to strike a balance between these competing factors. The results will be presented and discussed in Section Results on the relation between $\bar{r}_0$ and $\bar{r}_b$.
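The binning procedure can be sketched as follows, assuming one pair of scaled distances per encounter. The function name and the returned dictionary layout are illustrative assumptions; the standard error is computed as the sample standard deviation divided by the square root of the bin count.

```python
import math
from collections import defaultdict

def binned_mean_and_stderr(rb_values, r0_values, bin_size=0.5):
    """Group encounters by bins of the straight-line distance and return
    {bin_lower_edge: (mean_r0, standard_error_r0, count)}."""
    bins = defaultdict(list)
    for rb, r0 in zip(rb_values, r0_values):
        bins[int(rb // bin_size)].append(r0)
    result = {}
    for index in sorted(bins):
        values = bins[index]
        n = len(values)
        mean = sum(values) / n
        if n > 1:
            variance = sum((v - mean) ** 2 for v in values) / (n - 1)
            stderr = math.sqrt(variance / n)
        else:
            # A single observation does not allow estimating the spread.
            stderr = float("nan")
        result[index * bin_size] = (mean, stderr, n)
    return result
```

Bins containing a single observation report an undefined standard error, which reflects the trade-off between bin size and data availability discussed above.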

Intrusions
Small groups, such as dyads, have been shown to usually prefer deviating to avoid splitting 66. Nonetheless, we found situations where the individual passes through the group (i.e. between group members), and we refer to them as "intrusions". For simplicity's sake, we define the probability of intrusion as the probability of having $r_0$ smaller than the group interpersonal distance (see Section Scaled distances). We perform a statistical analysis to investigate the dependence of intrusion on the social attributes of the group. The results will be shown and discussed in Section Intrusion, whereas the details of the computational procedure can be found in Supplementary Information Section 6.
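A minimal sketch of this empirical probability is given below, assuming scaled distances (so that the group interpersonal distance corresponds to a threshold of 1) and the same 0.5-unit binning of the straight-line distance as in the previous section. The function name and the default parameter values are illustrative assumptions.

```python
def intrusion_probability(rb_values, r0_values, bin_size=0.5, threshold=1.0):
    """Return {bin_lower_edge: fraction of encounters with r0 < threshold}."""
    counts = {}
    for rb, r0 in zip(rb_values, r0_values):
        index = int(rb // bin_size)
        total, hits = counts.get(index, (0, 0))
        counts[index] = (total + 1, hits + (1 if r0 < threshold else 0))
    return {index * bin_size: hits / total
            for index, (total, hits) in sorted(counts.items())}
```

In this sketch, each bin's probability is the ratio of the number of encounters below the threshold to the total number of encounters falling in that bin.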

Modeling
Many models of pedestrian collision avoidance are based on "Social Forces" 52,55,56 , which may be defined through a potential.
It has been reported that using position-dependent potentials in modeling of pedestrian collision avoidance fails to reproduce detailed behavior. Even if we assume that a "Social Force" approach may reproduce actual human behavior, the corresponding potential should at least be velocity-dependent and based not on current distance but on future distance at the moment of predicted closest approach 15,56. Nevertheless, determining a potential that may, at least qualitatively, describe the collision avoidance between groups and individuals represents an important first step towards a more realistic quantitative modeling.
Let us first review how we can study the potential energy between two interacting bodies in physics (note that while discussing the physical model, we use the word "interaction" to refer to the effect that the bodies exert on each other). The study of such a "scattering" problem is obviously a cornerstone of physics, and the non-quantum formalism analyzed in this section was used to study such important problems as the structure of atoms 67 and the gravitational lens effect due to space-time curvature 68, among others. In general, the "bodies" in focus are very complex and composed of many particles (e.g. planets, stars). Nevertheless, due to the scale of the problem, they may be treated as point particles themselves (in our "pedestrian scenario", the group will be represented with a single point).
Their interaction is determined by a potential energy $U(\mathbf{r})$, which is in general a function of only relative position (a result connected to invariance under space translations and equivalent to Newton's third law 69). Nevertheless, in many important applications, the potential is central, i.e. rotationally invariant, and depends only on the magnitude of the distance, $U(r)$.
In such a case, it can be shown that the interesting (potential-dependent) dynamics can be studied in the $r$ variable 67. Defining the reduced mass $\mu$ as
$$\mu = \frac{m_1 m_2}{m_1 + m_2}, \tag{6}$$
where $m_1$ and $m_2$ are the masses of the two bodies, the angular momentum $\mathbf{L}$ and energy $E$ are constants of motion,
$$\mathbf{L} = \mu\, \mathbf{r} \times \dot{\mathbf{r}}, \qquad E = \frac{1}{2} \mu \dot{r}^2 + \frac{L^2}{2 \mu r^2} + U(r). \tag{7}$$
In a scattering problem, the system is not bound and $r$ diverges for $t \to \pm\infty$. In most physical applications, we can only measure the scattering angle and the velocity far before/after the interaction. We can then compare the measured angle with a theoretical result involving an integral. However, if the full trajectory is known, a simpler way to study the system is available.
We study the system far before interaction, i.e. for $t \to -\infty$ and $r \to +\infty$, and call the corresponding asymptotic speed $v_\infty$. We see that the absolute value of angular momentum can be written as
$$L = \mu v_\infty b, \tag{8}$$
where the impact parameter $b$ is the minimum value of $r$ assumed in case of straight motion (i.e. no interaction). Assuming $\lim_{r \to +\infty} U(r) = 0$, we obtain
$$E_\infty = \frac{1}{2} \mu v_\infty^2. \tag{9}$$
On the other hand, since we actually have interaction, the minimum distance $r_{\min}$ in the observed trajectory turns out to be different from $b$, i.e. $r_{\min} \neq b$. At $r = r_{\min}$, having a minimum, we have $\dot{r} = 0$ and the corresponding energy is
$$E_0 = \frac{L^2}{2 \mu r_{\min}^2} + U(r_{\min}) = \frac{1}{2} \mu v_\infty^2 \frac{b^2}{r_{\min}^2} + U(r_{\min}). \tag{10}$$
Conservation of energy implies $E_\infty = E_0$ and provides the following relation for the value of $U(r)$ at $r = r_{\min}$,
$$U(r_{\min}) = \frac{1}{2} \mu v_\infty^2 \left( 1 - \frac{b^2}{r_{\min}^2} \right). \tag{11}$$

This relation enables studying the potential $U(r)$, provided that $v_\infty$, $b$ and $r_{\min}$ are measured. In modeling collision avoidance between pedestrians based on the above framework, we assume that
$$\frac{dU(r)}{dr} < 0 \;\; \forall r \;\Rightarrow\; U(r) > 0 \;\; \forall r. \tag{12}$$
In other words, the force is assumed to be repulsive. Namely, denoting $\mathbf{F}_1$ as the force acting on body 1, and recalling the usual definition $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$, we have
$$\mathbf{F}_1 = -\frac{dU(r)}{dr}\, \frac{\mathbf{r}}{r}. \tag{13}$$
We apply these physical concepts in a pedestrian scenario to model the "collision avoidance potential" between groups and individuals. As mentioned in Section Methods, $r_b$ is inspired by the impact parameter $b$, whereas $r_0$ corresponds to the closest approach $r_{\min}$. Thus, the term $v_\infty$ in Equation 11 should be approximated by using the relative velocity when $r_b$ is computed. But since pedestrian velocities have a small variation, we may consider it to be almost constant. In a similar way, as usual when studying "forces" that determine the pedestrians' cognitive decisions, all masses are considered to be equal (to one) 55 and we may remove $\mu$ from the equation. Finally, since the approach is completely of a qualitative nature, we opt for ignoring the overall constant in Equation 11 and study the following simplified version,
$$U'(r_0) = 1 - \frac{r_b^2}{r_0^2}, \tag{14}$$
which we will refer to as the "collision avoidance potential" (defined as a dimensionless pure number). A comment on Equation 14 is probably needed. This equation does not represent the functional form of the dependence of the potential on $r$. Instead, it shows the value of $U'$ attained at $r_0$ given that the straight-line distance is $r_b$. Different values of $r_b$ allow us to probe different values of $U'$, where the smaller $r_b$ is, the higher $U'$ is. Nevertheless, Equation 14 clearly allows us only to probe values $U' < 1$. This is due to the fact that in the computation of $U'$ the value of the initial kinetic energy is taken as fixed, and we are measuring the probed values of the collision avoidance potential as multiples of such kinetic energy. Note that in particle physics short distances are indeed probed by using very high kinetic energies.
The results are shown and discussed in Section Potential, whereas the details of the computational procedure are described in Supplementary Information Section 7. In addition, in Section Comparison to individual-individual collision avoidance we also show the results concerning a similarly defined potential describing individual-individual collision avoidance.
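The simplified potential of Equation 14 is straightforward to evaluate from a pair of distances. The sketch below assumes that both distances are expressed in the same unit (either meters or interpersonal distance units); the function name is an illustrative choice.

```python
def collision_avoidance_potential(r_b, r_0):
    """Dimensionless potential probed at the observed minimum distance r_0,
    given the straight-line distance r_b (same unit for both)."""
    if r_0 <= 0.0:
        raise ValueError("r_0 must be positive")
    return 1.0 - (r_b / r_0) ** 2
```

When the individual does not deviate (`r_b` equal to `r_0`), the probed value is 0; when it ends up twice as far from the group as the straight path would have taken it, the value is 0.75, and it always stays below 1.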

Results on relative frame pdfs
The group-centered reference frame is particularly suitable to study the 2D distribution of $\mathbf{r}$, i.e. of the position of the individual around the group. Figures 3 and 4 show the 2D distributions in relation to different social relations and interaction intensities of the group, respectively, using the groups' average interpersonal distance as the distance unit. Note that, in order to highlight the specificities of each social attribute as compared to the whole, we depict the difference between a given attribute and the overall 2D average, which is computed as an unweighted average of the distributions of all related cases. Therefore, positive values depict an increased likelihood of presence for the individual, while, reciprocally, negative values depict a decreased likelihood.
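The difference maps described above can be sketched as follows, assuming that the 2D histogram of each social attribute is available as a nested list of counts on a common grid. The function name and the data layout are assumptions for illustration; each histogram is normalized to unit sum before the unweighted average is taken, so that categories with many observations do not dominate the reference.

```python
def difference_maps(histograms):
    """histograms: {label: 2D list of counts}, all with the same shape.
    Returns {label: normalized distribution minus the unweighted average}."""
    normalized = {}
    for label, grid in histograms.items():
        total = sum(sum(row) for row in grid)
        normalized[label] = [[count / total for count in row] for row in grid]
    labels = list(normalized)
    n_rows = len(normalized[labels[0]])
    n_cols = len(normalized[labels[0]][0])
    # Unweighted average: every category contributes equally.
    mean = [[sum(normalized[l][i][j] for l in labels) / len(labels)
             for j in range(n_cols)] for i in range(n_rows)]
    return {l: [[normalized[l][i][j] - mean[i][j] for j in range(n_cols)]
                for i in range(n_rows)] for l in labels}
```

Positive cells in the output then mark positions where the individual is found more often than in the average over categories, and negative cells mark positions where it is found less often.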
The effects of varying social relations are presented in Figure 3. Comparing Figure 3-(a) with Figure 3-(b) and (d), one may notice that individuals do not have a prominent preference to pass on the right or left side of colleagues, whereas they prefer to pass more on the right for couples and on the left for friends (as compared to the overall average). In addition, they pass with a very small distance ($r \approx 0$) more often for families (see Figure 3-(c)) than for other kinds of social relations, which may be due to a more dispersed configuration of family group members 21. On the other hand, in Figure 3-(b) we see very clearly two low probability horizontal stripes, roughly located around $y = \pm 1$. As these stripes correspond more or less to group members' positions, they suggest that the group's abreast formation is rarely disturbed in couples.
Concerning social interaction, the difference with respect to varying intensities is much more noticeable, the most interesting one being between 0 and 3 (see Figures 4-(a) and (d)). Namely, concerning groups annotated as non-interacting (i.e. with 0 intensity of interaction), the center stripe presents positive values, while the lower and upper stripes $y \approx \pm 2.5$ present negative values, indicating that individuals are more likely to maintain a trajectory directly facing the group (possibly even intruding on it) (see Figure 4-(a)). Reciprocally, from Figure 4-(d) we can see that individuals are less likely to position themselves on a colliding trajectory with the group and prefer to place themselves on its side when it has a high intensity of interaction. There are interesting left/right asymmetries in Figure 4, which may be related to the tendency of Japanese pedestrians to move mainly on the left and overtake on the right 70. This tendency may cause low-interaction groups, when they are not intruded on, to have a relatively higher possibility to be passed on their left than on their right, since they are expected to have a higher speed than highly interacting ones. We do not have a clear interpretation for the right/left asymmetry between the couples distribution in Figure 3-(b) and the friends distribution in Figure 3-(d).

Results on the relation between $\bar{r}_0$ and $\bar{r}_b$
We divide the range of $\bar{r}_b$ into bins of 0.5 units and compute the mean and standard error of $\bar{r}_0$ corresponding to each bin. The results are depicted in Figures 5-(a) and (b) and are similar to previous results 27. Smaller values of $\bar{r}_b$ indicate that the straight line trajectory of the individuals would require them to pass very close to the group. In addition, $\bar{r}_b < 0.5$ signifies a distance smaller than half of the group interpersonal distance, which means that the individual would need to intrude on the group (if moving straight).
Concerning social relations, we observe that when $\bar{r}_b < 0.5$, the average value of $\bar{r}_0$ is considerably larger for couples and friends than for colleagues and families (see Figure 5-(a)). In other words, there is a strong resistance against intruding on groups with the former social relations. Concerning intensity of interaction, we have a similar observation for higher intensities of interaction (from 1 to 3, see Figure 5-(b)). These observations make us believe that the social attributes of the group do impact group-individual collision avoidance. Specifically, there is larger avoidance when a strongly bonded group is involved (i.e. couples, friends or groups with high intensity of interaction).
The statistical significance of these results can be assessed through an ANOVA (see Supplementary Information Section 8 for considerations regarding the necessary assumptions). To that end, we compute the p values concerning each bin shown in Figures 5-(a) and (b) and present the results in Figures 5-(c) and (d), respectively. Regarding lower values of $\bar{r}_b$ (i.e. $\bar{r}_b < 1.5$), we observe statistical significance (i.e. $p < 0.05$) concerning both social relation and intensity of interaction. Regarding larger values of $\bar{r}_b$ (i.e. $\bar{r}_b > 2$), there is no statistically significant difference, as can be expected from the overlapping curves in the corresponding regions of Figures 5-(a) and (b). Let us also notice that in Figure 5-(b) concerning the DIAMOR data set, for $\bar{r}_b \gg 1$, we have $\bar{r}_b \approx \bar{r}_0$ regardless of intensity of interaction, in agreement with the hypothesis that collision avoidance can be ignored for such values (Equation 14). The fact that this is not the case in ATC, where we actually observe $\bar{r}_b > \bar{r}_0$ for $\bar{r}_b \gg 1$, is considered to be an effect of the ATC environment being less straight and narrower (see Figure 1-(a)).
To compensate for this effect in the computation of the potential, we perform a linear correction in the computation of $\bar{r}_b$ in Section Potential. The details of this correction are presented in Supplementary Information Section 7. In addition, results concerning the relation between $r_b$ and $r_0$, i.e. values measured in meters and not scaled with group interpersonal distance, are shown in Section Comparison to individual-individual collision avoidance.
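The per-bin ANOVA mentioned above compares, within one bin of the straight-line distance, the observed minimum distances across social relations (or interaction levels). A minimal sketch of the one-way F statistic is shown below; the function name and the list-of-lists input are illustrative assumptions. Only the statistic and its degrees of freedom are computed here; the p value would be obtained from the F distribution, for example with `scipy.stats.f_oneway`.

```python
def one_way_anova_f(groups):
    """groups: list of lists of observations, one list per category.
    Returns (F, df_between, df_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of the category means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of the observations around their own category mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

A large F value indicates that the differences between category means are large compared with the spread inside each category, which is the situation observed for the smallest bins.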

Intrusion
It is noticeable that the observed minimum distance $\bar{r}_0$ reaches particularly low values in some encounters. For instance, the first bin in Figure 5-(b) for intensity of interaction 0 presents an average value of $\bar{r}_0$ smaller than 1. This means that the distance from the center of mass of the group to the individual becomes smaller than the group interpersonal distance (see Supplementary Information Section 5). In such cases, it is likely that the individual is actually intruding on the group instead of deviating, essentially following the straight-line trajectory.
To quantify the frequency of such intrusions, we computed the probability of $\bar{r}_0$ being smaller than 1. Specifically, this is an empirical probability computed as the ratio of the number of observations with $\bar{r}_0 < 1$ to the total number of observations (for a given bin of $\bar{r}_b$). The results are shown in Figures 6-(a) and (b). Here, we see that there is indeed a correlation between the probability of intrusion and the social bonding of the group being intruded on. Namely, individuals have a higher probability to intrude on loosely-bonded groups (i.e. colleagues, families and non- or slightly-interacting groups) than on strongly-bonded groups (couples, friends and strongly interacting groups).
The statistical significance of this observation is assessed through Pearson's $\chi^2$ test and the corresponding p-values are presented in Figures 6-(c) and (d). The difference in probability of intrusion between social relations is significant ($p < 0.05$) when $\bar{r}_b$ is smaller than 1.5. For the intensity of interaction, on the other hand, the difference in probability of intrusion is significant for $\bar{r}_b < 1$.
Note that the average distance of a group member from the group center corresponds to $\bar{r}_0 = 0.5$, i.e. half the group interpersonal distance. The corresponding analysis for the probability of having $\bar{r}_0 < 0.5$ is shown in Supplementary Information Section 6.
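As an illustration of the two computations in this section, the sketch below estimates the empirical probability of intrusion and the Pearson $\chi^2$ statistic for a 2 x k table (intruded versus not intruded, across k categories). The function names are hypothetical, and the closed-form survival function covers only the case of 3 degrees of freedom, which corresponds to comparing four categories; it is a sketch under these assumptions rather than the procedure actually used.

```python
from math import erfc, exp, pi, sqrt


def intrusion_probability(r0_scaled, threshold=1.0):
    """Fraction of encounters whose scaled minimum distance is below threshold."""
    return sum(r < threshold for r in r0_scaled) / len(r0_scaled)


def chi2_independence(intruded, totals):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table.

    intruded[j] is the number of intrusions and totals[j] the number of
    encounters observed for category j. Expected counts must be non-zero.
    """
    not_intruded = [t - a for a, t in zip(intruded, totals)]
    n = sum(totals)
    stat = 0.0
    for row in (intruded, not_intruded):
        row_sum = sum(row)
        for obs, col_sum in zip(row, totals):
            expected = row_sum * col_sum / n
            stat += (obs - expected) ** 2 / expected
    return stat, len(totals) - 1


def chi2_sf_3dof(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)
```

For other numbers of categories, the p-value would have to come from a general chi-square survival function rather than from `chi2_sf_3dof`.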

Potential
As described in Section Modeling, we study $U'(r_0)$ (see Equation 14) to model the "potential" representing the interaction between the group and the individual. To that end, we again quantize the values of $\bar{r}_b$ and compute the corresponding mean values of $\bar{r}_0$ before calculating the values of the potential $U'(r_0)$ for each bin. Interestingly, the potential is shown to be affected by the nature of the social bonding of the group. As a matter of fact, stronger bonds (e.g. couples, high intensity of interaction) generate a "stronger potential" (i.e. with a steeper negative derivative) which, as seen in Sections Results on the relation between $\bar{r}_0$ and $\bar{r}_b$ and Intrusion, "causes" individuals to deviate more and significantly decreases their probability to intrude on the group. On the other hand, loosely-bonded groups (e.g. colleagues and non- or slightly-interacting groups) generate a weaker potential, resulting in a smaller deviation and a higher chance of intrusion.
The discussion above concerns results obtained using distances scaled with the group interpersonal distance; results concerning computations performed using distances measured in meters are shown in Section Comparison to individual-individual collision avoidance.
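Equation 14 is defined in Section Modeling and is not repeated in this section. Purely as a hedged illustration of how a potential value can be obtained from a pair of binned distances, the sketch below assumes the textbook closest-approach relation of two-body scattering, in which the potential at the distance of closest approach, expressed in units of the asymptotic kinetic energy, equals $1 - (r_b/r_0)^2$. This assumed form and the function name are not taken from the paper, and the exact definition of $U'$ may differ.

```python
def scaled_potential(rb, r0):
    """Potential at closest approach in units of the asymptotic kinetic energy.

    Assumption: classical two-body scattering with impact parameter rb and
    distance of closest approach r0 (energy and angular momentum conserved).
    """
    if r0 <= 0:
        raise ValueError("r0 must be positive")
    return 1.0 - (rb / r0) ** 2


def potential_per_bin(binned_pairs):
    """Apply the relation to (mean rb, mean r0) pairs, one pair per bin."""
    return [(r0, scaled_potential(rb, r0)) for rb, r0 in binned_pairs]
```

Under this assumption, an undisturbed encounter ($r_0 = r_b$) yields a potential of zero, and larger deviations ($r_0 > r_b$) yield larger positive values.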

Comparison to individual-individual collision avoidance
Many practitioners simulate crowds on the basis of individuals. Thus, it is interesting to compare the above-mentioned potentials with results obtained for individual-individual interaction. The results (using distances measured in meters, i.e. not scaled by group interpersonal distance) are shown in Figures 7-(c) and (d).
Note that groups are larger (than individuals) and expected to exert a stronger "social force", but they are also susceptible to being disrupted and intruded on (passed at ≈ 0 distance to their geometrical center). Also, while it is expected (statistically) that collision avoidance between individuals is symmetric, it may be that groups interact less than individuals by deviating very little. These effects seem to balance, and potentials for collision avoidance between individuals are quite similar to those with groups.
Nevertheless, it may be seen that potentials describing low-intensity social interactions, colleagues and families typically have a less steep derivative than the one for individual-individual encounters, while the opposite is observed for high-intensity social interactions and (in particular) for couples.

Conclusion
In this work, we analyzed how group-individual pedestrian collision avoidance depends on the group's social relation and social interaction intensity. In detail, we verified that when straight motion (i.e. absence of collision avoidance) would lead to a collision, the actual minimum distance $r_0$ between the individual and the group is a growing function of social interaction intensity, and assumes a higher value for couples and friends. Similarly, individuals have a stronger tendency to "intrude" or "disrupt" a group by passing at a distance comparable to the group interpersonal distance when they face groups with low interaction intensity, colleagues and families, as can be verified both by studying 2D distance probability distributions and by performing a statistical analysis on the probability that the minimum distance becomes smaller than the group interpersonal distance.
We also introduced a "potential" to study the dependence of "intensity of collision avoidance" on relative distance, by mimicking the theoretical modeling of two-body scattering in classical mechanics. This approach, which may be used as a guide in the development of a "social force model" of individual-group interaction, shows again that the potential determining collision avoidance tends to grow much faster with decreasing distance values (i.e. it has a steeper negative derivative) for strongly interacting groups, couples and friends.
The latter result is particularly clear when studied using the group's average interpersonal distance as a length unit. A further comment on this result may be necessary, since the tendency of individuals not to pass through "strongly bonded dyads" (such as couples, friends and strongly interacting dyads) may be due not only to some kind of "social rule", but also to the fact that passing through these groups is actually harder due to the narrower space between their members.
In this respect, we should finally comment also on the results concerning families, which may be a little counter-intuitive in suggesting that families are somehow perceived as weakly interacting and are often "intruded" on 71. It should be stressed that, as reported by Zanlungo et al. 19, the families in the ATC data set are mostly composed of parent-child pairs, who often do not walk abreast, or at least have a weaker tendency to walk abreast. The authors of the original study justify this tendency by referring to "the erratic behavior of children", but it may also be related to a stronger hierarchical structure in a parent-child dyad with respect to couples, friends and colleagues 19. It may thus be argued that the tendency of individuals to approach families at a shorter distance may depend on families being less spatially structured, or correspondingly having a higher tendency to change their spatial structure. Such a role of group spatial structure in individual-group interaction could be the subject of future studies, possibly when larger data sets collected in more suitable environments become available.
We believe that our results and inferences point out interesting variabilities in pedestrian motion due to social aspects of human navigation 72. A valuable implication of our study is that infrastructure design could be adapted to the nature of the social bonding of its users. We can speculate that, for instance, if a particular environment is known to be frequented mostly by strongly bonded groups, such as an amusement park, providing additional space (e.g. by widening corridors or walkways) to allow for collision avoidance may make it more comfortable. Nevertheless, these qualitative considerations should ultimately be corroborated with quantitative simulation models that include our findings. By taking into account the social dynamics of the people using a particular space, designers and architects could create environments that are more conducive to safe and efficient movement. This could help to reduce the risk of accidents and improve the overall user experience. We also hope that using models which account for the expected social composition of the crowd may help in improving the performance of tracking and simulation systems 73.
The recording location is an underground pedestrian street network in a commercial district of Osaka, Japan. The entire underground network comprises several kilometers of walking paths, and it is connected to the Osaka-Umeda railway and underground station complex, which is considered to be one of the busiest in the world (the busiest outside Tokyo) and is visited daily by millions of pedestrians. For example, according to the Osaka municipal transportation bureau, the three metro stations located in the underground area had a daily number of passengers of more than 700K in the fiscal year 2019. The DIAMOR data set includes recordings from a junction of two straight corridors in a relatively peripheral portion of this street network, and we focus on one of these. Similar to the ATC data set, several train stations, business centers, shopping malls etc. are accessible from the recording location, leading to diversity in pedestrian profile.
The recording area is roughly 200 m$^2$ and allows continuous tracking along approximately 50 m. The recording time span is two weekdays and a total of eight hours of recordings are available, which, although shorter than the ATC data set, we consider to be enough for the purposes of this study.
Similar to the ATC data set, it is composed of depth and video information. The depth information is used to derive the trajectories of the pedestrians based on the method reported by Glas et al. 75, and the trajectories can be freely downloaded 59. As a result of this tracking process, the normalized cumulative density map shown in Figure 1-(b) is obtained.
Based on the video information, human coders were asked to annotate groups and individuals (people who do not belong to a group). Of course, it is not wrong to say that each group member is an "individual", but within the context of this study we refer to them explicitly as "group members", and use the word "individual" specifically to refer to people who do not belong to a group. Coders also annotated whether or not members of dyads were engaged in interaction (oral communication, possibly accompanied by non-verbal elements such as gestures or gaze exchange, as defined by Knapp et al. 63), and the corresponding intensity of interaction (evaluated at 4 degrees, from 0 representing no interaction to 3 representing strong interaction). Note that the annotations of the ATC data set are inherently disjoint, i.e. the annotation labels can be considered to be mutually exclusive nominal variables. On the other hand, the annotation labels of the DIAMOR data set can be viewed as ordinal variables (i.e. with a gradual relation). The outcome of the annotation process (i.e. the number of observations for each intensity of interaction) is summarized in Table 1-(b).

B Data preparation
Both the ATC data set and the DIAMOR data set are collected in an ecological environment. Namely, they contain trajectories collected from uninstructed people moving freely. Although they were not recorded secretly (i.e. there were signboards informing the pedestrians that a data collection campaign was being carried out), the pedestrians' awareness of being recorded is anticipated to have a negligible effect on how they move, in particular as compared to participant experiments performed in artificial (laboratory) environments. In that respect, since the data are collected under uncontrolled settings, the tracked trajectories may contain behaviors like waiting, running etc. From the point of view of this study, such cases are not of interest. In order to eliminate atypical/non-characterizing observations, each trajectory is treated as explained below.
Let a group (dyad) be described as an unordered pair composed of two members $p$ and $q$, i.e. $g = (p, q)$, and let $i$ denote an individual. For the sake of simplicity, we reduce a group to a single mobile agent in the data preparation phase. Namely, the location of the group is represented by the group center of mass $r_g$ and its velocity is represented by the group velocity $v_g$. Specifically, for a group $g = (p, q)$, $r_g$ and $v_g$ are computed as the average of the positions and of the velocities of $p$ and $q$ at each time step, respectively.
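The reduction of a dyad to a single agent can be sketched as below. The function name and the representation of trajectories as lists of 2D tuples sampled at common time steps are assumptions made for the example.

```python
def group_state(pos_p, pos_q, vel_p, vel_q):
    """Center of mass and velocity of a dyad as member averages, per time step.

    Each argument is a list of (x, y) tuples; all lists are assumed to be
    sampled at the same time steps.
    """
    r_g = [((xp + xq) / 2, (yp + yq) / 2)
           for (xp, yp), (xq, yq) in zip(pos_p, pos_q)]
    v_g = [((up + uq) / 2, (wp + wq) / 2)
           for (up, wp), (uq, wq) in zip(vel_p, vel_q)]
    return r_g, v_g
```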
Thereby, we treat a group $g$ and an individual $i$ in the same manner and first check whether the trajectory contains a sufficient number of data points. Provided that $|r_{g,i}| \geq 16$, the trajectories are considered to have enough data points for characterizing locomotion. Note that this corresponds to a minimum of 8 seconds of observation, since the sampling time step is 0.5 s. Any trajectory which includes fewer samples is discarded.
Next, we check instantaneous speeds $v_{g,i}$ and remove the trajectories which are associated with too low or too high speeds (i.e. out of walking range). For judging the typical speed range of pedestrians, we referred to the literature on human locomotion. Based on the results reported by Zanlungo et al. 6, $g$ and $i$ are considered to depict typical walking motion if their instantaneous speed lies within the range $0.5 \leq v_{g,i} \leq 3$ (in m/s). Otherwise, they are assumed to be "not walking" and discarded.
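The two filters above can be expressed compactly as in the following sketch. The constant and function names are hypothetical, and the sketch assumes that a trajectory is kept only if every instantaneous speed lies in the walking range; the text does not state whether a per-sample or an aggregate criterion is applied.

```python
from math import hypot

MIN_SAMPLES = 16          # 16 samples at a 0.5 s time step (about 8 s)
V_MIN, V_MAX = 0.5, 3.0   # walking speed range in m/s


def is_typical_walk(velocities):
    """True if the trajectory is long enough and all speeds are in range.

    velocities is a list of (vx, vy) tuples in m/s, one per sample.
    Assumption: a single out-of-range sample discards the trajectory.
    """
    if len(velocities) < MIN_SAMPLES:
        return False
    return all(V_MIN <= hypot(vx, vy) <= V_MAX for vx, vy in velocities)
```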
As mentioned in Section Introduction of the main track, we focus on the effect of the social attributes of the group on collision avoidance. To gain an understanding of those, the peers need to have sufficient visual information about each other. Therefore, we start with conditioning on having a frontal view of each other, which implies moving in opposite directions. In other words, if $g$ and $i$ move in the same direction, one party will be leading and the other following, such that the leading party will not see and thus not be aware of the other, and the following party will have limited information about the leading one (e.g. on social relation, age, interactions, etc.). If they move in opposite directions, however, they will have the opportunity to watch the incoming party and get a sense of its social features. In theory, the group and individual might also approach each other at an angle, but in the studied environments such cases are rare and we do not consider them in this work.
To ensure a frontal view, we detected the relative motion direction of $g$ and $i$ and considered only those $g$ and $i$ which move in opposite directions. Let $\varphi$ represent the angle between the velocity vectors $v_g$ and $v_i$ at a given time instant. Then, $g$ and $i$ are considered to be moving in opposite directions if $3\pi/4 \leq \varphi < \pi$. Note that, in addition to being better suited to our purposes, in the studied "bi-directional" environments, considering opposite relative motion direction also has the advantage that the number of observations associated with it is significantly higher than for other directions (namely, 68% of all observations), which is preferable for the statistical analysis performed in our study.
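A minimal sketch of the frontal-encounter test follows. The function names are hypothetical. Since the angle between two vectors never exceeds $\pi$, the sketch only checks the lower bound and therefore treats exactly antiparallel velocities as frontal, which is an assumption with respect to the strict upper bound stated above.

```python
from math import acos, hypot, pi


def angle_between(v_g, v_i):
    """Angle in [0, pi] between two non-zero 2D velocity vectors."""
    dot = v_g[0] * v_i[0] + v_g[1] * v_i[1]
    norm = hypot(v_g[0], v_g[1]) * hypot(v_i[0], v_i[1])
    # Clamp to guard against rounding slightly outside [-1, 1].
    return acos(max(-1.0, min(1.0, dot / norm)))


def is_frontal(v_g, v_i):
    """True if group and individual move in (roughly) opposite directions."""
    return angle_between(v_g, v_i) >= 3 * pi / 4
```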

C Window of observation
Figure 8. Illustration of the impact of the choice of the window size $W$ on the computation of the straight-line distance $r_b$. In red, a small window violating completeness; in blue, a large window violating atomicity; and in green, an "optimal" window which satisfies both constraints. Note that since we study frontal encounters in the group reference frame, individuals will always enter the window from the right.
As measures at infinity are obviously not feasible, in order to measure the straight-line distance $r_b$ we need to define a window of observation (centered on the group). In particular, this window should satisfy two competing properties, completeness and atomicity. On the one hand, the window needs to be large enough to verify that, when entering and exiting from the side of the window, the individual is not yet (significantly) influenced by the group (i.e. he/she is not yet engaged in an avoidance maneuver). This guarantees that the deviation is entirely contained in the window of observation, hence the notion of completeness. On the other hand, the window should not be too large, so that the assumption that the individual would walk on a straight line in the absence of the group stays valid. Indeed, although pedestrians certainly do not walk on straight lines at all times, at a relatively small scale and in the absence of perturbations (resulting from either external sources such as the environment or other pedestrians, or internal sources such as a change of planned destination) it can be expected that one's trajectory will be close to a straight line (atomicity). Figure 8 illustrates the impact of the size of the window.
Regarding completeness, we referred to the literature on collision avoidance and searched for a reasonable threshold value. Cinelli and Patla found that the "safety zone", i.e. the area in which individuals allow a moving object to approach before initiating an avoidance behavior, is on average 3.73 m 64. Furthermore, Kitazawa et al. showed that pedestrians gaze most at other approaching individuals when they are on average 3.97 m away, and that they seldom look at pedestrians at longer distances than this 65. Therefore, we concluded that 4 m is a reasonable lower bound for considering that their mutual influence is still null.
Regarding atomicity, the environment needs to be taken into consideration. In a straight and wide corridor, like in DIAMOR, the straight-line assumption is better founded than in a more complex environment, like the bent and narrow corridor in ATC, where pedestrians have to follow the curve and will naturally have less straight trajectories. Nonetheless, the discussion provided in Section G suggests that the disparity due to environment geometry can be accounted for with a linear correction. Therefore, we argue that the value of 4 m derived from the completeness condition is adequate also to satisfy the atomicity condition.
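To make the role of the window concrete, the sketch below computes a straight-line distance from the state of the individual when entering the window. It assumes that $r_b$ is the perpendicular distance from the group (the origin of the group-centered frame) to the line obtained by extrapolating the entry position along the entry velocity; the precise definition is given in Section Modeling and may differ in detail. The function name and argument layout are hypothetical.

```python
from math import hypot


def straight_line_distance(p_entry, v_entry):
    """Distance from the origin to the line through p_entry directed by v_entry.

    p_entry and v_entry are the (x, y) position and velocity of the individual
    in the group-centered frame at the moment of entering the window.
    """
    cross = p_entry[0] * v_entry[1] - p_entry[1] * v_entry[0]
    return abs(cross) / hypot(v_entry[0], v_entry[1])
```

For example, an individual entering at (4, 1) and heading in the negative x direction would pass at a straight-line distance of 1 from the group center, whereas one heading exactly towards the origin would have a straight-line distance of 0.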

D Improving accuracy of estimation for the observed minimum distance $r_0$
The trajectories provided in the ATR pedestrian group data set 59 are derived by the algorithm of Brščić et al. 60, whose output rate depends on the rate of sensor readings, which may be non-uniform. In order to make the rate of trajectory samples uniform and also to eliminate the effect of gait and sensor noise, we re-sampled the trajectories at 2 Hz. However, the new time resolution can be too sparse for the purpose of computing the observed minimum distance. As a remedy to this issue, we propose interpolating the position of an individual $i$ between two consecutive time steps $t_k$ and $t_{k+1}$ with its own velocity vector at time $t_k$, $v'_i(t_k)$ (see Figure 9). For each time step $t_k$, the smallest distance between $g$ and $i$ can be computed by measuring the distance from the origin to the line passing through $p'_i(t_k)$, the position of $i$ at $t_k$, and directed by $v'_i(t_k)$, the velocity of $i$ at $t_k$. We can take this distance as the actual minimum distance to the group $g$ in the interval $[t_k, t_{k+1}]$ only if $i$ can reach the position where this distance is observed within $t_{k+1} - t_k$ seconds (0.5 s in this study). We denote the signed distance from $p'_i(t_k)$ to that position with $\lambda$, which can be computed by taking the projection of the vector from $p'_i(t_k)$ to the origin (i.e. $-p'_i(t_k)$) onto the unit vector $v'_i(t_k)/\|v'_i(t_k)\|$,

$$\lambda = -\frac{p'_i(t_k) \cdot v'_i(t_k)}{\|v'_i(t_k)\|}.$$

The distance $\lambda$ is traveled by $i$ in $t_{min}$,

$$t_{min} = \frac{\lambda}{\|v'_i(t_k)\|}.$$

Note that if $t_{min} < t_{k+1} - t_k$, a smaller distance is achieved within the time interval $[t_k, t_{k+1}]$ than at its initial and final instants ($t_k$ and $t_{k+1}$), i.e. the minimum distance for that time interval is achieved at an intermediate instant. Thus, this value is registered as the minimum distance for this time interval. Otherwise, the lower one of $\|p'_i(t_k)\|$ and $\|p'_i(t_{k+1})\|$ is registered. In this case, if $\|p'_i(t_k)\| < \|p'_i(t_{k+1})\|$, it means that $g$ and $i$ are getting further away. If $\|p'_i(t_k)\| > \|p'_i(t_{k+1})\|$, it means that $g$ and $i$ are getting closer and that it is highly likely that in the next time interval they will be even closer. The same operation is carried out for all time intervals and the minimum of all the registered values is used as $r_0$.
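The procedure of this section can be sketched as follows. The function names are hypothetical, and positions and velocities are assumed to be given as (x, y) tuples in the group-centered frame. As an additional assumption, the sketch requires the closest point to lie ahead of the individual (a non-negative travel time), since a negative value would place it before the start of the interval.

```python
from math import hypot

DT = 0.5  # sampling time step in seconds (2 Hz re-sampling)


def interval_min_distance(p_k, v_k, p_next, dt=DT):
    """Minimum distance to the origin registered for one sampling interval."""
    speed = hypot(v_k[0], v_k[1])
    if speed > 0:
        # Signed distance along the heading to the closest point of the line.
        lam = -(p_k[0] * v_k[0] + p_k[1] * v_k[1]) / speed
        t_min = lam / speed
        # Assumption: the closest point must be reached within the interval.
        if 0 <= t_min < dt:
            return abs(p_k[0] * v_k[1] - p_k[1] * v_k[0]) / speed
    # Otherwise keep the smaller of the two sampled distances.
    return min(hypot(p_k[0], p_k[1]), hypot(p_next[0], p_next[1]))


def observed_min_distance(positions, velocities, dt=DT):
    """Smallest registered distance over all consecutive sampling intervals."""
    return min(
        interval_min_distance(positions[k], velocities[k], positions[k + 1], dt)
        for k in range(len(positions) - 1)
    )
```

In a hypothetical encounter where an individual moves along the line y = 0.5 and is sampled at x = 0.6, -0.4 and -1.4, the sampled distances never drop below about 0.64, while the interpolated minimum is 0.5.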

E Scaling by interpersonal distance
Yücel et al. 21 showed that the interpersonal distance between members of a dyad strongly depends on their social relation. For instance, couples were shown to walk with an interpersonal distance significantly smaller than for other social relations (values of interpersonal distance for various social relations and intensities of interaction are shown in Table 2).
This variability may affect the behavior of the individual approaching the group. Namely, an individual $i$ may choose not to intrude on a group $g$ due to insufficient space (between its members) or may prefer to intrude on $g$ due to ample space. So the presence or lack of intrusion may simply be due to geometric circumstances and not stem from social factors (e.g. strength of social bonding). This effect is actually better studied by measuring distances using a common unit (i.e. meters).
On the other hand, when i does not intrude on g, the effect of the social bonding may be better expressed, if distances are measured with respect to the physical size of the group (interpersonal distance between the members), since avoiding a small group at a distance of, e.g. 2 meters, may involve a stronger avoidance behavior, if the group size is smaller.

F Probability of intrusion
In the main track, to study intrusion, we examined the probability of $\bar{r}_0 < 1$, i.e. that the individual reaches a distance from the group center smaller than the average group interpersonal distance. One could argue that an alternative, if not better, definition of intrusion can be based on studying the probability of having $\bar{r}_0 < 0.5$, i.e. a distance smaller than the average distance of a group member from the group's center. For this reason, in this section we perform a similar analysis to the one presented in the main track, but using the value of 0.5. Namely, Figure 10 shows the probability $P(\bar{r}_0 < 0.5)$ and Figure 11 shows the corresponding p-values.
Nicely, the probabilities show a similar trend to those given in the main track and confirm our inference that loosely-bonded groups are more likely to be intruded on than strongly-bonded ones.

G Linear correction to $r_b$
As mentioned in Section C, we tried to calibrate the window of observation in such a way that the trajectory of the individual should be close to a straight line for large values of $r_b$. If the straight-line distance $r_b$ is large, it means that the individual should have enough space to pass comfortably without deviating and we would expect the minimum distance $r_0$ to be somewhat similar to $r_b$.
Nevertheless, it seems clear that the curved and narrow nature of the ATC environment puts a limit on the applicability of the straight-line hypothesis. In particular, as the width of the ATC corridor is comparable to the size of the chosen window of observation (see Figure 1), we can expect that the environment will pose some constraint on the motion of the pedestrians, in particular for large values of $r_b$. This is indeed confirmed by the data. Namely, by looking at Figure 12-(a), relating to the ATC data set, we observe that the curves are all noticeably offset from the $x = y$ (dashed) line for high values of $\bar{r}_b$. As $\bar{r}_0$ is smaller than $\bar{r}_b$, it seems as if the individual steers towards the group.
It is not trivial to propose a geometric model for such deviation, due to the relatively complex nature of the ATC environment, so for simplicity's sake we assume the correction on $r_b$ to be linear. Following this hypothesis, we evaluate the impact of environment geometry by computing an average value of the observed distance $r_0$ (resp. $\bar{r}_0$) for large values of $r_b$ (resp. $\bar{r}_b$) (corresponding to the highest bin in Figure 12), for all groups and individuals. We then compute a correction coefficient $c$ defined as the ratio between the observed average value and the expected value $r_0 = r_b$ (as stated above, when $r_b$ is large, we expect no deviation during encounters). Such coefficients are found to be 0.82 for scaled values and 0.77 for unscaled values. These coefficients can then be used to multiply the values and alleviate the effect of the curvature of the environment. Specifically, when computing the potential $U'$, we replace the values of $r_b$ (resp. $\bar{r}_b$) by the corrected values $r'_b = c\,r_b$ (resp. $\bar{r}'_b = c\,\bar{r}_b$).
In Figure 13, we show the scaled distances for groups with various social relations, along with the line corresponding to the correction coefficient. By definition, the correction fits the various curves for larger values of $r_b$. For further reference, Figure 14 shows the correction coefficient for the unscaled values concerning all groups and individual pedestrians used for the potential of Figure 7 in the main track. The qualitative agreement between the scaled ATC plots and the unscaled DIAMOR ones suggests that the linear correction used to obtain $r'_b$ and $\bar{r}'_b$ is reasonable.
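The linear correction can be sketched as below. The function names are hypothetical, and the sketch assumes that the coefficient is the ratio of the mean observed distance to the mean straight-line distance within the highest bin; averaging per-encounter ratios would be another possible reading of the definition.

```python
from statistics import mean


def correction_coefficient(rb_top_bin, r0_top_bin):
    """Ratio of mean observed distance to mean straight-line distance.

    Both arguments hold the values of the encounters falling in the highest
    bin of the straight-line distance, where no deviation is expected.
    """
    return mean(r0_top_bin) / mean(rb_top_bin)


def corrected_rb(rb, c):
    """Apply the linear correction r'_b = c * r_b to a list of distances."""
    return [c * x for x in rb]
```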

Figure 1. The normalized cumulative density maps for (a) the ATC data set and (b) the DIAMOR data set. The environment is discretized as a 2D mesh with a grid cell size of 10 cm by 10 cm, and the number of observations is counted in each grid cell. Normalization refers to the scaling of this histogram with its maximum value.
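The construction of such a density map can be sketched as follows, assuming positions are given in meters as (x, y) tuples. The function name and the sparse dictionary representation are choices made for the example only.

```python
from collections import Counter
from math import floor

CELL = 0.10  # grid cell size in meters (10 cm by 10 cm)


def normalized_density(points, cell=CELL):
    """Count observations per grid cell and scale by the maximum count.

    Returns a dict mapping (column, row) cell indices to values in (0, 1];
    cells without observations are simply absent.
    """
    counts = Counter((floor(x / cell), floor(y / cell)) for x, y in points)
    peak = max(counts.values())
    return {idx: n / peak for idx, n in counts.items()}
```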

Figure 3. 2D probability distribution of individuals' position $r$ relative to overall average. Positions are shown in the group-centered reference frame and the x axis is aligned with the direction of motion of the group. Each sub-figure depicts the difference between the distribution relating to a certain social relation and an unweighted average concerning all social relations. (a) Colleagues, (b) couples, (c) families, (d) friends. The color scales are adjusted for highlighting the differences.

Figure 4. 2D distribution of individuals' position $r$ relative to overall average. Positions are shown in the group-centered reference frame and the x axis is aligned with the direction of motion of the group. Each sub-figure depicts the difference between the distribution relating to a certain intensity of interaction and an unweighted average of all intensities. (a) 0, (b) 1, (c) 2, (d) 3. The color scales are adjusted for highlighting the differences.

Figure 5. Observed minimum distance $\bar{r}_0$ as a function of the undisturbed straight-line distance $\bar{r}_b$ (a) for various social relations and (b) intensities of interaction of the group. Error bars report standard error intervals. The dashed line corresponds to the $\bar{r}_0 = \bar{r}_b$ linear dependence. p-values for the ANOVA of $\bar{r}_0$ (c) for various social relations and (d) intensities of interaction of the dyad. In (c), results for $\bar{r}_b < 1$ are not displayed as very low values were obtained ($p < 10^{-6}$).

Figure 6. Probability that the distance $\bar{r}_0$ is smaller than 1 (a) for various social relations and (b) intensities of interaction of the dyad. Pearson's $\chi^2$ p-values for the hypothesis of independence of the frequencies of samples verifying $\bar{r}_0 < 1$ (c) for various social relations and (d) intensities of interaction (of the group).

Figures 7-(a) and (b) show the related values. Additionally, to extrapolate outside the available range, a function of the form $k/r^\beta$ is fitted to the data using non-linear least squares, illustrated with dashed lines in Figures 7-(a) and (b).
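As a rough illustration of fitting the form $k/r^\beta$, the sketch below uses ordinary least squares on the log-transformed data. This is a simpler, linearized substitute for the non-linear least squares mentioned above, it generally gives somewhat different estimates on noisy data, and it requires strictly positive distances and potential values. The function name is hypothetical.

```python
from math import exp, log


def fit_power_law(r, u):
    """Estimate (k, beta) in u = k / r**beta via a log-log linear fit.

    Uses log u = log k - beta * log r, so all r and u must be positive.
    """
    xs = [log(x) for x in r]
    ys = [log(y) for y in u]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return exp(my - slope * mx), -slope
```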

Figure 7. Collision avoidance potential $U'(r_0)$ (a) for various social relations and (b) intensities of interaction of the group. Dashed lines correspond to a power function fit of the quantized data. Collision avoidance potential $U'(r_0)$ (c) for various social relations and (d) intensities of interaction of the group. Dashed lines correspond to an exponential fit of the quantized data. (c) and (d) report a comparison to individual-individual (non-group) interaction using non-scaled distances.

Figure 9. Illustration of computation of the observed minimum distance $r_0$.

Table 2. Average interpersonal distance of groups annotated with each (a) social relation (in ATC data set) and (b) intensity of interaction (in DIAMOR data set).

Figure 10. Probability that the distance $\bar{r}_0$ is smaller than 0.5 (a) for various social relations and (b) intensities of interaction of the group.

Table 1. Number of groups annotated with each (a) social relation (in ATC data set) and (b) intensity of interaction (in DIAMOR data set).