Response to perturbation during quiet standing resembles delayed state feedback optimized for performance and robustness

Postural sway is a result of a complex action–reaction feedback mechanism generated by the interplay between the environment, the sensory perception, the neural system and the musculation. Postural oscillations are complex, possibly even chaotic. Therefore fitting deterministic models on measured time signals is ambiguous. Here we analyse the response to large enough perturbations during quiet standing such that the resulting responses can clearly be distinguished from the local postural sway. Measurements show that typical responses very closely resemble those of a critically damped oscillator. The recovery dynamics are modelled by an inverted pendulum subject to delayed state feedback and is described in the space of the control parameters. We hypothesize that the control gains are tuned such that (H1) the response is at the border of oscillatory and nonoscillatory motion similarly to the critically damped oscillator; (H2) the response is the fastest possible; (H3) the response is a result of a combined optimization of fast response and robustness to sensory perturbations. Parameter fitting shows that H1 and H3 are accepted while H2 is rejected. Thus, the responses of human postural balance to “large” perturbations matches a delayed feedback mechanism that is optimized for a combination of performance and robustness.

For over 50 years, responses to perturbations have been used to investigate the feedback control of human balance [1][2][3] . These studies established that human balance is not maintained by stereotyped reflexes. Instead, with development, balance control emerges as the nervous system learns to apply generalized rules for maintaining balance 4 . In healthy individuals this ability to adapt and improve balance in a feedback-driven manner does not appear to decline with age 5 . This observation underscores the current development of perturbation-based balance training protocols to improve reactive balance in the elderly as a way to reduce their risk of falling 3,6 .
A large variety of perturbations have been used to disturb human standing balance including sudden platform translations, pulls and tugs 1,2,7-10 . The nervous system responds with a continuum of "ankle" and "hip" strategies, the exact combination depending on the trade-offs between the required effort and the degree of postural instability to be overcome [11][12][13][14] . Despite these observations, theoretically-motivated investigations suggest that common underlying principles may be at work 8,[14][15][16][17] . Thus, the challenge has become to identify the nature of the governing principles 7,18,19 .
The dynamics of human postural sway during quiet standing with eyes closed is very complex and has been described in terms of stochastic and even chaotic motions 20,21 . The underlying mathematical model can be either an inherently stochastic process 22,23 or a deterministic nonlinear feedback mechanism governed by some kind of intermittent control [24][25][26][27] or their combinations [28][29][30] . One of the simplest physiological mechanisms for generating a chaotic motion is the interplay between a time-delayed, sampled data system and a sensory dead zone, i.e., corrective actions take place only when the sensory inputs exceed some threshold values 21 . Model based analysis of postural sway is therefore a difficult task since the governing deterministic dynamics may be hidden in the seemingly noisy/chaotic response. Here we employ perturbations in the anteroposterior (AP) and the posteroanterior (PA) directions during quiet standing that are large enough to produce excursions that are significantly larger than the magnitude of the fluctuations in postural sway and the size of sensory dead zones. In this way, the response to the perturbation is not affected by the local noisy/chaotic dynamics and the underlying feedback mechanisms can be identified distinctly by numerical fitting techniques.
Inverted pendulum models are widely used to investigate human balance 19,[31][32][33][34][35] and are currently thought to be "functionally correct" 34,36 . Reactive balance control, that is maintaining balance in the response to a perturbation, is inherently a feedback sensorimotor process in which muscles are activated in direct response to task-level error 14,[37][38][39] . However, the time delayed nature of the feedback control has important implications for the feedback controlled response of balance to perturbations. Our concept is illustrated in Fig. 1. This figure shows the stability diagram for an inverted pendulum stabilized by a time-delayed proportional-derivative (PD) feedback controller. The stable region in the plane (p, d) of the control gains is characteristically D-shaped 31,40 . For every choice of the control gains located within this D-shaped region, the upright position recovers from a perturbation; however, not all control gains exhibit the same dynamical response to the perturbation. The perturbed responses range from a monotonic exponential recovery to the upright position to recoveries which exhibit an oscillatory component.
The boundary between the monotonic and oscillatory responses is formed by the node-spiral separation line (black-green dashed line). The exponential decay of the response is shown by solid contour lines: the smaller the value of γ 1 , the faster the response. Namely, |θ(t)| θ 0 e γ 1 t where θ 0 is the initial angle. γ 1 < 0 is called exponential decay rate 41 . The fastest responses to perturbations are for choices of the gains in the lower left quadrant of the stability diagram. The optimal point with respect to the response's settling time is (p * , d * ) (black × marker), which is associated with the maximal achievable decay rate γ 1 = γ * . This parameter point lies on the node-spiral separation line and divides it into two sections indicated by black and green color. Small changes in p or in d in the lower (black) branch of the separation line results in significantly larger changes in the decay rate γ 1 than the same changes do in the upper (green) branch. Hence, the system is more robust to changes in the control gains when operating at a parameter point on the upper (green) branch rather than on the lower (black) one. Note that perturbation of the gains can directly be linked to perturbations in sensory perception since the actual control force is determined as the product of the control gains p and d and the corresponding sensory inputs, the angle θ and the angular velocity θ . This suggests that the control gains should be selected based on a goal function which takes into account both fast response and robustness. This would give an operation point (p,d ) on the upper branch indicated by green marker.
The main goal of this paper is to compare the dynamics of the time-delayed PD feedback model to measured responses to perturbations during quiet standing. Control gains and feedback delays are estimated by fitting the response of the mechanical model to the measured time histories. The fitted parameter point is indicated by (p,d) in Fig. 1b (black marker). We pose three hypotheses related to the location of the fitted control gains.
H1 The fitted control gains are tuned towards the node-spiral separation line, which would indicate that reducing oscillatory response is one of the main goal of the feedback mechanism. This concept can be linked to a critically damped oscillator in the sense that when the proportional gain p (which operates as a kind of artificial active stiffness) is fixed, then the derivative gain d (a kind of artificial damping) is tuned towards the node-spiral separation line and the resulted motion resembles that of a critically damped oscillator. Identified control parameters. The control gains p and d and the time delay τ were estimated for all the 20 trials (10 PA and 10 AP) for all the 10 subjects. The stability charts together with the identified control gain parameters are shown in Fig. 3 for each subject. The stable region (thick black solid curve) and the node-spiral separation line (dashed black-green curve) correspond to the average delay of the overall 20 trials per subject (the average delay is indicated in each panel). The larger the average delay, the smaller the stable region. All the identified control gains are within the stable region and are distributed close to the node-spiral separation line. Parameter points (p * , d * ) (fastest decay) and (p,d ) (closest point on node-spiral separation line) are indicated by black × and green markers for each subject. The basic statistical results of the fitted control parameters τ , p and d are collected in Table 1. The best fitting time delays were found to be in the range of 100 ∼ 200 ms, which is in agreement with different estimates in the literature [47][48][49][50] . The identified control gain values also resemble those in the literature that assumes the same delayed PD feedback models of human quiet standing 49,50 . Settling time versus robustness to parameter changes. Experiments showed that the fitted control gains are slightly larger than the gains (p * , d * ) corresponding to the fastest decay. An explanation for this is that the sensitivity of the exponential decay rate γ 1 to parameter changes is different at different sections of the nodespiral separation line. Note that the response is bounded by |θ(t)| θ 0 e γ 1 t , therefore changes in γ 1 are amplified through an exponential function. The change of γ 1 along the node-spiral separation line is shown in Fig. 4 for the parameters of Subject 9. Panel a shows the stability diagram with some sample values of γ 1 . It can be seen that γ 1 changes faster in the lower (black) section of the node-spiral separation line than in the upper (green) one. Panels b and c show the change of γ 1 as function of p and d, respectively. The system with control gains selected from the lower branch is more sensitive to changes in p and d than the system corresponding to the upper (green) www.nature.com/scientificreports/ www.nature.com/scientificreports/ branch. Actually, the parameter point (p * , d * ) is infinitely sensitive to negative changes in p and d, which can be seen from the vertical slope of its tangent in panels b and c. When applying ±10 % perturbation in the gains (p * , d * ) , then, in the worst case, γ 1 changes from γ * = −2.98 to −1.84 . The same ±10 % perturbation on the gains (p,d) , in the worst case, results in changes of γ 1 from −2.59 to −2.38 . Hence, when perturbations/uncertainties are present in the sensory perception, then it is more beneficial to select the control gains form the upper (green) branch of the node-spiral separation line than from the lower (black) one. Based on this observation, three measures are introduced to describe the response in terms of the control gains. Let the mean of the fitted control gains be denoted by (p,d) . First, the oscillatory nature of the response related to hypothesis H1 can be described as the distance between the point (p,d) and closest point (p,d) of the node-spiral separation line (see Fig. 1). Point (p,d) is defined such that (p −p) 2 + (d −d) 2 has to be minimal (for this, the control gains have to be normalized by introducing the dimensionless time t = t/τ ). Second, the decay of the response (i.e., settling time) related to hypothesis H2 can be characterized by the distance between the fitted parameter point (p,d) and the parameter point (p * , d * ) associated with the fastest decay. Third, the combined concept of fast decay and robustness to uncertainties in the control gains related to hypothesis H3 can be characterized by the relations p > p * and d > d * .
The above discussion implies that the goal function might not be to achieve (p, d) = (p * , d * ) but to tune the control gains to (p, d) = (p,d) , which guaranties a lower limit to γ 1 even in the case of perturbations of the control gains. This concept minimizes the settling time (e.g., maximizes the magnitude of γ 1 ) while at the same time preserves robustness to static perturbations in the control gains (p, d).
Hypothesis H1: node-spiral separation line-accepted. Control gains associated with oscillatory and non-oscillatory responses are separated by the node-spiral separation line. Hypothesis H1 can be tested using the distance between the identified mean control gains (p,d) and the closest point (p,d) on the node-spiral separation line. First, the normality of the data was checked by the Anderson-Darling test and it was found that the data are not normally distributed. Therefore, Hypothesis H1 was tested by the non-parametric Wilcoxon signed-rank test ( signrank(p,p) and signrank(d,d) in Matlab 2017b), see Table 2. Results show that Hypothesis H1 related to the location of the proportional gain p is accepted for most of the subjects (8 out of the 10) and also for the overall data. H1 related to the location of the differential gains d is accepted for all the subjects. This observation confirms that the control gains are tuned to be close to the node-spiral separation line.   Table 2. In case of most of the subjects and also in case of the overall data, Hypothesis H2 was rejected. Note that this is a weak rejection, since 7 out of the 10 subjects was rejected for p and only 4 out of the 10 for d. This analysis shows that although the fitted control gain pairs seem to be distributed close to the control gain pairs that yields the fastest decay (i.e., close to the by black × markers in Fig. 3), this hypothesis is not verified statistically. An explanation to this observation is that the parameter point (p * , d * ) is infinitely sensitive to small changes in the control gains, hence to the small changes in the perceived sensory feedback.
Hypothesis H3: fast decay of the response with robustness-accepted. The fitted control gains was shown to be close to the point (p,d) on the node-spiral separation line in H1. Now it is to be checked whether (p,d) lies on the upper or on the lower branch of the separation line. For 9 out of the 10 subjects, it was observed that p > p * and d > d * , hence (p,d) lies on the upper (green) branch. This confirms that the control gains are indeed tuned to the more robust section of the node-spiral separation line, hence H3 is accepted. Therefore, it is a plausible assumption that the control gains are tuned to achieve fast decay ( γ-stability) but at the same time allow some variations in the gains or in the sensory feedback.
No difference between PA and AP parameters. The difference between the fitted control gains obtained for the PA and the AP trials was analyzed using Wilcoxon signed-rank test. No significant difference was found (the p-value for gain p was v p = 0.922 , for gain d it was v p = 0.557 and for the delay τ it was v p = 0.334 ). Hence, both AP and PA balancing process can be modeled by delayed feedback with control gains tuned close to (p,d).
Variation of the fitted gains over the trials-no learning. The variation of the fitted control gains over the 10-trial series is shown in Fig. 5 in order to check whether the control gains are coherently tuned towards certain region of the plane (p, d). There is no clear trend in the change of the fitted parameters (either in the mean or in the variations), which suggests that learning process was not present during the trials. Hence, reacting to perturbation during standing still can be considered as an already learned and acquired feedback mechanism.
Effect of passive stiffness. The above results have been obtained for the mechanical model where the passive ankle stiffness was k t = 0.91mgh 58 . In order to check the validity of the results, the same calculations and the same parameter estimations were performed for k t = 0.67mgh too, which is in the lower region of the physiologically plausible stiffness values 59 . Wilcoxon signed-rank test shows that Hypothesis H1 is accepted for the overall data for both the proportional control gains p ( h = 0 , v p = 0.478 ) and the derivative gain d ( h = 0 , v p = 0.970 ), while Hypothesis H2 is rejected for p ( h = 1 , v p = 0.008 ) and is weakly accepted for d ( h = 0 , v p = 0.067 ). Thus, the main results regarding the location of the fitted control gains does not change significantly with the value of the passive ankle stiffness.

Discussion
The results confirm that recovery of quiet standing after a sudden perturbation can well be described by a delayed state feedback mechanism described by three parameters: the reaction delay τ , the proportional and the derivative control gains p and d, respectively (see Methods for validation). While the reaction delay is an inherent feature of the control process, the control gains can be tuned to improve performance, namely, to reduce oscillations and www.nature.com/scientificreports/ settling time. Parameter fitting shows that the control gains are tuned such that the response is non-oscillatory (H1) and result in fast recovery even in case of parameter uncertainties related to the perception of the angular position and the angular velocity (H3). Hence, the typical response is fast and non-oscillatory that resembles the dynamics of a critically damped nondelayed mechanical system 46,51 . This feature of the response is also in agreement with delayed models of human standing in the sense that the control gains are located in the lower left regions of the stability region 42,47,52 . No significant changes were observed on the parameters over the trials implying that the identified feedback mechanism has been already learned and practiced before during the activities of daily living. The effect of learning process is more pronounced when a new and unknown task is to be performed, e.g, ball-and-beam balancing 44 , balancing on balance board 53 , beam walking 54 or combined quiet standing and stick balancing 27 . A question arises whether practice or other techniques can be developed to further improve the performance against sudden perturbations.
It shall be mentioned that other than delayed PD feedback is also a possible concept to describe the stabilization process. Several ideas are available in the literature, such as intermittent feedback where intermittency can be either a control logic utilizing the structure of stable and unstable manifolds in the phase plane of the inverted pendulum [24][25][26][27]55 or an inherent consequence of the uncertainties in the operation of the sensory system, e.g., sensory dead zones 19,29 . Further possible control concepts are acceleration feedback 32,56 , predictor feedback 39,43 or an event-driven combination of ankle and ankle-hip strategies 25 , just to mention a few. An advantage of the delayed PD feedback model is that while it is widely used in the literature 26,31,33,47,50 , it accounts with the two most important features of neural motor control: (1) actuation is performed based on perceived sensory signals; and (2) there is a reaction time delay between sensory perception and action.
The postural responses to anterior (AP) and posterior (PA) perturbations are very complex 1,4,61,62 . For AP perturbations the force generated by contraction of the calf muscles is resisted by the reactant force generated by the standing surface. Presumably this may help to offset the decrease in passive ankle stiffness as the dorsiflexion angle increases [58][59][60][61] . In contrast, for PA perturbations the contraction of the calf muscles is not opposed by the www.nature.com/scientificreports/ surface reactant force. Thus in order to maintain balance coactivation of the anterior and posterior shank muscles becomes important and the restoration of balance depends more on the movements of the trunk 62 . This may explain why the angle θ is slightly larger for PA than AP responses (see the peaks in Fig. 2d).
We have shown that when the tilt angle ( ≈ 3 • ) is much larger that the proprioceptive sensory dead zone ( ≈ 0.05 • − 0.08 • ), the feedback control of the responses to AP and PA perturbations are well described by a simple PD feedback control mechanism for the stabilization of an inverted pendulum. The surprising observation is that the values of the feedback gains are the same for both responses. Thus, as pointed out previously in a different context 34 , modeling human balance using an inverted pendulum works quite well. It may be possible to understand the basis for this observation by using more complicated models for balance control 24,25 . However, it is more likely that the responses to larger perturbations are more relevant for understanding the etiology of falls than investigations into the subtle nature of the fluctuations in balance that occur during quiet standing. Thus, we anticipate that the observation that human balance control responses to large perturbations resembles that of a delayed state feedback optimized for performance and robustness will greatly simplify investigations into the nature of human falls.

Methods
Mathematical model. An inverted pendulum model for human standing balance is shown in Fig. 1a.
Briefly the human body is modeled as a homogeneous rod of mass, m, pivoted on a joint A 30,32 . The equation of motion takes the form where h denotes the distance between the center of gravity and the ankle joint A, g is the acceleration due to gravity, J A is the moment of inertia with respect to point A, and θ is the general coordinate which describes the angular position of the body with respect to vertical. There are two types of torques which interact to stabilize the upright position. First, there is the passive stiffness of the ankle related to the mechanical properties of the foot, Achilles tendon and aponeurosis. The contribution of these forces to balance is modeled by a torsional spring of stiffness k t 19,21,30 .
The intrinsic mechanical stiffness of the ankle is not sufficient to maintain stability during quiet standing and contractions of parallel connected calf muscles are required. Thus active muscle contractions produce feedbackdriven torques, Q(t), which act across the ankle joints. The proportional-derivative feedback control had the form where τ is the reaction time delay, and P and D are, respectively, the proportional and derivative gains. Substitution into (1) and linearization about the θ = 0 upright vertical position gives the delay-differential equation where is a system parameter and are normalized control gains.
The stability of (3) can be determined using the D-subdvision method 40 . Substitution of the exponential trial solution θ(t) = Be t into (3) gives the characteristic function The characteristic equation D( ) = 0 has infinitely many complex roots, i ( i = 1, 2, . . . , ∞ ), which are called characteristic exponents. The system is stable, i.e, the solution θ(t) converges to 0, if Re( i ) < 0 for all i. Taking = γ ± iω and setting γ = 0 , the equation D( ) = 0 gives the transition curves These parametric curves delimit the D-shaped stability region shown in Fig. 1b.
In order to characterize the response associated with different parameter pairs (p, d), the general solution has to be analysed, where B i is the complex amplitude corresponding to i . The values of B i 's are determined by the initial functions (initial perturbations) during t ∈ [−τ , 0] , but the stability is independent of these parameters. It B i e Re( i t) cos(Im( i t)) + i sin(Im( i t))  ( 1 ) gives the oscillation frequency. In mathematical terminology, γ 1 = Re( 1 ) < 0 is called exponential decay rate and the system is said to be γ-stable if γ 1 ≤ γ < 0 41 . Figure 1 shows the dynamic behaviour of (3). The stable parameter region (where γ 1 < 0 ) is bounded by black thick curve. Control gains out of the stable region results in either increasing oscillations or exponential growth, hence, in both cases, falling. Within the stable region, thin curves represent different contour lines of γ-stability. The larger the magnitude of γ 1 the shorter the settling time. The fastest response is obtained when (p, d) = (p * , d * ) . Thus, p * and d * can be considered as optimal gains with respect to settling time. This concept can be associated to the terminology of critical damping of nondelayed models, which plays an important role in modelling the response to sudden perturbations during standing still 46,47,51 .
The stable region can be separated into two parts based on the imaginary part of the dominant root 1 . Darker shaded region to the left from the black-green dashed line is associated with Im( 1 ) = 0 . In this case, the dominant solution component B 1 e 1 t is non-oscillatory. Lighter shaded region to the right of the dashed line is associated with Im( 1 ) > 0 and the corresponding solution component reads B 1 e 1 t +B 1 e¯ 1 t , which is oscillatory with angular frequency Im( 1 ) (here ¯ 1 and B 1 are the complex conjugate of 1 and B 1 , respectively). The black-green dashed line is called node-spiral separation line since it separates node type and spiral type solutions 44 . Note that the point (p * , d * ) corresponding to the fastest response lies on the node-spiral separation line. The node-spiral separation line can be divided into two parts based on the location of the dominant (rightmost) roots. In the lower branch (black dashed line) in Fig. 1b, the rightmost characteristic root is real and has a multiplicity of 2. In the upper branch (green dashed section), a real and a complex pair of characteristic roots coexists with the same real part. At parameter point (p * , d * ) , the rightmost root is real ( 1 = γ * 1 ) and has a multiplicity of 3. Besides the actual values of the exponential decay rate γ 1 , its robustness to changes in the control parameters is also an important feature of the control process. ε p relative error in p and ε d relative error in d alter the control force as hence this perturbation can also be implemented as perturbation in the sensory perception of θ and θ with the same relative error ε p and ε d , while the gains p and d are constant. This suggest a sensitivity analysis of γ 1 to changes in p and d.
Participants. We carried out the experiments with 10 subjects (8 males, 2 females) whose parameters and related statistical data are shown in Table 3. The subjects had no self-reported medical conditions which could affect their ability to perform the required tasks. The research was carried out in accordance with relevant guidelines and regulations following the principles of the Declaration of Helsinki. All subjects provided written informed consent for the procedures, signed a General Data Protection Regulation (GDPR) form and were given the opportunity to withdraw from the study at any time. The research project and the study protocol was approved by the Faculty of Mechanical Engineering, Budapest University of Technology and Economics.
Procedure. The concept of the measurements is shown in Fig. 2. We perturbed standing balance by the unexpected release of a resisting force 7 . While standing comfortably the subject resists a horizontal force, F H provided by hanging a weight via a rope that was connected to the subject by a body harness. A constant force F H was applied either in the PA or in the AP direction (Fig. 2a,b, respectively). Under these conditions the subject's www.nature.com/scientificreports/ preferred standing position was slightly tilted in order to resist the applied force. The force was released manually using a bolt mechanism at an unexpected moment. This causes an initial sway in the direction opposite to the released force as shown by the peaks in Fig. 2c, d. After some transients, subjects found their new vertical equilibrium (normal posture) and they kept on standing in this position for t s = 15 s. Subjects were instructed to keep their hip and knee joints in a constant extended position and to recover their balance without flexing their knees or hips or moving their arms. The applied force F H shown in Table 3 was the largest one that the subject was able to resist without any difficulties. Each subject performed 10 trials with F H applied in one direction (either PA or AP) followed by 10 trials in which F H was applied in the other direction. In order to prevent carry order effects, the direction of F H for the first 10 trials was chosen randomly. The time instant when the weight was released is t 0 . Maximum excursion was reached at time instant t = t 1 , normal posture was recovered at t = t 2 , after that subjects kept on standing quietly for t s = t 3 − t 2 = 15 s and the balancing trial was terminated after time instant t = t 3 . Response was evaluated and parameter fitting was performed over the period t ∈ [t 1 , t 1 + t p ] with t p = 10 s. The time signal over the period t ∈ [t 2 , t 3 ] was used to calibrate the normal posture such that the mean value of the tilt angle θ(t) during this period was zero.

Data collection.
A high-speed motion capture system (8 synchronized OptiTrack Prime13 cameras, 120 Hz) was used to measure the three dimensional position (x i (t), y i (t), z i (t)) of spherical reflective markers (diameter 16 mm ) where the subscript i refers to the location of the markers: i = 0, 1, 2, 3, 4 respectively stand for the ankle, knee, hip, shoulder, and head. Projecting all movements to the anterior-posterior plane (x, z), the absolute tilt angles can be calculated as where i = 1, 2, 3, 4 respectively indicate ankle-knee, ankle-hip, ankle-shoulder and ankle-head angles.
Validity of the single inverted pendulum model. Tilt angles θ i ( i = 1, 2, 3, 4 , ankle-knee, ankle-hip, ankle-shoulder and ankle-head angles) were calculated based on the marker positions located on the ankle, knee, hip, shoulder and head, respectively. Measurements show that most of the corrective motion happen at the ankle joint as the participants were instructed to keep the knee and hip joints fixed. The angle across the knee joint was found to be small, i.e., θ 1 − θ 2 ≈ 0 , and was further reduced at perturbation onset by contraction of the rectus femoris muscle 45 . Hence, θ 2 ≈ θ 1 provides a good measure of the ankle across the ankle joint. The distribution of ankle-hip angle θ 2 and the difference between the ankle-shoulder and the ankle-hip angles ( θ 3 − θ 1 ) (10) θ(t) = 1 3 (θ 2 (t) + θ 3 (t) + θ 4 (t)).