sEMG-assisted inverse modelling of 3D lip movement: a feasibility study towards person-specific modelling

We propose a surface-electromyographic (sEMG) assisted inverse-modelling (IM) approach for a biomechanical model of the face to obtain realistic person-specific muscle activations (MA) by tracking movements as well as innervation trajectories. We obtained sEMG data of facial muscles and 3D positions of lip markers in six volunteers and, using a generic finite element (FE) face model in ArtiSynth, performed inverse static optimisation with and without sEMG tracking on both simulation data and experimental data. IM with simulated data and experimental data without sEMG data showed good correlations of tracked positions (0.93 and 0.67) and poor correlations of MA (0.27 and 0.20). When utilising the sEMG-assisted IM approach, MA correlations increased drastically (0.83 and 0.59) without sacrificing performance in position correlations (0.92 and 0.70). RMS errors show similar trends with an error of 0.15 in MA and of 1.10 mm in position. Therefore, we conclude that we were able to demonstrate the feasibility of an sEMG-assisted inverse modelling algorithm for the perioral region. This approach may help to solve the ambiguity problem in inverse modelling and may be useful, for instance, in future applications for preoperatively predicting treatment-related function loss.


EMG-assisted static optimisation
Our EMG-assisted inverse modelling algorithm is based on the inverse tracking controller in ArtiSynth developed by Stavness et al. 15 . They used a combined movement target term and an l 2 -norm regularisation term, which resulted in a quadratic programming problem. In the current paper, we stacked the position coordinates of a set of ten tracked 3D marker points on the lips in a 30D vector k z ( ) t where k is the discrete time index. For brevity, we shall use the notation z t instead of k z ( ) t . The model-predicted positions z(k) depend on − k a ( 1), which is the vector of muscle activations at time − k 1, and on the previous state − k z ( 1). This is denoted by Note also that the elements of a are limited to the interval [0,1]. The technology of sEMG provides indirect measurements of the innervation of each muscle. These measurements provide quantitative indications of the activations and are therefore denoted by a t , which gives rise to the following quadratic cost function: f a z M f a z a Aa a a D a a a a E a a ( ) 1 2 ( ( ) ) ( ( ) ) 1 2 prev . The matrices M, A, D, and E are matrices that weigh different cost aspects. The term with M assures that model positions are close to measured positions. The term with A is a regulation term to tame the found activation signals. The term with D prevents large fluctuations of the found activations. Finally, the term with E assures that the estimated activations are consistent with the measured sEMG signals. In our experiments, the numerical values of the matrices were as follows: (0 005), and = diag emg E ( ) val or = diag E (0) in case inverse modelling is performed without sEMG tracking. emg val was determined during the experiments.
To minimise the cost function in equation (1), the expression was worked out to a form: T T T T in which irrelevant terms in equation (1) were dropped, and a linearised approximation of the state-space model was used based on Taylor series expansion. Equation (2) is recognised as a quadratic programming problem for which stable, numerical solutions are available. The seed for the inversion was always set to the estimated muscle activity of the previous frame. The initial frame's seed was always set to zero muscle activity.
To acquire 3D lip movements, we tracked six optical face markers  ∈ ( X ) OR 18 for head orientation and ten optical lip markers  ∈ (X ) 30 at 100 frames per second using a triple camera set-up (avA1000-100gc, Basler AG, Ahrensburg, Germany), which we had developed for assessing tongue mobility and capturing tongue movement after hypoglossal nerve stimulation 8,39 (Fig. 1).
We asked the volunteers to perform six different instructions once: A. purse lips, B. raise upper lip, C. depress mouth corners, D. voluntary smile, E. draw mouth corner to the left, then to the right, and again to the left, and F. purse lips to closed-mouth smile to purse lips (Eskes et al. 7 , Fig. 2).
The experiments were approved by the Medical Research Ethics Committee of the Netherlands Cancer Institute and all volunteers gave written informed consent.
Finite-element face model. We performed inverse modelling on the generic face model used in Eskes et al. 7 (Fig. 1), which was based on the work performed at ICP/GIPSA and TIMC-IMAG laboratories in Grenoble 40,41 , with details published by Nazari et al. 42 . Their ANSYS ® model was ported to ArtiSynth and was named the reference face model [43][44][45] . With soft tissues represented in three layers of elements, this model had 6342 elements (6024 linear hexahedral and 318 linear wedge) and 8720 nodes. Fourteen muscle groups were available as muscle fibres. We created finite-element muscles, which were defined as the elements surrounding the muscle fibres within a radius of 5 mm. The elements of the Orbicularis Oris muscles were manually assigned. All these muscle elements were given muscle properties as described by Blemker et al. 46 . The bony parts, the mandible and maxilla, were modelled as rigid bodies. We used literature-based common muscle model parameters for all volunteers 7,11 , with the exception of maximum muscle stress (σ max ). We optimised the stress parameter per volunteer starting at 300 kPa and gradually decreased σ max repeatedly with 10 percent until the simulation ran smoothly without creating inverted elements. Simulations were performed on two workstations with intel Xeon core and one laptop computer with an intel i7 core. sEMG to normalised model activations. The model used Orbicularis Oris Peripheralis (OOP) and Marginalis (OOM) definitions. Therefore, these activations were constructed from the measured OOS and OOI activations, taking into account the information about activation patterns described by Flynn et al. 11 . The Buccinator (BUC), the Depressor Labii Inferior (DLI), and the Levator Anguli Oris (LAO) were not directly measured but derived from the measured muscles as follows:  In previous research, we found the following procedure to be optimal for transforming measured sEMG signals into normalised muscle activations [5][6][7] . We first calculated the Willison Amplitude with = s 10 mV lim over sliding windows of 200 ms with maximum overlap: The feature g m (t, i, r) was calculated from the measured sEMG s m (t) of muscle m, where t was the time index of the EMG signals, and n the running time index within each sliding window consisting of N samples. This was done for all instructions i and repetitions r (in this case = r 1). The feature g m (t, i, r) was normalised according to: Registration of measured 3D lip markers to generic face model. As each face has unique dimensions, we had to apply a registration to allow for movement tracking and root mean square (RMS) error comparison of the generic face model's lip markers with the measured lip markers. We registered each measured coordinate according to equation (10): Performance measures. To perform quantitative evaluation, we used the RMS error, e pos , that was calculated over time and over the markers via: With k being the discrete time index, K the number of time samples, and Z d (k) the model's lip marker position coordinates. = D 30 reflects the dimensions, i.e. 10 markers with 3 coordinates each. The factor 3 was introduced because we wanted to express the RMS in terms of distances, rather than in terms of coordinates.
The 3D correlation coefficients were calculated as described by Pitermann et al. 36 . The mean position μ z of a 3D lip marker trajectory, with samples = x y v Z ( , , ) t t t t , was calculated with equation (12): The standard deviation σ z of Z t was calculated with equation (13): The 3D correlation coefficient ρ 3D between 3D landmark trajectories Z t and X t was calculated with equation (14): The RMS error was also calculated for the activations (e act ) according to equation (15) with g(t) being the normalised feature values and a(t) the inverse calculated activation values, whereas Pearson's correlation coefficient was used as an activation correlation measure. For all experiments, we compared the inverse calculated activation signals with the original sEMG features using the RMS error and Pearson's correlation coefficient. Also, the movement tracking errors (e pos and ρ 3D ) were calculated for all experiments. Together, these measures give an indication of performance.

Experiments
In this study, we performed three different experiments to investigate the added value of sEMG-assisted inverse modelling: I. A simple muscle contraction to test feasibility of the model and implementation of the inverse methods II. Inverse simulations with synthetic data produced by the sEMG-driven forward model. Inverse modelling was guided by 3 different sEMG constraints: no constraint, using all muscles (act all ), and using relevant muscles (act rel ). By comparing the results of these three constraints, we could test our method for feasibility inside the mathematical universe of the face model. III. Inverse simulations with measurement data containing 3D position data of ten lip markers and sEMG data of fourteen facial muscles. This experiment was conducted to assess the contribution of sEMG in a realistic situation.

Experiment I: Test Cost term implementation by means of a simple point-mass system. Goal
and experimental set-up. To test our implementation of the cost function, we first created a simulated muscle activation pattern, contracting the north-north-east, north-east, and east-north-east muscle bundles of the point-mass system as shown in Fig. 2 47 . It should be noted that the muscles have different maximum isometric forces, the thick muscles being more powerful than the thinner muscles. Next, inverse modelling was performed, first alternating the cost terms and finally using all cost terms at once. We expected to find that IM with each cost-term alone would not result in calculated IM activations that were similar to the simulated activation patterns, except for IM with the sEMG term, which would probably mimic the forward simulation. When using all cost terms together, we expected there would be a trade-off between the different cost terms, which would likely cause a result that was less perfect but more usable in the real application. In line with logic, when testing a cost term alone, we set its weighing factor at one. When testing all cost-terms together, we set the various weighing factors as described in section 2 Results. For the point-mass system, movement tracking errors were similar in all simulations, whereas activation patterns differed greatly. Using the motion term alone produced a very stiff system, whereas the l 2 -norm distributed the forces over the different muscles in the same way the damping term did. Including only the sEMG term showed minimal differences between the inverse calculated activation and the simulated activation and resulted in a good forward solution (e pos ). When using all cost terms together, including our sEMG term, we found that muscle activation patterns were still good (Fig. 3) while used muscle activation strategies improved considerably over performance with individual cost terms or all cost terms combined with exclusion of the sEMG term. However, it should be noted that the solution depends on the weighing factors of the cost terms, e.g. when too much sEMG information is used, the result will mimic the forward solution.
The results were not perfect because of the other cost terms in the objective function and because of integration, which adds noise. Even when we activated only the sEMG target term, there was still a small error between the inverse calculated activations and the simulated sEMG pattern used in forward modelling. Larger errors occurred when we applied all cost terms in the inverse modelling of the point-mass model, which is a direct consequence of taking into account all cost terms, where the sum of all terms should be small, instead of only minimising the sEMG term.
Conclusion. To conclude, these experiments justified our approach and showed that sacrificing only a little performance in movement tracking resulted in major improvement in muscle activation tracking. Neither the use of any original cost term by itself nor any combined use of cost terms resulted in the correct muscle activation strategy. Incorporation of the sEMG cost term greatly improved the estimated muscle activations while keeping movement tracking orders in the same range. The weighing factors influence the result and should be determined experimentally for the next experiments. Experiment II: Inverse modelling using simulated data. Goal and experimental set-up. To test the inverse modelling approach within the mathematical universe of the face and assess its feasibility, we started with a standard inverse-modelling approach 15 . To first evaluate this approach in a simple situation, we used our forward-modelling results as motion targets for this experiment 7 . After activating the relevant muscles per instruction (act rel ), the forward simulation produced 3D trajectory data of the lip markers. Since this movement lies within the range of the model (position, acceleration) there is no need for registration, which could induce error, and the movement can function as a first indicator of feasibility. Figure 4 depicts the mean activations and their standard deviations based on all volunteers for the measured muscles. For use as input for the forward model, they were adjusted with equations (3) to (7). In this experiment, we used three constraints for the IM sEMG term: no sEMG, including all muscle activations (act all ), and including relevant muscle activations (act rel ). Thus, the sEMG term's penalty matrix E was set to zero if no activation targets were used, while we experimentally obtained the optimal value using three different values for emg val to get an idea of the influence of the sEMG term: will be made. In this experiment, all muscles were used (act all ). After obtaining the optimal emg val , the constraints act all and act rel were tested.
Results. The influence of the sEMG cost term and thus the optimal weighing factor can be derived from Fig. 5. All volunteers show the same pattern: a weighing factor of × − 5 10 3 actually results in forward modelling as it depends too much on the muscle activations patterns, whereas × − 5 10 4 appears to be the optimal value of all tested factors. Table 1 gives the RMS error between the target lip markers and the models' lip markers e pos averaged over all instructions and volunteers for experiments II and III, as well as the e act between the models' calculated activations and measured muscle activations. Similarly, Table 2 shows the 3D correlation coefficients ρ 3D between model markers and measurement markers and Pearson's correlation coefficients ρ between calculated model activations and measured muscle activations.
As we evaluate these experiments, some comments have to be made. The experiments confirm the load-sharing problem: three different activation strategies showed similar performances in 3D lip movement tracking with a mean ρ 3D of 0.93 (no constraint), 0.93 (act all ), and 0.92 (act rel ), while the correlation with the normalised sEMG features varied: 0.27 (no constraint), 0.44 (act all ), and 0.83 (act rel ), respectively, illustrating different activation strategies. The forward solution was created with act rel , leading to good correlations in the experiment with act rel constraint (mean ρ = . 0 83). Like in experiment I, the correlations were not perfect because of the other cost terms in the objective function and because of the noise added by integration.
Although we cannot perform statistical tests that will be reliable because of our small data set, some clear trends can be seen. Looking at the RMS errors, we note that the e pos of no sEMG constraint was about the same as with act all constraint, whereas for act rel the e pos was always higher than the other two. The activations errors e act were always lower for act rel constraint than the other two constraint, except for OOM and BUC. More surprisingly, the act rel constraint resulted in a higher e pos , while we had expected the most accurate results from the use of act rel as it was used in the forward simulation. Presumably, the influences of other cost terms and integration and the optimisation of muscle stress must have caused inaccuracies that resulted in better (though not perfect) estimated activations, sacrificing a little in motion tracking performance.  Experiment III: Inverse modelling using measured data. Goal and experimental set-up. The goal of experiment III was to apply our new sEMG-assisted IM approach on real data and test its performance. To do so, we used measurement data obtained from healthy volunteers. The motion targets were obtained from recorded position data registered to the generic face model with equation (10). The sEMG term's penalty matrix E was set to 0 in case of no sEMG constraint and to = × − emg 5 10 val 4 in case of the sEMG constraint act all (as determined during the previous experiment, see Fig. 4). Tables 1 and 2 show the RMS errors and the correlation coefficients, respectively. Congruence between measured muscle activations and calculated activations via inverse modelling was similar between volunteers, showing huge standard deviations and a mean around zero in correlations when using no sEMG constraint and reasonable to high correlations using act all (Fig. 6). 3D movement correlations were similar, too. Remarkably, when using no constraint we found that volunteer 6 showed a deviating higher error in the movement e pos (Fig. 6). The ρ 3D s of lip movement were always equal or higher compared to no constraint. Except for the marker 7. The mean ρ 3D s showed a moderate to good correlation (ρ . 0 7). The e pos was always lower in the sEMG-assisted approach, suggesting that the IM without constraint got stuck in a local minimum.

Results.
Calculating correlation coefficients for lip marker performance, we found that the lateral lip markers 1, 2, 6, and 7 performed better than the centre markers, similarly to the forward modelling results 7 . This can be explained by the fact that the volunteers' centre markers moved notably, whereas the model's centre markers only slightly deviated from their original position due to symmetry in the model. However, when we compare the e pos for all lip markers we observe the opposite effect: the RMS errors are higher for the lateral markers than for the centre markers. This may also be explained by the fact that more movement allows for greater error due to a larger possible distance.
There was a lack of correlation without the sEMG constraint for the activations, caused by too many degrees of freedom in the muscle space. The sEMG-assisted inverse-modelling approach showed clear tendency of producing better, realistic and consistent muscle activations patterns.
Zooming in on the errors and correlation coefficients of the activations, those muscles whose activations were derived from measured muscles (DLI, BUC, LAO) performed worse than the muscles that were measured directly. This helps to explain why our forward model showed lower correlation coefficients in previous studies 7 . The OOP and OOM, derived from OOS and OOI measurements, also showed lower correlations (values), ρ . 0 5 versus ρ . 0 7. This is actually an interesting result, suggesting that the measurements do contribute a lot and can provide useful information. It would be interesting to look into the effects of only tracking the measured muscles instead of using derived muscle activations as we did here and to compare the results with experiments in which the DLI, BUC, and LAO are also measured directly. Conclusion. In conclusion, adding sEMG tracking does not reduce 3D movement tracking accuracy, whilst giving better solutions in muscle activation tracking, as we already expected after experiments I and II. In essence, adding sEMG tracking tailors the inverse solution to a personalised activation strategy with equal performance. Apparently, surface EMG is sufficiently accurate without requiring any invasive needle approaches. However, challenges remain, as the inversion without constraint gave some questionable results, suggesting that the inversion may have got stuck in a local minimum. This would mean that including the sEMG constraint would be a way to avoid the inversion getting stuck in that miminum. However, it also hampers the general goal of seeking compensatory mechanisms by means of other muscle activation strategies. Also, because of a small dataset no statistical test could be performed. However, clear trends were observed and should be confirmed by future experiments.
General results. Muscle stress varied per volunteer, per instruction, and per experiment (Table 3). Variation was highest between instructions and between experiments. The required computational time varied across simulations. Experiment III without the sEMG constraint may serve as a good example for computational times, as it was run completely on one workstation whereas the other experiments were distributed over the two workstations and the laptop computer, requiring longer computational times per simulation.
General discussion. To our knowledge, this is the first study to describe the feasibility of sEMG-assisted inverse modelling of 3D lip movements using a biomechanical model of the face and lips. We have shown that implementing a simple sEMG cost term can direct the calculated muscle activations towards the derived muscle activations calculated from sEMG measurements. Adding the sEMG cost term showed a clear trend towards superior overall performance with regard to 3D lip marker trajectories as well as muscle activation patterns when compared with regular inverse modelling.   With sEMG act all µ (σ)     Table 1. Root mean square errors. The mean (µ) and standard deviation (σ) of the e pos for the ten lip markers and e act for the ten muscles left and right over all volunteers and all instructions.   Our inverse-modelling approach has inherited the limitations of the model described by Eskes et al. 7 . First and foremost, the generic model does not account for individual physical geometry. Although our volunteers' measurements were entered into the model initially, inaccuracies could build up during simulations due to mismatches in patient and model morphology. To account for individual geometry and anatomy, our future models should use the mismatch-and-repair algorithm or similar methods 48,49 , including diffusion-tensor magnetic resonance imaging (DT-MRI) to reveal muscle fibres and their trajectories 50 . Such combined approach may yield better approximation of muscle dimensions, orientations, and trajectories.

With sEMG
Furthermore, we may improve our simple skin model by introducing anisotropicity and viscoelasticity. Although the simplified soft representation does induce inaccuracies, these are negligible in the light of the larger errors caused by suboptimal registration and sEMG to force mapping. Our conclusions would probably not change if we would use more advanced models with anisotropic and viscoelastic properties.
Inverse modelling without sEMG tracking resulted in estimated activation patterns that totally lacked any correlation with the sEMG signals measured. It may even got stuck in a local minimum. Future experiments to address this could use the sEMGs as starting point and from there calculate the inverse activations. As expected, adding sEMG tracking gave calculated muscle activation patterns that resembled the measurements more closely. Pitermann et al. already highlighted the load-sharing problem by demonstrating that their calculated muscle  EXPERIMENT II Without . × . × 6 5 10 (5 1 10 ) 4 4 06h06m55s (00h 57m 09s) act all . × . × 3 2 10 (2 6 10 ) 4 4 11h04m17s (05h 57m 01s) act rel . × . × 6 8 10 (5 7 10 ) 4 4 07h41m32s (4h 02m 17s) EXPERIMENT III Without . × . × 3 3 10 (2 5 10 ) 4 4 05h31m15s (06h 45m 46s) act all . × . activations patterns did not show any correlation with the measured intra-muscular rectified and integrated EMG patterns 36 . Varying the initial conditions resulted in different solutions to the inverse problem, including solutions with negative muscle activity. To address this issue, they restricted the inverted EMG to positive values, only, but they found no significant difference in performance between the methods with and without this positive constraint. This illustrates the difficulty of getting volunteer-specific muscle activation patterns when muscle redundancy causes an ill-posed inverse-dynamics problem. Nevertheless, they produced good correlation coefficients for 3D lip marker coordinates 36 , even when they applied a volunteer-specific face model to a different volunteer and restricted registration to general linear scaling.
These promising results encouraged us to make the step towards patient-friendly measurements. Pitermann's team measured intramuscular EMG using invasive needle electrodes, but we chose to acquire muscle activation signals with the noninvasive technique of sEMG. Another improvement we made in the experimental set-up was measuring sEMG and 3D lip markers bilaterally. Pitermann et al. measured EMG on the left and facial movement on the right side, which may have induced error as volunteers may not have performed each instruction with perfect symmetry. Our results suggest that surface EMG is sufficiently accurate to replace the invasive technique of intramuscular EMG with intramuscular needle placement.
Terzopoulos & Waters created one of the first physics-based face models using discrete mass-spring systems to estimate muscle activity from video employing interactive deformable contours (snakes) 33 . They were able to resynthesize facial expression from estimated muscle activity using a simple, yet powerful algorithm, which called for further research in this direction. Where they mapped static facial expression to muscle activity in 2D, our results relate to 3D musculature. Incorporating improved tissue biomechanics, the ArtiSynth model uses a continuum mechanics based FE formulation as well as an advanced orbicularis oris muscle, in contrast to the two fiducial points used in Terzopoulos & Waters' model. Furthermore, we increased the number of perioral muscles to 20, where Terzopoulos & Waters studied merely 4.
Kim & Gomi and Kim et al. created a discrete model of lumped nodal masses connected via viscoelastic elements 34,35 . Despite much lower computational costs, a major drawback of their set-up is the simplified representation of reality provided by their continuum-based finite-element model. Moreover, their inverse-modelling approach involved a gradient descent search with optimisation per trial instead of per sample and without quantitative reporting. However, if sufficiently accurate, such model may be a useful addition to our virtual-therapy toolbox for rapidly simulating new inverse solutions. Our computational times, were quite high, especially when simulating the instruction set proposed in Eskes et al. for all essential functional movements 6 .
To exert similar force on the elements in the model across experimental conditions, maximum muscle stress had to be variable. Although muscle stress differed per volunteer and per instruction, we found that mean muscle stress was similar in experiments II and III, at . × 3 3 10 kPa 4 . The variance can be explained by the fact that muscle activation amplitudes differed, as did the extent of co-contraction. The different amplitudes may be explained by sEMG-technical issues. Signal amplitude may have been affected by numerous factors including sensor placement 51 : inaccurate sensor placement will inevitably contribute to crosstalk.
Another important paper by Hirayama et al. 52 reports on inverse dynamics of articulatory trajectories. Using a supervised-learning algorithm, they followed the direct inverse-modelling approach as described by Jordan & Rumelhart 53 . However, theirs was a statistical model, while we prefer biomechanical models that also account for physical laws to simulate the effects of surgical interventions.
All of the above publications confirm the difficulty of validating computed muscle activations with the actual muscle activation strategy. Most researchers have used EMG data as reference values to test algorithm performance. This method is even less reliable when EMG information is used to best track the muscle activation patterns. Recently, Nikooyan et al. reported on a new method to validate forces (and activation levels) in patients with shoulder prostheses, measuring the glenohumeral-joint reaction forces in vivo 29 . Similar data obtained with knee prostheses were made available for the "Grand Challenge Competition to Predict In Vivo Knee Loads" 54,55 . Unfortunately, this type of direct-force data cannot be obtained for facial muscles.
Despite these challenges, we were able to demonstrate that performance in 3D movement tracking did not decrease drastically -in fact, it had a tendency towards improvement -while the activation tracking improved. We think this will open new ways of obtaining realistic person-specific activation strategies.

Conclusion
We have demonstrated the feasibility of an sEMG-assisted inverse-modelling algorithm for the perioral region. Our method means an important step in the development of a virtual-surgery toolkit for the preoperative estimation of function loss after lip and oral cavity cancer surgery.
Ethical approval. All volunteers were informed about the experiment and about their rights. Written consent was obtained for publishing the photograph in Fig. 1. The Medical Research Ethics Committee (MREC) of the Netherlands Cancer Institute determined that the study did not fall under the scope of the Medical Research Involving Human Subjects Act (WMO), because the study did not infringe the (psychological) integrity of the volunteers. The measurements were noninvasive and not stressful. Thus, prior review by an accredited MREC was not required. The study was performed within the Dutch legislation regarding the Agreement on Medical Treatment Act, Personal Data Protection Act, and the Code of Conduct for Responsible Use of the Federa (Dutch Federation of Biomedical Scientific Societies). Written informed consent was obtained.