Fast by Nature - How Stress Patterns Define Human Experience and Performance in Dexterous Tasks

Pavlidis, I.; Tsiamyrtzis, P.; Shastri, D.; Wesley, A.; Zhou, Y.; Lindner, P.; Buddharaju, P.; Joseph, R.; Mandapati, A.; Dunkin, B.; Bass, B.

doi:10.1038/srep00305

Published: 06 March 2012

I. Pavlidis¹,
P. Tsiamyrtzis²,
D. Shastri¹,
A. Wesley¹,
Y. Zhou¹,
P. Lindner¹,
P. Buddharaju¹,
R. Joseph³,
A. Mandapati¹,
B. Dunkin³ &
B. Bass³

Scientific Reports volume 2, Article number: 305 (2012)

Abstract

In the present study we quantify stress by measuring transient perspiratory responses on the perinasal area through thermal imaging. These responses prove to be sympathetically driven and hence a likely indicator of stress processes in the brain. Armed with the unobtrusive measurement methodology we developed, we were able to monitor stress responses in the context of surgical training, the quintessence of human dexterity. We show that in dexterous tasking under critical conditions, novices attempt to perform each step of a task as fast as experienced individuals. We further show that while fast behavior in experienced individuals is afforded by skill, fast behavior in novices is likely instigated by high stress levels, at the expense of accuracy. Humans avoid adjusting speed to skill and rather grow their skill to a predetermined speed level, likely defined by neurophysiological latency.

Introduction

Stress (defined here as physiological arousal) is an ever-present mechanism that helps humans cope with perceived or real threats or challenges. It is suspected to play a key role in the context of task execution¹. There has been a lot of work on the relationship between stress and task performance, starting with the postulation of the famous Yerkes-Dodson law in 1908². According to this ‘law’, performance increases with stress up to a point and decreases past that - a relationship that proved to be true in several experimental studies. Throughout the last century researchers struggled to investigate the role of stress on performance in as realistic conditions as possible and as objectively as possible. Both aims proved difficult to attain.

Specific experimental studies focused overwhelmingly on aviation, where the effect of stress on performance is deemed paramount³. There have also been some studies on the effect of stress on surgical performance⁴,⁵,⁶. Both the aviator and surgeon professions are critical to society and involve dexterity. Due to the introduction of new technologies, such as laparoscopy in surgery and unmanned aerial vehicles in aviation, required skills in the two professions look increasingly similar (e.g., maintaining dexterity despite loss of proprioception). Emerging professions, such as robot tele-operators and actors controlling avatars, fall under the same skilled category.

While this convergence of skilled professions takes place, the literature on addressing issues of stress versus performance in dexterous tasks remains fragmented (per profession) and lacks appropriate methods and unifying abstractions. Indeed, common threads in many published studies are the use of subjective or snapshot stress indicators and the reliance on non-orthogonal performance measures that are often culturally defined.

Key aims of our investigation are: (a) to develop an objective stress measurement method that is unobtrusive and real-time; (b) to articulate dexterous performance abstractions that can naturally link up with neurophysiological responses and are free of redundancies and disciplinary bias.

We monitored stress and performance patterns among surgeons during training in an inanimate laparoscopic skills lab. The selected activity locus merely serves as a sample window through which we can observe the human behaviors of interest.

To date, galvanic skin response (GSR) sensing on the fingers has been the standard method used to peripherally quantify stress in real-time⁷. This method is not applicable in surgical training assessment for an obvious reason: the surgeons' fingers are engaged, a limitation that would apply to all dexterous task scenarios. To solve the problem, we developed a novel stress quantification methodology where the targeted physiological response is transient perspiration on the perinasal area - a phenomenon we have shown is associated with stress⁸.

This perinasal response follows the transient perspiratory response on the fingers and correlates well with it, as we demonstrate in the Results-Validation Analysis section. Hence, it can be used as an alternate measure of stress with distinct advantages. The perinasal area is much more accessible than the fingers and thermal imaging can be brought to bear to quantify perspiration unobtrusively (see Methods-Thermal Imaging sections).

We have also formulated two new performance abstractions: (a) attempt pace, which unlike the standard time measure, always relates to neurophysiological latency; (b) error propensity, which includes not only standard errors but also latent errors and remains representative of accuracy across different task architectures.

Refocusing attention from the fingers to the face and replacing probes and electronics with imaging and computation empowered a field study of stress. The collected neurophysiological data were analyzed in the context of the new performance abstractions. The results brim with intriguing leads about human nature - a testament to the method's power and promise.

Results

Macroscopic Study Variables

Surgeons belonging to two skill levels (novices and experienced) engaged in training on three laparoscopic drills (Supplement-Fig. S1):

Task 1: A simple, ad hoc, drill where a string is manipulated from one end to the other via its colored sections.
Task 2: A more challenging drill that requires the cutting of a circular pattern on a piece of gauze. It is part of the Fundamentals of Laparoscopic Surgery (FLS), a widely accepted educational module in laparoscopic surgery⁹.
Task 3: A highly complex drill that requires precise suturing on a fine rubber tube. This is also part of FLS.

Training was longitudinal, with repeat sessions spread over the course of a few months; every session included multiple trials of each task. In our analysis, we studied the relation of stress indicators to surgeon performance. The stress indicators included neurophysiological (via thermal imaging) and observational (via visual imaging) trial measurements, while the performance indicators included time and error trial measurements, reflecting the grading of the surgical educator; these eventually were supplanted by better abstractions.

Neurophysiologically, stress was tracked through the perinasal response. Specifically, in every trial i of a task j in session k for a surgeon l (x ≡ (j,k,l)), we quantified the entire perinasal perspiratory signal E(x, i) and represented it via its mean intensity Ē(x, i). Then, we tracked stress by computing the mean signal intensity µ_E(x) = (1/I) Σ_{i=1}^{I} Ē(x, i) over all trials i = 1,…,I of task j in session k for surgeon l.

Typically, the aid of an observational variable (such as facial expressions) would be necessary to disambiguate instances of negative (distress) versus positive (eustress) excitation in a sympathetic signal, such as the perinasal. This was the motivation behind gathering visual imaging data concomitantly with thermal imaging data. As it turned out (see Results-Specificity Analysis section), observational annotation of the physiological signal is not strictly necessary in this particular context. For this reason, the observational variable was dropped from consideration in the main analysis.

Regarding performance, in every trial i of a task j in session k for a surgeon l (x ≡ (j,k,l)), we defined time as the real variable Time(x, i), which represented how long (in [s]) it took a surgeon to complete the trial. We also defined error as the binary variable Err(x, i), which was 0 if the trial was flawless and 1 otherwise. Then, we tracked performance by computing the mean time µ_Time(x) = (1/I) Σ_{i=1}^{I} Time(x, i) and the mean error µ_Err(x) = (1/I) Σ_{i=1}^{I} Err(x, i) over all trials i = 1, …, I of task j in session k for surgeon l.
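As a minimal sketch of the session-level aggregation described above, the snippet below averages per-trial values over the I trials of one task, session and surgeon to obtain the three macroscopic variables (stress, time and error means). The `Trial` record, its field names and the sample numbers are hypothetical and serve only as an illustration; they are not study data.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class Trial:
    """One trial i for a triple x = (task j, session k, surgeon l). Hypothetical record."""
    mean_intensity: float  # mean of the perinasal signal E(x, i) over the trial
    time_s: float          # Time(x, i): seconds needed to complete the trial
    err: int               # Err(x, i): 0 if the trial was flawless, 1 otherwise


def session_means(trials):
    """Return (mu_E, mu_Time, mu_Err), each averaged over all I trials."""
    if not trials:
        raise ValueError("at least one trial is required")
    mu_e = fmean([t.mean_intensity for t in trials])
    mu_time = fmean([t.time_s for t in trials])
    mu_err = fmean([t.err for t in trials])
    return mu_e, mu_time, mu_err


# Illustrative values only.
example_trials = [Trial(0.2, 100.0, 0), Trial(0.4, 120.0, 1)]
mu_e, mu_time, mu_err = session_means(example_trials)
```

Because Err(x, i) is binary, the mean error is simply the fraction of trials in the session that were not flawless.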

Before each session, every surgeon completed a State Anxiety Inventory (SAI) sheet¹⁰. Scoring of SAI gave an indication of the surgeon's stress level prior to the execution of the protocol.

Main Analysis

Initially we present the marginal distribution of each response variable (stress: µ_E(x), time: µ_Time(x) and error: µ_Err(x)) on each surgical skill level (novices and experienced), for each task (Task 1, Task 2 and Task 3) - Table 1 and Fig. 1a-c. Furthermore, we test whether the two skill groups of surgeons have equivalent mean responses or not. This is a family of n = 14 tests, including 4 tests on stress, 7 tests on time and 3 tests on error. Hence, the significance level α = 0.05 is Bonferroni adjusted¹¹ to α_B = 0.05/14 = 0.0036. Please note that for stress we include a test in the relaxation period (baseline). Please also note that regarding time, we compare mean time scores not only between groups for each task, but also between each group and the task's proficiency mark, where this is available (i.e., Task 2 and Task 3). These tests provide nuance by indicating not only if novices perform slower than experienced surgeons, but also if they meet proficiency time, a mark presumably above their level.
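The Bonferroni arithmetic above can be checked with a few lines. This is only a sketch of the adjustment as described in the text (dividing α by the number of tests in the family); the function name is an illustrative choice.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Return the Bonferroni-adjusted significance level alpha / n_tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Family of tests in the main analysis: 4 on stress, 7 on time, 3 on error.
n_tests = 4 + 7 + 3
alpha_b = bonferroni_alpha(0.05, n_tests)
print(round(alpha_b, 4))  # prints 0.0036
```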

Table 1 Distributions of macroscopic study variables

Novice surgeons arrived at each session with stress levels significantly higher than those of experienced surgeons, based on the State Anxiety Inventory (SAI) scoring (analysis of variance, P < 0.05). This anticipatory stress in novices was somewhat diffused during the baseline period, where the perinasal indicator µ_E(x) showed no significant stress differences between the two skill groups (analysis of variance, P > 0.0036). During task execution, stress differences between novice and experienced surgeons, as measured by µ_E(x), became significant again (analysis of variance, P < 0.0036 for all three tasks - Fig. 2).

Time-wise in Task 1 and Task 2 the indicator µ_Time(x) showed that the novice surgeons performed as fast as the experienced surgeons (analysis of variance, P > 0.0036 for both tasks). In addition, both skill levels met the FLS proficiency time in Task 2, which has been set by the American College of Surgeons (ACS) to 98 [s] (analysis of variance, P > 0.0036 for both skill levels). Task 3 was the only task where novice surgeons maintained time performance commensurate to their skill; they completed the task significantly slower than experienced surgeons and they did not meet the FLS proficiency time, which has been set by ACS to 112 [s] (analysis of variance, P < 0.0036 for both cases).

Error-wise in Task 1 and Task 3 the indicator µ_Err(x) showed that the novice surgeons committed significantly more errors than experienced surgeons (analysis of variance, P < 0.0036 for both cases). In Task 2, however, this significant difference in error performance between the two skill groups eroded away (analysis of variance, P > 0.0036).

The departure from the usual time and error behavior in Task 3 and Task 2, respectively, does not stand up to deeper analysis of the task architecture. Task 1 is a discrete repetition of the following subtask: grab the string at the colored section s; then, proceed to grab the colored section s+1 and repeat until the end of the string. Task 2 is a nearly continuous repetition of the following subtask: cut around the circular pattern up to the point where a substantial change in direction is needed; then, transiently adjust the cutting direction and repeat until the circular pattern is fully severed. Please note that an error in a subtask of Task 1 or Task 2 has finality (cannot be corrected) and hence, the surgeon has no choice but to proceed uninterrupted to the next repetitive step. In other words, neurophysiological latency (or response speed) tracks time performance (or task speed) in the first two tasks, because there is a one-to-one correspondence between subtasks and attempts.

Task 3 is different because there is a one-to-many correspondence between subtasks and attempts and hence, neurophysiological latency does not track time performance. Specifically, Task 3 consists of a sequence of six different subtasks: Subtask 1: passing the needle through the marks; Subtask 2: first (double) knot; Subtask 3: second (single) knot; Subtask 4: third (single) knot; Subtask 5: grabbing the string; Subtask 6: cutting the string. In order to proceed to Subtask s + 1 one must adequately complete Subtask s. For Subtask 1 this means that the surgeon has to pass the needle as close to the marks as possible, introducing at best a small error. For the other subtasks, it means that they have to be flawlessly completed and there is little other choice. Hence, the surgeon can engage in repeated attempts in each subtask of Task 3 until it is done right (Subtasks 2–6) or until further improvement is deemed counter-productive (Subtask 1). We characterize the final attempt in each subtask as the ‘settlement’. Most of the errors in Task 3 are found in settlements in Subtask 1. Barring catastrophic failure, settlements in the other subtasks are mostly successful.

Let us denote by t_s(y, i) the duration (in [s]) of the attempt in which surgeon l adequately completes Subtask s during trial i of Task 3 in session k (y ≡ (k, l)). Let us also denote by A_s(y, i) the number of attempts it takes for surgeon l to adequately complete Subtask s during trial i of Task 3 in session k. Hence, A_s(y, i) is a random variable taking values in the positive integers {1, 2, 3, …}. These data follow a geometric distribution A_s(y, i) ∼ Geometric(P_s(y, i)), where the parameter P_s(y, i) expresses the probability of adequately completing Subtask s. For each surgeon during a session we have I data points A_s(y, i) (corresponding to the I trials) for the variable A_s. We use the A_s(y, i) data points of each session to obtain an estimate of the parameter of interest P_s(y), based on Maximum Likelihood Estimation (MLE): P̂_s(y) = I / Σ_{i=1}^{I} A_s(y, i). Hence, the higher the value of P̂_s(y), the better the surgeon's chance to adequately complete Subtask s with fewer attempts (Fig. 3a).
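For a geometric distribution on the positive integers, the maximum likelihood estimate of the success probability is the number of observations divided by their sum. The sketch below applies that standard result to per-trial attempt counts; the function name and the sample counts are illustrative assumptions, not study data.

```python
def geometric_mle(attempts):
    """MLE of the success probability for attempt counts in {1, 2, 3, ...}.

    attempts holds A_s(y, i) for the I trials of one session; the estimate
    is I divided by the total number of attempts.
    """
    if not attempts:
        raise ValueError("at least one trial is required")
    if any(a < 1 for a in attempts):
        raise ValueError("each trial needs at least one attempt")
    return len(attempts) / sum(attempts)


# Illustrative counts only: four trials needing 1, 2, 1 and 4 attempts.
p_hat = geometric_mle([1, 2, 1, 4])
print(p_hat)  # prints 0.5
```

A surgeon who always succeeds at the first attempt gets an estimate of 1.0, and the estimate shrinks toward 0 as more attempts are needed.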

Analysis reveals that novice surgeons need significantly more attempts with respect to experienced surgeons in the difficult knotting subtasks until they perform them correctly (analysis of variance, P < 0.0125 for A₂ + A₃ + A₄ - Table 2 and Fig. 3a). This is the reason that macroscopically novices appear slow in Task 3 and do not meet time proficiency standards.

Table 2 Distributions of Task 3 decomposition variables

However, novices maintain fast behavior in their action attempts at the subtask level, which is similar to their behavior in Task 1 and Task 2. This is evident from two pieces of information:

In Settlement at Once: In the knotting subtasks, novice and experienced surgeons do not differ significantly in settlement times that correspond to immediate successes (analysis of variance, P > 0.0125 for t_2^(1), t_3^(1) and t_4^(1)). Please note that t_s^(1) denotes the settlement time in subtask s when the surgeon succeeds in the first attempt. We also use a Bonferroni adjusted level of significance (α_B = 0.05/4 = 0.0125) to account for the 4 tests involved in the Task 3 decomposition (one for A_s and three for t_s^(1)).
On an Agonizing Path to Settlement: In the knotting subtasks, there is a significant positive relationship between the number of attempts and the settlement time for novice surgeons (P < 0.05 - Fig. 3b).

Hence, when novices are lucky enough to settle at once, they are as fast as experienced surgeons. When their path is more agonizing, then their settlement represents an adjustment to slower pace.

To synopsize, time performance has been recast as an attempt pace measure rather than a task completion measure to provide a unifying abstraction across different task architectures. Error performance has been expanded to include the concept of latent errors (i.e., multiple attempts), which are not reflected in the final grade, but inform the accuracy skill of the subject. Please note that the original error performance measure µ_Err(x) is quite restrictive even if one excludes the possibility of latent errors in certain tasks. Due to its binary nature, it tracks apparent ‘perfection’ rather than detailed accuracy performance - a measurement philosophy that is culturally fitting to the surgical profession. For certain tasks, such as Task 1, where brief attention is needed at discrete points in time, µ_Err(x) tracks detailed accuracy performance well (just 4.76% of Task 1 trials have more than one error). For other tasks, where continuous attention to accuracy is required and perfection is more difficult to attain, µ_Err(x) heavily undercounts errors, favoring novices. Supplement-Fig. S2 depicts how gross µ_Err(x) is in the case of Task 2 - a fact that explains the surprising error equivalence between the two skill groups in this task.

To investigate the role of skill versus error in the prediction of the stress differentiation between the two groups of surgeons, we ran for each task the linear regression model:

µ_E(x) = β_0 + β_1 Level + β_2 µ_Err(x) + β_3 (Level × µ_Err(x)) + ε    (1)

The interaction term was found insignificant and subsequently removed from Eq. (1). The simplified model showed that while the variable Level is significant (P < 0.05 for all tasks), the variable µ_Err(x) misses significance in all three tasks (P = 0.07 > 0.05 for Task 1, P = 0.32 > 0.05 for Task 2 and P = 0.09 > 0.05 for Task 3), mostly by a thin margin. A careful look at the error histograms of Fig. 1d reveals the reasons behind the unexpected lack of significance for µ_Err(x). Due to the binary nature of the error variable, the mode of the distributions is at 0 in Task 1, at 1 in Task 2 and close to 1 or at 0 in Task 3, depending on the surgeons' skill level. This bias renders the regression lines unstable and the error coefficients insignificant.

Interestingly, Fig. 1e shows the lack of interaction between level and task for stress, time and error - results that are verified by running the respective linear models. This is an indication that the culturally perceived task difficulty may not be grounded in reality. Any one of the three tasks presents significant challenges to novices, while the same tasks are almost uniformly unchallenging to experienced surgeons.

Validation Analysis

The current standard in real-time measurement of peripheral sympathetic responses is GSR sensing on the fingers. The perinasal imaging method used in this study aims to become the new standard. It has two important advantages: (a) It applies to a more accessible part of the body. (b) It is contact-free and hence has minimal imprint on stress generation. Still, it has to pass a validation check, which could be summarized as follows: “Is the perinasal imaging method equivalent to the finger GSR method?”

To provide an answer to the validation question, we conceived the following experimental design: We recruited volunteers (n_V = 18, 8 males and 10 females) who underwent a controlled stress-producing protocol, approved by the Institutional Review Board of the University of Houston. All subjects signed informed consent forms, including publication statements. Stress was induced using auditory startle. The experiment lasted 4 [min] per subject. After the first minute, a stimulus was delivered and after that two more were delivered, spaced about one minute apart, resulting in three events. During the experiment, the subjects focused on the simple mental task of counting circles that appeared on a monitor. This amplified their reactions to stimuli.

GSR probes were attached to the subject's left-hand index and middle fingers, one thermal imaging sensor was aimed at the subject's right-hand index finger and another thermal imaging sensor was aimed at the subject's perinasal area (Fig. 4a). All three measurement modalities were synchronized and recorded throughout the experimental timeline. This design allows us to examine, first, whether the imaging method correlates with the ground-truth method (i.e., GSR) on the same part of the body (fingers). Additionally, it facilitates examination of the correlation between the perinasal and finger responses.

We base our comparative analysis on a signal abstraction that is consistent with established psychophysiological views¹². We reason that one can interpolate the sympathetic signal to a good approximation if s/he knows three critical points for each event: Onset (marking the start of activation), Peak and Offset (marking the end of relaxation). For the measurement methods to be in gross agreement with each other, they need to produce similar results for these three points and the trends (ascending and descending) they demarcate. Therefore, we use the time footprints of Onset, Peak and Offset and an intensity measure for the ascending and descending trends to test the relationships of GSR versus Thermal Imaging Measurement on Finger (TIMF) and GSR versus Thermal Imaging Measurement on Perinasal (TIMP).
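The three-point abstraction can be illustrated with a piecewise-linear interpolation through Onset, Peak and Offset. This is a sketch under the assumption that each critical point is given as a (time, value) pair; the function name and the sample numbers are hypothetical, and the study's own signal processing may differ.

```python
def interpolate_event(onset, peak, offset, t):
    """Approximate a sympathetic event at time t by piecewise-linear interpolation.

    onset, peak and offset are (time, value) pairs marking the start of
    activation, the maximum, and the end of relaxation, respectively.
    """
    (t0, v0), (t1, v1), (t2, v2) = onset, peak, offset
    if not t0 < t1 < t2:
        raise ValueError("expected onset time < peak time < offset time")
    if not t0 <= t <= t2:
        raise ValueError("t lies outside the event")
    if t <= t1:
        # Ascending trend (activation): Onset to Peak.
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    # Descending trend (relaxation): Peak to Offset.
    return v1 + (v2 - v1) * (t - t1) / (t2 - t1)


# Illustrative event: onset at 60 s, peak at 64 s, offset at 72 s.
value_mid_rise = interpolate_event((60.0, 0.0), (64.0, 2.0), (72.0, 0.0), 62.0)
```

Two measurement methods that agree on these three points and on the two trends they delimit would produce nearly the same interpolated curve, which is the sense of "gross agreement" used in the comparison.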

Regarding the time axis comparisons, we have 3 time points for each event, 3 events and 2 pairs of methods that we are interested in comparing (GSR versus TIMF and GSR versus TIMP); this yields n = 3 × 3 × 2 = 18 tests. Therefore, the standard level of significance α = 0.05 needs to be adjusted to α_B = α/n = 0.0028.

Fig. 4b depicts the signals of all three modalities for every subject in the validation data set, annotated with 3 critical points per event (Onset, Peak, Offset). Table 3 provides the P-values regarding comparisons between GSR and TIMF and between GSR and TIMP on time points critical to each event. Almost all the tests fail to reject the null hypothesis, which means that GSR reports critical event times indistinguishably from TIMF or TIMP. Table 3 also provides the r-values between GSR and TIMF and between GSR and TIMP for each critical time point across events. All r-values indicate strong linearity between methods along the event evolution pattern.
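The text does not state which correlation coefficient the r-values are; the sketch below assumes the common Pearson product-moment coefficient, applied to paired critical times reported by two methods across subjects. The function name and the sample times are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between paired measurements xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Illustrative peak times in seconds (not study data): the second method
# lags the first by a constant 0.5 s, so the relationship is exactly linear.
gsr_peaks = [63.0, 64.5, 66.0, 65.0]
timp_peaks = [63.5, 65.0, 66.5, 65.5]
r = pearson_r(gsr_peaks, timp_peaks)
```

Note that a high r only indicates a linear relationship; a constant lag between methods, as in the example, leaves r at 1 and would instead show up in the paired tests on the time points.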

Table 3 Tests (α_B = 0.0028) and correlations on critical event times

Intensity-wise, we compare the slopes of the linear ascending (Onset-Peak) and descending (Peak-Offset) trends of each event between GSR and TIMF and between GSR and TIMP. Please note that we have 2 trend slopes per event, 3 events and 2 pairs of methods; this yields n = 2 × 3 × 2 = 12 tests. Therefore, the standard level of significance α = 0.05 needs to be adjusted to α_B = α/n = 0.0042.
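The two trend slopes per event follow directly from the critical points. The sketch below assumes each point is a (time, value) pair and that both modalities have already been brought to a comparable scale; how the study scaled the signals is not stated here, and the names and numbers are illustrative.

```python
def trend_slopes(onset, peak, offset):
    """Return (ascending, descending) slopes of one sympathetic event.

    ascending is the Onset-Peak slope; descending is the Peak-Offset slope.
    """
    (t0, v0), (t1, v1), (t2, v2) = onset, peak, offset
    if not t0 < t1 < t2:
        raise ValueError("expected onset time < peak time < offset time")
    ascending = (v1 - v0) / (t1 - t0)
    descending = (v2 - v1) / (t2 - t1)
    return ascending, descending


# Illustrative event: onset at 60 s, peak at 64 s, offset at 72 s.
rise, fall = trend_slopes((60.0, 0.0), (64.0, 2.0), (72.0, 0.0))
print(rise, fall)  # prints 0.5 -0.25
```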

Table 4 provides the P-values regarding comparisons between GSR and TIMF and between GSR and TIMP on trend slopes critical to each event. Almost all the tests fail to reject the null hypothesis, which means that GSR signals feature ascending and descending trends in each event that are indistinguishable from TIMF or TIMP.

Table 4 Tests (α_B = 0.0042) on event trend slopes

To recap, GSR has a strong linear agreement with TIMF and TIMP regarding key evolution times of sympathetic events that define the activation, peak and relaxation stages. GSR also has trend agreement with TIMF and TIMP regarding the rate of change during the activation and relaxation stages of sympathetic events.

Specificity Analysis

As a sympathetic response, the perinasal response is non-specific to negative or positive excitation. One would expect then, the overall intensity of the perinasal perspiratory signal to be agnostic to the precise levels of distress versus eustress. To investigate this issue, we thought to use in parallel visual observation of facial expressions to annotate the onset of distress versus eustress bouts in the perinasal signal.

The visual imagery was processed frame by frame by a certified expert in the Facial Action Coding System (FACS)¹³. To avoid bias, the FACS coder was not aware of the corresponding perinasal signals. The type and the duration of every facial expression were recorded on the timeline. Furthermore, facial expressions were broadly classified in three categories: positive, neutral and negative. The positive expressions indicated positive excitation (eustress), while the negative expressions indicated negative excitation (distress).

Observational annotation of the neurophysiological response resulted in a more detailed level of stress analysis. Specifically, we quantified just the portions of the perinasal perspiratory signal where the surgeon showed facial expressions manifesting negative feelings (distress); let us denote this negative affect signal as E_N(x, i) (with mean Ē_N(x, i)) and its extent (percent of total frames in the trial) as N(x, i). In this case, we tracked stress by computing the mean signal intensity µ_{E_N}(x) over all trials i = 1,…,I of task j in session k for surgeon l. We also computed the mean extent µ_N(x) of the negative affect signal portions. Therefore, at this level of analysis distress changes were evident not only via the changes of µ_{E_N}(x), but also via the changes of µ_N(x).

At the same time, we tracked positive excitation by quantifying the portions of the perinasal perspiratory signal where the surgeon had facial expressions manifesting positive feelings (eustress); let us denote this positive affect signal as E_P(x, i) (with mean Ē_P(x, i)) and its extent (percent of total frames in the trial) as P(x, i). These positive affect signal portions were characterized by mean intensity µ_{E_P}(x) as well as mean extent µ_P(x), similarly to the negative affect signal portions. Therefore, eustress changes were evident either via the changes of µ_{E_P}(x) or µ_P(x).
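A minimal sketch of the guided (annotated) quantification for a single trial follows: given a per-frame perinasal signal and a per-frame expression label, it returns the mean intensity of the frames carrying a given label and the extent of that label as a percentage of all frames. The label strings, the function name and the sample values are hypothetical; the study's frame-level processing is not described in this much detail.

```python
def affect_portion(signal, labels, target):
    """Return (mean intensity, extent in percent) of frames labelled target.

    signal holds one perinasal intensity value per frame of the trial and
    labels the expression category assigned to the same frame. The mean
    intensity is None when no frame carries the target label.
    """
    if len(signal) != len(labels) or not signal:
        raise ValueError("signal and labels must be non-empty and equally long")
    selected = [v for v, lab in zip(signal, labels) if lab == target]
    extent = 100.0 * len(selected) / len(signal)
    mean_intensity = sum(selected) / len(selected) if selected else None
    return mean_intensity, extent


# Illustrative four-frame trial (not study data).
frames = [1.0, 2.0, 3.0, 4.0]
coded = ["negative", "neutral", "negative", "positive"]
negative_mean, negative_extent = affect_portion(frames, coded, "negative")
positive_mean, positive_extent = affect_portion(frames, coded, "positive")
```

Averaging these per-trial pairs over the I trials of a session would give the session-level intensity and extent indicators for distress and eustress.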

We compared this more detailed level of analysis, where physiological measurements are guided by visual observations, with the simpler, unguided physiological analysis we adopted in the main analysis. We found that both analysis styles lead to the same conclusions. To make the case, we cite an example that is related to a fundamental issue in this study: The effect of the surgeons' levels of experience on stress.

Specifically, we found that not only the unguided stress indicator E, but also the guided stress indicators E_N and N pinpoint that stress levels are negatively related to experience (analysis of variance, P < 0.05 - Supplement-Fig. S3).

For this reason, after making here the case of virtual equivalence between the overall perinasal signal E(x,i) and its negative affect portion E_N(x,i), we used only E(x,i) in the main distress analysis described in the Results - Main Analysis section of the article; we also prefer to use the term stress instead of distress.

Discussion

There is no rational unifying reason for novice surgeons to favor speed over accuracy. The scoring system weighs time of performance and accuracy equally, so one would expect that surgeons would be equally attentive to both performance measures. Although surgeons were informed about the FLS proficiency times for Task 2 and Task 3, they could not check time progress during tasking. Hence, in the absence of feedback it would be difficult to consistently guess the proficiency time and uniformly meet it in trial after trial (which is what happened in Task 2, where time performance tracks latency). Furthermore, there is the case of the ad-hoc Task 1, where no widely accepted proficiency time exists. There, both novice and experienced surgeons also converged to a specific time performance, in trial after trial - a point that suggests that time responses are viscerally spawned.

We theorize that a good way to determine proficiency times a priori in newly constructed dexterous tasks is by measuring latencies. In FLS, surgical educators determine proficiency times by averaging the time performance of many experienced laparoscopic surgeons. The lack of clear abstraction between time performance and latency obscures the fact that in tasks such as Task 2, these are one and the same, irrespective of the skill level. In tasks such as Task 3, time performance aligns with latency only in the experienced cohort, who are perfect. In any case, humans appear to grow their dexterous skill to fit a mean latency level, specific to the challenge. Hence, wherever time performance does not align with latency from the start, it is the limit to which it eventually converges.

We hypothesize that the high stress levels in novice surgeons are the hidden driver of their viscerally fast behavior, which further undermines their error performance. We have two pieces of circumstantial evidence in support of this hypothesis. First, by disentangling time corresponding to attempt pace from time lost in error recovery, we get a temporal measure that is close to neurophysiological latency and can be reasonably associated with arousal levels. Second, the novices' fast attempt pace clearly gets them into trouble in critical subtasks of Task 3, where they waste a lot of attempts until they get it right. Eventually they get it right only when they slow down.

To definitively prove this hypothesis one would need to perform an interventional study, where the controls would be novice surgeons following the standard training protocol, while the interventional group would be novice surgeons for whom the training session stress is ameliorated via some method. Per the hypothesis, novices in the interventional group with substantially reduced stress levels would be expected to exhibit slower task attempt pace, which is more appropriate to their skill level. This reduction in speed would likely lead to reduction in errors and propensity for errors, bootstrapping confidence early on.

In the current data set all novice surgeons have relatively high stress levels and all experienced surgeons nearly identical low stress levels. Hence, it is difficult to see any direct associations of stress with performance indices within these groups.

Please note that there was no significant improvement in accuracy for the novice cohort at the end of the five session training sequence (analysis of variance, P > 0.05) - an indication that current training practices are slow in producing results. Further investigation of the hypothesis put forward in this study may lead to changes in prevailing training philosophies and practices with significant benefits.

We admit that the number of subjects in this study is relatively small (n = 17) and the null results should be viewed with some caution. However, a number of ameliorating factors offer some protection: (a) This was a longitudinal rather than one-shot experiment. (b) The subjects belonged to a relatively homogenized cohort of people. (c) We tested against Bonferroni corrected significance levels to further guard against Type I errors.

The outcome of this study was made possible by the introduction of a new methodology capable of unobtrusively quantifying human neurophysiological responses in natural settings and the articulation of performance measures that are orthogonal and universal. If the result of the current effort is any guide, the method and the performance abstractions are not only valuable tools for scientific discovery, but they can also be used in practice to assist in the design of dexterous training modules.