Introduction

The evaluation and formative feedback of a surgeon’s skill is an essential part of training. In the past few decades, operating microscope playback analysis with a surgical trainer has gained in popularity. However, drawbacks of this technique are the large inter-observer variability [1] and the lack of quantifiable objective measures with which changes in surgical skill can be monitored over time. Furthermore, there is evidence that objective measures of manual dexterity and surgical skill correlate significantly with the outcome of a procedure [2, 3].

Human rating systems such as the OSACSS [4] looked at discrete segments with task-specific stems to facilitate trainer-led quantitative scores. Further work led to the development of the ICO-OSCAR [5], which was based on the OSACSS and additionally defined stems pertaining to key tasks during cataract surgery. These tools employ a modular approach which has been shown to be valid and reliable [6, 7]. Because of the manner in which this previous work segmented the procedure, the modular approach also reflects how training is currently delivered for new trainees. For instance, a trainee may be instructed to perform all the lens insertions on a particular theatre list and all the incisions on a different list. In this manner the trainee would build on their experience and may begin by learning the final and perhaps simpler steps of the procedure.

Motion analysis is a technology that underpins virtual simulators. The methods are validated as a purely quantitative technique of surgical skill evaluation [8,9,10]. ‘PhacoTracking’ is a novel motion tracking software that has been validated in applying motion analysis methodology to actual cataract surgery videos, as opposed to simulated procedures [11]. Expert human rating systems have been used to define what is good or to be avoided at each step and have consequently aided the development of parameters for computer-based assessment tools. These include PhacoTracking and the EyeSi (VRMagic Holding AG, Mannheim, Germany), which have shown statistically significant correlation with the OSACSS [12, 13]. However, used in isolation, rating systems that are based on performance evaluations by a human rater can be labour intensive and potentially prone to bias [14, 15]. Furthermore, the EyeSi is now a key component of most teaching deaneries’ syllabi within the United Kingdom [16,17,18]. Trainers therefore have an increasing range of feedback available to provide, using both human-based and computer-based tools.

Motion tracking methods are employed in simulators such as the EyeSi [19]. Performance on the EyeSi has been shown to correlate significantly and highly with real-life surgical performance [20]. In addition, it has been shown that there is a significant transference of cataract surgical skills from proficiency-based training on the EyeSi to the operating theatre. Both novices and surgeons at an intermediate level of experience showed an improvement in their operating room (OR) performance scores [13].

The three individual segments of cataract surgery which are repeatedly rated to be the most difficult are: (1) continuous curvilinear capsulorhexis (CCC), (2) phacoemulsification, and (3) irrigation and aspiration (I&A) [21,22,23]. To date no technology has used motion tracking to analyse these segments from phacoemulsification videos in the OR and explored alignment of its metrics with those from a simulator such as the EyeSi. If the two systems are aligned, objective analysis of trainee OR videos through PhacoTracking can identify areas for improvement, which can then be practised in a focused way in a controlled simulator environment. This study therefore sets out to use the PhacoTracking software, with the aim of evaluating individual segments in a modular approach and exploring its potential to complement simulation-based training.

Materials and methods

A prospective cohort analysis was undertaken to compare junior vs. senior surgeons. Junior surgeons were defined as having experience of fewer than 200 phacoemulsification cases and senior surgeons as having experience of more than 1000 cases. Junior surgeons were supervised by senior surgeons whilst operating. Full institutional review board and research ethics approval were obtained (REC: 12/NW/0489; Protocol No: SALG1004). Patients’ and surgeons’ consent was sought prior to the procedure and written consent obtained from patients. The paper includes no patient-identifiable information. Videos of cataract surgery were recorded using the microscope viewing platforms and standard video recording apparatus available in the operating room.

The inclusion criteria were: adult patients who had given informed consent prior to undergoing routine phacoemulsification cataract surgery; fully dilating pupils; mild to moderate cataract (1+/2+ nuclear sclerosis or cortical lens opacity only); able to lie fully flat and still for the duration of surgery; and no ocular comorbidity (e.g. glaucoma or pseudoexfoliation syndrome). The exclusion criteria were: unable to give informed consent or not wishing to participate; non-routine cataract (e.g. secondary to trauma or prior intraocular surgery); and concurrent pathology that would exclude a clear view (e.g. corneal pathology).

The EyeSi manual [24] was used to identify metrics measured by the simulator that were comparable and could be extrapolated to PhacoTracking measurements. Some of these are already assessed under validated tools such as the OSACSS and were therefore not duplicated. These metrics include: (1) forceps open and closed, (2) eye torque, (3) iris contact time, (4) horizontal insertion of instruments, (5) odometer, (6) anti-tremor, (7) capsulorhexis roundness/centring/radius/spikes and (8) time. Additional metrics previously explored were probability density function and frequency distribution; however, these were not readily identifiable on the EyeSi.

Data were then recorded for the following three segments: (1) CCC, (2) phacoemulsification and (3) I&A. The movement of each instrument in the field of view was analysed one frame at a time by the computer system. Three parameters were calculated for each segment of the operation: the instrument path length, the number of movements and the total time accrued [11]. For each of these three parameters, an approximate t-test between the two cohorts was performed to test for a significant difference (p < 0.05). The statistical analysis was carried out in Python using the SciPy library (version 1.9.0) [25].
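As an illustration of the statistical comparison described above, the following minimal Python sketch computes an unequal-variance (Welch) t-test from summary statistics. It rests on two assumptions that are not stated in the study: that the ‘approximate t-test’ refers to Welch’s test, and that each cohort contributes 20 observations per segment. The helper names (welch_t_from_stats, t_pdf, two_sided_p) are hypothetical, and the p-value is obtained by simple numerical integration of the t density rather than by the SciPy routines used in the study. The example inputs are the CCC path length values reported in the Results.

```python
import math


def welch_t_from_stats(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)


# CCC path length (mm): junior 545.7 +/- 253.0, senior 293.0 +/- 103.3.
# n = 20 per cohort is an assumption based on 20 videos per group.
t_stat, dof = welch_t_from_stats(545.7, 253.0, 20, 293.0, 103.3, 20)
p_value = two_sided_p(t_stat, dof)
print(f"t = {t_stat:.3f}, df = {dof:.1f}, p = {p_value:.5f}")
```

The printed p-value can be compared with the value reported for CCC path length in the Results. In SciPy, `scipy.stats.ttest_ind_from_stats` with `equal_var=False` performs the same test from summary statistics.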

Motion tracking algorithms were applied to videos of procedures from each cohort. Stable feature points (speeded-up robust features, SURF) [26] were found in the frames of each video. The motion of these stable points was then tracked over time with the Kanade–Lucas–Tomasi tracking algorithm [27] and analysed to identify the actual movements belonging to the surgical instrumentation. Vectors of the surgical instrument movements were then calculated from these raw data. This method is an evolution of the previously reported PhacoTracking technique for cataract surgery [11]. An illustration of the output is shown in Fig. 1.
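To make the tracking step more concrete, the sketch below shows a single Lucas–Kanade displacement estimate, the core calculation on which Kanade–Lucas–Tomasi tracking is built. It is not the PhacoTracking implementation: it omits SURF feature detection, image pyramids and iterative refinement, and it runs on a synthetic frame (one smooth blob) rather than surgical video. The function names and the synthetic data are hypothetical and are used for illustration only.

```python
import math


def gaussian_blob(size, cx, cy, sigma=3.0):
    """Synthetic greyscale frame with one smooth bright blob at (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]


def lucas_kanade_step(frame1, frame2, x0, y0, half_window):
    """Estimate the (dx, dy) shift of the patch centred at (x0, y0).

    Solves the 2x2 least-squares system built from the brightness
    constancy equation Ix*dx + Iy*dy + It = 0 over the window.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(y0 - half_window, y0 + half_window + 1):
        for x in range(x0 - half_window, x0 + half_window + 1):
            ix = (frame1[y][x + 1] - frame1[y][x - 1]) / 2.0  # d/dx
            iy = (frame1[y + 1][x] - frame1[y - 1][x]) / 2.0  # d/dy
            it = frame2[y][x] - frame1[y][x]                  # d/dt
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window has too little texture to track")
    dx = (-syy * sxt + sxy * syt) / det
    dy = (sxy * sxt - sxx * syt) / det
    return dx, dy


# The blob moves by (+0.3, -0.2) pixels between the two synthetic frames.
first = gaussian_blob(25, 12.0, 12.0)
second = gaussian_blob(25, 12.3, 11.8)
shift = lucas_kanade_step(first, second, 12, 12, 8)
print("estimated shift (pixels):", shift)
```

A single step of this kind is only reliable for small, sub-pixel to few-pixel displacements, which is why practical trackers repeat it iteratively and across image scales.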

Fig. 1

Examples of PhacoTracking instrument tracking. Green points on instruments are tracked over time for (a) capsulorhexis, (b) phacoemulsification and (c) irrigation and aspiration. The coloured markers are points on the instrument for which motion is being tracked automatically.
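Once a per-frame instrument position is available, the three parameters used in this study (path length, number of movements and time) could be derived along the lines of the sketch below. The study does not state how a single ‘movement’ is delimited, so the definition used here is an assumption: a movement is counted each time the instrument speed rises above a threshold after being at or below it. The function name, the threshold and the example trajectory are hypothetical, and positions are assumed to have already been calibrated to millimetres.

```python
import math


def motion_metrics(points, fps, speed_threshold):
    """Summarise one tracked instrument trajectory.

    points          -- list of (x, y) positions in mm, one per video frame
    fps             -- video frame rate in frames per second
    speed_threshold -- speed in mm/s above which the instrument is 'moving'
                       (assumed definition, not taken from the study)
    Returns (path_length_mm, number_of_movements, duration_s).
    """
    path_length = 0.0
    movements = 0
    was_moving = False
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = math.hypot(x1 - x0, y1 - y0)  # distance between frames
        path_length += step
        is_moving = step * fps > speed_threshold
        if is_moving and not was_moving:
            movements += 1  # a new movement starts at this frame
        was_moving = is_moving
    duration = max(len(points) - 1, 0) / fps
    return path_length, movements, duration


# Illustrative trajectory: move, pause for two frames, move again.
trajectory = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0), (3.0, 4.0), (6.0, 8.0)]
print(motion_metrics(trajectory, fps=25, speed_threshold=1.0))
```

The choice of threshold changes the movement count, so it would need to be fixed in advance and applied identically to both cohorts.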

Results

Surgical videos were analysed for three different components of cataract surgery. A total of 60 components from videos of 20 junior surgeons and a total of 60 components from videos of 20 senior surgeons were analysed. The results show that overall (i.e. for all three steps) the junior surgeons used a greater total path length (p < 0.05), larger number of movements (p < 0.05) and took more time (p < 0.05), to complete a cataract operation.

Significant differences were found between junior and senior surgeons in continuous curvilinear capsulorhexis (CCC) for path length, p = 0.0004 (mean ± SD for novices = 545.7 ± 253.0 mm; experts = 293.0 ± 103.3 mm), number of movements, p < 0.0001 (mean ± SD for novices = 129.9 ± 67.2; experts = 53.9 ± 17.3) and time taken, p < 0.0001 (mean ± SD for novices = 309.65 ± 116.4 s; experts = 155.65 ± 57.6 s).

Significant differences were found in phacoemulsification for path length p < 0.0001 (mean ± SD for novices = 1818.5 ± 506.6 mm; experts = 883.6 ± 280.6 mm); number of movements, p < 0.0001 (mean ± SD for novices = 277.6 ± 157.4; experts = 80.4 ± 60.1); time, p < 0.0001 (mean ± SD for novices = 674.6 ± 237.2 s; experts = 287.0 ± 103.1 s).

Significant differences were found for I&A in path length, p = 0.006 (mean ± SD for novices = 955.0 ± 501.4 mm; experts = 574.9 ± 225.7 mm), number of movements, p = 0.013 (mean ± SD for novices = 214.5 ± 237.5; experts = 64.65 ± 33.3) and time, p = 0.036 (mean ± SD for novices = 440.55 ± 345.3 s; experts = 255.5 ± 107.9 s). In addition, the junior surgeons showed a larger variation in the total path length, number of movements and time taken, whereas the senior group’s results were more consistent. Table 1 shows the full results for each of the three segments in terms of actual path length, number of movements and time taken by junior and senior surgeons, in addition to the respective standard deviations (SD) with p-values from an approximate t-test. The number of movements for CCC and phacoemulsification is visualised in Figs. 2 and 3.

Of the eight EyeSi metrics mentioned previously, we were able to extrapolate three to PhacoTracking software metrics, as shown in Table 2. The first is ‘number of movements’, which corresponds to the ‘odometer’ on the EyeSi. The second is ‘time’, which has the same name on the EyeSi. The third is ‘path length’ on PhacoTracking, which corresponds to ‘anti-tremor progress’ on the EyeSi. The higher-order motion patterns for movements, namely probability density function and frequency distribution, could not at present be extrapolated to any EyeSi metric. These are harder to grasp conceptually but will probably be more useful in training in the long term, and the EyeSi does not yet provide them.

Table 1 Mean path length, number of movements and time taken for junior and senior surgeons during CCC, phacoemulsification and I&A

Fig. 2

The number of movements for junior and senior surgeons during continuous curvilinear capsulorhexis

Fig. 3

The number of movements for junior and senior surgeons during phacoemulsification

Table 2 Summary of EyeSi and comparable PhacoTracking metrics

Discussion

The present study successfully measures instrument motion during individual segments of cataract surgery via video analysis. It has previously been shown that measurements provided by video analysis technology can discriminate between different levels of surgical skill, therefore showing the potential for providing valid and constructive feedback to surgical trainees [11]. This initial work established the feasibility and evidence of validity of the technique’s use in a specific and targeted manner, linking it directly to the EyeSi. The results of this study show that it may now be possible to break down this type of feedback for individual segments of an operation, which is in keeping with the current modular surgical training techniques [4, 5, 19]. Analysis provided by this study could therefore provide a platform for PhacoTracking to become a complementary tool supplementing existing virtual simulator feedback systems.

We identified eight metrics from the EyeSi and investigated their translation to PhacoTracking, as summarised in Table 2. Some of the metrics were technically difficult to translate. For example, depth analysis on virtual reality simulators such as the EyeSi relies on accurately tracking surgical instruments through a combination of optical and magnetic tracking [19]. This high-fidelity tracking of surgical instruments allows for depth perception analysis, which cannot be readily extracted from a 2-dimensional (2D) video. Overall, we applied three of the identified metrics to the PhacoTracking software. The first, ‘number of movements’, corresponds to the ‘odometer’ on the EyeSi and provides a measure of target efficiency; as more outstretched movements are made, tissue stress increases and so does the risk of tissue injury. The second is the ‘time’ taken for a task to be completed, which we have demonstrated discerns junior from senior surgeons. The third, ‘path length’, corresponds to ‘anti-tremor progress’ on the EyeSi.

Early construct validation studies have compared junior vs. senior surgeon performance [28]. In that study, abstract training tasks such as using forceps to place objects into a defined area and anti-tremor circle drawing were evaluated. They showed significant differences between senior and junior surgeons. The only parameter used in their study that overlaps with our work is the time taken to complete the task.

The greatest differences between junior and senior surgeons were found during the phacoemulsification and CCC portions. This likely reflects the widely held recognition that these segments are the more technically challenging portions of the operation, and it adds further strength to the validity evidence of the PhacoTracking methodology [22]. The results of this study also confirm that, as previously demonstrated [29], junior surgeons as a group show a larger variation than senior surgeons for phacoemulsification and I&A in both path length and number of movements, as shown in Table 1.

In addition to aligning PhacoTracking metrics with the EyeSi, this study shows that surgical video analysis can provide independently detailed information for the surgeon. This has the potential to offer surgical trainees a numerical report with a breakdown of individual segments that can be used to target performance training. This sort of feedback is not currently available with existing training techniques for live OR videos and would be available with minimal time investment from the trainers as it is an automated process. This information may also have application in the semi-automated augmentation of human performance by machines if a large enough pool of data and better understanding of its application can be garnered in the future. However, providing a numerical breakdown of motion efficiency in isolation may be insufficient, as it has been shown that the addition of expert feedback alongside a numerical breakdown leads to lasting improvements [30].

Similar discernment of surgical experience has previously been shown using different metrics to evaluate performance in live surgery through the use of human-marked schemes such as OSACSS [4] and automatically measured properties in simulated environments [8,9,10]. However, a strength of the approach used in this study is that the tracking technology directly observed the instruments and accurately measured their trajectories, rather than the indirect approach of analysing the movements of the surgeon’s hands used in previous studies [10]. Another advantage of PhacoTracking is that it only requires a recorded video, whereas instrument tracking previously required several motion recording sensors [31]. Such sensors can be cumbersome, expensive and often problematic to use during sterile procedures as opposed to simulated surgery.

A limitation of PhacoTracking as an assessment tool is that it requires a centralised image of the surgical video, something that a junior surgeon may find difficult to achieve. However, potential errors in computer-derived metrics may be remedied by applying post hoc software-based corrections. A further limitation is that surgical experience, gauged by number of cases, was the primary benchmark and only included junior and senior surgeons, thereby making it an extreme-group comparison. Future studies could try to quantify the correlation and also include intermediate-level surgeons, although their inclusion may lead to results that are difficult to generalise, because their ‘experimental movement pattern’ makes discrimination more challenging.

In addition, we were unable to translate several metrics for technical reasons, such as depth analysis on a 2D video. In future work this may be explored with more advanced computed depth estimations. Finally, higher-order motion patterns such as probability density function and frequency distribution could be evaluated in the future, as these may suggest that surgeons of varying experience employ different movement combinations to complete a standardised surgical task. These additional metrics, which were explored, were not readily identifiable on the EyeSi.

Future research into the educational application of this technology should better establish its precise role in providing formative feedback. For example, this could be done by investigating a possible improvement in performance, as a result of the specific training needs identified from PhacoTracking analysis. PhacoTracking has already been applied to endoscopic dacryocystorhinostomy surgery [32], but future work may focus on other microsurgical procedures.

This is the first time segmental analysis of actual cataract surgery has been undertaken and it echoes established work on simulators. This study shows that individual segments of cataract surgery analysed using motion tracking analysis can discern between junior and senior surgeons. Alignment of PhacoTracking and EyeSi parameters could allow trainees not only to examine how their techniques differ from those of senior surgeons but also to focus, in a controlled simulator environment, on the sections where they are most divergent. The alignment of PhacoTracking and EyeSi metrics therefore provides a platform for the former to become a complementary tool, supplementing and strengthening existing simulator feedback systems.

Summary

What was known before:

  • Present feedback for cataract surgeon trainees focuses on trainer-led tools.

  • PhacoTracking software can objectively analyse an entire cataract procedure, discerning between senior and junior trainees.

What this study adds:

  • Individual segments can now be analysed to provide objective feedback for trainees.

  • Certain PhacoTracking metrics may be aligned with the EyeSi to allow focused training based on objective computer-based feedback in a simulator environment.