With over 19 million procedures performed every year in the European Union and United States, colonoscopy is the ‘gold standard’ for managing colorectal diseases1. The need for colonoscopy is expected to rise by 16% in the next decade2, with the primary benefit being early detection and prevention of colorectal cancer (CRC), the third most common malignancy worldwide3. Preventive colon screening can increase early-stage detection rates for CRC, where a patient’s five-year survival rate is over 90%. Survivability drops drastically to less than 10% when diagnosed at a late stage4. The availability of colonoscopy is largely hindered by the aging design of the flexible endoscope (FE) used for this procedure5. Originally introduced in the 1960s6, FEs have several drawbacks and have seen very few improvements. Specific design limitations of the FE7 include (1) the inherent complexity of the device, which prevents a single-use approach and thus necessitates cleaning and sterilization8, (2) patient pain due to tissue stretching as the endoscope is pushed through the colon, limiting social acceptance and introducing risks such as tissue perforation and anaesthesia-related adverse events, and (3) lack of intuitiveness, requiring highly trained personnel and thus a long and expensive training process9 and a shortage of endoscopists with respect to demand10. This reduces the availability of early screening and increases the potential loss of human lives. Overcoming the limitations of FEs would allow colonoscopies to become ubiquitous and have a substantial impact in the early detection of malignant diseases.

The limitations of FEs have motivated the development of alternative approaches. The lack of intuitiveness and ease of use have been addressed by robotic actuation of conventional endoscopes11,12, leading to a shorter procedure duration, but the pain and reprocessing issues are not addressed. Wireless capsule endoscopes (WCE)13 and internally actuated robotic devices14,15 address the issues of pain and discomfort, but wireless devices fail to provide therapeutic functionalities such as biopsy and removal of polyps. Moreover, the complexity of internally actuated mechanisms results in cumbersome design and prevents a substantial cost reduction. Magnetically actuated endoscopes16,17,18,19,20 have demonstrated the potential to reduce pain, reduce cost, enhance diagnostic capabilities21 and improve therapeutic interventions. Although promising, translating magnetically actuated endoscopes for clinical use has failed due to control challenges. External actuating magnetic fields, generated by varying electromagnetic coils22 or by moving permanent magnets23 (commonly mounted on robotic manipulators; Fig. 1), are nonlinearly related to the motion of the magnetic endoscope. Giving the user the complex and unintuitive task of guiding the endoscope by controlling the field requires experience and results in unsatisfactory procedure times24. Developing advanced control strategies capable of assisting and offering an intuitive user experience with reduced procedure times would serve to enable the clinical translation of magnetic colonoscopy, with the overarching goal of widening and improving patient care.

Fig. 1: Overview of the robotic magnetic flexible endoscope (MFE) system.

The magnetic endoscope (bottom right) is equipped with an endoscopic camera, an insufflation channel and a working channel. Illumination is provided by a light-emitting diode (LED). A KUKA LBR Med robotic arm is used to manipulate an external permanent magnet. The endoscopic video feed is projected on a monitor with a graphical interface showing parameters such as relative robot speed and inter-magnetic distance.

Work, thus far, on improving navigation in magnetic endoscopy has been demonstrated in magnetic endoscopes for gastric screening25, catheter steering26 and bronchoscopy27. In the context of mobile and complex environments such as the colon, navigation has only been shown when following simple, predefined trajectories28, thus failing to provide substantial proof of clinical feasibility. The colon is an unstructured and dynamic environment consisting of convoluted soft tissue that is subject to notable variabilities due to gravity, changes in patient position, peristalsis and insufflation. Furthermore, the colon contains obstacles such as tissue folds, water and debris. Pre-defined trajectories would soon become inaccurate in this ever-changing environment. To provide a practical and clinically viable alternative, intelligent control of magnetic endoscopes must be advanced substantially.

We hypothesize that controlling magnetically manipulated endoscopes by introducing superior levels of intelligence and autonomy could increase their navigational performance. This would ultimately reduce procedure times and the mental and physical burden placed on the operator, allowing more focus on the clinical aspects of the procedure, as reduced training would be needed for manual manipulation of the endoscope. This would have a positive effect on the availability of the procedure. Autonomy for magnetic endoscopes can be contextualized in the general trend towards enhanced autonomy that is gaining momentum in the field of medical robotics. Inspired by the standardization of autonomy levels in self-driving cars29, the medical robotics community is converging towards the definition of six levels of autonomy30,31,32 characterized by increased intelligence. In this work, the discussion about our magnetic endoscope and the development of autonomous control will support an analysis of how general definitions can be specified for robotic endoscopy as well as the features required to reach each autonomy level. Our contribution to the field of machine intelligence is the ability to explore how different levels of computer assistance may improve the procedure and reduce user workload in robotic colonoscopy.

The main scientific questions we investigate in this work are as follows. (1) How can intelligent control strategies overcome the inherent complexities of controlling magnetic intracorporeal endoscopes? (2) What level of autonomy is required to enable a non-expert operator to navigate a magnetic endoscope in an unstructured environment such as the colon, while maintaining procedure duration comparable to an FE? (3) Can effective, intelligent control strategies reduce the physical and mental burden of the operator?

A successful outcome to these questions, combined with a technology such as the magnetic flexible endoscope (MFE; Fig. 1), designed for painless colonoscopy, could provide a major improvement and welcome disruption in the early detection and treatment of colorectal diseases. The MFE has been developed by our group over the past 12 years17,24,33, and in this Article we present a comprehensive approach to autonomous navigation of the endoscope.

Besides being crucial for colonoscopy, this work is applicable to several other endoscopic applications where the environment is unstructured and poses notable challenges for effective navigation. This would also reduce the pure dependency on manual expertise. With robotic assistance in navigation, training resources can thus be directed towards the cognitive aspects of endoscopy such as recognition of pathology, differential diagnosis and creation of treatment plans.

To investigate these scientific questions, we developed a control methodology that allows simplified user inputs and image-based, autonomous navigation, capable of computing motion based on a real-time visual analysis of the environment. This methodology was tested comparatively in benchtop and in vivo (porcine model) settings with non-expert users. In doing so, we provide the following contributions:

(1) The demonstration of intelligent and autonomous control enabling non-expert users to successfully perform magnetic colonoscopy by travelling a considerable distance in vivo, and with a duration comparable to standard FE

(2) A framework to define the increasing levels of autonomy in medical robotics applied to robotic flexible endoscopy

(3) An analysis of the autonomous features required to overcome the complexities of magnetic manipulation in unstructured tubular cavities

(4) The development of intelligent and autonomous control strategies for magnetic endoscopy, which have enabled a reduction in exertion for the user

A schematic overview of our approach to control is provided in Fig. 2 and described in Supplementary Video 1. The navigation system is composed of several elementary blocks, organized in three main layers. Each layer provides a set of features characterized by increasing autonomy, relying on functionalities offered by the underlying layers.

Fig. 2: Schematic overview of the control layers associated to autonomy levels.

In the first layer, where no autonomy is available, the user manually controls the robot end effector in five degrees of freedom (DOF) in an attempt to manipulate the MFE. In the second layer, the user controls the endoscope and the system carries out suitable motions of the robot by taking into consideration localization information and magnetic field interaction. In the third layer the user has discrete control over the endoscope and the lumen is detected and followed autonomously. In the standard definition of the autonomy levels, these correspond to level 0, 1 and 3, respectively. Level 2 is defined as ‘task autonomy’ and, in robotic endoscopy, describes autonomous execution of tasks such as retroflexion or guided biopsy.

The first and simplest layer is defined as ‘direct robot operation’. In this layer, the user manipulates the robot (which holds an external permanent magnet, EPM) to influence the MFE motion. This layer exhibits the functionality offered by the mechanical platform and an elementary level of manual control, whereby users must themselves control variations in the interacting magnetic fields. The functionality offered can be associated to level 0, as the manipulator is a mere executor of the movements imparted by the human operator (with the addition of some safety constraints). This layer serves as a comparative baseline for the subsequent developments in control and autonomy defined in this work.

In the second layer, user inputs are directly focused on navigating the endoscope through the colon, while the system carries the burden of generating a suitable magnetic control action to accomplish the desired endoscope motion. In this layer, the presence of the robot is inconsequential to the user, whose inputs directly control the endoscope tip via the video feed. With this, the user intuitively instructs how they wish the endoscope camera to move inside the colon. Using the real-time positional information for the MFE (accuracy of 5 mm (±1 mm) and 6° (±0.8°), 100 Hz, Supplementary Fig. 1), as provided by previous work34 on a magnetic localization algorithm and a Hall effect/inertial measurement unit sensor circuit in the tip of the MFE, this level of control computes the best motion strategy to perform the required action and subsequently operates the robot to adapt the magnetic field accordingly. We define this layer as ‘intelligent endoscope teleoperation’, which can be associated to level 1 or robotic assistance, following the classification provided by refs. 30,31. The human operator maintains continuous control over navigation, while the robot assists with magnetic manipulation.

In the third layer, the system governs motion of the MFE based on a real-time analysis of the endoscopic video feed, combined with knowledge of the endoscope’s pose from the localization system. The direction of motion is computed by an image analysis algorithm that detects the centre of the lumen. The endoscope is then autonomously steered and advanced through the colon using the navigation control developed in the underlying layer. Local real-time knowledge of the anatomy, acquired through the image analysis, is crucial for enabling this level of autonomy. If desired, the user can override the system’s choice by clicking on the desired location in the image. To highlight the autonomous features, we define this layer as ‘semi-autonomous navigation’. This layer can be associated to level 3 or conditional autonomy, where the system generates task strategies and relies on the operator to approve or override the choice. In our system, the navigation task is performed autonomously but under supervision of the operator, who can perform discrete control actions and override the autonomous control to select a different output of orientation.
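The lumen-following idea can be illustrated with a minimal sketch. It is not the image analysis algorithm used in this work, which is not detailed in this section. It assumes a common heuristic (the lumen is the darkest region of the frame, being furthest from the LED) and represents the frame as a plain list of grey-level rows. The names `lumen_centre`, `steering_error` and `dark_fraction` are hypothetical.

```python
def lumen_centre(gray, dark_fraction=0.1):
    """Return the (row, col) centroid of the darkest pixels in a grey-level image.

    ASSUMPTION: the lumen is approximated by the darkest `dark_fraction` of pixels.
    This is a stand-in heuristic, not the detector described in the Article.
    """
    pixels = sorted(value for row in gray for value in row)
    k = max(1, int(len(pixels) * dark_fraction))
    threshold = pixels[k - 1]  # grey level of the k-th darkest pixel
    row_sum = col_sum = count = 0
    for i, row in enumerate(gray):
        for j, value in enumerate(row):
            if value <= threshold:  # ties at the threshold are all included
                row_sum += i
                col_sum += j
                count += 1
    return row_sum / count, col_sum / count


def steering_error(centre, shape):
    """Offset of the lumen centre from the image centre, scaled to about [-1, 1]."""
    height, width = shape
    cy, cx = (height - 1) / 2, (width - 1) / 2
    # max(..., 1.0) avoids division by zero for degenerate one-pixel dimensions
    return (centre[0] - cy) / max(cy, 1.0), (centre[1] - cx) / max(cx, 1.0)


# Synthetic 5 x 5 frame: bright tissue (200) with a dark patch (10) at the bottom left
frame = [[200] * 5 for _ in range(5)]
for i in (3, 4):
    for j in (0, 1):
        frame[i][j] = 10

centre = lumen_centre(frame)
error = steering_error(centre, (5, 5))
```

In a scheme of this kind, the two-component error would be passed to the underlying teleoperation layer as the desired change in camera orientation, and a user click on the image would simply replace `centre` with the clicked pixel.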

In the discussion of levels of autonomy, level 2 has been omitted. This level, defined as ‘task autonomy’, describes a system that carries out semi-autonomous motion but is dependent on a human-in-the-loop to indicate the end target and waypoints of that motion. Examples of level 2 in the context of endoscopy are motion along predefined trajectories28, autonomous retroflexion33 and stabilization of the endoscope’s tip during interventional tasks (for example, biopsy). Although task autonomy is promising and could contribute to the goal of simplifying the overall procedure, this work is focused on navigation inside the colon. As the shape of the colon is not fixed and changes frequently, waypoints and end targets of predefined trajectories under level-2 control would need to be constantly updated by the user. From a technical viewpoint, the features required to perform this task are the same as level 1, as the user remains in continuous control. Accordingly, the discussion of level 2 has been omitted in this work.
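As a compact summary of the mapping described above, the sketch below records which autonomy level each control layer is associated with. Only the four levels named in this section are listed; the enum and dictionary names are illustrative choices, not identifiers from the authors' software.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """Autonomy levels named in this section (higher levels are not discussed here)."""
    NO_AUTONOMY = 0
    ROBOT_ASSISTANCE = 1
    TASK_AUTONOMY = 2
    CONDITIONAL_AUTONOMY = 3


# Control layer -> associated autonomy level, as stated in the text
CONTROL_LAYERS = {
    "direct robot operation": AutonomyLevel.NO_AUTONOMY,
    "intelligent endoscope teleoperation": AutonomyLevel.ROBOT_ASSISTANCE,
    "semi-autonomous navigation": AutonomyLevel.CONDITIONAL_AUTONOMY,
}


def levels_without_layer():
    """Return the listed levels that no control layer in this work implements."""
    used = set(CONTROL_LAYERS.values())
    return [level for level in AutonomyLevel if level not in used]


omitted = levels_without_layer()
```

Here `omitted` contains only `AutonomyLevel.TASK_AUTONOMY`, matching the statement that level 2 is not addressed in this work.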

Benchtop and in vivo results

Experimental validation

We conducted a set of experiments to evaluate the developed control strategies, with their respective performances being scored in terms of navigation and user workload. We first designed an experiment to assess the effectiveness of the endoscope orientation controller (shown in Extended Data Fig. 1 and Supplementary Video 2) and then conducted a benchtop study where users untrained in colonoscopy were asked to make multiple attempts at navigating the MFE in a latex phantom (the layout is represented in Fig. 3c), using the various intelligent control strategies. Finally, we conducted an in vivo study on two porcine models, with the goal of further comparing the performance and ease of use of different control methods in a living being.

Fig. 3: Benchtop experimental set-up and results.

a, The experimental set-up. The user is manipulating the joystick with the right hand and feeding the tether with the left hand. The phantom is covered and the endoscopic video feed is visible in the user interface. b, Successful completion times for each control strategy: direct robot operation, n = 29; intelligent endoscope teleoperation, n = 48; semi-autonomous navigation, n = 50. Red bars indicate median, edges are 25th and 75th percentiles, whiskers indicate range, and red crosses denote outliers. P values were computed using the Kruskal–Wallis test. c, Detail of the latex phantom representing a human colon (M40, Kyoto Kagaku Co.). Anatomical features are reproduced by template fixations provided by the manufacturer. The standard configuration was chosen. d, Example completed trajectory of the MFE using direct robot operation. e, Example completed trajectory of the MFE using intelligent endoscope teleoperation. f, Example endoscope path during a semi-autonomous execution. The user override is represented in yellow and the autonomous motion in blue.

Benchtop experimental results

To compare the different strategies, we performed a comparative trial on a benchtop platform (Supplementary Video 3). A latex simulator was configured into a standard colon shape used by gastrointestinal practitioners during training (Fig. 3c) and then covered from view (Fig. 3a). Ten novice participants (no endoscopy experience) were instructed to navigate the MFE from the rectum to the caecum as quickly as possible, five times for each control strategy (15 in total per user). Each task was repeated five times before proceeding to the following task and all the participants performed the tasks in the same order. Each participant completed all the tasks on the same day, but different participants were admitted to the laboratory on different days. The end of the navigation task (the caecum) was placed and clamped at nine haustral folds from the end of the colon as per the manufacturer’s instructions. This resulted in a rectum-to-caecum distance of 100 cm. A test was labelled as complete upon navigating from the rectum to the caecum in 20 minutes or less. Users were given a lead-in time of 20 minutes for each of the three control strategies to become familiar with the controls before initiation of the trial. The choice of a 20-minute time limit is based on ref. 35, which reports that the average caecal intubation time for a trainee in a standard colonoscopy is 14.1 minutes, as well as the time limit chosen in a colonoscopy simulator study36. Detailed data on each task are available in Supplementary Dataset 1.

After every attempt, users were asked to complete a NASA Task Load Index (TLX) questionnaire37. The NASA TLX is a widely used workload assessment instrument that aims to score human perceived workload on six subjective subscales: mental demand (how mentally demanding was the task?), physical demand (how physically demanding was the task?), temporal demand (how hurried or rushed was the pace of the task?), performance (how successful were you at completing the task?), effort (how hard did you have to work to accomplish your level of performance?) and frustration (how frustrated, insecure, discouraged, irritated, stressed or annoyed were you?). All subscales range from 0 (very low) to 100 (very high) with an exception for performance, which ranges from 0 (perfect) to 100 (failure).
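For illustration, one common way to condense the six subscale ratings into a single number is the unweighted mean, often called the ‘raw TLX’. Whether the ratings in this study were aggregated in this way is not stated here, so the sketch below is an assumption, and the function name and example ratings are hypothetical.

```python
SUBSCALES = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "performance",  # already oriented so that 0 = perfect and 100 = failure
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Unweighted mean of the six NASA TLX subscales (each rated 0 to 100).

    ASSUMPTION: raw (unweighted) scoring; the weighted variant of the TLX
    additionally requires pairwise comparisons between subscales.
    """
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    for name in SUBSCALES:
        if not 0 <= ratings[name] <= 100:
            raise ValueError(f"{name} must lie between 0 and 100")
    return sum(ratings[name] for name in SUBSCALES) / len(SUBSCALES)


# Hypothetical ratings for a single attempt (not data from the study)
example = {name: 30 for name in SUBSCALES}
example["performance"] = 60
score = raw_tlx(example)
```

Because the performance subscale runs from perfect (0) to failure (100), a higher value indicates a higher perceived workload on every subscale, so no subscale needs to be inverted before averaging.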

The overall completion rates (percentage successfully navigated from the rectum to the caecum in 20 minutes or less) for direct robot operation, intelligent teleoperation and semi-autonomous navigation were 58% (29/50), 96% (48/50) and 100% (50/50), respectively. As shown by Fig. 3b, out of all the successful attempts, direct robot operation presented the slowest average completion time of 11 min 8 s ± 3 min 59 s and had the MFE commonly produce convoluted trajectories (an example is shown in Fig. 3d). This was often because the user would position the MFE in an undesired manner, get stuck, then would have to pull back the tip via the tether, readjust the position of the MFE and try again. Intelligent teleoperation and semi-autonomous navigation were substantially faster and comparable to each other, with average completion times of 4 min 6 s ± 2 min 8 s and 4 min 14 s ± 1 min 31 s, respectively. These results outperform colonoscopies carried out on the same phantom by novice users, which, in another study36, lasted an average of 17 min ± 8 min. More details are provided in the Discussion. The completed trajectories of the MFE using endoscope teleoperation (an example is shown in Fig. 3e) and autonomous navigation (Fig. 3f) were much more direct and smooth compared to robot operation—the MFE was positioned more easily, and reaching the caecum did not require the user to inefficiently withdraw and retry difficult sections. The P values in Fig. 3b indicate statistical significance when comparing completion times.
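The arithmetic behind these figures can be checked with a short sketch that uses only the counts and mean times quoted above. The helper names are illustrative.

```python
# (successful attempts, total attempts) per control strategy, as reported above
ATTEMPTS = {
    "direct robot operation": (29, 50),
    "intelligent endoscope teleoperation": (48, 50),
    "semi-autonomous navigation": (50, 50),
}

# Mean completion times of successful attempts, as (minutes, seconds)
MEAN_TIMES = {
    "direct robot operation": (11, 8),
    "intelligent endoscope teleoperation": (4, 6),
    "semi-autonomous navigation": (4, 14),
}


def completion_rate(successes, attempts):
    """Percentage of attempts that reached the caecum within the time limit."""
    return 100.0 * successes / attempts


def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair into seconds."""
    return 60 * minutes + seconds


rates = {name: completion_rate(*counts) for name, counts in ATTEMPTS.items()}
baseline = to_seconds(*MEAN_TIMES["direct robot operation"])
speed_up = {name: baseline / to_seconds(*t) for name, t in MEAN_TIMES.items()}
```

With these numbers, the mean time for direct robot operation is roughly 2.7 times that of intelligent endoscope teleoperation and roughly 2.6 times that of semi-autonomous navigation. The ratio compares mean times of successful attempts only, so it does not reflect the attempts that failed to finish within 20 minutes.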

Regarding ease of use (Fig. 4), users found direct robot operation to be notably more demanding in all NASA task load categories. High levels of effort and frustration arose from the endoscope losing magnetic coupling with the EPM. In different relative poses of the two magnets, user commands produced different changes in magnetic forces and torques, appearing to the users as a random effect on the movement of the MFE. The main points of failure using direct robot operation were the hepatic and splenic flexures, with the lack of an intuitive connection between command and motion making these tight turns particularly difficult to navigate. Of the three control strategies, semi-autonomous navigation presented the lowest user workload scores in all categories. The performance of the autonomous system let the users take on more of a monitoring role; this, in turn, made the task much less demanding.
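The pose dependence described above can be illustrated with the standard point-dipole model, in which the flux density falls off with the cube of the distance and the torque on the internal magnet is the cross product of its moment with the local field. The sketch below is a textbook approximation under assumed unit magnetic moments. It is not the field model used by the system, and the function names are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability (T m / A)


def dipole_field(moment, offset):
    """Flux density (T) of a point dipole `moment` (A m^2) at position `offset` (m)."""
    dist = math.sqrt(sum(c * c for c in offset))
    unit = [c / dist for c in offset]
    m_dot_r = sum(m * u for m, u in zip(moment, unit))
    scale = MU0 / (4 * math.pi * dist ** 3)
    return [scale * (3 * m_dot_r * u - m) for u, m in zip(unit, moment)]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def torque_on_internal_magnet(epm_moment, ipm_moment, offset):
    """Torque (N m) on the internal magnet: its moment crossed with the EPM field."""
    return cross(ipm_moment, dipole_field(epm_moment, offset))


# ASSUMPTION: unit moments chosen for illustration only
epm = [0.0, 0.0, 1.0]
near = dipole_field(epm, [0.0, 0.0, 0.1])
far = dipole_field(epm, [0.0, 0.0, 0.2])
torque = torque_on_internal_magnet(epm, [1.0, 0.0, 0.0], [0.0, 0.0, 0.1])
```

In this model, doubling the inter-magnetic distance reduces the field by a factor of eight, so the same commanded displacement of the EPM produces a very different change in force and torque depending on the current relative pose. This is consistent with users perceiving the response as unpredictable when they control the field directly.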

Fig. 4: NASA Task Load Index mean user workload ratings from benchtop trial results.

High, orange-shaded values indicate poor user experience and low, green-shaded values indicate good user experience.

In the 50 successful semi-autonomous repetitions, the MFE was autonomously operated for, on average, 91% of the total time required to navigate from the rectum to the caecum, with 12 completed procedures being performed fully autonomously without any manual override necessary. Of the procedures requiring manual intervention, users most commonly needed to give an input via the joystick in the rectum due to the multiple sharp turns found in quick succession that placed the lumen behind and out of view of the camera. An example of semi-autonomous execution is shown in Fig. 3f.
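A minimal sketch of how such a percentage could be derived from a log of control-mode segments is given below. The log format, the mode labels and the example durations are assumptions for illustration and do not reproduce data from the study.

```python
def autonomous_fraction(segments):
    """Fraction of total navigation time spent in autonomous mode.

    `segments` is a list of (mode, duration_s) pairs, where mode is either
    "auto" or "manual". ASSUMPTION: this log format is hypothetical.
    """
    total = 0.0
    autonomous = 0.0
    for mode, duration in segments:
        if mode not in ("auto", "manual"):
            raise ValueError(f"unknown mode: {mode}")
        total += duration
        if mode == "auto":
            autonomous += duration
    if total == 0:
        raise ValueError("empty log")
    return autonomous / total


# Hypothetical run: the user overrides once for 20 s
log = [("auto", 200), ("manual", 20), ("auto", 30)]
fraction = autonomous_fraction(log)
fully_autonomous = all(mode == "auto" for mode, _ in log)
```

A run counts as fully autonomous under this definition only when its log contains no manual segment.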

In vivo experimental results

After the benchtop study highlighted the improved ease of use and performance associated with increased MFE autonomy, we performed an in vivo study on a porcine model (two female Yorkshire-Landrace pigs, 33 kg and 35 kg; Supplementary Video 3). The primary objectives of the experiments were as follows: (1) to highlight the shortcomings of simple robot teleoperation in magnetic manipulation; (2) to compare the benefits provided to non-trained users by the increasingly intelligent control strategies in a variable and tortuous environment such as the porcine colon; (3) given that a porcine colon is comparatively more difficult to navigate than a human colon, to provide a strong indicator for the potential of the system in the less demanding human anatomy. The increased tortuosity of the porcine colon results from its highly spiralled structure (Supplementary Fig. 3). This continuously spiralling trajectory arguably creates more points of tissue–MFE contact and increased friction, which requires a higher magnetic force to overcome. Furthermore, the colon loops present a navigational challenge that requires continuous rotation of the internal permanent magnet (IPM) and EPM, often reaching the limits of magnetic actuation or of the robotic manipulator’s joints. As a result, repositioning of the animal (for example, rotation) or reconfiguring the robotic manipulator’s joints may be necessary, thereby extending the overall procedure time.

The experimental scheme was completed by two operators with no prior endoscopic experience. The experiments were designed to compare the use of a conventional FE (Olympus PCF-160AL) and the various levels of control strategies developed for the MFE. At the beginning of the experiment, each user was given a 10-min lead-in period and instructed to use a standard FE to travel as far as possible inside the porcine colon. After 10 min, the end point (the furthest distance reached in the colon) was tattooed to serve as a comparable distance marker for subsequent attempts (Fig. 5b). Travelled distance was measured using the incremental markings on the endoscope insertion tube. At every iteration, if the end-point distance reached surpassed the marker, the new furthest point reached was measured, tattooed and updated to be the new target.

Fig. 5: In vivo results.

a, Experimental set-up of the in vivo trial. The robotic arm is operating the MFE. The endoscopic video feed is displayed on the user interface. b, Detail of the tattoo marker used to identify the maximum distance reached with a conventional endoscope. c, The path travelled by the MFE using autonomously assisted control (level 3), reaching 85 cm. Two anatomical loops can be observed, with the MFE being able to successfully overcome the difficult turns. d, An example video frame of the system autonomously detecting and steering the MFE towards the porcine lumen (green dot).

Subsequently, each user attempted to navigate the porcine colon with the MFE using the different control strategies (Fig. 5a). Trials were divided into sets in the order of one direct robot operation, one intelligent teleoperation and one semi-autonomous navigation. The number of sets performed was four on the first animal and three on the second animal, as the available time was affected by limiting factors such as the risk of prolonged anaesthesia. During every repetition, the time required to reach the tattooed marker and the position of the EPM and MFE were recorded. The user completed a NASA TLX after each attempt to compare the ease of use of the different approaches. Detailed data are provided in Supplementary Dataset 2.

Completion times and completion rates for the two users are reported in Supplementary Table 1. On the first animal, the tattooed distance reached using the standard FE was 45 cm. Substantial tortuosity in the colon prevented any further distance being traversed. The user was then able to perform four attempts using each MFE control strategy. As time allowed four sets to be completed using the MFE, the fastest four attempts using the FE were used in this comparison. The average completion times were 1 min 39 s for the standard FE, 9 min 4 s for direct robot operation, 2 min 20 s for intelligent endoscope teleoperation, and 3 min 9 s for semi-autonomous navigation. The same approach was followed on the second animal. During the initial phase with the conventional FE, the user reached a notable distance of 85 cm, which became the tattooed-distance target for following attempts. A faecal blockage prevented travelling any further distance. Although the difference between distances travelled in the first and second animals is notable, this is quite common in experiments involving animals, where the colon is tortuous, prone to gas retention (which can cause the bowel to press into and collapse neighbouring lumens) and difficult to clean before the procedure (for example, humans undergo a rigorous bowel preparation that requires ingestion of fluids in a closely followed protocol—this cannot be performed on animals).

Time allowed three sets of attempts to be completed using the MFE, with the fastest three standard FE attempts being used for comparison. The average completion times for the second user were 3 min 29 s for the standard FE, 8 min 36 s for intelligent endoscope teleoperation and 9 min 39 s for semi-autonomous navigation. The lowest level, direct robot operation, was unable to reach the marker.

The trajectory of the MFE during one of the autonomously assisted control trials (level 3, user 2, target of 85 cm) is shown in Fig. 5c, with the on-board camera image detecting the lumen shown in Fig. 5d. The trajectory shows the MFE being able to overcome two loops and several tortuous bends. Regarding user workload (Fig. 6), both users found the standard FE and direct robot operation to be the most demanding in all NASA workload categories. Direct robot operation was more demanding than a standard FE for most mental workload categories. The FE, while difficult to master, has a physical cable link between the control interface and the tip, resulting in a direct and predictable response in tip movement. In direct robot operation, the absence of both intelligent control and a physical link between the interacting magnets meant that the user had to mentally predict the result of their next input, given the current state of the magnetic system, often resulting in frustration when the MFE did not move as predicted. Intelligent teleoperation and semi-autonomous navigation were substantially less demanding for the novice user. Similarly to the benchtop experiments, in autonomous mode the user had the ability to override the motion with manual control. During the semi-autonomous repetitions, the MFE was navigated in autonomous mode for (on average) 87% of the time required to reach the marker for user 1 (total distance of 45 cm) and 78% for user 2 (total distance of 85 cm). This remarkable result was obtained under the supervision of a veterinary surgeon, who continuously verified the safety of the procedure. Such a high rate of autonomy indicates that the semi-autonomous mode, in conjunction with the use of safety measures such as limited minimum inter-magnetic distance, provides satisfactory safety levels.

Fig. 6: NASA Task Load Index mean workload ratings on porcine models.

High, orange-shaded values indicate poor user experience and low, green-shaded values indicate good user experience.

Discussion

In this work, we enable intelligent and autonomous navigation of magnetic endoscopes in complex environments such as the colon and define how increasing levels of autonomy can be applied to robotic endoscopy. We discuss the features required to enable each autonomy level and synthesize an integrated control scheme in which software layers with higher intelligence capitalize on features offered by the underlying layers. The effectiveness of our techniques was tested on benchtop and in vivo, in a porcine model. With respect to the scientific questions outlined at the beginning of this Article, we show the following:

(1) The inherent complexity of navigating magnetic endoscopes with a single external permanent magnet can be overcome by the developed intelligent control strategies. These were able to mask the unintuitive nature of interacting magnetic fields and field gradients. In particular, the simultaneous use of localization and an advanced closed-loop control strategy is crucial to achieve satisfactory procedure times. The availability of a reliable localization mechanism can be substituted by estimation-based techniques27 or visual feedback38 in applications where the environment is more structured and constrained, such as navigation in lungs or cardiovascular apparatus. However, the inherent complexities of colonoscopy require internal (such as in this work) or external localization39.

Moreover, we show how an effective control strategy can overcome the limitations imposed by the actuation of a single permanent magnet. Systems based on different magnetic field sources such as coils40 or rotating permanent magnets41 may provide similar capabilities with comparable results, although continuous rotation of the endoscope (and, consequently, of the camera) may hinder the integration of vision, localization and control.

(2) The minimum level of autonomy required for a non-expert to effectively navigate a complex environment such as the colon is level 1. In ref. 36, the time required to reach the caecum with a conventional endoscope (on the same phantom used for this study) was evaluated on 32 novice users and 21 experienced endoscopists. An average of 17 min ± 8 min was required for completely untrained operators, decreasing to 11 min ± 7 min after 11 h of training. Additionally, experienced colonoscopists who performed the same test gave an average procedure duration of 7 min ± 5 min. The results of this study show that endoscope teleoperation and semi-autonomous navigation outperform conventional colonoscopy for novice and newly trained operators, reducing the time to reach the caecum to a value comparable to that of experienced clinicians.

(3) Autonomous navigation introduces an important step towards the autonomous execution of colonoscopy, thus providing substantial benefits in terms of reducing mental and physical workload. Moreover, the degree of autonomy enabled by this feature, similar to other tasks like surgical suturing, has the potential to revolutionize clinical workflow, requiring minimal and discontinuous intervention from the operator. In the future, the robot speed will be increased to achieve faster motion and further reduce procedure duration.

Other examples of the manipulation of magnetic endoscopes are available in the literature42, but they either lack localization27, an endoscope tether (required for interventional capabilities)43 or an intelligent control system44, limiting the translation to clinical use. This Article shows a tethered magnetic endoscope successfully navigating the colon of a porcine model by means of a blended use of magnetic localization, closed-loop robotic control and elaboration of the endoscope camera image.

The results shown here build on work carried out over 12 years, in which the foundations of the MFE platform have been developed. In previous works, our group evaluated different control strategies aimed at tackling particular aspects of navigation such as predefined trajectories on benchtop34, levitation45 or overcoming obstacles46, but here we describe the full control of the navigation process, successfully piloting the MFE in a porcine model. We also demonstrate autonomous navigation of magnetically manipulated endoscopes, in vivo. By adopting a fusion of magnetic and visual feedback, we have developed a system that can make endoscopic inspection of the bowel autonomous and more user friendly when compared to using conventional endoscopes. Our aim is to reduce the complexity of endoscopic procedures by automating the manual aspects of endoscope manipulation, thus reducing the burden on the operator and enabling more focus on the clinical aspects of the procedure. This work may facilitate the adoption of colonoscopy by requiring a reduced skillset for the navigation of magnetic endoscope devices, thus allowing previously required training resources to be better utilized on the diagnosis and treatment of patients. Considering the demand for colonoscopy and the expected rise in preventive screening campaigns in the next decade, the results of this work may substantially contribute to saving human lives. This work is also crucial as a scientific foundation for transitioning to clinical trials, where other crucial hypotheses, such as acceptability and level of pain associated with the procedure, can be tested.

With 19 million colonoscopies performed every year in the United States and the European Union and with demand constantly increasing, this technology has a disruptive potential to revolutionize current practice. Potential impacts of this work also concern the control of magnetic endoscopes for other applications such as gastroscopy47 and bronchoscopy48. The framework defined in this work could be adopted with different endoscope designs and lays the groundwork for the development of additional levels of autonomy.

The methods applied in this work strongly rely on endoscope localization, which operates under the assumption of no distortion in the magnetic field produced by the EPM. This can be a challenging restriction in clinical environments, although the sensitive workspace is limited to the patient’s abdominal area and the availability of MRI-compatible devices may mitigate this hurdle. Moreover, the effective workspace is limited by the strength of the magnets. This may have a negative effect on patients with high body mass index, for which the minimum safe EPM-to-endoscope distance is higher. The magnet strength also affects the manoeuvrability of the endoscope; stronger magnets would require smaller EPM motion, thus improving the endoscope reactivity and disturbance rejection. The most viable solution would be to revise the design of the endoscope, which is currently limited to using standalone cameras that require dedicated cabling and reduce the space available for the magnet. To further validate the methods described in this Article, in the future we will consider an extensive trial on benchtop phantoms with complex configurations such as alpha loops, which may be substantially more complex to navigate.

For the benchtop study, the purpose was to validate our control methods and test the hypothesis that a non-expert can navigate a colon with low mental and physical exertion by using the MFE and increasing robotic assistance. Therefore, a cohort of complete novices was deemed most appropriate. For future work, a study involving users of various skill levels would be interesting and would indicate the learning curve of this technology—something we have explored previously49 and intend to explore in future works with the next generation of the hardware platform.

The findings of this work also open the way towards the development of other autonomous tasks in endoscopy. Further benefits could be found through the development of autonomous control strategies to aid in therapeutic tasks such as biopsy and polypectomy. The current diagnostic practice relies on an operator’s experience and training in analysing the endoscopic image. We hypothesize that artificial intelligence and autonomous navigation may, in the future, be coupled to ultimately improve patient care (diagnosis and therapy) and that, with the development of dedicated control strategies (that completely integrate a vision module with artificial intelligence), higher levels of autonomy (that is, level 4) will be possible. Furthermore, this work is particularly timely, with the global COVID-19 pandemic severely restricting endoscopy practice under instruction from governing bodies50. Standard endoscopy requires multiple staff working in close proximity to one another and to the patient. This is problematic, because FEs generate aerosols that can readily spread infection between the multiple, grouped personnel. The MFE and the control developed here demonstrate the potential for robotic endoscopy procedures to be performed with fewer staff and, with minor adjustments to the MFE system (such as a simple tether feeder), the option to reduce contact between staff and patient. This may facilitate procedures with considerably lower risk of viral infection or cross-contamination and allow endoscopy practice to be unhindered by any future pandemics.

Intelligent control and autonomous navigation

The experiments described in this work are aimed at evaluating the performance of the different levels of autonomy required to successfully navigate the colon with the MFE platform. In this section, we first describe the system and the features provided by each layer. We then discuss the validation process of the autonomous lumen detection algorithm.

System overview

The main components of the system are a robotic arm with an EPM mounted on its tip and the magnetic endoscope. The endoscope, shown in Fig. 1, is composed of a three-dimensional (3D) printed shell, a localization circuit, an endoscopic camera and an intracorporeal permanent magnet (IPM) that is immersed in the field produced by the EPM.

The interaction between the EPM and IPM, shown in Supplementary Fig. 4, is provided by magnetic coupling. Forces (\({\bf{f}}_{\bf{l}}\)) and torques (\({\uptau}_{\bf{l}}\)) exerted on the IPM, computed with respect to the world reference frame OW, are described by the magnetic dipole model:

$${\bf{f}}_{\bf{l}} = \nabla ({\bf{m}}_{\bf{I}} \cdot {\bf{B}}_{\bf{E}}),\quad {\uptau}_{\bf{l}} = {\bf{m}}_{\bf{I}} \times {\bf{B}}_{\bf{E}}$$

where mE and mI are the magnetic moments of the EPM and IPM expressed with respect to the global coordinate frame OW, BE is the vector representing the magnetic field generated by the EPM at the IPM position, and pE and pI are the vectors describing the Cartesian positions of the EPM and IPM. A 5-DOF manipulator could theoretically move the EPM to any pose (rotation about the magnetization axis has no effect on the field), generating all feasible combinations of forces and torques accordingly. In practice, a 7-DOF manipulator, such as the KUKA LBR Med R820 used in this work, provides enhanced dexterity and minimizes the risk of reaching joint limits. The system was implemented via ROS (Robot Operating System), with the robotic manipulator controlled in the joint space. For all tests, user inputs were given via a 6-DOF joystick (3D Space Pilot, 3D Connexion). This high-DOF joystick was necessary, as control level 0 requires the user to control the motions of the EPM in multiple DOF (pitch, yaw and up, down, forwards and backwards translation). This joystick was then also used for endoscope teleoperation and autonomous navigation to remain ergonomically consistent across all tests. After completing this study it was noted that the more successful intelligent control levels required simplified user inputs with reduced DOF. As such, a more appropriate joystick can be used for future development, such as a PlayStation 3 navigation controller (Fig. 2, Sony Corporation).
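As an illustration of equation (1), the following minimal sketch evaluates the force and torque on the IPM numerically. It assumes the standard point-dipole expression for the EPM field and approximates the gradient by central differences. The magnetic moments, positions and function names are hypothetical values chosen for the example, not parameters of the MFE platform.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability (T m / A)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dipole_field(m_e, p_e, p):
    """Field B_E of a point dipole m_e located at p_e, evaluated at p."""
    r = [pc - ec for pc, ec in zip(p, p_e)]
    d = math.sqrt(dot(r, r))
    r_hat = [rc / d for rc in r]
    scale = MU0 / (4.0 * math.pi * d ** 3)
    m_dot_r = dot(m_e, r_hat)
    return [scale * (3.0 * m_dot_r * rh - mc) for rh, mc in zip(r_hat, m_e)]


def ipm_torque(m_i, m_e, p_e, p_i):
    """tau_l = m_I x B_E."""
    return cross(m_i, dipole_field(m_e, p_e, p_i))


def ipm_force(m_i, m_e, p_e, p_i, h=1e-5):
    """f_l = grad(m_I . B_E), by central differences over the IPM position."""
    force = []
    for axis in range(3):
        ahead, behind = list(p_i), list(p_i)
        ahead[axis] += h
        behind[axis] -= h
        u_ahead = dot(m_i, dipole_field(m_e, p_e, ahead))
        u_behind = dot(m_i, dipole_field(m_e, p_e, behind))
        force.append((u_ahead - u_behind) / (2.0 * h))
    return force


# Hypothetical values: EPM 15 cm directly above the IPM, both magnetized along +z.
m_E, p_E = [0.0, 0.0, 100.0], [0.0, 0.0, 0.15]
m_I, p_I = [0.0, 0.0, 0.5], [0.0, 0.0, 0.0]
f_l = ipm_force(m_I, m_E, p_E, p_I)
tau_l = ipm_torque(m_I, m_E, p_E, p_I)
```

In this aligned configuration the force is purely vertical and attractive and the torque vanishes. Moving the EPM sideways or tilting it produces the coupled, nonlinear changes in force and torque that the control layers below are designed to hide from the operator.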

Direct robot operation

This layer provides the core functionality of teleoperating the EPM to indirectly produce effects on the MFE. The end effector of the robotic arm is operated via a joystick interface connected to a low-level controller. Operator commands provided through the joystick interface constitute the user’s request to move the robot end effector. Inputs, which include angular displacements \({\updelta} {{{{\hat {\bf{m}}}}}_{{{\bf{E}}}}} \in {\Bbb R}^3\) (where \({{{{\hat {\bf{m}}}}}_{{{\bf{E}}}}}\) is the unit vector associated to mE) and linear displacements \(\updelta {{{\bf{p}}}}_{{{\bf{E}}}} \in {\Bbb R}^3\), are gathered and transformed in joint angle variations \(\updelta {{{\bf{q}}}} \in {\Bbb R}^7\) by means of the differential relation:

$${{\updelta {\bf{q}}}} = {{{\bf{J}}}}^\dagger {{{\bf{W}}}}_{{{\bf{a}}}}{{{\bf{s}}}}\left[ {\begin{array}{*{20}{c}} {{{\updelta {\bf{p}}}}_{{{\bf{E}}}}} \\ {{{\updelta }} {{{\hat{\bf{m}}}}_{{{\bf{E}}}}}} \end{array}} \right] + {{\gamma }}\left( {{{{\bf{I}}}} - {{{\bf{J}}}}^\dagger {{{\bf{J}}}}} \right){{\updelta {\bf{q}}}}_0$$

where \({{{\bf{J}}}}^\dagger \in {\Bbb R}^{7 \times 6}\) is the pseudoinverse of the robot’s Jacobian and \({{{\bf{W}}}}_{{{\bf{a}}}} \in {\Bbb R}^{6 \times 6}\) is a suitable weighting matrix. To avoid joint limits, a modified version of the saturation in the null space algorithm51 has been adopted. This involves scaling the task by means of the scaling factor \({{{\bf{s}}}} \in {\Bbb R}^6\) (Supplementary algorithm 2) and injecting a suitable action \(\updelta {{{\bf{q}}}}_0 \in {\Bbb R}^7\) in the null space of the robot’s Jacobian, multiplied by a scaling factor γ, a function of robot velocity gain (manually tuned). In this work, δq0 was chosen in such a way to minimize the distance of the robot’s joint angles from the central position (Supplementary algorithm 3). For safety reasons, the absolute rotation and vertical height of the end effector have been limited and are adjustable through the user interface. The robot joint angles q(t) are computed at every time step of the control algorithm by time integration (forward Euler method).
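The structure of equation (2) can be sketched on a much smaller problem. The example below uses a hypothetical planar three-joint arm with a two-dimensional task in place of the 7-DOF manipulator. It takes the weighting matrix as the identity, uses a single scalar scaling factor `s`, and assumes the central joint position is 0 rad. These are simplifications for illustration and do not reproduce Supplementary algorithms 2 and 3.

```python
import math


def planar_jacobian(q, lengths):
    """2 x n Jacobian of the tip position of a planar serial arm."""
    n = len(q)
    cumulative = []
    angle = 0.0
    for qi in q:
        angle += qi
        cumulative.append(angle)
    jac = [[0.0] * n, [0.0] * n]
    for j in range(n):
        for i in range(j, n):
            jac[0][j] -= lengths[i] * math.sin(cumulative[i])
            jac[1][j] += lengths[i] * math.cos(cumulative[i])
    return jac


def right_pseudoinverse(jac):
    """J^+ = J^T (J J^T)^-1 for a full-row-rank 2 x n matrix."""
    a = sum(v * v for v in jac[0])
    b = sum(u * v for u, v in zip(jac[0], jac[1]))
    d = sum(v * v for v in jac[1])
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    return [[jac[0][k] * inv[0][c] + jac[1][k] * inv[1][c] for c in range(2)]
            for k in range(len(jac[0]))]


def joint_step(q, dx, lengths, s=1.0, gamma=0.1):
    """One step of dq = J^+ s dx + gamma (I - J^+ J) dq0, then forward Euler."""
    n = len(q)
    jac = planar_jacobian(q, lengths)
    jac_pinv = right_pseudoinverse(jac)
    # Primary task: scaled end-effector displacement mapped to joint space.
    task = [sum(jac_pinv[k][c] * s * dx[c] for c in range(2)) for k in range(n)]
    # Secondary action: pull every joint towards its central position (0 rad here).
    dq0 = [-qi for qi in q]
    # Project dq0 into the null space of the Jacobian: (I - J^+ J) dq0.
    j_dq0 = [sum(jac[c][k] * dq0[k] for k in range(n)) for c in range(2)]
    null = [dq0[k] - sum(jac_pinv[k][c] * j_dq0[c] for c in range(2))
            for k in range(n)]
    dq = [task[k] + gamma * null[k] for k in range(n)]
    q_next = [q[k] + dq[k] for k in range(n)]  # forward Euler update
    return q_next, dq


lengths = [0.4, 0.4, 0.2]   # hypothetical link lengths (m)
q = [0.3, 0.4, 0.5]         # current joint angles (rad)
dx = [0.01, -0.005]         # requested tip displacement (m)
q_next, dq = joint_step(q, dx, lengths)
```

Because the secondary action is projected into the null space, it re-centres the joints without changing the commanded end-effector displacement. This is the property the joint-limit avoidance relies on.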

The position and orientation of the endoscope tip are shown to the user in conjunction with the robot pose by means of a virtual 3D environment. Simultaneously, the video feed of the camera embedded in the capsule is presented to the user. The robot’s end effector is teleoperated under the assumption that the endoscope would follow the motion of the EPM, due to magnetic coupling. This approach, although very simple from a computational and architectural viewpoint, presents several drawbacks: (1) the magnetic coupling is nonlinear, hence similar variations of the EPM pose might not result in the same effect on the endoscope, (2) the orientation of the camera is not aligned with the perspective of the user (gravity could be in any direction in the camera frame), increasing the mental effort required of the operator and (3) the control action applied to the endoscope is suboptimal, as the human presence in the loop can reduce performance.

Owing to the axial symmetry of the permanent magnets, rotations about their longitudinal axis have no effect on the magnetic field. For this reason, the orientation in the roll axis of the endoscope is not controllable and therefore the camera alignment on that axis is ungovernable. The operator is required to mentally compute the rotation of the image with respect to the horizon and consider it when operating the robot. Endoscopists usually tackle this complexity by a trial-and-error process, but experience is required and the operator can experience considerable stress. Moreover, the effect generated on the endoscope by the EPM motion can change in different relative poses due to the nonlinearity of the magnetic coupling, thus adding complexity to the navigation task.

During preliminary tests, several users reported severe difficulties in separately requesting end-effector rotation and translation. This might be due to the complexity of pushing the controller joystick without inducing any rotation and vice versa. To ease the teleoperation, the EPM motion was restricted by preventing movements along the y axis and rotations around the roll axis, as the first corresponds to lateral motion of the endoscope, while the second is a rotation around the magnetization axis. Moreover, two operating modalities were defined with the aim of separating the motion and orientation control of the endoscope. In the first, pitch rotation of the EPM is prevented. In the second, linear motions are nullified in favour of rotation control. This feature is obtained by assigning suitable weights to the matrix Wa, shown in Supplementary equation (1).

Intelligent endoscope teleoperation

The main feature of this layer is to mask the complexity of teleoperating the robot for a desired motion of the endoscope, as inducing an effect on the IPM by commanding motions of the EPM is unintuitive. This subsystem provides direct control of the endoscope’s tip to the user, thus overcoming the limitations of the lower layer in terms of ease of use.

Intelligent teleoperation of the MFE is enabled by a real-time localization system, as thoroughly described in ref. 34. This is based on a particle filter estimation of the tip pose with respect to a pre-computed map of the magnetic field, generated by the EPM52. The system is capable of estimating the orientation and position of the MFE tip with an accuracy of 5 mm (±1 mm) and 6° (±0.8°) in static and dynamic conditions. Taking advantage of the sensing provided by the localization, a closed-loop control scheme aimed at navigating the endoscope was explored.

To develop a control system based on a linear model, a magnetic dipole model for the forces and torques described in equation (1) can be expressed with respect to the position and orientation of the magnets and locally linearized, resulting in the following differential relation:

$$\begin{array}{rcl}\left[ {\begin{array}{*{20}{c}} {{{\updelta {\bf{f}}}}_{\bf{l}}} \\ {\updelta}{{{\uptau }}_{\bf{l}}} \end{array}} \right] &=& \left[ {\begin{array}{*{20}{c}} {\frac{{\partial {\bf{F}}_{\bf{m}}}}{{\partial {\bf{p}}_{\bf{E}}}}} & {\frac{{\partial {\bf{F}}_{\bf{m}}}}{{\partial {\bf{p}}_{\bf{I}}}}} & {\frac{{\partial {\bf{F}}_{\bf{m}}}}{{\partial {\hat{\bf{m}}_{\bf{E}}}}}} & {\frac{{\partial {\bf{F}}_{\bf{m}}}}{{\partial \widehat {{\bf{m}}_{\bf{I}}}}}} \\ {\frac{{\partial {{\uptau }}_{\bf{m}}}}{{\partial {\bf{p}}_{\bf{E}}}}} & {\frac{{\partial {{\uptau }}_{\bf{m}}}}{{\partial {\bf{p}}_{\bf{I}}}}} & {\frac{{\partial {{\uptau }}_{\bf{m}}}}{{\partial {{{\hat{\bf{m}}}}_{\bf{E}}}}}} & {\frac{{\partial {{\uptau }}_{\bf{m}}}}{{\partial {{\hat{\bf{m}}}_{\bf{I}}}}}} \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {{{\updelta {\bf{p}}}}_{\bf{E}}} \\ {{{\updelta {\bf{p}}}}_{\bf{I}}} \\ {{{\updelta }}\widehat {{\bf{m}}_{\bf{E}}}} \\ {{{\updelta }}\widehat {{\bf{m}}_{\bf{I}}}} \end{array}} \right]\\ &=& {\bf{J}}_{{\bf{FA}}}\left( {{\bf{p}}_{\bf{E}},{\bf{p}}_{\bf{I}},\widehat {{\bf{m}}_{\bf{E}}},\widehat {{\bf{m}}_{\bf{I}}}} \right)\left[ {\begin{array}{*{20}{c}} {{{\updelta {\bf{p}}}}_{\bf{E}}} \\ {{{\updelta {\bf{p}}}}_{\bf{I}}} \\ {{{\updelta }}\widehat {{\bf{m}}_{\bf{E}}}} \\ {{{\updelta }}\widehat {{\bf{m}}_{\bf{I}}}} \end{array}} \right]\end{array}$$

where \({\bf{F}}_{\bf{m}} \in {\Bbb R}^3\) and \({\uptau}_{\bf{m}} \in {\Bbb R}^3\) are the nonlinear expressions of magnetic forces and torques (the complete expression is provided in Supplementary equation (2))53, \({\bf{p}}_{\bf{E}}\) and \({\bf{p}}_{\bf{I}} \in {\Bbb R}^3\) are the positions of the EPM and IPM, \(\hat{\bf{m}}_{\bf{E}}\) and \(\hat{\bf{m}}_{\bf{I}} \in {\Bbb R}^3\) are the unit vectors representing the orientation of the EPM and IPM in the world reference frame, and \(\updelta {\bf{f}}_{\bf{l}}\) and \(\updelta {\uptau}_{\bf{l}} \in {\Bbb R}^3\) represent the variation of \({\bf{F}}_{\bf{m}}\) and \({\uptau}_{\bf{m}}\) with respect to a local configuration change. Assuming a constant pose of the endoscope, equation (3) can be simplified to

$$\left[ {\begin{array}{*{20}{c}} {\updelta {{{\bf{f}}}}_{{{\bf{l}}}}} \\ {\updelta {{\uptau }}_{{{\bf{l}}}}} \end{array}} \right] = {{{\bf{J}}}}_{{{\bf{F}}}}\left( {{{{\bf{p}}}}_{{{\bf{E}}}},{{{\bf{p}}}}_{{{\bf{I}}}},\widehat {{{{\bf{m}}}}_{{{\bf{E}}}}},\widehat {{{{\bf{m}}}}_{{{\bf{I}}}}}} \right)\left[ {\begin{array}{*{20}{c}} {\updelta {{{\bf{p}}}}_{{{\bf{E}}}}} \\ {\updelta \widehat {{{{\bf{m}}}}_{{{\bf{E}}}}}} \end{array}} \right]$$

Although the magnetic dipole model is globally nonlinear, the local linearization and constant endoscope pose are reasonable assumptions because the endoscope moves slowly (~0.01 m s−1) relative to the rate of the control loop (100 Hz), that is, roughly 0.1 mm per control step. The Jacobian JF is recomputed at every time step; thus, the simplified linear model remains locally valid and provides satisfactory performance.

The orientation control is carried out by a closed-loop system, described by the following expression:

$${\updelta{{{\uptau} }}}_{{{\bf{l}}}} = {{{\bf{R}}}}_{{{\bf{G}}}}^{{{\bf{I}}}}\;{{{\rm{pd}}}}\left( {\overline {{{\updelta }}{\boldsymbol{\vartheta}} _{{{{\bf{a}}}},{{{\bf{I}}}}}} ,\overline {{{\upomega }}_{{{{\bf{a}}}},{{{\bf{I}}}}}} } \right)$$

The pd() function computes a proportional-derivative control action with respect to the user input \(\overline {\updelta {\boldsymbol{\vartheta}} _{{{{\bf{a}}}},{{{\bf{I}}}}}} \in {\Bbb R}^3\) and the current endoscope angular velocities \(\overline {{{\upomega }}_{{{{\bf{a}}}},{{{\bf{I}}}}}} \in {\Bbb R}^3\) expressed in local coordinates (the overbar indicates the local reference frame). \({{{\bf{R}}}}_{{{\bf{G}}}}^{{{\bf{I}}}} \in {\Bbb R}^{3 \times 3}\) is the rotation matrix describing the endoscope orientation with respect to the global reference frame, shown in Supplementary Fig. 5.
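A minimal sketch of equation (5) follows. The gains, the sign convention of the damping term and the reading of the rotation matrix as a local-to-global map are assumptions made for the example. The source does not specify the internals of pd().

```python
def pd(angle_error, angular_velocity, kp=2.0, kd=0.1):
    """Proportional term on the commanded rotation, damping on measured velocity."""
    return [kp * e - kd * w for e, w in zip(angle_error, angular_velocity)]


def rotate(rotation, vector):
    """Multiply a 3 x 3 rotation matrix by a 3-vector."""
    return [sum(rotation[r][c] * vector[c] for c in range(3)) for r in range(3)]


def torque_command(rotation_local_to_global, angle_error_local, omega_local):
    """delta_tau_l = R * pd(delta_theta, omega), expressed in the global frame."""
    return rotate(rotation_local_to_global, pd(angle_error_local, omega_local))


# Hypothetical pose: endoscope yawed 90 degrees about the global z axis.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
delta_tau = torque_command(R, [0.1, 0.0, 0.0], [0.0, 0.0, 0.0])
```

Here a rotation requested about the local x axis becomes a torque request about the global y axis. The rotation matrix is what lets the user command the tip in the camera frame.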

The control of the linear motion of the endoscope is not based on the linearized magnetic model of equation (5). When a linear motion is required, the orientation of the endoscope \({{{{\hat {\bf{m}}}}}_{{{\bf{I}}}}}\) is projected on the horizontal plane by the projxy() operator and multiplied by the motion command \(\overline {\updelta {{{\bf{X}}}}_{{{{\bf{a}}}},{{{\bf{I}}}}}} \in {\Bbb R}^3\). A damping term \(\left( {\alpha \left[ {1 - \frac{{F_{{\rm{m}}_z}}}{{F_{{\rm{m}}_z,{\rm{max}}}}}} \right]} \right)\) is introduced to maintain the EPM in the proximity of the endoscope, where \(F_{{\rm{m}}_{z}}\) is the force exerted by the EPM along the z (vertical) direction, \(F_{{\rm{m}}_z,{\rm{max}}}\) is the maximum value of the same force and α is a weighting constant. Finally, \(W_{\rm{FF}} \in \left\{ {0,1} \right\}\) is an activation term, thus enabling the feedforward term when the motion is commanded. The overall control function is shown in equation (6), while the computation of the pseudoinverse of the Jacobian \({\mathbf{J}}_{\mathbf{F}}^\dagger\) is carried out by means of the weighted/damped least-squares algorithm as shown in Supplementary algorithm 4:

$$\left[ {\begin{array}{*{20}{c}} {\updelta {\bf{p}}_{\bf{E}}} \\ {\updelta \widehat {{\bf{m}}_{\bf{E}}}} \end{array}} \right] = {\bf{J}}_{\bf{F}}^\dagger \left[ {\begin{array}{*{20}{c}} 0 \\ {\updelta {\uptau}_{\bf{l}}} \end{array}} \right] + W_{\rm{FF}}\left( {\overline {\updelta {\bf{X}}_{{\bf{a}},{\bf{I}}}} } \right)\,{\rm{proj}}_{xy}\left( {\widehat {{\bf{m}}_{\bf{I}}}} \right)\left( {\overline {\updelta {\bf{X}}_{{\bf{a}},{\bf{I}}}} - \alpha \left[ {1 - \frac{F_{{\rm{m}}_z}}{F_{{\rm{m}}_z,{\rm{max}}}}} \right]} \right)$$

Preliminary trials have shown that a linearized approach to teleoperation of the endoscope might induce drift in the EPM with respect to the optimal pose (that is, exactly above the endoscope), thus resulting in reduced controllability of the MFE. In the normal motion state, the robot is controlled by equation (6). If the magnetic coupling is not optimal (the condition number of \({\bf{J}}_{\bf{F}}\) is much greater than 1), the system enters a ‘recoupling’ state, in which equation (7) controls the robot motion and brings the magnet back to the optimal pose. The pi() function computes a proportional-integral control action to move the EPM directly above the endoscope, maintaining the orientation in the xy plane:

$${\updelta {\bf{p}}}_{\bf{E}} = {\rm{pi}}\left[ {\begin{array}{*{20}{c}} {p_{{\rm{I}},x} - p_{{\rm{E}},x}} \\ {p_{{\rm{I}},y} - p_{{\rm{E}},y}} \\ {h_{\rm{reset}} - p_{{\rm{E}},z}} \end{array}} \right],\quad {\updelta}\widehat {{\bf{m}}_{\bf{E}}} = {\rm{pi}}\left( {{\rm{proj}}_{xy}\left( {\widehat {{\bf{m}}_{\bf{I}}} - \widehat {{\bf{m}}_{\bf{E}}}} \right)} \right)$$

Similarly, it is possible to take advantage of the instants when no motion is commanded on the endoscope to maximize the magnetic manipulability. When the joystick is not generating input to the control system, the control action is switched to a different state, where the translational dynamics is controlled by equation (7), while the rotational dynamics is described by equation (8). The rotational displacements of the EPM are computed as an optimization problem where the wrenches applied to the endoscope are minimized to prevent any undesired motion of the endoscope:

$${{\updelta }}\widehat {{{{\bf{m}}}}_{{{\bf{E}}}}} = {{{\bf{min}}}}\;\;{{{\bf{J}}}}_{{{{\bf{FA}}}}}\left[ {\begin{array}{*{20}{c}} {{{\updelta {\bf{p}}}}_{{{\bf{E}}}}} \\ 0 \\ {{{\updelta }}\widehat {{{{\bf{m}}}}_{{{\bf{E}}}}}} \\ 0 \end{array}} \right]$$
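The switching between the normal, recoupling and idle states, together with the translational part of equation (7), can be sketched as follows. The condition-number limit, the controller gains and the priority given to recoupling over the idle state are assumptions for illustration. The third error component is read here as the difference between the reset height and the EPM height.

```python
def select_state(condition_number, joystick_active, condition_limit=50.0):
    """Pick the control state; recoupling is assumed to take priority."""
    if condition_number > condition_limit:
        return "recoupling"   # equation (7) drives the EPM back above the endoscope
    if not joystick_active:
        return "idle"         # equation (7) for translation, equation (8) for rotation
    return "normal"           # equation (6)


class PIController:
    """Discrete proportional-integral action on a three-component error."""

    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = [0.0, 0.0, 0.0]

    def step(self, error):
        self.integral = [acc + e * self.dt for acc, e in zip(self.integral, error)]
        return [self.kp * e + self.ki * acc for e, acc in zip(error, self.integral)]


def recoupling_error(p_i, p_e, h_reset):
    """Horizontal offset between IPM and EPM, plus EPM height error."""
    return [p_i[0] - p_e[0], p_i[1] - p_e[1], h_reset - p_e[2]]


# Hypothetical situation: EPM lagging 10 cm behind the endoscope and 5 cm too high.
controller = PIController(kp=0.5, ki=0.1, dt=0.01)
error = recoupling_error(p_i=[0.10, 0.00, 0.00], p_e=[0.00, 0.00, 0.20], h_reset=0.15)
delta_p_e = controller.step(error)
state = select_state(condition_number=200.0, joystick_active=True)
```

In a real loop the condition number would be computed from the current Jacobian at every step, and the returned displacement would be passed to the joint-space controller.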

Autonomous navigation

This layer is aimed at further enhancing autonomy by offering autonomous navigation capabilities for the MFE application. To autonomously navigate through the colon, we leverage a combination of the magnetic manipulation algorithms defined in the previous sections and image processing to autonomously detect the direction of the colon. With this directional information we can (1) autonomously steer the MFE camera frame towards the centre of the colon lumen and (2) autonomously advance the MFE forwards through the colon, once aligned to the lumen.

Owing to the inherent, highly variable mobility of the colon (introduced by patient body movement, breathing, peristalsis and the low modulus of tissue), we sought to adopt an approach to autonomous navigation that is devoid of predefined trajectories. Our approach leverages a real-time understanding of the colon’s pathway using image processing. Multiple groups have developed image-processing techniques to infer motion direction in endoscopic images, remarking that future benefits would be found in the application of these techniques to the active control of endoscopes54, such as the work we present here. Our chosen approach to inferring direction in the colon is not based on specific features of the colon such as haustral folds55. Conversely, it is based on an approach that could possibly be adapted to a variety of tubular cavities. The absence of feature-specific function enables the autonomous navigation work we present here to be transferable to other magnetic endoscope devices, designed for navigation in other tubular cavities.

To detect the colon lumen in the endoscope image, we build on the adaptive threshold segmentation algorithm presented in ref. 54 (the pseudocode of this algorithm is shown in Supplementary algorithm 1). The image is first segmented (Extended Data Fig. 2a,b) to remove all but the darkest and most distinct region, with the assumption that this area most likely contains the distal lumen. This segmentation is performed using the red channel of the RGB image, as this channel amplifies the distinction between bright and dark regions in the predominantly red-shaded colon.

The image is then downsampled by 50% to reduce the computational complexity and then converted to greyscale. The corresponding grey-level histogram of the image contains distinct valley points that can be used to separate pixels into two classes: a non-lumen-region class and a lumen-region class. To define an optimal threshold for separating pixels into these two classes, each possible threshold value is measured for its class separability using a discriminant criterion measure. The threshold that returns the maximum value for this measure gives the threshold that most effectively segments the image. However, multiple regions can remain in the image after this segmentation. Each region is thus scored on its likelihood to contain the lumen, with the score favouring regions that have a larger area and a darker average pixel intensity54, and the highest-scoring region is retained. Within this final region, all but the darkest pixels are removed, and the centre of mass of the remaining pixels is taken as the final centre-of-lumen estimate.
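The threshold search described above can be sketched as follows, assuming that the discriminant criterion is the between-class variance used in Otsu's method. This is an interpretation, not a transcription of Supplementary algorithm 1. The sketch omits downsampling and the region-scoring step and simply takes the centroid of all pixels at or below the threshold. The function names and the test image are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Grey level that maximizes the between-class variance of the histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]            # pixels at or below the candidate threshold
        sum0 += t * hist[t]
        w1 = total - w0          # pixels above the candidate threshold
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_class_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_class_var > best_var:
            best_var, best_t = between_class_var, t
    return best_t


def lumen_centre(image):
    """Centroid (x, y) of the dark class, or None if no pixel qualifies."""
    threshold = otsu_threshold([p for row in image for p in row])
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, p in enumerate(row):
            if p <= threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# Hypothetical 6 x 6 greyscale frame: bright wall with a dark 2 x 2 lumen patch.
image = [[200] * 6 for _ in range(6)]
for y in (1, 2):
    for x in (3, 4):
        image[y][x] = 20
centre = lumen_centre(image)
```

On real frames the dark class usually contains several disconnected regions, which is why the method described above adds the region-scoring step before taking the centre of mass.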

To advance the endoscope in the colon, we assume that the camera should be directed towards the lumen before any forward motion is requested. The orientation control builds on what is described in the previous section, and the input to the endoscope orientation controller \(\overline {\updelta {{\vartheta}} _{{{{\rm{a}}}},{{{\rm{I}}}}}}\) is generated as described by equation (9). A proportional controller aligns the centre of the image (xc,yc) with the detected lumen (xl,yl), as shown in Extended Data Fig. 2c:

$$\overline {\updelta {{\vartheta }}_{{{{\rm{a}}}},{{{\rm{I}}}}}} = {{\beta }}\left[ {\begin{array}{*{20}{c}} {x_{\rm{c}} - x_{\rm{l}}} \\ {y_{\rm{c}} - y_{\rm{l}}} \end{array}} \right]$$

The velocity input imparting translational motion to the endoscope \(\overline {\updelta X_{\rm{a,I}}}\) is directly proportional to the alignment between the endoscopic image and the centre of the lumen, as described by equation (10) and shown in Extended Data Fig. 2d:

$$\overline {\updelta X_{\rm{a,I}}} = 1 - {\rm{e}}^{\kappa \sqrt {\left( {{{{{x}}}}_{{{\rm{c}}}} - {{{{x}}}}_{{{\rm{l}}}}} \right)^2 + ({{{{y}}}}_{{{\rm{c}}}} - {{{{y}}}}_{{{\rm{l}}}})^2} }$$

With the linear velocity being throttled by the positioning of the lumen, priority is given to steering the endoscope. The endoscope is advanced through the colon only when the lumen is towards the centre of the image, thus preventing the endoscope from being driven into a tissue fold or against bends.

However, if the endoscope is directed against the colon wall and the lumen is not visible, the autonomous system needs to respond and avoid advancing the endoscope towards an incorrectly identified lumen. For this reason, the FAST (‘features from accelerated segment test’) detection algorithm is used to identify discernible edges within the image. In no-lumen scenarios (Extended Data Fig. 3a), there is a distinct reduction in the number of features in the image when compared to images containing a substantial portion of colon lumen (Extended Data Fig. 3b). In particular, the presence of haustral folds and overlapping tissue flaps constitutes a satisfactory set of features.

With a threshold set for the number of features based on experience with the system, the controller will detect when no lumen can be found and initiate a mitigation routine. This routine involves the system moving the EPM away from the MFE so that, being free from counteracting magnetic torque, the endoscope can naturally align to the lumen of the colon as the user pulls back slightly on the tether. Once the lumen has been relocated, autonomous navigation resumes.
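The steering law of equation (9) and the feature-count check can be combined into a single decision step, sketched below. The gain `beta`, the feature threshold and the function name are hypothetical. The feature count is taken as an input, as would be supplied by a FAST detector, and forward-speed throttling is left out.

```python
def navigation_step(lumen, image_size, n_features, beta=0.002, min_features=40):
    """Return the operating mode and the proportional steering command."""
    if n_features < min_features:
        # Too few features: assume no lumen is visible and start the mitigation routine.
        return {"mode": "recover", "steer": (0.0, 0.0)}
    x_c, y_c = image_size[0] / 2, image_size[1] / 2   # image centre
    x_l, y_l = lumen                                  # detected lumen centre
    steer = (beta * (x_c - x_l), beta * (y_c - y_l))
    return {"mode": "navigate", "steer": steer}


# Lumen detected 80 pixels to the right of the centre of a 640 x 480 frame.
command = navigation_step((400, 240), (640, 480), n_features=120)
blocked = navigation_step((400, 240), (640, 480), n_features=5)
```

Checking the feature count before steering keeps the controller from chasing a dark patch that is not the lumen, for example when the camera is pressed against the colon wall.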

Additional information on in vivo experiments

The in vivo trials were performed in the Large Animal Experimental Facility at the University of Leeds under a Home Office (UK) Licence (Procedure Project Licence PC71ADE55) in accordance with the Animal (Scientific Procedures) Act 1986 and the NC3Rs guidelines. The reporting has been carried out in accordance with the ARRIVE guidelines.

The female Yorkshire-Landrace pigs weighed 33 kg and 35 kg and were placed in the supine position under general, terminal anaesthesia, with a water enema being administered after general sedation to clean the bowel prior to the start of the study. Users performing the trial were trained in animal welfare and husbandry and were also supervised by a trained gastroenterologist and named veterinary surgeon. When navigating the colon during in vivo experiments, users had access to air insufflation to distend the colon, suction to remove stool and debris, and water irrigation to clean the endoscope’s camera. Necropsy was performed and did not reveal any gross trauma or perforation of the colon.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.