Introduction

There are high expectations for the application of small robots in exploration missions in confined spaces and in future search and rescue scenarios under rubble1,2,3,4. However, deploying small artificial robots in such scenarios remains challenging because of difficulties in design and manufacturing and their limited operating time. Insect-computer hybrid robots, consisting of an insect and a microcontroller2,5,6, have therefore emerged as an alternative to small artificial robots (Fig. 1a). The living insect serves as the delivery platform, converting bioenergy into kinetic energy to move forward. The microcontroller generates control decisions by fusing information from the various sensors on the hybrid robot platform and stimulates the corresponding sensory organs of the insect to induce the relevant movements. Insect-computer hybrid robots with different capabilities can be obtained by using different insect species as delivery platforms. For example, insects such as beetles or dragonflies can be employed to create miniaturized insect-computer hybrid robots that can fly7,8,9. However, these aerial robots are unsuitable for scenes with complex terrain, such as under ruins. In contrast, insects with excellent climbing abilities, such as cockroaches, yield small hybrid robots that can adapt to complex terrain10,11. These insect-computer hybrid robots have demonstrated locomotor performance and range superior to those of small artificial robots7,12.

Fig. 1: The overview of the insect-computer hybrid robot and the diagram of the navigation scenario.

a The overview of the insect-computer hybrid robot. The hybrid robot is made up of a Madagascar cockroach and a microcontroller. The cockroach is the delivery platform of the hybrid robot. b The architecture of the microcontroller of the insect-computer hybrid robot. The microcontroller consists of the ESP32-CAM, responsible for image acquisition, and the stimulation module, responsible for outputting control signals. c The diagram of the navigation scenario. An insect-computer hybrid robot travels along a path filled with obstacles. When it encounters an obstacle, the hybrid robot’s monocular camera detects the obstacle, and the robot is steered around it to reach the destination.

One of the most critical tasks in developing hybrid robots is insect motion control. In recent years, there has been an influx of work investigating protocols for insect movement control13,14,15,16,17,18,19. Ma et al. demonstrated that locust jumping movements can be induced by stimulating their leg muscles19. Choo et al. initiated flight behavior by applying electrical stimulation to the beetle’s dorsal longitudinal muscles20. Ye et al. studied the optimal electrical stimulation characteristics of bees’ unilateral optic lobes to induce turning behavior21. Stimulating cockroach cerci and unilateral antennae can generate acceleration and turning behaviors11. Building on this foundation, some researchers have further developed insect navigation algorithms based on these locomotion control protocols10,22. However, these navigation algorithms are still at a preliminary stage, because in future practical applications the terrain and environmental conditions faced by insect-computer hybrid robots will be much more complex. Obstacle sensing is crucial for robot navigation in rugged terrain. Some studies have shown that insects can use their antennae to sense and avoid obstacles23. However, the detection range of the antennae is limited: obstacles are often detected only when the insect touches them, and avoidance becomes problematic once the hybrid robot is already close to the obstacle. The collision of the robot with the obstacle also disturbs the insect’s motion. In addition, many studies steer insects by stimulating the antennae, which impairs or even completely disables the insects’ own ability to avoid obstacles11. Moreover, when insects are subjected to control stimulation, their subsequent reactions upon detecting an obstacle are unpredictable and unreliable for navigation; they may retreat into deeper traps to avoid the obstacle and be unable to escape. Therefore, relying solely on the insects’ innate obstacle avoidance to achieve a given navigation goal is unrealistic.

To expand the adaptability of insect-computer hybrid robots to complex terrains, we need to enhance the platform’s capability to perceive the surrounding environment. Additional sensors can significantly improve the robot’s obstacle-avoidance capability. The sensors that currently supply reliable obstacle detection are LiDAR and RGB cameras. Insect-computer hybrid robots have small dimensions and therefore limited load capacity; they can only carry small, low-power sensors. Hence, LiDAR sensors are unsuitable for insect-computer hybrid robotic platforms. The monocular RGB camera, in contrast, is well suited to small hybrid robots because of its small size, low power consumption, and rich information acquisition24,25. A hybrid robot with an integrated onboard RGB camera, CameraRoach, was presented by Rasakatla et al.25. They demonstrated several applications, such as using the camera to navigate a cockroach robot by recognizing arrows indicating direction. However, they did not further process the images taken by the camera to explore additional applications of monocular cameras.

Obstacle identification and avoidance can be achieved by using the monocular camera to predict the distance between the camera and objects in the environment26,27,28,29. In recent years, monocular depth estimation, i.e., depth estimation from a single camera, has seen unprecedented development driven by deep learning. Depending on the training data, monocular depth estimation can be categorized into supervised30,31,32,33,34 and unsupervised35,36,37,38,39,40 learning approaches. Because unsupervised methods do not require images labeled with ground-truth depth for training, they dramatically lower the barrier to acquiring training data and have therefore gained increasing attention. Many reports have shown monocular cameras guiding drones or unmanned vehicles in obstacle avoidance41,42,43. However, applying monocular cameras to insect-computer hybrid robot navigation remains challenging because of the robots’ application scenarios. The limited generalization of deep learning models across scenes is a critical factor restricting their application; enriching and expanding the diversity of datasets helps address this issue, but collecting new datasets is costly and time-consuming44. Currently, no monocular depth estimation model can provide sufficiently accurate predictions for insect-computer hybrid robot scenarios because no dataset has been captured from the hybrid robot’s perspective. Existing public datasets, such as KITTI45 and Cityscapes46, mainly serve autonomous driving scenarios, and the world seen from an insect’s perspective differs significantly from the views captured in these datasets. Monocular depth estimation models trained on them therefore struggle to provide usable predictions for small-robot navigation, which limits the application of monocular cameras to insect-computer hybrid robots. Another unresolved problem is how to generate obstacle avoidance control commands for insect-computer hybrid robots from the depth maps produced by monocular depth estimation models. To overcome these challenges, we propose a navigation algorithm with obstacle avoidance for insect-computer hybrid robots using a monocular camera. Specifically, our contributions can be summarized as follows:

  1. To enhance the obstacle avoidance capabilities of insect-computer hybrid robots, we developed the first navigation algorithm with an integrated obstacle avoidance function using a monocular camera.

  2. An unsupervised monocular depth estimation model is used to process images from the monocular camera and obtain the depth information of obstacles. We collected the first dataset captured from the viewpoint of insects, the SmallRobot Dataset, and used it to train a monocular depth estimation model that provides accurate depth predictions for an insect-computer hybrid robot.

  3. We propose a simple but effective method to process the depth maps and generate obstacle avoidance commands for insect-computer hybrid robots.

Results and discussion

To evaluate the effectiveness of the obstacle avoidance module, we conducted point-to-point navigation experiments in which insect-computer hybrid robots were guided by the navigation algorithm with and without the obstacle avoidance feature. The navigation algorithm drove the insect-computer hybrid robot from the start point to the destination with an obstacle in between, as illustrated in Fig. 2. The obstacle has a corner enclosed on three sides, which can trap the hybrid robot’s navigation in a deadlock. If the robot successfully navigated around the obstacle and reached the destination, we counted the attempt as successful; if the robot failed to get around the obstacle and was stuck in the dead-end corner, we counted it as failed. We conducted the navigation experiments in two setups, with and without the Obstacle Avoidance Module, and repeated each setup in 15 trials using three insect-computer hybrid robots. Figure 2a shows the motion trajectories of hybrid robots guided by the navigation algorithm without the Obstacle Avoidance Module, and Fig. 2b shows the trajectories under the algorithm with the Obstacle Avoidance Module. The comparison shows that the Obstacle Avoidance Module grants the hybrid robot markedly better obstacle avoidance capabilities: after integrating the module into the navigation algorithm, the success rate of the navigation task rose from 6.7% to 73.3% (Fig. 2c).

Fig. 2: The results of the navigation experiments.

a The insect-computer hybrid robot’s moving trajectories under the algorithm’s guidance without an obstacle avoidance module. b The insect-computer hybrid robot’s moving trajectories under the algorithm’s guidance with an obstacle avoidance module. c The success rate of the navigation experiments without and with the obstacle avoidance module. The algorithm with the obstacle avoidance module achieves a higher success rate. d Comparison of robots’ trajectories relative to the risk zone when guided by the navigation algorithm with and without the obstacle avoidance module. When robots are guided by the algorithm with the obstacle avoidance module, their probability of entering the risk zone is much lower, and even if they enter the risk zone, they have a certain probability of getting out of it. e and f Motion dissection of a robot guided by the two navigation algorithms. The navigation algorithm with the obstacle avoidance module detects the presence of obstacles and outputs obstacle avoidance commands to keep the robot away from them. Without the obstacle avoidance module, the navigation algorithm cannot detect obstacles and forces the robot to move toward the destination, producing a conflict between the control commands and the obstacle and causing the robot to be trapped.

A more in-depth analysis found that the navigation task tended to fail when the hybrid robot entered a specific range close to the obstacle. We call this region the risk zone (Fig. 2d). When the robot was inside the risk zone, collisions between the robot and the obstacle hindered the robot’s motion, and little space was left for posture correction. For the navigation algorithm without the Obstacle Avoidance Module, the robot entered the risk zone in up to 93.3% of the attempts; for the algorithm with the integrated Obstacle Avoidance Module, this figure was only 40%. Meanwhile, out of these 40% of attempts, 33.3% of the robots were directed back out of the risk zone by the navigation algorithm, whereas for the algorithm without obstacle avoidance this number was 0. This indicates that the Obstacle Avoidance Module can anticipate the presence of obstacles and take action to avoid them, allowing the robot to correct its direction early and avoid entering the risk zone.

Another cause of navigation failure is conflict between navigation commands and obstacles. Because the navigation algorithm without the Obstacle Avoidance Module cannot detect obstacles, its General Navigation Module forcefully corrects the robot’s orientation toward the destination regardless of the obstacle’s shape. This leads to conflicts between the navigation commands and the obstacle, which can trap the robot against it (Fig. 2e). In contrast, the algorithm integrated with the Obstacle Avoidance Module prioritizes obstacle avoidance operations, ensuring that the robot clears the obstacle before heading to the destination (Fig. 2f).

Applying monocular depth estimation to robots can empower them with superior obstacle-avoidance capabilities. However, the generalization ability of deep-learning-based monocular depth estimation algorithms is a major challenge limiting their deployment. Insect-computer hybrid robots have unique requirements for training data because of their tiny size and unusual camera viewpoint, and none of the existing publicly available training datasets for monocular depth estimation meets the requirements of insect-computer hybrid robot applications. Models trained on these datasets therefore cannot produce reliable predictions when applied to insect-computer hybrid robots. To overcome this issue, we collected a training dataset for monocular depth estimation models that can be used for small robots such as insect-computer hybrid robots. In Fig. 3, the first column shows images taken from an insect’s point of view, the second column shows the predictions of the model trained on the KITTI dataset, and the third column shows the predictions of the model trained on our collected dataset, the SmallRobot Dataset. The model trained on KITTI cannot generate reasonable, reliable depth maps for images taken from an insect’s perspective. In contrast, the model trained on the SmallRobot Dataset produces high-quality depth maps with sharp edges.

Fig. 3: The depth network’s prediction results.

The first column contains the input images. The second column contains the depth maps from the depth network trained by the KITTI dataset, and the third column contains the depth maps from the depth network trained by the SmallRobot dataset.

Another challenge is converting the depth map into obstacle avoidance control commands. Artificial devices such as drones have higher control accuracy and faster movement speeds; their obstacle avoidance algorithms need to determine the contour boundaries of obstacles to avoid collisions41,42, which may require additional features such as object detection and thus more computation. The insect-computer hybrid robot’s biological body, however, makes it more tolerant to collisions and less easily damaged in crashes. We can therefore forgo obstacle edge detection and instead generate obstacle avoidance commands based only on depth distribution trends. Figure 4 shows examples of generating an obstacle avoidance command from an RGB image. The first column contains the input RGB images. The monocular depth estimation model processes these images to produce depth maps. The weighted sums for Turn Left, Turn Right, and Go Forward are then computed according to the proposed obstacle avoidance algorithm, and these values are fed into SoftMax to obtain the obstacle avoidance command. For the first and third images, the algorithm generates steering commands that drive the robot away from the obstacles according to their shape; for the second image, the algorithm maintains the robot’s moving direction. The generated commands are reasonable and consistent with human judgment, which validates the effectiveness of the proposed obstacle avoidance algorithm.

Fig. 4: The examples from input images to obstacle avoidance commands.

The depth estimation module analyzed the raw images and generated the depth maps. The weighted sum values were then computed from the depth maps, and the navigation command was selected by comparing the weighted sum values of the three stimulation commands to induce the hybrid robot’s motion.

This paper demonstrates the first successful automatic navigation algorithm with an obstacle avoidance function using a monocular camera for an insect-computer hybrid robot. Experiments prove the effectiveness of the proposed algorithm. The ease of fabrication of insect-computer hybrid robots makes them suitable for a wide range of potential applications, and our algorithm helps them overcome their deficiencies in obstacle recognition and avoidance. However, the microcontroller used to deploy the algorithm is still bulky; in the future, we will develop a microcontroller that integrates the image acquisition and stimulation modules. In addition, limited by the current storage and processing capabilities of microcontrollers, we send images back to the workstation via WiFi for processing, which increases energy consumption. We are developing an ultra-lightweight depth model that can be deployed on the insect side, so that the model and its inference run onboard and less information has to be transmitted. Another essential follow-up issue is ensuring that the camera is mounted horizontally; we are also developing structures that simplify camera installation and maintain a horizontal position.

Methods

Insect platform

We used the Madagascar hissing cockroach, which has excellent climbing abilities, as the platform of the insect-computer hybrid robot (Fig. 1a). Their large size (5.7 ± 0.6 cm)10 gives them a robust carrying capacity25, and there are many mature control protocols and demonstrations for controlling Madagascar cockroaches1,10,11. These traits make them well suited as platforms for insect-computer hybrid robots. We kept the cockroaches in a laboratory environment with suitable temperature and humidity and regularly provided them with water and food. The experimental procedures of this study are in accordance with the literature11,18,47,48; we followed similar steps for insect anesthesia and electrode implantation and adhered to high ethical standards throughout the experiments.

Microcontroller and implantation methods

The controller used to control insect movement comprises two parts: the ESP32-CAM with an OV2640 camera, used to acquire images, and a custom-designed stimulation module, used to output control signals (Fig. 1b). The ESP32-CAM from Ai-Thinker Technology is based on the ESP32-S MCU module, which supports WiFi and Bluetooth communication. It captures RGB images with a resolution of 320 × 240 pixels and passes them to the workstation via WiFi. The camera was carefully inspected to ensure it was mounted horizontally. The stimulation module employs an MSP432P4011 as the central controller; its components are shown in Supplementary Fig. 2. A CC1352 serves as the Bluetooth module, and an AD5504 is used as the signal generator. The stimulation module has four output channels that generate voltages from 0 to 12 V, although our navigation experiments only use stimulus signals below 3 V to control insect movements. The module receives commands from the workstation via Bluetooth and creates the stimulus signals that control insect movement. We used a Li-Po battery with a nominal voltage of 3.7 V and a capacity of 180 mAh to power both modules simultaneously. The controller can be effortlessly detached from a hybrid robot and reused on another insect to build a fully functional hybrid robot.

We adopted the same electrode implantation method as in Erickson et al.’s work11 to control the motion of the cockroaches. We implanted electrodes into the cerci to induce forward movement and electrodes into the antennae to induce turning movements. An additional electrode was implanted into the cockroach’s abdomen as a common ground. The four electrodes were fixed on the cockroach’s body with melted beeswax and connected by wires to the four channels of the controller’s stimulation module. The preparation and assembly of the insect-computer hybrid robot take approximately 15 minutes.

Monocular depth estimation model

We followed an unsupervised approach that uses image sequences from a monocular camera as training data to train the monocular depth estimation model (Fig. 5). The depth estimation model was trained together with a pose estimation model38,39. In the training phase, the target image was fed into the depth estimation model to generate a predicted depth map. At the same time, the image pair consisting of the target image and its adjacent frame (the reference image) was sent to the pose estimation network to calculate the camera’s pose change between the two frames. The matching relationship between pixels in the target image and the reference image was then calculated from the camera’s pose change and the depth map, and the target image was synthesized by sampling the corresponding pixels from the reference image. Training optimized the models by minimizing the difference between the synthetic and the real target image. We used the same depth estimation network as in Godard et al.’s work38 and a pose estimation network with a ResNet-50 backbone, and we followed the loss function and auto-masking technique proposed by Godard et al.38. To keep the depth maps at a consistent scale, we added a scale consistency loss49. Training was conducted on an NVIDIA GeForce RTX 3090 GPU with 24 GB of memory.
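The core of this training scheme is the view-synthesis step: warping the reference image into the target view using the predicted depth and relative pose, and penalizing the reconstruction error. The PyTorch-style sketch below illustrates only this step under assumed conventions (the pose matrix maps target-frame points to the reference frame, and K holds the camera intrinsics); the loss actually used here follows Godard et al.38 and additionally includes an SSIM term, multi-scale averaging, auto-masking, and the scale consistency loss49.

```python
import torch
import torch.nn.functional as F

def reprojection_grid(depth, pose, K):
    """Map target-image pixel coordinates into the reference image and return a
    sampling grid of shape (B, H, W, 2), normalized to [-1, 1] for grid_sample."""
    B, _, H, W = depth.shape
    device = depth.device
    ys, xs = torch.meshgrid(torch.arange(H, device=device),
                            torch.arange(W, device=device), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float()   # homogeneous pixels
    pix = pix.view(1, 3, -1).expand(B, -1, -1)                        # (B, 3, H*W)
    cam = torch.inverse(K) @ pix * depth.view(B, 1, -1)               # back-project to 3D
    cam = torch.cat([cam, torch.ones_like(cam[:, :1])], dim=1)        # homogeneous (B, 4, H*W)
    ref = K @ (pose @ cam)[:, :3, :]                                  # rigid transform, project
    xy = ref[:, :2] / ref[:, 2:].clamp(min=1e-6)
    x = 2 * xy[:, 0] / (W - 1) - 1                                    # normalize to [-1, 1]
    y = 2 * xy[:, 1] / (H - 1) - 1
    return torch.stack([x, y], dim=-1).view(B, H, W, 2)

def photometric_l1(target, reference, depth, pose, K):
    """L1 difference between the target image and the reference image warped
    into the target view (the synthetic target image)."""
    grid = reprojection_grid(depth, pose, K)
    synthetic = F.grid_sample(reference, grid, padding_mode="border",
                              align_corners=True)
    return (target - synthetic).abs().mean()
```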

Fig. 5: The overview of the monocular depth estimation model.

The depth estimation network follows an unsupervised training approach. During training, the depth network is trained along with the pose network, which predicts the camera pose changes. A camera pose change is a matrix containing the translation and rotation of the camera from the target image to the reference image. The target image is fed to the depth network to obtain the depth map. At the same time, an image pair consisting of the target image and its neighboring frame (the reference image) is fed to the pose network to predict the camera’s pose change. Finally, the depth map, the pose change, and the reference image are used to synthesize the target image: according to the depth map and camera pose change, each pixel in the target image is matched to a corresponding pixel in the reference image, and pixels are sampled from the reference image according to this correspondence to synthesize the target image. The training process optimizes the networks by minimizing the difference between the target image and the synthetic target image.

Dataset captured from small robot’s view

The performance of a depth estimation network in its application scenario is highly reliant on the training dataset. No publicly available datasets are suitable for the application scenarios of insect-computer hybrid robots, so it is difficult for monocular depth estimation models trained on existing datasets to extend to the navigation tasks of insect-computer hybrid robots. To overcome this issue, we collected a dataset suitable for visual models of small robots, captured from a tiny robot’s viewpoint: the SmallRobot Dataset.

We employed the ESP32-CAM to capture the dataset. The ESP32-CAM was mounted on a tray and powered by a power bank. The tray had a lever with a handle that allowed the operator to move the ESP32-CAM to capture images. The ESP32-CAM transmitted the images to the laptop via WiFi, where an image capture program written in Python collected images at intervals of 0.01 s. The resolution of the images was 320 × 240 pixels.
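A minimal sketch of the laptop-side capture loop is shown below. It assumes the ESP32-CAM exposes a still-image HTTP endpoint; the IP address, endpoint path, and output file layout are hypothetical, and the achievable frame rate is in practice limited by WiFi transfer rather than the nominal 0.01 s interval.

```python
import time
import pathlib
import requests

CAM_URL = "http://192.168.1.50/capture"          # hypothetical ESP32-CAM address/endpoint
OUT_DIR = pathlib.Path("smallrobot_dataset/sequence_00")
OUT_DIR.mkdir(parents=True, exist_ok=True)

frame_idx = 0
while True:
    resp = requests.get(CAM_URL, timeout=1.0)    # fetch one 320x240 JPEG frame
    if resp.ok:
        (OUT_DIR / f"{frame_idx:06d}.jpg").write_bytes(resp.content)
        frame_idx += 1
    time.sleep(0.01)                             # nominal 0.01 s capture interval
```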

Obstacle avoidance module

The obstacle avoidance module works by processing the depth map generated by the monocular depth estimation model. It has two functions: deciding when to turn the obstacle avoidance function on and off, and generating the control commands that guide the robot around the obstacle.

As shown in Fig. 6, we select a 40 × 40 pixel area in the center of the depth map. Then, the minimum depth value of this region is regarded as the distance from the obstacle to the robot. The obstacle avoidance function will be triggered when the minimum depth is smaller than the threshold.
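A minimal sketch of this trigger check, assuming the depth map is a 240 × 320 NumPy array in which smaller values indicate nearer surfaces; the threshold value is a placeholder, since the model predicts depth only up to a relative scale.

```python
import numpy as np

DEPTH_THRESHOLD = 0.25   # hypothetical trigger distance, in the model's relative depth units

def obstacle_detected(depth_map: np.ndarray, patch: int = 40) -> bool:
    """Return True when the nearest point in the central 40x40 window is closer
    than the threshold, i.e., when obstacle avoidance should be triggered."""
    h, w = depth_map.shape
    cy, cx = h // 2, w // 2
    center = depth_map[cy - patch // 2: cy + patch // 2,
                       cx - patch // 2: cx + patch // 2]
    return float(center.min()) < DEPTH_THRESHOLD
```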

Fig. 6: Working principle of obstacle avoidance.

This image shows the process from the depth map to the obstacle avoidance control command. We chose a 40 × 40-pixel area in the center of the depth map and took the closest distance in this area as the distance from the obstacle to the robot. Obstacle avoidance is turned on when this distance is less than a set threshold. We then select a 40 × 320-pixel region along the height direction and calculate the weighted sums for Left Turn, Right Turn, and Go Forward using different weight profiles. These values are fed into the SoftMax to generate the obstacle avoidance control command.

In addition, we section off a 40 × 320 pixel area of the depth map along the height direction. The control command (Left Turn, Right Turn, or Go Forward) is generated from this section. Specifically, each pixel is given a weight that varies along the width direction and is assigned differently for each command. For the Left Turn command, the maximum weights are assigned at the left side of the depth map and decrease toward the right; this is reversed for the Right Turn command. For the Go Forward command, the maximum weights are assigned in the middle and decrease toward both sides. The weighted sums for Left Turn, Right Turn, and Go Forward are then calculated and passed to the SoftMax function to decide the control command, as sketched below. As shown in Supplementary Fig. 1, Left Turn means that the insect’s right antenna is stimulated so that it turns to the left, and vice versa for Right Turn; Go Forward means that the insect’s cerci are stimulated to drive it forward.
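The sketch below illustrates this weighting scheme in NumPy. It assumes a 240 × 320 depth map in which larger values indicate farther (freer) space, and that the 40 × 320 band is centered vertically; the linear and triangular weight profiles are illustrative choices consistent with the description above rather than the exact profiles used.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def avoidance_command(depth_map: np.ndarray, band_h: int = 40) -> str:
    """Generate an obstacle avoidance command from the 40x320 band of the depth map."""
    h, w = depth_map.shape
    band = depth_map[h // 2 - band_h // 2: h // 2 + band_h // 2, :]   # 40 x 320 strip

    cols = np.arange(w)
    w_left = np.linspace(1.0, 0.0, w)                                 # heaviest on the left
    w_right = np.linspace(0.0, 1.0, w)                                # heaviest on the right
    w_fwd = 1.0 - np.abs(cols - (w - 1) / 2) / ((w - 1) / 2)          # peak in the middle

    col_depth = band.mean(axis=0)                                     # mean depth per column
    scores = np.array([np.dot(w_left, col_depth),                     # Left Turn
                       np.dot(w_right, col_depth),                    # Right Turn
                       np.dot(w_fwd, col_depth)])                     # Go Forward
    probs = softmax(scores)                                           # SoftMax over weighted sums
    return ["Left Turn", "Right Turn", "Go Forward"][int(probs.argmax())]
```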

Navigation experiment and algorithm

The setup of the navigation experiment is shown in Fig. 7a. The insect-computer hybrid robot was navigated from the start point to the destination. The navigation algorithm consists of two modules: the Obstacle Avoidance Module and the General Navigation Module (Fig. 7b). A monocular depth estimation model deployed on the workstation processes the images received from the ESP32-CAM over WiFi to obtain the predicted depth map. Meanwhile, the workstation receives and processes the robot’s location data from the 3D motion capture system to generate suitable control commands, which are issued to the insect-computer hybrid robot via BLE. The navigation algorithm first calculates the distance from the robot to the destination to determine whether the robot has reached it. If not, the Obstacle Avoidance Module checks the distance to the obstacle to decide whether to trigger the obstacle avoidance function. If obstacle avoidance is not required, the General Navigation Module guides the robot toward its destination in two steps: the robot’s direction of movement is checked, and the Go Forward command is released directly if the robot is moving toward the destination; otherwise, a steering command is released first to adjust the moving direction before the Go Forward command is output.
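The loop below is a high-level sketch of this decision logic. The helper functions get_pose, get_frame, predict_depth, and send_command are hypothetical placeholders for the motion-capture interface, the ESP32-CAM stream, the depth model, and the BLE stimulation link; obstacle_detected and avoidance_command follow the sketches in the Obstacle avoidance module section, and the destination, arrival radius, heading tolerance, control period, and turn-sign convention are assumptions.

```python
import math
import time

DEST = (2.0, 0.0)                        # destination in the motion-capture frame (assumed, m)
ARRIVAL_RADIUS = 0.10                    # assumed arrival tolerance (m)
HEADING_TOLERANCE = math.radians(20)     # assumed alignment tolerance

def navigate():
    while True:
        x, y, heading = get_pose()                       # from the 3D motion capture system
        if math.hypot(DEST[0] - x, DEST[1] - y) < ARRIVAL_RADIUS:
            break                                        # destination reached

        depth_map = predict_depth(get_frame())           # ESP32-CAM image -> depth map
        if obstacle_detected(depth_map):                 # Obstacle Avoidance Module
            send_command(avoidance_command(depth_map))
        else:                                            # General Navigation Module
            bearing = math.atan2(DEST[1] - y, DEST[0] - x)
            error = math.atan2(math.sin(bearing - heading),
                               math.cos(bearing - heading))   # wrap to [-pi, pi]
            if abs(error) > HEADING_TOLERANCE:
                send_command("Left Turn" if error > 0 else "Right Turn")  # assumed sign convention
            else:
                send_command("Go Forward")
        time.sleep(0.5)                                  # assumed control period
```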

Fig. 7: The overview of the navigation experiment and algorithm.

a The overview of the navigation experiment. The insect-computer hybrid robot automatically moves from the start point to the destination. A 3D motion capture system tracks and captures the robot’s positional information. The position information is passed to the workstation through cables. The ESP32-CAM on the robot’s microcontroller, responsible for image acquisition, uses a WiFi network set up by a router to deliver the images to the workstation. Then, the workstation generates control commands by processing the acquired position information and images and transmits them to the robot’s microcontroller using Bluetooth to control the robot’s movement. b The diagram of the navigation algorithm. The navigation algorithm consists of the Obstacle Avoidance Module and the General Navigation Module. The ESP32-CAM transmits the acquired images to the workstation. A monocular depth estimation model on the workstation processes the images to obtain a predicted depth map. At the same time, the workstation acquires the robot’s position data from the 3D motion capture system. The navigation algorithm first calculates the distance from the robot to the destination and determines whether the robot has reached the destination. If it has not yet arrived, the obstacle avoidance module checks the distance to the obstacle and decides whether to activate the obstacle avoidance function. The general navigation module will guide the robot to the destination if obstacle avoidance is not required.